How to extract acronyms from source text?
Thread poster: Erik Freitag
Erik Freitag  Identity Verified
Germany
Member (2006)
Dutch to German
Feb 22, 2018

Dear colleagues,

This may not be the best forum for my question, but here goes:

I'm looking for a convenient way to extract all acronyms/abbreviations from the source text. As a working definition, I mean words that are written in capitals and are not found in standard monolingual dictionaries.

Ideally, I'd like to have them exported as a list, possibly with the whole sentence they appear in for context.

If anyone knows a way to achieve this with SDL Trados Studio 2017, TermExtract, or third-party software, I'd be grateful for a hint.

Many thanks in advance,
kind regards,
Erik
Adam Łobatiuk  Identity Verified
Poland
Member (2009)
English to Polish
With Word
Feb 22, 2018

For a rough list of acronyms in capital letters, you can copy and paste the text into MS Word, run a wildcard search for <[A-Z]{2;}> (see note below) and replace with just bold formatting, then search (without wildcards) for non-bold formatting and replace with ^p. That should leave you with only the words of two or more letters in ALL CAPS, separated by line breaks.

In the wildcard pattern, you may need to use {2,} instead of {2;}, depending on the list separator in your system settings.

[Edited at 2018-02-22 20:00 GMT]
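
For anyone who would rather script this outside Word, here is a minimal Python sketch of the same idea, which also keeps the sentence each hit appears in, as asked for in the original post. It is only a rough approximation: the function name `extract_acronyms` and the naive sentence splitter are my own assumptions, the pattern only covers unaccented A-Z (so capitals such as Ä or Ö are missed), and there is no dictionary check, so ordinary words typed in ALL CAPS will show up too.

```python
import re

# Whole words made of two or more capital letters,
# roughly equivalent to the Word wildcard pattern <[A-Z]{2,}>.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")

# Naive sentence splitter (an assumption): break on whitespace
# that follows ., ! or ?. Abbreviations like "e.g." will confuse it.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_acronyms(text):
    """Map each all-caps word to the list of sentences it appears in."""
    found = {}
    for sentence in SENTENCE_END.split(text):
        sentence = sentence.strip()
        for acronym in ACRONYM.findall(sentence):
            contexts = found.setdefault(acronym, [])
            if sentence not in contexts:  # avoid listing a sentence twice
                contexts.append(sentence)
    return found


if __name__ == "__main__":
    sample = "The EU and NATO met. SDL Trados is a CAT tool. Nothing here."
    # Print a tab-separated list: acronym, then one context sentence per line.
    for acronym, sentences in sorted(extract_acronyms(sample).items()):
        for sentence in sentences:
            print(f"{acronym}\t{sentence}")
```

The tab-separated output can be pasted into a spreadsheet or a glossary import file. To get closer to the "not in a dictionary" part of the definition, the hits could be filtered against a word list afterwards, but that is left out here.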