Searching words inside other words with TMLookup
Thread poster: Dominique Pivard
Finnish to French
Mar 2, 2016

In Memsource and memoQ, you can do a concordance search (Ctrl+K) on a word located inside another word (useful in languages that make heavy use of compound words) by adding an asterisk before and after the word you are interested in. For instance, *tila* will find segments that contain urheilutilan.



I guess the same is possible with András Farkas’ TMLookup, using Regex. Being Regex-illiterate, I’m asking: what would the equivalent Regex syntax be?


 
Dominique Pivard
Finnish to French
TOPIC STARTER
Highlight, Filter and -Filter buttons in TMLookup Mar 2, 2016

Oh, and another TMLookup question, while I’m at it: what are the Highlight, Filter and -Filter buttons for? How would you typically use them?

Any plans to add version 1.55 (currently available via the Dropbox link mentioned in this post) to its proper home page at FarkasTranslations.com (which currently hosts the older 1.0 and 1.31 versions)?


 
FarkasAndras
English to Hungarian
*tila* Mar 3, 2016

Well, the normal search mode cannot do this.* The regex mode can. You don't need to enter anything special, just tila: a regex matches anywhere in the text, including inside a word. You only need wildcards if you want to search for two strings with something in between. E.g. you want to find "special knowledge", "specialist knowledge", "specialized knowledge" and other variants. Then you'd enter "special.*knowledge". The full stop is the any-character wildcard, and the asterisk stands for "any number of occurrences" of whatever precedes it. So .* is largely the same as the * in memoQ.

Note that regex search is extremely slow compared to normal searches. It's fine for a db with a hundred thousand entries, but it's not really practical if you have several million entries in your db. There is a regex cheat sheet in the Help menu.
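To make the two patterns above concrete, here is a minimal sketch using Python's re module. It is only an illustration: the sample segments are made up, regex_search is a hypothetical helper, and TMLookup's own regex dialect and matching options may differ from Python's.

```python
import re

# Made-up sample segments (assumption: not taken from any real TM).
segments = [
    "urheilutilan lattia",
    "tilaa on riittävästi",
    "uusi lattia",
    "special knowledge is required",
    "specialist knowledge is required",
    "specialized knowledge is required",
    "knowledge of special tools",
]

def regex_search(pattern, texts):
    """Return the texts in which the pattern matches anywhere (hypothetical helper)."""
    compiled = re.compile(pattern)
    return [t for t in texts if compiled.search(t)]

# A bare "tila" already matches inside longer words; no wildcards needed.
in_word_hits = regex_search("tila", segments)

# ".*" bridges whatever sits between the two strings, in that order.
variant_hits = regex_search("special.*knowledge", segments)

print(in_word_hits)
print(variant_hits)
```

The last sample segment is not matched by the second pattern, because "special" has to come before "knowledge" for special.*knowledge to match.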

The new version will eventually be hosted on my site; it's just that updating the site is a bit of a pain.

Highlight/Filter/-Filter help you refine searches and find the relevant bits in your text. E.g. you do a search with a French search term and you know the English result you're looking for is a two-word term where you already know one word. Enter that word in the highlight box, click Highlight and all the hits will be highlighted in the list, so they're easier to find in that wall of text. You also get stats at the top of the window in parentheses (Highlight gives you stats for the hits in the displayed hit list, which is capped at 500 hits). Filter hides all the hits that don't contain the filter term and -Filter hides all that do. You can often get the same result by using the main search boxes, but 1) you can't always do negative searches in the main search box and 2) Filter/-Filter usually executes much faster.
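Roughly speaking, the three buttons behave like the following Python sketch. This is not TMLookup's actual code: the function names, the plain substring test and the way the stats are counted are all assumptions made for illustration; only the 500-hit display cap comes from the description above.

```python
MAX_DISPLAYED = 500  # display cap mentioned above

def filter_hits(hits, term):
    """Filter: keep only the hits that contain the term."""
    return [h for h in hits if term in h]

def negative_filter_hits(hits, term):
    """-Filter: hide the hits that contain the term."""
    return [h for h in hits if term not in h]

def highlight_count(hits, term):
    """Highlight stats (assumed): how many displayed hits contain the term."""
    displayed = hits[:MAX_DISPLAYED]
    return sum(1 for h in displayed if term in h)

# Made-up hit list standing in for the result of a main search.
hits = ["specialist knowledge", "expert knowledge", "specialist training"]

kept = filter_hits(hits, "knowledge")
hidden_removed = negative_filter_hits(hits, "knowledge")
count = highlight_count(hits, "knowledge")
print(kept, hidden_removed, count)
```

Because all three operate on the hit list that has already been retrieved rather than querying the database again, it is plausible that they run faster than a new main search, as noted above.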


* That's because TMLookup does its searches using a special text search technology called FTS (full-text search) in the SQLite database engine. FTS makes word searches on very large databases extremely fast, but it has its limitations: it can't do fuzzy searches and it can't do in-word searches. CAT tools use similar text search technologies in whatever database engine they rely on, and these usually support fuzzy and in-word searches, at the expense of slower speed and larger file sizes.
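The trade-off in this footnote can be illustrated with a toy word index in Python. This is a deliberately simplified stand-in, not SQLite's FTS implementation: build_word_index, word_lookup and substring_scan are hypothetical names, and real full-text indexes tokenize and store data in far more elaborate ways.

```python
from collections import defaultdict

# Made-up sample segments (assumption: not taken from any real TM).
segments = ["urheilutilan lattia", "tilaa on riittävästi", "uusi tila"]

def build_word_index(texts):
    """Map each whole word to the set of segment numbers that contain it."""
    index = defaultdict(set)
    for i, text in enumerate(texts):
        for word in text.lower().split():
            index[word].add(i)
    return index

def word_lookup(index, word):
    """Fast path: a single dictionary lookup, but whole words only."""
    return sorted(index.get(word.lower(), set()))

def substring_scan(texts, fragment):
    """Slow path: inspect every segment, so in-word matches are found too."""
    return [i for i, t in enumerate(texts) if fragment.lower() in t.lower()]

index = build_word_index(segments)
whole_word_hits = word_lookup(index, "tila")
in_word_hits = substring_scan(segments, "tila")
print(whole_word_hits, in_word_hits)
```

The index lookup only finds the segment where "tila" stands alone as a word, while the scan also finds "urheilutilan" and "tilaa" but has to read every segment to do so, which is the kind of cost that grows with database size.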


 

