Converting 2-column list to glossary Thread poster: Tony M
| Tony M France Local time: 12:51 Member French to English + ... SITE LOCALIZER
I have a two-part problem:

1) I often receive 2-column aligned SOURCE / TARGET lists of terms; basically, a client glossary. Does anyone know of a convenient utility that can be used to convert this into a CAT tool glossary (in my case, for Wordfast Classic, but the actual tool isn't really the issue)? As far as I have been able to ascertain, my CAT tool doesn't have a built-in utility for doing this (though I may be wrong!)

I have a manual workaround, which involves looking at an existing glossary (basically a tab-separated text file) and counting the number of tabs, many of which only separate blank fields that I don't need to use. I then add enough blank columns to the right of my bilingual table to create the corresponding number of tabs, convert table to text, and then, to be on the safe side, copy and paste that text into an existing blank glossary. But it's a bit long-winded, and a little routine for doing it would certainly help!

2) I currently have a particular glossary with a slightly different format: it has 2 lines for each entry, the first being the acronym and its translation, and the second being the expanded form of the acronym and its translation. What I need to do is get the two acronyms into the Source and Target fields of my glossary, and then the 2 expanded forms together into the 'Notes' field; anyone got any brilliant ideas how to do this?

I suspect I am going to have to first manually combine the expanded text + translation into a 3rd column alongside the acronyms, and then proceed with my original system as above, albeit with one less 'extra' column. All suggestions gratefully received!

| | | Patrick Porter United States Local time: 07:51 Spanish to English + ... Regular expression find and replace could work | Mar 11, 2016 |
For your second issue: if you have a text editor that allows find/replace with regex, you could use that to find every pair of lines and then take out the line ending in the middle. If you have Notepad++ (or care to download it; it's free), open the file, press Ctrl+H (for find/replace) and put the following expressions in the corresponding boxes:

Find: (.+?)\t(.+?)\r\n(.+?)\t(.+?)\r\n
Replace: $1\t$2\t$3\t$4\r\n

Make sure to check at the bottom: Search Mode: Regular Expression, and check the box "matches newline".

This will make every second line appear as fields 3 and 4 of the previous line. Make sure that there is a line break after the last line (i.e. the file doesn't end at the very end of the last line but at the beginning of a new line).

Also, if you are not on a Windows machine or the file was not created on a Windows machine, then the newline might be \n instead of the \r\n in the expressions above. You can tell by trying a quick find, and if nothing comes up then try removing all the \r from the regexes (there are 3 in total).

| | | CafeTran Training (X) Netherlands Local time: 12:51 Try another CAT tool? | Mar 11, 2016 |
Tony M wrote: but the actual tool isn't really the issue

If you're willing to try another tool: CafeTran's native glossary format is tab-delimited, so you can use your list right away. It also allows source-side and target-side alternatives:

ACRONYM;long form source TAB ACRONYM;long form target

That way, you can keep together what belongs together. During translation, you can easily switch between the alternative targets for automatic insertion via the right mouse button, so you can choose to have the acronym translated as an acronym or as a long form, either once or in the whole project. A similar regex would be needed to prep the glossary. I've recorded a short video to demonstrate this: https://youtu.be/roX4yksMssk
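For anyone who would rather script that prep step than run it in an editor, here is a minimal Python sketch of the same pairing idea. It rests on several assumptions: every line has exactly two tab-separated fields, acronym lines and long-form lines strictly alternate, the sample entry is made up for illustration, and `merge_pairs` is a hypothetical helper name rather than part of any CAT tool.

```python
import re

# Two consecutive two-field lines: acronym pair, then long-form pair.
# \r?\n accepts both Windows (\r\n) and Unix (\n) line endings, and the
# final line may or may not end with a line break.
PAIR = re.compile(
    r"([^\t\r\n]+)\t([^\t\r\n]+)\r?\n"          # acronym source TAB acronym target
    r"([^\t\r\n]+)\t([^\t\r\n]+)(?:\r?\n|$)"    # long source TAB long target
)

def merge_pairs(text: str) -> str:
    """Merge each pair of lines into 'ACRONYM;long source TAB ACRONYM;long target'."""
    return PAIR.sub(r"\1;\3\t\2;\4\n", text)

# Illustrative sample only (not taken from a real client glossary).
sample = "ONU\tUN\r\nOrganisation des Nations unies\tUnited Nations\r\n"
merged = merge_pairs(sample)
print(merged)
```

Changing the replacement template to `r"\1\t\2\t\3 = \4\n"` would instead give source, target, and a third field holding both long forms, which may be closer to the Notes-field layout asked about in the original post; the exact field position a given tool expects would still need checking.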
[Edited at 2016-03-11 07:21 GMT] | | | Glossary search without CAT tools | Mar 11, 2016 |
Not sure how helpful it may be to your issue, but I've been using Search and Replace from Funduc (http://www.funduc.com/) since I started. When I receive glossaries with 2 or more columns, I convert them to .csv or .txt; the app then searches all the files, and the results window shows all occurrences line by line. It has many other features that I've never used, but for quickly searching many heterogeneous files at once without opening them, it's handy. If I remember correctly, incorporating 2-column glossaries into MemoQ is also quite easy.

Philippe
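The conversion step itself (a two-column .csv into tab-delimited text) could be scripted along these lines. This is only a sketch under assumptions: the input is comma-separated with optional double quotes, only the first two columns matter, the sample rows are invented, and `csv_to_tab_delimited` is a hypothetical name. Whether a particular CAT tool needs a header line or extra fields is not covered here.

```python
import csv
import io

def csv_to_tab_delimited(csv_text: str) -> str:
    """Convert two-column CSV text to tab-delimited lines, skipping incomplete rows."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        # Keep only source and target, trimmed of stray spaces.
        cells = [cell.strip() for cell in row[:2]]
        # Drop blank lines and rows with a missing source or target.
        if len(cells) == 2 and all(cells):
            writer.writerow(cells)
    return out.getvalue()

# Illustrative sample: a quoted field with a comma, a blank line, an incomplete row.
sample = 'disjoncteur,circuit breaker\n"fil, nu","wire, bare"\n\nincomplete,\n'
converted = csv_to_tab_delimited(sample)
print(converted)
```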
Samuel Murray Netherlands Local time: 12:51 Member (2006) English to Afrikaans + ... Two columns is enough for WFC | Mar 11, 2016 |
Tony M wrote: Does anyone know of a convenient utility that can be used to convert this into a CAT tool glossary (in my case, for Wordfast Classic, but the actual tool isn't really the issue)?

WFC does not care if different records have different numbers of fields, as long as the two required fields (source and target) are present. So you can safely add more terms to the WFC glossary, even if the terms that you add have only source and target, whereas the other entries have more tabs.

Tony M wrote: I currently have a particular glossary with a slightly different format: it has 2 lines for each entry, the first being the acronym and its translation, and then the second being the expanded form of the acronym and its translation. What I need to do is get the two acronyms in to the Source and Target fields of my glossary, and then the 2 expanded forms together into the 'Notes' field...

I see a long road of manual copying ahead.

Samuel

| | | esperantisto Local time: 14:51 Member (2006) English to Russian + ... SITE LOCALIZER No need for any tool | Mar 11, 2016 |
You don’t need any tool. As mentioned, some CAT programs can use tab-delimited text files as glossaries directly (just to add: OmegaT, Anaphraseus), others can import them by an established procedure using a built-in tool or feature. Just read the respective manual.