New free & open source aligner (for Windows, OS X and linux) | Thread poster: FarkasAndras
| FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER What do you mean? | Dec 9, 2010 |
Marta Hasyuk wrote: Is there any possibility to reiterate alignment for the same texts? Run the aligner with the same input files again...? Marta Hasyuk wrote: How to set realign option in the algorithm? Do you mean the -realign switch in hunalign to improve the quality of the alignment by doing two passes? You can add that to the hunalign command in the .pl easily if you're on linux or OS X. On windows, see the previous post on how to run the .pl instead of the .exe. | | | FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER 3- and 4-language versions added | Dec 22, 2010 |
I thought I'd let you all know that the 3-language and 4-language versions of the aligner are also available now. If you're an interpreter with more than one passive language or a PM who needs to align files for multilingual projects, this should come in pretty handy. It generates xls files and multilingual TMXes. You can autoalign texts in 4 languages in one go, correct the alignment also in one go, generate a TMX and send it to everyone involved in the project. Then everyone can import the same TMX into their own TMs, and their CAT should know which languages to import and which to ignore. Feedback and bug reports welcome as always, the download URL is the same as ever. Windows only at the moment, mac/linux coming soonish. Of course there were other updates in all versions since I last posted here, so if you're a user, get downloading. Mac and linux versions are at 2.302 now.
| | | Charles Ek United States Local time: 11:35 Member (2009) Norwegian to English + ... Thanks very much for this excellent software | Jun 26, 2011 |
As I've told you privately, this is an excellent piece of software. I'm unschooled in non-GUI commands. However, with the aid of your clear documentation I was able to install this tool in minutes and then immediately align a reference translation and its source. The resulting TMX ran flawlessly in OmegaT. Thanks! | | | FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER
Charles Ek wrote: As I've told you privately, this is an excellent piece of software. I'm unschooled in non-GUI commands. However, with the aid of your clear documentation I was able to install this tool in minutes and then immediately align a reference translation and its source. The resulting TMX ran flawlessly in OmegaT. Thanks! Thanks, glad you like it. Incidentally, version 2.55 was released a few days ago. The main new feature is that the main script can now do multilingual alignments (up to 100 languages). This also means that the 3- and 4-language versions, which were always lagging behind in development/release, aren't needed anymore. | |
ni-cole Switzerland Local time: 17:35 German to French + ...
Dear FarkasAndras, thank you very much for this wonderful tool! I think it is very useful. But... at first I had a problem trying LF Aligner: when I opened the source file (doc) and the target file, LF Aligner disappeared...! I read in this thread that the reason for closing was the name of the file. My filenames look like this: name_D for the source document and name_F for the target document (where name = the name given by my customer, D = German, F = French). As a test I renamed them to test and essai, and then it worked. But I cannot change my whole system just because of one tool, even if this tool is great! So my question is: is there any possibility to make LF Aligner understand that there is a difference between name_D and name_F? I mean, all other software knows it... By the way: are letters like ä ö or ü in the filename also a problem? Kind regards, Nicole.
| | | FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER
Well, the file name issue is the following: due to some issues outside of my control, filenames with non-ASCII characters don't work on Windows. It's a character encoding problem in Windows and the programming language I used for the project. It may be possible for me to find a workaround to the problem, but it wouldn't be easy or simple. I decided not to bother and spend my time working on what I consider real features. But now that you mention it, maybe I'll try and add better error handling or possibly rename offending files automatically (I don't have high hopes of the latter working out). So yes, the problem is "letters like ä ö or ü in the filename". Anything else is fine, pretty much. You can call your files "File_12_of_Client_X version 12.3.4_new.doc" (i.e. underscores, spaces and full stops are not a problem). You can't call the files "jökkmokk.doc" or "ű.txt". Obviously, you can't use Asian characters, either, or a French c with a cédille. By the way, avoiding non-ASCII characters in filenames is good practice in general. They can cause problems left and right. For instance, if several files are attached to an email in Yahoo and you choose "download all", then Yahoo zips them for you. Non-ASCII characters in the filenames will be corrupted because of a similar character encoding problem to the one in LF Aligner. I'm sure you've seen character corruption in file names... that's always due to something like this.
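A quick way to check a path before running the aligner is to look for non-ASCII characters in each component. The following is only a minimal Python sketch and is not part of LF Aligner; the function name `non_ascii_parts` is made up for illustration.

```python
def non_ascii_parts(path: str) -> list:
    """Return the components of `path` that contain non-ASCII characters.

    Hypothetical helper: it only inspects the text of the path and does
    not touch the file system.
    """
    # Treat backslashes and forward slashes alike so Windows paths work too.
    parts = path.replace("\\", "/").split("/")
    # str.isascii() is True for empty strings, so empty components are skipped.
    return [part for part in parts if not part.isascii()]


if __name__ == "__main__":
    for candidate in (
        "File_12_of_Client_X version 12.3.4_new.doc",
        "jökkmokk.doc",
    ):
        offenders = non_ascii_parts(candidate)
        print(candidate, "->", "OK" if not offenders else offenders)
```

An empty list means every part of the path is plain ASCII; anything else lists the file or folder names that would need renaming.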
[Edited at 2012-02-12 09:35 GMT]
| | | ni-cole Switzerland Local time: 17:35 German to French + ...
Thank you very much for answering so fast! The problem was not only the ü; that was actually the first problem and it is easy to resolve. And as you said, it may be better anyway not to use ü and similar letters in filenames. But my main problem is that LF Aligner closes after I open the files because it considers they have the same name - at least this is what I understood from this topic. The source file is called uebersetzung_D and the target file uebersetzung_F. I always keep the original filename, just adding a D for German and an F for French at the end. Apparently LF Aligner considers them identical, but they aren't. Is there a solution for this (except renaming all the files)? In the meantime I may have found a solution, but it is not ideal: I will create a special folder called "lf aligner-files" and copy the two files into it, then rename them to german.doc and french.doc and run LF Aligner. I think it will work, but it makes the tool less easy to use, and I am looking for something that I can do within my daily business, even when there is some work pressure. Until now I have been using Plus Tools, and I am losing so much time because it makes a big mess, putting German segments in the French part and vice versa, etc. So I often do not align just because I am afraid of losing so much time on it. I also tried bitext2tmx but was not convinced. Kind regards, Nicole.
| | | FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER
ni-cole wrote: my main problem is that LF Aligner closes after I open the files because it considers they have the same name - at least this is what I understood from this topic. The source file is called uebersetzung_D and the target file uebersetzung_F. I always keep the original filename, just adding a D for German and an F for French at the end. Apparently LF Aligner considers them identical, but they aren't.
No, it doesn't consider them identical. The problem is something else. The two input files have to be in the same folder - I can't think of any other limitation. To find the root cause, read aligner/scripts/log.txt after a failed alignment and post it here if you can't figure out what went wrong. Also, if the console window just closes on you before you have a chance to read the error message, open a persistent console window by typing cmd into the search window in the start menu (win7) or pressing "Run" in the Start menu and typing cmd there (XP). Then just drag and drop the aligner exe into the console window and press enter to launch it. This way the window won't disappear.
ni-cole wrote: Until now I have been using Plus Tools, and I am losing so much time because it makes a big mess, putting German segments in the French part and vice versa, etc. So I often do not align just because I am afraid of losing so much time on it. I also tried bitext2tmx but was not convinced.
Neither tool has an autoaligner, so I don't consider them real full-featured aligners. However, they both have a better editing UI than LF Aligner (which just uses Excel/OOo Calc for this purpose). The readme tells you how to use the PlusTools UI for alignments done with LF Aligner. | |
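For the log check suggested above, a few lines of Python can pull out just the error messages. This is a rough sketch under assumptions: it supposes that log.txt holds one entry per line, that failures are marked with an "ERROR:" prefix (as in logs posted in this thread), and that the file can be read as UTF-8. The function names are made up for illustration.

```python
def error_lines(log_text: str) -> list:
    """Return the lines of a log that start with 'ERROR:' (assumed prefix)."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith("ERROR:")
    ]


def errors_in_file(path: str) -> list:
    """Read a log file and return its error lines.

    The encoding is an assumption; undecodable bytes are replaced so that
    a stray non-ASCII path in the log cannot crash the check itself.
    """
    with open(path, encoding="utf-8", errors="replace") as log:
        return error_lines(log.read())
```

Calling `errors_in_file("aligner/scripts/log.txt")` after a failed run would list only the lines worth posting.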
ni-cole Switzerland Local time: 17:35 German to French + ... Log.txt -> it seems to be a folder problem...?!? | Feb 17, 2012 |
Hi! Sorry, I have been busy with work over the last few days, so I didn't have time to try it again. I did as you said and this is the log:
Program: LF aligner, version: 2.56, OS: Windows, launched: 2012/02/17, 11:58:19
Setup: filetype_def: t; filetype_prompt: y; l1_def: en; l2_def: hu; l1_prompt: y; l2_prompt: y; segmenttext_def: y; segmenttext_prompt: y=; cleanup_def: y; cleanup_prompt: y; review_def: x; review_prompt: y; create_tmx_def: y; create_tmx_prompt: y; l1_code_def: EN-GB; l2_code_def: HU; l1_code_prompt: y; l2_code_prompt: y; creationdate_prompt: y; creationid_def: ; creationid_prompt: y; ask_master_TM: n; chopmode: 0; tmxnote_def: ; tmxnote_prompt: y; pdfmode: y
GUI on filetype: t
Input file 1: Compendium_D.doc (C:/Users/[myname]/Documents/übersetzungen/[client]/hilfe/Compendium_D.doc)
Input file 2: Compendium_F.doc (C:/Users/[myname]/Documents/übersetzungen/[client]/hilfe/Compendium_F.doc)
ERROR: File 1 not found; folder: C:/Users/[myname]/Documents/übersetzungen/[client]/hilfe, file: Compendium_D.doc
ERROR: File 2 not found; folder:C:/Users/[myname]/Documents/übersetzungen/[client]/hilfe, file: Compendium_F.doc
((I just changed the name of the client in the log here))
Then I copied the two documents to the Desktop and did the same and... it works! So it seems to be a folder problem...? Why do I have to put them on the Desktop? You said it is important that they are in the same folder, and they were. Any idea? By the way: I am again very impressed by the result, thank you very much!
| | | FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER
ni-cole wrote: ERROR: File 1 not found; folder: C:/Users/[myname]/Documents/übersetzungen/[client]/hilfe, file: Compendium_D.doc ERROR: File 2 not found; folder:C:/Users/[myname]/Documents/übersetzungen/[client]/hilfe, file: Compendium_F.doc There's your answer. The same limitations apply to folder names as file names (no non-ASCII characters allowed). So, just rename "übersetzungen" to "ubersetzungen" and you should be good to go. Or, of course, put the files in any other location that has no accented letters anywhere in the path name. | | | ni-cole Switzerland Local time: 17:35 German to French + ...
Of course, you are right. I didn't realise that I also have to watch the names of the folders... Shame on me. FarkasAndras wrote: So, just rename "übersetzungen" to "ubersetzungen" and you should be good to go. Or, of course, put the files in any other location that has no accented letters anywhere in the path name. I'll do that. And thank you very much for your help!
| | | FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER Beta testers wanted | Mar 30, 2012 |
A pretty major update to LF Aligner is almost ready for release, and I'd like to have a few people run it through its paces before it goes live. The update adds a graphical user interface to the aligner, hopefully making the tool a lot more user friendly. So, if you're interested and good with computers, let me know. You don't need to be a programmer to beta test, but you need to be able to give me detailed bug reports & feature requests, make sense of log files etc. I'll need you to run the program with a variety of realistic usage scenarios with your own texts, make sure everything works and make suggestions for reshuffling the GUI or adding stuff etc. In return, you get early access and... updates from sourceforge later like everyone else. I'm mainly looking for WinXP, Vista and Win7 users. If you'd like to try out the GUI on Linux or OSX, you'll need to install a perl module or two on your own, so more expertise is needed on these platforms. Send an email to lfaligner (gmail) to get in on the action. | |
FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER Still looking | Apr 8, 2012 |
Still looking for beta testers. | | | How does the batch aligner work ? | Feb 10, 2013 |
Hi Farkas, first off, thanks for building LF Aligner! I discovered it only a few weeks ago, but I love it! I have questions about the batch mode, though. Say I want to create my own personal bilingual editions of Harry Potter (just an example). I tried the following syntax:
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter01ENG.txt","C:\HarryPotter02ENG.txt","C:\HarryPotter03ENG.txt","C:\HarryPotter04ENG.txt","C:\HarryPotter05ENG.txt","C:\HarryPotter06ENG.txt","C:\HarryPotter07ENG.txt","C:\HarryPotter01FRE.txt","C:\HarryPotter02FRE.txt","C:\HarryPotter03FRE.txt","C:\HarryPotter04FRE.txt","C:\HarryPotter05FRE.txt","C:\HarryPotter06FRE.txt","C:\HarryPotter07FRE.txt" --languages="en","en","en","en","en","en","en","fr","fr","fr","fr","fr","fr","fr" --segment="y" --review="xn" --tmx="n"
But it created an Excel table with as many columns as there were files! That's not what I want. I just want it to align them two by two. I tried alternating the English and the French files in the BAT, like writing HarryPotter01ENG HarryPotter01FRE HarryPotter02ENG HarryPotter02FRE (and en fr en fr for the languages), but the result was the same. What am I doing wrong?
*** EDIT: I have an additional question. I just tried aligning just one pair.
Here's the log:
Program: LF Aligner, version: 3.11, OS: Windows, launched: 2013.02.10_20.08.51
Setup: filetype_def: t; filetype_prompt: y; lang_1_iso_def: en; lang_2_iso_def: fr; l1_prompt: y; l2_prompt: y; segmenttext: y; confirm_segmenting: y; cleanup_def: y; cleanup_prompt: n; review_def: x; review_prompt: y; create_tmx_def: n; create_tmx_prompt: n; tmx_langcode_1_def: en; tmx_langcode_2_def: fr; tmx_langcode_1_prompt: y; tmx_langcode_2_prompt: y; creationdate_prompt: y; creationid_def: LF Aligner 3.11; creationid_prompt: y; ask_master_TM: n; chopmode: 15000; tmxnote_def: ; tmxnote_prompt: y; pdfmode: y
GUI on filetype: t
Input file 1: HP01ENG.txt (D:/HP01ENG.txt)
Input file 2: HP01FRE.txt (D:/HP01FRE.txt)
Input file sizes: 440012 bytes 517197 bytes
File sizes after conversion to txt: 440012 bytes 517197 bytes
Initial stats:
- en: 2929 segments, 82249 words, 432359 chars
- fr: 3033 segments, 92561 words, 495020 chars
Segmentation: y, segment numbers:
- en: 2929 -> 6413
- fr: 3033 -> 6770
Reverted to unsegmented
Hunalign dictionary: en-fr.dic
Using Hunalign in normal mode, (2929 is less than 15000)
Aligned file: 2786 segments, 979241 bytes (D:/align_2013.02.10_20.08.51/aligned_en-fr.txt)
Cleanup: y
Review: x
Generated xls with 2786 lines
Converted xls to txt after review; 2786 lines
Create TMX: n
Terminated normally.
Why the discrepancy between the number of segments originally seen and the number of segments in the aligned file?
It's the same when I don't revert to unsegmented:
Program: LF Aligner, version: 3.11, OS: Windows, launched: 2013.02.10_20.14.53
Setup: filetype_def: t; filetype_prompt: y; lang_1_iso_def: en; lang_2_iso_def: fr; l1_prompt: y; l2_prompt: y; segmenttext: y; confirm_segmenting: y; cleanup_def: y; cleanup_prompt: n; review_def: x; review_prompt: y; create_tmx_def: n; create_tmx_prompt: n; tmx_langcode_1_def: en; tmx_langcode_2_def: fr; tmx_langcode_1_prompt: y; tmx_langcode_2_prompt: y; creationdate_prompt: y; creationid_def: LF Aligner 3.11; creationid_prompt: y; ask_master_TM: n; chopmode: 15000; tmxnote_def: ; tmxnote_prompt: y; pdfmode: y
GUI on filetype: t
Input file 1: HP01ENG.txt (D:/HP01ENG.txt)
Input file 2: HP01FRE.txt (D:/HP01FRE.txt)
Input file sizes: 440012 bytes 517197 bytes
File sizes after conversion to txt: 440012 bytes 517197 bytes
Initial stats:
- en: 2929 segments, 82249 words, 432359 chars
- fr: 3033 segments, 92561 words, 495020 chars
Segmentation: y, segment numbers:
- en: 2929 -> 6413
- fr: 3033 -> 6770
Using segmented file versions
Hunalign dictionary: en-fr.dic
Using Hunalign in normal mode, (6413 is less than 15000)
Aligned file: 6223 segments, 1012881 bytes (D:/align_2013.02.10_20.14.53/aligned_en-fr.txt)
Cleanup: y
Review: x
Generated xls with 6223 lines
Converted xls to txt after review; 6223 lines
Create TMX: n
Terminated normally.
Thanks.
[Edited at 2013-02-10 19:31 GMT]
| | | FarkasAndras Local time: 17:35 English to Hungarian + ... TOPIC STARTER separate commands | Feb 10, 2013 |
Hi, if you list all 14 files in one command, the aligner assumes that there are 14 different languages in this project and generates a 14-column table, as you found out. What you need to do is issue a separate command for each file pair:
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter01ENG.txt","C:\HarryPotter01FRE.txt" --languages="en","fr" --segment="y" --review="xn" --tmx="n"
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter02ENG.txt","C:\HarryPotter02FRE.txt" --languages="en","fr" --segment="y" --review="xn" --tmx="n"
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter03ENG.txt","C:\HarryPotter03FRE.txt" --languages="en","fr" --segment="y" --review="xn" --tmx="n"
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter04ENG.txt","C:\HarryPotter04FRE.txt" --languages="en","fr" --segment="y" --review="xn" --tmx="n"
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter05ENG.txt","C:\HarryPotter05FRE.txt" --languages="en","fr" --segment="y" --review="xn" --tmx="n"
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter06ENG.txt","C:\HarryPotter06FRE.txt" --languages="en","fr" --segment="y" --review="xn" --tmx="n"
LF_aligner_3.11.exe --filetype="t" --infiles="C:\HarryPotter07ENG.txt","C:\HarryPotter07FRE.txt" --languages="en","fr" --segment="y" --review="xn" --tmx="n"
Just put these in a .bat file (one line per command) and you'll be set.
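For a longer series, the per-pair commands can be generated rather than typed. The sketch below is only an illustration in Python, not something shipped with LF Aligner: it reuses the switches and file naming from the commands in this thread, while the function names and the output file name `align_all.bat` are made up.

```python
def pair_commands(count, exe="LF_aligner_3.11.exe"):
    """Build one aligner command line per English/French file pair."""
    commands = []
    for n in range(1, count + 1):
        # File naming follows the example in this thread (01..07, ENG/FRE).
        en = f"C:\\HarryPotter{n:02d}ENG.txt"
        fr = f"C:\\HarryPotter{n:02d}FRE.txt"
        commands.append(
            f'{exe} --filetype="t" --infiles="{en}","{fr}" '
            '--languages="en","fr" --segment="y" --review="xn" --tmx="n"'
        )
    return commands


def write_bat(path, commands):
    """Write the commands to a batch file, one per line."""
    # newline="\r\n" turns each "\n" into CRLF, the usual Windows line ending.
    with open(path, "w", newline="\r\n") as bat:
        bat.write("\n".join(commands) + "\n")


if __name__ == "__main__":
    write_bat("align_all.bat", pair_commands(7))
```

Running the script writes seven lines to the .bat file, each aligning one volume's English and French text.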
If you add an --outfile to each command, then a single txt file will be generated, containing the text of all the file pairs.