How does your CAT tool handle entity references?
Thread poster: tz7

tz7
United States
Local time: 05:17
English to Japanese
Apr 2

Entity references for particular symbols, like &quot;, &nbsp;, and &amp;, used in HTML or XML source should be shown as the actual characters, like ", [space], and &, when you are translating (&nbsp; may be shown differently from a regular space in some CAT tools), so that you can see what they are and, if necessary, change or remove them in your translation. These entity references need to be kept as is in the resulting target file, unless you changed or removed them. I guess many popular CAT tools work like this: Trados, memoQ, Smartling, Memsource... except XTM.
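A rough Python illustration of the round trip described above. This is only a sketch, not how any particular CAT tool is implemented. The helper names `to_editor_view` and `to_target_markup` are made up for this example; `html.unescape` and `xml.sax.saxutils.escape` are standard-library functions.

```python
import html
from xml.sax.saxutils import escape


def to_editor_view(source_text: str) -> str:
    """Decode entity references so the translator sees real characters."""
    return html.unescape(source_text)


def to_target_markup(translated_text: str) -> str:
    """Re-encode the translated text for the target file.

    Assumption for this sketch: the non-breaking space is written as the
    numeric reference &#160;, which is valid in both HTML and XML.
    """
    # escape() handles &, < and > first, then applies the extra mappings.
    return escape(translated_text, {'"': "&quot;", "\u00a0": "&#160;"})


source = "Tom &amp; Jerry say &quot;hi&quot;&nbsp;!"
view = to_editor_view(source)    # what the translator would edit
target = to_target_markup(view)  # what would be written back to the file
```

Note that this sketch does not remember how each entity was spelled in the source; a real tool that must keep references exactly as they were would need to track that.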

Handling these entities correctly is very important for translators. Otherwise, you have to deal with strings like the one below:
    <RecipientStatuses>

How does your CAT tool handle these entities?

[Edited at 2020-04-02 22:57 GMT]

[Edited at 2020-04-02 22:58 GMT]


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 12:17
English to Russian
XTM Apr 3

Why “except XTM”?
Both XTM and Smartling insert a special tag for nbsp, unlike Memsource, memoQ and Trados, which use a regular character (an “invisible” degree sign).


 

Jorge Payan  Identity Verified
Colombia
Local time: 04:17
Member (2002)
German to Spanish
DéjaVu DVX3 Apr 3

It allows exporting special characters either as entities or directly as special characters. The XML filter has to be configured for this purpose.
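To illustrate the two export modes in general terms, here is a small Python sketch. It does not reflect the actual DVX3 filter settings; the function name `export_text` and its `as_entities` flag are hypothetical, and the "entities" here are numeric character references produced by Python's built-in `xmlcharrefreplace` error handler.

```python
from xml.sax.saxutils import escape


def export_text(text: str, as_entities: bool) -> str:
    """Export text with special characters as references or as-is."""
    # Markup-significant characters (&, <, >) are always escaped.
    escaped = escape(text)
    if as_entities:
        # Every non-ASCII character becomes a numeric character reference.
        return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
    return escaped


sample = "Caf\u00e9\u00a0& Bar"
as_refs = export_text(sample, as_entities=True)
as_chars = export_text(sample, as_entities=False)
```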

 

Samuel Murray  Identity Verified
Netherlands
Local time: 11:17
Member (2006)
English to Afrikaans
@TZ7 Apr 3

tz7 wrote:
Entity References for particular symbols, like &quot;, &nbsp;, and &amp;, used in HTML or XML source, should be shown as actual characters, like ", [space], and & when you are translating.


1. Using the full-width ampersand instead of the generic ampersand in the forums, to prevent the forum software from interpreting it, is a neat trick.

2. Yes, I agree. That is what I would have assumed, if the source text is XML or an XML-like format.

... except XTM 


Are you speaking as a project manager or a translator? If the latter, do you have access to the source file so that you can verify the source says e.g. " and not perhaps " ? Or: are you sure the source text is an HTML or XML file, and not e.g. a Word file with XML-like text in it?


 

Samuel Murray  Identity Verified
Netherlands
Local time: 11:17
Member (2006)
English to Afrikaans
@TZ7 II Apr 3

Stepan Konev wrote:
Both XTM and Smartling insert a special tag for &nbsp; ...


Smartling does so by default for XML files, but it can be disabled by the project administrator:
https://help.smartling.com/hc/en-us/articles/360008000893-XML

Smartling does support DOCX, but I don't know what Smartling does to non-breaking spaces in DOCX files (the help files don't say). The work I usually do in Smartling is localisation files, so I imagine special rules might apply to such types of text, but I can't be certain what the actual file types are.

I confirm that I have seen non-breaking spaces displayed as an "sp" tag in XTM, but I don't know for which file formats those were. I was not able to find an XTM user manual page on this.


 

tz7
United States
Local time: 05:17
English to Japanese
TOPIC STARTER
RE: XTM Apr 3

Stepan Konev wrote:

Why “except XTM”?
Both XTM and Smartling insert a special tag for nbsp, unlike Memsource, memoQ and Trados that use regular character (“invisible” degree sign).


Yes, by default, XTM represents a non-breaking space as an inline tag in red. If you want, as I do in Japanese, you can remove them from your translation. The issue was that XTM automatically changes a non-breaking space ( ) in XML to &nbsp;, and this broke our build process, because &nbsp; is illegal in XML. The solution XTM provided was to represent all the entities as-is. So translators see "    <RecipientStatuses>" instead of "{sp}{sp}{sp}{sp}<RecipientStatuses>" when translating. Well... at least we could get valid target XML files. But, as expected, translators complained. Another solution offered was to represent all entities, including the non-breaking space, as inline tags, which you can't remove from your translation. We couldn't accept the second solution. So currently, translators are seeing "    <RecipientStatuses>".
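For anyone who wants to reproduce the build problem, a minimal Python sketch follows. It shows that a standard XML parser rejects &nbsp; when no DTD declares it, while the numeric reference &#160; is accepted. The helper names `is_well_formed` and `fix_nbsp` and the `<msg>` element are invented for this example, and the string replacement is just one possible post-processing workaround, not the fix XTM provided.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def fix_nbsp(xml_text: str) -> str:
    """Replace the undeclared &nbsp; entity with a numeric reference."""
    return xml_text.replace("&nbsp;", "&#160;")


broken = "<msg>a&nbsp;b</msg>"   # &nbsp; is not predefined in XML
repaired = fix_nbsp(broken)      # "<msg>a&#160;b</msg>"

print(is_well_formed(broken))    # False: undefined entity
print(is_well_formed(repaired))  # True
```

XML predefines only five entities (&amp;, &lt;, &gt;, &quot; and &apos;), which is why &nbsp; fails unless a DTD declares it.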

I am working as a translator for Japanese and a PM for other languages. I have access to both source and target files. We don't translate any Word files in XTM.

Thank you everyone for your comments!

