Problem with Unicode translation memories
Thread poster: Johnson Sumpio (X)
Johnson Sumpio (X)
Local time: 18:38
Chinese to English
Apr 2, 2005

I am new to Wordfast. When I try to use Wordfast with a Chinese source doc loaded in MS Word, it says "The system does not support double-byte (DBCS): please use Unicode translation memories. Refer to manual." I can't find any information on how to access/install/use the "Unicode translation memories" in the manual.

I am using Wordfast 4.2 Build 43d and MS Office Word 2003 SP1 on an English WinXP Pro SP2 system. My wordprocessor can display Chinese characters.

How do I proceed? Please help. Thanks in advance.


 
Robert Tucker (X)
United Kingdom
Local time: 11:38
German to English
DBCS Apr 2, 2005

I do not use a Microsoft O/S, Word or Wordfast myself, but have read some amount about Unicode formats elsewhere. In relation to your question, in the absence of more experienced advice, take a look at:

Q) What is a double byte character set (DBCS)?

at:
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,3954,00.html#dbcs

16-bit languages (Chinese, Japanese, Korean)

at:

http://64.233.183.104/search?q=cache:9-fmw23nmWIJ:www.astti.ch/vault/wordfast/wordfast.doc%20double-byte%20wordfast&hl=en



[Edited at 2005-04-02 20:02]

[Edited at 2005-04-02 20:08]


 
Sonja Tomaskovic (X)
Germany
Local time: 12:38
English to German
Wordfast Unicode TM? Apr 2, 2005

Hi,

I'm not sure I understand your problem, so please bear with me if this is not what you are looking for.

Wordfast has two options for its internal TMs: save the TM as a normal txt or save it as Unicode txt. For double-byte characters the Unicode TM is mandatory, if I understand that one correctly.

When you create a new WF TM, save it as "Encoded txt". This is an option that can be chosen from the file type dropdown list in Word.

HTH.

Sonja


 
Johnson Sumpio (X)
Local time: 18:38
Chinese to English
TOPIC STARTER
Where is Unicode Text format Apr 4, 2005

Thanks for the replies.

Sonja - if I got your idea correctly - yes, that's what I've been trying to do: create and save a new TM in Unicode, but I can't.

In MS Word (alone), the \File\Save As only offers the usual formats. I don't see any "Encoded txt." If I call out Wordfast and try to create a new TM (choosing TMX, TMW, or Unicode), Wordfast tells me to save it in "Unicode Text format" all right, but where do I specify that? The pull-down file save menu inside Wordfast offers the same formats as does Word; I don't see any Unicode-whatever format.


 
Piotr Bienkowski
Poland
Local time: 12:38
English to Polish
Choose "Plain Text" and a dialog should pop up Apr 4, 2005

mospeada wrote:

Thanks for the replies.

Sonja - if I got your idea correctly - yes, that's what I've been trying to do: create and save a new TM in Unicode, but I can't.

In MS Word (alone), the \File\Save As only offers the usual formats. I don't see any "Encoded txt." If I call out Wordfast and try to create a new TM (choosing TMX, TMW, or Unicode), Wordfast tells me to save it in "Unicode Text format" all right, but where do I specify that? The pull-down file save menu inside Wordfast offers the same formats as does Word; I don't see any Unicode-whatever format.



Hi,

If your Save As dialog in Word does not have Encoded Text or Unicode Text in the "Save file as type" list, then choose Plain Text and a dialog should pop up where you can choose the encoding. Choose Unicode (UTF-16) from the available list of encodings.

HTH

Piotr


 
Johnson Sumpio (X)
Local time: 18:38
Chinese to English
TOPIC STARTER
Plain text Apr 4, 2005

I did. I - of course - tried saving in almost all the formats. None of them worked. The message about the Unicode translation memories kept coming up.

So, I downloaded and installed Wordfast ver. 5.0z. Created a new TM and saved it in Plain Text. No pop-up list for choosing encoding (both for versions 4 & 5) BUT no message about Unicode translation memories. Seems to be working now... ?


 
Johnson Sumpio (X)
Local time: 18:38
Chinese to English
TOPIC STARTER
Saving in encoded Plain Text Apr 5, 2005

The message about Unicode translation memories didn't come up the last time, so I thought the problem was solved. Wrong.

Although WF Ver.5.0z no longer gives me the message (as did Ver.4.2), it is unable to create a TM in encoded plain text. I can see in TM Edit that the Chinese source lang in the plain-text TM is all in garbage characters. Naturally, WF cannot function as it should.
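One way Chinese text can end up as garbage is when it is written through a single-byte Western code page. The sketch below is only an illustration in Python, not a reproduction of what Wordfast or Word does internally; the sample string and the choice of cp1252 are assumptions made for the demo.

```python
# Illustration only: the sample text and the cp1252 code page are assumptions.
source = "翻译记忆"  # four Chinese characters

# cp1252 (Western European) has no Chinese characters, so each one is
# replaced by "?" and the original text cannot be recovered afterwards.
lossy = source.encode("cp1252", errors="replace")

# A Unicode encoding keeps every character, so the round trip is exact.
unicode_bytes = source.encode("utf-16")
round_trip = unicode_bytes.decode("utf-16")

print(lossy)                 # the damaged bytes
print(round_trip == source)  # whether the Unicode round trip is lossless
```

Once the characters have been replaced this way, re-saving the file in another encoding does not bring them back; the text has to be entered again or taken from an undamaged copy.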

Based on the information I gathered, Word 2003 no longer offers to save encoded plain text in its "Save As Type" menu as did the earlier version.

I am still experimenting with Word 2003. For instance, I open a blank doc and want to save it. If the default choice in the Save As Type is in Word DOC and I use the pull-down menu to choose Plain Text, and press Save, only then would Word pop a small window asking me if I want to encode in Windows (Default), MS-DOS, or Other Encoding. Here I can choose Other Encoding -> Unicode. I can add Chinese characters in this encoded plain-text file later, and read them with even the simple Notepad.
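A quick way to check whether a saved plain-text file really came out as Unicode is to look at its first bytes for a byte-order mark (BOM). The helper below is a hypothetical sketch (the name sniff_bom is mine, not part of Word or Wordfast), and it assumes the program wrote a BOM, which Notepad does for its "Unicode" option.

```python
import codecs

def sniff_bom(data):
    """Return a codec name if the bytes start with a known BOM, else None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Check UTF-32 before UTF-16: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None  # no BOM: a legacy code page, or BOM-less Unicode

print(sniff_bom("中文".encode("utf-16")))  # bytes written with a BOM
print(sniff_bom("中文".encode("gbk")))     # legacy double-byte bytes, no BOM
```

To test a real file, read its first few bytes in binary mode, for example open(path, "rb").read(4), and pass them to the function.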

On the other hand, if I create a new TM in WF and go through the whole process, at the end the default Save As Type choice is already Plain Text. If I press Save, WF just creates a TM in non-encoded plain text without complaint. Word does not pop up a window asking for the encoding type. I guess herein lies my problem. It seems WF is not making Word 2003 pop up the small window asking for the encoding method, or whatever the issue is.

The above are my observations. I wonder if other people with the Word 2003 and WF V5.0z combination have the same problem.


 
Robert Tucker (X)
United Kingdom
Local time: 11:38
German to English
DBCS/Unicode Apr 5, 2005

I lost the ability to edit my first post and have notified “support”. My two references were:

http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,3954,00.html#dbcs

http://64.233.183.104/search?q=cache:9-fmw23nmWIJ:www.astti.ch/vault/wordfast/wordfast.doc%20double-byte%20wordfast&hl=en

My reason for re-posting is that it is not evident to me from the above postings that the significance of double-byte is understood. It means that each CJK character is represented by two 8-bit bytes rather than one. Thus one would expect saving as UTF-8 (built from 8-bit units, so most CJK characters take three bytes) to be less direct, while saving as UTF-16 (built from 16-bit units, so most CJK characters take exactly two bytes) should be easier. (Piotr, of course, suggests saving as UTF-16.)

Since your system is not CJK and is post Windows 95/98/NT4, double-byte mode is not available and Unicode must be used. Whether it saves as UTF-8 or UTF-16, it will need to know it is working with/saving double-byte characters.
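To put numbers on the byte counts discussed here, a small Python check can encode one common Chinese character and one rare one in each encoding. The characters chosen and the use of GBK as the example legacy double-byte code page are my assumptions for illustration.

```python
def byte_lengths(ch):
    """Number of bytes one character occupies in UTF-8 and UTF-16 (no BOM)."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le")}

common = byte_lengths("中")          # U+4E2D, an everyday character
rare = byte_lengths("\U00020000")    # U+20000, outside the 16-bit range
legacy = len("中".encode("gbk"))     # GBK, a legacy double-byte code page

print(common)  # {'utf-8': 3, 'utf-16-le': 2}
print(rare)    # {'utf-8': 4, 'utf-16-le': 4}
print(legacy)  # 2
```

So "double-byte" matches both the legacy code page and UTF-16 for everyday characters, while UTF-8 spends three bytes on them and rare characters need four bytes in either Unicode form.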

The paragraph in my second reference:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. This mention appears if the source language ISO code begins with either ZH-, JA- or KO-. This mention is essential for Wordfast to switch to a mode compatible with Chinese, Japanese, or Korean. TMs and glossaries must be reorganised or indexed when Wordfast displays the (CJK) mention, not before.”

seems to me to be very pertinent.
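The rule in the quoted paragraph (the CJK mention appears when the source language code begins with ZH-, JA- or KO-) can be sketched as a simple prefix test. This is a hypothetical Python helper written from that sentence alone, not Wordfast's actual code; the case-insensitive handling is my assumption.

```python
# Prefixes taken from the quoted manual paragraph.
CJK_PREFIXES = ("ZH-", "JA-", "KO-")

def is_cjk_source(lang_code):
    """True if a source language code starts with ZH-, JA- or KO-."""
    # Assumption: surrounding spaces and letter case should not matter.
    return lang_code.strip().upper().startswith(CJK_PREFIXES)

print(is_cjk_source("ZH-CN"))  # True
print(is_cjk_source("EN-US"))  # False
```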


 
Johnson Sumpio (X)
Local time: 18:38
Chinese to English
TOPIC STARTER
CJK mention Apr 5, 2005

Robert Tucker wrote:

Since your system is not CJK and is post Windows 95/98/NT4 double-byte mode is not available and Unicode must be used. Whether it saves as utf-8 or utf-16 it will need to know it is working with/saving double-byte characters.

The paragraph in my second reference:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. This mention appears if the source language ISO code begins with either ZH-, JA- or KO-. This mention is essential for Wordfast to switch to a mode compatible with Chinese, Japanese, or Korean. TMs and glossaries must be reorganised or indexed when Wordfast displays the (CJK) mention, not before.”

seems to me to be very pertinent.


I tried to understand the references in your previous post, especially the part quoted here because it seems to hold the key.

It says "Wordfast's main window." Which main window? The one popping out AFTER I press the green "f" icon on the WF menu bar inside Word? If it is, then I can't find any "translation memory path and name" in all the tabs, much less a "CJK mention."

Okay, how about... am I supposed to see the "translation memory path and name" in the process of saving a new TM? When the new TM is going to be saved, WF or Word allows me to indicate the path where the memory file will be saved, but no CJK mention here either.

The reference also says "if the source language ISO code begins with ZH-" I indicate ZH-xx for source language during creation of the new TM, but I don't see any "CJK mention" anywhere in the process.

Do I see the above EVEN BEFORE, DURING, or AFTER successfully creating my FIRST-NEW-and-WORKING TM anyway?

Again, I am using Word 2003 with Wordfast ver. 5.0z on an ENG WinXP Pro system, but my wordprocessor can read CHI characters - and my Notepad can display CHI characters inside Unicode-encoded Plain Text files (saved/edited using Word alone).

I know Word 2003 has been around for years, and the WF page says WF works with Word 2003. Makes me wonder why I am having this problem even at the starting point. Don't tell me I need a native/pure CHI, JAP, or KOR OS to run WF with Word 2003?


 
Johnson Sumpio (X)
Local time: 18:38
Chinese to English
TOPIC STARTER
Finally got it! Apr 5, 2005

Robert Tucker wrote:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. This mention appears if the source language ISO code begins with either ZH-, JA- or KO-. This mention is essential for Wordfast to switch to a mode compatible with Chinese, Japanese, or Korean. TMs and glossaries must be reorganised or indexed when Wordfast displays the (CJK) mention, not before.”



I get what it means by "translation memory path and name" now, and I have found the solution to my problem.

After creating the new TM (not encoded, yet), call out TM Editor in the menu bar. Press Tools, then choose "Rewrite TM as Unicode" for Special Filters. That's it. WF will rewrite the previously created TM in Unicode format. Now, I can see the source inputs in TM Editor are CHI and not garbage anymore, and WF is working fine.
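For anyone curious what re-saving a text file as Unicode amounts to outside Wordfast, here is a rough Python sketch. It is not what "Rewrite TM as Unicode" does internally (I don't know that); the function name, the file names, the GBK source encoding and the tab-separated sample line are all assumptions for the demo.

```python
import os
import tempfile

def rewrite_as_utf16(src_path, dst_path, src_encoding):
    """Re-save a text file as UTF-16 (with BOM), given its known legacy encoding."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)
    return text

# Demo on a throwaway file; the layout of the sample line is illustrative only.
work_dir = tempfile.mkdtemp()
legacy_path = os.path.join(work_dir, "tm_legacy.txt")
unicode_path = os.path.join(work_dir, "tm_unicode.txt")
sample = "翻译记忆\tTranslation memory\r\n"
with open(legacy_path, "w", encoding="gbk", newline="") as f:
    f.write(sample)

converted = rewrite_as_utf16(legacy_path, unicode_path, "gbk")
print(converted == sample)
```

This only helps when the legacy file still holds the real bytes of the characters and the source encoding is known; text already replaced by "?" cannot be restored this way.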

I understand the solution is also mentioned in the WF ver. 5 doc but I got off track because it says "Glossary editor," and I searched for it literally (chuckle).

Thanks to all of you for your time and attention.




[Edited at 2005-04-05 17:34]


 
Robert Tucker (X)
United Kingdom
Local time: 11:38
German to English
CJK Mention Apr 5, 2005

Robert Tucker wrote:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. ...


There's:

"Current Translation Memory (Unicode) (CJK)"

followed by, presumably, a translation memory path at:

http://www.christophermayo.com/articles/2004/img/wordfast13.jpg



Christopher Mayo's Wordfast instructions:

http://www.christophermayo.com/articles/2004/wordfast.html





[Edited at 2005-04-05 16:02]


 
Johnson Sumpio (X)
Local time: 18:38
Chinese to English
TOPIC STARTER
CJK Mention in Main Window Apr 5, 2005


Robert Tucker wrote:

"Current Translation Memory (Unicode) (CJK)"

followed by, presumably, a translation memory path at:

http://www.christophermayo.com/articles/2004/img/wordfast13.jpg



Before the TM was rewritten in Unicode, there was only the Current TM in my main window; no mention of CJK like the one in the picture. After the TM was rewritten into Unicode (refer to process above), it says Current TM (Unicode) but still no mention of CJK.




[Edited at 2005-04-05 14:58]


 

