Tuesday, March 01, 2005

Swampy weed suggests whole state order recover open trust

Tom Braman:

Has your government considered machine translation of its Web site? Or worse, are you actually doing it? My advice: Don't go there, or you could end up with a headline in the paper the way Washington's Secretary of State did today, in the Seattle Post-Intelligencer and the Seattle Times:

(Section 203 Voting Rights Coalition member) Debbie Hsu says something was lost in translation when Washington residents who speak Chinese tried to view the Secretary of State's Web site in their native language. The Web site lets visitors view the site in different languages, but the Chinese translation was apparently way off. For example, a statement about Secretary of State Sam Reed proposing "statewide mandates to restore public trust" was translated as "Swampy weed suggests whole state order recover open trust," according to Hsu and others in the Section 203 Voting Rights Coalition. ... The office pays a California company, Systran, about $6,000 a year for use of translation software that takes the English version and currently allows people to view it in Russian, Japanese, French, German, Spanish and Italian.

Machine translation, as you're probably aware, is where computers do all the work, and is estimated to be somewhere in the neighborhood of 60 to 75 percent correct. Where I work, we considered using Systran, which apparently has a corner on the machine-translation market (AltaVista's Babelfish and others lean heavily on Systran). The debate among internal staff was between giving residents a rough translation of everything on the Web site using machine translation, or instead providing residents with accurate translations (edited/reviewed by experts) of only a smattering of critical Web pages. During our test of Systran's software, we encountered many, many instances of "swampy weeds," and we were worried: What if a resident made a costly--or worse, deadly--decision based on mistranslated information? The possibilities were too numerous (even with a disclaimer), so instead we chose to use language experts to translate only specific parts of our Web site (into at least Chinese, Spanish, and Vietnamese). The City of Bellevue went through a similar process: it identified the top 200 or so pages on its Web site and translated them the same way, into Spanish.

Another weakness of the Systran solution was its lack of languages critical in our region, including Vietnamese, Tagalog, Mon-Khmer, Ukrainian, and Somali, all of which represent larger communities than Italian speakers.

The Secretary of State's Office, which is fooling itself if it thinks the problem is restricted to Chinese and Korean, should take the step that Denver, once a Systran customer, took. Two years ago, when its new Spanish-speaking Chief Information Officer took over, the city called Systran, ended the deal, and stopped using the machine-translation service.

In the end, we made a decision on machine translation that maps to Debbie Hsu's take on the Secretary of State's site:

She said, "having no translation is better than having very bad translation."

1 comment:

Anonymous said...

The Language Technology Software Review site is constantly updated with many new entries about translation software and systems:
http://www.geocities.com/langtecheval/

A new set of tips/hints/tricks/questions/answers for several Machine Translation software products is available at:
http://www.geocities.com/jeffallenpubs/MT-tips.htm

and many new entries available at my MT postediting page (http://www.geocities.com/mtpostediting/)

Jeff Allen