Originally published in CBA PracticeLink
In the translation of law documents, technology has found a niche, but don’t expect computers to replace translators any time soon.
Imagine what it’s like to translate large volumes of other people’s writing from English to French or from French to English. Once you’ve done that, you’ll understand why professional translators work with translation assistance software.
Translation software tools fall into two broad categories. In the first, software uses rules to translate text. But translation is very difficult to codify in a set of rules.
So translation software developers discarded rules in favour of statistical machine translation, a process akin to training voice recognition software. Today’s translators feed both source texts and correct translations into “translation assistance” software to build up databases of acceptable translation units, that is, pairs of aligned source- and target-language text segments.
The system receives source text from the translator, scans the database for potential matches for each source segment, and produces a list of those matches. The translator verifies the “pre-translation” and otherwise cleans up the document before handing it to the lawyer. Ultimately, a copy of the final translation is fed back into the database.
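The lookup step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any commercial product: the TranslationUnit type, the sample segments, the find_matches function and the 0.6 similarity threshold are all invented for the example, and the standard library's difflib.SequenceMatcher stands in for whatever matching technique a real system uses.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class TranslationUnit:
    source: str  # segment in the source language
    target: str  # previously approved translation of that segment


# Hypothetical database of approved translation units (sample data only).
MEMORY = [
    TranslationUnit("The appeal is dismissed.", "L'appel est rejeté."),
    TranslationUnit("The appeal is allowed.", "L'appel est accueilli."),
    TranslationUnit("The motion is granted.", "La requête est accueillie."),
]


def find_matches(segment, memory, threshold=0.6):
    """Return (score, unit) pairs whose source text resembles `segment`, best first."""
    scored = []
    for unit in memory:
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
        score = SequenceMatcher(None, segment.lower(), unit.source.lower()).ratio()
        if score >= threshold:
            scored.append((score, unit))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# The translator sees a ranked list of candidates, not a finished translation.
matches = find_matches("The appeal is dismissed with costs.", MEMORY)
for score, unit in matches:
    print(f"{score:.2f}  {unit.source}  ->  {unit.target}")
```

A partial match like the one above still needs a human: the stored unit says nothing about "with costs", which is exactly the kind of gap the translator closes before the document goes to the lawyer.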
These systems are voracious learners. “You need a very large volume of data to properly train the software, not just 100 pages or so,” says Dr. Atefeh Farzindar, president of translation-system developer NLP Technologies Inc. and an adjunct professor at the University of Montreal.
Farzindar cites a recent project, done for the Federal Courts of Canada, as an example. It was successful, in part, because the system had been developed using “ten years’ worth of judgments.”
Translation-assistance software is also being used at some large firms, such as Fasken Martineau LLP. “We feed every translation job we do into a system,” says translation director Yannick Pourbaix, of the firm’s Montreal office. His team of 20 translators has done so since they first began using the system three years ago. He notes the database was “well-fed” when it arrived, so it proved useful from day one.
If they choose to do so, translators can further stack databases by having their systems “feed on” text from reliably translated web pages, like those published by Canada’s federal government.
Legal translators shy away from free web-based translation tools in favour of specialized systems. “We use the same machinery,” Farzindar explains, “but in a specific domain.”
“(Web-based tools are) good enough to get just enough meaning when you surf the web,” she continues, “but they’re trained on a general web corpus.” Translation: the more domains you include in a database, the less likely translations will meet domain-specific needs. “It leaves too much cleanup at the end,” Farzindar says.
So, legal translators work to keep chaff from tainting the databases, feeding them carefully to increase their usefulness. Fasken Martineau translator Catherine Duhamel-Gendron says although the database used at the firm is not infallible, “the signal-to-noise ratio is better than it is on the web.”
“When a client tells you they have taken text from three or four other previously translated documents,” Pourbaix explains, “we can feed those documents to the database prior to translating a new document.”
Certain translators create silos in their translation systems for specific areas of law. “If a word in English can be translated three different ways in French, the system can sometimes differentiate based on the domain,” says Duhamel-Gendron.
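A rough sketch of how such silos might behave, again in Python. Everything here is assumed for illustration: the SILOS structure, the lookup_term function, the "general" fallback and the sample entries, which are placeholders rather than vetted legal terminology.

```python
# Hypothetical silos: one small term base per area of law, plus a general one.
# The entries are illustrative placeholders, not vetted terminology.
SILOS = {
    "contracts": {"consideration": "contrepartie"},
    "general": {"consideration": "considération"},
}


def lookup_term(term, domain, silos=SILOS, fallback="general"):
    """Prefer the translation stored for `domain`; fall back to the general silo."""
    key = term.lower()
    for name in (domain, fallback):
        translation = silos.get(name, {}).get(key)
        if translation is not None:
            return translation
    return None  # nothing on file; the translator decides


print(lookup_term("consideration", "contracts"))
print(lookup_term("consideration", "tax"))
```

The same English word comes back differently depending on which silo answers, and a domain with no silo of its own quietly falls back to the general entry, which is one reason the result still has to be checked.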
But even a carefully fed database can generate mistakes. “We must always use terminology drawn from the law,” says Pourbaix.
“Computers don’t always provide accurate translations of technical terms and they can assign wrong meanings to words or to sentences,” says Sofia Aguilar, patent prosecution administrator for Osler, Hoskin & Harcourt LLP.
Pourbaix says that while the system speeds up work by anywhere from zero to 50 per cent, post-translation verification remains essential.
A further complication: laws change over time, so translators must perpetually feed databases. “If you have four or five results, you often choose the most recent,” Duhamel-Gendron says.
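The "choose the most recent" habit amounts to sorting candidates by date. The Python sketch below assumes each stored translation carries the date it was added; the Candidate type, its field names and the sample entries are hypothetical.

```python
from datetime import date
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    target: str     # stored translation
    added_on: date  # when it entered the database (assumed to be recorded)


def most_recent(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the candidate added last, or None if there are no candidates."""
    return max(candidates, key=lambda c: c.added_on, default=None)


# Placeholder results for one source segment.
results = [
    Candidate("translation reflecting the older wording", date(2004, 6, 15)),
    Candidate("translation reflecting the amended wording", date(2011, 3, 1)),
    Candidate("translation reflecting an interim wording", date(2008, 9, 30)),
]

best = most_recent(results)
print(best.target, best.added_on.isoformat())
```

Recency is a heuristic, as the quote itself signals with "often": the newest entry is a sensible default, not a guarantee that it matches the law in force for the document at hand.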
And translation systems may misalign segments if they stumble over charts, tables and other non-text elements in source documents.
Translation difficulty varies in proportion to the gulf between given pairs of languages. Even advanced software can stumble over variations “within” a given language (such as Québec French versus “French” French) and the translator may need to adapt translated text according to the language use of a particular location.
Software developers continue to work towards promising improvements in translation assistance systems and other tools. When applied to chat, for instance, some of these innovations might one day allow people to type messages in their native languages and have correspondents who do not speak said languages understand them.
Similar advances could serve people who monitor foreign-language broadcasts, and might eventually lead to live voice-to-voice interpretation over the phone. “We could save money on expensive interpreters,” suggests Farzindar.
And web-based translation tools may become more reliable than they are today. Should people tag certain sites as pertinent to specific domains, free translation tools could potentially provide domain-specific, and thus more useful, translations.
But various factors will prevent translation assistance systems from supplanting people in the foreseeable future. Duhamel-Gendron offers an obvious reason: “They can’t ask the original writer about the text,” she says. “This will keep translation systems from becoming completely automated.”