Category Archives: Machine Translation

Professional Translation Software for just $6.95 per month.

Navigating the world of translation software can be time consuming and confusing. Luckily, things just got a lot easier (and more affordable).

Whether you’ve been using professional translation software for years, or you’re trying it out for the first time, you’ll find that it can help you translate faster and more efficiently than ever before.

To start working with professional translation software for just $6.95 per month*, follow the three easy steps below:


Machine translation is to translation as…

Clipart is to art
Rap is to poetry
Facebook friends are to friends
Diet pills are to dieting
Automated speech recognition is to listening
A picture of the Great Wall of China is to visiting the Great Wall of China
Apple’s Siri voice is to Morgan Freeman’s voice

The Virtues of MT

I recently ran across an article extolling the virtues of Google MT – http://www.independent.co.uk/life-style/gadgets-and-tech/features/how-google-translate-works-2353594.html. While I agree with many of the ideas in the article, a few of the points, and the whole tone of the piece, seemed out of line with reality. First, I agree with the idea that MT should focus on statistics more than on extracting meaning…at least for now. But let's at least concede that this is fundamentally different from what we do as humans. I do believe in statistical theory, and in my linguistics background I have studied the role that statistics plays in human language, but I do NOT believe that word sequences and alignment statistics are the only characteristics determining the acceptability of a sentence. I DO look at meaning. So given this fundamental difference in processing, we have to assume the introduction of errors. While the article praises the virtues of statistics-based processing, let's temper our enthusiasm: statistics are only part of the puzzle, and probably not the most important part for real, fully automated, high-quality machine translation.
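To make that contrast concrete, here is a minimal sketch of the purely statistical view – a toy bigram language model that scores sentences by word-sequence frequency alone. The corpus counts are invented for illustration; the point is that such a model rewards familiar word order, not correct meaning:

```python
from math import log

# Toy bigram and unigram counts from an imaginary corpus (invented numbers).
bigram_counts = {
    ("the", "cat"): 50, ("cat", "sat"): 30, ("sat", "on"): 40,
    ("on", "the"): 60, ("the", "mat"): 20,
}
unigram_counts = {"the": 200, "cat": 60, "sat": 55, "on": 70, "mat": 25}

def bigram_log_prob(sentence):
    """Score a sentence purely by word-sequence statistics (add-one smoothing)."""
    words = sentence.lower().split()
    vocab = len(unigram_counts)
    score = 0.0
    for prev, cur in zip(words, words[1:]):
        count = bigram_counts.get((prev, cur), 0)
        score += log((count + 1) / (unigram_counts.get(prev, 0) + vocab))
    return score

# A fluent word order outscores a scrambled one...
print(bigram_log_prob("the cat sat on the mat") >
      bigram_log_prob("mat the on sat cat the"))  # True
# ...but the model has no notion of whether either sentence means anything.
```

The model happily assigns its scores without ever consulting meaning – which is exactly the gap between sequence statistics and what a human translator does.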
Which brings me to my next point. The most inexplicable part of the article is where the author, David Bellos, argues that human translation errors are usually more dangerous than MT errors. I'm completely lost on the reasoning here. He says that when Google MT makes an error, it's obvious, but when human translators make an error, it's not, so human translation errors are more dangerous. Analogously then, as a business owner, the data entry guy who spews junk 10 to 20% of the time is preferable, because I know I can ignore his garbage, versus the guy who makes an error once every 100 or 200 entries. What? Isn't the reason I'm paying for data entry (analogously, translation) that I WANT to understand, not disregard, the output? I'm baffled. The only reason human errors would be more dangerous is that the output is actually useful – and that's kinda the idea.
Finally, the title of the article is “How Google Translate Works”. While I understand that he's probably trying to write for a broader crowd, it doesn't really go into any technical detail beyond “it uses human translations” and “it uses statistics”. No equations, no specifics. And then he assumes that because our desires and needs are the same, the premise of everything-has-been-said-before MT should work. Once again, I don't outright disagree, but linguistic nuances go a little deeper than that.
Whether or not you like Google, the virtues of a good MT engine (which Google’s system is) are numerous. Many would say that we’re just beginning to tap the MT-as-a-professional-translation-resource well, but let’s stay grounded here in reality.

 

MT and reality

The question of MT post-editing for professional translation work is pretty much unavoidable nowadays. It's also a powder keg for some people. For some, MT is a godsend, with the potential to speed up translation like nothing before. For others, it's a crude hack that should be avoided at all costs. I think most would agree that, just like MT in general, it really depends on the language pair and the subject matter in question. The debates surrounding this question get quite heated in many cases, but what I've noticed over the past couple of years is a surprising lack of data to back up any claims. Sure, there's anecdotal evidence all over – “I use it and my translations are better than ever,” vs. “MT slows me down and messes me up ALWAYS.” But where's the data?
In search for an answer to this question, I stumbled upon this article:
http://www.mt-archive.info/AMTA-2010-VanEss-Dykema.pdf. It appears that someone is actually looking into this empirically. The difficulty in this empirical approach, of course, lies largely in translation metrics – how do you really determine translation quality? There's no question that using MT and doing post-editing will give you different results from those you would get if a translator did it the old-fashioned way, but how do you determine the quality of the one versus the other? And when does a speed increase trump linguistic accuracy?
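One common empirical answer to the quality question is to score output by n-gram overlap with a human reference translation, as BLEU-style metrics do. Here is a minimal sketch of that idea – just clipped n-gram precision, with no brevity penalty or multi-reference support, and with invented example sentences:

```python
from collections import Counter

def ngrams(words, n):
    """All contiguous n-grams of a token list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision, the core of BLEU-style MT metrics."""
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    # Each candidate n-gram counts only up to its frequency in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

reference = "the treaty was signed in 1990"
exact = "the treaty was signed in 1990"
rough = "treaty signing happened in 1990"
print(modified_precision(exact, reference, 2))  # 1.0
print(modified_precision(rough, reference, 2))  # 0.25 (only "in 1990" matches)
```

Such metrics are cheap and repeatable, which is why empirical studies lean on them – but they measure surface overlap, not meaning, which is exactly why the quality question remains hard.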
The article mentioned above is a case study proposal from last year that doesn't seem to have had results published yet. I'll see if I can get hold of any preliminary findings, but it's encouraging to me that in the near future this question won't just be he-said-she-said. Without the data, I can see both sides of the argument – if post-editing of MT gives you quick, intelligible results, then I can see many cases where it would be very useful. On the other hand, MT doesn't think. If you are familiar with the Vauquois Triangle (an often-used graphic demonstrating the different levels at which meaning transfer can occur in MT), anything that doesn't use an interlingua to translate is not working at the level of a human, and thus probably misses nuances that would then be left out of an MT post-editing process. And while MT can stay true to word-level, syntactic, and some semantic constructs through statistical leveraging of enough data and some rules, translators understandably balk at numbers and rules governing something both innately artistic and cerebral.
So until we have the data, live and let live. If professional translators are using MT to increase their speed without compromising quality, then great. And if you are one of those that believe for whatever reason that MT is never a viable solution for professional translation, then ponder and paint away.

 

In Context

First of all, thank you to all of you for bearing with us last week. We were overwhelmed by the amount of interest our ProZ ad generated and have had about 1,700 people download our trial! Please, if you have questions or need support, don't hesitate to contact us.
Last week I was looking around on my alma mater's website and saw an article about context and translation that had recently been published by an old professor of mine and family acquaintance, Alan Melby. He has done extensive work in the translation field and serves on the American Translators Association Board of Directors. Needless to say, I thought it would be worth my time to read the article. It was an interesting read. If you want the full text, it's available here: http://www.trans-int.org/index.php/transint/article/viewFile/87/70
The point of the article is that there are five aspects of context: co-text, chron-text, rel-text, bi-text, and non-text. If you want an in-depth explanation of these, I refer you to the paper. In short, co-text is the text surrounding the word or phrase. This is the definition most of us are familiar with. Chron-text is the context of the different versions of the document in which the word/phrase is found. Rel-text is the related information available in other resources. Bi-text is the context available from translation memories. Finally, non-text is what Melby refers to as ‘paralinguistic information’: the context of a word/phrase in the language and culture as a whole.
There are many interesting lines of discussion this could generate, but for the sake of brevity, I'll just pick two. First is machine translation. As I said a few posts ago, my background is in computational linguistics, and as such I'm familiar with the algorithms used in machine translation. Most good MT systems now are largely statistical, with some hybrid systems making use of linguistic rules as well. In either case, one of the big questions is “what is context?” The problem that Melby's definition of context poses for MT is two-fold. With the first four “-texts,” how far is far enough? What portion of the surrounding text needs to be looked at? And the last “-text” is currently intractable for MT. MT doesn't know culture. Given Melby's definition of context, it's easy to see why MT struggles sometimes and why, despite the care given to sophisticated statistical algorithms, MT will never replace the human – unless it can understand culture and “paralinguistic information”.
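The “how far is far enough?” question shows up directly in code: a statistical system has to pick some fixed co-text window around each word, and whatever falls outside that window is simply invisible to it. A toy sketch (the sentence and window sizes are invented for illustration):

```python
def cotext_window(tokens, index, radius):
    """Return the co-text a system 'sees' around tokens[index]."""
    return tokens[max(0, index - radius): index + radius + 1]

sentence = ("the bank approved the loan after the river "
            "bank flooded last spring").split()

# For the first "bank" (position 1), a narrow window misses the financial cue:
print(cotext_window(sentence, 1, 1))  # ['the', 'bank', 'approved']
# For the second "bank" (position 8), a wider window captures "river":
print(cotext_window(sentence, 8, 2))  # ['the', 'river', 'bank', 'flooded', 'last']
```

Widening the radius helps with co-text, but no radius, however large, reaches the cultural and paralinguistic knowledge of Melby's non-text.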
Which brings us to the next line of discussion – how do you as a translator cope with all the variations of context for a translation in the limited time you are given? This is where the TEnT comes in handy. Without a tool, you don't have the time or the resources to look at everything that could inform a translation. Even with a tool, you won't have time. But with a tool, you should be able to quickly figure out which context you most need to reference and do your research. This is where one of the strengths of Fluency is manifest – you have quick, easy access to lots of resources: online resources, integrated terminology, TMs, etc. Online resources give you access to rel-text and non-text, to a degree, without wasting your time. All you need to do is highlight the phrase in the source window (or just hover over a particular word), click on the appropriate tab, and off you go. You also get an intuitive approach to co-text, where each segment is still part of a whole document, not cells in a table, each an island, entire of itself. Not to mention any images and formatting that could give you further “paralinguistic info” on the document.
Anyway, I recommend you look at Melby’s article. It’s an interesting exploration of one of the main challenges that face you as translators.