Coded fiction.

Saturday 11 August 2007

Screenshot anime Lucky*Star

Pic of the day: Of course, there are some things in foreign languages that man was never meant to know. "Yaoi" is one of them.

Imaginary 2011

Continuing the series of imaginary journal entries that I started on Wednesday, each set one year further into the future. This is pure fiction, based on how I imagine the near future to be.

Fiction starts here

11. August 2011 - New Google translator

I'm impressed. Actually, I'm in awe. This is because I know a few languages myself, and know specifically how difficult it is to translate between languages that were separated at birth, more or less, whose speakers parted ways as soon as mankind migrated out of Africa during the Ice Age. Since sometime in 2007 I have been trying on and off to learn Japanese, a language I was exposed to almost every day through Japanese animation. (This too has faded somewhat lately, but I still watch some pretty much every week at least.) After all these years, I would still be hard pressed to translate more than the simplest statements from Japanese to English (or to my mother tongue, New Norwegian), much less the other way.

I believe it was in the late 1990s that I came across the Babelfish translator site, which was somehow related to the Altavista search engine. It did a passable job translating from Spanish to English (and probably the other way around), although it left out the words that were the most difficult to guess, which were of course the words you most needed it to translate. It also made some bad guesses, so it became a bit of an internet sport to translate a text into Spanish and back for laughs. With German the unknown words became more plentiful. Try Japanese or Korean, and all you got was a mishmash of words which might or might not actually have been part of the text. I find it hard to believe that anyone could make sense of it unless they already had a good idea of what it ought to say. This was the gold standard of translation for the next decade as well. The Google translator was indistinguishable from Babelfish, probably because it used the same translation engine.

Well, that was then and this is now. The new Translator uses a completely different engine, one that has admittedly been in the works for five years or so. It is fed with an enormous database of real text in the respective languages (and who has a larger such database than Google?) and builds a deep semantic structure from this. When it has identified a phrase, it will not just spew out the corresponding word or two, but continue to build a web of meaning that includes the other phrases around it. This pattern is then used to search for a matching pattern in the target language. If the text already exists in the target language (say, a quote from a well-known book), the existing translation is used. If no exact correspondence is found, the program tries smaller and smaller segments until it finds a match. When assembling parts of text, it will remember context, so you don't get translations like "fu*ing ugly countenance" unless there is a similar style breach in the source text. Probably not even then.
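For the technically curious, the "try smaller segments until it reaches a match" part can be sketched in a few lines of Python. This is only a toy under my own assumptions: the function name translate_by_segments and the tiny English-to-Spanish memory are made up for illustration, and it does nothing of the semantic web-building or context tracking described above.

```python
def translate_by_segments(text, memory):
    """Greedy longest-match lookup in a (hypothetical) translation memory.

    Tries the longest remaining run of words first and falls back to
    shorter runs; words with no match at all are passed through as-is.
    """
    words = text.split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest segment starting at position i, then shrink it.
        for j in range(len(words), i, -1):
            segment = " ".join(words[i:j])
            if segment in memory:
                output.append(memory[segment])
                i = j
                break
        else:
            # Not even the single word is known: keep it untranslated.
            output.append(words[i])
            i += 1
    return " ".join(output)


# Made-up toy memory, standing in for the enormous database.
toy_memory = {
    "good night": "buenas noches",
    "my friends": "mis amigos",
    "good": "bueno",
}

print(translate_by_segments("good night my friends", toy_memory))
```

With this toy memory the whole sentence is not found, so the lookup falls back to "good night" and "my friends" as separate segments. A word it has never seen is simply left in place, which is roughly what Babelfish used to do.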

By its very nature this Translator does away with the Babelfish Game. If you translate back and forth, it still remembers the original source text, so you'll get it back unmodified (or with spelling errors fixed if it is a quote). But more importantly, it provides amazingly lifelike and natural-sounding translations of everyday text in the most remote languages, as long as they have an extensive literature available online. Japanese blogs look like they were written in English. Well, except for the graphics; these are still not translated. It would not surprise me if that comes one day, but probably not for another five or ten years. Unfortunately my native Norwegian is not supported either, but I assume it will be sometime, since this is Google. At that time, my relatives will finally be able to read my archives in their own language. Whether they find it worth the time is another matter. Probably not, although perhaps in some remote future some genealogy-obsessed Itland may give it a go.
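The round-trip trick is easy to imagine as a small wrapper around any translation engine. Again, this is a guess at the idea and not a description of anything real: the class RoundTripMemory and the deliberately lossy stand-in engine are both invented for this sketch.

```python
class RoundTripMemory:
    """Wraps a translation function and remembers where each output came from."""

    def __init__(self, translate_fn):
        self._translate = translate_fn
        # Maps (language, text) of an output to (language, text) of its source.
        self._sources = {}

    def translate(self, text, source_lang, target_lang):
        # If this text is one of our own earlier outputs and is being sent
        # back to the language it came from, return the original unmodified.
        origin = self._sources.get((source_lang, text))
        if origin is not None and origin[0] == target_lang:
            return origin[1]
        result = self._translate(text, source_lang, target_lang)
        self._sources[(target_lang, result)] = (source_lang, text)
        return result


def lossy_engine(text, source_lang, target_lang):
    # Stand-in engine that destroys information (letter case) on each pass.
    return text.upper() if target_lang == "es" else text.lower()


translator = RoundTripMemory(lossy_engine)
there = translator.translate("Hello World", "en", "es")
back = translator.translate(there, "es", "en")
print(back)
```

Without the wrapper, the stand-in engine would hand back "hello world"; with it, the original "Hello World" returns intact, and the Babelfish Game is over.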

More important, the new Google Translator should promote world peace more than any invention in history. Take the current crisis between China and the USA. Now that Americans can read the blogs of ordinary Chinese with their everyday joys and sorrows, it won't be easy to think of them as "yellow devils" who should just be nuked. It will probably not be quite as easy for ordinary Chinese to read uncensored American web sites, but this is the Internet after all. It was made to route around broken connections. From its perspective, censorship is just another technical difficulty to route around.

A quote from the Bible comes to mind, Genesis 11:6: The LORD said, "If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them"… Disturbingly, God did not consider this a good thing at the time. Now we are on our way to the same spot again. Have we learned our lesson since then? (OK, so it is a myth, but the point still stands. Have we learned from history, so that we can use the power of a common language for good rather than for evil? There is nothing less civil than a civil war, after all.)





I welcome e-mail. My handle is "itlandm" and I now use gmail.com.