The neural network that went live in Yandex.Translate on September 14 makes machine translations more natural.


With the neural network, the translation looks much more natural.

Neural networks sometimes struggle with one-word sentences and rare words. For such cases, the old model continues to run in Yandex.Translate alongside the neural network.

Absurd machine translations are becoming a thing of the past. On September 14, Yandex.Translate began using a neural network in its work. Neural networks are known to cope well with many kinds of natural data, text in particular.

A machine can be taught to translate in different ways. For example, you can give it a dictionary and a grammar of a particular language; the result is machine translation based on linguistic rules. Another option is to let the machine compare a large number of original texts with their translations, so that it establishes the correspondences between words of different languages on its own. In that case it learns statistical machine translation.

In the statistical approach, the words or phrases in a sentence are translated independently of each other, and the most likely translations are then assembled into one. The undoubted advantage of statistical machine translation is that it handles rare and complex words well. But there is a serious drawback: the translations often come out unnatural, with fragments that do not fit together.
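The fragment-by-fragment idea can be shown with a deliberately simplified Python sketch. The phrase table, its probabilities, and the function name `translate_statistical` are all invented for illustration; a production statistical system is far more elaborate and is not described in this article.

```python
# Toy phrase table: each English phrase maps to candidate Russian
# translations with made-up probabilities (illustration only).
PHRASE_TABLE = {
    "the cat": [("кот", 0.6), ("кошка", 0.4)],    # masculine / feminine noun
    "sat": [("сидела", 0.55), ("сидел", 0.45)],   # feminine / masculine verb
}


def translate_statistical(phrases):
    """Translate each phrase on its own, then join the most likely options."""
    best = []
    for phrase in phrases:
        candidates = PHRASE_TABLE[phrase]
        # Pick the most probable option without looking at the neighbours.
        translation, _prob = max(candidates, key=lambda c: c[1])
        best.append(translation)
    return " ".join(best)


print(translate_statistical(["the cat", "sat"]))
```

Because each phrase is chosen in isolation, the sketch pairs a masculine noun with a feminine verb form, which is the kind of mismatch between fragments described above.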

Yandex.Translate launched in 2011 and until now relied solely on the statistical approach. The neural network also learns from statistics: it analyzes a corpus of parallel texts for patterns. Its approach to translation is different, though: sentences are not split into pieces but processed as a whole. This lets the neural network “understand” context. It captures the meaning even when the words that convey it sit in different parts of the sentence. As a result, the translation comes out fluent and natural. The neural network's weak spot is exactly where the statistical model is strong: uncommon names, place names, and other rare words. The two approaches therefore complement each other well.

After the user submits a translation request, the system produces two versions: one with the old method and one with the neural network. The sentence translated by the neural network then passes through a language model (the information the system has accumulated about the phrases that exist in a language and how often they are used) to avoid grammatical errors such as “daddy go” or “severe pain” (in the Russian original, both examples are agreement errors). Next, an algorithm based on the CatBoost machine learning method compares the neural network's translation with the one produced by the statistical model and shows the user the better of the two. (The comparison uses a variety of criteria, from sentence length, since short sentences are translated better by the statistical model, to syntax.)
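The selection step can be pictured with a small Python sketch. It is not Yandex's implementation: the hand-written rule stands in for the trained CatBoost model, the bigram counts stand in for the language model, and the names `lm_score`, `choose_translation`, and `BIGRAM_COUNTS` are hypothetical.

```python
# Toy "language model": how often a pair of adjacent words was seen.
# The counts are invented for illustration.
BIGRAM_COUNTS = {
    ("кот", "сидел"): 12,
    ("кошка", "сидела"): 9,
}


def lm_score(sentence, bigram_counts):
    """Average count of adjacent word pairs; higher means more familiar phrasing."""
    words = sentence.lower().split()
    if len(words) < 2:
        return 0.0
    pairs = zip(words, words[1:])
    return sum(bigram_counts.get(pair, 0) for pair in pairs) / (len(words) - 1)


def choose_translation(source, statistical, neural, bigram_counts):
    """Hand-written stand-in for the learned selector; returns (label, text)."""
    # Criterion 1: one-word inputs are left to the statistical model.
    if len(source.split()) <= 1:
        return "statistical", statistical
    # Criterion 2: otherwise keep the candidate whose phrasing the toy
    # language model finds more familiar; ties go to the neural output.
    if lm_score(neural, bigram_counts) >= lm_score(statistical, bigram_counts):
        return "neural", neural
    return "statistical", statistical


print(choose_translation("the cat sat", "кот сидела", "кот сидел", BIGRAM_COUNTS))
print(choose_translation("Pskov", "Псков", "Пскове", BIGRAM_COUNTS))
```

In a real system the two hard-coded criteria would be replaced by many features fed to a trained classifier; the sketch only shows the overall shape of generating two candidates and picking one.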

For now, this hybrid system works only for translation from English into Russian, but Yandex promises to launch it for other languages in the coming months. For the curious, the developers have added a switch to the web version of Translate that lets you compare the hybrid translation with the statistical one.