In March, Microsoft Research announced it had created the first machine translation system capable of translating news articles from Chinese to English with the same accuracy as a person.
Achieving human parity is a significant milestone in a field that has been evolving for many years, and one that few believed would ever be reached. But with the combined developments in AI and speech recognition, the impossible may now be a matter of course in the months to come.
Olivier Fontana, director of product strategy and marketing for Microsoft Translator in Seattle, says there have been a number of key turning points for machine translation. The advent of speech recognition in 2012 with applications such as Siri and Cortana is one of the more obvious ones.
While such applications were good for simple commands, context was often a challenge: for example, handling repeated words and punctuation, or distinguishing words that sound alike, such as ‘hear’ and ‘here’.
“At that point, we had to develop new technology that sits between speech recognition and translation to provide that context and move closer to human speech,” Fontana explains.
One of the biggest watershed moments was the switch from statistical machine translation (SMT) to neural machine translation (NMT) about one-and-a-half years ago, he says. “That was a total technology shift. It’s like converting from a gas to electric engine in a car. It’s still a car, but it works in a totally different way.”
Moving away from statistical machine translation is a huge improvement in terms of translation quality and accuracy, notes Sanjay Malhotra, co-founder and CTO of Clearbridge Mobile. “The reason for this is because it approaches translation differently.”
As he explains it, SMT processes each word individually, so translation can come out sounding very awkward and choppy with many mistakes, especially when translating a longer sentence or paragraph.
NMT, on the other hand, is more contextual and therefore more accurate, producing more natural-sounding results, Malhotra says.
“The reason for this is that it compares words in a sentence and sees which ones are related. The shift to more natural language processing allows more accurate information, especially for businesses operating globally.”
As capabilities have improved, Fontana says they have seen significant growth in demand at the API level for a wide range of industries. “In the last four years especially use has accelerated in terms of variety such as internal communications and even business intelligence for statistical analysis on a global basis. Another Canadian gaming company is exploring options to enable multilingual interaction.”
He adds that many tend to think that text and speech translation is something that will replace existing functions. “But that’s not the right way to look at it. The question really is, how can I use it for things I cannot do today? I believe 95 per cent of the applications for this capability will be brand new, such as translating Twitter feeds. There is no way humans could ever be fast enough to do that.”
Malhotra says as machine translation capabilities improve, and more languages become available, there are significant opportunities to be gained in areas such as e-commerce and global marketing, as well as communication among global workforces.
Affinio is a Halifax-based company that has recently been exploring opportunities for augmenting its data analytics services. It went through the Microsoft Ventures program and now has a solid portfolio of high-profile global marketing and advertising clients.
According to Stephen Hankinson, co-founder and CTO, it began working with the Microsoft Translator API earlier this year, using it to run reports based on data input from Twitter, Facebook, Pinterest and other social media sites. It is now proving to be a sought-after value-added service offering for its international customers.
“One problem customers have is that a lot of data shared from all over the world is not in a language they understand. We can now take that data and translate it back to what customers want so they can view and analyze it in their own language. For one customer with a parent office in Paris and a group in Spain, we are able to convert a full report in Spanish to French in minutes.”
He says while they did look at translation options in the early days of deep learning, “the results were not good enough”.
The good news is that the major players are upping their game considerably, Malhotra says. “All the big players are now focused on neural machine translation systems, and all are competing for who can offer the most advanced solution. We’re now seeing Google, Microsoft, and Amazon committing to NMT systems, which is helping to provide better results for consumers specifically as there are now dozens of languages that can be automated effectively.”
He also confirms there is a growing demand for translator APIs, and that it’s an area where he sees Microsoft catching up at a significant pace. “Microsoft is announcing a myriad of developments in NMT technology, making it more accessible and inexpensive for industries to adopt. However, there is still a lot of room for development before we see mass adoption including accuracy and the addition of many more languages.”
Fontana agrees that there is much more work to be done. But he also believes the results will be sooner rather than later. “Now the future of machine translation is simply a question of when. It’s not going to be 10 years or five for that matter. It might be as early as two or three years when we will see very interesting new possibilities for breaking the language barrier. When that happens it will change the life of many, including immigrants, travellers, people with disabilities, and businesses.”