Thursday, October 27, 2016

Major Milestone for Speech Recognition Technology by Microsoft

On October 17th, 2016, Microsoft announced the most advanced speech recognition software to date. Behind the breakthrough are deep neural networks, which use large amounts of training data to teach the software to recognize patterns in inputs such as sounds. The networks run on graphics processing units (GPUs), chips originally specialized for graphics, which allow computers to process these algorithms and deliver results at speeds not previously possible.

Having reported a word error rate of 6.3% in September, Microsoft improved on it by 0.4 percentage points in just a month, lowering the rate to 5.9%. Although that error rate is far from perfect, the software reportedly recognizes words as accurately as humans who were asked to transcribe the same conversations. The test, created by the National Institute of Standards and Technology, consists of a set of telephone conversations in English, Spanish and Mandarin Chinese, and many tech giants have used it as a benchmark for speech recognition technologies since the 1990s.
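
For readers curious about what a word error rate measures, here is a minimal sketch in Python. It assumes the standard definition: the number of word substitutions, deletions and insertions needed to turn the software's transcript into the human reference transcript, divided by the number of words in the reference. The function name word_error_rate and the sample sentences are made up for illustration; this is not Microsoft's or NIST's scoring code, and real evaluations also normalize the text (casing, punctuation, filler words) before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word
# out of six reference words gives 2 / 6.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.1%}")
```

By this measure, a 5.9% error rate means roughly 6 word errors for every 100 words in the reference transcript.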

Moving forward, the team hopes to move from recognition to understanding. The next challenge for Microsoft is to make speech recognition work even in difficult conditions, such as heavy background noise and multi-party conversations.
