In my vision of the future state for enterprise applications, the sixth element is "Sensing". I use this term to capture how future applications will create new value for users by sensing relevance, context and personal preferences through analytics of voice, video, text, location, attention and other ambient and declarative data from the user. The ability to capture, store, index, search and analyze voice recordings is fundamental to this vision. Nuance, BBN, TellMe/Microsoft, Nexidia, CallMiner, Utopy, SER, IBM and others have invested in improving speech-to-text (STT), text-to-speech (TTS), automatic speech recognition (ASR) and speech analytics technologies, all of which are critical to this "sensing" end state. Recently, at the Mobile World Congress in Barcelona, Microsoft announced Microsoft Recite, a voice capture and search application for Windows Mobile devices. To get a sense of the UX and VUI, check out this video clip...
Recite appears to use some type of voice pattern matching or phonetic search engine. There is no translation from speech to text, and accuracy improves with longer search phrases. Both of these characteristics point to phonetic processing.
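To make the idea of phonetic matching concrete, here is a minimal sketch in Python. It is not how Recite works internally, which I can only guess at; it simply assumes that each recording and each spoken query has already been reduced to a sequence of phoneme labels, and then finds the stored note whose phonemes best match the query. The phoneme labels, function names and sample notes are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_window_score(query, recording):
    """Slide the query over the recording; return the best similarity in [0, 1]."""
    n = len(query)
    if n == 0:
        return 0.0
    best = 0.0
    # If the recording is shorter than the query, compare against all of it once.
    for start in range(max(1, len(recording) - n + 1)):
        window = recording[start:start + n]
        score = 1.0 - edit_distance(query, window) / n
        best = max(best, score)
    return best


def search(query, notes):
    """Rank stored notes by how well their phonemes match the query."""
    scored = [(best_window_score(query, phonemes), name)
              for name, phonemes in notes.items()]
    return sorted(scored, reverse=True)


# Hypothetical voice notes, already reduced to ARPAbet-style phoneme labels.
notes = {
    "milk": ["B", "AY", "M", "IH", "L", "K"],
    "dentist": ["K", "AO", "L", "D", "EH", "N", "T", "IH", "S", "T"],
}

# A spoken query for "buy milk" with one phoneme misheard (EH instead of IH).
query = ["B", "AY", "M", "EH", "L", "K"]

ranked = search(query, notes)
print(ranked[0][1])
```

No text is ever produced here; the match is made directly on sounds. It also hints at why longer phrases help: with more phonemes to compare, a single misheard sound costs a smaller share of the score.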
You can download the app here.
