Today’s guest post is from Nicole Caron, talking about a somewhat underreported part of communication disability – that of taking the ‘un’ out of ‘unintelligible’. If you like what you read I strongly recommend you visit their crowdfunding page… Nicole was very patient with me on the back and forth of this guest post… “No, my readers know all about that”, “No, my readers are much more technical than that…” and we had lots of fun haggling over the words.
As regular readers of this blog will know, being unable to express and communicate your thoughts and feelings is an everyday reality for millions of people (1.5% of the Western World’s population) who suffer from speech disabilities. This is very hard and frustrating for them, and it also affects much wider circles: families, caregivers, friends and society as a whole.
Until now the approach taken by developers of assistive technology for people with speech disabilities has completely bypassed voice, opting to use other modes of communication including communication boards that replace speech with symbols and images, head-tracking, eye-tracking, and switches. These solutions are often expensive, awkward and unnatural to use and degrade natural speech. There are no products on the market today that allow people with speech disabilities to communicate using their voice.
At Voiceitt, we are trying to fill this gap. Voiceitt is developing Talkitt, an innovative speech technology which is able to recognize unintelligible language and translate it into understandable speech. Ultimately, Talkitt is giving individuals with speech impairments their voice back! Talkitt’s slogan is “This is my voice”.
Talkitt is a voice-to-voice application that will translate distorted pronunciation into understandable speech. For example, a person can say “o-ko-la” and the software will translate it to “chocolate”. Our software is in the development stage but we’ve had excellent results so far.
Talkitt is speaker dependent and requires the user to create and maintain a dictionary of utterances and associated text and/or icons. The creation of the dictionary is the calibration phase. Once a dictionary is ready, the recognition stage can begin: the application performs pattern matching with enhancement of intonation features. If the user puts a word into the dictionary, the application will be able to recall that word at a later date and relay what the person is trying to say. The user will be able to access the application without help because, during the recognition phase, it is fully voice activated. The dictionary can be categorized so as to reduce mistakes by the machine.
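To make the calibration idea concrete, here is a minimal sketch in Python of what a categorized personal dictionary could look like. It is an illustration only, not Talkitt's implementation: the names `Entry`, `PersonalDictionary`, `calibrate` and `candidates` are hypothetical, and the short lists of numbers stand in for whatever acoustic features a real system would store.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    features: List[float]  # placeholder for the stored speech pattern (assumption)
    text: str              # the meaning to display or speak
    category: str = "general"


@dataclass
class PersonalDictionary:
    entries: List[Entry] = field(default_factory=list)

    def calibrate(self, features: List[float], text: str,
                  category: str = "general") -> None:
        """Calibration phase: store one utterance with its meaning."""
        self.entries.append(Entry(list(features), text, category))

    def candidates(self, category: Optional[str] = None) -> List[Entry]:
        """Recognition phase: only entries in the active category are
        considered, which shrinks the set of possible confusions."""
        if category is None:
            return list(self.entries)
        return [e for e in self.entries if e.category == category]


d = PersonalDictionary()
d.calibrate([0.1, 0.9, 0.4], "chocolate", category="food")
d.calibrate([0.2, 0.8, 0.5], "water", category="food")
d.calibrate([0.7, 0.1, 0.3], "hello", category="greetings")

food_words = [e.text for e in d.candidates("food")]
print(food_words)
```

Restricting recognition to one category means "hello" can never be mistaken for a food word while the food category is active, which is one plausible reading of how categorization reduces mistakes.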
Our approach is based on simultaneous exploitation of both the content and the intonation of a speaker’s voice. Today’s speech recognition solutions are based on detection at the ‘word’ level and the ‘phoneme’ level, both of which are statistically based. When speech is non-standard, these approaches no longer work. Talkitt works in a different way. It is user dependent and language independent. The user builds a personal dictionary that contains their speech patterns and their meanings, and by using pattern matching, we look for the highest similarity between what the user just said and what exists in the dictionary.
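For the more technical readers, the "highest similarity" lookup can be sketched as template matching. The example below uses dynamic time warping (DTW), a classic technique for comparing sequences spoken at different speeds. DTW is swapped in here purely for illustration; the post does not say which similarity measure Talkitt uses. The function names and the one-number-per-frame features (think of a crude pitch contour) are assumptions too.

```python
from typing import Dict, List, Tuple


def dtw_distance(a: List[float], b: List[float]) -> float:
    """Dynamic time warping distance between two feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def best_match(utterance: List[float],
               dictionary: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the dictionary entry with the smallest distance,
    i.e. the highest similarity, to the new utterance."""
    scored = ((text, dtw_distance(utterance, template))
              for text, template in dictionary.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical templates recorded during calibration.
templates = {
    "chocolate": [1.0, 3.0, 2.0, 1.0],
    "water": [4.0, 4.0, 1.0],
}

# The same contour as "chocolate", spoken more slowly.
word, distance = best_match([1.0, 1.0, 3.0, 3.0, 2.0, 1.0], templates)
print(word, distance)
```

Because the comparison is against the user's own recordings rather than a model of standard pronunciation, nothing in it depends on the language being spoken.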
An example of one of Voiceitt’s innovative approaches is adaptive framing: an approach based on speech events that enables the division of speech into homogeneous frames. This results in frames that each contain the same kind of vocal information but have varying durations. The duration is determined by the vocal information itself, allowing much more accurate modelling and classification steps.
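As a rough intuition for variable-length framing, the sketch below starts a new frame whenever the incoming value drifts too far from the average of the current frame. This is a deliberately simple stand-in, not Voiceitt's algorithm: the function name `adaptive_frames`, the threshold rule and the toy contour are all assumptions made for the example.

```python
from typing import List


def adaptive_frames(values: List[float], threshold: float) -> List[List[float]]:
    """Split a feature sequence into frames of varying length.

    A frame is closed when the next value differs from the frame's
    running mean by more than `threshold`, so each frame stays
    internally homogeneous and its duration follows the signal.
    """
    frames: List[List[float]] = []
    current: List[float] = []
    for v in values:
        if current:
            mean = sum(current) / len(current)
            if abs(v - mean) > threshold:
                frames.append(current)  # a speech "event": close the frame
                current = []
        current.append(v)
    if current:
        frames.append(current)
    return frames


# Toy contour: a low stretch, a longer high stretch, then a drop.
contour = [1.0, 1.1, 0.9, 5.0, 5.2, 5.1, 5.0, 2.0]
frames = adaptive_frames(contour, threshold=1.0)
lengths = [len(f) for f in frames]
print(lengths)  # [3, 4, 1]
```

Compare this with fixed framing, which would cut the same contour every N samples regardless of where the changes fall, so a single frame could straddle two quite different sounds.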
Talkitt does not require specialised hardware. It can run on any computerized device (PC, tablet or smartphone) and can be integrated into apps (browsers, games, communication boards) and assistive devices (smartphones, wheelchairs, emergency calls). This software-based solution has been received with enthusiasm by field experts worldwide and has the potential to dramatically improve the quality of life of millions of people.