We’ve made our hope for The Open Voice Factory fairly clear in terms of end users: free communication, wherever you are in the world. However, I thought it was worth taking some time to lay down the strong intellectual case for why it can be a game changer for academics as well.
I used to be a university researcher looking at AAC issues. That design experience stayed with me. In many ways The Open Voice Factory is set up to provide the tools that I wished I had. There’s at least one project that would have been nine months shorter if we’d had this in our back pocket.
Let me give you the headlines for why you should be using The Open Voice Factory in your academic projects.
Rapid Prototyping – including machine generation!
The Open Voice Factory takes in PowerPoint documents and converts them to AAC communication systems, which makes it fast. You can see the speed in real time in the video below. That makes it really easy to get a research pageset off the ground in a hurry. It also makes a pageset quick to change, and easy for groups to work on together using track changes and the rest of Microsoft’s teamwork functions, regardless of where you are in the AAC world!
We use the Python library python-pptx to read the PowerPoint files. This means that if you are interested in generating content automatically, you can write code to make your own templates, have humans review them, and then send them into the factory to be finalised.
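As a rough illustration of machine generation, here is a minimal sketch that writes a list of words into a grid of text boxes on a single slide using python-pptx. The grid dimensions, the function names `grid_cells` and `build_template`, and the idea that one text box equals one button are all assumptions made for the example; the layout the factory actually expects is not described here, so check a real template before relying on it.

```python
def grid_cells(labels, columns=4, cell_w=2.0, cell_h=1.5):
    """Yield (label, left, top) positions in inches, filling the grid row by row."""
    for index, label in enumerate(labels):
        row, col = divmod(index, columns)
        yield label, col * cell_w, row * cell_h


def build_template(labels, path, columns=4, cell_w=2.0, cell_h=1.5):
    """Write one slide with a text box per label (hypothetical layout)."""
    # Imported here so grid_cells() can be used without python-pptx installed.
    from pptx import Presentation
    from pptx.util import Inches

    prs = Presentation()
    # Layout 6 is the blank layout in python-pptx's default template.
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    for label, left, top in grid_cells(labels, columns, cell_w, cell_h):
        box = slide.shapes.add_textbox(
            Inches(left), Inches(top), Inches(cell_w), Inches(cell_h)
        )
        box.text_frame.text = label
    prs.save(path)
```

Calling `build_template(["yes", "no", "more", "stop"], "draft.pptx")` would give you a file that a person can open in PowerPoint, review and tidy before it goes anywhere near the factory.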
Starting with a pageset
The Open Voice Factory makes use of CommuniKate, the open AAC pageset. CommuniKate is released under a Creative Commons licence, which means that you can use it as a starting point for any academic project you like, rather than having to create a pageset from scratch.
Trigger JavaScript within the browser
At the design level, you design your templates so that they execute pieces of JavaScript when compiled into a speech aid.
Nava Tintarev and I defined four categories of automatic construction of utterances back in 2011. Here is what we wrote for the first two.
Inferred input
Inferred input is defined as utterances that can be generated from examination of previous user utterances. Thus, if a device registered the phrase “Hello Mary” and later “Thank you Mary” it would be reasonable to deduce that the user had spent some time with an individual called Mary and so the phrase “Today I spent time with Mary” could be added to the list of available phrases (later, of course, becoming “Yesterday I spent time with Mary”).
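To make the Mary example concrete, here is a toy sketch of inferred input. It is not code from The Open Voice Factory: the pattern, the `min_mentions` threshold and the function name `infer_phrases` are all assumptions, and a real system would need something much more robust than a regular expression to spot names.

```python
import re
from collections import Counter

# Hypothetical pattern: a greeting or thanks followed by a capitalised name.
NAME_PATTERN = re.compile(r"^(?:Hello|Thank you|Goodbye),?\s+([A-Z][a-z]+)\b")


def infer_phrases(history, min_mentions=2):
    """Suggest new phrases for names mentioned at least min_mentions times."""
    counts = Counter()
    for utterance in history:
        match = NAME_PATTERN.match(utterance.strip())
        if match:
            counts[match.group(1)] += 1
    return [
        f"Today I spent time with {name}"
        for name, seen in counts.items()
        if seen >= min_mentions
    ]


print(infer_phrases(["Hello Mary", "I want a drink", "Thank you Mary"]))
```

Requiring two mentions is a cautious choice for the sketch: a single “Hello” is weak evidence that the user actually spent time with someone.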
Network-based input
Network-based input is defined as new utterances that can be determined by access to information over the Internet, or some other information portal. An example is talking about the weather – phrases such as “It’s very warm today”, “It’s going to rain tomorrow”, and “It snowed on Sunday!”. Also included are observations about recent media: “On YouTube I watched a video called ‘the Four Chord Song’ ”.
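The weather example might look something like the sketch below. It deliberately leaves out the network call: `report` is a hypothetical dictionary standing in for whatever a weather service returns, and the key names and the 25°C threshold are assumptions made for the illustration.

```python
def weather_phrases(report, warm_threshold_c=25):
    """Turn a (hypothetical) weather report into ready-made utterances."""
    phrases = []
    today = report.get("today_temp_c")
    if today is not None and today >= warm_threshold_c:
        phrases.append("It's very warm today")
    if report.get("rain_tomorrow"):
        phrases.append("It's going to rain tomorrow")
    return phrases


print(weather_phrases({"today_temp_c": 28, "rain_tomorrow": True}))
```

Keeping the phrase building separate from the fetching means the same function can be fed by any source, or by canned data while you are testing with users.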
Almost all of those examples are extremely difficult to perform with mainstream AAC devices. They are also trivial to put together for a programmer using the Open Voice Factory.
This is partly because the code is open: you can simply clone the GitHub repository, change the relevant code, and insert the code that does what you want. It’s also because you can, effectively, embed your own JavaScript functions within the PowerPoint documents that you create for The Open Voice Factory. That gives you full control of the content using a mature programming language.
Impact
Impact is a particular buzzword in the UK. If you use The Open Voice Factory for your AAC research, you might write some code and maybe contribute some pagesets as part of the project. That code will be merged into the Factory’s trunk, and will go towards helping people long before the journal reviewers get around to looking at your paper. You’ll have made lives better. Isn’t that why you got into research?