Help! – Need a FOSS speech-to-text application

I need a way to have speech converted to text on the fly for deaf children in classrooms around Australia. If you have suggestions or experience in this area, please email me or leave a comment. If you have expertise and would like to work on something like this, get in touch!

14 thoughts on “Help! – Need a FOSS speech-to-text application”

  1. I will probably get slapped but both Dragon and Microsoft Vista provide this support natively, which permeates into Office 2007 as part of accessibility support. Not sure that helps you but at least you have an option if it comes to it.

  2. The VoxForge guys are working on creating speech models for open-source speech recognition engines. I’m not sure if they have any end-user-ready applications yet though. See http://www.voxforge.org/ for more details.

    I’m a post-doctoral speech recognition researcher so I may be interested in helping. Send me an email and we can talk.

  3. I’m curious why you need this? I’ve always found that sign language, and Deaf kids’ awesome communication skills, mean these solutions aren’t needed.

  4. @AlphaG, thanks for that. I know about the expensive options. The Vista one isn’t viable as it has to run on small netbooks.

    @Brenda, it’s for deaf kids in classrooms where the teacher doesn’t know sign language, so they are missing out on core curriculum. There is a need to balance special needs with a normal classroom experience.

  5. Pia, if this is an education environment then Vista pricing is very close to free under the schools agreement available; in QLD I think it is about $50 per device. The hardware is more likely to be the limiting factor 🙂

  6. From the exposure I have had to this through having a quadriplegic friend, it seems that so far all of the success in this area is still in the Windows domain. Dragon still looks like the best option in terms of accuracy and ease of use, and it’s XP or Vista.

    The Windows stuff built into MS Office’s accessibility module is pretty rudimentary, at least in the way it performs. With a great deal of training it can just barely work through very carefully intoned speech. Natural language, not a chance.

    I’ll be delighted if your search turns up something in the FOSS domain that works really well, though I doubt you’ll get something that an OLPC can keep up with. Even if a FOSS project has no GUI and requires arcane glueing together from small pieces of source code, I’ll be keen to check it out.

  7. hi,

    I need speech to text apps to capture voices on 350 hours of digital video tape for the Digital Tipping Point film project, a video documentary on how Free Open Source Software is changing global culture. We have about 13 languages and lots of heavy accents. We need to be able to automatically transcribe this video. Ideally, we could pipe the output from the video directly as input into whatever FOSS app is available.

    @AlphaG, I can’t speak for Pia, but I need to use Free Software because I need to be able to scale up on a low budget and I need long-term viable Free standards. I need to be sure that my software will still work and my data will still be viable in 5, 10 and 15 years, even if the vendor goes belly up or decides to no longer support a software package.

    Microsoft-based solutions might offer short term benefits, but bring with them a whole host of other short term problems (spyware, viruses, licensing hassles, proprietary formats) and lots of long term problems (doesn’t scale, backwards incompatible, obsolete data).

    @David Dean, if our project sounds interesting to you, I would love to get the benefit of any tips you have. And you can be sure that we will be giving back to the community: all of our tools are FOSS tools, so we would be filing bug reports, and all of our data is Creative Commons Attribution-ShareAlike licensed. Our goal is to make a watchable, entertaining documentary for newbies out of a fully forkable library of 350 hours of rough-edited hi-res standard def video. If that sounds interesting, please feel free to contact me at einfeldt at gmail dot com.

    Thx for twittering about this project, Pia!


    Christian Einfeldt,
    Producer, the Digital Tipping Point

  8. “Microsoft-based solutions might offer short term benefits, but bring with them a whole host of other short term problems (spyware, viruses, licensing hassles, proprietary formats) and lots of long term problems (doesn’t scale, backwards incompatible, obsolete data).”

    That is possibly all true, but if you keep away from your favourite porn and torrent sites you should be fine. I run both Windows XP and Ubuntu systems and, spending similar admin time on each, have found both to be resilient to issues. Yes, I do have an added AV solution on the XP box, but so far both have been well behaved and updates on both have been about the same amount of work.

  9. Hi Pia,

    Sounds like a good project.

    What level of accuracy did you get with Sphinx? We have a customer who wants to move their transcription business online. How does Sphinx compare with the Dragon Server?

    Both of my children have gone to Toowong State School. One is still there. This school has a bilingual, bicultural programme for deaf and hearing children. The teachers there are really smart and well connected, so the school would be a fantastic place for you to do some testing. If you’re interested, I can put you in touch with them.

    Cheers,

    Damian
