Tuesday, March 4, 2014

Speech recognition with the Python "speech" module

By Vasudev Ram



Today I came across a Python library for speech recognition and tried it out. It's called speech.py or pyspeech and is available here on Google Code.

(The story of how I came across it is interesting, but I'll save that for a future post.)

The library can be used as shown in this simple test program, test_speech.py (modified slightly from the example on the pyspeech site):
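import speech

# Listen in a loop and echo each recognized phrase until the user says "goodbye".
while True:
    print "Talk:"
    phrase = speech.input()             # wait for a spoken phrase
    speech.say("You said %s" % phrase)  # speak the phrase back
    print "You said {0}".format(phrase)
    #if phrase == "turn off":
    # Compare in lowercase, since the recognizer returned "Goodbye".
    if phrase.lower() == "goodbye":
        break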
Below is partial output from a sample run, after issuing the command python test_speech.py. The recognized phrases follow "You said", and I've manually added Python comments to the output to indicate where the software got the words right or wrong:
Talk:
You said Give me some water # Right.
Talk:
You said Thank you # Right.
Talk:
You said Please continue # Right.
Talk:
You said That's a better plan # Wrong. I had said something like "That is better."
Talk:
You said That is better # Right.
Talk:
You said Nate # Wrong. I had said "Great."
Talk:
You said We'll be enough # Wrong. I had said "That'll be enough."
Talk:
You said My for now # Wrong. I had said "Bye for now."
Talk:
You said Goodbye # Right.

I had to modify the termination test:

if phrase == "turn off":

to:

if phrase.lower() == "goodbye":

because the software could not recognize the phrase "turn off" when I spoke it. I also had to convert the recognized text to lowercase, because when I said "goodbye", it was recognized as "Goodbye", so the comparison failed and the program did not terminate.
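The lowercase fix can be generalized a little. Here is a minimal sketch, not part of pyspeech, of a helper that normalizes a recognized phrase before comparing it against a set of exit phrases. The names normalize and is_exit_phrase are hypothetical, and the punctuation stripping is only a precaution: I have not checked whether the recognizer ever returns punctuation.

```python
import string


def normalize(phrase):
    """Lowercase a phrase and trim surrounding whitespace and punctuation."""
    # Assumption: recognized text may carry stray capitals or punctuation.
    return phrase.strip().strip(string.punctuation).strip().lower()


def is_exit_phrase(phrase, exit_phrases=("goodbye", "bye for now")):
    """Return True if the normalized phrase is one of the exit phrases."""
    # exit_phrases is a hypothetical default; use whatever phrases suit you.
    return normalize(phrase) in exit_phrases
```

With this in place, the termination test in the loop could become: if is_exit_phrase(phrase): break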

I actually ran the program many times, and as readers who have tried out speech recognition or certain AI software may know, the results can be quite funny sometimes. Many sentences were misrecognized. Some examples:

I said: "1 2 3 4 5". It heard it as: "One new V war by".

I said (three times): "Do Re Mi Fa So La Ti". (Solfège)

It heard it as:

"Below the knee law school now the"

"Below the knee law school nine E"

"Below the knee and fought so not be"

Heh :) But still, even getting the phrases right some of the time is quite an achievement, since speech recognition is a challenging field.
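One possible way to make a program more tolerant of near misses like "My for now" is to match the recognized text against a small list of expected commands, using the standard library's difflib module. This is only a sketch under my own assumptions: the COMMANDS list and the closest_command function are hypothetical, the 0.6 cutoff is simply difflib's default, and this works on the recognized text only, so it cannot help when the recognizer hears something completely different.

```python
import difflib

# Hypothetical list of phrases the program is prepared to act on.
COMMANDS = ["goodbye", "turn off", "bye for now"]


def closest_command(phrase, commands=COMMANDS, cutoff=0.6):
    """Return the command most similar to phrase, or None if none is close."""
    matches = difflib.get_close_matches(
        phrase.lower(), commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

For example, closest_command("My for now") picks "bye for now", while a phrase like "Below the knee law school now the" matches nothing and gives None.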

Note 1: I did this on Windows and had to enable the Windows Speech Recognition facility before the program would work.

Note 2: The pyspeech site says that the library is no longer being maintained, and mentions dragonfly, another Python speech-recognition framework, as an alternative. I had actually tried dragonfly first (after reading that message), but it had some issues, so I decided to try pyspeech instead. I may check dragonfly out more later, and blog about it if I can get it to work.

Read more Python posts on my blog.

- Vasudev Ram - Dancing Bison Enterprises






3 comments:

James said...

I wonder how you'd go about getting this to work on Linux. I read your previous post on Text to Speech and successfully got that working on my CRUX/Linux desktop at home using the espeak text-to-speech synthesizer.

