Python offline speech recognition
I'm building an application that does the following:
1. If noise is detected on the microphone, it starts recording audio until no noise is detected. After that, the recorded audio is saved to a WAV file.
2. I have to detect words in it. There are only 5 to 10 words to detect.
So far, I have code for the first part (detect noise and record audio). Now, I have a list with the following words: help, please, yes, no, could, you, after, tomorrow. I need an offline way to detect whether the sound contains these words. Is this possible? How can I do that? I'm using Linux, and there is no way to change my operating system to Windows or to use a virtual machine.
I'm thinking of using the sound's spectrogram, creating a training database, and using a classifier to predict. For example, this is the spectrogram of a word. Which technique should I use?
Thanks.
You can use pocketsphinx with Python; install it with pip install pocketsphinx. The code looks like this:
    import os
    from pocketsphinx.pocketsphinx import *
    from sphinxbase.sphinxbase import *

    modeldir = "../../../model"
    datadir = "../../../test/data"

    # Create a decoder with a certain model
    config = Decoder.default_config()
    config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
    config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
    config.set_string('-kws', 'command.list')

    # Open a file to read the data from
    stream = open(os.path.join(datadir, "goforward.raw"), "rb")

    # Alternatively, you can read from the microphone
    # import pyaudio
    #
    # p = pyaudio.PyAudio()
    # stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
    # stream.start_stream()

    # Process the audio chunk by chunk. When a keyword is detected,
    # perform an action and restart the search
    decoder = Decoder(config)
    decoder.start_utt()
    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
        else:
            break
        if decoder.hyp() is not None:
            print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
            print("Detected keyword, restarting search")
            decoder.end_utt()
            decoder.start_utt()

The list of keywords should look like this:
    forward /1e-1/
    down /1e-1/
    other phrase /1e-20/

The numbers are the detection thresholds.
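To connect this with the question, here is a minimal sketch of two helpers that are not part of the original answer: one builds the keyword list text for the eight words in the question, and one reads the recorded WAV file into raw PCM bytes. The names format_keyword_list and read_wav_as_raw are hypothetical, the default threshold is only a placeholder that would need tuning per word, and the 16 kHz, 16-bit, mono check is an assumption taken from the microphone settings in the code above.

```python
import wave

# Words from the question. The default threshold below is a placeholder
# assumption; each word will likely need its own tuned value.
KEYWORDS = ["help", "please", "yes", "no", "could", "you", "after", "tomorrow"]


def format_keyword_list(words, threshold="1e-20"):
    """Return keyword-list text with one 'phrase /threshold/' entry per line."""
    return "".join("{} /{}/\n".format(word, threshold) for word in words)


def read_wav_as_raw(path, expected_rate=16000):
    """Return the raw PCM bytes of a WAV file after checking its format.

    Assumes the decoder wants 16-bit mono audio at expected_rate Hz, matching
    the microphone settings (paInt16, channels=1, rate=16000) shown above.
    """
    with wave.open(path, "rb") as wav:
        actual = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        if actual != (1, 2, expected_rate):
            raise ValueError(
                "expected 16-bit mono audio at {} Hz, got {}".format(expected_rate, actual)
            )
        return wav.readframes(wav.getnframes())
```

Writing the result of format_keyword_list(KEYWORDS) to command.list gives a file in the format shown above, and the bytes returned by read_wav_as_raw can be passed to decoder.process_raw in chunks instead of reading from goforward.raw.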