Python Speech Recognition - Sphinx

【Question Title】: Python Speech Recognition - Sphinx
【Posted】: 2021-01-09 03:39:02
【Question】:

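I am building a simple speech recognition program that lets me control my robot with voice commands. I only want the program to listen for certain words, and to do so relatively quickly. My project is based on Michael Reeves's "I built a robot that shoots lasers at my eyes", and I am trying to create something similar to the voice commands seen in his video.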

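The problem I am running into is that Sphinx is fast but (edit: inaccurate). On top of that, when I enable keywords the output becomes strange. If I say "command shutdown", the output is: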

"three  nine  one  four  five  eight  two  one  eight  nine  three  four  two  six  zero  eight  nine  two  one  six  four  eight  seven  one  three  four  nine  five  eight  two  eight 
four  five  nine  three  one  two  eight  six  nine  three  five  seven  two  zero  one  nine  five  eight  two  four  four  nine  one  five  eight  three  two  six  four  two  zero  seven  one  nine  three  four  five  eight  two  five  one  three  four  eight  two  six  eight  zero  one  three  four  five  two  seven  eight  eight  three  nine  five  two  four  eight  one  two  eight  two  eight  two  eight  command shutdown  command  eight  one  four  three  eight  two  two  eight "

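I'm not sure how to fix this. I tried recognize_google, which is more accurate but slow. I want keywords enabled so that the program only checks whether one of a set of words was said, and prints it to the screen if so.

Another problem I have is with the listen_in_background() function. I can't seem to get it to work properly.

Here is my code: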

import speech_recognition as sr
import pocketsphinx

keywords = [
    ("command", 1), 
    ("one", 0), 
    ("two", 0), 
    ("three", 0), 
    ("four", 0), 
    ("five", 0), 
    ("six", 0), 
    ("seven", 0), 
    ("eight", 0), 
    ("nine", 0), 
    ("zero", 0), 
    ("command x axis add", 0), 
    ("command y axis add", 0), 
    ("command x axis subtract", 0), 
    ("command y axis subtract", 0), 
    ("command clear shift string", 0), 
    ("command shutdown", 0),
    ("command flip tracking", 0), 
    ("command pause", 0), 
    ("command detect face", 0), 
    ("command detect body", 0)
]

def speech2text():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)  # this is where I want to listen in the background,
        # so that it runs at the same time as other code
    try:
        data = r.recognize_sphinx(audio, keyword_entries = keywords)
        return data
    except:
        return "Error..."

while True:
    print(speech2text())

【Comments】:

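github.com/Uberi/speech_recognition/issues/305 is a bug report for this module describing a very similar problem (all keywords are recognized, in random order). It was posted in 2017 and has received no replies. The module does not appear to be actively supported.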

Tags: python performance speech-recognition pocketsphinx

【Answer 1】:

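I had the same problem. I tried different sensitivities from 0 to 1 and found that when every keyword's sensitivity is above 0.9, the keywords are recognized equally well and fairly accurately, and they are not randomly spammed in the output phrase. Below that value, it spits out far more keywords than is reasonable.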
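
The sensitivity experiment described above could be set up roughly like this. It is only a sketch under stated assumptions: `with_uniform_sensitivity` and `recognize_keywords` are hypothetical helper names, not part of the speech_recognition package, and `recognizer` is assumed to be an `sr.Recognizer` (any object exposing `recognize_sphinx` would do). The only library call used is `recognize_sphinx(audio, keyword_entries=...)`, the same one as in the question.

```python
def with_uniform_sensitivity(phrases, sensitivity=1.0):
    """Pair every phrase with the same sensitivity for keyword_entries.

    Hypothetical helper, not part of speech_recognition.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    return [(phrase, sensitivity) for phrase in phrases]


def recognize_keywords(recognizer, audio, phrases, sensitivity=1.0):
    """Run keyword spotting with one shared sensitivity for all phrases."""
    entries = with_uniform_sensitivity(phrases, sensitivity)
    return recognizer.recognize_sphinx(audio, keyword_entries=entries)


# A shortened phrase list taken from the question.
PHRASES = ["command shutdown", "command pause", "one", "two"]
ENTRIES = with_uniform_sensitivity(PHRASES, 0.95)
```

With a real recognizer, `recognize_keywords(r, audio, PHRASES, 1.0)` would stand in for the `r.recognize_sphinx(...)` call in the question. Trying a few values such as 0.9, 0.95 and 1.0 is one way to repeat the experiment on your own microphone.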

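I also get an UnknownValueError whenever a word that is not a keyword is spoken. If you are looking for a way to detect only those keywords, I would definitely try setting all of their sensitivities to 1 and see where that gets you. I think the only downside might be that, if words in the keyword list sound similar to one another, you may get different hits than you expected.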
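
Since non-keyword speech raises UnknownValueError, and the question also asks about listen_in_background(), here is a minimal sketch that combines the two. It is an assumption-laden illustration, not a tested fix for the asker's setup: `make_callback`, `run_in_background`, `on_text` and `ignored_errors` are hypothetical names introduced here. The library calls used are `Recognizer.listen_in_background(source, callback)`, which returns a function that stops the listener, `recognize_sphinx(audio, keyword_entries=...)`, and `sr.UnknownValueError`.

```python
import time


def make_callback(entries, on_text, ignored_errors):
    """Build a callback suitable for Recognizer.listen_in_background.

    Hypothetical helper: errors listed in ignored_errors (for example
    sr.UnknownValueError, raised when no keyword is heard) are swallowed.
    """
    def callback(recognizer, audio):
        try:
            text = recognizer.recognize_sphinx(audio, keyword_entries=entries)
        except ignored_errors:
            return  # nothing recognized in this phrase, keep listening
        on_text(text)
    return callback


def run_in_background(entries, seconds=30):
    """Listen on the default microphone for `seconds` without blocking."""
    import speech_recognition as sr  # needs SpeechRecognition, PyAudio, pocketsphinx

    recognizer = sr.Recognizer()
    microphone = sr.Microphone()
    with microphone as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate once, up front

    # Pass the microphone itself, outside any `with` block; the library
    # opens it in its own thread and invokes the callback per phrase.
    callback = make_callback(entries, print, sr.UnknownValueError)
    stop_listening = recognizer.listen_in_background(microphone, callback)
    try:
        time.sleep(seconds)  # placeholder for the robot's other work
    finally:
        stop_listening(wait_for_stop=False)
```

Calling `run_in_background([("command shutdown", 1.0), ("command pause", 1.0)])` would print each recognized phrase while the main thread stays free. The callback runs on a background thread, so anything it shares with the main loop should be passed through a thread-safe structure such as `queue.Queue`.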

【Comments】: