【Question title】: How do I only get the transcript from Watson Speech API?
【Posted】: 2020-02-26 16:39:21
【Question description】:

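I was able to get my code working, but the output I receive is a nested dictionary, and I'm not sure how to access only the transcription (the words) for the entire wav file.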

import json
from os.path import join, dirname
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
import threading
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('****')
service = SpeechToTextV1(authenticator=authenticator)
service.set_service_url('https://api.us-east.speech-to-text.watson.cloud.ibm.com')

models = service.list_models().get_result()
print(json.dumps(models, indent=2))

model = service.get_model('en-US_BroadbandModel').get_result()
print(json.dumps(model, indent=2))

with open(join(dirname('__file__'), 'nickvoice.wav'),
          'rb') as audio_file:
        print(json.dumps(
        output = service.recognize(
            audio=audio_file,
            content_type='audio/wav',
            #timestamps=True,
            #word_confidence=True,
            model='en-US_NarrowbandModel',
        continuous=True).get_result(),
        indent=2))

My output:

    {
      "alternatives": [
        {
          "confidence": 0.97,
          "transcript": "awesome "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.59,
          "transcript": "%HESITATION possible give Charlie meds from me and then "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.86,
          "transcript": "thing else comes up or you have any questions just don't hesitate to call us okay okay thank you so much yeah you're very welcome you have a great rest your day okay you too bye bye "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}

The above is part of the output. I tried to get only the transcript with the following:

print(output['results'][0]['alternatives'][0]['transcript'])
Traceback (most recent call last):
  File "<ipython-input-28-fda0a085be69>", line 31, in <module>
    print(output['results'][0]['alternatives'][0]['transcript'])
TypeError: 'NoneType' object is not subscriptable

How do I access only the full transcript, without the other fields such as "confidence" and "final"?

【Comments】:

What do you understand, or not understand, from that error message?

Tags: python dictionary speech-recognition ibm-watson

【Solution 1】:

The results key in the response object is an array, so you need to do something like output["results"][0]["alternatives"][0]["transcript"]. I assume you are saving the result in output, for example:

output = service.recognize(
            audio=audio_file,
            content_type='audio/wav',
            #timestamps=True,
            #word_confidence=True,
            model='en-US_NarrowbandModel',
            continuous=True).get_result()
print(output['results'][0]['alternatives'][0]['transcript'])


## Updated Example including file handling
with open(join(dirname(__file__), '../resources/speech.wav'),
          'rb') as audio_file:
    output = service.recognize(
            audio=audio_file,
            content_type='audio/wav',
            timestamps=True,
            word_confidence=True).get_result()
    print(output['results'][0]['alternatives'][0]['transcript'])

See the Getting started document to learn more about the response object.
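Indexing with [0] returns only the first recognized segment, while the question asks for the transcript of the whole file. A minimal sketch of one way to collect every segment, assuming the response has the shape shown in the question; the helper name full_transcript and the sample data are illustrative and not part of the SDK:

```python
# Sample response shaped like the output in the question (values copied from it).
sample_response = {
    "results": [
        {
            "alternatives": [{"confidence": 0.97, "transcript": "awesome "}],
            "final": True,
        },
        {
            "alternatives": [
                {
                    "confidence": 0.59,
                    "transcript": "%HESITATION possible give Charlie meds from me and then ",
                }
            ],
            "final": True,
        },
    ],
    "result_index": 0,
}


def full_transcript(response):
    """Join the first alternative of every result into one string (hypothetical helper)."""
    pieces = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # Assumption: the first alternative is the one to keep.
            pieces.append(alternatives[0]["transcript"].strip())
    return " ".join(pieces)


print(full_transcript(sample_response))
```

With a real call, passing the dictionary returned by .get_result() in place of sample_response should behave the same way, provided the response has this structure.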

【Comments】:

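Traceback (most recent call last): File "", line 31, in print(output['results'][0]['alternatives'][0]['transcript']) TypeError: 'NoneType' object is not subscriptable
print(output) gives None
output = json.dumps(...) in your example is wrong. It should be the same as the example I posted.
You are trying to save the return value of print in output, which is why you see the error when you try to print it again.
I tried the changes above but I still get an error: Traceback (most recent call last): File "", line 31, in print(output['results'][0]['alternatives'][0]['transcript']) TypeError: tuple indices must be integers or slices, not str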
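Both errors in this thread can be reproduced without calling the service. A minimal sketch: the None case follows from print() returning None; the tuple case assumes a trailing comma was left after the expression (for example after .get_result()), which the thread does not confirm. The variable names and the small data dictionary are illustrative only:

```python
import json

# Stand-in for the dictionary that .get_result() returns.
data = {"results": [{"alternatives": [{"transcript": "awesome "}]}]}

# print() writes to stdout and returns None, so this name is bound to None.
printed = print(json.dumps(data, indent=2))

# A trailing comma after the expression wraps the value in a one-element tuple.
wrapped = data,

# Plain assignment keeps the dictionary, which can then be indexed by key.
output = data

for name, value in (("printed", printed), ("wrapped", wrapped), ("output", output)):
    print(name, type(value).__name__)
```

Checking type(output) right after the assignment is a quick way to see which of these cases applies before indexing into the value.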

【Solution 2】:

print(output['results'][0]['alternatives'][0]['transcript'])

If you use the print statement above and nothing was transcribed from the audio, the code fails with this error:

IndexError: list index out of range

This happens because results is empty.

To fix it, first check that results has content, as in the code below.

# Only index into results when at least one segment was recognized.
if len(output['results']) > 0:
    speech = output['results'][0]['alternatives'][0]['transcript']
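The same check can be wrapped in a small function so that callers get a default value instead of an exception. A minimal sketch, assuming the response structure shown in the question; first_transcript is a hypothetical helper, not an SDK function:

```python
def first_transcript(output, default=None):
    """Return the first transcript, or `default` when nothing was recognized."""
    # Guards against output being None or a tuple, as seen earlier in the thread.
    if not isinstance(output, dict):
        return default
    results = output.get("results") or []
    if not results:
        return default
    alternatives = results[0].get("alternatives") or []
    if not alternatives:
        return default
    return alternatives[0].get("transcript", default)


# Empty results: no IndexError, the default is returned instead.
print(first_transcript({"results": [], "result_index": 0}, default=""))

# Normal case: the first transcript is returned unchanged.
print(first_transcript({"results": [{"alternatives": [{"transcript": "awesome "}]}]}))
```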

【Comments】: