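【Question title】: How do I only get the transcript from Watson Speech API?
【Posted】: 2020-02-26 16:39:21
【Question】:

I was able to get my code working, but the output I receive is a nested dictionary, and I am not sure how to access only the transcription (the words) for the entire wav file.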

import json
from os.path import join, dirname
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
import threading
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('****')
service = SpeechToTextV1(authenticator=authenticator)
service.set_service_url('https://api.us-east.speech-to-text.watson.cloud.ibm.com')

models = service.list_models().get_result()
print(json.dumps(models, indent=2))

model = service.get_model('en-US_BroadbandModel').get_result()
print(json.dumps(model, indent=2))

with open(join(dirname('__file__'), 'nickvoice.wav'),
          'rb') as audio_file:
        print(json.dumps(
        output = service.recognize(
            audio=audio_file,
            content_type='audio/wav',
            #timestamps=True,
            #word_confidence=True,
            model='en-US_NarrowbandModel',
        continuous=True).get_result(),
        indent=2))

My output:

    {
      "alternatives": [
        {
          "confidence": 0.97,
          "transcript": "awesome "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.59,
          "transcript": "%HESITATION possible give Charlie meds from me and then "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.86,
          "transcript": "thing else comes up or you have any questions just don't hesitate to call us okay okay thank you so much yeah you're very welcome you have a great rest your day okay you too bye bye "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}

The above is part of the output. I tried to retrieve only the transcript like this:

print(output['results'][0]['alternatives'][0]['transcript'])
Traceback (most recent call last):
  File "<ipython-input-28-fda0a085be69>", line 31, in <module>
    print(output['results'][0]['alternatives'][0]['transcript'])
TypeError: 'NoneType' object is not subscriptable

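How can I access only the full transcript, without the other fields such as "confidence" and "final"?

【Comments】:

  • What do you understand, or not understand, from that error message?

Tags: python dictionary speech-recognition ibm-watson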


【Solution 1】:

The results key in the response object is an array, so you need something like output["results"][0]["alternatives"][0]["transcript"]. I assume you are saving the result in output, for example:

output = service.recognize(
            audio=audio_file,
            content_type='audio/wav',
            #timestamps=True,
            #word_confidence=True,
            model='en-US_NarrowbandModel',
            continuous=True).get_result()
print(output['results'][0]['alternatives'][0]['transcript'])


## Updated Example including file handling
with open(join(dirname(__file__), '../resources/speech.wav'),
          'rb') as audio_file:
    output = service.recognize(
            audio=audio_file,
            content_type='audio/wav',
            timestamps=True,
            word_confidence=True).get_result()
    print(output['results'][0]['alternatives'][0]['transcript'])

Please see the Getting started document to learn more about the response object.
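Indexing results[0] returns only the first recognized segment, while the question asks for the transcript of the whole file. A minimal sketch of joining every segment follows. It assumes the response has the shape shown in the question's output. The helper name full_transcript and the shortened sample data are illustrative and not part of the Watson SDK.

```python
# Sample response shaped like the output in the question (values shortened).
sample_output = {
    "results": [
        {
            "alternatives": [{"confidence": 0.97, "transcript": "awesome "}],
            "final": True,
        },
        {
            "alternatives": [
                {"confidence": 0.59,
                 "transcript": "%HESITATION possible give Charlie meds "}
            ],
            "final": True,
        },
    ],
    "result_index": 0,
}


def full_transcript(output):
    """Concatenate the top alternative of every result into one string."""
    pieces = []
    for result in output.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:  # skip results that carry no alternatives
            pieces.append(alternatives[0]["transcript"].strip())
    return " ".join(pieces)


print(full_transcript(sample_output))
```

With a real response, the same helper would be called on the dictionary returned by service.recognize(...).get_result().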

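【Comments】:

  • Traceback (most recent call last): File "", line 31, in print(output['results'][0]['alternatives'][0]['transcript']) TypeError: 'NoneType' object is not subscriptable
  • print(output) gives None
  • output = json.dumps(...) in your example is wrong. It should be the same as the example I posted.
  • You are trying to save the return value of print in output. print returns None, which is why you see the error when you try to print it again.
  • I tried the above change but still get an error: Traceback (most recent call last): File "", line 31, in print(output['results'][0]['alternatives'][0]['transcript']) TypeError: tuple indices must be integers or slices, not str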
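The two errors in these comments can be reproduced without calling the service. The sketch below uses a stand-in dictionary in place of the real response. The trailing comma is only one possible cause of the tuple error; the commenter's edited code is not shown, so that part is an assumption.

```python
import json

# Stand-in for service.recognize(...).get_result()
data = {"results": []}

# print() always returns None, so this stores None, not the response.
output_from_print = print(json.dumps(data, indent=2))

# A stray trailing comma turns the right-hand side into a one-element tuple
# (assumed cause of "tuple indices must be integers or slices, not str").
output_with_comma = data,

# A plain assignment keeps the dictionary itself.
output = data

print(type(output_from_print).__name__,
      type(output_with_comma).__name__,
      type(output).__name__)
```

Assigning the result first and printing it in a separate statement avoids both problems.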
【Solution 2】:
print(output['results'][0]['alternatives'][0]['transcript'])

If you use the print statement above and the audio contains no recognized speech, the code raises this error:

IndexError: list index out of range

because results is empty.

To fix this, first check that results has content, as in the code below.

if len(output['results']) > 0:
    speech = output['results'][0]['alternatives'][0]['transcript']
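The same guard can be wrapped in a small function that returns None instead of raising when nothing was recognized. This is a sketch under the assumption that the response is a plain dictionary as shown in the question. The name first_transcript is hypothetical and not part of the Watson SDK.

```python
def first_transcript(output):
    """Return the first transcript, or None when nothing was recognized."""
    # Treat a missing response or a missing/empty "results" list the same way.
    results = (output or {}).get("results") or []
    if not results:
        return None
    alternatives = results[0].get("alternatives") or []
    if not alternatives:
        return None
    return alternatives[0].get("transcript")


# An empty result list no longer raises IndexError.
print(first_transcript({"results": []}))
print(first_transcript(
    {"results": [{"alternatives": [{"transcript": "awesome "}]}]}
))
```

Returning None leaves it to the caller to decide how to handle silent audio, for example by skipping the file or logging it.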
