【Posted】:2019-02-14 17:51:20
【Question】:
Using Google Speech-to-Text, I only get a partial transcription. Input file: Commercial_mono.wav, one of Google's sample audio files
Link to google repo location Commercial_mono.wav
Here is my code:
def transcribe_gcs(gcs_uri):
    from google.cloud import speech_v1p1beta1 as speech
    from google.cloud.speech_v1p1beta1 import enums
    from google.cloud.speech_v1p1beta1 import types

    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = speech.types.RecognitionConfig(
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)
    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    response = operation.result(timeout=5000)

    # Only the last result in the response is read.
    result = response.results[-1]
    words_info = result.alternatives[0].words

    tag = 1
    speaker = " "
    for word_info in words_info:
        if word_info.speaker_tag == tag:
            speaker = speaker + " " + word_info.word
        else:
            # Speaker changed: print the finished segment and start a new one.
            print("speaker {}: {}".format(tag, speaker))
            tag = word_info.speaker_tag
            speaker = " " + word_info.word
    # Print the segment that is still pending when the loop ends.
    print("speaker {}: {}".format(tag, speaker))
This is how I call the script:
transcribe_gcs('gs://mybucket0000t/commercial_mono.wav')
I only get a partial transcription of the whole audio file:
(venv3) ➜ g-transcribe git:(master) ✗ python gtranscribeWithDiarization.py
Waiting for operation to complete...
speaker 1: I'm here
speaker 2: hi I'd like to buy a Chrome Cast and I was wondering whether you could help me
That is all I get.
If I run the code several times, after 5 or 6 runs I get no transcription at all.
This is the result after a few attempts:
(venv3) ➜ g-transcribe git:(master) ✗ python gtranscribeWithDiarization.py
Waiting for operation to complete...
speaker 1:
(venv3) ➜ g-transcribe git:(master) ✗
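Environment: Python 3
- Using a Google service account; the connection works without problems.
- I also copied the file to Google Cloud Storage and confirmed that I can play it.
- I tried converting the file from WAV to FLAC, but the result is the same.
- I used ffprobe to make sure the file has only one channel.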
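As a cross-check on the ffprobe step above, the channel count and sample rate can also be read with Python's standard `wave` module. This is only a minimal sketch: the helper name `describe_wav` is made up for illustration, and it assumes a local copy of an uncompressed PCM WAV file rather than the object in the bucket.

```python
import sys
import wave


def describe_wav(path):
    """Return channel count, sample rate (Hz) and duration (s) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        frames = wav.getnframes()
    return {
        "channels": channels,
        "sample_rate_hz": sample_rate,
        "duration_s": frames / float(sample_rate),
    }


if __name__ == "__main__":
    # Usage (path is an assumption): python describe_wav.py commercial_mono.wav
    if len(sys.argv) > 1:
        print(describe_wav(sys.argv[1]))
```

Comparing `duration_s` with the time of the last transcribed word would show how much of the file is missing from the response.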
I am trying to get the whole transcription, with a timestamp each time the speaker changes.
Desired output:
Speaker 1: Start Time 0.0001: Hello transcription starts
Speaker 2: Start Time 0.0009: Here starts with the transcription of the 2nd speaker and so on to the end of file.
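The grouping needed for the desired output above can be sketched without the API. The `Word` tuple, the two helper names and the sample words below are all made up for illustration; the real per-word objects expose `word` and `speaker_tag`, and a start time would presumably only be available if word time offsets are requested (as far as I know via `enable_word_time_offsets=True` in the config, which is an assumption here and not part of my code).

```python
from collections import namedtuple

# Hypothetical stand-in for the per-word records returned by the API;
# start_seconds is a plain float here for simplicity.
Word = namedtuple("Word", ["word", "speaker_tag", "start_seconds"])


def group_by_speaker(words):
    """Collapse a word sequence into (speaker_tag, start_seconds, text) turns."""
    turns = []
    current_tag = None
    current_start = None
    current_words = []
    for w in words:
        if w.speaker_tag != current_tag:
            # Speaker changed: flush the turn collected so far, if any.
            if current_words:
                turns.append((current_tag, current_start, " ".join(current_words)))
            current_tag = w.speaker_tag
            current_start = w.start_seconds
            current_words = []
        current_words.append(w.word)
    # Flush the last turn, which is not followed by a speaker change.
    if current_words:
        turns.append((current_tag, current_start, " ".join(current_words)))
    return turns


def format_turns(turns):
    """Render turns in the 'Speaker N: Start Time T: text' layout."""
    return ["Speaker {}: Start Time {:.1f}: {}".format(tag, start, text)
            for tag, start, text in turns]


if __name__ == "__main__":
    # Made-up sample words, not real API output.
    sample = [
        Word("hi", 1, 0.0), Word("there", 1, 0.4),
        Word("hello", 2, 1.2), Word("ok", 1, 2.0),
    ]
    for line in format_turns(group_by_speaker(sample)):
        print(line)
```

Keeping the grouping separate from the API call makes it possible to check the turn logic on its own before looking at why the response itself is incomplete.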
I hope you can help.
【Comments】:
Tags: python-3.x google-cloud-platform speech-to-text google-speech-api google-cloud-speech