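【Posted】: 2016-07-30 00:41:31
【Question】:
I'm trying to build an application that takes streaming audio input (for example, a line in from a microphone) and runs speech-to-text on it using IBM Bluemix (Watson).
I made a simple modification to the sample Java code found at @987654321@. The sample sends a WAV file, but I'm sending FLAC... which [should] be irrelevant.
The results are bad, really bad. This is what I get when using the Java WebSockets code: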
{
  "result_index": 0,
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "it was six weeks ago today the terror ",
          "confidence": 0.92
        }
      ]
    }
  ]
}
Now compare the result above with the one below. This is what comes back when I send the same audio using cURL (HTTP POST):
{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.945,
          "transcript": "it was six weeks ago today the terrorists attacked the U. S. consulate in Benghazi Libya now we've obtained email alerts that were put out by the state department as the attack unfolded as you know four Americans were killed including ambassador Christopher Stevens "
        }
      ],
      "final": true
    },
    {
      "alternatives": [
        {
          "confidence": 0.942,
          "transcript": "sharyl Attkisson has our story "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}
That is a nearly perfect result.
Why is the result different when using WebSockets?
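The WebSocket transcript stops mid-word ("the terror"), which is consistent with, though it does not prove, only part of the audio reaching the service. As a debugging aid rather than a diagnosis, here is a minimal sketch that splits an audio stream into fixed-size chunks (as one might before sending binary frames) and checks that no bytes are lost along the way. The class name `AudioChunker`, the helper names, and the 4096-byte chunk size are all hypothetical and are not taken from the Watson Java SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Watson Java SDK.
public class AudioChunker {

    /** Reads the stream to EOF and splits it into chunks of at most chunkSize bytes. */
    public static List<byte[]> chunk(InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        byte[] buffer = new byte[chunkSize];
        int filled = 0;
        int read;
        // read() may return fewer bytes than requested, so keep filling
        // the buffer until it is full or the stream ends.
        while ((read = in.read(buffer, filled, chunkSize - filled)) != -1) {
            filled += read;
            if (filled == chunkSize) {
                chunks.add(Arrays.copyOf(buffer, filled));
                filled = 0;
            }
        }
        // Keep the final partial chunk; dropping it would truncate the audio.
        if (filled > 0) {
            chunks.add(Arrays.copyOf(buffer, filled));
        }
        return chunks;
    }

    /** Sums the chunk lengths so the total can be compared with the file size. */
    public static long totalBytes(List<byte[]> chunks) {
        long total = 0;
        for (byte[] c : chunks) {
            total += c.length;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeAudio = new byte[10_000]; // stand-in for the FLAC file's bytes
        List<byte[]> chunks = chunk(new ByteArrayInputStream(fakeAudio), 4096);
        System.out.println(chunks.size() + " chunks, " + totalBytes(chunks) + " bytes");
    }
}
```

If the byte total computed this way matches the size of the FLAC file on disk, the truncation is probably happening somewhere else, for example in when the stream is closed or how the session is configured.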
【Comments】:
-
I created an issue in the repository and will look into this: github.com/watson-developer-cloud/java-sdk/issues/231
-
Could you add the audio file you're using to the question?
-
@GermanAttanasio Here is the file: s3.amazonaws.com/mozart-company/tmp/4.flac
-
Cool, I added it to the issue on GitHub and will keep working on it.
Tags: java api ibm-cloud speech-to-text ibm-watson