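【Posted】: 2021-07-20 03:01:28
【Question】:
When the user clicks the Transcribe button, Python needs to automatically detect the language of the audio file being loaded and print the text of the audio file in that language. Is this possible, and what should the function look like? Please help.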
from flask import Flask, render_template, request, redirect
import speech_recognition as sr

app = Flask(__name__)


@app.route("/", methods=["GET", "POST"])
def index():
    transcript = ""
    if request.method == "POST":
        print("FORM DATA RECEIVED")
        if "file" not in request.files:
            return redirect(request.url)
        file = request.files["file"]
        if file.filename == "":
            return redirect(request.url)
        if file:
            recognizer = sr.Recognizer()
            audioFile = sr.AudioFile(file)
            with audioFile as source:
                data = recognizer.record(source)
            transcript = recognizer.recognize_google(data, language="en-US")
    return render_template('index.html', transcript=transcript)


if __name__ == "__main__":
    app.run(debug=True, threaded=True)
OK, I created a drop-down list in the HTML, but how do I link it so that transcript = recognizer.recognize_google(data, language="en-US") returns the result for the selected language?
<label for="lang">Language:</label>
<select name="lang" id="langs">
    <option value="en">English</option>
    <option value="es">Spanish</option>
    <option value="de">German</option>
</select>
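One way to connect the drop-down to the recognizer is sketched below. Flask exposes submitted fields in request.form keyed by the element's name attribute (here "lang"), not its id. The mapping from the short option values to full language tags, and the names LANGUAGE_TAGS, DEFAULT_LANGUAGE and resolve_language, are assumptions made for this illustration and are not part of the original code.

```python
# Hypothetical mapping from the short <option> values in the form
# to the full language tags passed to recognize_google().
LANGUAGE_TAGS = {"en": "en-US", "es": "es-ES", "de": "de-DE"}
DEFAULT_LANGUAGE = "en-US"


def resolve_language(form, field="lang"):
    """Return the language tag for a submitted form.

    `form` is any mapping with a .get() method, such as Flask's
    request.form. `field` must match the <select> element's name
    attribute. Unknown or missing values fall back to the default.
    """
    value = (form.get(field) or "").strip()
    return LANGUAGE_TAGS.get(value, DEFAULT_LANGUAGE)
```

Inside the view this would be used as language = resolve_language(request.form), followed by recognizer.recognize_google(data, language=language). The <select> has to sit inside the same <form> as the file input so that both are submitted together.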
Full code:
from flask import Flask, render_template, request, redirect
import speech_recognition as sr
import requests

app = Flask(__name__)


def get_languages():
    url = 'https://cloud.google.com/speech-to-text/docs/languages'
    resp = requests.get(url)
    start_text = ' <tbody class="list">\n'
    end_text = ' </tbody>\n'
    table = resp.text.split(start_text)[1].split(end_text)[0]
    tr_start = ' <tr>\n'
    sections = table.split(tr_start)[1:]
    languages = []
    for section in sections:
        short = section.splitlines()[1].split('<td>')[1].split('<')[0]
        long = section.splitlines()[0].split('<td>')[1].split('<')[0]
        if len(languages) > 0:
            # dupe check. For some reason the page has all
            # languages twice
            if languages[-1] != {'short': short, 'long': long}:
                languages.append({'short': short, 'long': long})
        else:
            languages.append({'short': short, 'long': long})
    print(f'FOUND {len(languages)} LANGUAGES')
    return languages


language_list = get_languages()


@app.route("/", methods=["GET", "POST"])
def index():
    transcript = ""
    if request.method == "POST":
        print("FORM DATA RECEIVED")
        # set the language, use en-US by default
        language = request.form.get('langs') or 'en-US'
        if "file" not in request.files:
            return redirect(request.url)
        file = request.files["file"]
        if file.filename == "":
            return redirect(request.url)
        if file:
            recognizer = sr.Recognizer()
            audioFile = sr.AudioFile(file)
            with audioFile as source:
                data = recognizer.record(source)
            # change the line below
            transcript = recognizer.recognize_google(data, language=language)
    return render_template('index.html', transcript=transcript, language_list=language_list)


if __name__ == "__main__":
    app.run(debug=True, threaded=True)
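The get_languages() function above depends on the exact whitespace and line layout of the page, so it can break whenever the markup changes. A somewhat less brittle variant, sketched here with the standard library's html.parser, collects the text of each table row's <td> cells instead of splitting strings. The assumption that the first cell holds the language name and the second holds the code is taken from the code above; the names LanguageTableParser and parse_languages are hypothetical, and the real page structure has not been verified.

```python
from html.parser import HTMLParser


class LanguageTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data


def parse_languages(html):
    """Return [{'short': code, 'long': name}, ...] without duplicates."""
    parser = LanguageTableParser()
    parser.feed(html)
    parser.close()
    languages, seen = [], set()
    for cells in parser.rows:
        if len(cells) < 2:
            continue  # header rows (<th>) or malformed rows
        long_name, short = cells[0], cells[1]
        if short and short not in seen:
            seen.add(short)
            languages.append({"short": short, "long": long_name})
    return languages
```

With this helper, get_languages() could reduce to return parse_languages(requests.get(url).text). Deduplicating on the language code also handles repeated entries that are not adjacent.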
【Comments】:
-
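You're talking about a universal translator. That technology doesn't exist. Speech recognition is hard enough when you know which language is being spoken. Perhaps you could do the research to make it happen.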
-
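For example, I have an audio file in Spanish. At the moment this script only recognizes English; I need Python to recognize Spanish automatically when a Spanish audio file is loaded.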
-
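Right. I understand exactly what you're asking. If you know the file is in Spanish and pass that information along with the file, you can select the right language in the script. But outside of Star Trek, the technology to listen to a file and figure out what language it is simply doesn't exist.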
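A rough workaround, when the language is not known but a short list of candidates is, is to transcribe the same audio once per candidate and keep the result with the highest reported confidence. The sketch below only covers the selection step. It assumes each result is what recognize_google(data, language=..., show_all=True) returns: a dict with an "alternative" list of {"transcript", "confidence"} entries, or an empty list when nothing was recognized. That response shape, the name best_transcript, and the idea that confidence values are comparable across languages are all assumptions, so treat this as a heuristic rather than real language detection.

```python
def best_transcript(results_by_language):
    """Pick the (language, transcript) pair with the highest confidence.

    `results_by_language` maps a language tag to the raw result obtained
    for that language. Results that are not dicts (for example an empty
    list meaning "nothing recognized") are skipped. Returns (None, "")
    when no candidate produced a transcript.
    """
    best_language, best_text, best_confidence = None, "", -1.0
    for language, result in results_by_language.items():
        if not isinstance(result, dict):
            continue
        for alternative in result.get("alternative", []):
            # Alternatives without a confidence value count as 0.0.
            confidence = alternative.get("confidence", 0.0)
            if confidence > best_confidence:
                best_language = language
                best_text = alternative.get("transcript", "")
                best_confidence = confidence
    return best_language, best_text
```

In the view, the input could be built as {lang: recognizer.recognize_google(data, language=lang, show_all=True) for lang in candidates}, where candidates is a hypothetical list such as ["en-US", "es-ES", "de-DE"]. Each candidate costs one request, so the list should stay short.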
Tags: python speech-recognition speech-to-text google-cloud-speech