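【Question title】: Is there a way to use the SpeechRecognizer API directly for speech input?
【Posted】: 2011-06-25 21:59:27
【Question description】:

The Android Dev website provides an example of speech input using the built-in Google Speech Input Activity. That activity displays a preconfigured pop-up with a microphone and delivers its results through onActivityResult().

My question: Is there a way to use the SpeechRecognizer class directly for speech input, without displaying the preset activity? That would let me build my own activity for speech input.

【Question comments】:

    Tags: android speech-recognition speech-to-text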


    【Solution 1】:

    Here is code that uses the SpeechRecognizer class (taken from here and here):

    import android.app.Activity;
    import android.content.Intent;
    import android.os.Bundle;
    import android.view.View;
    import android.view.View.OnClickListener;
    import android.speech.RecognitionListener;
    import android.speech.RecognizerIntent;
    import android.speech.SpeechRecognizer;
    import android.widget.Button;
    import android.widget.TextView;
    import java.util.ArrayList;
    import android.util.Log;
    
    
    
    public class VoiceRecognitionTest extends Activity implements OnClickListener 
    {
    
       private TextView mText;
       private SpeechRecognizer sr;
       private static final String TAG = "MyStt3Activity";
       @Override
       public void onCreate(Bundle savedInstanceState) 
       {
                super.onCreate(savedInstanceState);
                setContentView(R.layout.main);
                Button speakButton = (Button) findViewById(R.id.btn_speak);     
                mText = (TextView) findViewById(R.id.textView1);     
                speakButton.setOnClickListener(this);
                sr = SpeechRecognizer.createSpeechRecognizer(this);       
                sr.setRecognitionListener(new listener());        
       }
    
       class listener implements RecognitionListener          
       {
                public void onReadyForSpeech(Bundle params)
                {
                         Log.d(TAG, "onReadyForSpeech");
                }
                public void onBeginningOfSpeech()
                {
                         Log.d(TAG, "onBeginningOfSpeech");
                }
                public void onRmsChanged(float rmsdB)
                {
                         Log.d(TAG, "onRmsChanged");
                }
                public void onBufferReceived(byte[] buffer)
                {
                         Log.d(TAG, "onBufferReceived");
                }
                public void onEndOfSpeech()
                {
                         Log.d(TAG, "onEndofSpeech");
                }
                public void onError(int error)
                {
                         Log.d(TAG,  "error " +  error);
                         mText.setText("error " + error);
                }
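                public void onResults(Bundle results)
                {
                         Log.d(TAG, "onResults " + results);
                         ArrayList<String> data = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                         if (data == null)
                         {
                                   // The recognizer returned no candidate list.
                                   mText.setText("results: 0");
                                   return;
                         }
                         // Concatenate every candidate transcription into str.
                         StringBuilder str = new StringBuilder();
                         for (int i = 0; i < data.size(); i++)
                         {
                                   Log.d(TAG, "result " + data.get(i));
                                   str.append(data.get(i));
                         }
                         // Shows only the number of candidates, not their text.
                         mText.setText("results: " + data.size());
                }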
                public void onPartialResults(Bundle partialResults)
                {
                         Log.d(TAG, "onPartialResults");
                }
                public void onEvent(int eventType, Bundle params)
                {
                         Log.d(TAG, "onEvent " + eventType);
                }
       }
       public void onClick(View v) {
                if (v.getId() == R.id.btn_speak) 
                {
                    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
                    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
                    // Identify the caller by its real package name rather than a hard-coded string.
                    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getPackageName());
                    // Ask for at most five candidate transcriptions.
                    intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS,5);
                    sr.startListening(intent);
                    Log.i(TAG, "startListening");
                }
       }
    }
    

    Define main.xml with a button (id btn_speak) and a TextView (id textView1), and declare the RECORD_AUDIO permission in the manifest.
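The onError callback in the listing above only prints the numeric error code. As a rough sketch of how those codes could be turned into readable text, the hypothetical helper class SpeechErrors below maps each code to the name of the matching SpeechRecognizer.ERROR_* constant. The class name and the message wording are assumptions; the numeric literals are copied from the SDK constants so that the snippet compiles without Android.

```java
/** Hypothetical helper that turns SpeechRecognizer.onError codes into readable text. */
final class SpeechErrors {

    private SpeechErrors() {}

    // The numeric values below mirror the android.speech.SpeechRecognizer.ERROR_*
    // constants so that this sketch compiles without the Android SDK. Inside an
    // app, reference the SDK constants instead of these literals.
    static String describe(int errorCode) {
        switch (errorCode) {
            case 1: return "ERROR_NETWORK_TIMEOUT: network operation timed out";
            case 2: return "ERROR_NETWORK: other network-related error";
            case 3: return "ERROR_AUDIO: audio recording error";
            case 4: return "ERROR_SERVER: server sent an error status";
            case 5: return "ERROR_CLIENT: other client-side error";
            case 6: return "ERROR_SPEECH_TIMEOUT: no speech input";
            case 7: return "ERROR_NO_MATCH: no recognition result matched";
            case 8: return "ERROR_RECOGNIZER_BUSY: recognition service is busy";
            case 9: return "ERROR_INSUFFICIENT_PERMISSIONS: insufficient permissions";
            default: return "unknown error code " + errorCode;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(7));
        System.out.println(describe(9));
    }
}
```

Inside the activity, onError could then call mText.setText(SpeechErrors.describe(error)) instead of showing the bare number.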

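    【Comments】:

    • I came across this question while searching for something else. Although it is an old question, I thought posting an answer would help others.
    • It always outputs 5 or 4, or error 7. Result-5
    • This should be accepted as the answer; it has been 3 years now.
    • For clarity, granting the RECORD_AUDIO permission looks like <uses-permission android:name="android.permission.RECORD_AUDIO" />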
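    【Solution 2】:

    Also make sure to request the appropriate permission from the user. I ran into error 9 (ERROR_INSUFFICIENT_PERMISSIONS) even though I had listed the correct RECORD_AUDIO permission in the manifest.

    By following the sample code here, I was able to obtain the permission from the user, after which the speech recognizer returned a good response.

    For example, I put this block into my activity's onCreate() before calling any SpeechRecognizer methods, although it could go elsewhere in the UI flow: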

        protected void onCreate(Bundle savedInstanceState) {
            ...
            if (ContextCompat.checkSelfPermission(this,
                    Manifest.permission.RECORD_AUDIO)
                    != PackageManager.PERMISSION_GRANTED) {

                // Should we show an explanation?
                if (ActivityCompat.shouldShowRequestPermissionRationale(this,
                        Manifest.permission.RECORD_AUDIO)) {

                    // Show an explanation to the user *asynchronously* -- don't block
                    // this thread waiting for the user's response! After the user
                    // sees the explanation, try again to request the permission.

                } else {

                    // No explanation needed, we can request the permission.

                    ActivityCompat.requestPermissions(this,
                            new String[]{Manifest.permission.RECORD_AUDIO},
                            527);

                    // 527 is an app-defined request code (in this example I just
                    // punched in the value). The callback method receives it
                    // together with the result of the request.
                }
            }
            ...
        }
    

    Then provide the callback method for the permission request in the activity:

    @Override
    public void onRequestPermissionsResult(int requestCode,
                                           String permissions[], int[] grantResults) {
        switch (requestCode) {
            case 527: {
                // If request is cancelled, the result arrays are empty.
                if (grantResults.length > 0
                        && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
    
                    // permission was granted, yay! It is now safe to
                    // start the speech recognition task.
    
                } else {
    
                    // permission denied, boo! Disable the
                    // functionality that depends on this permission.
                }
                return;
            }
    
            // other 'case' lines to check for other
            // permissions this app might request
        }
    }
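The grant check in the callback above can be factored into a small, separately testable helper. The following is only a sketch under stated assumptions: the class PermissionResults and the method allGranted are hypothetical names, and PERMISSION_GRANTED is a local copy of PackageManager.PERMISSION_GRANTED (value 0) so that the snippet compiles outside Android.

```java
/** Hypothetical helper for interpreting onRequestPermissionsResult arrays. */
final class PermissionResults {

    private PermissionResults() {}

    // Local copy of android.content.pm.PackageManager.PERMISSION_GRANTED.
    // In an app, use the SDK constant instead.
    static final int PERMISSION_GRANTED = 0;

    /**
     * Returns true only if at least one result is present and every result
     * is PERMISSION_GRANTED. An empty array means the request was cancelled.
     */
    static boolean allGranted(int[] grantResults) {
        if (grantResults == null || grantResults.length == 0) {
            return false;
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allGranted(new int[] {0}));   // granted
        System.out.println(allGranted(new int[] {-1}));  // denied
        System.out.println(allGranted(new int[0]));      // cancelled
    }
}
```

With such a helper, the case 527 branch would reduce to a single if (PermissionResults.allGranted(grantResults)) check before starting recognition.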
    

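    I had to change one more thing in preetha's sample code above, at the point where the result text is retrieved in the onResults() method. To get the actual text of the transcribed speech (rather than the result count that the original code prints), either print the value of the constructed string str or take one of the values returned in the ArrayList (data). For example: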

    .setText(data.get(0));
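Indexing the list directly fails when the recognizer returns a null or empty list. A minimal sketch of a safer lookup follows, assuming a hypothetical helper class RecognitionResults; it treats the first entry as the best candidate, which is how the result list is generally ordered.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for reading the candidate list passed to onResults. */
final class RecognitionResults {

    private RecognitionResults() {}

    /**
     * Returns the first candidate transcription, or an empty string when the
     * list is null or empty, so callers never hit an index-out-of-bounds error.
     */
    static String topResult(List<String> candidates) {
        if (candidates == null || candidates.isEmpty()) {
            return "";
        }
        return candidates.get(0);
    }

    public static void main(String[] args) {
        System.out.println(topResult(Arrays.asList("hello world", "hollow world")));
        System.out.println(topResult(Collections.<String>emptyList()).isEmpty());
    }
}
```

In onResults, the list obtained from results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION) could be passed to topResult and the returned string shown in the TextView.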
    

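    【Comments】:

      【Solution 3】:

      You can use SpeechRecognizer, although I am not aware of any sample code other than this previous SO question. However, it is new in API level 8 (Android 2.2), so it was not widely available at the time of writing.

      【Comments】:

      • I wrote a test application that tries to call SpeechRecognizer.startListening() with the listener methods implemented, but nothing happens.
      【Solution 4】:

      You can do it like this: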

      import android.app.Activity
      import androidx.appcompat.app.AppCompatActivity
      import android.os.Bundle
      import kotlinx.android.synthetic.main.activity_main.*
      import android.widget.Toast
      import android.content.ActivityNotFoundException
      import android.speech.RecognizerIntent
      import android.content.Intent
      
      class MainActivity : AppCompatActivity() {
          private val REQ_CODE = 100
      
          override fun onCreate(savedInstanceState: Bundle?) {
              super.onCreate(savedInstanceState)
              setContentView(R.layout.activity_main)
      
              speak.setOnClickListener {
                  val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
                  intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                          RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
                  intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE,  "ar-JO") //  Locale.getDefault()
                  intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Need to speak")
                  try {
                      startActivityForResult(intent, REQ_CODE)
                  } catch (a: ActivityNotFoundException) {
                      Toast.makeText(applicationContext,
                              "Sorry, your device is not supported",
                              Toast.LENGTH_SHORT).show()
                  }
              }
          }
      
          override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
              super.onActivityResult(requestCode, resultCode, data)
      
              when (requestCode) {
                  REQ_CODE -> {
                      if (resultCode == Activity.RESULT_OK && data != null) {
                          val result = data
                                  .getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
                          println("result: $result")
                          // The list may be null or empty, so avoid indexing it directly.
                          text.text = result?.firstOrNull() ?: ""
                      }
                  }
              }
          }
      }
      

      The layout can be as simple as:

      <?xml version = "1.0" encoding = "utf-8"?>
      <RelativeLayout xmlns:android = "http://schemas.android.com/apk/res/android"
          xmlns:app = "http://schemas.android.com/apk/res-auto"
          xmlns:tools = "http://schemas.android.com/tools"
          android:layout_width = "match_parent"
          android:layout_height = "match_parent"
          tools:context = ".MainActivity">
          <LinearLayout
              android:layout_width = "match_parent"
              android:gravity = "center"
              android:layout_height = "match_parent">
              <TextView
                  android:id = "@+id/text"
                  android:textSize = "30sp"
                  android:layout_width = "wrap_content"
                  android:layout_height = "wrap_content"/>
          </LinearLayout>
          <LinearLayout
              android:layout_width = "wrap_content"
              android:layout_alignParentBottom = "true"
              android:layout_centerInParent = "true"
              android:orientation = "vertical"
              android:layout_height = "wrap_content">
              <ImageView
                  android:id = "@+id/speak"
                  android:layout_width = "wrap_content"
                  android:layout_height = "wrap_content"
                  android:background = "?selectableItemBackground"
                  android:src = "@android:drawable/ic_btn_speak_now"/>
          </LinearLayout>
      </RelativeLayout>
      

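      The other way, the one you asked about, takes a bit longer but gives you more control and does not bother you with the Google helper dialog:

      1- First, declare the permissions in the Manifest file: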

          <uses-permission android:name="android.permission.INTERNET" />
          <uses-permission android:name="android.permission.RECORD_AUDIO"/>
      

      2- I combined all of the answers above as follows:

      • Create a RecognitionListener class, as follows:
      private val TAG = "Driver-Assistant"
      
      class Listener(context: Context): RecognitionListener {
          private var ctx = context
      
          override fun onReadyForSpeech(params: Bundle?) {
              Log.d(TAG, "onReadyForSpeech")
          }
      
          override fun onRmsChanged(rmsdB: Float) {
              Log.d(TAG, "onRmsChanged")
          }
      
          override fun onBufferReceived(buffer: ByteArray?) {
              Log.d(TAG, "onBufferReceived")
          }
      
          override fun onPartialResults(partialResults: Bundle?) {
              Log.d(TAG, "onPartialResults")
          }
      
          override fun onEvent(eventType: Int, params: Bundle?) {
              Log.d(TAG, "onEvent")
          }
      
          override fun onBeginningOfSpeech() {
              Toast.makeText(ctx, "Speech started", Toast.LENGTH_LONG).show()
          }
      
          override fun onEndOfSpeech() {
              Toast.makeText(ctx, "Speech finished", Toast.LENGTH_LONG).show()
          }
      
          override fun onError(error: Int) {
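              // Named SpeechRecognizer constants replace the raw numeric codes.
              val message = when (error) {
                  SpeechRecognizer.ERROR_SPEECH_TIMEOUT -> "No speech input"
                  SpeechRecognizer.ERROR_SERVER -> "Server sends error status"
                  SpeechRecognizer.ERROR_RECOGNIZER_BUSY -> "RecognitionService busy."
                  SpeechRecognizer.ERROR_NO_MATCH -> "No recognition result matched."
                  SpeechRecognizer.ERROR_NETWORK_TIMEOUT -> "Network operation timed out."
                  SpeechRecognizer.ERROR_NETWORK -> "Other network related errors."
                  SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "Insufficient permissions"
                  SpeechRecognizer.ERROR_CLIENT -> "Other client side errors."
                  SpeechRecognizer.ERROR_AUDIO -> "Audio recording error."
                  else -> "unknown!!"
              }
              Toast.makeText(ctx, "sorry error occurred: $message", Toast.LENGTH_LONG).show()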
          }
      
          override fun onResults(results: Bundle?) {
              Log.d(TAG, "onResults $results")
              val data = results!!.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
              display.text = data!![0]
          }
      }
      
      • In the main file, define the SpeechRecognizer, attach the listener above to it, and do not forget to request the runtime permission. All together:
      lateinit var sr: SpeechRecognizer
      lateinit var display: TextView
      
      class MainActivity : AppCompatActivity() {
      
          override fun onCreate(savedInstanceState: Bundle?) {
              super.onCreate(savedInstanceState)
              setContentView(R.layout.activity_main)
      
              display = text
      
              if (ContextCompat.checkSelfPermission(this,
                              Manifest.permission.RECORD_AUDIO)
                      != PackageManager.PERMISSION_GRANTED) {
                  if (ActivityCompat.shouldShowRequestPermissionRationale(this,
                                  Manifest.permission.RECORD_AUDIO)) {
                  } else {
                      ActivityCompat.requestPermissions(this,
                              arrayOf(Manifest.permission.RECORD_AUDIO),
                              527)
                  }
              }
      
              sr = SpeechRecognizer.createSpeechRecognizer(this)
              sr.setRecognitionListener(Listener(this))
      
              speak.setOnClickListener {
                  val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
                  intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                          RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
                  intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE,  "ar-JO") //  Locale.getDefault()
                  sr.startListening(intent)
              }
      
          }
      
          override fun onRequestPermissionsResult(requestCode: Int, permissions: Array<out String>, grantResults: IntArray) {
              super.onRequestPermissionsResult(requestCode, permissions, grantResults)
              when (requestCode) {
                  527  -> if (grantResults.isNotEmpty()
                          && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
      
                      Toast.makeText(this, "Permission granted", Toast.LENGTH_SHORT).show()
                  } else {
                      Toast.makeText(this, "Permission not granted", Toast.LENGTH_SHORT).show()
                  }
              }
          }
      }
      

      【Comments】:

        【Solution 5】:
        package com.android.example.speechtxt;
        
        import androidx.appcompat.app.AppCompatActivity;
        import androidx.core.content.ContextCompat;
        
        import android.Manifest;
        import android.content.Intent;
        import android.content.pm.PackageManager;
        import android.net.Uri;
        import android.os.Build;
        import android.os.Bundle;
        import android.provider.Settings;
        import android.speech.RecognitionListener;
        import android.speech.RecognizerIntent;
        import android.speech.SpeechRecognizer;
        import android.view.MotionEvent;
        import android.view.View;
        import android.widget.RelativeLayout;
        import android.widget.Toast;
        
        import java.util.ArrayList;
        import java.util.Locale;
        
        public class MainActivity extends AppCompatActivity {
        
            private RelativeLayout relativeLayout;
            private SpeechRecognizer speechRecognizer;
            private Intent speechintent;
            String keeper="";
        
            @Override
            protected void onCreate(Bundle savedInstanceState) {
                super.onCreate(savedInstanceState);
                setContentView(R.layout.activity_main);
        
                checkVoiceCommandPermission();
                relativeLayout = findViewById(R.id.touchscr);
        
                speechRecognizer = SpeechRecognizer.createSpeechRecognizer(getApplicationContext());
                speechintent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
                speechintent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
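                // EXTRA_LANGUAGE expects an IETF language tag string such as "en-US"
                // (Locale.toLanguageTag() requires API level 21 or higher).
                speechintent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault().toLanguageTag());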
        
        
                speechRecognizer.setRecognitionListener(new RecognitionListener() {
                    @Override
                    public void onReadyForSpeech(Bundle params) {
        
                    }
        
                    @Override
                    public void onBeginningOfSpeech() {
        
                    }
        
                    @Override
                    public void onRmsChanged(float rmsdB) {
        
                    }
        
                    @Override
                    public void onBufferReceived(byte[] buffer) {
        
                    }
        
                    @Override
                    public void onEndOfSpeech() {
        
                    }
        
                    @Override
                    public void onError(int error) {
        
                    }
        
                    @Override
                    public void onResults(Bundle results)
                    {
                        ArrayList<String> speakedStringArray = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                        if(speakedStringArray!=null && !speakedStringArray.isEmpty())
                        {
                            keeper = speakedStringArray.get(0);
        
                            Toast.makeText(getApplicationContext(),""+keeper,Toast.LENGTH_SHORT).show();
                        }
                    }
        
                    @Override
                    public void onPartialResults(Bundle partialResults) {
        
                    }
        
                    @Override
                    public void onEvent(int eventType, Bundle params) {
        
                    }
                });
        
                relativeLayout.setOnTouchListener(new View.OnTouchListener() {
                    @Override
                    public boolean onTouch(View v, MotionEvent event) {
                        switch (event.getAction())
                        {
                            case MotionEvent.ACTION_DOWN:
                                speechRecognizer.startListening(speechintent);
                                keeper="";
                                break;
                            case MotionEvent.ACTION_UP:
                                speechRecognizer.stopListening();
                                break;
                        }
                        // Consume the gesture so that ACTION_UP is also delivered to this listener.
                        return true;
                    }
                });
            }
        
        
            private void checkVoiceCommandPermission()
            {
                if(Build.VERSION.SDK_INT>=Build.VERSION_CODES.M)
                {
                    if (!(ContextCompat.checkSelfPermission(MainActivity.this, Manifest.permission.RECORD_AUDIO)== PackageManager.PERMISSION_GRANTED))
                    {
                        Intent intent = new Intent(Settings.ACTION_APPLICATION_DETAILS_SETTINGS, Uri.parse("package:" +getPackageName()));
                        startActivity(intent);
                        finish();
                    }
        
                }
            }
        }
        

        【Comments】:
