【问题标题】:Sphinx4 speech recognition trasncribe demo not working accurately for short wav fileSphinx4 语音识别转录演示对于短 wav 文件无法正常工作
【发布时间】:2012-09-07 19:53:36
【问题描述】:

我刚刚为转录音频文件实现了转录器演示 .. 我的音频文件是 .wav 文件,其中仅包含“BHAVIK”“ANKIT”等名称 “萨加”

我的语法文件由这个语法组成如下:-

公开 = (JAY)|(SAGAR)|(BHAVIK)|(ANKIT)|(MIRAJ)|(YAGNESH);

但问题是转录器演示没有提供正确的结果..它 当我给“JAY”的.wav文件时,只是给我提供了其他东西......它不给 正确的结果..

为什么会这样? 我的 .wav 文件在这里你可以听到..

http://www.crocko.com/62E9703E57844AE6B7A172D5CBBB8DE3/Vocaroo_s0dmd1kxVSfJ.wav

请帮帮我... 提前致谢 我已经像这样在adnroid中实现了逻辑..我使用一个类名作为extaudiorecorder。

公共类 ExtAudioRecorder

{ 私有最终静态 int[] sampleRates = {44100, 22050, 11025, 16000};

public static ExtAudioRecorder getInstanse(Boolean recordingCompressed)
{
    ExtAudioRecorder result = null;

    if(recordingCompressed)
    {
        result = new ExtAudioRecorder(  false, 
                                        AudioSource.MIC, 
                                        sampleRates[3], 
                                        AudioFormat.CHANNEL_CONFIGURATION_MONO,
                                        AudioFormat.ENCODING_PCM_16BIT);
    }
    else
    {
        int i=0;
        do
        {
            result = new ExtAudioRecorder(  true, 
                                            AudioSource.MIC, 
                                            sampleRates[i], 
                                            AudioFormat.CHANNEL_CONFIGURATION_MONO,
                                            AudioFormat.ENCODING_PCM_16BIT);

        } while((++i<sampleRates.length) & !(result.getState() == ExtAudioRecorder.State.INITIALIZING));
    }
    return result;
}

/**
* INITIALIZING : recorder is initializing;
* READY : recorder has been initialized, recorder not yet started
* RECORDING : recording
* ERROR : reconstruction needed
* STOPPED: reset needed
*/
public enum State {INITIALIZING, READY, RECORDING, ERROR, STOPPED};

public static final boolean RECORDING_UNCOMPRESSED = true;
public static final boolean RECORDING_COMPRESSED = false;

// The interval in which the recorded samples are output to the file
// Used only in uncompressed mode
private static final int TIMER_INTERVAL = 120;

// Toggles uncompressed recording on/off; RECORDING_UNCOMPRESSED / RECORDING_COMPRESSED
private boolean         rUncompressed;

// Recorder used for uncompressed recording
private AudioRecord     audioRecorder = null;

// Recorder used for compressed recording
private MediaRecorder   mediaRecorder = null;

// Stores current amplitude (only in uncompressed mode)
private int             cAmplitude= 0;

// Output file path
private String          filePath = null;

// Recorder state; see State
private State           state;

// File writer (only in uncompressed mode)
private RandomAccessFile randomAccessWriter;

// Number of channels, sample rate, sample size(size in bits), buffer size, audio source, sample size(see AudioFormat)
private short                    nChannels;
private int                      sRate;
private short                    bSamples;
private int                      bufferSize;
private int                      aSource;
private int                      aFormat;

// Number of frames written to file on each output(only in uncompressed mode)
private int                      framePeriod;

// Buffer for output(only in uncompressed mode)
private byte[]                   buffer;

// Number of bytes written to file after header(only in uncompressed mode)
// after stop() is called, this size is written to the header/data chunk in the wave file
private int                      payloadSize;

/**
*
* Returns the state of the recorder in a RehearsalAudioRecord.State typed object.
* Useful, as no exceptions are thrown.
*
* @return recorder state
*/
public State getState()
{
    return state;
}

/*
*
* Method used for recording.
*
*/
private AudioRecord.OnRecordPositionUpdateListener updateListener = new AudioRecord.OnRecordPositionUpdateListener()
{
    public void onPeriodicNotification(AudioRecord recorder)
    {
        audioRecorder.read(buffer, 0, buffer.length); // Fill buffer
        try
        { 
            randomAccessWriter.write(buffer); // Write buffer to file
            payloadSize += buffer.length;
            if (bSamples == 16)
            {
                for (int i=0; i<buffer.length/2; i++)
                { // 16bit sample size
                    short curSample = getShort(buffer[i*2], buffer[i*2+1]);
                    if (curSample > cAmplitude)
                    { // Check amplitude
                        cAmplitude = curSample;
                    }
                }
            }
            else    
            { // 8bit sample size
                for (int i=0; i<buffer.length; i++)
                {
                    if (buffer[i] > cAmplitude)
                    { // Check amplitude
                        cAmplitude = buffer[i];
                    }
                }
            }
        }
        catch (IOException e)
        {
            Log.e(ExtAudioRecorder.class.getName(), "Error occured in updateListener, recording is aborted");
            //stop();
        }
    }

    public void onMarkerReached(AudioRecord recorder)
    {
        // NOT USED
    }
};
/** 
 * 
 * 
 * Default constructor
 * 
 * Instantiates a new recorder, in case of compressed recording the parameters can be left as 0.
 * In case of errors, no exception is thrown, but the state is set to ERROR
 * 
 */ 
public ExtAudioRecorder(boolean uncompressed, int audioSource, int sampleRate, int channelConfig, int audioFormat)
{
    try
    {
        rUncompressed = uncompressed;
        if (rUncompressed)
        { // RECORDING_UNCOMPRESSED
            if (audioFormat == AudioFormat.ENCODING_PCM_16BIT)
            {
                bSamples = 16;
            }
            else
            {
                bSamples = 8;
            }

            if (channelConfig == AudioFormat.CHANNEL_CONFIGURATION_MONO)
            {
                nChannels = 1;
            }
            else
            {
                nChannels = 2;
            }

            aSource = audioSource;
            sRate   = sampleRate;
            aFormat = audioFormat;

            framePeriod = sampleRate * TIMER_INTERVAL / 1000;
            bufferSize = framePeriod * 2 * bSamples * nChannels / 8;
            if (bufferSize < AudioRecord.getMinBufferSize(sampleRate, channelConfig, audioFormat))
            { // Check to make sure buffer size is not smaller than the smallest allowed one 
                bufferSize = AudioRecord.getMinBufferSize(sampleRate, channelConfig, audioFormat);
                // Set frame period and timer interval accordingly
                framePeriod = bufferSize / ( 2 * bSamples * nChannels / 8 );
                Log.w(ExtAudioRecorder.class.getName(), "Increasing buffer size to " + Integer.toString(bufferSize));
            }

            audioRecorder = new AudioRecord(audioSource, sampleRate, channelConfig, audioFormat, bufferSize);

            if (audioRecorder.getState() != AudioRecord.STATE_INITIALIZED)
                throw new Exception("AudioRecord initialization failed");
            audioRecorder.setRecordPositionUpdateListener(updateListener);
            audioRecorder.setPositionNotificationPeriod(framePeriod);
        } else
        { // RECORDING_COMPRESSED
            mediaRecorder = new MediaRecorder();
            mediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
            mediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
            mediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);               
        }
        cAmplitude = 0;
        filePath = null;
        state = State.INITIALIZING;
    } catch (Exception e)
    {
        if (e.getMessage() != null)
        {
            Log.e(ExtAudioRecorder.class.getName(), e.getMessage());
        }
        else
        {
            Log.e(ExtAudioRecorder.class.getName(), "Unknown error occured while initializing recording");
        }
        state = State.ERROR;
    }
}

/**
 * Sets output file path, call directly after construction/reset.
 *  
 * @param output file path
 * 
 */
public void setOutputFile(String argPath)
{
    try
    {
        if (state == State.INITIALIZING)
        {
            filePath = argPath;
            if (!rUncompressed)
            {
                mediaRecorder.setOutputFile(filePath);                  
            }
        }
    }
    catch (Exception e)
    {
        if (e.getMessage() != null)
        {
            Log.e(ExtAudioRecorder.class.getName(), e.getMessage());
        }
        else
        {
            Log.e(ExtAudioRecorder.class.getName(), "Unknown error occured while setting output path");
        }
        state = State.ERROR;
    }
}

/**
 * 
 * Returns the largest amplitude sampled since the last call to this method.
 * 
 * @return returns the largest amplitude since the last call, or 0 when not in recording state. 
 * 
 */
public int getMaxAmplitude()
{
    if (state == State.RECORDING)
    {
        if (rUncompressed)
        {
            int result = cAmplitude;
            cAmplitude = 0;
            return result;
        }
        else
        {
            try
            {
                return mediaRecorder.getMaxAmplitude();
            }
            catch (IllegalStateException e)
            {
                return 0;
            }
        }
    }
    else
    {
        return 0;
    }
}


/**
 * 
* Prepares the recorder for recording, in case the recorder is not in the INITIALIZING state and the file path was not set
* the recorder is set to the ERROR state, which makes a reconstruction necessary.
* In case uncompressed recording is toggled, the header of the wave file is written.
* In case of an exception, the state is changed to ERROR
*    
*/
public void prepare()
{
    try
    {
        if (state == State.INITIALIZING)
        {
            if (rUncompressed)
            {
                if ((audioRecorder.getState() == AudioRecord.STATE_INITIALIZED) & (filePath != null))
                {
                    // write file header

                    randomAccessWriter = new RandomAccessFile(filePath, "rw");

                    randomAccessWriter.setLength(0); // Set file length to 0, to prevent unexpected behavior in case the file already existed
                    randomAccessWriter.writeBytes("RIFF");
                    randomAccessWriter.writeInt(0); // Final file size not known yet, write 0 
                    randomAccessWriter.writeBytes("WAVE");
                    randomAccessWriter.writeBytes("fmt ");
                    randomAccessWriter.writeInt(Integer.reverseBytes(16)); // Sub-chunk size, 16 for PCM
                    randomAccessWriter.writeShort(Short.reverseBytes((short) 1)); // AudioFormat, 1 for PCM
                    randomAccessWriter.writeShort(Short.reverseBytes(nChannels));// Number of channels, 1 for mono, 2 for stereo
                    randomAccessWriter.writeInt(Integer.reverseBytes(sRate)); // Sample rate
                    randomAccessWriter.writeInt(Integer.reverseBytes(sRate*bSamples*nChannels/8)); // Byte rate, SampleRate*NumberOfChannels*BitsPerSample/8
                    randomAccessWriter.writeShort(Short.reverseBytes((short)(nChannels*bSamples/8))); // Block align, NumberOfChannels*BitsPerSample/8
                    randomAccessWriter.writeShort(Short.reverseBytes(bSamples)); // Bits per sample
                    randomAccessWriter.writeBytes("data");
                    randomAccessWriter.writeInt(0); // Data chunk size not known yet, write 0

                    buffer = new byte[framePeriod*bSamples/8*nChannels];
                    state = State.READY;
                }
                else
                {
                    Log.e(ExtAudioRecorder.class.getName(), "prepare() method called on uninitialized recorder");
                    state = State.ERROR;
                }
            }
            else
            {
                mediaRecorder.prepare();
                state = State.READY;
            }
        }
        else
        {
            Log.e(ExtAudioRecorder.class.getName(), "prepare() method called on illegal state");
            release();
            state = State.ERROR;
        }
    }
    catch(Exception e)
    {
        if (e.getMessage() != null)
        {
            Log.e(ExtAudioRecorder.class.getName(), e.getMessage());
        }
        else
        {
            Log.e(ExtAudioRecorder.class.getName(), "Unknown error occured in prepare()");
        }
        state = State.ERROR;
    }
}

/**
 * 
 * 
 *  Releases the resources associated with this class, and removes the unnecessary files, when necessary
 *  
 */
public void release()
{
    if (state == State.RECORDING)
    {
        stop();
    }
    else
    {
        if ((state == State.READY) & (rUncompressed))
        {
            try
            {
                randomAccessWriter.close(); // Remove prepared file
            }
            catch (IOException e)
            {
                Log.e(ExtAudioRecorder.class.getName(), "I/O exception occured while closing output file");
            }
            (new File(filePath)).delete();
        }
    }

    if (rUncompressed)
    {
        if (audioRecorder != null)
        {
            audioRecorder.release();
        }
    }
    else
    {
        if (mediaRecorder != null)
        {
            mediaRecorder.release();
        }
    }
}

/**
 * 
 * 
 * Resets the recorder to the INITIALIZING state, as if it was just created.
 * In case the class was in RECORDING state, the recording is stopped.
 * In case of exceptions the class is set to the ERROR state.
 * 
 */
public void reset()
{
    try
    {
        if (state != State.ERROR)
        {
            release();
            filePath = null; // Reset file path
            cAmplitude = 0; // Reset amplitude
            if (rUncompressed)
            {
                audioRecorder = new AudioRecord(aSource, sRate, nChannels+1, aFormat, bufferSize);
            }
            else
            {
                mediaRecorder = new MediaRecorder();
                mediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
                mediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
                mediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
            }
            state = State.INITIALIZING;
        }
    }
    catch (Exception e)
    {
        Log.e(ExtAudioRecorder.class.getName(), e.getMessage());
        state = State.ERROR;
    }
}

/**
 * 
 * 
 * Starts the recording, and sets the state to RECORDING.
 * Call after prepare().
 * 
 */
public void start()
{
    if (state == State.READY)
    {
        if (rUncompressed)
        {
            payloadSize = 0;
            audioRecorder.startRecording();
            audioRecorder.read(buffer, 0, buffer.length);
        }
        else
        {
            mediaRecorder.start();
        }
        state = State.RECORDING;
    }
    else
    {
        Log.e(ExtAudioRecorder.class.getName(), "start() called on illegal state");
        state = State.ERROR;
    }
}

/**
 * 
 * 
 *  Stops the recording, and sets the state to STOPPED.
 * In case of further usage, a reset is needed.
 * Also finalizes the wave file in case of uncompressed recording.
 * 
 */
public void stop()
{
    if (state == State.RECORDING)
    {
        if (rUncompressed)
        {
            audioRecorder.stop();

            try
            {
                randomAccessWriter.seek(4); // Write size to RIFF header
                randomAccessWriter.writeInt(Integer.reverseBytes(36+payloadSize));

                randomAccessWriter.seek(40); // Write size to Subchunk2Size field
                randomAccessWriter.writeInt(Integer.reverseBytes(payloadSize));

                randomAccessWriter.close();
            }
            catch(IOException e)
            {
                Log.e(ExtAudioRecorder.class.getName(), "I/O exception occured while closing output file");
                state = State.ERROR;
            }
        }
        else
        {
            mediaRecorder.stop();
        }
        state = State.STOPPED;
    }
    else
    {
        Log.e(ExtAudioRecorder.class.getName(), "stop() called on illegal state");
        state = State.ERROR;
    }
}

/* 
 * 
 * Converts a byte[2] to a short, in LITTLE_ENDIAN format
 * 
 */
private short getShort(byte argB1, byte argB2)
{
    return (short)(argB1 | (argB2 << 8));
}

}

我在我的主要活动中使用该类,就像这里..我在按钮单击时录制声音并在另一个按钮单击时停止

    package com.test.android_recorder;

    import android.os.Bundle;
    import android.os.Environment;
    import android.app.Activity;
    import android.view.Menu;
    import android.widget.Button;
    import android.widget.TextView;
    import android.view.View;
    import android.view.View.OnClickListener;

    public class AndroidRecorder extends Activity {
         ExtAudioRecorder extAudioRecorder;
        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(com.test.android_recorder.R.layout.activity_android_recorder);
            Button button = (Button) findViewById(R.id.button1);

            Button buttonstop = (Button) findViewById(R.id.buttonstop);

            // Start recording
               extAudioRecorder = ExtAudioRecorder.getInstanse(true);   


            button.setOnClickListener(new OnClickListener()
            {
              public void onClick(View v)
              {

    TextView txt1=(TextView)findViewById(com.test.android_recorder.R.id.textView1);
                  txt1.setText("Button clciked");

                     // Compressed recording (AMR)
                        //extAudioRecorder = ExtAudioRecorder.getInstanse(false); // Uncompressed recording (WAV)

                        extAudioRecorder.setOutputFile(Environment.getExternalStorageDirectory().getAbsolutePath() + "/javarecorder");
                        extAudioRecorder.prepare();
                        extAudioRecorder.start();


                  // ImageView iv = (ImageView) findViewById(R.id.imageview1);
                 //iv.setVisibility(View.VISIBLE);
              }
            });



            buttonstop.setOnClickListener(new OnClickListener()
            {
              public void onClick(View v)
              {

    //TextView txt1=(TextView)findViewById(com.test.android_recorder.R.id.textView1);
                 // txt1.setText("Button clciked");
                  // ImageView iv = (ImageView) findViewById(R.id.imageview1);
                 //iv.setVisibility(View.VISIBLE);

                  // Stop recording
                  extAudioRecorder.stop();
                  extAudioRecorder.release();
              }
            });




            /*


    */
        }

        @Override
        public boolean onCreateOptionsMenu(Menu menu) {
            getMenuInflater().inflate(com.test.android_recorder.R.menu.activity_android_recorder, menu);
            return true;
        }
    }


When i run this code then nothing is happening and in logcat i have following error

Start() called in illegel state.
preapare() called in illegel state.
stop() called in illegel state.

Please Help me Out

【问题讨论】:

    标签: android speech-recognition audio-recording sphinx4 transcription


    【解决方案1】:

    我见过的一个问题是让使用 Wav 文件和其他识别器的人望而却步,那就是文件必须采用受支持的格式。 WAV 是一种通用文件类型,可以支持不同格式的音频(样本大小、单声道/立体声、不同的采样率等)。识别器不如媒体播放器灵活,因此您必须确保您的 wav 文件采用识别器处理的格式。

    http://cmusphinx.sourceforge.net/wiki/faq 说:

    准确性差的主要原因是:

    1. 传入音频的采样率不匹配。肯定是 16khz 16 位单声道小端文件。您需要固定采样率 重新采样的源。验证采样率以进行测试。

    如果您的文件不是这种格式,您可以使用Audacity重新采样

    顺便说一句,我无法使用 QuickTime 或 Windows Media 播放器播放您的 wav 文件。它是什么格式或编码的?

    【讨论】:

    • 感谢您的指导..i ahd 从这个在线 wav 录音机网站vocaroo.com 记录 wav 文件
    • 我想生成那个 16khz 16bit 单声道 little-endian 文件。但我找不到任何能够使用上述规范配置声音文件的录音机。你能建议我吗?我有胆量下载。但在 16khz 16 位单声道小端中找不到该选项
    • 还有一件事我想问一下,我怎样才能用 android 设备记录 16khz 16 位单声道小端 .wav 文件。或者在 java 中使用指定采样转换 .wav 文件的任何转换器代码率?
    • 如果您在 Audacity 中录制,请转到编辑>首选项。然后选择质量。然后将采样设置为“16000 Hz”和 16 位。您还可以将设备设置为默认为 1 通道(单声道)。录制完成后,您可以通过选择文件>导出并将文件类型指定为 WAV 签名 16 位 PCM 来保存它。要重新采样现有文件,请尝试:Tracks->Stereo to Mono 然后 Tracks->Resample (to 16000 Hz)。
    • wiki.audacityteam.org/wiki/WAV - "WAV 文件总是有 little-endian 字节顺序"
    猜你喜欢
    • 2013-10-16
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2016-03-03
    • 1970-01-01
    • 2018-09-30
    • 2016-09-16
    • 1970-01-01
    相关资源
    最近更新 更多