CoreAudio: change sample rate of microphone and get data in a callback?

【Question title】: CoreAudio: change sample rate of microphone and get data in a callback?
【Posted】: 2017-06-04 23:31:55
【Question】:

This is my first time trying to use CoreAudio. My goal is to capture microphone data, resample it to a new sample rate, and then capture the raw 16-bit PCM data.

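My strategy is to make an AUGraph with the microphone --> a sample rate converter, and then have a callback that gets data from the output of the converter (which I hope is the mic output at the new sample rate?).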

Right now my callback only fires with a null AudioBufferList*, which obviously isn't correct. How should I set this up, and what am I doing wrong?

Code follows:

  CheckError(NewAUGraph(&audioGraph), @"Creating graph");
  CheckError(AUGraphOpen(audioGraph), @"Opening graph");

  AUNode micNode, converterNode;
  AudioUnit micUnit, converterUnit;

  makeMic(&audioGraph, &micNode, &micUnit);

  // get the Input/inputBus's stream description
  UInt32 sizeASBD = sizeof(AudioStreamBasicDescription);
  AudioStreamBasicDescription hwASBDin;
  AudioUnitGetProperty(micUnit,
                       kAudioUnitProperty_StreamFormat,
                       kAudioUnitScope_Input,
                       kInputBus,
                       &hwASBDin,
                       &sizeASBD);
  makeConverter(&audioGraph, &converterNode, &converterUnit, hwASBDin);

  // connect mic output to converterNode
  CheckError(AUGraphConnectNodeInput(audioGraph, micNode, 1, converterNode, 0),
             @"Connecting mic to converter");

  // set callback on the output? maybe?
  AURenderCallbackStruct callbackStruct;
  callbackStruct.inputProc = audioCallback;
  callbackStruct.inputProcRefCon = (__bridge void*)self;
  CheckError(AudioUnitSetProperty(micUnit,
                                kAudioOutputUnitProperty_SetInputCallback,
                                kAudioUnitScope_Global,
                                kInputBus,
                                &callbackStruct,
                                sizeof(callbackStruct)),
             @"Setting callback");

  CheckError(AUGraphInitialize(audioGraph), @"AUGraphInitialize");

  // activate audio session
  NSError *err = nil;
  AVAudioSession *audioSession = [AVAudioSession sharedInstance];
  if (![audioSession setActive:YES error:&err]){
    [self error:[NSString stringWithFormat:@"Couldn't activate audio session: %@", err]];
  }
  CheckError(AUGraphStart(audioGraph), @"AUGraphStart");

and:

void makeMic(AUGraph *graph, AUNode *micNode, AudioUnit *micUnit) {
  AudioComponentDescription inputDesc;
  inputDesc.componentType = kAudioUnitType_Output;
  inputDesc.componentSubType = kAudioUnitSubType_VoiceProcessingIO;
  inputDesc.componentFlags = 0;
  inputDesc.componentFlagsMask = 0;
  inputDesc.componentManufacturer = kAudioUnitManufacturer_Apple;

  CheckError(AUGraphAddNode(*graph, &inputDesc, micNode),
             @"Adding mic node");

  CheckError(AUGraphNodeInfo(*graph, *micNode, 0, micUnit),
             @"Getting mic unit");

  // enable microphone for recording
  UInt32 flagOn = 1; // enable value
  CheckError(AudioUnitSetProperty(*micUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Input,
                                  kInputBus,
                                  &flagOn,
                                  sizeof(flagOn)),
             @"Enabling microphone");
}

and:

void makeConverter(AUGraph *graph, AUNode *converterNode, AudioUnit *converterUnit, AudioStreamBasicDescription inFormat) {
  AudioComponentDescription sampleConverterDesc;
  sampleConverterDesc.componentType = kAudioUnitType_FormatConverter;
  sampleConverterDesc.componentSubType = kAudioUnitSubType_AUConverter;
  sampleConverterDesc.componentFlags = 0;
  sampleConverterDesc.componentFlagsMask = 0;
  sampleConverterDesc.componentManufacturer = kAudioUnitManufacturer_Apple;

  CheckError(AUGraphAddNode(*graph, &sampleConverterDesc, converterNode),
             @"Adding converter node");
  CheckError(AUGraphNodeInfo(*graph, *converterNode, 0, converterUnit),
             @"Getting converter unit");

  // describe desired output format
  AudioStreamBasicDescription convertedFormat;
  convertedFormat.mSampleRate           = 16000.0;
  convertedFormat.mFormatID         = kAudioFormatLinearPCM;
  convertedFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
  convertedFormat.mFramesPerPacket  = 1;
  convertedFormat.mChannelsPerFrame = 1;
  convertedFormat.mBitsPerChannel       = 16;
  convertedFormat.mBytesPerPacket       = 2;
  convertedFormat.mBytesPerFrame        = 2;

  // set format descriptions
  CheckError(AudioUnitSetProperty(*converterUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Input,
                                  0, // should be the only bus #
                                  &inFormat,
                                  sizeof(inFormat)),
             @"Setting format of converter input");
  CheckError(AudioUnitSetProperty(*converterUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output,
                                  0, // should be the only bus #
                                  &convertedFormat,
                                  sizeof(convertedFormat)),
             @"Setting format of converter output");
}

【Comments】:

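Does this need to happen live? Capturing audio to a file is much easier. Are you set on using the C API? AVAudioEngine can do a lot of this.
It does need to happen in real time, but I have no preference as to which set of APIs I use.
I checked AVAudioEngine, and it looks like its sample rate conversion is limited to certain sample rates for some reason. My guess is that odd sample rates require the C API.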

Tags: ios objective-c macos core-audio

【Answer 1】:

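A render callback acts as a source for an audio unit. If you set the kAudioOutputUnitProperty_SetInputCallback property on the remoteIO unit, you must call AudioUnitRender from within the callback you provide, and then you would have to do the sample rate conversion manually, which is ugly.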

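There is an "easier" way. The remoteIO acts as two units: the input (mic) and the output (speaker). Create a graph with a remoteIO, then connect the mic to the speaker using the desired format. You can then get the data using a renderNotify callback, which acts as a "tap".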

I created a ViewController class to demonstrate:

#import "ViewController.h"
#import <AudioToolbox/AudioToolbox.h>
#import <AVFoundation/AVFoundation.h>

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    //Set your audio session to allow recording
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:NULL];
    [audioSession setActive:YES error:NULL];

    //Create graph and units
    AUGraph graph = NULL;
    NewAUGraph(&graph);

    AUNode ioNode;
    AudioUnit ioUnit = NULL;
    AudioComponentDescription ioDescription = {0};
    ioDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
    ioDescription.componentType         = kAudioUnitType_Output;
    ioDescription.componentSubType      = kAudioUnitSubType_VoiceProcessingIO;

    AUGraphAddNode(graph, &ioDescription, &ioNode);
    AUGraphOpen(graph);
    AUGraphNodeInfo(graph, ioNode, NULL, &ioUnit);

    UInt32 enable = 1;
    AudioUnitSetProperty(ioUnit,kAudioOutputUnitProperty_EnableIO,kAudioUnitScope_Input,1,&enable,sizeof(enable));

    //Set the output of the ioUnit's input bus, and the input of its output bus, to the desired format.
    //Core Audio basically has implicit converters that we're taking advantage of.
    AudioStreamBasicDescription asbd = {0};
    asbd.mSampleRate        = 16000.0;
    asbd.mFormatID          = kAudioFormatLinearPCM;
    asbd.mFormatFlags       = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    asbd.mFramesPerPacket   = 1;
    asbd.mChannelsPerFrame  = 1;
    asbd.mBitsPerChannel    = 16;
    asbd.mBytesPerPacket    = 2;
    asbd.mBytesPerFrame     = 2;

    AudioUnitSetProperty(ioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &asbd, sizeof(asbd));
    AudioUnitSetProperty(ioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &asbd, sizeof(asbd));

    //Connect the output of the remoteIO's input bus to the input of its output bus
    AUGraphConnectNodeInput(graph, ioNode, 1, ioNode, 0);

    //Add a render notify with a bridged reference to self (If using ARC)
    AudioUnitAddRenderNotify(ioUnit, renderNotify, (__bridge void *)self);

    //Start graph
    AUGraphInitialize(graph);
    AUGraphStart(graph);
    CAShow(graph);



}
OSStatus renderNotify(void                          *inRefCon,
                      AudioUnitRenderActionFlags    *ioActionFlags,
                      const AudioTimeStamp          *inTimeStamp,
                      UInt32                        inBusNumber,
                      UInt32                        inNumberFrames,
                      AudioBufferList               *ioData){

    //Filter anything that isn't a post render call on the input bus
    if (*ioActionFlags != kAudioUnitRenderAction_PostRender || inBusNumber != 1) {
        return noErr;
    }
    //Get a reference to self
    ViewController *self = (__bridge ViewController *)inRefCon;

    //Do stuff with audio

    //Optionally mute the audio by setting it to zero;
    for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
        memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
    }
    return noErr;
}


@end
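
One detail of the filter in renderNotify is worth a note. The render action flags are a bit mask, so an exact comparison against the post-render flag would also reject a post-render call that happens to carry an additional flag. Below is a minimal sketch of a mask-based test, written as portable C so it builds without the framework. The `kSketchAction_*` constants and the `is_post_render_on_bus` helper are hypothetical stand-ins introduced here for illustration; real code should use the `kAudioUnitRenderAction_*` constants from the AudioToolbox headers. Which bus to pass as `wantedBus` is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the AudioToolbox render-action flags so that this sketch
   compiles on its own. In real code, use kAudioUnitRenderAction_* instead. */
typedef uint32_t RenderActionFlags;
enum {
    kSketchAction_PreRender       = 1u << 2,
    kSketchAction_PostRender      = 1u << 3,
    kSketchAction_OutputIsSilence = 1u << 4
};

/* Returns true only for a post-render pass on the wanted bus.
   The flag is tested with a mask, so extra bits do not hide the call. */
static bool is_post_render_on_bus(const RenderActionFlags *flags,
                                  uint32_t busNumber,
                                  uint32_t wantedBus)
{
    assert(flags != NULL);
    return (*flags & kSketchAction_PostRender) != 0 && busNumber == wantedBus;
}
```

Inside a render notify, the early return would then read roughly `if (!is_post_render_on_bus(ioActionFlags, inBusNumber, bus)) return noErr;`, with `bus` being whichever bus carries the format you want to tap.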

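【Comments】:

OK, thank you very much! One follow-up question: how do I get 16-bit samples back out of the AudioBufferList? Right now I'm doing the following: pastebin.com/SM11ykf4 I basically just need to push this data into an NSMutableArray of NSNumbers, but I don't seem to be getting good data right now.
I think you are going to have to learn some C. If you're working with audio, it's worth your time. Creating an object for every sample on the render thread probably won't hold up. But to answer the question, you need to cast the data in ioData->mBuffers[].mData to the desired format.
My C is actually pretty strong; I just need to get this data into an NSMutableArray of NSNumbers so I can pass it to React Native. Agreed that it's not efficient as is, but I want to make sure I'm getting good data before optimizing. Casting mData to a SInt16* and iterating over it gives me bad data. In particular, here's a sample: 15275,0,15112,0,-17608,0,-17491,0,-17460,0,-17507,0,-17768,0,15076,0,15178 The alternating 0s make me think I'm doing something wrong here... I can't find good documentation on kAudioFormatFlagIsPacked; is that to blame?
You were right! The render notify should have been filtering for the output bus; as is, it was getting floats. I'll edit the answer.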
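
The numbers quoted above fit that diagnosis. A small 16-bit value stored as a Float32 has zeros in the low 16 bits of its bit pattern, so walking the buffer through a 16-bit pointer yields one meaningful-looking value and one 0 per sample. For example, 171/32768 is 0x3BAB0000, whose halves are 15275 and 0. The sketch below shows the effect and one possible float-to-16-bit conversion, assuming the samples are Float32 in the range [-1, 1]. `float_halves` and `float_to_s16` are hypothetical helper names, and scaling by 32767 is one common convention, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Split a float's bit pattern into its two 16-bit halves. This is what shows
   up when a Float32 buffer is read through a 16-bit integer pointer. */
static void float_halves(float sample, int16_t *low, int16_t *high)
{
    uint32_t bits;
    memcpy(&bits, &sample, sizeof bits);      /* well-defined type punning */
    *low  = (int16_t)(uint16_t)(bits & 0xFFFFu);
    *high = (int16_t)(uint16_t)(bits >> 16);
}

/* Convert Float32 samples to signed 16-bit PCM.
   Input outside [-1, 1] is clamped; the result is truncated toward zero. */
static void float_to_s16(const float *src, int16_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        float s = src[i];
        if (s > 1.0f)  s = 1.0f;
        if (s < -1.0f) s = -1.0f;
        dst[i] = (int16_t)(s * 32767.0f);
    }
}
```

If the tapped bus already delivers 16-bit integers, no conversion is needed and the buffer can be read directly as 16-bit samples.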
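You should use a circular buffer to get the samples off the render thread before allocating memory. Allocation (and Objective-C messaging) can take locks, which will cause audio glitches.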
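
A circular buffer of the kind suggested here could look like the sketch below: a single-producer, single-consumer ring using C11 atomics, where the render thread only copies samples and another thread drains them and does any allocation. `SampleRing`, `ring_init`, `ring_write`, `ring_read` and the capacity of 4096 samples are all hypothetical choices for illustration. The sketch assumes exactly one writer thread and one reader thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 4096u   /* arbitrary for this sketch; must be a power of two */

typedef struct {
    int16_t        data[RING_CAPACITY];
    _Atomic size_t head;      /* total samples written; stored only by the producer */
    _Atomic size_t tail;      /* total samples read; stored only by the consumer */
} SampleRing;

static void ring_init(SampleRing *r)
{
    assert((RING_CAPACITY & (RING_CAPACITY - 1u)) == 0);
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side (render thread): copies as many samples as fit and returns
   that count. It never blocks and never allocates; excess samples are dropped. */
static size_t ring_write(SampleRing *r, const int16_t *src, size_t count)
{
    size_t head  = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail  = atomic_load_explicit(&r->tail, memory_order_acquire);
    size_t space = RING_CAPACITY - (head - tail);
    if (count > space) count = space;
    for (size_t i = 0; i < count; i++)
        r->data[(head + i) & (RING_CAPACITY - 1u)] = src[i];
    atomic_store_explicit(&r->head, head + count, memory_order_release);
    return count;
}

/* Consumer side (any other thread): copies up to count samples out and
   returns how many were available. */
static size_t ring_read(SampleRing *r, int16_t *dst, size_t count)
{
    size_t tail  = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head  = atomic_load_explicit(&r->head, memory_order_acquire);
    size_t avail = head - tail;
    if (count > avail) count = avail;
    for (size_t i = 0; i < count; i++)
        dst[i] = r->data[(tail + i) & (RING_CAPACITY - 1u)];
    atomic_store_explicit(&r->tail, tail + count, memory_order_release);
    return count;
}
```

In this arrangement the render notify would call `ring_write` with the samples it just received, while a timer or worker thread calls `ring_read` and builds whatever objects the rest of the app needs.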