ffmpeg: Get audio samples in a specific AVSampleFormat from an AVFrame

【Question title】: ffmpeg Get Audio Samples in a specific AVSampleFormat from AVFrame
【Posted】: 2020-11-21 00:25:00
【Question】:

I am looking at this example from the ffmpeg documentation:

static int output_audio_frame(AVFrame *frame)
{
    size_t unpadded_linesize = frame->nb_samples * av_get_bytes_per_sample(frame->format);
    printf("audio_frame n:%d nb_samples:%d pts:%s\n",
           audio_frame_count++, frame->nb_samples,
           av_ts2timestr(frame->pts, &audio_dec_ctx->time_base));
    /* Write the raw audio data samples of the first plane. This works
     * fine for packed formats (e.g. AV_SAMPLE_FMT_S16). However,
     * most audio decoders output planar audio, which uses a separate
     * plane of audio samples for each channel (e.g. AV_SAMPLE_FMT_S16P).
     * In other words, this code will write only the first audio channel
     * in these cases.
     * You should use libswresample or libavfilter to convert the frame
     * to packed data. */
    fwrite(frame->extended_data[0], 1, unpadded_linesize, audio_dst_file);
    return 0;
}

The problem is that I can't set the decoder's output format, so it may give me audio samples in any of the following types:

enum AVSampleFormat {
      AV_SAMPLE_FMT_NONE = -1, AV_SAMPLE_FMT_U8, AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_S32,
      AV_SAMPLE_FMT_FLT, AV_SAMPLE_FMT_DBL, AV_SAMPLE_FMT_U8P, AV_SAMPLE_FMT_S16P,
      AV_SAMPLE_FMT_S32P, AV_SAMPLE_FMT_FLTP, AV_SAMPLE_FMT_DBLP, AV_SAMPLE_FMT_S64,
      AV_SAMPLE_FMT_S64P, AV_SAMPLE_FMT_NB
    }

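I'm working with a sound engine that requires me to send it floating-point PCM data in the range [-1, 1], so I want to get the frame's audio data as floats for two channels (stereo music). How can I do that? Do I need to use libswresample? If so, could anyone give me an example for my case?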

【Question comments】:

Tags: c++ ffmpeg

【Solution 1】:

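If you don't get the desired format from the decoder, you have to convert it yourself: run the decoded frames through a resampler (libswresample also handles sample-format conversion) with AV_SAMPLE_FMT_FLT, the packed float format, as the output format.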
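To make "packed" concrete, here is a minimal sketch of the layout change for the case where the decoder already outputs planar floats (AV_SAMPLE_FMT_FLTP). The helper name interleave_planar_float is hypothetical and not part of ffmpeg; it only illustrates the interleaving step and does not replace libswresample, which also covers the other input formats and sample-rate changes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ffmpeg function): interleave planar float
// channels into one packed buffer. planes[ch] is assumed to hold
// nb_samples floats for channel ch, the way a planar float frame stores
// one plane per channel. The result is laid out as
// L R L R ... for stereo, like a packed float buffer.
std::vector<float> interleave_planar_float(const float *const *planes,
                                           int channels, int nb_samples)
{
    std::vector<float> packed(static_cast<std::size_t>(channels) * nb_samples);
    for (int s = 0; s < nb_samples; ++s) {
        for (int ch = 0; ch < channels; ++ch) {
            // Sample s of channel ch goes to slot s * channels + ch.
            packed[static_cast<std::size_t>(s) * channels + ch] = planes[ch][s];
        }
    }
    return packed;
}
```

With a decoded frame this would be called with the frame's plane pointers reinterpreted as float pointers, and only after checking that frame->format really is AV_SAMPLE_FMT_FLTP; for any other format the bytes are not floats and the result would be garbage.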

According to the documentation of enum AVSampleFormat:

The floating-point formats are based on full volume being in the range [-1.0, 1.0]. Any values outside this range are beyond full volume level.
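As a rough illustration of what that range means for integer input, the sketch below maps signed 16-bit samples onto floats by dividing by 32768. Both the helper name s16_to_float and the choice of divisor are assumptions made for this example (a common convention, not something stated in the documentation quoted above); in a real pipeline the resampler does this conversion for you.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an ffmpeg function): convert signed 16-bit PCM
// samples to floats. Dividing by 32768 is an assumed convention: it maps
// -32768 to exactly -1.0 and 32767 to just below +1.0.
std::vector<float> s16_to_float(const std::int16_t *in, std::size_t count)
{
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = static_cast<float>(in[i]) / 32768.0f;
    }
    return out;
}
```

The same function works on packed stereo data as long as count is the total number of values (samples per channel times the number of channels), because the conversion is applied value by value and leaves the channel order untouched.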

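All the examples are well documented and not that complicated. The function names alone are largely self-explanatory, so they should not be hard to follow.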

【Comments】:

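I don't agree that "All the Examples are well documented and not that complicated.", but since I don't have any other source, I'll dig into the examples.
OK, agreed, maybe not all of them, but I have seen worse. The last time I used the av libraries was ten years ago, and I remember working from the examples because not much else was available. The mplayer source code is also a good learning resource.
I agree, the library itself is great, but oh my god, there is no good resource that tells you how it works and how to use it properly. I even checked udemy, but no, nothing.
What is giving you the most trouble? Maybe I can say something about it.
Well, as for how the internals work, good luck. But basically you have the format (the container), the codec_context (a wrapper around the codec), the codec (the en-/decoder), and the packet (a data container). One problem I remember is that, since you use av_format for both source and sink, it is hard to tell which fields of the format struct are set when, and why. The same goes for the codec context. So the easiest approach is to rely on the API's interface and not mess with the internals. To learn the interface, read the documentation, study the examples, and of course experiment.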