【发布时间】:2020-06-08 06:06:34
【问题描述】:
我正在使用 FFmpeg 库生成 MP4 文件,其中包含来自各种文件(如 MP3、WAV、OGG)的音频,但我遇到了一些麻烦(我也将视频放在那里,但为了简单起见,我'我忽略了这个问题,因为我已经开始工作了)。我当前的代码打开一个音频文件,解码内容并将其转换为 MP4 容器,最后将其作为交错帧写入目标文件。
它适用于大多数 MP3 文件,但在输入 WAV 或 OGG 时,生成的 MP4 中的音频会稍微失真,并且经常以错误的速度播放(快很多倍或慢很多倍)。
我查看了无数使用转换函数 (swr_convert) 的示例,但我似乎无法摆脱导出音频中的噪音。
以下是我向 MP4 添加音频流的方法(outContext 是输出文件的 AVFormatContext):
audioCodec = avcodec_find_encoder(outContext->oformat->audio_codec);
if (!audioCodec)
die("Could not find audio encoder!");
// Start stream
audioStream = avformat_new_stream(outContext, audioCodec);
if (!audioStream)
die("Could not allocate audio stream!");
audioCodecContext = audioStream->codec;
audioStream->id = 1;
// Setup
audioCodecContext->sample_fmt = AV_SAMPLE_FMT_S16;
audioCodecContext->bit_rate = 128000;
audioCodecContext->sample_rate = 44100;
audioCodecContext->channels = 2;
audioCodecContext->channel_layout = AV_CH_LAYOUT_STEREO;
// Open the codec
if (avcodec_open2(audioCodecContext, audioCodec, NULL) < 0)
die("Could not open audio codec");
并从 MP3/WAV/OGG(从文件名变量)打开声音文件...
// Create contex
formatContext = avformat_alloc_context();
if (avformat_open_input(&formatContext, filename, NULL, NULL)<0)
die("Could not open file");
// Find info
if (avformat_find_stream_info(formatContext, 0)<0)
die("Could not find file info");
av_dump_format(formatContext, 0, filename, false);
// Find audio stream
streamId = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
if (streamId < 0)
die("Could not find Audio Stream");
codecContext = formatContext->streams[streamId]->codec;
// Find decoder
codec = avcodec_find_decoder(codecContext->codec_id);
if (codec == NULL)
die("cannot find codec!");
// Open codec
if (avcodec_open2(codecContext, codec, 0)<0)
die("Codec cannot be found");
// Set up resample context
swrContext = swr_alloc();
if (!swrContext)
die("Failed to alloc swr context");
av_opt_set_int(swrContext, "in_channel_count", codecContext->channels, 0);
av_opt_set_int(swrContext, "in_channel_layout", codecContext->channel_layout, 0);
av_opt_set_int(swrContext, "in_sample_rate", codecContext->sample_rate, 0);
av_opt_set_sample_fmt(swrContext, "in_sample_fmt", codecContext->sample_fmt, 0);
av_opt_set_int(swrContext, "out_channel_count", audioCodecContext->channels, 0);
av_opt_set_int(swrContext, "out_channel_layout", audioCodecContext->channel_layout, 0);
av_opt_set_int(swrContext, "out_sample_rate", audioCodecContext->sample_rate, 0);
av_opt_set_sample_fmt(swrContext, "out_sample_fmt", audioCodecContext->sample_fmt, 0);
if (swr_init(swrContext))
die("Failed to init swr context");
最后,解码+转换+编码...
// Allocate and init re-usable frames
audioFrameDecoded = av_frame_alloc();
if (!audioFrameDecoded)
die("Could not allocate audio frame");
audioFrameDecoded->format = fileCodecContext->sample_fmt;
audioFrameDecoded->channel_layout = fileCodecContext->channel_layout;
audioFrameDecoded->channels = fileCodecContext->channels;
audioFrameDecoded->sample_rate = fileCodecContext->sample_rate;
audioFrameConverted = av_frame_alloc();
if (!audioFrameConverted)
die("Could not allocate audio frame");
audioFrameConverted->nb_samples = audioCodecContext->frame_size;
audioFrameConverted->format = audioCodecContext->sample_fmt;
audioFrameConverted->channel_layout = audioCodecContext->channel_layout;
audioFrameConverted->channels = audioCodecContext->channels;
audioFrameConverted->sample_rate = audioCodecContext->sample_rate;
AVPacket inPacket;
av_init_packet(&inPacket);
inPacket.data = NULL;
inPacket.size = 0;
int frameFinished = 0;
while (av_read_frame(formatContext, &inPacket) >= 0) {
if (inPacket.stream_index == streamId) {
int len = avcodec_decode_audio4(fileCodecContext, audioFrameDecoded, &frameFinished, &inPacket);
if (frameFinished) {
// Convert
uint8_t *convertedData=NULL;
if (av_samples_alloc(&convertedData,
NULL,
audioCodecContext->channels,
audioFrameConverted->nb_samples,
audioCodecContext->sample_fmt, 0) < 0)
die("Could not allocate samples");
int outSamples = swr_convert(swrContext,
&convertedData,
audioFrameConverted->nb_samples,
(const uint8_t **)audioFrameDecoded->data,
audioFrameDecoded->nb_samples);
if (outSamples < 0)
die("Could not convert");
size_t buffer_size = av_samples_get_buffer_size(NULL,
audioCodecContext->channels,
audioFrameConverted->nb_samples,
audioCodecContext->sample_fmt,
0);
if (buffer_size < 0)
die("Invalid buffer size");
if (avcodec_fill_audio_frame(audioFrameConverted,
audioCodecContext->channels,
audioCodecContext->sample_fmt,
convertedData,
buffer_size,
0) < 0)
die("Could not fill frame");
AVPacket outPacket;
av_init_packet(&outPacket);
outPacket.data = NULL;
outPacket.size = 0;
if (avcodec_encode_audio2(audioCodecContext, &outPacket, audioFrameConverted, &frameFinished) < 0)
die("Error encoding audio frame");
if (frameFinished) {
outPacket.stream_index = audioStream->index;
if (av_interleaved_write_frame(outContext, &outPacket) != 0)
die("Error while writing audio frame");
av_free_packet(&outPacket);
}
}
}
}
av_frame_free(&audioFrameConverted);
av_frame_free(&audioFrameDecoded);
av_free_packet(&inPacket);
我也尝试为传出帧设置适当的 pts 值,但这似乎根本不会影响音质。
我也不确定如何/是否应该分配转换后的数据,可以使用 av_samples_alloc 吗? avcodec_fill_audio_frame 呢?我在正确的轨道上吗?
感谢任何输入(如果您想听到失真,我也可以发送导出的 MP4 文件)。
【问题讨论】: