【Question title】: ffmpeg - merge mp3 and mp4 (duration difference)
【Posted】: 2017-12-06 20:18:11
【Question】:

I am trying to merge an mp4 file and an mp3 file with ffmpeg. The mp4 is 9.800 s long and the mp3 is 58.540 s long, so I use the -shortest option. Command:

ffmpeg -i video.mp4 -i audio.mp3 -c:v libx264 -c:a aac -strict experimental -shortest output.mp4

After that I get an output.mp4 with a duration of 9.846 s. Where is my mistake? Why is the output video longer than the source video (9.846 s versus 9.800 s)?

Source mp4 MediaInfo:

General
Complete name                  : F:\video test\video.mp4
Format                         : MPEG-4
Format profile                 : Base Media
Codec ID                       : iso5 (iso5/dash)
File size                      : 3.19 MiB
Duration                       : 9 s 800 ms
Overall bit rate               : 2 732 kb/s
Encoded date                   : UTC 2017-11-24 20:53:53
Tagged date                    : UTC 2017-11-24 20:53:53

Video
ID                             : 1
Format                         : AVC
Format/Info                    : Advanced Video Codec
Format profile                 : High@L3.1
Format settings                : CABAC / 4 Ref Frames
Format settings, CABAC         : Yes
Format settings, ReFrames      : 4 frames
Codec ID                       : avc1
Codec ID/Info                  : Advanced Video Coding
Duration                       : 9 s 800 ms
Bit rate                       : 2 729 kb/s
Maximum bit rate               : 3 766 kb/s
Width                          : 1 280 pixels
Height                         : 720 pixels
Display aspect ratio           : 16:9
Frame rate mode                : Constant
Frame rate                     : 25.000 FPS
Color space                    : YUV
Chroma subsampling             : 4:2:0
Bit depth                      : 8 bits
Scan type                      : Progressive
Bits/(Pixel*Frame)             : 0.118
Stream size                    : 3.19 MiB (100%)
Writing library                : x264 core 146
Encoding settings              : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=12 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=25 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=23.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00
Tagged date                    : UTC 2017-11-24 20:53:53

Source mp3 MediaInfo:

General
Complete name                  : F:\video test\audio.mp3
Format                         : MPEG Audio
File size                      : 1.19 MiB
Duration                       : 58 s 540 ms
Overall bit rate mode          : Variable
Overall bit rate               : 170 kb/s
Writing library                : LAME3.99r

Audio
Format                         : MPEG Audio
Format version                 : Version 1
Format profile                 : Layer 3
Format settings                : Joint stereo / MS Stereo
Duration                       : 58 s 540 ms
Bit rate mode                  : Variable
Bit rate                       : 170 kb/s
Minimum bit rate               : 32.0 kb/s
Channel(s)                     : 2 channels
Sampling rate                  : 44.1 kHz
Frame rate                     : 38.281 FPS (1152 SPF)
Compression mode               : Lossy
Stream size                    : 1.19 MiB (100%)
Writing library                : LAME3.99r
Encoding settings              : -m j -V 2 -q 0 -lowpass 18.5 --vbr-new -b 32

Console output:

ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 7.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-cuda --enable-cuvid --enable-d3d11va --enable-nvenc --enable-dxva2 --enable-avisynth --enable-libmfx
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mp4':
  Metadata:
    major_brand     : iso5
    minor_version   : 1
    compatible_brands: iso5dash
    creation_time   : 2017-11-24T20:53:53.000000Z
  Duration: 00:00:09.80, start: 0.000000, bitrate: 2732 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 2259 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    : VideoHandler
Input #1, mp3, from 'audio.mp3':
  Duration: 00:00:58.54, start: 0.025057, bitrate: 170 kb/s
    Stream #1:0: Audio: mp3, 44100 Hz, stereo, s16p, 170 kb/s
    Metadata:
      encoder         : LAME3.99r
    Side data:
      replaygain: track gain - -2.200000, track peak - unknown, album gain - unknown, album peak - unknown, 
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (mp3 (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 00000000005ab440] using SAR=1/1
[libx264 @ 00000000005ab440] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 00000000005ab440] profile High, level 3.1
[libx264 @ 00000000005ab440] 264 - core 152 r2851 ba24899 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    major_brand     : iso5
    minor_version   : 1
    compatible_brands: iso5dash
    encoder         : Lavf57.83.100
    Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 1280x720 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : Lavc57.107.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc57.107.100 aac
    Side data:
      replaygain: track gain - -2.200000, track peak - unknown, album gain - unknown, album peak - unknown, 
frame=   54 fps=0.0 q=28.0 size=       0kB time=00:00:00.04 bitrate=   8.3kbits/s speed=0.0927x    
frame=   80 fps= 80 q=28.0 size=       0kB time=00:00:01.09 bitrate=   0.4kbits/s speed=1.09x    
frame=   98 fps= 65 q=28.0 size=     256kB time=00:00:01.83 bitrate=1143.5kbits/s speed=1.21x    
frame=  119 fps= 59 q=28.0 size=     512kB time=00:00:02.67 bitrate=1570.9kbits/s speed=1.32x    
frame=  144 fps= 56 q=28.0 size=     768kB time=00:00:03.66 bitrate=1715.0kbits/s speed=1.42x    
frame=  167 fps= 52 q=28.0 size=    1024kB time=00:00:04.57 bitrate=1833.9kbits/s speed=1.44x    
frame=  190 fps= 51 q=28.0 size=    1280kB time=00:00:05.50 bitrate=1905.5kbits/s speed=1.47x    
frame=  218 fps= 51 q=28.0 size=    1792kB time=00:00:06.64 bitrate=2210.6kbits/s speed=1.56x    
frame=  242 fps= 50 q=28.0 size=    2048kB time=00:00:07.56 bitrate=2216.4kbits/s speed=1.58x    
frame=  245 fps= 41 q=-1.0 Lsize=    3045kB time=00:00:09.82 bitrate=2539.6kbits/s speed=1.65x    
video:2880kB audio:156kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.298058%
[libx264 @ 00000000005ab440] frame I:14    Avg QP:20.01  size: 39750
[libx264 @ 00000000005ab440] frame P:106   Avg QP:23.85  size: 14578
[libx264 @ 00000000005ab440] frame B:125   Avg QP:24.63  size:  6770
[libx264 @ 00000000005ab440] consecutive B-frames: 22.9% 22.0% 15.9% 39.2%
[libx264 @ 00000000005ab440] mb I  I16..4: 16.7% 80.3%  3.0%
[libx264 @ 00000000005ab440] mb P  I16..4: 10.2% 36.2%  1.1%  P16..4: 25.0%  7.9%  2.5%  0.0%  0.0%    skip:17.1%
[libx264 @ 00000000005ab440] mb B  I16..4:  2.3%  5.8%  0.2%  B16..8: 31.4%  6.5%  0.9%  direct: 3.7%  skip:49.2%  L0:51.8% L1:44.5% BI: 3.7%
[libx264 @ 00000000005ab440] 8x8 transform intra:76.1% inter:86.3%
[libx264 @ 00000000005ab440] coded y,uvDC,uvAC intra: 38.3% 52.1% 9.0% inter: 12.3% 20.1% 0.2%
[libx264 @ 00000000005ab440] i16 v,h,dc,p: 30% 28%  9% 33%
[libx264 @ 00000000005ab440] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 36% 23% 19%  3%  3%  4%  4%  4%  4%
[libx264 @ 00000000005ab440] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 33% 21% 14%  5%  7%  7%  6%  5%  3%
[libx264 @ 00000000005ab440] i8c dc,h,v,p: 45% 24% 25%  6%
[libx264 @ 00000000005ab440] Weighted P-Frames: Y:13.2% UV:6.6%
[libx264 @ 00000000005ab440] ref P L0: 71.7% 12.5% 12.9%  2.7%  0.2%
[libx264 @ 00000000005ab440] ref B L0: 92.8%  6.3%  0.9%
[libx264 @ 00000000005ab440] ref B L1: 98.3%  1.7%
[libx264 @ 00000000005ab440] kb/s:2406.56
[aac @ 00000000005adde0] Qavg: 511.420

The ffprobe -show_packets output is too large to post here, so I uploaded it to pastebin: https://pastebin.com/TYSMdceS

【Question comments】:

    Tags: video ffmpeg mp3 mp4 concat


    【Answer 1】:

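    The quick answer to your question is that FFmpeg's AAC encoder writes an extra AAC priming packet at the beginning, starting at about -0.0232 s (see the audio packet list further down). This adds to your duration. If it helps, I will try to give a more detailed answer later. You can check it with ffprobe -show_packets output.mp4

    I looked at the packet dump you shared. Your video packets look like this: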
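To make a large packet dump easier to read, here is a minimal sketch that summarises each stream's time span. It assumes the dump was saved in ffprobe's default text format (blocks delimited by `[PACKET]` and `[/PACKET]`, with `key=value` lines). The helper names `parse_packets` and `stream_spans` and the `SAMPLE` string are hypothetical; the two sample packets reuse the last two audio packets quoted in this answer.

```python
def parse_packets(text):
    """Parse ffprobe's default -show_packets text output into a list of dicts."""
    packets, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[PACKET]":
            current = {}
        elif line == "[/PACKET]":
            if current is not None:
                packets.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return packets


def stream_spans(packets):
    """Return {codec_type: (earliest pts, latest pts + duration)} in seconds."""
    spans = {}
    for packet in packets:
        try:
            start = float(packet["pts_time"])
            end = start + float(packet["duration_time"])
        except (KeyError, ValueError):
            continue  # skip packets whose timing is missing or "N/A"
        kind = packet.get("codec_type", "unknown")
        if kind in spans:
            first, last = spans[kind]
            spans[kind] = (min(first, start), max(last, end))
        else:
            spans[kind] = (start, end)
    return spans


# Illustrative input only: a real dump contains many more packets and fields.
SAMPLE = """\
[PACKET]
codec_type=audio
pts_time=9.775601
duration_time=0.023220
[/PACKET]
[PACKET]
codec_type=audio
pts_time=9.798821
duration_time=0.023175
[/PACKET]
"""

for kind, (start, end) in stream_spans(parse_packets(SAMPLE)).items():
    print(f"{kind}: first pts {start:.6f} s, last packet ends at {end:.6f} s")
```

On a full dump (for example one saved with `ffprobe -show_packets output.mp4 > packets.txt`), the first audio pts would be the negative priming packet, and comparing the audio and video spans shows where the extra duration comes from.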

    dts: -0.08 | pts: 0.0
    dts: -0.04 | pts: 0.12
    dts:  0.0  | pts: 0.04
    dts:  0.04 | pts: 0.08
    dts:  0.08 | pts: 0.24
    ...
    dts:  9.64 | pts: 9.76
    dts:  9.68 | pts: 9.72
    

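    The pts values jump back and forth, probably because the stream contains B-frames in I B B P order. Your video stream is 25 fps, so one frame lasts 0.04 s. The highest pts is 9.76, which makes the video 9.76 + 0.04 (frame duration) = 9.8 s long.

    Your source audio is longer than the video, so it is truncated at the first packet that ends at 9.80 s or later. Your audio packets look like this: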
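The video arithmetic above can be checked with a short sketch. It uses only the pts values listed in this answer and the 25 fps figure from the ffmpeg log; the variable names are illustrative. Exact fractions are used so the 0.04 s steps do not pick up floating-point noise.

```python
from fractions import Fraction

FPS = 25                            # "25 fps" in the ffmpeg log
FRAME_DURATION = Fraction(1, FPS)   # 0.04 s per frame

# pts values in stored (decode) order, in seconds, as listed above.
head_pts = [Fraction(s) for s in ("0.0", "0.12", "0.04", "0.08")]
tail_pts = [Fraction(s) for s in ("9.76", "9.72")]

# Sorting by pts gives presentation order; B-frames are stored out of order.
display_order = sorted(head_pts)

# The stream ends when the frame with the highest pts has been shown.
video_end = max(tail_pts) + FRAME_DURATION
frame_count = video_end / FRAME_DURATION

print([float(t) for t in display_order])  # [0.0, 0.04, 0.08, 0.12]
print(float(video_end))                   # 9.8
print(int(frame_count))                   # 245
```

The frame count of 245 agrees with the last progress line of the console output (`frame=  245`).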

    pts: -0.023220 (AAC priming data)
    pts:  0.0
    pts:  0.023220
    ...
    pts:  9.775601 | duration: 0.023220
    pts:  9.798821 | duration: 0.023175
    

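    Your last audio packet must end at or after 9.80 s. The packet starting at 9.775601 ends at 9.798821, which is still short of 9.80, so the packet starting at 9.798821 is kept as well. The audio duration muxed into the output is therefore 0.02322 (priming packet) + 9.798821 + 0.023175 (duration) = 9.845216 s.

    I am not sure where the remaining 0.001 s comes from (9.846 reported versus 9.845216 computed). Someone else may be able to comment. This is the skip-samples side data I see at the start: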
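A small sketch of the audio arithmetic, using only the packet values quoted above (variable names are illustrative). `Decimal` keeps the six-digit timestamps exact.

```python
from decimal import Decimal

priming = Decimal("0.023220")      # priming packet starts at pts -0.023220
prev_pts = Decimal("9.775601")     # second-to-last audio packet
prev_dur = Decimal("0.023220")
last_pts = Decimal("9.798821")     # last audio packet
last_dur = Decimal("0.023175")
video_end = Decimal("9.80")

prev_end = prev_pts + prev_dur     # falls just short of the video end
last_end = last_pts + last_dur     # first packet end at or past the video end

# Total audio span: from the negative priming pts to the end of the last packet.
audio_span = priming + last_end

print(prev_end, prev_end < video_end)    # 9.798821 True
print(last_end, last_end >= video_end)   # 9.821996 True
print(audio_span)                        # 9.845216
```

The second-to-last packet ends 0.001179 s before 9.80, which is why one more packet is written.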

    [SIDE_DATA]
    side_data_type=Skip Samples
    skip_samples=1024
    discard_padding=0
    skip_reason=0
    discard_reason=0
    [/SIDE_DATA]
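The `skip_samples=1024` value lines up with the priming packet's timestamp. A quick check, assuming the 44100 Hz sample rate shown for the output audio stream in the console log:

```python
SAMPLE_RATE = 44100    # Hz, from the output stream line in the ffmpeg log
SKIP_SAMPLES = 1024    # from the Skip Samples side data

# Convert the number of skipped samples to seconds.
priming_seconds = SKIP_SAMPLES / SAMPLE_RATE

print(f"{priming_seconds:.6f}")  # 0.023220
```

This matches the -0.023220 pts of the first audio packet, so the skipped samples correspond to that priming packet.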
    

    I hope this helps.

    【Comments】:

    • I updated the post and added the ffprobe output