yaoyaohust

Approach 1: Find the dialogue-heavy parts from the subtitles or the audio track

- Extract the audio track:

ffmpeg -i a.mp4 -map 0:a:0 a.mp3

- Extract the RMS level frame by frame:

ffmpeg -i in.mp3 -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level:file=log.txt -f null -

Determining audio level peaks with ffmpeg

https://superuser.com/questions/1183663/determining-audio-level-peaks-with-ffmpeg
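
To make the contents of log.txt concrete, here is a minimal sketch that turns the ametadata output into (time, RMS) pairs. The function name `parse_rms_log` and the sample lines are made up for illustration; real logs follow the same two-line pattern (a frame header with `pts_time`, then the `RMS_level` key), but the numbers will differ.

```python
import re

# Matches the "pts_time:<seconds>" field on a frame header line.
PTS_RE = re.compile(r"pts_time:([0-9.]+)")
KEY = "lavfi.astats.Overall.RMS_level="

def parse_rms_log(lines):
    """Return a list of (time_in_seconds, rms_db) pairs from ametadata output."""
    samples = []
    ts = None
    for line in lines:
        m = PTS_RE.search(line)
        if m:
            ts = float(m.group(1))
        elif KEY in line and ts is not None:
            rms = float(line.split(KEY)[1])
            # Silent frames are reported as -inf; skip them.
            if rms != float("-inf"):
                samples.append((ts, rms))
    return samples

# Hypothetical sample lines in the shape ffmpeg writes to log.txt.
sample_log = [
    "frame:0    pts:0       pts_time:0",
    "lavfi.astats.Overall.RMS_level=-31.5",
    "frame:1    pts:1152    pts_time:0.0261224",
    "lavfi.astats.Overall.RMS_level=-inf",
    "frame:2    pts:2304    pts_time:0.0522449",
    "lavfi.astats.Overall.RMS_level=-28.25",
]
print(parse_rms_log(sample_log))
```

For a real file, `parse_rms_log(open("log.txt"))` would work the same way, since the function only iterates over lines.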

- Analyze the volume of the whole file:

ffmpeg -i input.wav -filter:a volumedetect -f null /dev/null

https://trac.ffmpeg.org/wiki/AudioVolume

https://ffmpeg.org/ffmpeg-filters.html#volumedetect 
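
volumedetect writes its summary to ffmpeg's log output (stderr), so a script has to pick the numbers out of that text. The sketch below shows one way to do it; `parse_volumedetect` is a hypothetical helper, and the sample text only imitates the shape of the real output with invented values.

```python
import re

# Captures the two summary lines, e.g. "mean_volume: -27.0 dB".
VOLUME_RE = re.compile(r"(mean_volume|max_volume):\s*(-?[0-9.]+) dB")

def parse_volumedetect(stderr_text):
    """Return a dict with mean_volume and max_volume in dB."""
    return {name: float(value) for name, value in VOLUME_RE.findall(stderr_text)}

# Hypothetical excerpt of what ffmpeg prints for the volumedetect filter.
sample_output = """\
[Parsed_volumedetect_0 @ 0x5581] n_samples: 4845056
[Parsed_volumedetect_0 @ 0x5581] mean_volume: -27.0 dB
[Parsed_volumedetect_0 @ 0x5581] max_volume: -4.2 dB
[Parsed_volumedetect_0 @ 0x5581] histogram_4db: 6
"""
print(parse_volumedetect(sample_output))
```

The text could be captured with `subprocess.run(cmd, stderr=subprocess.PIPE, text=True).stderr`, where `cmd` is the command above split into a list.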

- Cut out a clip:

ffmpeg -ss $ss -t 00:05:00 -i $vfile.mp4 -vcodec copy -acodec copy -y $vfile.${ss//:/_}.mp4

https://stackoverflow.com/questions/21420296/how-to-extract-time-accurate-video-segments-with-ffmpeg
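
When there are many start times to cut, the same command can be assembled in Python. This is only a sketch that mirrors the shell line above: `build_cut_command` is a hypothetical helper, and "movie" is a placeholder file name. Note that with stream copy the cut generally begins at a keyframe, so the real start may be slightly off from `ss`.

```python
def build_cut_command(ss, vfile, duration="00:05:00"):
    """Build the ffmpeg argument list for one stream-copy cut starting at ss."""
    # Same naming as the shell version: colons in the timestamp become underscores.
    out = "%s.%s.mp4" % (vfile, ss.replace(":", "_"))
    return ["ffmpeg", "-ss", ss, "-t", duration, "-i", vfile + ".mp4",
            "-vcodec", "copy", "-acodec", "copy", "-y", out]

cmd = build_cut_command("00:12:00", "movie")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```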

 

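Extract the highlight time ranges (the script reads log.txt from standard input, prints the start time of each highlight to stdout and a per-minute trace to stderr):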

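import sys

def getv(rms):
    # Map an RMS level in dB (0 is loudest, more negative is quieter) to a 0..100 score.
    return max(0, 100 - abs(rms))

def extract(diff):
    # diff: loudness increases of the last 5 one-minute bins, oldest first.
    # Flag a highlight when at least 3 bins got louder and at least 2 of the
    # 3 oldest bins rose by 3 or more.
    pos = 0
    pos3 = 0
    for n, v in enumerate(diff):
        if v > 0:
            pos += 1
        if n < 3 and v >= 3:
            pos3 += 1
    if pos >= 3 and pos3 >= 2:
        return 1
    return 0

timebin = 0      # start of the current one-minute bin, in seconds
s = []           # RMS samples collected in the current bin
v = []           # score of each finished bin
diff = (0,) * 5
avgrms = -100    # fallback (score 0) in case the first bin has no usable samples
for line in sys.stdin:
    if 'pts_time' in line:
        ts = float(line.split('pts_time:')[1])
        if ts > timebin + 60:
            if s:
                avgrms = int(sum(s) / len(s))
            if v:
                d = max(0, getv(avgrms) - v[-1])
                diff = diff[1:] + (d,)
                ext = extract(diff)
                # Trace on stderr: minute, second, average RMS, score change, bars.
                print('%3d %2d %s %3d' % (timebin // 60, timebin % 60, avgrms, getv(avgrms) - v[-1]),
                      '-' * d, '*' * ext, file=sys.stderr)
                if ext:
                    # Highlight start times go to stdout as HH:MM:00.
                    h = timebin // 3600
                    print('%.2d:%.2d:00' % (h, (timebin - 3600 * h) // 60))
                    diff = (0,) * 5
            v.append(getv(avgrms))
            timebin += 60
            s = []
    if 'RMS' in line:
        rms = float(line.split('lavfi.astats.Overall.RMS_level=')[1])
        if rms > -1000:   # skips silent frames, which ffmpeg reports as -inf
            s.append(rms)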
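
The rule inside extract() is easier to see on a few hand-made windows. The snippet below restates the function compactly (same conditions as in the script above) and runs it on invented values; each tuple holds the loudness increases of five consecutive minutes, oldest first.

```python
def extract(diff):
    """Return 1 if the 5-minute window looks like the start of a highlight."""
    pos = sum(1 for d in diff if d > 0)         # minutes that got louder
    pos3 = sum(1 for d in diff[:3] if d >= 3)   # strong rises among the 3 oldest
    return 1 if pos >= 3 and pos3 >= 2 else 0

print(extract((3, 4, 1, 0, 0)))  # three rises, two strong ones early: 1
print(extract((0, 0, 5, 5, 5)))  # three rises, but only one strong early rise: 0
print(extract((4, 0, 6, 0, 0)))  # two strong early rises, but only two rises: 0
```

In other words, the script looks for a sustained build-up in loudness that begins with at least two sharp jumps, not for a single loud minute.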

 

Debugging:

ffmpeg volumedetect returns unstable result

https://stackoverflow.com/questions/48673923/ffmpeg-volumedetect-returns-unstable-result

 

Approach 2: Approach 1 plus shot boundary detection

Installing OpenCV: https://www.cnblogs.com/yaoyaohust/p/10228888.html

Shot boundary detection: https://www.cnblogs.com/lynsyklate/p/7840881.html

Hecate, Yahoo's open-source tool: https://github.com/yahoo/hecate
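
As a rough illustration of what shot boundary detection adds, the sketch below uses histogram differences between consecutive frames, one common and simple technique; it is not taken from the linked article or from Hecate. All names and the threshold are assumptions. The histograms are plain lists here; in practice they could come from frames read with OpenCV (`cv2.VideoCapture`, `cv2.calcHist`), which is left out. Snapping an audio-based start time back to the previous shot boundary is one plausible way to combine this with approach 1.

```python
def hist_distance(h1, h2):
    """Sum of absolute bin differences, normalized to the range 0..1."""
    total = float(sum(h1) + sum(h2)) or 1.0
    return sum(abs(a - b) for a, b in zip(h1, h2)) / total

def shot_boundaries(histograms, threshold=0.5):
    """Return the indices of frames that start a new shot."""
    return [i for i in range(1, len(histograms))
            if hist_distance(histograms[i - 1], histograms[i]) > threshold]

def snap_to_boundary(start, boundary_times):
    """Move a start time to the closest shot boundary at or before it."""
    earlier = [t for t in boundary_times if t <= start]
    return max(earlier) if earlier else start

# Toy 4-bin histograms: frames 0-2 are dark, frames 3-4 are bright.
hists = [[90, 10, 0, 0], [88, 12, 0, 0], [91, 9, 0, 0],
         [0, 5, 15, 80], [0, 4, 16, 80]]
cuts = shot_boundaries(hists)
print(cuts)

# Hypothetical boundary times in seconds; 720.0 stands for a 00:12:00 highlight.
print(snap_to_boundary(720.0, [0.0, 655.2, 713.4, 731.0]))
```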

 

Approach 3: More time-consuming and technically harder methods

A brief introduction to and analysis of Baidu's BROAD-Video Highlights dataset

https://zhuanlan.zhihu.com/p/31770408

 

A roundup of 2017 conference papers on Temporal Action Detection

https://zhuanlan.zhihu.com/p/31501316

 

Video Analysis explained: Temporal Action Detection

https://zhuanlan.zhihu.com/p/26603387

 

Video Analysis explained: Action Recognition

https://zhuanlan.zhihu.com/p/26460437

 

Temporal Action Detection with Structured Segment Networks

From Dahua Lin's team (The Chinese University of Hong Kong)

https://github.com/yjxiong/action-detection

Based on PyTorch and DenseFlow

 

UntrimmedNets for Weakly Supervised Action Recognition and Detection

From Dahua Lin's team (The Chinese University of Hong Kong)

https://github.com/wanglimin/UntrimmedNet

https://github.com/yjxiong/caffe/tree/untrimmednet

Based on Caffe

 
