Image buffer display order with VTDecompressionSession: answers

【Question title】: Image buffer display order with VTDecompressionSession
【Posted】: 2016-01-19 14:33:45
【Question description】:

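I have a project where I need to decode H.264 video from a live network stream and end up with a texture that I can display in another framework (Unity3D) on iOS devices. I can successfully decode the video with VTDecompressionSession and then obtain a texture with CVMetalTextureCacheCreateTextureFromImage (or the OpenGL variant). This works great with a low-latency encoder, where the image buffers come out in display order. With a regular encoder, however, the image buffers do not come out in display order, and reordering them is apparently far more difficult than I expected.

My first attempt was to set VTDecodeFrameFlags to kVTDecodeFrame_EnableAsynchronousDecompression and kVTDecodeFrame_EnableTemporalProcessing. However, it turns out that VTDecompressionSession can choose to ignore the flag and do whatever it wants. In my case it ignores the flag and still outputs the buffers in encoder order (not display order). Essentially useless.

My next attempt was to associate each image buffer with a presentation timestamp and put them into a vector, so that I could grab the image buffer I needed when creating a texture. The problem seems to be that the image buffer that goes into VTDecompressionSession associated with a timestamp is no longer the buffer that comes out, which essentially makes the timestamp useless.

For example, going into the decoder...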

  VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
  VTDecodeInfoFlags flagOut;
  // Presentation time stamp to be passed with the buffer
  NSNumber *nsPts = [NSNumber numberWithDouble:pts];

  VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
                                          (void*)CFBridgingRetain(nsPts), &flagOut);

On the callback side...

void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon, void *sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CVImageBufferRef imageBuffer, CMTime presentationTimeStamp, CMTime presentationDuration)
 {
      // The presentation time stamp...
      // No longer seems to be associated with the buffer that it went in with!
      NSNumber* pts = CFBridgingRelease(sourceFrameRefCon);
 }

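When sorted, the timestamps on the callback side increase monotonically at the expected rate, but the buffers are not in the right order. Does anyone see where I am making a mistake here? Or know how to determine the order of the buffers on the callback side? At this point I have tried just about everything I can think of.

【Comments】: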

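Did you ever solve this? It is giving me a headache.
I haven't solved it yet. I am fairly sure that, without reordering, a video containing B-frames plays back in a high-low-middle order, as I would expect. However, it is clear that the video frames, although in high-low-middle order, are no longer associated with the high-low-middle presentationTimeStamps arriving in the callback. That breaks the sort, and you end up with a weird frame playback order. At least I know I'm not the only one...
I'm looking at XBMC's implementation. They have a comment in the callback to the effect that frames sometimes arrive in decode order, along with a priority queue they use to reorder them. I must say I'm not impressed with the Video Toolbox API: it is poorly documented, and this bug is really bad.
I assume you are using H.264 encoding... Are you using your own encoder or Apple's? I just wrote a test app using VTDecompressionSession with the Apple encoder, and it orders the frames perfectly. It seems VTDecompressionSession does not like the encoder I am using. :-/
I think it depends on how many B- and P-frames you have. My content is produced by a few different encoders, and some do look worse than others, but I think that comes down to the B/P frame ratio.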

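Tags: ios core-video video-toolbox

【Solution 1】:

In my case the problem was not VTDecompressionSession; the demuxer was getting the wrong PTS. Although I could not get VTDecompressionSession to output frames in temporal (display) order with the kVTDecodeFrame_EnableAsynchronousDecompression and kVTDecodeFrame_EnableTemporalProcessing flags, I was able to sort the frames by PTS myself with a small vector.

First, make sure you associate all of the timing information with the CMSampleBuffer along with the block buffer, so that it is received in the VTDecompressionSession callback.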

// Wrap our CMBlockBuffer in a CMSampleBuffer...
CMSampleBufferRef sampleBuffer;

CMTime duration = ...;
CMTime presentationTimeStamp = ...;
CMTime decompressTimeStamp = ...;

CMSampleTimingInfo timingInfo{duration, presentationTimeStamp, decompressTimeStamp};

_sampleTimingArray[0] = timingInfo;
_sampleSizeArray[0] = nalLength;

// Wrap the CMBlockBuffer...
status = CMSampleBufferCreate(kCFAllocatorDefault, blockBuffer, true, NULL, NULL, _formatDescription, 1, 1, _sampleTimingArray, 1, _sampleSizeArray, &sampleBuffer);

Then decode the frame. It is worth trying the flags to get the frames out in display order.

VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression | kVTDecodeFrame_EnableTemporalProcessing;
VTDecodeInfoFlags flagOut;

// No per-frame context is passed; the PTS arrives via the callback's presentationTimeStamp
VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
                                      NULL, &flagOut);

On the callback side we need a way to sort the CVImageBufferRefs we receive. I use a struct that holds the CVImageBufferRef and the PTS, and then a vector of size 2 that does the actual sorting.

struct Buffer
{
    CVImageBufferRef imageBuffer = NULL;
    double pts = 0;
};

std::vector <Buffer> _buffer;

We also need a way to sort the Buffers. Always writing to and reading from the index with the lowest PTS works well.

 -(int) getMinIndex
 {
     if(_buffer[0].pts > _buffer[1].pts)
     {
         return 1;
     }

     return 0;
 }

In the callback we need to fill the vector with Buffers...

void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon, void *sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CVImageBufferRef imageBuffer, CMTime presentationTimeStamp, CMTime presentationDuration)
{
    StreamManager *streamManager = (__bridge StreamManager *)decompressionOutputRefCon;

    @synchronized(streamManager)
    {
        if (status != noErr)
        {
            NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
            NSLog(@"Decompressed error: %@", error);
        }
        else
        {
            // Get the PTS
            double pts = CMTimeGetSeconds(presentationTimeStamp);

            // Fill our buffer initially
            if(!streamManager->_bufferReady)
            {
                Buffer buffer;

                buffer.pts = pts;
                buffer.imageBuffer = imageBuffer;

                CVBufferRetain(buffer.imageBuffer);

                streamManager->_buffer[streamManager->_bufferIndex++] = buffer;
            }
            else
            {
                // Push new buffers to the index with the lowest PTS
                int index = [streamManager getMinIndex];

                // Release the old CVImageBufferRef
                CVBufferRelease(streamManager->_buffer[index].imageBuffer);

                Buffer buffer;

                buffer.pts = pts;
                buffer.imageBuffer = imageBuffer;

                // Retain the new CVImageBufferRef
                CVBufferRetain(buffer.imageBuffer);

                streamManager->_buffer[index] = buffer;
            }

            // Wrap around the buffer when initialized
            // _bufferWindow = 2
            if(streamManager->_bufferIndex == streamManager->_bufferWindow)
            {
                streamManager->_bufferReady = YES;
                streamManager->_bufferIndex = 0;
            }
        }
    }
}

Finally, we need to drain the Buffers in temporal (display) order...

 - (void)drainBuffer
 {
      @synchronized(self)
      {
         if(_bufferReady)
         {
             // Drain buffers from the index with the lowest PTS
             int index = [self getMinIndex];

             Buffer buffer = _buffer[index];

             // Do something useful with the buffer now in display order
         }
       }
 }
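
For windows larger than two, the same idea can be sketched with a priority queue, along the lines of what the comments describe for XBMC. This is only an illustration under stated assumptions, not the code from this answer: `Frame`, `WindowReorderer` and `reorder` are hypothetical names, the `id` field stands in for a retained CVImageBufferRef, and the PTS is an integer tick count rather than a CMTime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical stand-in for a decoded frame. In a real callback this would
// hold a retained CVImageBufferRef next to its PTS.
struct Frame {
    int64_t pts;  // presentation timestamp in timescale ticks
    int id;       // placeholder for the image buffer
};

// Orders the priority queue so that the frame with the smallest PTS is on top.
struct LaterPts {
    bool operator()(const Frame& a, const Frame& b) const { return a.pts > b.pts; }
};

class WindowReorderer {
public:
    explicit WindowReorderer(std::size_t window) : window_(window) {}

    // Called in decode order. Returns true and fills `out` once more than
    // `window` frames are queued, i.e. once the smallest PTS is assumed safe.
    bool push(const Frame& in, Frame& out) {
        heap_.push(in);
        if (heap_.size() <= window_) return false;
        out = heap_.top();
        heap_.pop();
        return true;
    }

    // Called repeatedly at end of stream to drain what is left, in PTS order.
    bool flush(Frame& out) {
        if (heap_.empty()) return false;
        out = heap_.top();
        heap_.pop();
        return true;
    }

private:
    std::size_t window_;
    std::priority_queue<Frame, std::vector<Frame>, LaterPts> heap_;
};

// Convenience helper: feeds PTS values in decode order and returns them in
// the order the reorderer emits them.
std::vector<int64_t> reorder(const std::vector<int64_t>& decodeOrderPts, std::size_t window) {
    WindowReorderer r(window);
    std::vector<int64_t> out;
    Frame f{};
    int id = 0;
    for (int64_t pts : decodeOrderPts) {
        if (r.push(Frame{pts, id++}, f)) out.push_back(f.pts);
    }
    while (r.flush(f)) out.push_back(f.pts);
    return out;
}
```

With a decode order of 0, 4, 2, 1, 3, a window of 2 emits the frames in display order, while a window of 1 does not, which is why the window size has to match the encoder's reordering depth.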

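【Discussion】:

【Solution 2】:

I would like to improve on that answer a bit. While the outlined solution works, it requires knowing how many frames are needed before an output frame can be produced. The example uses a buffer of size 2, but in my case I needed a buffer of size 3. To avoid having to specify this in advance, one can exploit the fact that frames (in display order) line up exactly in terms of pts/duration, i.e. the end of one frame is exactly the start of the next. So one can simply accumulate frames until there is no "gap" at the beginning, then pop the first frame, and so on. One can also take the pts of the first frame (which is always an I-frame) as the initial "head" (since it does not have to be zero...). Here is some code that does this: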

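#include <CoreMedia/CMTime.h>
#include <CoreVideo/CVImageBuffer.h>

#include <boost/container/flat_set.hpp>
#include <boost/optional.hpp>

#include <cassert>
#include <utility>

// Note: CFGuard is not defined in this answer; it is presumably the author's
// own RAII wrapper that retains and releases a Core Foundation object.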

inline bool operator<(const CMTime& left, const CMTime& right)
{
    return CMTimeCompare(left, right) < 0;
}

inline bool operator==(const CMTime& left, const CMTime& right)
{
    return CMTimeCompare(left, right) == 0;
}

inline CMTime operator+(const CMTime& left, const CMTime& right)
{
    return CMTimeAdd(left, right);
}

class reorder_buffer_t
{
public:

    struct entry_t
    {
        CFGuard<CVImageBufferRef> image;
        CMTime pts;
        CMTime duration;
        bool operator<(const entry_t& other) const
        {
            return pts < other.pts;
        }
    };

private:

    typedef boost::container::flat_set<entry_t> buffer_t;

public:

    reorder_buffer_t()
    {
    }

    void push(entry_t entry)
    {
        if (!_head)
            _head = entry.pts;
        _buffer.insert(std::move(entry));
    }

    bool empty() const
    {
        return _buffer.empty();
    }

    bool ready() const
    {
        return !empty() && _buffer.begin()->pts == _head;
    }

    entry_t pop()
    {
        assert(ready());
        auto entry = *_buffer.begin();
        _buffer.erase(_buffer.begin());
        _head = entry.pts + entry.duration;
        return entry;
    }

    void clear()
    {
        _buffer.clear();
        _head = boost::none;
    }

private:

    boost::optional<CMTime> _head;
    buffer_t _buffer;
};
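
The gap-free "head" idea can also be tried out without Boost or CoreMedia. The following is a minimal sketch under stated assumptions, not the answer's code: `Entry`, `GaplessReorderer` and `displayOrder` are hypothetical names, integer ticks stand in for CMTime, and the image itself is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical frame record: integer ticks stand in for CMTime, and the
// image buffer is omitted to keep the sketch self-contained.
struct Entry {
    int64_t pts;
    int64_t duration;
};

class GaplessReorderer {
public:
    void push(const Entry& e) {
        // The first frame pushed defines where the timeline starts.
        if (!hasHead_) {
            head_ = e.pts;
            hasHead_ = true;
        }
        pending_.emplace(e.pts, e);
    }

    // True when the earliest pending frame starts exactly at the head,
    // i.e. there is no gap before it.
    bool ready() const {
        return hasHead_ && !pending_.empty() && pending_.begin()->first == head_;
    }

    Entry pop() {
        assert(ready());
        Entry e = pending_.begin()->second;
        pending_.erase(pending_.begin());
        head_ = e.pts + e.duration;  // the next frame must start where this one ends
        return e;
    }

    void clear() {
        pending_.clear();
        hasHead_ = false;
    }

private:
    bool hasHead_ = false;
    int64_t head_ = 0;
    std::map<int64_t, Entry> pending_;  // kept sorted by PTS
};

// Feeds frames in decode order and returns the PTS values in emission order.
std::vector<int64_t> displayOrder(const std::vector<Entry>& decodeOrder) {
    GaplessReorderer r;
    std::vector<int64_t> out;
    for (const Entry& e : decodeOrder) {
        r.push(e);
        while (r.ready()) out.push_back(r.pop().pts);
    }
    return out;
}
```

One consequence of this design is that a missing frame leaves a permanent gap: nothing after it is emitted until `clear()` is called, so a real decoder would probably want a fallback for dropped frames.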

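【Discussion】:

【Solution 3】:

Here is a solution that works for any required buffer size and does not need any third-party libraries. My C++ code may not be the best, but it works.

We create a Buffer struct to identify the buffers by pts: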

struct Buffer
{
    CVImageBufferRef imageBuffer = NULL;
    uint64_t pts = 0;
};

In our decoder we need to keep track of the buffers and of the pts we expect to release next:

@property (nonatomic) std::vector <Buffer> buffers;
@property (nonatomic, assign) uint64_t nextExpectedPts;

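Now we are ready to handle the incoming buffers. In my case the buffers are delivered asynchronously. Make sure you provide correct duration and presentation timestamp values to the decompression session so that the frames can be sorted properly: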

-(void)handleImageBuffer:(CVImageBufferRef)imageBuffer pts:(CMTime)presentationTimeStamp duration:(uint64_t)duration {
    //Situation 1, we can directly pass over this buffer
    if (self.nextExpectedPts == presentationTimeStamp.value || duration == 0) {
        [self sendImageBuffer:imageBuffer duration:duration];
        return;
    }
    //Situation 2, we got this buffer too fast. We will store it, but first we check if we have already stored the expected buffer
    Buffer futureBuffer = [self bufferWithImageBuffer:imageBuffer pts:presentationTimeStamp.value];
    int smallestPtsInBufferIndex = [self getSmallestPtsBufferIndex];
    if (smallestPtsInBufferIndex >= 0 && self.nextExpectedPts == self.buffers[smallestPtsInBufferIndex].pts) {
        //We found the next buffer, lets store the current buffer and return this one
        Buffer bufferWithSmallestPts = self.buffers[smallestPtsInBufferIndex];
        [self sendImageBuffer:bufferWithSmallestPts.imageBuffer duration:duration];
        CVBufferRelease(bufferWithSmallestPts.imageBuffer);
        [self setBuffer:futureBuffer atIndex:smallestPtsInBufferIndex];
    } else {
        //We dont have the next buffer yet, lets store this one to a new slot
        [self setBuffer:futureBuffer atIndex:self.buffers.size()];
    }
}

-(Buffer)bufferWithImageBuffer:(CVImageBufferRef)imageBuffer pts:(uint64_t)pts {
    Buffer futureBuffer = Buffer();
    futureBuffer.pts = pts;
    futureBuffer.imageBuffer = imageBuffer;
    CVBufferRetain(futureBuffer.imageBuffer);
    return futureBuffer;
}

- (void)sendImageBuffer:(CVImageBufferRef)imageBuffer duration:(uint64_t)duration {
    //Send your buffer to wherever you need it here
    self.nextExpectedPts += duration;
}

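-(int) getSmallestPtsBufferIndex
{
    // Returns the index of the stored buffer with the smallest pts, or -1 if none is stored.
    // minIndex < 0 marks "nothing found yet", so a stored pts of 0 is handled correctly.
    int minIndex = -1;
    uint64_t minPts = 0;
    for(int i=0;i<(int)_buffers.size();i++) {
        if (minIndex < 0 || _buffers[i].pts < minPts) {
            minPts = _buffers[i].pts;
            minIndex = i;
        }
    }
    return minIndex;
}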

- (void)setBuffer:(Buffer)buffer atIndex:(int)index {
    if (_buffers.size() <= index) {
        _buffers.push_back(buffer);
    } else {
        _buffers[index] = buffer;
    }
}

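Don't forget to release all buffers in the vector when deallocating your decoder. And if, for example, you are working with a looping file, keep track of when the file has fully looped so that you can reset nextExpectedPts and so on.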
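
A reset along those lines could look roughly like the sketch below. It is an assumption-laden illustration rather than part of the answer: `StoredBuffer`, `ReorderState` and `resetReorderState` are hypothetical names, `void*` stands in for a retained CVImageBufferRef, and the `release` callback is where a real decoder would call CVBufferRelease.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical mirror of the answer's Buffer struct; void* stands in for a
// retained CVImageBufferRef so the sketch compiles without CoreVideo.
struct StoredBuffer {
    void* imageBuffer;
    uint64_t pts;
};

// Hypothetical container for the decoder's reordering state.
struct ReorderState {
    std::vector<StoredBuffer> buffers;
    uint64_t nextExpectedPts = 0;
};

// Releases every stored buffer through `release`, empties the vector and
// resets the expected pts. Returns how many buffers were released.
std::size_t resetReorderState(ReorderState& state,
                              const std::function<void(void*)>& release) {
    std::size_t released = 0;
    for (StoredBuffer& b : state.buffers) {
        if (b.imageBuffer != nullptr) {
            release(b.imageBuffer);
            ++released;
        }
    }
    state.buffers.clear();
    state.nextExpectedPts = 0;
    return released;
}
```

The same function could be called from the decoder's teardown path and whenever the looping file wraps around. Resetting the expected pts to 0 assumes the stream restarts at 0, which may not hold for every source.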

【Discussion】: