【发布时间】:2014-10-29 12:56:32
【问题描述】:
上下文
我制作了一个 directshow 过滤器来更改视频的对比度和亮度。我想加快速度。
没有 SSE 的工作过滤器
HRESULT CBrightness::Transform(IMediaSample *pMediaSample)
{
...
BYTE *pData; // Pointer to the actual image buffer
pMediaSample->GetPointer(&pData);
int numPixels = cxImage * cyImage;
...
prgb = (RGBTRIPLE*) pData;
for (int iPixel=0; iPixel < numPixels; iPixel++ ) {
RGBTRIPLE *ppixel = prgb + iPixel;
ppixel->rgbtGreen = ppixel->rgbtGreen * _contrastPower + _brightnessPower;
ppixel->rgbtBlue = ppixel->rgbtBlue * _contrastPower + _brightnessPower;
ppixel->rgbtRed = ppixel->rgbtRed * _contrastPower + _brightnessPower;
if(ppixel->rgbtGreen>255) ppixel->rgbtGreen = 255;
if(ppixel->rgbtBlue>255) ppixel->rgbtBlue = 255;
if(ppixel->rgbtRed>255) ppixel->rgbtRed = 255;
}
...
}
SEE 过滤器无效
HRESULT CBrightness::Transform(IMediaSample *pMediaSample)
{
BYTE *pData; // Pointer to the actual image buffer
long lDataLen; // Holds length of any given sample
int iPixel; // Used to loop through the image pixels
RGBTRIPLE *prgb; // Holds a pointer to the current pixel
AM_MEDIA_TYPE* pType = &m_pInput->CurrentMediaType();
VIDEOINFOHEADER *pvi = (VIDEOINFOHEADER *) pType->pbFormat;
ASSERT(pvi);
CheckPointer(pMediaSample,E_POINTER);
pMediaSample->GetPointer(&pData);
lDataLen = pMediaSample->GetSize();
// Get the image properties from the BITMAPINFOHEADER
int cxImage = pvi->bmiHeader.biWidth;
int cyImage = pvi->bmiHeader.biHeight;
int numPixels = cxImage * cyImage;
prgb = (RGBTRIPLE*) pData;
double dcontrast = 0.7;
__m128d cStore = _mm_set1_pd(dcontrast);
BYTE *pDataOutput = new BYTE[lDataLen];
for (iPixel=0; iPixel < numPixels; iPixel += 4 ) {
//unpack to 32 bits
__m128i current = _mm_unpacklo_epi8( _mm_loadu_si128( (__m128i*)( prgb+iPixel ) ), _mm_setzero_si128());
__m128d image = _mm_cvtepi32_pd(_mm_unpacklo_epi16(current, _mm_setzero_si128()));
//vector operations
__m128d result = _mm_mul_pd(cStore, image);
//pack back to 8 bits
__m128i pack_32 = _mm_cvtpd_epi32 (result);
__m128i pack_16 = _mm_packs_epi32 (pack_32, pack_32);
__m128i pack_8 = _mm_packus_epi16(pack_16, pack_16);
//store the new pixel in pDataOutput
_mm_storeu_si128((__m128i*)(pDataOutput+iPixel), pack_8);
//also tryed to store the result in the original array
//_mm_storeu_si128((__m128i*)(prgb+iPixel), pack_8); // blacks out the whole video
}
//assign the original pointer to point at the start of the new data array
pData = pDataOutput;
return NOERROR;
}
问题
这段代码对原始流没有任何作用:
//store the new pixel in pDataOutput
_mm_storeu_si128((__m128i*)(pDataOutput+iPixel), pack_8);
....
pData = pDataOutput;
这段代码使整个视频变黑:
_mm_storeu_si128((__m128i*)(prgb+iPixel), pack_8);
问题
我是否正确使用了 SSE 指令?
如何将修改后的数据分配给原始媒体样本指针?
【问题讨论】:
-
使用双精度数进行算术几乎会消除 SIMD 的任何优势 - 理想情况下,您需要使用定点并使用可以避免的最窄数据类型。
-
@PaulR 似乎快了 4 倍
-
可能是因为原来的标量代码效率很低——你应该可以做得更好(假设你仍然想提高性能)。
标签: c++ directshow sse simd