Ambient Occlusion
Until now, when computing the ambient term we have assumed that every point receives the same ambient light intensity, and simply multiplied the diffuse albedo by that constant intensity.
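In code, this constant ambient term is just a componentwise product; a minimal sketch (plain C++, with hypothetical names):

```cpp
#include <cassert> // for the sanity checks

struct Color { float r, g, b; };

// Constant ambient term: the ambient light intensity modulated componentwise
// by the diffuse albedo -- the same value at every point of the scene.
Color Ambient(Color lightIntensity, Color diffuseAlbedo)
{
    return { lightIntensity.r * diffuseAlbedo.r,
             lightIntensity.g * diffuseAlbedo.g,
             lightIntensity.b * diffuseAlbedo.b };
}
```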
Take a skull model and render only the ambient term, with no other lighting; the result is shown below.
This is clearly wrong: places in the scene that are more heavily occluded are harder for bounced light to reach, while less occluded places receive it more easily, so the ambient intensity should differ from point to point.
So how do we measure how occluded a point in the scene is? Below we describe one offline algorithm and one real-time algorithm; the real-time one is the SSAO technique we often see in games.
Offline Ambient Occlusion Algorithm: Ray Casting
Ray casting shoots many rays out from the point whose ambient occlusion we want, and checks whether each ray hits another surface. If a ray hits something at a small distance t, that direction is considered occluded; if the hit is far away, or there is no hit at all, the direction is unoccluded. If h of the total N rays are occluded, the ambient occlusion is occlusion = h / N,
as illustrated in the figure below.
We call 1 − occlusion the accessibility of the point p.
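As a tiny numeric check of this definition (plain C++; the function names are ours):

```cpp
#include <cassert> // for the sanity checks

// Ray-cast ambient occlusion: h of the N sample rays were occluded.
float Occlusion(int h, int N)
{
    return (float)h / (float)N;
}

// Accessibility is the complement of occlusion.
float Accessibility(int h, int N)
{
    return 1.0f - Occlusion(h, N);
}
```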
To precompute the ambient occlusion of every point on a model offline, we take each triangle, shoot n random rays out from its centroid, and compute that triangle's occlusion; a vertex's occlusion is then the average over all triangles that share the vertex.
The code is as follows:
void AmbientOcclusionApp::BuildVertexAmbientOcclusion(
    std::vector<Vertex::AmbientOcclusion>& vertices,
    const std::vector<UINT>& indices)
{
    UINT vcount = (UINT)vertices.size();
    UINT tcount = (UINT)indices.size() / 3;

    std::vector<XMFLOAT3> positions(vcount);
    for(UINT i = 0; i < vcount; ++i)
        positions[i] = vertices[i].Pos;

    Octree octree;
    octree.Build(positions, indices);

    // For each vertex, count how many triangles contain the vertex.
    std::vector<int> vertexSharedCount(vcount);

    // Cast rays for each triangle, and average the triangle occlusion
    // with the vertices that share this triangle.
    for(UINT i = 0; i < tcount; ++i)
    {
        UINT i0 = indices[i*3+0];
        UINT i1 = indices[i*3+1];
        UINT i2 = indices[i*3+2];

        XMVECTOR v0 = XMLoadFloat3(&vertices[i0].Pos);
        XMVECTOR v1 = XMLoadFloat3(&vertices[i1].Pos);
        XMVECTOR v2 = XMLoadFloat3(&vertices[i2].Pos);

        XMVECTOR edge0 = v1 - v0;
        XMVECTOR edge1 = v2 - v0;

        XMVECTOR normal = XMVector3Normalize(XMVector3Cross(edge0, edge1));
        XMVECTOR centroid = (v0 + v1 + v2)/3.0f;

        // Offset to avoid self intersection.
        centroid += 0.001f*normal;

        const int NumSampleRays = 32;
        float numUnoccluded = 0;
        for(int j = 0; j < NumSampleRays; ++j)
        {
            XMVECTOR randomDir = MathHelper::RandHemisphereUnitVec3(normal);

            // Test if the random ray intersects the scene mesh.
            //
            // TODO: Technically we should not count intersections
            // that are far away as occluding the triangle, but
            // this is OK for demo.
            if( !octree.RayOctreeIntersect(centroid, randomDir) )
            {
                numUnoccluded++;
            }
        }

        float ambientAccess = numUnoccluded / NumSampleRays;

        // Average with vertices that share this face.
        vertices[i0].AmbientAccess += ambientAccess;
        vertices[i1].AmbientAccess += ambientAccess;
        vertices[i2].AmbientAccess += ambientAccess;

        vertexSharedCount[i0]++;
        vertexSharedCount[i1]++;
        vertexSharedCount[i2]++;
    }

    // Finish average by dividing by the number of samples we added,
    // and store in the vertex attribute.
    for(UINT i = 0; i < vcount; ++i)
    {
        vertices[i].AmbientAccess /= vertexSharedCount[i];
    }
}
Once the occlusion has been computed, we multiply the point's accessibility into the original ambient term; the rendered result is shown below.
This method is slow: even a model like this one can take several seconds, so it cannot be used in real time. It also only works for static geometry: if objects in the scene move, or the model itself is animated, each point's occlusion changes constantly and this offline approach no longer applies.
Real-Time Algorithm: SSAO
SSAO stands for Screen Space Ambient Occlusion: the occlusion is computed in screen space, so it can be thought of as a post-process.
We start with the full-screen depth buffer, which lets us reconstruct the layer of the scene nearest to the camera: we transform x and y into view space (or world space, though the extra transform is unnecessary) and transform the depth along with them, recovering a 3D image of the nearest visible surface. (Some shader books use the same trick to implement motion blur.)
The overall plan for SSAO: a first pass renders the screen-space normals and depth to a render target and a depth/stencil buffer; a second pass uses these full-screen normal and depth buffers to compute the occlusion, as sketched below.
Starting from the point p being shaded, we pick a set of random offset vectors inside the hemisphere around p; each ends at a sample point q. We project q's x and y into screen space, sample the depth buffer there, and transform that depth back into view space to obtain the nearest point r along the eye ray through q. This r is a potential occluder of p. We then test whether r can actually occlude p, using the distance between r and p and the dot product of p's normal with the direction from p to r (which tells us whether r is behind p); this test is the reason we render a normal pass first. If r occludes p, the amount of occlusion is computed from the distance. Averaging over all the rays gives the point's ambient occlusion. The details follow.
First we reconstruct the view-space position; the code is:
static const float2 gTexCoords[6] =
{
    float2(0.0f, 1.0f),
    float2(0.0f, 0.0f),
    float2(1.0f, 0.0f),
    float2(0.0f, 1.0f),
    float2(1.0f, 0.0f),
    float2(1.0f, 1.0f)
};

// Draw call with 6 vertices.
VertexOut VS(uint vid : SV_VertexID)
{
    VertexOut vout;

    vout.TexC = gTexCoords[vid];

    // Quad covering screen in NDC space.
    vout.PosH = float4(2.0f*vout.TexC.x - 1.0f, 1.0f - 2.0f*vout.TexC.y, 0.0f, 1.0f);

    // Transform quad corners to view space near plane.
    float4 ph = mul(vout.PosH, gInvProj);
    vout.PosV = ph.xyz / ph.w;

    return vout;
}
The input is six vertices, i.e. two triangles that form a quad covering the screen. In the vertex shader we only compute the screen-space positions of the quad's four corners; by the time we reach the pixel shader, the positions in between have been interpolated. Note that no vertex data is supplied: we bind null vertex and index buffers, and simply ask the draw call for 6 vertices by passing 6 as its first argument.
cmdList->DrawInstanced(6, 1, 0, 0);
At this point the interpolated value is only v on the near plane. The actual position satisfies p = t·v with t = p.z/v.z, so we first sample the NDC depth and transform it into view space to obtain p.z; since v (and hence v.z) is already known, we can reconstruct p = (p.z/v.z)·v as follows:
float NdcDepthToViewDepth(float z_ndc)
{
    // We can invert the calculation from NDC space to view space for the
    // z-coordinate. We have that
    //   z_ndc = A + B/viewZ, where gProj[2,2]=A and gProj[3,2]=B.
    // Therefore…
    float viewZ = gProj[3][2] / (z_ndc - gProj[2][2]);
    return viewZ;
}

float4 PS(VertexOut pin) : SV_Target
{
    // Get z-coord of this pixel in NDC space from depth map.
    float pz = gDepthMap.SampleLevel(gsamDepthMap, pin.TexC, 0.0f).r;

    // Transform depth to view space.
    pz = NdcDepthToViewDepth(pz);

    // Reconstruct the view space position of the point with depth pz.
    float3 p = (pz/pin.PosV.z)*pin.PosV;

    […]
}
Next we generate a set of random vectors used for the occlusion samples. Here we use 14 vectors uniformly distributed over the sphere. At run time we generate one random vector per pixel and reflect each of the 14 vectors about it; the reflected set is still evenly distributed. Any vector that does not fall inside the hemisphere around the normal is then flipped into it. This yields 14 random yet evenly distributed sample vectors.
void Ssao::BuildOffsetVectors()
{
    // Start with 14 uniformly distributed vectors. We choose the
    // 8 corners of the cube and the 6 center points along each
    // cube face. We always alternate the points on opposite sides
    // of the cubes. This way we still get the vectors spread out
    // even if we choose to use less than 14 samples.

    // 8 cube corners
    mOffsets[0] = XMFLOAT4(+1.0f, +1.0f, +1.0f, 0.0f);
    mOffsets[1] = XMFLOAT4(-1.0f, -1.0f, -1.0f, 0.0f);
    mOffsets[2] = XMFLOAT4(-1.0f, +1.0f, +1.0f, 0.0f);
    mOffsets[3] = XMFLOAT4(+1.0f, -1.0f, -1.0f, 0.0f);
    mOffsets[4] = XMFLOAT4(+1.0f, +1.0f, -1.0f, 0.0f);
    mOffsets[5] = XMFLOAT4(-1.0f, -1.0f, +1.0f, 0.0f);
    mOffsets[6] = XMFLOAT4(-1.0f, +1.0f, -1.0f, 0.0f);
    mOffsets[7] = XMFLOAT4(+1.0f, -1.0f, +1.0f, 0.0f);

    // 6 centers of cube faces
    mOffsets[8] = XMFLOAT4(-1.0f, 0.0f, 0.0f, 0.0f);
    mOffsets[9] = XMFLOAT4(+1.0f, 0.0f, 0.0f, 0.0f);
    mOffsets[10] = XMFLOAT4(0.0f, -1.0f, 0.0f, 0.0f);
    mOffsets[11] = XMFLOAT4(0.0f, +1.0f, 0.0f, 0.0f);
    mOffsets[12] = XMFLOAT4(0.0f, 0.0f, -1.0f, 0.0f);
    mOffsets[13] = XMFLOAT4(0.0f, 0.0f, +1.0f, 0.0f);

    for(int i = 0; i < 14; ++i)
    {
        // Create random lengths in [0.25, 1.0].
        float s = MathHelper::RandF(0.25f, 1.0f);
        XMVECTOR v = s * XMVector4Normalize(XMLoadFloat4(&mOffsets[i]));
        XMStoreFloat4(&mOffsets[i], v);
    }
}
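The per-pixel randomization mentioned above reflects each of these 14 offsets about a random vector sampled from a texture. Reflection preserves length, so the reflected set stays just as well spread out. A minimal sketch of the reflection itself (plain C++, mirroring HLSL's reflect(); the types are ours):

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

float Length(Vec3 v) { return std::sqrt(Dot(v, v)); }

// Reflect incident vector i about the plane with unit normal n,
// exactly as HLSL's reflect(): i - 2*dot(i, n)*n.
Vec3 Reflect(Vec3 i, Vec3 n)
{
    float d = 2.0f * Dot(i, n);
    return { i.x - d*n.x, i.y - d*n.y, i.z - d*n.z };
}
```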
Next we find the potential occluders. Given a random sample point q around p, we want the point r: we transform q by the projection and texture matrices into texture coordinates, sample the depth map at that (x, y), transform the sampled depth back into view space as rz, and recover r = (rz/qz)·q.
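The scaling step r = (rz/qz)·q can be sketched in isolation (plain C++; q is the view-space sample point and rz the view-space depth read back from the depth map):

```cpp
#include <cassert>

struct Float3 { float x, y, z; };

// Potential occluder: the point on the eye ray through q whose view-space
// depth equals the sampled depth rz, i.e. r = (rz / q.z) * q.
Float3 Occluder(Float3 q, float rz)
{
    float t = rz / q.z;
    return { t*q.x, t*q.y, t*q.z };
}
```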
Then comes the occlusion test. We compute the depth difference between p and r: if it is too small, the two points effectively lie on the same surface, and there is no occlusion. The occlusion falls off linearly with this distance and becomes zero beyond a certain range, so a distant r cannot occlude p at all. In addition, we scale the occlusion factor by the dot product of the normal n with (r − p): an occluder directly in front of the surface occludes strongly, one off to the side only mildly, and one behind the surface not at all.
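The depth test and linear falloff described above can be sketched as follows (plain C++; eps, fadeStart and fadeEnd correspond to the SurfaceEpsilon/OcclusionFadeStart/OcclusionFadeEnd constants set later in UpdateSsaoCB):

```cpp
#include <cassert>
#include <cmath>
#include <algorithm> // std::clamp (C++17)

// Occlusion contributed by an occluder whose view-space depth difference
// from the shaded point is distZ = p.z - r.z. Zero when the occluder is
// essentially coplanar with p (distZ <= eps); full strength up to fadeStart;
// linear falloff to zero at fadeEnd.
float OcclusionFunction(float distZ, float eps, float fadeStart, float fadeEnd)
{
    float occlusion = 0.0f;
    if(distZ > eps)
    {
        float fadeLength = fadeEnd - fadeStart;
        occlusion = std::clamp((fadeEnd - distZ) / fadeLength, 0.0f, 1.0f);
    }
    return occlusion;
}
```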
Finally we average the occlusion values; access is 1 − occlusion, and we can raise access to a power to increase the contrast.
occlusionSum /= gSampleCount;
float access = 1.0f - occlusionSum;
// Sharpen the contrast of the SSAO map to make the SSAO affect more dramatic.
return saturate(pow(access, 4.0f));
The occlusion computed this way is fairly noisy, so we apply a few blur passes. Rather than a Gaussian blur, we use an edge-preserving blur: edges are detected from the normals and depths, and a sample that lies across an edge is discarded and does not participate in the average.
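The edge test can be sketched as a per-sample predicate (plain C++; the 0.8 and 0.2 thresholds are illustrative assumptions, not values taken from this text):

```cpp
#include <cassert>
#include <cmath>

struct N3 { float x, y, z; };

float Dot3(N3 a, N3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// A neighbor contributes to the blur only if its normal and depth are close
// to the center pixel's; otherwise the kernel is straddling an edge and the
// sample is discarded (it does not take part in the average).
bool KeepSample(N3 centerNormal, float centerDepth,
                N3 neighborNormal, float neighborDepth)
{
    return Dot3(neighborNormal, centerNormal) >= 0.8f &&
           std::fabs(neighborDepth - centerDepth) <= 0.2f;
}
```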
SSAO Demo
Next we implement a complete SSAO demo and show the key parts of the code.
First we wrap an Ssao class that owns the DSVs, SRVs, resources, and so on that it uses.
Some of its key methods are:
void Ssao::BuildDescriptors(
    ID3D12Resource* depthStencilBuffer,
    CD3DX12_CPU_DESCRIPTOR_HANDLE hCpuSrv,
    CD3DX12_GPU_DESCRIPTOR_HANDLE hGpuSrv,
    CD3DX12_CPU_DESCRIPTOR_HANDLE hCpuRtv,
    UINT cbvSrvUavDescriptorSize,
    UINT rtvDescriptorSize)
{
    // Save references to the descriptors. The Ssao reserves heap space
    // for 5 contiguous Srvs.
    mhAmbientMap0CpuSrv = hCpuSrv;
    mhAmbientMap1CpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);
    mhNormalMapCpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);
    mhDepthMapCpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);
    mhRandomVectorMapCpuSrv = hCpuSrv.Offset(1, cbvSrvUavDescriptorSize);

    mhAmbientMap0GpuSrv = hGpuSrv;
    mhAmbientMap1GpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);
    mhNormalMapGpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);
    mhDepthMapGpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);
    mhRandomVectorMapGpuSrv = hGpuSrv.Offset(1, cbvSrvUavDescriptorSize);

    mhNormalMapCpuRtv = hCpuRtv;
    mhAmbientMap0CpuRtv = hCpuRtv.Offset(1, rtvDescriptorSize);
    mhAmbientMap1CpuRtv = hCpuRtv.Offset(1, rtvDescriptorSize);

    // Create the descriptors.
    RebuildDescriptors(depthStencilBuffer);
}
void Ssao::BuildResources()
{
// Free the old resources if they exist.
mNormalMap = nullptr;
mAmbientMap0 = nullptr;
mAmbientMap1 = nullptr;
D3D12_RESOURCE_DESC texDesc;
ZeroMemory(&texDesc, sizeof(D3D12_RESOURCE_DESC));
texDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
texDesc.Alignment = 0;
texDesc.Width = mRenderTargetWidth;
texDesc.Height = mRenderTargetHeight;
texDesc.DepthOrArraySize = 1;
texDesc.MipLevels = 1;
texDesc.Format = Ssao::NormalMapFormat;
texDesc.SampleDesc.Count = 1;
texDesc.SampleDesc.Quality = 0;
texDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN;
texDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET;
float normalClearColor[] = { 0.0f, 0.0f, 1.0f, 0.0f };
CD3DX12_CLEAR_VALUE optClear(NormalMapFormat, normalClearColor);
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_COMMON,
&optClear,
IID_PPV_ARGS(&mNormalMap)));
// Ambient occlusion maps are at half resolution.
texDesc.Width = mRenderTargetWidth / 2;
texDesc.Height = mRenderTargetHeight / 2;
texDesc.Format = Ssao::AmbientMapFormat;
float ambientClearColor[] = { 1.0f, 1.0f, 1.0f, 1.0f };
optClear = CD3DX12_CLEAR_VALUE(AmbientMapFormat, ambientClearColor);
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
&optClear,
IID_PPV_ARGS(&mAmbientMap0)));
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
&optClear,
IID_PPV_ARGS(&mAmbientMap1)));
}
void Ssao::BuildRandomVectorTexture(ID3D12GraphicsCommandList* cmdList)
{
D3D12_RESOURCE_DESC texDesc;
ZeroMemory(&texDesc, sizeof(D3D12_RESOURCE_DESC));
texDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
texDesc.Alignment = 0;
texDesc.Width = 256;
texDesc.Height = 256;
texDesc.DepthOrArraySize = 1;
texDesc.MipLevels = 1;
texDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
texDesc.SampleDesc.Count = 1;
texDesc.SampleDesc.Quality = 0;
texDesc.Layout = D3D12_TEXTURE_LAYOUT_UNKNOWN;
texDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
D3D12_HEAP_FLAG_NONE,
&texDesc,
D3D12_RESOURCE_STATE_GENERIC_READ,
nullptr,
IID_PPV_ARGS(&mRandomVectorMap)));
//
// In order to copy CPU memory data into our default buffer, we need to create
// an intermediate upload heap.
//
const UINT num2DSubresources = texDesc.DepthOrArraySize * texDesc.MipLevels;
const UINT64 uploadBufferSize = GetRequiredIntermediateSize(mRandomVectorMap.Get(), 0, num2DSubresources);
ThrowIfFailed(md3dDevice->CreateCommittedResource(
&CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD),
D3D12_HEAP_FLAG_NONE,
&CD3DX12_RESOURCE_DESC::Buffer(uploadBufferSize),
D3D12_RESOURCE_STATE_GENERIC_READ,
nullptr,
IID_PPV_ARGS(mRandomVectorMapUploadBuffer.GetAddressOf())));
XMCOLOR initData[256 * 256];
for(int i = 0; i < 256; ++i)
{
for(int j = 0; j < 256; ++j)
{
// Random vector in [0,1]. We will decompress in shader to [-1,1].
XMFLOAT3 v(MathHelper::RandF(), MathHelper::RandF(), MathHelper::RandF());
initData[i * 256 + j] = XMCOLOR(v.x, v.y, v.z, 0.0f);
}
}
D3D12_SUBRESOURCE_DATA subResourceData = {};
subResourceData.pData = initData;
subResourceData.RowPitch = 256 * sizeof(XMCOLOR);
subResourceData.SlicePitch = subResourceData.RowPitch * 256;
//
// Schedule to copy the data to the default resource, and change states.
// Note that mCurrSol is put in the GENERIC_READ state so it can be
// read by a shader.
//
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(mRandomVectorMap.Get(),
D3D12_RESOURCE_STATE_GENERIC_READ, D3D12_RESOURCE_STATE_COPY_DEST));
UpdateSubresources(cmdList, mRandomVectorMap.Get(), mRandomVectorMapUploadBuffer.Get(),
0, 0, num2DSubresources, &subResourceData);
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(mRandomVectorMap.Get(),
D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_GENERIC_READ));
}
The above includes the code for creating the random-vector texture and the SSAO resources (BuildOffsetVectors, which builds the 14 uniformly distributed vectors, appeared earlier).
Now for the main application. Because the SSAO pass takes different inputs from ordinary rendering, this time we need two root signatures.
void SsaoApp::BuildRootSignature()
{
CD3DX12_DESCRIPTOR_RANGE texTable0;
texTable0.Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 3, 0, 0);
CD3DX12_DESCRIPTOR_RANGE texTable1;
texTable1.Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 10, 3, 0);
// Root parameter can be a table, root descriptor or root constants.
CD3DX12_ROOT_PARAMETER slotRootParameter[5];
// Performance TIP: Order from most frequent to least frequent.
slotRootParameter[0].InitAsConstantBufferView(0);
slotRootParameter[1].InitAsConstantBufferView(1);
slotRootParameter[2].InitAsShaderResourceView(0, 1);
slotRootParameter[3].InitAsDescriptorTable(1, &texTable0, D3D12_SHADER_VISIBILITY_PIXEL);
slotRootParameter[4].InitAsDescriptorTable(1, &texTable1, D3D12_SHADER_VISIBILITY_PIXEL);
auto staticSamplers = GetStaticSamplers();
// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(5, slotRootParameter,
(UINT)staticSamplers.size(), staticSamplers.data(),
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);
// create a root signature with a single slot which points to a descriptor range consisting of a single constant buffer
ComPtr<ID3DBlob> serializedRootSig = nullptr;
ComPtr<ID3DBlob> errorBlob = nullptr;
HRESULT hr = D3D12SerializeRootSignature(&rootSigDesc, D3D_ROOT_SIGNATURE_VERSION_1,
serializedRootSig.GetAddressOf(), errorBlob.GetAddressOf());
if(errorBlob != nullptr)
{
::OutputDebugStringA((char*)errorBlob->GetBufferPointer());
}
ThrowIfFailed(hr);
ThrowIfFailed(md3dDevice->CreateRootSignature(
0,
serializedRootSig->GetBufferPointer(),
serializedRootSig->GetBufferSize(),
IID_PPV_ARGS(mRootSignature.GetAddressOf())));
}
void SsaoApp::BuildSsaoRootSignature()
{
CD3DX12_DESCRIPTOR_RANGE texTable0;
texTable0.Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 2, 0, 0);
CD3DX12_DESCRIPTOR_RANGE texTable1;
texTable1.Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 1, 2, 0);
// Root parameter can be a table, root descriptor or root constants.
CD3DX12_ROOT_PARAMETER slotRootParameter[4];
// Performance TIP: Order from most frequent to least frequent.
slotRootParameter[0].InitAsConstantBufferView(0);
slotRootParameter[1].InitAsConstants(1, 1);
slotRootParameter[2].InitAsDescriptorTable(1, &texTable0, D3D12_SHADER_VISIBILITY_PIXEL);
slotRootParameter[3].InitAsDescriptorTable(1, &texTable1, D3D12_SHADER_VISIBILITY_PIXEL);
const CD3DX12_STATIC_SAMPLER_DESC pointClamp(
0, // shaderRegister
D3D12_FILTER_MIN_MAG_MIP_POINT, // filter
D3D12_TEXTURE_ADDRESS_MODE_CLAMP, // addressU
D3D12_TEXTURE_ADDRESS_MODE_CLAMP, // addressV
D3D12_TEXTURE_ADDRESS_MODE_CLAMP); // addressW
const CD3DX12_STATIC_SAMPLER_DESC linearClamp(
1, // shaderRegister
D3D12_FILTER_MIN_MAG_MIP_LINEAR, // filter
D3D12_TEXTURE_ADDRESS_MODE_CLAMP, // addressU
D3D12_TEXTURE_ADDRESS_MODE_CLAMP, // addressV
D3D12_TEXTURE_ADDRESS_MODE_CLAMP); // addressW
const CD3DX12_STATIC_SAMPLER_DESC depthMapSam(
2, // shaderRegister
D3D12_FILTER_MIN_MAG_MIP_LINEAR, // filter
D3D12_TEXTURE_ADDRESS_MODE_BORDER, // addressU
D3D12_TEXTURE_ADDRESS_MODE_BORDER, // addressV
D3D12_TEXTURE_ADDRESS_MODE_BORDER, // addressW
0.0f,
0,
D3D12_COMPARISON_FUNC_LESS_EQUAL,
D3D12_STATIC_BORDER_COLOR_OPAQUE_WHITE);
const CD3DX12_STATIC_SAMPLER_DESC linearWrap(
3, // shaderRegister
D3D12_FILTER_MIN_MAG_MIP_LINEAR, // filter
D3D12_TEXTURE_ADDRESS_MODE_WRAP, // addressU
D3D12_TEXTURE_ADDRESS_MODE_WRAP, // addressV
D3D12_TEXTURE_ADDRESS_MODE_WRAP); // addressW
std::array<CD3DX12_STATIC_SAMPLER_DESC, 4> staticSamplers =
{
pointClamp, linearClamp, depthMapSam, linearWrap
};
// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(4, slotRootParameter,
(UINT)staticSamplers.size(), staticSamplers.data(),
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);
// create a root signature with a single slot which points to a descriptor range consisting of a single constant buffer
ComPtr<ID3DBlob> serializedRootSig = nullptr;
ComPtr<ID3DBlob> errorBlob = nullptr;
HRESULT hr = D3D12SerializeRootSignature(&rootSigDesc, D3D_ROOT_SIGNATURE_VERSION_1,
serializedRootSig.GetAddressOf(), errorBlob.GetAddressOf());
if(errorBlob != nullptr)
{
::OutputDebugStringA((char*)errorBlob->GetBufferPointer());
}
ThrowIfFailed(hr);
ThrowIfFailed(md3dDevice->CreateRootSignature(
0,
serializedRootSig->GetBufferPointer(),
serializedRootSig->GetBufferSize(),
IID_PPV_ARGS(mSsaoRootSignature.GetAddressOf())));
}
When creating the descriptor heaps we also need to create the SSAO-related RTVs, DSVs, and SRVs:
srvDesc.ViewDimension = D3D12_SRV_DIMENSION_TEXTURECUBE;
srvDesc.TextureCube.MostDetailedMip = 0;
srvDesc.TextureCube.MipLevels = skyCubeMap->GetDesc().MipLevels;
srvDesc.TextureCube.ResourceMinLODClamp = 0.0f;
srvDesc.Format = skyCubeMap->GetDesc().Format;
md3dDevice->CreateShaderResourceView(skyCubeMap.Get(), &srvDesc, hDescriptor);
mSkyTexHeapIndex = (UINT)tex2DList.size();
mShadowMapHeapIndex = mSkyTexHeapIndex + 1;
mSsaoHeapIndexStart = mShadowMapHeapIndex + 1;
mSsaoAmbientMapIndex = mSsaoHeapIndexStart + 3;
mNullCubeSrvIndex = mSsaoHeapIndexStart + 5;
mNullTexSrvIndex1 = mNullCubeSrvIndex + 1;
mNullTexSrvIndex2 = mNullTexSrvIndex1 + 1;
auto nullSrv = GetCpuSrv(mNullCubeSrvIndex);
mNullSrv = GetGpuSrv(mNullCubeSrvIndex);
md3dDevice->CreateShaderResourceView(nullptr, &srvDesc, nullSrv);
nullSrv.Offset(1, mCbvSrvUavDescriptorSize);
srvDesc.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2D;
srvDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
srvDesc.Texture2D.MostDetailedMip = 0;
srvDesc.Texture2D.MipLevels = 1;
srvDesc.Texture2D.ResourceMinLODClamp = 0.0f;
md3dDevice->CreateShaderResourceView(nullptr, &srvDesc, nullSrv);
nullSrv.Offset(1, mCbvSrvUavDescriptorSize);
md3dDevice->CreateShaderResourceView(nullptr, &srvDesc, nullSrv);
mShadowMap->BuildDescriptors(
GetCpuSrv(mShadowMapHeapIndex),
GetGpuSrv(mShadowMapHeapIndex),
GetDsv(1));
mSsao->BuildDescriptors(
mDepthStencilBuffer.Get(),
GetCpuSrv(mSsaoHeapIndexStart),
GetGpuSrv(mSsaoHeapIndexStart),
GetRtv(SwapChainBufferCount),
mCbvSrvUavDescriptorSize,
mRtvDescriptorSize);
Compile the shaders used this time:
void SsaoApp::BuildShadersAndInputLayout()
{
const D3D_SHADER_MACRO alphaTestDefines[] =
{
"ALPHA_TEST", "1",
NULL, NULL
};
mShaders["standardVS"] = d3dUtil::CompileShader(L"Shaders\\Default.hlsl", nullptr, "VS", "vs_5_1");
mShaders["opaquePS"] = d3dUtil::CompileShader(L"Shaders\\Default.hlsl", nullptr, "PS", "ps_5_1");
mShaders["shadowVS"] = d3dUtil::CompileShader(L"Shaders\\Shadows.hlsl", nullptr, "VS", "vs_5_1");
mShaders["shadowOpaquePS"] = d3dUtil::CompileShader(L"Shaders\\Shadows.hlsl", nullptr, "PS", "ps_5_1");
mShaders["shadowAlphaTestedPS"] = d3dUtil::CompileShader(L"Shaders\\Shadows.hlsl", alphaTestDefines, "PS", "ps_5_1");
mShaders["debugVS"] = d3dUtil::CompileShader(L"Shaders\\ShadowDebug.hlsl", nullptr, "VS", "vs_5_1");
mShaders["debugPS"] = d3dUtil::CompileShader(L"Shaders\\ShadowDebug.hlsl", nullptr, "PS", "ps_5_1");
mShaders["drawNormalsVS"] = d3dUtil::CompileShader(L"Shaders\\DrawNormals.hlsl", nullptr, "VS", "vs_5_1");
mShaders["drawNormalsPS"] = d3dUtil::CompileShader(L"Shaders\\DrawNormals.hlsl", nullptr, "PS", "ps_5_1");
mShaders["ssaoVS"] = d3dUtil::CompileShader(L"Shaders\\Ssao.hlsl", nullptr, "VS", "vs_5_1");
mShaders["ssaoPS"] = d3dUtil::CompileShader(L"Shaders\\Ssao.hlsl", nullptr, "PS", "ps_5_1");
mShaders["ssaoBlurVS"] = d3dUtil::CompileShader(L"Shaders\\SsaoBlur.hlsl", nullptr, "VS", "vs_5_1");
mShaders["ssaoBlurPS"] = d3dUtil::CompileShader(L"Shaders\\SsaoBlur.hlsl", nullptr, "PS", "ps_5_1");
mShaders["skyVS"] = d3dUtil::CompileShader(L"Shaders\\Sky.hlsl", nullptr, "VS", "vs_5_1");
mShaders["skyPS"] = d3dUtil::CompileShader(L"Shaders\\Sky.hlsl", nullptr, "PS", "ps_5_1");
mInputLayout =
{
{ "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
{ "NORMAL", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
{ "TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT, 0, 24, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
{ "TANGENT", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 32, D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0 },
};
}
Then create the PSOs. Note that because a separate pass has already rendered the depth, the subsequent opaque pass sets its depth comparison to EQUAL and turns off depth writes.
void SsaoApp::BuildPSOs()
{
D3D12_GRAPHICS_PIPELINE_STATE_DESC basePsoDesc;
ZeroMemory(&basePsoDesc, sizeof(D3D12_GRAPHICS_PIPELINE_STATE_DESC));
basePsoDesc.InputLayout = { mInputLayout.data(), (UINT)mInputLayout.size() };
basePsoDesc.pRootSignature = mRootSignature.Get();
basePsoDesc.VS =
{
reinterpret_cast<BYTE*>(mShaders["standardVS"]->GetBufferPointer()),
mShaders["standardVS"]->GetBufferSize()
};
basePsoDesc.PS =
{
reinterpret_cast<BYTE*>(mShaders["opaquePS"]->GetBufferPointer()),
mShaders["opaquePS"]->GetBufferSize()
};
basePsoDesc.RasterizerState = CD3DX12_RASTERIZER_DESC(D3D12_DEFAULT);
basePsoDesc.BlendState = CD3DX12_BLEND_DESC(D3D12_DEFAULT);
basePsoDesc.DepthStencilState = CD3DX12_DEPTH_STENCIL_DESC(D3D12_DEFAULT);
basePsoDesc.SampleMask = UINT_MAX;
basePsoDesc.PrimitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE;
basePsoDesc.NumRenderTargets = 1;
basePsoDesc.RTVFormats[0] = mBackBufferFormat;
basePsoDesc.SampleDesc.Count = m4xMsaaState ? 4 : 1;
basePsoDesc.SampleDesc.Quality = m4xMsaaState ? (m4xMsaaQuality - 1) : 0;
basePsoDesc.DSVFormat = mDepthStencilFormat;
//
// PSO for opaque objects.
//
D3D12_GRAPHICS_PIPELINE_STATE_DESC opaquePsoDesc = basePsoDesc;
opaquePsoDesc.DepthStencilState.DepthFunc = D3D12_COMPARISON_FUNC_EQUAL;
opaquePsoDesc.DepthStencilState.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ZERO;
ThrowIfFailed(md3dDevice->CreateGraphicsPipelineState(&opaquePsoDesc, IID_PPV_ARGS(&mPSOs["opaque"])));
//
// PSO for shadow map pass.
//
D3D12_GRAPHICS_PIPELINE_STATE_DESC smapPsoDesc = basePsoDesc;
smapPsoDesc.RasterizerState.DepthBias = 100000;
smapPsoDesc.RasterizerState.DepthBiasClamp = 0.0f;
smapPsoDesc.RasterizerState.SlopeScaledDepthBias = 1.0f;
smapPsoDesc.pRootSignature = mRootSignature.Get();
smapPsoDesc.VS =
{
reinterpret_cast<BYTE*>(mShaders["shadowVS"]->GetBufferPointer()),
mShaders["shadowVS"]->GetBufferSize()
};
smapPsoDesc.PS =
{
reinterpret_cast<BYTE*>(mShaders["shadowOpaquePS"]->GetBufferPointer()),
mShaders["shadowOpaquePS"]->GetBufferSize()
};
// Shadow map pass does not have a render target.
smapPsoDesc.RTVFormats[0] = DXGI_FORMAT_UNKNOWN;
smapPsoDesc.NumRenderTargets = 0;
ThrowIfFailed(md3dDevice->CreateGraphicsPipelineState(&smapPsoDesc, IID_PPV_ARGS(&mPSOs["shadow_opaque"])));
//
// PSO for debug layer.
//
D3D12_GRAPHICS_PIPELINE_STATE_DESC debugPsoDesc = basePsoDesc;
debugPsoDesc.pRootSignature = mRootSignature.Get();
debugPsoDesc.VS =
{
reinterpret_cast<BYTE*>(mShaders["debugVS"]->GetBufferPointer()),
mShaders["debugVS"]->GetBufferSize()
};
debugPsoDesc.PS =
{
reinterpret_cast<BYTE*>(mShaders["debugPS"]->GetBufferPointer()),
mShaders["debugPS"]->GetBufferSize()
};
ThrowIfFailed(md3dDevice->CreateGraphicsPipelineState(&debugPsoDesc, IID_PPV_ARGS(&mPSOs["debug"])));
//
// PSO for drawing normals.
//
D3D12_GRAPHICS_PIPELINE_STATE_DESC drawNormalsPsoDesc = basePsoDesc;
drawNormalsPsoDesc.VS =
{
reinterpret_cast<BYTE*>(mShaders["drawNormalsVS"]->GetBufferPointer()),
mShaders["drawNormalsVS"]->GetBufferSize()
};
drawNormalsPsoDesc.PS =
{
reinterpret_cast<BYTE*>(mShaders["drawNormalsPS"]->GetBufferPointer()),
mShaders["drawNormalsPS"]->GetBufferSize()
};
drawNormalsPsoDesc.RTVFormats[0] = Ssao::NormalMapFormat;
drawNormalsPsoDesc.SampleDesc.Count = 1;
drawNormalsPsoDesc.SampleDesc.Quality = 0;
drawNormalsPsoDesc.DSVFormat = mDepthStencilFormat;
ThrowIfFailed(md3dDevice->CreateGraphicsPipelineState(&drawNormalsPsoDesc, IID_PPV_ARGS(&mPSOs["drawNormals"])));
//
// PSO for SSAO.
//
D3D12_GRAPHICS_PIPELINE_STATE_DESC ssaoPsoDesc = basePsoDesc;
ssaoPsoDesc.InputLayout = { nullptr, 0 };
ssaoPsoDesc.pRootSignature = mSsaoRootSignature.Get();
ssaoPsoDesc.VS =
{
reinterpret_cast<BYTE*>(mShaders["ssaoVS"]->GetBufferPointer()),
mShaders["ssaoVS"]->GetBufferSize()
};
ssaoPsoDesc.PS =
{
reinterpret_cast<BYTE*>(mShaders["ssaoPS"]->GetBufferPointer()),
mShaders["ssaoPS"]->GetBufferSize()
};
// SSAO effect does not need the depth buffer.
ssaoPsoDesc.DepthStencilState.DepthEnable = false;
ssaoPsoDesc.DepthStencilState.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ZERO;
ssaoPsoDesc.RTVFormats[0] = Ssao::AmbientMapFormat;
ssaoPsoDesc.SampleDesc.Count = 1;
ssaoPsoDesc.SampleDesc.Quality = 0;
ssaoPsoDesc.DSVFormat = DXGI_FORMAT_UNKNOWN;
ThrowIfFailed(md3dDevice->CreateGraphicsPipelineState(&ssaoPsoDesc, IID_PPV_ARGS(&mPSOs["ssao"])));
//
// PSO for SSAO blur.
//
D3D12_GRAPHICS_PIPELINE_STATE_DESC ssaoBlurPsoDesc = ssaoPsoDesc;
ssaoBlurPsoDesc.VS =
{
reinterpret_cast<BYTE*>(mShaders["ssaoBlurVS"]->GetBufferPointer()),
mShaders["ssaoBlurVS"]->GetBufferSize()
};
ssaoBlurPsoDesc.PS =
{
reinterpret_cast<BYTE*>(mShaders["ssaoBlurPS"]->GetBufferPointer()),
mShaders["ssaoBlurPS"]->GetBufferSize()
};
ThrowIfFailed(md3dDevice->CreateGraphicsPipelineState(&ssaoBlurPsoDesc, IID_PPV_ARGS(&mPSOs["ssaoBlur"])));
//
// PSO for sky.
//
D3D12_GRAPHICS_PIPELINE_STATE_DESC skyPsoDesc = basePsoDesc;
// The camera is inside the sky sphere, so just turn off culling.
skyPsoDesc.RasterizerState.CullMode = D3D12_CULL_MODE_NONE;
// Make sure the depth function is LESS_EQUAL and not just LESS.
// Otherwise, the normalized depth values at z = 1 (NDC) will
// fail the depth test if the depth buffer was cleared to 1.
skyPsoDesc.DepthStencilState.DepthFunc = D3D12_COMPARISON_FUNC_LESS_EQUAL;
skyPsoDesc.pRootSignature = mRootSignature.Get();
skyPsoDesc.VS =
{
reinterpret_cast<BYTE*>(mShaders["skyVS"]->GetBufferPointer()),
mShaders["skyVS"]->GetBufferSize()
};
skyPsoDesc.PS =
{
reinterpret_cast<BYTE*>(mShaders["skyPS"]->GetBufferPointer()),
mShaders["skyPS"]->GetBufferSize()
};
ThrowIfFailed(md3dDevice->CreateGraphicsPipelineState(&skyPsoDesc, IID_PPV_ARGS(&mPSOs["sky"])));
}
Part of the initialization:
mCamera.SetPosition(0.0f, 2.0f, -15.0f);
mShadowMap = std::make_unique<ShadowMap>(md3dDevice.Get(),
2048, 2048);
mSsao = std::make_unique<Ssao>(
md3dDevice.Get(),
mCommandList.Get(),
mClientWidth, mClientHeight);
LoadTextures();
BuildRootSignature();
BuildSsaoRootSignature();
BuildDescriptorHeaps();
BuildShadersAndInputLayout();
BuildShapeGeometry();
BuildSkullGeometry();
BuildMaterials();
BuildRenderItems();
BuildFrameResources();
BuildPSOs();
mSsao->SetPSOs(mPSOs["ssao"].Get(), mPSOs["ssaoBlur"].Get());
Update also has to update the SSAO constant buffer:
void SsaoApp::UpdateSsaoCB(const GameTimer& gt)
{
SsaoConstants ssaoCB;
XMMATRIX P = mCamera.GetProj();
// Transform NDC space [-1,+1]^2 to texture space [0,1]^2
XMMATRIX T(
0.5f, 0.0f, 0.0f, 0.0f,
0.0f, -0.5f, 0.0f, 0.0f,
0.0f, 0.0f, 1.0f, 0.0f,
0.5f, 0.5f, 0.0f, 1.0f);
ssaoCB.Proj = mMainPassCB.Proj;
ssaoCB.InvProj = mMainPassCB.InvProj;
XMStoreFloat4x4(&ssaoCB.ProjTex, XMMatrixTranspose(P*T));
mSsao->GetOffsetVectors(ssaoCB.OffsetVectors);
auto blurWeights = mSsao->CalcGaussWeights(2.5f);
ssaoCB.BlurWeights[0] = XMFLOAT4(&blurWeights[0]);
ssaoCB.BlurWeights[1] = XMFLOAT4(&blurWeights[4]);
ssaoCB.BlurWeights[2] = XMFLOAT4(&blurWeights[8]);
ssaoCB.InvRenderTargetSize = XMFLOAT2(1.0f / mSsao->SsaoMapWidth(), 1.0f / mSsao->SsaoMapHeight());
// Coordinates given in view space.
ssaoCB.OcclusionRadius = 0.5f;
ssaoCB.OcclusionFadeStart = 0.2f;
ssaoCB.OcclusionFadeEnd = 1.0f;
ssaoCB.SurfaceEpsilon = 0.05f;
auto currSsaoCB = mCurrFrameResource->SsaoCB.get();
currSsaoCB->CopyData(0, ssaoCB);
}
Then the draw code:
ID3D12DescriptorHeap* descriptorHeaps[] = { mSrvDescriptorHeap.Get() };
mCommandList->SetDescriptorHeaps(_countof(descriptorHeaps), descriptorHeaps);
mCommandList->SetGraphicsRootSignature(mRootSignature.Get());
//
// Shadow map pass.
//
// Bind all the materials used in this scene. For structured buffers, we can bypass the heap and
// set as a root descriptor.
auto matBuffer = mCurrFrameResource->MaterialBuffer->Resource();
mCommandList->SetGraphicsRootShaderResourceView(2, matBuffer->GetGPUVirtualAddress());
// Bind null SRV for shadow map pass.
mCommandList->SetGraphicsRootDescriptorTable(3, mNullSrv);
// Bind all the textures used in this scene. Observe
// that we only have to specify the first descriptor in the table.
// The root signature knows how many descriptors are expected in the table.
mCommandList->SetGraphicsRootDescriptorTable(4, mSrvDescriptorHeap->GetGPUDescriptorHandleForHeapStart());
DrawSceneToShadowMap();
//
// Normal/depth pass.
//
DrawNormalsAndDepth();
//
// Compute SSAO.
//
mCommandList->SetGraphicsRootSignature(mSsaoRootSignature.Get());
mSsao->ComputeSsao(mCommandList.Get(), mCurrFrameResource, 3);
//
// Main rendering pass.
//
mCommandList->SetGraphicsRootSignature(mRootSignature.Get());
// Rebind state whenever graphics root signature changes.
// Bind all the materials used in this scene. For structured buffers, we can bypass the heap and
// set as a root descriptor.
matBuffer = mCurrFrameResource->MaterialBuffer->Resource();
mCommandList->SetGraphicsRootShaderResourceView(2, matBuffer->GetGPUVirtualAddress());
mCommandList->RSSetViewports(1, &mScreenViewport);
mCommandList->RSSetScissorRects(1, &mScissorRect);
// Indicate a state transition on the resource usage.
mCommandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(CurrentBackBuffer(),
D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RENDER_TARGET));
// Clear the back buffer.
mCommandList->ClearRenderTargetView(CurrentBackBufferView(), Colors::LightSteelBlue, 0, nullptr);
// We already wrote the depth info to the depth buffer in DrawNormalsAndDepth,
// so do not clear depth.
// Specify the buffers we are going to render to.
mCommandList->OMSetRenderTargets(1, &CurrentBackBufferView(), true, &DepthStencilView());
// Bind all the textures used in this scene. Observe
// that we only have to specify the first descriptor in the table.
// The root signature knows how many descriptors are expected in the table.
mCommandList->SetGraphicsRootDescriptorTable(4, mSrvDescriptorHeap->GetGPUDescriptorHandleForHeapStart());
auto passCB = mCurrFrameResource->PassCB->Resource();
mCommandList->SetGraphicsRootConstantBufferView(1, passCB->GetGPUVirtualAddress());
// Bind the sky cube map. For our demos, we just use one "world" cube map representing the environment
// from far away, so all objects will use the same cube map and we only need to set it once per-frame.
// If we wanted to use "local" cube maps, we would have to change them per-object, or dynamically
// index into an array of cube maps.
CD3DX12_GPU_DESCRIPTOR_HANDLE skyTexDescriptor(mSrvDescriptorHeap->GetGPUDescriptorHandleForHeapStart());
skyTexDescriptor.Offset(mSkyTexHeapIndex, mCbvSrvUavDescriptorSize);
mCommandList->SetGraphicsRootDescriptorTable(3, skyTexDescriptor);
mCommandList->SetPipelineState(mPSOs["opaque"].Get());
DrawRenderItems(mCommandList.Get(), mRitemLayer[(int)RenderLayer::Opaque]);
mCommandList->SetPipelineState(mPSOs["debug"].Get());
DrawRenderItems(mCommandList.Get(), mRitemLayer[(int)RenderLayer::Debug]);
mCommandList->SetPipelineState(mPSOs["sky"].Get());
DrawRenderItems(mCommandList.Get(), mRitemLayer[(int)RenderLayer::Sky]);
First we render the normals and depth:
void SsaoApp::DrawNormalsAndDepth()
{
mCommandList->RSSetViewports(1, &mScreenViewport);
mCommandList->RSSetScissorRects(1, &mScissorRect);
auto normalMap = mSsao->NormalMap();
auto normalMapRtv = mSsao->NormalMapRtv();
// Change to RENDER_TARGET.
mCommandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(normalMap,
D3D12_RESOURCE_STATE_GENERIC_READ, D3D12_RESOURCE_STATE_RENDER_TARGET));
// Clear the screen normal map and depth buffer.
float clearValue[] = {0.0f, 0.0f, 1.0f, 0.0f};
mCommandList->ClearRenderTargetView(normalMapRtv, clearValue, 0, nullptr);
mCommandList->ClearDepthStencilView(DepthStencilView(), D3D12_CLEAR_FLAG_DEPTH | D3D12_CLEAR_FLAG_STENCIL, 1.0f, 0, 0, nullptr);
// Specify the buffers we are going to render to.
mCommandList->OMSetRenderTargets(1, &normalMapRtv, true, &DepthStencilView());
// Bind the constant buffer for this pass.
auto passCB = mCurrFrameResource->PassCB->Resource();
mCommandList->SetGraphicsRootConstantBufferView(1, passCB->GetGPUVirtualAddress());
mCommandList->SetPipelineState(mPSOs["drawNormals"].Get());
DrawRenderItems(mCommandList.Get(), mRitemLayer[(int)RenderLayer::Opaque]);
// Change back to GENERIC_READ so we can read the texture in a shader.
mCommandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(normalMap,
D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_GENERIC_READ));
}
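DrawNormalsAndDepth renders into a dedicated floating-point render target. As a sketch of how such a resource might be created (member names like mNormalMap and mRenderTargetWidth/Height are assumptions following the demo's Ssao class; treat the whole fragment as illustrative), note that the optimized clear value matches the (0, 0, 1) "facing the eye" default cleared above:

```cpp
// Sketch: a 16-bit float render target for view-space normals.
D3D12_RESOURCE_DESC texDesc = {};
texDesc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
texDesc.Width = mRenderTargetWidth;
texDesc.Height = mRenderTargetHeight;
texDesc.DepthOrArraySize = 1;
texDesc.MipLevels = 1;
texDesc.Format = DXGI_FORMAT_R16G16B16A16_FLOAT; // enough precision for normals
texDesc.SampleDesc.Count = 1;
texDesc.Flags = D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET;

// Clear value matches the (0, 0, 1, 0) used in DrawNormalsAndDepth.
float normalClearColor[] = { 0.0f, 0.0f, 1.0f, 0.0f };
CD3DX12_CLEAR_VALUE optClear(texDesc.Format, normalClearColor);
ThrowIfFailed(md3dDevice->CreateCommittedResource(
    &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
    D3D12_HEAP_FLAG_NONE,
    &texDesc,
    D3D12_RESOURCE_STATE_GENERIC_READ,
    &optClear,
    IID_PPV_ARGS(&mNormalMap)));
```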
Then we compute the SSAO:
void Ssao::ComputeSsao(
ID3D12GraphicsCommandList* cmdList,
FrameResource* currFrame,
int blurCount)
{
cmdList->RSSetViewports(1, &mViewport);
cmdList->RSSetScissorRects(1, &mScissorRect);
// We compute the initial SSAO to AmbientMap0.
// Change to RENDER_TARGET.
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(mAmbientMap0.Get(),
D3D12_RESOURCE_STATE_GENERIC_READ, D3D12_RESOURCE_STATE_RENDER_TARGET));
float clearValue[] = {1.0f, 1.0f, 1.0f, 1.0f};
cmdList->ClearRenderTargetView(mhAmbientMap0CpuRtv, clearValue, 0, nullptr);
// Specify the buffers we are going to render to.
cmdList->OMSetRenderTargets(1, &mhAmbientMap0CpuRtv, true, nullptr);
// Bind the constant buffer for this pass.
auto ssaoCBAddress = currFrame->SsaoCB->Resource()->GetGPUVirtualAddress();
cmdList->SetGraphicsRootConstantBufferView(0, ssaoCBAddress);
cmdList->SetGraphicsRoot32BitConstant(1, 0, 0);
// Bind the normal and depth maps.
cmdList->SetGraphicsRootDescriptorTable(2, mhNormalMapGpuSrv);
// Bind the random vector map.
cmdList->SetGraphicsRootDescriptorTable(3, mhRandomVectorMapGpuSrv);
cmdList->SetPipelineState(mSsaoPso);
// Draw fullscreen quad.
cmdList->IASetVertexBuffers(0, 0, nullptr);
cmdList->IASetIndexBuffer(nullptr);
cmdList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
cmdList->DrawInstanced(6, 1, 0, 0);
// Change back to GENERIC_READ so we can read the texture in a shader.
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(mAmbientMap0.Get(),
D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_GENERIC_READ));
BlurAmbientMap(cmdList, currFrame, blurCount);
}
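The gOffsetVectors that ComputeSsao's constant buffer hands to the shader are not random: on the CPU side they are typically the 8 cube corners plus the 6 face centers, so the directions are guaranteed to be evenly spread, and only their lengths are randomized in [0.25, 1]. A hedged sketch of that idea in plain C++ (the Vec3 type and function shape here are illustrative, not the demo's actual DirectXMath code):

```cpp
#include <array>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };

// 8 cube corners + 6 face centers give 14 well-distributed directions;
// each is normalized and then given a random length in [0.25, 1] so the
// samples spread through the occlusion radius instead of clumping.
std::array<Vec3, 14> BuildOffsetVectors(unsigned seed = 0)
{
    std::array<Vec3, 14> v = {{
        // 8 cube corners
        {+1.0f,+1.0f,+1.0f}, {-1.0f,-1.0f,-1.0f}, {-1.0f,+1.0f,+1.0f}, {+1.0f,-1.0f,-1.0f},
        {+1.0f,+1.0f,-1.0f}, {-1.0f,-1.0f,+1.0f}, {-1.0f,+1.0f,-1.0f}, {+1.0f,-1.0f,+1.0f},
        // 6 face centers
        {-1.0f,0.0f,0.0f}, {+1.0f,0.0f,0.0f}, {0.0f,-1.0f,0.0f},
        {0.0f,+1.0f,0.0f}, {0.0f,0.0f,-1.0f}, {0.0f,0.0f,+1.0f},
    }};

    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> len(0.25f, 1.0f);
    for (auto& o : v)
    {
        // Random length, fixed direction.
        float s = len(gen) / std::sqrt(o.x*o.x + o.y*o.y + o.z*o.z);
        o.x *= s; o.y *= s; o.z *= s;
    }
    return v;
}
```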
void Ssao::BlurAmbientMap(ID3D12GraphicsCommandList* cmdList, FrameResource* currFrame, int blurCount)
{
cmdList->SetPipelineState(mBlurPso);
auto ssaoCBAddress = currFrame->SsaoCB->Resource()->GetGPUVirtualAddress();
cmdList->SetGraphicsRootConstantBufferView(0, ssaoCBAddress);
for(int i = 0; i < blurCount; ++i)
{
BlurAmbientMap(cmdList, true);
BlurAmbientMap(cmdList, false);
}
}
void Ssao::BlurAmbientMap(ID3D12GraphicsCommandList* cmdList, bool horzBlur)
{
ID3D12Resource* output = nullptr;
CD3DX12_GPU_DESCRIPTOR_HANDLE inputSrv;
CD3DX12_CPU_DESCRIPTOR_HANDLE outputRtv;
// Ping-pong the two ambient map textures as we apply
// horizontal and vertical blur passes.
if(horzBlur == true)
{
output = mAmbientMap1.Get();
inputSrv = mhAmbientMap0GpuSrv;
outputRtv = mhAmbientMap1CpuRtv;
cmdList->SetGraphicsRoot32BitConstant(1, 1, 0);
}
else
{
output = mAmbientMap0.Get();
inputSrv = mhAmbientMap1GpuSrv;
outputRtv = mhAmbientMap0CpuRtv;
cmdList->SetGraphicsRoot32BitConstant(1, 0, 0);
}
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(output,
D3D12_RESOURCE_STATE_GENERIC_READ, D3D12_RESOURCE_STATE_RENDER_TARGET));
float clearValue[] = { 1.0f, 1.0f, 1.0f, 1.0f };
cmdList->ClearRenderTargetView(outputRtv, clearValue, 0, nullptr);
cmdList->OMSetRenderTargets(1, &outputRtv, true, nullptr);
// Normal/depth map still bound.
// Bind the normal and depth maps.
cmdList->SetGraphicsRootDescriptorTable(2, mhNormalMapGpuSrv);
// Bind the input ambient map to second texture table.
cmdList->SetGraphicsRootDescriptorTable(3, inputSrv);
// Draw fullscreen quad.
cmdList->IASetVertexBuffers(0, 0, nullptr);
cmdList->IASetIndexBuffer(nullptr);
cmdList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
cmdList->DrawInstanced(6, 1, 0, 0);
cmdList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(output,
D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_GENERIC_READ));
}
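The blur weights fed to this pass through the SSAO constant buffer are ordinary normalized Gaussian weights; with gBlurRadius = 5 the shader expects 2*5+1 = 11 of them, packed into three float4s on the C++ side. A sketch in the spirit of the demo's weight computation (sigma is a tunable width; sigma = 2.5 yields the radius-5 kernel used here, and the 2*sigma cutoff is an assumption):

```cpp
#include <cmath>
#include <vector>

// Compute normalized 1D Gaussian weights for a blur of radius ceil(2*sigma).
std::vector<float> CalcGaussWeights(float sigma)
{
    float twoSigma2 = 2.0f * sigma * sigma;
    int blurRadius = static_cast<int>(std::ceil(2.0f * sigma)); // cut off at 2*sigma

    std::vector<float> weights(2 * blurRadius + 1);
    float weightSum = 0.0f;
    for (int i = -blurRadius; i <= blurRadius; ++i)
    {
        float x = static_cast<float>(i);
        weights[i + blurRadius] = std::exp(-x * x / twoSigma2);
        weightSum += weights[i + blurRadius];
    }

    // Normalize so the kernel sums to one and the blur neither brightens
    // nor darkens the ambient map.
    for (float& w : weights)
        w /= weightSum;
    return weights;
}
```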
After computing the SSAO, do not clear the depth/stencil buffer and do not enable depth writes; then simply render the whole frame using the SSAO information.
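Since depth was already laid down in DrawNormalsAndDepth, the main pass only needs to match it, which can be expressed in the opaque PSO: keep the depth test but compare with EQUAL and mask off writes. A sketch, assuming the usual opaquePsoDesc from earlier chapters:

```cpp
// Depth was already written in the normal/depth pass, so the main pass
// just matches it: no depth writes, and pass only on exact equality.
opaquePsoDesc.DepthStencilState.DepthWriteMask = D3D12_DEPTH_WRITE_MASK_ZERO;
opaquePsoDesc.DepthStencilState.DepthFunc = D3D12_COMPARISON_FUNC_EQUAL;
```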
Now for the shader side.
Common.hlsl declares one additional SSAO map:
TextureCube gCubeMap : register(t0);
Texture2D gShadowMap : register(t1);
Texture2D gSsaoMap : register(t2);
Next, the shader that renders the normals:
// Defaults for number of lights.
#ifndef NUM_DIR_LIGHTS
#define NUM_DIR_LIGHTS 0
#endif
#ifndef NUM_POINT_LIGHTS
#define NUM_POINT_LIGHTS 0
#endif
#ifndef NUM_SPOT_LIGHTS
#define NUM_SPOT_LIGHTS 0
#endif
// Include common HLSL code.
#include "Common.hlsl"
struct VertexIn
{
float3 PosL : POSITION;
float3 NormalL : NORMAL;
float2 TexC : TEXCOORD;
float3 TangentU : TANGENT;
};
struct VertexOut
{
float4 PosH : SV_POSITION;
float3 NormalW : NORMAL;
float3 TangentW : TANGENT;
float2 TexC : TEXCOORD;
};
VertexOut VS(VertexIn vin)
{
VertexOut vout = (VertexOut)0.0f;
// Fetch the material data.
MaterialData matData = gMaterialData[gMaterialIndex];
// Assumes uniform scaling; otherwise, need to use inverse-transpose of world matrix.
vout.NormalW = mul(vin.NormalL, (float3x3)gWorld);
vout.TangentW = mul(vin.TangentU, (float3x3)gWorld);
// Transform to homogeneous clip space.
float4 posW = mul(float4(vin.PosL, 1.0f), gWorld);
vout.PosH = mul(posW, gViewProj);
// Output vertex attributes for interpolation across triangle.
float4 texC = mul(float4(vin.TexC, 0.0f, 1.0f), gTexTransform);
vout.TexC = mul(texC, matData.MatTransform).xy;
return vout;
}
float4 PS(VertexOut pin) : SV_Target
{
// Fetch the material data.
MaterialData matData = gMaterialData[gMaterialIndex];
float4 diffuseAlbedo = matData.DiffuseAlbedo;
uint diffuseMapIndex = matData.DiffuseMapIndex;
uint normalMapIndex = matData.NormalMapIndex;
// Dynamically look up the texture in the array.
diffuseAlbedo *= gTextureMaps[diffuseMapIndex].Sample(gsamAnisotropicWrap, pin.TexC);
#ifdef ALPHA_TEST
// Discard pixel if texture alpha < 0.1. We do this test as soon
// as possible in the shader so that we can potentially exit the
// shader early, thereby skipping the rest of the shader code.
clip(diffuseAlbedo.a - 0.1f);
#endif
// Interpolating normal can unnormalize it, so renormalize it.
pin.NormalW = normalize(pin.NormalW);
// NOTE: We use interpolated vertex normal for SSAO.
// Write normal in view space coordinates
float3 normalV = mul(pin.NormalW, (float3x3)gView);
return float4(normalV, 0.0f);
}
Note that the VS transforms the normal into world space, and the PS transforms it into view space only after the alpha-test clip. The clip is there because we do not want to render normals for transparent parts; we only render the normals of the nearest opaque pixels.
Next comes the key part, the shader that computes the SSAO:
cbuffer cbSsao : register(b0)
{
float4x4 gProj;
float4x4 gInvProj;
float4x4 gProjTex;
float4 gOffsetVectors[14];
// For SsaoBlur.hlsl
float4 gBlurWeights[3];
float2 gInvRenderTargetSize;
// Coordinates given in view space.
float gOcclusionRadius;
float gOcclusionFadeStart;
float gOcclusionFadeEnd;
float gSurfaceEpsilon;
};
cbuffer cbRootConstants : register(b1)
{
bool gHorizontalBlur;
};
// Nonnumeric values cannot be added to a cbuffer.
Texture2D gNormalMap : register(t0);
Texture2D gDepthMap : register(t1);
Texture2D gRandomVecMap : register(t2);
SamplerState gsamPointClamp : register(s0);
SamplerState gsamLinearClamp : register(s1);
SamplerState gsamDepthMap : register(s2);
SamplerState gsamLinearWrap : register(s3);
static const int gSampleCount = 14;
static const float2 gTexCoords[6] =
{
float2(0.0f, 1.0f),
float2(0.0f, 0.0f),
float2(1.0f, 0.0f),
float2(0.0f, 1.0f),
float2(1.0f, 0.0f),
float2(1.0f, 1.0f)
};
struct VertexOut
{
float4 PosH : SV_POSITION;
float3 PosV : POSITION;
float2 TexC : TEXCOORD0;
};
VertexOut VS(uint vid : SV_VertexID)
{
VertexOut vout;
vout.TexC = gTexCoords[vid];
// Quad covering screen in NDC space.
vout.PosH = float4(2.0f*vout.TexC.x - 1.0f, 1.0f - 2.0f*vout.TexC.y, 0.0f, 1.0f);
// Transform quad corners to view space near plane.
float4 ph = mul(vout.PosH, gInvProj);
vout.PosV = ph.xyz / ph.w;
return vout;
}
// Determines how much the sample point q occludes the point p as a function
// of distZ.
float OcclusionFunction(float distZ)
{
//
// If depth(q) is "behind" depth(p), then q cannot occlude p. Moreover, if
// depth(q) and depth(p) are sufficiently close, then we also assume q cannot
// occlude p because q needs to be in front of p by Epsilon to occlude p.
//
// We use the following function to determine the occlusion.
//
//
//       1.0     -------------\
//               |           |  \
//               |           |    \
//               |           |      \
//               |           |        \
//               |           |          \
//               |           |            \
//  ------|------|-----------|-------------|---------|--> zv
//        0     Eps          z0            z1
//
float occlusion = 0.0f;
if(distZ > gSurfaceEpsilon)
{
float fadeLength = gOcclusionFadeEnd - gOcclusionFadeStart;
// Linearly decrease occlusion from 1 to 0 as distZ goes
// from gOcclusionFadeStart to gOcclusionFadeEnd.
occlusion = saturate( (gOcclusionFadeEnd-distZ)/fadeLength );
}
return occlusion;
}
float NdcDepthToViewDepth(float z_ndc)
{
// z_ndc = A + B/viewZ, where gProj[2,2]=A and gProj[3,2]=B.
float viewZ = gProj[3][2] / (z_ndc - gProj[2][2]);
return viewZ;
}
float4 PS(VertexOut pin) : SV_Target
{
// p -- the point we are computing the ambient occlusion for.
// n -- normal vector at p.
// q -- a random offset from p.
// r -- a potential occluder that might occlude p.
// Get viewspace normal and z-coord of this pixel.
float3 n = normalize(gNormalMap.SampleLevel(gsamPointClamp, pin.TexC, 0.0f).xyz);
float pz = gDepthMap.SampleLevel(gsamDepthMap, pin.TexC, 0.0f).r;
pz = NdcDepthToViewDepth(pz);
//
// Reconstruct full view space position (x,y,z).
// Find t such that p = t*pin.PosV.
// p.z = t*pin.PosV.z
// t = p.z / pin.PosV.z
//
float3 p = (pz/pin.PosV.z)*pin.PosV;
// Extract random vector and map from [0,1] --> [-1, +1].
float3 randVec = 2.0f*gRandomVecMap.SampleLevel(gsamLinearWrap, 4.0f*pin.TexC, 0.0f).rgb - 1.0f;
float occlusionSum = 0.0f;
// Sample neighboring points about p in the hemisphere oriented by n.
for(int i = 0; i < gSampleCount; ++i)
{
// Our offset vectors are fixed and uniformly distributed (so that our offset vectors
// do not clump in the same direction). If we reflect them about a random vector
// then we get a random uniform distribution of offset vectors.
float3 offset = reflect(gOffsetVectors[i].xyz, randVec);
// Flip offset vector if it is behind the plane defined by (p, n).
float flip = sign( dot(offset, n) );
// Sample a point near p within the occlusion radius.
float3 q = p + flip * gOcclusionRadius * offset;
// Project q and generate projective tex-coords.
float4 projQ = mul(float4(q, 1.0f), gProjTex);
projQ /= projQ.w;
// Find the nearest depth value along the ray from the eye to q (this is not
// the depth of q, as q is just an arbitrary point near p and might
// occupy empty space). To find the nearest depth we look it up in the depthmap.
float rz = gDepthMap.SampleLevel(gsamDepthMap, projQ.xy, 0.0f).r;
rz = NdcDepthToViewDepth(rz);
// Reconstruct full view space position r = (rx,ry,rz). We know r
// lies on the ray of q, so there exists a t such that r = t*q.
// r.z = t*q.z ==> t = r.z / q.z
float3 r = (rz / q.z) * q;
//
// Test whether r occludes p.
// * The product dot(n, normalize(r - p)) measures how much in front
// of the plane(p,n) the occluder point r is. The more in front it is, the
// more occlusion weight we give it. This also prevents self shadowing where
// a point r on an angled plane (p,n) could give a false occlusion since they
// have different depth values with respect to the eye.
// * The weight of the occlusion is scaled based on how far the occluder is from
// the point we are computing the occlusion of. If the occluder r is far away
// from p, then it does not occlude it.
//
float distZ = p.z - r.z;
float dp = max(dot(n, normalize(r - p)), 0.0f);
float occlusion = dp*OcclusionFunction(distZ);
occlusionSum += occlusion;
}
occlusionSum /= gSampleCount;
float access = 1.0f - occlusionSum;
// Sharpen the contrast of the SSAO map to make the SSAO effect more dramatic.
return saturate(pow(access, 6.0f));
}
What this shader does is exactly what we described earlier when explaining the algorithm; at the end it writes the accessibility value into a texture.
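One step worth double-checking is NdcDepthToViewDepth: for a standard left-handed D3D perspective projection, A = gProj[2][2] = zf/(zf - zn) and B = gProj[3][2] = -zn*zf/(zf - zn), so z_ndc = A + B/z_view and the shader inverts this as z_view = B/(z_ndc - A). A small C++ round-trip check of that formula (zn and zf are arbitrary test values):

```cpp
#include <cmath>

// Forward mapping: z_ndc = A + B / z_view.
float NdcDepth(float viewZ, float zn, float zf)
{
    float A = zf / (zf - zn);
    float B = -zn * zf / (zf - zn);
    return A + B / viewZ;
}

// Inverse mapping used by the shader: z_view = B / (z_ndc - A).
float NdcDepthToViewDepth(float zNdc, float zn, float zf)
{
    float A = zf / (zf - zn);
    float B = -zn * zf / (zf - zn);
    return B / (zNdc - A);
}
```

The near plane should map to NDC depth 0, the far plane to 1, and applying the inverse after the forward mapping should recover the original view-space depth.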
The result at this point is very noisy, as shown in the figure.
Therefore we apply an edge-preserving blur; the shader is as follows:
cbuffer cbSsao : register(b0)
{
float4x4 gProj;
float4x4 gInvProj;
float4x4 gProjTex;
float4 gOffsetVectors[14];
// For SsaoBlur.hlsl
float4 gBlurWeights[3];
float2 gInvRenderTargetSize;
// Coordinates given in view space.
float gOcclusionRadius;
float gOcclusionFadeStart;
float gOcclusionFadeEnd;
float gSurfaceEpsilon;
};
cbuffer cbRootConstants : register(b1)
{
bool gHorizontalBlur;
};
// Nonnumeric values cannot be added to a cbuffer.
Texture2D gNormalMap : register(t0);
Texture2D gDepthMap : register(t1);
Texture2D gInputMap : register(t2);
SamplerState gsamPointClamp : register(s0);
SamplerState gsamLinearClamp : register(s1);
SamplerState gsamDepthMap : register(s2);
SamplerState gsamLinearWrap : register(s3);
static const int gBlurRadius = 5;
static const float2 gTexCoords[6] =
{
float2(0.0f, 1.0f),
float2(0.0f, 0.0f),
float2(1.0f, 0.0f),
float2(0.0f, 1.0f),
float2(1.0f, 0.0f),
float2(1.0f, 1.0f)
};
struct VertexOut
{
float4 PosH : SV_POSITION;
float2 TexC : TEXCOORD;
};
VertexOut VS(uint vid : SV_VertexID)
{
VertexOut vout;
vout.TexC = gTexCoords[vid];
// Quad covering screen in NDC space.
vout.PosH = float4(2.0f*vout.TexC.x - 1.0f, 1.0f - 2.0f*vout.TexC.y, 0.0f, 1.0f);
return vout;
}
float NdcDepthToViewDepth(float z_ndc)
{
// z_ndc = A + B/viewZ, where gProj[2,2]=A and gProj[3,2]=B.
float viewZ = gProj[3][2] / (z_ndc - gProj[2][2]);
return viewZ;
}
float4 PS(VertexOut pin) : SV_Target
{
// unpack into float array.
float blurWeights[12] =
{
gBlurWeights[0].x, gBlurWeights[0].y, gBlurWeights[0].z, gBlurWeights[0].w,
gBlurWeights[1].x, gBlurWeights[1].y, gBlurWeights[1].z, gBlurWeights[1].w,
gBlurWeights[2].x, gBlurWeights[2].y, gBlurWeights[2].z, gBlurWeights[2].w,
};
float2 texOffset;
if(gHorizontalBlur)
{
texOffset = float2(gInvRenderTargetSize.x, 0.0f);
}
else
{
texOffset = float2(0.0f, gInvRenderTargetSize.y);
}
// The center value always contributes to the sum.
float4 color = blurWeights[gBlurRadius] * gInputMap.SampleLevel(gsamPointClamp, pin.TexC, 0.0);
float totalWeight = blurWeights[gBlurRadius];
float3 centerNormal = gNormalMap.SampleLevel(gsamPointClamp, pin.TexC, 0.0f).xyz;
float centerDepth = NdcDepthToViewDepth(
gDepthMap.SampleLevel(gsamDepthMap, pin.TexC, 0.0f).r);
for(float i = -gBlurRadius; i <= gBlurRadius; ++i)
{
// We already added in the center weight.
if( i == 0 )
continue;
float2 tex = pin.TexC + i*texOffset;
float3 neighborNormal = gNormalMap.SampleLevel(gsamPointClamp, tex, 0.0f).xyz;
float neighborDepth = NdcDepthToViewDepth(
gDepthMap.SampleLevel(gsamDepthMap, tex, 0.0f).r);
//
// If the center value and neighbor values differ too much (either in
// normal or depth), then we assume we are sampling across a discontinuity.
// We discard such samples from the blur.
//
if( dot(neighborNormal, centerNormal) >= 0.8f &&
abs(neighborDepth - centerDepth) <= 0.2f )
{
float weight = blurWeights[i + gBlurRadius];
// Add neighbor pixel to blur.
color += weight*gInputMap.SampleLevel(
gsamPointClamp, tex, 0.0);
totalWeight += weight;
}
}
// Compensate for discarded samples by making total weights sum to 1.
return color / totalWeight;
}
This is very similar to the earlier Gaussian blur; the difference is the following test:
if(dot(neighborNormal, centerNormal) >= 0.8f && abs(neighborDepth - centerDepth) <= 0.2f)
{
······
}
We check continuity via the normal and the depth: a discontinuity means an edge, and edges should be preserved, not blurred across.
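To see why this keeps edges crisp, here is the same logic as a CPU sketch on a 1D signal: a neighbor only contributes when its depth is within 0.2 of the center's, so blur weight never leaks across a depth discontinuity (the signal, depths, and uniform weights below are made up for illustration; the shader additionally checks the normals):

```cpp
#include <cmath>
#include <vector>

// 1D edge-preserving blur: neighbors across a depth discontinuity are
// discarded, and the remaining weights are renormalized, exactly as the
// pixel shader does per sample.
std::vector<float> EdgePreservingBlur1D(const std::vector<float>& ao,
                                        const std::vector<float>& depth,
                                        const std::vector<float>& weights,
                                        int radius)
{
    std::vector<float> out(ao.size());
    for (int c = 0; c < static_cast<int>(ao.size()); ++c)
    {
        float color = weights[radius] * ao[c]; // center always contributes
        float totalWeight = weights[radius];
        for (int i = -radius; i <= radius; ++i)
        {
            int n = c + i;
            if (i == 0 || n < 0 || n >= static_cast<int>(ao.size()))
                continue;
            if (std::fabs(depth[n] - depth[c]) <= 0.2f) // discontinuity test
            {
                color += weights[i + radius] * ao[n];
                totalWeight += weights[i + radius];
            }
        }
        out[c] = color / totalWeight; // compensate for discarded samples
    }
    return out;
}
```

With a step in both the AO signal and the depth at the same place, the blurred output keeps the step exactly, instead of smearing it.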
The final result is shown in the figure.
This result is much smoother, and we now use it to render the scene.
The rendering shader only needs a small change to take ambient occlusion into account:
// Finish texture projection and sample SSAO map.
pin.SsaoPosH /= pin.SsaoPosH.w;
float ambientAccess = gSsaoMap.SampleLevel(gsamLinearClamp, pin.SsaoPosH.xy, 0.0f).r;
// Light terms.
float4 ambient = ambientAccess*gAmbientLight*diffuseAlbedo;
Finally we render the whole scene, with the SSAO result displayed in the bottom-right corner; the final effect is shown in the figure.
You can see strong occlusion under the pillars, and in the rendered scene the base of the nearest pillar does look a little dark. The visual feel is very much like some Ubisoft titles; early Far Cry 3 already had this kind of very strong ambient occlusion (though with that AO turned on the frame rate absolutely cratered). Bethesda's Prey and DICE's games sometimes have this feel too, just not as strong as Ubisoft's, so ambient occlusion is best used in moderation.