Graphics and Compute Pipelines

[TOC]

Overview

Contents

Creating a shader module

Specifying pipeline shader stages

Specifying a pipeline vertex binding description, attribute description, and input
state

Specifying a pipeline input assembly state

Specifying a pipeline tessellation state

Specifying a pipeline viewport and scissor test state

Specifying a pipeline rasterization state

Specifying a pipeline multisample state

Specifying a pipeline depth and stencil state

Specifying a pipeline blend state

Specifying pipeline dynamic states

Creating a pipeline layout

Specifying graphics pipeline creation parameters

Creating a pipeline cache object

Retrieving data from a pipeline cache

Merging multiple pipeline cache objects

Creating a graphics pipeline

Creating a compute pipeline

Binding a pipeline object

Creating a pipeline layout with a combined image sampler, a buffer, and push
constant ranges

Creating a graphics pipeline with vertex and fragment shaders, depth test
enabled, and with dynamic viewport and scissor tests

Creating multiple graphics pipelines on multiple threads

Destroying a pipeline

Destroying a pipeline cache

Destroying a pipeline layout

Destroying a shader module

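Introduction

This chapter covers one of the core topics of Vulkan.

Operations recorded in command buffers and submitted to queues are executed by the hardware. Compute pipelines are used for mathematical computations, and graphics pipelines are used to draw geometry.

Pipeline objects control how geometry is drawn and how computations are performed; they manage the behavior of the hardware. They are one of the biggest differences between Vulkan and OpenGL. OpenGL lets us change rendering or computing parameters at any time: we can set state, activate a shader program, draw geometry, then activate another shader program and draw other geometry. In Vulkan this is not possible, because the whole rendering or computing state is stored in a single monolithic object. To use different shaders we must prepare and bind separate pipelines; we cannot simply switch shaders.

This may look intimidating at first, because many shader variations (not to mention the rest of the pipeline state) can lead to a large number of pipeline objects. But it serves two goals. The first is performance: a driver that knows the whole state in advance can optimize the execution of the following operations. The second is stability: changing state at arbitrary moments may force the driver to perform extra work, such as recompiling shaders, at unexpected times. In Vulkan, all the required preparation, including shader compilation, is done when the pipeline is created.

This chapter shows how to set the parameters of graphics and compute pipelines: how to prepare shader modules and choose the active shader stages, how to set up depth and stencil tests, and how to enable blending. It also covers how vertex attributes are specified and supplied during drawing. Finally, it shows how to create multiple pipelines and how to speed up their creation.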

Shader Module

Creating a shader module

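The first step in creating a pipeline object is preparing shader modules from SPIR-V assembly. A single module may contain code for multiple shader stages.

A shader module holds the code of the selected shader programs in the form of SPIR-V assembly. It may contain several stages, but each stage must have an associated entry point. These entry points are passed as parameters when the pipeline object is created.

Load the SPIR-V code, then fill in a VkShaderModuleCreateInfo: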

VkShaderModuleCreateInfo shader_module_create_info = {
    VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
    nullptr,
    0,
    source_code.size(),
    reinterpret_cast<uint32_t const *>(source_code.data())
};

Then call vkCreateShaderModule:

VkResult result = vkCreateShaderModule( logical_device,
&shader_module_create_info, nullptr, &shader_module );
if( VK_SUCCESS != result ) {
	std::cout << "Could not create a shader module." << std::endl;
	return false;
}
return true;

Keep in mind that shaders are not compiled or linked when the shader module is created; that happens when the pipeline object is created.
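Because codeSize is given in bytes while pCode points to 32-bit words, it can be worth checking the loaded buffer before creating the module. The following is a minimal sketch, not part of the original recipe: LooksLikeSpirv and WordsToBytes are hypothetical helper names, and the check only covers modules stored in the host's byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// SPIR-V magic number: the first 32-bit word of every module.
constexpr std::uint32_t kSpirvMagic = 0x07230203u;

// Hypothetical helper (not a Vulkan function): returns true when the byte
// buffer is plausible input for VkShaderModuleCreateInfo::pCode/codeSize.
inline bool LooksLikeSpirv( std::vector<unsigned char> const & source_code ) {
    // codeSize is expressed in bytes and must be a non-zero multiple of 4.
    if( source_code.empty() || (source_code.size() % 4) != 0 ) {
        return false;
    }
    // Copy the first word out instead of casting, to avoid alignment issues.
    std::uint32_t first_word = 0;
    std::memcpy( &first_word, source_code.data(), sizeof( first_word ) );
    return first_word == kSpirvMagic;
}

// Hypothetical helper used only to build small test buffers from words.
inline std::vector<unsigned char> WordsToBytes( std::vector<std::uint32_t> const & words ) {
    std::vector<unsigned char> bytes( words.size() * sizeof( std::uint32_t ) );
    if( !bytes.empty() ) {
        std::memcpy( bytes.data(), words.data(), bytes.size() );
    }
    return bytes;
}
```

A check like this could run right after the file is read and before VkShaderModuleCreateInfo is filled in.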

pipeline states

Specifying pipeline shader stages

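Compute pipelines can use only compute shaders. Graphics pipelines, however, may contain several shader stages: vertex, tessellation control and evaluation, geometry, and fragment. So, to create a pipeline correctly, we must specify which programmable shader stages will be active when the pipeline is bound to a command buffer, and we must provide the code of every enabled shader.

We define a custom structure:

struct ShaderStageParameters {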
    VkShaderStageFlagBits ShaderStage;
    VkShaderModule ShaderModule;
    char const * EntryPointName;
    VkSpecializationInfo const * SpecializationInfo;
};

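VkSpecializationInfo provides values for specialization constants; it may be nullptr.

To define the set of shader stages active in a pipeline, we prepare an array of VkPipelineShaderStageCreateInfo elements. Each shader stage needs its own entry, which specifies a shader module and the entry point that implements the stage's behavior.

We can also provide specialization information, which lets us change the values of constants at pipeline creation time (that is, at runtime rather than when the shader is compiled to SPIR-V). This allows the same shader to be used several times with slight variations.

Both graphics and compute pipelines need pipeline shader stage information.

Assume only vertex and fragment shaders are used:

std::vector<ShaderStageParameters> shader_stage_params = {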
    {
        VK_SHADER_STAGE_VERTEX_BIT,
        *vertex_shader_module,
        "main",
        nullptr
    },
    {
        VK_SHADER_STAGE_FRAGMENT_BIT,
        *fragment_shader_module,
        "main",
        nullptr
    }
};

shader_stage_create_infos.clear();
for( auto & shader_stage : shader_stage_params ) {
    shader_stage_create_infos.push_back( {
        VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
        nullptr,
        0,
        shader_stage.ShaderStage,
        shader_stage.ShaderModule,
        shader_stage.EntryPointName,
        shader_stage.SpecializationInfo
    } );
}

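Each shader stage in the list must be unique.

Specifying a pipeline vertex binding description, attribute description, and input state

When we draw geometry, we usually supply additional attributes besides positions, such as normal vectors, colors, or texture coordinates. We are free to choose this vertex data, so for the hardware to use it correctly we must specify how many attributes there are, how they are laid out in memory, and where they are fetched from. This information is provided through the vertex binding descriptions and attribute descriptions used at graphics pipeline creation.

A vertex binding defines a collection of data taken from a vertex buffer bound to a selected index. The binding serves as a numbered source of data for vertex attributes. At least 16 separate bindings are available, to which we can bind separate vertex buffers or different memory ranges of the same buffer.

Through a binding description we specify where the data comes from (from which binding), how it is laid out (the stride between consecutive elements in the buffer), and how it is read ($\color{red}{\text{per vertex or per instance}}$).

Here is an example with three attributes: vec3 position, vec2 texcoord, and vec3 color.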

std::vector<VkVertexInputBindingDescription> binding_descriptions = {
    {
        0,
        8 * sizeof( float ),
        VK_VERTEX_INPUT_RATE_VERTEX
    }
};

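Through vertex input attribute descriptions, we define the attributes fetched from a given binding. For each attribute we provide a shader location (the same as in the layout(location = <number>) qualifier), the format of the attribute's data, and the memory offset at which the attribute starts. The number of attribute descriptions is the total number of attributes used during rendering.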

std::vector<VkVertexInputAttributeDescription> attribute_descriptions = {
    {
        0,
        0,
        VK_FORMAT_R32G32B32_SFLOAT,
        0
    },
    {
        1,
        0,
        VK_FORMAT_R32G32_SFLOAT,
        3 * sizeof( float )
    },
    {
        2,
        0,
        VK_FORMAT_R32G32B32_SFLOAT,
        5 * sizeof( float )
    }
};

vertex_input_state_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
    nullptr,
    0,
    static_cast<uint32_t>(binding_descriptions.size()),
    binding_descriptions.data(),
    static_cast<uint32_t>(attribute_descriptions.size()),
    attribute_descriptions.data()
};
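The stride of 8 floats and the offsets of 3 and 5 floats used above follow directly from an interleaved layout. As a sketch under the assumption that the application stores vertices in a struct like the hypothetical Vertex below, the values can be derived with sizeof and offsetof and checked at compile time rather than written by hand:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side vertex matching the example layout
// (vec3 position, vec2 texcoord, vec3 color), tightly interleaved.
struct Vertex {
    float position[3];   // location 0, VK_FORMAT_R32G32B32_SFLOAT
    float texcoord[2];   // location 1, VK_FORMAT_R32G32_SFLOAT
    float color[3];      // location 2, VK_FORMAT_R32G32B32_SFLOAT
};

// The binding stride is the size of one whole vertex.
constexpr std::size_t kVertexStride = sizeof( Vertex );

// Compile-time checks that the struct agrees with the hand-written numbers.
static_assert( kVertexStride == 8 * sizeof( float ), "unexpected padding in Vertex" );
static_assert( offsetof( Vertex, position ) == 0, "position must start the vertex" );
static_assert( offsetof( Vertex, texcoord ) == 3 * sizeof( float ), "texcoord offset mismatch" );
static_assert( offsetof( Vertex, color ) == 5 * sizeof( float ), "color offset mismatch" );
```

With such a struct, kVertexStride could be used as the binding stride and the offsetof expressions as the attribute offsets, so the descriptions stay in sync if the vertex format changes.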

Specifying a pipeline input assembly state

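Drawing geometry involves specifying the type of primitives, which is done through the input assembly state.

VkPipelineInputAssemblyStateCreateInfo

Through the input assembly state we define how vertices are assembled into polygons. The most commonly used primitives are triangle strips or lists.

Notes

The primitive restart option cannot be used with list primitives.

Primitives with adjacency can be used only with geometry shaders, which require the geometryShader feature to be enabled when the logical device is created.

Patch primitives are the only primitives that can be used with tessellation shaders, which require the tessellationShader feature to be enabled when the logical device is created.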

VkPipelineInputAssemblyStateCreateInfo:

input_assembly_state_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO,
    nullptr,
    0,
    topology,
    primitive_restart_enable
};

Specifying a pipeline tessellation state

To use tessellation shaders, we need to:

Enable the tessellationShader feature when creating the logical device

Write code for the tessellation control and evaluation shaders

Create a shader module (or two) for them

Prepare the pipeline tessellation state in a VkPipelineTessellationStateCreateInfo

VkPipelineTessellationStateCreateInfo

tessellation_state_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO,
    nullptr,
    0,
    patch_control_points_count
};

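In the tessellation state we only provide the number of control points (vertices) that form a patch. At least 32 vertices per patch are supported.

A patch is just a set of points (vertices) that the tessellation stages use to generate points, lines, or polygons such as triangles. For example, the three vertices of a triangle can form a patch with three control points.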

Specifying a pipeline viewport and scissor test state

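Drawing on screen requires specifying screen parameters. Creating a swapchain is not enough, because we do not always draw to the whole image area. Sometimes we draw into a smaller part of the image, for example for a reflection in a mirror or for split-screen multiplayer. The area of the image to draw to is defined through the pipeline viewport and scissor test states.

Viewport and scissor states take separate sets of parameters, but the number of elements in both sets must be equal. We define a custom structure: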

struct ViewportInfo {
    std::vector<VkViewport> Viewports;
    std::vector<VkRect2D> Scissors;
};

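To render to more than one viewport, the multiViewport feature must be enabled when the logical device is created.

Vertices are transformed from local space into clip space. The hardware then performs perspective division, which produces normalized device coordinates (NDC). Next, polygons are assembled and rasterized, which generates fragments. Each fragment has its own position, defined in framebuffer coordinates. For this position to be calculated correctly, a viewport transformation is required, and its parameters are specified in the viewport state.

The viewport and scissor test states are optional, though commonly used: they need not be provided when rasterization is disabled.

In the viewport state we define, in framebuffer coordinates (pixels on screen), the $\color{red}{\text{upper-left corner, width, and height}}$ of the viewport. We also define the minimum and maximum viewport depth values (floating-point values in $[0, 1]$). It is valid for the maximum depth to be smaller than the minimum depth.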
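To make the viewport transformation concrete, here is a small sketch of how normalized device coordinates map to framebuffer coordinates and depth. The Viewport struct is a local stand-in that mirrors the fields of VkViewport so the example compiles without the Vulkan headers; Point3 and NdcToFramebuffer are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Local mirror of the VkViewport fields (assumption: declared here only so
// that the sketch is self-contained).
struct Viewport {
    float x, y;            // upper-left corner in framebuffer coordinates
    float width, height;   // size in pixels
    float minDepth, maxDepth;
};

struct Point3 {
    float x, y, z;
};

// Maps NDC (x and y in [-1, 1], z in [0, 1]) to framebuffer coordinates:
//   x_f = x + (width  / 2) * (x_ndc + 1)
//   y_f = y + (height / 2) * (y_ndc + 1)
//   z_f = minDepth + (maxDepth - minDepth) * z_ndc
inline Point3 NdcToFramebuffer( Viewport const & vp, Point3 const & ndc ) {
    return {
        vp.x + 0.5f * vp.width  * (ndc.x + 1.0f),
        vp.y + 0.5f * vp.height * (ndc.y + 1.0f),
        vp.minDepth + (vp.maxDepth - vp.minDepth) * ndc.z
    };
}
```

For a 512 x 512 viewport at the origin with a depth range of 0 to 1, the NDC center (0, 0) lands on pixel (256, 256), and the NDC corners land on the corners of the viewport.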

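The scissor test additionally clips generated fragments to a specified rectangle. If we don't want to clip anything, we can specify a rectangle the size of the viewport.

In Vulkan, the scissor test is always enabled.

An example: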

ViewportInfo viewport_infos = {
    {
        {
            0.0f,
            0.0f,
            512.0f,
            512.0f,
            0.0f,
            1.0f
        },
    },
    {
        {
            {
                0,
                0
            },
            {
                512,
                512
            }
        }
    }
};

The variable above can be used to create the viewport and scissor test state defined in this recipe. The implementation is as follows:

uint32_t viewport_count =
static_cast<uint32_t>(viewport_infos.Viewports.size());
uint32_t scissor_count =
static_cast<uint32_t>(viewport_infos.Scissors.size());
viewport_state_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
    nullptr,
    0,
    viewport_count,
    viewport_infos.Viewports.data(),
    scissor_count,
    viewport_infos.Scissors.data()
};

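To change viewport or scissor test parameters we would normally have to recreate the pipeline. However, during pipeline creation we can declare the viewport and scissor test parameters as dynamic; then they can be changed without recreating the pipeline, by setting them during command buffer recording. The number of viewports (and scissor tests) is still fixed at pipeline creation and cannot be changed afterwards.

Unless the multiViewport feature was enabled when the logical device was created, we cannot provide more than one viewport and scissor test.

The index of the viewport transformation used for rasterization can be changed only inside geometry shaders.

Specifying a pipeline rasterization state

The rasterization process generates fragments (pixels) from assembled polygons. The viewport state is used here: fragments are generated in framebuffer coordinates. To specify how fragments are generated, we need to prepare a rasterization state.

The rasterization state controls the parameters of rasterization. First and foremost, it defines whether rasterization is enabled at all. It specifies which side of a polygon is its front: the one whose vertices appear on screen in clockwise order or the one whose vertices appear in counterclockwise order. It also controls whether front faces, back faces, or both are culled. In OpenGL, counterclockwise faces are front-facing and culling is disabled by default. Vulkan has no default state, so these values must always be specified.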

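A rasterization state is always required when a graphics pipeline is created.

The rasterization state also controls how polygons are drawn. Usually we want them fully rendered (filled), but we can also draw only their edges (lines) or their vertices (points). Line and point modes can be used only if the fillModeNonSolid feature was enabled when the logical device was created.

We also need to define how the depth value of a fragment is calculated. We can enable depth bias, a process that offsets the generated depth value by a constant and by an additional slope factor. We also specify the maximum (or minimum) offset that can be applied to the depth value when depth bias is enabled.

Next, we define what happens to fragments whose depth falls outside the range given in the viewport state. When depth clamp is enabled, their depth is clamped to that range and they are processed further; otherwise they are discarded. (Depth clamp is usually left disabled.)

Finally, we define the width of rendered lines. Normally it must be 1.0, but if the wideLines feature is enabled we can provide values greater than 1.

Point size is handled similarly, except that it is written per vertex in the shader rather than stored in the rasterization state.

VkPipelineRasterizationStateCreateInfo.

An example: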

VkPipelineRasterizationStateCreateInfo rasterization_state_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
    nullptr,
    0,
    depth_clamp_enable,
    rasterizer_discard_enable,
    polygon_mode,
    culling_mode,
    front_face,
    depth_bias_enable,
    depth_bias_constant_factor,
    depth_bias_clamp,
    depth_bias_slope_factor,
    line_width
};

Specifying a pipeline multisample state

Multisampling is a process that eliminates jagged edges of drawn primitives. In other words, it anti-aliases polygons, lines, and points. It is controlled through the multisample state.

VkPipelineMultisampleStateCreateInfo

multisample_state_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
    nullptr,
    0,
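    sample_count,//the number of samples generated per pixel
    per_sample_shading_enable,//whether shading runs per sample instead of per fragment
    min_sample_shading,//minimum fraction of uniquely shaded samples
    sample_masks,//coverage mask applied to each fragment
    alpha_to_coverage_enable,//whether coverage is generated from the fragment's alpha
    alpha_to_one_enable//whether the fragment's alpha is replaced with 1.0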
};

Specifying a pipeline depth and stencil state

depth test compareOp (never, less, less or equal, equal, greater or equal, greater, not equal, always)

stencil test compareOp (never, less, less or equal, equal, greater or equal, greater, not equal, always)

The depth and stencil state is not required when rasterization is disabled or when the subpass of the render pass in which the pipeline is used has no depth/stencil attachment.

We need to specify how depth values are compared and whether fragments that pass the test are written to the depth attachment.
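The interaction between the comparison operator and the depth write flag can be sketched in plain C++. The enum below is a local stand-in for VkCompareOp, and DepthTestPasses and ApplyDepthTest are hypothetical names; the code only models the behavior described above and is not how a driver implements it.

```cpp
#include <cassert>

// Local stand-in for VkCompareOp, listed in the same order as the Vulkan enum.
enum class CompareOp {
    Never, Less, Equal, LessOrEqual, Greater, NotEqual, GreaterOrEqual, Always
};

// Returns true when the incoming fragment passes the depth test against
// the value already stored in the depth attachment.
inline bool DepthTestPasses( CompareOp op, float fragment_depth, float stored_depth ) {
    switch( op ) {
        case CompareOp::Never:          return false;
        case CompareOp::Less:           return fragment_depth <  stored_depth;
        case CompareOp::Equal:          return fragment_depth == stored_depth;
        case CompareOp::LessOrEqual:    return fragment_depth <= stored_depth;
        case CompareOp::Greater:        return fragment_depth >  stored_depth;
        case CompareOp::NotEqual:       return fragment_depth != stored_depth;
        case CompareOp::GreaterOrEqual: return fragment_depth >= stored_depth;
        case CompareOp::Always:         return true;
    }
    return false; // not reached for valid enum values
}

// Returns the depth value stored after processing one fragment: it changes
// only when the test passes and depth writes are enabled.
inline float ApplyDepthTest( CompareOp op, bool depth_write_enable,
                             float fragment_depth, float stored_depth ) {
    bool const passed = DepthTestPasses( op, fragment_depth, stored_depth );
    return (passed && depth_write_enable) ? fragment_depth : stored_depth;
}
```

With LessOrEqual and writes enabled, a closer fragment replaces the stored depth; with writes disabled the fragment can still pass the test but leaves the attachment untouched.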

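When the depthBounds feature is enabled, an additional depth bounds test can be used. It checks whether the depth value already stored in the depth attachment at the fragment's location lies inside the range from minDepthBounds to maxDepthBounds. If it does not, the fragment is discarded as if it had failed the depth test.

The stencil test performs an additional test on an integer value associated with each fragment. It serves many purposes: defining complex shapes that decide which areas of the screen are rendered, selecting which areas are lit in deferred shading/lighting, rendering the outline (highlight) of an object selected with the mouse, or rendering the outline of objects hidden behind other objects.

When the stencil test is enabled, we define its parameters separately for front- and back-facing polygons. They specify what action is taken in three situations: the fragment fails the stencil test; it passes the stencil test but fails the depth test; it passes both tests. For each situation we choose one of several operations on the value stored in the stencil attachment: keep it unchanged, reset it to 0, replace it with the reference value, increment or decrement it with clamping (saturation) or with wrap-around, or invert its bits. We also specify the comparison operator used by the test (analogous to the depth test), the compare and write masks that select which bits of the stencil value take part in the test or are updated in the stencil attachment, and the reference value.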

VkPipelineDepthStencilStateCreateInfo

VkPipelineDepthStencilStateCreateInfo depth_and_stencil_state_create_info =
{
    VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO,
    nullptr,
    0,
    depth_test_enable,
    depth_write_enable,
    depth_compare_op,
    depth_bounds_test_enable,
    stencil_test_enable,
    front_stencil_test_parameters,
    back_stencil_test_parameters,
    min_depth_bounds,
    max_depth_bounds
};

Specifying a pipeline blend state

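To simulate transparent objects, the hardware blends the color of the processed fragment with the color already stored in the framebuffer. This operation is prepared through the graphics pipeline's blend state.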

VkPipelineColorBlendAttachmentState

VkPipelineColorBlendStateCreateInfo

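The blend state is optional: it is not required when rasterization is disabled or when the subpass in which the pipeline is used has no color attachments.

The blend state mainly defines the parameters of the blending operation, but it serves other purposes too. It specifies a color mask that selects which color components are updated (written to) during rendering. It also controls the logical operation state: when enabled, the specified logical operation is performed between the color of the fragment and the color already written to the framebuffer.

Logical operations are performed only on attachments with integer and normalized integer formats.

Supported logical operations: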

CLEAR: Setting the color to zero

AND: Bitwise AND operation between the source (fragment’s) color and a
destination color (already stored in an attachment)

AND_REVERSE: Bitwise AND operation between source and inverted destination
colors

COPY: Copying the source (fragment’s) color without any modifications

AND_INVERTED: Bitwise AND operation between destination and inverted source
colors

NO_OP: Leaving the already stored color intact

XOR: Bitwise exclusive OR between source and destination colors

OR: Bitwise OR operation between the source and destination colors

NOR: Inverted bitwise OR

EQUIVALENT: Inverted XOR

INVERT: Inverted destination color

OR_REVERSE: Bitwise OR between the source color and inverted destination color

COPY_INVERTED: Copying bitwise inverted source color

OR_INVERTED: Bitwise OR operation between destination and inverted source
color

NAND: Inverted bitwise AND operation

SET: Setting all color bits to ones

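Blending is controlled separately for each color attachment used in the subpass in which the graphics pipeline is bound, so blending parameters must be specified for every color attachment. However, if the independentBlend feature is not enabled, the parameters must be identical for all attachments.

For blending, we specify source and destination factors separately for the color components and for the alpha component. Supported blend factors: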

ZERO: 0

ONE: 1

SRC_COLOR: <source component>

ONE_MINUS_SRC_COLOR: 1 - <source component>

DST_COLOR: <destination component>

ONE_MINUS_DST_COLOR: 1 - <destination component>

SRC_ALPHA: <source alpha>

ONE_MINUS_SRC_ALPHA: 1 - <source alpha>

DST_ALPHA: <destination alpha>

ONE_MINUS_DST_ALPHA: 1 - <destination alpha>

CONSTANT_COLOR: <constant color component>

ONE_MINUS_CONSTANT_COLOR: 1 - <constant color component>

CONSTANT_ALPHA: <alpha of the constant color>

ONE_MINUS_CONSTANT_ALPHA: 1 - <alpha of the constant color>

SRC_ALPHA_SATURATE: min( <source alpha>, 1 - <destination alpha> )

SRC1_COLOR: <component of a source’s second color> (used in dual source blending)

ONE_MINUS_SRC1_COLOR: 1 - <component of a source’s second color> (from dual source blending)

SRC1_ALPHA: <alpha component of a source’s second color> (in dual source blending)

ONE_MINUS_SRC1_ALPHA: 1 - <alpha component of a source’s second color> (from dual source blending)

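Some blend factors use a constant color instead of the fragment's (source) color or the color stored in the attachment (destination). This constant color can be specified statically at pipeline creation or set dynamically during command buffer recording with vkCmdSetBlendConstants().

Blend factors that use the source's second color (SRC1) are available only when the dualSrcBlend feature is enabled.

The blending function, which controls how blending is performed, is also specified separately for the color and alpha components. Blending operators include:

ADD: <source component> * <source factor> + <destination component> * <destination factor>

SUBTRACT: <source component> * <source factor> - <destination component> * <destination factor>

REVERSE_SUBTRACT: <destination component> * <destination factor> - <source component> * <source factor>

MIN: min( <source component>, <destination component> )

MAX: max( <source component>, <destination component> )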

Enabling a logical operation disables blending.
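As an illustration of how factors and operators combine, the sketch below evaluates one common configuration on the CPU: SRC_ALPHA and ONE_MINUS_SRC_ALPHA as the color factors, ONE and ZERO as the alpha factors, and the ADD operator for both. Color and BlendSrcOver are hypothetical names, and the choice of factors is an assumption made for the example, not a setting taken from this recipe.

```cpp
#include <cassert>

struct Color {
    float r, g, b, a;
};

// Evaluates result = src * srcFactor + dst * dstFactor (VK_BLEND_OP_ADD) with
//   color factors: VK_BLEND_FACTOR_SRC_ALPHA / VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA
//   alpha factors: VK_BLEND_FACTOR_ONE       / VK_BLEND_FACTOR_ZERO
inline Color BlendSrcOver( Color const & src, Color const & dst ) {
    float const src_color_factor = src.a;
    float const dst_color_factor = 1.0f - src.a;
    float const src_alpha_factor = 1.0f;
    float const dst_alpha_factor = 0.0f;
    return {
        src.r * src_color_factor + dst.r * dst_color_factor,
        src.g * src_color_factor + dst.g * dst_color_factor,
        src.b * src_color_factor + dst.b * dst_color_factor,
        src.a * src_alpha_factor + dst.a * dst_alpha_factor
    };
}
```

Blending a half-transparent red fragment over an opaque blue pixel this way gives an even mix of the two colors, which is the usual behavior expected from transparency.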

Below is an example of a blend state with both the logical operation and blending disabled:

 std::vector<VkPipelineColorBlendAttachmentState> attachment_blend_states =
{
  {
  false,
  VK_BLEND_FACTOR_ONE,
  VK_BLEND_FACTOR_ONE,
  VK_BLEND_OP_ADD,
  VK_BLEND_FACTOR_ONE,
  VK_BLEND_FACTOR_ONE,
  VK_BLEND_OP_ADD,
  VK_COLOR_COMPONENT_R_BIT |
  VK_COLOR_COMPONENT_G_BIT |
  VK_COLOR_COMPONENT_B_BIT |
  VK_COLOR_COMPONENT_A_BIT
  }
 };
 VkPipelineColorBlendStateCreateInfo blend_state_create_info;
 SpecifyPipelineBlendState( false, VK_LOGIC_OP_COPY,
 attachment_blend_states, { 1.0f, 1.0f, 1.0f, 1.0f },
 blend_state_create_info );

The recipe's implementation fills a VkPipelineColorBlendStateCreateInfo as follows:

blend_state_create_info = {
 VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
 nullptr,
 0,
 logic_op_enable,
 logic_op,
 static_cast<uint32_t>(attachment_blend_states.size()),
 attachment_blend_states.data(),
 {
     blend_constants[0],
     blend_constants[1],
     blend_constants[2],
     blend_constants[3]
 }
};

Specifying pipeline dynamic states

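Creating a graphics pipeline requires many parameters, and once the pipeline is created they can no longer be changed. This improves performance and gives the driver a stable, predictable environment. Unfortunately, it is also inconvenient for developers, who may need to create many pipeline objects that differ only in small details.

To avoid this problem, dynamic states were introduced. They let us control some pipeline parameters dynamically, by calling specific functions during command buffer recording. To do so, we must specify which parts of the pipeline are dynamic, which is done through the pipeline dynamic states.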

VkDynamicState

VK_DYNAMIC_STATE_VIEWPORT

VK_DYNAMIC_STATE_SCISSOR

VK_DYNAMIC_STATE_LINE_WIDTH

VK_DYNAMIC_STATE_DEPTH_BIAS

VK_DYNAMIC_STATE_BLEND_CONSTANTS

VK_DYNAMIC_STATE_DEPTH_BOUNDS

VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK

VK_DYNAMIC_STATE_STENCIL_WRITE_MASK

VK_DYNAMIC_STATE_STENCIL_REFERENCE

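Dynamic pipeline states were introduced to allow some flexibility in setting the state of pipeline objects. There are not many parts of the pipeline that can be set during command buffer recording, but the selection is a compromise between performance, driver simplicity, the capabilities of modern hardware, and ease of use of the API.

Dynamic state is optional.

The following parts of the pipeline can be set dynamically: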

Viewport

Scissor

Line width

Depth bias

Stencil compare mask

Stencil write mask

Stencil reference value

Blend constants

We list the states to be set dynamically in an array of VkDynamicState values and pass it through a VkPipelineDynamicStateCreateInfo structure:

VkPipelineDynamicStateCreateInfo dynamic_state_creat_info = {
    VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
    nullptr,
    0,
    static_cast<uint32_t>(dynamic_states.size()),
    dynamic_states.data()
};

pipeline

Creating a pipeline layout

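Pipeline layouts are similar to descriptor set layouts. A descriptor set layout defines what types of resources form a descriptor set; a pipeline layout defines what types of resources can be accessed by a given pipeline. Pipeline layouts are created from descriptor set layouts and push constant ranges.

Pipeline layouts are needed at pipeline creation because they specify the interface between shader stages and shader resources through a set, binding, array element address. Shaders use the same address (through a layout qualifier) to access a given resource. Even if a pipeline doesn't use any descriptor resources, we still need to create a pipeline layout to inform the driver that no such interface is needed.

A pipeline layout defines the set of resources that the shaders of a given pipeline can access. When we record command buffers, we bind descriptor sets to selected indices (see Binding descriptor sets). Such an index corresponds to the index of a descriptor set layout in the array used to create the pipeline layout. The same index is given in shaders through the layout(set = <index>, binding = <number>) qualifier to access the given resource.

Usually, multiple pipelines access different resources. During command buffer recording we bind a pipeline and descriptor sets, and only then can we issue drawing commands. When we switch to another pipeline, we need to bind new descriptor sets according to that pipeline's needs, but frequently binding different descriptor sets may hurt application performance. This is why it is good to create pipelines with similar (compatible) layouts and to bind the descriptor sets that change rarely (those common to many pipelines) at indices close to 0, that is, near the start of the layout. Then, when we switch pipelines, the descriptor sets near the start of the pipeline layout (from index 0 to some index N) can still be used and need not be rebound; only the sets at higher indices (after index N) have to be bound again. Note that, to be similar (compatible), pipeline layouts must also use the same push constant ranges.

In short, descriptor sets that are common to many pipelines should be bound near the start of the pipeline layout (near the $0^{th}$ index).

Pipeline layouts also define ranges of push constants, which let us provide a small set of constant values to shaders. They are much faster than updating descriptor sets, but the memory available for them is much smaller: only 128 bytes are guaranteed for all ranges defined in a pipeline layout.

For example, we can give each stage of a graphics pipeline its own range. With five stages sharing the guaranteed 128 bytes, each stage gets 128 / 5 ≈ 25 bytes, which is 24 bytes once rounded down to a multiple of 4. We can also make the same range visible to multiple shader stages, but each shader stage can access only one push constant range.

Splitting push constants between all stages is rarely needed, so the example above is a worst case. Usually there is enough memory for several vec4 values or one or two matrices.

Note that the size and offset of a push constant range must be multiples of 4.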
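The constraints on push constant ranges are easy to check before the pipeline layout is created. The sketch below uses a local PushConstantRange struct in place of VkPushConstantRange (without the stage flags) and a hypothetical IsValidPushConstantRange helper; the 128-byte default is the guaranteed minimum mentioned above, and a real application would more likely pass the limit reported by the physical device.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the offset/size part of VkPushConstantRange.
struct PushConstantRange {
    std::uint32_t offset;
    std::uint32_t size;
};

// Minimum amount of push constant memory guaranteed by the specification.
constexpr std::uint32_t kGuaranteedPushConstantBytes = 128;

// Hypothetical helper: checks the rules stated in the text for one range.
inline bool IsValidPushConstantRange( PushConstantRange const & range,
                                      std::uint32_t max_size = kGuaranteedPushConstantBytes ) {
    // A range must not be empty.
    if( range.size == 0 ) {
        return false;
    }
    // Both offset and size must be multiples of 4.
    if( (range.offset % 4) != 0 || (range.size % 4) != 0 ) {
        return false;
    }
    // The range must fit entirely inside the available memory.
    if( range.offset >= max_size ) {
        return false;
    }
    return range.size <= max_size - range.offset;
}
```

For instance, a 26-byte range is rejected because 26 is not a multiple of 4, while a 24-byte range at offset 0 is accepted.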

VkPipelineLayoutCreateInfo pipeline_layout_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
    nullptr,
    0,
    static_cast<uint32_t>(descriptor_set_layouts.size()),
    descriptor_set_layouts.data(),
    static_cast<uint32_t>(push_constant_ranges.size()),
    push_constant_ranges.data()
};
VkResult result = vkCreatePipelineLayout( logical_device,
                                         &pipeline_layout_create_info, nullptr, &pipeline_layout );
if( VK_SUCCESS != result ) {
    std::cout << "Could not create pipeline layout." << std::endl;
    return false;
}
return true;

Specifying graphics pipeline creation parameters

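Creating a graphics pipeline requires filling a VkGraphicsPipelineCreateInfo structure with many parameters that control different aspects of the pipeline.

At pipeline creation we can provide many VkGraphicsPipelineCreateInfo variables; each specifies the properties of a single pipeline to be created.

After a graphics pipeline is created, it can be bound to a command buffer before a drawing command is recorded. Graphics pipelines can be bound only inside a render pass. During pipeline creation we specify the render pass in which the pipeline will be used, but the same pipeline can also be used with any other render pass that is compatible with it.

Pipelines rarely have no state in common. So, to speed up creation, we can $\color{red}{\text{mark one pipeline as the parent of other pipelines (allow derivatives)}}$ through the basePipelineHandle or basePipelineIndex member of VkGraphicsPipelineCreateInfo.

basePipelineHandle lets us specify the handle of an already created pipeline that should act as the parent.

basePipelineIndex is used when multiple pipelines are created at once. It is an index into the array of VkGraphicsPipelineCreateInfo elements passed to vkCreateGraphicsPipelines() and points to the parent pipeline that is created in the same call as the child. Since they are created together, no handle can be provided. The parent's index must be smaller than the child's index, so the parent comes first in the array.

basePipelineHandle and basePipelineIndex cannot be used at the same time.

Here is an example: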

VkGraphicsPipelineCreateInfo graphics_pipeline_create_info = {
    VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    nullptr,
    additional_options,
    static_cast<uint32_t>(shader_stage_create_infos.size()),
    shader_stage_create_infos.data(),
    &vertex_input_state_create_info,
    &input_assembly_state_create_info,
    &tessellation_state_create_info,
    &viewport_state_create_info,
    &rasterization_state_create_info,
    &multisample_state_create_info,
    &depth_and_stencil_state_create_info,
    &blend_state_create_info,
    &dynamic_state_creat_info,
    pipeline_layout,
    render_pass,
    subpass,
    base_pipeline_handle,
    base_pipeline_index
};

$\color {red}{Creating\ a\ pipeline\ cache\ object}$

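A pipeline object is more than a wrapper around parameters. Creating it involves preparing all the programmable and fixed-function pipeline stages, setting up the interface between shaders and descriptor resources, compiling and linking shader programs, and performing error checking (for example, checking that the shaders link correctly). The results of these steps can be stored in a cache, which can then be reused to speed up the creation of pipeline objects with similar properties.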

VkPipelineCacheCreateInfo

VkPipelineCache

vkCreatePipelineCache

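A pipeline cache stores the results of the pipeline preparation process. It is optional and can be omitted, but it can significantly speed up the creation of pipeline objects.

To use a cache during pipeline creation, we first create a pipeline cache object and pass it to the pipeline creation function. The driver stores its results in the cache automatically, and if the cache already contains data, the driver automatically tries to use it when creating pipelines.

The most common scenario is to store the cache contents in a file and reuse them between separate runs of the same application. On the first run we create an empty cache and all the pipelines we need, then retrieve the cache data and store it in a file. On the next run we create the cache again, but this time initialize it with the data read from the file. For an application that creates only a few pipelines this may not be worth the trouble, but modern 3D applications need a large number of pipelines, and this technique can shorten their start-up time considerably.

Assume the cache data is stored in a cache_data array, which may be empty or may hold data saved from a previously created cache. The pipeline cache is then created as follows: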

VkPipelineCacheCreateInfo pipeline_cache_create_info = {
    VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO,
    nullptr,
    0,
    cache_data.size(), // initialDataSize is a size_t, so no narrowing cast is needed
    cache_data.data()
};
VkResult result = vkCreatePipelineCache( logical_device,
                                        &pipeline_cache_create_info, nullptr, &pipeline_cache );
if( VK_SUCCESS != result ) {
    std::cout << "Could not create pipeline cache." << std::endl;
    return false;
}
return true;

Retrieving data from a pipeline cache

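To reuse a pipeline cache, we need to store its contents so that they can be reused later. To do so, we retrieve the data held in the cache.

vkGetPipelineCacheData

Retrieving pipeline cache contents uses the double-call pattern typical of Vulkan: the first call returns the required size, and the second fills the buffer.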

size_t data_size = 0;
VkResult result = VK_SUCCESS;
result = vkGetPipelineCacheData( logical_device, pipeline_cache,
                                &data_size, nullptr );
if( (VK_SUCCESS != result) ||
   (0 == data_size) ) {
    std::cout << "Could not get the size of the pipeline cache." <<
        std::endl;
    return false;
}
pipeline_cache_data.resize( data_size );
result = vkGetPipelineCacheData( logical_device, pipeline_cache,
                                &data_size, pipeline_cache_data.data());
if( (VK_SUCCESS != result) ||
   (0 == data_size) ) {
    std::cout << "Could not acquire pipeline cache data." << std::endl;
    return false;
}
return true;
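Once the cache data has been retrieved, it has to be written to disk and read back on the next run. The following is a minimal sketch of that step using only the standard library; SaveBinaryFile and LoadBinaryFile are hypothetical helper names, not functions from this recipe or from Vulkan.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Writes the whole buffer to a binary file, replacing any previous contents.
inline bool SaveBinaryFile( std::string const & filename,
                            std::vector<unsigned char> const & data ) {
    std::ofstream file( filename, std::ios::binary | std::ios::trunc );
    if( !file ) {
        return false;
    }
    file.write( reinterpret_cast<char const *>( data.data() ),
                static_cast<std::streamsize>( data.size() ) );
    return file.good();
}

// Reads a whole binary file into the buffer. On failure the buffer is left
// empty, which matches starting with an empty pipeline cache.
inline bool LoadBinaryFile( std::string const & filename,
                            std::vector<unsigned char> & data ) {
    data.clear();
    // Open at the end so that tellg() reports the file size.
    std::ifstream file( filename, std::ios::binary | std::ios::ate );
    if( !file ) {
        return false;
    }
    std::streamsize const size = file.tellg();
    if( size < 0 ) {
        return false;
    }
    file.seekg( 0, std::ios::beg );
    data.resize( static_cast<std::size_t>( size ) );
    if( size > 0 ) {
        file.read( reinterpret_cast<char *>( data.data() ), size );
    }
    if( !file.good() ) {
        data.clear();
        return false;
    }
    return true;
}
```

On the first run LoadBinaryFile fails because the file does not exist yet, leaving the buffer empty, so the cache is simply created without initial data; after the pipelines are created, the retrieved data is passed to SaveBinaryFile.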

Merging multiple pipeline cache objects

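When a large number of pipelines must be created, we can shorten the creation time by splitting the work across multiple threads. Each thread should use a separate pipeline cache. When all threads have finished, we merge their caches into a single cache object so that it can be reused.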

vkMergePipelineCaches

VkResult result = vkMergePipelineCaches( logical_device,
                                        target_pipeline_cache,
                                        static_cast<uint32_t>(source_pipeline_caches.size()),
                                        source_pipeline_caches.data() );
if( VK_SUCCESS != result ) {
    std::cout << "Could not merge pipeline cache objects." << std::endl;
    return false;
}
return true;

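Note that the target cache must not appear in the vector of source caches.

Creating a graphics pipeline

A graphics pipeline controls all drawing-related operations. Through it we specify the shader programs used during drawing, the parameters of various tests (depth, stencil), and how the final color is computed and written to any of the subpass attachments. It is one of the most important objects in Vulkan. We can create one pipeline or several at once.

vkCreateGraphicsPipelines

In the pipeline diagram, white blocks are programmable stages and gray blocks are fixed-function parts.

Some stages are optional. If rasterization is disabled, no fragment shader stage is needed. If the tessellation stage is enabled, both tessellation control and evaluation shaders must be provided.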

VkGraphicsPipelineCreateInfo

VkPipeline

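The array of VkGraphicsPipelineCreateInfo elements and the output array of VkPipeline handles have the same size.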

graphics_pipelines.resize( graphics_pipeline_create_infos.size() );
VkResult result = vkCreateGraphicsPipelines( logical_device,
                                            pipeline_cache,
                                            static_cast<uint32_t>(graphics_pipeline_create_infos.size()),
                                            graphics_pipeline_create_infos.data(), nullptr, graphics_pipelines.data()
                                           );
if( VK_SUCCESS != result ) {
    std::cout << "Could not create a graphics pipeline." << std::endl;
    return false;
}
return true;

Creating a compute pipeline

VkPipelineShaderStageCreateInfo

VkComputePipelineCreateInfo

VkPipeline

vkCreateComputePipelines

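A compute pipeline has only one stage, the compute shader stage (though the hardware may implement additional stages).

A compute shader has only a few built-in input variables and no user-defined inputs or outputs; it can exchange data only through uniform variables (buffers or images). This makes compute shaders more general: they can perform arbitrary mathematical computations, for example on images.

Like graphics pipelines, compute pipelines support derivation from a base pipeline.

Here is a simple example: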

VkComputePipelineCreateInfo compute_pipeline_create_info = {
    VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
    nullptr,
    additional_options,
    compute_shader_stage,
    pipeline_layout,
    base_pipeline_handle,
    -1
};
VkResult result = vkCreateComputePipelines( logical_device, pipeline_cache,
                                           1, &compute_pipeline_create_info, nullptr, &compute_pipeline );
if( VK_SUCCESS != result ) {
    std::cout << "Could not create compute pipeline." << std::endl;
    return false;
}
return true;

Binding a pipeline object

Before we issue a drawing command or dispatch computational work, we need to set all the required state. One such step is binding a pipeline object, either a graphics or a compute pipeline, to the command buffer.

VkCommandBuffer

vkCmdBindPipeline

vkCmdBindPipeline( command_buffer, pipeline_type, pipeline );

example

Creating a pipeline layout with a combined image sampler, a buffer, and push constant ranges

The fragment shader accesses one image (a combined image sampler) and the vertex shader accesses one uniform buffer.

std::vector<VkDescriptorSetLayoutBinding> descriptor_set_layout_bindings =
{
    {
        0,
        VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
        1,
        VK_SHADER_STAGE_FRAGMENT_BIT,
        nullptr
    },
    {
        1,
        VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
        1,
        VK_SHADER_STAGE_VERTEX_BIT,
        nullptr
    }
};
if( !CreateDescriptorSetLayout( logical_device,
                               descriptor_set_layout_bindings, descriptor_set_layout ) ) {
    return false;
}

ranges of push constants

if( !CreatePipelineLayout( logical_device, { descriptor_set_layout },
                          push_constant_ranges, pipeline_layout ) ) {
    return false;
}
return true;

Creating a graphics pipeline with vertex and fragment shaders, depth test enabled, and with dynamic viewport and scissor tests

destroy

This recipe walks through a typical graphics pipeline creation: vertex and fragment shaders, depth test enabled, and viewport and scissor test specified dynamically.

Prepare the vertex and fragment shader stages:

std::vector<unsigned char> vertex_shader_spirv;
if( !GetBinaryFileContents( vertex_shader_filename, vertex_shader_spirv ) )
{
    return false;
}
VkDestroyer<VkShaderModule> vertex_shader_module( logical_device );
if( !CreateShaderModule( logical_device, vertex_shader_spirv,
                        *vertex_shader_module ) ) {
    return false;
}
std::vector<unsigned char> fragment_shader_spirv;
if( !GetBinaryFileContents( fragment_shader_filename, fragment_shader_spirv ) )
{
    return false;
}
VkDestroyer<VkShaderModule> fragment_shader_module( logical_device );
if( !CreateShaderModule( logical_device, fragment_shader_spirv,
                        *fragment_shader_module ) ) {
    return false;
}
std::vector<ShaderStageParameters> shader_stage_params = {
    {
        VK_SHADER_STAGE_VERTEX_BIT,
        *vertex_shader_module,
        "main",
        nullptr
    },
    {
        VK_SHADER_STAGE_FRAGMENT_BIT,
        *fragment_shader_module,
        "main",
        nullptr
    }
};
std::vector<VkPipelineShaderStageCreateInfo> shader_stage_create_infos;
SpecifyPipelineShaderStages( shader_stage_params, shader_stage_create_infos );

Next, choose the vertex bindings and vertex attributes, and specify the input assembly state:

VkPipelineVertexInputStateCreateInfo vertex_input_state_create_info;
SpecifyPipelineVertexInputState( vertex_input_binding_descriptions,
                                vertex_attribute_descriptions, vertex_input_state_create_info );
VkPipelineInputAssemblyStateCreateInfo input_assembly_state_create_info;
SpecifyPipelineInputAssemblyState( primitive_topology,
                                  primitive_restart_enable, input_assembly_state_create_info );

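Viewport and scissor test parameters are important, but because they are set dynamically here, only the number of viewports matters at pipeline creation.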

ViewportInfo viewport_infos = {
    {
        {
            0.0f,
            0.0f,
            500.0f,
            500.0f,
            0.0f,
            1.0f
        }
    },
    {
        {
            {
                0,
                0
            },
            {
                500,
                500
            }
        }
    }
};
VkPipelineViewportStateCreateInfo viewport_state_create_info;
SpecifyPipelineViewportAndScissorTestState( viewport_infos,
                                           viewport_state_create_info );

Then prepare the parameters of the rasterization and multisample states:

VkPipelineRasterizationStateCreateInfo rasterization_state_create_info;
SpecifyPipelineRasterizationState( false, false, polygon_mode,
                                  culling_mode, front_face, false, 0.0f, 1.0f, 0.0f, 1.0f,
                                  rasterization_state_create_info );
VkPipelineMultisampleStateCreateInfo multisample_state_create_info;
SpecifyPipelineMultisampleState( VK_SAMPLE_COUNT_1_BIT, false, 0.0f,
                                nullptr, false, false, multisample_state_create_info );

Depth test. We generally want fragments closer to the camera to be kept, so VK_COMPARE_OP_LESS_OR_EQUAL is used as the comparison operator. The stencil test is assumed to be disabled here.

VkStencilOpState stencil_test_parameters = {
    VK_STENCIL_OP_KEEP,
    VK_STENCIL_OP_KEEP,
    VK_STENCIL_OP_KEEP,
    VK_COMPARE_OP_ALWAYS,
    0,
    0,
    0
};
VkPipelineDepthStencilStateCreateInfo depth_and_stencil_state_create_info;
SpecifyPipelineDepthAndStencilState( true, true,
                                    VK_COMPARE_OP_LESS_OR_EQUAL, false, 0.0f, 1.0f, false,
                                    stencil_test_parameters, stencil_test_parameters,
                                    depth_and_stencil_state_create_info );

blending parameters

VkPipelineColorBlendStateCreateInfo blend_state_create_info;
SpecifyPipelineBlendState( logic_op_enable, logic_op,
                          attachment_blend_states, blend_constants, blend_state_create_info );

list of dynamic states

std::vector<VkDynamicState> dynamic_states = {
    VK_DYNAMIC_STATE_VIEWPORT,
    VK_DYNAMIC_STATE_SCISSOR
};
VkPipelineDynamicStateCreateInfo dynamic_state_create_info;
SpecifyPipelineDynamicStates( dynamic_states, dynamic_state_create_info );

Create the pipeline:

VkGraphicsPipelineCreateInfo graphics_pipeline_create_info;
SpecifyGraphicsPipelineCreationParameters( additional_options,
                                          shader_stage_create_infos, vertex_input_state_create_info,
                                          input_assembly_state_create_info, nullptr, &viewport_state_create_info,
                                          rasterization_state_create_info, &multisample_state_create_info,
                                          &depth_and_stencil_state_create_info, &blend_state_create_info,
                                          &dynamic_state_create_info, pipeline_layout, render_pass,
                                          subpass, base_pipeline_handle, -1, graphics_pipeline_create_info );
if( !CreateGraphicsPipelines( logical_device, {
    graphics_pipeline_create_info }, pipeline_cache, graphics_pipeline ) ) {
    return false;
}
return true;

Multiple threads

Creating multiple graphics pipelines on multiple threads

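Creating a graphics pipeline can take a long time. Shaders are compiled and linked during pipeline creation, and the driver also checks whether the states specified for the shaders are valid. When a large number of pipelines has to be created, it is therefore best to spread the work across multiple threads.

With that many pipelines, a cache should also be used to speed up creation. This section shows how to use a cache when creating multiple pipelines concurrently, and how to merge the caches afterwards.

This section uses the VkDestroyer<> template to destroy unused resources automatically.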
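
The VkDestroyer<> template itself is not shown in this section. As a rough illustration of the idea behind it, the sketch below is a generic RAII wrapper that calls a deleter when it goes out of scope. `ScopedHandle` and its members are hypothetical names, not the actual VkDestroyer<> implementation, and a value-initialized handle (for example 0) is assumed to mean "no object".

```cpp
#include <functional>
#include <utility>

// Hypothetical RAII wrapper: owns a handle and calls a deleter on destruction.
template<typename Handle>
class ScopedHandle {
public:
    using Deleter = std::function<void( Handle )>;

    ScopedHandle() = default;
    explicit ScopedHandle( Deleter deleter ) : deleter_( std::move( deleter ) ) {}
    ~ScopedHandle() { Reset(); }

    // Non-copyable, so that the handle is destroyed exactly once.
    ScopedHandle( ScopedHandle const & ) = delete;
    ScopedHandle & operator=( ScopedHandle const & ) = delete;

    // Movable: ownership is transferred and the source is left empty.
    ScopedHandle( ScopedHandle && other ) noexcept
        : handle_( std::exchange( other.handle_, Handle{} ) ),
          deleter_( std::move( other.deleter_ ) ) {}
    ScopedHandle & operator=( ScopedHandle && other ) noexcept {
        if( this != &other ) {
            Reset();
            handle_ = std::exchange( other.handle_, Handle{} );
            deleter_ = std::move( other.deleter_ );
        }
        return *this;
    }

    // Gives access to the raw handle so that a creation function can fill it in.
    Handle & operator*() { return handle_; }

    // Destroys the owned handle, if there is one.
    void Reset() {
        if( handle_ != Handle{} && deleter_ ) {
            deleter_( handle_ );
        }
        handle_ = Handle{};
    }

private:
    Handle  handle_{};
    Deleter deleter_;
};
```

A wrapper of this kind supports the two usage patterns that appear in the code of this section: move assignment of a freshly constructed wrapper, and dereferencing with `*` to reach the raw handle.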

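Steps

The name of the cache file is stored in std::string pipeline_cache_filename.

Load the cache from the file into std::vector\<unsigned char> cache_data.

Create std::vector\<VkDestroyer\<VkPipelineCache>> pipeline_caches. Create a pipeline cache object for each separate thread and store its handle in pipeline_caches.

Create std::vector\<std::thread> threads and resize it to the number of threads to be started.

Create a variable std::vector\<std::vector\<VkGraphicsPipelineCreateInfo>> graphics_pipelines_create_infos. For each thread, add a new vector to it, holding one VkGraphicsPipelineCreateInfo element for each pipeline that the thread should create.

Create a variable std::vector\<std::vector\<VkPipeline>> graphics_pipelines. Resize each of its member vectors to the number of pipelines created by the corresponding thread.

Start the desired number of threads. Each thread uses logical_device to create its share of the pipelines, with the cache assigned to that thread (pipeline_caches[]) and the vector of VkGraphicsPipelineCreateInfo elements assigned to that thread (graphics_pipelines_create_infos[]).

Wait for all threads to finish.

Create a VkPipelineCache target_cache.

Merge the caches in the pipeline_caches vector into target_cache.

Retrieve the contents of target_cache and store them in the cache_data vector.

Save cache_data to the file pipeline_cache_filename.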

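Creating multiple graphics pipelines requires providing many parameters for many different pipelines.

A pipeline cache is very effective at speeding this up. First, read the previously stored cache from the file, if there is one. Then create a cache for each separate thread, each initialized with the cache contents loaded from the file.
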
> std::vector<unsigned char> cache_data;
> GetBinaryFileContents( pipeline_cache_filename, cache_data );
> std::vector<VkDestroyer<VkPipelineCache>> pipeline_caches(
>     graphics_pipelines_create_infos.size() );
> for( size_t i = 0; i < graphics_pipelines_create_infos.size(); ++i ) {
>     pipeline_caches[i] = VkDestroyer< VkPipelineCache >( logical_device );
>     if( !CreatePipelineCacheObject( logical_device, cache_data,
>                                    *pipeline_caches[i] ) ) {
>         return false;
>     }
> }
>

The next step is to prepare storage for the pipeline handles created by each thread, and to start all the threads, each creating multiple pipelines with its own cache object.

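> std::vector<std::thread> threads( graphics_pipelines_create_infos.size() );
> for( size_t i = 0; i < graphics_pipelines_create_infos.size(); ++i ) {
>     graphics_pipelines[i].resize( graphics_pipelines_create_infos[i].size() );
>     // std::thread copies its arguments; std::cref/std::ref (from <functional>)
>     // pass the create infos and the output vector by reference instead.
>     // Assumes CreateGraphicsPipelines takes its output vector by reference.
>     threads[i] = std::thread( CreateGraphicsPipelines, logical_device,
>                               std::cref( graphics_pipelines_create_infos[i] ), *pipeline_caches[i],
>                               std::ref( graphics_pipelines[i] ) );
> }
>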
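
One detail of the thread-start loop is worth spelling out: std::thread copies the arguments it is given, so on standard-conforming compilers an argument bound to a non-const reference parameter has to be wrapped in std::ref. The sketch below shows the pattern with a stand-in worker. `CreateItems` and `RunWorkers` are hypothetical names, and plain integers stand in for pipeline creation parameters and pipeline handles.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a pipeline creation function: reads its inputs
// and writes one result per input into the caller-provided output vector.
inline bool CreateItems( std::vector<int> const & create_infos, std::vector<int> & results ) {
    results.resize( create_infos.size() );
    for( std::size_t i = 0; i < create_infos.size(); ++i ) {
        results[i] = create_infos[i] * 2;
    }
    return true;
}

// Starts one thread per group of inputs. Each thread writes only to its own
// output vector, so no locking is needed. Returns after all threads finish.
inline std::vector<std::vector<int>> RunWorkers(
        std::vector<std::vector<int>> const & per_thread_infos ) {
    std::vector<std::vector<int>> per_thread_results( per_thread_infos.size() );
    std::vector<std::thread> threads;
    threads.reserve( per_thread_infos.size() );
    for( std::size_t i = 0; i < per_thread_infos.size(); ++i ) {
        // std::cref avoids copying the input; std::ref is required for the output.
        threads.emplace_back( CreateItems,
                              std::cref( per_thread_infos[i] ),
                              std::ref( per_thread_results[i] ) );
    }
    for( std::thread & thread : threads ) {
        thread.join();
    }
    return per_thread_results;
}
```

Because every thread owns its output vector, and in the pipeline case its own cache object, the threads never touch shared mutable state.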

Wait for all threads to finish, then merge all the cache objects into one and store the new contents in the file, replacing what was there before.

> for( size_t i = 0; i < graphics_pipelines_create_infos.size(); ++i ) {
>     threads[i].join();
> }
> VkPipelineCache target_cache = *pipeline_caches.back();
> std::vector<VkPipelineCache> source_caches( pipeline_caches.size() - 1);
> for( size_t i = 0; i < pipeline_caches.size() - 1; ++i ) {
>     source_caches[i] = *pipeline_caches[i];
> }
> if( !MergeMultiplePipelineCacheObjects( logical_device, target_cache,
>                                        source_caches ) ) {
>     return false;
> }
> if( !RetrieveDataFromPipelineCache( logical_device, target_cache,
>                                    cache_data ) ) {
>     return false;
> }
> if( !SaveBinaryFile( pipeline_cache_filename, cache_data ) ) {
>     return false;
> }
> return true;
>
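
GetBinaryFileContents and SaveBinaryFile are helper functions whose bodies are not shown in this section. The following is a minimal sketch of how such helpers could be written with the standard library. The signatures are inferred from the calls above and are an assumption, not the original implementation.

```cpp
#include <fstream>
#include <ios>
#include <iterator>
#include <string>
#include <vector>

// Reads the whole file into 'contents'. Returns false if the file cannot be
// opened (for example on the first run, when no cache has been saved yet).
inline bool GetBinaryFileContents( std::string const & filename,
                                   std::vector<unsigned char> & contents ) {
    contents.clear();
    std::ifstream file( filename, std::ios::binary );
    if( !file ) {
        return false;
    }
    contents.assign( std::istreambuf_iterator<char>( file ),
                     std::istreambuf_iterator<char>() );
    return true;
}

// Writes 'contents' to the file, replacing whatever was stored there before.
inline bool SaveBinaryFile( std::string const & filename,
                            std::vector<unsigned char> const & contents ) {
    std::ofstream file( filename, std::ios::binary | std::ios::trunc );
    if( !file ) {
        return false;
    }
    file.write( reinterpret_cast<char const *>( contents.data() ),
                static_cast<std::streamsize>( contents.size() ) );
    return file.good();
}
```

In this sketch a missing file leaves cache_data empty, so the per-thread caches simply start out empty on the first run.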

Destroy

Destroying a pipeline

if( VK_NULL_HANDLE != pipeline ) {
    vkDestroyPipeline( logical_device, pipeline, nullptr );
    pipeline = VK_NULL_HANDLE;
}

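Before destroying a pipeline, make sure that all commands using it have finished executing (for example, by waiting on fences).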

Destroying a pipeline cache

A pipeline cache can be destroyed once it has been used to create pipelines, to merge cache data, or to retrieve its contents.

if( VK_NULL_HANDLE != pipeline_cache ) {
    vkDestroyPipelineCache( logical_device, pipeline_cache, nullptr );
    pipeline_cache = VK_NULL_HANDLE;
}

Destroying a pipeline layout

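A pipeline layout can be destroyed when it is no longer needed: when we do not intend to create more pipelines with it, bind descriptor sets or update push constants that use it, and all operations using the layout have finished.

Pipeline layouts are used in only three situations: creating pipelines, binding descriptor sets, and updating push constants. In the first case the layout can be destroyed as soon as the pipelines have been created. In the other two it can be destroyed once the hardware has stopped executing the command buffers in which it was used.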

if( VK_NULL_HANDLE != pipeline_layout ) {
    vkDestroyPipelineLayout( logical_device, pipeline_layout, nullptr );
    pipeline_layout = VK_NULL_HANDLE;
}

Destroying a shader module

Shader modules are used only for creating pipeline objects. Once the pipelines have been created, the modules can be destroyed immediately.

if( VK_NULL_HANDLE != shader_module ) {
    vkDestroyShaderModule( logical_device, shader_module, nullptr );
    shader_module = VK_NULL_HANDLE;
}