---
title: The Graphics Pipeline
date: "April 20 - 2025"
---

<script>
import Image from "../Image.svelte"
import Note from "../Note.svelte"
import Tip from "../Tip.svelte"
</script>

Ever wondered how games put all that gore on your display? All that beauty is brought to life by
a process called **rendering**, and at the heart of it is the **graphics pipeline**.
In this article we'll dive deep into the intricate details of this powerful beast.

We'll cover all the terminology needed to understand each stage, with plenty of restatements, so don't
worry if you don't fully grasp something at first. If you still have questions, feel free to contact me :)

So without further ado---

## Overview

Like any pipeline, the **graphics pipeline** comprises
several **stages**, each of which can be a pipeline in itself or even parallelized.
Each stage takes some input (data and configuration) to generate some output data for the next stage.

<Note title="A coarse division of the graphics pipeline", type="diagram">

Application --> Geometry Processing --> Rasterization --> Pixel Processing --> Presentation
</Note>

Before the heavy rendering work starts on the <Tip text="GPU">Graphics Processing Unit</Tip>,
we simulate and update the world through **systems** such as the physics engine, game logic, networking, etc.
during the **application** stage.
This stage mostly runs on the <Tip text="CPU">Central Processing Unit</Tip>,
which is extremely efficient at executing <Tip text="sequentially dependent logic">
A type of execution flow where the operations depend on the results of previous steps, limiting parallel execution.
In other words, **CPUs** are great at executing **branch-heavy** code, and **GPUs** are geared
towards executing a TON of **branch-less** or **branch-light** code in parallel. </Tip>.

The updated scene data is then prepped and fed to the **GPU** for **geometry processing**. Here
we figure out where everything ends up on our screen by doing lots of fancy matrix math.
We'll cover this stage in depth very soon so don't panic (yet).

Afterwards, the final geometric data is converted into <Tip text="pixels"> Pixel is the shorthand for **picture-element**, Voxel is the shorthand for **volumetric-element**. </Tip>
and prepped for the **pixel processing** stage via a process called **rasterization**.
In other words, this stage converts a rather abstract and internal representation (geometry)
into something more concrete (pixels). It's called rasterization because the end product is a <Tip text="raster">Noun. A rectangular pattern of parallel scanning lines followed by the electron beam on a television screen or computer monitor. -- 1930s: from German Raster, literally ‘screen’, from Latin rastrum ‘rake’, from ras- ‘scraped’, from the verb radere. ---Oxford Languages</Tip> of pixels.

The **pixel processing** stage then uses the rasterized geometry data (pixel data) to do **lighting**, **texturing**,
and all the sweet gory details of a scene (like a murder scene).
This stage is often, but not always, the most computationally expensive.
A huge problem that a good rendering engine needs to solve is how to be **performant**, and a great deal
of **optimization** can be done by **culling** the work that we deem unnecessary/redundant in each
stage before it's passed on to the next. More on **culling** later, don't worry (yet 🙂).

The pipeline will then serve (present) the output of the **pixel processing** stage, which is a **rendered image**,
to your pretty eyes 👁👄👁 using your <Tip text="display">Usually a monitor but the technical term for it is
the target **surface**. Which can be anything like a VR headset or some other crazy surface used for displaying purposes.</Tip>.
But to avoid drowning you in overviews, let's jump right into the gory details of the **geometry processing**
stage and have a recap afterwards!

## Surfaces

Ever been jump-scared by this sight in an <Tip text="FPS">First Person (Shooter) perspective</Tip>? Why are (the insides of) things rendered like that?

<Image
  paths={["/images/boo.png"]}
/>

In order to display a (murder) scene,
we need to have a way of **representing** the **surface** of its composing objects (like corpses) in computer memory.
We only care about the **surface** since we won't be seeing the insides anyway---Not with that attitude.
At this stage, we only care about the **shape** or the **geometry** of the **surface**.
Texturing, lighting, and all the sweet gory details come at a much later stage once all the **geometry** has been processed.

But how do we represent surfaces in computer memory?

## Vertices

There are several ways to **represent** the surfaces of 3D objects for a computer to understand.
For instance, <Tip text="NURBS">
**Non-uniform rational basis spline** is a mathematical model using **basis splines** (B-splines) that is commonly used in computer graphics for representing curves and surfaces. It offers great flexibility and precision for handling both analytic (defined by common mathematical formulae) and modeled shapes. ---Wikipedia</Tip> surfaces are great for representing **curves**, and offer the
**high precision** needed for <Tip text="CAD">Computer-Aided Design</Tip>. We could also do **ray-tracing** using fancy equations for
rendering **photo-realistic** images.

These are all great---ignoring the fact that they would take an eternity to process...
But what we need is a **performant** approach that can do this for an entire scene with
hundreds of thousands of objects (like a lot of corpses) in a small fraction of a second. What we need is **polygonal modeling**.

**Polygonal modeling** enables us to do an exciting thing called **real-time rendering**. The idea is that we only need an
**approximation** of a surface to render it **realistically enough** for us to have some fun killing time!
We can achieve this approximation using a collection of **triangles**, **lines**, and **dots** (primitives),
which themselves are composed of a series of **vertices** (points in space).

<Image
  paths={["/images/polygon_sphere.webp"]}
/>

A **vertex** is simply a point in space.
Once we get enough of these **points**, we can connect them to form **primitives** such as **triangles**, **lines**, and **dots**.
And once we connect enough of these **primitives** together, they form a **model** or a **mesh** (that we need for our corpse).
With some interesting models put together, we can compose a **scene** (like a murder scene :D).
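
To make this a bit more concrete, here's a minimal sketch (with hypothetical names, not any particular engine's layout) of what vertices, primitives, and meshes might look like in memory:

```cpp
#include <array>
#include <vector>

// A vertex, at its simplest, is just a point in 3D space.
// Real engines usually pack in more attributes (normals, texture coordinates, colors, ...).
struct Vertex {
    float x, y, z;
};

// A triangle is a primitive built from three vertices.
struct Triangle {
    std::array<Vertex, 3> corners;
};

// A mesh (model) is just a pile of triangles approximating a surface.
struct Mesh {
    std::vector<Triangle> triangles;
};
```

In practice we don't store three full vertices per triangle; shared vertices are stored once and referenced by index, which we'll get to in the **Indices** section.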

<Image
  paths={["/images/bunny.jpg"]}
/>

But let's not get ahead of ourselves. The primary type of **primitive** that we care about during **polygonal modeling**
is a **triangle**. But why not squares or polygons with a variable number of edges?

## Why Triangles?

In <Tip text="Euclidean geometry"> Developed by **Euclid** around 300 BCE, is based on five axioms. It describes properties of shapes, angles, and space using deductive reasoning. It remained the standard model of geometry for centuries until non-Euclidean geometries and general relativity showed its limits. It's still widely used in education, engineering, and **computer graphics**. ---Wikipedia </Tip>, triangles are always **planar** (they exist only in one plane);
any polygon composed of more than three points may break this rule. But why is a polygon residing in one plane so important
to us?

<Image
  paths={["/images/planar.jpg", "/images/non_planar_1.jpg", "/images/non_planar_2.png"]}
/>

When a polygon exists only in one plane, we can safely imply that **only one face** of it can be visible
at any one time; this enables us to utilize a huge optimization technique called **back-face culling**,
which means we avoid wasting a ton of **precious processing time** on the polygons that
we know won't be visible to us. We can safely **cull** the **back-faces** since we won't
be seeing the **back** of a polygon when it's in the context of a closed-off model.
We figure this out by simply using the **winding order** of the triangle to determine whether we're looking at the
back of the triangle or the front of it.
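
As a rough illustration (not the exact code a GPU runs), here's how a winding-order test might look once a triangle has been projected to 2D screen space, assuming a counter-clockwise winding convention for front faces:

```cpp
struct Vec2 {
    float x, y;
};

// Twice the signed area of the screen-space triangle (a, b, c).
// Its sign reveals the winding order: positive means counter-clockwise
// (in a Y-up coordinate system), negative means clockwise.
float signedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// With a counter-clockwise front-face convention, clockwise triangles
// are back-faces and can be culled before any expensive pixel work happens.
bool isBackFace(Vec2 a, Vec2 b, Vec2 c) {
    return signedArea2(a, b, c) < 0.0f;
}
```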

Triangles also have a very small **memory footprint**; for instance, when using the **triangle-strip** topology (more on this very soon), for each additional triangle after the first one, only **one extra vertex** is needed.

The most important attribute, in my opinion, is the **algorithmic simplicity**.
Any polygon or shape can be composed from a **set of triangles**; for instance, a rectangle is simply **two coplanar triangles**.
Also, it is a common practice in computer science to break down hard problems into simpler, smaller problems.
This will be a lot more convincing when we cover the **rasterization** stage :)
<Note title="Bonus point: evolution", type="info">
|
||
|
||
present-day **hardware** and **algorithms** have become **extremely efficient** at processing
|
||
triangles by doing operations such as sorting, rasterizing, etc, after eons of evolving around them.
|
||
|
||
</Note>
|
||
|
||

## Primitive Topology

So, we've got our set of vertices, but having a bunch of points floating around wouldn't make a scene very lively
(or gory); we need to form **triangles** out of them to compose **models** (corpse xd).

We communicate to the computer the <Tip text="topology"> The way in which constituent parts are interrelated or arranged.--mid 19th century: via German from Greek topos ‘place’ + -logy.---Oxford Languages </Tip>
of the primitives to be generated from our set of vertices by
configuring the **primitive topology** of the **input assembler**.
We'll get into the **input assembler** bit in a second, but let's clarify the topology with some examples.

**Point list**:

When the topology is point list, each **consecutive vertex** defines a **single point** primitive, according to the equation:

<Note title="equation", type="math">

```math
p_i = \{ v_{i} \}
```

</Note>

**Line list**:

When the primitive topology is line list, each **consecutive pair of vertices** defines a **single line**, according to the equation:

<Note title="equation", type="math">

```math
p_i = \{ v_{2i},\ v_{2i+1} \}
```

</Note>

The number of primitives generated is equal to ⌊vertex_count / 2⌋.

**Line strip**:

When the primitive topology is line strip, **one line** is defined by each **vertex and the following vertex**, according to the equation:

<Note title="equation", type="math">

```math
p_i = \{ v_i, v_{i+1} \}
```

</Note>

The number of primitives generated is equal to max(0, vertex_count - 1).

**Triangle list**:

When the primitive topology is triangle list, each **consecutive set of three vertices** defines a **single triangle**, according to the equation:

<Note title="equation", type="math">

```math
p_i = \{ v_{3i}, v_{3i+1}, v_{3i+2} \}
```

</Note>

The number of primitives generated is equal to ⌊vertex_count / 3⌋.

**Triangle strip**:

When the primitive topology is triangle strip, **one triangle** is defined by each **vertex and the two vertices that follow it**, according to the equation:

<Note title="equation", type="math">

```math
p_i = \{ v_i,\ v_{i + (1 + i \bmod 2)},\ v_{i + (2 - i \bmod 2)} \}
```

</Note>

The number of primitives generated is equal to max(0, vertex_count - 2).

**Triangle fan**:

When the primitive topology is triangle fan, triangles are defined **around a shared common vertex**, according to the equation:

<Note title="equation", type="math">

```math
p_i = \{ v_{i+1}, v_{i+2}, v_0 \}
```

</Note>

The number of primitives generated is equal to max(0, vertex_count - 2).

There are also some topologies suffixed with the word **adjacency**, and a special type of primitive called the **patch** primitive.
But for the sake of simplicity we won't get into them.
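
To tie the equations above together, here's a small sketch (a hypothetical helper, not tied to any particular API) that expands a vertex count into triangle index triplets for the list, strip, and fan topologies:

```cpp
#include <array>
#include <cstdint>
#include <vector>

enum class Topology { TriangleList, TriangleStrip, TriangleFan };

// Expands a vertex count into one index triplet per triangle,
// following the equations above.
std::vector<std::array<uint32_t, 3>> assembleTriangles(Topology topology, uint32_t vertexCount) {
    std::vector<std::array<uint32_t, 3>> triangles;

    switch (topology) {
    case Topology::TriangleList:
        // p_i = { v_3i, v_3i+1, v_3i+2 } -> floor(vertexCount / 3) triangles
        for (uint32_t i = 0; 3 * i + 2 < vertexCount; ++i)
            triangles.push_back({3 * i, 3 * i + 1, 3 * i + 2});
        break;

    case Topology::TriangleStrip:
        // p_i = { v_i, v_(i + 1 + i mod 2), v_(i + 2 - i mod 2) } -> max(0, vertexCount - 2) triangles
        for (uint32_t i = 0; i + 2 < vertexCount; ++i)
            triangles.push_back({i, i + 1 + i % 2, i + 2 - i % 2});
        break;

    case Topology::TriangleFan:
        // p_i = { v_(i+1), v_(i+2), v_0 } -> max(0, vertexCount - 2) triangles
        for (uint32_t i = 0; i + 2 < vertexCount; ++i)
            triangles.push_back({i + 1, i + 2, 0});
        break;
    }

    return triangles;
}
```

The point and line topologies follow the exact same pattern, just with single indices or pairs instead of triplets.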

So what's next?

## Indices

Great, we've got our vertices and we've figured out how to connect them, but there's one last thing we need
to understand before we can **assemble** our input using the **input assembler**: the **indices**.
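
In short, rather than duplicating a vertex every time a triangle touches it, we can store each unique vertex once and let a separate list of **indices** spell out which vertices form each triangle. A quad is the classic example; here's a minimal sketch with made-up data:

```cpp
#include <cstdint>
#include <vector>

struct Vertex {
    float x, y, z;
};

// Four unique vertices are enough for a quad...
std::vector<Vertex> vertices = {
    {-0.5f, -0.5f, 0.0f},  // 0: bottom-left
    { 0.5f, -0.5f, 0.0f},  // 1: bottom-right
    { 0.5f,  0.5f, 0.0f},  // 2: top-right
    {-0.5f,  0.5f, 0.0f},  // 3: top-left
};

// ...and six indices describe the two triangles that share an edge,
// instead of storing six full (and partially duplicated) vertices.
std::vector<uint32_t> indices = {
    0, 1, 2,  // first triangle
    2, 3, 0,  // second triangle
};
```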

## **Input Assembler**

Every section before this explained the terminology needed to grasp this one.
Sections colored in yellow are concrete pipeline stages where some code gets executed,
processing the data we feed to it based on the configuration we set on it.

The **vertices** and **indices** are provided to this stage via something we call buffers.
So technically we have to provide **two** buffers here: a **vertex buffer** and an **index buffer**.
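
For a taste of what this looks like in practice, here's a rough Vulkan-flavored sketch (Vulkan is one of this article's sources, but any modern API has an equivalent); it assumes the command buffer and the two buffer handles have already been created and filled elsewhere:

```cpp
#include <vulkan/vulkan.h>

// Part of the pipeline's fixed-function configuration: tell the input
// assembler how to build primitives out of the vertices it receives.
VkPipelineInputAssemblyStateCreateInfo makeInputAssemblyState() {
    VkPipelineInputAssemblyStateCreateInfo inputAssembly{};
    inputAssembly.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO;
    inputAssembly.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
    inputAssembly.primitiveRestartEnable = VK_FALSE;
    return inputAssembly;
}

// At draw time: bind the vertex and index buffers, then issue an indexed draw.
void recordQuadDraw(VkCommandBuffer cmd, VkBuffer vertexBuffer, VkBuffer indexBuffer) {
    VkDeviceSize offset = 0;
    vkCmdBindVertexBuffers(cmd, 0, 1, &vertexBuffer, &offset);
    vkCmdBindIndexBuffer(cmd, indexBuffer, 0, VK_INDEX_TYPE_UINT32);
    // 6 indices -> the two triangles of the quad from the previous sketch.
    vkCmdDrawIndexed(cmd, 6, 1, 0, 0, 0);
}
```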

To give you yet another overview, this is the diagram of the **geometry processing** section of
our pipeline:

<Note title="Geometry Processing Pipeline", type="diagram">

Draw --> Input Assembler --> Vertex Shader --> Tessellation Control Shader --> Tessellation Primitive Generator --> Tessellation Evaluation Shader --> Geometry Shader --> Vertex Post-Processing --> ... Rasterization ...
</Note>

## Coordinate System -- Local Space

## Coordinate System -- World Space

## Coordinate System -- View Space

## Coordinate System -- Clip Space

## Coordinate System -- Screen Space

## Vertex Shader

## Tessellation & Geometry Shaders

## Let's Recap!

## Rasterizer

## Pixel Shader

## Output Merger

## The Future

## Conclusion

## Sources
<Note title="Reviewers", type="review">
|
||
|
||
Mohammad Reza Nemati
|
||
</Note>
|
||
|
||
<Note title="Books", type="resource">
|
||
|
||
[Tomas Akenine Moller --- Real-Time Rendering 4th Edition](https://www.realtimerendering.com/intro.html)
|
||
[JoeyDeVriez --- LearnOpenGL - Hello Triangle](https://learnopengl.com/Getting-started/Hello-Triangle)
|
||
[JoeyDeVriez --- LearnOpenGL - Face Culling](https://learnopengl.com/Advanced-OpenGL/Face-culling)
|
||
</Note>
|
||
|
||
<Note title="Wikipedia", type="resource">
|
||
|
||
[Polygonal Modeling](https://en.wikipedia.org/wiki/Polygonal_modeling)
|
||
[Non-uniform Rational B-spline Surfaces](https://en.wikipedia.org/wiki/Non-uniform_rational_B-spline)
|
||
[Computer Aided Design (CAD)](https://en.wikipedia.org/wiki/Computer-aided_design)
|
||
[Rasterization](https://en.wikipedia.org/wiki/Rasterisation)
|
||
[Euclidean geometry](https://en.wikipedia.org/wiki/Euclidean_geometry)
|
||
</Note>
|
||
|
||
<Note title="Youtube", type="resource">
|
||
|
||
...
|
||
|
||
</Note>
|
||
|
||
<Note title="Stackoverflow", type="resource">
|
||
|
||
[Why do 3D engines primarily use triangles to draw surfaces?](https://stackoverflow.com/questions/6100528/why-do-3d-engines-primarily-use-triangles-to-draw-surfaces)
|
||
</Note>
|
||
|
||
<Note title="Vulakn Docs", type="resource">
|
||
|
||
[Drawing](https://docs.vulkan.org/spec/latest/chapters/drawing.html)
|
||
[Pipeline Diagram](https://docs.vulkan.org/spec/latest/_images/pipelinemesh.svg)
|
||
</Note>
|