Coordinates & Transforms in 2D
2 Coordinates & Transforms in 2D
2.1 A slightly boring, laborious example
The Introduction mentioned that all the code this tutorial deals with is fragment shaders (or pixel shaders) in GLSL. For our sake, what that means is that we are dealing with short programs which create images (and eventually animations) by explicitly giving the color of every pixel in the image.
This may be a bit different than what you are used to in more general programming languages. You are likely used to programs for which, at any given point in time, variables have a specific value and the program is executing some specific part of the code.
Fragment shaders deviate from this a bit, and require a bit of a different mindset. The way we use them here, they can be thought of as programs which run simultaneously over every pixel in the image, and likewise, have variables which take on different values for every pixel. (See Stream processing for a little more in-depth information on this.)
The below code gives a very simple example of this:
Click Use code, and then go to the Render tab. You should see a gradient - green at the top left corner, yellow at the top right, red at the bottom right, black at the bottom left, and colors blended everywhere between those.
Click back to the Fragment Shader tab (ignore the Parameters tab
for now). The code might look familiar if you've used something like
C, C++, or Java, but with a few unfamiliar constructs. vec2
and
vec4
are vector types containing 2 elements and 4 elements,
respectively (more on that later). For now, just treat a vec2
as a
structure with fields x
and y
, and a vec4
as a structure with
fields r
, g
, b
, and a
, all of them floating-point values.
gl_FragCoord
is one of those variables which takes on a different
value for every pixel in the image; the .xy
provides us a vec2
of
just the X and Y coordinates of each pixel. resolution
, on the
other hand, is a uniform variable - it has the same value over all
pixels. In this case, it's something the environment provides to tell
us the total number of pixels along X and Y.
gl_FragCoord.xy / resolution.xy
produces a new vec2
by dividing
gl_FragCoord.x
by resolution.x
and gl_FragCoord.y
by
resolution.y
. Try the code below if you want to verify yourself
that it's the same as splitting them out:
Thus, the uv
vector (or u
and v
separately above) contains X and
Y coordinates that each have been normalized range from 0 to 1.
Since it's dividing by the total resolution, it works identically at
any resolution (resize the render window and see, or click the
lower-right icon to make it full-screen). This gives us a normal
(x,y) coordinate plane (or Cartesian plane) where the lower-left
point is (0,0), moving to the right increases the X coordinate, and
moving up increases the Y coordinate - up to the top-right corner
which is (1,1). This is a fairly standard transformation to see
anytime pixel locations are involved.
Finally, gl_FragColor
is where we assign the pixel's respective
color into a vec4
by giving, in order, its red, green, blue, and
alpha channel values (and we ignore alpha here, as it refers to
transparency and doesn't make sense to use here). Each value is from
0 to 1, and values outside that range are clipped inside of it; try
messing with the values (e.g. replacing u
with u * 4.0
) to verify
this.
The result, then, is that we've assigned the red channel of each pixel to its X coordinate, and the green channel to the Y coordinate. The blue channel was just left at 0. Thus, at the very lower-left, we'd expect to see black (red and green are both 0 because X and Y are zero), and moving up to the top-right, we'd expect to see yellow (red and green are both at their maximum of 1).
2.2 What can this actually do?
You may be complaining that a colored gradient is a rather uninteresting thing to draw, and it is. However, with some additional math, this method of creating graphics is remarkably flexible.
First off, though, we must be able to create some basic shapes and
constructs to demonstrate things. We've already shown how the
computation of uv
above can create simple color gradients - but
color gradients by their inherent fuzziness make it difficult to show
any sort of sharp details or well-defined shapes.
So, consider the below code. The uv.x
line isn't especially
important, but it will appear again and again, and it is there to
correct the aspect ratio of the image. In our case, all it means is
that uv.x
will be scaled so that a given step in uv.x
is the same
distance as in uv.y
, and our shapes aren't squashed.
The important line is the definition of uv_mod
: we've scaled up uv
by a certain amount in both X and Y, and then used modulo to turn it
to a repeating pattern again in X and Y - across some distance, it
rises from 0 to 1, and then goes back to 0.
You should be able to see a sort of grid pattern emerging, and playing
with the value of count
, or changing the 1.0
in mod(..., 1.0)
to
something else, should produce some effects that make sense. However,
if you look, you'll see it's still just a bunch of smaller gradients.
Try looking just at uv_mod.x
or just uv_mod.y
(i.e. change
gl_FragColor
so that the red, green, and blue channel are all
uv_mod.x
, and then so they all are uv_mod.y
).
Now consider: How could we turn this into a grid with sharp lines,
instead of gradients? Try to make sense of the below, and change
values like thickness
to something else:
This is a fairly small change from the previous code. If you're not
familiar, uv_mod
is now defined using the ternary operator that is
commonly used in C. In simple, it has forced the darker parts (see
thickness
) to be uniformly light, and the parts other than that to
be uniformly black. However, the X and Y grid lines are still
separated out, and we may simply add them together to give white grid
lines:
2.3 Implicit functions
Perhaps it wasn't obvious why this is significant, but to try to explain it further: We just used an implicit function to draw lines. Rather than drawing lines by iteratively walking along pixel coordinates and darkening certain ones according to a line's formula, we started with a formula that was something like:
\begin{equation} f(x,y)=I(x\mod C)+I(y\mod C) \mathrm{ where}\\ I(a) = \begin{cases} a < \epsilon & : 1\\ a \ge \epsilon & : 0 \end{cases} \end{equation}
and then evaluated this over every pixel, using \(C\) as basically
count
and \(\epsilon\) as thickness
. We've taken some liberties, in
that we're comparing with \(\epsilon\) rather than 0, but that is due to
using floating point and a discrete number of pixels. If we break this
(mathematical) function out into a separate (GLSL) function:
Put another way, we just drew isolines of that function.
More about this implicit functions will follow later. For now, we
just use it to create a grid to help illustrate some transformations.
Below is one that may be familiar: It converts the coordinates we're
using already - rectangular, or Cartesian, coordinates in uv
- to
polar coordinates. uv
is also rescaled and moved so that the center
is (0,0). Then, we use polar rather than rectangular coordinates to
draw this grid.
Note that, despite the notation, fields x
and y
of polar
now
stand for radius and angle, not X and Y coordinates.
We started with a grid made of equally-sized squares. The above should give some intuitive sense of how the conversion to polar coordinates transformed space: The grid "squares" are now differently-sized sections of a circle.
2.4 Animation & Mouse Input
Up to this point, we've just been rendering images that don't change over time or in response to any input (aside from you editing the code). The below code makes a couple modifications to the last example to change this:
(The grid
function was rewritten to have lines with softer edges as
a primitive anti-aliasing approach. You can try to understand how it
works, but it doesn't much matter. If by chance it runs too slowly on
your GPU, feel free to copy & paste the old grid
definition in
instead.)
To see the render react to input, click or drag in the render window.
The environment provides two new variables - mouse
and time
-
which give, respectively, the mouse location and a time value (which
simply counts up in seconds). (mouse
, for whatever reason, uses a
different coordinate system than the rest of the image, so above we
remedy this when we compute m
and then we do the same transformation
to normalize this mouse position as we do each pixel coordinate.)
There are two buttons at the bottom center of the render window which let you stop animation completely, and pause or resume it. It will continue to react to mouse input even when paused or stopped.
In the last example, the center of the image was (0,0) (in both rectangular and polar coordinates, incidentally); in this one, note that by subtracting the mouse position, wherever you click is the new origin.
See the changes to polar.x
and polar.y
for the source of the
animation; change how time
is used (adding vs. subtracting, and
dividing by larger and smaller numbers), and it should make sense how
this works.
2.5 Control Inputs
I mentioned uniform variables earlier, and we've dealt with a couple
of them like resolution
, mouse
, and time
. We've also introduced
some code which uses parameters here and there that (so far) had to be
edited manually in the code.
Helpfully, the environment we're working in also provides the ability to create our own uniform variables and adjust them in the renderer. Click Use code below and then look at the Parameters tab.
Note that the code (the Fragment Shader code) is near-identical to the last example, except that:
- Two
uniform float
variables are now declared, and their names (freq_rad
andfreq_ang
). - The names of the above two variables are identical to the
name
field in the JSON definitions in the Parameters tab. - The lines with
polar.x
andpolar.y
now use the uniform
Click Render and see that a controls panel should now be visible at
the top right (you may need to click Open Controls). It should have
sliders with the names you supplied in GUIName
in the JSON
definitions. Adjust these sliders, and you should see parts of the
animation proceed at different speeds (and directions, if you change
between positive and negative).
For some information on this type of variable, see pocket.gl documentation.
2.6 Function composition
In the prior examples, we achieved some simple transformations on
graphics by starting with the \((r,\theta)\) of polar coordinates above,
we achieved rotation by turning \(\theta\) to \(\theta+f_1t\) (where \(t\)
is something like the variable time
, and \(f_1\) like freq_ang
), and
we achieved a sort of radial expansion or contraction by turning \(r\)
to \(r+f_2t\). If you look a little deeper, we're using the same method
to move the graphics around by turning rectangular coordinates \((x,y)\)
to \((x-x_m, y-y_m)\), where \((x_m, y_m)\) are the mouse coordinates.
More generally, in both cases we were changing the domain of some function (in mostly the mathematical sense of the term) by composing another function with it. Fundamentally, this fragment shader is just a function which turns domain \((X,Y)\) - the pixel location - to codomain \((r,g,b)\) - the pixel's color.
From this standpoint, we started out with a function itself made by composing together several other functions in order (if "compose" feels too odd of a term here, just think of it like steps a pipeline):
- Converting pixel location
gl_FragCoord
to normalized pixel locationuv
. - Converting
uv
to polar coordinatespolar
. - Using
polar
to create a grid with functiongrid
(consisting of just values 0 and 1). - Turning those 0 & 1 grid values into \((r,g,b)\) in
gl_FragColor
.
As we added in the effect of the mouse and the animation, we spliced in additional functions - first in between 1 and 2 for the mouse, then in between 2 and 3 for the animation.
To give another example, let's start with the simple rectangular grid
we've seen before - but warp the domain of grid
a little. Look
particularly at the sine_warp
function, and note that what in place
of uv
(the input to grid
) we've instead used uv
as the input to
sine_warp
, and put its output in place of uv
.
Click around in the render, and also adjust the sliders. Note that adjusting Ampl to zero completely eliminates the effect.
sine_warp
might take some explanation:
- It converts the input coordinates (assumed to be rectangular
coordinates) to polar coordinates, \((r,\theta)\) -
polar.x
andpolar.y
in the code. - It distorts radius \(r\) by adding a sinusoid to it. This sine wave's
amplitude is
ampl
, it varies with radius by frequencyfreq
, and it has phase adjusted byphase
. Note that ifampl=0
, \(r\) is simply left as-is. - It converts the distorted radius and original angle back to rectangular coordinates, and returns that.
Thus, we're distorting the entire grid based on its radius - i.e. its distance from whatever the center of the coordinate space is, which in this case is defined by the mouse click. The nature of that distortion is that peaks in the sinewave occur at certain distances, and at those distances we've stretched space outward a bit (note how the grid lines are further apart); troughs in the sinewave occur at other distances, and at those distances we've forced space inward a bit (note how the grid lines are closer together). Since the pixels that are a specific distance from some point form a circle, this gives a sort of circular distortion which repeats as you move outward.
By typing time
to this sinewave's phase (with freq_t
to set how
fast that phase varies), we add some animation to it.
This is all trying to convey intuitively that when you have a function that turns a point in space to something (such as color in this case), warping that function's domain is distorting or transforming space in some way.
We can transform the space multiple times too:
The below likewise transforms space twice, but in different ways this
time. angle_warp
distorts the angle in polar coordinates,
proportional to the radius; play with the AngCoef
parameter to see
the effect.
We can swap out grid
for any other function and see space distorted
in the same way; the grid was just used there to make it more obvious
how we are distorting space itself. The below code does the same
transformations - but on the color gradient that was used early on in
this chapter.
2.7 Summary
The intention of this chapter was to give a crash course in how shaders work in this context, how functions are a meaningful representation for graphics, and how transformations on the function's domain can be visualized as warping space.
This whole chapter was focused on the use of shaders to create 2D graphics. The next chapter extends this to their use in creating a 3D renderer.