Lecture 7: Image Sources, Convolution, Scene Graphs
Click here to see today's slides
Today we went over all of the math that's needed for Group Assignment 1. This includes the "image sources" algorithm for modeling specular reflections (angle in = angle out) of any order in an environment with a bunch of polygons. It also includes "convolution," which is a process for simulating what an environment does to a sound after bouncing it around and contributing lots of echoes. The final piece of the puzzle is the "scene graph," which is a way of organizing a complex hierarchy of transformations that occurs in real world virtual environment modeling. With these three pieces and the vector math we've discussed so far, we have enough to complete a very fun assignment on virtual sound simulation in 3D environments.
Before discussing the image sources algorithm, it's important to discuss how to compute "specular reflections." These are reflections in which an incoming ray is transformed into an outgoing ray, such that the angle that they make with the reflecting plane is the same. For ease of illustration in the examples below, a line will be drawn in place of a plane, but the same math applies in any Euclidean dimension since the dot product generalizes (and in your assignment you will be doing this in 3D)
By complementary angles, another way to say this is that
vin and
vout must make the same angle with the normal to the line,
n:
If we
project vin onto the line normal
n, we get a perpendicular component
vperp
\[ \vec{v_{\perp}} = \left( \frac{\vec{v_{\text{in}}} \cdot \vec{n}}{\vec{n} \cdot \vec{n}} \right) \vec{n} \]
If we write the component of
vin parallel to the line as
vpar, then we can decompose
vin as
\[ \vec{v_{\text{in}}} = \vec{v_{\parallel}} + \vec{v_{\perp}} \]
As shown in the picture below,
vout can be written the same way, except the perpendicular component is flipped (the parallel component stays exactly the same
Thus,
\[ \vec{v_{\text{in}}} = \vec{v_{\parallel}} - \vec{v_{\perp}}\]
Making some substitutions, in terms of
vin and the normal
n,
vout is, therefore,
\[ \vec{v_{\text{out}}} = \vec{v_{in}} - 2\vec{v_{\perp}} \]
\[ \vec{v_{\text{out}}} = \vec{v_{in}} - 2\left( \frac{\vec{v_{\text{in}}} \cdot \vec{n}}{\vec{n} \cdot \vec{n}} \right) \vec{n} \]
If
n has been normalized (so that its magnitude is 1), then this can be written more simply as
\[ \vec{v_{\text{out}}} = \vec{v_{in}} - 2\left( \vec{v_{\text{in}}} \cdot \vec{n} \right) \vec{n} \]
And that's all there is to it! Now we can move onto an algorithm that can find these types of paths from a source to a receiver in a very efficient and elegant way
Now that we know how to perform reflections that make a perfect angle in and angle out, let's try to apply them to find the perfect specular path from a source (blue point) to a receiver (red point). The images below show some attempts at casting rays in different directions and looking at where they fall
Even though these rays are all very close to each other in direction, they all miss the receiver after the reflection. This problem only gets worse when we then start to consider "second order bounces"; that is, rays that take another specular bounce after reflecting so that there are two perfect reflections before they reach the source. The picture below shows this phenomenon. Even though there are only very small differences of the directions in which the three rays are cast from the blue point (perhaps only 20 degrees), there is a drastic difference in the reflected directions after two bounces, and it therefore takes a lot of precision to hit the exact ray that we want to get a perfect path from the source to the receiver:
Therefore, casting rays in a bunch of different directions is clearly not the way to go here. As it turns out, there is a much more elegant way to extract specular paths from source to receiver by using mirror images, also known as
image sources. The idea is to create a "virtual source," which is the mirror image of a source across a plane (again, lines drawn here for ease of illustration but this all applies in any dimension). First, create the mirror image of the source across one of the line segments:
To create a mirror image, take any point
p on the plane and create a vector from the source
s to
p:
vps =
s - p. As before, this vector can be written as
\[ \vec{v_{ps}} = \vec{v_{\perp}} + \vec{v_{\parallel}} \]
That is, the mirror image is the same perpendicular length from the mirroring plane as the original source, just with that perpendicular direction flipped, as shown in the picture below:
Then, similarly to before, we want to subtract 2x the perpendicular component from
s to get its mirror image
s'. In sum, given any point
p on the plane, a normalized plane normal
n, and a source
s, the mirror image
s' is
\[ \vec{s'} = \vec{s} - (2(\vec{s}-\vec{p}) \cdot \vec{n}) \vec{n} \]
Now that we have a mirror image for the source, we can treat it as if it were itself a source, and cast a path from it to the receiver:
Now to turn this into a path that starts at the real source instead of its mirror image, simply mark the intersection point of the above path with the face that gave rise to the mirror image (marked in magenta in the image below), and trace a path from this intersection point back to the source:
(
see slide 14 of lecture 4 for the equation intersecting a ray with a plane for how to get the magenta point)
And that gives you the final path, with a perfect angle in and angle out! This seems like magic, but there's a very simple geometric proof showing that it indeed does given the same angle in as angle out, as shown in the image below. We start by noting that the source and its mirror image are the same perpendicular distance from the mirroring line segment, as discussed above. Since these distances are perpendicular distances, we can draw two right triangles that have them as congruent segments. Then, we show that these right triangles actually overlap at a segment, which is trivially congruent to itself, so the right triangles are actually fully congruent. This means that the two angles shown adjacent to the magenta point (including the angle in) are congruent, and by reflex angles they are also congruent to angle out. This completes the proof
The magic of this is, for a fixed source position and fixed geometry, we can move the receiver to any position in the plane and all we have to do is follow this same procedure. This will give us the exact specular path every time. I would encourage you to practice this on paper with different scenarios to convince yourself that this works (we did a few examples in class)
Now that we have the basic mirror images algorithm down, let's discuss how to model multiple bounces off of different planes. For this, we follow the same basic philosophy, but now we actually create mirror images of mirror images (the same thing would happen with light if you had a system of mirrors). The figure below shows creating the first mirror image about the top segment, and then creating a mirror image of that image across the bottom segment
Now we do the exact same thing as before: we start by tracing a path from the mirror image on the bottom to the receiver:
To turn this into a path which goes to the first mirror image, we trace the intersection point with the bottom face back to the first mirror image, same as before:
But we're still not all the way back to the original source, so we still have to repeat this step one more time, tracing the intersection point of the last path with the top segment back to the original source:
Just to make sure this point is really driven home, let's dive into an example with three bounces. Things are getting a little more complicated now, so let's number the mirror images by their
order (number of bounces needed to get from the image in question back to the source) and also color code them. First, start just as before. Get the mirror image of the source about the first face, which I draw in orange. Then get the mirror image of that mirror image around the second face, which I draw in green. Finally, get the mirror image of that mirror image around the third face, which I draw in pink:
Notice how for the second (and especially) third mirror image, we reflect the images around parts of the lines containing the line segments which are quite far from the segment boundaries. So remember in your assignment when creating the images, it doesn't matter how far outside of the polygons the reflection points are; you're simply reflecting the points about the plane that spans the polygon.
Okay now let's dive into the algorithm again. The first step is to trace a path from mirror image 3 all the way at the end back to the receiver:
Now we need to convert this to a path that goes back to mirror image 2, which gave rise to mirror image 3. To do this, we find the intersection with the pink face, and trace a path from that intersection to the green face:
Now, we need to figure out the part of the path that reflects across the green face back to mirror image 1, which gave rise to mirror image 2 across the green face. Find the intersection point of the path with the green face, and trace back from that intersection point to mirror image 1:
Finally, we want to find the path back to the original source, so find the intersection point with the first (orange) mirror face, and trace a path from that intersection point back to the original source:
Just to draw this without all of the virtual paths and images in the way, here's what the final 3-bounce path looks like:
Notice how the angle in equals the angle out on every bounce. This is quite remarkable and quite elegant! Tracing these paths is also very amenable to recursion, since, starting at a particular virtual image, you're tracing the intersection point with the reflecting face on the way back to the receiver to that virtual image's parent (the image that gave rise to it via reflection), and then you trace the next intersection point to that parent's parent, and so on all the way until you get back to the original source.
Note also that allowing images of
order r (up to
r bounces) in a scene with
N faces, there are
\[ N(N-1)^{r-1} \]
virtual images. This is because there are
N mirror images with all
N faces on the first bounce. But for each image after that, we only reflect it around
N-1 faces. This is because we don't reflect a mirror image around the face that just gave rise to it; otherwise, we'll end up right back where we started. In the three bounce example above, we don't reflect the orange image around the orange face, because it will end up right back on the blue image. But we can reflect it across all of the
N-1 other faces.
We are almost finished describing the algorithm (and half of the points on the first group assignment). The last thing that you have to worry about is whether the path you're looking at is blocked by an objects in the scene. In the image below, for example, there is a pink object blocking the path traced back from the mirror image to the receiver.
The brute force way of dealing with this is to intersect the ray cast from the mirror image to the receiver with every object in the scene and to make sure that the receiver is the object it hits first (you can sort them by the parameter
t of the ray described as
p0 + t
v). In the case of a higher order reflection, you need to check to make sure the first intersection point is along the face that gave rise to it. For instance, in the three bounce example above, tracing a path back from the point on the pink line to mirror image 2, we need to make sure the first intersection point is on the green face.
The second thing we need to worry about is whether or not the intersection points representing the bounce points are even within the faces, since we weren't worried about face containment when we created the mirror images. For instance, in the image below, the intersection point with the mirror face is outside of the line segment, so there is no valid path with perfect angle in = angle out from the source to the receiver.
In 3D, you have to check polygon containment, which is a bit tricker but can be done quickly if the polygon is convex, which you can assume in group assignment 1. See
lecture 4 slides for a few examples of tests you can perform to determine whether a point is inside of a polygon, the easiest of which is probably the area test since you already implemented that for mini assignment 1.
Coming soon...see lecture slides and
interactive demo for now. Note that this is implemented for you in the assignment
We've spent a couple of lectures now talking about matrix transformations as functions that modify the geometry of points in some space. Now, we're going to discuss how to apply a system of matrix transformations to model complex scenes. The idea is to build a graph called a
scene graph in which each node of the graph has a transformation that describes how to express all of the objects in that node and that node's children in the coordinate system of the node's parent. For instance, take a look at the scene graph below of a bedroom, courtesy of
Tianqiang Liu:
As you can see, there is a hierarchy of relationships between objects in this scene, and the idea is that each arrow has an associated transformation that says how to describe the object at the tip of the arrow in the coordinate frame of the object right above it at the head of the arrow. At the top of the tree is
world coordinates (in this example "bedroom coordinates"), which is the global coordinate system used to describe everything. Usually in a rendering or physics application, we need to understand how to transform all objects into world coordinates.
Let's just dive right into an example of how one would query this tree to express some of the objects in world coordinates. First, let's label some of the transformations as matrices, as labeled in the following way:
For instance, to describe the bed in world coordinates, we need to multiply on the left by the matrix
T1. Similarly, to describe the mattress in the coordinate frame of the bed, we need to multiply the points of the mattress on the left by the matrix
T4. But notice that the mattress still isn't quite in world coordinates yet. To put the mattress in world coordinates, we need to transform the points in bed coordinates by the matrix
T2 to get it into the "sleep area," and then we need to transform by the matrix
T1 to get it from the sleep area into world coordinates. So we're applying a composition of functions, starting at the bottom at
T4, feeding that output to
T2 as input, then feeding
T2's output to
T1. Writing this in matrix form, we have the matrix
\[ T = T_1 T_2 T_4 \]
which says how to transform coordinates in the mattress into world coordinates, when multiplying mattress coordinates on the left by
T. Similarly, to get the pillows into world coordinates, we need to multiply the pillow geometry on the left by
T1T3, and to get the bed frame into world coordinates, we need to multiply on the left by
T1T2T5