3D Coding TAT - Convex Polyhedra, Planes And Surface-Details
1: Surface Plane Detail
Sometimes you wish to draw some fine surface detail on objects to give a greater sense of realism. Say for example we wanted to draw a racing car with a number or logo on the car's door, and this number or logo could be personalised (perhaps depending on grid position, name etc.). The most obvious solution would be to copy the door texture map into a buffer and overlay the logo onto it using a simple sprite routine to draw the non-transparent pixels on the door bitmap. Now the car can be drawn normally using this changeable texture bitmap. It requires NO more processing during the game, only once before race time, perhaps in the pits area or menu.
But this does have some limitations; firstly we can't modify the bitmap during the game without incurring a large amount of data copying and overlaying. If we wanted to draw damage or mud on the car door then we could modify the bitmap or switch it for one already pre-drawn by a graphics artist. This switching of bitmap textures can of course be used for animation purposes.
If we continue the hierarchical mentality a stage further then instead of polygons we could describe sub-polygons in either a tree or basic table format. In the case of a car door number/logo we could draw the entire door normally and then draw the logo using more polygons on top. This way when the car is in the distance we can just draw the car door and stop there, without drawing any of its surface detail.
If the car door polygon is hidden or completely clipped then we can reject all of its details too without any further processing. This way the door can be thought of as the root and the detail as the branches or leaves on the tree. So surface detail on objects (or the environment) is based upon visibility and then distance. If the root surface is hidden then so is all of its detailing.
But the problem with this draw-the-background-then-the-detail method is the old one of overdraw, where pixels are written to a number of times, which on 99% of video cards is very slow. If an S-BUFFER technique is employed then that would help save some processor time by removing the pixel overdraw. Another way is to define a detailed polygon and vertex list which avoids the overdrawing problem and would replace the simple, one-polygon door with an already sub-divided surface.
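Here is a rough C sketch of how such a detail tree might be walked; all the structures and names are mine, invented purely for illustration:

    /* A surface-detail tree: each polygon carries a list of child detail
       polygons (logo, number, mud, damage...) which are only considered
       when the parent itself is visible and close enough. */
    typedef struct Poly {
        int           vertex_count;
        int          *vertices;        /* indices into the vertex list   */
        float         detail_range;    /* max Z at which children matter */
        int           child_count;
        struct Poly **children;
    } Poly;

    int  poly_visible(const Poly *p);  /* back-face cull/clip (not shown) */
    void draw_poly(const Poly *p);     /* the usual rasteriser (not shown) */

    void render_detail(const Poly *p, float z)
    {
        if (!poly_visible(p))          /* root hidden => all detail hidden */
            return;
        draw_poly(p);
        if (z > p->detail_range)       /* too far away: stop at the door   */
            return;
        for (int i = 0; i < p->child_count; i++)
            render_detail(p->children[i], z);
    }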
2: When I'm cleaning windows...
This was probably the basis for my 'Inlet' rendering method. Instead of simply using a polygon to define a plane on which all of its fine detail and sub-polygons lie, why not create some fake transparent polygons to classify interior groups of polygons? The easiest way to think about this is to imagine a house with only one window, where that window is a fake polygon which completes the exterior surface of the house and makes it into a convex object. We also have an interior to the house which has only a single room and is a concave shape. Now taking a simple cube for the exterior, one cube volume for the interior and one for the window, we could define the house using just 24 vertices (8 exterior, 8 interior and 8 for the window volume) and about 22 polygons for the surfaces.
Now imagine being at the back of the house, not being able to see the window at the front. We would not be able to see any part of the interior (8 vertices and about 9 polygons) and none of the window (8 vertices and 4 polygons). This is over half of our polygons and vertices which don't need to be processed. Only from a few places at the front of the house, at certain angles, would you be able to see any of the interior. The only connection between the interior and the exterior is the window. If we can see the window (the 'fake' polygon) then we can see some part of the house's interior, and so some of its polygons. So instead of describing surface detail on the polygon plane we can describe further polygons or sub-objects using these 'fake' (transparent) polygons to encapsulate a complex and possibly partly hollow object.
Another way to think about the 'INLET' method is to describe it as a slab or polygon-face S-Buffer technique. In my opinion it is far superior to the normal S-Buffer method because we only really process and clip those polygon faces which can be seen from our current view-point and in the current direction; we are NOT processing every polygon as is the case with the normal S-Buffer technique. The Inlet way of describing both opaque (floors, ceilings, doors) and transparent (windows, holes, openings) surfaces using convex volumes actually guides the engine in the correct direction and away from incorrect/hidden ones.
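In very rough C terms the idea might look something like this; it is only a sketch (a real engine would also clip the view to the window polygon rather than just testing its visibility), and every name here is invented:

    typedef struct {
        int  vertex_count;
        int *vertices;                 /* indices into the vertex list */
    } Face;

    struct ConvexVolume;

    typedef struct {
        Face                 face;     /* the 'fake' transparent polygon */
        struct ConvexVolume *inside;   /* the volume the window leads to */
    } Portal;

    typedef struct ConvexVolume {
        int     face_count;
        Face   *faces;                 /* opaque faces: walls, floor...  */
        int     portal_count;
        Portal *portals;               /* windows, holes, openings       */
    } ConvexVolume;

    int  face_visible(const Face *f);  /* back-face/clip test (not shown) */
    void draw_face(const Face *f);

    void render_volume(const ConvexVolume *v)
    {
        for (int i = 0; i < v->face_count; i++)
            if (face_visible(&v->faces[i]))
                draw_face(&v->faces[i]);

        /* Only if a window can be seen can any of its interior be seen. */
        for (int i = 0; i < v->portal_count; i++)
            if (face_visible(&v->portals[i].face))
                render_volume(v->portals[i].inside);
    }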
3: Convex Polyhedra & Compass Bubbles
I have only just made the connection between transparent 'fake' polygon surfaces and a back-face culling speed-up for very complex 3d models. This is mostly due to me focusing on the problem of rendering buildings, rooms and other 3d structures, which tend to be the biggest workload for the CPU to handle. Now let's turn our attention towards placing inhabitants within a 3d world. Not only do they move about, but parts of them move in relation to each other (e.g. arms and legs). As the polygon count rises due to ever more complex models, so does the need to minimize the number of individual checks for each one. A doubling of polygons means a doubling of work, even though only 50% or less of an object can ever be seen at any one time. What we ideally want is some fast method which would work out which polygon surfaces could be seen from the current view point WITHOUT having to check each and every one of them.
Imagine a rhombicosidodecahedron (no, I'm NOT going to draw an ASCII diagram). It has 62 faces in total and 60 vertices and looks like a badly drawn football. The normal way to perform back-face culling (hidden surface removal) is to rotate and project all 60 vertices and then test each one of the 62 faces. We could instead use the edge connections between neighbouring polygons to navigate our way around the 3d model. The most obvious way is to use a simple link or pointer to each connecting polygon for each edge. A four-sided polygon would require four of these edge links. Given a single polygon as a starting point we could then render each one of the surrounding polygons by following the edge links. There are two problems with this.
1. Tracing across this map of connected polygons is very difficult, as each polygon can have any number of edges and so any number of neighbours. Also it is possible that you may find yourself visiting the same polygon two or more times, in which case you must record which ones you have already done.
2. The main idea behind this local edge-connection is that we don't need to search to find other polygons which share a common edge or similar direction, but what happens when our starting point is on the opposite side to the one we want? One answer could be to search using the connection links until we find the front (facing us) of the object and then proceed from there.
Going back to the title of this little sub-section, a very complex model with tens or hundreds of faces can be enveloped inside a very simple convex 'bubble' object such as the old favourite, a cube. We could then use one or more of its six faces to direct rendering towards a good starting point on the complex object. We would start off with this 'bubble' polyhedron and, given its six polygon faces, we could quickly determine which are visible and which are not (back-face culled). We would end up with a maximum of just 3 visible faces; now instead of drawing these faces we use each one to point to a parallel (or near-parallel) face on the complex model. This gives a maximum of 3 starting points from which we can draw and follow connected polygon faces until we meet a side- or back-facing one.
So in effect the surrounding 'bubble' has given us a number of starting places around a complex model for a number of different viewing angles, without having to test and reject too many faces. You can think of the bubble as being like a 3d compass, directing the renderer quickly to a front (or back) face given the viewing direction. Or you can think of it as 6 points on a sphere, with one at the north pole, one at the south and 4 around the equator. Of course there is some extra work and storage required for the bubble object, with more faces and vertices to process, but hopefully this will be significantly offset by the saving from rejecting lots of hidden polygon faces.
But you don't really need to create any extra vertices or faces to do this 'compass bubble' technique; you can use 6 (or more, or fewer) faces on the complex model itself and use these possible starting points directly. A straightforward list might be used which contains the address of these 6 3d compass faces. This would not only avoid the extra storage and processing requirements but would solve the problem of parallax between the surrounding bubble and the contained complex model. If the vertices for these compass faces were defined together at the start of the complete vertex list for the entire model then this would simplify and reduce some of the vertex calculations, because once these compass vertices have been rotated and projected etc., we only need to process the remaining ones in the list.
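Here is a hedged little C sketch of the compass look-up: six axis-aligned directions with the chosen face addresses kept in a plain list, as described above. I'm assuming the view direction has already been transformed into the model's object space; all the names are mine:

    typedef struct { float x, y, z; } Vec3;
    typedef struct Face Face;          /* a face on the complex model */

    static const Vec3 compass_dir[6] = {    /* +X -X +Y -Y +Z -Z */
        { 1,0,0 }, { -1,0,0 }, { 0,1,0 }, { 0,-1,0 }, { 0,0,1 }, { 0,0,-1 }
    };
    Face *compass_face[6];   /* addresses of the 6 starting faces on the model */

    /* Pick the starting face whose direction points most towards the
       viewer, i.e. the largest dot product with the view vector. */
    Face *compass_start(Vec3 to_viewer)
    {
        int   best = 0;
        float best_dot = -1e30f;
        for (int i = 0; i < 6; i++) {
            float d = compass_dir[i].x * to_viewer.x
                    + compass_dir[i].y * to_viewer.y
                    + compass_dir[i].z * to_viewer.z;
            if (d > best_dot) { best_dot = d; best = i; }
        }
        return compass_face[best];   /* follow edge links outwards from here */
    }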
4: Tessellation, Tiles and Blocky Pixels
This 'compass bubble' idea doesn't have to be restricted to just finding various starting points around a 3d model. It could possibly be used for landscape and scenery mapping. Take the problem of drawing a reasonably realistic-looking landscape, especially one with a quite detailed mountain range in it. Most (i.e. all) 3d engines I have seen tend to have hills rather than mountains, and even then they're very flat hills. Perhaps this is a restriction in the way most scenery and terrain is defined, usually with a flat 2d array with a varying height at each corner vertex. And within each of the array cells there is usually one texture/bitmap token which will be used to fill the polygon between the vertex corner points. Now if you just pull some of these vertices upwards by a large amount then the polygon will either:
1. become very blocky due to the length between vertices becoming large, and so the texture/bitmap is scaled up. (I reckon about 99% of engines do this, even JEDI KNIGHT!)
2. or the texture will be repeated many times over like a tiled bathroom wall. I understand this as "tessellation" but it might have another meaning.
Which of the above two solutions is used depends on the 3d engine. For very, very long walls or floors method 2, the tessellation method, is used. This does have the advantage of keeping all textures to the same scale, so the viewer can more easily recognise distant or close-up items just by the amount of pixel-zoom which occurs. But this method does introduce more instructions within the texture-mapping routine because it must wrap around at the texture-space edges. This can be quickly done using a simple (and fast) AND instruction so long as the texture dimension is a power of 2 (32, 64, 128, 256 etc.). It also means that you don't have to store the texture-space coordinates with each polygon vertex; you 'could' use the differences in their world coordinates. E.g. if a wall was 1000 pixels long and a texture was only 32 pixels wide then you need to repeat the texture 31.25 times. This of course needs more calculation but allows faces to grow or shrink without stretching or squashing the texture.
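The wrap-around fetch itself is tiny; a minimal C sketch, assuming a 64x64 texture stored in the 256-byte-per-line format discussed later on:

    /* Tessellated (repeating) texel fetch. u and v may run far past 63
       (e.g. 0..999 along a 1000-pixel wall); the AND wraps them back into
       the 64x64 texture and the <<8 steps down the 256-byte lines. */
    unsigned char fetch_wrapped(const unsigned char *texture, int u, int v)
    {
        return texture[((v & 63) << 8) + (u & 63)];
    }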
Method 1 works out the fastest because it avoids having to perform the wrap-around logical AND instruction in the texture-mapping routine (as long as the vertex texture-space coords do not exceed the texture dimensions, i.e. with a 64-pixel-wide texture you can NOT map from 32 to 96). The downside to this is that the amount of pixel-zoom across polygons can vary depending on the magnitude of scaling between the polygon edge lengths and the texture bitmap dimensions. After playing a JEDI KNIGHT demo I could see at the end that the huge cargo ship which you must jump onto is a small bitmap scaled up to cover a vast area.
It is possible to write an engine which allows both kinds of texture mapping and marks polygons with a bit flag to indicate which kind of mapping routine should be used.
5: 'Checkerboard' or Grid Based Textures
There is another kind of mapping, but it would require far more work than the tessellation method. It could allow a far greater variety in filling large polygons without the need to break them up into many smaller ones. Instead of repeating the same texture once an edge has been crossed, why not use another texture? This would require a map or array of texture tokens to be stored, and the texture-filler routine would need to detect when a neighbouring texture must be fetched from the token map. So instead of drawing a bathroom-like wall with the same tile used again and again, we have the ability to have any number of tiles in any order along the wall. We could have one texture for the left side of the polygon, one for the centre and one for the right side. (A sketch of the look-up follows the list below.)
There are some limitations and problems with this:
a. The textures will need to be of an identical size so that each neighbouring texture lines up (most likely square).
b. The texture-filling routine will be much slower.
c. Extra storage is required for the token array map.
d. All the textures on the polygon still lie on the same plane. So even though you could have grass on one edge and rock on the other they would both be flat unlike if more vertices (and more polygons) were used.
Another way to think about this method is to imagine the polygon as a flat wall and the chequer-board as various pictures hung along the wall's length.
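As promised, a minimal C sketch of the token-map look-up; the tile size, map size and every name here are just illustrative:

    #define TILE_SHIFT 6                     /* 64x64 tiles                  */
    #define MAP_W      16                    /* token map is 16x16 cells     */

    unsigned char  token_map[MAP_W * MAP_W]; /* which texture per cell       */
    unsigned char *textures[256];            /* tile bitmaps, 256 bytes/line */

    /* Find which cell (u,v) falls in, look up its texture token, then
       index into that texture as usual. This extra look-up is exactly
       why the filler is slower than plain tessellation. */
    unsigned char fetch_checkerboard(int u, int v)
    {
        int cell = ((v >> TILE_SHIFT) & (MAP_W - 1)) * MAP_W
                 + ((u >> TILE_SHIFT) & (MAP_W - 1));
        return textures[token_map[cell]][((v & 63) << 8) + (u & 63)];
    }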
6: One Big Texture, Many Mappings
Depending on how the texture bitmaps are stored (usually in a 256-byte-per-line format) it is possible to greatly reduce the number of small textures in a crafty and easy-to-do way. Imagine we need to draw a rocky field landscape with mud, grass, gravel and some stone or rock path. Five or more simple texture bitmaps could be drawn by a graphics artist, one for each type of terrain. Now we design and render a rough field using these texture bitmaps to cover a number of polygons. The result is reasonable, except where textures of different kinds meet, e.g. where mud meets grass or grass meets stone. It looks like a chess-board and the viewer can easily see where one texture meets another (he or she can see the seams in the landscape). To reduce this the graphics artist could draw more textures which can be used to blend one terrain type into another (e.g. 50% grass and 50% mud). The seams are still there but the viewer doesn't notice them as much.
The problem with using 'blending' texture bitmaps to smooth out the edges between two terrain types is that it quickly increases the amount of memory used. If we had 64 textures then we would need about 2016 blending textures (one for every possible pair) to smooth from each one to the other 63. As you can see this is an extreme case; most texture bitmaps only need to blend into one or two others and NOT every other one.
Most textures these days range in size from 32x32 up to a maximum of 256x256, with most being about 64x64 in size. There is a problem facing designers and programmers. Should they use big 256x256 texture bitmaps to cover large areas quickly in one go, or should they break the area into smaller parts and use a size of 64x64 or less? Personally I think the smaller the texture the better, as they tend to cache very well in the CPU with fewer memory reads, and also free up more memory for different textures. As almost all textures need to be stored in a 256-byte-per-line format and most textures are smaller than 256 pixels in height and width, we can cheat and build one huge texture from many small ones. We know that each line of a texture is 256 bytes apart from the previous one, so we can easily store 4 textures in the following way:
0 ................. 255
aaaaa bbbbb <----- 1st line
aaaaa bbbbb <----- 2nd line
aaaaa bbbbb <----- 3rd line
ccccc ddddd where a,b,c,d are the
ccccc ddddd texture bitmap pixels.
ccccc ddddd
In a 256x256 block of memory we could store 16 of these 64x64 texture bitmaps without wasting a single byte of memory. Each 64x64 texture bitmap can be drawn separately and only stored together like this by the rendering program in memory. Its loader would just interleave each 64-pixel line as each texture bitmap is read from a disk file. So the textures do not need to be drawn or stored in any particular order.
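A hedged sketch of such a loader in C; the function name and 4x4 cell layout are mine:

    #include <string.h>

    /* Slot one 64x64 texture (read from disk as 64 contiguous bytes per
       line) into cell (cx, cy) of a 256x256 block, where 0 <= cx,cy <= 3.
       Sixteen textures fit with no wasted bytes, in any order. */
    void slot_texture(unsigned char block[256 * 256],
                      const unsigned char src[64 * 64], int cx, int cy)
    {
        unsigned char *dst = block + (cy * 64) * 256 + cx * 64;
        for (int line = 0; line < 64; line++)
            memcpy(dst + line * 256, src + line * 64, 64);
    }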
Right enough waffle, here is the crafty trick.
Make the graphic artist draw the 64x64 textures on a huge 256x256 bitmap (go on, nail 'em to the chair). All the textures DO NOT have to be 64x64 pixels in size, some can be 8x8, some 16x64, some 256x48 or whatever.... So long as there are NO gaps between each rectangular texture. After the graphic artist has stopped complaining about "not being able to see the edges between neighbouring textures" tell 'em: "Yeah, that's the whole idea pal!"
Think of the below diagram as being a collection of small texture bitmaps on a large 256x256 bitmap.
0 .................................... 255
+-------+--------+----+------------------+
| | | | |
| grass | mud | | long slab |
| | | | |
+-------+--------+----+----------+-------+
| | | | | |
| rock | water | | | tall |
| | | +----------+ thing |
| | | | | |
| | | | | |
+-------+--------+----+----------+-------+
You should notice that the bitmaps do not have to be all the same size, as long as there are no gaps between them. Using the 4 textures (one each for grass, mud, rock and water) we can also blend between grass + mud or mud + water simply by starting somewhere on this 256x256 bitmap, NOT on the texture boundaries. We could start half way across the grass and end up half way into the mud, or we could begin 1/5th into the mud and end up 1/5th (or any other amount) into the water.
We have blended neighbouring textures together WITHOUT having to create a single new texture! So instead of wrapping around in the same texture bitmap we allow the mapper to cross into the neighbouring one.
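In texel terms this 'free' blending is nothing more than letting the mapping coordinates straddle a tile boundary in the big bitmap; a tiny sketch:

    /* Sample the big 256x256 bitmap directly. Mapping a span from u = 32
       (half way into the grass) to u = 96 (half way into the mud) blends
       the two terrains without a single extra blend texture existing. */
    unsigned char fetch_atlas(const unsigned char atlas[256 * 256], int u, int v)
    {
        return atlas[((v & 255) << 8) + (u & 255)];
    }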
Of course some care and pre-planning is required before the graphics are drawn, but this should help the graphics artists as much as hinder them because they can see the joins between the small terrain textures when they draw them. This allows them to blend smoothly from one to another simply by treating the 256x256 bitmap as one giant picture. This huge texture can be used in its entirety for filling large areas with a pond or rocks at its centre, or small parts of it can be used as normal to fill small blended polygons.
Actually it might not be a good idea to place textures directly next to other textures (e.g. grass next to water), because if you use just the grass to tessellate a large area then the smoothed edges (between grass and water, which the artist drew) will break up the illusion of a continuous area of grass. The old isolated and flat-drawn texture bitmaps would give a much better illusion of a seamless area because their left and right (and top and bottom) edges have roughly the same amount of colour and shade, i.e. one side doesn't fade into water while the other side fades into stone. But all is not lost; remember those gaps between textures? Well, why not introduce gaps to separate the 'flat' continuous textures and then get the artist to draw some smoothing graphical bits in these gaps? This way we can still use the individual textures to tessellate large areas and we can also use some/all of these new gap graphics for blending. (Nice eh?)
7: 'Horizon' Z clipping
Most of the 3d performance boost comes from various types of clipping. Whether it's back-face culling (hidden surface removal), object rejection, Z-buffering or span-buffering, they all have the same goal: to reduce the number of polygons, lines or pixels which need to be drawn on the screen. The ideal situation is having to draw each pixel once and only once, never overdrawing anything. Whether you intentionally use them or not, 3d object models contain some amount of hierarchy (the models are built from polygons, the polygons from lines and the lines from pixels). As objects move into the distance some part or all of them will be reduced to the size of single pixels. In the classic example of an aircraft runway with white line markings on mostly black tarmac there needs to be a cut-off distance for each. If we continued to draw both then at some point each would become a single pixel, one black and one white, and every so often the runway would appear to flash white then black. It is clear that the white line markings should not be drawn - they should be clipped beyond a certain distance from the view point. And the black tarmac would too be clipped, at some further distance, because of its larger size.
The distance clipping is normally performed using just the Z component of a polygon (instead of the proper distance SQRT(X*X + Y*Y + Z*Z), which is SLOW). So each polygon needs to have a 'horizon' or distance cut-off point beyond which it is not drawn even though it is visible.
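For the curious, here is a tiny C sketch of both tests; if the true distance really is wanted, comparing squared values avoids the SQRT entirely:

    /* Horizon tests without any SQRT: plain Z, or squared true distance
       if polygons far off to the side must be cut at the same radius. */
    int beyond_horizon_z(float z, float horizon)
    {
        return z > horizon;
    }

    int beyond_horizon_true(float x, float y, float z, float horizon)
    {
        return x*x + y*y + z*z > horizon * horizon;
    }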
Most games use a number of polygon models, each with a varying amount of detail. Four seems to be a favourite number. The closer to the 3d model, the greater the number of polygons and detail; the further away, the simpler and more crude the model. Instead of using the cop-out method of multiple models I will try to explain some ideas. Some will be practical and some not so, depending on how the models are defined and rendered.
One of the most obvious ways to render a 3d model/object is to go through the polygon list and reject those beyond the horizon cut-off point. But this would require another test (and perhaps some calculation) for each polygon face.
e.g.
    for i = 1 to Number_Of_Polygons
        if Polygon[i].Z > Polygon[i].Horizon then reject
        else render( Polygon[i] )
    next i
Note how the above loop takes the polygons in no particular order, just the way they are defined in the polygon list. Within the loop each polygon has its Z value compared against its Horizon (or Z-cut off) point. If it is beyond this limit it is not drawn, otherwise it is.
Suppose we order the polygon list in decreasing order of their 'Horizon' field, with the furthest cut-off distance first (the biggest polygon) and the closest cut-off distance last (the smallest polygon). We know that every polygon after 'n' will have a closer cut-off point and so will disappear sooner, so if polygon 'n' is beyond the cut-off point then so will be ALL the following ones. Because the list of polygons is sorted this way we can immediately stop if we encounter a polygon which is beyond the Z cut-off limit.
e.g.
    for i = 1 to Number_Of_Polygons
        if Polygon[i].Z > Polygon[i].Horizon then stop
        render( Polygon[i] )
    next i
In the worst case every polygon's Z will be less than the cut-off distance, so 'N' checks are being performed for nothing. You could store a depth field with the object model and use this together with the relative distance of the object to quickly reject or accept the entire model without the need to perform any horizon checks. As we are basically searching for the first cut-off polygon in the sorted list, we could employ a binary search or some other hashing/look-up technique; this way the number of non-cut-off polygons can quickly be found and a straight loop without checks can be performed.
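A sketch of that binary search in C, assuming (as with the depth-field idea above) that the whole model is compared against one overall distance z; the names are mine:

    typedef struct { float horizon; /* ...vertices, texture, etc... */ } Polygon;

    /* Polygons sorted by decreasing Horizon. Return how many survive the
       cut at distance z; the caller then renders that prefix in a straight
       loop with no per-polygon checks at all. */
    int count_visible(const Polygon *poly, int n, float z)
    {
        int lo = 0, hi = n;            /* find first index with horizon < z */
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (poly[mid].horizon >= z) lo = mid + 1;
            else                        hi = mid;
        }
        return lo;
    }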
But one of the main problems with simply losing polygons that 'fall' over the horizon cut-off point is that objects may appear to suddenly have holes in them when a small part of the model is rejected but the surrounding larger ones are not. As this pruning is done on mostly small, far away objects you might get away with Swiss cheese.
8: Dynamic Details & Model Simplification
Another way to simplify models (which I haven't tried yet or really thought about to any great extent) is to build the object model like a sculpture and build many detail levels into a single model. This would require more polygons but perhaps no more vertices. To see what I mean, imagine a solid square box of stone; this will be our coarsest model (of course it doesn't have to be square) and with only six faces it's a pretty fast object to render. This is fine for the most distant model as it would appear to be only a few pixels in size. You can think of this as the beginning of a sculpture with no detail, just the basic outline. Now chisel away a corner or two and you start to have a more detailed model with more faces and more vertices. Continue this process, removing corners and sub-dividing faces, and you eventually end up with the most detailed model.
You could think of this as another form of hierarchy, with the coarsest model being the root and the more detailed being the leaves. If a polygon face is close enough then use its child sub-faces. This genealogical rendering continues until we reach either the horizon cut-off point or the leaves (finest detail) of the model.
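As a rough C sketch (invented names again), the genealogical walk might look like this:

    typedef struct DetailFace {
        float              split_z;    /* descend when nearer than this */
        int                child_count;
        struct DetailFace *child;      /* array of finer sub-faces      */
        /* ...vertices, texture, etc... */
    } DetailFace;

    void draw_detail_face(const DetailFace *f);   /* rasteriser (not shown) */

    /* Draw the coarse face itself, or recurse into its sub-faces when the
       viewer is close enough and finer detail actually exists. */
    void render_sculpture(const DetailFace *f, float z)
    {
        if (z > f->split_z || f->child_count == 0) {
            draw_detail_face(f);
            return;
        }
        for (int i = 0; i < f->child_count; i++)
            render_sculpture(&f->child[i], z);
    }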
PROBLEMS, PROBLEMS .....
There is a HUGE problem with this egg-within-an-egg idea: that of parallax. Which sub-faces lie directly under a coarser face depends on the viewing angle and angles of rotation. So moving the object model from side to side will make the sub-faces seem to slide underneath a parent face.
Personally I am not convinced by this idea (yet); it still needs much more work. In the end it's probably far easier to create multiple detail models instead, rather than trying to clump it all together in a single data structure. The only advantage of having an 'all-in-one' model is the ability to select any number of detail levels for a smooth coarse-to-fine transition. Using a number of individual detail models means you only have a few levels of detail to choose between; the more detail levels, the greater the number of models that you need.
9: MIP Faces & Component Models
As with most programming problems, the surface detail one can be eased a great deal by throwing lots of memory at it. Going back to the racing car door and its surface detail, we can improve matters by defining two or more door models. The overall distance to the car would determine which door model to use: the most detailed one for close-ups and the most basic one for long-range views. As having more than one 2d/3d model for a component takes up far more memory, it's sensible to try and reuse them as frequently as possible. This would be like building a large object from a number of sub-objects (the components), where we select a reasonable version of each sub-object for a certain distance & detail range. An advantage of this cloned-part approach is that we could modify one single model and change the entire game world, or at least every item which uses the same component.
But having a variable detail level for a polygon face means a variable number of sub-polygon faces, which is much slower than a straightforward texture bitmap approach. Due to the increase in polygon count, not forgetting vertices and rotation/perspective calculations, it at first appears to be a bad idea, with changeable bitmap textures looking like the best method. Just remember that these bitmaps are just 2d planes and suffer from pixellation when viewed from short distances.
The good thing about using sub-polygons or 'component models' is that our old friend 'the tree' can be used. We also have much more control over the detail levels, and we need far less memory than using MIP-mapping techniques (multiple copies of the same bitmap texture at different scales) or simply using a very high resolution bitmap to avoid pixellation. Just like the bitmap approach, the component models can be reused again and again to build up very large models from a few components. This method might be best suited to landscape rendering where the player can get very close to hills, rocks or the ground. Once the 3d component models have been designed (and perhaps optimized) it would be very easy and quick to build up a large environment or object without having to design each and every tree on the landscape or every bolt on a robot.
Another possible method could be to employ both techniques: normal bitmap textures for distant items and proper polygons for close-up ones. This leads to the problem of creating these detail models and associated bitmaps. One obvious solution would be to define the highest detail 2d/3d models first and then render them on a 2d screen like any other 3d object. This 2d rendering can then be taken directly to be its bitmap. When programming the map/game editing utility it is probably best to restrict the designer to placing the low detail bitmaps which fill the large areas. This way huge worlds can be created quickly without having to design thousands of detailed component models, each of which requires a 2d bitmap.
The problem with all these MIP-mapping and multi-detail model techniques is that they really need a smooth transition between two scales, otherwise the model will appear to suddenly change and shatter the illusion of realism. By using horizon fading (black or white fog) this sudden jump can be visually disguised.
10: Primitive Component Models
When most people think of and define polyhedral object models they think in terms of vertices and polygons/faces, but there is another, higher level way to build up complex environments and models: that of primitives. A primitive is just a basic building-block solid component such as a cube, pyramid and so on. These are often used in CAD design packages to help speed up an otherwise lengthy design process. You don't have to stick with sharp, planar components; you can use spheres, circles, cylinders or even more complex solids. I think the Freescape system used in games like Castle Master and its grandparents used this method in designing their levels and objects. Even newer games like Ecstatica use primitives in the form of ellipsoids and spheres. And I think most people can remember seeing a DOC-like demo with rotating objects made up entirely from balls (helicopters, people and national flags were all favourites).
The advantage of using primitives is that their shape, vertex and face counts are already known and so can be optimized and pre-processed in ways in which a general polyhedral shape could not be. Also symmetry can be used to reduce rotations to their minimum. For example a cube only requires a single corner vertex (X,Y,Z) to be defined, and then all of its other 7 corner vertices can be quickly created by negating and/or swapping these X,Y,Z values about.
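A tiny C sketch of the sign-flip trick; the names are mine, and note it also means only half the corners ever need rotating, since opposite corners are plain negations of each other:

    typedef struct { float x, y, z; } Vec3;

    /* Build all 8 corners of an origin-centred box from the single corner
       (x,y,z) by flipping signs. out[i] and out[7-i] are exact negations,
       and since R(-v) = -R(v) under rotation, rotating 4 corners gives
       the other 4 for free. */
    void cube_corners(float x, float y, float z, Vec3 out[8])
    {
        for (int i = 0; i < 8; i++) {
            out[i].x = (i & 1) ? -x : x;
            out[i].y = (i & 2) ? -y : y;
            out[i].z = (i & 4) ? -z : z;
        }
    }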
The bad points (lame pun) are that they can require more custom code (not too bad, is it?) and tend to make objects look like building blocks stuck together. Also complex models may require tens of these primitive solids instead of a single, more flexible mesh. This can lead to more faces being rendered than is absolutely necessary; you may end up with primitive faces which are inside other solids and so shouldn't really need to be defined or rendered. In short you are forced to draw each part of each primitive even though some of it might be completely wasted.