Faster and Better RGB Palette Fades
Introduction
Welcome coders, to another article for Hugi diskmag. I will describe RGB palette fading, and a simple, fast way to get a better fade.
Fades are important as they can help to smooth the transition from one part into another. Let's face it, they look much better than a sudden clear-screen.
They give a more cinematic feel to your productions, be they demos, intros or full blown games. They should be used very carefully, otherwise the audience will get very bored, very quickly watching the same fade 50 times over. Perhaps the best place for a fade-out (fade to black) is at the end of your production just before it exits back to DOS/Windoze 89 etc.
The Problem
The easiest fade to do is the fade-to-black. This is where all the individual components of an R,G,B palette (the Red, the Green and the Blue of each of your 256 or so colours) are slowly and smoothly decreased towards black (Red=0, Green=0, Blue=0).
Once you have the fade-out routine you will want to write a fade-in routine. This is the opposite of a fade-out (surprise, surprise). You begin with an all black palette, where all 256 or so colours have the value Red=0, Green=0, Blue=0, and then you slowly increase each component until it matches the final palette's R,G,B values.
So it seems that we need to write two routines for fading, a fade-out and a fade-in routine. But, a more flexible way would be to write a single routine which fades from a starting-palette to a destination-palette. Then we could specify an all black palette as our destination to create a fade-out, and likewise we could start with an all black palette and give our final palette as the destination to produce a fade-in.
One-By-One
Unless you choose a strange way to store your RGB palettes, each one will be a block of 768 bytes, where each byte has a value from 0 to 63 (00..3F hex). We arrive at the number 768 because there are 3 bytes per colour (a Red byte, a Green byte, and a Blue byte) and because there are 256 colours.
Sorry if this seems obvious, but there may be some "newbie" coders reading and we wouldn't want to scare them off this early in the article.
Let's begin with a fade-out routine. I will describe the more flexible method later on.
Because a Red, Green and Blue byte in our palette all share the same property (of having a value from 0 to 63) we can use a single operation for all three, and so a really easy loop of 768 iterations can be used.
In the loop we take each palette byte and, if it's non-zero, we decrement it.
[ES:DI] --> palette
FadeOut:
MOV CX, 768 ; = 256 * 3 bytes to fade
fadeout2:
MOV AL, ES:[DI] ; get a palette byte
CMP AL, 0
JZ SHORT fadeout3 ; is it already 0? then leave it alone
DEC AL ; else decrease it by 1
fadeout3:
STOSB ; store the palette byte
LOOP fadeout2 ; and repeat for the entire palette
I didn't say palette fading was difficult, did I?
The above code will only decrease each component of the palette once. Of course you still need to set the VGA colours using this palette, wait for the normal V-Sync signal and repeat the process.
But how do we know when all 256 colours have been faded out?
It is safe to assume that after 64 calls the entire palette will have been faded out correctly, because we already know that each component can only have the range 0 to 63 (so 63 calls are in fact enough).
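For anyone who reads C more easily than assembly, the decrement fade can be sketched roughly as below. This is only a model of the routine above, assuming a 768-byte palette of 6-bit components. The names `fade_out_step` and `fade_out_all` are made up for illustration, and the VGA write and V-Sync wait are left as a comment.

```c
#include <stdint.h>

#define PAL_BYTES 768   /* 256 colours * 3 components (R, G, B) */

/* One pass of the decrement fade: every non-zero component drops by 1. */
void fade_out_step(uint8_t pal[PAL_BYTES])
{
    for (int i = 0; i < PAL_BYTES; i++) {
        if (pal[i] != 0)
            pal[i]--;
    }
}

/* Run the whole fade: 64 passes is always enough for 6-bit components. */
void fade_out_all(uint8_t pal[PAL_BYTES])
{
    for (int pass = 0; pass < 64; pass++) {
        fade_out_step(pal);
        /* a real program would now send pal to the VGA and wait for V-Sync */
    }
}
```

Calling `fade_out_all` on any valid palette should leave all 768 bytes at 0.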
The Fade-In
The next easiest fade after the fade-out is the fade-in. This needs two palettes: a temporary 768 byte workspace and the final palette (so we know what to fade-in to).
This time in the loop we increment each R,G,B component until it matches our final palette.
[ES:DI] --> 768 byte temporary palette
[DS:SI] --> the final palette
FadeIn:
MOV CX, 768 ; = 256 * 3 bytes to fade
fadein2:
CMPSB ; does the palette byte
JZ SHORT fadein3 ; already match our final palette byte?
INC BYTE PTR ES:[DI-1] ; else increment our temporary value.
fadein3:
LOOP fadein2 ; and repeat for the entire palette
Again we need to send the palette to the VGA RGB colour registers, wait for V-Sync and repeat the process 64 or so times.
We also need to initialise our temporary palette to all 0's before we start this fade-in.
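The same fade-in, again as a rough C model rather than a drop-in replacement. The names `fade_in_init`, `fade_in_step`, `work` and `target` are hypothetical. Like the assembly, it assumes the working palette never starts above the final one.

```c
#include <stdint.h>
#include <string.h>

#define PAL_BYTES 768   /* 256 colours * 3 components (R, G, B) */

/* Start the fade-in from an all-black working palette. */
void fade_in_init(uint8_t work[PAL_BYTES])
{
    memset(work, 0, PAL_BYTES);
}

/* One pass: step each component up by 1 until it matches the target.
   Assumes work[i] <= target[i], as the assembly version does. */
void fade_in_step(uint8_t work[PAL_BYTES], const uint8_t target[PAL_BYTES])
{
    for (int i = 0; i < PAL_BYTES; i++) {
        if (work[i] != target[i])
            work[i]++;
    }
}
```

After `fade_in_init` and 63 or 64 calls to `fade_in_step`, `work` should equal `target`.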
The Flexible Fade
This is another type of fade, but it is more flexible than both the fade-in and fade-out routines. You can think of it as a palette-morph.
We compare each component in both our temporary and our final palette and increment or decrement as needed, and repeat this for all 768 components.
Like the fade-in routine we need two palettes, a temporary one and our final palette. But this time our temporary palette isn't initialised to 0's. Instead we copy our current palette into this buffer and use it as our starting point for the fading process.
[ES:DI] --> 768 byte temporary palette
[DS:SI] --> the final palette
FadeTo:
MOV CX, 768 ; = 256 * 3 bytes to fade
fadeto2:
CMPSB ; does the palette byte
JZ SHORT fadeto3 ; already match our final palette byte?
JG SHORT fadeto4 ; do we need to fade-up?
DEC BYTE PTR ES:[DI-1] ; else fade-down
JMP SHORT fadeto3
fadeto4:
INC BYTE PTR ES:[DI-1] ; fade-up
fadeto3:
LOOP fadeto2 ; and repeat for the entire palette
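A rough C model of the same palette-morph, for comparison. The name `fade_to_step` and its parameters are hypothetical, and the palettes are assumed to be 768 bytes of 6-bit components.

```c
#include <stdint.h>

#define PAL_BYTES 768   /* 256 colours * 3 components (R, G, B) */

/* One pass: move each working component 1 step towards its target. */
void fade_to_step(uint8_t work[PAL_BYTES], const uint8_t target[PAL_BYTES])
{
    for (int i = 0; i < PAL_BYTES; i++) {
        if (work[i] < target[i])
            work[i]++;          /* fade-up */
        else if (work[i] > target[i])
            work[i]--;          /* fade-down */
    }
}
```

With an all-black `target` this behaves as a fade-out, and with an all-black `work` it behaves as a fade-in.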
A Better Way
That last routine "FadeTo" seemed very nice, short and reasonably quick. But in fact there is a problem with the increment/decrement method which is easy to overlook.
The R,G,B components of the palette are not faded out evenly, as they would be in a correct fade. For example, say we had the values 1 and 63 and both needed to be faded out to 0 (black). The decrement method would fade the first value out straight away, while the 2nd value, 63, would take another 62 loops to fade out.
This may not seem a problem, but try turning down the brightness on your monitor: you should see that all the colours dim evenly, keeping their relative proportions.
What we need to do is to scale the fading process so that every component, independent of value, takes 63 (or 64) loops to change. This should give a far better fade.
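To put numbers on the difference, here is a small C sketch comparing the two approaches for a single component. Both helpers (`dec_fade` and `scaled_fade`) are hypothetical names, and the scaled version is computed the slow way with a multiply and a divide, purely for comparison.

```c
#include <stdint.h>

/* Decrement method: value after n passes of "subtract 1 until zero". */
uint8_t dec_fade(uint8_t v, int n)
{
    return (uint8_t)(v > n ? v - n : 0);
}

/* Scaled method: value after n of 64 passes, keeping proportions. */
uint8_t scaled_fade(uint8_t v, int n)
{
    return (uint8_t)(v * (64 - n) / 64);
}
```

Take the colour (63, 32, 1). After 16 passes the decrement method gives (47, 16, 0), so green has dropped from about half of red to about a third. The scaled method gives (47, 24, 0), which keeps green at about half of red.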
But doesn't this mean we need to divide and multiply to scale the values? This would mean a much slower fade routine than the current increment/decrement method, wouldn't it?
Fixed-Point Palette Fades
Well, no.
Enter the world of fixed-point maths (yet again).
What we need to do is to step up each component based upon its difference between our starting component value and our final component value, AND the period of time over which the fade should take place.
Let's suppose we use a 64 loop cycle to perform our fade, so after 64 loops our fade is complete, and after just 32 loops we are half way there.
Say we need to fade from 8 to 24 in 64 steps, then we begin at 8 and then step up by 0.25 per loop. After 64 loops we would have 8 + (0.25 * 64) = 24.
The formula is just:
step value = (final - start) / 64
If we wanted a different period for the fade then simply change 64. Can I suggest using a power of two such as 32 or 128? The division then becomes a simple shift.
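In 8.8 fixed point the step is simply the formula above multiplied by 256. The C sketch below shows one way to compute it. The names `FADE_STEPS` and `fade_step_8_8` are made up for this example, and the period is assumed to be a power of two no larger than 256 so that the arithmetic stays exact.

```c
#include <stdint.h>

#define FADE_STEPS 64   /* fade period; assumed a power of two, at most 256 */

/* Per-pass step in 8.8 fixed point: ((target - start) / FADE_STEPS) * 256. */
int16_t fade_step_8_8(uint8_t start, uint8_t target)
{
    return (int16_t)(((int)target - (int)start) * (256 / FADE_STEPS));
}
```

For the fade from 8 to 24 this gives 64, which is 0.25 in 8.8 format. Adding it 64 times to 8 * 256 = 2048 gives 6144, which is exactly 24 * 256.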
To perform this fixed-point fade we need an extra 3072 bytes (768 * 2 * 2). This is used to store some increments and our temporary, working palette in 8.8 fixed-point format.
The set-up process for the fade is slightly longer, but as you will see the actual fade loop is much simpler, as its body requires no compares or conditional jumps.
[ES:DI] --> 3072 byte temporary palette
[DS:BX] --> the starting palette (768 bytes)
[DS:SI] --> the final palette (768 bytes)
InitFade:
MOV CX, 768 ; = 256 * 3 bytes to fade
initfade2:
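LODSB ; get the final palette byte
SUB AL, [BX] ; the component difference (final - start)
CBW ; sign-extend the difference to 16 bits
SHL AX, 1
SHL AX, 1 ; increment = (diff / 64) * 256 = diff * 4
STOSW ; store the increment (8.8)
MOV AH, [BX] ; get the starting palette byte
MOV AL, 0
MOV ES:[DI+1536-2], AX ; value = start * 256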
INC BX
LOOP initfade2 ; and repeat for the entire palette
Now the temporary 3072 byte palette has this format for all 768 values:
+0 768 x WORD increment in 8.8 format
+1536 768 x WORD value in 8.8 format
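As a cross-check of the set-up step and the buffer layout, here is a rough C model. The struct `FixedPal` and the function `init_fade` are hypothetical names, and a 64-step fade is assumed, so (diff / 64) * 256 reduces to diff * 4.

```c
#include <stdint.h>

#define PAL_BYTES 768   /* 256 colours * 3 components (R, G, B) */

/* Mirrors the 3072-byte buffer: increments at +0, values at +1536. */
typedef struct {
    int16_t  inc[PAL_BYTES];    /* signed 8.8 step added on every pass */
    uint16_t value[PAL_BYTES];  /* current component in 8.8 format     */
} FixedPal;

/* Set up a 64-step fade from the start palette to the target palette. */
void init_fade(FixedPal *fp,
               const uint8_t start[PAL_BYTES],
               const uint8_t target[PAL_BYTES])
{
    for (int i = 0; i < PAL_BYTES; i++) {
        int diff = (int)target[i] - (int)start[i];
        fp->inc[i]   = (int16_t)(diff * 4);        /* (diff / 64) * 256 */
        fp->value[i] = (uint16_t)(start[i] << 8);  /* start * 256       */
    }
}
```

A component fading from 63 down to 0 gets an increment of -252, so after 64 passes its value has dropped by exactly 63 * 256.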
To set the VGA colours using this format needs a custom routine:
[ES:DI] --> 3072 byte temporary palette
SetFixedPalt:
MOV CX, 768 ; = 256 * 3 bytes to write
MOV DX, 3C8h ; PEL address (write index) port
MOV AL, 0
OUT DX, AL ; start with colour 0
INC DX ; 3C9h = PEL data port
setfpal2:
MOV AL, ES:[DI+1537] ; high byte of temp value (8.8 format)
OUT DX, AL ; output each R,G,B component...
ADD DI, 2
LOOP setfpal2
Now the actual fading routine:
[DS:SI] --> 3072 byte temporary palette
FixedFade:
MOV CX, 768 ; = 256 * 3 bytes to fade
fixfade2:
LODSW ; get the increment (8.8)
ADD [SI+1536-2], AX ; add to the temp value (8.8)
LOOP fixfade2 ; and repeat for the entire palette
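The fade pass itself, again as a rough C model. The names `fixed_fade_step` and `fixed_to_bytes` are hypothetical. The second helper only extracts the high bytes into a plain 768-byte palette, and the actual port writes are left out.

```c
#include <stdint.h>

#define PAL_BYTES 768   /* 256 colours * 3 components (R, G, B) */

/* One pass of the fixed-point fade: value += increment, no comparisons. */
void fixed_fade_step(const int16_t inc[PAL_BYTES], uint16_t value[PAL_BYTES])
{
    for (int i = 0; i < PAL_BYTES; i++)
        value[i] = (uint16_t)(value[i] + (uint16_t)inc[i]); /* wraps like a 16-bit ADD */
}

/* Integer part (high byte) of each 8.8 value, i.e. what goes to the VGA. */
void fixed_to_bytes(const uint16_t value[PAL_BYTES], uint8_t out[PAL_BYTES])
{
    for (int i = 0; i < PAL_BYTES; i++)
        out[i] = (uint8_t)(value[i] >> 8);
}
```

With an increment of 64 and a starting value of 8 * 256, the component should read 16 after 32 passes and 24 after 64 passes.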
Of course you can use an entirely different method and palette format. The only thing to remember is that you need space for the 8.8 fixed-point maths (a temp value and an increment for each of the 768 components).
If you don't understand fixed-point maths then I suggest looking for one of the many tutorial documents on the net, or posting a question to one of the many newsgroups, where some kind person will post some information to you.
Closing Words
Well, that's another article done. I haven't seen any palette fading method which uses a similar technique to this one, so it may be a first. It's not a ground-breaking discovery, and it's probably not even new, but it should give coders something to think about. Sometimes you have so much to do that the little things like this can get overlooked.
The method of using fixed-point maths could also be applied to other morphs such as co-ordinates or any other kind of data which needs to be interpolated in some quick, simple way.
Oh, a quick message to all the millions of elite coders out there, please don't send me any flames. Instead spend the time writing an article for Hugi or some other lesser diskmag. This way other people can see how smart you are, instead of being a fire hazard.
Have fun.
Regards