Win32 Assembler Tutorial Part 2.718 (You know this number, don't you?)

Hi there! After the introduction to the basics of the Windows programming style in the first two tutorials, a first real (read: maybe useful for something) program is created here. As I promised in the last tutorial, it deals with multimedia: graphics and sound.

Since repeating the body of a Windows program would be boring, I split the source up into several parts: the *.asm file is nearly the same as in the last tutorial. The initialization, the cleanup code, the main loop and - since Windows works a lot with messages - the message processing have been put into their own include files to keep the code easy to survey.

Note: Start the sample application with no other sound playing and with the color depth set to 32 bits, because the error handling has been kept very simple for clarity. And don't forget to feed some sound into the analog/digital converter (Line-In, MIDI or CD, for example), otherwise you'll see nearly nothing.

1. Graphics under Win32

In the beginning all drawing was done through GDI (no, not Kane & Friends ;-)). However, GDI was made for security, not for speed, so all the graphics processing was done in system memory. The OS wrote the pixels into the graphics card's memory without the applications being involved.

Now there is also DirectDraw, which allows applications to write directly to the GFX card's memory, like using VESA 2. In addition, a lot of extra functions come with it, from palette manipulation through resizable transparent sprite support up to all the 3D stuff.

1.1. DirectX and ASM

OK, we need libraries and header/include files for using the DirectX API. However, ASM is not a language supported by the software firms. And because the different assemblers all have their own advanced syntax, one include file cannot be used for all of them. Two come with this tute: one for NASM and one for TASM/MASM. Both use the same names for functions and constants as the original SDK header files, so using them is quite easy. There wasn't a good header for DirectSound, so I included the few needed DirectSound definitions directly in the source. It looks like the number of Win32 asm include files is increasing rapidly now, so this problem is fading away. :-D

Since I ran into problems with newer DirectX DLLs when statically linked DLL entry points were used, I decided to get the entry points at run time. And the best thing: doing so removes the need for additional *.lib files!

1.2. Getting DLL entry points at run time

This technique is not as weird as one may think. Everyone knows a program that uses it: the WinAmp plug-ins all export the same function names, but since they are different DLLs, static linking is impossible.

The following code loads a DLL (if it is not loaded yet) and retrieves a handle (returned in eax) for it:


		push offset NameOfTheDLLtoLoad	 ;zero-terminated file name
		call LoadLibraryA		 ;eax = module handle, 0 on failure

NameOfTheDLLtoLoad is a zero-terminated string.

With this handle one can get the 32-bit linear address of the entry point of a given function in eax:


		push offset NameOfTheFunctionWeWantToCall  ;zero-terminated string
		push [HandleOfTheLoadedDLL]		   ;handle returned by LoadLibraryA
		call GetProcAddress			   ;eax = entry point, 0 if not found

The function can be called now with a simple


		call eax

It is also possible to load files other than DLLs as long as they export at least one function to call. Note that the address of a function can also be requested by its ordinal, a number identifying the function, instead of by its name. However, ordinals are more likely to change between versions of a DLL than names are.
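
Put together, the two steps might look like the following sketch. It uses the MASM/TASM syntax of the snippets above; the names DDrawDllName, DDrawCreateName, hDDrawDll and pDirectDrawCreate as well as the label LoadFailed are made up for this example. Both API functions return 0 in eax on failure, which is worth checking before the address is used.

		;in the data section
		DDrawDllName	  db "DDRAW.DLL",0
		DDrawCreateName   db "DirectDrawCreate",0
		hDDrawDll	  dd 0		 ;module handle
		pDirectDrawCreate dd 0		 ;entry point

		;in the code section
		push offset DDrawDllName
		call LoadLibraryA
		test eax,eax			 ;0 = DLL could not be loaded
		jz   LoadFailed			 ;hypothetical error label
		mov  [hDDrawDll],eax

		push offset DDrawCreateName
		push eax			 ;module handle
		call GetProcAddress
		test eax,eax			 ;0 = function is not exported
		jz   LoadFailed
		mov  [pDirectDrawCreate],eax	 ;later: call [pDirectDrawCreate]

When the DLL is no longer needed, its handle can be passed to FreeLibrary.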

1.3. DirectX uses COM

COM stands for Component Object Model, the API style used for many of the newer parts of Win32. It is more flexible than the old way of doing things. Passing parameters to functions works the same as with the standard API functions; calling them, however, works differently.

As the name tells, it is designed to be object oriented. An object may represent, for example, our graphics card. This object has several properties and, for using it, a bunch of functions belonging to it. These functions are implemented in a so-called interface. An interface points to a table (called vtable) containing the entry points of all its functions. Calling a function (the functions are called methods here, since they behave differently depending on the object and the version they are created from) now works the following way:

1. We have a variable (called Interface) which has been filled out with the 32-bit linear address of the object, whose first dword is the pointer to the vtable.
2. The vtable pointer points to the start of the function table.
3. The table is a list of dword pointers to the entry points. To get the 3rd function of the table, simply read the third dword of the table (at offset 8).
4. Call the address found in the table.

Now you should be able to do it on your own. It's just a chain of pointers. Because repeating this all the time sucks I made a macro handling it:


		DXfunction macro  interface , method
		  mov edi,[interface]	 ;edi = COM-Object (address)
		  mov edi,[edi] 	 ;edi = VTable (address)
		  mov edi,[edi+method]	 ;edi = call destination
		    push [interface]	 ;the object itself is the first parameter
		    call edi
		endm

Note that calling a COM method always requires the interface as a parameter, too, so the macro takes care of this as well (the same applies in C/C++, so you can use the functions in the same way). Keep in mind that the macro overwrites edi.

Since an object, and so its interface, can be reused by several apps (or several times by your own code), all interfaces include two methods for changing the internal reference count of the interface. Every time an object is needed, its reference count is increased by 1 using AddRef, and every time it is no longer needed, the counter is decreased by Release. If the counter reaches zero, the object is thrown away (together with all child objects created through it).
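
Every COM interface starts with the same three methods, so their vtable offsets never change: QueryInterface at 0, AddRef at 4 and Release at 8. The equates below are spelled out only for illustration, since the include files coming with this tutorial may already define them. MyObject is a hypothetical dword variable holding an interface pointer.

		QueryInterface equ 0		 ;1st entry of every vtable
		AddRef	       equ 4		 ;2nd entry
		Release        equ 8		 ;3rd entry

		DXfunction MyObject, AddRef	 ;one more user: counter + 1
		;... work with the object ...
		DXfunction MyObject, Release	 ;done with it: counter - 1

Both methods return the new value of the counter in eax, but this value should be used for debugging only.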

2. 'Nuff said, start the engines

Did I forget something? Oh, yes, to tell how to create the first object. The simplest way is using the DirectDrawCreate function inside DDRAW.DLL (the entry point is determined as shown above).

If it succeeds we get our pointer filled out with the address of the object.
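
A call might look like the following sketch. pDirectDrawCreate is assumed to hold the address obtained through GetProcAddress; DDObject and hWnd are hypothetical dword variables (the interface pointer to fill out and the handle of our window), and InitFailed is a hypothetical error label. Passing 0 as the first parameter selects the primary display device, and a return value of 0 (DD_OK) in eax means success. Before any surface can be created, the cooperative level has to be set as well.

		push 0				 ;pUnkOuter, must be 0
		push offset DDObject		 ;receives the address of the object
		push 0				 ;lpGUID, 0 = primary display device
		call [pDirectDrawCreate]
		test eax,eax			 ;DD_OK = 0
		jnz  InitFailed

		push DDSCL_NORMAL		 ;non-exclusive (windowed) mode
		push [hWnd]
		DXfunction DDObject, SetCooperativeLevel
		test eax,eax
		jnz  InitFailed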

The DirectDraw object we got is quite boring. It just represents the GFX device. But we want to write to its memory. A block of GFX memory is called a surface. A surface is not only the memory, it is a complete object with functions for its manipulation.

A surface is set up by a call to IDirectDraw::CreateSurface (where IDirectDraw is replaced by our DirectDraw object):


		push 0	 ;pUnkOuter, must be 0
		push offset PointerToFillOutWithAddressOfTheSurfaceObject
		push offset PointerToAStructureDescribingTheTypeOfSurfaceWanted
		DXfunction Our_DirectDrawObject, CreateSurface

If eax is 0 afterwards, our surface pointer has been filled out and the surface is initialized.
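
The structure describing the surface is a DDSURFACEDESC. Here is a minimal sketch for the primary surface. It assumes that the include file defines DDSURFACEDESC as a structure with the field names of the SDK headers (the exact structure syntax differs between assemblers) and that the variable is zero-filled; the names ddsd, DDObject, PrimarySurface and InitFailed are made up for this example.

		;in the data section (assumed to be zero-filled)
		ddsd	       DDSURFACEDESC <>
		PrimarySurface dd 0

		;in the code section
		mov  [ddsd.dwSize],size DDSURFACEDESC	 ;must always be filled in
		mov  [ddsd.dwFlags],DDSD_CAPS		 ;only the ddsCaps field is valid
		mov  [ddsd.ddsCaps.dwCaps],DDSCAPS_PRIMARYSURFACE

		push 0					 ;pUnkOuter, must be 0
		push offset PrimarySurface		 ;receives the surface object
		push offset ddsd
		DXfunction DDObject, CreateSurface
		test eax,eax				 ;DD_OK = 0
		jnz  InitFailed

For an offscreen surface of 256*256 pixels, dwFlags would additionally contain DDSD_HEIGHT and DDSD_WIDTH, dwHeight and dwWidth would be set to 256, and dwCaps would be DDSCAPS_OFFSCREENPLAIN.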

The most interesting part of the GFX memory is the front buffer, called the primary surface in DirectDraw. The primary surface covers the whole screen we see on the monitor, no matter whether we use DirectDraw in fullscreen or in windowed mode. So the attributes of the primary surface are those of the current video mode.

Since the so-called windowed mode uses the same framebuffer, one may wonder where the difference between the two modes is. There is only one tiny (but effective) difference: in fullscreen mode you can exchange the primary surface with another one having the same attributes (read: page flipping) and you can set any video mode wanted (and thus the attributes of the primary surface).

It is therefore better to distinguish between the exclusive and the non-exclusive mode of an application than between fullscreen and windowed mode.

Most GFX code writes to an offscreen part of GFX memory before showing the graphic instead of writing directly to the Front Buffer. The sample application uses such a backbuffer surface by creating an offscreen surface with the same bit depth as the primary surface, but with a different size: 256*256 pixels.

Surfaces are also used for storing and displaying textures, video streams, alpha maps,...

2.2. How to bring the offscreen surface's content to the primary: blitting

Blitting (a technique also used in GDI for years) is nothing more than sprite animation. A blit copies the content of one surface into another one. One can also blit only a part of a surface, or only to a part of the other surface. Blits support resizing, color keys and transparency. One can also do a colorfill with them. The benefit of using blits instead of doing it on your own in software is that all these functions are done in hardware on most cards (but shitty Voodoo2 does most in software).

Resizing? Blitting to a part of a surface, no matter where it starts? That's ideal for bringing our back buffer to the front one, because we want to draw to the client area of the window only. The only thing required is to tell the blitter function the size and position of the window's client area.

One can switch resizing and transparent blitting on and off in the example using the two items in the menu.
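
As a sketch, the client area can be obtained with GetClientRect and converted to screen coordinates with ClientToScreen before it is handed to the Blt method. The names hWnd, DestRect (four dwords: left, top, right, bottom), BackSurface and PrimarySurface are hypothetical, and Blt is assumed to be defined as a method offset in the include file.

		push offset DestRect
		push [hWnd]
		call GetClientRect	 ;left/top = 0, right/bottom = client size
		push offset DestRect	 ;convert the upper left corner...
		push [hWnd]
		call ClientToScreen
		push offset DestRect+8	 ;...and the lower right one
		push [hWnd]
		call ClientToScreen

		push 0			 ;lpDDBltFx, no special effects
		push DDBLT_WAIT		 ;wait if the blitter is busy
		push 0			 ;lpSrcRect, 0 = whole source surface
		push [BackSurface]	 ;source surface
		push offset DestRect	 ;destination rectangle on the screen
		DXfunction PrimarySurface, Blt

If the destination rectangle has a different size than the source, the blit resizes the image.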

2.3. Writing to the GFX mem

It is nice having the surface, but it is useless as long as the position of the surface within the 32-bit linear address space is unknown. A simple call to the surface's Lock method provides its address and allows the program to write to this memory area. Another value is returned as well, the so-called pitch. The pitch is the distance in bytes from the start of one horizontal line to the start of the next one. It can be larger than the width of a line in bytes, because every line may be followed by some extra bytes before the next line starts. Writing to these extra bytes is not a good idea since they may be used for another surface or serve as a kind of cache, etc.

Note that the Lock function does not only Lock the video mem: it also prevents all other apps as well as the GDI from writing to it. So the code between Lock and the corresponding Unlock is critical and should be as short as possible. Any errors here can cause severe problems.
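
A sketch of a complete Lock/Unlock pair that sets a single pixel at 32 bit colour depth follows. It assumes that ddsd is a DDSURFACEDESC variable whose dwSize field has already been filled in and that the include file provides the SDK field names; BackSurface, x, y and the label SkipFrame are made up for this example.

		push 0			 ;hEvent, not used
		push DDLOCK_WAIT	 ;wait until the surface can be locked
		push offset ddsd	 ;receives lpSurface and lPitch
		push 0			 ;lpDestRect, 0 = lock the whole surface
		DXfunction BackSurface, Lock
		test eax,eax
		jnz  SkipFrame		 ;no access this time, try again later

		mov  edi,[ddsd.lpSurface]	 ;edi = start of the surface memory
		mov  eax,[y]
		imul eax,[ddsd.lPitch]	 ;eax = y * pitch
		add  edi,eax		 ;edi = start of line y
		mov  ebx,[x]
		mov  dword ptr [edi+4*ebx],00FFFFFFh	 ;white pixel, 4 bytes each

		push [ddsd.lpSurface]	 ;the address returned by Lock
		DXfunction BackSurface, Unlock

The address of a line is always computed from the pitch, never from the width of the surface.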

2.4. Clippers

A clipper object prevents one or more rectangular parts of the surface it is attached to from being changed. The main uses of clippers are cropping blits that hit the edge of a surface and restricting the access to the primary surface to the visible area of a certain window. The sample application does the second one by telling the clipper object the handle of our window. The rest is done automatically, including moving and resizing the window. Just comment out the part of the source that initializes the clipper and look how differently the changed executable behaves.
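
Setting up such a window clipper takes three calls, sketched here. DDObject, Clipper, PrimarySurface and hWnd are hypothetical dword variables, and the method names are assumed to be defined as offsets in the include file. Error checks are left out to keep it short.

		push 0			 ;pUnkOuter, must be 0
		push offset Clipper	 ;receives the clipper object
		push 0			 ;dwFlags, must be 0
		DXfunction DDObject, CreateClipper

		push [hWnd]		 ;the window whose visible area is used
		push 0			 ;dwFlags, must be 0
		DXfunction Clipper, SetHWnd

		push [Clipper]		 ;attach the clipper to the surface
		DXfunction PrimarySurface, SetClipper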

3. GFX is only one medium - here is the other one: Sound.

Using DirectSound works similarly to DirectDraw: we call DirectSoundCreate or DirectSoundCaptureCreate instead of DirectDrawCreate, and we create sound buffers instead of graphic surfaces. Locking works the same as for DirectDraw. The special thing about sound buffers is (of course :-)) that they change without being actively controlled by the application, since the sound card reads from and writes to them at its own speed (and has to do so if we do not want crackles and pops in the audio stream).

A sound buffer may be played or recorded only up to its end. But very often a streaming buffer is needed, where the playback or recording position wraps around to the start of the buffer when it reaches the end.

The result of this is that we have to take care when accessing a buffer, so that the area currently used by the sound card does not get locked. The application does this by getting the current recording position of the buffer and choosing the locked region accordingly. Another possible solution is to let DirectSound inform us whenever a certain position in the buffer is hit (the IDirectSoundNotify interface does this).
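
For a capture buffer this might look like the sketch below, which locks the 256 bytes recorded most recently. The method offset names DSCB_GetCurrentPosition, DSCB_Lock and DSCB_Unlock are hypothetical (the offsets differ from those of the surface methods with the same names), as are the variables and the constant BUFFER_SIZE, which is assumed to be a power of two so that the wrap-around can be done with a simple and.

		push offset ReadPos	 ;data before this position is safe to read
		push offset CapturePos	 ;where the hardware is writing right now
		DXfunction CaptureBuffer, DSCB_GetCurrentPosition

		mov  eax,[ReadPos]
		sub  eax,256		 ;start of the last 256 safe bytes...
		and  eax,BUFFER_SIZE-1	 ;...wrapped around at the buffer start

		push 0			 ;dwFlags
		push offset Bytes2	 ;size of the second part
		push offset Ptr2	 ;second part, used if the region wraps around
		push offset Bytes1	 ;size of the first part
		push offset Ptr1	 ;first part
		push 256		 ;number of bytes to lock
		push eax		 ;offset of the first byte to lock
		DXfunction CaptureBuffer, DSCB_Lock

		;... read Bytes1 bytes at Ptr1, then Bytes2 bytes at Ptr2 ...

		push [Bytes2]
		push [Ptr2]
		push [Bytes1]
		push [Ptr1]
		DXfunction CaptureBuffer, DSCB_Unlock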

There is one primary buffer for each DirectSound object, each having as many secondary buffers as we want. The secondary buffers are mixed into the primary by DirectSound in software or, if possible, in hardware. However, there is only one recording buffer for each DirectSoundCapture object (hmmm... maybe someday someone invents a de-mixing routine).

4. The main loop of the sample program...

...works like this:

1. call PeekMessage
Instead of GetMessage, PeekMessage is used here because it does not wait until a message arrives. To keep things easy to understand, GetMessage is still used to fetch a message once one has arrived, so PeekMessage only tells us whether there is a message or not.

2. Clear the Offscreen surface with a colorfill blit

3. Lock the offscreen surface
3.1. Lock the Capture Buffer
3.2. Get the captured bytes and use them for drawing the waveform.
This part is cool: 256 bytes of 8 bit samples written to a screen sized 256*256 - this is what all asm coders have dreamt of under DOS. Note that this routine is for 32 bit colour depth only. For 8 and 16 bit depths each pixel is only 1 or 2 bytes wide, so the store has to shrink accordingly, for example mov [edi+ebx],al and mov [edi+2*ebx],ax instead of a full dword write. Normally an application should support all colour modes, or use exclusive mode and set the video mode to whatever it needs. However, for simplicity and easier understanding I did not include this in the sample.
3.3. Unlock the capture buffer and the offscreen surface

4. Blit the offscreen surface to the primary according to the flags set by the user

5. jmp 1.
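
Step 1 and the jump back can be sketched like this. Msg is a hypothetical MSG structure, and MainLoop, DrawFrame and ExitProgram are hypothetical labels; the real sample may be organized differently.

MainLoop:
		push PM_NOREMOVE	 ;only look, leave the message in the queue
		push 0			 ;no upper filter limit
		push 0			 ;no lower filter limit
		push 0			 ;messages of all our windows
		push offset Msg
		call PeekMessageA
		test eax,eax
		jz   DrawFrame		 ;no message waiting: do the graphics

		push 0
		push 0
		push 0
		push offset Msg
		call GetMessageA	 ;returns at once, a message is waiting
		test eax,eax
		jz   ExitProgram	 ;0 = WM_QUIT received
		push offset Msg
		call TranslateMessage
		push offset Msg
		call DispatchMessageA
		jmp  MainLoop

DrawFrame:
		;steps 2 to 4 of the list above go here
		jmp  MainLoop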

5. Things to know

5.1. Cursor and blitting

You may have noticed that the cursor is not visible while it is inside the sample app's window. This is normal, because the blit in the main loop overwrites the cursor as often as the hardware is capable of doing it. And the clipper removes the cursor while the blit to the primary surface is going on in order to prevent graphic artifacts. A possible solution may be, for example, to reduce the frequency of the blits.

5.2. Error Handling

The example has only minimal error handling code: it just tells where an error happened and exits. The Lock functions are the only ones that behave differently. It is quite realistic that the memory is not available at a given moment, so the memory access simply gets another chance in the next loop pass.

If you experimented a bit with the sample, you may have noticed that some circumstances cause an error: changes of the display mode (for example, by the screen saver). When the display mode changes, the surface memory is gone and the DDERR_SURFACELOST error appears. In order to continue, one has to call IDirectDrawSurface::Restore on the lost surfaces to get their memory back (their content has to be redrawn afterwards). This may also happen in exclusive mode: the user Alt-Tabbing away from your program is the most common cause.

If you use DirectX more effectively than this tutorial does, you will meet a lot of error codes which are not severe: a blit is still in progress, another function has not finished yet,... One may just wait until the error no longer appears. However, the processing time could be used for better things, so do so if you can! This can speed up your code a lot when the hardware is capable of doing a lot on its own, leaving the CPU free for other things...
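
A minimal sketch of such a check, placed right after a blit, could look like this. Restore is a method of the surface objects and takes no parameters; PrimarySurface, BackSurface and the label NoRestore are hypothetical names, and DDERR_SURFACELOST is assumed to be defined in the include file.

		;eax = return value of the preceding Blt call
		cmp  eax,DDERR_SURFACELOST
		jne  NoRestore
		DXfunction PrimarySurface, Restore	 ;get the memory back
		DXfunction BackSurface, Restore
NoRestore: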

5.3. Improving the main loop

The main loop is far from optimal: it uses all available CPU time, even if the window is minimized and no GFX is shown. Sometimes it uses so much time that other programs nearly seem to have locked up as soon as another application wants to access the primary surface. So a real program should also contain code that adapts the refresh rate to whether the window is active, inactive or even minimized. The best solution would be to set up a separate thread for the GFX and let the main thread handle only the messages, using GetMessage.

BTW: DirectSound already creates its own thread for capturing the sound, since our program might not call the DirectSound functions often enough (this is the case while the program window is being resized or dragged).
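
One cheap improvement, given here only as a sketch, is to skip the drawing while the window is minimized and to sleep until the next message arrives instead. IsIconic and WaitMessage are standard API functions; hWnd and the labels MainLoop and DrawFrame are hypothetical names.

		;reached when PeekMessage found no message
		push [hWnd]
		call IsIconic		 ;eax <> 0 if the window is minimized
		test eax,eax
		jz   DrawFrame		 ;visible: draw as usual
		call WaitMessage	 ;minimized: give up the CPU until a message arrives
		jmp  MainLoop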

5.4. Multi Device support

Bigger programs should allow the user to select which sound (and maybe graphics) card to use before creating the DirectDraw or DirectSound object. If you wonder who has several GFX cards, just think of the 3D add-on cards, which are the most common case for this.

The value passed to the create functions can be obtained through the DirectDrawEnumerate and DirectSoundEnumerate functions. These functions are standard API style, not COM like.
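
For DirectDraw the enumeration might be sketched as follows. The function is exported by DDRAW.DLL under the name DirectDrawEnumerateA, and pDirectDrawEnumerate is assumed to hold its address obtained through GetProcAddress. EnumCallback is a hypothetical label; DirectDraw calls it once for every device.

		push 0			 ;lpContext, handed through to the callback
		push offset EnumCallback
		call [pDirectDrawEnumerate]

		;...

EnumCallback:
		;[esp+4]  = pointer to the GUID (0 for the primary display)
		;[esp+8]  = pointer to the description string
		;[esp+12] = pointer to the driver name string
		;[esp+16] = lpContext
		mov  eax,1		 ;nonzero = continue with the next device
		ret  16			 ;the callback removes its 4 parameters

The GUID pointer received in the callback is the value that can be passed to DirectDrawCreate later, so a real callback would copy the GUID somewhere.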

5.5. Hardware capabilities, hardware emulation and unsupported options

If a driver cannot do what you want, DirectX emulates it in software. However, there are a lot of capabilities which are not part of this hardware emulation. So there are things you can count on being available on all systems (the ones supported by the emulation) and others you can't count on. For the latter, one may write one's own software routine and use it either all the time or only if no hardware doing the job is available. Other capabilities may be used if they are available and simply left out if they aren't. Examples of these may-be-used-if-available capabilities are interpolation, which just improves the quality, and hardware working asynchronously to the CPU (which is useless to emulate).

5.6. How to use newer versions of DirectX

If you want to use the new possibilities of newer DirectX versions, you have to use a newer interface supporting them. Each interface has a value that identifies it: the IID. These values are included in the DirectX header files. The following snippet shows how to obtain a newer version:


		push offset PointerToFillOutWithAddressOfNewerVersionOfObject
		push offset IIDofTheNewerVersionOfTheObjectsInterface
		DXfunction PointerContainingOlderVersionOfTheObject , QueryInterface

If eax is set to 0, the call was successful. Now we can throw away the old object, it is no longer needed:


		DXfunction PointerContainingOlderVersionOfTheObject , Release

As long as the old interface is not used anywhere else, its reference count has now reached 0 and it is discarded automatically.

But there is another way of creating an object than calling DirectDrawCreate, DirectSoundCreate, etc., which all create the oldest version of these objects:

One can use the basic COM functions for creating the DirectX objects (these functions are within ole32.dll). Here is the sample code:


 push 0
 call CoInitialize

This function needs to be called only once. At the end of the program, just call CoUninitialize (no parameters needed).

To create the object with the given interface do:


push offset PointerToFillWithAddressOfTheCreatedObject
push offset IIDofTheInterfaceWeNeed ;Defined in the DirectX headers
push CLSCTX_ALL      ;run in all contexts, not that interesting, though
push 0		     ;no controlling unknown available, just set it to 0
push offset CLSIDofTheObject	     ;defined in the DirectX headers
call CoCreateInstance

Last step: Initializing it.


		push 0
		DXfunction PointerToFillWithAddressOfTheCreatedObject, Initialize

If we create the objects this way, we do not need to throw away older versions if newer ones are used. Some parts of DirectX like DirectMusic can only be initialized this way.

6. OK, that was it for this time.

Maybe programming like this looks a bit strange at first sight, but once you get used to it you will find these interfaces easier to use than the standard API. CU in the next tutorial and go coding like hell!