C++ Tips-n-Tricks
brioche/aspirine

00 - Table of Contents

	01 - Introduction
	02 - Template functions
	03 - Enumerations are cool
	04 - Having fun with static
	05 - More about templates
	06 - New stuff about new
	07 - Doctor, I'm still addicted to void (*f)()
	08 - Namespace, the final frontier
	09 - Exceptional use of exceptions
	10 - Not so secure typecasts
	11 - Resource control
	12 - Who holds the reference?
	13 - Creating objects
	14 - Streams
	15 - Wrapping C++ with C
	16 - Elegant multiple inheritance
	17 - Destruction
	18 - Philosophical Corner
	19 - References and Resources
	20 - Closing Words and Acknowledgements

01 - Introduction

Hello object fellows,

It seems that some of you appreciated the Design Patterns articles released in the previous issue of Hugi... so, are you ready for more nasty C++?

Ok, this one is about some lovely subtleties and other weird things people usually "love to hate about C++", as James Gosling, the creator of the Java programming language, used to say when doing technology evangelism at every Sun Microsystems developer conference... but that's not the point!

Yes, this article is not really about Object Oriented design but about all these - nasty - technicalities inherently bound to the C++ language.

Sorry for the newbies, this article is aimed at people who already have some practice with C++: you'll have a new flame to write in your diary! :) But you can check out the resources at the end of this article to learn C++.

Now, let's go for some hardcore C++!

02 - Template functions

Warning: Template functions are not to be mixed up with the Template Method design pattern (see Hugi 18's Coding Corner).

I'm sure most of you have probably heard about template classes but what about functions? As a reminder: a template is a function (or class) that you code without specifying some of the data types you deal with. It means that you just write the skeleton of the algorithm and, at compilation time, you choose the real type that the class or the function will use. Here's a short example of a template function:

	  template <typename T> T square(T x) { return x*x; }

     void foo()
     {
       int i = square<int>(2);
       float f = square<float>(3.14159f);
       Matrix m2 = square<Matrix>(m);   // m: some Matrix defined elsewhere
       // ...
     }

All data types (primitives and structures/classes) will be accepted as long as the binary * operator is available for them, i.e. it has been properly overloaded for your own classes.

Template functions are efficient: the code is generated at compile time for each type you use and can be inlined like any other function, so there shouldn't be any expensive branch operation when expanding a template.

Actually, all kinds of #define-based macros should be banned and replaced by inline functions and/or template functions.
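To make the macro argument a bit more concrete, here is a minimal sketch (not taken from any real code base) comparing a #define-based square with the template version. SQUARE_MACRO, next_value() and the counting helpers are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

// Classic macro: the argument text is pasted twice, so it is evaluated twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Template function: the argument is evaluated once and type-checked.
template <typename T> inline T square(T x) { return x * x; }

// Hypothetical helper with a side effect: counts how often it is called.
int calls = 0;
int next_value() { return ++calls; }

// Number of next_value() calls triggered by the macro version.
int calls_through_macro()
{
  calls = 0;
  int unused = SQUARE_MACRO(next_value());  // next_value() * next_value()
  (void)unused;
  return calls;
}

// Number of next_value() calls triggered by the template version.
int calls_through_template()
{
  calls = 0;
  int unused = square(next_value());
  (void)unused;
  return calls;
}

void check_square()
{
  assert(calls_through_macro() == 2);     // side effect happened twice
  assert(calls_through_template() == 1);  // side effect happened once
  assert(square(1.5) == 2.25);            // T deduced as double
}
```

The macro silently runs the side effect twice, whereas the template behaves like any ordinary function call.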

Of course, things that apply to small algorithms also work with larger pieces of code. For instance, if you want to code a single line drawing algorithm for all screen resolutions and pixel depths (from 8 to 32-bpp), just implement it like this:

	  template <typename Pixel>
     void line(int x1, int y1, int x2, int y2, Pixel col, Pixel* buffer)
     {
       // ...
       // Clipping and other Bresenham's algorithm related stuff
       // ...
       buffer += clipped_y1*surface_width+clipped_x1;
       while (length--) {
	 *buffer = col;
	 // Update buffer offset and decision variables
	 // ...
       }
     }

Believe it or not but we've just achieved this: "write once, use many!". Ok, ok, I know there's still an issue with the surface_width variable but you might pass it to the routine as an argument... and in a few minutes, we'll see another trick to hardcode the width in a clean way, so stop complaining! :)

03 - Enumerations are cool

Enumerations are probably (and unfortunately, imho) one of the most underused features of C and C++.

One of the advantages of enum is that the compiler treats them nearly as plain integers (or some command line switch is provided to do so explicitly).

When you need to define a set of integer constants (for flags or whatever), use enums. But take care if you want to store bitmasks as enum since their values are signed integers (maybe your compiler supports some "unsigned enum" switch). I've also seen that it's possible to cheat some compilers by doing something like this:

	  enum Bitmask {
       a = 1, b = 2, c = 16,
       d = 128, e = 512,
       f = 16384, g = 65536,
       __i_am_here_to_cheat_the_compiler__ = 0x7fffffffL
     };

But I'm absolutely not sure of the portability of this hack...

I'm sure you've already wondered how to store the (constant) size of an array without using one of those infamous #define. In C, you can't use `const' since a `const int' is not a compile-time constant there, and the compiler will refuse to statically set the size of an array with it. Standard C++ does accept a `const int' initialized with a constant expression, but some compilers still choke on constants initialized inside a class definition. Why the heck don't you use enumerations? They work everywhere, and the following piece of code is perfectly correct:

	  // An anonymous enumeration
     enum {
       screen_width = 640,
       screen_height = 480
     };

     pixel double_buffer[screen_width*screen_height];
     raster* raster_list[screen_height];

Great, ain't it? :) Btw, another way to statically set the size of an array is to use a template (see below).

When creating an enumeration, you may always rely on the enumerators previously defined, as follows:

	  enum Render_settings {
       Reflections = 1,
       Shadows = Reflections << 1,
       Textures = Shadows << 1,
       Bicubic_spline_lerp = Textures << 1,
       Fuzzy_logic_based_trilinear_filtering = Bicubic_spline_lerp << 1
     };

This is a clean way to create a bitmask without having to worry about either machine endianness or the starting position of the flag bits. So if I want to move the first bit of the flag bitfield, I only change the first enumerator and the rest of the enumeration follows.
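Here is a small sketch of how such flags might be combined and tested. The helper functions enable(), disable() and is_enabled() are assumptions made for this example, and the mask is kept in a plain unsigned because or-ing two enumerators yields an integer, not a Render_settings.

```cpp
#include <cassert>

// Flag values derived from one another, as in the enumeration above
// (shortened to three entries).
enum Render_settings {
  Reflections = 1,
  Shadows     = Reflections << 1,
  Textures    = Shadows << 1
};

// Hypothetical helpers working on an unsigned mask.
inline unsigned enable(unsigned mask, Render_settings flag)
{
  return mask | static_cast<unsigned>(flag);
}

inline unsigned disable(unsigned mask, Render_settings flag)
{
  return mask & ~static_cast<unsigned>(flag);
}

inline bool is_enabled(unsigned mask, Render_settings flag)
{
  return (mask & static_cast<unsigned>(flag)) != 0;
}

void check_flags()
{
  unsigned settings = 0;
  settings = enable(settings, Reflections);
  settings = enable(settings, Textures);
  assert(is_enabled(settings, Reflections));
  assert(!is_enabled(settings, Shadows));
  settings = disable(settings, Reflections);
  assert(settings == static_cast<unsigned>(Textures));
}
```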

When you write a C++ module or library, it's usually clever to store enums as public definitions within the class they're related to instead of defining them outside of its scope (i.e. in the global scope or within the namespace the class has been declared in). Check out the following example:

	  class Obj_loader {

     public:
       // ...
       enum State { OK = 1, LOAD_ERR = 2, CONV_ERR = 3 };
       State get_state() const;

     };

     void foo(Obj_loader* loader)
     {
       // ...
       switch (loader->get_state()) {
       case Obj_loader::OK:
	 // ...
	 break;
       case Obj_loader::LOAD_ERR:
	 // ...
	 break;
       case Obj_loader::CONV_ERR:
	 // ...
	 break;
       default:
	 // do nothing but satisfy strict compilers
	 ;
       }
     }

This might look harder to write but it's also more flexible: you immediately know where the symbol has been defined and to which class it is related.

04 - Having fun with static

When you declare a member (a method or an attribute) as `static', it is considered as attached to a class rather than to an object. This means that to use these items (which are members of some class), you are not obliged to instantiate that class: static attributes exist and static functions can be called without any instance of the class being created.

Static fields are useful to define properties that are common to all the instances of a class without having to store the data in each instance of that class. The other advantage is that when you modify the static data, it's up to date for all the instances at the same time.

If you want to implement runtime customizable default values for some objects, you should consider using static data members and static methods to change those members during the execution.
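As a rough sketch of such runtime customizable defaults, consider the hypothetical Particle class below (the class and its lifetime attribute are made up for this example). Every new particle picks up whatever default is current at construction time.

```cpp
#include <cassert>

// Hypothetical class: each particle starts with the current default lifetime.
class Particle {
  static int default_lifetime;   // stored once, shared by all instances
  int lifetime;                  // stored in each instance

public:
  Particle() : lifetime(default_lifetime) {}
  int get_lifetime() const { return lifetime; }

  // Static methods: callable without any Particle instance
  static void set_default_lifetime(int frames) { default_lifetime = frames; }
  static int get_default_lifetime() { return default_lifetime; }
};

// The static member must be defined exactly once, outside the class
int Particle::default_lifetime = 100;

void check_defaults()
{
  Particle early;
  Particle::set_default_lifetime(250);   // change the default at runtime
  Particle late;
  assert(early.get_lifetime() == 100);   // built before the change
  assert(late.get_lifetime() == 250);    // built after the change
  Particle::set_default_lifetime(100);   // restore the initial default
}
```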

You may use static fields to store metadata (= data about your objects) for debugging purpose or any other excellent reason like lame memory wasting copyright information. :)

Example:

	  template <typename T> class Buffer {

       static const char *meta_name_Buffer;
       static const char *meta_author_Buffer;

     public:

       // ...

     };

     template <typename T> const char *Buffer<T>::meta_name_Buffer = "Buffer";
     template <typename T> const char *Buffer<T>::meta_author_Buffer = "MemyselfandI";

You may also use static fields to make instance counters or implement singletons (see Design Patterns).

	  // Foo.h
     class Foo {

       static int counter;

     public:

       Foo();
       ~Foo();

       // Other operations ...

       static int num_instances() { return counter; }

     };

     // Foo.cc
     int Foo::counter = 0;   // init counter

     Foo::Foo()
     {
       // whatever...
       counter++;
     }

     Foo::~Foo()
     {
       // whatever... (cut-n-paste rules)
       counter--;
     }

At any time, you may know how many instances of the class Foo are present in your application even without having to instantiate an object!

When inheriting from a class having static members, they are logically still unique for all the instances of the derived classes. So, don't forget that if you write a class inheriting from Foo, each instance of the child class will count as a Foo since inheritance is one of these "is a" relations.
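The following sketch illustrates that point with two hypothetical classes, Counted and Derived. One assumption goes beyond the text: the copy constructor is counted too, since a copied object is destroyed like any other and the counter would otherwise drift.

```cpp
#include <cassert>

// Hypothetical base class counting its live instances.
class Counted {
  static int counter;

public:
  Counted() { counter++; }
  Counted(const Counted&) { counter++; }   // copies are instances too
  virtual ~Counted() { counter--; }

  static int num_instances() { return counter; }
};

int Counted::counter = 0;

// A derived object "is a" Counted, so it shares the very same counter.
class Derived : public Counted {
};

void check_counter()
{
  assert(Counted::num_instances() == 0);
  {
    Counted base;
    Derived child;
    Derived copy(child);
    assert(Counted::num_instances() == 3);
    assert(Derived::num_instances() == 3);   // same static member
  }
  assert(Counted::num_instances() == 0);     // all destroyed
}
```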

Using static public methods declared in the scope of a class is a better solution than declaring global functions that will pollute the namespace they are declared in.

Tip of the day: when you think about adding "just one little global variable" into your program, please consider using a static member stored in some of your classes, it will tremendously increase the reusability of your code!

Static functions are cool when you want to create some widely available factory method.

	  class Return_of_the_Foo {

       // ...

     public:

       Return_of_the_Foo();

       // The factory
       static Return_of_the_Foo* create();

       // ...

     };

     // A very very very simple factory
     Return_of_the_Foo* Return_of_the_Foo::create()
     {
       return new Return_of_the_Foo();
     }

You can even hide the real constructors of the class and just allow the construction using the factory method. The factory method might also encapsulate some default settings for the class to be built. For instance, the factory might select the right constructor according to some contextual information. The behaviour of the static factory method might be changed using other static methods sharing static data. This is great if you want to be sure all the objects created after some environmental change will have the same traits.
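A minimal sketch of such a factory could look like this. The Texture class, its filter level and the setter name are all hypothetical. The constructor is private, so create() is the only way in.

```cpp
#include <cassert>

// Hypothetical class that can only be built through its factory.
class Texture {
  int filter_level;
  static int default_filter_level;   // shared setting used by the factory

  // Private constructor: client code cannot call it directly
  explicit Texture(int level) : filter_level(level) {}

public:
  // The factory applies the current default setting
  static Texture* create() { return new Texture(default_filter_level); }

  // Changes the traits of every object created from now on
  static void set_default_filter_level(int level) { default_filter_level = level; }

  int get_filter_level() const { return filter_level; }
};

int Texture::default_filter_level = 1;

void check_factory()
{
  Texture* before = Texture::create();
  Texture::set_default_filter_level(3);
  Texture* after = Texture::create();
  assert(before->get_filter_level() == 1);
  assert(after->get_filter_level() == 3);
  delete before;
  delete after;
  Texture::set_default_filter_level(1);   // restore the initial default
}
```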

05 - More about templates

Most of you C++ coders have already written templates with one or more classes as template arguments but did you know that it was possible to pass integer values as template arguments (e.g. to set the size of static data structure)?

Here are a few lines of code that will make things clear:

	  template <int size, typename T> class Buffer {

       T data[size];
       int num;

     public:

       Buffer() { num = 0; }

       // Only dumb error checking is performed here ...
       void add_element(T elem) {
	 if (num < size)
	   data[num++] = elem;
       }

       T &element_at(int pos) {
	 if ((pos >= 0) && (pos < size))
	   return data[pos];
	 else
	   throw something_that_hurts();
       }

       // To be continued ...
     };

Caution: template parameters cannot be modified at runtime, for an obvious reason: the code is generated at compile time for each distinct value of size used in your program.

It's always faster to hardcode an algorithm for a set of given values. A nice example would be a routine to apply a blur filter on a given rectangular area of a surface. It would be nice to be able to customize the strength of the blur without losing efficiency by using memory variables in the mainloop. You can use template functions as if they were some kind of macros with advanced features. Just let your compiler generate all the code (with constant values hardcoded in the mainloop) for you! Here's a generic 4-pixels blur routine:

	  // Standard 4 points blur (no clipping)
     // Not optimal code but cool enough as an example :)
     template <int strength, int screen_width = 320>
     void apply_blur(int x, int y, int w, int h, Pix* to, Pix* from)
     {
       int offset = (y+strength)*screen_width+(x+strength);
       to += offset;
       from += offset;
       w -= strength*2;
       h -= strength*2;
       int to_skip = screen_width-w;
       while (h--) {
	 int tmpw = w;
	 while (tmpw--) {
	   Pix a = Pixel::avg(from[-screen_width*strength],
		     from[screen_width*strength]);
	   Pix b = Pixel::avg(from[-strength], from[+strength]);
	   *to++ = Pixel::avg(a, b);
	   from++;
	 }
	 to += to_skip;
	 from += to_skip;
       }
     }

You can force the compiler to generate a concrete function by adding this at the end of the header file containing the template definition:

	  // Blur filters (strength 1 to 5) for 320 pixels wide screen
     template void apply_blur<1>(int x, int y, int w, int h, Pix* to, Pix* from);
     template void apply_blur<2>(int x, int y, int w, int h, Pix* to, Pix* from);
     template void apply_blur<3>(int x, int y, int w, int h, Pix* to, Pix* from);
     template void apply_blur<4>(int x, int y, int w, int h, Pix* to, Pix* from);
     template void apply_blur<5>(int x, int y, int w, int h, Pix* to, Pix* from);

Just reference these routines in a jump table, wrap it into an inline function and you'll get a highly flexible and customizable blurring routine.
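The jump table idea might be sketched as follows. To keep the example short, a trivial scale() template stands in for apply_blur(), and the names Scale_fn, scale_table and scale_by() are assumptions of this sketch only.

```cpp
#include <cassert>

// Stand-in for apply_blur: strength is a compile-time constant
// hardcoded into each generated function.
template <int strength> int scale(int x) { return x * strength; }

typedef int (*Scale_fn)(int);

// Taking the address of each instantiation makes the compiler generate it
static const Scale_fn scale_table[] = {
  &scale<1>, &scale<2>, &scale<3>, &scale<4>, &scale<5>
};

enum { min_strength = 1, max_strength = 5 };

// Inline wrapper: clamps the request, then jumps to the hardcoded variant
inline int scale_by(int strength, int x)
{
  if (strength < min_strength) strength = min_strength;
  if (strength > max_strength) strength = max_strength;
  return scale_table[strength - min_strength](x);
}

void check_jump_table()
{
  assert(scale_by(1, 7) == 7);
  assert(scale_by(3, 7) == 21);
  assert(scale_by(9, 2) == 10);   // clamped to max_strength
}
```

The strength is chosen at runtime, yet each routine in the table was compiled with its own constant.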

I used a similar trick to build a spread of interpolation routines to rasterize 4x4 or 8x8 texture mapped and shaded blocks. That way, I could setup various interpolation routines with hardcoded values (and thus faster than variables) without having to rewrite the whole code.

Templates are also helpful when you want to avoid using virtual functions in time critical routines. With templates, the type checking is performed during the compilation process, not at execution time. Thus you won't have to define strongly typed interfaces to reuse objects in other algorithms. Of course, if you need to dynamically change the behaviour of the object, it won't work. But if all you want is to reuse as much code as possible, templates will most probably do the trick!
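As an illustration, here is a sketch of a span shading loop where the shader is a template parameter instead of a virtual interface. Flat_shader, Dark_shader and shade_span() are invented for this example. The two shaders share no base class, they just both happen to provide shade().

```cpp
#include <cassert>

// Two unrelated hypothetical shaders: no common base class, no virtual call.
struct Flat_shader {
  int shade(int colour) const { return colour; }
};

struct Dark_shader {
  int shade(int colour) const { return colour / 2; }
};

// The shader type is resolved at compile time, so shade() can be inlined
// into the loop. Any type providing a compatible shade() is accepted.
template <typename Shader>
void shade_span(const Shader& shader, const int* from, int* to, int n)
{
  for (int i = 0; i < n; ++i)
    to[i] = shader.shade(from[i]);
}

void check_shaders()
{
  const int src[3] = { 10, 20, 30 };
  int dst[3] = { 0, 0, 0 };

  shade_span(Dark_shader(), src, dst, 3);
  assert(dst[0] == 5 && dst[1] == 10 && dst[2] == 15);

  shade_span(Flat_shader(), src, dst, 3);
  assert(dst[0] == 10 && dst[2] == 30);
}
```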

Do you happen to know member templates? Well, as the name nearly implies, a member template is a template which is a member of a class, and that class doesn't have to be a template itself.

Let's say you have a class which is an abstraction of a quantity: a "rectangle" can be a quantity of pixels, a "length" can be an amount of units within a given coordinate space. If I take the rectangle example, we will most probably meet a situation where we will have to allocate a portion of memory (whose size is related to the size of the rectangle) to store objects. For instance:

	  buffer = new some_type [rect.width()*rect.height()];

Or, if you're a smart coder, you'll have something like:

	  buffer = new some_type [rect.size()];

Member templates allow you to write a template method within your rectangle class like:

	  class Rectangle {
     public:
	// ...
	template <typename T> T* alloc() { return new T [this->size()]; }
     };

Now to create a Pixel surface or a look-up table matching the size of the rectangle, you just have to write this:

	  Pix* surface = rect.template alloc<Pix>();
     float* lut = rect.template alloc<float>();

Although the syntax is weird, I just love it! This is so clean... :)

Member templates are NOT supported by all compilers! To my humble knowledge, a weird piece of software called MSVC doesn't support them at this time.

Most of the time, the code related to a template is inlined in the class, as follows:

	  template <typename T> class Lookup_table {

       T* data;
       // ...

     public:

       // ...
       const T& at(int i) const {
	 return data[i];
       }

     };

This leads to cryptic source code where implementation and interface are totally mixed up. It's a nice idea to implement the template in an external file (included in the .h) or just after your class definition in the header file. Whatever solution you choose, you'll experience one of the craziest C++ syntax oddities ever! Let's go back to our LUT example:

	  // Interface

     template <typename T> class Lookup_table {

       T* data;
       // ...

     public:

       // ...
       const T& at(int i) const;

     };

     // Implementation

     template <typename T> const T& Lookup_table<T>::at(int i) const
     {
       return data[i];
     }

Now, let me explain the ins and outs of the syntax used.

First, we have to specify that the following "object" we will declare is a template, i.e. it has a parameter which is a type. The method we will implement does not really belong to a class called "Lookup_table" but to a class named "Lookup_table<T>" which is parametrized by a given type T. The difference between "typename T" and "T" may seem silly but there is a good reason: "typename T" is a declaration of the parameter to be used whereas "T" is a use of this still unknown type.

Notice that whether your code is written inside or outside of the class, the compiler must see the definition wherever the template is used, so it can usually inline the result just as well.

06 - New stuff about new

The new operator (let me insist on the word operator) is the mechanism which allows you to dynamically allocate memory on the heap. Although you may still use malloc(), free() and other crap, I warmly recommend you to use that allocation method for several reasons like:

- It's usually easier to write (no more ugly typecasts and sizeof)
- You can set up your object in one go by invoking one of its constructors
- new is an operator (as well as delete), thus you may overload it

When you have to perform many dynamic allocations in a short period of time (e.g. in a span-based rendering engine), it's usually a good idea to grab a large portion of memory and use your own memory allocation code. Have you ever heard about "placement" new? It allows you to specify the memory source as an extra argument to the new operator. We may write things like:

	  #include <new>	  // You MUST include this header

     enum { my_mem_pool_size = 4096 };
     char my_mem_pool[my_mem_pool_size];

     void f()
     {
	// Classical new: we get memory from the heap
	Foo* foo1 = new Foo;
	// Placement new: we get our fresh memory from guess who :)
	Foo* foo2 = new (my_mem_pool) Foo;
	// ...
     }

WARNING! Placement new is part of the ANSI/ISO C++ standard (it is declared in <new>), but not every compiler gets it right. To get it working with my egcs/g++ 2.91.60, I had to remove the "-ansi" switch from the command line options.

Although it may sound funny to use your own memory manager, don't miss the point that it might be a real pain in the ass to write a reliable and efficient memory manager. Writing the allocation code is fairly easy but when it comes to freeing memory areas, you'll get lovely gaps in your beautiful linear-and-so-fast memory pool... get your hands on Andy Tanenbaum's book on Modern Operating Systems or browse /usr/src/linux/mm/[kv]malloc.c if you have a copy of Linux installed somewhere... anyway, good luck! :)
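One way to dodge the gap problem entirely is a pool that never frees individual blocks and is simply reset as a whole, once per frame for instance. The sketch below assumes exactly that. Pool, Span and make_span() are hypothetical names, and the objects stored are assumed to need no destructor call.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical linear pool: allocation just bumps an offset,
// and everything is released at once by reset().
class Pool {
  enum { pool_size = 4096 };
  // The union forces an alignment suitable for the listed types
  union {
    char bytes[pool_size];
    double align_double;
    long align_long;
    void* align_ptr;
  } storage;
  std::size_t used;

public:
  Pool() : used(0) {}

  // Returns 0 when the pool is exhausted
  void* allocate(std::size_t size)
  {
    const std::size_t align = sizeof(double);
    size = (size + align - 1) / align * align;   // round up
    if (used + size > static_cast<std::size_t>(pool_size))
      return 0;
    void* p = storage.bytes + used;
    used += size;
    return p;
  }

  void reset() { used = 0; }
  std::size_t bytes_used() const { return used; }
};

// A small object with a trivial destructor, typical of a span renderer
struct Span {
  int x1, x2, y;
  Span(int a, int b, int c) : x1(a), x2(b), y(c) {}
};

// Placement new constructs the Span inside memory taken from the pool
inline Span* make_span(Pool& pool, int x1, int x2, int y)
{
  void* mem = pool.allocate(sizeof(Span));
  if (!mem)
    return 0;
  return new (mem) Span(x1, x2, y);
}

void check_pool()
{
  Pool pool;
  Span* a = make_span(pool, 0, 319, 10);
  Span* b = make_span(pool, 5, 200, 11);
  assert(a != 0 && b != 0 && a != b);
  assert(a->x2 == 319 && b->y == 11);
  pool.reset();                      // "frees" both spans at once
  assert(pool.bytes_used() == 0);
}
```

Objects with a non-trivial destructor would need an explicit destructor call before the pool is reset. That part is left out here.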

07 - Doctor, I'm still addicted to void (*f)()

When moving to C++, most C programmers have to lose their bad habit of writing tricky code. One of these tricks (which is still legal in C++ but often frowned upon) is pointers to functions. When do you need such hacks? When you want to customize the behaviour of an algorithm? No problem, in C++ you have inheritance (get my introduction to Design Patterns in the previous issue of Hugi). When you want to implement function callbacks? It's still possible with classes too!

A very clean way exists to use a syntax similar to old C pointers but without really using pointers. What we're gonna do is overload the () operator, which is the "function call" operator that will be called when you apply () to any instance of a class.

First, you have to define a common interface just like in C.

	  class Operation {
     public:
       virtual ~Operation() {}
       // This is the interface with some arguments
       virtual void operator () (float x, float y, float& z) = 0;
     };

     void foo(Operation& op)
     {
       float u = 0, v = 0, result;
       // We use the object just as if it was a function!
       op(u, v, result);
     }

If you don't want to deal with "complex" syntax, just implement an inline operator in the Operation interface which calls a well-named method.
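A possible shape for this, sketched under the assumption that the well-named method is called compute() and that Add and run() exist only for the sake of the example:

```cpp
#include <cassert>

class Operation {
public:
  virtual ~Operation() {}

  // The well-named method that concrete operations implement
  virtual void compute(float x, float y, float& z) const = 0;

  // Inline operator: gives the function-like syntax and just forwards
  void operator () (float x, float y, float& z) const { compute(x, y, z); }
};

// A hypothetical concrete operation
class Add : public Operation {
public:
  virtual void compute(float x, float y, float& z) const { z = x + y; }
};

// Client code only knows the interface and uses it like a function
inline float run(const Operation& op, float x, float y)
{
  float z = 0;
  op(x, y, z);
  return z;
}

void check_operation()
{
  Add add;
  assert(run(add, 1.5f, 2.0f) == 3.5f);
}
```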

I used this trick to customize some look-up table calculation code in one of our productions (see http://aspirine.planet-d.net/magnus.html).

You may even write it as a template so you can write (almost) datatype-proof operations. This might be useful for calculation or pixel operations.

But don't make me say what I didn't say: I do not consider ALL pointers to functions harmful. At a low level, they are very useful when you really need very fast branching (like selecting a proper blitter) whereas late binding is sometimes slower.

08 - Namespace, the final frontier

Most demos are not written by one single coder and when it comes to linking the code from 2 or 3 guys who have their habits (their own types, helper routines and naming conventions), it's usually a big mess. Not to mention that this often happens at some obscure party place when nobody has slept for 48 hours and the demo compo deadline is scheduled in a couple of hours! :)

Fortunately, ANSI C++ has a great feature to sort all this out: namespaces. There is plenty of stuff to learn about namespaces and I'll only introduce the concept here.

Namespaces can be used basically for three things:

- to avoid symbol name clashes
- to define an interface to a module
- to express a logical grouping of classes, functions or data

The criteria used to include (or not) an 'object' within a namespace are very context-dependent.

You access the element 'bar' of the namespace 'Foo' using the scope resolution operator '::'. Example:

	  namespace Foo {
	  int bar;
     }

     void f()
     {
	  Foo::bar = 1;
     }

The global namespace is known as ::, so if you are writing a class with a method called time(), ::time() refers to the standard C library time() function.
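For instance, a hypothetical Stopwatch class with its own time() method could reach the C library function like this (the class is an assumption of this sketch, and <time.h> is used so that the global names are guaranteed to be declared):

```cpp
#include <cassert>
#include <time.h>

// Hypothetical class with a method that happens to be called time().
class Stopwatch {
  time_t started;

public:
  // Inside the class, a plain time() would find the member below,
  // so ::time is needed to reach the C library function.
  Stopwatch() : started(::time(0)) {}

  // Seconds elapsed since the stopwatch was created
  double time() const { return ::difftime(::time(0), started); }
};

void check_stopwatch()
{
  Stopwatch watch;
  assert(watch.time() >= 0.0);
}
```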

All the standardized C++ stuff (algorithms, containers, etc) is located in the "std" namespace. Nothing should be added to this namespace.

As a more complete example, here's some typical C source written by Colas:

	  /* colas.h */

     #ifdef __cplusplus
     extern "C" {
     #endif

     #include "mytypes.h"
     #include "crazyasmrasterizers.h"

     /* pointer to off-screen buffer */
     extern unsigned long *scr;

     /* typical name for a function written by Colas, cheerio buddy! :) */
     int lrpSpnUVZ(unsigned*,int**,int(*)(int,int),struct st*);
     char **hackDaWholeSystemRightNow(void);
     #ifndef __LINUX__
     void *copyStuffWithDaFPURegardlessPortability(void*, void*, int);
     #endif
     ...

     #ifdef __cplusplus
     }
     #endif

And here's my C++ source where I include Colas' C code:

	  // Wrap everything into a lovely namespace
     namespace Colas {
     #include "include/colas.h"
     }

     // No name clash here!
     Sprite_motion_script scr;

     class Old_skool_stuff {
       // Poor code here... :)
     };

If you want to avoid writing namespace_name:: all the time for some symbols which obviously don't collide, you may use the "using" statement (yet another new keyword) as follows:

	  using namespace namespace_name;   // import the whole namespace
     using namespace_name::symbol;     // import a single symbol

So if you have wrapped in a namespace all your inlined pixel manipulation routines, you'll import a function or a type like this:

	  using Pixel::Pixel_type;
     using Pixel::shift_bits;   // the name only, no signature

Ok, this was just an introduction to namespaces but it should be enough if what you want is just to insulate some parts of your code.

One thing you have to know is that any kind of #define'd symbols will pass through the namespace wrapping mechanism: that's another excellent reason to give up using macros for anything else than conditional compilation and double inclusion avoidance!
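The leak is easy to demonstrate. In this sketch, the namespace Colas_math, the macro CLAMP_MAX and the enumerator clamp_max are all hypothetical.

```cpp
#include <cassert>

namespace Colas_math {
  // A macro: the preprocessor knows nothing about namespaces
  #define CLAMP_MAX 255

  // An enumerator: properly scoped inside the namespace
  enum { clamp_max = 255 };

  inline int clamp(int v) { return v > clamp_max ? clamp_max : v; }
}

// The macro is visible here without any qualification...
inline int read_macro() { return CLAMP_MAX; }

// ...whereas the enumerator must be reached through its namespace
inline int read_enumerator() { return Colas_math::clamp_max; }

void check_macro_leak()
{
  assert(read_macro() == 255);
  assert(read_enumerator() == 255);
  assert(Colas_math::clamp(300) == 255);
}
```

Any other header defining its own CLAMP_MAX would clash with this one, namespace or not.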

09 - Exceptional use of exceptions

You all know exceptions as being a cool way to handle runtime errors. They allow you to put all your error handling logic within the same portion of code.

Btw, did you know that in C++ you may catch all exceptions using the following special statement ('...')?

	  try {
       // do something dangerous here
     } catch (Exception1& e1) {
       // handle errors
     } catch (Exception2& e2) {
       // handle more errors
     } catch (...) {
       // handle all the other errors
     }

Have you noticed that I caught a reference to the object that has been raised rather than catching the object itself? This detail is tremendously important when you're working with a hierarchy of exceptions. If you catch by value, the exception is copied into a fresh object of the type named in the catch statement. So whatever the real type of the object thrown (which might be a subclass of the type explicitly caught), everything specific to the subclass is sliced away, virtual function overrides included, and believe me, this is definitely not what you want! So don't forget that goddamn'd ampersand!
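The difference shows up in a few lines. Demo_error and Load_error are hypothetical exception classes made up for this sketch. Some compilers warn about the by-value catch, which is exactly the point.

```cpp
#include <cassert>
#include <string>

// A tiny hypothetical exception hierarchy
struct Demo_error {
  virtual ~Demo_error() {}
  virtual std::string name() const { return "Demo_error"; }
};

struct Load_error : Demo_error {
  virtual std::string name() const { return "Load_error"; }
};

// Catch by value: the handler gets a sliced copy of type Demo_error
inline std::string caught_by_value()
{
  std::string result;
  try {
    throw Load_error();
  } catch (Demo_error e) {
    result = e.name();
  }
  return result;
}

// Catch by reference: the handler sees the real Load_error object
inline std::string caught_by_reference()
{
  std::string result;
  try {
    throw Load_error();
  } catch (Demo_error& e) {
    result = e.name();
  }
  return result;
}

void check_catch()
{
  assert(caught_by_value() == "Demo_error");       // subclass part lost
  assert(caught_by_reference() == "Load_error");   // real type preserved
}
```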

People often look at exceptions when an error condition occurs but have you ever thought about using exceptions within your algorithms? Well, this is not really structured programming but exceptions might be useful to interrupt a sequence, especially when the interruption overhead is not negligible, like when the sequence is highly recursive. The trick here is based on the fact that you may "throw" any kind of object. The following example is taken from Bjarne Stroustrup's book (section 14.15):

	  // Helper function (btw, the tree is not a real binary search tree)
     void find_recursively(Tree* p, const string& s)
     {
       // Throw the object we've been looking for to the caller!
       if (s == p->str)
	 throw p;
       // Go left
       if (p->left)
	 find_recursively(p->left, s);
       // Go right
       if (p->right)
	 find_recursively(p->right, s);
     }

     // Main function
     Tree* find(Tree* p, const string& s)
     {
       try {
	 find_recursively(p, s);
       } catch (Tree* q) {
	 return q;
       }
       return 0;
     }

I love this trick too! Of course, the code doesn't look that obvious anymore, but it spares you from propagating the result back through every level of the recursion once the item has been found. Whether it is actually faster depends on how your compiler implements exceptions, so measure before relying on it!

I just saw some more crazy tricks with exceptions in some obfuscated code in C++ Report. A guy had written a recursive function with multiple levels of try { } catch () to "return" the result. But I admit it was more some sort of game than serious coding...

10 - Not so secure typecasts

Warning: this one is aimed at nostalgic C coders only. :)

Although C-like typecasting might be considered as "poor style" in C++, operator overloading mechanisms provide a strong way to perform secure dynamic type conversion.

Let's say you have a frame buffer abstraction class and that you have a method to allow access to a byte buffer representing the frame, this is usually achieved that way:

	  class Frame_buffer {

       unsigned char* pixels;

     public:
       // ...
       void* get_base() const { return pixels; };
     };

     void foo()
     {
       memcpy(raw_buffer, frame_buffer.get_base(), 1000);
     }

You may actually provide a cast operator as follows:

	  class Frame_buffer {

       unsigned char* pixels;

     public:

       // Invoke this method when cast to (void*) is required
       operator void* () { return pixels; };
     };

     void copy_some_memory(Frame_buffer& frame_buffer)
     {
       memcpy(raw_buffer, (void*)frame_buffer, 10000);
     }

When overloading the typecast operator, you may perform some extra checking or conversion on the data before you make it available to the user.

Although this technique allows you to add some control during the type conversion operation, it's not really safe when you use it. Take a look at the example: you can't tell at first sight that you're copying the content of the frame buffer and not some memory garbage you could find at the base address of your frame_buffer instance.

You may consider overloading typecasting operators for some really obvious reasons like with strings or tables wrapped into a class or to implement transparent proxies, smart pointers (see next section) or iterators.
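For the string case, a sketch might look like this. Demo_name and name_length() are hypothetical, and the fixed 32-byte buffer is only there to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size string wrapper
class Demo_name {
  char text[32];

public:
  explicit Demo_name(const char* s)
  {
    std::strncpy(text, s, sizeof(text) - 1);
    text[sizeof(text) - 1] = '\0';   // always terminated, even if truncated
  }

  // Conversion operator: the object can be used where a C string is expected
  operator const char* () const { return text; }
};

// The wrapper is handed straight to a C library function
inline std::size_t name_length(const Demo_name& name)
{
  return std::strlen(name);
}

void check_name()
{
  Demo_name group("aspirine");
  assert(name_length(group) == 8);
  assert(std::strcmp(group, "aspirine") == 0);
}
```

Here the conversion is obvious enough: a name really is a string, which is not quite the case for a frame buffer and a void pointer.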

11 - Resource control

One major drawback of C++ is the way memory is managed and especially the way memory is not managed at all! :) This is particularly annoying when you play around with dynamically allocated objects that you do not properly free. This inevitably leads to memory leaks.

It is theoretically possible to perform garbage collection by overloading the new operator but runtime efficiency might suffer from that (see early versions of Java) and it's a hell of a lot of hacking to achieve! :)

Fortunately, there's another clean and handy solution to the memory management problem. You may rely on destructors since they are invoked automatically for each object going out of the scope of its definition. This technique is known as "smart pointers". What we're gonna do is wrap pointers into a template as follows:

	  // No error checking in this example!
     template <typename T> class Pointer {

       T *p;

     protected:

       // Prevent pointer from being copied (declared, never defined)
       Pointer(const Pointer &ptr);
       Pointer &operator = (const Pointer &ptr);

     public:

       Pointer(T* ptr) : p(ptr) {}
       ~Pointer() { delete p; }

       // Provide transparent access to the object pointed to
       T& operator * () const { return *p; }
       T* operator -> () const { return p; }

       // You may add other operations like reference
       // counting (when copy is allowed) or whatever...
     };

     // Use the Pointer wrapper
     void foo()
     {
       Pointer<Heavy_3D_scene> scene(new Heavy_3D_scene);

       // dereference using the arrow operator
       scene->load_all_the_stuff( "fat-vectors.scn" );

       while ( !spectator->is_bored() ) {

	 scene->update_all_the_stuff(timer);
	 // This works too! =)

	 (*scene).render_all_the_stuff(screen);
	 screen->blit();
       }

       // We do not free the memory associated with the pointer,
       // it will be done automatically in the destructor of the
       // wrapper when the end of its definition scope is reached.
     }

Well, this quick-n-dirty implementation has of course some drawbacks since it doesn't work on array allocations (new[] needs delete[]) but that was not the point of this example. If you're searching for a more flexible template and if your compiler is ANSI C++ compliant (egcs/g++ will do the trick), you may use the auto_ptr template defined in the standard library header <memory>.

Smart pointers are pretty handy when using object factories, so the factory which performs the memory allocation doesn't have to worry about the object being properly destroyed.

What we are doing here with pointers can be done with any kind of resource, like files for instance. Just think about using wrappers with destructors to free the resources (close files, flush buffers, etc.) when you don't need them anymore.
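Applied to files, the same idea could be sketched like this. The File class and the first_byte() helper are hypothetical. The wrapper is built on the C stdio functions and refuses to be copied so the handle is closed exactly once.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical wrapper: the file is closed when the object goes out of scope
class File {
  std::FILE* handle;

  // Not copyable (declared, never defined): one owner per handle
  File(const File&);
  File& operator = (const File&);

public:
  File(const char* path, const char* mode) : handle(std::fopen(path, mode)) {}
  ~File() { if (handle) std::fclose(handle); }

  bool is_open() const { return handle != 0; }
  std::FILE* get() const { return handle; }
};

// Returns the first byte of the file, or -1 if it can't be opened or is
// empty (assuming EOF is -1, as on common platforms).
// No fclose() in sight: the destructor handles every return path.
inline int first_byte(const char* path)
{
  File f(path, "rb");
  if (!f.is_open())
    return -1;
  return std::fgetc(f.get());
}

void check_file()
{
  // A path that is assumed not to exist on the test machine
  assert(first_byte("/no/such/directory/demo.bin") == -1);
}
```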

For further information about acquire-on-init programming techniques, check out chapter 14 of Bjarne Stroustrup's book.

In C++, most resources we deal with are objects and most objects are manipulated using pointers. Most of the time, objects have other objects for attributes and thus they have to maintain a set of pointers.

What happens when one tries to copy an object containing pointers to other objects? The default behaviour of the copy constructor is to make a shallow copy of the data members. This means that both objects will point to the same member objects. Sometimes, you'll be obliged to write a couple of in-depth copy routines yourself to get your code working. This is boring and since what we do in demos is often just hacking and not industrial-strength software components, we may apply the following rule: "With C++, the easiest way to deal with copy is to prevent it." This can be achieved by "disabling" the copy constructor and the assignment operator as follows:

     class Noncopyable {

       // We just hide the copy operations by redeclaring them
       // as private member functions. They are declared but never
       // defined, so even an accidental copy made from inside the
       // class is caught at link time.
       Noncopyable(const Noncopyable& other);
       Noncopyable& operator = (const Noncopyable& other);

     public:

       Noncopyable() { ... }
       Noncopyable(/* any other constructor */) { ... }
       // ...

     };

I already hear you: "What? I have to type all this stuff nearly for each single class I write?" In fact, you can have the same effect with only one line of code. Let's just define the following base class:

     class Noncopyable {
       Noncopyable(const Noncopyable& other);
       Noncopyable& operator = (const Noncopyable& other);
     public:
       Noncopyable() {}
     };

And now, we may write:

	  class Copy_me_if_you_dare : public Noncopyable {
       // Some attributes...
     public:
	Copy_me_if_you_dare(/* A constructor */);
	Copy_me_if_you_dare(/* Another constructor ... */);
	// Some nice functions ...
     };

And that's it: it will be impossible to make a copy of any object derived from the Noncopyable base class. You won't be able to write:

	  Copy_me_if_you_dare x(y);	  // No way buddy
     Copy_me_if_you_dare x = y;      // Forget it either

     // Impossible, can't copy argument, * or & required
     void f(Copy_me_if_you_dare arg)
     {
     }

Noncopyable objects are not only useful to lazy coders. In some cases, you don't want the object to be copied because of its unique nature like a singleton for instance.
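If your compiler speaks C++11 (that's an assumption, nothing else in this article requires it), the same idea can be spelled with deleted functions, and the <type_traits> header lets the compiler itself confirm that copying is gone. A quick sketch:

```cpp
#include <type_traits>

// Same base class as above, written with C++11 deleted functions.
class Noncopyable {
protected:
  // Only meant to be used as a base class
  Noncopyable() {}
  ~Noncopyable() {}
public:
  Noncopyable(const Noncopyable&) = delete;
  Noncopyable& operator=(const Noncopyable&) = delete;
};

class Copy_me_if_you_dare : private Noncopyable {
public:
  Copy_me_if_you_dare() {}
};

// Checked at compile time: the derived class inherits the restriction.
static_assert(!std::is_copy_constructible<Copy_me_if_you_dare>::value,
              "copy construction is disabled");
static_assert(!std::is_copy_assignable<Copy_me_if_you_dare>::value,
              "assignment is disabled");
```

The nice side effect is that the error now shows up at compile time with a readable message instead of a complaint about a private member.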

12 - Who holds the reference?

This question might sound pretty silly, but you'll see that there is some neat brainstorming to perform about it... :)

When writing almost any Object Oriented application, you'll run into the following case:

	 ---------	  ---------
	| Class A |<>----| Class B |
	 ---------	  ---------

If you don't read OMT/UML (Ok, I know it's only ugly ASCII!), this graphical relation means that A is composed of zero, one or more instances of B. In theory it should also mean that an instance of B has one and only one container. But when you play with C++ (and thus with pointers), you'll often blow that holy concept into tiny little pieces. :)

Now, it's implementation time and the big question is where to store that goddamn' reference? There are only three possibilities:

- A stores a table or a list of references to the B's (1)
- B holds a reference to its container object (i.e. A) (2)
- you maintain a reference in both classes (3)

As usual, each of these solutions has both advantages and drawbacks:

(1) If you need to have a fast access to all the B's related to an A, this is the best solution, but it might be slower when you need to know whether a given instance of B is stored into an A due to the look-up overhead.

(2) If you need to find quickly whether B is linked to A, this is the fastest method. But if you have to perform an algorithm on all the B's linked to an A, it might be more time consuming and even impossible depending on the way all the B's are stored in your application.

(3) You may think that using both might allow you to gather the advantages of both methods but watch out because it might be pretty hard to keep a coherent set of references since you have to maintain the relation in several objects.
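To give an idea of the bookkeeping solution (3) requires, here's a small sketch where A is the only one allowed to touch both sides of the relation. The member names (parts_, owner_, add, remove) are invented for the example, and the destruction of A's and B's is deliberately left out:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

class A;

class B {
  A* owner_;          // (2) back-pointer to the container
  friend class A;     // only A may change the back-pointer
public:
  B() : owner_(0) {}
  A* owner() const { return owner_; }
};

class A {
  std::vector<B*> parts_;   // (1) list of the contained B's
public:
  // Both sides of the relation are updated in one single place.
  void add(B& b) {
    if (b.owner_ == this) return;        // already ours
    if (b.owner_) b.owner_->remove(b);   // one container at a time
    parts_.push_back(&b);
    b.owner_ = this;
  }

  void remove(B& b) {
    std::vector<B*>::iterator it =
      std::find(parts_.begin(), parts_.end(), &b);
    if (it == parts_.end()) return;      // not one of ours
    parts_.erase(it);
    b.owner_ = 0;
  }

  std::size_t size() const { return parts_.size(); }
};
```

As long as nobody else can write to owner_ or parts_, the two views of the relation can't drift apart.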

You have to take care of cross-references. Maybe A needs a reference to B to perform some operation and B also needs a reference to A to accomplish some task. Just check carefully where you need to put the operation or you will have to maintain a spiderweb of links between objects. Don't forget that to compile a source containing xrefs, you have to declare one of the classes before the other as follows:

	  // Declare Y before using this symbol in struct X
     struct Y;

     // Declare symbol X
     struct X {
       Y* y;
     };

     // I can use a reference to the symbol X here
     struct Y {
       X* x;
     };

When you can't deal with any more xrefs, it's maybe high time to use a mediator class (yet another pattern!). A mediator will hold the references to all the objects and organize the dialog between the classes. Instead of "registering" objects to the mediator, the mediator might implement an abstract factory (see Hugi 18) so it automatically creates its colleagues (that's how we call the objects interacting with the mediator) and registers them in some container. When the whole system is up and running, objects communicate with each other only through the mediator. Not to mention that each colleague holds a reference to its mediator. Of course, you may use several mediators in your application, each one taking care of one dedicated sub-system.

Here's a simple and quick example of use of the mediator pattern. Instead of writing this:

     void XXX::foo()
     {
       // we need a reference to a YYY object (here the member
       // pointer yyy) to perform the operation
       yyy->bar();
     }

We'll have something like this:

     void Mediator::bar()
     {
       // Mediator is the only class to have references to
       // all the other objects (colleagues) of the system
       yyy->bar();
     }

     // ...

     void XXX::foo()
     {
       // assuming there's one instance of YYY in the system
       mediator->bar();
     }

In fact, this mediator looks much more like some kind of multiobject adapter since it only dispatches the queries to the right objects. The mediator might in fact define some higher level operations that interact with the colleagues to provide a thinner (but smarter) interface to the requesters.
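For those who prefer something they can compile, here's a tiny self-contained sketch of a mediator with one higher level operation. Screen, Loader and the open()/loaded() functions are hypothetical names chosen for the illustration:

```cpp
#include <string>

class Mediator;

// Colleagues only know their mediator, never each other.
class Colleague {
protected:
  Mediator* mediator;
public:
  explicit Colleague(Mediator* m) : mediator(m) {}
};

class Screen : public Colleague {
  std::string shown;
public:
  explicit Screen(Mediator* m) : Colleague(m) {}
  void show(const std::string& text) { shown = text; }
  const std::string& content() const { return shown; }
};

class Loader : public Colleague {
public:
  explicit Loader(Mediator* m) : Colleague(m) {}
  void load(const std::string& name);   // defined after Mediator
};

class Mediator {
  // The mediator creates and holds all the colleagues
  Screen screen_;
  Loader loader_;
public:
  Mediator() : screen_(this), loader_(this) {}

  // Higher level operation offered to the outside world
  void open(const std::string& name) { loader_.load(name); }

  // Called back by the Loader: the mediator decides who is told
  void loaded(const std::string& name) { screen_.show("loaded " + name); }

  const Screen& screen() const { return screen_; }
};

// Loader reports to the mediator, it has no idea a Screen exists
void Loader::load(const std::string& name) { mediator->loaded(name); }
```

Note that Loader::load() has to be defined after the Mediator class, for the same cross-reference reasons as the X/Y structs above.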

13 - Creating objects

You all know that the way objects are created often influences their behaviour for their whole life cycle. That's why the creation of objects is such an important step in the design of your code. As an extension to my previous article about creational design patterns, I present here a few other pattern-like tricks.

Here's a problem that all C++ programmers have met one day or another: how the heck do you call a virtual function of a derived class from the constructor of its base? The answer is pretty simple: it's impossible. While the code of a constructor is executing, the object is not yet an instance of the derived class, so a virtual call made there is bound to the version of the class being constructed. There's a way to deal with this limitation: virtual constructors. Let's have a look at the following silly example:

     class Fruit {

       Fruit* f;

       // Remember section 11?
       Fruit(const Fruit& other);
       Fruit& operator = (const Fruit& other);

     protected:

       Fruit() : f(0) {}

     public:

       // Here's our virtual constructor
       Fruit(const string& which);

       // A polymorphic delete: the destructor must be virtual
       // so that deleting f destroys the real Apple or Banana
       virtual ~Fruit() { delete f; }

       // Dispatch the virtual operation
       virtual void say_hello() { f->say_hello(); }
     };

     // A sort of Fruit
     class Apple : public Fruit {

       friend class Fruit;
       // No one but Fruit's virtual constructor can create an Apple
       Apple() { }

     public:

       // Implements the virtual method
       void say_hello() { cout << "Hello, I'm an apple"; }

     };

     // Another sort of Fruit
     class Banana : public Fruit {

       friend class Fruit;
       // No one but Fruit's virtual constructor can create a Banana
       Banana() { };

     public:

       // Implements the virtual method
       void say_hello() { cout << "Hi, I'm a banana"; }

     };

     // It's tremendously important to write the code of this method
     // after the declaration of the classes for obvious cross reference
     // issues.
     Fruit::Fruit(const string& which)
     {
       if (which == "apple") {
         f = new Apple;
         // We can make any v-call in here!
         f->say_hello();
       }
       else if (which == "banana") {
         f = new Banana;
         // Apples have always been friendlier than bananas :)
       } else
         // runtime_error comes from <stdexcept>; the standard
         // exception class has no constructor taking a message
         throw runtime_error("don't know about this kind of fruit");
     }

Recall: the 'friend' relationship is not inherited otherwise it would be too easy to overcome the protection of members by subclassing friend classes!

There's a really big drawback to these virtual constructors: the superclass has to know about all its derived classes. This can be very annoying when you want to extend the system by adding new kinds of Fruit... Virtual constructors should only be used when you know that there will only be a limited number of subclasses.
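If you want to see the limitation that started this whole discussion with your own eyes, here's a small sketch. Base and Derived are throwaway names; the base constructor records which version of name() it actually reached:

```cpp
#include <string>

struct Base {
  std::string seen_in_ctor;

  // The virtual call below is bound to Base::name(): at this
  // point the Derived part of the object doesn't exist yet.
  Base() { seen_in_ctor = name(); }
  virtual ~Base() {}

  virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
  virtual std::string name() const { return "Derived"; }
};
```

Once the construction of a Derived object is over, name() returns "Derived" as expected, but seen_in_ctor still holds "Base".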

Another example a bit more pragmatic... I mean pragmatic for demomakers! :)

Let's say we've got an Image class which is an abstract public base of both Jpeg and Png classes. Now the question is how to create an object which has exactly the same type as a given instance. You can't directly use the copy constructor since you would have to know the real type of the object (Jpeg or Png). A clean way to do so is to provide a couple of "virtual constructors" which are implemented by all of the subclasses. Since C++ lets an overriding function return a pointer to a more derived class than the function it overrides (a covariant return type), you may change the return type in subclasses, thus providing strong typing of the returned object. Writing a clone() method in C++ is as simple as invoking the copy constructor on *this.

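     class Image {
     public:
       // ...
       // Objects will be deleted through an Image*, so the
       // destructor has to be virtual
       virtual ~Image() {}
       virtual Image* create() = 0;
       virtual Image* clone() = 0;
     };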

     class Jpeg : public Image {
     public:
       // ...
       Jpeg* create() { return new Jpeg; }
       Jpeg* clone() { return new Jpeg(*this); }
     };

     class Png : public Image {
     public:
       // ...
       Png* create() { return new Png; }
       Png* clone() { return new Png(*this); }
     };

Now we can write things like:

	  void f(Image* img)
     {
       // We want a copy of the object with the same type
       Image* twin = img->clone();

       // We want an "empty" object of the same concrete type as img.
       Image* same_type = img->create();
     }

Classes providing a clone() operation are often referred to as "Prototypes" from the name of the design pattern describing this feature.

You have to choose whether you want to implement a deep copy (i.e. duplicating objects referenced by pointers) or a shallow copy (the behaviour of the default copy constructor in C++).
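One way to get a deep copy almost for free is to store the data in members that already copy themselves in depth, like a std::vector. The following sketch assumes a Jpeg that keeps its pixels in such a vector; the pixel storage, format() and at() are inventions for the example:

```cpp
#include <cstddef>
#include <vector>

class Image {
public:
  virtual ~Image() {}
  virtual Image* clone() const = 0;
  virtual const char* format() const = 0;
};

class Jpeg : public Image {
  // A vector copies its elements, so the compiler-generated
  // copy constructor already performs a deep copy here.
  std::vector<unsigned char> pixels;
public:
  explicit Jpeg(std::size_t n) : pixels(n) {}

  // Covariant return type: Jpeg* instead of Image*
  Jpeg* clone() const { return new Jpeg(*this); }
  const char* format() const { return "jpeg"; }

  unsigned char& at(std::size_t i) { return pixels[i]; }
};
```

With a raw "unsigned char*" member instead of the vector, the default copy constructor would only duplicate the pointer and both images would share (and later delete) the same buffer.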

14 - Streams

The I/O mechanism provided by C++ is entirely based on the concept of stream. There are several sorts of streams, such as:

- character streams (cin, cout, cerr)
- file streams ([io]fstream)
- string streams ([io]strstream, [io]stringstream in ANSI C++)

I'm sure you're all familiar with this kind of statement:

	  cout << "Hello world, PI=" << 3.14159f << endl;
     string s;
     cin >> s;	// Remember that cin takes all whitespaces as delimiters!

But do you really know what happens when the compiler runs into such a line of code? The stream mechanism is based upon two key features of C++:

- operator overloading
- return by reference

You should already be familiar with operator overloading since it's usually the first thing demo coders look at when learning C++. :) The return by reference allows you to chain the calls to several methods by returning a self reference (i.e. return *this) at the end of these methods.

But what is that 'endl' thing? Is it an object? a constant? Nope, endl is a function. Such a function is called a manipulator and matches the following prototype:

	  // An output stream manipulator
     ostream& manipulator(ostream& os);

You may write your own manipulator too, check this out:

     // Add a horizontal tabulation character to the output stream
     ostream& tab(ostream& os) { return os << '\t'; }

     void f()
     {
       cout << "Hello" << tab << tab << tab << "out there! :)";
     }

You should all be wondering how the hell the compiler invokes this function since the function call operator () has not been used. This is where the coolest hack comes in: ladies and gentlemen, let me introduce you to the concept of the "applicator". An applicator is nothing more than a case of operator overloading.

Let's see how it works by writing our own stream-like class:

	  class My_stream;
     typedef My_stream& (*My_manip)(My_stream&);

     class My_stream {
     public:
       // ...
       // Overload redirection operator for some data types
       My_stream& operator << (const string& s) { ... }
       My_stream& operator << (const int& i) { ... }
       // ...
       // Here's the applicator
       My_stream& operator << (My_manip manip) { return manip(*this); }
     };

The technique used in the applicator is called "double dispatching" and is used in a design pattern called "Visitor".

This trick proved to be very useful when I had to write a C++ wrapper for the 4.3+BSD system log facility, which is a typical C API whose main function is one of these printf-like functions. I implemented a Log class as a stream and defined a set of operators for some common types and a new "endl" manipulator to write the current entry to the log. This is a very elegant solution when you have to cope with C style functions having a variable number of arguments. Using endl as a line separator allows you to perform extra operations at the end of the line, like flushing a buffer (that's actually what the default endl does), resizing an array or updating some statistics.
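The original Log class isn't reproduced here, but the idea can be sketched in a few lines. In this illustration the entries are simply pushed into a vector where the real thing would call the system log function; the names Log, endlog, commit() and all() are assumptions made for the example:

```cpp
#include <sstream>
#include <string>
#include <vector>

class Log {
  std::ostringstream line;            // the entry being built
  std::vector<std::string> entries;   // stand-in for the real log
public:
  typedef Log& (*Manip)(Log&);

  Log& operator<<(const std::string& s) { line << s; return *this; }
  Log& operator<<(int i) { line << i; return *this; }

  // The applicator: "log << f" simply calls f(log)
  Log& operator<<(Manip m) { return m(*this); }

  // Write the current entry to the log and start a new one
  void commit() { entries.push_back(line.str()); line.str(""); }

  const std::vector<std::string>& all() const { return entries; }
};

// Our own end-of-entry manipulator
Log& endlog(Log& log) { log.commit(); return log; }
```

It could be used like this: log << "width=" << 640 << endlog; which stores the single entry "width=640", without a format string or a variable argument list in sight.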

15 - Wrapping C++ with C

It's common practice to wrap C APIs in C++ classes to use them in object oriented applications. But how do you use C++ code with a C compiler? The answer is rather simple: you'll have to wrap your classes with a set of functions.

Let's say you've the following very simple class:

	  // Pic.h, -*- C++ -*-
     class Pic {
       // ...
     public:
       Pic();
       ~Pic();
       int width() const;
       int height() const;
       void load(const char* filename);
     };

There are several points to take into account when designing the wrapper:

- How will the instance be represented in C?
- What sort of operation will be allowed?
- How will the data returned by the object be retrieved?

To represent the instance, we will use a handle (which is the address of the instance typecasted as a void pointer).

To enable the C compiler to link functions compiled by a C++ compiler, the exported C++ functions must be declared within 'extern "C" { ... }'. The reason is tied to the way C++ supports function overloading: a function is not only identified by its name but also by the number and the types of its arguments. Thus, the C++ compiler encodes this information in the names it stores in its symbol table (this is known as name mangling). All the functions declared in the scope of the extern statement will be stored in the symbol table without argument type tags.

Sometimes, it might be interesting to group some data together within a structure to avoid performing too many (often expensive) calls to the wrappers since in C, it's not possible to retrieve this data using inline functions.

Let's design the interface of our C wrapper for the aforementioned Pic class:

     /* Pic_wrap.h, this one will be used by C and C++ programs */

     typedef struct {
       int width;
       int height;
     } Pic_info;

     typedef void *Pic_handle;

     /* The '__cplusplus' symbol is defined by the C++ compiler */
     #ifdef __cplusplus
     extern "C" {
     #endif

     /* Constructor */
     Pic_handle Pic_new(void);
     /* Destructor */
     void Pic_delete(Pic_handle pic);
     /* Get width and height of pic. A function can't share its */
     /* name with the Pic_info type, hence Pic_get_info.        */
     void Pic_get_info(Pic_handle pic, Pic_info* info);
     /* Load a picture from a file */
     void Pic_load(Pic_handle pic, const char* filename);

     #ifdef __cplusplus
     }
     #endif

Now that we've got an interface, it's time to write down some code:

     // Pic_wrap.cc, this is still C++ since we will use objects

     // Note that this is the only place where we include the C++ header
     #include "Pic.h"
     // The wrapper header gives the functions below their C linkage
     #include "Pic_wrap.h"

     Pic_handle Pic_new()
     {
       Pic* pic = new Pic;
       return (Pic_handle)pic;
     }

     void Pic_delete(Pic_handle pic)
     {
       Pic* p = (Pic*)pic;
       delete p;
     }

     void Pic_get_info(Pic_handle pic, Pic_info* info)
     {
       Pic* p = (Pic*)pic;
       info->width = p->width();
       info->height = p->height();
     }

     void Pic_load(Pic_handle pic, const char* filename)
     {
       Pic* p = (Pic*)pic;
       p->load(filename);
     }

If you are already very very very familiar with C++, you might wonder why I am not using the 'dynamic_cast' operator to downcast the pointer from void* to Pic*. The reason is that dynamic_cast can't start from a void*: it only works on pointers to polymorphic classes, since it relies on the Run-Time Type Identification (RTTI) data attached to such objects. A plain cast (static_cast<Pic*> if you prefer the new-style notation) is all we can do here, so it's up to the C code to hand back only handles obtained from Pic_new().

Using this library from a C piece of code is straightforward but since I'm in a pretty good mood today, here's an example:

     #include <stdio.h>
     #include "Pic_wrap.h"

     int main(int argc, char **argv)
     {
       Pic_info info;
       Pic_handle pic;
       pic = Pic_new();
       Pic_load(pic, "brioche_nude_by_the_pool.pic");
       Pic_get_info(pic, &info);
       printf( "width=%d, height=%d\n", info.width, info.height);
       Pic_delete(pic);
       return 0;
     }

Although it might sound weird to wrap C++ in a C library, this is sometimes mandatory when you're obliged to use a C interface, like when developing a plug-in for another application or when someone is allergic to C++ and absolutely wants to stick to that good old ANSI C.
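A possible variation, not used in the wrapper above: instead of a void pointer, the handle can be a pointer to a structure that is declared but never defined. The C compiler will then refuse to mix up handles of different kinds. Here's a single-file sketch of that idea; Pic_tag, Pic_width() and the dummy Pic class returning 320 are all made up for the illustration:

```cpp
/* --- what would live in the header seen by C and C++ --- */
#ifdef __cplusplus
extern "C" {
#endif

/* Opaque handle: the C side never sees the layout of the object */
typedef struct Pic_tag* Pic_handle;

Pic_handle Pic_new(void);
void Pic_delete(Pic_handle pic);
int Pic_width(Pic_handle pic);

#ifdef __cplusplus
}
#endif

// --- what would live in the C++ implementation file ---

// Stand-in for the real class
class Pic {
public:
  int width() const { return 320; }
};

// Pic_tag is never defined, it only gives the handle a distinct
// type, so we have to cast back and forth explicitly.
static Pic* unwrap(Pic_handle h) { return reinterpret_cast<Pic*>(h); }

extern "C" Pic_handle Pic_new(void)
{
  return reinterpret_cast<Pic_handle>(new Pic);
}

extern "C" void Pic_delete(Pic_handle pic) { delete unwrap(pic); }

extern "C" int Pic_width(Pic_handle pic) { return unwrap(pic)->width(); }
```

With this header, passing a handle of some other wrapped class to Pic_width() is a type mismatch the C compiler can complain about, whereas any void pointer would be silently accepted.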

16 - Elegant multiple inheritance

Everyone knows that C++ allows you to inherit from more than one class. I'm sure you're already familiar with such a thing:

     // Prerequisite: We assume that there is at least
     // one Fruit which is not eatable! :)

     class Banana : public Fruit, public Eatable, public Peelable {

       // ...

     public:

       // ...

       // Inherited from Fruit
       virtual const string& name() const;
       virtual const Shape& shape() const;

       // Inherited from Eatable
       virtual void eat();

       // Inherited from Peelable
       virtual void peel_off();

       // Some method useful to a Banana class...
       void put_the_peel_on_the_floor();
     };

A Banana can be seen as a Fruit, as a more generic Eatable object, or as an object covered with a peel. These interfaces are rather independent: which one matters depends on the context. Someone who's starving only wants to know that an object of the type Banana is Eatable; a peeling machine only wants to know that a Banana is Peelable; a drawing program just wants to know about the Shape of the object and, last but not least, a funny boy just has to know that when he has a Banana, he can have fun with it by laughing at some people passing by... :)

When designing a class which will implement a complex behaviour, it's usually a good idea to think about interfaces - and how to group or to split them - before starting to code.

But how do you inherit more than once from the same base? That's a rather interesting question... First, let's wonder why we would want to do so. A nice example is the Observer pattern (see Hugi 18) or any other kind of event notification framework. An object might be listening to more than one event producer. It's pretty common to run into a class like this one:

     // If Listener1 and Listener2 have a common ancestor we may
     // have to use virtual inheritance by adding 'virtual' next to
     // the 'public' keyword.

     class Some_listener : public Listener1, public Listener2 {

       // ...

     public:

       // ...

       // Implements interface defined in Listener1
       virtual void handle_event1(Producer1& p);

       // Implements interface defined in Listener2
       virtual void handle_event2(Producer2& p);

     };

This is a bad solution because you will have to deal with a different interface for each event you want to monitor... Each Listener has to give a different name to its handle_event function to avoid the (ab)use of the scope resolution operator. In other words: this is odd.

That's where nested classes come in! A nested class is just a class defined within the scope of another class. You can create a couple of nested classes which will be friends of their nesting class and use these specialized classes as event listeners:

     class Some_listener {

       // ...

     public:

       // ...

       class First_listener : public Listener1 {
       public:
         virtual void handle_event(Producer1& p);
       };

       class Second_listener : public Listener2 {
       public:
         virtual void handle_event(Producer2& p);
       };

       // By default, nested classes can't access private members!
       // Yet another area where C++ differs from Java...
       // The friend declarations come after the nested classes so
       // that they name them and not some classes declared at
       // namespace scope.

       friend class First_listener;
       friend class Second_listener;

     };

The handle_event() methods of the "inner-listeners" will just update the internal state of the nesting class and that's it!

Nested classes allow us to inherit more than once from the same base: in the previous example, Listener2 might have been replaced by Listener1 without any trouble.
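Here's a compilable sketch of the whole trick, with two nested classes deriving from the very same base. The Listener interface, the Mixer class and its members are hypothetical; each inner listener keeps a reference to its owner in order to update it:

```cpp
// Hypothetical listener interface, standing in for Listener1
class Listener {
public:
  virtual ~Listener() {}
  virtual void handle_event(int value) = 0;
};

class Mixer {
  int volume;
  int balance;

public:
  // Each nested class is a separate Listener forwarding to its owner
  class Volume_listener : public Listener {
    Mixer& owner;
  public:
    explicit Volume_listener(Mixer& m) : owner(m) {}
    virtual void handle_event(int value) { owner.volume = value; }
  };

  class Balance_listener : public Listener {
    Mixer& owner;
  public:
    explicit Balance_listener(Mixer& m) : owner(m) {}
    virtual void handle_event(int value) { owner.balance = value; }
  };

  // Needed by older compilers to let the inner classes reach
  // the private members of Mixer
  friend class Volume_listener;
  friend class Balance_listener;

  // The two listener objects an event producer can be given
  Volume_listener on_volume;
  Balance_listener on_balance;

  Mixer() : volume(0), balance(0), on_volume(*this), on_balance(*this) {}

  int get_volume() const { return volume; }
  int get_balance() const { return balance; }
};
```

A producer only ever sees a Listener*, so it can be handed &mixer.on_volume or &mixer.on_balance without knowing that both end up in the same Mixer.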

17 - Destruction

When moving from C, one of the features people usually appreciate in C++ is the fact that you can declare data anywhere. You're not obliged anymore to declare all the data at the beginning of an instruction block, before any code is executed, like in C; otherwise the object construction mechanism couldn't even exist!

Instantiating objects everywhere is fun but memory wasting since there is no garbage collection in C++! Do not forget that when you write:

	  Point p;
     Point center(320, 240);
     Rect area(0, 0, 640, 480);
     Matrix<double> buffer(8192, 8192);

All these objects are allocated on the execution stack and it's not unusual to run out of stack memory - and thus to experience a nice core dump for some and a blue screen for the others :) - when your program is too hungry!

The ideal solution would be to dynamically allocate the big objects on the heap using the new operator. But we've previously seen that pointers are bug prone when it comes to deleting them (or forgetting to do so).

Another solution is to clearly set the definition scope of your objects. Lots of objects are only required for temporary operations, so why not define them in their own scope so you're sure they will be destroyed when you don't need them anymore? The rule is simple in C++: any object allocated on the stack will be destroyed (its destructor will be invoked) when it goes out of its definition scope.

Example:

	  void a_pretty_memory_wasting_one()
     {
	// Create some objects on the stack
	Matrix<double> m(4, 4);
	Vector<double> v(4);

	// Define a scope where we will need some memory
	{
	   unsigned char buff[8192];
	   fill_buffer(buff);
	   do_someting_with_the_buffer(buff);
	   // 8k buffer will be deallocated here...
	}

	// Another buffer needed
	unsigned char memory_pool[4096];
	dive_in(memory_pool);
	swim_trough(memory_pool);

	// And so on...
     }
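If you want to check when the destructors are really invoked, a small tracing class does the job. In this sketch, Tracer, trace and scoped() are names invented for the experiment:

```cpp
#include <string>

// Records the order of constructions and destructions
std::string trace;

struct Tracer {
  char id;
  explicit Tracer(char c) : id(c) { trace += id; }
  ~Tracer() { trace += '~'; trace += id; }
};

void scoped()
{
  Tracer a('a');
  {
    Tracer b('b');
  }                 // b is destroyed here, before c even exists
  Tracer c('c');
}                   // then c and a, in reverse order of construction
```

After a call to scoped(), trace contains "ab~bc~c~a": the object living in the inner scope is gone before the rest of the function goes on.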

I hope you'll keep this in mind when your program crashes after adding just-one-more-object-in-main(). :)

18 - Philosophical Corner

Mmmmm, let's end this article with a bit of philosophy...

When you declare a piece of data with an expression containing * or &, where do you place that special character? Like this?

	  int *p1;

Or like that?

	  int* p2;

Although both statements have the same meaning for the compiler, one might interpret them differently. The first notation focuses on the syntax (p1 is a pointer to something which is of type int) while the second puts the stress on the type (p2 is an object whose type is "pointer to an integer"). The first syntax is more C-like while the second is the favorite of most C++ enthusiasts, so now it's time to choose your own! Imho, the second syntax is somewhat more coherent since it binds everything to the type of the object: "p1" is obviously different from "*p1", so the same should hold between "int" and "int*".
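Whatever your choice is, keep in mind that the compiler attaches the * to the name and not to the type when several names are declared at once. The following sketch (which assumes a C++11 compiler for decltype and static_assert) lets the compiler state it:

```cpp
#include <type_traits>

int* p, q;   // p is a pointer to int, but q is a plain int!

// Checked at compile time
static_assert(std::is_same<decltype(p), int*>::value, "p is a pointer");
static_assert(std::is_same<decltype(q), int>::value, "q is not");
```

Declaring one name per statement makes the question irrelevant.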

Btw, there has been a similar discussion with Java about the array built-in type to tell the difference between "int[] a" and "int a[]". This is what people usually call brain masturbation. :)

The complete story about this thing can be found in Bjarne Stroustrup's FAQ that you may read on his homepage.

19 - References and Resources

Several people asked me for more pointers to resources about OOP and C++, so here they are...

If you want to learn more about C++, here are a couple of books you should already have bought from your favorite bookstore:

	  The C++ Programming Language, 3rd Edition.
     Bjarne Stroustrup,
     Addison-Wesley.

     Thinking in C++.
     Bruce Eckel,
     Prentice-Hall.

The draft of the 2nd edition of Bruce Eckel's Thinking in C++ is freely available online as a Rich Text Format document. The book is about 1350 pages of pure C++, a lot of topics are covered from STL hacking to design pattern implementation. You can download that book or read the HTML version online. Check also this URL: http://www.codeguru.com/cpp/tic

A nice article for a quick look at some of the new features introduced in the latest draft (at this time) of the ANSI/ISO C++ committee:

	  Dr Dobb's Journal (sep 1998), Al Stevens' C Programming Column,
     "Then Next Great Migration: from C++ to Standard C++", p.105

In love with design patterns? Check out http://hillside.net/patterns.

If you're searching for a place with hundreds of articles, book reviews and links about C++, you should rush to the website of the "Association of C and C++ Users"!

A nice place with lots of portable class libraries: http://www.boost.org. The boost mailing-list is also a nice place to meet experienced C++ coders.

Last but not least, a coding area will soon be available on our website. There will be copies of the articles, source code and lots of links.

Happy reading!

20 - Closing Words and Acknowledgements

That's it, you've reached the end of this C++ Tips-n-Tricks article! I hope you liked it, as usual feel free to send me some feedback!

This article is money-back guaranteed so if absolutely none of the topics discussed here were useful to you, you may contact our Customer Service Hot Line. Get in touch with our friendly operators, dial 1-800-ASPI-RINE now!

And some special hello to all the people who gave me feedback about my previous articles like Chris, Piotr, Allergy, ...

Big respect to Fuzzion for making very cool productions.

If you want to see a special topic appearing in the next issue of Hugi, just tell me and I'll try to fix something! There are a lot of interesting topics we haven't covered yet, like RTTI or the STL for instance.

Regards!

brioche of aspirine - a bunch of fresh d-zign fanatics

RCS - $Id: cpp_tnt.txt,v 1.15 2000/04/22 09:35:12 alcibiade Exp alcibiade $