June | 2007 | adam's Xanga Site

So you have just created a sweet little template function that will save the world from global warming. Quite an amazing feat! You have decided declare the function like so:

template <typename T>
void SaveTheWorld(T moreOzone) { … }

The problem is that the function has to be called many times, with many different template parameter types for it to be effective, so you need to make sure it is as efficient as possible. Since some of the template parameter types might be large objects, passing by value may not be a good idea. You decide to redefine SaveTheWorld like so:

template <typename T>
void SaveTheWorld(const T& moreOzone) { … }

It turns out that this is going to perform a lot better than the previous definition, but for primitive types like ints, and chars, passing by value turns out to be a little quicker, and our secret ozone object needs a lot of those little guys. Wouldn’t it be nice if primitives could use the top one, and class types defined by the program use the bottom one? Well with some creative use of C++ templates, it turns out we can. The following ideas were taken from the book “C++ Templates, The Complete Guide” by David Vandevoorde and Nicolai M. Josuttis. I recommend this read for anyone who wants a serious understanding of C++ Templates.

The first thing we need is a template that takes as template parameters a boolean value and two types. It will contain a typedef called BestType, which will be of type T1 if the boolean is true, and T2 if the boolean is false. To do this we will also need to utilize partial specialization.

    // primary template
    template <bool B, typename T1, typename T2>
    class TypeChooser;

    // partial specialization for true
    template<typename T1, typename T2>
    class TypeChooser<true, T1, T2> {
    public:
       typedef T1 BestType;
    };

    // partial specialization for false
    template<typename T1, typename T2>
    class TypeChooser<false, T1, T2> {
    public:
       typedef T2 BestType;
    };

With these definitions, TypeChooser<true, int, char>::BestType will be type int, and TypeChooser<false, int, char>::BestType will be type char.

Now we need a class template that will tell us whether a type is a class or not. Here is the code for that:

    template <typename T>
    class IsClass {
    private:
       typedef char One;
       typedef struct { char a[2]; } Two;
       template<typename C> static One test(int (C::*pmt));
       template<typename C> static Two test(…);
    public:
       enum { Yes = sizeof(IsClass<T>::test<T>(0)) == 1 };
       enum { No = !Yes };
    };

This class might be a little confusing, so I’ll explain it here. If you declare IsClass with an integer type like IsClass<int>, then IsClass<int>::Yes is going to be false. The reason is because the declaration test(int (int::*pmt)) would be illegal, so the compiler will choose the test(…) function when it comes across IsClass<int>::test<int>(0), and return Two, whose size is 2. However, test(int (MyClass::*pmt)) would be legal, with int (MyClass::*pmt) being of type pointer to member function of MyClass. The following demonstrates the usage of this class.

    if(IsClass<int>::No) {
       cout << “NOT A CLASS!” << endl;
    }
    if(IsClass<MyClass>::Yes) {
       cout << “WE HAVE A CLASS!” << endl;
    }

Finally, we’ll create a template class called ParamTChooser who will choose the correct parameter type for our SaveTheWorld function.

    template <typename T>
    class ParamTChooser {
    public:
       typedef typename
            TypeChooser<IsClass<T>::No, T, const T&>::BestType Type;
    };

So ParamTChooser<T>::Type will be of type T if T is a primitive, and of type const T& if T is not a primitive. Finally, we redeclare SaveTheWorld as follows:

template <typename T>
void SaveTheWorld(typename ParamTChooser<T>::Type moreOzone);

Now when we do SaveTheWorld<int>(myInt), it will pass myInt by value, and when we do SaveTheWorld<SomeClass>(myClass), it will pass myClass by const reference.

The other day I was having a discussion with a fellow co-worker here at xanga (http://www.xanga.com/michael), and our discussion was about the C++ Boost Lambda Library (http://www.boost.org/doc/html/lambda.html). Here is a little background on this library for those of you that aren’t familiar with it.

The Lambda library implements lambda abstractions for C++. Basically unnamed or anonymous functions. It’s a sweet little library, because instead of having to write code like:

    template<typename T> struct output {
       void operator() (const T& x) const { std::cout << x << ‘ ‘; }
    };
    …
    for_each(v.begin(), v.end(), output);

to write each element in vector v to standard output, you could use an anonymous function instead of the function object. It would look like so:

for_each(v.begin(), v.end(), std::cout << _1 << ‘ ‘);

This is really quite cool, and you don’t have to pollute your code with all those function objects that the std algorithms need.

So back to my discussion with Mikey. He was confused as to how C++ could allow such coding constructs without generating a compiler error. How, without modifying the compiler, could this be legal. The answer: C++ templates. Most people don’t realize the power of C++ templates. Did any of you know that with the technique of Template metaprogramming, you can write code that actually gets executed by the compiler at compile time? You could have the compiler compute the square root of a number at compile time. Template metaprogramming is generally even Turing-complete!

I’m not going to give Template metapogramming much discussion here, though I’ll be discussing something similar. That is I’ll be discussing Expression Templates, the technique that makes the above code snippet legal. I’ll be using a different code snippet as an example though. Much of the following idea is taken from the book “C++ Templates, the Complete Guide” by David Vandevoorde and Nicolai M. Josuttis. I recommend this read for anyone serious about C++ programming. Say you have the following code:

SArray<double> x(1000), y(1000);
x = x*x + x*y;

So here, SArray is some generic simple array type, and you want to perform calculations on every element in each array. You could override the multiplication operator and addition operators that take 2 of these SArray types, or an SArray type and a scalar, and do the appropriate calculation on them. You could imagine the loops being like so.

    SArray<T> result(a.size());
    for(size_t k = 0; k < a.size(); ++k) {
       result[k] = a[k] + b[k];
    }
    return result;

For the thousand element arrays, there are 2000 reads in this loop, and 1000 writes. Also, a temporary object of size 1000 is created and returned from the function. So with 3 calculations, two multiplications and one addition, there will be 6000 reads and 4000 writes (including the final initialization). If we had one ideal loop, like the following:

for(int idx=0; idx<x.size; ++idx)
x[idx] = x*x[idx] + x[idx]*y[idx];

We would not need any temporary storage, and there are only 2000 memory reads, and 1000 memory writes. But this code couldn’t be re-used very well. If we wanted to change it to be x + x + y, then we would need to write another loop. This is where expression templates come into play. The way they work is you encode the actual expression in template arguments. So x*x would be encoded in an object that represents each value of x multiplied by x, and x*y would be encoded in an object that represents x multipled by y. Then finally another object would represent the first object added to the second object. Here are 2 classes that will help us encode the multiplication and addition operations.

    template <typename T, typename OP1, typename OP2>
    class A_Mult {
    private:
       T const& op1;
       T const& op2;
    public:
       A_Mult(OP1 const& a, OP2 const& b) : op1(a), op2(b) {
       }

       // compute product when value requested
       T operator[] (size_t idx) const {
          return op1[idx] * op2[idx];
       }
    };

    template <typename T, typename OP1, typename OP2>
    class A_Add {
    private:
       T const& op1;
       T const& op2;
    public:
       A_Add(OP1 const& a, OP2 const& b) : op1(a), op2(b) {
       }

       // compute product when value requested
       T operator[] (size_t idx) const {
          return op1[idx] + op2[idx];
       }
    };

With these expression templates, we now need an array class that contains the actual storage, and knows about these expression templates. Here is such an array class.

    template <typename T, typename Rep = SArray<T> >
    class Array {
    private:
       Rep expr_rep;
    public:
       // put constructors here

       Array& operator= (Array const& b) {
          for(size_t idx=0; idx<b.size(); ++idx) {
             expr_rep[idx] = b[idx];
          }
          return *this;
       }

       template<typename T2, typename Rep2>
       Array& operator= (Array<T2, Rep2> const& b) {
          for(size_t idx=0; idx<b.size(); ++idx) {
             expr_rep[idx] = b[idx];
          }
          return *this;
       }

       T operator[] (size_t idx) const {
          return expr_rep[idx];
       }
       T& operator[] (size_t idx) {
          return expr_rep[idx];
       }

       Rep const& rep() const {
          return expr_rep;
       }
       Rep& rep() {
          return expr_rep;
       }
    };

And finally I’ll overload the Addition and Multiplication operators. I’ll explain how everything works together after I write these out.

    //addtion of two Arrays
    template <typename T, typename R1, typename R2>
    Array<T, A_Add<T, R1, R2> >
    operator+ (Array<T, R1> const& a, Array<T, R2> const& b) {
       return Array<T, A_Add<T, R1, R2> >(A_Add<T, R1, R2>(a.rep(),b.rep()));
    }

    // multiplication of two Arrays
    template <typename T, typename R1, typename R2>
    Array<T, A_Mult<T, R1, R2> >
    operator* (Array<T, R1> const& a, Array<T, R2> const& b) {
       return Array<T, A_Mult<T, R1, R2> >(A_Mult<T, R1, R2>(a.rep(), b.rep()));
    }

Whew, thats some funky looking code! Now to do some explaining. Lets first start with the code snippet that will do the calculation.

Array<double> x(1000), y(1000), z(1000);
x=x*y + x*z;

When we encounter x*y in the code, the multiplication operator is encounted. This returns another Array object (keep in mind x is an Array object, and y is an Array object). The difference is that instead of the expr_rep member variable being of type SArray (the actual storage), it is of type:

    A_Mult<double, SArray<double>, SArray<double> >

x*z returns an Array object that contains the same type for its expr_rep member. Finally, those two returned Array types added together, invoking the addition operator, and returning an Array type, whose expr_rep member is of type:

     A_Add<double,
                   A_Mult<double, SArray<double>, SArray<double> >,
                   A_Mult<double, SArray<double>, SArray<double> > >

Now keep in mind, all of this is done at compile time by the C++ compiler! The compiler turns x*y + x*z into an object of type:

    Array<double, A_Add<double,
                   A_Mult<double, SArray<double>, SArray<double> >,
                   A_Mult<double, SArray<double>, SArray<double> > > >

When you set x equal the the object of this type (x = x*y + x*z), the assignment operator is invoked on that Array type. I’ll repeat the code here, so you don’t have to scroll up to the Array class and look for it:

So this code calls the index operator on Array, which in turn calls the index operator on that A_Add object we created above. The A_Add index operator. If you trace through the index operator of A_Add, you will find that for each given subscript idx, it will comput (x[idx] * y[idx]) + (x[idx]*z[idx]), which is what our intial aim was to do!

Just to put this all into perspective, this code recursively instantiates templates that, in the end, represent an entire expression. You can imagine the lambda library doing the same thing. In the example above for lambda, the << operator was overloaded, and the expression cout << _1 << ” ” was converted into an an object that was a function object. This function object gets executed by the for_each loop, and the rest is history.

adam's Xanga Site

Month: June 2007

Another C++ template programming goodie

The beauty of generic programming

adam