June 12, 2007

  • The beauty of generic programming

    The other day I was having a discussion with a fellow co-worker here at xanga (http://www.xanga.com/michael), and our discussion was about the C++ Boost Lambda Library (http://www.boost.org/doc/html/lambda.html). Here is a little background on this library for those of you that aren’t familiar with it.

    The Lambda library implements lambda abstractions for C++. Basically unnamed or anonymous functions. It’s a sweet little library, because instead of having to write code like:

        template<typename T> struct output {
           void operator() (const T& x) const { std::cout << x << ‘ ‘; }
        };
        …
        for_each(v.begin(), v.end(), output);

    to write each element in vector v to standard output, you could use an anonymous function instead of the function object. It would look like so:

        for_each(v.begin(), v.end(), std::cout << _1 << ‘ ‘);

    This is really quite cool, and you don’t have to pollute your code with all those function objects that the std algorithms need.

    So back to my discussion with Mikey. He was confused as to how C++ could allow such coding constructs without generating a compiler error. How, without modifying the compiler, could this be legal. The answer: C++ templates. Most people don’t realize the power of C++ templates. Did any of you know that with the technique of Template metaprogramming, you can write code that actually gets executed by the compiler at compile time? You could have the compiler compute the square root of a number at compile time. Template metaprogramming is generally even Turing-complete!

    I’m not going to give Template metapogramming much discussion here, though I’ll be discussing something similar. That is I’ll be discussing Expression Templates, the technique that makes the above code snippet legal. I’ll be using a different code snippet as an example though. Much of the following idea is taken from the book “C++ Templates, the Complete Guide” by David Vandevoorde and Nicolai M. Josuttis. I recommend this read for anyone serious about C++ programming. Say you have the following code:

        SArray<double> x(1000), y(1000);
        x = x*x + x*y;

    So here, SArray is some generic simple array type, and you want to perform calculations on every element in each array. You could override the multiplication operator and addition operators that take 2 of these SArray types, or an SArray type and a scalar, and do the appropriate calculation on them. You could imagine the loops being like so.

        SArray<T> result(a.size());
        for(size_t k = 0; k < a.size(); ++k) {
           result[k] = a[k] + b[k];
        }
        return result;

    For the thousand element arrays, there are 2000 reads in this loop, and 1000 writes. Also, a temporary object of size 1000 is created and returned from the function. So with 3 calculations, two multiplications and one addition, there will be 6000 reads and 4000 writes (including the final initialization). If we had one ideal loop, like the following:

        for(int idx=0; idx<x.size; ++idx)
           x[idx] = x*x[idx] + x[idx]*y[idx];

    We would not need any temporary storage, and there are only 2000 memory reads, and 1000 memory writes. But this code couldn’t be re-used very well. If we wanted to change it to be x + x + y, then we would need to write another loop. This is where expression templates come into play. The way they work is you encode the actual expression in template arguments. So x*x would be encoded in an object that represents each value of x multiplied by x, and x*y would be encoded in an object that represents x multipled by y. Then finally another object would represent the first object added to the second object. Here are 2 classes that will help us encode the multiplication and addition operations.

        template <typename T, typename OP1, typename OP2>
        class A_Mult {
        private:
           T const& op1;
           T const& op2;
        public:
           A_Mult(OP1 const& a, OP2 const& b) : op1(a), op2(b) {
           }

           // compute product when value requested
           T operator[] (size_t idx) const {
              return op1[idx] * op2[idx];
           }
        };

        template <typename T, typename OP1, typename OP2>
        class A_Add {
        private:
           T const& op1;
           T const& op2;
        public:
           A_Add(OP1 const& a, OP2 const& b) : op1(a), op2(b) {
           }

           // compute product when value requested
           T operator[] (size_t idx) const {
              return op1[idx] + op2[idx];
           }
        };

    With these expression templates, we now need an array class that contains the actual storage, and knows about these expression templates. Here is such an array class.

        template <typename T, typename Rep = SArray<T> >
        class Array {
        private:
           Rep expr_rep;
        public:
           // put constructors here

           Array& operator= (Array const& b) {
              for(size_t idx=0; idx<b.size(); ++idx) {
                 expr_rep[idx] = b[idx];
              }
              return *this;
           }

           template<typename T2, typename Rep2>
           Array& operator= (Array<T2, Rep2> const& b) {
              for(size_t idx=0; idx<b.size(); ++idx) {
                 expr_rep[idx] = b[idx];
              }
              return *this;
           }

           T operator[] (size_t idx) const {
              return expr_rep[idx];
           }
           T& operator[] (size_t idx) {
              return expr_rep[idx];
           }

           Rep const& rep() const {
              return expr_rep;
           }
           Rep& rep() {
              return expr_rep;
           }
        };

    And finally I’ll overload the Addition and Multiplication operators. I’ll explain how everything works together after I write these out.

        //addtion of two Arrays
        template <typename T, typename R1, typename R2>
        Array<T, A_Add<T, R1, R2> >
        operator+ (Array<T, R1> const& a, Array<T, R2> const& b) {
           return Array<T, A_Add<T, R1, R2> >(A_Add<T, R1, R2>(a.rep(),b.rep()));
        }

        // multiplication of two Arrays
        template <typename T, typename R1, typename R2>
        Array<T, A_Mult<T, R1, R2> >
        operator* (Array<T, R1> const& a, Array<T, R2> const& b) {
           return Array<T, A_Mult<T, R1, R2> >(A_Mult<T, R1, R2>(a.rep(), b.rep()));
        }

    Whew, thats some funky looking code! Now to do some explaining. Lets first start with the code snippet that will do the calculation.

        Array<double> x(1000), y(1000), z(1000);
        x=x*y + x*z;

    When we encounter x*y in the code, the multiplication operator is encounted. This returns another Array object (keep in mind x is an Array object, and y is an Array object). The difference is that instead of the expr_rep member variable being of type SArray (the actual storage), it is of type:
       
        A_Mult<double, SArray<double>, SArray<double> >
      
    x*z returns an Array object that contains the same type for its expr_rep member. Finally, those two returned Array types added together, invoking the addition operator, and returning an Array type, whose expr_rep member is of type:

         A_Add<double,
                       A_Mult<double, SArray<double>, SArray<double> >,
                       A_Mult<double, SArray<double>, SArray<double> > >

    Now keep in mind, all of this is done at compile time by the C++ compiler! The compiler turns x*y + x*z into an object of type:

        Array<double, A_Add<double,
                       A_Mult<double, SArray<double>, SArray<double> >,
                       A_Mult<double, SArray<double>, SArray<double> > > >

    When you set x equal the the object of this type (x = x*y + x*z), the assignment operator is invoked on that Array type. I’ll repeat the code here, so you don’t have to scroll up to the Array class and look for it:

        template<typename T2, typename Rep2>
           Array& operator= (Array<T2, Rep2> const& b) {
              for(size_t idx=0; idx<b.size(); ++idx) {
                 expr_rep[idx] = b[idx];
              }
              return *this;
           }

    So this code calls the index operator on Array, which in turn calls the index operator on that A_Add object we created above. The A_Add index operator. If you trace through the index operator of A_Add, you will find that for each given subscript idx, it will comput (x[idx] * y[idx]) + (x[idx]*z[idx]), which is what our intial aim was to do!

    Just to put this all into perspective, this code recursively instantiates templates that, in the end, represent an entire expression. You can imagine the lambda library doing the same thing. In the example above for lambda, the << operator was overloaded, and the expression cout << _1 << ” ” was converted into an an object that was a function object. This function object gets executed by the for_each loop, and the rest is history.

Comments (2)

  • I’m not proficient in C++, but there was a time when I thought I could at least read it. I bow to your kung-fu sir.

  • TAG UR IT!!!! this is so scary. send this to 15 people in the next 143 mins. when you are done press F6 and your crushes name will appear on the screen in big letters. this is scary cuz it works!!!! if you break the chain you will have problems with relationships for the next 5 years.

Post a Comment

Leave a Reply

Your email address will not be published. Required fields are marked *