Wednesday, May 10, 2006

C++ : Meta Programming : Fibonacci and Phi

In this blog I explore the relationship between the Fibonacci sequence and the Golden Ratio Phi. I write a meta program to calculate the Fibonacci sequence at compile time and use the last two elements in the sequence to calculate Phi.


Fibonacci Sequence:

1 1 2 3 5 8 13 21 34 ...

N N + N
i = i-1 i-2


The following expression tends towards Phi as i increases:

N / N
i i-1


Another definition of Phi:

Phi * Phi = Phi + 1


Which rearranged proves that Phi is 1 + it's reciprocal, hinting at a much simpler algorithm to calculate it than the one we have here based on this equation:

Phi = 1 + 1/Phi


This could be expressed as a recursive function and we can start with any number, except 0 or -1, to tend towards Phi, 1 will do though.

The following link explores the interesting mathematical properties of Phi


http://www.mcs.surrey.ac.uk/Personal/R.Knott/
Fibonacci/phi.html#simpledef



Now time for some code, here's a runtime recursive function that calculates the last two numbers in the Fibonacci sequence for any given index in the sequence i:

#define DO_OUTPUT

typedef unsigned long long ulong;
typedef std::pair<ulong, ulong> tuple;

// Fibonacci
// Run-time Recursive Function
void GetFibonacciTupleImpl(unsigned int i, tuple &result)
{
if(i>0)
{
result.second += result.first;
result.first = result.second - result.first;
#ifdef DO_OUTPUT
std::cerr << result.second << " ";
#endif
if(result.second>result.first) // wrapping guard
GetFibonacciTupleImpl(i-1, result);
else // store recursive level
result.first = result.second = i;

}
}

void GetFibonacciTuple(unsigned int i, tuple &result)
{
result.first = 1;
result.second = 1;
#ifdef DO_OUTPUT
std::cerr << "1 1 ";
#endif
GetFibonacciTupleImpl(i, result);
#ifdef DO_OUTPUT
if(result.first==result.second && result.second>1)
std::cerr << "\nwrapping occured at i="
<< (i-result.first+2) << "\n";
#endif
}


Here's a meta-programming solution for the exact same thing, only all calculated at compile-time!

template<int I>
struct Fibonacci
{
enum { value = Fibonacci<I-1>::value +
Fibonacci<I-2>::value };
};

template<>
struct Fibonacci<1>
{
enum { value = 1 };
};


template<>
struct Fibonacci<0>
{
enum { value = 1 };
};

template<int I, typename T>
struct FibonacciTuple
{
enum {
first = Fibonacci<I-1>::value,
second= Fibonacci<I>::value
};

// run time part as not integral ...
static const T getRatio()
{
return (T)second / (T)first;
}
};


Now this only works up to a point. If you try to compile Golden::FibonacciTuple<47,double>::getRatio() under g++ 3.4.4 you get a compiler error: Integer Overflow in Expression. Also the results are only correct up to Golden::FibonacciTuple<45,double>::getRatio(). Initially I surmised this was due to some compiler recursion-stack related problem but after I'd actaully read the compiler output! I realised the error occured on the line where the enum value is evaluated (using the + operator). If it were a compiler stack-depth problem then something like this migth have alleviated the problem:

template<ulong I>
struct Fibonacci
{
...
};


but the error lies with the face that it looks like the compiler uses a signed integer to store enum values. Why signed? Well the runtime version uses an unsigned int to store the Fibonacci sequence value and it breaks down after 90 recursions! (i.e. twice as many).

Of course this is only a bit of fun and isn't the most efficient way of calculating Phi. Initially I thought perhaps I could write a way of statically storing the value of Phi in a program without defining it as a literal value, i.e.

static const float Phi = 1.618....;


However, firstly there's the enum problem and secondly (and this is a related problem) we can only use integral values at compile-time. This is illustrated by the following meta program:

// Cannot get Phi using the reciprocal method
// since only integral numbers are used in enums!

template<int I>
struct Phi
{
enum { value = 1 + 1/Phi<I-1>::value };
};

template<>
struct Phi<0>
{
enum { value = 1 };
};


This is a flawed implementation of the reciprocal method of calculating Phi as mentioned above. This obviously requires less work than the Fibonacci method but isn't quite as interesting from a mathematical point of view.

In conclusion then, as well as exploring the fascinating relationship between the Fibonacci Sequence and Phi, we've found out about some limitations of what we can and can't do a compile-time and managed to deduce the g++ 3.4.4 uses an int to store an enum values!

Monday, May 08, 2006

C++ : Exploring Sorting Part IV

Let's start with the full code for the promised generic quicksort implementation that is my attempt at an "exercise for the reader" from an Alexandrescu article mentioned previously

template<class Iter>
void sort(Iter left, Iter right)
{
// make larger to invoke insertion sort sooner
enum { threshold = 56 };
while(right-left>1)
{
// do insertion sort on range
if(right-left<=threshold)
{
Bannister::insertionSort(left, right);
return; // done sorting range
}
else // do quicksort on range
{
const std::pair<Iter,Iter> pivot =
Bannister::partition(
left, right,
*Bannister::selectRandomPivot(
left, right));
if (left < pivot.first)
Bannister::sort(left, pivot.first);
if (right > pivot.second)
left = pivot.second;
else
break;
}
}
}


Quite different from the original C code I posted in my last blog! I'm sure you could have made that function generic using the same method as shown with the bubblesort, it's not worth going into detail again. The above algorithm uses a quicksort for arrays over 56 elements long, otherwise it uses an insertion sort which is another fast method. It's a matter of testing to get the best speed benefits, something I have as yet to write a test framework for. Now to explain the quicksort part of the algorithm. The selecting of a pivot value and partitioning of the array passed to the sort function is deferred to two functions for clarity and to make it easier to tinker with the algorithm. The sort function itself has been partially converted to an iterative solution to try and alleviate the speed costs of recursion.

The pivot value is chosen randomly to help prevent the "worst-case scenario" of always choosing the lowest number around which to partition the array, which is always the case when sorting an array which is already ordered using the first value as the pivot. Another speed increase is achieved by uses Alexandrescus "fit pivot" technique which is based on the more well-known "fat-pivot". The "fat pivot" algorithm partitions the array into 3, not 2, sub arrays. The 3rd contains all occurences of the pivot value and places this in the middle of the other two (which contain values lower and values higher than the pivot respectively). This reduces the number of partitioning steps needed when there are long runs of equal numbers. However there is an overhead cost in doing the partitioning. The "fit pivot" algorithm will simply apply the normal "two partition" algorithm to an array but carry on moving the right pointer to the left and vice versa until numbers are found which are not equal to the pivot value. This way, if there is a run of pivot-equal numbers in the middle of the array, it will be kept there, not passed to the quicksort again and the resulting two partitions will be smaller, increasing the speed in those situations at no extra cost. Here are the two functions (including a more standard selectFirstPivot).

template<class Iter>
Iter selectFirstPivot(Iter left, Iter right)
{
return left;
}

template<class Iter>
Iter selectRandomPivot(Iter left, Iter right)
{
long int index = Bannister::rand(0, right-left);
return left + index;
}

template<class I0, class I1, class T>
std::pair<I1,I0> partition(
I0 left, I1 right, T pivot)
{
--right;
while (left < right)
{
while ((*right >= pivot) && (left < right))
--right;
if (left != right)
{
*left = *right;
++left;
}
while ((*left <= pivot) && (left < right))
++left;
if (left != right)
{
*right = *left;
--right;
}
}
*left = pivot;
// all numbers in-between will be equal
return std::make_pair(right, left+1);
}


Note, I've put my functions in a namespace called "Bannister" (rubbish name I know, it's named after Roger Bannister, the first man to run the 4 minute mile). Here's the insertion sort algorithm.

template<class Iter, class T>
void insertionSort(Iter left, Iter right, T)
// T unused, only type is needed
{
Iter i, j;
T index;

for (i=left+1; i != right; ++i)
{
index = *i;
j = i;
while ((j > left) && (*(j-1) > index))
{
*j = *(j-1);
--j;
}
*j = index;
}
}

template<class Iter>
void insertionSort(Iter left, Iter right)
{
// dereference left to deduce type of iterator
Bannister::insertionSort(left, right, *left);
}


And finally I'll include the fast pseudo-random function I found to calculate the random pivot.

// Park-Miller "minimal standard" 31 bit
// pseudo-random number generator
// implemented with David G. Carta's
// optimisation: with 32 bit math and
// without division..

long unsigned int rand31()
{
static long unsigned int seed = 16807;
long unsigned int hi, lo;

lo = 16807 * (seed & 0xFFFF);
hi = 16807 * (seed >> 16);

lo += (hi & 0x7FFF) << 16;
lo += hi >> 15;

if(lo>0x7FFFFFFF) lo -= 0x7FFFFFFF;

return (seed=(long)lo);
}

long int rand(long int lo, long int hi)
{
return (long int)(rand31() % (hi-lo)) + lo;
}


I should include some examples of using these functions as well ... (I really am letting the code speak for itself in this blog)... disclaimer: really bad test harness with no error-checking!!

#include <iostream>
#include <cstdlib>
#include <ctime>
#include <limits>
#include <vector>
// previous code goes in here...
#include "sort.h"

template<class Iter>
void output(Iter left, Iter right)
{
for(Iter it=left; it!=right; ++it)
std::cerr << *it << " ";
std::cerr << std::endl;
}

template<class Iter>
Iter find_unsorted(Iter left, Iter right)
{
for(Iter it=left+1; it!=right; ++it)
{
if(*it<*(it-1))
{
return it;
}
}
return right;
}

int unit(void)
{
const int size = 12;
int arr[] = {4, 0, 5, 7, 2, 2, 2, 2, 2, 1, 78, 4};
float arf[] = {4.2f, 0.1f, 5.3f, 7.8f, 8.02f,
2.3f, 45.9f, 2.1f, 2.1f, 1.2f,
78.0f, 4.2f};
int arr[] = {0, 7, 1, 8, 2, 9,
3, 10, 4, 11, 5, 6};
typedef std::vector<char> char_array_t;
char_array_t arc;
arc.push_back('k');
arc.push_back('c');
arc.push_back('e');
arc.push_back('f');
arc.push_back('b');
arc.push_back('g');
arc.push_back('i');
arc.push_back('h');
arc.push_back('l');
arc.push_back('d');
arc.push_back('j');
arc.push_back('a');

output(arr, arr+size);
bubbleSortT<int*>(arr, size);
Bannister::sort(arr, arr+size);
output(arr, arr+size);

output(arf, arf+size);
Bannister::insertionSort(arf, arf+size);
Bannister::sort(arf, arf+size);
output(arf, arf+size);

output(arc.begin(), arc.end());
Bannister::sort(arc.begin(), arc.end());
Bannister::insertionSort(arc.begin(), arc.end());
output(arc.begin(), arc.end());

return 1;
}

int main(int argc, char **argv)
{
if(argc==1)
{
return unit();
}

typedef std::vector<char> char_array_t;
char_array_t arc;

// Kilobytes of characters to sort
const int mem_size_bytes = 1024 * atoi(argv[1]);
for(int i=0; i<mem_size_bytes; ++i)
{
arc.push_back(Bannister::rand('A', 'Z'));
}

switch(argv[2][0])
{
case 'b':
std::cerr << "bubble sorting array...\n";
bubbleSort(arc.begin(), arc.end());
break;
case 'q':
std::cerr << "Bannister::sorting array... "
"(hybrid quicksort/insertion sort "
"with random fit pivot)\n";
Bannister::sort(arc.begin(), arc.end());
break;
case 'i':
std::cerr << "insertion sorting array...\n";
Bannister::insertionSort(
arc.begin(), arc.end());
break;
case 's':
std::cerr << "std::sorting array...\n";
std::sort(arc.begin(), arc.end());
break;
default:
break;
}

char_array_t::iterator it =
find_unsorted(
arc.begin(), arc.end());
if(it!=arc.end())
{
std::cerr << "Unsorted! Error!\n" << std::endl;
output(it-10, it+10);
//output(arc.begin(), arc.end());
}

return 1;
}


Now I propose an exercise for the reader... write some proper test code to automatically time and compare the results using different sorting methods.

Some simple timing tests I've done (using >time ./sort.exe on the bash command line under Cygwin) show that the Banniser::sort is many times quicker than a bubble sort or even an insertion sort with large arrays. It doesn't hold a candle to std::sort though. I suspect the std::sort implementation uses some really good platform-specific tricks, maybe implementing some inner loops in assembly. Anyone who could attempt modifying this code to go even faster is more than welcome!

Thursday, May 04, 2006

Haskell : Getting Started

Here are a couple of links for anyone who wants to get started using Haskell. These are both interpreter implementations of the language but are good for getting started.

http://haskell.org/hugs/

Hugs 98 is a functional programming system based on Haskell 98, the de facto standard for non-strict functional programming languages. Hugs 98 provides an almost complete implementation of Haskell 98.

http://www-users.cs.york.ac.uk/~ndm/projects/winhugs.php

WinHugs is the Windows user interface to Hugs. In addition to all of the Hugs features, it includes type browsers and heirarchical constraint views. WinHugs should run on any version of Windows since 95, i.e. 95/98/ME/NT/2K/XP.

Wednesday, May 03, 2006

Haskell : QuickSort in 2 Lines

As an aside to the series of blogs on sorting and in an effort to ensure you that this is not a C++ only blog here's how you can express the quick sort algorithm in haskell

qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x]
++ qsort (filter (>= x) xs)


Compare this to the C example below! Haskell is a high-level, functional programming language. It is definitely worth a closer look, even if you don't use it to develop anything it will offer a different mindset that may help you approach and solve problems from a different angle.

Programming Quote of the Month

"The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures."

- Frederick P. Brooks, "The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition (2nd Edition)" by Frederick P. Brooks , ISBN: 0201835959,
page: 7-8


www.softwarequotes.com