My Algorithms for Sorted Arrays

Warning! Some information on this page is older than 5 years now. I keep it for reference, but it probably doesn't reflect my current knowledge and beliefs.

Sun
21
Mar 2010

Yesterday I wrote about The Beauty of Sorted Arrays. I believe the C++ STL is not perfect and so I've written some additional code that I needed for the purpose of handling such data structure. It can be found in my CommonLib library.

I once wanted to have a sorting algorithm, just like std::sort, but using insertion sort. The std::sort is of course great, has average O(n log n) complexity and AFAIK internally uses introsort algorithm (an improved version of quicksort). But if we know that the collection is already almost sorted, insertion sort will work faster. So here is my code:

// Basic version that uses operator <
template<class Iterator>
void InsertionSort(Iterator b, Iterator e)
{
if (b == e) return;
for (Iterator j = b + 1; j < e; j++)
{
typename std::iterator_traits<Iterator>::value_type key;
key = *j;
Iterator i = j;
while (i > b && key < *(i-1))
{
*i = *(i-1);
i--;
}
*i = key;
}
}

// Overloaded version with explicit comparator
template<class Iterator, typename Compare>
void InsertionSort(Iterator b, Iterator e, Compare Comp)
{
if (b == e) return;
for (Iterator j = b + 1; j < e; j++)
{
typename std::iterator_traits<Iterator>::value_type key;
key = *j;
Iterator i = j;
while (i > b && Comp(key, *(i-1)))
{
*i = *(i-1);
i--;
}
*i = key;
}
}

Another problem is finding an item in the ordered array by some key. I can see two solutions in C++, both unsatisfactory. First one is to write your own comparator or overload operator < for your item structure to compare items by key, like this:

struct Item
{
int Id;
DWORD Color;
std::string Name;

bool operator < (const Item &rhs) const { return Id < rhs.Id; }
};

The problem is that C++ comparator always compares two full items, so to use this comparing feature for finding item in the array, we must construct and pass full Item object, despite we use only the Id field.

Item item;
item.Id = 1;
bool contains = std::binary_search(vec.begin(), vec.end(), item);

Second solution is to use std::map container. It has separate key and value type so you can search by key:

std::map<int, Item>::iterator foundIt = map.find(1);
if (foundIt != map.end())
std::cout << "Found ID=" << foundIt->first
<< ", Name=" << foundIt->second.Name << std::endl;

But the problem is that the key field is separate from value type so the Id field should not be included in the Item structure because it's redundant. We should instead use std::pair<int, Item> as the type of items contained in the map, which is alised to std::map<int, Item>::value_type.

I wanted to have more convenient way to quickly search an ordered array by key, so I've coded these two function templates:

template <typename IterT, typename KeyT, typename CmpT>
IterT FirstNotLessIndex(IterT beg, IterT end, const KeyT &key, CmpT cmp)
{
unsigned down = 0, up = (end - beg);
while(down < up)
{
unsigned mid = (down + up) / 2;
int res = cmp(key, *(beg + mid));
if(res > 0)
down = mid + 1;
else
up = mid;
}
return beg + down;
}

template <typename IterT, typename KeyT, typename CmpT>
IterT BinarySearch(IterT beg, IterT end, const KeyT &key, CmpT cmp)
{
IterT it = FirstNotLessIndex<IterT, KeyT, CmpT>(beg, end, key, cmp);
if (it == end || cmp(key, *it) != 0)
return end;
return it;
}

First one performs binary search to determine the iterator for a first element not less (greater or equal) than the specified key. It's perfect for determining the place for inserting new item into the array. Second function uses first one to check whether an item with exact matching key exists in the array and returns iterator pointing to it, or end iterator if not found.

As you can see, both functions take the type of the key as a template parameter, which may be different than full item type pointed by iterator. They also take a comparator functor which is always called with parameters (key, item) and should return int. The comparator should return a number less, equal or greater than zero according to the result of the comparison - just like strcmp, memcmp and other functions from old C.

Now my example can be written like this:

struct Item
{
int Id;
DWORD Color;
std::string Name;

static int Cmp(int id, const Item &rhs) { return id - rhs.Id; }
};

bool contains =
common::BinarySearch(vec.begin(), vec.end(), 1, &Item::Cmp) != vec.end();

Having such universal comparison function, it's easy to convert it to all comparison operators:

bool operator <  (const Item &rhs) const { return Cmp(Id, rhs) <  0; }
bool operator >  (const Item &rhs) const { return Cmp(Id, rhs) >  0; }
bool operator <= (const Item &rhs) const { return Cmp(Id, rhs) <= 0; }
bool operator >= (const Item &rhs) const { return Cmp(Id, rhs) <= 0; }
bool operator == (const Item &rhs) const { return Cmp(Id, rhs) == 0; }
bool operator != (const Item &rhs) const { return Cmp(Id, rhs) != 0; }

Opposite conversion is also possible. If you have a type with operator < accessible (like float or your own type with overloaded operator), you can convert it to a comparator returning int with this function:

template <typename T>
inline int UniversalCmp(const T &a, const T &b)
{
if (a < b) return -1;
if (b < a) return  1;
return 0;
}

Comments | #commonlib #stl #c++ #algorithms