Entries for tag "stl", ordered from most recent. Entry count: 6.
# Efficient way of using std::vector
Sat
22
Sep 2018
Some people say that C++ STL is slow. I would rather say it's the way we use it. Sure, there are many potential sources of slowness when using STL. For example, std::list
or std::map
tends to allocate many small objects and dynamic allocation is a time-consuming operation. Making many copies of your objects like std::string
is also costly - that's why I created str_view project. But std::vector
is just a wrapper over a dynamically allocated array that also remembers its size and can reallocate when you add new elements. Without STL, you would need to implement the same functionality yourself.
When it comes to traversing elements of the vector (e.g. to sum the numerical values contained in it), there are many ways to do it. STL is notoriously known for working very slow in Debug project configuration, but as it turns out, this heavily depends on what method do you choose for using it.
Here is a small experiment that I've just made. In this code, I create a vector of 100,000,000 integers, then sum its elements using 5 different methods, calculating how much time does it take for each of them. Results (averaged over 5 iterations for each method) are as follows. Notice logarithmic scale on horizontal axis.
Here is the full source code of my testing program:
#include <cstdio> #include <cstdint> #include <vector> #include <chrono> #include <numeric> typedef std::chrono::high_resolution_clock::time_point time_point; typedef std::chrono::high_resolution_clock::duration duration; inline time_point now() { return std::chrono::high_resolution_clock::now(); } inline double durationToMilliseconds(duration d) { return std::chrono::duration<double, std::milli>(d).count(); } int main() { printf("Iteration,Method,Sum,Time (ms)\n"); for(uint32_t iter = 0; iter < 5; ++iter) { std::vector<int> numbers(100000000ull); numbers[0] = 1; numbers[1] = 2; numbers.back() = 3; { time_point timeBeg = now(); // Method 1: Use STL algorithm std::accumulate. int sum = std::accumulate(numbers.begin(), numbers.end(), 0); printf("%u,accumulate,%i,%g\n", iter, sum, durationToMilliseconds(now() - timeBeg)); } { time_point timeBeg = now(); // Method 2: Use the new C++11 range-based for loop. int sum = 0; for(auto value : numbers) sum += value; printf("%u,Range-based for loop,%i,%g\n", iter, sum, durationToMilliseconds(now() - timeBeg)); } { time_point timeBeg = now(); // Method 3: Use traditional loop, traverse vector using its iterator. int sum = 0; for(auto it = numbers.begin(); it != numbers.end(); ++it) sum += *it; printf("%u,Loop with iterator,%i,%g\n", iter, sum, durationToMilliseconds(now() - timeBeg)); } { time_point timeBeg = now(); // Method 4: Use traditional loop, traverse using index. int sum = 0; for(size_t i = 0; i < numbers.size(); ++i) sum += numbers[i]; printf("%u,Loop with indexing,%i,%g\n", iter, sum, durationToMilliseconds(now() - timeBeg)); } { time_point timeBeg = now(); // Method 5: Get pointer to raw array and its size, then use a loop to traverse it. int sum = 0; int* dataPtr = numbers.data(); size_t count = numbers.size(); for(size_t i = 0; i < count; ++i) sum += dataPtr[i]; printf("%u,Loop with pointer,%i,%g\n", iter, sum, durationToMilliseconds(now() - timeBeg)); } } }
As you can see, some methods are slower than the others in Debug configurations by more than 3 orders of magnitude! The difference is so big that if you write your program or game like this, it may not be possible to use its Debug version with any reasonably-sized input data. But if you look at disassembly, it should be no surprise. For example, method 4 calls vector methods size()
and operator[]
in every iteration of the loop. We know that in Debug configuration functions are not inilined and optimized, so these are real function calls:
On the other hand, method 5 that operates on raw pointer to the vector's underlying data is not that much slower in Debug configuration comparing to Release. Disassembly from Debug version:
So my conclusion is: Using std::vector to handle memory management and reallocation and using raw pointer to access its data is the best way to go.
My testing environment was:
CPU: Intel Core i7-6700K 4.00 GHz
RAM: DDR4, Dual-Channel, current memory clock 1066 MHz
OS: Windows 10 Version 1803 (OS Build 17134.285)
Compiler: Microsoft Visual Studio Community 2017 Version 15.4.8
Configuration options: x64 Debug/Release
Windows SDK Version 10.0.16299.0
Update 2023-09-30: There is a follow-up article that you may find interesting: "3 Ways to Iterate Over std::vector".
Comments | #optimization #c++ #stl Share
# My Algorithms for Sorted Arrays
Sun
21
Mar 2010
Yesterday I wrote about The Beauty of Sorted Arrays. I believe the C++ STL is not perfect and so I've written some additional code that I needed for the purpose of handling such data structure. It can be found in my CommonLib library.
I once wanted to have a sorting algorithm, just like std::sort, but using insertion sort. The std::sort is of course great, has average O(n log n) complexity and AFAIK internally uses introsort algorithm (an improved version of quicksort). But if we know that the collection is already almost sorted, insertion sort will work faster. So here is my code:
// Basic version that uses operator < template<class Iterator> void InsertionSort(Iterator b, Iterator e) { if (b == e) return; for (Iterator j = b + 1; j < e; j++) { typename std::iterator_traits<Iterator>::value_type key; key = *j; Iterator i = j; while (i > b && key < *(i-1)) { *i = *(i-1); i--; } *i = key; } } // Overloaded version with explicit comparator template<class Iterator, typename Compare> void InsertionSort(Iterator b, Iterator e, Compare Comp) { if (b == e) return; for (Iterator j = b + 1; j < e; j++) { typename std::iterator_traits<Iterator>::value_type key; key = *j; Iterator i = j; while (i > b && Comp(key, *(i-1))) { *i = *(i-1); i--; } *i = key; } }
Another problem is finding an item in the ordered array by some key. I can see two solutions in C++, both unsatisfactory. First one is to write your own comparator or overload operator < for your item structure to compare items by key, like this:
struct Item { int Id; DWORD Color; std::string Name; bool operator < (const Item &rhs) const { return Id < rhs.Id; } };
The problem is that C++ comparator always compares two full items, so to use this comparing feature for finding item in the array, we must construct and pass full Item object, despite we use only the Id field.
Item item; item.Id = 1; bool contains = std::binary_search(vec.begin(), vec.end(), item);
Second solution is to use std::map container. It has separate key and value type so you can search by key:
std::map<int, Item>::iterator foundIt = map.find(1); if (foundIt != map.end()) std::cout << "Found ID=" << foundIt->first << ", Name=" << foundIt->second.Name << std::endl;
But the problem is that the key field is separate from value type so the Id field should not be included in the Item structure because it's redundant. We should instead use std::pair<int, Item> as the type of items contained in the map, which is alised to std::map<int, Item>::value_type.
I wanted to have more convenient way to quickly search an ordered array by key, so I've coded these two function templates:
template <typename IterT, typename KeyT, typename CmpT> IterT FirstNotLessIndex(IterT beg, IterT end, const KeyT &key, CmpT cmp) { unsigned down = 0, up = (end - beg); while(down < up) { unsigned mid = (down + up) / 2; int res = cmp(key, *(beg + mid)); if(res > 0) down = mid + 1; else up = mid; } return beg + down; } template <typename IterT, typename KeyT, typename CmpT> IterT BinarySearch(IterT beg, IterT end, const KeyT &key, CmpT cmp) { IterT it = FirstNotLessIndex<IterT, KeyT, CmpT>(beg, end, key, cmp); if (it == end || cmp(key, *it) != 0) return end; return it; }
First one performs binary search to determine the iterator for a first element not less (greater or equal) than the specified key. It's perfect for determining the place for inserting new item into the array. Second function uses first one to check whether an item with exact matching key exists in the array and returns iterator pointing to it, or end iterator if not found.
As you can see, both functions take the type of the key as a template parameter, which may be different than full item type pointed by iterator. They also take a comparator functor which is always called with parameters (key, item) and should return int. The comparator should return a number less, equal or greater than zero according to the result of the comparison - just like strcmp, memcmp and other functions from old C.
Now my example can be written like this:
struct Item { int Id; DWORD Color; std::string Name; static int Cmp(int id, const Item &rhs) { return id - rhs.Id; } }; bool contains = common::BinarySearch(vec.begin(), vec.end(), 1, &Item::Cmp) != vec.end();
Having such universal comparison function, it's easy to convert it to all comparison operators:
bool operator < (const Item &rhs) const { return Cmp(Id, rhs) < 0; } bool operator > (const Item &rhs) const { return Cmp(Id, rhs) > 0; } bool operator <= (const Item &rhs) const { return Cmp(Id, rhs) <= 0; } bool operator >= (const Item &rhs) const { return Cmp(Id, rhs) <= 0; } bool operator == (const Item &rhs) const { return Cmp(Id, rhs) == 0; } bool operator != (const Item &rhs) const { return Cmp(Id, rhs) != 0; }
Opposite conversion is also possible. If you have a type with operator < accessible (like float or your own type with overloaded operator), you can convert it to a comparator returning int with this function:
template <typename T> inline int UniversalCmp(const T &a, const T &b) { if (a < b) return -1; if (b < a) return 1; return 0; }
Comments | #commonlib #stl #c++ #algorithms Share
# The Beauty of Sorted Arrays
Sat
20
Mar 2010
A one-dimmensional array (also called vector) is very simple data structure. We were taught that inserting or deleting elements in the middle of the array is slow because we have to move (by rewriting) all following elements. That's why books about C++ teach that for sorted collections - those who provide fast search operation - it's better to use containers like std::set or std::map.
But those who code games in C++ and care about performance know that today's processors (CPU) are very fast in terms of doing calculations and processing lots of data, while access to the RAM memory becomes more and more kind of a bottleneck. According to great presentation Pitfalls of Object Oriented Programming by Tony Albrecht from Sony, a cachce miss (access to some data from the memory which are not currently in CPU cache) costs about 400 cycles! Dynamic memory allocation (like operator new in C++) is also a slow operation. That's why data structures that use separate nodes linked with pointers (like std::list, std::set, std::map) are slower than std::vector and it's often better to keep a sorted collection of elements in a vector, even if you have hundreds or thousands of elements and must sometimes insert or delete them from the middle.
Unfortunately, C++ STL doesn't provide a "std::ordered_vector" container that whould keep items sorted while allocating single chunk of continuous memory. We have to do it by ourselves, like this:
First, we have to choose a way to compare elements. There are multiple solutions (read about comparators in C++), but the simplest one is to just overload operator < in your element structure.
#include <vector> #include <string> #include <algorithm> struct Item { int Id; DWORD Color; std::string Name; Item() { } Item(int id, DWORD color, const std::string name) : Id(id), Color(color), Name(name) { } bool operator < (const Item &rhs) const { return Id < rhs.Id; } }; typedef std::vector<Item> ItemVector; ItemVector vec; Item item;
Here is how we insert an item into sorted vector:
item = Item(1, 0xFF0000, "Red"); ItemVector::iterator itToInsert = std::lower_bound( vec.begin(), vec.end(), item); vec.insert(itToInsert, item);
Unfortunately we have to deal with iterators instead of simple indices.
To check if the vector contains element with particular Id:
item = Item(1, 0, std::string()); bool contains = std::binary_search(vec.begin(), vec.end(), item);
Here come another nuisances. We have to create full Item object just to use its Id field for comparison purpose. I'll show tomorrow how to overcome this problem.
It's also annoying that binary_search algorithm returns bool instead of iterator that would show us where the found item is. To find an item and determine its iterator (and from this its index) we have to use lower_bound algorithm. This algorithm is very universal but also hard to understand. It quickly searches a sorted collection and returns the position of the first element that has a value greater than or equivalent to a specified value. It was perfect for determining a place to insert new item into, but to find existing element, we have to use it with this if:
item = Item(1, 0, std::string()); ItemVector::iterator findIt = std::lower_bound(vec.begin(), vec.end(), item); if (findIt != vec.end() && findIt->Id == item.Id) { size_t foundIndex = std::distance(vec.begin(), findIt); } else { // Not found. }
And finally, to remove an item from the vector while having index, we do this trick:
size_t indexToDelete = 0; vec.erase(vec.begin() + indexToDelete);
Well, C++ STL is very general, universal, powerful and... very complicated, while standard libraries of other programming languages also have some curiosities in their array classes.
For example, ActionScript 3 have single method to insert and remove items:
function splice(startIndex:int, deleteCount:uint, ... values):Array
To delete some items, you pass non-zero deleteCount. To insert some items, you pass them as values parameter.
In .NET, the System.Collections.Generic.List<T> class (which is actually an array, the name is misnomer) has very smart BinarySearch method. It returns the "zero-based index of item (...), if item is found; otherwise, a negative number that is the bitwise complement of the index of the next element that is larger than item or, if there is no larger element, the bitwise complement of Count.". It can be used to both find existing item in the array or determine an index for a new item, like:
int index = myList->BinarySearch("abc"); if (index < 0) myList->Insert(~index, "abc");
See also: My Algorithms for Sorted Arrays
Comments | #algorithms #c++ #stl #.net #flash Share
# Programujesz w C czy w C++?
Sun
16
Nov 2008
Dziś będzie refleksja, do której skłoniła mnie niedawna rozmowa na Gadu-Gadu. Wielu z tych, którzy pisali w C albo uczyli się go na studiach, a nawet niektórzy autorzy książek wyznają podział na C i C++ tak ścisły, że wszelkie pisanie własnych kontenerów, używanie łańcuchów typu char* czy zmiennych i funkcji globalnych, a nawet wskaźników nazywają językiem C. C++ kojarzy im się wyłącznie z jak najintensywniejszym stosowaniem programowania obiektowego i biblioteki STL.
Tymczasem ja nigdy nie używałem języka C i nie mam zamiaru. Stale programuję w C++, a mimo tego używam czasem funkcji globalnych, piszę nieraz własne struktury danych i nie przepadam za strumieniami biblioteki standardowej. Dlatego uważam, że takie nazywanie wszystkiego co nie-obiektowe i nie-STL-owe programowaniem w C jest trochę przesadzone.
Pokażcie mi z resztą bibliotekę przeznaczoną dla C++, która wykorzystuje wszystkie cechy tego języka (klasy, łańcuchy i kontenery STL, strumienie i wyjątki) zamiast dostarczać swoich własnych rozwiązań w tym zakresie albo używać starego podejścia z C. Niewiele takich jest, a mimo tego nie nazywamy wszystkich pozostałych bibliotekami do C.
# std::vector wszędzie zamiast tablic
Thu
17
Jul 2008
Początkujący w C++ uczą się najpierw o wskaźnikach i dynamicznej alokacji pamięci, w tym tablic operatorem new[]. Dopiero potem poznają bibliotekę STL. Tymczasem klasy std::vector warto używać zawsze zamiast dynamicznych tablic. Ma wiele zalet: [+] sama zwalnia pamięć, nie grozi wyciekiem pamięci, [+] pamięta swój rozmiar, [+] potrafi się sama rozszerzać, [+] ma wygodne metody m.in. do wstawiania (insert) i usuwania (erase) ze środka. Wcale nie jest przy tym wolniejsza niż zwykłe tablice, o ile używa się jej prawidłowo. Na przykład tak:
class MojaKlasa { private: std::vectorm_Wektor; public: MojaKlasa() : // Rozmiar od razu na 128 m_Wektor(128) { // Wypełniam elementy, a nie dodaję for (int i = 0; i < 128; i++) m_Wektor[i] = i; } };
Najlepsze jest to, że wektor zawsze jest trzymany w pamięci jako zwykła tablica i można uzyskać do niej wskaźnik pobierając adres pierwszego elementu:
int *Tablica = &m_Wektor[0];
# Kasowanie linijki w konsoli
Sun
22
Jun 2008
Konsola systemowa, używana przez funkcje C (jak printf) lub C++ (jak std::cout) pozwala tylko wypisywać na wyjście i pobierać na wejściu znaki lub całe łańcuchy. Dopiero Windows API umożliwia bardziej niskopioziomowe reagowanie na klawiaturę, myszkę i swobodne rysowanie znakami po dowolnych miejscach konsoli, także z użyciem kolorów. Zawsze zastanawiało mnie, jak bez WinAPI radzą sobie twórcy programów konsolowych, którzy kasują bieżącą linijkę zamiast tylko dopisywać nowe informacje, jak wtedy kiedy następuje odliczanie 10%, 20%, 30% itd.
Ostatnio znalazłem rozwiązanie. Jest nim magiczny znak "\r", który użyty samodzielnie (a nie jako część końca wiersza "\r\n") powoduje powrót karetki na początek linii umożliwiając napisanie w jej miejscu czegoś nowego. Na przykład:
for (unsigned i = 0; i < 10; i++) { std::cout << "\rLiczba: " << i; Sleep(500); } std::cout << std::endl;