# Tag: algorithms

Entries for tag "algorithms", ordered from most recent. Entry count: 66.

Uwaga! Informacje na tej stronie mają ponad 5 lat. Nadal je udostępniam, ale prawdopodobnie nie odzwierciedlają one mojej aktualnej wiedzy ani przekonań.

Pages: 1 2 3 4 5 ... 9

# Doing Something Every X Time

Sun
24
Jan 2010

Games are one of these applications where we have a loop spinning infinitely as fast as it can (or at least 30 times per second) and calling some DoFrame function that make all the computations and rendering of a game frame. Dealing with time is an important task in game programming. One of the things game frameworks usually do are retrieving precise current time from the system and measuring FPS. Xion wrote an interesting article about all this stuff some time ago, entitled Pętla czasu rzeczywistego [pl].

By the way, a fact not so well known by beginners is that FPS (frames per second) is not good measurement of a game performance, because it is expressed in Hz which is reciprocal of average frame time and thus it is nonlinear. The real metric for how much more optimization do you have to put into your game should be average frame time in milliseconds. But that's another topic. I've blogged about this some time ago here [pl]. Others also did this, like Humus or Real-Time Rendering Blog.

Subject I want to write about today is inspired by recent question from forum.gamedev.pl - Koniec z timerami [pl]. It's about implementing timers in games. The question is how to execute some code - let's say to call DoSomething() function - every EventTime time, e.g. every 200 ms. Two solutions have been proposed there by Kos and cybek. The difference between them is suble but important.

• const float EventTime - desired time between DoSomething() calls
• float CurrTime - time of the beginning of the current frame
• float LastTime - time of the beginning of the previous frame
• float DeltaTime = CurrTime - LastTime

Time may be expressed in seconds, milliseconds or any units - it doesn't matter here. First solutions looks like this:

```// Defining helper variable
float EventCounter;
// Initialization
EventCounter = 0.f;

// Inside DoFrame, called every frame
EventCounter += DeltaTime;
while (EventCounter >= EventTime)
{
DoSomething();
EventCounter -= EventTime; // !
}```

And here is the second example:

```// Inside DoFrame, called every frame
EventCounter += DeltaTime;
if (EventCounter >= EventTime)
{
DoSomething();
EventCounter = 0.f; // !
}```

So what's the difference? Method number 1 is precise. DoSometing will be always called every EventTime, plus/minus one frame and exactly every EventTime on average. On the other hand, method 2 calls DoSomething() not more often than every EventTime, but usually more rarely. I suppose it's EventTime + average_frame_time/2 on average. I hope you can see from this code why is that.

Why does it make a difference? First case is when you have a stall for, let's say, 500 ms between frames because you did some very expensive computations like failed A* pathfinding or some background process like antivirus slowed down the system. In the next executed frame, the first code above will "catch up" executing DoSomething two or three times, but the second code will just execute DoSomething() once and continue as nothing happened. Second difference is when you measure how many times was DoSomething() executed during some longer period. First code will probably do it exactly period/EventTime +/- 1 and the second code will result in no more, but probably less number of calls.

There is another question: How to express EventCounter variable? This problem is orthogonal to the one described above and make no difference to the code logic, so it's just your choice. Here are the options: As time passes since the last DoSomething call...

• EventCounter increases from 0 towards EventTime (that's what I used here)
• EventCounter decreases from EventTime towards 0
• EventCounter increases from CurrTime of that frame towards CurrFrame+EventTime

As a side note, you don't have to tell me that float is not a good type for expressing absolute and precise time inside games. I know that and I will write more about it another time. The code above was just and example.

# Processing File Paths in C++

Tue
12
Jan 2010

Paths to files and directories are special kind of strings. Sometimes they have to be processed, e.g. merged or splitted into some parts like drive letter, directory, file name and extension. Standard C library provides two function for this: makepath and splitpath. They use char* strings and operate always on all possible parts (drive, dir, fname, ext). In my opinion they are not very convenient. On the other hand, modern programming languages have more functionality in this field. For example, .NET has System.IO.Path static class with methods like GetExtension, GetFileName, GetDirectoryName.

To make operations on file paths convenient in C++, I've developed some time ago my own functions for that which operate on std::string strings. I've included them in my CommonLib library. They are similar a bit to these from Delphi :) Here is how I approach this topic: First, a common task is to extract part of a path (e.g. "C:\Windows\notepad.exe"). My functions for this are:

• ExtractFileDir returns path without final name ("C:\Windows"). It does not include trailing delimter unless returned path is a root directory, like "C:\".
• ExtractFileName returns only the final name, without preceding path ("notepad.exe"). Result is the full name including extension.
• ExtractFileExt returns only the extension of final name (".exe"). It's good to treat the dot as part of an extension because we sometimes need to strip or add an extension to an existing file name.
• ExtractPathPrefix returns only the starting point of an absolute path. For local Windows paths it looks like "C:\", but there is a second possibility - a network share like "\\ComputerName\ShareName\".

I also have some functions for processing paths:

• IncludeTrailingPathDelimiter and ExcludeTrailingPathDelimiter address a problem that sometimes paths to directories end with trailing backslash. These functions can be used whenever you want to ensure that your path ends or doesn't end with the '\'.
• ChangeFileExt is maybe a misnomer as it doesn't change an extension of a real file. It only changes file extension inside a path. For example, you can pass ".txt" to get "C:\Windows\notepad.txt". Thanks to requiring dot in the extension you can also use this function to strip extension from the path by passing empty string "" as the new extension.

Path can be absolute or relative. In Windows, absolute paths start with drive letter like "C:\" (which we can recognized by looking for colon) or network share (paths that begin with "\\"). Relative paths (where most basic example is a single file name) make sense only in some context. When we use relative paths to open real files, we address them relative to the "Current Directory" (which is a separate topic, I will write about it another time). To deal with absolute and relative paths, I've coded functions such as:

• PathIsAbsolute - tells whether a path is absolute
• RelativeToAbsolutePath - that's the most important function for me. Given base path (like "C:\Windows") and relative path (like "notepad.exe") it combines them into full path. This function is equivalent to System.IO.Path.Combine from .NET. There are many possibilities here. Base path can be relative or empty. Second path can be absolute (function directly returns second path in such case). It also automatically adds '\' separator between paths if needed.
• AbsoluteToRelativePath - does the opposite: Given full path and a base path to some drive or directory, it returns a path that points to the same file/directory relative to the base directory.

NormalizePath function preprocessess all the . and .. drectories in the path. A single dot used as directory name means "my own directory" (just does nothing) and two dots mean going up one level to the parent directory. So the path we consider here is logically equal to "C:\Windows\.\system32\..\notepad.exe". Paths like this still work, but they can look a bit unclear, so you can use NormalizePath function to clean them.

Paths in Linux are quite different. My functions also support them, although I didn't test this case in real situations yet. First of all, there are no drive letters in Linux, so absolute paths just start with '/', like "/etc/passwd". Second, a slash sign is used as a separator instead of backslash (it's interesting that slash also works in Windows). It's also more common in Linux than in Windows to meet files that don't have extension or have double extension (like "MyFile.tar.gz"). A question arises here whether what we call "extension" starts from the first or from the second dot :)

# Writing Parsers is Easy

Sun
27
Dec 2009

Using binary file formats is easier as data can be read and written almost directly as they are in memory, but sometimes we have to use some text formats. Whenever it is neither a standard format like XML or YAML, nor it is so simple we can load it with std::cin or scanf, we have to write a parser to read it. Writing a parser - code that reads and interprets some text format (e.g. file format or network protocol) looks like very complex task, but here I want to convince you that's not true - it is actually quite easy.

Why do I think so? It may look like it is a very long way from reading characters from a file till interpreting them as data records containing numbers, strings and other meaningful information. My answer for that is to split this problem into three layers.

The lowest layer is called CharReader. All it can do is reading single characters. It provides a layer of abstraction over reading data from a real source so you can forget whether you read from a file (hopefully with some buffering - never do file reads byte-after-byte!) or just iterate over a string fully loaded into memory. You can also forger about encoding so you see you data as characters, not bytes. It's a good idea to keep last read character somewhere so it can be "peeked" (read without losing it) many times before going to next one. The interface of a simple CharReader may look like this:

```class CharReader {
public:
// Returns last read character as out.
// If cursor at end, returns false.
bool Peek(char &out);
// Moves cursor one character forward.
void Next();
};```

Next layer is called Tokenizer. We need it because it's still complicated to interpret characters directly as some structured information. Data in almost every text format is made of multiple-character pieces of information like numbers, strings etc. so it's convenient to provide another layer of abstraction to see data as a sequence of tokens, not single characters. Here are some typical kinds of tokens that can be found also in programming languages:

• Identifiers, like: i, WinMain, getStringFromObj
• Strings, like: "abc", "Hello World!"
• Numbers, like: 0, 65535
• Symbols, like: = { } <=

Tokenizer uses CharReader to read chars and interprets these chars as tokens. Just like CharReader, it buffers and provides access to single token that was parsed recently so you can read it many times before anvancing to next one. It's also Tokenizer's responsibility to skip comments and whitespaces between tokens. Example interface may look like this:

```class Tokenizer {
public:

enum TOKEN_TYPE {
TOKEN_END,
TOKEN_IDENTIFIER,
TOKEN_STRING,
TOKEN_SYMBOL,
};
// Returns type of the last read token.
TOKEN_TYPE GetType();
// Returns content of the last read token.
const std::string & GetContent();
// Moves cursor one token forward.
void Next();
};```

And finally, when being able to see file as made of tokens, it's easy to code the third layer - Parser - on top of that. Parser reads subsequent tokens and interprets them as some meaningful data. For example, you can parse a Key="Value" pair by expecting token of type identifier and saving its content as Key, then expecting the (=) symbol and finally expecting a string token and saving its content as Value. Of course you also have to find a way to report errors on all these stages, including information about current row and column.

# Random Choice with Given Probabilities

Fri
11
Dec 2009

When we want to randomly choose one option from some possible items, we usually do something as simple as (rand() % numberOfItems). But what if we want to vary probabilities of these items? Here is an algorithm for this:

As input data, we have number of items and an array of their probabilities. Each probability is a nonnegative real number. I like to allow these numbers to be as big as I want, knowing that an item with probability 2 (or 200) is just two times more probable than the item with probability 1 (or 100). I will normalize them later.

```// == Preprocessing step ==

// Input:
std::vector<float> ItemProbabilities;
// Output:
std::vector<float> ProcessedProbabilities;

// Algorithm:

size_t count = ItemProbabilities.size();
assert(count > 0);

// First pass: summation
float sum = 0.f;
for (size_t i = count; i--; ) // count-1, count-2, ..., 0
{
float p = ItemProbabilities[i];
assert(p >= 0.f);
// Optional: Delete items with zero probability
if (p == 0.f)
{
Items.RemoveByIndex(i);
// That's how we delete an element from std::vector by index :P
ItemProbabilities.erase(ItemProbabilities.begin()+i);
}
sum += p;
}
assert(sum > 0.f);
float sumInv = 1.f / sum;

// Second pass: building preprocessed array
ProcessedProbabilities.resize(count);
ProcessedProbabilities[0] = ItemProbabilities[0] * sumInv;
for (size_t i = 1; i < count; i++)
ProcessedProbabilities[i] =
ProcessedProbabilities[i-1] + ItemProbabilities[i] * sumInv;```

The algorithm above does two things: It prepares ProcessedProbabilities array so that each probability is the item's probability plus the sum of probabilities of all previous items. It also normalizes them to range 0..1. As the result, the output array forms an ascending sequence, where the difference to the previous element is proportional to the item's probability in respect to full 0..1 range. For example:

```Input: 100, 100, 200
Temporary data: count=3, sum=400, sumInv=0.0025
Output: 0.25, 0.5, 1```

Now as we have these preprocessed data, generating random choices with certain probabilities is fast and simple. We just generate a random real number in range 0..1 and choose the first item for which ProcessedProbabilities[i] is greater than this. Additional optimization would be to use binary search here.

```// == Random choice ==

// Input:
std::vector<float> ProcessedProbabilities;
// Output:
size_t choice;

// Algorithm:
choice = 0;
float randNum = RandFloat(); // 0..1
while (choice < ProcessedProbabilities.size()-1
&& ProcessedProbabilities[choice] < randNum)
{
choice++;
}```

# Calculating Linear and Quadratic Equation Coefficients

Sun
30
Aug 2009

Some time ago I've written about formulas to calculate coefficients of linear and quadratic equation [pl] when having 2 or 3 given points (x,y). Yesterday I suddenly noticed that my code needs to do it every frame so I need functions to calculate these coefficients. Below you can find the code of my functions. They are templates, so they work with many types including float, double, as well as vectors of any dimmension and other types which have addition, subtraction and assignment operators, as well as multiplication and divistion by float scalar.

# Source Code and Ray Tracer

Wed
19
Aug 2009

As some of you are interested in seeing source code of my 2D Software Renderer, I'll fullfill this request, although I don't think there is anything special about this code. Here is the file: SoftRender01.zip and here is a brief description of this project:

• It's definitely not finished.
• It uses WinAPI and my CommonLib library.
• AsyncConsole is a code snipped I've published before in my paper on Asynchronous Windows Console [pl].
• In the FontLoad files you can find simple example of using FreeType library.
• In the ImgLoad files you can find simple example of using DevIL library.
• In StdAfx and TokenWriter files you can find some of my code that is going to be included in upcoming version of my CommonLib.
• Files VariableManager, VariableTypes and MathVariableTypes contain quite smart "property" system (or maybe even reflection?) with text and binary serialization that I'm very satisfied with.
• And finally, Renderer itself just contains some of elementary CG algorithms implemented, so you better study Wikipedia and other sources for algorithms such as Line clipping (I've used Liang-Barsky method), Line drawing algorithm (I've used Bresenham method) etc.

BTW, I've started coding a raytracer. Of course the first primitive I support is a sphere and first effects I've coded are shadows and reflections. Ray tracing is not magic. It's just like "foreach pixel, foreach object, render", while rasterization is the opposite: "foreach object, foreach pixel, render". But It's quite fascinating that raytracer suppors directly some of effects that require special preprocessing in standard rasterization (like preparing shadow map and enviromental map in case of my effects).

# Three Ways to Calculate Mean and Variance

Mon
20
Jul 2009

There is a free e-book on DSP (Digital Signal Processing) called The Scientist and Engineer's Guide to Digital Signal Processing (by Steven W. Smith, Ph.D.). As I have a holiday now, I've started reading it and I've found some interesting information even in introductory chapters. For example, there are three algorithms to calculate mean and variance. Let's say we have some data in a vector of bytes and we want to calculate its statistics.

```std::vector<unsigned char> Bytes;
// Fill Bytes...
uint N = Bytes.size();```

```float Mean = 0.f, Variance = 0.f;
for (uint i = 0; i < N; i++)
Mean += (float)Bytes[i];
Mean /= (float)N;
for (uint i = 0; i < N; i++)
Variance += ((float)Bytes[i] - Mean) * ((float)Bytes[i] - Mean);
Variance /= (float)(N-1);```

Now the incremental one, which can return current mean and variance at any time while you can also post next piece of data. All we need to implement it is to keep track of current sample number, sum of samples and sum of squared samples. I've created template class for that, which can be parametrized by float, double, int or other numeric types.

```template <typename T>
class IncrementalMeanAndVarianceCalc
{
public:
IncrementalMeanAndVarianceCalc() : m_Sum(T()), m_SqSum(T()), m_Count(0) { }
void Clear() { m_Sum = m_SqSum = T(); m_Count = 0; }

{
m_Sum += v;
m_SqSum += v*v;
m_Count++;
}
void Add(const T *values, uint valCount)
{
for (uint i = 0; i < valCount; i++)
{
m_Sum += values[i];
m_SqSum += values[i]*values[i];
}
m_Count += valCount;
}

bool IsEmpty() { return m_Count == 0; }
uint GetCount() { return m_Count; }
T GetSum() { return m_Sum; }
T GetSqSum() { return m_SqSum; }

T GetMean()
{
assert(!IsEmpty());
return m_Sum / m_Count;
}
T GetVariance(bool varianceBiased = true)
{
if (varianceBiased)
return (m_SqSum - m_Sum*m_Sum/m_Count) / (m_Count-1);
else
return (m_SqSum - m_Sum*m_Sum/m_Count) / m_Count;
}
void GetMeanAndVariance(T &outMean, T &outVariance, bool varianceBiased = true)
{
assert(!IsEmpty());
outMean = m_Sum / m_Count;
if (varianceBiased)
outVariance = (m_SqSum - m_Sum*m_Sum/m_Count) / (m_Count - 1);
else
outVariance = (m_SqSum - m_Sum*m_Sum/m_Count) / m_Count;
}

private:
T m_Sum, m_SqSum;
uint m_Count;
};```

Finally, there is an algorithm to calculate mean and variance from histogram. It is very efficient method. If we have only 256 possible values for each sample, we don't have to do binning and calculating histogram is very simple:

```uint Histogram[256] = { 0 };
for (uint i = 0; i < N; i++)
Histogram[Bytes[i]]++;```

Now, the mean and variance can be calculated this way:

```float Mean = 0.f, Variance = 0.f;
for (uint i = 0; i < 256; i++)
Mean += (float)i * Histogram[i];
Mean /= (float)N;
for (uint i = 0; i < 256; i++)
Variance += (i - Mean)*(i - Mean) * Histogram[i];
Variance /= N - 1;```

# What is GUID and how to use it

Thu
16
Jul 2009

There are many ways to identify objects in a collection. One of them is to keep direct pointers to objects. Another way is to simply index their positions in an array. We can make user see only some strange indirect "handles" without any obvious meaning, like HWND or FILE*. We can also give objects string names or some numeric identifiers. SQL databases often use integer number to identify record in a table and new identifiers are generated as subsequent numbers. Another solution is to use GUIDs.

GUID means Globally Unique Identifier and it is a kind of general purpose, 128-bit numeric identifier. GUIDs are generated from some random data. We can be sure about their uniqueness in any context because of their size, as 2^128 possible values is more than stars in the whole observable universe :) That's why they are called "globally unique".

We could just generate 128 bits of data from any good pseudorandom number generator and call it good identifiers, but to standardize this idea, they have created standard - RFC4122 (it's actually UUID, but let's ignore the difference). This standard says how to express GUIDs as strings, made of hexadecimal numbers separated by dashes (for example "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"). You can find such GUIDs everywhere. For example, run "regedit.exe" and look into key HKEY_CLASSES_ROOT\CLSID or run your Visual C++ and run command Tools / Create GUID from the main menu.

The standard also defines bit structure of the UUID and algorithms to generate it. There are several possible algorithms. Some of them use system time or even network card MAC address. The simplest one is version 4 algorithm, which uses only random number generator. Some of bits have special meaning and code algorithm which have been used to generate particular GUID.

Here is my code in C++ defining GUID structure and the algorithm to generate it.