Floating-Point Formats Cheatsheet

Warning! Some information on this page is older than 3 years now. I keep it for reference, but it probably doesn't reflect my current knowledge and beliefs.

Jun 2013

Floating-Point Formats Cheatsheet

Floating-point numbers (or floats in short) are not as simple as integer numbers. There is much to be understood when dealing with these numbers on low level - basic things like the sign + exponent + significand representation (and that exponent is biased, while significand has implicit leading 1), why you should never compare calculation results operator ==, that some fractions with finite decimal representation cannot be represented exactly in binary etc., as well as why there are two zeros -0 and +1, what are infinite, NaN (Not a Number) and denorm (denormal numbers) and how they behave. I won't describe it here. It's not an arcane knowledge - you can find many information about this on the Web, starting from Wikipedia article.

But after you understand these concepts, quantitative questions come to mind, like: how many significant decimal digits can we expect from precision of particular float representation (half, single, double)? What is the minimum non-zero value representable in that format? What range of integers can we represent exactly? What is the maximum value? And finally: if our game crashes with "Access violation, reading location 0x3f800000", what chances are that we mistaken pointer for a float number, as this is one of common values, meaning 1.0?

So to organize such knowledge, I created a "Floating-Point Formats" cheatsheet:


Comments (0) | Tags: productions math | Author: Adam Sawicki | Share


(No comments)

Post comment

Nick *
Your name or nickname
Your contact information (optional, will not be shown)
Text *
Content of your comment
Calculate *
(* - required field)
STAT NO AD [Stat] [Admin] [STAT NO AD] [pub] [Mirror] Copyright © 2004-2016 Adam Sawicki
Copyright © 2004-2016 Adam Sawicki