floating point numbers

Hello! Here are some questions & answers. The goal isn't to get all the questions "right". Instead, the goal is to learn something! If you find a topic you're interested in learning more about, I'd encourage you to look it up and learn more.

Can 0.1 be exactly represented in a floating point number?

Nope!

Numbers like 0.1 and 0.3 have short representations in decimal, but in binary their expansions repeat forever (0.1 is 0.000110011001100... in binary). A float only has a fixed number of bits, so these numbers get rounded to the nearest available representation.
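If you'd like to see the approximation for yourself, here's a small sketch in Python (chosen just for illustration; it assumes floats are 64-bit IEEE 754 doubles, which is the case on mainstream platforms). Converting a float to Decimal or Fraction shows the value that is actually stored:

```python
from decimal import Decimal
from fractions import Fraction

# Decimal(float) and Fraction(float) convert the stored binary value exactly,
# with no extra rounding, so they reveal what the float really holds.
stored = Decimal(0.1)
exact_ratio = Fraction(0.1)

print(stored)                           # a long decimal slightly above 0.1
print(exact_ratio)                      # a fraction whose denominator is a power of two
print(exact_ratio == Fraction(1, 10))   # False: the stored value is not 1/10
```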

Can 4.25 be exactly represented in a floating point number?

Yes!

That one happens to have an exact binary representation: 100.01. More generally, fractions where the denominator is a power-of-two can be represented exactly with a float, e.g. 1/2, 3/4, 7/32.
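Here's a quick way to check this, sketched in Python (assuming 64-bit IEEE 754 doubles). The built-in as_integer_ratio() method returns the exact fraction a float stores:

```python
from fractions import Fraction

# 4.25 is 17/4, and 4 is a power of two, so nothing is lost.
print((4.25).as_integer_ratio())          # (17, 4)
print(Fraction(4.25) == Fraction(17, 4))  # True

# The other examples from the text are exact as well.
for numerator, denominator in [(1, 2), (3, 4), (7, 32)]:
    as_float = numerator / denominator
    print(Fraction(as_float) == Fraction(numerator, denominator))  # True each time
```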

When working with floats, does 0.1 + 0.2 equal 0.3?

Nope!

None of these numbers has an exact representation in binary, so each will be rounded to the nearest available representation. When the addition of 0.1 and 0.2 is done, these rounding errors combine to give a result which is not the closest available approximation to 0.3.
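You can try this one directly. A minimal sketch in Python (assuming 64-bit IEEE 754 doubles):

```python
total = 0.1 + 0.2

print(total == 0.3)   # False
print(repr(total))    # 0.30000000000000004
print(repr(0.3))      # 0.3
```

The sum lands on a float that is very close to, but not the same as, the float you get by writing 0.3.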

Do I have to accept that floating point calculations are always 'fuzzy' and have some unknowable rounding error on them?

No, you don't!

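The floating point standard (IEEE 754) specifies 'exact rounding' (the standard's own term is 'correctly rounded') on basic operations (addition, subtraction, multiplication, division). In the default round-to-nearest mode, you're guaranteed to get the available representation which is closest to the true result of the operation on the inputs it was given!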
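One way to convince yourself of this guarantee is to do the same operation in exact rational arithmetic and round once at the end. The sketch below is in Python; the helper names are made up for illustration, and it assumes CPython's Fraction-to-float conversion rounds to the nearest float (which it does in current CPython, as far as I know):

```python
from fractions import Fraction

def exactly_rounded_sum(a, b):
    """Add the stored values exactly, then round once to the nearest float."""
    return float(Fraction(a) + Fraction(b))

def exactly_rounded_product(a, b):
    """Multiply the stored values exactly, then round once to the nearest float."""
    return float(Fraction(a) * Fraction(b))

# The hardware result of a single operation matches the "compute exactly,
# round once" result.
print(0.1 + 0.2 == exactly_rounded_sum(0.1, 0.2))      # True
print(0.1 * 0.3 == exactly_rounded_product(0.1, 0.3))  # True
```

Note the inputs 0.1, 0.2 and 0.3 are already rounded when they are stored; the guarantee is about the operation itself.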

If my calculation only involves integers, am I guaranteed to get the exact result?

Not quite!

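The rounding rules give exact results as long as every input, intermediate value and result is an integer small enough to be stored exactly (up to 2^53 in magnitude for a 64-bit double). If at any stage you have a number too big to exactly store in a float, you'll get roundoff error.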
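Here's a small sketch of where integers stop being exact, in Python (assuming 64-bit IEEE 754 doubles, where the limit is 2^53):

```python
limit = 2**53  # every integer up to here fits exactly in a 64-bit double

# Just below the limit, integer arithmetic in floats is exact.
print(float(limit - 1) + 1.0 == float(limit))  # True

# At the limit, adding 1 has no effect: limit + 1 is not representable,
# so the result rounds back to limit.
print(float(limit) + 1.0 == float(limit))      # True

# Converting the integer limit + 1 to a float loses the final 1 as well.
print(int(float(limit + 1)) == limit)          # True
```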

If I call sqrt(x), am I certain the answer will be exactly rounded?

Yes!

Just like the basic arithmetic operations, square root is guaranteed to give the closest available approximation to the true result!
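If you want to check this rather than take it on trust, here's one possible test, sketched in Python. The function name is made up for illustration; it assumes Python 3.9+ (for math.nextafter) and that math.sqrt uses an IEEE 754 square root underneath. The idea: the true square root must lie between the midpoints to the two neighbouring floats.

```python
import math
from fractions import Fraction

def sqrt_is_exactly_rounded(x):
    """Check that math.sqrt(x) is the float closest to the true root (x > 0)."""
    r = math.sqrt(x)
    below = math.nextafter(r, 0.0)        # next float towards zero
    above = math.nextafter(r, math.inf)   # next float towards infinity
    lower_mid = (Fraction(r) + Fraction(below)) / 2
    upper_mid = (Fraction(r) + Fraction(above)) / 2
    # Compare squares exactly, so no floating point rounding sneaks in.
    return lower_mid**2 <= Fraction(x) <= upper_mid**2

print(all(sqrt_is_exactly_rounded(x) for x in (2.0, 3.0, 0.1, 10.0, 1e10)))
```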

If I call sin(x), am I certain the answer will be exactly rounded?

Nope!

Functions like sin, cos, log, etc. are implementation dependent. Depending on your implementation, you could get very good or very poor accuracy. Of the common math library functions, sqrt() is the only one that IEEE 754 requires to be exactly rounded.

If I have a calculation involving several basic operations (e.g. a*b + c*d), will I get an exactly rounded result?

Nope!

The exact rounding guarantee applies to the individual add and multiply operations. When a bunch of these are combined, you get multiple roundings, and the final answer is not guaranteed to be the closest available approximation to the true result.
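Here's a hand-picked example where this goes wrong, sketched in Python (assuming 64-bit IEEE 754 doubles and that each operation is rounded separately). The values are chosen so that a*b is just barely below 1 and gets rounded up to exactly 1.0, which the subtraction then cancels completely:

```python
from fractions import Fraction

a = 1 + 2**-30   # stored exactly
b = 1 - 2**-30   # stored exactly
c = -1.0
d = 1.0

# Float version: a*b is rounded, c*d is rounded, then the sum is rounded.
computed = a * b + c * d

# Exact version: do everything in rational arithmetic, round once at the end.
exact = Fraction(a) * Fraction(b) + Fraction(c) * Fraction(d)
best = float(exact)

print(computed)           # 0.0
print(best == -2**-60)    # True: the true answer is a small nonzero number
print(computed == best)   # False
```

The true result is -2^-60, which a float can store exactly, but the two-step float calculation returns 0.0.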

On modern hardware, do basic floating point calculations follow IEEE 754 rounding rules?

It depends!

The language and compiler sit between your code and the hardware. Some languages (e.g. C and C++) don't promise this, and special (non-default) compiler settings can be needed if you want rounding to follow IEEE 754 rules.

I really need exact rounding on a complicated calculation - is that possible?

Yes!

There are dedicated methods and software libraries for doing this; check out 'arbitrary precision arithmetic'. It's usually a lot slower than using standard floating point numbers though.
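As one example of what such tools look like, Python's standard library has two: fractions (exact rational arithmetic) and decimal (decimal arithmetic with a precision you choose). This is only a sketch of the idea, and other languages have their own libraries:

```python
from decimal import Decimal, localcontext
from fractions import Fraction

# Exact rational arithmetic: no rounding happens at all.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
print(exact_sum == Fraction(3, 10))   # True

# Decimal arithmetic with 50 significant digits instead of the usual ~16.
with localcontext() as ctx:
    ctx.prec = 50
    root_two = Decimal(2).sqrt()
print(root_two)
```

Fraction stays exact but only covers rational numbers; Decimal still rounds, just at whatever precision you ask for.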