
Compare double to zero using epsilon

Today, I was looking through some C++ code (written by somebody else) and found this section:

double someValue = ...
if (someValue <  std::numeric_limits<double>::epsilon() && 
    someValue > -std::numeric_limits<double>::epsilon()) {
  someValue = 0.0;
}

I'm trying to figure out whether this even makes sense.

The documentation for epsilon() says:

The function returns the difference between 1 and the smallest value greater than 1 that is representable [by a double].

Does this apply to 0 as well, i.e. epsilon() is the smallest value greater than 0? Or are there numbers between 0 and 0 + epsilon that can be represented by a double?

If not, then isn't the comparison equivalent to someValue == 0.0?

The epsilon around 1 will most likely be much higher than that around 0, so there will probably be values between 0 and 0+epsilon_at_1. I guess the author of this section wanted to use something small, but he didn't want to use a magic constant, so he just used this essentially arbitrary value.
Comparing floating point numbers is tough, and using an epsilon or threshold value is even encouraged. Please refer to: cs.princeton.edu/introcs/91float and cygnus-software.com/papers/comparingfloats/comparingfloats.htm
First link is 403.99999999
epsilon() is the smallest positive value. So if we assume that epsilon() is e, we've got that 1+e != 1, so yes, epsilon is the smallest value greater than 0 and there are no numbers between 0 and 0 + e
IMO, in this case the usage of numeric_limits<>::epsilon is misleading and irrelevant. What we want is to assume 0 if the actual value differs by no more than some ε from 0. And ε should be chosen based on the problem specification, not on a machine-dependent value. I'd suspect that the current epsilon is useless, as even just a few FP operations could accumulate an error greater than that.

Yakov Galka

Assuming 64-bit IEEE double, there is a 52-bit mantissa and 11-bit exponent. Let's break it to bits:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^0 = 1

The smallest representable number greater than 1:

1.0000 00000000 00000000 00000000 00000000 00000000 00000001 × 2^0 = 1 + 2^-52

Therefore:

epsilon = (1 + 2^-52) - 1 = 2^-52

Are there any numbers between 0 and epsilon? Plenty... E.g. the minimal positive representable (normal) number is:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^-1022 = 2^-1022

In fact there are (1022 - 52 + 1)×2^52 = 4372995238176751616 numbers between 0 and epsilon, which is 47% of all the positive representable numbers...
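A quick way to see a few of these values on a concrete implementation is to print some of the limits directly. This is a minimal sketch (not part of the original answer), assuming 64-bit IEEE 754 doubles and C++11 for std::nextafter:

#include <cmath>
#include <cstdio>
#include <limits>

int main() {
    // The gap between 1.0 and the next representable double: 2^-52
    std::printf("epsilon        = %g\n", std::numeric_limits<double>::epsilon());
    // The smallest positive normal double: 2^-1022
    std::printf("min (normal)   = %g\n", std::numeric_limits<double>::min());
    // The smallest positive subnormal double: 2^-1074
    std::printf("denorm_min     = %g\n", std::numeric_limits<double>::denorm_min());
    // The very first double above 0.0 -- the same as denorm_min here
    std::printf("nextafter(0,1) = %g\n", std::nextafter(0.0, 1.0));
    // The last three are all far smaller than epsilon, yet distinct from 0.
}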


So weird that you can say "47% of the positive numbers" :)
@configurator: Nah, you cannot say that (no 'natural' finite measure exists). But you can say "47% of the positive representable numbers".
@ybungalobill I can't figure it out. The exponent has 11 bits: 1 sign bit and 10 value bits. Why is 2^-1022, and not 2^-1024, the smallest positive number?
@PavloDyban: simply because exponents do not have a sign bit. They are encoded as offsets: if the encoded exponent is 0 <= e < 2048 then the mantissa is multiplied by 2 to the power of e - 1023. E.g. exponent of 2^0 is encoded as e=1023, 2^1 as e=1024 and 2^-1022 as e=1. The value of e=0 is reserved for subnormals and the real zero.
@PavloDyban: also 2^-1022 is the smallest normal number. The smallest number is actually 0.0000 00000000 00000000 00000000 00000000 00000000 00000001 × 2^-1022 = 2^-1074. This is subnormal, meaning that the mantissa part is smaller than 1, so it is encoded with the exponent e=0.
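A short sketch of that encoding, assuming 64-bit IEEE 754 doubles (the bit extraction below is illustrative, not from the original comments):

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <limits>

int main() {
    const double samples[] = { 1.0, 2.0, std::numeric_limits<double>::min(), 0.0 };
    for (double d : samples) {
        std::uint64_t bits;
        std::memcpy(&bits, &d, sizeof bits);              // reinterpret the 64 bits of the double
        unsigned encodedExponent = (bits >> 52) & 0x7FF;  // the 11-bit biased exponent field
        // Expect: 1.0 -> 1023, 2.0 -> 1024, 2^-1022 -> 1, and 0.0 -> the reserved value 0.
        std::printf("%-13g has encoded exponent e = %u\n", d, encodedExponent);
    }
}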
Steve Jessop

The test certainly is not the same as someValue == 0. The whole idea of floating-point numbers is that they store an exponent and a significand. They therefore represent a value with a certain number of binary significant figures of precision (53 in the case of an IEEE double). The representable values are much more densely packed near 0 than they are near 1.

To use a more familiar decimal system, suppose you store a decimal value "to 4 significant figures" with exponent. Then the next representable value greater than 1 is 1.001 * 10^0, and epsilon is 1.000 * 10^-3. But 1.000 * 10^-4 is also representable, assuming that the exponent can store -4. You can take my word for it that an IEEE double can store exponents less than the exponent of epsilon.

You can't tell from this code alone whether it makes sense or not to use epsilon specifically as the bound, you need to look at the context. It may be that epsilon is a reasonable estimate of the error in the calculation that produced someValue, and it may be that it isn't.
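To make the difference concrete, here is a small check (a sketch assuming IEEE 754 doubles; 1e-300 is just an arbitrary value far below epsilon):

#include <cassert>
#include <limits>

int main() {
    double someValue = 1e-300;  // representable, and far smaller than epsilon
    bool withinEpsilon = someValue <  std::numeric_limits<double>::epsilon() &&
                         someValue > -std::numeric_limits<double>::epsilon();
    assert(withinEpsilon);      // passes the test from the question...
    assert(someValue != 0.0);   // ...yet it is not equal to zero
}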


Good point, but even if that is the case, a better practice would be to keep the error bound in a reasonably named variable and use it in the comparison. As it stands, it is no different from a magic constant.
Perhaps I should've been clearer in my question: I didn't question whether epsilon was a large enough "threshold" to cover computational error but whether this comparison is equal to someValue == 0.0 or not.
Skizz

There are numbers that exist between 0 and epsilon, because epsilon is the difference between 1 and the next highest number that can be represented above 1, not the difference between 0 and the next highest number that can be represented above 0 (if it were, that code would do very little):

#include <limits>

int main ()
{
  struct Doubles
  {
      double one;
      double epsilon;
      double half_epsilon;
  } values;

  values.one = 1.0;
  values.epsilon = std::numeric_limits<double>::epsilon();
  values.half_epsilon = values.epsilon / 2.0;  // representable, and distinct from zero, epsilon and one
}

Using a debugger, stop the program at the end of main and look at the results and you'll see that epsilon / 2 is distinct from epsilon, zero and one.
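If you would rather not attach a debugger, a similar check can be made by printing the values with enough digits. A small sketch, not part of the original answer:

#include <iomanip>
#include <iostream>
#include <limits>

int main() {
    const double epsilon = std::numeric_limits<double>::epsilon();
    std::cout << std::setprecision(17)
              << "one          = " << 1.0           << '\n'
              << "epsilon      = " << epsilon       << '\n'
              << "half_epsilon = " << epsilon / 2.0 << '\n';  // still non-zero and distinct
}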

So the code in the question takes values between +/- epsilon and makes them zero.


pbhd

An approximation of epsilon (the smallest possible difference) around a number (1.0, 0.0, ...) can be printed with the following program. It prints this output:
epsilon for 0.0 is 4.940656e-324
epsilon for 1.0 is 2.220446e-16
A little thinking makes it clear that the epsilon gets smaller the smaller the number is that we examine, because the exponent can adjust to the size of that number.

#include <stdio.h>
#include <assert.h>

/* Returns the smallest power of two whose addition to m still changes m. */
double getEps (double m) {
  double approx=1.0;
  double lastApprox=0.0;
  while (m+approx!=m) {   /* halve until adding approx no longer changes m */
    lastApprox=approx;
    approx/=2.0;
  }
  assert (lastApprox!=0);
  return lastApprox;
}

int main () {
  printf ("epsilon for 0.0 is %e\n", getEps (0.0));
  printf ("epsilon for 1.0 is %e\n", getEps (1.0));
  return 0;
}

What implementations have you checked? This is definitely not the case for GCC 4.7.
Daniel Laügt

The difference between X and the next value of X varies according to X.
epsilon() is only the difference between 1 and the next value of 1.
The difference between 0 and the next value of 0 is not epsilon().

Instead you can use std::nextafter to compare a double value with 0 as follows:

#include <cmath>   // std::nextafter
#include <limits>  // std::numeric_limits

bool same(double a, double b)
{
  return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b
    && std::nextafter(a, std::numeric_limits<double>::max()) >= b;
}

double someValue = ...
if (same (someValue, 0.0)) {
  someValue = 0.0;
}

+1 for mentioning nextafter; but note that this usage doesn't likely do what the programmer is intending. Assuming 64-bit IEEE 754, in your example same(0, 1e-100) returns false, which is probably not what the programmer wants. The programmer probably rather wants some small threshold to test for equality, e.g. +/-1e-6 or +/-1e-9, instead of +/-nextafter.
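For completeness, this is roughly what the comment above suggests: an absolute tolerance with a named, problem-specific bound. The helper name and the value 1e-9 are arbitrary placeholders, not taken from the answer:

#include <cmath>

// Hypothetical helper: compare against zero with an explicit, problem-specific tolerance.
bool nearlyZero(double value, double tolerance = 1e-9)
{
    return std::fabs(value) <= tolerance;
}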
Yakk - Adam Nevraumont

Suppose we are working with toy floating point numbers that fit in a 16 bit register. There is a sign bit, a 5 bit exponent, and a 10 bit mantissa.

The value of this floating point number is the mantissa, interpreted as a binary fraction, times two to the power of the exponent.

Around 1 the exponent equals zero. So the smallest digit of the mantissa is one part in 1024.

Near 1/2 the exponent is minus one, so the smallest part of the mantissa is half as large. With a five bit exponent it can reach negative 16, at which point the smallest part of the mantissa is worth about one part in 32 million. And at an exponent of negative 16, the values themselves are around one part in 32 thousand, much closer to zero than the epsilon around one (one part in 1024) we calculated above!

Now this is a toy floating point model that does not reflect all the quirks of a real floating point system, but the ability to represent values far smaller than epsilon carries over to real floating point values in much the same way.
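The same effect can be observed with real doubles by printing the gap to the next representable value at different magnitudes (a sketch assuming IEEE 754 doubles and C++11; the sample values are arbitrary):

#include <cmath>
#include <cstdio>

int main() {
    const double samples[] = { 1.0, 0.5, 1.0 / 32768.0, 1e-300 };
    for (double x : samples) {
        // The gap between x and the next double above it shrinks
        // roughly in proportion to the magnitude of x.
        std::printf("gap above %-8g is %g\n", x, std::nextafter(x, INFINITY) - x);
    }
}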


Arsenii Fomin

You can't apply this to 0, because of the mantissa and exponent parts. Due to the exponent you can store very small numbers, which are smaller than epsilon, but when you try to do something like (1.0 - "very small number") you'll get 1.0. Epsilon is an indicator not of the value, but of the value's precision, which lies in the mantissa. It shows how many correct consecutive decimal digits of a number we can store.
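For example (a sketch assuming IEEE 754 doubles; 1e-20 is an arbitrary value below epsilon):

#include <cassert>

int main() {
    double verySmall = 1e-20;        // storable on its own, thanks to the exponent
    assert(verySmall != 0.0);        // a genuine non-zero value...
    assert(1.0 - verySmall == 1.0);  // ...that vanishes next to 1.0, whose precision is governed by epsilon
}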


cababunga

So let's say the system cannot distinguish 1.000000000000000000000 and 1.000000000000000000001, that is, 1.0 and 1.0 + 1e-20. Do you think there still are some values that can be represented between -1e-20 and +1e-20?


Except for zero, I don't think that there are values between -1e-20 and +1e-20. But just because I think this doesn't make it true.
@SebastianKrysmanski: it's not true, there are lots of floating-point values between 0 and epsilon. Because it's floating point, not fixed point.
The smallest representable value that is distinct from zero is limited by the number of bits allocated to the exponent. With an 11 bit exponent, the smallest normal double is 2^-1022, and subnormals go down to 2^-1074.
default

I think that depends on the precision of your computer. Take a look at this table: you can see that if your epsilon is represented as a double, but your precision is higher, the comparison is not equivalent to

someValue == 0.0

Good question anyway!


supercat

With IEEE floating-point, between the smallest non-zero positive value and the smallest non-zero negative value, there exist two values: positive zero and negative zero. Testing whether a value is between the smallest non-zero values is equivalent to testing for equality with zero; the assignment, however, may have an effect, since it would change a negative zero to a positive zero.

It would be conceivable that a floating-point format might have three values between the smallest finite positive and negative values: positive infinitesimal, unsigned zero, and negative infinitesimal. I am not familiar with any floating-point formats that in fact work that way, but such a behavior would be perfectly reasonable and arguably better than that of IEEE (perhaps not enough better to be worth adding extra hardware to support it, but mathematically 1/(1/INF), 1/(-1/INF), and 1/(1-1) should represent three distinct cases illustrating three different zeroes). I don't know whether any C standard would mandate that signed infinitesimals, if they exist, would have to compare equal to zero. If they do not, code like the above could usefully ensure that e.g. dividing a number repeatedly by two would eventually yield zero rather than being stuck on "infinitesimal".


Isn't "1/(1-1)" (from your example) infinity rather than zero?
The quantities (1-1), (1/INF), and (-1/INF) all represent zero, but dividing a positive number by each of them should in theory yield three different results (IEEE math regards the first two as identical).
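A small illustration of how IEEE doubles treat those zeroes (a sketch; it assumes IEEE 754 semantics, where dividing a finite number by zero yields an infinity):

#include <cassert>
#include <cmath>

int main() {
    double posZero = 1.0 / INFINITY;     // +0.0
    double negZero = -1.0 / INFINITY;    // -0.0
    assert(posZero == negZero);          // the two zeroes compare equal...
    assert(std::signbit(negZero));       // ...although the sign bit still distinguishes them
    assert(1.0 / posZero ==  INFINITY);  // and dividing by them yields
    assert(1.0 / negZero == -INFINITY);  // differently signed infinities
}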
Hassan Bahaloo

"the difference between 1 and the smallest value greater than 1" means one + "machine zero" which is around 10^-8 or 10^-16 depending whether you use float of double variables, respectively. To see the machine zero, you can divide 1 by 2 until the computer sees 1 = 1+1/2^p, as below:

#include <iostream>
#include <cmath>
using namespace std;

int main() {
    float a = 1;
    int n = 0;
    while (1 + a != 1) {   // halve a until adding it to 1 no longer changes 1
        a = a / 2;
        n += 1;
    }
    cout << n - 1 << endl << pow(2, -n);
    return 0;
}

Community

Also, a good reason for having such a function is to remove "denormals" (those very small numbers that can no longer use the implied leading "1" and have a special FP representation). Why would you want to do this? Because some machines (in particular, some older Pentium 4s) get really, really slow when processing denormals. Others just get somewhat slower. If your application doesn't really need these very small numbers, flushing them to zero is a good solution. Good places to consider this are the last steps of any IIR filters or decay functions.

See also: Why does changing 0.1f to 0 slow down performance by 10x?

and http://en.wikipedia.org/wiki/Denormal_number
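If the goal really is only to get rid of denormals, rather than everything below epsilon, a more targeted sketch could use std::fpclassify (assuming C++11; this is not the code from the question):

#include <cmath>

// Flush only subnormal ("denormal") values to zero, leaving small normal numbers intact.
double flushDenormalToZero(double x)
{
    return std::fpclassify(x) == FP_SUBNORMAL ? 0.0 : x;
}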


This removes many more numbers than just denormalised numbers. It changes Planck's constant or the mass of an electron to zero, which will give you very, very wrong results if you use these numbers.