How can I check for NaN values?

float('nan') represents NaN (not a number). But how do I check for it?

For some history of NaN in Python, see PEP 754. python.org/dev/peps/pep-0754

Boris Verkhovskiy

Use math.isnan:

>>> import math
>>> x = float('nan')
>>> math.isnan(x)
True

@charlie-parker : In Python3, math.isnan is still a part of the math module. docs.python.org/3/library/math.html#math.isnan . Use numpy.isnan if you wish, this answer is just a suggestion.
Is math.isnan preferred to np.isnan()?
@TMWP possibly... import numpy takes around 15 MB of RAM, whereas import math takes some 0.2 MB
@TMWP: If you're using NumPy, numpy.isnan is a superior choice, as it handles NumPy arrays. If you're not using NumPy, there's no benefit to taking a NumPy dependency and spending the time to load NumPy just for a NaN check (but if you're writing the kind of code that does NaN checks, it's likely you should be using NumPy).
@jungwook That actually doesn't work. Your expression is always false. That is, float('nan') == float('nan') returns False — which is a strange convention, but basically part of the definition of a NaN. The approach you want is actually the one posted by Chris Jester-Young, below.
Chris Jester-Young

The usual way to test for a NaN is to see if it's equal to itself:

def isNaN(num):
    return num != num

Word of warning: quoting Bear's comment below "For people stuck with python <= 2.5. Nan != Nan did not work reliably. Used numpy instead." Having said that, I've not actually ever seen it fail.
I'm sure that, given operator overloading, there are lots of ways I could confuse this function. Go with math.isnan().
It says in the 754 spec mentioned above that NaN == NaN should always be false, although it is not always implemented as such. Isn't it possible that this is how math and/or numpy check this under the hood anyway?
Even though this works and, to a degree makes sense, I'm a human with principles and I hereby declare this as prohibited witchcraft. Please use math.isnan instead.
@djsadinoff Is there any other drawback to confusion? math.isnan() can't check string values, so this solution seems more robust.
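To illustrate the operator-overloading concern raised in the comments above, here is a minimal sketch. The class AlwaysDifferent and the helper is_nan_by_comparison are hypothetical names made up for this illustration. It shows how a self-inequality check can be fooled by an object that is not a float at all, whereas math.isnan rejects it with a TypeError.

```python
import math


def is_nan_by_comparison(value):
    """Self-inequality test: True for IEEE 754 NaNs."""
    return value != value


class AlwaysDifferent:
    """Hypothetical class whose != always reports a difference."""

    def __ne__(self, other):
        return True


nan = float("nan")
odd = AlwaysDifferent()

print(is_nan_by_comparison(nan))  # True, as expected
print(is_nan_by_comparison(odd))  # True, although odd is not a NaN

print(math.isnan(nan))            # True
try:
    math.isnan(odd)
    math_accepted_odd = True
except TypeError:
    # math.isnan refuses objects it cannot convert to a float
    math_accepted_odd = False
print(math_accepted_odd)          # False
```

Which behaviour is preferable depends on whether you want unknown types to raise or to be quietly classified.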
Boris Verkhovskiy

numpy.isnan(number) tells you if it's NaN or not.


Works in python version 2.7 too.
numpy.all(numpy.isnan(data_list)) is also useful if you need to determine if all elements in the list are nan
No need for NumPy: all(map(math.isnan, [float("nan")]*5))
When this answer was written 6 years ago, Python 2.5 was still in common use, and math.isnan was not part of the standard library. Nowadays I'm really hoping that's not the case in many places!
Note that np.isnan() doesn't handle the decimal.Decimal type (like many NumPy functions). math.isnan() does handle it.
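The two standard-library points from the comments above can be tried without NumPy. This is a small sketch, assuming a quiet Decimal NaN (a signaling NaN, Decimal("sNaN"), cannot be converted to a float and is not covered here):

```python
import math
from decimal import Decimal

d = Decimal("NaN")

print(math.isnan(d))  # math.isnan converts the Decimal to a float first
print(d.is_nan())     # Decimal also has its own NaN test

# The all(map(...)) idiom from the comment above, on a plain list
values = [float("nan")] * 5
print(all(map(math.isnan, values)))
```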
petezurich

Here are three ways to test whether a variable is NaN or not.

import pandas as pd
import numpy as np
import math

# For a single variable, all three libraries return a single boolean
x1 = float("nan")

print(f"It's pd.isna: {pd.isna(x1)}")
print(f"It's np.isnan: {np.isnan(x1)}")
print(f"It's math.isnan: {math.isnan(x1)}")

Output

It's pd.isna: True
It's np.isnan: True
It's math.isnan: True

pd.isna(value) saved a lot of troubles! working like a charm!
pd.isnan() or pd.isna()? That is the question :D
version 3 of this answer was correct and well formatted. this one (now 7) is wrong again. rolled back as "dont want your edit" while the edits improved the answer, wtf.
Side note: I have found if not np.isnan(x): to be quite useful.
x0s

Here is an answer that works with:

NaN implementations that respect the IEEE 754 standard, i.e. Python's NaN: float('nan'), numpy.nan...

any other object: a string or whatever (it does not raise an exception when it encounters one)

A NaN implemented according to the standard is the only value for which the inequality comparison with itself should return True:

def is_nan(x):
    return (x != x)

And some examples:

import numpy as np
values = [float('nan'), np.nan, 55, "string", lambda x : x]
for value in values:
    print(f"{repr(value):<8} : {is_nan(value)}")

Output:

nan      : True
nan      : True
55       : False
'string' : False
<function <lambda> at 0x000000000927BF28> : False

The series I'm checking contains strings, with missing values showing up as 'nans' (???), so this solution works where others failed.
numpy.nan is a regular Python float object, just like the kind returned by float('nan'). Most NaNs you encounter in NumPy will not be the numpy.nan object.
numpy.nan defines its NaN value on its own in the underlying library in C. It does not wrap python's NaN. But now, they both comply with IEEE 754 standard as they rely on C99 API.
@user2357112supportsMonica: Python and NumPy NaNs actually don't behave the same way: float('nan') is float('nan') is False (non-unique), while np.nan is np.nan is True (unique)
@x0s: That has nothing to do with NumPy. np.nan is a specific object, while each float('nan') call produces a new object. If you did nan = float('nan'), then you'd get nan is nan too. If you constructed an actual NumPy NaN with something like np.float64('nan'), then you'd get np.float64('nan') is not np.float64('nan') too.
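The identity-versus-equality distinction discussed in the comments above can be checked with plain floats. A short sketch, not tied to NumPy, that also shows one place where the distinction matters in practice (list membership, which tests identity before equality):

```python
nan = float("nan")

print(nan is nan)  # True: the same object
print(nan == nan)  # False: NaN never compares equal, even to itself
print(nan != nan)  # True

# Containment checks identity before equality, so the result depends on
# whether the very same object is in the list.
print(nan in [nan])           # True
print(float("nan") in [nan])  # False: a different NaN object
```

This is one reason to prefer an explicit NaN test over `in` or list.count when searching a list for NaNs.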
Grzegorz

It seems that checking if it's equal to itself

x!=x

is the fastest.

import pandas as pd
import numpy as np
import math

x = float('nan')

%timeit x!=x
44.8 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit math.isnan(x)
94.2 ns ± 0.955 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit pd.isna(x)
281 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit np.isnan(x)
1.38 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


It's worth noting that this works even when infinities are involved. That is, if z = float('inf'), z != z evaluates to False.
On my computer, z = float('inf') followed by z == z gives True, and x = float('nan') followed by x == x gives False.
In most (if not all) cases, these speed differences will only be relevant, if repeated numerous times. Then you'll be using numpy or another tensor library, anyway.
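The %timeit lines above need IPython. Outside IPython, the standard-library timeit module gives a rough equivalent. This is only a sketch for the two checks that need no third-party packages; the repetition count n is an arbitrary choice, and absolute numbers will vary by machine and Python version.

```python
import math
import timeit

x = float("nan")

# Time each check over the same number of calls.
n = 200_000
t_compare = timeit.timeit("x != x", globals={"x": x}, number=n)
t_math = timeit.timeit(
    "isnan(x)", globals={"x": x, "isnan": math.isnan}, number=n
)

print(f"x != x        : {t_compare / n * 1e9:.1f} ns per call")
print(f"math.isnan(x) : {t_math / n * 1e9:.1f} ns per call")
```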
DaveTheScientist

I actually just ran into this, but for me it was checking for nan, -inf, or inf. I just used

if float('-inf') < float(num) < float('inf'):

This is true for numbers, false for nan and both inf, and will raise an exception for things like strings or other types (which is probably a good thing). Also this does not require importing any libraries like math or numpy (numpy is so damn big it doubles the size of any compiled application).


math.isfinite was not introduced until Python 3.2, so given the answer from @DaveTheScientist was posted in 2012 it was not exactly "reinvent[ing] the wheel" - solution still stands for those working with Python 2.
This can be useful for people who need to check for NaN in a pd.eval expression. For example pd.eval(float('-inf') < float('nan') < float('inf')) will return False
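As the comment above notes, math.isfinite has been available since Python 3.2. Here is a minimal sketch, assuming Python 3.2 or later, that compares the chained comparison from the answer with math.isfinite. The helper name is_finite_by_comparison is made up for illustration.

```python
import math


def is_finite_by_comparison(num):
    """The chained comparison from the answer above."""
    return float("-inf") < float(num) < float("inf")


for value in (1.5, 0, float("nan"), float("inf"), float("-inf")):
    # Both tests should agree for every value
    print(value, is_finite_by_comparison(value), math.isfinite(value))
```

Both reject NaN and the two infinities. One difference is that the comparison version calls float(num) first, so it also accepts numeric strings such as "1.5", while math.isfinite raises a TypeError for any string.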
Tomalak

math.isnan()

or compare the number to itself. NaN is always != NaN, otherwise (e.g. if it is a number) the comparison should succeed.


For people stuck with python <= 2.5. Nan != Nan did not work reliably. Used numpy instead.
Idok

Well, I came to this post because I've had some issues with the function:

math.isnan()

There is a problem when you run this code:

import math
a = "hello"
math.isnan(a)  # raises TypeError: a str is not a float

It raises an exception. My solution is to add another check:

def is_nan(x):
    return isinstance(x, float) and math.isnan(x)

It was probably downvoted because isnan() takes a float, not a string. There's nothing wrong with the function, and the problems are only in his attempted use of it. (For that particular use case his solution is valid, but it's not an answer to this question.)
Be careful with checking for types in this way. This will not work e.g. for numpy.float32 NaN's. Better to use a try/except construction: def is_nan(x): try: return math.isnan(x) except: return False
NaN does not mean that a value is not a valid number. It is part of IEEE floating point representation to specify that a particular result is undefined. e.g. 0 / 0. Therefore asking if "hello" is nan is meaningless.
this is better because NaN can land in any list of strings,ints or floats, so useful check
I had to implement exactly this for handling string columns in pandas.
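The try/except variant suggested in the comments above can be written out as follows. This is a sketch of that suggestion, with the bare except narrowed to the exceptions math.isnan is known to raise for unsuitable input; the narrowing is my assumption, not part of the original comment. NumPy scalar types such as numpy.float32 are not exercised here.

```python
import math


def is_nan(x):
    """Return True only if x can be interpreted as a float NaN."""
    try:
        return math.isnan(x)
    except (TypeError, ValueError, OverflowError):
        # x is a str, None, a list, a too-large int, ...: not a NaN
        return False


print(is_nan(float("nan")))  # True
print(is_nan("hello"))       # False
print(is_nan(None))          # False
print(is_nan(3))             # False
```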
Josh Lee

Another method if you're stuck on <2.6, you don't have numpy, and you don't have IEEE 754 support:

def isNaN(x):
    return str(x) == str(1e400*0)

Mauro Bianchi

With python < 2.6 I ended up with

def isNaN(x):
    return str(float(x)).lower() == 'nan'

This works for me with python 2.5.1 on a Solaris 5.9 box and with python 2.6.5 on Ubuntu 10


This isn't too portable, as Windows sometimes calls this -1.#IND
Mahdi

I am receiving the data from a web-service that sends NaN as a string 'Nan'. But there could be other sorts of string in my data as well, so a simple float(value) could throw an exception. I used the following variant of the accepted answer:

import math

def isnan(value):
    try:
        return math.isnan(float(value))
    except Exception:
        # anything that cannot be converted to a float is not NaN
        return False

Requirement:

isnan('hello') == False
isnan('NaN') == True
isnan(100) == False
isnan(float('nan')) == True

or try: int(value)
@chwi so what does your suggestion tell about value being NaN or not?
Well, being "not a number", anything that cannot be cast to an int is, I guess, in fact not a number, and the try statement will fail? Try, return true, except return false.
@chwi Well, taking "not a number" literally, you are right, but that's not the point here. In fact, I am looking for exactly the semantics of NaN (in Python, what you would get from float('inf') * 0), so although the string 'Hello' is not a number, it is also not NaN, because NaN is still a numeric value!
@chwi: You would be correct if the exception handling were for a specific exception. But in this answer, a generic exception is handled, so there is no need to check int(value). For every exception, False will be returned.
siberiawolf61

All the methods to tell if the variable is NaN or None:

None type

In [1]: import math

In [2]: a = None
In [3]: not a
Out[3]: True

In [4]: len(a or ()) == 0
Out[4]: True

In [5]: a == None
Out[5]: True

In [6]: a is None
Out[6]: True

In [7]: a != a
Out[7]: False

In [9]: math.isnan(a)
Traceback (most recent call last):
  File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
    math.isnan(a)
TypeError: a float is required

In [10]: len(a) == 0
Traceback (most recent call last):
  File "<ipython-input-10-65b72372873e>", line 1, in <module>
    len(a) == 0
TypeError: object of type 'NoneType' has no len()

NaN type

In [11]: b = float('nan')
In [12]: b
Out[12]: nan

In [13]: not b
Out[13]: False

In [14]: b != b
Out[14]: True

In [15]: math.isnan(b)
Out[15]: True
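
Putting the two sessions above together, a single check for "None or NaN" could look like this. It is only a sketch; the name is_none_or_nan is hypothetical, and it deliberately treats only float NaNs (not strings such as 'nan') as missing.

```python
import math


def is_none_or_nan(value):
    """Hypothetical helper: True for None and for float NaN."""
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)


for candidate in (None, float("nan"), 0.0, "", "nan"):
    print(repr(candidate), is_none_or_nan(candidate))
```

Unlike `not a`, this does not report 0.0 or an empty string as missing.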

petezurich

How to remove NaN (float) item(s) from a list of mixed data types

If you have mixed types in an iterable, here is a solution that does not use numpy:

from math import isnan

Z = ['a','b', float('NaN'), 'd', float('1.1024')]

[x for x in Z if not (
                      type(x) == float # let's drop all float values…
                      and isnan(x) # … but only if they are nan
                      )]
['a', 'b', 'd', 1.1024]

Short-circuit evaluation means that isnan will not be called on values that are not of type 'float', as False and (…) quickly evaluates to False without having to evaluate the right-hand side.


Valentin Goikhman

In Python 3.6, calling math.isnan(x) or np.isnan(x) on a string value x raises an error, so I can't check whether a given value is NaN unless I know beforehand that it's a number. The following seems to solve this issue:

x = float('nan')  # example value; may be of any type

# str(x) == 'nan' also matches the string 'nan', so exclude strings explicitly
if str(x) == 'nan' and not isinstance(x, str):
    print('NaN')
else:
    print('non NaN')

Erfan

A comparison of pd.isna, math.isnan and np.isnan, and of their flexibility in dealing with different types of objects.

The table below shows if the type of object can be checked with the given method:


+------------+-----+---------+------+--------+------+
|   Method   | NaN | numeric | None | string | list |
+------------+-----+---------+------+--------+------+
| pd.isna    | yes | yes     | yes  | yes    | yes  |
| math.isnan | yes | yes     | no   | no     | no   |
| np.isnan   | yes | yes     | no   | no     | yes  | <-- # will error on mixed type list
+------------+-----+---------+------+--------+------+

pd.isna

The most flexible method to check for different types of missing values.

None of the answers cover the flexibility of pd.isna. While math.isnan and np.isnan return True for NaN values, they cannot check other types of objects such as None or strings. Both raise an error on those, so checking a list with mixed types is cumbersome. pd.isna, by contrast, is flexible and returns the correct boolean for different kinds of types:

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: missing_values = [3, None, np.nan, pd.NA, pd.NaT, '10']

In [4]: pd.isna(missing_values)
Out[4]: array([False,  True,  True,  True,  True, False])

J11

For nan of type float

>>> import pandas as pd
>>> value = float('nan')
>>> type(value)
<class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
<class 'str'>
>>> pd.isnull(value)
False

Max Kleiner

For strings in pandas, use pd.isnull:

if not pd.isnull(atext):
    for word in nltk.word_tokenize(atext):

Here is the full function, used for feature extraction with NLTK:

import nltk
import pandas as pd

# default_stopwords is assumed to be a collection of stopwords defined elsewhere
def act_features(atext):
    features = {}
    if not pd.isnull(atext):
        for word in nltk.word_tokenize(atext):
            if word not in default_stopwords:
                features['cont({})'.format(word.lower())] = True
    return features

What for this reduction?
isnull returns true for not just NaN values.