How do I get the row count of a Pandas DataFrame?

python pandas dataframe

How do I get the number of rows of a pandas dataframe df?

OK, I found out: I should have called the method rather than accessed the attribute, so it should be df.count(), not df.count.

^ Dangerous! Beware that df.count() will only return the count of non-NA/NaN rows for each column. You should use df.shape[0] instead, which will always correctly tell you the number of rows.
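A minimal sketch of the difference, assuming a made-up frame with one missing value (the column names "a" and "b" are purely illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" contains one NaN.
df = pd.DataFrame({"a": [1, 2, 3], "b": [1.0, np.nan, 3.0]})

non_na_counts = df.count()  # Series: non-NA count per column
n_rows = df.shape[0]        # plain row count, NaN or not

print(non_na_counts)
print(n_rows)
```

Here the count for "b" comes out one lower than the row count, which is the trap described above.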

Note that df.count() does not return an int, even when the dataframe is empty (e.g., pd.DataFrame(columns=["Blue","Red"]).count() is not 0; it is a Series of per-column counts).
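The empty case can be checked directly. This is only a sketch, reusing the column names from the comment above:

```python
import pandas as pd

empty = pd.DataFrame(columns=["Blue", "Red"])

counts = empty.count()   # still a Series, one entry per column
n_rows = empty.shape[0]  # an int

print(type(counts).__name__)
print(n_rows)
```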

You could use df.info() to get the row count (# entries), the number of non-null entries in each column, dtypes and memory usage. It gives a good, complete picture of the df. If you're looking for a number you can use programmatically, then use df.shape[0].

Mateen Ulhaq

For a dataframe df, one can use any of the following:

len(df.index)

df.shape[0]

df[df.columns[0]].count() (== number of non-NaN values in first column)

https://i.stack.imgur.com/wEzue.png

Code to reproduce the plot:

import numpy as np
import pandas as pd
import perfplot

perfplot.save(
    "out.png",
    setup=lambda n: pd.DataFrame(np.arange(n * 3).reshape(n, 3)),
    n_range=[2**k for k in range(25)],
    kernels=[
        lambda df: len(df.index),
        lambda df: df.shape[0],
        lambda df: df[df.columns[0]].count(),
    ],
    labels=["len(df.index)", "df.shape[0]", "df[df.columns[0]].count()"],
    xlabel="Number of rows",
)

There's one good reason to use shape in interactive work instead of len(df): when trying out different filters, I often need to know how many items remain. With shape I can see that just by adding .shape after my filtering. With len(), editing the command line becomes much more cumbersome, going back and forth.
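To illustrate that workflow, here is a small sketch with hypothetical "price" and "qty" columns; the filters themselves are arbitrary:

```python
import pandas as pd

# Hypothetical data for trying out filters interactively.
df = pd.DataFrame({"price": [5, 12, 30, 8], "qty": [1, 0, 4, 2]})

# Appending .shape to the filter expression gives (rows, columns) remaining.
remaining = df[df["price"] > 6].shape
remaining_both = df[(df["price"] > 6) & (df["qty"] > 0)].shape

print(remaining)
print(remaining_both)
```

The equivalent with len() means wrapping the whole expression, so the cursor has to go to both ends of the line.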

Won't work for OP, but if you just need to know whether the dataframe is empty, df.empty is the best option.
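A quick sketch of df.empty on two made-up frames (the column names are illustrative):

```python
import pandas as pd

no_rows = pd.DataFrame(columns=["Blue", "Red"])
some_rows = pd.DataFrame({"Blue": [1], "Red": [2]})

# .empty is a boolean attribute, not a method.
no_rows_is_empty = no_rows.empty
some_rows_is_empty = some_rows.empty

print(no_rows_is_empty, some_rows_is_empty)
```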

I know it's been a while, but doesn't len(df.index) take 381 nanoseconds, or 0.381 microseconds, while df.shape is 3 times slower, taking 1.17 microseconds? Did I miss something? @root

A (3,3) matrix is a bad example, as it does not show the order of the shape tuple.

How is df.shape[0] faster than len(df) or len(df.columns)? Since 1 µs (microsecond) = 1000 ns (nanoseconds), 1.17 µs = 1170 ns, which means it's roughly 3 times slower than 381 ns.

Peter Mortensen

Suppose df is your dataframe then:

count_row = df.shape[0]  # Gives number of rows
count_col = df.shape[1]  # Gives number of columns

Or, more succinctly,

r, c = df.shape
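A non-square frame makes the order of the tuple visible. This sketch assumes an arbitrary 5-row, 2-column frame:

```python
import numpy as np
import pandas as pd

# 5 rows, 2 columns, so the two entries of shape differ.
df = pd.DataFrame(np.zeros((5, 2)), columns=["a", "b"])

r, c = df.shape  # rows first, then columns

print(r, c)
```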

If the data set is large, len(df.index) is significantly faster than df.shape[0] if you only need the row count. I tested it.

Why don't I have a shape method on my DataFrame?

@ArdalanShahgholi it's probably because what was returned is a series, which is always 1 dimensional. Therefore, only len(df.index) will work

@Connor I need the number of rows and the number of columns from my DF. My DF also has a select, meaning I have a table, so the question is: why do I not have the SHAPE function on my DF?

Great question, make it a separate question on SO, share what you’ve tried and what you see as a result (give a full working set of code that’s simple for others to replicate) and then share the link to that question here. I’ll see if I can help

Dr. Jan-Philip Gehrcke

Use len(df) :-).

__len__() is documented with "Returns length of index".

Timing info, set up the same way as in root's answer:

In [7]: timeit len(df.index)
1000000 loops, best of 3: 248 ns per loop

In [8]: timeit len(df)
1000000 loops, best of 3: 573 ns per loop

Due to one additional function call, it is of course correct to say that it is a bit slower than calling len(df.index) directly. But this should not matter in most cases. I find len(df) to be quite readable.
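As a quick sanity check, the three spellings agree on the row count. A sketch on a made-up 7-row frame:

```python
import pandas as pd

df = pd.DataFrame({"a": range(7)})

n_len = len(df)              # delegates to the index length
n_index = len(df.index)
n_shape = df.shape[0]

print(n_len, n_index, n_shape)
```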