
pandas get column average/mean

I can't get the average (mean) of a column in pandas. I have a dataframe, and neither of the things I tried below gives me the average of the column weight:

>>> allDF 
         ID           birthyear  weight
0        619040       1962       0.1231231
1        600161       1963       0.981742
2      25602033       1963       1.3123124     
3        624870       1987       0.94212

The following returns several values, not one:

allDF[['weight']].mean(axis=1)

So does this:

allDF.groupby('weight').mean()
df.groupby('weight') wasn't what you wanted, because it splits the df into separate groups, one for each distinct value of weight. Use just df['weight'].mean() instead.
allDF.weight.mean()

DSM

If you only want the mean of the weight column, select the column (which is a Series) and call .mean():

In [479]: df
Out[479]: 
         ID  birthyear    weight
0    619040       1962  0.123123
1    600161       1963  0.981742
2  25602033       1963  1.312312
3    624870       1987  0.942120

In [480]: df["weight"].mean()
Out[480]: 0.83982437500000007

and what if I wanted to get a mean of each and every column?
@Chris df.describe()
@Chris df.mean() gives you the mean of each column and returns it in a Series.
Soufiane S

Try df.mean(axis=0). The axis=0 argument calculates the column-wise mean of the dataframe, so the result has one value per column. axis=1 is the row-wise mean, which is why you are getting multiple values.
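To make the axis difference concrete, here is a minimal sketch that rebuilds the question's dataframe by hand (the variable names col_means, row_means and single are only illustrative):

```python
import pandas as pd

# The dataframe from the question, rebuilt by hand
df = pd.DataFrame({
    "ID": [619040, 600161, 25602033, 624870],
    "birthyear": [1962, 1963, 1963, 1987],
    "weight": [0.1231231, 0.981742, 1.3123124, 0.94212],
})

col_means = df.mean(axis=0)               # Series with one mean per column
row_means = df[["weight"]].mean(axis=1)   # Series with one mean per row
single = df["weight"].mean()              # a single scalar

print(len(col_means), len(row_means), single)
```

With only one column selected, the row-wise mean simply returns each weight unchanged, which is why the attempt in the question produced four values.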


This works for most columns, but it will ignore any datetime columns.
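How df.mean() treats non-numeric columns has changed between pandas versions, so it may be safer to be explicit. A small sketch, assuming a reasonably recent pandas and a made-up dataframe with a string column and a datetime column:

```python
import pandas as pd

# Hypothetical mixed-type dataframe, not taken from the question
df = pd.DataFrame({
    "weight": [0.5, 1.5],
    "name": ["a", "b"],
    "seen": pd.to_datetime(["2020-01-01", "2020-01-03"]),
})

# Restrict the column-wise mean to numeric columns only
numeric_means = df.mean(numeric_only=True)

# A datetime column can still be averaged on its own
seen_mean = df["seen"].mean()

print(numeric_means)
print(seen_mean)
```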
nainometer

Give print(df.describe()) a try. It gives you an overall statistical summary of your dataframe, including the mean of every numeric column.


display(df.describe()) is better (in Jupyter Notebooks) because display from ipython provides formatted HTML rather than ASCII, which is more visually useful/pleasing.
Hrvoje

Mean for each column in df :

    A   B   C
0   5   3   8
1   5   3   9
2   8   4   9

df.mean()

A    6.000000
B    3.333333
C    8.666667
dtype: float64

and if you want the average of all values across all columns:

df.stack().mean()
6.0
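As a sketch of some alternatives for the overall mean, using the same A/B/C data: going through NumPy gives the same number here, and so does the mean of the column means, but the latter only matches when every column has the same number of non-missing values.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 5, 8], "B": [3, 3, 4], "C": [8, 9, 9]})

overall_stack = df.stack().mean()      # flatten to one Series, then average
overall_numpy = df.to_numpy().mean()   # average over the underlying 2-D array
mean_of_means = df.mean().mean()       # equal here only because columns have equal length

print(overall_stack, overall_numpy, mean_of_means)
```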

Arun Singh

You can use

df.describe()

to get basic statistics for the dataframe. To get the mean of a specific column, use

df["columnname"].mean()

This is a duplicate of the answers mentioned above.
Nikos Tavoularis

You can also access a column using the dot notation (also called attribute access) and then calculate its mean:

df.your_column_name.mean()
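One caveat with attribute access: it only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method. A small sketch with made-up column names:

```python
import pandas as pd

# Hypothetical columns: one clashes with DataFrame.count, one contains a space
df = pd.DataFrame({"count": [1, 2, 3], "body weight": [60.0, 70.0, 80.0]})

attr = df.count                 # the DataFrame.count method, not the column
is_method = callable(attr)

bracket_mean = df["count"].mean()       # bracket access always gets the column
spaced_mean = df["body weight"].mean()  # names with spaces need brackets too

print(is_method, bracket_mean, spaced_mean)
```

Bracket access is therefore the more robust choice in scripts, while dot notation is mostly a convenience for interactive use.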

oo00oo00oo00

You can use either of the two statements below:

numpy.mean(df['col_name'])
# or
df['col_name'].mean()
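The two approaches can differ once missing values are involved. A minimal sketch with a made-up Series: the pandas method skips NaN by default, whereas NumPy applied to the raw array does not.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

pandas_mean = s.mean()                   # skips NaN by default
strict_mean = s.mean(skipna=False)       # NaN propagates
raw_numpy_mean = np.mean(s.to_numpy())   # plain NumPy on the array gives NaN
nan_aware_mean = np.nanmean(s.to_numpy())  # NumPy's NaN-ignoring variant

print(pandas_mean, strict_mean, raw_numpy_mean, nan_aware_mean)
```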

Please, enrich your answer with proper comments. Otherwise it is likely to be marked for deletion
Md. Tanvir Raihan

Additionally, if you want to round the value after computing the mean:

import pandas as pd

# Create a DataFrame
data = {
    'Subject': ['semester1', 'semester2', 'semester3', 'semester4', 'semester1',
                'semester2', 'semester3'],
    'Score': [62.73, 47.76, 55.61, 74.67, 31.55, 77.31, 85.47]}
df1 = pd.DataFrame(data, columns=['Subject', 'Score'])

rounded_mean = round(df1['Score'].mean()) # specified nothing as decimal place
print(rounded_mean) # 62

rounded_mean_decimal_0 = round(df1['Score'].mean(), 0) # specified decimal place as 0
print(rounded_mean_decimal_0) # 62.0

rounded_mean_decimal_1 = round(df1['Score'].mean(), 1) # specified decimal place as 1
print(rounded_mean_decimal_1) # 62.2

SHAGUN SHARMA

You can simply use df.describe(), which gives you summary statistics for every numeric column. To find the min, max or average value of a particular column (say 'weight' in your case), use:

    df['weight'].mean(): for the average value
    df['weight'].max(): for the maximum value
    df['weight'].min(): for the minimum value
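If you want several of these statistics at once, Series.agg accepts a list of function names. A short sketch using the weight values from the question (the variable name summary is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"weight": [0.1231231, 0.981742, 1.3123124, 0.94212]})

# One call returns a Series indexed by the statistic names
summary = df["weight"].agg(["mean", "max", "min"])

print(summary)
```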

kklw

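Note that the column needs to have a numeric dtype in the first place. If it was read in as strings, convert it first (values that cannot be parsed become NaN):

import pandas as pd
df['column'] = pd.to_numeric(df['column'], errors='coerce')

Then get the mean of one column with .mean(), or summary statistics for all numeric columns with describe().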

df['column'].mean()
df.describe()

Example of result from describe:

          column 
count    62.000000 
mean     84.678548 
std     216.694615 
min      13.100000 
25%      27.012500 
50%      41.220000 
75%      70.817500 
max    1666.860000
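To see what errors='coerce' does in practice, here is a small sketch with made-up string data: the unparseable entry becomes NaN, and .mean() then skips it by default.

```python
import pandas as pd

# Hypothetical column that was read in as strings
df = pd.DataFrame({"column": ["13.1", "41.22", "n/a"]})

df["column"] = pd.to_numeric(df["column"], errors="coerce")

n_missing = df["column"].isna().sum()   # how many values failed to parse
col_mean = df["column"].mean()          # NaN is skipped by default

print(df["column"].dtype, n_missing, col_mean)
```

Checking n_missing before trusting the mean is worthwhile, since coerced values drop out of the average silently.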

artscan

Here is a complete example:

import pandas as pd

classxii = {'Name':['Karan','Ishan','Aditya','Anant','Ronit'],
            'Subject':['Accounts','Economics','Accounts','Economics','Accounts'],
            'Score':[87,64,58,74,87],
            'Grade':['A1','B2','C1','B1','A2']}

df = pd.DataFrame(classxii,index = ['a','b','c','d','e'],columns=['Name','Subject','Score','Grade'])
print(df)

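# If you already have a DataFrame, this is all you need.
# Note: df[['Score']].mean() returns a one-element Series;
# use df['Score'].mean() if you want a plain number.
print('mean of score is:')
print(df[['Score']].mean())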