
Turn Pandas Multi-Index into column

I have a dataframe with 2 index levels:

                         value
Trial    measurement
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

Which I want to turn into this:

Trial    measurement       value

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

How can I best do this?

I need this because I want to aggregate the data as instructed here, but I can't select my columns like that if they are in use as indices.

Duplicate: stackoverflow.com/questions/18624039/… You want the first suggestion. .reset_index()
Many thanks. I actually searched for this quite a bit, but "make multiindex to column" and similar queries always led me to threads about pivoting dataframes...
Always easier to find an answer when you already know it :)

cs95

reset_index() is a pandas DataFrame method that moves the index values into the DataFrame as columns. Its drop parameter defaults to False, which keeps the index values as columns rather than discarding them.

All you have to do is call .reset_index() on the DataFrame:

df = df.reset_index()  
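
As a minimal sketch of the call above, the following rebuilds the frame from the question (the construction via MultiIndex.from_tuples and the variable name flat are assumptions for illustration) and shows that both index levels end up as ordinary columns, with the Trial values repeated on every row:

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame: a two-level MultiIndex and one value column.
index = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (3, 0)],
    names=["Trial", "measurement"],
)
df = pd.DataFrame({"value": [13, 3, 4, np.nan, 12, 34]}, index=index)

# reset_index() returns a new frame: both levels become ordinary columns
# and a default integer index takes their place.
flat = df.reset_index()

print(flat.columns.tolist())   # ['Trial', 'measurement', 'value']
print(flat["Trial"].tolist())  # [1, 1, 1, 2, 2, 3]
```

The blank cells shown under Trial in the original frame are only how pandas displays a MultiIndex; once the level is a column, every row carries its own value.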

In my case, with 3 index levels, the in-place reset did not work. An alternative is to assign the reset DataFrame to a new variable: df2 = df.reset_index()
To reset only a particular level(s), use df.reset_index(level=[...])
Or the side-effect (probably quicker) way: df.reset_index(inplace=True)
Karl Anka

This doesn't really apply to your case, but it could be helpful for others (like myself 5 minutes ago) to know. If the levels of your MultiIndex share the same name, like this:

                         value
Trial        Trial
    1              0        13
                   1         3
                   2         4
    2              0       NaN
                   1        12
    3              0        34 

df.reset_index(inplace=True) will fail, because the columns it creates cannot have the same name.

So you first need to rename the MultiIndex levels with df.index = df.index.set_names(['Trial', 'measurement']) to get:

                           value
Trial    measurement       

    1              0        13
    1              1         3
    1              2         4
    2              0       NaN
    2              1        12
    3              0        34 

And then df.reset_index(inplace=True) will work like a charm.

I encountered this problem after grouping by year and month on a datetime column (not the index) called live_date, which meant that both the year and month levels were named live_date.
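
A small sketch of this situation, using a hypothetical three-row frame whose two index levels are both named Trial. It assumes a pandas version where the reset raises ValueError for the clashing column name; the exact message may differ between versions:

```python
import pandas as pd

# Hypothetical frame: both index levels carry the name "Trial".
index = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (2, 0)], names=["Trial", "Trial"]
)
df = pd.DataFrame({"value": [13, 3, 12]}, index=index)

try:
    df.reset_index()
    failed = False
except ValueError as exc:
    # pandas refuses to insert a second column with an existing name.
    failed = True
    print(f"reset_index failed: {exc}")

# Give each level a distinct name, then reset as usual.
df.index = df.index.set_names(["Trial", "measurement"])
flat = df.reset_index()
print(flat.columns.tolist())
```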


How do you get your Trial values to repeat themselves? I had the same problem, and this works, except that my values don't repeat.
Alex

There may be situations when df.reset_index() cannot be used (e.g., when you need the index, too). In this case, use index.get_level_values() to access index values directly:

df['Trial'] = df.index.get_level_values(0)
df['measurement'] = df.index.get_level_values(1)

This will assign index values to individual columns and keep the index.
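
As a hedged illustration on the question's data (rebuilt here as an assumption), get_level_values() also accepts the level name instead of its position, and the MultiIndex is left untouched:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (3, 0)],
    names=["Trial", "measurement"],
)
df = pd.DataFrame({"value": [13, 3, 4, np.nan, 12, 34]}, index=index)

# Copy each level into a column; levels can be addressed by name or position.
df["Trial"] = df.index.get_level_values("Trial")
df["measurement"] = df.index.get_level_values("measurement")

print(df.columns.tolist())  # ['value', 'Trial', 'measurement']
print(df.index.names)       # the two-level index is still in place
```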

See the docs for further info.


This is soooooooooo useful! It should be possible to do this using much clearer language, e.g. df['measurement'] = df.index.values(1).
sameagol

As @cs95 mentioned in a comment, to reset only one level (or a subset of levels) and keep the rest as the index, use:

df.reset_index(level=[...])

This avoids having to redefine your desired index after reset.
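
For instance, on the question's frame (rebuilt here as an assumption, with measurement chosen arbitrarily as the level to move), a sketch might look like this:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (3, 0)],
    names=["Trial", "measurement"],
)
df = pd.DataFrame({"value": [13, 3, 4, np.nan, 12, 34]}, index=index)

# Move only "measurement" into the columns; "Trial" stays as the index.
partial = df.reset_index(level="measurement")

print(partial.index.name)        # Trial
print(partial.columns.tolist())  # ['measurement', 'value']
```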


kevin_theinfinityfund

I ran into Karl's issue as well. I just found myself renaming the aggregated column then resetting the index.

df = pd.DataFrame(df.groupby(['arms', 'success'])['success'].sum()).rename(columns={'success':'sum'})

https://i.stack.imgur.com/7mlAz.png

df = df.reset_index()

https://i.stack.imgur.com/DHwDT.png
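
A possible alternative, sketched on made-up data (only the column names arms and success come from the snippet above): the groupby result is a Series, and Series.reset_index() takes a name argument that labels the value column, so the rename and the reset happen in one step.

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer above.
raw = pd.DataFrame({"arms": [0, 0, 1, 1, 1], "success": [1, 1, 0, 1, 1]})

# The result is a Series named "success" whose index levels are
# "arms" and "success", so a plain reset would clash on "success".
grouped = raw.groupby(["arms", "success"])["success"].sum()

# name= labels the value column, which avoids the clash.
flat = grouped.reset_index(name="sum")

print(flat.columns.tolist())  # ['arms', 'success', 'sum']
```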


whitetiger1399

Short and simple

df2 = pd.DataFrame({'test_col': df['test_col'].describe()})
df2 = df2.reset_index()