Python Panda (Page: 8)

A DataFrame in Pandas can have more than one index column. Such DataFrames are called multi-index DataFrames. In this short article, I will show you how to create and use a multi-index DataFrame.

Creating a MultiIndex DataFrame

We have a table with Shape, Color, Count and Size information for some objects, stored as CSV data. First of all, we will read this CSV dataset into a DataFrame without declaring any index columns.


import pandas as pd
import io
data = io.StringIO('''Shape,Color,Count,Size
Circle,Red,5,1.2
Circle,Green,6,0.9
Circle,Green,6,1.1
Square,Red,15,2.5
Square,Green,3,2.9
Oval,Green,21,0.5
''')
dfu = pd.read_csv(data)
dfu

Our test DataFrame without any indexes

In the second step, we will declare the columns ‘Shape’ and ‘Color’ as index columns.


df = dfu.set_index(['Shape', 'Color'])
df

DataFrame with the two columns ‘Shape’ and ‘Color’ declared as index levels
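
As a minimal sketch, the same two-level index can also be declared while reading the file, because `read_csv` accepts a list of column names in `index_col`. The names `csv_text` and `df_direct` are only illustrative and do not appear in the example above.

```python
import io
import pandas as pd

# Same sample data as in the article, kept in a string so the sketch is self-contained
csv_text = '''Shape,Color,Count,Size
Circle,Red,5,1.2
Circle,Green,6,0.9
Circle,Green,6,1.1
Square,Red,15,2.5
Square,Green,3,2.9
Oval,Green,21,0.5
'''

# index_col takes a list, so both columns become index levels in one step
df_direct = pd.read_csv(io.StringIO(csv_text), index_col=['Shape', 'Color'])

# Inspect the index: the level names and the number of levels
print(list(df_direct.index.names))
print(df_direct.index.nlevels)
```

Checking `index.names` and `index.nlevels` is a quick way to confirm that the DataFrame really has a multi-level index before grouping by level.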

Grouping by indexes

Group DataFrame by one index

It is possible to group by any index level and do calculations within these groups. For example, let’s group all objects by their ‘Shape’ and calculate the mean values of the remaining columns.


df.groupby(level=['Shape']).mean()

Mean values calculated for every group in ‘Shape’
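
To make the result easier to check, here is a small sketch that compares grouping by the index level with grouping by the plain ‘Shape’ column of the unindexed DataFrame. The variable names `by_level` and `by_column` are my own and are used only for illustration.

```python
import io
import pandas as pd

data = io.StringIO('''Shape,Color,Count,Size
Circle,Red,5,1.2
Circle,Green,6,0.9
Circle,Green,6,1.1
Square,Red,15,2.5
Square,Green,3,2.9
Oval,Green,21,0.5
''')
dfu = pd.read_csv(data)
df = dfu.set_index(['Shape', 'Color'])

# Group by the 'Shape' index level of the multi-index DataFrame
by_level = df.groupby(level='Shape').mean()

# Group by the ordinary 'Shape' column; only the numeric columns are averaged
by_column = dfu.groupby('Shape')[['Count', 'Size']].mean()

print(by_level)
print(by_column)
```

Both calls should give the same table. For the sample data, the Square group has a mean Count of 9, which is (15 + 3) / 2, and a mean Size of 2.7.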

Grouping works with any index level. For example, let’s count how many values we have for each ‘Color’.

It is important to remember that count() gives the number of non-NaN elements in each column, whereas size() gives the total number of rows in each group, including rows with NaN.


df.groupby(level=['Color']).count()

Grouped by ‘Color’ and then counted. All columns have the same values here, because no column has missing data.
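
The difference between count() and size() only becomes visible when values are missing. The sketch below uses a hypothetical variant of the dataset in which the Size of one green circle has been left empty. This modified data and the names `df_gap`, `counts` and `sizes` are assumptions made for illustration only.

```python
import io
import pandas as pd

# Hypothetical variant of the sample data: the third row has no Size value
csv_with_gap = '''Shape,Color,Count,Size
Circle,Red,5,1.2
Circle,Green,6,0.9
Circle,Green,6,
Square,Red,15,2.5
Square,Green,3,2.9
Oval,Green,21,0.5
'''
df_gap = pd.read_csv(io.StringIO(csv_with_gap)).set_index(['Shape', 'Color'])

gb_color = df_gap.groupby(level='Color')

# count() returns a DataFrame with the number of non-NaN values per column
counts = gb_color.count()

# size() returns a Series with the number of rows per group, NaN included
sizes = gb_color.size()

print(counts)
print(sizes)
```

In this variant the ‘Green’ group has four rows, so size() reports 4, while count() reports 4 for ‘Count’ but only 3 for ‘Size’.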

Group DataFrame by two indexes

It is also possible to group by several index levels and to specify the order of grouping. So let’s group by ‘Color’ and then by ‘Shape’, and calculate the average values in every group.


df.groupby(level=['Color','Shape']).mean()

Grouped by ‘Color’ and then by ‘Shape’, with the mean value calculated in every group
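
The effect of the grouping order can be seen by running both orders side by side. This is only a sketch, and the names `color_shape` and `shape_color` are illustrative.

```python
import io
import pandas as pd

data = io.StringIO('''Shape,Color,Count,Size
Circle,Red,5,1.2
Circle,Green,6,0.9
Circle,Green,6,1.1
Square,Red,15,2.5
Square,Green,3,2.9
Oval,Green,21,0.5
''')
df = pd.read_csv(data).set_index(['Shape', 'Color'])

# The order of the names in `level` sets the order of the levels in the result
color_shape = df.groupby(level=['Color', 'Shape']).mean()
shape_color = df.groupby(level=['Shape', 'Color']).mean()

print(list(color_shape.index.names))
print(list(shape_color.index.names))

# The same group is addressed with the key parts in the corresponding order
print(color_shape.loc[('Green', 'Circle'), 'Size'])
print(shape_color.loc[('Circle', 'Green'), 'Size'])
```

The group contents and the mean values are the same in both cases. Only the nesting of the resulting index, and therefore the order of the key parts used for lookup, changes.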

Using a group

After grouping, it is possible to select a single group.


gb = df.groupby(level=['Color','Shape'])
gb.get_group(('Green', 'Oval'))

After grouping, a single group can be selected by specifying its index values
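
To find out which keys can be passed to get_group(), the groupby object can be iterated. A minimal sketch follows, in which `group_sizes` and `oval` are illustrative names of my own.

```python
import io
import pandas as pd

data = io.StringIO('''Shape,Color,Count,Size
Circle,Red,5,1.2
Circle,Green,6,0.9
Circle,Green,6,1.1
Square,Red,15,2.5
Square,Green,3,2.9
Oval,Green,21,0.5
''')
df = pd.read_csv(data).set_index(['Shape', 'Color'])

gb = df.groupby(level=['Color', 'Shape'])

# Iterating yields (key, sub-DataFrame) pairs; with two levels the key is a tuple
group_sizes = {key: len(group) for key, group in gb}
for key, n_rows in group_sizes.items():
    print(key, n_rows)

# Any of the printed keys can be passed to get_group()
oval = gb.get_group(('Green', 'Oval'))
print(oval)
```

The key parts follow the order given in `level`, so here the color comes first and the shape second.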



Published: 2021-11-05 09:11:16
Updated: 2021-12-17 02:48:39