
Python Panda (Page: 6)

A pandas DataFrame behaves much like a table in a relational database, and it can also be sliced and iterated much like a standard Python list.

First, here is the DataFrame used in the examples below


import pandas as pd
roman = ['I', 'II', 'III', 'IV', 'V', 'VI']
numbers = ['one', 'two', 'three', 'four', 'five', 'six']
df = pd.DataFrame({'european': pd.Series(numbers),
                   'roman' : pd.Series(roman)
                 })

DataFrame slices

Select a single row by its index label


print(df.loc[2])
#european    three
#roman         III
#Name: 2, dtype: object

Select several rows as a slice, keeping only one column chosen by name. Note that a .loc slice includes both of its end labels


print(df.loc[2:4, 'roman'])
#2    III
#3     IV
#4      V
#Name: roman, dtype: object

Columns can also be selected by integer position instead of by name, using .iloc. Unlike .loc, an .iloc slice excludes its end position


print(df.iloc[2:4, 1])
#2    III
#3     IV
#Name: roman, dtype: object
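
The difference between the two indexers is easy to miss, so here is a minimal side-by-side sketch. The frame name demo is an assumption used only for this illustration.

```python
import pandas as pd

# Illustrative frame; 'demo' is a hypothetical name used only in this sketch
demo = pd.DataFrame({'european': ['one', 'two', 'three', 'four', 'five', 'six'],
                     'roman': ['I', 'II', 'III', 'IV', 'V', 'VI']})

by_label = demo.loc[2:4, 'roman']    # labels 2, 3 and 4 (end label included)
by_position = demo.iloc[2:4, 1]      # positions 2 and 3 (end position excluded)

print(by_label.tolist())     # ['III', 'IV', 'V']
print(by_position.tolist())  # ['III', 'IV']
```

With the default integer index, labels and positions coincide, which is why the only visible difference here is the handling of the end of the slice.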

Search over a pandas DataFrame

You can select the rows of a pandas DataFrame that satisfy a condition by indexing it with a boolean mask. The one thing to remember is that masks are combined with element-wise operators, not with the Python keywords and, or, and not:

  • & is used for and
  • | is used for or
  • ~ is used for not


print(df[df['roman'] >= 'V'])
#  european roman
#4     five     V
#5      six    VI
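
Keep in mind that >= on a column of strings compares them lexicographically, character by character, so it does not follow the numeric value of the Roman numerals. A small sketch of the pitfall; the frame nums and its value column are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical frame with a numeric column added for comparison
nums = pd.DataFrame({'roman': ['I', 'V', 'IX', 'X'],
                     'value': [1, 5, 9, 10]})

# String comparison: 'IX' sorts before 'V', so the numeral for 9 is missed
by_text = nums[nums['roman'] >= 'V']['roman'].tolist()
print(by_text)    # ['V', 'X']

# Numeric comparison gives the intended result
by_value = nums[nums['value'] >= 5]['roman'].tolist()
print(by_value)   # ['V', 'IX', 'X']
```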

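A more complicated case searches over two columns. Each condition must be wrapped in parentheses, because & binds more tightly than comparison operators such as >=. The backslash \ continues the statement on the next line; inside the square brackets it is optional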


print(df[(df['roman'] >= 'II') \
       & (df['european'] >= 't')])
#  european roman
#1      two    II
#2    three   III
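
The | and ~ operators work the same way as &. The following is a minimal sketch under the assumption of a small frame named sample (a hypothetical name, not part of the examples above).

```python
import pandas as pd

# Illustrative frame; 'sample' is a hypothetical name used only in this sketch
sample = pd.DataFrame({'european': ['one', 'two', 'three', 'four', 'five', 'six'],
                       'roman': ['I', 'II', 'III', 'IV', 'V', 'VI']})

# OR: keep rows where at least one condition holds
either = sample[(sample['roman'] == 'I') | (sample['european'] == 'six')]
print(either['european'].tolist())   # ['one', 'six']

# NOT: invert a mask to keep every row that does not match it
not_t = sample[~(sample['european'] >= 't')]
print(not_t['european'].tolist())    # ['one', 'four', 'five', 'six']
```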

Iterating over a pandas DataFrame

Again, let's create one more pandas DataFrame for these examples


import pandas as pd
data = {'Number' : ['zero', 'one', 'two', 'three', 'four', 'five', 'six'],
        'Roman'  : ['O', 'I', 'II', 'III', 'IV', 'V', 'VI'],
        'Arabic' : [0, 1, 2, 3, 4, 5, 6],
       }
ndf = pd.DataFrame(data)

Now we can iterate over rows


for row in ndf.itertuples():
    print(row)
#Pandas(Index=0, Number='zero', Roman='O', Arabic=0)
#Pandas(Index=1, Number='one', Roman='I', Arabic=1)
#Pandas(Index=2, Number='two', Roman='II', Arabic=2)
#Pandas(Index=3, Number='three', Roman='III', Arabic=3)
#Pandas(Index=4, Number='four', Roman='IV', Arabic=4)
#Pandas(Index=5, Number='five', Roman='V', Arabic=5)
#Pandas(Index=6, Number='six', Roman='VI', Arabic=6)

Each field of the row is also available as an attribute named after its column


for row in ndf.itertuples():
    print('Number: ', row.Number)
    print('Roman : ', row.Roman, '\n')
#Number:  zero
#Roman :  O 
#
#Number:  one
#Roman :  I 
#
#Number:  two
#Roman :  II 
# etc.
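
If the index and the named-tuple wrapper are not needed, itertuples() also accepts the index and name parameters. A short sketch, assuming a reduced two-row frame called small (a hypothetical name) so the result stays readable:

```python
import pandas as pd

# Reduced illustrative frame; 'small' is a hypothetical name
small = pd.DataFrame({'Number': ['zero', 'one'],
                      'Roman': ['O', 'I'],
                      'Arabic': [0, 1]})

# index=False drops the Index field, name=None yields plain tuples
rows = list(small.itertuples(index=False, name=None))
print(rows)   # [('zero', 'O', 0), ('one', 'I', 1)]
```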

iterrows() yields both the row index and the row content (as a Series) on each iteration


for row_index, row in ndf.iterrows():
    print('Row Index   : ', row_index)
    print('Row Content : ', row)
    print('One by One  : ', row.iloc[0], row.iloc[1], row.iloc[2], '\n')
#Row Index   :  0
#Row Content :  Number    zero
#Roman        O
#Arabic       0
#Name: 0, dtype: object
#One by One  :  zero O 0 
# . . . .
#Row Index   :  6
#Row Content :  Number    six
#Roman      VI
#Arabic      6
#Name: 6, dtype: object
#One by One  :  six VI 6 

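Iterating over the columns with items() works almost like iterating over a dictionary: each step yields a column name and the column itself as a Series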


for K, V in ndf.items():
    print('Key : ', K)
    print('Val :\n', V, '\n')
#Key :  Number
#Val :
# 0     zero
#1      one
#2      two
#3    three
#4     four
#5     five
#6      six
#Name: Number, dtype: object 
# ... etc

Conversion from pandas to a list

Sometimes a plain list is easier to work with than a pandas structure. The tolist() method of a Series performs the conversion. Suppose we need a list of values for the rows selected by Gender


import pandas as pd

data = {'Gender' : ['Male', 'Male', 'Male', 'Female', 'Female', 'Female', 'Female'],
        'weight' : [85, 86, 96, 44, 64, 46, 54],
        'age'    : [78, 65, 69, 74, 78, 77, 87],
       }
gdf = pd.DataFrame(data)

The data can be selected by column name


weight = gdf[gdf['Gender'] == 'Male']['weight'].tolist()
print(weight)  # [85, 86, 96]

The same can be done with column positions instead of names, although the expression becomes harder to read


age = gdf[gdf[gdf.columns[0]] == 'Female'][gdf.columns[2]].tolist()
print(age)  # [74, 78, 77, 87]
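
The chained brackets above can also be written as a single .loc or .iloc call, which tends to be easier to read. This is a sketch of one possible alternative, not the only way to do it; the names people and is_female are assumptions introduced for the illustration.

```python
import pandas as pd

# Same data as above under a hypothetical name, so the sketch is self-contained
people = pd.DataFrame({'Gender': ['Male', 'Male', 'Male', 'Female', 'Female', 'Female', 'Female'],
                       'weight': [85, 86, 96, 44, 64, 46, 54],
                       'age': [78, 65, 69, 74, 78, 77, 87]})

is_female = people['Gender'] == 'Female'

# By name: row mask and column label in one .loc call
ages_by_name = people.loc[is_female, 'age'].tolist()

# By position: .iloc needs a plain boolean array, hence .to_numpy()
ages_by_pos = people.iloc[is_female.to_numpy(), 2].tolist()

print(ages_by_name)  # [74, 78, 77, 87]
print(ages_by_pos)   # [74, 78, 77, 87]
```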



Published: 2021-11-05 09:11:16
Updated: 2021-12-17 02:48:39

© 2020 MyCoding.uk - My blog about coding and further learning. This blog was written in pure Perl, and the front-end output is rendered with TemplateToolkit.