Python Panda (Page: 7)

This will be very long list of most useful commands with panda, usually accompanied with some examples, if it is necessary for clear understanding. At he beginning I will introduce example sets and will continue to work only with them.

Pandas cheat sheet starting

For all your python code you need to load numpy and pandas libraries first. Also, defined here data sets we will use later for examples. Aslo we will use name df for our data-frame variable


import pandas as pd
import numpy as np
df1 = pd.DataFrame(
      [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]],
      index   = ['r1', 'r2', 'r3'],
      columns = ['c1', 'c2', 'c3', 'c4'])

Pandas importing data

df = pd.read_csv('file_name.csv') – read CSV file

Read CSV file. Important parameters are:

sep=',' - separator between columns
header=0 - select the row for column names. Use header=None for ignoring

df = pd.read_excel('file_name.xlsx') – read EXEL file

df = read_sql(query, connection_object) – read SQL table-specific

This function can read directly from many different SQL databases. This is an example how to read from MySQL


import mysql.connector as sql
import pandas as pd
db_connection = sql.connect(host='hostname', database='db_name', 
user='username', password='password')
df = pd.read_sql('SELECT * FROM table_name', con=db_connection)

df = pd.read_json(json_string) – read from JSON file

df = pd.read_html(url) – read table from URL given

pd.read_html(url) can read tables directly from HTM, but usually a lot of further cleaning required.

df = pd.read_table(filename) – read table from text file

This function is very similar to read_csv fucntion

df = pd.read_clipboard() - read content of clipboard

Read content of clipboard and send to read_table() function

Pandas exporting data

df = df.to_csv(filename) Write to a CSV file

df = df.to_excel(filename) Write to an Excel file

Before writing to excel, you need to specify the engine


writer = pd.ExcelWriter('./dataset/numbers.xlsx', engine='xlsxwriter')
numbers_df.to_excel(writer, sheet_name='Sheet1')

df = df.to_sql(table_name, connection_object) - Write to a SQL table

df = df.to_json(filename) - Write to a file in JSON format

View and inspect

df.head(n) – first n lines

df.tail(n) – last n lines

nrow, ncol = df.shape – number of rows and columns


nrow, ncol = df1.shape
print(nrow, ncol) # 3 4

df.info() - Index. DataType and Memory information

Statistical information about DataFrame

df.describe() - Summary statistic for numerical columns

Give basic statistic for all numerical columns, like number of elements, mean, standard deviation, minimal, maximal values and 25% 50% 75% quantilies


print(df1.describe())
#        c1    c2    c3    c4
#count  3.0   3.0   3.0   3.0
#mean   5.0   6.0   7.0   8.0
#std    4.0   4.0   4.0   4.0
#min    1.0   2.0   3.0   4.0
#25%    3.0   4.0   5.0   6.0
#50%    5.0   6.0   7.0   8.0
#75%    7.0   8.0   9.0  10.0
#max    9.0  10.0  11.0  12.0

df.mean() - mean of all numerical columns


This function is similar to df.describe() but produce differently oriented dataframe
print(df1.mean())
#c1    5.0
#c2    6.0
#c3    7.0
#c4    8.0
#dtype: float64

df.corr() - correlation between columns in a DataFrame

This example is not very demonstrative, because all columns are “parallel” with 100% correlation.


print(df1.corr())
#     c1   c2   c3   c4
#c1  1.0  1.0  1.0  1.0
#c2  1.0  1.0  1.0  1.0
#c3  1.0  1.0  1.0  1.0
#c4  1.0  1.0  1.0  1.0

df.count() - number of non-null values in each DataFrame column

Count any present numbers. 0 is counted also, this is not NULL

df.max() -highest value in each column

df.min() - lowest value in each column

df.median() - median of each column

df.std() - Returns the standard deviation of each column

Go to Page: 1; 2; 3; 4; 5; 6; 7; 8;

Published: 2021-11-05 09:11:16
Updated: 2021-12-17 02:48:39