Pymongo: browse databases

When you start working with a new collection in an unfamiliar database, it is important to see what the database contains: which collections exist, how large they are, and what a typical document looks like. This post shows how to do that first step of analysis.

I will go through the code step by step, but each snippet depends on the previous ones, so run them in order.

List of Databases

We will use pymongo to work with MongoDB and pprint to print the returned documents in a readable form.

First, connect to MongoDB with MongoClient, then list all databases with the list_database_names() method.


from pymongo import MongoClient
from pprint import pprint
client = MongoClient('localhost', 27017)  # create connection to MongoDB
print("Databases: ", client.list_database_names())
# Databases:  ['admin', 'config', 'intro_db', 'local']

There are 4 databases. admin, config and local are MongoDB's built-in system databases, so intro_db is the one worth investigating further.

List of collections in a database

The next step is to list all collections in the database of interest. We will use intro_db for further analysis.


db = client.intro_db  # create database object
print("Collections: ",db.list_collection_names())
# Collections:  ['event_example', 'iris', 'iris3', 'iris1']

There are 4 collections in intro_db; we will investigate the iris collection.

Collection size

To check the number of documents in a collection, use the count_documents() method with an empty filter.


print("total: ", db.iris.count_documents({}))
# total: 150

To find the size of a result that has already been returned as a list, use the standard len() function.
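The three steps so far (list databases, list collections, count documents) can be combined into a single overview. The following is a minimal sketch, not part of PyMongo: the helper name overview is my own, and it assumes the connected user is allowed to read every database on the server.

```python
def overview(client):
    """Return {database name: {collection name: document count}}.

    `overview` is a hypothetical helper, not a PyMongo function.
    `client` is expected to be a pymongo.MongoClient; only
    list_database_names(), list_collection_names() and
    count_documents({}) are called on it.
    """
    result = {}
    for db_name in client.list_database_names():
        db = client[db_name]  # same as client.<db_name>
        result[db_name] = {
            coll_name: db[coll_name].count_documents({})
            for coll_name in db.list_collection_names()
        }
    return result


# Usage against a live server (assumes MongoDB on localhost:27017):
#   from pymongo import MongoClient
#   from pprint import pprint
#   pprint(overview(MongoClient('localhost', 27017)))
```

On the server used in this post, overview(client)['intro_db']['iris'] should give 150, the same value as the count_documents() call above. On very large collections an exact count can be slow; PyMongo's estimated_document_count() is a faster, approximate alternative.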

Printing one document from collection

To fetch a single document from a collection, use the find_one() method.


doc = db.iris.find_one()  # returns one document (a dict), or None if the collection is empty
pprint(doc)
#{'_id': ObjectId('61c414f1fbb6d7fec39b5612'),
# 'petal': {'length': 1.4, 'width': 0.2},
# 'sepal': {'length': 4.9, 'width': 3},
# 'variety': 'Setosa'}

The output shows what is stored in each document: an _id, nested petal and sepal measurements, and a variety label.
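For documents with deeper nesting, it can help to list every field as a dotted path, which is also the notation MongoDB uses to query nested fields. Below is a small sketch in plain Python; field_paths is a hypothetical helper name, and the sample document is the one printed above with _id left out so that the snippet runs without a database.

```python
def field_paths(doc, prefix=''):
    """Yield (dotted_path, type_name) for every leaf field of a document."""
    for key, value in doc.items():
        path = f'{prefix}.{key}' if prefix else key
        if isinstance(value, dict):
            # Descend into embedded documents
            yield from field_paths(value, path)
        else:
            yield path, type(value).__name__


# Sample copied from the find_one() output above (without _id)
sample = {
    'petal': {'length': 1.4, 'width': 0.2},
    'sepal': {'length': 4.9, 'width': 3},
    'variety': 'Setosa',
}
for path, type_name in field_paths(sample):
    print(path, type_name)
# petal.length float
# petal.width float
# sepal.length float
# sepal.width int
# variety str
```

Note that sepal.width is an int in this document, because the value 3 was stored without a decimal part.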

Printing all documents from collections

The find() method returns a cursor over all documents in a collection. You can iterate over it to print the documents, analyse them later, or process them in any other way.

To get the number of documents returned by find(), you can convert the cursor to a list and take its length. Do this on a clone of the cursor: iterating a cursor consumes it, so otherwise you would have to run find() again.


cursor1 = db.iris.find() # find all documents
total = len(list(cursor1.clone())) # total = 150

for rec1 in cursor1:
    pprint(rec1)
#{'_id': ObjectId('61c414f1fbb6d7fec39b56a7'),
# 'petal': {'length': 5.1, 'width': 1.8},
# 'sepal': {'length': 5.9, 'width': 3},
# 'variety': 'Virginica'}
# . . .

This prints all 150 documents.
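On a large collection, printing everything is rarely practical. One option is to cap the cursor with limit() and look at only the first few documents. This is a minimal sketch; preview is a hypothetical helper name, and it assumes the object passed in behaves like a PyMongo collection, whose find() returns a cursor with a limit() method.

```python
def preview(collection, n=5):
    """Return at most n documents from a collection as a list.

    `preview` is a hypothetical helper, not a PyMongo function.
    """
    return list(collection.find().limit(n))


# Usage against a live server (assumes the db object created earlier):
#   from pprint import pprint
#   for rec in preview(db.iris, 3):
#       pprint(rec)
```

Keep n positive: MongoDB treats a limit of 0 as "no limit", so preview(db.iris, 0) would return the whole collection.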

Simple search in a collection

We can pass a filter to find() to specify which documents to return. Here is the simplest kind of search: matching a given value of the variety field.


cursor2 = db.iris.find({'variety': 'Setosa'})
for rec2 in cursor2:
    pprint(rec2)
#{'_id': ObjectId('61c3dfb003e959223125ece2'),
# 'petal': {'length': 1.4, 'width': 0.2},
# 'sepal': {'length': 5, 'width': 3.3},
# 'variety': 'Setosa'}
# . . .

This prints the 50 documents with the given variety.
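To see which values a field takes and how many documents match each one, the same kind of filter can be combined with distinct() and count_documents(). The sketch below is one possible approach; count_by_field is a hypothetical helper name, and it assumes the field holds simple hashable values such as strings.

```python
def count_by_field(collection, field):
    """Return {value: number of documents where field == value}.

    `count_by_field` is a hypothetical helper, not a PyMongo function.
    It issues one count_documents() call per distinct value, so it is
    meant for fields with only a handful of different values.
    """
    return {
        value: collection.count_documents({field: value})
        for value in collection.distinct(field)
    }


# Usage against a live server (assumes the db object created earlier):
#   print(count_by_field(db.iris, 'variety'))
```

For the iris collection, the 'Setosa' entry should be 50, in line with the search above.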

Published: 2021-12-23 12:04:29