Pymongo: browse databases
When you start working with an unfamiliar MongoDB database, the first thing to do is see what it contains: which databases exist, what the collections are called, and what the documents look like. Here I will show you how to do this first step of analysing an existing collection.
I will go through the code step by step, but each snippet relies on the ones before it, so run them in order.
List of Databases
We will use pymongo to work with MongoDB and pprint to print documents in a readable form.
First, connect to MongoDB with MongoClient, then list all databases with the list_database_names() method:
from pymongo import MongoClient
from pprint import pprint
client = MongoClient('localhost', 27017) # create connection to MongoDB
print("Databases: ", client.list_database_names())
# Databases: ['admin', 'config', 'intro_db', 'local']
There are 4 databases, and intro_db looks interesting for further investigation.
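The admin, config and local databases are created by MongoDB itself, so it can help to hide them when looking for your own data. A minimal sketch, assuming a hypothetical helper called user_database_names (it is not part of pymongo) that accepts any object with a list_database_names() method, such as the client created above:

```python
# Databases that MongoDB creates for its own bookkeeping.
SYSTEM_DATABASES = {"admin", "config", "local"}


def user_database_names(client):
    """Return database names from `client`, skipping MongoDB's built-in ones.

    `client` is assumed to be a pymongo MongoClient; anything with a
    list_database_names() method works. Hypothetical helper for illustration.
    """
    return [name for name in client.list_database_names()
            if name not in SYSTEM_DATABASES]


# Usage with the client from above:
# print(user_database_names(client))
```

With the four databases listed above, this leaves only intro_db.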
List of collections in a database
The next step is to list all collections in the database we are interested in. We will use intro_db for further analysis:
db = client.intro_db # create database object
print("Collections: ",db.list_collection_names())
# Collections: ['event_example', 'iris', 'iris3', 'iris1']
There are 4 collections in intro_db; we will investigate the iris collection.
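When a database holds many collections, it can be handy to narrow the list down by name. A small sketch, assuming a hypothetical helper collections_with_prefix (not a pymongo function) and a db object like the one created above:

```python
def collections_with_prefix(db, prefix):
    """Return the sorted names of collections in `db` that start with `prefix`.

    `db` is assumed to be a pymongo Database, e.g. client.intro_db.
    Hypothetical helper for illustration.
    """
    return sorted(name for name in db.list_collection_names()
                  if name.startswith(prefix))


# Usage with the database object from above:
# print(collections_with_prefix(db, 'iris'))
```

For the collection list shown above this gives ['iris', 'iris1', 'iris3'].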
Collection size
To check the size of a collection, use the count_documents() method. Its filter argument is required; an empty filter {} counts every document:
print("total: ", db.iris.count_documents({}))
# total: 150
If you have already fetched documents into a list, the standard len() gives the number of documents returned.
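The same count can be taken for every collection at once, which gives a quick overview of the whole database. A minimal sketch; collection_sizes is a hypothetical helper name, and db is assumed to be the pymongo Database object created earlier:

```python
def collection_sizes(db):
    """Map each collection name in `db` to its number of documents.

    Calls count_documents({}) once per collection, so it issues one
    query per collection. Hypothetical helper, not part of pymongo.
    """
    return {name: db[name].count_documents({})
            for name in db.list_collection_names()}


# Usage with the database object from above:
# pprint(collection_sizes(db))
```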
Printing one document from collection
To fetch a single document from a collection, use the find_one() method:
doc = db.iris.find_one() # returns one document as a dict, or None if nothing matches
pprint(doc)
#{'_id': ObjectId('61c414f1fbb6d7fec39b5612'),
# 'petal': {'length': 1.4, 'width': 0.2},
# 'sepal': {'length': 4.9, 'width': 3},
# 'variety': 'Setosa'}
This shows what each document stores: nested petal and sepal measurements and a variety label.
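For documents with more nesting than this one, it can be easier to read a flat list of field paths and value types. The sketch below is one possible way to do that in plain Python; field_types is a hypothetical helper, and the sample is the document printed above with '_id' left out so that no bson import is needed:

```python
def field_types(document, prefix=""):
    """Flatten a (possibly nested) document into {dotted.field: type name}."""
    result = {}
    for key, value in document.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into embedded documents, extending the dotted path.
            result.update(field_types(value, prefix=f"{path}."))
        else:
            result[path] = type(value).__name__
    return result


sample = {
    'petal': {'length': 1.4, 'width': 0.2},
    'sepal': {'length': 4.9, 'width': 3},
    'variety': 'Setosa',
}
print(field_types(sample))
# {'petal.length': 'float', 'petal.width': 'float', 'sepal.length': 'float',
#  'sepal.width': 'int', 'variety': 'str'}
```

Note that sepal.width is stored as an integer here while the other measurements are floats. Since find_one() returns None when nothing matches, check the result before passing it to a helper like this.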
Printing all documents from a collection
The find() method returns a cursor over all documents in the collection, which you can iterate to print them, analyse them later or process them in any other way.
To get the number of documents returned by find(), you can convert the cursor to a list and take its length. Do this on a clone of the cursor: iterating a cursor exhausts it, so without the clone you would have to call find() again. Keep in mind that the clone runs the query a second time.
cursor1 = db.iris.find() # find all documents
total = len(list(cursor1.clone())) # total = 150
for rec1 in cursor1:
    pprint(rec1)
#{'_id': ObjectId('61c414f1fbb6d7fec39b56a7'),
# 'petal': {'length': 5.1, 'width': 1.8},
# 'sepal': {'length': 5.9, 'width': 3},
# 'variety': 'Virginica'}
# . . .
This loop prints all 150 documents.
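Printing every document is fine for 150 records but quickly becomes unwieldy for larger collections. A small sketch of a preview that uses the cursor's limit() method to fetch only the first few documents; preview is a hypothetical helper name, and collection is assumed to be a pymongo Collection such as db.iris:

```python
def preview(collection, n=5):
    """Return at most `n` documents from `collection` as a list.

    Uses Cursor.limit(), so only `n` documents are requested from the
    server. Note that in MongoDB limit(0) means "no limit".
    Hypothetical helper for illustration.
    """
    return list(collection.find().limit(n))


# Usage with the database object from above:
# for rec in preview(db.iris, 3):
#     pprint(rec)
```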
Simple find in a collection
Passing a filter to find() selects which documents are returned. Here I will show you the simplest search, an equality match on a single field, in this case the variety field:
cursor2 = db.iris.find({'variety': 'Setosa'})
for rec2 in cursor2:
    pprint(rec2)
#{'_id': ObjectId('61c3dfb003e959223125ece2'),
# 'petal': {'length': 1.4, 'width': 0.2},
# 'sepal': {'length': 5, 'width': 3.3},
# 'variety': 'Setosa'}
# . . .
This will print the 50 documents with the given variety.
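To see which values a field takes before choosing a filter, you can tally them on the Python side instead of printing every document. A minimal sketch using collections.Counter from the standard library; tally_field is a hypothetical helper, and it works on a cursor or on any iterable of dicts:

```python
from collections import Counter


def tally_field(documents, field):
    """Count how often each value of `field` occurs in an iterable of documents.

    Documents that lack the field are counted under None.
    Hypothetical helper for illustration.
    """
    return Counter(doc.get(field) for doc in documents)


# Usage with the database object from above:
# print(tally_field(db.iris.find(), 'variety'))
```

This pulls every document to the client, which is acceptable for a small collection like iris; for large collections a server-side count such as count_documents() with a filter is the cheaper option.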
Published: 2021-12-23 12:04:29