Get Database Information and Check if a MongoDB Collection Exists in Python

Introduction

This tutorial explains how to check whether MongoDB databases and collections exist on a server, and how to access them, using the PyMongo driver for Python. The PyMongo driver for MongoDB must be installed, along with Python 3.

Prerequisites

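  • The MongoDB server must be running and reachable from the machine running the Python script (the examples here connect to localhost). The following command confirms that the mongo shell is installed and prints its version: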
mongo --version
  • The PyMongo Python driver for MongoDB must be installed using pip3, as shown below. As Python 2.7 is deprecated and losing support, all the examples in this tutorial use Python 3:
pip3 install pymongo
  • A database on the MongoDB cluster with collections suitable for testing the examples in this article.

No spaces are allowed in a MongoDB database name

Unlike collection names, database names are not allowed to contain spaces. Characters such as hyphens (-) and underscores (_) are permitted, but a space in the name produces an error, as shown here:

Screenshot of MongoDB Compass GUI displaying an error from user creating a database with a space in the name

For example, trying to create a database object like client["Hello World"] in PyMongo will raise a pymongo.errors.InvalidName exception.
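The naming rule above can also be checked before a name ever reaches the driver. The following is a minimal sketch, not PyMongo's own implementation: the helper name invalid_db_name_chars and the FORBIDDEN_DB_NAME_CHARS set are assumptions made for illustration, based on the characters MongoDB documents as disallowed in database names on Unix-like systems.

```python
# Characters assumed to be disallowed in database names (an illustrative
# set, not imported from PyMongo): space, dot, dollar sign, forward slash,
# backslash, double quote, and the null character.
FORBIDDEN_DB_NAME_CHARS = frozenset(' .$/\\"\x00')


def invalid_db_name_chars(name):
    """Return a sorted list of the disallowed characters found in name."""
    return sorted(set(name) & FORBIDDEN_DB_NAME_CHARS)


# try a few candidate names, including the one from the example above
for candidate in ("Hello World", "Some-Database", "unwanted_database"):
    bad = invalid_db_name_chars(candidate)
    status = "ok" if not bad else f"invalid characters: {bad!r}"
    print(candidate, "-->", status)
```

An empty list means the name passed this particular check; the server may still apply other limits, such as a maximum name length.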

How to Access the MongoClient Class and Create a Client Instance of MongoDB in Python

First, import the PyMongo library into the Python script and then create a new client instance of the driver, as follows:

# import the MongoClient class and the errors module from pymongo
from pymongo import MongoClient, errors

# create a new client instance
mongo_client = MongoClient('mongodb://localhost:27017')

Use the client to access the databases and collections on the server. PyMongo’s errors module is imported so that any PyMongo exceptions can be caught in a try-except block.

How to Call the MongoClient’s list_database_names() Method to Return a List of All the Databases

Call the client’s list_database_names() method and have the PyMongo client return a list of all the MongoDB database names, in string format, that exist on the server, as shown here:

db_names = mongo_client.list_database_names()

>NOTE: The older database_names() method was deprecated in PyMongo 3.7 and removed in PyMongo 4.0; the deprecation applies to the driver, not to the MongoDB server.

Printing the db_names list shows all of the database names, which gives an idea of precisely what’s on the server. The Mongo Shell equivalent of the list_database_names() method call is show dbs.
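Since list_database_names() returns plain strings, checking whether a particular database exists comes down to a membership test. The sketch below assumes a hypothetical helper name, database_exists, and accepts any client object that provides list_database_names(). Note that MongoDB typically reports only databases that already contain data, so a database object that has never been written to will not appear in the list.

```python
def database_exists(client, db_name):
    """Return True if db_name is among the names the server reports."""
    return db_name in client.list_database_names()


# Usage, assuming a MongoDB server is reachable on localhost:
#   from pymongo import MongoClient
#   mongo_client = MongoClient('mongodb://localhost:27017')
#   print(database_exists(mongo_client, "Some-Database"))
```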

How to iterate the MongoDB database names

This method will return a list object containing all of the database names, or an empty list object ([]) if there are no databases. Use a built-in function like enumerate() to iterate over the database names together with their index, as follows:

# use enumerate to iterate the database names list
for db in enumerate(db_names):
    print (db)

# get the total num of databases
print ("total databases:", len(db_names))

The above print statements should print out something like the following:

(0, 'Some-Database')
(1, 'admin')
(2, 'config')
(3, 'employees')
(4, 'local')
(5, 'unwanted_database')
total databases: 6
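The output above mixes MongoDB's built-in databases (admin, config, and local) with user-created ones. As a small sketch, the list can be filtered before iterating; the SYSTEM_DATABASES set and the user_databases helper are names assumed here for illustration, and the sample list simply copies the names from the output above.

```python
# MongoDB's built-in databases, as seen in the sample output above
SYSTEM_DATABASES = {"admin", "config", "local"}


def user_databases(db_names):
    """Return the names in db_names that are not built-in databases."""
    return [name for name in db_names if name not in SYSTEM_DATABASES]


# sample list copied from the output above; in a real script this would
# come from mongo_client.list_database_names()
sample = ['Some-Database', 'admin', 'config', 'employees', 'local',
          'unwanted_database']

# unpack the (index, name) tuples that enumerate() yields
for index, name in enumerate(user_databases(sample)):
    print(index, name)
```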

Returning information from the Database object

Printing the Database object, as shown below, will return some information about the MongoDB server, such as the host and port MongoDB is running on:

print (mongo_client["Some-Database"])

Note that accessing an undefined attribute of the database object does not return server information; PyMongo treats the attribute name as a collection name and returns a new collection instance.

Selecting a MongoDB Database to Look for Collections

The MongoClient class’s client instance can be used to access a MongoDB database directly in Python by creating a database object. The database and collection objects can then be used to verify whether documents and data are available.

db = mongo_client.some_database

The database object allows the user to access a collection object directly as well as to call the collection’s count_documents() method. For example, pass an empty dictionary object ({}) to the method call to count every document, as follows:

# create collection object from database and client instance
col = mongo_client["some_database"]["some_collection"]

# count the documents
docs = col.count_documents({})
print ("doc count:", docs, "for", col.name)

The docs object created should be an integer representing the total number of documents in the MongoDB collection.
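The same count can be taken for every collection in a database at once. The following is a rough sketch: count_all_collections is a hypothetical helper name, and it relies on the database object's list_collection_names() method and the collection's count_documents() method.

```python
def count_all_collections(db):
    """Map each collection name in db to its total document count."""
    return {
        name: db[name].count_documents({})
        for name in db.list_collection_names()
    }


# Usage, assuming a MongoDB server is reachable on localhost:
#   from pymongo import MongoClient
#   mongo_client = MongoClient('mongodb://localhost:27017')
#   print(count_all_collections(mongo_client["some_database"]))
```

Because only collections that already exist are listed, a misspelled collection name never shows up here with a misleading count of zero.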

How to validate and get the collections of a MongoDB database using PyMongo

If the collection "Some Collection" does not exist on the MongoDB server, it will be created “lazily” once documents are added to it. Bear in mind that PyMongo neither raises an exception nor gives a warning when a collection object is created for a collection that doesn’t exist, so there will be no indication as to why no documents are found in it.
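Because a missing collection fails silently, one lightweight check is to ask the database which collections it actually has. This is a minimal sketch, assuming a hypothetical helper name, collection_exists; it uses the database object's list_collection_names() method and does not read any documents.

```python
def collection_exists(db, col_name):
    """Return True if col_name is a collection that exists in db."""
    return col_name in db.list_collection_names()


# Usage, assuming a MongoDB server is reachable on localhost:
#   from pymongo import MongoClient
#   mongo_client = MongoClient('mongodb://localhost:27017')
#   db = mongo_client["Some-Database"]
#   if not collection_exists(db, "Some Collection"):
#       print("'Some Collection' has not been created yet")
```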

How to call the database object’s validate_collection() method

One way around silently getting no documents back is to force PyMongo to raise an error if a database or collection doesn’t exist. The db.validate_collection() method does this: it raises an exception for a missing collection, and otherwise returns a Python dictionary containing collection and document information.

Create a dict object by passing the collection’s name, as a string, to the method, then access its nrecords key, whose value should be an integer representing the collection’s total number of documents, as follows:

col_dict = db.validate_collection("Some Collection")
print ("'Some Collection' has", col_dict['nrecords'], "documents on it")

PyMongo raises an OperationFailure exception if a collection does not exist

Try to access another database and collection to see what documents are there, as follows:

# access a different database and collection
db = mongo_client["Some-Database"]
col = db["Some Collection"]

If the collection doesn’t exist, then PyMongo will raise an OperationFailure exception saying ns not found. A good way to verify a database’s collection, and its documents, is by using a try-except block in Python, as follows:

# validate the collection (requires the 'errors' module from PyMongo)
try:
    col_dict = db.validate_collection(col.name)
    print (col.name, "has", col_dict['nrecords'], "documents on it.\n")
except errors.OperationFailure as err:
    col_dict = None
    print ("PyMongo ERROR:", err, "\n")

Set the col_dict variable’s value to None if there’s a PyMongo error. Now, if the collection exists, the object can be evaluated.

# if no errors, print out the collection dict
if isinstance(col_dict, dict):
    print (col.name, "-->", col_dict)
    print ("nrecords:", col_dict["nrecords"])
else:
    print (col.name, "doesn't exist on server")

The following import line must be included at the beginning of the script; otherwise, Python will raise NameError: name 'errors' is not defined:

from pymongo import errors

Use PyMongo’s validate_collection() method to check if a MongoDB collection exists and has documents, as shown here:

Screenshot of IDLE counting documents and using validate_collection() on a MongoDB collection

Use the drop_database() method to delete the MongoDB database

The PyMongo equivalent of the Mongo Shell’s db.dropDatabase() command is the drop_database() method. To drop a MongoDB database, pass its name, as a string, to the client instance’s drop_database() method:

mongo_client.drop_database('unwanted_database')

The API call returns None rather than a Cursor or Results object.

Execute another call to the list_database_names() method to verify the database was dropped, as follows:

# get the database names again
db_names = mongo_client.list_database_names()

# use enumerate to iterate the database names list
for db in enumerate(db_names):
    print (db)

# get the total num of databases
print ("total databases:", len(db_names))

Now the total number of databases should be one less than before.
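The drop-then-recheck steps above can be wrapped in a single function. The following is only a sketch: drop_and_verify is a hypothetical helper name, and it works with any client object that provides list_database_names() and drop_database().

```python
def drop_and_verify(client, db_name):
    """Drop db_name; return True only if it existed and is now gone."""
    existed = db_name in client.list_database_names()
    client.drop_database(db_name)
    gone = db_name not in client.list_database_names()
    return existed and gone


# Usage, assuming a MongoDB server is reachable on localhost:
#   from pymongo import MongoClient
#   mongo_client = MongoClient('mongodb://localhost:27017')
#   print(drop_and_verify(mongo_client, 'unwanted_database'))
```

Returning False when the database was never listed helps catch a misspelled name, since dropping a database that doesn't exist does not raise an error by itself.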

Conclusion

This tutorial demonstrated how to check for and access a database and collection on a server using the PyMongo driver for Python, and how to drop a database. The MongoDB server must be running and reachable from the machine running the Python script. Bear in mind that PyMongo will not raise an exception when a database or collection object refers to something that hasn’t been created on the server, so there will be no indication as to why documents are not found in the collection. To avoid this, PyMongo can be forced to raise an error if a database or collection doesn’t exist by using the db.validate_collection() method, which otherwise returns a Python dictionary containing collection and document information.

Just the Code

#!/usr/bin/env python3
#-*- coding: utf-8 -*-

# import the MongoClient class and the errors module from pymongo
from pymongo import MongoClient, errors

# create a new client instance
mongo_client = MongoClient('mongodb://localhost:27017')

# have PyMongo return a list of the server's database names
db_names = mongo_client.list_database_names()

# use enumerate to iterate the database names list
for db in enumerate(db_names):
    print (db)

# get the total num of databases
print ("total databases:", len(db_names))

# create collection object from database and client instance
col = mongo_client["some_database"]["some_collection"]

# count the documents
docs = col.count_documents({})
print ("doc count:", docs, "for", col.name)

# access a different database and collection
db = mongo_client["Some-Database"]
col = db["Some Collection"]

# validate the collection (requires the 'errors' module from PyMongo)
try:
    col_dict = db.validate_collection(col.name)
    print (col.name, "has", col_dict['nrecords'], "documents on it.\n")
except errors.OperationFailure as err:
    col_dict = None
    print ("PyMongo ERROR:", err, "\n")

# if no errors, print out the collection dict
if isinstance(col_dict, dict):
    print (col.name, "-->", col_dict)
    print ("nrecords:", col_dict["nrecords"])
else:
    print (col.name, "doesn't exist on server")

# drop a database and its collections
mongo_client.drop_database('unwanted_database')
