How To Sort MongoDB Documents In A Collection Using Python

Introduction

When a call is made to query documents, PyMongo returns a pymongo.cursor.Cursor object that gives access to the query’s results. This cursor object has a method called sort() that lets the matching MongoDB documents be ordered alphabetically or numerically before they are iterated over. Collection methods that return a dictionary object rather than a cursor, like find_one(), have no built-in sort() method. The following examples for sorting MongoDB documents with PyMongo can be tested in Python’s IDLE environment by opening a terminal or command prompt window and typing “idle3” for the Python 3 version of IDLE, or just “idle” for Python 2.

Prerequisites to Sorting MongoDB Documents

  • A working knowledge of Python and its syntax.
  • A properly installed Python interpreter.

Note: Python 2.7 reached its end of life in January 2020. As such, Python 3 is recommended for executing the examples in this tutorial.

  • The PyMongo Python driver for MongoDB must be installed with the pip package manager, as shown here:
pip3 install pymongo
  • Use the netcat (nc) command, shown below, in a UNIX terminal to confirm that MongoDB is running on port 27017:
nc -v localhost 27017

The terminal should output something that resembles the following:

Connection to localhost port 27017 [tcp/*] succeeded!

Press the CTRL+C keys to exit.

How to create a collection instance from PyMongo’s MongoClient class

Import PyMongo’s MongoClient class at the top of the script. It is needed to connect to MongoDB and to create a collection object for a collection stored in the database, as shown here:

# import the MongoClient class from PyMongo
from pymongo import MongoClient

Next, create a new client instance of MongoClient and access a database that has a collection with documents in it, as in this example:

# create a client instance of the MongoClient class
mongo_client = MongoClient('mongodb://localhost:27017')

# create database instance
db = mongo_client["sort_example"]

Now access a collection’s find() method using the db object. Pass an empty dictionary ({}) to the method if you want to query all of the collection’s documents, as shown here:

cursor = db["Sort-Collection"].find( {} )

How to query a MongoDB collection that has documents in it

Call one of the collection class methods that return pymongo.cursor objects, like the find() or find_raw_batches() methods, and use those objects to filter and query documents.

Calling PyMongo’s find() method and having it return a pymongo.cursor.Cursor object

All cursor objects in PyMongo should work with the sort() method. Use Python’s hasattr() function for confirmation, as shown here:

print ("Cursor obj has 'sort()':", hasattr(cursor, "sort"))

If done correctly, Python should print True to the terminal or IDLE window.

A list of the MongoDB document examples used in this tutorial

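The examples in this tutorial, shown below, are from a collection made up of just six documents that contain "last name", "first name" and "gender" string fields and an "age" integer field. Note that the last document stores "age" as the string '24' rather than an integer, which matters for the sorting behavior discussed later in this tutorial: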

{'_id': ObjectId('5cff441e72ccde332cf3a213'), 'last name': 'Truk', 'first name': 'Ezekiel', 'gender': 'Male', 'age': 45},
{'_id': ObjectId('5cff445a72ccde332cf3a214'), 'last name': 'Nan', 'first name': 'Clara', 'gender': 'Female', 'age': 23},
{'_id': ObjectId('5cff449e72ccde332cf3a215'), 'last name': 'Shulman', 'first name': 'Corinthian', 'gender': 'Male', 'age': 52},
{'_id': ObjectId('5cff44aa72ccde332cf3a216'), 'last name': 'Erickson', 'first name': 'Janet', 'gender': 'Female', 'age': 36},
{'_id': ObjectId('5cff44b972ccde332cf3a217'), 'last name': 'Atkins', 'first name': 'Carmen', 'gender': 'Female', 'age': 26},
{'_id': ObjectId('5cff44c572ccde332cf3a218'), 'last name': 'Paleozoic', 'first name': 'Josephus', 'gender': 'Male', 'age': '24'}

Screenshot of Python's IDLE getting the documents in a MongoDB collection and checking find() for the sort() attribute

How to use the Cursor object’s sort() method to organize MongoDB documents

The sort method doesn’t require that a parameter for the order be passed to it. If just the field name is passed to sort(), then it will organize the data in ASCENDING order by default.

Have the find().sort() call return a cursor object so the documents can be iterated over, as shown here:

cursor = db["Sort-Collection"].find().sort("first name")

Iterate the cursor object returned by the sort() call

The documents returned by MongoDB in the cursor object can be iterated over much like a Python list, either with a plain for loop or with the enumerate() function, as shown here:

# verify that MongoDB returned a Cursor object
if hasattr(cursor, "sort"):

    # iterate over the Cursor obj for docs
    for num, doc in enumerate(cursor):
        print(num, "-->", doc, "\n")

The document returned in each iteration should be a Python dictionary object that has the document’s fields, including its "_id" as the dict object’s keys.

Using Python’s IDLE to sort MongoDB documents and iterate over the PyMongo Cursor object

Screenshot of the enumerate() function iterating over a PyMongo cursor object returned by the sort() method

How to pass multiple parameters to the sort() method to sort on several fields and reverse the sort order

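When passing more than just the field name string, pass a list of tuples, each holding two values: the field name and the sort order. To sort on more than one field, add another tuple to the list. Documents are ordered by the field in the first tuple, and each later tuple breaks ties left by the ones before it.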

The following is an example that sorts the "age" of the people in descending order (-1) and also sorts their last names in ascending order (1):

cursor = db["Sort-Collection"].find( {} ).sort(
[
("age", -1), ("last name", 1)
]
)
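
Because no two people in the sample collection share an age, the second sort key never comes into play there. The following is a minimal local sketch, using hypothetical documents and a hypothetical local_sort() helper rather than a real MongoDB query, of how a compound sort specification of this shape orders documents when the first key ties:

```python
# Hypothetical documents (not the tutorial's collection); two share age 36
docs = [
    {"last name": "Nan", "age": 23},
    {"last name": "Shulman", "age": 36},
    {"last name": "Erickson", "age": 36},
    {"last name": "Truk", "age": 45},
]

# same shape as the list passed to the cursor's sort() method
sort_spec = [("age", -1), ("last name", 1)]

def local_sort(documents, spec):
    """Emulate a compound sort locally.

    Assumes every document has each field and that each field
    holds a single comparable type.
    """
    result = list(documents)
    # Python's sort is stable, so apply the keys from last to first
    for field, direction in reversed(spec):
        result.sort(key=lambda d: d[field], reverse=(direction == -1))
    return result

ordered = local_sort(docs, sort_spec)
for doc in ordered:
    print(doc["age"], doc["last name"])
```

The two 36-year-olds come out with Erickson ahead of Shulman, which is the tie-breaking role the second tuple plays.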

Screenshot of IDLE using two different sort parameters on a MongoDB collection

How to append MongoDB documents in the cursor object to a regular Python list

The documents returned by the Cursor iterator are just regular Python dict objects of the document’s contents. These can be added to a list by using the += operator with a one-item list, which has the same effect as calling the list’s append() method:

# create an empty list to store docs
sorted_docs = []

# iterate over the Cursor object for MongoDB docs
for num, doc in enumerate(cursor):
    # append the dict object to the list
    sorted_docs += [ doc ]

print ( "total docs:", len(sorted_docs) )
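
A shorter alternative to the loop is to hand the cursor to Python's list() constructor, since a cursor is an iterable. The sketch below uses a hypothetical fake_cursor() generator as a stand-in so it runs without a database. It also illustrates that a forward-only iterable yields its documents once; an exhausted PyMongo cursor behaves similarly, so run the query again or call the cursor's rewind() method before a second pass.

```python
def fake_cursor():
    """Hypothetical stand-in for a cursor: yields each document once."""
    yield {"first name": "Carmen", "age": 26}
    yield {"first name": "Clara", "age": 23}

cursor = fake_cursor()

# consume the whole iterable into a regular Python list in one step
sorted_docs = list(cursor)
print("total docs:", len(sorted_docs))

# a second pass over the same iterable finds nothing left
leftover = list(cursor)
print("docs on second pass:", len(leftover))
```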

The sort API call will not order the data as expected if the field in a document is the wrong data type

If just one of the documents stores the field as the wrong data type, e.g. a string in a field that should hold integers, then sort() will not raise an error, but it will return the documents in an unexpected order. MongoDB compares values of different types by type first, and numbers sort before strings in ascending order.

The PyMongo sort() method failing to sort an integer field

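In the sample collection, the document for Josephus Paleozoic stores "age" as the string '24'. A sort on "age" therefore places that document after every integer age in ascending order, and before them in descending order, instead of between 23 and 26.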

If the sorted results look wrong, first examine the data type of the field in each document to confirm that all of the documents store an integer.
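
The effect can be reproduced locally. This is a rough sketch, not MongoDB's implementation: the hypothetical bson_like_key() helper only models the rule that numbers are ordered ahead of strings in an ascending sort, and it is applied to the "first name" and "age" values from the sample documents:

```python
docs = [
    {"first name": "Ezekiel", "age": 45},
    {"first name": "Clara", "age": 23},
    {"first name": "Corinthian", "age": 52},
    {"first name": "Janet", "age": 36},
    {"first name": "Carmen", "age": 26},
    {"first name": "Josephus", "age": "24"},  # string, not integer
]

def bson_like_key(value):
    """Rank numbers ahead of strings, then compare within the same type.

    Only these two types are modeled; anything else raises TypeError.
    """
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return (0, value, "")
    if isinstance(value, str):
        return (1, 0, value)
    raise TypeError("type not covered by this sketch")

ascending = sorted(docs, key=lambda d: bson_like_key(d["age"]))
print([d["age"] for d in ascending])
```

The string '24' lands at the end of the ascending list, after 52, even though it looks like it belongs between 23 and 26.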

How to use the PyMongo’s update_many() method

PyMongo’s update_many() method, or the MongoDB Compass UI, can be used to update the affected documents and fix the data.
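
One possible way to make that fix is sketched here, under some assumptions: the server is MongoDB 4.2 or later (needed for aggregation-pipeline updates), the PyMongo version accepts a pipeline list in update_many(), and every string stored in "age" holds a whole number that $toInt can convert. The helper names build_age_fix() and fix_age_types() are hypothetical.

```python
def build_age_fix():
    """Return the (filter, update) arguments for update_many()."""
    # match only the documents whose "age" is stored as a string
    query = {"age": {"$type": "string"}}
    # pipeline-style update: replace "age" with its integer conversion
    update = [{"$set": {"age": {"$toInt": "$age"}}}]
    return query, update

def fix_age_types(collection):
    """Apply the fix to a PyMongo collection; return the number modified."""
    query, update = build_age_fix()
    result = collection.update_many(query, update)
    return result.modified_count

# inspect the arguments without touching a database
query, update = build_age_fix()
print(query)
print(update)
```

With a live connection, the call would look like fix_age_types(db["Sort-Collection"]). It is worth running a find() with the same filter first to review which documents would be changed.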

How to filter an integer field using $gt and sort the data by that field

The following is an example of how to use the greater-than filter ($gt) and sort method together:

# only get documents with "age" > 40, and sort them by "age"
cursor = db["Sort-Collection"].find(
{"age": {"$gt":40}}
).sort( [("age", -1)] ) # sort age by DESCENDING order

Note: The number representing the second item of the tuple object passed to sort() is the sort order. The 1 is for ASCENDING and -1 is for DESCENDING order.

How to sort MongoDB documents by a field, but return a different field

A second dictionary parameter, the projection, can also be passed to the find() method so that the API call returns only the listed fields, while the data is still sorted by another field name by calling sort().

How to return only one field using sort() and the second parameter in find()

In the example below, the API call returns just the "first name" and "_id" fields of each document ("_id" is included by default), but the documents are in descending order of the "age" integer field:

cursor = db["Sort-Collection"].find( {}, { "first name":1 } ).sort( [("age", -1)] )
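
To show what shape of result to expect, here is a small local emulation. It is only a sketch: sort_then_project() is a hypothetical helper, the "_id" values are simplified, and every "age" is assumed to be stored as an integer.

```python
# Hypothetical stand-in documents with simplified "_id" values
docs = [
    {"_id": 1, "first name": "Ezekiel", "age": 45},
    {"_id": 2, "first name": "Clara", "age": 23},
    {"_id": 3, "first name": "Corinthian", "age": 52},
]

def sort_then_project(documents, projection, field, direction):
    """Order by one field, then keep "_id" plus the projected fields."""
    ordered = sorted(documents, key=lambda d: d[field],
                     reverse=(direction == -1))
    keep = {"_id"} | {name for name, flag in projection.items() if flag == 1}
    return [{k: v for k, v in doc.items() if k in keep} for doc in ordered]

results = sort_then_project(docs, {"first name": 1}, "age", -1)
for doc in results:
    print(doc)
```

The returned dictionaries carry no "age" key, yet they arrive oldest first, which mirrors how the sort field does not have to appear in the projection.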

Screenshot of Python's IDLE iterating a Cursor object with sorted data of two fields

Conclusion

This tutorial explained how to use the PyMongo sort() method and how to pass multiple parameters for filtering and compound sorting of MongoDB documents. Bear in mind that sort() will not order the data as expected if even one document stores the sorted field with the wrong data type. The most common cause is a document that holds a string where the others hold an integer. Also remember that the second item of each tuple passed to sort() is the sort order, with 1 for ASCENDING and -1 for DESCENDING order.

Just the Code

#!/usr/bin/env python3
#-*- coding: utf-8 -*-

# import the MongoClient class from PyMongo
from pymongo import MongoClient

# create a client instance of the MongoClient class
mongo_client = MongoClient('mongodb://localhost:27017')

# create database instance
db = mongo_client["sort_example"]

# a simple find() API call to get documents
cursor = db["Sort-Collection"].find( {} )

# check if Cursor obj has the "sort" attr
print ("Cursor obj has 'sort()':", hasattr(cursor, "sort"))

# verify that MongoDB returned a Cursor object
if hasattr(cursor, "sort"):

    # iterate over the Cursor obj for docs
    for num, doc in enumerate(cursor):
        print(num, "-->", doc, "\n")


# organize MongoDB documents using two sort parameters
# 1 = ASCENDING and -1 = DESCENDING
cursor = db["Sort-Collection"].find( {} ).sort(
[
("age", -1), ("last name", 1) # each sort is a tuple
]
)

# create an empty list to store docs
sorted_docs = []

# iterate over the Cursor object for MongoDB docs
for num, doc in enumerate(cursor):
    # append the dict object to the list
    sorted_docs += [ doc ]

print ( "total docs:", len(sorted_docs) )
