Migrate MongoDB to CockroachDB

Have a Database Problem? Speak with an Expert for Free
Get Started >>

Introduction

In this instructional article, we will migrate MongoDB to CockroachDB with Python codeing. We will hold your hand every step of the way, starting out with the libraries you will have to install and reference in your Python scripts in order to open a Mongo database and read from it, as well as the Python libraries needed to open and write to your CockroachDB data.

Assumptions and prerequisites

We will assume you have a basic understanding of the following technologies:

  • Python.
  • MongoDB.
  • CockroachDB.

To start out, let’s have a look at the test database – in JSON – that we want to migrate to CockroachDB.

MongoDB structure and data

Here are some documents (rows/records) from the users collection (table) from our Mongo database called “testDB”.


GeSHi Error: GeSHi could not find the language js (using path /nas/content/live/orkbprod/wp-content/plugins/codecolorer/lib/geshi/) (code 2)

Please keep in mind: We have given the above Mongo JSON-formatted collection a consistent structure, which you can not always count on when you are using a NoSQL database like Mongo.

Python libraries

In order to read that data from Mongo using Python, we will reference the pymongo library and MongoClient so we can eventually import into CockroachDB.

1
2
import pymongo
from pymongo import MongoClient

Mongo connection

Now we will set up Mongo credentials and create a connection to that database.

1
2
3
4
5
6
7
8
txtHostURL = "localhost"
txtPortNum = "27017" # Mongo port (double-check to make sure this is the port your Mongo database is running on)
txtDBname = "testDB"
t_user = "mongoadmin"
t_pw = "[Mongo database user password here]"
client = MongoClient("mongodb://" + t_user + ":" + t_pw + "@" + txtHostURL + ":" + txtPortNum + "/" + txtDBname)
db_mongo = client.testDB
users_mongo = db_mongo.users

Analysis

Note that here we have created an object called “users_mongo” for referencing the “users” collection. “Collections” in NoSQL are like tables in SQL. Later we will cycle through that table, focusing on two fields, “name” and “age”.

Now we set up CockroachDB credentials and create a connection to the SQL database we will use to write to later.

1
2
3
4
5
6
7
8
9
10
11
import psycopg2
txtDBname = "db name"
txtNameUser = "db user name"
txtSSLmode = "auto"
txtSSLrootCert = 'certifs/ca.crt'
t_sslkey = 'certifs/client.maxroach.key'
t_sslcert = 'certifs/client.maxroach.crt'
txtHostURL = "localhost"
txtPortNum = "26251"
connCRDB = psycopg2.connect(database=txtDBname, user=txtNameUser, sslmode=txtSSLmode, sslrootcert=txtSSLrootCert, sslkey=t_sslkey, sslcert=t_sslcert, host=txtHostURL, port=txtPortNum)
cursorCRDB = connCRDB.cursor()

Analysis

Above, we created the “cursorCRDB” object for later use to execute SQL statements.

Next we will set up two arrays, one for user names and one for user age.

1
2
arrUsersNames = []
arrUsersAges = []

Then we read all three rows (documents) that exist in the users collection from Mongo and copy them into a Python array.

1
2
3
4
results_mongoDB = users_mongo.find()
for row in results_mongoDB:
    arrUsersNames.append(row['name'])
    arrUsersAges.append(row['age'])

Analysis

  • “find()”: With no parameters, this function pulls all rows (documents) into a Python collection we are calling “results_mongoDB”.
  • “for…”: This is a FOR loop to move through the 3 rows in the “results_mongoDB” collection.
  • “append(row[‘name’])”: Adding an item to the array we use to store user names for later adding to our Cockroach database.
  • “append(row[‘age’])”: Adding an item to the array we use to store user ages for insertion into our CockroachDB users table.

Next, we build the SQL we will end up sending to Cockroach via the cursor we created earlier.

1
2
3
4
5
6
7
8
9
10
11
s = ""
s += "INSERT INTO tblUsers "
s += "("
s += "txtName"
s += ", intAge"
s += ")"
s += " VALUES"
s += "("
s += "'" + arrUsersNames + "'"
s += "," + arrUsersAges
s += ")"

Analysis

  • “INSERT INTO tblUsers…”: This will add a record as a new row to tblUsers.
  • “txtName” and “intAge”: We’re preparing CockroachDB to receive two columns of data and telling it where to place them.
  • “VALUES (“: This tells the Cockroach database to insert the values that follow into “tblUsers”.
  • “arrUsersNames” and “arrUsersAges”: Supplies the actual values to insert, which in this case are the two arrays we populated from the MongoDB source earlier.

Finally, scripts execution, error checking, cleanup, and redirect.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# Here we are catching and showing any errors that might happen
#   while TRYing to commit the running of our SQL code.
cursorCRDB.execute(s)
try:
    connCRDB.commit()
except psycopg2.Error as e:
    # Create a message to send to the user with some data, including
    # the error message and the SQL that caused the error.
    txtMessage = "Database error: " + e + "/n SQL: " + s
    # Notice: you may want to create a template page called "errors.html" for handling errors.
    # We are also sending a parameter along with our call to that HTML page.
    return render_template("errors.html", message = txtMessage)

# They got this far, meaning it was successful.
# Clean up our Cockroach database connection and cursor object
cursorCRDB.close()
connCRDB.close()

# Redirect user to the rest of your application
return redirect("https://your-site-address-here", code=302)

Python source

Here’s the final version of your Python scripts! IMPORTANT: When we say “learn about the scripts below,” this is important for more reasons than one. For example, you may have to install some new Python libraries, namely “flask”, “psycopg2”, “pymongo”, and “MongoClient”. You definitely need to reference those libraries as you see here:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
from flask import Flask
from flask import render_template # used later to send user to error page.
import psycopg2 # for Cockroach database connection
import pymongo # for MongoDB database connection
from pymongo import MongoClient

app = Flask(__name__)

# Mongo Database credentials
txtHostURL = "localhost"
txtPortNum = "27017" # Mongo port
txtDBname = "testDB"
t_user = "mongoadmin"
t_pw = "[Mongo database user password here]"
client = MongoClient("mongodb://" + t_user + ":" + t_pw + "@" + txtHostURL + ":" + txtPortNum + "/" + txtDBname)
db_mongo = client.testDB
users_mongo = db_mongo.users

# Cockroach Database credentials
txtDBname = "db name"
txtNameUser = "db user name"
txtSSLmode = "auto"
txtSSLrootCert = 'certifs/ca.crt'
t_sslkey = 'certifs/client.maxroach.key'
t_sslcert = 'certifs/client.maxroach.crt'
txtHostURL = "localhost"
txtPortNum = "26251"
connCRDB = psycopg2.connect(database=txtDBname, user=txtNameUser, sslmode=txtSSLmode, sslrootcert=txtSSLrootCert, sslkey=t_sslkey, sslcert=t_sslcert, host=txtHostURL, port=txtPortNum)
cursorCRDB = connCRDB.cursor()

def migrate():
    # Build arrayRecords here from Mongo users collection
    arrUsersNames = []
    arrUsersAges = []

    # Pull all documents (rows) from users collection (table)
    results_mongoDB = users_mongo.find()
    for row in results_mongoDB:
        arrUsersNames.append(row['name'])
        arrUsersAges.append(row['age'])

    # Your validation of data would go in this spot.

    # Clean up the database collection connection
    results_mongoDB.close()

    # SEND records to the Cockroach database
    s = ""
    s += "INSERT INTO tblUsers "
    s += "("
    s += "txtName"
    s += ", intAge"
    s += ")"
    s += " VALUES"
    s += "("
    s += "'(%arrUsersNames)'"
    s += ",(%arrUsersAges)
    s += "
)"

    # Here we are catching and displaying any errors that occur
    #   while TRYing to commit the execution of our SQL code.
    cursorCRDB.execute(s, [arrUsersNames, arrUsersAges])
        try:
            connCRDB.commit()
        except psycopg2.Error as e:
            # Create a message to send to the user with valuable data, including
            #    the error message and the SQL that caused the error.
            txtMessage = "
Database error: " + e + "/n SQL: " + s
            # Notice: you may want to create a template page called "
errors.html" for handling errors.
            #   We are also sending a parameter along with our call to the page.
            return render_template("
errors.html", message = txtMessage)

        # They got this far, meaning success
        # Clean up our Cockroach database connection and cursor object
        cursorCRDB.close()
        connCRDB.close()

    # Redirect user to the rest of your Mongo/Cockroach/Python application
    return redirect("
https://your-site-address-here", code=302)

if __name__ == "
__migrate__":
    migrate()

Miscellaneous

In the interest of wanting to keep it less complex, we left out data validation. Meaning, we recommend you validate the data received from the MongoDB before attempting to insert it into your Cockroach database. This is especially important when dealing with “structureless” NoSQL database systems where you can not count on the data being consist as much as you can with SQL relational databases. So please be aware of this and check for NULL!

Conclusion

In this instructional article, we walked you through learning how to Migrate MongoDB into Cockroach using Python. We showed script samples along the way and explained in detail how and why to take each step. We even had exposure to JSON. Also, for part of this tutorial document, we used the CockroachDB INSERT statement. In addition, we used Python Arrays, along with the FOR loop.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.