How to Create a Web Scraper with Mongoose, NodeJS, Axios, and Cheerio - Part 5


Introduction

In this fifth and final part of the tutorial on how to build a web scraper from scratch, we will concentrate on the front-end JavaScript. We glossed over this file in the previous parts just so you could see the app working, but we’d like to dissect it a little more because it is the piece that connects the front-end page to our backend server, app.js.

The index.js File

We’ll show you the entire code and then break it down into pieces. This file lives in the public folder because it is a static resource that our index.html requests from the server and receives unchanged.
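
Before we dig in, it is worth a quick reminder of how the browser gets this file at all: the Express server from the earlier parts serves the public folder as static assets, and index.html pulls index.js in with an ordinary script tag. Here is a minimal sketch of that wiring; the port number and exact structure of your app.js may differ:

// Minimal sketch of the server-side wiring, assuming the layout from the
// earlier parts; your actual app.js has many more routes than this.
var express = require("express");
var app = express();

// Serve everything in /public (index.html, index.js, etc.) as static files
app.use(express.static("public"));

app.listen(3000, function () {
    console.log("App listening on port 3000");
});

// In index.html, near the bottom of the <body> and after the jQuery script tag:
//   <script src="index.js"></script>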

File: /webscraperDemo/public/index.js

$(document).ready(function () {

    /* Create variables to hold needed DOM elements */
    var $scrapeTerm = $("#scrapeTerm");
    var $scrapeButton = $("#scrapeButton");
    var $searchTerm = $("#searchTerm");
    var $searchButton = $("#searchButton");
    var $getAllButton = $("#getAllButton");
    var $tableDiv = $("#tableDiv");

    /* Create API object to make AJAX calls */
    var searchAPI = {

        getAll: function () {
            return $.ajax({
                url: "/idioms",
                type: "GET"
            });
        },

        searchTerm: function (term) {
            return $.ajax({
                url: "/idioms/search/" + term,
                type: "GET"
            });
        },

        scrapeTerm: function (term) {
            return $.ajax({
                url: "/idioms/scrape/" + term,
                type: "POST"
            });
        }

    };


    /* Functions called by Event Listeners */
    var handleScrapeSubmit = function (event) {
        event.preventDefault();

        var searchTerm = $scrapeTerm.val().trim();

        searchAPI.scrapeTerm(searchTerm).then(function (resp) {

            var data = prepareResponseForTable(resp);
            makeTable($tableDiv, data);
        });

        // Clear out scrape field
        $scrapeTerm.val("");
    };


    var handleSearchSubmit = function (event) {

        var searchTerm = $searchTerm.val().trim();

        searchAPI.searchTerm(searchTerm).then(function (resp) {

            var data = prepareResponseForTable(resp);
            makeTable($tableDiv, data);
        });

        // Clear out search field
        $searchTerm.val("");
    };

    var handleGetAll = function (event) {

        searchAPI.getAll()
        .then(function(resp) {
            var data = prepareResponseForTable(resp);
            makeTable($tableDiv, data);

        })
        .catch(function(err) {
            console.log(err);
        });

    };

   
    /* Utilities */
    //  Utility to make a table from a set of data (an array of arrays)
    //  https://www.htmlgoodies.com/beyond/css/working_w_tables_using_jquery.html
    function makeTable(container, data) {
        var table = $("<table/>").addClass('table table-striped');
        $.each(data, function (rowIndex, r) {

            var row = $("<tr/>");
            $.each(r, function (colIndex, c) {
                row.append($("<t" + (rowIndex == 0 ? "h" : "d") + "/>").text(c));
            });
            table.append(row);
        });
        return container.html(table);
    }

    //  Utility to take a response filled with idioms and make it into an array of arrays that is in a format ready for our "makeTable" utility.
    function prepareResponseForTable(response) {
        var data = [];
        data[0] = ["idiom"]; // Row header ( Add more columns if needed )

        response.forEach(function (eachIdiom) {
            //   data.push([eachIdiom._id, eachIdiom.idiom, eachIdiom.link, eachIdiom._v]);
            data.push([eachIdiom.idiom]);
        });

        return data; // Returns an array of arrays for "makeTable"
    }



    /* Event Listeners */
    $scrapeButton.on("click", handleScrapeSubmit);
    $searchButton.on("click", handleSearchSubmit);
    $getAllButton.on("click", handleGetAll);
});

Event Listeners

There are certain events we care about on the page; in our case, they are the click events for our three buttons. The first step in listening for those events is to get references to the elements involved. This is how we do it in code:

$(document).ready(function () {

    /* Create variables to hold needed DOM elements */
    var $scrapeTerm = $("#scrapeTerm");
    var $scrapeButton = $("#scrapeButton");
    var $searchTerm = $("#searchTerm");
    var $searchButton = $("#searchButton");
    var $getAllButton = $("#getAllButton");
    var $tableDiv = $("#tableDiv");

We wait for the document to be ready and then use jQuery to find each of those elements by its id, assigning each one to a variable.

At the end of the file we listen for clicks on these buttons with these lines of code:

    /* Event Listeners */
    $scrapeButton.on("click", handleScrapeSubmit);
    $searchButton.on("click", handleSearchSubmit);
    $getAllButton.on("click", handleGetAll);

When the Scrape button is clicked, it calls the handleScrapeSubmit function, which takes action. How does it take action?
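
Here is that handler from the full listing above. It reads the scrape field, hands the term to the API object (described in the next section), and rebuilds the results table with whatever the server sends back:

    var handleScrapeSubmit = function (event) {
        event.preventDefault();

        // Read and trim whatever the user typed into the scrape field
        var searchTerm = $scrapeTerm.val().trim();

        // Ask the backend to scrape for this term, then redraw the table
        searchAPI.scrapeTerm(searchTerm).then(function (resp) {
            var data = prepareResponseForTable(resp);
            makeTable($tableDiv, data);
        });

        // Clear out scrape field
        $scrapeTerm.val("");
    };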

The API

We created an API object in our index.js that defines functions for making requests to our backend. A function like handleScrapeSubmit uses the scrapeTerm function to accomplish its task. This API object is a workhorse: keeping the calls self-contained like this makes the requests you can send to the backend easy to see, reuse, and call.

    var searchAPI = {

        getAll: function () {
            return $.ajax({
                url: "/idioms",
                type: "GET"
            });
        },

        searchTerm: function (term) {
            return $.ajax({
                url: "/idioms/search/" + term,
                type: "GET"
            });
        },

        scrapeTerm: function (term) {
            return $.ajax({
                url: "/idioms/scrape/" + term,
                type: "POST"
            });
        }

    };
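
Each of these URLs maps to a route on the Express server we built in the earlier parts. As a refresher, the server side of this conversation has roughly the shape sketched below. This is only an outline; the real handlers in app.js use Mongoose for the database work and Axios with Cheerio for the scraping:

// Rough outline of the matching Express routes from the earlier parts.
// "app" is the Express application created in app.js, as sketched earlier.
app.get("/idioms", function (req, res) {
    // Return every idiom stored in MongoDB
});

app.get("/idioms/search/:term", function (req, res) {
    // Return stored idioms that match req.params.term
});

app.post("/idioms/scrape/:term", function (req, res) {
    // Scrape the source site for req.params.term with Axios and Cheerio,
    // save any new idioms with Mongoose, and send them back
});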

Other Utility Functions

In this file we also have a couple of utility functions, makeTable and prepareResponseForTable, that take the results from the server and populate the table with them. Functions like these are worth keeping generic so you can reuse them; that is the same idea behind our API object, which can be called again and again.
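
To make the data flow concrete: if the server responded with two idiom documents, prepareResponseForTable would reduce them to an array of arrays with a header row, and makeTable would turn that into a striped Bootstrap table inside #tableDiv. The documents below are made-up sample data purely for illustration:

    // Hypothetical response from GET /idioms (sample data, not real documents)
    var resp = [
        { idiom: "break the ice", link: "https://example.com/break-the-ice" },
        { idiom: "hit the sack", link: "https://example.com/hit-the-sack" }
    ];

    var data = prepareResponseForTable(resp);
    // data is now: [ ["idiom"], ["break the ice"], ["hit the sack"] ]

    makeTable($tableDiv, data);
    // #tableDiv now holds a <table class="table table-striped"> whose first row
    // uses <th> cells (the header) and whose remaining rows use <td> cells.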

Conclusion

This concludes the final part of our web scraper app tutorial. We delved into and integrated a lot of different technologies, including MongoDB, NodeJS, Axios, Cheerio, Mongoose, Bootstrap, and jQuery. We learned how to create an API and how to call it from the front end using JavaScript. We also learned a lot about integrating MongoDB into a web application, and that it may not be as daunting as you thought, especially with the help of npm libraries like Mongoose.

Thank you so much for joining us on this journey, and if you have any questions, don’t hesitate to reach out.
