Use of Random in CockroachDB

Have a Database Problem? Speak with an Expert for Free
Get Started >>

Introduction

In this document, we will learn to use Random in CockroachDB SQL. We’ll begin simple and then build a small imaginary project in order to better understand the scope of this function. Here we will discuss:

  • From a high level, what does the RANDOM function do in CockroachDB?
  • What are a couple situations where we would use the Random() function?
  • How do we use the Random function in our CockroachDB database?
  • We’ll also learn about the Count, Floor, and Generate_Series functions.

What is the RANDOM Cockroach function?

The Cockroach RANDOM() function returns a random numeric value between 0 (including) and 1 (not including).

Random Syntax

1
myRandomNumber := RANDOM()

This will return numbers like 0.03255423, 0.53825552, 0.88232914, etc. All less than 1.

When do we use the RANDOM() function?

Any time you need one or more numbers that are random or not easily predictable, like rolling dice, flipping a coin, or calculating enemy movement in a game. You may want to choose random rows from a table. Or you may actually want to emulate rolling 6-sided dice. Or maybe you want to fill a table with random numbers. We’ll use that last idea as a small practice project.

First, we will back up a little and make sure you fully understand how the RANDOM() function works by itself.

1
2
3
4
5
6
7
8
-- to return a random numeric value (integer) between (and includes) 0 and 9:
FLOOR(RANDOM()*10)

-- to return a random numeric value (integer) between (and includes) 1 and 10:
FLOOR(RANDOM()*10)+1

-- to return a random numeric value (integer) between (and including) intSmallest and intLargest:
FLOOR(RANDOM()*(intLargest-intSmallest)+intSmallest)

Quick note in case the FLOOR() function is not familiar to you: Floor is like the INT function used by many languages. It turns a numeric value into an integer. Another way to see it: It cuts off everything after the decimal point.

Start: Set a goal

Our project goal is to fill a table with random integers. We’ll start by creating a practice table called “tblRandoms”:

1
2
3
4
5
6
7
8
CREATE TABLE public.tblRandoms (
    id serial NOT NULL
    , intNumber int4 NULL DEFAULT 0
    , intRandomNumber int8 NULL DEFAULT 0
    , intTimesPicked int8 NOT NULL DEFAULT 0
    , dateTime TIME NULL DEFAULT now()
);
CREATE UNIQUE INDEX tblRandoms_id_idx ON public.tblRandoms USING btree (id);

The SQL above should give you a table like this:

[CockroachDB Random Function sample table](https://gyazo.com/19d647e80894097aba5de18ee2c125ab “CockroachDB Random Function sample table”)

You may be studying the “tblRandoms” Cockroach table we just created and wondering why certain columns? Let us see:

  • id: It is good Cockroach database design practice to create a table with an indexed id field. In many cases you will need a column like this with a unique indexed column, especially in relational databases where you wish to normalize with foreign keys. Also, we set ours to auto-increment, which is another common practice we recommend.
  • intNumber: This column is purely so we can more easily follow what is going on when we insert and update data in our tblRandoms table. From some perspectives, this column is redundant, but – depending on what you do with this table after this lesson – it could be quite useful. So we chose to include it.
  • intRandomNumber: This is the foundation of our test project. This is the column where we store the random integers we create using the Cockroach RANDOM function.
  • intTimesPicked: This is the column we use to store our calculation when we UPDATE the table with a count (using the COUNT function later) of how many times a specific number came up. We’ll use that in order to double-check just how truly random our numbers are.
  • dateTime: This is another column we put in for your future use of this table, in case you decide to do testing to find out speeds of various techniques of using the RANDOM function in your CockroachDB queries. Here, we are working with small datasets, but you could choose to change the generate_series function we use later to generate many rows instead of the small number we do here in a second. And if you try that, you may want to compare how much time it takes for the SQL to run.

Fill Cockroach table with random numbers

1
2
3
4
5
6
7
8
9
10
11
INSERT INTO
    tblRandoms
    (
    intNumber
    , intRandomNumber
    )
SELECT
    intSeries AS intNumber
    , FLOOR(RANDOM()*100)
FROM
    generate_Series(1,10) AS intSeries;

Analysis of the SQL above:

  • (1) INSERT INTO: This is where we tell CockroachDB’s SQL engine we are going to insert data into two columns (fields) and we are explictly naming those fields; intNumber and intRandomNumber.
  • (2) SELECT: Generating the data to go in the two columns we just mentioned. We’re getting “intSeries” later in the FROM section. The next line, FLOOR(RANDOM()*100) is what gives us a random integer between 0 and 100.
  • (3) FROM: Uses the generate_Series(intBegin, intEnd) function to generate 10 numbers in sequence, like so: 1, 2, 3, 4, 5, 6, 7 8, 9, 10. This will also cause our FROM to generate 10 rows, new rows because of our INSERT statement at the top.

Now, after executing the SQL above, your tblRandoms table should have 10 rows and look just like this, except with different numbers in the “intRandomNumber” column:

[Cockroach table after insert of Random numbers](https://gyazo.com/b861f1c6b4a1f08ee3f05edd37ad9630 “Cockroach table after insert of Random numbers”)

OK. Yay, the numbers in our intRandomNumber column look to be random!

Now, we need to keep in mind we have generated only 10 numbers and really, that’s not enough to know anything about how random our distribution is. Also, we don’t have any duplicates, so no need to utilize our intTimesPicked column yet. So let us generate more rows using the same query as before, except changing the line in our SQL that says generate_Series(1,10) AS intSeries; to use “30” instead of “10”. This will add 30 more rows to our 10-row dataset.

1
2
3
4
5
6
7
8
9
10
11
INSERT INTO
    tblRandoms
    (
    intNumber
    , intRandomNumber
    )
SELECT
    intSeries AS intNumber
    , FLOOR(RANDOM()*100)
FROM
    generate_Series(1,30) AS intSeries;

Now our dataset has 40 records with random numbers. Take a look at your dataset. Are there any duplicate numbers in the intRandomNumber column? If not, run that INSERT query again. Keep running it until your resulting data set has at least one duplicate in the intRandomNumber column. And, while you are at it, you may as well set your “generate_series” function to create a few thousand rows. The more you create, the more likely we will be to find any patterns that may exist once we execute the UPDATE SQL we are going to run next.

Modify CockroachDB table

OK, so now, to (1) appease our curiosity about just how random the RANDOM() function is, and (2) bone up on our advanced SQL writing, we will figure out how often any specific random integer was generated, so we can look for any patterns that may surface. In this case, since we want as truly random as possible; patterns bad. There is an important reason we are going in this direction and you will see if you stay to the end of this lesson.

Execute the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
UPDATE
    tblRandoms AS tblOuter
SET
    intTimesPicked = tblInner.intCountForThisRndNum
FROM
    (
    SELECT
        intRandomNumber
        , COUNT(intRandomNumber) AS intCountForThisRndNum
    FROM
        tblRandoms
    GROUP BY
        intRandomNumber
    ) AS tblInner
WHERE tblOuter.intRandomNumber = tblInner.intRandomNumber

Analysis of the SQL above:

(1) UPDATE – Here we establish which CockroachDB table will be modified, as well as giving it an alias of “tblOuter” so that it can be linked later to the dataset we are generating on the fly that we will alias as “tblInner”. These aliases can be any name you pick. We advise you to pick meaningful names that lend themselves to easy recognition. (2) SET – Fill the “intTimesPicked” column with “intCountForThisRndNum” generated in the SELECT statement below, aliased as tblInner. (3) SELECT – This is the starting of the SQL that generates our aggregate group and count. (4) COUNT – The “meat” of the entire query. COUNT(intRandomNumber) AS intCountForThisRndNum looks at the grouping of random numbers, counts all duplicates, and places that count into a variable aliased (named) as intCountForThisRndNum, which is used above (later, though, in the execute process, ha) in our UPDATE … SET statement. (5) FROM – This establishes the source of the data of our inner “table”. (6) GROUP BY – Group on the intRandomNumber field. (7) AS – Give our “virtual” inner “table” the alias “tblInner” so that data in this dataset can be referred to from other parts of the query. (8) WHERE – Binds our two virtual tables together.

After the update (see data below), notice the intTimesPicked column now has data. Unfortunately, there are no repetitions in the intRandomNumber field. Or… actually, that is fortunate. It means – so far – we have randomness. BUT, as we talked about earlier, that is a small sample; too small to be really meaningful if we are looking for patterns.

idintNumberintRandomNumberintTimesPickeddateTime
115115:22:41
2218115:22:41
3324115:22:41
4446115:22:41
5568115:22:41
6632115:22:41
7769115:22:41
8835115:22:41
997115:22:41
10102115:22:41

And after we added 20 more rows, we get a couple repeated random numbers. See the “2” in a couple spots in our intTimesPicked column?:

idintNumberintRandomNumberintTimesPickeddateTime
115115:22:41
2218115:22:41
3324115:22:41
4446115:22:41
5568115:22:41
6632215:22:41
7769115:22:41
8835115:22:41
997115:22:41
10102115:22:41
11197115:22:41
12238115:22:41
13365115:22:41
14411115:22:41
15512115:22:41
16684115:22:41
17777115:22:41
1886115:22:41
19930115:22:41
201083115:22:41
211132215:22:41
221227115:22:41
231347115:22:41
241488115:22:41
251541115:22:41
261622115:22:41
271734115:22:41
281882115:22:41
291929115:22:41
302079115:22:41

Just how random is the RANDOM function?

Yay; it worked. AND… even though it seems as if we are getting a random distribution, I can tell you from experience that it’s not as random as it could be. Hint: search for CockroachDB’s SETSEED function.

Conclusion

In this lesson, we learned when and how to use the CockroachDB RANDOM function to generate random numeric values in SQL statements. We also worked with some other functions, including FLOOR, GENERATE_SERIES, and COUNT. Code samples included.

Pilot the ObjectRocket Platform Free!

Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.

Get Started

Keep in the know!

Subscribe to our emails and we’ll let you know what’s going on at ObjectRocket. We hate spam and make it easy to unsubscribe.