Strpos in TimescaleDB
Introduction
In this lesson, we’ll investigate how to use the Strpos in TimescaleDB to isolate text in multi-word strings, including:
- What? What does the TimescaleDB STRPOS function do?
- How? How do we use this function in our TimescaleDB SQL commands?
- Why? When do we need these functions? We’ve included a “business reality” use case for your learning pleasure.
Some prerequisites
Understanding of building SQL for TimescaleDB (or Postgres, MS SQL Server, etc.) using the free Dbeaver tool or another flexible relational database administration tool for running SQL to get data or make changes to your TimescaleDB database.
The first thing to do is set up a database instance. We prefer using Objectrocket because we like the amount of value and efficiency they provide.
Create a TimescaleDB database instance on ObjectRocket
- (1) Go to kb.objectrocket.com and “Log In”.
- (2) Use left menu to “Create Instance”.
- (3) In the “Name” field, add a descriptive name for your instance.
- (4) Under “Select your service” choose the system you want to use from “CockroachDB, Elasticsearch, PostgreSQL, Redis, TimescaleDB, and MongoDB”.
- (5) Choose the “Cloud Provider” and “Type” you want.
- (6) Select “Version” and “Region”.
- (7) Click the “GO TO STEP 2” button.
- (8) Make a choice in the “Pick Your Flavor” section.
- (9) Choose the “Capacity (Storage/Memory)”. Notice how this choice influences your total fee at the bottom right.
- (10) Click the “GO TO STEP 3” button.
- (11) Under “Add a Whitelist IP” pick “ALOW ANY IP” or “USE MY IP” or “Add my IPs later”.
- (12) Choose between “Master” and “Replica” and click the “ADD” button.
- (13) Now click the “CREATE INSTANCE” button at the bottom.
Before we start writing SQL queries, let’s look at how to use Flask’s psycopg2 to connect to a TimescaleDB instance at ObjectRocket.
Connect to TimescaleDB
1 2 3 4 5 6 7 8 9 | t_dbname = "myTSdb" t_name_user = "tsdbadmin" t_password = "secret" t_sslmode = "require" t_host = "ingress.hkybrhnz.launchpad.objectrocket.cloud" t_port = "4129" connection_string = "dbname=" & t_dbname & " user=" & t_name_user & " password=" & t_password & " host=" & t_host & " port=" & t_port & " sslmode=" & sslmode db_conn = psycopg2.connect(connection_string) db_cursor = db_conn.cursor() |
What is STRPOS?
The TimescaleDB STRPOS function returns the position of a nested substring. So if you provide the function with something like, “TimescaleDB is the best relational database system,” and tell it you need to understand the position in that string of a nested string, the STRPOS() function will return “8” as an integer.
STRPOS syntax
1 | intStringPosition = STRPOS(txtStringOriginal, txtStringToFind); |
STRPOS example
Next, if we need to copy a piece out of one string, we give the STRPOS function two parameters:
- The original string, which we named “txtStringOriginal,” above.
- The string we need to find.
Let’s take another step by filling the STRPOS syntax with some data using TimescaleDB’s console or Python or dBeaver:
1 2 3 4 5 6 | DECLARE txtStringOriginal := 'Python is a sweet coding language.'; BEGIN SELECT intStringPosition := STRPOS(txtStringOriginal, "is"); END |
The example above returns “8” as an integer because “is” begins at the 8th character in the text containing “Python is a versatile scripting language.”.
If we make a small change to the above SQL, we see an example of a Substr type of search that most programmers will use at some point in time; searching for the first instance of a space within a string, so that we can delineate a word:
1 2 3 4 5 6 | DECLARE txtStringOriginal := 'Python is a sweet scripting language.'; BEGIN SELECT intStringPosition := STRPOS(txtStringOriginal, " "); END |
In the example above, our TimescaleDB STRPOS function returns a “7”.
Now examine a slightly deeper example of how STRPOS can work for us. We need to search our TimescaleDB table for the first word in the text kept in a column that may contain multiple words.
Our test table, named tblOurTechnologies, with data:
idItem | txtItem | txtMyCategory |
---|---|---|
0 | Postgres | DB Relational Free |
1 | Mongo | DB NoSQL Flexible |
2 | TimescaleDB | DB Relational Free |
3 | Python | Lang Newish Flexible |
4 | PHP | Lang Old Pervasive |
5 | Java | Lang Pervasive |
With the task being to return the first word in the txtMyCategory column of our table, we could use the LIKE and “%” operators but we need to learn STRPOS. Also, we need to find the numeric position of that first word and do not need to assume we know how long or what that word is, so the LIKE/”%” method won’t work in this case. Here’s the code:
1 2 3 4 | SELECT STRPOS(txtMyCategory, " ") AS intPositionFirstSpace FROM tblOurTechnologies; |
The above query will return the following results:
intPositionFirstSpace |
---|
3 |
3 |
3 |
5 |
5 |
5 |
While the example above may help you know better a method where we can use TimescaleDB’s STRPOS, the data returned above is not very useful because we’re not getting any id-type columns, so that we can understand which technology goes with which category. Keep reading to learn how.
Finding a word with STRPOS
Again, here is the table we named “tblOurTechnologies”:
idItem | txtItem | txtMyCategory |
---|---|---|
0 | Postgres | DB Relational Free |
1 | Mongo | DB NoSQL Flexible |
2 | TimescaleDB | DB Relational Free |
3 | Python | Lang Newish Flexible |
4 | PHP | Lang Old Pervasive |
5 | Java | Lang Pervasive |
The idea we spoke of above about our first example that returns only one column of data that may not be so useful without an id or name of what database or scripting technology is being referred to. So we’ll now modify that query now:
An SQL script that acquires the information we need from the TimescaleDB would look like:
1 2 3 4 5 6 | SELECT idItem , txtItem , STRPOS(txtMyCategory, " ") AS intPositionFirstSpace FROM tblOurTechnologies; |
Here are the rowset of results TimescaleDB returns to us, giving us idItem, txtItem, and intPositionFirstSpace for each record:
idItem | txtItem | intPositionFirstSpace |
---|---|---|
0 | Postgres | 3 |
1 | Mongo | 3 |
2 | TimescaleDB | 3 |
4 | Python | 5 |
5 | PHP | 5 |
6 | Java | 5 |
Now we understand where the first space is. That means we understand the first word is defined by position 1 through position (intPositionFirstSpace – 1), right?
Now we’ll step through the logic of how to do this and then take it a step further:
- Use STRPOS to retrieve intPositionFirstSpace.
- Subtract 1 from intPositionFirstSpace so we retrieve the length of that first word.
- Use SUBSTRING to isolate the new word.
Here’s the script to attain that task. But first we’ll list the full contents of our original table named tblOurTechnologies, for your simplicity:
idItem | txtItem | txtMyCategory |
---|---|---|
0 | Postgres | DB Relational Free |
1 | Mongo | DB NoSQL Flexible |
2 | TimescaleDB | DB Relational Free |
3 | Python | Lang Newish Flexible |
4 | PHP | Lang Old Pervasive |
5 | Java | Lang Pervasive |
1 2 3 4 5 6 7 | SELECT idItem , txtItem , STRPOS(txtMyCategory, " ") AS intPositionFirstSpace , SUBSTRING(txtMyCategory, 1, intPositionFirstSpace - 1) FROM tblOurTechnologies; |
Miscellaneous
Strpos has similar functionality but different syntax as TimescaleDB’s Position function.
Conclusion
In this lesson document we seed to use STRPOS() with TimescaleDB SQL. In piece of this lesson, we used realistic examples. Finally, we investigated some other powerful string related functions like SUBSTRING as well as the commonly used SELECT statement. Code samples included.
Pilot the ObjectRocket Platform Free!
Try Fully-Managed CockroachDB, Elasticsearch, MongoDB, PostgreSQL (Beta) or Redis.
Get Started