Conquer the Search Realm: Unleashing Postgres' Full Text Search Potential!
Joshua Radin-GrantEvgenia Brusnitskaya6 min read
You’ve been working on your application for a while with Postgres and have realised that it would be great for users to be able to search key values in your database.
Well, look no further! This article will introduce you to Postgres Full Text Search (FTS) - a simple keyword matching engine - tell you why to consider it in your application and give you a full starter guide, covering generation of FTS vectors, queries and vectors on associated records and multiple reference rows.
Before jumping in, lets consider the classic options for search implementations. You’re most likely thinking of search engines such as ElasticSearch and Algolia.
Algolia
Currently one of the most adopted search solutions for most applications. Algolia, is a search-as-a-service platform that provides powerful capabilities to users. Autocomplete, language processing, geo-awareness and many others that provide required business value dependent on the project’s needs and requirements. Unlike other solutions, Algolia provides the search infrastructure which is ideal for businesses that don’t have the resources to manage one. The cost - as this is a commercial product - Algolia’s pricing is done by search requests/month.
Elastic search
For larger scale projects, ElasticSearch allows you to build the search infrastructure yourself, avoiding potential scaling issues and costs. An open-source self-hosted search engine that uses inverted index for faster search capabilities; providing functionality such as relevance scoring, spell check, suggesters and results pinning. A great choice - provided you have the resources to support managing the search infrastructure.
Postgres FTS
Both Algolia and ElasticSearch come with an array of inbuilt features that are tailored to suit the buyer. However, consider a project at the start of its lifetime. You are unsure whether you need spell check, fuzzy search, compatibility rankings, etc. Implementing these things without a reason would be a waste of time and money. Project budgets have a limit; delivering an MVP is the priority.
So, is there a flexible and simple solution that can be leveraged for said projects? Yes, Postgres FTS!
Implementation
How does it work? FTS is a way to perform text queries on text documents, using indexes and queries. Indexing refers to creating a list of searchable terms (indices) in the backend that act as a glossary of key words. The query refers to the text you are searching for. In our solution, the query generation is handled in the frontend, and is sent to the backend to leverage PostgreSQL FTS inbuilt capabilities - matching the indexes to the query and returning the result.
Context:
Lets consider a social media application, where each user has a profile. Every time that user makes a new post, they might add a description or a location. Another user comes along, and starts searching through the feed.
Fig.1
We are tackling three use cases here:
- The user wants to search for posts in a specific location
- The user wants to search for their friends name to see their posts
- The user wants to search for a product and see all posts with that product tagged
So, let’s jump right in!
Step 1: Generating FTS Vectors with Postgres
We’ll start by generating an FTS for a singular table posts.
Postgres allows us to generate FTS indexes based on associate column values of each row.
In the above example, the posts table is being altered. This alteration involves adding a column named posts_fts
.
This column will always generate new search indexes based on the values of the description and location columns in our posts table. These indexes are generated using the to_tsvector
statement.
This was super easy, we now have a posts table in our database with an FTS column that will that will regenerate it’s search indexes whenever the values inside location or description change.
Step 2: Generate a Full Text Search Query
Now that we’ve generated an index, our frontend application can take a search term, generate a full-text search query, and send it off to our Postgres Database. In our example we’re generating a query inside our react application.
An example of a getQuery
could be as simple as this. With basic string operations, we take a user’s rawQuery
- the search term, and separate each word adding a :*
prefix operator to allow for partial word searches.
Great! If we only care about generating search terms based on a singular table in our database then we’ve succeeded and can rest easy.
However… we have a new user requirement.
Users now want to be able to search for posts based on the display name of the poster and the products tagged in the post.
Referring back to Fig.1 profiles and products can be related to multiple posts through user_id
or product_id
foreign key relations. That means updating our posts FTS index to include the related profile.display_name
and product.name
values.
“Great! So I can go back to Step 1 and make a small adjustment right?” - not quite, but… we’re getting there!
Unfortunately, our FTS index generation from step 1 cannot be implemented across reference tables.
Step 3: Generating FTS Vectors with Associate Records
Since Postgres isn’t nice enough to generate indexes through associate tables for us, we’ll have to implement a solution manually. We need our solution to be able to generate an index for tables with associate records on create, update and delete, and update our search table with those indexes.
First we create a function that when triggered will update our post search fts.
This function consists of 2 key sections:
- The first section string aggregates the values of the associate records that we want to be able to search with respect to and assigns them to
cte_search
.
- The second updates our relevant post table’s fts column by generating a new ts vector with our previously selected
cte_search
values.
Step 4: Multiple Reference Rows
But hang on, each post can have multiple products associated with each one. Because of this we need a way to initialise this function and supply it with a relevant post uuid each time a product is updated that is referenced by a post.
To do this we’ll need a function specific to the table reference values we care about, in this case products
, and that function will need to be triggered on insert or update of information in the table.
This products_update_search
function loops through all posts that have an associate product and for each executes our previosly defined post_update_fts
function for each.
For this function to actually trigger we’ll need to utilise Postgres triggers to execute our function after insert or update.
With this implementation every time a new row is inserted (or updated) into the products table, our products_update_search
function is triggered, which in turn triggers our post_update_search
function and update’s our relevant post rows.
Amazing! The user has all the freedom to search for products, other users and descriptions of posts. As far as you are concerned, fts’ are updated automatically on table updates and you can sleep easy.