A quick tutorial

In our detailed article on Full Text Search (FTS) in PostgreSQL we compare search-engines like Elasticsearch or Solr to PostgreSQL FTS, go through configurations, weighting, indexing, briefly explaining the GIN (Generalised Inverted Index), core functions, and ranking results.

Here we want to give a quick tutorial on FTS with some real world examples. To run yourself checkout this repo.

What You’ll Learn

This tutorial walks you through building a full-text search system in PostgreSQL. You’ll start by configuring language settings, then create efficient search indexes—first with a simple expression, then with a more powerful generated column.

We’ll add weighted indexing to prioritise certain fields (like giving titles more weight than descriptions), and use ts_rank to control how results are ranked. Finally, we’ll explore the different query functions PostgreSQL offers, from simple keyword searches to structured and phrase-based queries.

By the end, you’ll know how to index, rank, and query text effectively using PostgreSQL’s full-text search features.

The Dataset

We’ll be using an extract from the imdb-dataset, found on Kaggle imdb-dataset-of-top-1000-movies-and-tv-shows with different columns on movies and series:

  • Poster_Link - Link of the poster that imdb using
  • Series_Title = Name of the movie
  • Released_Year - Year at which that movie released
  • Certificate - Certificate earned by that movie
  • Runtime - Total runtime of the movie
  • Genre - Genre of the movie
  • IMDB_Rating - Rating of the movie at IMDB site
  • Overview - mini story/ summary
  • Meta_score - Score earned by the movie
  • Director - Name of the Director
  • Star1,Star2,Star3,Star4 - Name of the Stars
  • No_of_votes - Total number of votes
  • Gross - Money earned by that movie

We’ll be focussing on the main text columns: series_title and overview.

series_titleoverview
The Shawshank RedemptionTwo imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency.
The GodfatherAn organized crime dynasty’s aging patriarch transfers control of his clandestine empire to his reluctant son.

To get a Postgres instance running and importing the dataset follow this readme.

Language Config

You can set a default for your session or database:

SET default_text_search_config = 'english';

Or always specify the language in your tsvector() and tsquery() statements.

Creating and Indexing Text Searchable Columns

tsvector as a plain Concatenation

PostgreSQL offers multiple ways to implement full-text search—either inline with an expression index or by using a generated column. Here’s a quick comparison of the two approaches:

Option 1: Expression-Based FTS Index

CREATE INDEX imdb_movies_idx 
ON imdb_movies 
USING GIN (
  to_tsvector('english', series_title || ' ' || overview)
);

This creates a GIN index directly on a computed tsvector, built from series_title and overview. It doesn’t add any new columns — the tsvector is computed at query time, then matched against the index.

Option 2: Stored Generated Column with FTS Index

ALTER TABLE imdb_movies
ADD COLUMN textsearchable_index_col tsvector
GENERATED ALWAYS AS (
  to_tsvector('english', coalesce(series_title, '') || ' ' || coalesce(overview, ''))
) STORED;
 
CREATE INDEX textsearch_idx 
ON imdb_movies 
USING GIN (textsearchable_index_col);

This adds a persistent tsvector column that is automatically generated and updated by PostgreSQL whenever the row changes. The index is built on this stored column.

Pros of Option 2 (Generated Column)

  • Stored once, indexed once: The tsvector is computed only once per row insert/update — not at query time.
  • Faster queries: Queries skip on-the-fly tsvector generation.
  • Reusable: The column can be used in queries, materialised views, triggers, etc.
  • Debuggable: Easier to inspect and troubleshoot — you can SELECT it directly.

Cons of Option 2

  • Adds schema complexity: Extra column in your table.
  • Increased storage: The column consumes additional disk space.
  • Less flexible: Changing the logic requires dropping and recreating the column.

Use Option 1 for simplicity and flexibility, especially during prototyping. Use Option 2 for performance, reusability, and maintainability in production setups. We’ll be using Option 2 in the following.

tsvector with prefixed weights

You can also assign weights to different parts of a document to indicate their relative importance in ranking search results.

PostgreSQL allows you to label tokens in the tsvector with weights A, B, C, or D, where A is considered the most important and D the least. The default weights for A, B, C, and D are {1.0, 0.4, 0.2, 0.1} .

This is especially useful when your document has structured fields/columns like a title, body, or keywords, and you want matches in more important fields (e.g., the title) to rank higher. Or-in our case-we give series_title the higher weight A and overview a lower weight B.

ALTER TABLE imdb_movies
ADD COLUMN textsearchable_index_col_weighted tsvector
GENERATED ALWAYS AS (
	setweight(to_tsvector('english', coalesce(series_title, '')), 'A') ||
	setweight(to_tsvector('english', coalesce(overview, '')), 'B')
) STORED;
 
CREATE INDEX textsearch_idx_weighted
ON imdb_movies USING GIN (textsearchable_index_col_weighted);

Now we have two generated and indexed tsvector columns in our table:

  • purely concatenated text textsearchable_index_col
  • weighted: series_title weighted higher than overview: textsearchable_index_col_weighted

Ranking Full-Text Search Results in PostgreSQL with Weights

Let’s explore how to rank search results in PostgreSQL using ts_rank and custom weight vectors to influence relevance. We’ll break down this query line-by-line:

SELECT
  series_title,
  overview,
  ts_rank(textsearchable_index_col_weighted, query) AS rank,
  ts_rank('{0.1, 0.1, 1.0, 0.1}', textsearchable_index_col_weighted, query) AS rank_weights_inverted
-- {0.1, 0.2, 0.4, 1.0} are the default weights for D, C, B, A
FROM
  imdb_movies,
  plainto_tsquery('fish') query
WHERE query @@ textsearchable_index_col_weighted
ORDER BY rank DESC
LIMIT 10;

Line-by-Line Explanation

We’re selecting two text columns from the imdb_movies table for display — this gives context to each result, series_title and overview.

ts_rank(textsearchable_index_col_weighted, query) AS rank, This computes the default relevance score (rank) of each document using PostgreSQL’s built-in ranking function.

  • textsearchable_index_col_weighted is the generated tsvector that contains weighted tokens, 'A' for series_title and 'B' for overview
  • query is a tsquery (more on that below).
  • ts_rank() gives higher scores to matches found in higher-weight fields (A > B > C > D).

ts_rank('{0.1, 0.1, 1.0, 0.1}', textsearchable_index_col_weighted, query) AS rank_weights_inverted

  • Here’s the cool part: we override the default weights.
  • {0.1, 0.1, 1.0, 0.1} corresponds to weights {D, C, B, A}
  • We’re saying: give priority to B-weighted tokens, downplaying A, C, and D. This way the overview weighs more than the title—the inverse of the default that we set above.

FROM imdb_movies, plainto_tsquery('fish') query

  • We pull from the imdb_movies table and use a plain text search term ('fish') to create a tsquery: plainto_tsquery('fish') tokenises and normalises the input string using the configured language.
  • We alias the result as query, so it can be reused in the SELECT clause and WHERE condition.

WHERE query @@ textsearchable_index_col_weighted This filters the rows to only those where the tsvector matches the tsquery.

  • @@ is the match operator: returns true when the query matches the indexed text.

We finally sort results by the default rank in descending order (highest relevance first) and keep top 10.

Output

series_titleoverviewrankrank_weights_inverted
Big FishA frustrated son tries to determine the fact from fiction in his dying father’s life.0.60792710.06079271
The Bourne IdentityA man is picked up by a fishing boat… suffering from amnesia…0.243170840.6079271

See how inverting the weights flips the ranking order?

  • With default weights, title matches (A) dominate — Big Fish ranks higher.
  • With inverted weights, the overview match (B) dominates — The Bourne Identity jumps ahead.

Key Takeaways

  • PostgreSQL’s ts_rank lets you tune result relevance with weighting vectors.
  • Default weight order is: {D, C, B, A} = {0.1, 0.2, 0.4, 1.0}
  • By inverting weights, you can prioritize body content over title or metadata.

Want to boost relevance of user-generated descriptions? Or downplay noisy titles? Now you know how.

Building FTS Queries in PostgreSQL

PostgreSQL offers several functions to construct tsquery objects for full-text search. Each one serves a different purpose—whether you’re handling raw user input, structured queries, or phrase searches. Here’s a quick comparison using series_title and overview from the imdb_movies table:

SELECT
	series_title,
	overview,
	ts_rank(textsearchable_index_col, query) AS rank
FROM
	imdb_movies,
	THE_EXAMPLE_QUERY query
WHERE query @@ textsearchable_index_col_weighted
ORDER BY rank DESC
LIMIT 10;

to_tsquery: Structured Query Logic

Use this when you want full control over logical operators (&, |, !, <->).

SELECT to_tsquery('english', 'fish & memory');
-- 'fish' & 'memori'

Good for advanced filtering or programmatically generated queries.

Finds: The Bourne Identity:A man is picked up by a fishing boat, bullet-riddled and suffering from amnesia, before racing to elude assassins and attempting to regain his memory.

plainto_tsquery: Simple AND Searches

Great for clean user input or when you just want to match all words.

SELECT plainto_tsquery('english', 'robot space journey');
-- 'robot' & 'space' & 'journey'

Ideal for basic search bars without needing advanced syntax.

Finds: WALL·E: In the distant future, a small waste-collecting robot inadvertently embarks on a space journey that will ultimately decide the fate of mankind.

phraseto_tsquery: Exact Phrase Matching

Preserves word order with <->, perfect for looking up movie taglines or quotes.

SELECT phraseto_tsquery('english', 'space journey');
-- 'space' <-> 'journey'

Also respects stop word positions for accurate phrase structure.

Finds WALL·E again.

websearch_to_tsquery: Search Engine Style

Most intuitive for users. Handles quoted phrases, -NOT, and or logic naturally.

SELECT websearch_to_tsquery('english', '"space journey" -robot');
-- 'space' <-> 'journey' & !'robot'

Perfect for building Google-style search experiences in web apps.

Finds no results, because the only movie with the phrase “space journey” is WALL·E , but it also contains ‘robot’, which is excluded in the query.

Summary

FunctionPurposeHandles OperatorsPreserves Word OrderBest For
to_tsqueryPrecise logic & filteringOptional (<->)Power users, advanced filters
plainto_tsqueryQuick AND queries❌ (auto &)Simple inputs, basic search bars
phraseto_tsqueryPhrase and proximity searches❌ (auto <->)Quotes, taglines, search accuracy
websearch_to_tsquerySearch-engine-like inputEnd-user queries, web apps

Wrapping Up

PostgreSQL’s full-text search is powerful, flexible, and surprisingly easy to set up. With just a few lines of SQL, you can build fast, ranked search over rich text data—no external search engine required. Whether you’re prototyping or scaling a production system, FTS in Postgres gives you full control over how text is indexed, weighted, and queried. Try it out, tweak the weights, and see how your results change—it’s all right there at your fingertips.

For in depth article see here .