---
id: full-text-search
title: "Full Text Search"
description: How to use full text search in PostgreSQL.
---

import Tabs from '@theme/Tabs';
import TabsPanel from '@theme/TabsPanel';

Postgres has built-in functions to handle `Full Text Search` queries. This is like a "search engine" within Postgres.

## Preparation

For this guide we'll use the following example data:

<Tabs
  defaultActiveId="Data"
  values={[
    {label: 'Data', value: 'Data'},
    {label: 'SQL', value: 'SQL'},
  ]}>

<TabsPanel id="SQL" label="SQL">

```sql
create table books (
  id serial primary key,
  title text,
  author text,
  description text
);

insert into books (title, author, description)
values
  ('The Poky Little Puppy','Janette Sebring Lowrey','Puppy is slower than other, bigger animals.'),
  ('The Tale of Peter Rabbit','Beatrix Potter','Rabbit eats some vegetables.'),
  ('Tootle','Gertrude Crampton','Little toy train has big dreams.'),
  ('Green Eggs and Ham','Dr. Seuss','Sam has changing food preferences and eats unusually colored food.'),
  ('Harry Potter and the Goblet of Fire','J.K. Rowling','Fourth year of school starts, big drama ensues.');
```

</TabsPanel>
<TabsPanel id="Data" label="Data">

| id | title | author | description |
| ---- | ---- | ---- | ----------- |
| 1 | The Poky Little Puppy | Janette Sebring Lowrey | Puppy is slower than other, bigger animals. |
| 2 | The Tale of Peter Rabbit | Beatrix Potter | Rabbit eats some vegetables. |
| 3 | Tootle | Gertrude Crampton | Little toy train has big dreams. |
| 4 | Green Eggs and Ham | Dr. Seuss | Sam has changing food preferences and eats unusually colored food. |
| 5 | Harry Potter and the Goblet of Fire | J.K. Rowling | Fourth year of school starts, big drama ensues. |

</TabsPanel>

</Tabs>

## Usage

The functions we'll cover in this guide are:

### `to_tsvector()`

Converts your data into searchable "tokens". `to_tsvector()` stands for "to text search vector". For example:

```sql
select to_tsvector('green eggs and ham');

-- Returns 'egg':2 'green':1 'ham':4
```

Each token is stored with its position in the text. With the `english` configuration, `eggs` is reduced to its stem `egg`, and the stop word `and` is dropped, which is why position 3 is missing.
Collectively these tokens are called a "document" which Postgres can use for comparisons.

### `to_tsquery()`

Converts a query string into "tokens" to match. `to_tsquery()` stands for "to text search query".

This conversion step is important because we will want to "fuzzy match" on keywords. The query words are reduced to stems in the same way as the data, so `to_tsquery('eggs')` returns `'egg'`.
For example, if a user searches for "eggs", and a column has the value "egg", we probably still want to return a match.

### Match: `@@`

The `@@` symbol is the "match" symbol for Full Text Search. It returns true when a `to_tsvector` result matches a `to_tsquery` result.

Take the following example:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select *
from books
where title = 'Harry';
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .eq('title', 'Harry')
```

</TabsPanel>

<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .eq('title', 'Harry')
  .execute();
```

</TabsPanel>
</Tabs>

The equality symbol above (`=`) is very "strict" on what it matches: it only returns rows whose `title` is exactly `Harry`, so it finds nothing in our example data. In a full text search context, we might want to find all "Harry Potter" books, so we can rewrite the example above:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select *
from books
where to_tsvector(title) @@ to_tsquery('Harry');
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('title', `'Harry'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('title', "'Harry'")
  .execute();
```

</TabsPanel>
</Tabs>

## Basic Full Text Queries

### Search a single column

To find all `books` where the `description` contains the word `big`:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
    {label: 'Data', value: 'Data'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  to_tsvector(description)
  @@ to_tsquery('big');
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('description', `'big'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('description', "'big'")
  .execute();
```

</TabsPanel>
<TabsPanel id="Data" label="Data">

| id | title                               | author            | description                                     |
| -- | ----------------------------------- | ----------------- | ----------------------------------------------- |
| 3  | Tootle                              | Gertrude Crampton | Little toy train has big dreams.                |
| 5  | Harry Potter and the Goblet of Fire | J.K. Rowling      | Fourth year of school starts, big drama ensues. |

</TabsPanel>
</Tabs>

### Search multiple columns

To find all `books` where `description` or `title` contains the word `little`:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Data', value: 'Data'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  to_tsvector(description || ' ' || title) -- concat columns, but be sure to include a space to separate them!
  @@ to_tsquery('little');
```

</TabsPanel>
<TabsPanel id="Data" label="Data">

| id | title                 | author                 | description                                 |
| -- | --------------------- | ---------------------- | ------------------------------------------- |
| 1  | The Poky Little Puppy | Janette Sebring Lowrey | Puppy is slower than other, bigger animals. |
| 3  | Tootle                | Gertrude Crampton      | Little toy train has big dreams.            |

</TabsPanel>
</Tabs>

### Match all search words

To find all `books` where `description` contains BOTH of the words `little` and `big`, we can use the `&` symbol:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
    {label: 'Data', value: 'Data'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  to_tsvector(description)
  @@ to_tsquery('little & big'); -- use & for AND in the search query
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('description', `'little' & 'big'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('description', "'little' & 'big'")
  .execute();
```

</TabsPanel>
<TabsPanel id="Data" label="Data">

| id | title  | author            | description                      |
| -- | ------ | ----------------- | -------------------------------- |
| 3  | Tootle | Gertrude Crampton | Little toy train has big dreams. |

</TabsPanel>
</Tabs>

### Match any search words

To find all `books` where `description` contains ANY of the words `little` or `big`, use the `|` symbol:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
    {label: 'Data', value: 'Data'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  to_tsvector(description)
  @@ to_tsquery('little | big'); -- use | for OR in the search query
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('description', `'little' | 'big'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('description', "'little' | 'big'")
  .execute();
```

</TabsPanel>
<TabsPanel id="Data" label="Data">

| id | title                               | author            | description                                     |
| -- | ----------------------------------- | ----------------- | ----------------------------------------------- |
| 3  | Tootle                              | Gertrude Crampton | Little toy train has big dreams.                |
| 5  | Harry Potter and the Goblet of Fire | J.K. Rowling      | Fourth year of school starts, big drama ensues. |

</TabsPanel>
</Tabs>

Notice that matching is case-insensitive (`little` matches `Little`) and is based on word stems rather than substrings. A search for `dream` would match `dreams`, but `big` does not match `bigger`, because the `english` configuration keeps `bigger` as its own token.
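
The `&` and `|` queries above are written by hand. When the search terms come from user input, raw text such as `little big` is not valid `to_tsquery` syntax, so the words need to be joined with an operator first. The following is a minimal sketch of that step. `buildTsQuery` is a hypothetical helper name (it is not part of supabase-js), and stripping everything except letters and digits is an assumption that suits simple keyword search.

```js
// Hypothetical helper (not part of supabase-js): turns free-form user input
// into a query string that to_tsquery() can parse.
function buildTsQuery(input, operator = '&') {
  if (operator !== '&' && operator !== '|') {
    throw new Error(`Unsupported operator: ${operator}`)
  }
  return input
    .split(/\s+/)
    // Assumption: keep letters and digits only, so characters that carry
    // meaning in tsquery syntax (& | ! < > : ' and parentheses) are dropped.
    .map((word) => word.replace(/[^\p{L}\p{N}]/gu, ''))
    .filter((word) => word.length > 0)
    .map((word) => `'${word}'`)
    .join(` ${operator} `)
}

console.log(buildTsQuery('little big'))      // 'little' & 'big'
console.log(buildTsQuery('little big', '|')) // 'little' | 'big'
```

The returned string can be passed as the second argument of `.textSearch()`. An empty input yields an empty string, so callers would likely want to skip the search in that case.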

## Creating Indexes

Now that we have Full Text Search working, let's create an `index`. This will allow Postgres to "build" the documents pre-emptively so that they
don't need to be created at the time we execute the query. This will make our queries much faster.

### Searchable columns

Let's create a new column `fts` inside the `books` table to store the searchable index of the `title` and `description` columns.

We can use a special feature of Postgres called
[Generated Columns](https://www.postgresql.org/docs/current/ddl-generated-columns.html)
to ensure that the index is updated any time the values in the `title` and `description` columns change.

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Data', value: 'Data'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
alter table
  books
add column
  fts tsvector generated always as (to_tsvector('english', description || ' ' || title)) stored;

create index books_fts on books using gin (fts); -- generate the index

select id, fts
from books;
```

</TabsPanel>
<TabsPanel id="Data" label="Data">

| id | fts                                                                                                             |
| -- | --------------------------------------------------------------------------------------------------------------- |
| 1  | 'anim':7 'bigger':6 'littl':10 'poki':9 'puppi':1,11 'slower':3                                                 |
| 2  | 'eat':2 'peter':8 'rabbit':1,9 'tale':6 'veget':4                                                               |
| 3  | 'big':5 'dream':6 'littl':1 'tootl':7 'toy':2 'train':3                                                         |
| 4  | 'chang':3 'color':9 'eat':7 'egg':12 'food':4,10 'green':11 'ham':14 'prefer':5 'sam':1 'unus':8                |
| 5  | 'big':6 'drama':7 'ensu':8 'fire':15 'fourth':1 'goblet':13 'harri':9 'potter':10 'school':4 'start':5 'year':2 |

</TabsPanel>
</Tabs>
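
The `fts` values above are the text form of a `tsvector`: each token appears in quotes, followed by the positions where it occurs. To illustrate that structure, here is a minimal sketch that parses one of those strings in JavaScript. `parseTsVector` is a hypothetical helper, and it assumes simple tokens without embedded quotes or weight labels.

```js
// Hypothetical helper: parses the text form of a tsvector, such as
// "'big':5 'dream':6", into a Map from token to an array of positions.
// Assumption: tokens contain no quotes and positions carry no weight labels.
function parseTsVector(text) {
  const result = new Map()
  const pattern = /'([^']+)':([\d,]+)/g
  for (const [, token, positions] of text.matchAll(pattern)) {
    result.set(token, positions.split(',').map(Number))
  }
  return result
}

// The fts value of the row with id 1 in the table above.
const fts = parseTsVector("'anim':7 'bigger':6 'littl':10 'poki':9 'puppi':1,11 'slower':3")

console.log(fts.get('puppi')) // [ 1, 11 ]
console.log(fts.has('big'))   // false
```

`puppi` has two positions because "Puppy" occurs in both the description and the title, and the row has no `big` token, only `bigger`.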

### Search using the new column

Now that we've created and populated our index, we can search it using the same techniques as before:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
    {label: 'Data', value: 'Data'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  fts @@ to_tsquery('little & big');
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('fts', `'little' & 'big'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('fts', "'little' & 'big'")
  .execute();
```

</TabsPanel>
<TabsPanel id="Data" label="Data">

| id | title  | author            | description                      | fts                                                     |
| -- | ------ | ----------------- | -------------------------------- | ------------------------------------------------------- |
| 3  | Tootle | Gertrude Crampton | Little toy train has big dreams. | 'big':5 'dream':6 'littl':1 'tootl':7 'toy':2 'train':3 |

</TabsPanel>
</Tabs>

## Query Operators

Visit [PostgreSQL: Text Search Functions and Operators](https://www.postgresql.org/docs/current/functions-textsearch.html)
to learn about additional query operators you can use to do more advanced `full text queries`, such as:

### Proximity: `<->`

The proximity symbol is useful for searching for terms that are a certain "distance" apart.
For example, to find the phrase `big dreams`, where a match for "big" is followed immediately by a match for "dreams":

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  to_tsvector(description) @@ to_tsquery('big <-> dreams');
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('description', `'big' <-> 'dreams'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('description', "'big' <-> 'dreams'")
  .execute();
```

</TabsPanel>
</Tabs>

We can also use the more general form `<N>` to find words that are a specific distance apart. For example, to find `year` followed by `school` exactly 2 positions later (as in "year of school"):

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  to_tsvector(description) @@ to_tsquery('year <2> school');
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('description', `'year' <2> 'school'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('description', "'year' <2> 'school'")
  .execute();
```

</TabsPanel>
</Tabs>
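
Both proximity queries above follow one pattern, since `<->` is shorthand for `<1>`. As a rough sketch, the query string could be generated from two words and a distance. `buildFollowedByQuery` is a hypothetical helper (not part of supabase-js), and it assumes both arguments are single plain words.

```js
// Hypothetical helper (not part of supabase-js): builds a "followed by"
// query string for two words that are `distance` positions apart.
// Assumption: `first` and `second` are single words without punctuation.
function buildFollowedByQuery(first, second, distance = 1) {
  if (!Number.isInteger(distance) || distance < 1) {
    throw new RangeError('distance must be a positive integer')
  }
  // <-> is the short form of <1>.
  const operator = distance === 1 ? '<->' : `<${distance}>`
  return `'${first}' ${operator} '${second}'`
}

console.log(buildFollowedByQuery('big', 'dreams'))     // 'big' <-> 'dreams'
console.log(buildFollowedByQuery('year', 'school', 2)) // 'year' <2> 'school'
```

The order of the arguments matters: the second word has to come after the first one in the text.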

### Negation: `!`

The negation symbol can be used to find phrases which _don't_ contain a search term.
For example, to find records that have the word `big` but not `little`:

<Tabs
  defaultActiveId="SQL"
  values={[
    {label: 'SQL', value: 'SQL'},
    {label: 'Javascript', value: 'JS'},
    {label: 'Dart', value: 'DART'},
  ]}>
<TabsPanel id="SQL" label="SQL">

```sql
select
  *
from
  books
where
  to_tsvector(description) @@ to_tsquery('big & !little');
```

</TabsPanel>
<TabsPanel id="JS" label="JS">

```js
const { data, error } = await supabase
  .from('books')
  .select()
  .textSearch('description', `'big' & !'little'`)
```

</TabsPanel>
<TabsPanel id="DART" label="DART">

```dart
final result = await client
  .from('books')
  .select()
  .textSearch('description', "'big' & !'little'")
  .execute();
```

</TabsPanel>
</Tabs>
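
A search form with separate "must contain" and "must not contain" fields maps directly onto this operator. The sketch below shows one way the two lists could be combined. `buildIncludeExcludeQuery` is a hypothetical helper (not part of supabase-js), and it assumes every entry is a single plain word.

```js
// Hypothetical helper (not part of supabase-js): requires every word in
// `include` and rejects every word in `exclude`.
// Assumption: all entries are single words without punctuation.
function buildIncludeExcludeQuery(include, exclude = []) {
  const quote = (word) => `'${word}'`
  return [
    ...include.map(quote),
    ...exclude.map((word) => `!${quote(word)}`),
  ].join(' & ')
}

console.log(buildIncludeExcludeQuery(['big'], ['little']))
// 'big' & !'little'
```

With the example data, this query string matches only the Harry Potter row, whose description contains `big` but not `little`.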

## Resources

* [PostgreSQL: Text Search Functions and Operators](https://www.postgresql.org/docs/12/functions-textsearch.html)