Implementing searching with pattern matching and raw SQL
The possibility of searching through the contents of the database is a very common feature. There are great solutions built with that use case in mind, such as…
October 17, 2022
The possibility of searching through the contents of the database is a very common feature. There are great solutions built with that use case in mind, such as Elasticsearch. Even though that’s the case, PostgreSQL also has the functionality of matching a given string pattern. In this article, we explore what PostgreSQL offers and use this in our NestJS project.
The code from this article is in this repository.
Pattern matching with LIKE#
The idea behind pattern matching is to check if a given string has specific characteristics. The most straightforward way of doing that in PostgreSQL is by using the LIKE operator.
SELECT 'Hi! This is my 1st post.' LIKE 'Hi! This is my 1st post.' -- trueBesides regular text, our pattern can contain the percent sign – %. It matches a sequence of zero or more characters.
SELECT 'Hi! This is my 1st post.' LIKE 'Hi!%'; -- true
SELECT 'I wrote this post.' LIKE '%post.'; -- true
SELECT 'Hi! This is my favourite post.' LIKE 'Hi! % post.'; -- true
SELECT 'This is the 1st time I am writing.' LIKE '%1st%'; -- trueWhen using the LIKE operator, we can also take advantage of the underscore sign – _. It matches a single character.
SELECT 'Hi! This is my 1st post.' LIKE 'Hi! This is my ___ post.'; -- true
SELECT 'Hi! This was a 2nd post.' LIKE 'Hi! This _____ ___ post.'; -- true
SELECT 'Hi! This is my 1st post.' LIKE 'Hi! This _____ ___ post.'; -- trueWe can also use multiple percentages and underscore signs in a single pattern.
SELECT 'This is my 1st post.' LIKE 'This % my ___ %.'; -- true
SELECT 'This was my 2nd article.' LIKE 'This % my ___ %.'; -- trueUsing pattern matching in a real use-case#
Pattern matching is especially useful when performing a SELECT on a table and using WHERE.
SELECT * FROM posts
WHERE title LIKE '%post%';Above, we use the % sign on both the left and right sides of the post string when looking through the title column. This means we are looking for rows that use the post string in any way in the title column. The above is a very common case.
Finding rows that don’t match a pattern#
We can also use NOT LIKE to find rows that don’t match a particular pattern.
SELECT * FROM posts
WHERE title NOT LIKE '%content%';Above, we look for posts with a title that does not contain the word “content”.
Implementing searching in NestJS#
Let’s use the approach with the LIKE operator and two % signs in practice. To do that, let’s expect the user to provide a query parameter.
searchPostsQuery.ts#
import { IsString, IsNotEmpty, IsOptional } from "class-validator"
class SearchPostsQuery {
@IsString()
@IsNotEmpty()
@IsOptional()
search?: string
}
export default SearchPostsQueryOnce we have the above class, we need to use it in our controller.
posts.controller.ts#
import { ClassSerializerInterceptor, Controller, Get, Query, UseInterceptors } from "@nestjs/common"
import { PostsService } from "./posts.service"
import GetPostsByAuthorQuery from "./getPostsByAuthorQuery"
import PaginationParams from "../utils/paginationParams"
import SearchPostsQuery from "./searchPostsQuery"
@Controller("posts")
@UseInterceptors(ClassSerializerInterceptor)
export default class PostsController {
constructor(private readonly postsService: PostsService) {}
@Get()
getPosts(
@Query() { authorId }: GetPostsByAuthorQuery,
@Query() { search }: SearchPostsQuery,
@Query() { offset, limit, idsToSkip }: PaginationParams,
) {
return this.postsService.getPosts(authorId, offset, limit, idsToSkip, search)
} // ...
}We can rely on PostsService to call the correct methods from our repositories.
posts.service.ts#
import { Injectable } from "@nestjs/common"
import PostsRepository from "./posts.repository"
import PostsStatisticsRepository from "./postsStatistics.repository"
import PostsSearchRepository from "./postsSearch.repository"
@Injectable()
export class PostsService {
constructor(
private readonly postsRepository: PostsRepository,
private readonly postsStatisticsRepository: PostsStatisticsRepository,
private readonly postsSearchRepository: PostsSearchRepository,
) {}
getPosts(
authorId?: number,
offset?: number,
limit?: number,
idsToSkip?: number,
searchQuery?: string,
) {
if (authorId && searchQuery) {
return this.postsSearchRepository.searchByAuthor(
authorId,
offset,
limit,
idsToSkip,
searchQuery,
)
}
if (authorId) {
return this.postsRepository.getByAuthorId(authorId, offset, limit, idsToSkip)
}
if (searchQuery) {
return this.postsSearchRepository.search(offset, limit, idsToSkip, searchQuery)
}
return this.postsRepository.get(offset, limit, idsToSkip)
} // ...
}We can delegate the logic of searching through the posts to a separate repository to avoid creating one big file that’s difficult to read.
postsSearch.repository.ts#
import { Injectable } from "@nestjs/common"
import DatabaseService from "../database/database.service"
import PostModel from "./post.model"
@Injectable()
class PostsSearchRepository {
constructor(private readonly databaseService: DatabaseService) {}
async search(offset = 0, limit: number | null = null, idsToSkip = 0, searchQuery: string) {
const databaseResponse = await this.databaseService.runQuery(
`
WITH selected_posts AS (
SELECT * FROM posts
WHERE id > $3 AND concat(post_content, title) LIKE concat('%', $4::text, '%')
ORDER BY id ASC
OFFSET $1
LIMIT $2
),
total_posts_count_response AS (
SELECT COUNT(*)::int AS total_posts_count FROM posts
WHERE concat(post_content, title) LIKE concat('%', $4::text, '%')
)
SELECT * FROM selected_posts, total_posts_count_response
`,
[offset, limit, idsToSkip, searchQuery],
)
const items = databaseResponse.rows.map((databaseRow) => new PostModel(databaseRow))
const count = databaseResponse.rows[0]?.total_posts_count || 0
return {
items,
count,
}
} // ...
}
export default PostsSearchRepositoryA few significant things are happening above. We use pattern matching with both title and post_content columns. We wrap the query provided by the user with the % signs on both ends. To do that, we use the concat function.
It is important to acknowledge that our query might work in an unexpected way if the users puts
%or_characters in their search input. To prevent this, we could sanitize the provided string by prepending all special characters with the\sign.
We also indicate that we want the $4 argument to be treated as a string because the concat() function works with different data types. Without it, PostgreSQL would throw an error.
The ILIKE operator#
The ILIKE operator works in a similar way to LIKE. However, an essential thing about ILIKE is that it is case-insensitive.
SELECT 'Hi! This is my favourite post.' ILIKE 'hI! % pOsT.'; -- trueSince we let the user search for any occurrence of a given string, let’s make it case-insensitive.
postsSearch.repository.ts#
import { Injectable } from "@nestjs/common"
import DatabaseService from "../database/database.service"
import PostModel from "./post.model"
@Injectable()
class PostsSearchRepository {
constructor(private readonly databaseService: DatabaseService) {} // ...
async searchByAuthor(
authorId: number,
offset = 0,
limit: number | null = null,
idsToSkip = 0,
searchQuery: string,
) {
const databaseResponse = await this.databaseService.runQuery(
`
WITH selected_posts AS (
SELECT * FROM posts
WHERE author_id=$1 AND id > $4 AND (title ILIKE concat('%', $5::text, '%') OR post_content ILIKE concat('%', $5::text, '%'))
ORDER BY id ASC
OFFSET $2
LIMIT $3
),
total_posts_count_response AS (
SELECT COUNT(*)::int AS total_posts_count FROM posts
WHERE author_id=$1 AND id > $4 AND (title ILIKE concat('%', $5::text, '%') OR post_content ILIKE concat('%', $5::text, '%'))
)
SELECT * FROM selected_posts, total_posts_count_response
`,
[authorId, offset, limit, idsToSkip, searchQuery],
)
const items = databaseResponse.rows.map((databaseRow) => new PostModel(databaseRow))
const count = databaseResponse.rows[0]?.total_posts_count || 0
return {
items,
count,
}
}
}
export default PostsSearchRepositoryUsing regular expressions#
Using LIKE and ILIKE can cover a lot of use cases with pattern matching. But, unfortunately, not all of them. Sometimes we might need to be more specific when describing the pattern.
Fortunately, PostgreSQL allows us to use regular expressions with the ~ operator.
SELECT 'Hi!' ~ '^[0-9]*$'; -- false
SELECT '123' ~ '^[0-9]*$'; -- trueWe can also make it case-insensitive by using the ~* operator.
SELECT 'Admin' ~* 'admin|user|moderator'; -- trueTo check if a string does not match the regular expression, we can use the !~ operator.
SELECT 'admin' !~ 'admin|user|moderator'; -- false
SELECT 'editor' !~ 'admin|user|moderator'; -- trueWe can also mix it up and check if a string does not match the regular expression and keep it case-insensitive.
SELECT 'ADMIN' !~* 'admin|user|moderator'; -- falseRegular expressions can be handy when the LIKE operator is not enough. Unfortunately, we need to ensure we are writing an expression that does not cause issues with the performance.
The SIMILAR TO operator#
The SQL standard also contains the SIMILAR TO operator. It is a blend of the LIKE operator and regular expressions. Patterns used with SIMILAR TO are similar to regex but use _ and % instead of . and .*.
SELECT '123' SIMILAR TO '[0-9]{1,}'; -- trueThe interesting thing is that PostgreSQL translates the patterns from the SIMILAR TO format to regular expressions.
EXPLAIN ANALYZE
SELECT * FROM posts WHERE title SIMILAR TO '[0-9]{1,}';Because of the above, I suggest writing regular expressions instead of using the SIMILAR TO operator when the LIKE keyword is not enough.
Summary#
In this article, we’ve gone through pattern matching with PostgreSQL. We’ve used it to implement a search feature with NestJS. We also compared the LIKE and SIMILAR TO operators and regular expressions to get a better picture.
There is still more to learn when it comes to searching through text in PostgreSQL, such as the text search types. Stay tuned for more content!