Health checks with Terminus and Amazon ECS
March 6, 2023
Previously, we learned how to use the Elastic Container Service to deploy multiple instances of our application. With this architecture, we maintain the target group, where each target is a single instance of our application. Thanks to that, the load balancer can route a particular API request to one of the registered targets.
Before routing traffic to a particular target, the load balancer must know whether the target can handle it. To determine that, the load balancer periodically sends requests to all registered targets to test them. We call those tests health checks. Thanks to them, the load balancer routes traffic only to healthy targets.
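To make the idea more concrete, here is a minimal sketch of how a load balancer might keep track of target health. It is a simplified model and not how AWS implements it: the Target type, the recordProbe and routableTargets functions, and the threshold values are all assumptions made for illustration. The only idea it borrows is that a target changes state after a number of consecutive successful or failed checks.

```typescript
// Simplified, hypothetical model of how a load balancer tracks target health.
type Target = {
  id: string;
  healthy: boolean;
  consecutiveSuccesses: number;
  consecutiveFailures: number;
};

// Illustrative thresholds; real values are configured on the target group.
const HEALTHY_THRESHOLD = 2;
const UNHEALTHY_THRESHOLD = 2;

// A newly registered target starts out as not yet healthy.
function newTarget(id: string): Target {
  return { id, healthy: false, consecutiveSuccesses: 0, consecutiveFailures: 0 };
}

// Record the outcome of one health check request sent to a target.
function recordProbe(target: Target, succeeded: boolean): Target {
  const consecutiveSuccesses = succeeded ? target.consecutiveSuccesses + 1 : 0;
  const consecutiveFailures = succeeded ? 0 : target.consecutiveFailures + 1;

  let healthy = target.healthy;
  if (consecutiveSuccesses >= HEALTHY_THRESHOLD) healthy = true;
  if (consecutiveFailures >= UNHEALTHY_THRESHOLD) healthy = false;

  return { ...target, healthy, consecutiveSuccesses, consecutiveFailures };
}

// Only healthy targets are eligible to receive traffic.
function routableTargets(targets: Target[]): Target[] {
  return targets.filter((target) => target.healthy);
}
```

In this model, a single failed check does not remove a target from rotation. It takes UNHEALTHY_THRESHOLD failures in a row, which protects against reacting to one slow response.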
A common approach is to create a designated endpoint that responds with the status of the application. To create it, we can use Terminus, a tool that NestJS provides through the @nestjs/terminus package.
Using Terminus#
Let’s start by installing the Terminus library.
npm install @nestjs/terminus

To introduce an endpoint using Terminus, we should create a new controller.
health.controller.ts#
import { Controller, Get } from "@nestjs/common"
import { HealthCheckService, HealthCheck } from "@nestjs/terminus"

@Controller("health")
class HealthController {
  constructor(private healthCheckService: HealthCheckService) {}

  @Get()
  @HealthCheck()
  check() {
    return this.healthCheckService.check([])
  }
}

export default HealthController

The @HealthCheck() decorator is optional. As we can see under the hood, it allows for integrating Terminus with Swagger.
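The most important thing above is the healthCheckService.check method. It takes an array of health indicators to run. Since we pass an empty array for now, the endpoint only confirms that the application is up and able to respond to requests.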
Built-in health indicators#
We can perform more advanced checks using the health indicators built into NestJS. With them, we can verify a particular aspect of our application.
A very good example is the TypeOrmHealthIndicator. Under the hood, it performs a simple SELECT SQL query to verify that our database is up and running. Doing that also ensures we’ve established a connection successfully.
There are also the MikroOrmHealthIndicator, SequelizeHealthIndicator, and MongooseHealthIndicator if you are using some other ORM than TypeORM.
health.controller.ts#
import { Controller, Get } from "@nestjs/common"
import { HealthCheckService, HealthCheck, TypeOrmHealthIndicator } from "@nestjs/terminus"

@Controller("health")
class HealthController {
  constructor(
    private healthCheckService: HealthCheckService,
    private typeOrmHealthIndicator: TypeOrmHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  check() {
    return this.healthCheckService.check([() => this.typeOrmHealthIndicator.pingCheck("database")])
  }
}

export default HealthController

The healthCheckService.check method responds with a few properties:
status - if all of our health indicators report success, it equals ok. Otherwise, it can be error or shutting_down. If the status is not ok, the endpoint responds with 503 Service Unavailable instead of 200 OK.
info - has data about each healthy indicator
error - contains information about every unhealthy indicator
details - has data about every indicator
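The relationship between these properties can be sketched in a few lines of plain TypeScript. This is not the Terminus implementation, only an illustration of the response described above: the buildResponse function, the type names, and the storage indicator with its message are assumptions, and the shutting_down status is left out for brevity.

```typescript
// Hypothetical types that mirror the shape of the response described above.
type IndicatorStatus = "up" | "down";
type IndicatorResult = Record<string, { status: IndicatorStatus; [key: string]: unknown }>;

type HealthResponse = {
  status: "ok" | "error";
  info: IndicatorResult;
  error: IndicatorResult;
  details: IndicatorResult;
};

// Combine the results of all indicators into one response.
function buildResponse(results: IndicatorResult[]): { httpStatus: number; body: HealthResponse } {
  const info: IndicatorResult = {};
  const error: IndicatorResult = {};

  for (const result of results) {
    for (const [name, value] of Object.entries(result)) {
      // Healthy indicators go to "info", unhealthy ones to "error".
      if (value.status === "up") info[name] = value;
      else error[name] = value;
    }
  }

  const allHealthy = Object.keys(error).length === 0;

  return {
    httpStatus: allHealthy ? 200 : 503,
    body: {
      status: allHealthy ? "ok" : "error",
      info,
      error,
      // "details" always lists every indicator, healthy or not.
      details: { ...info, ...error },
    },
  };
}

// One healthy and one unhealthy indicator (the second one is made up).
const response = buildResponse([
  { database: { status: "up" } },
  { storage: { status: "down", message: "Disk is almost full" } },
]);
```

With the input above, a single failing indicator is enough to turn the overall status into error and the HTTP status into 503, which is what the load balancer reacts to.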
Terminus offers more health indicators than just those related to the database:
HttpHealthIndicator - allows us to make an HTTP request and verify if it’s working as expected
MemoryHealthIndicator - verifies if the process does not exceed a specific memory limit
DiskHealthIndicator - checks how much storage our application uses
MicroserviceHealthIndicator - ensures a given microservice is up
GRPCHealthIndicator - verifies if a service is working as expected using the standard health check specification of gRPC
Custom health indicators#
The above list contains health indicators for various ORMs. However, we’ve also worked with raw SQL without any ORM.
Fortunately, we can set up a custom health indicator. To do that, we need to extend the HealthIndicator class.
databaseHealthIndicator.ts#
import { Injectable } from "@nestjs/common"
import { HealthIndicator, HealthIndicatorResult, HealthCheckError } from "@nestjs/terminus"
import DatabaseService from "../database/database.service"

@Injectable()
class DatabaseHealthIndicator extends HealthIndicator {
  constructor(private readonly databaseService: DatabaseService) {
    super()
  }

  async isHealthy(): Promise<HealthIndicatorResult> {
    try {
      await this.databaseService.runQuery("SELECT 1")
      return this.getStatus("database", true)
    } catch (error) {
      throw new HealthCheckError(
        "DatabaseHealthIndicator failed",
        this.getStatus("database", false),
      )
    }
  }
}

export default DatabaseHealthIndicator

The this.getStatus method generates the health indicator result that ends up in the info, error, and details objects.
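To picture the shape of that result, here is a standalone stand-in for getStatus. It is written only to illustrate the output: the buildStatus function, the StatusResult type, and the message field are assumptions and not part of Terminus. The real getStatus also accepts optional extra data as its third argument, which this sketch imitates.

```typescript
// Hypothetical stand-in that imitates the result produced by getStatus.
type StatusResult = Record<string, { status: "up" | "down"; [key: string]: unknown }>;

function buildStatus(
  key: string,
  isHealthy: boolean,
  data: Record<string, unknown> = {},
): StatusResult {
  // The result is keyed by the indicator name and carries its status plus any extra data.
  return { [key]: { ...data, status: isHealthy ? "up" : "down" } };
}

// A healthy result: { database: { status: "up" } }
const healthy = buildStatus("database", true);

// An unhealthy result with an illustrative, made-up error message attached.
const unhealthy = buildStatus("database", false, { message: "Connection refused" });
```

Attaching data such as the message of the caught error can make the reason for a failure visible directly in the response of the health endpoint.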
To include our new indicator in the health check, we must call its isHealthy method in the HealthController.
health.controller.ts#
import { Controller, Get } from "@nestjs/common"
import { HealthCheckService, HealthCheck } from "@nestjs/terminus"
import DatabaseHealthIndicator from "./databaseHealthIndicator"

@Controller("health")
class HealthController {
  constructor(
    private healthCheckService: HealthCheckService,
    private databaseHealthIndicator: DatabaseHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  check() {
    return this.healthCheckService.check([() => this.databaseHealthIndicator.isHealthy()])
  }
}

export default HealthController

Setting the health check with AWS#
We need to point the load balancer to our /health endpoint. We do that when setting up the load balancer while starting tasks in our Elastic Container Service cluster.
When creating the target group for our cluster, we specify /health as the health check path. Thanks to that, the load balancer periodically sends requests to the /health endpoint to determine if a particular instance of our NestJS application is working as expected.
If our task takes a long time to start, the load balancer might mark it as unhealthy, and the task would then be shut down before it ever becomes ready. We can prevent that by setting the health check grace period when configuring the service. This gives our tasks additional time to reach a healthy state.
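A bit of arithmetic helps when choosing these values. The sketch below is a rough estimate under stated assumptions and not an AWS formula: the HealthCheckSettings type, its field names, both functions, and all the numbers are made up for illustration. It assumes that a target is marked unhealthy after a number of consecutive failed checks, and that a safe grace period covers the startup time plus the time needed to pass the required number of checks.

```typescript
// Hypothetical settings; field names are chosen for this sketch
// and the values are illustrative, not recommendations.
type HealthCheckSettings = {
  path: string;
  intervalSeconds: number; // time between two consecutive checks
  timeoutSeconds: number; // how long to wait for a response
  healthyThreshold: number; // consecutive successes needed to be healthy
  unhealthyThreshold: number; // consecutive failures needed to be unhealthy
};

const settings: HealthCheckSettings = {
  path: "/health",
  intervalSeconds: 30,
  timeoutSeconds: 5,
  healthyThreshold: 3,
  unhealthyThreshold: 2,
};

// Rough time it takes for a failing target to be marked unhealthy.
function secondsUntilMarkedUnhealthy(s: HealthCheckSettings): number {
  return s.intervalSeconds * s.unhealthyThreshold;
}

// Conservative estimate of a grace period for a task that needs
// startupSeconds before it can answer requests.
function conservativeGracePeriodSeconds(s: HealthCheckSettings, startupSeconds: number): number {
  return startupSeconds + s.intervalSeconds * s.healthyThreshold;
}

const detectionTime = secondsUntilMarkedUnhealthy(settings);
const gracePeriod = conservativeGracePeriodSeconds(settings, 40);
```

Shorter intervals and lower thresholds detect broken tasks faster, but they also make the setup more sensitive to a single slow response, so the values are a trade-off.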
Verifying if our tasks are running#
Previously, we learned how to manage logs with Amazon CloudWatch. We also created the LoggerInterceptor that logs every endpoint requested in our API.
Let’s look at the logs to verify if the load balancer requests our /health endpoint.
It’s also worth looking at the “Health and metrics” tab on the page dedicated to the service running our tasks.
If no target is healthy, the load balancer cannot handle the incoming traffic. So if our application does not work, it’s one of the first things to check when debugging.
It’s also worth looking at the “Deployment and events” tab. If something goes wrong with our deployment, the issue will often be visible in the “events” table.
Summary#
In this article, we learned what health checks are and how to design them. We also used them together with Amazon Elastic Container Service to verify if the instances of our NestJS application were running correctly. While doing so, we’ve learned more about debugging our NestJS app running with AWS and why we need to care about the health of our tasks running in the cluster.
There is still more to learn about running NestJS with AWS, so stay tuned!