Reliable communication using the Transactional Outbox Pattern

In today's digital age, email communication remains an indispensable tool for businesses and individuals alike. Whether it's sending important notifications, marketing campaigns, or transactional updates, emails play a pivotal role in ensuring effective communication.

At first glance sending an email is just a line of code, right? Well, integrating any asynchronous messaging functionality (e.g. sending emails, sending data to 3rd party services, billing systems etc.) into software applications can be a daunting task, especially when it comes to ensuring the reliability and consistency of message delivery. Most of the time applications need a way to send one message if and only if the database gets updated with an entry. A simple example of when this behavior is necessary would be user registration. When a new user registers, the application has to send a confirmation email to their address. Seems easy enough, so what could go wrong?

Let us try writing some Spring Boot code with Kotlin to illustrate the problem:

@Transactional
fun registerUser(user: User){
    userRepo.save(user)
    emailService.send(ConfirmationEmail())
}

At first sight, this code may seem right. But what happens if the server encounters an error after saving the new user? The email will get sent but the user will not exist in the database. Now, you might think of wrapping this code in a try-catch block such that we don't execute the send function if the save operation fails. This would look something like this:

@Transactional
fun registerUser(user: User){
    try {
        userRepo.save(user)
        emailService.send(ConfirmationEmail())
    }
    catch(err: UserRepoException) {
        // ...    
    }
}

Unfortunately, this isn't a good approach either. What if the email provider is not available at the moment? The user will be persisted in the database, but the confirmation email will never arrive to them. It will not even be sent to begin with. This seems unacceptable in a professional, modern web application. We would like to somehow roll back the saving of the user if sending of the email fails. This might remind you of the lesson about transactions from the databases course.

Transactional Outbox Pattern to the Rescue

The Transactional Outbox Pattern, a proven architectural design, provides a robust and reliable solution for managing email sending within applications. It addresses various use cases where message delivery is crucial and helps mitigate potential failures that can occur when this pattern is not implemented.

The basic idea is to have an Outbox table that contains the emails our application has to send out. We can now insert, in the same transaction, into this new table and the User table when a new user is created.

Key Components of the Transactional Outbox Pattern:

Outbox: The central component of this pattern is the "outbox," which acts as a temporary storage for messages that need to be sent. When a message needs to be sent (e.g. an email), instead of sending it directly, it is first placed in the outbox, in the same transaction as the update on the other table (e.g. Insert in the User table). This outbox can be implemented as a database table, a message queue, or any other persistent storage. In our case, it will be a database table.
Message Queue or Scheduler: An essential part of the pattern is a mechanism that monitors the outbox for messages and sends them at an appropriate time. This can be done using a message queue (e.g. RabbitMQ, Kafka) or a scheduler that periodically checks the outbox for pending messages. When a message is sent successfully, it is marked as "sent" in the outbox.
Transactional Behavior: The Transactional Outbox Pattern ensures that message sending is part of a larger transaction as an atomic database operation. If the transaction fails (e.g., due to an error or an exception), the messages are not inserted into the outbox and the data is not updated in the other table (UsersTable in our example). This guarantees that messages are only sent when the whole transaction is successful, maintaining data consistency. This behavior can be easily achieved in Spring JPA using the @Transactional annotation since it satisfies the "ACID" requirements: it is Atomic, Consistent, Isolated and Durable.

Sequence Diagram

We can illustrate the flow of registering a new user in an application that implements the Transactional Outbox pattern. It would look something like this:

Implementation

Now that we understand the problem and have a solution, we can write an EmailService class to achieve reliable email sending:

@Service
class TransactionalEmailService(
    val outboxRepository: OutboxRepository,
    val emailOutboxMapper: EmailOutboxMapper
) {
    @Transactional
    override fun sendEmail(email: Email) {
        this.outboxRepository.save(this.emailOutboxMapper.emailToOutbox(email))
    }
}

The service has only one function: sendEmail(). The function saves the Email object into the Outbox table as a database entry. It is annotated with @Transactional, allowing Spring to handle the database save operation as part of a transaction. Please note that the default propagation for @Transactional is required since we want the code in the sendEmail() function to run in the same transaction as the code in the function calling it (so both the original database operation and the save on the outbox repository happen in the same transaction). Using another propagation might result in creating a second transaction for sendEmail(), which would defeat the purpose. For more information please check the official documentation.

Scheduled task (Email Relay)

To fetch the Outbox table for new entries we decided to use a scheduled task. This task fetches batches (pages) of entries from the Outbox until there aren't any un-fetched entries. It is important to implement this way and to not only fetch one batch per execution since, if at any point a lot of messages enter the Outbox (if a newsletter has to be sent, for example), the scheduled task will not stop until it has tried to send every message, thus saving the time it would have had to wait between the scheduled runs.

We should also mention that in the Outbox table, besides the usual email fields, we also store:

the scheduled date to send the email, which serves two purposes: to send emails at a certain point in the future; and if sending fails, mark at which point a resend should be retried
number of tries: if an email fails too many times the service will stop trying to send it

To optimize our selects, we created a compound index on the Outbox table on the columns ID, Number of tries and Scheduled date.

A pitfall we have to watch out for is that we may have multiple instances of the scheduled task running at once. To make sure that the two tasks don't select the same batch of emails (which would result in sending the same emails multiple times) we can use a distributed lock, like ShedLock. Its purpose is to make sure only one instance of the scheduled task is running at a certain point in time. Another way, which is a bit more database management system specific is using the row locks FOR UPDATE SKIP LOCKED. This command is available for PostgreSQL, Oracle and others but not present for example in SQL Server and SQLite.

A pseudocode of our implementation would look something like this:

acquire distributed lock
repeat until no batches left {
    batch = select relevant entries from Outbox table
    for each email in batch {
        try {
            send(email)
            // send - successful
            delete email from Outbox table
        }
        catch {
            // send - failed
            email -> increase number of tries
            email -> set scheduled send date sometime in future   // (time of retry)
            update email in Outbox table       
        }
    }
}
release distributed lock

Email Sender

There are multiple approaches to handling this step, and it is very project-specific. You might use an SMTP server or something like AWS SES. At this point, there isn't anything that can go wrong as long as you are careful to catch all exceptions in the Scheduled Task so that they are handled properly.

We created a custom AWS SES Sender class that implements JavaMailSender. This way we can easily switch between the default JavaMailSenderImpl SMTP implementation and AWS SES implementation just by changing a configuration. This is Spring Boot specific, but it should be pretty similar in other languages and frameworks. Our implementation looks something like this:

@ConditionalOnProperty(
    value = ["email.sender"],
    havingValue = "AWS",
    matchIfMissing = false
)
@Component
class AWSSESJavaMailSender(
    @Autowired val sesClient: AmazonSimpleEmailService
): JavaMailSender {

    override fun send(mimeMessage: MimeMessage) {
        try {
            val outputStream = ByteArrayOutputStream()
            mimeMessage.writeTo(outputStream)

            val buf = ByteBuffer.wrap(outputStream.toByteArray())

            val rawMessage = RawMessage(buf)

            val rawEmailRequest = SendRawEmailRequest(rawMessage)

            sesClient.sendRawEmail(rawEmailRequest)
        }
        catch (ex: Exception) {
            throw MailSendException("Could not send email through AWS SES", ex)
        }
    }

    override fun createMimeMessage(): MimeMessage {
        return MimeMessage(Session.getDefaultInstance(Properties()))
    }

    // other functions' from  JavaMailSender implementation...
}

Usage

The purpose of this project was to implement the pattern in an easily reusable way for our Spring Boot projects. We provided a way to make the pattern easy to use while keeping the code clean.

The sendEmail() function of the TransactionalEmailService can be called in any other transactional function:

@Service
class ExampleEmailServiceImpl(
    val userRepo: UserRepo,
    val emailService: TransactionalEmailService
){
    @Transactional
    fun exampleUsage() {
        this.userRepo.save(User(...))
        val email = Email(...)
        this.emailService.sendEmail(email)
  }
}

Note that both the exampleUsage() function and our sendEmail() function are annotated with @Transactional. Spring is smart enough to handle both database changes in a single transaction, fulfilling the ACID requirements.

For a more decoupled approach we can make use of event listeners:

@Service
class ExampleUserService(
    val userRepo: UserRepo,
    val eventPublisher: ApplicationEventPublisher
){
    @Transactional
    fun exampleUsage() {
        val user = userRepo.save(User(...))
        eventPublisher.publish(UserCreatedEvent(user))
    }
}

@Service
class UserEventListener(
    val emailService: TransactionalEmailService
) {
    @Transactional
    @EventListener
    override fun handleUserCreatedEvent(event: UserCreatedEvent) {
        val email = Email(...)
        emailService.sendEmail(email);
    }
}

Conclusion

If right now you're thinking about that project that you worked on in the past and you didn't implement information sending in a truly reliable way, you're not alone. This is a very common mistake unfortunately, but we hope that through this article we were able to paint a clearer picture about this design pattern, why it is useful and how to implement it.