Db2 LUW: Performance of UUIDs as Primary Keys

Published on 12/23/2024

tl;dr

Motivation

During one of my DB2 training sessions, someone asked if the type of UUIDs affects database performance. Specifically, whether random or sortable UUIDs are better to use.

For PostgreSQL, I knew the answer: Primary keys need to be ordered so the B-tree index requires less frequent rebalancing.

However, I wasn’t sure how it works with DB2. My assumption was that a similar effect might occur there.

About UUIDs

Since May 2024, there’s a new RFC standard that addresses different types of UUIDs.

For primary keys, two main variants are relevant: So the question is: "If I use UUIDs as primary keys in DB2, should I choose UUIDv4 or UUIDv7?"

Test Setup

This question piqued my curiosity, so I decided to investigate further.

To get to the bottom of it, I created a test setup using a Spring Boot application. First, I populate a table with some data, and then I measure how long a batch INSERT of 100,000 rows takes. I run the test once with UUIDv4 as primary keys and once with UUIDv7.

In Java, UUIDv4 can be generated directly using UUID.randomUUID(). For UUIDv7, you currently need to use a library or write some custom code.

The code to insert the data looks something like this:
public void insertBatchJdbcClient(List<User> users, int batchSize) {
    jdbcTemplate
        .batchUpdate("""
            INSERT INTO test_user (id, username, password)
            VALUES (?, ?, ?);
            """,
            users, batchSize,
            (ps, argument) -> {
                ps.setObject(1, argument.id());
                ps.setString(2, argument.username());
                ps.setString(3, argument.password());
            });
}

Part I: PostgreSQL

With PostgreSQL, I was able to reproduce well-known results: UUIDv4 (random): The random distribution of keys made the B-tree index work harder. UUIDv7 (time-ordered): Significantly reduced index rebalancing and delivered much better performance. PostgreSQL performance

The data type I used for the primary key was UUID.

Part II: DB2

With DB2, the expected effect was observed as well — though it was less pronounced compared to PostgreSQL. DB2 performance

The data type I used for the primary key was CHAR(16) FOR BIT DATA.

Conclusion

UUIDv7 provides performance advantages not only with PostgreSQL but also with DB2, although the benefits are smaller in the latter case.

New code should prefer UUIDv7 as a primary key if you decide to use UUIDs in general. For insert-heavy workloads, this can result in a noticeable performance improvement.

Whether you should migrate existing tables and code from another type of UUID to UUIDv7 depends on the effort involved and the workload on the data.

For comments or feedback, feel free to reach out at info (at) leonardw (dot) de.

Technical Details

The following setup was used for the tests.

Server DB2 PostgreSQL Java 21 (Temurin), Spring Boot 3.4