During one of my DB2 training sessions, someone asked if the type of UUIDs affects database performance. Specifically, whether random or sortable UUIDs are better to use.
For PostgreSQL, I knew the answer: Primary keys need to be ordered so the B-tree index requires less frequent rebalancing.
However, I wasn’t sure how it works with DB2. My assumption was that a similar effect might occur there.
Since May 2024, there’s a new RFC standard that addresses different types of UUIDs.
For primary keys, two main variants are relevant:
This question piqued my curiosity, so I decided to investigate further.
To get to the bottom of it, I created a test setup using a Spring Boot application. First, I populate a table with some data, and then I measure how long a batch INSERT of 100,000 rows takes. I run the test once with UUIDv4 as primary keys and once with UUIDv7.
In Java, UUIDv4 can be generated directly using UUID.randomUUID()
. For UUIDv7, you currently need to use a library or write some custom code.
public void insertBatchJdbcClient(List<User> users, int batchSize) { jdbcTemplate .batchUpdate(""" INSERT INTO test_user (id, username, password) VALUES (?, ?, ?); """, users, batchSize, (ps, argument) -> { ps.setObject(1, argument.id()); ps.setString(2, argument.username()); ps.setString(3, argument.password()); }); }
The data type I used for the primary key was UUID
.
The data type I used for the primary key was CHAR(16) FOR BIT DATA
.
UUIDv7 provides performance advantages not only with PostgreSQL but also with DB2, although the benefits are smaller in the latter case.
New code should prefer UUIDv7 as a primary key if you decide to use UUIDs in general. For insert-heavy workloads, this can result in a noticeable performance improvement.
Whether you should migrate existing tables and code from another type of UUID to UUIDv7 depends on the effort involved and the workload on the data.
For comments or feedback, feel free to reach out at info (at) leonardw (dot) de.
The following setup was used for the tests.
Server