UUIDs: Friend or Foe?

When you first hear “UUID,” you might imagine a magical number that guarantees uniqueness across the entire universe. And yes, a UUID is a 128‑bit value designed for that very purpose. However, before you go sprinkling UUIDs as primary keys in every table in your database, it’s time to take a closer look at UUID vs int. Spoiler alert: using UUIDs as primary keys can be like using a sledgehammer to crack a nut. It is impressive on paper, but not exactly practical.

What’s a UUID, Anyway?

A UUID (Universally Unique Identifier) is a 128‑bit number, usually written as 36 characters (32 hexadecimal digits plus four hyphens) in the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. Its primary design goal is that IDs generated on different systems, without central coordination, are astronomically unlikely to collide: 128 bits allow roughly 3.4×10^38 distinct values, and a random (version 4) UUID fills 122 of those bits at random. That makes UUIDs fantastic in distributed systems where each node needs to generate IDs independently without stepping on the others’ toes.
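To make the numbers above concrete, here is a small sketch using Python's standard uuid module. It only inspects a freshly generated random UUID; the variable names are illustrative.

```python
import uuid

# Generate a random (version 4) UUID.
u = uuid.uuid4()
text = str(u)

print(text)            # canonical xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx form
print(len(text))       # 36 characters: 32 hex digits plus 4 hyphens
print(len(u.bytes))    # 16 bytes, i.e. 128 bits, when stored in binary
print(u.version)       # 4 for uuid4()

# Total number of 128-bit values, the "3.4×10^38" figure.
print(f"{2**128:.1e}")
```

Note the gap between the two sizes: stored as text the ID takes 36 bytes, while a native binary UUID column only needs 16.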

Numeric Types

Before we get carried away with UUIDs, consider the classic numeric primary keys:

INT (32-bit):
Maximum value: 2,147,483,647
At 100k inserts per day, you’d exhaust a signed INT in almost 59 years (2,147,483,647 / 100,000 ≈ 21,475 days). For many applications, that’s more than enough space.

BIGINT (64-bit):
Maximum value: 9,223,372,036,854,775,807
This is so large it’s practically infinite for most real-world applications. At 100k inserts per day, you’d exhaust a BIGINT in roughly 253 billion years.

Both INT and BIGINT are also compact (4 and 8 bytes, respectively) and lightning-fast for both inserts and indexing.
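The exhaustion estimates are simple division, so they are easy to re-run with your own insert rate. The sketch below assumes signed integer types and a 365-day year; the function name is made up for illustration.

```python
INT_MAX = 2**31 - 1       # largest signed 32-bit value
BIGINT_MAX = 2**63 - 1    # largest signed 64-bit value


def years_until_exhausted(max_value, inserts_per_day=100_000, days_per_year=365):
    """Years until an auto-incrementing key reaches max_value at a steady insert rate."""
    return max_value / inserts_per_day / days_per_year


print(f"INT:    {years_until_exhausted(INT_MAX):.1f} years")
print(f"BIGINT: {years_until_exhausted(BIGINT_MAX):,.0f} years")
```

Try a rate closer to your own workload: at 10 million inserts per day, an INT lasts well under a year, while a BIGINT still lasts billions of years.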

Benchmark results (UUID vs INT vs BIGINT)

[Figure: insert performance, UUID vs INT vs BIGINT]
[Figure: storage size, UUID vs INT vs BIGINT]

UUIDs as Primary Keys: A Misconception

Many developers are attracted by the promise of “future proofing” their applications with UUIDs—assuming that by choosing a type with an enormous range (3.4×10^38 possibilities), they’ll never run into limitations. But here’s the catch:

  1. Performance Overhead:
    UUIDs are two to four times as large as BIGINTs and INTs (16 bytes vs. 8 or 4 bytes) and, in their common random form (version 4), arrive in no particular order. In databases using B+ trees for indexing, inserting random UUIDs scatters writes across the index, causing fragmentation and more frequent page splits. This means:

    • Slower Writes: An insert can land on any page of the index, and a page that is already full has to split to make room.

    • Slower Reads: Larger indexes mean more pages to traverse and fewer entries that fit in cache.

  2. Storage Inefficiency:
    The extra bytes not only slow down operations but also increase the storage footprint of your indexes, which can become significant as your data grows.

  3. Maintenance Challenges:
    Reading, comparing, or typing a UUID while debugging is much less friendly than working with a simple sequential number. Losing the natural order that auto-incrementing keys provide also makes certain operations and optimizations harder to implement.
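The insert-ordering point can be illustrated with a toy model. The sketch below is not a benchmark and not a real B+ tree: a sorted Python list stands in for the ordered leaf level of an index, and the function (a made-up name) simply counts how many inserts land at the very end versus somewhere in the middle.

```python
import bisect
import uuid


def append_fraction(keys):
    """Insert keys one by one into a sorted list; return the share that landed at the end."""
    index = []
    appends = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos == len(index):
            appends += 1          # key is larger than everything so far
        index.insert(pos, key)    # otherwise existing entries must shift
    return appends / len(keys)


n = 5_000
sequential_keys = list(range(1, n + 1))
random_keys = [uuid.uuid4().int for _ in range(n)]

print(f"sequential: {append_fraction(sequential_keys):.2%} of inserts were appends")
print(f"random:     {append_fraction(random_keys):.2%} of inserts were appends")
```

Sequential keys always append, which in a real index means only the right-most page is being written. Random keys almost never do, so writes are spread across the whole structure.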

When (and When Not) to Use UUIDs

UUIDs are often used in URLs as a quick way to hide sequential record numbers and thwart simple enumeration attacks. However, a full UUID (36 characters) makes URLs long and unfriendly, and still offers no real security beyond obscurity. A more elegant approach is sqids, which encodes integer IDs into short, reversible strings. Configured with a custom (shuffled) alphabet, it gives compact URLs that can be decoded back to the original ID server‑side. Keep in mind that this, too, is obfuscation rather than access control.
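To show the general idea of a short, reversible ID, here is a minimal sketch using plain base‑62 encoding over a shuffled alphabet. This is not the Sqids algorithm or its API, just a stand-in for the concept; the function names and the seed value are hypothetical.

```python
import random
import string

BASE_ALPHABET = string.digits + string.ascii_letters  # 62 symbols


def make_alphabet(seed):
    """Return a deterministically shuffled alphabet (the seed is an illustrative assumption)."""
    chars = list(BASE_ALPHABET)
    random.Random(seed).shuffle(chars)
    return "".join(chars)


def encode(number, alphabet):
    """Encode a non-negative integer as a short string."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode(text, alphabet):
    """Reverse encode(): turn the string back into the original integer."""
    base = len(alphabet)
    number = 0
    for char in text:
        number = number * base + alphabet.index(char)
    return number


alphabet = make_alphabet(seed=2024)
slug = encode(1_234_567, alphabet)
print(slug, decode(slug, alphabet))
```

Even the largest signed INT fits in six characters this way, compared with 36 for a UUID. Anyone who learns the alphabet can decode the IDs, so authorization checks are still needed.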

UUIDs shine in scenarios where you need global uniqueness across distributed systems without a central authority, for example when generating session IDs or object identifiers, or when merging data from multiple sources. However, if you’re working within a single database or system, an auto-incrementing INT or BIGINT is often the better choice.

In short, opting for UUIDs simply to “future proof” your app is a form of premature optimization. For most applications, the downsides (slower inserts, larger indexes, more disk I/O) far outweigh the negligible risk of exhausting a BIGINT’s range.

Using BIGINT in distributed systems

If you really need the uniqueness of UUIDs without their performance pitfalls, consider these strategies:

  • Central ID Generation:
    Use a centralized service that generates sequential IDs. This method avoids the randomness that causes index fragmentation.

  • Machine ID Prefix:
    Combine a machine-specific prefix with an auto-incrementing integer. This ensures uniqueness across distributed systems while keeping each machine’s IDs in sequential order for indexing.
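The machine ID prefix strategy can be sketched as bit packing into a single 64‑bit value. Everything specific here is an assumption for illustration: the 10/53 bit split, the class and function names, and the in-memory counter, which in a real system would be a per-node sequence or auto-increment column that survives restarts.

```python
import itertools

MACHINE_BITS = 10     # up to 1,024 machines (assumed split)
SEQUENCE_BITS = 53    # 10 + 53 = 63 bits, so IDs fit in a signed BIGINT


class PrefixedIdGenerator:
    """Hypothetical generator: machine ID in the high bits, local counter in the low bits."""

    def __init__(self, machine_id):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._counter = itertools.count(1)  # stand-in for a durable per-node sequence

    def next_id(self):
        sequence = next(self._counter)
        if sequence >= (1 << SEQUENCE_BITS):
            raise OverflowError("sequence space exhausted for this machine")
        return (self.machine_id << SEQUENCE_BITS) | sequence


def split_id(value):
    """Recover (machine_id, sequence) from a packed ID."""
    return value >> SEQUENCE_BITS, value & ((1 << SEQUENCE_BITS) - 1)


node_a = PrefixedIdGenerator(machine_id=1)
node_b = PrefixedIdGenerator(machine_id=2)
print([node_a.next_id() for _ in range(3)])
print([node_b.next_id() for _ in range(3)])
print(split_id(node_b.next_id()))
```

Each node's IDs increase monotonically within its own range, so inserts from one node stay ordered, and two nodes can never produce the same value because their prefixes differ.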

Final Thoughts

While UUIDs provide global uniqueness, their use as primary keys in databases can introduce performance challenges due to their size and randomness. For many applications, especially those not requiring distributed uniqueness, sequential integer-based keys offer superior performance and efficiency. Careful consideration of the specific requirements and constraints of your application is essential when selecting the appropriate primary key strategy.
