Skip to content

techwiddeep.com

Menu
  • About me
  • Contact Us
Menu

UUIDs – Good, Bad and the Ugly

Posted on January 6, 2022January 6, 2022 by Deep.Kulshreshtha

 

UUID is a relatively new data type in the tech world. We will discuss its pros and cons.

.

Based on real experience. The information is based on building multi-million enterprise applications.

.

.

Prologue

.

We began developing enterprise-level software five years ago. For that, we evaluated the database engine to use. We chose PostgreSQL among MySQL, MariaDB, MS-SQL, and PostgreSQL.

.

Choosing a DB engine is important to the discussion. More on this in a bit.

.

.

The next important decision was to decide on a datatype for the primary key. The options were

.

●Integer
●UUID
●String

.

.

Each option has its own pros and cons. So we must carefully consider which one suits our needs the best before implementing it. Today, through this post, I am sharing our evaluation criteria and the results we achieved.

.

In this post, I am using UUID (Universal Unique IDentifier) to refer to both UUID and GUID. A GUID (Global Unique Identifier) is a Microsoft implementation of the UUID.

 

Our experience

.

Following were the benefits and drawbacks of using a UUID data type.

 

.

Pros

.

1.Does not generate PrimaryKey-Conflict.

.

This is helpful when we need to synchronise data from various sources. Certain applications need internal integrations. ( we did ).

.

During such integrations, string patterns or incremental integers will give a PK conflict. However, UUIDs being unique will NOT.

.

2.Generation without a server connection.

.

A client can generate a UUID without a server connection. Later, this value can be passed to the database to store.

.

Whereas in the case of an autogenerated integer value. Only the DB can generate the primary key. Therefore, an extra call is required to retrieve the primary key.

.

3.Provides column-level security.

.

This means no one can predict the values of the column. And so cannot read data based on guesswork.

 

Cons

.

1.Not an ANSI data type.

.

This is where DB engines become important.

.

PostgreSQL and MS-SQL servers have UUID as a native data type. But MySQL, MariaDB, and Oracle do not.

.

If a database engine doesn’t support UUID. Then maintaining it as the primary key has an impact on speed due to string comparison.

.

https://www.w3resource.com/sql/data-type.php

.

2.Higher CPU cost.

.

Because UUID is not a JSON datatype, there is a higher cost in terms of CPU and

memory. This occurs when serializing/ deserializing a UUID.

.

https://www.geeksforgeeks.org/json-data-types/

.

3.Higher Memory cost.

.

If the database server does not support it natively, more storage is needed.

.

While working with native data types, storage is 16 bytes. With varchars, storage is 36 bytes.

.

4.Insertion order unavailable.

.

We can’t get an insertion order by ordering on this field.

.

5.Not human readable.

.

These Ids aren’t actually human-readable. Therefore, debugging can be a challenge.

.

For instance, suppose we use UUID as a PK for simple master data such as country or city. When we use 2-3 characters of code instead of a UUID, the code is much easier to read.

.

.

What to do ?

.

.

We have discussed the principles of using UUIDs. Let us discuss some practical aspects now.

.

The following table describes considerations while choosing data types for DB primary keys.

.

.

.

Datatype

.

Description

.

.

.

.

.

.

.

UUID

.

●Synchronization needed.

.

Software that has multiple systems creating data. Such data is subsequently synchronized to a central database.

.

●Acceptable storage/ performance costs.

.

The application is able to take a bit of CPU and memory hit.

.

●Eventing

.

UUIDs are useful when there is a need to generate events for Kafka ( or another messaging system ). They help distribute information to underlying consumers. And help with concurrent insertion into the database.

.

.

.

.

.

.

Integer

.

●A commonly used pattern.

.

●Fast.

.

●The database is the sole owner of the identifier.

.

Where the record identifier is solely generated by the database. Whenever any record is inserted into a table. An auto-generated integer value is added as a PK for the record by the database engine.

.

●Transactional tables

.

No synchronization is required from any other system.

.

.

.

.

.

String

.

●Master data tables

.

When the data itself has 2-10 character long unique keys, which can be a PK. A countries and cities table, for example.

.

●DB undecided.

.

The database engine has yet to be determined. Furthermore, software must support a database engine that lacks native UUID support.

.

●As key length grows, it may be less performant.

.

.

.

Wrapping up

.

Our software is deployed with two big clients. We have all 3 data types as PK. (each for a different use case ).

.

.

●For distributed and synchronized data – we use UUIDs
●For data with natural keys ( State, City, Address ) – we use Strings.
●In utility data ( logging, tracking ) – we use Integers.

.

.

Hope this information was helpful. Feel free to reach out for questions. Thanks.

 

– Shalini Garg

© 2025 techwiddeep.com | Powered by Superbs Personal Blog theme