This document provides recommendations for generating UUIDs (universally unique identifiers), also known as GUIDs (globally unique identifiers).
When a UUID will be used across interconnected systems (e.g. integrations), there are standards to follow. For details see https://en.wikipedia.org/wiki/Universally_unique_identifier. Even when the UUID is used only internally (a single system both produces and consumes it), the design matters: every random bit that is not used properly (for example, through careless use of the rand() function for UUID v4) roughly doubles the probability of generating a duplicate UUID, and avoiding duplicates is the main reason UUIDs exist.
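The effect of a wasted bit can be checked numerically. The sketch below is written in Python rather than in the expression language used later in this document, and it relies on the standard birthday-bound approximation; the function name and the sample count of 10^12 IDs are illustrative assumptions.

```python
import math

def collision_probability(count: int, random_bits: int) -> float:
    """Birthday-bound approximation: chance of at least one duplicate
    among `count` IDs drawn uniformly from 2**random_bits values."""
    pairs = count * (count - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-pairs / 2.0 ** random_bits)

n = 10 ** 12  # hypothetical number of generated IDs
full = collision_probability(n, 122)   # all 122 random bits used
short = collision_probability(n, 121)  # one random bit wasted
ratio = short / full
print(f"122 bits: {full:.3e}, 121 bits: {short:.3e}, ratio: {ratio:.6f}")
```

As long as the probabilities are small, the ratio stays very close to 2, so each lost bit of randomness halves the safety margin.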
The de-facto industry standard today (2022) is UUID v4, Variant: 2 (ISO/IEC). The main characteristic of a v4 UUID is that you could generate one billion UUIDs per second for about 86 years before the probability of at least one collision (the same UUID generated more than once) reaches 50%. A v4 UUID is 128 bits long: 6 bits are set to fixed values that indicate the version and variant, and the remaining 122 bits are random.
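The figure above can be reproduced with a short calculation. This is a minimal Python sketch that assumes the usual birthday-bound approximation and a 50% collision threshold; the constant and function names are illustrative, not part of any standard.

```python
import math

RANDOM_BITS = 122                        # 128 bits minus 6 fixed version/variant bits
RATE_PER_SECOND = 1e9                    # one billion UUIDs per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def ids_for_collision_probability(p: float, random_bits: int) -> float:
    """Approximate number of random IDs needed before the chance of
    at least one collision reaches p (birthday bound)."""
    return math.sqrt(2 * 2.0 ** random_bits * math.log(1 / (1 - p)))

ids_needed = ids_for_collision_probability(0.5, RANDOM_BITS)
years = ids_needed / RATE_PER_SECOND / SECONDS_PER_YEAR
print(f"{ids_needed:.3e} UUIDs, about {years:.0f} years")
```

The result is on the order of 2.7 quintillion UUIDs, which at one billion per second corresponds to a duration in the mid-80s of years.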
For example, to generate a UUID v4, Variant: 2, you could use the following expression:
/* spec and validity checker in https://www.uuidtools.com/decode */
fn!lower(joinarray(
  {
    dec2hex(tointeger(rand(4) * 256), 2), /*8 hex digits*/
    "-",
    dec2hex(tointeger(rand(2) * 256), 2), /*4 hex digits*/
    "-4", /*version 4 = completely random*/
    dec2hex(rand(3) * 256, 1), /*3 hex digits*/
    "-",
    bin2hex("10" & dec2bin(rand(1) * 4, 2), 1), /*1 hex digit = Variant: 1st 2 bits=10=[ISO/IEC] + 2 rand bits*/
    dec2hex(rand(3) * 256, 1), /*3 hex digits*/
    "-",
    dec2hex(tointeger(rand(6) * 256), 2), /*12 hex digits*/
  }
)),
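For comparison, the same construction can be sketched in Python. This is not a translation of the expression above; it is a minimal illustration that assumes a cryptographically strong random source (the standard-library secrets module) and a hypothetical helper name, and it sets the same 6 fixed bits: version 4 in the third group and the binary prefix 10 at the start of the fourth group.

```python
import secrets
import uuid

def uuid4_from_random_bits() -> str:
    """Draw 128 random bits, then overwrite the 6 fixed version/variant bits."""
    raw = bytearray(secrets.token_bytes(16))  # 16 bytes from the OS CSPRNG
    raw[6] = (raw[6] & 0x0F) | 0x40  # high nibble of byte 6 = version 4
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8 = binary 10 (variant)
    h = raw.hex()
    # Lay the 32 hex digits out in the 8-4-4-4-12 grouping.
    return "-".join((h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]))

generated = uuid4_from_random_bits()
parsed = uuid.UUID(generated)  # the standard parser doubles as a validity check
print(generated, parsed.version, parsed.variant)
```

In everyday Python code, uuid.uuid4() already does this; the sketch only spells out which bits are fixed and which are random.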
The main building blocks are described below:
Applying Design to Requirements
There can be varying requirements for generating unique IDs and there may be cases where it is appropriate to deviate from the above design patterns. If you use another design, be wary of:
Fun fact
For those wondering why we have such a strange 8-4-4-4-12 format, check the v1 UUID definition from the 1980s: 8 = time_low, 4 = time_mid, 4 = time_hi_and_version, 4 = clock_seq_hi_and_res plus clock_seq_low, 12 = node (MAC address). By now, those fields have been completely repurposed, but the format stayed for compatibility and reusability. The collision probability of v1 versus v4 is massively different, mostly because v4 relies on probability theory (which is very unintuitive) rather than on timestamps and hardware addresses.
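The field layout is still visible in common UUID libraries. The following Python sketch uses the standard-library uuid module to split an illustrative v1-style UUID (the sample value is an assumption chosen for demonstration only) into the historical fields and reassemble the 8-4-4-4-12 string.

```python
import uuid

sample = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")  # illustrative value

# Historical v1 field names mapped to the hex groups of the text form.
layout = {
    "time_low (8 hex)": f"{sample.time_low:08x}",
    "time_mid (4 hex)": f"{sample.time_mid:04x}",
    "time_hi_and_version (4 hex)": f"{sample.time_hi_version:04x}",
    "clock_seq_hi_and_res + clock_seq_low (4 hex)":
        f"{sample.clock_seq_hi_variant:02x}{sample.clock_seq_low:02x}",
    "node (12 hex)": f"{sample.node:012x}",
}

for name, value in layout.items():
    print(f"{name}: {value}")

rebuilt = "-".join(layout.values())  # groups joined back into the text form
```

Joining the five groups with hyphens yields the original string again, which shows that the grouping is purely a presentation of the 128 bits.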