ChatGPT解决这个技术问题 Extra ChatGPT

UUID max character length

We are using UUID as primary key for out oracle DB, and trying to determine an appropriate max character length for the VARCHAR. Apparently this is 36 characters but we have noticed UUID'S generated which are longer than this - up to 60 characters in length. Does anyone know a suitable max char length for UUID??

As a UUID is a 128 bits number I'm really curious to see what coding would convert it into a 60 char string. Looks like either extremely poor encoding, or some other, unrealted problem to me.
What's your RDBMS? MS SQL has a dedicated type for UUIDs, and others can simply store the bytes. Is there some reason you'd want to store these as VARCHARs?
@user565869 storing them as bytes are terrible for any kind of manual inspection

C
Community

Section 3 of RFC4122 provides the formal definition of UUID string representations. It's 36 characters (32 hex digits + 4 dashes).

Sounds like you need to figure out where the invalid 60-char IDs are coming from and decide 1) if you want to accept them, and 2) what the max length of those IDs might be based on whatever API is used to generate them.


R
RiggsFolly

This is the perfect kind of field to define as CHAR 36, by the way, not VARCHAR 36, since each value will have the exact same length. And you'll use less storage space, since you don't need to store the data length for each value, just the value.


CHAR may use more space than VARCHAR if your character set on the column is multi-byte (see bottom part on stackoverflow.com/a/59686/1691446)
Pretty sure UUIDv4 is only using the latin-1 charset of UTF-8, in which case this wont be affected. Definitely check if you are using a different charset though.
UUID in string format can only use this set of characters (regex): [0-9A-Fa-f-], which is 23 distinct octets in ASCII.
RFC 4122 says UUIDs are 16 octets or 128 bits. If you're using more than that much storage, you're inefficiently encoding them. No need to encode the dashes, for example. They add no information.
@Trenton there's a trade-off between storage efficiency and user friendliness. One could store UUIDs as BINARY(16) for maximum storage efficiency, but someone looking over the DB would not see the canonical representation, and a programming language may only have a means of creating a UUID object from the canonical/string representation, or not have a UUID object type at all; the UUID might be stored in string form in a file, making comparison with the binary form cumbersome, etc.
S
StephenS

Most databases have a native UUID type these days to make working with them easier. If yours doesn't, they're just 128-bit numbers, so you can use BINARY(16), and if you need the text format frequently, e.g. for troubleshooting, then add a calculated column to generate it automatically from the binary column. There is no good reason to store the (much larger) text form.