Should I use UUIDs for resources in my public API?

I'm building a SaaS application and want to expose IDs for resources that are not tied to my current data storage implementation (Postgres auto-increment IDs). Two Stack Overflow posts suggest that creating locally unique IDs is hard and that I might as well use UUIDs, which are of course easily and safely generated in pretty much any language.

I'm happy with this approach, but I wonder why I can't find any APIs from big SaaS/hosted players which do the same? For example:

  • Shopify: 9-digit numbers
  • Twilio: 34-character strings
  • Twitter: 20+ digit numbers
  • AMEE: 12-character strings (A-Z, 0-9)

So basically nobody seems to use UUIDs. Is there a reason for this - not-invented-here, cleverer internal ID algorithms or something else? And in my case, in the absence of any internal algorithm, does it make most sense to go with UUIDs?


Solution 1:

It's possible that those other vendors you listed have their own ID or hashing scheme to allow them to expose a smaller number while using something more akin to a UUID internally. But in the end, the question must be asked: as long as your URIs are intended to be consumed by code (API clients) rather than humans, why would it matter?

Don't get too freaked out by what those vendors have done. There's no guarantee that (a) they are doing the "right" thing and (b) their needs are the same as yours.

Go ahead and use UUIDs.
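
As a minimal sketch of what that looks like in practice, here is one way to generate and validate UUIDs using only Python's standard `uuid` module. The function names `new_public_id` and `is_valid_public_id` are made up for illustration, and the choice of random (version 4) UUIDs is an assumption; nothing in the question requires a particular version.

```python
import uuid


def new_public_id() -> str:
    """Return a random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())


def is_valid_public_id(value: str) -> bool:
    """Check that an incoming path segment parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True
```

Validating the ID before touching the database lets the API reject malformed URLs early, without a lookup.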

Solution 2:

I think you might consider the four main options here:

  1. Use UUIDs as your database primary keys. This could be more computationally expensive than using a Long.

  2. Create a UUID-to-Long mapping layer. This way you can publish your REST resources under UUIDs while keeping a clean database structure with Long primary keys.

  3. Create an alternate key column in your database tables to hold the UUID values.

  4. Instead of UUIDs, use cryptographic IDs, generated on the fly from a custom seed for each customer and the original PK. This approach adds execution overhead but could be interesting in some scenarios. The customer would always work with the encrypted values, since they never have access to the seed or the algorithm.
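
Option 3 (which also covers much of option 2) might be sketched roughly as follows. This is only an illustration under some assumptions: SQLite stands in for Postgres so the example runs anywhere, and the `widgets` table, its columns, and the helper functions `create_widget` and `get_widget` are hypothetical names.

```python
import sqlite3
import uuid

# Assumption: SQLite stands in for Postgres so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE widgets (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key, never exposed
        public_id TEXT NOT NULL UNIQUE,               -- UUID used in API URLs
        name      TEXT NOT NULL
    )
    """
)


def create_widget(name: str) -> str:
    """Insert a row and return its public UUID for use in API responses."""
    public_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO widgets (public_id, name) VALUES (?, ?)",
        (public_id, name),
    )
    conn.commit()
    return public_id


def get_widget(public_id: str):
    """Resolve a public UUID to (internal id, name), or None if unknown."""
    return conn.execute(
        "SELECT id, name FROM widgets WHERE public_id = ?",
        (public_id,),
    ).fetchone()
```

The integer key stays available for joins and foreign keys, while API clients only ever see the `public_id` column. The UNIQUE constraint also gives the lookup an index.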