IDs and Prefixes

June 14, 20222 min read#uuids, #perf

Rationale

  • We have observed bad cache hit ratio in transactions table due to uuid v4s as primary identifiers
    • the btree index takes a hit when range queries are done
    • The inserts are slower because a different page has to be written
    • And in general caused postgres to read many pages from disk
    • https://uuid6.github.io/uuid6-ietf-draft/ - IETF rationale for a new uuid v6 type
    • https://brandur.org/nanoglyphs/026-ids - Comparisons of different types in a primary key
    • https://github.com/tvondra/sequential-uuids
  • Taking into consideration above, our ids can become more friendly to humans and caches alike by following format
    • <prefix>_<random string>
    • Examples:
      • ord_0ujsswThIGTUYm2K8FjOOfXtY1K
      • pr_0ujsszwN8NRY24YaXiTIE2VWDTS
      • us_0ujsszgFvbiEr7CDgE3z8MAUPFt

Prefixes

  • ord_ for orders
  • ln_ for order line items
  • rfo_ for refund orders / returns
  • us_ for new users store
  • pa_ for payments
  • sh_ for shipments
  • pr_for products
  • ct_for categories

Random String Considerations

  • The string must be random enough to prevent collisions
  • Use a library than rolling out your own
  • Make sure the string has a good cache locality
  • We prefer KSUID for now, uuidv6/v7/v8 can also be used instead

Storage

Why not ULIDs

Other Reading

RELATED POSTS

  • Embracing celery - Part 1 - Extending
    next
  • Python and Async
    prev

COMMENTS

LinkedIn iconTwitter icon

© Copyright 2023, Ahiravan