IDs | Mercury Core

IDs are used throughout Mercury systems to identify a variety of entities. They are essential in any database or service to uniquely identify records, facilitate relationships between entities, and ensure data integrity. This page outlines the design principles and formats for IDs used in Mercury Core.

String identifiers

The most common type of ID used in Mercury is a 20-character lowercase alphanumeric string. This is the format of the default random IDs generated by SurrealDB and of the values returned by the rand::guid() function (soon to be renamed to rand::id()), and it serves as the primary identifier for most entities in the system. An example of such an ID is as follows:

a1b2c3d4e5f6g7h8i9j0
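
For illustration, an ID of the same shape could be produced outside the database roughly as follows. This is a minimal sketch, not SurrealDB's implementation: the helper name random_string_id is hypothetical, and the use of a cryptographically secure random source is an assumption.

```python
import secrets
import string

# 36-character alphabet: 26 lowercase letters followed by 10 digits
ALPHABET = string.ascii_lowercase + string.digits


def random_string_id(length: int = 20) -> str:
    """Return a random lowercase alphanumeric ID (hypothetical helper)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_string_id())  # output varies between runs
```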

In Mercury Core, these types of IDs are most visible in the URLs for comments and groups, place server and private tickets, and registration keys. While not visible for other entities, they are still used as the main identifiers for record lookups, even if other fields are used for display purposes.

With 36 possible characters (26 letters + 10 digits) and a length of 20 characters, there are a total of 36²⁰ (approximately 1.34 × 10³¹) possible unique IDs. As such, the probability of a collision (two entities receiving the same ID) is low enough to never be of practical concern.
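
To put a rough number on "low enough", the birthday approximation can be evaluated for a given number of records. The sketch below assumes IDs are drawn uniformly at random; the function name and the record count of one billion are illustrative choices, not figures from Mercury.

```python
import math

ID_SPACE = 36 ** 20  # 26 letters + 10 digits, 20 positions


def collision_probability(n: int, space: int = ID_SPACE) -> float:
    """Birthday approximation: P ≈ 1 - exp(-n(n - 1) / (2 * space))."""
    # expm1 keeps precision when the probability is extremely small
    return -math.expm1(-n * (n - 1) / (2 * space))


print(f"{ID_SPACE:.3e}")
print(collision_probability(10 ** 9))
```

Under these assumptions, even a billion records give a collision probability on the order of 10⁻¹⁴.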

Numeric identifiers

In addition to string IDs, Mercury Core also uses numeric IDs for certain entities. These are used solely for compatibility with the Client, Studio, or RCCService, and are limited to use in entities that interact with those systems.

In some versions of the Client and Studio, numeric IDs are stored as 32-bit signed integers. As such, the maximum ID value is limited to 2 147 483 647 (2³¹ - 1).

Most Client and Studio API methods expect URL strings instead of numeric IDs, though functions that require numeric IDs are still present. Assets proxied or cached by Mercury Core from Roblox's Open Cloud API may use numeric IDs that exceed the 32-bit signed integer limit. These will cause issues if passed to an API method that requires a numeric ID. When using assets from Open Cloud, watch out for errors caused by their IDs, as they may be interpreted as negative (usually -2 147 483 648) by the Client or Studio.
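
One way to catch such IDs early is to check the range before passing them on. The following is a minimal sketch; fits_int32 is a hypothetical helper and is not part of Mercury Core or of any Client or Studio API.

```python
INT32_MAX = 2 ** 31 - 1  # 2 147 483 647


def fits_int32(asset_id: int) -> bool:
    """True if asset_id is a positive ID within the signed 32-bit range."""
    return 0 < asset_id <= INT32_MAX


print(fits_int32(123_456_789))    # True
print(fits_int32(2_147_483_648))  # False
```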

Note that the 32-bit limit is not caused by Lua, as all versions of Lua used in Clients and Studios use 64-bit floating point numbers for all numeric values, capable of holding a wider range of distinct integers than a 32-bit integer value. The limitation is purely due to how the Client and Studio API functions handle numeric IDs internally.
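
The difference in range can be checked with any language that uses IEEE 754 doubles. The sketch below uses Python floats, which are 64-bit doubles, as a stand-in for Lua numbers; the constant names are illustrative.

```python
INT32_MAX = 2 ** 31 - 1
MAX_EXACT_DOUBLE_INT = 2 ** 53  # every integer up to here is exactly representable

# A value just past the 32-bit limit survives a round trip through a double.
assert int(float(INT32_MAX + 1)) == INT32_MAX + 1

# Precision only runs out far later: 2**53 + 1 rounds back to 2**53.
assert float(MAX_EXACT_DOUBLE_INT + 1) == float(MAX_EXACT_DOUBLE_INT)
```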

Negative IDs are possible to use in the Client and Studio, and most of the time do send requests to the Site with correctly formatted URLs. From the little experimentation we have done with these, their behaviour can best be described as "strange". We recommend only using positive numeric IDs to avoid any unexpected issues.

Mercury Core forces numeric IDs to be between 100 000 000 and 999 999 999 (9 digits) or between 10 000 000 and 99 999 999 (8 digits) to maintain compatibility with these limits. All 8-digit IDs are used for Corescripts and Libraries, and all 9-digit IDs are used for Mercury assets.

Collisions between 8-digit IDs are not a concern as they are statically assigned to Corescripts and Libraries. However, the 9-digit IDs, which are generated randomly with only 9 × 10⁸ possible values, are more likely to experience collisions as more assets are created. We accept this risk as a tradeoff to maintain compatibility with the Client and Studio, without incurring the implementation complexity and other risks of using incremental IDs.
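
The size of this risk can be estimated with the birthday approximation. The sketch below also shows one possible way of handling a collision, namely redrawing until an unused ID is found. That retry strategy is an assumption made for illustration, not a description of Mercury Core's behaviour, and the function names and the in-memory set standing in for the database are hypothetical.

```python
import math
import secrets

MIN_ID = 100_000_000
MAX_ID = 999_999_999
ID_SPACE = MAX_ID - MIN_ID + 1  # 9 × 10⁸ possible values


def random_asset_id(taken: set[int]) -> int:
    """Draw random 9-digit IDs until one is not already in `taken`."""
    while True:
        candidate = MIN_ID + secrets.randbelow(ID_SPACE)
        if candidate not in taken:
            taken.add(candidate)
            return candidate


def any_collision_probability(n: int) -> float:
    """Birthday approximation of at least one collision among n random draws."""
    return -math.expm1(-n * (n - 1) / (2 * ID_SPACE))


print(random_asset_id(set()))             # output varies between runs
print(any_collision_probability(35_000))  # close to one half
```

Under this approximation, the chance that at least one draw has collided reaches about 50% at roughly 35 000 assets, so a collision check of some kind on creation is worth having.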

Why not sequential numeric IDs?

During early development of various Mercury systems and in small-scale testing, experiments were conducted using sequential numeric IDs for assets, users, and places. However, several issues were identified and encountered that led to the removal of sequential numeric IDs in favour of random IDs:

- Extra database queries were required on record creation to determine the next available ID. These were stored as stuff:increment records in SurrealDB, as SurrealDB does not natively support auto-incrementing fields. This increased the complexity of many query operations.
- Parallelism would have been limited due to the difficulty of ensuring unique sequential IDs across multiple instances of the Site running at the same time and accessing the same database. This would have required additional locking mechanisms or coordination between instances.
- Predictability of IDs led to a few large problems:
  - The ability to easily enumerate a large number of records could lead to unauthorised user data access or scraping of content. However, this is generally not a large security concern.
  - Assets associated with each other, for example Decal assets and their associated Image assets, would usually be created at the same time and receive IDs in sequence. These correlated IDs, as a result of Hyrum's law, ended up being depended upon by Mercury community developers. This would have either posed an ossification risk and made it difficult to change the ID generation method later, or would have resulted in breaking changes for developers.
- Rollbacks of ID increments were necessary to prevent gaps in ID sequences. IDs for assets associated with each other would be incremented at the same time to ensure they were consecutive (see above for reasoning). If one of the creations failed, the increments would need to be rolled back to prevent gaps in the sequence.
  Another example is a user creating an asset for a fee, where the asset creation would succeed but the payment would fail due to insufficient funds. In this case, the asset ID increment would need to be rolled back to prevent gaps. Despite all efforts to mitigate this, there were still edge cases where gaps would occur, leading to inconsistencies in the ID sequence.
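
The gap problem described above can be reproduced with a toy model. This is a simplified sketch with an in-memory counter standing in for a stored increment record; the class, the function, and the payment flag are all hypothetical and do not reflect Mercury's actual queries.

```python
class Counter:
    """In-memory stand-in for a stored increment record (hypothetical)."""

    def __init__(self) -> None:
        self.value = 0

    def next(self) -> int:
        self.value += 1
        return self.value


def create_asset(counter: Counter, assets: list[int], payment_succeeds: bool):
    asset_id = counter.next()  # extra step before the record can be written
    if not payment_succeeds:
        return None  # the ID is already consumed and is not rolled back
    assets.append(asset_id)
    return asset_id


counter, assets = Counter(), []
create_asset(counter, assets, True)
create_asset(counter, assets, False)  # failed payment leaves a gap
create_asset(counter, assets, True)
print(assets)  # [1, 3]
```

Avoiding the gap would require undoing the increment on every failure path, which is where the edge cases mentioned above come from.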

One of the benefits of sequential IDs is that they can be much shorter than random IDs while still maintaining uniqueness. Due to this, experiments were also conducted for hybrid random-sequential approaches (e.g. increasing ID length when collisions occur). However, the added complexity and multiple queries needed to detect collisions outweighed the simplicity benefits of using longer random IDs.
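
As an illustration of the kind of hybrid scheme mentioned above, the sketch below starts with short random IDs and grows the length after repeated collisions. It is not the approach that was actually tested: the function name, the starting length, and the attempt limit are assumptions, and the in-memory set stands in for what would be a database query per check.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def hybrid_id(taken: set[str], start_length: int = 4, max_attempts: int = 3) -> str:
    """Try short random IDs first; lengthen them when collisions keep occurring."""
    length = start_length
    while True:
        for _ in range(max_attempts):
            candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
            if candidate not in taken:  # in a real system: one query per check
                taken.add(candidate)
                return candidate
        length += 1  # too many collisions at this length, so grow the ID


print(hybrid_id(set()))  # output varies between runs
```

Each membership check here corresponds to a round trip to the database, which is the extra cost that made longer random IDs the simpler choice.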

If using assets from Roblox's Open Cloud API, older assets tend to use sequential numeric IDs, while newer assets tend to use random numeric IDs.