NewId
NewId generates sequential unique identifiers that are 128 bits (16 bytes) long and fit nicely into a Guid. It was inspired by Snowflake and flake.
NewId is included in the MassTransit.Abstractions NuGet package.
The Problem
Many applications use unique identifiers to identify data. With a relational database, a common approach is to delegate identifier generation to the database, using an identity column or a similar auto-incrementing value.
While this approach can be adequate for a small application, it quickly becomes a bottleneck at scale. And it's a common problem, as evidenced by this post from Twitter Engineering in 2010.
A key use case, specifically related to MassTransit, is applications that use messages to communicate between services, which is common in a service-based architecture. In these applications, an identifier generated by NewId can serve two purposes. First and foremost, it is a sequential unique identifier. Second, it is also a timestamp, as every NewId includes a UTC timestamp.
Why does order matter now?
For a .NET developer, it is easy to reach for Guid.NewGuid() and run with it. And while that works, the identifiers created are not sequential. They're completely randomized. And when it comes to data, being able to sort it matters. Using a unique identifier column as a primary key clustered index with SQL Server was frowned upon for years because it caused massive index fragmentation. This led developers to use an int (or bigint once they realized that four billion isn't a lot) primary key and create a separate unique index on the unique identifier column (to use the AK, one might say, it wasn't a good day).
The Solution
NewId was created to solve the problem. NewId generates sequential 128-bit identifiers that are collation compatible with SQL Server as a clustered primary key. By combining the host MAC address and an optional offset (in case multiple processes are on the same host) with a timestamp and an incrementing sequence number, the generated identifiers are unique across a network of systems and can be safely inserted into a database without conflicts.
NewId is largely inspired by the Erlang library flake, which adopted an approach of generating 128-bit, k-ordered ids (read time-ordered lexically) using the machine's MAC address, a timestamp, and a per-thread sequence number. These identifiers are sequential and do not collide in a cluster of nodes running applications that use these as UUIDs.
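To make the flake-style approach concrete, here is a minimal sketch that combines a timestamp, a 6-byte host identifier, and a sequence number into a 16-byte, time-ordered identifier. The FlakeSketch class and its byte layout are hypothetical and chosen only for illustration. This is not NewId's actual implementation or byte order.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical illustration only: NOT NewId's actual byte layout.
public static class FlakeSketch
{
    static readonly object Gate = new object();
    static long _lastTicks;
    static ushort _sequence;

    // workerId: six bytes that identify the host, such as a MAC address.
    public static byte[] Next(byte[] workerId)
    {
        if (workerId == null || workerId.Length != 6)
            throw new ArgumentException("Expected a 6-byte worker id.", nameof(workerId));

        long ticks;
        ushort sequence;

        lock (Gate)
        {
            long now = DateTime.UtcNow.Ticks;
            if (now > _lastTicks)
            {
                // The clock advanced: start a new sequence for this tick.
                _lastTicks = now;
                _sequence = 0;
            }
            else if (_sequence == ushort.MaxValue)
            {
                // Sequence exhausted for this tick: move to the next tick.
                _lastTicks++;
                _sequence = 0;
            }
            else
            {
                // Same tick (or the clock went backwards): bump the sequence.
                _sequence++;
            }

            ticks = _lastTicks;
            sequence = _sequence;
        }

        // 8 bytes timestamp + 6 bytes worker id + 2 bytes sequence = 16 bytes.
        // Big-endian fields make byte-wise comparison follow generation order.
        var id = new byte[16];
        BinaryPrimitives.WriteInt64BigEndian(id.AsSpan(0, 8), ticks);
        workerId.CopyTo(id, 8);
        BinaryPrimitives.WriteUInt16BigEndian(id.AsSpan(14, 2), sequence);
        return id;
    }
}
```

Because the timestamp comes first and is written big-endian, two identifiers produced on the same host compare in the order they were generated, and identifiers from different hosts differ in the worker bytes even when the timestamp and sequence match.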
Using NewId
NewIds can be generated using one of two methods. The first returns a NewId, whereas the second returns a Guid.
NewId newId = NewId.Next();
Guid guid = NewId.NextGuid();
NewId implements many of the same methods and constructors as Guid, and can be converted to and from a Guid.
// Formats to a value such as 11790000-CF25-B808-2365-08D36732603A
string identifier = NewId.Next().ToString("D").ToUpperInvariant();
// Convert from a string
NewId fromString = new NewId("11790000-cf25-b808-dc58-08d367322210");
// Convert from a byte array (must be exactly 16 bytes)
var bytes = new byte[] { 16, 23, 54, 74, 21, 14, 75, 32, 44, 41, 31, 10, 11, 12, 86, 42 };
NewId fromBytes = new NewId(bytes);
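The Guid conversion and embedded timestamp mentioned earlier might be used as in the sketch below. It assumes the ToGuid() method, the ToNewId() extension method on Guid, and the Timestamp property as found in recent MassTransit releases. Check these names against the package version in use.

```csharp
using System;
using MassTransit;

NewId id = NewId.Next();

// Convert to a Guid, for example to store in a uniqueidentifier column.
Guid guid = id.ToGuid();

// Convert back (ToNewId() is assumed to be the Guid extension from the package).
NewId roundTrip = guid.ToNewId();

// Read the embedded UTC timestamp (Timestamp property assumed).
DateTime createdAt = id.Timestamp;

Console.WriteLine($"{guid} was created at {createdAt:O}");
Console.WriteLine($"Round trip preserved the value: {roundTrip.Equals(id)}");
```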
Configuration
Some features of NewId can be configured.
ProcessId
In cases where multiple processes are on the same host generating identifiers, it may be necessary to include the processId when generating identifiers. To enable the use of the processId, call the method below on startup.
Note that this is used by default when using NewId included with MassTransit v8 (or later).
NewId.SetProcessIdProvider(new CurrentProcessIdProvider());
This will replace two of the six network address bytes with the current processId.
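As a rough illustration of that replacement, the sketch below overwrites two bytes of a 6-byte network address with the low 16 bits of a process id. The WorkerIdSketch class is hypothetical, and which two bytes are replaced is an assumption made for the example. NewId's internal layout may differ.

```csharp
using System;

// Hypothetical illustration only: not NewId's internal implementation.
public static class WorkerIdSketch
{
    public static byte[] WithProcessId(byte[] networkAddress, int processId)
    {
        if (networkAddress == null || networkAddress.Length != 6)
            throw new ArgumentException("Expected a 6-byte network address.", nameof(networkAddress));

        // Copy so the caller's array is left untouched.
        var workerId = (byte[])networkAddress.Clone();

        // Assumption: the last two bytes carry the process id.
        // Only the low 16 bits of the process id fit in two bytes.
        workerId[4] = (byte)(processId >> 8);
        workerId[5] = (byte)processId;
        return workerId;
    }
}
```

With four address bytes kept and two bytes taken from the process id, two processes on the same host produce different worker ids as long as their process ids differ in the low 16 bits.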
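Because NewId values are sequential and built from a timestamp and the host network address, they are predictable. NewId should not be used for:
- Generating passwords
- Creating security tokens
- Anything where someone should not be able to guess an identifier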