Comparing Unique ID Generation Strategies and Adopting Snowflake-based Primary Keys

Implementing Unique ID Generator Based on Snowflake Algorithm in Java
Generating unique identifiers is essential for maintaining data integrity, ensuring stability in distributed environments, and optimizing large-scale batch operations. The choice of ID generation strategy should align with the scale and requirements of the project.
For example, auto-increment keys work well in single-node environments and are often sufficient for smaller systems with low data generation frequency. In contrast, UUIDs offer strong uniqueness guarantees in distributed setups but can lead to increased storage overhead and performance degradation.
This post compares several ID generation strategies, highlights their characteristics and use cases, and outlines the adoption of a Snowflake-based ID generator in a production environment.
1. Various strategies for unique ID generation
Unique identifiers can be generated in multiple ways, each with its own strengths and weaknesses depending on the context. The optimal strategy depends on factors such as system scale, data generation frequency, and architectural constraints.
This section outlines several common approaches to unique ID generation and examines their key characteristics.
1-1. Auto increment
Auto-increment keys provided by relational databases are a simple and intuitive solution for single-node environments. This strategy requires no complex setup—since the database automatically manages unique ID values, it’s particularly effective in early development stages. Developers can rely on the database’s built-in mechanism to increment IDs without implementing custom logic.
Below is an example of a simple MySQL table definition using an auto-increment primary key:
create table cats
(
    id      bigint auto_increment primary key,
    name    varchar(100)                        not null,
    born_at timestamp default current_timestamp not null
);
When inserting records, the id value is automatically generated and incremented:
insert into cats (name) values ('Kitty');
insert into cats (name) values ('Tiger');
insert into cats (name) values ('Shadow');
Querying the table shows that IDs were automatically assigned in ascending order:
ID | Name | Born at |
---|---|---|
1 | Kitty | 2025-01-08 10:00:00 |
2 | Tiger | 2025-01-08 10:05:00 |
3 | Shadow | 2025-01-08 10:10:00 |
Auto-increment is efficient and straightforward. Because the values increase monotonically, rows also sort naturally by insertion order. However, there are several limitations in distributed systems:
- Collision risk: Multiple database instances may generate the same key, leading to potential duplication or data inconsistency across nodes.
- Single point of contention: Because the ID is generated by a single node, high traffic can turn the database into a bottleneck and degrade performance.
- Database dependency: ID generation always requires communication with the database. This introduces latency and limits flexibility, especially in unstable or high-latency environments.
- Inefficiency in batch operations: In bulk insert scenarios, each record must go through the database to receive an ID, which increases processing time and load.
Due to these issues, auto-increment may not be suitable for large-scale distributed systems or write-intensive applications. Still, its simplicity and reliability make it a common choice for small-scale or single-node projects.
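To make the database round trip concrete, here is a minimal JDBC sketch against the cats table above (the connection URL and credentials are placeholders): the application only learns the generated key after the insert has executed on the server.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class AutoIncrementExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection settings; adjust to your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/demo", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "insert into cats (name) values (?)", Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, "Kitty");
            ps.executeUpdate();

            // The ID is only known after the database has executed the insert.
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (keys.next()) {
                    System.out.println("Generated id: " + keys.getLong(1));
                }
            }
        }
    }
}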
1-2. UUID
A UUID (Universally Unique Identifier) is a standardized 128-bit identifier designed to ensure global uniqueness across systems and networks. It is commonly used in distributed systems to generate IDs independently and prevent data collisions.
A UUID consists of 128 bits (16 bytes) and is typically represented as a 36-character string of 32 hexadecimal digits and four hyphens, divided into five groups in the 8-4-4-4-12 format. For the time-based version 1, the components are structured as follows:
Component | Size | Description |
---|---|---|
⏰ Timestamp | 60 bits | Creation time, counted in 100-nanosecond intervals since 15 October 1582 |
🔢 Clock Seq. | 14 bits | Guards against duplicates when the clock is set backwards or the node restarts |
🖥 Node ID | 48 bits | Typically based on the MAC address for node uniqueness |
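The JDK exposes these fields for time-based UUIDs. Below is a small sketch that parses a sample version-1 UUID; note that timestamp(), clockSequence(), and node() are defined only for time-based UUIDs and throw UnsupportedOperationException otherwise.
import java.util.UUID;

public class UuidFieldsExample {
    public static void main(String[] args) {
        // A sample version-1 UUID; the JDK can parse it and expose its time-based fields.
        UUID v1 = UUID.fromString("c232ab00-9414-11ec-b3c8-9f68deced846");

        System.out.println("version   = " + v1.version());                 // 1
        System.out.println("timestamp = " + v1.timestamp());               // 60-bit count of 100 ns intervals
        System.out.println("clock seq = " + v1.clockSequence());           // 14 bits
        System.out.println("node      = " + Long.toHexString(v1.node()));  // 48 bits
    }
}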
UUIDs are available in multiple versions, each designed with a specific generation strategy. Depending on the context, some use timestamps, while others rely on hashing or randomness. Here’s a summary of common versions:
- UUIDv1
- Based on timestamp and node identifier (MAC address or substitute)
- Ensures uniqueness across networked systems
- Sometimes replaced with random node values for improved security
- Less common today due to privacy concerns
- UUIDv2
- Extends UUIDv1 with local domain information (e.g., user ID)
- Defined in the DCE security standard (POSIX)
- Rarely supported in modern libraries and largely deprecated
- UUIDv3
- Uses an MD5 hash of a namespace and input value
- Deterministic: generates the same UUID for the same input
- Useful for consistent, name-based IDs, though MD5 is outdated
- UUIDv4
- Fully random or pseudo-random
- Strong uniqueness guarantees and widely used in distributed systems
- Cannot be sorted by creation time
- Suitable for general-purpose ID generation without coordination
- UUIDv5
- Like UUIDv3 but uses SHA-1 instead of MD5
- Offers better integrity, though SHA-1 is not cryptographically secure
- Used where name-based, deterministic IDs are needed
Newer versions like UUIDv6, UUIDv7, and UUIDv8 have been proposed to improve ordering, precision, and use-case flexibility. However, these are still relatively niche and are adopted primarily in specialized systems.
In Java, UUIDs can be generated using the java.util.UUID class:
UUID uuid = UUID.randomUUID();
System.out.println("Generated UUID: " + uuid.toString());
Sample output (from repeated runs):
Generated UUID: 550e8400-e29b-41d4-a716-446655440000
Generated UUID: 123e4567-e89b-12d3-a456-426614174000
Generated UUID: 9e107d9d-3721-4a6e-a10a-1ffeddfcbd00
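The snippet above produces version-4 UUIDs. For the name-based versions discussed earlier, the JDK also offers UUID.nameUUIDFromBytes, which deterministically produces a version-3 (MD5) UUID; note that this helper hashes the raw bytes you pass in rather than a namespace plus name pair as described in the RFC. A minimal sketch:
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class NameBasedUuidExample {
    public static void main(String[] args) {
        // The same input bytes always yield the same version-3 UUID.
        UUID first = UUID.nameUUIDFromBytes("cats/kitty".getBytes(StandardCharsets.UTF_8));
        UUID second = UUID.nameUUIDFromBytes("cats/kitty".getBytes(StandardCharsets.UTF_8));

        System.out.println(first);                 // same value on every run for this input
        System.out.println(first.equals(second));  // true: deterministic
        System.out.println(first.version());       // 3
    }
}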
While uniqueness is the primary concern in ID generation, performance can also be critical in high-throughput systems. The following test simulates a concurrent load scenario to assess UUID generation speed:
@Test
void shouldGenerateUUID() throws Exception {
    // Given
    int totalRequests = 50_000;
    int threadCount = 10;
    int maxElapsedTime = 1_000;
    CountDownLatch latch = new CountDownLatch(totalRequests);
    AtomicInteger idCount = new AtomicInteger();
    Set<String> ids = ConcurrentHashMap.newKeySet();

    // When
    long startTime = System.nanoTime();
    try (ExecutorService executor = Executors.newFixedThreadPool(threadCount)) {
        for (int i = 0; i < totalRequests; i++) {
            executor.submit(() -> {
                try {
                    String id = UUID.randomUUID().toString();
                    ids.add(id);
                    idCount.incrementAndGet();
                } finally {
                    latch.countDown();
                }
            });
        }
        latch.await();
    }
    long endTime = System.nanoTime();

    // Then
    long actualElapsedTime = (endTime - startTime) / 1_000_000;
    assertTrue(actualElapsedTime < maxElapsedTime, "Expected: less than " + maxElapsedTime + ", Actual: " + actualElapsedTime + " ms");
    assertEquals(totalRequests, idCount.get(), "Expected: " + totalRequests + ", Actual: " + idCount.get());
    assertEquals(totalRequests, ids.size(), "Expected: " + totalRequests + ", Actual: " + ids.size());
    System.out.println("Time taken to generate IDs with multiple instances: " + actualElapsedTime + " ms");
}
The result:
Time taken to generate IDs with multiple instances: 107 ms
Even under concurrency, UUID generation remains efficient, completing 50,000 requests in just over 100 ms. This suggests that UUID generation itself is unlikely to become a bottleneck in high-traffic environments.
Despite its strengths, UUID also comes with limitations:
- Size overhead: Being 128 bits, UUIDs increase storage and transmission costs
- Low readability: UUIDs are not human-friendly, which makes debugging or manual tracing difficult
- Indexing performance: Lack of natural ordering can lead to fragmented indexes and degraded query performance
- Non-zero collision probability: While extremely rare, the chance of collision is not mathematically zero
Even with these drawbacks, UUID remains a widely adopted strategy due to its ease of use, independence from external coordination, and strong uniqueness guarantees. It continues to be a practical choice for distributed systems, database keys, API tokens, and other domains requiring global identifiers.
1-3. Ticket server
A ticket server is a centralized, standalone service responsible for generating unique identifiers. Instead of having each application database hand out keys, it issues IDs by incrementing an internal counter: clients request an ID from the server, which responds with a new, sequentially increasing value.
Because all IDs are generated from a single, trusted source, uniqueness is inherently guaranteed without the need for conflict resolution logic. The setup is relatively simple and doesn’t require complex distributed databases or coordination mechanisms. This makes the ticket server a practical choice for environments that prioritize reliability and low operational overhead.
Additionally, since the IDs are plain numeric values, they are more storage- and network-efficient compared to UUIDs. This contributes to better performance across the system, especially in storage and transmission.

The ticket server typically follows this flow:
- A client application sends a request to the ticket server for a new ID.
- The server increments an internal counter and generates a new unique ID.
- The generated ID is returned to the client.
Here’s a basic implementation using Spring Boot:
@RestController
@RequestMapping("/api/ticket")
public class TicketServerController {

    private final AtomicLong counter;

    public TicketServerController() {
        this.counter = new AtomicLong(1000);
    }

    @GetMapping("/generate")
    public ResponseEntity<Long> generateId() {
        long id = counter.incrementAndGet();
        return ResponseEntity.ok(id);
    }
}
This example uses an in-memory counter starting at 1000. In practice, the counter value is typically initialized from a database or persistent store at startup. Each client request increments the counter and returns the result. The simplicity of this approach makes it easy to maintain and suitable for many practical use cases.
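From the client's point of view, obtaining an ID is a single HTTP round trip. The sketch below uses java.net.http.HttpClient; the base URL is an assumption, and the path matches the controller above.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TicketClientExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Base URL is a placeholder; the path follows the controller above.
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8080/api/ticket/generate")).GET().build();

        // Each new ID costs one network round trip to the ticket server.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        long id = Long.parseLong(response.body());
        System.out.println("Issued ID: " + id);
    }
}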
However, this method has several limitations:
- Single point of failure (SPOF): If the ticket server becomes unavailable, all ID generation requests will fail. This can seriously impact system availability and continuity. A backup system or high-availability setup (e.g., hot standby or load balancing) is essential to mitigate this risk.
- Scalability constraints: The server must handle all requests. Even with fast processing, high traffic can cause bottlenecks. Scaling may require techniques such as server pooling or distributing load across multiple instances.
- Network latency: Communication delays between clients and the ticket server can degrade performance. Increased round-trip time leads to slower responses, especially under heavy traffic, negatively affecting user experience and throughput.
- Data synchronization issues: Running multiple ticket servers for high availability introduces the challenge of keeping counter values consistent. Without proper synchronization (e.g., distributed locks or centralized coordination), duplicate IDs may be generated.
- Increased operational overhead: Adding a separate ticket server means more infrastructure to manage. Monitoring, failure recovery, and alerting mechanisms become necessary, which can add complexity to operations.
While the ticket server approach is simple to implement and guarantees uniqueness, the fact that it introduces another critical component to manage can be a burden. Ensuring availability, dealing with synchronization and latency issues, and avoiding operational pitfalls all require careful infrastructure design. After all, there’s no such thing as a “Never Die” server. 😮💨
1-4. Snowflake algorithm
The Snowflake algorithm, originally developed by Twitter, is a highly efficient method for generating unique identifiers in distributed systems. It eliminates the need for a central server, allowing each node to independently generate IDs. This design reduces potential bottlenecks and greatly enhances both system availability and scalability under high traffic conditions.
Snowflake generates 64-bit long integers that are compatible with Java’s long type and commonly stored as BIGINT in relational databases. The most significant bit is unused (sign bit), and the remaining 63 bits are divided into distinct components. Among them, a 10-bit machine ID ensures that each node can generate IDs without conflicts, even in a distributed environment.

Component | Size | Description |
---|---|---|
⏰ Timestamp | 41 bits | Millisecond precision to ensure chronological ordering |
🖥️ Machine ID | 10 bits | A unique identifier assigned to each node |
🔢 Sequence | 12 bits | Allows each node to generate up to 4,096 IDs within the same millisecond |
In theory, each node can generate up to 4,096 unique IDs per millisecond, thanks to the 12-bit sequence. This enables massive horizontal scalability as more nodes independently generate IDs. Additionally, since the ID is a pure numeric value starting with a timestamp, it’s naturally sortable and time-traceable.
The following is a simplified implementation of the Snowflake algorithm in Java. It captures the core logic, while production-grade implementations are available in Twitter’s official Snowflake repository.
public class SnowflakeIdGenerator {

    private final long epoch = 1622505600000L;
    private final long machineId;
    private final long maxSequence = 4095;

    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long machineId) {
        this.machineId = machineId;
    }

    public synchronized long generateId() {
        long currentTimestamp = System.currentTimeMillis();
        if (currentTimestamp < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards. Refusing to generate ID");
        }
        if (currentTimestamp == lastTimestamp) {
            sequence = (sequence + 1) & maxSequence;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while (currentTimestamp <= lastTimestamp) {
                    currentTimestamp = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = currentTimestamp;
        return ((currentTimestamp - epoch) << 22) | (machineId << 12) | sequence;
    }
}
Snowflake is designed to produce unique identifiers in a distributed environment without central coordination. Its timestamp-based structure not only ensures high throughput but also supports chronological sorting, which is beneficial in databases and caching systems.
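Because the bit layout is fixed, the components can be recovered from an ID with simple shifts and masks. The following is a minimal decoding sketch that assumes the same layout and custom epoch as the sample generator above.
public final class SnowflakeIdDecoder {

    // Must match the generator's custom epoch (1622505600000L above).
    private static final long EPOCH = 1622505600000L;

    public static long timestampMillis(long id) {
        return (id >>> 22) + EPOCH;   // upper 41 bits: milliseconds since the custom epoch
    }

    public static long machineId(long id) {
        return (id >>> 12) & 0x3FF;   // next 10 bits
    }

    public static long sequence(long id) {
        return id & 0xFFF;            // lowest 12 bits
    }
}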
However, several design considerations should be addressed:
- Time synchronization: Since each node generates IDs independently, inconsistent system clocks can cause duplicate or out-of-order IDs. Time synchronization using NTP (Network Time Protocol) is essential to avoid such issues (a small mitigation sketch follows this list).
- Machine ID management: Each node must be assigned a unique machine ID. Preventing conflicts across machines may require a centralized assignment mechanism or orchestration logic.
- Millisecond limit: If a node receives more than 4,096 requests within the same millisecond, it will exhaust its sequence counter. In this case, ID generation is delayed until the next millisecond, which may introduce latency under high load.
- Lifespan limit: The 41-bit timestamp allows for about 69 years of ID generation from a defined custom epoch. Once this limit is reached, the timestamp wraps around, potentially causing ID collisions. A long-term system may need to reset the epoch or introduce an extended bit space.
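As a small illustration of the clock-drift concern in the first point, one common variant (not used in the sample generator above, which simply fails fast) is to wait out short regressions below a tolerance threshold; the 5 ms tolerance here is an arbitrary assumption.
public final class ClockDriftGuard {

    // Assumption: tolerate clock regressions of up to 5 ms by waiting them out.
    private static final long MAX_TOLERATED_DRIFT_MS = 5;

    public static long waitForClock(long lastTimestamp) throws InterruptedException {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp && lastTimestamp - now <= MAX_TOLERATED_DRIFT_MS) {
            Thread.sleep(lastTimestamp - now); // small drift: wait until the clock catches up
            now = System.currentTimeMillis();
        }
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards beyond tolerance");
        }
        return now;
    }
}
A generator could call ClockDriftGuard.waitForClock(lastTimestamp) in place of the hard failure shown earlier, trading a short delay for availability.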
Despite these challenges, Snowflake has proven to be a reliable and scalable solution for many large-scale platforms. Services like Twitter, Instagram, and Uber use Snowflake or similar algorithms to manage unique identifiers across distributed systems. Compared to ticket servers, Snowflake optimizes for high concurrency, system resilience, and horizontal scalability—making it a foundational strategy for modern ID generation.
2. Implementing unique ID generator
So far, we’ve explored various strategies for generating unique identifiers. As many experienced engineers would agree, there’s no single right answer—what matters is understanding the trade-offs of each approach and choosing the one that best fits the given requirements.
While I haven’t worked on large-scale systems yet, I’ve started applying a Snowflake-based ID generation strategy to my current blog project. This serves as a way to gain practical experience and prepare for building more scalable systems in the future.
2-1. Sonyflake algorithm
The Sonyflake algorithm is a distributed unique ID generator developed by Sony, inspired by Twitter’s Snowflake. While both aim to provide scalable and collision-free identifiers, Sonyflake is designed with a focus on extended lifespan and performance in large-scale, multi-instance environments. It uses a different bit structure compared to Snowflake:
Component | Bits | Description |
---|---|---|
⏰ Timestamp | 39 bits | Time elapsed in 10ms units, ensuring IDs are generated chronologically |
🖥️ Machine ID | 16 bits | A unique identifier per node; supports up to 65,536 distributed nodes |
🔢 Sequence | 8 bits | Allows up to 256 IDs per node within the same 10ms interval |
Sonyflake offers two major advantages: longer lifespan and greater node scalability. The 39-bit timestamp allows ID generation over approximately 174 years—significantly longer than Snowflake’s 69-year range. Meanwhile, the 16-bit machine ID supports up to 65,536 nodes, compared to Snowflake’s 1,024.
Below is a comparison of key differences between Snowflake and Sonyflake:
Feature | Snowflake | Sonyflake |
---|---|---|
Timestamp resolution | 41 bits (1 ms units) | 39 bits (10 ms units) |
Machine ID size | 10 bits | 16 bits |
Sequence size | 12 bits | 8 bits |
Lifespan | ~69 years | ~174 years |
Max node count | 1,024 | 65,536 |
Max ID rate | 4,096 per millisecond | 256 per 10 milliseconds |
Primary language | Scala | Go |
Sonyflake achieves its extended lifespan by lowering timestamp resolution from milliseconds to 10-millisecond intervals. With 39 bits for timestamp, it supports ID generation for roughly 174 years from a given epoch. Its 16-bit machine ID field also makes it suitable for larger-scale distributed deployments compared to Snowflake.
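The lifespan figures follow directly from the bit widths; here is a quick arithmetic check in plain Java (no project code involved):
public class LifespanCheck {
    public static void main(String[] args) {
        double secondsPerYear = 365.25 * 24 * 60 * 60;

        // Snowflake: 2^41 milliseconds of headroom.
        double snowflakeYears = Math.pow(2, 41) / 1_000 / secondsPerYear;

        // Sonyflake: 2^39 ticks of 10 ms each.
        double sonyflakeYears = Math.pow(2, 39) * 10 / 1_000 / secondsPerYear;

        System.out.printf("Snowflake ≈ %.1f years, Sonyflake ≈ %.1f years%n",
                snowflakeYears, sonyflakeYears);   // ≈ 69.7 and ≈ 174.3
    }
}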
However, Sonyflake trades off generation speed for these benefits. While Snowflake can produce up to 4,096 IDs per millisecond, Sonyflake is limited to 256 IDs every 10 milliseconds. In high-throughput environments where real-time ID generation is critical, Snowflake may offer better performance. Sonyflake compensates for this by enabling broader horizontal scaling, but achieving the same throughput may require more nodes.
Ultimately, Sonyflake is well-suited for systems that prioritize long lifespan and large node counts over peak generation speed. In contrast, Snowflake tends to perform better in latency-sensitive and high-throughput scenarios.
2-2. Sonyflake in Java
Sonyflake is originally implemented in Go and does not offer an official Java version. To use it in a Java-based project, I translated the core logic with the help of ChatGPT. The key part of the implementation is as follows:
// Sonyflake.java
package sonyflake.core;

import java.time.Instant;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

import sonyflake.config.SonyflakeSettings;

public final class Sonyflake {

    private static final int BIT_LEN_TIME = 39;
    private static final int BIT_LEN_SEQUENCE = 8;
    private static final int BIT_LEN_MACHINE_ID = 16;
    private static final int SEQUENCE_MASK = (1 << BIT_LEN_SEQUENCE) - 1;
    private static final long MAX_ELAPSED_TIME = (1L << BIT_LEN_TIME);
    private static final long SONYFLAKE_TIME_UNIT = 10_000_000L; // 10 ms expressed in nanoseconds

    private final long startTime;
    private final int machineId;
    private final Lock mutex = new ReentrantLock();

    private long elapsedTime = 0;
    private int sequence = 0;

    public static Sonyflake of(SonyflakeSettings settings) {
        return new Sonyflake(settings);
    }

    private Sonyflake(SonyflakeSettings settings) {
        this.startTime = toSonyflakeTime(settings.startTime());
        this.machineId = settings.machineId();
    }

    public long nextId() {
        mutex.lock();
        try {
            long currentTime = System.currentTimeMillis() / 10;
            if (currentTime - startTime >= MAX_ELAPSED_TIME) {
                throw new IllegalStateException("Time overflow");
            }
            if (currentTime == elapsedTime) {
                sequence = (sequence + 1) & SEQUENCE_MASK;
                if (sequence == 0) {
                    // Sequence exhausted for this 10 ms window: wait for the next one.
                    while (currentTime <= elapsedTime) {
                        currentTime = System.currentTimeMillis() / 10;
                    }
                }
            } else {
                sequence = 0;
            }
            elapsedTime = currentTime;
            return ((elapsedTime - startTime) << (BIT_LEN_SEQUENCE + BIT_LEN_MACHINE_ID))
                    | (machineId << BIT_LEN_SEQUENCE)
                    | sequence;
        } finally {
            mutex.unlock();
        }
    }

    // Converts an Instant to Sonyflake time units (10 ms) since the Unix epoch.
    private static long toSonyflakeTime(Instant instant) {
        return instant.toEpochMilli() * 1_000_000L / SONYFLAKE_TIME_UNIT;
    }
}
Key characteristics of this implementation include:
- Time-based ID generation: IDs are generated using a 10-millisecond timestamp, ensuring they are chronologically ordered. This allows efficient sorting and querying in databases and distributed environments.
- Sequence number for collision prevention: Within the same 10ms window, up to 256 IDs can be generated per node using an 8-bit sequence number. If the sequence reaches its maximum, the generator waits for the next time unit to avoid duplicates.
- Machine ID for distributed environments: A 16-bit machine ID allows up to 65,536 nodes to generate IDs concurrently without conflict.
- Thread safety: Go’s original implementation uses sync.Mutex to protect shared state. In Java, a ReentrantLock is used to ensure thread-safe access to shared variables and provide reliable ID generation under concurrent load.
- Efficient bit-level composition: The final 64-bit ID is constructed by combining the timestamp, machine ID, and sequence bits. This bit-manipulation approach keeps the logic fast and efficient.
The machine ID is typically derived from the node’s private IP address. The following code shows how a private IP is extracted and used to assign a unique machine ID:
// SonyflakeSettings.java
public final class SonyflakeSettings {

    private final Instant startTime;
    private final int machineId;

    private static InetAddress getPrivateIp() throws Exception {
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        while (interfaces.hasMoreElements()) {
            NetworkInterface networkInterface = interfaces.nextElement();
            Enumeration<InetAddress> addresses = networkInterface.getInetAddresses();
            while (addresses.hasMoreElements()) {
                InetAddress address = addresses.nextElement();
                if (isPrivateIp(address)) {
                    return address;
                }
            }
        }
        throw new NoPrivateAddressException("No private IP address found.");
    }

    private static boolean isPrivateIp(InetAddress address) {
        if (address.isLoopbackAddress() || !(address instanceof java.net.Inet4Address)) return false;
        byte[] ip = address.getAddress();
        int first = ip[0] & 0xFF;
        int second = ip[1] & 0xFF;
        return first == 10 || (first == 172 && second >= 16 && second < 32) || (first == 192 && second == 168);
    }

    ...
}
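The snippet above only locates a private IPv4 address. Turning it into a 16-bit machine ID is typically done by taking the lower two octets, which is also the default behavior of the Go implementation; the helper below is an assumed sketch rather than the project's exact code.
import java.net.InetAddress;

public final class MachineIdResolver {

    // Assumed sketch: derive a 16-bit machine ID from the lower two octets of a private IPv4 address.
    public static int toMachineId(InetAddress privateIp) {
        byte[] ip = privateIp.getAddress();
        return ((ip[2] & 0xFF) << 8) | (ip[3] & 0xFF); // always fits in 16 bits (0..65535)
    }
}
For example, 192.168.0.42 would map to machine ID 42, and 10.0.5.7 to 5 × 256 + 7 = 1287.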
Sonyflake is designed to offer stable and unique ID generation even under high load. You can find the complete Java implementation in my GitHub Repository.
2-3. Sonyflake Integration
This blog currently runs in a single-instance environment, and unfortunately, there's no need for scaling out at the moment. 😔 For this reason, both the start timestamp and machine ID are hardcoded. To simplify usage across the project, I implemented the generator using the singleton pattern.
@Slf4j
@StandaloneAdapter
public final class Sonyflake {

    public static Sonyflake getInstance() {
        return Sonyflake.SingletonHolder.INSTANCE;
    }

    private static class SingletonHolder {
        private static final SonyflakeSettings settings = SonyflakeSettings.of(Instant.parse("2020-11-11T00:00:00Z"));
        private static final Sonyflake INSTANCE = new Sonyflake(settings);
    }

    public long nextId() throws SonyflakeException {...}

    ...
}
The singleton pattern ensures that Sonyflake is instantiated only once and accessed consistently through the getInstance() method. Here's how it is used in the project:
@Slf4j
@Getter
public class PostId extends BaseId<CategoryId, Long> {

    public PostId(Long value) {
        super(value);
    }

    public static PostId withId(Long id) {
        return new PostId(id);
    }

    public static PostId withoutId() {
        return new PostId(Sonyflake.getInstance().nextId());
    }
}
Even the ID for this post was generated using the Sonyflake-based strategy, as seen below:
{
  "id": "220224905717874690",
  "pathname": "implementing-unique-id-generator-based-on-snowflake-algorithm-in-java"
}
Note that the ID value in the JSON response is a string. This is because JavaScript's Number type can safely represent integers only up to 53 bits, which is not enough to hold a 64-bit ID accurately. To prevent precision loss or incorrect ID binding on the frontend, the server converts IDs to strings before sending them to the client. I learned this the hard way after spending hours debugging inconsistencies. 🫠
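If Jackson is used for serialization (an assumption here; the field names simply mirror the JSON above, and this is a sketch rather than the project's actual DTO), one common way to achieve this is to serialize the 64-bit ID with ToStringSerializer:
import com.fasterxml.jackson.databind.annotation.JsonSerialize;
import com.fasterxml.jackson.databind.ser.std.ToStringSerializer;

public class PostResponse {

    // Serialize the 64-bit ID as a JSON string so JavaScript clients don't lose precision.
    @JsonSerialize(using = ToStringSerializer.class)
    private Long id;

    private String pathname;

    // getters and setters omitted for brevity
}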
To verify that Sonyflake works correctly, I wrote the following unit test:
@Test
void shouldGenerateId() throws Exception {
    // Given
    Instant startTime = Instant.parse("2020-11-11T00:00:00Z");
    SonyflakeSettings settings = SonyflakeSettings.of(startTime);
    Sonyflake sonyflake = Sonyflake.of(settings);

    // When
    long id = sonyflake.nextId();

    // Then
    assertTrue(id > 0, "ID should be greater than 0");
    System.out.println("Sonyflake.nextId=" + id);
    System.out.println("Sonyflake.startTime=" + startTime);
    System.out.println("Sonyflake.elapsedTime=" + sonyflake.elapsedTime(id));
    System.out.println("Sonyflake.sequenceNumber=" + sonyflake.sequenceNumber(id));
    System.out.println("Sonyflake.machineId=" + sonyflake.machineId(id));
    System.out.println("Sonyflake.timestamp=" + sonyflake.timestamp(id));
}
This test confirms that an ID is generated correctly and verifies its components: timestamp, sequence number, and machine ID.
Sonyflake.nextId=220390463486623886
Sonyflake.startTime=2020-11-11T00:00:00Z
Sonyflake.elapsedTime=13136295288
Sonyflake.sequenceNumber=1
Sonyflake.machineId=142
Sonyflake.timestamp=2025-01-09T09:42:32.880Z
The following test ensures that the IDs are generated in chronological order—one of Sonyflake's key features.
@Test
void shouldGenerateOrderedIds() throws Exception {
    // Given
    SonyflakeSettings settings = SonyflakeSettings.of(Instant.parse("2025-01-01T00:00:00Z"));
    Sonyflake sonyflake = Sonyflake.of(settings);
    List<Long> ids = new ArrayList<>();
    int totalSize = 1024;

    // When
    for (int i = 0; i < totalSize; i++) {
        ids.add(sonyflake.nextId());
    }

    // Then
    for (int i = 1; i < totalSize; i++) {
        long prev = ids.get(i - 1);
        long next = ids.get(i);
        assertTrue(prev < next, "IDs should be ordered");
    }
}
Lastly, I ran a high-throughput simulation test with 50,000 requests across 10 concurrent instances:
@Test
void shouldHandleHighTpsWithMultipleInstances() throws Exception {
    // Given
    int totalRequests = 50_000;
    int threadCount = 10;
    int maxElapsedTime = 1_000;
    AtomicInteger idCount = new AtomicInteger();
    CountDownLatch latch = new CountDownLatch(totalRequests);
    List<Sonyflake> sonyflakeInstances = new ArrayList<>();
    for (int i = 0; i < threadCount; i++) {
        SonyflakeSettings settings = SonyflakeSettings.of(Instant.parse("2025-01-01T00:00:00Z"), i + 1);
        sonyflakeInstances.add(Sonyflake.of(settings));
    }

    // When
    long startTime = System.nanoTime();
    try (ExecutorService executor = Executors.newFixedThreadPool(threadCount)) {
        for (int i = 0; i < totalRequests; i++) {
            int threadIndex = i % threadCount;
            executor.submit(() -> {
                try {
                    sonyflakeInstances.get(threadIndex).nextId();
                    idCount.incrementAndGet();
                } finally {
                    latch.countDown();
                }
            });
        }
        latch.await();
    }
    long endTime = System.nanoTime();

    // Then
    long actualElapsedTime = (endTime - startTime) / 1_000_000;
    assertTrue(actualElapsedTime < maxElapsedTime, "Expected: less than " + maxElapsedTime + ", Actual: " + actualElapsedTime + " ms");
    assertEquals(totalRequests, idCount.get(), "Expected: " + totalRequests + ", Actual: " + idCount.get());
    System.out.println("Total IDs generated: " + idCount.get());
    System.out.println("Time taken to generate IDs with multiple instances: " + actualElapsedTime + " ms");
}
In this test, 50,000 IDs were successfully generated in approximately 368 ms across 10 threads. The slight delay compared to UUID generation is likely due to Sonyflake’s 10ms resolution and sequence limitations. Still, the performance is more than adequate for most production environments.
Total IDs generated: 50000
Time taken to generate IDs with multiple instances: 368 ms
As this test suggests, increasing the number of instances can help mitigate sequence-related wait times and improve throughput. This confirms that Sonyflake is not only effective in single-node setups but also performs reliably in distributed environments.
3. Wrapping it up with unique ID
In this post, we explored various strategies for generating unique identifiers, with a focus on implementing and testing a Snowflake-inspired approach.
In my view, auto-increment keys provided by relational databases are a convenient choice when data is created infrequently and the risk of collisions is minimal. On the other hand, in high-throughput or distributed environments, strategies like UUID or Sonyflake tend to offer better scalability and reliability. 😄