# Transparent Sharding

# Transparent sharding in HStreamDB

Transparent sharding in HStreamDB means that each stream contains multiple implicit partitions spread across multiple server nodes. We believe that the stream itself is a sufficiently concise and powerful abstraction, so sharding should be an implementation detail that is not exposed to the user. Because these partitions are invisible, each stream appears to the user to be managed as a single whole.

# Making use of the transparent sharding feature in HStreamDB

The transparent sharding feature does not require the user to deal with any sharding logic, such as the number of partitions or the partition mapping. All the user needs to do is provide an ordering key when writing a record to the stream. Each key corresponds to a virtual partition, and HServer maps these virtual partitions to physical partitions in the storage component.

If the user does not specify an ordering key for a record, that record is assigned to the default partition of the stream. Therefore, if no ordering key is provided for any record, the system behaves as if there were no partitions. Either way, the user will not notice a difference, as there is no explicit sharding logic in any user interaction.
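The routing described above can be pictured with a small sketch. This is not HStreamDB's implementation: the source does not say how HServer maps virtual partitions to physical ones, so the hash-modulo scheme, the partition count, and the names `NUM_PHYSICAL_PARTITIONS`, `DEFAULT_PARTITION`, `physical_partition`, and `route` are all assumptions made for illustration.

```python
import hashlib
from typing import Optional

# Hypothetical values for illustration only; these are not HStreamDB settings.
NUM_PHYSICAL_PARTITIONS = 4
DEFAULT_PARTITION = 0


def physical_partition(ordering_key: Optional[str]) -> int:
    """Pick a physical partition for a record (assumed scheme, not HServer's)."""
    if ordering_key is None:
        # Records without an ordering key all go to the default partition.
        return DEFAULT_PARTITION
    # Each distinct key acts as its own virtual partition; a stable hash
    # maps that virtual partition onto one of the physical partitions.
    digest = hashlib.sha256(ordering_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PHYSICAL_PARTITIONS


def route(records):
    """Group (ordering_key, payload) pairs by physical partition, keeping write order."""
    partitions = {}
    for key, payload in records:
        partitions.setdefault(physical_partition(key), []).append((key, payload))
    return partitions
```

Because the mapping is deterministic, records that share an ordering key always land in the same partition and keep their relative order there. If every record is written with a key of `None`, everything ends up in the single default partition, which matches the "no partitions" behavior described above.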

# Why transparent sharding

Sharding is an effective way to alleviate single-node performance bottlenecks and improve horizontal scalability. However, if the sharding logic were exposed directly to the user, higher-level abstractions such as Stream would be fragmented, increasing the cost of learning and use. Hiding partitions from users significantly reduces the complexity of using HStreamDB while still taking advantage of the benefits of sharding.