HStreamDB v0.9: Extension on Sharding Model and Support of Integration with External Systems

Sep 28, 2022

Highlights

Shards in Streams - direct access to records in shards of streams
HStream IO - built-in data integration framework for HStreamDB
New Stream Processing Engine
Gossip-based HServer Clusters
Update Java and Go clients; Add Python Client

Shards in Streams

We have extended the v0.8 sharding model to provide direct access to, and management of, the underlying shards of a stream, allowing finer-grained control of data distribution and stream scaling. Each shard is assigned a range of hash values within the stream, and every record whose partitionKey hashes into that range is stored in that shard.
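The hash-range assignment described above can be sketched as follows. This is a minimal illustration, not HStreamDB's implementation: the hash function (SHA-256 truncated to 64 bits), the size of the key space, the even split, and the names make_shard_ranges, key_hash and shard_for are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_right

# Assumed size of the hash space; the real value is not stated in these notes.
KEY_SPACE = 2 ** 64


def make_shard_ranges(shard_count):
    """Split [0, KEY_SPACE) into shard_count contiguous, inclusive ranges."""
    step = KEY_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * step
        # The last shard absorbs the remainder so the whole space is covered.
        end = KEY_SPACE - 1 if i == shard_count - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def key_hash(partition_key):
    """Hash a partition key to an integer in [0, KEY_SPACE) (hash choice is an assumption)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(partition_key, ranges):
    """Return the index of the shard whose range contains the key's hash."""
    starts = [start for start, _ in ranges]
    return bisect_right(starts, key_hash(partition_key)) - 1
```

Because the ranges are contiguous and cover the whole key space, every partitionKey maps to exactly one shard, and records with the same key always land in the same shard.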

Currently, HStreamDB supports:

set the initial number of shards when creating a stream
distribute written records among the shards of a stream by partitionKey
read records directly from any shard, starting at a specified position
list the shards of a stream and their key ranges

In future releases, HStreamDB will support the dynamic scaling of streams through shard splitting and merging.

HStream IO

HStream IO is the built-in data integration framework for HStreamDB, composed of source connectors, sink connectors and the IO runtime. It connects HStreamDB with various external systems so that data can flow efficiently throughout the data stack and deliver its value sooner.

In particular, this release provides the connectors listed below:

Source connectors:
Sink connectors:
- sink-mysql
- sink-postgresql

You can refer to the documentation to learn more about HStream IO.
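As a rough mental model of the three parts named above (source connector, sink connector, IO runtime), the toy sketch below moves records from an external system into a stream and then out to another external system. It does not use the HStream IO API: ListSource, TableSink and run_pipeline are hypothetical names, and an in-memory list stands in for an HStreamDB stream.

```python
class ListSource:
    """Hypothetical source connector: pulls rows from an external system (a list here)."""

    def __init__(self, rows):
        self.rows = rows

    def poll(self):
        yield from self.rows


class TableSink:
    """Hypothetical sink connector: upserts records into an external table keyed by "id"."""

    def __init__(self):
        self.table = {}

    def write(self, record):
        self.table[record["id"]] = record


def run_pipeline(source, sink):
    """Stand-in for the IO runtime: source -> stream -> sink."""
    stream = []  # an in-memory list playing the role of an HStreamDB stream
    for record in source.poll():
        stream.append(record)
    for record in stream:
        sink.write(record)
    return stream
```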

New Stream Processing Engine

We have re-implemented the stream processing engine in an interactive and differential style, drastically reducing latency and improving throughput. The new engine supports multi-way joins, sub-queries, and more general materialized views.

The feature is still experimental. To try it out, please refer to the SQL guides.
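One common way to read "differential" is that a materialized view is updated from changes to its input rather than recomputed from scratch. The sketch below illustrates that general idea for a SUM ... GROUP BY view. It is an assumption about the technique, not a description of the engine's internals, and the class IncrementalGroupSum and the (key, value, diff) change format are hypothetical.

```python
from collections import defaultdict


class IncrementalGroupSum:
    """Materialized view of SUM(value) GROUP BY key, maintained from deltas."""

    def __init__(self):
        self.counts = defaultdict(int)  # number of live rows per key
        self.sums = defaultdict(int)    # running sum per key

    def apply(self, changes):
        """Apply a batch of (key, value, diff) changes.

        diff is +1 for an inserted row and -1 for a retracted row.
        """
        for key, value, diff in changes:
            self.counts[key] += diff
            self.sums[key] += value * diff
            # Drop the group once its last row has been retracted.
            if self.counts[key] == 0:
                del self.counts[key]
                del self.sums[key]

    def snapshot(self):
        """Return the current contents of the view."""
        return dict(self.sums)
```

Each batch costs time proportional to the number of changes, not to the size of the input seen so far, which is where the latency benefit of this style comes from.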

Gossip-based HServer Clusters

We have refactored the HServer cluster to use gossip-based membership and failure detection based on SWIM, replacing the ZooKeeper-based implementation in the previous version. The new mechanism improves the scalability of the cluster and reduces dependencies on external systems.
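For readers unfamiliar with SWIM, the sketch below simulates a single failure-detection probe: a direct ping, then indirect pings through up to k other members, and finally marking the target as suspect if no acknowledgement arrives. It is a simplified model of the protocol, not HServer code. The function probe, its parameters, and the can_reach callback (which stands in for a network round trip) are hypothetical.

```python
import random


def probe(local, target, members, can_reach, k=3, rng=None):
    """Run one SWIM-style probe of target from local.

    can_reach(a, b) -> bool stands in for a ping/ack round trip from a to b.
    Returns "alive" or "suspect".
    """
    rng = rng or random.Random()

    # Step 1: direct ping.
    if can_reach(local, target):
        return "alive"

    # Step 2: ask up to k other members to ping the target on our behalf.
    helpers = [m for m in members if m not in (local, target)]
    for helper in rng.sample(helpers, min(k, len(helpers))):
        if can_reach(local, helper) and can_reach(helper, target):
            return "alive"

    # Step 3: no acknowledgement by any path, so the target becomes suspect.
    # In SWIM the suspicion is then spread by gossip and confirmed after a timeout.
    return "suspect"
```

The indirect step is what keeps one bad link between two nodes from being mistaken for a node failure.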

Java Client

The Java Client v0.9.0 has been released, with support for HStreamDB v0.9.

Golang Client

The Go Client v0.2.0 has been released, with support for HStreamDB v0.9.

Python Client

The Python Client v0.2.0 has been released, with support for HStreamDB v0.9.