What is YugaByte DB?
YugaByte DB is a transactional, high-performance database for building internet-scale, globally distributed applications. Built using a unique combination of a distributed document store, auto sharding, per-shard distributed consensus replication and multi-shard ACID transactions (inspired by Google Spanner), it is the world’s only distributed database that is both non-relational (supporting a Redis-compatible KV API and a Cassandra-compatible, flexible-schema transactional NoSQL API) and relational (supporting a PostgreSQL-compatible distributed SQL API) at the same time.
YugaByte DB is purpose-built to power fast-growing online services on public, private and hybrid clouds with transactional integrity, high availability, low latency, high throughput and multi-region scalability, while also providing unparalleled data modeling freedom to application architects. Enterprises gain more functional depth and agility without any cloud lock-in when compared to proprietary cloud databases such as Amazon DynamoDB, Microsoft Azure Cosmos DB and Google Cloud Spanner. Enterprises also benefit from stronger data integrity guarantees, more reliable scaling and higher performance than those offered by legacy open source NoSQL databases such as MongoDB and Apache Cassandra.
YugaByte DB Community Edition is developed and distributed as an Apache 2.0 open source project.
What client APIs are supported by YugaByte DB?
YugaByte DB supports both Transactional NoSQL and Distributed SQL APIs.
YCQL - YCQL is a transactional flexible-schema API that is compatible with Apache Cassandra Query Language (CQL). It also extends CQL by adding distributed ACID transactions, strongly consistent secondary indexes and a native JSON column type.
YEDIS - YEDIS is a transactional key-value API that is compatible with Redis.
YSQL - YSQL is a distributed SQL API that is compatible with PostgreSQL.
The three YugaByte DB APIs are completely isolated and independent from one another. This means that the data inserted or managed by one API cannot be queried by a different API. Additionally, there is no common way to access the data across all the APIs (external frameworks such as Presto can help for simple cases).
The net effect is that application developers must select an API before undertaking detailed database schema/query design and implementation.
Which API should I choose for my application?
For internet-scale, transactional workloads, the choice of API is a trade-off between data modeling richness and query performance. At one end of the spectrum is the YEDIS API, which is completely optimized for single-key access patterns, has simpler data modeling constructs and provides blazing-fast (sub-ms) query performance. At the other end of the spectrum is the YSQL API, which supports complex multi-key relationships (through JOINs and foreign keys) and provides normal (single-digit ms) query performance. This is expected, since multiple keys can be located on multiple shards hosted on multiple nodes, resulting in higher latency than a key-value API that accesses only a single key at any time. In the middle of the spectrum is the YCQL API, which is still optimized for predominantly single-key workloads but has richer data modeling features, such as globally consistent secondary indexes (powered by distributed ACID transactions), that can accelerate internet-scale application development significantly.
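The reason multi-key requests tend to cost more than single-key requests can be illustrated with a small Python sketch. It assumes a hypothetical hash-partitioning scheme: `NUM_SHARDS`, `shard_for` and the SHA-256 based hash are illustrative choices, not YugaByte DB's actual sharding function. The sketch only counts how many shards a request has to contact.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count, chosen for illustration only


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard with a stable hash (illustrative, not the real scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def shards_touched(keys, num_shards: int = NUM_SHARDS) -> set:
    """Return the set of shards a request must contact to read all given keys."""
    return {shard_for(k, num_shards) for k in keys}


# A key-value style lookup reads one key, so it always lands on exactly one shard.
single = shards_touched(["user:1001"])

# A join-style request reads many keys, which may be spread over several shards.
multi = shards_touched([f"order:{i}" for i in range(100)])

print(len(single), len(multi))
```

A single-key request always resolves to one shard, while the multi-key request can fan out to as many as `NUM_SHARDS` shards, each potentially on a different node. That fan-out is the source of the extra latency described above.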
How does YugaByte DB’s common storage engine work?
DocDB, YugaByte DB’s distributed document store common across all APIs, builds on top of the popular RocksDB project by transforming RocksDB from a key-value store (with only primitive data types) to a document store (with complex data types). Every key is stored as a separate document in DocDB, irrespective of the API responsible for managing the key. DocDB’s sharding, replication/fault-tolerance and distributed ACID transaction architectures are all based on the Google Spanner design first published in 2012.
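One general way to layer a document model on top of a key-value store is to flatten each nested document into one key-value entry per leaf field. The Python sketch below shows that idea only. The tuple-based key layout and the `flatten` and `read_document` helpers are assumptions made for illustration and do not describe DocDB's actual on-disk encoding.

```python
def flatten(doc_key, value, path=()):
    """Yield (key, primitive) pairs for a nested dict, one pair per leaf field."""
    if isinstance(value, dict):
        for field, child in value.items():
            yield from flatten(doc_key, child, path + (field,))
    else:
        # The key is the document key followed by the path to the leaf.
        yield (doc_key,) + path, value


def read_document(store, doc_key):
    """Rebuild a document by scanning all entries whose key starts with doc_key.

    Assumes the document is a (possibly nested) dict, as in this sketch.
    """
    doc = {}
    for key, value in sorted(store.items()):
        if key[0] != doc_key:
            continue
        node = doc
        for field in key[1:-1]:
            node = node.setdefault(field, {})
        node[key[-1]] = value
    return doc


# Hypothetical document with a nested sub-document.
doc = {"name": "Ada", "address": {"city": "London", "zip": "NW1"}}

# A plain dict stands in for the underlying key-value store.
store = dict(flatten("user:1", doc))

for key in sorted(store):
    print(key, "->", store[key])
```

Because all entries of a document share the same key prefix, they sort next to each other, so a prefix scan in an ordered key-value store can read a whole document or just one sub-document.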
What makes YugaByte DB unique?
YugaByte DB is a single operational database that brings together three must-have needs of user-facing cloud applications, namely ACID transactions, high performance and multi-region scalability. Monolithic SQL databases offer transactions and performance but lack the ability to scale across multiple regions. Distributed NoSQL databases offer performance and multi-region scalability but give up on transactional guarantees.
Additionally, for the first time ever, application developers have unparalleled freedom when it comes to modeling data for workloads that require internet-scale, transactions and geo-distribution. As highlighted previously, they have two transactional NoSQL APIs and a distributed SQL API to choose from.
YugaByte DB feature highlights are listed below.
1. Transactional
Distributed ACID transactions that allow multi-row updates across any number of shards at any scale.
2. High Performance
High throughput for ingesting and serving ever-growing datasets.
3. Planet Scale
Global data distribution that brings consistent data close to users through multi-region and multi-cloud deployments.
4. Cloud Native
Self-healing database that automatically tolerates any failures common in the inherently unreliable modern cloud infrastructure.