YugaByte YugaByte
Enterprise
Open Source
Customers
Use Cases
Globally Distributed Applications Real-Time Streaming & Analytics Transactional NoSQL Planet-Scale SQL Distributed Transactional Key-Value DB
Resources
All Resources Docs Blog
Get Started
latest v1.0
  • Introduction
  • Quick Start
    • 1. Install YugaByte DB
    • 2. Create Local Cluster
    • 3. Test YCQL API
    • 4. Test YEDIS API
    • 5. Test YSQL API
    • 6. Run Sample Apps
  • Explore Core Features
    • Cloud Native
      • 1. Linear Scalability
      • 2. Fault Tolerance
      • 3. Observability
      • 4. Orchestration Readiness
    • Transactional
      • 1. ACID Transactions
      • 2. Secondary Indexes
      • 3. JSON Documents
    • High Performance
      • Tunable Reads
    • Planet Scale
      • 1. Global Distribution
      • 2. Auto Sharding
      • 3. Auto Rebalancing
    • PostgreSQL (Beta)
      • 1. Linear Scalability
      • 2. Fault Tolerance
      • 3. JOINs
      • 4. Aggregations
      • 5. Expressions
      • 6. Views
  • Develop
    • Tools
      • cqlsh
      • TablePlus
    • Client Drivers
      • C/C++
      • C#
      • Go
      • Java
      • NodeJS
      • Python
    • Learn App Dev
      • 1. SQL vs NoSQL
      • 2. Data Modeling
      • 3. Data Types
      • 4. ACID Transactions
      • 5. Aggregations
      • 6. Batch Operations
    • Ecosystem Integrations
      • Apache Kafka
      • Apache Spark
      • JanusGraph
      • KairosDB
      • Presto
      • Metabase
    • Real World Examples
      • E-Commerce App
      • IoT Fleet Management
      • Retail Analytics
  • Deploy
    • Checklist
    • Manual Deployment
      • 1. System Configuration
      • 2. Install Software
      • 3. Start YB-Masters
      • 4. Start YB-TServers
      • 5. Verify Deployment
    • Kubernetes
      • Helm Chart
      • Local SSD
    • Docker Swarm
    • Public Clouds
      • Amazon Web Services
      • Google Cloud Platform
      • Microsoft Azure
    • Pivotal Cloud Foundry
    • Enterprise Edition
      • 1. Initial Setup
      • 2. Install Admin Console
      • 3. Configure Admin Console
      • 4. Configure Cloud Providers
  • Secure
    • Security Checklist
    • Authentication
    • Authorization
      • 1. RBAC Model
      • 2. Create Roles
      • 3. Grant Permissions
    • TLS Encryption
      • 1. Prepare Nodes
      • 2. Server-server encryption
      • 3. Client-server encryption
      • 4. Connect to cluster
  • Manage
    • Backup and Restore
      • Backing Up Data
      • Restoring Data
    • Data Migration
      • Bulk Import
      • Bulk Export
    • Change Cluster Config
    • Upgrade Deployment
    • Diagnostics Reporting
    • Enterprise Edition
      • Create Universe - Multi-Zone
      • Create Universe - Multi-Region
      • Edit Universe
      • Edit Config Flags
      • Health Checking and Alerts
      • Node Status & Actions
      • Read Replicas
      • Backup & Restore
      • Upgrade Universe
      • Delete Universe
  • Troubleshoot
    • Troubleshooting Overview
    • Cluster Level Issues
      • YCQL Connection Issues
      • YEDIS Connection Issues
    • Node Level Issues
      • Check Processes
      • Inspect Logs
      • System Stats
    • Enterprise Edition
      • Troubleshoot Universes
  • Architecture
    • Basic Concepts
      • Single Node
      • Universe, Cluster, YB-TServer, YB-Master
      • Sharding
      • Replication
      • Persistence
      • Query Layer
      • Acknowledgements
    • Core Functions
      • Universe Creation
      • Table Creation
      • Write IO Path
      • Read IO Path
      • High Availability
    • Transactions
      • Isolation Levels
      • Single Row Transactions
      • Distributed Transactions
      • Transactional IO Path
  • Comparisons
    • Apache Cassandra
    • MongoDB
    • Redis In-Memory Store
    • FoundationDB
    • Amazon DynamoDB
    • Azure Cosmos DB
    • Google Cloud Spanner
    • Apache HBase
  • API Reference
    • YCQL
      • ALTER KEYSPACE
      • ALTER ROLE
      • ALTER TABLE
      • CREATE INDEX
      • CREATE KEYSPACE
      • CREATE ROLE
      • CREATE TABLE
      • CREATE TYPE
      • DROP INDEX
      • DROP KEYSPACE
      • DROP ROLE
      • DROP TABLE
      • DROP TYPE
      • GRANT PERMISSION
      • GRANT ROLE
      • REVOKE PERMISSION
      • REVOKE ROLE
      • USE
      • INSERT
      • SELECT
      • UPDATE
      • DELETE
      • TRANSACTION
      • TRUNCATE
      • Simple Value
      • Subscript
      • Function Call
      • Operator Call
      • BLOB
      • BOOLEAN
      • MAP, SET, LIST
      • FROZEN
      • INET
      • Integer & Counter
      • Non-Integer
      • TEXT
      • Date & Time Types
      • UUID & TIMEUUID
      • JSONB
      • Date & Time Functions
    • YEDIS
      • APPEND
      • AUTH
      • CONFIG
      • CREATEDB
      • DELETEDB
      • LISTDB
      • SELECT
      • DEL
      • ECHO
      • EXISTS
      • EXPIRE
      • EXPIREAT
      • FLUSHALL
      • FLUSHDB
      • GET
      • GETRANGE
      • GETSET
      • HDEL
      • HEXISTS
      • HGET
      • HGETALL
      • HINCRBY
      • HKEYS
      • HLEN
      • HMGET
      • HMSET
      • HSET
      • HSTRLEN
      • HVALS
      • INCR
      • INCRBY
      • KEYS
      • MONITOR
      • PEXPIRE
      • PEXPIREAT
      • PTTL
      • ROLE
      • SADD
      • SCARD
      • SET
      • SETRANGE
      • SISMEMBER
      • SMEMBERS
      • SREM
      • STRLEN
      • ZRANGE
      • TSADD
      • TSCARD
      • TSGET
      • TSLASTN
      • TSRANGEBYTIME
      • TSREM
      • TSREVRANGEBYTIME
      • TTL
      • ZADD
      • ZCARD
      • ZRANGEBYSCORE
      • ZREM
      • ZREVRANGE
      • PUBSUB
      • PUBLISH
      • SUBSCRIBE
      • UNSUBSCRIBE
      • PSUBSCRIBE
      • PUNSUBSCRIBE
    • YSQL (Beta)
      • DDL Statements
        • CREATE DATABASE
        • CREATE TABLE
        • CREATE VIEW
        • DROP DATABASE
        • DROP TABLE
      • DML Statements
        • INSERT
        • SELECT
      • Datatypes
        • FLOAT
        • INTEGER
        • TEXT
      • Transactions
      • Roles and Permissions
      • Prepared Statements
      • Explain Statement
  • Admin Reference
    • yb-ctl
    • yb-docker-ctl
    • docker-compose
    • yb-master
    • yb-tserver
  • FAQs
    • Product
    • Architecture
    • Enterprise Edition
    • Cassandra Compatibility
> Explore Core Features > Transactional >

Secondary Indexes

A database index is a data structure that improves the speed of data retrieval operations on a database table. Secondary indexes require additional writes and storage space to maintain the index data structure. They can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

YugaByte DB provides consistent (ACID), performant secondary indexes. They are built on top of distributed ACID transactions. You can read more about transactions in our architecture docs.

If you haven’t installed YugaByte DB yet, do so first by following the Quick Start guide.

NOTE: Secondary indexes are a work in progress. Here are some requirements to keep in mind currently when using secondary indexes in YugaByte:

  • To create a secondary index on a table, the primary table needs to be created with distributed transaction enabled using the with transactions = { 'enabled' : true } clause.
  • The secondary index needs to be created before any data is inserted into the primary table.

These requirements may be removed in the future.

  • macOS
  • Linux
  • Docker
  • Kubernetes

1. Setup - create universe

If you have a previously running local universe, destroy it using the following.

$ ./bin/yb-ctl destroy

Start a new local cluster - by default, this will create a 3 node universe with a replication factor of 3.

$ ./bin/yb-ctl create

2. Create a table with secondary indexes

Connect to the cluster using cqlsh.

$ ./bin/cqlsh
Connected to local cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9-SNAPSHOT | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>

Create a keyspace.

cqlsh> CREATE KEYSPACE store;

Create a table with the transactions property set to enabled.

cqlsh> CREATE TABLE store.orders (
  customer_id int,
  order_date timestamp,
  amount double,
  PRIMARY KEY ((customer_id), order_date)
) with transactions = { 'enabled' : true };

Now create a secondary index on the order_date column. Note that we include the amount column in the secondary index in order to respond to queries selecting the amount column directly from the secondary index table with just one read.

cqlsh> create index orders_by_date on store.orders (order_date, customer_id) include (amount);

3. Insert sample data

Let us seed this table with some sample data. Paste the following into the cqlsh prompt.

INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-02', 100.30);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (2, '2018-04-02', 50.45);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-06', 20.25);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (3, '2018-04-06', 200.80);

4. Perform some queries

  • Get the total amount for a customer

Let us write a query to fetch the sum total of the order amount column across all orders for a customer. This query will be executed against the primary table using the partition key customer_id, and therefore does not use the secondary index.

cqlsh> select sum(amount) from store.orders where customer_id = 1;
 sum(amount)
-------------
      120.55
(1 rows)
  • Get the total amount for a specific date

Now, let us write a query to fetch the sum total of order amount across all orders for a specific date. Because we have a secondary index on the order_date column of the table, the query analyzer will execute the query against the secondary index using the partition key order_date and avoid a full-table scan of the primary table.

cqlsh> select sum(amount) from store.orders where order_date = '2018-04-02';
 sum(amount)
-------------
      150.75
(1 rows)

5. Clean up (optional)

Optionally, you can shutdown the local cluster created in Step 1.

$ ./bin/yb-ctl destroy

1. Setup - create universe

If you have a previously running local universe, destroy it using the following.

$ ./bin/yb-ctl destroy

Start a new local cluster - by default, this will create a 3 node universe with a replication factor of 3.

$ ./bin/yb-ctl create

2. Create a table with secondary indexes

Connect to the cluster using cqlsh.

$ ./bin/cqlsh
Connected to local cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9-SNAPSHOT | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>

Create a keyspace.

cqlsh> CREATE KEYSPACE store;

Create a table with the transactions property set to enabled.

cqlsh> CREATE TABLE store.orders (
  customer_id int,
  order_date timestamp,
  amount double,
  PRIMARY KEY ((customer_id), order_date)
) with transactions = { 'enabled' : true };

Now create a secondary index on the order_date column. Note that we include the amount column in the secondary index in order to respond to queries selecting the amount column directly from the secondary index table with just one read.

cqlsh> create index orders_by_date on store.orders (order_date, customer_id) include (amount);

3. Insert sample data

Let us seed this table with some sample data. Paste the following into the cqlsh prompt.

INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-02', 100.30);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (2, '2018-04-02', 50.45);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-06', 20.25);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (3, '2018-04-06', 200.80);

4. Perform some queries

  • Get the total amount for a customer

Let us write a query to fetch the sum total of the order amount column across all orders for a customer. This query will be executed against the primary table using the partition key customer_id, and therefore does not use the secondary index.

cqlsh> select sum(amount) from store.orders where customer_id = 1;
 sum(amount)
-------------
      120.55
(1 rows)
  • Get the total amount for a specific date

Now, let us write a query to fetch the sum total of order amount across all orders for a specific date. Because we have a secondary index on the order_date column of the table, the query analyzer will execute the query against the secondary index using the partition key order_date and avoid a full-table scan of the primary table.

cqlsh> select sum(amount) from store.orders where order_date = '2018-04-02';
 sum(amount)
-------------
      150.75
(1 rows)

5. Clean up (optional)

Optionally, you can shutdown the local cluster created in Step 1.

$ ./bin/yb-ctl destroy

1. Setup - create universe

If you have a previously running local universe, destroy it using the following.

$ ./yb-docker-ctl destroy

Start a new local cluster - by default, this will create a 3 node universe with a replication factor of 3.

$ ./yb-docker-ctl create

2. Create a table with secondary indexes

Connect to cqlsh on node 1.

$ docker exec -it yb-tserver-n1 /home/yugabyte/bin/cqlsh
Connected to local cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9-SNAPSHOT | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>

Create a keyspace.

cqlsh> CREATE KEYSPACE store;

Create a table with the transactions property set to enabled.

cqlsh> CREATE TABLE store.orders (
  customer_id int,
  order_date timestamp,
  amount double,
  PRIMARY KEY ((customer_id), order_date)
) with transactions = { 'enabled' : true };

Now create a secondary index on the order_date column. Note that we include the amount column in the secondary index in order to respond to queries selecting the amount column directly from the secondary index table with just one read.

cqlsh> create index orders_by_date on store.orders (order_date, customer_id) include (amount);

3. Insert sample data

Let us seed this table with some sample data. Paste the following into the cqlsh prompt.

INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-02', 100.30);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (2, '2018-04-02', 50.45);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-06', 20.25);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (3, '2018-04-06', 200.80);

4. Perform some queries

  • Get the total amount for a customer

Let us write a query to fetch the sum total of the order amount column across all orders for a customer. This query will be executed against the primary table using the partition key customer_id, and therefore does not use the secondary index.

cqlsh> select sum(amount) from store.orders where customer_id = 1;
 sum(amount)
-------------
      120.55
(1 rows)
  • Get the total amount for a specific date

Now, let us write a query to fetch the sum total of order amount across all orders for a specific date. Because we have a secondary index on the order_date column of the table, the query analyzer will execute the query against the secondary index using the partition key order_date and avoid a full-table scan of the primary table.

cqlsh> select sum(amount) from store.orders where order_date = '2018-04-02';
 sum(amount)
-------------
      150.75
(1 rows)

5. Clean up (optional)

Optionally, you can shutdown the local cluster created in Step 1.

$ ./yb-docker-ctl destroy

1. Setup - create universe and table

If you have a previously running local universe, destroy it using the following.

$ kubectl delete -f yugabyte-statefulset.yaml

Start a new local cluster - by default, this will create a 3 node universe with a replication factor of 3.

$ kubectl apply -f yugabyte-statefulset.yaml

Make sure there are 3 yb-tserver and 3 yb-master pods representing the 3 nodes of the cluster.

$ kubectl get pods
NAME           READY     STATUS    RESTARTS   AGE
yb-master-0    1/1       Running   0          13s
yb-master-1    1/1       Running   0          13s
yb-master-2    1/1       Running   0          13s
yb-tserver-0   1/1       Running   1          12s
yb-tserver-1   1/1       Running   1          12s
yb-tserver-2   1/1       Running   1          12s

2. Create a table with secondary indexes

Connect to the cluster using cqlsh.

Connect to cqlsh on node 1.

$ kubectl exec -it yb-tserver-0 /home/yugabyte/bin/cqlsh
Connected to local cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9-SNAPSHOT | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>

Create a keyspace.

cqlsh> CREATE KEYSPACE store;

Create a table with the transactions property set to enabled.

cqlsh> CREATE TABLE store.orders (
  customer_id int,
  order_date timestamp,
  amount double,
  PRIMARY KEY ((customer_id), order_date)
) with transactions = { 'enabled' : true };

Now create a secondary index on the order_date column. Note that we include the amount column in the secondary index in order to respond to queries selecting the amount column directly from the secondary index table with just one read.

cqlsh> create index orders_by_date on store.orders (order_date, customer_id) include (amount);

3. Insert sample data

Let us seed this table with some sample data. Paste the following into the cqlsh prompt.

INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-02', 100.30);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (2, '2018-04-02', 50.45);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (1, '2018-04-06', 20.25);
INSERT INTO store.orders (customer_id, order_date, amount) VALUES (3, '2018-04-06', 200.80);

4. Perform some queries

  • Get the total amount for a customer

Let us write a query to fetch the sum total of the order amount column across all orders for a customer. This query will be executed against the primary table using the partition key customer_id, and therefore does not use the secondary index.

cqlsh> select sum(amount) from store.orders where customer_id = 1;
 sum(amount)
-------------
      120.55
(1 rows)
  • Get the total amount for a specific date

Now, let us write a query to fetch the sum total of order amount across all orders for a specific date. Because we have a secondary index on the order_date column of the table, the query analyzer will execute the query against the secondary index using the partition key order_date and avoid a full-table scan of the primary table.

cqlsh> select sum(amount) from store.orders where order_date = '2018-04-02';
 sum(amount)
-------------
      150.75
(1 rows)

5. Clean up (optional)

Optionally, you can shutdown the local cluster created in Step 1.

$ kubectl delete -f yugabyte-statefulset.yaml
Explore Core Features
Distributed ACID Transactions
Explore Core Features
JSON Documents
Have a technical question?
Slack StackOverflow
Found a bug or have an enchancement request?
GitHub
YugaByte
Contact Us
Copyright © 2017-2019 YugaByte, Inc. All rights reserved.