Secondary indexes

Explore secondary indexes in YugabyteDB using YCQL

This page documents the preview version (v2.23). Preview includes features under active development and is for development and testing only. For production, use the stable version (v2024.1). To learn more, see Versioning.

Using indexes enhances database performance by enabling the database server to find rows faster. You can create, drop, and list indexes.

In YugabyteDB, indexes are global and are implemented just like tables. They are split into tablets and distributed across the different nodes in the cluster. The sharding of indexes is based on the primary key of the index and is independent of how the main table is sharded and distributed. Indexes are not colocated with the base table.

Create indexes

You can create indexes in YCQL using the CREATE INDEX statement using the following syntax:

CREATE INDEX index_name ON table_name(column_list);

YCQL supports Unique, Partial, Covering, and Multi-column secondary indexes.

For more information, see CREATE INDEX.

List indexes and verify the query plan

You can use the DESCRIBE INDEX command to check the indexes as follows:

DESCRIBE INDEX <index_name>

For more information, see DESCRIBE INDEX.

You can also use the EXPLAIN statement to check if a query uses an index and determine the query plan before execution.

For more information, see EXPLAIN.

Remove indexes

You can remove an index using the DROP INDEX statement in YCQL using the following syntax:

DROP INDEX index_name;

For more information, see DROP INDEX.

Example

Before you start

The examples will run on any YugabyteDB universe.
To create a universe, see Set up YugabyteDB universe.

Suppose you work with a database that includes the following table populated with data:

CREATE TABLE employees (
  employee_no integer PRIMARY KEY,
  name text,
  department text
)
WITH TRANSACTIONS = {'enabled':'true'};
INSERT INTO employees(employee_no, name,department) VALUES(1221, 'John Smith', 'Marketing');
INSERT INTO employees(employee_no, name,department) VALUES(1222, 'Bette Davis', 'Sales');
INSERT INTO employees(employee_no, name,department) VALUES(1223, 'Lucille Ball', 'Operations');

The following example shows a query that finds employees working in Operations:

SELECT * FROM employees WHERE department = 'Operations';

To process the preceding query, the whole employees table needs to be scanned and all the shards have to be accessed. For large organizations, this will take a significant amount of time. You can confirm this using the following EXPLAIN statement which indicates seq scan on the table.

EXPLAIN SELECT * FROM employees WHERE department = 'Operations';
 QUERY PLAN
---------------------------------------
 Seq Scan on employees
   Filter: (department = 'Operations')

To speed up the process, you create an index for the department column, as follows:

CREATE INDEX index_employees_department ON employees(department);

The following example executes the query after the index has been applied to department and uses the EXPLAIN statement to prove that the index is used during query execution:

EXPLAIN SELECT * FROM employees WHERE department = 'Operations';

Following is the output produced by the preceding example:

 QUERY PLAN
--------------------------------------------------------------------
 Index Scan using docs.index_employees_department on docs.employees
   Key Conditions: (department = 'Operations')

To remove the index index_employees_department, use the following command:

DROP INDEX index_employees_department;

Multi-column index

Multi-column indexes can be beneficial in situations where queries are searching in more than a single column.

To add a multi-column index to an existing table, you can use the following syntax:

CREATE INDEX index_name ON table_name(col2,col3,col4);

The column order is very important when you create a multi-column index in YCQL because of the structure in which the index is stored. As such, these indexes have a hierarchical order from left to right. So, for the preceding syntax, you can perform search using the following column combinations:

(col2)
(col2,col3)
(col2,col3,col4)

A column combination like (col2,col4) cannot be used to search or query a table.

Multi-column example

Before you start

The examples will run on any YugabyteDB universe.
To create a universe, see Set up YugabyteDB universe.

Create a keyspace and a table as follows:

ycqlsh> CREATE KEYSPACE example;
ycqlsh> USE example;
ycqlsh:example>
CREATE TABLE employees (
    employee_id int PRIMARY KEY,
    first_name text,
    last_name text,
    dept_name text)
WITH TRANSACTIONS = {'enabled': 'true'};

Insert data to the employees table as follows:

ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(1223, 'Lucille', ' Ball', 'Operations');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(1433, 'Brett', ' Davis', 'Marketing');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(1439, 'Simon', ' Thompson', 'HR');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(7898, 'Jackson', ' Lee', 'Sales');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(7866, 'Dan', ' Shen', 'Marketing');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(5666, 'Rob', ' Mencey', 'Marketing');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(4321, 'Dave', ' Spencer', 'Operations');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(3214, 'Dave', ' Marley', 'Operations');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(4121, 'Leo', ' Marvin', 'Sales');
ycqlsh:example> INSERT INTO employees(employee_id, first_name, last_name, dept_name) VALUES(4021, 'Jackson', ' Bob', 'Marketing');

View the contents of the employees table:

ycqlsh:example> select * from employees limit 2;
 employee_id | first_name | last_name | dept_name
-------------+------------+-----------+------------
        3214 |       Dave |    Marley | Operations
        1223 |    Lucille |      Ball | Operations

Suppose you want to query the subset of employees by their first and last names. The query plan using the EXPLAIN statement would look like the following:

ycqlsh:example> EXPLAIN SELECT * FROM employees WHERE first_name='Dave' AND last_name='Marley';
 QUERY PLAN
------------------------------------------------------------
 Seq Scan on example.employees
   Filter: (first_name = 'Dave') AND (last_name = 'Marley')

To process the preceding query, the whole employees table needs to be scanned and all the shards have to be accessed. For large organizations, this will take a significant amount of time as the query is executed sequentially as demonstrated in the preceding EXPLAIN statement.

To speed up the process, you can create an index for first_name and last_name columns, as follows:

ycqlsh:example> Create INDEX index_name ON employees2(first_name, last_name);

The following example executes the query after the index has been applied to the columns first_name and last_name and uses the EXPLAIN statement to prove that the index participated in the processing of the query:

EXPLAIN SELECT * FROM employees WHERE first_name='Dave' AND last_name='Marley';
 QUERY PLAN
-----------------------------------------------------------
 Index Scan using example.index_name on example.employees2
   Key Conditions: (first_name = 'Dave')
   Filter: (last_name = 'Marley')

Learn more