After creating a local cluster, you can start exploring YugabyteDB's PostgreSQL-compatible, fully-relational Yugabyte SQL API.

Set up the sample database

The examples in this tutorial use two tables, representing departments and employees. To start, use ysqlsh to create a database and schema, and insert data into the tables.

Open the YSQL shell

Using the YugabyteDB SQL shell, ysqlsh, you can connect to your cluster and interact with it using distributed SQL. ysqlsh is installed with YugabyteDB and is located in the bin directory of the YugabyteDB home directory.

To open the YSQL shell, run ysqlsh.

$ ./bin/ysqlsh
ysqlsh (11.2-YB-{{<yb-version version="preview">}}-b0)
Type "help" for help.

yugabyte=#

To open the YSQL shell, run ysqlsh.

$ ./bin/ysqlsh
ysqlsh (11.2-YB-{{<yb-version version="preview">}}-b0)
Type "help" for help.

yugabyte=#

To open the YSQL shell, run ysqlsh.

$ docker exec -it yugabyte /home/yugabyte/bin/ysqlsh --echo-queries
ysqlsh (11.2-YB-{{<yb-version version="preview">}}-b0)
Type "help" for help.

yugabyte=#

To open the YSQL shell (ysqlsh), run the following.

$ kubectl --namespace yb-demo exec -it yb-tserver-0 -- sh -c "cd /home/yugabyte && ysqlsh -h yb-tserver-0 --echo-queries"
ysqlsh (11.2-YB-{{<yb-version version="preview">}}-b0)
Type "help" for help.

yugabyte=#

Create a database

To create a database (yb_demo), do the following:

  1. Enter the following CREATE DATABASE command:

    yugabyte=# CREATE DATABASE yb_demo;
    
  2. Connect to the new database using the ysqlsh \c meta command:

    yugabyte=# \c yb_demo;
    

Create the schema

The database for this tutorial includes two tables: dept for Departments, and emp for Employees.

Create the database schema by running the following commands.

CREATE TABLE IF NOT EXISTS public.dept (
    deptno integer NOT NULL,
    dname text,
    loc text,
    description text,
    CONSTRAINT pk_dept PRIMARY KEY (deptno asc)
);
CREATE TABLE IF NOT EXISTS emp (
    empno integer generated by default as identity (start with 10000) NOT NULL,
    ename text NOT NULL,
    job text,
    mgr integer,
    hiredate date,
    sal integer,
    comm integer,
    deptno integer NOT NULL,
    email text,
    other_info jsonb,
    CONSTRAINT pk_emp PRIMARY KEY (empno hash),
    CONSTRAINT emp_email_uk UNIQUE (email),
    CONSTRAINT fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno),
    CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno),
    CONSTRAINT emp_email_check CHECK ((email ~ '^[a-zA-Z0-9.!#$%&''*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$'::text))
);

The emp table references the dept table through a foreign key constraint. The emp table also references itself through a foreign key constraint to ensure that an employee's manager is in turn an employee themselves.

The emp table uses constraints to ensure integrity of data, such as uniqueness and validity of email addresses.

Load data

Insert rows into the tables using multi-value inserts to reduce client-server round trips.

  1. Load data into the dept table by running the following command.

    INSERT INTO dept (deptno,  dname,        loc, description)
       values    (10,     'ACCOUNTING', 'NEW YORK','preparation of financial statements, maintenance of general ledger, payment of bills, preparation of customer bills, payroll, and more.'),
                 (20,     'RESEARCH',   'DALLAS','responsible for preparing the substance of a research report or security recommendation.'),
                 (30,     'SALES',      'CHICAGO','division of a business that is responsible for selling products or services'),
                 (40,     'OPERATIONS', 'BOSTON','administration of business practices to create the highest level of efficiency possible within an organization');
    
  2. Load data into the emp table by running the following command.

    INSERT INTO emp (empno, ename,    job,        mgr,   hiredate,     sal, comm, deptno, email, other_info)
       values   (7369, 'SMITH',  'CLERK',     7902, '1980-12-17',  800, NULL,   20,'SMITH@acme.com', '{"skills":["accounting"]}'),
                (7499, 'ALLEN',  'SALESMAN',  7698, '1981-02-20', 1600,  300,   30,'ALLEN@acme.com', null),
                (7521, 'WARD',   'SALESMAN',  7698, '1981-02-22', 1250,  500,   30,'WARD@compuserve.com', null),
                (7566, 'JONES',  'MANAGER',   7839, '1981-04-02', 2975, NULL,   20,'JONES@gmail.com', null),
                (7654, 'MARTIN', 'SALESMAN',  7698, '1981-09-28', 1250, 1400,   30,'MARTIN@acme.com', null),
                (7698, 'BLAKE',  'MANAGER',   7839, '1981-05-01', 2850, NULL,   30,'BLAKE@hotmail.com', null),
                (7782, 'CLARK',  'MANAGER',   7839, '1981-06-09', 2450, NULL,   10,'CLARK@acme.com', '{"skills":["C","C++","SQL"]}'),
                (7788, 'SCOTT',  'ANALYST',   7566, '1982-12-09', 3000, NULL,   20,'SCOTT@acme.com', '{"cat":"tiger"}'),
                (7839, 'KING',   'PRESIDENT', NULL, '1981-11-17', 5000, NULL,   10,'KING@aol.com', null),
                (7844, 'TURNER', 'SALESMAN',  7698, '1981-09-08', 1500,    0,   30,'TURNER@acme.com', null),
                (7876, 'ADAMS',  'CLERK',     7788, '1983-01-12', 1100, NULL,   20,'ADAMS@acme.org', null),
                (7900, 'JAMES',  'CLERK',     7698, '1981-12-03',  950, NULL,   30,'JAMES@acme.org', null),
                (7902, 'FORD',   'ANALYST',   7566, '1981-12-03', 3000, NULL,   20,'FORD@acme.com', '{"skills":["SQL","CQL"]}'),
                (7934, 'MILLER', 'CLERK',     7782, '1982-01-23', 1300, NULL,   10,'MILLER@acme.com', null);
    

You now have sample data and are ready to begin exploring YSQL in YugabyteDB.

Explore YugabyteDB

To display the schema of the emp table, enter the following shell meta-command:

yb_demo=# \d emp
                               Table "public.emp"
   Column   |  Type   | Collation | Nullable |             Default
------------+---------+-----------+----------+----------------------------------
 empno      | integer |           | not null | generated by default as identity
 ename      | text    |           | not null |
 job        | text    |           |          |
 mgr        | integer |           |          |
 hiredate   | date    |           |          |
 sal        | integer |           |          |
 comm       | integer |           |          |
 deptno     | integer |           | not null |
 email      | text    |           |          |
 other_info | jsonb   |           |          |
Indexes:
    "pk_emp" PRIMARY KEY, lsm (empno HASH)
    "emp_email_uk" UNIQUE CONSTRAINT, lsm (email HASH)
Check constraints:
    "emp_email_check" CHECK (email ~ '^[a-zA-Z0-9.!#$%&''*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$'::text)
Foreign-key constraints:
    "fk_deptno" FOREIGN KEY (deptno) REFERENCES dept(deptno)
    "fk_mgr" FOREIGN KEY (mgr) REFERENCES emp(empno)
Referenced by:
    TABLE "emp" CONSTRAINT "fk_mgr" FOREIGN KEY (mgr) REFERENCES emp(empno)

SQL updates

The UPDATE statement can compute a new value and return it without the need to do another query. Using the RETURNING clause returns the new values in the same call.

The following adds 100 to the salaries of all employees who are not managers and shows the new value:

UPDATE emp SET sal=sal+100
    WHERE job != 'MANAGER'
    RETURNING ename,sal AS new_salary;
 ename  | new_salary
--------+------------
 SMITH  |        900
 ADAMS  |       1200
 WARD   |       1350
 KING   |       5100
 FORD   |       3100
 MARTIN |       1350
 JAMES  |       1050
 ALLEN  |       1700
 MILLER |       1400
 SCOTT  |       3100
 TURNER |       1600
(11 rows)

Join

A self-join is a regular join where the table is joined with itself. The following statement matches employees with their manager and filters those that are earning more than their manager.

SELECT
    employee.ename,
    employee.sal,
    manager.ename AS "manager ename",
    manager.sal AS "manager sal"
FROM
    emp employee
JOIN emp manager ON
    employee.mgr = manager.empno
WHERE
    manager.sal<employee.sal
ORDER BY employee.sal;
 ename | sal  | manager ename | manager sal
-------+------+---------------+-------------
 FORD  | 3100 | JONES         |        2975
 SCOTT | 3100 | JONES         |        2975
(2 rows)

Prepared statements

Use a prepared statement with typed input to prevent SQL injection. A prepared statement declares parameterized SQL.

  1. Prepare the statement employee_salary with a parameterized query. The following prepared statement accepts the input of an employee number as an integer only and displays the name and salary:

    prepare employee_salary(int) AS
        SELECT ename,sal FROM emp WHERE empno=$1;
    
    PREPARE
    
  2. Use EXECUTE to execute a prepared statement. The following executes the prepared statement for the employee ID 7900:

    EXECUTE employee_salary(7900);
    
        ename | sal
    -------+------
    JAMES | 1050
    (1 row)
    
  3. Execute the same prepared statement with another value:

    EXECUTE employee_salary(7902);
    
    ename | sal
    -------+------
    FORD  | 3100
    (1 row)
    
  4. A prepared statement stays in the session until it is de-allocated. The following frees the memory used by this statement:

    DEALLOCATE employee_salary;
    
    DEALLOCATE
    

Indexes

Use indexes to query table values more efficiently.

  1. Create a table with randomly generated rows. You can use the generate_series() function to generate rows. The following uses generate_series() to create a table with 42 rows and a random value from 1 to 10:

    CREATE TABLE demo AS SELECT generate_series(1,42) num, round(10*random()) val;
    
    SELECT 42
    
  2. Create the index demo_val on the demo table. The following statement creates an index on val (hashed for distribution) and num in ascending order:

    CREATE INDEX demo_val ON demo(val,num);
    
    CREATE INDEX
    
  3. Use ANALYZE to gather optimizer statistics on the table. The query planner chooses the best access path when provided with statistics about the data stored in the table:

    analyze demo;
    
    ANALYZE
    
  4. Query the Top 3 numbers for a specific value:

    SELECT * FROM demo WHERE val=5 ORDER BY num FETCH FIRST 3 ROWS only;
    
    num | val
    -----+-----
      11 |   5
      35 |   5
    (2 rows)
    
  5. Verify that index is leading to faster query execution using EXPLAIN ANALYZE. When defining an index for a specific access pattern, verify that the index is used. The following shows that an Index Only Scan was used, without the need for an additional Sort operation:

    EXPLAIN ANALYZE SELECT * FROM demo WHERE val=5 ORDER BY num FETCH FIRST 3 ROWS only;
    
                                                              QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------    --
     Limit  (cost=0.00..3.19 rows=3 width=12) (actual time=1.757..1.765 rows=3 loops=1)
       ->  Index Only Scan using demo_val on demo  (cost=0.00..4.47 rows=4 width=12) (actual time=1.754..1.758 rows=3     loops=1)
             Index Cond: (val = '5'::double precision)
             Heap Fetches: 0
     Planning Time: 0.214 ms
     Execution Time: 1.860 ms
     Peak Memory Usage: 8 kB
    (7 rows)
    
  6. Clean up the table for this exercise:

    DROP TABLE IF EXISTS demo;
    
    DROP TABLE
    

Recursive queries

The following example uses a recursive common table expression (CTE) to show the manager hierarchy. The emp_manager CTE is built using the WITH RECURSIVE clause to follow the hierarchy under JONES, down to the last level. The first subquery in the recursive clause starts at JONES. The second lists the employees who have JONES as a manager. They are declared with a UNION ALL and are executed recursively to get the other levels. The main query is then run on the CTE.

WITH RECURSIVE emp_manager AS (
    SELECT empno, ename, ename AS PATH
        FROM emp WHERE ename='JONES'
    UNION ALL
    SELECT emp.empno, emp.ename, emp_manager.path || ' manages ' || emp.ename
        FROM emp JOIN emp_manager ON emp.mgr = emp_manager.empno
)
SELECT * FROM emp_manager;
 empno | ename |               path
-------+-------+-----------------------------------
  7566 | JONES | JONES
  7788 | SCOTT | JONES manages SCOTT
  7902 | FORD  | JONES manages FORD
  7876 | ADAMS | JONES manages SCOTT manages ADAMS
  7369 | SMITH | JONES manages FORD manages SMITH
(5 rows)

Window functions

Use analytic window functions to compare the hiring time interval by department.

The following SQL statement uses WINDOW to define groups of employees by department, ordered by hiring date. The LAG window function is used to access the previous row to compare the hiring date interval between two employees. FORMAT builds text from column values, and COALESCE handles the first hire for which there is no previous row in the group. Without these window functions, this query would need to read the table twice.

SELECT
    dname,ename,job,
    coalesce (
        'hired '||to_char(hiredate -
        lag(hiredate) over (per_dept_hiredate),'999')||' days after '||
        lag(ename) over (per_dept_hiredate),
        format('(1st hire in %L)',dname)
    ) AS "last hire in dept"
    FROM emp JOIN dept USING(deptno)
    WINDOW per_dept_hiredate
    AS (PARTITION BY dname ORDER BY hiredate)
    ORDER BY dname,hiredate;
   dname    | ename  |    job    |      last hire in dept
------------+--------+-----------+------------------------------
 ACCOUNTING | CLARK  | MANAGER   | (1st hire in 'ACCOUNTING')
 ACCOUNTING | KING   | PRESIDENT | hired  161 days after CLARK
 ACCOUNTING | MILLER | CLERK     | hired   67 days after KING
 RESEARCH   | SMITH  | CLERK     | (1st hire in 'RESEARCH')
 RESEARCH   | JONES  | MANAGER   | hired  106 days after SMITH
 RESEARCH   | FORD   | ANALYST   | hired  245 days after JONES
 RESEARCH   | SCOTT  | ANALYST   | hired  371 days after FORD
 RESEARCH   | ADAMS  | CLERK     | hired   34 days after SCOTT
 SALES      | ALLEN  | SALESMAN  | (1st hire in 'SALES')
 SALES      | WARD   | SALESMAN  | hired    2 days after ALLEN
 SALES      | BLAKE  | MANAGER   | hired   68 days after WARD
 SALES      | TURNER | SALESMAN  | hired  130 days after BLAKE
 SALES      | MARTIN | SALESMAN  | hired   20 days after TURNER
 SALES      | JAMES  | CLERK     | hired   66 days after MARTIN
(14 rows)

REGEXP matching

Use regular expressions in an array to do pattern matching. REGEXP performs a pattern match of a string expression. The following lists employees with an e-mail ending in '.org' or a domain starting with 'gmail.':

SELECT * FROM emp
    WHERE email ~ any ( ARRAY[ '@.*\.org$' , '@gmail\.' ] );
 empno | ename |   job   | mgr  |  hiredate  | sal  | comm | deptno |      email      | other_info
-------+-------+---------+------+------------+------+------+--------+-----------------+------------
  7876 | ADAMS | CLERK   | 7788 | 1983-01-12 | 1100 |      |     20 | ADAMS@acme.org  |
  7566 | JONES | MANAGER | 7839 | 1981-04-02 | 2975 |      |     20 | JONES@gmail.com |
  7900 | JAMES | CLERK   | 7698 | 1981-12-03 |  950 |      |     30 | JAMES@acme.org  |
(3 rows)

Arithmetic date intervals

Using arithmetic on date intervals, you can find employees with overlapping evaluation periods.

The interval data type allows you to store and manipulate a period of time in years, months, days, and so forth. The following example compares overlapping evaluation periods. The WITH clause defines the evaluation period length depending on the job:

WITH emp_evaluation_period AS (
    SELECT ename,deptno,hiredate,
        hiredate + CASE WHEN job IN ('MANAGER','PRESIDENT')
        THEN interval '3 month' ELSE interval '4 weeks'
        END evaluation_end FROM emp
    )
SELECT * FROM emp_evaluation_period e1
    JOIN emp_evaluation_period e2
    ON (e1.ename>e2.ename) AND (e1.deptno=e2.deptno)
    WHERE (e1.hiredate,e1.evaluation_end)
    overlaps (e2.hiredate,e2.evaluation_end);
 ename  | deptno |  hiredate  |   evaluation_end    | ename  | deptno |  hiredate  |   evaluation_end
--------+--------+------------+---------------------+--------+--------+------------+---------------------
 MILLER |     10 | 1982-01-23 | 1982-02-20 00:00:00 | KING   |     10 | 1981-11-17 | 1982-02-17 00:00:00
 TURNER |     30 | 1981-09-08 | 1981-10-06 00:00:00 | MARTIN |     30 | 1981-09-28 | 1981-10-26 00:00:00
 WARD   |     30 | 1981-02-22 | 1981-03-22 00:00:00 | ALLEN  |     30 | 1981-02-20 | 1981-03-20 00:00:00
(3 rows)

Cross table pivots

Use a cross table to show the sum of salary per job, by department. The shell \crosstabview meta-command displays rows as columns. The following statement sums the salaries across jobs and departments and displays them as a cross table:

SELECT job, dname, sum(sal)
    FROM emp JOIN dept USING(deptno)
    GROUP BY dname, job
    \crosstabview
    job    | ACCOUNTING | SALES | RESEARCH
-----------+------------+-------+----------
 PRESIDENT |       5000 |       |
 CLERK     |       1300 |   950 |     1900
 SALESMAN  |            |  5600 |
 MANAGER   |       2450 |  2850 |     2975
 ANALYST   |            |       |     6000
(5 rows)

ntile function

To send the e-mails to all employees in different batches, split them into three groups using the ntile() window function. Then format them using the format() function, and aggregate them in a comma-separated list using the string_agg() function:

WITH groups AS (
    SELECT ntile(3) OVER (ORDER BY empno) group_num
    ,*
    FROM emp
)
SELECT string_agg(format('<%s> %s',ename,email),', ')
FROM groups GROUP BY group_num;
                                                          string_agg
-------------------------------------------------------------------------------------------------------------------------------
 <ADAMS> ADAMS@acme.org, <JAMES> JAMES@acme.org, <FORD> FORD@acme.com, <MILLER> MILLER@acme.com
 <BLAKE> BLAKE@hotmail.com, <CLARK> CLARK@acme.com, <SCOTT> SCOTT@acme.com, <KING> KING@aol.com, <TURNER> TURNER@acme.com
 <SMITH> SMITH@acme.com, <ALLEN> ALLEN@acme.com, <WARD> WARD@compuserve.com, <JONES> JONES@gmail.com, <MARTIN> MARTIN@acme.com
(3 rows)

GIN index on documents

The employee skills are stored in a semi-structured JSON document. You can query them using the @>, ?, ?&, and ?| operators. For best performance, index them using a GIN index. GIN indexes provide quick access to elements inside a JSON document.

(GIN indexes are only available in YugabyteDB v2.11.0 or later. If you are using an earlier version, skip this scenario.)

  1. Create the GIN index on the JSON document.

    CREATE INDEX emp_skills ON emp USING gin((other_info->'skills'));
    

    This creates an index on the skills attributes in the other_info JSON column.

  2. Query on the JSON attribute list. SQL queries can navigate into the JSON document using -> and check if an array contains a value using @>. The following searches the employees with the "SQL" skill.

    SELECT * FROM emp WHERE other_info->'skills' @> '"SQL"';
    
  3. Explain the plan to verify that the index is used.

    explain SELECT * FROM emp WHERE other_info->'skills' @> '"SQL"';
    

    Thanks to the GIN index, this search doesn't need to read all documents.

SQL queries can search in text using the to_tsvector() text search function to extract a list of words that can be compared. This exercise finds all department descriptions with the words 'responsible' and 'services' in it using a GIN index.

(GIN indexes are only available in YugabyteDB v2.11.0 or later. If you are using an earlier version, skip this scenario.)

  1. Create a text search index on the description column. The following creates an index for the simple-grammar vector of words extracted from the department description:

    CREATE INDEX dept_description_text_search ON dept
        USING gin (( to_tsvector('simple',description) ));
    
  2. Query on description for matching words. The following compares the simple-grammar vector of words extracted from the department description with a word search pattern to find the departments that contain "responsible" and "service" in their description.

    SELECT * FROM dept
        WHERE to_tsvector('simple',description) @@ to_tsquery('simple','responsible & services');
    
  3. Explain the plan to verify that the index is used.

    explain SELECT * FROM dept
        WHERE to_tsvector('simple',description) @@ to_tsquery('simple','responsible & services');
    

Thanks to the GIN index, this search doesn't need to read all rows and text.

Stored procedures

A stored procedure encapsulates procedural logic into an atomic operation. Use stored procedures to encapsulate transactions with error handling. The following example creates a procedure in PL/pgSQL, named "commission_transfer", that transfers a commission "amount" from empno1 to empno2.

  1. Create the procedure for the commission transfer between employees. The procedure has two SQL operations: decrease from empno1 and add to empno2. It also adds error checking to raise a custom exception if empno1 doesn't have sufficient funds to transfer.

    CREATE OR REPLACE PROCEDURE commission_transfer(empno1 int, empno2 int, amount int) AS $$
    begin
        update emp set comm=comm-commission_transfer.amount
            where empno=commission_transfer.empno1 and comm>commission_transfer.amount;
        if not found then raise exception 'Cannot transfer % from %',amount,empno1; end if;
        update emp set comm=comm+commission_transfer.amount
            where emp.empno=commission_transfer.empno2;
        if not found then raise exception 'Cannot transfer from %',empno2; end if;
    end;
    $$ language plpgsql;
    
    CREATE PROCEDURE
    
  2. Call the procedure with employee IDs and the amount to be transferred. The following CALL statement calls the stored procedure, with values for all parameters, transferring 100 from employee 7521 to employee 7654.

    CALL commission_transfer(7521,7654,100);
    
    CALL
    
  3. List all employees who have received commission to verify the transfer:

    SELECT * FROM emp WHERE comm IS NOT NULL;
    
    empno | ename  |   job    | mgr  |  hiredate  | sal  | comm | deptno |        email        | other_info
    -------+--------+----------+------+------------+------+------+--------+---------------------+------------
      7521 | WARD   | SALESMAN | 7698 | 1981-02-22 | 1350 |  400 |     30 | WARD@compuserve.com |
      7654 | MARTIN | SALESMAN | 7698 | 1981-09-28 | 1350 | 1500 |     30 | MARTIN@acme.com     |
      7499 | ALLEN  | SALESMAN | 7698 | 1981-02-20 | 1700 |  300 |     30 | ALLEN@acme.com      |
      7844 | TURNER | SALESMAN | 7698 | 1981-09-08 | 1600 |    0 |     30 | TURNER@acme.com     |
    (4 rows)
    
  4. Call the procedure with an amount that is not available. The following attempts to transfer 1000000, which is more than what 7521 has available:

    CALL commission_transfer(7521,7654,999999);
    
    ERROR:  Cannot transfer 999999 from 7521
    CONTEXT:  PL/pgSQL function commission_transfer(integer,integer,integer) line 5 at RAISE
    

This raises the "Cannot transfer" error defined in the procedure, and automatically reverts all intermediate changes to return to a consistent state.

Triggers

Use triggers to automatically update data. This example uses a trigger to record the last time each row is updated automatically.

  1. Add a column to store the last update time.

    ALTER TABLE dept ADD last_update timestamptz;
    
    ALTER TABLE
    
  2. Add a function to set the last update time. The following query uses the transaction_timestamp() function, which returns the current date and time at the start of the current transaction. A stored procedure declares some procedural logic that returns a value. In this case, dept_last_update() returns the "new" state for a trigger after setting the "last_update" to the current time.

    CREATE OR REPLACE FUNCTION dept_last_update() RETURNS trigger AS $$
        begin
            new.last_update:=transaction_timestamp();
            return new;
        end;
    $$ language plpgsql;
    
    CREATE FUNCTION
    
  3. Use a trigger to call the procedure automatically. The following statement creates a trigger to execute the procedure on each row update of the dept table.

    CREATE trigger dept_last_update
        BEFORE UPDATE ON dept
        for each row
        EXECUTE PROCEDURE dept_last_update();
    
    CREATE TRIGGER
    
  4. Display the current state of the table. To verify the automatic logging of the last update time, the following displays the current state of departments before any update.

    SELECT deptno,dname,loc,last_update FROM dept;
    
    deptno |   dname    |   loc    | last_update
    --------+------------+----------+-------------
        10 | ACCOUNTING | NEW YORK |
        20 | RESEARCH   | DALLAS   |
        30 | SALES      | CHICAGO  |
        40 | OPERATIONS | BOSTON   |
    (4 rows)
    
  5. Update multiple rows in a single transaction. You can declare multiple updates in a single atomic transaction using BEGIN TRANSACTION and COMMIT. The following updates the location of departments 30 and 40 with a 3 second interval.

    BEGIN TRANSACTION;
    UPDATE dept SET loc='SUNNYVALE' WHERE deptno=30;
    SELECT pg_sleep(3);
    UPDATE dept SET loc='SUNNYVALE' WHERE deptno=40;
    COMMIT;
    
    BEGIN
    UPDATE 1
    pg_sleep
    ----------
    
    (1 row)
    UPDATE 1
    COMMIT
    
  6. Display the new state of the table.

    SELECT deptno,dname,loc,last_update FROM dept;
    
    deptno |   dname    |    loc    |          last_update
    --------+------------+-----------+-------------------------------
        10 | ACCOUNTING | NEW YORK  |
        20 | RESEARCH   | DALLAS    |
        30 | SALES      | SUNNYVALE | 2022-01-11 22:15:34.831474+00
        40 | OPERATIONS | SUNNYVALE | 2022-01-11 22:15:34.831474+00
    (4 rows)
    

In addition to the changed location, the last update timestamp has been automatically set. Although the updates were done at 3 second intervals, they show the same update time because they were run in the same atomic transaction.

Materialized views

To get fast on-demand reports, create a materialized view to store pre-joined and pre-aggregated data.

(Materialized views are only available in YugabyteDB v2.11.2 or later. If you are using an earlier version, skip this scenario.)

  1. Create the materialized view. This view stores the total salary per department, the number of employees, and the list of jobs in the department:

    CREATE MATERIALIZED VIEW report_sal_per_dept AS
    SELECT
        deptno,dname,
        sum(sal) sal_per_dept,
        count(*) num_of_employees,
        string_agg(distinct job,', ') distinct_jobs
    FROM dept join emp using(deptno)
    GROUP BY deptno,dname
    ORDER BY deptno;
    
    SELECT 3
    
  2. Create an index on the view. This allows fast queries on a range of total salary:

    CREATE INDEX report_sal_per_dept_sal ON report_sal_per_dept(sal_per_dept desc);
    
    CREATE INDEX
    
  3. You can schedule a daily refresh to recompute the view in the background. Use the REFRESH MATERIALIZED VIEW command to refresh the view:

    REFRESH MATERIALIZED VIEW report_sal_per_dept;
    
    REFRESH MATERIALIZED VIEW
    
  4. Query the materialized view to show the data is consistent as of the last refresh. This lists the departments with a total salary lower than 10000:

    SELECT *
        FROM report_sal_per_dept
        WHERE sal_per_dept<=10000
        ORDER BY sal_per_dept;
    
    deptno |   dname    | sal_per_dept | num_of_employees |       distinct_jobs
    --------+------------+--------------+------------------+---------------------------
        10 | ACCOUNTING |         8750 |                3 | CLERK, MANAGER, PRESIDENT
        30 | SALES      |         9400 |                6 | CLERK, MANAGER, SALESMAN
    (2 rows)
    
  5. The execution plan shows that no additional join or group by is needed when querying the materialized view:

    EXPLAIN ANALYZE
    SELECT *
        FROM report_sal_per_dept
        WHERE sal_per_dept<=10000
        ORDER BY sal_per_dept;
    
                                                                        QUERY PLAN
    --------------------------------------------------------------------------------------------------------------------------------------------------------
    Index Scan Backward using report_sal_per_dept_sal on report_sal_per_dept  (cost=0.00..5.33 rows=10 width=84) (actual time=1.814..1.821 rows=2 loops=1)
    Index Cond: (sal_per_dept <= 10000)
    Planning Time: 0.143 ms
    Execution Time: 1.917 ms
    Peak Memory Usage: 8 kB
    (4 rows)