postgresql-hll extension

The postgresql-hll extension adds a new data type hll, which is a HyperLogLog data structure. HyperLogLog is a fixed-size, set-like structure used for distinct value counting with tunable precision. For example, in 1280 bytes hll can estimate the count of tens of billions of distinct values with only a few percent error.

First, enable the extension:

CREATE EXTENSION "hll";

To run the helloworld example from the postgresql-hll repository, connect using ysqlsh and run the following:

CREATE TABLE helloworld (id integer, set hll);
CREATE TABLE

Insert an empty HLL as follows:

INSERT INTO helloworld(id, set) VALUES (1, hll_empty());
INSERT 0 1

Add a hashed integer to the HLL as follows:

UPDATE helloworld SET set = hll_add(set, hll_hash_integer(12345)) WHERE id = 1;
UPDATE 1

Add a hashed string to the HLL as follows:

UPDATE helloworld SET set = hll_add(set, hll_hash_text('hello world')) WHERE id = 1;
UPDATE 1

Get the cardinality of the HLL as follows:

SELECT hll_cardinality(set) FROM helloworld WHERE id = 1;
 hll_cardinality
-----------------
               2
(1 row)

For a more advanced example, see the Data Warehouse Use Case.