Taking a Closer Look at ClickHouse® Live View Tables

9 min readDec 5, 2019

by Vitaliy Zakaznikov

In a couple of previous blog posts, I have introduced Live Views tables and covered basic usage. Now, in this post, we will take a closer look at Live View tables. Specifically, we will look at the options available for the WATCH query, then introduce temporary Live Views, as well as look at the new JSONEachRowWithProgress format.

Setting Things Up

As before we will be using the 19.16.3 release and will use two instances of clickhouse-client. Don't forget that we will need to enable the
allow_experimental_live_view setting in both sessions.

ClickHouse client version 19.16.3.6 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 19.16.3 revision 54427.:) SET allow_experimental_live_view = 1

To keep the setup to the minimum, we will again use a very simple source table

CREATE TABLE myints (a Int32) Engine=Memory

and pre-populate it with some initial data

INSERT INTO myints VALUES (1),(2),(3)

and use it in a simple Live View table.

CREATE LIVE VIEW lv AS SELECT sum(a) FROM myints

`WATCH` Query Options

The WATCH query enables us to watch a live stream of query results. Without any options, the query returns the current query result and infinitely waits for changes to the query result.

:) WATCH lvWATCH lv┌─sum(a)─┬─_version─┐
│      6 │        1 │
└────────┴──────────┘
↙ Progress: 1.00 rows, 16.00 B (8.93 rows/s., 142.95 B/s.)

To abort the WATCH query you can use Ctrl-C.

`WATCH` QUERY `LIMIT`

However, an infinite stream of query result updates in not always necessary. Sometimes you just want to get a fixed number of changes, one of the most common cases being just the need to get the next update. To control the number of query result updates that you would like to receive you can use the LIMIT option where the value of the LIMIT can be 0 to N. The value of 0 indicates that the WATCH query should not wait for any new query results and therefore will return immediately once query is evaluated. One way to think of LIMIT 0 is as if you are setting the non-blocking flag.

Let’s give it a try and see that the WATCH query with the LIMIT 0 returns immediately.

:) WATCH lv LIMIT 0WATCH lv
LIMIT 0┌─sum(a)─┬─_version─┐
│      6 │        1 │
└────────┴──────────┘1 rows in set. Elapsed: 0.002 sec. user-node :)

However, using the WATCH query with the LIMIT set to 0 is not very useful as you might as well use the SELECT instead.

The most common value of LIMIT is just 1. In many use cases you do not want to receive an infinite number of query result changes but only the next one. With the LIMIT set to 1 you will receive the current query result and will be notified only once when the query result changes.

Again, let’s give a try. Execute WATCH query in one client and insert one more row into our source table for the WATCH query to complete.

Once you have the WATCH query running you can insert more data as follows.

INSERT INTO myints VALUES (4)

The client that has the WATCH query running should show the following.

:) WATCH lv LIMIT 1WATCH lv
LIMIT 1┌─sum(a)─┬─_version─┐
│      6 │        1 │
└────────┴──────────┘
┌─sum(a)─┬─_version─┐
│     10 │        2 │
└────────┴──────────┘2 rows in set. Elapsed: 20.346 sec.

As you can see, the WATCH query only provided one update as expected. You might ask why is LIMIT 1 very common? The answer is simple: it allows you to get notified the next time the query result has changed. One such use case is watching a table that stores log messages. In this case, we can use count() for our Live View. A separate SELECT query can be issued once we detect that the count() has changed, indicating that there are new log messages available to read from our source table.

`WATCH` QUERY `EVENTS`

A short form of the WATCH query can be constructed if you specify EVENTS after the table name. In this case, instead of the query result, you will just get the latest query result version.

Note that the stored query will still be evaluated every time source table data changes.

Let’s try it and confirm that with the EVENTS we only get the current query result version without the query result itself.

:) WATCH lv EVENTS LIMIT 0WATCH lv EVENTS
LIMIT 0┌─version─┐
│       2 │
└─────────┘

The EVENTS is useful when you only want to get short notifications that you can use as triggers about changes to the query result without looking at the actual result.

Temporary Live Views

In some use cases, for example in a dynamic web application, Live Views can be created dynamically to display specific live data to the user. However, it is common in such cases that the application can exit unexpectedly without cleaning up its Live Views. For such cases, a TEMPORARY Live View can be used.

A TEMPORARY Live View can be created using the CREATE TEMPORARY LIVE VIEW statement. This statement will create a Live View which will be automatically deleted when no users are watching the table after the timeout specified by the temporary_live_view_timeout setting. By default the value of temporary_live_view_timeout setting is set to five seconds.

It is important to note that compared to a regular temporary table, a TEMPORARY Live View table lives outside the scope of a single session. This is done because sometimes clients that create Live Views can disappear and reappear due to various factors such as unstable network connection but as long as the client comes back within the timeout period the Live View table does not have to be recreated.

Because a TEMPORARY Live View table lives outside the scope of a single session, it means that multiple clients can use the same temporary Live View. The Live View table will automatically be deleted after the timeout once all clients stop watching the table.

Let’s see TEMPORARY Live View in action by creating one

CREATE TEMPORARY LIVE VIEW temp_lv AS SELECT sum(a) FROM myints

and immediately running WATCH temp_lv.

:) WATCH temp_lvWATCH temp_lv┌─sum(a)─┬─_version─┐
│     10 │        1 │
└────────┴──────────┘

Note that the table will be deleted in five seconds so you need to be quick.

Now, abort the WATCH query using Ctrl-C and try to re-execute WATCH query after five seconds. You should see the following error.

:) WATCH temp_lvWATCH temp_lvReceived exception from server (version 19.16.3):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table default.temp_lv doesn't exist..

As you can see, the Live View table has been deleted because the timeout after the last active user has elapsed. The fact that temporary Live Views are automatically deleted when no more clients are watching it helps with reducing resource consumption on the server. Remember, Live Views store their partial results in memory to always have the latest result ready for any client who might ask for it but it does come at a cost. First, similarly to the Materialized Views, Live View’s stored query is executed on each insert block and therefore inserts into the source table are blocked until processing completes. Second, mergeable blocks for the query result are kept in memory and therefore each active Live View will consume memory. The amount of memory depends on the query result. Therefore, keeping the number of Live Views to a minimum will help speed up inserts and will reduce memory consumption on the server.

NAMING TEMPORARY LIVE VIEWS

In a use case where temporary Live Views are created dynamically and are shared between different clients the question of naming temporary Live Views can arise. The recommended solution to this problem is to use the SHA1 of the query that is used to create the Live View table as a part of the table name.

For example, a temporary Live View for the SELECT 1 query can be created using the lv_<SHA1> convention with the addition of the IF NOT EXISTS clause.

CREATE TEMPORARY LIVE VIEW IF NOT EXISTS lv_42364a017b73ef516a0eca9827e6fa00623257ee AS SELECT 1

Using this convention will allow other clients to easily reuse this Live View or create a new one if it does not already exist.

Using Live Views Over HTTP Interface

The most common way that clients connect to Live View tables is via an HTTP interface. The HTTP interface provides ease of use and portability across many systems and programming languages and it is especially useful for web applications.

Let’s see how we can use an HTTP interface to watch our previously created lv table using the curl utility. Note, that for the WATCH query to work over an HTTP interface we must not forget to set the allow_experimental_live_view setting for the connection.

echo 'WATCH lv LIMIT 0' | curl 'http://localhost:8123/?allow_experimental_live_view=1' --data-binary @-
10    2

Now, let’s re-run the same query but with the LIMIT set to 1.

echo 'WATCH lv LIMIT 1' | curl 'http://localhost:8123/?allow_experimental_live_view=1' --data-binary @-
10    2

As you can see the connection remains open and curl does not return. Instead, the connection will block until the query result changes. So let's insert another row into our source table so that a new query result is pushed to our HTTP client.

INSERT INTO myints VALUES (5)

You should observe that your curl command exits and the result should be as below.

$ echo 'WATCH lv LIMIT 1' | curl 'http://localhost:8123/?allow_experimental_live_view=1' --data-binary @-
10    2
15    3
$

Wait, did you see that? ClickHouse has just pushed a notification to the outside world! No longer do you have to re-send the same query over and over again to your database server and bother it with unnecessary requests. With Live View tables, ClickHouse can now notify you that query result has changed and provide you with the new result at the same time.

Keeping Long-Lived Connections Alive

Because watching Live View tables requires a long-lived connection, Live View tables support heartbeats to keep the connection alive. The interval between heartbeats is set using the live_view_heartbeat_interval setting. By default, the live_view_heartbeat_interval is set to fifteen seconds. Every heartbeat interval the WATCH query will provide a progress notification that you can observe when using clickhouse-client. Let's start a WATCH query again in the clickhouse-client and this time we will pay attention to the progress notifications that are displayed below the query result.

:) WATCH lvWATCH lv┌─sum(a)─┬─_version─┐
│     15 │        3 │
└────────┴──────────┘
↗ Progress: 1.00 rows, 16.00 B (9.70 rows/s., 155.27 B/s.)

Do you see that arrow ↗ before the Progress message? Pay attention and you should see that the arrow keeps turning once in a while. How often does it turn? You guessed it, every live_view_heartbeat_interval seconds or whenever a new query result is returned. These periodic progress events are what the Live View table uses to keep the long-lived connection alive.

New `JSONEachRowWithProgress` Format

When Live View tables were added to ClickHouse a new output format was added as well. The new format is JSONEachRowWithProgress and is used to facilitate using Live View tables over the HTTP interface using the HTTP long polling technique.

Let’s again use curl command to watch our Live View. This time lets use the JSONEachRowWithProgress format.

First, execute the WATCH query with the LIMIT 0.

$ echo 'WATCH lv LIMIT 0 FORMAT JSONEachRowWithProgress' | curl 'http://localhost:8123/?allow_experimental_live_view=1' --data-binary @-
{"progress":{"read_rows":"1","read_bytes":"16","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}}
{"row":{"sum(a)":"15","_version":"3"}}
{"progress":{"read_rows":"1","read_bytes":"16","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}}
$

Now re-execute the same query but using the old JSONEachRow format so that we can compare them.

$ echo 'WATCH lv LIMIT 0 FORMAT JSONEachRow' | curl 'http://localhost:8123/?allow_experimental_live_view=1' --data-binary @-
{"sum(a)":"15","_version":"3"}
$

If you compare the two outputs there should be no surprises. The JSONEachRowWithProgress differ from the old JSONEachRow by having extra progress messages added to the output. But why do we need this new output format? The answer is simple, to keep the long-lived HTTP connections alive.

Let’s see how it works when our WATCH query is blocked waiting for new query results. Now we will use the LIMIT 1 to block and wait for the new result.

We should see the following

$ echo 'WATCH lv LIMIT 1 FORMAT JSONEachRowWithProgress' | curl 'http://localhost:8123/?allow_experimental_live_view=1' --data-binary @-
{"progress":{"read_rows":"1","read_bytes":"16","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}}
{"row":{"sum(a)":"15","_version":"3"}}
{"progress":{"read_rows":"1","read_bytes":"16","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}}

and then every live_view_heartbeat_interval seconds we should see progress messages added to the output.

{"progress":{"read_rows":"1","read_bytes":"16","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}}
{"progress":{"read_rows":"1","read_bytes":"16","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}}
...

These progress messages are what keeps the long-lived HTTP connection alive until the query result changes. Without them, the HTTP connection would just timeout.

Conclusion

We have taken a closer look at Live View tables. We looked at the WATCH query options, introduced temporary Live Views, and looked at the new JSONEachRowWithProgress format. All these pieces should enable you to start putting Live View tables into practice. As always until the next time, happy viewing using Live View tables!

Originally published at https://www.altinity.com on December 5, 2019.

Taking a Closer Look at ClickHouse® Live View Tables

Setting Things Up

`WATCH` Query Options

`WATCH` QUERY `LIMIT`

`WATCH` QUERY `EVENTS`

Temporary Live Views

NAMING TEMPORARY LIVE VIEWS

Using Live Views Over HTTP Interface

Keeping Long-Lived Connections Alive

New `JSONEachRowWithProgress` Format

Conclusion

Written by AltinityDB

No responses yet

Taking a Closer Look at ClickHouse® Live View Tables

Setting Things Up

WATCH Query Options

WATCH QUERY LIMIT

WATCH QUERY EVENTS

Temporary Live Views

NAMING TEMPORARY LIVE VIEWS

Using Live Views Over HTTP Interface

Keeping Long-Lived Connections Alive

New JSONEachRowWithProgress Format

Conclusion

Written by AltinityDB

No responses yet

`WATCH` Query Options

`WATCH` QUERY `LIMIT`

`WATCH` QUERY `EVENTS`

New `JSONEachRowWithProgress` Format