Add pg_deltax #933
pg_deltax is Xata's time-series PostgreSQL extension that adds columnar storage and compression to a partitioned hits table. This PR implements the per-system script interface (install/start/check/stop/load/query/data-size + benchmark.sh shim) so the shared driver in lib/ runs it end-to-end. It also includes a c6a.4xlarge result run using the new methodology (true-cold cycles + concurrent QPS).
The xataio/pg_deltax URL still works via GitHub's 301 redirect, but the canonical repo name is xataio/deltax (matches the link in the PR description). Use the canonical URL directly so the clone does not silently depend on the redirect.
@@ -0,0 +1,8 @@
#!/bin/bash
# Report the test database's on-disk size: tables + indexes + TOAST. Excludes
I'm no Postgres expert, but as I see it, WAL data is conceptually part of the persistent table data. Since the dataset is static in this case, I assume the WAL can be shrunk to zero size with some Postgres administration statement (something like "merge", "consolidate", or whatever Postgres calls compacting the WAL).
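For reference, a minimal sketch of how this assumption could be checked. PostgreSQL has no statement that compacts the WAL to zero. The closest equivalent is `CHECKPOINT`, after which segments that are no longer needed can be recycled or removed, while the server still keeps some segments around (governed by `min_wal_size`). The `PSQL` variable and the two helper names below are hypothetical. They mirror the psql invocation used in this PR, and `pg_ls_waldir()` needs a superuser or a role with `pg_monitor` privileges.

```shell
#!/bin/bash
# Assumption: same psql invocation as the PR's scripts, plus -A for unaligned output.
PSQL=${PSQL:-"sudo -u postgres psql -v ON_ERROR_STOP=1 -t -A test"}

# Total bytes currently held in pg_wal, as reported by the server.
wal_bytes() {
    $PSQL -c "SELECT coalesce(sum(size), 0) FROM pg_ls_waldir()"
}

# Force a checkpoint so that WAL segments that are no longer needed
# can be recycled or removed by the server.
force_checkpoint() {
    $PSQL -c "CHECKPOINT"
}
```

Calling `wal_bytes`, then `force_checkpoint`, then `wal_bytes` again after the load would show how much of pg_wal remains once the static dataset has been checkpointed.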
@@ -0,0 +1,8 @@
#!/bin/bash
# Report the test database's on-disk size: tables + indexes + TOAST. Excludes
# pg_wal (durability metadata that grows with activity rather than dataset
The comment says that postgres-oriole uses the same approach, but that's not true (see postgresql-orioledb/data-size).
#!/bin/bash
# Report the test database's on-disk size: tables + indexes + TOAST. Excludes
# pg_wal (durability metadata that grows with activity rather than dataset
# size) and cluster-wide files (pg_global, pg_xact). Same convention as
Actually, postgresql/data-size measures the on-disk size using shell commands. It might be sensible to do the same here.
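A rough sketch of what a shell-based measurement could look like, not the actual contents of postgresql/data-size. It sums the sizes of regular files under a data directory and can skip one top-level subdirectory such as pg_wal, so both conventions discussed in this thread can be compared. The helper name `data_dir_bytes` is hypothetical, and the sketch assumes GNU find (for `-printf`) and a user with read access to the data directory.

```shell
#!/bin/bash
# Hypothetical helper: sum the sizes (in bytes) of all regular files under a
# directory, skipping one top-level subdirectory (pg_wal by default).
# Requires GNU find for -printf.
data_dir_bytes() {
    local dir="${1%/}" skip="${2:-pg_wal}"
    find "$dir" -path "$dir/$skip" -prune -o -type f -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Note that this counts the whole cluster directory rather than only the test database, and it reports file lengths rather than allocated disk blocks, so the numbers will differ somewhat from a catalog-based query.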
# boundary calculation to the dataset's epoch (the hits data is from 2013).
sudo -u postgres psql -v ON_ERROR_STOP=1 -t test < create.sql
sudo -u postgres psql -v ON_ERROR_STOP=1 -t test -c "SET pg_deltax.mock_now = '2013-07-01 12:00:00'; SELECT deltax.deltax_create_table('hits', 'eventtime', '3 days'::interval, 15)"
sudo -u postgres psql -v ON_ERROR_STOP=1 -t test -c "SELECT deltax.deltax_enable_compression('hits', order_by => ARRAY['counterid', 'userid', 'eventtime'], segment_size => 30000)"
Re deltax_enable_compression: I will not oppose this, but it slightly violates the spirit of the benchmark. The rules say:
It's better to use the default settings and avoid fine-tuning. Configuration changes can be applied if it is considered strictly necessary and documented.
Fine-tuning and optimization for the benchmark are not recommended but allowed. In this case, add results for the vanilla configuration and tuned results separately (e.g. 'MyDatabase' and 'MyDatabase-tuned').
Running without deltax_enable_compression, or making compression deltax's default, would be the preferred option.
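One way to follow both rules would be to keep the two statements separate, so the same load script can produce a vanilla and a tuned result. The sketch below is only an illustration. The `load_statements` helper and its `vanilla`/`tuned` argument are hypothetical and not part of this PR, while the SQL is copied from the load script under review. Whether pg_deltax behaves sensibly without the compression call is not established here.

```shell
#!/bin/bash
# Hypothetical helper: print the SQL statements the load step would run,
# one per line. The "tuned" mode adds the explicit compression settings;
# the default "vanilla" mode leaves them out.
load_statements() {
    local mode="${1:-vanilla}"
    # Copied from the PR's load script.
    echo "SET pg_deltax.mock_now = '2013-07-01 12:00:00'; SELECT deltax.deltax_create_table('hits', 'eventtime', '3 days'::interval, 15)"
    if [ "$mode" = "tuned" ]; then
        # Copied from the PR's load script; only run for the tuned entry.
        echo "SELECT deltax.deltax_enable_compression('hits', order_by => ARRAY['counterid', 'userid', 'eventtime'], segment_size => 30000)"
    fi
}

# Possible usage inside the load script (not executed here):
#   load_statements tuned | while IFS= read -r stmt; do
#       sudo -u postgres psql -v ON_ERROR_STOP=1 -t test -c "$stmt" < /dev/null
#   done
```

The two modes would then map onto two separate result entries, in the style of the 'MyDatabase' and 'MyDatabase-tuned' naming quoted from the rules.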
Hi,
I'd like to add an entry for pg_deltax (https://github.com/xataio/deltax).
It is a time-series extension for PostgreSQL, which we just made open source.
I read the contribution rules and hopefully didn't miss anything important. Please let me know if you have any feedback.