Tips

Install citus on single machine with docker

# Download docker-compose file
$ curl -L https://raw.githubusercontent.com/citusdata/docker/master/docker-compose.yml > docker-compose.yml

# Run docker-compose file
$ COMPOSE_PROJECT_NAME=citus docker-compose up -d

    citus_manager    python -u ./manager.py          Up
    citus_master     docker-entrypoint.sh postgres   Up      0.0.0.0:5432->5432/tcp
    citus_worker_1   docker-entrypoint.sh postgres   Up      5432/tcp

# Verify installation
$ docker exec -it citus_master psql -U postgres
postgres=# SELECT * FROM master_get_active_worker_nodes();

   node_name    | node_port
----------------+-----------
 citus_worker_1 |      5432
(1 row)

# Shutdown
$ COMPOSE_PROJECT_NAME=citus docker-compose down -v

https://docs.citusdata.com/en/v8.1/installation/single_machine_docker.html

Install citus on a single machine on ubuntu

# Add Citus repository for package manager
$ curl https://install.citusdata.com/community/deb.sh | sudo bash

# install the server and initialize db
$ sudo apt-get -y install postgresql-11-citus-8.1

# this user has access to sockets in /var/run/postgresql
$ sudo su - postgres
# include path to postgres binaries
$ export PATH=$PATH:/usr/lib/postgresql/11/bin

$ cd ~
$ mkdir -p citus/coordinator citus/worker1 citus/worker2

# create three normal postgres instances
$ initdb -D citus/coordinator
$ initdb -D citus/worker1
$ initdb -D citus/worker2

# Add citus extension to postgres config file
$ echo "shared_preload_libraries = 'citus'" >> citus/coordinator/postgresql.conf
$ echo "shared_preload_libraries = 'citus'" >> citus/worker1/postgresql.conf
$ echo "shared_preload_libraries = 'citus'" >> citus/worker2/postgresql.conf

# Start db
$ pg_ctl -D citus/coordinator -o "-p 9700" -l coordinator_logfile start
$ pg_ctl -D citus/worker1 -o "-p 9701" -l worker1_logfile start
$ pg_ctl -D citus/worker2 -o "-p 9702" -l worker2_logfile start

# Add citus extension
$ psql -p 9700 -c "CREATE EXTENSION citus;"
$ psql -p 9701 -c "CREATE EXTENSION citus;"
$ psql -p 9702 -c "CREATE EXTENSION citus;"

# Register workers on coordinator
$ psql -p 9700 -c "SELECT * from master_add_node('localhost', 9701);"
$ psql -p 9700 -c "SELECT * from master_add_node('localhost', 9702);"

# Verify installation
$ psql -p 9700 -c "select * from master_get_active_worker_nodes();"

     node_name | node_port
    -----------+-----------
     localhost |      9701
     localhost |      9702
    (2 rows)

$ psql -p 9700 -c "SELECT * from pg_dist_node;"

     nodeid | groupid | nodename  | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster
    --------+---------+-----------+----------+----------+-------------+----------+----------+-------------
          1 |       1 | localhost |     9701 | default  | f           | t        | primary  | default
          2 |       2 | localhost |     9702 | default  | f           | t        | primary  | default
    (2 rows)


# stop db
$ pg_ctl -D citus/worker2  stop
$ pg_ctl -D citus/worker1  stop
$ pg_ctl -D citus/coordinator stop

https://docs.citusdata.com/en/v8.1/installation/single_machine_debian.html

http://docs.citusdata.com/en/v8.0/develop/api_udf.html#master-add-node

http://docs.citusdata.com/en/v8.0/develop/api_udf.html#master-get-active-worker-nodes

Install citus on multi-machine cluster on Ubuntu

For both coordinator and workers:

# Add Citus repository for package manager
$ curl https://install.citusdata.com/community/deb.sh | sudo bash

# install the server and initialize db
$ sudo apt-get -y install postgresql-11-citus-8.1

# preload citus extension
$ sudo pg_conftool 11 main set shared_preload_libraries citus

$ sudo pg_conftool 11 main set listen_addresses '*'

$ sudo vi /etc/postgresql/11/main/pg_hba.conf

    # Allow unrestricted access to nodes in the local network. The following ranges
    # correspond to 24, 20, and 16-bit blocks in Private IPv4 address spaces.
    host    all             all             10.0.0.0/8              trust

    # Also allow the host unrestricted access to connect to itself
    host    all             all             127.0.0.1/32            trust
    host    all             all             ::1/128                 trust


# start the db server
$ sudo service postgresql restart
# and make it start automatically when computer does
$ sudo update-rc.d postgresql enable

# add the citus extension
$ sudo -i -u postgres psql -c "CREATE EXTENSION citus;"

Only on coordinator:

# Add workers to dns
$ sudo vim /etc/hosts

    192.168.0.131 w1
    192.168.0.132 w2

# Register workers on coordinator
$ sudo -i -u postgres psql -c "SELECT * from master_add_node('w1', 5432);"
$ sudo -i -u postgres psql -c "SELECT * from master_add_node('w2', 5432);"


# Verify installation
$ sudo -i -u postgres psql -c "SELECT * FROM master_get_active_worker_nodes();"

     node_name | node_port
    -----------+-----------
     w1        |      5432
     w2        |      5432
    (2 rows)

# Ready to use
$ sudo -i -u postgres psql

https://docs.citusdata.com/en/v8.1/installation/multi_machine_debian.html

Have a unique constraint on one field of table

https://docs.citusdata.com/en/v8.1/faq/faq.html#can-i-create-primary-keys-on-distributed-tables

https://stackoverflow.com/a/43660911

Limitation of Citus Community

Re balance, Replicate, Isolate

These three important functions are not available:

  • rebalance_table_shards

  • replicate_table_shards

  • isolate_tenant_to_new_shard

When you add and register new node, you can not balance current existing filled data to this new empty node.

Tenant isolation is not available.

https://github.com/citusdata/citus/issues/828

https://docs.citusdata.com/en/v8.1/admin_guide/cluster_management.html#tenant-isolation

Adding a coordinator

Users can send their queries to any coordinator and scale out performance. If your setup requires you to use multiple coordinators, please contact us.

https://docs.citusdata.com/en/v8.1/admin_guide/cluster_management.html#adding-a-coordinator

Worker Node Failures

Citus supports two modes of replication

  1. PostgreSQL streaming replication.

  2. Citus shard replication.

Only the second one is available, suited for an append-only workload. and setting needs to be done before distributing data to the cluster.

https://docs.citusdata.com/en/v8.1/admin_guide/cluster_management.html#worker-node-failures

Django

https://docs.citusdata.com/en/v8.1/develop/migration_mt_django.html#django-migration

https://github.com/omidraha/citus-exp