Reference Architecture

A recommended setup consists of:

Load Balancers

Hue is often run with:

Task Server

**Not fully supported yet**

The task server is a work in progress. Its goal is to move all blocking or resource-intensive operations out of the API server. Follow HUE-8738 for information on when the first usable task will be released.

Until then, here is how to try the task server service.

Make sure you have RabbitMQ installed and running.

sudo apt-get install rabbitmq-server -y
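
Before starting the worker, it can help to confirm that the broker is accepting connections. The sketch below is an illustration, not part of Hue: it assumes RabbitMQ listens on its default AMQP port 5672 on localhost, and the helper name broker_reachable is hypothetical.

```python
import socket


def broker_reachable(host="localhost", port=5672, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    5672 is RabbitMQ's default AMQP port; adjust it if your broker differs.
    This only checks that something is listening, not that it speaks AMQP.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, host unreachable, or timeout.
        return False
```

Calling broker_reachable() with no arguments checks the local default port. A False result usually means the rabbitmq-server service is not started yet.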

In hue.ini, tell the API server that the Task Server is available:

[desktop]
[[task_server]]
enabled=true

Start the Task Server:

./build/env/bin/celery -A desktop worker -l info

Available tasks

Query Task

When the task server is enabled, SQL queries are submitted outside of the Hue API server.

To configure where the query results are persisted, edit the result_file_storage setting:

[desktop]
[[task_server]]
result_file_storage='{"backend": "django.core.files.storage.FileSystemStorage", "properties": {"location": "/var/lib/hue/query-results"}}'
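
The value of result_file_storage is a JSON document wrapped in single quotes, so quoting mistakes are easy to make. As a minimal sketch, the line can be generated and checked with Python's json module. The helper name storage_setting is hypothetical, and how Hue itself parses the value is not shown here.

```python
import json


def storage_setting(backend, **properties):
    """Build the single-quoted JSON value for result_file_storage."""
    value = json.dumps({"backend": backend, "properties": properties})
    return "result_file_storage='%s'" % value


line = storage_setting(
    "django.core.files.storage.FileSystemStorage",
    location="/var/lib/hue/query-results",
)
print(line)

# Round-trip check: strip the key and the quotes, then parse the JSON again.
parsed = json.loads(line.split("=", 1)[1].strip("'"))
assert parsed["backend"] == "django.core.files.storage.FileSystemStorage"
assert parsed["properties"]["location"] == "/var/lib/hue/query-results"
```

The sketch does not escape single quotes, so a property value that contains one would need separate handling.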

Task Scheduler

For schedules configured statically in Python:

./build/env/bin/celery -A desktop beat -l info
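
For reference, a statically configured schedule in Celery is a mapping from an entry name to a task and an interval. The sketch below shows that general shape only. The entry name, the task path desktop.tasks.example_cleanup, and the interval are hypothetical and not taken from Hue's code.

```python
from datetime import timedelta

# General shape of a Celery beat schedule. In a Django project this mapping is
# often exposed as the CELERY_BEAT_SCHEDULE setting (an assumption about where
# it would live; the task path below is a hypothetical placeholder).
CELERY_BEAT_SCHEDULE = {
    "example-cleanup-every-5-minutes": {
        "task": "desktop.tasks.example_cleanup",  # hypothetical task name
        "schedule": timedelta(minutes=5),  # interval between runs
    },
}

for name, entry in CELERY_BEAT_SCHEDULE.items():
    seconds = entry["schedule"].total_seconds()
    print(name, "->", entry["task"], "every", seconds, "seconds")
```

Because these entries live in code, changing them requires restarting the beat process, which is the main difference from the database-backed scheduler.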

For schedules configured dynamically via a table with Django Celery Beat:

[desktop]
[[task_server]]
beat_enabled=false

Then:

./build/env/bin/celery -A desktop beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler

Note: before the first run, create the tables with:

./build/env/bin/hue migrate

Monitoring

A GET request to /desktop/debug/is_alive returns a 200 response if the server is running.
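
A monitoring probe for this endpoint could be sketched as follows, using only the Python standard library. The function name is_alive and the opener parameter (which makes the function easy to test) are hypothetical, and the base URL in the usage note is an assumption about your deployment.

```python
import urllib.error
import urllib.request


def is_alive(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/desktop/debug/is_alive answers 200."""
    url = base_url.rstrip("/") + "/desktop/debug/is_alive"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

For example, is_alive("http://localhost:8888") would probe a server assumed to listen on port 8888. Any failure is reported as False rather than raised, which suits a load balancer or cron-style check.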

Proxy

A web proxy centralizes all access behind a single URL and lets you prettify the address (e.g. ec2-54-247-321-151.compute-1.amazonaws.com -> demo.gethue.com).

Here is one way to do it with NGINX or Apache.