This module contains a Nomad job, ./conf/nomad/presto.hcl, that runs a Presto SQL server.
- Prerequisites
- Compatibility
- Requirements
- Usage
- Example usage
- Inputs
- Outputs
- Secrets & credentials
- Contributors
- License
- References
Please follow this section in the original template.
Software | OSS Version | Enterprise Version |
---|---|---|
Terraform | 0.13.1 or newer | |
Consul | 1.8.3 or newer | 1.8.3 or newer |
Vault | 1.5.2.1 or newer | 1.5.2.1 or newer |
Nomad | 0.12.3 or newer | 0.12.3 or newer |
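The Terraform row of the table above can be enforced in the calling configuration. The following is a minimal sketch, not something shipped with this module; the Consul, Vault and Nomad versions apply to the running servers and cannot be checked this way.

```hcl
terraform {
  # Mirrors the "Terraform | 0.13.1 or newer" row of the compatibility table.
  required_version = ">= 0.13.1"
}
```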
Module | Version |
---|---|
terraform-nomad-hive | 0.3.0 or newer |
terraform-nomad-minio | 0.3.0 or newer |
terraform-nomad-postgres | 0.3.0 or newer |
All software is provided and run with Docker. See the Makefile for inspiration.
If you are using another system such as macOS, you may need to install the following tools in some sections:
The following command will run the example in example/presto_cluster:
make up
The following command will run the example in example/presto_standalone instead:
make up-standalone
For more information, check out the documentation in the presto_cluster README.
Since the services in this module use the sidecar_service, you need to connect to the services using a Consul connect proxy.
The proxy connections are pre-made and defined in the Makefile:
make proxy-hive # to hivemetastore
make proxy-minio # to minio
make proxy-postgres # to postgres
make proxy-presto # to presto
You can now connect to Presto using the Presto CLI with the following command:
make presto-cli # connect to Presto CLI
If you are on a Mac, the proxies and make presto-cli may not work.
Instead, you can install the Consul binary and run the commands in the Makefile manually (without docker run ..).
Further, you need to install the Presto CLI on your local machine or inside the box.
See also required software.
- If you ran the presto_standalone example, you can verify successful deployment with any of the following options.
- If you ran the presto_cluster example, you can only verify with option 1 and option 3.
Option 1: Hive metastore
- Go to http://localhost:4646/ui/exec/hive-metastore
- Choose metastoreserver -> metastoreserver and press Enter.
- Connect using beeline cli:
# from metastore (loopback)
beeline -u jdbc:hive2://
- You can now query existing tables with the beeline CLI:
SHOW DATABASES;
SHOW TABLES IN <database-name>;
DROP DATABASE <database-name>;
SELECT * FROM <table_name>;
# examples
SHOW TABLES;
SELECT * FROM iris;
SELECT * FROM tweets;
Option 2: Presto CLI inside the Nomad task
⚠️ Only works with the presto_standalone example.
- Go to http://localhost:4646/ui/exec/presto
- Choose standalone -> server and press Enter.
- Connect using the Presto-cli:
presto
- You can now query existing tables with the Presto-cli:
SHOW CATALOGS [ LIKE pattern ]
SHOW SCHEMAS [ FROM catalog ] [ LIKE pattern ]
SHOW TABLES [ FROM schema ] [ LIKE pattern ]
# examples
SHOW CATALOGS;
SHOW SCHEMAS IN hive;
SHOW TABLES IN hive.default;
SELECT * FROM hive.default.iris;
Option 3: local Presto CLI
ℹ️ Check the required software section first.
The following command starts two Docker containers with the flag --network=host, which runs natively on Linux.
Note that Docker on macOS runs in a virtual machine. In that case, use a local consul binary to start the proxy and, in another terminal, a local presto CLI binary to connect.
In a terminal, run a proxy and a Presto CLI session:
make presto-cli
You can now query tables (3 tables should be available):
show tables;
select * from <table>;
To debug or continue developing, you can use the Presto CLI locally. Some useful commands:
# manual table creation for different file types
presto --server localhost:8080 --catalog hive --schema default --user presto --file ./example/resources/query/csv_create_table.sql
presto --server localhost:8080 --catalog hive --schema default --user presto --file ./example/resources/query/json_create_table.sql
presto --server localhost:8080 --catalog hive --schema default --user presto --file ./example/resources/query/flattenedjson_json.sql
presto --server localhost:8080 --catalog hive --schema default --user presto --file ./example/resources/query/avro_tweets_create_table.sql
This module uses the following providers:
The following intentions are required. In the examples, intentions are created in the Ansible playbook 01_create_intetion.yml:
Intention between | type |
---|---|
presto-local => presto | allow |
minio-local => minio | allow |
presto => hive-metastore | allow |
presto-sidecar-proxy => hive-metastore | allow |
presto-sidecar-proxy => minio | allow |
⚠️ Note that these intentions need to be created if you are using this module from another module.
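When the module is consumed from another Terraform module, the intentions in the table above could be declared in Terraform rather than Ansible. The following is a sketch only: it assumes the hashicorp/consul provider is configured in the calling module, and the local name presto_intentions and the resource label presto are illustrative, not part of this module.

```hcl
# Assumption: the hashicorp/consul provider is configured by the caller.
locals {
  # One entry per row of the intentions table (source => destination).
  presto_intentions = {
    presto_local_to_presto  = { source = "presto-local", destination = "presto" }
    minio_local_to_minio    = { source = "minio-local", destination = "minio" }
    presto_to_hive          = { source = "presto", destination = "hive-metastore" }
    presto_sidecar_to_hive  = { source = "presto-sidecar-proxy", destination = "hive-metastore" }
    presto_sidecar_to_minio = { source = "presto-sidecar-proxy", destination = "minio" }
  }
}

resource "consul_intention" "presto" {
  for_each = local.presto_intentions

  source_name      = each.value.source
  destination_name = each.value.destination
  action           = "allow"
}
```

Keeping the pairs in a single map makes it easy to compare the Terraform configuration against the table row by row.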
The following code is an example of the Presto module in cluster mode.
For detailed information, check the example/presto_cluster or the example/presto_standalone directory.
module "presto" {
  source = "github.com/fredrikhgrelland/terraform-nomad-presto.git?ref=0.3.0"

  depends_on = [
    module.minio,
    module.hive
  ]

  # nomad
  nomad_job_name    = "presto"
  nomad_datacenters = ["dc1"]
  nomad_namespace   = "default"

  # Vault provided credentials
  vault_secret = {
    use_vault_provider       = true
    vault_kv_policy_name     = "kv-secret"
    vault_kv_path            = "secret/data/dev/presto"
    vault_kv_secret_key_name = "cluster_shared_secret"
  }

  service_name     = "presto"
  mode             = "cluster"
  workers          = 1
  consul_http_addr = "http://10.0.3.10:8500"
  debug            = true
  use_canary       = true
  hive_config_properties = [
    "hive.allow-drop-table=true",
    "hive.allow-rename-table=true",
    "hive.allow-add-column=true",
    "hive.allow-drop-column=true",
    "hive.allow-rename-column=true",
    "hive.compression-codec=ZSTD"
  ]

  # other
  hivemetastore_service = {
    service_name = module.hive.service_name
    port         = module.hive.port
  }

  minio_service = {
    service_name = module.minio.minio_service_name
    port         = module.minio.minio_port
    access_key   = ""
    secret_key   = ""
  }

  # Vault provided credentials
  minio_vault_secret = {
    use_vault_provider       = true
    vault_kv_policy_name     = "kv-secret"
    vault_kv_path            = "secret/data/dev/minio"
    vault_kv_access_key_name = "access_key"
    vault_kv_secret_key_name = "secret_key"
  }
}
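For comparison, a standalone deployment might look like the sketch below. It is assembled from the inputs documented in this README and is not a copy of example/presto_standalone; whether every object key must be present in standalone mode is an assumption, and the MinIO access and secret keys are placeholders.

```hcl
module "presto" {
  source = "github.com/fredrikhgrelland/terraform-nomad-presto.git?ref=0.3.0"

  nomad_job_name    = "presto"
  nomad_datacenters = ["dc1"]
  nomad_namespace   = "default"

  service_name = "presto"
  mode         = "standalone"

  # No cluster shared secret is fetched from Vault in this sketch.
  vault_secret = {
    use_vault_provider       = false
    vault_kv_policy_name     = ""
    vault_kv_path            = ""
    vault_kv_secret_key_name = ""
  }

  hivemetastore_service = {
    service_name = module.hive.service_name
    port         = module.hive.port
  }

  # Placeholder credentials passed directly instead of via Vault.
  minio_service = {
    service_name = module.minio.minio_service_name
    port         = module.minio.minio_port
    access_key   = "example-access-key"
    secret_key   = "example-secret-key"
  }

  minio_vault_secret = {
    use_vault_provider       = false
    vault_kv_policy_name     = ""
    vault_kv_path            = ""
    vault_kv_access_key_name = ""
    vault_kv_secret_key_name = ""
  }
}
```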
Name | Description | Type | Default | Required |
---|---|---|---|---|
nomad_provider_address | Nomad provider address | string | "http://127.0.0.1:4646" | yes |
nomad_datacenters | Nomad data centers | list(string) | ["dc1"] | yes |
nomad_namespace | [Enterprise] Nomad namespace | string | "default" | yes |
nomad_job_name | Nomad job name | string | "presto" | yes |
mode | Switch for Nomad jobs to use cluster or standalone deployment | string | "standalone" | no |
shared_secret_user | Shared secret provided by user (length must be >= 12) | string | "asdasdsadafdsa" | no |
vault_secret | Set of properties to be able to fetch the shared cluster secret from Vault | object(bool, string, string, string) | { use_vault_provider = true, vault_kv_policy_name = "kv-secret", vault_kv_path = "secret/data/dev/presto", vault_kv_secret_key_name = "cluster_shared_secret" } | no |
service_name | Presto service name | string | "presto" | yes |
resource | Resource allocation for Presto nodes (cpu & memory) | object(number, number) | { cpu = 500, memory = 1024 } | no |
resource_proxy | Resource allocation for proxy (cpu & memory) | object(number, number) | { cpu = 200, memory = 128 } | no |
port | Presto http port | number | 8080 | yes |
docker_image | Presto docker image | string | "prestosql/presto:341" | yes |
local_docker_image | Switch for Nomad jobs to use artifact for image lookup | bool | false | no |
container_environment_variables | Presto environment variables | list(string) | [""] | no |
hive_config_properties | Custom hive configuration properties | list(string) | [""] | no |
workers | cluster: Number of Nomad worker nodes | number | 1 | no |
coordinator | Include a coordinator in addition to the workers. Set this to false when extending an existing cluster | bool | true | no |
use_canary | Uses canary deployment for Presto | bool | false | no |
consul_connect_plugin | Deploy Consul connect plugin for presto | bool | true | no |
consul_connect_plugin_version | Version of the Consul connect plugin for presto (on maven central) src here: https://github.com/gugalnikov/presto-consul-connect | string | "2.2.0" | no |
consul_connect_plugin_artifact_source | Artifact URI source | string | "https://oss.sonatype.org/service/local/repositories/releases/content/io/github/gugalnikov/presto-consul-connect" | no |
debug | Turn on debug logging in presto nodes | bool | false | no |
hivemetastore_service.service_name | Hive metastore service name | string | "hive-metastore" | yes |
hivemetastore_service.port | Hive metastore port | number | 9083 | yes |
minio_service | Minio data-object contains service_name, port, access_key and secret_key | object(string, number, string, string) | - | no |
minio_vault_secret | Minio data-object contains vault related information to fetch credentials | object(bool, string, string, string, string) | { use_vault_provider = false, vault_kv_policy_name = "kv-secret", vault_kv_path = "secret/data/dev/minio", vault_kv_access_key_name = "access_key", vault_kv_secret_key_name = "secret_key" } | no |
Name | Description | Type |
---|---|---|
presto_service_name | Presto service name | string |
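A calling module can re-export or reuse this output. The following is a small sketch, assuming the module was instantiated as module "presto" as in the example usage; the output name on the caller side is arbitrary.

```hcl
# Re-export the Presto service name from the calling module.
output "presto_service_name" {
  description = "Consul service name of the deployed Presto job"
  value       = module.presto.presto_service_name
}
```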
When using mode = "cluster", you can provide the shared secret in two ways: set it manually or upload it to Vault.
To set the credentials manually, you first need to tell the module not to fetch credentials from Vault. To do that, set vault_secret.use_vault_provider to false (see below for example).
The module will then use the variable shared_secret_user to set the Presto credentials. This defaults to defaultprestosecret if not set by the user.
Below is an example of how to disable the use of Vault credentials and set your own credentials.
module "presto" {
  ...
  vault_secret = {
    use_vault_provider       = false,
    vault_kv_policy_name     = "",
    vault_kv_path            = "",
    vault_kv_secret_key_name = "",
  }
  shared_secret_user = "my-secret-key" # default 'defaultprestosecret'
}
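Instead of a hard-coded string, the manual secret could be generated by Terraform. The fragment below is a sketch under assumptions: it uses the random_password resource from the hashicorp/random provider, which this module does not require, and the resource label presto_shared_secret is illustrative. The source and remaining module arguments are omitted for brevity, and the generated value is stored in the Terraform state.

```hcl
# Assumption: the hashicorp/random provider is available to the caller.
resource "random_password" "presto_shared_secret" {
  length  = 32 # comfortably above the documented minimum of 12
  special = false
}

module "presto" {
  # source and remaining arguments as in the example usage

  vault_secret = {
    use_vault_provider       = false
    vault_kv_policy_name     = ""
    vault_kv_path            = ""
    vault_kv_secret_key_name = ""
  }

  shared_secret_user = random_password.presto_shared_secret.result
}
```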
By default, use_vault_provider is set to true.
However, when testing using the box (e.g. make dev), the Presto secret is randomly generated and put in secret/dev/presto inside Vault by the 01_generate_secrets_vault.yml playbook.
This is an independent process and will run regardless of whether vault_secret.use_vault_provider is false or true.
If you want to use the automatically generated credentials in the box, you can do so by changing the vault_secret object as seen below:
module "presto" {
  ...
  vault_secret = {
    use_vault_provider       = true
    vault_kv_policy_name     = "kv-secret"
    vault_kv_path            = "secret/data/dev/presto"
    vault_kv_secret_key_name = "cluster_shared_secret"
  }
}
If you want to change the secrets path and keys/values in Vault to match your own configuration, you need to change the variables in the vault_secret object.
Say that you have put your secrets in secret/services/presto/users and changed the key to my_presto_secret_name.
You must have a Vault policy named kv-users-secret with at least read access to the path secret/services/presto/users.
Then you need the following configuration:
module "presto" {
  ...
  vault_secret = {
    use_vault_provider       = true,
    vault_kv_policy_name     = "kv-users-secret"
    vault_kv_path            = "secret/data/services/presto/users",
    vault_kv_secret_key_name = "my_presto_secret_name"
  }
}
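The kv-users-secret policy mentioned above might look like the following sketch. It assumes a KV version 2 secrets engine mounted at secret/, which is why the policy path contains data/ and matches the vault_kv_path value; the policy is not shipped with this module.

```hcl
# kv-users-secret.hcl: read-only access to the Presto secret.
# Assumption: KV v2 engine mounted at "secret/".
path "secret/data/services/presto/users" {
  capabilities = ["read"]
}
```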
This work is licensed under the Apache 2 License. See LICENSE for full details.
- Blog post
- Presto, so far (release 340), supports only varchar columns