apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine

Home Page: https://datafusion.apache.org/ballista

Bug with initialising ClusterStorageConfig

BokarevNik opened this issue · comments

Describe the bug
Currently, cluster_storage is always set to ClusterStorageConfig::Memory when the scheduler config is built, so the etcd and sled options are ignored:

    let config = SchedulerConfig {
        namespace: opt.namespace,
        external_host: opt.external_host,
        bind_port: opt.bind_port,
        scheduling_policy: opt.scheduler_policy,
        event_loop_buffer_size: opt.event_loop_buffer_size,
        task_distribution: opt.task_distribution,
        finished_job_data_clean_up_interval_seconds: opt
            .finished_job_data_clean_up_interval_seconds,
        finished_job_state_clean_up_interval_seconds: opt
            .finished_job_state_clean_up_interval_seconds,
        advertise_flight_sql_endpoint: opt.advertise_flight_sql_endpoint,
        cluster_storage: ClusterStorageConfig::Memory,
        job_resubmit_interval_ms: (opt.job_resubmit_interval_ms > 0)
            .then_some(opt.job_resubmit_interval_ms),
        executor_termination_grace_period: opt.executor_termination_grace_period,
        scheduler_event_expected_processing_duration: opt
            .scheduler_event_expected_processing_duration,
    };

    let cluster = BallistaCluster::new_from_config(&config).await?;

    start_server(cluster, addr, config).await?;
    Ok(())

To Reproduce
For example, run:

cargo run --bin ballista-scheduler -- --cluster-backend sled --sled-dir ./tmp

The scheduler creates the in-memory storage backend instead, and no Sled initialization line such as the following appears in the logs:

2023-04-30T16:16:54.203599Z  INFO main ThreadId(01) ballista_scheduler::cluster: Initializing Sled database in temp directory 

Expected behavior
The scheduler should initialize the etcd or sled backend when the corresponding option is selected.
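
A minimal sketch of the kind of selection logic this implies. Everything here is a stand-in written for illustration: the `Opt` struct, the `ClusterBackend` enum, the field names `sled_dir` and `etcd_urls`, the variant payloads, and the helper `cluster_storage_from_opt` are assumptions, not Ballista's actual definitions, and the real fix may look different.

```rust
// Hypothetical stand-in for the backend chosen on the command line.
#[derive(Debug, Clone, PartialEq)]
enum ClusterBackend {
    Memory,
    Sled,
    Etcd,
}

// Hypothetical stand-in for the storage config; the payload types are assumed.
#[derive(Debug, Clone, PartialEq)]
enum ClusterStorageConfig {
    Memory,
    Sled(Option<String>),
    Etcd(Vec<String>),
}

// Hypothetical subset of the parsed scheduler options.
struct Opt {
    cluster_backend: ClusterBackend,
    sled_dir: Option<String>,
    etcd_urls: String,
}

// Derive the storage config from the selected backend instead of
// hard-coding ClusterStorageConfig::Memory.
fn cluster_storage_from_opt(opt: &Opt) -> ClusterStorageConfig {
    match opt.cluster_backend {
        ClusterBackend::Memory => ClusterStorageConfig::Memory,
        ClusterBackend::Sled => ClusterStorageConfig::Sled(opt.sled_dir.clone()),
        ClusterBackend::Etcd => ClusterStorageConfig::Etcd(
            opt.etcd_urls
                .split(',')
                .map(|url| url.trim().to_string())
                .filter(|url| !url.is_empty())
                .collect(),
        ),
    }
}
```

With something like this in place, the `cluster_storage` field would be filled from the helper's result, so a run with the sled backend selected would no longer fall back to memory.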

Additional context
I have submitted a pull request.