apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform

Home Page: https://superset.apache.org/


Very very slow Apache Superset dashboard

MasMadd opened this issue · comments

Bug description

Hi everyone,
I'm opening this issue because I've spent a few days trying to solve this myself, searching the Internet and the documentation, but nothing seems to work.

Problem:
I have a dashboard in Apache Superset with several filters on some columns of the dataset used by its charts. These columns are already indexed in the PostgreSQL table.
The dashboard contains approximately 21 charts, all referring to the same dataset (table).

A query (executed in pgAdmin) that returns all the data in the table, filtered on an indexed column, takes about 2 seconds.

The problem is that the dashboard loading times are at least an order of magnitude higher than what I see in pgAdmin. What seems to make everything worse is that, when I inspect the API calls made by the Superset front end, the charts appear to be rendered only AFTER the filter value data have been retrieved, which happens to be the slowest request, so the dashboard stays frozen for up to 30-40 seconds before anything is displayed.
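
As a point of comparison, a minimal sketch like the one below can time the same filtered query directly through SQLAlchemy, outside of Superset (the connection URL, table name, and column name are placeholders, not my real ones):

import time
from sqlalchemy import create_engine, text

# Placeholders -- replace with the real connection URL, table and indexed column
engine = create_engine("postgresql://user:password@localhost:5432/mydb")
query = text("SELECT * FROM my_table WHERE my_indexed_column = :value")

start = time.time()
with engine.connect() as conn:
    rows = conn.execute(query, {"value": "some_value"}).fetchall()
elapsed = time.time() - start

print(f"{len(rows)} rows fetched in {elapsed:.2f}s")  # ~2 s here vs 30-40 s for the dashboard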

I also changed these settings on Postgres, hoping to help the parallelization process:
max_worker_processes = 32              # maximum number of worker processes for the entire PostgreSQL instance (default: 8)
max_parallel_maintenance_workers = 16
max_parallel_workers_per_gather = 32   # maximum number of workers that can be started by a single query (default: 2)
max_parallel_workers = 32              # maximum number of parallel query worker processes for the entire instance (default: 8)

But this doesn't seem to have helped either: it still looks to me like Superset lacks parallelization and executes the calls serially, which is why the times are so large. I read online that some people sped this up by setting DASHBOARD_VIRTUALIZATION = True in the Superset configuration, but for me it doesn't seem to change anything.
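
For reference, this is how I understand the flag should be enabled in superset_config.py (assuming the file is actually picked up by the container):

# superset_config.py
FEATURE_FLAGS = {
    "DASHBOARD_VIRTUALIZATION": True,
}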

I attach some screenshots of the loading process.

[Screenshot 2024-07-18 163838: pending request, no charts displayed even though the loading spinner has stopped]
[Screenshot 2024-07-18 163938: pending request resolved, charts displayed]

How to reproduce the bug

Configuration:
Superset docker image: apache/superset:0ddc0a6738bbffc86739a485169e2d5d23ad14e4
postgres db (docker): cpus: '0.50', memory: 1536M
superset app (docker): cpus: '0.50', memory: 1536M
superset worker (docker): cpus: '0.50', memory: 2G
superset worker beat (docker): cpus: '0.50', memory: 512M

run_server.sh modified:
gunicorn \
    --bind 0.0.0.0:8088 \
    --access-logfile $HYPHEN_SYMBOL \
    --error-logfile $HYPHEN_SYMBOL \
    --workers 30 \
    --worker-class gevent \
    --threads 50 \
    --timeout 60 \
    --keep-alive 5 \
    --max-requests 3000 \
    --max-requests-jitter 50 \
    --limit-request-line 0 \
    --limit-request-field_size 0 \
    "${FLASK_APP}"

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Hey there, @MasMadd! I'm here to help you with the issue you're facing. I can assist with bugs, questions, and becoming a contributor. Let's work together to resolve this issue.

To address the slow loading times of your Apache Superset dashboard, consider the following optimizations:

  1. Reduce Payload: Filter out unnecessary metrics and columns to reduce the payload when loading the dashboard. This can be achieved by modifying the data processing logic to only include the required metrics and columns. Here is a relevant code snippet:
# Note: this is adapted from Superset's internals; `utils`, METRIC_FORM_DATA_PARAMS,
# COLUMN_FORM_DATA_PARAMS and DatasetNotFoundError are Superset-internal names that
# must be imported / in scope for this snippet to run.
def reduce_payload(data, slices):
    metric_names = set()
    column_names = set()
    for slc in slices:
        form_data = slc.form_data
        for metric_param in METRIC_FORM_DATA_PARAMS:
            for metric in utils.as_list(form_data.get(metric_param) or []):
                metric_names.add(utils.get_metric_name(metric))
                if utils.is_adhoc_metric(metric):
                    column_ = metric.get("column") or {}
                    if column_name := column_.get("column_name"):
                        column_names.add(column_name)

        column_names.update(
            filter_["subject"]
            for filter_ in form_data.get("adhoc_filters") or []
            if filter_.get("clause") == "WHERE" and filter_.get("subject")
        )

        column_names.update(
            filter_config["column"]
            for filter_config in form_data.get("filter_configs") or []
            if "column" in filter_config
        )

        try:
            query_context = slc.get_query_context()
        except DatasetNotFoundError:
            query_context = None

        if query_context:
            column_names.update(
                [
                    utils.get_column_name(column_)
                    for query in query_context.queries
                    for column_ in query.columns
                ]
                or []
            )
        else:
            _columns = [
                utils.get_column_name(column_)
                if utils.is_adhoc_column(column_)
                else column_
                for column_param in COLUMN_FORM_DATA_PARAMS
                for column_ in utils.as_list(form_data.get(column_param) or [])
            ]
            column_names.update(_columns)

    filtered_metrics = [
        metric
        for metric in data["metrics"]
        if metric["metric_name"] in metric_names
    ]

    filtered_columns = []
    column_types = set()
    for column_ in data["columns"]:
        generic_type = column_.get("type_generic")
        if generic_type is not None:
            column_types.add(generic_type)
        if column_["column_name"] in column_names:
            filtered_columns.append(column_)

    data["column_types"] = list(column_types)
    del data["description"]
    data.update({"metrics": filtered_metrics})
    data.update({"columns": filtered_columns})
    verbose_map = {"__timestamp": "Time"}
    verbose_map.update(
        {
            metric["metric_name"]: metric["verbose_name"] or metric["metric_name"]
            for metric in filtered_metrics
        }
    )
    verbose_map.update(
        {
            column_["column_name"]: column_["verbose_name"]
            or column_["column_name"]
            for column_ in filtered_columns
        }
    )
    data["verbose_map"] = verbose_map

    return data
  2. Enable Dashboard Virtualization: Ensure that the DASHBOARD_VIRTUALIZATION feature flag is set to True to help parallelize API calls and improve loading times:
DEFAULT_FEATURE_FLAGS: dict[str, bool] = {
    ...
    "DASHBOARD_VIRTUALIZATION": True,
    ...
}
  3. Optimize Gunicorn Configuration: Adjust your Gunicorn server settings to handle high loads more efficiently. Here is an example configuration:
-w 10 \
-k gevent \
--worker-connections 1000 \
--timeout 120 \
-b  0.0.0.0:6666 \
--limit-request-line 0 \
--limit-request-field_size 0 \
--statsd-host localhost:8125 \
"superset.app:create_app()"
  4. Caching: Configure caching to improve performance. For example, use Redis for caching filter states (a chart-data cache sketch follows after this list):
FILTER_STATE_CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 86400,
    'CACHE_KEY_PREFIX': 'superset_filter_cache',
    'CACHE_REDIS_URL': 'redis://localhost:6379/0'
}
  5. Resource Allocation: Ensure your Superset instance has adequate resources. For a moderately-sized instance, 8GB RAM and 2 vCPUs are typically adequate [1].
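
As a sketch for the caching point above: chart query results can also be cached separately from filter state via DATA_CACHE_CONFIG in superset_config.py (the Redis URL, database index, and timeout below are placeholder values):

# superset_config.py -- placeholder Redis URL and timeout, adjust to your deployment
DATA_CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 3600,  # cache chart query results for one hour
    'CACHE_KEY_PREFIX': 'superset_data_cache',
    'CACHE_REDIS_URL': 'redis://localhost:6379/1',
}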

By implementing these strategies, you should see an improvement in the loading times of your Apache Superset dashboard.

To continue talking to Dosu, mention @dosu.

To everyone: the solutions proposed by @dosu are already implemented and do not solve my issue.

@rusackas, this one's a bit out of my wheelhouse, could you step in and provide your guidance?

To continue talking to Dosu, mention @dosu.