cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)

it is toooooo slow when fetching data from impala when rows >=10000

l00173745 opened this issue

Fetching data from Impala is too slow when the result set has 10000 or more rows.

@l00173745 I'm going to close this if you don't have any additional information. This isn't an actionable issue without more details about the setup, query, timing, etc., or a reproduction.

We did some work on the server side to improve the protocol with a result spooling feature: see https://impala.apache.org/docs/build/html/topics/impala_query_results_spooling.html and https://issues.apache.org/jira/browse/IMPALA-8656. It can greatly improve fetch performance for large result sets.
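For anyone landing here, this is a rough sketch of how result spooling might be requested from the client side and the rows pulled in batches. It assumes the server supports the `SPOOL_QUERY_RESULTS` query option and that the cursor's `execute()` accepts a `configuration` dict of per-query options, as impyla's cursor does. The helper name `fetch_in_batches`, the batch size, and the host, port, and table in the comments are placeholders, not anything from the original report.

```python
def fetch_in_batches(cursor, sql, batch_size=10000, spool_results=True):
    """Yield lists of rows from `sql` using a DB API 2.0 cursor.

    Hypothetical helper for illustration; not part of impyla.
    """
    # Assumption: execute() takes a `configuration` dict of query options.
    # Depending on the Impala version, spooling may already be on by default.
    configuration = {"SPOOL_QUERY_RESULTS": "true"} if spool_results else None
    cursor.execute(sql, configuration=configuration)

    while True:
        # fetchmany() is standard DB API 2.0; an empty result means no more rows.
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Example usage (host, port, and table name are placeholders):
#
#   from impala.dbapi import connect
#   cursor = connect(host="impalad-host", port=21050).cursor()
#   for batch in fetch_in_batches(cursor, "SELECT * FROM my_table"):
#       print(len(batch))
```

Timing a loop like this with and without the option set should give the kind of concrete numbers that would make a slow-fetch report actionable.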