query returns one row only, my code style is es5

Question

jhsea3do opened this issue 8 years ago · comments

Hi ufukomer,

Here is my ES5 code. It always returns only the first row of the results.
I think it may be caused by the 'pending' state, but how can I get all 50 rows after the pending state has finished?



> var sql = "select system_name from itm.system group by system_name";

> var client = require('node-impala').createClient({"host": "hadoop3"});

> client.query(sql, function(err, data){ console.log('err', err, 'data', data)  })

{ state: 'pending' }

> err null data [ [ 'CASCECUP01:KUX' ],

[ { name: 'system_name', type: 'string', comment: '' } ] ]



> client.resultType = 'map'

'map'

> client.query(sql, function(err, data){ console.log('err', err, 'data', data)  })

{ state: 'pending' }

> err null data Map { 'system_name' => [ 'CASCECUP01:KUX' ] }



> var client = require('node-impala').createClient({"host": "hadoop3", "resultType": 'map'})

undefined

> client.query(sql, function(err, data){ console.log('err', err, 'data', data)  })

{ state: 'pending' }

> err null data Map { 'system_name' => [ 'CASCECUP01:KUX' ] }

Please compare with the same query run under impala-shell:

Ömer Ufuk Efendioğlu · Answer 1 · Tue Apr 26 2016 09:34:19 GMT+0800 (China Standard Time)

@jhsea3do As I understand it, this is a major problem with the Beeswax service. Each INSERT into HDFS creates a new data file. Unfortunately, Beeswax sometimes reads only one of them and sometimes all of them. I inserted two sample rows into the sample_08 table, which means two separate data files:

But Beeswax reads only one of them:

// node-impala: output of query (SELECT * FROM sample_08)
[ { code: '10-0000',
    description: 'Yow',
    total_emp: '1112',
    salary: '2000' } ]

So the issue never happens as long as all the data is kept in one data file, but of course that is not a solution. It seems to me that the only suitable solution is to use HiveServer2 rather than Beeswax. That is not straightforward, though, since I would have to implement a SASL transport similar to the ones in the Java and Python versions.

I would be glad to hear alternative solutions if you have an idea.

Metal Squilla · Answer 2 · Wed Nov 02 2016 13:59:30 GMT+0800 (China Standard Time)

Hi! Do you have other suggestions for using Impala with Node.js?

Ömer Ufuk Efendioğlu · Answer 3 · Fri Nov 04 2016 00:17:49 GMT+0800 (China Standard Time)

@tiejian Create a command-line app, then use impala-shell through that app. I have never tried this in Node.js, but there are many libraries for it on GitHub, e.g. commander.js and cli. I'm not sure whether these tools meet your needs, so do your own search.

If your Impala host is remote, you would probably need a socket (e.g. using Socket.IO) to connect the command-line app running on your local machine to the command-line app running on the remote host. That way you could probably use impala-shell.

If my explanation isn't clear, please don't hesitate to ask for details.
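A minimal sketch of the impala-shell approach described above, in ES5-style Node.js. It assumes impala-shell is installed and on the PATH of the machine running Node, and that result rows are written to stdout. The helper names (`buildArgs`, `parseDelimited`, `queryViaShell`) are hypothetical and not part of node-impala. The flags used are `-i` (impalad host), `-q` (query), `-B` (delimited output, tab-separated by default) and `--print_header`.

```javascript
// Sketch only: shells out to impala-shell instead of going through Beeswax.
var execFile = require('child_process').execFile;

// Build the impala-shell argument list for a single query.
// -B switches to delimited output; --print_header puts column names first.
function buildArgs(host, sql) {
  return ['-i', host, '-q', sql, '-B', '--print_header'];
}

// Turn tab-delimited output (header line first) into an array of row objects.
// Empty lines are skipped; every value is kept as a string.
function parseDelimited(stdout) {
  var lines = stdout.split(/\r?\n/).filter(function (line) {
    return line.length > 0;
  });
  if (lines.length === 0) { return []; }
  var columns = lines[0].split('\t');
  return lines.slice(1).map(function (line) {
    var cells = line.split('\t');
    var row = {};
    columns.forEach(function (name, i) { row[name] = cells[i]; });
    return row;
  });
}

// Hypothetical helper: run one query and pass the parsed rows to the callback.
function queryViaShell(host, sql, callback) {
  execFile('impala-shell', buildArgs(host, sql), function (err, stdout) {
    if (err) { return callback(err); }
    callback(null, parseDelimited(stdout));
  });
}
```

Usage would look like `queryViaShell('hadoop3', sql, function (err, rows) { console.log(err, rows); })`. Using `execFile` with an argument array avoids passing the SQL through a shell, so the query text needs no quoting.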

Quentin Rousseau · Answer 4 · Mon Feb 05 2018 12:46:25 GMT+0800 (China Standard Time)

Hi,

I'm hitting this issue as well. This is a major problem, and it makes the entire library unusable...

What can we do to help fix it?

Regards

Ömer Ufuk Efendioğlu · Answer 5 · Mon Feb 05 2018 14:59:10 GMT+0800 (China Standard Time)

@kwent We should make this library use HiveServer2, similar to how its Python client does, as I mentioned in the comment above.

sun · Answer 6 · Mon Sep 09 2019 16:00:02 GMT+0800 (China Standard Time)

Hello everybody, I ran into this problem recently, and I may have found a way to work around it. I'm not sure, though, so I need you to verify it. Let's also talk about the reason.
Like you, when I query SELECT * FROM atable limit 10 I get 10 rows:

[
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"a","b":2},
{"a":"b","b":2},
{"a":"b","b":2},
{"a":"b","b":2},
{"a":"b","b":2},
]

But when I query SELECT a,count(*) FROM atable group BY a I get only 1 row! This is where the problem lies:

[
  {
    "a": "a",
    "count(*)": "6"
  }
]

After some research I found a way to get the expected result.
I query SELECT a,count(*) FROM atable group BY a order by a. The order by is the key:

[
  {
    "a": "a",
    "count(*)": "6"
  },
  {
    "a": "b",
    "count(*)": "4"
  }
]

Try it! I'm looking forward to your feedback!
@ufukomer
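
For anyone who wants to try the ORDER BY workaround without editing every query by hand, here is a small ES5 sketch. The workaround itself is unverified, as noted above, and the helper names (`withOrderBy`, `queryOrdered`) are hypothetical. The only thing assumed about the client is the `query(sql, callback)` form used at the top of this thread.

```javascript
// Append an ORDER BY clause unless the query already has one.
// This is a naive string check, not a SQL parser, so it can be fooled
// by "order by" inside a subquery or a string literal.
function withOrderBy(sql, column) {
  var trimmed = sql.replace(/;\s*$/, '').trim();
  if (/\border\s+by\b/i.test(trimmed)) { return trimmed; }
  return trimmed + ' ORDER BY ' + column;
}

// Run the rewritten query through any client exposing query(sql, callback).
function queryOrdered(client, sql, column, callback) {
  client.query(withOrderBy(sql, column), callback);
}
```

For example, `queryOrdered(client, 'SELECT a,count(*) FROM atable group BY a', 'a', function (err, data) { console.log(err, data); })` sends the query with ` ORDER BY a` appended.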