Only half the queries get executed (exactly half)
mtimofiiv opened this issue · comments
Hello, I am sending a bulk set of queries via the multiSearch() method, but the response does not match what I ask for.
When I do 46 queries, I get back 23. When I do 76, I get back 38. When I do 64, I get 32.
More specifically, I pass in an array with say 76 query objects in it, and I get back an array of 38 properly formatted results. There are no errors anywhere I can see.
I have tried sending with explicitly setting the index and without, and each time I get the same result.
Any idea why this would be happening?
Can you provide a code example?
Certainly! Here's a function that does it. I initialise the module with the proper ES URI params and with a single index upon which I am searching, and run my bulk query set through here:
const url = require('url')
const ES_URI = url.parse(process.env.ELASTICSEARCH_URI)

const index = {
  _index: 'mgm_events',
  _type: 'event'
};

const config = {
  server: {
    port: ES_URI.port,
    host: ES_URI.hostname,
    secure: ES_URI.protocol.indexOf('https') > -1
  }
};

if (ES_URI.auth) config.server.auth = ES_URI.auth;

const mappings = {
  created_at: { type: 'date' }
};

const es = require('es')(Object.assign(config, index));
es.indices.mappings(Object.assign(index, mappings));
function runQuerySet(querySet) {
  return new Promise((resolve, reject) => {
    es.multiSearch({}, querySet, (err, bulkResultSet) => {
      if (err) return reject(err)

      /*
        Some logic here with processing the results, but for brevity's sake
        we will just return the counts cause that's why we're here...
      */
      return resolve({
        queryLength: querySet.length,
        resultLength: bulkResultSet.responses.length
      })
    })
  })
}

runQuerySet(queries).then(result => {
  console.log(result) // ends up being { queryLength: 4, resultLength: 2 }
})
Here are 2 sample queries I am reproducing the error with:
[
  { query: { bool: { must: [ { match: { name: 'create.offer' } } ] } } },
  { query: { bool: { must: [ { match: { name: 'landing' } } ] } } }
]
And the result set (bulkResultSet) ends up like this:
{ responses: [ { took: 29, timed_out: false, _shards: [Object], hits: [Object] } ] }
// dummy [Object]s shown, but that is not important
The expected response here would be 2 items in the responses array, one for each query. I only run 2 here, but as I said earlier, it is always half the requested amount (so 26 requested queries give me 13 results).
I figured it out.
It has to do with the request format. In the docs, all the multi search examples have a header before each query, while in core.js we see this comment where there are only queries. Sure enough, the payload of the request does not set a header, so when ES receives it, it treats every other entry as a header instead of a query, and only half the queries actually run.
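To illustrate why the count comes out at exactly half, here is a minimal sketch of how a header/body protocol pairs up entries. The function pairAsHeaderAndBody is a hypothetical stand-in written for this example; it is not code from the es module or from Elasticsearch, and it only assumes that entries are consumed two at a time.

```javascript
// Hypothetical illustration: consume entries two at a time,
// treating the first of each pair as a header and the second as the query body.
function pairAsHeaderAndBody(entries) {
  const searches = [];
  for (let i = 0; i + 1 < entries.length; i += 2) {
    searches.push({ header: entries[i], body: entries[i + 1] });
  }
  return searches;
}

// Two queries sent without headers, as in the report above.
const headerless = [
  { query: { bool: { must: [ { match: { name: 'create.offer' } } ] } } },
  { query: { bool: { must: [ { match: { name: 'landing' } } ] } } }
];

const parsed = pairAsHeaderAndBody(headerless);
console.log(parsed.length); // 1: the first query was consumed as a header
```

With 76 headerless entries the same pairing would yield 38 searches, which matches the numbers reported at the top of this issue.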
So I guess my question would be this: should this be documented (so people can structure the queries parameter to include headers accordingly), or should this method take an extra argument?
Let me know, I could do a PR for either.
Hi @mtimofiiv - yes, that is correct - there needs to be a "header" prior to each query that includes things like the index and searchType.
If the payload example from above is altered to look as follows, does the query result in the correct number of results?
[
  { },
  { query: { bool: { must: [ { match: { name: 'create.offer' } } ] } } },
  { },
  { query: { bool: { must: [ { match: { name: 'landing' } } ] } } }
]
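For larger query sets, the interleaving shown above could be done programmatically. The following is a rough sketch only: withEmptyHeaders is a hypothetical helper, not part of the es module, and it assumes that an empty header object is acceptable for every query (as in the payload suggested above).

```javascript
// Hypothetical helper: insert an empty header object before each query
// so that the payload alternates header, query, header, query, ...
function withEmptyHeaders(queries) {
  const payload = [];
  for (const query of queries) {
    payload.push({});    // empty header, no per-query overrides
    payload.push(query); // the actual query body
  }
  return payload;
}

const queries = [
  { query: { bool: { must: [ { match: { name: 'create.offer' } } ] } } },
  { query: { bool: { must: [ { match: { name: 'landing' } } ] } } }
];

const payload = withEmptyHeaders(queries);
console.log(payload.length); // 4 entries: 2 headers + 2 queries
```

The result could then be passed to the runQuerySet function from earlier in the thread, for example runQuerySet(withEmptyHeaders(queries)). Note that queryLength in that function would then count the headers as well, so it would report twice the number of actual queries.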
Ok cool. Would it be nice then to note this in the docs for future users? I can submit a PR to specify this in the readme.
@mtimofiiv - sorry for taking so long to respond... a PR would be awesome!
Closing for now... no modifications were made to the documentation, but a really great plan would be to update the links in the readme to point to the elastic.co docs for each function.