influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics

Home Page: https://influxdata.com

"list series" has gotten slow.

Dieterbe opened this issue · comments

list series /regex/ used to return in 150~300ms for me.
nowadays it's around 480ms.
and list series now takes about 860ms. not sure what this used to be, i have some old notes saying 2-3s but those may be outdated.

anyway this probably has something to do with #830: we now sort everytime we request the series list.
i recently upgraded to rc5 so that seems to make sense.

it's important that list series and list series /regex/ return speedily.
for example graphite-influxdb gets user requests, then needs to figure out which series it needs, for which it executes list series /regex/ and then queries influxdb for data. It caches the list outputs, but even so, with a cold cache (new graph) it can take several seconds before we even know which series to use (because the user supplies multiple expressions).
I can imagine other monitoring systems being built on top of influx will need similar things.

so how can we make this as fast as it can be?
maybe influx can keep the sorted list of series readily available for querying?

A cache of series nodes would be great. I get a timeout every time I try to run a list series or anything else with many sub-nodes.

👍 for cache of series nodes.

It would be great to have a cache of series nodes; right now, even if I revert this commit, it's still a bit slow (the initial list series in graphite takes something like 16 seconds, a good drop from 40, but still too much).

I find it surprising that list series is taking that long, even with the sorting. How many series do you guys have? Could this be the overhead of json serialization?

$ influx-cli -db graphite <<< "list series" | wc -l
188366

Not so much in fact. It's only 790339 here (including shard spaces).

i also currently have 200k, but i have switched to another solution. i'll try influxdb again when it is stable and fast, perhaps.

Yup, the problem is that 700k is not a big number. I've seen a lot more series in a single database (around 30 million) on a single machine.

p.s. for me the problem is semi-solved by reverting the commit (more than 2x the performance there) and making the cache more aggressive. Though it's not that good, and the cache can expire :(

With 200k series, what are your expectations for running list series? How
fast is fast enough? Are you doing a query where you return all 200k
series? Is this running over an internet connection? How long are the
series names and the size of the raw data?

On Thu, Sep 4, 2014 at 1:10 PM, Kenterfie notifications@github.com wrote:

i have also currently 200k, but i have switch to another solution. i try
influxdb again, when it is stable and fast perhaps.


Reply to this email directly or view it on GitHub
#884 (comment).

The query shouldn't take longer than 5s for 200k. Connected over 1Gbit ethernet. The average size of a series name is 40-50 characters.

I'd say <100ms for both list series and list series /regex/; the latter is the more common case for me. I connect from localhost for my tests to rule out transmission times.

There's no such thing as an absolute time for an operation; I don't understand what <100ms means here. Especially for regex list series, that's an O(n) operation, or at least that's how it's implemented today. This doesn't really matter, since we have to iterate through all series names anyway to create the resulting series object, so I don't see a way around the operation being O(n). I benchmarked the list series operation on my local machine and here are the numbers I got:

with sorting

200 -> 876  ms
700 -> 3281 ms
700 -> 3384 ms

w/o sorting

700 -> 2393 ms

As you can see from the with-sorting section, the operation scales linearly with the number of series. Getting rid of sorting shaves roughly a second off the operation's time. From the profile I got, it looks like most of the time is spent doing json marshaling. We have two options here: one is to use a faster json encoder that doesn't rely on reflection; the other is to offer an option to return the list in text format with a user-specified delimiter. Thoughts?
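
As an aside on the delimiter idea: a minimal sketch of what a plain-text listing endpoint could look like, assuming a hypothetical /list_series route and handler name (illustration only, not InfluxDB's actual handler code):

package main

import (
	"net/http"
	"strings"
)

// seriesNames stands in for whatever the metastore would return;
// it is hardcoded here purely for illustration.
var seriesNames = []string{"servers.host1.cpu", "servers.host1.mem", "servers.host2.cpu"}

// listSeriesText writes the series list as plain text with a caller-chosen
// delimiter, sidestepping reflection-based JSON encoding entirely.
func listSeriesText(w http.ResponseWriter, r *http.Request) {
	delim := r.URL.Query().Get("delimiter")
	if delim == "" {
		delim = "\n"
	}
	w.Header().Set("Content-Type", "text/plain")
	w.Write([]byte(strings.Join(seriesNames, delim)))
}

func main() {
	http.HandleFunc("/list_series", listSeriesText)
	http.ListenAndServe(":8087", nil)
}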

I guess I'm mostly surprised by how long the serialisation takes. As I also noted in the OP, list series /with some regex/ is faster than plain list series, so regex matching a string is faster than json encoding it? I'd love to see your benchmark with a regex-filtering scenario as well (filtering down to a small, known subset, say 1~10%).

if we have a large number of things that are all of the same, known type, then using a better json encoder will probably make a lot of difference. At work we also sometimes write a custom json encoder (and decoder, actually) when we know the format of the inputs and outputs, because it makes a big difference compared to a standard encoder that has to account for all possible scenarios.
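
To make that concrete, here is a rough sketch of a hand-rolled encoder for a list of series names built on a bytes.Buffer, in the spirit of what's being discussed (encodeSeriesList is a made-up name, not an InfluxDB function):

package main

import (
	"bytes"
	"fmt"
	"os"
	"strconv"
)

// encodeSeriesList hand-writes the JSON array for a list of series names
// instead of going through reflection-based json.Marshal. strconv.Quote
// handles quotes and backslashes; a production encoder would still need
// JSON-compliant \u escapes for non-printable characters.
func encodeSeriesList(names []string) []byte {
	var buf bytes.Buffer
	buf.WriteByte('[')
	for i, name := range names {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteString(strconv.Quote(name))
	}
	buf.WriteByte(']')
	return buf.Bytes()
}

func main() {
	names := []string{"servers.host1.cpu", `stats.foo\bar`}
	os.Stdout.Write(encodeSeriesList(names)) // ["servers.host1.cpu","stats.foo\\bar"]
	fmt.Println()
}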

Any news?

I don't know Go too well, but I've tried to hack on influx a bit. I've replaced json.Marshal(SerializeSeries(...)) with a function that builds the json directly (a bytes.Buffer with the data written straight into it). It doesn't help a lot, though. It seems that the main performance issue is that the data is sorted inside metastore/store.go and also inside SerializeSeries.

As I don't need sorted data, I've got "list series" to run in 1.1s on 1 million series. Without this, the same query took 3.2s.

So still, the main performance problem is sorting. With different json marshaling, performance can also be improved, but not by much.

Anyway, there is still a need to implement a series name cache, because performance is still too slow either way.

Could the sorting not be made optional? Personally I need the sorting, but I understand all the comments here about the need for speed. Perhaps something like this (in the fashion of SQL, which influxdb seems to loosely follow):

list series order by asc

and without the "order by" it would skip the sorting and return faster.
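
For illustration, skipping the sort unless the query asks for ordering could look roughly like this (listSeries and the orderAsc flag are hypothetical names, not influxdb's actual API):

package main

import (
	"fmt"
	"sort"
)

// listSeries returns the known series names, sorting them only when the
// query asked for ordering. Skipping the sort saves the O(n log n) pass
// on large series sets.
func listSeries(names []string, orderAsc bool) []string {
	out := make([]string, len(names))
	copy(out, names)
	if orderAsc {
		sort.Strings(out)
	}
	return out
}

func main() {
	names := []string{"servers.b.cpu", "servers.a.cpu"}
	fmt.Println(listSeries(names, false)) // unsorted, fast path
	fmt.Println(listSeries(names, true))  // "order by asc" path
}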

My point was that the problem is not only with JSON marshaling. There are 3 parts where it loses speed:

  1. Marshaling (10-15%)
  2. Sorting (30%, done twice - when getting data from metastore and before passing to json encoder)
  3. Querying the data (55-60%).

So it won't be enough to speed up just one of these things. One of the most obvious ways to improve the situation is to store the list of series (sorted or unsorted, it doesn't matter) somewhere in memory and do all the operations on that, instead of querying the backends for series.
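
A bare-bones sketch of that in-memory idea, assuming a cache that is kept up to date as series are created (none of these names exist in influxdb; this only illustrates the approach):

package main

import (
	"fmt"
	"regexp"
	"sort"
	"sync"
)

// seriesCache keeps the full list of series names in memory so that
// "list series" and "list series /regex/" never have to hit the backend.
type seriesCache struct {
	mu    sync.RWMutex
	names []string // kept sorted, so responses need no per-query sort
}

// Add inserts a name at its sorted position if it isn't already present.
func (c *seriesCache) Add(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	i := sort.SearchStrings(c.names, name)
	if i < len(c.names) && c.names[i] == name {
		return // already present
	}
	c.names = append(c.names, "")
	copy(c.names[i+1:], c.names[i:])
	c.names[i] = name
}

// List returns all names, or only those matching re when re is non-nil.
func (c *seriesCache) List(re *regexp.Regexp) []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if re == nil {
		return append([]string(nil), c.names...)
	}
	var out []string
	for _, n := range c.names {
		if re.MatchString(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	c := &seriesCache{}
	c.Add("servers.host2.cpu")
	c.Add("servers.host1.cpu")
	fmt.Println(c.List(nil))                          // full, already-sorted list
	fmt.Println(c.List(regexp.MustCompile(`host1`))) // regex-filtered list
}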

@vladimir-smirnov-sociomantic how did you get these numbers? Also, the sorting in SerializeSeries is a no-op, since list series returns a single time series. That leaves the sorting in the coordinator, which according to my numbers contributes about a second (more or less). We can definitely make the sorting optional, something along the lines of what @sanga mentioned earlier.

@jvshahid
This is what I've done: https://gist.github.com/vladimir-smirnov-sociomantic/4c8bad1185258bf86aa7

Values: I've created 1000008 series (can be rounded to 1 million, though I accidentally created 8 series too many :)) in influxdb (2 points each). After that I ran 'time curl -G 'http://localhost:8086/db/test/series?u=test&p=test&pretty=false' --data-urlencode "q=list series" >/dev/null'. With the patch above I get 1.3s on the first run; over 3 runs the avg is 1.1s (lowest 0.6, highest 1.3). For 10 runs (excluding the cold run) the avg is 0.7 (highest 0.9, lowest 0.6).

With vanilla influxdb 0.8.2 and the same DB it's very consistent between runs, and the avg time for 10 queries is 2.7s, with the highest 3s and the lowest 2.45.

If I only change SerializeSeries (leaving sorting in place): 1.9s avg, highest 2.2s, lowest 1.7s.

If I only remove sorting but leave JSON marshaling in place: 1.55s avg, highest 1.65, lowest 1.3.

So proper numbers will be:
Marshaling: ~30%
Sorting: ~43%

Without sorting and with custom marshaling 'list series' will be almost 4x faster.

Testing HW is a simple desktop: an i5-3470S with the performance governor, 8GB RAM; the db is on an HDD (a cheap Seagate ST500DM002).

UPD: modified the paste link, fixed a small bug. Got these values:
Column name - which patches are applied to that version. Vanilla - 0.8.2 without patches. Marshaling - only the JSON marshaling modified. Sorting - only sorting disabled. Marshaling+Sorting - the full patch. All times in seconds (what time reports for the curl).

        Vanilla Marshaling  Sorting     Marshaling+Sorting
        3.021   2.08        1.653       0.967
        2.774   2.1713      1.616       0.601
        2.668   1.73        1.57        0.624
        2.459   2.03        1.315       0.924
        2.754   2.067       1.634       0.878
        2.71    1.746       1.586       0.588
        2.684   2.033       1.559       0.892
        2.406   1.757       1.324       0.601
        2.716   2.023       1.821       0.858
        2.925   2.022       1.764       0.83
AVG     2.71    1.97        1.58        0.78
STDEV   0.18    0.16        0.16        0.15
MAX     3.021   2.1713      1.821       0.967
MIN     2.406   1.73        1.315       0.588

Using the benchmark for ListSeries (make integration_test only=SingleServerSuite.BenchmarkListSeries verbose=on benchmark=on) I got the following values:
Vanilla:

PASS: single_server_test.go:230: SingleServerSuite.BenchmarkListSeries        20        2754023298 ns/op

START: single_server_test.go:42: SingleServerSuite.TearDownSuite
PASS: single_server_test.go:42: SingleServerSuite.TearDownSuite 0.098s

OK: 1 passed
--- PASS: Test (88.68 seconds)
PASS
ok      github.com/influxdb/influxdb/integration        88.700s

Marshaling:

PASS: single_server_test.go:230: SingleServerSuite.BenchmarkListSeries        20        2262612770 ns/op

START: single_server_test.go:42: SingleServerSuite.TearDownSuite
PASS: single_server_test.go:42: SingleServerSuite.TearDownSuite 0.099s

OK: 1 passed
--- PASS: Test (78.81 seconds)
PASS
ok      github.com/influxdb/influxdb/integration        78.838s

Sorting:

PASS: single_server_test.go:230: SingleServerSuite.BenchmarkListSeries        50        2093408024 ns/op

START: single_server_test.go:42: SingleServerSuite.TearDownSuite
PASS: single_server_test.go:42: SingleServerSuite.TearDownSuite 0.109s

OK: 1 passed
--- PASS: Test (138.24 seconds)
PASS
ok      github.com/influxdb/influxdb/integration        138.294s

Sorting + Marshaling:

PASS: single_server_test.go:230: SingleServerSuite.BenchmarkListSeries        50        1617170577 ns/op

START: single_server_test.go:42: SingleServerSuite.TearDownSuite
PASS: single_server_test.go:42: SingleServerSuite.TearDownSuite 0.100s

OK: 1 passed
--- PASS: Test (114.17 seconds)
PASS
ok      github.com/influxdb/influxdb/integration        114.194s

unfortunately I can't compile influxdb from scratch right now because I can't build rocksdb, but your patch looks really neat @vladimir-smirnov-sociomantic, I wish I could compile and run it. This looks like a fairly easy quick win to boost list series performance. Any chance we can get this in, @jvshahid?

uhm, this ticket is about list series being slow.
hope @vladimir-smirnov-sociomantic's patch gets merged soon....

It will still be slow, just a bit better.

@jvshahid @pauldix if you think that my patches (or parts of them) can be merged, I can make a PR with them. Just say whether you'll need only the marshaling change or the sorting disabled as well.

There are some changes proposed by @pauldix in #1059. Depending on the final cut of that proposal, this story may or may not be relevant. We will probably push back on this story for a little bit.

i didn't see anything in #1059 about speeding up listing of series, did i miss something? either way #1059 is about major refactoring that will probably take a while to get to.
To make influxdb usable as graphite backend, a solution/workaround for this problem would be welcome much sooner. Not to troll or fuel a negative discussion, but to explain how important this is for graphite users: we haven't run influxdb at $dayjob since these problems. I guess I could switch back to running v0.8.0-rc.4, but before I went down that road I wanted to see if this problem would be addressed. just disabling the sorting again/making it configurable or applying @vladimir-smirnov-sociomantic's patch would go a long way with fairly minimal work.

hey @vladimir-smirnov-sociomantic when i run your patch I get the error invalid character 'x' in string escape code.
after some debugging i noticed that your code sometimes puts \ instead of \\:

[g1 ~]$ diff official.txt patched.txt 
(...)
< "servers.domU-foo.diskspace.\\x2f.byte_avail"]
---
> "servers.domU-foo.diskspace.\x2f.byte_avail"]
(...)
< "stats.dfbar\\cli\\command\\consumer\\foo.age"]
---
> "stats.dfbar\cli\command\consumer\foo.age"]

should be an easy fix, will fix it unless you beat me to it :)
other than that, the json output is the same format as the official one, so nice work!

Yup, it could be that; I haven't got any metrics with a '\' or with unicode symbols, so it's probably a bug.
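
The escaping difference is easy to reproduce in a few lines of Go; this only demonstrates the bug, it is not the actual patch:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// A series name containing a literal backslash, as in the diff above.
	name := `servers.domU-foo.diskspace.\x2f.byte_avail`

	// Writing the raw name between quotes yields invalid JSON:
	// `\x` is not a legal JSON escape, hence the "invalid character 'x'" error.
	naive := `"` + name + `"`

	// Proper escaping (here via encoding/json for the single value)
	// doubles the backslash, matching the official encoder's output.
	escaped, _ := json.Marshal(name)

	fmt.Println(naive)           // "servers.domU-foo.diskspace.\x2f.byte_avail"
	fmt.Println(string(escaped)) // "servers.domU-foo.diskspace.\\x2f.byte_avail"
}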

my results:
with 433k series.

so not bad :) next up is just removing all series that have a \ in them and using your pure patch, should be a bit faster still.

EDIT: my numbers are using influx-cli which uses standard influx client, which does json decode as well. i should redo this with pure wget to eliminate json decode time.

The underlying series structure is going to change completely for v0.9.0, so this is no longer actionable. Closing it out.

@toddboom AFAIK this is something that 0.9 doesn't address, because this is due to the json encoding, cacheless list iteration & regex checking, and sorting of series. is 0.9 going to address any of these? will it keep a sorted list in memory perhaps?

I think it's better to leave this open until benchmarks prove that it's really fixed. Because @Dieterbe is right: it's about things that will still be there even if the series struct is completely rewritten.

@Dieterbe v0.9.0 is effectively going to be a complete rewrite. Also, because of tags, you should end up with far fewer overall series. For the sake of managing actionable issues on our end, I'd prefer to keep this closed and reopen if necessary once v0.9.0 is out.

@toddboom if there is no caching and the result is still available only as json (with the same json marshaling library), it'll still be a problem. You just push the limits a bit further: if not 1 million, then 10 million series will be a problem in terms of performance.

Though, of course, it's up to you to decide whether you need a separate issue for that. Just be prepared for the problem to still be there, because its cause is still in the code.

One question is, do you need to be returning entire series set in the result?

The model for 0.9.0 will be that you have series names and you have tags and their values. If you do list series, you're likely to only get thousands or maybe tens of thousands of results. If you get the values for a tag, it's the same.

Basically, you should be able to drill down to what you're looking for without getting 100k+ series names back in a single result.

But we'll also be looking at how to efficiently render the result set JSON.

@pauldix for graphite, returning the entire series set is used to display all the metrics the system has. So yes, I still need that. And I don't think that tags will help me much. The only thing I need to filter out is the shards that were used for retention, but if I have 200k distinct metrics, I'll still get 200k records from that query. If I've got 1 million, I'll get 1 million.

Good to hear about JSON.

I'm not sure why you would have 200k series names though. You end up having this in Graphite because it forces you to encode metadata (i.e. stuff you put in tags) into the series name. This will not be the preferred way to do things in Influx.

Let's go back to the beginning. What is your use case? Why do you need a list of series? What is the question you need answered (stated in regular English)?

I'd like to use influxdb as a Graphite backend. One of the use cases where InfluxDB currently doesn't perform well is getting the list of all available metrics (for dashboards). It's done with a "list series" query. When you have 200 hosts with 1k metrics each, you'll get 200k metrics as the result of list series. Right now this query is very, very slow. And 200k is not even near the limit. On the initial listing it performs a query like 'list series /.*/'. As for why there are 200k series: if you've got 200k independent time series of data, they end up as 200k series in influx, and it won't be possible to reduce this with tags, right?

If influx's graphite plugin is also modified to use tags, maybe that will help make list series queries faster.

You can't visualize 200k time series on a dashboard (obviously). So this begs the question, what query needs to be answered to draw a dashboard? For example, if you want all hosts in a datacenter, then that's a query that can be done. Or if you want all series for a given host for the mysql service, then that's a query that can be answered.

The point is that the query language is there to filter down the result set. You should never be streaming the entirety of the metadata set down to a client.

As for the graphite plugin, it can be modified to use tags. But it will have to make assumptions about your series names. The most likely one I can think of is to rip apart the name like this:

(tagKey.tagValue)*seriesName

That is, you have 0 or more tag-value pairs followed by the series name. This is how I've seen most people structuring Graphite names.
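
A rough sketch of what splitting a name under that convention might look like (splitGraphiteName is hypothetical and not part of influxdb's graphite plugin):

package main

import (
	"fmt"
	"strings"
)

// splitGraphiteName applies the (tagKey.tagValue)*seriesName convention
// described above: leading dot-separated nodes are read as key/value pairs
// and the final node becomes the series name.
func splitGraphiteName(metric string) (series string, tags map[string]string) {
	nodes := strings.Split(metric, ".")
	tags = make(map[string]string)
	// Pair up nodes until only the series name is left.
	for len(nodes) >= 3 {
		tags[nodes[0]] = nodes[1]
		nodes = nodes[2:]
	}
	return strings.Join(nodes, "."), tags
}

func main() {
	series, tags := splitGraphiteName("dc.us-east.host.web01.cpu_load")
	fmt.Println(series, tags) // cpu_load map[dc:us-east host:web01]
}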

What are you feeding data into Influx with? What's pushing the Graphite protocol? The best thing to do would be to modify the collectors to actually push up tag style data.

I'm feeding data with collectd right now.

Ok, even if it's possible to get only hosts, only metric groups, etc., it's still possible to have a lot of hosts, a lot of metrics inside a group, etc. Yeah, it won't be 200k, but currently several of my hosts have approx. 10k individual series in one of their groups.

ok, so we'll want to make sure that things that have cardinality in the
tens of thousands are able to return results quickly

On Wed, Nov 26, 2014 at 8:28 AM, Vladimir Smirnov notifications@github.com
wrote:

I'm feeding data with collectd right now.

Ok, even if it's possible to get only hosts, only metric groups, etc. It's
still possible to have a lot of hosts, a lot of metrics inside group etc.
Yeah, it won't be 200k, but currently several of my hosts have approx. 10k
individual series in one of it's groups.


Reply to this email directly or view it on GitHub
#884 (comment).

graphite itself doesn't support a tagging system though. so when using influxdb as a backend for graphite we're stuck with string keys.

That is you have 0 or more tag value pairs followed by the series name. This is how I've seen most
people structuring Graphite names.

I don't think you can automatically spot where the tags are. The dimensions (nodes in the graphite metric string) that you want to segregate or aggregate by (those are the ones that should become tags) are not always at the end (though they are often second to last and earlier); they are sometimes at the beginning, but more often start at the 2nd or 3rd node.
i'm skeptical about an approach that automatically converts graphite strings into series+tags for stock graphite setups; i don't believe this is feasible, so i would just store them in influx the way they are in graphite.

to your point, the queries that graphite-influxdb invokes have a filter to narrow down (i.e. a regex), this is the most common case. that said, i think there are some cases for getting the entire list: for example some graphite dashboards do this to build an index, graph-explorer does this to run metrics2.0 plugins, and I personally sometimes do it to count how many series i have (in influx-cli list series | wc -l) to validate my stuff is working correctly.

Will have to agree with @Dieterbe here.

In our testing we are finding that InfluxDB query performance (select, not list series) where many data points are returned is entirely CPU limited, with the bottleneck appearing to be serialisation and/or sorting, as with this issue.

The API refactor will not change this, InfluxDB will still be returning the same amount of data even after the refactor.

Our use case is querying a day's worth or more of 1-second-sampled metric data from Grafana in order to display a dashboard. The data itself is Graphite metric series ingested directly by InfluxDB.

One day's worth of 1-second-resolution data means 86400 (time, value) data point tuples to be returned by InfluxDB per metric name. Two months' worth of 60-second-resolution data (the default graphite metric resolution) amounts to the same number of datapoints and identical response times.

On a 4-CPU 2.8GHz Xeon E312xx, InfluxDB takes ~3sec to serialise 86400 data points, with linear scaling as the number of days requested/datapoints returned increases.

1 day:

2015-01-12 17:19:32,004 - DEBUG - Sending request - http://10.206.77.194:80/render?from=01:00_20150106&until=01:00_20150107&target=stats.amers.alpha-us1-cell.rtd-cph-idsi.us1i-cphidsi01.ids.perf.inUpdateRate&format=json
2015-01-12 17:19:35,285 - INFO - Query duration - 3.281203 sec. Datapoints: 86401
2015-01-12 17:19:35,392 - DEBUG - Sending request - http://10.206.77.194:80/render?from=01:00_20150106&until=01:00_20150107&target=<graphite series>&format=json
2015-01-12 17:19:38,690 - INFO - Query duration - 3.298818 sec. Datapoints: 86401

2 days:

2015-01-12 17:18:15,764 - DEBUG - Sending request - http://10.206.77.194:80/render?from=01:00_20150106&until=01:00_20150108&target=<graphite series>&format=json
2015-01-12 17:18:22,712 - INFO - Query duration - 6.948422 sec. Datapoints: 172801

3 days:

2015-01-12 17:20:19,836 - DEBUG - Sending request - http://10.206.77.194:80/render?from=01:00_20150106&until=01:00_20150109&target=<graphite series>&format=json
2015-01-12 17:20:32,875 - INFO - Query duration - 13.39656 sec. Datapoints: 259201

4 days:

2015-01-12 17:20:35,064 - DEBUG - Sending request - http://10.206.77.194:80/render?from=01:00_20150106&until=01:00_20150110&target=<graphite series>&format=json
2015-01-12 17:20:54,630 - INFO - Query duration - 19.565301 sec. Datapoints: 345601

We can provide the script used to perform these queries if that would be useful.

The underlying InfluxDB queries are normal select time,value from <series name> where time > X and time < Y order asc queries generated by the graphite_influxdb handler.

As this issue is about list series being slow, should I make a new issue for read query performance?

@pkittenis The changes for v0.9.0 will include much more than an API refactor. The entire codebase has been rewritten, and the introduction of indexed tags will eliminate the need for such a proliferation of series names. In light of that, we're going to keep this issue closed. There may still be serialization overhead we'll want to address down the road, but I think that's best left for a separate issue once the new codebase has been released and profiled.

My other question is why are you returning 86400 data points for graphing?
You can't visualize that much raw data. You should be using rollup
intervals. This means you should be returning anywhere from 200-1000 data
points for a given series that you're visualizing.

That being said, we're working on performance enhancements across the board
(API serialization included)

On Mon, Jan 12, 2015 at 2:10 PM, Todd Persen notifications@github.com
wrote:

@pkittenis https://github.com/pkittenis The changes for v0.9.0 will
include much more than an API refactor. The entire codebase has been
rewritten, and introduction of indexed tags will eliminate the need for
such a proliferation of series names. In light of that, we're going to keep
this issue closed. There may still be serialization overhead we'll want to
address down the road, but I think that's best left for a separate issue
once the new codebase has been released and profiled.


Reply to this email directly or view it on GitHub
#884 (comment).

@pauldix for graphing purposes you need to get the list of all series, except for the retention scheme.

And as for 86400 data points for graphing... well, I know people who are doing that and they say it's useful (at my current work, one of the managers graphs all the client-related stats on one graph; it's thousands of lines, and he says he can see when something bad happens and that he needs all those thousands of lines).

Thanks for the update, will wait for 0.9.0 to retest.

While I would agree that rollup intervals should be used, the queries themselves are generated by grafana via the graphite_influxdb handler. The queries are not something we're running manually.

Any grafana dashboard with an influxdb backend will generate those queries and try to retrieve 86400 data points for the default 1-minute-resolution graphite metric series if the time range spans 2 months or more.


Getting a list of all datasets is probably the most basic operation in any DBMS. If this is a serious project, then pagination is a must. I hope this feature will be added soon.

This is fixed in 0.9.0-rc7. Actually since before then. For example you can do:

SHOW SERIES LIMIT 10 OFFSET 20
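
With that, a client can walk the whole set page by page rather than pulling it in one response. The sketch below only builds the paginated queries; the /query endpoint with db and q parameters is my understanding of the 0.9 HTTP API, so verify it against your version:

package main

import (
	"fmt"
	"net/url"
)

// Build paginated SHOW SERIES queries so a client never has to fetch the
// entire series list in a single response.
func main() {
	const pageSize = 10000
	base := "http://localhost:8086/query"
	for offset := 0; offset < 30000; offset += pageSize { // 3 pages for illustration
		q := fmt.Sprintf("SHOW SERIES LIMIT %d OFFSET %d", pageSize, offset)
		fmt.Println(base + "?db=graphite&q=" + url.QueryEscape(q))
	}
	// A real client would GET each URL and stop once a page comes back empty.
}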

Oh, that's good news :)

Is it also possible to paginate results from queries such as SELECT * FROM /.*/ LIMIT 1?
It is too expensive (in terms of time, bandwidth and memory) when there are hundreds of thousands of series :(

for those using influx 0.8 as a graphite backend:
i added support to graphite-influxdb for using elasticsearch to query metric metadata, bypassing influxdb's list series. before: 400~800ms; via ES i get <50ms most of the time, with a few outliers, but in all cases (median, upper 90th, upper) at least as good as influx.
(chart: influx-vs-es response-time comparison)
(see https://github.com/vimeo/graphite-influxdb/blob/de5dd7f37c2174bee6b7be860e31cf9635571337/get-series-influxdb-vs-es.png for more numbers)