grafana / phlare

🔥 horizontally-scalable, highly-available, multi-tenant continuous profiling aggregation system

Home Page: https://grafana.com/oss/phlare/

Panic internal: unknown: error parquet file 'mappings.parquet' contains no rows

mattiaforc opened this issue

Describe the bug

I managed to run Phlare once, actually scraping a Spring Boot application running on my laptop; after restarting the containers, a panic shows up in the Phlare logs and the interface won't show anything anymore.

To Reproduce

Steps to reproduce the behavior:

I started from the docker-compose example on the master branch and modified both docker-compose.yaml and phlare.yaml as follows:
phlare.yaml:

scrape_configs:
  - job_name: "java"
    scrape_interval: "15s"
    static_configs:
      - targets: ["localhost:8012"]
    profiling_config:
      pprof_config:
        block: { enabled: false }
        goroutine: { enabled: false }
        memory: { enabled: false }
        mutex: { enabled: false }

docker-compose.yml:

services:
  phlare:
    image: grafana/phlare:latest
    command: -config.file=/etc/phlare/config.yaml
    volumes:
      - ./phlare.yaml:/etc/phlare/config.yaml
      - data:/data
    network_mode: host

  grafana:
    image: grafana/grafana:main
    environment:
      - GF_FEATURE_TOGGLES_ENABLE=flameGraph
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
      - GF_DIAGNOSTICS_PROFILING_ENABLED=true
      - GF_DIAGNOSTICS_PROFILING_ADDR=0.0.0.0
      - GF_DIAGNOSTICS_PROFILING_PORT=6060
    volumes:
      - ./datasource.yaml:/etc/grafana/provisioning/datasources/datasources.yml
    network_mode: host

volumes:
  data:

    # yaml-language-server: $schema=https://raw.githubusercontent.com/compose-spec/compose-spec/master/schema/compose-spec.json

Then I followed the quickstart for JVM and added the profiling endpoint to my Spring Boot application.

I ran docker-compose up -d the first time and everything went fine: I could see my "java" profiles correctly. I shut both Grafana and Phlare down with docker-compose down and added the following to phlare.yaml (as seen in the example):

  - job_name: "phlare"
    scrape_interval: "15s"
    static_configs:
      - targets: ["localhost:4100"]
  - job_name: "grafana"
    scrape_interval: "15s"
    static_configs:
      - targets: ["localhost:6060"]

Restarted the whole thing, and now the interface shows:
[screenshot attached showing the query error in Grafana]
It does not matter which profile type I choose (e.g. process_cpu - cpu): the first request always returns "internal: unknown: error parquet file 'mappings.parquet' contains no rows", and the following ones return "internal: internal: stream error: stream ID 2757; INTERNAL_ERROR; received from peer" with a different stream ID for each request.
You can find the phlare container logs attached below.

I tried to delete both containers along with their volumes using:

docker-compose rm --stop --force -v grafana
docker-compose rm --stop --force -v phlare

but it changed nothing.

Note: due to corporate proxy settings I tend to prefer running docker-compose with network_mode: host, because I have some tricky iptables configuration on my laptop - but I guess this is not the cause of the problem, since it worked just fine the first time I ran it.

Expected behavior

There is probably something wrong with the steps I took or the configuration I changed, but I think it would be better to:

  1. Not panic, and handle this case gracefully (a rough sketch of what that could look like follows this list).
  2. Be more explicit about the problem, if this is a simple case of misconfiguration.
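
For illustration only, a minimal Go sketch of point 1 (hypothetical names, not the actual Phlare code): guard the parquet lookup so that a file which was never opened, or which contains no rows, produces a descriptive error instead of a nil-pointer panic. It assumes parquet-go's (*File).Root and (*File).NumRows accessors.

package main

import (
	"errors"
	"fmt"

	"github.com/segmentio/parquet-go"
)

// safeRoot is a hypothetical guard: it refuses to touch a parquet file that
// was never opened or holds no rows, instead of dereferencing a nil *File.
func safeRoot(f *parquet.File) (*parquet.Column, error) {
	if f == nil {
		return nil, errors.New("parquet file was not opened")
	}
	if f.NumRows() == 0 {
		return nil, errors.New("parquet file contains no rows")
	}
	return f.Root(), nil
}

func main() {
	// Simulate the failure path from the stack trace: the *parquet.File is
	// nil because opening mappings.parquet failed earlier.
	if _, err := safeRoot(nil); err != nil {
		fmt.Println("query rejected gracefully:", err)
	}
}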

Environment

  • Infrastructure: laptop
  • Deployment tool: docker-compose

Additional Context

Phlare container logs:

level=info caller=gokit.go:72 caller=server.go:288 http=[::]:4100 grpc=[::]:9095 msg="server listening on addresses"
level=info caller=memberlist_client.go:436 ts=2022-11-16T15:06:17.808624739Z msg="Using memberlist cluster label and node name" cluster_label= node=XXXX
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.810108291Z msg=initialising module=server
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.810183988Z msg=initialising module=memberlist-kv
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.810185433Z msg=initialising module=agent
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.810222756Z msg=initialising module=ring
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.810236679Z msg=initialising module=usage-stats
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.810297497Z msg=initialising module=ingester
level=info caller=lifecycler.go:547 ts=2022-11-16T15:06:17.810329404Z msg="not loading tokens from file, tokens file path is empty"
level=info caller=lifecycler.go:576 ts=2022-11-16T15:06:17.904703199Z msg="instance not found in ring, adding with no tokens" ring=ingester
level=info caller=ring.go:263 ts=2022-11-16T15:06:17.904710882Z msg="ring doesn't exist in KV store yet"
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.904790839Z msg=initialising module=distributor
level=info caller=module_service.go:82 ts=2022-11-16T15:06:17.904820071Z msg=initialising module=querier
level=info caller=phlare.go:301 ts=2022-11-16T15:06:17.90489977Z msg="Phlare started" version="(version=main-8cdca06f, branch=main, revision=8cdca06f)"
level=info caller=lifecycler.go:416 ts=2022-11-16T15:06:17.905032632Z msg="auto-joining cluster after timeout" ring=ingester
level=info caller=agent.go:70 ts=2022-11-16T15:06:22.810660132Z msg="received target groups" job=java
level=info caller=target.go:74 ts=2022-11-16T15:06:22.810705154Z msg="syncing target groups" job=java
level=info caller=agent.go:70 ts=2022-11-16T15:06:22.810778947Z msg="received target groups" job=phlare
level=info caller=target.go:74 ts=2022-11-16T15:06:22.810802034Z msg="syncing target groups" job=phlare
level=info caller=agent.go:70 ts=2022-11-16T15:06:22.810845186Z msg="received target groups" job=grafana
level=info caller=target.go:74 ts=2022-11-16T15:06:22.810864273Z msg="syncing target groups" job=grafana
2022/11/16 15:07:38 http2: panic serving 192.168.0.11:54568: runtime error: invalid memory address or nil pointer dereference
goroutine 1112 [running]:
golang.org/x/net/http2.(*serverConn).runHandler.func1()
	golang.org/x/net@v0.0.0-20220812174116-3211cb980234/http2/server.go:2245 +0x145
panic({0x3038fc0, 0x58cbfd0})
	runtime/panic.go:838 +0x207
github.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5.1()
	github.com/opentracing-contrib/go-stdlib@v1.0.0/nethttp/server.go:150 +0x139
panic({0x3038fc0, 0x58cbfd0})
	runtime/panic.go:838 +0x207
github.com/segmentio/parquet-go.(*File).Root(...)
	github.com/segmentio/parquet-go@v0.0.0-20220914222423-67dbe8d21ca5/file.go:278
github.com/grafana/phlare/pkg/phlaredb/query.GetColumnIndexByPath(0x0, {0x3716517?, 0xc001a7f218?})
	github.com/grafana/phlare/pkg/phlaredb/query/util.go:11 +0x5f
github.com/grafana/phlare/pkg/phlaredb.(*parquetReader[...]).columnIter(0xc000205358, {0x3d404e8, 0xc001820480}, {0x3716517, 0xb}, {0x3d30490, 0xc001593f38}, {0x3716517, 0xb})
	github.com/grafana/phlare/pkg/phlaredb/block_querier.go:834 +0x78
github.com/grafana/phlare/pkg/phlaredb.(*singleBlockQuerier).SelectMatchingProfiles(0xc0002051e0, {0x3d40440?, 0xc002b88640?}, 0xc001d665f0)
	github.com/grafana/phlare/pkg/phlaredb/block_querier.go:595 +0x8df
github.com/grafana/phlare/pkg/phlaredb.(*PhlareDB).MergeProfilesLabels(0x0?, {0x3d404e8, 0xc00205cba0}, 0xc000a6f200)
	github.com/grafana/phlare/pkg/phlaredb/phlaredb.go:424 +0xf3e
github.com/grafana/phlare/pkg/ingester.(*Ingester).MergeProfilesLabels.func1(0xc0004bf800?)
	github.com/grafana/phlare/pkg/ingester/query.go:47 +0x28
github.com/grafana/phlare/pkg/ingester.(*Ingester).forInstance(0x9cf8c1be0?, {0x3d404e8?, 0xc00205cba0?}, 0xc002619a60)
	github.com/grafana/phlare/pkg/ingester/ingester.go:179 +0x134
github.com/grafana/phlare/pkg/ingester.(*Ingester).MergeProfilesLabels(0x40d9a7?, {0x3d404e8?, 0xc00205cba0?}, 0xc002619a01?)
	github.com/grafana/phlare/pkg/ingester/query.go:46 +0x5b
github.com/bufbuild/connect-go.NewBidiStreamHandler[...].func1({0x7f09cfb03738, 0xc00004c200})
	github.com/bufbuild/connect-go@v1.0.0/handler.go:148 +0x87
github.com/grafana/phlare/pkg/tenant.(*authInterceptor).WrapStreamingHandler.func1({0x3d404e8, 0xc00205cae0}, {0x7f09cfb03738, 0xc00004c200})
	github.com/grafana/phlare/pkg/tenant/interceptor.go:67 +0xd1
github.com/bufbuild/connect-go.(*Handler).ServeHTTP(0xc0000a4f50, {0x3d3f670, 0xc002b885c0}, 0xc0001df400)
	github.com/bufbuild/connect-go@v1.0.0/handler.go:213 +0x624
github.com/gorilla/mux.(*Router).ServeHTTP(0xc00026a600, {0x3d3f670, 0xc002b885c0}, 0xc0001df200)
	github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1cf
github.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5({0x3d32f20?, 0xc0024580d8}, 0xc0001def00)
	github.com/opentracing-contrib/go-stdlib@v1.0.0/nethttp/server.go:154 +0x623
net/http.HandlerFunc.ServeHTTP(0x0?, {0x3d32f20?, 0xc0024580d8?}, 0x0?)
	net/http/server.go:2084 +0x2f
golang.org/x/net/http2.(*serverConn).runHandler(0x0?, 0x0?, 0x0?, 0x0?)
	golang.org/x/net@v0.0.0-20220812174116-3211cb980234/http2/server.go:2252 +0x83
created by golang.org/x/net/http2.(*serverConn).processHeaders
	golang.org/x/net@v0.0.0-20220812174116-3211cb980234/http2/server.go:1957 +0x59b

Thank you so much for the report, we'll look into it ASAP.

OK, so it seems that for Java the mapping is not saved, which by itself shouldn't be a problem. But because we error out early, we stop opening the other files, even though we did already open the TSDB file. So Phlare thinks the database is ready when it isn't, and that's how we panic.
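
A rough, self-contained Go sketch of that failure mode (hypothetical names, not the actual Phlare code): a block should only be marked ready once every one of its files has opened successfully; if any open fails, the already-opened files are rolled back and the block stays unqueryable.

package main

import (
	"errors"
	"fmt"
)

// blockFile is an illustrative stand-in for one of the per-block files
// Phlare opens (TSDB index, mappings.parquet, profiles.parquet, ...).
type blockFile struct {
	name   string
	opened bool
	err    error // injected open error, only for this demo
}

func (f *blockFile) open() error {
	if f.err != nil {
		return f.err
	}
	f.opened = true
	return nil
}

type blockQuerier struct {
	files []*blockFile
	ready bool
}

// open marks the block ready only if every file opened; on failure it rolls
// back whatever was already opened, so queries are rejected instead of
// hitting a half-opened block and panicking.
func (b *blockQuerier) open() error {
	for _, f := range b.files {
		if err := f.open(); err != nil {
			for _, o := range b.files {
				o.opened = false
			}
			return fmt.Errorf("opening %s: %w", f.name, err)
		}
	}
	b.ready = true
	return nil
}

func main() {
	b := &blockQuerier{files: []*blockFile{
		{name: "index.tsdb"},
		{name: "mappings.parquet", err: errors.New("contains no rows")},
		{name: "profiles.parquet"},
	}}
	if err := b.open(); err != nil {
		fmt.Println("block not ready:", err)
	}
	fmt.Println("ready:", b.ready) // false: queries should be refused, not panic
}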

We'll fix this very soon. Again, thank you for the detailed report; it has sped up the process of finding the actual issue.

@cyriltovena Glad to hear that! Thanks for the quick response! I'm looking forward to using Phlare as it looks very promising, great work!