constant state of 'Indexing'
andyczerwonka opened this issue · comments
Describe the bug
After the latest upgrade last night, my project is now in a constant state of indexing, with one of my cores pinned at 100%. The new version is unusable to me. I've attached the index error report.
Expected behavior
Indexing completes.
Operating system
Linux
Editor/Extension
VS Code
Version of Metals
v1.3.1
Extra context or search terms
I'm also noticing that many of my project now report SemanticDB errors, but I'm assuming that's because the indexing is not completing.
Thanks for reporting! Any chance to get a stack trace of the case where indexing hangs? Would probably make it much easier to figure out
If you pull in the latest spark-core into any example, you will see that it hangs on the hadoop client dependency. I just used SizeEstimator.estimate from spark-core in the example to pull it in. You should see it fail for you.
Ok, this was interesting. Our issue was caused by hadoop releasing broken sources with:
public static final String REFERRER_ORIGIN_HOST = "audit.example.org.apache.hadoop.shaded.org.;
And we couldn't find the end quote.
We assumed that sources released actually compile, which is a reasonable assumption, I thought. Added a workaround for that possibility.
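To illustrate the failure mode, here is a minimal sketch of a string-literal scan that gives up at the end of the line or the end of the input instead of searching indefinitely for a closing quote. It is not the actual Metals workaround; the class name LenientStringScanner and the method endOfStringLiteral are hypothetical and exist only for this example.

```java
import java.util.OptionalInt;

/** Hypothetical helper for illustration only; not the Metals implementation. */
final class LenientStringScanner {

    /**
     * Returns the index just past the closing quote of the string literal that
     * opens at {@code start}, or an empty result when the line or the input
     * ends before a closing quote is found.
     */
    static OptionalInt endOfStringLiteral(String source, int start) {
        if (start >= source.length() || source.charAt(start) != '"') {
            throw new IllegalArgumentException("expected an opening quote at index " + start);
        }
        int i = start + 1;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (c == '\\') {
                i += 2; // skip the backslash and the character it escapes
            } else if (c == '"') {
                return OptionalInt.of(i + 1); // properly terminated literal
            } else if (c == '\n' || c == '\r') {
                return OptionalInt.empty(); // unterminated: stop at end of line
            } else {
                i++;
            }
        }
        return OptionalInt.empty(); // unterminated: reached end of input
    }

    public static void main(String[] args) {
        // Shortened stand-in for a source line with a missing end quote.
        String broken = "String HOST = \"audit.example.org.;\nint next = 1;";
        int quote = broken.indexOf('"');
        System.out.println(endOfStringLiteral(broken, quote).isPresent());
    }
}
```

A caller that receives an empty result can skip the rest of the line or the whole file, so one malformed source file does not stall indexing for the entire dependency.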
We assumed that sources released actually compile, which is a reasonable assumption
💯 There's no way sources should be broken. I'll report that to the project.
Turns out that I was able to get rid of that dependency in my project, so there is no rush to release the fix, at least for me. I think it's a pretty rare occurrence; that said, it did come in via Spark, which I don't think is that rare.
No worries, I will just merge the workaround and hopefully this should not happen with the next release. Though I think it's worth investigating on their side why this happened. I am pretty sure we are not the only tooling that might use sources.
Any chance that it's a little more complicated?
I observe that indexing takes long for a fairly simple project, e.g. time: indexed workspace in 3m16s, where it didn't before.
When looking at the library list I also see a couple of very odd things, like the entire compilation stack being added to the library collection, while I don't see these pop up in e.g. the bloop files.
I seem to only observe this in a mono-module project, not in a multi-module project.
(Note GraalVM JDK 21, Metals version: 1.3.1)
UPDATE: the zinc stuff could be there because my project dirs also generate .bloop folders.
Still something is off:
Metals version: 1.2.2
time: indexed workspace in 29s
Metals version: 1.3.1
time: indexed workspace in 2m15s
For the same bloop files
We started indexing Java jars, but that should not cause that much of an increase 🤔
Any chance to get this as a repro or at least a basic build.sbt with the deps? The issue you are experiencing is for sure something different.
Sadly the dep tree starts with an internal library suite. So not easy to make that available, but I'll have a shot looking at the difference in file accesses between both versions using procmon.
But likely not before tomorrow.
Is the project using a lot of Java deps? Could that explain a difference in indexing? Also this should only be at the start
How do you define "Java deps" here? I would guess they're all jars.
But yes, it looks to be in that general area.
I monitored both Metals 1.2.2 and 1.3.1, specifically looking for file events in my Ivy and Maven repos.
version | file events |
---|---|
1.2.2 | 38k |
1.3.1 | 1.5m |
What's more than a bit suspicious to me is that (starting 14:26:30) over a period of 15 seconds it seems to be opening and closing the same file 3k times, each time navigating the file tree in the process.
This is just one example; it seems to be a recurring phenomenon with (all) other deps. Looks like something is unnecessarily chatty with these files?
FYI post indexing it shows 180 deps, but that's including the SBT deps (from the other "issue").
Sorry for the delayed response btw.
No worries, we started to index Java dependencies for searching dependencies, which is why I asked about it. We might have a bug there.
I can't make up my mind from browsing the sources of the different meta projects (on GH), and I know I'm freewheeling here, but...
It looks like it's somehow re-opening that jar file (multiple times even) for each source file it contains (636 files in the case of this guava jar).
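For comparison, a minimal sketch of reading every source entry while opening the archive only once is shown here. It uses the standard java.util.zip API; the class name SourceJarReader and the method readJavaSources are hypothetical, and this is not a description of how Metals actually reads jars.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for illustration only; not the Metals implementation. */
final class SourceJarReader {

    /**
     * Reads all .java entries of a sources jar into memory, keyed by entry
     * name. The archive is opened once and closed once, regardless of how
     * many entries it contains.
     */
    static Map<String, byte[]> readJavaSources(Path jar) throws IOException {
        Map<String, byte[]> sources = new LinkedHashMap<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(".java")) {
                    continue; // only source files are of interest here
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    sources.put(entry.getName(), in.readAllBytes());
                }
            }
        }
        return sources;
    }
}
```

With this shape, a jar containing 636 source files causes one open and one close of the archive rather than one or more per entry, which is the kind of difference that would show up in a file-event trace.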
Should I log a new ticket?
Sure! Makes sense. I haven't had the chance to look into it yet.