How are blocks marked as unreachable
tusharmath opened this issue
Can you please help me understand why a set of blocks was marked as unreachable? Here is an example:
run()
I looked at the IR for B22. It contains nothing but deoptimization, which is why it is marked as a dead block. You can see the logic for marking here.
Essentially, greyed-out blocks mean "don't look at these blocks, they are not interesting (i.e. they can't be reached or don't contain anything interesting)".
I think the opacity misled me. I am new to VM optimizations, so please forgive me for being ignorant or rather dumb.
Below are the IRs for two implementations of the subscribe function. The fast one is 4.8 times faster than the slower one, but it is also marked with the red dotted border.
for (var i = 0; i < this.array.length; ++i) {
  observer.next(this.array[i])
}
observer.complete()
return subscription
fast.zip
Why does wrapping complete with an end function make it faster?
for (var i = 0; i < this.array.length; ++i) {
  observer.next(this.array[i])
}
end()
function end () {
  observer.complete()
}
return subscription
I have spent three days trying to figure out the why of it. Can you please help me?
@tusharmath do you have a benchmark I could run locally? It would be easier for me to figure things out if there was something like that.
I have pushed the code here — https://github.com/tusharmath/observable-air
- git clone https://github.com/tusharmath/observable-air
- npm install
- npm run benchmark
code: https://github.com/tusharmath/observable-air/blob/master/lib/benchmark.js
To create files for IRHydra —
npm run hydra
The two versions of code —
FAST
https://github.com/tusharmath/observable-air/blob/master/src/sources/From.ts#L13
SLOW
https://github.com/tusharmath/observable-air/blob/master/src/sources/From.ts#L27
I checked this out. Instead of looking at the subscribe function, you should look at the benchmark function itself (filter by defer to find it in the list). If you follow the chain of inlined functions until you arrive at subscribe, you will discover that in the slow case From2Observable.subscribe itself is inlined into the benchmark function. However, the small functions that perform filtering and operations are not inlined into it.
In the fast case FromObservable.subscribe is not inlined into the benchmark (and the small functions are inlined into it).
This is where the performance difference comes from: the loop inside From2Observable.subscribe is the hottest loop in the benchmark, so when the small functions are not inlined into it in the slow case, because the inliner runs out of depth budget, performance degrades.
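To make the depth-budget argument concrete, here is a toy model of it. It is only a sketch: the chain of names is taken from the inlining path quoted in this thread, `predicate` is a hypothetical stand-in for one of the small functions, and the real inliner also weighs other things (source size, IR size, and so on).

```javascript
// Toy model only, not V8's actual algorithm.
const MAX_INLINING_LEVELS = 5; // depth limit cited in this thread

// chain[0] is the function being optimized (the root);
// every later entry is called from the one before it.
function inlineDecisions(chain, maxDepth = MAX_INLINING_LEVELS) {
  return chain.slice(1).map((name, index) => {
    const depth = index + 1; // the first callee sits at depth 1
    return { name, depth, inlined: depth <= maxDepth };
  });
}

// Slow case: the whole subscribe chain is inlined into the benchmark
// function, so whatever `next` calls would land at depth 6.
const slow = inlineDecisions([
  'suite.add.defer', 'subscribe', 'subscribe', 'subscribe', 'subscribe',
  'next',
  'predicate', // hypothetical small function called from next
]);

// Fast case: the innermost subscribe is optimized as its own root,
// so the budget starts over and the small function fits.
const fast = inlineDecisions(['subscribe', 'next', 'predicate']);

console.log(slow[slow.length - 1]); // { name: 'predicate', depth: 6, inlined: false }
console.log(fast[fast.length - 1]); // { name: 'predicate', depth: 2, inlined: true }
```

In this model the only thing that changes between the two cases is where the optimization root sits, which is the point being made above.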
Writing your code like this:
subscribe(observer) {
// ____ ___ ____ __________ _ ____ ___ ___
// `MM' `M' 6MMMMb\ `MMMMMMMMM dM. 6MMMMb\ `MMb dMM'
// MM M 6M' ` MM \ ,MMb 6M' ` MMM. ,PMM
// MM M MM MM d'YM. MM M`Mb d'MM
// MM M YM. MM , ,P `Mb YM. M YM. ,P MM
// MM M YMMMMb MMMMMMM d' YM. YMMMMb M `Mb d' MM
// MM M `Mb MM ` ,P `Mb `Mb M YM.P MM
// MM M MM MM d' YM. MM M `Mb' MM
// YM M MM MM ,MMMMMMMMb MM M YP MM
// 8b d8 L ,M9 MM / d' YM. L ,M9 M `' MM
// YMMMMM9 MYMMMM9 _MMMMMMMMM _dM_ _dMM_MYMMMM9 _M_ _MM_
//
for (var i = 0; i < this.array.length; ++i) {
observer.next(this.array[i]);
}
observer.complete();
return subscription;
}
would also make it faster (on the current node) because V8 would refuse to inline the "large" function subscribe.
However, this case is not representative: in the real world it's unlikely that code will be this monomorphic. As soon as you start benchmarking more realistic code, e.g. code where subscribe is not monomorphic with respect to observer, you will discover that the slower case is more realistic than the faster one.
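As a rough illustration of what "not monomorphic with respect to observer" can look like, here is a minimal sketch. The `subscribe` function and both observer classes are hypothetical stand-ins, not code from the repository.

```javascript
// Hypothetical stand-in for the subscribe loop discussed in this thread.
function subscribe(array, observer) {
  for (let i = 0; i < array.length; ++i) {
    observer.next(array[i]);
  }
  observer.complete();
}

// Two observer classes (both hypothetical) with different methods and fields.
class SumObserver {
  constructor() { this.sum = 0; this.done = false; }
  next(value) { this.sum += value; }
  complete() { this.done = true; }
}

class CountObserver {
  constructor() { this.done = false; this.count = 0; }
  next() { this.count += 1; }
  complete() { this.done = true; }
}

const data = [1, 2, 3, 4];
const sum = new SumObserver();
const count = new CountObserver();

// The same observer.next call site now sees two kinds of receiver,
// so it is no longer monomorphic.
subscribe(data, sum);
subscribe(data, count);

console.log(sum.sum, count.count); // 10 4
```

A benchmark that only ever passes one observer class to subscribe does not exercise this situation, which is why it can look better than typical application code.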
Perfect! Thank you so much @mraleph.
But how did you know that I should be searching for the defer function? There are so many.
(filter by defer to find it in the list)
You can use an undocumented trick: use src:from2 in the filter. This filters the method list to only include those methods that contain from2 in their sources.
small functions are not inlined into it, because inliner runs out of depth budget
- How would one figure out that the function wasn't inlined because of the depth budget?
- Can I view the depth in IRHYDRA?
- What could be other reasons for the function not getting inlined (or inlined)?
How would one figure out that the function wasn't inlined because of the depth budget?
Well, I know that the inlining depth limit is 5, so it's easy to see, because the function looks otherwise inlinable :)
You can use --trace-inlining to trace inlining decisions (what was inlined and what was not), but then you have to read the output yourself. IRHydra does not parse that for you.
Can I view the depth in IRHYDRA?
If you look at the animation above, you can see the inlining path suite.add.defer > subscribe > subscribe > subscribe > subscribe > next. Each > represents one level of inlining, which means, for example, that next was inlined at depth 5, which is the limit (controlled by the flag --max_inlining_levels).
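If you want to read the depth off such a path without counting by hand, a small helper can do it. This is just a sketch that treats the path as plain text copied from the UI; `inliningDepths` is a hypothetical helper, not something IRHydra provides.

```javascript
// Path copied from the comment above.
const path =
  'suite.add.defer > subscribe > subscribe > subscribe > subscribe > next';

// Hypothetical helper: split on ">" and number the segments.
// The root function is depth 0; each ">" adds one level of inlining.
function inliningDepths(text) {
  return text.split('>').map((segment, depth) => ({
    name: segment.trim(),
    depth,
  }));
}

const frames = inliningDepths(path);
const last = frames[frames.length - 1];

console.log(`${last.name} was inlined at depth ${last.depth}`);
// next was inlined at depth 5
```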
What could be other reasons for the function not getting inlined (or inlined)?
There are plenty. The function can be non-optimizable by Crankshaft (e.g. it uses some unsupported ES6 construct). Its source can be too big. It can be too big in terms of IR instructions, etc.
Thanks