mraleph / irhydra

Tool for displaying IR used by V8 and Dart VM optimizing compilers


How are blocks marked as unreachable

tusharmath opened this issue

Can you please help me understand why a set of blocks was marked as unreachable? Here is an example —

run()

[screenshot: screen shot 2016-10-01 at 3 59 32 pm]

Archive.zip

I looked at the IR for B22. It contains nothing but a deoptimization, which is why it is marked as a dead block. You can see the logic for marking here.

Essentially, greyed-out blocks mean "don't look at these blocks, they are not interesting (i.e. they can't be reached or don't contain anything interesting)".
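
For intuition, here is a minimal made-up sketch (not code from your archive) of one common way such a deopt-only block appears: a branch that never ran while the function was collecting type feedback gets compiled as a bail-out instead of real code.

    function classify(x) {
      if (typeof x === 'number') {
        return x + 1;   // hot path, compiled normally
      }
      // If this branch was never taken before optimization, the optimized IR
      // may contain a block here holding nothing but a deoptimization.
      return x.length;
    }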

I think the opacity misled me. I am new to VM optimizations, so please forgive me for being ignorant or rather dumb.

Below are the IRs for two implementations of the subscribe function. The fast one is 4.8 times faster than the slow one, but it is also marked with a red dotted border.

slow.zip

    for (var i = 0; i < this.array.length; ++i) {
      observer.next(this.array[i])
    }
    observer.complete()
    return subscription

fast.zip

Wrapping complete with an end function makes it faster?

    for (var i = 0; i < this.array.length; ++i) {
      observer.next(this.array[i])
    }
    end()
    function end () {
      observer.complete()
    }

    return subscription

I have spent three days trying to figure this out. Can you please help me?

@tusharmath do you have a benchmark I could run locally? It would be easier for me to figure things out if there were something like that.

I have pushed the code here — https://github.com/tusharmath/observable-air

To create files for IRHydra —

  • npm run hydra
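
(For reference: IRHydra's docs suggest generating these files by running node with roughly the flags below; the hydra script presumably wraps something along these lines, and benchmark.js is just a placeholder for the actual entry point.)

    node --trace-hydrogen --trace-phase=Z --trace-deopt --code-comments \
         --hydrogen-track-positions --redirect-code-traces \
         --redirect-code-traces-to=code.asm --print-opt-code benchmark.js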

The two versions of code —

FAST
https://github.com/tusharmath/observable-air/blob/master/src/sources/From.ts#L13

SLOW
https://github.com/tusharmath/observable-air/blob/master/src/sources/From.ts#L27

I checked this out. Instead of looking at the subscribe function, you should look at the benchmark function itself (filter by defer to find it in the list). If you follow the chain of inlined functions until you arrive at subscribe, you will discover that in the slow case From2Observable.subscribe itself is inlined into the benchmark function. However, the small functions that perform the filtering and other operations are not inlined into it.

In the fast case, FromObservable.subscribe is not inlined into the benchmark (and the small functions are inlined into it).

This is where the performance difference comes from: the loop inside From2Observable.subscribe is the hottest loop in the benchmark, so the fact that in the slow case the small functions are not inlined into it (because the inliner runs out of depth budget) causes the performance degradation.
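
Very roughly (the depth numbers below are only illustrative, not taken from your trace), the two shapes look like this:

    // Slow case: From2Observable.subscribe is itself inlined into the benchmark,
    // so its callees already sit several levels deep and the depth budget runs
    // out before the small helpers reach the hot loop.
    //
    //   benchmark (defer)
    //     -> ... -> From2Observable.subscribe   (inlined, already deep)
    //                 -> observer.next          over budget: stays a real call
    //
    // Fast case: FromObservable.subscribe is optimized as its own function, so
    // inlining restarts from it and the small helpers fit within the budget.
    //
    //   FromObservable.subscribe                (not inlined into the benchmark)
    //     -> observer.next                      inlined
    //     -> observer.complete                  inlined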

Writing your code like this:

    subscribe(observer) {
        // ____     ___  ____   __________              _        ____   ___       ___
        // `MM'     `M' 6MMMMb\ `MMMMMMMMM             dM.      6MMMMb\ `MMb     dMM'
        //  MM       M 6M'    `  MM      \            ,MMb     6M'    `  MMM.   ,PMM
        //  MM       M MM        MM                   d'YM.    MM        M`Mb   d'MM
        //  MM       M YM.       MM    ,             ,P `Mb    YM.       M YM. ,P MM
        //  MM       M  YMMMMb   MMMMMMM             d'  YM.    YMMMMb   M `Mb d' MM
        //  MM       M      `Mb  MM    `            ,P   `Mb        `Mb  M  YM.P  MM
        //  MM       M       MM  MM                 d'    YM.        MM  M  `Mb'  MM
        //  YM       M       MM  MM                ,MMMMMMMMb        MM  M   YP   MM
        //   8b     d8 L    ,M9  MM      /         d'      YM. L    ,M9  M   `'   MM
        //    YMMMMM9  MYMMMM9  _MMMMMMMMM       _dM_     _dMM_MYMMMM9  _M_      _MM_
        //
        for (var i = 0; i < this.array.length; ++i) {
          observer.next(this.array[i]);
        }
        observer.complete();
        return subscription;
    }

would also make it faster (on current node) because V8 would refuse to inline the "large" function subscribe.
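
If you want to see the actual thresholds on your node version, you can list the inlining-related V8 flags (on Crankshaft-era V8 I'd expect names like --max_inlining_levels, --max_inlined_source_size and --max_inlined_nodes, but check the output rather than trusting me):

    node --v8-options | grep -i inlin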

However, this benchmark is not representative: in the real world it's unlikely that code will be this monomorphic. As soon as you start benchmarking more realistic code, e.g. code where subscribe is not monomorphic with respect to observer, you will discover that the slower case is more realistic than the faster one.
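
To make "not monomorphic with respect to observer" concrete, here is a minimal made-up sketch (the observers below are not from observable-air): once subscribe has seen observers with different shapes, the observer.next call site inside the loop is polymorphic and much harder for the compiler to handle via inlining.

    // Stand-in for the subscribe() under discussion, not the real code.
    function subscribe(array, observer) {
      for (var i = 0; i < array.length; ++i) {
        observer.next(array[i]);
      }
      observer.complete();
    }

    // Two observers with different hidden classes make the call site polymorphic.
    var logging  = { next: function (v) { console.log(v); }, complete: function () {} };
    var counting = { count: 0, next: function (v) { this.count++; }, complete: function () {} };

    subscribe([1, 2, 3], logging);
    subscribe([1, 2, 3], counting);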

Perfect! Thank you so much @mraleph.

But how did you know that I should be searching for the defer function? There are so many.

(filter by defer to find it in the list)

You can use an undocumented trick: put src:from2 into the filter. This filters the method list to include only those methods that contain from2 in their sources.

@mraleph

the small functions are not inlined into it (because the inliner runs out of depth budget)

  1. How would one figure out that the function wasn't inlined because of the depth budget?
  2. Can I view the depth in IRHYDRA?
  3. What could be other reasons for the function not getting inlined (or inlined)?

How would one figure out that the function wasn't inlined because of the depth budget?

Well, I know that the inlining depth limit is 5, so it's easy to see: the function otherwise looks inlinable :)

You can use --trace-inlining to trace inlining decisions (what was inlined and what was not), but then you have to read the output yourself. IRHydra does not parse that for you.
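
For example (benchmark.js is a placeholder for whatever script you run):

    node --trace-inlining benchmark.js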

Can I view the depth in IRHYDRA?

If you look at the animation above you can see the inlining path suite.add.defer > subscribe > subscribe > subscribe > subscribe > next. Each > represents one level of inlining, which means, for example, that next was inlined at depth 5, which is the limit (controlled by the flag --max_inlining_levels).
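
If you want to poke at the depth-budget theory yourself, one rough experiment (again, benchmark.js is a placeholder, and I haven't verified the numbers for your setup) is to raise the limit and see whether the slow version catches up:

    node --max_inlining_levels=8 benchmark.js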

What could be other reasons for the function not getting inlined (or inlined)?

There are plenty. The function can be non-optimizable by Crankshaft (e.g. it uses some unsupported ES6 construct). It can have source that is too big. It can be too big in terms of IR instructions, etc.
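
As a made-up illustration of the first reason: Crankshaft bails out entirely on certain constructs, so a function like the one below is never optimized by it and therefore never inlined into its callers.

    // Generators are one such construct that Crankshaft does not optimize.
    function* values(array) {
      for (var i = 0; i < array.length; ++i) {
        yield array[i];
      }
    }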

Thanks