facebookincubator / cinder

Cinder is Meta's internal performance-oriented production version of CPython.

Home Page: https://trycinder.com


build error: listnode.gcda missing-profile

belm0 opened this issue

Seen on a stock ubuntu:20.04 docker image with --enable-optimizations and NOT --enable-shared.

The build is fine with --enable-shared.

    >> Objects/accu.o
    Parser/listnode.c: In function ‘list1node’:
    Parser/listnode.c:66:1: error: ‘/cinder/Parser/listnode.gcda’ profile count data file not found [-Werror=missing-profile]
       66 | }
          | ^
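
For reference, a rough sketch of the failing setup (the package list, paths, and branch below are assumptions for illustration, not copied from the build above):

    # Hypothetical reproduction inside a stock ubuntu:20.04 container
    docker run -it ubuntu:20.04 bash
    apt-get update && apt-get install -y build-essential git zlib1g-dev libssl-dev libffi-dev
    git clone -b cinder/3.8 https://github.com/facebookincubator/cinder /cinder
    cd /cinder
    ./configure --enable-optimizations      # note: no --enable-shared
    make -j"$(nproc)"                       # fails during the PGO "use" phase with the missing-profile error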

This error is primarily caused by a difference in which code paths we actually exercise during the profiling stage. We have various tests disabled because they're either flaky or do unsafe things to the host system (like changing the time), and we don't normally build with PGO anyway, preferring the sampled approach taken by clang's AutoFDO since it's easier to manage at scale.

I have a change up internally to suppress this error, but even with it suppressed the resulting PGO data still won't be ideal, since profile collection currently happens with the JIT disabled. And because PGO data is currently collected by running the tests, simply enabling the JIT during that run won't do much either: it would primarily profile the JIT compilation itself rather than the JIT'd code.

Until that lands, you can get things building by changing this line: https://github.com/facebookincubator/cinder/blob/cinder/3.8/configure#L6784
to:

    PGO_PROF_USE_FLAG="-fprofile-use -fprofile-correction -Wno-error=missing-profile"

The caveat is that this only fixes GCC on Linux; the clang PGO paths were removed at some point, so the change I mentioned above has to re-add them.
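
If you'd rather not edit configure by hand, a sed one-liner along these lines should apply the same change (this assumes the existing line is exactly PGO_PROF_USE_FLAG="-fprofile-use -fprofile-correction", as on the linked 3.8 branch):

    # Hypothetical one-liner; adjust the pattern if the line in your checkout differs.
    sed -i 's/PGO_PROF_USE_FLAG="-fprofile-use -fprofile-correction"/PGO_PROF_USE_FLAG="-fprofile-use -fprofile-correction -Wno-error=missing-profile"/' configure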

Local testing (with the caveat that these numbers were collected in an Ubuntu 22.04 image under WSL2, with clang and a couple of other fixes applied for the newer compilers) puts Cinder in interpreter mode at ~5% faster on average than stock Python 3.8 built and run the same way (measured with the tooling mentioned in #72), though that average hides wins on some benchmarks and regressions on others. Some adjustments appear to be needed to get the benchmarks running in a JIT-compatible way: getting them to use the JIT in the first place needs a bit of work, and even then they appeared to spawn a separate process for every run, which would discard the JIT'd code on every run.
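
To illustrate that last point, a hypothetical sketch of the difference (bench.py and bench.run() are made-up names, and this assumes the JIT is enabled with -X jit):

    # Each invocation below is a fresh process, so any code the JIT compiled
    # during the previous run is thrown away before it can pay off.
    for i in 1 2 3; do ./python -X jit bench.py; done

    # Keeping the repetitions inside one process lets later iterations reuse
    # code the JIT compiled during earlier ones.
    ./python -X jit -c 'import bench; [bench.run() for _ in range(3)]'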

@Orvid is there more to be done here?

5c04dc5 should have fixed this particular issue.