iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Home Page: http://iree.dev/

Restructure C/C++ directories to reflect project structure

stellaraccident opened this issue

The flat C++ tree and the redundant top-level iree/ directory have long been things that we have regretted. They have been that way since day 1, a holdover from the project's origins in a monorepo that used a very flat structure for such things. Having everything flat does not make dependency order clear, and it forces us to keep a careful lookout for forbidden dependency edges. We started transitioning to a new structure when reworking the Python packaging, since that has to be directory based, and are now following up with the rest of the tree.

New structure:

  • compiler/src/iree (from iree/compiler)
  • runtime/
    • bindings/python
    • src/iree
      • base
      • builtins
      • hal
      • modules
      • runtime (this is unfortunate but sticking with it for now - it is high level API)
      • schemas
      • task
      • testing
      • vm
  • samples/
  • tools/
  • integration_tests/

Dependency rules:

  • runtime/ can only depend within itself
  • compiler/ generally depends within itself (testing relies on tools/) but with some exceptions:
    • src/iree/ConstEval depends on the runtime/ (it recursively uses the compiler)
  • tools/ can depend on compiler/ and runtime/
  • samples/ can depend on anything
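To make the rules above concrete, here is a minimal sketch of them as a lookup table plus a check. It is illustrative only and is not how the build enforces anything today: `ALLOWED_DEPS`, `EXCEPTIONS`, and `is_allowed` are hypothetical names, the ConstEval path is taken as written in the list, and the "testing relies on tools/" case is not modeled.

```python
# Hypothetical encoding of the dependency rules listed above.
# Keys and values are top-level directories.
ALLOWED_DEPS = {
    "runtime": {"runtime"},
    "compiler": {"compiler"},
    "tools": {"tools", "compiler", "runtime"},
}

# Listed exceptions: (source path prefix, top-level directory it may use).
EXCEPTIONS = {
    ("compiler/src/iree/ConstEval", "runtime"),
}


def top_level(path: str) -> str:
    """Return the first path component, e.g. 'runtime' for 'runtime/src/iree'."""
    return path.split("/", 1)[0]


def is_allowed(src: str, dst: str) -> bool:
    """Check whether code under `src` may depend on code under `dst`."""
    src_top, dst_top = top_level(src), top_level(dst)
    # samples/ can depend on anything.
    if src_top == "samples":
        return True
    # Directories the rules do not mention are rejected by this sketch.
    if dst_top in ALLOWED_DEPS.get(src_top, set()):
        return True
    # Fall back to the explicitly listed exceptions.
    return any(
        (src == prefix or src.startswith(prefix + "/")) and target == dst_top
        for prefix, target in EXCEPTIONS
    )
```

For example, `is_allowed("runtime/src/iree/hal", "compiler/src/iree")` is rejected, while the ConstEval exception lets `compiler/src/iree/ConstEval` reach into `runtime/`.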

There are a couple of misc gotchas that we will look to correct over time (e.g. runtime Python bindings need to embed tools/ binaries).

The move will happen piece by piece:

  • iree/runtime -> runtime/src/iree
  • iree/compiler -> compiler/src/iree
  • iree/tools -> tools/
    • will require reworking some of the libraries in here that really should be elsewhere
  • iree/test -> integration_tests/
  • iree/samples -> samples/

Once we're done with this, I'd like to rework the Bazel->CMake target mappings to more closely follow the directory structure and to correct issues like this (in runtime/bindings/python/CMakeLists.txt):

    # TODO: Update CMake target/path mangling rules to make this syntactically
    # rooted on "iree" in some way.
    runtime_bindings_python_PyExtRt

So far, I've kept the rules the same to keep diffs down, but once the move is done, we can do one NFC (no functional change) commit to reset it.
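For readers unfamiliar with the mangling being discussed, the sketch below shows one way a target name like the one above could fall out of the directory path. This is an assumption inferred from that single example, not a description of the actual CMake macros, and `mangle_target` is a hypothetical helper.

```python
def mangle_target(package_dir: str, name: str) -> str:
    """Join the path components and the rule name with underscores.

    Assumed behavior, inferred from the one target name quoted above.
    """
    parts = [p for p in package_dir.split("/") if p]
    return "_".join(parts + [name])


target = mangle_target("runtime/bindings/python", "PyExtRt")
print(target)  # runtime_bindings_python_PyExtRt
```

Under this assumption the name is rooted on the directory ("runtime") rather than on "iree", which is the mismatch the TODO refers to.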

Just catching up on things. What's the rationale for this structure specifically? Ben and I had long ago discussed moving all runtime stuff into its own directory separate from compiler, so +1 on the general idea, but I'm not sure where the src/ directories crept in. It seems like rather than having compiler/src/iree/compiler and runtime/src/iree/runtime we could've just had iree/compiler and iree/runtime (which is what I had been thinking previously).

Judgment call mostly: these components are effectively sub-projects, and having a consistent way to organize them as such was a priority. We've also long wanted to avoid the "monorepo creep" of having a single, unified source/include tree. Being explicit about these boundaries also means that you have a sensible delineation for downstream source packages and such. Having some kind of "src" directory in a multi-language, C-dominant project is not uncommon, as that gives you a place to root your include tree while also having peers of other languages. The price you pay is a small bit of directory nesting to make the namespaces line up.

The runtime/ directory is presently closer to the ultimate vision of this. The compiler/ directory still has some funny src/API/python kind of nesting that we need to untangle.

What finally clinched it for me was looking at where to put project/package description files (e.g. setup.py, pyproject.toml), which are the "user interfaces" to the project in terms of packaging. Having those be at a sensible root and not just commingled in a C directory tree made it all make more sense from a user/docs perspective.

As a final benefit, having to take an explicit dep to get access to the include tree adds a bit of robustness to the source organization, since it is harder to have an illegal dependency (most relevant with CMake and external things that depend on IREE).

Thanks for explaining 🙂 SGTM

I like the new project structure, but I would prefer if there were fewer top level directories.

Would we consider moving llvm-external-projects/ under compiler/? Or third_party/?

Beyond that, I could see

  • benchmarks/, build_tools/, and tools/ merging in some way (and .github/, but I think we're stuck with it at the root)
  • benchmarks/ and what's in iree/test/ could also share a top level directory
  • samples/ could be ejected to https://github.com/google/iree-samples, though that would make development a little trickier

I don't think I have a strong opinion on being super minimal on top level directories... At least not to the point of degrading anything just to shave one off.

Regarding llvm-external-projects, now that that just has one thing in it, it is easier to think about. We can probably just put iree-dialects somewhere.

(also, there is still some organization needed within the compiler directory so might make sense to move iree-dialects in combination with looking in detail at that)

As originally described, this restructuring is complete now.

A few loose ends:

  • We could move benchmarks under tests/
  • compiler/ could use some organization
  • iree-dialects could find a new home