iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Home Page: http://iree.dev/

Add stream allocation policy attr interface for controlling allocation behavior.

benvanik opened this issue

We need a way for frontends to specify allocation behavior that connects with the stream allocation phase.
The current idea is to add an IREE::Stream::AllocationPolicyAttrInterface that lets us control the various stages of allocation (copy-on-write, emplace, copy elision, lifetime refinement, and scheduling). Stream dialect ops that perform resource allocation can carry the optional attr, à la affinity, and maybe implement an IREE::Stream::AllocationOpInterface (debatable if needed). Higher-level dialects can use a stream.allocation_policy dialect attr to indicate their desired behavior, though whether that attr survives the chaos of torch/linalg/etc. is undefined, and likely only IREE ops will handle it correctly.

Similar to the util inlining policy, we can have a few well-known named implementations that control the common settings:

  // error if an allocation is required
  stream.allocation_policy = #stream.allocate.never
  // warn if an allocation is required but continue anyway
  stream.allocation_policy = #stream.allocate.warn
  // force an allocation even if it could alias
  stream.allocation_policy = #stream.allocate.always
  // whatever is needed (default if omitted)
  stream.allocation_policy = #stream.allocate.auto
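
As a rough illustration of how the four named policies above might resolve during the allocation phase, here is a minimal C++ sketch. Everything in it is hypothetical: `AllocationPolicy`, `AllocationAction`, `AllocationDecision`, and `resolveAllocation` are names invented for this example and are not existing IREE APIs, and the reading of `auto` as "alias when possible, otherwise allocate" is an assumption about what "whatever is needed" means.

```cpp
// Hypothetical C++ mirror of the four named policies; not an IREE API.
enum class AllocationPolicy { Never, Warn, Always, Auto };

// What the allocation phase would do for a single value.
enum class AllocationAction { Alias, Allocate, Error };

struct AllocationDecision {
  AllocationAction action;
  bool emitWarning;  // true when the compiler should warn but continue
};

// couldAlias: assumed result of an analysis that says whether the value can
// reuse existing storage instead of needing a fresh allocation.
AllocationDecision resolveAllocation(AllocationPolicy policy, bool couldAlias) {
  switch (policy) {
    case AllocationPolicy::Always:
      // Force an allocation even if the value could alias.
      return {AllocationAction::Allocate, false};
    case AllocationPolicy::Never:
      // Error if an allocation is required.
      if (couldAlias) return {AllocationAction::Alias, false};
      return {AllocationAction::Error, false};
    case AllocationPolicy::Warn:
      // Warn if an allocation is required but continue anyway.
      if (couldAlias) return {AllocationAction::Alias, false};
      return {AllocationAction::Allocate, true};
    case AllocationPolicy::Auto:
      break;
  }
  // Auto (assumed default): alias when possible, allocate otherwise.
  return {couldAlias ? AllocationAction::Alias : AllocationAction::Allocate,
          false};
}
```

Under these assumptions, `never` and `warn` differ only in what happens when aliasing is impossible, while `always` ignores the aliasing analysis entirely.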

For specifying stream->hal allocation lowering we may want a different attribute, possibly even a hal.allocation_policy, that specifies the buffer parameters (memory type, usage, placement, etc.). Where the stream allocation policy controls which buffers are allocated, the hal allocation policy would specify how those buffers are allocated.
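
To make the which/how split concrete, the following C++17 sketch keeps the two decisions separate: the stream-level decision is reduced to a boolean, and a hypothetical HAL-level policy only supplies buffer parameters. All names here (`HalAllocationPolicy`, `BufferParams`, `lowerAllocation`, and the flag constants) are assumptions made for illustration and do not correspond to actual IREE HAL types.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical bit flags; the real HAL enums may look different.
enum MemoryTypeBits : std::uint32_t {
  kMemoryDeviceLocal = 1u << 0,
  kMemoryHostVisible = 1u << 1,
};
enum BufferUsageBits : std::uint32_t {
  kUsageTransfer = 1u << 0,
  kUsageDispatch = 1u << 1,
};

// "How" a buffer is allocated: memory type, usage, and placement.
struct HalAllocationPolicy {
  std::uint32_t memoryType;
  std::uint32_t usage;
  std::uint32_t placement;  // hypothetical device/queue ordinal
};

// Parameters handed to the (imagined) buffer allocator after lowering.
struct BufferParams {
  std::uint64_t byteLength;
  std::uint32_t memoryType;
  std::uint32_t usage;
  std::uint32_t placement;
};

// streamWantsAllocation is the "which" decision made by the stream policy;
// the HAL policy is consulted only once an allocation is actually needed.
std::optional<BufferParams> lowerAllocation(bool streamWantsAllocation,
                                            const HalAllocationPolicy& hal,
                                            std::uint64_t byteLength) {
  if (!streamWantsAllocation) return std::nullopt;
  return BufferParams{byteLength, hal.memoryType, hal.usage, hal.placement};
}
```

Keeping the HAL policy out of the "should we allocate" decision means the stream phase can run without knowing anything about target memory types.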