apache / celeborn

Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.

Home Page: https://celeborn.apache.org/

Support registering shuffle tasks in map partition mode

RexXiong opened this issue · comments

Abstract

Grouping each partition location's data by (partitionId, attemptId) into a partition result group is a necessary step toward supporting map partition, because every map task attempt generates a new set of results. Once that grouping is in place, we can support registering map partition tasks. The image below illustrates this.
image

Implementation

  1. Encode partitionId[Int] = (attemptId: 8 bits, rawPartitionId: 24 bits).
  2. The epoch of each group is managed by its own partition id, as before; map and reduce partitions are handled the same way.
  3. Support registering a shuffle task in map partition mode (the first registration offers and reserves all slots for the attemptId=0 task).
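
The encoding in step 1 could be sketched as follows. This is only an illustration, not Celeborn's actual code: the class name `PartitionIdCodec` and its method names are hypothetical, and placing attemptId in the high 8 bits is an assumption read from the order in which the fields are listed.

```java
// Hypothetical helper: packs (attemptId, rawPartitionId) into one 32-bit int.
// Assumption: attemptId occupies the high 8 bits, rawPartitionId the low 24 bits.
public final class PartitionIdCodec {
    private static final int RAW_ID_BITS = 24;
    private static final int RAW_ID_MASK = (1 << RAW_ID_BITS) - 1; // 0x00FFFFFF
    private static final int MAX_ATTEMPT_ID = 0xFF;                // 8 bits

    private PartitionIdCodec() {}

    // Combine the two fields, rejecting values that do not fit their bit width.
    public static int encode(int attemptId, int rawPartitionId) {
        if (attemptId < 0 || attemptId > MAX_ATTEMPT_ID) {
            throw new IllegalArgumentException("attemptId out of range: " + attemptId);
        }
        if (rawPartitionId < 0 || rawPartitionId > RAW_ID_MASK) {
            throw new IllegalArgumentException("rawPartitionId out of range: " + rawPartitionId);
        }
        return (attemptId << RAW_ID_BITS) | rawPartitionId;
    }

    // Unsigned shift so that attempt ids >= 128 (sign bit set) decode correctly.
    public static int attemptId(int encoded) {
        return encoded >>> RAW_ID_BITS;
    }

    public static int rawPartitionId(int encoded) {
        return encoded & RAW_ID_MASK;
    }
}
```

Under this layout a shuffle could have at most 2^24 raw partitions and 256 attempts per partition, and an encoded id with attemptId >= 128 would be a negative `Int`, so any code comparing or sorting encoded ids would need to take that into account.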