INTEGRATION OF DATA MANAGEMENT AND PARTITIONING
Data Distribution/Scheduling is achieved by:
- Assigning each object a key derived from the data itself, i.e. the keys
define a simple ordering scheme for the data
- Partitioning/repartitioning the ordering to produce a data/computation distribution
- Smoothing to improve load balance/partition quality
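
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the slide does not say how keys are computed, so a Morton (Z-order) key over 2D integer coordinates is used as one possible choice, each object is assumed to carry a positive work weight, and the names morton_key, partition, and smooth are hypothetical.

```python
def morton_key(x, y, bits=8):
    """Assumed key: interleave the bits of integer coordinates (x, y)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def partition(objects, nparts):
    """Sort (x, y, weight) objects by key and cut the ordering into
    nparts contiguous pieces of roughly equal total weight."""
    ordered = sorted(objects, key=lambda o: morton_key(o[0], o[1]))
    total = sum(o[2] for o in ordered)
    parts = [[] for _ in range(nparts)]
    running = 0.0
    for obj in ordered:
        # Place the object in the part whose weight range holds its midpoint.
        mid = running + obj[2] / 2
        p = min(int(mid * nparts / total), nparts - 1)
        parts[p].append(obj)
        running += obj[2]
    return parts


def smooth(parts, sweeps=4):
    """Shift boundary objects between neighbouring parts whenever that
    lowers the larger of the two loads; the ordering stays contiguous."""
    def load(p):
        return sum(o[2] for o in p)

    for _ in range(sweeps):
        for i in range(len(parts) - 1):
            a, b = parts[i], parts[i + 1]
            la, lb = load(a), load(b)
            if a and la > lb and max(la - a[-1][2], lb + a[-1][2]) < la:
                b.insert(0, a.pop())      # last object of a moves to b
            elif b and lb > la and max(lb - b[0][2], la + b[0][2]) < lb:
                a.append(b.pop(0))        # first object of b moves to a
    return parts
```

Repartitioning after the data changes amounts to calling partition again on the updated objects; because the keys come from the data itself, objects that stay close in the ordering tend to stay in the same part.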