Class MapReduce

Implements a simplified version of the Map-Reduce algorithm. Once the original data has been processed, the object acts as an iterator over the results, offering a transparent wrapper for results coming from any source.

Properties summary

  • $_counter protected
    int

    Count of elements emitted during the Reduce phase

  • $_data protected
    \Traversable

    Holds the original data that needs to be processed

  • $_executed protected
    bool

    Whether the Map-Reduce routine has been executed already on the data

  • $_intermediate protected
    array

    Holds the shuffled results that were emitted from the map phase

  • $_mapper protected
    callable

    A callable that will be executed for each record in the original data

  • $_reducer protected
    callable|null

    A callable that will be executed for each bucket of intermediate records emitted during the Map phase

  • $_result protected
    array

    Holds the results as emitted during the reduce phase

Method Summary

  • __construct() public

    Constructor

  • _execute() protected

    Runs the actual Map-Reduce algorithm. This iterates over the original data and calls the mapper function for each record, then calls the reduce function for each intermediate bucket created during the Map phase.

  • emit() public

    Appends a new record to the final list of results and optionally assigns a key to this record.

  • emitIntermediate() public

    Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.

  • getIterator() public

    Returns an iterator with the end result of running the Map and Reduce phases on the original data

Method Detail

__construct() public

__construct(\Traversable $data, callable $mapper, ?callable $reducer = null)

Constructor

Example:

Separate all unique odd and even numbers in an array

$data = new \ArrayObject([1, 2, 3, 4, 5, 3]);

// Put each number into the 'even' or 'odd' bucket
$mapper = function ($value, $key, $mr) {
    $type = ($value % 2 === 0) ? 'even' : 'odd';
    $mr->emitIntermediate($value, $type);
};

// Emit the de-duplicated numbers of each bucket, keyed by the bucket name
$reducer = function ($numbers, $type, $mr) {
    $mr->emit(array_unique($numbers), $type);
};
$results = new MapReduce($data, $mapper, $reducer);

Iterating over $results in the previous example produces the following result:

['odd' => [1, 3, 5], 'even' => [2, 4]]

Parameters

\Traversable $data

The original data to be processed

callable $mapper

The mapper callback. It receives three arguments: the current value, the key of the current record, and this class instance, so that the callback can call the result emitters.

callable|null $reducer optional

The reducer callback. It receives three arguments: the list of values inside a bucket, the name of the bucket that was created during the mapping phase, and an instance of this class. It can be omitted only if the mapper never calls emitIntermediate().
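When the mapper writes straight to the final results with emit(), no buckets are created and the reducer can be left out. The following is a minimal sketch of that case. It assumes the cakephp/collection package is installed through Composer and that the autoloader lives at the path shown; the data and variable names are illustrative only.

```php
<?php
// Assumption: Composer's autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$data = new \ArrayObject([1, 2, 3, 4, 5]);

// The mapper emits final results directly, so no reducer is needed.
$mapper = function ($value, $key, $mr) {
    if ($value % 2 === 1) {
        $mr->emit($value * 10);
    }
};

$results = new MapReduce($data, $mapper);
$odds = iterator_to_array($results);
// $odds is [10, 30, 50]
```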

_execute() protected

_execute()

Runs the actual Map-Reduce algorithm. This iterates over the original data and calls the mapper function for each record, then calls the reduce function for each intermediate bucket created during the Map phase.

Throws

LogicException
if emitIntermediate was called but no reducer function was provided
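The flow described above can be sketched with a small stand-in class. TinyMapReduce is a hypothetical name and this is not the library's source; it only mirrors the documented behaviour (map every record, fail if buckets exist without a reducer, then reduce each bucket) so that the sequence and the LogicException case are easy to follow.

```php
<?php
// Hypothetical stand-in written for illustration; not the CakePHP implementation.
final class TinyMapReduce implements \IteratorAggregate
{
    private array $intermediate = [];
    private array $result = [];
    private bool $executed = false;
    private int $counter = 0;
    private iterable $data;
    private $mapper;
    private $reducer;

    public function __construct(iterable $data, callable $mapper, ?callable $reducer = null)
    {
        $this->data = $data;
        $this->mapper = $mapper;
        $this->reducer = $reducer;
    }

    public function emitIntermediate($val, $bucket): void
    {
        $this->intermediate[$bucket][] = $val;
    }

    public function emit($val, $key = null): void
    {
        $this->result[$key ?? $this->counter] = $val;
        $this->counter++;
    }

    public function getIterator(): \Traversable
    {
        if (!$this->executed) {
            $this->execute();
        }

        return new \ArrayIterator($this->result);
    }

    private function execute(): void
    {
        // Map phase: call the mapper once per original record.
        foreach ($this->data as $key => $val) {
            ($this->mapper)($val, $key, $this);
        }
        // Buckets without a reducer cannot be turned into results.
        if ($this->intermediate !== [] && $this->reducer === null) {
            throw new \LogicException('No reducer function was provided');
        }
        // Reduce phase: call the reducer once per bucket.
        foreach ($this->intermediate as $bucket => $values) {
            ($this->reducer)($values, $bucket, $this);
        }
        $this->intermediate = [];
        $this->executed = true;
    }
}

$toBucket = function ($value, $key, $job) {
    $job->emitIntermediate($value, 'all');
};
$sumBucket = function ($values, $bucket, $job) {
    $job->emit(array_sum($values), $bucket);
};

// With a reducer, the single bucket is reduced to its sum.
$sum = iterator_to_array(new TinyMapReduce([1, 2, 3], $toBucket, $sumBucket));

// Without a reducer, the same mapper triggers the exception on first iteration.
$thrown = false;
try {
    iterator_to_array(new TinyMapReduce([1, 2, 3], $toBucket));
} catch (\LogicException $e) {
    $thrown = true;
}
```

Note that the exception surfaces when the results are first iterated, not when the object is constructed.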

emit() public

emit(mixed $val, mixed $key = null)

Appends a new record to the final list of results and optionally assigns a key to this record.

Parameters

mixed $val

The value to be appended to the final list of results

mixed $key optional

An optional key to assign to the value
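As an illustration of the $key argument, the sketch below re-keys a list of rows by their id. It assumes cakephp/collection is installed through Composer at the path shown; the row structure and names are made up for the example.

```php
<?php
// Assumption: Composer's autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$rows = new \ArrayObject([
    ['id' => 7, 'name' => 'Ana'],
    ['id' => 9, 'name' => 'Luis'],
]);

// Passing a second argument to emit() sets the key of the emitted record.
$mapper = function ($row, $key, $mr) {
    $mr->emit($row['name'], $row['id']);
};

$byId = iterator_to_array(new MapReduce($rows, $mapper));
// $byId is [7 => 'Ana', 9 => 'Luis']
```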

emitIntermediate() public

emitIntermediate(mixed $val, mixed $bucket)

Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.

Parameters

mixed $val

The record itself to store in the bucket

mixed $bucket

The name of the bucket where to put the record
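A word count is a common way to see buckets at work: the mapper files a 1 under each word, and the reducer sums every bucket. This is a sketch that assumes cakephp/collection is installed through Composer at the path shown; the input lines are illustrative.

```php
<?php
// Assumption: Composer's autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$lines = new \ArrayObject(['the quick fox', 'the lazy dog']);

// Map phase: one intermediate record per word, bucketed by the word itself.
$mapper = function ($line, $key, $mr) {
    foreach (explode(' ', $line) as $word) {
        $mr->emitIntermediate(1, $word);
    }
};

// Reduce phase: each bucket holds a list of 1s; its sum is the word count.
$reducer = function ($ones, $word, $mr) {
    $mr->emit(array_sum($ones), $word);
};

$counts = iterator_to_array(new MapReduce($lines, $mapper, $reducer));
// $counts is ['the' => 2, 'quick' => 1, 'fox' => 1, 'lazy' => 1, 'dog' => 1]
```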

getIterator() public

getIterator()

Returns an iterator with the end result of running the Map and Reduce phases on the original data

Returns

\Traversable
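Because the class keeps an $_executed flag, the mapper is expected to run only when the results are first requested and not again afterwards. The sketch below checks that by counting mapper calls. It assumes cakephp/collection is installed through Composer at the path shown; the counter variable is illustrative.

```php
<?php
// Assumption: Composer's autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$calls = 0;
$mapper = function ($value, $key, $mr) use (&$calls) {
    $calls++;
    $mr->emit($value);
};

$results = new MapReduce(new \ArrayIterator([1, 2, 3]), $mapper);
$callsBefore = $calls;                   // nothing has run yet

$first = iterator_to_array($results);    // triggers the Map and Reduce phases
$second = iterator_to_array($results);   // reuses the stored results
$callsAfter = $calls;
```

If the assumption holds, $callsBefore is 0, $callsAfter is 3, and both passes return the same array.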

Property Detail

$_counter protected

Count of elements emitted during the Reduce phase

Type

int

$_data protected

Holds the original data that needs to be processed

Type

\Traversable

$_executed protected

Whether the Map-Reduce routine has been executed already on the data

Type

bool

$_intermediate protected

Holds the shuffled results that were emitted from the map phase

Type

array

$_mapper protected

A callable that will be executed for each record in the original data

Type

callable

$_reducer protected

A callable that will be executed for each bucket of intermediate records emitted during the Map phase

Type

callable|null

$_result protected

Holds the results as emitted during the reduce phase

Type

array

© 2005–present The Cake Software Foundation, Inc.
Licensed under the MIT License.
CakePHP is a registered trademark of Cake Software Foundation, Inc.
We are not endorsed by or affiliated with CakePHP.
https://api.cakephp.org/4.1/class-Cake.Collection.Iterator.MapReduce.html