MPI High-level API

MPI controller API module provides high-level functions for initialization of the backend, for working with object storage and submitting tasks.

API

unidist.core.backends.mpi.core.controller.api.init()

Initialize MPI processes.

Notes

When initialization collect the MPI cluster topology.

unidist.core.backends.mpi.core.controller.api.is_initialized()

Check if MPI backend has already been initialized.

Returns:: True or False.
Return type:: bool

Function shutdown() sends cancelation signal to all MPI processes. After that, MPI backend couldn’t be restarted.

unidist.core.backends.mpi.core.controller.api.shutdown()

Shutdown all MPI processes.

Notes

Sends cancelation operation to all workers and monitor processes.

Functions get() and put() are responsible for read/write operations from/to object storage. Both of the functions block execution until read/write finishes.

unidist.core.backends.mpi.core.controller.api.get(data_ids)

Get an object(s) associated with data_ids from the object storage.

Parameters:: data_ids (unidist.core.backends.mpi.core.common.MpiDataID or list) – An ID(s) to object(s) to get data from.
Returns:: A Python object.
Return type:: object

unidist.core.backends.mpi.core.controller.api.put(data)

Put the data into object storage.

Parameters:: data (object) – Data to be put.
Returns:: An ID of an object in object storage.
Return type:: unidist.core.backends.mpi.core.common.MpiDataID

wait() carries out blocking of execution until a requested number of MpiDataID isn’t ready.

unidist.core.backends.mpi.core.controller.api.wait(data_ids, num_returns=1)

Wait until data_ids are finished.

This method returns two lists. The first list consists of DataID-s that correspond to objects that completed computations. The second list corresponds to the rest of the DataID-s (which may or may not be ready).

Parameters:

data_ids (unidist.core.backends.mpi.core.common.MpiDataID or list) – DataID or list of DataID-s to be waited.
num_returns (int, default: 1) – The number of DataID-s that should be returned as ready.

Returns:

List of data IDs that are ready and list of the remaining data IDs.

Return type:

tuple

submit() submits a task execution request to a worker. Specific worker will be chosen by schedule_rank() scheduling function.

unidist.core.backends.mpi.core.controller.api.submit(task, *args, num_returns=1, **kwargs)

Execute function on a worker process.

Parameters:

task (callable) – Function to be executed in the worker.
*args (iterable) – Positional arguments to be passed in the task.
num_returns (int, default: 1) – Number of results to be returned from task.
**kwargs (dict) – Keyword arguments to be passed in the task.

Returns:

Type of returns depends on num_returns value:

if num_returns == 1, DataID will be returned.
if num_returns > 1, list of DataID-s will be returned.
if num_returns == 0, None will be returned.

Return type:

unidist.core.backends.mpi.core.common.MpiDataID or list or None

Scheduler

Currently, scheduling happens in a simple round-robing fashion. schedule_rank method just returns the next rank number in a loop.

unidist.core.backends.mpi.core.controller.common.RoundRobin.schedule_rank(self)

Find the next non-reserved rank for task/actor-task execution.

Returns:: A rank number.
Return type:: int

GarbageCollector

GarbageCollector controls memory footprint and sends cleanup requests for all workers, if certain amount of data IDs is out-of-scope.

class unidist.core.backends.mpi.core.controller.garbage_collector.GarbageCollector(local_store)

Class that tracks deleted data IDs and cleans worker’s object storage.

Parameters:: local_store (unidist.core.backends.mpi.core.local_object_store) – Reference to the local object storage.

Notes

Cleanup relies on internal threshold settings.

collect(data_id)

Track ID destruction.

This ID is out of scope, it’s associated data should be cleared.

Parameters:: data_id_metadata (tuple) – Tuple of the owner rank and data number describing a MpiDataID.

increment_task_counter()

Track task submission number.

Notes

For cleanup purpose.

regular_cleanup()

Cleanup all garbage collected IDs from local and all workers object storages.

Cleanup triggers based on internal threshold settings.