MPI High-level API
MPI controller API module provides high-level functions for initialization of the backend, for working with object storage and submitting tasks.
API
- unidist.core.backends.mpi.core.controller.api.init()
Initialize MPI processes.
Notes
When initialization collect the MPI cluster topology.
- unidist.core.backends.mpi.core.controller.api.is_initialized()
Check if MPI backend has already been initialized.
- Returns:
True or False.
- Return type:
bool
Function shutdown()
sends cancelation signal to all MPI processes.
After that, MPI backend couldn’t be restarted.
- unidist.core.backends.mpi.core.controller.api.shutdown()
Shutdown all MPI processes.
Notes
Sends cancelation operation to all workers and monitor processes.
Functions get()
and
put()
are responsible for read/write operations from/to object storage.
Both of the functions block execution until read/write finishes.
- unidist.core.backends.mpi.core.controller.api.get(data_ids)
Get an object(s) associated with data_ids from the object storage.
- Parameters:
data_ids (unidist.core.backends.mpi.core.common.MpiDataID or list) – An ID(s) to object(s) to get data from.
- Returns:
A Python object.
- Return type:
object
- unidist.core.backends.mpi.core.controller.api.put(data)
Put the data into object storage.
- Parameters:
data (object) – Data to be put.
- Returns:
An ID of an object in object storage.
- Return type:
wait()
carries out blocking of execution
until a requested number of MpiDataID
isn’t ready.
- unidist.core.backends.mpi.core.controller.api.wait(data_ids, num_returns=1)
Wait until data_ids are finished.
This method returns two lists. The first list consists of
DataID
-s that correspond to objects that completed computations. The second list corresponds to the rest of theDataID
-s (which may or may not be ready).- Parameters:
data_ids (unidist.core.backends.mpi.core.common.MpiDataID or list) –
DataID
or list ofDataID
-s to be waited.num_returns (int, default: 1) – The number of
DataID
-s that should be returned as ready.
- Returns:
List of data IDs that are ready and list of the remaining data IDs.
- Return type:
tuple
submit()
submits a task execution request to a worker.
Specific worker will be chosen by schedule_rank()
scheduling function.
- unidist.core.backends.mpi.core.controller.api.submit(task, *args, num_returns=1, **kwargs)
Execute function on a worker process.
- Parameters:
task (callable) – Function to be executed in the worker.
*args (iterable) – Positional arguments to be passed in the task.
num_returns (int, default: 1) – Number of results to be returned from task.
**kwargs (dict) – Keyword arguments to be passed in the task.
- Returns:
Type of returns depends on num_returns value:
if num_returns == 1,
DataID
will be returned.if num_returns > 1, list of
DataID
-s will be returned.if num_returns == 0,
None
will be returned.
- Return type:
unidist.core.backends.mpi.core.common.MpiDataID or list or None
Scheduler
Currently, scheduling happens in a simple round-robing fashion.
schedule_rank
method
just returns the next rank number in a loop.
- unidist.core.backends.mpi.core.controller.common.RoundRobin.schedule_rank(self)
Find the next non-reserved rank for task/actor-task execution.
- Returns:
A rank number.
- Return type:
int
GarbageCollector
GarbageCollector
controls memory footprint and sends cleanup requests for all workers,
if certain amount of data IDs is out-of-scope.
- class unidist.core.backends.mpi.core.controller.garbage_collector.GarbageCollector(local_store)
Class that tracks deleted data IDs and cleans worker’s object storage.
- Parameters:
local_store (unidist.core.backends.mpi.core.local_object_store) – Reference to the local object storage.
Notes
Cleanup relies on internal threshold settings.
- collect(data_id)
Track ID destruction.
This ID is out of scope, it’s associated data should be cleared.
- Parameters:
data_id_metadata (tuple) – Tuple of the owner rank and data number describing a
MpiDataID
.
- increment_task_counter()
Track task submission number.
Notes
For cleanup purpose.
- regular_cleanup()
Cleanup all garbage collected IDs from local and all workers object storages.
Cleanup triggers based on internal threshold settings.