Shared Object Store

SharedObjectStore of the MPI backend stores data in shared memory. Depending on MpiSharedObjectStoreThreshold, data can instead be stored in LocalObjectStore.
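The size-based choice between the two stores can be pictured with a small sketch. It assumes that the threshold is a size in bytes and that data at or above it goes to shared memory; the function name, the constant, and its value are hypothetical and are not part of unidist.

```python
# Hypothetical threshold in bytes; in unidist the real value comes from
# the MpiSharedObjectStoreThreshold setting.
THRESHOLD_BYTES = 1_000_000


def choose_store(serialized_size, shared_store_enabled=True):
    """Return the name of the store a payload of this size would go to."""
    # Assumption: payloads at or above the threshold are shared, smaller
    # ones stay in the local object store of the process.
    if shared_store_enabled and serialized_size >= THRESHOLD_BYTES:
        return "SharedObjectStore"
    return "LocalObjectStore"
```

Under these assumptions a small payload stays local, and a large payload is shared only when the shared object store is enabled.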

Topology changes

If the shared object store is enabled on a cluster, unidist changes the number of service processes: a monitor process is created and assigned on each host. Monitor processes are divided into a root monitor and non-root monitors. A non-root monitor is only responsible for managing the shared object store on its host, while the root monitor also performs the main work of the monitor.

Service buffer

The shared object store uses an additional service buffer to track how many processes on the host reference each piece of stored data and to check whether the data has already been written to shared memory.

A service buffer is an array of long integers that stores service information for each data item in the shared object store. The service information for one item consists of four numbers:

  • Worker ID - the first part of the DataID.

  • Data number - the second part of the DataID.

  • First data index - the first shared memory index where the data is located.

  • References number - the number of data references, which shows how many processes are using this data.
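The four-number layout above can be sketched with a flat array of 64-bit integers. This is only an illustration: the helper names are hypothetical, and it assumes that a service index points at the first of the four slots of a record.

```python
from array import array

# Each record occupies four consecutive signed 64-bit integers.
INFO_SIZE = 4
WORKER_ID, DATA_NUMBER, FIRST_DATA_INDEX, REFERENCES_NUMBER = range(INFO_SIZE)


def make_service_buffer(capacity):
    """Create a zero-filled buffer with room for `capacity` records."""
    return array("q", [0] * (capacity * INFO_SIZE))


def write_record(buffer, service_index, worker_id, data_number, first_data_index):
    """Store a record starting at `service_index` with zero references."""
    buffer[service_index + WORKER_ID] = worker_id
    buffer[service_index + DATA_NUMBER] = data_number
    buffer[service_index + FIRST_DATA_INDEX] = first_data_index
    buffer[service_index + REFERENCES_NUMBER] = 0


def add_reference(buffer, service_index, delta=1):
    """Change the reference count of a record and return the new value."""
    buffer[service_index + REFERENCES_NUMBER] += delta
    return buffer[service_index + REFERENCES_NUMBER]
```

A process that starts using the data would call `add_reference(buffer, service_index)`, and one that stops using it would pass `delta=-1`.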

Shared memory size

Memory for shared object store is allocated and managed by the monitor process, and other processes on the same host have read and write access to it. By default, shared object store uses 95% of all available virtual memory. You can control the size of shared memory using configuration settings: MpiSharedObjectStoreMemory and MpiSharedServiceMemory.

Shared memory management

All workers on the same host can write to shared memory, but the monitor process manages it and determines where data will be written and when it will be deleted. If a process wants to write some data to shared memory, it asks the monitor to reserve a region of the desired size in the shared object store and then writes the data to that region.

When the monitor receives a request to delete data from shared memory, it checks the number of references from all processes to this data. If the number of references is 0, the data will be deleted and the shared memory will be freed for further use.
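The reserve and delete steps described above can be sketched as a toy monitor-side manager. This is not the SharedMemoryManager implementation: the class, its methods, and the first-fit allocation strategy are assumptions made for illustration.

```python
class ToySharedMemoryManager:
    """Hypothetical monitor-side bookkeeping for a fixed-size shared region."""

    def __init__(self, size):
        self.free_ranges = [(0, size)]  # half-open (first, last) ranges
        self.reserved = {}              # data_id -> (first, last)
        self.ref_counts = {}            # data_id -> number of references

    def reserve(self, data_id, nbytes):
        """Reserve `nbytes` using first-fit and return the reserved range."""
        for i, (first, last) in enumerate(self.free_ranges):
            if last - first >= nbytes:
                # Take the front of the free range and keep the remainder.
                self.free_ranges[i] = (first + nbytes, last)
                self.reserved[data_id] = (first, first + nbytes)
                self.ref_counts[data_id] = 0
                return first, first + nbytes
        raise MemoryError("not enough free shared memory")

    def add_ref(self, data_id):
        self.ref_counts[data_id] += 1

    def drop_ref(self, data_id):
        self.ref_counts[data_id] -= 1

    def delete(self, data_id):
        """Free the range only if no process references the data."""
        if self.ref_counts[data_id] != 0:
            return False
        self.free_ranges.append(self.reserved.pop(data_id))
        del self.ref_counts[data_id]
        return True
```

In this sketch a delete request for data that is still referenced is simply refused, and a freed range becomes available to later reservations. Freed neighbours are not merged, which a real allocator would likely need to handle.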

All shared storage management (memory reservation and deallocation) is defined in unidist.core.backends.mpi.core.monitor.shared_memory_manager.SharedMemoryManager.

API

class unidist.core.backends.mpi.core.shared_object_store.SharedObjectStore

Class that provides access to data in shared memory.

Notes

This class initializes and manages shared memory.

contains(data_id)

Check if the store contains the data_id information required to deserialize the data.

Returns:

True if the shared store contains the required information.

Return type:

bool

Notes

This check does not ensure that the data is physically located in shared memory.

delete_service_info(data_id, service_index)

Delete service information for the given data ID.

Parameters:

Notes

This function should be called by the monitor during the cleanup of shared data.

finalize()

Release used resources.

Notes

Shared store should be finalized before MPI.Finalize().

get(data_id, owner_rank=None, shared_info=None)

Get data from another worker using shared memory.

Parameters:
  • data_id (unidist.core.backends.mpi.core.common.MpiDataID) – An ID to data.

  • owner_rank (int, default: None) – The rank that sent the data. This value is used to synchronize data in shared memory between different hosts if the value is not None.

  • shared_info (dict, default: None) – The necessary information to properly deserialize data from shared memory. If shared_info is None, the data already exists in shared memory in the current process.

classmethod get_instance()

Get instance of SharedObjectStore.

Return type:

SharedObjectStore

get_ref_number(data_id, service_index)

Get the current reference count of data_id by service index.

Parameters:

Returns:

The number of references to this data_id.

Return type:

int

get_shared_buffer(first_index, last_index)

Get the requested range of shared memory.

Parameters:
  • first_index (int) – Start of the requested range.

  • last_index (int) – End of the requested range (exclusive).

Notes

This function is used to synchronize shared memory between different hosts.
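The half-open range convention (first_index included, last_index excluded) matches ordinary Python slicing. The following sketch shows it on a plain bytearray standing in for shared memory; the function name is hypothetical and the real method may return a different buffer type.

```python
def get_range(shared_memory, first_index, last_index):
    """Return a zero-copy view of bytes first_index .. last_index - 1."""
    # memoryview slicing does not copy, so the view reflects later writes
    # to the underlying buffer.
    return memoryview(shared_memory)[first_index:last_index]
```

For a ten-byte buffer, `get_range(buffer, 2, 5)` covers the three bytes at indices 2, 3, and 4.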

get_shared_info(data_id)

Get the information required to correctly deserialize data_id from shared memory.

Parameters:

data_id (unidist.core.backends.mpi.core.common.MpiDataID) –

Returns:

Information required for data deserialization.

Return type:

dict

is_allocated()

Check if the shared memory is allocated and ready for data to be put into it.

Returns:

True or False.

Return type:

bool

put(data_id, serialized_data)

Put data into shared memory.

Parameters:

should_be_shared(data)

Check if data should be sent using shared memory.

Parameters:

data (dict) – Serialized data to check its size.

Returns:

True if the data should be sent using shared memory.

Return type:

bool