C/C++ API Reference

Core API

Initialization and finalization

dyad_rc_t dyad_init(bool debug, bool check, bool shared_storage, bool reinit, bool async_publish, bool fsync_write, unsigned int key_depth, unsigned int key_bins, unsigned int service_mux, const char *kvs_namespace, const char *prod_managed_path, const char *cons_managed_path, bool relative_to_managed_path, const char *dtl_mode_str, const dyad_dtl_comm_mode_t dtl_comm_mode, void *flux_handle)

Initializes the DYAD context with explicit parameters.

Allocates and initializes the thread-local DYAD context with the provided configuration. If the context is already initialized and reinit is false, returns DYAD_RC_OK immediately without reinitializing. If reinit is true, calls dyad_finalize() to tear down the existing context before reinitializing.

The initialization sequence is:

  1. Allocate the context struct and set it to dyad_ctx_default.

  2. Open or adopt a Flux handle and retrieve the broker rank.

  3. Compute node_idx from rank / service_mux.

  4. Copy the KVS namespace string.

  5. Initialize the DTL via dyad_set_and_init_dtl_mode().

  6. Set the producer-managed path via dyad_set_prod_path().

  7. Set the consumer-managed path via dyad_set_cons_path().

  8. Redirect log output to per-process log files under logs/.

On any failure after allocation, dyad_clear() is called to release partially initialized resources and the context is reset to dyad_ctx_default.

If neither prod_managed_path nor cons_managed_path is provided, DYAD will not perform any synchronization or data transfer. A warning is printed to stderr and DYAD_RC_OK is returned.

Note

If DYAD_PROFILER_DFTRACER is defined, initializes the DFTracer profiler at the start of this function.

Note

Log output is redirected to per-process files under logs/ unless DYAD_LOGGER_NO_LOG is defined.

Parameters:
  • debug[in] Enable debug logging.

  • check[in] Enable operation checking.

  • shared_storage[in] Enable shared storage mode.

  • reinit[in] Force re-initialization if already initialized.

  • async_publish[in] Enable asynchronous KVS publishing.

  • fsync_write[in] Enable fsync() on write.

  • key_depth[in] KVS key hierarchy depth.

  • key_bins[in] KVS key bins per level.

  • service_mux[in] Number of Flux broker ranks per node. Clamped to a minimum of 1.

  • kvs_namespace[in] Flux KVS namespace. Must not be NULL.

  • prod_managed_path[in] Producer-managed directory path. May be NULL.

  • cons_managed_path[in] Consumer-managed directory path. May be NULL.

  • relative_to_managed_path[in] If true, file paths are interpreted as relative to the managed directory.

  • dtl_mode_str[in] Name of the DTL mode to use.

  • dtl_comm_mode[in] Communication mode for the DTL.

  • flux_handle[in] Optional existing Flux handle. If NULL, a new handle is opened via flux_open().

Return values:
  • DYAD_RC_OK – Initialization succeeded, or was already initialized and reinit is false.

  • DYAD_RC_NOCTX – Context allocation failed, or DYAD_PATH_DELIM is invalid or undefined.

  • DYAD_RC_FLUXFAIL – The Flux handle could not be opened or the broker rank could not be retrieved.

  • DYAD_RC_* – Any error propagated from dyad_set_and_init_dtl_mode(), dyad_set_prod_path(), or dyad_set_cons_path().

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_init_env(const dyad_dtl_comm_mode_t dtl_comm_mode, void *flux_handle)

Initializes DYAD by reading configuration from environment variables.

Reads the following environment variables to configure DYAD, then delegates to dyad_init() to set up the full DYAD context:

Environment Variable

Default

Description

DYAD_SYNC_DEBUG

false

Enable debug logging

DYAD_SYNC_CHECK

false

Enable operation checking

DYAD_SHARED_STORAGE

false

Enable shared storage mode

DYAD_REINIT

false

Force re-initialization

DYAD_ASYNC_PUBLISH

false

Enable asynchronous KVS publishing

DYAD_FSYNC_WRITE

false

Enable fsync() on write

DYAD_KEY_DEPTH

3

KVS key hierarchy depth

DYAD_KEY_BINS

1024

KVS key bins per level

DYAD_SERVICE_MUX

1

Flux broker ranks per node

DYAD_KVS_NAMESPACE

NULL

Flux KVS namespace

DYAD_PATH_CONSUMER

NULL

Consumer-managed directory path

DYAD_PATH_PRODUCER

NULL

Producer-managed directory path

DYAD_PATH_RELATIVE

false

Paths are relative to managed dirs

DYAD_DTL_MODE

DYAD_DTL_DEFAULT

Data transport layer mode

If DYAD_DTL_MODE is not set, defaults to DYAD_DTL_DEFAULT and logs a warning to stderr.

Parameters:
  • dtl_comm_mode[in] Communication mode for the data transport layer.

  • flux_handle[in] Optional existing Flux handle. If NULL, DYAD will create its own handle.

Returns:

dyad_rc_t return code propagated from dyad_init().

void dyad_ctx_init(dyad_dtl_comm_mode_t dtl_comm_mode, void *flux_handle)

Initializes the DYAD context from environment variables.

Delegates to dyad_init_env() to initialize the DYAD context from environment variables. This is the top-level initialization entry point in the initialization chain:

dyad_ctx_init()
    -> dyad_init_env()   [reads environment variables]
        -> dyad_init()   [allocates and configures the context]

If initialization fails, logs the error to stderr and sets ctx->initialized and ctx->reenter to false to leave the context in a safe, inert state. Called by dyad_wrapper_init() at library load time.

Parameters:
  • dtl_comm_mode[in] Communication mode for the data transport layer.

  • flux_handle[in] Optional existing Flux handle. If NULL, DYAD will open its own handle via flux_open().

dyad_ctx_t *dyad_ctx_get(void)

Returns a pointer to the thread-local DYAD context, or NULL if not initialized. The pointer is stored as a static thread-local variable to prevent multiple initializations and ensure each thread maintains an independent pointer to its own context instance.

dyad_rc_t dyad_finalize(void)

Finalizes and deallocates the DYAD context.

Calls dyad_clear() to release all resources held by the context, then frees the context struct itself and sets the global context pointer to NULL. If the context is already NULL, returns DYAD_RC_OK immediately without taking any action.

This is the top-level teardown function in the finalization chain:

dyad_finalize()
    -> dyad_clear()        [frees all context resources]
        -> dyad_dtl_finalize() [finalizes the DTL handle]

Note

If DYAD_PROFILER_DFTRACER is defined, finalizes the DFTracer profiler after the context is freed.

Return values:

DYAD_RC_OK – The context was successfully finalized, or was already NULL.

Returns:

dyad_rc_t return code indicating the outcome:

void dyad_ctx_fini(void)

Tears down the DYAD context at the wrapper or library level.

Calls dyad_finalize() to release all DYAD resources if the context is non-NULL. If the context is already NULL, returns immediately without taking any action.

This is the top-level finalization entry point in the teardown chain:

dyad_ctx_fini()
    -> dyad_finalize()     [frees the context struct]
        -> dyad_clear()    [frees all context resources]
            -> dyad_dtl_finalize() [finalizes the DTL handle]

Called by dyad_wrapper_fini() at library unload time.

Note

The DYAD_PROFILER == 3 branch is obsolete and will be removed in a future pull request.

dyad_rc_t dyad_clear(void)

Resets the DYAD context to its default state, releasing all internal resources without freeing the context struct itself.

Releases all resources held by the global DYAD context in the following order:

  1. Finalizes the DTL handle via dyad_dtl_finalize().

  2. Closes the Flux handle via flux_close().

  3. Frees the KVS namespace string.

  4. Frees the producer-managed path and its canonical form.

  5. Frees the consumer-managed path and its canonical form.

Unlike dyad_finalize(), this function does not free the context struct itself. The context pointer remains valid after this call, allowing the GOTCHA wrapper layer to continue using it for the lifetime of the process. This is important because the wrapper requires a valid context pointer even during error recovery.

If the context is already NULL, returns DYAD_RC_OK immediately without taking any action.

This function is called by dyad_finalize() as part of normal teardown, and also directly by dyad_init() on initialization failure to release partially initialized resources.

Note

If DYAD_PROFILER_DFTRACER is defined, finalizes the DFTracer profiler after the context is cleared.

Return values:

DYAD_RC_OK – All resources were successfully released, or the context was already NULL.

Returns:

dyad_rc_t return code indicating the outcome:

Path management

dyad_rc_t dyad_set_prod_path(const char *path)

Sets the producer-managed directory path in the DYAD context.

Updates ctx->prod_managed_path and its associated canonical form, lengths, and hashes. Any previously set producer path is freed before the new one is stored.

If prod_managed_path is NULL, the producer path fields in the context are cleared, disabling producer-side operations.

If prod_managed_path is non-NULL, the function also resolves its canonical form via realpath(). If the canonical form differs from the provided path, it is stored separately in ctx->prod_real_path for use in path prefix matching. If realpath() fails (e.g. because the directory does not yet exist) or the canonical form is identical to the provided path, ctx->prod_real_path is set to NULL to avoid redundant matching.

The re-entrancy guard (ctx->reenter) is disabled for the duration of this call and restored before returning.

Note

realpath() failure is not treated as an error. This can occur when the producer-managed directory has not yet been created, which is valid in configurations where both producer and consumer managed paths are set but only one side creates its directory.

Parameters:

path[in] Path to the producer-managed directory. May be NULL to clear the producer path.

Return values:
  • DYAD_RC_OK – The producer path was successfully set or cleared.

  • DYAD_RC_NOCTX – The DYAD context is NULL.

  • DYAD_RC_BADMANAGEDPATHprod_managed_path is an empty string, or hashing the path returned 0.

  • DYAD_RC_SYSFAIL – Memory allocation or memcpy() failed.

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_set_cons_path(const char *path)

Sets the consumer-managed directory path in the DYAD context.

Functionally equivalent to dyad_set_prod_path() but sets the consumer-managed directory path (ctx->cons_managed_path).

Parameters:

path[in] Path to the consumer-managed directory. May be NULL to clear the consumer path.

Returns:

dyad_rc_t return code indicating the outcome. See dyad_set_prod_path() for the full list of return codes.

Return codes

enum dyad_core_return_codes

Return codes for DYAD core operations.

Return codes are grouped by subsystem. Negative values indicate errors; DYAD_RC_OK (0) indicates success. Use DYAD_IS_ERROR() to test whether a return code represents an error condition.

Values:

enumerator DYAD_RC_OK

Operation worked correctly.

enumerator DYAD_RC_SYSFAIL

Some sys call or C standard library call failed

enumerator DYAD_RC_NOCTX

No DYAD Context found.

enumerator DYAD_RC_BADMETADATA

Cannot create/populate a DYAD response.

enumerator DYAD_RC_BADFIO

File I/O failed.

enumerator DYAD_RC_BADMANAGEDPATH

Cons or Prod Manged Path is bad.

enumerator DYAD_RC_BADDTLMODE

Invalid DYAD DTL mode provided.

enumerator DYAD_RC_UNTRACKED

Provided path is not tracked by DYAD.

enumerator DYAD_RC_BAD_B64DECODE

Decoding of data w/ base64 failed.

enumerator DYAD_RC_BAD_COMM_MODE

Invalid comm mode provided to DTL.

enumerator DYAD_RC_BAD_CLI_ARG_DEF

Trying to define a CLI argument failed.

enumerator DYAD_RC_BAD_CLI_PARSE

Trying to parse CLI arguments failed.

enumerator DYAD_RC_BADBUF

Invalid buffer/pointer passed to function.

enumerator DYAD_RC_FLUXFAIL

Some Flux function failed.

enumerator DYAD_RC_BADCOMMIT

Flux KVS commit didn’t work.

enumerator DYAD_RC_NOTFOUND

Flux KVS lookup didn’t work.

enumerator DYAD_RC_BADFETCH

Flux KVS commit didn’t work.

enumerator DYAD_RC_BADRPC

Flux RPC pack or get didn’t work.

enumerator DYAD_RC_BADPACK

JSON packing failed.

enumerator DYAD_RC_BADUNPACK

JSON unpacking failed.

enumerator DYAD_RC_RPC_FINISHED

The Flux RPC responded with ENODATA (i.e., end of stream) sooner than expected

enumerator DYAD_RC_UCXINIT_FAIL

UCX initialization failed.

enumerator DYAD_RC_UCXWAIT_FAIL

UCX wait (either custom or ‘ucp_worker_wait’) failed

enumerator DYAD_RC_UCXEP_FAIL

An operation on a ucp_ep_h failed.

enumerator DYAD_RC_UCXCOMM_FAIL

UCX communication routine failed.

enumerator DYAD_RC_UCXMMAP_FAIL

Failed to perform operations with ucp_mem_map.

enumerator DYAD_RC_UCXRKEY_PACK_FAILED

Failed to perform operations with ucp_mem_map.

enumerator DYAD_RC_UCXRKEY_UNPACK_FAILED

Failed to perform operations with ucp_mem_map.

enumerator DYAD_RC_MARGOINIT_FAIL

Margo initialization failed.

enumerator DYAD_RC_MARGO_BAD_PROTO

Bad network protocol for Margo initialization.

DYAD_IS_ERROR(code)

Tests whether a dyad_rc_t value represents an error.

Parameters:
  • code – A dyad_rc_t return code.

Returns:

Non-zero if code indicates an error, zero if it indicates success.

Internal implementation

The following documents all internal functions from the core client implementation that are not exposed in the public headers.

Functions

int gen_path_key(const char *str, char *path_key, const size_t len, const uint32_t depth, const uint32_t width)

Generates a hierarchical KVS key from a file path using MurmurHash3.

Produces a structured KVS key of the form "h1.h2...hN.str", where each hN is a hexadecimal hash bucket value and str is the original file path. The number of hash levels is controlled by depth and the number of buckets per level by width.

Each level is computed by applying MurmurHash3_x64_128 to str with a different seed (drawn from a fixed table of primes, cycling every 10 levels), then reducing the four 32-bit hash words via XOR and taking the result modulo width to select a bucket. The buckets from all levels are concatenated with '.’ separators, followed by the original str appended as the final component.

If str is shorter than 128 bytes, it is padded with '@' characters to 128 bytes before hashing to improve hash distribution for short paths.

The hierarchical structure distributes keys across a tree of KVS directories, avoiding hot spots that would occur if all keys were stored at the same level. depth and width together control the fanout and depth of this tree.

Parameters:
  • str[in] File path to encode as a KVS key. Must not be NULL or empty.

  • path_key[out] Buffer to receive the generated key. Must not be NULL. On success, contains a null-terminated string of the form "h1.h2...hN.str".

  • len[in] Size of the path_key buffer in bytes. Must be greater than zero and large enough to hold the full key including the null terminator.

  • depth[in] Number of hash levels (directory levels) in the key. Must be no greater than 10 to stay within the seed table; larger values cycle through the seed table.

  • width[in] Number of buckets per hash level. Controls the fanout at each directory level of the KVS key tree.

Return values:
  • 0 – The key was successfully generated and written to path_key.

  • -1str, path_key, or len is invalid, str is empty, or the generated key exceeded len.

Returns:

int

static void future_cleanup_cb(void **f, void *arg)

Callback to clean up a Flux future after an asynchronous KVS commit completes.

Registered via flux_future_then() by dyad_kvs_commit() when ctx->async_publish is enabled. Invoked by the Flux reactor when the asynchronous KVS commit future is fulfilled, allowing the commit to complete without blocking the caller.

If the future completed with an error, logs a message to stderr before destroying the future. The error is not propagated since there is no caller context to return to at callback invocation time.

Parameters:
  • f[in] Pointer to the fulfilled Flux future. Destroyed before returning.

  • arg[in] Unused. Reserved for future use.

dyad_rc_t dyad_kvs_commit(const dyad_ctx_t *ctx, void **txn)

Commits a Flux KVS transaction to publish file metadata.

Submits txn to the Flux KVS under ctx->kvs_namespace. The commit behavior depends on whether asynchronous publishing is enabled:

  • Synchronous (ctx->async_publish is false): Blocks until the commit is acknowledged by the KVS, then destroys the future. The caller can be certain the metadata is visible to consumers upon return.

  • Asynchronous (ctx->async_publish is true): Registers future_cleanup_cb via flux_future_then() and returns immediately without waiting for the commit to complete. The future is destroyed by the callback when the commit eventually completes. The caller cannot assume the metadata is visible to consumers upon return.

This function is an internal helper called by publish_via_flux() and is not intended to be called directly by users.

Note

In asynchronous mode, a failure to register future_cleanup_cb via flux_future_then() is logged but does not affect the return code. The commit may still complete, but the future may not be properly cleaned up.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL. Provides the Flux handle, KVS namespace, and async_publish flag.

  • txn[in] Pointer to the Flux KVS transaction to commit. Must not be NULL. The caller retains ownership and is responsible for destroying txn after this function returns.

Return values:
  • DYAD_RC_OK – The transaction was successfully submitted, and in synchronous mode, acknowledged by the KVS.

  • DYAD_RC_BADCOMMIT – The flux_kvs_commit() call failed to submit the transaction.

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t publish_via_flux(const dyad_ctx_t *ctx, const char *upath)

Builds and commits a Flux KVS transaction to advertise a produced file.

Generates a KVS key from upath via gen_path_key(), then creates and commits a single-entry Flux KVS transaction that maps the key to the producer’s broker rank (ctx->rank). This rank is later retrieved by consumers via dyad_kvs_read() to determine file locality and, if needed, to identify which broker to contact for data transfer.

This function sits between dyad_commit() and dyad_kvs_commit() in the producer publish pipeline:

dyad_produce()
    -> dyad_commit()          [resolves path, checks management, guards reenter]
        -> publish_via_flux() [builds and submits the KVS transaction]
            -> dyad_kvs_commit() [commits the transaction to the Flux KVS]

The KVS transaction is destroyed before returning regardless of whether the commit succeeded or failed.

This function is an internal helper and is not intended to be called directly by users.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL. Provides the Flux handle, KVS namespace, producer rank, and key generation parameters (key_depth and key_bins).

  • upath[in] Path to the file relative to the producer-managed directory. Used to generate the KVS key. Must not be NULL.

Return values:
  • DYAD_RC_OK – The transaction was successfully built and committed.

  • DYAD_RC_FLUXFAIL – The Flux KVS transaction could not be created or packed.

  • DYAD_RC_* – Any error code propagated from dyad_kvs_commit().

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_commit(dyad_ctx_t *ctx, const char *fname)

Publishes file metadata to the Flux KVS to notify consumers that a file is ready.

Resolves fname to a path relative to the producer-managed directory and publishes the file’s metadata to the Flux KVS via publish_via_flux(). This signals to waiting consumers that the file has been written and is ready to be read or transferred.

If fname is not under the producer-managed path, the function returns DYAD_RC_OK immediately without taking any action, so that DYAD does not interfere with file operations outside its managed directories.

This function is the internal implementation called by dyad_produce(). It may also be called directly when finer control over the commit step is needed, such as when bypassing the context validation performed by dyad_produce().

Note

The caller is responsible for ensuring the file has been fully written and flushed to storage before calling this function, as consumers may begin reading the file immediately upon receiving the KVS notification.

Note

If ctx->check is set and the operation succeeds, the environment variable DYAD_CHECK_ENV is set to "ok".

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL and must have a valid prod_managed_path set. ctx->reenter is temporarily set to false during the KVS publish to prevent re-entrant interception.

  • fname[in] Path to the file to be published. May be an absolute path or, if ctx->relative_to_managed_path is set, a path relative to ctx->prod_managed_path.

Return values:
  • DYAD_RC_OK – The file metadata was successfully published, or fname is not under the producer-managed path (no action taken).

  • DYAD_RC_* – Any error code propagated from publish_via_flux().

Returns:

dyad_rc_t return code indicating the outcome:

static void print_mdata(const dyad_ctx_t *ctx, const dyad_metadata_t *mdata)
dyad_rc_t dyad_kvs_read(const dyad_ctx_t *ctx, const char *topic, const char *upath, bool should_wait, dyad_metadata_t **mdata)

Looks up file metadata from the Flux KVS.

Queries the Flux KVS for metadata associated with the file identified by topic (the KVS key) and upath (the file path relative to the consumer-managed directory). The only metadata currently stored in the KVS is the producer’s broker rank (owner_rank), which is used by the caller to determine locality and, if needed, to dispatch an RPC to the correct producer broker for data transfer.

If should_wait is true, the lookup blocks using FLUX_KVS_WAITCREATE until the producer publishes the metadata. If should_wait is false, the lookup returns immediately with DYAD_RC_NOTFOUND if the metadata is not yet available.

If *mdata is already allocated on entry, the existing object is reused and only fpath and owner_rank are overwritten. Otherwise a new dyad_metadata_t object is allocated. On error, any partially allocated mdata is freed before returning.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL. Provides the Flux handle and KVS namespace.

  • topic[in] KVS key for the file, generated from upath via gen_path_key(). Must not be NULL.

  • upath[in] Path to the file relative to the consumer-managed directory. Stored in the populated metadata object. Must not be NULL.

  • should_wait[in] If true, block until the producer publishes the metadata to the KVS. If false, return immediately if the metadata is not yet available.

  • mdata[inout] Address of a dyad_metadata_t pointer to be populated. Must not be NULL. If *mdata is already allocated, it is reused; otherwise a new object is allocated. The caller is responsible for freeing it via dyad_free_metadata() when no longer needed.

Return values:
  • DYAD_RC_OK – Metadata was successfully retrieved and mdata has been populated.

  • DYAD_RC_NOTFOUNDmdata is NULL, the KVS lookup failed, or the metadata is not yet available and should_wait is false.

  • DYAD_RC_SYSFAIL – Memory allocation for the metadata object or its fpath field failed.

  • DYAD_RC_BADMETADATA – The KVS response could not be unpacked to extract the producer’s broker rank.

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_fetch_metadata(const dyad_ctx_t *ctx, const char *fname, const char *upath, dyad_metadata_t **mdata)

Retrieves file metadata from the Flux KVS for an internal consumer operation.

Looks up metadata for the file identified by upath in the Flux KVS, blocking until the producer publishes it. This is the internal counterpart to dyad_get_metadata(), used exclusively by dyad_consume().

After retrieving metadata, the function determines whether a remote data transfer is actually needed by checking if the producer and consumer reside on the same node. This is done by comparing mdata->owner_rank / ctx->service_mux against ctx->node_idx. If they match, the producer is on the same node and the file is already locally accessible, so the metadata object is freed and set to NULL to signal to dyad_consume() that the data transfer step should be skipped.

Note that this locality check is only relevant for node-local storage. When shared storage is enabled, dyad_consume() never uses the metadata for data transfer regardless, so the check is not needed in that path.

The mdata == NULL convention on return is therefore not an error condition but a deliberate signal that the data transfer step should be skipped.

This function is not intended to be called directly by users.

Note

Unlike dyad_get_metadata(), this function always blocks until the producer publishes the file’s metadata to the KVS.

Note

The service_mux field in ctx controls how many Flux broker ranks map to a single node, and is used to determine node-level locality.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL.

  • fname[in] Absolute path to the file being consumed.

  • upath[in] Path to the file relative to the consumer-managed directory. Used to generate the Flux KVS lookup key.

  • mdata[out] Address of a dyad_metadata_t pointer. Always set to NULL on entry. On success, either points to the retrieved metadata if a remote transfer is needed, or remains NULL if the producer is on the same node and the file is already locally accessible.

Return values:
  • DYAD_RC_OK – Metadata was successfully retrieved, or the producer is on the same node and no transfer is needed (mdata will be NULL).

  • DYAD_RC_* – Any error code propagated from dyad_kvs_read().

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_get_data(const dyad_ctx_t *ctx, const dyad_metadata_t *mdata, char **file_data, size_t *file_len)

Retrieves file data from a remote producer’s Flux broker via RPC.

Dispatches a streaming Flux RPC to the DYAD module running on the producer’s broker (identified by mdata->owner_rank) and retrieves the file data via the configured Data Transport Layer (DTL). The retrieved data is returned in file_data and its length in file_len.

The sequence of operations is:

  1. Pack an RPC payload containing the file path and producer rank.

  2. Send a streaming Flux RPC to the producer’s DYAD module.

  3. Receive and parse the RPC response.

  4. Establish a DTL connection to the producer.

  5. Receive the file data over the DTL connection.

  6. Close the DTL connection.

  7. Wait for the end-of-stream RPC message from the producer module.

The streaming RPC protocol expects exactly one data message followed by an end-of-stream signal (indicated by ENODATA). If additional messages arrive or the module reports an error, DYAD_RC_BADRPC is returned. Two return codes from the DTL have special meaning and bypass the end-of-stream wait: DYAD_RC_RPC_FINISHED (end of stream already received) and DYAD_RC_BADRPC (a prior RPC operation failed irrecoverably).

When built with UCX DTL support (DYAD_ENABLE_UCX_DTL), the producer prepends the file size to the data buffer. This prefix is extracted and stripped before returning, so file_data always points to the raw file contents.

This function is an internal helper called by dyad_consume() and dyad_consume_w_metadata(). It is not intended to be called directly by users.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL. Provides the Flux handle, DTL handle, and other connection parameters.

  • mdata[in] Metadata for the file to retrieve. Must not be NULL. mdata->fpath and mdata->owner_rank identify the file and the producer broker to contact.

  • file_data[out] Address of a pointer to be set to the buffer containing the retrieved file data. The buffer is allocated by the DTL layer. The caller is responsible for releasing it via ctx->dtl_handle->return_buffer().

  • file_len[out] Address of a size_t to be set to the number of bytes in file_data.

Return values:
  • DYAD_RC_OK – File data was successfully retrieved.

  • DYAD_RC_BADRPC – An RPC operation failed, the producer module sent an unexpected number of responses, or the module reported an error.

  • DYAD_RC_BADFIO – UCX DTL only: the producer-prepended file size was negative, indicating a read failure on the producer side.

  • DYAD_RC_* – Any error code propagated from dtl_handle->rpc_pack(), dtl_handle->rpc_recv_response(), dtl_handle->establish_connection(), or dtl_handle->recv().

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_cons_store(const dyad_ctx_t *ctx, const dyad_metadata_t *mdata, int fd, const size_t data_len, char *file_data)

Writes file data retrieved from a producer to the consumer-managed directory.

Stores data_len bytes of file_data to the appropriate path under the consumer-managed directory, as determined by combining ctx->cons_managed_path with the relative file path in mdata->fpath. Any intermediate directories that do not yet exist are created as needed.

For large files (at or above DYAD_POSIX_TRANSFER_GRANULARITY bytes), the data is written in chunks of DYAD_POSIX_TRANSFER_GRANULARITY rather than in a single write() call.

This function is an internal helper called by dyad_consume() and dyad_consume_w_metadata() after data has been retrieved from the producer via dyad_get_data(). It is not intended to be called directly by users.

Note

If the operation succeeds and ctx->check is set, the environment variable DYAD_CHECK_ENV is set to "ok".

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL. Used to resolve the consumer-managed path and check the check flag.

  • mdata[in] Metadata for the file being stored. Must not be NULL. mdata->fpath is appended to ctx->cons_managed_path to form the full destination path.

  • fd[in] Open, writable file descriptor for the destination file.

  • data_len[in] Number of bytes to write from file_data.

  • file_data[in] Buffer containing the file data to write. Must be at least data_len bytes in size.

Return values:
  • DYAD_RC_OK – All data_len bytes were successfully written.

  • DYAD_RC_BADFIO – Directory creation failed, a write() call failed, or the total bytes written does not match data_len.

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_produce(dyad_ctx_t *ctx, const char *fname)

Publishes a file under a DYAD-managed directory so it is available to consumers.

If fname falls under the producer-managed path, this function publishes the file’s metadata to the Flux KVS via dyad_commit(), signaling to consumers that the file is ready to be read.

This function is the producer-side counterpart to dyad_consume(). It does not transfer file data directly; instead it notifies the DYAD infrastructure that the file has been written, allowing waiting consumers to proceed with retrieval.

Note

The caller is responsible for ensuring the file has been fully written and flushed to storage before calling this function, as consumers may begin reading the file immediately upon notification.

Warning

The caller must ensure ctx remains valid for the duration of this call.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL and must have a valid prod_managed_path set.

  • fname[in] Path to the file that has been written and is ready for consumption.

Return values:
  • DYAD_RC_OK – The file was successfully published.

  • DYAD_RC_NOCTX – The context ctx or its Flux handle is NULL.

  • DYAD_RC_BADMANAGEDPATH – The producer-managed path in the context is NULL.

  • DYAD_RC_* – Any error code propagated from dyad_commit().

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_get_metadata(dyad_ctx_t *ctx, const char *fname, bool should_wait, dyad_metadata_t **mdata)

Retrieves metadata for a file under a DYAD-managed directory.

This function is coupled with Python API. This populates mdata which is used by dyad_consume_w_metadata ()

dyad_rc_t dyad_free_metadata(dyad_metadata_t **mdata)

Frees a dyad_metadata_t object allocated by dyad_get_metadata().

Releases all memory associated with mdata, including the internal file path string, and sets *mdata to NULL. If mdata or *mdata is NULL, the function returns DYAD_RC_OK without taking any action.

Parameters:

mdata[in] Address of the dyad_metadata_t pointer to free. Set to NULL on return.

Return values:

DYAD_RC_OK – The metadata was successfully freed, or was already NULL.

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_consume(dyad_ctx_t *ctx, const char *fname)

Ensures a file under a DYAD-managed directory is ready to be read.

If fname falls under the consumer-managed path, this function ensures the file is fully available before the caller proceeds to read it.

If fname is not under the consumer-managed path, the function returns DYAD_RC_OK immediately without taking any action.

The behavior depends on the underlying storage configuration:

  • Shared storage (e.g., a parallel or network filesystem visible across multiple nodes): The file is visible to all consumers if it exists. An exclusive lock is acquired to synchronize access. A file size of zero indicates the producer has not yet written the file, so the lock is released and the function waits for the file to be published via the Flux KVS before returning. No data transfer is performed since the file is directly accessible once available.

  • Node-local storage: The file being empty means it is either not yet produced or not yet visible locally. While holding the exclusive lock, the function waits for the file to be published via the Flux KVS. Since the producer does not participate in locking, it may write the file during this wait. When metadata is returned by dyad_fetch_metadata(), if the file has local visibility (e.g., the producer is on the same node and has already written the file), dyad_fetch_metadata() frees and nulls the metadata object to signal that no remote transfer is needed. Otherwise, the file data is retrieved from the remote producer’s Flux broker via dyad_get_data(), written to local disk, and the lock is released.

    • An exclusive lock is acquired to serve two purposes. First, the producer acquires the lock in dyad_open_wrapper() to prevent consumers with direct file visibility (e.g. via shared storage or co-location on the same node) from reading a partially written file, and releases it in dyad_close_wrapper() once the file is fully written. Second, the consumer acquires the lock here to ensure that only one consumer performs the data fetch at a time, with other consumers blocking until the lock is released. Because POSIX fcntl locks are cooperative, these guarantees only hold between processes that also participate in locking.

Note

This function temporarily sets ctx->reenter to false during execution to prevent re-entrant interception, restoring it to true before returning.

Warning

The caller must ensure ctx remains valid for the duration of this call.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL and must have a valid cons_managed_path set.

  • fname[in] Path to the file to be checked and made ready. May be an absolute path or, if ctx->relative_to_managed_path is set, a path relative to ctx->cons_managed_path.

Return values:
  • DYAD_RC_OK – The file is ready to read, was already available, or is not under the managed path (no action needed).

  • DYAD_RC_NOCTX – The context ctx or its Flux handle is NULL.

  • DYAD_RC_BADMANAGEDPATH – The consumer-managed path in the context is NULL.

  • DYAD_RC_BADFIO – A file I/O error occurred (open or close failed).

  • DYAD_RC_* – Any error code propagated from dyad_excl_flock(), dyad_fetch_metadata(), dyad_get_data(), or dyad_cons_store().

Returns:

dyad_rc_t return code indicating the outcome:

dyad_rc_t dyad_consume_w_metadata(dyad_ctx_t *ctx, const char *fname, const dyad_metadata_t *mdata)

Ensures a file is ready to be read using caller-supplied metadata.

This is a variant of dyad_consume() intended for use by the Python API, where metadata is managed manually by the caller. Instead of resolving the file path and consulting the Flux KVS itself, this function accepts a pre-populated dyad_metadata_t object obtained from a prior call to dyad_get_metadata().

If mdata is NULL, the file is assumed to be already available locally and the function returns DYAD_RC_OK without taking any action. Otherwise, if the file is empty (not yet fetched), the function retrieves the file data from the producer’s Flux broker via dyad_get_data() and writes it to local disk via dyad_cons_store().

As with dyad_consume(), an exclusive lock is acquired for consumer-to-consumer synchronization. Because POSIX fcntl locks are cooperative, the producer — which does not acquire any lock — is unaffected and may write the file regardless of any consumer lock held.

The typical usage pattern from Python is:

  1. Call dyad_get_metadata() to obtain mdata, optionally waiting for the producer to publish the file.

  2. Pass mdata to this function to ensure the file is locally available.

  3. Read the file.

  4. Free mdata via dyad_free_metadata().

Note

This function temporarily sets ctx->reenter to false during execution to prevent re-entrant interception, restoring it to true before returning.

Warning

The caller must ensure ctx remains valid for the duration of this call.

Warning

Unlike dyad_consume(), this function does not support shared storage; it always attempts to fetch and store the file locally when the file is empty and mdata is non-NULL.

Parameters:
  • ctx[in] Pointer to the DYAD context. Must not be NULL and must have a valid cons_managed_path set.

  • fname[in] Path to the file to be checked and made ready. Must not be NULL.

  • mdata[in] Metadata for the file, previously obtained via dyad_get_metadata() or manually constructed by the caller. If NULL, the file is assumed to be locally available and no transfer is performed. The caller retains ownership and is responsible for freeing this object after this function returns.

Return values:
  • DYAD_RC_OK – The file is ready to read, or mdata was NULL indicating the file is already local.

  • DYAD_RC_NOCTX – The context ctx or its Flux handle is NULL.

  • DYAD_RC_BADMANAGEDPATH – The consumer-managed path in the context is NULL.

  • DYAD_RC_BADFIO – A file I/O error occurred (open or close failed).

  • DYAD_RC_* – Any error code propagated from dyad_excl_flock(), dyad_get_data(), or dyad_cons_store().

Returns:

dyad_rc_t return code indicating the outcome:

int dyad_sync_directory(dyad_ctx_t *ctx, const char *path)

Synchronizes the parent directory of a file to ensure its entry is durably written to storage.

Opens the parent directory of path and calls fsync() on it to flush any pending directory entry updates to stable storage. This is necessary after creating a new file to guarantee that the directory entry is durable and visible to other processes, even in the event of a system crash. See https://lwn.net/Articles/457671/ for background.

This is particularly important in the DYAD producer path — calling dyad_commit() to publish a file to the KVS before the directory entry is synced could allow a consumer to attempt to open the file before it is visible in the directory.

ctx->reenter is saved and restored around the directory open and sync operations to prevent re-entrant interception of the open() call.

Note

All three operations (open, fsync, close) are attempted even if one fails — the return code reflects whether any of them failed, but does not distinguish which one.

Parameters:
  • ctx[in] Pointer to the DYAD context. May be NULL, in which case the reenter guard is skipped. If non-NULL, ctx->reenter is temporarily set to false during the operation.

  • path[in] Path to the file whose parent directory is to be synced. Must not be NULL. The parent directory is derived via dirname().

Return values:
  • 0 – The parent directory was successfully opened, synced, and closed.

  • -1 – The parent directory could not be opened, fsync() failed, or close() failed.

Returns:

int