DYAD: Dynamic and Asynchronous Data Streamliner

DYAD is a synchronization and data movement tool for computational science workflows, built on top of Flux. It aims to provide the benefits of in situ and in transit tools (e.g., fine-grained synchronization between producer and consumer applications, and fast data access due to spatial locality) while relying on a file-based data abstraction to maximize portability and minimize the code changes that workflows require. More specifically, DYAD aims to overcome the following challenges associated with traditional shared-storage approaches and with modern in situ and in transit data movement approaches:

  • Lack of per-file object synchronization in shared-storage approaches

  • Poor temporal and spatial locality in shared-storage approaches

  • Poor performance for file metadata operations in shared-storage approaches (and possibly some in situ and in transit approaches)

  • Poor portability and mandatory code changes in in situ and in transit approaches

In resolving these challenges, DYAD aims to provide the following to users:

  • Good performance (comparable to in situ and in transit approaches) due to on-node or near-node temporary storage of data

  • Transparent per-file object synchronization between producer and consumer applications

  • Little to no code change in existing workflows to achieve the benefits above
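
To make the synchronization challenge concrete, the following is a minimal sketch of the manual coordination a file-based workflow typically needs on shared storage when no per-file synchronization is available. It is not DYAD code and does not use any DYAD or Flux API: the function names (`produce`, `consume`, `run_demo`), the write-then-rename convention, the file name, and the polling interval are all assumptions made for illustration. This polling boilerplate is the kind of hand-written logic that transparent per-file synchronization is meant to make unnecessary.

```python
import os
import tempfile
import threading
import time


def produce(path, payload):
    """Hypothetical producer: write to a temp file, then rename it into place
    so a reader never observes a partially written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(payload)
    os.replace(tmp_path, path)  # atomic rename within a single filesystem


def consume(path, timeout=5.0, poll_interval=0.01):
    """Hypothetical consumer: poll until the file appears, then read it.
    Raises TimeoutError if the file does not appear within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"{path} was not produced within {timeout} s")
        time.sleep(poll_interval)
    with open(path) as f:
        return f.read()


def run_demo():
    """Run the producer on a delayed thread while the consumer waits."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "step_0.dat")
        producer = threading.Timer(0.1, produce, args=(path, "frame 0"))
        producer.start()
        data = consume(path)  # blocks by polling until the file exists
        producer.join()
        return data


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the consumer pays for synchronization with repeated metadata checks (`os.path.exists`), which also illustrates why poor file metadata performance on shared storage is listed among the challenges.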
