Utility API
Functions
-
uint32_t hash_str(const char *str, const uint32_t seed)
Computes a non-zero MurmurHash3 hash of a string.
Hashes str using MurmurHash3_x64_128 with the given seed, then reduces the four 32-bit output words via XOR and adds 1 to ensure the result is never zero. This allows zero to be used unambiguously as an error sentinel by the caller.
If str is shorter than 128 bytes, it is padded with '@' characters to 128 bytes before hashing to improve hash distribution for short strings, consistent with the padding used in gen_path_key().
- Parameters:
str – [in] Null-terminated string to hash. If NULL or empty, returns 0.
seed – [in] Seed value for MurmurHash3.
- Return values:
0 – str is NULL or empty.
non-zero – The computed hash value.
- Returns:
uint32_t
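The padding and word-folding steps described for hash_str() can be sketched in isolation. This is a minimal illustration, not the library's code: pad_short_key() and fold_words() are hypothetical names, and the MurmurHash3 call itself is left out, so the four words are supplied directly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAD_LEN 128 /* padding target stated in the text */

/* Copy s into out and fill the remainder with '@'.
 * Returns the number of bytes to hash: PAD_LEN for short input,
 * otherwise strlen(s), in which case out is left untouched. */
static size_t pad_short_key(const char *s, char out[PAD_LEN])
{
    assert(s != NULL && out != NULL);
    size_t n = strlen(s);
    if (n >= PAD_LEN)
        return n;
    memcpy(out, s, n);
    memset(out + n, '@', PAD_LEN - n);
    return PAD_LEN;
}

/* Fold four 32-bit hash words into one by XOR, then add 1. */
static uint32_t fold_words(const uint32_t w[4])
{
    return (w[0] ^ w[1] ^ w[2] ^ w[3]) + 1u;
}
```

In this sketch the addition wraps to 0 when the XOR of the four words is 0xFFFFFFFF. Whether the real implementation guards against that case is not stated in the text.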
-
uint32_t hash_path_prefix(const char *str, const uint32_t seed, const size_t len)
Computes a non-zero MurmurHash3 hash of the first len bytes of a string.
Hashes only the first len bytes of str using MurmurHash3_x64_128 with the given seed, then reduces the four 32-bit output words via XOR and adds 1 to ensure the result is never zero. This allows zero to be used unambiguously as an error sentinel by the caller.
If len is less than 128, the prefix is padded with '@' characters to 128 bytes before hashing to improve hash distribution for short prefixes, consistent with the padding used in hash_str() and gen_path_key().
If str is shorter than len bytes, the function returns 0 since the requested prefix length exceeds the actual string length.
- Parameters:
str – [in] Null-terminated string whose prefix is to be hashed. If NULL, returns 0.
seed – [in] Seed value for MurmurHash3.
len – [in] Number of bytes to hash from the start of str. If 0 or greater than strlen(str), returns 0.
- Return values:
0 – str is NULL, len is 0, or len exceeds the length of str.
non-zero – The computed hash value of the first len bytes.
- Returns:
uint32_t
-
char *concat_str(char *str, const char *to_append, const char *connector, size_t str_capacity)
Appends a string to an existing buffer, joining them with a connector.
Concatenates connector and to_append onto the end of str in-place, producing "str + connector + to_append". The result is written back into the str buffer.
If str already ends with connector, the trailing connector is stripped before appending to avoid duplicating it. For example, concatenating "foo/" with connector "/" and to_append "bar" produces "foo/bar" rather than "foo//bar".
The operation is performed via an intermediate heap-allocated buffer to safely handle the in-place update of str. If str, to_append, or connector overlap in memory, the function returns NULL without modifying str.
Note
This function allocates a temporary heap buffer internally for the concatenation and frees it before returning.
- Parameters:
str – [inout] Null-terminated string to append to. Also serves as the output buffer. Must not be NULL and must be at least str_capacity bytes in size.
to_append – [in] Null-terminated string to append. Must not be NULL and must not overlap with str.
connector – [in] Null-terminated string to insert between str and to_append (e.g. "/"). Must not overlap with str.
str_capacity – [in] Total size of the str buffer in bytes. The combined result must fit within this capacity including the null terminator.
- Return values:
str – The operation succeeded and str now contains the concatenated result.
NULL – The combined result would exceed str_capacity, or str, to_append, or connector overlap in memory.
- Returns:
char*
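The behaviour described for concat_str() (strip a trailing connector, check capacity, build the result in a temporary heap buffer) can be sketched as follows. concat_sketch() is a hypothetical stand-in written from the description above; the overlap checks are omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for concat_str(); overlap checks omitted. */
static char *concat_sketch(char *str, const char *to_append,
                           const char *connector, size_t str_capacity)
{
    if (str == NULL || to_append == NULL || connector == NULL)
        return NULL;

    size_t slen = strlen(str);
    size_t clen = strlen(connector);
    size_t alen = strlen(to_append);

    /* Drop a trailing connector so it is not duplicated. */
    if (clen > 0 && slen >= clen
        && memcmp(str + slen - clen, connector, clen) == 0)
        slen -= clen;

    size_t total = slen + clen + alen + 1; /* +1 for the terminator */
    if (total > str_capacity)
        return NULL;                       /* str is left unmodified */

    char *tmp = malloc(total);
    if (tmp == NULL)
        return NULL;
    memcpy(tmp, str, slen);
    memcpy(tmp + slen, connector, clen);
    memcpy(tmp + slen + clen, to_append, alen + 1);

    assert(total <= str_capacity);
    memcpy(str, tmp, total);               /* write back in one step */
    free(tmp);
    return str;
}
```

With a 16-byte buffer holding "foo/", calling concat_sketch(buf, "bar", "/", sizeof buf) leaves "foo/bar" in buf.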
-
bool extract_user_path(const char *prefix, const char *full, const char *delim, char *upath, const size_t upath_capacity)
Extracts the path component following a managed directory prefix.
Checks whether full begins with prefix (separated by delim) and, if so, extracts the portion of full that follows the prefix and delimiter into upath. This is used to derive the path of a file relative to a DYAD-managed directory from its absolute path.
For example, with prefix "/managed", delim "/", and full "/managed/subdir/file.txt", the extracted upath would be "subdir/file.txt".
Each of the following conditions causes the function to return false without modifying upath:
upath is NULL.
prefix, full, or delim overlaps with the upath buffer.
full does not begin with prefix.
Any path argument exceeds PATH_MAX bytes.
full is equal to prefix with no user path component following it.
The delimiter is not present between prefix and the user path in full (e.g. "/managed_other/file" does not match prefix "/managed").
The extracted user path would exceed upath_capacity bytes including the null terminator.
If prefix itself ends with delim, the trailing delimiter is stripped before matching to avoid requiring a double delimiter between the prefix and the user path.
Note
upath is not explicitly null-terminated by this function. Callers should zero-initialize the buffer before calling to ensure the result is null-terminated.
- Parameters:
prefix – [in] Null-terminated managed directory path to match against the start of full. Must not be NULL.
full – [in] Null-terminated absolute file path to extract from. Must not be NULL and must not overlap with upath.
delim – [in] Null-terminated path delimiter string (e.g. "/"). If NULL, treated as an empty string.
upath – [out] Buffer to receive the extracted relative path. Must not be NULL or overlap with any other argument. Not null-terminated by this function; the caller should ensure the buffer is zeroed before calling.
upath_capacity – [in] Size of the upath buffer in bytes. The extracted path must fit within this capacity including a null terminator.
- Return values:
true – full begins with prefix and the relative path was successfully extracted into upath.
false – One of the failure conditions listed above was met. upath is not modified.
- Returns:
bool
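The matching rules above can be sketched in a few lines. extract_sketch() is a hypothetical stand-in, not the library function: it omits the PATH_MAX and overlap checks and, unlike the documented function, copies the null terminator itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for extract_user_path(). */
static bool extract_sketch(const char *prefix, const char *full,
                           const char *delim, char *upath, size_t cap)
{
    if (prefix == NULL || full == NULL || upath == NULL)
        return false;
    if (delim == NULL)
        delim = "";

    size_t plen = strlen(prefix);
    size_t dlen = strlen(delim);

    /* Ignore a trailing delimiter on the prefix. */
    if (dlen > 0 && plen >= dlen
        && memcmp(prefix + plen - dlen, delim, dlen) == 0)
        plen -= dlen;

    if (strncmp(full, prefix, plen) != 0)
        return false;              /* full does not begin with prefix */
    if (strncmp(full + plen, delim, dlen) != 0)
        return false;              /* e.g. "/managed_other/file" */

    const char *rest = full + plen + dlen;
    size_t rlen = strlen(rest);
    if (rlen == 0 || rlen + 1 > cap)
        return false;              /* nothing to extract, or too long */

    assert(rlen + 1 <= cap);
    memcpy(upath, rest, rlen + 1); /* the sketch copies the terminator */
    return true;
}
```

Checking the delimiter separately from the prefix is what rejects "/managed_other/file" even though its first eight bytes equal "/managed".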
-
bool cmp_canonical_path_prefix(const dyad_ctx_t *ctx, const bool is_prod, const char *path, char *upath, const size_t upath_capacity)
Checks whether a path falls under a DYAD-managed directory and extracts its relative component.
Determines if path is under the DYAD-managed directory for either the producer (is_prod is true) or consumer (is_prod is false), and if so, extracts the portion of path following the managed prefix into upath.
To handle symlinks and non-canonical paths, the check is attempted in up to four passes before returning false:
1. Hash and match path against the managed path prefix.
2. Hash and match path against the canonical (real) managed path prefix, if one is available (can_prefix_len > 0).
3. Resolve path to its canonical form via realpath(), then hash and match the result against the managed path prefix.
4. Hash and match the canonical form of path against the canonical managed path prefix.
Each pass first compares a hash of the appropriate prefix length of the path against the pre-computed prefix hash stored in the context, and only calls extract_user_path() on a hash match. This avoids the cost of full string comparison for paths that clearly do not match. upath is populated by the first passing match and the function returns immediately without attempting further passes.
Note
The function assumes that the prefix lengths (prod_managed_len, cons_managed_len, etc.) and pre-computed hashes stored in ctx are accurate and consistent with the corresponding path strings. No internal validation of these values is performed.
Note
Hash collisions between an unrelated path and a managed prefix will cause extract_user_path() to be called unnecessarily, but the full string comparison inside extract_user_path() will correctly reject the mismatch.
Note
This function only works correctly when no file is reachable through more than one absolute path via hard links.
- Parameters:
ctx – [in] Pointer to the DYAD context. Must not be NULL. Provides the managed path, its canonical form, their lengths, and their pre-computed hashes for both producer and consumer sides.
is_prod – [in] If true, match against the producer-managed path (ctx->prod_managed_path). If false, match against the consumer-managed path (ctx->cons_managed_path).
path – [in] Null-terminated path to check. May be a symlink or non-canonical path; realpath() is used as a fallback if direct matching fails.
upath – [out] Buffer to receive the relative path component following the managed prefix. Should be zero-initialized by the caller. Not explicitly null-terminated by this function.
upath_capacity – [in] Size of the upath buffer in bytes.
- Return values:
true – path (or its canonical form) is under the managed directory and the relative component has been written to upath.
false – ctx is NULL, path does not fall under the managed directory under any of the four matching passes, or realpath() failed when resolving path.
- Returns:
bool
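A single matching pass, with its hash comparison acting as a cheap filter before the string comparison, might look like the sketch below. The assumptions are loud ones: FNV-1a stands in for MurmurHash3, struct prefix_rec is a hypothetical record rather than the real dyad_ctx_t layout, and the delimiter is fixed to '/'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a over the first len bytes: a stand-in, NOT MurmurHash3. */
static uint32_t fnv1a(const char *s, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical pre-computed prefix record (path, length, hash). */
struct prefix_rec {
    const char *path;
    size_t len;
    uint32_t hash;
};

static struct prefix_rec make_prefix(const char *path)
{
    assert(path != NULL);
    struct prefix_rec r = { path, strlen(path), 0 };
    r.hash = fnv1a(path, r.len);
    return r;
}

/* One pass: hash comparison first, string comparison only when the
 * hashes agree. */
static bool pass_matches(const struct prefix_rec *p, const char *path)
{
    if (strlen(path) < p->len)
        return false;
    if (fnv1a(path, p->len) != p->hash)
        return false;                  /* fast reject */
    return strncmp(path, p->path, p->len) == 0
           && path[p->len] == '/';     /* confirm; rejects collisions */
}
```

The final comparison is what keeps a hash collision, or a sibling such as "/managed_other", from producing a false match.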
-
int mkpath(const char *dir, const mode_t m)
Recursively creates a directory and all missing parent directories.
Creates dir and any intermediate parent directories that do not yet exist, similar to mkdir -p. If dir already exists, returns 0 immediately without error.
The implementation recurses up the directory tree via dirname() until it reaches a directory that already exists, then creates each missing component on the way back down. strdupa() is used to duplicate the path before passing it to dirname() since dirname() may modify its argument in place.
See https://stackoverflow.com/questions/2336242/recursive-mkdir-system-call-on-unix for the basis of this implementation.
Note
The permission mode m is applied to each directory created during the recursive descent. The effective permissions may differ from m depending on the process umask.
Note
This function uses strdupa() which allocates on the stack. Deep directory hierarchies or very long paths may cause stack overflow.
Warning
Return codes from intermediate mkdir() calls during recursion are not checked. Only the return value of the final mkdir() for dir itself is returned to the caller.
- Parameters:
dir – [in] Null-terminated path of the directory to create. Must not be NULL. If NULL, sets errno to EINVAL and returns 1.
m – [in] Permission mode bits to apply to each newly created directory, passed directly to mkdir().
- Return values:
0 – dir already exists or was successfully created along with all required parent directories.
1 – dir is NULL (errno set to EINVAL).
non-zero – The return value of mkdir() for the final directory component if creation failed, with errno set by mkdir().
- Returns:
int
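The recursion described for mkpath() can be sketched as below. mkpath_sketch() is a hypothetical variant, not the documented implementation: it duplicates the path with heap-allocated strdup() rather than stack-allocated strdupa(), which avoids the stack-growth concern at the cost of a malloc per level.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical mkdir -p sketch following the description above. */
static int mkpath_sketch(const char *dir, mode_t m)
{
    struct stat sb;

    if (dir == NULL) {
        errno = EINVAL;
        return 1;
    }
    if (stat(dir, &sb) == 0)
        return 0;                     /* already exists */

    char *copy = strdup(dir);         /* dirname() may modify its argument */
    if (copy == NULL)
        return -1;
    char *parent = dirname(copy);
    assert(parent != NULL);
    (void)mkpath_sketch(parent, m);   /* parents first; result ignored */
    free(copy);

    return mkdir(dir, m);             /* only this result reaches the caller */
}
```

As in the documented function, a failure while creating a parent only surfaces indirectly, when the final mkdir() fails.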
-
int mkdir_as_needed(const char *path, const mode_t m)
Creates a directory and all missing parent directories, with existence and permission checks.
Creates path and any missing intermediate parent directories using mkpath(). Before attempting creation, checks whether path already exists and validates that it is a directory with the expected permission bits. The same checks are repeated after mkpath() returns a non-zero value, since a concurrent process may have created the directory in the interim.
The process umask is temporarily set to 0 during directory creation to ensure that the permission bits specified by m are applied exactly as requested. The original umask is restored after mkpath() returns.
If DYAD_SYNC_DIR is defined at compile time, the parent directory of path is synced via sync_containing_dir() after successful creation to ensure the new directory entry is durable on storage.
Note
The umask is restored to its original value after mkpath() returns, but is not restored if mkpath() is interrupted abnormally.
Note
Return code 5 is not an error in the strict sense, since the directory is usable, but callers may wish to log or handle the permission mismatch depending on their security requirements.
Warning
This function calls perror() directly on mkpath() failure, which writes to stderr. Callers that manage their own error output should be aware of this side effect.
- Parameters:
path – [in] Null-terminated path of the directory to create. Must not be NULL or empty.
m – [in] Permission mode bits to apply to newly created directories. The umask is set to 0 during creation so these bits are applied exactly.
- Return values:
0 – The directory was successfully created.
1 – The directory already exists with the requested permissions.
5 – The directory already exists but with different permission bits.
-1 – mkpath() failed and the directory does not exist afterward.
-2 – path already exists but is not a directory.
-3 – path is NULL or empty.
-4 – mkpath() failed but a subsequent stat() found path exists as a non-directory entry.
- Returns:
int
-
int get_path(const int fd, const size_t max_size, char *path)
Resolves the file path associated with an open file descriptor.
Reads the symbolic link /proc/self/fd/<fd> via readlink() to obtain the path of the file currently open on fd, and writes the result into path. This is a Linux-specific mechanism and requires /proc to be mounted.
path is zero-initialized up to max_size + 1 bytes before the readlink() call. If readlink() returns exactly max_size bytes, a truncation warning is logged since the path may have been silently truncated.
Note
If readlink() returns exactly max_size bytes, the path may have been truncated. A debug message is logged but the function still returns 0. Callers that require exact paths should use a buffer of at least PATH_MAX + 1 bytes.
Note
This function relies on /proc/self/fd/, which is Linux-specific and requires /proc to be mounted.
Warning
There is an off-by-one issue in the null terminator placement: path[max_size + 1] is written rather than path[max_size], which writes one byte past the end of a max_size + 1 sized buffer. The path buffer should be at least max_size + 2 bytes to avoid a buffer overwrite.
- Parameters:
fd – [in] Open file descriptor whose path is to be resolved.
max_size – [in] Maximum number of bytes to write into path, excluding the null terminator. Must be at least 1. The path buffer is intended to be max_size + 1 bytes to accommodate the null terminator, but see the warning above: allocate at least max_size + 2 bytes.
path – [out] Buffer to receive the resolved path. Zero-initialized by this function up to max_size + 1 bytes before the readlink() call.
- Return values:
0 – The path was successfully resolved and written to path.
-1 – max_size is less than 1, or readlink() failed (errno set by readlink()).
- Returns:
int
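For comparison, the readlink() approach can be written so that the terminator always lands inside a max_size + 1 byte buffer. get_path_sketch() is a hypothetical variant written from the description above, not the library's code; it is Linux-specific and omits the truncation logging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical variant of get_path(): path must hold max_size + 1 bytes. */
static int get_path_sketch(int fd, size_t max_size, char *path)
{
    char link[64];

    if (max_size < 1 || path == NULL)
        return -1;
    memset(path, 0, max_size + 1);
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);

    ssize_t n = readlink(link, path, max_size);
    if (n < 0)
        return -1;                 /* errno set by readlink() */
    assert((size_t)n <= max_size);
    path[n] = '\0';                /* index n <= max_size: in bounds */
    return 0;
}
```

readlink() does not null-terminate its output, so terminating at the returned length n is the conventional pattern.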
-
bool is_path_dir(const char *path)
Checks whether the path is a directory.
-
bool is_fd_dir(int fd)
Checks whether an open file descriptor refers to a directory.
- Parameters:
fd – [in] File descriptor to check. If negative, returns false immediately without calling fstat().
- Return values:
true – fd is a valid open file descriptor referring to a directory.
false – fd is negative, fstat() failed, or fd does not refer to a directory.
- Returns:
bool
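The check described for is_fd_dir() reduces to an fstat() call and the S_ISDIR macro. The sketch below is an illustration under assumptions: both function names are hypothetical, and the path-based wrapper is only one way to build such a check (the documented is_path_dir() may well use stat() instead).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for is_fd_dir(). */
static bool is_fd_dir_sketch(int fd)
{
    struct stat sb;

    if (fd < 0)
        return false;            /* skip fstat() entirely */
    if (fstat(fd, &sb) != 0)
        return false;
    return S_ISDIR(sb.st_mode);
}

/* Hypothetical path-based wrapper, for illustration only. */
static bool is_path_dir_sketch(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return false;
    bool result = is_fd_dir_sketch(fd);
    close(fd);
    return result;
}
```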
-
ssize_t get_file_size(int fd)
Returns the size of an open file in bytes.
Calls fstat() on fd to obtain the file size from the file's stat structure. Does not modify the file position and works on any file descriptor for which fstat() is supported.
Note
On a return value of 0, errno can be checked to distinguish between an empty file and an fstat() failure. If fstat() failed, errno is set to one of:
EBADF: fd is not a valid open file descriptor.
EFAULT: The stat buffer address is invalid (internal error).
EIO: An I/O error occurred while reading file metadata.
- Parameters:
fd – [in] Open file descriptor to measure.
- Return values:
>=0 – The size of the file in bytes. A value of 0 means the file is empty or fstat() failed.
- Returns:
ssize_t
-
dyad_rc_t dyad_excl_flock(const dyad_ctx_t *ctx, int fd, struct flock *lock)
Acquires an exclusive (write) lock on an open file descriptor.
Sets a POSIX write lock (F_WRLCK) over the entire file using fcntl() with F_SETLKW, blocking the caller until the lock is acquired. This prevents other processes from acquiring any lock (shared or exclusive) on the file until the lock is released via dyad_release_flock().
If lock is NULL, the function returns without taking any action.
- Parameters:
ctx – [in] DYAD context.
fd – [in] File descriptor of the open file to lock.
lock – [out] Pointer to a flock structure populated by this function. Must not be NULL. The structure is used for subsequent unlock calls via dyad_release_flock().
- Return values:
DYAD_RC_OK – The lock was successfully acquired.
DYAD_RC_BADFIO – The fcntl() call failed to acquire the lock.
- Returns:
dyad_rc_t Return code indicating the outcome (see the return values above).
-
dyad_rc_t dyad_shared_flock(const dyad_ctx_t *ctx, int fd, struct flock *lock)
Acquires a shared (read) lock on an open file descriptor.
Sets a POSIX read lock (F_RDLCK) over the entire file using fcntl() with F_SETLKW, blocking the caller until the lock is acquired. Multiple consumers holding shared locks on the same file may coexist, but a shared lock cannot be acquired while an exclusive lock is held, and vice versa.
If lock is NULL, the function returns without taking any action.
- Parameters:
ctx – [in] DYAD context.
fd – [in] File descriptor of the open file to lock.
lock – [out] Pointer to a flock structure populated by this function. Must not be NULL. The structure is used for subsequent unlock calls via dyad_release_flock().
- Return values:
DYAD_RC_OK – The shared lock was successfully acquired.
DYAD_RC_BADFIO – The fcntl() call failed to acquire the lock.
- Returns:
dyad_rc_t Return code indicating the outcome (see the return values above).
-
dyad_rc_t dyad_release_flock(const dyad_ctx_t *ctx, int fd, struct flock *lock)
Releases a lock previously acquired on an open file descriptor.
Clears a POSIX lock (F_UNLCK) over the entire file using fcntl() with F_SETLKW, releasing any lock (exclusive or shared) previously set by dyad_excl_flock() or dyad_shared_flock(). Other processes blocked on a lock acquisition for this file will be allowed to proceed.
If lock is NULL, the function returns without taking any action.
- Parameters:
ctx – [in] DYAD context.
fd – [in] File descriptor of the open file to unlock.
lock – [inout] Pointer to the flock structure previously populated by dyad_excl_flock() or dyad_shared_flock(). Must not be NULL.
- Return values:
DYAD_RC_OK – The lock was successfully released.
DYAD_RC_BADFIO – The fcntl() call failed to release the lock.
- Returns:
dyad_rc_t Return code indicating the outcome (see the return values above).
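The locking functions above share one underlying pattern: fill a struct flock that covers the whole file and pass it to fcntl() with F_SETLKW. A minimal sketch of that pattern follows. whole_file_lock() is a hypothetical helper, not part of the DYAD API, and it returns the raw fcntl() result rather than a dyad_rc_t.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Apply a whole-file lock of the given type (F_WRLCK, F_RDLCK, or
 * F_UNLCK), blocking until it is granted. Returns 0 on success or -1
 * on failure with errno set by fcntl(). */
static int whole_file_lock(int fd, short type, struct flock *lock)
{
    assert(lock != NULL);
    memset(lock, 0, sizeof *lock);
    lock->l_type = type;
    lock->l_whence = SEEK_SET;
    lock->l_start = 0;
    lock->l_len = 0;          /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, lock);
}
```

A write lock requires fd to be open for writing and a read lock requires it to be open for reading, so a file opened O_RDWR works with both.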
-
int sync_containing_dir(const char *path)
Runs fsync() on the directory containing the given path. For example, if path is "/a/b", fsync() is called on "/a". This cannot be used with DYAD interception.
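Syncing a containing directory generally means opening the parent with open() and calling fsync() on that descriptor. The sketch below shows one way to do it. sync_parent_sketch() is a hypothetical name and the error handling is an assumption, since the text does not describe the return values of sync_containing_dir().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch: fsync() the parent directory of path.
 * Returns 0 on success, -1 on failure. */
static int sync_parent_sketch(const char *path)
{
    assert(path != NULL);
    char *copy = strdup(path);     /* dirname() may modify its argument */
    if (copy == NULL)
        return -1;
    int fd = open(dirname(copy), O_RDONLY);
    free(copy);
    if (fd < 0)
        return -1;
    int rc = fsync(fd);            /* flush the directory entry itself */
    close(fd);
    return rc;
}
```

For "/a/b", dirname() yields "/a", matching the example in the text.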