Fixed-sized Arrays
Who doesn't like a tuple with 1024 elements?
Let's say we want to statfs() a mount point to determine which BSD device name it belongs to. In English, that means discovering that /Volumes/MyDisk comes from /dev/disk6s2.
struct statfs fsinfo;
if (statfs(path, &fsinfo) != 0) {
    // error: statfs() returned -1 and set errno
}
The equivalent Swift code, along with a POSIX error helper:
func posix_expects_zero<R: BinaryInteger>(_ f: @autoclosure () throws -> R) throws {
    let returncode = try f()
    // Calls like statfs() return -1 on failure and leave the real
    // error code in errno, so capture errno before it can change.
    let savedErrno = errno
    if returncode != 0 {
        // Substitute a custom error type here if you want,
        // use strerror(savedErrno) to get the message as a C-string.
        // NSError does that for us automatically.
        throw NSError(
            domain: NSPOSIXErrorDomain,
            code: Int(savedErrno),
            userInfo: nil)
    }
}
// Use the default empty initializer. Swift knows we mean the
// struct but I state it explicitly for clarity
var fsinfo: statfs = statfs()
try posix_expects_zero(statfs(path, &fsinfo))
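As a rough sketch of the "custom error type" route mentioned in the comments, the snippet below reports errno and uses strerror for the message. The names POSIXFailure and expectZero are hypothetical and not part of the original helper, and access() on a made-up path stands in for statfs() so the failure is easy to trigger; statfs() follows the same convention of returning -1 and setting errno.

```swift
import Foundation

// Hypothetical error type (an assumption for illustration).
struct POSIXFailure: Error {
    let code: Int32
    // strerror returns a C string describing the errno value.
    var message: String { String(cString: strerror(code)) }
}

// Hypothetical helper for calls that return nonzero on failure and set errno.
func expectZero(_ call: @autoclosure () -> Int32) throws {
    let result = call()
    let savedErrno = errno   // capture immediately, before anything can overwrite it
    if result != 0 {
        throw POSIXFailure(code: savedErrno)
    }
}

var caughtCode: Int32? = nil
do {
    // This path is assumed not to exist, so access() should fail.
    try expectZero(access("/definitely/not/a/real/path", F_OK))
} catch let failure as POSIXFailure {
    caughtCode = failure.code
    print("failed:", failure.message)
} catch {
    print("unexpected error:", error)
}
```

The point of the design is that the thrown error carries errno, not the function's return value, which is just -1 for this family of calls.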
C Imports
The call is defined in C as int statfs(const char *path, struct statfs *info). The struct has several fields but we're only concerned with mount-from-name:
struct statfs {
    //...
    char f_mntfromname[MAXPATHLEN];
    //...
};
On Apple platforms MAXPATHLEN == PATH_MAX == 1024.
If you hard-code 1024 instead of using the appropriate #define my ghost will haunt you and your family for 12 generations.
Uh-oh. A fixed-size array. That means it imports in Swift as a tuple. A tuple with 1024 elements:
public var f_mntfromname: (Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, 
Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, 
Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, 
Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8, Int8)
This is not very useful. Can we do anything about that? Since this post exists you can probably guess the answer is yes.
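To make the awkwardness concrete, here is a small sketch that uses a 4-element tuple as a stand-in for an imported char[4] field. The typealias Name4 and the sample bytes are assumptions for illustration; the real field has 1024 elements. It relies on the observed behavior that a homogeneous tuple is stored contiguously, like the C array it was imported from.

```swift
// Stand-in for how a C `char name[4]` field would be imported
// (assumption: 4 elements instead of 1024, to keep this readable).
typealias Name4 = (Int8, Int8, Int8, Int8)

let name: Name4 = (100, 101, 118, 0)   // ASCII "dev" followed by a null byte

// Elements are only reachable by literal position (.0, .1, ...),
// not by a runtime index, so there is no way to loop over them directly.
let first = name.0

// The raw bytes of the tuple can still be copied out into an array.
let bytes = withUnsafeBytes(of: name) { Array($0) }
print(first, bytes)
```

Copying the raw bytes out is one way around the missing subscript; the rest of this post takes a pointer-based route instead.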
C Strings
This fixed-size array contains C characters. We don't know if the value is null-terminated because the documentation doesn't say. It infuriatingly hints that the f_fstypename field is null-terminated on 64-bit systems but is silent on the mount to/from names. They aren't directly defined in terms of PATH_MAX, which usually implies a null-terminated C string, but MAXPATHLEN is defined as PATH_MAX. Maybe we're supposed to infer something from that?
For fixed-size arrays, some C APIs still null-terminate (so the actual string can be at most sizeof(array) - 1 bytes long), while others are happy to fill the entire buffer (so the string can be sizeof(array) bytes long with no terminating null). This is the sort of pernicious stumbling block that makes programs appear to work fine and pass all tests, then hit a strange edge case: some new FizzyWizz Disk System™️ comes along, you encounter a BSD device name that is exactly 1024 characters long, and it turns out your program had an exploitable memory corruption bug.
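The two conventions can be illustrated with plain byte arrays. This is only a sketch: boundedString is a hypothetical helper, and the 6-byte buffers are made-up stand-ins for a fixed-size field. The key property is that it never reads past the end of the buffer, whether or not a terminator is present.

```swift
// Hypothetical helper: decode at most buffer.count bytes,
// stopping early at the first null byte if there is one.
func boundedString(from buffer: [UInt8]) -> String {
    let end = buffer.firstIndex(of: 0) ?? buffer.endIndex
    return String(decoding: buffer[..<end], as: UTF8.self)
}

// A 6-byte field holding "/dev" plus null padding (terminated case).
let terminated: [UInt8] = [0x2F, 0x64, 0x65, 0x76, 0x00, 0x00]

// A 6-byte field completely filled by "/dev/x" (no terminator at all).
let filled: [UInt8] = [0x2F, 0x64, 0x65, 0x76, 0x2F, 0x78]

print(boundedString(from: terminated))
print(boundedString(from: filled))
```

Note that String(decoding:as:) substitutes replacement characters for invalid UTF-8 instead of failing, which differs from the optional-returning initializers used later in this post.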
It is likely that the values are null-terminated but I'll show you how to handle it agnostically so this works with both cases. That means we never have to think about it again, reducing cognitive load. No one re-using this code in other contexts will have to think about it either. That's such a clear win why wouldn't we do it that way?
Anyway I digress...
A Partial Solution
The first thing we need is the offset of the field. The new MemoryLayout.offset(of:) method can give that to us: MemoryLayout<statfs>.offset(of: \Darwin.statfs.f_mntfromname)!. Since the struct and the function have the same name, we need to provide the fully-qualified name or we'll get an ambiguous reference error trying to construct the key path. The method returns an optional, but we can force-unwrap it because we know the key path is valid and refers to a stored field, which always has an offset.
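A minimal sketch of how MemoryLayout.offset(of:) behaves, using a hypothetical Swift struct named Record in place of an imported C struct. It shows that stored properties have an offset while a computed property yields nil, which is the case the force-unwrap would trip over.

```swift
// Hypothetical struct standing in for an imported C struct.
struct Record {
    var id: UInt32 = 0
    var flags: UInt32 = 0
    var name: (UInt8, UInt8, UInt8, UInt8) = (0, 0, 0, 0)
    // Computed property: it has no storage, so it has no offset.
    var doubledID: UInt32 { id * 2 }
}

let idOffset = MemoryLayout<Record>.offset(of: \Record.id)
let nameOffset = MemoryLayout<Record>.offset(of: \Record.name)
let computedOffset = MemoryLayout<Record>.offset(of: \Record.doubledID)
print(idOffset as Any, nameOffset as Any, computedOffset as Any)

// The offset locates the field's first byte inside the struct's raw storage.
var record = Record()
record.name = (7, 8, 9, 0)
let firstNameByte = withUnsafeBytes(of: record) { raw in raw[nameOffset!] }
print(firstNameByte)
```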
The memory layout offset plus withUnsafePointer lets us create a pointer to the start of the struct's field. With that we can create a string:
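return withUnsafePointer(to: fsinfo, { (ptr) -> String? in
    let offset = MemoryLayout<statfs>.offset(of: \Darwin.statfs.f_mntfromname)!
    let fieldPtr = (UnsafeRawPointer(ptr) + offset).assumingMemoryBound(to: UInt8.self)
    // Size of the fixed-size field in bytes
    let count = Int(MAXPATHLEN)
    if fieldPtr[count - 1] != 0 {
        // Last byte is in use: no terminator, so read exactly count bytes
        let data = Data(bytes: UnsafeRawPointer(fieldPtr), count: count)
        return String(data: data, encoding: .utf8)
    } else {
        // Last byte is null, so a terminator exists somewhere in the buffer
        return String(cString: fieldPtr)
    }
})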
As a quick check we look at the last byte of the buffer. If it is zero, the buffer is null-terminated and we take the C-string fast path. Otherwise we create the Data instance so we can use String's length-limited initializer. It is tempting to use Data(bytesNoCopy:count:deallocator:), but the String(data:encoding:) initializer does not promise to copy the underlying buffer and Data's contract doesn't either. Since this is expected to be the rare case we'll play it safe. (If it proved to be a performance problem I'd spend more time investigating alternatives.)
It is possible that a shorter string was written that is null-terminated but the remainder of the buffer was filled with junk. Swift forced us to zero-initialize the struct so the only way that happens is if the kernel is copying garbage into our address space. Kernels typically try to avoid leaking kernel memory to userspace so we ignore that case here since the only result would be taking the slow path. (There are endless ways to bikeshed getting those bytes into a string. I have no intention of covering them exhaustively).
Now let's make this an extension on statfs:
extension statfs {
    var mntfromname: String? {
        get {
            // Inside the extension the struct is self, not a local fsinfo
            return withUnsafePointer(to: self, { (ptr) -> String? in
                let offset = MemoryLayout<statfs>.offset(of: \Darwin.statfs.f_mntfromname)!
                let fieldPtr = (UnsafeRawPointer(ptr) + offset).assumingMemoryBound(to: UInt8.self)
                let count = Int(MAXPATHLEN)
                if fieldPtr[count - 1] != 0 {
                    let data = Data(bytes: UnsafeRawPointer(fieldPtr), count: count)
                    return String(data: data, encoding: .utf8)
                } else {
                    return String(cString: fieldPtr)
                }
            })
        }
    }
}
This works, but what if we want to handle other fields like f_mntonname? It seems a shame to repeat this code over and over. Making this code generic is fairly trivial: we can accept a key path and count, perform a few substitutions, and we're done:
func fixedArrayToString<T>(t: T, keyPath: PartialKeyPath<T>, count: Int) -> String? {
    return withUnsafePointer(to: t) { (ptr) -> String? in
        let offset = MemoryLayout<T>.offset(of: keyPath)!
        let fieldPtr = (UnsafeRawPointer(ptr) + offset).assumingMemoryBound(to: UInt8.self)
        if fieldPtr[count - 1] != 0 {
            let data = Data(bytes: UnsafeRawPointer(fieldPtr), count: count)
            return String(data: data, encoding: .utf8)
        } else {
            return String(cString: fieldPtr)
        }
    }
}
extension statfs {
    var mntfromname: String? {
        get {
            return fixedArrayToString(
                t: self,
                keyPath: \Darwin.statfs.f_mntfromname,
                count: Int(MAXPATHLEN))
        }
    }
    var mntonname: String? {
        get {
            return fixedArrayToString(
                t: self,
                keyPath: \Darwin.statfs.f_mntonname,
                count: Int(MAXPATHLEN))
        }
    }
}
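The generic approach can be exercised without a real statfs call. In the sketch below, Sample is a hypothetical struct with an 8-byte tuple field standing in for an imported char[8], and fixedFieldString is a renamed copy of the same technique (with a guard in place of the force-unwrap); neither name comes from the code above. It covers both the terminated and the completely filled case.

```swift
import Foundation

// Hypothetical stand-in for an imported C struct with a `char label[8]` field.
struct Sample {
    var id: Int32 = 0
    var label: (UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8) =
        (0, 0, 0, 0, 0, 0, 0, 0)
}

// Same technique as above: key path -> offset -> bounded read of the field.
func fixedFieldString<T>(_ value: T, keyPath: PartialKeyPath<T>, count: Int) -> String? {
    return withUnsafePointer(to: value) { ptr -> String? in
        guard let offset = MemoryLayout<T>.offset(of: keyPath) else { return nil }
        let fieldPtr = (UnsafeRawPointer(ptr) + offset).assumingMemoryBound(to: UInt8.self)
        if fieldPtr[count - 1] != 0 {
            // No terminator: copy exactly count bytes and decode those.
            let data = Data(bytes: UnsafeRawPointer(fieldPtr), count: count)
            return String(data: data, encoding: .utf8)
        } else {
            // Terminator present: the C-string initializer stops at it.
            return String(cString: fieldPtr)
        }
    }
}

var short = Sample()
short.label = (0x64, 0x69, 0x73, 0x6B, 0, 0, 0, 0)             // "disk" then padding
var full = Sample()
full.label = (0x64, 0x69, 0x73, 0x6B, 0x36, 0x73, 0x32, 0x78)  // "disk6s2x", no terminator

let shortName = fixedFieldString(short, keyPath: \Sample.label, count: 8)
let fullName = fixedFieldString(full, keyPath: \Sample.label, count: 8)
print(shortName ?? "nil", fullName ?? "nil")
```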
Conclusion
Now you know how to turn those N-element tuples into something useful.
This blog represents my own personal opinion and is not endorsed by my employer.