Swift: Let's Query DNS
Interop in the real world
So far we've been going over some of the interop capabilities in Swift, but today I want to switch gears a bit and actually put this stuff to use. Along the way we'll discover some annoyances and take opportunities not available in Objective-C to remedy them.
Let's say we want to determine if a host is reachable. To do that, we can use the getaddrinfo() function. This function is agnostic to IPv4 vs. IPv6 and combines hostname lookup with service lookup, which usually saves us a lot of trouble.
func getaddrinfo(node: CString,
    service: CString,
    hints: ConstUnsafePointer<addrinfo>,
    result: UnsafePointer<UnsafePointer<addrinfo>>)
    -> Int32
Here we pass the node (host) name and service (name or port), along with an addrinfo struct to give some hints about what sort of address family and protocol we want. The result is a pointer to a pointer to another addrinfo struct; as we'll see a little later, this result is actually a linked list of structs we can traverse to enumerate all the responses.
Throughout this post I'm going to gloss over some of the details on the structs and parameters - check out the OS X man pages for the full explanation.
Making the Call
var hints = addrinfo(
    ai_flags: 0,
    ai_family: AF_UNSPEC,
    ai_socktype: SOCK_STREAM,
    ai_protocol: IPPROTO_TCP,
    ai_addrlen: 0,
    ai_canonname: nil,
    ai_addr: nil,
    ai_next: nil)
let host:CString = "www.apple.com"
let port:CString = "http" //could use "80" here
var result = UnsafePointer<addrinfo>.null()
let error = getaddrinfo(host, port, &hints, &result)
First we declare the hints:addrinfo struct. In Swift, we're required to initialize the struct before using it, so we must provide values for all the fields. The only ones we care about are the family as Unspecified (meaning IPv4 or IPv6), socket type (stream instead of datagram), and protocol TCP.
You might be tempted to declare hints with let, but if you try that you'll get an error trying to pass it to getaddrinfo. Why? That parameter is declared in C as const struct addrinfo * __restrict, which is a pointer to a constant struct addrinfo, so getaddrinfo promises not to modify it. The catch is on the Swift side: we hand the struct over with &hints, and Swift only lets us take the address of a mutable variable, so hints has to be a var.
Aside: __restrict declares that the pointer has no aliases, which allows a lot of optimizations that are normally impossible. Aliases are bad because they force the compiler to assume that in a function void doIt(int * a, int * b), a and b might happen to point to the same memory location.
Consider for(...) { *a = *a + *b; }. With aliasing, the compiler cannot place a and b in registers because writes to a might also be modifying b.
__restrict promises the compiler that this can't happen, allowing it to move both values into registers and write the register value back out to memory only when the loop exits.
It almost goes without saying that Swift is largely immune to this problem since we don't work with pointers directly most of the time; the default assumption is no aliasing.
Next we declare the host and service we are interested in as CString; Swift has automatic conversion via the StringLiteralConvertible protocol, though a quirk causes the generated associated type signature on CString to show up as a conversion from CString to CString, which makes no sense. In reality it means you can convert from String to CString.
The most interesting declaration is result. Why aren't we declaring an UnsafePointer<UnsafePointer<addrinfo>>? It turns out that would be redundant, cluttering up our code with indirection that isn't required. Instead we just pass a reference to result and Swift will automatically turn that into an UnsafePointer<T> for us.
Why use UnsafePointer<addrinfo>.null()? Because the function we are calling is going to allocate all the addrinfo structs for us. Calling UnsafePointer<addrinfo>.alloc(1) would allocate enough memory to hold one addrinfo struct and store that address, which getaddrinfo would promptly overwrite with a different address, leaking memory.
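The listings in this post target an early Swift beta, so here is a minimal sketch of the same call using current Swift spellings. It is an illustration under assumptions, not the code above: the helper name numericLookup is hypothetical, and the AI_NUMERICHOST flag is added only so the sketch runs without a network connection.

```swift
import Foundation

/// Hypothetical helper: calls getaddrinfo for a numeric host and returns its
/// status code (0 on success).
func numericLookup(host: String, service: String) -> Int32 {
    // addrinfo() zero-fills every field, so only the hints we care about are set.
    var hints = addrinfo()
    hints.ai_family = AF_UNSPEC               // IPv4 or IPv6
    hints.ai_protocol = Int32(IPPROTO_TCP)    // TCP only
    // Assumption for this sketch: skip DNS so the example works offline.
    hints.ai_flags = AI_NUMERICHOST

    // Starts out nil; getaddrinfo allocates the list and stores its head here.
    var result: UnsafeMutablePointer<addrinfo>? = nil
    let error = getaddrinfo(host, service, &hints, &result)
    if error == 0 {
        freeaddrinfo(result)                  // the caller owns the list
    }
    return error
}

let status = numericLookup(host: "127.0.0.1", service: "80")
print(status == 0 ? "lookup succeeded" : String(cString: gai_strerror(status)))
```

The result variable plays the same role as UnsafePointer<addrinfo>.null() above: it starts empty precisely because getaddrinfo does the allocating.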
Using the Results
if(error == 0) {
    for var res = result; res; res = res.memory.ai_next {
        if res.memory.ai_family == AF_INET {
            var buffer = UnsafePointer<CChar>.alloc(Int(INET_ADDRSTRLEN))
            if inet_ntop(res.memory.ai_family,
                res.memory.ai_addr, buffer, UInt32(INET_ADDRSTRLEN))
            {
                let ipAddress = String.fromCString(CString(buffer))
                println("IPv4 \(ipAddress) for host \(host):\(port)")
            }
            //release the scratch buffer so it doesn't leak
            buffer.dealloc(Int(INET_ADDRSTRLEN))
        }
        else if res.memory.ai_family == AF_INET6 {
            var buffer = UnsafePointer<Int8>.alloc(Int(INET6_ADDRSTRLEN))
            if inet_ntop(res.memory.ai_family,
                res.memory.ai_addr, buffer, UInt32(INET6_ADDRSTRLEN))
            {
                let ipAddress = String.fromCString(CString(buffer))
                println("IPv6 \(ipAddress) for host \(host):\(port)")
            }
            //release the scratch buffer so it doesn't leak
            buffer.dealloc(Int(INET6_ADDRSTRLEN))
        }
    }
    //free the chain
    freeaddrinfo(result)
}
First, we dutifully check the error code. Then we set up a for loop that assigns result to a new variable res, continues as long as res isn't null, and follows the next pointer in the linked list on each iteration. We're using the fact that UnsafePointer implements the LogicValue protocol to make the for loop look almost exactly like it would in C. This loop will just follow the linked list chain of pointers until it reaches the end. freeaddrinfo takes care of freeing the entire linked list, for which I am very thankful since doing it correctly would be a bit of a pain.
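For comparison, the same traversal can be sketched in current Swift, where pointers no longer act as booleans and an optional-binding loop takes the place of the C-style for. The function name addressFamilies is hypothetical, and the lookup is again restricted to numeric hosts (an assumption) so that it needs no network.

```swift
import Foundation

/// Hypothetical helper: returns the address family of every node in the
/// getaddrinfo result list for a numeric host.
func addressFamilies(ofNumericHost host: String) -> [Int32] {
    var hints = addrinfo()
    hints.ai_family = AF_UNSPEC
    hints.ai_protocol = Int32(IPPROTO_TCP)
    hints.ai_flags = AI_NUMERICHOST           // assumption: stay offline

    var head: UnsafeMutablePointer<addrinfo>? = nil
    guard getaddrinfo(host, "80", &hints, &head) == 0 else { return [] }
    defer { freeaddrinfo(head) }              // frees the whole chain at once

    var families: [Int32] = []
    var node = head
    // Follow ai_next until it is nil, just like the for loop above.
    while let current = node {
        families.append(current.pointee.ai_family)
        node = current.pointee.ai_next
    }
    return families
}

print(addressFamilies(ofNumericHost: "127.0.0.1"))
```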
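You'll notice we use UnsafePointer<T>.memory to dereference the pointer and access the underlying struct. We want to use the inet_ntop function to convert from network representation to presentation. In normal human words, we want to convert from network byte order binary to a readable string. Strictly speaking, inet_ntop expects a pointer to the in_addr or in6_addr embedded in the socket address, not to the sockaddr itself, so a fully correct call needs the kind of pointer cast covered under Casting Pointers.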
func inet_ntop(family: Int32,
    addrInfoStruct: ConstUnsafePointer<()>,
    destBuffer: UnsafePointer<Int8>,
    bufferSize: socklen_t)
    -> CString
Now we have a problem; we need to give inet_ntop a destination buffer to write the string into. We have a few ways of accomplishing that in Swift. We could declare an [Int8] array or possibly use ContiguousArray<T>, but here I've chosen to use an UnsafePointer<CChar> to just directly allocate some memory. I don't think any of us really know what idiomatic Swift is or much about best practices just yet, so until the community settles on a preferred direction we'll just continue to wing it.
We have some annoying casting in there, and once we have a valid result we have to convert in the other direction to get a Swift String.
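The array alternative mentioned above can be sketched as follows in current Swift. This is an illustration rather than the approach used in this post: ipv4String is a hypothetical helper, and it takes an in_addr directly, since that is the struct inet_ntop reads.

```swift
import Foundation

/// Hypothetical helper: formats an IPv4 address (network byte order) as
/// dotted-decimal text, using an array as the destination buffer.
func ipv4String(_ address: in_addr) -> String? {
    var address = address                     // & needs a mutable variable
    var buffer = [CChar](repeating: 0, count: Int(INET_ADDRSTRLEN))
    return buffer.withUnsafeMutableBufferPointer { dest -> String? in
        // inet_ntop writes a null-terminated string into dest, or returns nil.
        guard inet_ntop(AF_INET, &address, dest.baseAddress,
                        socklen_t(dest.count)) != nil else { return nil }
        return String(cString: dest.baseAddress!)
    }
}

var loopback = in_addr()
inet_pton(AF_INET, "127.0.0.1", &loopback)    // text to binary
print(ipv4String(loopback) ?? "conversion failed")
```

The array owns its storage, so there is nothing to deallocate by hand; the trade-off is the closure needed to get a stable pointer to it.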
Casting Pointers
The sockaddr struct in ai_addr is actually treated somewhat like a union, in that it is really a pointer to a different struct type that must be cast to the appropriate type depending on the protocol family, e.g. sockaddr_in or sockaddr_in6. How can we accomplish that in Swift?
Turns out it's relatively easy:
if res.memory.ai_family == AF_INET6 {
    let s = UnsafePointer<sockaddr_in6>(res.memory.ai_addr).memory
    let opaqueAddr = s.sin6_addr
    let portNum = s.sin6_port
}
UnsafePointer<T> has an initializer that takes an UnsafePointer<U>, where U is any type you like, thereby converting the pointer from one type to another. Just as it would be in C, you should be very careful because this is an unsafe operation. Swift will not complain whatsoever if you convert an UnsafePointer<Int> to an UnsafePointer<sockaddr_in6>; you'll just start trying to read or write random memory when you access the pointer.
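In current Swift the converting initializer is gone, and the closest equivalent I know of is withMemoryRebound(to:capacity:). The sketch below shows the same family check and cast for IPv4; ipv4Port and roundTripPort are hypothetical names introduced only for this illustration.

```swift
import Foundation

/// Hypothetical helper: reads the port from a generic sockaddr pointer,
/// but only if the family says it really is a sockaddr_in.
func ipv4Port(of address: UnsafePointer<sockaddr>) -> UInt16? {
    guard Int32(address.pointee.sa_family) == AF_INET else { return nil }
    // Temporarily view the same memory as a sockaddr_in.
    return address.withMemoryRebound(to: sockaddr_in.self, capacity: 1) { ipv4 in
        UInt16(bigEndian: ipv4.pointee.sin_port)   // network to host byte order
    }
}

/// Hypothetical helper: builds a sockaddr_in, hands it over as a sockaddr
/// (as C APIs do), and reads the port back out.
func roundTripPort(_ port: UInt16, family: Int32 = AF_INET) -> UInt16? {
    var socketAddress = sockaddr_in()
    socketAddress.sin_family = sa_family_t(family)
    socketAddress.sin_port = port.bigEndian
    return withUnsafePointer(to: &socketAddress) { pointer in
        pointer.withMemoryRebound(to: sockaddr.self, capacity: 1) { ipv4Port(of: $0) }
    }
}

print(roundTripPort(8080) ?? 0)
```

Checking the family before rebinding is the whole point: the cast itself is still unchecked, exactly as described above.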
Making Our Lives Easier
Back to the ugly casting and having to allocate a buffer; perhaps we can make things a bit easier on ourselves.
The first is an extension to String that lets us skip the CString cast and just accept an UnsafePointer directly:
extension String {
    ///Converts from a null-terminated vector of CChar (or Int8)
    static func fromCString(buf:UnsafePointer<CChar>) -> String? {
        return String.fromCString(CString(buf))
    }
    ///Converts from a null-terminated vector of UInt8
    static func fromCString(buf:UnsafePointer<UInt8>) -> String? {
        return String.fromCString(CString(buf))
    }
}
Not bad and definitely an improvement, but why not create our own StringBuffer class to encapsulate the buffer creation and conversion?
///Represents a vector of CChar, null-terminated,
///with automatic conversion to String.
///Useful for APIs that want a pointer to write a
///string result, allowing you to specify the
///max size of the allocation.
///
///This class is initialized to be filled with null,
///so absent any writes it is always safe to
///use description or convert to String.
///
///Warning: Writes are always responsible
///for ensuring the terminating null is present!
class StringBuffer : Printable {
    var length:Int
    var buffer:UnsafePointer<CChar>
    init(_ capacity:Int) {
        assert(capacity > 0, "capacity must be > 0")
        self.length = capacity + 1
        self.buffer = UnsafePointer.alloc(self.length)
        self.buffer.initializeZero(self.length)
    }
    convenience init(_ capacity:Int32) {
        self.init(Int(capacity))
    }
    deinit {
        self.buffer.dealloc(self.length)
    }
    @conversion func __conversion() -> String? {
        return String.fromCString(self.buffer)
    }
    @conversion func __conversion<T>() -> UnsafePointer<T> {
        return UnsafePointer<T>(self.buffer)
    }
    var description: String {
        get {
            let s = self as String?
            if let ss = s {
                return ss
            } else {
                return "[empty StringBuffer]"
            }
        }
    }
    var ulength: UInt { get { return UInt(self.length) } }
    var ulength32: UInt32 { get { return UInt32(self.length) } }
}
Now we can accept lengths in different integer types, keep track of them to automatically deallocate the buffer when the class is deallocated, and support automatic conversion of the buffer back to String. We use one shortcut to try and make it slightly safer: we allocate one more than the requested size and initialize the entire chunk of memory to null. Since the automatic conversion is expecting a null-terminated C string ([CChar]), we ensure that the automatic conversion can't run off the end of the buffer if the user fails to include the terminator.
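The same idea can be sketched in current Swift, which no longer has @conversion or Printable. This is a hedged illustration, not a port of the class above: the name CStringBuffer and its pointer property are hypothetical, and CustomStringConvertible stands in for the implicit String conversion.

```swift
import Foundation

/// Hypothetical modern take on StringBuffer: a zero-filled, null-terminated
/// CChar buffer that frees itself and prints as a String.
final class CStringBuffer: CustomStringConvertible {
    let length: Int
    let pointer: UnsafeMutablePointer<CChar>

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be > 0")
        length = capacity + 1                 // one extra byte for the terminator
        pointer = UnsafeMutablePointer<CChar>.allocate(capacity: length)
        pointer.initialize(repeating: 0, count: length)
    }

    deinit {
        pointer.deinitialize(count: length)
        pointer.deallocate()
    }

    /// Safe as long as writers stay within the requested capacity.
    var description: String { String(cString: pointer) }
}

let buffer = CStringBuffer(capacity: 5)
strncpy(buffer.pointer, "hello", 5)           // fills exactly the requested capacity
print(buffer)
```

Even though strncpy writes no terminator when the source fills the destination, the extra zeroed byte keeps the conversion inside the allocation.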
My Soapbox
There is a bit of a philosophical debate among C programmers as to whether that kind of feature is desirable or not; some claim you shouldn't provide those kinds of safety mechanisms because people will come to rely on them and it can hide incorrect behavior. My own feeling is that no one is perfect and memory management bugs tend to lead to massive security vulnerabilities (Heartbleed, Sasser, Slammer, Code Red, ...) that have very real costs both in terms of money and even sometimes directly leading to people's deaths (we now know that intelligence agencies were exploiting Heartbleed, including in some countries to locate dissidents or political enemies and harm them). I greatly prefer safety and refuse to entertain any arguments to the contrary. The trail of failure is too long now to pretend all the if only fantasy scenarios will come true [1]. Many of the programs you depend on are written by dicks and idiots. They won't get memory management right. Deal with it.
Ahem.
Back to the Show
The ulength and ulength32 properties are just conveniences, since Swift doesn't have implicit integer conversions, even when widening. Unfortunately the various constants involved here are just #defines, so they end up imported in Swift as Int32 even when the API involved is declared UInt32. To smooth everything over, I made initializers that take Int or Int32 lengths and provided some computed length properties. In the real version of this code I've got some assert()s in there to blow things up if you happen to allocate a 5 GB buffer and then ask for the ulength32 of it.
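As a small sketch of that guard in current Swift, UInt32(exactly:) returns nil instead of trapping when the value doesn't fit. The helper name length32 is hypothetical, and the 5 GB case assumes a 64-bit Int.

```swift
import Foundation

/// Hypothetical helper: narrows a buffer length to UInt32, returning nil
/// when the value does not fit instead of crashing.
func length32(_ length: Int) -> UInt32? {
    return UInt32(exactly: length)
}

// INET6_ADDRSTRLEN is imported as Int32, while socklen_t is UInt32,
// so the conversion has to be spelled out.
let size = socklen_t(INET6_ADDRSTRLEN)
print(size)

// Assumes a 64-bit platform, where Int can hold 5 GB.
let tooBig = length32(5_000_000_000)
print(tooBig == nil ? "5 GB does not fit in UInt32" : "fits")
```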
So what is this initializeZero call? Yet another extension:
extension UnsafePointer {
    ///Writes value to the pointer, optionally
    ///repeating repeatCount times.
    ///All use of this function is unsafe and
    ///should be reviewed thoroughly
    func write(value:CChar, repeatCount:UInt = 1) {
        self.write(value, startOffset: 0, repeatCount: repeatCount)
    }
    ///Writes value to the pointer, optionally
    ///starting at the given offset and repeating for repeatCount.
    ///All use of this function is unsafe and should be reviewed thoroughly
    func write(value:CChar, startOffset:Int = 0, repeatCount:UInt = 1) {
        memset(UnsafePointer<()>(self + startOffset),
            Int32(UInt8(value)), repeatCount)
    }
    ///Initializes the memory pointed to by UnsafePointer to zero.
    ///num should be the number of objects of T (same as alloc()).
    ///Note: memset counts bytes, so this zeroes exactly num bytes; that
    ///covers num objects only when T is one byte wide, as with CChar.
    ///Passing a larger size will stomp on memory
    ///and represents a potentially significant security risk;
    ///All use of this function is unsafe and should be reviewed thoroughly
    func initializeZero(num:Int) {
        self.write(0, startOffset:0, repeatCount:UInt(num))
    }
}
This gives you a way to write to a buffer arbitrarily and could be easily extended to support writing more than one value, but in that case you already have built-in support for that with initializeFrom. The initializeZero call might be pointless if UnsafePointer<T> initializes its own memory to nulls, but in my limited testing it didn't appear to do that, so better safe than sorry.
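In current Swift a helper like write doesn't need memset, because the standard library's initialize(repeating:count:) does the repeated store. The sketch below is a rough equivalent under that assumption; the method name fill is hypothetical and, like the original, it does no bounds checking.

```swift
extension UnsafeMutablePointer where Pointee == CChar {
    /// Hypothetical helper: writes `value` `count` times starting at `offset`.
    /// The caller must guarantee that offset + count stays inside the allocation.
    func fill(with value: CChar, from offset: Int = 0, count: Int) {
        (self + offset).initialize(repeating: value, count: count)
    }
}

let bytes = UnsafeMutablePointer<CChar>.allocate(capacity: 4)
bytes.fill(with: 0, count: 4)                 // allocate() leaves memory uninitialized
bytes.fill(with: 65, from: 1, count: 2)       // two 'A' bytes at offsets 1 and 2
let text = String(cString: bytes + 1)         // stops at the zero at offset 3
print(text)
bytes.deallocate()
```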
Malloc You Too Buddy
If UnsafePointer<T> knew the size of the object it points at, I'd put more assert()s in there to make sure the memory writes were valid, but in reality it is just a simple struct with only one field of pointer size, so at runtime there is nowhere to keep that information. Needless to say, use with care. Personally, I ran this example with the malloc debugging options enabled to make sure I was really using memory correctly.
To enable malloc debugging, edit your Scheme in Xcode and check the boxes on the Diagnostics tab. The downside to these options is that they consume much more memory, since the memory is not freed and neither are Objective-C objects; their memory sticks around forever as a literal NSZombie object, which causes an immediate failure if you attempt to send it any messages. The guard pages and scribble protection go even further, allocating extra unmapped pages around each valid allocation that will trigger signals if touched, and overwriting all unused memory on valid pages with specific byte patterns so any write overruns can be detected. I highly encourage you to run your application with at least one good pass using these tools before release, even if you only use ARC.
All this nonsense also enhances my warm fuzzy feeling for Swift since we don't have to deal with pointers in most cases and our array accesses are bounds-checked. Other than debugging retain cycles, memory management is a mostly hands-off affair.
End Digression
Where were we? Oh yeah, let's look at our for loop now:
for var res = result; res; res = res.memory.ai_next {
    if res.memory.ai_family == AF_INET {
        var ipAddress = StringBuffer(INET_ADDRSTRLEN)
        if inet_ntop(res.memory.ai_family,
            res.memory.ai_addr, ipAddress, ipAddress.ulength32)
        {
            println("IPv4 \(ipAddress) for host \(host):\(port)")
        }
    }
    else if res.memory.ai_family == AF_INET6 {
        var ipAddress = StringBuffer(INET6_ADDRSTRLEN)
        if inet_ntop(res.memory.ai_family,
            res.memory.ai_addr, ipAddress, ipAddress.ulength32)
        {
            println("IPv6 \(ipAddress) for host \(host):\(port)")
        }
    }
}
That's better and a bit clearer in intent. We know immediately what the point of ipAddress is (a buffer for a string) and can immediately infer that the constructor parameter is the size. Plus we don't have to bother converting the result; we can just use it.
Conclusion
I hope this gives you some good examples for calling C-based APIs in real Swift code.
Go Forth Swift-ly, Be JOVIAL, and C what BASIC Processing you Make... uh... Haskell.[2]
[1] If only programmers wrote better code, if only people didn't use undefined behavior, if only programmers properly sanitized their input... The list of "if only"s is a long one and harkens back to the days when car makers said the same thing to justify refusing to install seat belts, air bags, and other safety features: if only people drove better, if only we had better roads... They can't and we don't.
[2] I'm sorry
This blog represents my own personal opinion and is not endorsed by my employer.