Type Safe Memory Access¶
This is work-in-progress, not an official language specification.
Introduction¶
Swift enforces type safe access to memory and follows strict aliasing
rules. However, code that uses unsafe APIs or imported types can
circumvent the language’s natural type safety. Consider the following
example of type punning using the UnsafePointer
type:
let ptrT = UnsafeMutablePointer<T>(allocatingCapacity: 1)
// Store T at this address.
ptrT[0] = T()
// Load U at this address
let u = UnsafePointer<U>(ptrT)[0]
The program exhibits undefined behavior unless T
and U
are
related types and the loaded type U
is layout compatible
with the stored type T
(see Layout Compatible Types).
Note
Unsafe[Mutable]Pointer
needs to provide a type safe API. In
other words, when a program accesses memory via UnsafePointer
,
the UnsafePointer
element should be consistent with the type
used to allocate the memory. The “unsafe” in UnsafePointer
actually refers to memory management–it is the user’s
responsibility to manage the object’s lifetime. The type safety of
UnsafePointer is not only a desirable programming model, it is an
absolute requirement performance reasons, as UnsafePointer
is
intended for high-performance implementation of data
structures. Converting between UnsafePointer
values with
different Pointee
types, as shown above, violates this type
safety, and will likely be disallowed in future versions of the API.
Layout Compatible Types¶
Two types are mutually layout compatible if their in-memory representation has the same size and alignment or they have the same number of mutually layout compatible elements. Layout compatibility is specified as part of the ABI and may be expanded over time, so this document is not meant to be authoritative or complete. Nonetheless, some “obvious” cases of mutually layout compatible types are:
- identical types and type aliases
- integers of the same multiple-of-8 size in bits
- floating point types of the same size
- class types and
AnyObject
existentials- pointer types (e.g.
OpaquePointer
,UnsafePointer
)- block function types and
AnyObject
- thin function and C function types
- imported C types that have the same layout in C
- fragile structs with one stored property and their stored property type
- fragile enums with one case and their payload type
- contiguous array storage and homogeneous tuples which have the same number and type of elements.
Types are layout compatible, but not mutually so, in the following cases:
- aggregates (tuples, array storage, and structs), are layout compatible with larger aggregates of the same kind if their common elements are mutually layout compatible.
- an enum payload is layout compatible with its enum type if the enum has only one payload case (and zero or more no-payload cases).
Layout compatibility is transitive.
Note
Unrelated class types have no guaranteed heap layout compatibility for except for the memory layout within the object’s stored properties.
Note
“Fragile” enums and structs have strict layout rules that ensure binary compatibility. Library Evolution Support in Swift explains the impact of resilience on object layout.
See Layout Compatible Examples
Layout Compatible Rules¶
The following layout rules apply to dynamic memory accesses that occur
during program execution. In particular, they apply to access that
originates from stored property getter and setters, and reading or
assigning subscripts (including the Unsafe[Mutable]Pointer
pointee
property and subscripts). Aggregate loads and stores can
be considered a sequence of loads and stores of named or indexed
elements.
- Address formation: Given any two accesses to the same memory object, the relationship between their address expressions must be determined by Swift’s ABI for type layout. The addresses may be either disjoint or overlapping. If they overlap the offset must be determined to be either a named or indexed subobject or known byte offset relative to the other. In other words, the access path of each load and store must be comparable given layout compatibility guarantees. In the case of inout arguments, for the purpose of this rule, the address expressions include both generation of the argument (caller side) and its use (callee side).
Additionally, the type of the memory access itself must be compatible with the element type as follows:
- Loads must be layout compatible with all stores to the same memory object.
- Stores to the same memory object must be mutually layout compatible.
If the object’s allocated type is visible to the Swift program, then the rules are extended to that allocated type:
- Loads must be layout compatible with the memory object’s allocated type.
- Stores must be mutually layout compatible with the memory object’s allocated type.
Legally Circumventing Strict Aliasing¶
Accessing unrelated layout compatible types requires special
consideration. For example, Int32
and UInt32
are “obviously” layout
compatible; however, simply storing to a location via
UnsafeMutablePointer<Int32>
and loading from the same location as
UnsafePointer<UInt32>
is undefined.
Reinterpreting a value’s bits should be done using unsafeBitCast
to
avoid type punning. For example, the above conversion can be performed
legally as:
let ptrI32 = UnsafeMutablePointer<Int32>(allocatingCapacity: 1)
ptrI32[0] = Int32()
let u = unsafeBitCast(ptrI32[0], to: UInt32.self)
In the future, an API will likely exist to allow legal type
punning. This could be useful for external APIs that require pointer
arguments and for manual memory layout. Loads and stores of type
punned memory would still need to follow the Layout Compatible
Rules, but would be exempt from the Strict Alias Rules. Such an
API, for example, would allow accessing same address as both Int32
and UInt32
.
Casting Pointers¶
unsafeBitCast
should generally be avoided on pointer types,
particularly class types. For pointer to integer conversions,
bitPattern
initializers are available in both
directions. unsafeBitCast
may be used to convert between
nondereferenceable pointer types, but as with any conversion to and
from opaque pointers, this presents an opportunity for type punning
when converting back to a dereferenceable pointer type.
unsafeBitCast
is even more problematic for class types. First, layout
needs to be considered when Optional
or existential class types are
involved. Note that the internal _unsafeReferenceCast
API is preferred
in those cases, because it always handles conversions to and from
optionals and existentials correctly.
Furthermore, unsafeBitCast
of class types may introduce undefined
behavior at the point of access. Normal class casts and class
existential casts rely on the dynamic type to be a subclass of or
conform to the static type at the point of the cast. However, an
unsafeBitCast
will succeed when the static and dynamic types are
unrelated, which leads to undefined behavior if the cast pointer
is ever dereferenced. Consider this example:
class A {
var i: Int = 3
}
class B {
var i: Int = 3
}
let a = A()
let b = unsafeBitCast(a, to: B.self)
a.i = 10
print(b.i)
This program exhibits undefined behavior for two reasons. First, it violates Strict Alias Rules (#1) because the same memory object may be accessed via unrelated class types. Second, it violates Layout Compatible Rules (#1) because there is no guarantee of layout among unrelated classes even if they are fragile.
Pointer Casting Example¶
Merely forming an address that violates strict aliasing is not itself undefined behavior; the address must have some static use within the code. However, undefined behavior may occur even if those accesses are themselves never executed. In other words undefined behavior is caused by a dynamic address and its static uses. For example the following program is undefined:
public protocol SomeClass : class {
func getVal() -> Int
}
class ActualClass {
var i: Int
init(i: Int) { self.i = i }
}
// If 'isActualClass' is true, then 'obj' is a subclass of ActualClass
// that conforms to SomeClass.
public func foo<T : SomeClass>(obj: T, isActualClass: Bool) -> Int {
// This unsafe cast violates the type system because
// it's operating on class types.
let actualRef = unsafeBitCast(obj, to: ActualClass.self)
if (isActualClass) {
// The unsafe cast is only valid under this condition.
// Even though this access is never executed when the cast is invalid,
// it still causes undefined behavior.
return actualRef.i
}
return obj.getVal()
}
The following code is both legal and more explicit:
public func foo<T : SomeClass>(obj: T, isActualClass: Bool) -> Int {
if (isActualClass) {
// Now we know that the unsafeReferenceCast is type safe.
let actualRef = unsafeReferenceCast(obj, to: ActualClass.self)
return actualRef.i
}
return obj.getVal()
}
Examples¶
Layout Compatible Examples¶
Calls to mcompatible
, compatible
, and incompatible
reflect
Layout Compatible Rules as their names signify. Calls to unknown
take invalidly formed addresses:
class C {
var i: Int32 = 7
}
class D {
var i: Int32 = 11
}
struct S1 {
var i: Int32
}
struct S2 {
var i: Int32
var j: Int32
}
struct S3 {
var i: Int32
var j: Int32
var k: Int32
}
struct S2_1 {
var s2: S2
var i: Int32
}
enum E1 {
case Payload(Int32)
}
enum E2 {
case Payload(Int32)
case NoPayload
}
struct S_IE2 {
var i: Int32
var e2: E2
}
struct S_SIE2_E2 {
var sie2: S_IE2
var e2: E2
}
struct S_I_E2_E2 {
var i: Int32
var e2a: E2
var e2b: E2
}
// Signify mutually compatible access.
func mcompatible(x: inout Int32, _ y: inout UInt32) {}
func mcompatible(x: inout C, _ y: inout AnyObject) {}
func mcompatible<T>(x: inout UnsafePointer<T>, _ y: inout OpaquePointer) {}
func mcompatible(x: inout Int32, _ y: inout S1) {}
func mcompatible(x: inout Int32, _ y: inout E1) {}
func mcompatible(x: inout (Int32, Int32), _ y: inout S2) {}
func mcompatible(x: inout S2_1, _ y: inout S3) {}
// Signify one-way layout compatibility.
func compatible(x: inout Int32, with y: inout E2) {}
func compatible(x: inout S1, with y: inout S2) {}
func incompatible(x: inout S_SIE2_E2, _ y: inout S_I_E2_E2) {}
func unknown(x: inout Int32, _ y: inout Int32) {}
func access<T>(i: inout Int32, j: inout UInt32, t: inout (Int32, Int32),
c: inout C, a: inout AnyObject,
u: inout UnsafePointer<T>, p: inout OpaquePointer,
s1: inout S1, s2: inout S2, s3: inout S3, s2_1: inout S2_1,
s_sie2_e2: inout S_SIE2_E2, s_i_e2_e2: inout S_I_E2_E2,
e1: inout E1, e2: inout E2) {
// mutually compatible
mcompatible(&i, &j) // same size integers
mcompatible(&c, &a) // class and any object existential
mcompatible(&u, &p) // pointers
mcompatible(&i, &s1) // single element struct
mcompatible(&i, &e1) // single case enum
mcompatible(&t, &s2) // tuple and homogeneous struct
// struct { {I32, I32}, I32} vs. struct {I32, I32, I32}; fixed size, no spare bits
mcompatible(&s2_1, &s3)
// struct { {A, B}, C} vs. struct {A, B, C}; unknown size
incompatible(&s_sie2_e2, &s_i_e2_e2)
// Compatible: can load one type from an object 'with' another type.
compatible(&i, with: &e2) // load the payload from a single payload enum
compatible(&s1, with: &s2) // load struct {A} from struct {A, B}
// Layout compatibility places no guarantees on class layout. The
// following unknown call takes two addresses of compatible type
// (Int32), but the addresses are generated from incompatible class
// types. Even though the class definitions of 'C' and 'D' are
// trivial, there is no guarantee that the two addresses passed to
// this call are identical.
unknown(&c.i, &unsafeBitCast(c, to: D.self).i)
// Properties within heap storage follow the usual layout rules.
func getStructPointer(iptr: UnsafeMutablePointer<Int32>)
-> UnsafeMutablePointer<S1> {
// Convert from UnsafeMutablePointer<Int32> to UnsafeMutablePointer<S1>
// with a hypothetical 'unsafeCastElement' label to be explicit.
return UnsafeMutablePointer(unsafeCastElement: iptr)
}
mcompatible(&c.i, &getStructPointer(&c.i).pointee)
}