Working with multiple architectures & compiled binaries

When working with iOS apps (or really anything within Apple’s ecosystem) I’ve sometimes needed to deeply introspect the libraries and executables built in my project to answer questions like “Is bitcode enabled for all architectures?” or “Which architectures was this binary compiled for?”

These aren’t easy questions to answer unless you know your way around the command line and know which commands to invoke. So I thought I’d go over how to analyze compiled binaries, and share some helpful scripts I wrote to simplify the process.

TL;DR: This post dives into the commands used to interact with your app’s compiled binaries. If you just want to run some simple queries against your binaries to help debug your app submissions, check out the helper scripts I posted as a GitHub Gist, which are also linked at the end of this post.

In my post about Cocoa Dynamic Frameworks, I briefly cover the Mach-O format and how source code is compiled and linked to produce executables. But once you have a compiled application, you sometimes need to dig into the details of the executable that was produced.

Xcode itself includes a huge number of command-line tools that form the basis of the entire build process for your apps. In this post I’ll be talking about three of them: lipo, otool, and nm.

  • lipo: Creates and operates on universal / multi-architecture files. This is used to get information about universal (aka FAT) executables, and can be used to combine multiple executables built for different architectures together.
  • otool: Object file display tool. It displays specific parts of libraries or object files, letting you inspect the headers, load commands, and segment layout of the built binary.
  • nm: Displays the symbol table (aka “name list”) for the given object files or libraries.
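Before running any of the recipes, it can help to confirm the tools are actually on your PATH. Here’s a minimal sketch; the function name `missing_tools` is my own invention for illustration, not something Xcode provides.

```shell
# Hypothetical helper: print the name of every tool in the argument list
# that cannot be found on the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Running `missing_tools lipo otool nm` prints nothing when all three are installed, and otherwise lists the ones that are missing.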

Cookbook Recipes

Each of these command-line tools has its own set of options that you can explore on your own, but I’ll cover a sort of cookbook of commands you can use to answer some of the common questions you may have about your binaries.

For the purposes of this post, I’ll use the Salesforce Embedded Service SDK for iOS, which I’m the lead developer of. It will provide a good example since it’s a large project with multiple frameworks and architectures.

Which architectures are in this binary?

$ lipo -info ServiceCore.framework/ServiceCore 
Architectures in the fat file: ServiceCore.framework/ServiceCore are: i386 x86_64 armv7 arm64 

The lipo -info command is used to get high-level information about a binary image. As you can see here, this framework is a multi-architecture library for both simulator architectures and device architectures.
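If you want to use that answer in a script, the architecture list has to be pulled out of the sentence lipo prints. The sketch below assumes the output looks like the example above (the list follows the last colon); `parse_lipo_archs` and `binary_has_arch` are hypothetical names of my own.

```shell
# Turn the text printed by `lipo -info` (read from stdin) into one
# architecture name per line. Assumes the list follows the last ": ".
parse_lipo_archs() {
    sed -e 's/.*: *//' | tr -s ' ' '\n' | sed -e '/^$/d'
}

# Succeeds (exit status 0) when binary $1 contains architecture $2.
binary_has_arch() {
    lipo -info "$1" | parse_lipo_archs | grep -qx "$2"
}
```

For example, `binary_has_arch ServiceCore.framework/ServiceCore arm64 && echo yes` would print `yes` for the framework shown above. One caveat: a file path that itself contains a colon would confuse this simple parsing.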

You can also get detailed information using the aptly named -detailed_info argument.

$ lipo -detailed_info ServiceCore.framework/ServiceCore  
Fat header in: ServiceCore.framework/ServiceCore
fat_magic 0xcafebabe
nfat_arch 4
architecture i386
    cputype CPU_TYPE_I386
    cpusubtype CPU_SUBTYPE_I386_ALL
    offset 4096
    size 2829856
    align 2^12 (4096)
architecture x86_64
    cputype CPU_TYPE_X86_64
    cpusubtype CPU_SUBTYPE_X86_64_ALL
    offset 2834432
    size 2990112
    align 2^12 (4096)
architecture armv7
    cputype CPU_TYPE_ARM
    cpusubtype CPU_SUBTYPE_ARM_V7
    offset 5832704
    size 14446496
    align 2^14 (16384)
architecture arm64
    cputype CPU_TYPE_ARM64
    cpusubtype CPU_SUBTYPE_ARM64_ALL
    offset 20283392
    size 14837748
    align 2^14 (16384)

As you can see, this mode lists the size of each architecture’s image, as well as its byte offset within the parent FAT archive.
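That listing is easy to condense into one line per architecture. This is a rough sketch that assumes the field layout shown above; `summarize_fat_slices` is a made-up name, and the `end` value is simply offset plus size, which I compute to show where each image stops.

```shell
# Read `lipo -detailed_info` output on stdin and print one summary line per
# architecture: its offset, its size, and where the image ends (offset + size).
summarize_fat_slices() {
    awk '
        $1 == "architecture" { arch = $2 }
        $1 == "offset"       { offset = $2 }
        $1 == "size"         { print arch " offset=" offset " size=" $2 " end=" (offset + $2) }
    '
}
```

Piping the output above through it, the i386 line reads `i386 offset=4096 size=2829856 end=2833952`. The next image starts at 2834432 rather than 2833952, because each image is padded out to its alignment boundary.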

Does this binary include a specific class?

$ nm ServiceCore.framework/ServiceCore | grep '_OBJC_CLASS_$_SCServiceCloud'
00000000001f1590 S _OBJC_CLASS_$_SCServiceCloud

The nm command can be used to analyze the symbol tables in the framework. The first column is the symbol’s value (its address), in hex. The second column represents the symbol type; there are several types you’ll see, but the important ones for most uses are:

  • T: Text section symbol
  • D: Data section symbol
  • U: Undefined symbol
  • S: A symbol in a section other than the ones above, which usually means a Swift or Objective-C class, method, ivar, etc.

When the lower-case variant of one of those symbol types is shown, it means the symbol is non-external (aka private). As per the documentation for nm:

A lower-case u in a dynamic shared library indicates a undefined reference to a private external in another module in the same library.

man nm

Using this, we can see not only what classes or ivars are defined in a framework, but we can see what private definitions are defined, and more importantly which external APIs are referred to.
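To list every Objective-C class a binary defines, you can filter on the `_OBJC_CLASS_$_` prefix seen earlier. A minimal sketch, assuming nm’s default three-column output for defined symbols; `defined_objc_classes` is a hypothetical name.

```shell
# Read `nm` output on stdin and print the Objective-C classes it defines.
# Defined symbols have three columns (value, type, name); undefined ones
# have only two, so they are skipped. Duplicates across architectures are
# collapsed by sort -u.
defined_objc_classes() {
    awk 'NF == 3 && $2 != "U" && $3 ~ /^_OBJC_CLASS_\$_/ {
        sub(/^_OBJC_CLASS_\$_/, "", $3)
        print $3
    }' | sort -u
}
```

Usage would look like `nm ServiceCore.framework/ServiceCore | defined_objc_classes`.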

For example, here’s the results when looking for uses of UIApplication:

$ nm ServiceCore.framework/ServiceCore | grep UIApplication
                 U _OBJC_CLASS_$_UIApplication
                 U _UIApplicationDidBecomeActiveNotification
                 U _UIApplicationDidChangeStatusBarOrientationNotification
                 U _UIApplicationDidEnterBackgroundNotification
                 U _UIApplicationDidFinishLaunchingNotification
                 U _UIApplicationDidReceiveMemoryWarningNotification
                 U _UIApplicationWillEnterForegroundNotification
                 U _UIApplicationWillResignActiveNotification
                 U _UIApplicationWillTerminateNotification

This shows that the code references the UIApplication class, as well as the constant values for a variety of notifications. These are marked as the U (undefined) type since those symbols are defined elsewhere (in UIKit.framework itself), so the ServiceCore binary only records a reference to them and has no address of its own to report.
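The same idea works for any external dependency, not just UIApplication. Here’s a small sketch that lists every undefined symbol in nm’s output; `undefined_symbols` is my own name for it.

```shell
# Read `nm` output on stdin and print the name of every undefined symbol
# (type U, or lower-case u), de-duplicated across architectures.
undefined_symbols() {
    awk 'NF == 2 && ($1 == "U" || $1 == "u") { print $2 }' | sort -u
}
```

Something like `nm ServiceCore.framework/ServiceCore | undefined_symbols | grep UIApplication` should then give just the symbol names from the listing above.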

Was this binary built using Bitcode?

This is a slightly harder one to check for since you have to check each architecture individually (e.g. in a multi-architecture binary, i386 will not have Bitcode, but arm64 will).

$ otool -arch i386 -l ServiceCore.framework/ServiceCore | grep __LLVM
$ otool -arch arm64 -l ServiceCore.framework/ServiceCore | grep __LLVM
  segname __LLVM
   segname __LLVM

As you can see, i386 doesn’t include bitcode (which is represented by the __LLVM segment name in the Mach-O load commands), whereas arm64 does.
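Checking each architecture by hand gets tedious, so here’s a sketch that loops over whatever lipo reports and runs the same grep for each one. The names `has_llvm_segment` and `report_bitcode` are hypothetical, and the architecture list is parsed on the assumption that it follows the last colon of the `lipo -info` output.

```shell
# Succeeds when the `otool -l` output on stdin mentions a __LLVM segment.
has_llvm_segment() {
    grep -q 'segname __LLVM'
}

# Print one "<arch>: bitcode" or "<arch>: no bitcode" line per architecture
# found in the binary given as $1.
report_bitcode() {
    binary="$1"
    for arch in $(lipo -info "$binary" | sed -e 's/.*: *//'); do
        if otool -arch "$arch" -l "$binary" | has_llvm_segment; then
            echo "$arch: bitcode"
        else
            echo "$arch: no bitcode"
        fi
    done
}
```

For the framework above, `report_bitcode ServiceCore.framework/ServiceCore` would report no bitcode for i386 and bitcode for arm64, matching the manual check.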

How do I separate a multi-architecture binary into its original, separate binaries?

$ lipo -extract arm64 ServiceCore.framework/ServiceCore -output ServiceCore-arm64
$ lipo -info ServiceCore-arm64 
Architectures in the fat file: ServiceCore-arm64 are: arm64 

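This one is pretty simple, and can be used to pull an individual architecture out for independent analysis. Note that -extract still produces a fat file, just one with a single architecture in it, which is why lipo -info describes the result the way it does. If you want a plain, non-fat file instead, lipo’s -thin option takes the same arguments.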
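To split out every architecture in one go, the extraction can be wrapped in a loop. This is only a sketch: `split_archs` is a hypothetical name, and the `<prefix>-<arch>` output naming is my own convention, chosen to match the file name used above.

```shell
# Extract every architecture of binary $1 into its own file, named
# "<prefix>-<arch>" where the prefix is $2. Prints each file it writes.
split_archs() {
    binary="$1"
    prefix="$2"
    for arch in $(lipo -info "$binary" | sed -e 's/.*: *//'); do
        lipo -extract "$arch" "$binary" -output "$prefix-$arch" || return 1
        echo "$prefix-$arch"
    done
}
```

For example, `split_archs ServiceCore.framework/ServiceCore ServiceCore` would write ServiceCore-i386, ServiceCore-x86_64, ServiceCore-armv7 and ServiceCore-arm64.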

How do I create a multi-architecture binary?

$ lipo ServiceCore-armv7 ServiceCore-arm64 \
    -create -output ServiceCore-combined
$ lipo -info ServiceCore-combined
Architectures in the fat file: ServiceCore-combined are: armv7 arm64 

Using the -create and -output arguments to lipo, you can specify a list of files to merge together into a single FAT archive.
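One thing to watch for: lipo refuses to merge inputs that contain the same architecture, so it can be worth checking for overlaps before calling -create. A minimal sketch, with `duplicate_archs` as a made-up helper name:

```shell
# Print every architecture that appears in more than one of the given files.
# No output means the files can be merged without an architecture clash.
duplicate_archs() {
    for file in "$@"; do
        lipo -info "$file" | sed -e 's/.*: *//' | tr -s ' ' '\n'
    done | sed -e '/^$/d' | sort | uniq -d
}
```

`duplicate_archs ServiceCore-armv7 ServiceCore-arm64` should print nothing for the two files merged above.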

How can I remove an architecture I don’t want?

$ lipo -remove i386 -remove x86_64 \
    -output ServiceCore-stripped ServiceCore.framework/ServiceCore 
$ lipo -info ServiceCore-stripped 
Architectures in the fat file: ServiceCore-stripped are: armv7 arm64 

This can be useful when you want to remove simulator architectures from a FAT library, especially prior to submitting to the App Store.
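As far as I can tell, lipo complains if you ask it to remove an architecture the file doesn’t contain, so a script should only pass -remove for the ones that are present. The sketch below does that for the two simulator architectures used in this post; `strip_simulator_archs` is a hypothetical name, and it relies on sh/bash word splitting to expand the collected arguments.

```shell
# Write a copy of binary $1 to $2 with i386 and x86_64 removed,
# passing -remove only for the architectures that are actually present.
strip_simulator_archs() {
    input="$1"
    output="$2"
    remove_args=""
    for arch in $(lipo -info "$input" | sed -e 's/.*: *//'); do
        case "$arch" in
            i386|x86_64) remove_args="$remove_args -remove $arch" ;;
        esac
    done
    if [ -z "$remove_args" ]; then
        # Nothing to strip, so the output is just a copy of the input.
        cp "$input" "$output"
    else
        # $remove_args is deliberately unquoted so it splits into words.
        lipo $remove_args -output "$output" "$input"
    fi
}
```

Usage would be `strip_simulator_archs ServiceCore.framework/ServiceCore ServiceCore-stripped`, which for the framework above runs the same lipo command shown earlier.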

Helpful command-line scripts

I’ve consolidated these commands into a set of helper scripts that make my day-to-day easier when working with build issues. Check them out on GitHub and let me know if you like them!
