Shopify: Scaling iOS CI with MacStadium & Anka
Shopify is a leading provider of internet infrastructure for commerce, offering trusted tools to start, grow, market, and manage a retail business of any size. Founded in Ottawa, Canada, Shopify has grown from five people in a coffee shop to over 5,000 across the globe.
Providing a platform and services that are engineered for reliability, Shopify strives to deliver a better shopping experience for consumers everywhere. Businesses of all sizes use Shopify, whether they’re selling online, in retail stores, or on the go. An important part of the Shopify platform is its iOS app, which allows users to run their ecommerce business from their mobile device by processing orders, managing products, tracking sales, running marketing campaigns, and more.
Note: The following case study is abstracted from the Shopify Engineering blog.
Shopify had a growing number of software developers working on its mobile apps, which include Shopify, Shopify POS, and Frenzy. As a result, demand for a scalable and stable build system had been increasing. Shopify’s Developer Acceleration team decided to invest in creating a single unified build system for all continuous integration and delivery (CI/CD) pipelines across the organization, including support for both Android and iOS.
“We want our developers to build and test code in a reliable way, as often as they want,” said Sander Lijbrink, software engineer at Shopify. “Having a CI system makes this effortless; we can deliver new features quickly and with confidence, without sacrificing the stability of our products.”
The internal Shopify team had built its own CI system, called Shopify Build, based on Buildkite. Initially, Shopify Build only supported Linux environments, and while it worked extremely well for backend and Android projects, it did not support Mac or iOS. The team had separate CI systems for iOS projects, but they wanted to provide their iOS developers with the same benefits as their other developers by integrating iOS into Shopify Build.
“Building infrastructure for iOS comes with its unique set of challenges,” said Lijbrink. “It’s the only piece of infrastructure at Shopify that doesn’t run on top of Linux.” Since most cloud providers don’t provide infrastructure that can run macOS, the Shopify team turned to MacStadium for their Mac build infrastructure.
VMware and Mac Pros
To start, Shopify had a cluster of Mac Pros running ephemeral VMs on top of VMware ESXi. Although it performed well, it became a maintenance burden on the small team. “We relied on tools such as Packer and ovftool, but we built many custom provisioning scripts to build and distribute VMware virtual machines,” said Lijbrink.
In addition to being difficult to maintain, the setup depended on solid-state SAN storage that was shared by all of the Mac Pros. “By the end of 2017, we exceeded the write throughput, degrading build stability and speed for all of our mobile developers,” said Lijbrink. “Due to our write-heavy CI workload, the only solution was to upgrade to a substantially more expensive dedicated storage solution. Dedicated storage would push us a bit farther, but the system would not be horizontally scalable.”
Moving to Anka and Mac mini
Around the same time that the Shopify team was experiencing challenges with their build environment, a new macOS virtualization technology called Anka was released by Veertu. Anka provides a Docker-like command line interface for spinning up lightweight macOS virtual machines, built on top of Apple’s Hypervisor.framework.
Anka has a container registry similar to Docker with push and pull functionality, fast boot times, and easy provisioning provided through a command line interface. With Anka, Shopify could quickly provision a virtual machine with the preferred macOS version, disk, memory, CPU configuration, and Xcode version.
“Our VMware-based setup was running a small cluster of 12-core Mac Pros in MacStadium,” said Lijbrink. “The Mac Pros provided high bandwidth to the shared storage and ran multiple VMs in parallel. For that reason, they were the only viable choice for a SAN-based setup. However, Anka runs on local storage, and therefore it doesn’t require a SAN.”
After some experimentation, the Shopify team decided that a cluster of i7 Mac minis running Anka was the best solution. In addition, they shifted to a single dedicated virtual machine per host. “We’re running only one Anka VM per Mac mini, giving us four cores and up to 16 GB memory per build node. Running a single VM also avoids the performance degradation that we found when running multiple VMs on the same host, as they need to share resources.”
Distribution in Different Regions
The Shopify team uses a separate Mac mini as a controller node that provisions an Anka VM with all dependencies such as Xcode, iOS simulators, and Ruby. Anka’s VM image management optimizes disk space usage and data transfer times when pushing and pulling the VMs on the Mac nodes. Since all nodes run Anka independently, the team is able to run their cluster in two MacStadium data centers in parallel. “If a regional outage occurs, we offload builds to just one of the two clusters, giving us extra resiliency,” said Lijbrink.
“We monitor the demand for build nodes and work with MacStadium to scale the number of Mac minis in both data centers,” continued Lijbrink. “It’s easier than managing Macs ourselves, but it’s still a challenge as we can’t scale our cluster dynamically. Our workload is quite spiky, with high load exceeding our capacity at moments during the day. During those moments, our queue time will increase. We expect to add more Mac minis to our cluster as we grow our developer teams to keep our queue times under control.”
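The effect Lijbrink describes, queue time growing whenever a spike exceeds a fixed cluster size, can be illustrated with a rough sketch. This is not Shopify's tooling: the function name `simulate_queue`, the arrival numbers, the node count, and the 10-minute build duration are all assumptions made up for the illustration.

```python
def simulate_queue(arrivals_per_minute, nodes, build_minutes):
    """Return the number of waiting jobs at the end of each minute.

    Assumes one VM per node (as in the setup described above), so the
    cluster finishes at most nodes / build_minutes jobs per minute on
    average. All inputs are hypothetical.
    """
    capacity_per_minute = nodes / build_minutes
    backlog = 0.0
    history = []
    for arrivals in arrivals_per_minute:
        # Jobs that cannot start this minute wait in the queue.
        backlog = max(0.0, backlog + arrivals - capacity_per_minute)
        history.append(backlog)
    return history


# Hypothetical day fragment: quiet, a two-minute spike, then quiet again.
backlog = simulate_queue([2, 2, 10, 10, 2, 2], nodes=40, build_minutes=10)
print(backlog)  # [0.0, 0.0, 6.0, 12.0, 10.0, 8.0]
```

In this toy model the backlog keeps draining for several minutes after the spike ends, which is why a cluster that cannot scale dynamically has to be sized for peaks rather than for the average load.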
“It took us about four months to implement the new infrastructure on top of Anka with a small team,” said Lijbrink. “Building your own CI system requires an investment in engineering time and infrastructure, and at Shopify, we believe it’s worth it for companies that plan to scale while continuing to iterate at a high pace on their iOS apps.”
“By using Anka, we substantially improved the maintainability and scalability of our iOS build infrastructure,” said Lijbrink. “During the day, our team of about 60 iOS developers runs about 350 iOS build jobs per hour. Anka provides superior boot times by reducing the setup time of a build step. Upgrading to new versions of macOS and Xcode is easier than before. We have eliminated shared storage as a single point of failure thereby increasing the reliability of our CI system. It also means the system is horizontally scalable, so we can easily scale with the growth of our engineering team. Finally, the system is easier to use for our developers by being part of Shopify Build, sharing the same interface we use for CI across Shopify.”
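As a back-of-the-envelope check on the figures quoted above, the sketch below turns 60 developers and 350 build jobs per hour into a per-developer rate and a rough node count. The average build duration is not given in the case study, so the 10 minutes used here is purely an assumption, as is the helper name `nodes_needed`.

```python
import math


def nodes_needed(jobs_per_hour, avg_build_minutes):
    """Minimum nodes to keep up with a steady load, one VM per node.

    avg_build_minutes is an assumed value; the case study does not
    state how long a build job takes.
    """
    busy_node_minutes_per_hour = jobs_per_hour * avg_build_minutes
    return math.ceil(busy_node_minutes_per_hour / 60)


jobs_per_hour = 350   # from the case study
developers = 60       # from the case study

print(round(jobs_per_hour / developers, 1))  # 5.8 jobs per developer per hour
print(nodes_needed(jobs_per_hour, 10))       # 59, under the 10-minute assumption
```

Because each Mac mini runs a single VM, the estimate scales linearly with both job volume and build duration, which matches the horizontal scaling described in the quote.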