code-for-a-living May 13, 2021

Building the software that helps build SpaceX

We’ve talked about the software that flies SpaceX rockets, the team that tests the code to ensure it’s airtight, and the code that helps Starlink satellites communicate with customers and one another. For our last piece, we’re diving into the work of a team that helps the vehicles get built. The Application Software team crafted an ERP system for every stage of building a rocket. “One of our responsibilities is to build the software used by almost everyone at the company to get the vehicle to the pad and ready for launch,” explains Anthony Rose, a Software Engineering Manager. “That includes supply chain, manufacturing, finance, inventory, etc.”

From purchasing and receiving raw material to creating and executing work orders to build spacecraft, tracking quality escapes and change management, and implementing procedures to launch a rocket, the system needs to be robust enough to handle the manufacturing and launch of a Falcon 9. This rocket could be carrying cargo or humans to the International Space Station or delivering satellites to orbit, so reliability is a top concern.

“One example of our applications is a part management system that says a certain part exists in the factory. It’s been fabricated. Now, where is it? Our system helps that part move to the location it needs to be at in order for the rockets to be built as efficiently as possible. Another example is how we do change management and defect tracking. We have to carefully track how parts relate to each other and how defects or changes in one design will flow back and affect all the other pieces in a vehicle,” explains Rose. 
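To make the change-tracking idea concrete, here is a minimal sketch of how a change to one part might be traced to every assembly that contains it. The bill of materials, the part names, and both function names are hypothetical illustrations, not SpaceX's actual data model or code.

```python
from collections import deque

# Hypothetical bill of materials: each assembly maps to the parts it directly contains.
BILL_OF_MATERIALS = {
    "vehicle": ["first_stage", "second_stage"],
    "first_stage": ["engine", "tank"],
    "second_stage": ["engine", "fairing_mount"],
    "engine": ["turbopump", "valve"],
    "tank": ["valve"],
}


def build_parent_index(bom):
    """Invert the bill of materials so each part lists the assemblies that use it."""
    parents = {}
    for assembly, components in bom.items():
        for component in components:
            parents.setdefault(component, set()).add(assembly)
    return parents


def impacted_assemblies(bom, changed_part):
    """Return every assembly that directly or indirectly contains changed_part."""
    parents = build_parent_index(bom)
    impacted = set()
    pending = deque([changed_part])
    # Breadth-first walk upward from the changed part toward the top-level vehicle.
    while pending:
        part = pending.popleft()
        for parent in parents.get(part, ()):
            if parent not in impacted:
                impacted.add(parent)
                pending.append(parent)
    return impacted


if __name__ == "__main__":
    # A defect in the (hypothetical) valve touches everything built on top of it.
    print(sorted(impacted_assemblies(BILL_OF_MATERIALS, "valve")))
```

In this toy example, a defect in the valve reaches the engine and tank, both stages, and the vehicle itself, which is the "flow back" effect Rose describes.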

The software stack – from monolith to microservices

Steven Sepanloo, a Software Engineering Manager on the team, ran us through their software stack and how it’s shifted over the four and a half years he’s worked at SpaceX. “It’s been evolutionary and not revolutionary until recently. Four years ago, we had AngularJS, C#, and an MSSQL relational backend, which was a very standard transactional stack for a single-page web application. Those applications still exist, but we’ve moved over to using Angular for newer applications.” The team has also started using containerization and building microservices, as well as moving away from SQL Server in favor of PostgreSQL.

Kyle Madonia has been with SpaceX for just shy of seven and a half years and works as a Senior Manager on the Application Software team. “When I started, we were just moving off ASP.NET Web Forms and beginning to use KnockoutJS, so this was even before Angular.”

Among the monolithic applications SpaceX used at the time, some ran on a homegrown JavaScript framework while others had started moving to KnockoutJS. “In 2014, we started building a new architecture. Fun fact, our old solution was called DS9, and this new approach was called TNG, because our team loved Star Trek,” says Madonia with a laugh. The team began using AngularJS and considering the idea of microservices.

“One problem with a monolith is that you have to deploy all of the pieces together. If something was broken in our shop floor system, but we had big changes that we wanted to release in the inventory system, we had to wait to deploy the changes,” says Madonia. “We were on a weekly release cadence and we needed to move faster.” To do that, the team focused on changing operations to minimize maintenance of older applications.

In 2019, as building systems for Starlink became a reality for SpaceX, the team saw an opportunity to move further in that direction. “We wanted a new manufacturing execution system for Starlink because we knew right away that building satellite dishes was going to be different from building rockets. For example, we didn’t need the whole shop floor application for this factory,” she says. The team began building something outside Warp Drive, the monolithic ERP system it had been using for years. “It uses ActiveMQ to make async API calls, and it sits apart from the main system.” 

From there, the team built a completely separate system for the operators working on satellite constellations, a new telemetry system, and all of the necessary consumer-facing apps for Starlink customers. “Every service is separate and has very specific rules that govern the communication and dependencies between them. You can always use APIs to get the information you need, but each service is not tied to anything else.”
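As a rough illustration of services that talk only through asynchronous messages, the sketch below has one service send a request to a named queue and wait for the reply on a queue of its own. The `InMemoryBroker` class, the queue names, and the message fields are all hypothetical. It is a stand-in built on Python's standard library, not ActiveMQ's API and not SpaceX's design.

```python
import queue
import threading


class InMemoryBroker:
    """Hypothetical stand-in for a message broker; this is not ActiveMQ's API."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def _queue_for(self, name):
        # Create each named queue lazily, guarding the dict against concurrent access.
        with self._lock:
            return self._queues.setdefault(name, queue.Queue())

    def send(self, name, message):
        self._queue_for(name).put(message)

    def receive(self, name, timeout=5.0):
        # Raises queue.Empty if nothing arrives within the timeout.
        return self._queue_for(name).get(timeout=timeout)


def inventory_service(broker, stock, stop_token="STOP"):
    """Consume availability requests and reply on the queue named in each request."""
    while True:
        request = broker.receive("inventory.requests")
        if request == stop_token:
            break
        available = stock.get(request["part"], 0) >= request["quantity"]
        broker.send(request["reply_to"], {"id": request["id"], "available": available})


def check_availability(broker, request_id, part, quantity):
    """Ask about stock through the broker, never by calling the service directly."""
    reply_queue = f"replies.{request_id}"
    broker.send(
        "inventory.requests",
        {"id": request_id, "part": part, "quantity": quantity, "reply_to": reply_queue},
    )
    return broker.receive(reply_queue)


def run_demo():
    broker = InMemoryBroker()
    worker = threading.Thread(target=inventory_service, args=(broker, {"antenna": 3}))
    worker.start()
    first = check_availability(broker, 1, "antenna", 2)
    second = check_availability(broker, 2, "antenna", 10)
    broker.send("inventory.requests", "STOP")
    worker.join()
    return first, second


if __name__ == "__main__":
    print(run_demo())
```

The caller knows only the queue name and the message shape, so the inventory service could be redeployed or replaced without touching the caller, which is the kind of independence the monolith lacked.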

Up, up, and away

For the team focused on the software outside of the vehicles, the big change has been less about the programming languages they use and more about the diversity of projects they support. “The interesting thing to me is that we have scaled what essentially feels like four parallel businesses: commercial payload delivery with Falcon, human spaceflight with Dragon, a global internet service provider with Starlink, and interplanetary transport with Starship,” says Sepanloo. Oh, and don’t forget the development of Starbase down in Texas. “A lot of the innovations have actually been on how we work and how we adapt as an organization in order to support these different business lines. We didn’t grow the size of our organization – instead, we looked at how we could be more efficient. For example, we brought engineers closer to the problem spaces to minimize the time needed between understanding a problem and building a robust solution.”

The mindset on this team is that you can only really reach mastery of what you’re doing if the person who is able to create change is the same person who understands what must be changed. That’s a high-minded way of saying product management decisions and engineering decisions live under one roof and, ideally, with one person or team.

Engineers are encouraged to figure out what needs to be done, explain the business model – what it impacts and why it’s worth doing – and then build a scalable and performant solution that solves the problem end-to-end. “That’s a very efficient process akin to how you might think a startup is built out,” says Sepanloo. “We really focused on bringing our engineers as close as possible to the actual system they’re impacting.”

As Rose explains it, the trick for teams supporting Starlink is to figure out “how you pivot a company that has been focused on building highly bespoke launch vehicles to building consumer grade electronics that have to scale.” To get his software engineers familiar with the assembly line that will be constructing the Starlinks, he moved them close to the metal. 

“Historically, we sat in the main building with all of the other software teams – it was the same building that housed the rocket factory, but we sat on a different floor. Right now, my team is right next to the Starlink assembly line. We’ve moved everybody over here so we can be on the line, understand it, and build into a system which is unfamiliar to many software engineers,” says Rose. “We actually took shifts working the different stations on the line so we could really understand what the challenges were. Since then, we’ve been able to start building some innovative solutions – and this is only the tip of the iceberg.”

Data data everywhere – or not

Starlink has not only introduced new ideas into the manufacturing process but has also led to new ideas around data ingestion and storage. “When you try something new for the first time, like you change a fundamental configuration of the rocket, it’s important to be able to analyze that data,” says Sepanloo. SpaceX has helped pioneer the idea of reusable rockets, which makes data on flight performance especially interesting. The second stage is discarded after use, but the booster (first stage) and fairings are reusable. “Our vehicles are not expendable. We’ve now flown one booster nine times, a milestone for us. Data as a function of increasing flight rate is very important to understanding the hardware.” 

Even operations that are in developmental testing provide critical data. “When Starship SN15 had a hard (and exciting!) landing, people assumed we would be unhappy, but we were celebrating,” says Madonia. “We did so many successful maneuvers and we were able to gather a ton of valuable data to improve our next attempt.”

With Starlink, a different approach is needed. Instead of processing data from a rocket launch, which has a clear start and end time, satellites and the network are constantly streaming data. While the team already had a data store for rockets, they built a new telemetry data store for Starlink using .NET 5, Kafka, HBase, HDFS, Docker, and Kubernetes. This system is designed to scale horizontally in order to accommodate very large-scale telemetry use cases, but the team remains vigilant about keeping only the minimum amount of data necessary. The strategy is to avoid analysis paralysis in a data swamp and instead capture only the data needed, analyze it, and try again.
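One way to picture "capture only the data needed" on a never-ending stream is a filter that keeps a sample only when it belongs to a channel someone asked for and its value has moved meaningfully since the last kept sample. This is a sketch under assumptions: the sample fields, the channel names, and the deadband rule are hypothetical, and the article does not say how the team actually decides what to keep.

```python
def minimal_telemetry(samples, wanted_channels, deadband):
    """Yield samples on wanted channels whose value moved by more than deadband.

    Works on any iterable, including an unbounded stream, because it is a
    generator and only remembers the last kept value per channel.
    """
    last_kept = {}
    for sample in samples:
        channel = sample["channel"]
        if channel not in wanted_channels:
            continue  # Nobody asked for this channel, so never store it.
        previous = last_kept.get(channel)
        if previous is None or abs(sample["value"] - previous) > deadband:
            last_kept[channel] = sample["value"]
            yield sample


if __name__ == "__main__":
    # Hypothetical stream: time, channel name, and a numeric reading.
    stream = [
        {"t": 0, "channel": "temp", "value": 20.0},
        {"t": 1, "channel": "temp", "value": 20.1},
        {"t": 2, "channel": "debug", "value": 1.0},
        {"t": 3, "channel": "temp", "value": 21.0},
        {"t": 4, "channel": "temp", "value": 21.2},
    ]
    for kept in minimal_telemetry(stream, {"temp"}, deadband=0.5):
        print(kept)
```

Here only the readings at `t=0` and `t=3` survive: the debug channel is dropped outright, and small wobbles in temperature are treated as noise.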

To explore the Solar System

Rose and Sepanloo are busy day to day with the nuts and bolts of building space vehicles. But they haven’t lost sight of the grander mission their software is enabling. “I think the interesting things that are going on with SpaceX are the big goals. The number of vehicles that need to be produced to build a self-sustaining civilization on Mars within a reasonable timeframe requires a pace of manufacturing not seen prior to what we’re trying to do,” says Sepanloo. 

As they see it, building the thing is hard, but building the thing that builds the thing presents challenges of its own.

From this team creating smarter factories for Starlink to automating manual work away from satellite and network operators, from building a great customer experience for Starlink to innovating software solutions to help build Starship to ensuring their systems can handle the hefty launch manifest in 2021, there’s no shortage of inspirational projects. “Every year, I keep thinking that this one is going to be such an exciting year for SpaceX,” says Madonia. “And every year is – but then I think it again the next year! When I started here, I never imagined that I would be working on software to help build out a global ISP that could bring Internet to underserved communities.” 

The team expects to automate systems rapidly as machinery becomes more and more complex. “That’s actually where I think a lot of the really cool innovation is going on right now to turn us into a company that can build out an entire fleet and run multiple spaceports,” says Sepanloo. “It’s a cool opportunity for our team to look at how we build the right systems to bridge the digital and the physical.”

If you want to learn more about working in software at SpaceX, check out their careers page. You can also check out the other blog posts in this series.
