The Boeing 737 MAX Saga: Lessons for Software Organizations

"The Boeing 737 MAX Saga: Lessons for Software Organizations" is the feature article for the June 2019 issue of the Software Quality Professional journal.

After two airline crashes, Boeing paused delivery of the 737 MAX. The story provides a case study on the interrelationships between software and systems engineering, human factors, corporate behavior, and customer service. This article examines the events from a safety and software quality perspective.

Software Release Process & Versioning Guide

Who: Benchmark Space Systems 
When: December 2018 - January 2019 

Benchmark Space Systems (BSS) reached out to us for help creating a formal software release process. They were primarily interested in enforcing quality checks throughout the development cycle to ensure their customers would receive only the highest quality software releases.

We began the project by interviewing the BSS team to understand their team structure, existing development and quality control processes, and goals for the project. After collecting this input, we drafted the initial software release process.

Two versions of the software release process were created: a high-level process flow and a detailed process flow. We walked through the process flows with the BSS team and provided recommendations for implementing the various process stages. We also recommended ways to automate portions of the process flow using Jenkins and the Embedded Artistry Jenkins Pipeline Library. BSS took the drafts and immediately began trying out the process flows over a two-week trial period.

After BSS implemented the initial process flows, clarified detailed steps, and provided feedback, we created detailed process documentation for the BSS team:

  1. Detailed software release process description document

  2. Software process summary diagrams

  3. Versioning guidelines

  4. Roles and responsibilities

  5. Release process checklist

  6. GitHub Issue template tailored for the BSS process

  7. GitHub Pull Request template tailored for the BSS process

  8. Visual Paradigm process flow diagrams for the BSS process

BSS ended the project with a functional software release process.

Testimonial

Paul Shepherd, Lead Electrical Engineer at BSS, had this to say:

Benchmark Space Systems worked with Embedded Artistry to develop our internal Software Release and Continuous Integration processes. From our first meeting, it was clear we shared a common belief that a good process is one where quality is the default outcome, rather than something you have to fight for. With Embedded Artistry's guidance, we were able to implement and deploy the processes immediately. We rest easier knowing that we are shipping only the highest quality software to our customers.

Sample Process Diagram

[Image: sample software release process diagram]

Manufacturing Test Firmware

Who: Inboard Technology
When: August 2018

Inboard contacted us to quickly create manufacturing test firmware for their new hardware platform. The engineer who wrote the original manufacturing test firmware had left the company, and the tests needed to be updated and expanded to support the new platform.

We ported existing tests to the new firmware platform and updated them to match the recent APIs. We also implemented additional tests for new hardware features. Along the way, we helped Inboard resolve hundreds of compiler warnings and a handful of customer-facing bugs.

After completing the manufacturing test firmware, we updated the factory test station sequence.

We provided overview documentation describing the logic behind each manufacturing test and instructions for updating and modifying the tests. We also created summary documentation describing the full manufacturing test flow, including the Windows application.

The project was completed in three weeks, enabling Inboard to begin testing their prototypes without delay.

Testimonial

Dan Casciato, Manufacturing & Test Engineer at Inboard Technology, had this to say:

It's simple: Embedded Artistry is very easy to do business with. Their deep industry knowledge and expertise makes them the ideal partner for any embedded project, no matter the size. Really, they're the best. 

Product Development & EPM Consulting

Who: Marble
When: September 2017 - August 2018 

Rozi consulted with Marble and advised them on product development strategy for their last-mile delivery robot. During the initial phase of the engagement, Rozi:

  • Created a program plan for development of last-mile delivery robots

  • Created an engineering schedule from prototype to production validation test

  • Defined milestones and deliverables for each department

  • Defined engineering build strategy and planning

  • Created a company-wide organization chart proposal, including roles & responsibilities

  • Streamlined company-wide communication

After creating the product development plan, Rozi continued to mentor the HW and SW team leads on team development and product development fundamentals.

Rozi led the talent search and hiring of a full-time Engineering Program Management (EPM) lead. While the hiring process was ongoing, Rozi provided EPM support for Marble in the following ways:

  • Created SW task/issue management and sprint planning processes

  • Created HW development and release processes

  • Defined the vendor management strategy

Once an EPM lead was hired, Rozi trained him on existing processes, company culture, and future goals before handing over management responsibilities and phasing off of the project.

Testimonial

Jason Calaiaro, HW lead at Marble, had this to say:

Rozi got our company back on track by helping us create a program plan and schedule, improving our engineering processes, and hiring a fantastic EPM lead.

STM32-based Power Control Board

Who: Marble
When: August 2018

We were contacted to provide quick-turn firmware development for a power control board. The power control board was destined for use in a robot, and its primary responsibility was to convert 48V down to 12V for various subsystems. Other core features were requested:

  • Control power distribution to various subsystems

  • Perform an automatic system power-on sequence

  • Collect telemetry data from INA233 sensors and the voltage regulator

    • Power/Current/Voltage/Temperature

  • Provide an I2C slave interface for testing and validation purposes

  • Broadcast telemetry data over CAN periodically

  • Provide a CAN interface for controlling the power system

  • Handle button presses to override the system power state

The most critical aspect of this project was the timeframe. The robots were to be assembled 45 days after the initial project discussion. By the time hardware was in hand, we were left with 28 working days. We completed the project in 22 days, well under the time budget. We were also able to release early firmware builds for testing and validation, allowing us to adjust the specification and behavior on the fly.

In order to meet this deadline, we used the STM32 code generation capabilities and FreeRTOS. While we do not normally support vendor-generated code on production products, the STM32CubeMX software allowed us to create an initial design within the specified time period. The firmware is also intended for prototype systems, allowing for longer-term improvements to be made before hardware/software is deployed to customers.

The firmware was designed in an event-driven manner, with separate threads to handle the I2C slave interface, CAN broadcasting, CAN command/response, caching telemetry data, and changes to the system power state. By keeping the design simple and event-driven, we were able to quickly implement all of the required features. 
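As a rough illustration of the event-driven idea (not the delivered firmware), the sketch below posts events to a queue and handles them one at a time. It is a host-side, single-threaded simplification: in a FreeRTOS design each consumer would typically be a task blocked on its own queue, which is replaced here by a plain `std::queue`. The event names, the `PowerController` class, and the toggle behavior of the button are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>

// Hypothetical event types; the real firmware's event set is not documented here.
enum class EventType : std::uint8_t {
    ButtonPress,     // user override of the system power state
    CanPowerCommand, // power request received over CAN
    TelemetryTick    // periodic request to refresh cached telemetry
};

struct Event {
    EventType type;
    std::uint32_t arg; // meaning depends on type (assumed: 1 = on, 0 = off)
};

class PowerController {
public:
    // Producers (interrupts, other tasks) would call this to queue work.
    void post(const Event& e) { pending_.push(e); }

    // Drain the queue, handling one event at a time. Returns the number handled.
    std::size_t run_pending() {
        std::size_t handled = 0;
        while (!pending_.empty()) {
            const Event e = pending_.front();
            pending_.pop();
            handle(e);
            ++handled;
        }
        return handled;
    }

    bool power_on() const { return power_on_; }
    std::uint32_t telemetry_refreshes() const { return telemetry_refreshes_; }

private:
    void handle(const Event& e) {
        switch (e.type) {
        case EventType::ButtonPress:
            power_on_ = !power_on_; // assumed behavior: the button toggles power
            break;
        case EventType::CanPowerCommand:
            power_on_ = (e.arg != 0);
            break;
        case EventType::TelemetryTick:
            ++telemetry_refreshes_; // stand-in for reading sensors into a cache
            break;
        default:
            assert(false && "unhandled event type");
            break;
        }
    }

    std::queue<Event> pending_;
    bool power_on_ = false;
    std::uint32_t telemetry_refreshes_ = 0;
};
```

Because all state changes pass through `handle()`, adding a feature in this style usually means adding one event type and one `case`, which is one reason a simple event-driven design can be quick to extend.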

We also delivered a product specification which covered the hardware components, firmware requirements, I2C slave communication interface, and CAN communication interface.

Components used on this product:

  • STM32F103

  • INA233 Power/Current/Voltage Sensor

  • BMR456 Voltage Regulator

Communication Protocols Used on this Product:

  • I2C Master

  • I2C Slave (implemented interface)

  • CAN (implemented interface)

Testimonial

Jason Calaiaro, HW lead at Marble, had this to say:

Every time Phillip helps us on a project, he dives right in to understand the requirements and schedule constraints. He provides us with a detailed plan of what he will accomplish and when, he always delivers on time, and his documentation is incredible.

Snapdragon Flight Driver Development

Who: RavenOps
When: 1/2018-3/2018
Where: San Francisco, CA
Languages: C, C++

RavenOps transitioned from the Crazyflie2 to the Snapdragon Flight platform. We assisted with the transition in the following ways:

  • Researching the platform and identifying an approach for writing custom device drivers for hardware components which integrate with the DSP
  • Writing a proof-of-concept driver and test application using a Snapdragon driver framework
  • Porting RavenOps's Time-of-Flight driver from the Crazyflie2 to the Snapdragon Flight
  • Writing a demo application which demonstrates use of the driver
  • Documenting the development environment setup and assisting the team with it
  • Documenting the driver framework and demonstration project, with notes on how to expand APIs and add drivers

The following articles were published as a result of our learnings on this project:

Petzi Treat Camera Update

Who: Petzila
When: 9/2017-12/2017
Where: San Francisco, CA
Languages: C

After implementing support for the MW300 platform and AWS IoT backend, Petzila was interested in migrating their existing hardware platform to the new server setup. This required migrating the existing firmware to the new SDK. Firmware support was unified between the two platforms as much as possible.

Due to the constrained nature of the original MC200-based design, there were significant challenges with memory size during the port. Much of our work involved size optimizations and buffer tuning, which reduced the overall firmware size by 20%. We successfully ported the new firmware and AWS IoT backend to the MC200-based platform and rolled off of the project.

Build System Overhaul & Jenkins Pipeline Setup

Who: Marble
When: 9/2017-12/2017
Where: San Francisco, CA
Languages: C++, CMake, Groovy, Jenkins Pipeline

Rozi was consulting with Marble and recommended that they overhaul their software development processes. Phillip worked with Marble to improve their build system, re-enable unit tests, and bring up a build server with multiple nodes for load balancing.

Improving the build system included:

  • Refactoring the build system from using three separate CMake builds in a sequence to using a single CMake build
  • Reducing build times
  • Re-enabling unit tests and helping the team get tests to pass
  • Enabling out-of-source build support
  • Supporting software variants with compile-time settings

Once the build system was updated, we created a Jenkins server for Marble HQ which utilized our Jenkins Pipeline Library. The server utilized three slave nodes to support multiple concurrent builds, as the typical build took ~12 minutes (including unit tests). This was a drastic improvement over the 45-minute Travis CI builds (without unit tests) which the team was previously using.

The Jenkins server was configured to notify GitHub of the build & test status for each pull request. The GitHub process was updated so that pull requests could not be merged unless the build was successful and all tests passed.

Testimonial

Kevin Peterson, SW lead at Marble, had this to say:

Phillip is a talented embedded software engineer who did some excellent work for Marble. He did a great job on several projects for us including build server development and firmware for a board we designed. One of his truly excellent traits is his ability to come in on time. Phillip carefully scopes his work and sticks to the original scoping. His documentation is also incredibly thorough. Very impressive.

Crazyflie: ESB Broadcasting Protocol

Who: RavenOps
When: 6/2017-9/2017
Where: San Francisco, CA
Languages: C

RavenOps was initially using the Crazyflie2 platform for demo projects. The Crazyflie platform natively supports 1:1 communications between a drone and a base station, and RavenOps needed to enable drones to broadcast information to other drones in the vicinity, as well as for a base station to broadcast commands to the entire drone fleet.

Our support included:

  • Modifying the ESB layer to support sending and receiving broadcast commands
  • Adopting CrazySwarm code for a host controller
  • Writing a logging library to buffer log messages to dump at a later time for debugging
  • Investigating solutions for BLE/ESB coexistence problems
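A logging library that buffers messages for a later dump is commonly built on a fixed-size ring buffer, so that logging stays cheap and bounded in memory. The sketch below shows that general technique on a host in C++; the project itself was written in C, and the `DeferredLog` name, slot count, and truncation policy are assumptions for illustration rather than the library's actual design.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Fixed-capacity log buffer: keeps the most recent `Slots` messages,
// each truncated to MaxLen - 1 characters. No heap use while logging.
template <std::size_t Slots, std::size_t MaxLen>
class DeferredLog {
    static_assert(Slots > 0, "need at least one slot");
    static_assert(MaxLen > 1, "need room for one character plus terminator");

public:
    // Store a message, overwriting the oldest entry once the buffer is full.
    void log(const char* msg) {
        assert(msg != nullptr);
        char* slot = slots_[head_].data();
        std::strncpy(slot, msg, MaxLen - 1);
        slot[MaxLen - 1] = '\0'; // strncpy does not terminate on truncation
        head_ = (head_ + 1) % Slots;
        if (count_ < Slots) {
            ++count_;
        }
    }

    // Return the buffered messages, oldest first, and clear the buffer.
    // (A firmware version would more likely write to a UART or radio link.)
    std::vector<std::string> dump() {
        std::vector<std::string> out;
        std::size_t idx = (head_ + Slots - count_) % Slots;
        for (std::size_t i = 0; i < count_; ++i) {
            out.emplace_back(slots_[idx].data());
            idx = (idx + 1) % Slots;
        }
        head_ = 0;
        count_ = 0;
        return out;
    }

    std::size_t size() const { return count_; }

private:
    std::array<std::array<char, MaxLen>, Slots> slots_{};
    std::size_t head_ = 0;  // next slot to write
    std::size_t count_ = 0; // number of valid entries
};
```

The sketch is not interrupt- or thread-safe; a real implementation would need a critical section or lock around `log()` and `dump()`.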

The broadcasting software was used in early RavenOps investor demos.

AWS IoT Migration

Who: Petzila
When: 5/2017-8/2017
Where: San Francisco, CA
Languages: C

When we first met Petzi, they were developing their second generation treat camera. We assisted with new product development in the following ways:

  • Ported existing platform from MC200 to MW300
  • Migrated to new SDK
  • Brought up and debugged the new hardware design
  • Added support for AWS IoT infrastructure
  • Upgraded authentication and server interactions to work with new AWS-based backend
  • Implemented and validated new OTA logic, with fallback support for the old server environment
  • Resolved issues with on-boarding and provisioning flow
  • Tuned firmware settings for new hardware

The following article was published as a result of our learnings on this project:

Doblet v1.5 Ecosystem Support & USB Debugging

Who: Doblet
When: 4/2017-8/2017
Where: San Francisco, CA
Languages: C

Phillip supported Doblet in various capacities as they were developing the ecosystem for their v1.5 charger:

  • Migrating source code back to git, as releases were typically archived in zip-file drops and in Particle's online IDE
  • Architecting a new communication protocol for gathering data from Doblet chargers and passing it to the Particle cloud
  • Rewriting firmware for more robust communication & simpler control flow
  • Debugging long-standing USB issues for USB-C and USB-micro devices
  • Debugging issues with charging stand electrical hardware and firmware
  • Assisting in chip selection and hardware refresh of the Doblet charger
  • Writing drivers for new components

The following articles were published as a result of our learnings on this project:

Build Server & Buildroot Configuration

Who: Industrial Optic
When: 4/2017-5/2017
Where: San Francisco, CA
Languages: Buildroot, Jenkins pipelines

Phillip supported Industrial Optic in various capacities as they were developing their initial prototypes:

  • Bring-up of a Jenkins server for continuous integration and nightly builds
  • Configuring Buildroot for their Raspberry Pi CM3 demo platform
  • Providing feedback on electrical components, such as the motor controller driver

The following article was published as a result of our learnings on this project:

Rylo Camera

Who: Rylo
When: 9/2016-4/2017
Where: San Francisco, CA
Languages: C++, CMake, Groovy, Jenkins Pipeline

We supported Rylo's first product development effort with the following services:

  • Schematic review
  • Build system & build server setup
  • Creation of product development roadmap & schedule
  • Creation of manufacturing test plan
  • Management of multiple external vendors
  • Build support in China
  • Bring-up of manufacturing test firmware
  • Bring-up of customer camera firmware behavior

RearVision

Who: Pearl Automation
When: 2014-2016
Where: Scotts Valley, CA
Languages: C, C++, ARM assembly, Thrift, Python

Pearl's RearVision consists of two separate embedded devices: the camera system, mounted on the car's license plate, and an in-car OBD-II powered system.

Phillip was an early hire at Pearl (employee #12), joining right after the Series A closed. He was the primary developer for the camera system.

In the early stages of the company, Phillip performed initial bringup of the system and was responsible for:

  • SOC Bringup
  • RTOS Selection
  • Setting up the build system
  • Porting initial software from gcc to clang
  • Software / source tree architecture
  • Implementing the DFU framework to flash devices
  • Creating a USB CDC shell and defining commands for various drivers
  • Driver bringup for multiple devices
    • SPI
    • SPI-NOR
    • Solar IC
    • GasGauge
    • USB
    • EHCI (ported u-boot EHCI stack to our system)
    • Cameras
  • Defining factory test methods and processes

After the bringup stage, focus shifted to firmware support for the camera sub-system. Responsibilities included:

  • Implementing low-level C++ functionality in the RearVision frame firmware
  • Converting early libraries and drivers to C++
  • Converting the C-based USB Host stack to C++
  • Porting vendor host-side Camera USB/UVC APIs from Linux to an RTOS
  • Managing interfaces for client software to initiate and configure video streams
  • Working closely with our ISP vendor to:
    • Identify and prioritize requirements and issues
    • Assist in debugging ISP, video, and memory issues
    • Optimize ISP pipeline performance to reduce latency and improve quality over BT/Wifi
  • Optimizing video system memory usage
    • Creating a buffer pool class to re-use large buffers in the camera path
    • Eliminating all copies from the USB->Camera path when getting new frames
    • Converting malloc()/free() calls to use smart pointers
    • Debugging memory stompers
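To illustrate how a buffer pool and smart pointers can work together, here is a minimal sketch under stated assumptions; it is not the RearVision code. The `BufferPool` class, its `acquire()` method, and the use of `std::vector` as the buffer type are hypothetical. The pool pre-allocates its buffers once, and each handle is a `std::unique_ptr` whose custom deleter returns the buffer to the pool instead of freeing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

class BufferPool {
public:
    using Buffer = std::vector<std::uint8_t>;

    // Custom deleter: hands the buffer back to the pool rather than freeing it.
    struct Recycler {
        BufferPool* pool;
        void operator()(Buffer* b) const { pool->release(b); }
    };
    using Handle = std::unique_ptr<Buffer, Recycler>;

    // Allocate every buffer up front so the frame path never hits the heap.
    BufferPool(std::size_t count, std::size_t buffer_size) {
        assert(count > 0 && buffer_size > 0);
        free_.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            free_.push_back(std::make_unique<Buffer>(buffer_size));
        }
    }

    // Returns an empty handle when the pool is exhausted; callers must check.
    Handle acquire() {
        Buffer* raw = nullptr;
        if (!free_.empty()) {
            raw = free_.back().release();
            free_.pop_back();
        }
        return Handle(raw, Recycler{this});
    }

    std::size_t available() const { return free_.size(); }

private:
    // Never reallocates: capacity was reserved for `count` buffers.
    void release(Buffer* b) { free_.emplace_back(b); }

    std::vector<std::unique_ptr<Buffer>> free_;
};
```

Two caveats apply to this sketch: the pool must outlive every handle it gives out, and it is not thread-safe, so a multi-threaded camera path would need a lock around `acquire()` and `release()`.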

Further Reading:
PearlAuto Website

iPhone 6 & 6+

Who: Apple
When: 2013-2014
Where: Cupertino, CA; Shenzhen, CN
Languages: C, Lua

After the iPhone 5C, Phillip transitioned onto bringup for the iPhone 6/6+. A shortage of team members meant that Phillip managed both the iPhone 6 and iPhone 6+ projects. After EVT was completed, he transitioned to project management of the iPhone test lines. He traveled to most builds for both programs during this product cycle.

Factory SW:
During the prototype phase, Phillip was responsible for rapid bringup of new drivers for evaluating parts at prototype builds. He worked closely with the product development and reliability teams to make sure they had the ability to validate all the parts under consideration while at the builds. He also expanded our driver and factory test support to encompass new design changes.

Phillip spent significant time training and developing the CM software team. Unlike on the iPhone 5C, he managed to get the iPhone 6 CM team to help write software, allowing the firmware team to manage their immense workload by offloading tasks to the CM team. Phillip's strategy was to keep the hardest tasks for himself - once a task could be crystallized into a simple set of instructions, he would utilize the CM team. This allowed him to stay ahead of deadlines with an extremely challenging schedule, and helped to expand his communication and project management abilities. It is not simple to provide instructions across language barriers!

Current testing was expanded on the iPhone 6/6+ programs. Limits could now be set for different phones while keeping the same core software logic (previously, duplicate software was required). Also, much of the current testing coverage was pushed up to SMT to catch failures before final assembly.
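One common way to keep a single copy of the test logic while varying limits per product is to move the limits into a data table that is selected at runtime. The sketch below shows that general pattern only; the test names, limit values, and the `check_current` function are placeholders invented for the example and do not reflect the actual factory software, which was written in C and Lua.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Allowed current window for one test item, in milliamps.
struct Limit {
    double min_ma;
    double max_ma;
};

// One table per product; the keys are test item names (hypothetical).
using LimitTable = std::unordered_map<std::string, Limit>;

enum class Verdict { Pass, Fail, NoLimit };

// Shared core logic: the same function serves every product,
// and only the table passed in differs.
Verdict check_current(const LimitTable& limits,
                      const std::string& test_name,
                      double measured_ma) {
    const auto it = limits.find(test_name);
    if (it == limits.end()) {
        return Verdict::NoLimit; // no limit defined for this item
    }
    const Limit& limit = it->second;
    assert(limit.min_ma <= limit.max_ma);
    const bool in_range =
        measured_ma >= limit.min_ma && measured_ma <= limit.max_ma;
    return in_range ? Verdict::Pass : Verdict::Fail;
}
```

With two tables that differ only in their limits, the same measurement can pass on one product and fail on another without any change to `check_current`.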

Factory PM:
After transitioning to project management, Phillip's primary responsibility was the factory test and calibration lines. He was responsible for coordinating software and fixture readiness, ensuring that software, fixtures, and other critical deliverables were ready for each build. He frequently briefed executives on program status and readiness for hardware builds.

Phillip became well-versed in crisis management, as there were always new failures, missing coverage, late deliverables, and missing support. Prioritization and triage were important in making sure build goals were met without further delays.

Phillip managed test plans, test coverage, and line flow for iPhone 6 and iPhone 6+ development, including these areas:

  • SMT
  • Subassembly
  • IQC
  • FATP
  • Rel
  • Packout

iPhone 5C

Who: Apple
When: 2013
Where: Cupertino, CA; Shanghai, CN
Languages: C, Lua

The iPhone 5C was Phillip's first project at Apple. Phillip joined during the EVT stage of the project, and his primary focus was expanding test coverage, debugging failures, fixing issues, and driving down retest rates and cycle time. He traveled to each remaining development build until the product ramped (4 builds).

Prior to the development builds, Phillip worked with a variety of external teams to drive and implement test requirements:

  • System HW
  • RF HW
  • Touch firmware/test
  • Sensor HW
  • Operations
  • Reliability

During a late-stage part switch, Phillip performed a rapid bringup of a new accelerometer - this was a high pressure situation, as the build material was being held until firmware support was completed.

At development builds, Phillip worked closely with the Apple System HW team to debug failing units. Together, they worked to distinguish hardware failures from software bugs and add new coverage to catch failures that were missed.

Phillip joined forces with the System HW and RF HW teams to expand the current testing suite run during burn-in. This testing suite helped identify numerous software bugs and SMT issues, and caught a major reliability issue prior to MP. The success of current testing on this product resulted in its expansion and use at SMT in later iPhone products.

Phillip also worked closely with the operations team to monitor retest rates and test cycle times, helping improve UPH on key test items.

Phillip taught the CM's software team to help triage and debug manufacturing test issues. During later iPhone builds, failure and retest quantities became too large for one person to filter through. Developing the skills of the CM software team helped the firmware team stay afloat and focus on new issues rather than duplicates.

MIL-STD-1553B Embedded Development Kit

Who: Georgia Tech Research Institute (GTRI), Georgia Institute of Technology (GT)
When: 2012
Where: Atlanta, GA
Languages: C (pre-C99), i8085 assembly, VHDL

GTRI has multiple contracts to redesign existing systems and update them with modern electronics. The i8085 was nearing end-of-life, so GTRI requested that we implement the i8085 in VHDL and produce a development kit for prototyping new hardware. The final product was a Micro-C program (running on the i8085 implemented in VHDL) that talked over a MIL-STD-1553B bus to blink a light on a receiving board.

The MIL-STD-1553B Embedded Development Kit consisted of a Xilinx Spartan FPGA and two MIL-STD-1553B capable daughter cards of our own design. The FPGA board interfaced with MIL-STD-1553B transceivers to prove that we could communicate over the bus with our software.

Our primary challenge was getting software running on our i8085 FPGA device. We revived an old ISO-C compiler (Micro-C) and managed to get software compiling in a virtual machine. Next, the vendor's 1553 driver needed to be back-ported for i8085 compilation. This was challenging, as many modern conveniences and best practices are not valid prior to C99. The compiler also had its own separate limitations, such as lack of support for float/double/long/enum/typedef.

After getting example code running on our i8085 design, we created a VHDL system architecture that allowed us to interface with our daughter board and run the i8085 demo program.

Writing Samples

Source Code

8085-HI6130-Port: Port of HI-6130 Driver and Demo Project to a MICRO-C Compiler for the 8085
SREC-to-COE: Converts Motorola SREC hex files to Xilinx COE memory initialization files.
1553-Firmware: Contains VHDL implementing an 8085, Holt HI-6130 1553 IC, and Memory. Also includes firmware used to demo the system.
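For readers unfamiliar with the two file formats, the sketch below shows the core of an SREC-to-COE conversion. It is an independent illustration, not the code from the SREC-to-COE repository, and the function names `parse_s1_data` and `to_coe` are hypothetical. It assumes S1 records (16-bit addresses), a memory that is one byte wide, and data that is contiguous from the start of memory, so record addresses are checked by the checksum but otherwise ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Decode the two hex characters starting at `pos`.
static std::uint8_t hex_byte(const std::string& s, std::size_t pos) {
    assert(pos + 2 <= s.size());
    return static_cast<std::uint8_t>(std::stoul(s.substr(pos, 2), nullptr, 16));
}

// Extract the data bytes from one S1 record: "S1" + count + 16-bit address
// + data + checksum. The count covers address, data, and checksum bytes.
std::vector<std::uint8_t> parse_s1_data(const std::string& rec) {
    if (rec.size() < 10 || rec[0] != 'S' || rec[1] != '1') {
        throw std::invalid_argument("not an S1 record");
    }
    const std::size_t count = hex_byte(rec, 2);
    if (count < 3 || rec.size() < 4 + 2 * count) {
        throw std::invalid_argument("bad byte count");
    }
    unsigned sum = static_cast<unsigned>(count);
    std::vector<std::uint8_t> data;
    for (std::size_t i = 0; i + 1 < count; ++i) { // address bytes, then data
        const std::uint8_t b = hex_byte(rec, 4 + 2 * i);
        sum += b;
        if (i >= 2) {
            data.push_back(b); // skip the two address bytes
        }
    }
    // Checksum is the ones' complement of the low byte of the sum.
    const std::uint8_t checksum = hex_byte(rec, 4 + 2 * (count - 1));
    if (static_cast<std::uint8_t>(~sum) != checksum) {
        throw std::runtime_error("checksum mismatch");
    }
    return data;
}

// Format bytes as a COE file: radix header, then comma-separated values
// terminated by a semicolon.
std::string to_coe(const std::vector<std::uint8_t>& bytes) {
    static const char digits[] = "0123456789ABCDEF";
    std::ostringstream out;
    out << "memory_initialization_radix=16;\n"
        << "memory_initialization_vector=\n";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        out << digits[bytes[i] >> 4] << digits[bytes[i] & 0x0F]
            << (i + 1 < bytes.size() ? ",\n" : ";\n");
    }
    if (bytes.empty()) {
        out << ";\n";
    }
    return out.str();
}
```

For example, the record `S10500001234B4` carries the two data bytes 0x12 and 0x34 at address 0x0000, and `to_coe` renders them as `12,` and `34;` after the two header lines.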

High Level Schedule / Waterfall

[Image: high-level schedule / waterfall chart]