Firmware updates over Low-Power Wide Area Networks
This post was originally published on Mbed Developer Blog.
Firmware updates are essential for large-scale deployments of connected devices. Security patches protect customer and business data, and new functionality, optimizations and specialization extend the lifetime of devices. This article demonstrates firmware updates over the most challenging type of network: low-power, long-range networks.
Billions of Internet of Things (IoT) devices are hitting the market in the next few years, and industry leaders are pouring billions of dollars into the ecosystem. IoT devices need both long range and low power consumption, with a battery life that lasts years. Traditional wireless network technologies, such as cellular and Wi-Fi, cannot accommodate these needs. To meet these requirements, new network technologies have emerged in the past few years: the so-called “Low-Power Wide Area Networks” (LPWANs). Networks such as LoRaWAN, Sigfox and NB-IoT combine cheap radio chips with kilometers of range and very low battery consumption.
A downside of these networks is that their data rates are far lower than those of traditional radio networks. Data rates in LPWANs are measured in bits per second, rather than megabits per second. Additionally, many of these networks operate in the unlicensed spectrum (ISM band), which requires devices to adhere to duty cycle limitations: a device may only transmit a fraction of the time, and it still suffers from interference. These characteristics make it difficult to support firmware updates over the air. Without over-the-air updates, you can never update most of the devices deployed in the field: the devices are deployed in places that are impossible to reach, or the cost of sending a technician is too high with thousands of devices in a variety of places.
Not being able to update the firmware on IoT devices is unacceptable in an actual deployment, for several reasons. First, it's impossible to write 100% secure software, and security incidents involving connected devices occurred many times in 2016. Second, these devices are supposed to last for up to ten years, so keeping up to date with the latest standards and protocols becomes more and more important. Lastly, being able to add functionality or specialize devices throughout their lifetime, from manufacturing and distribution to transfer of ownership or change of purpose, would secure various business cases. This prompted Jan Jongboom (Principal Applications Engineer, ARM) and Johan Stokking (CTO & Co-Founder, The Things Industries), both active members of the LoRa Alliance (governing the LoRaWAN standard), to work on a proposal to properly allow these devices to update over LPWANs. A demonstration of this work will occur at the LoRa Alliance All Members Meeting and Open House in Philadelphia, June 12-14, 2017.
This article focuses on the work done on LoRaWAN, as well as challenges in terms of power consumption, link loss and limited data rate. These challenges also apply to other LPWANs.
The key requirements for firmware updates over LPWANs are:
1. The ability to send data to multiple devices at the same time (so-called multicast) in an efficient manner in terms of power consumption and channel utilization.
2. Recovering from lost packets.
3. Verifying the authenticity and integrity of the firmware, while following standards end-to-end.
This article will discuss these challenges one by one and present a solution.
Adding multicast support
Unlike cellular or Wi-Fi, where a device maintains a connection with the network at all times, most LPWANs, including LoRaWAN, are uplink oriented. In other words, sending data (uplink) is more important than receiving data (downlink). The network can only send a downlink message at set times, in time windows referred to as RX windows. These RX windows only open shortly after a transmission. This is great for battery life because the device does not need to maintain a connection with the network and can instead sleep as much as possible. For sending firmware images, this is terrible: you need downlink-oriented transmission of many packets. With a maximum payload size of 115 bytes at the second-highest LoRaWAN data rate (spreading factor 9), you need to exchange 869 messages to send a 100 KB firmware image. Because of the 1% duty cycle in many markets (including Europe), this requires over 9 hours (at 400 ms time on air per message) to update a single device, assuming no packet loss. In addition, the gateways may cover hundreds or thousands of devices and are also subject to duty cycle limitations, so it may take weeks to update a fleet of devices. Finally, for every received packet, the device first has to transmit, which consumes a lot of energy (transmitting draws 40 mA on LoRa and receiving 9 mA) and uses a lot of the available spectrum.
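The numbers in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes that "100 KB" means 100,000 bytes and that each message effectively occupies its time on air divided by the duty cycle; the function names are only illustrative. Rounding the last partial message up gives 870 messages, while the 869 quoted above matches rounding 100,000 / 115 down.

```python
import math

FIRMWARE_BYTES = 100_000   # assumption: "100 KB" read as 100,000 bytes
PAYLOAD_BYTES = 115        # maximum payload at spreading factor 9 (from the text)
TIME_ON_AIR_S = 0.4        # 400 ms per message (from the text)
DUTY_CYCLE = 0.01          # 1% duty cycle (from the text)


def messages_needed(firmware_bytes: int, payload_bytes: int) -> int:
    """Messages required to carry the image, rounding the last partial one up."""
    return math.ceil(firmware_bytes / payload_bytes)


def duty_cycled_hours(messages: int, time_on_air_s: float, duty_cycle: float) -> float:
    """Wall-clock hours if each message occupies time_on_air / duty_cycle seconds."""
    return messages * (time_on_air_s / duty_cycle) / 3600


print(messages_needed(FIRMWARE_BYTES, PAYLOAD_BYTES))                 # 870
print(round(duty_cycled_hours(869, TIME_ON_AIR_S, DUTY_CYCLE), 2))    # 9.66
```

Each 400 ms message costs 40 seconds of wall-clock time under a 1% duty cycle, which is where the figure of more than 9 hours for a single device comes from.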
To enable proper firmware updates, one needs to add two features to the devices and the network:
- A way to send the firmware image without requiring the device to transmit first, optimizing the device’s duty cycle and power consumption.
- Multicast support - for updating multiple devices at the same time, optimizing the gateway duty cycle.
The first step is to get all devices that you need to update to listen at exactly the same time with the same frequency, data rate and security session. If you load the same keys onto the devices (LoRaWAN uses AES-128 encryption for packets), then they can all receive and decrypt the same packets, as if they were one device. Once you are certain that the devices are listening, you can start broadcasting the firmware image without the need for devices to transmit first. This means that firmware updates need to be scheduled, typically hours or days in advance, depending on the sleep behavior of the devices.
Devices typically operate in a secure session that is unique to the device and the network and that is established when the device is activated. Because most LPWANs are uplink oriented, the network has to wait for each device to send a regular message before it has the opportunity to send the instructions that set up the multicast group for the interested devices.
The first instruction contains the temporary, common device address and security session keys that you use for all devices in the multicast group. It also includes the maximum number of packets for which the group is valid. The second instruction informs the device when to wake up from sleep, which is a relative value in seconds for each device, to start listening at a specific frequency and data rate. The device acknowledges the instructions to the network, which then has the update ready to send at the scheduled time.
Once the update window opens and all devices wake up from sleep at the same time, the network can start sending the firmware as quickly as possible. Because the network can continuously send messages, you can transmit the 869 packets (100 KB) in under six minutes (at 400 ms time on air per packet). The network still needs to adhere to the duty cycle limitation of the gateways that send the packets, so these gateways need to be quiet for a relatively long time after sending the firmware, but they can still receive messages. If a downlink message needs to be sent in the meantime, another gateway can handle it. In a proper setup, there are always multiple gateways within reach of a device to balance channel use.
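As a rough illustration of these two instructions, the sketch below models them as plain data structures. All class, field and function names are hypothetical and are not taken from the LoRaWAN specification or from the reference implementation; the point is only to show why the wake-up time is a per-device relative value: every device receives its instruction at a different moment, yet all of them must start listening together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MulticastGroupSetup:
    """First instruction: shared address, keys and lifetime (hypothetical layout)."""
    group_address: int           # temporary device address common to the group
    network_session_key: bytes   # shared AES-128 key, 16 bytes
    app_session_key: bytes       # shared AES-128 key, 16 bytes
    max_packets: int             # the group is valid for at most this many packets


@dataclass(frozen=True)
class MulticastSchedule:
    """Second instruction: when and where this device should listen."""
    wake_in_seconds: int         # relative to the moment the instruction is sent
    frequency_hz: int
    data_rate: int


def schedule_for(window_start: int, sent_at: int,
                 frequency_hz: int, data_rate: int) -> MulticastSchedule:
    """Turn the absolute start of the update window into a per-device delay."""
    if window_start <= sent_at:
        raise ValueError("update window must lie in the future")
    return MulticastSchedule(window_start - sent_at, frequency_hz, data_rate)


# Two devices are instructed at different times (arbitrary example values,
# in seconds) but end up waking at the same absolute moment.
first = schedule_for(window_start=10_000, sent_at=4_000, frequency_hz=869_525_000, data_rate=3)
second = schedule_for(window_start=10_000, sent_at=7_500, frequency_hz=869_525_000, data_rate=3)
print(first.wake_in_seconds, second.wake_in_seconds)  # 6000 2500
```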
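The broadcast time and the quiet period that follows can be estimated in the same back-of-the-envelope way. This is a simplified model, assuming a 1% duty cycle that is averaged over one long window on a single sub-band; actual regulations and gateway configurations differ, so the result is only an order of magnitude. The function names are illustrative.

```python
def broadcast_minutes(packets: int, time_on_air_s: float) -> float:
    """Time needed to send all packets back to back."""
    return packets * time_on_air_s / 60


def quiet_hours_after(airtime_s: float, duty_cycle: float) -> float:
    """Silence needed so that airtime stays within duty_cycle on average (assumed model)."""
    return (airtime_s / duty_cycle - airtime_s) / 3600


airtime_s = 869 * 0.4
print(round(broadcast_minutes(869, 0.4), 1))          # 5.8
print(round(quiet_hours_after(airtime_s, 0.01), 1))   # 9.6
```

Under this model, the airtime that one device would need hours to accumulate is spent once by the gateway and shared by every device in the multicast group.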
Security in multicast sessions
When instructing multiple devices to join a temporary multicast session in which all the devices share the same session keys, there is a potential security risk if one of the devices gets compromised: packet injection. With the multicast session keys, the attacker can send packets as if they came from the server. This is indeed a serious issue when using multicast without additional security measures, for example to control lights simultaneously, but this update mechanism contains three measures to secure the update process.
First, once the file has been received, each device calculates the checksum of the data that it has received. This checksum is sent to the server on the device's private secure session. The server compares this checksum with the checksum of the data that it sent. This check will fail if the data has been tampered with. The server tells each device individually, on its private secure session, whether the checksum is correct.
Second, as part of its response indicating whether the checksum is correct, the server sends a message integrity code (MIC), which guarantees the integrity of the data to the device. This MIC cannot be forged by anyone who does not know the device's private secure session keys: only the device and the server can calculate the same MIC. So the server checks the device's checksum, and the device checks the server's MIC, both communicating on the device's private secure session.
Third, when an attacker injects random packets, the device may not be able to reconstruct the original image. To prevent devices from running out of power because they keep listening for error correction packets, as presented in the next section, the multicast session has a lifetime: a fixed limit on the number of messages. When reaching this limit, the device will switch back to its private secure session and power-efficient operating mode, and discard all data.
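The exchange described in the first two measures can be sketched as follows. This is not the actual LoRaWAN message format: the checksum is assumed to be a CRC-32, the one-byte verdict is made up for the example, and the MIC is a truncated HMAC-SHA256 standing in for the 4-byte AES-CMAC that LoRaWAN uses, because the Python standard library has no CMAC. All function names are hypothetical.

```python
import hashlib
import hmac
import zlib


def firmware_checksum(image: bytes) -> bytes:
    """4-byte CRC-32 of the image (assumed checksum for this sketch)."""
    return zlib.crc32(image).to_bytes(4, "big")


def mic(private_key: bytes, message: bytes) -> bytes:
    """Stand-in MIC: HMAC-SHA256 truncated to 4 bytes (LoRaWAN uses AES-CMAC)."""
    return hmac.new(private_key, message, hashlib.sha256).digest()[:4]


def server_response(private_key: bytes, sent_image: bytes, reported: bytes) -> bytes:
    """Server side: compare the reported checksum and sign the verdict."""
    matches = hmac.compare_digest(firmware_checksum(sent_image), reported)
    verdict = b"\x01" if matches else b"\x00"
    return verdict + mic(private_key, verdict + reported)


def device_accepts(private_key: bytes, own_checksum: bytes, response: bytes) -> bool:
    """Device side: accept only a positive verdict that carries a valid MIC."""
    verdict, received_mic = response[:1], response[1:]
    expected = mic(private_key, verdict + own_checksum)
    return hmac.compare_digest(received_mic, expected) and verdict == b"\x01"
```

An attacker who only holds the multicast session keys cannot compute a valid MIC for a positive verdict, because the MIC is keyed with the device's private session key.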
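A minimal sketch of this session lifetime on the device side is shown below. The class and state names are hypothetical, and the completeness check is simplified to counting distinct fragment indexes rather than running the real reconstruction.

```python
class MulticastReceiver:
    """Tracks a multicast session that expires after max_packets messages."""

    def __init__(self, max_packets: int, fragments_needed: int):
        self.max_packets = max_packets
        self.fragments_needed = fragments_needed
        self.packets_seen = 0
        self.fragments = {}
        self.state = "multicast"

    def on_packet(self, index: int, payload: bytes) -> str:
        """Process one multicast packet and return the resulting state."""
        if self.state != "multicast":
            return self.state                 # session already left
        self.packets_seen += 1
        self.fragments.setdefault(index, payload)
        if len(self.fragments) >= self.fragments_needed:
            self.state = "complete"           # back to the private session with the data
        elif self.packets_seen >= self.max_packets:
            self.fragments.clear()            # discard all data
            self.state = "expired"            # back to the private session without it
        return self.state


receiver = MulticastReceiver(max_packets=5, fragments_needed=3)
for _ in range(5):
    state = receiver.on_packet(0, b"junk")    # injected packets that never complete the image
print(state, receiver.fragments)              # expired {}
```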
Sending large binary packets over a lossy network
In the scheme proposed above, there is no communication between the device and the network while the multicast transmission is in progress. Thus, it is not possible to determine which device received which fragments of the firmware update. On a LoRaWAN network, there is no guaranteed quality of service: there might be up to 20% packet loss and even more when the device is moving. To deal with the high packet loss, Nicolas Sornin (Semtech) proposed a fragmentation algorithm that works similarly to the way RAID-6 performs error correction in case a storage disk fails.
In the first step, the network sends the firmware as is, fragmented into packets. Next, the network starts sending error correction packets, which the device XORs with the fragments it has already received. Because the fragments have an increasing frame number, the device knows which fragments are missing and can use the correction packets to reconstruct the missed fragments. The network keeps sending correction packets until all devices confirm that they reconstructed all the fragments of the firmware update, or, in case of extreme packet loss, until the update server has sent all correction packets. With the error correction algorithm, you need up to five correction packets to correct for three missed fragments.
After the device reconstructs the full firmware, the device switches back to its private secure session and operating mode. After the device's checksum and the server's message integrity code have been verified as presented above, the device performs the firmware update.
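To make the idea concrete, here is a deliberately simplified XOR-based scheme. It is not the algorithm proposed by Nicolas Sornin and will not reproduce the efficiency quoted above: each correction packet is assumed to be the XOR of a pseudo-random subset of fragments derived from its packet number, and the device rebuilds a fragment whenever a correction packet covers exactly one fragment it is still missing. All function names are hypothetical.

```python
import random
from typing import Dict, List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_image(image: bytes, size: int) -> List[bytes]:
    """Cut the image into equal-size fragments, zero-padding the last one."""
    padded = image + b"\x00" * (-len(image) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def parity_members(packet_no: int, total: int) -> List[int]:
    """Fragments covered by a correction packet; server and device derive the same list."""
    rng = random.Random(packet_no)
    return sorted(rng.sample(range(total), k=max(1, total // 2)))


def make_parity(fragments: List[bytes], packet_no: int) -> bytes:
    """Server side: XOR of all fragments covered by this correction packet."""
    parity = bytes(len(fragments[0]))
    for i in parity_members(packet_no, len(fragments)):
        parity = xor_bytes(parity, fragments[i])
    return parity


def apply_parities(known: Dict[int, bytes], parities: Dict[int, bytes], total: int) -> None:
    """Device side: rebuild fragments until no correction packet helps any more."""
    progress = True
    while progress and len(known) < total:
        progress = False
        for packet_no, parity in parities.items():
            members = parity_members(packet_no, total)
            missing = [i for i in members if i not in known]
            if len(missing) != 1:
                continue
            value = parity
            for i in members:
                if i in known:
                    value = xor_bytes(value, known[i])   # cancel out known fragments
            known[missing[0]] = value
            progress = True


image = bytes(range(256)) * 4
fragments = split_image(image, 115)
known = {i: f for i, f in enumerate(fragments) if i not in (1, 4, 6)}  # three lost fragments
parities: Dict[int, bytes] = {}
packet_no = 0
while len(known) < len(fragments) and packet_no < 200:
    parities[packet_no] = make_parity(fragments, packet_no)
    apply_parities(known, parities, len(fragments))
    packet_no += 1
rebuilt = b"".join(known[i] for i in range(len(fragments)))[:len(image)]
print(rebuilt == image)
```

Earlier correction packets are kept and retried, because a packet that covered two missing fragments becomes useful as soon as one of them has been rebuilt.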
Cryptographic verification of the firmware
The protocol that we devised only handles raw data integrity of the firmware. It involves timing and message-level security and accounts for packet loss. However, a proper firmware update process also requires additional security on top of the network layer because hijacking the firmware update mechanism is a major attack vector.
To protect against these attacks, our reference implementation contains some additional properties:
- The public key of the X.509 certificate of the owner who is authorized to update firmware on the device.
- A manufacturer UUID (universally unique identifier).
- A device type UUID.
- A device UUID.
The actual firmware update contains, in addition to the update itself, a manifest that consists of the cryptographic hash of the update, the manufacturer and the device type that the update applies to, all signed with the private key that belongs to the manufacturer's X.509 certificate. Whenever the device receives an update, it can verify that a trusted authority signed it and that it was meant for this device, because the device contains the manufacturer's public key.
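The checks described here can be outlined as below. This is a sketch under assumptions, not the manifest format of the reference implementation: the field names and the byte layout that gets signed are made up, SHA-256 is assumed as the hash, and the signature check is passed in as a function because verifying an X.509-backed signature needs a cryptography library outside the Python standard library.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Manifest:
    """Hypothetical manifest layout for illustration only."""
    firmware_sha256: bytes
    manufacturer_uuid: str
    device_type_uuid: str
    signature: bytes

    def signed_payload(self) -> bytes:
        """Bytes covered by the signature (assumed encoding)."""
        return b"|".join([
            self.firmware_sha256.hex().encode(),
            self.manufacturer_uuid.encode(),
            self.device_type_uuid.encode(),
        ])


def should_install(manifest: Manifest, firmware: bytes,
                   manufacturer_uuid: str, device_type_uuid: str,
                   verify_signature: Callable[[bytes, bytes], bool]) -> bool:
    """Accept the update only if signature, hash and target identity all match."""
    # 1. Was the manifest signed by the trusted authority?
    if not verify_signature(manifest.signed_payload(), manifest.signature):
        return False
    # 2. Does the received firmware match the hash in the manifest?
    if not hmac.compare_digest(hashlib.sha256(firmware).digest(), manifest.firmware_sha256):
        return False
    # 3. Is the update meant for this manufacturer and device type?
    return (manifest.manufacturer_uuid == manufacturer_uuid
            and manifest.device_type_uuid == device_type_uuid)
```

Here `verify_signature` would wrap the public key stored on the device. Checking the signature first means an unsigned or forged manifest is rejected before any of its fields are trusted.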
Both the update client running on the device and the bootloader (which runs before the actual firmware runs) contain these checks. They are part of the reference implementation that ARM built to securely update firmware over LoRaWAN and that ARM will release under the Apache 2.0 license in July.
To demonstrate the firmware update process, Andrea Corrado (Graduate Engineer, ARM) created a custom board that contains a Multi-Tech xDot LoRa radio. This board then plugs into a target MCU (NXP FRDM-K22F), which runs the actual update client. This separation allows for quick prototyping because the LoRaWAN stack and the update client run on separate MCUs. The next step is to run the whole stack on a single, self-updatable MCU.
On top of this, an Adafruit NeoPixel Shield is attached, which contains an 8x5 grid of super bright multicolor LEDs. These LEDs are used to show status updates during the demonstration.
Boards straight out of the factory
The development board schematics and bill of materials are available as part of the mbed HDK.
On the network side, an update server was built on top of The Things Network’s distributed and decentralized LoRaWAN network server. The update server orchestrates device selection for the update, sets up network security and the multicast groups, schedules fragments and error correction packets, and verifies hashes and firmware integrity. Currently, this update server is in the application layer. This allows for easy portability to other network technologies. However, as advocates of open standards, we will propose the protocol for inclusion in the LoRaWAN specification. This enables wide adoption of an interoperable update process across device makers and networks in the LoRa Alliance. The Things Industries will release the update server under the MIT license.
Firmware updates are a critical requirement before devices that use LPWANs for connectivity hit the market in volume. Device makers can now ship products while assuring their customers of security updates, new functionality, optimizations and specialization throughout their lifetime.
Standardizing multicast support and fragmentation of large payloads within the LoRaWAN specification adds the ability to send a large payload to many devices in a reliable way without congesting the network too much. In addition, because the reference design also contains cryptographic verification of firmware on top of this specification, the solution is secure, and you can actually deploy it in the field. A demonstration of this work as a full end-to-end live solution running on The Things Network using multiple devices running the commercially available ARM mbed OS 5 will occur during the LoRa Alliance All Members Meeting and Open House in Philadelphia, June 12-14, 2017.
Jan Jongboom is Principal Applications Engineer at ARM®, working on ARM® mbed™. ARM® mbed™ provides a secure, scalable platform for enterprise IoT, including the operating system, cloud services, tools and developer ecosystem necessary for the creation and deployment of commercial, standards-based IoT solutions at scale. By offering all the vital tools needed to develop these IoT solutions, combined with the support of more than 70 partners and a community of 250,000+ developers, mbed is uniquely positioned as the leading ecosystem for IoT device creation, deployment and management. With a focus on low-energy devices, connectivity and security, mbed provides a unique value proposition in the IoT market. ARM mbed Cloud easily enables organisations to securely connect to, update and provision devices agnostic to the underlying operating system.
Johan Stokking is CTO and Co-Founder of The Things Industries and The Things Network. The Things Network is a global community of more than 18,000 members from over 87 countries around the world, bringing together startups, developers, businesses, universities and governments in building an open, decentralized and crowdsourced Internet of Things data network. The Things Network Foundation hosts a free-to-use public community network and community spaces for anyone to connect and share solutions. Commercial services through The Things Industries provide the secure last mile for mission-critical deployments of Internet of Things business applications.
For more information, contact Jan Jongboom at firstname.lastname@example.org, or Johan Stokking at email@example.com.
Acknowledgements: We'd like to thank Nicolas Sornin (Semtech) for his work on fragmentation and multicast and for helping us standardize our ideas within the LoRa Alliance; we have been exploring the same field of thought, and his work has been a tremendous help. We also want to thank Multi-Tech Systems for hosting the first public demonstration setup in Philadelphia. Furthermore, we’d like to thank Van Zuylen (Amsterdam) for providing us with a valuable place to brainstorm during the last months; it was the place where we envisioned the original idea and where we celebrated our first successful demo.