Embedded Firmware Best Practices
Essential guidelines for developing robust and reliable embedded firmware. Topics are grouped by category and tagged with a priority; each one comes with an explanation, field-tested tips, and, where useful, a code example.
Code Quality
Foundational habits that keep embedded codebases readable, reviewable, and safe to change.
Use Version Control
Always use Git or similar version control systems to track changes and collaborate effectively.
Critical
Why it matters
Version control is essential for embedded development. It allows you to track every change, revert to working versions when bugs are introduced, and collaborate with team members. Git branches enable parallel development of features without affecting the main codebase.
Tips
- Create meaningful commit messages that describe the 'why' not just the 'what'
- Use feature branches for new development
- Tag releases for easy reference to production firmware versions
- Include hardware revision information in your commit history
Follow Coding Standards
Adopt industry standards like MISRA C for embedded systems to ensure code safety and reliability.
Critical
Why it matters
MISRA C is a set of software development guidelines designed to promote safety, security, and reliability in embedded systems. Following these guidelines helps prevent common programming errors, makes code more maintainable, and is often required for safety-critical applications.
Code example
/* MISRA C compliant example */
static uint32_t calculate_checksum(const uint8_t *data, size_t len) {
uint32_t sum = 0U;
if (data != NULL) {
for (size_t i = 0U; i < len; i++) {
sum += (uint32_t)data[i];
}
}
return sum;
}
Tips
- Use static analysis tools like PC-lint or Polyspace
- Enable compiler warnings and treat them as errors
- Document any intentional deviations from standards
Write Modular Code
Break down functionality into reusable modules with clear interfaces and single responsibilities.
High Priority
Why it matters
Modular code separates concerns into distinct units, each handling a specific functionality. This approach improves testability, allows code reuse across projects, and makes maintenance easier. In embedded systems, modules often correspond to hardware peripherals or application features.
Code example
/* Module header: sensor_interface.h */
typedef struct {
int32_t temperature;
uint32_t humidity;
uint32_t timestamp;
} sensor_data_t;
int sensor_init(void);
int sensor_read(sensor_data_t *data);
void sensor_deinit(void);
Tips
- One module = one responsibility
- Define clear public APIs in header files
- Hide implementation details as static functions
- Use opaque pointers for complex data structures
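The opaque-pointer tip can be sketched as follows. This is a minimal illustration, not a prescribed layout: the counter module, its function names, and the two-instance static pool are all hypothetical, and the header and source parts are shown in one listing only to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* --- Public interface (would live in counter.h) --- */
typedef struct counter counter_t; /* opaque: callers never see the fields */

counter_t *counter_create(uint32_t limit);
int counter_increment(counter_t *c);
uint32_t counter_value(const counter_t *c);

/* --- Private implementation (would live in counter.c) --- */
struct counter {
    uint32_t value;
    uint32_t limit;
    uint8_t in_use;
};

#define COUNTER_MAX_INSTANCES 2U
static struct counter counter_pool[COUNTER_MAX_INSTANCES]; /* static storage, no heap */

counter_t *counter_create(uint32_t limit)
{
    for (size_t i = 0U; i < COUNTER_MAX_INSTANCES; i++) {
        if (counter_pool[i].in_use == 0U) {
            counter_pool[i].in_use = 1U;
            counter_pool[i].value = 0U;
            counter_pool[i].limit = limit;
            return &counter_pool[i];
        }
    }
    return NULL; /* every instance is already handed out */
}

int counter_increment(counter_t *c)
{
    if ((c == NULL) || (c->value >= c->limit)) {
        return -1; /* invalid handle or limit reached */
    }
    c->value++;
    return 0;
}

uint32_t counter_value(const counter_t *c)
{
    return (c != NULL) ? c->value : 0U;
}
```

Because callers only hold a counter_t pointer, the struct fields can change without touching any code outside the module.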
Document Your Code
Use clear comments and documentation to explain complex logic, hardware interactions, and API usage.
High Priority
Why it matters
Good documentation is crucial for embedded systems where code often interacts with hardware in non-obvious ways. Comments should explain the 'why' behind decisions, document hardware quirks, and describe timing requirements. Use Doxygen-style comments for API documentation.
Code example
/**
* @brief Initialize the ADC peripheral for temperature sensing
*
* Configures ADC channel 3 for single-ended input with 12-bit resolution.
* Must be called before any sensor_read() operations.
*
* @note Requires VREF to be stable before calling
* @return 0 on success, negative error code on failure
*/
int sensor_init(void);
Tips
- Document hardware dependencies and timing requirements
- Explain magic numbers with named constants or comments
- Keep comments up-to-date when code changes
- Use README files for module-level documentation
Hardware Interaction
Patterns for talking to peripherals, handling interrupts, and keeping hardware code portable.
Use Hardware Abstraction Layers
Create HALs to separate hardware-specific code from application logic for better portability.
Critical
Why it matters
A Hardware Abstraction Layer (HAL) provides a consistent interface to hardware peripherals, hiding the low-level register manipulations. This allows application code to be ported between different MCUs with minimal changes and enables testing on host systems using mock implementations.
Code example
/* HAL interface */
typedef struct {
int (*init)(const gpio_config_t *config);
int (*write)(uint32_t pin, bool state);
bool (*read)(uint32_t pin);
} gpio_driver_t;
/* Platform-specific implementation */
static const gpio_driver_t nrf_gpio_driver = {
.init = nrf_gpio_init,
.write = nrf_gpio_write,
.read = nrf_gpio_read,
};
Tips
- Define abstract interfaces before implementing platform-specific code
- Use function pointers or weak symbols for swappable implementations
- Create mock HAL implementations for unit testing
- Document hardware assumptions in the HAL interface
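A mock HAL for host-side unit tests might look like the sketch below. It uses a reduced version of the gpio_driver_t interface shown above (without init); the mock_* names, the 32-pin limit, and led_toggle are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced HAL interface: application code sees only these function pointers */
typedef struct {
    int (*write)(uint32_t pin, bool state);
    bool (*read)(uint32_t pin);
} gpio_driver_t;

/* Mock implementation: pin states live in RAM instead of hardware registers */
#define MOCK_PIN_COUNT 32U
static bool mock_pins[MOCK_PIN_COUNT];

static int mock_gpio_write(uint32_t pin, bool state)
{
    if (pin >= MOCK_PIN_COUNT) {
        return -1; /* pin does not exist */
    }
    mock_pins[pin] = state;
    return 0;
}

static bool mock_gpio_read(uint32_t pin)
{
    return (pin < MOCK_PIN_COUNT) ? mock_pins[pin] : false;
}

static const gpio_driver_t mock_gpio_driver = {
    .write = mock_gpio_write,
    .read = mock_gpio_read,
};

/* Application logic under test: depends on the interface, not on a platform */
int led_toggle(const gpio_driver_t *gpio, uint32_t pin)
{
    return gpio->write(pin, !gpio->read(pin));
}
```

The same led_toggle would run unchanged on target when it is handed the platform driver instead of the mock.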
Implement Proper Initialization
Always initialize peripherals and variables before use to avoid undefined behavior.
Critical
Why it matters
Embedded systems often have complex initialization sequences that must follow specific orders. Peripherals may depend on clocks, power domains, or other peripherals being initialized first. Document and enforce these dependencies to prevent hard-to-debug issues.
Code example
int system_init(void) {
int ret;
/* Clock must be initialized first */
ret = clock_init();
if (ret != 0) {
return ret;
}
/* Power domain depends on clock */
ret = power_init();
if (ret != 0) {
return ret;
}
/* Peripherals depend on power */
ret = gpio_init();
if (ret != 0) {
return ret;
}
return 0;
}
Tips
- Initialize all variables at declaration
- Document peripheral initialization order requirements
- Use initialization flags to prevent double-init issues
- Check return values from all initialization functions
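One way to combine initialization flags with dependency checks is sketched below. The uart_init function and the uart_hw_init_count counter are hypothetical; the counter exists only so the effect of the guard can be observed.

```c
#include <stdbool.h>

static bool clock_ready;
static bool uart_ready;
static unsigned int uart_hw_init_count; /* illustration only: counts real setups */

int clock_init(void)
{
    if (clock_ready) {
        return 0; /* already done: repeated calls are harmless */
    }
    /* ...configure oscillators and PLLs here... */
    clock_ready = true;
    return 0;
}

int uart_init(void)
{
    if (uart_ready) {
        return 0; /* guard against double initialization */
    }
    if (!clock_ready) {
        return -1; /* dependency not met: the clock must be up first */
    }
    /* ...program baud rate and enable the peripheral here... */
    uart_hw_init_count++;
    uart_ready = true;
    return 0;
}
```

Calling uart_init before clock_init fails loudly instead of producing a misconfigured peripheral, and calling it twice touches the hardware only once.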
Handle Interrupts Carefully
Keep ISRs short and fast. Use flags to defer processing to the main loop when possible.
High Priority
Why it matters
Interrupt Service Routines (ISRs) should execute as quickly as possible to minimize latency for other interrupts. Use ISRs only to capture time-critical data and set flags, then perform complex processing in the main loop or a task. Be aware of shared data issues between ISRs and main code.
Code example
volatile bool data_ready = false;
volatile uint16_t adc_value;
void ADC_IRQHandler(void) {
/* Clear interrupt flag first */
ADC->ISR = ADC_ISR_EOC;
/* Quick data capture */
adc_value = ADC->DR;
/* Signal main loop */
data_ready = true;
}
/* Main loop processing */
void main_loop(void) {
if (data_ready) {
data_ready = false;
process_adc_data(adc_value);
}
}
Tips
- Use volatile for variables shared with ISRs
- Disable interrupts when accessing shared multi-byte data
- Avoid floating-point math in ISRs
- Use RTOS semaphores or message queues for complex ISR-to-task communication
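When a single flag is not enough because several samples can arrive between main-loop passes, a single-producer, single-consumer ring buffer is a common alternative to disabling interrupts. The sketch below assumes a single-core MCU where the ISR is the only writer of head and the main loop is the only writer of tail; multi-core parts would need atomics or memory barriers. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 8U /* must be a power of two */

typedef struct {
    volatile uint8_t head;          /* written only by the ISR (producer) */
    volatile uint8_t tail;          /* written only by the main loop (consumer) */
    volatile uint16_t buf[RB_SIZE];
} ring_t;

/* Called from the ISR. Returns false if the buffer is full (sample dropped). */
bool ring_put(ring_t *rb, uint16_t sample)
{
    uint8_t head = rb->head;

    if ((uint8_t)(head - rb->tail) == RB_SIZE) {
        return false;
    }
    rb->buf[head & (RB_SIZE - 1U)] = sample;
    rb->head = (uint8_t)(head + 1U); /* publish only after the data is stored */
    return true;
}

/* Called from the main loop. Returns false if no sample is waiting. */
bool ring_get(ring_t *rb, uint16_t *out)
{
    uint8_t tail = rb->tail;

    if (tail == rb->head) {
        return false;
    }
    *out = rb->buf[tail & (RB_SIZE - 1U)];
    rb->tail = (uint8_t)(tail + 1U); /* release the slot after reading it */
    return true;
}
```

Each index has exactly one writer and is a single byte, so neither side ever observes a half-updated value.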
Manage Power Consumption
Implement sleep modes and optimize peripheral usage to extend battery life in portable devices.
Medium Priority
Why it matters
Power management is critical for battery-powered devices. Modern MCUs offer multiple sleep modes with different power consumption and wake-up latencies. Disable unused peripherals, use event-driven design, and measure actual power consumption during development.
Tips
- Profile power consumption early in development
- Use the deepest sleep mode possible for your latency requirements
- Disable unused peripheral clocks
- Consider using DMA to allow CPU sleep during data transfers
Nordic & Zephyr RTOS
Idiomatic Zephyr / nRF Connect SDK patterns: Device Tree, Kconfig, BLE, and work queues.
Use Zephyr Device Tree
Leverage Device Tree overlays to configure hardware without modifying source code.
Critical
Why it matters
Zephyr's Device Tree system provides a hardware-agnostic way to describe your board's configuration. Device Tree overlays allow you to customize pin assignments, peripheral settings, and sensor configurations without changing C code, making your firmware more portable across Nordic development kits.
Code example
/* nrf52840dk.overlay */
&i2c0 {
status = "okay";
compatible = "nordic,nrf-twim";
bme280@76 {
compatible = "bosch,bme280";
reg = <0x76>;
label = "BME280";
};
};
&spi1 {
status = "okay";
cs-gpios = <&gpio0 17 GPIO_ACTIVE_LOW>;
};
Tips
- Create board-specific overlays for custom hardware
- Use Device Tree bindings documentation as reference
- Test overlays with 'west build --pristine' for clean builds
- Document overlay changes in your project README
Configure prj.conf Properly
Use Kconfig options in prj.conf to enable only required features and optimize resource usage.
Critical
Why it matters
The prj.conf file controls which Zephyr subsystems and drivers are included in your build. Enabling only what you need reduces flash and RAM usage. Understanding Kconfig dependencies helps avoid mysterious build errors.
Code example
# Core configuration
CONFIG_GPIO=y
CONFIG_I2C=y
CONFIG_SPI=y
# BLE configuration for nRF52
CONFIG_BT=y
CONFIG_BT_PERIPHERAL=y
CONFIG_BT_DEVICE_NAME="MyDevice"
# Power management
CONFIG_PM=y
CONFIG_PM_DEVICE=y
# Logging (disable in production)
CONFIG_LOG=y
CONFIG_LOG_DEFAULT_LEVEL=3
Tips
- Use 'west build -t menuconfig' to explore Kconfig options
- Create separate prj.conf files for debug and release builds
- Document why each config option is enabled
- Check CONFIG dependencies with 'west build -t guiconfig'
Leverage Nordic SDK Libraries
Use nRF Connect SDK libraries for BLE, Thread, Matter, and other protocol stacks.
High Priority
Why it matters
The nRF Connect SDK provides production-ready implementations of BLE services, mesh networking, and IoT protocols. Using these libraries saves development time and ensures compliance with protocol specifications. They're tested and optimized for Nordic hardware.
Code example
/* Using Nordic BLE libraries */
/* Current Zephyr releases place public headers under the zephyr/ prefix */
#include <zephyr/bluetooth/bluetooth.h>
#include <zephyr/bluetooth/conn.h>
#include <zephyr/bluetooth/gatt.h>
static struct bt_conn_cb conn_callbacks = {
    .connected = on_connected,
    .disconnected = on_disconnected,
};
int bluetooth_init(void) {
    int err = bt_enable(NULL);
    if (err) {
        return err;
    }
    bt_conn_cb_register(&conn_callbacks);
    return 0;
}
Tips
- Check nRF Connect SDK samples for implementation patterns
- Use Nordic DevZone forums for troubleshooting
- Keep SDK version consistent across your team
- Test with Nordic Power Profiler for power optimization
Use Zephyr Workqueues
Offload non-critical work from ISRs and high-priority threads using work queues.
High Priority
Why it matters
Zephyr work queues provide a mechanism to defer work to a lower-priority thread context. This keeps ISRs fast and keeps long-running processing out of high-priority contexts. The system work queue is suitable for most cases, but create dedicated work queues for time-sensitive or blocking operations so that they cannot stall other work items.
Code example
static struct k_work sensor_work;
static void sensor_work_handler(struct k_work *work) {
/* Heavy processing here */
struct sensor_data data;
sensor_read(&data);
process_and_transmit(&data);
}
void sensor_irq_handler(void) {
/* Just submit work, don't process here */
k_work_submit(&sensor_work);
}
int main(void) {
k_work_init(&sensor_work, sensor_work_handler);
/* ... */
}
Tips
- Use k_work_delayable for periodic or debounced operations
- Create dedicated work queues for blocking I/O
- Monitor work queue depth to detect overload
- Use separate work queues when independent jobs must not block one another
Power Management
Stretch battery life with disciplined sleep modes, radio scheduling, and per-peripheral power control.
Profile Power Early
Measure actual power consumption during development, not just at the end.
Critical
Why it matters
Power consumption issues are much harder to fix late in development. Use tools like Nordic Power Profiler Kit early and often to understand your device's power profile. Correlate power spikes with code execution to identify optimization opportunities.
Tips
- Establish a power budget before starting development
- Measure each peripheral's contribution to total power
- Test power in all operating modes (active, idle, sleep)
- Document power measurements for each firmware version
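A first-order power budget can be estimated with simple duty-cycle arithmetic before any hardware measurement. The helpers below are an illustrative sketch; the function names are hypothetical, and the model ignores battery self-discharge, temperature derating, and regulator losses.

```c
#include <stdint.h>

/* Average current in microamps for a periodic active/sleep cycle */
uint32_t average_current_ua(uint32_t active_ua, uint32_t active_ms,
                            uint32_t sleep_ua, uint32_t period_ms)
{
    if ((period_ms == 0U) || (active_ms > period_ms)) {
        return 0U; /* invalid duty cycle */
    }
    /* Charge per period in microamp-milliseconds; 64-bit avoids overflow */
    uint64_t charge = ((uint64_t)active_ua * active_ms) +
                      ((uint64_t)sleep_ua * (period_ms - active_ms));
    return (uint32_t)(charge / period_ms);
}

/* Estimated battery life in hours for a given capacity and average current */
uint32_t battery_life_hours(uint32_t capacity_mah, uint32_t avg_ua)
{
    if (avg_ua == 0U) {
        return 0U;
    }
    return (uint32_t)(((uint64_t)capacity_mah * 1000U) / avg_ua);
}
```

For example, 5000 µA for 10 ms out of every second with 2 µA in sleep averages about 51 µA, which a 220 mAh cell could sustain for roughly 4300 hours under these simplified assumptions.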
Implement Sleep Modes
Use the deepest sleep mode compatible with your wake-up latency requirements.
Critical
Why it matters
Modern MCUs offer multiple sleep modes trading off power savings against wake-up time. System ON sleep on nRF52 uses ~1.5µA while System OFF uses ~0.3µA but requires full reboot. Choose based on your application's responsiveness requirements.
Code example
/* Zephyr power management */
/* Current Zephyr releases place public headers under the zephyr/ prefix */
#include <zephyr/pm/pm.h>
#include <zephyr/pm/device.h>
void enter_low_power(void) {
    /* Suspend peripherals that are not needed while asleep */
    pm_device_action_run(uart_dev, PM_DEVICE_ACTION_SUSPEND);
    /* Enter low power mode - Zephyr handles this automatically
       when idle if CONFIG_PM=y */
}
/* Wake sources: GPIO, timer, or BLE events */
Tips
- Configure proper wake sources before entering deep sleep
- Retain RAM contents if faster wake-up is needed
- Use RTC for periodic wake-ups instead of busy-waiting
- Test wake-up latency meets your requirements
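Choosing the deepest mode that still meets a latency requirement can be expressed as a table lookup. The sketch below is illustrative only: the mode names and all current and latency figures are placeholders, not datasheet values, and should be replaced with measurements for the actual part.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t current_na;      /* typical sleep current in nanoamps (placeholder) */
    uint32_t wake_latency_us; /* time to resume execution (placeholder) */
} sleep_mode_t;

static const sleep_mode_t sleep_modes[] = {
    { "idle",    500000U,     5U },
    { "standby",   1500U,   100U },
    { "off",        300U, 50000U },
};

/* Returns the lowest-current mode whose wake-up latency fits, or NULL if none does */
const sleep_mode_t *select_sleep_mode(uint32_t max_latency_us)
{
    const sleep_mode_t *best = NULL;

    for (size_t i = 0U; i < (sizeof(sleep_modes) / sizeof(sleep_modes[0])); i++) {
        const sleep_mode_t *mode = &sleep_modes[i];

        if (mode->wake_latency_us > max_latency_us) {
            continue; /* too slow to wake for this requirement */
        }
        if ((best == NULL) || (mode->current_na < best->current_na)) {
            best = mode;
        }
    }
    return best;
}
```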
Optimize Radio Usage
Minimize radio-on time for BLE, WiFi, and LTE to dramatically reduce power consumption.
High Priority
Why it matters
Radio activity is typically the largest power consumer in wireless devices. Reduce it by lengthening advertising intervals, requesting connection parameter updates, batching data transmissions, and using BLE peripheral latency, which lets the peripheral skip connection events when it has nothing to send.
Code example
/* Optimized BLE connection parameters */
static struct bt_le_conn_param conn_params = {
.interval_min = 80, /* 100ms - balance latency vs power */
.interval_max = 160, /* 200ms */
.latency = 4, /* Skip up to 4 intervals */
.timeout = 400, /* 4 seconds supervision timeout */
};
/* Request parameter update after connection */
bt_conn_le_param_update(conn, &conn_params);
Tips
- Increase advertising interval when not actively seeking connections
- Use longer connection intervals for low-bandwidth applications
- Batch sensor data and transmit in bursts
- Consider BLE Coded PHY when you need longer range, but expect longer airtime and higher energy per byte; prefer the 2M PHY when range allows
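The unit conversions behind connection parameters are easy to get wrong: intervals are expressed in 1.25 ms units and the supervision timeout in 10 ms units, and the timeout must exceed (1 + latency) × maximum interval × 2. The helpers below sketch those rules in plain C; the function names are hypothetical and are not part of the Zephyr API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Milliseconds to 1.25 ms connection-interval units (rounds down) */
uint16_t ble_interval_units(uint32_t ms)
{
    return (uint16_t)((ms * 4U) / 5U);
}

/* Milliseconds to 10 ms supervision-timeout units (rounds down) */
uint16_t ble_timeout_units(uint32_t ms)
{
    return (uint16_t)(ms / 10U);
}

/* All arguments are in BLE units, as they would be passed to the stack */
bool ble_conn_params_valid(uint16_t interval_min, uint16_t interval_max,
                           uint16_t latency, uint16_t timeout)
{
    if ((interval_min < 6U) || (interval_max > 3200U) ||
        (interval_min > interval_max)) {
        return false; /* interval range is 7.5 ms to 4 s */
    }
    if (latency > 499U) {
        return false;
    }
    if ((timeout < 10U) || (timeout > 3200U)) {
        return false; /* timeout range is 100 ms to 32 s */
    }
    /* timeout * 10 ms > (1 + latency) * interval_max * 1.25 ms * 2,
       which simplifies to timeout * 4 > (1 + latency) * interval_max */
    return ((uint32_t)timeout * 4U) > (((uint32_t)latency + 1U) * interval_max);
}
```

With the values from the example above (80, 160, 4, 400) the check passes, while halving the timeout to 200 units would violate the supervision-timeout rule.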
Manage Peripheral Power
Disable unused peripherals and use low-power alternatives when possible.
High Priority
Why it matters
Even idle peripherals consume power. Disable peripheral clocks when not in use, use GPIO interrupts instead of polling, and choose low-power peripheral modes. On Nordic devices, use the PPI system to connect peripherals without CPU intervention.
Tips
- Use Zephyr's PM_DEVICE API to suspend/resume peripherals
- Configure unused pins as disconnected inputs
- Use timer callbacks instead of busy-wait delays
- Leverage hardware PWM instead of software bit-banging
OTA Updates
Ship updates safely: dual-bank A/B partitioning, signed images, resumable downloads, and small deltas.
Implement Dual-Bank Updates
Use A/B partitioning to ensure safe firmware updates with automatic rollback capability.
Critical
Why it matters
Dual-bank (A/B) updates write new firmware to an inactive partition while the device runs from the active one. After verification, the bootloader switches to the new image. If the update fails or the new firmware is faulty, automatic rollback to the previous version ensures the device remains operational.
Code example
/* MCUboot partition layout in DTS */
/ {
chosen {
zephyr,code-partition = &slot0_partition;
};
};
&flash0 {
partitions {
boot_partition: partition@0 { /* MCUboot */ };
slot0_partition: partition@10000 { /* Active */ };
slot1_partition: partition@80000 { /* Staging */ };
scratch_partition: partition@f0000 { /* Swap */ };
};
};
Tips
- Use MCUboot for production-ready secure boot and updates
- Test rollback scenarios thoroughly
- Include self-test code that confirms boot within timeout
- Plan flash layout early - changing partitions later is difficult
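The trial-boot and rollback idea can be modeled as a small state machine. The sketch below is a simplified illustration of the concept, not MCUboot's actual image-trailer logic; boot_state_t, the function names, and the three-attempt limit are all assumptions. In a real system the state would be stored in flash so that it survives resets.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BOOT_ATTEMPTS 3U

typedef struct {
    uint8_t active_slot;   /* 0 or 1: slot the bootloader will run */
    bool pending_confirm;  /* true while a new image is on trial */
    uint8_t boot_attempts; /* trial boots used so far */
} boot_state_t;

/* Updater: a new image was written and verified in the inactive slot */
void boot_request_trial(boot_state_t *st)
{
    st->active_slot ^= 1U;
    st->pending_confirm = true;
    st->boot_attempts = 0U;
}

/* Application: self-test passed, keep the new image */
void boot_confirm(boot_state_t *st)
{
    st->pending_confirm = false;
    st->boot_attempts = 0U;
}

/* Bootloader: runs on every reset and returns the slot to boot */
uint8_t boot_select_slot(boot_state_t *st)
{
    if (st->pending_confirm) {
        if (st->boot_attempts >= MAX_BOOT_ATTEMPTS) {
            /* The trial image never confirmed itself: roll back */
            st->active_slot ^= 1U;
            st->pending_confirm = false;
            st->boot_attempts = 0U;
        } else {
            st->boot_attempts++;
        }
    }
    return st->active_slot;
}
```

If the new firmware crashes or hangs before calling boot_confirm, repeated resets (for example from the watchdog) exhaust the attempts and the previous slot is selected again.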
Sign and Verify Images
Cryptographically sign firmware images to prevent unauthorized code execution.
Critical
Why it matters
Firmware signing ensures only authorized code runs on your device. MCUboot supports RSA, ECDSA, and ED25519 signatures. The bootloader verifies the signature before accepting an update, protecting against both malicious and corrupted firmware.
Code example
# Signing with MCUboot's imgtool
west sign -t imgtool -- \
--key root-ec-p256.pem \
--version 1.2.0 \
--header-size 0x200 \
--slot-size 0x70000
# Verification happens automatically at boot
# MCUboot checks signature before jumping to app
Tips
- Store signing keys securely - never commit to source control
- Use hardware security modules (HSM) for production signing
- Implement key revocation strategy for compromised keys
- Version your firmware and track which devices have which version
Handle Update Failures
Implement robust error handling for network failures, power loss, and corrupted downloads.
High Priority
Why it matters
OTA updates can fail at any point due to network issues, power loss, or flash errors. Implement resumable downloads, verify image integrity before applying, and ensure the bootloader can always recover to a known-good state.
Code example
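/* image_size, chunk_buf, slot1_addr, and the ota_, http_, and flash_ helpers
   are assumed to be provided elsewhere in the project */
int ota_download_image(const char *url) {
    size_t offset = ota_get_download_progress();
    while (offset < image_size) {
        /* Returns the number of bytes received, or a negative error code */
        int len = http_download_chunk(url, offset, chunk_buf);
        if (len <= 0) {
            /* Save progress and retry later */
            ota_save_progress(offset);
            return (len < 0) ? len : -EIO;
        }
        /* Advance by the chunk length, not by the flash write status,
           which is typically 0 on success */
        int ret = flash_write(slot1_addr + offset, chunk_buf, (size_t)len);
        if (ret < 0) {
            return ret;
        }
        offset += (size_t)len;
        ota_save_progress(offset);
    }
    return ota_verify_image();
}
Tips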
- Implement chunk-based downloads with progress persistence
- Verify complete image hash before confirming update
- Use watchdog to detect stuck boot loops
- Test update process with simulated failures
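Integrity checking of a chunked download can be done incrementally, so the whole image never has to sit in RAM. The sketch below uses the standard reflected CRC-32 (polynomial 0xEDB88320) purely as an illustration; a CRC only detects accidental corruption, so production updates would normally rely on a cryptographic hash and signature as well. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Feed chunks in download order. Start with crc = 0 and pass the previous
   return value back in for each following chunk. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    crc = ~crc;
    for (size_t i = 0U; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial when the dropped bit was 1 */
            crc = (crc >> 1) ^ (0xEDB88320U & (0U - (crc & 1U)));
        }
    }
    return ~crc;
}
```

The running value can be persisted alongside the download offset, so a resumed download continues the check instead of re-reading everything already written to flash.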
Minimize Update Size
Use delta updates or compression to reduce bandwidth and update time.
Medium Priority
Why it matters
Full image updates can be hundreds of kilobytes. Delta updates transmit only changed bytes, reducing download time and cellular/power costs. Zephyr and MCUboot support LZMA compression, and tools like Memfault provide delta update infrastructure.
Tips
- Enable image compression in MCUboot configuration
- Consider delta update solutions for large codebases
- Track binary size changes in your CI pipeline
- Optimize code and remove debug symbols for production
Memory Optimization
Stay inside flash and RAM budgets with stack analysis, memory pools, packed structs, and link-time optimization.
Monitor Stack Usage
Track thread stack usage to prevent overflows and optimize memory allocation.
Critical
Why it matters
Stack overflows are a common cause of embedded system crashes and can be difficult to debug. Zephyr provides stack usage analysis tools. Size stacks appropriately - too small causes crashes, too large wastes precious RAM.
Code example
# Enable stack analysis in prj.conf
CONFIG_THREAD_ANALYZER=y
CONFIG_THREAD_ANALYZER_USE_PRINTK=y
CONFIG_THREAD_ANALYZER_AUTO=y
CONFIG_THREAD_ANALYZER_AUTO_INTERVAL=5
# Zephyr will then periodically print stack usage, for example:
#   Thread: main
#   Stack size: 2048
#   Stack used: 1456
#   Stack unused: 592
Tips
- Add safety margin (20-30%) to measured stack usage
- Enable stack canaries during development
- Avoid large local arrays - use static or heap allocation
- Profile stack usage under worst-case conditions
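On platforms without a built-in analyzer, stack usage is often measured by "painting" the stack with a known pattern and later counting how much of it was overwritten. The sketch below demonstrates the technique on an ordinary buffer; the function names and the 0xA5 fill byte are assumptions, and the scan direction assumes a descending stack that grows from the end of the buffer toward index 0.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5U

/* Fill the whole stack region with the pattern before the thread starts */
void stack_paint(uint8_t *stack, size_t size)
{
    for (size_t i = 0U; i < size; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Returns the peak number of bytes used so far (the high-water mark) */
size_t stack_high_water(const uint8_t *stack, size_t size)
{
    size_t untouched = 0U;

    /* Count pattern bytes from the far end of a descending stack */
    while ((untouched < size) && (stack[untouched] == STACK_FILL)) {
        untouched++;
    }
    return size - untouched;
}
```

The result is a lower bound: a stack frame that happens to leave fill-valued bytes in place can hide some usage, which is one reason to add a safety margin.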
Use Memory Pools
Prefer fixed-size memory pools over dynamic allocation for predictable behavior.
Critical
Why it matters
Dynamic memory allocation (malloc/free) can lead to fragmentation and non-deterministic timing. Memory pools allocate fixed-size blocks, eliminating fragmentation and providing O(1) allocation time. Zephyr provides k_mem_slab for this purpose; the older k_mem_pool API has been removed, and its replacement, k_heap, is a variable-size allocator rather than a fixed-block pool.
Code example
/* Define a memory slab for sensor readings */
K_MEM_SLAB_DEFINE(sensor_slab,
sizeof(struct sensor_reading),
16, /* 16 blocks */
4); /* 4-byte alignment */
void *alloc_reading(void) {
void *ptr;
if (k_mem_slab_alloc(&sensor_slab, &ptr, K_NO_WAIT) == 0) {
return ptr;
}
return NULL;
}
void free_reading(void *ptr) {
k_mem_slab_free(&sensor_slab, ptr);
}
Tips
- Size pools based on maximum concurrent allocations
- Use different pools for different object types
- Monitor pool utilization in debug builds
- Consider static allocation if pool size is always known
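Outside Zephyr, the same idea takes only a few lines of portable C. The sketch below keeps free blocks on an intrusive singly linked list so that both allocation and release are O(1); the names and sizes are hypothetical, and the code is not interrupt- or thread-safe without additional locking.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  32U
#define POOL_BLOCK_COUNT 4U

/* A free block stores the link to the next free block inside itself */
typedef union pool_block {
    union pool_block *next;
    uint8_t payload[POOL_BLOCK_SIZE];
} pool_block_t;

static pool_block_t pool_storage[POOL_BLOCK_COUNT];
static pool_block_t *pool_free_list;

/* Thread every block onto the free list */
void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0U; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Pop the first free block, or return NULL when the pool is exhausted */
void *pool_alloc(void)
{
    pool_block_t *blk = pool_free_list;

    if (blk != NULL) {
        pool_free_list = blk->next;
    }
    return blk;
}

/* Push a block back; ptr must have come from pool_alloc() */
void pool_free(void *ptr)
{
    if (ptr != NULL) {
        pool_block_t *blk = ptr;

        blk->next = pool_free_list;
        pool_free_list = blk;
    }
}
```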
Optimize Data Structures
Pack structures, use appropriate types, and minimize memory fragmentation.
High Priority
Why it matters
Compiler padding can waste significant memory in structures. Use __attribute__((packed)) carefully (it may impact performance), order struct members by size, and choose the smallest integer type that fits your data range.
Code example
/* Unoptimized: 12 bytes due to padding */
struct sensor_bad {
uint8_t type; /* 1 byte + 3 padding */
uint32_t value; /* 4 bytes */
uint8_t status; /* 1 byte + 3 padding */
};
/* Optimized: 8 bytes, no padding */
struct sensor_good {
uint32_t value; /* 4 bytes */
uint8_t type; /* 1 byte */
uint8_t status; /* 1 byte */
uint8_t reserved[2]; /* Explicit padding */
};
Tips
- Order struct members from largest to smallest
- Use sizeof() to verify expected structure sizes
- Consider bit-fields for boolean flags
- Store enum values in explicitly sized fields such as uint8_t (a fixed underlying enum type needs C23 or a compiler extension)
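The sizeof() tip can be enforced at compile time so that a layout regression breaks the build instead of corrupting data in the field. The sketch below assumes a C11 compiler and uses a hypothetical struct sensor_record with the same layout as the optimized struct above.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */
#include <stdint.h>

struct sensor_record {
    uint32_t value;      /* 4 bytes */
    uint8_t type;        /* 1 byte */
    uint8_t status;      /* 1 byte */
    uint8_t reserved[2]; /* explicit padding */
};

/* Compilation fails if the layout ever drifts from what is expected */
static_assert(sizeof(struct sensor_record) == 8U,
              "sensor_record must stay 8 bytes");
static_assert(offsetof(struct sensor_record, type) == 4U,
              "type must follow the 32-bit value");
static_assert(offsetof(struct sensor_record, reserved) == 6U,
              "reserved bytes must close the struct");
```

This is especially useful for structs that are written to flash or sent over the air, where a silent size change breaks compatibility with deployed devices.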
Reduce Flash Usage
Minimize code size through compiler optimization, dead code elimination, and link-time optimization.
High Priority
Why it matters
Flash memory is limited on MCUs. Use compiler flags for size optimization, remove unused code and libraries, and consider link-time optimization (LTO). Zephyr's Kconfig system helps by only including enabled features.
Code example
# CMakeLists.txt optimizations
target_compile_options(app PRIVATE
    -Os                  # Optimize for size
    -ffunction-sections
    -fdata-sections
)
target_link_options(app PRIVATE
    -Wl,--gc-sections    # Remove unused sections
)
# prj.conf - disable unused features
# (in .conf files, comments must be on their own line)
# If not needed
CONFIG_PRINTK=n
# For production
CONFIG_LOG=n
CONFIG_ASSERT=n
Tips
- Use 'west build -t rom_report' to analyze flash usage
- Remove debug features in production builds
- Consider storing large const data in external flash
- Use LTO for additional code size reduction
Safety & Security
Watchdogs, input validation, secure communication, and graceful error handling for production-grade firmware.
Implement Watchdog Timers
Use watchdog timers to recover from system hangs and ensure continuous operation.
Critical
Why it matters
Watchdog timers reset the system if software fails to 'kick' them periodically. This provides recovery from infinite loops, deadlocks, and other fault conditions. Configure the timeout based on your longest expected operation, with margin for variability.
Code example
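#include <errno.h>
#include <zephyr/device.h>
#include <zephyr/drivers/watchdog.h>
static const struct device *wdt = DEVICE_DT_GET(DT_ALIAS(watchdog0));
static int wdt_channel_id;
int watchdog_init(void) {
    struct wdt_timeout_cfg cfg = {
        .window.min = 0,
        .window.max = 5000,          /* 5 second timeout */
        .callback = NULL,            /* No pre-reset callback */
        .flags = WDT_FLAG_RESET_SOC, /* Reset the SoC on timeout */
    };
    if (!device_is_ready(wdt)) {
        return -ENODEV;
    }
    /* Returns the channel ID on success, negative error code on failure */
    wdt_channel_id = wdt_install_timeout(wdt, &cfg);
    if (wdt_channel_id < 0) {
        return wdt_channel_id;
    }
    return wdt_setup(wdt, WDT_OPT_PAUSE_HALTED_BY_DBG);
}
void main_loop(void) {
    while (1) {
        do_work();
        wdt_feed(wdt, wdt_channel_id); /* Kick the dog */
    }
}
Tips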
- Feed watchdog in main loop, not ISRs
- Set timeout longer than worst-case processing time
- Use multiple watchdog channels for monitoring different tasks
- Pause watchdog during debugging to prevent reset cycles
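When the hardware offers only one watchdog channel, several tasks can still be supervised by requiring each of them to check in before the watchdog is fed. The sketch below illustrates the idea; the task bits and function names are hypothetical, hw_watchdog_feed is a stand-in for the real feed call, and a multi-threaded system would need atomic operations on the check-in mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define TASK_SENSOR (1U << 0)
#define TASK_RADIO  (1U << 1)
#define TASK_ALL    (TASK_SENSOR | TASK_RADIO)

static uint32_t checked_in;        /* one bit per supervised task */
static unsigned int hw_feed_count; /* illustration only: counts real feeds */

/* Stand-in for the hardware feed, e.g. wdt_feed() on Zephyr */
static void hw_watchdog_feed(void)
{
    hw_feed_count++;
}

/* Each supervised task calls this once per healthy loop iteration */
void task_checkin(uint32_t task_bit)
{
    checked_in |= task_bit;
}

/* Called periodically by a supervisor. Feeds only if every task reported. */
bool watchdog_service(void)
{
    if ((checked_in & TASK_ALL) != TASK_ALL) {
        return false; /* at least one task is stuck: let the watchdog expire */
    }
    checked_in = 0U; /* start a new supervision window */
    hw_watchdog_feed();
    return true;
}
```

A single hung task is enough to stop the feeding, which a feed call placed in one loop alone would not detect.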
Validate Input Data
Always validate data from external sources (sensors, communication interfaces) before processing.
Critical
Why it matters
External data can be corrupted, out of range, or maliciously crafted. Always validate before use. Check bounds, verify checksums, and sanitize inputs to prevent buffer overflows, crashes, and security vulnerabilities.
Code example
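#include <errno.h>
#include <stddef.h>
#include <string.h>
typedef struct {
    uint8_t cmd;
    uint16_t length;
    uint8_t data[MAX_PAYLOAD];
    uint16_t crc;
} packet_t;
int process_packet(const uint8_t *buf, size_t len) {
    packet_t pkt;
    /* Assumes fixed-size frames that match packet_t's in-memory layout */
    if ((buf == NULL) || (len != sizeof(pkt))) {
        return -EINVAL;
    }
    /* Copy rather than cast: buf may be misaligned for packet_t */
    memcpy(&pkt, buf, sizeof(pkt));
    /* Validate length field */
    if (pkt.length > MAX_PAYLOAD) {
        return -EINVAL;
    }
    /* Validate CRC over every byte that precedes the crc field */
    if (calculate_crc(buf, offsetof(packet_t, crc)) != pkt.crc) {
        return -EBADMSG;
    }
    return handle_command(&pkt);
}
Tips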
- Check all array indices before access
- Validate numeric ranges for physical quantities
- Use checksums or CRCs for transmitted data
- Implement rate limiting for external inputs
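For variable-length frames, decoding each field explicitly from the byte stream avoids depending on struct padding, alignment, or endianness. The sketch below assumes a hypothetical wire format (1-byte command, 2-byte little-endian payload length, payload, 2-byte little-endian additive checksum); a real protocol would normally use a proper CRC instead of a simple sum.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_MAX_PAYLOAD 16U
#define FRAME_OVERHEAD    5U /* cmd + 2-byte length + 2-byte checksum */

typedef struct {
    uint8_t cmd;
    uint16_t length;
    const uint8_t *payload; /* points into the caller's buffer */
} frame_view_t;

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

static uint16_t sum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0U;

    for (size_t i = 0U; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Returns 0 on success, -1 on a malformed frame, -2 on a checksum mismatch */
int frame_parse(const uint8_t *buf, size_t len, frame_view_t *out)
{
    if ((buf == NULL) || (out == NULL) || (len < FRAME_OVERHEAD)) {
        return -1;
    }
    uint16_t payload_len = read_le16(&buf[1]);

    if (payload_len > FRAME_MAX_PAYLOAD) {
        return -1; /* declared length exceeds what we accept */
    }
    if (len != ((size_t)payload_len + FRAME_OVERHEAD)) {
        return -1; /* declared length disagrees with bytes received */
    }
    if (sum16(buf, len - 2U) != read_le16(&buf[len - 2U])) {
        return -2;
    }
    out->cmd = buf[0];
    out->length = payload_len;
    out->payload = &buf[3];
    return 0;
}
```

Every length is checked before it is used as an index, so a hostile length field cannot make the parser read past the received bytes.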
Secure Communication
Use encryption and authentication for wireless communication and firmware updates.
High Priority
Why it matters
Wireless communication can be intercepted or spoofed. Use TLS for IP-based protocols, enable BLE encryption and bonding, and implement message authentication. Nordic devices support hardware crypto acceleration for efficient security.
Code example
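/* BLE security configuration */
static struct bt_conn_auth_cb auth_callbacks = {
    .passkey_display = on_passkey_display,
    .cancel = on_auth_cancel,
};
/* Recent Zephyr versions report pairing results through a separate struct */
static struct bt_conn_auth_info_cb auth_info_callbacks = {
    .pairing_complete = on_pairing_complete,
};
int security_init(void) {
    int err = bt_conn_auth_cb_register(&auth_callbacks);
    if (err) {
        return err;
    }
    return bt_conn_auth_info_cb_register(&auth_info_callbacks);
}
/* Call once a connection exists, for example from the connected callback */
int security_request(struct bt_conn *conn) {
    /* Level 4: authenticated LE Secure Connections pairing with encryption */
    return bt_conn_set_security(conn, BT_SECURITY_L4);
}
Tips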
- Use hardware crypto when available for better performance
- Store encryption keys in secure storage (such as a hardware key store, where available), not in plaintext flash
- Implement key rotation for long-lived devices
- Enable BLE secure connections (LESC) for stronger pairing
Implement Error Handling
Add comprehensive error checking and recovery mechanisms for robust operation.
High Priority
Why it matters
Robust embedded systems gracefully handle errors rather than crashing. Check return values, implement retry logic for transient failures, and define clear error recovery procedures. Log errors for later diagnosis.
Code example
int sensor_read_with_retry(sensor_data_t *data) {
int ret;
int retries = 3;
while (retries-- > 0) {
ret = sensor_read(data);
if (ret == 0) {
return 0; /* Success */
}
if (ret == -ENODEV) {
/* Sensor disconnected - no point retrying */
LOG_ERR("Sensor not found");
return ret;
}
LOG_WRN("Sensor read failed, retrying...");
k_msleep(100);
}
LOG_ERR("Sensor read failed after retries");
return ret;
}
Tips
- Distinguish between recoverable and fatal errors
- Use exponential backoff for retry delays
- Log enough context to diagnose issues remotely
- Consider safe fallback modes for critical systems
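The exponential backoff tip can replace a fixed retry delay with a small helper like the one sketched below. The function name and parameters are hypothetical; the delay doubles with each attempt and is capped so that a long outage does not produce unbounded waits.

```c
#include <stdint.h>

/* Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ...
   capped at max_ms. The check before doubling prevents integer overflow. */
uint32_t backoff_delay_ms(uint32_t attempt, uint32_t base_ms, uint32_t max_ms)
{
    uint32_t delay = base_ms;

    for (uint32_t i = 0U; i < attempt; i++) {
        if (delay > (max_ms / 2U)) {
            return max_ms; /* doubling again would exceed the cap */
        }
        delay *= 2U;
    }
    return (delay > max_ms) ? max_ms : delay;
}
```

In a retry loop, the fixed sleep would become a sleep of backoff_delay_ms(attempt, 100, 1000) milliseconds, giving 100, 200, 400, 800, and then 1000 ms for every later attempt. Adding a small random jitter is a common refinement when many devices may retry at the same time.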
Testing & Debugging
Catch bugs early with unit tests, structured logging, edge-case coverage, and end-to-end integration testing.
Unit Test Your Code
Write unit tests for critical functions to catch bugs early in development.
High Priority
Why it matters
Unit tests verify individual functions work correctly in isolation. For embedded systems, use frameworks like Ztest (Zephyr's native framework) or Unity. Mock hardware dependencies to run tests on your development machine.
Code example
/* Zephyr Ztest example */
#include <zephyr/ztest.h>
#include "checksum.h"
ZTEST_SUITE(checksum_tests, NULL, NULL, NULL, NULL, NULL);
/* With ZTEST_SUITE, each test body is defined directly by the ZTEST macro */
ZTEST(checksum_tests, test_checksum_empty) {
    /* Standard C has no zero-length arrays, so pass a length of 0 instead */
    uint8_t data[1] = {0};
    uint32_t result = calculate_checksum(data, 0);
    zassert_equal(result, 0, "Empty data should return 0");
}
ZTEST(checksum_tests, test_checksum_known_value) {
    uint8_t data[] = {0x01, 0x02, 0x03, 0x04};
    uint32_t result = calculate_checksum(data, sizeof(data));
    /* 1 + 2 + 3 + 4 = 10, assuming the additive checksum from the coding standards example */
    zassert_equal(result, 0x0AU, "Checksum mismatch");
}
Tips
- Test edge cases: empty input, maximum values, null pointers
- Run tests on both host and target hardware
- Integrate tests into your CI/CD pipeline
- Aim for high coverage of critical code paths
Use Debug Interfaces
Leverage JTAG/SWD debugging tools and logging to troubleshoot issues efficiently.
High Priority
Why it matters
Hardware debuggers (J-Link, ST-Link) provide breakpoints, memory inspection, and peripheral register views. Combined with RTT logging, you can debug without affecting timing-sensitive code. Use Zephyr's logging subsystem for structured output.
Code example
#include <zephyr/drivers/i2c.h>
#include <zephyr/logging/log.h>
LOG_MODULE_REGISTER(sensor, LOG_LEVEL_DBG);
/* i2c_dev, buf, len, and addr are assumed to be defined elsewhere in this module */
int sensor_read(sensor_data_t *data) {
    LOG_DBG("Starting sensor read");
    int ret = i2c_read(i2c_dev, buf, len, addr);
    if (ret < 0) {
        LOG_ERR("I2C read failed: %d", ret);
        return ret;
    }
    LOG_HEXDUMP_DBG(buf, len, "Raw data:");
    /* ...decode buf into *data here... */
    LOG_INF("Temperature: %d", (int)data->temperature);
    return 0;
}
Tips
- Use Segger RTT for low-impact logging
- Configure log levels per module for focused debugging
- Use hardware breakpoints to catch memory corruption
- Profile code with cycle-accurate timing via ITM/ETM
Test Edge Cases
Test boundary conditions, error scenarios, and resource limitations thoroughly.
Medium Priority
Why it matters
Bugs often hide at boundaries - buffer limits, integer overflow points, and timing edges. Test what happens when resources are exhausted, when operations are interrupted, and when inputs are at their limits.
Tips
- Test with minimum and maximum input values
- Simulate memory exhaustion and resource starvation
- Test with unreliable communication (dropped packets, timeouts)
- Verify behavior at temperature and voltage extremes
Perform Integration Testing
Test complete system integration including hardware, firmware, and external interfaces.
High Priority
Why it matters
Integration tests verify that components work together correctly. This includes testing communication protocols, sensor fusion algorithms, and end-to-end functionality. Use real hardware and simulate external systems when needed.
Tips
- Create automated test fixtures with real hardware
- Test communication with actual mobile apps and gateways
- Verify long-running stability (hours or days)
- Test firmware update process end-to-end
Remember
These best practices are guidelines based on industry experience. Always adapt them to your specific project requirements, hardware constraints, and regulatory standards. Consistency and documentation are key to maintainable embedded systems.