Dec 25, 2015

Mainline U-Boot on the stm32f429 discovery board

Emcraft has been slowly mainlining STM32F4 support into the U-Boot git tree, targeting the STM32F429 discovery board.  As a bonus, the mainline Linux kernel now has stm32_defconfig, which should work on this board as well.  I shelled out $36 (including tax and shipping) to Avnet, attempted to build for this board, and ran into an error specific to a toolchain with hardware floating point support:

~u-boot$ make stm32f429-discovery_defconfig
~u-boot$ make
arm-buildroot-uclinux-uclibcgnueabihf-ld.bfd: error: /mnt/work/band/uClinux/BRm4/output/host/usr/bin/../lib/gcc/arm-buildroot-uclinux-uclibcgnueabihf/5.3.0/libgcc.a(_udivmoddi4.o) uses VFP register arguments, u-boot does not
arm-buildroot-uclinux-uclibcgnueabihf-ld.bfd: failed to merge target specific data of file /mnt/work/band/uClinux/BRm4/output/host/usr/bin/../lib/gcc/arm-buildroot-uclinux-uclibcgnueabihf/5.3.0/libgcc.a(_udivmoddi4.o)

The culprit was "__aeabi_uldivmod", pointed out by this recent patch, which is already mainlined.  So I searched for source files having that undefined symbol, and found it in arch/arm/cpu/armv7m/stm32f4/timer.c, which I fixed like this:

#include <div64.h>

ulong get_timer(ulong base)
{
    /* was: return (get_ticks() / (CONFIG_SYS_HZ_CLOCK / CONFIG_SYS_HZ)) - base; */
    return (ulong)lldiv(get_ticks(), CONFIG_SYS_HZ_CLOCK / CONFIG_SYS_HZ)
           - base;
}
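
Why does a plain "/" on a 64-bit value drag in libgcc?  The Cortex-M4 has no 64-bit divide instruction, so the compiler emits a call to __aeabi_uldivmod, and that is the libgcc object the linker refused to merge.  lldiv() keeps the division inside U-Boot's own code instead.  The following is only a minimal sketch of the idea (binary long division of a 64-bit value by a 32-bit divisor); sketch_lldiv is a hypothetical name, and U-Boot's real lldiv() is implemented differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for lldiv(): divide a 64-bit value by a 32-bit
 * divisor using only shifts, compares and subtractions, so the compiler
 * never needs a 64-bit division routine such as __aeabi_uldivmod.
 */
uint64_t sketch_lldiv(uint64_t dividend, uint32_t divisor)
{
    uint64_t quotient = 0;
    uint64_t remainder = 0;
    int bit;

    assert(divisor != 0);

    /* Classic restoring long division, one bit per iteration. */
    for (bit = 63; bit >= 0; bit--) {
        remainder = (remainder << 1) | ((dividend >> bit) & 1u);
        if (remainder >= divisor) {
            remainder -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }
    return quotient;
}
```

For instance, sketch_lldiv(1000000, 1000) returns 1000, which is the shape of the ticks-to-milliseconds conversion in get_timer().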

Understanding the Kconfig style <board>_defconfig

Just like the Linux kernel, make <board>_defconfig yields the top level .config file--which should not be hand-edited--used in the main make step.  The .config will then control both what source files get pulled into the executable, and what the CPP sees (through #define's).

Along with this scant description in configs/stm32f429-discovery_defconfig:

CONFIG_ARM=y
CONFIG_TARGET_STM32F429_DISCOVERY=y
CONFIG_SYS_PROMPT="U-Boot > "
# CONFIG_CMD_SETEXPR is not set

board/st/stm32f429-discovery/Kconfig has the defaults [if TARGET_STM32F429_DISCOVERY] for SYS_BOARD, SYS_VENDOR, SYS_SOC ("stm32f4"), and SYS_CONFIG_NAME ("stm32f429-discovery").  From this, scripts/Makefile.autoconf generates include/config.h by substituting SYS_CONFIG_NAME into a template string, yielding:

#define CONFIG_BOARDDIR board/st/stm32f429-discovery
#include <config_defaults.h>
#include <config_uncmd_spl.h>
#include <configs/stm32f429-discovery.h>
#include <asm/config.h>
#include <config_fallbacks.h>

include/configs/stm32f429-discovery.h has the board specific configurations, like:

#define CONFIG_SYS_FLASH_BASE 0x08000000
#define CONFIG_SYS_INIT_SP_ADDR 0x10010000
#define CONFIG_SYS_TEXT_BASE 0x08000000
#define CONFIG_SYS_ICACHE_OFF

Linker script is auto-generated from template u-boot.lds

Similar to how the u-boot.lds was generated in the Emcraft case I covered earlier, the top level Makefile looks in several places for the u-boot.lds template:

ifndef LDSCRIPT
ifeq ($(wildcard $(LDSCRIPT)),)
LDSCRIPT := $(srctree)/board/$(BOARDDIR)/u-boot.lds
endif
ifeq ($(wildcard $(LDSCRIPT)),)
LDSCRIPT := $(srctree)/$(CPUDIR)/u-boot.lds
endif
ifeq ($(wildcard $(LDSCRIPT)),)
LDSCRIPT := $(srctree)/arch/$(ARCH)/cpu/u-boot.lds
endif
endif

In this example, that is arch/arm/cpu/u-boot.lds (the last option).  But unlike in the old Emcraft build, the explicit MEMORY segment definition is gone from the linker script, and has been replaced by the "-Ttext" argument on the linker command line.  The vector table is supplied in arch/arm/lib/vectors_m.S (note the "_m"):

   .section  .vectors
ENTRY(_start)
.long CONFIG_SYS_INIT_SP_ADDR @ 0 - Reset stack pointer
.long reset @ 1 - Reset
.long __invalid_entry @ 2 - NMI
.long __hard_fault_entry @ 3 - HardFault
.long __mm_fault_entry @ 4 - MemManage
...

where CONFIG_SYS_INIT_SP_ADDR is supplied in the stm32f429-discovery.h, and the reset vector is supplied in arch/arm/cpu/armv7m/start.S:

.globl reset
.type reset, %function
reset:
b _main

Note that ENTRY(_start) was already specified in the linker script.
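
To make the role of the first two vector entries concrete, here is a small host-side sketch of what a Cortex-M core does at reset: it loads the initial stack pointer from word 0 and the reset handler address from word 1.  The struct and function names are hypothetical, and the reset handler address in the usage note is a made-up illustration value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two registers loaded at reset. */
struct sketch_reset_state {
    uint32_t sp;    /* main stack pointer, from vector 0 */
    uint32_t pc;    /* first instruction address, from vector 1 */
};

struct sketch_reset_state sketch_cortex_m_reset(const uint32_t *vectors)
{
    struct sketch_reset_state state;

    /* Cortex-M only executes Thumb code, so bit 0 of the vector must be set. */
    assert((vectors[1] & 1u) != 0);

    state.sp = vectors[0];
    /* Bit 0 is the Thumb flag, not part of the fetch address. */
    state.pc = vectors[1] & ~(uint32_t)1;
    return state;
}
```

With word 0 holding 0x10010000 (CONFIG_SYS_INIT_SP_ADDR) and a hypothetical reset handler at 0x08000200, word 1 of the table holds 0x08000201, and the sketch yields sp = 0x10010000 and pc = 0x08000200.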

Reset vector to board init function array

The arch/arm/lib/crt0.S:_main therefore does the heavy lifting of setting up the C runtime, including setting up the global_data structure as before--in board_init_f_mem(ulong top) function [which is passed the (8-byte aligned according to EABI requirement) stack pointer configured by the CONFIG_SYS_INIT_SP_ADDR above].

common/board_f.c:board_init_f(flag) calls the init functions (that can run before code relocation?) defined in the array:

static init_fnc_t init_sequence_f[] = {
...
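
The way such an array is typically consumed can be sketched as below: walk the NULL-terminated table and stop at the first function that returns non-zero.  This is a simplified illustration, not U-Boot's actual code; sketch_run_init_sequence and the demo_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*init_fnc_t)(void);

static int demo_calls;    /* counts how many init functions ran */

static int demo_ok(void)   { demo_calls++; return 0; }
static int demo_fail(void) { demo_calls++; return -1; }

/* Hypothetical table: the third entry is never reached. */
static const init_fnc_t demo_sequence[] = {
    demo_ok,
    demo_fail,
    demo_ok,
    NULL,    /* terminator */
};

int sketch_run_init_sequence(const init_fnc_t *sequence)
{
    assert(sequence != NULL);

    for (; *sequence != NULL; sequence++) {
        int ret = (*sequence)();

        if (ret != 0)
            return ret;    /* abort on the first failure */
    }
    return 0;
}
```

Running the demo table returns -1 after exactly two calls, which mirrors how a failing early init step keeps the later ones from running.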

arch/arm/cpu/armv7m/stm32f4/soc.c:arch_cpu_init

__weak int arch_cpu_init() is the first architecture specific CPU initialization, and starts to configure clocks straight away; the STM32_FLASH register is written INSIDE configure_clocks(), right after the PLL is ready.  In contrast, Emcraft's u-boot for its stm32f7-som board prepared external flash access first and initialized the systick timer, before configuring the clocks.
  • In clock.c:configure_clocks(), the discovery board uses HSI (high speed internal) as the PLL source, rather than the HSE (high speed external) used for the Emcraft SOM board (clock_setup).
  • The mainline u-boot code seems to use the setbits_le32() / clrbits_le32() / writel() macros extensively, whereas the Emcraft code is more of the straight bit-bang against the raw RCC (reset and clock control) registers.
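
The read-modify-write style of those helpers can be sketched as follows.  This is a host-side approximation with hypothetical sketch_* names operating on a plain variable; the real U-Boot macros go through I/O accessors and deal with little-endian conversion, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Set the bits in 'set', leaving all other bits untouched. */
static inline void sketch_setbits32(volatile uint32_t *reg, uint32_t set)
{
    assert(reg != 0);
    *reg |= set;
}

/* Clear the bits in 'clear', leaving all other bits untouched. */
static inline void sketch_clrbits32(volatile uint32_t *reg, uint32_t clear)
{
    assert(reg != 0);
    *reg &= ~clear;
}

/* Clear a field, then set new bits within it, in a single write. */
static inline void sketch_clrsetbits32(volatile uint32_t *reg,
                                       uint32_t clear, uint32_t set)
{
    assert(reg != 0);
    *reg = (*reg & ~clear) | set;
}
```

Starting from 0xF0, setting 0x01 gives 0xF1, clearing 0x30 gives 0xC1, and clear-then-set of the 0xC0 field to 0x40 gives 0x41.  The raw-register style expresses the same operations with open-coded "|=" and "&= ~" statements.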
Compared to the Emcraft code, the MPU setup in the generic stm32f4/soc.c is much simpler: it just grants full access to a strongly ordered, shareable 4 GB region.

init device model

This wasn't in the Emcraft code.  There doesn't seem to be any stm32f(4) specific device model code?

bootstage_add_record: evolution of show_boot_progress()

In the previous blog entry, show_boot_progress() was just an optional (weak function) hook for architecture specific handling of the checkpoints.  bootstage_add_record() now records each checkpoint into a static array of boot stage records (BOOTSTAGE_ID_COUNT = 215 currently) before calling show_boot_progress().
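
A minimal sketch of that "record first, then notify" pattern is shown below.  The struct layout, SKETCH_ID_COUNT and all sketch_* names are hypothetical and much simpler than U-Boot's bootstage code; the progress hook is modeled as a plain function that remembers the last id it saw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ID_COUNT 8    /* hypothetical; the text cites 215 for U-Boot */

struct sketch_bootstage_record {
    uint32_t time_us;    /* timestamp of the checkpoint */
    const char *name;    /* human-readable label */
};

static struct sketch_bootstage_record sketch_records[SKETCH_ID_COUNT];
static int sketch_last_progress = -1;

/* Stand-in for the board-specific show_boot_progress() hook. */
static void sketch_show_boot_progress(int id)
{
    sketch_last_progress = id;
}

/* Store the checkpoint in the static array, then call the hook. */
int sketch_bootstage_add_record(int id, const char *name, uint32_t time_us)
{
    assert(name != NULL);

    if (id < 0 || id >= SKETCH_ID_COUNT)
        return -1;    /* out of range: nothing recorded */

    sketch_records[id].time_us = time_us;
    sketch_records[id].name = name;
    sketch_show_boot_progress(id);
    return 0;
}
```

Because the records live in a static array, they survive for later inspection even on boards whose progress hook does nothing.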

board_early_init_f: setup GPIO for UART

In general, we want to configure a console as soon as possible.  The assignment of the TX/RX pins to the serial console is board-specific, and for the discovery board, is in the stm32f429-discovery.c:

static const struct stm32_gpio_dsc usart_gpio[] = {
{STM32_GPIO_PORT_X, STM32_GPIO_PIN_TX}, /* TX */
{STM32_GPIO_PORT_X, STM32_GPIO_PIN_RX}, /* RX */
};

But with slightly more work, arch/arm/include/asm/arch-stm32f4/gpio.h accommodates all STM32 UART variants through #define CONFIG_STM32_USART, like this:

#if (CONFIG_STM32_USART == 1)
#define STM32_GPIO_PORT_X   STM32_GPIO_PORT_A
#define STM32_GPIO_PIN_TX   STM32_GPIO_PIN_9
#define STM32_GPIO_PIN_RX   STM32_GPIO_PIN_10
#define STM32_GPIO_USART    STM32_GPIO_AF7
...

This works if both the TX and RX pins are on the same GPIO port group, which is not the case for the Emcraft stm32f7-som board!

timer_init

The Emcraft U-Boot uses systick as the timer (reload value = 0xFFFFFF-1, with external clock = 12 MHz), but the stm32f4-discovery board uses TIM2 driven by the internal clock.  Both Emcraft's U-Boot port and the stm32f4-discovery U-Boot compensate for counter wrap-around, as long as the timer's source clock keeps ticking (which is a problem when the processor goes into deep sleep).
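
Wrap compensation of this kind can be sketched for a free-running 32-bit up-counter: each read adds the unsigned difference from the previous read to a 64-bit total, which stays correct as long as the counter wraps at most once between reads.  The names are hypothetical and the code is not taken from either port; a 24-bit down-counter such as systick would need the difference masked to 24 bits and the direction reversed.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_tick_state {
    uint32_t last;     /* raw counter value at the previous read */
    uint64_t total;    /* accumulated ticks since start */
};

/* Feed in the current raw counter value; returns the extended count. */
uint64_t sketch_ticks_update(struct sketch_tick_state *state, uint32_t now)
{
    assert(state != 0);

    /* Unsigned subtraction yields the right delta even if 'now' wrapped. */
    state->total += (uint32_t)(now - state->last);
    state->last = now;
    return state->total;
}
```

If the previous raw value was 0xFFFFFFF0 and the counter now reads 0x00000010, the delta is 0x20 ticks even though the raw value went "backwards".  The scheme breaks down exactly when the source clock stops, since no ticks accumulate while the counter is frozen.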

dram_init

Emcraft's stm32f7-som puts the SDRAM on the 1st bank (at 0xC0000000), while the stm32f4-discovery board puts the SDRAM on the 2nd bank (at 0xD0000000).  The rest of the code--including the wait time during the setup--is quite similar.  After SDRAM is brought up, we can (optionally) run a DRAM test.

New in mainline (vs Emcraft code): relocate U-Boot to SDRAM

gd->mon_len apparently is the RAM space required for U-Boot code, data, and BSS.  gd->start_addr_sp points to the BEGINNING of contiguous pages for the U-Boot:

static int reserve_uboot(void)
{
gd->relocaddr -= gd->mon_len;
gd->relocaddr &= ~(4096 - 1);
gd->start_addr_sp = gd->relocaddr;

return 0;
}

reserve_malloc() decreases the gd->start_addr_sp by TOTAL_MALLOC_LEN from above, and we also need a global bd_t struct before the malloc area that gd->bd will point to.  bi_arch_number is part of the bd_t and may be potentially meaningful for Linux, but apparently is optional?  Below the bd_t is gd_t (global data), fdt blob, and the 16 byte aligned stack (pointer?).
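
The top-down carving of RAM can be followed with a little arithmetic.  The sketch below uses a hypothetical helper and made-up sizes (a RAM window ending at 0xD0800000 and a 0x12345-byte image) purely to show the subtract-then-align-down pattern seen in reserve_uboot().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: reserve 'len' bytes below 'top' and round the
 * result down to 'align', which must be a power of two.
 */
uint32_t sketch_reserve_below(uint32_t top, uint32_t len, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    top -= len;
    top &= ~(align - 1);    /* same masking as "&= ~(4096 - 1)" */
    return top;
}
```

For example, sketch_reserve_below(0xD0800000, 0x12345, 4096) gives 0xD07ED000: the subtraction yields 0xD07EDCBB and the mask drops the low 12 bits.  Reserving a further hypothetical 0x2000 bytes below that with 16-byte alignment gives 0xD07EB000, and each later reservation continues downward in the same way.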

After calculating all these space requirements, relocation begins:
  • reloc_fdt()
  • setup_reloc()
    • copy global data to gd->new_gd
If you are wondering "where is the rest of relocation?", remember that board_init_f was called from the assembly function _main.

Relocation continues after returning from board_init_f()

_main uses a clever trick to accomplish relocation.  Of course we have to memcpy the contents from the flash/SRAM to SDRAM.  But by changing the LR from the nominal address AFTER relocate_code() (in arch/arm/lib/relocate.S) to the corresponding address in the relocated copy, the CPU will magically start running from the relocated address after returning from relocate_code().  Vector relocation requires not only copying the content, but also telling the CPU about the new vector table address by writing to the V7M_SCB_VTOR register.
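
The two ingredients of that approach, copying the image and shifting addresses by a fixed offset, can be modeled on the host as below.  Everything here is a hypothetical simplification: the image is an array of words, the fixup list is a plain array of word indexes rather than real ELF relocation entries, and the addresses in the usage note are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Translate a link-time address (for example a saved LR) to its relocated twin. */
uint32_t sketch_relocated_addr(uint32_t addr, uint32_t link_base,
                               uint32_t reloc_base)
{
    return addr + (reloc_base - link_base);
}

/*
 * Copy 'words' words from src to dst, then add the relocation offset to
 * every word listed in 'fixups' (words that hold absolute addresses).
 */
void sketch_relocate_image(uint32_t *dst, const uint32_t *src, size_t words,
                           const size_t *fixups, size_t nfixups,
                           uint32_t link_base, uint32_t reloc_base)
{
    size_t i;

    memcpy(dst, src, words * sizeof(*dst));
    for (i = 0; i < nfixups; i++) {
        assert(fixups[i] < words);
        dst[fixups[i]] += reloc_base - link_base;
    }
}
```

With link_base = 0x08000000 and reloc_base = 0xD07ED000, a return address of 0x08000124 maps to 0xD07ED124, and a stored pointer 0x08000004 inside the image becomes 0xD07ED004 after the fixup pass.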

c_runtime_cpu_setup; is this what finally pulls the relocation trigger, as I described above?

.globl c_runtime_cpu_setup
c_runtime_cpu_setup:
mov pc, lr

After relocation, _main continues with:
  • zeroes out BSS (between __bss_start and __bss_end, defined in the linker)
  • coloured_LED_init()
  • red_led_on()
  • board_init_r(gd_t* new_gd, ulong dest_addr)

board_init_r(): board init AFTER relocation

Like the board_init_f(), board_init_r() just runs through an array of init functions, this time defined in board_r.c.
  • initr_trace: I did not turn on CONFIG_TRACE
  • initr_reloc does NOT relocate (it's done already!), but merely sets the gd->flags: GD_FLG_RELOC | GD_FLG_FULL_MALLOC_INIT
  • initr_caches: just calls weak function enable_caches().  This is my chance to call the Emcraft's cache initialization code!
  • initr_reloc_global_data?
  • initr_barrier: no-op for ARM
  • initr_malloc
  • bootstage_relocate: copy the bootstage names, because "name" pointer is still pointing to the strings in the old .text sections (which have been copied, but the pointers are still dangling)
  • board_init: gd->bd->bi_boot_params = CONFIG_SYS_SDRAM_BASE + 0x100
  • stdio_init_tables
  • initr_serial
  • initr_announce
  • power_init_board
  • initr_flash: even though U-Boot booted off the discovery board's internal flash, now this internal flash is registered with U-Boot, so it can write to this flash (for U-Boot self-update, for example).
  • initr_secondary_cpu: yet more initialization that requires full environment and flash? [not the discovery board]
  • stdio_add_devices
  • initr_jumptable: I saw the jump table briefly in the previous blog entry, but honestly I still don't understand it.
  • console_init_r
  • misc_init_r: reads the CPU serial number and stores into "serial#" environment variable
  • interrupt_init: no-op
  • initr_enable_interrupts: no-op
  • initr_ethaddr (if CONFIG_CMD_NET)
  • initr_net (if CONFIG_CMD_NET)
Finally, as in the Emcraft code studied in the previous blog entry, run_main_loop is the last init function to be called, and expected never to return.