Oct 31, 2014

Booting up Linux on Zedboard over network

In a previous post, I broke out of the constraints of the SD card for my Zedboard's Linux root file system.  I may start modifying the ARM Linux kernel soon, in which case the intermediate step of copying the uImage to the SD card seems mildly irritating.  On PCs, PXE boot is a well-proven technology, and I wrote down how to do it a few years ago.  PXE is an Intel technology and, I think, unavailable for ARM, but fortunately other people (like Sven Andersson) have already figured out how to coax U-Boot into downloading the kernel image over TFTP and then booting it.  I followed his "Zynq design from scratch" blog to set up kernel netboot, as described below.

TFTP server (tftpd-hpa) on Ubuntu 14.04 LTS

Install the tftpd-hpa on the host:
$ sudo apt-get install tftpd-hpa
The default configuration file (/etc/default/tftpd-hpa) needs only a few tweaks (my changes are TFTP_ADDRESS and TFTP_OPTIONS):

RUN_DAEMON="yes"
TFTP_USERNAME="tftp"
TFTP_DIRECTORY="/var/lib/tftpboot"
TFTP_ADDRESS="0.0.0.0:69"
TFTP_OPTIONS="--secure --ipv4"


For reasons I cannot recall now, I ran TFTP as its own daemon instead of through inetd; because I have bigger fish to fry, I am moving on.  Serving all interfaces on the host is fine because my development box has only 1 NIC.  I originally allowed IPv6 to run on my server, but got fed up with apt-get taking forever to time out against Canonical's servers, so I turned off IPv6 on my desktop by appending the following to /etc/sysctl.conf:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

and then running

$ sudo sysctl -p

Turning off IPv6 on the host also requires turning it off in the TFTP config file, by changing TFTP_ADDRESS to the value shown above; note that the port number does not change.

The server must serve up the kernel image and device tree, which I put in TFTP_DIRECTORY defined in the config file above.

/var/lib/tftpboot$ sudo cp ~/work/zynq/buildroot/output/images/uImage zedImage
/var/lib/tftpboot$ sudo cp ~/work/zynq/buildroot/output/images/zynq-zed-adv7511.dtb .

No, I still cannot make the TFTP server follow symbolic links, sigh...  As a sanity test, install tftp on the host and try "get"-ting one of the above files into your home folder.  Here is a post-build script to ease the pain; change the folders according to your own setup:

sudo rm -rf /export/root/zed/; sudo mkdir /export/root/zed/
sudo tar -C /export/root/zed -xf /mnt/work/zynq/buildroot/output/images/rootfs.tar

sudo cp /mnt/work/zynq/buildroot/output/images/uImage /var/lib/tftpboot/zedImage
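Under the hood, that "get" is a tiny binary protocol.  This Python sketch (my addition, not from the original setup) builds the read request (RRQ) packet a TFTP client sends to port 69 per RFC 1350; "zedImage" matches the file copied above:

```python
import struct

def tftp_rrq(filename, mode="octet"):
    """Build a TFTP read request (RFC 1350): 2-byte opcode 1, then the
    filename and transfer mode as NUL-terminated strings."""
    return struct.pack("!H", 1) + filename.encode() + b"\x00" + mode.encode() + b"\x00"

pkt = tftp_rrq("zedImage")  # what a client's "get zedImage" starts with
```

The server answers with DATA packets (opcode 3) of up to 512 bytes each; a short final block marks end of file.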

Troubleshooting

If TFTP suddenly stops working when you have not changed anything on the target side, suspect the TFTP server--it often fails to start, or crashes.  To check:

$ sudo service tftpd-hpa status
tftpd-hpa start/running, process 5930

It should print a PID if it's really running, as in the above example.  If not, try stopping and restarting:

$ sudo service tftpd-hpa restart

Manually boot off TFTP

Logically, U-Boot needs to know 2 things to boot:

  1. What files to download over TFTP, and where in memory to put them.
    1. ipaddr and serverip U-Boot environment variables stay the same as the NFS root file system case.
    2. Leveraging the knowledge gained from the previous post, I will continue to mount the root file system over NFS, so I only need to download the kernel image (uImage) and the DTB (because my eval board is ARM).  Copied here for convenience: the kernel image goes at 0x3000000, and the device tree image at 0x2A00000.
  2. The boot command specifying the memory address containing the kernel.  From previous post, I already have a working boot command.  If I keep the memory addresses the same, I do NOT have to modify the boot command.
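As a quick sanity check on those two load addresses: the DTB slot at 0x2A00000 sits 6 MB below the kernel at 0x3000000, so a typical DTB cannot run into the kernel image.  A small sketch (the image sizes below are made-up placeholders, not from this post):

```python
DTB_ADDR    = 0x2A00000   # device tree load address
KERNEL_ADDR = 0x3000000   # uImage load address

def overlaps(base_a, size_a, base_b, size_b):
    """True if [base_a, base_a+size_a) intersects [base_b, base_b+size_b)."""
    return base_a < base_b + size_b and base_b < base_a + size_a

dtb_size    = 16 * 1024        # assumed: DTBs are usually a few tens of KB
kernel_size = 3 * 1024 * 1024  # assumed: a ~3 MB uImage

assert KERNEL_ADDR - DTB_ADDR == 0x600000  # 6 MB of headroom for the DTB
assert not overlaps(DTB_ADDR, dtb_size, KERNEL_ADDR, kernel_size)
```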
So let's focus on downloading the uImage and DTB files to the target.  I stop U-Boot in the serial console during the 3 second wait.  The U-Boot command to download a file from the TFTP server is (surprise!) "tftpboot".  So first, the device tree blob:
zynq-uboot> tftpboot 0x2A00000 192.168.1.2:zynq-zed-adv7511.dtb

Then the kernel image itself:
zynq-uboot> tftpboot 0x3000000 192.168.1.2:zedImage
Then we can simply boot from memory:

zynq-uboot> bootm
...
Welcome to Zedboard!

zed login:

Codify the success as U-Boot sdboot script

Remember that the FSBL is still necessary to program the FPGA bitstream and kick off U-Boot (the 2nd-stage boot loader).  All these files are still on the SD card, so when U-Boot loads, it finds itself in sdboot mode.  There is a matching environment variable "sdboot", which I configured in the previous post to load the uImage and DTB files from the SD's BOOT partition and kick off the bootm command (used above).  So for TFTP boot, I just have to modify this sdboot environment variable to match what worked above:

setenv serverip 192.168.1.2
setenv ipaddr 192.168.1.9
setenv kernel_image "zedImage"
setenv devicetree_image "zynq-zed-adv7511.dtb"

setenv sdboot 'if mmcinfo; then run uenvboot; echo Copying files over TFTP to RAM && tftpboot 0x2A00000 ${serverip}:${devicetree_image} && tftpboot 0x3000000 ${serverip}:${kernel_image} && bootm 0x3000000 - 0x2A00000; fi'

saveenv
boot

Enjoy!

Oct 30, 2014

Understanding /dev/zero

The Linux /dev/zero pseudo-file generates "0" bytes, as you can see in this example, where 4 zero bytes are strung together to give the impression of a ulong = 0:

henry@Zotac64:~$ od -vAn -N4 -tx4 /dev/zero
 00000000

The od command dumps a file in octal and other formats: "-N4" reads 4 bytes, and "-tx4" formats them as a 4-byte hex number.  strace shows that the system call used is read(), grabbing 4 bytes:
...
open("/dev/zero", O_RDONLY)          = 3
read(3, "\0\0\0\0", 4)               = 4
...
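The same 4-byte read can be reproduced in a few lines of Python (my addition, not in the original post); like od, it boils down to one read() on /dev/zero:

```python
import struct

with open("/dev/zero", "rb") as f:
    data = f.read(4)               # one read(2) syscall, like the strace above

value, = struct.unpack("<I", data) # little-endian 32-bit interpretation
print(hex(value))                  # → 0x0: /dev/zero always yields zero bytes
```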

How does it actually work under the hood?  Firstly, it is a mem-class device, as you can see under the /sys folder:

henry@Zotac64:~$ ls -o /sys/class/mem/zero
... /sys/class/mem/zero -> ../../devices/virtual/mem/zero

It is a char mem device (major number 1), as you can see below:

henry@Zotac64:~$ ls -lh --time-style=+ /dev/zero
crw-rw-rw- 1 root root 1, 5  /dev/zero

Browsing the kernel source

The memory devices are defined <kernel>/drivers/char/mem.c:

static const struct memdev {
	const char *name;
	umode_t mode;
	const struct file_operations *fops;
	struct backing_dev_info *dev_info;
} devlist[] = {
	...
	[3] = { "null", 0666, &null_fops, NULL },
	...
	[5] = { "zero", 0666, &zero_fops, &zero_bdi },
	[7] = { "full", 0666, &full_fops, NULL },
	[8] = { "random", 0666, &random_fops, NULL },
	[9] = { "urandom", 0666, &urandom_fops, NULL },
#ifdef CONFIG_PRINTK
	[11] = { "kmsg", 0644, &kmsg_fops, NULL },
#endif
};

Note that the array indices ARE the minor device numbers appearing in the /dev folder.  The actual device creation is done en masse in chr_dev_init():

err = bdi_init(&zero_bdi);
...
mem_class = class_create(THIS_MODULE, "mem");
...
for (minor = 1; minor < ARRAY_SIZE(devlist); minor++) {
	...
	device_create(mem_class, NULL, MKDEV(MEM_MAJOR, minor),
		      NULL, devlist[minor].name);
	...

device_create() registers it with sysfs, as we can check in /sys/class:

henry@Zotac64:~$ ls /sys/class/mem/
full  kmsg  mem  null  port  random  urandom  zero
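The MKDEV(MEM_MAJOR, minor) packing has a user-space counterpart, os.makedev; a quick check (my sketch) that major 1 / minor 5 round-trips into the numbers ls -l /dev/zero shows:

```python
import os

dev_zero = os.makedev(1, 5)  # MEM_MAJOR = 1, devlist index (minor) = 5

# The packed dev_t splits back into the "1, 5" that ls -l /dev/zero prints.
assert os.major(dev_zero) == 1
assert os.minor(dev_zero) == 5
```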

chr_dev_init() is called through an indirection: the fs_initcall(chr_dev_init) declaration inserts a function pointer into a section whose name follows a predefined pattern for the linker:

#define __define_initcall(fn, id) \
static initcall_t __initcall_##fn##id __used \
__attribute__((__section__(".initcall" #id ".init"))) = fn; \
LTO_REFERENCE_INITCALL(__initcall_##fn##id)

The above macro expands (with fn = chr_dev_init and id = 5, since fs_initcall uses id 5) to

static initcall_t __initcall_chr_dev_init5 __used __attribute__((__section__(".initcall5.init"))) = chr_dev_init;

that is, a static function pointer pointing to chr_dev_init, placed in the .initcall5.init section.

/dev/zero file ops are in mem.c also:

static const struct file_operations zero_fops = {
	.llseek		= zero_lseek,
	.read		= read_zero,
	.write		= write_zero,
	.aio_read	= aio_read_zero,
	.aio_write	= aio_write_zero,
	.mmap		= mmap_zero,
};

For "zero", open/release ops are unnecessary, so the fops does NOT include them.  The zero device needs no real seek or write either, so these are #define'd to the null device's implementations:

#define zero_lseek	null_lseek
#define write_zero	write_null
#define aio_write_zero	aio_write_null

The read clears the user buffer, using either a slow byte operation for < 64 bytes or a wholesale memset; for larger buffers, the clearing happens in chunks of PAGE_SIZE.

static ssize_t read_zero(struct file *file, char __user *buf,
			 size_t count, loff_t *ppos)
{
	...
	if (!access_ok(VERIFY_WRITE, buf, count))
		return -EFAULT;

	while (count) {
		unsigned long unwritten;
		size_t chunk = count;

		if (chunk > PAGE_SIZE)
			chunk = PAGE_SIZE;
		unwritten = __clear_user(buf, chunk);
		written += chunk - unwritten;
		if (unwritten)
			break;
		if (signal_pending(current))
			return written ? written : -ERESTARTSYS;
		buf += chunk;
		count -= chunk;
		cond_resched();
	}
	return written ? written : -EFAULT;
}

cond_resched() allows an important task to preempt this thread--all part of being a considerate citizen.
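The chunking arithmetic in read_zero() can be sketched in Python (my illustration, assuming the usual 4 KB page size):

```python
PAGE_SIZE = 4096  # assumed; read_zero() uses the kernel's PAGE_SIZE

def zero_chunks(count):
    """Return the sequence of chunk sizes read_zero() would clear for a
    count-byte buffer: full pages, then one short final chunk."""
    chunks = []
    while count:
        chunk = min(count, PAGE_SIZE)
        chunks.append(chunk)
        count -= chunk
    return chunks

print(zero_chunks(10000))  # → [4096, 4096, 1808]
```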

Although I don't understand fully what it means to memory map a character device, it IS possible to mmap zero for shared mapping:



static int mmap_zero(struct file *file, struct vm_area_struct *vma) {
#ifndef CONFIG_MMU
    return -ENOSYS;
#endif
    if (vma->vm_flags & VM_SHARED)
        return shmem_zero_setup(vma);
    return 0;
}
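From user space, the VM_SHARED path can be exercised directly: a shared, writable mapping of /dev/zero (the classic way to get anonymous memory before MAP_ANONYMOUS existed) comes back zero-filled, courtesy of shmem_zero_setup().  A quick check, my addition:

```python
import mmap

with open("/dev/zero", "r+b") as f:
    # MAP_SHARED on /dev/zero takes the shmem_zero_setup() path in mmap_zero()
    m = mmap.mmap(f.fileno(), 4096, mmap.MAP_SHARED,
                  mmap.PROT_READ | mmap.PROT_WRITE)
    assert m[:4] == b"\x00\x00\x00\x00"  # freshly mapped pages read as zeros
    m.close()
```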

Trace this, baby!

Let's make this blog entry a bit more interesting by tracing the interaction with the /dev/zero file.  The Ubuntu 12.04 LTS kernel I am running on my PC has some FTRACE capability, according to the config file saved in /boot:

henry@Zotac64:~$ grep FTRACE /boot/config-3.13.0-37-generic 
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
# CONFIG_PSTORE_FTRACE is not set
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set

Low level tracing

According to the ftrace author Steven Rostedt (also this more advanced topic), CONFIG_DYNAMIC_FTRACE allows me to tap into any of the events in /sys/kernel/debug/tracing/available_events, which numbers over 1000 on my PC kernel!

# wc -l /sys/kernel/debug/tracing/available_events 
1114 /sys/kernel/debug/tracing/available_events

Of those, the read and write system calls (as shown in strace above) might be interesting:

syscalls:sys_exit_read
syscalls:sys_enter_read
syscalls:sys_exit_write
syscalls:sys_enter_write

Writing to the trace buffer can be turned on and off by writing 1 or 0 to /sys/kernel/debug/tracing/tracing_on, but the tracer itself is selected by writing its name to /sys/kernel/debug/tracing/current_tracer, which is nop at system startup:

# echo function_graph > current_tracer

The function_graph tracer is more fully featured than the simpler function tracer.

To synchronize the events with the user space, ftrace exposes a file /sys/kernel/debug/tracing/trace_marker, that can be written to like this example:

[tracing]# echo hello world > trace_marker
[tracing]# cat trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
           <...>-3718  [001]  5546.183420: 0: hello world

To do the equivalent from within the kernel, use the function trace_printk().  I found that what I write to trace_marker does NOT show up in trace when I use the function_graph tracer.

While tracing is enabled, the result can be viewed either per CPU in /sys/kernel/debug/tracing/per_cpu/cpu[0-9]/trace, or rolled up in /sys/kernel/debug/tracing/trace (which might be confusing to look at).  To view the result AFTER tracing is turned off, you have to copy the trace into another file.

Putting everything together, a way to briefly trace reading 4 bytes from /dev/zero would be:

echo read_zero > set_ftrace_filter
echo 1 > tracing_on
echo function > current_tracer
echo "begin read zero" > trace_marker
od -vAn -N4 -tx4 /dev/zero
echo "end read zero" > trace_marker
echo 0 > tracing_on
cp trace /home/henry/read_zero.ftrace
echo nop > current_tracer
chown henry /home/henry/read_zero.ftrace
echo > set_ftrace_filter # Undo the filter

Then look at the result in the copied file, /home/henry/read_zero.ftrace.

Make it easy on yourself: trace-cmd

I found that Steven Rostedt wrote another tool to ease data collection: trace-cmd.

trace-cmd record -p function od -vAn -N4 -tx4 /dev/zero

The "-l" argument specifies the function filter, and "-e" the events to enable; both can be repeated.  The rest of the command line is the command to trace (in this example, everything starting with "od").  To look at the result, I played around with kernelshark, but I found it more confusing than the plain text output from trace-cmd report:

trace-cmd report | grep -v 'trace-cmd' > read_zero.txt

It shows that we fly through the open and then the read of /dev/zero in a few microseconds:

od-4878  [000] 22634.564764: sys_enter:            NR 2 ...
od-4878  [000] 22634.564764: sys_enter_open:...
...
od-4878  [000] 22634.564771: do_sys_open:          [FAILED TO PARSE] filename=/dev/zero flags=32768 mode=438
...
od-4878  [000] 22634.564772: sys_exit:             NR 2 = 3
od-4878  [000] 22634.564772: sys_exit_open:        0x3
...
od-4878  [000] 22634.564782: sys_enter:            NR 0 (3, 22612f0, 4, 7ffff7fd18c0, 1, 3)
od-4878  [000] 22634.564782: sys_enter_read: ...
od-4878  [000] 22634.564784: function:             read_zero
od-4878  [000] 22634.564784: sys_exit:             NR 0 = 4
od-4878  [000] 22634.564785: sys_exit_read:        0x4

The whole read took 3 us (22634.564785 - 22634.564782)!  On x86-64 the read() system call number is apparently 0 (NR 0 above); the 3 in its first argument is the file descriptor that open() (NR 2) returned.  If I remove the filter option, I can see the call graph like this example:

trace-cmd record -p function -e "irq:*" od -vAn -N4 -tx4 /dev/zero
25182.162030:              SyS_read
25182.162031:                 fget_light
25182.162031:                 vfs_read
25182.162031:                    rw_verify_area
25182.162031:                       security_file_permission
25182.162032:                          apparmor_file_permission
25182.162032:                             common_file_perm
25182.162032:                                aa_file_perm
25182.162032:                          __fsnotify_parent
25182.162032:                          fsnotify
25182.162033:                    read_zero
25182.162033:                       read_zero.part.5
25182.162033:                          __clear_user
25182.162033:                          _cond_resched
25182.162034:                    __fsnotify_parent
25182.162034:                    fsnotify

This call graph matches the source code we browsed earlier.  But I am disappointed about not seeing any interrupts so far; what does the system call entry look like in ftrace?  I think a system call involves raising an SWI, which is handled in assembly (for ARM: <kernel>/arch/arm/kernel/entry-common.S, the vector_swi function, which should call sys_read, or sys_ni_syscall if the system call is not implemented).  That sys_read is linked (through clever assembly NR_syscalls numbering in entry-common.S) to the read function defined in <kernel>/fs/read_write.c:

SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
...

Conclusions

  • The read system call sure is fast at least when it does not actually have to read anything.
  • Tentatively, I cannot see the SWI interrupt through ftrace.

Oct 23, 2014

Building the Ubuntu 14.04 LTS kernel on Ubuntu 14.04 LTS

To experiment with device drivers in the self-hosted configuration (the target and the host are the same machine), the Linux kernel must first be built.  I started from an Ubuntu help page.

Clone Ubuntu git: just the source ma'am

I got the Trusty source
  • git clone git://kernel.ubuntu.com/ubuntu/ubuntu-trusty.git
And then copy the config from /boot
  • henry@Zotac64:~/work/ubuntu-trusty$ cp /boot/config-3.13.0-37-generic .config
Bring the config up to date with the latest kernel changes
  • henry@Zotac64:~/work/ubuntu-trusty$ make oldconfig
Build the kernel (for x64, no ARCH definition is needed):
  • henry@Zotac64:~/work/ubuntu-trusty$ make
When done, the vmlinux ELF file is ready.  It's huge--even larger than the Buildroot rootfs for the Zynq eval platform I've been using:
  • henry@Zotac64:~/work/ubuntu-trusty$ ls -lh vmlinux
    -rwxrwxr-x 1 henry henry 151M Oct 23 19:18 vmlinux
Since this kernel should be compatible with my running kernel (where I am typing this blog entry), I can use this repository to write kernel modules and play around WITHOUT changing my working kernel.

The "debian way": not so useful for kernel module development

  • Get the source: apt-get source linux-image-$(uname -r)
Get the build tools with this command:
  • sudo apt-get build-dep linux-image-$(uname -r)
Undefine CROSS_COMPILE, otherwise the build gets confused:
  • export CROSS_COMPILE=
The kernel source is at ~/work/ubuntu/linux-3.13.0.  First generate the config files (use generic) with these commands:
  • chmod a+x debian/scripts/*
    chmod a+x debian/scripts/misc/*
    fakeroot debian/rules clean
    fakeroot debian/rules editconfigs
After entering the last command, type 'n' a few times to keep the existing configs.  Build the kernel with these commands:
  • fakeroot debian/rules clean
    fakeroot debian/rules binary-headers binary-generic
If the build is successful, a set of .deb binary package files will be produced in the directory above the build root directory, as you can see below:
  • henry@Zotac64:~/work/ubuntu$ ls
    linux-3.13.0
    linux_3.13.0-37.64.diff.gz
    linux_3.13.0-37.64.dsc
    linux_3.13.0.orig.tar.gz
    linux-cloud-tools-3.13.0-37-generic_3.13.0-37.64_amd64.deb
    linux-headers-3.13.0-37_3.13.0-37.64_all.deb
    linux-headers-3.13.0-37-generic_3.13.0-37.64_amd64.deb
    linux-image-3.13.0-37-generic_3.13.0-37.64_amd64.deb
    linux-image-extra-3.13.0-37-generic_3.13.0-37.64_amd64.deb
    linux-tools-3.13.0-37-generic_3.13.0-37.64_amd64.deb

Oct 14, 2014

Ways to study the Linux kernel and driver source code

I picked up dorking around with Linux device drivers after 10 years of working on either high-level SW or low-level FW (no OS).  What a big change in Linux!  One of the welcome changes is the ability to browse the kernel source in an IDE.

Current status (so you don't have to read to the end)

  • Eclipse kernel source browsing works to > 80% of my expectation.
  • Haven't tried seriously to cross compile the kernel for ARM within Eclipse, because I use Buildroot.
  • Can browse statically compiled source and set software breakpoints from gdb through kgdb, but cannot set hardware breakpoints on the Zedboard.
  • kgdbwait to delay kernel startup does NOT work on the Zedboard, but DOES work on an x86 target (see my other blog post on the Dell Optiplex x86 target).
  • JTAG debugging from Xilinx SDK works for some functions, but the breakpoints I really want (like the Ethernet driver interrupt) do NOT hit.

Eclipse to browse the code: I prefer this to KDevelop

For me, Eclipse CDT worked much better than KDevelop, because I am already used to the Eclipse shortcuts; I mostly just followed this Eclipse documentation.  One missing piece of information: the Buildroot plugin must first be installed into Eclipse, through menu --> Help --> Install New Software --> Add; the Buildroot Eclipse SDK for Eclipse Luna lives at http://buildroot.org/downloads/eclipse/luna.  I could not tell Eclipse how to cross compile using the CROSS_COMPILE environment variable (which for me is "arm-xilinx-linux-gnueabi-").

henry@Zotac64:~/work/zynq/kernel$ which ${CROSS_COMPILE}gcc
/opt/Xilinx/SDK/2014.2/gnu/arm/lin/bin/arm-xilinx-linux-gnueabi-gcc
What I WANT to set up in Eclipse is:
  1. the location of my defconfig (the file specified to "make ARCH=arm <defconfig>"), and
  2. Let Eclipse build uImage with the make command
  3. [Not necessary] My DTS file, so that it will also build the DTB

kgdb to source debug device driver

Now that I am building not only the kernel but also the whole toolchain and root file system in Buildroot, the ability to cross compile in Eclipse is not so necessary after all.  Instead, the ability to break at a line might be invaluable.  This is one of the pleasant changes to find after 10 years: kgdb is now in the mainline kernel (and even documented!).

Building kgdb into the kernel

Necessary config options:
  • CONFIG_KGDB=y
  • CONFIG_DEBUG_INFO=y -- I already had it on for ftrace, to turn on symbolic data
Optional but recommended (makes sense for me):
  • CONFIG_FRAME_POINTER=y -- save frame information, to allow gdb to more accurately construct stack trace
  • CONFIG_KGDB_SERIAL_CONSOLE=y -- kgdb over Ethernet is not in mainline, so stick with the tried and true.  This also allows kgdbwait and early debugging.
  • # CONFIG_DEBUG_RODATA is not set -- This option marks certain regions of the kernel's memory space as RO.  I did NOT have this option to begin with, so I will leave it alone.  If my processor did NOT have HW breakpoints, I would have to keep this off (SW breakpoints patch the instruction stream).  Zynq has 5 breakpoint and 1 watchpoint registers--which I found kgdb cannot use for some reason.
  • # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set -- I have plenty of disk space but the processor is relatively slow.  Besides, I find it confusing if the compiler optimizes some code away when debugging.

Booting the kernel with kgdb bootarg

Since I statically built kgdboc into the kernel, I don't have to modprobe kgdboc on kernel startup.  Therefore, I can use the same serial console option as always; on the Zedboard, I've been using ttyPS0:

kgdboc=ttyPS0,115200

So at the moment, the full bootargs (much of it carried over from previous blog entry on NFS root file system) is set in U-Boot like this:

zynq-uboot> setenv bootargs 'console=ttyPS0 ip=192.168.1.9:192.168.1.2:192.168.1.1:255.255.255.0 root=/dev/nfs nfsroot=192.168.1.2:/export/root/zed rw earlyprintk kgdboc=ttyPS0'

zynq-uboot> saveenv

The baud rate is actually optional for me, because my FSBL opens the serial console even before Linux starts.  I checked that nothing is broken by booting the modified kernel with the above bootargs.  And because the target has only 1 serial console (ttyPS0), I cannot run kgdbcon through the same device, according to the kgdb documentation on kgdbcon.
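For reference, the colon-separated ip= bootarg above packs client-IP:server-IP:gateway-IP:netmask (further optional fields, e.g. hostname and device, follow).  A quick parse of the value used here (my sketch):

```python
bootarg = "ip=192.168.1.9:192.168.1.2:192.168.1.1:255.255.255.0"

# Field order per the kernel's nfsroot documentation:
# <client-ip>:<server-ip>:<gw-ip>:<netmask>[:<hostname>:<device>:<autoconf>]
client, server, gateway, netmask = bootarg.split("=", 1)[1].split(":")[:4]

print(client, server)  # → 192.168.1.9 192.168.1.2 (Zedboard, then NFS host)
```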

NOT booting the kernel with kgdbwait bootarg (wait for gdb)

This would be a PITA normally, but a life saver when debugging an early boot problem.  Insert "kgdbwait" right after the kgdboc argument in the bootargs, like this:

zynq-uboot> setenv bootargs 'kgdboc=ttyPS0 kgdbwait kgdboc=ttyPS0 kgdbcon earlyprintk console=ttyPS0 ip=192.168.1.9:192.168.1.2:192.168.1.1:255.255.255.0 root=/dev/nfs nfsroot=192.168.1.2:/export/root/zed rw'

Unfortunately for me, this "wait for gdb" feature did NOT work; the kernel startup just proceeds as if the option were not there.  After bootup, the kgdboc sysfs file contains nothing, so I know that the initial registration did not succeed:

$ cat /sys/module/kgdboc/parameters/kgdboc

The strange thing is that kgdb can use ttyPS0 just fine after startup (see below).

Need agent-proxy to use from gdb

To halt the remote kernel over the serial console from gdb, agent-proxy is recommended:

$ git clone git://git.kernel.org/pub/scm/utils/kernel/kgdb/agent-proxy.git
$ cd agent-proxy; make

Agent-proxy starts 2 TCP servers on localhost; the 2nd port is the debug port.

$ ./agent-proxy 2223^2222 0 /dev/ttyACM0,115200
Agent Proxy 1.96 Started with: 2223^2222 0 /dev/ttyACM0,115200

Agent Proxy running. pid: 17078

If we have not set the kgdboc option on the kernel command line, we have to do that on the target BEFORE connecting from gdb:

$ echo ttyPS0 > /sys/module/kgdboc/parameters/kgdboc

Leave agent-proxy running.  We can even connect to the target's serial port by telnetting to the 1st port:

$ telnet localhost 2223

Need cross gdb on the host

I tell Buildroot to build gdb by checking Toolchain --> Build cross gdb for the host.  It will be in <BR2>/output/host/usr/bin/arm-linux-gdb.  gdb cannot work with uImage, which is compressed; the right file to feed to the ARM gdb is <BR2>/output/build/linux-custom/vmlinux:

$ output/host/usr/bin/arm-linux-gdb output/build/linux-custom/vmlinux

We can then connect to the target over serial and start debugging:

(gdb) target remote localhost:2222
Remote debugging using localhost:2222
kgdb_breakpoint () at kernel/debug/debug_core.c:1050
1050 arch_kgdb_breakpoint();
(gdb) where
#0  kgdb_breakpoint () at kernel/debug/debug_core.c:1050
#1  0xc0088e60 in kgdb_initial_breakpoint () at kernel/debug/debug_core.c:949
...


While we are stopped, we can list functions that ARE actually in the kernel--which is a bit of guesswork in Eclipse.  We can always stop the kernel with Ctrl-C in gdb, and then resume with the "continue" command:

(gdb) c
Continuing.

Alas, the target does NOT seem to support HW breakpoints:

(gdb) hb tty_find_polling_driver
Hardware assisted breakpoint 1 at 0xc0260280: file drivers/tty/tty_io.c, line 356.
(gdb) c
Continuing.
Warning:
Cannot insert hardware breakpoint 1.
Could not insert hardware breakpoints:
You may have requested too many hardware breakpoints/watchpoints.

Debugging kgdbwait with kgdb

Looking at configure_kgdboc(void), the possible problems are:
  • kgdboc option might be getting dropped, AFTER printing the kernel boot args (unlikely)
  • CONFIG_CONSOLE_POLL is turned off in the kernel: since kgdboc DOES work after startup, I reject this possibility
  • tty_find_polling_driver() might be failing for ttyPS0
  • ttyPS0 might not have been registered as a console driver yet
Strangely, opt_kgdb_wait is not set up:

(gdb) p __setup_opt_kgdb_wait
$1 = {str = 0x0 <__vectors_start>, setup_func = 0x0 <__vectors_start>, early = 0}
(gdb) p __setup_kgdboc_early_init
$3 = {str = 0x0 <__vectors_start>, setup_func = 0x0 <__vectors_start>, early = 0}
(gdb) p __setup_str_opt_kgdb_wait
$4 = "available"

To figure out why the early param was not handled, I put a kgdb_breakpoint() in start_kernel, at the beginning of parse_early_param(), but that hangs the kernel.  This is an outstanding problem for me to solve...

gdb debug kernel over JTAG

Since kgdbwait is not working for me on the Zedboard, I gave the JTAG route a shot.  My kernel config contains:
  • CONFIG_KGDB=y
  • CONFIG_KGDB_SERIAL_CONSOLE=y
  • CONFIG_DEBUG_INFO=y -- I already had it on for ftrace, to turn on symbolic data
  • # CONFIG_DEBUG_RODATA is not set
  • CONFIG_FRAME_POINTER=y
  • # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
I also deleted the kgdb-related bootargs.  Now fire up the Xilinx SDK (see my xsdk-on-Linux recipe) and switch to the Debug perspective (look at the XSDK window's upper right corner; the "+" icon lets you add new Eclipse perspectives).

In menu --> Run --> Debug Configurations, select Xilinx C/C++ application (System Debugger).
It's a bit confusing, because the System Debugger IS based on TCF (Eclipse Target Communication Framework; see this explanation).  Click the "+" icon in the upper left corner to create a new debug configuration.  I named the configuration "Kernel"; it's probably a good idea to specify the symbol file and the source lookup paths while creating the debug configuration as well.

To load the kernel image into the debugger, right click on Core #0 --> Symbol Files --> Add --> browse to the vmlinux file (the uncompressed kernel image, which sits in the <kernel> directory after a successful build).
I am not sure what the benefit of specifying the kernel start address is (see my earlier blog entry for why the kernel start address is 0x3000000)--other than "I don't know how the debugger would figure out what address to set the breakpoint at without this clue"--but it doesn't seem to hurt.  I also checked "Instructions read", to force use of the HW breakpoints when possible.

On Linux, I don't have to add source lookup path, so I don't have to wonder about the difference between equally unexplained concepts: e.g. Compilation Path vs. Path Mapping.

Click "Debug"; this is where the SDK magic happens: it connects to the target's debug HW over JTAG (through localhost:3121) and auto-discovers the CPUs.
Because I set maxcpus=1 in my bootargs (I am working toward running Linux on Core #0 and bare-metal, hard real-time state machines on Core #1), Linux runs only on Core #0.  When the debugger stops the program, I could see that the kernel was idling in start_kernel --> rest_init().  I can set breakpoints in the source, such as the kernel's main.c, or at a function call by clicking the small inverted triangle in the upper right corner of the breakpoint tab.
Unfortunately, this breakpoint--or even a breakpoint in my Ethernet MAC driver ISR--is NOT hitting, rendering this effort rather futile.  In kgdb, at least the breakpoints I set AFTER the kernel starts hit reliably.

Another problem: I don't know how to restart the program from the XSDK (I posted the question to the Xilinx EDK forum).  Until I figure this out, the only way to debug the statically compiled __initcalls is to get a root console and type "reboot", which means I can only debug a kernel that actually boots and gives me a console.

KDevelop: did not work out well for me

Mostly a note to myself in case I have to set up KDevelop on another Ubuntu desktop, based mostly on http://www.gnurou.org/code/kdevelop-kernel.  Starting from a downloaded kernel (in this case the ADI Zynq kernel, located in a folder we will call <kernel>):

  1. apt-get package is kdevelop
  2. Turn off background parser: menu --> Settings --> Configure KDevelop --> Background Parser group --> uncheck.
  3. Import: menu --> Project --> Open/Import Project --> Browse INTO (don't just select the folder) <kernel> --> Next, then:
    1. I named the project adi_kernel
    2. Generic Project Manager
    3. Finish --> KDevelop goes to work for a few minutes, and then yields a project in the Project explorer window.
  4. Right click this new project --> Open Configuration, and use the "Add" button to add folders to the INCLUDE lists; this gets to be trial-and-error, but I added drivers only on an as-needed basis:
    1. /include/*
    2. /arch/arm/*
    3. /lib/*
    4. /ipc/*
    5. /init/*
    6. /mm/*
    7. /kernel/*
    8. /drivers/amba/*, /drivers/i2c/*, /drivers/gpu/drm/*, /drivers/usb/*
  5. Enable the background parser.  It will start chewing through the source.
  6. To add additional include paths, pull up any source and hover over unresolved header files at the top of the file --> Solve --> Add custom include path.  In the "Setup Custom Include Paths" window:
    1. Storage directory: <kernel>
    2. Specify the following relative paths:
      1. include
      2. arch/arm/include
      3. arch/arm/mach-versatile/include
TODO: consider building the kernel in the IDE

Oct 8, 2014

Using the Zynq OCM Linux device driver

Fixing the controller register address in DTS

While debugging the HDMI display on the Zedboard--specifically the I2C communication to the ADV7511 chip from the kernel--I noticed that the kernel was also complaining about the OCM device driver, something like:

zynq-ocm fffc0000.ps7-ocm: ZYNQ OCM pool: 256 KiB @ 0xe0880000
zynq-ocm fffc0000.ps7-ocm: can't request region for resource [mem 0xfffc0000-0xffffffff]
zynq-ocm: probe of fffc0000.ps7-ocm failed with error -16

Since the error is there even in the dmesg log of the "successful" reference project, it seems that the problem has been there for some time, but nobody did anything about it.  I care about OCM because I am considering it as a primary means of communication with the bare-metal application running on CPU1.  Later in dmesg, an OCM problem shows up again:

zynq_pm_remap_ocm: OCM pool is not available
zynq_pm_suspend_init: Unable to map OCM.

The call sequence is zynq_init_late() --> zynq_pm_late_init() --> zynq_pm_suspend_init() --> zynq_pm_remap_ocm().  zynq_init_late() is part of the platform definition in arch/arm/mach-zynq/common.c:

DT_MACHINE_START(XILINX_EP107, "Xilinx Zynq Platform")
        .smp            = smp_ops(zynq_smp_ops),
        .map_io         = zynq_map_io,
        .init_irq       = zynq_irq_init,
        .init_machine   = zynq_init_machine,
        .init_late      = zynq_init_late,
        .init_time      = zynq_timer_init,
        .dt_compat      = zynq_dt_match,
        .reserve        = zynq_memory_init,
        .restart        = zynq_system_reset,
MACHINE_END

In the HW exported to SDK (system.hdf), there are 4 memory address ranges:
  • OCM controller ps7_ocmc_0: 0xF800C000 - 0xF800CFFF
  • SRAM ps7_ram_0: 0x0 - 0x0002FFFF
  • SRAM ps7_ram_1: 0xFFFF0000 - 0xFFFFFDFF
  • DDR ps7_ddr_0: 0x00100000 - 0x1FFFFFFF; note the bottom 1 MB of the 512 MB DDR is not visible.
And according to ug585-Zynq-7000-TRM section 29, Zynq has 256 KB (0x40000) of RAM and 128 KB of ROM (the ROM is not visible to applications because it is the boot ROM).  The application-addressable 256 KB can be mapped to either the low address range (0x0000_0000 to 0x0003_FFFF) or the high address range (0xFFFC_0000 to 0xFFFF_FFFF), at a granularity of four independent 64 KB sections, via the 4-bit slcr.OCM_CFG[RAM_HI] field.  The addressing scheme used in this project (adv7511_zed.xpr) seems to be along the lines of the example in TRM table 29-5, but there are important differences:
  • There is no SRAM
  • Zedboard only has 512 MB DDR3
Note: the OCM memory itself (not the OCM controller, whose registers are at 0xF800C000 as defined in system.hdf) appears at address 0xFFFC0000.  For example, the xapp1079 CPU0 application writes to 0xFFFF0000 when writing to the OCM.
According to this page, -16 is -EBUSY.  The "zynq-ocm" messages are emitted by <kernel>/arch/arm/mach-zynq/zynq_ocm.c.  How does the device tree (<kernel>/arch/arm/boot/dts/zynq-zed.dts, which is included by zynq-zed-adv7511.dts) describe the memory region 0xFFFC0000 - 0xFFFFFFFF?

ps7_ocmc_0: ps7-ocmc@f800c000 {
  compatible = "xlnx,zynq-ocmc-1.0";
  interrupt-parent = <&ps7_scugic_0>;
  interrupts = <0 3 4>;
  reg = <0xf800c000 0x1000>;
} ;

But this is itself inconsistent with /proc/device-tree/amba@0/ps7-ocm@0xfffc0000 on the booted target:

$ cat /proc/device-tree/amba@0/ps7-ocm@0xfffc0000/name
ps7-ocm

$ cat /proc/device-tree/amba@0/ps7-ocm@0xfffc0000/compatible 
xlnx,ps7-ocmc-1.00.axlnx,zynq-ocmc-1.0

$ hexdump /proc/device-tree/amba@0/ps7-ocm@0xfffc0000/interrupts
0000000 0000 0000 0000 0300 0000 0400
000000c

$ hexdump /proc/device-tree/amba@0/ps7-ocm@0xfffc0000/reg
0000000 fcff 0000 0400 0000
0000008

The "ZYNQ OCM pool" dmesg is from zynq_ocm_probe() in <kernel>/arch/arm/mach-zynq/zynq_ocm.c:

152 for (i = 0; i < ZYNQ_OCM_BLOCKS; i++) {
153   unsigned long size;
154   void __iomem *virt_base;
155 
156   /* Skip all zero size resources */
157   if (zynq_ocm->res[i].end == 0)
158     break;
...
174   dev_info(&pdev->dev, "ZYNQ OCM pool: %ld KiB @ 0x%p\n",
175            size / 1024, virt_base);
176 }
177 
178 /* Get OCM config space */
179 res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
180 zynq_ocm->base = devm_ioremap_resource(&pdev->dev, res);
181 if (IS_ERR(zynq_ocm->base))
182   return PTR_ERR(zynq_ocm->base);

I don't know yet the mapping from the physical to the virtual address, but the dev_info line prints only once because the ZYNQ_OCM_BLOCKS (4) 64 KB blocks were concatenated into a contiguous 256 KB in an earlier loop.  The EBUSY occurs when the device driver calls devm_request_mem_region()--which is #defined to __devm_request_region(dev, &iomem_resource, (start), (n), (name)), implemented in <kernel>/kernel/resource.c--with 0xFFFC0000.  __devm_request_region() can only return NULL for 2 reasons:
  • devres_alloc() fails--since this just calls kmalloc in the end, this is unlikely to fail
  • __request_region() fails--this probably hits the HW, so is more likely to fail, but not clear why.
With a cold trail, I googled for "OCM driver fails to load" and turned up a half-year-old thread.  It appears that the failure is due to an incorrect OCM *controller* register address passed in the DTB: according to the corrected DTS, it should be <0xf800c000 0x1000>, but the booted target thinks it is <0xfffc0000 0x40000> (decoding the big-endian cells from the hexdump above), which is CLEARLY incorrect: that is the OCM memory range, not the controller's register region, which is only 4 KB.  Even more bizarre, when I reverse-compile the generated DTB back into DTS like this and look at it, THAT matches what is loaded on the target but not the DTS source I had been reading!

$ <kernel>/scripts/dtc/dtc -I dtb -O dts <kernel>/arch/arm/boot/dts/zynq-zed-adv7511.dtb  > devicetree.dts

The ocm section in reverse compiled DTS appears as follows:

ps7-ocm@0xfffc0000 {
  compatible = "xlnx,ps7-ocmc-1.00.a", "xlnx,zynq-ocmc-1.0";
  interrupt-parent = <0x1>;
  interrupts = <0x0 0x3 0x4>;
  reg = <0xfffc0000 0x40000>;
};

The "compatible" field matches what the target loaded, but where is "xlnx,ps7-ocmc-1.00.a" coming from?  It turns out that ocmc is NOT the only device node whose compatible field gets expanded by the DTS compiler; for example, ps7-xadc@f8007100's compatible field was "xlnx,zynq-xadc-1.00.a" in the DTS, but mysteriously expanded to "xlnx,zynq-xadc-1.00.a", "xlnx,ps7-xadc-1.00.a" in the DTB.  The result is the same even if I run the DTC (device tree compiler) manually and then reverse-compile the output:

<kernel>/scripts/dtc$ ./dtc -I dts -O dtb -o ../../arch/arm/boot/dts/zynq-zed-adv7511.dtb ../../arch/arm/boot/dts/zynq-zed-adv7511.dts

$ <kernel>/scripts/dtc/dtc -I dtb -O dts <kernel>/arch/arm/boot/dts/zynq-zed-adv7511.dtb  > devicetree.dts

I finally figured out that I was reading the wrong DTS file: instead of zynq-zed.dts, I should have been reading the zynq.dtsi.  I fixed the ocmc entry in <kernel>/arch/arm/boot/dts/zynq.dtsi like this:

ps7_ocmc_0: ps7-ocmc@f800c000 {
  compatible = "xlnx,ps7-ocmc-1.00.a", "xlnx,zynq-ocmc-1.0";
  interrupt-parent = <&gic>;
  interrupts = <0 3 4>;
  reg = <0xf800c000 0x1000>;
} ;

When I tried the new devicetree.dtb, the initial OCM pool creation error went away:

zynq-ocm f800c000.ps7-ocmc: ZYNQ OCM pool: 256 KiB @ 0xe0880000

The driver has now found all the OCM in the HW (determined by reading the OCM config register) and mapped the physical address to the kernel virtual address 0xe0880000 (constant only for THAT kernel build; another build may map it elsewhere)--which is NOT the same as a userspace virtual address.  The translation table entry itself is created in

virt_base = devm_ioremap_resource(&pdev->dev, &zynq_ocm->res[i]);

which is where the virtual address printed in the dmesg above comes from.  I checked the resource flag right before it is ioremapped, and the IORESOURCE_CACHEABLE bit was cleared, so this memory should not be cached at all => no DMB required.  The physical memory address range 0xFFFC0000 ~ 0xFFFFFFFF shows up in /proc because the driver calls gen_pool_add_virt().  You can see the OCM registered for the last 256 KB of the memory map:

# cat /proc/iomem 
00000000-1fdfffff : System RAM
  00008000-0062175b : Kernel code
  00658000-006b57f3 : Kernel data
...
f800c000-f800cfff : /amba@0/ps7-ocmc@f800c000
fffc0000-ffffffff : f800c000.ps7-ocmc

Using the OCM

As dmesg above shows, the coalesced OCM memory is ioremapped to 0xE0880000 and then added to something called gen_pool (through gen_pool_add_virt).  To test writing/reading OCM, I need a simple userspace application--which is a bit complicated when I am cross-compiling (using Buildroot in this case).

Buildroot Eclipse integration

In a previous blog entry, I looked into the Buildroot Eclipse integration, which exposes Buildroot's cross-compile environment in Eclipse.  Let's take it out for a spin.  The Buildroot integration only supports an empty executable project.

If I create a new file main.cpp in the ocm project, I can write a few lines of code:

#include <stdio.h>

int main(int argc, char* argv[]) {
  printf("ocm");
  return 0;
}

Building this in Eclipse is easy, and runs the correct cross tools to generate an executable ocm:

18:40:34 **** Build of configuration debug for project ocm ****
make all 
Building file: ../main.cpp
Invoking: Buildroot ARM C++ Compiler (/mnt/work/zed/buildroot/output)
/mnt/work/zed/buildroot/output/host/usr/bin/arm-buildroot-linux-gnueabi-g++ -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"main.d" -MT"main.d" -o "main.o" "../main.cpp"
Finished building: ../main.cpp
Building target: ocm
Invoking: Buildroot ARM CC Linker (/mnt/work/zed/buildroot/output)
/mnt/work/zed/buildroot/output/host/usr/bin/arm-buildroot-linux-gnueabi-g++  -o "ocm"  ./main.o   
Finished building target: ocm

18:40:35 Build Finished (took 413ms)

If I just copy this exe to the target's /bin folder, I should be able to run this program.  And this is where the value of NFS mounting the rootfs shines: I can just copy the file into the host's NFS export folder!

~/work/zed/workspace$ sudo cp ocm/debug/ocm /export/root/zedbr2/bin/

And when I run this on the target, I see the string "ocm", as expected:

# /bin/ocm
ocm

Having confirmed that a command-line program works, I can move on to working with the OCM through the mmap system call.

Userspace application to write to OCM

Here is a simple program (requiring root, since it opens /dev/mem) to write 256 KB to the OCM, and then read it back:

#include <iostream>
#include <errno.h>
#include <fcntl.h> //open
#include <unistd.h> //close
#include <sys/mman.h> //mmap
#include <stdio.h>
#include <stdint.h>

using namespace std;

int main(int argc, char* argv[]) {
  int err = 0;
  uint32_t i;
#define OCM_SIZE (256*1024)
#define OCM_LOC 0xFFFC0000 //0xe0880000
  printf("ocm: 256 KB @ 0x%x\n", OCM_LOC);

  void* ocm = NULL;
  uint32_t* buf;
  int memf = open("/dev/mem"
                  , O_RDWR | O_SYNC //do I want caching?
                  );
  if(memf < 0) {
    err = errno;
    cerr << "open " << err;
    goto err_;
  }
  ocm = mmap(NULL, OCM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, memf,
             OCM_LOC);
  if(ocm == MAP_FAILED) {
    err = errno;
    cerr << "mmap " << err;
    goto err_close;
  }
  cout << "Writing to OCM" << endl;
  for(buf = (uint32_t*)ocm, i = 0; i < OCM_SIZE/4; ++i, ++buf)
    *buf = i;// << 16;
  cout << "Reading OCM" << endl;
  for(buf = (uint32_t*)ocm, i = 0; i < OCM_SIZE/4; ++i, ++buf) {
    if(*buf != i)//(i << 16))
      fprintf(stderr, "ocm[%x] = %x\n", i, *buf);
  }

err_munmap:
  munmap(ocm, OCM_SIZE);
err_close:
  if(memf >= 0) close(memf);
err_:
  return err;
}

When you run it, you should not see any mismatch, but I sometimes see a problem at address 0xFFFCFCF2--and ONLY at that address.  Why?

APPENDIX: Buildroot structure to cross-compile a userspace application (NOT working yet!)

Buildroot manual section 9.1 explains the recommended in-tree folders for board-specific packages.  Since the OCM test application only makes sense for the Zynq platform, I will comply with the recommended structure.  I only need the test application for now, so the following steps are the simplest way to get 1 board-specific package into Buildroot:

First, add to Buildroot's top-level Config.in:

menu "Zedboard specific packages"
source "package/avnet/zedboard/Config.in"
endmenu

This will pull in (my) Zedboard-specific packages.  Then the board-level package/avnet/zedboard/Config.in can pull in individual packages (just 1 for now):

source "package/avnet/zedboard/ocm/Config.in" 

package/avnet/zedboard/ocm/Config.in should offer help for make {x|n|g}config:

config BR2_PACKAGE_ZYNQ_OCM
       bool "Zynq OCM user app"
       default n
       help
         Select this if you want a test application for the Zynq OCM
         (on-chip memory).
         Requires CONFIG_ARCH_ZYNQ in kernel config.

When I run make xconfig in buildroot, the new option shows up under the package category.  The package will also eventually need Makefile plumbing; for now I just create an empty board-level .mk placeholder:

~/work/zed/buildroot$ touch package/avnet/zedboard/zedboard.mk
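Since this appendix is unfinished, the ocm package itself still has no .mk.  A hypothetical sketch of package/avnet/zedboard/ocm/ocm.mk using Buildroot's generic-package infrastructure (the source path and build commands are my assumptions, untested):

```makefile
################################################################################
# ocm: Zynq OCM test application (hypothetical sketch, untested)
################################################################################

OCM_VERSION = 1.0
OCM_SITE = $(TOPDIR)/package/avnet/zedboard/ocm/src
OCM_SITE_METHOD = local

define OCM_BUILD_CMDS
	$(TARGET_CXX) $(TARGET_CXXFLAGS) -o $(@D)/ocm $(@D)/main.cpp
endef

define OCM_INSTALL_TARGET_CMDS
	$(INSTALL) -D -m 0755 $(@D)/ocm $(TARGET_DIR)/usr/bin/ocm
endef

$(eval $(generic-package))
```

OCM_SITE_METHOD = local tells Buildroot to copy the source tree from OCM_SITE instead of downloading a tarball, which suits a board-local test program like this one.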

OCM interrupt useless for application

I wondered if I could notify the reader of the OCM of new messages by raising an interrupt, but found in the Zynq TRM section 29.2.5 that the OCM interrupts are only asserted by the HW on a parity error or a lock request.