tag:blogger.com,1999:blog-40320203372475826192024-03-20T19:23:52.167-07:00Henry ChoiHenry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.comBlogger91125tag:blogger.com,1999:blog-4032020337247582619.post-9553937170942314552017-12-23T07:12:00.001-08:002017-12-23T07:12:23.670-08:00<br />
Xenomai on Raspberry Pi<br />
I am still trying to find enough free time to work on AMP (asymmetric multiprocessing) on RPi2 or RPi3, but until that works, I want to use Xenomai as a fallback. I discussed the <a href="https://docs.google.com/document/d/1lA5D7RHcxb8zmye7JauXWZ46jxm4T9UO2cUtXoxHX6g/edit">real-time capabilities of the Xenomai kernel patch</a> several years ago. I was on x86 at that time, and Xenomai has since moved to version 3, so I want to see whether the new Xenomai Cobalt architecture on the ARM kernel is better or worse than what I measured on x86 five years ago.<br />
<h2>
Building the Xenomai kernel</h2>
<div>
After ensuring that I could build an RPi kernel that still boots the Raspbian distribution I downloaded (I expected this to work, since I did not change the kernel config significantly apart from CONFIG_DEBUG_INFO), I started changing the kernel config for Xenomai. On x86, I built the kernel and a lightweight rootfs with Buildroot. This time, I don't need the rootfs, but it's still easier to have Buildroot build Xenomai.<br />
<h3>
Cross-compiling the RPi kernel with Buildroot</h3>
Before building the Xenomai kernel, I practiced building a regular kernel that can still boot the Raspbian image downloadable from raspberrypi.org, following the steps outlined on <a href="http://raspberrypi.org/">raspberrypi.org</a>. One tip is NOT to use the toolchain recommended on the webpage; gcc version 4.8 has a bug that errors out during a kernel module build later. I found that the other toolchain, also available from the <a href="https://github.com/raspberrypi/tools">RPi tools repo</a>, works:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">export PATH=$HOME/rpi/tools/arm-bcm2708/arm-rpi-4.9.3-linux-gnueabihf/bin:$PATH</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">export CROSS_COMPILE=arm-linux-gnueabihf-</span><br />
<br />
To build a kernel that is just like the current kernel, I copy the kernel config from the known good kernel. This method
has the advantage that you can just keep reusing the kernel modules
that came with your install of Raspbian.<br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi$ sudo modprobe configs</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi$ gunzip -c /proc/config.gz > ~/.config </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<br />
Once I copy the .config file above to the Linux repo on the development host, I can bootstrap the Linux repo
(rpi-4.9.y branch, to match the Raspbian image downloaded from the
Raspberry Pi site) by applying the .config file saved above, like this
(the make targets are as recommended by the above RPi document):<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/linux$ cp </span></span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">../BAK/.config .</span></span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/linux$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make oldconfig </span><br />
<br />
Save this config as a defconfig by running this command:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/linux$ ARCH=arm make savedefconfig</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/linux$ mv defconfig ../BAK/rpi_linux_defconfig</span></span></span></span></div>
<div>
<br /></div>
<div>
To build just the kernel (no need to build the kernel modules if the config has not changed), just the zImage target is sufficient.</div>
<div>
<br />
<span style="font-family: "courier new" , "courier" , monospace;">ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make zImage</span><br />
<br />
The resulting binary arch/arm/boot/zImage can be copied to the SD card, and the config.txt can point to it:</div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">kernel=zImage</span><br />
<h3>
Patching the RPi kernel with Xenomai Cobalt co-kernel</h3>
Before patching the currently working Linux kernel, I copy the kernel tree to another folder: ~/rpi/linux.xeno.<br />
<br />
The Xenomai kernel patch is available as an I-pipe patch from the <a href="https://xenomai.org/downloads/ipipe/">download area</a>. Since my RPi kernel version is 4.9.41, I picked the patch closest to it: ipipe-core-4.9.38-arm-3.patch<br />
<br />
I then ran the script that Xenomai guys refer to in their <a href="https://xenomai.org/installing-xenomai-3-x/">install guide</a>:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/xenomai-3.0.6$
scripts/prepare-kernel.sh --linux=../linux.xeno --ipipe
../ipipe-core-4.9.38-arm-3.patch --arch=arm</span><br />
<br />
Unfortunately, this fails because of the differences between the mainline kernel version 4.9.38 and the RPi kernel 4.9.41, despite the two being only 3 patch releases apart. I worked around it by:<br />
<ol>
<li>deleting the hunks that fail from the *.patch file</li>
<li>applying the reduced patch</li>
<li>manually applying the deleted hunks to the files that failed the automatic patch</li>
<li>saving the resulting diff as a new patch file (<span style="font-family: "courier new" , "courier" , monospace;">git diff > ../ipipe-core-4.9.41-arm-3.patch</span>)</li>
</ol>
<h3>
Configuring the kernel for Cobalt</h3>
On recommendation from the Xenomai folks, I disabled the following kernel features:<br />
<ul>
<li>CONFIG_CPU_FREQ</li>
<li>CONFIG_CPU_IDLE</li>
<li>CONFIG_KGDB</li>
</ul>
<h3>
Build the kernel</h3>
Given the massive changes to the kernel, I can no longer get away with just building the kernel proper: I have to also build the kernel modules.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/linux.xeno$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make zImage modules </span><br />
<br />
The built modules must be copied to the SD card's ext4 partition, by pointing INSTALL_MOD_PATH at wherever the ext4 partition is mounted on your dev host (when the SD card is inserted into the dev host).<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">sudo make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- INSTALL_MOD_PATH=<mount point> modules_install</span><br />
<br />
The new kernel is still named zImage, and copied to the boot partition of the SD card. RPi bootloader's config.txt points at this zImage:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">kernel=zImage</span><br />
<br />
There is a tense moment when the modified SD card is inserted into the RPi and I turn on the RPi. I seem to have been lucky: this kernel boots the Raspbian image, and the kernel log shows the I-pipe and the Xenomai Cobalt core initializing during bootup:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:~ $ dmesg | grep -Ei 'xeno|pipe'<br />[ 0.000000] I-pipe, 19.200 MHz clocksource, wrap in 960767920505705 ms<br />[ 0.000000] clocksource: ipipe_tsc: mask: 0xffffffffffffffff max_cycles: 0x46d987e47, max_idle_ns: 440795202767 ns<br />[ 0.001474] Interrupt pipeline (release #3)<br />[ 0.172468] clocksource: Switched to clocksource ipipe_tsc<br />[ 0.278993] [Xenomai] scheduling class idle registered.<br />[ 0.279039] [Xenomai] scheduling class rt registered.<br />[ 0.279289] I-pipe: head domain Xenomai registered.<br />[ 0.284432] [Xenomai] Cobalt v3.0.6 (Stellar Parallax) </span><br />
Since Xenomai is a way to run real-time code in userspace, it makes sense that I need to build userspace libs. <br />
<h2>
Building the Xenomai userspace</h2>
In the same scripts/ folder where I ran prepare-kernel.sh above, there is a "bootstrap" script, which generates the "configure" script.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/xenomai-3.0.6$ scripts/bootstrap </span><br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/xenomai-3.0.6$</span><span style="font-family: "courier new" , "courier" , monospace;"> ./configure --with-core=cobalt --enable-smp --host=arm-linux-gnueabihf CFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard" LDFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard"</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"></span><span style="font-family: "courier new" , "courier" , monospace;"></span><span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
I tried -mcpu=cortex-a53, but that seemed to be incompatible with the --enable-smp option. Maybe gcc-4.9 doesn't know about cortex-a53. The --host argument should be the same as the CROSS_COMPILE environment variable, but without the trailing dash. This is consistent with the target you can see when you run gcc natively on the Raspberry Pi (gcc -v). I suspect that I could use a more modern version of the FPU, but Buildroot seems to use VFPv4 as well (as of Buildroot version 2017.10), so I am not going to be aggressive about it. Keep the build output in a separate folder:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br />parallels@ubuntu:~/rpi/xenomai-3.0.6$ make DESTDIR=~/build install</span><br />
<br />
The build output will then be neatly packaged in the ~/build folder.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/xenomai-3.0.6$ ls ~/build<br />dev usr</span><br />
<h2>
Deploying the Xenomai userspace</h2>
I created 3 separate tarballs in this folder (are the headers and the lib tarballs actually used?):<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ cd ~/build</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">$ tar czf xeno-userspace.tar.gz *</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">$ cd usr/xenomai/include/<br />$ tar czf ../../../xeno-headers.tar.gz *<br />$ cd ../lib</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">$ tar czf ../../../xeno-libs.tar.gz *</span><br />
<br />
The Xenomai userspace tarball has to be extracted to the root folder / on the RPi:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:~ $ cd /</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:/ $ sudo tar xzf ~/xeno-userspace.tar.gz </span><br />
<br />
This will put the Xenomai device nodes in the /dev folder, and the userspace headers and libs in the /usr/xenomai folder. The lib folder has to be added to the dynamic linker's search path through a file in /etc/ld.so.conf.d:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:/ $ echo "/usr/xenomai/lib/" | sudo tee /etc/ld.so.conf.d/xenomai.conf</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:/ $ sudo ldconfig -v</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:/ $ sync</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:/ $ sudo reboot</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"></span><br />
ldconfig updates the dynamic linker's cache so that it can find the Xenomai shared libraries. Calling sync to flush to the disk is overkill in my opinion, but I thought it a harmless, neat trick.<br />
<h2>
Testing the Xenomai userspace</h2>
The latency test exercises the Xenomai libs:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:/ $ sudo /usr/xenomai/bin/latency</span></span><br />
<br />
Note that running Xenomai requires root privilege because the Xenomai devices are accessible only to the root user by default (I ignored the build/config option to expose the Xenomai devices to a designated group, because I think I have to call mlockall as root anyway). Here are my typical results (tabulated every second) on RPi3:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">RTH|----lat min|----lat avg|----lat max|-overrun|---msw|---lat best|--lat worst<br />RTD| 1.310| 2.229| 8.602| 0| 0| -1.166| 32.842<br />RTD| 1.252| 2.234| 10.471| 0| 0| -1.166| 32.842</span><br />
<br />
I stressed the system by scp'ing a 9 GB file (loading the Ethernet--USB really--and the filesystem and the SD card driver), and by installing beefy APT packages like emacs. You can see that the worst-case latency over the whole run is more than 2 orders of magnitude larger than the average latency.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">RTT| 00:29:04 (periodic user-mode task, 1000 us period, priority 99)<br />RTH|----lat min|----lat avg|----lat max|-overrun|---msw|---lat best|--lat worst<br />RTD| 0.793| 2.277| 14.126| 0| 0| -1.961| <span style="background-color: yellow;">418.934</span></span><br />
<br />
Therefore, it appears that while Xenomai is certainly much better than stock Linux, it still shouldn't be trusted to run hardware loops faster than 100~200 Hz (using my rule of thumb that the jitter should be < 1% of the sampling period).<br />
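The two numeric claims above can be checked from a single row of the latency output. The following is only a sketch: the helper names (parse_rtd, worst_to_avg, max_loop_hz) are made up for illustration and are not part of Xenomai, and the parser assumes the row layout shown above. If the worst-case latency is taken as the jitter, the 1% rule of thumb works out to roughly 24 Hz; the 100~200 Hz range corresponds to accepting a worst-case jitter of roughly 4% to 8% of the period.

```c
#include <assert.h>
#include <stdio.h>

/* One "RTD|" row of the latency tool's output; latencies in microseconds. */
struct rtd_row {
    double lat_min, lat_avg, lat_max;
    int overrun, msw;
    double lat_best, lat_worst;
};

/* Parse one RTD row. Returns 1 on success, 0 if the line does not match. */
int parse_rtd(const char *line, struct rtd_row *r)
{
    assert(line != NULL && r != NULL);
    return sscanf(line, "RTD|%lf|%lf|%lf|%d|%d|%lf|%lf",
                  &r->lat_min, &r->lat_avg, &r->lat_max,
                  &r->overrun, &r->msw,
                  &r->lat_best, &r->lat_worst) == 7;
}

/* Ratio of the worst latency seen so far to the last one-second average. */
double worst_to_avg(const struct rtd_row *r)
{
    return r->lat_worst / r->lat_avg;
}

/* Fastest loop rate (Hz) for which `jitter_us` stays within
 * `jitter_fraction` of the loop period (0.01 for the 1% rule). */
double max_loop_hz(double jitter_us, double jitter_fraction)
{
    assert(jitter_us > 0.0);
    return jitter_fraction * 1e6 / jitter_us;
}
```

For the stressed run above, worst_to_avg() comes out at about 184, and max_loop_hz(418.934, 0.01) at just under 24 Hz.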
<h2>
Show stopper: random crash after a few hours</h2>
I noticed system freezes when I left the latency test running overnight.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">Dec 20 17:40:49 raspberrypi kernel: [ 9091.384209] brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -110</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">Dec 20 17:40:49 raspberrypi kernel: [ 9091.384232] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK</span></div>
<div>
<br /></div>
<div>
I disabled wireless in general (a bad idea for robust operation in a noisy real-world environment) in /boot/config.txt:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">dtoverlay=pi3-disable-bt</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">dtoverlay=pi3-disable-wifi</span></div>
</div>
<div>
<br /></div>
<div>
These steps do not seem to solve the problem. I still think it has to do with some device drivers expecting power management or CPU frequency adjustment, which I turned off at the kernel level above, but the root cause is unclear.</div>
<div>
<br /></div>
<div>
<br /></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com1tag:blogger.com,1999:blog-4032020337247582619.post-82786962277018283832017-11-25T14:30:00.001-08:002017-11-26T08:47:16.038-08:00<br />
"Bare metal" control of the New Haven OLED<br />
Many embedded devices have no display, and make do with status LEDs (think about your home router or switch). One step up is a small graphical display--like the New Haven OLED display I bought a few years ago to study the Linux frame buffer device drivers. Originally, I was going to experiment on my Zedboard (which packs a Xilinx Zynq SoC), but since then I've embraced the Raspberry Pi project. So in this blog entry, I create a status display GUI on my NHD-1.27-12896UGC3, which comes with its own display controller, the SSD1351. When complete, the RPi Linux-driven terminal looks like this, provided the soldering and connections are OK.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNqn2Koq3N-ZWI_35tA95_Te8gWaS-SWe88eMKu1H_r_IlfBXvcviPE9SYlvkmJ4Ixhww-yIXIUbrwsT-d5sod5Bq5TO-vDyKDSkx8da-t7qwHVmDzpNw1tzaIzicAnhGPCpHdxlPvQOn2/s1600/IMG_0032.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="378" data-original-width="504" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNqn2Koq3N-ZWI_35tA95_Te8gWaS-SWe88eMKu1H_r_IlfBXvcviPE9SYlvkmJ4Ixhww-yIXIUbrwsT-d5sod5Bq5TO-vDyKDSkx8da-t7qwHVmDzpNw1tzaIzicAnhGPCpHdxlPvQOn2/s1600/IMG_0032.jpg" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">RPi console output on NHD-1.27-12896UGC3. Note the crisp color and the deep black.</td></tr>
</tbody></table>
<h2>
RPi Linux supports SSD1351 out of the box</h2>
The Raspberry Pi Linux kernel (can be cloned from <span style="background-color: #dfdbc3; color: #4d2f2d; font-family: "courier"; font-size: 14px;">https://github.com/raspberrypi/linux</span>) already has the matching kernel module for the display controller, which you can verify for yourself by running the following commands on <i>your</i> Raspberry Pi:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi$ sudo modprobe configs</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi$ gunzip -c /proc/config.gz > ~/BAK/.config </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
You can see SSD1351 support in the resulting file:<br />
<br />
<div style="background-color: #dfdbc3; color: #4d2f2d; font-family: Courier; font-size: 14px; font-stretch: normal; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">CONFIG_FB_TFT_</span><span style="background-color: #4d2f2d; color: #dfdbc3;">SSD1351</span><span style="font-variant-ligatures: no-common-ligatures;">=m</span></div>
<br />
This config option pulls in fb_ssd1351, which is one of the fbtft (TFT frame buffer) drivers used by the devices enumerated in fbtft_device.c: the 2 SSD1351 devices enumerated there are <a href="https://learn.adafruit.com/adafruit-pioled-128x32-mini-oled-for-raspberry-pi/usage">pioled</a> and <a href="https://www.freetronics.com.au/pages/oled128-quickstart-guide-raspberry-p">freetronicsoled128</a>, neither of which is the NHD 1.27 128x96 device I have. They are, however, driven similarly: 20 MHz SPI in mode 0 (of the 4 SPI modes available; in mode 0, the slave samples SDIN on the rising edge of SCLK). One puzzle is how pioled can drive a display only 32 pixels tall when fb_ssd1351.c hard-codes the initial height to 128, but let's see whether freetronicsoled128 can handle a "height=96" modprobe argument. It looks like the probe code (fbtft_probe_dt) handles a whole bunch of options:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->display.width = fbtft_of_value(node, "width");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->display.height = fbtft_of_value(node, "<span style="background-color: yellow;">height</span>");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->display.regwidth = fbtft_of_value(node, "regwidth");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->display.buswidth = fbtft_of_value(node, "buswidth");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->display.backlight = fbtft_of_value(node, "backlight");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->display.bpp = fbtft_of_value(node, "bpp");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->display.debug = fbtft_of_value(node, "debug");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->rotate = fbtft_of_value(node, "rotate");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->bgr = of_property_read_bool(node, "bgr");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->fps = fbtft_of_value(node, "fps");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->txbuflen = fbtft_of_value(node, "txbuflen");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">pdata->startbyte = fbtft_of_value(node, "startbyte");</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">of_property_read_string(node, "gamma", (const char **)&pdata->gamma);</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">if (of_find_property(node, "led-gpios", NULL))</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>pdata->display.backlight = 1;</span><br />
<br />
The default module properties in the kernel code for this device bear remembering:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">.name = "freetronicsoled128",</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">.spi = &(struct spi_board_info) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.modalias = "fb_ssd1351",</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.max_speed_hz = 20000000,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.mode = SPI_MODE_0,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.platform_data = &(struct fbtft_platform_data) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.display = {</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.buswidth = 8,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.backlight = FBTFT_ONBOARD_BACKLIGHT,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>},</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.bgr = true,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>.gpios = (const struct fbtft_gpio []) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>{ "reset", 24 },</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>{ "dc", 25 },</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>{},</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>},</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>}</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">}</span><br />
<br />
The reset and dc pins above are the RES# and D/C# pins defined in the SSD1351 controller interface table shown below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTOk9U94OsFsBTqX5tVzbcCNKAXPHJl3ydsXCN1-y5QvBzzs2qFrLhcvNx4I2nTvmsUg3gvfo8d2gjtEKdS-WdR1e40e79BoillTGFkZFZWKgxY-k0lz5gmsGdweKJT5Y7N_6xoYa99f5N/s1600/Screen+Shot+2017-11-24+at+8.21.04+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="355" data-original-width="1600" height="142" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTOk9U94OsFsBTqX5tVzbcCNKAXPHJl3ydsXCN1-y5QvBzzs2qFrLhcvNx4I2nTvmsUg3gvfo8d2gjtEKdS-WdR1e40e79BoillTGFkZFZWKgxY-k0lz5gmsGdweKJT5Y7N_6xoYa99f5N/s640/Screen+Shot+2017-11-24+at+8.21.04+AM.png" width="640" /></a></div>
Since the device entry names the D/C# pin explicitly (rather than the pin being tied low as for the 3-wire interface), the device driver is expecting to use the 4-wire SPI interface above--through RPi GPIO pin 25. The kernel config did not set CONFIG_FBTFT_ONBOARD_BACKLIGHT because the device doesn't need a backlight (it's an OLED!). The NHD-1.27-12896UGC3 data sheet shows the recommended wiring for the 4-wire SPI mode as follows:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhaXLvZ9aaj2pmrIrzbVSyJspfGmTfSXpPfP_5jGuaiT2ZD_gJc1jUgD7nZbHMPJ8A8kriEcuSEaYwIGWa89OVwNd0tjyEwLY76L0FB6ZddBlzG9QlZZCrYsome50Ro9atGZ6B35YF6Kf6K/s1600/Screen+Shot+2017-11-25+at+9.04.43+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="378" data-original-width="409" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhaXLvZ9aaj2pmrIrzbVSyJspfGmTfSXpPfP_5jGuaiT2ZD_gJc1jUgD7nZbHMPJ8A8kriEcuSEaYwIGWa89OVwNd0tjyEwLY76L0FB6ZddBlzG9QlZZCrYsome50Ro9atGZ6B35YF6Kf6K/s1600/Screen+Shot+2017-11-25+at+9.04.43+AM.png" /></a></div>
Including the 3.3 V power and ground, only 7 wires connect the display module to the RPi GPIO header:<br />
<ul>
<li>D/C: RPi P1.22, GPIO.25</li>
<li>SCLK: RPi P1.23 (AKA SCLK)</li>
<li>SDIN: RPi P1.19 (AKA MOSI)</li>
<li>/RES: RPi P1.18, GPIO.24</li>
<li>/CS: RPi P1.24 (AKA CE0), GPIO.8</li>
</ul>
When all soldered and wired, the connection looks like this (ignore the logic analyzer probes on the display pins).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOn3xA758zUndIEDT1FzVfdS7qL7H3HGA_46sb3dYcwyLYyYhGDwSPqeuaa3phjg5JN_0AWhrKrHgn6jqu16YhET5Nw91dxNkwJDKEBP7XxOkDKRAlB8ujpunELHY2ftwU3_mXH6Y2CLO8/s1600/IMG_0033.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="756" data-original-width="1008" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOn3xA758zUndIEDT1FzVfdS7qL7H3HGA_46sb3dYcwyLYyYhGDwSPqeuaa3phjg5JN_0AWhrKrHgn6jqu16YhET5Nw91dxNkwJDKEBP7XxOkDKRAlB8ujpunELHY2ftwU3_mXH6Y2CLO8/s640/IMG_0033.jpg" width="640" /></a></div>
<br />
To load the kernel module, I supply the display height (which is different from the default 128 pixels) as a module argument like this:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">sudo modprobe fbtft_device name=freetronicsoled128 height=96</span><br />
<br />
But according to the kernel log, the height argument was ignored:<br />
<br />
<div style="background-color: #dfdbc3; color: #4d2f2d; font-family: Courier; font-size: 14px; font-stretch: normal; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">Nov 25 17:22:38 hchoi2-RPi1B kernel: [ 1709.489760] graphics fb1: fb_ssd1351 frame buffer, 128x128, 32 KiB video memory, 4 KiB DMA buffer memory, fps=20, spi0.0 at 20 MHz</span></div>
<br />
Anyhow, the module load succeeded and I now have another frame buffer device (in addition to the default HDMI out):<br />
<br />
<div style="background-color: #dfdbc3; color: #00a400; font-family: Courier; font-size: 14px; font-stretch: normal; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"><b>pi@hchoi2-RPi1B</b></span><span style="color: #4d2f2d; font-variant-ligatures: no-common-ligatures;">:</span><span style="color: #4324d4; font-variant-ligatures: no-common-ligatures;"><b>~ $</b></span><span style="color: #4d2f2d; font-variant-ligatures: no-common-ligatures;"> ls /dev/fb</span></div>
<div style="background-color: #dfdbc3; color: #4d2f2d; font-family: Courier; font-size: 14px; font-stretch: normal; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">fb0 fb1 </span></div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
I can then use the 2nd frame buffer as the console output:<br />
<br />
<div style="background-color: #dfdbc3; color: #00a400; font-family: Courier; font-size: 14px; font-stretch: normal; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"><b>pi@hchoi2-RPi1B</b></span><span style="color: #4d2f2d; font-variant-ligatures: no-common-ligatures;">:</span><span style="color: #4324d4; font-variant-ligatures: no-common-ligatures;"><b>~ $</b></span><span style="color: #4d2f2d; font-variant-ligatures: no-common-ligatures;"> con2fbmap 1 1</span></div>
<br />
The console can be mapped back to the default frame buffer (fb0) by changing the last 1 in the above command to 0.<br />
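For the curious, the remapping itself is a single ioctl on a frame buffer device. The sketch below shows roughly what I understand con2fbmap to do, using the FBIOPUT_CON2FBMAP request and struct fb_con2fbmap from the kernel's linux/fb.h; the function names con2fb_request and con2fb_apply are my own, and opening /dev/fb0 as the control device is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fb.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Build the request that maps virtual console `console` (numbered from 1,
 * as in "con2fbmap 1 1") onto frame buffer `fb` (0 for fb0, 1 for fb1). */
struct fb_con2fbmap con2fb_request(uint32_t console, uint32_t fb)
{
    struct fb_con2fbmap map = { .console = console, .framebuffer = fb };
    assert(console > 0);
    return map;
}

/* Apply the mapping. Returns 0 on success, -1 on error.
 * Needs a Linux system with /dev/fb0 and sufficient privileges. */
int con2fb_apply(uint32_t console, uint32_t fb)
{
    struct fb_con2fbmap map = con2fb_request(console, fb);
    int fd = open("/dev/fb0", O_RDWR);
    if (fd < 0)
        return -1;
    int rc = ioctl(fd, FBIOPUT_CON2FBMAP, &map);
    close(fd);
    return rc < 0 ? -1 : 0;
}
```

With this, con2fb_apply(1, 1) would correspond to "con2fbmap 1 1", and con2fb_apply(1, 0) to mapping the console back to fb0.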
<h2>
Low level control of SSD1351 on Arduino Uno</h2>
<div>
The Linux FB framework is powerful but requires a lot of code, which does not fit on most deeply embedded targets. The vendor (New Haven Display) put out <a href="https://github.com/NewhavenDisplay/NHD-1.27-12896UGC3_Example.git">a "bare metal" example on GitHub</a> for controlling the device from an Arduino Uno. This is an easier way to understand the low level control than wading through the many layers of the Linux FB driver code. The following is my annotation of the example Arduino code.</div>
<h3>
Low level primitive</h3>
<div>
The supplied example shows 3 different methods of sending a command to the device: 2 parallel interfaces and the 4-wire SPI. I am only interested in the serial interface (I cannot dedicate that many pins just for the display!), so I will ignore the parallel interfaces going forward. The chip requires MSb-first (most significant bit first) transfers. To write a command byte, the Arduino bit-bangs each bit on its GPIO pin while holding CS (chip select) and D/C# low for the whole duration of the 8 bits.</div>
<div>
<br /></div>
<div>
Writing 1 byte of data over the serial interface is exactly the same, except for holding D/C# high while writing.</div>
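A rough sketch of this write primitive follows. It is not the vendor's code: the pin names, the pin_write hook and the sim_* stand-in for the display are all hypothetical, introduced so that the sketch can be tried on a PC without hardware. On an Arduino, the hook could simply forward to digitalWrite().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pin identifiers; map them to real GPIO numbers on your board. */
enum oled_pin { PIN_CS, PIN_DC, PIN_SCLK, PIN_SDIN };

/* Board-specific hook: drive `pin` to `level` (0 or 1). */
typedef void (*pin_write_fn)(enum oled_pin pin, int level);

/* Shift one byte out MSb first. The data bit is set up while SCLK is low,
 * and the display samples it on the rising edge (SPI mode 0).
 * dc = 0 sends a command byte, dc = 1 sends a data byte. */
void oled_write_byte(pin_write_fn pin_write, uint8_t byte, int dc)
{
    assert(pin_write != NULL);
    pin_write(PIN_CS, 0);                /* select the chip */
    pin_write(PIN_DC, dc);
    for (int bit = 7; bit >= 0; bit--) {
        pin_write(PIN_SCLK, 0);
        pin_write(PIN_SDIN, (byte >> bit) & 1);
        pin_write(PIN_SCLK, 1);          /* rising edge: bit is sampled */
    }
    pin_write(PIN_SCLK, 0);              /* leave the clock idle low */
    pin_write(PIN_CS, 1);                /* deselect */
}

void oled_command(pin_write_fn w, uint8_t cmd)  { oled_write_byte(w, cmd, 0); }
void oled_data(pin_write_fn w, uint8_t data)    { oled_write_byte(w, data, 1); }

/* Host-side stand-in for the display: samples SDIN on each rising SCLK
 * edge while CS is low, and latches D/C# together with the 8th bit. */
static int sim_level[4];
static uint8_t sim_shift, sim_last_byte;
static int sim_bits, sim_last_dc;

void sim_pin_write(enum oled_pin pin, int level)
{
    if (pin == PIN_SCLK && level && !sim_level[PIN_SCLK] && !sim_level[PIN_CS]) {
        sim_shift = (uint8_t)((sim_shift << 1) | sim_level[PIN_SDIN]);
        if (++sim_bits == 8) {
            sim_last_byte = sim_shift;
            sim_last_dc = sim_level[PIN_DC];
            sim_bits = 0;
        }
    }
    sim_level[pin] = level;
}
```

Calling oled_command(sim_pin_write, 0xAE) leaves 0xAE in sim_last_byte with sim_last_dc at 0, which is a quick way to convince yourself that the bit order and the D/C# handling are right before touching the hardware.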
<h3>
Initialization</h3>
<div>
<ol>
<li>Chip reset: pull down the RES# pin for 500 usec, pull it up again, and then wait for at least 500 usec. </li>
<li>Unlock command: write 0x12 and then 0xB1 to the command lock register (0xFD)</li>
<li>Sleep mode on (display off): send the 0xAE command (no data bytes)</li>
<li>Set the display clock divider and oscillator frequency (divider field = 1, frequency = 0xF): write 0xF1 to 0xB3. Writing to this register requires command unlocking (step #2).</li>
<li>Set mux ratio</li>
<li>Set display offset and start</li>
<li>Set color depth to 18-bit (262k color), 16-bit format 2.</li>
<li>GPIO input disabled</li>
<li>Enable internal Vdd regulator</li>
<li>Choose external VSL</li>
<li>Set contrast current for the 3 colors (slightly different than the default: 0x8A, 0x70, 0x8A)</li>
<li>Reset output currents for all colors</li>
<li>Enhance display performance</li>
<li>...</li>
<li>Sleep mode off (display on): write (nothing) to 0xAF register.</li>
</ol>
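<div>
The steps whose register values are spelled out in the list can be written as a table of (command, data) pairs. This is a sketch, not the vendor's code: it assumes each unlock key is sent as its own 0xFD command, and <i>write_command</i> / <i>write_data</i> are hypothetical callbacks standing in for the low level primitive.</div>

```python
# Only the steps whose register values the list spells out are included;
# the elided steps ("...") are deliberately left out.
INIT_STEPS = [
    (0xFD, [0x12]),  # command lock register: first unlock key
    (0xFD, [0xB1]),  # command lock register: second unlock key (assumed separate command)
    (0xAE, []),      # sleep mode on (display off), no data bytes
    (0xB3, [0xF1]),  # clock divider / oscillator frequency
    (0xAF, []),      # sleep mode off (display on), no data bytes
]


def run_steps(steps, write_command, write_data):
    """Send each command byte followed by its data bytes."""
    for command, data in steps:
        write_command(command)
        for byte in data:
            write_data(byte)


if __name__ == "__main__":
    log = []
    run_steps(INIT_STEPS,
              lambda c: log.append(("cmd", c)),
              lambda d: log.append(("data", d)))
    for kind, value in log:
        print("{:4s} {:#04x}".format(kind, value))
```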
<h3>
Blank out the entire screen to black</h3>
</div>
<div>
Blanking out the screen to any color just means writing the same (whatever) color to every pixel. It consists of a setup stage and a data stage:</div>
<div>
<ol>
<li>Set column start and end to 0 and 127, respectively</li>
<li>Set row start and end to 0 and 95, respectively</li>
<li>Start write to RAM: write the destination register address (0x5C)</li>
<li>For the next 128x96 pixels, write the given pixel value (RGB) as SPI data. For 262k colors over the 8-bit serial interface, the data format is given in Table 8-8 of the SSD1351 data sheet. If I don't check for saturation, it's convenient to keep the colors as separate bytes, and output 8 bits for each color in rapid succession.</li>
</ol>
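<div>
The setup and data stages above can be sketched as follows. Only the 0x5C opcode comes from the text; the column and row opcodes (0x15, 0x75) are assumptions taken from my reading of the SSD1351 command table, and <i>write_command</i> / <i>write_data</i> are hypothetical callbacks for the low level primitive.</div>

```python
WIDTH, HEIGHT = 128, 96
SET_COLUMN = 0x15   # assumed opcode, not given in the text
SET_ROW = 0x75      # assumed opcode, not given in the text
WRITE_RAM = 0x5C    # from the text


def fill_screen(rgb, write_command, write_data):
    """Set the full-screen window, then stream one color to every pixel."""
    red, green, blue = rgb
    write_command(SET_COLUMN)
    write_data(0)
    write_data(WIDTH - 1)
    write_command(SET_ROW)
    write_data(0)
    write_data(HEIGHT - 1)
    write_command(WRITE_RAM)
    for _ in range(WIDTH * HEIGHT):
        write_data(red)     # one byte per color, back to back
        write_data(green)
        write_data(blue)


if __name__ == "__main__":
    commands, data = [], []
    fill_screen((0, 0, 0), commands.append, data.append)
    print(len(commands), "commands,", len(data), "data bytes")
```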
<h3>
Print a fixed font letter</h3>
</div>
<div>
If I emit a different color for a pixel than the background color, I can show a dot at a given point. If I arrange a group of neighboring pixels in a pre-arranged way, that is a symbol that can be shown at offset (x, y) on the screen. If I then hold a read-only bitmap representing a letter, it is possible to print one letter at a time on the screen, by testing each bit of the bitmap as the pixel position moves to the right. Here's an example of the letter 'A' in 10-point font:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">const unsigned char A10pt [] = {<span class="Apple-tab-span" style="white-space: pre;"> </span>// 'A' (11 pixels wide)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x0E, 0x00, // ### </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x0F, 0x00, // #### </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x1B, 0x00, // ## ## </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x1B, 0x00, // ## ## </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x13, 0x80, // # ### </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x31, 0x80, // ## ## </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x3F, 0xC0, // ######## </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x7F, 0xC0, // ######### </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x60, 0xC0, // ## ## </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0x60, 0xE0, // ## ###</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>0xE0, 0xE0, // ### ###</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">};</span></div>
</div>
<div>
<br /></div>
<div>
Note that this "10 point" glyph is actually 11 pixels tall and 11 pixels wide, stored as 2 B (16 bits) per row; the code below advances x by 13 pixels per letter. A for-loop to print this letter at position x and y on the screen is:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> index = 0; mask = 0x80; // start each byte at its MSb</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> for(i=0;i<11;i++) // display custom character A</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> {</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> OLED_SetColumnAddress_12896RGB(<span style="background-color: yellow;">x</span>, 0x7F);</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> OLED_SetRowAddress_12896RGB(<span style="background-color: yellow;">y</span>, 0x5F);</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> OLED_WriteMemoryStart_12896RGB();</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> for (count=0;count<8;count++)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> {</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> if((<span style="background-color: yellow;">A10pt</span>[index] & mask) == mask)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> OLED_Pixel_12896RGB(<span style="background-color: yellow;">textColor</span>);</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> else</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> OLED_Pixel_12896RGB(backgroundColor);</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> mask = mask >> 1;</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> }</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> index++;</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> mask = 0x80;</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> for (count=0;count<8;count++)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> {</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> if((<span style="background-color: yellow;">A10pt</span>[index] & mask) == mask)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> OLED_Pixel_12896RGB(textColor);</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> else</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> OLED_Pixel_12896RGB(backgroundColor);</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> mask = mask >> 1;</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> }</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> index++;</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> mask = 0x80;</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> y--;</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> }</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> x += 13;</span></div>
</div>
<div>
<br /></div>
<div>
This implementation is intimately tied to the font representation above (each row of the font consists of 2 B, and the pixel width and height are hard coded). But note that a few of the hard coded values can be parametrized: the letter position (x, y), the letter itself, and the foreground color (and possibly the background color). The loop can then be refactored into a common function that looks up the letter in a table--like the ASCII table:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">void OLED_Text_12896RGB(unsigned char x_pos, unsigned char y_pos, unsigned char letter, unsigned long textColor, unsigned long backgroundColor);</span></div>
</div>
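<div>
The bit-testing and table lookup described above can be tried out on a PC before touching the display. This sketch reuses the 'A' bitmap from the post; the <i>FONT</i> table and the function name are hypothetical, and it prints '#' and '.' instead of emitting pixel colors.</div>

```python
# The 'A' bitmap from the post: 11 rows, 2 bytes per row, MSb is the
# leftmost pixel.
A10PT = bytes([
    0x0E, 0x00, 0x0F, 0x00, 0x1B, 0x00, 0x1B, 0x00,
    0x13, 0x80, 0x31, 0x80, 0x3F, 0xC0, 0x7F, 0xC0,
    0x60, 0xC0, 0x60, 0xE0, 0xE0, 0xE0,
])

# Hypothetical lookup table; a real one would cover the printable ASCII range.
FONT = {"A": A10PT}


def glyph_rows(letter, bytes_per_row=2):
    """Return the glyph as strings, '#' for text color, '.' for background."""
    bitmap = FONT[letter]
    rows = []
    for start in range(0, len(bitmap), bytes_per_row):
        row = ""
        for byte in bitmap[start:start + bytes_per_row]:
            for bit in range(7, -1, -1):      # same order as mask = 0x80 >> n
                row += "#" if (byte >> bit) & 1 else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(glyph_rows("A")))
```

<div>
Swapping the '#' and '.' characters for the text and background colors gives the pixel stream that the loop above sends to the display.</div>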
<div>
<br /></div>
<div>
This strategy is slow but functional. To speed it up, the individual byte writes can be grouped together into a long sequence of bytes:<br />
<ul>
<li>The nRS pin can be held low the whole time (i.e. avoid the repeated function calls)</li>
<li>The SPI write can be accelerated over DMA if the background is the same. That is, instead of a letter consisting of just 1 bitmap, it can just be a long sequence of colors for the entire rectangular region the letter takes up. This will bloat the DATA segment dedicated to the letters.</li>
</ul>
Even more optimization techniques such as keeping a frame buffer and writing out a whole screen in one shot are just the beginning in graphics programming, and I won't write these myself because I don't want to reinvent the wheel.</div>
<div>
<h2>
Porting the Arduino example to RPi Linux user space</h2>
</div>
<div>
Driving out the SPI signal from RPi is an excellent way to prototype an embedded GUI platform even before the new board is brought up. Even after the board is brought up, writing a user space program to try out an idea is a great convenience. The key to porting the Arduino example to RPi is to leverage someone else's work on driving the RPi's SPI interface. <a href="http://www.airspayce.com/mikem/bcm2835/index.html">The BCM2835 library</a> is mature and performant. Using it, configuring the GPIO and SPI can be coded concisely:<br />
<br />
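<span style="font-family: "courier new" , "courier" , monospace;">#include <errno.h></span><br />
<span style="font-family: "courier new" , "courier" , monospace;">#include <stdio.h></span><br />
<span style="font-family: "courier new" , "courier" , monospace;">#include <bcm2835.h></span><br />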
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">#define DC_PIN 25</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">#define RES_PIN 24</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">int main() {</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> if (!bcm2835_init()) </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> return 1; </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> if (!bcm2835_spi_begin()) { </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> fprintf(stderr, "bcm2835_spi_begin failed %d. Are you running as root??\n", </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> errno); </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> return 1; </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> } </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> bcm2835_spi_setBitOrder(BCM2835_SPI_BIT_ORDER_MSBFIRST); // The default </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> bcm2835_spi_setDataMode(BCM2835_SPI_MODE0); // The default </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> bcm2835_spi_setClockDivider(BCM2835_SPI_CLOCK_DIVIDER_32); </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> bcm2835_spi_chipSelect(BCM2835_SPI_CS0); // The default </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> bcm2835_spi_setChipSelectPolarity(BCM2835_SPI_CS0, LOW); // the default </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace;">// the output pins: D/C (GPIO.3), RES (GPIO.5) </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> bcm2835_gpio_fsel(DC_PIN, BCM2835_GPIO_FSEL_OUTP);</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> bcm2835_gpio_fsel(RES_PIN, BCM2835_GPIO_FSEL_OUTP);</span><br />
<br />
Divider = 32 yields 8 MHz SPI speed. I could try going faster, but even at 8 MHz, the signal integrity is marginal. When the part is integrated on a PCB, I should be able to go faster. Anyway, the smiley shows that once the image is set, the display can just refresh itself without a periodic refresh from the host, which means that a slow processor like a C2000 can just update the display asynchronously.<br />
<br />
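<div>
The relationship between the divider and the SPI speed is simple arithmetic. The sketch below assumes a nominal 250 MHz core clock (an assumption; the actual core clock depends on the Pi model and configuration) and 3 bytes per pixel, and it ignores any gaps between bytes, so the frame time is a lower bound.</div>

```python
CORE_CLOCK_HZ = 250000000  # assumed nominal core clock; check your board


def spi_clock_hz(divider, core_clock_hz=CORE_CLOCK_HZ):
    """SPI clock that results from dividing the core clock."""
    return core_clock_hz / float(divider)


def frame_seconds(divider, width=128, height=96, bytes_per_pixel=3):
    """Minimum time to clock out one full frame, ignoring inter-byte gaps."""
    bits = width * height * bytes_per_pixel * 8
    return bits / spi_clock_hz(divider)


if __name__ == "__main__":
    print("SPI clock: {:.4f} MHz".format(spi_clock_hz(32) / 1e6))
    print("Full frame: {:.1f} ms".format(frame_seconds(32) * 1e3))
```

<div>
Under these assumptions a divider of 32 gives a little under 8 MHz, and a full 128x96 frame takes a few tens of milliseconds to send.</div>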
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOVL1hd8hf0D_f54ZSTz9wdHhYEDvMg6I9ZLdshm69AzyEYjF5F3GZurzKy8W1UEOWxTlorG2zabbkcEh7BK2eSuGqmLQZkr4X85FLjqg4a0F9HiHnzooBVhHM3u7LDAnICAfb6MS09RkN/s1600/IMG_0034.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="455" data-original-width="474" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOVL1hd8hf0D_f54ZSTz9wdHhYEDvMg6I9ZLdshm69AzyEYjF5F3GZurzKy8W1UEOWxTlorG2zabbkcEh7BK2eSuGqmLQZkr4X85FLjqg4a0F9HiHnzooBVhHM3u7LDAnICAfb6MS09RkN/s1600/IMG_0034.jpg" /></a></div>
This is not quite bare metal in the true sense. But still, this code should be readily transferable to an embedded target such as C2000.<br />
<h2>
Bare metal control of the NHD panel from C2000</h2>
TODO<br />
<br /></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-62222310027955037312017-11-19T15:21:00.000-08:002017-11-19T15:21:31.037-08:00JTAG DAP parserI have been trying to get a low level debug session going against my Raspberry Pi 3 using my J-Link debug probe. If you Google for "J-Link Raspberry Pi", you will find success reported mostly for the original Raspberry Pi. At first, I tried to use JLinkGDBServer and JLinkExe on my Ubuntu VM, but I haven't managed to write a working JLinkScript yet. When even OpenOCD failed to connect to the target, I started digging into the root cause. Following <a href="https://sysprogs.com/VisualKernel/tutorials/raspberry/jtagsetup/">this page</a> to enable JTAG on RPi's GPIO was relatively easy, as was exposing the copper for the JTAG's TRST, TDI, TMS, TCK, TDO lines and capturing the failed debug session on the logic analyzer. Saleae already has the JTAG signal analyzer, so reading the raw bits going into and coming out of the JTAG scan is all relatively easy, as you can see below.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHfC5afRmQLDS00NdEti4oJtWyNFZXjajHeAxYXQKvoTCyqmxKas8ogYpktSEd7sjW7JCHDXNcBRRJHz2vG9JfvBY-OFmqEPtIAJUmW3uPEB2r7Q_8nARpoM144gmAfKmAjfJksZivGTTp/s1600/Screen+Shot+2017-11-19+at+1.02.58+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="256" data-original-width="928" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHfC5afRmQLDS00NdEti4oJtWyNFZXjajHeAxYXQKvoTCyqmxKas8ogYpktSEd7sjW7JCHDXNcBRRJHz2vG9JfvBY-OFmqEPtIAJUmW3uPEB2r7Q_8nARpoM144gmAfKmAjfJksZivGTTp/s1600/Screen+Shot+2017-11-19+at+1.02.58+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A small portion of a JTAG session between J-Link and Raspberry Pi 3 target, with JTAG enabled on RPi3's P8 header</td></tr>
</tbody></table>
But such low level exchange does not yield insight, so I started reading documents: ADI (ARM debug interface) 5.2, CoreSight specification 3.0, ARM Cortex A/R programmer's guide, and ARM Cortex-A7 TRM (technical reference manual), and I understood how the debug host controls the ARM CPU's debug subsystem by writing appropriate values to the DAP (debug access port) registers. But to actually apply this understanding to my problem required rather painful mental bit-shifting, and repeatedly looking up the register definitions in the ADI and CoreSight specification. So I saved the Saleae's JTAG capture to a CSV file, and wrote a Python script to do the low level heavy-lifting for me.<br />
<h2>
Parser</h2>
<div>
In the above trace, there are only 2 JTAG TAP (test access port) states that yield decodable values: Shift-IR (instruction register) and Shift-DR (data register). All other transactions either lead up to these states, or move the state machine back to the starting state. Roughly speaking, the target register is specified in the Shift-IR state, and the data values for the specified register are given in the Shift-DR state. So unless the JTAG signals are bad (unlikely on shipping HW like the Raspberry Pi), I can just focus on these 2 states and ignore most of the lines in the CSV emitted by the Saleae Logic GUI. I learned about the pandas Python library in a Udacity data science course on Supervised Learning: it will do much of the tabular data cleanup for me. Although I can pip install pandas on my system, it was just easier to install Anaconda (version 2, to stay with Python 2.7) to a separate folder. So my script begins by using the python interpreter that comes with anaconda2.</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#!/anaconda2/bin/python</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">from enum import Enum</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">import sys</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">class TapState(Enum): Invalid, I, D = range(3)</span></div>
</div>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">tap_state = TapState.Invalid.value</span></div>
</div>
<div>
<br /></div>
<div>
Again, the reduction of the JTAG TAP states to I (instruction) and D (data) is a drastic simplification for this case, where I am only interested in parsing the layer above the JTAG.</div>
<div>
<br /></div>
<div>
In the capture, there are 3 other registers that appear besides the DPACC (debug port access) and APACC (access port access), so I enumerate them.</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">class IR(Enum):</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> ABORT = 8</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> DPACC = 10</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> APACC = 11</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> IDCODE = 14</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> BYPASS = 15</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">ir = IR.BYPASS.value</span></div>
</div>
<div>
<br /></div>
<div>
From my previous experience with SWD, I know about the trick that ARM plays with the SELECT register to map different registers to the limited size of register banks. Here, 3 separate selections can happen independently, so I need 3 separate variables to maintain in a DAP transaction:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">apsel = -1 # After PoR, APSEL unknown</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">apbank_sel = 0</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">dpbank_sel = 0</span></div>
</div>
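<div>
To see how one 32-bit SELECT write carries all 3 selections, here is a minimal sketch that uses the same field positions the script applies further down. The function name and the example value are hypothetical.</div>

```python
def decode_select(value):
    """Split a 32-bit SELECT value into its three selector fields."""
    return {
        "apsel": (value >> 24) & 0xFF,     # which AP is addressed
        "apbank_sel": (value >> 4) & 0xF,  # which 4-register bank of that AP
        "dpbank_sel": value & 0xF,         # which DP register sits at address 0x4
    }


if __name__ == "__main__":
    # Hypothetical value: AP 1, AP bank 0xF, DP bank 0.
    print(decode_select(0x010000F0))
```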
<div>
<br /></div>
<div>
This simple script just takes 1 CSV file--which is separated with ';' rather than a comma. The pandas package deals with it easily:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">import pandas as pd</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">fn = sys.argv[1]</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">df = pd.read_csv(fn, sep=';', index_col=0)</span></div>
</div>
<div>
<br /></div>
<div>
Saleae emits the timestamp as the first column, and it is sometimes convenient to look up a packet by the timestamp, so I am specifying the 0th column as the index.</div>
<div>
<br /></div>
The very first exchange in a JTAG session is the JTAG scan: discovering how many JTAG devices are cascaded. It's a complete waste for the normal case of just 1 device, but the long-ass sequence is still there; so I just drop it:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">df = df[df['TDIBitCount'] < 100] # drop the JTAG scan</span><br />
<div>
<br /></div>
Next, I need to deal with the pandas representation of CSV data: all numbers are floating point by default, and the rest are strings.<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">df['TDOBitCount'] = df['TDOBitCount'].astype(int)</span><br />
<span style="font-family: Courier New, Courier, monospace;">df['TDIBitCount'] = df['TDIBitCount'].astype(int)</span><br />
<span style="font-family: Courier New, Courier, monospace;">df['TDI'] = df.TDI.apply(lambda x: int(x, 16))</span><br />
<span style="font-family: Courier New, Courier, monospace;">df['TDO'] = df.TDO.apply(lambda x: int(x, 16))</span><br />
<br />
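<div>
For readers without pandas, the same load, filter, and convert steps can be sketched with the standard csv module instead. This is a swapped-in technique, not what my script does, and the sample rows and the first two column names are made up; only the TDI, TDO, TDIBitCount and TDOBitCount names come from the script.</div>

```python
import csv
import io

# Hypothetical capture: the "Time [s]" and "Type" headers and all row
# values are invented for this sketch.
SAMPLE = u"""Time [s];Type;TDI;TDO;TDIBitCount;TDOBitCount
0.001;Shift-DR;0xFFFFFFFF;0x0;256;256
0.018;Shift-IR;0xA;0x1;4;4
0.019;Shift-DR;0x280000002;0x2;35;35
"""


def load_rows(text):
    """Parse a ';'-separated capture, drop long scans, convert the numbers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        if int(row["TDIBitCount"]) >= 100:   # drop the long JTAG scan
            continue
        row["TDIBitCount"] = int(row["TDIBitCount"])
        row["TDOBitCount"] = int(row["TDOBitCount"])
        row["TDI"] = int(row["TDI"], 16)     # hex string to integer
        row["TDO"] = int(row["TDO"], 16)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in load_rows(SAMPLE):
        print(row["Type"], hex(row["TDI"]), row["TDIBitCount"])
```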
Finally, I can iterate through the TAP Shift-IR and Shift-DR. The first thing is to break out the items as separate variables, for legibility. The last column is the number of bits output, which is the same as the number of bits input in all cases I've seen (JTAG seems to work like SPI), so it's safe to drop it.<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">for row in df.itertuples():</span><br />
<span style="font-family: Courier New, Courier, monospace;"> timestamp, packet_type, TDI, TDO, nBit = row[:-1] # row[0] is timestamp</span><br />
<div>
<br /></div>
Since Shift-IR just sets the target register, handling that is straight-forward:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;"> if packet_type == 'Shift-IR':</span><br />
<span style="font-family: Courier New, Courier, monospace;"> tap_state = TapState.I.value</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if (nBit == 4) and TDI in [IR.DPACC.value, IR.APACC.value, IR.IDCODE.value]:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> ir = TDI</span><br />
<span style="font-family: Courier New, Courier, monospace;"> else: tap_state = TapState.Invalid.value # drop this packet</span><br />
<br />
Shift-DR is far more complicated, but once again, I make a simplifying assumption that I am only interested in the DPACC or APACC. In both cases, I am only interested in the standard TDI packet comprising 32 bits of data, a 2-bit address, and a 1-bit R/W indicator.<br />
<br />
<span style="font-family: Courier New, Courier, monospace;"> if ir == IR.DPACC.value:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if nBit == 35: </span><br />
<span style="font-family: Courier New, Courier, monospace;"> dout = TDO >> 3</span><br />
<span style="font-family: Courier New, Courier, monospace;"> ack = TDO & 0x7</span><br />
<span style="font-family: Courier New, Courier, monospace;"> din = TDI >> 3</span><br />
<span style="font-family: Courier New, Courier, monospace;"> addr = (TDI & 0x6) << 1</span><br />
<span style="font-family: Courier New, Courier, monospace;"> rnw = 'R' if (TDI & 0x1) else 'W'</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;"> decoded = None</span><br />
<span style="font-family: Courier New, Courier, monospace;"> # decode DAP reg</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if addr == 0: decoded = 'DPIDR'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif addr == 0x8:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> apsel = din >> 24</span><br />
<span style="font-family: Courier New, Courier, monospace;"> apbank_sel = (din >> 4) & 0xF</span><br />
<span style="font-family: Courier New, Courier, monospace;"> dpbank_sel = din & 0xF</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'SELECT AP {:#x} APB {:#x} DPB {:#x}'.format(apsel, apbank_sel, dpbank_sel)</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif addr == 0xC: decoded = 'RDBUFF'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif addr == 0x4: # act on dpbank_sel</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if dpbank_sel == 0:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'CTRL/STAT'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif dpbank_sel == 1:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'DLCR'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif dpbank_sel == 2:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'TARGETID'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif dpbank_sel == 3:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'DLPIDR'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif dpbank_sel == 4:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'EVENTSTAT'</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;"> print('@{} {:#x} | {:x} {} | {} -> DPACC -> {:#x} | {}'. \</span><br />
<span style="font-family: Courier New, Courier, monospace;"> format(timestamp, din, addr, decoded, rnw, dout, ack))</span><br />
<span style="font-family: Courier New, Courier, monospace;"> else: print('@{} Unhandled {:#x} -> DPACC -> {:#x}'.format(timestamp, TDI, TDO))</span><br />
<div>
<br /></div>
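<div>
The bit slicing of the 35-bit packet can be checked in isolation with a round trip. The sketch below uses the same shifts and masks as the code above; the function names are hypothetical, and <i>encode_tdi</i> exists only to build a test vector.</div>

```python
def decode_dap_packet(tdi, tdo):
    """Split one 35-bit DPACC/APACC scan into its fields."""
    return {
        "rnw": "R" if (tdi & 0x1) else "W",  # bit 0: 1 = read, 0 = write
        "addr": (tdi & 0x6) << 1,            # bits 2:1 hold address bits A[3:2]
        "din": tdi >> 3,                     # bits 34:3: data shifted in
        "dout": tdo >> 3,                    # bits 34:3: data shifted out
        "ack": tdo & 0x7,                    # bits 2:0: acknowledge code
    }


def encode_tdi(data, addr, read):
    """Build the 35-bit TDI word; only used here to make a test vector."""
    return (data << 3) | ((addr >> 2) << 1) | (1 if read else 0)


if __name__ == "__main__":
    tdi = encode_tdi(0x50000000, 0x4, read=False)
    fields = decode_dap_packet(tdi, tdo=0x2)
    print("{:#x} -> {}".format(tdi, fields))
```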
For APACC, my current level of understanding of the ARM MEM-AP registers is not solid enough to hard code the values I see in the TAR (target address register), so I keep things simple.<br />
<br />
<span style="font-family: Courier New, Courier, monospace;"> elif ir == IR.APACC.value:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if nBit == 35: </span><br />
<span style="font-family: Courier New, Courier, monospace;"> dout = TDO >> 3</span><br />
<span style="font-family: Courier New, Courier, monospace;"> ack = TDO & 0x7</span><br />
<span style="font-family: Courier New, Courier, monospace;"> din = TDI >> 3</span><br />
<span style="font-family: Courier New, Courier, monospace;"> addr = (TDI & 0x6) << 1</span><br />
<span style="font-family: Courier New, Courier, monospace;"> rnw = 'R' if (TDI & 0x1) else 'W'</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;"> decoded = None</span><br />
<span style="font-family: Courier New, Courier, monospace;"> # Assume this is a MEM-AP and decode</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if apbank_sel == 0:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if addr == 0:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'CSW'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif addr == 4:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'TAR'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif addr == 0xC:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'DRW'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif apbank_sel == 0xf:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if addr == 4:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'CFG'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif addr == 8:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'BASE'</span><br />
<span style="font-family: Courier New, Courier, monospace;"> elif addr == 0xC:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> decoded = 'IDR'</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;"> print('@{} {:#x} | {:x} {} | {} -> APACC -> {:#x} | {}'. \</span><br />
<span style="font-family: Courier New, Courier, monospace;"> format(timestamp, din, addr, decoded, rnw, dout, ack))</span><br />
<span style="font-family: Courier New, Courier, monospace;"> else: print('@{} Unhandled {:#x} -> APACC -> {:#x}'.format(timestamp, TDI, TDO))</span><br />
<br />
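<div>
As an alternative to the if/elif chain, the MEM-AP register names could be kept in a table keyed by (bank, address). This is only a sketch of that design choice, using the same six registers the code above decodes; the names of the table and function are hypothetical.</div>

```python
# (apbank_sel, addr) -> register name, same entries as the if/elif chain.
MEM_AP_REGISTERS = {
    (0x0, 0x0): "CSW",
    (0x0, 0x4): "TAR",
    (0x0, 0xC): "DRW",
    (0xF, 0x4): "CFG",
    (0xF, 0x8): "BASE",
    (0xF, 0xC): "IDR",
}


def mem_ap_register(apbank_sel, addr):
    """Return the register name, or None when the pair is not in the table."""
    return MEM_AP_REGISTERS.get((apbank_sel, addr))


if __name__ == "__main__":
    print(mem_ap_register(0xF, 0xC), mem_ap_register(0x1, 0x0))
```

<div>
Adding a newly understood register then means adding one table row rather than another branch.</div>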
Finally, I can handle the IDCODE easily, so I just threw that in at the end:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;"> elif ir == IR.IDCODE.value:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if nBit == 32: print('IDCODE -> {:#010x}'.format(TDO))</span><br />
<span style="font-family: Courier New, Courier, monospace;"> else: # Hmm what is this?</span><br />
<span style="font-family: Courier New, Courier, monospace;"> tap_state = TapState.Invalid.value</span><br />
<br />
All in all, a simple parser! Let's see if it's of any use.<br />
<h2>
Using the parser on openocd session</h2>
<div>
The very first lines decoded with the parser on a session between my J-Link Ultra+ and RPi3 are:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">@0.01815178 0x0 | 4 CTRL/STAT | R -> DPACC -> 0x0 | 2</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">@0.0181938 0x20 | 4 CTRL/STAT | W -> DPACC -> 0x0 | 2</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">@0.01823582 0x0 | 4 CTRL/STAT | R -> DPACC -> 0x0 | 2</span></div>
</div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">@0.01827784 0x50000000 | 4 CTRL/STAT | W -> DPACC -> 0x0 | 2</span></div>
</div>
<div>
<br /></div>
<div>
According to my copy of <i>ADI v5.2</i> section <i>B.2.2 CTRL/STAT, Control/Status register</i>, 0x20 is the STICKYERR bit; writing a 1'b1 to it clears that bit. That makes sense, except that the bit was not set to begin with, so the write is a complete waste of time. The last write, 0x50000000, puts 0x5 in the top nibble, which sets CSYSPWRUPREQ (bit 30) and CDBGPWRUPREQ (bit 28): openocd is requesting power-up of the system and debug domains. A debug reset would instead be requested through CDBGRSTREQ (bit 26), which is not set here, so this write by itself does not ask for a reset of the running system that I just want to halt.</div>
<div>
<br /></div>
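<div>
To avoid the mental bit-shifting for CTRL/STAT values, the single-bit flags can be decoded by name. The bit positions below are my reading of the ADIv5 register description and should be checked against your own copy; the multi-bit fields (TRNMODE, MASKLANE, TRNCNT) are left out of this sketch.</div>

```python
# (bit position, name) for the single-bit CTRL/STAT flags, as I read the
# ADIv5 register description. Multi-bit fields are not decoded here.
CTRL_STAT_FLAGS = [
    (31, "CSYSPWRUPACK"),
    (30, "CSYSPWRUPREQ"),
    (29, "CDBGPWRUPACK"),
    (28, "CDBGPWRUPREQ"),
    (27, "CDBGRSTACK"),
    (26, "CDBGRSTREQ"),
    (5, "STICKYERR"),
    (4, "STICKYCMP"),
    (1, "STICKYORUN"),
    (0, "ORUNDETECT"),
]


def decode_ctrl_stat(value):
    """List the names of the flags that are set, highest bit first."""
    return [name for bit, name in CTRL_STAT_FLAGS if (value >> bit) & 1]


if __name__ == "__main__":
    for value in (0x20, 0x50000000, 0xF0000000):
        print("{:#010x} {}".format(value, decode_ctrl_stat(value)))
```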
<div>
So it seems that armed with tables of the various DP/AP registers, I can start to make sense of what openocd is requesting from the target. I was therefore surprised to discover--just a few ms later--that openocd goes through a "ping" of potential APs in the address space: all 256 of them, by trying to read the CIDR (component ID register; the 1st register in AP register bank 0xF) of each of the possible 4 KB mappings in the base register. Using the same parser, I saw that J-Link discovers all available AP components by reading the ROM table (it fails to use the discovered ROM table in an intelligent way, but that's another topic altogether). Going through 256 possible APs takes J-Link Ultra+ about 133 ms at 100 kbps; it would take J-Link+ about 10x that duration (its inter-packet time is long for some reason). It then queries the IDR of each possible AP component--only 8 of which are populated for the RPi3 (another ~130 ms wasted). OpenOCD then goes through another round of unnecessary exchange with the target: it tries to unlock software access to the debug registers by writing the magic keys for each of the discovered components (the RPi's AP components are ROM table v9, which do not implement the software lock/unlock). The waste is even worse, because after the discovery, openocd repeats the CTRL/STAT writes from the start of the session, and <i>then</i> goes through the same discovery and <i>software unlock</i> mechanism it went through last time.</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">@0.32220612 0x20 | 4 CTRL/STAT | W -> DPACC -> 0xf0000001 | 2</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">@0.32224814 0x0 | 4 CTRL/STAT | R -> DPACC -> 0xa0000000 | 2</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">@0.32229016 0x50000000 | 4 CTRL/STAT | W -> DPACC -> 0x0 | 2</span></div>
</div>
<div>
<br /></div>
<div>
<br /></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com1tag:blogger.com,1999:blog-4032020337247582619.post-6899228017252988652017-10-11T07:44:00.000-07:002017-11-25T16:34:00.986-08:00Hacking the Raspberry Pi BootThe downloadable Raspbian image does not come with U-Boot (the most popular embedded system boot loader), yet it can clearly boot Linux. Because the Pi's 2nd stage boot loader is closed source, booting the Pi with U-Boot allows me to perform non-standard boot actions (like exposing the RPi's JTAG pins). <br />
<h2>
How does the RPi boot?</h2>
<div>
Booting an ARM system is very different from booting
an Intel/AMD system (BIOS or UEFI). The RPi's 1st stage bootloader is
on-chip, and looks for the file bootcode.bin in the boot partition:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:~ $ mount</span></div>
</div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">/dev/mmcblk0p1
on /boot type vfat
(rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)</span></div>
</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:~ $ ls -gh /boot/bootcode.bin </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 50K Jul 3 10:07 /boot/bootcode.bin</span></div>
</div>
<div>
<br /></div>
<div>
The
2nd stage bootloader runs on the RPi's GPU (also on-chip), and requires the
start_<device>.elf executables, also in the /boot partition:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 645K Jul 3 14:07 start_cd.elf</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 4.8M Jul 3 14:07 start_db.elf</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 2.8M Jul 3 14:07 start.elf</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 3.8M Jul 3 14:07 start_x.elf</span></div>
</div>
<div>
<br /></div>
A
headless system will only have start.elf. The 3rd stage bootloader
(also running on the GPU) is currently booting straight into Linux,
using the kernel.img and the DTB (device tree blob) files (also in
/boot):<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 18K May 15 19:09 bcm2708-rpi-3-b.dtb</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 4.4M Jul 3 10:07 kernel7.img # for >= RPi2</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">-rwxr-xr-x 1 root 4.2M Jul 3 10:07 kernel.img # All other RPi</span><br />
<div>
<br /></div>
<div>
The
RPi boot loader running on the GPU reads config.txt (also in /boot) for
boot options, including which Linux kernel to load. There is no
"kernel" line in the config.txt right after a fresh install, so the boot
loader defaults to loading the highest numbered
kernel<num>.img. Since we have a working system but want a boot
loader capable of booting multiple OSes, we should point the RPi
bootloader to the U-Boot image instead of the kernel<num>.img. [Other
people seem to prefer renaming u-boot.bin to kernel8.img, taking
advantage of the boot loader's search for the highest numbered
kernel<num>.img.]<br />
<br />
While booting the kernel, the RPi bootloader feeds the content of the file /boot/cmdline.txt to
the kernel, but now I need U-Boot to do that. When you view this file, it can get overwhelming, but just note it down somewhere for now, so you can punch it into U-Boot down below.<br />
<br />
Let's first back up this known working /boot partition to the development host before messing with it.<br />
<h2>
Cross compiler toolchain for RPi</h2>
Remember throughout this article that the RPi is the <i>target</i> and my 64-bit Ubuntu 16.04 machine is the development <i>host</i>, which also implies that I am <i>cross</i> compiling the RPi's programs (starting with the bootloader) on the host, using a <i>toolchain</i> graciously provided by the Raspberry Pi folks on GitHub. To clone the git repo:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:</span></span>$ git clone https://github.com/raspberrypi/tools</span><br />
<br />
The tools are in the directory arm-bcm2708<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">$ ls arm-bcm2708/<br />arm-bcm2708hardfp-linux-gnueabi gcc-linaro-arm-linux-gnueabihf-raspbian<br />arm-bcm2708-linux-gnueabi <span style="background-color: yellow;"><b>gcc-linaro-arm-linux-gnueabihf-raspbian-x64</b></span><br />arm-rpi-4.9.3-linux-gnueabihf</span><br />
<br />
I
decided to use the Linaro toolchain, because it is popular among
embedded developers. Note that the 32-bit version of the toolchain will
not run out of the box on a 64-bit host (that's pretty much all PCs these days),
which is why I picked the x64 variant highlighted above.
I need to put the toolchain's bin/ folder in $PATH by adding the
following line at the end of my ~/.profile:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">PATH="$HOME/rpi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64/bin:$HOME/bin:$HOME/.local/bin:$PATH"</span><br />
<br />
[If using .profile, you have to log out and log back in for the change to take effect.]<br />
<br />
In the toolchain's bin folder, there are a whole bunch of executables, like this:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">~/rpi/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian-x64/bin$ ls<br />arm-linux-gnueabihf-addr2line arm-linux-gnueabihf-gfortran<br />...</span><br />
<br />
To build either the kernel or U-Boot, an environment variable called
CROSS_COMPILE is necessary; the build system uses it as a prefix for the name of each tool it runs. In the above
case, that prefix is <span style="font-family: "courier new" , "courier" , monospace;">arm-linux-gnueabihf-</span>, which I also hard-code in my .profile.<br />
<h2>
Building U-Boot for RPi3</h2>
I clone the latest U-Boot repo:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ git clone git://git.denx.de/u-boot.git</span><br />
<br />
In the repo's configs/ folder, there are 4 RPi board configs:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">~/rpi/u-boot$ ls configs/*rpi*</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">configs/rpi_2_defconfig configs/rpi_3_defconfig<br />configs/rpi_3_32b_defconfig configs/rpi_defconfig</span><br />
These
are the predefined U-Boot configs. Without understanding the details
of the configs, I will try using the 32b version for now. <i>Remembering</i> that I now have the environment variable CROSS_COMPILE, I seed the .config file from this defconfig:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">~/rpi/u-boot$ make <span style="background-color: yellow;">rpi_3_32b_defconfig</span><br /> HOSTCC scripts/basic/fixdep<br /> HOSTCC scripts/kconfig/conf.o<br /> HOSTCC scripts/kconfig/zconf.tab.o<br /> HOSTLD scripts/kconfig/conf<br />#<br /># configuration written to .config<br />#</span><br />
<br />
And then build the U-Boot binary itself with<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">~/rpi/u-boot$ make</span><br />
<br />
If
all goes well, you should see both the ELF binary (with symbols) and the
stripped binary (without symbols) in the repo's root folder, like this:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">-rwxrwxr-x 1 parallels 2.2M Oct 1 12:00 u-boot<br />-rw-rw-r-- 1 parallels 372K Oct 1 12:00 u-boot.bin</span><br />
<br />
Copy u-boot.bin to RPi's /boot folder.<br />
<br />
To make the RPi boot loader choose U-Boot instead of the kernel, just add a new line to config.txt:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">kernel=u-boot.bin</span><br />
<br />
Overriding
what the GPU boots this way (as opposed to changing the file that the
GPU looks for) offers the possibility of backing out easily using the
recovery method (which lets you modify the config.txt)--IF you are
running the NOOBS install. Also enable the UART in config.txt, for the
debugging session that begins shortly below.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">enable_uart=1</span><br />
<h2>
U-Boot boots kernel7.img</h2>
When I reboot RPi with the HDMI monitor and keyboard connected, I see the U-Boot screen on the monitor with this message:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br />U-Boot 2017.09-00351-g7eefa35-dirty (Oct 11 2017 - 21:25:18 -0700)<br /><br />DRAM: 896 MiB<br />RPI 3 Model B (0xa22082)<br />MMC: sdhci@7e300000: 0<br />...<br /><span style="background-color: yellow;">Hit any key to stop autoboot</span>: 0 </span><span style="font-family: "courier new" , "courier" , monospace;">Scanning mmc 0:1...</span><br />
<br />
Note
that the memory available to the CPU is 128 MiB shy of the 1 GiB of DRAM on
board (1024 - 128 = 896 MiB), because I gave 128 MB to the GPU (in /boot/config.txt).<br />
<br />
Anyway,
it's clearly not booting to Linux, so I reboot and <span style="background-color: yellow;">hit the Enter key</span>
to stop the autoboot, and I am left with the U-Boot console:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot></span><br />
<br />
The
next step is to configure U-Boot to load the kernel image already on
/boot: kernel7.img. Load the image and the DTB file, and give those to the bootz command in
the bootcmd_mmc0 command, as you can see below. [Note that "mmc 0:1" is
pointing to the 1st partition on the SD
card--which is NOT the /boot partition for the NOOBS install (in which
case it is the 0:6 partition).]<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">U-Boot> setenv fdtfile bcm2708-rpi-3-b.dtb</span></span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">U-Boot> </span>setenv
bootcmd_mmc0 fatload mmc 0:1 ${kernel_addr_r} kernel7.img\;fatload mmc
0:1 ${fdt_addr} ${fdtfile}\;bootz ${kernel_addr_r} - ${fdt_addr}</span><br />
This works because the default boot command--bootcmd--eventually runs bootcmd_mmc0, as you can see in the boot command chain below:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> printenv bootcmd</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">bootcmd=run distro_bootcmd </span><br />
<br />
The distro_bootcmd, in turn, is a for loop, going through different boot hardware:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> printenv </span><span style="font-family: "courier new" , "courier" , monospace;">distro_bootcmd</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">distro_bootcmd</span>=for target in ${boot_targets}; do run bootcmd_${target}; done </span><br />
<br />
And the RPi3 config file sets up multiple boot devices:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> printenv </span><span style="font-family: "courier new" , "courier" , monospace;">boot_targets</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">boot_targets=mmc0 usb0 pxe dhcp</span><br />
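The chain above boils down to "split boot_targets on spaces, prefix each word with bootcmd_, and run the scripts in order". Here is a minimal host-side sketch of that name expansion in C. It is only an illustration of the loop, not U-Boot's shell implementation, and the function name nth_boot_script is a made-up helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of U-Boot): writes into `out` the name of
 * the n-th script that a loop like distro_bootcmd would run, i.e.
 * "bootcmd_" followed by the n-th space-separated word of boot_targets.
 * Returns 1 on success, 0 if there is no n-th target or `out` is too small.
 */
int nth_boot_script(const char *boot_targets, unsigned n,
                    char *out, size_t out_size)
{
    const char *p = boot_targets;

    assert(boot_targets != NULL && out != NULL && out_size > 0);

    for (;;) {
        p += strspn(p, " ");            /* skip separators */
        if (*p == '\0')
            return 0;                   /* ran out of targets */

        size_t len = strcspn(p, " ");   /* length of the current word */
        if (n == 0) {
            int written = snprintf(out, out_size, "bootcmd_%.*s",
                                   (int)len, p);
            return written > 0 && (size_t)written < out_size;
        }
        --n;
        p += len;
    }
}
```

With the boot_targets value printed above, index 0 yields bootcmd_mmc0, which is exactly the variable I overrode, so my fatload/bootz sequence is the first thing the loop tries.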
<br />
The kernel needs bootargs: the content of /boot/cmdline.txt that I saved away earlier. In U-Boot, you do that through an environment variable called <span style="background-color: yellow;">bootargs</span>. Because the value is long, it is easy to make a typo when creating this environment variable; so I do it in bite-sized chunks:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> setenv rootargs "root=/dev/mmcblk0p2 rootfstype=ext4"</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> setenv fbargs "bcm2708_fb.fbwidth=1824 bcm2708_fb.fbheight=984 bcm2708_fb.fbswap=1"</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> setenv vcmemargs "vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000"</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> setenv miscargs "dwc_otg.lpm_enable=0 elevator=deadline fsck.repair=yes console=serial0,115200"</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> setenv <span style="background-color: yellow;">bootargs</span> ${rootargs} ${fbargs} ${miscargs} ${vcmemargs}</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><br /></span></span>
Note that I changed the rootfs device from a UUID to the more generic "2nd partition on the mmc device 0". I found that for a NOOBS install (vs. a straight Raspbian imaging as I did), the ext4 partition is the 6th partition, so you can adjust it to mmcblk0p6 above.<br />
<br />
I save the configuration change to the U-Boot environment file (persisted in /boot) before rebooting.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> saveenv</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">U-Boot> boot</span></span><br />
<br />
After
a few seconds, I see that RPi3 is starting to boot Linux, and then
voilà: I am back to the Rasbian desktop! I save the resulting
/boot/uboot.env file for safe keeping. Since I want to start the FW
before booting Linux, I studied U-Boot's standalone examples.<br />
<br />
<dl id="comments-block" style="color: #333333; font-family: Georgia, serif; line-height: 1.6em; margin: 1em 0px 1.5em;"><dd class="comment-body" style="margin: 0.25em 0px 0px;"><div style="margin-bottom: 0.75em;">
From (https://dius.com.au/2015/08/19/raspberry-pi-uboot/), I learned that I could get the RPi bootloader to do all the heavy lifting of creating a merged DTB and putting it at the address specified in config.txt--for the device_tree_address line. The default U-Boot ${fdt_addr} is 0x2effb600, so I just added the following line to /boot/config.txt:<br />
<br />
device_tree_address=0x2effb600<br />
kernel=u-boot.bin</div>
</dd></dl>
<h2>
U-Boot Hello World example</h2>
According to <a href="http://www.denx.de/wiki/view/DULG/UBootStandalone#Section_5.12.3.">U-Boot documentation</a>,
the cache coherence problem can be worked around by having the U-Boot build
system package up the application binary into a U-Boot image (RPi's
kernel.img I've been working with above is one such example). U-Boot
already builds a stand-alone hello_world example (examples/standalone),
whose start address is defined in arch/arm/config.mk:<br />
<br />
<div style="background-color: #1e1e1e; color: #d4d4d4; font-family: 'Droid Sans Mono', 'Courier New', monospace, 'Droid Sans Fallback'; font-size: 14px; font-weight: normal; line-height: 19px; white-space: pre;">
<div>
<span style="color: #9cdcfe;">CONFIG_STANDALONE_LOAD_ADDR</span><span style="color: #d4d4d4;"> = 0xc100000</span></div>
</div>
<br />
I can verify this by reading the ELF produced by the build.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/u-boot/examples/standalone$ ${CROSS_COMPILE}readelf hello_world -h<br />...</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> Entry point address: <span style="background-color: yellow;">0xc100000</span></span><br />
<br />
I can copy the build output (hello_world.bin) to the SD card's /boot partition, and run it in U-Boot shell (note that my /boot partition on the SD card is partition #1 on device #0):<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> fatload mmc 0:1 </span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">0xc100000 hello_world.bin</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> go 0xc100000 Hello world</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">## Starting application at 0x0C100000 ...</span></span></span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">0xc100000</span></span></span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">arg[1] = Hello</span></span></span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">arg[2] = world</span></span></span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">arg[3] = "" </span></span></span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">## Application terminated, rc = 0x0</span></span></span></span><br />
<br />
This shows that a stand-alone application can run from the U-Boot shell, which is the prerequisite for controlling low-level peripherals. <br />
<h3>
Aside: cache problem</h3>
On the U-Boot main website's documentation page (as well as the README at the top of the U-Boot repo) there is some discussion about a cache coherence
problem, because U-Boot runs all stand-alone code with the cache disabled. Supposedly, the workaround is to use the "bootm" command, which requires coaxing the U-Boot build system into producing a U-Boot image. I can then copy the resulting file to the SD card.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">~/rpi/u-boot/examples/standalone$
../../tools/mkimage -A arm -O u-boot -T standalone -C none -a 0xc100000
-d hello_world.bin -v hello_world.img<br />Adding Image hello_world.bin<br />Image Name: <br />Created: Sat Oct 7 19:33:34 2017<br />Image Type: ARM U-Boot Standalone Program (uncompressed)<br />Data Size: 630 Bytes = 0.62 KiB = 0.00 MiB<br />Load Address: 0c100000<br />Entry Point: 0c100000</span><br />
<br />
But this did not work, because U-Boot rejects it as an invalid kernel image. <br />
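To see what bootm is actually being handed, it can help to decode the 64-byte legacy header that mkimage prepends to hello_world.bin. The sketch below is a host-side C helper under my own assumptions: the field offsets and the magic/type constants follow the legacy image header in U-Boot's include/image.h as I understand it, the names parse_legacy_header and struct legacy_image_info are hypothetical, and the header and data CRCs are deliberately not verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEGACY_IMAGE_MAGIC     0x27051956u  /* IH_MAGIC in U-Boot */
#define LEGACY_HEADER_SIZE     64u          /* 7 words + 4 bytes + 32-byte name */
#define LEGACY_TYPE_STANDALONE 1u           /* IH_TYPE_STANDALONE */
#define LEGACY_TYPE_KERNEL     2u           /* IH_TYPE_KERNEL */

/* Hypothetical container for the fields of interest. */
struct legacy_image_info {
    uint32_t size;   /* payload size in bytes */
    uint32_t load;   /* load address */
    uint32_t entry;  /* entry point */
    uint8_t  os, arch, type, comp;
};

/* All multi-byte header fields are stored big-endian. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 and fills *info if buf starts with a legacy header, else 0. */
int parse_legacy_header(const uint8_t *buf, size_t len,
                        struct legacy_image_info *info)
{
    assert(buf != NULL && info != NULL);

    if (len < LEGACY_HEADER_SIZE || be32(buf) != LEGACY_IMAGE_MAGIC)
        return 0;

    info->size  = be32(buf + 12);
    info->load  = be32(buf + 16);
    info->entry = be32(buf + 20);
    info->os    = buf[28];
    info->arch  = buf[29];
    info->type  = buf[30];
    info->comp  = buf[31];
    return 1;
}
```

For the image built above I would expect the type field to read as the standalone type rather than the kernel type, with load and entry both at 0x0c100000, matching what mkimage printed.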
<h2>
Modifying the hello_world example to toggle a GPIO line</h2>
Printing a string to console is nice, but controlling GPIO lines or SPI would be more practical for embedded applications. In an industrial application, an RPi system can signal that it is about to boot the Linux kernel by bouncing a GPIO line, or sending out a SPI/I2C packet. Let's say I want to toggle the RPi3 pin J8.15 right before booting the kernel. On the BCM2835/6/7 SoCs, GPIO22 drives that J8.15 pin, as the command "gpio readall" (available in out-of-the-box Raspbian) shows (I yellow-highlighted it for you):<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">pi@raspberrypi:~ $ gpio readall<br /> +-----+-----+---------+------+---+---Pi 3---+---+------+---------+-----+-----+<br /> | BCM | wPi | Name | Mode | V | Physical | V | Mode | Name | wPi | BCM |<br /> +-----+-----+---------+------+---+----++----+---+------+---------+-----+-----+<br /> | | | 3.3v | | | 1 || 2 | | | 5v | | |<br /> | 2 | 8 | SDA.1 | IN | 1 | 3 || 4 | | | 5v | | |<br /> | 3 | 9 | SCL.1 | IN | 1 | 5 || 6 | | | 0v | | |<br /> | <span style="color: red;"><span style="background-color: #999999;">4</span></span> | 7 | </span></span><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="background-color: #cccccc;"><i>JTAG_05</i></span></span></span> | IN | 1 | 7 || 8 | 1 | ALT5 | TxD | 15 | 14 |<br /> | | | 0v | | | 9 || 10 | 1 | ALT5 | RxD | 16 | 15 |<br /> | 17 | 0 | GPIO. 0 | IN | 0 | 11 || 12 | 0 | IN | GPIO. 
1 | 1 | 18 |<br /> | <span style="background-color: #cccccc;">27</span> | 2 | <span style="background-color: #cccccc;"><i>JTAG_07</i></span> | IN | 0 | <span style="background-color: #cccccc;">13</span> || 14 | | | 0v | | |<br /> | <span style="background-color: yellow;">22</span> | 3 | <span style="background-color: #cccccc;"><i>JTAG_03</i></span> | IN | 0 | <span style="background-color: yellow;">15</span> || <span style="background-color: #cccccc;">16</span> | 0 | IN | <span style="background-color: #cccccc;"><i>JTAG_11</i></span> | 4 | <span style="background-color: #cccccc;">23</span> |<br /> | | | 3.3v | | | 17 || <span style="background-color: #cccccc;">18</span> | 0 | IN | <span style="background-color: #cccccc;"><i>JTAG_13</i></span> | 5 | <span style="background-color: #cccccc;">24</span> |<br /> | 10 | 12 | MOSI | IN | 0 | 19 || 20 | | | 0v | | |<br /> | 9 | 13 | MISO | IN | 0 | 21 || <span style="background-color: #cccccc;">22</span> | 0 | IN | <span style="background-color: #cccccc;"><i>JTAG_09</i></span> | 6 | <span style="background-color: #cccccc;">25</span> |<br /> | 11 | 14 | SCLK | IN | 0 | 23 || 24 | 1 | IN | CE0 | 10 | 8 |<br /> | | | 0v | | | 25 || 26 | 1 | IN | CE1 | 11 | 7 |<br /> | 0 | 30 | SDA.0 | IN | 1 | 27 || 28 | 1 | IN | SCL.0 | 31 | 1 |<br /> | 5 | 21 | GPIO.21 | IN | 1 | 29 || 30 | | | 0v | | |<br /> | 6 | 22 | GPIO.22 | IN | 1 | 31 || 32 | 0 | IN | GPIO.26 | 26 | 12 |<br /> | 13 | 23 | GPIO.23 | IN | 0 | 33 || 34 | | | 0v | | |<br /> | 19 | 24 | GPIO.24 | IN | 0 | 35 || 36 | 0 | IN | GPIO.27 | 27 | 16 |<br /> | <span style="background-color: #cccccc;">26</span> | 25 | <span style="background-color: #cccccc;"><i>JTAG_05</i></span> | IN | 0 | <span style="background-color: #cccccc;">37</span> || 38 | 0 | IN | GPIO.28 | 28 | 20 |<br /> | | | 0v | | | 39 || 40 | 0 | IN | GPIO.29 | 29 | 21 |</span></span><br />
<br />
To bounce a specified GPIO line, I can modify the hello_world program (I could create another stand-alone project, but I am feeling lazy) like this:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">/* see BCM2835 peripheral datasheet p.92, downloadable from RPi website */<br />/* needs common.h and exports.h, as in the stock hello_world.c */<br />#define GPIO_ALT_FUNCTION_IN 0x0<br />#define GPIO_ALT_FUNCTION_OUT 0x1<br />#define GPIO_ALT_FUNCTION_0 0x4<br />#define GPIO_ALT_FUNCTION_1 0x5<br />#define GPIO_ALT_FUNCTION_2 0x6<br />#define GPIO_ALT_FUNCTION_3 0x7<br />#define GPIO_ALT_FUNCTION_4 0x3<br />#define GPIO_ALT_FUNCTION_5 0x2<br /><br />/* from dwelch67's bare metal project */<br />#define BCM2708_PERI_BASE 0x3f000000<br />#define GPIO_BASE (BCM2708_PERI_BASE + 0x200000)<br /><br />__attribute__((always_inline)) static inline void GPIOOutSet(int GPIO)<br />{<br /> volatile uint32_t* GPIO_SET = (volatile uint32_t*)(GPIO_BASE + 0x1C);<br /> /* GPSETn is write-only: write the bit, no read-modify-write */<br /> GPIO_SET[GPIO >= 32] = 1u << (GPIO & 0x1F);<br />}<br />__attribute__((always_inline)) static inline void GPIOOutClear(int GPIO)<br />{<br /> volatile uint32_t* GPIO_CLR = (volatile uint32_t*)(GPIO_BASE + 0x28);<br /> /* GPCLRn is write-only as well */<br /> GPIO_CLR[GPIO >= 32] = 1u << (GPIO & 0x1F);<br />}<br /><br />/* Have to forward declare because whatever text that appears at the beginning<br /> * is put at the entry point of the stand alone app.<br /> */<br />static void GPIOFunction(int GPIO, int functionCode);<br /><br />int hello_world (int argc, char * const argv[])<br />{<br /> app_startup(argv);<br /> if (argc < 2) /* no sub-command: don't dereference argv[1] */<br /> return 0;<br /> if (strcmp(argv[1], "gpio") == 0) {<br /> if (argc < 3) {<br /> printf("which pin?\n");<br /> return 0;<br /> }<br /> /* 1 or 2 decimal digits; simpler than sscanf */<br /> unsigned pin = argv[2][0] - '0';<br /> if (argv[2][1] != '\0')<br /> pin = 10 * pin + (argv[2][1] - '0');<br /> printf("%u\n", pin);<br /> GPIOFunction(pin, GPIO_ALT_FUNCTION_OUT);<br /> GPIOOutSet(pin); GPIOOutClear(pin);<br /> }<br /> return (0);<br />}<br /><br />/* division by repeated subtraction */<br />static unsigned _divide(unsigned n, unsigned d) {<br /> unsigned q = 0;<br /> while (n >= d) {<br /> ++q;<br /> n -= d;<br /> }<br /> return q;<br />}<br /><br />/* GPFSELn holds 10 pins per register, 3 bits per pin */<br />static void GPIOFunction(int GPIO, int functionCode)<br />{<br /> int registerIndex = _divide(GPIO, 10);<br /> int bit = (GPIO - 10 * registerIndex) * 3;<br /> volatile uint32_t* GPIO_FSEL = (volatile uint32_t*)(GPIO_BASE + 0);<br /> uint32_t oldValue = GPIO_FSEL[registerIndex];<br /> uint32_t mask = 0b111 << bit;<br /> GPIO_FSEL[registerIndex] = (oldValue & ~mask) /* don't touch others */<br /> | ((functionCode << bit) & mask);<br />}</span><br />
<br />
As you can see, it's a complete rewrite of the hello_world stand-alone application, but U-Boot's make system will still dutifully produce a hello_world.bin with the correct entry point:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~/rpi/u-boot$ ${CROSS_COMPILE}readelf examples/standalone/hello_world -h<br />...<br /> Entry point address: <span style="background-color: yellow;">0xc100000</span><br /> Start of program headers: 52 (bytes into file)</span><br />
<br />
I then copied hello_world.bin to the SD card's /boot and tried it out on the U-Boot command line:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">U-Boot> fatload mmc 0:1 </span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">0xc100000 hello_world.bin</span></span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">U-Boot> go </span></span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">0xc100000 gpio 22</span></span></span></span><br />
<br />
This is what I saw on the logic analyzer connected to the UART TX (J8.8) and J8.15:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4CyW5VG-51gfaYzUvv_a08Sc-EZ1sjE7FHNoQRBqlsNFhrFRVJnBa79xgAE7b7MQH-ameuXdLSfyMqDKHW-8brSQm9Xe4T5Z4Mej1omy0QQtWl7TszIfY_jlN1UCZBoQ18n27rGJWEDhp/s1600/GPIO22.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="167" data-original-width="440" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4CyW5VG-51gfaYzUvv_a08Sc-EZ1sjE7FHNoQRBqlsNFhrFRVJnBa79xgAE7b7MQH-ameuXdLSfyMqDKHW-8brSQm9Xe4T5Z4Mej1omy0QQtWl7TszIfY_jlN1UCZBoQ18n27rGJWEDhp/s1600/GPIO22.png" /></a></div>
UART TX is busy printing the printf statements, but the ~150 ns wide impulse is the GPIO bounce I wanted.<br />
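The function-select arithmetic in GPIOFunction is easy to get wrong, so here is a minimal host-side sketch for checking it away from the hardware. It is not part of the U-Boot build, and the helper names fsel_index, fsel_shift and fsel_update are made up for this illustration; only the register layout (10 pins per 32-bit register, 3 bits per pin) comes from the datasheet.<br />

```c
#include <stdint.h>

/* Hypothetical helpers (names invented for this sketch). They mirror the
 * arithmetic in GPIOFunction, but operate on a plain value instead of the
 * memory-mapped GPFSEL registers, so they can run on a host machine. */

/* Which GPFSELn register holds this pin: 10 pins per register. */
static unsigned fsel_index(unsigned gpio)
{
    return gpio / 10;
}

/* Bit offset of the pin's 3-bit field inside that register. */
static unsigned fsel_shift(unsigned gpio)
{
    return (gpio % 10) * 3;
}

/* New register value: replace only this pin's field, keep the others. */
static uint32_t fsel_update(uint32_t old_value, unsigned gpio, uint32_t code)
{
    unsigned shift = fsel_shift(gpio);
    uint32_t mask = UINT32_C(7) << shift;
    return (old_value & ~mask) | ((code << shift) & mask);
}
```

For GPIO22 this gives register 2 and bit offset 6, so selecting the output function (code 0x1) on an all-zero register yields 0x40. The stand-alone code above uses _divide instead of the / operator, presumably to avoid pulling in a division helper; on a host the plain operators are fine.<br />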
<h2>
Enabling JTAG debugging on RPi3</h2>
Now that I can control the GPIO peripheral before booting the kernel, another cool thing to do is turn on the JTAG. On p. 102 of the BCM2835 peripheral datasheet mentioned above, there is a table of GPIO alternative functions. In the ALT4 column, GPIO22~27 are the ARM JTAG pins (TDI is also available on GPIO4 as ALT5, which is what I use). I configure these pins for JTAG using code similar to the above.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">} else if (strcmp(argv[1], "jtag") == 0) {<br /> printf ("Remapping GPIO to JTAG ...\n");<br /> GPIOFunction(22, GPIO_ALT_FUNCTION_4);//TRST: JTAG 3<br /> GPIOFunction(4, GPIO_ALT_FUNCTION_5);//TDI : JTAG 5<br /> GPIOFunction(27, GPIO_ALT_FUNCTION_4);//TMS : JTAG 7<br /> GPIOFunction(25, GPIO_ALT_FUNCTION_4);//TCK : JTAG 9</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> //Don't need: RPi doesn't "return" CLK<br /> //GPIOFunction(23, GPIO_ALT_FUNCTION_4);//RTCK: JTAG 11<br /> GPIOFunction(24, GPIO_ALT_FUNCTION_4);//TDO : JTAG 13<br /> if (argc > 2 && strcmp(argv[2], "wait") == 0) {<br /> printf ("Waiting ...\n");<br /> while (!tstc());<br /> (void) getc();<br /> }<br />}</span><br />
<br />
Now I need to find these pins on the J8 header. I marked them with a gray highlighter above. I use a SEGGER J-Link, and the JTAG pin definitions are listed in the <i>20-pin J-Link connector</i> section of the J-Link manual. When I run JLinkGDBServer, I see this error:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">parallels@ubuntu:~$ JLinkGDBServer -device cortex-a7<br />...<br />------Target related settings------<br />Target device: cortex-a7<br />Target interface: JTAG<br />Target interface speed: 1000kHz<br />Target endian: little<br />...<br />Target voltage: 3.31 V<br />Listening on TCP/IP port 2331<br />Connecting to target...ERROR: Cortex-A/R-JTAG (connect): Could not determine address of core debug registers. Incorrect CoreSight ROM table in device?<br />ERROR: Could not connect to target.<br />Target connection failed. GDBServer will be closed...Restoring target state and closing J-Link connection...<br />Shutting down...<br />Could not connect to target.<br />Please check power, connection and settings.</span><br />
<br />
When I look at TDO (which is driven by the RPi), I see that the target is responding, as you can see here.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKVl42w2YI9UvU5x2wshkYu7oOt3gMeEgFZ1-8LnE3_2q1e3TQ4iW0-e6ShlqapYWq55TKUo5vCQAf9tglm_1IfQSTROFxCIcqKhu5Q3kMOOuNoteBvv5OEoZJJ01aSCTbA4hYYj6JiTfy/s1600/RPI+JTAG.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="299" data-original-width="616" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKVl42w2YI9UvU5x2wshkYu7oOt3gMeEgFZ1-8LnE3_2q1e3TQ4iW0-e6ShlqapYWq55TKUo5vCQAf9tglm_1IfQSTROFxCIcqKhu5Q3kMOOuNoteBvv5OEoZJJ01aSCTbA4hYYj6JiTfy/s1600/RPI+JTAG.png" /></a></div>
<h3>
JLinkExe fails to find core debug registers</h3>
JLinkExe can even get some information out of the target:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">Connecting to target via JTAG<br />TotalIRLen = 4, IRPrint = 0x01<br />JTAG chain detection found 1 devices:<br /> #0 Id: 0x4BA00477, IRLen: 04, CoreSight JTAG-DP<br />ARM AP[0]: 0x24770002, APB-AP<br />ROMTbl[0][0]: CompAddr: 80010000 CID: B105900D, PID:04-004BBD03 <br />ROMTbl[0][1]: CompAddr: 80011000 CID: B105900D, PID:04-004BB9D3 <br />ROMTbl[0][2]: CompAddr: 80012000 CID: B105900D, PID:04-004BBD03 <br />ROMTbl[0][3]: CompAddr: 80013000 CID: B105900D, PID:04-004BB9D3 <br />ROMTbl[0][4]: CompAddr: 80014000 CID: B105900D, PID:04-004BBD03 <br />ROMTbl[0][5]: CompAddr: 80015000 CID: B105900D, PID:04-004BB9D3 <br />ROMTbl[0][6]: CompAddr: 80016000 CID: B105900D, PID:04-004BBD03 <br />ROMTbl[0][7]: CompAddr: 80017000 CID: B105900D, PID:04-004BB9D3 <br />TotalIRLen = 4, IRPrint = 0x01<br />JTAG chain detection found 1 devices:<br /> #0 Id: 0x4BA00477, IRLen: 04, CoreSight JTAG-DP<br /><br />****** Error: Cortex-A/R-JTAG (connect): Could not determine address of core debug registers. Incorrect CoreSight ROM table in device?<br />TotalIRLen = 4, IRPrint = 0x01<br />JTAG chain detection found 1 devices:<br /> #0 Id: 0x4BA00477, IRLen: 04, CoreSight JTAG-DP<br />TotalIRLen = 4, IRPrint = 0x01<br />JTAG chain detection found 1 devices:<br /> #0 Id: 0x4BA00477, IRLen: 04, CoreSight JTAG-DP<br />Cannot connect to target.</span><br />
<br />
I then tried OpenOCD instead.<br />
<h3>
OpenOCD cannot connect either</h3>
OpenOCD does its own thing, but it also gets some response from the RPi.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ openocd -f /usr/share/openocd/scripts/interface/jlink.cfg -f ~/rpi/openocd/rpi3.cfg <br />Open On-Chip Debugger 0.9.0 (2015-09-02-10:42)<br />Licensed under GNU GPL v2<br />For bug reports, read<br /> http://openocd.org/doc/doxygen/bugs.html<br />adapter speed: 1000 kHz<br />adapter_nsrst_delay: 400<br />none separate<br />Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.<br />Info : J-Link ARM V8 compiled Nov 28 2014 13:44:46<br />Info : J-Link caps 0xb9ff7bbf<br />Info : J-Link hw version 80000<br />Info : J-Link hw type J-Link<br />Info : J-Link max mem block 9224<br />Info : J-Link configuration<br />Info : USB-Address: 0x0<br />Info : Kickstart power on JTAG-pin 19: 0xffffffff<br />Info : Vref = 3.306 TCK = 1 TDI = 0 TDO = 1 TMS = 0 SRST = 1 TRST = 1<br />Info : J-Link JTAG Interface ready<br />Info : clock speed 1000 kHz<br />Info : JTAG tap: rspi.arm tap/device found: 0x4ba00477 (mfg: 0x23b, part: 0xba00, ver: 0x4)<br />Warn : JTAG tap: rspi.arm UNEXPECTED: 0x4ba00477 (mfg: 0x23b, part: 0xba00, ver: 0x4)<br />Error: JTAG tap: rspi.arm expected 1 of 1: 0x07b7617f (mfg: 0x0bf, part: 0x7b76, ver: 0x0)<br />Error: Trying to use configured scan chain anyway...<br />Warn : Bypassing JTAG setup events due to errors<br />Error: 'arm11 target' JTAG error SCREG OUT 0x00<br />Error: unexpected ARM11 ID code</span><br />
<br />
It seems that the JTAG pin mapping works, but the client-side tools don't play well with the RPi3. I will come back to this after I hear back from SEGGER.</div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com3tag:blogger.com,1999:blog-4032020337247582619.post-62315510810500484132017-09-28T21:24:00.000-07:002017-10-10T07:24:22.492-07:00Aligning 2 AR device local frames through calibrationI spent the last 6 months self-studying computer vision and nonlinear estimation methods--to align the local frames of 2 AR devices in a peer-to-peer AR game. I am going to stop this phase of my self-study activities for a while, so I want to document what I've done so far.<br />
<h2>
Problem description</h2>
ARKit (or ARCore) will tell you the current pose (position and orientation) of your device in its local frame (the position is initialized to 0 where ARKit starts). Let's call your device A and your friend's device B, and imagine that you are trying to play an AR video game where you need to look at some objects and be presented with a marker that is spatially accurate. If your friend also runs ARKit on her phone, she has her own local frame; the two local frames are written as <span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">l</span><sub style="font-family: "Helvetica Neue"; text-align: center;">A</sub> and <span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">l</span><sub style="font-family: "Helvetica Neue"; text-align: center;">B</sub> in the schematic below.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib7nJhglufFlvbu7joXqDB9UNeBu04q2Oxfwf1ei4ALvIyfgFUFfWS3_nW7n4rOmwGAUxeXrwT5kqPTE_sSecqdUnyzv0Dj3HcivSdAFZp8Ae7JgeiTFJWEwiDmvAFH6yG-Uzv7PO1zPe_/s1600/Shootout+frame.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="599" data-original-width="827" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib7nJhglufFlvbu7joXqDB9UNeBu04q2Oxfwf1ei4ALvIyfgFUFfWS3_nW7n4rOmwGAUxeXrwT5kqPTE_sSecqdUnyzv0Dj3HcivSdAFZp8Ae7JgeiTFJWEwiDmvAFH6yG-Uzv7PO1zPe_/s1600/Shootout+frame.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The frames involved in aligning device A's and B's local frames. Note that the frames are left-handed because I write Unity games. I adopt the pose notation from <a href="https://smile.amazon.com/Principles-Inertial-Multisensor-Integrated-Navigation/dp/1608070050/ref=sr_1_2?ie=UTF8&qid=1506646121&sr=8-2&keywords=Paul+Groves">Paul Groves's seminal textbook on GNSS</a>, which contains a careful and thorough explanation of rigid body dynamics.</td></tr>
</tbody></table>
As you can see, <span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">l</span><sub style="font-family: "Helvetica Neue"; text-align: center;">A</sub> and <span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">l</span><sub style="font-family: "Helvetica Neue"; text-align: center;">B</sub> are not aligned: nobody knows what the relative position offset <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lAlB</sub> and rotation <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">C</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lB</sub> are. If we <i>could </i>estimate these 2 quantities (a Vector3 and a Quaternion for me), we could then align the 2 devices.<br />
<h2>
1st attempt: A and B face each other, with torch lit, and spin around each other</h2>
<h3>
The observable: blob angle <span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">(</span><span style="font-family: "stixgeneral"; font-size: 16px;">𝛂</span><span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">, </span><span style="font-family: "stixgeneral"; font-size: 16px;">𝛃</span><span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">) </span>as appearing on the camera image, divided by the focal length (camera intrinsic)</h3>
I first tried detecting another phone's torch as a blob with OpenCV's blob detector <a href="http://henryomd.blogspot.com/2017/02/recognizing-remote-phones-torch-in.html">last December</a>. Automatically adjusting the torch brightness based on the ambient scene intensity observed by the other phone is challenging enough (it requires network communication, which I solved with Apple's Multipeer Connectivity framework), but the atrocious granularity of the iOS torch brightness meant a lot of trial and error on my part. Once I got it neither too bright nor too dim, the torch was observable as a blob that is quite close to the origin of the remote phone's camera frame. In A's camera frame, B's torch is therefore at pose <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">A</sup><sub style="font-family: "Helvetica Neue"; text-align: center;">AB</sub><span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center;"> </span><span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">C</span><sup style="font-family: "Helvetica Neue"; text-align: center;">A</sup><sub style="font-family: "Helvetica Neue"; text-align: center;">B</sub>. This pose is observed on A's camera at pixel location p(x,y), shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhB9yivZQTRGIqMAz5Vw6Gmufds1DikIIWbs9k__tp0EIaVP_a7DEtVlQ5vcHNrx0D3CLg3l6SOpW9GM1A0DF-yn9sY52ExVrw0tKOHoRNsOySM0nJXY8IAtSUARdrOgDgS7XFNA_SqUJYX/s1600/TorchObservable.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="227" data-original-width="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhB9yivZQTRGIqMAz5Vw6Gmufds1DikIIWbs9k__tp0EIaVP_a7DEtVlQ5vcHNrx0D3CLg3l6SOpW9GM1A0DF-yn9sY52ExVrw0tKOHoRNsOySM0nJXY8IAtSUARdrOgDgS7XFNA_SqUJYX/s1600/TorchObservable.png" /></a></div>
I verified experimentally that the pixel origin is at the upper left corner. Along with the device pose, ARKit also reports the camera intrinsic matrix K (containing the horizontal/vertical focal length and the camera center--all in [pixel] unit), so that given the pixel location of the blob p(x,y), the <i>angle</i> from A's camera optical axis to the blob is calculated as <span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">(</span><span style="font-family: "stixgeneral"; font-size: 16px;">𝛂</span><span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">, </span><span style="font-family: "stixgeneral"; font-size: 16px;">𝛃</span><span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">)</span> = <span style="font-family: "helvetica neue"; font-size: 16px;">atan2(</span><span style="font-family: "helvetica neue"; font-size: 16px; text-decoration: underline;">p</span><span style="font-family: "helvetica neue"; font-size: 16px;"> - </span><span style="font-family: "helvetica neue"; font-size: 16px; text-decoration: underline;">c</span><span style="font-family: "helvetica neue"; font-size: 16px;">, <u>f</u></span><sub style="font-family: "Helvetica Neue";">z</sub><span style="font-family: "helvetica neue"; font-size: 16px;">), </span>where <span style="font-family: "helvetica neue"; font-size: 16px; text-decoration: underline;">p</span>, <span style="font-family: "helvetica neue"; font-size: 16px; text-decoration: underline;">c</span>, and <span style="font-family: "helvetica neue"; font-size: 16px;"><u>f</u></span><sub style="font-family: "Helvetica Neue";">z</sub> are all vector (2D) quantities.<br />
<h3>
The expected: blob angle <span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">(</span><span style="font-family: "stixgeneral"; font-size: 16px;">𝛂</span><span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">, </span><span style="font-family: "stixgeneral"; font-size: 16px;">𝛃</span><span style="font-family: "helvetica neue"; font-size: 16px; font-stretch: normal; line-height: normal;">) </span>according to the rigid body transformation</h3>
<div>
Using a series of rigid body transformations, I can express B's pose in A's camera frame, as developed below. Assuming that B's torch is at the origin of B's camera frame, I can take that into A's local frame <span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">l</span><sub style="font-family: "Helvetica Neue"; text-align: center;">A</sub> using the unknown quantities <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lAlB</sub> and <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">C</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lB</sub>: </div>
<div>
<div style="font-family: "Helvetica Neue"; font-size: 16px; font-stretch: normal; line-height: normal; text-align: center;">
<span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>l</sup>A<sub>lAB</sub> = <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">C</span><sup>l</sup>A<sub>lB</sub> <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>l</sup>B<sub>lBB</sub> + <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>l</sup>A<sub>lAlB</sub></div>
</div>
<div>
<br /></div>
<div>
Next, take that into the frame A:</div>
<div>
<div style="font-family: "Helvetica Neue"; font-size: 16px; font-stretch: normal; line-height: normal; text-align: center;">
<span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>A</sup><sub>AB </sub>= <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">C</span><sup>A</sup><sub>lA </sub>( <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>l</sup>A<sub>lAB</sub> − <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>l</sup>A<sub>lAA </sub>)</div>
</div>
<div>
<br /></div>
<div>
The expected observation is then</div>
<div>
<div style="font-family: "Helvetica Neue"; font-size: 16px; font-stretch: normal; line-height: normal; text-align: center;">
<span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">z</span><sub>AB </sub>= atan2(x,y of <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>A</sup><sub>AB</sub>, z of <span style="font-size: 20px; font-stretch: normal; line-height: normal; text-decoration: underline;">r</span><sup>A</sup><sub>AB</sub>)</div>
<h3>
Synchronizing A and B poses</h3>
</div>
<div>
Equations are generally cheap to write down, but implementing such an equation on a physical device usually takes a lot of care and effort. Note that the above equation mixes ARKit-reported poses and observations made on 2 different devices, which have different clocks. Even though networked modern phones run NTP, they will only be synchronized to within the order of a second. Here is the scheme I used to interpolate the ARKit reports between A and B.<br />
<br />
First, the records are tagged with local time (which is monotonically increasing), as shown below:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVgepcaDNl1ddh0LNR4hRhI3N-P3QPF-9eScTOryWM3OhgV_XobijFK6-sfjQ9aBfg2Fq625EM4k8uXI9NnrT-YeLo_nC50CzerEufmJlkdVMMtqmUDtRQGIYo24ARqZcLhBTXsSfiicQx/s1600/Shootout+time.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="227" data-original-width="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVgepcaDNl1ddh0LNR4hRhI3N-P3QPF-9eScTOryWM3OhgV_XobijFK6-sfjQ9aBfg2Fq625EM4k8uXI9NnrT-YeLo_nC50CzerEufmJlkdVMMtqmUDtRQGIYo24ARqZcLhBTXsSfiicQx/s1600/Shootout+time.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">𝚫clk is a state (parameter to estimate). In this example, I align B's record to A using the current estimate of 𝚫clk</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: left;">
Using the current estimate of 𝚫clk, I can map the other device's record to the local device. These records of course do NOT align, so I (linearly) interpolate as shown below:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjQ7ielfHABpSx3_IN5bXNzApg6SKsaEzMOz9nUe9oZq4O24TfO7UJwVMKem01iS2QdXENzkGudg8rKzF00xw1vtEcdJDgw3nBgGtlINUroFModYGXzGEE70mhJ24h0c6vUm7qbB1UNuSD/s1600/Shootout+time+interpolate.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="248" data-original-width="608" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjQ7ielfHABpSx3_IN5bXNzApg6SKsaEzMOz9nUe9oZq4O24TfO7UJwVMKem01iS2QdXENzkGudg8rKzF00xw1vtEcdJDgw3nBgGtlINUroFModYGXzGEE70mhJ24h0c6vUm7qbB1UNuSD/s1600/Shootout+time+interpolate.png" /></a></div>
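The mapping and interpolation can be sketched as follows. This is a simplified illustration, not my actual code: the Record struct and interpolate_at are hypothetical names, the sign convention (A's time = B's time + 𝚫clk) is an assumption, and only the position is interpolated here because orientation needs the quaternion care discussed further down.<br />

```c
#include <stddef.h>

/* Hypothetical record: a position tagged with the sender's local time. */
typedef struct {
    double t;          /* local (monotonic) clock of the sender, seconds */
    double x, y, z;    /* reported position */
} Record;

/* Map B's records onto A's clock using the current estimate of dclk
 * (assumed convention: t_A = t_B + dclk), then linearly interpolate B's
 * position at A's time t_a. Records must be sorted by time.
 * Returns 1 on success, 0 if t_a is not bracketed by two records. */
static int interpolate_at(const Record *b, size_t n, double dclk,
                          double t_a, double out[3])
{
    for (size_t i = 0; i + 1 < n; ++i) {
        double t0 = b[i].t + dclk;
        double t1 = b[i + 1].t + dclk;
        if (t1 > t0 && t_a >= t0 && t_a <= t1) {
            double u = (t_a - t0) / (t1 - t0);   /* 0 at t0, 1 at t1 */
            out[0] = b[i].x + u * (b[i + 1].x - b[i].x);
            out[1] = b[i].y + u * (b[i + 1].y - b[i].y);
            out[2] = b[i].z + u * (b[i + 1].z - b[i].z);
            return 1;
        }
    }
    return 0;
}
```

Refusing to extrapolate outside the bracketing records is a deliberate choice in this sketch: an observation that cannot be matched is simply dropped.<br />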
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
</div>
<h3>
Caution: inverse of <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lAlB</sub> is NOT −<span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">B</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lBlA</sub></h3>
<div>
From B's perspective, the above equation is completely valid if I flip the A and B. But when calculating <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">B</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lBlA</sub>, do NOT assume (as I did at first) that it is the same as −<span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lAlB</sub>. Chasing this bug for a few days drilled into my obtuse brain the absolute necessity of keeping track of the resolving reference frame: when I change the resolving frame, I need to apply the rotation between the 2 different resolving frames!</div>
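To make the caution concrete, here is a minimal C sketch of the reversed offset. Vec3, Mat3 and reverse_offset are hypothetical names for this illustration; the point is only that the vector has to be rotated into B's local frame (using the transpose of the rotation) in addition to being negated.<br />

```c
typedef struct { double x, y, z; } Vec3;
typedef struct { double m[3][3]; } Mat3;   /* row-major rotation matrix */

/* Hypothetical sketch: given the rotation C_lA_lB and the offset of lB
 * from lA resolved in lA, return the offset of lA from lB resolved in lB:
 *     r_lB = -(C_lA_lB)^T * r_lA
 * The transpose is the rotation from lA to lB. */
static Vec3 reverse_offset(Mat3 C_lA_lB, Vec3 r_lA_lAlB)
{
    const double (*m)[3] = C_lA_lB.m;
    Vec3 v = r_lA_lAlB;
    Vec3 r;
    /* Multiply by the transpose: column i of m dotted with v. */
    r.x = -(m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z);
    r.y = -(m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z);
    r.z = -(m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z);
    return r;
}
```

For example, with a 90 degree rotation about the vertical axis and an offset of (1, 0, 0), the reversed offset comes out along the z axis rather than as (-1, 0, 0), which is exactly the bug I chased.<br />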
<h3>
Least squares? Not just yet! Initialize first</h3>
<div>
As soon as I have a bunch of observations and expected observations for some unknown parameters, I immediately think of the method of least squares, which I explained in a <a href="http://henryomd.blogspot.com/2017/08/understanding-least-squares-in-pictures.html">previous blog entry</a>. To tempt one even more into this line of thinking, the remote phone's blob reminds you strongly of the astronomical observations that motivated Gauss to invent the method of least squares in the first place! The problem is that general rigid body transformation is a crazy nonlinear problem, and divergence is highly likely when using a linear approximation (the Gauss-Newton method). Luckily, when the 2 devices are facing each other, there is a strong constraint: the <i>average</i> orientation of the other device is quite close to being 180° around vertical, and the devices are somewhere between 2 and 4 m apart (too close or too far, and the blob detector fails). Using this constraint, you can solve for <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lAlB</sub> and <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">C</span><sup style="font-family: "Helvetica Neue"; text-align: center;">l</sup><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">A</span><sub style="font-family: "Helvetica Neue"; text-align: center;">lB</sub> that will yield a good initial guess of <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">A</sup><sub style="font-family: "Helvetica Neue"; text-align: center;">AB</sub><span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center;"> </span>and <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">C</span><sup style="font-family: "Helvetica Neue"; text-align: center;">A</sup><sub style="font-family: "Helvetica Neue"; text-align: center;">B</sub>, as shown by the A and B observation residuals below:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAA2iAAAAJDI5YTEwOWU0LWU4ZTgtNDM0OC04NjM1LTVjOTJmZGQ1NjIyNA.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="287" data-original-width="800" height="229" src="https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAA2iAAAAJDI5YTEwOWU0LWU4ZTgtNDM0OC04NjM1LTVjOTJmZGQ1NjIyNA.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">This is actually the initial residual for the 2nd approach (discussed below) when A and B are not moving rapidly in space. Can you spot the unobservability of the problem from these residuals?<br />
<br /></td></tr>
</tbody></table>
<div>
I blew by this rather fast, but there is a huge gotcha you won't find discussed in many places. When averaging or interpolating quaternions (Unity uses quaternions rather than rotation matrices), you can't just do a SLERP (spherical linear interpolation), because of the "quaternion jumps": a quaternion written out as [x, y, z, w] (or in the other order, as is often done) represents exactly the same rotation as [-x, -y, -z, -w]. So two quaternions that are in fact almost the same rotation may look like very different numbers. And you say: "Ah, I see why they use SLERP, which will use the angle-axis representation to interpolate between two angles"--and you would be wrong, because the angle-axis representation suffers from the same sign ambiguity (rotating by an angle about an axis is the same as rotating by the negated angle about the flipped axis). What I wound up doing is:<br />
<br />
<ol>
<li>Generate the 2 different representations of the quaternion dQ for the <i>difference </i>between those 2 quaternions (call them dQ1 and dQ2),</li>
<li>Calculate the angle axis of those 2 possible rotations</li>
<li>Choose the angle axis with smaller magnitude</li>
<li>Then apply the interpolation and average</li>
</ol>
<div>
I actually ran into the "Q is the same as -Q" problem a year ago, but did not fully understand the problem or devise a solution until this hobby project.</div>
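<div>
The 4 steps above can be sketched in plain Python as follows. This is only an illustration, not the Unity code I actually ran: the [x, y, z, w] ordering, the helper names (q_mul, shortest_delta, interpolate) and the convention q2 = q1 * dQ are all assumptions made for the sketch.</div>

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions stored as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )

def q_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    x, y, z, w = q
    return (-x, -y, -z, w)

def angle_axis(q):
    """Angle (0 to 2*pi) and unit axis of a unit quaternion."""
    x, y, z, w = q
    s = math.sqrt(x * x + y * y + z * z)
    angle = 2.0 * math.atan2(s, w)
    axis = (x / s, y / s, z / s) if s > 1e-12 else (1.0, 0.0, 0.0)
    return angle, axis

def shortest_delta(q1, q2):
    """Steps 1 to 3: both signs of dQ, their angle-axis, keep the smaller angle."""
    dq = q_mul(q_conj(q1), q2)              # assumed convention: q2 = q1 * dQ
    candidates = [dq, tuple(-c for c in dq)]
    return min((angle_axis(c) for c in candidates), key=lambda aa: aa[0])

def interpolate(q1, q2, t):
    """Step 4: rotate q1 by the fraction t of the shorter rotation."""
    angle, (ax, ay, az) = shortest_delta(q1, q2)
    half = 0.5 * t * angle
    s = math.sin(half)
    return q_mul(q1, (ax * s, ay * s, az * s, math.cos(half)))
```

<div>
Calling interpolate(q1, q2, 0.5) gives the midpoint rotation regardless of which sign each input quaternion happens to carry; an average over more than 2 samples needs more care than this sketch shows.</div>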
<br />
<h3>
Gauss Newton iterations</h3>
</div>
<div>
Among nonlinear least squares methods, Gauss-Newton is the simplest: it just requires the Jacobian, which is the partial derivative of the residual WRT each of the parameters to estimate. Since I have multiple observations (hopefully many more than the parameters to estimate), the Jacobian is a "tall" matrix. As you can imagine when you stare at the derivation of the expected observation, analytical derivation of the Jacobian is difficult, but fortunately, estimating the Jacobian numerically usually works quite well. If the i'th observation <u>Z</u>i = <b><i>f</i></b>(<u>X</u>), where <u>X</u> = [x1, ..., xn], the i'th row of the Jacobian J is the row vector</div>
<div style="text-align: center;">
Ji = [ {<b><i>f</i></b>([x1 + ∂x1, ..., xn]) − <b><i>f</i></b>([x1, ..., xn])} / ∂x1, ..., {<b><i>f</i></b>([x1, ..., xn + ∂xn]) − <b><i>f</i></b>([x1, ..., xn])} / ∂xn ]</div>
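<div>
The forward-difference row Ji above can be sketched as below. The function name, the step size and the toy model are hypothetical, chosen only to exercise the idea; the real observation model goes through several coordinate frames.</div>

```python
def numerical_jacobian(f, x, step=1e-6):
    """Forward-difference Jacobian of f at x.

    f maps a list of n parameters to a list of m expected observations;
    the result is an m x n list of lists (one row per observation).
    """
    f0 = f(x)
    jac = [[0.0] * len(x) for _ in f0]
    for j in range(len(x)):
        xp = list(x)
        xp[j] += step                      # perturb one parameter at a time
        fj = f(xp)
        for i in range(len(f0)):
            jac[i][j] = (fj[i] - f0[i]) / step
    return jac

def toy_model(x):
    """Hypothetical observation model: 3 observations of 2 parameters."""
    return [x[0] ** 2, x[0] * x[1], x[1]]

J = numerical_jacobian(toy_model, [1.0, 2.0])   # a "tall" 3 x 2 matrix
```

<div>
The step size is a trade-off: too large and the linearization error dominates, too small and floating point round-off does.</div>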
<div>
<br /></div>
<div>
Where I think analytical derivation of the partial derivatives can be helpful is in obtaining the Hessian (second derivative) necessary for Newton's method, which can converge <i>much</i> faster than Gauss-Newton (at the risk of some ringing).</div>
<div>
<br /></div>
<div>
The Gauss-Newton correction is the least squares solution of the linearized system J <u>𝛅X</u> = 𝜺<u>Z</u> (whose normal equation is J' J <u>𝛅X</u> = J' 𝜺<u>Z</u>, with J' being J transposed), which is J* (the pseudo inverse of J) multiplied by the residual 𝜺<u>Z</u>:</div>
<div style="text-align: center;">
<u>𝛅X</u> = J* 𝜺<u>Z</u></div>
<div>
which is also the minimum norm solution (that is, ⎮<u>𝛅X</u>⎮ is the minimum among all corrections that minimize ⎮J <u>𝛅X</u> − 𝜺<u>Z</u>⎮). When J is rank deficient, Matlab's backslash operator (\) instead yields a <i>sparse</i> (basic) solution, which has the largest number of zeros--not quite the same thing!</div>
<h3>
Boldly solving unobservable problem---heuristically</h3>
<div>
If you have an intuition about this sort of geometry problem, you will have caught on to the fact that this is an unobservable problem: for a given observation, rotation cannot be teased apart from translation. Try it for yourself: imagine translating B relative to A vs. rotating B about the vertical: A's blob will move in a similar way! Observability is only restored when you have a lot of observations at different distances, because the distance acts as a lever arm. Unfortunately, it is quite difficult to ensure a sufficient distribution of distances during calibration.</div>
<div>
<br /></div>
<div>
Another problem is the weak observability of translation (change in <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">r</span><sup style="font-family: "Helvetica Neue"; text-align: center;">A</sup><sub style="font-family: "Helvetica Neue"; text-align: center;">AB</sub>) relative to rotation (change in <span style="font-family: "helvetica neue"; font-size: 20px; font-stretch: normal; line-height: normal; text-align: center; text-decoration: underline;">C</span><sup style="font-family: "Helvetica Neue"; text-align: center;">A</sup><sub style="font-family: "Helvetica Neue"; text-align: center;">B</sub>): except when the devices are quite close, the distance acting as the lever arm amplifies the effect of a change in rotation, so that even with a minimum norm solution, practically only the rotation is corrected. As I read in <a href="https://smile.amazon.com/Least-Squares-Data-Fitting-Applications/dp/1421407868/ref=sr_1_1?ie=UTF8&qid=1506653395&sr=8-1&keywords=least+squares+application">this textbook on least squares</a>, this situation is distantly similar to having a large condition number (the ratio between the greatest and the least singular values of a matrix J), and a standard technique for dealing with this situation is Tikhonov regularization, which is to minimize</div>
<div style="text-align: center;">
⎮J <u>𝛅X</u> <span style="text-align: center;">− </span> 𝜺<u>Z</u>⎮<sup>2</sup> + 𝛌 ⎮<u>𝛅X</u>⎮<sup>2</sup></div>
<div>
rather than the original cost function ⎮J <u>𝛅X</u> <span style="text-align: center;">− </span> 𝜺<u>Z</u>⎮<sup>2</sup>. The intuition is to artificially boost all the singular values: for the large singular values, the artificial boost is negligible, while the weak singular values are "saved" by the boost. But to address the rotation drowning out translation, I used <i>unequal</i> boosting for rotation vs. translation, so that my regularized cost function is</div>
<div>
<div style="text-align: center;">
⎮J <u>𝛅X</u> − 𝜺<u>Z</u>⎮<sup>2</sup> + <u>𝛅X</u>' 𝚲 <u>𝛅X</u></div>
</div>
<div>
where <span style="text-align: center;">𝚲</span> is a diagonal weight matrix. The solution to this regularized problem is</div>
<div style="text-align: center;">
<u style="text-align: center;">𝛅X</u><span style="text-align: center;"> = </span>(J' J + <span style="text-align: center;">𝚲</span>)* J' <span style="text-align: center;">𝜺</span><u style="text-align: center;">Z</u><span style="text-align: center;">, where ' denotes the transpose (J' is J transposed).</span></div>
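<div>
A minimal sketch of the regularized correction (J' J + 𝚲)* J' 𝜺Z, in plain Python. It assumes that every diagonal weight is positive, in which case the pseudo inverse is an ordinary inverse and a small Gaussian elimination is enough. The function names and the toy Jacobian are hypothetical, not my actual solver.</div>

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def regularized_step(jac, residual, weights):
    """Return dX = (J'J + diag(weights))^-1 J' residual."""
    n = len(jac[0])
    jtj = [[sum(row[i] * row[k] for row in jac) for k in range(n)]
           for i in range(n)]
    jtr = [sum(row[i] * r for row, r in zip(jac, residual)) for i in range(n)]
    for i in range(n):
        jtj[i][i] += weights[i]         # unequal boost per parameter
    return solve_linear(jtj, jtr)

# Toy case: 2 parameters that the observations cannot tell apart.
J = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
r = [2.0, 2.0, 2.0]
equal = regularized_step(J, r, [0.1, 0.1])      # correction is shared
unequal = regularized_step(J, r, [0.1, 10.0])   # mostly goes to parameter 0
```

<div>
With equal weights the indistinguishable parameters share the correction; a heavier weight on one parameter pushes almost all of the correction onto the other, which is the lever I was pulling for rotation vs. translation.</div>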
<div>
<span style="text-align: center;"><br /></span></div>
<div>
<span style="text-align: center;">Even this technique was not enough, so I then tried alternating the corrections: </span><i style="text-align: center;">only</i><span style="text-align: center;"> the rotation at one correction step, and only the translation at the </span><i style="text-align: center;">other</i><span style="text-align: center;">.</span></div>
<h3>
<span style="text-align: center;">Torpedoed by the SLAM's static scenery assumption</span></h3>
<div>
<span style="text-align: center;">The result of all this was quite </span>unsatisfactory when the 2 devices spun around each other, because a rather large object (a person) that appears to stay put in the middle of the scenery while the device is rotating confuses ARKit, as you can see here:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://media.licdn.com/mpr/mpr/AAIA_wDGAAAAAQAAAAAAAAszAAAAJGZhNzIzNzExLTk5MmQtNDNiNy1hZGU1LWI1ZWIxZDFmZWFlNA.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="608" data-original-width="503" src="https://media.licdn.com/mpr/mpr/AAIA_wDGAAAAAQAAAAAAAAszAAAAJGZhNzIzNzExLTk5MmQtNDNiNy1hZGU1LWI1ZWIxZDFmZWFlNA.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Device A is spinning around a coffee table, so the yaw angle (top) is monotonically increasing. But ARKit is quite confused about A's position (the trajectory should sketch out a circle!), because my wife (B) is apparently static in the scenery. Even though it does the best it can, the residuals are huge at 2 m inter-device distance.</td></tr>
</tbody></table>
<h2>
2nd attempt: just tip and rotate the phone in hand while sitting still</h2>
<div>
After realizing the root cause of the horrendous residual, I tried just sitting still and rotating the device gently in my hand, so the observed blob traces out a rectangle on the camera image. Then I was able to reduce the residual down to an acceptable level (~0.03 radians). The problem was that moving the phones so gently required some practice, and it took me 1~2 minutes to gather enough quality observations to throw into the least squares solver derived above.</div>
<div>
<br /></div>
<div>
Since my target application is a consumer facing video game, I reasoned that this is impractical, and looked for...</div>
<h2>
3rd attempt: A and B face the <i>same</i> direction, and share features</h2>
<h3>
Feature detect</h3>
<div>
A surer way to NOT confuse ARKit is to clear away any moving object from ARKit's scenery--for example by having both A and B face the same direction. Of course, then I have no blob to observe. What I can observe instead are the feature points that are commonly used in feature tracking algorithms; OpenCV packs no fewer than a dozen different such algorithms: FAST, SIFT, SURF, ORB, just to name a few. In fact, ARKit itself (like most SLAM methods) uses such feature detectors; ARKit even emits the <i>position</i> (but not the descriptors) of those features as point clouds. I thought about aligning the point clouds from A and B by minimizing the Euclidean distance between those points, but realized it would be computationally intractable for anything but trivially small offsets between A and B. So I set about running a feature detector myself (outside of ARKit).</div>
<h3>
Match features</h3>
<div>
I borrowed heavily from the stereo correspondence problem, as explained in Hartley and Zisserman: when situated pretty close together and facing the same direction, A's and B's images are like the left and the right images of a stereo camera. With a small effort, I can pick up a lot of features that should appear in both A's and B's images. If B sends the top 200 such feature points (with descriptors) to A, then A can run OpenCV's DescriptorMatcher--even a brute force matcher is OK--to pick out candidate matches.</div>
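<div>
To show what a brute force matcher boils down to, here is a plain Python sketch. It assumes binary descriptors (ORB-like) stored as integers and compared by Hamming distance, and it keeps only mutual nearest neighbours, similar in spirit to the cross-check option of OpenCV's brute force matcher. The function names and the distance threshold are hypothetical; in practice OpenCV does this for you.</div>

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def brute_force_match(desc_a, desc_b, max_distance=64):
    """Cross-checked brute force matching.

    Returns (index_in_a, index_in_b, distance) for every pair that are
    mutual nearest neighbours and no farther apart than max_distance.
    """
    if not desc_a or not desc_b:
        return []
    # Nearest neighbour in B for every descriptor of A, and vice versa.
    best_ab = [min(range(len(desc_b)), key=lambda j: hamming(a, desc_b[j]))
               for a in desc_a]
    best_ba = [min(range(len(desc_a)), key=lambda i: hamming(desc_a[i], b))
               for b in desc_b]
    matches = []
    for i, j in enumerate(best_ab):
        d = hamming(desc_a[i], desc_b[j])
        if best_ba[j] == i and d <= max_distance:
            matches.append((i, j, d))
    return matches

# Tiny 8-bit descriptors, just to exercise the function.
features_a = [0b11110000, 0b00001111, 0b10101010]
features_b = [0b00001110, 0b11110001]
candidates = brute_force_match(features_a, features_b)
```

<div>
The cost grows with the product of the 2 feature counts, which is why capping the list at a couple of hundred features keeps even the brute force approach cheap.</div>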
<h3>
Solve for the fundamental matrix</h3>
<div>
Such matches are corrupted by outliers--"crazy" matches--because the matchers really don't have much context to work with (certainly much less than the human vision system). So when solving the stereo correspondence problem, we need to use a robust (immune to outliers) algorithm such as RANSAC (random sample consensus). [Some people have reported better results with PROSAC--a derivative of RANSAC.] Once again, OpenCV does the heavy lifting in the findFundamentalMat() function, which takes 2 sets of matched feature points. Note that the solution F is in [pixel] units.</div>
<h3>
Recover the essential matrix</h3>
<div>
The intrinsic matrix K from ARKit is then applied to F to obtain E:</div>
<div style="text-align: center;">
<span style="font-size: large;"><span style="font-family: "helveticaneue";">E = K</span><sup style="font-family: HelveticaNeue;">T</sup><span style="font-family: "helveticaneue";"> F K</span></span></div>
<div>
E encodes the relative translation and rotation: E = T^ R, where R is the rotation matrix and T^ is the skew-symmetric matrix of the translation (tx, ty, tz):</div>
<div>
<pre><code><span style="font-size: large;">       0  -tz   ty
T^ =  tz    0  -tx
     -ty   tx    0 </span></code></pre>
</div>
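<div>
A quick numerical sanity check of E = T^ R in plain Python: for a 3D point seen by both cameras, the 2 viewing rays must satisfy x2' E x1 = 0. The convention p2 = R p1 + t (the point expressed in the 2nd camera's frame), the helper names and the numbers are assumptions made for this sketch.</div>

```python
import math

def skew(t):
    """Skew-symmetric matrix such that skew(t) v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_y(angle):
    """Rotation about the vertical (y) axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def epipolar_residual(E, x1, x2):
    """x2' E x1, which vanishes for a correct correspondence."""
    ex1 = mat_vec(E, x1)
    return sum(x2[i] * ex1[i] for i in range(3))

R = rot_y(math.radians(5.0))        # hypothetical relative rotation
t = [0.3, 0.0, 0.05]                # hypothetical relative translation
E = mat_mul(skew(t), R)

p1 = [0.4, -0.2, 3.0]                                   # point in camera 1's frame
p2 = [a + b for a, b in zip(mat_vec(R, p1), t)]         # same point in camera 2's frame
residual = epipolar_residual(E, p1, p2)                 # zero up to round-off
```

<div>
Scaling t by any factor leaves the residual at zero, which is one way to see that the scale of the translation cannot be recovered from the images alone.</div>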
<div>
When I decompose E with SVD, i.e.</div>
<div style="text-align: center;">
<span style="font-size: large;">E = <span style="font-family: "helveticaneue";">U Σ V</span><sup style="font-family: HelveticaNeue;">T</sup></span></div>
<div>
For an ideal E, whose singular values are (s, s, 0), there are 2 possible factorizations to consider:</div>
<div style="text-align: center;">
<span style="font-size: large;"><span style="font-family: "helveticaneue";">(T^, R) = (U W Σ U</span><sup style="font-family: HelveticaNeue;">T</sup><span style="font-family: "helveticaneue";">, U W</span><sup style="font-family: HelveticaNeue;">T</sup><span style="font-family: "helveticaneue";"> V</span><sup style="font-family: HelveticaNeue;">T</sup><span style="font-family: "helveticaneue";">)</span> or</span><br />
<span style="font-size: large;"><span style="font-family: "helveticaneue";">(T^, R) = (U W</span><sup style="font-family: HelveticaNeue;">T</sup><span style="font-family: "helveticaneue";"> Σ U</span><sup style="font-family: HelveticaNeue;">T</sup><span style="font-family: "helveticaneue";">, U W V</span><sup style="font-family: HelveticaNeue;">T</sup><span style="font-family: "helveticaneue";">)</span></span></div>
<div>
where</div>
<div>
<pre><code><span style="font-size: large;">     0  -1   0
W =  1   0   0
     0   0   1</span></code></pre>
<pre><code><span style="font-size: large;"> </span></code></pre>
<pre style="font-size: 12px;"><span style="font-family: inherit;"><code><span style="font-size: small; white-space: normal;">The solution that puts more features <i>in front</i> (z > 0) of the camera should be chosen.</span></code></span></pre>
<pre><span style="font-family: -webkit-standard;"><span style="font-family: inherit;"><span style="white-space: normal;">Note that the resulting translation is </span><i style="white-space: normal;">still</i></span><span style="white-space: normal;"><span style="font-family: inherit;"> not in [m]: it is known only up to an unknown scale factor. This is because projective geometry alone cannot resolve the scale; in VIO (visual inertial odometry, of which ARKit is an implementation), the inertial sensor is brought in to establish scale. This suggests that if I run the above algorithm <i>twice</i> with ARKit turned on, I can use the change in position to recover the scale factor</span>.</span></span></pre>
<h3>
Unity implementation</h3>
</div>
<div>
<span style="font-family: Arial,Helvetica,sans-serif;"><span style="font-size: small;"><code><span style="white-space: normal;">Incomplete--I did not do this because...</span></code></span></span></div>
<h2>
Why I am stopping now</h2>
<div>
My gut feeling is that users will not, or cannot, perform any calibration lasting more than 10 seconds: the compliance will be low, and they can't do it well, even if they want to. So even the stereo correspondence method is problematic. I think the right solution is to <i>always</i> share the high quality features between devices and build a joint map. This is what Hololens does already, but it is just too much effort for me to take on (I would have to really learn the details of the "M" in SLAM) while I have low level projects (<a href="https://www.blogger.com/post-preview-auth.g?postID=6911956533698164350&blogID=4032020337247582619">power electronics</a> and a <a href="http://henryomd.blogspot.com/2017/05/dual-system-architecture-for-raspberry.html">hard real-time AMP architecture for Raspberry Pi</a>) that I've neglected for too long.</div>
<div>
<br /></div>
<div>
I'll revisit this "Shootout" project when there is an AR engine that shares a global map in a manner completely transparent to the application developer.</div>
<div>
<code><span style="font-size: small; white-space: normal;"><br /></span></code></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-23013857633078000262017-09-17T11:50:00.001-07:002017-09-17T11:50:47.003-07:00Caught between (ground) truth and the (LS) fitSince my <a href="http://henryomd.blogspot.com/2017/09/path-to-multiplayer-ar-game.html">last blog post</a>, I've been trying to apply the method of least squares (LS) to the problem of aligning the local frames of 2 different phones running an ARKit based UI (in fact, a Unity game). Along the way, I had to understand the differences between various nonlinear LS methods: the simplest of them all is the Gauss-Newton method, which requires just a Jacobian (partial differential matrix of the observable as a function of the state elements), but can't deal with losing a rank in the Jacobian. Also, for some random values of initial state guesses, the iteration jumps around wildly, which is a bit unsettling. Originally, I tried to symbolically write out the Jacobian terms, but because the geometry involves 5 different frames, taking the derivative of so many Quaternion laden terms fills one entire page--for just 1 term. So I settled on a numerical method instead. Given the considerable difficulty with even the single partial differential, I wasn't going to try my luck with double partial differential necessary for the Hessian--which is necessary for Newton's method.<br />
<br />
But what really sucked about my problem is the lack of observability--because the state has both translation and rotation which are indistinguishable/weakly distinguishable, the iterations can "wander" in the infinite space of solutions and not converge. Upon further reading, I learned about the difference between the maximum sparse solution of LS (Matlab's "\" operator) and the minimum norm solution (using the pseudo inverse solution). And to deal with a weak condition number case, I used Tikhonov regularization. I have yet to come across any discussion of observability in least squares context, so I don't know whether to be pleased or disappointed at the result I got<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4Bab8BNywoTf3fcJvuOMqWTq1OYFyC2abQKtun3O8qKeS9WHIiXRbrpStYWCCMx55V2VIqBpOC_RtfTE0GceX66kJ8AoyOLKSw-xtEBLj9DK963bdAM0ODf-VLHTNv3OSBil-8LrKvZhU/s1600/LS1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="700" data-original-width="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4Bab8BNywoTf3fcJvuOMqWTq1OYFyC2abQKtun3O8qKeS9WHIiXRbrpStYWCCMx55V2VIqBpOC_RtfTE0GceX66kJ8AoyOLKSw-xtEBLj9DK963bdAM0ODf-VLHTNv3OSBil-8LrKvZhU/s1600/LS1.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The solution of 30 iterations of (Tikhonov) regularized LS, with the 6 DOF between the 2 devices' local frames as the states, and the other phone's blob location on the camera image as the observable. (I wrote about <a href="http://henryomd.blogspot.com/2017/02/recognizing-remote-phones-torch-in.html">detecting a remote phone's torch using OpenCV</a> last year.) On the top, the blue circles are my phone's pose within its own local frame, while the red "x" and the dotted lines are my wife's phone's pose at the observation instant. The black "x" and attached lines are my wife's phone's pose in MY camera's frame. Since the 2 phones are merely rotating, the black "x" should be at a fixed Z distance of about 2 m, which is clearly not what the LS solution settled on.</td></tr>
</tbody></table>
I think I should be disappointed because this solution settled on a wrong distance between the 2 devices: the ground truth is approximately 2 m, but as you can see above, the LS settled on around 6 m. But it's not the LS that I should be disappointed in: it's my fault for not thinking through the observability problem up front. What can I say? I learn by failing. But I do believe the smart researchers can better warn students about these practical problems. Maybe they just don't have the energy to write about these practical considerations after painstakingly deriving all those equations. That's why I appreciate <i><a href="https://www.linkedin.com/pulse/book-review-topics-astrodynamics-henry-choi/">Topics in Astrodynamics</a> </i>(I was originally motivated to study least squares when I learned about the differential correction method (AKA batch filter), which is well explained in Chapter 15 of <i>Topics in Astrodynamics</i>)<i> </i>even more, because in that book, there is at least a mention about the need for better model when the residual is unacceptably large.Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-18237449576761737032017-09-05T19:45:00.001-07:002017-09-07T19:30:49.455-07:00Path to a multiplayer AR gameFor slightly less than a year, I've been working on a hobby project to create a peer-to-peer AR game on an iPhone. I first looked into indoor localization using audible chirps (linear frequency modulation, in technical terms). I worked out <a href="http://henryomd.blogspot.com/2017/03/signal-processing-audio-chirp-in-matlab.html">the signal processing</a>, borrowing heavily from radar fundamentals, and wrote about it in <a href="http://henryomd.blogspot.com/2017/02/understanding-pulse-compression.html">a blog entry</a>. 
But when I wrote a CoreAudio based program on my phone, <a href="http://henryomd.blogspot.com/2017/03/audio-chirp-processing-problems-on-os-x.html">I found out</a> in March of this year (about 6 months into the effort) that the received audio signals were heavily distorted and attenuated, so that the signals were quite weak. I then pivoted toward a SLAM based approach, and have been trying to solve the problem with the device pose estimate from ARKit, as you can see in the picture of the coordinate frames and the observable in my problem definition.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPa5mHB0cGJkAfNU2Sl0xW7JQcijFj39BySTp5HYCmzLYiC0kLcterpQYdzBzLEOQ4iZUhSEtb3rq4xGUWgV-q9f70LwbNfrjtT63KV0hZal9Xlrl5famACVwyjkL4AFY4FDii-sipZild/s1600/Shootout+frame.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="599" data-original-width="827" height="463" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPa5mHB0cGJkAfNU2Sl0xW7JQcijFj39BySTp5HYCmzLYiC0kLcterpQYdzBzLEOQ4iZUhSEtb3rq4xGUWgV-q9f70LwbNfrjtT63KV0hZal9Xlrl5famACVwyjkL4AFY4FDii-sipZild/s640/Shootout+frame.png" width="640" /></a></div>
Just last week, Google released ARCore to keep up with Apple. Last night, I read about it in <a href="https://medium.com/super-ventures-blog/how-is-arcore-better-than-arkit-5223e6b3e79d">this review</a> by Matt Miesnieks.<br />
<br />
I was impressed by Mr. Miesnieks' technical depth and breadth in this area. His comments about the difficulty of multi-player AR struck me as spot on, and would normally have disheartened me. But thankfully, I reached a mini-milestone: I collected 2 minutes of calibration session data from my wife's and my phones, and put it through the 1st stage of the AR map alignment algorithm I have been working on, as you can see below:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2OMnQbosQLdqcQ6s9FwBtGYw8t484wpDYsY4Vrp616Epj_vdJgitjTiGSyJ9nCZhra_fSCOdk0MOMUVfbQB4qBwM14WfboUNvv24eHJAbeO7qGtqa3NOgopBwUlvOSwPuh8WewRX6nE5e/s1600/rect+residual.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="369" data-original-width="1028" height="229" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2OMnQbosQLdqcQ6s9FwBtGYw8t484wpDYsY4Vrp616Epj_vdJgitjTiGSyJ9nCZhra_fSCOdk0MOMUVfbQB4qBwM14WfboUNvv24eHJAbeO7qGtqa3NOgopBwUlvOSwPuh8WewRX6nE5e/s640/rect+residual.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">o: measurement of the other device's normalized pixel location on each of the 2 phones in the 1-1 calibration session.<br />
x: expected normalized pixel location based on the estimate of the 6-DOF offset between the 2 phones.</td></tr>
</tbody></table>
Since this fit is BEFORE the <a href="http://henryomd.blogspot.com/2017/08/understanding-least-squares-in-pictures.html">least squares</a> based 2nd stage of the alignment algorithm, this is encouraging! And here is the result of 10 iterations of nonlinear least squares (which I explained in a previous blog entry), which shows that I am on the right track!<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjZdEI2gELHzBHPVhXJ19eYDnukEu3WrJN7wnvDjFOfsaLKK6QrRGBJ10fTgS6PIBktJGW-DoSqEtFrVbdnQIU1NHQboyLnirlwHtn8YKe5OfIhtrrajzk9dQ9ffxWuo2NMzILVOZtIb5g/s1600/LS+reesult.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="414" data-original-width="1053" height="251" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjZdEI2gELHzBHPVhXJ19eYDnukEu3WrJN7wnvDjFOfsaLKK6QrRGBJ10fTgS6PIBktJGW-DoSqEtFrVbdnQIU1NHQboyLnirlwHtn8YKe5OfIhtrrajzk9dQ9ffxWuo2NMzILVOZtIb5g/s640/LS+reesult.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Residual innovation (measured vs. predicted) after 10 iterations of nonlinear least squares.</td></tr>
</tbody></table>
But even here, I can see that my model is not accurate enough to reduce the residual level low enough for a high quality interactive AR game play, suggesting 2 things:<br />
<ol>
<li>Expand the model to add at least another 3 states (maybe even 6) to the currently 3 states I am estimating.</li>
<li>The possibility that at least some of the estimates may have to be updated even after the initial convergence.</li>
</ol>
The 2nd is going to be particularly painful, so I want to first evaluate how quickly ARKit drifts after the initial convergence.<br />
<br />
Although this is supposed to be a tough problem, I've been learning and reviewing many things I learned in school in this hobby project. I am curious whether I can actually pull off an algorithm that is as difficult as Mr. Miesnieks suggests; I'll find out in the next couple of months.Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-28706228401186860142017-08-23T21:14:00.002-07:002020-08-02T11:33:52.928-07:00Understanding Least Squares in picturesA few months ago, <a href="https://www.linkedin.com/pulse/my-second-childhood-henry-choi">I posted about the major technical concepts I've been learning on my own these past few years</a>. The estimation method of least squares (LS for short) figures prominently there. I think in pictures, so here's an explanation that I came up with after reading a lot of textbooks.<br />
<div>
<br /></div>
<div>
Consider some nonlinear function y = f(x), colored gray below. I am interested in estimating the true value of x (<span style="font-family: "helvetica neue"; font-size: 16px; text-align: center; text-decoration: underline;">x</span><sup style="font-family: "helvetica neue"; text-align: center;">true</sup>) by measuring the function's output y, which is corrupted by noise, yielding a value of <span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">y</span><sub style="font-family: "helvetica neue"; text-align: center;">0</sub> instead of the noise-free value of f(<span style="font-family: "helvetica neue"; font-size: 16px; text-align: center; text-decoration: underline;">x</span><sup style="font-family: "helvetica neue"; text-align: center;">true</sup>) as you can see below.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDUE0oPj9c_-Pjw6JFLgg6Atsly8h7zz0OocipEwc1pxjc0swU4cJUPwuzSf0A4RZTsjWaDsmhZRxel0zuN_poht_8t9ULkN9Az5pBU59bwwcm3pm2FuDot_nDPnbZgbWaRpZQSBMv8Qgi/s1600/LS1.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="264" data-original-width="517" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDUE0oPj9c_-Pjw6JFLgg6Atsly8h7zz0OocipEwc1pxjc0swU4cJUPwuzSf0A4RZTsjWaDsmhZRxel0zuN_poht_8t9ULkN9Az5pBU59bwwcm3pm2FuDot_nDPnbZgbWaRpZQSBMv8Qgi/s1600/LS1.png" /></a></div>
<div>At about age 10, we learned how to solve this problem in closed form if f were linear and the measurement noise-free: divide y_0 - f(x_guess)--the difference between the observed and the expected value of the output y--by the slope of the function, which I call H here. Checking the intuition: the slope H is Δy/Δx, so Δy / H = Δx, as long as H is not zero. Well, it turns out that LS is fundamentally just the same idea, with fancy sounding deviations.</div>
<div>
<br /></div>
<div>
First, we pick a reasonable starting point x[0], ignore the fact that the function f is nonlinear, and just estimate the slope H at x[0]. With that, we can still get a refined estimate x[1], which is closer to the truth than the starting guess. We can then re-estimate the slope at x[1] and repeat the process. Unfortunately, the noise will introduce a sizable error in the estimate we converge to. Usually, it is difficult to reduce the noise any further (if good HW engineers worked on the problem for any reasonable period), but note that the noise in the single observation pulled us in 1 direction. So what if we had another measurement that was tainted by noise in such a way as to be off in the opposite direction? Then the average of the noise corrupted corrections will be much closer to the noise-free value! Since we can't dictate that the next measurement be off in the opposite direction of the first (in the noise sense), we have to ensure that the noise is unbiased, and increase the sample size sufficiently to reduce the aggregate contribution of the noise to an acceptable level.</div>
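<div>
The single-observation iteration described above can be sketched in a few lines of Python. The cubic function, the starting point and the measured value are hypothetical, and the measurement here is noise-free so that the convergence is easy to check.</div>

```python
def refine(f, y0, x0, iterations=5, dx=1e-6):
    """Repeat: estimate the slope H at the current x, then correct x by (y0 - f(x)) / H."""
    x = x0
    history = [x]
    for _ in range(iterations):
        slope = (f(x + dx) - f(x)) / dx     # numerical estimate of H
        x = x + (y0 - f(x)) / slope
        history.append(x)
    return history

def cubic(x):
    """Hypothetical nonlinear function."""
    return x ** 3

history = refine(cubic, y0=8.0, x0=3.0)     # the true x is 2
```

<div>
Each entry of history is closer to 2 than the previous one. With a noisy y0, the same loop converges just as happily, but to the wrong x, which is what the rest of this explanation is about.</div>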
<div>
<br /></div>
<div>
The LS is the framework for fitting a large sample of observations (usually much greater than the number of parameters to estimate). Picture the 2nd iteration of the above effort, but we now have 2 observations instead of just 1, as shown below.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrUjqC_LAVK9dPlHCVoDRMkjGQ2ycfBmtOPzjphwahaTHiG27lUoHaWKz5Q0ArrIUwr-LXHnEvfsJgoTfjBPErUKcyoD4OeF7kFZ5EpijSbKIoIiAwqc5Brq7qLU5wVaUFPg5l6qPSDR16/s1600/LS2.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="253" data-original-width="520" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrUjqC_LAVK9dPlHCVoDRMkjGQ2ycfBmtOPzjphwahaTHiG27lUoHaWKz5Q0ArrIUwr-LXHnEvfsJgoTfjBPErUKcyoD4OeF7kFZ5EpijSbKIoIiAwqc5Brq7qLU5wVaUFPg5l6qPSDR16/s1600/LS2.png" /></a></div>
<div>
In general, each observation will have a different operating point, and therefore different "slope" H. The matrix H is now generalized, to stack up all such slopes for each observation, and we solve the associated normal equation involving the H matrix and the vector of residuals <span style="font-family: "helvetica neue"; font-size: 16px; text-align: center; text-decoration: underline;">(<b>y</b></span><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;"> - </span><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center; text-decoration: underline;"><b>f</b></span><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">(</span><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center; text-decoration: underline;"><b>x</b></span><span style="font-family: "helvetica neue"; font-size: 16px; text-align: center;">[1])). </span>As long as the function is convex, we will approach the true value rapidly, after only a few iterations. Leaving aside the history (LS goes back to Gauss) or the math, the result just seems magical when applied on a crazy nonlinear problem (like rigid body rotation). Here is an example of how the expected and observed feature points (in a camera image) line up only after 4 iterations.</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyyhbjzdf5zTMs9Ontc3BH6Hp0jY1-UA92sfk3mj_hnayaym1byJg5hvPtyjqx94RkI5ly2RW-LUhMcJV0OPD-cGtX-4BETjFHIRCKrWTylrES6f3xMAv_njcph_5PYlm3lLeXiNul4pFV/s1600/Nonlinear+LS+residual.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="178" data-original-width="519" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyyhbjzdf5zTMs9Ontc3BH6Hp0jY1-UA92sfk3mj_hnayaym1byJg5hvPtyjqx94RkI5ly2RW-LUhMcJV0OPD-cGtX-4BETjFHIRCKrWTylrES6f3xMAv_njcph_5PYlm3lLeXiNul4pFV/s1600/Nonlinear+LS+residual.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Even with 4 out of 7 model parameters held at some random values, the residual error is uncannily small after only 4 iterations through a crazily nonlinear function.</td></tr>
</tbody></table>
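<div>
To make the iteration concrete, here is a minimal sketch of the linearize-and-solve loop in plain Python. The model y = a exp(b t), the synthetic data, and the function name <i>gauss_newton</i> are all my own assumptions for illustration; this is not the camera feature problem pictured above, only the same recipe on a 2-parameter toy problem.</div>

```python
import math

def gauss_newton(ts, ys, a, b, iterations=10):
    """Fit y = a * exp(b * t) by repeatedly solving the linearized normal equation."""
    for _ in range(iterations):
        # Accumulate H^T H (2x2) and H^T r (2x1). Each row of H is the
        # "slope" of the model at the current estimate for one observation.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(b * t)
            h_a, h_b = e, a * t * e   # df/da, df/db
            r = y - a * e             # residual y - f(x)
            s_aa += h_a * h_a
            s_ab += h_a * h_b
            s_bb += h_b * h_b
            g_a += h_a * r
            g_b += h_b * r
        det = s_aa * s_bb - s_ab * s_ab
        if abs(det) < 1e-12:
            # H^T H is singular: H is not full rank at this linearization point.
            raise ValueError("H lost rank; try a different starting point")
        # Solve (H^T H) dx = H^T r with Cramer's rule and update the estimate.
        a += (s_bb * g_a - s_ab * g_b) / det
        b += (s_aa * g_b - s_ab * g_a) / det
    return a, b

# Noise-free synthetic observations generated from a = 2.0, b = -0.7.
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-0.7 * t) for t in ts]
a_hat, b_hat = gauss_newton(ts, ys, a=1.0, b=0.0)
```

<div>
The singular-determinant check is the toy version of losing rank: with only 2 parameters there is no column to drop, so the sketch simply asks for another starting point.</div>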
<div>
Even though we are not taking a simple inverse any more, fundamentally we still need the slope to be "non-zero"; the technical term is that H must be full rank. Even when you do lose rank, there are some things to try after recovering from the disappointment:</div>
<div>
<ol>
<li>Drop the H column that lost rank, and only solve for the parameters that we have the rank for--hoping that we will recover full rank at the new linearization point.</li>
<li>Try a different starting point.</li>
</ol>
</div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-73409331326674850332017-07-08T11:47:00.000-07:002017-07-30T18:49:14.761-07:00USB-C battery pack power stage designThis is my capstone project in <a href="https://www.coursera.org/specializations/power-electronics">Coursera's Power Electronics Specialization from University of Colorado at Boulder</a>, wherein I design the switching power delivery and battery charging circuit. The product requirement is for a 3-cell LiPo pack in series (which will range from 9.6 V to 12.6 V, with an equivalent series resistance R<sub>bat</sub> = 50 m𝛀) to work at the following operating points:<br />
<ol>
<li>Provide power to USB bus, up to 2 A at V<sub>bus</sub> = 5 +/-0.1V (therefore with effective R<sub>bus</sub> = 2.5 𝛀), when V<sub>bat</sub> = 12.6 V.</li>
<li>Provide power to USB bus, 3 A at V<sub>bus</sub> = 20 V (therefore with effective R<sub>bus</sub> = 6.7 𝛀), when V<sub>bat</sub> = 9.6 V. </li>
<li>Charge the battery pack (which is at V<sub>bat</sub> = 11.1 V) from the USB bus, drawing up to 3 A at 20 V. Since R<sub>bat</sub> = 50 m𝛀 and we want 3 A going into the battery, the supply side should be higher than V<sub>bat</sub> by 150 mV, so we need D V<sub>bus</sub> = 11.1 + 0.150 = 11.25 V, so D = 11.25 / 20 ≈ 0.56.</li>
</ol>
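<div>
The derived numbers in the list above can be checked with a few lines of Python. This is only a sketch; the function and variable names are mine, not part of the assignment.</div>

```python
R_BAT = 0.050  # ohm, battery equivalent series resistance from the requirement

def bus_resistance(v_bus, i_bus):
    """Effective load resistance seen at the USB bus: R = V / I."""
    return v_bus / i_bus

def charge_duty_with_rbat(v_bat, i_chg, v_bus, r_bat=R_BAT):
    """Ratio D such that D * Vbus overcomes Vbat plus the drop across Rbat."""
    return (v_bat + i_chg * r_bat) / v_bus

r_bus_5v = bus_resistance(5.0, 2.0)                    # operating point 1
r_bus_20v = bus_resistance(20.0, 3.0)                  # operating point 2
d_charge = charge_duty_with_rbat(11.1, 3.0, 20.0)      # operating point 3
```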
In all cases, the given switching frequency Fs = 1/Ts = 100 kHz.<br />
<br />
I used OmniGraffle for sketching ideas, and LTSpice IV for the schematic capture and simulation (after reading <a href="https://smile.amazon.com/gp/product/3899292588/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1">this book on LTSpice</a>).<br />
<h2>
Switch design</h2>
Since the battery voltage will be higher than the 5 V (the output voltage in mode 1) but lower than 20 V (the output voltage in mode 2), neither the pure step down (buck) nor step-up (boost) converter will do the job alone; I need a buck-boost converter, which will work like the buck converter if Vbat > Vbus, but like the boost converter when Vbat < Vbus, as you can see below.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUcOaUBEA0wAuvzllJnYyjV6PDHxlinLPFlJhlWTtO6UXpMO7b-yyBZcF-X3RttOkbQyja7YFyeA1BOy12z27swgzSoGJBspJyQV88g85ETosbzZPCbPAi50Ju0bw5EDhW_iOU20doojN0/s1600/Screen+Shot+2017-06-20+at+8.01.45+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="150" data-original-width="855" height="112" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUcOaUBEA0wAuvzllJnYyjV6PDHxlinLPFlJhlWTtO6UXpMO7b-yyBZcF-X3RttOkbQyja7YFyeA1BOy12z27swgzSoGJBspJyQV88g85ETosbzZPCbPAi50Ju0bw5EDhW_iOU20doojN0/s640/Screen+Shot+2017-06-20+at+8.01.45+PM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><div style="font-size: 12.800000190734863px; text-align: center;">
Let the switch duty cycle D be the portion of the time (out of one switching period Ts) that the switch is in position 1. Then:<br />
Left: lossless buck converter duty cycle D = Vbus / Vbat = 5 / <span style="background-color: yellow;">12.6</span> ≈ 0.4</div>
<div style="font-size: 12.800000190734863px; text-align: center;">
Right: lossless boost converter duty cycle (1-D) = Vbat / Vbus = 9.6 / 20 = 0.48 => <span style="background-color: yellow;">D = 0.52</span></div>
<div style="font-size: 12.800000190734863px; text-align: center;">
In both cases, the average current flows from left to right (Vbat to Vbus).</div>
</td></tr>
</tbody></table>
If I combine the 2 of them in series, sharing the same inductor in the middle, I get a non-inverting buck-boost converter, shown below.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjSXOgekhiBsK4Ulvc8kDBcsspm01F-GfSA4LBE6FEIElsM35t89jro1w77rD2PgS1bVjgd2RLSg3LgfHgnZlxbYwgLMTuv9-uy5K0TGHqwzh1rDbdFoEstKQEVFyuDaulKo_D3X4CF_7w/s1600/Screen+Shot+2017-06-20+at+8.02.53+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="174" data-original-width="439" height="157" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjSXOgekhiBsK4Ulvc8kDBcsspm01F-GfSA4LBE6FEIElsM35t89jro1w77rD2PgS1bVjgd2RLSg3LgfHgnZlxbYwgLMTuv9-uy5K0TGHqwzh1rDbdFoEstKQEVFyuDaulKo_D3X4CF_7w/s400/Screen+Shot+2017-06-20+at+8.02.53+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Unlike the <i>inverting </i>buck-boost converter, the 2 switches do NOT move together at the transition of the duty cycle in a switching period; one of the switches remains in one position depending on the mode (step up/down) while the other switch does all the switching.</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
</div>
Because of the duality of buck and boost (step down and step up), I can keep the switches in the boost mode when charging the battery from the bus (whose voltage is guaranteed in this problem to be higher); to visualize this, note the perfect equivalence with the buck converter if I keep S<sub>buck</sub> closed and switch S<sub>boost</sub> between positions 1 and 2 in the picture below.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLxnhZTCa-et3wQucTSYemuUK2rbUTPcsfKo551z_CXkI3HnGvMjlLfwB40TEDpq3Yb7Y2bGfHAdioShQ9nwdWOowlpNPCbjWgAQRa3HlZzg-KQKJkM-rZl9wCxxT-bmPMGGEJISetqSs4/s1600/Screen+Shot+2017-06-20+at+8.03.32+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="160" data-original-width="449" height="142" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLxnhZTCa-et3wQucTSYemuUK2rbUTPcsfKo551z_CXkI3HnGvMjlLfwB40TEDpq3Yb7Y2bGfHAdioShQ9nwdWOowlpNPCbjWgAQRa3HlZzg-KQKJkM-rZl9wCxxT-bmPMGGEJISetqSs4/s400/Screen+Shot+2017-06-20+at+8.03.32+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">When charging the battery, Vbus becomes the input voltage and Vbat becomes the output voltage of a buck converter flipped on the back. The average current flows from right to left, and the switch duty cycle D is the <i>complement</i> of the ratio Vbat / Vbus = 11.1 / 20 = <span style="background-color: yellow;">0.555</span>; that is, D = 0.445.</td></tr>
</tbody></table>
Note that the switch setting for S<sub>boost</sub> is the opposite of S<sub>buck</sub>, so Vbat / Vbus ratio during charging is <i>not</i> the usual duty cycle ratio, but the complement of that: D'. To prevent downstream confusion, let's summarize the rough output voltage ratio in the 3 operating cases:<br />
<ol>
<li>Buck mode: Vout = D Vin</li>
<li>Boost mode: Vout = Vin / D'</li>
<li>Charge mode: Vin = D' Vout</li>
</ol>
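<div>
As a quick sanity check of the 3 lossless ratios above, here is a small Python sketch that recovers the duty cycles quoted in the figure captions. The function names are my own; D is the fraction of Ts that the active switch is on, and D' = 1 - D.</div>

```python
def buck_duty(v_in, v_out):
    """Buck mode: Vout = D * Vin, so D = Vout / Vin."""
    return v_out / v_in

def boost_duty(v_in, v_out):
    """Boost mode: Vout = Vin / D', so D = 1 - Vin / Vout."""
    return 1.0 - v_in / v_out

def charge_duty(v_bat, v_bus):
    """Charge mode: Vbat = D' * Vbus, so D = 1 - Vbat / Vbus."""
    return 1.0 - v_bat / v_bus

d_buck = buck_duty(12.6, 5.0)       # about 0.4
d_boost = boost_duty(9.6, 20.0)     # 0.52
d_chg = charge_duty(11.1, 20.0)     # 0.445
```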
<div>
Ignoring losses, in the buck converter mode, the inductor current ripple 𝚫i<sub>L</sub> ≈ (V<sub>bat</sub> - V<sub>bus</sub>) D Ts / 2L. Using the given operating point values in this mode (operating point 1 in the intro),<br />
<blockquote class="tr_bq">
𝚫i<sub>L</sub> ≈ (12.6 - 5) 0.4 / (2 100k L) ≈ 15E-6 / L</blockquote>
Another way to look at this is to compare against the average inductor current, which must equal the average current to the load: I<sub>L</sub> ≈ 2 A. The ripple percentage is then 𝚫i<sub>L</sub> / I<sub>L</sub> ≈ 15E-6 / L / 2 = 7.5E-6 / L. To hold 𝚫i<sub>L</sub> within say 10% of the steady-state current then, L must be greater than 7.5E-6 / 10% = 75 𝛍H.<br />
<br />
And again ignoring losses, in the boost converter mode:<br />
<ul>
<li>The voltage ripple at the output of the converter is 𝚫v = V<sub>bus</sub> D Ts / (2 R<sub>bus</sub> C), where C is the output capacitance in parallel with the load. If we want 𝚫v < 0.1 V, then C must be larger than V<sub>bus</sub> D Ts / (2 R<sub>bus</sub> 𝚫v) = i<sub>bus</sub> D Ts / (2 𝚫v) = (3 A) (0.52) (10 𝛍s) / (2 * 0.1 V) = 78 𝛍F.</li>
<li>𝚫i<sub>L</sub> ≈ V<sub>bat</sub> D Ts / L which for 10% ripple target means the inductance must be larger than V<sub>bat</sub> D Ts / (10% 3 A) = 166 𝛍H.</li>
</ul>
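<div>
The two boost-mode bounds above are easy to reproduce numerically. The sketch below just evaluates the same lossless formulas from the list; the function names are mine, and the 10% target is taken relative to the 3 A bus current exactly as in the text.</div>

```python
TS = 1.0 / 100e3  # switching period for Fs = 100 kHz

def min_output_capacitance(i_bus, duty, ts, dv_max):
    """C > i_bus * D * Ts / (2 * dv) for a voltage ripple amplitude below dv."""
    return i_bus * duty * ts / (2.0 * dv_max)

def min_boost_inductance(v_bat, duty, ts, ripple_fraction, i_ref):
    """L > Vbat * D * Ts / (ripple_fraction * i_ref)."""
    return v_bat * duty * ts / (ripple_fraction * i_ref)

c_min = min_output_capacitance(3.0, 0.52, TS, 0.1)        # farads
l_min = min_boost_inductance(9.6, 0.52, TS, 0.10, 3.0)    # henries
```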
</div>
<div>
In charging mode, S<sub>boost</sub> is actually working like a buck switch, so the current ripple analysis can be repeated with the two voltages swapped: 𝚫i<sub>L</sub> ≈ (V<sub>bus</sub> - V<sub>bat</sub>) D' Ts / 2L. Note that the duty cycle is the complement of the usual buck converter duty cycle, because I am using S<sub>boost</sub> as the buck converter. For operating point 3, the ratio of the inductor current ripple 𝚫i<sub>L</sub>/I<sub>L</sub> then works out to (20 V - 11.1 V) 0.555 / (2 10% 100k L). To keep this down below 10% requires L greater than (20 - 11.1) 0.555 / 20k = 247 𝛍H! The inductor is getting bigger and bigger!</div>
<h3>
(Solid state) switch design</h3>
To implement this schematic with the usual power electronics semiconductor devices, I need the devices Q1~Q4:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFvcOrAU8QUYcbRvRWLy_nwtIp1AcYy23zVr2xvy9gUvN-_rAueS3tqlGfIJMVaX-odx9Dql0rji99fdQ6Fa4Y4lso7brEpfEFmU2JMiT8NNgpoVI5buY8rFWQBDaAxzpzP8J6YecLY7GZ/s1600/Screen+Shot+2017-06-20+at+8.23.46+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="146" data-original-width="451" height="128" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFvcOrAU8QUYcbRvRWLy_nwtIp1AcYy23zVr2xvy9gUvN-_rAueS3tqlGfIJMVaX-odx9Dql0rji99fdQ6Fa4Y4lso7brEpfEFmU2JMiT8NNgpoVI5buY8rFWQBDaAxzpzP8J6YecLY7GZ/s400/Screen+Shot+2017-06-20+at+8.23.46+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The "Q" just means some kind of semiconductor device, rather than a transistor necessarily.</td></tr>
</tbody></table>
Any switch implementation that meets the following currents and voltages across the switches is permissible. In each mode, I have to be careful about the voltage that a switch must block while it is open.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifjoKdzNxeVu3DraRdU1uWYUmZ41RR4SjkFz01jzCbsI7PQ1UNWbww2Rn1JwIm_SAP9fHiLWwuwTCDI9mMK5ZQbwYulVNlJlx_JoduUwsKG3WbWxLKKGwUncJ4p155l8V7Kj01PxIEUDQD/s1600/Screen+Shot+2017-07-02+at+9.21.18+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="119" data-original-width="739" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifjoKdzNxeVu3DraRdU1uWYUmZ41RR4SjkFz01jzCbsI7PQ1UNWbww2Rn1JwIm_SAP9fHiLWwuwTCDI9mMK5ZQbwYulVNlJlx_JoduUwsKG3WbWxLKKGwUncJ4p155l8V7Kj01PxIEUDQD/s1600/Screen+Shot+2017-07-02+at+9.21.18+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Current flow and voltage across the switches during the D Ts and D' Ts phase of the buck mode</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyy6cpMP73Byt5oq4GHB21p_OqsnIbgJG5Hyf1-uF6y9Z1MWbD-05YF0xPVKQatslRz-cOP7F0uiOgfp96OZ7UR4Fd5o2dbZrhezkp9bvT759jFGN4jExNqQwLYGjf-VQf88HaPoDtI3z_/s1600/Screen+Shot+2017-07-02+at+9.23.21+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="122" data-original-width="732" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyy6cpMP73Byt5oq4GHB21p_OqsnIbgJG5Hyf1-uF6y9Z1MWbD-05YF0xPVKQatslRz-cOP7F0uiOgfp96OZ7UR4Fd5o2dbZrhezkp9bvT759jFGN4jExNqQwLYGjf-VQf88HaPoDtI3z_/s1600/Screen+Shot+2017-07-02+at+9.23.21+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Current flow and voltage across the switches during the D Ts and D' Ts phase of the boost mode</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI5YBBHAnWIEz2uQQVXQnA93OQyAZ-MkOIEly8IBrKnsipxU_K0dhnzYxyFmFAeN0S8lP30R29O7Jm4gshwd5asK6GpLcunNW7QV5Q9KuJBtvl6EvmlGC_fm3ZdqJpWrMwk-khWX1LkhUs/s1600/Screen+Shot+2017-07-02+at+9.25.02+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="129" data-original-width="733" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI5YBBHAnWIEz2uQQVXQnA93OQyAZ-MkOIEly8IBrKnsipxU_K0dhnzYxyFmFAeN0S8lP30R29O7Jm4gshwd5asK6GpLcunNW7QV5Q9KuJBtvl6EvmlGC_fm3ZdqJpWrMwk-khWX1LkhUs/s1600/Screen+Shot+2017-07-02+at+9.25.02+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Current flow and voltage across the switches during the D Ts and D' Ts phase of the charging mode</td></tr>
</tbody></table>
From these diagrams, I can place the following requirements on Q1~Q4, each with a natural choice for the implementation:<br />
<ul>
<li>Q1: active +/passive- current, but only + voltage ==> MOSFET in parallel with a diode</li>
<li>Q2: passive - current, but only + voltage ==> diode</li>
<li>Q3: active +/passive - current, but only - voltage ==> MOSFET in parallel with a diode</li>
<li>Q4: active +/passive - current, but only + voltage ==> MOSFET in parallel with a diode</li>
</ul>
<h3>
MOSFET choice</h3>
<div>
I chose the Infineon BSC100N03MS as the MOSFET, because it was used in the instructor-provided examples. According to the data sheet, the drain-to-source voltage (the blocking voltage) is 30 V, which is sufficient for the expected maximum of 20 V applied on the USB bus end. It is capable of conducting 44 A nominally, and 176 A in a pulse. My expected gate-source voltage is 20 V, which is much larger than V<sub>in</sub> (which is 12.6 V max). In short, it has adequate voltage and current margin for the problem at hand.</div>
<h3>
S<sub>buck</sub> implementation using diodes (and a MOSFET)</h3>
<div>
The Q1-Q2 pair that makes up S<sub>buck</sub> has only 1 active element (a MOSFET) between them, so the control is rather straightforward (at first), like this:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZyZE2N0Zvfm6l33Ydxq-uUCasQTgfKVpeE9NZ4QodIi71MxB_MEmRcbTJ5P2XBNjpGsqgNazNr0Uv7nwpYHJrD4J3PB3CbiaijL7TMHJWlbaWJrcblc1cWJ8EmAftu4_2i3ZdmZIDlq7K/s1600/Screen+Shot+2017-06-21+at+8.23.21+AM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="195" data-original-width="308" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZyZE2N0Zvfm6l33Ydxq-uUCasQTgfKVpeE9NZ4QodIi71MxB_MEmRcbTJ5P2XBNjpGsqgNazNr0Uv7nwpYHJrD4J3PB3CbiaijL7TMHJWlbaWJrcblc1cWJ8EmAftu4_2i3ZdmZIDlq7K/s1600/Screen+Shot+2017-06-21+at+8.23.21+AM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">In S<sub>buck</sub>, Q1 consists of a fast recovery diode (D3 in this schematic) in anti-parallel with a MOSFET, and Q2 is just another diode. (I am not sweating over the optimal part selection here; I just grabbed a diode previously used in the class.) The MOSFET conducts the actively turned-on + current, and the anti-parallel diode passively conducts the negative current in the other direction. The diode on the right blocks the positive voltage at the source of the MOSFET when it conducts, but passively conducts the negative current (flowing from the bottom to the top, or ground to the inductor) when the MOSFET is turned off.</td></tr>
</tbody></table>
<div>
But since the MOSFET is not grounded, the gate will be floating, so we need a floating gate driver to supply enough current to the gate (to turn it on, i.e. close it) when commanded (c_buck in the schematic below). The floating gate driver in turn needs a floating (i.e. isolated) power supply. In this class, a low-power isolated unregulated DC-DC converter is recommended for this purpose.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSM0WTVT-6_hmbSV9vwzkE8YB2CG-OQaUaRZEayONYQCn29KTTO1_KU-uZeXYnpjo68a9kjxwz3RqJdlN3Rm4Mg9FeuAr7EkmCsOhct_varqiQuXRfF8JG4FjTIvWxL8qn0fQFhpc_hRiZ/s1600/Screen+Shot+2017-06-28+at+6.29.36+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="302" data-original-width="398" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSM0WTVT-6_hmbSV9vwzkE8YB2CG-OQaUaRZEayONYQCn29KTTO1_KU-uZeXYnpjo68a9kjxwz3RqJdlN3Rm4Mg9FeuAr7EkmCsOhct_varqiQuXRfF8JG4FjTIvWxL8qn0fQFhpc_hRiZ/s1600/Screen+Shot+2017-06-28+at+6.29.36+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">S<sub>buck</sub>, with MOSFET gate driver powered by a floating unregulated DC-DC power supply. Note Vss pin of the gate driver is connected to the SOURCE (downstream of the current) side of the MOSFET. If you get this wrong, the MOSFET will always be on!</td></tr>
</tbody></table>
</div>
<div>
<br />
The isolated DC-DC converter (U8 above) in turn will get the power straight from the battery, and provide V<sub>outP</sub> − V<sub>outM</sub> = n V<sub>in</sub>, where I choose n = 1 (can be between 1 and 2).<br />
<br />
Note the resistor between the MOSFET gates and their drivers is there to limit the inrush current into the gate, and should be chosen for the particular MOSFET. But conversely, we want to remove the gate charge from the MOSFET as quickly as possible during the turn off transition--again as much as the devices can tolerate. So in some examples, I found a Schottky diode (which is a very fast diode with little voltage drop) shorting out the current limiting resistor (as shown below), to effectively remove the current limit during the turn off transition. To be honest, I still don't understand why it's OK to remove current limit when removing the gate charge from the MOSFET, while a current limiting resistor is required when adding it.</div>
<h3>
S<sub>boost</sub> implementation using diodes</h3>
<div>
<div style="text-align: right;">
</div>
S<sub>boost</sub> starts out as an exact dual of S<sub>buck</sub>, except that it can pass the average current in either direction: through the M<sub>boost</sub>-D<sub>boost_Dp</sub> pair during boost mode, or through the M<sub>chg</sub>-D<sub>chg_Dp</sub> pair during charge mode.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTYuMi-uaY52tzGiPHivsLJPYjPDxfz8znLAaOhyphenhypheno7xDXYVEhXhj_cffIw3LL_Xa_fgHjF78ZvJJutskQK6jxhDRBZfO74m9fQQYA5GNJWp93-gZY-lgcfOrJaEVzX1QSnwWO4-9QduFKW/s1600/Screen+Shot+2017-06-22+at+4.18.18+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="336" data-original-width="530" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTYuMi-uaY52tzGiPHivsLJPYjPDxfz8znLAaOhyphenhypheno7xDXYVEhXhj_cffIw3LL_Xa_fgHjF78ZvJJutskQK6jxhDRBZfO74m9fQQYA5GNJWp93-gZY-lgcfOrJaEVzX1QSnwWO4-9QduFKW/s1600/Screen+Shot+2017-06-22+at+4.18.18+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">During the boost mode (step up power supply), M<sub>boost</sub>-D<sub>boost_Dp</sub> work in a complementary fashion to keep the average current flowing from left (Vbat) to right (Vbus). During battery charging mode, M<sub>chg</sub>-D<sub>chg_Dp</sub> work together to keep the average current flowing from right (Vbus) to left (Vbat). During buck mode (step down power supply), M<sub>chg</sub> is always open to keep the current flowing only through D<sub>boost_Dp</sub>.</td></tr>
</tbody></table>
Of course, the gate drivers are necessary to drive the MOSFETs.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTEtS4aeantBKKjWsF1kRVRsQJ0wgfK3LCIlPTNSJXxnX9Q_9g5jkRY3FNw6zSrznYWuzyYETGBEtbreVfOEVEzzEvsIKHrZDkjFquJE_tiWX88qKiuzmi84UQO6Hav4Wwe6HdxAQq0oUo/s1600/Screen+Shot+2017-06-28+at+6.38.34+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="475" data-original-width="450" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTEtS4aeantBKKjWsF1kRVRsQJ0wgfK3LCIlPTNSJXxnX9Q_9g5jkRY3FNw6zSrznYWuzyYETGBEtbreVfOEVEzzEvsIKHrZDkjFquJE_tiWX88qKiuzmi84UQO6Hav4Wwe6HdxAQq0oUo/s1600/Screen+Shot+2017-06-28+at+6.38.34+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">BUG ALERT: I should have connected outM pin of Uchg_dd to the SOURCE pin of Mchg, but instead connected it to the drain pin.</td></tr>
</tbody></table>
</div>
<h3>
MOSFET logic gate coordination</h3>
In this realization of the power stage, there are a total of 3 MOSFETs that have to be turned on/off in a coordinated manner. Let's enumerate their required on (closed) and off (open) states during the 2 phases in each mode:<br />
<table border="1" style="border-spacing: 5px;"><tbody>
<tr><th>Mode</th><th colspan="2">Step down (buck)</th><th colspan="2">Step up (boost)</th><th colspan="2">Charge Battery</th></tr>
<tr><th>Phase</th><th>D Ts</th><th><span style="color: red;">D'</span> Ts</th><th><span style="color: red;">D'</span> Ts</th><th>D Ts</th><th><span style="color: red;">D'</span> Ts</th><th>D Ts</th></tr>
<tr><td><div style="text-align: center;">
M<sub>buck</sub></div>
</td><td><div style="text-align: center;">
1</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
1</div>
</td><td><div style="text-align: center;">
1</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td></tr>
<tr><td><div style="text-align: center;">
M<sub>boost</sub></div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
1</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td></tr>
<tr><td><div style="text-align: center;">
M<sub>chg</sub></div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
0</div>
</td><td><div style="text-align: center;">
1</div>
</td></tr>
</tbody></table>
The assignment provides 2 independent control signals: Vcontrol1 and Vcontrol2. To control all 3 switches with just these 2 signals, I must reduce the above truth table to use only 2 inputs. Digital logic reduces inadvertent switching action from noise, and I can generate the following 2 digital signals from Vcontrol1 and Vcontrol2, using the digital logic threshold voltage V<sub>high</sub>/2 = 2.5V:<br />
<ol>
<li>Boost = V(control1) > V<sub>high</sub>/2</li>
<li>Charge = V(control2) > V<sub>high</sub>/2</li>
</ol>
<div>
Since the homework recommended 5 V digital logic, I am using 2.5 V as the threshold. I still need a rapidly switching PWM from these 2 control voltages, so I derive the PWM signals for the 3 modes like this:</div>
<div>
<ol>
<li>For step-down mode, PWM(Vcontrol1), with VM=V<sub>high</sub>/2, so that <span style="text-align: center;">M</span><sub style="text-align: center;">buck</sub> duty cycle will be continuous (D = 1) when going from the buck mode to the boost mode.</li>
<li>For step-up mode, PWM(Vcontrol1 - V<sub>high</sub>/2)</li>
<li>For charge mode, PWM(Vcontrol2 - V<sub>high</sub>/2)</li>
</ol>
<div>
In cases 2 and 3 above, I need a reference voltage and a subtractor. I am allowed to use an IC to generate a precision reference voltage (despite the irony of having to use a voltage reference in a DC-DC converter design); the LT1121-5 (the "-5" in the part name is for 5 V output) was recommended, and I use a voltage divider to derive 2.5 V from it. With 1% resistors, I know that the V<sub>high</sub>/2 reference can be about 1% off (if one resistor is 1% higher and the other is 1% lower, the divider ratio moves from 0.5 to 0.495 or 0.505), but since the design does not require operation near 100% duty cycle, I'll be OK. The subtraction of the V<sub>high</sub>/2 reference voltage can be done with an op-amp with sufficiently high open-loop gain. I just used the instructor-recommended part LT1498 (which has rail-to-rail input and output, 10 MHz gain-BW product, and 6 V/us slew rate), as you can see below, where V(Ctrl1_2V5) = V(control1) - Ref2V5.</div>
</div>
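<div>
To see what control voltages the 3 operating points call for, here is a small Python sketch. It assumes an ideal PWM block whose duty cycle is simply the input divided by VM, clamped to the range 0 to 1; that behavior and the function names are my assumptions for illustration, not a description of the actual LTSpice PWM model.</div>

```python
V_HIGH = 5.0
VM = V_HIGH / 2.0  # PWM modulator amplitude, 2.5 V

def pwm_duty(v, vm=VM):
    """Ideal PWM (assumed): duty = v / VM, clamped to [0, 1]."""
    return min(1.0, max(0.0, v / vm))

def control_for_duty(duty, offset=0.0, vm=VM):
    """Control voltage that yields the requested duty after the offset is subtracted."""
    return offset + duty * vm

v1_buck = control_for_duty(0.4)                    # Vcontrol1 in buck mode
v1_boost = control_for_duty(0.52, offset=VM)       # Vcontrol1 in boost mode
v2_charge = control_for_duty(0.445, offset=VM)     # Vcontrol2 in charge mode

# In boost mode PWM(Vcontrol1) saturates, so M_buck stays on the whole period.
buck_duty_in_boost = pwm_duty(v1_boost)
boost_duty_check = pwm_duty(v1_boost - VM)
```

<div>
Under this assumption the buck, boost, and charge operating points need roughly 1.0 V, 3.8 V, and 3.6 V on the respective control input.</div>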
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifK1DbRDiRAXwN7Z_VM32v97df7IaJQjyJQsobq7_0BwGJCvZ3qJ6ntjaXf1nGV6up3tTSrsdtUypAc354O0bfR88lEJGmSQz7vgFRLa5LujFW90YYnthY7thntKoVgizozl46ofapTxpf/s1600/Screen+Shot+2017-06-27+at+6.10.12+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="184" data-original-width="489" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifK1DbRDiRAXwN7Z_VM32v97df7IaJQjyJQsobq7_0BwGJCvZ3qJ6ntjaXf1nGV6up3tTSrsdtUypAc354O0bfR88lEJGmSQz7vgFRLa5LujFW90YYnthY7thntKoVgizozl46ofapTxpf/s1600/Screen+Shot+2017-06-27+at+6.10.12+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">This subtractor is going to suffer from analog issues (like part and temperature variance).</td></tr>
</tbody></table>
The reference voltage takes quite a few microseconds to settle, and the reference's ability to track half the 5 V from the voltage generator is somewhat disappointing.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibZWjxof-UXYOLESemX-fqoRana1ul5jPs7p1CIJszXhqF4UsEo03-vbd4g3IqbWo7Gj8Gg7hAl1-k2q4mZ9Bp7ke5TZLzGuMVnja0G85xza2lHRnOEGcps-wWWumoCJpPd4GuXpavFIPI/s1600/Screen+Shot+2017-06-27+at+6.27.31+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="280" data-original-width="522" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibZWjxof-UXYOLESemX-fqoRana1ul5jPs7p1CIJszXhqF4UsEo03-vbd4g3IqbWo7Gj8Gg7hAl1-k2q4mZ9Bp7ke5TZLzGuMVnja0G85xza2lHRnOEGcps-wWWumoCJpPd4GuXpavFIPI/s1600/Screen+Shot+2017-06-27+at+6.27.31+PM.png" /></a></div>
The subtracted signal is then fed into a PWM generator with VM=V<sub>high</sub>/2. The truth table for the MOSFETs can then be implemented with this mixed signal logic:</div>
<div>
<ul>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">buck</sub>) = NOT(control2 > V<sub>high</sub>/2) AND (control1 > V<sub>high</sub>/2 OR PWM(control1))</li>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">boost</sub>) = NOT(control2 > V<sub>high</sub>/2) AND control1 > V<sub>high</sub>/2 AND PWM(control1 - V<sub>high</sub>/2)</li>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">chg</sub>) = control2 > V<sub>high</sub>/2 AND PWM(control2 - V<sub>high</sub>/2)</li>
</ul>
</div>
<div>
To satisfy both the truth table <i>and</i> the MOSFET duty cycles for the operating points, control2 should clearly be < V<sub>high</sub>/2 during buck and boost modes. ANDing the digital signal with the PWM's output is necessary because the achievable minimum duty cycle may not be 0. The logic IC implements the "> V<sub>high</sub>/2" operation internally, so I can simplify the above logic expression:<br />
<ul>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">buck</sub>) = NOT(control2) AND (control1 OR PWM(control1))</li>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">boost</sub>) = NOT(control2) AND control1 AND PWM(control1 - V<sub>high</sub>/2)</li>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">chg</sub>) = control2 AND PWM(control2 - V<sub>high</sub>/2)</li>
</ul>
</div>
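<div>
The 3 logic expressions can be checked against the truth table with a short Python sketch. Here the PWM outputs are modeled as plain booleans (the instantaneous state of each PWM signal), and the function name and argument names are mine.</div>

```python
V_HIGH = 5.0
THRESHOLD = V_HIGH / 2.0  # digital logic threshold, 2.5 V

def gate_states(control1, control2, pwm1, pwm1_shifted, pwm2_shifted):
    """Return (M_buck, M_boost, M_chg) as 0/1 for one instant in time.

    pwm1          -- instantaneous output of PWM(control1)
    pwm1_shifted  -- instantaneous output of PWM(control1 - Vhigh/2)
    pwm2_shifted  -- instantaneous output of PWM(control2 - Vhigh/2)
    """
    boost = control1 > THRESHOLD    # "Boost" digital signal
    charge = control2 > THRESHOLD   # "Charge" digital signal
    m_buck = (not charge) and (boost or pwm1)
    m_boost = (not charge) and boost and pwm1_shifted
    m_chg = charge and pwm2_shifted
    return int(m_buck), int(m_boost), int(m_chg)

# Buck mode: control1 below the threshold, M_buck follows PWM(control1).
buck_on = gate_states(1.0, 0.0, True, False, False)
buck_off = gate_states(1.0, 0.0, False, False, False)
# Boost mode: control1 above the threshold, M_buck held on, M_boost switches.
boost_on = gate_states(3.8, 0.0, True, True, False)
boost_off = gate_states(3.8, 0.0, True, False, False)
# Charge mode: control2 above the threshold, only M_chg switches.
charge_on = gate_states(0.0, 3.6, False, False, True)
charge_off = gate_states(0.0, 3.6, False, False, False)
```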
Here is the implementation of <span style="text-align: center;">duty(M</span><sub style="text-align: center;">buck</sub>), <span style="text-align: center;">duty(M</span><sub style="text-align: center;">boost</sub>), and <span style="text-align: center;">duty(M</span><sub style="text-align: center;">chg</sub>) that uses the 5 V and "2.5 V" explained above.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBCpEnP7ee1r1yyPikhsUyn3PkTotXT6wVrtDOvFZyZc1WIIU2iimrFmiykeGXjnMaQtfHKlsaEvRVuz30dcDsxQ31qQ4xsm6x5rrcQBORp9qgUC7rNdRnhoVb_WaeYdN8cu-b9MiuO-v5/s1600/Screen+Shot+2017-06-27+at+6.33.16+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="300" data-original-width="821" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBCpEnP7ee1r1yyPikhsUyn3PkTotXT6wVrtDOvFZyZc1WIIU2iimrFmiykeGXjnMaQtfHKlsaEvRVuz30dcDsxQ31qQ4xsm6x5rrcQBORp9qgUC7rNdRnhoVb_WaeYdN8cu-b9MiuO-v5/s1600/Screen+Shot+2017-06-27+at+6.33.16+PM.png" /></a></div>
At first, I use hard coded values for Vcontrol1 and Vcontrol2 to simulate the converter response, so ideally, I want 3 distinct time periods to exercise the 3 operational modes of the converter. That is, I want each of the 3 duty cycles to <i>not</i> overlap. Ignoring noise, it looks like I will be able to do that.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQizmKm2hhCprSdiMfCJPHHnMAHc-GJ8hXfbRAUf7Oetp0b9TUe6GSMw_6gF8l4bdntLXkPyyoDZpwyAs_QCzOaMhWv6Jwzabu9gPtTeeucH440cX-4EtxV7xdP95yYXLn5ak3PiqKfEFI/s1600/Screen+Shot+2017-06-27+at+6.41.00+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="206" data-original-width="840" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQizmKm2hhCprSdiMfCJPHHnMAHc-GJ8hXfbRAUf7Oetp0b9TUe6GSMw_6gF8l4bdntLXkPyyoDZpwyAs_QCzOaMhWv6Jwzabu9gPtTeeucH440cX-4EtxV7xdP95yYXLn5ak3PiqKfEFI/s1600/Screen+Shot+2017-06-27+at+6.41.00+PM.png" /></a></div>
<h3>
Excessive power loss through the diodes</h3>
If I drive the converter in buck mode with <span style="text-align: center;">duty(M</span><sub style="text-align: center;">buck</sub>) shown above, I can see the expected response of a buck converter.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOeOjiX-Ja33Lj3anwkOE7NB-x4e-9Tz8e7ccH2dIGHUCRSVUAMJZzQM3pMTWTIlRc9kYEW7zZKqX7r6ztQf9ov-tP1eUQd6OHMaG8EaZcVCBGMmjLy-vSgREJq0oJ669VIAf4JQAe-L9X/s1600/Buck+Problem.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="327" data-original-width="562" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOeOjiX-Ja33Lj3anwkOE7NB-x4e-9Tz8e7ccH2dIGHUCRSVUAMJZzQM3pMTWTIlRc9kYEW7zZKqX7r6ztQf9ov-tP1eUQd6OHMaG8EaZcVCBGMmjLy-vSgREJq0oJ669VIAf4JQAe-L9X/s1600/Buck+Problem.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The current spike through the MOSFET at the beginning of each MOSFET turn-on is many times the inductor current, and is on the order of the maximum current spike specified in the MOSFET and diode data sheets.</td></tr>
</tbody></table>
The output voltage is much lower than the required 5 V, because of the forward voltage drop of 2 diodes in the forward path (1 in S<sub>buck</sub> and 1 in S<sub>boost</sub>): 0.85 V each. 2 A nominal current through 1.7 V drop comes out to 3.4 W, which is 34 % of the power delivered to the bus (2 A at 5 V) in the step-down operation mode: clearly unacceptable as a high efficiency switching converter, so back to the drawing board. To remove the diodes in the current conducting paths, I need a complementary MOSFET, arranged with a dead time IC.<br />
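The loss estimate above is easy to double check numerically. Here is a minimal Python sketch; the function name <code>diode_loss_fraction</code> is my own, and the numbers are the ones quoted in the paragraph above.<br />

```python
def diode_loss_fraction(v_forward, n_diodes, i_load, v_bus):
    """Return (conduction loss in W, loss as a fraction of the bus power)."""
    p_loss = n_diodes * v_forward * i_load  # loss in the series diodes
    p_bus = v_bus * i_load                  # power delivered to the bus
    return p_loss, p_loss / p_bus


# 2 diodes at 0.85 V each, 2 A nominal current, 5 V bus (step-down mode)
p_loss, fraction = diode_loss_fraction(v_forward=0.85, n_diodes=2,
                                       i_load=2.0, v_bus=5.0)
print(f"{p_loss:.1f} W lost in the diodes, {fraction:.0%} of the bus power")
```

Since the loss fraction is just the total forward drop divided by the bus voltage, it does not improve at lighter loads, which is why removing the diodes from the conduction path is the only real fix.<br />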
<h3>
Switch implementation without diodes</h3>
<div>
From the switch current flow and voltage diagram given at the outset, it seems that Q3 and Q4 can each pass current in either direction and block the voltage in a single direction. Therefore, I only need to replace the diode I had used in Q2 with another MOSFET that turns on during the D' interval of the buck mode. Here are the modified switch implementations that can handle all 3 operational modes.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_xER3tRJvkAEsMRKuxZOKyg7CHr9KPHj4Utc7KgJyZOg9_cUOGJpnNC_NYxVf22CKkbUedu_-qWRlzrG1-_6Oqth8tyRA3pVQEDMUkyWT2uWf9SVoIelzDmmu3WhzrgmhctDV7-k7HBV4/s1600/Screen+Shot+2017-07-02+at+9.53.43+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="351" data-original-width="941" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_xER3tRJvkAEsMRKuxZOKyg7CHr9KPHj4Utc7KgJyZOg9_cUOGJpnNC_NYxVf22CKkbUedu_-qWRlzrG1-_6Oqth8tyRA3pVQEDMUkyWT2uWf9SVoIelzDmmu3WhzrgmhctDV7-k7HBV4/s1600/Screen+Shot+2017-07-02+at+9.53.43+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">MOSFET driver for Mbuck_DP does not need a high side voltage supply, because Mbuck_DP's source is connected to the ground.<br />
Except for Mbuck_DP, all MOSFETs can pass current in either direction. I control that by designing the gate control signal carefully.</td></tr>
</tbody></table>
In this design, a 100 ns dead time is used everywhere. Whenever the MOSFET source is grounded, the high side gate power supply is unnecessary, so the diode replacement MOSFET is somewhat simpler to drive than the high side MOSFETs.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
Now that the MOSFETs are doing double duty, I have to expand the MOSFET truth table to include the MOSFETs that are complementary to the existing MOSFETs.<br />
<table 5px="" border-spacing:="" border="1">
<tbody>
<tr>
<th></th>
<th colspan="2">Step down<br />
control1 = 0<br />
control2 = 0</th>
<th colspan="2">Step up<br />
control1 = 1<br />
control2 = 0</th>
<th colspan="2">Charge<br />
control1 = 0<br />
control2 = 1</th>
</tr>
<tr>
<th>MOSFET</th>
<th>D Ts</th><th>D' Ts</th>
<th>D Ts</th><th>D' Ts</th>
<th>D Ts</th><th>D' Ts</th>
</tr>
<tr>
<td>M<sub>buck</sub></td>
<td>1</td><td>0</td>
<td>1</td><td>1</td>
<td>1</td><td>1</td>
</tr>
<tr>
<td>M<sub>buck_DP</sub></td>
<td>0</td><td>1</td>
<td>0</td><td>0</td>
<td>0</td><td>0</td>
</tr>
<tr>
<td>M<sub>boost</sub></td>
<td>0</td><td>0</td>
<td>1</td><td>0</td>
<td>0</td><td>1</td>
</tr>
<tr><td>M<sub>chg</sub></td>
<td>1</td><td>1</td>
<td>0</td><td>1</td>
<td>1</td><td>0</td>
</tr>
</tbody>
</table>
An "x" in a truth table means "don't care", which can be helpful in simplifying the digital logic that implements the truth table:<br />
<div>
<ul>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">buck</sub>) = control2 OR control1 OR pwm(control1)</li>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">buck_DP</sub>) = NOT(control1) AND NOT(control2) AND pwm'(control1)</li>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">boost</sub>) = (control1 AND pwm(control1 - V<sub>high</sub>/2)) OR (control2 AND pwm'(control2 - V<sub>high</sub>/2))</li>
<li><span style="text-align: center;">duty(M</span><sub style="text-align: center;">chg</sub>) = (NOT(control1) AND NOT(control2)) OR (NOT(control2) AND control1 AND pwm'(control1 - V<sub>high</sub>/2)) OR (control2 AND pwm(control2 - V<sub>high</sub>/2))</li>
</ul>
</div>
where I use the prime notation (pwm') to indicate the complement of the duty cycle. Scanning the above logic expressions, it appears that I need the boolean signals for both control1 and control2 (and their inverses) in several places, so it's easier to factor out those common logic circuits for reuse.<br />
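As a sanity check, the 4 logic expressions can be compared against the expanded truth table exhaustively. The Python sketch below is my own illustration, not part of the LTSpice design: the names <code>gate_signals</code> and <code>check_truth_table</code> are hypothetical, control1 and control2 are treated as already thresholded booleans, and each PWM output is modeled as True during the D Ts interval and False during D' Ts.<br />

```python
from itertools import product


def gate_signals(c1, c2, pwm_buck, pwm_boost, pwm_chg):
    """Evaluate the 4 gate expressions; returns (Mbuck, Mbuck_DP, Mboost, Mchg)."""
    m_buck = c2 or c1 or pwm_buck
    m_buck_dp = (not c1) and (not c2) and (not pwm_buck)
    m_boost = (c1 and pwm_boost) or (c2 and not pwm_chg)
    m_chg = (((not c1) and (not c2))
             or ((not c2) and c1 and (not pwm_boost))
             or (c2 and pwm_chg))
    return tuple(int(bool(x)) for x in (m_buck, m_buck_dp, m_boost, m_chg))


# Expected (Mbuck, Mbuck_DP, Mboost, Mchg), copied from the truth table
EXPECTED = {
    (0, 0): {"D": (1, 0, 0, 1), "D'": (0, 1, 0, 1)},  # step down
    (1, 0): {"D": (1, 0, 1, 0), "D'": (1, 0, 0, 1)},  # step up
    (0, 1): {"D": (1, 0, 0, 1), "D'": (1, 0, 1, 0)},  # charge
}
# Index of the PWM that actually modulates in each mode: buck, boost, charge
ACTIVE_PWM = {(0, 0): 0, (1, 0): 1, (0, 1): 2}


def check_truth_table():
    """True if the expressions match the table for every state of the idle PWMs."""
    for (c1, c2), phases in EXPECTED.items():
        active = ACTIVE_PWM[(c1, c2)]
        for phase, expected in phases.items():
            for pwms in product([False, True], repeat=3):
                if pwms[active] != (phase == "D"):
                    continue  # active PWM is not in the phase being checked
                if gate_signals(bool(c1), bool(c2), *pwms) != expected:
                    return False
    return True


print("logic matches truth table:", check_truth_table())
```

The inner loop over the 2 idle PWM outputs is deliberate: it confirms that the control1 and control2 gating makes each MOSFET signal independent of the PWMs that are not in use in a given mode.<br />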
<br />
The 5V and 2.5V reference voltages are shown in a separate part of the schematic:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdXTgeOPiLEDRlqckdk5niO4naYWpmfWHuTh9hwVHKRrl1-z9LP8a9uKDneHdqoZhwiMG7hW6tvXbloVxGF3FW3HUlLQU4IngRSKGmFqv1V8RGHoxmHkob4sS_Uoei9PEWWmP2M3O7_HXI/s1600/Screen+Shot+2017-06-29+at+6.26.46+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="167" data-original-width="286" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdXTgeOPiLEDRlqckdk5niO4naYWpmfWHuTh9hwVHKRrl1-z9LP8a9uKDneHdqoZhwiMG7hW6tvXbloVxGF3FW3HUlLQU4IngRSKGmFqv1V8RGHoxmHkob4sS_Uoei9PEWWmP2M3O7_HXI/s1600/Screen+Shot+2017-06-29+at+6.26.46+PM.png" /></a></div>
The NOT(control1) and NOT(control2) signals are implemented with off-the-shelf inverters.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzN8JGsXrFM0Z8UjXU4Ns6jL17h6ADmLxyfB2hsy_4FNh0G5-RFFUbQXZb8__GDah-RWMiHPH1dEN4Qyvx7L3HqnfTk1J0c1BVDaIh_nAWKnqGBFLdQKoS1ryiEVsEpsIKk49Y8MqIowI-/s1600/Screen+Shot+2017-06-29+at+9.18.39+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="54" data-original-width="453" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzN8JGsXrFM0Z8UjXU4Ns6jL17h6ADmLxyfB2hsy_4FNh0G5-RFFUbQXZb8__GDah-RWMiHPH1dEN4Qyvx7L3HqnfTk1J0c1BVDaIh_nAWKnqGBFLdQKoS1ryiEVsEpsIKk49Y8MqIowI-/s1600/Screen+Shot+2017-06-29+at+9.18.39+AM.png" /></a></div>
Then the 2.5 V bias-subtracted voltages are derived from Ref2V5 and off-the-shelf op-amps.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgElyjFaeGa8hyua66Jsh8ITU0fl3F45NEIa1HO3woZxiWnhHFpKu0ibT1pWRXH-dFVA5eQwDDbbILNKM2Lc9YWoC77UUYpqDSNboZnKVJHo_BLTSgfD9OxGT8Onrr2qYBpbYpx0vb1Dngh/s1600/Screen+Shot+2017-06-29+at+6.32.30+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="158" data-original-width="533" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgElyjFaeGa8hyua66Jsh8ITU0fl3F45NEIa1HO3woZxiWnhHFpKu0ibT1pWRXH-dFVA5eQwDDbbILNKM2Lc9YWoC77UUYpqDSNboZnKVJHo_BLTSgfD9OxGT8Onrr2qYBpbYpx0vb1Dngh/s1600/Screen+Shot+2017-06-29+at+6.32.30+PM.png" /></a></div>
The duty cycle and the complementary duty cycle pairs for the buck, boost, and charge modes are generated by the PWM-deadtime IC pairs; the only difference is the voltage level being commanded. Here, for example, is the PWM pair for the boost mode:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfgrh3ch97ZaIoBKSgfaqKxtfnwUkYwNorjCuKPiyJGwdUku2IHYAwG2e8CIHtnOfChKAASamTjbsHlUkcSZ7X7TZ3jLQJJuezQr1aTzFKuTBb_9FdWd2cFPjuwoUYr3VCrIgLjRlFXked/s1600/Screen+Shot+2017-06-29+at+8.46.05+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="83" data-original-width="306" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfgrh3ch97ZaIoBKSgfaqKxtfnwUkYwNorjCuKPiyJGwdUku2IHYAwG2e8CIHtnOfChKAASamTjbsHlUkcSZ7X7TZ3jLQJJuezQr1aTzFKuTBb_9FdWd2cFPjuwoUYr3VCrIgLjRlFXked/s1600/Screen+Shot+2017-06-29+at+8.46.05+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">I am using the same dead-time of Td = 100ns for all PWM.</td></tr>
</tbody></table>
Finally, the 4 duty cycles can be generated from logic gates. The most complex logic is the 3 AND gates feeding an OR gate for <span style="text-align: center;">duty(M</span><sub style="text-align: center;">chg</sub>), but carefully building up the intermediate signals pays off here.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9rvjiBnWP5Ftm8QM0GvWNCzvpSX1SeGd7rIIfZYcYlw_qeb1rE2eMznzl19t1SKcJWh9h3SlOsAYClhudzNk0zPxk8bLxFAgLnzqhdRSuGkK0p2OzCWaVxEU_Mta_ZxPND4lOh0pX93lm/s1600/Screen+Shot+2017-07-02+at+10.39.19+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="81" data-original-width="289" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9rvjiBnWP5Ftm8QM0GvWNCzvpSX1SeGd7rIIfZYcYlw_qeb1rE2eMznzl19t1SKcJWh9h3SlOsAYClhudzNk0zPxk8bLxFAgLnzqhdRSuGkK0p2OzCWaVxEU_Mta_ZxPND4lOh0pX93lm/s1600/Screen+Shot+2017-07-02+at+10.39.19+PM.png" /></a></div>
<h3>
LC filter to damp out voltage ripple</h3>
</div>
<div>
The original power stage design shown above has only one inductor, and a cap each on the input and output ends. There are a few problems with this approach: reducing the input and output voltage ripple requires <i>huge </i>inductance and capacitance, which in turn--since the natural frequency of an LC tank is 1/sqrt(LC)--slows down the transient response. And for some reason, the LiPo battery model given for the project neglects the naturally <i>huge</i> (>> 1 mF) capacitance. If such a large capacitance were modeled, the LC tank would slow down the response unacceptably during the transient to the charging mode. Reducing L is desirable for speeding up the transient and keeping the part cost down, but the current ripple at the inductor might then push the peak current above the inductor's saturation current.</div>
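To get a feel for how L and C set the speed of the response, here is a small Python sketch of the natural frequency f<sub>0</sub> = 1 / (2𝛑 sqrt(LC)). The two inductances are the L1 values that appear in this post; the 100 𝛍F capacitance is purely a hypothetical placeholder, not a value from the design.<br />

```python
import math


def lc_natural_frequency_hz(inductance_h, capacitance_f):
    """Natural frequency of an ideal LC tank: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


C_ASSUMED = 100e-6  # hypothetical capacitance, for illustration only

f_large_l = lc_natural_frequency_hz(247e-6, C_ASSUMED)  # L1 = 247 uH
f_small_l = lc_natural_frequency_hz(10e-6, C_ASSUMED)   # L1 = 10 uH
speedup = f_small_l / f_large_l  # equals sqrt(247 / 10), independent of C

print(f"f0 with 247 uH: {f_large_l:.0f} Hz")
print(f"f0 with  10 uH: {f_small_l:.0f} Hz ({speedup:.1f}x faster)")
```

Because the frequency scales with 1/sqrt(LC), shrinking L by a factor of about 25 only buys a factor of about 5 in speed, and the same square root penalty applies to a battery capacitance in the mF range.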
<div>
<br /></div>
<div>
I think one way to work around the physics of the problem is to put a cascaded LC filter on the input and output side of the power stage, as you can see in this schematic:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1QNznVOoqhv_pzsB2MJcnMxDtaUReQkWidBQXGrDFJRPaB-noItk_aMBtasqBJ9WdcAQY6FQoUheXNjAP32vZxHTsqwMN5wzZwIeqveo1SERTveNmXPqjMNONmqGapp6FgVTohBoglI7t/s1600/Screen+Shot+2017-07-08+at+11.08.19+AM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="75" data-original-width="154" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1QNznVOoqhv_pzsB2MJcnMxDtaUReQkWidBQXGrDFJRPaB-noItk_aMBtasqBJ9WdcAQY6FQoUheXNjAP32vZxHTsqwMN5wzZwIeqveo1SERTveNmXPqjMNONmqGapp6FgVTohBoglI7t/s1600/Screen+Shot+2017-07-08+at+11.08.19+AM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">L2-Cout1 damps out the switching frequency ripple. L2 can easily be an order of magnitude smaller than the main inductor in the power stage.</td></tr>
</tbody></table>
<div>
By using a small inductor-capacitor combination, the high frequency (at the switching frequency Fs) V<sub>bus</sub> ripple can be reduced with negligible impact on the slower dynamics of the inner power stage circuit. I want to do the same thing on the input end, but the assignment allows at most 2 inductors, so I held back.</div>
<h2>
Inductor design</h2>
In this assignment, the Ferroxcube 3F3 material (ferrite) is suggested as the core material. Ferrite cores have the B-H (magnetic flux density vs. magnetic field intensity) nonlinearity depicted below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOCkHgtDkPktqW0ArFkCP4mdOscggEa4HPQBMfUnN5ENN09QOhcAGp2N6jyPoInqYXH7AqcvPBQ_QLGJi-hkTf65LE4Hqa0w4xAmsisKVnK26lGG7kCBAQpn37h6a8zBeCmtVkblRnyUVs/s1600/Screen+Shot+2017-07-08+at+7.02.35+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="318" data-original-width="326" height="312" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOCkHgtDkPktqW0ArFkCP4mdOscggEa4HPQBMfUnN5ENN09QOhcAGp2N6jyPoInqYXH7AqcvPBQ_QLGJi-hkTf65LE4Hqa0w4xAmsisKVnK26lGG7kCBAQpn37h6a8zBeCmtVkblRnyUVs/s320/Screen+Shot+2017-07-08+at+7.02.35+AM.png" width="320" /></a></div>
3 core material properties for the 3F3 are:<br />
<ul>
<li>Saturation flux density B<sub>sat</sub> 0.33T ~ 0.43T, between 25 ºC ~ 100 ºC</li>
<li>Remanent flux density B<sub>r</sub> 0.12T</li>
<li>Coercive force H<sub>c</sub> ~ 12 A/m (relatively temperature insensitive)</li>
</ul>
<h3>
Choosing the inductance value</h3>
<div>
At first, I tried to use L1 = 247 𝛍H calculated above, and went through the core geometrical constant method (the "K<sub>g</sub>" method explained in Chapter 14 of <a href="https://smile.amazon.com/Fundamentals-Power-Electronics-Robert-Erickson/dp/0792372700/ref=sr_1_1?ie=UTF8&qid=1499437226&sr=8-1&keywords=fundamentals+of+power+electronics">the course textbook</a>, which I follow in the next section) to design the filter inductor L1. But doing that, I wound up with a HUGE inductor (one that weighs <i>hundreds</i> of grams). A USB charging IC sells for less than $1 at <a href="https://www.digikey.com/products/en?keywords=MAX1555&pkeyword=MAX1555&v=">Digikey</a>, so I reasoned that a huge inductor that is an order of magnitude or more expensive than the power management IC is a bad design, and looked for ways to reduce L1. One such way is the cascaded LC filter at the USB bus output end that I explained earlier. I also read the rubric, and did not actually find a requirement for the 10 % current ripple on L1, so I empirically arrived at L1 and L2 values that met the requirements. </div>
<h3>
"K<sub>g</sub>" method </h3>
The first step is to enumerate the requirements and constraints on the inductor:<br />
<ul>
<li>Winding material resistivity 𝛒: assuming copper at 100 ºC, 2.3E-8 𝛀m.</li>
<li>Desired inductance: L1 = 10 𝛍H, L2 = 1 𝛍H.</li>
<li>Maximum current I<sub>max</sub> looks to be about 30 A for L1 (the main inductor) and -30 A for L2 (the isolation inductor).</li>
<li>B<sub>max</sub> < B<sub>sat</sub>, which is a material property of the magnetic core. For ferrite core operating rather hot, B<sub>sat</sub> = 0.4 T feels reasonable, so B<sub>max</sub> = 0.3 T.</li>
<li>Fill factor K<sub>u</sub> = 0.5 if using not-so-thin wire.</li>
<li>R should be determined by acceptable copper loss = I<sup>2</sup><sub>rms</sub> R. Let's say I accept 0.5 W copper loss (1 W through L1 and L2) at 3 A boost or charge mode. Then maximum R = 0.5/3<sup>2</sup> = 56 m𝛀 for each inductor.</li>
</ul>
The initial guess for K<sub>g</sub> is (𝛒 / (R K<sub>u</sub>)) (L I<sub>max</sub> / B<sub>max</sub>)<sup>2</sup>, which works out to 8.2E-3 cm<sup>5</sup> for L1, and 0.082E-3 cm<sup>5</sup> for L2. Appendix D of <a href="https://smile.amazon.com/Fundamentals-Power-Electronics-Robert-Erickson/dp/0792372700/ref=sr_1_1?ie=UTF8&qid=1499437226&sr=8-1&keywords=fundamentals+of+power+electronics">the course textbook</a> lists some widely used standard ferrite cores. Among them, the pot core types 1811 and 905 have K<sub>g</sub> just above the values calculated for L1 and L2 (9.4E-3 and 0.183E-3, according to the table below), and weigh 7.3 g and 1.0 g, respectively.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6CnevIlGnduVBt6DEvn1RTWUTHfqH7-ebQki8ObgzVFaUOFLEnz1cJidgYLE0JWMB4uShMf46R5seZXwKVu17vcIAkw6eSoCsLlhlxCYRcgst9GavWdlxn4LcFSP8m6TmUKMsypNzVvur/s1600/Screen+Shot+2017-07-07+at+7.41.41+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="366" data-original-width="646" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6CnevIlGnduVBt6DEvn1RTWUTHfqH7-ebQki8ObgzVFaUOFLEnz1cJidgYLE0JWMB4uShMf46R5seZXwKVu17vcIAkw6eSoCsLlhlxCYRcgst9GavWdlxn4LcFSP8m6TmUKMsypNzVvur/s1600/Screen+Shot+2017-07-07+at+7.41.41+AM.png" /></a></div>
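The K<sub>g</sub> estimate can be reproduced with a few lines of Python. This is only a sketch under my own naming (<code>kg_required_cm5</code> is hypothetical); it takes 𝛒 in 𝛀cm and applies the 10<sup>8</sup> factor that converts the SI result into cm<sup>5</sup>. The inputs are the ones listed in the bullets above, with R rounded to 56 m𝛀.<br />

```python
def kg_required_cm5(rho_ohm_cm, l_h, i_max_a, b_max_t, r_ohm, k_u):
    """Minimum core geometrical constant Kg in cm^5.

    Kg >= rho * L^2 * Imax^2 / (Bmax^2 * R * Ku) * 1e8, with rho in ohm-cm,
    L in H, Imax in A, Bmax in T and R in ohm.
    """
    return rho_ohm_cm * l_h**2 * i_max_a**2 / (b_max_t**2 * r_ohm * k_u) * 1e8


RHO_CU_100C = 2.3e-6  # ohm-cm (2.3E-8 ohm-m), copper at 100 C
R_MAX = 0.056         # ohm, from 0.5 W / (3 A)^2, rounded as in the text

kg_l1 = kg_required_cm5(RHO_CU_100C, 10e-6, 30.0, 0.3, R_MAX, 0.5)
kg_l2 = kg_required_cm5(RHO_CU_100C, 1e-6, 30.0, 0.3, R_MAX, 0.5)

# Kg of the two pot cores picked from the textbook table
CORE_KG_CM5 = {"1811": 9.4e-3, "905": 0.183e-3}

print(f"L1 needs Kg >= {kg_l1:.2e} cm^5, core 1811 has {CORE_KG_CM5['1811']:.2e}")
print(f"L2 needs Kg >= {kg_l2:.2e} cm^5, core 905 has {CORE_KG_CM5['905']:.2e}")
```

Note that K<sub>g</sub> grows with L<sup>2</sup>, which is why dropping L1 from 247 𝛍H to 10 𝛍H shrinks the required core so dramatically.<br />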
There are other core designs that will also meet the K<sub>g</sub> requirement (like the EE core), but since I don't yet understand the subtleties of core material choice, I am just going with the 1st choice.<br />
<br />
The next step is to calculate the required air gap <i>l</i><sub>g</sub> (inductors store most of the magnetic energy in the air gap) with this formula: <i>l</i><sub>g</sub> = L (𝛍<sub>0</sub>/A<sub>c</sub>) (I<sub>max</sub> / B<sub>max</sub>)<sup>2</sup>, where 𝛍<sub>0</sub> is the air's magnetic permeability 4𝛑 E-7 and A<sub>c</sub> is the core's cross sectional area (0.433 cm<sup>2</sup> and 0.101 cm<sup>2</sup> in the above table). Using the values for L1 and L2 above, I get 2.9 mm and 1.24 mm for inductors L1 and L2.<br />
<br />
Next, N--the number of turns wound around this core--is calculated with N = (L I<sub>max</sub> ) / (B<sub>max </sub>A<sub>c</sub>). I get 23.1 for L1 and 9.9 for L2, so just round them to 23 and 10, respectively. These correspond to A<sub>L </sub>= L 10<sup>9</sup> / N<sup>2</sup>--inductance per turn squared, which can be measured while grinding down the core post to create the air gap--of 18.7 nH and 10.2 nH for the 2 inductors (computed with the unrounded N). A sanity check at this point is to approximate (assuming core permeability 𝛍<sub>c</sub> >> 𝛍<sub>0</sub>) the magnetic flux density B as ≈ 𝛍<sub>0</sub> (N / <i>l</i><sub>g</sub>) I; i.e. proportional to the current through the winding and inversely proportional to the air gap. For the L1 and L2 calculations made so far, B then approximates to 10 mT times the current, so that at 30 A, B is about 0.3 T, which is the B<sub>max</sub> I designed for.<br />
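The air gap, turns, A<sub>L</sub>, and flux density check above can be bundled into one helper. The Python sketch below uses my own hypothetical function name <code>gapped_inductor</code> and works in SI units internally, converting A<sub>c</sub> from the cm<sup>2</sup> given in the textbook table.<br />

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space [H/m]


def gapped_inductor(l_h, i_max_a, b_max_t, a_c_cm2):
    """Air gap, turns, A_L and B-per-amp for a gapped core (Kg method steps)."""
    a_c_m2 = a_c_cm2 * 1e-4                                  # cm^2 -> m^2
    l_g_m = MU_0 * l_h * i_max_a**2 / (b_max_t**2 * a_c_m2)  # air gap [m]
    n_exact = l_h * i_max_a / (b_max_t * a_c_m2)             # turns before rounding
    n = round(n_exact)
    a_l_nh = l_h * 1e9 / n_exact**2                          # nH per turn^2
    b_per_amp_t = MU_0 * n / l_g_m                           # B ~ mu0 * (N / lg) * I
    return {"l_g_mm": l_g_m * 1e3, "n_exact": n_exact, "n": n,
            "a_l_nh": a_l_nh, "b_mt_per_a": b_per_amp_t * 1e3}


l1 = gapped_inductor(10e-6, 30.0, 0.3, 0.433)  # pot core 1811
l2 = gapped_inductor(1e-6, 30.0, 0.3, 0.101)   # pot core 905

for name, r in (("L1", l1), ("L2", l2)):
    print(f"{name}: lg = {r['l_g_mm']:.2f} mm, N = {r['n_exact']:.1f} -> {r['n']}, "
          f"A_L = {r['a_l_nh']:.1f} nH, B = {r['b_mt_per_a']:.1f} mT/A")
```

Both inductors land at roughly 10 mT/A because the gap and the turn count were both derived from the same I<sub>max</sub> and B<sub>max</sub>; only the rounding of N moves the result slightly off 0.3 T at 30 A.<br />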
<br />
Next, the winding wire cross-sectional area A<sub>wire</sub> should be <i>less</i> than K<sub>u</sub> W<sub>A</sub> / N, where W<sub>A</sub> is the core's window area (bobbin winding area in the above table): 0.187 cm<sup>2</sup> and 0.034 cm<sup>2</sup> for the 2 cores chosen. So maximum A<sub>wire</sub> comes out to 4E-3 cm<sup>2</sup> and 1.7E-3 cm<sup>2</sup> for L1 and L2. In the same Appendix D of <a href="https://smile.amazon.com/Fundamentals-Power-Electronics-Robert-Erickson/dp/0792372700/ref=sr_1_1?ie=UTF8&qid=1499437226&sr=8-1&keywords=fundamentals+of+power+electronics">the course textbook</a>, table D.6 lists the AWG (American Wire Gauge) specs. From this table, I picked out AWG #22 (A<sub>wire</sub> = 3.243E-3 cm<sup>2</sup>; diameter 701 𝛍m) and AWG #25 (A<sub>wire</sub> = 1.6E-3 cm<sup>2</sup>; diameter 505 𝛍m) as being reasonably close to the maximum A<sub>wire</sub> calculated above. To check that these wire choices do indeed meet the target copper loss, calculate R<sub>ser</sub> = 𝛒 N (MLT) / A<sub>wire</sub>, where MLT is the mean length per turn for the cores given in the above table: 3.71 cm and 1.90 cm for the 1811 and 905 respectively. So R<sub>ser</sub> comes out to 60 m𝛀 for L1 and 27 m𝛀 for L2, which are close to or below the 56 m𝛀 sought at the beginning. For convenience, here is the comparison of the 2 inductors:<br />
<table>
<thead>
<tr>
<th><br /></th>
<th><span style="font-weight: normal; text-align: start;">"K</span><sub style="font-weight: normal; text-align: start;">g</sub><span style="font-weight: normal; text-align: start;"> method" </span>formula</th>
<th>L1</th>
<th>L2</th>
</tr>
</thead>
<tbody>
<tr>
<td>Target inductance L [𝛍H]</td>
<td>Input</td>
<td>10</td>
<td>1</td>
</tr>
<tr>
<td>Target serial resistance R<sub>ser</sub> [m𝛀]</td>
<td>Input</td>
<td>56</td>
<td>56</td>
</tr>
<tr>
<td>Target K<sub>g</sub> [cm<sup>5</sup>]</td>
<td>(𝛒 / R K<sub>u</sub>) (L I<sub>max</sub> / B<sub>max</sub>)<sup>2</sup></td>
<td>8.2E-3</td>
<td>0.082E-3</td>
</tr>
<tr>
<td>Off-the-shelf K<sub>g</sub> [cm<sup>5</sup>]</td>
<td>Textbook Table D.1</td>
<td>9.4E-3</td>
<td>0.183E-3</td>
</tr>
<tr><td>Ferrite pot core size [mm x mm]</td><td>Textbook Table D.1</td><td>18x11</td><td>9x5</td></tr>
<tr><td>Core weight [g]</td><td>Textbook Table D.1</td><td>7.3</td><td>1.0</td></tr>
<tr><td>Cross section area A<sub>c</sub> [cm<sup>2</sup>]</td><td>Textbook Table D.1</td><td>0.433</td><td>0.101</td></tr>
<tr><td>Winding (window) area W<sub>A</sub> [cm<sup>2</sup>]</td><td>Textbook Table D.1</td><td>0.187</td><td>0.034</td></tr>
<tr><td>Magnetic path length <i>l</i><sub>m</sub> [cm]</td><td>Textbook Table D.1</td><td>2.6</td><td>1.26</td></tr>
<tr>
<td>Air gap <i>l</i><sub>g</sub> [mm]</td>
<td>L (𝛍<sub>0</sub>/A<sub>c</sub>) (I<sub>max</sub> / B<sub>max</sub>)<sup>2</sup></td>
<td>2.9</td>
<td>1.24</td>
</tr>
<tr><td>Number of turns N</td><td>round((L I<sub>max</sub> ) / (B<sub>max </sub>A<sub>c</sub>))</td><td>23</td><td>10</td></tr>
<tr><td>Inductance per turn A<sub>L</sub> [nH]</td><td>L 10<sup>9</sup> / N<sup>2</sup></td><td>18.7</td><td>10.2</td></tr>
<tr><td>B<sub>scale</sub> [mT/A]</td><td>𝛍<sub>0</sub> (N / <i>l</i><sub>g</sub>)</td><td>10</td><td>10</td></tr>
<tr><td>Maximum A<sub>wire</sub> [cm<sup>2</sup>]</td><td>K<sub>u</sub> W<sub>A</sub> / N</td><td>4E-3</td><td>1.7E-3</td></tr>
<tr><td>AWG chosen</td><td>< Maximum A<sub>wire</sub> </td><td>#22</td><td>#25</td></tr>
<tr><td>AWG A<sub>wire</sub> [cm<sup>2</sup>]</td><td><br /></td><td>3.2E-3</td><td>1.6E-3</td></tr>
<tr><td>AWG diameter [mm]</td><td><br /></td><td>0.7</td><td>0.5</td></tr>
<tr><td>Winding resistance R<sub>ser</sub> [m𝛀]</td><td>𝛒 N (MLT) / A<sub>wire</sub></td><td>60</td><td>27</td></tr>
</tbody>
</table>
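The last two rows of the table (wire sizing and winding resistance) can be checked the same way. In this Python sketch the function names are hypothetical, and the 1.90 cm mean length per turn used for L2 is my assumption, chosen because it reproduces the 27 m𝛀 in the table; the other inputs are the table values.<br />

```python
def max_wire_area_cm2(k_u, w_a_cm2, n_turns):
    """Largest wire cross section that fits: Ku * WA / N."""
    return k_u * w_a_cm2 / n_turns


def winding_resistance_ohm(rho_ohm_cm, n_turns, mlt_cm, a_wire_cm2):
    """Series resistance of the winding: rho * N * MLT / A_wire."""
    return rho_ohm_cm * n_turns * mlt_cm / a_wire_cm2


RHO_CU_100C = 2.3e-6  # ohm-cm, copper at 100 C
K_U = 0.5             # fill factor

# (name, N, WA [cm^2], MLT [cm], chosen AWG area [cm^2])
DESIGNS = [
    ("L1", 23, 0.187, 3.71, 3.243e-3),  # pot core 1811, AWG #22
    ("L2", 10, 0.034, 1.90, 1.6e-3),    # pot core 905, AWG #25; MLT is assumed
]

results = {}
for name, n, w_a, mlt, a_wire in DESIGNS:
    a_max = max_wire_area_cm2(K_U, w_a, n)
    r_ser = winding_resistance_ohm(RHO_CU_100C, n, mlt, a_wire)
    results[name] = (a_max, r_ser, a_wire <= a_max)
    print(f"{name}: max A_wire = {a_max:.2e} cm^2, wire fits: {a_wire <= a_max}, "
          f"R_ser = {r_ser * 1e3:.0f} mOhm")
```
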
In the schematic, I now replace the numeric values of L1 and L2 with a SPICE model that includes H<sub>c</sub>, B<sub>r</sub>, B<sub>s</sub>, A<sub>c</sub>, <i>l</i><sub>m</sub>, <i>l</i><sub>g</sub>, N, and R<sub>ser</sub>:<br />
<ul>
<li>L1 node1 node2 Hc=12 Br=0.12 Bs=0.33 A=43u Lm=26m Lg=2.9m N=23 Rser=60m</li>
<li>L2 node1 node2 Hc=12 Br=0.12 Bs=0.33 A=10u Lm=12.6m Lg=1.24m N=10 Rser=27m</li>
</ul>
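Since the table is in cm and mm while SPICE expects MKS units, the conversion is easy to get wrong by hand. Here is a small Python sketch that builds the inductor line from the table values; <code>spice_inductor_line</code> is a hypothetical helper of mine, and it only formats text (it does not talk to LTSpice).<br />

```python
def spice_inductor_line(name, hc, br, bs, a_c_cm2, l_m_cm, l_g_mm, n, r_ser_mohm):
    """Format a nonlinear-core inductor line, converting table units to MKS."""
    a_um2 = a_c_cm2 * 100.0  # cm^2 -> 1e-6 m^2, written with the 'u' suffix
    l_m_mm = l_m_cm * 10.0   # cm -> 1e-3 m, written with the 'm' suffix
    return (f"{name} node1 node2 Hc={hc:g} Br={br:g} Bs={bs:g} "
            f"A={a_um2:g}u Lm={l_m_mm:g}m Lg={l_g_mm:g}m "
            f"N={n} Rser={r_ser_mohm:g}m")


line_l1 = spice_inductor_line("L1", 12, 0.12, 0.33, 0.433, 2.6, 2.9, 23, 60)
line_l2 = spice_inductor_line("L2", 12, 0.12, 0.33, 0.101, 1.26, 1.24, 10, 27)
print(line_l1)
print(line_l2)
```

The only difference from the lines above is that the helper keeps the extra digit of A<sub>c</sub> (43.3u and 10.1u) instead of truncating it.<br />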
Here is an example of entering the SPICE line into the inductor component in LTSpice (^ + right click on the inductor component):<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7pCx1e5bTCATLIflfWxtGtbuG9FznC5DURgwsoSzQR1rU6d-qhnOl8JRzpHBuSQFiLGsXG1c38tytRA0PAnJSwnjOwqzEM3vdZM0LiP2R-5NMVgcOIzhBbNRHbV4rx2QccwPNwgLMtcu0/s1600/Screen+Shot+2017-07-08+at+8.31.13+AM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="156" data-original-width="546" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7pCx1e5bTCATLIflfWxtGtbuG9FznC5DURgwsoSzQR1rU6d-qhnOl8JRzpHBuSQFiLGsXG1c38tytRA0PAnJSwnjOwqzEM3vdZM0LiP2R-5NMVgcOIzhBbNRHbV4rx2QccwPNwgLMtcu0/s1600/Screen+Shot+2017-07-08+at+8.31.13+AM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">SPICE works in MKS units, so some mental conversion from the above table is required. I actually found that the "Value" line could not be deleted, so I wrote the SpiceLine on the Value line instead.</td></tr>
</tbody></table>
And here is the result of simulation that uses the non-linear, lossy inductors thus specified:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVE6q4k_mgjzi4tEf3TFXdKYF2E_esWEs37XweUDJ9qfQ56j-OCoch5VsXyew-8LWMzoDP8donv6lYzjxVetL_qljVcKv25TKagHwuJwD8Rfhybw1aFVeBZfwWYFq-kKsvyl9BbNNOHqz2/s1600/Screen+Shot+2017-07-08+at+9.58.18+AM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1020" data-original-width="1600" height="408" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVE6q4k_mgjzi4tEf3TFXdKYF2E_esWEs37XweUDJ9qfQ56j-OCoch5VsXyew-8LWMzoDP8donv6lYzjxVetL_qljVcKv25TKagHwuJwD8Rfhybw1aFVeBZfwWYFq-kKsvyl9BbNNOHqz2/s640/Screen+Shot+2017-07-08+at+9.58.18+AM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Vbus and Ibus settle quickly to the 5 V/2 A and the 20 V/3 A steady state conditions for the buck and boost modes, and the maximum inductor current stays below the saturating current.</td></tr>
</tbody></table>
<h2>
Extra feature: current limiting</h2>
<div>
Although not plotted above, I found that the current spikes on the battery were approaching 200 A--clearly undesirable (and also physically impossible, due to the huge capacitance that should normally accompany a regular LiPo battery), so I came up with a current limiting circuit. The current sense chip I've been drawing above (but not utilizing yet) only detects current in the forward direction, so I threw in another one, like this:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOjenezmRfeTgz0avxLG9MOvtCvsdoViX_T4CJxsrDY9mY5tKCi3DvsIDaCxbLNV7Gl65x-R8pQWl4wwcfcsibScnPghOedRhC6exVT4XyZMkyiu9Q3WqyhlZNJl3p4baSK6MsBBXwazvf/s1600/Screen+Shot+2017-07-29+at+11.10.07+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="201" data-original-width="269" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOjenezmRfeTgz0avxLG9MOvtCvsdoViX_T4CJxsrDY9mY5tKCi3DvsIDaCxbLNV7Gl65x-R8pQWl4wwcfcsibScnPghOedRhC6exVT4XyZMkyiu9Q3WqyhlZNJl3p4baSK6MsBBXwazvf/s1600/Screen+Shot+2017-07-29+at+11.10.07+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Vcs measures the current in the forward direction, and Mcs measures in the opposite direction.</td></tr>
</tbody></table>
<div>
I then low-pass filter it (1 pole) before feeding it into a current limit comparator:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiF93i8wULgUu7kB5JbA8NiHjOz3FtAO4N2pNCFyd_0NNnKB7LfmJiByTnpPMQPUlMO1lbwyQQ9CzmJCNYRcNWrpSPTecXGbSFumXO0kZMcUqo2MlqPB-jsvkUuKHBfZiOwQDnrhSXnkxQl/s1600/Screen+Shot+2017-07-29+at+11.13.07+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="162" data-original-width="199" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiF93i8wULgUu7kB5JbA8NiHjOz3FtAO4N2pNCFyd_0NNnKB7LfmJiByTnpPMQPUlMO1lbwyQQ9CzmJCNYRcNWrpSPTecXGbSFumXO0kZMcUqo2MlqPB-jsvkUuKHBfZiOwQDnrhSXnkxQl/s1600/Screen+Shot+2017-07-29+at+11.13.07+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The comparator has a large input impedance, so should not load the current sense chip.</td></tr>
</tbody></table>
<div>
The opposite direction works exactly like above, except for using Mcs as the input. The current limit (iLlim) is hard wired to 90 % of 5 V, or 4.5 V, which means I will cap the current at 4.5 V / (20 x 0.01 𝛀) = 22.5 A, which is well below the 30 A saturation limit picked for the main inductor above.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwYZEuTEczpm-O7wBH1-BG1g0BJ3hYGiOUn33mDDGafT1vSJ33bZ50XHf1563d114xBpXSBvDvNdlIz3Oq4SFUu9Xu71_rW9U7rtue7mQNQSk8mDWbi0MAc2lanII5X9xjv47rAQER9V38/s1600/Screen+Shot+2017-07-29+at+11.18.59+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="69" data-original-width="143" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwYZEuTEczpm-O7wBH1-BG1g0BJ3hYGiOUn33mDDGafT1vSJ33bZ50XHf1563d114xBpXSBvDvNdlIz3Oq4SFUu9Xu71_rW9U7rtue7mQNQSk8mDWbi0MAc2lanII5X9xjv47rAQER9V38/s1600/Screen+Shot+2017-07-29+at+11.18.59+PM.png" /></a></div>
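The current limit arithmetic can be written out as a quick Python check. I am assuming here that the 20 in the formula above is the current sense amplifier gain and 0.01 𝛀 is the sense resistance; the function name <code>current_limit_a</code> is hypothetical.<br />

```python
def current_limit_a(v_threshold, sense_gain, r_sense_ohm):
    """Current at which the sensed voltage (gain * R_sense * I) hits the threshold."""
    return v_threshold / (sense_gain * r_sense_ohm)


V_SUPPLY = 5.0
I_SAT = 30.0  # saturation limit picked for the main inductor

# Threshold hard wired to 90 % of the 5 V supply; gain and R_sense are assumed
i_lim = current_limit_a(0.9 * V_SUPPLY, sense_gain=20, r_sense_ohm=0.01)
print(f"current limit = {i_lim:.1f} A, margin to saturation = {I_SAT - i_lim:.1f} A")
```
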
<div>
To actually turn off the MOSFETs correctly when the current exceeds the threshold, the digital logic must be changed as follows:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwUj957vAZ7fEx-wE6dA8a3wlwBxMPFw5oqc1r0YzVbzugZ0CRk8l1D3Y1f883Fyp5Ur0hm7WxclRREEBihIQS_snayYaGeBx6iS4VKF1ejL7s7sOxTCc9kdN6DzI3agS42SuU6rSouBIm/s1600/Screen+Shot+2017-07-29+at+11.22.54+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="93" data-original-width="405" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwUj957vAZ7fEx-wE6dA8a3wlwBxMPFw5oqc1r0YzVbzugZ0CRk8l1D3Y1f883Fyp5Ur0hm7WxclRREEBihIQS_snayYaGeBx6iS4VKF1ejL7s7sOxTCc9kdN6DzI3agS42SuU6rSouBIm/s1600/Screen+Shot+2017-07-29+at+11.22.54+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Buck stage complementary signal pair modified for current limitation.</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-qx3GjjP9q8pFS65kKRTC2zdoUdfGFrDksgE9F4uRgKsrUCNKjn1MSQzhsu3V8W7XNW08FcunMM1Epoojd1l7amevUrCY0N1hraTCYgeK72wdSmf0e7f7DBTJCcqbSp41YsrQQrTePsDV/s1600/Screen+Shot+2017-07-29+at+11.23.58+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="135" data-original-width="469" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-qx3GjjP9q8pFS65kKRTC2zdoUdfGFrDksgE9F4uRgKsrUCNKjn1MSQzhsu3V8W7XNW08FcunMM1Epoojd1l7amevUrCY0N1hraTCYgeK72wdSmf0e7f7DBTJCcqbSp41YsrQQrTePsDV/s1600/Screen+Shot+2017-07-29+at+11.23.58+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Boost stage complementary signal pair modified for current limitation.</td></tr>
</tbody></table>
<div>
I verified in LTSpice that the battery current is now capped at around 25 A, and the inductor current in either direction is also capped at about the same value.</div>
<br />Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com4tag:blogger.com,1999:blog-4032020337247582619.post-154296822590881502017-06-17T22:13:00.000-07:002017-06-17T22:13:11.951-07:00Understanding the UKF SLAM in MatlabIn my <a href="http://henryomd.blogspot.com/2017/06/understanding-ekf-slam.html">previous post</a>, I worked through a 2-D EKF SLAM (Simultaneous Localization and Mapping) homework assignment for the <a href="http://ais.informatik.uni-freiburg.de/teaching/ws13/mapping/">online course by Prof. Cyrill Stachniss</a>. In this blog entry, I get to solve the same problem using the UKF (Unscented Kalman Filter). Being the same non-linear problem, the state formation and the equations of forward process evolution and observation are the same as in the EKF SLAM case. In reading the exposition below, it might be helpful to pull up my <a href="http://henryomd.blogspot.com/2017/06/understanding-ekf-slam.html">previous post</a> next to this post.<br />
<h2>
Prediction step</h2>
<div>
In EKF, I simply fed the current state estimate <u>𝛍</u><sup>+</sup>[k-1] (and the external input <u><b>u</b></u>[k]) to the nonlinear function g(<u><b>u</b></u>[k]; <u>𝛍</u><sup>+</sup>[k-1]) to predict the next state:<br />
<blockquote class="tr_bq">
<u>𝛍</u><sup>-</sup>[k] = g(<u><b>u</b></u>[k]; <u>𝛍</u><sup>+</sup>[k-1])</blockquote>
But in UKF, I have to go through the UT (Unscented Transform): generate an ensemble (called sigma points 𝛘<sup>+</sup>[k-1] in section 3.4.1 of the <a href="http://probabilistic-robotics.org/">Probabilistic Robotics book</a>) around <u>𝛍</u><sup>+</sup>[k-1], and feed all those points to g(). The weighted mean and covariance of the transformed ensemble are then the new <u>𝛍</u><sup>-</sup>[k] and 𝚺<sub>𝛍𝛍</sub><sup>-</sup>[k].<br />
<h3>
Sigma points 𝛘</h3>
The ensemble 𝛘<sup>+</sup>[k-1] should be centered around <u>𝛍</u><sup>+</sup>[k-1], with enough "spread"--based on 𝚺<sub>𝛍𝛍</sub><sup>+</sup>[k-1]. 𝛘<sup>+</sup><sub>0</sub>[k-1] is the current estimate <u>𝛍</u><sup>+</sup>[k-1]. Subsequent points step in the + and - directions along each dimension of the uncertainty. I visualize this as a pair of points roughly 1~2 standard deviations on either side of <u>𝛍</u><sup>+</sup>[k-1]. A pair of points for each of the n state dimensions produces 1 + 2n sample points, as you can see in the left ellipsoid below, which is rotated relative to the principal (x-y) axes because of the non-zero cross-covariance term in 𝚺<sub>𝛍𝛍</sub><sup>+</sup>[k-1].<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuQmNs4itd_rAKZqtmoiDHEUS1BM_dRx1fGUIZiyn4bKNXXdvi4UjY8_l8PSigXb3MZs37EAOhc7P51324HqrQTqP-HKDl_qlQmlo3k3hC4kq1atRvEDnksb0JWJKixpK6u_ehNSHTRH2C/s1600/Screen+Shot+2017-06-17+at+6.37.09+PM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="128" data-original-width="244" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuQmNs4itd_rAKZqtmoiDHEUS1BM_dRx1fGUIZiyn4bKNXXdvi4UjY8_l8PSigXb3MZs37EAOhc7P51324HqrQTqP-HKDl_qlQmlo3k3hC4kq1atRvEDnksb0JWJKixpK6u_ehNSHTRH2C/s1600/Screen+Shot+2017-06-17+at+6.37.09+PM.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Unscented transform maps the sigma points through g()</td></tr>
</tbody></table>
Along with 𝛘<sup>+</sup>[k-1], we have to generate weights <i>w</i><sub>m</sub> and <i>w</i><sub>c</sub> to use when calculating the mean and the covariance of the transformed points, respectively. In the book, <i><u><b>w</b></u></i><sub>m</sub> and <i><u><b>w</b></u></i><sub>c</sub> are the same except for <i>w</i><sub>c</sub><sub>0 </sub>= <i>w</i><sub>m</sub><sub>0</sub> + (1 - 𝛂<sup>2</sup> + 𝛃). Although the book presents only one sigma point generation and weighting algorithm, there are apparently many variations, listed for example in <a href="https://smile.amazon.com/Kalman-Filtering-Theory-Practice-MATLAB/dp/1118851218/ref=sr_1_1?ie=UTF8&qid=1497750676&sr=8-1&keywords=grewal+andrews+kalman">Grewal and Andrews</a>. The only thing that I did not understand is the extra (1 - 𝛂<sup>2</sup> + 𝛃) term on <i>w</i><sub>c</sub><sub>0</sub>, because the weights are supposed to sum to 1 (normalized); I checked that <i>w</i><sub>m</sub> sums to 1, which means that <i>w</i><sub>c</sub> does <i>not</i> sum to 1. Until I can track down the Ph.Ds in my team for a satisfying answer (don't hold your breath), here is the Matlab code to generate the sigma points:<br />
<br />
<div style="font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">function</span> [X, wm, wc] = compute_sigma_points(mu, sigma)</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% Computes the 2n+1 sigma points according to the unscented transform,</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% where n is the dimensionality of the mean vector mu.</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% The sigma points should form the columns of sigma_points,</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% i.e. sigma_points is an nx2n+1 matrix.</div>
<div style="color: #25992d; font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;">n = length(mu); </span>% state dimension</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">global</span> alpha kappa beta;</div>
<div style="font-family: Courier; line-height: normal;">
lambda = max(alpha^2 * (n + kappa) - n, 1);</div>
<div style="font-family: Courier; line-height: normal;">
scale = lambda + n;</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
[U, D, V] = svd(sigma); <span style="color: #25992d;">%[V,D] = eig(sigma);</span></div>
<div style="font-family: Courier; line-height: normal;">
D = sqrt(D);</div>
<div style="line-height: normal;">
<span style="font-family: "courier";">D = max(D, 0.05 * eye(n)); </span><span style="color: #25992d; font-family: "courier";">% lower bound </span><span style="color: #25992d; font-family: "courier";">explained below</span></div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;">sigmasqr = </span><span style="color: #0433ff;">...</span>sqrt(n+lambda) * Seems to be too big!</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span><span style="color: #0433ff;">...</span> sqrtm(sigma); %chol(sigma);</div>
<div style="font-family: Courier; line-height: normal;">
U * D * V'; <span style="color: #25992d;">%V * D / V;</span></div>
<div style="color: #25992d; font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
mureplicated = repmat(mu, 1, n);</div>
<div style="font-family: Courier; line-height: normal;">
X = [mu, mureplicated + sigmasqr, mureplicated - sigmasqr];</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% Computing the weights for recovering the mean and variance.</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% Note: weights are column vectors, to allow vector multiply w/ samples</div>
<div style="font-family: Courier; line-height: normal;">
wm = [lambda/scale; repmat(1/(2*scale), 2*n, 1)];</div>
<div style="font-family: Courier; line-height: normal;">
wc = wm;</div>
<div style="font-family: Courier; line-height: normal;">
wc(1) = wc(1) + 1 - alpha^2 + beta;</div>
<div style="color: #0433ff; font-family: Courier; line-height: normal;">
end</div>
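One way to look at the extra term on <i>w</i><sub>c</sub><sub>0</sub> is to check what the weights recover before any transformation. The sketch below uses the textbook spread sqrt((n + lambda) 𝚺) via a Cholesky factor, not the SVD-with-floor variant above, and the values of alpha, kappa, beta, mu and sigma are made up for illustration. With that spread, the sigma points reproduce mu and sigma from wm and wc even though wc does not sum to 1, because the first sigma point sits on the mean and its deviation is zero.

```matlab
% Sketch: textbook sigma points and weights; all numbers are illustrative.
alpha = 1; kappa = 1; beta = 2;            % assumed tuning parameters
mu    = [1; 2; 0.1];
sigma = [0.10 0.02 0; 0.02 0.20 0; 0 0 0.05];
n     = length(mu);
lambda = alpha^2 * (n + kappa) - n;

L = chol((n + lambda) * sigma, 'lower');   % L * L' = (n + lambda) * sigma
X = [mu, repmat(mu, 1, n) + L, repmat(mu, 1, n) - L];

wm = [lambda / (n + lambda); repmat(1 / (2 * (n + lambda)), 2 * n, 1)];
wc = wm;
wc(1) = wc(1) + (1 - alpha^2 + beta);      % extra term only on the first weight

mu_rec    = X * wm;                        % weighted mean of the sigma points
dX        = X - repmat(mu_rec, 1, 2 * n + 1);
sigma_rec = dX * diag(wc) * dX';           % weighted covariance

fprintf('sum(wm) = %g, sum(wc) = %g\n', sum(wm), sum(wc));
fprintf('mean error = %g, covariance error = %g\n', ...
        norm(mu_rec - mu), norm(sigma_rec - sigma));
```

After the points go through a nonlinear g(), the first point generally no longer coincides with the transformed mean, and that is where the extra weight on it starts to matter.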
<h3>
Prediction in SLAM</h3>
Unless some of the states are completely decoupled (the covariance between them is zero), one should feed the full <u>𝛍</u><sup>+</sup>[k-1] and 𝚺<sub>𝛍𝛍</sub><sup>+</sup>[k-1] to <span style="font-family: "courier";">compute_sigma_points</span>. But in SLAM, the landmarks do <i>not </i>move (one of the fundamental assumptions), so there is no need to predict the landmark positions, and I can constrain the prediction to the robot pose. For this 2-D planar robot, I can use Matlab's vector operations to transform <i>all </i>𝛘<sup>+</sup>[k-1]<i> </i>in just a few lines of code.<br />
<br />
<div style="font-family: Courier; line-height: normal;">
[X, wm, wc] = compute_sigma_points(mu(1:3), sigma(1:3,1:3));</div>
<div style="font-family: Courier; line-height: normal;">
theta = X(3, :);</div>
<div style="font-family: Courier; line-height: normal;">
ang = theta + rot1 + rot2;</div>
<div style="font-family: Courier; line-height: normal;">
pose_sample = [X(1, :) + trans * cos(theta + rot1);</div>
<div style="font-family: Courier; line-height: normal;">
X(2, :) + trans * sin(theta + rot1);</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> cos(ang); </span>% expand out x,y separately</div>
<div style="font-family: Courier; line-height: normal;">
sin(ang)];</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% Be careful when computing the robot's orientation (sum up the sines and</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% cosines and recover the 'average' angle via atan2)</div>
<div style="font-family: Courier; line-height: normal;">
mu(1:2) = pose_sample(1:2,:) * wm;</div>
<br />
<div style="font-family: Courier; line-height: normal;">
mu(3) = atan2(pose_sample(4,:) * wm, pose_sample(3,:) * wm);</div>
<div>
<br /></div>
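The reason for averaging the heading through its sine and cosine shows up as soon as the sigma points straddle the +/- pi boundary. A minimal sketch with made-up headings and weights (not taken from the SLAM data):

```matlab
% Sketch: weighted average of angles near +/- pi; values are illustrative.
theta = [pi - 0.1, -pi + 0.1, pi];   % three headings that all point roughly "backwards"
w     = [0.25; 0.25; 0.5];           % normalized weights

naive    = theta * w;                                   % plain weighted average
circular = atan2(sin(theta) * w, cos(theta) * w);       % average on the unit circle

fprintf('naive = %.4f rad, circular = %.4f rad\n', naive, circular);
```

The plain average lands near pi/2, a heading none of the samples is close to, while the unit-circle average stays at the +/- pi boundary where the samples actually are.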
I can see why <i>w</i><sub>m</sub> being normalized is important for keeping <u>𝛍</u><sup>-</sup>[k] unbiased. Prediction of the covariance is also concise in Matlab:<br />
<br />
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
dSigma = [pose_sample(1:2,:); ang] - mu(1:3) * ones(1,length(wm)); <span style="color: #25992d;">% subtract the mean from the transformed points</span></div>
<div style="font-family: Courier; line-height: normal;">
dSigma(3,:) = wrapToPi(dSigma(3,:)); <span style="color: #25992d;">% normalize angles again</span></div>
<div style="color: #25992d; font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
sigma(1:3,1:3) = dSigma * diag(wc) * dSigma' <span style="color: #0433ff;">...</span></div>
<div style="font-family: Courier; line-height: normal;">
+ diag([0.01 0.01 deg2rad(0.5)]); <span style="color: #25992d;">% robot move uncertainty</span></div>
<div>
<br /></div>
<h3>
Negative feedback loop between 𝚺<sub>mm</sub> and sigma point spread</h3>
<div style="line-height: normal;">
I first tried using the matrix square root or Cholesky decomposition to generate the pairs of points away from the mean, but I ran into a problem: there is a "negative" feedback between "uncertainty" (which can be seen in the norm of the covariance matrix) and the sigma point spread. Intuitively, if the sigma point spread scales linearly with the size of the hyper-ellipsoid, decreasing uncertainty leads to a smaller spread in the sigma points, which (unless the system equation is grossly non-linear or even singular) in turn leads to a smaller covariance matrix norm; because 𝚺<sub>mm</sub> does <i>not </i>grow during the prediction step like 𝚺<sub>xx</sub> (see my earlier comment about the landmark stationarity assumption in SLAM), this leads to 𝚺<sub>mm</sub> asymptotically decaying to 0. This contrasts sharply with the uncertainty of the landmark being lower bounded by that of the robot pose in EKF (they are completely correlated). By lower bounding the square roots of the singular values of 𝚺<sub>𝛍𝛍</sub> (the diagonal entries of D from the SVD) at 0.05, I tried to keep the landmark uncertainty from disappearing to 0. I am not sure yet if this is the best way to deal with the problem, but 𝚺<sub>mm</sub> --> 0 <i>is </i>a numerical problem at minimum because 𝚺<sub>𝛍𝛍</sub> becomes ill-conditioned. Because I always think about "how am I going to do this on a 32-bit or 16-bit MCU", I am particularly alarmed by this inconvenient discovery, which is usually not mentioned in the fundamental Kalman filter books. Even with this heuristic, the landmark position uncertainty is dramatically smaller than in EKF.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBskGprlEjjL99rx6ZEh6Nm7aqL8qdvvDajYRU6YKNDdqevSo7PLmqxDM7ad4vF9Sn0zg__q6TUqZRxlFmChTKE_EUY0UEBEnLSTwNTzoD1WLs89RuHltLKMYQi9P7ifg9-gN9NTw5MU-8/s1600/UFK+error.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="420" data-original-width="560" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBskGprlEjjL99rx6ZEh6Nm7aqL8qdvvDajYRU6YKNDdqevSo7PLmqxDM7ad4vF9Sn0zg__q6TUqZRxlFmChTKE_EUY0UEBEnLSTwNTzoD1WLs89RuHltLKMYQi9P7ifg9-gN9NTw5MU-8/s1600/UFK+error.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Robot pose (<span style="color: magenta;">magenta</span>) and landmark position (<span style="color: cyan;">cyan</span>) estimate at the end of 331 time steps. Note that the estimates have drifted less than in EKF SLAM.</td></tr>
</tbody></table>
<br /></div>
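The effect of the lower bound in compute_sigma_points can be seen on a toy covariance. This is a sketch with made-up numbers: one direction of the 2x2 sigma has almost collapsed, and the elementwise max against 0.05 * eye(n) raises only the diagonal of D, since the off-diagonal entries of both matrices are zero.

```matlab
% Sketch: flooring the principal standard deviations; values are illustrative.
sigma = diag([1e-6, 0.04]);         % variance along x has almost collapsed

[U, D, V] = svd(sigma);
D      = sqrt(D);                   % standard deviations along the principal axes
Dfloor = max(D, 0.05 * eye(2));     % elementwise max: only the diagonal is raised

spread_raw   = U * D * V';          % sigma point offsets without the floor
spread_floor = U * Dfloor * V';     % sigma point offsets with the floor

fprintf('condition number without floor: %g\n', cond(spread_raw));
fprintf('condition number with floor:    %g\n', cond(spread_floor));
```

The floor keeps the spread matrix well conditioned, at the price of sampling the collapsed direction more widely than the covariance itself says.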
<h2>
Correction step</h2>
The unscented transform is used again during the correction step--this time using the a-priori mean <u>𝛍</u><sup>-</sup>[k] and covariance 𝚺<sub>𝛍𝛍</sub><sup>-</sup>[k] from the prediction step, and using <i>all</i> of the states instead of just the robot pose portion used in the SLAM prediction step. To correct the states from observation, I need 2 things: the Kalman gain K[k] and the innovations from the landmark observations. But at a given time step k, the observation consists of old landmarks (from which innovations can be formed) and new landmarks. So I have to form K<sub>p</sub>[k] and an <i>ensemble </i>of innovations (<u><b>z</b></u><sub>p</sub>[k] - h<sub>p</sub>(𝛘<sup>-</sup>[k])) using just the previously observed landmarks at time step k (subscript "p" for previously known). Note that I <i>cannot</i> reuse the ensemble generated during prediction because the predicted state <u>𝛍</u><sup>-</sup>[k] is different from <u>𝛍</u><sup>+</sup>[k-1]. Forming the innovations from <b><u>z</u></b><sub>p</sub>[k] and 𝛘<sup>-</sup>[k] is relatively straightforward, but K<sub>p</sub>[k] involves more steps. For clarity, let the ensemble of expected observations be<br />
<blockquote class="tr_bq">
<b><i>Z</i></b><sub>p</sub>[k] = h<sub>p</sub>(𝛘<sup>-</sup>[k])</blockquote>
<blockquote class="tr_bq">
<b><u>z</u></b><sub>p</sub><sup>avg</sup>[k] = <i><b>Z</b></i><sub>p</sub>[k] <i><u><b>w</b></u></i><sub>m</sub></blockquote>
Then loosely speaking, K<sub>p</sub>[k] is the cross-covariance of 𝛘<sup>-</sup>[k] with <i><b>Z</b></i><sub>p</sub>[k], divided by the covariance of <i><b>Z</b></i><sub>p</sub>[k]:<br />
<blockquote class="tr_bq">
S<sub>p</sub>[k] = (<i><b>Z</b></i><sub>p</sub>[k] - <b><u>z</u></b><sub>p</sub><sup>avg</sup>[k]) diag(<i><u><b>w</b></u></i><sub>c</sub>) (<i><b>Z</b></i><sub>p</sub>[k] - <b><u>z</u></b><sub>p</sub><sup>avg</sup>[k])<sup>T</sup> + Q<sub>p</sub>[k]</blockquote>
<blockquote class="tr_bq">
K<sub>p</sub>[k] = (𝛘<sup>-</sup>[k] - <u>𝛍</u><sup>-</sup>[k]) diag(<i><u><b>w</b></u></i><sub>c</sub>) (<i><b>Z</b></i><sub>p</sub>[k] - <b><u>z</u></b><sub>p</sub><sup>avg</sup>[k])<sup>T</sup> / S<sub>p</sub>[k]</blockquote>
The uncertainty of measurement Q<sub>p</sub>[k] works to de-weight the innovation for the correction, just like in EKF. The state and covariance correction are then:<br />
<blockquote class="tr_bq">
<u>𝛍</u><sup>+</sup>[k] = <u>𝛍</u><sup>-</sup>[k] + K<sub>p</sub>[k] ( <u><b>z</b></u><sub>p</sub>[k] - <b><u>z</u></b><sub>p</sub><sup>avg</sup>[k] )</blockquote>
<blockquote class="tr_bq">
𝚺<sub>𝛍𝛍</sub><sup>+</sup>[k] = 𝚺<sub>𝛍𝛍</sub><sup>-</sup>[k] - K<sub>p</sub>[k] S<sub>p</sub>[k] K<sub>p</sub><sup>T</sup>[k]</blockquote>
Here is the Matlab implementation of the simple idea given above:<br />
<br />
<div style="font-family: Courier; line-height: normal;">
lId = z(:,1);</div>
<div style="font-family: Courier; line-height: normal;">
oz = z(ismember(lId, map), :);</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">if</span><span style="color: black;"> ~isempty(oz) </span>% Have known landmark => can generate innovation</div>
<div style="font-family: Courier; line-height: normal;">
[X, wm, wc] = compute_sigma_points(mu, sigma);</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> nX = size(X,2); </span>% number of sigma points</div>
<div style="color: #25992d; font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Reorder obs according to map, which is also the order in mu (state)</div>
<div style="font-family: Courier; line-height: normal;">
[~,lIdx] = ismember(oz(:,1), map);</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
nz = size(oz, 1);</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% line 7 on slide 32: Z = h(X)</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Generate distribution of expected observation from sigma_point of</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% robot poses AND landmark positions. </div>
<div style="font-family: Courier; line-height: normal;">
Dx = X(2*lIdx+2,:) - ones(nz,1) * X(1,:); <span style="color: #25992d;">% nZ x nX</span></div>
<div style="font-family: Courier; line-height: normal;">
Dy = X(2*lIdx+3,:) - ones(nz,1) * X(2,:); <span style="color: #25992d;">% nZ x nX</span></div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> Range = sqrt(Dx.^2 + Dy.^2); </span>% predicted range and bearing distribution</div>
<div style="color: #25992d; font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
Bearing = atan2(Dy, Dx) - ones(nz,1) * X(3,:);</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% To average the angle, must go through unit circle</div>
<div style="font-family: Courier; line-height: normal;">
Cos = cos(Bearing) * wm; Sin = sin(Bearing) * wm;</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
Range_avg = Range * wm;</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Rearrange according to the mu order</div>
<div style="font-family: Courier; line-height: normal;">
Z_avg = zeros(2*nz, 1);</div>
<div style="font-family: Courier; line-height: normal;">
Z_avg(1:2:end) = Range_avg;</div>
<div style="font-family: Courier; line-height: normal;">
Z_avg(2:2:end) = atan2(Sin, Cos);</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
q = zeros(2*nz,1);</div>
<div style="font-family: Courier; line-height: normal;">
q(1:2:end) = Range_avg * 0.01;</div>
<div style="font-family: Courier; line-height: normal;">
q(2:2:end) = deg2rad(1) * ones(nz,1);</div>
<div style="font-family: Courier; line-height: normal;">
q = q .^ 2;</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% UKF correction algo step 9</div>
<div style="font-family: Courier; line-height: normal;">
dZ = zeros(2*nz, nX);</div>
<div style="font-family: Courier; line-height: normal;">
dZ(1:2:end, :) = Range;</div>
<div style="font-family: Courier; line-height: normal;">
dZ(2:2:end, :) = Bearing;</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% innovation covariance matrix S, line 9 on slide 32</div>
<div style="font-family: Courier; line-height: normal;">
dZ = dZ - Z_avg * ones(1,nX); <span style="color: #25992d;">% Subtract the mean of sample</span></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Normalize the angles</div>
<div style="font-family: Courier; line-height: normal;">
    dZ(2:2:end, :) = wrapToPi(dZ(2:2:end, :));</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
S = dZ * diag(wc) * dZ' + diag(q);</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Compute Sigma_x_z, the Cross covariance, line 10 on slide 32</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% (which is equivalent to sigma times the Jacobian H transposed in EKF).<span style="color: black;"> </span></div>
<div style="font-family: Courier; line-height: normal;">
dSigma = X - mu * ones(1, nX); <span style="color: #25992d;">% subtract the mean</span></div>
<div style="font-family: Courier; line-height: normal;">
dSigma(3,:) = wrapToPi(dSigma(3,:)); <span style="color: #25992d;">% normalize angles again</span></div>
<div style="color: #25992d; font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
Sigma_xz = dSigma * diag(wc) * dZ';</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Compute the Kalman gain, line 11 on slide 32</div>
<div style="font-family: Courier; line-height: normal;">
K = Sigma_xz / S;</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Current observation</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>%lId = oz(:,1);</div>
<div style="font-family: Courier; line-height: normal;">
range = oz(:,2); bearing = oz(:,3);</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
<span style="color: black;"> </span>% Update mu and sigma, line 12 + 13 on slide 32</div>
<div style="font-family: Courier; line-height: normal;">
innoZ = zeros(2*nz,1);</div>
<div style="font-family: Courier; line-height: normal;">
innoZ(1:2:end) = range; <span style="color: #25992d;">% Form the innovation</span></div>
<div style="font-family: Courier; line-height: normal;">
innoZ(2:2:end) = bearing;</div>
<div style="font-family: Courier; line-height: normal;">
innoZ = innoZ - Z_avg;</div>
<div style="font-family: Courier; line-height: normal;">
innoZ(2:2:end) = wrapToPi(innoZ(2:2:end));</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
</div>
<div style="font-family: Courier; line-height: normal;">
mu = mu + K * innoZ;</div>
<div style="font-family: Courier; line-height: normal;">
mu(3) = wrapToPi(mu(3));</div>
<div style="font-family: Courier; line-height: normal;">
sigma = sigma - K * S * K';</div>
<div style="color: #0433ff; font-family: Courier; line-height: normal;">
end</div>
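As a sanity check of the gain equations, a scalar state with a linear observation should reduce to the ordinary Kalman filter. The sketch below is not part of the SLAM code: the observation model h(x) = 2x, the tuning parameters, and all numbers are assumptions chosen for illustration, and it uses the textbook sqrt((n + lambda) sigma) spread rather than the floored SVD spread.

```matlab
% Sketch: sigma-point gain vs. linear Kalman gain for a scalar state.
mu = 1.0; sigma = 0.5; q = 0.1;        % prior mean, prior variance, measurement variance
alpha = 1; kappa = 2; beta = 0; n = 1; % assumed tuning parameters
lambda = alpha^2 * (n + kappa) - n;

s  = sqrt((n + lambda) * sigma);       % textbook spread for a scalar
X  = [mu, mu + s, mu - s];             % 2n+1 sigma points as a row
wm = [lambda / (n + lambda); 1 / (2 * (n + lambda)); 1 / (2 * (n + lambda))];
wc = wm;
wc(1) = wc(1) + (1 - alpha^2 + beta);

h     = @(x) 2 * x;                    % illustrative linear observation model
Z     = h(X);                          % expected observations per sigma point
z_avg = Z * wm;

dZ = Z - z_avg;
dX = X - mu;
S        = dZ * diag(wc) * dZ' + q;    % innovation covariance
Sigma_xz = dX * diag(wc) * dZ';        % state/observation cross-covariance
K        = Sigma_xz / S;

K_kf = sigma * 2 / (2 * sigma * 2 + q);   % linear Kalman gain with H = 2

z = 2.4;                               % made-up measurement
mu_post    = mu + K * (z - z_avg);
sigma_post = sigma - K * S * K';
fprintf('K = %.6f, K_kf = %.6f, sigma_post = %.6f\n', K, K_kf, sigma_post);
```

For a linear h() the two gains agree, so any difference seen in the SLAM case comes from the nonlinearity of the range and bearing model and from the way the sigma points are spread.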
<div style="line-height: normal; min-height: 14px;">
<h2 style="font-family: -webkit-standard;">
Accepting new landmarks</h2>
<h3>
<span style="font-size: small;"><span style="font-weight: normal;">Previously unobserved landmarks are of no use for correction, but still have to be accepted into an expanding state. I have not yet figured out how to calculate the cross-correlation of a newly observed landmark with the existing landmarks or even the robot, but here's the code I used to generate the above result:</span></span><br />
<span style="font-size: small;"><span style="font-weight: normal;"><br /></span></span>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;">z = z(~ismember(z(:,1), map), :); <span style="color: #25992d;">% Deal with only the new landmarks</span></span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;">nz = size(z,1);</span></div>
<div style="color: #25992d; font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"><span style="color: #0433ff;">for</span><span style="color: black;"> i=1:nz </span>% For newly discovered landmarks in this timestep, sample around</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> nMu = length(mu);</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> lId = z(i,1); range = z(i,2); bearing = z(i,3); <span style="color: #25992d;">% raw measurement</span></span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> [X, wm, wc] = compute_sigma_points([range; bearing] <span style="color: #0433ff;">...</span></span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> , diag([0.01*range deg2rad(1)] .^ 2));</span></div>
<div style="color: #25992d; font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"><span style="color: black;"> </span>% Transform range/bearing to Cartesian</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> Range = X(1,:);</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> Bearing = X(2,:);</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> Dx = Range .* cos(Bearing + mu(3));</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> Dy = Range .* sin(Bearing + mu(3));</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal; min-height: 12px;">
<br /></div>
<div style="color: #25992d; font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"><span style="color: black;"> dx = Dx * wm;</span>% Expected landmark position relative to robot position</span></div>
<div style="color: #25992d; font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"><span style="color: black;"> dy = Dy * wm;</span>% = weighted average</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> mu(nMu + (1:2)) = [mu(1) + dx; mu(2) + dy];</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> dSigma = [Dx - dx; Dy - dy];</span></div>
<div style="color: #25992d; font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"><span style="color: black;"> </span>% Stick the new weighted variance at the lower right corner of sigma</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"> sigma(nMu+(1:2),nMu+(1:2)) = dSigma * diag(wc) * dSigma';</span></div>
<div style="font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"><br /></span></div>
<div style="color: #25992d; font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;"><span style="color: black;"> map(end+1) = lId; </span>%add landmark to the map</span></div>
<div style="color: #0433ff; font-family: Courier; font-weight: normal; line-height: normal;">
<span style="font-size: small;">end</span></div>
<br />
<div style="font-family: Courier; font-size: 12px; font-weight: normal; line-height: normal; min-height: 14px;">
<br /></div>
</h3>
</div>
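The range/bearing to Cartesian conversion used for new landmarks can be checked with a round trip through the observation model. This is a sketch with a made-up pose and measurement, and it only checks the means; it says nothing about the cross-correlation terms mentioned above.

```matlab
% Sketch: initialize a landmark from one range/bearing reading, then predict it back.
pose    = [1; 2; pi/4];       % robot x, y, heading (illustrative)
range   = 2;                  % measured range
bearing = pi/4;               % measured bearing relative to the heading

% Inverse observation model: landmark position in the world frame
lm = pose(1:2) + range * [cos(bearing + pose(3)); sin(bearing + pose(3))];

% Forward observation model, as in the correction step
d            = lm - pose(1:2);
range_back   = norm(d);
bearing_back = atan2(d(2), d(1)) - pose(3);
bearing_back = atan2(sin(bearing_back), cos(bearing_back));   % wrap to [-pi, pi]

fprintf('landmark = (%.3f, %.3f), range = %.3f, bearing = %.3f\n', ...
        lm(1), lm(2), range_back, bearing_back);
```

If the two models are consistent, the innovation for a landmark on the very step after it is accepted should be close to the measurement noise alone.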
<br /></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com4tag:blogger.com,1999:blog-4032020337247582619.post-81419941144755134482017-06-06T22:39:00.001-07:002017-06-19T08:42:41.852-07:00Understanding the EKF SLAMIn April of this year, I attended the <a href="https://scien.stanford.edu/index.php/workshop-on-augmented-and-mixed-reality-program/">AR (augmented reality) workshop at Stanford</a>, and learned about the current issues and direction of the AR community. I was curious whether I can just use VI (visual-inertial) SLAM (simultaneous localization and mapping) for the AR video game I want to write, and got confirmation that this would be possible after talking to <a href="https://www.linkedin.com/in/renatosalas/">Renato F. Salas-Moreno</a>. Since I work with inertial sensors in my current job, VI SLAM is particularly interesting. While VI SLAM on a single device is amply demonstrated by the slew of cutting edge products like Google Tango and Microsoft HoloLens, the interactive AR video game I have in mind requires merging the maps produced by multiple devices in the same room. I learned from <a href="https://www.linkedin.com/in/marc-pollefeys-30a7075/">Marc Pollefeys</a> that HoloLens solves the problem by using the map of a "master" device. I got the feeling that this is not a completely solved problem, but since I am new to the whole SLAM problem, I am going to work through the EKF (extended Kalman filter) example from <a href="http://ais.informatik.uni-freiburg.de/teaching/ws13/mapping/">the online SLAM course</a>, which uses the notation in the <a href="http://probabilistic-robotics.org/">Probabilistic Robotics book</a>. In the equations to follow, I am forced to use the underline for vectors, instead of the usual overbar notation, because of the poor formatting capability of the Blogger editor.<br />
<h2>
State formation</h2>
In SLAM, the device's pose (position and orientation) <u>x</u> and the positions (but not orientations) of various landmarks (loosely defined as things that can be identified from camera images) <u>m</u><sub>i</sub>, both expressed in the world frame, are captured as the state vector <u>𝛍</u><sup>true</sup> = [<u><b>x</b></u><sup>true</sup>; <u><b>m</b></u><sup>true</sup><sub>1</sub>; ...; <u><b>m</b></u><sup>true</sup><sub>N</sub>], or more compactly <u>𝛍</u><sup>true</sup> = [<u><b>x</b></u><sup>true</sup>; <u><b>m</b></u><sup>true</sup>]. Since I don't know initially where the device or the landmarks are, my initial state estimate is <u>𝛍</u>[0] = [<u>0</u>; <u>0</u>]. Most texts annotate this vector with a hat (^) to mark the estimate, but again because of the formatting constraints of the Blogger editor, I will omit the hat and instead mark the (unknowable) true value with the superscript <sup>true</sup>, as I've done above.<br />
<br />
Ignoring for now the management of the landmark associations (they are sometimes unobservable as the device moves around the world frame), it is clear that the initial covariance matrix 𝚺<sub>mm</sub>[0] = diag(∞, ..., ∞) because at the outset I know nothing about the landmark positions--except that the landmark positions are independent of each other and, for that matter, of the device position. So 𝚺<sub>𝛍𝛍</sub>[0] = diag(0, 0, 0, ∞, ..., ∞). Note that this initialization of the device pose covariance (setting it to 0) is different from what may be found in other Kalman filter textbooks (which use ∞) because of the different initialization method in the book (coming up shortly below). In the linear (original) Kalman filter formulation, the state evolution and observable generation are linear stochastic processes, but in the EKF, they are Gaussian nonlinear processes (that can nevertheless be linearized):
<br />
<blockquote class="tr_bq">
<table>
<tbody>
<tr><td><br /></td><td><u>Kalman Filter</u></td><td><u>Extended Kalman Filter</u></td></tr>
<tr>
<td>Process (state change) model</td>
<td><u>𝛍</u><sup>true</sup>[k] = A <u>𝛍</u><sup>true</sup>[k-1] + B <u><b>u</b></u>[k] + <b><u>r</u></b>[k]</td>
<td><u>𝛍</u><sup>true</sup>[k] = g(<u><b>u</b></u>[k]; <u>𝛍</u><sup>true</sup>[k-1]) + <u><b>r</b></u>[k]</td>
</tr>
<tr>
<td>Measurement model</td>
<td><u><b>z</b></u><sub>k</sub>[k] = H <u>𝛍</u><sup>true</sup>[k] + <u><b>q</b></u><sub>k</sub>[k]</td>
<td><u><b>z</b></u><sub>k</sub>[k] = h<sub>k</sub>(<u>𝛍</u><sup>true</sup>[k]) + <u><b>q</b></u><sub>k</sub>[k]</td>
</tr>
</tbody>
</table>
</blockquote>
where <u><b>r</b></u>[k] ~ 𝓝(<b><u>0</u></b>, R[k]) and <u><b>q</b></u><sub>k</sub>[k] ~ 𝓝(<u><b>0</b></u>, Q<sub>k</sub>[k]). The subscript k in the measurement model is necessary in SLAM because landmarks come and go at every time step: to me, it means "pertaining to all observations at time step k".<br />
<br />
The zero initial value of the states pretty much assures that the predicted state for the next time step is still zero, unless zero is not an equilibrium of the system or there is an external input (AKA control input). In this example of planar robot movement using the odometry model, there is a non-zero initial external input (the odometry data):<br />
<blockquote class="tr_bq">
<u><b>u</b></u>[0] = [rot1; trans; rot2] = [0.1007 0.1001 0.0002]<sup>T</sup>.</blockquote>
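The initialization described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names <code>initial_state</code>, <code>initial_covariance</code>, <code>N_LANDMARKS</code> and <code>BIG</code> are made up here, the landmark count is arbitrary, and a large finite number stands in for the ∞ variance because floating-point infinities do not survive matrix products well.<br />

```python
N_LANDMARKS = 9   # assumed landmark count, for illustration only
BIG = 1e6         # finite stand-in for the "infinite" initial landmark variance

def initial_state(n_landmarks):
    """Return mu[0] = [x, y, theta, m1x, m1y, ...], all zeros."""
    return [0.0] * (3 + 2 * n_landmarks)

def initial_covariance(n_landmarks, big=BIG):
    """Return Sigma[0] = diag(0, 0, 0, big, ..., big) as a list of rows."""
    dim = 3 + 2 * n_landmarks
    sigma = [[0.0] * dim for _ in range(dim)]
    for i in range(3, dim):       # only the landmark entries are uncertain
        sigma[i][i] = big
    return sigma

mu0 = initial_state(N_LANDMARKS)
sigma0 = initial_covariance(N_LANDMARKS)
```

The pose block stays exactly zero because the starting pose defines the world frame in this formulation.<br />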
<h2>
Prediction step</h2>
When first studying the Kalman filter, I was confused whether to perform prediction right after outputting the control command, or at the beginning of the next step, right before processing the fresh measurement. Because of my background in real-time control, I perform the non-time-critical tasks such as prediction after outputting the control command (i.e. to minimize the delay between the sensor measurement becoming available and the control command being output). But in this case, the external input is not the control <i>command</i> but rather another sensor reading (odometry reading), which is only available at the next time step. That is why the "prediction" step uses the a-posteriori estimate from the last time step, <u><b>x</b></u><sup>+</sup>[k-1] = [x<sup>+</sup>[k-1]; y<sup>+</sup>[k-1]; 𝛉<sup>+</sup>[k-1]], together with the odometry reading <u><b>u</b></u>[k], which in this example gives:<br />
<blockquote class="tr_bq">
<u>𝛍</u><sup>-</sup>[k] = g(<u><b>u</b></u>[k]; <u>𝛍</u><sup>+</sup>[k-1])</blockquote>
For additional clarity, I adopted the "-" and "+" superscripts for the a-priori and a-posteriori estimates. <br />
<br />
Even though g(<u><b>u</b></u>[k]; <u>𝛍</u><sup>+</sup>[k-1]) can be a BHF (big hairy function) in general, the stationarity assumption for the landmarks (<b><u>m</u></b>[k] = <b><u>m</u></b>[k-1]) in SLAM reduces the state update to just the device pose update:<br />
<blockquote class="tr_bq" style="text-align: right;">
<div style="text-align: left;">
<u><b>x</b></u><sup>-</sup>[k] = <u><b>x</b></u><sup>+</sup>[k-1] + [trans * cos(𝛉<sup>+</sup>[k-1] + rot1) ; trans * sin(𝛉<sup>+</sup>[k-1] + rot1) ; rot1 + rot2]</div>
</blockquote>
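As a minimal sketch of the pose update above, the following Python function applies the odometry model to a pose tuple. The function name <code>predict_pose</code> and the tuple layout are my own choices for illustration; the landmark part of the state is simply carried over unchanged and is therefore not shown.<br />

```python
import math

def predict_pose(pose, u):
    """Odometry motion model.

    pose = (x, y, theta) is the a-posteriori pose from step k-1,
    u = (rot1, trans, rot2) is the odometry reading for step k.
    """
    x, y, theta = pose
    rot1, trans, rot2 = u
    heading = theta + rot1            # direction of travel during the translation
    return (x + trans * math.cos(heading),
            y + trans * math.sin(heading),
            theta + rot1 + rot2)

u0 = (0.1007, 0.1001, 0.0002)         # the first odometry reading quoted earlier
pose1 = predict_pose((0.0, 0.0, 0.0), u0)
```

The heading is not wrapped to a fixed interval here; a longer run would probably want to normalize the angle after each step.<br />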
<h3>
Observation prediction: has to wait till landmark observation in SLAM</h3>
<div>
We will see shortly the a-posteriori update (correction) driven by the sensor measurement (observation). The prediction of the observation h(<u>𝛍</u>[k]) is only necessary at that time, but in the spirit of calculating everything possible <i>outside</i> the time-critical path from the sensor measurement being available to outputting the control command, I want to pre-calculate h(<u>𝛍</u>[k]) now--except that in SLAM, I don't know a-priori which landmarks I will observe, so I <i>have to</i> wait till discovering all observations for the time step.</div>
<h3>
Covariance prediction</h3>
To update the covariance, I need the Jacobian matrix G<sup>-</sup>[k]:<br />
<blockquote class="tr_bq">
G<sup>-</sup>[k] = ∂<u>𝛍</u>[k] / ∂<u>𝛍</u>[k-1] @ <u>𝛍</u><sup>+</sup>[k-1] = ∂g(<u><b>u</b></u>[k]; <u>𝛍</u>[k-1]) / ∂<u>𝛍</u>[k-1] @ <u>𝛍</u><sup>+</sup>[k-1]</blockquote>
Note that the partial derivative is taken of the <i>prediction</i> function (i.e. without the new sensor input) and evaluated AT <u>𝛍</u><sup>+</sup>[k-1]--the last optimal state estimate. Once again, the landmark stationarity and independence assumptions allow simplification:<br />
<blockquote class="tr_bq">
G<sup>-</sup>[k] = diag(G<sup>-</sup><sub>x</sub>[k], I<sub>2N</sub>); where G<sup>-</sup><sub>x</sub>[k] = ∂<u><b>x</b></u>[k] / ∂<u><b>x</b></u>[k-1] @ <u><b>x</b></u><sup>+</sup>[k-1]</blockquote>
<div>
In the prediction equation above, only 𝛉 affects change of <u><b>x</b></u>, so that only the dx/d𝛉 and dy/d𝛉 need to be considered for the Jacobian calculation:<br />
<blockquote class="tr_bq">
dx/d𝛉 = -trans * sin(𝛉 + rot1); dy/d𝛉 = trans * cos(𝛉 + rot1)</blockquote>
<blockquote class="tr_bq">
G<sup>-</sup><sub>x</sub>[k] = I<sub>3</sub> + [0, 0, -trans * sin(𝛉<sup>+</sup>[k-1] + rot1) ; 0, 0, trans * cos(𝛉<sup>+</sup>[k-1] + rot1) ; 0, 0, 0]</blockquote>
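One way to gain confidence in an analytical Jacobian like this is to compare it against finite differences of the motion model. The sketch below does that in plain Python; <code>predict_pose</code>, <code>pose_jacobian</code> and <code>numerical_jacobian</code> are illustrative names, and the test pose is arbitrary. Only the third column (the 𝛉 column) differs from the identity, matching the dx/d𝛉 and dy/d𝛉 terms above.<br />

```python
import math

def predict_pose(pose, u):
    """Odometry motion model: pose = (x, y, theta), u = (rot1, trans, rot2)."""
    x, y, theta = pose
    rot1, trans, rot2 = u
    return (x + trans * math.cos(theta + rot1),
            y + trans * math.sin(theta + rot1),
            theta + rot1 + rot2)

def pose_jacobian(theta, u):
    """Analytical 3x3 Jacobian of predict_pose w.r.t. the previous pose."""
    rot1, trans, _ = u
    heading = theta + rot1
    return [[1.0, 0.0, -trans * math.sin(heading)],
            [0.0, 1.0, trans * math.cos(heading)],
            [0.0, 0.0, 1.0]]

def numerical_jacobian(pose, u, eps=1e-6):
    """Forward-difference approximation of the same Jacobian."""
    base = predict_pose(pose, u)
    cols = []
    for j in range(3):
        bumped = list(pose)
        bumped[j] += eps
        moved = predict_pose(tuple(bumped), u)
        cols.append([(moved[i] - base[i]) / eps for i in range(3)])
    # cols[j][i] holds d(out_i)/d(in_j); flip to row-major order
    return [[cols[j][i] for j in range(3)] for i in range(3)]

pose = (1.0, 2.0, 0.3)                # arbitrary linearization point
u = (0.1007, 0.1001, 0.0002)
G = pose_jacobian(pose[2], u)
G_num = numerical_jacobian(pose, u)
max_err = max(abs(G[i][j] - G_num[i][j]) for i in range(3) for j in range(3))
```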
I think it is helpful to remember that the separation of the device pose and the landmark positions manifests in the covariance matrix structure as well:<br />
<blockquote class="tr_bq">
𝚺<sub>𝛍𝛍</sub> = [𝚺<sub>xx</sub> 𝚺<sub>xm</sub> ; 𝚺<sup>T</sup><sub>xm</sub> 𝚺<sub>mm</sub>], since 𝚺<sub>mx </sub>= 𝚺<sup>T</sup><sub>xm</sub>.</blockquote>
The covariance prediction step involves this huge matrix (dimension (3+2N) × (3+2N) in this example):<br />
<blockquote class="tr_bq">
𝚺<sup>-</sup><sub>𝛍𝛍</sub>[k] = G<sup>-</sup>[k] 𝚺<sup>+</sup><sub>𝛍𝛍</sub>[k-1] G<sup>-T</sup>[k] + R[k]</blockquote>
In R[k] = diag(R<sup>-</sup><sub>x</sub>[k], 0), R<sup>-</sup><sub>x</sub> characterizes the random noise in the device pose (imagine the device being knocked around by accident), evaluated at <u>𝛍</u><sup>-</sup>[k]. I have not seen a noise covariance modeled with such fidelity as to be sensitive to the current state estimate, but I wanted to be explicit for now. Note that unless the physical process has no noise (i.e. R[k] is 0: impossible!), the state uncertainty <i>always </i>grows during the prediction step, as you can see in the following plot of the robot location estimate overlaid against the <i>known</i> landmarks.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZSqAZo0LeYPXPtrzKIkzVsOXVY7bcuWVWZG6KuLjz9g9LRx4zhvST1rWE_fvgLuIiDD5wJm1T8-RxuwND-0e-ZixzsltbDDQfPSXNmYJlIhyphenhyphenE_lMJTWPtM8RiyuzEjIOJxaqDkYTK5gnY/s1600/EFK+prediction.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZSqAZo0LeYPXPtrzKIkzVsOXVY7bcuWVWZG6KuLjz9g9LRx4zhvST1rWE_fvgLuIiDD5wJm1T8-RxuwND-0e-ZixzsltbDDQfPSXNmYJlIhyphenhyphenE_lMJTWPtM8RiyuzEjIOJxaqDkYTK5gnY/s1600/EFK+prediction.png" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;"><span style="color: red;">Red</span>: prior device pose estimate and uncertainty (a-posteriori estimate from previous time step)<br />
<span style="color: magenta;">Magenta</span>: predicted device pose and uncertainty, which is larger than the red ellipse</td></tr>
</tbody></table>
In SLAM, the first term of the huge covariance prediction above can be broken up into 3 separate blocks (the off-diagonal blocks are transposes of each other) for simplicity.<br />
<blockquote class="tr_bq">
<table>
<tbody>
<tr>
<td>G<sup>-</sup>[k] 𝚺<sup>+</sup><sub>𝛍𝛍</sub>[k-1] G<sup>-T</sup>[k] =
</td>
<td>[G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>xx</sub>[k-1] G<sup>-T</sup><sub>x</sub>[k]</td>
<td>, G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>xm</sub>[k-1] </td>
</tr>
<tr>
<td></td>
<td>(G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>xm</sub>[k-1])<sup>T</sup></td>
<td>, 𝚺<sup>+</sup><sub>mm</sub>[k-1]]</td>
</tr>
</tbody></table>
</blockquote>
In fact, the independence of the landmarks allows even more factoring of G<sup>-</sup>[k] 𝚺<sup>+</sup><sub>𝛍𝛍</sub>[k-1] G<sup>-T</sup>[k]:<br />
<blockquote class="tr_bq">
<table><tbody>
<tr>
<td>G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>xx</sub>[k-1] G<sup>-T</sup><sub>x</sub>[k]</td>
<td>, G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>x1</sub>[k-1]</td>
<td>, ...</td>
<td>, G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>xN</sub>[k-1]</td>
</tr>
<tr>
<td>(G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>x1</sub>[k-1])<sup>T</sup>
</td>
<td>, 𝚺<sup>+</sup><sub>11</sub>[k-1]</td>
<td>, ...</td>
<td>, 𝚺<sup>+</sup><sub>1N</sub>[k-1]</td>
</tr>
<tr>
<td>...</td>
<td>, ...</td>
<td>, ...</td>
<td>, ...</td>
</tr>
<tr>
<td>(G<sup>-</sup><sub>x</sub>[k] 𝚺<sup>+</sup><sub>xN</sub>[k-1])<sup>T</sup></td>
<td>, 𝚺<sup>+</sup><sub>N1</sub>[k-1]</td>
<td>, ...</td>
<td>, 𝚺<sup>+</sup><sub>NN</sub>[k-1]</td>
</tr>
</tbody></table>
</blockquote>
</div>
This formulation better shows the correlation between the device pose and each <i>individual </i>landmark, and the inter-landmark correlation; it might be better suited to the case where landmarks rapidly come in and out of view, but might be overkill for the simple case of fixed and unchanging landmarks.<br />
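The block decomposition above can be checked numerically: updating only the pose block and the pose-to-landmark blocks should give the same result as the full matrix product. The following is a minimal sketch with plain Python lists; the helper names, the 2-landmark state size and the numbers in the example are assumptions chosen only to exercise the code.<br />

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def predict_cov_full(sigma, g_x, r_x):
    """Sigma^- = G Sigma^+ G^T + R using the full (3+2N)-square G."""
    dim = len(sigma)
    g = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for i in range(3):
        for j in range(3):
            g[i][j] = g_x[i][j]
    out = matmul(matmul(g, sigma), transpose(g))
    for i in range(3):
        for j in range(3):
            out[i][j] += r_x[i][j]    # process noise only enters the pose block
    return out

def predict_cov_blocks(sigma, g_x, r_x):
    """Same result, touching only the xx, xm and mx blocks."""
    dim = len(sigma)
    out = [row[:] for row in sigma]   # the mm block carries over unchanged
    s_xx = [row[:3] for row in sigma[:3]]
    s_xm = [row[3:] for row in sigma[:3]]
    new_xx = matmul(matmul(g_x, s_xx), transpose(g_x))
    new_xm = matmul(g_x, s_xm)
    for i in range(3):
        for j in range(3):
            out[i][j] = new_xx[i][j] + r_x[i][j]
        for j in range(dim - 3):
            out[i][3 + j] = new_xm[i][j]
            out[3 + j][i] = new_xm[i][j]   # mx block is the transpose of xm
    return out

theta, rot1, trans = 0.3, 0.1007, 0.1001
g_x = [[1.0, 0.0, -trans * math.sin(theta + rot1)],
       [0.0, 1.0, trans * math.cos(theta + rot1)],
       [0.0, 0.0, 1.0]]
r_x = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.001]]
dim = 3 + 2 * 2                       # pose plus 2 landmarks
a = [[0.1 * ((3 * i + 5 * j) % 7) for j in range(dim)] for i in range(dim)]
sigma = matmul(a, transpose(a))       # arbitrary symmetric test covariance
full = predict_cov_full(sigma, g_x, r_x)
blocks = predict_cov_blocks(sigma, g_x, r_x)
max_diff = max(abs(full[i][j] - blocks[i][j])
               for i in range(dim) for j in range(dim))
```

The block version does work proportional to the number of landmarks rather than its square, which is presumably the practical reason to write it this way.<br />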
<h2>
Correction step</h2>
In this example problem, the range and bearing of a subset of the <i>known</i> landmarks are observed at each time step. Assuming that the association problem (which landmark is this for?) is solved somehow, we can calculate the difference between the measured observation and the predicted observation, which is called innovation. If a given landmark was previously observed, its estimated position is already in <u>𝛍</u><sup>-</sup>[k]. But a newly discovered landmark <b><i>i</i></b>'s state <u><b>m</b></u><sup>-</sup><sub>i</sub>[k] is initialized with yet another function:<br />
<blockquote class="tr_bq">
<u><b>m</b></u><sup>-</sup><sub>i</sub>[k] = <b><i>initial</i></b>(<u>𝛍</u><sup>-</sup>[k], <u><b>z</b></u><sub>i</sub>[k])</blockquote>
There doesn't seem to be a lot of discussion on this initialization function, but once the new landmarks are initialized, we can use the completed <u>𝛍</u><sup>-</sup>[k] to calculate the predicted observation h<sub>p</sub>(<u>𝛍</u><sup>-</sup>[k]) and the measurement Jacobian H<sub>p</sub>[k], which is the gradient of the observables <b><u>z</u></b>[k] WRT the state <u>𝛍</u>:<br />
<blockquote class="tr_bq">
H<sub>p</sub>[k] = ∂h<sub>p</sub>(<u>𝛍</u>[k]) / ∂<u>𝛍</u>[k] @ <u>𝛍</u><sup>-</sup>[k]</blockquote>
Just like the process Jacobian, there are many ways to estimate the measurement Jacobian (analytical, numerical, and their respective children), but I've only used analytical models so far.<br />
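For a range and bearing sensor, a common choice (assumed here, since the post does not spell it out) is to initialize a new landmark by projecting the measurement from the predicted pose, and to predict an observation by inverting that projection. The sketch below shows both, plus the analytical Jacobian with respect to the pose and that one landmark. All function names are illustrative.<br />

```python
import math

def wrap_angle(a):
    """Wrap an angle into [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def init_landmark(pose, z):
    """Place a new landmark from pose = (x, y, theta) and z = (range, bearing)."""
    x, y, theta = pose
    rng, bearing = z
    return (x + rng * math.cos(theta + bearing),
            y + rng * math.sin(theta + bearing))

def predict_observation(pose, landmark):
    """Predicted (range, bearing) of a landmark seen from the pose."""
    x, y, theta = pose
    dx, dy = landmark[0] - x, landmark[1] - y
    return (math.hypot(dx, dy), wrap_angle(math.atan2(dy, dx) - theta))

def observation_jacobian(pose, landmark):
    """2x5 Jacobian of (range, bearing) w.r.t. (x, y, theta, mx, my)."""
    x, y, _ = pose
    dx, dy = landmark[0] - x, landmark[1] - y
    q = dx * dx + dy * dy             # squared range
    r = math.sqrt(q)
    return [[-dx / r, -dy / r, 0.0, dx / r, dy / r],
            [dy / q, -dx / q, -1.0, -dy / q, dx / q]]

pose = (1.0, 2.0, 0.3)                # arbitrary predicted pose
z = (2.5, -0.4)                       # arbitrary range [m] and bearing [rad]
landmark = init_landmark(pose, z)
z_hat = predict_observation(pose, landmark)
H = observation_jacobian(pose, landmark)
```

Predicting the observation of a freshly initialized landmark reproduces the measurement, so the first innovation for a new landmark is zero. In the full state, the columns of this 2x5 Jacobian are scattered into the pose columns and the two columns of the observed landmark, with zeros elsewhere.<br />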
<h3>
State correction</h3>
K<sub>p</sub>[k] is the optimal gain under the zero-mean Gaussian noise hypothesis--called Kalman gain.<br />
<blockquote class="tr_bq">
K<sub>p</sub>[k] = 𝚺<sup>-</sup><sub>𝛍𝛍</sub>[k] H<sub>p</sub><sup>T</sup>[k] / (H<sub>p</sub>[k] 𝚺<sup>-</sup><sub>𝛍𝛍</sub>[k] H<sub>p</sub><sup>T</sup>[k] + Q<sub>p</sub>[k])</blockquote>
<div>
Note that the "/" operator here denotes right-multiplication by the matrix inverse, not element-wise division. I find Kalman "weight" to be a better reminder of its true nature, because that is how much weight is given to the innovation <u><b>z</b></u>[k] - h<sub>p</sub>(<u>𝛍</u><sup>-</sup>[k]) to update the state estimate:<br />
<blockquote class="tr_bq">
<u style="text-align: right;">𝛍</u><sup style="text-align: right;">+</sup><span style="text-align: right;">[k] = </span><u style="text-align: right;">𝛍</u><sup style="text-align: right;">-</sup><span style="text-align: right;">[k] + </span>K<sub>p</sub>[k] ( <u><b>z</b></u>[k] - h<sub>p</sub>(<u>𝛍</u><sup>-</sup>[k]) )</blockquote>
</div>
Note that the unobserved landmarks do <i>not</i> contribute to correcting the state estimate.<br />
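The "weight" interpretation is easiest to see in the scalar case, where the matrix inverse really is a division. The sketch below is an illustration only (the function names and the numbers are made up): it shows the weight approaching 1 when the sensor is trusted and approaching 0 when the prediction is trusted.<br />

```python
def kalman_weight(sigma_prior, q_meas, h=1.0):
    """Scalar Kalman weight K = sigma * h / (h * sigma * h + q)."""
    return sigma_prior * h / (h * sigma_prior * h + q_meas)

def correct_state(mu_prior, sigma_prior, z, q_meas, h=1.0):
    """Scalar state correction: mu+ = mu- + K * (z - h * mu-)."""
    k = kalman_weight(sigma_prior, q_meas, h)
    return mu_prior + k * (z - h * mu_prior), k

# Same prediction (0 with variance 1) and same measurement (1) in each case;
# only the assumed measurement noise variance changes.
mu_sensor, k_sensor = correct_state(0.0, 1.0, 1.0, 1e-6)   # very good sensor
mu_equal, k_equal = correct_state(0.0, 1.0, 1.0, 1.0)      # equally uncertain
mu_model, k_model = correct_state(0.0, 1.0, 1.0, 1e6)      # very poor sensor
```

With equal uncertainties the corrected estimate lands halfway between the prediction and the measurement.<br />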
<h3>
Covariance correction</h3>
The Kalman weight also contributes to the state covariance correction:<br />
<blockquote class="tr_bq">
𝚺<sup>+</sup><sub>pp</sub>[k] = (I<sub>pp</sub> - K<sub>p</sub>[k] H<sub>p</sub>[k]) 𝚺<sup>-</sup><sub>pp</sub>[k] = 𝚺<sup>-</sup><sub>pp</sub>[k] - K<sub>p</sub>[k] H<sub>p</sub>[k] 𝚺<sup>-</sup><sub>pp</sub>[k]</blockquote>
where 𝚺<sub>pp</sub> = [𝚺<sub>xx</sub> 𝚺<sub>xp</sub> ; 𝚺<sup>T</sup><sub>xp</sub> 𝚺<sub>pp</sub>] is the combination of the device pose covariance, the cross-covariance between the device pose and the landmark positions <i>observed </i>(at k), and the landmark-to-landmark covariance. Note that the covariance <i>decreases</i> commensurate with the Kalman weight, the reverse of the always increasing covariance during the prediction step earlier.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEwaA_wdxSPJKbqSsDeOi2fqP2ggi_iwAddmgprA7DCMgWMGvhAnjTM37605V3JRugk61_oa-ieynBmrLzg6p-JGafBWDptdr_uK0oIoTJgXNT9TR2QXAn61qRbwEQg4sNNQf4n4si3yZ9/s1600/EKF+correction.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEwaA_wdxSPJKbqSsDeOi2fqP2ggi_iwAddmgprA7DCMgWMGvhAnjTM37605V3JRugk61_oa-ieynBmrLzg6p-JGafBWDptdr_uK0oIoTJgXNT9TR2QXAn61qRbwEQg4sNNQf4n4si3yZ9/s1600/EKF+correction.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="color: red;">Red</span>: predicted device pose estimate and uncertainty (a-priori estimate from <i>this </i>time step)<br />
<span style="color: magenta;">Magenta</span>: corrected device pose and uncertainty, which is smaller than the red ellipse<br />
<span style="color: blue;">Blue</span>: predicted landmark position estimate and uncertainty (a-priori estimate from <i>this </i>time step)<br />
<span style="color: cyan;">Cyan</span>: corrected landmark position and uncertainty, which is smaller than the blue ellipse<br />
Black: range and bearing measurements at that time step, drawn from the corrected device pose (maybe I should have drawn it from the predicted device pose)</td></tr>
</tbody></table>
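Putting the correction equations together for the smallest interesting case, the pose plus a single observed landmark (5 states) with one range and bearing measurement, might look like the sketch below. It is not the code behind the plots; the helper names, the prior covariance and the noise values are assumptions picked only to make the example run, and the range and bearing model is the common one rather than something this post defines.<br />

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def observe(mu):
    """Predicted (range, bearing) for mu = [x, y, theta, mx, my]."""
    dx, dy = mu[3] - mu[0], mu[4] - mu[1]
    return [math.hypot(dx, dy), math.atan2(dy, dx) - mu[2]]

def observe_jacobian(mu):
    dx, dy = mu[3] - mu[0], mu[4] - mu[1]
    q = dx * dx + dy * dy
    r = math.sqrt(q)
    return [[-dx / r, -dy / r, 0.0, dx / r, dy / r],
            [dy / q, -dx / q, -1.0, -dy / q, dx / q]]

def ekf_correct(mu, sigma, z, q_noise):
    """One EKF correction step; returns (mu_post, sigma_post)."""
    H = observe_jacobian(mu)
    Ht = transpose(H)
    S = matmul(matmul(H, sigma), Ht)
    S = [[S[i][j] + q_noise[i][j] for j in range(2)] for i in range(2)]
    K = matmul(matmul(sigma, Ht), inv2(S))        # 5x2 Kalman weight
    z_hat = observe(mu)
    d_bearing = z[1] - z_hat[1]
    nu = [z[0] - z_hat[0],                        # innovation, bearing wrapped
          math.atan2(math.sin(d_bearing), math.cos(d_bearing))]
    mu_post = [mu[i] + K[i][0] * nu[0] + K[i][1] * nu[1] for i in range(5)]
    KH = matmul(K, H)
    I_KH = [[(1.0 if i == j else 0.0) - KH[i][j] for j in range(5)]
            for i in range(5)]
    return mu_post, matmul(I_KH, sigma)

mu_prior = [0.0, 0.0, 0.0, 2.0, 1.0]
sigma_prior = [[0.0] * 5 for _ in range(5)]
for i, var in enumerate([0.01, 0.01, 0.005, 0.5, 0.5]):
    sigma_prior[i][i] = var
q_noise = [[0.01, 0.0], [0.0, 0.01]]
z = [2.3, 0.5]
mu_post, sigma_post = ekf_correct(mu_prior, sigma_prior, z, q_noise)
```

Every diagonal entry of the corrected covariance ends up no larger than its predicted value, which is the shrinking of the ellipses seen in the plot above.<br />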
This completely textbook implementation of the EKF leaves much to be desired. I had to futz around with the sensor and process uncertainties to avoid gross estimation error--to some extent, this is one of the deficiencies of the KF I had already heard about. But I was surprised by how much the solution drifts as the device moves away from the starting position, which is in the lower left corner below.<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsrl-0oB-iLmH71SuOV1-5AWrA8_LRxJqq6JxWIPxQK29U_jH9xhiZe5Cl0RdEZIbWXDUvxn5kNB9DTk7d5QtAazhFFg8WXp7I8liUZGWD2xkCWuvhZnX9_Xu44LwwwCtcIFkHOLnWoXsP/s1600/EFK+drift.png" imageanchor="1"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsrl-0oB-iLmH71SuOV1-5AWrA8_LRxJqq6JxWIPxQK29U_jH9xhiZe5Cl0RdEZIbWXDUvxn5kNB9DTk7d5QtAazhFFg8WXp7I8liUZGWD2xkCWuvhZnX9_Xu44LwwwCtcIFkHOLnWoXsP/s640/EFK+drift.png" width="640" /></a><br />
If I reduced the sensor covariance too much, the final landmark (at 9,2) doesn't even fall within the 2𝛔 ellipse.
The drift might be due to the sensor bias, and probably uncorrectable without another sensor or a loop closure (coming later in the course).<br />
<h2>
There is much, much more to Kalman filter</h2>
While reading GNSS books about a year ago, I realized that the Kalman filter that I and many other people know is just the tip of the iceberg, and I am not even talking about the UKF. A year ago, I could not have imagined such things as the Kalman smoother, closed-loop Kalman filter, or constrained Kalman filter. I hope to share my future discoveries on such topics with you.<br />
<br />
<br />Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com3tag:blogger.com,1999:blog-4032020337247582619.post-73200775985290448632017-05-13T17:53:00.000-07:002017-05-13T17:53:37.734-07:00Dual system architecture for Raspberry PIBecause I first started learning about programming in a real-time context, I never became comfortable with the idea of controlling physical (safety-critical) devices on a non-deterministic OS. So I stayed with MCUs (microcontrollers) for a long time. On the other hand, as I tried to build shipping products, I became painfully aware of the whole infrastructure necessary to control and communicate with the MCU, and display the relevant information to the human operator or remote systems. I realized that a big chunk of my paycheck was justified because I had the patience and skills to do these mundane but necessary chores for the companies I worked for. In fact, when I was in the biotech industry, my fellow engineers and I often viewed ourselves as plumbers (although not as well paid). But in my first foray into the world of SoC programming on Xilinx Zynq, I came up with a solution to the problem of integrating a general OS (running UI) and a bare-metal system by running Linux on CPU0 of Zynq, and running the bare-metal firmware on CPU1 of Zynq. I later learned that <a href="http://henryomd.blogspot.com/2015/02/zynq-amp-linux-on-cpu0-and-bare-metal.html">my solution</a> is being used at CERN and Bosch. The solution hinges on the OCM (on-chip-memory) and the software interrupt available on the Zynq, as you can see below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLDdkvLGYg_wO6nE-uRSlEsTFRvzBhVpKY9V7kWl9GxUSScjlQ2cO2h8N6oqksudshZ1SKCoieXEnzjI4kl5ILlGpa34NAqvTnIksY64fMfBYKFEyyDmrTKUdQbot4BNEOw7UOh7SFO2Ue/s1600/Zynq+architecture.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLDdkvLGYg_wO6nE-uRSlEsTFRvzBhVpKY9V7kWl9GxUSScjlQ2cO2h8N6oqksudshZ1SKCoieXEnzjI4kl5ILlGpa34NAqvTnIksY64fMfBYKFEyyDmrTKUdQbot4BNEOw7UOh7SFO2Ue/s1600/Zynq+architecture.png" /></a></div>
The interrupt controller is part of ARMv7 architecture, but OCM is Xilinx's contribution. The 2 systems can even share data that is too big to fit in the relatively tiny OCM; I even prototyped a custom image collection system that takes images from an Aptina (now On Semi) CCD scheduled by the bare metal, that Linux can read later from the shared DRAM area, using Xilinx VDMA core, as explained in <a href="http://henryomd.blogspot.com/2015/06/video-dma-to-linux-on-zedboard.html">my previous blog entry</a>.<br />
<br />
But since leaving the biotech industry, I haven't touched FPGA. Instead, I have worked with ARM Cortex M (<a href="http://henryomd.blogspot.com/2016/01/mainline-uclinux-on-stm32f429-discovery.html">even got Linux to run on ARM Cortex M4!</a>), FreeRTOS and Raspberry PI (3). In the last year, I was too busy studying DSP and computer vision, but I am beginning to itch for low level programming again. I was blown away by Raspberry PI's bang-for-the-buck (it even has a programmable GPU, which <a href="https://www.raspberrypi.org/blog/accelerating-fourier-transforms-using-the-gpu/">you can use as DSP</a>), so this time, I think I will ditch my $600 Zedboard in favor of the $35 RPI3 I bought more than a year ago. The only complaint I have is the low performance of the data pipe from the CPU to the GPU, but this is a common problem in ALL computation pipelines that include a GPU (including CUDA). I have some hope that ARM will solve the problem by more tightly integrating Mali with ARM Cortex A series in the future, but let's not let that stop us.<br />
<br />
So if the OCM was the key to the dual system in my solution on Zynq, and BCM2837 (the SoC for RPI3) is MISSING equivalent HW, how shall I work around it? I first thought about using the DRAM as the shared memory, but I am concerned about degrading the cache hit rate, which I want to keep high for the hard real-time code still running out of DRAM. Since the message path between the real-time and the non-real-time system does NOT have to be real-time, I am going to use another piece of HW to shuttle the messages, as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSXqjQo5GzKUqv8dTZOytShStJZePqtbQE-UQ4Yjet4Frpk6G1GZHN9aGcf6eYzEbDbJpTkxp8GoVQQOBTbFRI141fT2kv46GBDOzBwdPObkGsQyMoOd4PoLvHpEjtQuEx-ayvBsLhvqOG/s1600/RPI+architecture.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSXqjQo5GzKUqv8dTZOytShStJZePqtbQE-UQ4Yjet4Frpk6G1GZHN9aGcf6eYzEbDbJpTkxp8GoVQQOBTbFRI141fT2kv46GBDOzBwdPObkGsQyMoOd4PoLvHpEjtQuEx-ayvBsLhvqOG/s1600/RPI+architecture.png" /></a></div>
RPI3 only has 1 USART, so I cannot directly connect the 2 halves. But since Linux (and U-Boot also, I think) is capable of using USBserial console out of the box (it appears as a /dev file), there should be little work necessary to communicate with the bare-metal code--if the bare-metal can read/write to the BCM2837's USART peripheral. The current maximum baud rate supported by the USBserial IC category leader FT232R is 3 Mbps, which is fast enough to shuttle most messages into/out of the real-time system--except images. So this architecture requires keeping image processing algorithms on Linux--which is frankly just easier (because you can't port OpenCV to bare metal).<br />
<br />
Another change from my previous solution is to remove the capability to start/stop the real-time subsystem from Linux--at least for now. I heard from Bosch that safety-critical real-time program needs to be available quickly from power-on, so the ~10 second delay when being launched by Linux would be unacceptable. So this time, U-Boot is going to first kick off the bare metal program on CPU3, and THEN start Linux (before exiting).<br />
<br />
I lost my Linux development system when I exited the biotech industry, so I just installed Parallels, to begin recreating my Buildroot development environment. Because I am busy studying a few other things (on top of my full time job), a fully working demo is going to take some time to materialize. If you know how to do the following, you should just be able to pull it off by yourself:<br />
<br />
<ol>
<li>Building and running a Buildroot distribution on RPI.</li>
<li>Booting multiple systems (in sequence) from U-Boot.</li>
<li>Configuring Linux to run on only a subset of the CPUs, and using only a portion of the physical DRAM.</li>
<li>Writing a bare-metal (or FreeRTOS) system that runs out of the reserved portion of the DRAM (with L1/L2 cache turned on).</li>
</ol>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-84494925527340796492017-05-08T21:56:00.000-07:002017-05-08T21:56:01.404-07:00Understanding Doppler phaseI wanted to fuse the real-time mic recording on my iPhone with the accelerometer/gyro signals for my hobby indoor location project. But for reasons I discussed in <a href="http://henryomd.blogspot.com/2017/03/audio-chirp-processing-problems-on-os-x.html">my previous blog entry</a>, I am currently stuck. To get unstuck, I started thinking about the range rate estimation used in radars, which is primarily Doppler based. Doppler frequency shift is covered in high school physics, but it's been a long time since I've had to think about Doppler, so I derived it myself.<br />
<br />
Consider a mic M initially at rest, subject to a sound wave of frequency F and wave speed c, as shown below. For convenience, let's pick the beginning of sampling to coincide with phase 0 of the sine wave. This just fixes the constant phase offset, so using 0 initial phase does not invalidate the ensuing derivation.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTIoa7JMibD6jYrRgSUlI1u5DIUTCgkpjEED7YDpyflyumSmtyQcLeQZjcNLU1CKnW0owlQ0f_OyHJQ2gsMICEhX6fjjl5vB3CjrLFey1ULTwAEG67vXXp9VJiwt27aYCOQp85CkI-A6KX/s1600/Doppler1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTIoa7JMibD6jYrRgSUlI1u5DIUTCgkpjEED7YDpyflyumSmtyQcLeQZjcNLU1CKnW0owlQ0f_OyHJQ2gsMICEhX6fjjl5vB3CjrLFey1ULTwAEG67vXXp9VJiwt27aYCOQp85CkI-A6KX/s1600/Doppler1.png" /></a></div>
Since the mic is at rest (v_M = 0), the sound wave passes by the mic at c [m/s]. After t [s] passes, the mic will experience the phase <span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;">𝛟</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;"> = 2</span><span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;">𝛑</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">Ft</span> of the sound wave, as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEii0K3Rt6AX-qGgRao4POLwtNLIYZ2g8YEE1UnyWDYvHpIEOhuUxsEQzMHVT4W-QNlmpB62j6B-pmeKPDa_cipkeMI-ZCYg_SmRMoKi67mhfqpVe60fhR4FC_jwvDOSmCmJFop70K0BXZV1/s1600/Doppler2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEii0K3Rt6AX-qGgRao4POLwtNLIYZ2g8YEE1UnyWDYvHpIEOhuUxsEQzMHVT4W-QNlmpB62j6B-pmeKPDa_cipkeMI-ZCYg_SmRMoKi67mhfqpVe60fhR4FC_jwvDOSmCmJFop70K0BXZV1/s1600/Doppler2.png" /></a></div>
Now suppose the mic is moving at v [m/s]. Then the situation is exactly analogous to a race between the mic and the sound wave; so <i>relative</i> to the mic, the sound wave is moving slower than when the mic was stationary, as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDS80z8NBlUWsYxhZVvFSUYwL6u0kIyXOOVEtT0Af30TzeSF8uy9vvmg3NR9n_XeQc-yWEmZEqPProWAeiEF7aNudYFAeklYqJhZfnKGbeQ0g4QcO0aNLWYr66eVtgd6KT3vzphj4rIl_q/s1600/Doppler3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDS80z8NBlUWsYxhZVvFSUYwL6u0kIyXOOVEtT0Af30TzeSF8uy9vvmg3NR9n_XeQc-yWEmZEqPProWAeiEF7aNudYFAeklYqJhZfnKGbeQ0g4QcO0aNLWYr66eVtgd6KT3vzphj4rIl_q/s1600/Doppler3.png" /></a></div>
How much slower? After the same t [s], the wave phase will have advanced <span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">2</span><span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;">𝛑</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">F</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">t * </span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">v/c</span> less than in the previous case, so that the mic is now experiencing the phase <span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;">𝛟</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;"> = 2</span><span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;">𝛑</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">F(1-v/c)t</span>.
When I divide the phase difference between the 2 cases, <span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">-</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">2</span><span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;">𝛑</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">F</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">t * </span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">v/c</span>, by the elapsed time t, I get <span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">-</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">2</span><span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;">𝛑</span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">F</span><span style="font-family: STIXGeneral; font-size: 16px; line-height: normal; text-align: center;"> </span><span style="font-family: 'Helvetica Neue'; font-size: 16px; text-align: center;">v/c</span>, which is exactly the Doppler shift expressed as an angular frequency (that is, -F v/c in Hz). This also makes sense because the derivative of phase is frequency.<br />
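The derivation can be replayed numerically. In the sketch below, the function names are illustrative and c = 343 m/s (the speed of sound in air at roughly room temperature) is an assumed value; the mic moves in the same direction as the wave, as in the race picture above.<br />

```python
import math

def mic_phase(freq_hz, t, v, c=343.0):
    """Phase [rad] seen after t [s] by a mic moving at v [m/s] with the wave."""
    return 2.0 * math.pi * freq_hz * (1.0 - v / c) * t

def doppler_shift_hz(freq_hz, v, c=343.0):
    """Doppler frequency shift [Hz] for a receiver moving away from the source."""
    return -freq_hz * v / c

F, v, t = 10_000.0, 1.0, 0.002        # 10 kHz tone, 1 m/s mic speed, 2 ms
phase_rest = mic_phase(F, t, 0.0)
phase_moving = mic_phase(F, t, v)

# Phase difference divided by elapsed time is an angular frequency;
# dividing by 2*pi as well converts it to Hz.
shift_from_phase = (phase_moving - phase_rest) / (2.0 * math.pi * t)
```

For these numbers the shift is about -29 Hz, a fraction v/c of the 10 kHz tone.<br />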
<br />
This derivation might be helpful if you want to measure Doppler from the raw waveform rather than by measuring the received wave frequency, as is done in radar processing. The problem I find with frequency-based estimation is that I cannot get the instantaneous frequency, since estimating frequency requires <i>several</i> periods of the waveform (the more periods, the more accurate the estimate). So while the Doppler from frequency domain processing may be fine in an average sense, I have to give up some resolution in the process. Since radar waves are typically microwave frequency, requiring lots of periods is not a problem for radar applications (even after the signal is heterodyned and down-sampled). But for sound waves sampled at the CD sample rate, requiring say 20 periods (I would not go below that due to noise concerns) of a 10 kHz wave would equate to 20 / F = 2 ms, which may start to be a problem for high-dynamics applications. <br />
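The arithmetic behind that time resolution is short enough to write down. The helper names below are made up for illustration, and 44.1 kHz is assumed as the CD sample rate.<br />

```python
def window_seconds(periods, freq_hz):
    """Time needed to observe a given number of periods of a tone."""
    return periods / freq_hz

def window_samples(periods, freq_hz, sample_rate_hz=44_100):
    """The same observation window expressed in audio samples."""
    return periods * sample_rate_hz / freq_hz

window_s = window_seconds(20, 10_000.0)    # 20 periods of a 10 kHz tone
window_n = window_samples(20, 10_000.0)
```

So 20 periods of a 10 kHz tone span 2 ms, or only about 88 samples at the CD rate.<br />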
Estimating the wave phase directly as a function of the range rate might also be helpful for estimating the distance change directly from the raw audio signal. I actually just read <i>The Doppler Equation in Range and Range Rate Measurement, NASA Technical Note X-55373</i>, dating back to 1965 (!) for the Apollo program, where the debate seems to have been over how best to track both the range and range rate of the rocket. Even though the author (NASA Goddard) seems to have decided in favor of obtaining the range by integrating the estimated range rate (from Doppler), you might find the direct estimation of the range (without the integration step, which induces random walk noise) useful. In the next month, I hope to find some time to implement a Kalman filter to do just that.Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-5970421743749950802017-03-25T20:00:00.003-07:002017-03-26T08:42:01.077-07:00Understanding sorting network in MatlabWhen I began this hobby project to solve the indoor location problem using only the HW available on a modern smartphone--all just to play an augmented reality X-wing vs. TIE fighter game with my friend's 5-year-old son--I had no idea how much I would have to figure out. And it's turning out to be quite a lot: the definition of "knowing" takes on a whole new dimension when you try to make something that works. But it turns out this is the best way I learn; I should have tried to make something by myself a long time ago!<br />
<br />
Now that I understand how to get continuous correlation from the matched filter (as explained in my <a href="http://henryomd.blogspot.com/2017/03/audio-chirp-processing-problems-on-os-x.html">previous blog entry</a>), I am trying to boost the SNR of the filter and fish out the instantaneous jump in correlation when there is a chirp. As shown in the last blog entry, that spike can be buried in the ambient sound (usually of much lower frequency than the near-Nyquist frequency of the correlation spike). One of my favorite techniques is to subtract the median of a rolling time window: the median is superior to the mean in smoothing a signal in a high-noise environment. The only problem is that the rolling median is hard to compute, unlike the rolling mean. The most naive implementation sorts the contents of the rolling window at every sample point. Even with an O(N log N) sorting algorithm, the cost of sorting goes up quickly as the window size grows. More sophisticated running median algorithms try to keep the rolling window ordered. I actually tried to do this with a doubly linked list, with a special pointer to the median, as shown in the sketch below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDrKVVbu7zycHr97z6XPEjgxsh4u0p9QqAzUo_tGrtP-7Z1DbhgGzxdcw077L3RkIcBFq8rcz57Wwm_QGJAGAs0s_e-0mqs0yHXWaoRX3Dy5xd72uIjXA30ZgHLWXnJbE0XznaL8NzfO1t/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDrKVVbu7zycHr97z6XPEjgxsh4u0p9QqAzUo_tGrtP-7Z1DbhgGzxdcw077L3RkIcBFq8rcz57Wwm_QGJAGAs0s_e-0mqs0yHXWaoRX3Dy5xd72uIjXA30ZgHLWXnJbE0XznaL8NzfO1t/s1600/DSP.png" /></a></div>
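For comparison, the naive approach mentioned above (re-sorting the window at every sample, then subtracting the median) might look like the following C++ sketch. The function name <code>subtract_rolling_median</code>, the trailing (causal) window, and the use of the lower middle value while the window is still filling are assumptions made for illustration, not code from this project.<br />

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Naive baseline (hypothetical helper): for every sample, copy the trailing
// window, sort the copy, and subtract its median from the current sample.
// Cost is O(W log W) per sample for a window of W samples.
std::vector<float> subtract_rolling_median(const std::vector<float>& x,
                                           std::size_t window) {
    assert(window > 0);
    std::vector<float> out(x.size());
    std::vector<float> scratch;
    for (std::size_t n = 0; n < x.size(); ++n) {
        // The window covers x[first..n]; it is shorter while the buffer fills.
        const std::size_t first = (n + 1 >= window) ? n + 1 - window : 0;
        scratch.assign(x.begin() + first, x.begin() + n + 1);
        std::sort(scratch.begin(), scratch.end());
        // Middle element; the lower of the two middles for an even count.
        const float med = scratch[(scratch.size() - 1) / 2];
        out[n] = x[n] - med;
    }
    return out;
}
```

A lone spike on a flat background passes through unchanged, while a slowly varying background is pulled toward zero, which is the effect wanted here.<br />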
Since I use a circular queue in all embedded projects, I was going to maintain a doubly linked circular buffer of size 1 greater than the window size, so that I could mark the oldest element stale, and decide whether to move the median to the left or to the right (or leave it as is) depending on the combination of the values of the elements, as hinted at below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyJbDcRHZLUa3-XnHeQKJUWns3_VClcU3kiejEsFXT14_p6o0sr0k_X4Aq77QUvS_hmRlTGM7zR1chPzT9-4HIsPOO8BaR9MGGkCjlRI16bQThp3FwD283SoR1Byvnr87kQdSu1yNh21kj/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyJbDcRHZLUa3-XnHeQKJUWns3_VClcU3kiejEsFXT14_p6o0sr0k_X4Aq77QUvS_hmRlTGM7zR1chPzT9-4HIsPOO8BaR9MGGkCjlRI16bQThp3FwD283SoR1Byvnr87kQdSu1yNh21kj/s1600/DSP.png" /></a></div>
This took me 2 nights of coding and debugging to get right. Incidentally, problems like this come up often in SW interviews. Here is the class that does all the heavy lifting:<br />
<br />
<div style="color: #ba2da2; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">template</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> &lt;</span><span style="font-variant-ligatures: no-common-ligatures;">typename</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> T&gt; </span><span style="font-variant-ligatures: no-common-ligatures;">class</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> SortedQ {</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">class</span><span style="font-variant-ligatures: no-common-ligatures;"> dValue {</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">friend</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">class</span><span style="font-variant-ligatures: no-common-ligatures;"> SortedQ;</span></div>
<div style="color: #ba2da2; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">protected</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">:</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;"> *_prev, *_next;</span></div>
<div style="color: #ba2da2; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">public</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">:</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> T val;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">bool</span><span style="font-variant-ligatures: no-common-ligatures;"> stale;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> init(T v) { </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">stale</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">false</span><span style="font-variant-ligatures: no-common-ligatures;">; </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">val</span><span style="font-variant-ligatures: no-common-ligatures;"> = v; }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* next() {</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">return</span><span style="font-variant-ligatures: no-common-ligatures;"> (!</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;"> || !</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;">->stale) ? </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;"> : </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;">->_next;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* prev() {</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">return</span><span style="font-variant-ligatures: no-common-ligatures;"> (!</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;"> || !</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;">->stale) ? </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;"> : </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;">->_prev;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">static</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">int</span><span style="font-variant-ligatures: no-common-ligatures;"> compare(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">const</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">* lp, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">const</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">* rp) {</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">const</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* *l = (</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">const</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">**)lp;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">const</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* *r = (</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">const</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">**)rp;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">return</span><span style="font-variant-ligatures: no-common-ligatures;"> ((*l)->val > (*r)->val) - ((*r)->val > (*l)->val);</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> rlink(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">& right) { </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;"> = &right; right._prev = </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">this</span><span style="font-variant-ligatures: no-common-ligatures;">; }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> llink(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">& left) { </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;"> = &left; left._next = </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">this</span><span style="font-variant-ligatures: no-common-ligatures;">; }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> replace(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* old) {</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;"> = old->_prev; </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;"> = old->_next;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;">->_next = </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;">->_prev = </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">this</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> rinsert(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* newnode);</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> linsert(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* newnode);</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> dropout() {</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">if</span><span style="font-variant-ligatures: no-common-ligatures;"> (</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;">) </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;">->_prev = </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">if</span><span style="font-variant-ligatures: no-common-ligatures;"> (</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;">) </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_prev</span><span style="font-variant-ligatures: no-common-ligatures;">->_next = </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_next</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> };</span></div>
<div style="color: #ba2da2; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">public</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">:</span></div>
<div style="color: #ba2da2; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">static</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">constexpr</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> SIZE = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">5</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;"> arr[</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">SIZE</span><span style="font-variant-ligatures: no-common-ligatures;">+</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">];</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> SortedQ() {</span></div>
<div style="color: #008400; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> memset(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">arr</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">arr</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">)); </span><span style="font-variant-ligatures: no-common-ligatures;">// mass-set pointers to 0</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> ~SortedQ() = </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">default</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> init();</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> T& median() { </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">return</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_median</span><span style="font-variant-ligatures: no-common-ligatures;">->val; }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span></div>
<div style="color: #008400; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//Tail is where we will insert the NEXT sample to, so</span></div>
<div style="color: #008400; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//head should be the slot AFTER the tail (circular).</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">inline</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* head() { </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">return</span><span style="font-variant-ligatures: no-common-ligatures;"> &</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">arr</span><span style="font-variant-ligatures: no-common-ligatures;">[(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">_tail</span><span style="font-variant-ligatures: no-common-ligatures;"> + </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">) % (</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">SIZE</span><span style="font-variant-ligatures: no-common-ligatures;">+</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">)]; }</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> T& push(T& f);</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> print();</span></div>
<div style="color: #ba2da2; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">private</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">:</span></div>
<div style="color: #008400; font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> _tail;</span><span style="font-variant-ligatures: no-common-ligatures;">//In stable state, _tail points to the NEXT slot</span></div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">dValue</span><span style="font-variant-ligatures: no-common-ligatures;">* _median;</span></div>
<br />
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">};</span></div>
<br />
As evidenced by my slow coding speed, I think I am not a top-notch SW developer--especially given that my code assumes non-duplicate elements. If I piqued your interest by tying this to SW interviews, the best known solution is to keep 2 separate ordered lists (each of which can report its size) on either side of the median, and to keep the 2 sides of equal length (to within one element) at all times. A simple list has O(N) search and insertion time, so people use advanced data structures like a balanced binary tree, or even a skip list, for O(log N) insertion and search time. I can guarantee that such an implementation will not be requested in a 45-minute SW interview!<br />
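<br />
The two-ordered-halves idea can be sketched with <code>std::multiset</code>, which is typically implemented as a balanced binary tree and also tolerates duplicate values. This is only a sketch under stated assumptions: the class name <code>RollingMedian</code>, the extra FIFO used to find the oldest sample, and returning the lower middle value while the window holds an even count are all choices made for illustration, and this is not the SortedQ class above.<br />

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <set>

// Hypothetical rolling median over the last `window` samples. The samples are
// split into two ordered halves: low_ holds the smaller half (its largest
// element is the median) and high_ holds the larger half.
template <typename T>
class RollingMedian {
public:
    explicit RollingMedian(std::size_t window) : window_(window) {
        assert(window > 0);
    }

    // Insert one sample, evict the oldest if the window is full, and return
    // the median (the lower middle value when the count is even).
    T push(const T& v) {
        fifo_.push_back(v);
        if (low_.empty() || !(*low_.rbegin() < v)) low_.insert(v);
        else high_.insert(v);

        if (fifo_.size() > window_) {
            const T old = fifo_.front();
            fifo_.pop_front();
            // Erase exactly one copy of the oldest value from whichever half
            // holds it; equal values are interchangeable.
            auto it = low_.find(old);
            if (it != low_.end()) low_.erase(it);
            else high_.erase(high_.find(old));
        }
        rebalance();
        return *low_.rbegin();
    }

private:
    // Restore high_.size() <= low_.size() <= high_.size() + 1.
    void rebalance() {
        while (low_.size() > high_.size() + 1) {
            auto it = std::prev(low_.end());
            high_.insert(*it);
            low_.erase(it);
        }
        while (high_.size() > low_.size()) {
            auto it = high_.begin();
            low_.insert(*it);
            high_.erase(it);
        }
    }

    std::size_t window_;
    std::deque<T> fifo_;      // arrival order, to know which sample to evict
    std::multiset<T> low_;    // smaller half, including the median
    std::multiset<T> high_;   // larger half
};
```

Each push costs O(log N) tree operations, but every one of them is pointer chasing and node allocation, with the value comparisons themselves being a small part of the work.<br />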
<br />
So even for the seemingly simple task of keeping a sorted window, the SW solution will be complicated; and I hate complexity. As someone who is more interested in HW than SW, I find a bigger disappointment in the SW solution discussed above: there is no HW optimization! The CPU spends most of its time managing pointers, and the actual value comparison is only a minuscule portion of the total running time. But in THIS problem, where the window size is fixed and known at design time, and the data is streaming in nature, there IS one important HW optimization possible: vector-wise sorting using a sorting network. The streaming nature of the data means that each successive sliding window looks like a delayed version of the previous window, as illustrated below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivrSS10113dHYis0-Dyr9yu5nWCSrXgebWzSTwJCtxVH_RNOJaFMl9w6utgJAnJogJ1mhqcj6h_esKVtd5IYFSxMz9I11LpcYuFqgAuUmGHv1S-e03Lwzh0KciaGypEFaRq3FKbxjzt5lN/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivrSS10113dHYis0-Dyr9yu5nWCSrXgebWzSTwJCtxVH_RNOJaFMl9w6utgJAnJogJ1mhqcj6h_esKVtd5IYFSxMz9I11LpcYuFqgAuUmGHv1S-e03Lwzh0KciaGypEFaRq3FKbxjzt5lN/s1600/DSP.png" /></a></div>
I can batch-sort a block of inputs going through this sorting window, by treating each delay (starting at delay 0) as a separate vector. For the window size of 5, for example, there are 5 inputs going into my sorting network--which has a known optimum solution (both in number of comparisons and delay) discussed in <i>The Art of Computer Programming, Vol 3</i> and reproduced on <a href="http://www.angelfire.com/blog/ronz/Articles/999SortingNetworksReferen.html">Ron Zeno's website</a>.<br />
<br />
While the sorting network is ideally implemented in custom HW, the next best implementation can be had on SIMD-capable chips, like Intel with SSE, or ARM with Neon. And the vectored nature of Matlab's max/min functions can easily simulate it. Here's my implementation of a 5-element sorting network:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbxkmZMHgF0J5-VwLWHJOwAlvMaZltm3HQZ2LG8tmZvp4kEGQKBCuhA2WJQ2ElxC8i0XgRHaFKgFTjaVCwhovNY2Oo_lLo4r9yQ5sov8T06_exCzRbuFz4-NdkatdrrLo7lT88SodUQSOW/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbxkmZMHgF0J5-VwLWHJOwAlvMaZltm3HQZ2LG8tmZvp4kEGQKBCuhA2WJQ2ElxC8i0XgRHaFKgFTjaVCwhovNY2Oo_lLo4r9yQ5sov8T06_exCzRbuFz4-NdkatdrrLo7lT88SodUQSOW/s1600/DSP.png" /></a></div>
There are 2 opportunities for parallelization: at the very beginning and right before the last swap. I deliberately drew them staggered because I am NOT going to parallelize them. The letters in the middle show the temporary vectors necessary to store the intermediate values--unless you have a HW-assisted compare and swap--because each compare and swap needs to be done in 2 stages: taking the max of 2 vectors, and then taking the min of the same 2 vectors. Right away, you can see that the vectored sorting network will be memory hungry, and might not be suitable for cheap MCUs. But an iPhone can easily provide a few extra KB without throwing up.<br />
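<br />
To make the 2-stage compare and swap concrete, here is a plain C++ sketch of the same idea, using whole-vector max and min passes and the same comparator order as the Matlab listing that follows. The names <code>vmax</code>, <code>vmin</code>, <code>delay1</code> and <code>running_median5</code> are hypothetical, the start-up samples repeat the first input, and no SIMD intrinsics are used; a real SSE or Neon version would replace the two element-wise passes.<br />

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// Stage 1 of a compare and swap: element-wise max of two equal-length vectors.
Vec vmax(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    Vec out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(),
                   [](float p, float q) { return std::max(p, q); });
    return out;
}

// Stage 2 of a compare and swap: element-wise min of the same two vectors.
Vec vmin(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    Vec out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(),
                   [](float p, float q) { return std::min(p, q); });
    return out;
}

// Delay a vector by one sample, repeating the first sample at the start.
Vec delay1(const Vec& v) {
    Vec out(v);
    if (v.size() > 1) std::copy(v.begin(), v.end() - 1, out.begin() + 1);
    return out;
}

// Causal running median of 5: out[n] is the median of x[n-4..n].
Vec running_median5(const Vec& x) {
    const Vec d1 = delay1(x), d2 = delay1(d1), d3 = delay1(d2), d4 = delay1(d3);
    Vec A = vmax(x, d1),  B = vmin(x, d1);
    Vec C = vmax(d2, d3), D = vmin(d2, d3);
    Vec E = vmax(B, D);   D = vmin(B, D);
    B = vmax(C, d4);      C = vmin(C, d4);
    B = vmin(A, B);       // the max half is the window maximum: not needed
    A = vmax(E, C);       C = vmin(E, C);
    B = vmin(A, B);       // the max half is the 2nd largest: not needed
    D = vmax(D, C);
    return vmax(B, D);    // largest of the remaining three is the median
}
```

Every temporary here is a full-length vector, which is where the memory appetite comes from; processing the input in fixed-size blocks would bound it.<br />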
<br />
Here's a Matlab simulation I used to verify this network:<br />
<br />
<div style="font-family: Courier; line-height: normal;">
x = rand(1000,1);</div>
<div style="font-family: Courier; line-height: normal;">
d1 = [x(1); x(1:end-1)];</div>
<div style="font-family: Courier; line-height: normal;">
d2 = [d1(1); d1(1:end-1)];</div>
<div style="font-family: Courier; line-height: normal;">
d3 = [d2(1); d2(1:end-1)];</div>
<div style="font-family: Courier; line-height: normal;">
d4 = [d3(1); d3(1:end-1)];</div>
<div style="font-family: Courier; line-height: normal;">
plot(x); hold <span style="color: #b245f3;">on</span>;</div>
<div style="font-family: Courier; line-height: normal;">
plot(d4);</div>
<div style="font-family: Courier; line-height: normal;">
hold <span style="color: #b245f3;">off</span>;</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
A = max(x, d1); B = min(x, d1);</div>
<div style="font-family: Courier; line-height: normal;">
C = max(d2, d3); D = min(d2, d3);</div>
<div style="font-family: Courier; line-height: normal;">
E = max(B, D); D = min(B, D);</div>
<div style="font-family: Courier; line-height: normal;">
B = max(C, d4); C = min(C, d4);</div>
<div style="font-family: Courier; line-height: normal;">
F = max(A, B); B = min(A, B);</div>
<div style="font-family: Courier; line-height: normal;">
A = max(E, C); C = min(E, C);</div>
<div style="font-family: Courier; line-height: normal;">
E = max(A, B); B = min(A, B);</div>
<div style="font-family: Courier; line-height: normal;">
D = max(D, C);</div>
<div style="font-family: Courier; line-height: normal;">
B = max(B, D);</div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
truth = medfilt1(x, 5);</div>
<div style="font-family: Courier; line-height: normal;">
med_err = truth(1:end-2) - B(3:end);</div>
<div style="font-family: Courier; line-height: normal;">
plot(x); hold <span style="color: #b245f3;">on</span>; </div>
<div style="font-family: Courier; line-height: normal;">
plot(truth(1:end-2)); plot(B(3:end)); hold <span style="color: #b245f3;">off</span>;</div>
<br />
<div style="font-family: Courier; line-height: normal;">
plot(med_err);</div>
<div>
<br /></div>
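The same compare-and-swap sequence can also be sketched in plain C, one sample at a time. This is only an illustration under assumptions, not code from the post's repo: the names <code>median5</code> and <code>running_median5</code> are hypothetical, the first sample is repeated at the start of the block (mirroring how <code>d1</code>..<code>d4</code> are built above), and <code>F</code> and the final <code>E</code> are dropped because only the median is needed.<br />

```c
#include <assert.h>
#include <stddef.h>

static float max_f(float a, float b) { return a > b ? a : b; }
static float min_f(float a, float b) { return a < b ? a : b; }

/* Median of 5 values, using the same compare-and-swap order as the
 * Matlab simulation (the running-maximum output F is omitted). */
static float median5(float x, float d1, float d2, float d3, float d4)
{
    float A = max_f(x, d1),  B = min_f(x, d1);
    float C = max_f(d2, d3), D = min_f(d2, d3);
    float E = max_f(B, D);
    D = min_f(B, D);          /* smallest of the first four inputs */
    B = max_f(C, d4);
    C = min_f(C, d4);
    B = min_f(A, B);          /* discards the overall maximum */
    A = max_f(E, C);
    C = min_f(E, C);
    B = min_f(A, B);          /* discards the second largest */
    D = max_f(D, C);
    return max_f(B, D);       /* largest of the remaining three */
}

/* Causal 5-sample running median: out[i] is the median of x[i-4..i].
 * Indices before the start of the block are clamped to x[0]. */
void running_median5(const float *x, float *out, size_t n)
{
    assert(x != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        float d1 = x[i >= 1 ? i - 1 : 0];
        float d2 = x[i >= 2 ? i - 2 : 0];
        float d3 = x[i >= 3 ? i - 3 : 0];
        float d4 = x[i >= 4 ? i - 4 : 0];
        out[i] = median5(x[i], d1, d2, d3, d4);
    }
}
```

Like <code>B</code> in the Matlab version, the output is causal, so <code>out[i]</code> corresponds to the centered <code>medfilt1</code> result 2 samples earlier. A whole-vector version built on library max/min calls would still need the temporary arrays discussed here.<br />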
You can see in my <a href="http://henryomd.blogspot.com/2017/03/audio-chirp-processing-problems-on-os-x.html">previous blog entry</a> that I used 3 extra arrays of length 1K (that's the length of the matched filter) for correlation processing. That is already 3 x 1K x 4B = 12 KB of RAM. For this median processing, I can reuse those 3 arrays on the stack for A, B, and C. But I still have to come up with D and E (F is only necessary to get the running maximum), which adds another 2 x 1K x 4B = 8 KB, so I am looking at 20 KB of extra RAM for the correlation and median processing. As a reference, the nRF51822, the BLE MCU I usually play around with, has at most 32 KB of RAM, so this solution is starting to look expensive...<br />
<br />Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com2tag:blogger.com,1999:blog-4032020337247582619.post-46552838069437351462017-03-20T21:55:00.000-07:002017-04-22T19:44:44.450-07:00Audio chirp processing problems on OS XIn a <a href="http://henryomd.blogspot.com/2017/03/signal-processing-audio-chirp-on-osx.html">previous blog post</a>, I explored directly processing the sound samples from the built-in mic on OS X using the CoreAudio framework. In that post, I briefly mentioned the non-ideal response seen on my MacBook Pro. For convenience, here is the raw sample.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXfXs8-cOgkRffpQlVbgtTp1ziyHreQRttJKVvwSRHQSTgCjURKMggoiCFLqjtAWBAjv-sAYmsr0o2vX6GXYguQLColo3rbJ3NPc1LHL56zRJHL8sNAHB-woQlRmYNcNiD0v7nP1frPVdp/s1600/heard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXfXs8-cOgkRffpQlVbgtTp1ziyHreQRttJKVvwSRHQSTgCjURKMggoiCFLqjtAWBAjv-sAYmsr0o2vX6GXYguQLColo3rbJ3NPc1LHL56zRJHL8sNAHB-woQlRmYNcNiD0v7nP1frPVdp/s1600/heard.png" /></a></div>
This is supposed to be a flat (constant-amplitude) chirp sweeping from 50 Hz to 22 kHz, except for a short ramp-up and ramp-down window at the beginning and the end. Since the MacBook Pro's mic is RIGHT under the speaker, I expected some distortion, but this seemed like too much. So I tried turning off the "voice processing" units in the HW.<br />
<br />
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">if</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> (</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">true</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">) { </span><span style="font-variant-ligatures: no-common-ligatures;">// Is the voice filter on? Try to turn it off</span></div>
<div style="color: #703daa; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">AudioComponentDescription</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> desc;</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> desc.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">componentType</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">kAudioUnitType_Output</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> desc.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">componentSubType</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">kAudioUnitSubType_VoiceProcessingIO</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> desc.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">componentManufacturer</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">kAudioUnitManufacturer_Apple</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> desc.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">componentFlags</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #703daa; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> desc.</span><span style="font-variant-ligatures: no-common-ligatures;">componentFlagsMask</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">AudioComponent</span><span style="font-variant-ligatures: no-common-ligatures;"> comp = </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">AudioComponentFindNext</span><span style="font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">NULL</span><span style="font-variant-ligatures: no-common-ligatures;">, &desc);</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">AudioUnit</span><span style="font-variant-ligatures: no-common-ligatures;"> vpioUnit;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">AudioComponentInstanceNew</span><span style="font-variant-ligatures: no-common-ligatures;">(comp, &vpioUnit);</span></div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">AudioUnitSetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(vpioUnit, </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAUVoiceIOProperty_BypassVoiceProcessing</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Global</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, &enableFlag, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="font-variant-ligatures: no-common-ligatures;">(enableFlag))</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , </span><span style="font-variant-ligatures: no-common-ligatures;">"BypassVoiceProcessing failed"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">AudioUnitSetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(vpioUnit, </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAUVoiceIOProperty_VoiceProcessingEnableAGC</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Global</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, &disableFlag, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="font-variant-ligatures: no-common-ligatures;">(disableFlag))</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , </span><span style="font-variant-ligatures: no-common-ligatures;">"VoiceProcessingEnableAGC"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">//Deprecated a long time ago</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//AudioUnitSetProperty(vpioUnit, kAUVoiceIOProperty_DuckNonVoiceAudio</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">// , kAudioUnitScope_Global, 1, &disableFlag, sizeof(disableFlag));</span></div>
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<br />
The result is even stranger:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiuBzu8b2-jRIIO0Ww7GY6T_L3QfvtkhoFMMEGf3_l2emaM13eOReLS2GxvkkSYfPshzztHqDaLVpe7owhK2S3CuXsljNFKWPZT9fkBk9prWxZi2PMC3S07YeoMhdM-M0Kvp2m799xx6MR6/s1600/corr+chirps.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="178" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiuBzu8b2-jRIIO0Ww7GY6T_L3QfvtkhoFMMEGf3_l2emaM13eOReLS2GxvkkSYfPshzztHqDaLVpe7owhK2S3CuXsljNFKWPZT9fkBk9prWxZi2PMC3S07YeoMhdM-M0Kvp2m799xx6MR6/s640/corr+chirps.png" width="640" /></a></div>
I then tried different combinations of the above 2 options, and even hit an AddressSanitizer error inside <span style="color: #3e1e81; font-family: "menlo";">AudioUnitSetProperty</span> when I tried to enable <span style="color: #3e1e81; font-family: "menlo";">kAUVoiceIOProperty_VoiceProcessingEnableAGC</span>.<br />
<br />
Running the simple LFM (linear frequency modulated) matched filter on a non-ideal received signal is problematic. The non-constant group delay is the biggest problem. This can be visualized in 2 different ways. The first is to simply find the lag with the best cross-correlation and shift the received signal by that many samples, as shown in the zoomed-in view below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcfAMNVk5ppv0M-WBBntCvAhcfy5ZvpE8bs-_PTPHOmnbZReYto4xij5M0ZGdBE6EXvpRzWdB17QRF1zX5uWg1gy4WofptfyjpRrE77OjuUDgxOO_lEombC33CzlFpEMUlF-Ah68YJr_8W/s1600/6-14kHz+align.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcfAMNVk5ppv0M-WBBntCvAhcfy5ZvpE8bs-_PTPHOmnbZReYto4xij5M0ZGdBE6EXvpRzWdB17QRF1zX5uWg1gy4WofptfyjpRrE77OjuUDgxOO_lEombC33CzlFpEMUlF-Ah68YJr_8W/s1600/6-14kHz+align.png" /></a></div>
This "best effort" fit is clearly not well aligned--even after I reduced the bandwidth from 20 kHz down to 8 kHz to avoid the low and high frequencies (where the distortion seems the greatest). Another way to visualize the problem is to segment the filter (I have been using 1024 taps) into smaller chunks and find the lag of the best cross-correlation for each chunk. If the MacBook mic (and the iPod speaker) were distortion-free, the group delay would be constant, which means that the maximum correlation index should increase linearly from chunk to chunk (because group delay is the derivative of the phase lag with respect to frequency). But as you can see below, this is not the case.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN_ethkDu0fiwIaEJANL7C05tPZIT1G9NjlYjHdh3h-1Ry6NeJ6DjqczdoH501b44HKDNpmzF2XgAx_g6OJKmoMri1X8deH8iyNOK7fRYhmVxCfq8KKsuBCx27ggRAxKkh5LYT8Xl94CtY/s1600/6-14kHz+phase+lag.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="221" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN_ethkDu0fiwIaEJANL7C05tPZIT1G9NjlYjHdh3h-1Ry6NeJ6DjqczdoH501b44HKDNpmzF2XgAx_g6OJKmoMri1X8deH8iyNOK7fRYhmVxCfq8KKsuBCx27ggRAxKkh5LYT8Xl94CtY/s640/6-14kHz+phase+lag.png" width="640" /></a></div>
<br />
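The segment-by-segment check described above can be sketched in C roughly as follows. This is an illustration under assumptions, not the code used to produce the plot: <code>best_lag</code> and <code>segment_lags</code> are hypothetical names, the correlation is a brute-force time-domain dot product rather than a vDSP call, and the received block is assumed to be at least one segment long.<br />

```c
#include <assert.h>
#include <stddef.h>

/* Return the lag in [0, n_sig - n_ref] that maximizes the
 * cross-correlation sum_k ref[k] * sig[lag + k]. */
size_t best_lag(const float *sig, size_t n_sig,
                const float *ref, size_t n_ref)
{
    assert(sig != NULL && ref != NULL);
    assert(n_ref > 0 && n_sig >= n_ref);
    size_t best = 0;
    double best_val = 0.0;
    for (size_t lag = 0; lag + n_ref <= n_sig; ++lag) {
        double acc = 0.0;
        for (size_t k = 0; k < n_ref; ++k)
            acc += (double)ref[k] * (double)sig[lag + k];
        if (lag == 0 || acc > best_val) {
            best_val = acc;
            best = lag;
        }
    }
    return best;
}

/* Split the matched filter into chunks of seg_len taps and record the
 * best-correlation lag of each chunk in lags[]. Returns the number of
 * chunks written (any leftover taps at the end are ignored). */
size_t segment_lags(const float *sig, size_t n_sig,
                    const float *filt, size_t n_filt,
                    size_t seg_len, size_t *lags)
{
    assert(lags != NULL);
    assert(seg_len > 0 && seg_len <= n_filt && n_sig >= seg_len);
    size_t n_seg = n_filt / seg_len;
    for (size_t s = 0; s < n_seg; ++s)
        lags[s] = best_lag(sig, n_sig, filt + s * seg_len, seg_len);
    return n_seg;
}
```

For an undistorted channel, the lag of chunk <code>s</code> should equal the overall delay plus <code>s * seg_len</code>, which is a straight line; any bending away from that line is the non-constant group delay.<br />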
Apple audio HW is supposed to be high quality, but I guess the built-in speaker/mic are far more limited...Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-74675657846827569452017-03-20T09:48:00.002-07:002017-03-20T09:48:56.603-07:00Signal processing audio chirp on OSXIn my <a href="http://henryomd.blogspot.com/2017/03/signal-processing-audio-chirp-in-matlab.html">previous blog post</a>, I studied pulse compression one level deeper by producing and processing an actual audio chirp. In this blog post, I get one step closer to implementing pulse compression on the phone by receiving an audio pulse with the CoreAudio framework and processing it with the vDSP API of Apple's Accelerate framework. I know that Unity offers the ability to record from the system mic, but I found its built-in Microphone class unsuitable for my need (deciding when a chirp was received): the recording duration has a 1 second resolution and is finite, whereas I want the option to perform continuous (never stopping) correlation. So I bought <a href="https://smile.amazon.com/Learning-Core-Audio-Hands-Programming/dp/0321636848/ref=sr_1_1?ie=UTF8&qid=1489983469&sr=8-1&keywords=core+audio">this book</a> on CoreAudio and started by modifying the example from Chapter 8. You can download all the code below from my GitHub repo: https://github.com/henrychoi/OpenCVUnity, in the branch output_pulse.<br />
<h2>
CoreAudio code to sample the first channel of the OS X mic</h2>
As in the book example, I used the first mic found on the system.<br />
<br />
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">// generate description that will match audio HAL</span></div>
<div style="color: #703daa; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">AudioComponentDescription</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> inputcd = {</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">};</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> inputcd.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">componentType</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">kAudioUnitType_Output</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> inputcd.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">componentSubType</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">kAudioUnitSubType_HALOutput</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> inputcd.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">componentManufacturer</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">kAudioUnitManufacturer_Apple</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span></div>
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">AudioComponent</span><span style="font-variant-ligatures: no-common-ligatures;"> comp = </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">AudioComponentFindNext</span><span style="font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">NULL</span><span style="font-variant-ligatures: no-common-ligatures;">, &inputcd);</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioComponentInstanceNew</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(comp, &</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">),</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't open component for inputUnit"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><br /></span></div>
But then I deviated from the book's example because my use case is to correlate the audio signal against the matched filter. Because the filter is designed for a chirp signal of a certain frequency and length, I want to match the sampling frequency and the number of frames in a block.<br />
<br />
<div style="color: #703daa; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> propertySize = </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> (</span><span style="font-variant-ligatures: no-common-ligatures;">AudioStreamBasicDescription</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioUnitGetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitProperty_StreamFormat</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Output</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">//to other units</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> inputBus,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">streamFormat</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &propertySize),</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't get ASBD from output unit"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//Filter designed for this rate, so fix at expected frequency</span></div>
<div style="color: #4f8187; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="font-variant-ligatures: no-common-ligatures;">streamFormat</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">mSampleRate</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">44100</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioUnitSetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitProperty_StreamFormat</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Output</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> inputBus,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">streamFormat</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> propertySize),</span></div>
<br />
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't set output unit Fs"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"></span><br />
<div style="color: #703daa; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">AudioStreamBasicDescription</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> deviceFormat;</span></span></div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;">
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioUnitGetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitProperty_StreamFormat</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Input</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">//from HW to IO unit</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> inputBus,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &deviceFormat,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &propertySize),</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't get ASBD from input unit"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> deviceFormat.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">mSampleRate</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">44100</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioUnitSetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitProperty_StreamFormat</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Input</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> inputBus,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &deviceFormat,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> propertySize),</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't set Fs input unit"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"></span><br />
<div style="font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">UInt32</span><span style="font-variant-ligatures: no-common-ligatures;"> bufferSizeFrames = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></span></div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;">
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> propertySize = </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">UInt32</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioUnitSetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioDevicePropertyBufferFrameSize</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Global</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &bufferSizeFrames,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> propertySize),</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't set buffer frame size to input unit"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><br /></span></div>
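To get a feel for what the buffer frame size means in time, the block length can be converted to a duration at the 44100 Hz sample rate set above. This is only a small arithmetic sketch: the value of LOG2M is not stated in this part of the post, so EXAMPLE_LOG2M below is a hypothetical stand-in, and the helper names are mine.<br />

```c
#include <assert.h>

/* Hypothetical stand-in for the post's LOG2M; the real value is not shown here. */
#define EXAMPLE_LOG2M 10

/* Number of frames in a block of 2^log2m frames. */
static unsigned frames_for_log2(unsigned log2m)
{
    return 1u << log2m;
}

/* Duration in milliseconds of `frames` frames sampled at `sample_rate` Hz. */
static double block_duration_ms(unsigned frames, double sample_rate)
{
    assert(sample_rate > 0.0);
    return 1000.0 * (double)frames / sample_rate;
}
```

With the assumed value of 10, frames_for_log2(EXAMPLE_LOG2M) is 1024 frames and block_duration_ms(1024, 44100.0) is about 23.2 ms, which is roughly how often the input callback would fire and how long the downstream thread has to finish one block.<br />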
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;">The Chapter 08 example uses an AUGraph to connect the mic HAL unit to the default output unit (speaker), but I just want the raw signal from the mic, so the program is significantly simpler: I only have to initialize and start the AUHAL unit explicitly:</span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"></span><br />
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioUnitInitialize</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">),</span></span></div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;">
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't initialize input unit"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">// Input unit is now ready for use</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioOutputUnitStart</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">)</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , </span><span style="font-variant-ligatures: no-common-ligatures;">"AudioOutputUnitStart"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><br /></span></div>
</span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;">Later, when I want to stop receiving the mic input, I call </span><span style="color: #3e1e81; font-family: "menlo";">AudioOutputUnitStop</span> instead. But in the continuous chirp detection scenario, I never stop the input. CoreAudio hands over the sound samples through an input callback, which is registered with the <span style="color: #3e1e81; font-family: "menlo";">kAudioOutputUnitProperty_SetInputCallback</span> property.</div>
<div>
<br /></div>
<div>
<div style="color: #703daa; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">AURenderCallbackStruct</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> callbackStruct;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> callbackStruct.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">inputProc</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">InputRenderProc</span><span style="font-variant-ligatures: no-common-ligatures;">; </span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> callbackStruct.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">inputProcRefCon</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="font-variant-ligatures: no-common-ligatures;">AudioUnitSetProperty</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">, </span></div>
<div style="color: #3e1e81; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">kAudioOutputUnitProperty_SetInputCallback</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">, </span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">kAudioUnitScope_Global</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="font-variant-ligatures: no-common-ligatures;">,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> &callbackStruct, </span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="font-variant-ligatures: no-common-ligatures;">(callbackStruct)),</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"Couldn't set input callback"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
</div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div>
<span style="color: black; font-variant-ligatures: no-common-ligatures;">The callback routine is called in (semi) real time and is therefore time critical; like an interrupt handler, it must return as quickly as possible. This means that </span>the audio reception callback should do nothing more than push the (zero padded) received signal to the signal processing algorithm, which runs in another thread.</div>
</span></div>
</span></div>
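One common way to hand blocks from a time-critical callback to another thread without taking a lock is a single-producer/single-consumer ring of pointers. The following is a minimal sketch under my own assumptions, not the queue used in this project: the names (spsc_queue, spsc_push, spsc_pop) and the capacity are hypothetical, and it relies on C11 atomics.<br />

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SPSC_CAPACITY 4 /* hypothetical size; must be a power of two */
static_assert((SPSC_CAPACITY & (SPSC_CAPACITY - 1)) == 0,
              "capacity must be a power of two");

/* Single-producer/single-consumer queue of pointers: the audio callback is
 * the only thread that pushes, the processing thread the only one that pops. */
typedef struct spsc_queue {
    void *slot[SPSC_CAPACITY];
    atomic_size_t head; /* next index to pop; written only by the consumer */
    atomic_size_t tail; /* next index to push; written only by the producer */
} spsc_queue;

static void spsc_init(spsc_queue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Producer side. Never blocks; returns 0 if the queue is full. */
static int spsc_push(spsc_queue *q, void *item)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == SPSC_CAPACITY)
        return 0;
    q->slot[tail & (SPSC_CAPACITY - 1)] = item;
    /* release: the slot write above becomes visible before the new tail */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return 1;
}

/* Consumer side. Returns NULL if the queue is empty. */
static void *spsc_pop(spsc_queue *q)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return NULL;
    void *item = q->slot[head & (SPSC_CAPACITY - 1)];
    /* release: the slot read above completes before the slot is reused */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return item;
}
```

In this arrangement the callback would only fill a block and call spsc_push; if the push fails because the consumer has fallen behind, the callback can drop the block rather than wait.<br />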
<h2>
Memory pool and lockless queue</h2>
In the CoreAudio book, data is shuttled from the callback to the downstream code through the CARingBuffer, which is specialized for the CA input buffer (this is what CoreAudio returns to me when I query the HW for a new sample block). Instead of the CARingBuffer the book uses, I wrote my own memory pool and queue for two reasons:<br />
<br />
<ol>
<li>I need to send metadata (like the host timestamp for the input block) along with the raw samples to the algorithm.</li>
<li>I am only interested in 1 channel (the first one) of the potentially stereo mic, so copying both channels through the CARingBuffer would be a waste of memory.</li>
</ol>
<br />
The interface to the (aligned) memory pool was copied from <a href="https://docs.google.com/document/d/1lA5D7RHcxb8zmye7JauXWZ46jxm4T9UO2cUtXoxHX6g/edit#heading=h.64lcqrj3a09u">another blog I wrote</a> more than 5 years ago:<br />
<br />
<div style="color: #ba2da2; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">typedef</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">struct</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> llsMP {</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> _capacity, _available;</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">* _pool;</span><span style="font-variant-ligatures: no-common-ligatures;">/* the memory itself */</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">* *_book;</span><span style="font-variant-ligatures: no-common-ligatures;">/* Array for book keeping, but don't know the size till ctor */</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">} llsMP;</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"></span><br /></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> llsMP_alloc(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsMP</span><span style="font-variant-ligatures: no-common-ligatures;">* me,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"><span class="Apple-tab-span" style="white-space: pre;"> </span> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> capacity, </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> memsize, </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> alignment</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> zero);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> llsMP_free(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsMP</span><span style="font-variant-ligatures: no-common-ligatures;">* me);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsMP</span><span style="font-variant-ligatures: no-common-ligatures;">* llsMP_new(</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> capacity, </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> memsize, </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> alignment</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> zero);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> llsMP_delete(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsMP</span><span style="font-variant-ligatures: no-common-ligatures;">* me);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">* llsMP_get(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsMP</span><span style="font-variant-ligatures: no-common-ligatures;">* me);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> llsMP_return(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsMP</span><span style="font-variant-ligatures: no-common-ligatures;">* me, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">* node);</span></div>
<br />
In this primitive, cross-platform API, I don't use more convenient types like uint8_t or bool, because I did not want to require stdint.h or stdbool.h. Where possible, I prefer the variant that does not allocate the pool object itself (llsMP_alloc rather than llsMP_new), so that the pool can be embedded as a member of its containing object, like this:<br />
<br />
<div style="color: #ba2da2; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">typedef</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">struct</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> MyMic </span>{</div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> <span style="color: #703daa;">AudioStreamBasicDescription</span></span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> streamFormat;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">AudioUnit</span><span style="font-variant-ligatures: no-common-ligatures;"> inputUnit;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"><span class="Apple-tab-span" style="white-space: pre;"> </span></span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">AudioBufferList</span><span style="font-variant-ligatures: no-common-ligatures;"> *inputBuffer;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">FILE</span><span style="font-variant-ligatures: no-common-ligatures;">* t_csv, *x_csv, *f_csv, *c_csv;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsMP</span><span style="font-variant-ligatures: no-common-ligatures;"> mpool;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsQ</span><span style="font-variant-ligatures: no-common-ligatures;"> padded_xQ;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">int8_t</span><span style="font-variant-ligatures: no-common-ligatures;"> Eblock, iBlock;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">bool</span><span style="font-variant-ligatures: no-common-ligatures;"> firstBlock;</span></div>
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">} MyMic;</span></div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
The memory pool still needs to be initialized before the callback is called, which is why I offer the alloc() API.<br />
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">if</span><span style="font-variant-ligatures: no-common-ligatures;"> (!</span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">llsMP_alloc</span><span style="font-variant-ligatures: no-common-ligatures;">(&</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">mpool</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">3</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">// capacity</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">SampleBlock</span><span style="font-variant-ligatures: no-common-ligatures;">) </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">// memsize</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">16</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">// alignment: recommended by vDSP programming guide</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">)</span><span style="font-variant-ligatures: no-common-ligatures;">//memset zero, since this pool is used for zero padding</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> ) {</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">fprintf</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> (</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">stderr</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">, </span><span style="font-variant-ligatures: no-common-ligatures;">"Can't allocate zero padded memory pool"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">exit</span><span style="font-variant-ligatures: no-common-ligatures;"> (</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">errno</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
vDSP recommends 16-byte alignment, probably because it uses SSE instructions to speed up the math (and the aligned SSE loads and stores require 16-byte aligned data). I enable zeroing because this memory pool holds the zero-padded raw samples. Recall from my previous blog entry that computing a linear correlation through the FFT's circular convolution/correlation requires appending zeros of the same length as the input block. If I zero out the memory pool once at allocation and only ever write to the latter half of the sample array in each node I obtain, the zero padding is maintained forever, without ever having to explicitly zero out part of the buffer again. Copying the received sound samples from the HW to the <i>latter half</i> of this buffer (the first half always remains zero) is achieved through careful pointer arrangement right before handing the buffer to the AudioUnit to fill, in the receive callback:<br />
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> SampleBlock {</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">UInt64</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> mHostTime;</span><span style="font-variant-ligatures: no-common-ligatures;">//The timestamp for these frames</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//UInt64 mWordClockTime //don't bother; always zero</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"></span><br /></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//I can calculate the time of the 1st sample with this just as well</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">uint32_t</span><span style="font-variant-ligatures: no-common-ligatures;"> iBlock;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">UInt32</span><span style="font-variant-ligatures: no-common-ligatures;"> nFrames;</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">float</span><span style="font-variant-ligatures: no-common-ligatures;"> sample[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2_2M</span><span style="font-variant-ligatures: no-common-ligatures;">];</span>//2x sample block length, to zero pad</div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">};</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div style="color: #4f8187; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">SampleBlock</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">* block = (</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">SampleBlock</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">*)</span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">llsMP_get</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(&</span><span style="font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="font-variant-ligatures: no-common-ligatures;">mpool</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">if</span><span style="font-variant-ligatures: no-common-ligatures;"> (!block) {</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">fprintf</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">stderr</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">, </span><span style="font-variant-ligatures: no-common-ligatures;">"Memory pool exhausted\n"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">return</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">errno</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputBuffer</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">mBuffers</span><span style="font-variant-ligatures: no-common-ligatures;">[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="font-variant-ligatures: no-common-ligatures;">].</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">mData</span><span style="font-variant-ligatures: no-common-ligatures;"> = (</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">*)(&block-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">sample</span><span style="font-variant-ligatures: no-common-ligatures;">[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">]);</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"></span><br /></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">CheckError</span><span style="font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">AudioUnitRender</span><span style="font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">inputUnit</span><span style="font-variant-ligatures: no-common-ligatures;">, ioActionFlags,</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> inTimeStamp, inBusNumber, inNumberFrames,</span></div>
<div style="color: #4f8187; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"><span class="Apple-tab-span" style="white-space: pre;"> </span></span><span style="font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="font-variant-ligatures: no-common-ligatures;">inputBuffer</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">),</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">"AudioUnitRender failed"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"></span><br /></div>
<div style="color: #4f8187; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> block-></span><span style="font-variant-ligatures: no-common-ligatures;">iBlock</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">player</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">-></span><span style="font-variant-ligatures: no-common-ligatures;">iBlock</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> block-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">mHostTime</span><span style="font-variant-ligatures: no-common-ligatures;"> = inTimeStamp-></span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">mHostTime</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> block-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">nFrames</span><span style="font-variant-ligatures: no-common-ligatures;"> = inNumberFrames;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">if</span><span style="font-variant-ligatures: no-common-ligatures;"> (!</span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">llsQ_push</span><span style="font-variant-ligatures: no-common-ligatures;">(&</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">padded_xQ</span><span style="font-variant-ligatures: no-common-ligatures;">, block)) {</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">fprintf</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">stderr</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">, </span><span style="font-variant-ligatures: no-common-ligatures;">"llsQ_push failed\n"</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">return</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">errno</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div>
As you can see, the pointer to the memory pool node can then be pushed into a lockless queue, which works well in this case because there is only 1 writer thread and 1 reader thread. Note also that the reception time on the host (in nanoseconds) and the block number (starting at 0) are attached to the sample block. If the correlation processor decides there is a positive event, it can then place the event to within one audio sampling period (1/44100 s here, or &lt; 23 usec).</div>
<div>
<br /></div>
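<div>
To make the timing claim concrete, here is a minimal sketch of how an event could be placed in host time from the block timestamp and a sample index. It assumes, as stated above, that the host time is in nanoseconds and marks the first sample of the block; if the host clock counts in other units, it would have to be converted first. The names <i>event_host_time_ns</i> and <i>SAMPLE_RATE_HZ</i> are hypothetical and not part of the code in this post.</div>

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ 44100u  /* sampling rate used in this post */

/* Host time (ns) of the sample at index `frame` within a block whose first
 * sample was received at `block_host_time_ns`. Rounds to the nearest ns.
 * Assumption: host time is already expressed in nanoseconds. */
static uint64_t event_host_time_ns(uint64_t block_host_time_ns, uint32_t frame)
{
    const uint64_t ns_per_sec = 1000000000u;
    return block_host_time_ns
         + ((uint64_t)frame * ns_per_sec + SAMPLE_RATE_HZ / 2) / SAMPLE_RATE_HZ;
}
```

<div>
One sample step is 1e9/44100 ns, which rounds to 22676 ns, so an event found at frame k of a block lands about k x 22.7 usec after the block's host time.</div>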
<div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">/* SINGLE writer, SINGLE reader queue */</span></div>
<div style="color: #ba2da2; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">typedef</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">struct</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> llsQ {</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">size_t</span><span style="font-variant-ligatures: no-common-ligatures;"> _head, _tail, _mask;</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">* *_q;</span><span style="font-variant-ligatures: no-common-ligatures;">/* array of pointers */</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> } llsQ;</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"></span><br /></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> llsQ_alloc(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsQ</span><span style="font-variant-ligatures: no-common-ligatures;">* me, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> exponent);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> llsQ_free(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsQ</span><span style="font-variant-ligatures: no-common-ligatures;">* me);</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"></span><br /></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsQ</span><span style="font-variant-ligatures: no-common-ligatures;">* llsQ_new(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> exponent);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;"> llsQ_delete(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsQ</span><span style="font-variant-ligatures: no-common-ligatures;">* me);</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"></span><br /></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> llsQ_push(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsQ</span><span style="font-variant-ligatures: no-common-ligatures;">* me, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">* node);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">unsigned</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">char</span><span style="font-variant-ligatures: no-common-ligatures;"> llsQ_pop(</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">llsQ</span><span style="font-variant-ligatures: no-common-ligatures;">* me, </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">** node);</span></div>
</div>
<div>
<br /></div>
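<div>
The implementation behind these declarations is not shown in this post. As a rough illustration of why one writer and one reader can share a queue without a lock, here is a minimal sketch of a power-of-2 ring of pointers using C11 atomics. The <i>spscQ</i> names are hypothetical, and the real llsQ may well differ (for example in its memory barriers or in how it detects a full queue).</div>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical single-writer, single-reader ring of pointers. */
typedef struct spscQ {
    _Atomic size_t head;  /* next slot to pop; written only by the reader */
    _Atomic size_t tail;  /* next slot to push; written only by the writer */
    size_t mask;          /* capacity - 1, capacity being a power of 2 */
    void **slot;          /* array of pointers */
} spscQ;

/* Allocate 2^exponent slots; returns 1 on success, 0 on failure. */
static unsigned char spscQ_alloc(spscQ *q, unsigned char exponent)
{
    assert(exponent < 8 * sizeof(size_t));
    size_t capacity = (size_t)1 << exponent;
    q->slot = calloc(capacity, sizeof *q->slot);
    if (!q->slot) return 0;
    q->mask = capacity - 1;
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
    return 1;
}

static void spscQ_free(spscQ *q) { free(q->slot); q->slot = NULL; }

/* Writer side: returns 0 if the queue is full. */
static unsigned char spscQ_push(spscQ *q, void *node)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head > q->mask) return 0;  /* all slots in use */
    q->slot[tail & q->mask] = node;
    /* release: the slot write must become visible before the new tail */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return 1;
}

/* Reader side: returns 0 if the queue is empty. */
static unsigned char spscQ_pop(spscQ *q, void **node)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail) return 0;           /* nothing to read */
    *node = q->slot[head & q->mask];
    /* release: the slot read must finish before the writer may reuse it */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return 1;
}
```

<div>
Because each index has exactly one thread that writes it, no compare-and-swap is needed; the acquire/release pairs are only there to order the slot access against the index update. The sketch stops being safe as soon as a second writer or a second reader is added.</div>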
<div>
The main algorithm loop polls the lockless queue, sleeping for 1 ms whenever it finds the queue empty.</div>
<br />
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">for</span><span style="font-variant-ligatures: no-common-ligatures;"> (</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">int8_t</span><span style="font-variant-ligatures: no-common-ligatures;"> iBlock = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="font-variant-ligatures: no-common-ligatures;">; iBlock < </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">Eblock</span><span style="font-variant-ligatures: no-common-ligatures;">; ) {</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">struct</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">SampleBlock</span><span style="font-variant-ligatures: no-common-ligatures;">* block;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">if</span><span style="font-variant-ligatures: no-common-ligatures;"> (!</span><span style="color: #31595d; font-variant-ligatures: no-common-ligatures;">llsQ_pop</span><span style="font-variant-ligatures: no-common-ligatures;">(&</span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">player</span><span style="font-variant-ligatures: no-common-ligatures;">-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">padded_xQ</span><span style="font-variant-ligatures: no-common-ligatures;">, (</span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">void</span><span style="font-variant-ligatures: no-common-ligatures;">**)&block)) {</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">usleep</span><span style="font-variant-ligatures: no-common-ligatures;">(</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1000</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">continue</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> }</span></div>
<br />
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">// Have a new block to process</span></div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;">The correlation processor now has work to do.</span></div>
<h2>
Fast correlation processor</h2>
<div>
I discussed the central algorithm for the fast correlation using FFT in my previous blog entry. Repeating for convenience:</div>
<div>
<br /></div>
<div style="text-align: center;">
correlation(x, filter) = iFFT(FFT(x) • FFT*(filter))</div>
<div>
<br /></div>
<div>
where <span style="text-align: center;">•</span> is the element-wise product of 2 vectors of equal length, and * is the complex conjugate operator. Because the filter coefficients are constant, I can pre-calculate <span style="text-align: center;">FFT*(filter) and store it in the DATA segment of the program, as I've done here:</span></div>
<div>
<span style="text-align: center;"><br /></span></div>
<div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">#define LOG2M </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">10</span></div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">#define EXP_LOG2M (</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<LOG2M)</span></div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">#define LOG2_HALFM (LOG2M - </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">)</span></div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">#define LOG2_2M (LOG2M + </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">)</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">// The matched filter from Matlab</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">float</span><span style="font-variant-ligatures: no-common-ligatures;"> FFT_pulse_conj_vDSPz_flattened[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2_2M</span><span style="font-variant-ligatures: no-common-ligatures;">] </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">__attribute__</span><span style="font-variant-ligatures: no-common-ligatures;">((aligned(</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">16</span><span style="font-variant-ligatures: no-common-ligatures;">)))</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">= {</span></div>
<div style="color: #d12f1b; font-family: Menlo; line-height: normal;">
<span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">#include </span><span style="font-variant-ligatures: no-common-ligatures;">"FFT_pulse_conj_vDSPz_flattened.h"</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">};</span></div>
</div>
<div>
<br /></div>
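<div>
As a sanity check of the identity above, independent of vDSP, here is a minimal sketch that evaluates iFFT(FFT(x) • FFT*(filter)) with a naive 8-point DFT and compares it against correlation computed straight from the definition. Everything in it is illustrative: the toy size (a 4-sample block padded to 8), the function names, and the layout of the filter (pulse in the first half, zeros after) are assumptions rather than the actual program. The sine/cosine values are tabulated only to keep the example free of the math library.</div>

```c
#include <stddef.h>

#define N 8  /* padded length: 2x a 4-sample block */
#define HALF_SQRT2 0.70710678118654752

/* cos(2*pi*k/8) and sin(2*pi*k/8) for k = 0..7 */
static const double COS8[N] = { 1,  HALF_SQRT2, 0, -HALF_SQRT2,
                               -1, -HALF_SQRT2, 0,  HALF_SQRT2 };
static const double SIN8[N] = { 0,  HALF_SQRT2, 1,  HALF_SQRT2,
                                0, -HALF_SQRT2, -1, -HALF_SQRT2 };

/* Naive O(N^2) DFT on split real/imaginary arrays.
 * sign = -1 is the forward transform, +1 the unscaled inverse. */
static void dft8(const double *inr, const double *ini,
                 double *outr, double *outi, int sign)
{
    for (int f = 0; f < N; ++f) {
        double accr = 0, acci = 0;
        for (int n = 0; n < N; ++n) {
            double c = COS8[(f * n) % N], s = sign * SIN8[(f * n) % N];
            accr += inr[n] * c - ini[n] * s;
            acci += inr[n] * s + ini[n] * c;
        }
        outr[f] = accr;
        outi[f] = acci;
    }
}

/* corr = iDFT( DFT(x) . conj(DFT(h)) ) for real x and h of length N */
static void fast_correlate8(const double *x, const double *h, double *corr)
{
    static const double zero[N] = { 0 };
    double Xr[N], Xi[N], Hr[N], Hi[N], Pr[N], Pi[N], ci[N];
    dft8(x, zero, Xr, Xi, -1);
    dft8(h, zero, Hr, Hi, -1);
    for (int f = 0; f < N; ++f) {  /* element-wise X * conj(H) */
        Pr[f] = Xr[f] * Hr[f] + Xi[f] * Hi[f];
        Pi[f] = Xi[f] * Hr[f] - Xr[f] * Hi[f];
    }
    dft8(Pr, Pi, corr, ci, +1);
    for (int k = 0; k < N; ++k) corr[k] /= N;  /* inverse DFT scaling */
}

/* Reference: circular correlation straight from the definition. */
static void direct_correlate8(const double *x, const double *h, double *corr)
{
    for (int k = 0; k < N; ++k) {
        corr[k] = 0;
        for (int m = 0; m < N; ++m) corr[k] += x[(m + k) % N] * h[m];
    }
}
```

<div>
With x = {0,0,0,0, 1,-2,3,1} (zeros in the first half, samples in the latter half) and h = {1,-2,3,1, 0,0,0,0}, both functions agree to rounding error and peak at lag 4 with the value 15, which is the lag where the samples line up with the pulse. The zero halves are what keep the circular wrap-around from contaminating the result.</div>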
<div>
Given the wealth of DSP libraries and algorithms available for free these days, I expected to find a DSP library that is optimized for OS X/iOS, and I did: Apple's Accelerate framework, which seems to take advantage of Intel's SSE. But as is typical of Apple, the documentation is lacking (to put it mildly). One must begin with vDSP's packing of the FFT of a real sequence, described <a href="https://developer.apple.com/library/content/documentation/Performance/Conceptual/vDSP_Programming_Guide/UsingFourierTransforms/UsingFourierTransforms.html#//apple_ref/doc/uid/TP40005147-CH3-SW1">here</a>. A few crucial concepts are:</div>
<div>
<ol>
<li>The result of an FFT is complex in general, and vDSP keeps the real and imaginary parts in 2 DIFFERENT arrays (lots of other implementations, including the venerable <i>Numerical Recipes</i>, keep them interleaved in a single array).</li>
<li>The FFT of a real sequence x is conjugate-symmetric about frequency=0; i.e. X(f) = X*(-f), where X = FFT(x).</li>
<li>X(f=0) and X(f=Fs/2)--FFT at the DC and the Nyquist frequency--are pure real.</li>
</ol>
<div>
Using all of the above properties, vDSP uses N numbers to represent the FFT of a real sequence of length N (which would otherwise require 2N numbers). But because vDSP keeps the real and the imaginary parts separately, an in-place FFT requires a prior step of separating the real sequence into its even-indexed and odd-indexed samples, as I've done below with <span style="color: #3e1e81; font-family: Menlo;">vDSP_ctoz</span>.</div>
</div>
<div>
<br /></div>
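<div>
For readers without the Accelerate headers at hand, here is a minimal plain-C sketch of what that even/odd separation amounts to. The type and function names (<i>SplitComplex</i>, <i>split_even_odd</i>) are hypothetical stand-ins, not the vDSP API; the sketch only shows the data movement, with even-indexed samples going to the "real" array and odd-indexed samples to the "imaginary" array.</div>

```c
#include <stddef.h>

/* Hypothetical stand-in for a split-complex pair of arrays. */
typedef struct SplitComplex {
    float *realp;  /* receives even-indexed samples x[0], x[2], ... */
    float *imagp;  /* receives odd-indexed samples  x[1], x[3], ... */
} SplitComplex;

/* De-interleave n real samples (n even) into n/2 "complex" elements. */
static void split_even_odd(const float *x, size_t n, SplitComplex *z)
{
    for (size_t i = 0; i < n / 2; ++i) {
        z->realp[i] = x[2 * i];
        z->imagp[i] = x[2 * i + 1];
    }
}
```

<div>
For a padded block of 2M real samples this fills two arrays of M floats each, which matches the sizes of the creal and cimag buffers declared next.</div>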
<div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">float</span><span style="font-variant-ligatures: no-common-ligatures;"> creal[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">] </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">__attribute__</span><span style="font-variant-ligatures: no-common-ligatures;">((aligned(</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">16</span><span style="font-variant-ligatures: no-common-ligatures;">))) </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">//input/output buffer</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , cimag[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">] </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">__attribute__</span><span style="font-variant-ligatures: no-common-ligatures;">((aligned(</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">16</span><span style="font-variant-ligatures: no-common-ligatures;">)))</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , ftemp[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2_2M</span><span style="font-variant-ligatures: no-common-ligatures;">] </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">__attribute__</span><span style="font-variant-ligatures: no-common-ligatures;">((aligned(</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">16</span><span style="font-variant-ligatures: no-common-ligatures;">))) </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">//temporary scratch pad</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> ;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">COMPLEX_SPLIT</span><span style="font-variant-ligatures: no-common-ligatures;"> splitc = { .</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">realp</span><span style="font-variant-ligatures: no-common-ligatures;"> = creal, .</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="font-variant-ligatures: no-common-ligatures;"> = cimag }</span></div>
</div>
<div>
<span style="color: #008400; font-variant-ligatures: no-common-ligatures;">
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , split_temp = { .</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">realp</span><span style="font-variant-ligatures: no-common-ligatures;"> = ftemp, .</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="font-variant-ligatures: no-common-ligatures;"> = &ftemp[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">] }</span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> , split_filter = { .</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">realp</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">FFT_pulse_conj_vDSPz_flattened</span></div>
<div style="color: #4f8187; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , .</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> = </span><span style="font-variant-ligatures: no-common-ligatures;">FFT_pulse_conj_vDSPz_flattened</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> + (</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="color: black; font-variant-ligatures: no-common-ligatures;">) }</span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> ;</span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"></span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">float</span><span style="font-variant-ligatures: no-common-ligatures;"> FFT_filter_nyq = *split_filter.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> *split_filter.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="font-variant-ligatures: no-common-ligatures;">; </span><span style="color: #008400;">//see explanation below</span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">...</span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">// When I get a block to process, split out a time series into odd and even</span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"></span></div>
<div style="color: black; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">vDSP_ctoz</span><span style="font-variant-ligatures: no-common-ligatures;">((</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">COMPLEX</span><span style="font-variant-ligatures: no-common-ligatures;">*)block-></span><span style="color: #4f8187; font-variant-ligatures: no-common-ligatures;">sample</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">2</span><span style="font-variant-ligatures: no-common-ligatures;">, &splitc, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
<div>
<br /></div>
</span></div>
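<div>
To make the even/odd split concrete, here is a minimal portable sketch of what the stride-2 call above amounts to. It is not vDSP's implementation; the names split_complex_sketch and deinterleave_sketch are hypothetical and exist only for this illustration.</div>

```c
#include <stddef.h>

/* Hypothetical stand-in for a split-complex pair of arrays. */
typedef struct {
    float *realp; /* receives the even-indexed samples x[0], x[2], ... */
    float *imagp; /* receives the odd-indexed samples  x[1], x[3], ... */
} split_complex_sketch;

/* De-interleave 2*pairs real samples into even and odd halves. */
static void deinterleave_sketch(const float *x, split_complex_sketch *z,
                                size_t pairs)
{
    for (size_t i = 0; i < pairs; ++i) {
        z->realp[i] = x[2 * i];
        z->imagp[i] = x[2 * i + 1];
    }
}
```

<div>
With 6 input samples 0..5 and pairs = 3, the even half receives 0, 2, 4 and the odd half receives 1, 3, 5.</div>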
<div>
Note that the filter's FFT is arranged to be consistent with vDSP's representation of the FFT of a real sequence. My Matlab code (explained in my <a href="http://henryomd.blogspot.com/2017/03/signal-processing-audio-chirp-in-matlab.html">previous blog post</a>) anticipates that the C code will do this, and writes out the header file accordingly. I can then FFT the zero-padded sample block with the vDSP API:</div>
<div>
<br /></div>
<div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">vDSP_fft_zript</span><span style="font-variant-ligatures: no-common-ligatures;">(fftSetup, &splitc, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, &split_temp, </span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2_2M</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">FFT_FORWARD</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
</div>
<div>
<br /></div>
<div>
The result is still in "splitc", because this is an in-place FFT. The temporary buffer is supposed to speed up the FFT (I have not yet measured by how much). If I multiplied this FFT directly with the filter's FFT (actually the conjugate thereof), the result would be wrong, because the Nyquist portion masquerading as the imaginary value of FFT(x) @ DC distorts the result. Because vDSP does not take care of this, I have to jump through some hoops to save and then restore it. At least this is straightforward in the 1D case; the 2D case requires setting up a separate vector (a huge inconvenience). I think the vDSP folks dropped the ball on this.</div>
<div>
<br /></div>
<div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">float</span><span style="font-variant-ligatures: no-common-ligatures;"> FFT_c_nyq = *splitc.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="font-variant-ligatures: no-common-ligatures;">; *splitc.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="font-variant-ligatures: no-common-ligatures;"> = </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">0</span><span style="font-variant-ligatures: no-common-ligatures;">;</span></div>
<div style="color: #78492a; font-family: Menlo; line-height: normal;">
<span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">vDSP_zvmul</span><span style="font-variant-ligatures: no-common-ligatures;">(&splitc, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, &split_filter, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, &splitc, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">//The multiplication is over vector of length M</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> , </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">// don't conjugate (even though we are correlating);</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> ); </span><span style="font-variant-ligatures: no-common-ligatures;">// because the filter is already in conjugate form</span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">// Restore the Nyquist frequency portion, which is pure real</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">*splitc.</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">imagp</span><span style="font-variant-ligatures: no-common-ligatures;"> = FFT_c_nyq * FFT_filter_nyq;</span></div>
</div>
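<div>
The save/zero/restore dance above is equivalent to treating bin 0 as two independent real numbers. Below is a minimal sketch of that rule in plain C, assuming the packed layout described earlier (re[0] holds DC, im[0] holds Nyquist, bins 1 and up are ordinary complex values). The function name packed_multiply_sketch is hypothetical, and the loop stands in for the vectorized multiply.</div>

```c
#include <stddef.h>

/* Element-wise product of two spectra stored in the packed real-FFT layout:
 *   re[0] = DC term (purely real), im[0] = Nyquist term (purely real),
 *   re[k] + i*im[k] for k = 1 .. half-1 are ordinary complex bins.
 * The output arrays may alias the first input's arrays. */
static void packed_multiply_sketch(const float *are, const float *aim,
                                   const float *bre, const float *bim,
                                   float *ore, float *oim, size_t half)
{
    /* DC and Nyquist are real numbers that multiply independently. */
    ore[0] = are[0] * bre[0];
    oim[0] = aim[0] * bim[0];

    /* All other bins use the usual complex product. */
    for (size_t k = 1; k < half; ++k) {
        float re = are[k] * bre[k] - aim[k] * bim[k];
        float im = are[k] * bim[k] + aim[k] * bre[k];
        ore[k] = re;
        oim[k] = im;
    }
}
```

<div>
For example, with DC = 2 and Nyquist = 3 in one spectrum and DC = 4 and Nyquist = 5 in the other, the packed product is DC = 8 and Nyquist = 15, whereas treating bin 0 as the complex number 2+3i times 4+5i would give a real part of 2*4 - 3*5 = -7.</div>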
<div>
<span style="color: #008400; font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div>
I can then inverse FFT to obtain the fast correlation:</div>
<div>
<br /></div>
<div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">vDSP_fft_zript</span><span style="font-variant-ligatures: no-common-ligatures;">(fftSetup, &splitc, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, &split_temp, </span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2_2M</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">FFT_INVERSE</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">vDSP_ztoc</span><span style="font-variant-ligatures: no-common-ligatures;">(&splitc, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, (</span><span style="color: #703daa; font-variant-ligatures: no-common-ligatures;">COMPLEX</span><span style="font-variant-ligatures: no-common-ligatures;">*)ftemp, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">2</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
</div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
<div>
But as I explained in my <a href="http://henryomd.blogspot.com/2017/03/signal-processing-audio-chirp-in-matlab.html">previous blog post</a>, the correlation of the zero-padded block is not the final running correlation until the tail of the previous block's result is added to the overlapping region (the overlap-add method). That is why overlap_save (initially zero) is maintained from one block to the next:</div>
<div>
<br /></div>
<div>
<div style="font-family: Menlo; line-height: normal;">
<span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">float</span><span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">overlap_save[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">] </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">__attribute__</span><span style="font-variant-ligatures: no-common-ligatures;">((aligned(</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">16</span><span style="font-variant-ligatures: no-common-ligatures;">))) </span><span style="color: #008400; font-variant-ligatures: no-common-ligatures;">//final correlation</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> ;</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;">...</span></div>
</div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//Correlation to output: overlap_save + ftemp[0:M-1];</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">vDSP_vadd</span><span style="font-variant-ligatures: no-common-ligatures;">(ftemp, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, overlap_save, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, ftemp, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;">, </span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">);</span></div>
<div style="font-family: Menlo; line-height: normal; min-height: 13px;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span></div>
<div style="color: #008400; font-family: Menlo; line-height: normal;">
<span style="color: black; font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">//Correlation to save for next round: overlap_save = ftemp[M:2M-1]</span></div>
<div style="font-family: Menlo; line-height: normal;">
<span style="font-variant-ligatures: no-common-ligatures;"> </span><span style="color: #3e1e81; font-variant-ligatures: no-common-ligatures;">memcpy</span><span style="font-variant-ligatures: no-common-ligatures;">(overlap_save, &ftemp[</span><span style="color: #272ad8; font-variant-ligatures: no-common-ligatures;">1</span><span style="font-variant-ligatures: no-common-ligatures;"><<</span><span style="color: #78492a; font-variant-ligatures: no-common-ligatures;">LOG2M</span><span style="font-variant-ligatures: no-common-ligatures;">], </span><span style="color: #ba2da2; font-variant-ligatures: no-common-ligatures;">sizeof</span><span style="font-variant-ligatures: no-common-ligatures;">(overlap_save));</span></div>
<div>
<span style="font-variant-ligatures: no-common-ligatures;"><br /></span></div>
</span></div>
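<div>
The block-to-block bookkeeping can be checked without any FFT. The sketch below replaces the FFT, multiply, inverse FFT chain with a direct per-block convolution, purely so that it runs anywhere; BLOCK_SKETCH and both function names are hypothetical, and the tail array plays the role of overlap_save.</div>

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SKETCH 4 /* hypothetical block length M (also the kernel length) */

/* Linear convolution of one M-sample block with an M-tap kernel.
 * y holds 2M floats: 2M-1 results plus one trailing zero. */
static void convolve_block_sketch(const float *x, const float *h, float *y)
{
    memset(y, 0, 2 * BLOCK_SKETCH * sizeof(float));
    for (size_t n = 0; n < BLOCK_SKETCH; ++n)
        for (size_t k = 0; k < BLOCK_SKETCH; ++k)
            y[n + k] += x[n] * h[k];
}

/* Process one block: out receives M finished samples, and tail carries
 * the second half of this block's result into the next call. */
static void overlap_add_block_sketch(const float *x, const float *h,
                                     float *tail, float *out)
{
    float y[2 * BLOCK_SKETCH];
    convolve_block_sketch(x, h, y);
    for (size_t i = 0; i < BLOCK_SKETCH; ++i)
        out[i] = y[i] + tail[i];             /* first half + previous tail */
    memcpy(tail, &y[BLOCK_SKETCH], BLOCK_SKETCH * sizeof(float));
}
```

<div>
Feeding the samples 1..8 through in two blocks with an all-ones 4-tap kernel and a zero-initialized tail yields 1, 3, 6, 10, 14, 18, 22, 26, the same as convolving the whole 8-sample sequence in one go.</div>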
<h2>
Result: clear difference between silence and chirp</h2>
<div>
To validate the output of the fast correlator, I ran the program twice: first without any signal (just room noise), and then while playing the chirp I designed in Matlab on repeat. The Matlab validation script's orange line shows the difference between Matlab's calculation and the output from my CoreAudio program. The top panel is the raw sample value, and the middle panel is the magnitude of the element-wise multiplication of the 2 FFTs. The bottom panel is the correlator output.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-q2AiAoT-321qDT6FtX625UPxJIY9Xavz2_gpDmr2G3-J8TT0UtaJCmSVMY3diEEVizC7e1HIQeLIgRn_6tA_vbEo222efFkmXi7SqVcqs7E2rrdGNwLw4PZofchy_GpyquENHwsvvg2U/s1600/corr+silence.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="565" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-q2AiAoT-321qDT6FtX625UPxJIY9Xavz2_gpDmr2G3-J8TT0UtaJCmSVMY3diEEVizC7e1HIQeLIgRn_6tA_vbEo222efFkmXi7SqVcqs7E2rrdGNwLw4PZofchy_GpyquENHwsvvg2U/s640/corr+silence.png" width="640" /></a></div>
<div>
I am not sure what the mic is picking up during the first few milliseconds of the recording, but there is no obvious hit in the correlator output, which is clearly visible when I play the chirp, as you can see below.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirLcj4VFjB4C22kCPAT3o6wn0zMBXHHCp04ukXpRa2tFbqOw1-MAHq4zAlhBkba108l2fEM3WHivxxgkX0Ia5ATc_5FaNY54asaIyOi4Rca84sQ5Q2Lbeb0mYy-SlSya3rZcycRCPSkI7d/s1600/corr+chirps.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="562" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirLcj4VFjB4C22kCPAT3o6wn0zMBXHHCp04ukXpRa2tFbqOw1-MAHq4zAlhBkba108l2fEM3WHivxxgkX0Ia5ATc_5FaNY54asaIyOi4Rca84sQ5Q2Lbeb0mYy-SlSya3rZcycRCPSkI7d/s640/corr+chirps.png" width="640" /></a></div>
<h2>
Next steps</h2>
<div>
Once I can further process the correlator output into a yes/no decision on whether a chirp is present, picking out the sample at which the chirp was received and mapping that back to a monotonically increasing timestamp on the receiver side should be straightforward. And thanks to the use of the memory pool and circular queue, this processing pipeline can run ad infinitum.</div>
<div>
<br /></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-74239634858736972882017-03-18T11:02:00.002-07:002017-03-18T11:02:26.760-07:00Why I never go back--if at all possibleYesterday, I drank a lot (more than a glass) for the first time in a long while. I had to numb myself against the negative feelings about myself and my past. My boss had scheduled a "get to know your boss's boss" mixer event that I had been dreading all week long. If it was just an informal mixer, I would have just fake-smiled through the event. But my boss had already told all the new hires to talk about oneself for 5 minutes each, and I knew he wanted us to talk about our personal aspects; and that was a problem because I know what my boss thinks about my past self: it's <a href="https://www.linkedin.com/pulse/joy-re-learning-again-henry-choi">the shameful version of me</a> that I am painfully aware of. What made it worse is that both my boss and his boss are the "big men" that I had dreamed of becoming in my younger days. Seeing (let alone talking to) my boss's boss always reopens my scar of regret. During my turn at the 5 minute introduction, I tried to just talk about my current strengths (low level programming and system engineering), but as I feared, my boss jumped in and told other people about the distractions I dabbled in. I know he doesn't do it out of malice; he thinks my past self is whimsically endearing--a positive trait, if you will. So if I were the disgustingly positive person (you know, the people who are liberal with the exclamation mark, can communicate purely in emojis, and are the center of conversation and laughter around the water cooler), I could just suck it up, and play the part that is consistent with my boss's image of myself.<br />
<br />
But I can't--at least not now--completely shake the regret about not having had the combination of personality and skills to climb the career ladder, and bitterness against the people who gave me hell about my shortcomings without recognizing my strengths. I find that the bitterness against the malice of other people is easier to deal with if I justify it as an inherent human nature: it is what it is.<br />
<blockquote class="tr_bq">
And why beholdest thou the mote that is in thy brother's eye, but considerest not the beam that is in thine own eye? </blockquote>
<blockquote class="tr_bq" style="text-align: right;">
Matthew 7:3-5</blockquote>
So for the most part, it is my own self that I cannot forgive. Most people frame the wisdom of <i>never go back</i> in terms of the unchanging nature of the old problems in <i>the other</i> {place | person}, and I agree that people rarely change. But at least in my case, going back to the old {place | relationship} where I experienced a failure just reopens my negative feelings about myself--which is not productive at all. This is why I had turned down my boss's previous job offers several times, until he offered me a deeply embedded programming role, and I really wanted secure employment for the next few years.<br />
<br />
Being an introvert, I try to ride through the negative feelings by acknowledging my shortcomings and re-dedicating myself to <a href="https://www.linkedin.com/pulse/my-second-childhood-henry-choi">my personal mission</a>, and laughing through clenched teeth. This coping mechanism has worked well for me in the past 6 years: I have never been more productive or happier in my life. But I also understand why Google tries to hire fresh college graduates who don't have this kind of emotional baggage; the perfect employee if you will. But when faced with a choice between two novels, one with a linear path to success (<i>veni, vidi, vici</i>), and the other with a downfall and an ultimate redemption, I know which one I would reach for.Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com2tag:blogger.com,1999:blog-4032020337247582619.post-46171736013198275582017-03-11T13:08:00.001-08:002017-03-11T15:37:00.938-08:00Signal processing audio chirp in MatlabIn my <a href="http://henryomd.blogspot.com/2017/02/understanding-pulse-compression.html">previous blog post</a>, I looked into the basic concept of pulse compression for ranging applications. In it, I just looked at a short timespan signal, just to understand the concept. But in a non-time-synchronized ranging application, the receiver will not know exactly when the sender will transmit the pulse. At best, it can be alerted approximately when the sender will transmit the pulse using external information. So the receiver has to be conservative and start the recording earlier than the expected pulse arrival, and only end it after either it has received the pulse, or a reasonable timeout has expired. If that time window is, say, 1 second, the sampling rate is 44100 Hz, and the sample size is 16 bits, almost 176 KB of memory is required just to hold 1 second of stereo sound: far more than Cortex M0 MCUs pack these days. 
For a cost-sensitive application of pulse compression ranging, then, a less memory (and compute) hungry signal processing algorithm is required. So in this blog post, I review the basic DSP concepts I can readily borrow. If you are already a seasoned DSP hand, none of this will be new to you, so skip ahead to the simulation results section after the next section.<br />
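The 176 KB figure is simple arithmetic, sketched here as a tiny helper (the function name recording_bytes_sketch is hypothetical; the numbers are the ones quoted above):<br />

```c
#include <stddef.h>

/* Bytes needed to buffer raw PCM audio for a given duration. */
static size_t recording_bytes_sketch(size_t sample_rate_hz,
                                     size_t bytes_per_sample,
                                     size_t channels, size_t seconds)
{
    return sample_rate_hz * bytes_per_sample * channels * seconds;
}
```

For 44100 Hz, 2 bytes per sample, 2 channels, and 1 second, this gives 176400 bytes.<br />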
<h2>
FIR filtering in pictures</h2>
<div>
For audio applications, the constant group delay property of the FIR (finite impulse response) filter is an absolute must to avoid the phase distortion that is very difficult to design around in an IIR (infinite impulse response) filter, but for some reason, FIR filters are used far more than IIR filters even for non-audio applications. FIR filter kernels are far longer than IIR ones, so the multiply-and-add speed of the MCU, or in some cases a dedicated DSP, makes a big difference. Suppose we have to filter a relatively long time series <b>x[k]</b> of length <b>N</b> with an FIR filter <b>h[k] </b>of order <b>M</b>-1 (the FIR filter coefficient length is 1 greater than the filter order), as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9B6TXtJB6S8MInDWQQVVWZN_1wV0Q0yp1smtBotpSlozQQml1z6pc9gcq_39pEKVItIF6xsUKvSiJ_WeyKqYu7RlOKWlJ1CRkPPUV9we1mgzCIb77P7YkmUgKPZ9-gyyQYsq1aHeJ7-nd/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9B6TXtJB6S8MInDWQQVVWZN_1wV0Q0yp1smtBotpSlozQQml1z6pc9gcq_39pEKVItIF6xsUKvSiJ_WeyKqYu7RlOKWlJ1CRkPPUV9we1mgzCIb77P7YkmUgKPZ9-gyyQYsq1aHeJ7-nd/s1600/DSP.png" /></a></div>
In the time domain, filtering (<i>convolving</i>) x[k] by h[k] is done by calculating the sum (the discrete counterpart of the integral) of the product of x[k] and the time-reversed version of h[k] at every sampling instant. Since I assumed at the outset that x[k] is a much longer (N) time series signal than the filter kernel size (M), it is convenient to visualize the mirror image of h[k] sliding from left to right for each time series sample of the filter output, as you can see below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFiayXQf1KH_vgw5zglbRlG8RfAbimyNAC33tNjz6czUn3oXEcZ4aA6xmdpTeSEaXZQDWTdtaNFL1bRmbFD2XaqCMfIPOF6yF7Inx1hL3_odP_CAotbgMkabNuj9eBW4YDi6SPgQr9nWDU/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFiayXQf1KH_vgw5zglbRlG8RfAbimyNAC33tNjz6czUn3oXEcZ4aA6xmdpTeSEaXZQDWTdtaNFL1bRmbFD2XaqCMfIPOF6yF7Inx1hL3_odP_CAotbgMkabNuj9eBW4YDi6SPgQr9nWDU/s1600/DSP.png" /></a></div>
Since we are dealing with a discrete sampled system, it's possible to draw the FIR kernel at discrete points and visually integrate the overlap to obtain the filtered time series (green) above. It is also possible to see that at k = -M, there is no overlap and therefore the filtered output is 0. But at k = N-1, there is an overlap of 1 sample, so the response is non-zero in general. Therefore, the response is non-zero for a total of N + (M-1) samples, BUT there are 3 separate regions of filter output:<br />
<ol>
<li>Pre: from k=-M+1 to k=-1, the filter is not completely overlapped with the signal, so the output is artificially small.</li>
<li>Completely overlapped: from k=0 to k=N-M, the filter is completely overlapped with the signal, so every filter kernel coefficient contributes to the output. Note there are only N-M+1 such sample instants in the finite length time series considered here.</li>
<li>Post: from k=N-M+1 to k=N-1, the filter kernel is again not completely overlapped with the signal.</li>
</ol>
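The three regions are easy to see in a direct (time domain) implementation. The following is a minimal sketch; the function name fir_direct_sketch is hypothetical. Its output index i starts at 0, so it corresponds to k = i - (M-1) in the numbering above: the first M-1 outputs are the "pre" region, outputs M-1 through N-1 are completely overlapped, and the last M-1 outputs are the "post" region.<br />

```c
#include <stddef.h>

/* Direct-form linear convolution of x (n samples) with h (m taps).
 * y must have room for n + m - 1 samples. */
static void fir_direct_sketch(const float *x, size_t n,
                              const float *h, size_t m, float *y)
{
    for (size_t i = 0; i < n + m - 1; ++i) {
        float acc = 0.0f;
        for (size_t k = 0; k < m; ++k) {
            /* Only accumulate where the kernel tap lands on a real sample. */
            if (i >= k && i - k < n)
                acc += h[k] * x[i - k];
        }
        y[i] = acc;
    }
}
```

With x = 1, 2, 3, 4, 5 (N = 5) and an all-ones 3-tap kernel (M = 3), the 7 outputs are 1, 3, 6, 9, 12, 9, 5: two partial sums on each end and N-M+1 = 3 completely overlapped sums in the middle.<br />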
This consideration of overlap will be important before long, for the so-called overlap-add block-by-block filtering method. For now, let's take a step back and consider the algorithmic complexity: for each sample instant of the filter response, M multiply results have to be accumulated with (M-1) additions (hence the MAC--multiply and accumulate--HW block of the DSP). So for a length N time series, there are roughly N*M multiplies and N*M additions--or O(N*M), which is quadratic (O(M^2)) when N is comparable to M, for a naive implementation. SW engineers freak out when they see quadratic or greater big O notation, and the O(M log2 (M)) of the FFT (fast Fourier transform) puts them back at ease. But because the FFT is based on the periodic signal assumption, we have to zero pad the end of the time series signals and then throw away the ends of the FFT output. We can visualize the circular convolution by laying out the time series (both x[k] and h[k]) on circles, as you can see below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeyU-sN0YbphyphenhyphenMX73MwD6puH02A35tjpdX6Yau0HyhBWKocPpVir9D7ZZsoTtkM4APS1TkdNPgcbUm3_Lh09HAFe0snarITverzZYQvz16pUD1D1ysVkWC4nmNlvvF4dljqCl_tts_LRnx/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeyU-sN0YbphyphenhyphenMX73MwD6puH02A35tjpdX6Yau0HyhBWKocPpVir9D7ZZsoTtkM4APS1TkdNPgcbUm3_Lh09HAFe0snarITverzZYQvz16pUD1D1ysVkWC4nmNlvvF4dljqCl_tts_LRnx/s1600/DSP.png" /></a></div>
Note how I emphasized the beginning and end of the dataset with the dot. In the circular case, the "sliding" of the filter kernel is achieved by rotating the kernel, as you can see below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMD3tUVspAFOM7lVcLGP86dC5r6UUiI36aN90jmqCaXzx3zvq8dSCeHEC8o_vwJIp-GNo2ydeWYSYR9DaLFixCHAl9QQytuTFOdx7oMQX5GfVvcD7XKIJ1ILat5eEaySAmqlmpmzYo20vA/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMD3tUVspAFOM7lVcLGP86dC5r6UUiI36aN90jmqCaXzx3zvq8dSCeHEC8o_vwJIp-GNo2ydeWYSYR9DaLFixCHAl9QQytuTFOdx7oMQX5GfVvcD7XKIJ1ILat5eEaySAmqlmpmzYo20vA/s1600/DSP.png" /></a></div>
Can you see where the end of h[k] now overlaps the beginning of x[k]? The multiply-accumulate contribution from this overlap is correct for periodic signals, but otherwise incorrect. It is as if you change your signal as shown below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAwWc7wWKbPqyS072zKBLsUbDOEmVgK1Y_wTOiJlUmhY_5lMye3L7bpDd-S75ukBarWEn8fuUjXzzQRqNogCFYE9TcXhnZJH0803eLiiiyqKTF9vjlXDC0lPamQnLadYYuxtJWmHAQaY3G/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAwWc7wWKbPqyS072zKBLsUbDOEmVgK1Y_wTOiJlUmhY_5lMye3L7bpDd-S75ukBarWEn8fuUjXzzQRqNogCFYE9TcXhnZJH0803eLiiiyqKTF9vjlXDC0lPamQnLadYYuxtJWmHAQaY3G/s1600/DSP.png" /></a></div>
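<div>
The wrap-around described above can be reproduced with a few lines of code. This is only a minimal sketch in Python (not the Matlab used in the rest of this post), with made-up 4-sample and 3-tap sequences and hypothetical function names. It computes the convolution on a circle directly in the time domain, which gives the same result that multiplying two DFTs produces, and shows that zero padding by (kernel length - 1) samples removes the wrap-around.</div>

```python
def linear_convolve(x, h):
    """Direct (sliding MAC) convolution; output has len(x) + len(h) - 1 samples."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def circular_convolve(x, h):
    """Convolution on a circle of len(x) samples: output indices wrap around."""
    size = len(x)
    y = [0.0] * size
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[(n + k) % size] += xn * hk
    return y


x = [1.0, 2.0, 3.0, 4.0]   # illustrative data, not from the post
h = [1.0, 1.0, 1.0]        # illustrative kernel

# Without padding, the tail of the result wraps onto the first samples.
wrapped = circular_convolve(x, h)

# Padding x with len(h) - 1 zeros leaves room for the tail, so the
# circular result now matches the ordinary (linear) convolution.
padded = circular_convolve(x + [0.0] * (len(h) - 1), h)
linear = linear_convolve(x, h)
```

<div>
Here the first two samples of <span style="font-family: courier;">wrapped</span> differ from the linear result because the last two output samples were folded back onto them, while <span style="font-family: courier;">padded</span> agrees with <span style="font-family: courier;">linear</span> sample for sample.</div>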
To understand the zero padding concept, segment the x[k] sequence into blocks of M samples. At delay = -M+1, only the end of h[k] overlaps x--specifically x[0], as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrBNTsTUZZrB2nJgb72eQgeNSNdyBvJittyQkUQSC1-yOF1fAqb8klwsztnn_QMa-wcEKlKx5NIgZXoEYP2ugDayEAQRxhFYu0wqWNPl2hefDHwWG-ats2h3oG0Qpu8jAxXJktHS9nmHoQ/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrBNTsTUZZrB2nJgb72eQgeNSNdyBvJittyQkUQSC1-yOF1fAqb8klwsztnn_QMa-wcEKlKx5NIgZXoEYP2ugDayEAQRxhFYu0wqWNPl2hefDHwWG-ats2h3oG0Qpu8jAxXJktHS9nmHoQ/s1600/DSP.png" /></a></div>
As we slide the kernel to the right, the MAC grows because of more overlap, until at zero delay, h[k] and x[k] completely overlap. But subsequently, the overlap goes down again and the MAC decreases. In overlap-add, you ADD the R contribution of the upper convolution to the L contribution of the lower convolution. The MAC values at delay = 0, M, 2M, ... do not need addition because they are completely overlapped. So the block-wise convolution can be done with the padding either at the beginning or the end. I found it easier to visualize the scheme by padding on the left, as you can see below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqz-iSY5ApzfQNItC9A5gLW7dhr7PYCTY5LU8xqnb0l8fhIZe5bOPAZVfc0RLPeL6oGn5rj4qm7qfW9tbd3ZIMbZxrqxrEGBznLLwcf5abMHB1VU2Hov1xq1UhOLnpurtJFmu0aWU4WF71/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqz-iSY5ApzfQNItC9A5gLW7dhr7PYCTY5LU8xqnb0l8fhIZe5bOPAZVfc0RLPeL6oGn5rj4qm7qfW9tbd3ZIMbZxrqxrEGBznLLwcf5abMHB1VU2Hov1xq1UhOLnpurtJFmu0aWU4WF71/s1600/DSP.png" /></a></div>
If the array to receive the results is initialized to zero, the stitching can be done with a simple addition of the new MAC result to the final result. At each block, 3 kinds of MAC results are generated:<br />
<br />
<ul>
<li>Those on the L region are added to the R region of the previous block's result.</li>
<li>Values at multiples of M are already complete.</li>
<li>Those of the R region will be complete in the next round.</li>
</ul>
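<div>
The stitching rule in the list above can be sketched as follows. This is a hedged illustration in Python rather than Matlab, and it uses the conventional overlap-add layout in which each block's tail spills to the right, which is not the left-padded layout I draw in this post. Each block is convolved directly instead of with the FFT to keep the sketch short; the function names and test data are made up.</div>

```python
def convolve(x, h):
    """Direct convolution of one block; output has len(x) + len(h) - 1 samples."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y


def overlap_add(x, h, block):
    """Convolve x with h one block at a time.

    Each block's partial result is longer than the block itself; the extra
    samples land on positions that the next block also contributes to, so
    they are simply added into the zero-initialized result array.
    """
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        partial = convolve(x[start:start + block], h)
        for offset, value in enumerate(partial):
            y[start + offset] += value   # stitching is a plain addition
    return y


x = [float(v) for v in range(1, 11)]   # illustrative 10-sample signal
h = [1.0, -2.0, 3.0, 0.5]              # illustrative 4-tap kernel

blockwise = overlap_add(x, h, block=4)
direct = convolve(x, h)
```

<div>
The block-by-block result matches the one-shot convolution, including the last, shorter block.</div>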
<br />
My "padding on the left" can be drawn on a circle when doing the convolution with FFT (fast convolution), as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhvffvSzmq784LBcJCeXMxR9TYQTa9JFW2qw0BRWDCt6lGSnpq4b_QPXIJj1eLkhBbEspnf65Vv1Weo1GC8dwVvMCCrt5hJCZgC9pj564fmbZW6Z2eHygXv9ypSJs0UbCDV48etivP6bVB/s1600/DSP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhvffvSzmq784LBcJCeXMxR9TYQTa9JFW2qw0BRWDCt6lGSnpq4b_QPXIJj1eLkhBbEspnf65Vv1Weo1GC8dwVvMCCrt5hJCZgC9pj564fmbZW6Z2eHygXv9ypSJs0UbCDV48etivP6bVB/s1600/DSP.png" /></a></div>
Note where the L and the R portions of the resulting MAC are; this is critical for assembling the results between blocks correctly. Note also that when using a radix-2 FFT, the block size is constrained to a power of 2.<br />
<h2>
Simulating an audio chirp in noisy environment in Matlab</h2>
</div>
<div>
LFM (linear frequency modulation) pulse is the simplest waveform used in pulse compression (it's described in radar textbooks!). The frequency starts at <span style="color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">f</span><sub style="color: #333333; font-family: HelveticaNeue; text-align: center;">0</sub>, and grows at <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">𝞵</span> = B/<span style="color: #333333; font-family: "helveticaneue"; font-size: 14px;">𝞽</span><sub style="color: #333333; font-family: HelveticaNeue;">p</sub>, where B is the frequency bandwidth (maximum frequency - <span style="color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">f</span><sub style="color: #333333; font-family: HelveticaNeue; text-align: center;">0</sub>) and <span style="color: #333333; font-family: "helveticaneue"; font-size: 14px;">𝞽</span><sub style="color: #333333; font-family: HelveticaNeue;">p</sub> is the pulse duration. As shown in my <a href="http://henryomd.blogspot.com/2017/02/understanding-pulse-compression.html">previous blog entry</a>, the greater the B and <span style="color: #333333; font-family: "helveticaneue"; font-size: 14px;">𝞽</span><sub style="color: #333333; font-family: HelveticaNeue;">p</sub>, the more energy you can pack into the pulse, increasing the probability of true positive detection. In ranging, B is inversely proportional to the ranging resolution. B and <span style="color: #333333; font-family: "helveticaneue"; font-size: 14px;">𝞽</span><sub style="color: #333333; font-family: helveticaneue;">p</sub> are somewhat "baked" into the algorithm (in the form of the precomputed FFT values, as we will see shortly), but the amplitude can be dynamically changed for every pulse (but not DURING the pulse). 
The signal strength is directly proportional to the sound amplitude, but I deliberately wanted to keep the amplitude low, for a better user experience. So I found an anechoic singing recording, and set my amplitude so that the chirp has the same RMS amplitude as the recording:<br />
<br />
<div style="font-family: courier; line-height: normal;">
CHIRP_STRENGTH = 1</div>
<div style="font-family: courier; line-height: normal;">
[s_ambient, Fs_a] = audioread(<span style="color: #b245f3;">'singing.wav'</span>, [1 2*N] + 8E4);</div>
<div style="font-family: courier; line-height: normal;">
<span style="color: #0433ff;">if</span> Fs_a ~= Fs <span style="color: #25992d;">% resample</span></div>
<div style="font-family: courier; line-height: normal;">
s_ambient = interp1(((1:length(s_ambient)) - 1) / Fs_a, s_ambient, t, <span style="color: #b245f3;">'linear'</span>);</div>
<div style="font-family: courier; line-height: normal;">
<span style="color: #0433ff;">else</span></div>
<div style="font-family: courier; line-height: normal;">
s_ambient = s_ambient';</div>
<span style="color: #0433ff; font-family: courier;">end</span><br />
<div>
<span style="color: #0433ff;"><br /></span></div>
<div style="font-family: courier; line-height: normal;">
RMS_ambient = sqrt(mean(s_ambient .^ 2));</div>
<div style="font-family: courier; line-height: normal;">
A_u = CHIRP_STRENGTH * RMS_ambient * sqrt(2);</div>
</div>
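<div>
The sqrt(2) factor in A_u comes from the fact that a sinusoid of peak amplitude A has an RMS of A/sqrt(2). The sketch below (Python, not the Matlab of this post) generates an LFM pulse with the phase 2*pi*(f0*t + mu*t^2/2), where mu = B/tau_p as defined earlier, and checks that its RMS lands close to the ambient RMS. The values of f0, B, and the ambient RMS are placeholders chosen for illustration, and the function names are hypothetical.</div>

```python
import math


def lfm_chirp(amplitude, f0, bandwidth, n_samples, fs):
    """Samples of A*sin(2*pi*(f0*t + mu*t^2/2)) with sweep rate mu = B / tau_p."""
    tau_p = n_samples / fs      # pulse duration in seconds
    mu = bandwidth / tau_p      # sweep rate in Hz per second
    return [
        amplitude * math.sin(2.0 * math.pi * (f0 * t + 0.5 * mu * t * t))
        for t in (n / fs for n in range(n_samples))
    ]


def rms(samples):
    """Root mean square of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rms_ambient = 0.1                          # placeholder, not measured from singing.wav
amplitude = rms_ambient * math.sqrt(2.0)   # same scaling as A_u with CHIRP_STRENGTH = 1
chirp = lfm_chirp(amplitude, f0=1000.0, bandwidth=8000.0,
                  n_samples=1024, fs=44100.0)
chirp_rms = rms(chirp)
```

<div>
For a swept sinusoid the match is approximate rather than exact, since the pulse does not contain a whole number of cycles of a single frequency.</div>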
<div>
<br /></div>
<div>
In addition to the anechoic ambient sound, I also added some white noise (with amplitude = 10% of the ambient sound RMS) to the received signal:</div>
<div>
<br /></div>
<div>
<div style="font-family: courier; line-height: normal;">
sigma_r = 0.1 * RMS_ambient;</div>
<br />
<div style="font-family: courier; line-height: normal;">
rcv_noise = sigma_r * randn(1,N);</div>
<div style="font-family: courier; line-height: normal;">
s_r = rcv_noise;</div>
<div style="font-family: courier; line-height: normal;">
s_r = s_r + s_ambient((1:N));</div>
</div>
<div>
<br /></div>
<div>
It is exceptionally common in filter design to modify an ideal (the desired) waveform with a windowing function; in fact, the very first frequency domain filter design techniques taught in textbooks are exactly this. The windowing functions reduce the strength of the side lobes (what we don't want) at the expense of spreading out the bandwidth of the main lobe (what we do want, but ideally sharply focused in the frequency domain). Even leaving aside the filter design considerations, the physical bandwidth capabilities of the speaker / mic on smartphones mean the low and high frequencies will be attenuated. I found that the windows widely used in filter design (Kaiser, Hamming, Chebyshev, etc) attenuate too much of the midband frequencies, which the smartphone speakers / mics are optimized for. So I designed a "trapezoidal" window, like this:</div>
<div>
<br /></div>
<div>
<div style="font-family: courier; line-height: normal;">
win = ones(1,M);</div>
<div style="font-family: courier; line-height: normal;">
forte_d = round(200 * Fs / 48000);</div>
<div style="font-family: courier; line-height: normal;">
piano_d = round(50 * Fs / 48000);</div>
<div style="font-family: courier; line-height: normal;">
win(1:forte_d) = linspace(0,1, forte_d);</div>
<div style="font-family: courier; line-height: normal;">
win((M-piano_d+1):M) = linspace(1,0, piano_d);</div>
<div style="font-family: Courier; font-size: 12px; line-height: normal; min-height: 14px;">
<br /></div>
</div>
<div>
The windowed compressed pulse of B = 22 kHz, <span style="color: #333333; font-family: "helveticaneue"; font-size: 14px;">𝞽</span><sub style="color: #333333; font-family: helveticaneue;">p</sub> = 1024 * Ts, where Ts = 1/Fs is the sample period at Fs = 44100 Hz (CD quality), then looks like this in the time and frequency domains.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0JiGjC0Xs_r6s3j_AhJVo3gwjUCR5WPR8LLnE2gMEkSb6tkUG4tQqTAsfoSCpN9mVp3EpxX5Z3mJ2cl4iCZhDf79sD7lES56CItaox0GWBqSd9rqjuhcIui7ayHyya4uhj_obmoaMFXGv/s1600/sound+FT.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0JiGjC0Xs_r6s3j_AhJVo3gwjUCR5WPR8LLnE2gMEkSb6tkUG4tQqTAsfoSCpN9mVp3EpxX5Z3mJ2cl4iCZhDf79sD7lES56CItaox0GWBqSd9rqjuhcIui7ayHyya4uhj_obmoaMFXGv/s1600/sound+FT.png" /></a></div>
<div>
Given a true range, the simulated sound received at the mic is the sum of the delayed compressed pulse, the competing environmental sound, and the receiver noise:</div>
<div>
<br /></div>
<div>
<div style="color: #25992d; font-family: courier; line-height: normal;">
<span style="color: black;"> t_R = range/c; </span>% nominal wave travel time</div>
<div style="font-family: courier; line-height: normal;">
x(j, (1:M) + round(t_R * Fs)) = xmit;</div>
<div style="font-family: courier; line-height: normal;">
s_r = s_r + (scat_rcs(j) / (range)^attenuation) * x(j,:);</div>
<div style="font-family: Courier; font-size: 12px; line-height: normal; min-height: 14px;">
<br /></div>
</div>
<div>
Recall that s_r was initialized to the noise value a few paragraphs above.</div>
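<div>
The delay index used above, round(t_R * Fs), is just the one-way travel time expressed in samples. As a small worked example in Python (not the Matlab of this post), assuming a speed of sound of 343 m/s, which is my assumption and not a value taken from the simulation, and hypothetical helper names:</div>

```python
SPEED_OF_SOUND = 343.0   # m/s, assumed value for air at roughly room temperature


def range_to_delay_samples(range_m, fs, c=SPEED_OF_SOUND):
    """One-way travel time range/c, rounded to the nearest sample index."""
    return round(range_m / c * fs)


def delay_samples_to_range(delay, fs, c=SPEED_OF_SOUND):
    """Inverse mapping: a delay in samples back to a one-way range in meters."""
    return delay / fs * c


fs = 44100.0
delay = range_to_delay_samples(7.0, fs)        # samples of delay for a 7 m range
recovered = delay_samples_to_range(delay, fs)  # back to meters
range_per_sample = SPEED_OF_SOUND / fs         # meters of range per sample of delay
```

<div>
With these assumed numbers, 7 m maps to a delay of 900 samples, and one sample of delay corresponds to a little under 8 mm of range, which is the granularity of the rounding in the simulation.</div>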
<h2>
Correlation processor in Matlab</h2>
<div>
A matched filter has the highest possible SNR for a given signal, and is easy to design: it is just the time-reversed complex conjugate of the original signal. Now stare at the <i>convolution in pictures</i> section above: the kernel is flipped, and then the MAC operation is performed. So if we flip the original signal to obtain the filter kernel, and convolution then flips that kernel again to actually perform the MAC, then filtering the received signal with the matched filter is identical to <i>correlating </i>the received signal with the sent signal! So in my implementation of compressed pulse, I wound up using correlation--which is just the mirror of convolution--as can be seen in the convolution theorem and the correlation theorem, which differ by whether the kernel spectrum is complex conjugated:</div>
<div>
<br /></div>
<div style="text-align: center;">
Yconv(f) = X(f) .* H(f)</div>
<div style="text-align: center;">
Ycorr(f) = X(f) .* H*(f)</div>
<div>
<br /></div>
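<div>
The correlation theorem above can be checked numerically on a short sequence. The following is a minimal sketch in Python (not the Matlab of this post) that uses a naive O(N^2) DFT from the standard library instead of a real FFT, a made-up 4-sample pulse, and an arbitrary delay of 5 samples. All names are hypothetical. It compares the spectral product X(f) .* H*(f) against a direct time-domain circular correlation and locates the peak.</div>

```python
import cmath


def dft(x, inverse=False):
    """Naive O(N^2) DFT; slow, but enough to check the identity on short data."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for m, xm in enumerate(x):
            acc += xm * cmath.exp(sign * 2j * cmath.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def circular_correlate(x, h):
    """Time-domain reference: r[k] = sum over m of x[(m + k) % N] * conj(h[m])."""
    n = len(x)
    return [
        sum(x[(m + k) % n] * complex(h[m]).conjugate() for m in range(n))
        for k in range(n)
    ]


pulse = [1.0, -1.0, 1.0, 1.0]              # illustrative pulse, not the LFM chirp
delay = 5                                  # arbitrary delay in samples
received = [0.0] * 12
received[delay:delay + len(pulse)] = pulse # noiseless delayed copy of the pulse
pulse_pad = pulse + [0.0] * (len(received) - len(pulse))

# Correlation theorem: multiply by the conjugated kernel spectrum, then invert.
spectrum = [xf * hf.conjugate() for xf, hf in zip(dft(received), dft(pulse_pad))]
via_dft = [v.real for v in dft(spectrum, inverse=True)]
direct = [v.real for v in circular_correlate(received, pulse_pad)]

peak_lag = max(range(len(via_dft)), key=lambda k: via_dft[k])
```

<div>
Both routes give the same correlation values, and the largest value sits at the lag equal to the delay that was inserted.</div>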
<div>
In my correlation, I only use the middle (N-M) portion of the N+M-1 possible correlation values, because the front and the back correlation values do not have complete overlap of the signal and the kernel; consider it an issue of fairness, if it helps you.</div>
<div>
<br /></div>
<div>
<div style="font-family: courier; line-height: normal;">
pulse_pad = [pulse zeros(size(pulse))];</div>
<div style="font-family: courier; line-height: normal;">
FFT_pulse = fft(pulse_pad);</div>
<div style="font-family: courier; line-height: normal;">
FFT_pulse_conj = conj(FFT_pulse);</div>
<div style="color: #25992d; font-family: courier; line-height: normal;">
<br /></div>
<div style="font-family: courier; line-height: normal;">
fft_corr = zeros(size(s_r) - [0 M]);</div>
<div style="font-family: courier; line-height: normal;">
rx_pad = [zeros(size(pulse)) s_r(1:M)];</div>
<div style="font-family: courier; line-height: normal;">
corr = ifft(fft(rx_pad) .* FFT_pulse_conj);</div>
<div style="color: #25992d; font-family: courier; line-height: normal;">
<span style="color: black;">fft_corr(1:M) = corr(M + (1:M)); </span>% Throw away the L half, save the R half</div>
<div style="font-family: courier; line-height: normal;">
<br /></div>
<div style="color: #25992d; font-family: courier; line-height: normal;">
<span style="color: #0433ff;">for</span><span style="color: black;"> i=M:M:(N-2*M) </span>% Feed subsequent received samples</div>
<div style="font-family: courier; line-height: normal;">
rx_pad((M+1):end) = s_r(i+(1:M));</div>
<div style="font-family: courier; line-height: normal;">
corr = ifft(fft(rx_pad) .* FFT_pulse_conj);</div>
<div style="font-family: courier; line-height: normal;">
fft_corr(i + ((-M+1):M)) = fft_corr(i + ((-M+1):M)) + corr;</div>
<div style="font-family: courier; line-height: normal;">
<span style="color: #0433ff;">end</span></div>
<div style="color: #25992d; font-family: courier; line-height: normal;">
% Pick up the last block's L half (and add to the previous block's R half)</div>
<div style="font-family: courier; line-height: normal;">
<span style="color: #0433ff;">if</span> isempty(i), i = M; <span style="color: #0433ff;">else</span> i = i + M; <span style="color: #0433ff;">end</span></div>
<div style="font-family: courier; line-height: normal;">
rx_pad((M+1):end) = s_r(i+(1:M));</div>
<div style="font-family: courier; line-height: normal;">
corr = ifft(fft(rx_pad) .* FFT_pulse_conj);</div>
<div style="font-family: courier; line-height: normal;">
fft_corr(i+((-M+1):0)) = fft_corr(i+ ((-M+1):0)) + corr(1:M);</div>
<div style="font-family: Courier; font-size: 10px; line-height: normal;">
<span style="font-family: -webkit-standard; font-size: small;"><br /></span></div>
<div style="line-height: normal;">
And here is a plot of the fft_corr using the ground truth of 7 m.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgr9NiUlOg94Jr3LHaHaxOhUK3fqL8Y3X93B2mJ1fPbwx49T7_X0ZKbEqzd7f3S9pcIiT7fcGBGIkCWKSXZIGwnObqMaDdPFYP_IqhDoP9CRr8pZzkFxOpPXaGouxwGCnW2_YvyoilhIBVf/s1600/fft_corr7m.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgr9NiUlOg94Jr3LHaHaxOhUK3fqL8Y3X93B2mJ1fPbwx49T7_X0ZKbEqzd7f3S9pcIiT7fcGBGIkCWKSXZIGwnObqMaDdPFYP_IqhDoP9CRr8pZzkFxOpPXaGouxwGCnW2_YvyoilhIBVf/s1600/fft_corr7m.png" /></a></div>
<div style="line-height: normal;">
Do you see the sharp peak at 7 m? The ambient sound (of equal magnitude as the chirp wave, since CHIRP_STRENGTH = 1 when forming the transmitted sound earlier) comes right through the relatively flat passband of the matched filter, so the sharp peak at 7 m is competing against the ambient sound.<br />
<h2>
Real compressed pulse</h2>
<div>
I saved 2 versions of the ideal compressed pulse calculated above: one for the left speaker, and the other for the right:</div>
<div>
<br /></div>
<div>
<div style="font-family: Courier; line-height: normal;">
silence = zeros(size(pulse));</div>
<div style="font-family: Courier; line-height: normal;">
audiowrite(strcat(tempDir, <span style="color: #b245f3;">'lchirp.wav'</span>), [pulse; silence]', Fs);</div>
<div style="font-family: Courier; line-height: normal;">
audiowrite(strcat(tempDir, <span style="color: #b245f3;">'rchirp.wav'</span>), [silence; pulse]', Fs);</div>
<div style="font-family: Courier; font-size: 12px; line-height: normal; min-height: 14px;">
<br /></div>
</div>
<div>
Since these are just WAV files, the PC can play them from its built-in speaker at will. I recorded what the PC's built-in mic heard:</div>
<div>
<br /></div>
<div>
<div style="font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">for</span> i=0:(audiodevinfo(1)-1)</div>
<div style="font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">if</span> isempty(strfind(audiodevinfo(1,i), <span style="color: #b245f3;">'Built'</span>))</div>
<div style="font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">continue</span>;</div>
<div style="font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">end</span></div>
<div style="font-family: Courier; line-height: normal;">
<span style="color: #0433ff;">break</span>;</div>
<div style="color: #0433ff; font-family: Courier; line-height: normal;">
end</div>
<div style="font-family: Courier; line-height: normal;">
Fs = 44100</div>
<div style="color: #25992d; font-family: Courier; line-height: normal;">
% OK, now I have the ID of the recorder</div>
<div style="font-family: Courier; line-height: normal;">
recorder = audiorecorder(Fs, 16, 2, i)</div>
<div style="color: #b245f3; font-family: Courier; line-height: normal;">
<span style="color: black;">disp(</span>'recording...'<span style="color: black;">);</span></div>
<div style="font-family: Courier; line-height: normal;">
recordblocking(recorder, 5);</div>
<div style="color: #b245f3; font-family: Courier; line-height: normal;">
<span style="color: black;">disp(</span>'recording done'<span style="color: black;">);</span></div>
<div style="font-family: Courier; line-height: normal; min-height: 12px;">
<br /></div>
<div style="font-family: Courier; line-height: normal;">
play(recorder);</div>
<div style="font-family: Courier; line-height: normal;">
heard = getaudiodata(recorder);</div>
<div style="font-family: Courier; line-height: normal;">
figure(1);</div>
<div style="font-family: Courier; line-height: normal;">
t = (1:length(heard)) / Fs;</div>
<div style="font-family: Courier; line-height: normal;">
ha(1) = subplot(211); plot(t, heard(:,1));</div>
<div style="font-family: Courier; line-height: normal;">
ha(2) = subplot(212); plot(t, heard(:,2));</div>
<div style="font-family: Courier; line-height: normal;">
linkaxes(ha, <span style="color: #b245f3;">'x'</span>);</div>
<div style="font-family: Courier; line-height: normal;">
xlabel(<span style="color: #b245f3;">'[s]'</span>);</div>
<div style="font-family: Courier; font-size: 12px; line-height: normal; min-height: 14px;">
<br /></div>
<div style="color: #b245f3; font-family: Courier; line-height: normal;">
<span style="color: black;">fname = </span>'~/github/OpenCVUnity/doc/heard.mat'<span style="color: black;">;</span></div>
<div style="font-family: Courier; line-height: normal;">
save(fname, <span style="color: #b245f3;">'heard'</span>);</div>
</div>
<div>
<br /></div>
<div>
Because I played the file twice within the span of 5 s, you see 2 pulses on both the left and the right mics.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhA18EuFQRen5CrgLeZ-bRfw0Sn5K1IUp5waNXdfLe-pYOGVAqESovwmT_jZyRrfLtPLHyZ0SHXfLmrTw49CFJ3mPjGvpDl4FX4DhoDZV4QvR3OjXbi6zDvDwLldkwfyVMKXM0xjkAIWUQX/s1600/heard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhA18EuFQRen5CrgLeZ-bRfw0Sn5K1IUp5waNXdfLe-pYOGVAqESovwmT_jZyRrfLtPLHyZ0SHXfLmrTw49CFJ3mPjGvpDl4FX4DhoDZV4QvR3OjXbi6zDvDwLldkwfyVMKXM0xjkAIWUQX/s1600/heard.png" /></a></div>
<div>
But either due to the poor quality of the PC speaker or (and?) mic, what the mic actually returned is far from the ideal response with which I calculated the matched filter frequency response.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjlWqoppssvS3Gu1qpDe-s0-xzBrsD5THZQIZGM0aThQ7LqAVO7jp1B6cs8232JbUz3vM_B76PG-Ifp949fwRuMtKHXIdYQLjebXSztaxTwC3LOdMNaDSmrwQgmjpWdMgyTw1wsx__5wta/s1600/heard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjlWqoppssvS3Gu1qpDe-s0-xzBrsD5THZQIZGM0aThQ7LqAVO7jp1B6cs8232JbUz3vM_B76PG-Ifp949fwRuMtKHXIdYQLjebXSztaxTwC3LOdMNaDSmrwQgmjpWdMgyTw1wsx__5wta/s1600/heard.png" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9r0SEIILKjV_HjP5DH1DUX6cHquf_V8SFUJCVBuOQkvKLH10xrF5igD0zKXTwqNXizMtmXETrsHvrlelAv0bei2Dg7_Ix9W7YDEmLFynZ1umzzO42Eo6ltXvA1Or4uNdi57lE_8T-8oVS/s1600/heard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9r0SEIILKjV_HjP5DH1DUX6cHquf_V8SFUJCVBuOQkvKLH10xrF5igD0zKXTwqNXizMtmXETrsHvrlelAv0bei2Dg7_Ix9W7YDEmLFynZ1umzzO42Eo6ltXvA1Or4uNdi57lE_8T-8oVS/s1600/heard.png" /></a></div>
<div>
Despite this imperfection, the matched filter thrived in a low noise environment.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhS643b5yfYFrKGHj6MlaWp5kEvUu-KEPHHHtHAIxqps284Udx5KgJYvn8N3FYAXrUtKU2OYs5_oKqsUMUlniOXpk-bJk0iD2T9gzxNKmS5qdws2CTALaB_XdSUtRvJ6PFkhqF9gc9qgvqC/s1600/heard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhS643b5yfYFrKGHj6MlaWp5kEvUu-KEPHHHtHAIxqps284Udx5KgJYvn8N3FYAXrUtKU2OYs5_oKqsUMUlniOXpk-bJk0iD2T9gzxNKmS5qdws2CTALaB_XdSUtRvJ6PFkhqF9gc9qgvqC/s1600/heard.png" /></a></div>
<div>
Zoomed in, the correlation function is no longer the single sharp peak seen in simulation.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjM1Ey2_Kk38m3_2PBn3IrRnQfcz9XcP0YrOMy3xDpLnVsbIboh3z5Sckg6x5tbMGbvXZTByQf1VpVng2QoR0QbnjYomYbXe5bFKgfcNFUgKRrb7ipltDRxg4C0vHYmuX0vaqlJ3Y7yMsyq/s1600/heard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjM1Ey2_Kk38m3_2PBn3IrRnQfcz9XcP0YrOMy3xDpLnVsbIboh3z5Sckg6x5tbMGbvXZTByQf1VpVng2QoR0QbnjYomYbXe5bFKgfcNFUgKRrb7ipltDRxg4C0vHYmuX0vaqlJ3Y7yMsyq/s1600/heard.png" /></a></div>
<div>
Ignore the crazy distance: a sound will not even travel that far; the distance is calculated with the hypothesis that the pulse was transmitted at the 1st sample of the recording.</div>
<div>
<br /></div>
<div>
If the signal has to compete with ambient sound, then correlation will let the ambient sound through, as you can see below:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDkUzEHQVtLhtRMxruGHv3eNqqyvUEHgaj06WOBWRgPafS_eKJAP9cYcDPQuk00k51YjPkDP-RBXQz5grfVyaaivIOCC_AruPBBftlBsq8VLE-wqfzbF8XOSLGf8yra3hMIc4hvAAYeG3G/s1600/heard.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDkUzEHQVtLhtRMxruGHv3eNqqyvUEHgaj06WOBWRgPafS_eKJAP9cYcDPQuk00k51YjPkDP-RBXQz5grfVyaaivIOCC_AruPBBftlBsq8VLE-wqfzbF8XOSLGf8yra3hMIc4hvAAYeG3G/s1600/heard.png" /></a></div>
<div>
The good news is that the sharp peak survived. Picking off the sharp correlation peak among competing oscillations will require more downstream filtering; I am not sure the downstream filters will necessarily be linear, in which case the error covariance is going to be an interesting topic.</div>
</div>
<div style="line-height: normal;">
<br /></div>
<div style="line-height: normal;">
<br /></div>
<div style="color: #25992d; font-family: Courier; font-size: 10px; line-height: normal;">
</div>
</div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com2tag:blogger.com,1999:blog-4032020337247582619.post-83315242780867118592017-02-19T23:48:00.000-08:002017-02-19T23:48:02.664-08:00Recognizing remote phone's torch in Unity with OpenCV<h2>
So I need some image processing in my Unity game...</h2>
In my <a href="http://henryomd.blogspot.com/2017/02/understanding-pulse-compression.html">previous blog entry</a>, I theorized on the feasibility of sound based pseudo range calculation between mobile devices within 5 m. Of course, to localize a device using pseudo-range alone, you need at least 3 pseudo ranges to solve the Euclidean distance equation in 3D space even in the best case (devices time synchronized, no clock drift, etc). Due to the limited bandwidth of smartphone speakers and mics (~audible range), it would be difficult to obtain a sufficient number of pseudo ranges with a pulse lasting only on the order of 10 ms (a long pulse is suitable only for the stationary use case). Plus, it is impossible for a userspace application to get the time synchronization fidelity required for precise range estimation. So I wondered if I could estimate the angle of my phone relative to a light source--such as the LED torch available on all smartphones these days. If a stationary point light source is within the field of view of my camera, I have more constraints in the problem geometry shown below, where the B device is mobile and the A device is stationary. I put a torch "t" on the stationary device, and a camera "c" on the mobile device.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJJxtm7w81jEigCwNC-nIS7uwl4MG5FuYmwTngWxYFywNmL6KcCbZDuATDIbLA1i6Phwky9YT94SF4sJ1ZrwSpez68Arj5Pii5vCgYwmZ1geIToy0i5wEc0RXsDhdYKGBHeXzcJuBhIqiQ/s1600/Frames.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJJxtm7w81jEigCwNC-nIS7uwl4MG5FuYmwTngWxYFywNmL6KcCbZDuATDIbLA1i6Phwky9YT94SF4sJ1ZrwSpez68Arj5Pii5vCgYwmZ1geIToy0i5wEc0RXsDhdYKGBHeXzcJuBhIqiQ/s1600/Frames.png" /></a></div>
<h2>
iOSTorch</h2>
To light up my iPhone's torch, I bought the iOSTorch plugin from the Unity Asset Store ($2). It did not work right away on iOS 10, but it was open source, so I modified it to hard code the iOS version to 10, and to turn off the flash (it was lighting up both the torch AND the flash). The torch brightness is specifiable as a float, but the iPhone internally quantizes it to 3 levels; even at the "low" setting, it is quite blinding at a close distance. To make it dimmer, I tried to cycle it off and on rapidly to get another 1 or 2 bits of resolution, but my phone overheated for some reason and then was bricked, and had to be restored through iTunes (so don't try it!). So I decided to suck it up and deal with the saturated pixels from the overly bright torch. With the image processing I discuss below, the result looks reasonable at distances beyond 1 m, as you can see below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjksWxtYDvG7AI71EJ6RePuC3Vl8K_2l2XMwuz1xYqO6kDaa6aBUKRtK6S-7u8NT5hUdFRQvN2KU2FEpo-lUY_zXjyDBsL4V7D3qmeT3JMKAPNtZ5BB7dZbTGLQMJTMNXQoB88KFhIBXV3K/s1600/Morphological+opening.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="380" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjksWxtYDvG7AI71EJ6RePuC3Vl8K_2l2XMwuz1xYqO6kDaa6aBUKRtK6S-7u8NT5hUdFRQvN2KU2FEpo-lUY_zXjyDBsL4V7D3qmeT3JMKAPNtZ5BB7dZbTGLQMJTMNXQoB88KFhIBXV3K/s640/Morphological+opening.png" width="640" /></a></div>
I turn on the torch in PlayerController.Start():<br />
<span style="font-family: "menlo";">
<span style="color: #333333;"> </span><span style="color: #009695;">switch</span><span style="color: #333333;">(</span><span style="color: #3364a4;">Application</span><span style="color: #333333;">.</span><span style="color: #333333;">platform</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #333333;">{</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">case</span><span style="color: #333333;"> </span><span style="color: #3364a4;">RuntimePlatform</span><span style="color: #333333;">.</span><span style="color: #333333;">IPhonePlayer</span><span style="color: #333333;">:</span><br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">iOSTorch</span><span style="color: #333333;">.</span><span style="color: #333333;">Init</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">iOSTorch</span><span style="color: #333333;">.</span><span style="color: #333333;">On</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">0.001f</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">break</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">case</span><span style="color: #333333;"> </span><span style="color: #3364a4;">RuntimePlatform</span><span style="color: #333333;">.</span><span style="color: #333333;">OSXEditor</span><span style="color: #333333;">:</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">case</span><span style="color: #333333;"> </span><span style="color: #3364a4;">RuntimePlatform</span><span style="color: #333333;">.</span><span style="color: #333333;">WindowsEditor</span><span style="color: #333333;">:</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">yIsUp</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">true</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">Debug</span><span style="color: #333333;">.</span><span style="color: #333333;">Log</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">"</span><span style="color: #f57d00;">Using</span><span style="color: #f57d00;"> </span><span style="color: #f57d00;">Unity</span><span style="color: #f57d00;"> </span><span style="color: #f57d00;">Remote</span><span style="color: #f57d00;">"</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">break</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">}</span></span>
<br />
<br />
In case the game is suspended, I turn off the torch in the MonoBehaviour.OnApplicationPause() callback:<br />
<span style="font-family: "menlo";">
<span style="color: #333333;"> </span><span style="color: #009695;">void</span><span style="color: #333333;"> </span><span style="color: #333333;">OnApplicationPause</span><span style="color: #333333;">(</span><span style="color: #333333;"> </span><span style="color: #009695;">bool</span><span style="color: #333333;"> </span><span style="color: #333333;">pause</span><span style="color: #333333;"> </span><span style="color: #333333;">)</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">{</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">if</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Application</span><span style="color: #333333;">.</span><span style="color: #333333;">platform</span><span style="color: #333333;"> </span><span style="color: #333333;">==</span><span style="color: #333333;"> </span><span style="color: #3364a4;">RuntimePlatform</span><span style="color: #333333;">.</span><span style="color: #333333;">IPhonePlayer</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #333333;">{</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">if</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">pause</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #3364a4;">iOSTorch</span><span style="color: #333333;">.</span><span style="color: #333333;">Off</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">else</span><span style="color: #333333;"> </span><span style="color: #3364a4;">iOSTorch</span><span style="color: #333333;">.</span><span style="color: #333333;">On</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">0.001f</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">}</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">}</span></span>
<br />
<br />
<h2>
In search of the torch photo-electrons in the image</h2>
<div>
I started my 3D game as a clone of OpenCVForUnity's Optical Flow demo, so I already had OpenCV ready to go (it's only $95 on the Unity Asset Store). OpenCVForUnity's WebCamTextureToMatHelper converts the camera's RGBA frame into an OpenCV Mat; this is the starting point of the image processing.</div>
<div>
<br /></div>
<div>
<span style="color: #009695; font-family: "menlo";">using</span><span style="color: #333333; font-family: "menlo";"> OpenCVForUnity</span><span style="color: #333333; font-family: "menlo";">;</span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #3364a4;"><br /></span></span></div>
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #009695;">void</span><span style="color: #333333;"> </span><span style="color: #333333;">Update</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span></span><br />
<div>
<span style="color: #333333; font-family: "menlo";">{</span></div>
<div>
<span style="color: #333333; font-family: "menlo";">...</span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #3364a4;"> Mat</span><span style="color: #333333;"> </span><span style="color: #333333;">rgbaMat</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">webCamTextureToMatHelper</span><span style="color: #333333;">.</span><span style="color: #333333;">GetMat</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span><span style="color: #333333;">;</span><br /><span style="color: #009695;"> int</span><span style="color: #333333;"> </span><span style="color: #333333;">width</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">rgbaMat</span><span style="color: #333333;">.</span><span style="color: #333333;">width</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">height</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">rgbaMat</span><span style="color: #333333;">.</span><span style="color: #333333;">height</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span><span style="color: #333333;">;</span><br />
</span></div>
<div>
<br /></div>
<div>
Without an academic background in image processing, I approach the problem quite naively: I aim to boost the signal and reduce the noise, where the signal is the photo-electron count from the remote device's torch. The LED torchlight seemed to be white, so I wanted to kill pixels that did not have all 3 colors. A simple element-wise product of the 3 channels seemed like a good filter for the white light. OpenCV made this task trivial:</div>
<div>
<br /></div>
<div>
<span style="font-family: "menlo";"><span style="color: #3364a4;"> List</span><span style="color: #333333;"><</span><span style="color: #3364a4;">Mat</span><span style="color: #333333;">></span><span style="color: #333333;"> </span><span style="color: #333333;">channels</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">List</span><span style="color: #333333;"><</span><span style="color: #3364a4;">Mat</span><span style="color: #333333;">>()</span><span style="color: #333333;">;</span><br /><span style="color: #3364a4;"> Core</span><span style="color: #333333;">.</span><span style="color: #333333;">split</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">rgbaMat</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">channels</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #333333;"> </span><span style="color: #333333;">rxgxbMat</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">channels</span><span style="color: #333333;">[</span><span style="color: #f57d00;">0</span><span style="color: #333333;">]</span><span style="color: #333333;">.</span><span style="color: #333333;">mul</span><span style="color: #333333;">(</span></span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #333333;"> channels</span><span style="color: #333333;">[</span><span style="color: #f57d00;">1</span><span style="color: #333333;">]</span><span style="color: #333333;">.</span><span style="color: #333333;">mul</span><span style="color: #333333;">(</span><span style="color: #333333;">channels</span><span style="color: #333333;">[</span><span style="color: #f57d00;">2</span><span style="color: #333333;">]</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">1.0f</span><span style="color: #333333;">/</span><span style="color: #f57d00;">256</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">1.0f</span><span style="color: #333333;">/</span><span style="color: #f57d00;">256</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"></span></span></div>
<div>
<br /></div>
<div>
Scaling each successive multiplication is necessary to avoid saturating the result. I guess OpenCV's Mat.mul() method internally uses a wider datatype before applying the scaling and the saturation, because each color channel was originally an 8-bit unsigned count. The result seems to confirm my intuition that the torch light is the brightest thing in the image. It is so bright, in fact, that it seems to distort the pixels around it, as you can see below.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgE2UVpg6YEkjOlEm3N7uk7TrElC0WJhZnLBxtl2znqqWytGXyYJyaPP_ysyG20LrerR8YFoJsQUej3Mg2stroS6dPnBgAtv1gmb6ezx8cEL3M9lX19YANzqzabF2RyWAohdelhrwAQ9Jiq/s1600/baseline+zoom.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgE2UVpg6YEkjOlEm3N7uk7TrElC0WJhZnLBxtl2znqqWytGXyYJyaPP_ysyG20LrerR8YFoJsQUej3Mg2stroS6dPnBgAtv1gmb6ezx8cEL3M9lX19YANzqzabF2RyWAohdelhrwAQ9Jiq/s1600/baseline+zoom.png" /></a></div>
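<div>
The effect of the scaling can be checked on single pixels. The following is a minimal, language-neutral sketch in plain Python, not the Unity C# code. It assumes that each scaled product is rounded and saturated back to 8 bits before the next multiplication; the function names are made up for illustration.</div>

```python
def scaled_mul(a, b, scale=1.0 / 256):
    """Multiply two 8-bit values, scale, round, and clamp to 0..255.

    Assumption: the product is formed in a wide type and only the scaled
    result is saturated back to 8 bits.
    """
    return max(0, min(255, round(a * b * scale)))


def white_score(r, g, b):
    """Scaled product of the three channels, mirroring r.mul(g.mul(b, s), s)."""
    return scaled_mul(r, scaled_mul(g, b))


if __name__ == "__main__":
    for name, rgb in [("white", (255, 255, 255)),
                      ("gray", (128, 128, 128)),
                      ("red", (255, 0, 0)),
                      ("yellow", (255, 255, 0))]:
        print(name, white_score(*rgb))
    # Without scaling, even a dim pair of channels hits the 8-bit ceiling.
    print("unscaled 16*16 ->", scaled_mul(16, 16, scale=1.0))
```

<div>
Under these assumptions a saturated white pixel scores 253, a mid-gray pixel scores 32, and any pixel missing a color channel scores 0, which is the behavior wanted from a white-light filter.</div>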
<div>
<br /></div>
<div>
Any image is contaminated by some noise, and one of the most effective noise reduction techniques in image processing is the median filter, which replaces a pixel value with the median of the ensemble consisting of itself and its neighboring pixels. Lower noise helps morphological operations like opening produce a cleaner (more canonical) result.</div>
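<div>
To make the idea concrete, here is a small sketch of a median filter in plain Python (an illustration only, not the OpenCV implementation). The edge handling, which repeats the nearest edge pixel, is a choice made for this sketch.</div>

```python
from statistics import median


def median_filter(img, ksize=3):
    """Replace each pixel with the median of its ksize x ksize neighborhood.

    Pixels outside the image are handled by repeating the nearest edge pixel
    (a choice made for this sketch).
    """
    h, w = len(img), len(img[0])
    r = ksize // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in range(-r, r + 1)
                      for dx in range(-r, r + 1)]
            out[y][x] = median(window)
    return out


if __name__ == "__main__":
    noisy = [[10] * 5 for _ in range(5)]
    noisy[2][2] = 255            # a single hot pixel
    print(median_filter(noisy)[2][2])
```

<div>
A lone hot pixel in a flat region is replaced by the surrounding value, while a sharp step edge passes through unchanged, which is why the median filter is popular as a pre-processing step before morphological operations.</div>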
<div>
<br /></div>
<div>
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #3364a4;">Mat</span><span style="color: #333333;"> </span><span style="color: #333333;">rxgxbMat</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">tempA</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">tempB</span><span style="color: #333333;">, </span><span style="color: #333333;">emptyMat</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mat</span><span style="color: #333333;">()</span><span style="color: #333333;">,</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">erode_kernel</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mat</span><span style="color: #333333;">();</span></span></div>
<div>
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #3364a4;">Point</span><span style="color: #333333;"> </span><span style="color: #333333;">erode_anchor</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Point</span><span style="color: #333333;">(</span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #f57d00;">0</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span>
</div>
<div>
<span style="font-family: "menlo";"><span style="color: #333333;"><br /></span></span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #009695;">void</span><span style="color: #333333;"> </span><span style="color: #333333;">Update</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span></span><br />
<div>
<span style="color: #333333; font-family: "menlo";">{</span></div>
<div>
<span style="color: #333333; font-family: "menlo";">...</span></div>
</div>
<div>
<span style="font-family: "menlo";"><span style="color: #333333;"></span><span style="color: #009695;"> if</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">tempA</span><span style="color: #333333;"> </span><span style="color: #333333;">==</span><span style="color: #333333;"> </span><span style="color: #009695;">null</span><span style="color: #333333;">)</span></span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #333333;"></span><span style="color: #333333;"> tempA</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mat</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">height</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">width</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">channels</span><span style="color: #333333;"> </span><span style="color: #333333;">[</span><span style="color: #f57d00;">0</span><span style="color: #333333;">]</span><span style="color: #333333;">.</span><span style="color: #333333;">type</span><span style="color: #333333;"> </span><span style="color: #333333;">())</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">if</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">tempB</span><span style="color: #333333;"> </span><span style="color: #333333;">==</span><span style="color: #333333;"> </span><span style="color: #009695;">null</span><span style="color: #333333;">)</span></span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #333333;"> </span><span style="color: #333333;">tempB</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mat</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">height</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">width</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">channels</span><span style="color: #333333;"> </span><span style="color: #333333;">[</span><span style="color: #f57d00;">0</span><span style="color: #333333;">]</span><span style="color: #333333;">.</span><span style="color: #333333;">type</span><span style="color: #333333;"> </span><span style="color: #333333;">())</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">Imgproc</span><span style="color: #333333;">.</span><span style="color: #333333;">medianBlur</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">rxgxbMat</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">tempA</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">7</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br /><span style="color: #333333;"> </span><span style="color: #3364a4;">Imgproc</span><span style="color: #333333;">.</span><span style="color: #333333;">morphologyEx</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">tempA</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">tempB</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Imgproc</span><span style="color: #333333;">.</span><span style="color: #333333;">MORPH_OPEN</span><span style="color: #333333;">,</span></span></div>
<div>
<span style="font-family: "menlo";"><span style="color: #333333;"> </span><span style="color: #333333;">erode_kernel</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">erode_anchor</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">2</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span>
</div>
<div>
<span style="font-family: "menlo";">
<span style="color: #333333;"> </span><span style="color: #3364a4;">Core</span><span style="color: #333333;">.</span><span style="color: #3364a4;">MinMaxLocResult</span><span style="color: #333333;"> </span><span style="color: #333333;">minmax</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Core</span><span style="color: #333333;">.</span><span style="color: #333333;">minMaxLoc</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">tempB</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">Debug</span><span style="color: #333333;">.</span><span style="color: #333333;">Log</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">String</span><span style="color: #333333;">.</span><span style="color: #333333;">Format</span><span style="color: #333333;">(</span><span style="color: #f57d00;">"</span><span style="color: #f57d00;">max</span><span style="color: #f57d00;"> </span><span style="color: red;">{</span><span style="color: red;">0</span><span style="color: red;">}</span><span style="color: red;"></span><span style="color: #f57d00;"> </span><span style="color: #f57d00;">@</span><span style="color: #f57d00;"> </span><span style="color: red;">{</span><span style="color: red;">1</span><span style="color: red;">}</span><span style="color: red;"></span><span style="color: #f57d00;">"</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">minmax</span><span style="color: #333333;">.</span><span style="color: #333333;">maxVal</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">minmax</span><span style="color: #333333;">.</span><span style="color: #333333;">maxLoc</span><span style="color: #333333;">))</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"></span></span></div>
<div>
<br /></div>
<div>
Note that I am using 2 iterations for the opening operation. A single iteration did not seem to clean up the distortion enough, and 3 iterations seemed to yield another kind of distortion; 2 iterations seemed to produce the cleanest-looking blob.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXnOQ34HZjJ8JYw8oyk57OdzKs2pSrsOxbeSHBMm7z58ni_gjKRk844R8V3CT1Tn24VaDf5ObXUJlMil0xHDZUP8bNzwVBwrt30tluZrFUbrAXySz8vOh_VAidCkQVdmWI8uPjokWAw7eR/s1600/Opening+zoom.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXnOQ34HZjJ8JYw8oyk57OdzKs2pSrsOxbeSHBMm7z58ni_gjKRk844R8V3CT1Tn24VaDf5ObXUJlMil0xHDZUP8bNzwVBwrt30tluZrFUbrAXySz8vOh_VAidCkQVdmWI8uPjokWAw7eR/s1600/Opening+zoom.png" /></a></div>
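<div>
The trade-off in the iteration count can be reproduced on a toy image. This is a sketch in plain Python, assuming that an opening with n iterations amounts to n erosions followed by n dilations with a 3x3 rectangular kernel; the test image and function names are hypothetical.</div>

```python
def _morph(img, pick):
    """Apply pick (min or max) over each pixel's 3x3 neighborhood.

    Neighbors that fall outside the image are skipped.
    """
    h, w = len(img), len(img[0])
    return [[pick(img[ny][nx]
                  for ny in range(max(y - 1, 0), min(y + 2, h))
                  for nx in range(max(x - 1, 0), min(x + 2, w)))
             for x in range(w)]
            for y in range(h)]


def opening(img, iterations=1):
    """Erode `iterations` times, then dilate `iterations` times."""
    for _ in range(iterations):
        img = _morph(img, min)   # grayscale erosion = local minimum
    for _ in range(iterations):
        img = _morph(img, max)   # grayscale dilation = local maximum
    return img


def make_test_image():
    """9x9 dark image with a 5x5 bright blob and a one-pixel bright speck."""
    img = [[0] * 9 for _ in range(9)]
    for y in range(2, 7):
        for x in range(2, 7):
            img[y][x] = 200
    img[0][8] = 255
    return img
```

<div>
On this toy image, 1 or 2 iterations remove the speck and keep the 5x5 blob intact, while 3 iterations erode the blob away completely. In general, more iterations remove larger clutter but also erase any blob narrower than the accumulated erosion.</div>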
The max value in my study at 2 PM today (a rainy day) was 116 (out of 255), but when I turned on the torch, the maximum value shot up to 253 (close to saturation), and its location consistently tracked where the torch appeared in the camera image.<br />
<h2>
Estimating the torch position</h2>
Lacking prior knowledge of the torch's orientation relative to my camera, I simply estimate the torch's position as the center of the brightest blob. I just learned that any part of the image with salient information is called a "feature" in image processing. Chapter 16 of <i><a href="https://smile.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/1491937998/ref=sr_1_1?ie=UTF8&qid=1487574967&sr=8-1&keywords=Learning+OpenCV+3">Learning OpenCV</a></i> 3 is all about 2D feature detection, including the SimpleBlobDetector. To further simplify the image fed to the simple blob detector, I use the fact that the torch seems to be the brightest thing in the scene when it is on. So I subtract 90% of the max intensity found above (or 200, whichever is larger) from the morphologically opened image.<br />
<br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #3364a4;">Core</span><span style="color: #333333;">.</span><span style="color: #333333;">subtract</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">tempB</span><span style="color: #333333;">,</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Scalar</span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Max</span><span style="color: #333333;">((</span><span style="color: #009695;">float</span><span style="color: #333333;">)(</span><span style="color: #f57d00;">0.9f</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #333333;">minmax</span><span style="color: #333333;">.</span><span style="color: #333333;">maxVal</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">200.0f</span><span style="color: #333333;">))</span><span style="color: #333333;">,</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">tempA</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span><br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #3364a4;">MatOfKeyPoint</span><span style="color: #333333;"> </span><span style="color: #333333;">keypoints</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">MatOfKeyPoint</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span><span style="color: #333333;">;</span></span><br />
<span style="color: #333333; font-family: Menlo;">blobDetector</span><span style="color: #333333; font-family: Menlo;">.</span><span style="color: #333333; font-family: Menlo;">detect</span><span style="color: #333333; font-family: Menlo;"> </span><span style="color: #333333; font-family: Menlo;">(</span><span style="color: #333333; font-family: Menlo;">tempA</span><span style="color: #333333; font-family: Menlo;">,</span><span style="color: #333333; font-family: Menlo;"> </span><span style="color: #333333; font-family: Menlo;">keypoints</span><span style="color: #333333; font-family: Menlo;">)</span><span style="color: #333333; font-family: Menlo;">;</span><br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #3364a4;">Features2d</span><span style="color: #333333;">.</span><span style="color: #333333;">drawKeypoints</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">rgbaMat</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">keypoints</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">rgbaMat</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br /><span style="color: #3364a4;">Debug</span><span style="color: #333333;">.</span><span style="color: #333333;">Log</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">"</span><span style="color: #f57d00;">keypoints</span><span style="color: #f57d00;"> </span><span style="color: #f57d00;">found</span><span style="color: #f57d00;"> </span><span style="color: #f57d00;">"</span><span style="color: #333333;"> </span><span style="color: #333333;">+</span><span style="color: #333333;"> </span><span style="color: #333333;">keypoints</span><span style="color: #333333;">.</span><span style="color: #333333;">size</span><span style="color: #333333;"> </span><span style="color: #333333;">())</span><span style="color: #333333;">;</span></span> <br />
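<br />
The subtraction step can be illustrated on a handful of pixel values. This is a sketch in plain Python rather than the Unity code; it assumes the subtraction saturates at 0 for 8-bit data, and the function name is hypothetical. The two maxima used in the example, 253 and 116, are the torch-on and torch-off values reported above.<br />

```python
def subtract_floor(pixels, max_val, fraction=0.9, floor=200.0):
    """Subtract max(fraction * max_val, floor) from each pixel, clamping at 0."""
    threshold = max(fraction * max_val, floor)
    return [max(0, round(p - threshold)) for p in pixels]


if __name__ == "__main__":
    # Torch on: only pixels close to the peak survive.
    print(subtract_floor([253, 240, 227, 116], max_val=253))
    # Torch off: the floor of 200 wipes out the whole image.
    print(subtract_floor([116, 90], max_val=116))
```

With the torch on, only the pixels within about 10% of the peak remain non-zero. With the torch off, the floor of 200 exceeds every pixel, so the image handed to the blob detector is empty and no key point should be reported.<br />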
<br />
The detector is created and initialized once in the PlayerController.Start() method. I actually wanted to create a new detector with different thresholds in each Update(), but the OpenCVForUnity API only allows initializing the feature detector from a file, which is onerous to modify at every frame.<br />
<br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #3364a4;">FeatureDetector</span><span style="color: #333333;"> </span><span style="color: #333333;">blobDetector</span><span style="color: #333333;">;</span></span> <br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #009695;">void</span><span style="color: #333333;"> </span><span style="color: #333333;">Start</span><span style="color: #333333;"> </span><span style="color: #333333;">()</span><br /><span style="color: #333333;">{</span></span> <br />
...<br />
<span style="font-family: Menlo;"><span style="color: #333333;"></span><span style="color: #009695;"> string</span><span style="color: #333333;"> </span><span style="color: #333333;">blobparams_yml_filepath</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Utils</span><span style="color: #333333;">.</span><span style="color: #333333;">getFilePath</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">"</span><span style="color: #f57d00;">blobparams</span><span style="color: #f57d00;">.</span><span style="color: #f57d00;">yml</span><span style="color: #f57d00;">"</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br /><span style="color: #333333;"> </span><span style="color: #333333;">blobDetector</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #3364a4;">FeatureDetector</span><span style="color: #333333;">.</span><span style="color: #333333;">create</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">FeatureDetector</span><span style="color: #333333;">.</span><span style="color: #333333;">SIMPLEBLOB</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">blobDetector</span><span style="color: #333333;">.</span><span style="color: #333333;">read</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">blobparams_yml_filepath</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span> <br />
<br />
So the magic of the blob detector is in the parameters. After playing around with various parameters, this is what I settled on for now:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">%YAML:1.0</span><br />
<span style="font-family: Courier New, Courier, monospace;">thresholdStep: 10.</span><br />
<span style="font-family: Courier New, Courier, monospace;">minThreshold: 10.</span><br />
<span style="font-family: Courier New, Courier, monospace;">maxThreshold: 20.</span><br />
<span style="font-family: Courier New, Courier, monospace;">filterByColor: True</span><br />
<span style="font-family: Courier New, Courier, monospace;">blobColor: 255 # Look for light area (rather than dark)</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: 'Courier New', Courier, monospace;">filterByArea: True</span><br />
<span style="font-family: Courier New, Courier, monospace;">minArea: 100.</span><br />
<span style="font-family: Courier New, Courier, monospace;">maxArea: 5000.</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">filterByCircularity: True</span><br />
<span style="font-family: Courier New, Courier, monospace;">minCircularity: 0.75</span><br />
<span style="font-family: Courier New, Courier, monospace;">maxCircularity: 1.5</span><br />
<br />
This setting works pretty well if the camera sees a roughly circular pattern, as you can see by where the key point marker was placed. The image is mostly dark because I subtracted 90% of the max value above.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiv6JDWywG4j0BZ4b3gp2GaGHs-qSL7JFQuBxe7W0mywRfIyhTVDYYEOkFRY7Himpj3Pn6hwK5GmUUpLYjylayVFIDE_r5DKgcVMddaHG1TCyPahyphenhyphen2OtCPVOkou41hMEIOtBxCABRbORfmr/s1600/blob+detect+true+positive.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiv6JDWywG4j0BZ4b3gp2GaGHs-qSL7JFQuBxe7W0mywRfIyhTVDYYEOkFRY7Himpj3Pn6hwK5GmUUpLYjylayVFIDE_r5DKgcVMddaHG1TCyPahyphenhyphen2OtCPVOkou41hMEIOtBxCABRbORfmr/s1600/blob+detect+true+positive.png" /></a></div>
Note that the center of the blob was correctly estimated. But because the torch is so bright, a lot of light reflects off the back of my iPhone, which is not encased. The reflections distort the observed image, and the circularity check rejects the blob, as you can see here.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTGd9x2NSAQycbkqsRgo1qQ5gV9HXWRZkYQkaDK-P_V5yLxNjH8tsrnKL4xLJOJeIoWq8mOUIAC4gNpreE_KO0aHnZzRAkKmzOSCbmH1LhEaB8x2KAleLy6ecc1XFHoDyhiRQKLA-B8wNd/s1600/blob+detect+false+negative.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTGd9x2NSAQycbkqsRgo1qQ5gV9HXWRZkYQkaDK-P_V5yLxNjH8tsrnKL4xLJOJeIoWq8mOUIAC4gNpreE_KO0aHnZzRAkKmzOSCbmH1LhEaB8x2KAleLy6ecc1XFHoDyhiRQKLA-B8wNd/s1600/blob+detect+false+negative.png" /></a></div>
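<br />
To see why a distorted blob fails the circularity check, it helps to compute the measure by hand. The sketch below is plain Python and assumes the usual definition of circularity, 4*pi*area / perimeter^2; the shapes are made-up examples, and the 0.75 and 1.5 limits are the minCircularity and maxCircularity values from the parameter file above.<br />

```python
import math


def circularity(area, perimeter):
    """4*pi*area / perimeter**2: 1.0 for a perfect circle, lower otherwise."""
    return 4.0 * math.pi * area / (perimeter * perimeter)


def passes(area, perimeter, min_circ=0.75, max_circ=1.5):
    """Apply the minCircularity/maxCircularity window from blobparams.yml."""
    return min_circ <= circularity(area, perimeter) < max_circ


if __name__ == "__main__":
    r = 10.0
    shapes = {
        "circle r=10": (math.pi * r * r, 2 * math.pi * r),
        "square 10x10": (100.0, 40.0),
        "rectangle 10x20": (200.0, 60.0),
    }
    for name, (area, perimeter) in shapes.items():
        print(name, round(circularity(area, perimeter), 3), passes(area, perimeter))
```

A square (about 0.785) barely clears the 0.75 limit, while a 2:1 rectangle (about 0.698) is rejected, so even a moderate elongation of the blob by reflections is enough to produce a false negative.<br />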
<br />
<br />
<br />Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com2tag:blogger.com,1999:blog-4032020337247582619.post-25344369462523934332017-02-15T22:35:00.003-08:002017-03-04T10:43:42.978-08:00Understanding pulse compressionIn my previous <a href="http://henryomd.blogspot.com/2017/02/imu-based-6dof-estimation-in-unity.html">blog post</a>, I showed how a straightforward implementation of the kinematic equation of motion in Unity, using the MEMS accelerometer and gyro on my phone, led to unbounded position error growth due to the imperfect accelerometer bias estimation. To better estimate the instantaneous accelerometer bias, I need a different sensor, because the accelerometer cannot estimate its own bias--this is one of the core ideas of indoor location systems, regardless of the sensor types used. While reading about indoor location technologies, I stumbled on pseudo-ranging in the ultrasonic frequencies on mobile phones, such as Anthony Rowe's work at Carnegie Mellon, which used pulse compression to get a finer range estimate than is possible with an uncompressed pulse. In this post, I simulate sending and receiving a compressed pulse sound. You can find my Matlab script at https://github.com/henrychoi/OpenCVUnity/tree/output_pulse/doc<br />
<h2>
Pulse compression</h2>
The most thorough explanation of pulse compression in radar applications I've found so far is <i>Radar Systems Analysis and Design Using Matlab</i> by Bassem R. Mahafza (2000 edition). Repeating the relevant parts of Chapter 6 (I changed some notation for my own clarity), first consider a rectangular pulse signal of duration <span style="font-family: "helveticaneue"; font-size: 12px;">𝞽</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">p</sub>:<br />
<br />
<div style="text-align: center;">
s(t) = Rect(t/<span style="font-family: "helveticaneue"; text-align: start; text-size-adjust: auto;">𝞽</span><sub style="font-family: HelveticaNeue; text-align: start; text-size-adjust: auto;">p</sub>) = 1 for 0 <= t <= <span style="font-family: "helveticaneue"; text-align: start; text-size-adjust: auto;">𝞽</span><sub style="font-family: HelveticaNeue; text-align: start; text-size-adjust: auto;">p</sub>, else 0</div>
<br />
A compressed pulse is a sinusoid of linearly increasing frequency (which means the phase is increasing quadratically) modulated by the Rect function above. For an EM (electro-magnetic) wave which has displacements in the direction orthogonal to the direction of wave propagation (Z by convention)--and therefore can be polarized, the wave can be expressed conveniently with a complex exponential for the uncompressed signal in time domain <span style="font-family: "helveticaneue"; text-align: center;">s</span><sub style="font-family: helveticaneue; text-align: center;">u</sub><span style="font-family: "helveticaneue"; text-align: center;">(t)</span><br />
<br />
<div style="text-align: center;">
<span style="font-family: "helveticaneue"; text-size-adjust: auto;">s</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">u</sub><span style="font-family: "helveticaneue"; text-size-adjust: auto;">(t) = Rect(t/<span style="text-align: start; text-size-adjust: auto;">𝞽</span><sub style="text-align: start; text-size-adjust: auto;">p</sub>) exp{ j 2𝞹(</span><span style="font-family: "helveticaneue"; text-size-adjust: auto;">f</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">0</sub><span style="font-family: "helveticaneue"; text-size-adjust: auto;"> t + 𝞵</span><span style="font-family: "helveticaneue";">/2</span><span style="font-family: "helveticaneue"; text-size-adjust: auto;"> t</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">2</sup><span style="font-family: "helveticaneue"; text-size-adjust: auto;">) }</span></div>
<br />
Its real or imaginary part will be a sinusoid with increasing frequency, as shown below, for <span style="font-family: "helveticaneue"; text-align: center;">𝞽</span><sub style="font-family: helveticaneue; text-align: center;">p </sub>= 512 / 44.1 kHz, <span style="font-family: "helveticaneue"; text-align: center;">f</span><sub style="font-family: helveticaneue; text-align: center;">0 </sub>= 200, <span style="font-family: "helveticaneue"; text-align: center;">𝞵 </span>= 0.5 * B/<span style="font-family: "helveticaneue"; text-align: center;">𝞽</span><sub style="font-family: helveticaneue; text-align: center;">p</sub>, where the bandwidth B = 10 kHz.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3f9wLb1KZQqqHVghSSxjlFgeEQBR-3fa_CboAK14_vtP8au4OBdmKpfZ_2C7PwMIWhKaXHaAZLjs4CBfhdt7jZN3F_jAKZjD6GQIpRpirbqmx-upSnuhcjm_qMDGqA5dYtxUD66qhpfLL/s1600/uncompressed.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3f9wLb1KZQqqHVghSSxjlFgeEQBR-3fa_CboAK14_vtP8au4OBdmKpfZ_2C7PwMIWhKaXHaAZLjs4CBfhdt7jZN3F_jAKZjD6GQIpRpirbqmx-upSnuhcjm_qMDGqA5dYtxUD66qhpfLL/s1600/uncompressed.png" /></a></div>
Note that the frequency spectrum of the signal is NOT symmetric about DC, because the signal is complex. If this fictitious wave (that has polarization and travels at the speed of sound) hits 3 targets at distances 3.9 m, 4 m, and 10 m away from the speaker/receiver combo, with sonar cross-section (relative scale of how well the target back-reflects the sound wave) of 1, 1.5, and 2 respectively, the received sound wave at the receiving antenna will be a superposition of the reflection of the 3 targets. Because the 1st and the 2nd target round trip distances are close together, the reflected waves interfere, and produce a non-constant magnitude at the receiver, as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEIrbMyl_-uPxy8wAmYDF0HGcmZaTtFWD6fzkBnK4kN-hkcOfu7kN4I_fLYeMfaReDEfv3T9adkNPOdftXla3-H0k0GnUaer67PkcAB0ikM8IbBOp9xqvJQEKvyLfylKgPaMX9bSRMdHvc/s1600/ReturnedWaveMag.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEIrbMyl_-uPxy8wAmYDF0HGcmZaTtFWD6fzkBnK4kN-hkcOfu7kN4I_fLYeMfaReDEfv3T9adkNPOdftXla3-H0k0GnUaer67PkcAB0ikM8IbBOp9xqvJQEKvyLfylKgPaMX9bSRMdHvc/s1600/ReturnedWaveMag.png" /></a></div>
Radar return power falls off as 1/R^4, so the returned wave from the 3rd target is tiny. Also note that each return lasts as long as the transmitted rectangular pulse, so resolving the returned signal into a precise radial distance is challenging--and practically impossible when the returns from nearby targets overlap in time. But in compressed pulse processing, we cross-correlate the transmitted signal with the received signal. Note the conjugation of <span style="font-family: "helveticaneue"; text-align: center;">FFT</span><sub style="font-family: HelveticaNeue; text-align: center;">sr</sub>:<br />
<div style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">
FFT<sub>su</sub> = FFT(s<sub>u</sub>[k]); FFT<sub>sr</sub> = FFT(s<sub>r</sub>[k])</div>
<div style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">
<span style="text-align: start; text-size-adjust: auto;">corr[k] = FFT</span><sup style="text-align: start; text-size-adjust: auto;">-1</sup><span style="text-align: start; text-size-adjust: auto;">(FFT</span><sub style="text-align: start; text-size-adjust: auto;">su </sub><span style="text-align: start; text-size-adjust: auto;">FFT*</span><sub style="text-align: start; text-size-adjust: auto;">sr</sub><span style="text-align: start; text-size-adjust: auto;">) / length(s</span><sub style="text-align: start; text-size-adjust: auto;">r</sub><span style="text-align: start; text-size-adjust: auto;">[k])</span></div>
where <span style="font-family: "helveticaneue"; text-align: center;">s</span><sub style="font-family: HelveticaNeue; text-align: center;">u</sub><span style="font-family: "helveticaneue"; text-align: center;">[k]</span> and <span style="font-family: "helveticaneue"; text-align: center;">s</span><sub style="font-family: HelveticaNeue; text-align: center;">r</sub><span style="font-family: "helveticaneue"; text-align: center;">[k]</span> are the sampled sequences of the transmitted and received signals. The result is magical.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxlWlg_xRWn-Ls1ckrfBs3aHrUxql8L1L3HSdFnKZ7ZWw82ZkBOoJNAGexAhXCxhpJiaAuChfUEcEv5mnkucK4fHxavBtiKBvnZZv7Zq2_r5VrDVTvn-5W7NDt67hUs4qYpvbb9YXSfNFW/s1600/xcorr.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxlWlg_xRWn-Ls1ckrfBs3aHrUxql8L1L3HSdFnKZ7ZWw82ZkBOoJNAGexAhXCxhpJiaAuChfUEcEv5mnkucK4fHxavBtiKBvnZZv7Zq2_r5VrDVTvn-5W7NDt67hUs4qYpvbb9YXSfNFW/s1600/xcorr.png" /></a></div>
The diffuse energy of the pulse is concentrated into the correlation peak, improving the peak signal strength by the compression ratio, which is the time-bandwidth product B<span style="font-family: "helveticaneue"; text-align: center;">𝞽</span><sub style="font-family: helveticaneue; text-align: center;">p</sub>. The correlation peak has a width of roughly c/B, where c is the wave propagation speed. In this fictitious example, c/B works out to 3.3 cm, which is much smaller than the radial distance separation of targets 1 and 2, so the 2 targets appear distinct. No wonder compressed pulse is widely used for radar and sonar. I am not trying to reinvent sonar, though, but rather to estimate the distance of my phone from another device using the mic and speakers available on all modern smartphones.<br />
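To make the cross-correlation step concrete, here is a minimal NumPy sketch of the same experiment. It is not my Matlab script: the speed of sound (343 m/s), the chirp rate (B divided by the pulse duration), the 4096-sample receive buffer, and the choice to ignore range attenuation are all assumptions made for illustration. It also conjugates the transmitted spectrum rather than the received one, so that the peaks land at positive lags.<br />

```python
import numpy as np

fs = 44_100.0            # sample rate [Hz]
c = 343.0                # assumed speed of sound [m/s]
n_pulse = 512            # pulse length in samples
tau_p = n_pulse / fs     # pulse duration [s]
B = 10_000.0             # swept bandwidth [Hz]
f0 = 200.0               # start frequency [Hz]
mu = B / tau_p           # assumed chirp rate [Hz/s]: the sweep covers B within tau_p

# Transmitted complex chirp s_u[k]
t = np.arange(n_pulse) / fs
s_u = np.exp(2j * np.pi * (f0 * t + 0.5 * mu * t**2))

# Three point targets; range attenuation is ignored in this sketch
ranges_m = np.array([3.9, 4.0, 10.0])
gains = np.array([1.0, 1.5, 2.0])
delays = np.round(2.0 * ranges_m / c * fs).astype(int)  # round-trip delay [samples]

n_total = 4096           # receive buffer length (assumed)
s_r = np.zeros(n_total, dtype=complex)
for d, g in zip(delays, gains):
    s_r[d:d + n_pulse] += g * s_u   # superpose the delayed, scaled echoes

# Cross-correlate through the FFT; normalizing by the pulse length
# makes each peak height approximately equal to the target gain
corr = np.fft.ifft(np.fft.fft(s_r) * np.conj(np.fft.fft(s_u, n_total)))
mag = np.abs(corr) / n_pulse

strongest = int(np.argmax(mag))
print("compression ratio B*tau_p:", B * tau_p)
print("peak width c/B [m]:", c / B)
print("strongest return at [m]:", strongest * c / (2.0 * fs))
print("level midway between targets 1 and 2:", mag[(delays[0] + delays[1]) // 2])
```

The level midway between the first two peaks stays well below either peak, which is the sense in which the two nearby targets are resolved.<br />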
<h2>
1 way sonar pulse compression</h2>
If I constrain myself to using the existing HW on a smartphone, the sound wave bandwidth should be limited to roughly B <= 20 kHz. Since the sound wave is a pressure oscillation in the direction of the wave propagation, the sound wave equation is real, which I write as a cosine.<br />
<span style="font-family: "helveticaneue"; text-align: center;"><br /></span>
<span style="font-family: "helveticaneue"; text-align: center;">s</span><sub style="font-family: helveticaneue; text-align: center;">u</sub><span style="font-family: "helveticaneue"; text-align: center;">(t) = A_u Rect(t/<span style="text-align: start; text-size-adjust: auto;">𝞽</span><sub style="text-align: start; text-size-adjust: auto;">p</sub>) cos{ 2𝞹(</span><span style="font-family: "helveticaneue"; text-align: center;">f</span><sub style="font-family: helveticaneue; text-align: center;">0</sub><span style="font-family: "helveticaneue"; text-align: center;"> t + 𝞵</span><span style="font-family: "helveticaneue"; text-align: center;">/2</span><span style="font-family: "helveticaneue"; text-align: center;"> t</span><sup style="font-family: helveticaneue; text-align: center;">2</sup><span style="font-family: "helveticaneue"; text-align: center;">) }</span><br />
<br />
where A_u is the sound amplitude. Because the pulse cannot be too long if it is to keep up with the fast dynamics of a mobile game player, I limit the pulse duration to < 50 ms. This time and bandwidth limit I put on myself is a problem, as I explain below. The transmitted signal in the time and frequency domains is shown below. Note that the magnitude spectrum is now symmetric about DC because the signal is real.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3jWJUaq9VDvUjHhe1_T60A5cKCqKvjxCAUuLBv0GQCfc7aAuFpVNu5V3vDl0C_YPpa14mMkzB4rznZRRDT3ydlvujsAiRdnfrjvTBFMynuXSX-UTStXksFzfkhWO3GHTuCIQuwoez3xyN/s1600/sound+FT.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3jWJUaq9VDvUjHhe1_T60A5cKCqKvjxCAUuLBv0GQCfc7aAuFpVNu5V3vDl0C_YPpa14mMkzB4rznZRRDT3ydlvujsAiRdnfrjvTBFMynuXSX-UTStXksFzfkhWO3GHTuCIQuwoez3xyN/s640/sound+FT.png" width="640" /></a></div>
In an indoor gaming environment, this sound has to compete with ambient sound, such as people talking, TV noise, and game music itself. I found that I had to keep the transmitted pulse fairly strong against the ambient, to detect pulses coming from roughly 10 m away. Even at 10x the ambient sound level, it was difficult to detect a putative source 10 m away (but closer targets are no problem), as you can see in the cross correlation magnitude below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj69C6QyTBAJPT8f9ZZwL0xLqZWpeslrzZE7sf17L6zBEwS-XexL1pJsA4sC3og-2eKmOfNCbrmnft3PkjfA6N56zsEZE3-c9cSJ_Wb73aiW7o0N0SY20CIBtorbVTBU_CLy0Nmet593aQW/s1600/sound+xcorr.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj69C6QyTBAJPT8f9ZZwL0xLqZWpeslrzZE7sf17L6zBEwS-XexL1pJsA4sC3og-2eKmOfNCbrmnft3PkjfA6N56zsEZE3-c9cSJ_Wb73aiW7o0N0SY20CIBtorbVTBU_CLy0Nmet593aQW/s1600/sound+xcorr.png" /></a></div>
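The one-way case can be sketched in the same way. The snippet below is an illustration under several assumptions that are not from my Matlab script: a 2048-sample pulse (about 46 ms), a 200 Hz to 20.2 kHz sweep, white Gaussian noise standing in for the ambient sound, a source 5 m away, and a transmitter and receiver that share a common clock so that the emission time is known.<br />

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
fs = 44_100.0                    # sample rate [Hz]
c = 343.0                        # assumed speed of sound [m/s]
n_pulse = 2048                   # pulse length in samples, about 46 ms
tau_p = n_pulse / fs
f0 = 200.0                       # start frequency [Hz]
B = 20_000.0                     # swept bandwidth [Hz]
mu = B / tau_p                   # assumed chirp rate [Hz/s]

# Real (cosine) chirp, as emitted by the speaker
t = np.arange(n_pulse) / fs
tx = np.cos(2.0 * np.pi * (f0 * t + 0.5 * mu * t**2))

# One-way propagation: the delay is distance / c, not 2 * distance / c
distance_m = 5.0
delay = int(round(distance_m / c * fs))

n_total = 8192
rx = rng.normal(0.0, 1.0, n_total)   # white noise stand-in for ambient sound
rx[delay:delay + n_pulse] += tx      # pulse amplitude comparable to the ambient level

# Cross-correlate with the known pulse through the real FFT
spectrum = np.fft.rfft(rx) * np.conj(np.fft.rfft(tx, n_total))
corr = np.fft.irfft(spectrum, n_total)

peak = int(np.argmax(np.abs(corr)))
estimated_range_m = peak * c / fs
print("estimated one-way range [m]:", estimated_range_m)
```

Real ambient sound is far from white, so this sketch is likely more optimistic than what a phone would see in a living room.<br />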
<br />
<h2>
Conclusion: borderline practical</h2>
<div>
For distances < 5 m, the xcorr SNR seems strong enough for a fairly robust distance measurement even with the signal volume turned all the way down to the ambient sound level (therefore borderline unnoticeable), as you can see below.</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOMi7cYeG-i9DImm9mMwg3FyfqfeGcYwXNAZrKIQdX5qB-AfIS-ACPsf8-ZCF86xSy5zqx1yhKpfp-cKmrs41XJY5IUCKYvqM4WdOVczQ_uqMfU6OdDH503HY2fSW-yc2a3gYQcp0C0eJ-/s1600/sound+xcorr.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOMi7cYeG-i9DImm9mMwg3FyfqfeGcYwXNAZrKIQdX5qB-AfIS-ACPsf8-ZCF86xSy5zqx1yhKpfp-cKmrs41XJY5IUCKYvqM4WdOVczQ_uqMfU6OdDH503HY2fSW-yc2a3gYQcp0C0eJ-/s1600/sound+xcorr.png" /></a></div>
<div>
5 m sounds too small a work volume to be useful for a dynamic peer-to-peer shooting game. But it is plenty large for a more stationary scenario like playing with Lego blocks. I am going to try other external means of estimating attitude and position first.</div>
<div>
<br /></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-36164272358441454892017-02-06T22:47:00.002-08:002021-01-13T13:27:39.340-08:00IMU based 6DOF estimation in UnityIn my previous blog posts I <a href="http://henryomd.blogspot.com/2017/02/coarse-roll-and-pitch-initialization.html">initialized tip and tilt attitude estimate with only the Input.acceleration Unity offers</a>, and then <a href="http://henryomd.blogspot.com/2017/02/attitude-update-in-unity-using-inputgyro.html">integrated Input.gyro.rotationRateUnbiased to get the current attitude</a>. The 2nd post showed the attitude bias observable after only a few seconds of game play. In this post, I show that position update using just the MEMS accel and gyro is an even bigger problem.<br />
<h2>
Specific force, gravity, and acceleration</h2>
The core idea of strap-down IMU (where the IMU sensor moves and rotates with the body for which the estimate is sought) based position update is to double-integrate the sensed acceleration. Paul Groves explains the necessary equations in his GNSS 2nd ed. Chapter 5, but I had a tough time understanding how to take out the gravity sensed by the accelerometer. The technical term he uses for what the accelerometer is sensing is <i>specific force</i>, first explained in Section 2.4.7, as the difference between the true acceleration (what is sought) and the gravitational acceleration the body is subject to (which CANNOT be sensed by the IMU):<br />
<div style="text-align: center;">
<u>f</u> = <u>a</u> - <u>𝞬</u></div>
where <u>f</u> is the specific force vector, <u>a</u> is the <span style="text-align: center;">acceleration vector, and </span><span style="text-align: center;"><u>𝞬</u> (gamma) </span>is the gravitational acceleration vector.<br />
<br />
I could not understand this explanation because of a few observations I made with Unity's Input.acceleration while waving my phone about:<br />
<ol>
<li>When stationary, the sensed acceleration is downward. For example, [0; 0; -1] when the phone is lying screen-up on my desk. Does it mean that the 𝞬 is UP ([0; 0; 1])?</li>
<li>He gave the example that the specific force felt by a passenger in an elevator accelerating up is greater than 1. But since <u>a</u> > 0 in this example, f should be smaller than in the stationary case!</li>
</ol>
The crystallization moment came while reading a book on the construction of MEMS accelerometer, which can be modeled as a proof mass constrained by a pair of springs, like this:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfinTcPEkYb3IeRfOoNCua4JMmqGeP5h4_7m4dk_F8C-fjbiub_xqln2edLA51B7fvWI_D_dLw04yRMKBg_JOyTbqjKJ8VCm52Mq03-lgZPytzSEBNLoBWj9aiBWiAizUDGlp0g6Jx2OIa/s1600/Screen+Shot+2017-02-05+at+10.02.43+PM.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="193" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfinTcPEkYb3IeRfOoNCua4JMmqGeP5h4_7m4dk_F8C-fjbiub_xqln2edLA51B7fvWI_D_dLw04yRMKBg_JOyTbqjKJ8VCm52Mq03-lgZPytzSEBNLoBWj9aiBWiAizUDGlp0g6Jx2OIa/s320/Screen+Shot+2017-02-05+at+10.02.43+PM.png" width="320" /></a></div>
Whether of open-loop or (more likely) closed-loop design (with force compensation electronics), MEMS accelerometers are <i>inertial</i> devices (hence the name IMU): the proof mass doesn't (at first) move while the case moves; hence the accelerometer senses the displacement (or force--in the case of the closed loop design) in the <i>opposite direction</i> of the case's movement. The stationary case can be understood by imagining the bottom spring being compressed due to gravity pulling down the proof mass: 1 unit of the bottom spring being compressed is what is being reported as the "-1" acceleration by Input.acceleration! If I put the above accelerometer in an upward accelerating elevator, the bottom spring will compress even more, and I will get an even more negative value from the accelerometer.<br />
<br />
Based on this observation, I concluded that the accelerometer is sensing <i>reaction</i> to the force on the body, rather than the force directly. I confirmed my hypothesis by printing the Input.acceleration values while accelerating my phone in the x/y/z directions, and finally the fundamental force equation made sense:<br />
<div style="text-align: center;">
<span style="font-family: "helveticaneue"; text-size-adjust: auto;"><u><b>a</b></u></span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">lb</sub><span style="font-family: "helveticaneue"; text-size-adjust: auto;"> = G [ <u><b>q</b></u></span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sub><span style="font-family: "helveticaneue"; text-size-adjust: auto;"> × <span style="color: red;">−</span><b><u>accel</u></b></span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sub><span style="font-family: "helveticaneue"; text-size-adjust: auto;">𝞵 + <b><u>𝞬</u></b></span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sub><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sub><span style="font-family: "helveticaneue"; text-size-adjust: auto;"> </span><span style="font-family: "helveticaneue";">]</span></div>
where <span style="font-family: "helveticaneue"; text-align: center;">q</span><sup style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sup><sub style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sub> is the average attitude of the local frame relative to the body rather than the other way around, and <span style="font-family: "helveticaneue"; text-align: center;">accel</span><sup style="font-family: helveticaneue; text-align: center;">b</sup><sub style="font-family: helveticaneue; text-align: center;">l</sub><span style="font-family: "helveticaneue"; text-align: center;">𝞵</span> is the average inertial acceleration (which is <i>equal and opposite</i> to the specific force the body is subjected to) during the most recent sample period. Once I rotate the specific force from the body frame to the local frame, I can then add the gravitational acceleration <i>estimate</i> to it, to get the estimated acceleration of the body in the local frame--scaled by the Earth's gravity constant.<br />
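The sign handling in this equation is easy to check numerically. The sketch below is not my Unity code: it assumes a right-handed local frame with z up, a quaternion stored as [w, x, y, z] that rotates body-frame vectors into the local frame, and accelerometer readings in units of g. The helper names <i>rotate</i> and <i>body_acceleration</i> are made up for this illustration.<br />

```python
import numpy as np

G = 9.80665  # standard gravity [m/s^2]

def rotate(q, v):
    """Rotate vector v by the unit quaternion q = [w, x, y, z]."""
    w, u = q[0], q[1:]
    return v + 2.0 * w * np.cross(u, v) + 2.0 * np.cross(u, np.cross(u, v))

def body_acceleration(q_body_to_local, accel_reading_g, gamma_local_g):
    """a = G * [ rotate(q, -accel) + gamma ], with the reading and gamma in units of g."""
    specific_force_local = rotate(q_body_to_local, -accel_reading_g)
    return G * (specific_force_local + gamma_local_g)

gamma = np.array([0.0, 0.0, -1.0])          # gravity points down in the local frame
identity = np.array([1.0, 0.0, 0.0, 0.0])   # body frame aligned with the local frame

# Phone lying screen-up on the desk: the reading is [0, 0, -1]
a_rest = body_acceleration(identity, np.array([0.0, 0.0, -1.0]), gamma)

# Same attitude in an elevator accelerating upward at 0.2 g: the reading is [0, 0, -1.2]
a_lift = body_acceleration(identity, np.array([0.0, 0.0, -1.2]), gamma)

# Stationary phone rotated +90 degrees about x: the body y axis points up
half = np.sqrt(0.5)
q_x90 = np.array([half, half, 0.0, 0.0])
a_tilted = body_acceleration(q_x90, np.array([0.0, -1.0, 0.0]), gamma)

print(a_rest, a_lift, a_tilted)
```

In both stationary cases the rotated specific force cancels the gravity estimate, and the elevator case leaves only the 0.2 g upward acceleration.<br />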
<br />
The above equation for the body's acceleration is incomplete: even though the local tangent frame is stationary relative to the ECEF (Earth centered Earth fixed) frame, since ECEF rotates relative to the ECI (Earth centered inertial) frame, the body experiences:<br />
<ul>
<li>Centrifugal acceleration (greatest at the Equator, in the +Z direction, but zero at the Poles). This is about 0.3% of the gravitational acceleration, or 3 mg. Anywhere except at the Equator and the Poles, the centrifugal force has horizontal components in the local frame. But breaking down that horizontal vector into x and y requires an estimate of the local frame's True North heading as well as the current latitude.</li>
<li>Coriolis acceleration proportional to the velocity of the body in the direction toward Earth's axis of rotation. This also requires the heading and latitude.</li>
</ul>
Since I do NOT want to care about the player's location, I decided to ignore the above 2 pseudo forces, feeling justified because:<br />
<br />
<ul>
<li>3 mg is on the order of accelerometer bias confidence interval of off-the-shelf MEMS accelerometers--which means that when you reboot your phone, the bias might change that much anyway, so that the factory calibration can leave this much residual.</li>
<li>I expect the player to move at a speed on the order of 1 m/s. When I multiply this by the Earth's rotation speed 72.7 urad/s, the maximum Coriolis acceleration is on the order of 0.1E-3 m/s^2, or roughly 0.01 mg--which is BURIED under the noise floor.</li>
</ul>
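The two magnitudes above can be reproduced with a few lines of arithmetic. This is only an order-of-magnitude check: the equatorial radius of 6378 km and the factor of 2 in the Coriolis term (the usual 2 x rotation rate x speed upper bound) are values I am assuming here, and the variable names are made up.<br />

```python
EARTH_RATE = 72.7e-6        # Earth's rotation rate [rad/s], as quoted above
EQUATOR_RADIUS = 6.378e6    # assumed equatorial radius [m]
G = 9.80665                 # standard gravity [m/s^2]
player_speed = 1.0          # assumed player speed [m/s]

# Centrifugal acceleration at the Equator: omega^2 * R
centrifugal = EARTH_RATE**2 * EQUATOR_RADIUS
centrifugal_mg = centrifugal / G * 1e3

# Upper bound on the Coriolis acceleration: 2 * omega * v
coriolis = 2.0 * EARTH_RATE * player_speed
coriolis_mg = coriolis / G * 1e3

print(f"centrifugal: {centrifugal:.4f} m/s^2 = {centrifugal_mg:.2f} mg")
print(f"coriolis:    {coriolis:.2e} m/s^2 = {coriolis_mg:.3f} mg")
```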
A word of caution on converting the Input.acceleration into specific force: Input.acceleration is in the RIGHT handed IMU frame while the Unity body frame is in a LEFT handed system, as shown in my <a href="http://henryomd.blogspot.com/2017/02/coarse-roll-and-pitch-initialization.html">previous blog post</a>, so only the x and y axis components should flip sign. As for the gravitational acceleration estimate, precision navigation systems go through a great deal of trouble to get an accurate gravity estimate, using the world gravity model almanac and some even using expensive gravimeters. I decided to just use the magnitude of the average acceleration measured during the "hold still" period as the magnitude of the gravitational acceleration.<br />
<h2>
Navigation update equations</h2>
<h3>
Average attitude</h3>
<div>
In my<a href="http://henryomd.blogspot.com/2017/02/attitude-update-in-unity-using-inputgyro.html"> last blog post</a>, I showed discrete attitude update in Unity using the Input.gyro sensor input--using the quantity<span style="font-family: inherit;"> <span style="background-color: white; color: #333333; text-align: justify;">attitude increment</span></span><span face=""helvetica neue light" , , "helvetica neue" , "helvetica" , "arial" , sans-serif" style="background-color: white; color: #333333; font-size: 14px; text-align: justify;"> </span><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: justify;"><i><u>α</u></i></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: justify; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: justify; text-size-adjust: auto;">ib</sub>. If I assume that the angular rate is relatively constant during the sample period, I can do exactly the same calculation for the average attitude during the last sample period by forming a relative rotation quaternion using HALF of the attitude increment. 
I considered briefly whether to just use Unity's Quaternion.Slerp() function to spherically interpolate between <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><b><u>q</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(-) </span>and <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><b><u>q</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(+)</span>, but decided I could do better by taking advantage of the fact that the midpoint only requires splitting the attitude increment in half, rather than iterating (as SLERP seemed to).</div>
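The half-increment idea can be sketched outside Unity as follows. The quaternion layout [w, x, y, z], the right-multiplication of the body-frame increment, and the sample values for the angular rate and the previous attitude are all assumptions for this illustration, and the helper names are made up.<br />

```python
import numpy as np

def quat_from_rotvec(alpha):
    """Unit quaternion [w, x, y, z] for a rotation vector alpha [rad]."""
    angle = np.linalg.norm(alpha)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = alpha / angle
    return np.concatenate(([np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis))

def quat_mul(p, q):
    """Hamilton product p * q."""
    pw, pv = p[0], p[1:]
    qw, qv = q[0], q[1:]
    w = pw * qw - pv @ qv
    v = pw * qv + qw * pv + np.cross(pv, qv)
    return np.concatenate(([w], v))

omega = np.array([0.1, -0.2, 0.3])   # assumed constant angular rate [rad/s]
Ts = 0.02                            # assumed sample period [s]
alpha = omega * Ts                   # attitude increment over one sample

q_prev = quat_from_rotvec(np.array([0.0, 0.0, 0.5]))      # attitude before the update
q_next = quat_mul(q_prev, quat_from_rotvec(alpha))        # full update
q_avg = quat_mul(q_prev, quat_from_rotvec(0.5 * alpha))   # average attitude (half increment)

# Applying the half increment a second time should reproduce the full update
q_check = quat_mul(q_avg, quat_from_rotvec(0.5 * alpha))
print(np.allclose(q_check, q_next))
```

Under the constant-rate assumption the two half steps share a rotation axis, which is why they compose exactly into the full step.<br />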
<h3>
Velocity update</h3>
<div>
Once <span style="font-family: "helveticaneue"; text-align: center;">a</span><sup style="font-family: helveticaneue; text-align: center;">l</sup><sub style="font-family: helveticaneue; text-align: center;">lb</sub>, the average acceleration during the last sample period is obtained, the velocity estimate is just a rectangular integral of <span style="font-family: "helveticaneue"; text-align: center;">a</span><sup style="font-family: helveticaneue; text-align: center;">l</sup><sub style="font-family: helveticaneue; text-align: center;">lb</sub> with the sample period Ts: <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><b><u>v</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(+)</span><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;"> </sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">~ <b><u>v</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(-) + <span style="color: black; font-family: "helveticaneue"; font-size: small;"><b><u>a</u></b></span><sup style="color: black; font-family: helveticaneue;">l</sup><sub style="color: black; font-family: 
helveticaneue;">lb</sub> </span><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">Ts</span></div>
<h3>
Average velocity</h3>
<div>
To integrate the velocity by sample period for position change, the average velocity is estimated as a simple average of the previous and the updated values: <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><span style="text-size-adjust: auto;"><b><u>v</u></b></span><sup style="text-size-adjust: auto;">b</sup><sub style="text-size-adjust: auto;">l</sub><span style="text-size-adjust: auto;">(avg) = </span></span><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">( </span><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><b><u>v</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(+)</span><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;"> </sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">+ <b><u>v</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(-) </span><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">) / 2</span></div>
<h3>
Position update</h3>
Finally, the position is integrated from the initial position (I assumed 0 for now): <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><b><u>r</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(+)</span><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;"> </sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">~ <b><u>r</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">(-) + <span style="text-size-adjust: auto;"><b><u>v</u></b></span><sup style="text-size-adjust: auto;">b</sup><sub style="text-size-adjust: auto;">l</sub><span style="text-size-adjust: auto;">(avg)</span> </span><span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;">Ts</span><br />
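The three update equations fit in a few lines, which also makes it easy to see how quickly a small bias grows. The sketch below is not my Unity code: it assumes a 100 Hz sample rate and feeds the integrator a constant 3 mg residual bias on one axis for 10 minutes, with no true motion. The function name <i>nav_update</i> is made up.<br />

```python
import numpy as np

G = 9.80665  # standard gravity [m/s^2]

def nav_update(r, v, a, Ts):
    """One step of the velocity, average velocity, and position updates."""
    v_new = v + a * Ts            # rectangular integration of the acceleration
    v_avg = 0.5 * (v_new + v)     # simple average of the old and new velocity
    r_new = r + v_avg * Ts        # position integrates the average velocity
    return r_new, v_new

Ts = 0.01                                   # assumed sample period [s]
n_steps = 60_000                            # 10 minutes at 100 Hz
bias = np.array([3.0e-3 * G, 0.0, 0.0])     # assumed 3 mg residual accelerometer bias

r = np.zeros(3)
v = np.zeros(3)
for _ in range(n_steps):
    r, v = nav_update(r, v, bias, Ts)

elapsed = n_steps * Ts
print("velocity error [m/s]:", v[0])
print("position error [m]:", r[0])
print("0.5 * bias * t^2 [m]:", 0.5 * bias[0] * elapsed**2)
```

For a constant bias the position error follows 0.5 x bias x t^2, so it grows with the square of the elapsed time.<br />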
<h2>
Unity implementation</h2>
My Unity implementation (<a href="https://github.com/henrychoi/OpenCVUnity">my github repo</a>) is mostly straight-forward--except for the fact I do NOT invert <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><u><b>q</b></u></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub> (which is the current estimate of the body relative to the local frame) to obtain <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><b><u>q</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sub>. I did at first, and found that rotating the specific acceleration by <span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><b><u>q</u></b></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sub> using Unity's Quaternion.operator*(Quaternion, Vector3) method rotated the vector to the <i>opposite</i> direction of rotation. 
In other words, Quaternion.operator*(<span style="background-color: white; color: #333333; font-family: "helveticaneue"; font-size: 14px; text-align: center;"><u><b>q</b></u></span><sup style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="background-color: white; color: #333333; font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">l</sub> <span style="font-family: "helveticaneue"; text-align: center;">, -<b><u>accel</u></b></span><sup style="font-family: helveticaneue; text-align: center;">b</sup><sub style="font-family: helveticaneue; text-align: center;">l</sub><span style="font-family: "helveticaneue"; text-align: center;">𝞵) </span>yielded the correct result!<br />
<br />
Even before I tried this implementation, I knew that double-integrating the output of a cheap MEMS accelerometer would send my player "going to the moon"--as I've heard the experts describe the problem--pretty quickly. But knowing it in your head is one thing; actually seeing it is quite another. See for yourself how bad it is.<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dwzf1LyD5a7mxrm-r5_jYV6Wu-zixfA6BNuvW7qFtu3gun6hFEg_GsPkpqkOmVc17Psl-iooN6MahPqL16iIA' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
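To put a rough number on this kind of drift, here is a back-of-the-envelope sketch in plain C# (no Unity dependency). It assumes a constant, uncorrected accelerometer bias, which is a simplification of real MEMS behavior; the bias values and the DriftEstimate name are illustrative assumptions, not measurements from the phone in the video.<br />

```csharp
using System;

static class DriftEstimate
{
    // Position error from double-integrating a constant acceleration bias:
    // error = 0.5 * bias * t^2
    public static double PositionError(double biasMps2, double seconds)
        => 0.5 * biasMps2 * seconds * seconds;

    static void Main()
    {
        double[] biases = { 0.01, 0.05, 0.1 }; // m/s^2, assumed values
        double[] times = { 10, 60, 600 };      // s; 600 s is the 10 minute game
        foreach (double b in biases)
            foreach (double t in times)
                Console.WriteLine(
                    $"bias={b} m/s^2, t={t} s -> error={PositionError(b, t):F1} m");
    }
}
```

Because the error grows with the square of time, even the smallest assumed bias (0.01 m/s^2) works out to 0.5 * 0.01 * 600^2 = 1800 m after 10 minutes.<br />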
<h2>
Now what?</h2>
Clearly, this solution is unfit for the intended purpose: to play a game lasting up to 10 minutes. I have an idea to reduce the navigation solution error by fusing a sound-echo-based pseudo-range estimate with the INS solution just given. But the equations I've derived so far are too complicated to present in this post, so let me see if I can pull it off first.<br />
<br />Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-48711149591284779722017-02-05T15:27:00.002-08:002017-02-05T15:27:20.447-08:00Attitude update in Unity using Input.gyroIn my <a href="http://henryomd.blogspot.com/2017/02/coarse-roll-and-pitch-initialization.html">previous blog post</a>, I initialized the tip and tilt of phone in Unity using Input.acceleration. The next step is to measure the relative attitude change from the initial attitude using Unity's gyro input. Sure you can use Input.gyro.attitude to use the OS or Unity's estimate of the device attitude, but I want to learn inertial navigation equations, so I am going to trudge along, with only <a href="https://smile.amazon.com/Principles-Inertial-Multisensor-Integrated-Navigation/dp/1608070050/ref=sr_1_1?ie=UTF8&qid=1486325599&sr=8-1&keywords=Paul+Groves+GNSS">Paul Groves' GNSS 2nd edition</a> to guide me.<br />
<h2>
The rotation rate in Unity player's body frame</h2>
The first order of business is to draw the gyro triad (which is reported in the RIGHT handed IMU frame) in the Unity PlayerController's (LEFT handed) body frame, as I've done in the previous blog entry, whereupon you should notice that the X and Y components should change sign. Therefore, the angular rate in the player's body frame should be expressed like this:<br />
<br />
<span style="font-family: Menlo;">
<span style="color: #3364a4;">Vector3</span><span style="color: #333333;"> w_inb_lb = </span><span style="color: #888888; font-style: italic;">//Input.gyro.rotationRate // in IMU frame (RH!)</span><br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">Input</span><span style="color: #333333;">.</span><span style="color: #333333;">gyro</span><span style="color: #333333;">.</span><span style="color: #333333;">rotationRateUnbiased</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">;</span></span>
<br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #333333;">w_inb_lb</span><span style="color: #333333;">.</span><span style="color: #333333;">Set</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #333333;">w_inb_lb</span><span style="color: #333333;">.</span><span style="color: #333333;">x</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">-</span><span style="color: #333333;">w_inb_lb</span><span style="color: #333333;">.</span><span style="color: #333333;">y</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">w_inb_lb</span><span style="color: #333333;">.</span><span style="color: #333333;">z</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span>
<br />
<br />
<h2>
First order attitude update equation</h2>
The core idea of the attitude update is to rotate the current attitude (however it is expressed--quaternion, direction cosine matrix, rotation vector and angle, etc.) by the INCREMENTAL rotation from the previous update to the current update. It is mathematically captured in the attitude differential equation, as in GNSS 2nd edition Appendix E, equation E.33. A first order approximate discrete time implementation of that differential equation in the Earth-centered inertial (ECI) frame is<br />
<div style="text-align: center;">
<span style="font-family: HelveticaNeue; text-size-adjust: auto;">q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">i</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">(+)</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;"> </sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">~ q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">i</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">(-) <span style="font-size: 12px; text-align: start; text-size-adjust: auto;">×</span> Quaternion(</span><b style="font-family: HelveticaNeue; text-size-adjust: auto;"><i>α</i></b><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> / 2, 1)</span></div>
<br />
where the attitude increment <b style="font-family: HelveticaNeue; text-size-adjust: auto;"><i>α</i></b><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub> is approximated as a product of the average angular rate sensed in the last interval and the sampling time Ts.<br />
<div style="text-align: center;">
<b style="font-family: HelveticaNeue; text-size-adjust: auto;"><i>α</i></b><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><sub style="font-family: HelveticaNeue; text-size-adjust: auto;"> </sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">~ 𝟂</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> Ts</span></div>
<br />
But since the local tangent frame is located at latitude <b><i>L</i></b> and yawed at an angle from the longitudinal line (the one that goes from the South Pole to the North Pole) and rotates with Earth, there will be an extra term for the quaternion from the local tangent frame to the body frame:<br />
<div style="text-align: center;">
<span style="font-family: HelveticaNeue; text-size-adjust: auto;">q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">(+)</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;"> </sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">~ q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">(-) × Quaternion(</span><b style="font-family: HelveticaNeue; text-size-adjust: auto;"><i>α</i></b><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">lb</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> / 2, 1) - Quaternion(𝟂</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ie </sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">Ts/2, 0) × q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">(-)</span></div>
<br />
where the Earth's rotation vector is resolved into the local tangent frame using the latitude <b><i>L</i></b> and the yaw angle <b>𝟁</b><sub>el</sub> of the local frame from the longitudinal line:<br />
<br />
<div class="p1" style="text-align: center;">
<span style="font-family: HelveticaNeue; text-size-adjust: auto;">𝟂</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">l</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ie </sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">= |𝟂</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ie</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">|</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;"> </sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">[ cos</span><b style="font-family: HelveticaNeue; font-style: italic; text-size-adjust: auto;">L </b><span style="font-family: HelveticaNeue; text-size-adjust: auto;">cos</span><b style="font-family: HelveticaNeue; text-size-adjust: auto;">𝟁</b><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">el</sub><b style="font-family: HelveticaNeue; font-style: italic; text-size-adjust: auto;"> </b><span style="font-family: HelveticaNeue; text-size-adjust: auto;">; cos</span><b style="font-family: HelveticaNeue; text-size-adjust: auto;"><i>L </i></b><span style="font-family: HelveticaNeue; text-size-adjust: auto;">sin</span><b style="font-family: HelveticaNeue; text-size-adjust: auto;">𝟁</b><span style="font-family: HelveticaNeue; text-size-adjust: auto;"><sub>el</sub></span><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> </span><span style="font-family: HelveticaNeue; text-size-adjust: auto;">; sin</span><b style="font-family: HelveticaNeue; text-size-adjust: auto;"><i>L </i></b><span style="font-family: HelveticaNeue; text-size-adjust: auto;">]</span></div>
<br />
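The resolution of the Earth rate into the local tangent frame can be sketched in plain C# using System.Numerics rather than Unity types. The latitude and yaw values below are assumed examples, the EarthRate and InLocalFrame names are hypothetical, and the sketch simply evaluates the expression above without addressing Unity's left-handed axis conventions.<br />

```csharp
using System;
using System.Numerics;

static class EarthRate
{
    // Magnitude used in this post: one revolution per 24 h.
    const double OmegaIe = 2.0 * Math.PI / (24.0 * 3600.0); // rad/s, about 72.7e-6

    // Earth's rotation vector resolved in the local tangent frame.
    // latitude and psi are in radians.
    public static Vector3 InLocalFrame(double latitude, double psi)
    {
        return new Vector3(
            (float)(OmegaIe * Math.Cos(latitude) * Math.Cos(psi)),
            (float)(OmegaIe * Math.Cos(latitude) * Math.Sin(psi)),
            (float)(OmegaIe * Math.Sin(latitude)));
    }

    static void Main()
    {
        double lat = 37.0 * Math.PI / 180.0; // assumed latitude
        double psi = 0.0;                    // assumed: no yaw from the meridian
        Vector3 w = InLocalFrame(lat, psi);
        Console.WriteLine($"w_ie^l = {w} rad/s");

        // Total Earth rotation accumulated over a 10 minute game.
        double angleDeg = OmegaIe * 600.0 * 180.0 / Math.PI;
        Console.WriteLine($"Earth rotation over 600 s: {angleDeg:F2} deg");
    }
}
```

At 15 degrees per hour, the Earth turns about 2.5 degrees over 10 minutes, which is the size of the term in question.<br />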
Figuring out the phone's ROUGH lat/long is not that onerous, but when I consider the magnitude of the Earth's rotation rate <span style="font-family: HelveticaNeue; font-size: 12px; text-size-adjust: auto;">|𝟂</span><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ie</sub><span style="font-family: HelveticaNeue; font-size: 12px; text-size-adjust: auto;">|</span> (2<span style="background-color: white; color: #333333; font-family: "courier new", courier, monospace; font-size: 14px; text-align: justify;">𝞹</span> / 24 hr = 72.7 µrad/s), I feel safe ignoring this term for a game that should last on the order of 10 minutes. So I came up with the following Unity code:<br />
<div>
<br /></div>
<span style="font-family: Menlo;"><span style="color: #333333;"></span><span style="color: #3364a4;">Vector3</span><span style="color: #333333;"> </span><span style="color: #333333;">alpha</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">0.5f</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Time</span><span style="color: #333333;">.</span><span style="color: #333333;">deltaTime</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #333333;">w_inb_lb</span><span style="color: #333333;">;</span></span>
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #3364a4;">Quaternion </span></span><span style="font-family: Menlo;"><span style="color: #333333;">q_inc_lb = new </span></span><span style="color: #3364a4; font-family: Menlo;">Quaternion</span><span style="color: #333333; font-family: Menlo;">(</span><span style="color: #333333; font-family: Menlo;">alpha</span><span style="color: #333333; font-family: Menlo;">.</span><span style="color: #333333; font-family: Menlo;">x</span><span style="color: #333333; font-family: Menlo;">,</span><span style="color: #333333; font-family: Menlo;"> </span><span style="color: #333333; font-family: Menlo;">alpha</span><span style="color: #333333; font-family: Menlo;">.</span><span style="color: #333333; font-family: Menlo;">y</span><span style="color: #333333; font-family: Menlo;">,</span><span style="color: #333333; font-family: Menlo;"> </span><span style="color: #333333; font-family: Menlo;">alpha</span><span style="color: #333333; font-family: Menlo;">.</span><span style="color: #333333; font-family: Menlo;">z</span><span style="color: #333333; font-family: Menlo;">,</span><span style="color: #333333; font-family: Menlo;"> </span><span style="color: #f57d00; font-family: Menlo;">1</span><span style="color: #333333; font-family: Menlo;">)</span><span style="color: #333333; font-family: Menlo;">;</span><br />
<span style="color: #333333; font-family: Menlo;"><br /></span>
<span style="font-family: Menlo;"><span style="color: #333333;">q_lb</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">q_lb</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #333333;">q_inc_lb</span><span style="color: #333333;">; //Unity quaternion multiplies LEFT -> RIGHT</span></span><br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #333333;">q_Ub</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">Q_Ul</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #333333;">q_lb</span><span style="color: #333333;">;</span><span style="color: #333333;"> </span><span style="color: #888888; font-style: italic;">//</span><span style="color: #888888; font-style: italic;">Rotate</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">the</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">q_lb</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">by</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">Q_lU</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">to</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">go</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">to</span><span style="color: #888888; font-style: italic;"> Unity</span></span>
<br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #333333;">Normalize</span><span style="color: #333333;">(</span><span style="color: #009695;">ref</span><span style="color: #333333;"> </span><span style="color: #333333;">q_Ub</span><span style="color: #333333;">); //Normalizing quaternion is less onerous than orthogonalizing a DCM</span></span><br />
<span style="font-family: Menlo;">
<span style="color: #333333;"></span><span style="color: #333333;">myRigidbody</span><span style="color: #333333;">.</span><span style="color: #333333;">MoveRotation</span><span style="color: #333333;">(</span><span style="color: #333333;">q_Ub</span><span style="color: #333333;">)</span><span style="color: #333333;">;//Move the player to the est. attitude</span></span><br />
<br />
You can find this source code in my github repo: https://github.com/henrychoi/OpenCVUnity<br />
<br />
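For readers without a Unity project at hand, the first order update can also be sketched as a standalone C# program. This is only an illustration under stated assumptions: it uses System.Numerics instead of UnityEngine, Quaternion.Normalize stands in for the Normalize(ref q_Ub) helper (whose body is not shown in this post), and the rotation rate and 60 Hz step are made-up test values. It skips the Q_Ul frame change entirely.<br />

```csharp
using System;
using System.Numerics;

static class FirstOrderAttitude
{
    // One first order step: q(+) ~ q(-) * Quaternion(alpha / 2, 1),
    // followed by renormalization to keep q a unit quaternion.
    public static Quaternion Step(Quaternion q, Vector3 bodyRate, float dt)
    {
        Vector3 halfAlpha = (0.5f * dt) * bodyRate;
        var qInc = new Quaternion(halfAlpha.X, halfAlpha.Y, halfAlpha.Z, 1f);
        return Quaternion.Normalize(q * qInc);
    }

    static void Main()
    {
        var rate = new Vector3(0f, 0f, 1f); // assumed: 1 rad/s about body z
        float dt = 1f / 60f;                // assumed: 60 Hz update
        Quaternion q = Quaternion.Identity;
        for (int i = 0; i < 60; i++)
            q = Step(q, rate, dt);

        // For a constant rate the exact answer is a 1 rad rotation about z.
        Quaternion exact = Quaternion.CreateFromAxisAngle(Vector3.UnitZ, 1f);
        Console.WriteLine($"first order: {q}");
        Console.WriteLine($"exact:       {exact}");
    }
}
```

With perfect, noise-free rate input like this, the two printed quaternions should agree closely, which makes it a handy baseline for separating algorithm error from sensor error.<br />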
The following video is the result of the above algorithm in the Unity Remote (which relays the phone's Input.gyro updates to the Unity editor--and Unity editor plays the video in the Unity Remote screen). I start the session by initializing the attitude while the phone is stationary on my desktop vise (yes I have a desktop vise), and then take it out of the vise and rotate it around for like 30 seconds before putting it back in the vise. To help visualize the current attitude, I show a chase camera view from right behind and slightly above the player (upper left corner), and also a target ("ENEMY") located straight ahead. When the player's forward vector (0, 0, 1) lands on the target, the target lights up red.<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dzYmC9O0S4KtRKlxo8aoSGsKegqD6PVkyDBCEFW6sROP_sPqww-XEVpPXVE7dJphyYVbucW2Y5rA7r2MP7lAQ' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>
Clearly, the attitude does not come back to the starting value after this brief excursion, so I cannot build a playable game with this algorithm. I suspect that the problem is the high bias and noise of the MEMS gyro in my phone, but how much improvement can a higher order attitude update equation bring?<br />
<h2>
Higher order attitude increment</h2>
<div>
According to Groves GNSS 2nd ed. Appendix E, section 6.3, a more precise attitude increment than the <span style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">Quaternion(</span><b style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;"><i>α</i></b><sup style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;"> / 2, 1) </span>is</div>
<div style="text-align: center;">
<span style="font-family: HelveticaNeue; text-size-adjust: auto;">q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b+</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">b-</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> = Quaternion( </span><i style="font-family: HelveticaNeue; text-size-adjust: auto;">α</i><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> </span><span style="font-family: HelveticaNeue; text-size-adjust: auto;">sin(</span><span style="font-family: HelveticaNeue; text-size-adjust: auto;">|</span><i style="font-family: HelveticaNeue; text-size-adjust: auto;">α</i><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">| / 2 ) / |</span><i style="font-family: HelveticaNeue; text-size-adjust: auto;">α</i><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">|, cos( </span><span style="font-family: HelveticaNeue; text-size-adjust: auto;">|</span><i style="font-family: HelveticaNeue; text-size-adjust: auto;">α</i><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">| / 2 ) )</span></div>
<div style="text-align: center;">
<span style="font-family: HelveticaNeue; text-size-adjust: auto;"><span style="text-size-adjust: auto;">q</span><sup style="text-size-adjust: auto;">b</sup><sub style="text-size-adjust: auto;">i</sub><span style="text-size-adjust: auto;">(+)</span><sub style="text-size-adjust: auto;"> </sub><span style="text-size-adjust: auto;">= q</span><sup style="text-size-adjust: auto;">b</sup><sub style="text-size-adjust: auto;">i</sub><span style="text-size-adjust: auto;">(-) <span style="font-size: 12px; text-align: start; text-size-adjust: auto;">×</span> </span></span><span style="font-family: HelveticaNeue; text-size-adjust: auto;">q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b+</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">b-</sub></div>
<div>
<br /></div>
<div>
But this precise increment divides by zero as the magnitude of the attitude increment goes to 0 (for example, when the phone is stationary), so it cannot be implemented as written; instead, a 4th order expression for <span style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">q</span><sup style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b+</sup><sub style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">b-</sub> is suggested</div>
<div>
<div style="text-align: center;">
<span style="font-family: HelveticaNeue; text-size-adjust: auto;">q</span><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b+</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">b-</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> = Quaternion( </span><span style="font-family: HelveticaNeue;">As </span><i style="font-family: HelveticaNeue; text-size-adjust: auto;">α</i><sup style="font-family: HelveticaNeue; text-size-adjust: auto;">b</sup><sub style="font-family: HelveticaNeue; text-size-adjust: auto;">ib</sub><span style="font-family: HelveticaNeue; text-size-adjust: auto;">, Ac</span><span style="font-family: HelveticaNeue; text-size-adjust: auto;"> );</span></div>
<div style="text-align: center;">
<span style="font-family: HelveticaNeue;">As = 1/2 - (</span><span style="font-family: HelveticaNeue; text-align: start;">|</span><i style="font-family: HelveticaNeue; text-align: start;">α</i><sup style="font-family: HelveticaNeue; text-align: start;">b</sup><sub style="font-family: HelveticaNeue; text-align: start;">ib</sub><span style="font-family: HelveticaNeue; text-align: start;">| / 2</span><span style="font-family: HelveticaNeue;">)</span><sup style="font-family: HelveticaNeue;">2</sup><span style="font-family: HelveticaNeue;"> / 12</span></div>
</div>
<div style="text-align: center;">
<div style="font-family: HelveticaNeue; text-align: center; text-size-adjust: auto;">
Ac = 1 - (|<i>α</i><sup>b</sup><sub>ib</sub>| / 2)<sup>2</sup> / 2 + (|<i>α</i><sup>b</sup><sub>ib</sub>| / 2)<sup>4</sup> / 24</div>
<div>
<br /></div>
</div>
<div>
Here's the Unity implementation, showing only the parts that changed from the 1st order approximation given earlier.</div>
<div>
<span style="font-family: Menlo;">
<span style="color: #3364a4;">Vector3</span><span style="color: #333333;"> alpha = </span><span style="color: #888888; font-style: italic;">// Mathf.Deg2Rad * //attitude increment</span><br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">Time</span><span style="color: #333333;">.deltaTime * w_inb_lb;</span><br />
<span style="color: #009695;">float</span><span style="color: #333333;"> alpha_div2 = </span><span style="color: #f57d00;">0.5f</span><span style="color: #333333;"> * alpha.magnitude;</span><br />
<span style="color: #009695;">float</span><span style="color: #333333;"> alpha_div2_sq = alpha_div2 * alpha_div2;</span><br />
<span style="color: #009695;">float</span><span style="color: #333333;"> alpha_div2_qd = alpha_div2_sq * alpha_div2_sq;</span><br />
<span style="color: #009695;">float</span><span style="color: #333333;"> As = </span><span style="color: #f57d00;">0.5f</span><span style="color: #333333;"> - </span><span style="color: #f57d00;">0.083333333333333f</span><span style="color: #333333;"> * alpha_div2_sq;</span><br />
<span style="color: #009695;">float</span><span style="color: #333333;"> Ac = </span><span style="color: #f57d00;">1.0f</span><span style="color: #333333;"> - </span><span style="color: #f57d00;">0.5f</span><span style="color: #333333;"> * alpha_div2_sq + </span><span style="color: #f57d00;">0.041666666666667f</span><span style="color: #333333;"> * alpha_div2_qd;</span><br />
<span style="color: #333333;">alpha *= As;</span><br />
<span style="color: #333333;">q_inc_lb.Set (alpha.x, alpha.y, alpha.z, Ac); </span><span style="color: #888888; font-style: italic;">// 4th order approximation</span></span>
</div>
<div>
<br /></div>
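To see how close the 4th order series is to the exact increment at game-like rates, here is a standalone comparison sketch in C# with System.Numerics (it assumes a runtime that provides MathF). The rotation rate, axis, and 60 Hz sample time are assumed test values, and the small-angle guard in Exact is one common workaround for the division by zero, not something taken from Groves.<br />

```csharp
using System;
using System.Numerics;

static class AttitudeIncrement
{
    // Exact increment, with a small-angle guard to avoid dividing by zero.
    public static Quaternion Exact(Vector3 alpha)
    {
        float mag = alpha.Length();
        if (mag < 1e-6f) // assumed threshold; fall back to the first order form
            return new Quaternion(0.5f * alpha.X, 0.5f * alpha.Y, 0.5f * alpha.Z, 1f);
        float s = MathF.Sin(0.5f * mag) / mag;
        return new Quaternion(s * alpha.X, s * alpha.Y, s * alpha.Z,
                              MathF.Cos(0.5f * mag));
    }

    // 4th order approximation using the As and Ac series given above.
    public static Quaternion FourthOrder(Vector3 alpha)
    {
        float h = 0.5f * alpha.Length();
        float h2 = h * h;
        float As = 0.5f - h2 / 12f;
        float Ac = 1f - h2 / 2f + h2 * h2 / 24f;
        return new Quaternion(As * alpha.X, As * alpha.Y, As * alpha.Z, Ac);
    }

    static void Main()
    {
        // Assumed example: 2 rad/s about an arbitrary axis, sampled at 60 Hz.
        Vector3 alpha = Vector3.Normalize(new Vector3(1f, 2f, 3f)) * (2f / 60f);
        Quaternion e = Exact(alpha);
        Quaternion a = FourthOrder(alpha);
        Console.WriteLine($"exact:        {e}");
        Console.WriteLine($"fourth order: {a}");
        Console.WriteLine($"difference:   {(e - a).Length()}");
    }
}
```

The printed difference gives a feel for how much of the total attitude error could possibly come from the truncated series rather than from the sensor itself.<br />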
<div>
When I tried this solution, the drift in the non-stationary case was still there, so it looks like my intuition was correct after all: the gyro's bias and noise, not the order of the update equation, dominate the error.</div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-75191975625189263942017-02-04T17:18:00.001-08:002017-02-05T08:26:47.088-08:00Coarse roll and pitch initialization using accelerometer in Unity<h2>
My humble dream</h2>
I am teaching myself Unity3D game programming these days, to write an augmented reality peer-to-peer shooter game on a smartphone--just to play a game with my friend's 5-year-old son in the living room.<br />
<h2>
The problem at hand</h2>
For a seamless integration of game objects on the screen, I need to track the phone's position and orientation within the room. Since I don't know much about image processing beyond basic filtering and how to use OpenCV, I am looking into relative 6 DOF tracking using the accelerometer and gyro available on every modern smartphone--despite the expected problems due to the high bias and noise of the MEMS sensors in a smartphone. A solution discussed in many textbooks (I like <a href="https://smile.amazon.com/Principles-Inertial-Multisensor-Integrated-Navigation/dp/1608070050/ref=sr_1_1?s=books&ie=UTF8&qid=1486051500&sr=1-1&keywords=paul+groves">Paul Groves</a> the best so far) is to initialize the 6 DOF state with some constraint (like when the IMU is stationary), and then integrate the angular rate and the acceleration sensed in the body frame of the phone.<br />
<br />
All of this is elementary stuff for mechanical engineers. But when I tried to apply this simple idea in Unity, I ran into complications, the first being Unity3D's left-handed coordinate frame. At first, I tried to keep the navigation equations in the RH system (as described in textbooks) and just convert to LH right before rendering, but the confusion did not go away. So here's an attempt to re-derive the coarse attitude (but without the heading) initialization in Unity's LH system. To motivate you to persevere through the trigonometry below, here is a screenshot that illustrates the idea of overlaying the target plane (the ENEMY text) on top of the rear camera view. As I tip and tilt my phone, my view of the target plane--which does NOT move or rotate in this demo--changes by the opposite of my current attitude estimate, which is displayed in the upper left corner of the video.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOqXpk9LPX2Dn3JCi1dIXVL4uZELaX5dL2kSXcBiUU8B9eAuUlylERzeaD_I79haT_yIDteQ21Q7PipTxB80kXn6f0N_jqR1zWxz_1r0khCxxkcJmcH5c-SaHwUnxqMEbWBbaV8zvLj4or/s1600/TipTilt+screenshot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiOqXpk9LPX2Dn3JCi1dIXVL4uZELaX5dL2kSXcBiUU8B9eAuUlylERzeaD_I79haT_yIDteQ21Q7PipTxB80kXn6f0N_jqR1zWxz_1r0khCxxkcJmcH5c-SaHwUnxqMEbWBbaV8zvLj4or/s1600/TipTilt+screenshot.png" /></a></div>
My Unity3D project is on my github repo: https://github.com/henrychoi/OpenCVUnity. The red lines that flash on the screen when I move the phone are the optical flow vectors calculated using the <a href="https://www.assetstore.unity3d.com/en/#!/content/21088">OpenCV for Unity library ($95 on the Unity Asset store)</a>. This is a bit of a legacy: at first, I thought I might make a visual odometry based game, so I paid the money and tried OpenCV for Unity's optical flow demo--which works as you can see. But when I studied how to calculate the current attitude from the optical flow vectors using inverse homography, I realized I did not understand the subject well enough. If there is any interest in my project from other engineers, I will change out the WebcamTextureHelper script with just the plain WebcamTexture in Unity, and create a more reproducible project.<br />
<h2>
1st attempt: IMU frame as documented for the phone; wrong</h2>
When the device is (mostly) stationary, the accelerometer is just sensing gravity (reaction to gravity actually, but I'll explore that idea in a different blog entry). In the diagram below, I've drawn an IMU (accel and gyro package) placed arbitrarily on a phone--in the RIGHT HANDED frame labeled 𝞵 (in navigation equations, the frame appears in superscripts and subscripts, so "IMU" is unwieldy), which is also oriented arbitrarily relative to the LEFT HANDED local tangent frame l (letter "ell"). Since I will play the game while holding my phone like I am taking a picture, I've drawn the LEFT HANDED body frame to be oriented the same way as Unity's player frame when the phone is held that way: +Z away from the phone's rear facing camera, and +Y toward the top of the screen in landscape mode, with the "Home" button on the right of the screen.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-JeIpeMJfCZH6CqKM7qvfAgjf-UGBYB4jNO3BnZzQw_pUUC60P6WD4PylpkY_9Tu8DSGGfrgAUgustjZcIbJ3N9x2Wat4XQL0s8qELYPUpFeOgvFoku1Aw7UKfhUDWLtHavE6A1aYxgJs/s1600/Frames.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="221" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-JeIpeMJfCZH6CqKM7qvfAgjf-UGBYB4jNO3BnZzQw_pUUC60P6WD4PylpkY_9Tu8DSGGfrgAUgustjZcIbJ3N9x2Wat4XQL0s8qELYPUpFeOgvFoku1Aw7UKfhUDWLtHavE6A1aYxgJs/s400/Frames.png" width="400" /></a></div>
<br />
The gravity sensed by the IMU is <span class="s1">g</span><sup>𝞵</sup><span class="s2"><sub>l</sub></span><span class="s3">𝞵</span>: I've drawn the gravity in the -Z direction of the local tangent frame, and the gravity is resolved in 𝞵--specifically the x, y, z triad of the accelerometer. As far as I know, the IMU is placed like this on all smartphones. What are the x/y/z triad values as functions of the phone's roll and pitch? If I define the roll and pitch to be zero when the body frame's Z axis is parallel to the l frame's -Z axis, and the body frame's X axis is parallel to the l frame's -Y axis, we can follow the roll and pitch as shown below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi28y5ADjmBZ-Qya-WrSIYsDzONU3cikUD6SelLOLQGAufXg2xnuQHW9RIP0cK6kgH0_6tu8aYvwtmvqldYjP3AhOJSpgU-zprPBtNhYf1SR7yJsrQQm38iR8eTW6XGPcWSfL4c8-tWLcg4/s1600/Coarse+tiptilt.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="272" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi28y5ADjmBZ-Qya-WrSIYsDzONU3cikUD6SelLOLQGAufXg2xnuQHW9RIP0cK6kgH0_6tu8aYvwtmvqldYjP3AhOJSpgU-zprPBtNhYf1SR7yJsrQQm38iR8eTW6XGPcWSfL4c8-tWLcg4/s640/Coarse+tiptilt.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
Navigation equations can be confusing to anyone without spatial IQ of 200, so let's make a note that the blue line above shows the physical components of gravity, which will be SENSED by the accelerometer in the 𝞵 frame (the brown axes) as the <u><b>a</b></u> vector. After the roll and pitch, we can read off the components of gravity along each of the orthogonal axes of the IMU frame (brown above)--which may NOT be co-located with the body frame or even oriented exactly as drawn above (there are elaborate engineering requirement specs for the PCB placement of the IMU). BUT if it were, the values would be as follows:<br />
<div style="text-align: center;">
ax = -g cos<span class="s1">𝟇 </span>cos<span class="s1">𝞱</span></div>
<div style="text-align: center;">
ay = g sin<span class="s1">𝟇</span></div>
<div style="text-align: center;">
az = -g cos<span class="s1">𝟇 </span>sin<span class="s1">𝞱</span></div>
Let's not worry about the accel bias or the IMU mounting error relative to the body frame for now, and just solve the above over-constrained equations for the roll and pitch angles.<br />
<div style="text-align: center;">
𝟇 = asin(ay/g)</div>
<div style="text-align: center;">
𝞱 = atan2(-az, -ax) </div>
Note the negative signs on both arguments of the arc tangent, which are necessary to land in the correct quadrant.<br />
<br />
But when I tried initializing with these equations, the answer was clearly wrong: my spaceship jumped to a completely wrong orientation!<br />
<h2>
2nd attempt: Empirically determined IMU frame</h2>
By experimenting, I found that the IMU frame Unity reports looks just the same as the player frame, except for being RIGHT handed. I don't know if Unity will rotate this frame for portrait mode.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUNCLQBesIncryBUJR5CYXLXc8cTkRLQl5OZB5PTO4LH96c8vR67UNucRhy9PhL0aWbL6OKyJ4ZSMENsrblSMdEHvVmhN2YjPS59q_DHFfVrfJ4nOSW8w6KryG9AFmdYIHCLenLjM8bcna/s1600/Frames.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="217" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUNCLQBesIncryBUJR5CYXLXc8cTkRLQl5OZB5PTO4LH96c8vR67UNucRhy9PhL0aWbL6OKyJ4ZSMENsrblSMdEHvVmhN2YjPS59q_DHFfVrfJ4nOSW8w6KryG9AFmdYIHCLenLjM8bcna/s400/Frames.png" width="400" /></a></div>
The last rotation looks confusing because of the choice of the rotation amount; it may help to imagine that you are looking at the b frame from below--but do that ONLY for the 3rd frame below.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXM6ztdhJwTMtFVfxdkM7rCLkLUTDxa-NSCDAPOwwJ1XIFdyEZ7t0-HS5pELP3_dhlcdxcqORvO_X8rlUZiV1CuMa78z0g5xJxLyiijxGbY88m7MwpZ0x_wEGl_ic9Uh6PJfdXjKBdLRWR/s1600/Coarse+tiptilt.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="272" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXM6ztdhJwTMtFVfxdkM7rCLkLUTDxa-NSCDAPOwwJ1XIFdyEZ7t0-HS5pELP3_dhlcdxcqORvO_X8rlUZiV1CuMa78z0g5xJxLyiijxGbY88m7MwpZ0x_wEGl_ic9Uh6PJfdXjKBdLRWR/s640/Coarse+tiptilt.png" width="640" /></a></div>
<br />
The accel triad in the IMU frame (length of the blue arrows on the brown axes) then changes to<br />
<div style="text-align: center;">
ax = -g sin<span class="s1">𝟇</span></div>
<div style="text-align: center;">
ay = -g cos<span class="s1">𝟇 </span>cos<span class="s1">𝞱</span></div>
<div style="text-align: center;">
az = -g cos<span class="s1">𝟇 </span>sin<span class="s1">𝞱</span></div>
<div>
<span class="s1">In which case the angles are:</span></div>
<div style="text-align: center;">
𝟇 = asin(-ax/g)</div>
<div style="text-align: center;">
𝞱 = atan2(-az, -ay) </div>
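<div>
As a quick sanity check on the triad above (a worked sketch rather than part of the original derivation; 𝟇 is the roll and 𝞱 the pitch as defined above), the three components should have magnitude g for any attitude, and dividing az by ay should isolate the pitch:</div>

```latex
\begin{aligned}
a_x^2 + a_y^2 + a_z^2 &= g^2\sin^2\phi + g^2\cos^2\phi\,(\cos^2\theta + \sin^2\theta) = g^2,\\
\frac{-a_z}{-a_y} &= \frac{g\cos\phi\,\sin\theta}{g\cos\phi\,\cos\theta} = \tan\theta \qquad (\cos\phi \neq 0).
\end{aligned}
```

<div>
Since asin only returns angles between -90 and +90 deg, cos𝟇 is non-negative there, so -az and -ay carry the same signs as sin𝞱 and cos𝞱, and atan2 returns 𝞱 in the correct quadrant.</div>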
<div>
<br />
The sequence of rotation from the local tangent frame l to the current body frame attitude, expressed as LEFT to right (because that's how Unity applies the rotation--the opposite of almost all quaternion books I've read) quaternion multiplication is then<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">q_lb = Quaternion(0, 0, sin(𝞹/4), cos(𝞹/4)) // quarter turn about Z</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> * Quaternion(</span><span style="font-family: "courier new" , "courier" , monospace;">sin(𝞹/4),</span><span style="font-family: "courier new" , "courier" , monospace;"> </span><span style="font-family: "courier new" , "courier" , monospace;">0, 0, cos(𝞹/4))</span><span style="font-family: "courier new" , "courier" , monospace;"> </span><span style="font-family: "courier new" , "courier" , monospace;">// quarter turn about X</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> * Quaternion(0, 0, sin(</span><span style="text-align: center;">𝟇</span><span style="font-family: "courier new" , "courier" , monospace;">/2), cos(</span><span style="text-align: center;">𝟇</span><span style="font-family: "courier new" , "courier" , monospace;">/2))</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> * Quaternion(</span><span style="font-family: "courier new" , "courier" , monospace;">sin(</span><span style="text-align: center;">𝞱</span><span style="font-family: "courier new" , "courier" , monospace;">/2), 0</span><span style="font-family: "courier new" , "courier" , monospace;">, 0, cos(</span><span style="text-align: center;">𝞱</span><span style="font-family: "courier new" , "courier" , monospace;">/2))</span><br />
<br />
where I use the quaternion notation of the form [x, y, z, w] (rather than [w, x, y, z]). The above algorithm has a singularity at roll angle = +/- 90 deg, because there the y and z components of the acceleration are both small (the normalized |ax| is almost 1), so the pitch angle is poorly determined.<br />
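<div>
To see how quickly the pitch estimate degrades as the roll approaches +/- 90 deg, here is a first-order estimate. It assumes a small accelerometer error of size δa acting perpendicular to the (ay, az) direction; δa and the resulting pitch error δ𝞱 are symbols introduced only for this illustration:</div>

```latex
\begin{aligned}
\sqrt{a_y^2 + a_z^2} &= g\,\lvert\cos\phi\rvert,\\
\delta\theta &\approx \frac{\delta a}{g\,\lvert\cos\phi\rvert}.
\end{aligned}
```

<div>
At 𝟇 = 80 deg, cos𝟇 is about 0.17, so the same sensor error produces roughly 5.8 times the pitch error it would at 𝟇 = 0; at exactly 90 deg the pitch is undetermined.</div>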
<h2>
3rd attempt: yaw to avoid singularity at roll +/- 90 deg</h2>
To avoid singularity when roll is near +/- 90, I need to apply a yaw around Y (instead of the pitch around X as above). I drew 2 pictures to convince myself that the same equation can handle both the +90 and the -90 case, as you can see below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHDQtm6NWQe8nljNTd6wcAJRQ-ee3Uz5wqJVYWylOLTUJQIdYgAdL4m7Hq8PAOcazrXx5XTQz6UNy2WvaolpOddfsrEoi781TiJc9koiSgEz3DIzAUz7f0Xu6MxSpxkLguHXC4dapFiu5o/s1600/Coarse+tiptilt.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="166" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHDQtm6NWQe8nljNTd6wcAJRQ-ee3Uz5wqJVYWylOLTUJQIdYgAdL4m7Hq8PAOcazrXx5XTQz6UNy2WvaolpOddfsrEoi781TiJc9koiSgEz3DIzAUz7f0Xu6MxSpxkLguHXC4dapFiu5o/s640/Coarse+tiptilt.png" width="640" /></a></div>
<br />
Note that the acceleration triad expressions are the same around positive and negative roll, so I don't have to come up with 2 distinct cases. So how shall we solve the following accelerometer triad?<br />
<div style="text-align: center;">
ax = -g sin<span class="s1">𝟇 cos</span>𝟁</div>
<div style="text-align: center;">
ay = -g cos<span class="s1">𝟇</span></div>
<div style="font-family: times; margin: 0px; text-align: center;">
az = g sin<span class="s1">𝟇 </span>sin<span style="font-family: "times";">𝟁</span></div>
<div>
<span class="s1">The angles are:</span></div>
<div style="text-align: center;">
𝟇 = -sign(ax) acos(-ay/g)<br />
𝟁 = asin(az/(g <span style="font-family: "times";">sin</span><span class="s1" style="font-family: "times";">𝟇</span>))<br />
<br /></div>
I switch from the pitch-based scheme of the previous section to this alternative rotation scheme when the normalized |ax| is larger than some value, say 0.8. The quaternion from the local frame to the body frame is then<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">q_lb = Quaternion(0, 0, sin(𝞹/4), cos(𝞹/4))</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> * Quaternion(</span><span style="font-family: "courier new" , "courier" , monospace;">sin(𝞹/4),</span><span style="font-family: "courier new" , "courier" , monospace;"> </span><span style="font-family: "courier new" , "courier" , monospace;">0, 0, cos(𝞹/4))</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> * Quaternion(0, 0, sin(</span><span style="text-align: center;">𝟇</span><span style="font-family: "courier new" , "courier" , monospace;">/2), cos(</span><span style="text-align: center;">𝟇</span><span style="font-family: "courier new" , "courier" , monospace;">/2))</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> * Quaternion(0</span><span style="font-family: "courier new" , "courier" , monospace;">, </span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "courier new" , "courier" , monospace;">sin(</span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "times"; text-align: center;">𝟁</span>/2)</span>, 0, cos(</span><span style="font-family: "courier new" , "courier" , monospace;"><span style="font-family: "times"; text-align: center;">𝟁</span>/2))</span><br />
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span></div>
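<div>
One detail worth checking is whether the division by sin𝟇 is safe and whether the asin argument can leave its domain. The following worked sketch assumes the accelerometer vector is normalized (so g = 1) and that the yaw scheme is only used when |ax| is at least 0.8, as suggested above:</div>

```latex
\begin{aligned}
\sin^2\phi &= 1 - a_y^2 = a_x^2 + a_z^2 \;\ge\; 0.64,\\
\left|\frac{a_z}{\sin\phi}\right| &= \frac{\lvert a_z\rvert}{\sqrt{a_x^2 + a_z^2}} \;\le\; \frac{0.6}{\sqrt{0.64 + 0.36}} = 0.6 .
\end{aligned}
```

<div>
Under these assumptions |sin𝟇| stays at or above 0.8, so the denominator is well away from zero, and the yaw magnitude is at most asin(0.6), about 36.9 deg.</div>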
When I tried this solution, I found a small jump in the quaternion values between the previous solution and this one. I struggled for a day to understand this discontinuity, but could not. Until I can figure this out, I decided to just spherically interpolate between the 2 solutions. The behavior seemed reasonable on the Unity remote, and that is what you see in the introduction video.</div>
<h2>
Unity implementation</h2>
I tried to initialize my current attitude with Quaternion.Euler(x, y, z), whose definition of the rotation order miraculously matches the rotation order presented above. But for some reason, it did not work (I am on Unity version 5.2.4f2), so I had to just brute-force multiply the quaternions in my PlayerController.FixedUpdate() method, as shown below:<br />
<br />
<span style="color: #3364a4; font-family: "menlo";">Vector3</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">a_b</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">=</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">a_inb_lb_ens</span><span style="color: #333333; font-family: "menlo";">.</span><span style="color: #333333; font-family: "menlo";">Average</span><span style="color: #333333; font-family: "menlo";">.</span><span style="color: #333333; font-family: "menlo";">normalized</span><span style="color: #333333; font-family: "menlo";">;</span><br />
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #009695;">float</span><span style="color: #333333;"> </span><span style="color: #333333;">roll</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">pitch</span><span style="color: #333333;">;</span><br /><span style="color: #333333;">roll</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Asin</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #333333;">a_b</span><span style="color: #333333;">.</span><span style="color: #333333;">x</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br /><span style="color: #333333;">pitch</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Atan2</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #333333;">a_b</span><span style="color: #333333;">.</span><span style="color: #333333;">z</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">-</span><span style="color: #333333;">a_b</span><span style="color: #333333;">.</span><span style="color: #333333;">y</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span><span style="font-family: "menlo";"><span style="color: #333333;"><br /></span></span>
<span style="font-family: "menlo";"><span style="color: #333333;">
<span style="font-family: "menlo";">
roll *= <span style="color: #f57d00;">0.5f</span>; <span style="color: #888888; font-style: italic;">//</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">Halve</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">the</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">Euler</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">angles</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">fed</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">to</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">quaternion</span><span style="color: #888888; font-style: italic;"> construction</span><br />pitch *= <span style="color: #f57d00;">0.5f</span>;</span>
</span></span><br />
<br />
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">q</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #888888; font-style: italic;">//</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">Apply</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">the</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">roll</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">first</span><span style="color: #888888; font-style: italic;">,</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">then</span><span style="color: #888888; font-style: italic;"> pitch</span><br /><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">roll</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span 
style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">roll</span><span style="color: #333333;">))</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">pitch</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">pitch</span><span style="color: #333333;">))</span><span style="color: #333333;">;</span></span>
<br />
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #009695;">const</span><span style="color: #333333;"> </span><span style="color: #009695;">float</span><span style="color: #333333;"> </span><span style="color: #333333;">THRESHOLD</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0.8f</span><span style="color: #333333;">;</span></span><br />
<span style="font-family: "menlo";"><span style="color: #333333;"><br /></span><span style="color: #009695;">if</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Abs</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">a_b</span><span style="color: #333333;">.</span><span style="color: #333333;">x</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #333333;"><</span><span style="color: #333333;"> </span><span style="color: #333333;">THRESHOLD</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #333333;">{</span><span style="color: #333333;"> </span><span style="color: #888888; font-style: italic;">//</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">1st</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">case</span><span style="color: #888888; font-style: italic;">:</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">cos</span><span style="color: #888888; font-style: italic;">(</span><span style="color: #888888; font-style: italic;">roll</span><span style="color: #888888; font-style: italic;">)</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">reasonable</span><span style="color: #888888; font-style: italic;"> value</span><br /><span style="color: #333333;"> q_lb</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">Q_lb</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: 
#333333;"> </span><span style="color: #333333;">q</span><span style="color: #333333;">;</span><br /><span style="color: #333333;">}</span><span style="color: #333333;"> </span><span style="color: #009695;">else</span><span style="color: #333333;"> </span><span style="color: #333333;">{</span><span style="color: #333333;"> </span><span style="color: #888888; font-style: italic;">//</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">2nd</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">case</span><span style="color: #888888; font-style: italic;">,</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">roll</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">near</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">+</span><span style="color: #888888; font-style: italic;">/</span><span style="color: #888888; font-style: italic;">-</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">90</span><span style="color: #888888; font-style: italic;">;</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">ay</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;"><</span><span style="color: #888888; font-style: italic;"><</span><span style="color: #888888; font-style: italic;"> ax</span></span><br />
<span style="font-family: "menlo";"><span style="color: #333333;"></span><span style="color: #333333;"> roll</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">-</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sign</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">a_b</span><span style="color: #333333;">.</span><span style="color: #333333;">x</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Acos</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #333333;">a_b</span><span style="color: #333333;">.</span><span style="color: #333333;">y</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span> <span style="font-family: "menlo";"><span style="color: #888888;"><i><br /></i></span><span style="color: #333333;"> </span><span style="color: #009695;">float</span><span style="color: #333333;"> </span><span style="color: #333333;">yaw = </span></span><span style="color: #3364a4; font-family: "menlo";">Mathf</span><span style="color: #333333; font-family: "menlo";">.</span><span style="color: #333333; font-family: "menlo";">Asin</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">(</span><span style="color: #333333; font-family: "menlo";">a_b</span><span style="color: #333333; font-family: "menlo";">.</span><span style="color: #333333; font-family: "menlo";">z</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; 
font-family: "menlo";">/</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #3364a4; font-family: "menlo";">Mathf</span><span style="color: #333333; font-family: "menlo";">.</span><span style="color: #333333; font-family: "menlo";">Sin</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">(</span><span style="color: #333333; font-family: "menlo";">roll</span><span style="color: #333333; font-family: "menlo";">))</span><span style="color: #333333; font-family: "menlo";">;</span><br />
<span style="color: #333333; font-family: "menlo";"><br /></span>
<span style="color: #333333; font-family: "menlo";"> roll</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">*=</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #f57d00; font-family: "menlo";">0.5f</span><span style="color: #333333; font-family: "menlo";">;</span><br />
<span style="font-family: "menlo";">
<span style="color: #333333;"> </span><span style="color: #333333;">yaw</span><span style="color: #333333;"> </span><span style="color: #333333;">*=</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0.5f</span><span style="color: #333333;">;</span><br />
<br />
<span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">q2</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #888888; font-style: italic;">//</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">Apply</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">the</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">roll</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">first</span><span style="color: #888888; font-style: italic;">,</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">then</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">yaw</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">about</span><span style="color: #888888; font-style: italic;"> Y</span><br />
<span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">roll</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">roll</span><span style="color: #333333;">))</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">yaw</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">yaw</span><span style="color: #333333;">))</span><span style="color: #333333;">;</span><br /><br />
<span style="color: #333333;"> </span><span style="color: #333333;">q_lb</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">Q_lb</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;">.</span><span style="color: #333333;">Slerp</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">q</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #333333;">q2</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Abs</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #333333;">a_b</span><span style="color: #333333;">.</span><span style="color: #333333;">x</span><span style="color: #333333;">)</span><span style="color: #333333;"> </span><span style="color: #333333;">-</span><span style="color: #333333;"> </span><span style="color: #333333;">THRESHOLD</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span><br /><span style="color: #333333;">}</span></span>
<br />
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #333333;">q_Ub</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #333333;">Q_Ul</span><span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #333333;">q_lb</span><span style="color: #333333;">;</span><span style="color: #333333;"> </span><span style="color: #888888; font-style: italic;">//</span><span style="color: #888888; font-style: italic;">Rotate</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">the</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">q_lb</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">by</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">Q_Ul</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">to</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">go</span><span style="color: #888888; font-style: italic;"> </span><span style="color: #888888; font-style: italic;">to</span><span style="color: #888888; font-style: italic;"> Unity</span></span>
<br />
<span style="font-family: "menlo";">
<span style="color: #333333;"></span><span style="color: #333333;">myRigidbody</span><span style="color: #333333;">.</span><span style="color: #333333;">MoveRotation</span><span style="color: #333333;">(</span><span style="color: #333333;">q_Ub</span><span style="color: #333333;">)</span><span style="color: #333333;">;</span></span>
<br />
<br />
Q_lb and Q_Ul are constant quaternions: Q_lb relates the attitude of the player to the local tangent frame, and Q_Ul relates the local tangent frame to Unity's inertial frame.<br />
<span style="font-family: "menlo";">
<span style="color: #333333;"> </span><span style="color: #009695;">static</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">Q_lb</span><span style="color: #333333;"> </span><span style="color: #333333;">=</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;"> </span><span style="color: #333333;">/</span><span style="color: #333333;"> </span><span style="color: #f57d00;">4</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;"> </span><span style="color: #333333;">/</span><span style="color: #333333;"> </span><span style="color: #f57d00;">4</span><span style="color: #333333;">))</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;"> </span><span style="color: #333333;">/</span><span style="color: #333333;"> </span><span style="color: #f57d00;">4</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span style="color: #333333;"> </span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;"> </span><span style="color: #333333;">/</span><span style="color: #333333;"> </span><span style="color: #f57d00;">4</span><span style="color: #333333;">))</span><br />
<span style="color: #333333;"></span></span><br />
<span style="font-family: "menlo";"><span style="color: #333333;"> </span><span style="color: #333333;">;</span></span><br />
<span style="color: #009695; font-family: "menlo";"><br /></span>
<span style="color: #333333; font-family: "menlo";"></span><span style="color: #009695; font-family: "menlo";">static</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #3364a4; font-family: "menlo";">Quaternion</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">Q_Ul</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #333333; font-family: "menlo";">=</span><span style="color: #333333; font-family: "menlo";"> </span><span style="color: #888888; font-family: "menlo"; font-style: italic;">//</span><span style="color: #888888; font-family: "menlo"; font-style: italic;">Will</span><span style="color: #888888; font-family: "menlo"; font-style: italic;"> </span><span style="color: #888888; font-family: "menlo"; font-style: italic;">be</span><span style="color: #888888; font-family: "menlo"; font-style: italic;"> </span><span style="color: #888888; font-family: "menlo"; font-style: italic;">used</span><span style="color: #888888; font-family: "menlo"; font-style: italic;"> </span><span style="color: #888888; font-family: "menlo"; font-style: italic;">often</span><span style="color: #888888; font-family: "menlo"; font-style: italic;">,</span><span style="color: #888888; font-family: "menlo"; font-style: italic;"> </span><span style="color: #888888; font-family: "menlo"; font-style: italic;">so</span><span style="color: #888888; font-family: "menlo"; font-style: italic;"> </span><span style="color: #888888; font-family: "menlo"; font-style: italic;">calculate</span><span style="color: #888888; font-family: "menlo"; font-style: italic;"> once</span><br />
<span style="font-family: "menlo";">
<span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;">(</span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;">/</span><span style="color: #f57d00;">4</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;">/</span><span style="color: #f57d00;">4</span><span style="color: #333333;">))</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">*</span><span style="color: #333333;"> </span><span style="color: #009695;">new</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Quaternion</span><span style="color: #333333;">(</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Sin</span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;">/</span><span style="color: #f57d00;">4</span><span style="color: #333333;">)</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #f57d00;">0</span><span style="color: #333333;">,</span><span style="color: #333333;"> </span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">Cos</span><span style="color: #333333;">(</span><span style="color: #333333;">-</span><span style="color: #3364a4;">Mathf</span><span style="color: #333333;">.</span><span style="color: #333333;">PI</span><span style="color: #333333;">/</span><span style="color: #f57d00;">4</span><span style="color: #333333;">))</span><br />
<span style="color: #333333;"> </span><span style="color: #333333;">;</span><br />
</span><br />
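<div>
As a sanity check on the constants above, here is a minimal Python sketch (not Unity code) that reproduces the same products numerically. It assumes Unity's (x, y, z, w) component order and the standard Hamilton product formula; the helper names qmul, axis_quat and roll_then_yaw are mine, purely for illustration.</div>

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions stored as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )

def axis_quat(axis, angle):
    """Quaternion for a rotation of `angle` radians about a principal axis ('x', 'y' or 'z')."""
    s, c = math.sin(angle / 2), math.cos(angle / 2)
    return {"x": (s, 0.0, 0.0, c), "y": (0.0, s, 0.0, c), "z": (0.0, 0.0, s, c)}[axis]

# Same factors as the C# constants: PI/4 is the half angle of a 90 degree turn.
Q_lb = qmul(axis_quat("z", math.pi / 2), axis_quat("x", math.pi / 2))
Q_Ul = qmul(axis_quat("y", -math.pi / 2), axis_quat("x", -math.pi / 2))

def roll_then_yaw(roll, yaw):
    """Mirror of q2 in the C# snippet: roll about Z times yaw about Y (angles in radians)."""
    return qmul(axis_quat("z", roll), axis_quat("y", yaw))

if __name__ == "__main__":
    print("Q_lb =", Q_lb)
    print("Q_Ul =", Q_Ul)
    print("q2(0, 0) =", roll_then_yaw(0.0, 0.0))
```

<div>
Run as a script, it prints the two constants and the identity quaternion for zero roll and yaw. Q_lb evaluates to (0.5, 0.5, 0.5, 0.5) up to floating-point rounding, which is a quick way to confirm that the constant is a unit quaternion.</div>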
And here is the video of a short session, where I use the attitude initialization at 50 Hz (i.e. I am NOT yet integrating the INS input).<br />
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dyIS0IFobOewmnAFjmRqIWKEAET6RZ0v0GgaNV8LPwnxwqKDlNDHOllUPVYemJ5uZUGg30ZNOj6J4r-Ee7oUg' class='b-hbp-video b-uploaded' frameborder='0'></iframe><br />
<h2>
Conclusion: use the (left) hand, Luke--and sweat it out</h2>
<div>
Attitude initialization is the 1st step in an INS (inertial navigation system). While the idea of using the side components of the accelerometer triad to estimate the current tip and tilt is simple to imagine, actually pulling off a Unity game demo was far more difficult than I expected, partly because Unity's left-handed frame forced me to re-derive the gravity triad expressions. In Groves' GNSS 2nd edition, this subject takes up just a single page (and the book and the appendices combine to just shy of 1000 pages)! Drawing clear pictures and applying successive LEFT-handed rotations step by step was the only way I could come through the whole mess in one piece. Although it was grueling, I am glad that I persevered through it; it was a character-building experience, and now I am on my way to integrating the gyro and accel into a 6 DOF solution.</div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-19466069522998446642017-02-02T07:39:00.000-08:002017-02-02T07:39:46.330-08:00ASR on a phoneMy greatest frustration with <i>Google Now</i> is the latency. I understand that the amount of computation and data required for modern ASR does not fit on a phone, hence the need to ship off the compressed voice data to the Google backend, and eat the latency penalty. But if we want a <i>Her</i>-like speech interface, ASR will HAVE TO run on a phone-scale resource. Since I have some experience with SW optimization, I looked for an ASR (automatic speech recognition) SW package I could play around with, and found Sirius: an open source IPA (intelligent personal assistant).<br />
<h2>
U. Michigan Sirius</h2>
<div>
Sirius bundles together the following open source projects to deliver IPA features:</div>
<ul>
<li>ASR</li>
<ul>
<li>CMU Sphinx: widely used GMM (Gaussian mixture model) ASR SW.</li>
<li>RWTH's RASR</li>
<li>Kaldi: DNN (deep neural network) based ASR</li>
</ul>
<li>OpenEphyra: question and answer system, based on IBM's Watson</li>
<li>SURF: image matching algorithm implemented using OpenCV</li>
</ul>
<div>
A GMM scores HMM state transitions by mapping an input feature vector into a multi-dimensional coordinate system and iteratively scoring the features against the trained acoustic model. A DNN is defined by its number of hidden layers, and scoring amounts to one forward pass through the network. In recent years, industry and academia have moved from GMM towards DNN because of the latter's higher accuracy. Text output from ASR is passed to the Q&A system, which uses 3 core processes to extract textual information:</div>
<ol>
<li>word stemming ("elected" --> "elect"), using the Porter stemmer</li>
<li>regular expression matching ([#th] --> #)</li>
<li>part-of-speech tagging, using CRF (conditional random field)</li>
</ol>
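<div>
To make the first two steps concrete, here is a rough Python sketch. The suffix stripper is a deliberately naive stand-in, NOT the Porter algorithm, and the ordinal pattern is only my guess at what the [#th] rule means (for example, 4th becomes 4); both function names are hypothetical. Part-of-speech tagging with a CRF needs a trained model, so it is left out.</div>

```python
import re

# Assumed reading of the "[#th]" rule: drop the ordinal suffix that follows digits.
ORDINAL = re.compile(r"\b(\d+)(?:st|nd|rd|th)\b")

def strip_ordinals(text):
    """Replace ordinals such as '4th' with the bare number '4'."""
    return ORDINAL.sub(r"\1", text)

def naive_stem(word):
    """Strip one common suffix. A toy stand-in, NOT the Porter stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters of stem so short words survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

if __name__ == "__main__":
    question = "who was elected on the 4th"
    tokens = strip_ordinals(question).split()
    print([naive_stem(t) for t in tokens])
```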
<div>
The QA service takes more time than ASR, and its latency is more variable, primarily because of the time needed to select the best-fitting answer.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnFDAp_81bkIXtM61YhXpDCbQ0X5cETIjt761DOSOHzG3-P_gSYdo05HGlJ_bphXKKkVCs8gkIco1wN9uRFcu7nzaVG37nJj-JiFAaENfj9natm7cFkxv2FTIl7e0-Y5_RqxuEHbQzkYgO/s1600/Capture.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnFDAp_81bkIXtM61YhXpDCbQ0X5cETIjt761DOSOHzG3-P_gSYdo05HGlJ_bphXKKkVCs8gkIco1wN9uRFcu7nzaVG37nJj-JiFAaENfj9natm7cFkxv2FTIl7e0-Y5_RqxuEHbQzkYgO/s1600/Capture.PNG" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
The Sirius Suite (C/C++) benchmark captures the bottlenecks:</div>
<ul>
<li>ASR: GMM/DNN scoring, rather than HMM</li>
<ul>
<li>GMM: nested loops that iteratively score the feature vector against training data (acoustic model, language model, dictionary). All the data required for GMM is stored in 2 GB.</li>
</ul>
<li>QA: all 3 core processes (above)</li>
<ul>
<li>Stemmer: check for multiple variables of a word (suffix, etc)</li>
</ul>
</ul>
<div>
<h3>
Building and running <a href="http://sirius.clarity-lab.org/sirius-suite/">sirius-suite</a> on Ubuntu 14</h3>
</div>
<div>
I downloaded sirius-suite-1.1 and sirius-caffe-1.0 (a dependency), which requires cmake (to build yet another dependency: OpenCV). I advise running the following before the official install process (the Sirius developers possibly missed these packages because they are on Ubuntu 12):<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ sudo apt-get install cmake protobuf-compiler liblmdb-dev python-numpy-dev aptitude</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">$ sudo aptitude install libhdf5-serial-dev</span><br />
<br />
When aptitude proposes a resolution, manually reject the offered choice and accept an alternative that resolves the dependencies for libhdf5-serial-dev.<br />
<br />
Then follow the sirius-caffe instructions linked above to build both caffe and the sirius-suite. The suite needs to be told where the sirius-caffe libs are, so run the test command with LD_LIBRARY_PATH set, like this:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">$ LD_LIBRARY_PATH=/mnt/work/CL/sirius/caffe/distribute/lib make test</span><br />
<span style="color: #999999; font-family: "courier new" , "courier" , monospace;">{</span><br />
<span style="color: #999999; font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>"kernel":"gaussian_mixture_model",</span><br />
<span style="color: #999999; font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>"abrv":"gmm",</span><br />
<span style="color: #999999; font-family: "courier new" , "courier" , monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>"gmm": 13.183000</span><br />
<span style="color: #999999; font-family: "courier new" , "courier" , monospace;">}</span><br />
<div>
<span style="color: #999999; font-family: "courier new" , "courier" , monospace;">...</span></div>
<br />
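<div>
Since the test run appears to print one JSON object per kernel, a few lines of Python can collect the numbers for later comparison. This sketch assumes every report is shaped like the one above, with the key named by "abrv" holding the measurement; I have not confirmed the unit, and the function name parse_kernel_reports is hypothetical.</div>

```python
import json

def parse_kernel_reports(text):
    """Collect every top-level JSON object found in `text`, skipping other output."""
    decoder = json.JSONDecoder()
    reports, pos = [], 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return reports
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here; keep scanning
            continue
        reports.append(obj)
        pos = end

# The one report shown above, re-typed as a sample input.
SAMPLE = """
{
    "kernel":"gaussian_mixture_model",
    "abrv":"gmm",
    "gmm": 13.183000
}
"""

if __name__ == "__main__":
    for report in parse_kernel_reports(SAMPLE):
        name = report["abrv"]
        print(name, report[name])
```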
I don't know how to interpret the test results yet, so I will move on to testing the individual kernels. Even though the QA system is the tallest pole to whack for improved user experience, I am new to NLP (natural language processing), so I will start where there is some physical signal processing, which is currently my area of expertise.<br />
<h2>
GMM ASR scoring</h2>
</div>
<div>
In the gmm/ folder of the Sirius suite, the build created only 2 folders: baseline and pthread, because I have not attached an ML605 FPGA (which I've had for a few years already) or a CUDA GPU. My laptop actually has a CUDA GPU (Quadro K2100M), so I'll see if I can run the CUDA version too--but later. This Sirius suite "kernel" comes with a single test data file: gmm_data.txt, which is more than 100 MB untarred. It looks like a time series of feature vectors (N=29), like this (only the 1st vectors are shown because, for each of the 5120 samples, means and precs are matrices while weight and factor are vectors):</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZeBQU0c6D4-viina8XmhQ1JMHvY2a0ACY6cCx2uGCOSiF8hIwxNaIwEzCVOt4tVDCbwvDE39ptsGjQ6J_GOAcSEs933HTcQfW_2uWCsrFOMRlay1uOoddYO0nMtHjl8d-VEkYwA6LQizL/s1600/Snapshot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZeBQU0c6D4-viina8XmhQ1JMHvY2a0ACY6cCx2uGCOSiF8hIwxNaIwEzCVOt4tVDCbwvDE39ptsGjQ6J_GOAcSEs933HTcQfW_2uWCsrFOMRlay1uOoddYO0nMtHjl8d-VEkYwA6LQizL/s1600/Snapshot.png" /></a></div>
There is another vector: feature_vector (confusing!), which is just a bias for each feature.<br />
Without any idea of the underlying physical data (what was uttered), I'll try to understand the GMM code. The score computation goes like this:<br />
<br />
<br /></div>
<div>
\ln(Val2) = weight_f<br />
+ \frac{1}{\ln(1.0001)} \sum_f {prec_f (C_f - means_f)^2}<br />
- factor_f<br />
\\<br />
\ln\Delta = |score - \ln(Val2)|<br />
\\<br />
\ln(Highest) = \begin{cases}<br />
\ln(Val2) & score - \ln(Val2) < 0\\<br />
score & \text{otherwise}<br />
\end{cases}<br />
<br />
add /usr/local/texlive/2015/bin/x86_64-linux to your PATH </div>
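<div>
The score computation above can be transcribed almost literally into Python. This is only a sketch of the formulas in this post, not the Sirius code: the function names are mine, weight and factor are treated as per-component scalars, and I assume ln(Highest) is meant to be the larger of the two log values, as its name suggests.</div>

```python
import math

LOG_BASE = math.log(1.0001)  # the ln(1.0001) factor from the formula

def component_log_score(feature, means, precs, weight, factor):
    """ln(Val2) for one mixture component, term by term as in the formula."""
    weighted_sq = sum(p * (c - m) ** 2 for c, m, p in zip(feature, means, precs))
    return weight + weighted_sq / LOG_BASE - factor

def highest_and_delta(score, log_val2):
    """Return (ln(Highest), ln(Delta)): the larger log value and the absolute gap."""
    delta = abs(score - log_val2)
    highest = log_val2 if score - log_val2 < 0 else score
    return highest, delta

if __name__ == "__main__":
    # Made-up numbers for illustration; the real vectors have 29 features.
    feature = [1.0, 2.0, 3.0]
    means = [1.0, 2.0, 3.0]
    precs = [0.5, 0.5, 0.5]
    log_val2 = component_log_score(feature, means, precs, weight=2.0, factor=0.5)
    print(highest_and_delta(score=1.0, log_val2=log_val2))
```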
<h2>
DNN (deep neural network) scoring</h2>
<div>
Machine learning is all the rage these days, perhaps because the available compute power is finally allowing the learning algorithms to yield useful results. Supposedly, DNN is 1 such example. </div>
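<div>
In contrast to the nested GMM loops, DNN scoring amounts to one forward pass: each layer multiplies its input by a weight matrix, adds a bias and applies a nonlinearity. The sketch below uses plain Python lists; the layer sizes, the ReLU and softmax choices, and all the weights are made up for illustration and are not taken from Kaldi or Sirius.</div>

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: `weights` is a list of rows, one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def relu(values):
    """Clamp negative activations to zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities; subtract the max for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(features, layers):
    """One forward pass; `layers` is a list of (weights, biases), with ReLU between layers."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))

if __name__ == "__main__":
    # Toy network: 3 inputs -> 2 hidden units -> 2 output classes (all numbers invented).
    layers = [
        ([[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]], [0.0, 0.1]),
        ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ]
    print(forward([1.0, 2.0, 3.0], layers))
```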
<h2>
ASR on <a href="https://www.raspberrypi.org/magpi/raspberry-pi-3-specs-benchmarks/">Raspberry Pi 3</a></h2>
<div>
My quad-core Intel laptop with 16 GB RAM is hardly a mobile-device spec. It would be great to run Sirius on an Android phone, but my phone is not open to running stock Linux, and I cannot easily add GPIO HW to it. Except for the somewhat stingy RAM size, the recently released Raspberry Pi 3 has a reasonable mobile-device spec:<br />
<br />
<ul>
<li>CPU: Broadcom BCM2837 (quad ARM Cortex-A53--64-bit ARMv8 architecture)</li>
<li>Memory: 1 GB LPDDR2</li>
<li>GPU: VideoCore IV</li>
</ul>
</div>
<div>
I prefer to roll my own Linux distribution with Buildroot. The latest Buildroot even supports RPi 3 out of the box, so I will only need to modify the packages.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">~/work/CL$ git clone git://git.buildroot.net/buildroot rproot</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">~/work/CL/rproot$ make raspberrypi3_defconfig</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">~/work/CL/rproot$ make xconfig</span><br />
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span></div>
<div>
The Buildroot default configuration wants to create an SD card image, whereas I prefer to NFS-mount the rootfs during development. </div>
<br /></div>
<div>
<br /></div>
Henry Choihttp://www.blogger.com/profile/02148182819821205330noreply@blogger.com0tag:blogger.com,1999:blog-4032020337247582619.post-39883235024717134802016-07-31T09:31:00.002-07:002016-07-31T09:31:47.042-07:00Scented Death StarMy wife collects scented candles. She doesn't actually use them; they just sit in my closet until I purge the house occasionally. I had no expectation of this pattern changing, until recently, when I tried to think of a gift for my friend's 5-year old son.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0svxm-hzCb1p8w39Xsdu8Z9peI5ih0s6FT1c0uMDPKwcDciX5iNUpDKuetkQQjjUGSZaWa7QgdPQUqVPmLX3Lm_DQt9FpyQFTH6DRadueYc7GiZFA8ZsMJgEsKPrEdawaQFUFxLoDyp_k/s1600/Death+Star.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0svxm-hzCb1p8w39Xsdu8Z9peI5ih0s6FT1c0uMDPKwcDciX5iNUpDKuetkQQjjUGSZaWa7QgdPQUqVPmLX3Lm_DQt9FpyQFTH6DRadueYc7GiZFA8ZsMJgEsKPrEdawaQFUFxLoDyp_k/s320/Death+Star.png" width="320" /></a></div>
On learning of the boy's recent infatuation with Star Wars toys and noticing that the Death Star was missing from his collection, I looked for a decent scale model of the Death Star--preferably the half-constructed (but "fully operational") <i>The Return of the Jedi </i>version as shown above--but I was shocked at the lameness of what I found on eBay. The best looking model to be had for < $20 was just a rubber mold for a Death Star ice sphere (for Whiskey on <strike>Rocks </strike>Death Star as shown below), so I gave up the search.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5ZmJ7-Li2zo009AtD1n5jznp3PIhp2EuNOtdObB4oo0IVE9SX3hvRYi2WgH2LaV1_WUtpWgh3vm5tseoVP4rGvVkvwtWKBXjexjglQAXRnS3AieF5-hPQRnoBYOieBC9BRbPgC2pL0A0k/s1600/Whiskey+on+Death+Star.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5ZmJ7-Li2zo009AtD1n5jznp3PIhp2EuNOtdObB4oo0IVE9SX3hvRYi2WgH2LaV1_WUtpWgh3vm5tseoVP4rGvVkvwtWKBXjexjglQAXRnS3AieF5-hPQRnoBYOieBC9BRbPgC2pL0A0k/s1600/Whiskey+on+Death+Star.jpg" /></a></div>
A few days later, while reading the <i>USB Complete, 4th Edition</i>, I had an epiphany: why not melt the candles into the mold--and "kill 2 birds with 1 stone": get rid of my wife's stash of scented candles AND get a Death Star toy? So I ordered the above "ice cube tray" from eBay, and waited a few weeks. I don't understand how our Chinese brothers can make money selling something for like $3 including international shipping; but I wish we could on-shore value manufacturing for something like this back to the US.<br />
<br />
Here's the 1st look at the mold on arrival.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoS8pBzYXUBkaGxrZ5fccATHAIB-qVTGW2TfgrB-IxB-vOyrSNFZZdoc4d5yFLYptCx6V7m8WEzuaTL47GaJwjgqpXUWpJMd7FhVegMswOdfOzmuyuqJ_YAvtumPCNf1Pc2xeTzLD2THpC/s1600/Mold.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoS8pBzYXUBkaGxrZ5fccATHAIB-qVTGW2TfgrB-IxB-vOyrSNFZZdoc4d5yFLYptCx6V7m8WEzuaTL47GaJwjgqpXUWpJMd7FhVegMswOdfOzmuyuqJ_YAvtumPCNf1Pc2xeTzLD2THpC/s1600/Mold.jpg" /></a></div>
The hemispheres are keyed, and the detail for the laser cannon firing dimple--the most important part of the model--seems acceptable. Next, I grabbed one of the candles: Christmas Cookie scent--whatever--and melted it in a saucepan filled with water, to separate the candle from the glass container.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoCupEt_958MZ1ch2W1oXqcdsZ0YNh9Q-8ocq4EN9-8_osAXYbcnVQhMQ-Is1G8YfhZaY_FgYAVNXqueqOHgtU1sG6qgkszONYmzt2rJWGSZk0wjjG9ZWyxdD5oGcAg6rEKx7w06nzJd-J/s1600/Melt.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoCupEt_958MZ1ch2W1oXqcdsZ0YNh9Q-8ocq4EN9-8_osAXYbcnVQhMQ-Is1G8YfhZaY_FgYAVNXqueqOHgtU1sG6qgkszONYmzt2rJWGSZk0wjjG9ZWyxdD5oGcAg6rEKx7w06nzJd-J/s320/Melt.jpg" width="320" /></a></div>
After about 2 minutes, I could grab the wick to pull out the candle from the glass.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEga9MuptQ_ZMjx4AbjA7XIXXOLjBmPoN6WERvgcX_ZPJtT8286IfMpJD2CWEKLGTI-ERhzJzJdPfGl7uGg-xt1ySQBohaDEXqKpqzIriK4t2gAhq4Et5Yl-oAodkdOdjrH9LmxNKMm6Pg-D/s1600/Extracted.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEga9MuptQ_ZMjx4AbjA7XIXXOLjBmPoN6WERvgcX_ZPJtT8286IfMpJD2CWEKLGTI-ERhzJzJdPfGl7uGg-xt1ySQBohaDEXqKpqzIriK4t2gAhq4Et5Yl-oAodkdOdjrH9LmxNKMm6Pg-D/s1600/Extracted.jpg" /></a></div>
Then I cut the candle into a roughly spherical shape, like so.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgG8E7sqn0JTyzC2NdSORSPuv-1ysdXUzN4qxPR_1jIU2WwGaBPAEovnHteOQsmxwyjGhIUHVyP63uZBZZdShiugYvVQr1udbl4aXiefoE3Pg-QsAi8dCsky2T3sMPWfqFyQNHXOP5s6qeZ/s1600/Shaped.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgG8E7sqn0JTyzC2NdSORSPuv-1ysdXUzN4qxPR_1jIU2WwGaBPAEovnHteOQsmxwyjGhIUHVyP63uZBZZdShiugYvVQr1udbl4aXiefoE3Pg-QsAi8dCsky2T3sMPWfqFyQNHXOP5s6qeZ/s1600/Shaped.jpg" /></a></div>
I closed up the mold and put a small makeshift funnel on the pour hole indicated above.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgx0ASMO22ZiYVem6K7GhkEwBYk2xMeIyPHxGAfkpq1su7GaWIqbE-nS6ZZKYrO4JMII_lJ8k0X67kyUmi8VLz17_cpLBgxOSr0Cp4ouJGXdAURGia7Nk0rx-G5D8t6ubcZX_jZ1GI7TpSX/s1600/Pour.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgx0ASMO22ZiYVem6K7GhkEwBYk2xMeIyPHxGAfkpq1su7GaWIqbE-nS6ZZKYrO4JMII_lJ8k0X67kyUmi8VLz17_cpLBgxOSr0Cp4ouJGXdAURGia7Nk0rx-G5D8t6ubcZX_jZ1GI7TpSX/s1600/Pour.jpg" /></a></div>
I put the candle pieces and shavings back into the glass and reheated it in the saucepan shown above, and then poured the melted wax into the funnel until the funnel overflowed.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgARIA7gb-j0oXNqHOhi51kVoylYTjYgFiCaPMsHrTjad2SWY3Zzyc05I3lHr5A8mStC83eROoSfLjqzKECUODyISBt61qVekaov6gMIuIz8LDOnejajc9sn9topQ_M_XBLswXG6r6Qg2BE/s1600/Overflow.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgARIA7gb-j0oXNqHOhi51kVoylYTjYgFiCaPMsHrTjad2SWY3Zzyc05I3lHr5A8mStC83eROoSfLjqzKECUODyISBt61qVekaov6gMIuIz8LDOnejajc9sn9topQ_M_XBLswXG6r6Qg2BE/s1600/Overflow.jpg" /></a></div>
<div>
It's a bit messy, but the wax comes off the surface with a utility knife. To help the wax fill the nooks and crannies, cover the pour hole and tap the mold on the desk VIGOROUSLY while the wax is still liquid. After 1 hour, open the mold.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhak9knMCw_w_dDyuCGZFOC5QY6Nw-Z2ilHsvE07r2ijoBhTLM7qrtSXR7P7mHzTHa4UGks1QgpDyZRpzsK2mTbZhnMzAc9SWDtr00obkOmRZULsKe3N9xk8IQEMGShHfBprHjqPxTXvlRK/s1600/Out+of+the+mold.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhak9knMCw_w_dDyuCGZFOC5QY6Nw-Z2ilHsvE07r2ijoBhTLM7qrtSXR7P7mHzTHa4UGks1QgpDyZRpzsK2mTbZhnMzAc9SWDtr00obkOmRZULsKe3N9xk8IQEMGShHfBprHjqPxTXvlRK/s1600/Out+of+the+mold.jpg" /></a></div>
There are a few surface blemishes (hence my emphasis on VIGOROUSLY), but it's a pretty good return on a $3 investment!<br />
<br />
But I know what you are thinking: "Where's the laser cannon, the one that can destroy a planet in 1 shot?"<br />
<h2>
The laser cannon</h2>
I agree: a Death Star without the laser cannon is like fish and chips without the chips. Fortunately, our Chinese manufacturers came through again: a 5 mW laser pointer for $2 on eBay (batteries NOT included).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZtnRdNtBVVE08EZgnhnyxU9CMeOi9_KA7vAdlYZa2GCCotA93-Gj-H90B2fSwDLtPlo9zFBohhzxoX3Xmt4Guqo8g5GGIL4sSkKw9GfZbEWPQ9ssv-jSm5sratqbm_Xq7s6jPD1Zr2pMt/s1600/s-l225.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZtnRdNtBVVE08EZgnhnyxU9CMeOi9_KA7vAdlYZa2GCCotA93-Gj-H90B2fSwDLtPlo9zFBohhzxoX3Xmt4Guqo8g5GGIL4sSkKw9GfZbEWPQ9ssv-jSm5sratqbm_Xq7s6jPD1Zr2pMt/s1600/s-l225.jpg" /></a></div>
<blockquote class="tr_bq">
Do NOT shine the 5 mW laser pointer directly into the eye!</blockquote>
Its OD (outside diameter) is about 9/16", but the laser beam is only about 1/8" wide. I want to "fire" the laser from within the wax to diffract the beam, so I will use 2 different drill bits.<br />
<br />
From the side of the Death Star opposite the dimple, drill a hole wide enough to accommodate the laser pointer, but do NOT drill all the way through (i.e., leave the dimple intact). Now drill a 1/8" hole in the middle of the dimple to let the laser light escape.<br />
<h2>
The MasterCard moment</h2>
<div>
Insert the laser pointer into the larger hole, and shine it THROUGH the 1/8" hole. $3 for the mold, $2 for the laser pointer, $x for the scented candle my wife got as a gift. The Death Star shining in your hand: priceless. The laser beam actually lighting up a screen is extra credit.</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIo3RzA7vpDmxwRRW3h-PN2TgxSNzIUleenmipzTqf7lao8vW6slmFsxnRk7GcEt9nq7k6A8nIbXJuFVFMoMMcCyStmCunjFrmpJLW8FDE27bKZdOk3WaQaZBWk2SxQ20tawSLHFM9NR_4/s1600/IMG_20160730_181455.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIo3RzA7vpDmxwRRW3h-PN2TgxSNzIUleenmipzTqf7lao8vW6slmFsxnRk7GcEt9nq7k6A8nIbXJuFVFMoMMcCyStmCunjFrmpJLW8FDE27bKZdOk3WaQaZBWk2SxQ20tawSLHFM9NR_4/s400/IMG_20160730_181455.jpg" width="225" /></a></div>
<br /></div>
If your Death Star looks cooler than this, please share a picture and your method.<br />
<br />