Sep 5, 2017

Path to a multiplayer AR game

For slightly less than a year, I've been working on a hobby project to create a peer-to-peer AR game on the iPhone. I first looked into indoor localization using audible chirps (linear frequency modulation, in technical terms). I worked out the signal processing, borrowing heavily from radar fundamentals, and wrote about it in a blog entry. But when I wrote a CoreAudio-based program for my phone, I found out in March of this year (about 6 months into the effort) that the received audio signals were heavily distorted and attenuated, and therefore quite weak. I then pivoted toward a SLAM-based approach, and have been trying to solve the problem with the device pose estimate from ARKit, as you can see in the picture of the coordinate frames and the observable in my problem definition.
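For readers unfamiliar with the chirp idea, here is a minimal sketch of ranging with a linear FM chirp and a matched filter (cross-correlation against the transmitted waveform). It is not the processing from my earlier entry: the sample rate, sweep band, delay, and the function names `lfm_chirp` and `matched_filter_delay` are all made up for illustration, and it assumes an ideal, noise-free channel with synchronized clocks and one-way propagation.

```python
import math

def lfm_chirp(f0, f1, duration, fs):
    """Real-valued linear FM chirp sweeping f0 -> f1 Hz over `duration` seconds."""
    n = int(round(duration * fs))
    k = (f1 - f0) / duration  # sweep rate in Hz/s
    return [math.cos(2 * math.pi * (f0 * t + 0.5 * k * t * t))
            for t in (i / fs for i in range(n))]

def matched_filter_delay(received, template):
    """Return the sample lag where the correlation with the template peaks."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(len(received) - len(template) + 1):
        # zip stops at the end of the template, so this is one correlation lag
        val = sum(r * s for r, s in zip(received[lag:], template))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

fs = 8000                                    # assumed sample rate (Hz)
chirp = lfm_chirp(500.0, 3000.0, 0.02, fs)   # assumed 20 ms sweep
true_delay = 37                              # made-up delay in samples
# Idealized received signal: delayed, attenuated copy of the chirp, no noise
received = [0.0] * true_delay + [0.5 * s for s in chirp] + [0.0] * 50

estimated_delay = matched_filter_delay(received, chirp)
range_m = estimated_delay / fs * 343.0       # speed of sound ~343 m/s in air
```

In this idealized setting the correlation peak lands exactly on the true delay. With the distortion and attenuation I saw on the phone, the peak is much harder to pick out.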
Just last week, Google released ARCore to keep up with Apple.  Last night, I read about it in this review by Matt Miesnieks.

I was impressed by Mr. Miesnieks' technical depth and breadth in this area. His comments about the difficulty of multi-player AR struck me as spot on, and would normally have disheartened me. But thankfully, I reached a mini-milestone: I collected data from a 2-minute calibration session using my wife's phone and mine, and put it through the 1st stage of the AR map alignment algorithm I have been working on, as you can see below:
o: measurement of the other device's normalized pixel location on each of the 2 phones in the 1-1 calibration session.
x: expected normalized pixel location based on the estimate of the 6-DOF offset between the 2 phones.
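To make the "expected" marker concrete, here is a rough sketch of how such a prediction could be computed with a pinhole model. The frame conventions are my assumptions for this example: the camera looks along +z (ARKit's camera frame looks along -z, so signs would differ there), the offset maps the other phone's map into this phone's map, and `predict_pixel` and `yaw_rotation` are hypothetical helper names, not ARKit calls.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def yaw_rotation(angle):
    """Rotation about the vertical (y) axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def predict_pixel(p_other_map, offset_rot, offset_trans, cam_rot, cam_pos):
    """Predict where the other phone appears in this phone's image.

    p_other_map: other phone's position in its own map frame
    offset_rot, offset_trans: assumed offset taking the other map to this map
    cam_rot, cam_pos: this camera's orientation (camera -> map) and position
    Returns normalized image coordinates (x/z, y/z), camera looking along +z.
    """
    # Move the point into this phone's map using the estimated offset
    p_map = [a + b for a, b in zip(mat_vec(offset_rot, p_other_map), offset_trans)]
    # Express it in the camera frame (inverse of a rotation is its transpose)
    p_cam = mat_vec(transpose(cam_rot), [a - b for a, b in zip(p_map, cam_pos)])
    # Pinhole projection to normalized coordinates
    return (p_cam[0] / p_cam[2], p_cam[1] / p_cam[2])

identity = yaw_rotation(0.0)
origin = [0.0, 0.0, 0.0]
u, v = predict_pixel([0.2, -0.1, 2.0], identity, origin, identity, origin)
```

With a zero offset and the camera at the origin, a point 2 m ahead and slightly off-axis lands at (0.1, -0.05) in normalized coordinates. The difference between such a prediction and the measured location is what the plot compares.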
Since this fit is BEFORE the least-squares-based 2nd stage of the alignment algorithm, this is encouraging! And here is the result of 10 iterations of nonlinear least squares (which I explained in a previous blog entry), which shows that I am on the right track!
Residual innovation (measured vs. predicted) after 10 iterations of nonlinear least squares.
But even here, I can see that my model is not accurate enough to bring the residual down to the level needed for high-quality interactive AR gameplay, which suggests 2 things:
  1. I should expand the model by adding at least another 3 states (maybe even 6) to the 3 states I am currently estimating.
  2. At least some of the estimates may have to be updated even after the initial convergence.
The 2nd is going to be particularly painful, so I want to first evaluate how quickly ARKit drifts after the initial convergence.
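For anyone who hasn't seen the earlier entry, the shape of the nonlinear least squares stage can be sketched with Gauss-Newton iterations on a 3-state toy problem. This is not my actual alignment model: the states here (a 2D translation plus a yaw angle), the point data, and the function names are all stand-ins chosen to keep the example short.

```python
import math

def predict(state, p):
    """Toy measurement model: rotate point p by yaw, then translate."""
    tx, tz, yaw = state
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + tz)

def jacobian_rows(state, p):
    """Partial derivatives of the two predicted coordinates w.r.t. (tx, tz, yaw)."""
    c, s = math.cos(state[2]), math.sin(state[2])
    return ([1.0, 0.0, -s * p[0] - c * p[1]],
            [0.0, 1.0, c * p[0] - s * p[1]])

def solve3(a, b):
    """Solve the 3x3 system a x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][k] * x[k] for k in range(r + 1, 3))) / m[r][r]
    return x

def gauss_newton(points, measurements, state, iterations=10):
    """Refine `state` by repeatedly solving the normal equations J^T J d = J^T r."""
    for _ in range(iterations):
        jtj = [[0.0] * 3 for _ in range(3)]
        jtr = [0.0] * 3
        for p, q in zip(points, measurements):
            pred = predict(state, p)
            residuals = (q[0] - pred[0], q[1] - pred[1])  # measured minus predicted
            for row, res in zip(jacobian_rows(state, p), residuals):
                for i in range(3):
                    jtr[i] += row[i] * res
                    for j in range(3):
                        jtj[i][j] += row[i] * row[j]
        delta = solve3(jtj, jtr)
        state = [s + d for s, d in zip(state, delta)]
    return state

true_state = [0.5, -0.3, 0.4]                       # made-up ground truth
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.5), (2.0, -1.0)]
measurements = [predict(true_state, p) for p in points]  # noise-free toy data
estimate = gauss_newton(points, measurements, [0.0, 0.0, 0.0])
```

On noise-free toy data like this the estimate recovers the true state well within 10 iterations. Adding more states to the real model would mean more columns in each Jacobian row and a larger normal-equation system, but the loop itself stays the same.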

Although this is supposed to be a tough problem, this hobby project has given me the chance to learn new things and to review much of what I learned in school. I am curious whether I can actually pull off an algorithm that is as difficult as Mr. Miesnieks suggests; I'll find out in the next couple of months.