[WIP] Inertial Odometry: From Phone Sensors to 3D Tracking

Inertial odometry is the art of estimating where you are using only motion sensors — no GPS, no cameras, no external infrastructure. Just accelerometers and gyroscopes, and a lot of math to fight the inevitable drift. This post walks through building an inertial odometry system from first principles using a phone's IMU, covering the sensor model, double integration, quaternion orientation, drift correction techniques, and the Madgwick sensor fusion filter.

## The Taken 2 Problem

In *Taken 2*, Bryan Mills is kidnapped, blindfolded, and thrown in the trunk of a car. He can't see anything. But he can *feel* — the accelerations, the turns, the stops. He counts seconds between turns, estimates speeds from how hard he's thrown around, and mentally reconstructs the route.

This is inertial odometry. Estimate your trajectory using only what you feel: linear acceleration and angular velocity. No external references. The sensors are your inner ear — or in our case, a phone's IMU.

The catch: every measurement has noise, every estimate has error, and those errors *accumulate*. Bryan had maybe 10 minutes. We want to track for longer. That's where things get interesting.

## What Does an IMU Actually Measure?

An Inertial Measurement Unit (IMU) contains two sensors:

**Accelerometer** — measures *specific force* along three axes (X, Y, Z) in m/s². This is **not** pure acceleration. When the phone sits still on a table, it reads approximately `[0, 0, 9.81]` — the reaction force to gravity. The accelerometer cannot distinguish between gravity and real acceleration. This ambiguity is the central challenge of inertial odometry.

**Gyroscope** — measures *angular velocity* around three axes in rad/s. When the phone is still, it reads approximately `[0, 0, 0]`. When rotating, it tells you how fast and around which axis.
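To make the gravity ambiguity concrete, here is a minimal numpy sketch (the 10° tilt is an arbitrary illustration, not real sensor data): a stationary phone pitched forward reads a nonzero X component that is indistinguishable from genuine horizontal acceleration.

```python
import numpy as np

g = 9.81  # gravity magnitude, m/s^2

# A stationary phone lying flat reads the reaction force to gravity along Z.
flat_reading = np.array([0.0, 0.0, g])

# The same stationary phone pitched forward by 10 degrees reads part of
# gravity along X, which looks exactly like real horizontal acceleration.
pitch = np.radians(10.0)
tilted_reading = np.array([-g * np.sin(pitch), 0.0, g * np.cos(pitch)])

print(tilted_reading[0])               # ~ -1.70 m/s^2 of "phantom" X accel
print(np.linalg.norm(tilted_reading))  # 9.81: the magnitude is always g at rest
```

Note that the *magnitude* stays at g in both cases: only the direction changes. Both facts will matter later, when we remove gravity with a quaternion and sanity-check acceleration against g.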
<!-- VISUAL: Two side-by-side plots from step1 showing ~10s of raw accel (3 axes, hovering around [0, 0, 9.81]) and raw gyro (3 axes, hovering around [0, 0, 0]) while the phone is stationary. Caption: "Raw IMU data at rest. The accelerometer reads gravity along Z. The gyroscope reads near-zero." -->

Both sensors are sampled at ~100 Hz on a typical phone. At rest, the readings aren't perfectly constant — they fluctuate around a mean value. This fluctuation is **noise**, and the mean offset from the expected value is **bias**. Understanding these two properties is where we start.

## Characterizing the Sensor: Bias and Noise

Before tracking anything, we need to know how wrong our sensors are. We do this by collecting data while the phone is perfectly still and analyzing the statistics.

```python
# Collect ~10 seconds of data at rest, then compute:
accel_samples = np.array(all_accel_readings)  # shape: (N, 3)
accel_mean = np.mean(accel_samples, axis=0)   # e.g., [0.012, -0.034, 9.823]
accel_std = np.std(accel_samples, axis=0)     # e.g., [0.008, 0.009, 0.011]

# Gravity magnitude: measured from the sensor itself
gravity = np.linalg.norm(accel_mean)          # e.g., 9.8235

# Bias: deviation from the expected reading [0, 0, g]
# When we subtract this from future readings, a stationary phone should read ~[0, 0, g]
accel_bias = accel_mean - np.array([0.0, 0.0, gravity])
```

The same is done for the gyroscope (where the expected reading at rest is `[0, 0, 0]`). This gives us a calibration file:

```json
{
  "accel_bias": {"x": 0.012, "y": -0.034, "z": -0.0005},
  "accel_std": {"x": 0.008, "y": 0.009, "z": 0.011},
  "gyro_bias": {"x": 0.00032, "y": -0.00018, "z": 0.00041},
  "gyro_std": {"x": 0.0012, "y": 0.0011, "z": 0.0013},
  "gravity": 9.8235
}
```

<!-- VISUAL: Histogram of accelerometer Z-axis readings at rest, showing a Gaussian distribution centered at ~9.82 with std ~0.01. Annotate the mean (bias) and the ±1σ range. Caption: "Accelerometer Z at rest: the mean is our bias estimate, the spread is sensor noise." -->

The **bias** gets subtracted from every future reading. The **std** tells us the noise floor — any reading smaller than this is indistinguishable from noise.

**Why this matters**: even a tiny uncorrected bias of 0.01 m/s² in the accelerometer, when double-integrated, produces 0.5 meters of position drift in just 10 seconds. Calibration isn't optional.

## From Acceleration to Position: Double Integration

With calibrated sensors, the simplest tracking pipeline is:

1. Subtract bias from accelerometer
2. Subtract gravity (assuming the phone is flat: gravity is along Z)
3. Integrate acceleration → velocity
4. Integrate velocity → position

```python
# Bias correction
accel_corrected = accel_raw - accel_bias
# e.g., stationary: [0.012, -0.034, 9.823] - [0.012, -0.034, -0.0005] = [~0, ~0, ~9.82]

# Remove gravity (flat phone assumption: gravity is entirely along Z)
accel_linear = accel_corrected - np.array([0, 0, gravity])
# Stationary result: [~0, ~0, ~0] — no linear acceleration. Good.

# Double integration
velocity += accel_linear * dt  # integrate acceleration → velocity
position += velocity * dt      # integrate velocity → position
```

<!-- VISUAL: Diagram showing the pipeline as a flow: raw accel → (subtract bias) → corrected accel → (subtract gravity) → linear accel → (∫ dt) → velocity → (∫ dt) → position. Caption: "The double integration pipeline. Simple, but fragile." -->

This works surprisingly well for short motions on a flat table — slide the phone left, the visualization tracks it left. But it breaks down quickly:

**Problem 1: Bias residuals integrate to drift.** Even after calibration, a residual bias of 0.005 m/s² (half the noise floor) produces:

```
velocity after 10s: 0.005 × 10 = 0.05 m/s
position after 10s: ½ × 0.005 × 10² = 0.25 m
position after 60s: ½ × 0.005 × 60² = 9.0 m
```

Error grows as **t²**. Double integration is an error amplifier.
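The t² growth is easy to check numerically. A small simulation (assuming the 100 Hz sample rate and the 0.005 m/s² residual bias used above) reproduces those figures:

```python
import numpy as np

dt = 0.01     # 100 Hz sample interval
bias = 0.005  # residual accelerometer bias, m/s^2

def drift_after(seconds):
    """Double-integrate a constant bias; returns (velocity, position)."""
    v = p = 0.0
    for _ in range(int(round(seconds / dt))):
        v += bias * dt  # acceleration -> velocity
        p += v * dt     # velocity -> position
    return v, p

for t in (10, 30, 60):
    v, p = drift_after(t)
    print(f"after {t:>2}s: velocity {v:.3f} m/s, position {p:.2f} m")
# after 10s: velocity 0.050 m/s, position 0.25 m
# after 60s: velocity 0.300 m/s, position 9.00 m
```

Doubling the duration quadruples the position error, which is exactly why every correction technique in the rest of this post exists.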
**Problem 2: The flat phone assumption.** We subtracted gravity along sensor Z, but this only works if the phone is perfectly flat. Tilt it 5° and gravity leaks into X/Y:

```
gravity_leak_xy = 9.81 × sin(5°) ≈ 0.86 m/s²
```

That's two orders of magnitude larger than the bias residual. After 10 seconds, it would produce ~43 meters of drift. The flat phone assumption is not just limiting — it's catastrophically wrong for any real use.

## Adding Rotation: Gyroscope and Quaternions

To handle arbitrary phone orientation, we need to know which way the phone is pointing. The gyroscope gives us angular velocity, and by integrating it, we can track orientation.

### Quaternions in 30 Seconds

We represent orientation as a quaternion `q = [w, x, y, z]` — a 4-component mathematical object that encodes a 3D rotation without the singularities (gimbal lock) that plague Euler angles. Key operations:

```python
# Rotate a 3D vector by quaternion q (the "sandwich product")
def rotate_vector(v, q):
    q_conj = [q[0], -q[1], -q[2], -q[3]]  # conjugate = inverse for unit quaternions
    v_quat = [0.0, v[0], v[1], v[2]]      # embed vector as pure quaternion
    result = quaternion_multiply(quaternion_multiply(q, v_quat), q_conj)  # q ⊗ v ⊗ q*
    return result[1:4]                    # extract vector part

# Update quaternion from gyroscope reading
omega_dt = gyro_corrected * dt            # rotation vector for this timestep
dq = axis_angle_to_quaternion(omega_dt)   # convert to incremental quaternion
q = quaternion_multiply(q, dq)            # apply: q_new = q_old ⊗ dq
q = q / np.linalg.norm(q)                 # re-normalize
```

Right-multiplying (`q ⊗ dq`) applies the rotation in the body (sensor) frame, which is correct since the gyroscope measures angular velocity in its own frame.

### Rotation-Aware Gravity Removal

With a tracked quaternion, we can finally remove gravity correctly at any orientation:

```python
# 1. Bias-correct the accelerometer (in sensor frame)
accel_corrected = accel_raw - accel_bias

# 2. Rotate from sensor frame to world frame using the quaternion
accel_world = rotate_vector(accel_corrected, q)

# 3. In world frame, gravity is ALWAYS [0, 0, g] — subtract it cleanly
accel_linear = accel_world - np.array([0, 0, gravity])
```

<!-- VISUAL: A diagram showing two coordinate frames: the phone (tilted, with accel vector pointing along sensor Z) and the world (with gravity pointing down along world Z). An arrow labeled "q rotates sensor → world" connects them. Caption: "The quaternion rotates the sensor reading into the world frame, where gravity subtraction is trivial." -->

This is the key insight of step 5: *the gyroscope solves the gravity problem*. No matter how the phone is oriented, we can always isolate the linear acceleration.

### The Gravity Leakage Problem

But the quaternion is only as good as the gyroscope that drives it. Gyro bias, even after calibration, accumulates into orientation error:

```
orientation error after 30s ≈ 0.001 rad/s × 30 s = 0.03 rad ≈ 1.7°
gravity_leak = 9.81 × sin(1.7°) ≈ 0.29 m/s²
position drift over the next 10s = ½ × 0.29 × 10² = 14.5 m
```

<!-- VISUAL: A plot showing quaternion-derived Euler angles drifting slowly over 60 seconds even though the phone is stationary. The yaw angle shows the most visible drift. Caption: "Gyro-only orientation: the angles drift because there's nothing to anchor them." -->

The gyroscope tells us how the phone is *changing* orientation, but it has no sense of absolute "down". Small errors in "down" cause gravity to leak into the horizontal plane, producing phantom acceleration that integrates into runaway position drift. This is the fundamental limitation of gyro-only orientation tracking.

## Taming the Drift: Practical Corrections

Pure double integration with gyro-only orientation drifts hopelessly. We can't eliminate drift entirely with the IMU alone, but several techniques make it manageable.

### Knowing When You're Still: Zero Velocity Update (ZUPT)

The single most effective correction.
The idea: if the phone isn't moving, velocity must be zero. Detect stationary periods and reset velocity.

```python
def _detect_stationary(self):
    accel_arr = np.array(self.accel_linear_history)  # last 25 samples

    # 1. Low acceleration variance — no changing forces
    accel_variance = np.var(accel_arr, axis=0).sum()
    low_variance = accel_variance < 0.1

    # 2. Low acceleration magnitude — no sustained acceleration
    accel_mean_mag = np.mean(np.linalg.norm(accel_arr, axis=1))
    low_accel = accel_mean_mag < 0.3

    # 3. Low gyro magnitude — no rotation
    gyro_mean_mag = np.mean(self.gyro_mag_history)
    low_gyro = gyro_mean_mag < 0.05

    return low_variance and low_accel and low_gyro
```

All three conditions must be true. Low variance alone isn't enough — a constant gravity leak has low variance but non-zero magnitude. Low acceleration alone isn't enough — the phone could be rotating. We need the conjunction of all three.

When stationary is detected, velocity is aggressively decayed:

```python
if is_stationary:
    if stationary_count > 10:  # stationary for >0.1s
        velocity *= 0.3        # aggressive kill
    else:
        velocity *= 0.7        # gentle decay (might start moving again)
```

**A subtle trap we discovered**: it's tempting to also require low velocity for ZUPT (surely a stationary phone has low velocity?). But this creates a deadlock — phantom velocity from drift prevents ZUPT from triggering, and ZUPT is the mechanism needed to kill that phantom velocity. ZUPT must be allowed to fire regardless of current velocity.

### Sanity-Checking Acceleration: The Gravity Magnitude Check

A clever heuristic based on a physical constraint: when the phone is not accelerating, the accelerometer magnitude must equal *g*. When it is accelerating, the magnitude deviates from *g*.

```python
accel_sensor_mag = np.linalg.norm(accel_corrected)   # |measured accel|
gravity_deviation = abs(accel_sensor_mag - gravity)  # how far from g?

if gravity_deviation < threshold:          # threshold = 0.4 m/s²
    scale = gravity_deviation / threshold  # linear ramp: 0 → 1
else:
    scale = 1.0                            # full pass-through

accel_linear *= scale
```

<!-- VISUAL: A plot showing gravity_scale over time during a push-stop motion. The scale is ~0 when stationary (|accel| ≈ g), spikes to 1.0 during the push, then returns to ~0. Caption: "The gravity check acts as a gate: near-zero when stationary, open during real motion." -->

This elegantly suppresses phantom acceleration from quaternion drift during stationary periods (when `|accel| ≈ g`), while letting real acceleration through during motion (when `|accel|` deviates significantly from `g`).

### Safety Nets: Adaptive Drag and Velocity Clamping

Even with ZUPT and the gravity check, some phantom velocity leaks through during motion windows. Two last-resort mechanisms:

```python
# Adaptive velocity drag — uses the gravity check as a proxy for "is the phone really accelerating?"
if gravity_scale < 0.5:
    velocity *= 0.99   # strong drag: decays to 37% over 1 second (at 100 Hz)
else:
    velocity *= 0.999  # gentle drag: preserves ~90% over 1 second

# Hard velocity clamp — a phone on a table rarely exceeds 1.5 m/s
speed = np.linalg.norm(velocity)
if speed > 1.5:
    velocity = velocity * (1.5 / speed)  # scale down, preserve direction
```

The adaptive drag is the more interesting one: it uses the gravity check as a real-time indicator. When `gravity_scale < 0.5`, the phone is probably not accelerating, so any current velocity is likely phantom — apply strong drag. When `gravity_scale` is high, the phone is genuinely accelerating — preserve the velocity with gentle drag.

**These are all symptomatic treatments.** They manage the consequences of orientation drift but cannot fix the root cause: the quaternion slowly rotating away from truth with no way to correct itself.
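To see how these pieces might fit together, here is a sketch of a single per-sample velocity correction combining ZUPT, the adaptive drag, and the clamp. The function name and call structure are illustrative, not the post's actual implementation; the constants are the ones quoted above.

```python
import numpy as np

def velocity_update(velocity, gravity_scale, is_stationary, stationary_count):
    """One per-sample velocity correction step (illustrative sketch)."""
    # ZUPT: decay velocity hard once the stationary detector has held for a while
    if is_stationary:
        velocity = velocity * (0.3 if stationary_count > 10 else 0.7)

    # Adaptive drag: gravity check as a proxy for "is the phone really accelerating?"
    velocity = velocity * (0.999 if gravity_scale >= 0.5 else 0.99)

    # Hard clamp: cap speed at 1.5 m/s, preserving direction
    speed = np.linalg.norm(velocity)
    if speed > 1.5:
        velocity = velocity * (1.5 / speed)
    return velocity

# Phantom velocity while stationary dies within a fraction of a second:
v = np.array([0.4, 0.0, 0.0])
for _ in range(100):  # one second at 100 Hz
    v = velocity_update(v, gravity_scale=0.1, is_stationary=True, stationary_count=50)
print(np.linalg.norm(v))  # effectively zero
```

Note the ordering: ZUPT fires before the drag, so a detected stationary period dominates, and the clamp runs last as the absolute backstop.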
## Complementary Filters and the Madgwick Filter

### The Problem: Gyro Drift Has No Anchor

The fundamental limitation of steps 2-5: the gyroscope integration is *open-loop*. It tracks changes in orientation but has no reference for absolute orientation. Over time, the quaternion drifts, gravity leaks into XY, and position runs away.

None of our fixes above can solve this. ZUPT only works when stationary. The gravity check only suppresses phantom acceleration; it doesn't fix the quaternion. We need a way to *correct the orientation itself*.

### Two Sensors, Complementary Errors

Here's the key insight: the accelerometer — the same sensor we use for position — *also* tells us about orientation. When the phone isn't accelerating, the accelerometer reads pure gravity. From the gravity direction in sensor frame, we can compute which way is "down":

```
roll  = atan2(a_y, a_z)
pitch = atan2(-a_x, sqrt(a_y² + a_z²))
```

But this only works when stationary. During motion, the accelerometer reads gravity plus linear acceleration, and the "down" estimate is wrong. The two sensors have **complementary error characteristics**:

| | Gyroscope | Accelerometer |
|---|---|---|
| **Short-term** | Accurate (smooth, responsive) | Noisy (linear accel corrupts it) |
| **Long-term** | Drifts (bias integrates) | Stable (gravity never drifts) |

A **complementary filter** combines them: trust the gyroscope for fast changes, but slowly correct toward the accelerometer's estimate of "down". In the frequency domain:

```
orientation = HighPassFilter(gyro) + LowPassFilter(accel)
```

where HPF + LPF = 1 at all frequencies — you get the best of both across the entire spectrum.

### Madgwick: Gradient Descent on Quaternions

The Madgwick filter implements this complementary principle in quaternion space using gradient descent, avoiding the gimbal lock issues of Euler angle approaches.

**The objective**: find the quaternion `q` that correctly explains the measured gravity direction.
If `q` is our orientation estimate, then rotating the world gravity `[0,0,1]` into the sensor frame should match the normalized accelerometer reading `â`:

```
f(q, â) = q* ⊗ [0, 0, 0, 1] ⊗ q − [0, â_x, â_y, â_z]
```

Expanding this with `q = [w, x, y, z]`:

```
f₁ = 2(xz − wy) − â_x
f₂ = 2(wx + yz) − â_y
f₃ = 2(0.5 − x² − y²) − â_z
```

When `f = [0, 0, 0]`, the quaternion perfectly explains the accelerometer as pure gravity.

**The gradient**: to minimize the squared error `½|f|²`, we compute `∇E = Jᵀ × f`, where `J` is the Jacobian (partial derivatives of `f` with respect to `q`):

```python
def madgwick_gradient(q, accel_normalized):
    w, x, y, z = q
    ax, ay, az = accel_normalized

    # Objective function
    f1 = 2.0*(x*z - w*y) - ax
    f2 = 2.0*(w*x + y*z) - ay
    f3 = 2.0*(0.5 - x*x - y*y) - az

    # Gradient = Jᵀ × f (4×3 matrix times 3×1 vector → 4×1)
    grad = np.array([
        -2*y*f1 + 2*x*f2,           # ∂E/∂w
        2*z*f1 + 2*w*f2 - 4*x*f3,   # ∂E/∂x
        -2*w*f1 + 2*z*f2 - 4*y*f3,  # ∂E/∂y
        2*x*f1 + 2*y*f2,            # ∂E/∂z
    ])
    norm = np.linalg.norm(grad)
    return grad / norm if norm > 1e-10 else np.zeros(4)
```

**The fusion**: each timestep, combine the gyro prediction with the accelerometer correction:

```python
# Gyro prediction: how the quaternion changes from rotation
q_dot_gyro = 0.5 * quaternion_multiply(q, [0, gx, gy, gz])

# Accelerometer correction: nudge quaternion toward measured "down"
accel_hat = accel_corrected / np.linalg.norm(accel_corrected)
gradient = madgwick_gradient(q, accel_hat)

# Fuse: gyro drives, accelerometer corrects
q_dot = q_dot_gyro - beta * gradient

# Integrate
q = q + q_dot * dt
q = q / np.linalg.norm(q)
```

<!-- VISUAL: A plot comparing quaternion Euler angles over 60s: step5 (gyro-only, visibly drifting) vs step6 (Madgwick, staying bounded). Caption: "Madgwick-corrected orientation (orange) stays bounded while gyro-only (blue) drifts continuously." -->

**The `beta` parameter** controls the balance:

| Beta | Correction speed | Motion sensitivity | Use case |
|------|-----------------|-------------------|----------|
| 0.01 | Slow (~10s) | Low | Fast, dynamic motion |
| 0.04 | Medium (~2-3s) | Medium | General tracking |
| 0.10 | Fast (~1s) | High | Slow/static applications |

Higher `beta` corrects drift faster but also reacts to linear acceleration during motion (temporarily corrupting orientation). Lower `beta` is more robust during motion but slower to correct drift. The default of 0.04 is a good compromise.

### One Last Detail: Adaptive Gravity Estimation

The Madgwick filter fixes orientation drift, which fixes gravity *direction*. But if the gravity *magnitude* stored in calibration is even slightly wrong, a constant Z residual remains:

```
residual_z = actual_gravity − calibrated_gravity
position_z_drift = ½ × residual_z × t²
```

Even a 0.003 m/s² error → 1.35 m of Z drift in 30 seconds. We fix this with a two-stage approach:

```python
# 1. At startup: 2-second precise gravity measurement
gravity_samples = [np.linalg.norm(accel - bias) for accel in startup_samples]
gravity = np.mean(gravity_samples)

# 2. During tracking: slow continuous adaptation when quasi-static
if gravity_deviation < threshold:  # phone not accelerating
    gravity = 0.999 * gravity + 0.001 * np.linalg.norm(accel_corrected)
```

The continuous estimator only updates when the phone isn't accelerating, and its very slow EMA (`alpha = 0.001`) takes ~10 seconds to converge — slow enough that brief motion can't corrupt it.

## What's Still Missing

This system works well for short-duration, small-area tracking — sliding a phone on a table, rotating it in hand. But it degrades quickly over longer distances and durations.

**For walking-scale tracking**, the phone is in continuous motion, so ZUPT rarely triggers and the gravity check stays at full pass-through. Double integration error grows unchecked.
Real pedestrian systems solve this with **step detection** — recognizing the periodic gait pattern and applying ZUPT at each footstep, giving an error-correction anchor every ~0.5 seconds.

**Yaw drift** is unsolvable with accelerometer + gyroscope alone. The Madgwick filter corrects roll and pitch (tilt relative to gravity), but rotation around the gravity axis (yaw) has no accelerometer-observable reference. Fixing yaw requires a **magnetometer** (compass) or visual features.

**The fundamental limitation** of inertial odometry is that position comes from double-integrating a noisy signal. No amount of filtering can prevent errors from growing over time. This is why production systems (phones, VR headsets, autonomous vehicles) always fuse IMU data with external references: GPS, cameras, LiDAR, or UWB beacons.

But as a learning exercise in sensor modeling, signal processing, quaternion math, and sensor fusion — building this from scratch teaches you exactly *why* those external references are needed, and what the IMU brings to the table when they are available.

<!-- VISUAL: A Rerun screenshot showing the 3D visualization: the phone cuboid, colored local axes, cyan trajectory trail, yellow distance line, and the time-series plots on the right side. Caption: "The final tracker visualized in Rerun — 6-DOF position and orientation from raw IMU data." -->