System Design: Uber — Real-time Location Tracking and Matching
January 30, 2026 · 8 min read
How Uber tracks millions of drivers in real-time, indexes their locations spatially, and matches riders to the nearest available driver.
Uber processes 14 million trips per day. The core engineering challenge: every 4 seconds, millions of driver apps send GPS coordinates. These must be stored, indexed by geography, and queried to find the nearest available driver when a rider requests a trip — in under 2 seconds.
Driver Location Ingestion
Driver apps send location updates every 4 seconds over WebSocket or HTTP. A fleet of location ingestion servers receives these, writes to Kafka for fan-out, and updates an in-memory spatial index. With 1M active drivers each sending one update every 4 seconds, that's 250K location writes per second — these go directly to Redis Geo or a custom spatial store, not a relational DB.
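At that write rate, one round trip per update is wasteful, so a common pattern (an assumption here, not something Uber documents) is to buffer incoming updates and flush them in batches. A minimal sketch, with an in-memory array standing in for the Kafka producer; the class and field names are invented:

```javascript
// Sketch of an ingestion handler that buffers GPS updates and flushes in
// batches. `flushed` stands in for a Kafka producer / Redis pipeline; a real
// server would also flush on a timer so small batches don't go stale.
class LocationIngestor {
  constructor(flushSize = 100) {
    this.buffer = [];
    this.flushSize = flushSize;
    this.flushed = []; // real system: producer.send(batch)
  }
  onUpdate(driverId, lat, lng, ts) {
    this.buffer.push({ driverId, lat, lng, ts });
    if (this.buffer.length >= this.flushSize) this.flush();
  }
  flush() {
    if (this.buffer.length === 0) return;
    this.flushed.push(this.buffer);
    this.buffer = [];
  }
}
```

Batching trades a little latency for far fewer network round trips, which matters more at 250K writes/sec than per-update freshness.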
Geospatial Indexing: S2 and Geohash
Uber uses Google's S2 library, which divides the earth's surface into a hierarchy of cells ordered along a Hilbert space-filling curve. Each driver's location maps to an S2 cell ID. Nearby drivers share a common cell-ID prefix — so a radius query becomes a range scan over sorted cell IDs rather than a distance calculation against every driver.
// Geohash alternative: encode lat/lng as a string prefix
// Nearby locations share a common prefix
// geohash("37.7749, -122.4194") → "9q8yy"
// geohash("37.7751, -122.4191") → "9q8yy" (same cell — very close)
// geohash("34.0522, -118.2437") → "9q5ct" (different — LA vs SF)
// Redis GEOADD + GEORADIUS handles this natively:
await redis.geoadd('drivers', lng, lat, driverId)
// (Redis 6.2+ deprecates GEORADIUS in favor of GEOSEARCH)
const nearby = await redis.georadius('drivers', riderLng, riderLat, 5, 'km')
Matching Algorithm
Finding the nearest driver is only part of the matching problem. Uber's dispatch system (internally, DISCO — Michelangelo is Uber's ML platform, not the matcher) considers: ETA to the rider (not raw distance — a driver 1km away stuck in traffic loses to one 1.5km away on a clear road), the driver's acceptance-rate history, trip direction (does picking you up put the driver closer to high-demand zones?), and surge pricing zones.
- Supply/demand zones: the city is divided into hexagonal cells (Uber's open-source H3 library)
- Surge pricing: calculated per H3 cell in real time from the rider-to-driver ratio
- ETA calculation: road network graph traversal (Dijkstra/A*) on a map graph
- Batch matching: in high-density areas, batch all requests in a 500ms window and solve as an assignment problem (Hungarian algorithm) to maximize total efficiency
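The batch-matching idea can be sketched with a toy solver. A production dispatcher would use the Hungarian algorithm (O(n³)); this exhaustive search with pruning is only illustrative for tiny windows, and the ETA matrix is invented data:

```javascript
// Brute-force batch matcher: assign each rider in the window to a distinct
// driver so that total ETA is minimized. `etas[r][d]` = seconds for driver d
// to reach rider r. Feasible only for tiny batches; shown to illustrate why
// batching beats greedy per-rider assignment.
function batchMatch(etas) {
  const nRiders = etas.length;
  const nDrivers = etas[0].length;
  let best = { cost: Infinity, assignment: null };
  const search = (rider, used, cost, assignment) => {
    if (cost >= best.cost) return; // prune: already worse than best found
    if (rider === nRiders) {
      best = { cost, assignment: [...assignment] };
      return;
    }
    for (let d = 0; d < nDrivers; d++) {
      if (used.has(d)) continue;
      used.add(d);
      assignment.push(d);
      search(rider + 1, used, cost + etas[rider][d], assignment);
      assignment.pop();
      used.delete(d);
    }
  };
  search(0, new Set(), 0, []);
  return best; // assignment[rider] = driver index
}
```

With `etas = [[100, 110], [105, 400]]`, greedy matching gives rider 0 the closest driver (100s) and leaves rider 1 with a 400s pickup — 500s total. The batch solution swaps them for 110 + 105 = 215s total, which is the efficiency gain the 500ms window buys.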
Real-time Trip Tracking
Once matched, the rider's app subscribes to the driver's location stream. Driver location updates go: Driver App → Location Service → Kafka → Trip Service → WebSocket push to rider. The rider sees the car move every 4 seconds. This is a classic pub/sub fan-out: one driver location event fans out to potentially multiple subscribers (rider + Uber ops dashboard + ETL pipeline).
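A minimal in-process sketch of that fan-out — the class name and in-memory bus are assumptions for illustration; Uber's version rides on Kafka and WebSocket pushes:

```javascript
// Pub/sub fan-out: one driver location event is delivered to every subscriber
// of that driver's stream (rider app, ops dashboard, ETL consumer).
class LocationBus {
  constructor() {
    this.subscribers = new Map(); // driverId -> Set<callback>
  }
  subscribe(driverId, callback) {
    if (!this.subscribers.has(driverId)) this.subscribers.set(driverId, new Set());
    this.subscribers.get(driverId).add(callback);
    // Return an unsubscribe function, called when the trip ends.
    return () => this.subscribers.get(driverId).delete(callback);
  }
  publish(driverId, location) {
    for (const cb of this.subscribers.get(driverId) ?? []) cb(location);
  }
}
```

The key property is that the publisher doesn't know who is listening: adding the ops dashboard or an ETL pipeline as a consumer requires no change to the driver-side path.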
Infrastructure at Scale
- Location store: Redis Geo for hot driver locations (in-memory, fast geo queries)
- Trip history: Cassandra for immutable, time-series trip events
- Map data: custom tile server + road graph for ETA calculations
- Dispatch service: stateless workers pulling from Kafka, doing matching, writing results
- Consistency: eventual — slight staleness in driver position is acceptable for matching