MyDrive Technology

Understanding the Road Network

At MyDrive we have years of experience with GPS data and digital maps. We understand the nuances of the road network intimately, and use this knowledge to improve our analysis of driving.

MyDrive is unique in the way we match the GPS data from drivers to the underlying map of the road. Every GPS data point, recorded once per second, is located against a road link, which enables us to tell not only whether the driver was on a three-lane motorway or a country road, but also whether they were approaching a junction or navigating a tight corner. Understanding the context of a driver’s actions means we can understand not just what a driver did, but the likely reasons why they did it.
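To make the idea of locating a GPS point against a road link concrete, here is a minimal Python sketch. It is an illustration only, not MyDrive's matching method: it assumes coordinates have already been projected to a flat plane in metres, treats every link as a single straight segment, and simply picks the closest one. The names RoadLink, distance_to_link and nearest_link are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple

Point = Tuple[float, float]  # assumed planar (x, y) position in metres


@dataclass(frozen=True)
class RoadLink:
    """Hypothetical road link: one straight segment with a road class."""
    link_id: str
    road_class: str
    start: Point
    end: Point


def distance_to_link(point: Point, link: RoadLink) -> float:
    """Shortest distance from a point to the link's segment."""
    px, py = point
    (ax, ay), (bx, by) = link.start, link.end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate link: both ends coincide.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment's line, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_link(point: Point, links: Iterable[RoadLink]) -> RoadLink:
    """Return the candidate link closest to the GPS point."""
    return min(links, key=lambda link: distance_to_link(point, link))
```

Once a point has a link, the link's attributes (here only road_class) supply the context for what the driver was doing. Choosing the nearest link for each point in isolation can pick the wrong road near junctions or parallel carriageways, which is why connectivity between links also matters.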

There are around 4 million discrete road links in the United Kingdom, ranging in length from a couple of metres to several kilometres. Dealing with a digital map means managing the geometry of the road network, the characteristics of each link, and whether cars are allowed to drive from one link to another.

The MyDrive platform uses the mature PostgreSQL database and PostGIS extensions for the geospatial aspects of working with GPS and map data. Our proprietary micro-routing engine then uses the connectivity between road links to ensure the journey we match is valid in terms of vehicle navigation.
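The connectivity check can be pictured with a small Python sketch. This is not the proprietary micro-routing engine, only an illustration of the underlying idea: a matched journey is a sequence of link IDs, and it is navigable only if each move from one link to the next is a permitted transition. The function name is_valid_journey and the allowed_turns mapping are assumptions made for the example.

```python
from typing import Dict, Sequence, Set


def is_valid_journey(link_sequence: Sequence[str],
                     allowed_turns: Dict[str, Set[str]]) -> bool:
    """Check that every change of link follows a permitted transition.

    allowed_turns maps a link ID to the set of link IDs a vehicle may
    legally drive onto from it (hypothetical structure for illustration).
    """
    for current, following in zip(link_sequence, link_sequence[1:]):
        if current == following:
            # Consecutive GPS points on the same link need no transition.
            continue
        if following not in allowed_turns.get(current, set()):
            return False
    return True
```

For example, with allowed_turns = {"A": {"B"}, "B": {"C"}}, the sequence A, A, B, C is accepted, while A, C is rejected because no vehicle can drive directly from A to C.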

Data Analysis

Hadoop

Apache Hadoop is a reliable, scalable platform for distributed computing. An open-source project modelled on Google’s internal systems, Hadoop enables scalable, fault-tolerant analysis of big data. Hadoop is in heavy production use by companies including Yahoo!, Facebook, eBay and Twitter to process terabytes of data.

MyDrive started using Hadoop to process billions of GPS data points to learn how drivers influenced average speeds on road links. The same techniques can be used to understand how those road links influence driver behaviour, at a scale that wasn’t possible just a few years ago.
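The shape of such an analysis can be sketched in plain Python. This does not use the Hadoop API and is not MyDrive's job code; it only mimics the map and reduce phases in a single process, assuming each input record is a (link_id, speed_kmh) pair. The function names and record layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def map_phase(records: Iterable[Tuple[str, float]]
              ) -> Iterator[Tuple[str, Tuple[float, int]]]:
    """Emit (link_id, (speed, 1)) so sums and counts can be combined."""
    for link_id, speed_kmh in records:
        yield link_id, (speed_kmh, 1)


def reduce_phase(pairs: Iterable[Tuple[str, Tuple[float, int]]]
                 ) -> Dict[str, float]:
    """Add up speeds and counts per link, then divide for the average."""
    totals = defaultdict(lambda: [0.0, 0])
    for link_id, (speed_sum, count) in pairs:
        totals[link_id][0] += speed_sum
        totals[link_id][1] += count
    return {link_id: s / n for link_id, (s, n) in totals.items()}
```

Emitting a (sum, count) pair rather than a bare speed means partial results from different machines can be merged before the final division, which is what lets this kind of calculation spread across a cluster.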

Tools and Technologies

Continuous Integration and Automated Testing

A comprehensive suite of unit tests and integration tests means that changes and improvements to our systems can be made quickly and safely. Every new feature or piece of functionality should be accompanied by tests that serve to document the expected behaviour, and verify that it works. Tests are run automatically to detect regressions, and this means that refactoring and optimisation can be performed safe in the knowledge that end-user functionality won’t be affected.

All commits to our version control system automatically trigger a new build on our CI server, which alerts the team to any issues immediately. A robust testing regime helps to avoid the problems of ‘legacy code’, which developers are afraid to touch because they don’t know how it works or whether they have broken anything.

Scalable Automated Infrastructure

Using commercial Infrastructure-as-a-Service providers enables us to scale systems dynamically. We aggressively automate the configuration and management of our systems using tools like Opscode Chef, which means our infrastructure is robust and replicable.

Automated configuration of infrastructure means that development and staging environments can match production, minimising differences in behaviour between test and live systems. It also means that disaster recovery and business continuity strategies can be more easily formulated and tested.

Built on Solid Foundations

The MyDrive platform is built on solid, proven technologies from the open-source community. From the Linux operating system and the programming languages we use, to the PostgreSQL database and the Hadoop MapReduce framework, our tools are in production use by thousands of companies all over the world.

Building on open-source foundations means we can choose from a rich ecosystem of tools and technologies to solve problems, enables us to customise those tools if necessary, and gives us the opportunity to contribute improvements back to the community.