Presented By O'Reilly and Cloudera
Make Data Work
Feb 17–20, 2015 • San Jose, CA

The IoT P2P Backbone

10:40am–11:20am Friday, 02/20/2015
Machine Data / IoT
Location: LL21 E/F

With the cost of manufacturing sensor devices now below the $100 mark, we are seeing an explosion of connected sensors and control devices. Several trends can be observed:

  • The majority of these devices are partially connected devices. The connection to the Internet is rarely direct and instead these devices connect via Bluetooth LE, or similar technology, to a higher-power device that is itself fully connected to the Internet.
  • Sensors usually capture data at 10–1000 Hz, but store or broadcast at only 1–10 Hz. The rest of the data is usually put through some algorithm to infer a derived metric and then thrown away. Only the derived metric is broadcast; the raw data is unavailable to the development team except in a lab environment.
  • CPU and storage capacity on these low-power devices will continue to grow orders of magnitude faster than networks and batteries evolve. This is especially true in consumer networks, where the connection is usually asymmetric and data upload is a known bottleneck. The amount of data that is generated and then thrown away on the device will continue to grow as the gap between CPU and network capacity widens.
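The capture-versus-broadcast gap described above can be sketched in a few lines. This is a minimal illustration, not any particular device's firmware: the function names, the 100 Hz capture rate, the 1 Hz broadcast rate, and the choice of root-mean-square as the derived metric are all assumptions made for the example.

```python
import math

def derive_metric(window):
    """Root-mean-square of one window of raw samples (an assumed metric)."""
    return math.sqrt(sum(x * x for x in window) / len(window))

def reduce_stream(samples, capture_hz, broadcast_hz):
    """Collapse a stream captured at capture_hz into one derived value per
    broadcast interval. The raw samples in each window are not retained."""
    window_size = capture_hz // broadcast_hz
    derived = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        derived.append(derive_metric(samples[start:start + window_size]))
    return derived

# Two seconds of a hypothetical 100 Hz capture of a 5 Hz sine wave.
raw = [math.sin(2 * math.pi * 5 * n / 100) for n in range(200)]
broadcast = reduce_stream(raw, capture_hz=100, broadcast_hz=1)

# 200 raw samples go in, 2 derived values come out.
print(len(raw), len(broadcast))
```

Everything the downstream system can ever learn about the signal is contained in the short `broadcast` list, which is the limitation the rest of this abstract is concerned with.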

To work around these constraints the current development cycle for these sensor devices is usually as follows:

  1. A fully connected sensor device prototype is built to collect and broadcast all data at the higher sampling frequencies. This usually happens in a lab environment.
  2. This data is used to develop and train algorithms, e.g. infer a derived metric such as foot stride cadence via 3-axis acceleration data.
  3. Once ready, the algorithm is compiled and shipped to the device, usually as a firmware update. This process is iterated till we arrive at a good algorithm in a prototype device.
  4. To then move to production lines of the actual devices, the current working recipe is to go through Kickstarter to get the minimum funding required to go to production in Shenzhen. But these production devices, unlike the lab prototype devices, don’t store or forward the raw sensor data.
  5. Even when battery is not a concern, e.g. for devices that remain powered and have full Internet connectivity, it’s still common to subsample events to address network bandwidth limitations. Examples are mains-powered home automation devices or a connected washing machine.
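As an illustration of the kind of derived metric mentioned in step 2, the sketch below estimates stride cadence from 3-axis acceleration by counting upward zero crossings of the mean-removed acceleration magnitude. This is an assumed, deliberately simple technique chosen for the example, not the algorithm of any specific product; the function name and the synthetic test signal are hypothetical.

```python
import math

def cadence_steps_per_min(ax, ay, az, sample_hz):
    """Estimate steps per minute from 3-axis acceleration samples.

    Assumption: each step produces one upward crossing of the
    acceleration magnitude through its own mean value.
    """
    mag = [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]
    mean = sum(mag) / len(mag)
    centered = [m - mean for m in mag]
    steps = sum(1 for prev, cur in zip(centered, centered[1:]) if prev <= 0 < cur)
    duration_s = len(mag) / sample_hz
    return 60.0 * steps / duration_s

# Synthetic 10 s capture at 100 Hz: gravity plus a 2 Hz vertical bounce.
hz = 100
t = [n / hz for n in range(10 * hz)]
ax = [0.0] * len(t)
ay = [0.0] * len(t)
az = [9.81 + 3.0 * math.cos(2 * math.pi * 2.0 * ti) for ti in t]

cadence = cadence_steps_per_min(ax, ay, az, hz)
print(cadence)
```

On real data a zero-crossing counter like this is sensitive to noise and sensor placement, which is exactly why such algorithms are tuned against raw lab data before being frozen into firmware.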

Given the current state of affairs described above, we can identify the following two pain points from an algorithmic perspective:

  • Given the limited size of the training set, which consists of lab data from prototype devices, we wonder how representative it is, compared to the full universe of production devices. It’s hard to evaluate the quality of the predictions (and their actionability) for the production devices.
  • Model improvements are slow to happen, as they require lab testing, training, and asking users to upgrade their devices’ firmware.

Analyzing these two pain points, we unveil what we think are HUGE issues in the current ecosystem:

  • In some domains, developing the algorithm in the lab and then encoding it in the device firmware is probably “good enough”, e.g. when applying a simple Fourier transform. In other domains this approach will be severely limiting. Where the quality of the prediction and the estimation bias must be measured, the lack of access to the raw data means sensor devices will not reach the quality a mass market requires. To illustrate with an analogy from computational advertising: if I train an ad click prediction algorithm using data from a sports site, but then use it on a finance site, I am likely to get bad results. But I won’t know, because I cannot measure my sports-trained algorithm on the finance site and am therefore blind to the quality being achieved.
  • In some domains, the complexity of high-quality algorithms will exceed what the sensor device can process on-chip by itself, and the vast amount of raw data generated by the sensor device will be stored on the device, since transferring it under partial network connectivity is infeasible. When the device becomes fully connected, the data is dumped to a more powerful environment for analysis. Unfortunately, this process is not fast, given the cost of transfer and computation. As an example, when an airplane lands, the raw data collected in flight from all sensor devices in each jet engine is transferred to the cloud, where it’s analyzed. This transfer can take tens of minutes, and if the outcome of the analysis is that the airplane needs to be serviced, the transfer-induced latency becomes a significant cost to the airline. The ideal scenario would be for the airline to know about the need for servicing while the airplane is still in the air. This serviceability issue exists throughout manufacturing.

Summarizing, we unveil two issues due to the computational resource gap between CPUs, storage and network on IoT sensor devices:

  • Undefined prediction quality.
  • Latency in generating predictions.

Therefore, in order to make the IoT a wider reality, we believe applications must be developed through a backbone that must ensure:

  • Partial predictions can be made on-device with limited access to the full prediction model. Online learning with partial knowledge happens on-device, synchronizing the model parameters whenever possible with other devices.
  • The online learning must be made on much larger datasets, and compared against actual events. Concretely, the ability for users to label events and train the algorithm in a distributed online crowd-sourced fashion is critical.

The computation challenge of creating this backbone can then be summarized as follows:

Each sensor device is only exposed to a fraction of the training samples, and therefore of the model parameters. The device needs to sync with its peers to get a complete view of the model. The device needs to learn locally in batches of N samples and regularly exchange the learned model with its peers. Transmitting the model has a latency and storage cost, whereas working in isolation has an accuracy cost. The computational challenge is therefore to determine the number of model parameters and the mini-batch size that give the best trade-off between time and space. This trade-off depends on the CPU, storage, battery and network connectivity of the device, and on the size of a training sample.
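One way to picture this challenge is local mini-batch learning with periodic parameter averaging between peers. The sketch below is an assumption-laden toy, not the backbone proposed in this session: it uses a two-parameter linear model, two simulated devices that each only ever see one of the two features, plain stochastic gradient descent, and a simple average as the sync step. All names (`local_sgd`, `average`, `train`) and all constants are hypothetical.

```python
import random

def local_sgd(weights, batch, lr):
    """Run SGD over one mini-batch for a linear model y = w . x."""
    w = list(weights)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def average(models):
    """Sync step: element-wise mean of the peers' parameter vectors."""
    return [sum(ws) / len(models) for ws in zip(*models)]

def train(device_data, batch_size, rounds, lr=0.1):
    dim = len(device_data[0][0][0])
    models = [[0.0] * dim for _ in device_data]
    for r in range(rounds):
        for d, data in enumerate(device_data):
            start = (r * batch_size) % len(data)
            models[d] = local_sgd(models[d], data[start:start + batch_size], lr)
        shared = average(models)  # transmission cost grows with dim
        models = [list(shared) for _ in models]
    return models[0]

# True model is y = 2*x0 - 1*x1, but each device observes only one feature.
rng = random.Random(0)
u0 = [rng.uniform(-1, 1) for _ in range(200)]
u1 = [rng.uniform(-1, 1) for _ in range(200)]
device_a = [((u, 0.0), 2.0 * u) for u in u0]
device_b = [((0.0, u), -1.0 * u) for u in u1]

learned = train([device_a, device_b], batch_size=10, rounds=100)
print(learned)
```

Neither simulated device could recover both parameters alone, since each never sees the other feature. In this toy, `batch_size` controls how much local work happens between syncs and the length of the parameter vector controls how much must be transmitted per sync, which are the two knobs the paragraph above identifies.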


Bruno Fernandez-Ruiz


Bruno Fernandez-Ruiz is a Yahoo Senior Fellow and VP of Personalization Platforms, overseeing the development and delivery of Yahoo’s personalization technology, which Bruno’s teams use to harvest deep user insights in order to deliver a personal stream of content including native stream ads that are uniquely relevant to users.

Prior to joining Yahoo, Bruno founded OneSoup, a mobile messaging startup later acquired by Synchronica. Before that, at Accenture’s Technology Labs, he co-founded Accenture’s Claim Solutions team for Java and led the creation of Meridea’s Mobile Application Framework, jointly with Nokia and Sampo. Bruno holds an MSc degree in Operations Research and Transportation Science from MIT, and an MEng in Structural Engineering from Universidad Politécnica de Madrid in Spain.


