Modern houses and robots have a lot in common. Both carry many sensors and must make many decisions. In robots, the sensors are prominent and the decisions are obvious: cameras, bumpers, odometers, and battery monitors continuously collect data, and the robot uses that data to decide, moment by moment, where to point its cameras or how far to turn its wheels. The result is intelligent behavior that culminates in something helpful, like bringing you a glass of wine.
Just like robots, houses can have many sensors: thermometers, motion detectors, microphones, smoke detectors, and, increasingly, video cameras. Houses also have to make a large number of decisions. Do I heat the pool? Turn off the light? Record The Walking Dead? Call the police? Right now the most common house decisions concern energy usage and security, but the technology to automatically reorder groceries, track appliance health, and schedule cleaning and maintenance is readily available.
The biggest difference between robots and houses is that robots are typically programmed to adapt to their environment while houses usually are not. This lack of adaptation prevents our houses from anticipating our needs—from becoming helpful robots.
Brandon Rohrer details an algorithm specifically designed to help houses, buildings, roads, and stores learn to actively help the people who use them. The challenge in making homes adaptive is that it pushes machine learning beyond simple value or category assignment: the house must make many decisions, choosing from all the available actions the ones that are most appropriate. This problem, called reinforcement learning, has been addressed in fields like robotics, where data is received incrementally rather than in large batches and goals are clearly defined. Reinforcement-learning algorithms are a natural fit for the Internet of Things (IoT), where the world is continually changing, new data is constantly generated, and the best action to take can be a complex and evolving function of recent and current events.
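To make the incremental setting concrete, here is a minimal sketch of an agent that updates its estimate of each action's value after every single observation instead of waiting for a batch. It is a generic textbook-style illustration, not the algorithm presented in this session. The function names, the three thermostat actions, and the fixed reward values are all assumptions invented for the example.

```python
import random


def run_online_agent(reward_fn, actions, steps=2000, step_size=0.1,
                     explore=0.1, seed=0):
    """Learn action values one observation at a time (no batches)."""
    rng = random.Random(seed)
    value = {action: 0.0 for action in actions}
    for _ in range(steps):
        # Occasionally try a random action; otherwise take the best known one.
        if rng.random() < explore:
            action = rng.choice(actions)
        else:
            action = max(actions, key=value.get)
        reward = reward_fn(action)
        # Incremental update: nudge the estimate toward the newest reward.
        value[action] += step_size * (reward - value[action])
    return value


def comfort_reward(action):
    """Hypothetical reward for a cold house: heating helps, cooling hurts."""
    return {"heat": 1.0, "idle": 0.2, "cool": -1.0}[action]


values = run_online_agent(comfort_reward, ["heat", "idle", "cool"])
best = max(values, key=values.get)
print(best, values)
```

Because each update touches only the action that was just taken, the memory and compute cost per step stay constant no matter how long the house has been running.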
Brandon presents a novel reinforcement learning algorithm with characteristics that suit it well for the IoT and illustrates these characteristics with a video demonstration of the algorithm in action. The algorithm handles new data points one at a time as they occur, rather than saving them up in batches, and selects actions by ensemble voting: a small action-selection model is learned for each feature, and the models' votes are weighted and combined each time an action is chosen. Because it is an ensemble, the algorithm is well suited to parallel and distributed execution and can learn from a relatively small amount of data. Even on tasks with large numbers of features, it quickly reaches a basic level of performance, which it refines as it gains more experience. It uses curiosity-driven exploration to cautiously seek out better options and relies only on arithmetic matrix operations, which keeps computation time low.
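The abstract does not give the algorithm's details, so the following is only a rough sketch of the general ideas it names: one small model per feature, weighted voting, one-sample-at-a-time updates, a curiosity bonus for rarely tried actions, and plain matrix arithmetic. The class `FeatureVotingAgent`, its parameters, the occupancy features, and the reward scheme are all hypothetical and should not be read as the presented method.

```python
import numpy as np


class FeatureVotingAgent:
    """Illustrative per-feature ensemble; every name here is an assumption."""

    def __init__(self, n_features, n_actions, step_size=0.1, curiosity=0.5):
        # Row i is feature i's small model: its reward estimate per action.
        self.reward_est = np.zeros((n_features, n_actions))
        # How often each (feature, action) pair has been experienced.
        self.counts = np.zeros((n_features, n_actions))
        self.step_size = step_size
        self.curiosity = curiosity

    def choose(self, features):
        features = np.asarray(features, dtype=float)
        # Weight each feature's vote by its share of the total activity.
        weights = features / max(features.sum(), 1e-12)
        votes = weights @ self.reward_est
        # Curiosity bonus shrinks as an action becomes familiar.
        bonus = self.curiosity / (1.0 + weights @ self.counts)
        return int(np.argmax(votes + bonus))

    def learn(self, features, action, reward):
        features = np.asarray(features, dtype=float)
        # Each active feature moves its own estimate toward the new reward.
        error = reward - self.reward_est[:, action]
        self.reward_est[:, action] += self.step_size * features * error
        self.counts[:, action] += features


# Toy task: features are [room occupied, room empty]; actions are
# 0 = light off, 1 = light on. Reward is +1 for the sensible choice.
agent = FeatureVotingAgent(n_features=2, n_actions=2)
for step in range(200):
    occupied = step % 2 == 0
    features = np.array([1.0, 0.0]) if occupied else np.array([0.0, 1.0])
    action = agent.choose(features)
    reward = 1.0 if action == int(occupied) else -1.0
    agent.learn(features, action, reward)  # update after every observation

print(agent.choose([1.0, 0.0]), agent.choose([0.0, 1.0]))
```

Since every feature's model is a separate row of the same matrices, the rows could in principle be updated on different machines and combined only when a vote is needed, which is one way to read the abstract's claim about parallel and distributed execution.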
Brandon Rohrer is a data scientist in Microsoft’s Azure Machine Learning group. He creates end-to-end data science solutions for external customers and supports the development of core algorithms and functionality in Azure ML. Brandon built his data science skills across a variety of application areas, including robotics, agriculture, artificial intelligence, cognitive modeling, machine vision, and signal processing. Brandon holds a PhD from MIT.