For mobile games, constant tweaks are the difference between success and failure. Product managers need instant access to the latest metrics, e.g. to see how an acquisition campaign is doing or how a change affects spending per user. Data and analytics must be available in real time. However, calculating properties such as the uniqueness or newness of a data point requires a list of all data points seen so far, which is both memory-intensive and tricky to maintain in a real-time stream processing system like Spark Streaming. Probabilistic data structures approximate these properties with a fixed memory footprint, and are very well suited to this kind of stream processing. Getting from the theory of approximation to a useful metric at a low error rate, even for many millions of users, is another story. In our talk we will look at practical ways of achieving this.
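To make the idea of a fixed-memory approximation concrete, here is a minimal sketch of one such probabilistic data structure, a Bloom filter, used to estimate "newness" (has this user been seen before?). It is an illustration only and not necessarily the structure or code discussed in the talk; the names `BloomFilter`, `add_if_new` and `count_new_users`, the user IDs, and the sizing parameters are all assumptions made for this example, and plain Python lists stand in for streaming micro-batches.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size membership sketch: no false negatives, tunable false positives."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(
            1,
            math.ceil(-expected_items * math.log(false_positive_rate) / math.log(2) ** 2),
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # avoid h2 == 0
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add_if_new(self, item: str) -> bool:
        """Set the item's bits; return True if it was (probably) not seen before."""
        is_new = False
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                is_new = True
                self.bits[byte] |= mask
        return is_new


def count_new_users(events, seen: BloomFilter) -> int:
    """Count user IDs in one micro-batch that the filter has not seen yet."""
    return sum(1 for user_id in events if seen.add_if_new(user_id))


# Hypothetical micro-batches of user IDs; memory stays fixed however many arrive.
seen = BloomFilter(expected_items=1_000_000, false_positive_rate=0.01)
print(count_new_users(["u1", "u2", "u3", "u1"], seen))
print(count_new_users(["u2", "u4"], seen))
```

The filter never reports a previously seen user as new, but it can occasionally report a genuinely new user as already seen, so the new-user count is a slight underestimate whose error grows as the filter fills up. Choosing `expected_items` and `false_positive_rate` is the trade-off between memory and accuracy.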
Kevin built up the data science and engineering team at Mind Candy, and with the team created a scalable architecture for mobile game analytics. Before Mind Candy, Kevin headed the data science and back-end services team at Last.fm, working with ten years of music listening data from millions of users. He also spent four years working on private clouds and service architecture at Goldman Sachs.
Luis is a senior data engineer at Mind Candy, was the first to introduce Spark Streaming at the company, and is responsible for the real-time mobile analytics platform. He has more than 10 years of experience in software engineering and architecture.