Open Source Big Data for Defense

Peter Wang (Continuum Analytics), Chris White (DARPA)
Data in Action
Ballroom CD

Although the Department of Defense is no stranger to the problem of data analytics at scale, it is encountering many of the same fundamental challenges that businesses face, given the modern explosion of data acquisition, storage, and processing technologies.

XDATA is a $25 million/year DARPA program that seeks to develop computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g., tabular, relational, categorical) and unstructured (e.g., text documents, message traffic). The central challenges we are addressing include developing scalable algorithms for processing imperfect data in distributed data stores, and creating effective human-computer interaction tools for facilitating rapidly customizable visual reasoning for diverse missions.

Novel aspects of the XDATA program include its embrace of open-source technologies, which is relatively rare in the Defense sector, and a focus on minimizing the design-to-testing time of new software by using near-continual feedback from users.

In this talk, the Program Manager for XDATA will describe the origins of the program, including personal experiences in Afghanistan that led to a recognition within the Department of Defense that a broader, coherent strategy was needed for tackling the computational challenges of large, heterogeneous datasets. The discussion will then introduce some of the libraries and teams comprising the XDATA effort and highlight the innovations being developed, ranging from novel machine learning and graph algorithms, to fundamental improvements in data processing and distributed computation, to powerful libraries and techniques for visualizing large data.

The multi-year program is still in its first year, and participating teams have just completed an intensive summer workshop where they focused on solving data science “challenge problems” on several datasets of interest to both businesses and defense. The talk will also showcase some interesting results from these explorations.

We hope to engage with the Strata audience to identify opportunities and best practices that will enable cultural shifts in government, business, and non-profit organizations, and then to facilitate the transition to a new, inherently open and collaborative community around Department of Defense problems.

Peter Wang

President, Continuum Analytics

Co-founder and President of Continuum Analytics. Interested in data analysis, scientific computing, and data visualization with Python. Author of the Bokeh web visualization library, and the Chaco interactive visualization toolkit. Extensive experience developing flexible, performant analysis apps and environments across multiple engineering and scientific domains, including finance and high-frequency trading.

Chris White

Program Manager, DARPA

Dr. Chris White joined DARPA as a program manager in August 2011. His focus is on developing the enabling technology required for efficiently processing, analyzing and visualizing large volumes of data in a military, mission-oriented context.

Dr. White previously served DARPA as its country lead for Afghanistan and in-theater member of the Senior Executive Service supporting the commander of the NATO International Security Assistance Force, the Combined Joint Staff branch for Intelligence, the Afghan Threat Finance Cell and the regional military commands.

Prior to joining DARPA as government staff, Dr. White was a researcher in DARPA’s Information Innovation Office where he created techniques to better understand, measure and model social media and large networks of information.

Dr. White was a Research Fellow at Harvard University’s School of Engineering and Applied Sciences and the Johns Hopkins University’s Human Language Technology Center of Excellence, researching large-scale data analytics for graphs and networks, natural language processing, machine learning and statistical methods for heterogeneous sources in real-world applications.

Dr. White holds Ph.D. and M.S. degrees in Electrical Engineering from the Johns Hopkins University and a B.S. in Electrical Engineering from Oklahoma State University.