Sep 23–26, 2019

Machine learning and large-scale data analysis on a centralized platform

James Tang (Walmart Labs), Yiyi Zeng (Walmart Labs), Linhong Kang (Walmart Labs)
1:15pm–1:55pm Wednesday, September 25, 2019
Location: 1A 08/10
Average rating: 2.00 (1 rating)

Who is this presentation for?

  • Business leaders, engineers, and data scientists




As the world’s leading retailer, Walmart served nearly 270 million customers per week in 2018. Tremendous amounts of data are collected and flow into the Walmart data ecosystem. The data comes from stores, clubs, online, digital purchases, customer services (e.g., returns), and financial services (e.g., money transfers). Like many other big companies, Walmart has an enterprise data hub. But how to leverage these data resources and connect customers’ behaviors across different platforms is an interesting and challenging question.

James Tang, Yiyi Zeng, and Linhong Kang share their experience and success story in mining information from different data sources and connecting customers’ activities to provide a secure and seamless shopping experience. They explore the design of a centralized risk and abuse management platform and how this highly sophisticated platform enables dynamic and complex analytics of large-scale data from different domains. They also share a study of protecting customer accounts by linking customer behaviors across purchases, returns, and financial services.

You’ll get an introduction to the Walmart risk and abuse management platform, risk and abuse problems in the Walmart ecosystem, the data-driven analytics and advanced machine learning algorithms used to defend against fraud and abuse, and case studies of customer account protection.

Prerequisite knowledge

  • Familiarity with large-scale datasets and systems, data mining, and machine learning technologies

What you'll learn

  • Gain an introduction to the centralized risk management platform, data insight collection through mining multidimensional data sources, and advanced machine learning technology for risk and abuse detection

James Tang

Walmart Labs

James Tang is a senior director of engineering at Walmart Labs. He’s spent time creating large-scale, resilient, and distributed architectures with high security and high performance for enterprise applications, web applications, online payments, online games, and real-time predictive analytics applications. Enthusiastic about technology, he also enjoys mentoring, training, and leading teams to be successful with distributed systems concepts, microservices, DevOps, and cloud native application design.


Yiyi Zeng

Walmart Labs

Yiyi Zeng is a senior manager and principal data scientist at Walmart Labs, where she and her team use supervised and unsupervised machine learning techniques to detect fraud including stolen financials, account takeover, identity fraud, promotion and return abuse, and victim scams. She has 12 years of extensive experience in business analytics and intelligence, decision management, fraud detection, credit risk, online payment, and ecommerce across various business domains including both Fortune 500 firms and startups. She’s enthusiastic about mining large-scale data and applying machine learning knowledge to improve business outcomes.


Linhong Kang

Walmart Labs

Linhong Kang is a manager and staff data scientist at Walmart Labs, where she leads multiple fraud and abuse detection solutions for Walmart’s various products. She has more than 10 years of experience in data science, business analytics, and risk and fraud management across different industries including business consulting, banks, financial payment, and ecommerce. She’s passionate about translating business problems into qualitative questions, delivering cost savings, and helping companies become more profitable.
