Online vehicle marketplaces are embracing artificial intelligence to make it easier to sell a vehicle on their platforms. The tedious work of copying information from the vehicle registration document into a web form can be automated with smart text-spotting systems: the seller takes a picture of the document, and the necessary information is extracted automatically.
Florian Wilhelm details the components of a text-spotting system, including the subtasks of object detection and optical character recognition (OCR). Florian elaborates on the challenges of OCR in documents with various distortions and artifacts, which rule out off-the-shelf products for this task. After offering an overview of semisupervised learning based on generative adversarial networks (GANs), Florian evaluates the performance gains of this method compared to supervised learning. More specifically, for varying amounts of labeled data, he compares the accuracy of a convolutional neural network (CNN) to that of a GAN that uses additional unlabeled data during the training phase, showing that GANs significantly outperform classical CNNs when labeled data is scarce.
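To make the semisupervised idea more concrete, here is a minimal NumPy sketch of one common way to train a GAN discriminator as a classifier. The abstract does not say which formulation the talk uses, so this is an assumption: it follows the widely used approach in which the discriminator outputs K class logits (for example, one per character) and the "fake" class has an implicit logit of zero. All function names are hypothetical and chosen for illustration only.

```python
import numpy as np


def logsumexp(logits):
    """Numerically stable log(sum(exp(logits))) over the last axis."""
    m = logits.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))).squeeze(-1)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return np.logaddexp(0.0, x)


def supervised_loss(logits, labels):
    """Cross-entropy over the K real classes, computed on labeled images only."""
    true_logit = logits[np.arange(len(labels)), labels]
    return float(np.mean(logsumexp(logits) - true_logit))


def unsupervised_loss(logits_unlabeled, logits_fake):
    """Real-vs-fake loss; needs no labels, so unlabeled images can be used.

    Assumption: the fake class has logit 0, so with lse = logsumexp(logits)
    log D(x) = lse - softplus(lse) and log(1 - D(x)) = -softplus(lse).
    """
    lse_real = logsumexp(logits_unlabeled)
    lse_fake = logsumexp(logits_fake)
    loss_real = -np.mean(lse_real - softplus(lse_real))  # real images should look real
    loss_fake = np.mean(softplus(lse_fake))              # generated images should look fake
    return float(loss_real + loss_fake)


def discriminator_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    """Total discriminator loss: supervised term plus unsupervised term."""
    return supervised_loss(logits_labeled, labels) + unsupervised_loss(
        logits_unlabeled, logits_fake
    )
```

In this sketch, a purely supervised CNN would minimize only `supervised_loss`. The semisupervised variant adds `unsupervised_loss`, which is how unlabeled photos of documents can still shape the features the classifier learns.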
Florian Wilhelm is a data scientist at inovex in Cologne, Germany, where he focuses on recommender systems, mathematical modelling, and bringing data science to production. Previously, he worked at Blue Yonder, the leading platform provider for predictive applications and big data in the European market, and held a postdoctoral position at the Karlsruhe Institute of Technology. Florian’s background is in mathematics. He has more than five years of project experience in the field of predictive and prescriptive analytics and big data, as well as the domains of mathematical modelling, statistics, machine learning, high-performance computing and data mining. For the past few years, he has programmed mostly with the Python data science stack (NumPy, SciPy, scikit-learn, pandas, Matplotlib, Jupyter, etc.), to which he’s also contributed several extensions.
©2018, O’Reilly UK Ltd • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com