Why automating data extraction from charts is difficult
September 2023
Xcharta is the first product on the market to automate the extraction of data from charts end to end.
That this had not been done before reflects the fundamental challenges of chart data extraction, including:
- Complex chart types. One of the biggest challenges in extracting data from charts is dealing with complex chart types. Pie charts, stacked bar charts, bubble charts and line charts with multiple series (in fact, all chart types) each have their own shapes and layouts and pose their own challenges, making them difficult to interpret accurately.
- Variety in presentation. Even within a single chart type, there is huge variation in how data is presented. For example, we once came across a beer company's annual report which showed all its bar charts as beer bottles! This is a challenge for data extraction, as you never quite know what you are going to be presented with. Compare this with data presented in a table, which has a much more standardised layout.
- Quality of input data. Another challenge with chart data extraction is the wide variety of sources that charts come from. From scanned documents to digital charts, each source can present its own set of issues. Some charts may be fuzzy, distorted or difficult to read because of poor image quality, inconsistent formatting or small size. This can cause problems for pattern recognition and data extraction.
At xcharta, we've invested heavily in our bespoke computer vision algorithms to overcome these challenges and bring a market-leading, first-of-its-kind data extraction product to market.
This enables our customers to achieve huge efficiency improvements compared with alternative methods, manual extraction in particular.
Please get in touch if you want to hear any more about how xcharta works and what it could do for you.