Data Leapfrog is our most radical project at the End-to-End Value Chain Pillar. Its premise is that the industry’s reliance on legacy systems and processes inhibits progress on product data accuracy and completeness. The project consists of five pilots, each of which could tangibly demonstrate a potential solution – or part of one – to data accuracy and completeness. The pilots:

  • Leverage new technology/are not encumbered by legacy systems
  • Will deliver much faster than current approaches
  • Are not exclusive to any company

The five pilots are the following, each to be led by a handful of retailers and manufacturers:

Pilot Summaries

Download the summary of pilots, workshops and Steering Committee discussions as prepared for the Leapfrog Pilots Steering Committee.

  • Centralised Data Sharing – central registry of products

The objective of this pilot is to improve the accuracy and availability of product data, at least for a limited set of attributes. The brand owner would benefit from less labour to maintain the data, while retailers and consumers would benefit from easier access to a more complete and trusted source. We will run a measurement study in parallel to quantify the benefits of the pilot and the associated costs for all segments of the industry. This pilot is supported by Google, GS1 Cloud/US Data Hub, IWS, SAP and SmartLabel.
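The central-registry idea above can be sketched in a few lines: the brand owner is the single writer for each product record, and every retailer or consumer reads from the same trusted source. This is a minimal illustration only; the GTIN, attribute names and access rules are invented for the example, not taken from the pilot.

```python
# Minimal sketch of a central product registry: one trusted record per GTIN,
# writable only by the brand owner, readable by all trading partners.
# GTIN, owner and attribute names below are illustrative.

class CentralRegistry:
    def __init__(self):
        self._records = {}  # gtin -> (owner, attributes)

    def publish(self, owner, gtin, attributes):
        """Brand owner creates or updates its own record."""
        existing = self._records.get(gtin)
        if existing and existing[0] != owner:
            raise PermissionError("only the brand owner may update this GTIN")
        self._records[gtin] = (owner, dict(attributes))

    def lookup(self, gtin):
        """Retailers and consumers all read the same trusted record."""
        owner, attributes = self._records[gtin]
        return {"gtin": gtin, "owner": owner, **attributes}

registry = CentralRegistry()
registry.publish("AcmeFoods", "00012345678905",
                 {"name": "Acme Granola 500g", "net_weight_g": 500})
print(registry.lookup("00012345678905")["name"])  # Acme Granola 500g
```

Because maintenance happens in one place, the brand owner updates a record once instead of pushing the same change to every retailer separately.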


  • Standard Data Attributes – a common language and structure

Today, suppliers and retailers do not have a common architecture or hierarchy of data attributes. This makes data exchange costly because trading partners need to translate from one system to another. In this pilot, Ahold Delhaize, Metro AG, Migros Turkey, Walmart, P&G and Nestlé will define a minimum set of attributes to identify, buy, move and sell a product. Our objective is to prove that 80% of product attributes are common and that providing them in a standardised way reduces time and complexity. We will leverage all the previous work on data attributes by the CGF as well as GS1 and GMA/FMI/SmartLabel.
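To make the cost argument concrete: with a common attribute standard, each company maintains one mapping from its internal field names to the standard, rather than one bespoke translation per trading partner. The sketch below uses invented attribute names grouped by the pilot’s four purposes (identify, buy, move, sell); the actual minimum set is what the pilot will define.

```python
# Illustrative minimum attribute set, grouped by purpose. These names are
# examples only, not the set the pilot or GS1 defines.
STANDARD_ATTRIBUTES = {
    "identify": ["gtin", "brand", "product_name"],
    "buy":      ["case_cost", "currency"],
    "move":     ["case_weight_kg", "cases_per_pallet"],
    "sell":     ["retail_description", "net_content"],
}

# One mapping per company into the standard, instead of one per partner.
ACME_TO_STANDARD = {
    "EAN": "gtin",
    "BrandName": "brand",
    "ItemDesc": "product_name",
}

def to_standard(record, mapping):
    """Translate an internal record into standard attribute names."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

internal = {"EAN": "4006381333931", "BrandName": "Acme", "ItemDesc": "Pen"}
print(to_standard(internal, ACME_TO_STANDARD))
```

With n trading partners, the standard reduces the translation burden from on the order of n² pairwise mappings to n mappings into the common form.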


  • Data Quality Scorecard – automated continuous improvement

Today, there is no reliable, fact-based assessment of data quality using common definitions and end-to-end visibility. In this pilot, Ahold Delhaize, Metro AG, Colgate-Palmolive, Johnson & Johnson, Mondelēz, Nestlé and JM Smucker will leverage existing tools and practices to develop and demonstrate a data quality scorecard. The scorecard will include common definitions of data quality and will incorporate assessments by both a company’s trading partners and its consumers (using a consumer panel). This pilot is supported by Google, GS1, IBM and IWS.
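A scorecard of this kind can be sketched with two commonly used quality dimensions: completeness (is the attribute populated?) and accuracy (does it match a reference value, here standing in for a trading partner’s or consumer panel’s assessment). The dimensions, weights and example data are assumptions for illustration, not the pilot’s definitions.

```python
# Sketch of a per-attribute data quality scorecard. "reference" stands in
# for verified values from trading partners or a consumer panel.

def scorecard(records, reference, required_attributes):
    scores = {}
    for attr in required_attributes:
        present = [r for r in records if r.get(attr) is not None]
        completeness = len(present) / len(records)
        accurate = [r for r in present
                    if r[attr] == reference[r["gtin"]].get(attr)]
        accuracy = len(accurate) / len(present) if present else 0.0
        scores[attr] = {"completeness": round(completeness, 2),
                        "accuracy": round(accuracy, 2)}
    return scores

records = [
    {"gtin": "001", "net_weight": "500 g", "brand": "Acme"},
    {"gtin": "002", "net_weight": None,    "brand": "Acme"},
]
reference = {
    "001": {"net_weight": "500 g", "brand": "Acme"},
    "002": {"net_weight": "250 g", "brand": "Acme"},
}
print(scorecard(records, reference, ["net_weight", "brand"]))
```

Running the same computation continuously over live feeds is what turns a one-off audit into the automated continuous improvement the pilot aims for.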



  • Automated Data Creation – simplified data provision

The number of potentially required product attributes is well into the hundreds and increasing. In this pilot, Colgate-Palmolive, Unilever and Walmart will test the use of AI/machine learning to extract product attributes from available text and images and then populate ready-to-use templates.

We will test innovative solutions that help to automate both the extraction of data and the population of databases. The objectives for the pilot are to: increase data accuracy; reduce costs; and increase speed to market. We will measure success based on the precision and recall of attribute extraction, as experienced by the testing companies.
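The success measure named above, precision and recall of attribute extraction, can be computed by comparing the extracted attribute/value pairs against a human-verified set. The attribute pairs below are invented for illustration.

```python
# Precision: of the pairs the model extracted, how many are correct?
# Recall: of the verified pairs, how many did the model find?

def precision_recall(extracted, verified):
    extracted, verified = set(extracted), set(verified)
    true_positives = extracted & verified
    precision = len(true_positives) / len(extracted) if extracted else 0.0
    recall = len(true_positives) / len(verified) if verified else 0.0
    return precision, recall

extracted = {("net_weight", "500 g"), ("flavour", "vanilla"), ("brand", "Acme")}
verified  = {("net_weight", "500 g"), ("brand", "Acme"), ("allergen", "nuts")}
p, r = precision_recall(extracted, verified)
print(p, r)  # both 2/3 here: one false positive, one missed attribute
```

High precision keeps wrong data out of the templates; high recall keeps manual back-filling to a minimum, which is where the cost and speed-to-market gains come from.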

The technology for this pilot is from Walmart, CrowdAnalytix and Google. Walmart is spearheading the attribute extraction from texts and Google is doing the same for images. These models are being built on a CrowdAnalytix platform with investments from all three parties.


  • Federated Data Sharing – connect on demand

Today, data sharing through centralised databases is slow and costly, as well as unable to adapt to the rapid changes in our business environment. In this pilot, METRO AG, SPAR International, Henkel, L’Oréal and Nestlé will test whether product data can be shared “on demand” directly between trading partners, without the need for a centrally managed repository.

We will allow direct connections between trading partners’ own internal Product Information Management (PIM) systems, using application programming interfaces (APIs) to connect the systems and Artificial Intelligence-enabled ‘translation interfaces’ to map from one company’s product language to the other’s. Our objective is to deliver trusted and verified data on demand, in the form the user needs. We will test responses to a variety of product information demands from retailers and consumers. This pilot is supported by Global Resonance, Google, SAP and Salsify. It will work closely with the “Centralised Data Sharing” Pilot.
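The federated pattern above can be sketched as a single request answered straight from a supplier’s PIM, with a translation layer mapping the supplier’s field names into the retailer’s. Everything here is illustrative: the GTIN, the German field names, and the static dictionary standing in for the AI-enabled translation interface the pilot describes.

```python
# Sketch of on-demand federated sharing: no central repository, just a
# direct query to the supplier's PIM plus a field-name translation step.

# Stand-in for a supplier's internal PIM (illustrative record).
SUPPLIER_PIM = {
    "7612100055557": {"Produktname": "Choco Bar", "Nettogewicht": "100 g"},
}

# Static stand-in for the AI-enabled 'translation interface'.
TRANSLATION = {"Produktname": "product_name", "Nettogewicht": "net_weight"}

def fetch_on_demand(gtin, fields):
    """Answer one retailer request directly from the supplier's PIM."""
    record = SUPPLIER_PIM[gtin]  # in practice, an API call to the supplier
    translated = {TRANSLATION[k]: v for k, v in record.items()}
    return {f: translated[f] for f in fields if f in translated}

print(fetch_on_demand("7612100055557", ["product_name"]))
# {'product_name': 'Choco Bar'}
```

Because the data never leaves the supplier’s system until it is requested, it is always as current as the supplier’s own PIM, which is the adaptability advantage claimed over centralised databases.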