Generative AI to Grow 17% by 2024, but Data Quality Plummets: Key Findings from Appen’s State of AI Report

A new report from AI data provider Appen shows that companies are struggling to source and manage the high-quality data needed to power AI systems as artificial intelligence expands further into business operations.

Appen’s 2024 State of AI Report, which surveyed more than 500 US IT decision makers, found that generative AI adoption increased 17% in the past year. However, organizations now face significant hurdles in data preparation and quality assurance. The report shows a 10% year-over-year increase in bottlenecks related to data acquisition, cleaning and labeling, underscoring the complexity of building and maintaining effective AI models.

“As AI models tackle more complex and specialized problems, data requirements are also changing,” Si Chen, head of strategy at Appen, said in an interview with VentureBeat. “Companies are discovering that just having a lot of data is no longer enough. To refine a model, the data must be of extremely high quality, meaning it is accurate, diverse, properly labeled and tailored to the specific AI use case.”

As the potential of AI continues to grow, the report identifies several key areas where companies face obstacles. Below are the five key takeaways from Appen’s 2024 State of AI report:

1. Adoption of generative AI is exploding, but so are data challenges

Generative AI (GenAI) adoption has grown 17% in 2024, thanks to advances in large language models (LLMs) that allow companies to automate tasks across a wide range of use cases. From IT operations to R&D, companies are using GenAI to streamline internal processes and increase productivity. However, the rapid increase in GenAI usage has also brought new hurdles, especially in the area of data management.

“Generative AI outputs are more diverse, unpredictable and subjective, making it harder to define and measure success,” Chen told VentureBeat. “To achieve enterprise-ready AI, models must be customized with high-quality data tailored to specific use cases.”

Custom data collection has emerged as the primary method for obtaining training data for GenAI models, reflecting a broader shift away from generic, web-collected data toward tailored, reliable datasets.

The use of generative AI in business processes continues to grow, with notable increases in IT operations, manufacturing, and research and development. However, adoption in areas such as marketing and communications has declined slightly. (Source: Appen State of AI Report 2024)

2. AI implementations and ROI in enterprises are declining

Despite the excitement around AI, the report finds a worrying trend: fewer AI projects are reaching deployment, and those that do are showing lower ROI. Since 2021, the average percentage of AI projects reaching deployment has fallen by 8.1%, while the average percentage of deployed AI projects delivering meaningful ROI has decreased by 9.4%.

This decline is largely due to the increasing complexity of AI models. Simple use cases such as image recognition and voice automation are now considered mature technologies, but companies are shifting to more ambitious AI initiatives, such as generative AI, which require high-quality, customized data and are much more difficult to implement successfully.

Chen explains, “Generative AI has more advanced capabilities in understanding, reasoning, and content generation, but these technologies are inherently more challenging to implement.”

The percentage of AI projects making it to implementation has steadily declined since 2021, with a sharp decline to 47.4% in 2024. Similarly, the average percentage of implemented projects with meaningful ROI has fallen to 47.3%, which reflects the growing challenges companies face in achieving successful AI implementations. (Source: Appen State of AI Report 2024)

3. Data quality is essential, but is deteriorating

The report highlights a critical issue for AI development: data accuracy has fallen by nearly 9% since 2021. As AI models become more sophisticated, the data they require has also become more complex, often requiring specialized, high-quality annotations.

As many as 86% of companies now retrain or update their models at least once a quarter, underscoring the need for new, relevant data. But as the frequency of updates increases, it becomes more difficult to ensure this data is accurate and diverse. Companies are turning to third-party data providers to meet these demands, with nearly 90% of companies relying on external sources to train and evaluate their models.

“While we can’t predict the future, our research shows that managing data quality will remain a major challenge for companies,” said Chen. “With more complex generative AI models, data collection, cleaning and labeling have already become major bottlenecks.”

Data management emerged as the biggest challenge facing AI projects in 2024, with 48% of respondents citing it as a significant bottleneck. Other obstacles include a lack of technical resources, tools and data, highlighting the increasing complexity of AI implementation. (Source: Appen State of AI Report 2024)

4. Data bottlenecks are increasing

Appen’s report shows that bottlenecks in data acquisition, cleaning and labeling are increasing by 10% year-over-year. These bottlenecks have a direct impact on companies’ ability to successfully deploy AI projects. As AI use cases become more specialized, the challenge of preparing the right data becomes more acute.

“Data preparation problems have increased,” says Chen. “The specialized nature of these models requires new, tailor-made datasets.”

To address these issues, companies are turning to long-term strategies that emphasize data accuracy, consistency and diversity. Many are also looking for strategic partnerships with data providers to help navigate the complexities of the AI data lifecycle.

Data accuracy in the US has steadily declined, from 63.5% in 2021 to just 54.6% in 2024. The decline highlights the growing challenge of maintaining high-quality data as AI models become more complex. (Source: Appen State of AI Report 2024)

5. Human-in-the-Loop is more important than ever

As AI technology continues to evolve, human involvement remains indispensable. The report shows that 80% of respondents emphasize the importance of human-in-the-loop machine learning, a process that uses human expertise to guide and improve AI models.

“Human involvement remains essential to the development of high-performing, ethical and contextually relevant AI systems,” said Chen.

Human experts are particularly important for limiting bias and ensuring ethical AI development. By providing domain-specific knowledge and identifying potential biases in AI outputs, they help refine and align models with real-world behavior and values. This is especially critical for generative AI, where outputs can be unpredictable and require careful monitoring to avoid harmful or biased results.
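
To make the human-in-the-loop idea concrete, the sketch below shows one common pattern: route low-confidence model outputs to a human reviewer and keep the corrections so they can feed the next retraining run. This is a minimal, hypothetical Python example; the report does not prescribe this workflow, and the function names, labels and confidence threshold are illustrative assumptions.

# Minimal human-in-the-loop sketch (illustrative; not from Appen's report).
# Low-confidence model outputs are routed to a human reviewer, and the
# corrected labels are retained so they can be folded into future retraining.

from dataclasses import dataclass
from typing import List, Optional
import random

@dataclass
class Prediction:
    text: str
    label: str
    confidence: float
    human_label: Optional[str] = None

def model_predict(texts: List[str]) -> List[Prediction]:
    # Stand-in for a real model call: returns a label and a confidence score.
    return [Prediction(t, random.choice(["positive", "negative"]), random.random())
            for t in texts]

def human_review(item: Prediction) -> str:
    # Stand-in for an annotation interface where a reviewer supplies the label.
    return "positive"

def hitl_pass(texts: List[str], threshold: float = 0.8) -> List[Prediction]:
    reviewed = []
    for item in model_predict(texts):
        if item.confidence < threshold:
            # Uncertain cases go to a person; confident ones pass through.
            item.human_label = human_review(item)
        reviewed.append(item)
    return reviewed

if __name__ == "__main__":
    batch = hitl_pass(["Great product", "Terrible support", "It works"])
    corrections = [p for p in batch if p.human_label and p.human_label != p.label]
    print(f"{len(corrections)} of {len(batch)} items corrected by a reviewer")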

Check out Appen’s full 2024 State of AI Report here.
