Poseidon has secured $15 million in seed funding led by a16z Crypto to build a decentralized data layer designed for artificial intelligence training.
The San Francisco-based company, which describes itself as a full-stack AI data layer, said it aims to tackle the scarcity of high-quality, IP-cleared training data in AI development, according to a Tuesday announcement shared with Cointelegraph.
“LLMs and compute are no longer the bottlenecks; it’s high-quality data that’s missing,” said Sandeep Chinchali, Poseidon’s chief scientist and also chief AI officer at its incubator, Story Protocol.
“Poseidon delivers the IP-cleared, structured real-world data sets that AI teams need to build systems that actually perform in physical, complex environments,” he added.
Decentralized pipeline for legal AI training data
Poseidon’s solution relies on decentralized infrastructure to collect and distribute data sets legally cleared for commercial use. The platform integrates Story’s onchain licensing infrastructure to ensure traceability and monetization, allowing data contributors to be paid for their work while protecting developers from IP risks.
The team argues that centralized data sourcing models cannot meet the growing demand for niche, high-context data sets needed by next-gen AI models, especially in fields like robotics and spatial computing.
Chris Dixon, founder of a16z Crypto, described the project as a step toward “a new economic foundation for the internet.” He added that the model rewards creators and suppliers for “providing the diverse inputs that next-gen intelligent systems need.”
Poseidon is working with several AI labs and plans to use the funding to scale its infrastructure. This includes launching contributor modules, software development kits and licensing tools for developers and data suppliers. Early access is expected to begin this summer.
Poseidon to solve AI’s data drought
The early wave of AI foundation models thrived on abundant online data, but that era is over, a16z analysts Chris Dixon and Carra Wu said in a note shared with Cointelegraph.
They noted that easily accessible data sets, including books, websites and public records, have largely been mined, leaving AI models starved for fresh, high-quality and legally usable information.
“The challenge isn’t just technical — it’s a problem of coordination. Thousands of contributors must work together in a distributed way to source, label and maintain the physical data that next-gen AI needs,” the duo wrote.
They added that no centralized approach can efficiently orchestrate the data creation and curation that’s needed at the required level of scale and diversity. “A decentralized approach can solve this,” they said.
