Ownership of Data in the Era of Artificial Intelligence

You’re surrounded by an abundance of data. Every day, you’re generating new information. For instance, if your fitness tracking app logs your steps, that’s fresh data. The Oura ring that monitors your biometrics? That’s also valuable data. And your social media posts, including those silly jokes that went unnoticed? Yet another source of data.

AI companies are eager to tap into this wealth of information. High-quality data is essential for developing effective AI, which is why many refer to data as the "new oil" in the AI landscape. However, despite the theoretical value of your data, monetizing it as an individual is challenging. You typically lack the bargaining power to negotiate its sale—no large AI firm is reaching out to purchase your past social media activity.

This brings us to Vana. Anna Kazlauskas, co-founder of Vana and CEO of Open Data Labs, believes that data is a crucial resource driving the future of AI and the digital economy. She emphasizes that many people are unaware that they actually own their data.

Indeed, you do own your data, and it holds significant value, especially when combined with the data of millions of others. This collective ownership can grant you greater bargaining power. Vana’s mission is to establish a platform for user-owned data, thereby facilitating user-focused AI development.

This platform will consist of Data Decentralized Autonomous Organizations (Data DAOs), decentralized marketplaces for data, the newly introduced VRC-20 token, and a partnership with Flower Labs to create the world’s first user-owned foundational model. This collaboration signifies that decentralized AI is becoming mainstream.

Kazlauskas is set to deliver a keynote at the AI Summit at Consensus 2025, where she will share her vision. She perceives a shift in mindset, with more individuals recognizing the importance of their data in the AI landscape and acknowledging their ownership of it. She predicts that within a few years, over 100 million users will join this movement, and that the entire global population will be participating within a decade.

Why is user-owned data so crucial for you?

Kazlauskas: Many people mistakenly think that data is held by the platforms where it resides, but that’s not true. It’s akin to parking your car in a lot: the lot doesn’t own your vehicle; you retain that ownership.

Large tech firms currently generate substantial profit from this data, even though users are its rightful owners. Reclaiming that ownership matters for users and for developers alike.

How does this help developers?

For developers, especially in the realm of AI, having access to suitable data is critical. However, that’s quite difficult at present, as most data is confined within the ecosystems of major technology companies. Many talented individuals in the AI space are compelled to join these large labs, as that’s where the data and computational resources are. This situation doesn’t have to remain the norm.

What role do Data DAOs play in this vision?

Data DAOs act like a collective for data, allowing a significant number of individuals to combine their data and collaboratively decide how it’s utilized.

The importance lies in the fact that individual data is often not very useful on its own. It gains value when aggregated into a larger pool capable of effectively training AI models.

Which Data DAOs excite you the most?

Several interesting initiatives are emerging in the health sector. One focuses on exporting patients’ complete medical records, which could greatly enhance research in that area. There are also projects related to health metrics, sleep patterns, and even driving data, including an initiative concerning Tesla data. That data is particularly valuable: most people view Tesla as a leader because of its data capabilities, yet users can access much of this data themselves.

What is your collaboration with Flower Labs for COLLECTIVE-1 aimed at?

COLLECTIVE-1 is intended as the first model owned by users. Usually, a foundational model is associated with a single company, such as OpenAI, that handles extensive training in one data center. This centralization is typically necessary due to the vast computational resources and data involved.

Flower Labs specializes in decentralized training and has developed excellent open-source libraries. They excel on the training and algorithm side, while Vana focuses on the data side. This means training on user-generated data and granting users ownership of the resulting model. This sets a new precedent in the foundational model space.

Is the aspiration to create AI that not only competes with central entities but surpasses them? It’s about more than just philosophy; it’s about achieving superior performance.

Absolutely correct. Within a decentralized framework, it’s widely accepted that AI should be driven by user ownership. However, the real question is: What advantages can we gain in such a decentralized system? The answer lies in data. Individual companies possess only their slice of the data pie. When users bring together their data from across platforms, the result can be far richer datasets than any single company could assemble. Data is the essential ingredient for success in this context.
