Tue. Apr 15th, 2025

North East Connected

How tech giants are building AI empires on our data

By Jamie Dobson, founder of Container Solutions and author of ‘The Cloud Native Attitude’

Imagine writing a book and putting it out into the world for people to enjoy. But then someone takes the book, blends it with another similar story, sells the result, makes a movie of it, and starts selling merchandise, all without paying you or even mentioning your name. You would be rightly miffed.

Now scale that up to include everything we’ve ever created digitally, from scientific papers to social media posts, from news articles to personal blogs. That’s exactly what’s happening in the AI industry today, and it is almost certainly the largest appropriation of human creative work in history.

The Data Gold Rush
When large language models (LLMs) first emerged, tech companies trained them on publicly available data – content that was public domain and/or open source but still required countless hours of human effort, creativity, and expertise. These companies operated under the assumption that if content was accessible, it was fair game for AI training. No attribution needed, no compensation required.

But as these models grew more sophisticated, they needed more data. A lot more. That’s when things got murky. Companies began scraping copyrighted content, paywalled articles, and private repositories of content. Meta’s recent legal trouble over torrenting vast amounts of data (~82 TB) for AI training is likely just the tip of the iceberg.

Even more concerning is the UK government’s consideration of allowing tech firms to legally use copyrighted content for AI training, with a dubious “opt-out” mechanism for creators. It’s like being told someone can take your property unless you explicitly post a “No Trespassing” sign – in a language that hasn’t been invented yet.

The biggest problem is that it is incredibly difficult to prove that these companies are pirating content, as they delete their data sources after feeding them to their LLMs.

The Real Cost of AI
The rise of AI isn’t just another technological shift; it is a fundamental restructuring of our economic and social fabric. Much like the Industrial Revolution changed the fundamental relationship between labour and capital and what it meant to be human, the AI Revolution will impact us psychologically as well as economically and politically.

If the past teaches us anything about the future, it’s that the transformation of the labour market follows a predictable pattern:

First comes simplification: in this phase, workers use AI as a “helper.”

Then, workers either inadvertently or overtly train their AI replacements through their daily tasks. Right now, people are being hired to train AI to do their own jobs, which is like being asked to dig your own grave.

Finally, the job either disappears or becomes so deskilled that it commands a fraction of its former wages.

This process is why McKinsey estimates that up to 30% of US work hours could be automated by 2030, forcing around 12 million workers to change occupations. And that’s just in the US and in just five years. What happens after that is hard to predict.

Charting a Path Forward
The solutions to this challenge need to be as innovative as the technology causing it. Here are several approaches we should consider:

1. Data Rights and Compensation: Establish a framework where data creators receive compensation for their contributions to AI training. This could work much as it does for musicians, who receive royalties when their songs are played.

2. Algorithmic Transparency: Require AI companies to maintain and disclose training data sources, making it possible for creators to track and verify the use of their work.

3. Public AI Infrastructure: Develop public alternatives to private AI models, ensuring that the benefits of this technology aren’t concentrated in corporate hands.

4. Progressive AI Taxation: Implement a scaled taxation system for AI companies based on their data usage and market impact, funding public services and potentially a Universal Basic Income.

5. Digital Commons Framework: Create a new category of digital rights that balances innovation with fair compensation, perhaps through a system of micropayments or credit attribution.
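
To make proposals 1 and 2 a little more concrete, here is a minimal sketch of what a disclosed training-data record might look like and how a creator could check it. Everything in it is an assumption made for illustration: the `ManifestEntry` fields, the function names, and the equal-share royalty rule are hypothetical and do not describe any existing law, standard, or company practice.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    """One disclosed training item (hypothetical schema)."""
    sha256: str      # fingerprint of the exact text used for training
    source_url: str  # where the text was obtained
    creator: str     # who should be credited or paid


def fingerprint(text: str) -> str:
    """Return the SHA-256 hex digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_manifest(items):
    """Build a fingerprint -> entry map from (text, source_url, creator) tuples."""
    manifest = {}
    for text, source_url, creator in items:
        digest = fingerprint(text)
        manifest[digest] = ManifestEntry(digest, source_url, creator)
    return manifest


def find_use(manifest, my_text):
    """Return the manifest entry matching my_text exactly, or None."""
    return manifest.get(fingerprint(my_text))


def royalty_shares(manifest, pool):
    """Split a royalty pool equally per disclosed item, totalled per creator."""
    if not manifest:
        return {}
    per_item = pool / len(manifest)
    shares = {}
    for entry in manifest.values():
        shares[entry.creator] = shares.get(entry.creator, 0.0) + per_item
    return shares
```

A creator would call `find_use(manifest, text_of_my_article)` to see whether their work is listed, and `royalty_shares(manifest, pool)` shows one simple way a fixed pool could be divided. Exact-match hashing only catches verbatim copies, so a real scheme would need something more tolerant of edits and excerpts.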

The path forward isn’t about stopping AI development; it’s about ensuring its benefits are distributed as widely as its costs. We need to transform this greatest theft into the greatest redistribution of technological benefits in history.

Either this transition can be well-planned, helpful and equitable or, as history shows us, people will protest, resist and revolt, causing widespread political and civil unrest.

The question isn’t whether AI will transform our world; it’s whether that transformation will enrich us all or just a select few. The answer depends on what we do next.

ABOUT THE AUTHOR
Jamie Dobson is the founder of Container Solutions and has been helping companies across industries move to Cloud Native ways of working for over ten years. Container Solutions develops a strategy, a clear plan and a step-by-step implementation, helping companies achieve a smooth digital transformation. With services including Internal Developer Platform Enablement, Cloud Modernisation, DevOps/DevSecOps, Site Reliability Engineering (SRE) Consultancy, Cloud Optimisation and the creation of a full Cloud Native Strategy, companies get much more than just engineering know-how. Jamie is also the author of the new book, ‘The Cloud Native Attitude’. https://www.container-solutions.com/

By mac