Top Tech Firms Used YouTube Subtitles for AI Training: Report

Proof News reports that top tech companies like Anthropic, Nvidia, and Salesforce trained AI models using YouTube subtitles from creators like Dhruv Rathee, Marques Brownlee, and PewDiePie. The dataset included subtitles from 173,536 videos across 48,000 channels, sparking criticism from creators.

AI models were trained using YouTube video subtitles from Dhruv Rathee, Marques Brownlee, and PewDiePie, according to a tool supplied by the news outlet Proof News.

According to the site, top tech companies including Anthropic, Nvidia, Apple, and Salesforce trained their AI models using a dataset of YouTube video subtitles.

The publication said it found subtitles from 173,536 YouTube videos, taken from more than 48,000 channels, in the dataset, although it cautioned that its search tool may return false negatives.

In addition to videos from content creators like PewDiePie and Dhruv Rathee, some videos uploaded by tech reviewer Marques Brownlee were used to train AI. The dataset also included videos from news outlets and talk shows from around the world.

Most of the videos date from 2020 or earlier, which suggests the dataset has a cut-off around that time.

Brownlee criticized companies that used video transcripts as AI training material.

“Interesting fact: I pay a service (per minute) to have my videos more accurately transcribed before uploading them to YouTube’s back end. Thus, businesses that harvest transcripts are engaging in multifaceted theft of *paid* labour. Not great,” Brownlee wrote on Tuesday on X.

According to the site, Anthropic and Salesforce acknowledged using training datasets that contained the scraped video subtitles, but they did not admit any wrongdoing. Bloomberg, Apple, Nvidia, and Databricks neither confirmed nor denied the claims.

When asked earlier in the year whether ChatGPT's creator used YouTube videos for AI training, OpenAI chief technology officer Mira Murati struggled to give a direct answer.


This post was last modified on July 17, 2024 10:14 pm

Kumud Sahni Pruthi

A postgraduate in Science with an inclination towards education and technology. She always looks for ways to help people improve their lives by putting complex things into simple words through her writing.
