Big Tech's Trapped in a Glass House on AI Data Snatching
After years of exploiting user data, Big Tech firms are seeing the tables turn as they grab it from each other.
A few weeks ago, the chief technology officer of OpenAI was asked if her company had used YouTube videos to train its AI systems. First, she gave a blank stare. Then there was a grimace. Finally, Mira Murati gave an answer that avoided the messy and furtive world her company and other tech firms were operating in: “Actually, I’m not sure about that.”
According to a New York Times report, OpenAI in fact had trained its AI on “more than one million hours of YouTube videos,” using a speech recognition tool called Whisper. All the conversational text from the transcriptions was used to train GPT-4, the flagship large language model that underpins ChatGPT.
