CIOInsights - Insights From Technology Leaders

Sundar Pichai vows action if OpenAI's Sora was trained on YouTube videos without permission

By CIOInsights

In response to recent reports alleging that OpenAI's video-generation model Sora was trained on YouTube videos without permission, Google CEO Sundar Pichai has pledged to address the matter if the claims prove to have substance.

Pichai's remarks come after OpenAI CTO Mira Murati said she was unsure whether Sora's training data includes YouTube content, stating only that the model draws on publicly available and licensed sources.

According to a New York Times report, OpenAI transcribed more than a million hours of YouTube videos to train its AI models. When CNBC asked whether this violated Google's terms and conditions, Pichai deferred to OpenAI, saying it is up to the company to address such concerns and to adhere to Google's terms of service, which he described as clear.

OpenAI faces mounting scrutiny and legal challenges over how it sources training data. The New York Times has sued the AI startup, alleging that it infringed copyright by training its models on the Times' content without authorization.

The Authors Guild has also sued, asserting that OpenAI's language models rely heavily on copyrighted material without compensating or crediting its creators.