Google is getting AI training data from Reddit as part of a new partnership between the two companies. In an update on Thursday, Reddit announced it will start providing Google with “more efficient ways to train models.”
The collaboration will give Google access to Reddit’s data API, which delivers real-time content from Reddit’s platform. This will provide “Google with an efficient and structured way to access the vast corpus of existing content on Reddit,” while also allowing the company to display content from Reddit in new ways across its products.
When Reddit CEO Steve Huffman spoke to The Verge last year about Reddit’s API changes and the subsequent protests, he said, “The API usage is about covering costs and data licensing is a new potential business for us,” suggesting Reddit may seek out similar revenue-generating arrangements in the future.
The partnership will also give Reddit access to Vertex AI, Google’s AI-powered service that’s supposed to help companies improve their search results. Reddit says the change doesn’t affect its data API terms, which prevent developers or companies from accessing the API for commercial purposes without approval.
Just last week, Bloomberg reported that Reddit had struck a $60 million training deal with an unnamed AI company. Separately, Google Search is expanding its test of a “forums” filter that lets you browse through results from sites with human discussion, like Reddit, Stack Overflow, and Hacker News.
Despite this deal, Google and Reddit haven’t always seen eye to eye. Reddit previously threatened to block Google from crawling its site over concerns that companies would use its data for free to train AI models. Reddit is also poised to announce its initial public offering within the coming weeks, and it’s likely making this change as part of its effort to boost its valuation, which sat at more than $10 billion in 2021.


