Generative AI could revolutionize online search. But for news publishers, it could be a disaster.
News publishers have long been overly reliant on Google, the undisputed leader in online search.
About 40% of all web traffic to media sites comes from Google, according to a December Wall Street Journal report citing data from web analytics firm SimilarWeb. In return, news publishers have tacitly agreed to provide the tech giant with their data so the company can continue to improve the effectiveness and reliability of its search engine.
Google's generative AI experiment could spark a major shift in this dynamic, but it won't be in news publishers' favor.
Last May, the tech giant debuted a new AI-powered tool called Search Generative Experience (SGE). It is an experimental feature that summarizes responses to search queries in natural-language text similar to that produced by ChatGPT.
Through SGE, Google wants to provide a more efficient and personalized search experience. Answers to queries appear in one convenient chat box, and users can ask follow-up questions. The model also generates suggestions for such questions. For example, someone who searches for the best vineyard tours in Napa Valley might be prompted to ask about Airbnb prices in the area.
Google search users have traditionally been directed to news publishers' websites to find more information, but they may soon be able to complete their searches without ever leaving Google.
“At the scale that Google operates, even at a fraction of the scale, [lost] traffic can result in millions, if not billions, fewer visitors to a publisher's site,” says Jim Lesinski, an associate professor of marketing at Northwestern University. “This is the biggest challenge for publishers.”
Google says it's too early to tell how its latest efforts in AI-powered search will impact publishers. “It is too early to estimate the traffic impact of the SGE experiment as we continue to rapidly evolve our user experience and design, including how links are displayed,” a company spokesperson told The Drum. “We will continue to prioritize an approach that sends valuable traffic to publishers. In fact, more links to sites appear in search with SGE than ever before, and new opportunities for content to be discovered are emerging.”
Ironically, SGE's functionality relies in part on data from news publisher sites.
“Google wants to be a one-stop shop for information,” says Chris Rodgers, founder and CEO of CSP, a search engine optimization (SEO) agency. “The problem is they don't have that information. They have to get it from somewhere else. In their perfect world, they would just take it and disseminate it directly to users.”
The problem with this approach, Rodgers says, is that it's always been a two-way street. If publishers suddenly realize that they are not receiving what they were promised in the deal (traffic to their websites), they may decide that it is best to look for alternative ways to engage with their audiences.
This is already starting to happen. Some publishers have responded by modifying code on their websites to prevent Google's algorithms from crawling their content or using it to train the company's large language models (LLMs).
In September, Google announced in a blog post that it was introducing a new control called “Google-Extended,” which would allow publishers to opt out of having their content used to train some of its AI products. (According to Google, the company's foundational LLM is trained primarily on publicly available content from the internet, such as blog posts and chat forums.)
But like someone caught in quicksand, this struggle only threatens to worsen publishers' woes. Making a publisher's data uncrawlable also reduces the publisher's visibility in Google Search.
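To make the trade-off concrete, here is a minimal sketch of how such crawler rules are typically written in a site's robots.txt file and how they would be evaluated. It is an illustration under stated assumptions, not any publisher's actual configuration: the two policies, the example URL, and the helper names `is_allowed` and `summarize` are hypothetical, and `Googlebot` is assumed to be the token for Google's search crawler. The sketch contrasts a targeted opt-out of `Google-Extended` with a blanket block that would also shut out the search crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt policies for an illustrative publisher site.
# "Google-Extended" is the opt-out token named in the article; "Googlebot"
# is assumed here to be the token used by Google's search crawler.
BLOCK_AI_TRAINING_ONLY = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

BLOCK_ALL_CRAWLERS = """\
User-agent: *
Disallow: /
"""

ARTICLE_URL = "https://news.example.com/2024/story.html"  # placeholder URL


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def summarize(robots_txt: str) -> dict:
    """Evaluate one policy for both crawler tokens."""
    return {
        agent: is_allowed(robots_txt, agent, ARTICLE_URL)
        for agent in ("Googlebot", "Google-Extended")
    }


targeted = summarize(BLOCK_AI_TRAINING_ONLY)
blanket = summarize(BLOCK_ALL_CRAWLERS)
print("targeted opt-out:", targeted)
print("blanket block:   ", blanket)
```

The parser only evaluates the rules as written; whether a given crawler honors them, and how an opt-out affects what appears in search results, is up to the crawler's operator.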
Rodgers argues that the growing tension between publishers and Google won't last forever. Something will have to give.
“What happens if Google just retrieves information but doesn't return anything?” he asks. “What we're seeing is a natural progression. It's a backlash against Google, saying, 'If you don't give us some credit, and even threaten our industry and viability, we're not going to provide content to you.'”
“In my eyes, that backlash is really important,” he says.
Different paths for publishers
To protect themselves in the early days of generative AI, many news publishers are starting to adopt one of two strategies: enter into licensing agreements with big tech companies or sue them.
According to Northwestern's Lesinski, publishers who license their content are effectively betting that the income they receive for training the AI will offset the traffic and advertising lost “if people only get answers from the AI but don't visit the site.”
Reddit recently signed a $60 million annual deal with Google that allows Google to use the platform's data to train AI models. A similar deal was signed in December between OpenAI and German media company Axel Springer. (Axel Springer is also suing Google over what it considers anti-competitive ad tech practices.)
Meanwhile, the New York Times sued OpenAI and Microsoft in December, alleging that its proprietary content was used illegally to train the LLMs behind ChatGPT.
More recently, in March, French regulators fined Google 250 million euros (approximately $270 million) for failing to reach fair licensing agreements with news publishers and for failing to disclose to them that their articles were being used to train the company's AI chatbot, Gemini, according to a New York Times report.
Of course, there is a large difference in power between Google, one of the most valuable companies in the world, and individual news publishers, many of which have been facing increasing economic pressures since before most of the world knew about generative AI.
But what if there were a concerted effort across the media to stand up to Google by blocking its AI crawlers, effectively declaring collectively that publishers will not accept a future that forces them into a position of increased dependence and vulnerability?
According to Rodgers, a signal like that would be impossible to ignore and could force Google to reevaluate its current approach and work toward a new type of relationship with news publishers, one that leverages generative AI to benefit both parties. “If everyone does the same across the media industry, Google will get the message that it has to do something to level the playing field and make it right.”
It's not yet clear what that something would be. Generative AI is still a very new technology, despite its rapid adoption over the past year and a half. Courts will be grappling with the legal implications of this technology for some time, and businesses of all sizes are still finding ways to use it productively and responsibly.
But business, like nature, abhors a vacuum.
“Google is a huge company,” Rodgers says. “They've dominated [the online search industry] for ages and ages. I don't know if that will change. But if there are other players who can better serve the media or businesses, someone will come in to fill the void.”