Google logo / Courtesy of Yonhap News

Google has sued a crawling (data extraction) firm that scrapes its search results. The move is seen as an attempt to prevent data collected through crawling from being used to train rival generative artificial intelligence (AI) models.

Google, the world's largest search company, said on the 19th (local time) that it filed a lawsuit in the U.S. District Court for the Northern District of California, alleging that Austin, Texas-based crawling startup SerpApi is infringing copyright. Google said, "With this lawsuit, our goal is to stop SerpApi's malicious crawling."

Crawling is the mass copying and storage of the contents of countless internet pages. It is used in various analytical tasks, including AI model training.

Google said SerpApi is taking content without permission while ignoring crawling protocols (guidelines) set by individual websites and is even bypassing security measures designed to block such activity.

Google said, "Google follows industry-standard crawling guidelines, but companies like SerpApi hide themselves and attack websites with large-scale bot networks," adding, "They use back doors, such as rotating fake names, to collect websites' content in its entirety, and such illegal activity has surged over the past year."

In particular, Google said these companies are taking content that it licenses from third parties and displays in its results, then reselling that content for a fee, adding that "SerpApi's business model is parasitic."

In the complaint, Google calculated damages for each individual violation by SerpApi at $200 to $2,500. It also stressed, "Because they are unable to pay damages, they are causing irreparable harm to Google."

In response, SerpApi legal counsel Chad Anson said, "We have not yet received Google's complaint, and Google did not contact us before filing," and argued that the company's business is protected by the First Amendment, which guarantees freedom of speech.

Founded in 2017, SerpApi began as a company that collected information to help customers rank higher in Google search. But as generative AI companies emerged, led by ChatGPT developer OpenAI, SerpApi moved into a new market: selling them the web page data it had scraped.

SerpApi is known to have sold web page data to OpenAI and Meta, as have companies running similar businesses, including Lithuania-based startup Oxylabs and Russian companies such as AQMProxy.

Accordingly, the industry sees Google's legal action against SerpApi as a move to keep competitors such as OpenAI and Meta in check.

A U.S. court has ordered Google to share search data with major competitors, but it limited that data to user-entered search terms and raw data, and excluded the algorithms that compose search results.

Online community Reddit also filed a lawsuit in October against crawling firms including SerpApi.

※ This article has been translated by AI.