hiQ Labs v. LinkedIn is the leading U.S. case on whether scraping publicly visible web data is lawful, a question that sits directly beneath how AI training datasets are built. hiQ, a small analytics company, used automated bots to scrape information from public LinkedIn profiles. LinkedIn sent a cease-and-desist letter invoking the Computer Fraud and Abuse Act (CFAA), the federal anti-hacking statute, and tried to block hiQ’s access. hiQ sued, and in September 2019 the Ninth Circuit (938 F.3d 985) affirmed a preliminary injunction protecting hiQ’s access to the public data.
The dispute then went up and came back. In 2021 the Supreme Court vacated the ruling and sent the case back for reconsideration in light of Van Buren v. United States, which had narrowed the CFAA’s “exceeds authorized access” language. On remand in April 2022 (31 F.4th 1180), the Ninth Circuit again sided with hiQ, concluding that it is likely that “when a computer network generally permits public access to its data, a user’s accessing that publicly available data will not constitute access without authorization under the CFAA.” In other words, scraping data that is open to the public is unlikely to violate the anti-hacking statute.
The story has a twist: although hiQ won the CFAA point, it later agreed to a permanent injunction and to stop scraping, because LinkedIn’s separate contract (terms of service) claims survived. So the precedent protects scrapers from the hacking statute but leaves contract and copyright questions wide open.
Why business readers should care: hiQ is the reference point for the “is web scraping legal?” debate that runs through every conversation about AI training data. It indicates that, at least in the Ninth Circuit, scraping publicly accessible data is unlikely to violate the anti-hacking law, while making clear that terms of service and copyright remain live battlegrounds, which is exactly where the big AI data lawsuits are now fought.