LinkedIn files lawsuit against data scraping company

By Jason Pollock | 28 January 2025
 

LinkedIn is taking a stand against companies collecting its data without permission.

The business-focused social media platform’s VP of legal - litigation, competition, and enforcement, Sarah Wight, announced that the company has filed a federal lawsuit in the US against Proxycurl and its founder for “the unauthorized creation of hundreds of thousands of fake accounts and scraping of millions of LinkedIn member profiles”.

“These actions directly violate our User Agreement, and when necessary, we take legal actions as part of our commitment to preserve the integrity of our platform and keep control of member data where it belongs – in members’ hands,” said Wight.

Proxycurl LLC, a Singapore-based company owned by Nubela Pte. Ltd. and founded by Steven Goh – all three of which are listed as defendants in the lawsuit, along with Bach Le – describes itself as a LinkedIn profile scraping application programming interface.

This means that customers can use the tool to ‘scrape’ public LinkedIn profiles to obtain the personal details and contact information of both people and companies.

Data scraping has become a hotly contested topic in recent years, with News Corp CEO Robert Thomson warning of a potential "tsunami" of job losses because of the impact of AI, which includes applications like data scraping.

"The big AI engines are going to scrape content to feed and update the engine, or they're going to surface individual stories, and they're going to extract the editorial essence," he said.

"And in each of those three areas, it's incumbent on the big players to reward those who created that content, or else they're actually undermining the act of creation."

It isn't just external platforms doing the scraping either - in September 2024, Meta’s global privacy policy director admitted that the company has been collecting data from the public photos and posts of Facebook and Instagram users and feeding it into its AI training models since 2007.

At a parliamentary inquiry into the use of AI, Senator Tony Sheldon said Amazon, Meta, and Google have already committed "arguably the largest act of theft in Australian history by scraping the collective body of human knowledge and creative output, without consent or payment to the owners of that work".

“Where tech companies have scraped copyrighted data without consent or payment, the Government needs to intervene,” he said.

Although the first tranche of agreed recommendations from the Privacy Act Review introduced by the government didn't address first-party and third-party data collection, or definitions of personal information, such updates are expected in later legislation.

LinkedIn’s User Agreement says that users of the site agree that they will not ‘develop, support or use software, devices, scripts, robots or any other means or processes (such as crawlers, browser plugins and add-ons or any other technology) to scrape or copy the Services, including profiles and other data from the Services’.

A blog post on the platform in 2021 from LinkedIn’s senior staff software engineer for machine learning, James Verbus, said that the company has “productionalised a deep learning model that operates directly on raw sequences of member activity, allowing us to scalably leverage more of the available signal hidden in the data and stop adversarial attacks more effectively.”

“Our first production use case of this model was the detection of logged-in accounts scraping member profile data,” the post reads.

“Scraping is not always bad. Search engines are expressly authorised to scrape in order to collect and index information throughout the internet. What makes it nefarious is when it is done without permission.”

In 2023, the Office of the Australian Information Commissioner said that “many data protection authorities have seen increased reports of mass data scraping from social media channels and other websites”, warning that such scraped data can be used for everything from targeted cyberattacks and identity fraud to monitoring, profiling and surveilling individuals, unauthorised political or intelligence gathering and unwanted direct marketing or spam.
