LinkedIn collected user data for training purposes before updating its terms of service
LinkedIn may have trained AI models on user data without updating its terms.
LinkedIn users in the United States (but not in the EU, EEA, or Switzerland, likely because of data privacy laws in those regions) have an opt-out toggle on their settings screen disclosing that LinkedIn collects personal data to train "AI models to create content." The toggle isn't new. But, as first reported by 404 Media, LinkedIn didn't initially update its privacy policy to reflect this data use.
The terms of service have now been updated, but such updates typically happen well before a significant change like using user data for a new purpose. The idea is to give users the option to change their accounts or leave the platform if they don't like the changes. That doesn't appear to have happened this time.
So what models is LinkedIn training? Its own, the company says in a Q&A, including models for writing suggestions and post recommendations. But LinkedIn also says that generative AI models on its platform may be trained by a "third-party vendor," such as its corporate parent Microsoft.
“As with most features on LinkedIn, when you use our platform, we collect and use (or process) data about your use of the platform, including personal data,” the Q&A reads. “This may include your use of generative AI (AI models used to create content) or other AI features, your posts and articles, how often you use LinkedIn, your language preferences, and any feedback you may have provided to our teams. We use this data, in accordance with our privacy policy, to improve or develop the LinkedIn Services.”
LinkedIn previously told TechCrunch that it uses “privacy-enhancing techniques, including redaction and removal of information, to limit personally identifiable information contained in datasets used to train generative AI.”
To opt out of LinkedIn's data collection, go to the "Data Privacy" section of the LinkedIn settings menu on desktop, click "Data to improve Generative AI," then toggle off "Use my data to train AI models to create content." You can also attempt a more comprehensive opt-out via this form, but LinkedIn notes that opting out won't affect training that has already taken place.
The nonprofit Open Rights Group (ORG) has asked the Information Commissioner's Office (ICO), the UK's independent regulator for data protection laws, to investigate LinkedIn and other social networks that train on user data by default. Earlier this week, Meta announced it was resuming plans to collect user data for AI training after working with the ICO to simplify the opt-out process.
“LinkedIn is the latest social media company to process our data without asking for our consent,” Mariano delli Santi, a lawyer and policy officer at ORG, said in a press release. “The opt-out model once again proves to be completely inadequate to protect our rights: society cannot be expected to monitor and prosecute every internet company that decides to use our data to train AI. Opt-in consent is not only legally required, but also common sense.”
The Irish Data Protection Commission (DPC), the supervisory authority responsible for monitoring compliance with the GDPR, the EU's overarching privacy framework, told TechCrunch that LinkedIn had advised it last week that clarifications to its global privacy policy would be published today.
“LinkedIn has informed us that the policy will include an opt-out setting for members who do not want their data used to train AI models that generate content,” a DPC spokesperson said. “This opt-out is not available to EU/EEA members, as LinkedIn does not currently use EU/EEA member data to train or tune these models.”
TechCrunch has reached out to LinkedIn for comment. We will update this article if we hear back.
The need for more data to train generative AI models has led a growing number of platforms to license or otherwise repurpose their vast troves of user-generated content. Some have even moved to monetize that content: Tumblr owner Automattic, Photobucket, Reddit, and Stack Overflow are among the networks licensing data to AI model developers.
Not all of them have made opting out easy. When Stack Overflow announced it would begin licensing content, several users deleted their posts in protest, only to see those posts restored and their accounts suspended.