Tumblr and WordPress are reportedly on the verge of striking deals with artificial intelligence companies OpenAI and Midjourney to provide user data for model training. Automattic, the parent company of both platforms, is said to be finalizing the agreements.
While it remains unclear exactly what data the agreements cover, the reports have raised concerns about potential privacy breaches. Automattic allegedly included private or partner-related data in an initial compilation that was never meant to be part of the deal. Internal communications from Tumblr product manager Cyle Gage suggest that sensitive material such as private posts, deleted blogs, unanswered questions, and explicit content may have been included inadvertently.
Following these reports, Automattic issued a statement saying it will share only public content from sites that have not opted out of the data-sharing agreements. The company also noted that no current law requires AI companies' web crawlers to honor users' opt-out preferences.
To address concerns about user privacy, Automattic plans to launch an opt-out tool that will let users block third parties, including AI companies, from training on their data. The company's head of AI, Andrew Spittle, has said that Automattic will advocate for users who opt out to have their data removed from past sources and excluded from future training runs.
Agreements to license data for AI training have become a lucrative opportunity for websites struggling in the fast-moving online publishing landscape. Tumblr has reportedly cut staff in recent years, and partnerships with AI companies may give the platform a much-needed boost.
As similar deals continue to emerge across the tech industry, users need to know how their data is being used and to have options for protecting their privacy. The developments involving Tumblr, WordPress, OpenAI, and Midjourney highlight the importance of transparency and user control in data-sharing agreements.
It remains to be seen how these reported agreements will unfold and how users will respond to Automattic's planned opt-out tool. As technology continues to advance, the balance between data use and user privacy will remain a critical issue for companies and consumers alike.