Understanding the Risks of Open-Source AI

Explore top LinkedIn content from expert professionals.

Summary

Understanding the risks of open-source AI means recognizing how freely available AI models can introduce new threats and vulnerabilities because their inner workings and training data are often hidden from view, making it hard to spot problems or abuses. Open-source AI refers to artificial intelligence models whose core components (like weights or code) are shared publicly, giving anyone the ability to use or modify them, which can be empowering but also risky.

  • Prioritize trusted sources: Always source open AI models from reliable providers and carefully assess them for security issues before use.
  • Protect your infrastructure: Secure the environment where your AI runs by enforcing strict access controls and regularly monitoring for vulnerabilities.
  • Monitor AI integrations: Keep a close eye on how AI models connect with your systems and require human oversight for actions that could impact sensitive data or operations.
Summarized by AI based on LinkedIn member posts
  • View profile for Arockia Liborious
    Arockia Liborious Arockia Liborious is an Influencer
    39,520 followers

    Is your AI model actually safe? ....The answer is more complicated than a simple yes or no. Many treat AI models like standard open-source software, checking the creator license and functionality. But this is a dangerous oversimplification. The term Open Source itself is misleading here. Unlike software where you can inspect the source code "open" AI models are often just open weights a massive file of numbers. You can't see the training data or the process that created them, making them a black box that's impossible to fully verify or reproduce. This opacity creates a massive attack surface. Scans have found hundreds of thousands of issues, including malicious models designed to exfiltrate data. The threats are real and evolving. So how do we secure the un-securable? Focus on three layers: The Model Itself: Source from trusted providers and rigorously evaluate for vulnerabilities like prompt injection, the number 1 security risk for LLMs according to OWASP. Continuous benchmarking is non-negotiable . The Infrastructure: The software stack running the model is a critical vulnerability. A model even if safe is only as secure as the infrastructure it runs on. Enforce strict privilege controls and secure your inference toolchain. The Integration: How does the model interact with your systems? A helpful model given excessive agency can become an unknowing accomplice, manipulated to expose system vulnerabilities or leak data. The models are innocent. It is the context they are used in that creates the risk. Security isn't a one time check, it's a continuous process of evaluation monitoring and mitigation. It's time we started treating it that way. What's your biggest concern when deploying a local AI models? #AI #Safety

  • View profile for Nico Orie
    Nico Orie Nico Orie is an Influencer

    VP People & Culture

    18,120 followers

    OpenClaw, MCP, and the Architecture of AI Risk Autonomous AI agents are no longer just experiments — they’re starting to act inside real systems. OpenClaw (formerly MoltBot/Clawdbot) is a good example. It can access files, connect to apps, run workflows, and even remember information across sessions. Most of this is powered by the Model Context Protocol (MCP) — a tool that lets AI agents interact with your local and cloud systems. MCP is powerful, but it also opens up new risks. AI researcher Simon Willison calls it the “Lethal Trifecta” — three things that together create a big security problem: 1. Access to private data 2. Exposure to untrusted content (like emails or web pages) 3. Ability to act externally (send messages, call APIs, automate actions) When all three are present, attackers don’t need to hack anything in the traditional way. They can hide malicious instructions in normal content, and the AI will execute them automatically. Add persistent memory, and a malicious instruction planted today could run weeks later. There’s another risk: employees using tools like OpenClaw privately. Like early “shadow IT,” people may install these AI tools on their own devices, connect them to internal apps — without IT or security oversight. AI is moving from answering questions to taking actions. And action changes everything. To stay safe: . Audit all MCP integrations . Enforce least-privilege access . Sandbox agent environments . Require human approval for risky actions . Confirm policies on private AI use AI agents are becoming operational actors. And operational actors need operational controls. Source https://lnkd.in/e5k7ZYi4

  • View profile for David Evan Harris

    Business Insider AI 100 | Architect of California AI Transparency Act of 2025 (AB 853) | Interests: Tech Research & Policy, AI, Disinfo, Elections, Social Media, UX | Chancellor’s Public Scholar @ UC Berkeley

    15,412 followers

    Hot off the press! This is one of most comprehensive, international and collaborative reports on AI safety to be released to date. It's being released on the eve of the Paris AI Action Summit, that I'm excited to attend in February. I was honored when Yoshua Bengio's team at Mila - Quebec Artificial Intelligence Institute (Université de Montréal) reached out to me to give feedback on the section of the report about "open-source"/"open-weights" AI models starting on page 149. A few excerpts that I found helpful, and supportive of the arguments I've been making on this topic for quite some time now: "Risks posed by open-weight models are largely related to enabling malicious or misguided use. General-purpose AI models are dual-use, meaning that they can be used for good or put to nefarious purposes. Open-model weights can potentially exacerbate misuse risks by allowing a wide range of actors who do not have the resources and knowledge to build a model on their own to leverage and augment existing capabilities for malicious purposes and without oversight. While both open-weight and closed models can have safeguards to refuse user requests, these safeguards are easier to remove for open models. For example, even if an open-weight model has safeguards built in, such as content filters or limited training data sets, access to model weights and inference code allows malicious actors to circumvent those safeguards. Furthermore, model vulnerabilities found in open models can also expose vulnerabilities in closed models. Finally, with access to model weights, malicious actors can also fine-tune a model to optimise its performance for harmful applications. Potential malicious uses include harmful dual-use science applications, e.g. using AI to discover new chemical weapons, cyberattacks, and producing harmful fake content such as ‘deepfake’ sexual abuse material and political fake news. As noted below, releasing an open-weight model with the potential for malicious use is generally not reversible even when its risks are discovered later...." "A key evidence gap is around whether open-weight releases for general-purpose AI will have a positive or negative impact on competition and market concentration. Publicly releasing model weights can lead to both positive and negative impacts on competition, market concentration, and control... However, this apparent democratisation of AI may also play a role in reinforcing the dominance and market concentration among major players. In the longer term, companies that release open-weight general-purpose AI models often see their frameworks become industry standards, shaping the direction of future developments, as is quickly becoming the case with the widespread use of Llama models in open development projects and industry application. These firms can then easily integrate advancements made by the community (for free) back into their own offerings, maintaining their competitive edge." #AI #AIsafety #AIActionSummit

  • View profile for Sean Varga

    OWASP Triangle Co-Leader / JPMC Hall of Innovation recipient / 3 Companies = 3 President’s Clubs / 2x Above and Beyond Award / 2x Force Mgmt

    13,463 followers

    If Claude—or any AI—writes 100% of the world’s code, developers vanish from the loop, and you’re left with a world where every line comes from a black-box model trained on trillions of past repos. Appsec doesn’t die; it mutates into something weirder and potentially scarier. Here’s what risks we’d face: 1 Hallucinated vulnerabilities — AI might invent fake bugs that look real (e.g., a bogus buffer overflow) because it “remembers” them from training data. Or worse: it writes code that seems secure but has subtle, novel flaws—like a zero-day no human ever thought of. Think: AI-generated crypto that passes tests but leaks keys under edge cases. 2 Training-data poisoning — If bad actors slip backdoors into public repos (or bribe open-source maintainers), the model learns them as “best practice.” Every app built from that model inherits the trap—silent, systemic, unpatchable unless you retrain everything. 3 Lack of adversarial thinking — Humans catch edge cases because we get paranoid. Claude? It optimizes for “works on average” datasets. No one asks: “What if the user is a nation-state?” Result: apps that collapse under real-world stress—like supply-chain attacks via AI-written npm packages. 4 Opacity & no audit trail — No human commits, no PR reviews, no “why did you do this?” logs. Security teams get a blob of code with zero context. Fixing a vuln? Good luck—it’s like debugging a dream. Regulators might ban it outright unless there’s mandatory “explainable AI” layers. 5 Mass-scale monoculture — If everyone’s using the same Claude fork, one flaw hits billions. Imagine Heartbleed, but every website, every IoT device, every bank app—same bug, same patch delay. Diversity dies; resilience tanks. 6 AI-specific exploits — New vectors: prompt-injection in code-gen (e.g., “write a secure login but actually log creds”), model inversion (reverse-engineer training data from output), or even “AI jailbreaks” that force the coder to output malware. Bottom line: Appsec shifts from “human error” to model error—less sloppy typos, more existential blind spots. The winners? Firms that own the model (Anthropic, OpenAI) or build “AI-proof” wrappers—like Cycode. Irony: the tool that kills dev jobs creates the biggest security market ever.

  • View profile for Greg Coquillo
    Greg Coquillo Greg Coquillo is an Influencer

    AI Infrastructure Product Leader | Scaling GPU Clusters for Frontier Models | Microsoft Azure AI & HPC | Former AWS, Amazon | Startup Investor | Linkedin Top Voice | I build the infrastructure that allows AI to scale

    231,117 followers

    Every AI failure you've read about traces back to one of these risks. Not a bug. Not bad luck. A known, named, predictable category of risk that every AI team should already be tracking. Here's the AI Risk Periodic Table, mapped across 10 categories every founder, product leader, and enterprise team needs to understand. 𝟭. 𝗠𝗼𝗱𝗲𝗹 𝗥𝗶𝘀𝗸𝘀 Hallucination, bias, drift, overfitting, underfitting, error propagation. The model itself fails before anyone touches it. 𝟮. 𝗗𝗮𝘁𝗮 𝗥𝗶𝘀𝗸𝘀 Mislabeling, source risk, synthetic data risk, duplicate data, data leakage, consent risk, quality loss. Bad data breaks good models. 𝟯. 𝗦𝗲𝗰𝘂𝗿𝗶𝘁𝘆 𝗥𝗶𝘀𝗸𝘀 Jailbreaks, prompt injection, adversarial attacks, API abuse, token theft, supply chain risk. Every AI system is a new attack surface. 𝟰. 𝗚𝗼𝘃𝗲𝗿𝗻𝗮𝗻𝗰𝗲 𝗮𝗻𝗱 𝗖𝗼𝗺𝗽𝗹𝗶𝗮𝗻𝗰𝗲 Governance failure, compliance risk, regulatory risk, policy failure, ownership gap, explainability gap. The stuff that gets companies fined or sued. 𝟱. 𝗢𝗽𝗲𝗿𝗮𝘁𝗶𝗼𝗻𝗮𝗹 𝗥𝗶𝘀𝗸𝘀 Scaling, cost overrun, latency, deployment, documentation, integration, rollback gaps. Where production AI quietly bleeds money. 𝟲. 𝗕𝘂𝘀𝗶𝗻𝗲𝘀𝘀 𝗮𝗻𝗱 𝗥𝗲𝗽𝘂𝘁𝗮𝘁𝗶𝗼𝗻 𝗥𝗶𝘀𝗸𝘀 Reliability, reputation, customer trust loss, revenue impact, ROI failure, strategy misalignment. The risks the CFO cares about most. 𝟳. 𝗛𝘂𝗺𝗮𝗻 𝗮𝗻𝗱 𝗘𝘁𝗵𝗶𝗰𝗮𝗹 𝗥𝗶𝘀𝗸𝘀 Fairness, trust gap, ethical risk, automation bias, job displacement fear. The risks that decide whether anyone actually uses your AI. 𝟴. 𝗠𝗼𝗻𝗶𝘁𝗼𝗿𝗶𝗻𝗴 𝗮𝗻𝗱 𝗖𝗼𝗻𝘁𝗿𝗼𝗹 Monitoring gaps, audit gaps, alert failure, logging gap, metric blindness, validation gaps. If you can't see it, you can't fix it. 𝟵. 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗔𝗜 𝗥𝗶𝘀𝗸𝘀 Agent autonomy risk, tool misuse, memory risk, goal misalignment, delegation risk, multi-agent failure, loop failure. The newest, most underestimated category in 2026. 𝟭𝟬. 𝗙𝗮𝗶𝗹-𝗦𝗮𝗳𝗲 𝗥𝗶𝘀𝗸𝘀 Kill switch gap, feedback gap, evaluation failure, red teaming gap. The layer that decides whether AI fails gracefully or catastrophically. 𝗧𝗵𝗲 𝗯𝗶𝗴 𝗶𝗱𝗲𝗮: Most AI teams worry about hallucinations. The best teams worry about all 70+ of these, with a system to monitor each one. AI isn't risky because it's new. It's risky because most teams have never mapped its risks. This table is that map. Which risk is your team underestimating right now? Repost to help another AI leader plan smarter.

  • I was interviewed at length for today's The Wall Street Journal article on what exactly went so wrong with Grok. Here's what's critical for any leader considering enterprise-grade AI: Great article by Steve Rosenbush breaking down exactly how AI safety can fail, and why raw capability isn't everything. AI tools need to be trusted by enterprises, by parents, by all of us. Especially as we enter the age of agents, we're looking at tools that won't just answer offensively, they'll take action as well. That's when things really get out of hand. ++++++++++ WHAT WENT WRONG? From the article: "So while the risk isn't unique to Grok, Grok's design choices, real-time access to a chaotic source, combined with reduced internal safeguards, made it much more vulnerable," Grennan said. In other words, this was avoidable. Grok was set up to be "extremely skeptical" and not trust mainstream sources. But when it searched the internet for answers, it couldn't tell the difference between legitimate information and harmful/offensive content like the "MechaHitler" meme. It treated everything it found online as equally trustworthy. This highlights a broader issue: Not all LLMs are created equal, because getting guardrails right is hard. Most leading chatbots (by OpenAI, Google, Microsoft, Anthropic) do NOT have real-time access to social media precisely because of these risks, and they use filtering systems to screen content before the model ever sees it. +++++++++++ WHAT DO LEADERS NEED TO KNOW? 1. Ask about prompt hierarchies in vendor evaluations. Your AI provider should clearly explain how they prioritize different sources of information. System prompts (core safety rules) must override everything else, especially content pulled from the internet. If they can't explain this clearly, that's a red flag. 2. Demand transparency on access controls. Understand exactly what your AI system can read versus what it can actually do. Insist on read-only access for sensitive data and require human approval for any actions that could impact your business operations. 3. Don't outsource responsibility entirely. While you leaders aren't building the AI yourselves, you still own the risk. Establish clear governance around data quality, ongoing monitoring, and incident response. Ask hard questions about training data sources and ongoing safety measures. Most importantly? Get fluent. If you understand how LLMs work, even at a basic level, these incidents will be easier to guard against. Thanks again to Steve Rosenbush for the great article! Link to article in the comments! +++++++++ UPSKILL YOUR ORGANIZATION: When your organization is ready to create an AI-powered culture—not just add tools—AI Mindset can help. We drive behavioral transformation at scale through a powerful new digital course and enterprise partnership. DM me, or check out our website.

  • View profile for Michał Choiński

    AI Research and Voice | Driving meaningful Change | IT Lead | Digital and Agile Transformation | Speaker | Trainer | DevOps ambassador

    11,976 followers

    Open source is one of the most powerful forces in tech. It fuels innovation, accelerates learning, and removes barriers. But there’s an uncomfortable side to this openness, When powerful tools are accessible to everyone, that includes people with harmful intentions. Open-source AI models have been used to: → Generate synthetic voices for impersonation scams → Automate phishing emails with frightening precision → Create deepfakes that spread misinformation → Build malware-generating tools using code models None of this requires a massive budget or a team of researchers. Just a laptop, some curiosity, and a GitHub account. That’s the beauty, and the risk, of open access. We often celebrate the fact that AI is no longer just in the hands of big tech. And rightly so, concentrated power raises serious concerns about equity, transparency, and control. But Is it better for a few large companies to control the most powerful tools? Or for everyone to have access, including those who might abuse it? There’s no easy answer. But as we push for democratization, we also need to ask: What responsibilities come with open power? And how do we balance progress with protection?

  • View profile for Bertie Vidgen

    AI Researcher @ Mercor

    5,997 followers

    New paper! ⚠️ SimpleSafetyTests (v2) ⚠️ Without proper safeguards, AI models will readily give dangerous advice, spread misinformation, provide instructions for harming people, or help you access the dankest corners of the web (and a lot more). This is a huge problem but - despite all the interest in AI safety - the community still doesn't have a good way of evaluating models' risks. To help solve this problem, we created a test suite called SimpleSafetyTests. It contains 100 hand-crafted prompts, split into five harm areas (such as Child abuse, and Suicide and Self-Harm). A safe model should, for nearly all applications, refuse to comply with them. When we released SST back in November we tested 11 open source models (with two system prompts) and had trained annotators hand-label all 2,200 responses. We identified serious safety risks, with some models giving unsafe responses to 50%+ of the prompts. We also found that adding a safety-emphasising system string is a surprisingly effective solution. I kept being asked two questions - (1) How do the commercial models (e.g. GPT-4, Anthropic's Claude) perform? and (2) Can you automate the evaluation so we can use it!? Well, the results are in... - (1) We tested four commercial models. They generally perform much better than the open-source models, with 10x fewer unsafe responses. But there are still some surprising weaknesses. - (2) We can *sort of* automate the evaluation. We tried four classifiers out-the-box (including Perspective API) and created our own zero-shot prompt. The classifiers weren't great but the new prompt performed well, with 90%+ accuracy. Once we get to 99% accuracy, we'll open a lot of new frontiers in evaluation... stay tuned! Give the paper a read for the full results and methodology (and see our updated graphs). We've open-sourced the 100 SST prompts, and the 3,000 hand-labelled model responses are available on request. Thank you to collaborators Nino Scherrer Hannah Rose Kirk Anand Kannappan Rebecca Qian Scott Hale and Paul Röttger Arxiv: https://lnkd.in/eDCxGgAX Github: https://lnkd.in/eN5XQgKH HuggingFace: https://lnkd.in/e2CcSrxW

  • View profile for Dr. Barry Scannell
    Dr. Barry Scannell Dr. Barry Scannell is an Influencer

    AI Law & Policy | Partner in Leading Irish Law Firm William Fry | Member of the Board of Irish Museum of Modern Art | PhD in AI & Copyright

    60,559 followers

    This is WILD! The potential for AI systems, particularly large language models (LLMs) like GPT-4, to inadvertently aid in the creation of biological threats has become a pressing concern - to the point where OpenAI has recently published fascinating research aiming to develop an early warning system that assesses the risks associated with LLM-aided biological threat creation. By comparing the capabilities of individuals with access to GPT-4 against those using only the internet, the study aimed to discern whether AI could significantly enhance the ability to access information critical for developing biological threats. The findings revealed only mild uplifts in performance metrics such as accuracy and completeness for those participants who had access to GPT-4. Although these uplifts were not statistically significant, they mark an essential first step in ongoing research and community dialogue about AI's potential risks and benefits. The study was guided by design principles that emphasise the need for human participation, comprehensive evaluation, and the comparison of AI's efficacy against existing information sources. Such a meticulous approach is critical in navigating the complexities of AI-enabled risks while minimising information hazards. From a legal standpoint, these findings intersect with the evolving regulatory framework for AI, notably the discussions surrounding the proposed AI Act in the European Union. This Act aims to categorise AI systems based on the risk they pose and establish stringent compliance requirements for high-risk AI systems. General Purpose AI (GPAI) Models such as LLMs like GPT-4 could be considered as GPAI Models with systemic risk if they are deemed capable of facilitating the creation of biological threats. This study underscores the importance of developing robust safety measures, including secure access protocols and monitoring use cases, to prevent misuse. Moreover, it highlights the need for transparency and accountability in AI development, aligning with the AI Act’s objectives to ensure that AI technologies are developed and deployed in a manner that prioritises public welfare. The evaluation's findings call for a multifaceted research agenda to better understand and contextualise the implications of AI advancements. As AI models become more sophisticated, the potential for their misuse in creating biological threats could evolve, necessitating a comprehensive body of knowledge to guide responsible development and deployment. This includes not only technical advancements but also ethical guidelines, governance frameworks, and collaborative international efforts to ensure AI serves humanity's betterment while minimising risks of misuse. The insights garnered from this study not only contribute to the scientific discourse but also offer valuable perspectives for shaping the legal landscape around AI, ensuring it advances in harmony with the principles of safety, security and ethical responsibility.

  • View profile for Jungpil Hahn

    Provost’s Chair Professor of Information Systems and Analytics at NUS School of Computing

    4,611 followers

    The Trade-Off Between Innovation and Security: A Lesson from AI and Phishing Scams Singapore’s Budget 2025 introduces new financial initiatives, and within hours, phishing scams on Telegram (like the one on the image) are exploiting it. These scams are becoming more automated, sophisticated, and convincing—a stark reminder of how AI can be used for both progress and harm. The rise of open-source AI models like DeepSeek and Llama poses a similar dilemma. Unlike OpenAI’s ChatGPT, which is closed-source, these models allow anyone to fine-tune and modify them. This openness accelerates innovation and collaboration—but also enables misuse. Just as scammers adapt AI to impersonate governments and businesses, bad actors can train AI for large-scale disinformation, deepfakes, and phishing attacks. So where do we draw the line between open innovation and security? - Open-source AI fosters faster research and collaboration, but also makes it easier for criminals to exploit. - Stricter regulation improves safety and accountability, but risks slowing down technological progress. Governments, businesses, and researchers must act before AI-driven cyber threats outpace regulation. We need: ! Stronger AI governance to balance innovation with responsibility. ! Smarter cybersecurity measures that anticipate AI-driven scams. ! Better public education to help people recognize and resist AI-powered fraud. The same technology that drives progress can also be weaponized. How do we ensure AI remains a force for good? #AI #Cybersecurity #AIGovernance #AISingapore #SGBudget2025 #DigitalTrust

Explore categories