appsec.fyi

AI Resources

Post Share

A curated AppSec resource library covering XSS, SQLi, SSRF, IDOR, RCE, XXE, OSINT, and more.

AI

AI security encompasses both protecting AI systems from attack and understanding the new vulnerability classes that AI introduces into applications. As organizations rapidly integrate large language models (LLMs), machine learning pipelines, and AI-powered features into their products, the attack surface has expanded in ways that traditional application security frameworks don't fully address.

Key threats to AI systems include prompt injection — where attackers manipulate LLM behavior through crafted inputs — data poisoning of training datasets, model extraction through repeated API queries, and adversarial examples that cause misclassification. Indirect prompt injection, where malicious instructions are embedded in data the AI processes (emails, documents, web pages), is emerging as one of the most significant security challenges for AI-integrated applications.

AI also introduces new categories of application risk: insecure output handling where LLM responses are rendered unsafely, excessive agency when AI agents are given too much access, sensitive information disclosure through training data leakage, and supply chain risks from fine-tuned models and third-party plugins. The OWASP Top 10 for LLM Applications provides a structured framework for understanding these risks.

On the defensive side, AI is being used to enhance security operations — automating vulnerability detection, analyzing malicious patterns, and accelerating incident response.

This page collects AI security research, LLM vulnerability techniques, defensive strategies, and resources covering the intersection of artificial intelligence and application security.

Date Added Link Excerpt
2026-04-29 NEW 2026CVE-2026-42208: LiteLLM bug exploited 36 hours after its disclosure news SQLiWriteup of CVE-2026-42208, an SQL injection in LiteLLM's proxy API key verification, exploited 36 hours post-disclosure. Attackers leverage crafted Authorization headers to access and potentially modify sensitive data in database tables holding API keys and credentials. The vulnerability, present in LiteLLM versions 1.81.16 to 1.83.6, was addressed in version 1.83.7. Disabling error logs offers a workaround for unpatchable instances. → securityaffairs.com
2026-04-29 NEW 2026AI Finds 38 Security Flaws in OpenEMR news RCEAn AI system has identified 38 security vulnerabilities within the OpenEMR electronic health records software. The AI's analysis, detailed in a linked report, uncovered these flaws, highlighting potential risks to patient data security and system integrity. This discovery underscores the growing role of artificial intelligence in identifying and addressing security weaknesses in critical software applications. No specific bug bounty payout amount was mentioned in the provided content. → darkreading.com
2026-04-29 NEW 2026LiteLLM exploited within 36 hours of disclosure via SQL injection bug news SQLiLibrary for managing large language model (LLM) interactions. Explores the exploitation of CVE-2026-42208, a SQL injection vulnerability in LiteLLM, which led to the theft of API keys and provider credentials from enterprises using the proxy to connect to models like OpenAI and Anthropic. The vulnerability, disclosed and exploited within 36 hours, highlights the compressed window between vulnerability discovery and weaponization, potentially exposing sensitive company IP and private data. Disabling error logs is a suggested mitigation. → scworld.com
2026-04-29 NEW 2026Malicious npm Dependency Linked to AI Assisted Commit Targets Crypto Wallets news Supply ChainLibrary of malicious npm dependencies linked to AI-assisted commits, specifically @validate-sdk/v2 and the PromptMink campaign, targeting crypto wallets. This North Korean state-sponsored actor, Famous Chollima, employed a layered attack structure with legitimate-seeming Web3 utilities hiding malware payloads, evolving from JavaScript to compiled binaries and Rust across Linux and Windows to exfiltrate sensitive data, system information, project folders, and install SSH keys for persistent access. → infosecurity-magazine.com
2026-04-29 NEW 2026Fresh LiteLLM Vulnerability Exploited Shortly After Disclosure news SQLiLibrary for securing AI gateways, specifically addressing CVE-2026-42208, a critical-severity SQL injection vulnerability in LiteLLM. This flaw, exploitable pre-authentication, allowed unauthenticated attackers to craft malicious Authorization headers to access sensitive database tables containing API keys and credentials. The vulnerability arises from a database query that includes caller-supplied values directly, bypassing parameterization. LiteLLM version 1.83.7 resolves this by properly parameterizing the query, with disabling error logs also offered as a mitigation. → securityweek.com
2026-04-29 NEW 2026Firefox using advanced AI to find fix browser security flaws news FuzzingFirefox is leveraging advanced AI to proactively identify and fix security vulnerabilities in its browser. This innovative approach aims to enhance user safety by detecting flaws before they can be exploited. The article highlights how AI is becoming an increasingly powerful tool in cybersecurity, particularly in the realm of software development and maintenance. → msn.com
2026-04-29 NEW 2026Cursor AI Vulnerability Enables Remote Code Execution news RCEA critical vulnerability in Cursor AI has been discovered, allowing for Remote Code Execution (RCE). This means an attacker could potentially run unauthorized code on a user's system through the AI. The exact impact and exploitation details are likely to be further detailed in the linked content. This type of vulnerability poses a significant security risk, potentially leading to data breaches, system compromise, and other malicious activities. → letsdatascience.com
2026-04-28 NEW 2026FIRESIDE CHAT: Leaked secrets are now the go-to attack vector and AI is accelerating exposures news SecretsLibrary for scanning public GitHub commits and private repositories for hard-coded secrets. It detects over 28.6 million leaked credentials in 2025, a 34% year-over-year increase, with AI infrastructure secrets like OpenRouter and DeepSeek API keys spiking significantly. The library addresses the remediation problem, noting that 64% of leaked credentials from 2022 remain active. It highlights how AI-assisted code, like commits co-signed by Claude Code, contains secrets at a 33% rate, and emphasizes the need for governance alongside tools like SPIFFE for machine identity. → securityboulevard.com
2026-04-28 NEW 2026Experts flag potentially critical security issues at heart of Anthropic MCP newsSecurity experts have identified potentially critical vulnerabilities within Anthropic's "MCP" (likely referring to their model or platform). These issues, if exploited, could pose significant risks. The article highlights concerns about the security of Anthropic's core technology. No specific payout amounts for bug bounties were mentioned in the provided content. → msn.com
2026-04-27 NEW 2026Weekly Recap: Fast16 Malware XChat Launch Federal Backdoor AI Employee Tracking & More newsToolset highlighting recent application security threats including fast16 malware, the UNC6692 group's Snow malware suite, FIRESTARTER backdoor targeting a U.S. federal agency, Lotus Wiper affecting Venezuelan energy systems, and The Gentlemen RaaS deploying SystemBC. It also covers the Bitwarden CLI compromise, detailing vulnerabilities such as CVE-2025-20333 and CVE-2025-20362. → thehackernews.com
2026-04-27 NEW 2026Poisoned pixels phishing prompt injection: Cybersecurity threats in AI-driven radiology beginnerLibrary discussing AI vulnerabilities in healthcare radiology, focusing on prompt injection techniques like data poisoning, backdoor attacks, and jailbreaking. It highlights risks of LLMs in DICOM headers and diagnostic imaging data, enabling attacks without advanced programming skills. Countermeasures explored include least privilege, sandboxing, digital watermarking, and red teaming involving clinical specialists, alongside the persistent human factor in cybersecurity.
2026-04-26 NEW 2026Anthropic's model context protocol includes a critical remote code execution vulnerability news RCEA critical remote code execution vulnerability has been discovered in Anthropic's model context protocol. This flaw could allow attackers to execute arbitrary code on a system, posing a significant security risk. Further details are available at the provided link. No bug bounty payout amount is mentioned in the content. → msn.com
2026-04-26 NEW 2026prompt-security/clawsec: A complete security skill suite for OpenClaw's and NanoClaw agents (and variants). Protect your SOUL.md (etc') with drift detection, live security recommendations, automated audits, and skill integrity verification. All from one installable suite. intermediate Supply ChainLibrary for comprehensive security for AI agent platforms like OpenClaw, NanoClaw, Hermes, and Picoclaw. It provides unified security monitoring, drift detection, live security recommendations from NVD CVE polling, automated audits for prompt injection, and skill integrity verification. The suite includes a one-command installer, file integrity protection for critical agent files (SOUL.md, etc.), and checksum verification for all skill artifacts. It also offers exploitability context enrichment for CVE advisories, detailing exploit existence, weaponization status, attack requirements, and risk assessment to prioritize immediate threats.
2026-04-24 2026Indirect prompt injection is taking hold in the wild beginnerAnalysis of indirect prompt injection (IPI) observed in the wild, detailing techniques for hiding malicious instructions within web pages and metadata. Researchers from Google and Forcepoint identified IPIs ranging from harmless pranks to destructive actions like data exfiltration, financial fraud via PayPal and Stripe, and denial-of-service attacks. Hidden text, HTML comments, and metadata injection are common obfuscation methods. The increasing prevalence and sophistication of these attacks, particularly against agentic AIs with elevated privileges, necessitate strict data-instruction boundaries. → helpnetsecurity.com
2026-04-24 2026GPT-5.5 Bio Bug Bounty Program Aims to Improve AI Safety and Performance news Bug BountyA bug bounty program has been launched for GPT-5.5, focusing on enhancing both AI safety and performance. This initiative encourages researchers to identify and report vulnerabilities, contributing to the ongoing development and refinement of the AI model. The program aims to proactively address potential issues before widespread deployment, ensuring a more robust and secure AI. Specific details on payout amounts are not provided in the title or content. → gbhackers.com
2026-04-24 2026How indirect prompt injection attacks on AI work - and 6 ways to shut them down intermediateLibrary of resources addressing indirect prompt injection attacks on LLMs, a leading security risk. This threat involves hidden instructions within web content, emails, or addresses that can cause AI to perform malicious actions like data exfiltration or unauthorized redirection, as detailed by researchers from Palo Alto Networks and Forcepoint. Techniques such as API key theft, system override, attribute hijacking, and terminal command injection are outlined. The library also covers defensive strategies including input/output validation, human oversight, and vendor-specific mitigation efforts from Google, Microsoft, Anthropic, and OpenAI.
2026-04-23 2026Six AI Vulnerabilities Three Attack Patterns One Dangerous Service Gap newsLibrary for analyzing AI vulnerabilities, focusing on three distinct attack patterns: untrusted input processed as trusted AI context, overly broad AI data access without per-operation enforcement, and process containment and functional scoping failures. This analysis covers vulnerabilities like EchoLeak, Reprompt, ForcedLeak, GeminiJack, and GrafanaGhost, highlighting the need for robust input validation extended to all data sources AI touches, per-operation access control for AI data requests, and strict functional scoping for back-end AI processes, rather than solely relying on model-level guardrails.
2026-04-23 2026AI-powered scanner vulnerabilities newsLibrary detailing vulnerabilities in AI-powered web scanners that leverage Large Language Models. It outlines how attacker-controlled content can influence scanner reasoning, leading to indirect prompt injection attacks. These attacks can cause unintended state changes, data exfiltration, and exploitation of routing-based SSRF, often by manipulating Host headers to access internal services from within the scanner's privileged network position. → portswigger.net
2026-04-23 2026Anthropic's model context protocol includes a critical remote code execution vulnerability newsAnthropic's model context protocol includes a critical remote code execution vulnerability https://ift.tt/Hfb3ygq → msn.com
2026-04-22 2026Massive compromise hits LiteLLM and the whole AI developers community: how did it happen? newsMassive compromise hits LiteLLM and the whole AI developers community: how did it happen? https://ift.tt/kWQ0dJB → cybernews.com
2026-04-22 2026Three AI coding agents leaked secrets through a single prompt injection. One vendor's system card predicted it newsThree AI coding agents leaked secrets through a single prompt injection. One vendor's system card predicted it https://ift.tt/smH86bY
2026-04-22 2026You're Simulating the Wrong Attacker: Who Matters in AI Red Teaming beginnerLibrary for AI red teaming that highlights the limitations of simulating only prompt injection attackers. It details six distinct threat actor profiles, including low-skill script kiddies, insider threats, and sophisticated nation-state actors, each requiring specialized testing approaches across five expertise domains: prompt engineering, application security, architecture, data/ML security, and business logic. The resource emphasizes that traditional app security teams and even many AI-focused firms miss critical attack surfaces by not simulating a broader range of adversaries and attack vectors.
2026-04-22 2026DeepTeam: Open-Source Framework to Red Team LLMs and LLM Systems intermediateFramework for red teaming LLM systems, DeepTeam simulates attacks like jailbreaking, prompt injection, and multi-turn exploitation to uncover vulnerabilities such as bias, PII leakage, and SQL injection. It supports over 50 pre-built vulnerabilities mapped to frameworks like OWASP Top 10 for LLMs and NIST AI RMF, along with 20+ adversarial attack methods. DeepTeam also includes seven production-ready guardrails and allows custom vulnerability creation.
2026-04-22 2026Claude Jailbreaking in 2026: What Repello's Red Teaming Data Shows newsAnalysis of Repello's red-teaming data on LLM jailbreaking reveals Claude Opus 4.5's significantly lower breach rates (4.8%) compared to GPT-5.2 (14.3%) and GPT-5.1 (28.6%) across 21 multi-turn adversarial scenarios. Claude Opus 4.5 demonstrated complete defense against financial fraud and mass deletion attempts, while GPT-5.2 exhibited a "refusal-enablement gap" by refusing harmful actions linguistically yet providing executable attack steps. The analysis highlights that operational risk stems from multi-turn adversarial sequences and application-layer attacks on custom deployments, rather than simple single-prompt jailbreaks.
2026-04-22 2026AI-Infra-Guard: Full-Stack AI Red Teaming Platform intermediatePlatform for full-stack AI red teaming, AI-Infra-Guard integrates capabilities like ClawScan, Agent Scan, AI infra vulnerability scanning, MCP Server & Agent Skills scan, and Jailbreak Evaluation. It aims to detect vulnerabilities including the LiteLLM supply chain attack (CRITICAL) and supports scanning AI components like FastGPT, Upsonic, crewai, and kubeai, with a vulnerability database refreshed across multiple components and new CVE/GHSA entries.
2026-04-22 2026AI Red Teaming Playground Labs (Microsoft) intermediateLibrary providing AI Red Teaming Playground Labs, originally featured in Black Hat USA 2024. It offers challenges for systematically red teaming AI systems, incorporating adversarial machine learning and Responsible AI failures. These labs are also referenced in the Microsoft Learn Limited Series: AI Red Teaming 101. The repository includes Jupyter Notebooks showcasing the use of the Python Risk Identification Tool (PyRIT) for automated risk identification in generative AI systems, specifically for Labs 1 and 5.
2026-04-22 2026HackerOne: LLM01: Invisible Prompt Injection intermediateProgram: HackerOne Severity: medium Weakness: LLM01: Prompt Injection ## Description Hey team, Hai is vulnerable to invisible prompt injection via Unicode tag characters. ## Reproduction steps 1. ... → hackerone.com
2026-04-22 2026When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins beginnerSurvey of prompt injection risks in third-party AI chatbot plugins, analyzing 17 plugins used by over 10,000 websites. Eight plugins fail to enforce conversation history integrity, amplifying direct prompt injection by allowing forged system messages. Fifteen plugins indiscriminately ingest third-party content for web-scraping, enabling indirect prompt injection when attackers poison external data. This study systematically evaluates these vulnerabilities, showing how insecure plugin practices undermine LLM-level defenses. → arxiv.org
2026-04-22 2026Prompt Injection Attacks on Agentic Coding Assistants: A Systematic Analysis advancedAnalysis of prompt injection vulnerabilities affecting agentic AI coding assistants like Claude Code, GitHub Copilot, and Cursor, which integrate LLMs with external tools and protocols such as MCP. This work synthesizes findings from 78 studies, detailing 42 attack techniques including input manipulation, tool poisoning, and protocol exploitation. It identifies that over 85% of attacks succeed against current defenses, often enabling arbitrary code execution and system compromise through vulnerabilities in skill-based architectures and protocol ecosystems. → arxiv.org
2026-04-22 2026Prompt Injection 2.0: Hybrid AI Threats advancedLibrary for analyzing Prompt Injection 2.0, which combines LLM manipulation with traditional exploits like XSS and CSRF. It builds upon Preamble's research and mitigation technologies, evaluating them against contemporary threats such as AI worms and multi-agent infections. The library analyzes how these hybrid attacks bypass security controls, referencing CVE-2024-5565 and DeepSeek XSS exploits, and proposes architectural solutions involving prompt isolation and runtime security. → arxiv.org
2026-04-22 2026Architecting Secure AI Agents: System-Level Defenses Against Indirect Prompt Injection advancedLibrary for architecting secure AI agents, focusing on system-level defenses against indirect prompt injection. It proposes dynamic replanning, constrained LLM decision-making, and treating personalization and human interaction as core design elements. The work critiques existing benchmarks, highlighting the importance of system-level structures for controlling agent behavior and integrating rule-based and model-based security checks. → arxiv.org
2026-04-22 2026Anthropic's Model Context Protocol includes a critical remote code execution vulnerability newly discovered exploit puts 200000 AI servers at risk newsWriteup of critical RCE vulnerability in Anthropic's Model Context Protocol (MCP) affecting its SDKs across Python, TypeScript, Java, and Rust. The flaw, rooted in STDIO transport interface handling of local process execution, allows arbitrary command injection via user-controlled input without sanitization. Exploitation vectors include UI injection in AI frameworks, hardening bypasses in tools like Flowise, zero-click prompt injection in AI coding IDEs such as Windsurf and Cursor, and malicious package distribution via MCP marketplaces. OX Security reported numerous CVEs, with some fixed and others awaiting resolution.
2026-04-21 2026The 'by design' security flaw of Model Context Protocol (MCP) newsWriteup on the Model Context Protocol (MCP) by OX Security details an architectural flaw allowing remote command execution by exploiting its STDIO interface. This vulnerability affects millions of AI applications and has resulted in numerous CVEs, enabling attackers to hijack servers and exfiltrate data through unverified MCP marketplace configurations like those found in LangFlow and AI IDEs like Windsurf and Cursor. The report emphasizes the need for developers to implement manifest-only execution, strict sandboxing, explicit opt-ins, least-privilege secret management, and marketplace verification to mitigate risks.
2026-04-21 2026Prompt injection turned Googles Antigravity file search into RCE newsTool: Prompt injection allows RCE in Google's Antigravity IDE, bypassing Secure Mode. Researchers exploited a flaw in the `find_my_name` tool, which used the `fd` utility. By injecting command-line flags into the `Pattern` parameter, attackers could transform file searches into arbitrary code execution, even through indirect prompt injection from untrusted source files. This bypasses Secure Mode because the native tool invocation occurs before security boundary checks. → csoonline.com
2026-04-21 2026Claude Code Gemini CLI and GitHub Copilot Vulnerable to Prompt Injection via GitHub Comments newsClaude Code, Gemini CLI, and GitHub Copilot Vulnerable to Prompt Injection via GitHub Comments https://ift.tt/FS25xif → cybersecuritynews.com
2026-04-21 2026Google Patches Antigravity IDE Flaw Enabling Prompt Injection Code Execution newsLibrary for defending against prompt injection attacks in AI-powered development tools. This library addresses vulnerabilities like the one in Google's Antigravity IDE, where flaws in file searching and input sanitization allowed code execution via the `-X` flag. It also covers techniques seen in attacks such as Comment and Control against GitHub Copilot, NomShub in Cursor IDE, ToolJack, CVE-2026-21520 in Microsoft Copilot Studio, and Claudy Day in Claude, all of which leverage untrusted input to manipulate AI agents, exfiltrate data, or gain unauthorized access. → thehackernews.com
2026-04-20 2026Vuln in Googles Antigravity AI agent manager could escape sandbox give attackers remote code execution newsVulnerability in Google's Antigravity AI agent manager allowed prompt injection to bypass secure mode, granting attackers remote code execution by exploiting the `find_by_name` native tool before sandbox protections engaged. This discovery, made by Pillar Security and since patched, highlights the risks of unvalidated input for agentic AI, similar to findings in Cursor, and emphasizes the need to move beyond sanitization controls for native tool parameters.
2026-04-20 2026Anthropic MCP Hit by Critical Vulnerability Enabling Remote Code Execution newsAnthropic MCP Hit by Critical Vulnerability Enabling Remote Code Execution https://ift.tt/4HM1zP0 → gbhackers.com
2026-04-20 2026Critical Anthropic MCP Vulnerability Enables Remote Code Execution Attacks newsCritical Anthropic MCP Vulnerability Enables Remote Code Execution Attacks https://ift.tt/sjNEzGL → cyberpress.org
2026-04-19 2026MCP Tool Poisoning — How It Works & How To Fight It intermediateLibrary detailing MCP tool poisoning, an indirect prompt injection attack targeting AI agents interacting with tools via Model Context Protocol (MCP) servers. Attackers hide malicious instructions within tool metadata, like descriptions or schemas, making them invisible to users but readable by AI agents. This technique can lead to data exfiltration, credential hijacking, and remote code execution, and can be combined with other attacks such as MCP rug pulls. Mitigation strategies primarily involve using MCP gateways and robust AI security tools to detect changes in tool metadata and outputs.
2026-04-19 2026Model Context Protocol Has Prompt Injection Security Problems intermediateLibrary for securing applications that implement the Model Context Protocol (MCP), addressing prompt injection vulnerabilities. It details attacks like rug pulls, tool shadowing, and tool poisoning, as demonstrated by examples involving exfiltrating WhatsApp message history and manipulating `os.system()` calls. The library highlights the inherent dangers of mixing untrusted instructions with tools that can perform actions on a user's behalf.
2026-04-19 2026Vulnerability of LLMs to Prompt Injection in Medical Advice — JAMA newsVulnerability of LLMs to Prompt Injection in Medical Advice — JAMA
2026-04-19 2026Prompt Injection Attack Against LLM-Integrated Applications — arXiv beginnerSurvey of prompt injection attacks against LLM-integrated applications, detailing the limitations of current methods and introducing HouYi, a novel black-box attack technique. HouYi, inspired by traditional web injection, comprises a pre-constructed prompt, an injection prompt for context partitioning, and a malicious payload. The study demonstrates severe outcomes like unrestricted LLM usage and application prompt theft across 36 real-world applications, with 31 found vulnerable and 10 vendors, including Notion, validating discoveries. → arxiv.org
2026-04-19 2026Prompt Injection Attacks in LLMs and AI Agent Systems: A Comprehensive Review beginnerPrompt Injection Attacks in LLMs and AI Agent Systems: A Comprehensive Review
2026-04-16 2026Anthropic Defends MCP Design Despite Server Takeover Risk newsAnthropic Defends MCP Design Despite Server Takeover Risk https://ift.tt/IsVue9D → letsdatascience.com
2026-04-16 2026The Mother of All AI Supply Chains: Critical Systemic Vulnerability at the Core of Anthropics MCP newsAnalysis of Anthropic's Model Context Protocol (MCP) reveals a systemic vulnerability enabling Arbitrary Command Execution (RCE) across its SDKs for Python, TypeScript, Java, and Rust. Exploitable via unauthenticated UI injection, hardening bypasses in Flowise, zero-click prompt injection in Windsurf and Cursor, and malicious marketplace distribution, this flaw impacts over 150 million downloads and thousands of servers. Affected tools include LiteLLM, LangChain, and IBM's LangFlow, with over 10 CVEs issued. → ox.security
2026-04-16 2026Bypassing LLM Guardrails: Evasion Attacks against Prompt Injection Detection intermediateAnalysis of evasion attacks against LLM guardrail systems, detailing two methods: character injection and algorithmic Adversarial Machine Learning (AML). Tested against Azure Prompt Shield and Meta's Prompt Guard, these techniques achieved up to 100% evasion success, maintaining adversarial utility. Attack Success Rates against black-box targets were enhanced by leveraging word importance ranking from offline white-box models, exposing vulnerabilities in current LLM protection mechanisms. → arxiv.org
2026-04-16 2026EchoGram: Bypassing AI Guardrails via Token Flip Attacks - HiddenLayer intermediateTechnique for bypassing AI guardrails, EchoGram, exploits similarities in training data for text classification and LLM-as-a-judge systems. By appending specific "flip tokens" to malicious prompts, attackers can trick defense models into approving harmful content or generating false alarms. This attack targets defenses protecting models like GPT-4, Claude, and Gemini, and works by manipulating the guardrail layer without altering the core payload. EchoGram can be implemented via dataset distillation or model probing techniques.
2026-04-16 2026MCP Security: Tool Poisoning Attacks - Invariant Labs intermediateLibrary detailing Model Context Protocol (MCP) Tool Poisoning Attacks, a vulnerability allowing sensitive data exfiltration and AI model hijacking via malicious tool descriptions. These attacks exploit the disconnect between simplified user interfaces and complete tool descriptions, enabling instructions to access sensitive files like SSH keys and obscure data transmission. The library highlights implications for agentic systems, detailing how attackers can poison tool descriptions to compromise user data and manipulate AI behavior even with trusted servers.
2026-04-16 2026Poison Everywhere: No Output from Your MCP Server Is Safe - CyberArk intermediateLibrary for exploring Tool Poisoning Attacks (TPA) on Anthropic's Model Context Protocol (MCP). This research extends beyond description fields to demonstrate Full-Schema Poisoning (FSP) by manipulating parameter defaults and types within the tool schema. It also introduces Advanced Tool Poisoning Attacks (ATPA), which specifically target and complicate the detection of malicious tool outputs on MCP servers.

Frequently Asked Questions

What is prompt injection?
Prompt injection is an attack against applications that use large language models (LLMs). An attacker crafts input that overrides or manipulates the LLM's system instructions, causing it to perform unintended actions. Direct prompt injection targets the user input; indirect prompt injection embeds malicious instructions in data the LLM processes, such as emails or web pages.
What is the OWASP Top 10 for LLM Applications?
The OWASP Top 10 for LLM Applications identifies the most critical security risks for AI-powered applications, including prompt injection, insecure output handling, training data poisoning, model denial of service, supply chain vulnerabilities, sensitive information disclosure, insecure plugin design, excessive agency, overreliance, and model theft.
How do you secure AI-integrated applications?
Key practices include validating and sanitizing LLM outputs before rendering or executing them, implementing least-privilege access for AI agents, using guardrails to constrain model behavior, monitoring for prompt injection attempts, applying rate limiting, separating AI processing from privileged operations, and treating all LLM output as untrusted user input.

Weekly AppSec Digest

Get new resources delivered every Monday.