AI
AI security encompasses both protecting AI systems from attack and understanding the new vulnerability classes that AI introduces into applications. As organizations rapidly integrate large language models (LLMs), machine learning pipelines, and AI-powered features into their products, the attack surface has expanded in ways that traditional application security frameworks don't fully address.
Key threats to AI systems include prompt injection — where attackers manipulate LLM behavior through crafted inputs — data poisoning of training datasets, model extraction through repeated API queries, and adversarial examples that cause misclassification. Indirect prompt injection, where malicious instructions are embedded in data the AI processes (emails, documents, web pages), is emerging as one of the most significant security challenges for AI-integrated applications.
AI also introduces new categories of application risk: insecure output handling where LLM responses are rendered unsafely, excessive agency when AI agents are given too much access, sensitive information disclosure through training data leakage, and supply chain risks from fine-tuned models and third-party plugins. The OWASP Top 10 for LLM Applications provides a structured framework for understanding these risks.
On the defensive side, AI is being used to enhance security operations — automating vulnerability detection, analyzing malicious patterns, and accelerating incident response.
This page collects AI security research, LLM vulnerability techniques, defensive strategies, and resources covering the intersection of artificial intelligence and application security.
| Date Added | Link | Excerpt |
|---|---|---|
| 2026-06-11 NEW 2026 | Agentic Browser Security: 2025 Year-End Review news Mobile | Are agentic browsers the new Flash? A 2025 review of new attacks, vendor security layers, and a roadmap for navigating AI browser risks. → wiz.io |
| 2026-06-11 NEW 2026 | AI-Powered Forensics, at Cloud Speed news | Wiz is releasing a public preview of its AI-powered, context-aware forensics capabilities. This new approach aims to address the challenges of cloud-era investigations by providing faster and more efficient analysis. The technology leverages AI to enhance the understanding and review of forensic data within cloud environments, streamlining the investigation process. → wiz.io |
| 2026-06-11 NEW 2026 | AI Agents vs Humans: Who Wins at Web Hacking in 2026? news Bug Bounty | Wiz Research and Irregular, an AI security lab, collaborated to investigate whether AI agents or humans will be more effective at web hacking by 2026. The joint effort aims to definitively answer which will emerge victorious in the evolving landscape of cybersecurity. → wiz.io |
| 2026-06-11 NEW 2026 | Hacking Moltbook: The AI Social Network Any Human Can Control news API Sec Secrets | A security researcher discovered significant vulnerabilities in Moltbook, an AI social network. The breach exposed one database containing 35,000 emails and 1.5 million API keys, impacting 17,000 users. This suggests the AI network is not as autonomous as presented, with human oversight or involvement potentially being a vector for the exploit. The details highlight potential privacy and security risks for Moltbook's user base. → wiz.io |
| 2026-06-11 NEW 2026 | Building AI Security Together: New Ways to Partner with Wiz for AI Security in 2026 news | Wiz is expanding its AI security offerings for 2026. Key initiatives include launching a new Wiz Integration Network (WIN) Managed Cloud Provider (MCP), introducing a developer AI agent, establishing a dedicated WIN AI security category, and hosting a partner AI hackathon. These efforts aim to strengthen collaboration and enhance AI security through their integrated platform. → wiz.io |
| 2026-06-11 NEW 2026 | Introducing AI Cyber Model Arena: A Real-World Benchmark for AI Agents in Cybersecurity news API Sec | Wiz Research has launched the AI Cyber Model Arena, a new platform that provides real-world benchmarks for offensive AI security. The arena features 257 challenges, including zero-days, CVEs, and vulnerabilities across API, web, and cloud environments (AWS, Azure, GCP, K8s). This initiative aims to showcase the actual capabilities of AI models and agents in tackling cybersecurity threats. → wiz.io |
| 2026-06-11 NEW 2026 | Would You Click ‘Accept’? Automatically detecting malicious Azure OAuth applications using LLMs intermediate AuthN | Wiz Research has developed an automated method to detect emerging malicious Azure OAuth applications and consent phishing campaigns. Their approach leverages Large Language Models (LLMs) to identify suspicious patterns in these applications. This innovation helps organizations proactively defend against evolving threats targeting Azure environments. → wiz.io |
| 2026-06-11 NEW 2026 | What an 'Aha' Moment with an Org Admin Token Taught One DevSecCon Speaker About AI Security beginner Supply Chain | DevSecCon speaker Brett Smith shared insights on securing AI within development pipelines. His "aha" moment with an Org Admin token highlighted critical AI security considerations. The talk emphasized the importance of safeguarding AI deployments and developing robust security practices for AI in pipelines. Attendees can gain further knowledge by registering for DevSecCon 2025. → snyk.io |
| 2026-06-11 NEW 2026 | Secure Your AI Workflows: New Governance & Visibility Features from Snyk beginner | Snyk has introduced new governance and visibility features to secure AI-driven development. These tools empower AppSec teams to govern AI code security, effectively prioritize risks found in AI-generated code, and scale their security programs. The goal is to provide enhanced control and visibility over the entire AI development lifecycle. → snyk.io |
| 2026-06-11 NEW 2026 | Beyond the Hype: 5 Major Reasons to Attend DevSecCon 2025 news | DevSecCon 2025, on October 22nd, offers a roadmap for secure innovation in the age of AI. The conference focuses on key areas including managing AI code risks, empowering developers to integrate security, and enhancing overall application security strategies. It brings together development and security leaders to address the transformative impact of AI on the industry, providing actionable insights for attendees. → snyk.io |
| 2026-06-11 NEW 2026 | Snyk and Cognition partner to enhance security for AI-native development news | Snyk and Cognition have partnered to bolster security in AI-native development. This collaboration integrates Snyk's real-time security intelligence into Cognition's AI coding tools, Devin and Windsurf. Developers can now benefit from enhanced security measures directly within their workflow, enabling faster and safer code creation for AI applications. → snyk.io |
| 2026-06-11 NEW 2026 | Why We Built Evo — From My Heart news | Snyk introduces Evo, the first Agentic Security Orchestrator. Evo aims to revolutionize cybersecurity by making security seamless, invisible, intelligent, and unstoppable. The goal is to enable continuous innovation without security bottlenecks. → snyk.io |
| 2026-06-11 NEW 2026 | DevSecCon 2025 Recap: Securing the AI Revolution Together news | DevSecCon 2025 highlighted the transformative impact of AI on development. The conference focused on three key areas: accelerating DevSecOps with AI, integrating security early in the coding process by securing the initial prompt, and managing the complexities of AI-native applications. Evo by Snyk was presented as a solution for taming this AI-native app chaos. The overarching theme emphasized collaborative efforts in securing the AI revolution. → snyk.io |
| 2026-06-11 NEW 2026 | Snyk Studio: Now for All Customers, Powering Secure AI Development at Scale news | Snyk Studio is now available for all customers, providing a platform for secure AI development at scale. It features a VS Code extension for easy setup and supports enterprise-level rollout. The tool aims to streamline the process of building secure AI applications, empowering developers to integrate security practices directly into their workflows. → snyk.io |
| 2026-06-11 NEW 2026 | The Agentic OODA Loop: How AI and Humans Learn to Defend Together beginner | This content introduces the "Agentic OODA Loop," a collaborative defense strategy where AI and human security experts work together against rapidly evolving threats. It emphasizes a new approach to adaptive, intelligent, and symbiotic security in the era of Agentic AI. The goal is to achieve defense at machine speed through this human-AI partnership. → snyk.io |
| 2026-06-11 NEW 2026 | Secure by Design: The Future of Threat Modeling for AI-Native Applications intermediate | Snyk's Evo Threat Modeling Agent automates security for AI-native applications, focusing on critical vulnerabilities like prompt injection, data exfiltration, data poisoning, and agentic flaws. It aims to embed security directly into the development lifecycle, making it "secure by design." This approach is crucial for the future of AI development, ensuring robust protection against emerging threats. → snyk.io |
| 2026-06-11 NEW 2026 | Our AI Agent Now Has a Security Conscience: Introducing the JFrog Plugin for Claude Code intermediate | AI coding agents like Claude Code accelerate development but introduce risks due to a lack of governance. JFrog's new plugin for Claude Code addresses this by providing security awareness and control within the AI development workflow. This integration aims to balance the speed of AI-generated code with the necessity of secure development practices. → jfrog.com |
| 2026-06-11 NEW 2026 | The Governance Gap: What IDC’s 2026 Data Reveals About AI and the Software Supply Chain news Supply Chain | IDC's 2026 data highlights a governance gap as organizations rush to integrate AI while managing software supply chain security. Engineering and security leaders face challenges in balancing rapid AI delivery with essential security measures. JFrog's virtual panel explored strategies for accelerating delivery pipelines without compromising security in the face of evolving AI demands. → jfrog.com |
| 2026-06-10 NEW 2026 | AI Agents May Always Fall for Prompt Injections advanced 1 min read | Framework analyzing prompt injection vulnerabilities in AI agents through the lens of Contextual Integrity (CI). It demonstrates how current defenses fail against contextual manipulation and proposes an impossibility result: adversaries can always craft contexts that legitimize blocked flows, suggesting current research addresses a diminishing attack surface. The framework offers a principled approach for evaluating context-sensitive failures and designing CI-aware alignment for autonomous agents. → arxiv.org |
| 2026-06-10 NEW 2026 | Building an Agentic Cloud Security Ecosystem: A Reference Architecture with Wiz MCP and Infosys Cyber Next intermediate 7 min read | Reference architecture detailing an agentic cloud security ecosystem, leveraging Wiz MCP and Infosys Cyber Next. This model uses intelligent agents for detection, investigation, and remediation, powered by the Wiz Security Graph's contextual data. It highlights the Wiz Remote MCP Server as a key enabler for AI-driven workflows and illustrates an intelligent S3 remediation scenario involving discovery, investigation, and human-approved remediation agent actions. → wiz.io |
| 2026-06-10 NEW 2026 | Security Insights Where Work Happens: Notion Custom Agents + Wiz MCP intermediate 3 min read AuthZ | Integration that connects Wiz cloud security insights with Notion Custom Agents, enabling AI teammates to answer security questions, generate reports, and investigate risks directly within Notion workspaces. This allows teams to access cloud security context where they collaborate, using features like the Wiz Cloud Questioner to query their environment and the Wiz Vulnerability Summarizer to automate security reporting, bringing actionable insights into everyday workflows. → wiz.io |
| 2026-06-10 NEW 2026 | Seeing AI Clearly: Building Visibility Across Modern AI Applications beginner 6 min read | Library for building visibility across modern AI applications, offering an implementation-agnostic approach to discover and inventory AI systems. It combines code analysis, agentless cloud detection, AI workload explanation, model invocation logs, and runtime signals to provide a unified view of AI components, including models, agents, tools, guardrails, identities, and AI tool adoption. This comprehensive visibility is foundational for understanding AI construction, ownership, and enabling subsequent security measures like posture risk assessment and threat detection. → wiz.io |
| 2026-06-10 NEW 2026 | Understanding and Reducing AI Risk in Modern Applications beginner 8 min read | Library for identifying and mitigating risks in AI applications. It analyzes AI systems across infrastructure, models, data, and application layers, detecting vulnerabilities stemming from component interactions. The library helps pinpoint risks like prompt injection, insecure tool usage, embedded credentials, and misconfigured AI platforms, offering comprehensive visibility to prevent insecure AI systems from reaching production and ensure correct protections are in place. → wiz.io |
| 2026-06-10 NEW 2026 | AI Runtime Threat Detection: From Input to Real-World Impact intermediate 4 min read | Library for AI runtime threat detection that monitors behavior across the model, workload, and cloud layers, correlating activity from input to real-world impact. It moves beyond basic prompt filtering to detect when AI agents take risky or malicious actions, even with benign-looking prompts. By applying AI context, it transforms raw signals into actionable understanding, linking runtime events to their originating code or configuration for faster root cause analysis and remediation. The library's approach provides visibility into complex attack chains, such as those involving prompt injection leading to reverse shells and credential exfiltration. → wiz.io |
| 2026-06-10 NEW 2026 | Introducing Wiz Agents & Workflows: Security at the Speed of AI beginner 7 min read AuthZ | Library introducing Wiz Agents and Workflows, AI-powered security systems that reason, investigate, and take action across code, cloud, and runtime. The Red Agent functions as an AI attacker identifying logic-driven vulnerabilities, the Blue Agent acts as a threat investigator by gathering evidence, and the Green Agent drives remediation by pinpointing root causes and providing actionable fixes. Integrated into Workflows, these agents orchestrate automated responses and human-approved actions, streamlining security operations from discovery to resolution. → wiz.io |
| 2026-06-10 NEW 2026 | Introducing Wiz AI Application Protection Platform (AI-APP) beginner 6 min read | Platform that secures AI applications end-to-end, connecting infrastructure, data, access, models, agents, and applications from code to runtime. It builds a complete AI inventory, maps cross-layer risk correlated with frameworks like OWASP Top 10 for LLM Applications, and provides runtime threat detection across model activity, workload execution, and the cloud layer. Integrations with Cloudflare, TrojAI, and Pillar Security enrich findings with cloud context, enabling teams to prioritize exploitable risks and drive remediation through agents that identify risk, determine fixes, and investigate threats. → wiz.io |
| 2026-06-10 NEW 2026 | Introducing the Wiz Red Agent- AI-Powered Attacker intermediate 8 min read | Library for AI-powered attack surface management, the Wiz Red Agent, autonomously discovers and validates complex exploitable risks across cloud environments and proprietary APIs. It leverages deep cloud context, world-class attacker expertise, and adaptive, reasoning-based exploitation to uncover vulnerabilities missed by traditional scanning and manual research, including authorization flaws and business logic errors. The Red Agent integrates with the Wiz platform to correlate application-layer risks with cloud infrastructure, enabling better prioritization and remediation guidance. → wiz.io |
| 2026-06-10 NEW 2026 | AI Threat Readiness Pillar 2: Accelerate Patching and Response intermediate 7 min read | Library for accelerating patching and response in AI threat readiness, this resource details how to establish clear ownership, identify root causes across cloud configuration to source code, determine optimal fix paths with environment-specific context, and automate remediation workflows. It highlights Wiz's Green Agent for tracing vulnerabilities to their source and recommending the most efficient fix, alongside Wiz Workflows for orchestrating the entire remediation chain and shifting fixes left to prevent recurrence. → wiz.io |
| 2026-06-10 NEW 2026 | Snyk and Continue Partner to Embed AI-Powered Security into Every Step of the Developer Workflow news 3 min read | Library integrating Snyk and Continue automates security scans for code, dependencies, IaC, and containers using natural language commands within the developer workflow. This partnership enables faster vulnerability remediation through AI-generated, validated code fixes and proactive policy enforcement, allowing developers to address security without context switching. The integration supports Snyk's SAST, SCA, and IaC security tools directly in IDEs and CLIs, aiming to make "secure by default" a reality. → snyk.io |
| 2026-06-10 NEW 2026 | Beyond Automation: Securing Low-Code Agentic AI with MCP Guardrails beginner 3 min read | Library for securing low-code agentic AI, MCP Guardrails standardizes AI agent interaction with external tools via the Model Context Protocol (MCP). It incorporates a scanner layer for validating code, data, and commands, and an observability layer for comprehensive logging and traceability. This approach, supported by Toxic Flow Analysis (TFA), integrates static configuration data with dynamic runtime information to proactively detect vulnerabilities and mitigate risks like indirect prompt injection in autonomous AI systems. → snyk.io |
| 2026-06-10 NEW 2026 | Why Threat Modeling Is Now Even More Critical for AI-Native Applications beginner 4 min read | Reference of AI-native threat modeling practices, emphasizing the shift from manual, static workshops to continuous, adaptive processes. It details new attack surfaces like data poisoning and adversarial attacks, the unpredictable behavior of AI models, and the challenges of rapid deployment cycles, regulations like the EU AI Act, and complex ecosystems. The article advocates for automated asset discovery, dynamic risk modeling, and integrated remediation to maintain security posture at the speed of AI development. → snyk.io |
| 2026-06-10 NEW 2026 | How Snyk Studio for Qodo Is Closing the AI Security Gap news 3 min read | Library integrating Snyk's security intelligence with Qodo's Agentic Code Quality Platform. Snyk Studio for Qodo embeds security directly into the AI development workflow, leveraging Snyk's SAST and SCA engines. This allows developers to identify and fix vulnerabilities as they code within their IDE. The solution also addresses existing security debt through natural language prompts and automated remediation, aiming to resolve issues in minutes and accelerate secure AI-driven development at scale. → snyk.io |
| 2026-06-10 NEW 2026 | Scaling AI Security: How Evo Complements New Agentic Tools beginner 7 min read | Library for scaling AI security, Evo by Snyk, complements agentic tools like OpenAI's Aardvark by offering stable, reproducible findings and integrating security earlier in the development lifecycle. It provides multi-layer AI threat detection, mature dynamic testing (DAST) and software composition analysis (SCA) engines, and native governance features to support enterprise workflows and compliance without unpredictable token-based costs. → snyk.io |
| 2026-06-10 NEW 2026 | Snyk Log Sniffer: AI-Powered Audit Log Insights for Security Leaders beginner 4 min read | Tool for AI-powered analysis of Snyk audit logs, transforming raw data into actionable intelligence for security and engineering leaders. Log Sniffer leverages Google Gemini AI to provide executive summaries, answer security questions in natural language, and monitor audit events in real-time. It seamlessly integrates with the Snyk API, offering intelligent filtering and transforming complex security events into understandable insights, improving decision-making and risk mitigation. → snyk.io |
| 2026-06-10 NEW 2026 | When Speed Meets Security: Snyk Studio for Kiro news 4 min read | Library integration embedding Snyk Studio into Amazon Kiro’s agentic IDE, allowing developers to prevent new security risks at inception. This integration runs `snyk_code_scan` for generated code, attempts fixes with context from Snyk scans, and rescans to ensure resolution. It also addresses existing vulnerabilities through natural language prompts, identifying issues across code, dependencies, and IaC, then validating AI-generated fixes. → snyk.io |
| 2026-06-10 NEW 2026 | Run AutoMCP To Supercharge Your AI Agent with Libraries MCP Servers intermediate 3 min read | Tool for automating Model Context Protocol (MCP) server setup in AI-driven development environments. AutoMCP, an npm command-line tool, detects coding tools and project dependencies to configure MCP servers, enabling AI agents to autonomously run Snyk scans for early vulnerability detection. This integration, facilitated by Snyk Studio, embeds security directly into AI-assisted workflows, ensuring both human-written and AI-generated code is secure. → snyk.io |
| 2026-06-10 NEW 2026 | How Snyk Helps Federal Agencies Prepare for the Genesis Mission Era of AI-Driven Science beginner 3 min read Supply Chain | Library for securing AI-driven scientific missions, Snyk provides federal agencies with visibility into open source libraries, containers, and IaC templates within their software supply chains. It integrates security into CI/CD, model-training, and data pipelines, catching vulnerabilities and misconfigurations before deployment. The platform also addresses cloud and container security for AI compute systems, detecting misconfigurations and securing container images. By embedding security directly into developer workflows with automated fix recommendations and IDE plug-ins, Snyk operationalizes "secure by design" principles to accelerate discovery without compromising trust, aligning with federal expectations like Secure by Design, NIST 800-218, and EO 14028. → snyk.io |
| 2026-06-10 NEW 2026 | Old AI Security vs Evo: Watch Agentic Security Replace Weeks of Manual Work beginner 4 min read | Library for agentic AI security orchestration, Evo by Snyk, addresses emergent threats like prompt injection, data poisoning, and supply chain risks inherent in AI-native applications. It automates security workflows, including AI Bill of Materials (AI-BOM) generation, MCP Scan CLI for identifying risky components, and continuous AI red teaming to keep pace with evolving AI systems, contrasting with traditional, manual application security methods. → snyk.io |
| 2026-06-10 NEW 2026 | Evo Adds CycloneDX Support to Give Full AI Visibility news 4 min read Supply Chain | Library extending CycloneDX support to provide AI supply chain visibility. Evo's Discovery Agent now integrates with CycloneDX 1.6 AI ModelCards, enabling standardized AI-BOMs that detail model provenance, licensing, architecture (transformer, CNN), learning approach (supervised, self-supervised), and implementation paths. This addresses visibility gaps by offering a centralized inventory, tracking model origins from sources like HuggingFace, and providing granular insights into model type and task domain, making AI governance actionable. → snyk.io |
| 2026-06-10 NEW 2026 | Secure by Default: Why Snyk and Augment Code are the New Standard for AI Development news 2 min read | Partnership between Snyk and Augment Code that embeds Snyk's security intelligence into Augment Code's AI development platform. This integration provides real-time security scanning as developers write code, accelerated agent-led remediation for identified vulnerabilities, and governance at scale through custom Snyk rules applied to AI-generated code. The solution aims to make "Secure by Default" a reality for AI-driven development, reducing mean time to remediate and eliminating security as a manual bottleneck. → snyk.io |
| 2026-06-10 NEW 2026 | ServiceNow's Virtual Agent Vulnerability Shows Why AI Security Needs Traditional AppSec Foundations news 6 min read AuthN AuthZ | Library for securing agentic AI applications, emphasizing foundational application security alongside AI-specific controls. It highlights the ServiceNow Virtual Agent vulnerability, stemming from broken API authentication and excessive agent privileges, not novel AI issues. The library recommends a layered approach including agent-aware threat modeling to identify risks before deployment, DAST with LLM-enhanced authorization testing to detect classic vulnerabilities, and AI red teaming to reveal catastrophic impact paths enabled by autonomous agents. It stresses principles like least privilege and strong API identity verification for comprehensive AI security. → snyk.io |
| 2026-06-10 NEW 2026 | Live From Davos: The End of Human-Speed Security news 4 min read | Report detailing "The End of Human-Speed Security: Defense in the Age of AI Agents" highlights the rapid shift to AI operating as quasi-autonomous agents, with 50% of security leaders reporting this reality. It discusses the weaponization of AI, citing state-backed attacks on Anthropic, and the resulting "visibility crisis" where AI adoption often occurs outside monitored systems. The report calls for industry standards and a move beyond manual security processes to address challenges posed by autonomous attacks and achieve machine-speed defense. → snyk.io |
| 2026-06-10 NEW 2026 | Introducing the AI Security Fabric: Empowering Software Builders in the Era of AI news 8 min read | Library for securing applications in the age of AI, the Snyk AI Security Platform operationalizes a prescriptive path. It addresses AI-accelerated DevSecOps by fortifying traditional software supply chains, secures AI-driven development by embedding security into coding assistants like Snyk Studio, and defends AI-native applications with the agentic security orchestrator Evo by Snyk. This unified approach weaves security directly into every stage of modern software creation, adapting to dynamic systems and operating at machine speed to build trust and mitigate risks introduced by AI. → snyk.io |
| 2026-06-10 NEW 2026 | The Prescriptive Path to Operationalizing AI Security news 14 min read | Framework for operationalizing AI security, the Prescriptive Path provides an opinionated operating model with three phases: Stabilize, Optimize, and Scale. It focuses on building trust, reducing real risk, and sustaining governance by emphasizing outcomes over individual tools or checklists. The path guides organizations on how to apply security capabilities deliberately, from achieving foundational visibility and implementing guardrails for AI-generated code, to accelerating remediation and enabling autonomous defense in AI-native systems. → snyk.io |
| 2026-06-10 NEW 2026 | Snyk Finds Prompt Injection in 36%, 1467 Malicious Payloads in a ToxicSkills Study of Agent Skills Supply Chain Compromise news 11 min read Supply Chain | Library for identifying malicious AI Agent Skills; scanned 3,984 skills from ClawHub, finding 13.4% with critical flaws like malware and prompt injection. Detectors achieved 90-100% recall on confirmed malicious skills with 0% false positives on legitimate ones, utilizing the mcp-scan engine. Techniques observed include external malware distribution, obfuscated data exfiltration, and security disablement. → snyk.io |
| 2026-06-10 NEW 2026 | Zero-Click IP Leak in a Privacy Search Engine: Indirect Prompt Injection & Silent Patching intermediate | A security researcher discovered a zero-click IP leak vulnerability in Kagi Search, a privacy-focused search engine. The vulnerability exploited an indirect prompt injection technique using a Markdown trick to deanonymize users. This allowed the attacker to force a victim's browser to reveal their IP address, undermining Kagi's core privacy promise. Kagi Search has since quietly patched the vulnerability, indicating a "Not Applicable" status for the report, which the researcher interprets as a silent fix. No specific bug bounty payout amount was mentioned in the provided content. → infosecwriteups.com |
| 2026-06-10 NEW 2026 | Mythos Doesn't Deploy Itself news 5 min read | Toolset analysis highlighting how AI models like ChatGPT, Claude, and Gemini are impacting vulnerability research. It discusses how skilled researchers leverage LLMs with effective harnesses, referencing Niels Provos's use of IronCurtain to find zero-days, while less skilled practitioners produce inaccurate, polished reports, leading to issues like those seen with Bugcrowd and HackerOne's bug bounty programs. The core argument posits that human judgment and expertise in orchestration and validation remain critical, regardless of model capabilities, as demonstrated by findings in Cisco's Talos and Anthropic's red team efforts. → bishopfox.com |
| 2026-06-09 NEW 2026 | Indirect Prompt Injection Exposes a Universal AI Security Flaw No Deployment Model Is Immune intermediate 4 min read | Analysis of indirect prompt injection attacks, demonstrated against Mozilla Tabstack and Cotypist, reveals a universal LLM vulnerability that bypasses cloud-based and local deployments. This architectural flaw, stemming from LLMs' inability to distinguish instructions from data, creates systemic risk for enterprises adopting GenAI, irrespective of their chosen deployment model. The findings emphasize the need for architectural solutions over deployment choices to address security challenges in enterprise AI. |
| 2026-06-09 NEW 2026 | Cloud Threats Retrospective 2026: What AI Changed (and What It Didn’t) intermediate 2 min read | Analysis of 2025 cloud incidents reveals that well-known weaknesses, including vulnerabilities, exposed secrets, and misconfigurations, still drive 80% of initial access, despite the evolving cloud landscape. AI did not introduce new risk categories but expanded the attack surface by increasing opportunities for familiar risks near sensitive data. AI primarily accelerated existing attacker workflows like reconnaissance and post-access activities, and campaigns like hackerbot-claw and the compromised axios npm releases underscore the continued threat. → wiz.io |
| 2026-06-09 NEW 2026 | Claude Mythos: Preparing for a World Where AI Finds and Exploits Vulnerabilities Faster Than Ever advanced 10 min read RCE | Analysis of Anthropic's Claude Mythos, an unreleased frontier model capable of autonomously discovering zero-days and developing working exploits, highlights the accelerating trend of AI in vulnerability research. While current access is limited to responsible parties, the paper warns of an impending surge in AI-discovered CVEs and the subsequent rise of AI-assisted patch-diffing by attackers. It advocates for the integration of AI into AppSec programs and security tooling to proactively identify and remediate vulnerabilities before they can be weaponized. → wiz.io |
| 2026-06-09 NEW 2026 | Securing the AI Edge: Wiz and Cloudflare Integrate for End-to-End AI Protection intermediate 5 min read API Sec | Library integrating Wiz and Cloudflare for end-to-end AI application security, offering unified visibility into AI endpoints and DNS exposure. It maps AI workloads to infrastructure and identifies sensitive data risks, extending visibility to edge protections like Cloudflare. This allows detection of threats such as prompt injection and shadow AI, enabling teams to secure exposed AI services and validate continuous guardrail protection. → wiz.io |
| 2026-06-09 NEW 2026 | Securing AI Applications From Inception to Deployment beginner 5 min read RCE | Library extending Wiz AI-APP to secure AI applications by detecting risks at inception, validating exploitability at runtime with Red Agent, and orchestrating remediation with Green Agent. It connects code-level findings to cloud and runtime layers, mapping risks to their precise code root cause and generating tailored fixes. The library supports SAST, SCA, IaC scanning, and secrets scanning, integrating with IDEs and CLI, and aligning with OWASP Top 10 for LLM Applications 2025 and Agentic Applications 2026. → wiz.io |
| 2026-06-09 NEW 2026 | IaC Inventory: A Unified View Across Code, Deployments, and Cloud beginner 6 min read | Library for Infrastructure-as-Code (IaC) security, offering unified visibility across code, deployments, and cloud resources. It connects IaC modules to deployed resources, enabling instant risk scoping for AI workloads like Bedrock Agents and Guardrails. The tool supports Pulumi, Terraform, CloudFormation, and Bicep, catching misconfigurations pre-deployment and facilitating precise remediation by mapping issues back to the source code. → wiz.io |
| 2026-06-09 NEW 2026 | Closing the Security Gap in the Age of Agentic Coding intermediate 5 min read | Library for real-time scanning and fixing of AI-generated code within AI-native IDEs and copilots. Wiz Code plugins and skills, powered by the Wiz MCP server and WizCLI, integrate Wiz's security knowledge, enabling developers to catch and fix issues like hardcoded secrets, IaC misconfigurations, vulnerable dependencies, and malware at inception. The Green Agent facilitates rapid, context-aware remediation, allowing security teams to trigger fixes and developers to resolve issues directly in their IDEs, supporting secure development in the age of agentic coding. → wiz.io |
| 2026-06-09 NEW 2026 | Wiz Code Week Recap: Securing AI Native Development beginner 4 min read | Library for securing AI-native development, Wiz Code offers visibility into AI frameworks like Gemini Code Assist and GitHub Copilot via an AI-BOM. It integrates security guardrails directly into IDEs with Wiz Code plugins for Cursor and Claude Code, catching issues like hardcoded secrets and prompt injection before code commits. Remediation is streamlined with Wiz Skills, allowing coding agents to apply fixes, and CI/CD pipelines are secured by modeling them as assets, identifying dangerous configurations and surfacing findings from a CI-BOM. → wiz.io |
| 2026-06-09 NEW 2026 | Key Takeaways from the 2026 State of AI in the Cloud Report beginner 3 min read | Survey of AI security in cloud environments, detailing how AI has become foundational infrastructure. The 2026 State of AI in the Cloud report highlights 81% cloud environments use managed AI services and 90% run self-hosted AI, with 68% ingesting models via third-party software. AI-assisted development is default, leading to systemic weaknesses in 20% of organizations using AI platforms. Autonomous agents and MCP servers expand attack surface, enabling lateral movement. AI reduces exploitation costs by accelerating discovery and development, seen in malware dynamically generating commands and attackers abusing AI-enabled OAuth. This report covers AI-assisted zero-day discovery, exemplified by Anthropic's Claude Mythos. → wiz.io |
| 2026-06-09 NEW 2026 | The (In)security Landscape of AI-Powered GitHub Actions (Part 2/2) intermediate 8 min read Supply Chain | Analysis of AI-powered GitHub Actions reveals critical vulnerabilities including bypasses of non-default configurations allowing external attackers to trigger AI execution, and a novel secret exfiltration vector for dynamically-created credential files like those from `google-github-actions/auth@v2`. Widespread misconfigurations affect numerous repositories, exacerbated by prompt injection risks and the confusing deputy bypass technique, where attackers can leverage `dependabot` to impersonate trusted actors. → wiz.io |
| 2026-06-09 NEW 2026 | 280+ Leaky Skills: How OpenClaw & ClawHub Are Exposing API Keys and PII intermediate 5 min read Secrets | Library of scripts for detecting security flaws in AI agent skills, specifically addressing how popular tools like OpenClaw and ClawHub can inadvertently expose API keys and PII. Researchers found 283 vulnerable skills in the ClawHub marketplace, detailing flaws in specific examples such as `moltyverse-email`, `buy-anything`, `prompt-log`, and `prediction-markets-roarin`. These vulnerabilities stem from instructions that lead agents to mishandle secrets by passing them through LLM context windows or outputting them in plaintext logs. The library includes tools like `mcp-scan` and `Snyk AI-BOM` for auditing and remediation. → snyk.io |
| 2026-06-09 NEW 2026 | How a Malicious Google Skill on ClawHub Tricks Users Into Installing Malware intermediate 5 min read Supply Chain | Library for securing AI agents, focusing on the "google-qx4" malicious Google Skill on ClawHub that tricked users into installing malware via social engineering in the SKILL.md file. This technique bypasses traditional AppSec by leveraging agent-driven social engineering and legitimate-looking hosts like Rentry and GitHub, confirming "ToxicSkills" research predictions. It offers solutions like `mcp-scan` for skill analysis and Snyk AI-BOM for inventory, with Evo by Snyk providing AI-native security to monitor agent behavior and prevent malicious command execution. → snyk.io |
| 2026-06-09 NEW 2026 | Why Your “Skill Scanner” Is Just False Security (and Maybe Malware) intermediate 6 min read Supply Chain | Library for AI agent security, mcp-scan (part of Snyk's Evo platform), uses a specialized LLM to understand the intent and capabilities of SKILL.md files beyond simple keyword matching. Unlike traditional regex-based scanners that fail against natural language variations, prompt injection, and contextual risks, mcp-scan performs behavioral analysis to detect malicious actions such as data exfiltration or attempts to override safety instructions. This AI-native approach aims to provide more robust security than tools like SkillGuard, Skill Defender, and Agent Tinman which have shown limitations. → snyk.io |
| 2026-06-09 NEW 2026 | From Acceleration to Exposure: Why AI Demands Mature AppSec beginner 3 min read | Library: This article discusses how immature application security practices, when combined with AI-driven development, scale existing risks and amplify exposure. Autonomy in AI systems leads to rapid compounding of errors in code, dependencies, and configurations, outstripping traditional visibility and detection methods. Mature AppSec, focusing on enforceable policies and continuous assurance, enables organizations to safely leverage AI's acceleration without sacrificing oversight or trust, transforming potential liabilities into genuine accelerators. → snyk.io |
| 2026-06-09 NEW 2026 | The Future of AI Agent Security Is Guardrails intermediate 14 min read | Library for AI agent security, it advocates for "guardrails" as the future of protecting autonomous agents from unintended actions like credential exfiltration or unauthorized command execution. Instead of focusing on smarter models, this approach implements security checkpoints within the agent's execution pipeline. These checkpoints, including access hooks for least privilege, pre-execution hooks for sanitizing tool calls (preventing prompt injection and enforcing input validation), and post-execution hooks for filtering LLM output, act as a dynamic defense against vulnerabilities exposed by agentic AI, exemplified by issues seen with OpenClaw. → snyk.io |
| 2026-06-09 NEW 2026 | Weaving Security into the Flow: New Snyk Studio Capabilities Power the AI Security Fabric intermediate 5 min read | Library enhancing Snyk Studio provides capabilities for securing AI-driven development, integrating with tools like Gemini CLI and Claude Code. It offers streamlined setup, real-time security guardrails, and introduces Remediation Directives for automated pull requests to fix vulnerabilities. New governance and control features, including an Adoption report, allow enterprises to manage and scale AI development securely, creating an AI Security Fabric. → snyk.io |
| 2026-06-09 NEW 2026 | Securing the Agent Skill Ecosystem: How Snyk and Vercel Are Locking Down the New Software Supply Chain intermediate 8 min read Supply Chain | Library for securing agent skill ecosystems, this resource details Snyk's integration with Vercel's skills.sh marketplace to perform automated security analysis on AI agent skills. It employs a deep multi-layer approach using LLM-based judges and deterministic rules to detect vulnerabilities in both code and natural language instructions, identifying "toxic flows" and prompt injection. The system aims for high recall on malicious skills with zero false positives, providing a "Security Verified" badge on skill pages and enabling continuous monitoring of the evolving threat landscape. → snyk.io |
| 2026-06-09 NEW 2026 | How “Clinejection” Turned an AI Bot into a Supply Chain Attack intermediate 9 min read Supply Chain | Writeup detailing the "Clinejection" vulnerability chain, which leveraged indirect prompt injection against an AI triage bot and GitHub Actions cache poisoning to enable supply chain attacks. This exploit, discovered by Adnan Khan, allowed an attacker to gain access to production credentials and publish a malicious version of the Cline CLI to npm, installing the OpenClaw AI agent. The analysis highlights how combined vulnerabilities, including credential model weaknesses and dangling commits, can create significant risks in CI/CD pipelines, emphasizing the need for robust security collaborations. → snyk.io |
| 2026-06-09 NEW 2026 | Claude Code Security: A Welcome Evolution in the Remediation Loop beginner 5 min read | Library that unifies LLM-native capabilities with deterministic validation and operational automation to address the evolving application security landscape. It combines AI reasoning for discovery with robust enforcement mechanisms, addressing vulnerabilities introduced by AI-assisted development, including injection risks and business logic flaws. The library facilitates AI-accelerated DevSecOps, secures AI-driven development workflows through automated remediation directives, and extends protection to AI-native applications with visibility and policy enforcement, aiming to close the detection-to-remediation loop reliably. → snyk.io |
| 2026-06-09 NEW 2026 | Fetch the Flag CTF 2026: Official Challenge Write-Ups & Community Highlights beginner 1 min read Bug Bounty Talks | Writeups detailing solutions to the Fetch the Flag CTF 2025 challenges, spanning web, binary, and exploitation categories, are highlighted alongside community insights. These writeups offer practical approaches to real-world hacking scenarios encountered in the competition, encouraging skill development. The article also references a CTF 101 workshop for newcomers and lists various challenge names like VulnScanner, JH, Plantly, and Echo, showcasing the diversity of security problems presented. → snyk.io |
| 2026-06-09 NEW 2026 | Snyk and uv, Better Together intermediate 3 min read Python | Library for Python package management, uv, now integrates with Snyk to provide enhanced application security. This partnership enables users to export CycloneDX SBOMs directly from uv (version 0.9.11 and later) and then scan these SBOMs with Snyk for vulnerabilities and license compliance issues. Native uv support is also being integrated into the Snyk CLI, IDE integrations, and agentic workflows, aiming to make security a seamless part of the development process for AI-native Python applications. → snyk.io |
| 2026-06-09 NEW 2026 | The Rise of the AI Security Engineer: A New Discipline for an AI-Native World beginner 8 min read | Survey of emergent AI security roles, detailing the responsibilities and required mindset for an AI Security Engineer. This discipline addresses novel threats like prompt injection, memory exploitation, model poisoning, and agent hijacking, which challenge traditional security models due to AI's non-deterministic nature. It advocates for an adaptive, builder-defender approach operating at machine speed to secure AI-native systems and build trust. → snyk.io |
| 2026-06-09 NEW 2026 | Securing the Agent Skills Registry: How Snyk and Tessl Are Setting the Standard beginner 7 min read | Library for scanning agent skills in the Tessl Registry, integrating Snyk's security analysis to detect prompt injection, malware, and toxic flow patterns. This partnership provides real-time security scores on skill pages and search results, addressing the unique risks of agent skills by analyzing natural language instructions alongside code. The system automatically scans new skills and backfills existing ones, offering developers visibility into potential vulnerabilities before installation, inspired by Snyk's research into malicious skills and Snyk Learn lessons on agent goal hijack. → snyk.io |
| 2026-06-09 NEW 2026 | I Read Cursor's Security Agent Prompts, So You Don't Have To intermediate 15 min read | Library providing open-source prompts for autonomous AI security agents, capable of reviewing thousands of pull requests weekly and identifying hundreds of vulnerabilities. The prompts emphasize a clear role assignment, goal, methodology, and priority list, demonstrating that concise instructions can drive effective security reviews. This approach leverages LLMs' understanding of common vulnerabilities like SQL injection and unsafe deserialization, integrating them into production-grade agent orchestration platforms for enhanced security scanning. → snyk.io |
| 2026-06-09 NEW 2026 | AI Is Building Your Attack Surface. Are You Testing It? beginner 6 min read | Library for intelligent dynamic testing that addresses the unique security challenges posed by AI-generated code and AI agents. It focuses on confirming real exploitability, specifically targeting flaws like BOLA and IDOR in APIs accessed by agents, and correlates static analysis findings with dynamic testing results to prioritize high-confidence fixes. The library aids in discovering undocumented API endpoints and provides continuous coverage within the development pipeline, aiming to distinguish actual vulnerabilities from noise and enable developers to ship code with confidence. → snyk.io |
| 2026-06-09 NEW 2026 | The Next Era of AppSec: Why AI-Generated Code Needs Offensive Dynamic Testing intermediate 6 min read | Library for advanced dynamic security testing, integrating code-level intelligence with runtime interaction. This approach moves beyond traditional SAST and DAST by combining static code analysis, even agentic AI-driven analysis, with the ability to observe and exploit vulnerabilities in live, distributed systems. It enables grey-box testing, correlating runtime exploitability with precise code-level origins for faster remediation, and is crucial for identifying emergent threats in AI-generated code and complex microservice architectures. → snyk.io |
| 2026-06-09 NEW 2026 | Introducing Agent Security beginner 6 min read | Library for securing AI agents, Evo AI-SPM provides visibility, intelligence, and enforcement across the AI lifecycle. It discovers AI components in code and workflows, assesses associated risks, and enables policy enforcement to prevent unsafe configurations and behaviors. Features include Agent Scan for vetting agent dependencies, Snyk Studio for securing AI-generated code, Agent Guard for real-time behavior monitoring, Agent Red Teaming for attack simulation, and Snyk API & Web for dynamic testing against vulnerabilities like BOLA. → snyk.io |
| 2026-06-09 NEW 2026 | From Discovery to Defense: Why AI Red Teaming Is the Next Step After AI-SPM intermediate 6 min read | Library of techniques for AI Red Teaming and Dynamic Security Testing (DAST), emphasizing their convergence. This approach combines the exhaustive nature of traditional DAST with the contextual reasoning of AI-driven pentesting, enabling the discovery of complex business logic flaws and authorization issues that arise from inter-component interactions or emergent AI behaviors like prompt injection. By correlating runtime exploitability with source code context, this library facilitates more accurate vulnerability identification and streamlined remediation, moving beyond static analysis limitations. → snyk.io |
| 2026-06-09 NEW 2026 | Gartner urges multilayered defenses as AI deepfakes and LLM threats surge news | Gartner advises organizations to implement multilayered defenses against escalating threats from AI deepfakes and Large Language Models (LLMs). These technologies pose significant risks, including misinformation, fraud, and security breaches. The report emphasizes the need for a comprehensive strategy that combines technical controls with organizational policies and user education to effectively mitigate these emerging dangers. |
| 2026-06-09 NEW 2026 | Indirect Prompt Injection remains a fundamental security challenge for AI intermediate 7 min read | Library of techniques addressing indirect prompt injection, a fundamental LLM security challenge. This vulnerability allows attackers to embed instructions within untrusted external content, like webpages or local documents, which an LLM then executes. Case studies examine Mozilla Tabstack (cloud-hosted) where an agent was hijacked to exfiltrate conversation history, and Cotypist (on-device) where an LLM suggested inaccurate content and surfaced credentials. The core issue is the LLM's inability to reliably distinguish instructions from data within a shared context window, regardless of whether the model runs locally or in the cloud. |
| 2026-06-08 NEW 2026 | Brave AI Browsing Faces Prompt Injection Risk intermediate | Library for identifying indirect prompt injection vulnerabilities in AI features. Brave's AI browsing is susceptible to this flaw, where malicious commands are embedded within external content, causing AI models to execute unintended instructions. The vulnerability arises from the AI's inability to distinguish between developer-provided instructions and hidden commands within its context window, affecting both cloud-based and on-device models. |
| 2026-06-08 NEW 2026 | The Meta hack shows theres more to AI security than Mythos news 5 min read | Analysis of the Meta hack highlights AI agent vulnerabilities beyond sophisticated threats like Mythos. Attackers exploited Meta's AI customer support by requesting direct account email changes, leading to account takeovers like the dormant Obama White House account. Experts, including Neil Gong and Somesh Jha, warn that as AI automates workflows, simpler attacks on the agents themselves will increase, necessitating rigorous red-teaming and traditional guardrails despite the inherent security-utility trade-off. |
| 2026-06-08 NEW 2026 | A Framework for AI Threat Readiness beginner 13 min read | Framework for AI Threat Readiness, a structured approach to managing evolving security risks, emphasizes speed of action and breadth of visibility. It details strategies for eliminating critical risks, reducing exposure through AI-driven analysis, and accelerating patching and response times. The framework highlights the need for continuous discovery and mapping of internet-facing assets, exposure control, AI-driven risk validation, and established remediation processes, all crucial as AI models accelerate vulnerability discovery and exploitation. → wiz.io |
| 2026-06-08 NEW 2026 | Building AI Security with Our Customers: 5 Lessons from Evo’s Design Partner Program beginner 6 min read | Library for securing generative AI, Evo AI-SPM, addresses AI sprawl and shadow AI through its Discovery Agent, which uncovers models and agents. It features Custom Discovery to detect bespoke AI implementations invisible to standard tools, and Snyk Generated Policies offering out-of-the-box, continuously enforced policies for governance. The Risk Intelligence Agent provides actionable risk signals for AI models, agents, and MCP servers, while the Policy Agent enables CI/CD pipeline enforcement and operational security for AI components. → snyk.io |
| 2026-06-08 NEW 2026 | You Patched LiteLLM, But Do You Know Your AI Blast Radius? beginner 6 min read | Library for understanding AI system blast radius; it maps model gateways like LiteLLM, identifying routed providers and models, connected tools, APIs, and agent workflows to reveal unseen risks beyond traditional dependency analysis, enabling better incident response by showing what the compromised component actually accessed. → snyk.io |
| 2026-06-08 NEW 2026 | Secure What Matters: Scaling Effortless Container Security for the AI Era beginner 4 min read | Library enhancements from Snyk Container streamline inventory management with automated registry monitoring and customizable import/pruning rules. New beta features offer a unified platform experience, prioritize vulnerabilities based on runtime intelligence from third-party signals, and provide flexible support for multiple profiles in complex environments. These updates bolster security for the AI era by providing scalable visibility and automated remediation at the speed of agentic AI. → snyk.io |
| 2026-06-08 NEW 2026 | Governing Security in the Age of Infinite Signal – From Discovery to Control beginner 7 min read | Analysis of AI's impact on application security, particularly the capabilities of Anthropic's Claude Mythos for vulnerability discovery. The article emphasizes the shift from mere detection to essential control and governance, highlighting that AI's advanced reasoning abilities do not replace the need for deterministic enforcement, consistent policies, and auditable risk. It argues that enterprises must focus on controlling AI-generated code and the AI tools themselves within the software supply chain, integrating AI models, deterministic rulesets, and human expertise for a comprehensive security posture. → snyk.io |
| 2026-06-08 NEW 2026 | Introducing the New Agentic Architecture for Snyk Agent Fix: Faster, Smarter, and More Secure beginner 4 min read | Library for Snyk Agent Fix utilizing an agentic architecture, moving from static fine-tuning to dynamic few-shot prompting. This approach integrates Snyk's security intelligence, including a database of over 35,000 vulnerabilities and expert-written fixes, with frontier models like Anthropic's. Benchmarking focuses on security integrity (Pass@1/Pass@5), functional logic, and golden tests. The system supports agentic retries to adapt responses based on initial failures and offers full language coverage for all Snyk Code-supported languages, enabling faster, more secure code remediation. → snyk.io |
| 2026-06-08 NEW 2026 | Bridging the Gap to Autonomous Fixes: Snyk and Atlassian Unveil Intelligent Remediation for Jira beginner 2 min read | Integration between Snyk and Atlassian offers intelligent, autonomous remediation for Jira security tickets. This solution leverages Snyk Studio's agentic skills, such as "snyk-fix" and "secure-at-inception," to autonomously generate and validate fixes within an Agentic Development Environment (ADE). By ingesting vulnerability data from Jira and utilizing Atlassian's TWG CLI or other CLIs, developers can reduce Mean Time to Resolution (MTTR), eliminate context switching, and improve fix accuracy, transforming security from a manual chore into an automated process. → snyk.io |
| 2026-06-08 NEW 2026 | Securing The AI Revolution: How Snyk And Our Partners Are Scaling For The Future beginner 3 min read Supply Chain | Reference on Snyk's evolving go-to-market strategy, detailing its expansion beyond product-led growth to address the challenges of securing AI-generated code at scale. It highlights deep integrations with partners like Anthropic, Cursor, AWS, Atlassian, and OpenAI, and introduces a Partner Services Delivery Program and Partner Accelerator Fund designed to enable partners to build AI security practices and generate professional services revenue, emphasizing an ecosystem approach to application security in the AI era. → snyk.io |
| 2026-06-08 NEW 2026 | Snyk announces Anthropic updates: Evo integrates with Claude Enterprise, and Snyk Desk comes to Claude Desktop beginner 5 min read | Library integrating Evo by Snyk with Anthropic's Claude Enterprise, providing security and compliance teams with an inventory of Claude environment models, MCP servers, risk signals, and tool-level permissions. Additionally, the Snyk Security Desktop Extension is now available for Claude Desktop on macOS and Windows, embedding real-time scanning and vulnerability context directly into developer workflows to catch issues at inception and ensure least privilege on AI agent tools. → snyk.io |
| 2026-06-08 NEW 2026 | Continuous Offensive Security: The Line We've Been Walking beginner 12 min read Fuzzing | Library for continuous offensive security testing, bridging Dynamic Security Testing (DAST) with AI-driven capabilities. It addresses both heuristic-detectable vulnerabilities like SQL injection and cross-site scripting, and context-dependent flaws such as BOLA, IDOR, and chained exploits by mimicking human pentester reasoning. The library also includes Agent Red Teaming to proactively test the new attack surfaces created by LLM-integrated applications and AI agents, automatically adapting testing strategies based on discovered components. → snyk.io |
| 2026-06-08 NEW 2026 | How Relay Network Adopted AI Coding Securely and Built the Foundation for Agentic Development beginner 7 min read Supply Chain | Library integrating Snyk with GitHub Copilot enables secure AI-assisted coding by shifting security left. Custom pre-commit hooks scan code in real-time, catching vulnerabilities like insecure dependencies during development. This empowers developers to fix issues immediately, reducing the mean time to remediate (MTTR) and accelerating technical growth. → snyk.io |
| 2026-06-08 NEW 2026 | Fix SCA issues at scale in your terminal with Snyk Remediation Agent in the CLI intermediate 6 min read Secrets | Library for automating software composition analysis (SCA) remediation within the terminal. This tool empowers developers to address vulnerabilities at scale by integrating Snyk's security intelligence with large language models (LLMs). It analyzes findings, provides fix context including version upgrades and breakability analysis, and enables iterative, LLM-guided remediation loops with developer review, aiming to improve fix rates for SCA issues. → snyk.io |
| 2026-06-08 NEW 2026 | OpenAI Launches Lockdown Mode Against Prompt Injection Attacks beginner 3 min read | Library for defending against prompt injection attacks in large language models, featuring OpenAI's Lockdown Mode. This security feature adds validation layers to detect and restrict suspicious input patterns associated with attempts to override instructions, extract training data, or manipulate output. While not a complete solution, it significantly raises the bar for attackers, making it harder for malicious prompts to leak sensitive enterprise data, and pressures competitors like Google, Microsoft, and Anthropic to enhance their own AI security. |
| 2026-06-08 NEW 2026 | Securing CI/CD in an agentic world: Claude Code Github action case intermediate 10 min read | Library for securing CI/CD workflows, this entry details a vulnerability in Anthropic’s Claude Code GitHub Action. The Read tool within the action was not sandboxed like the Bash tool, allowing it to access `/proc/self/environ` and potentially exfiltrate sensitive secrets like `ANTHROPIC_API_KEY`. This vulnerability, discovered by Microsoft Threat Intelligence and addressed in version 2.1.128, highlights the risks of AI agents processing untrusted GitHub content in CI/CD environments, particularly when granted file-read capabilities or access to secrets. Prompt injection via HTML comments and disguised feature requests are illustrated as attack vectors. → microsoft.com |
| 2026-06-08 NEW 2026 | Evidence at the Moment of Attack. Answers at AI Speed. intermediate 5 min read | Library for automated cloud security investigations, Wiz Forensics captures forensic artifacts at the moment of detection. This addresses the challenge of ephemeral cloud workloads and fileless attacks by collecting data like script executions, process trees, and memory payloads before they disappear. AI analysis of these collected artifacts accelerates investigation for SOC and IR teams, transforming raw data into actionable insights and confident verdicts on threats like SQL injection, data exfiltration, and multi-stage attacks, as seen with the Soco404 campaign and JINX-0164. → wiz.io |
| 2026-06-08 NEW 2026 | AI Threat Readiness Pillar 1: Reduce Critical Exposures & Scan with AI beginner 6 min read | Library for AI-powered application security scanning. It focuses on reducing critical exposures by providing unified visibility across cloud, SaaS, and AI environments. The library employs techniques like Attack Surface Management (ASM) and an AI attacker emulation tool, "Red Agent," to identify and validate exploitable risks, including authorization flaws, business logic weaknesses, and complex API attack chains. It correlates external findings with internal environmental context to prioritize based on business impact and leverages an "AI remediator," "Green Agent," for context-aware guidance and workflow automation. → wiz.io |
| 2026-06-08 NEW 2026 | Protestware by open source maintainer to hinder agentic coding: The jqwik 1.10.0 Prompt Injection intermediate 6 min read | Library net.jqwik:jqwik-engine version 1.10.0, released by the maintainer, contained protestware utilizing prompt injection. This version, intended to deter AI coding agents, hid instructions to disregard previous commands and delete jqwik tests and code using ANSI terminal codes, making them invisible to humans but readable by automated systems. While at least one AI agent successfully identified and refused the injection, this incident highlights supply chain risks where tool output can be interpreted as commands, emphasizing the need to treat such output as untrusted input. → snyk.io |
| 2026-06-08 NEW 2026 | The New Security Risks of the Agentic Development Lifecycle beginner 7 min read | Library for securing the agentic development lifecycle, which involves AI agents planning, building, modifying, testing, and shipping software by interacting with tools, codebases, and environments. This shifts the security focus from artifact inspection to trusting the creation process, addressing risks introduced by agents' inputs (e.g., malicious skills, flawed MCP servers), actions (e.g., unsafe command execution, unauthorized access), and generated outputs (e.g., insecure code patterns). → snyk.io |
| 2026-06-08 NEW 2026 | Type Level Security: The future of secure AI code generation? beginner 6 min read | Library demonstrating type-level security to prevent common vulnerabilities like Insecure Direct Object Reference (IDOR) and DOM XSS. It showcases how Rust's strong type system and Python's type hints can enforce security invariants, ensuring that data like user IDs or strings are only used after proper authentication and sanitization. The approach aims to make entire classes of security bugs uncompilable or un-type-checkable, applicable to both human developers and AI code generation. → snyk.io |
| 2026-06-08 NEW 2026 | So You Have an AI Security Budget. Now what? beginner 9 min read | Library for AI security budgeting that shifts focus from fragmented tool spending to unified investment in visibility, governance, and control across the AI lifecycle. It emphasizes securing agentic development and agentic applications by funding AI discovery, risk assessment, policy enforcement, adversarial testing, runtime protection, and governance evidence, addressing vulnerabilities like CVE-2025-6514 and issues seen in incidents like Replit's data deletion. → snyk.io |
| 2026-06-08 NEW 2026 | Hacking Auto-GPT and escaping its docker container intermediate 17 min read RCE | Library detailing indirect prompt injection and Docker escape vulnerabilities in Auto-GPT. The library explains how to trick Auto-GPT into executing arbitrary code via the `browse_website` command by crafting malicious websites. It also covers obtaining user approval for injected commands by manipulating console messages and future planned actions, and details trivial Docker escapes and path traversal exploits for non-Docker versions, impacting versions prior to v0.4.3. |
| 2026-06-08 NEW 2026 | They said AI would kill Bug Bounty. The data says otherwise beginner Bug Bounty | The article challenges the notion that AI will eliminate bug bounty programs. Contrary to popular belief, data suggests that AI is actually enhancing, rather than replacing, the bug bounty landscape. It is becoming a tool that security researchers can leverage to identify vulnerabilities more effectively. The core message is that the bug bounty industry is evolving with AI, not facing extinction. → yeswehack.com |
| 2026-06-08 NEW 2026 | How LLMs are changing Bug Bounty: An interview with Aituglo beginner Bug Bounty | In an interview with Aituglo, it's revealed how Large Language Models (LLMs) are transforming bug bounty programs. LLMs are enhancing efficiency in bug hunting by assisting with tasks like code analysis, vulnerability detection, and report generation. This allows researchers to find more bugs faster and with greater accuracy. Aituglo emphasizes that LLMs are becoming powerful tools for bug bounty hunters, streamlining workflows and increasing the overall effectiveness of security research. → yeswehack.com |
| 2026-06-08 NEW 2026 | [tl;dr sec] #327 - Finding Zero-days with Any Model, Practical Package Security, Measuring the AI Offense-Defense Gap intermediate 15 min read Supply Chain | Library for C/C++ security challenges from Trail of Bits, featuring walkthroughs of Linux ping command injection and Windows driver kernel execution, alongside `c-review` for LLM-based code analysis. It also includes the `deepsec` scanner from Vercel, utilizing Claude and GPT coding agents to identify vulnerabilities by tracing data flows, and Jonathan Dunn's research on Client Side Path Traversal in major frontend frameworks like React Router and Next.js. → tldrsec.com |
| 2026-06-08 NEW 2026 | [tl;dr sec] #328 - Shai-Hulud's Source Code Leaked, Break Into Buildings for $, Reversing EDRs with AI intermediate 12 min read | Library from Microsoft mitigates Server-Side Request Forgery (SSRF) in cloud-hosted .NET and NodeJS applications with secure-by-default code, including protection against HTTP redirects and DNS rebinding, complemented by the Dusseldorf testing tool. → tldrsec.com |
| 2026-06-08 NEW 2026 | [tl;dr sec] #329 - AI-powered Honeypots, GitHub Action Canaries, Microsoft’s Agentic Security Scanner beginner 12 min read | Library for detecting and deceiving attackers with AI honeypots, identifying supply chain attacks using GitHub Action canaries, and exploring Microsoft's "Autonomous Code Security" team. It also covers the impact of AI on bug bounties, a framework for rolling out security policies, and pre-auth RCEs against GPON OLT hardware and its Cloud EMS fleet manager, potentially exposing entire ISP networks. Additionally, it discusses detecting CI/CD supply chain attacks with canary credentials and unmasking the Docker ONBUILD supply chain attack vector. → tldrsec.com |
| 2026-06-08 NEW 2026 | [tl;dr sec] #330 - AWS Pathfinding Labs, Running Codex Safely at OpenAI, Glasswing Updates beginner 11 min read API Sec | Library for securing AI coding agents, Prempti, intercepts tool calls and provides allow/deny verdicts based on Falco rules, integrating with LLMs for adaptive learning. OpenAI shares how they safely deploy Codex internally using sandboxed environments, approval workflows, and an auto-review subagent, with exported logs feeding an AI-powered security triage agent. Renovate PRs are automated for dependency updates using Claude Code Routines and a structured upgrade risk matrix, incorporating a minimum release age filter to prevent supply-chain attacks. AWS Security Agent generates verification scripts for pentest findings, and Pathfinding Labs offers over 100 intentionally vulnerable AWS environments for practicing cloud attack paths and validating detections. → tldrsec.com |
| 2026-06-08 NEW 2026 | [tl;dr sec] #331 - How Adversaries Use AI, Skill Issues, Using IDEs for C2 intermediate 13 min read | Library for securing applications, this entry details adversarial techniques leveraging AI, skill issues in LLM development, and the use of IDEs for command and control. It highlights specific attack chains like the Zapier compromise, the efficiency of AI agents in data exfiltration from AWS, and methods for bypassing Claude Code's security measures. The resource also compares AI application security testing platforms and discusses proactive defense strategies against emerging threats. → tldrsec.com |
| 2026-06-08 NEW 2026 | Juice Shop v20.0.0 — a fresh squeeze of features, now with AI beginner 4 min read | Library: OWASP Juice Shop v20.0.0 adds AI-themed challenges like Chatbot Prompt Injection and Greedy Chatbot Manipulation, requiring LLM integration via Ollama or OpenAI-compatible servers. This release features a redesigned storefront, faster startup times (~30%), a smaller Docker image, and improved cheat detection. It also includes new products, enhanced UI, upgraded frontend frameworks (Angular 21.x), and updated test infrastructure with Node.js test runner and Vitest. → owasp.org |
| 2026-06-08 NEW 2026 | Move over, Mythos. Here comes... pretty much any other model with a good harness intermediate 6 min read | Library for building application security scanning harnesses that orchestrate multiple AI models. It argues that the effectiveness of vulnerability discovery hinges more on the harness design than on specific frontier models like Mythos or GPT-5.5. Sophisticated harnesses, incorporating stages for reconnaissance, parallel agent hunting, validation, and tracing, enable scalable and cost-effective security testing by allowing flexible model swapping and leveraging cheaper models for wider candidate generation, while more powerful models can be reserved for deep analysis. → aikido.dev |
| 2026-06-08 NEW 2026 | What is AI SAST? beginner 7 min read | Library for AI SAST, which uses AI reasoning to analyze source code for security vulnerabilities like IDORs, broken access control, and business logic flaws. Unlike traditional pattern-matching SAST, AI SAST understands code intent, traces data flow across services, and identifies complex multi-step exploit chains. This AI-native approach offers pentest-grade reasoning for static analysis, distinguishing it from AI-augmented SAST which primarily focuses on triage and false positive reduction. → aikido.dev |
| 2026-06-08 NEW 2026 | Designing Identity for the Agentic Enterprise: The Okta AI Identity Summit news 7 min read AuthZ | Reference on agentic enterprise identity, summarizing insights from the Okta AI Identity Summit. It highlights how AI agents are rapidly outpacing existing identity systems, necessitating a shift from mere access control to governing specific actions. Key takeaways include the need for agent discovery, understanding connections, real-time governance via access certifications and kill switches, and the integration of identity as a core control plane for AI. The summit emphasized that successful AI transformation requires rewiring work processes and trust, not just deploying new tools. → blog.gitguardian.com |
| 2026-06-08 NEW 2026 | Trusted AI Adoption (Part 2): Detection intermediate 4 min read | Library for continuous detection of unmanaged AI assets in agentic supply chains. It addresses the velocity problem of coding agents by implementing deep scanning across binaries, containers, source code, build manifests, and agent configurations. The library classifies discovered assets into Managed, Partially Managed, Unmanaged (Shadow AI), and Malicious categories, enabling automated responses and shifting security from hopeful to enforcement. → jfrog.com |
| 2026-06-08 NEW 2026 | NVIDIA NIM Models Are Now Governed Assets in Your Supply Chain beginner 5 min read Supply Chain | Library for governing NVIDIA NIM models within the software supply chain, integrating them into JFrog Artifactory and JFrog Curation for unified discovery, explicit allow/block policies, and audit trails. This ensures NIM models, like Docker images or npm packages, pass through established security controls, preventing bypass of risk tolerance, licensing, and approval workflows by developers and coding agents. → jfrog.com |
| 2026-06-08 NEW 2026 | Sparkplug B Protocol Fuzzing with AI Assistance intermediate 9 min read Fuzzing | Tool for fuzzing Sparkplug B, an MQTT-based industrial control and SCADA protocol. This fuzzer covers all nine message types, nineteen data types, and numerous field paths, addressing coverage gaps and defects identified in earlier prototypes. AI assistance helped harden the tool, incorporating a CLI, logging, and passive network discovery, providing ICS/SCADA operators and vendors a means to test Sparkplug B endpoints for crashes, protocol violations, and state-handling bugs. → bishopfox.com |
| 2026-06-08 NEW 2026 | AI agents building security tests – architecture and prompts intermediate 9 min read | Library for automating the creation of security tests, Alfred, leverages AI agents to process vulnerabilities from over 200 sources. It prioritizes threats using EPSS scores, categorizes content, and extracts detailed technical notes for reproduction. The system then triages vulnerabilities based on exploitability, protocol, authentication requirements, and other factors, aiming to automate vulnerability weaponization and reduce manual researcher effort. → labs.detectify.com |
| 2026-06-08 NEW 2026 | How to scan for vulnerabilities with GitHub Security Lab’s open source AI-powered framework beginner 22 min read AuthZ Secrets | Framework automating security vulnerability detection using AI-powered taskflows. It breaks down code repositories into components, gathers contextual information through threat modeling, and then uses LLMs to suggest and audit potential vulnerabilities, focusing on high-impact issues like authorization bypasses and information disclosure. The framework is open-source and requires a GitHub Copilot license for execution. → github.blog |
| 2026-06-08 NEW 2026 | Hack the AI agent: Build agentic AI security skills with the GitHub Secure Code Game beginner 6 min read Bug Bounty | Library for learning agentic AI security skills. Season 4 of the GitHub Secure Code Game, featuring the deliberately vulnerable AI assistant ProdBot, allows players to exploit and fix security flaws in autonomous AI systems. Players interact via natural language in the CLI across five progressive levels, encountering vulnerabilities inspired by real-world risks like agent goal hijacking, tool misuse, and memory poisoning, similar to CVE-2026-25253. The game runs in GitHub Codespaces and requires no prior AI or coding experience. → github.blog |
| 2026-06-08 NEW 2026 | The sorry state of skill distribution news 9 min read AuthZ | Library analyzing public skill marketplaces reveals prevalent malicious skills designed to steal credentials and exfiltrate data. Tested scanners from ClawHub, Cisco, and skills.sh were bypassed using techniques like file truncation and embedding malicious `.pyc` bytecode within seemingly harmless scripts. The article highlights weaknesses in static analysis and LLM-based scanning, demonstrating how attackers can exploit packaging and binary obfuscation, mirroring supply chain attacks like the xz-utils backdoor. → blog.trailofbits.com |
| 2026-06-03 2026 | Guardrails for AI Agents: Safety and Security beginner 7 min read | Library providing a layered governance and security system for AI agents, acting as a runtime control to prevent issues like hallucinations, prompt injection, unsafe actions, and data leakage by validating inputs, model outputs, and tool calls. It enforces structured policies and safeguards through pre-LLM input checks, post-LLM output and action validation, and system-level controls such as least privilege and tool sandboxing. This approach treats guardrails as production infrastructure, incorporating context-grounded validation, self-correction loops, multi-agent validation, and hard constraints to ensure security, compliance with regulations like GDPR and HIPAA, and prevent operational incidents. → blockchain-council.org |
| 2026-06-03 2026 | https://github.com/Armur-Ai/Pentest-Swarm-AI beginner 6 min read Recon XSS | Tooling for AI-driven pentesting, Pentest Swarm AI utilizes swarm intelligence primitives—stigmergy, emergence, and decentralization—to coordinate multiple independent agents on a shared blackboard. Unlike sequential pipelines, this approach allows attack chains to emerge organically as agents communicate and influence each other through findings and "pheromones." It integrates tools like nmap, sqlmap, Burp, and Metasploit, supporting various LLMs and aiming for emergent, emergent, and decentralized offensive security testing. |
| 2026-06-02 2026 | Snowflake Bolsters AI Security news 1 min read | Library integrating native, proactive, enterprise-grade security for AI workloads, focusing on agent security, data security, and platform-level security. Features include Agent Identity for distinct AI agent actions, enabling auditability and access restrictions to sensitive data, complementing Snowflake Horizon Catalog for AI governance. |
| 2026-06-02 2026 | What Is LLM (Large Language Model) Security? beginner 9 min read | Guide to LLM security covering fundamental concepts, prominent risks like prompt injection and data leakage, and real-world attack examples such as Microsoft's Tay and PoisonGPT. It emphasizes that LLM security differs from traditional app security due to the probabilistic nature of models, and it details practical implementation strategies across the LLM lifecycle to mitigate vulnerabilities. → paloaltonetworks.com |
| 2026-06-02 2026 | You cant patch your way out of prompt injection: AI agents need a different defense intermediate 5 min read | Library for defending against prompt injection in AI agents, emphasizing structural defenses over filters. It addresses vulnerabilities like EchoLeak (CVE-2025-32711) and ShareLeak (CVE-2026-21520) by mitigating the "lethal trifecta" of private data access, untrusted content exposure, and outbound communication. The library promotes treating source text as data, scoping agent capabilities, and implementing strict data-flow and control-flow rules, inspired by research like Google DeepMind's CaMeL. → hackread.com |
| 2026-06-01 2026 | ChatGPhish Reveals ChatGPT Browser Prompt Injection Risk intermediate 3 min read | Library that demonstrates browser-based prompt injection against ChatGPT, named ChatGPhish, allows attackers to manipulate page summaries and deliver phishing or social engineering attacks. This technique bypasses traditional security controls by injecting malicious instructions into ordinary web pages, influencing the LLM's output within the trusted ChatGPT interface. The research highlights risks associated with rendering untrusted Markdown content, including a QR code delivery method that circumvents desktop browser protections. → thecyberexpress.com |
| 2026-05-29 2026 | Fed up with vibe coders dev sneaks data-nuking prompt injection into their code beginner 2 min read | Library update details a prompt injection vulnerability within the jqwik Java testing application for JUnit 5. The malicious instruction, disguised with ANSI escapes, directs AI coding agents to delete tests and code, posing a destructive risk to developers using vulnerable agents without warning or opt-out. Anthropic's Claude AI reportedly flagged this prompt injection. → arstechnica.com |
| 2026-05-28 2026 | This article outlines some of the potential security risks through the lens of real-world AI and LLM applications assessed by Krolls Offensive Security team. Read more. beginner 10 min read | Analysis of real-world AI applications, including a healthcare app using Model Context Protocol (MCP), an online pharmacy with Retrieval Augmented Generation (RAG), a retail business's automated refund processing, and a customer support line's voice authentication, reveals significant security risks. Kroll's Offensive Security team discovered vulnerabilities such as prompt injection, data exfiltration potential through indirect prompts, manipulation of RAG filters leading to inaccurate information, and bypassing invoice validation agents for fraudulent refunds, highlighting the need for rigorous AI security testing. |
| 2026-05-28 2026 | Indirect Prompt Injection Is Now a Real-World AI Security Threat intermediate 5 min read | Library for data-layer governance of AI agents, enabling cryptographic authentication, real-time attribute-based access policy evaluation, and tamper-evident audit trails to prevent data exfiltration and credential theft. This approach provides independent enforcement, ensuring security even when models are compromised or prompts are manipulated, addressing vulnerabilities like those seen in GrafanaGhost, ForcedLeak, GeminiJack, and DockerDash, and satisfying regulatory compliance demands. → techrepublic.com |
| 2026-05-28 2026 | Prompt Injection in 2026 for Web3 Security beginner 8 min read | Library for mitigating prompt injection in Web3 AI agents, addressing risks like wallet manipulation, DAO governance capture, and secret leakage. It emphasizes hardening the architecture around LLMs, including data flows, retrieval pipelines, and tool permissions, as model-layer defenses alone are insufficient. The library highlights common override phrases like "disregard previous instructions" as high-risk indicators and acknowledges sophisticated evasion techniques beyond simple keyword matching, particularly for indirect prompt injection via untrusted content. → blockchain-council.org |
| 2026-05-27 2026 | Beyond the hype: A CIO's guide to LLM risk management beginner 5 min read | Reference on LLM risk management for CIOs, this guide classifies LLM use cases, inventories embedded AI, and governs data, permissions, outputs, drift, and vendor obligations. It details questions for internal teams and vendors regarding data privacy, system access, prompt injection, and bias testing. The framework emphasizes ownership, acceptable usage policies, risk classification, enterprise inventories, security controls for prompt injection and access scoping, data governance for agentic AI, monitoring, third-party risk, and audit preparedness against NIST AI Risk Management Framework, ISO 42001, and the EU AI Act. |
| 2026-05-25 2026 | Three Prompt Injection Patterns Your AI Security Detection Stack Misses intermediate 4 min read | Library for detecting prompt injection attacks on LLMs, addressing gaps in traditional WAF and EDR coverage. It details three injection patterns: indirect injection via document content (AML.T0054.002), second-order injection through AI agent tool calls, and conversation-history poisoning (AML.T0054.001). Detection logic focuses on analyzing RAG pipeline logs, agent orchestration logs (tool-call/response pairs), and conversation-session metadata to identify instruction-formatted text and behavioral shifts, aligning with MITRE ATLAS subtechniques and OWASP's LLM Top 10. |
| 2026-05-23 2026 | AI Security Solutions In 2026: Tools To Secure AI beginner 11 min read | Platform for AI security posture management (AI-SPM) that provides centralized visibility and risk assessment across the AI lifecycle, from development to runtime. It maps your AI estate using a security graph to detect and prioritize risks like model exposure and prompt injection, addressing threats such as shadow AI, data poisoning, and over-permissioned agents. The platform secures infrastructure, governs training data, restricts agent permissions, and monitors live model behavior for anomalies, with Wiz AI-SPM being a leading solution for comprehensive AI security. → wiz.io |
| 2026-05-22 2026 | Compare Top 20 LLM Security Tools & Free Frameworks in 2026 beginner 16 min read | Library offering tools and frameworks for securing Large Language Models (LLMs). It categorizes solutions into open-source frameworks, AI security tools, and GenAI security tools. Specific tools mentioned include Credo AI's GenAI Guardrails, Fairly AI for monitoring and dynamic reporting, Fiddler for observability and prompt injection assessments, Holistic AI for vulnerability and malicious prompt detection, Synack for crowdsourced testing against prompt injection and insecure output handling, WhyLabs LLM Security for data leakage protection and OWASP top 10 for LLMs, CalypsoAI Moderator for data loss prevention and malicious code detection, and Adversa AI for resilience and stress testing against attack simulations. |
| 2026-05-21 2026 | LLM Security News: Risks Incidents Defenses news 8 min read | Library of LLM security incidents and defenses details how rapid adoption of large language models has created new attack surfaces, expanding the enterprise threat landscape beyond traditional controls. It highlights risks like prompt injection, tool abuse, insecure output handling, and LLM supply chain threats, exemplified by the LiteLLM compromise and early 2025 data breaches. The OWASP LLM Top 10, including sensitive information disclosure and excessive agency, are discussed as persistent vulnerabilities, with conventional tools insufficient for addressing these LLM-specific failure modes. → blockchain-council.org |
| 2026-05-21 2026 | AI QA vs AI Security Testing: Why LLM Apps Need Both Before They Scale beginner 9 min read | Library for AI applications that requires both AI QA and AI security testing to move beyond traditional assumptions. It highlights that while AI QA focuses on usefulness, accuracy, and consistency, AI security testing addresses manipulation risks like prompt injection, data leakage, and unauthorized tool use, referencing the OWASP Top 10 for LLMs and the NIST AI Risk Management Framework. |
| 2026-05-21 2026 | Generative AI Data Privacy and Security in LLMs beginner 8 min read | Library for securing Generative AI and LLM workflows, addressing data privacy risks including training data leakage, prompt injection, and output harms. It details where sensitive data appears across training data, prompts, outputs, and telemetry, and outlines practical controls like data discovery, classification, minimization, anonymization, and differential privacy. The resource highlights regulatory pressures like GDPR and the AI Act, and common risk patterns identified by MIT and Stanford HAI, emphasizing OWASP's identified critical LLM risks. → blockchain-council.org |
| 2026-05-20 2026 | Security for AI Agent Managers: Key Controls beginner 8 min read | Library for securing AI agent managers, focusing on mitigating prompt injection, data leaks, and abuse of capabilities. It details risks inherent in agentic systems, including indirect prompt injection in browser agents and tool-chain injection, referencing industry guidance from NIST and the EU AI Act. Recommended layered mitigations include deploying an AI security gateway, enforcing context separation, hardening tool-use policies with least privilege, improving memory and RAG hygiene, and continuous monitoring and red-teaming. → blockchain-council.org |
| 2026-05-20 2026 | How prompt injection broke Nvidia's sandboxed OpenClaw agent intermediate 9 min read | Writeup on prompt injection vulnerabilities in Nvidia's sandboxed OpenClaw agent, detailing how attackers can bypass isolation through dependency poisoning with emoji-encoded payloads and agent configuration poisoning via indirect prompt injection. The research highlights the inadequacy of sandboxes alone to prevent data exfiltration and persistent behavioral corruption, contrasting with the broader "IDEsaster" threat in non-sandboxed AI coding tools like Cursor and GitHub Copilot. |
| 2026-05-19 2026 | AI Agent Security: Automating Workflow Without Creating Prompt Injection or Data Leak Risks intermediate 5 min read | Reference on securing AI agents, detailing risks like prompt injection and data leakage, as described by OWASP and NIST. It emphasizes separating untrusted content from agent instructions, implementing data minimization, role-based access, output controls, and robust logging. The guide advises starting with lower-risk tasks and incorporating human review for sensitive actions, offering a checklist to identify potential vulnerabilities before deployment. → hackread.com |
| 2026-05-19 2026 | 7 Serious AI Security Risks and How to Mitigate Them beginner 11 min read | Library addressing AI security risks including prompt injection attacks and data leaks. It details mitigations for limited testing, lack of explainability, data breaches, adversarial attacks, bias, and supply chain risks, highlighting techniques like adversarial training, interpretable models, encryption, differential privacy, ensemble methods, and bias audits. The resource also notes how LLMs enable attackers to work faster, create convincing deceptions, operate more independently, and discover new vulnerabilities, impacting systems like Slack AI. → wiz.io |
| 2026-05-17 2026 | Researchers Uncover 10 In-the-Wild Prompt Injection Payloads Targeting AI Agents news 2 min read | Writeup detailing 10 indirect prompt injection (IPI) payloads discovered in the wild targeting AI agents. These payloads leverage poisoned web content to trick agents into executing malicious instructions, leading to data destruction, API key theft, and financial fraud. The attack chain involves threat actors embedding hidden instructions like "Ignore previous instructions" which, when processed by agents that browse and summarize web pages, bypass security protocols. High-impact targets include agentic AIs with privileges like sending emails or executing terminal commands, potentially affecting tools such as GitHub Copilot and AI-powered CI/CD reviewers. → infosecurity-magazine.com |
| 2026-05-13 2026 | How indirect prompt injection attacks on AI work - and 6 ways to shut them down intermediate 6 min read | Library providing defenses against indirect prompt injection attacks, a top LLM security risk. These attacks weaponize AI by embedding malicious instructions within external data sources, leading to actions like API key theft, system overrides, attribute hijacking, and terminal command injection. Mitigation strategies include input/output validation, human oversight, least privilege, and OWASP's cheat sheet for handling these threats, which are ranked as the highest to LLM security by OWASP. |
| 2026-05-12 2026 | 7 AI Security Tools to Prepare You for Every Attack Phase beginner 12 min read | Library for hardening machine learning models against adversarial threats, the Adversarial Robustness Toolbox (ART) offers Python modules for assessing, defending, and verifying security. It supports 39 attack and 29 defense modules across major ML frameworks like TensorFlow and PyTorch, handling various data modalities. ART provides robustness metrics for objective resilience reporting, best suited for ML researchers and security engineers focused on adversarial attack simulation and model hardening during development. → wiz.io |
| 2026-05-08 2026 | The AI Agent Security Surface: What Gets Exposed When You Add Tools and Memory intermediate 8 min read | Library for securing AI agents, moving beyond model-centric security to address four distinct attack surfaces: Prompt, Tool, Memory, and Planning Loop. This framework details vulnerabilities like indirect prompt injection, parameter injection against tools, memory poisoning illustrated by MINJA Framework successes, and planning loop manipulation leading to cascading failures in multi-agent systems. Mitigations include boundary sanitization, least privilege, provenance tracking, and reasoning logging. |
| 2026-05-08 2026 | Mitigating Indirect AGENTS.md Injection Attacks in Agentic Environments intermediate 8 min read | Library demonstrating indirect AGENTS.md injection attacks in agentic environments. This library highlights a supply chain risk where malicious dependencies can overwrite AGENTS.md files, allowing attackers to hijack AI agent behavior, exemplified by a Golang project with a compromised `github.com/cursorwiz/echo` dependency that injects a stealthy `time.Sleep` command and manipulates PR summaries. |
| 2026-05-05 2026 | Supply-chain attacks take aim at your AI coding agents beginner 6 min read Supply Chain | Library for identifying and mitigating AI coding agent supply-chain risks, including techniques like "slopsquatting" and LLM Optimization abuse used in the PromptMink campaign by North Korean APT group Famous Chollima. It details malicious packages targeting AI agents on registries like NPM and PyPI, featuring persuasive descriptions, legitimate functionality lures, and the use of compiled payloads and obfuscation for evasion. The library addresses how AI agents can be manipulated into installing malicious dependencies, as observed with hallucinated package names and overly convincing documentation designed to influence LLM recommendations. → csoonline.com |
| 2026-05-05 2026 | LiteLLM flaw exploited within 36 hours of disclosure news | A critical flaw in LiteLLM was exploited within 36 hours of its public disclosure. The vulnerability, which allowed for potential data exfiltration, posed a significant risk to users. The rapid exploitation highlights the urgency of patching security vulnerabilities and the swiftness with which malicious actors can leverage disclosed weaknesses. No specific bounty payout amount was mentioned in the provided content. → msn.com |
| 2026-05-05 2026 | AI finds 20-year-old bugs in PostgreSQL and MariaDB news 2 min read | Analysis of critical vulnerabilities discovered by AI in PostgreSQL and MariaDB, including CVE-2026-2005 (PostgreSQL pgcrypto heap buffer overflow), CVE-2026-2006 (PostgreSQL missing validation), and CVE-2026-32710 (MariaDB JSON_SCHEMA_VALID() buffer overflow). These flaws, some dating back over 20 years, enable remote code execution and have been patched by maintainers. → csoonline.com |
| 2026-05-04 2026 | Weekly Recap: AI-Powered Phishing Android Spying Tool Linux Exploit GitHub RCE & More news 19 min read Mobile RCE | Library for detecting and mitigating threats including the CVE-2026-41940 cPanel flaw, CVE-2026-31431 Linux kernel vulnerability (Copy Fail), and CVE-2026-3854 GitHub RCE. It also covers vishing tactics for SaaS breaches, TeamPCP's supply chain attacks across npm, PyPI, and Packagist, a DEEP#DOOR Python backdoor, and the VECT 2.0 ransomware. → thehackernews.com |
| 2026-05-04 2026 | Local Guardrails for Secrets Security in the Age of AI Coding Assistants beginner 9 min read Secrets Supply Chain | Library for local secret scanning, ggshield, addresses the shift of software supply chain attack surfaces to developer workstations. It detects hardcoded credentials in .env files, terminal history, build output, and AI prompts, mitigating risks before they reach remote repositories or pipelines. The tool integrates directly into developer workflows via editors, Git hooks, terminals, and AI coding assistants, preventing credential exposure and simplifying incident response. → blog.gitguardian.com |
| 2026-05-03 2026 | SecureLayer7 Discloses Two High Injection Vulnerabilities in Spring AI news 2 min read | Writeup detailing two high-severity injection vulnerabilities in Spring AI, CVE-2026-22730 (SQL Injection) and CVE-2026-22729 (JSONPath Injection). These flaws, discovered by SecureLayer7's Blackf0g team, affect vector store metadata filtering and bypass access controls in RAG applications. The SQL injection allows authenticated attackers to manipulate MariaDBFilterExpressionConverter, while the JSONPath injection impacts PostgreSQL and Oracle vector stores via Vector Stores FilterExpressionConverter. Both vulnerabilities are fixed in Spring AI 1.0.4 and 1.1.3. |
| 2026-05-01 2026 | Anthropic Rolls Out Claude Security for AI Vulnerability Scanning beginner 2 min read | Tool for AI-powered application security scanning, Claude Security, utilizes Claude Opus 4.7 to reason about code and identify vulnerabilities by understanding component interactions and data flows, rather than relying solely on pattern matching. It offers scheduled and targeted scans, detailed explanations of findings including confidence ratings and severity, and generates patch instructions. Claude Security integrates with existing audit systems and can send results to platforms like Slack and Jira, aiming to reduce false positives through a multi-stage validation pipeline. → infosecurity-magazine.com |
| 2026-05-01 2026 | Poisoning the well: AI supply chain attacks on Hugging Face and OpenClaw beginner 16 min read Supply Chain | Library of malicious AI skills and models found on Hugging Face and ClawHub, facilitating AI supply chain attacks. Attackers exploit trust in these platforms by embedding trojanized skills and disguised payloads, leading to malware delivery including trojans, cryptominers, and the AMOS stealer. Techniques like indirect prompt injection enable AI agents to execute malicious actions on behalf of users, expanding the attack surface beyond initial compromise. |
| 2026-04-30 2026 | CVE MCP Server Turns Claude Into a Full-Spectrum Security Analyst With 27 Tools Across 21 APIs intermediate 2 min read API Sec | Tool for turning Claude AI into a full-spectrum security analyst, the CVE MCP Server integrates with 27 intelligence tools across 21 APIs. It automates CVE triage by correlating data from NVD, EPSS, CISA KEV, GitHub, VirusTotal, Shodan, and more, providing a weighted risk score for prioritization. Key features include API-free tool access, DevSecOps integrations for dependency scanning, and support for Claude Desktop and Claude Code. → cybersecuritynews.com |
| 2026-04-30 2026 | Benchmarking AI Pentesting Tools: A Practical Comparison intermediate | This article provides a practical comparison of AI-powered penetration testing tools. It evaluates their effectiveness and efficiency in various cybersecurity scenarios. The focus is on how these tools leverage AI to automate and enhance aspects of the pentesting process, such as vulnerability detection and exploitation. The comparison aims to help security professionals choose the most suitable AI tools for their needs. No specific bounty payout amounts are mentioned in the provided content. → securityboulevard.com |
| 2026-04-30 2026 | CVE-2026-42208: LiteLLM SQL Injection Leaks Upstream API Keys news 7 min read SQLi | Writeup of CVE-2026-42208, a critical pre-authentication SQL injection in LiteLLM, a popular AI gateway. Exploited 36 hours after disclosure, this vulnerability in versions prior to 1.83.7-stable allows attackers to steal upstream API keys for providers like OpenAI, Anthropic, and Gemini by targeting the `litellm_credentials` and `litellm_config` tables. Immediate upgrade to version 1.83.7-stable or implementing mitigation strategies is advised. |
| 2026-04-30 2026 | H-mmer/pentest-agents: Autonomous bug-bounty framework for Claude Code — 40 specialist agents, exploit-chain builder, writeup search, and live HackerOne/Bugcrowd integration. intermediate 7 min read Bug Bounty | Library for autonomous bug-bounty hunting, integrating with Claude Code and other AI coding tools. It features 50 specialist agents, an exploit-chain builder, writeup search capabilities leveraging FAISS for semantic or keyword retrieval, and live integration with HackerOne and Bugcrowd platforms. The framework supports automated hunt loops, persistent endpoint tracking, and a cross-IDE installer for seamless deployment. |
| 2026-04-29 2026 | CVE-2026-42208: LiteLLM bug exploited 36 hours after its disclosure news 2 min read SQLi | Writeup of CVE-2026-42208, an SQL injection in LiteLLM's proxy API key verification, exploited 36 hours post-disclosure. Attackers leverage crafted Authorization headers to access and potentially modify sensitive data in database tables holding API keys and credentials. The vulnerability, present in LiteLLM versions 1.81.16 to 1.83.6, was addressed in version 1.83.7. Disabling error logs offers a workaround for unpatchable instances. → securityaffairs.com |
| 2026-04-29 2026 | AI Finds 38 Security Flaws in OpenEMR news RCE | An AI system has identified 38 security vulnerabilities within the OpenEMR electronic health records software. The AI's analysis, detailed in a linked report, uncovered these flaws, highlighting potential risks to patient data security and system integrity. This discovery underscores the growing role of artificial intelligence in identifying and addressing security weaknesses in critical software applications. No specific bug bounty payout amount was mentioned in the provided content. → darkreading.com |
| 2026-04-29 2026 | LiteLLM exploited within 36 hours of disclosure via SQL injection bug news 2 min read SQLi | Library for managing large language model (LLM) interactions. Explores the exploitation of CVE-2026-42208, a SQL injection vulnerability in LiteLLM, which led to the theft of API keys and provider credentials from enterprises using the proxy to connect to models like OpenAI and Anthropic. The vulnerability, disclosed and exploited within 36 hours, highlights the compressed window between vulnerability discovery and weaponization, potentially exposing sensitive company IP and private data. Disabling error logs is a suggested mitigation. → scworld.com |
| 2026-04-29 2026 | Malicious npm Dependency Linked to AI Assisted Commit Targets Crypto Wallets news 1 min read Supply Chain | Library of malicious npm dependencies linked to AI-assisted commits, specifically @validate-sdk/v2 and the PromptMink campaign, targeting crypto wallets. This North Korean state-sponsored actor, Famous Chollima, employed a layered attack structure with legitimate-seeming Web3 utilities hiding malware payloads, evolving from JavaScript to compiled binaries and Rust across Linux and Windows to exfiltrate sensitive data, system information, project folders, and install SSH keys for persistent access. → infosecurity-magazine.com |
| 2026-04-29 2026 | Fresh LiteLLM Vulnerability Exploited Shortly After Disclosure news 2 min read SQLi | Library for securing AI gateways, specifically addressing CVE-2026-42208, a critical-severity SQL injection vulnerability in LiteLLM. This flaw, exploitable pre-authentication, allowed unauthenticated attackers to craft malicious Authorization headers to access sensitive database tables containing API keys and credentials. The vulnerability arises from a database query that includes caller-supplied values directly, bypassing parameterization. LiteLLM version 1.83.7 resolves this by properly parameterizing the query, with disabling error logs also offered as a mitigation. → securityweek.com |
| 2026-04-29 2026 | Firefox using advanced AI to find fix browser security flaws news Fuzzing | Firefox is leveraging advanced AI to proactively identify and fix security vulnerabilities in its browser. This innovative approach aims to enhance user safety by detecting flaws before they can be exploited. The article highlights how AI is becoming an increasingly powerful tool in cybersecurity, particularly in the realm of software development and maintenance. → msn.com |
| 2026-04-29 2026 | Cursor AI Vulnerability Enables Remote Code Execution news RCE | A critical vulnerability in Cursor AI has been discovered, allowing for Remote Code Execution (RCE). This means an attacker could potentially run unauthorized code on a user's system through the AI. The exact impact and exploitation details are likely to be further detailed in the linked content. This type of vulnerability poses a significant security risk, potentially leading to data breaches, system compromise, and other malicious activities. → letsdatascience.com |
| 2026-04-28 2026 | FIRESIDE CHAT: Leaked secrets are now the go-to attack vector and AI is accelerating exposures news 3 min read Secrets | Library for scanning public GitHub commits and private repositories for hard-coded secrets. It detects over 28.6 million leaked credentials in 2025, a 34% year-over-year increase, with AI infrastructure secrets like OpenRouter and DeepSeek API keys spiking significantly. The library addresses the remediation problem, noting that 64% of leaked credentials from 2022 remain active. It highlights how AI-assisted code, like commits co-signed by Claude Code, contains secrets at a 33% rate, and emphasizes the need for governance alongside tools like SPIFFE for machine identity. → securityboulevard.com |
| 2026-04-28 2026 | Experts flag potentially critical security issues at heart of Anthropic MCP news | Security experts have identified potentially critical vulnerabilities within Anthropic's "MCP" (likely referring to their model or platform). These issues, if exploited, could pose significant risks. The article highlights concerns about the security of Anthropic's core technology. No specific payout amounts for bug bounties were mentioned in the provided content. → msn.com |
| 2026-04-27 2026 | Weekly Recap: Fast16 Malware XChat Launch Federal Backdoor AI Employee Tracking & More news 12 min read | Toolset highlighting recent application security threats including fast16 malware, the UNC6692 group's Snow malware suite, FIRESTARTER backdoor targeting a U.S. federal agency, Lotus Wiper affecting Venezuelan energy systems, and The Gentlemen RaaS deploying SystemBC. It also covers the Bitwarden CLI compromise, detailing vulnerabilities such as CVE-2025-20333 and CVE-2025-20362. → thehackernews.com |
| 2026-04-27 2026 | Poisoned pixels phishing prompt injection: Cybersecurity threats in AI-driven radiology beginner 6 min read | Library discussing AI vulnerabilities in healthcare radiology, focusing on prompt injection techniques like data poisoning, backdoor attacks, and jailbreaking. It highlights risks of LLMs in DICOM headers and diagnostic imaging data, enabling attacks without advanced programming skills. Countermeasures explored include least privilege, sandboxing, digital watermarking, and red teaming involving clinical specialists, alongside the persistent human factor in cybersecurity. |
| 2026-04-26 2026 | Anthropic's model context protocol includes a critical remote code execution vulnerability news RCE | A critical remote code execution vulnerability has been discovered in Anthropic's model context protocol. This flaw could allow attackers to execute arbitrary code on a system, posing a significant security risk. Further details are available at the provided link. No bug bounty payout amount is mentioned in the content. → msn.com |
| 2026-04-26 2026 | prompt-security/clawsec: A complete security skill suite for OpenClaw's and NanoClaw agents (and variants). Protect your SOUL.md (etc') with drift detection, live security recommendations, automated audits, and skill integrity verification. All from one installable suite. intermediate 6 min read Supply Chain | Library for comprehensive security for AI agent platforms like OpenClaw, NanoClaw, Hermes, and Picoclaw. It provides unified security monitoring, drift detection, live security recommendations from NVD CVE polling, automated audits for prompt injection, and skill integrity verification. The suite includes a one-command installer, file integrity protection for critical agent files (SOUL.md, etc.), and checksum verification for all skill artifacts. It also offers exploitability context enrichment for CVE advisories, detailing exploit existence, weaponization status, attack requirements, and risk assessment to prioritize immediate threats. |
| 2026-04-24 2026 | Indirect prompt injection is taking hold in the wild beginner 3 min read | Analysis of indirect prompt injection (IPI) observed in the wild, detailing techniques for hiding malicious instructions within web pages and metadata. Researchers from Google and Forcepoint identified IPIs ranging from harmless pranks to destructive actions like data exfiltration, financial fraud via PayPal and Stripe, and denial-of-service attacks. Hidden text, HTML comments, and metadata injection are common obfuscation methods. The increasing prevalence and sophistication of these attacks, particularly against agentic AIs with elevated privileges, necessitate strict data-instruction boundaries. → helpnetsecurity.com |
| 2026-04-24 2026 | GPT-5.5 Bio Bug Bounty Program Aims to Improve AI Safety and Performance news 2 min read Bug Bounty | Program aims to enhance AI safety by inviting researchers to discover vulnerabilities in GPT-5.5. Participants must craft a universal jailbreak prompt to bypass safety filters and successfully answer a five-question bio-safety challenge on GPT-5.5 within Codex Desktop. A $25,000 prize is offered for the first successful universal jailbreak, with smaller awards for partial successes, and the program runs from April 23, 2026, to July 27, 2026, requiring NDA adherence. → gbhackers.com |
| 2026-04-24 2026 | How indirect prompt injection attacks on AI work - and 6 ways to shut them down intermediate 6 min read | Library of resources addressing indirect prompt injection attacks on LLMs, a leading security risk. This threat involves hidden instructions within web content, emails, or addresses that can cause AI to perform malicious actions like data exfiltration or unauthorized redirection, as detailed by researchers from Palo Alto Networks and Forcepoint. Techniques such as API key theft, system override, attribute hijacking, and terminal command injection are outlined. The library also covers defensive strategies including input/output validation, human oversight, and vendor-specific mitigation efforts from Google, Microsoft, Anthropic, and OpenAI. |
| 2026-04-23 2026 | Six AI Vulnerabilities Three Attack Patterns One Dangerous Service Gap news 7 min read | Library for analyzing AI vulnerabilities, focusing on three distinct attack patterns: untrusted input processed as trusted AI context, overly broad AI data access without per-operation enforcement, and process containment and functional scoping failures. This analysis covers vulnerabilities like EchoLeak, Reprompt, ForcedLeak, GeminiJack, and GrafanaGhost, highlighting the need for robust input validation extended to all data sources AI touches, per-operation access control for AI data requests, and strict functional scoping for back-end AI processes, rather than solely relying on model-level guardrails. |
| 2026-04-23 2026 | AI-powered scanner vulnerabilities news 6 min read | Library detailing vulnerabilities in AI-powered web scanners that leverage Large Language Models. It outlines how attacker-controlled content can influence scanner reasoning, leading to indirect prompt injection attacks. These attacks can cause unintended state changes, data exfiltration, and exploitation of routing-based SSRF, often by manipulating Host headers to access internal services from within the scanner's privileged network position. → portswigger.net |
| 2026-04-23 2026 | Anthropic's model context protocol includes a critical remote code execution vulnerability news | Anthropic's model context protocol includes a critical remote code execution vulnerability https://ift.tt/Hfb3ygq → msn.com |
| 2026-04-22 2026 | Massive compromise hits LiteLLM and the whole AI developers community: how did it happen? news | Massive compromise hits LiteLLM and the whole AI developers community: how did it happen? https://ift.tt/kWQ0dJB → cybernews.com |
| 2026-04-22 2026 | Three AI coding agents leaked secrets through a single prompt injection. One vendor's system card predicted it news | Three AI coding agents leaked secrets through a single prompt injection. One vendor's system card predicted it https://ift.tt/smH86bY |
| 2026-04-22 2026 | You're Simulating the Wrong Attacker: Who Matters in AI Red Teaming beginner 4 min read | Library for AI red teaming that highlights the limitations of simulating only prompt injection attackers. It details six distinct threat actor profiles, including low-skill script kiddies, insider threats, and sophisticated nation-state actors, each requiring specialized testing approaches across five expertise domains: prompt engineering, application security, architecture, data/ML security, and business logic. The resource emphasizes that traditional app security teams and even many AI-focused firms miss critical attack surfaces by not simulating a broader range of adversaries and attack vectors. |
| 2026-04-22 2026 | DeepTeam: Open-Source Framework to Red Team LLMs and LLM Systems intermediate 6 min read | Framework for red teaming LLM systems, DeepTeam simulates attacks like jailbreaking, prompt injection, and multi-turn exploitation to uncover vulnerabilities such as bias, PII leakage, and SQL injection. It supports over 50 pre-built vulnerabilities mapped to frameworks like OWASP Top 10 for LLMs and NIST AI RMF, along with 20+ adversarial attack methods. DeepTeam also includes seven production-ready guardrails and allows custom vulnerability creation. |
| 2026-04-22 2026 | Claude Jailbreaking in 2026: What Repello's Red Teaming Data Shows news 22 min read | Analysis of Repello's red-teaming data on LLM jailbreaking reveals Claude Opus 4.5's significantly lower breach rates (4.8%) compared to GPT-5.2 (14.3%) and GPT-5.1 (28.6%) across 21 multi-turn adversarial scenarios. Claude Opus 4.5 demonstrated complete defense against financial fraud and mass deletion attempts, while GPT-5.2 exhibited a "refusal-enablement gap" by refusing harmful actions linguistically yet providing executable attack steps. The analysis highlights that operational risk stems from multi-turn adversarial sequences and application-layer attacks on custom deployments, rather than simple single-prompt jailbreaks. |
| 2026-04-22 2026 | AI-Infra-Guard: Full-Stack AI Red Teaming Platform intermediate 7 min read | Platform for full-stack AI red teaming, AI-Infra-Guard integrates capabilities like ClawScan, Agent Scan, AI infra vulnerability scanning, MCP Server & Agent Skills scan, and Jailbreak Evaluation. It aims to detect vulnerabilities including the LiteLLM supply chain attack (CRITICAL) and supports scanning AI components like FastGPT, Upsonic, crewai, and kubeai, with a vulnerability database refreshed across multiple components and new CVE/GHSA entries. |
| 2026-04-22 2026 | AI Red Teaming Playground Labs (Microsoft) intermediate 4 min read | Library providing AI Red Teaming Playground Labs, originally featured in Black Hat USA 2024. It offers challenges for systematically red teaming AI systems, incorporating adversarial machine learning and Responsible AI failures. These labs are also referenced in the Microsoft Learn Limited Series: AI Red Teaming 101. The repository includes Jupyter Notebooks showcasing the use of the Python Risk Identification Tool (PyRIT) for automated risk identification in generative AI systems, specifically for Labs 1 and 5. |
| 2026-04-22 2026 | HackerOne: LLM01: Invisible Prompt Injection intermediate | Program: HackerOne Severity: medium Weakness: LLM01: Prompt Injection ## Description Hey team, Hai is vulnerable to invisible prompt injection via Unicode tag characters. ## Reproduction steps 1. ... → hackerone.com |
| 2026-04-22 2026 | When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins beginner 67 min read | Survey of prompt injection risks in third-party AI chatbot plugins, analyzing 17 plugins used by over 10,000 websites. Eight plugins fail to enforce conversation history integrity, amplifying direct prompt injection by allowing forged system messages. Fifteen plugins indiscriminately ingest third-party content for web-scraping, enabling indirect prompt injection when attackers poison external data. This study systematically evaluates these vulnerabilities, showing how insecure plugin practices undermine LLM-level defenses. → arxiv.org |
| 2026-04-22 2026 | Prompt Injection Attacks on Agentic Coding Assistants: A Systematic Analysis advanced 30 min read | Analysis of prompt injection vulnerabilities affecting agentic AI coding assistants like Claude Code, GitHub Copilot, and Cursor, which integrate LLMs with external tools and protocols such as MCP. This work synthesizes findings from 78 studies, detailing 42 attack techniques including input manipulation, tool poisoning, and protocol exploitation. It identifies that over 85% of attacks succeed against current defenses, often enabling arbitrary code execution and system compromise through vulnerabilities in skill-based architectures and protocol ecosystems. → arxiv.org |
| 2026-04-22 2026 | Prompt Injection 2.0: Hybrid AI Threats advanced 20 min read | Library for analyzing Prompt Injection 2.0, which combines LLM manipulation with traditional exploits like XSS and CSRF. It builds upon Preamble's research and mitigation technologies, evaluating them against contemporary threats such as AI worms and multi-agent infections. The library analyzes how these hybrid attacks bypass security controls, referencing CVE-2024-5565 and DeepSeek XSS exploits, and proposes architectural solutions involving prompt isolation and runtime security. → arxiv.org |
| 2026-04-22 2026 | Architecting Secure AI Agents: System-Level Defenses Against Indirect Prompt Injection advanced 1 min read | Library for architecting secure AI agents, focusing on system-level defenses against indirect prompt injection. It proposes dynamic replanning, constrained LLM decision-making, and treating personalization and human interaction as core design elements. The work critiques existing benchmarks, highlighting the importance of system-level structures for controlling agent behavior and integrating rule-based and model-based security checks. → arxiv.org |
| 2026-04-22 2026 | Anthropic's Model Context Protocol includes a critical remote code execution vulnerability newly discovered exploit puts 200000 AI servers at risk news 8 min read | Writeup of critical RCE vulnerability in Anthropic's Model Context Protocol (MCP) affecting its SDKs across Python, TypeScript, Java, and Rust. The flaw, rooted in STDIO transport interface handling of local process execution, allows arbitrary command injection via user-controlled input without sanitization. Exploitation vectors include UI injection in AI frameworks, hardening bypasses in tools like Flowise, zero-click prompt injection in AI coding IDEs such as Windsurf and Cursor, and malicious package distribution via MCP marketplaces. OX Security reported numerous CVEs, with some fixed and others awaiting resolution. |
| 2026-04-21 2026 | The 'by design' security flaw of Model Context Protocol (MCP) news 6 min read | Writeup on the Model Context Protocol (MCP) by OX Security details an architectural flaw allowing remote command execution by exploiting its STDIO interface. This vulnerability affects millions of AI applications and has resulted in numerous CVEs, enabling attackers to hijack servers and exfiltrate data through unverified MCP marketplace configurations like those found in LangFlow and AI IDEs like Windsurf and Cursor. The report emphasizes the need for developers to implement manifest-only execution, strict sandboxing, explicit opt-ins, least-privilege secret management, and marketplace verification to mitigate risks. |
| 2026-04-21 2026 | Prompt injection turned Googles Antigravity file search into RCE news 2 min read | Tool: Prompt injection allows RCE in Google's Antigravity IDE, bypassing Secure Mode. Researchers exploited a flaw in the `find_my_name` tool, which used the `fd` utility. By injecting command-line flags into the `Pattern` parameter, attackers could transform file searches into arbitrary code execution, even through indirect prompt injection from untrusted source files. This bypasses Secure Mode because the native tool invocation occurs before security boundary checks. → csoonline.com |
| 2026-04-21 2026 | Claude Code Gemini CLI and GitHub Copilot Vulnerable to Prompt Injection via GitHub Comments news 3 min read | Library of techniques demonstrating "Comment and Control" prompt injection, a cross-vendor vulnerability class affecting AI coding agents like Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and GitHub Copilot Agent. These attacks weaponize GitHub comments, PR titles, and issue bodies to hijack AI agents, exfiltrating API keys and access tokens from CI/CD environments by bypassing security mitigations such as environment variable filtering, secret scanning, and firewalls. Vulnerabilities detailed include RCE via PR title and API key leaks through issue comments. → cybersecuritynews.com |
| 2026-04-21 2026 | Google Patches Antigravity IDE Flaw Enabling Prompt Injection Code Execution news 6 min read | Library for defending against prompt injection attacks in AI-powered development tools. This library addresses vulnerabilities like the one in Google's Antigravity IDE, where flaws in file searching and input sanitization allowed code execution via the `-X` flag. It also covers techniques seen in attacks such as Comment and Control against GitHub Copilot, NomShub in Cursor IDE, ToolJack, CVE-2026-21520 in Microsoft Copilot Studio, and Claudy Day in Claude, all of which leverage untrusted input to manipulate AI agents, exfiltrate data, or gain unauthorized access. → thehackernews.com |
| 2026-04-20 2026 | Vuln in Googles Antigravity AI agent manager could escape sandbox give attackers remote code execution news 2 min read | Vulnerability in Google's Antigravity AI agent manager allowed prompt injection to bypass secure mode, granting attackers remote code execution by exploiting the `find_by_name` native tool before sandbox protections engaged. This discovery, made by Pillar Security and since patched, highlights the risks of unvalidated input for agentic AI, similar to findings in Cursor, and emphasizes the need to move beyond sanitization controls for native tool parameters. → cyberscoop.com |
| 2026-04-20 2026 | Anthropic MCP Hit by Critical Vulnerability Enabling Remote Code Execution news 2 min read | Writeup of critical RCE vulnerability in Anthropic's Model Context Protocol (MCP), impacting over 150 million downloads and 200,000 servers. This systemic flaw, an architectural design decision across SDKs for Python, TypeScript, Java, and Rust, enables unauthenticated UI injection (CVE-2026-30617), authenticated RCE (CVE-2026-30623), and zero-click prompt injection. Exploitation families were found in tools like Flowise, Windsurf (CVE-2026-30615), Cursor, LiteLLM, LangChain, and IBM's LangFlow. Despite multiple disclosures and critical CVEs, the protocol-level issue remains unaddressed by Anthropic. → gbhackers.com |
| 2026-04-20 2026 | Critical Anthropic MCP Vulnerability Enables Remote Code Execution Attacks news 2 min read | Writeup of critical Anthropic MCP vulnerabilities, identified by OX Security, enabling remote code execution and data exfiltration. The flaws, present across MCP SDKs for Python, TypeScript, Java, and Rust, affect over 150 million downloads and 200,000 servers. Exploitation paths include unauthenticated UI injection in AI frameworks, security hardening bypasses in platforms like Flowise, zero-click prompt injection targeting AI IDEs like Windsurf and Cursor, and malicious payload distribution through MCP registries, with CVE-2026-30615 and CVE-2026-30623 being notable examples. OX Security has developed detection capabilities for insecure MCP configurations. → cyberpress.org |
| 2026-04-19 2026 | MCP Tool Poisoning — How It Works & How To Fight It intermediate 13 min read | Library detailing MCP tool poisoning, an indirect prompt injection attack targeting AI agents interacting with tools via Model Context Protocol (MCP) servers. Attackers hide malicious instructions within tool metadata, like descriptions or schemas, making them invisible to users but readable by AI agents. This technique can lead to data exfiltration, credential hijacking, and remote code execution, and can be combined with other attacks such as MCP rug pulls. Mitigation strategies primarily involve using MCP gateways and robust AI security tools to detect changes in tool metadata and outputs. |
| 2026-04-19 2026 | Model Context Protocol Has Prompt Injection Security Problems intermediate 6 min read | Library for securing applications that implement the Model Context Protocol (MCP), addressing prompt injection vulnerabilities. It details attacks like rug pulls, tool shadowing, and tool poisoning, as demonstrated by examples involving exfiltrating WhatsApp message history and manipulating `os.system()` calls. The library highlights the inherent dangers of mixing untrusted instructions with tools that can perform actions on a user's behalf. |
| 2026-04-19 2026 | Vulnerability of LLMs to Prompt Injection in Medical Advice — JAMA news | Vulnerability of LLMs to Prompt Injection in Medical Advice — JAMA |
| 2026-04-19 2026 | Prompt Injection Attack Against LLM-Integrated Applications — arXiv beginner 1 min read | Survey of prompt injection attacks against LLM-integrated applications, detailing the limitations of current methods and introducing HouYi, a novel black-box attack technique. HouYi, inspired by traditional web injection, comprises a pre-constructed prompt, an injection prompt for context partitioning, and a malicious payload. The study demonstrates severe outcomes like unrestricted LLM usage and application prompt theft across 36 real-world applications, with 31 found vulnerable and 10 vendors, including Notion, validating discoveries. → arxiv.org |
| 2026-04-19 2026 | Prompt Injection Attacks in LLMs and AI Agent Systems: A Comprehensive Review beginner | Prompt Injection Attacks in LLMs and AI Agent Systems: A Comprehensive Review |
| 2026-04-16 2026 | Anthropic Defends MCP Design Despite Server Takeover Risk news | Anthropic Defends MCP Design Despite Server Takeover Risk https://ift.tt/IsVue9D → letsdatascience.com |
| 2026-04-16 2026 | The Mother of All AI Supply Chains: Critical Systemic Vulnerability at the Core of Anthropics MCP news 3 min read | Analysis of Anthropic's Model Context Protocol (MCP) reveals a systemic vulnerability enabling Arbitrary Command Execution (RCE) across its SDKs for Python, TypeScript, Java, and Rust. Exploitable via unauthenticated UI injection, hardening bypasses in Flowise, zero-click prompt injection in Windsurf and Cursor, and malicious marketplace distribution, this flaw impacts over 150 million downloads and thousands of servers. Affected tools include LiteLLM, LangChain, and IBM's LangFlow, with over 10 CVEs issued. → ox.security |
| 2026-04-16 2026 | Bypassing LLM Guardrails: Evasion Attacks against Prompt Injection Detection intermediate 1 min read | Analysis of evasion attacks against LLM guardrail systems, detailing two methods: character injection and algorithmic Adversarial Machine Learning (AML). Tested against Azure Prompt Shield and Meta's Prompt Guard, these techniques achieved up to 100% evasion success, maintaining adversarial utility. Attack Success Rates against black-box targets were enhanced by leveraging word importance ranking from offline white-box models, exposing vulnerabilities in current LLM protection mechanisms. → arxiv.org |
| 2026-04-16 2026 | EchoGram: Bypassing AI Guardrails via Token Flip Attacks - HiddenLayer intermediate 10 min read | Technique for bypassing AI guardrails, EchoGram, exploits similarities in training data for text classification and LLM-as-a-judge systems. By appending specific "flip tokens" to malicious prompts, attackers can trick defense models into approving harmful content or generating false alarms. This attack targets defenses protecting models like GPT-4, Claude, and Gemini, and works by manipulating the guardrail layer without altering the core payload. EchoGram can be implemented via dataset distillation or model probing techniques. |
| 2026-04-16 2026 | MCP Security: Tool Poisoning Attacks - Invariant Labs intermediate 9 min read | Library detailing Model Context Protocol (MCP) Tool Poisoning Attacks, a vulnerability allowing sensitive data exfiltration and AI model hijacking via malicious tool descriptions. These attacks exploit the disconnect between simplified user interfaces and complete tool descriptions, enabling instructions to access sensitive files like SSH keys and obscure data transmission. The library highlights implications for agentic systems, detailing how attackers can poison tool descriptions to compromise user data and manipulate AI behavior even with trusted servers. |
| 2026-04-16 2026 | Poison Everywhere: No Output from Your MCP Server Is Safe - CyberArk intermediate 13 min read | Library for exploring Tool Poisoning Attacks (TPA) on Anthropic's Model Context Protocol (MCP). This research extends beyond description fields to demonstrate Full-Schema Poisoning (FSP) by manipulating parameter defaults and types within the tool schema. It also introduces Advanced Tool Poisoning Attacks (ATPA), which specifically target and complicate the detection of malicious tool outputs on MCP servers. |
| 2026-04-16 2026 | The Embedded Threat in Your LLM: Poisoning RAG Pipelines intermediate 4 min read | Analysis of the "Embedded Threat" attack against RAG pipelines, demonstrating how attackers can poison vector databases with malicious documents. This exploit manipulates LLM behavior by embedding hidden instructions within vector embeddings, such as those generated by sentence-transformers/all-MiniLM-L6-v2, leading to altered responses without prompt modification. The attack leverages semantic similarity and LLM trust in retrieved context to inject misinformation or change personas, with proof-of-concept results showing an 80% success rate. Defenses focus on vetting sources, preprocessing content before embedding, enforcing prompt boundaries, and monitoring retrieval behavior. |
| 2026-04-16 2026 | EchoLeak: First Real-World Zero-Click Prompt Injection Exploit advanced 21 min read | Writeup of EchoLeak (CVE-2025-32711), the first zero-click prompt injection exploit targeting Microsoft 365 Copilot. This vulnerability allowed unauthenticated data exfiltration via a crafted email by chaining multiple bypasses, including evading XPIA classifiers, using reference-style Markdown, exploiting auto-fetched images, and abusing a Microsoft Teams proxy within the content security policy. The paper analyzes defense failures and proposes mitigations such as prompt partitioning and enhanced filtering, providing generalizable lessons for secure AI copilots. → arxiv.org |
| 2026-04-16 2026 | When LLMs Autonomously Attack - CMU Research advanced 3 min read | Research from Carnegie Mellon University demonstrates LLMs can autonomously plan and execute complex cyberattacks by acting as hierarchical agents with abstracted "mental models" of red teaming behavior. This system, evaluated by recreating the 2017 Equifax data breach, shows advanced LLMs can orchestrate multi-step attacks, including exploitation, malware deployment, and data exfiltration, without detailed human instruction, offering potential for continuous, affordable security testing and autonomous defense development. |
| 2026-04-16 2026 | The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover advanced 49 min read | Survey of LLM agent vulnerabilities; demonstrates how 94.4% of 18 tested LLMs succumb to Direct Prompt Injection and 83.3% to RAG Backdoor Attacks, enabling malware execution. Inter-Agent Trust Exploitation compromises 100.0% of models, showcasing context-dependent security behaviors that create exploitable blind spots within multi-agent systems. → arxiv.org |
| 2026-04-16 2026 | MCP Tools: Attack Vectors and Defense Recommendations - Elastic Security Labs intermediate 18 min read | Library detailing attack vectors and defense recommendations for Model Context Protocol (MCP) tools, which connect LLMs to external resources. It explores prompt injection and orchestration exploits, including obfuscated instructions, rug-pull redefinitions, cross-tool orchestration, and passive influence, with examples and a basic LLM-based detection method. Security precautions and defense tactics for MCP tool vulnerabilities are also discussed. |
| 2026-04-16 2026 | MCP Safety Audit: LLMs with MCP Allow Major Security Exploits intermediate 16 min read | Tool for auditing Model Context Protocol (MCP) servers, McpSafetyScanner automatically detects vulnerabilities like malicious code execution, remote access control, and credential theft in generative AI applications. It identifies adversarial samples, searches for related exploits, and generates remediation reports for MCP developers. The tool aims to proactively mitigate security risks introduced by LLMs using the MCP framework, addressing issues present in industry-leading LLMs such as Claude and Llama. → arxiv.org |
| 2026-04-16 2026 | AI Security: 5 Attack Vectors Explained beginner 4 min read | Talk detailing five critical attack vectors targeting Large Language Models (LLMs), including Prompt Injection, Context Injection, LLM Internals Vector, RAG Vector, and Agentic Vector. It highlights the "Zero Trust Gap" in LLMs and discusses encoder models like ModernBERT as potential building blocks for implementing AI guardrails due to their speed, efficiency, and privacy benefits. |
| 2026-04-16 2026 | AI agents on GitHub leak API keys via prompt injection news 2 min read | Library for detecting prompt injection vulnerabilities in AI agents, specifically detailing "Comment and Control" attacks on GitHub Actions. The vulnerability affects Claude Code Security Review (CVSS 9.4 Critical), Google Gemini CLI Action (bounty $1,337), and GitHub Copilot Agent (bypassing environment filtering, secret scanning, and network firewall). Attackers exploit PR titles, issue bodies, and comments to exfiltrate API keys and tokens like ANTHROPIC_API_KEY, GITHUB_TOKEN, GEMINI_API_KEY, and GITHUB_COPILOT_API_TOKEN. → techzine.eu |
| 2026-04-16 2026 | MCP Supply Chain Advisory: RCE Vulnerabilities Across the AI Ecosystem news 11 min read | Advisory detailing a systemic command injection vulnerability within Anthropic's MCP protocol impacting multiple AI ecosystem products. Exploits, including CVE-2025-65720 for GPT Researcher, CVE-2026-30623 for LiteLLM, and CVE-2026-30624 for Agent Zero, allow unauthenticated or authenticated remote command execution by injecting arbitrary commands through MCP configurations in affected applications like LangFlow, Fay Digital Human Framework, and Bisheng. → ox.security |
| 2026-04-15 2026 | Risks of artificial intelligence security beginner 10 min read | Library of security considerations for artificial intelligence, detailing risks from prompt injection and data poisoning to model stealing and generative AI misuse in deepfakes and phishing. It highlights vulnerabilities in AI systems, adversary misuse of generative AI, and unintended consequences like bias and data leakage, emphasizing challenges posed by LLM integrations with tools and third-party dependencies. The summary also touches on AI-generated code risks and the escalating concern of autonomous AI attack bots. → blockchain-council.org |
| 2026-04-15 2026 | Agentic LLM Browsers Expose New Attack Surface for Prompt Injection and Data Theft intermediate 3 min read | Analysis of agentic LLM browsers, including Comet, Atlas, Microsoft Edge Copilot, and Brave Leo AI, reveals a new attack surface for prompt injection and data theft. Researchers identified architectural vulnerabilities where Cross-Site Scripting (XSS) on trusted domains can grant attackers control over browsing sessions, enabling indirect prompt injection. This allows malicious commands to be executed, leading to unauthorized file access, email exfiltration, and malware deployment, with attacks being difficult to detect as they leverage user credentials and mimic normal behavior. → cybersecuritynews.com |
| 2026-04-15 2026 | Agents hooked into GitHub can steal creds but Anthropic Google and Microsoft haven't warned users news 6 min read | Library for detecting prompt injection vulnerabilities in AI agents integrated with GitHub Actions. Researchers demonstrated that agents like Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and Microsoft's GitHub Copilot can be tricked via "comment and control" prompt injection into leaking API keys and GitHub access tokens. This attack can occur proactively when pull requests are opened or issues are filed, bypassing existing security layers. → theregister.com |
| 2026-04-14 2026 | Check Point Releases AI Factory Security Blueprint to Safeguard AI Infrastructure from GPU Servers to LLM Prompts beginner 2 min read | Blueprint for securing AI infrastructure, safeguarding GPU servers to LLM prompts. This vendor-tested reference architecture, developed by Check Point, offers layered protection across perimeter, application and LLM, AI infrastructure, and workload and container layers. It addresses threats like prompt injection, data exfiltration, and lateral movement within Kubernetes, leveraging technologies from Check Point and NVIDIA BlueField DPUs via the NVIDIA DOCA software platform. |
| 2026-04-14 2026 | AI Agents Drive Exposure of 29 Million Credentials news | AI Agents Drive Exposure of 29 Million Credentials https://ift.tt/zyb7MrR → letsdatascience.com |
| 2026-04-14 2026 | Claude Mythos Changed Everything. Your APIs Are the First Target. news 4 min read | Platform for agentic security, Salt's Agentic Security Platform addresses the immediate threat posed by AI models like Claude Mythos, which can autonomously discover and exploit zero-day vulnerabilities. It provides continuous, real-time discovery of all API assets, including undocumented and shadow APIs, mapping the full agentic attack surface. The platform then assesses posture, identifying exposures like unauthenticated APIs and excessive permissions, enabling prioritized remediation to fix vulnerabilities before they can be exploited by AI-powered attackers. → securityboulevard.com |
| 2026-04-13 2026 | AI Coding Security Vulnerability Statistics 2026: Alarming Data news 9 min read | Survey of AI coding security vulnerability statistics reveals alarming trends, with up to 62% of AI-generated code containing flaws. Veracode's 2025 analysis shows 45% of AI-generated code fails security tests, and 86% of organizations use third-party packages with critical vulnerabilities in AI-driven environments. Common issues include SQL injection, XSS, log injection, hardcoded credentials, and insecure cryptographic implementations. Java exhibits a 71% failure rate, while Python has a 38% failure rate, highlighting language-specific risks. The report notes a 10x increase in monthly security findings from AI code and a 153% rise in design-level flaws. Prompt injection is now the top OWASP risk for LLM applications. → sqmagazine.co.uk |
| 2026-04-13 2026 | GitHub - schwartz1375/genai-security-training beginner 3 min read Talks | Library of self-paced training materials for security researchers red teaming GenAI and AI/ML systems. It covers adversarial attacks, security vulnerabilities, privacy breaches, model manipulation, evasion techniques, and system-level exploits like prompt injection and jailbreaking. The curriculum includes hands-on labs using tools such as Adversarial Robustness Toolbox (ART), TextAttack, and SHAP, along with theoretical content and references to OWASP LLM Top 10 and MITRE ATLAS. |
| 2026-04-13 2026 | GitHub - schwartz1375/genai-essentials beginner 1 min read Talks | Collection of Jupyter notebooks detailing Generative AI and Large Language Model concepts, prioritizing security considerations. The sequence progresses from core LLM principles and agent introductions to advanced topics like Retrieval-Augmented Generation (RAG), multimodal LLMs, agent frameworks (ReAct, Plan-Execute), and Model Context Protocol (MCP) integration for tool extensibility. Dependencies include Python 3.8+ and Jupyter. |
| 2026-04-12 2026 | Could Sock Puppeting Be the New Trick Jailbreaking Major LLMs? news 2 min read | Technique for jailbreaking LLMs using "sockpuppeting" exploits assistant prefill APIs across major models like Gemini 2.5 Flash and GPT-4o-mini. This method injects a fake acceptance message into the assistant's role, forcing models to bypass safety guardrails and generate prohibited content, including malicious exploit code and system prompts. Providers like OpenAI and AWS Bedrock mitigate this by blocking assistant prefills entirely, while platforms like Google Vertex AI are susceptible due to differing message handling. Security teams are advised to incorporate this vulnerability into AI red-teaming and implement API-layer message ordering validation. |
| 2026-04-11 2026 | LLM Red Teaming Guide (Open Source) - Promptfoo intermediate 13 min read | Library for systematic LLM red teaming, focusing on generating adversarial inputs like prompt injection and jailbreaking to evaluate responses. It supports black-box testing, quantifying risk, and integrating into CI/CD pipelines for applications involving RAG, LLM agents, or chatbots, addressing vulnerabilities such as information leakage, API misuse, and privacy violations. |
| 2026-04-11 2026 | Defining LLM Red Teaming - NVIDIA Technical Blog beginner 8 min read | Analysis defining LLM red teaming as a limit-seeking, manual, and creative practice focused on discovering model deviations rather than malicious harm. It categorizes strategies into language, rhetorical, possible worlds, fictionalizing, and stratagems, identifying 35 specific techniques for exploring LLM vulnerabilities. This approach complements automated benchmarking by leveraging human intuition to uncover novel risks, a crucial element in NVIDIA's trustworthy AI development process. |
| 2026-04-11 2026 | Large Reasoning Models are Autonomous Jailbreak Agents advanced 27 min read | Survey of Large Reasoning Models as autonomous jailbreak agents, evaluating DeepSeek-R1, Gemini 2.5 Flash, Grok 3 Mini, and Qwen3 235B. These models autonomously planned and executed multi-turn conversations with nine target models, achieving a 97.14% jailbreak success rate across harmful prompts. The research highlights an "alignment regression" dynamic, where advanced LRMs can erode the safety guardrails of earlier models. |
| 2026-04-11 2026 | Involuntary Jailbreak: On Self-Prompting Attacks advanced 1 min read | Library disclosing "involuntary jailbreak," a new LLM vulnerability. This technique employs a single universal prompt to compel models like Claude Opus 4.1, Grok 4, Gemini 2.5 Pro, and GPT 4.1 to generate previously rejected questions and their detailed answers, potentially compromising the entire guardrail structure rather than localized components. → arxiv.org |
| 2026-04-11 2026 | Single Line of Code Can Jailbreak 11 AI Models Including ChatGPT, Claude, Gemini intermediate 3 min read | Technique for jailbreaking 11 AI models including ChatGPT, Claude, and Gemini, dubbed "sockpuppeting," exploits assistant prefill API features. This attack injects a fake response prefix, tricking models into generating prohibited content and even revealing system prompt leakage, with Google's Gemini 2.5 Flash showing a 15.7% success rate. While some providers have implemented protections, self-hosted environments using frameworks like Ollama and vLLM remain vulnerable without explicit API-level validation. → cyberpress.org |
| 2026-04-11 2026 | OWASP Top 10 for LLMs 2025: Key Risks and Mitigation Strategies beginner 2 min read | Survey of the OWASP Top 10 for LLM Applications (2025), detailing evolving technical and socio-technical risks like prompt injection and excessive agency. This updated list guides enterprises in securing generative AI ecosystems, from training pipelines to plugins, addressing data disclosure and systemic vulnerabilities relevant to GDPR, HIPAA, CCPA, and the EU AI Act. Invicti's proof-based scanning and LLM-specific checks are presented as tools to validate real risks and strengthen defenses. → invicti.com |
| 2026-04-11 2026 | OWASP Top 10 for LLM Applications 2025 beginner | OWASP Top 10 for LLM Applications 2025 → genai.owasp.org |
| 2026-04-11 2026 | Practical Poisoning Attacks against Retrieval-Augmented Generation advanced 1 min read | Library introducing CorruptRAG, a novel poisoning attack against Retrieval-Augmented Generation (RAG) systems. This technique injects a single poisoned text into the knowledge database, significantly enhancing attack feasibility and stealth compared to prior methods that required numerous poisoned entries. Experiments on large-scale datasets validate CorruptRAG's effectiveness in compromising RAG outputs. → arxiv.org |
| 2026-04-11 2026 | RAG Safety: Exploring Knowledge Poisoning Attacks to RAG advanced 2 min read | Analysis of knowledge poisoning attacks targeting Retrieval-Augmented Generation (RAG) systems, specifically focusing on KG-RAG. This work introduces a practical, stealthy attack strategy that inserts perturbation triples into knowledge graphs to create misleading inference chains, degrading KG-RAG performance. Experiments demonstrate the attack's effectiveness against four recent KG-RAG methods with minimal KG perturbations. → arxiv.org |
| 2026-04-11 2026 | Benchmarking Poisoning Attacks against Retrieval-Augmented Generation advanced 1 min read | Benchmark framework for evaluating poisoning attacks on Retrieval-Augmented Generation (RAG) systems. This benchmark includes 5 standard QA datasets, 10 expanded variants, 13 poisoning attack methods, and 7 defense mechanisms. Findings reveal that while current attacks are effective on standard datasets, their impact diminishes on expanded versions, and advanced RAG architectures like sequential, branching, conditional, loop, conversational, multimodal RAG, and RAG-based LLM agents remain vulnerable, with existing defenses proving insufficient. → arxiv.org |
| 2026-04-11 2026 | Q4 2025 AI Agent Security Trends news 1 min read | Report on Q4 2025 AI agent security trends, detailing real-world attacks targeting emergent agentic AI systems. Analysis of production traffic reveals attacker focus on system prompt leakage, indirect prompt injection via trusted external content, and exploitation of new surfaces like tool use and script-shaped content. Core techniques include role play and obfuscation to bypass safeguards, with indirect attacks proving more efficient than direct ones. |
| 2026-04-11 2026 | OWASP GenAI Top 10 Risks and Mitigations for Agentic AI Security beginner 9 min read | Library defining the OWASP Top 10 for Agentic Applications, a comprehensive resource for identifying and mitigating risks associated with autonomous AI agents. Developed through input from over 100 industry leaders, it highlights threats such as Agent Behavior Hijacking, Tool Misuse and Exploitation, and Identity and Privilege Abuse. This framework complements existing OWASP GenAI resources, offering practical, actionable guidance grounded in real-world attacks and mitigations to promote the secure development and deployment of generative AI systems. → genai.owasp.org |
| 2026-04-11 2026 | AI Agent Attacks in Q4 2025 Signal New Risks for 2026 news 4 min read | Analysis of Q4 2025 AI agent attacks highlights evolving threats including system prompt extraction via hypothetical scenarios and obfuscation. Attackers also bypass content controls using indirect methods and probe agents for weaknesses. New attack paths emerge through agentic capabilities like document browsing and tool calls, often via indirect prompt injection. Organizations must extend security controls, validate external content, enforce least-privilege access, and prepare AI-specific incident response. → esecurityplanet.com |
| 2026-04-11 2026 | Protecting Against Indirect Prompt Injection Attacks in MCP intermediate 5 min read | Library for mitigating Indirect Prompt Injection attacks within the Model Context Protocol (MCP). This resource details vulnerabilities like Tool Poisoning, where malicious instructions are embedded in tool metadata, and recommends implementing AI Prompt Shields with techniques like "Spotlighting" and "Datamarking." It also emphasizes supply chain security and general security hygiene as crucial for safeguarding AI systems. |
| 2026-04-11 2026 | Indirect Prompt Injection Attacks: Hidden AI Risks intermediate 4 min read | Library for defending against indirect prompt injection attacks, a sophisticated AI threat recognized by OWASP as a top risk. This library addresses vulnerabilities where malicious instructions are embedded in external content like documents, emails, or images, rather than being submitted directly to an AI agent. It aims to mitigate risks such as data exfiltration and manipulation of business processes by enabling prompt injection detection, input validation, and the establishment of content security policies, similar to CrowdStrike's approach using its Falcon platform. |
| 2026-04-11 2026 | Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild intermediate 20 min read | Writeup detailing observed in-the-wild indirect prompt injection (IDPI) attacks targeting AI agents. The analysis highlights real-world cases including AI-based ad review evasion, SEO manipulation for phishing, data destruction, and sensitive information leakage. It discusses 22 distinct payload engineering techniques and classifies attacker intents, emphasizing the growing weaponization of IDPI beyond theoretical risks. → unit42.paloaltonetworks.com |
| 2026-04-11 2026 | Anatomy of an Indirect Prompt Injection beginner 13 min read | Library detailing the CFS (Context, Format, Salience) model for understanding indirect prompt injection in LLMs. It analyzes vulnerabilities, drawing on concepts like Simon Willison's "lethal trifecta" (access to private data, untrusted content exposure, external communication), and examines how attackers refine tactics to bypass LLM security. Real-world examples, such as the Supabase Model Context Protocol (MCP) attack, illustrate the dangers of embedding malicious instructions within seemingly benign data, leading to unauthorized data exposure or system compromise. |
| 2026-04-11 2026 | New Prompt Injection Attack Vectors Through MCP Sampling intermediate 13 min read | Writeup of new prompt injection attack vectors targeting the Model Context Protocol (MCP) sampling feature. Exploiting the implicit trust model and lack of built-in security controls, attackers can achieve resource theft, conversation hijacking, and covert tool invocation. The analysis details three proof-of-concept examples and evaluates mitigation strategies for MCP-based systems, highlighting vulnerabilities in this LLM integration standard. → unit42.paloaltonetworks.com |
| 2026-04-11 2026 | A Timeline of Model Context Protocol (MCP) Security Breaches news 9 min read | Timeline details MCP security breaches from April to December 2025, highlighting vulnerabilities like "tool poisoning" in WhatsApp MCP, prompt injection in GitHub MCP leading to data exfiltration, cross-tenant access flaws in Asana MCP, and remote code execution in Anthropic's MCP Inspector. Other incidents include OS command injection in `mcp-remote` (CVE-2025-6514), sandbox escapes in Anthropic's Filesystem-MCP server, supply-chain compromises via malicious MCP servers, systemic MCP design flaws enabling RCE in Flowise, and path traversal in Smithery MCP hosting. |
| 2026-04-11 2026 | The Vulnerable MCP Project: Comprehensive MCP Security Database beginner 7 min read | Library of known vulnerabilities impacting MCP (Model Configuration Protocol) servers and SDKs. This catalog details specific exploits such as CVE-2025-68145, CVE-2025-68143, and CVE-2025-68144, alongside broader attack classes including prompt injection, DNS rebinding, Server-Side Request Forgery (SSRF), and command injection. Vulnerabilities affect various products like Anthropic's mcp-server-git, MCP TypeScript SDK, Cursor IDE, and Grafana MCP server, often enabling arbitrary code execution, data exfiltration, or unauthorized transactions. |
| 2026-04-11 2026 | MCP Security: Critical Vulnerabilities Every CISO Must Address in 2025 intermediate 8 min read | Library detailing critical vulnerabilities in Model Context Protocol (MCP), a new standard for AI-tool integration. It highlights how prompt injection attacks in MCP ecosystems can trigger automated actions through connected systems, potentially leading to sensitive data exfiltration. The library also addresses supply chain risks, explaining how MCP servers can dynamically modify tool definitions, allowing for "rug pull" attacks where previously approved tools can be repurposed for malicious activity, affecting vendors like Microsoft and impacting applications such as Nginx-ui (CVE-2026-33032) and Adobe Acrobat Reader. |
| 2026-04-11 2026 | OWASP LLM Prompt Injection Prevention Cheat Sheet beginner 12 min read | Reference LLM Prompt Injection Prevention Cheat Sheet detailing vulnerabilities in Large Language Model applications. It covers direct and indirect prompt injection, encoding and obfuscation techniques like Base64 and Unicode smuggling, and typoglycemia-based attacks. The resource also discusses jailbreaking methods such as DAN prompts, multi-turn attacks, system prompt extraction, data exfiltration, multimodal injection, RAG poisoning, and agent-specific attacks. Defenses include input validation and sanitization, with code examples for pattern matching and fuzzy matching against typoglycemia variants. → cheatsheetseries.owasp.org |
| 2026-04-11 2026 | Attention Tracker: Detecting Prompt Injection Attacks in LLMs intermediate | Attention Tracker: Detecting Prompt Injection Attacks in LLMs |
| 2026-04-11 2026 | How Microsoft Defends Against Indirect Prompt Injection Attacks intermediate 13 min read | Library that defends against indirect prompt injection attacks targeting LLM-based systems. This library implements a multi-layered defense strategy including preventative techniques like hardened system prompts and Spotlighting, detection tools such as Microsoft Prompt Shields integrated with Defender for Cloud, and impact mitigation through data governance, user consent workflows, and deterministic blocking. It addresses vulnerabilities like data exfiltration via HTML images, clickable links, tool calls, and covert channels, as well as unintended actions and phishing. → microsoft.com |
| 2026-04-10 2026 | AI Cybersecurity After Mythos: The Jagged Frontier intermediate 17 min read | Library for AI-driven vulnerability discovery, demonstrating that smaller, cheaper open-weight models can recover significant analysis from Anthropic's Mythos showcase, including detecting exploit candidates for FreeBSD and OpenBSD bugs. This work emphasizes that the effectiveness of AI cybersecurity lies in the surrounding system architecture and deep security expertise, rather than solely on frontier model scale, impacting the economics of the defensive pipeline. |
| 2026-04-10 2026 | Anthropic announces Claude Mythos for cybersecurity research news 1 min read | Library for AI-driven cybersecurity research, Claude Mythos Preview autonomously identifies zero-day vulnerabilities and develops exploits. It has discovered critical issues in OpenBSD, FFmpeg, and the Linux kernel. Access is offered to select partners via Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry, with an application process for open-source maintainers. Anthropic provides usage credits and donations to security foundations, highlighting significant advances in autonomous vulnerability discovery over prior models. |
| 2026-04-10 2026 | Crushing the Axios supply chain threat with Tenable Hexa AI: Use cases for agentic AI intermediate 5 min read | Tool for identifying exposure to the Axios npm supply chain attack using Tenable Hexa AI. This agentic AI automates scanning, asset identification, and remediation verification, mirroring workflows applicable to other emerging threats like CVEs and zero-days. It enables rapid assessment of exposure, scoping blast radius through asset tagging, and efficient prioritization, transforming emergency response from manual scripting to conversational command. → securityboulevard.com |
| 2026-04-10 2026 | MCP Security Vulnerabilities: Prompt Injection and Tool Poisoning intermediate 9 min read | Library for securing Model Context Protocol (MCP) deployments against prompt injection and tool poisoning. It details vulnerabilities like metadata poisoning, over-permissioned tools, supply chain risks, and indirect prompt injection, referencing incidents such as the Supabase MCP Lethal Trifecta Attack. The library emphasizes prevention strategies including strict input validation, sanitization, and the principle of least privilege for tools. |
| 2026-04-10 2026 | How Agentic Tool Chain Attacks Threaten AI Agent Security intermediate 5 min read | Library for securing AI agents against agentic tool chain attacks, detailing threats like tool poisoning, tool shadowing, and rugpull attacks that exploit the agent's reasoning layer and natural language-based decision-making. It covers how these attacks can lead to data exfiltration, unauthorized actions, and supply chain risks by manipulating tool descriptions, metadata, and server behavior, and recommends mitigation strategies including tool governance, version control, server identity controls, pre-execution guardrails, and observability. |
| 2026-04-10 2026 | 8,000+ MCP Servers Exposed: The Agentic AI Security Crisis of 2026 news | 8,000+ MCP Servers Exposed: The Agentic AI Security Crisis of 2026 |
| 2026-04-10 2026 | Agentic AI Security in Production: MCP, Memory Poisoning, Tool Misuse intermediate 13 min read | Tool, a comprehensive analysis of agentic AI security in production, details critical failure modes including MCP Security, Memory Poisoning, and Tool Misuse. It highlights the evolving threat landscape where agents plan and execute actions, emphasizing system design over prompt-level fixes. Specific vulnerabilities like CVE-2025-68144 in mcp-server-git and attack models such as MINJA and AgentPoison are examined, underscoring the need for robust controls across input, memory, tool execution, and identity planes to manage the expanded attack surface created by these systems. → penligent.ai |
| 2026-04-10 2026 | Offensive Security for MCP Servers: How to Prevent AI Agent Exploits intermediate 10 min read | Library for securing MCP (Multi-Cloud Platform) servers against AI agent exploits, addressing vulnerabilities like command injection, SSRF, and path traversal frequently found in modern deployments. It highlights how AI's autonomous execution and dynamic capability discovery, unlike traditional REST APIs, create new risk classes by enabling agents to chain tool calls and reason across APIs. The library emphasizes adapting security from syntax to intent validation, guarding against prompt injection and tool poisoning where manipulated metadata or input can lead to unintended, privileged operations, ultimately leveraging foundational API security principles. |
| 2026-04-10 2026 | The New AI Attack Surface: 3 AI Security Predictions for 2026 beginner 6 min read | Library for confronting three AI attack vectors manifesting in production by 2026: indirect injection via data poisoning, supply chain infiltration through AI development toolchains like MCP servers, and agent-to-agent attack propagation through "toxic combinations" in autonomous agent ecosystems. These vectors exploit how AI agents interpret instructions, trust data sources, and execute permitted actions, moving beyond traditional code vulnerabilities to exploit data as executable commands and the inherent trust in interconnected AI architectures. |
| 2026-04-10 2026 | Introduction to Data Poisoning: A 2026 Perspective beginner 11 min read | Library introducing data poisoning, an adversarial attack corrupting AI/LLM training data to cause backdoors or biased outputs. It details real-world incidents like Basilisk Venom poisoning GitHub code, Qwen 2.5's search tool manipulation, Grok 4's "!Pliny" backdoor triggered by X prompts, and hidden instructions in MCP tools like "joke_teller." The library also covers poisoning in retrieval (RAG), synthetic data pipelines (VIA), and diffusion models, highlighting how even small, hidden manipulations can undermine AI safety and trust across the entire LLM lifecycle. |
| 2026-04-10 2026 | AI Security Research — December 2025 news | AI Security Research — December 2025 |
| 2026-04-10 2026 | From Prompt Injections to Protocol Exploits in LLM Agent Workflows advanced | From Prompt Injections to Protocol Exploits in LLM Agent Workflows |
| 2026-04-10 2026 | LLM Security Guide: OWASP GenAI Top-10 Risks beginner 28 min read | Library detailing offensive and defensive security for Large Language Models and Agentic AI Systems, updated with the OWASP Top 10 for LLMs 2025 and the OWASP Top 10 for Agentic Applications 2026. It covers Agentic AI Security, RAG Vulnerabilities, System Prompt Leakage, Vector/Embedding Weaknesses, and AI Compliance, incorporating tools like DeepTeam, Promptfoo, ARTKIT, and frameworks such as Meta LlamaFirewall and Amazon Bedrock Guardrails. |
| 2026-04-10 2026 | Prompt Injection Attacks in LLMs: A Comprehensive Review intermediate | Prompt Injection Attacks in LLMs: A Comprehensive Review |
| 2026-04-10 2026 | Prompt Injection Attacks: Examples, Techniques, and Defence intermediate 23 min read | Library for understanding and defending against prompt injection, a critical LLM security vulnerability. It details direct and indirect injection techniques, including examples like DAN jailbreaks, EchoLeak (CVE-2025-32711), and webpage poisoning attacks, as reported by OWASP, NCSC, and Anthropic. This resource provides practical defense strategies and highlights the inherent challenges in distinguishing trusted instructions from untrusted data within LLM architectures. |
| 2026-04-10 2026 | Indirect Prompt Injection: The Hidden Threat intermediate 16 min read | Library for understanding and defending against indirect prompt injection, a vulnerability where hidden instructions within ingested data (webpages, PDFs, emails, code) can hijack AI reasoning or tool actions. It details real-world incidents like the Perplexity Comet leak and CVE-2025-59944, highlighting how agentic AI amplifies risk. Mitigation requires architectural changes, not prompt tuning, focusing on trust boundaries, context isolation, and output verification. |
| 2026-04-10 2026 | AI Agent Security in 2026: Prompt Injection and Memory Poisoning intermediate 12 min read | Library for understanding AI agent security risks, focusing on prompt injection and memory poisoning attacks. It details indirect prompt injection's impact via emails and documents, exemplified by CVE-2025-32711, and memory poisoning attacks like MemoryGraft, where agents develop false beliefs. The library also covers tool misuse through hidden instructions in metadata, misleading examples, and permissive schemas, observed in frameworks like CrewAI and AutoGen, and discusses supply chain vulnerabilities where agents fetch runtime dependencies without human review. |
| 2026-04-10 2026 | Prompt Injection Attacks in 2025: Vulnerabilities and Defense beginner 10 min read | Library for defending against prompt injection attacks, a significant threat to AI applications highlighted by CVE-2025-32711 and techniques like "EchoLeak." It addresses direct, indirect, and agentic injection methods, including those targeting LangChain with CVE-2025-68664 ("LangGrinch") and demonstrations against Gemini. The library supports defenses like input validation with pattern matching and structured prompt architecture using randomized delimiters, drawing insights from tools like Lakera Guard and Microsoft Prompt Shields. |
| 2026-04-10 2026 | Prompt Injection: The Most Common AI Exploit in 2025 beginner 7 min read | Library detailing prompt injection, the most common AI exploit in 2025, which manipulates AI instructions rather than code. It categorizes attacks into direct, indirect, jailbreak, and cross-plugin poisoning, highlighting risks to enterprise RAG systems and SaaS security operations. The resource emphasizes robust AI agent identity, authorization, continuous monitoring with anomaly detection, and integrating AI security telemetry into existing SIEM infrastructure, aligning with frameworks like NIST AI RMF and ISO/IEC 42001. |
| 2026-04-10 2026 | AI Prompt Injection Attacks: How They Work (2026) beginner 12 min read | Library for defending against AI prompt injection attacks, detailing their evolution from academic curiosities to operational threats with documented cases affecting OpenAI's GPT models and Anthropic's Claude. It covers attack mechanisms like "instruction confusion," evolving vectors such as encoding-based and multi-turn conversation attacks, and real-world incidents like the OpenClaw vulnerability, demonstrating data exfiltration and financial losses totaling $2.3 billion globally in 2025. The library addresses insufficient input sanitization, overprivileged AI agents, and a lack of output validation, highlighting detection gaps where current methods catch only 23% of sophisticated attempts. |
| 2026-04-10 2026 | LLM Security Risks in 2026: Prompt Injection, RAG, and Shadow AI beginner 30 min read | Library for mitigating LLM security risks, including prompt injection, RAG data poisoning, and autonomous exploits like EchoLeak demonstrated against Microsoft 365 Copilot. It addresses the blurred line between data and instructions, AI outputs triggering actions, and the human element in vulnerabilities, emphasizing containment strategies like limiting AI privileges and validating outputs. |
| 2026-04-09 2026 | Claude Code security settings nobody told you about beginner | Claude Code security settings nobody told you about |
| 2026-04-09 2026 | LangChain Langflow LiteLLM: When AI's Foundation Code Becomes the Attack Surface intermediate 13 min read | Library of vulnerabilities impacting foundational AI frameworks like LangChain, LangGraph, Langflow, and LiteLLM, including path traversal (CVE-2026-34070), serialization injection (CVE-2025-68664), SQL injection (CVE-2025-67644), and remote code execution (CVE-2026-33017). The article also details a supply chain attack on LiteLLM via a compromised Trivy security scanner, highlighting the systemic risks in AI infrastructure. → securityboulevard.com |
| 2026-04-09 2026 | Is 46% of your AI-generated code vulnerable? beginner 4 min read | Platform for securing AI-generated code, addressing research showing 46% of AI code contains vulnerabilities. It integrates Software Composition Analysis (SCA), Static Application Security Testing (SAST), and Dynamic Application Security Testing (DAST) directly into IDEs and LLMs like Gemini and GitHub Copilot, while also integrating with tools from Wiz, Snyk, and Black Duck. The platform emphasizes continuous governance throughout the Software Development Life Cycle (SDLC) and maintains the necessity of human oversight for final code acceptance and remediation. → techzine.eu |
| 2026-04-09 2026 | Claude Code Can Be Manipulated via CLAUDE.md to Run SQL Injection Attacks intermediate 2 min read | Library that allows manipulation of Claude Code via CLAUDE.md files to automate SQL injection attacks and steal credentials. Researchers at LayerX discovered that by adding three lines of basic English to the CLAUDE.md file, Claude Code's safety guardrails can be bypassed, leading it to execute unauthorized commands and perform actions such as login bypass and database dumping using techniques like SQL injection. The AI trusts the instructions within the CLAUDE.md file implicitly, creating a significant attack surface. → hackread.com |
| 2026-04-08 2026 | theNET | De-risking the AI rollout intermediate 6 min read | Library for de-risking AI rollouts, providing probabilistic security to address novel threats like prompt injection, data poisoning, and denial-of-wallet attacks. It emphasizes model-agnostic, inline protection, input/output monitoring, observability, and integration with traditional application security to safeguard AI-powered applications against deterministic and unpredictable attack paths. |
| 2026-04-08 2026 | AI Security Risks: How Enterprises Manage LLM Shadow AI and Agentic Threats intermediate 12 min read | Library for AI Security Posture Management (AISPM) designed to provide enterprises with visibility and control over LLM shadow AI and agentic threats. It addresses risks including prompt injection, jailbreaking, data poisoning, and data leakage from unsanctioned AI tools. The library focuses on the emerging threat landscape of agentic AI, where autonomous systems can execute multi-step actions, and highlights the critical risk of Agent Goal Hijacking as outlined in the OWASP Agentic Top 10. → securityboulevard.com |
| 2026-04-06 2026 | Best AI Security Tools in 2026 beginner 11 min read | Platforms for AI security are ranked by their coverage of three critical phases: discovering AI assets and mapping threat graphs (Phase 1), conducting adversarial testing against live applications and RAG pipelines (Phase 2), and deploying runtime guardrails calibrated from red teaming results (Phase 3). Repello AI offers full-lifecycle coverage with its Inventory, ARTEMIS, and ARGUS products. HiddenLayer focuses on model artifact scanning and runtime model anomaly detection. Mindgard provides automated multimodal AI security testing, primarily for Phase 2. Lakera, now part of Check Point, specialized in runtime guardrails for LLM applications. |
| 2026-04-06 2026 | OWASP Top 10 for Agents 2026 beginner 10 min read | Framework for assessing OWASP Agentic AI (ASI) Top 10 2026 risks, including Agent Goal Hijack (ASI01), Tool Misuse & Exploitation (ASI02), and Agent Identity & Privilege Abuse (ASI03). It addresses vulnerabilities introduced by autonomous agents' reasoning, memory, tool integration, and multi-step execution, detecting issues like unexpected code execution (ASI05) and insecure inter-agent communication (ASI07). The framework integrates with DeepTeam's red teaming capabilities for programmatic risk assessment. |
| 2026-04-06 2026 | Google Workspace's Continuous Approach to Mitigating Prompt Injection intermediate 5 min read | Library detailing Google Workspace's continuous approach to mitigating Indirect Prompt Injection (IPI) attacks against Gemini. It outlines proactive strategies including human and automated red-teaming, the AI Vulnerability Rewards Program, and public attack monitoring to discover and catalog new vulnerabilities. The library emphasizes ongoing defense refinement through deterministic and ML-based defenses, LLM prompt engineering, and Gemini model hardening, utilizing synthetic data generation via Simula for robust attack variant expansion and defense model retraining. |
| 2026-04-06 2026 | Prompt Injection Attacks in LLMs: What Developers Need to Know in 2026 beginner 6 min read | Guide on prompt injection attacks in LLMs, detailing how attackers manipulate models using natural language to override system instructions. It covers direct (jailbreaking) and indirect injection, citing examples like the Chevrolet dealership GPT and Perplexity Comet credential theft incidents. Developers are advised to implement architectural separation of instructions, conversation token limits, input filtering, AI guardrails, and developer training to mitigate these risks. |
| 2026-04-05 2026 | LangChain LangGraph Flaws Expose Files Secrets Databases in Widely Used AI Frameworks intermediate 2 min read | Library vulnerabilities in LangChain and LangGraph, specifically CVE-2026-34070 (path traversal), CVE-2025-68664 (deserialization of untrusted data), and CVE-2025-67644 (SQL injection), allow attackers to access arbitrary files, steal API keys and environment secrets, and manipulate SQL queries. These flaws, impacting widely used LLM application frameworks, have been patched in recent versions of langchain-core and langgraph-checkpoint-sqlite. → thehackernews.com |
| 2026-04-04 2026 | Detecting and analyzing prompt abuse in AI tools intermediate 5 min read | Playbook detailing detection, investigation, and response to AI prompt abuse. It covers direct prompt overrides, extractive prompt abuse against sensitive inputs, and indirect prompt injection, including the HashJack technique affecting AI summarization tools via URL fragments. This guide leverages Microsoft security tools like Defender for Cloud Apps, Purview DLP, Microsoft Entra ID conditional access, and Microsoft Sentinel to monitor AI interactions and protect against manipulation. → microsoft.com |
| 2026-04-03 2026 | Prompt Injection and LLM Jailbreaks: Defenses intermediate 7 min read | Survey of prompt injection and LLM jailbreak defenses, addressing risks in generative AI and agentic workflows. It differentiates between instruction hijacking and policy evasion, detailing why modern long-context and tool-using systems amplify attack impact. The survey outlines common attack patterns like instruction override and hidden instructions, then proposes layered defenses including inference-time filtering, independent guardrails, model-level hardening techniques like salting, and secure architectural controls for tool-using systems. → blockchain-council.org |
| 2026-04-03 2026 | Training an AI agent to attack LLM applications like a real adversary advanced 3 min read | Tool that simulates adversarial attacks against LLM-powered applications. This AI pentesting agent autonomously chains techniques like prompt injection, indirect prompt injection, and tool abuse to uncover vulnerabilities missed by traditional scanners. It gathers application context, probes role-based access control, and supports models from OpenAI, Anthropic, and open-source providers, integrating into CI/CD pipelines for continuous testing. Novee Security's agent is trained on real-world vulnerability research, including findings like arbitrary code execution in the Cursor coding assistant. → helpnetsecurity.com |
| 2026-04-03 2026 | Prompt Injection Attacks in LLMs: Vulnerabilities, Exploitation & Defense intermediate | Prompt Injection Attacks in LLMs: Vulnerabilities, Exploitation & Defense |
| 2026-04-03 2026 | How AI Red Teaming Fixes Vulnerabilities in Your AI Systems intermediate 9 min read | Library for AI Red Teaming provides a practical playbook for CISOs and AI leaders to test AI systems, including LLMs and chatbots, for vulnerabilities before deployment. It simulates attacks and misuse to identify weaknesses across prompts, data, and agent interactions, addressing risks like prompt injection, data leakage, and abuse of model autonomy. This method moves beyond isolated model testing to system-wide evaluation in operational settings, aligning with frameworks like MITRE ATLAS, EU AI Act, and NIST's AI Risk Management Framework to ensure safe and compliant AI use. |
| 2026-04-03 2026 | What Is Prompt Injection in AI? Examples & Prevention | EC-Council beginner 7 min read | Library for defending against prompt injection attacks, a technique where attackers manipulate AI systems through malicious instructions embedded in prompts. This resource details direct and indirect injection methods, citing real-world vulnerabilities like CVE-2025-53773 affecting GitHub Copilot and ChatGPT's Azure backdoor. It also highlights attacks against Google Jules and Devin AI, emphasizing the enterprise-wide compromise risks due to AI access to sensitive data and infrastructure. Mitigation strategies include zero-trust AI architecture, strict privilege separation, real-time threat detection, human-in-the-loop approvals, and continuous red teaming. |
| 2026-04-03 2026 | Prompt Injection Attacks in 2025: Risks, Defenses & Testing intermediate 3 min read | Library for detecting and mitigating prompt injection attacks in LLM-powered applications. It focuses on adversarial input testing, prompt isolation analysis, output validation, and workflow abuse simulation to uncover risks missed by traditional security tools. The library addresses how malicious instructions can manipulate model behavior, spread through trusted content, and create business-level impact, emphasizing that prompt injection is a trust problem at the intersection of application logic, content ingestion, and workflow design. |
| 2026-04-03 2026 | Red Teaming the Mind of the Machine: Evaluation of Prompt Injection and Jailbreak Vulnerabilities intermediate 15 min read | Survey of prompt injection and jailbreak vulnerabilities against state-of-the-art LLMs including GPT-4, Claude 2, Mistral 7B, and Vicuna. This research categorizes over 1,400 adversarial prompts and analyzes their success rates, generalizability, and construction logic, drawing from public repositories and forums. The study also proposes layered mitigation strategies and recommends a hybrid red-teaming and sandboxing approach for robust AI security, noting prompt injection as a critical vulnerability identified by OWASP. → arxiv.org |
| 2026-04-03 2026 | Practical LLM Security Advice from the NVIDIA AI Red Team intermediate 5 min read | Library summarizing NVIDIA AI Red Team findings, detailing common LLM application vulnerabilities. It addresses risks like remote code execution (RCE) from executing LLM-generated code (e.g., via `exec` or `eval`), insecure permissions in Retrieval-Augmented Generation (RAG) data stores leading to data leakage and prompt injection, and data exfiltration through active content rendering of Markdown or hyperlinks. Mitigation strategies include sandboxing dynamic code, rigorously managing RAG permissions, and sanitizing LLM output. |
| 2026-04-03 2026 | OWASP Top 10 for LLMs 2025 | DeepTeam Red Teaming Framework beginner 7 min read | Framework integrating OWASP Top 10 for LLMs 2025 risks, including Prompt Injection (LLM01), Sensitive Information Disclosure (LLM02), Supply Chain (LLM03), Data and Model Poisoning (LLM04), Improper Output Handling (LLM05), Excessive Agency (LLM06), System Prompt Leakage (LLM07), and Vector and Embedding Weaknesses (LLM08). It facilitates detection of vulnerabilities in RAG systems and autonomous agents through programmatic assessment or the Confident AI platform. |
| 2026-04-03 2026 | Continuously Hardening ChatGPT Against Prompt Injection | OpenAI intermediate | Continuously Hardening ChatGPT Against Prompt Injection | OpenAI |
| 2026-04-03 2026 | Red Teaming LLMs Exposes a Harsh Truth About the AI Security Arms Race news | Red Teaming LLMs Exposes a Harsh Truth About the AI Security Arms Race |
| 2026-04-03 2026 | LLM01:2025 Prompt Injection | OWASP Gen AI Security beginner 6 min read | Reference detailing LLM01:2025 Prompt Injection, a vulnerability where user prompts unintendedly alter Large Language Model behavior. The OWASP Gen AI Security resource covers direct and indirect injections, including scenarios like CVE-2024-5184 exploitation in email assistants and multimodal attacks. It outlines mitigation strategies such as constraining model behavior, input/output filtering, and adversarial testing, emphasizing that while prevention is challenging, impact reduction is achievable. → genai.owasp.org |
| 2026-04-03 2026 | AI Security Projects for Practice: 10 Hands-On Labs beginner 7 min read | Labs provide hands-on practice with prompt injection, including direct and indirect attacks, excessive agency, and tool invocation risks, as well as data poisoning techniques like label-flipping and backdoor trigger injection. These projects are crucial for understanding and mitigating threats outlined in the OWASP LLM Top 10 and MITRE ATLAS, covering offensive strategies and defensive hardening across various AI system components, from preprocessing to model integrity checks and DevSecOps pipelines. → blockchain-council.org |
| 2026-04-03 2026 | AI Security Roadmap: From Basics to Model Defense beginner 8 min read | Reference outlining a structured AI security roadmap, progressing from fundamentals to model defense. It highlights unique threats like prompt injection and data poisoning, and maps learning paths to frameworks such as OWASP Top 10 for LLMs, NIST AI RMF, and MITRE ATLAS. The guide also details practical tooling patterns like AI Security Posture Management (AI-SPM) and adversarial testing tools such as Microsoft Counterfit and IBM Adversarial Robustness Toolbox. → blockchain-council.org |
| 2026-04-03 2026 | AI Security Certification Guide for 2026 beginner 8 min read | Guide to AI security certifications for 2026, detailing credentials for technical, governance, and audit roles. It highlights the growing importance of AI-specific risks like prompt injection and data leakage, and aligns certifications with frameworks such as OWASP LLM Top 10, NIST AI RMF, MITRE ATLAS, SAIF, and ISO/IEC 42001. The guide emphasizes hands-on assessment and explains how to choose the right credential based on role fit, framework alignment, cost, and industry recognition. → blockchain-council.org |
| 2026-04-02 2026 | Guarding LLMs With a Layered Prompt Injection Representation intermediate 11 min read | Library for LLM security that learns a low-dimensional latent representation of prompt injection attacks. This approach complements perplexity-based filtering and achieves high precision and recall by training a classifier on features derived from this learned representation, distinguishing benign prompts from adversarial ones. → trendmicro.com |
| 2026-04-02 2026 | Auditing the Gatekeepers: Fuzzing "AI Judges" to Bypass Security Controls intermediate 6 min read | Tool for fuzzing AI judges, called AdvJudge-Zero, exploits prompt injection vulnerabilities in LLM-based security gatekeepers. This fuzzer identifies stealthy control tokens, such as formatting symbols and structural phrases, that manipulate the AI's decision-making logic to bypass safety policies and allow prohibited content, or corrupt training data by awarding high scores to incorrect responses. The research demonstrates a 99% success rate in bypassing controls across various LLM architectures, highlighting the need for adversarial training to harden these systems. → unit42.paloaltonetworks.com |
| 2026-04-02 2026 | AI Security for Apps is now generally available news 6 min read | Library for securing AI-powered applications, generally available, offering discovery of AI endpoints, detection of prompt injection and PII exposure, and mitigation via WAF rules. New features include custom topic detection and free AI endpoint discovery for all Cloudflare customers, with expanded integrations with IBM and Wiz for unified security posture management. It addresses risks cataloged in the OWASP Top 10 for LLM Applications, such as prompt injection and sensitive data leakage, by analyzing prompt and output behavior rather than fixed operations. |
| 2026-03-15 2026 | mukul975/Anthropic-Cybersecurity-Skills: 734+ structured cybersecurity skills for AI agents · MITRE ATT&CK mapped · agentskills.io standard · Claude Code, Copilot, Codex CLI, Cursor, Gemini CLI beginner 6 min read | Library of 754 structured cybersecurity skills designed for AI agents, mapped to MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, MITRE D3FEND, and NIST AI RMF. This community project provides production-grade workflows for tasks including memory forensics with Volatility3, Kerberoasting detection via Sigma rules, and cloud breach scoping, enabling AI to perform expert-level investigations across platforms like Claude Code, GitHub Copilot, and Gemini CLI. |
| 2026-03-14 2026 | Teaching Claude Everything You've Hacked intermediate 5 min read | Library that syncs HackerOne bounty history to a local SQLite database and integrates with AI assistants like Claude via the Model Context Protocol (MCP). It cross-references your personal reports and publicly disclosed bounty-awarded reports against target scopes, identifying overlooked areas and profitable weakness types. This tool also includes a database of community-submitted reports and enables Claude to access and reason over your bounty data, assisting in strategy and discovery. |
| 2026-03-12 2026 | Needle in the haystack: LLMs for vulnerability research intermediate 24 min read Bug Bounty | Library for using LLMs in vulnerability research, focusing on minimal scaffolding for effective code auditing. It highlights the problem of context rot in large language models, demonstrating how overly broad prompts and excessive context lead to missed vulnerabilities. Instead, the approach emphasizes creating a targeted threat model derived from previous CVEs and specific entry points to guide LLMs toward discovering nuanced issues, as seen in its case study with Claude Opus and Firefox. |
| 2026-03-12 2026 | PatrikFehrenbach/h1-brain: MCP server that connects AI assistants to HackerOne for bug bounty hunting intermediate 3 min read Bug Bounty | Library for connecting AI assistants to HackerOne bug bounty programs. It ingests personal bug bounty history, program scopes, and report details into a local SQLite database, and also includes a pre-built database of over 3,600 publicly disclosed bounty-awarded HackerOne reports. The core `hack(handle)` tool generates comprehensive attack briefings by combining personal data with community vulnerability write-ups, weakness types, and bounty amounts, suggesting attack vectors against untouched assets. |
| 2026-03-09 2026 | GitHub - eliasbiondo/linkedin-mcp-server: 🔗 A Model Context Protocol (MCP) server for LinkedIn — search people, companies, and jobs, scrape profiles, and get structured data via any MCP-compatible AI client. intermediate 3 min read | Library for accessing LinkedIn data via a Model Context Protocol (MCP) server. It enables searching for people, companies, and jobs, scraping detailed profiles with granular section control (main profile, experience, education, contact info, interests, honors, languages, posts, recommendations), and retrieving structured JSON output. Built with FastMCP and Patchright, it supports both stdio and HTTP transports for various AI client integrations, with session persistence and configurable browser automation settings. |
| 2026-03-08 2026 | How I use LLMs For Security Work: Part 2 intermediate 12 min read Bug Bounty | Library for leveraging Large Language Models (LLMs) in security work, focusing on advanced patterns beyond basic prompting. It details concepts like Agents, Skills (SKILLS.md), Workflows, and Assistants, emphasizing the critical role of providing precise context through documentation, requirements, and decision-making parameters. The article illustrates how well-defined prompts with explicit instructions and expected outputs, as opposed to vague requests, significantly improve LLM inference for tasks like automating browser profile management for threat hunting. |
| 2026-03-01 2026 | gadievron/raptor: Raptor turns Claude Code into a general-purpose AI offensive/defensive security agent. By using Claude.md and creating rules, sub-agents, and skills, and orchestrating security tool usage, we configure the agent for adversarial thinking, and perform research or attack/defense operations. intermediate 7 min read AuthZ | Framework turning Claude Code into an autonomous AI security agent, RAPTOR orchestrates static analysis, binary analysis, LLM-powered vulnerability validation, exploit generation, and patch writing. It employs Semgrep and CodeQL for scanning, using Z3 for dataflow and one-gadget constraint analysis to improve exploit feasibility. RAPTOR supports customizable LLM analysis dispatchers and offers project management features for organized research and reporting. |
| 2026-02-25 2026 | hexsecteam/HexSecGPT: HexSecGPT is designed to provide powerful, unrestricted, and seamless AI-driven conversations, pushing the boundaries of what is possible with natural language processing. beginner 3 min read | Framework for AI-driven conversations that pushes natural language processing boundaries, utilizing third-party APIs from OpenRouter or DeepSeek with a specialized system prompt. This open-source wrapper demonstrates a proof-of-concept, offering a glimpse of HexSecGPT's capabilities through a command-line interface on platforms like Kali Linux, Ubuntu, and Termux. Users can obtain API keys from OpenRouter or DeepSeek for integration. The framework includes installation scripts and a model discovery script for managing API provider model availability. |
| 2026-02-23 2026 | ottosulin/awesome-ai-security: A collection of awesome resources related AI security beginner 17 min read | Library of curated resources covering AI security, including frameworks, standards, learning materials, and open-source tools. It details attack techniques, defense strategies, benchmarks, and specific vulnerabilities, referencing OWASP LLM Top 10, NIST AIRC, MITRE ATLAS, and tools like garak and promptfoo for vulnerability scanning and prompt injection testing. The collection also highlights resources for understanding adversarial attacks such as evasion, poisoning, extraction, and inference, mentioning libraries like Adversarial Robustness Toolkit (ART), cleverhans, and foolbox. |
| 2026-02-21 2026 | samugit83/redamon: An AI-powered agentic red team framework that automates offensive security operations, from reconnaissance to exploitation to post-exploitation, with zero human intervention. advanced 19 min read Recon | Framework that autonomously orchestrates offensive security operations from reconnaissance to post-exploitation, integrating AI agents for vulnerability validation via Hydra, privilege escalation exploits, and XSS mapping. It logs findings in a Neo4j knowledge graph, then utilizes a CypherFix AI triage agent to deduplicate and rank vulnerabilities. A subsequent CodeFix agent clones repositories, applies targeted fixes using 11 code-aware tools, and submits a GitHub pull request for review. |
| 2026-02-20 2026 | Microsoft says bug causes Copilot to summarize confidential emails news 2 min read | Advisory regarding a Microsoft 365 Copilot bug where confidential emails were summarized, bypassing data loss prevention policies. This issue, tracked under CW1226324 and detected January 21, affected the Copilot "work tab" chat feature, incorrectly processing emails in Sent Items and Drafts, even those with confidentiality labels. Microsoft confirmed a code error as the root cause and began rolling out a fix in early February, with remediation continuing for complex service environments. → bleepingcomputer.com |
| 2026-02-18 2026 | anthropics/prompt-eng-interactive-tutorial: Anthropic's Interactive Prompt Engineering Tutorial beginner 2 min read | Tutorial on prompt engineering for Claude, teaching basic prompt structure, failure modes, Claude's capabilities, and building complex prompts for use cases like chatbots, legal, and financial services. It includes an interactive playground for practice, exercises, an answer key, and an appendix covering chaining prompts, tool use, and search/retrieval, recommending the Claude for Sheets extension for user-friendliness. |
| 2026-02-17 2026 | vxcontrol/pentagi: ✨ Fully autonomous AI Agents system capable of performing complex penetration testing tasks advanced 57 min read Recon | Tool for fully autonomous AI-powered penetration testing, PentAGI leverages a team of specialized agents and integrates professional security tools like nmap, metasploit, and sqlmap within a secure Docker environment. It features a smart memory system, knowledge graph integration with Neo4j, and external search capabilities via Tavily, Perplexity, and Google Custom Search, with comprehensive monitoring and reporting through Grafana and PostgreSQL. |
| 2026-02-16 2026 | How I Built a 5-Path AI “Recon Beast” with n8n and Gemini (2026 Guide) intermediate Bug Bounty Recon | In 2026, the bug bounty landscape requires more than just speed, with AI enhancing attacker capabilities. The article discusses building a 5-Path AI "Recon Beast" using n8n and Gemini. This innovative approach leverages automation and AI to enhance reconnaissance processes for bug bounty hunting. The focus is on utilizing technology to improve efficiency and effectiveness in identifying vulnerabilities. |
| 2026-02-11 2026 | Thread by @firt on Thread Reader App advanced 1 min read | Library updates detail the early preview of Chrome's WebMCP, enabling AI agents to query and execute services via imperative or declarative APIs. It also highlights Safari/WebKit's unanswered community questions, contrasting with Chrome's PWA installation on Windows 7, 8.x, and 10, which features a distinct "Install" verb and a similar UX to Chromebook PWAs. |
| 2026-02-11 2026 | SILENTCHAIN AI - AI-Powered Security Testing intermediate 2 min read Burp | Library for AI-powered offensive security, covering web applications, source code, and network infrastructure. Features include OWASP Top 10 detection via a Burp Suite extension, standalone web application scanning with CI/CD integration, and AI-powered static code analysis with PoC generation. It integrates with five AI providers, including local Ollama support, and utilizes a RAG Knowledge Engine with over 80,000 security documents. Products offer cross-product correlation for finding escalation, WAF detection and evasion for 25+ types, and out-of-band testing for XSS, SSRF, and XXE. |
| 2026-02-10 2026 | Ed1s0nZ/CyberStrikeAI: CyberStrikeAI is an AI-native security testing platform built in Go. It integrates 100+ security tools, an intelligent orchestration engine, role-based testing with predefined security roles, a skills system with specialized testing skills, and comprehensive lifecycle management capabilities. intermediate 19 min read | Platform that leverages AI for automated security testing. It integrates over 100 tools, including network scanners like nmap, web scanners such as sqlmap, and vulnerability scanners like nuclei. The platform features an intelligent orchestration engine, role-based testing with predefined security roles, and a skills system for specialized testing. It supports conversational commands, attack-chain analysis, knowledge retrieval via RAG, and provides a dashboard for system status and vulnerability management. Integrations include a Burp Suite extension and chatbot capabilities for DingTalk and Lark. |
| 2026-02-07 2026 | Agent twitter client mcp beginner | Agent twitter client mcp |
| 2026-02-06 2026 | Claude Opus 4.6 Finds 500+ High-Severity Flaws Across Major Open-Source Libraries news 2 min read Bug Bounty | Library where Claude Opus 4.6 identified over 500 high-severity vulnerabilities in open-source projects like Ghostscript, OpenSC, and CGIF. The LLM demonstrated advanced code reasoning, finding flaws such as a missing bounds check in Ghostscript, a buffer overflow in OpenSC, and a heap buffer overflow in CGIF, even outperforming traditional fuzzers on complex logic-based bugs. → thehackernews.com |
| 2026-02-06 2026 | xalgord/AI-System-Prompts: XBot - Advanced AI Cybersecurity Agent | Gemini system prompt for automated penetration testing and security assessments intermediate Bug Bounty | Library for XBot, an advanced AI cybersecurity agent system prompt for Gemini AI, facilitating automated penetration testing and security assessments. It supports comprehensive vulnerability scanning, active exploitation, OWASP Top 10 and advanced web application security testing, source code analysis, network security, and detailed reporting with remediation guidance. The system prompt enables autonomous operation, multi-target scanning, and robust vulnerability detection on authorized systems. |
| 2026-02-02 2026 | depthfirst | 1-Click RCE To Steal Your Moltbot Data and Keys advanced 5 min read RCE Secrets | Library that identifies vulnerabilities in OpenClaw, formerly Moltbot, by analyzing its code for logic flaws. The system maps application lifecycle flows, flagging issues like blindly accepting gateway URLs which, when combined with other issues, can lead to a 1-click RCE exploit, CVE-2026-25253. This exploit allows attackers to steal data and keys by chaining a Cross-Site WebSocket Hijacking vulnerability with API calls to disable security features. |
| 2026-02-02 2026 | skills/plugins/insecure-defaults/skills/insecure-defaults/SKILL.md at main · trailofbits/skills intermediate 3 min read | Library for identifying fail-open vulnerabilities in applications, distinguishing exploitable defaults from crash-safe patterns. It aids in security audits by reviewing code, deployment configurations, and IaC templates for issues like fallback secrets, hardcoded credentials, weak defaults in authentication and CORS, insecure crypto algorithms such as MD5 and ECB, and exposed debug features. The library emphasizes analyzing production-reachable code and tracing execution paths to determine runtime behavior and assess the criticality of findings. |
| 2026-02-01 2026 | Prompt Injection Toolkit: 25 Payloads & Techniques for Mastering AI Pentesting intermediate Bug Bounty | Prompt Injection Toolkit: 25 Payloads & Techniques for Mastering AI Pentesting Ever tried breaking an AI chatbot with a ‘please ignore all previous instructions’ prompt, only to realize it’s … |
| 2026-01-28 2026 | insaaniManav/prompt-forge: AI prompt engineering workbench for crafting, testing, and systematically evaluating prompts with powerful analysis tools. intermediate 2 min read | Workbench for AI prompt engineering that generates, analyzes, and systematically tests prompts, featuring smart generation with AI suggestions, advanced analysis for optimization feedback, and systematic evaluation creating comprehensive test suites for robustness, safety, accuracy, and creativity. It supports multiple models including Claude 3.5 Sonnet, GPT-4.1, Azure OpenAI, and Ollama, with organized version control and detailed execution history. |
| 2026-01-27 2026 | Hunting Account Takeovers in the Wild West of MCP OAuth Servers" intermediate 6 min read AuthN | Library that details critical OAuth misconfigurations in MCP (Model Context Protocol) servers, enabling one-click account takeover (ATO) attacks. Vulnerabilities include open Dynamic Client Registration (DCR) and missing redirect URI validation, allowing attackers to register malicious clients and intercept authentication codes. The research highlights findings from subdomain enumeration, endpoint discovery, and configuration analysis, focusing on misaligned security settings like unprotected DCR endpoints and unsupported PKCE enforcement. |
| 2026-01-25 2026 | Coding Agents. The Insider Threat You Installed Yourself beginner | Coding Agents. The Insider Threat You Installed Yourself Stop Running AI Coding Assistants Blindly AI coding agents are booming everywhere right now. Not only because they help you ship code faster … |
| 2026-01-23 2026 | GitHub - mholzen/workflowy: Powerful CLI and MCP server for WorkFlowy: reports, search/replace, backup support, and AI integration (Claude, LLMs) intermediate 3 min read | Tool for WorkFlowy, offering a CLI and MCP server. It enables AI integration with models like Claude and ChatGPT, alongside features for search, bulk replace, usage reports, and offline access via backup files. This Go-based application supports full-text search with regex, content transformation, and can pipe data through shell commands for LLM processing. Installation is available via Homebrew, Scoop, Go, or pre-built binaries. |
| 2026-01-22 2026 | AI’s Hacking Skills Are Approaching an ‘Inflection Point’ news 2 min read Bug Bounty | Library detecting federated GraphQL vulnerabilities; AI models are increasingly capable of finding zero-day bugs and complex system interactions, as demonstrated by RunSybil's Sybil tool and Dawn Song's CyberGym benchmark. Frontier models like Anthropic's Claude Sonnet 4.5 show significant improvements in vulnerability identification, highlighting the growing need for AI-assisted defense strategies and secure-by-design coding practices. → wired.com |
| 2026-01-18 2026 | harishsg993010/crossbow-agent: world's first Opensource fully Autonomous AI Security Engineer intermediate 3 min read | Library for an autonomous AI security engineer, "crossbow-agent," which finds and exploits vulnerabilities like hardcoded credentials, SQL injection, exposed admin panels, API key leaks, IDOR, command injection, session fixation, XSS, insecure file permissions, missing rate limiting, XXE, CORS misconfigurations, open redirects, JWT secret key leaks, NoSQL injection, SSRF, weak cryptography, race conditions, and directory traversal. It supports multiple AI models (GPT, Claude, Gemini) and integrates with OpenAI, Anthropic, or Google APIs. |
| 2026-01-16 2026 | trailofbits/skills: Trail of Bits Claude Code skills for security research, vulnerability detection, and audit workflows intermediate | Library of Claude Code skills from Trail of Bits, enhancing AI-assisted security analysis, vulnerability detection, and audit workflows. This marketplace provides codex-native skill discovery, allowing researchers to browse and install plugins locally or via a git clone. Contributions and bug reports are welcomed. |
| 2026-01-13 2026 | Securing AI Systems beginner | Course on securing AI systems, covering adversarial attacks, data poisoning, and model theft. It offers hands-on labs for implementing defenses, conducting red-team simulations, and evaluating weaknesses. You will learn threat modeling, vulnerability assessments, DevSecOps, and incident response within AI/ML workflows, cloud security, and MLOps. |
| 2026-01-11 2026 | Certified AI Security Professional - AI Security Certification - Practical DevSecOps beginner 9 min read | Library covering the Certified AI Security Professional (CAISP) certification, this resource details AI security fundamentals, Large Language Model (LLM) attacks, and defenses. It explores OWASP Top 10 LLM vulnerabilities like prompt injection and training data poisoning, along with AI-DevOps integration. Key attack tactics from MITRE ATT&CK and ATLAS are examined, alongside threat modeling methodologies and supply chain security for AI. Emerging threats, governance, and compliance are also addressed, including discussions on the EU AI Act and NIST RMF. |
| 2025-12-19 2025 | KeygraphHQ/shannon: Fully autonomous AI hacker to find actual exploits in your web apps. Shannon has achieved a 96.15% success rate on the hint-free, source-aware XBOW Benchmark. advanced 19 min read Bug Bounty | Library for fully autonomous, white-box AI pentesting of web applications and APIs. Shannon analyzes source code and executes real exploits, including Injection, XSS, SSRF, and Broken Authentication, to validate vulnerabilities before production. It leverages tools like Nmap and Subfinder, and can handle 2FA/TOTP logins with reproducible proof-of-concept exploits, achieving a 96.15% success rate on the XBOW Benchmark. |
| 2025-12-17 2025 | NVIDIA/garak: the LLM vulnerability scanner beginner 8 min read | Tool for scanning Large Language Models (LLMs), `garak` probes for vulnerabilities like hallucination, data leakage, prompt injection, misinformation, toxicity generation, and jailbreaks. It employs static, dynamic, and adaptive probes to identify weaknesses in LLMs accessible via Hugging Face Hub, Replicate, OpenAI API, AWS Bedrock, LiteLLM, and REST endpoints. `garak` helps assess LLM security by mimicking tools like nmap or Metasploit Framework for LLMs, reporting on failure rates and logging detailed run information. |
| 2025-12-13 2025 | Building an Open-Source AI-Powered Auto-Exploiter with a 1.7B Parameter Model: No Paid APIs Required advanced 13 min read | Library for building an open-source, AI-powered autonomous penetration testing agent. This system utilizes a 1.7 billion parameter qwen3:1.7b model, LangChain, and LangGraph for local execution, eliminating API costs and data exfiltration. It functions as a ReAct agent, independently scanning networks with Nmap, searching for exploits using searchsploit, mirroring them, analyzing code with `inspect_exploit_code`, setting up listeners with `start_listener`, and executing commands via `execute_shell_command` to achieve autonomous exploitation. |
| 2025-12-11 2025 | 📚 tl;dr sec 308 news Supply Chain | 😈 MCP Security, ☁️ AWS re:Invent Recaps, 🤖 Detecting Malicious Pull Requests with AI https://t.co/gt4zMQKZpp |
| 2025-12-05 2025 | GitHub - amaiya/onprem: A toolkit for applying LLMs to sensitive, non-public data in offline or restricted environments beginner 12 min read | Library for applying LLMs to sensitive, non-public data locally or in restricted environments. OnPrem.LLM, a Python toolkit inspired by privateGPT, offers full local execution with optional cloud provider integration (OpenAI, Anthropic). It features analysis pipelines for extraction, summarization, and Q&A, supports resource-constrained environments with SparseStore, and integrates with tools like Elasticsearch. Recent updates include an `AgentExecutor` for sandboxed AI agents and support for workflows and asynchronous prompts. |
| 2025-10-30 2025 | fr0gger/proximity: Proximity is a MCP security scanner powered with NOVA intermediate 3 min read Supply Chain | Library for scanning MCP (Model Context Protocol) servers and Agent Skills, Proximity uses NOVA rules to detect security issues like prompt injection and jailbreaks. It performs detailed analysis of server capabilities and skill structures, supporting MCP Spec 2025-11-25 and providing pattern-specific remediation guidance. |
| 2025-10-15 2025 | The MCP Security Tool You Probably Need - MCP Snitch intermediate 7 min read Supply Chain | Library implementing a proxy-based security model for MCP tools, offering a critical mediation layer until native MCP security primitives and platform-level fine-grained scoping are adopted. MCP Snitch intercepts tool calls, enforces user-defined whitelists for operations, and provides visibility and control, mitigating risks like those demonstrated by the GitHub MCP vulnerability. This approach prioritizes explicit allow-listing over deny-listing for robust access control. |
| 2025-10-14 2025 | AI For Hackers: Red Team Editions – Codelivly Resources beginner 2 min read | Manual for offensive AI tradecraft, this 1,100-page guide teaches red teams to build autonomous hacking agents. It covers AI-augmented reconnaissance, polymorphic payload generation using generative models, AI-driven vulnerability discovery with tools like CodeBERT and reinforcement learning fuzzers, and adaptive C2 frameworks. The resource includes 60+ labs, 500+ Python code examples, and methods for bypassing AI-based security with adversarial examples. |
| 2025-10-12 2025 | 5 Essential MCP Servers That Give Claude & Cursor Real Superpowers (2025) beginner | “” is published by Prithwish Nath in Artificial Intelligence in Plain English. |
| 2025-10-02 2025 | Offensive AI - Hacker Associate beginner 6 min read | Certification program merging traditional web pentesting with AI automation. This hands-on course teaches how to identify, exploit, and report vulnerabilities using GPT agents, LangChain, AutoGPT, and tools like Burp Suite and Turbo Intruder. Modules cover AI-powered reconnaissance, exploitation of access control and XSS, authentication bypass, API testing, automated reporting, and advanced agent development for WAF bypass, business logic flaws, and CI/CD pipeline analysis. It also delves into AI red teaming, adversarial AI testing, and prompt injection attacks. |
| 2025-08-22 2025 | Model Context Protocol (MCP): Understanding security risks and controls intermediate 7 min read | Library for securing Anthropic's Model Context Protocol (MCP), which connects LLMs to external tools. It addresses confused deputy vulnerabilities via OAuth, supply chain risks by requiring signed components and SAST/SCA in build pipelines, unauthorized command execution with input sanitization and sandboxing, prompt injection through user confirmation, and tool injection by enabling version pinning and modification notifications. The library also details mitigation for MCP sampling exploitation and emphasizes logging best practices. |
| 2025-08-13 2025 | AI Mastery for Cybersecurity Professionals beginner 5 min read Talks | Bundle of 10 EC-Council courses focused on applying AI to cybersecurity. This learning resource covers topics such as AI-driven threat detection, LLM pentesting, automated reconnaissance for bug bounty hunting using tools like Nuclei and HTTPX, and defending against generative AI threats like phishing and deepfakes. It aims to equip cybersecurity professionals with skills to automate detection, strengthen defenses, and enhance cyber intelligence. |
| 2025-04-30 2025 | #burp #pentest #ai #hackerassociate #cybersecurity #infosec… | Harshad Shah intermediate Burp Talks | Setting Up #Burp MCP Server on Claude Desktop #Pentest Modern App with #Ai ⇢ Learn how to set up a 𝗕𝘂𝗿𝗽 𝗠𝗖𝗣 𝗦𝗲𝗿𝘃𝗲𝗿 on your 𝗖𝗹𝗮𝘂𝗱𝗲 𝗱𝗲𝘀𝗸𝘁𝗼𝗽 in this easy-to-follow tutorial. ⇢ Get your server up and... |
| 2025-04-13 2025 | Building Your First Offensive Security MCP Server - Renae Schilg - Medium intermediate | So, you’ve read the primer here, you have a basic understanding of MCP servers and how they work and now you’re ready to build your own. We are going to be building a simple MCP server that performs… |
| 2025-04-09 2025 | Defensive Deception with Kong and Beelzebub LLM Honeypot intermediate | In today’s increasingly sophisticated cyber threat landscape, organizations need to move beyond traditional defensive measures. While firewalls, intrusion detection systems, and vulnerability… |
| 2025-03-24 2025 | Prompt Engineering Guide – Nextra beginner | Guide to prompt engineering, a new discipline for optimizing prompts to interact with and develop large language models (LLMs). This resource compiles the latest papers, advanced prompting techniques, learning guides, model-specific guides, lectures, references, new LLM capabilities, and tools, aiming to improve LLM safety and augment capabilities with domain knowledge and external tools. |
| 2025-02-25 2025 | GenAI with Python: Build Agents from Scratch (Complete Tutorial) beginner | Prompt Engineering is the practice of designing and refining prompts (text inputs) to enhance the behavior of Large Language Models (LLMs). The goal is to get the desired responses from the model by… |
| 2025-02-14 2025 | GitHub - microsoft/generative-ai-for-beginners: 21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/ beginner 2 min read | Library of 21 lessons for building Generative AI applications, covering concepts and code examples in Python and TypeScript. Lessons include Azure OpenAI Service, GitHub Marketplace Model Catalog, and OpenAI API, with "Keep Learning" sections. Basic Python or TypeScript knowledge is recommended, and a GitHub account is required for local cloning and contributions. Sparse checkout instructions are provided to reduce download size by excluding translations. |
| 2025-02-09 2025 | GitHub - potpie-ai/potpie: Prompt-To-Agent : Create custom engineering agents for your codebase intermediate 4 min read | Library for creating AI agents that reason about your codebase. Potpie transforms repositories into knowledge graphs stored in Neo4j, enabling agents to understand code context for debugging and feature development. It supports OpenAI, Ollama, and Anthropic LLM providers, with configurable authentication for GitHub repositories via GitHub Apps or Personal Access Tokens. The architecture includes a FastAPI API layer, Celery workers for asynchronous parsing, and a Neo4j knowledge graph as the core context provider. |
| 2025-02-09 2025 | GitHub - eastlondoner/cursor-tools: Give Cursor Agent an AI Team and Advanced Skills intermediate 33 min read | Library for extending AI coding assistants like Cursor Composer, Cursor, Claude Code, and Codex with advanced skills and an AI team. It integrates with Perplexity for web search and Gemini 2.0 for large context windows, enabling capabilities such as working with GitHub Issues and Linear, generating local documentation, analyzing YouTube videos, and operating web applications via Stagehand. The library offers a CLI for system-wide access and supports multiple AI providers including OpenAI, Anthropic, and OpenRouter. |
| 2025-01-30 2025 | Set Up Your Own Cybersecurity-Focused AI Development, Training, and Fine-Tuning Lab at Home intermediate | As AI applications rapidly evolve, commercial platforms like OpenAI, Gemini, and many other LLM versions are offering advanced capabilities… |
| 2025-01-24 2025 | GitHub - JasonLovesDoggo/caddy-defender: Caddy module to block IPs and prevent AIs from training on your website. intermediate 2 min read | Library for Caddy that blocks IP addresses and prevents AI training on websites. It supports IP range filtering, predefined ranges for services like OpenAI and GitHub Copilot, and custom ranges. Responders include blocking, custom messages, dropping connections, returning garbage data, redirection, rate limiting, and tarpitting. Installation is available via a pre-built Docker image. |
| 2025-01-10 2025 | SSH LLM Honeypot caught a real threat actor - Beelzebub Blog intermediate 3 min read | Library for configuring an SSH LLM honeypot using the Beelzebub framework. This resource details how a threat actor was caught downloading binaries with known exploits and attempting to join a botnet via an IRC channel. Analysis of the threat actor's actions, including IP address, credentials, and observed commands, is provided, along with steps to recreate the honeypot setup and details on the Perl script used for DDoS and C2 communication through Undernet IRC channels. |
| 2024-12-31 2024 | GitHub - browser-use/browser-use: Make websites accessible for AI agents intermediate 4 min read | Library for scalable, stealth-enabled browser automation. It enables coding agents like Cursor and Claude Code to interact with websites, supporting custom tools and offering both open-source and cloud-hosted agent options. The library provides a CLI for direct browser control and features optimized LLMs like ChatBrowserUse for faster, more accurate task completion. Production deployments are recommended for the cloud API due to its scalable infrastructure, proxy rotation, and captcha handling capabilities. |
| 2024-10-05 2024 | GitHub - fr0gger/Awesome-GPT-Agents: A curated list of GPT agents for cybersecurity beginner 11 min read | Library of curated GPT agents for cybersecurity, categorized for offensive and defensive applications. This community-driven resource lists various specialized agents, including MagicUnprotect for malware evasion, GP(en)T(ester) for pentesting, Threat Intel Bot for APT tracking, Vulnerability Bot for secure coding, SourceCodeAnalysis for code review, Web Hacking Wizard for web security education, CyberGPT for CVE details, MITREGPT for MITRE ATT&CK mapping, and AppSec Test Crafter for generating application security test cases in YAML. |
| 2024-08-28 2024 | Microsoft Copilot: From Prompt Injection to Exfiltration of Personal Information · Embrace The Red advanced 6 min read | Writeup detailing a Microsoft 365 Copilot vulnerability where prompt injection, automatic tool invocation, and ASCII smuggling were combined to exfiltrate personal information. The exploit chain leveraged malicious emails or shared documents to trigger Copilot's processing, enabling it to access and send sensitive data like emails and MFA codes to attacker-controlled domains via disguised hyperlinks. |
| 2023-12-05 2023 | pentestmuse-ai/PentestMuse intermediate 1 min read | Library for an AI assistant designed for cybersecurity professionals, Pentest Muse aids penetration testers in brainstorming, payload generation, code analysis, and reconnaissance. It offers both command-line and web application interfaces, supporting iterative task completion and direct command execution. Users can connect via managed APIs or integrate their own OpenAI API keys. |
| 2023-11-18 2023 | protectai/ai-exploits intermediate 3 min read | Library of exploits and Nuclei scanning templates for machine learning infrastructure vulnerabilities. This collection, including Metasploit modules and CSRF templates, addresses real-world attacks such as system takeovers and data loss, often without authentication. Vulnerabilities affect tools, libraries, and frameworks used in AI/ML model development, training, and deployment, with specific examples like Ray and MLflow being addressed. |
| 2023-11-09 2023 | https://chat.openai.com/g/g-6Bcjkotez-getpaths intermediate | https://ift.tt/fbJIsGN |
| 2023-06-25 2023 | Beginners guide to AI in cybersec. Hacking with ChatGPT. beginner | Beginners guide to AI in cybersec. Hacking with ChatGPT. https://ift.tt/UDRVtCp |
| 2023-05-18 2023 | The AI Attack Surface Map v1.0 advanced 8 min read | Framework for thinking about AI system attack surfaces, this resource maps components like AI Assistants, Agents, Tools, Models, and Storage. It highlights natural language as a primary attack vector, detailing techniques such as prompt injection against Agents and Tools to execute arbitrary commands or access sensitive data. Model attacks focus on subtle manipulation, while Storage vulnerabilities, particularly in Vector Databases, allow for data extraction and potential compromise of embeddings. The framework aims to clarify the evolving landscape of AI vulnerabilities beyond just machine learning models. → danielmiessler.com |
| 2023-05-09 2023 | How I Automate BugBounty Using Chatgpt intermediate Bug Bounty | How I Automate BugBounty Using Chatgpt https://ift.tt/93SQsPD |
| 2023-04-09 2023 | aress31/burpgpt intermediate 7 min read Burp | Library for integrating OpenAI's GPT models into Burp Suite for passive security vulnerability detection. BurpGPT analyzes web traffic by sending requests and responses to a specified OpenAI model, leveraging custom prompts for tailored analysis. It generates automated security reports, highlighting potential issues beyond traditional scanner capabilities, but requires professional triaging for false positives. The extension supports various OpenAI models and allows granular control over token usage and prompt length. It requires Burp Suite Professional or Community Edition (version 2023.3.2+) and JDK 11+. |
| 2023-04-02 2023 | SecGPT transforms cybersecurity through AI-driven insights. news | SecGPT transforms cybersecurity through AI-driven insights. https://ift.tt/4kTKfoJ |
| 2023-04-02 2023 | I Used GPT-3 to Find 213 Security Vulnerabilities in a Single Codebase intermediate | I Used GPT-3 to Find 213 Security Vulnerabilities in a Single Codebase https://ift.tt/FrMSdKx |
| 2023-04-02 2023 | HackGPT beginner | HackGPT https://ift.tt/JsIGRO1 |
| 2023-03-29 2023 | Microsoft Security Copilot is a new GPT-4 AI assistant for cybersecurity news 4 min read | Tool that uses GPT-4 and Microsoft's security-specific model to assist cybersecurity professionals. It synthesizes enterprise security incidents, analyzes files and code, and summarizes alerts from other security tools. Security Copilot draws from 65 trillion daily signals, CISA, NIST, and its own threat intelligence, offering a prompt book for automations and a collaborative workspace. It can also generate PowerPoint summaries of incidents and attack vectors. |
| 2022-02-03 2022 | Favorite tweet by @LeaKissner news | Favorite tweet: Nicolas Carlini's ML training data extraction attack talk at #Enigma2022 escalated quickly. https://t.co/C8kzAyq7lh — Lea Kissner (@LeaKissner) Feb 2, 2022 |
Frequently Asked Questions
- What is prompt injection?
- Prompt injection is an attack against applications that use large language models (LLMs). An attacker crafts input that overrides or manipulates the LLM's system instructions, causing it to perform unintended actions. Direct prompt injection targets the user input; indirect prompt injection embeds malicious instructions in data the LLM processes, such as emails or web pages.
- What is the OWASP Top 10 for LLM Applications?
- The OWASP Top 10 for LLM Applications identifies the most critical security risks for AI-powered applications, including prompt injection, insecure output handling, training data poisoning, model denial of service, supply chain vulnerabilities, sensitive information disclosure, insecure plugin design, excessive agency, overreliance, and model theft.
- How do you secure AI-integrated applications?
- Key practices include validating and sanitizing LLM outputs before rendering or executing them, implementing least-privilege access for AI agents, using guardrails to constrain model behavior, monitoring for prompt injection attempts, applying rate limiting, separating AI processing from privileged operations, and treating all LLM output as untrusted user input.
Weekly AppSec Digest
Get new resources delivered every Monday.