Backslash Security Reveals in New Research that GPT-4.1, Other Popular LLMs Generate Insecure Code Unless Explicitly Prompted

by Jon Seals | April 24, 2025

Addressing ‘vibe coding’ security gaps, Backslash to demo its MCP server and built-in rules for securing Agentic IDEs

TEL AVIV, Israel – Backslash Security, the modern application security platform for the AI era, today revealed that the most popular LLMs on the market produce insecure code by default, failing to address the most common weaknesses. When the models are prompted with additional security guidance or governed by rules, the security of the generated code improves greatly, though not equally across the different tools and versions. To address the risks of insecure AI code generation, Backslash is also announcing the debut of its Model Context Protocol (MCP) Server and of its Rules and Extension for Agentic IDEs such as Cursor, Windsurf, and GitHub Copilot in VS Code.

Backslash Security selected seven current versions of OpenAI’s GPT, Anthropic’s Claude, and Google’s Gemini to test how varying prompting techniques influenced their ability to produce secure code. Three tiers of prompting techniques, ranging from “naive” to “comprehensive,” were used to generate code for everyday use cases. Code output was measured by its resilience against 10 Common Weakness Enumeration (CWE) use cases. The results carried a common theme – secure code output rose with prompt sophistication, but all LLMs generally produced insecure code by default (a brief illustrative sketch follows the list below):

  • In response to simple, “naive” prompts, all LLMs tested generated insecure code vulnerable to at least 4 of the 10 common CWEs. Naive prompts merely asked to generate code for a specific application, without specifying security requirements.
  • Prompts that generally specified a need for security produced more secure results, while prompts that requested code that complied with Open Web Application Security Project (OWASP) best practices produced superior results, yet both still yielded some code vulnerabilities for 5 out of the 7 LLMs tested. 
  • Prompts that were bound to rules specified by Backslash to address the specific CWEs resulted in code that was secure and not vulnerable to the tested CWEs.
  • Overall, OpenAI’s GPT-4o had the lowest performance across all prompts, scoring a 1/10 secure code result using “naive” prompts. When prompted to generate secure code, it still produced insecure outputs vulnerable to 8 out of 10 issues. GPT-4.1 didn’t fare much better with naive prompts, scoring 1.5/10.
  • Among the GenAI tools, the best performer was Claude 3.7 Sonnet, scoring 6/10 using naive prompts and 10/10 with security-focused prompts.
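
To make the gap between prompt tiers concrete, here is a minimal, hypothetical sketch of the kind of difference security-aware prompting tends to produce for one common weakness class. SQL injection (CWE-89) is used purely as an assumed example; the press release does not list the 10 CWEs that were tested.

```python
import sqlite3

# What a "naive" prompt ("write a function that looks up a user by name")
# often yields: user input concatenated directly into SQL -- the classic
# SQL injection pattern (CWE-89).
def find_user_naive(conn: sqlite3.Connection, username: str):
    query = f"SELECT id, email FROM users WHERE name = '{username}'"  # vulnerable
    return conn.execute(query).fetchall()

# What a security-guided prompt ("...and follow OWASP best practices")
# is more likely to yield: a parameterized query, so the input can never
# be interpreted as SQL.
def find_user_parameterized(conn: sqlite3.Connection, username: str):
    query = "SELECT id, email FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()
```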

For AppSec to keep pace with the emerging “vibe coding” paradigm, in which developers tap AI to create code based on “feel” rather than formal planning, application security tools must ensure that LLMs generate safe, secure code. To address the issues revealed by its LLM prompt testing, Backslash is introducing several new features that immediately enable safe vibe coding. By controlling the LLM prompt, the Backslash platform can steer AI-generated code toward security from the outset, enabling true “security by design” for the first time. Backslash will debut the new capabilities at RSAC 2025:

  • Backslash AI Rules & Policies: Machine-readable rules (e.g., for Cursor) can be injected into prompts to ensure CWE coverage, while AI policies control which AI rules are active in IDEs via the Backslash platform.  
  • Backslash IDE Extension: IDE integration is key to serving developers where they work. The IDE extension enables developers to receive Backslash security reviews on code written by both humans and AI.
  • Backslash Model Context Protocol (MCP) Server: The context-aware API conforms to the MCP standard, connecting Backslash to AI tools to enable secure coding, scanning, and fixes. Through this connection, Backslash can answer questions like: Is this package vulnerable? Does this code expose a vulnerable package? What code needs to change to safely upgrade a package? (A hypothetical request sketch follows this list.)
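
To indicate what such a connection looks like on the wire, the sketch below shows the shape of a JSON-RPC 2.0 "tools/call" request, which is how MCP clients invoke server-exposed tools. The tool name and argument fields are hypothetical placeholders, not Backslash's published interface; they only illustrate the kind of question ("is this package vulnerable?") described above.

```python
import json

# The Model Context Protocol is built on JSON-RPC 2.0; clients invoke a
# server's tools via the "tools/call" method. Everything inside "params"
# below is illustrative only -- the tool name and arguments are NOT
# Backslash's actual API.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "check_package_vulnerability",  # hypothetical tool name
        "arguments": {
            "ecosystem": "npm",        # placeholder values
            "package": "example-lib",
            "version": "1.2.3",
        },
    },
}

# An agentic IDE would send this over the MCP transport (stdio or HTTP)
# and get back a structured result it can act on, such as a warning that
# the package version is vulnerable or a suggested safe upgrade.
print(json.dumps(request, indent=2))
```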

“For security teams, AI-generated code – or vibe coding – can feel like a nightmare,” said Yossi Pik, co-founder and CTO of Backslash Security. “It creates a flood of new code and brings LLM risks like hallucinations and prompt sensitivity. But with the right controls – like org-defined rules and a context-aware MCP server plugged into a purpose-built security platform – AI can actually give AppSec teams more control from the start. That’s where Backslash comes in, with dynamic policy-based rules, a context-sensitive MCP server, and an IDE extension built for the new coding era.”

See Backslash Security’s new blog post for full details about the AI prompt research: https://www.backslash.security/blog/can-ai-vibe-coding-be-trusted. 

Meet the Backslash Security team at the RSA Conference to see a live demonstration of Backslash MCP Server at booth ESE-52 from April 28 to May 1, 2025. To schedule a remote demo, sign up at https://www.backslash.security/demo.

About Backslash Security

Backslash Security offers a fresh approach to application security by creating a digital twin of your application, modeled into an AI-enabled App Graph. It filters “triggerable” vulnerabilities, categorizes security findings by business process, secures AI-generated code, and simulates the security impact of updates, using a fully agentless approach. Backslash dramatically improves AppSec efficiency, eliminating the frustration caused by legacy SAST and SCA tools. Forward-looking organizations use Backslash to modernize their application security for the AI era, shorten remediation time, and accelerate time-to-market of their applications. For more information, visit https://backslash.security.
