Check Point researchers have found that popular AI coding assistants are unintentionally leaking sensitive internal data, including API keys.
Standard development environments rely on strict rules. A file named .gitignore tells the version control system exactly which files to leave untracked and out of any commit, so passwords, local environment variables, and API keys stay on the local machine.
Generative coding assistants, however, do not read these files the way a traditional compiler or a Git client does. They ingest the entire workspace to build context. When the AI generates a snippet or suggests a code block, it routinely pulls from that ingested memory, regurgitating sensitive tokens into the production code.
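As a minimal sketch of the problem, consider a hypothetical context gatherer that walks the workspace and reads every file it finds. Nothing in this loop consults .gitignore, so a local .env file is ingested right alongside the source code. The function is illustrative, not a reconstruction of any specific assistant:

```python
import os

def gather_context(workspace: str) -> dict[str, str]:
    """Naive workspace ingestion: read every file under the project root.

    Unlike a Git client, this loop never consults .gitignore, so an
    environment file full of secrets is swept up with everything else.
    """
    context = {}
    for root, _dirs, files in os.walk(workspace):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    context[path] = f.read()
            except OSError:
                continue  # unreadable files are simply skipped
    return context
```

Once a secret sits in that context dictionary, every subsequent suggestion can draw on it.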
Language models integrated into code editors use advanced context-gathering techniques. To give the user accurate autocomplete suggestions, the tool must understand the broader project. To that end, it scans open tabs, adjacent files, and project directories. If a developer leaves an environment file open in a background tab, the AI reads it without hesitation.
When the developer switches back to the main application file and types a command to connect to a database, the AI dutifully volunteers the exact credentials it just read. The machine sees no difference between a public variable and a private password. Both are just text strings that mathematically fit the current pattern. The developer, operating at high speed and trusting the tool, hits the tab key to accept the autocomplete suggestion.
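The difference between the two outcomes fits in a few lines. In the safe pattern, the credential is resolved from the environment at runtime and never appears in source; the risky autocomplete equivalent inlines the literal the model saw in the open tab. All names and the placeholder value below are hypothetical:

```python
import os

def get_db_url() -> str:
    # Safe pattern: the password is resolved at runtime, so the source
    # file committed to the repository contains no secret at all.
    password = os.environ.get("DB_PASSWORD", "")
    return f"postgresql://app:{password}@db.internal:5432/app"

# The risky suggestion a developer might tab-accept instead inlines the
# literal, e.g.:
#   db_url = "postgresql://app:hunter2-placeholder@db.internal:5432/app"
# and that string then travels with every commit.
```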
Traditional data loss prevention tools look for anomalies in network traffic or scan repositories after a commit happens. Check Point’s findings highlight a massive vulnerability happening before the code even hits the repository. The AI operates within the developer’s local environment, inside the integrated development environment. It looks over the developer’s shoulder, absorbing every configuration file and environment variable in plain text.
Security policies depend on predictability. You write a rule, and the machine follows it. Generative artificial intelligence operates on probability, tearing down the very idea of predictable software development.
Check Point specifically noted that files designed to prevent leaks, such as .npmignore, fail to stop this behaviour. These configuration files tell package managers which directories to exclude when publishing software. The AI assistant does not execute the package manager; it generates the code that the package manager will eventually process. By the time the developer runs the publish command, the sensitive data is already baked into the core logic, completely bypassing the intended safeguard.
Steve Giguere, Principal AI Security Advocate at Check Point Software, commented: “Files like .npmignore and .gitignore exist for one main reason: don’t ship secrets. What this research highlights is that AI coding assistants are introducing entirely new ways those secrets can be created, stored, and accidentally exposed.
“Even when these safeguards are generated by AI, the system doesn’t yet understand how to protect itself, from itself. For organisations, the takeaway is simple: don’t assume AI-generated safeguards are correct just because they look right. Any files created for defensive purposes, like ignore rules or security configurations, should have a human in the loop to validate that they actually do what they’re intended to do.”
Enterprise IT departments procure these generative assistants under the assumption that vendor-provided guardrails will protect them. The software vendors promise enterprise-grade security, often citing encryption in transit and strict data retention policies. These features protect the data from being intercepted by external attackers during transmission, but they do absolutely nothing to stop the AI from injecting secrets directly into the company’s own source code.
Procurement teams are asking the wrong questions. They ask if the vendor will train future models on their proprietary code. They fail to ask how the tool handles high-entropy secrets locally before generating a response. This oversight allows vulnerable workflows to become standard operating procedure across entire engineering departments.
A leaked API key represents a direct line into corporate infrastructure. Threat actors constantly scrape public and private repositories looking for patterns that match AWS credentials, Stripe API keys, or OpenAI tokens. Once a generative assistant accidentally bakes a key into a commit, it takes seconds for automated scrapers to find it.
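The scraping step is mechanical: well-known credential formats have recognisable shapes, so a handful of regular expressions is enough to sweep a repository. The patterns below are deliberately simplified illustrations; production scanners use far larger rule sets:

```python
import re

# Illustrative, simplified credential shapes:
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),          # AWS access key ID
    re.compile(r"sk_live_[0-9a-zA-Z]{24,}"),  # Stripe live secret key
    re.compile(r"sk-[0-9a-zA-Z]{20,}"),       # OpenAI-style API key
]

def find_secrets(text: str) -> list[str]:
    """Return every substring that matches a known credential shape."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits
```

Automated scrapers run exactly this kind of matching against every new commit they can reach.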
Revoking and rotating a compromised key creates an operational nightmare. Engineering teams must halt production, trace every service attached to that credential, update the keys, and test the connections to ensure nothing breaks. A single errant suggestion accepted by a tired developer at the end of the day can trigger an exhaustive incident response protocol, costing thousands in lost engineering hours.
Mitigating the risk from AI coding assistants
CISOs face a severe problem here: governance cannot be enforced on a tool that operates outside their visibility. Most enterprise security frameworks assume human developers might make a mistake and build safety nets to catch those errors. They do not account for a machine agent actively extracting hidden credentials and placing them in plain view.
Fixing this requires a complete tear-down of how organisations view secure code practices. Relying on static exclusion files is no longer viable.
Security must integrate directly into the context window. Some enterprise-grade AI platforms are beginning to implement local secret redaction, scanning the workspace for high-entropy strings and masking them before the data ever reaches the language model. This approach keeps the keys out of the AI’s memory entirely. If the model cannot see the secret, it cannot leak the secret.
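One way such redaction can work is entropy screening: random-looking tokens carry far more Shannon entropy per character than natural language, so long high-entropy strings are masked before the buffer ever leaves the machine. The thresholds below are illustrative tuning knobs, not any vendor's actual values:

```python
import math
import re

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of the string."""
    probs = [s.count(c) / len(s) for c in set(s)]
    return -sum(p * math.log2(p) for p in probs)

def redact_high_entropy(text: str, threshold: float = 4.0,
                        min_len: int = 20) -> str:
    """Mask secret-shaped tokens before text is sent to a model.

    Any token that is long enough and sufficiently random-looking is
    replaced, so the model never sees the credential at all.
    """
    def mask(match: re.Match) -> str:
        token = match.group(0)
        if len(token) >= min_len and shannon_entropy(token) >= threshold:
            return "[REDACTED]"
        return token
    return re.sub(r"[A-Za-z0-9+/_\-]{16,}", mask, text)
```

Ordinary identifiers and prose pass through untouched; only the dense, key-like strings are stripped from the context.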
Organisations must also rethink peer review. Engineering culture currently views AI as a hyper-competent junior developer. Teams often wave through AI-generated code with less scrutiny than human-written code. The exact opposite should happen. Reviewers need to treat generative outputs as highly suspicious, looking specifically for hardcoded tokens and environment variables that the machine may have silently injected.
Automated secret scanning must run continuously in the local environment, not just at the repository level. Developers need disruptive alerts the moment a credential appears in an active editor window. Catching the leak at the commit stage is too late; the data has already been copied, pasted, and potentially synced to a remote server.
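A sketch of what that looks like in practice: an editor save hook that runs a detector the moment a file is written, rather than waiting for a commit. The hook and its alert are illustrative; any detector (regex rules, entropy checks, a commercial scanner) can be plugged in:

```python
import re

def check_on_save(path: str, scan) -> list[str]:
    """Run a secret detector immediately after a file is saved.

    `scan` is any callable that maps file text to a list of findings.
    A non-empty result is the editor's cue to raise a blocking alert
    before the content can be committed or synced anywhere.
    """
    with open(path, encoding="utf-8", errors="ignore") as f:
        findings = scan(f.read())
    if findings:
        print(f"ALERT: possible credential in {path}: {findings}")
    return findings

# Example detector: flag anything shaped like an AWS access key ID.
aws_key_scan = lambda text: re.findall(r"AKIA[0-9A-Z]{16}", text)
```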
The race to automate software engineering blinds organisations to the risks of handing context to a machine. We spend immense resources building walled gardens around our corporate infrastructure, only to give generative agents the keys to every gate.
Development teams will not surrender their coding assistants, as the productivity gains are simply too high. Security leaders must stop relying on legacy fences and start building guardrails that actually understand how probabilistic models behave.
See also: GitHub restricts Copilot as agentic AI workflows strain infrastructure
