How AI Coding Assistants Leak Your Secrets (and How to Stop It)
Developers paste API keys, database credentials, and customer PII into AI prompts every day. Here's how data leaks happen and what your team can do about it.
Every day, thousands of developers paste code into ChatGPT, Claude, Cursor, and Copilot without thinking twice about what's in it. A database connection string here, an AWS key there, a customer email address embedded in a test fixture. It feels harmless — until it isn't.
The Problem Is Bigger Than You Think
GitGuardian's 2024 State of Secrets Sprawl report found that 12.8 million new secrets were exposed in public GitHub commits over the course of a single year. But that's just the tip of the iceberg. The real risk has shifted from code repositories to AI prompts.
When a developer pastes code into an AI coding assistant, that code is sent to a third-party API. Depending on the provider's data retention policy, that code may be:
- Stored for training — your API keys become part of the model's training data
- Logged for abuse prevention — your customer PII sits in someone else's log files
- Cached for performance — your secrets persist in infrastructure you don't control
Even providers with strong privacy policies (like Anthropic's zero-retention API) can't protect against the fundamental problem: the developer didn't realize the secret was there in the first place.
Real-World Scenarios
Scenario 1: The .env File
A developer debugging a Docker issue pastes their entire .env file into ChatGPT:
```
DATABASE_URL=postgres://admin:P@ssw0rd123@prod-db.company.com:5432/users
STRIPE_SECRET_KEY=sk_live_51H7...
AWS_ACCESS_KEY_ID=AKIA1234567890ABCDEF
AWS_SECRET_ACCESS_KEY=wJalr...
```
Four production secrets, sent to a third-party API in a single paste.
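A pre-flight check for pastes like this can be sketched with a handful of regexes. This is a minimal illustration of known-pattern matching, not AxSentinel's actual rule set, and the three patterns below cover only the formats in the example:

```python
import re

# Illustrative patterns only; production scanners maintain hundreds.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{10,}\b"),
    "postgres_url": re.compile(r"postgres://[^\s:]+:[^\s@/]+@[^\s/]+"),
}

def find_secrets(text: str) -> list[str]:
    """Return the names of the secret types found in a pasted blob."""
    return [name for name, pattern in SECRET_PATTERNS.items()
            if pattern.search(text)]

paste = (
    "DATABASE_URL=postgres://admin:Passw0rd123@prod-db.company.com:5432/users\n"
    "AWS_ACCESS_KEY_ID=AKIA1234567890ABCDEF\n"
)
print(find_secrets(paste))  # ['aws_access_key_id', 'postgres_url']
```

A check like this runs in microseconds on a typical paste, which is why pattern matching is the first line of defense in most scanners.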
Scenario 2: The Customer Data
A data engineer asks Claude to help write a SQL query and includes sample output:
```sql
-- Sample output:
-- | id | name          | email             | ssn         |
-- | 1  | John Smith    | john@acme.com     | 123-45-6789 |
-- | 2  | Sarah Johnson | sarah@bigcorp.com | 987-65-4321 |
```
Two customers' PII — names, emails, and Social Security numbers — now in an AI provider's logs.
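For cases like this, redaction is an alternative to blocking outright: mask the PII and let the rest of the prompt through, so the developer still gets help with the query. A minimal sketch covering two common formats (the regexes are illustrative, not exhaustive):

```python
import re

# Illustrative PII patterns; a real scanner covers many more formats.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def redact_pii(text: str) -> str:
    """Replace SSNs and email addresses with placeholder tokens."""
    text = SSN_RE.sub("[SSN]", text)
    text = EMAIL_RE.sub("[EMAIL]", text)
    return text

row = "-- | 1 | John Smith | john@acme.com | 123-45-6789 |"
print(redact_pii(row))  # -- | 1 | John Smith | [EMAIL] | [SSN] |
```

The redacted prompt is usually still good enough for the AI to reason about the query's shape, since column structure survives the masking.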
Scenario 3: The Internal API
A developer asks Cursor to refactor an API client that has hardcoded credentials:
```javascript
const client = new InternalAPI({
  endpoint: "https://internal-api.company.com",
  token: "eyJhbGciOiJIUzI1NiIs...",
  apiKey: "company_prod_ak_8f3j2k4l5m6n7o8p",
});
```
Internal endpoints and authentication tokens, sent to an external AI model.
Why Traditional DLP Doesn't Work
Traditional Data Loss Prevention (DLP) tools were designed for email attachments and USB drives. They don't understand:
- Developer workflows — code editors, terminal sessions, browser-based AI chats
- The speed of AI interactions — developers send dozens of prompts per hour
- Context — a string that looks random might be a production API key
You need something purpose-built for AI-era development workflows.
The Solution: Scan Before It Leaves
The most effective approach is to intercept and scan content before it reaches the AI provider. This means:
- Local scanning — all analysis happens on the developer's machine, not in the cloud
- Near-zero latency — scanning must be fast enough that developers never notice it
- Multiple detection methods — regex for known patterns (AWS keys, SSNs) plus ML for unknown formats
- Non-disruptive — block or redact, but never slow down the developer
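The "unknown formats" case is the hard part: an internal API key follows no public pattern. Many scanners fall back on an entropy heuristic here, flagging long, high-randomness tokens that look machine-generated. The sketch below shows that heuristic in its simplest form; it is a stand-in for illustration, not AxSentinel's actual detector, and the length and entropy thresholds are assumptions:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character; random tokens score high."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_secret(token: str, min_len: int = 20,
                      threshold: float = 4.0) -> bool:
    # Flag tokens that are both long and high-entropy; ordinary
    # English words and identifiers fall well below the threshold.
    return len(token) >= min_len and shannon_entropy(token) > threshold

print(looks_like_secret("configuration"))                          # False
print(looks_like_secret("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLE"))  # True
```

Entropy checks produce false positives on things like hashes and minified code, which is why they are layered on top of pattern matching rather than used alone.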
This is exactly what AxSentinel does. It sits between your AI tools and the API, scanning every request in milliseconds. If it finds PII or secrets, it blocks the request before the data ever leaves your network.
Getting Started
AxSentinel works with every major AI coding tool:
- Cursor — set the API base URL to the local proxy
- Claude Code — set `ANTHROPIC_BASE_URL` to the proxy
- ChatGPT / Claude.ai — install the Chrome extension
- VS Code — install the extension from the marketplace
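For the proxy-based integrations, pointing a tool at the scanner is a one-line base-URL change. For example, for Claude Code (the localhost address and port below are assumptions; substitute whatever address your proxy actually listens on):

```shell
# Assumption: the local proxy listens on localhost:8080.
# Claude Code reads ANTHROPIC_BASE_URL to decide where to send requests.
export ANTHROPIC_BASE_URL="http://localhost:8080"
```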
All scanning happens locally. Only detection metadata (type, count) is reported to your compliance dashboard — never the actual content.