How AI Coding Assistants Leak Your Secrets (and How to Stop It)
Developers paste API keys, database credentials, and customer PII into AI prompts every day. Here's how data leaks happen and what your team can do about it.
Every day, thousands of developers paste code into ChatGPT, Claude, Cursor, and Copilot without thinking twice about what's in it. A database connection string here, an AWS key there, a customer email address embedded in a test fixture. It feels harmless — until it isn't.
The Problem Is Bigger Than You Think
GitGuardian's 2024 State of Secrets Sprawl report found that 12.8 million new secrets were exposed in public GitHub commits over the course of a single year. But that's just the tip of the iceberg. The real risk has shifted from code repositories to AI prompts.
When a developer pastes code into an AI coding assistant, that code is sent to a third-party API. Depending on the provider's data retention policy, that code may be:
- Stored for training — your API keys become part of the model's training data
- Logged for abuse prevention — your customer PII sits in someone else's log files
- Cached for performance — your secrets persist in infrastructure you don't control
Even providers with strong privacy policies (like Anthropic's zero-retention API) can't protect against the fundamental problem: the developer didn't realize the secret was there in the first place.
Real-World Scenarios
Scenario 1: The .env File
A developer debugging a Docker issue pastes their entire .env file into ChatGPT:
```
DATABASE_URL=postgres://admin:P@ssw0rd123@prod-db.company.com:5432/users
STRIPE_SECRET_KEY=sk_live_51H7...
AWS_ACCESS_KEY_ID=AKIA1234567890ABCDEF
AWS_SECRET_ACCESS_KEY=wJalr...
```
Four production secrets, sent to a third-party API in a single paste.
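A pre-flight check for pastes like this can be sketched with a handful of regexes. This is a minimal illustration of known-pattern matching, not AxSentinel's actual rule set, and the three patterns below cover only the formats in the example:

```python
import re

# Illustrative patterns only; production scanners maintain hundreds.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{10,}\b"),
    "postgres_url": re.compile(r"postgres://[^\s:]+:[^\s@/]+@[^\s/]+"),
}

def find_secrets(text: str) -> list[str]:
    """Return the names of the secret types found in a pasted blob."""
    return [name for name, pattern in SECRET_PATTERNS.items()
            if pattern.search(text)]

paste = (
    "DATABASE_URL=postgres://admin:Passw0rd123@prod-db.company.com:5432/users\n"
    "AWS_ACCESS_KEY_ID=AKIA1234567890ABCDEF\n"
)
print(find_secrets(paste))  # ['aws_access_key_id', 'postgres_url']
```

A check like this runs in microseconds on a typical paste, which is why pattern matching is the first line of defense in most scanners.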
Scenario 2: The Customer Data
A data engineer asks Claude to help write a SQL query and includes sample output:
```sql
-- Sample output:
-- | id | name          | email             | ssn         |
-- | 1  | John Smith    | john@acme.com     | 123-45-6789 |
-- | 2  | Sarah Johnson | sarah@bigcorp.com | 987-65-4321 |
```
Two customers' PII — names, emails, and Social Security numbers — now in an AI provider's logs.
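For cases like this, redaction is an alternative to blocking outright: mask the PII and let the rest of the prompt through, so the developer still gets help with the query. A minimal sketch covering two common formats (the regexes are illustrative, not exhaustive):

```python
import re

# Illustrative PII patterns; a real scanner covers many more formats.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def redact_pii(text: str) -> str:
    """Replace SSNs and email addresses with placeholder tokens."""
    text = SSN_RE.sub("[SSN]", text)
    text = EMAIL_RE.sub("[EMAIL]", text)
    return text

row = "-- | 1 | John Smith | john@acme.com | 123-45-6789 |"
print(redact_pii(row))  # -- | 1 | John Smith | [EMAIL] | [SSN] |
```

The redacted prompt is usually still good enough for the AI to reason about the query's shape, since column structure survives the masking.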
Scenario 3: The Internal API
A developer asks Cursor to refactor an API client that has hardcoded credentials:
```javascript
const client = new InternalAPI({
  endpoint: "https://internal-api.company.com",
  token: "eyJhbGciOiJIUzI1NiIs...",
  apiKey: "company_prod_ak_8f3j2k4l5m6n7o8p",
});
```
Internal endpoints and authentication tokens, sent to an external AI model.
Why Traditional DLP Doesn't Work
Traditional Data Loss Prevention (DLP) tools were designed for email attachments and USB drives. They don't understand:
- Developer workflows — code editors, terminal sessions, browser-based AI chats
- The speed of AI interactions — developers send dozens of prompts per hour
- Context — a string that looks random might be a production API key
You need something purpose-built for AI-era development workflows.
The Solution: Scan Before It Leaves
The most effective approach is to intercept and scan content before it reaches the AI provider. This means:
- Local scanning — all analysis happens on the developer's machine, not in the cloud
- Near-zero latency — scanning must be fast enough that developers never notice it
- Multiple detection methods — regex for known patterns (AWS keys, SSNs) plus ML for unknown formats
- Non-disruptive — block or redact, but never slow down the developer
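The "unknown formats" case is the hard part: an internal API key follows no public pattern. Many scanners fall back on an entropy heuristic here, flagging long, high-randomness tokens that look machine-generated. The sketch below shows that heuristic in its simplest form; it is a stand-in for illustration, not AxSentinel's actual detector, and the length and entropy thresholds are assumptions:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character; random tokens score high."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_secret(token: str, min_len: int = 20,
                      threshold: float = 4.0) -> bool:
    # Flag tokens that are both long and high-entropy; ordinary
    # English words and identifiers fall well below the threshold.
    return len(token) >= min_len and shannon_entropy(token) > threshold

print(looks_like_secret("configuration"))                          # False
print(looks_like_secret("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLE"))  # True
```

Entropy checks produce false positives on things like hashes and minified code, which is why they are layered on top of pattern matching rather than used alone.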
This is exactly what AxSentinel does. It sits between your AI tools and the API, scanning every request in milliseconds. If it finds PII or secrets, it blocks the request before the data ever leaves your network.
Getting Started
AxSentinel works with every major AI coding tool:
- Cursor — set the API base URL to the local proxy
- Claude Code — set `ANTHROPIC_BASE_URL` to the proxy
- ChatGPT / Claude.ai — install the Chrome extension
- VS Code — install the extension from the marketplace
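For the proxy-based integrations, pointing a tool at the scanner is a one-line base-URL change. For example, for Claude Code (the localhost address and port below are assumptions; substitute whatever address your proxy actually listens on):

```shell
# Assumption: the local proxy listens on localhost:8080.
# Claude Code reads ANTHROPIC_BASE_URL to decide where to send requests.
export ANTHROPIC_BASE_URL="http://localhost:8080"
```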
All scanning happens locally. Only detection metadata (type, count) is reported to your compliance dashboard — never the actual content.