Sensitive Data Exposure - The Bug That Looks Like a Feature Working Correctly
The API response came back 200. The data was correct. The test passed. Nobody noticed the response body also contained the user's hashed admin_notes, which was never meant to leave the server.

Sensitive Data Exposure does not look like an attack. It looks like a feature working correctly.

Sensitive Data Exposure happens when an application transmits, stores, or returns protected data without adequate controls. No exploit required. The data is simply available.

CVE-2024-25124 in Gofiber Fiber allowed any origin to make credentialed requests due to a wildcard CORS misconfiguration. CVSS 9.4 Critical.

Your suite checks that the right user gets the right data. It almost never checks what headers protect that data, what leaks in error responses, or what extra fields slip through the serializer.

Three attack surfaces: data in transit (TLS, CORS headers), data at rest (logs, error messages), and data in API responses (over-fetching, debug fields left in serializers).

AI tools generate functional tests. They almost never generate a test that fails when a response contains a field it should not.

Sensitive Data Exposure covers a wide class of failures: data transmitted over unencrypted connections, credentials stored in plaintext, PII returned in API responses, stack traces sent to clients in error messages, secrets written to logs, and configuration details exposed through misconfigured headers. The thread connecting all of them: the data should be protected and it is not.

Developers introduce this in several ways, none of them malicious. A serializer that returns the full database model instead of a curated DTO. An error handler that forwards exception details to the client.

In code review this is nearly impossible to spot without a specific checklist. The serializer returns data. The error handler returns a message. The CORS header is present. Everything looks like it is working. The question nobody asked: should this data actually be leaving the system in this form?
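To make the serializer case concrete, here is a minimal sketch. It assumes a hypothetical `UserRecord` model and a hypothetical `UserDTO` response shape; the class names, field names, and sample values are illustrative and not taken from any real codebase or framework.

```python
from dataclasses import dataclass, asdict


@dataclass
class UserRecord:
    """Hypothetical database model (illustrative field names)."""
    id: int
    name: str
    email: str
    created_at: str
    password_hash: str
    admin_notes: str
    last_login_ip: str


@dataclass(frozen=True)
class UserDTO:
    """Hypothetical curated response shape: only fields meant to leave the server."""
    id: int
    name: str
    email: str
    created_at: str


def serialize_leaky(user: UserRecord) -> dict:
    # Dumps every attribute on the model, sensitive ones included.
    return asdict(user)


def serialize_safe(user: UserRecord) -> dict:
    # Copies only the fields the DTO explicitly enumerates.
    dto = UserDTO(
        id=user.id,
        name=user.name,
        email=user.email,
        created_at=user.created_at,
    )
    return asdict(dto)


# Sample record with placeholder values.
user = UserRecord(
    id=123,
    name="Test User",
    email="test@example.com",
    created_at="2024-01-01T00:00:00Z",
    password_hash="<hash>",
    admin_notes="internal only",
    last_login_ip="203.0.113.7",
)

leaky = serialize_leaky(user)
safe = serialize_safe(user)
```

Both functions return a correct name and email, so a happy-path assertion passes against either one. Only a check on the full set of keys distinguishes them.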
CVE-2024-25124 · Gofiber Fiber · February 2024 · CVSS 9.4 (Critical)

In February 2024, a critical vulnerability was disclosed in the CORS middleware of Gofiber Fiber, a widely used Go web framework. The flaw allowed developers to configure the middleware with a wildcard origin (Access-Control-Allow-Origin: *) while simultaneously enabling Access-Control-Allow-Credentials: true.

This combination is explicitly prohibited by the CORS specification. It allows any website on the internet to make credentialed requests to the affected application and read the response. Sensitive user data, session tokens, and authenticated API responses could be accessed by attacker-controlled pages without the victim knowing.

No credentials required from the attacker. No exploit chain. Just a misconfiguration the framework permitted and many developers shipped without realizing.

Source: [GitHub Advisory GHSA-fmg4-x8pw-hjhg](https://github.com/advisories/GHSA-fmg4-x8pw-hjhg), NVD CVE-2024-25124.

A QA engineer running response header validation tests against any authenticated endpoint would have caught this. The test is not complex: send a credentialed cross-origin request, assert that Access-Control-Allow-Origin does not contain a wildcard.

That test did not exist. The framework permitted the misconfiguration without warning. Teams shipped it without knowing.

Test suites validate that data is correct. Not that data is minimal. When you assert that GET /users/123 returns the right name and email, you are not asserting that it does not also contain a password hash, an internal flag, or a field your serializer included because nobody removed it.

Happy-path tests confirm the presence of expected data. Nobody writes a test that fails when unexpected data appears. That gap is where sensitive data exposure lives, entirely invisible to a suite that is otherwise fully green.

Every response has two contracts:

- What it must contain
- What it must not contain

Your suite almost certainly verifies the first.
It needs to verify the second.

```python
import pytest
import requests

BASE_URL = "https://your-app.com"
CREDENTIALS = {"username": "testuser", "password": "test_password"}

FORBIDDEN_FIELDS = {
    "password", "password_hash", "token", "secret", "api_key",
    "internal_id", "debug", "admin_notes", "stack", "trace",
    "last_login_ip",
}
ALLOWED_USER_FIELDS = {"id", "name", "email", "created_at"}


@pytest.fixture
def auth_session():
    # Log in once so the session carries the auth cookie
    session = requests.Session()
    session.post(f"{BASE_URL}/login", json=CREDENTIALS)
    return session


def test_user_response_contains_no_forbidden_fields(auth_session):
    # negative contract: assert what must NOT be in the response
    response = auth_session.get(f"{BASE_URL}/users/123")
    body = response.json()
    exposed = FORBIDDEN_FIELDS.intersection(body.keys())
    assert not exposed, f"Sensitive fields exposed in response: {exposed}"


def test_user_response_schema_allowlist(auth_session):
    # any field outside the allowlist is a contract violation
    response = auth_session.get(f"{BASE_URL}/users/123")
    body = response.json()
    unexpected = set(body.keys()) - ALLOWED_USER_FIELDS
    assert not unexpected, f"Unexpected fields in response: {unexpected}"


def test_error_response_contains_no_stack_trace(auth_session):
    # deliberately trigger a server error
    response = auth_session.get(f"{BASE_URL}/users/invalid-id-trigger-500")
    body = response.text
    forbidden_strings = ["Traceback", "at line", "Exception", "File \"",
                         "django", "sqlalchemy", "psycopg2", "pymongo"]
    for s in forbidden_strings:
        assert s not in body, f"Stack trace marker '{s}' found in error response"


def test_cors_no_wildcard_on_authenticated_endpoint(auth_session):
    # CVE-2024-25124: wildcard + credentials = any origin reads response
    response = auth_session.get(
        f"{BASE_URL}/users/123",
        headers={"Origin": "https://attacker.com"},
    )
    acao = response.headers.get("Access-Control-Allow-Origin", "")
    assert acao != "*", "Wildcard CORS on authenticated endpoint exposes data"
    # an untrusted origin echoed back is just as exploitable as a wildcard
    assert acao != "https://attacker.com", "Untrusted origin reflected in CORS header"


def test_security_headers_present(auth_session):
    response = auth_session.get(f"{BASE_URL}/users/123")
    assert "X-Powered-By" not in response.headers, \
        "X-Powered-By header discloses server technology"
    assert response.headers.get("X-Content-Type-Options") == "nosniff"


def test_session_cookie_flags():
    # Set-Cookie is issued by the login response, not by GET /users/123
    response = requests.post(f"{BASE_URL}/login", json=CREDENTIALS)
    set_cookie = response.headers.get("Set-Cookie", "")
    assert "Secure" in set_cookie, "Session cookie missing Secure flag"
    assert "HttpOnly" in set_cookie, "Session cookie missing HttpOnly flag"
```

```robotframework
*** Settings ***
Library    RequestsLibrary
Library    Collections
Library    String

*** Variables ***
${BASE_URL}         https://your-app.com
@{FORBIDDEN}        password    password_hash    token    secret
...                 api_key    internal_id    debug    admin_notes
...                 stack    trace    last_login_ip
@{ALLOWED_FIELDS}   id    name    email    created_at

*** Test Cases ***
User Response Contains No Forbidden Fields
    # negative contract: assert absence of sensitive fields
    Create Session    app    ${BASE_URL}
    ${response}=    GET On Session    app    /users/123
    ${body}=    Set Variable    ${response.json()}
    FOR    ${field}    IN    @{FORBIDDEN}
        Dictionary Should Not Contain Key    ${body}    ${field}
        ...    msg=Sensitive field '${field}' exposed in response
    END

User Response Schema Allowlist Enforced
    Create Session    app    ${BASE_URL}
    ${response}=    GET On Session    app    /users/123
    ${body}=    Set Variable    ${response.json()}
    ${keys}=    Get Dictionary Keys    ${body}
    FOR    ${key}    IN    @{keys}
        Should Contain    ${ALLOWED_FIELDS}    ${key}
        ...    msg=Unexpected field '${key}' found in response
    END

Error Response Contains No Stack Trace
    Create Session    app    ${BASE_URL}
    ${response}=    GET On Session    app    /users/invalid-id-trigger-500
    ...    expected_status=any
    ${body}=    Set Variable    ${response.text}
    Should Not Contain    ${body}    Traceback
    Should Not Contain    ${body}    at line
    Should Not Contain    ${body}    Exception
    Should Not Contain    ${body}    File "
    Should Not Contain    ${body}    sqlalchemy
    Should Not Contain    ${body}    psycopg2

CORS No Wildcard On Authenticated Endpoint
    # CVE-2024-25124: wildcard origin + credentials = data exposed
    ${headers}=    Create Dictionary    Origin=https://attacker.com
    Create Session    app    ${BASE_URL}
    ${response}=    GET On Session    app    /users/123    headers=${headers}
    ${acao}=    Get From Dictionary    ${response.headers}
    ...    Access-Control-Allow-Origin    default=${EMPTY}
    Should Not Be Equal    ${acao}    *
    ...    msg=Wildcard CORS on authenticated endpoint exposes data
    # an untrusted origin echoed back is just as exploitable as a wildcard
    Should Not Be Equal    ${acao}    https://attacker.com
    ...    msg=Untrusted origin reflected in CORS header

Security Headers Present And Disclosure Headers Absent
    Create Session    app    ${BASE_URL}
    ${response}=    GET On Session    app    /users/123
    Dictionary Should Not Contain Key    ${response.headers}    X-Powered-By
    Dictionary Should Not Contain Key    ${response.headers}    Server
    ${xcto}=    Get From Dictionary    ${response.headers}
    ...    X-Content-Type-Options    default=${EMPTY}
    Should Be Equal    ${xcto}    nosniff
```

```typescript
import { test, expect, APIRequestContext } from '@playwright/test';

const FORBIDDEN_FIELDS = [
  'password', 'password_hash', 'token', 'secret', 'api_key',
  'internal_id', 'debug', 'admin_notes', 'stack', 'trace',
  'last_login_ip'
];

const ALLOWED_USER_FIELDS = new Set(['id', 'name', 'email', 'created_at']);

const STACK_TRACE_MARKERS = [
  'Traceback', 'at line', 'Exception', 'File "',
  'django', 'sqlalchemy', 'psycopg2', 'pymongo'
];

let apiContext: APIRequestContext;

test.beforeAll(async ({ playwright }) => {
  // Shared context keeps the auth cookie from the login call
  apiContext = await playwright.request.newContext({
    baseURL: 'https://your-app.com',
  });
  await apiContext.post('/login', {
    data: { username: 'testuser', password: 'test_password' }
  });
});

test.afterAll(async () => {
  await apiContext.dispose();
});

test('user response: no forbidden fields exposed', async () => {
  // negative contract: assert what must NOT be in the response
  const response = await apiContext.get('/users/123');
  const body = await response.json();
  const exposed = FORBIDDEN_FIELDS.filter(field => field in body);
  expect(exposed, `Sensitive fields exposed: ${exposed.join(', ')}`).toHaveLength(0);
});

test('user response: schema allowlist enforced', async () => {
  // any field outside the allowlist is a contract violation
  const response = await apiContext.get('/users/123');
  const body = await response.json();
  const unexpected = Object.keys(body).filter(key => !ALLOWED_USER_FIELDS.has(key));
  expect(unexpected, `Unexpected fields in response: ${unexpected.join(', ')}`).toHaveLength(0);
});

test('error response: no stack trace in body', async () => {
  // deliberately trigger a server error, assert clean generic message
  const response = await apiContext.get('/users/invalid-id-trigger-500');
  const body = await response.text();
  for (const marker of STACK_TRACE_MARKERS) {
    expect(body, `Stack trace marker '${marker}' found in error response`)
      .not.toContain(marker);
  }
});

test('CORS: no wildcard origin on authenticated endpoint', async () => {
  // CVE-2024-25124: wildcard + credentials = any origin reads response
  const response = await apiContext.get('/users/123', {
    headers: { 'Origin': 'https://attacker.com' }
  });
  const acao = response.headers()['access-control-allow-origin'] ?? '';
  expect(acao, 'Wildcard CORS on authenticated endpoint exposes data')
    .not.toBe('*');
  // an untrusted origin echoed back is just as exploitable as a wildcard
  expect(acao, 'Untrusted origin reflected in CORS header')
    .not.toBe('https://attacker.com');
});

test('security headers: disclosure headers absent', async () => {
  const response = await apiContext.get('/users/123');
  const headers = response.headers();
  expect(headers['x-powered-by'], 'X-Powered-By discloses server technology')
    .toBeUndefined();
  expect(headers['x-content-type-options']).toBe('nosniff');
});

test('session cookie: Secure and HttpOnly flags present', async () => {
  const response = await apiContext.post('/login', {
    data: { username: 'testuser', password: 'test_password' }
  });
  const setCookie = response.headers()['set-cookie'] ?? '';
  expect(setCookie, 'Session cookie missing Secure flag').toContain('Secure');
  expect(setCookie, 'Session cookie missing HttpOnly flag').toContain('HttpOnly');
});
```

Run data exposure tests in isolation:

```bash
npx playwright test --grep "CORS|schema|forbidden|stack trace|cookie"
```

```yaml
sensitive-data-exposure-tests:
  stage: test
  script:
    - pytest tests/security/test_data_exposure.py -v
    - npx playwright test --grep "CORS|schema|forbidden"
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
  allow_failure: false
```

Pair with a Semgrep rule that flags direct model serialization without a DTO layer. The static check catches the pattern before deployment; the runtime tests confirm enforcement in the running application.

Sensitive data exposure behaves differently across environments because debug settings change between development and production. A stack trace that appears in staging because DEBUG=True will not appear in production if someone remembered to flip the flag. But if the tests only ran in staging, nothing verifies that the flag was actually flipped, so the gap is invisible.
Always run response content validation tests against a production-mirrored environment with debug mode explicitly disabled.

🔬 Practice this yourself: yuriysafron.com/qa-sandbox

When a team asks GitHub Copilot to write tests for a user profile endpoint, it writes tests that assert the response contains the correct name, email, and profile fields. The tests pass. The suite is green. Nobody generated a test that asserts the response does not also contain password_hash, internal_user_id, or a debug object left in the serializer six months ago.

The structural reason LLMs fail at this class of bug: sensitive data exposure is defined by what should not be present, not by what should. LLMs generate tests by modeling the expected output of a function. They do not model the set of all outputs that would constitute a violation. Testing for absence requires knowing the full allowlist of permitted data, a design decision that lives outside the code itself. No amount of reading the handler implementation tells you that the password field should never appear in the response, because that information lives in a compliance document or a threat model, not in the source code.

The concrete failure: a team builds a user management API. Copilot generates a full test suite. Every endpoint is covered. Three months after launch, a security researcher reports that GET /users/:id returns a hashed password and a last_login_ip field. The AI-generated suite asserted the name and email were correct. It never asserted those two fields were absent. The data had been in every response since the first deployment. No test had ever looked for it.

1. Response allowlist at the serialization layer - every API response must pass through a DTO that explicitly enumerates the fields allowed in the output. Nothing from the underlying domain model reaches the client unless it was deliberately placed in the DTO. Making implicit serialization impossible in your framework configuration means returning a raw model object is a runtime error, not a silent data leak.
2. Error response hardening - error handlers must return a generic message. Stack traces, exception class names, file paths, database driver information, and ORM query strings must never reach the client. This must be tested explicitly in CI against a production-mirrored environment with debug mode disabled.
3. Header security as a pipeline gate - every deployment must pass a header check that validates required security headers and the absence of disclosure headers. CORS headers must be validated against a known allowlist of permitted origins for every authenticated endpoint. This runs as a blocking post-deployment check, not a manual review before release.

Prevention only works when it is tested. A DTO layer that exists is not the same as a DTO layer that is verified to contain only the fields it should. The test suite is the enforcement mechanism. Without it, the prevention is a convention, not a guarantee.

Working on a cybersecurity platform protecting U.S. critical infrastructure and multiple branches of the U.S. military sharpens your sense of what "sensitive" means in practice. In those environments, a leaked internal ID is not a minor finding. It is reconnaissance.

Sensitive Data Exposure separates teams who think about what their API must return from teams who think only about what their API must do. The second group ships data they never intended to expose. Always in a field nobody thought to look for. Always in a response that was otherwise working perfectly.

When did your team last audit your API responses for fields that should not be there, and do you have a test that would catch a new one being added tomorrow?

Part of the **Break It on Purpose** series, published weekly for QA

🔗 linkedin.com/in/yuriy-safronnynov
