HIPAA Compliance

PHI · Security Rule · BAA · Breach Notification · De-identification

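PHI (Protected Health Information) = individually identifiable health information held or transmitted by a covered entity or business associate: it can identify a patient AND relates to their health condition, care, or payment. Health data becomes PHI when ANY of these 18 identifiers is present alongside it. Remove all 18 (and have no actual knowledge that what remains could identify someone) → data is de-identified → no longer subject to HIPAA.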

Identifier 01

Name

Full name, first name alone if combined with health data

Never log patient names in application logs

Identifier 02

Geographic Data

Street address, city, county, full ZIP code: any geographic unit smaller than a state (the first 3 digits of a ZIP may be kept if the area they cover has more than 20,000 people)

Full ZIP codes in query params = PHI leak risk

Identifier 03

Dates (except year)

DOB, admission date, discharge date, date of death, and any age over 89 (aggregate these as "90 or older")

Store as year only in de-identified datasets

Identifier 04

Phone Numbers

Any telephone number: home, cell, or work (fax numbers are listed separately as Identifier 05)

Mask in UI: show only last 4 digits where possible

Identifier 05

Fax Numbers

Fax number associated with patient or their provider

Fax still widely used in healthcare — treat as PHI

Identifier 06

Email Addresses

Any email associated with the patient

Encrypt emails containing health info. Use Direct Secure Messaging for clinical email.

Identifier 07

Social Security Numbers

Full SSN or partial (last 4 digits can still be PHI in context)

Never store plaintext. Tokenize or encrypt; a plain unsalted hash of a 9-digit SSN can be brute-forced quickly. Audit access.

Identifier 08

Medical Record Numbers (MRN)

Any facility-assigned patient identifier — MRN, encounter ID, account number

MRNs in URLs or logs = PHI exposure. Use internal UUIDs instead.

Identifier 09

Health Plan Beneficiary Numbers

Insurance member ID, group number, Medicare/Medicaid ID

Common in 270/271 eligibility and 837 claims — encrypt in transit and at rest

Identifier 10

Account Numbers

Hospital billing account numbers, bank account if linked to health payment

RCM systems: mask account numbers in logs and error messages

Identifier 11

Certificate / License Numbers

Driver's license, medical license — if linked to patient health info

Identity verification workflows: treat as sensitive PII + PHI

Identifier 12

Vehicle Identifiers / Serial Numbers

VIN, license plate: either one ties a record to a specific person's vehicle

Rare in clinical systems but relevant in transport/ambulance data

Identifier 13

Device Identifiers

Implant serial numbers (pacemaker, hip), medical device IDs

IoT/wearable data: device ID + health reading = PHI

Identifier 14

Web URLs

URL if it identifies a patient (e.g. patient portal URL with patient ID)

Never put patient IDs in GET query params. Use POST body or session token.

Identifier 15

IP Addresses

Patient's IP address if linked to health data (e.g. portal login)

Web access logs with health context = ePHI. Protect server logs.

Identifier 16

Biometric Identifiers

Fingerprints, retinal scans, voiceprints used for patient identification

Biometric auth systems in healthcare: store hashes, not raw biometrics

Identifier 17

Full-Face Photos

Photographs that could identify the patient — clinical photos, ID photos

DICOM images often embed patient name in metadata — scrub before sharing

Identifier 18

Any Other Unique Identifying Number, Characteristic, or Code

Any other number, characteristic, or code not explicitly listed but uniquely identifying a person

Catch-all: if it can re-identify a patient when combined with health data, it's PHI

Dev rule of thumb: If a field can answer "which patient is this?" AND the record contains health data → it's PHI. Treat any combination of identifier + health condition as PHI by default. When in doubt, protect it.
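The rule of thumb above can be sketched as a small field-level check. This is only an illustration: the field names in `IDENTIFIER_FIELDS` and `HEALTH_FIELDS` are hypothetical and would need to be mapped to a real schema, and anything the function does not recognize still needs human review.

```python
# Hypothetical field names; a real schema needs its own mapping.
IDENTIFIER_FIELDS = {
    "name", "address", "zip", "dob", "phone", "fax", "email", "ssn",
    "mrn", "member_id", "account_number", "license_number", "vin",
    "device_id", "url", "ip_address", "biometric", "photo",
}
HEALTH_FIELDS = {"diagnosis", "icd10", "lab_result", "medication", "procedure"}


def classify_record(record: dict) -> str:
    """Label a record by the kinds of populated fields it carries."""
    present = {key for key, value in record.items() if value not in (None, "")}
    has_identifier = bool(present & IDENTIFIER_FIELDS)
    has_health = bool(present & HEALTH_FIELDS)
    if has_identifier and has_health:
        return "phi"  # identifier + health data: protect by default
    if has_identifier:
        return "identifier-only"
    if has_health:
        return "health-only"
    return "neither"
```

For example, `classify_record({"mrn": "MRN-123", "icd10": "E11.9"})` returns `"phi"`, while the same diagnosis code with no identifier fields returns `"health-only"`.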

The HIPAA Security Rule applies specifically to ePHI (electronic PHI). It requires three categories of safeguards. Each requirement is either REQUIRED (must implement) or ADDRESSABLE (implement if reasonable and appropriate, or document why not).

📋

Administrative Safeguards

Policies, training, and workforce management — ~50% of Security Rule

Required

Security Officer — designate one person responsible for HIPAA security policy. In startups this is often the CTO or founder.

Required

Workforce Training: all employees who touch ePHI must be trained on security policies. Document completion. Annual refreshers are common practice, although the rule itself sets no fixed interval.

Required

Access Management — formal process for granting/revoking access to ePHI systems. Role-based access control (RBAC). Audit log of who was granted what.

Required

Contingency Plan — data backup, disaster recovery, emergency access. RTO/RPO documented. Test the backup.

Addressable

Workforce Clearance — background checks for staff with ePHI access.

Addressable

Security Reminders — periodic security awareness updates (emails, training refreshers).

🏢

Physical Safeguards

Physical access to systems and devices storing ePHI

Required

Facility Access Controls — locked server rooms, badge access, visitor logs. Cloud vendors handle this for hosted systems.

Required

Workstation Use Policy — where and how workstations with ePHI access are used. Auto-lock screens after inactivity.

Required

Device & Media Controls — procedures for disposal of hardware and media containing ePHI. Wipe drives, shred documents.

Required

Workstation Security — physical protections like cable locks, privacy screens for laptops in clinical areas.

💻

Technical Safeguards — Most Relevant to Developers

The code and infrastructure controls you build and configure

Required

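Access Control: unique user IDs, no shared logins, and an emergency access procedure. Automatic logoff and encryption/decryption sit under this standard as addressable specifications. Implement: RBAC, JWT with short expiry, MFA for ePHI systems. Every user gets their own user_id, never a shared 'admin/admin' login.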

Required

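Audit Controls: record and examine activity in systems containing ePHI. Log who accessed which record, when, and from where. Keep logs immutable. Retaining them for at least 6 years matches HIPAA's documentation retention period. Example: a query such as SELECT ... WHERE patient_id = X should itself produce an audit entry.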

Required

Integrity Controls — ensure ePHI is not improperly altered or destroyed. Checksums, digital signatures, version history, database transactions with rollback.

Required

Transmission Security: protect ePHI transmitted over networks. Minimum: TLS 1.2, with TLS 1.3 preferred. All APIs must use HTTPS. No ePHI in unencrypted email. No ePHI in URL query parameters, which end up in server logs and browser history even over HTTPS.

Addressable

Encryption at Rest: encrypt databases, file systems, and backups containing ePHI. Practically required, since it is hard to justify not doing this. Use AES-256 (for example AES-256-GCM) with keys managed in AWS KMS or Azure Key Vault.

Addressable

Automatic Logoff: terminate sessions after a period of inactivity. Implement as an idle timeout in your session management. A 15-minute timeout is a common choice in clinical settings; HIPAA itself does not set a number.

Required

Authentication — verify identity before granting access. In practice: MFA is expected for any system with ePHI. FIDO2/WebAuthn for strongest security.
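One way to make the audit trail described above tamper-evident is to chain each entry to the hash of the previous one. The sketch below is illustrative only: the `AuditLog` class, its field names, and the in-memory list are assumptions, and a production system would add append-only (write-once) storage on top.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder hash for the first entry


class AuditLog:
    """In-memory, hash-chained audit log (illustrative sketch)."""

    def __init__(self) -> None:
        self._entries = []

    @staticmethod
    def _digest(entry: dict) -> str:
        # Hash every field except the stored hash itself.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def record(self, user_id: str, action: str, record_id: str, source_ip: str) -> dict:
        """Append who did what to which record, when, and from where."""
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "action": action,
            "record_id": record_id,  # use an internal token here, not an MRN
            "source_ip": source_ip,
            "prev_hash": prev_hash,
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return False if any entry was altered or the chain was broken."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash or entry["hash"] != self._digest(entry):
                return False
            prev_hash = entry["hash"]
        return True
```

Editing or deleting any earlier entry changes its hash, so `verify()` fails for that entry or the one after it. This detects tampering; it does not prevent it.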

Quick dev checklist:

✅ HTTPS everywhere (TLS 1.2+ minimum)

✅ Encrypt DB at rest (AES-256)

✅ Unique user IDs + MFA

✅ Immutable audit logs (who, what, when)

✅ Role-based access control (RBAC)

✅ Session timeout (≤15 min idle)

✅ No PHI in URLs / query strings

✅ No PHI in application error logs

✅ Encrypted backups + tested restore

✅ Signed BAA with every cloud vendor
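For the "No PHI in application error logs" item, a logging filter can scrub obvious patterns before a record is written. This is a minimal sketch using Python's standard `logging` module: the three regular expressions, including the `MRN-<digits>` format, are assumptions, and pattern matching will not catch free-text PHI such as names.

```python
import logging
import re

# Assumed patterns; adjust to the identifier formats your systems use.
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[EMAIL]"),
    (re.compile(r"\bMRN-\d+\b"), "[MRN]"),
]


def scrub(text: str) -> str:
    """Replace every match of a known PHI pattern with a placeholder."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


class PhiScrubFilter(logging.Filter):
    """Scrub the fully formatted message before handlers see it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = scrub(record.getMessage())
        record.args = ()  # arguments are already merged into msg
        return True
```

Attach it with `handler.addFilter(PhiScrubFilter())` on each handler that writes to disk or ships logs to a vendor. Treat this as a last line of defence; the safer habit is not to pass PHI to the logger at all.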

A Business Associate Agreement (BAA) is a legally required contract between a Covered Entity (CE) and any Business Associate (BA) — any vendor or contractor who creates, receives, maintains, or transmits PHI on your behalf. Without a signed BAA, both parties are in violation of HIPAA.

You build a healthcare app that handles patient data → You are a Business Associate (BA) or possibly a Covered Entity (CE)

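You store ePHI on AWS S3 → AWS is your Business Associate (a subcontractor BA if you are a BA yourself). You must accept AWS's BAA (available through AWS Artifact in the console) and use only HIPAA-eligible services. Same for Azure, GCP, Snowflake, Databricks.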

You use Twilio to send appointment reminders with patient info → Twilio must sign a BAA. Using a service without a BAA = HIPAA violation, even if the data is encrypted.

A hospital deploys your software → They (the CE) must sign a BAA with you before you can access any of their patient data.

BAA must specify: what PHI is involved, permitted uses, security obligations, breach reporting requirements, and how PHI is returned/destroyed at contract end.

✅ BAA Available — HIPAA-eligible vendors

AWS (HIPAA-eligible services — S3, RDS, Lambda, etc.)

Microsoft Azure (HIPAA/HITECH compliance)

Google Cloud Platform (GCP HIPAA BAA)

Snowflake (sign in web UI)

Twilio (available, must request)

SendGrid / Mailgun (with restrictions)

Auth0 / Okta (available)

Datadog, PagerDuty (available)

❌ No BAA — DO NOT use for ePHI

Slack (free/standard plan — no BAA)

Google Workspace (personal accounts)

Trello, Notion (standard plans)

GitHub (public repos — obviously)

ChatGPT / Claude API (without enterprise agreement)

Most analytics tools on standard plans (e.g. Mixpanel, Amplitude); confirm a BAA is in place before sending any PHI

Zapier (standard plan)

Any free-tier SaaS tool

⚠️ Common dev mistakes:

Logging patient data to Datadog / Splunk without a BAA · Sending PHI in Slack messages · Using ChatGPT/Claude to analyze patient records without enterprise BAA · Storing test data with real patient records in dev/staging · Emailing PHI via Gmail
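Several of the mistakes above come down to PHI leaving for a vendor nobody checked. One hedged way to catch that in code is a guard in front of outbound integrations. Everything here is hypothetical: the `Vendor` record, the `baa_signed` flag (which someone must keep in sync with the contracts actually signed), and the `BaaRequiredError` exception.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vendor:
    name: str
    baa_signed: bool  # maintained by whoever owns vendor contracts


class BaaRequiredError(RuntimeError):
    """Raised when PHI is about to go to a vendor without a BAA."""


def assert_can_send(vendor: Vendor, contains_phi: bool) -> None:
    """Block PHI from reaching a vendor that has no signed BAA."""
    if contains_phi and not vendor.baa_signed:
        raise BaaRequiredError(
            f"Refusing to send PHI to {vendor.name}: no signed BAA on file"
        )
```

Calling `assert_can_send(vendor, contains_phi=True)` at the top of each integration client turns a missing BAA into a loud failure in staging instead of a silent disclosure in production.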

A breach = unauthorized acquisition, access, use, or disclosure of PHI that compromises its security or privacy. HIPAA's Breach Notification Rule requires specific actions within strict timeframes. There is a presumption of breach — you must prove it's NOT a breach, not the other way around.

Day 0

Breach Occurs or Is Discovered

Unauthorized access to PHI detected. Examples: database exposed publicly, ransomware, employee snooping on celebrity patient records, wrong patient record sent to another provider, laptop stolen.

Day 1–10 (as soon as possible)

Internal Investigation & Containment

Contain the breach. Assess scope: how many records, which identifiers, what health data. Apply the 4-factor risk assessment: (1) the nature and extent of the PHI involved, (2) the unauthorized person who used or received it, (3) whether the PHI was actually acquired or viewed, (4) the extent to which the risk has been mitigated. If the documented assessment shows a low probability of compromise → not a reportable breach.

Within 60 days of DISCOVERY

🔴 Notify Affected Individuals (Required)

Written notice by first-class mail (or email if patient consented). Must include: description of breach, types of PHI involved, steps individuals can take, what you're doing to investigate/mitigate, contact info. If 10+ individuals have outdated contact info → substitute notice (website or media).

Within 60 days of DISCOVERY

🔴 Notify HHS (Required)

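Report to HHS via its online breach portal. Fewer than 500 individuals: log the breach and submit it within 60 days after the end of the calendar year (around March 1 of the following year). 500 or more individuals: notify HHS within 60 days of discovery. If more than 500 residents of one state or jurisdiction are affected, also notify prominent media outlets serving that area.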

Immediately (if BA)

Notify Your Covered Entity

If you're a Business Associate, you must notify the CE without unreasonable delay and no later than 60 days after discovery; your BAA usually sets a much shorter window. The CE's 60-day clock generally starts when it discovers the breach, which in practice is often when you notify it.
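The timeline above reduces to simple date arithmetic. The sketch below follows the thresholds as stated in this section (60 days from discovery, 500 individuals); the function name and the returned keys are assumptions, and the dates are outer limits, not targets, since notice is due without unreasonable delay.

```python
from datetime import date, timedelta

SIXTY_DAYS = timedelta(days=60)


def notification_deadlines(discovered: date, affected: int) -> dict:
    """Latest permissible notification dates for a reportable breach."""
    deadline = discovered + SIXTY_DAYS
    deadlines = {"individuals": deadline}
    if affected >= 500:
        deadlines["hhs"] = deadline
        # Simplified: media notice depends on residents per state or jurisdiction.
        deadlines["media"] = deadline
    else:
        # Small breaches go in the annual submission, due 60 days after year end.
        deadlines["hhs"] = date(discovered.year, 12, 31) + SIXTY_DAYS
    return deadlines
```

For a breach discovered on 2025-03-01 affecting 1,200 people, all three deadlines fall on 2025-04-30. With 40 people affected, individual notice is still due by 2025-04-30, while the HHS submission is due by 2026-03-01.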

Civil Monetary Penalties (CMPs)

Tier 1 — Did Not Know

$100 – $50,000 / violation

Unaware of the violation even with reasonable diligence. Cap: $25,000/year per category.

Tier 2 — Reasonable Cause

$1,000 – $50,000 / violation

Knew or should have known but not willful neglect. Cap: $100,000/year.

Tier 3 — Willful Neglect, Corrected

$10,000 – $50,000 / violation

Willful neglect but violation corrected within 30 days. Cap: $250,000/year.

Tier 4 — Willful Neglect, Not Corrected

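$50,000+ / violation

Willful neglect not corrected within 30 days. Highest penalties. Cap: $1,500,000/year. HHS adjusts all tier amounts annually for inflation, so current figures run higher. Criminal referral possible.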

Real examples: Anthem (2015 breach, settled 2018): $16M. UCLA Health: $865K. Small practices: $25K–$250K. Each patient record = potentially one "violation."

De-identification removes or transforms PHI so that data is no longer subject to HIPAA. As far as HIPAA is concerned, de-identified data can be shared, used for research, analytics, or AI training, or published; contracts and other privacy laws may still apply. Two official methods under HIPAA:

Method 1: Safe Harbor

Remove ALL 18 identifiers listed in the Privacy Rule. Also: no actual knowledge that the remaining information could identify an individual.

What gets removed:

Name · DOB · ZIP · MRN · SSN · Phone · Email · IP · Device ID · Photo

What remains:

Year only · 3-digit ZIP* · Age (if ≤89) · ICD-10 code · Lab values

*ZIP first 3 digits only if population > 20,000 in that region
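The Safe Harbor transformations above (year only, 3-digit ZIP, ages capped at 89) can be sketched as one generalization step. The input field names and the contents of `LOW_POPULATION_ZIP3` are placeholders for illustration; the real set of low-population prefixes has to be built from Census data. The sketch also drops the birth year for anyone over 89, since the year alone would reveal that age.

```python
from datetime import date

# Placeholder values: 3-digit ZIP prefixes covering 20,000 or fewer people.
LOW_POPULATION_ZIP3 = {"036", "059"}


def generalize(dob: date, age: int, zip_code: str, clinical: dict) -> dict:
    """Apply Safe Harbor-style generalization to one record (sketch)."""
    over_89 = age > 89
    zip3 = zip_code[:3]
    out = {
        # Year is dropped for ages over 89 because it would reveal the age.
        "birth_year": None if over_89 else dob.year,
        "age": "90+" if over_89 else age,
        "zip3": "000" if zip3 in LOW_POPULATION_ZIP3 else zip3,
    }
    out.update(clinical)  # e.g. ICD-10 codes and lab values pass through
    return out
```

Name, SSN, MRN, and the other direct identifiers are not handled here at all; they are assumed to have been dropped before this step.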

Method 2: Expert Determination

A statistician or expert applies methods to determine the risk of re-identification is "very small." Allows more data to remain than Safe Harbor — including some dates and geographic detail.

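Common techniques: k-anonymity (each record is indistinguishable from at least k-1 others on its quasi-identifiers, such as age band and ZIP prefix), l-diversity (each such group also contains at least l distinct sensitive values), differential privacy (add calibrated statistical noise), tokenization (replace PHI with a token that can be reversed only through a protected mapping), data masking.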

Must be documented and defensible. Expert signs off on the methodology.
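As an illustration of the k-anonymity idea mentioned above, the k of a dataset is the size of its smallest group of rows sharing the same quasi-identifier values. The function below is a minimal sketch with assumed column names; it measures k and does not by itself amount to an expert determination.

```python
from collections import Counter


def k_anonymity(rows: list, quasi_identifiers: list) -> int:
    """Return the size of the smallest group sharing quasi-identifier values."""
    groups = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    return min(groups.values()) if groups else 0


rows = [
    {"age_band": "40-49", "zip3": "941", "icd10": "E11.9"},
    {"age_band": "40-49", "zip3": "941", "icd10": "I10"},
    {"age_band": "50-59", "zip3": "941", "icd10": "E11.9"},
]
k = k_anonymity(rows, ["age_band", "zip3"])
```

Here `k` is 1, because the single 50-59 row is unique on its quasi-identifiers. That row would need further generalization or suppression before release under a k ≥ 2 policy.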

Developer patterns for handling PHI safely:

Tokenization
Replace MRN "MRN-123" with opaque UUID. Store mapping in separate secured vault. API returns token, not raw PHI.

Test Data
NEVER use real patient records in dev/staging. Use synthetic data generators (Synthea) or properly de-identified datasets.

Logging
Scrub PHI from all logs before writing. Regex-strip SSN patterns, email addresses, MRNs. Use log masking middleware.

AI / LLM
De-identify before sending to any LLM API. Or use on-prem/enterprise agreements with BAA. Never send identifiable records to public APIs.
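The tokenization pattern above can be sketched as a small vault that hands out opaque UUIDs. The `TokenVault` class and its in-memory dictionaries are assumptions for illustration; in practice the mapping would live in a separate, encrypted, access-controlled store.

```python
import uuid


class TokenVault:
    """Map raw identifiers to opaque tokens and back (in-memory sketch)."""

    def __init__(self) -> None:
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        """Return a stable random token for a raw identifier."""
        if value not in self._token_by_value:
            token = str(uuid.uuid4())  # random, so it reveals nothing about value
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        """Resolve a token; callers should be authorized and audited."""
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("MRN-123")
```

APIs, URLs, and logs then carry `token` while only the vault can map it back to `"MRN-123"`. Because the token is reversible through the vault, tokenized data is still PHI for anyone with vault access.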

Lợi V. Nguyễn
