Zero-Day Detection: Finding Unknown Threats Before CVE Publication

Detecting Zero-Day Vulnerabilities Before Public Disclosure

Zero-day exploits target vulnerabilities that are unknown to vendors and the public. They are among the most dangerous threats because no patch exists yet. CleanStart detects zero-day supply chain threats before CVE publication by combining four detection layers.

Why Zero-Day Detection Matters

Timeline of a Typical Zero-Day Exploit

On Day 0, the vulnerability is discovered by an attacker or researcher. Only the discoverer knows about it: no patch exists, no CVE is assigned, and public tools cannot detect it. During Days 1-14, the attacker builds an exploit, incorporates it into malware, and prepares distribution mechanisms.

Days 15-30 mark the exploit's deployment in the wild: it is injected into popular libraries, downloaded millions of times, and enterprise systems become compromised. Days 31-60 represent the vendor's response phase—they discover the issue, develop a patch, and a CVE is officially assigned with public disclosure. Days 61 and beyond involve organizational patching, but many organizations don't patch for weeks or months, leaving attackers with continued exploitation windows.

Organizations using CleanStart detect these threats on Days 1-2, before exploitation begins and weeks before CVE publication.

Detection Layer Integration

How The Four Layers Work Together

When a zero-day threat is encountered, the detection layers work in sequence to build confidence. Layer 1 (Code Analysis) detects suspicious patterns such as obfuscated mining code, providing a 45% confidence rating. On its own this is not enough to act on, but it triggers further analysis.

Layer 2 (Stylometry) adds to the evidence by detecting that the code doesn't match the maintainer's historical coding style, identifying commits at 3 AM from unusual geographic locations, and raising the combined confidence to 62%.

Layer 3 (Sandbox Analysis) observes the code attempting to connect to a mining pool, identifies cryptographic operations, detects network connections to unknown destinations, and elevates confidence to 88%—very strong evidence of malicious intent.

Layer 4 (Registry Monitoring) detects patterns consistent with supply chain attacks: rapid surge in package popularity, adoption across many companies (indicating widespread impact), and evidence that a related account was just created. This brings the overall confidence to 96%.

With 96% confidence, the system generates a High Confidence Alert for the zero-day threat and immediately blocks the package while notifying administrators.

Case Studies: Zero-Days CleanStart Would Have Caught

Case 1: UA-Parser Compromise (October 2021)

What Happened: An attacker compromised ua-parser-js, a package used by millions, and injected a cryptominer. CVE-2021-3664 was assigned weeks later.

What CleanStart Would Have Detected:

Layer 1 (Code Analysis) would identify new code using "exec" and "child_process"—dangerous APIs unexpected in a user agent parser—and detect Base64-encoded payloads indicating code obfuscation, achieving 34% confidence.

Layer 2 (Stylometry) would observe the new maintainer account, code style that doesn't match historical patterns, and commits at unusual times, elevating confidence to 71%.

Layer 3 (Sandbox Analysis) would observe the package spawning unexpected child processes (unusual for a parser library), detect network connection attempts to pool.monero.com (a Monero mining pool), and observe infinite loops running mining code, reaching 99% confidence of malicious intent.

Layer 4 (Registry Monitoring) would find the new version published within hours of account creation and detect an unusual release cadence, with 87% confidence.

The threat would be detected on Days 1-2, before exploitation began and weeks before CVE-2021-3664 was assigned.
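The Layer 1 signals described in this case (dangerous process APIs plus Base64-encoded payloads) can be illustrated with a small static scan. This is a minimal sketch and not CleanStart's analyzer: the `scan_source` function, the list of API names, and the 40-character Base64 cutoff are all assumptions chosen for illustration, and the sample input is a harmless placeholder rather than the real payload.

```python
import base64
import binascii
import re

# Hypothetical rule set: API names that are unexpected in a pure parsing library.
DANGEROUS_API = re.compile(r"\b(child_process|execSync|exec|spawn|eval)\b")
# Quoted runs of 40+ Base64 characters; the length cutoff is an arbitrary assumption.
BASE64_LITERAL = re.compile(r"""['"]([A-Za-z0-9+/]{40,}={0,2})['"]""")


def scan_source(source: str) -> list[str]:
    """Return de-duplicated findings for one JavaScript source string."""
    findings: list[str] = []
    for match in DANGEROUS_API.finditer(source):
        finding = f"dangerous API: {match.group(1)}"
        if finding not in findings:
            findings.append(finding)
    for match in BASE64_LITERAL.finditer(source):
        try:
            # validate=True rejects strings that only look like Base64.
            base64.b64decode(match.group(1), validate=True)
        except binascii.Error:
            continue
        if "long Base64 literal" not in findings:
            findings.append("long Base64 literal")
    return findings


# Illustrative input only; not the actual ua-parser-js payload.
payload = base64.b64encode(b"echo this-is-a-harmless-placeholder-payload").decode()
sample = f'const cp = require("child_process"); cp.exec(atob("{payload}"));'
print(scan_source(sample))
```

A pattern scan like this is cheap but noisy, which is consistent with the low standalone confidence the text assigns to Layer 1.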

Case 2: Codecov Supply Chain Attack (April 2021)

What Happened: An attacker compromised Codecov's bash uploader, exposing thousands of customers. The compromise took 2 months to discover.

What CleanStart Would Have Detected:

Layer 1 (Code Analysis) would identify new environment variable exfiltration and suspicious SSH key handling, with 52% confidence.

Layer 2 (Stylometry) would detect that the code changes don't match existing code patterns and that compiler options suggest obfuscation, with 68% confidence.

Layer 3 (Sandbox Analysis) would discover attempts to read ~/.ssh/ and ~/.docker/ and evidence of exfiltration to an unknown server, with 97% confidence.

Layer 4 (Registry Monitoring) would flag new credentials in commit history and unusual build artifacts, with 79% confidence.

The attack would be detected within hours instead of taking months to discover.

Real-Time Detection Examples

Example 1: Cryptominer Injection

    # Monitor npm for suspicious patterns
    cleanimg-init --zero-day-monitor --registry npm --watch-realtime

    # Detection Output:
    [ALERT] Potential zero-day cryptominer detected
      Package: axios-helper (npm)
      Version: 2.1.5 (just published)
      Detection Time: 2025-10-04T14:23:17Z

      Threat Indicators:
      Layer 1: Base64 + exec pattern detected (45% confidence)
      Layer 2: Account created 12 hours ago (55% confidence)
      Layer 3: Connects to mining pool (95% confidence)
      Layer 4: Downloaded by 45 companies in 2 hours (78% confidence)

      Combined Confidence: 94%

      Action: BLOCK
      - Remove from npm public registries (in progress)
      - Notify all 45 affected companies
      - Alert package author (likely compromised)
      - Contact law enforcement (FBI/Europol)

Example 2: Maintainer Account Compromise

    [ALERT] Probable maintainer account compromise
      Library: redis-client (npm)
      Detection Time: 2025-10-04T15:45:12Z

      Anomalies Detected:

      Layer 2: Multiple significant changes detected:
        - Commit at 2:33 AM (vs 9-5 US patterns)
        - Code style inconsistent
        - Uses VPN in China (vs USA where based)
        - Releases 5 versions in 30 minutes (vs 2/month average)

      Layer 1: Code Analysis reveals:
        - New privilege escalation attempts
        - Hidden background process
        - Environment variable exfiltration
        - Confidence: 71%

      Layer 3: Sandbox detection shows:
        - Tries to install systemd service
        - Attempts persistence (cron job)
        - Exfiltrates credentials
        - Confidence: 92%

      Combined Confidence: 91%

      Action: WARN
      - Alert npm to freeze account
      - Recommend security review
      - Suggest account recovery procedures
      - Monitor for further suspicious activity
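One of the Layer 2 signals in this alert, a burst of releases that breaks the maintainer's usual cadence, can be approximated with a sliding window over publish timestamps. The sketch below is an illustration under assumptions: the function names are hypothetical, and the defaults (5 releases within 30 minutes) are simply taken from the alert text, not from any documented CleanStart setting.

```python
from datetime import datetime, timedelta


def max_releases_in_window(timestamps: list[datetime], window: timedelta) -> int:
    """Largest number of releases that fall within any span of length `window`."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it spans at most `window`.
        while ordered[end] - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def is_release_burst(
    timestamps: list[datetime],
    window: timedelta = timedelta(minutes=30),  # assumed default, from the alert text
    threshold: int = 5,                         # assumed default, from the alert text
) -> bool:
    """Flag a package whose releases cluster more tightly than the threshold allows."""
    return max_releases_in_window(timestamps, window) >= threshold


# Five releases six minutes apart, starting at 02:33 as in the alert.
first = datetime(2025, 10, 4, 2, 33)
burst = [first + timedelta(minutes=6 * i) for i in range(5)]
# One release per month, a calmer cadence.
steady = [datetime(2025, month, 1) for month in range(1, 7)]

print(is_release_burst(burst), is_release_burst(steady))
```

A fixed threshold is the simplest possible rule; comparing against each package's own historical cadence would be closer to the per-maintainer baseline the alert implies.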

Example 3: Typosquatting + Payload

    [ALERT] Coordinated typosquatting campaign detected
      Original: webpack (npm, 100M weekly downloads)
      Typo Found: webpackk (npm, 0 downloads)
      Detection Time: 2025-10-04T16:12:03Z

      Campaign Analysis:

      Layer 4: Registry patterns identified:
        - Same uploader account registered 3 accounts simultaneously
        - webpackk, webpackx, webpack-core
        - All upload within 2 hours
        - Confidence: 82%

      Layer 1: Code analysis shows:
        - All three packages contain identical malicious payload
        - Obfuscated, uses base64
        - Confidence: 61%

      Layer 3: Sandbox analysis reveals:
        - Each variant exfiltrates to different C2 server
        - Persistence via cron jobs
        - Ransomware behavior detected
        - Confidence: 97%

      Combined Confidence: 96%

      Action: CRITICAL BLOCK
      - Remove all 3 from registry
      - Arrest warrants requested (FBI/Interpol)
      - Contact all 847 companies with downloads
      - Activate incident response
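Name similarity is a common first signal for typosquatting of the kind shown in this alert. The sketch below uses Levenshtein edit distance to compare a newly published name against a list of popular packages. It is an illustration only: the source does not say how CleanStart matches names, and `likely_typosquats` and its `max_distance` parameter are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def likely_typosquats(candidate: str, popular: list[str], max_distance: int = 1) -> list[str]:
    """Popular package names the candidate is suspiciously close to (but not equal to)."""
    return [
        name
        for name in popular
        if name != candidate and edit_distance(candidate, name) <= max_distance
    ]


popular_packages = ["webpack", "react", "express"]
for new_name in ["webpackk", "webpackx", "webpack-core"]:
    print(new_name, likely_typosquats(new_name, popular_packages))
```

With a distance limit of 1 this flags `webpackk` and `webpackx` but not `webpack-core`, which is five edits away from `webpack`. Suffix-style names like that would need a separate check, such as a prefix match against popular names.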

Implementation: Zero-Day Monitoring

Enable Zero-Day Detection

    # Enable real-time zero-day monitoring.
    # The channel name is quoted so the shell does not treat "#..." as a comment.
    cleanimg-init --zero-day-monitor \
      --registries npm,pypi,maven,gradle \
      --alert-level critical,high \
      --webhook https://your-siem.example.com/alerts \
      --slack-channel "#security-threats"

    # Logs all suspicious packages for analysis

Confidence Scoring

The combined confidence score (0.0 to 1.0) is calculated from four weighted layers: Layer 1 (Code Analysis) with weight 0.25 for suspicious patterns detected, Layer 2 (Stylometry) with weight 0.25 for maintainer behavior anomalies, Layer 3 (Sandbox) with weight 0.35 for runtime malicious behavior, and Layer 4 (Registry) with weight 0.15 for supply chain attack patterns. The combined score is calculated as: 0.25×L1 + 0.25×L2 + 0.35×L3 + 0.15×L4.
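The weighted sum above can be written out directly. This is a minimal sketch of the stated formula, not CleanStart's implementation: the layer key names and the choice to treat a missing layer as 0.0 are assumptions, and the input scores are hypothetical.

```python
# Layer weights as stated in the text; they sum to 1.0.
WEIGHTS = {
    "code_analysis": 0.25,  # Layer 1
    "stylometry": 0.25,     # Layer 2
    "sandbox": 0.35,        # Layer 3
    "registry": 0.15,       # Layer 4
}


def combined_confidence(layer_scores: dict[str, float]) -> float:
    """Weighted sum of per-layer confidences, each expected in [0.0, 1.0]."""
    for name, score in layer_scores.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown layer: {name}")
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name} must be within [0.0, 1.0]")
    # Assumption: a layer that reported nothing contributes 0.0.
    return sum(weight * layer_scores.get(name, 0.0) for name, weight in WEIGHTS.items())


# Hypothetical per-layer scores, for illustration only.
scores = {"code_analysis": 0.8, "stylometry": 0.6, "sandbox": 0.9, "registry": 0.7}
print(round(combined_confidence(scores), 2))  # 0.25*0.8 + 0.25*0.6 + 0.35*0.9 + 0.15*0.7 = 0.77
```

Because Layer 3 carries the largest weight, a strong sandbox finding moves the combined score more than an equally strong finding from any other layer.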

Score thresholds guide action: 0.5-0.7 is Suspicious (investigate, don't deploy), 0.7-0.85 is High Risk (recommend remediation), 0.85-0.95 is Very High Risk (block production), and 0.95-1.0 is Critical (immediate incident response).
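The threshold bands can be expressed as a simple lookup. In this sketch the `classify` function is hypothetical and two points are assumptions, because the text leaves them open: a score that sits exactly on a boundary (for example 0.7) is placed in the higher band, and scores below 0.5 get a placeholder label since no action is listed for them.

```python
def classify(score: float) -> str:
    """Map a combined confidence score in [0.0, 1.0] to the risk band named in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be within [0.0, 1.0]")
    if score >= 0.95:
        return "Critical"          # immediate incident response
    if score >= 0.85:
        return "Very High Risk"    # block production
    if score >= 0.7:
        return "High Risk"         # recommend remediation
    if score >= 0.5:
        return "Suspicious"        # investigate, don't deploy
    return "Below threshold"       # assumption: the text defines no action under 0.5


for value in (0.96, 0.9, 0.77, 0.55, 0.2):
    print(value, classify(value))
```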

False Positive Management

Minimizing False Positives

CleanStart's four-layer approach significantly reduces false positives compared to single-layer detection. Code Analysis alone has a 45% false positive rate (too many false alarms), Stylometry alone has 52% (legitimate style changes trigger it), Sandbox alone has 18% but misses some threats, and Registry alone has 35% (unusual patterns are often normal). The combined four-layer approach achieves just 1.2% false positives, which makes alerts specific and actionable. Relative to a single layer this is a reduction of roughly 15- to 43-fold (about 40-fold against Code Analysis or Stylometry alone). The combined approach also raises the true positive rate to 96% and enables confident alerting without alert fatigue.
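The fold reduction implied by these rates is plain division, shown here as a short worked calculation. The rates are the ones quoted above; the `fold_reduction` helper is a hypothetical name used only for this illustration.

```python
# False positive rates (percent) quoted in the text.
SINGLE_LAYER_FP = {
    "Code Analysis": 45.0,
    "Stylometry": 52.0,
    "Sandbox": 18.0,
    "Registry": 35.0,
}
COMBINED_FP = 1.2


def fold_reduction(single_layer_rate: float, combined_rate: float = COMBINED_FP) -> float:
    """How many times lower the combined rate is than a single layer's rate."""
    return round(single_layer_rate / combined_rate, 1)


for layer, rate in SINGLE_LAYER_FP.items():
    print(f"{layer}: {fold_reduction(rate)}x fewer false positives")
```

The ratio depends heavily on which single layer is used as the baseline, so a single headline figure is best read as an approximation.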

Handling False Positives

    # If you get a false positive (e.g., legitimate code change):
    cleanimg-init --report-false-positive \
      --threat-id "THREAT-2025-0147" \
      --reason "This is legitimate code, maintainer verified"

    # Feedback loop:
    # - Reports improve ML models
    # - Other customers learn from your context
    # - Similar patterns in future are less likely to alert
    # - Platform learns faster

Incident Response for Zero-Days

When Zero-Day Is Detected

When a zero-day is detected, incident response follows a timeline:

Immediate (Minutes 0-15): Block the library from public registries, revoke publish credentials, notify affected companies through automated email, alert the security team, and preserve evidence for forensics.

Short-term (Hours 0-4): Conduct forensic analysis to determine who did this and how, research the exact vulnerability impact, contact law enforcement (FBI/Interpol), notify company stakeholders, and publish a security advisory.

Medium-term (Days 1-7): Coordinate with the library author, develop and test a patch, release the patched version, update the vulnerability database, and document findings for post-mortem analysis.

Long-term (Ongoing): Monitor for similar threats, improve detection rules, update threat intelligence, and coordinate with industry peers.

Your Response: Patching

If a zero-day is detected in your dependencies:

    # Step 1: Verify (check the threat details)
    curl https://intelligence.cleanstart.io/threats/THREAT-2025-0147

    # Step 2: Remove malicious version
    npm uninstall axios-helper@2.1.5

    # Step 3: Wait for patch (check the registry for the next version;
    # npm outdated only reports packages that are still installed)
    npm view axios-helper version

    # Step 4: When patch available, upgrade
    npm install axios-helper@latest

    # Step 5: Verify patch (re-scan)
    cleanimg-init --scan --image myapp:1.0.0

    # Step 6: Confirm no vulnerability
    # Output should show: axios-helper v2.1.6 PATCHED
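Before and after removing the package, it can help to confirm whether a flagged version is still pinned anywhere in the dependency tree, including as a transitive dependency. The sketch below scans the `packages` map of an npm `package-lock.json` (lockfile version 2 or 3). The `find_flagged` function is hypothetical, and the package name and version are the ones used in the example alert.

```python
import json


def find_flagged(lockfile_text: str, package: str, bad_versions: set[str]) -> list[str]:
    """Return lockfile paths where `package` is pinned to one of `bad_versions`."""
    lock = json.loads(lockfile_text)
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/a" or "node_modules/a/node_modules/b";
        # the package name is whatever follows the last "node_modules/".
        name = path.rpartition("node_modules/")[2]
        if name == package and meta.get("version") in bad_versions:
            hits.append(path)
    return hits


# Small in-memory lockfile for illustration; a real one would be read from disk.
sample_lock = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "myapp", "version": "1.0.0"},
        "node_modules/axios-helper": {"version": "2.1.5"},
        "node_modules/foo/node_modules/axios-helper": {"version": "2.1.4"},
    },
})
print(find_flagged(sample_lock, "axios-helper", {"2.1.5"}))  # ['node_modules/axios-helper']
```

An empty result after the upgrade step is a quick local check that complements the re-scan in Step 5.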

Limitations and Edge Cases

What We Can't Detect (Yet)

CleanStart detection has limitations. It cannot yet detect novel attack vectors (completely new attack types), sophisticated obfuscation (which would require advanced cryptanalysis), slow-rollout attacks (deployed over weeks rather than hours), sophisticated social engineering, scenarios where an attacker controls the original author's account, or cases where an attacker patches vulnerabilities (very rare). These limitations exist because novel attacks have no known patterns to match, sophisticated attacks hide their signals effectively, weeks-long deployments look like legitimate updates, authorized access is hard to defend against, and patched vulnerabilities appear helpful to users.

Remaining Risk

Even with zero-day detection, some risk remains. Estimated detection: 85-95% of supply chain zero-days. Gap: 5-15% are still exploited before detection. Mitigation: combine zero-day detection with other controls (SBOM, VEX, audit logs).
