The Core Principle
CleanStart's philosophy is simple and uncompromising: Every dependency is untrusted until proven safe through rigorous verification.
This is the opposite of traditional software development, which follows a "trust by default" model: a dependency is published, assumed safe without verification, and used in production. Verification happens only reactively, if a problem is suspected, sometimes weeks or months later, by which point damage has often already occurred. CleanStart reverses this with a "verify by default" model: a dependency is published, systematically verified through multiple independent layers, tested against compatibility requirements, cryptographically signed, and only then approved for production use.
This model requires more upfront work than traditional approaches, but the investment eliminates entire categories of supply chain attacks that are difficult or impossible to detect reactively.
The Supply Chain Reality: 281M+ Relationships
Consider the scale of the global open source ecosystem: Go modules host 238 million packages, Crates hosts 16.6M Rust packages, npm hosts 12.5M JavaScript packages with billions of downloads per week, Maven Central contains 7.2M Java artifacts, PyPI hosts 3.4M Python packages, RubyGems contains 3.1M Ruby gems, and C++ package ecosystems contain 18.7K packages. These packages don't exist in isolation—they have 281 million documented dependency relationships where every package depends on other packages, creating an extraordinarily complex web of interconnections. This complexity creates enormous attack surface: attackers have 281 million potential insertion points where they could compromise a single package and affect millions of downstream applications that depend on it, directly or transitively.
Zero-Inheritance Architecture, in which images are built entirely from source without inheriting anything from upstream distributions, closes off this attack vector by removing the dependency on pre-built packages from registries.
Example: colors.js (npm package)
In January 2022, the maintainer of the popular colors.js package deliberately published a release containing an infinite loop that printed "LIBERTY LIBERTY LIBERTY" followed by garbled characters. The account had not been compromised: the sabotage came from the legitimate maintainer, and the package was already widely used. Because of the traditional "trust by default" model, the release reached thousands of projects before it was detected and remediated.
Four Verification Layers
CleanStart verifies dependencies through four independent verification layers that work together to catch different types of attacks and compromises. Source code integrity verification catches trojaned binaries and build environment compromises. Maintainer identity verification catches account compromises and deliberate malicious handoffs. Behavioral analysis catches hidden malicious code that source review misses. Runtime verification catches previously-verified packages that are compromised at deployment time.
Layer 1: Source Code Integrity
Layer 1 establishes confidence that source code corresponds exactly to the published binary, catching trojaned binaries and build environment compromises. This verification operates through a three-step process. First, the source code is fetched from upstream repositories and verified cryptographically, using git verify-tag to validate signed release tags (or git verify-commit for signed commits). Second, the source code is built reproducibly using the official build process. Third, the output is verified by computing the cryptographic hash of the rebuilt binary and comparing it to the upstream published hash. If the hashes match, the published binary was built from the verified source. If the hashes differ, one of three things has happened: the source code has been tampered with, the build environment differs in a meaningful way from the upstream environment, or the upstream binary was trojaned after publication. Each of these scenarios is a critical security concern that triggers an immediate alert and blocks the package from use.
The principle is simple but powerful: if you rebuild a package from its published source code and the resulting binary's hash doesn't match the upstream published hash, something is seriously wrong.
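The third step, comparing the rebuilt binary's hash to the published one, can be sketched in a few lines. This is a minimal illustration and not CleanStart's implementation: the function names are hypothetical, and SHA-256 is assumed as the hash algorithm because the text does not name one.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_rebuild(rebuilt_artifact: Path, published_sha256: str) -> bool:
    """True only if the rebuilt artifact hashes to the published digest.

    A False result does not say which of the three failure causes applies;
    it only says the package must be blocked and investigated.
    """
    return sha256_of_file(rebuilt_artifact) == published_sha256.strip().lower()
```

The comparison only means something if the build is reproducible; otherwise harmless differences such as embedded timestamps would produce mismatches as well.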
Layer 2: Maintainer Identity
Layer 2 verifies that package ownership and control have not changed unexpectedly, catching account compromises and deliberate handoffs where new maintainers insert malicious code. It combines several complementary signals: signature chain verification, which ensures packages are signed by the expected maintainer's key; maintainer history analysis, which tracks whether maintenance has changed hands unexpectedly; code authorship analysis, which verifies that commits come from recognized developers; account security monitoring, which checks for unusual activity; and signing key verification, which checks that keys are verifiable through public key servers.
A historical example illustrates the importance of Layer 2 verification. In the 2018 event-stream compromise, the original maintainer of the popular event-stream package transferred ownership to a new maintainer. That new maintainer then published a malicious version that attempted to steal cryptocurrency by targeting Bitcoin wallet software. Layer 2 verification catches exactly this scenario: it would detect the maintainer change, flag the package for additional scrutiny, and likely block the malicious version from use even before the malicious code was discovered manually.
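Two of the Layer 2 signals, a change of publisher and a change of signing key between consecutive releases, can be sketched as a simple comparison. The `Release` record and its field names are assumptions made for illustration; real registry metadata is shaped differently, and the account names and key fingerprints in the usage note are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    version: str
    publisher: str     # registry account that published the release (hypothetical field)
    signing_key: str   # fingerprint of the key that signed it (hypothetical field)


def maintainer_flags(previous: Release, candidate: Release) -> list[str]:
    """Return reasons to escalate a release for additional scrutiny."""
    flags = []
    if candidate.publisher != previous.publisher:
        flags.append(
            f"publisher changed: {previous.publisher} -> {candidate.publisher}"
        )
    if candidate.signing_key != previous.signing_key:
        flags.append("signing key changed")
    return flags
```

For example, `maintainer_flags(Release("1.0.0", "alice", "KEY-A"), Release("1.0.1", "mallory", "KEY-B"))` returns two flags, while an unchanged publisher and key return an empty list. A flag is a reason for review rather than proof of compromise, since legitimate handoffs happen too.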
Layer 3: Behavioral Analysis
Layer 3 executes packages in a controlled, sandboxed environment and monitors their actual runtime behavior, catching malicious code that may have evaded source code review and maintainer identity checks. The behavioral analysis monitors all system calls made by the package, creating a comprehensive log of what the package actually does at runtime. Four categories of behavior receive particular scrutiny: network access (whether the package makes external network calls), file system access (whether the package writes to unexpected locations), process spawning (whether the package executes other binaries), and cryptographic activity (whether the package uses cryptographic functions).
The principle underlying Layer 3 is that package behavior should match package purpose. A build tool should make compilation system calls, read source files, and write compiled artifacts—these are expected behaviors. It should not contact external servers, write to SSH directories, or spawn shell processes. A JSON parsing library should parse JSON—it should read memory and perform basic computation, but it should not access the network, write to files outside its expected scope, or execute other programs. When actual runtime behavior diverges from expected behavior, it's a red flag indicating potential malicious code.
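The "behavior should match purpose" rule can be sketched as a policy check over observed events. The category names mirror the four categories above, but the `Event` and `Policy` types, the example paths, and the idea of allowed write prefixes are assumptions made for illustration, not a description of CleanStart's sandbox.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    category: str   # "network", "filesystem_write", "process_spawn" or "crypto"
    detail: str     # e.g. a destination host or a file path


@dataclass(frozen=True)
class Policy:
    allowed_categories: frozenset = frozenset()
    write_prefixes: tuple = ()   # directories the package may write to


def violations(policy: Policy, events: list) -> list:
    """Return every observed event that the package's purpose does not justify."""
    found = []
    for event in events:
        if event.category not in policy.allowed_categories:
            found.append(event)
        elif event.category == "filesystem_write" and not event.detail.startswith(
            policy.write_prefixes
        ):
            found.append(event)
    return found


# Hypothetical policies for the two examples in the text.
BUILD_TOOL = Policy(frozenset({"filesystem_write"}), ("/build/out/",))
JSON_PARSER = Policy()   # no network, no writes, no spawning, no crypto
```

With `BUILD_TOOL`, a write to `/build/out/app.o` passes, while a write to `/home/user/.ssh/authorized_keys` or any network event is reported. With `JSON_PARSER`, every event in these four categories is reported.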
Layer 4: Runtime Verification
Layer 4 extends verification into production, continuously validating that packages behave as expected even after deployment. Signature verification confirms that the package running in production is exactly the package that was verified in the earlier layers. Runtime isolation constrains what the package can do even if it attempts something malicious. Behavioral monitoring continues in production, watching for unexpected system calls or network activity. Incident response mechanisms enable rapid remediation through automatic pod termination, automatic workload migration, and automatic deployment of a known-clean previous version.
This defense-in-depth approach ensures that even if the first three layers are bypassed, or a previously verified package is compromised at runtime, the damage is contained and remediated automatically.
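Kubernetes-level verification:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  annotations:
    # Bandwidth limit; honored only by CNI plugins that support traffic shaping
    kubernetes.io/egress-bandwidth: "100M"
spec:
  securityContext:
    runAsNonRoot: true
  containers:
    - name: myapp
      image: myapp@sha256:abc123...   # Pin to exact digest
      securityContext:                # These two fields are set per container
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
      volumeMounts:
        - name: app
          mountPath: /app
          readOnly: true              # Can't modify application code
      resources:
        limits:
          cpu: "1000m"
          memory: "512Mi"
  volumes:
    - name: app
      emptyDir: {}                    # Placeholder volume source (assumption)
# Result: Even if a package is malicious, it's constrained
```

Dependency Graph Analysis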
CleanStart builds a complete dependency graph for every application to understand exactly which dependencies are included. For myapp 1.0.0, the graph includes express@4.18.2, which depends on accepts@1.3.8 and body-parser@1.20.1. body-parser in turn depends on bytes@3.1.2, content-type@1.0.4, and iconv-lite@0.4.24 (which depends on safer-buffer@2.1.2), plus 5 additional transitive dependencies. Express also has 15 direct dependencies. The application also depends on pg@8.9.0 and lodash@4.17.21 (which has no further dependencies).
In total, the application has 52 packages. CleanStart verifies 51 of them (98%), leaving one for human review. The analysis identifies 12 known vulnerabilities: 5 in transitive dependencies and 7 in code that the application doesn't actually use. Every node in the graph is verified independently, and if any node fails verification, the entire path is blocked.
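The rule that one failed node blocks the entire path can be sketched as a graph traversal. The graph below contains only the edges named in the text, with the remaining dependencies omitted, and the failure of safer-buffer in the usage note is a hypothetical chosen purely to show the mechanism.

```python
# Only the edges spelled out in the text; the other packages are omitted.
GRAPH = {
    "myapp@1.0.0": ["express@4.18.2", "pg@8.9.0", "lodash@4.17.21"],
    "express@4.18.2": ["accepts@1.3.8", "body-parser@1.20.1"],
    "body-parser@1.20.1": ["bytes@3.1.2", "content-type@1.0.4", "iconv-lite@0.4.24"],
    "iconv-lite@0.4.24": ["safer-buffer@2.1.2"],
}


def blocked_paths(graph: dict, root: str, failed: set) -> list:
    """Return each path from root to a package that failed verification."""
    paths = []
    stack = [(root, (root,))]
    while stack:
        node, path = stack.pop()
        if node in failed:
            paths.append(path)
            continue   # everything above this node on the path is blocked
        for child in graph.get(node, ()):
            if child not in path:   # guard against dependency cycles
                stack.append((child, path + (child,)))
    return paths
```

Calling `blocked_paths(GRAPH, "myapp@1.0.0", {"safer-buffer@2.1.2"})` returns a single path running from myapp through express, body-parser, and iconv-lite to safer-buffer, which is the chain that would have to be held back until the failure is resolved.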
Attack Scenarios CleanStart Prevents
Scenario 1: Compromised Build System
Consider an attacker who successfully compromises npm's build server and injects a backdoor into packages before they are distributed. In the traditional "trust by default" approach, packages would appear normal because they are signed by npm's key, which users trust. The compromised packages would circulate globally for days before the compromise is discovered. In the CleanStart approach, reproducible build verification immediately detects the problem. When the Package Factory rebuilds the package from source, the resulting hash will not match the published hash—because the published binary was trojaned. This hash mismatch triggers an immediate alert, and the package is blocked from use. The attack is contained in minutes, affecting zero applications.
Scenario 2: Typosquatting
Consider an attacker who registers a typosquatted package name like "expres" (one letter different from "express") and publishes malicious code under that name. In the traditional approach, a developer might accidentally type the wrong package name, the typosquatted package is installed, and malicious code runs in the application. In the CleanStart approach, the typosquatted package would immediately fail behavioral analysis. The malicious code would attempt suspicious activity like network calls to external servers or attempts to write to sensitive files. These suspicious behaviors would be detected, the package blocked from use, and the developer would be notified of the typo. The attack is prevented entirely.
Scenario 3: Stolen Maintainer Credentials
Consider an attacker who obtains a maintainer's GitHub credentials (perhaps through phishing or credential reuse) and publishes a malicious version of a popular package. In the traditional approach, the malicious version is published with a valid cryptographic signature—because it was signed using the stolen credentials. In the CleanStart approach, behavioral analysis detects suspicious activity in the malicious version. The maintainer identity verification layer also identifies the suspicious release—perhaps the signing key was used from a new geographic location, or the release pattern is unusual. The package fails verification and is blocked from distribution, preventing any applications from being affected.
Scenario 4: Supply Chain Pivot
Consider an attacker who carefully contributes legitimate features to a popular open source package, gradually gains the trust of the original maintainers, eventually becomes a maintainer, and then publishes a backdoor in what appears to be a routine maintenance release. In the CleanStart approach, the malicious code would be caught by multiple verification layers. Code auditing would reveal suspicious code patterns. Behavioral analysis would detect the hidden malicious functionality. VCS history analysis would show sudden changes in code style or approach that differ from previous contributions. The package would be blocked before distribution, and the original maintainers would be alerted to investigate their repository for compromise.
The Cost of Verification
Four layers of verification take significant time: source code integrity checks take 5-10 minutes per package, maintainer identity verification takes 2-5 minutes, behavioral analysis takes 15-30 minutes, and runtime verification happens continuously with minimal overhead. In total, each package takes 30-50 minutes to verify. For an application with 52 packages, that amounts to 26-43 hours of verification work.
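The arithmetic behind these figures can be checked directly. Note that the three itemized layers sum to 22-45 minutes, so the stated 30-50 minutes per package presumably includes overhead that is not itemized. That reading is an assumption, and the variable names are illustrative.

```python
# Per-layer ranges in minutes (low, high), as stated in the text.
LAYER_MINUTES = {
    "source_integrity": (5, 10),
    "maintainer_identity": (2, 5),
    "behavioral_analysis": (15, 30),
}
PER_PACKAGE_MINUTES = (30, 50)   # stated per-package total
PACKAGES = 52

# Sum of the itemized layers: (22, 45) minutes.
layer_sum = tuple(sum(r[i] for r in LAYER_MINUTES.values()) for i in (0, 1))

# 52 packages at 30-50 minutes each, converted to hours: 26.0 to about 43.3.
total_hours = tuple(PACKAGES * minutes / 60 for minutes in PER_PACKAGE_MINUTES)
```

The 26-43 hour figure therefore follows from the 30-50 minute per-package total, not from the sum of the itemized layers.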
However, CleanStart performs this verification once per package and then reuses the result across all applications that depend on that package. Verification scales linearly with the number of unique packages in the entire ecosystem (approximately 5 million), not with the number of applications (billions). Once express@4.18.2 is verified, every application using that exact version can instantly reuse that verification result. This scaling model makes comprehensive verification practical.
Once express@4.18.2 is verified the first time (taking 45 minutes), the result is cached globally along with the signature, SBOM, and attestation. The next 1,000 applications that depend on express@4.18.2 can look up the verification result in the cache in less than a second, verify the stored signature in another second, check for any newly discovered vulnerabilities in one more second, and instantly use the cached verified package. This caching model eliminates the need to reverify the same package version millions of times.
This is why CleanStart's approach scales: verification is done once, results are cached globally.
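The verify-once, reuse-everywhere idea can be sketched as a cache keyed by package name and version. The class below is a hypothetical in-memory stand-in: the text describes a global cache that also stores the signature, SBOM, and attestation, none of which is modeled here.

```python
from typing import Callable


class VerificationCache:
    """Runs the expensive verification at most once per package version."""

    def __init__(self, verify: Callable[[str], bool]):
        self._verify = verify                # the slow, multi-layer verification
        self._results: dict[str, bool] = {}
        self.verifications_run = 0           # counts how often verify() was called

    def is_verified(self, package_version: str) -> bool:
        if package_version not in self._results:
            self.verifications_run += 1
            self._results[package_version] = self._verify(package_version)
        return self._results[package_version]
```

If 1,000 applications each call `is_verified("express@4.18.2")` on the same cache, the underlying verification runs once and `verifications_run` stays at 1. A real cache would also need an invalidation path so that a newly discovered vulnerability can revoke an earlier result.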
The Verified Source Index
CleanStart maintains a global index of verified packages. The entry for express@4.18.2, for example, includes its verification status, verification date, source and binary hashes, maintainer verification, behavioral analysis results, known vulnerabilities, cryptographic signatures, SBOM, SLSA attestation, and VEX.
This index is queryable, allowing any system to instantly check the verification status of any package version. It's auditable, with every verification decision logged and traceable back to the specific checks that were performed. It's distributed across multiple CleanStart servers with replication, ensuring that no single point of failure can compromise the index. And it's tamper-evident, cryptographically signed to ensure that the index cannot be modified without detection.
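The tamper-evident property can be illustrated with a keyed digest over a canonical serialization of an index entry. This sketch swaps in HMAC-SHA256 from the standard library for simplicity; a production index would more plausibly use asymmetric signatures so that readers can verify entries without holding the signing key. The entry layout and all field values are placeholders.

```python
import hashlib
import hmac
import json


def canonical_bytes(entry: dict) -> bytes:
    """Serialize an entry deterministically so equal entries sign identically."""
    return json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_entry(entry: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the canonical form of the entry."""
    return hmac.new(key, canonical_bytes(entry), hashlib.sha256).hexdigest()


def entry_is_intact(entry: dict, signature: str, key: bytes) -> bool:
    """True only if the entry is unchanged since it was signed."""
    return hmac.compare_digest(sign_entry(entry, key), signature)


# Hypothetical entry; the field names and values are placeholders.
EXAMPLE_ENTRY = {
    "package": "express@4.18.2",
    "status": "verified",
    "source_sha256": "<placeholder>",
    "binary_sha256": "<placeholder>",
}
```

Changing any field, for instance flipping `status`, yields a different tag, so `entry_is_intact` returns False for the modified entry.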
Philosophy in Practice
The verified source philosophy is simple: "No code runs in production unless verified." The implementation has six steps:
1. A developer submits code.
2. The application factory identifies all dependencies.
3. For each dependency, it checks the verified index to see whether it has already been verified.
4. If a dependency is missing or unverified, the verification process is triggered automatically.
5. Only after every single dependency is verified does the build get approval.
6. The image is signed and deployed only to verified infrastructure with signature checks enabled.
This approach eliminates the possibility of deploying code that hasn't been verified.
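The approval gate at the heart of this process can be sketched as a single function. It covers only the index lookup, on-demand verification, and approval decision; signing and deployment are out of scope. The function name and the shape of the index (a plain set of `name@version` strings) are assumptions for illustration.

```python
from typing import Callable, Iterable


def gate_build(
    dependencies: Iterable[str],
    verified_index: set,
    verify: Callable[[str], bool],
) -> tuple:
    """Approve a build only when every dependency is verified.

    Dependencies missing from the index are verified on demand and added
    to the index if they pass. Returns (approved, rejected_dependencies).
    """
    rejected = []
    for dependency in dependencies:
        if dependency in verified_index:
            continue                       # already verified, reuse the result
        if verify(dependency):
            verified_index.add(dependency)
        else:
            rejected.append(dependency)
    return (not rejected, rejected)
```

A build listing `express@4.18.2` and `pg@8.9.0` is approved once both are in the index or pass verification; a single rejected dependency is enough to withhold approval for the whole build.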
This is zero trust applied to the software supply chain.
It's more stringent than the industry standard, but it's the model that prevents supply chain attacks at scale.
The tradeoff: slightly longer build times (minutes of verification) for dramatically improved security (no package reaches production without passing verification).
For organizations building critical infrastructure, this tradeoff is worth every second of the verification process.
Next Steps: Apply Verified Source in Your Pipeline
Understand the full supply chain — To see verified source in context, read What is Supply Chain Security? for an overview of all supply chain attack vectors, Pre-Build Stage Security for controlling what code enters your build, What is SBOM? for inventorying what's in your images, and What is SLSA? for the provenance framework behind verified sources.
Implement verified source builds — To put philosophy into practice, explore Build Stage Security for hermetic, verified builds, The Continuous Trust Loop for automated rebuilds when verification fails, The 11 Build Artifacts for what verified builds generate, and Dockerfile to YAML Migration for declarative, verified specs.
Operate with verified sources — To keep verification in production, see End-to-End Secure Deployment for deploying verified images, Helm Charts & Kubernetes for running verified images at scale, and Upgrade & Patching Playbook for maintaining verification over time.
