
CLOUD GOVERNANCE LESSONS

What We Learned Analyzing 500,000 Lines of Cloud Governance Policy

PolicyCortex Team | March 14, 2026 | 8 min read
cloud governance lessons · cloud compliance · OPA policy · infrastructure as code · CMMC cloud security

We've spent years working inside defense contractor cloud environments — analyzing policy configurations, reviewing IaC repositories, examining misconfiguration patterns, and tracing the path from compliance documentation to actual enforcement.

At some point we started aggregating what we were seeing systematically. The patterns that emerged from analyzing hundreds of thousands of lines of cloud governance policy — OPA rules, AWS SCPs, Azure Policy definitions, Terraform modules, NIST SSP control implementations — are worth documenting. Not as abstract lessons, but as operational realities that affect whether a CMMC assessment passes or fails, whether a CUI environment is actually protected, and whether a compliance program is sustainable or permanently fragile.

These are the things we've learned.

The Gap Between Intended Policy and Enforced Policy Is Almost Always Larger Than Anyone Admits

This is the single most consistent finding across every defense contractor environment we've worked in. Organizations believe their policies are enforced. The evidence says otherwise.

The gap takes several forms:

Documented controls that aren't technically implemented. The SSP says MFA is required for all privileged access. The IAM configuration has an MFA policy attached to a group that doesn't include all privileged users. From a documentation standpoint, the control is implemented. From an assessment standpoint, it isn't. This pattern — where documentation accurately describes the intended state but not the actual state — is the most common source of assessment surprises.
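This kind of gap is mechanically detectable. As a minimal sketch — with hypothetical user and group data standing in for real IAM output — the check is just a set difference between who is privileged and who the MFA policy actually reaches:

```python
# Illustrative check: does the group the MFA-enforcement policy is attached
# to actually contain every privileged user? Names and structures here are
# hypothetical simplifications of real IAM data, not an AWS API call.

def mfa_coverage_gap(privileged_users, mfa_group_members):
    """Return privileged users NOT covered by the MFA-enforcement group."""
    return sorted(set(privileged_users) - set(mfa_group_members))

privileged = ["alice", "bob", "carol", "dave"]
mfa_group = ["alice", "bob"]  # the group the MFA policy is attached to

uncovered = mfa_coverage_gap(privileged, mfa_group)
# "carol" and "dave" are documented as covered but not actually enforced
```

The documentation would describe this environment as "MFA required for all privileged access" — and the check would fail anyway.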

Policies that are technically deployed but not functionally enforced. An AWS SCP that denies actions outside approved regions is deployed, but an organizational unit that contains production accounts was inadvertently excluded when the OU structure was reorganized. The policy exists in the console; it doesn't protect the accounts that matter. This category is harder to detect because the infrastructure-as-code looks correct — the drift happened at the organization management layer.
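Catching this class of drift means validating SCP coverage against the live OU tree, not the IaC. A toy version — with a hypothetical OU structure and attachment point — looks like this:

```python
# Illustrative sketch: verify that a region-restriction SCP still covers
# every required account after an OU reorganization. The OU tree, OU names,
# and account IDs are hypothetical stand-ins for AWS Organizations data.

def covered_accounts(ou, scp_attached_to, inherited=False):
    """Accounts in OUs at or below the SCP attachment point (SCPs inherit downward)."""
    inherited = inherited or ou["name"] == scp_attached_to
    covered = {a for a in ou.get("accounts", []) if inherited}
    for child in ou.get("children", []):
        covered |= covered_accounts(child, scp_attached_to, inherited)
    return covered

org = {
    "name": "root",
    "children": [
        {"name": "workloads", "accounts": ["prod-1"]},
        {"name": "migration", "accounts": ["prod-2"]},  # moved here in the reorg
    ],
}

# SCP is attached to "workloads"; prod-2 silently fell out of scope
uncovered = sorted({"prod-1", "prod-2"} - covered_accounts(org, "workloads"))
```

The Terraform defining the SCP itself would pass review; only a check against the organization's actual account placement surfaces the gap.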

Policies that were correct when written but have drifted from compliance requirements. NIST 800-171 revision 3 made meaningful changes to several control families. Organizations that wrote OPA policies against the original control set and haven't updated them are enforcing the wrong version of the requirements. Compliance frameworks aren't static. Governance policies that don't track framework evolution drift into incorrectness gradually.

Policies with gaps at the seams. Most policy enforcement frameworks have coverage for common resource types. The gaps appear at the edges: newer AWS services that weren't covered when the policies were written, cross-service interactions that the policy engine doesn't model, and resource configurations that are technically permitted but create compliance problems in combination. The seams between policies are where misconfigurations accumulate.

The operational implication of this gap: you cannot trust that your documented compliance posture accurately reflects your enforced compliance posture without continuous technical validation. Point-in-time audits miss configuration drift that happens between audit cycles. Self-assessments based on documentation review miss implementation gaps that only show up in technical evaluation.

Infrastructure as Code Solves Deployment Drift — Not Runtime Drift

IaC has been one of the genuinely important advances in cloud operations over the past decade. Terraform, CloudFormation, Pulumi — these tools eliminate an entire category of configuration inconsistency by making infrastructure state reproducible and version-controlled.

But IaC solves a specific problem: the drift that occurs between your intended state (what you meant to deploy) and your deployed state (what actually got deployed). It does not solve runtime drift — the changes that occur to your cloud environment after deployment, outside the IaC workflow.

Runtime drift is pervasive in real defense contractor environments. It happens because:

Emergency changes bypass IaC. An incident happens at 2 AM. An engineer with console access makes a change to resolve it immediately. The change never makes it back into the Terraform codebase. Three months later, a Terraform apply would revert the change. No one knows. The environment has drifted from the IaC state; the IaC state has drifted from the intended security posture.

Manual configurations accumulate. Development environments where engineers are given broader access accumulate manual configurations that never get codified. When those configurations are relevant to security controls — a permissive security group, a relaxed bucket policy — they become compliance findings.

Third-party tools modify cloud state. SaaS tools, monitoring agents, and vendor integrations that are granted write access to your environment make configuration changes that aren't tracked in your IaC. These changes are often benign, but they create configuration states that weren't designed or reviewed.

IaC drift detection is rarely configured. Most organizations use Terraform in apply mode — they run it when they're making changes. Few have implemented continuous drift detection that would identify when the live environment diverges from the IaC state. The tooling for this exists (Terraform Cloud, Atlantis, Driftctl) but requires deliberate investment to operationalize.
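The core operation those tools perform is conceptually simple: diff the attributes recorded in IaC state against what the live environment reports. A minimal sketch, with stand-in resource dictionaries rather than real provider calls:

```python
# Minimal drift-detection sketch. Real tools (driftctl, `terraform plan
# -refresh-only`) query actual cloud providers; the dictionaries here are
# hypothetical stand-ins for state-file and live-API attributes.

def detect_drift(iac_state, live_state):
    """Return {resource: {attr: (iac_value, live_value)}} for every mismatch."""
    drift = {}
    for resource, iac_attrs in iac_state.items():
        live_attrs = live_state.get(resource, {})
        changed = {
            attr: (value, live_attrs.get(attr))
            for attr, value in iac_attrs.items()
            if live_attrs.get(attr) != value
        }
        if changed:
            drift[resource] = changed
    return drift

iac = {"aws_security_group.app": {"ingress_port": 443}}
live = {"aws_security_group.app": {"ingress_port": 22}}  # the 2 AM console fix

# detect_drift(iac, live) flags the change that never made it back into code
```

Run continuously, a diff like this turns the silent three-month divergence described above into a same-day finding.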

The governance lesson here is that IaC is necessary but not sufficient. A compliance program that treats an IaC-managed environment as continuously compliant without runtime validation is making an assumption that the data doesn't support. The environments where we see the most runtime drift are frequently the ones with the most mature IaC practices — because the maturity creates confidence that isn't matched by ongoing enforcement.

The Most Exploited Misconfiguration Categories Are Boring

When we look at the misconfiguration categories that show up most frequently as CMMC assessment findings, they're not sophisticated. They're not novel attack vectors. They're consistently the same fundamental control areas:

Audit logging. Seventy-three percent of defense contractor cloud environments have meaningful gaps in their audit logging configuration. CloudTrail not enabled in all regions. S3 access logging disabled on buckets that store CUI. VPC flow logs missing from subnets that handle sensitive traffic. Log retention periods shorter than NIST 800-171 requirements. This is not a technical problem — the tooling to implement comprehensive logging is mature and well-documented. It's a systematic failure to treat logging as a first-class compliance requirement rather than an afterthought.
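Because these logging requirements are explicit, they reduce to a short checklist that can run continuously. A sketch — field names are hypothetical simplifications of CloudTrail/S3 settings, and the retention floor is an assumed policy value, not a quoted NIST 800-171 number:

```python
# Illustrative audit-logging posture check for the gaps listed above.
# Environment shape and field names are hypothetical; a real check would
# read them from provider APIs.

MIN_RETENTION_DAYS = 1095  # assumed organizational retention floor for this sketch

def logging_findings(env):
    findings = []
    if not env["cloudtrail"]["multi_region"]:
        findings.append("CloudTrail not enabled in all regions")
    findings += [
        f"S3 access logging disabled on CUI bucket {b['name']}"
        for b in env["buckets"]
        if b["stores_cui"] and not b["access_logging"]
    ]
    if env["log_retention_days"] < MIN_RETENTION_DAYS:
        findings.append("Log retention below required minimum")
    return findings

env = {
    "cloudtrail": {"multi_region": False},
    "buckets": [{"name": "cui-docs", "stores_cui": True, "access_logging": False}],
    "log_retention_days": 90,
}
# logging_findings(env) surfaces all three gap categories at once
```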

Access control enforcement. The gap between documented access control policies and enforced access control is the second most common finding category. IAM roles with excessive permissions that were granted for convenience and never scoped down. Service accounts with credentials that have never been rotated. Privileged access without MFA enforcement. Cross-account trust relationships that were created for a specific purpose and never removed.

Network segmentation. Security groups that were opened for debugging and never closed. VPCs without meaningful network segmentation between CUI systems and non-CUI workloads. Missing VPC endpoint configurations that cause CUI-handling traffic to traverse public internet paths. These findings frequently appear in environments that have documented network architecture diagrams showing proper segmentation — the documentation describes the intended design; the live configuration doesn't match it.

Encryption at rest and in transit. S3 buckets without encryption enforcement. RDS instances without encryption. API endpoints that accept unencrypted connections. These are among the easiest controls to implement correctly and among the most commonly found misconfigured.
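The same pattern applies here: each of these three findings is a one-line predicate over resource configuration. A sketch, with hypothetical resource shapes standing in for provider API responses:

```python
# Sketch of an encryption posture check covering the three findings above.
# Resource dictionaries and field names are hypothetical simplifications.

def encryption_findings(resources):
    findings = []
    for r in resources:
        if r["type"] == "s3_bucket" and not r.get("encryption_enforced"):
            findings.append(f"{r['name']}: no encryption enforcement")
        if r["type"] == "rds_instance" and not r.get("storage_encrypted"):
            findings.append(f"{r['name']}: unencrypted at rest")
        if r["type"] == "api_endpoint" and r.get("allows_http"):
            findings.append(f"{r['name']}: accepts unencrypted connections")
    return findings

resources = [
    {"type": "s3_bucket", "name": "cui-docs", "encryption_enforced": False},
    {"type": "rds_instance", "name": "cui-db", "storage_encrypted": False},
    {"type": "api_endpoint", "name": "intake-api", "allows_http": True},
]
```

That these controls are trivially checkable is exactly the point: the gap is not technical difficulty but the absence of ongoing validation.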

The pattern across all of these categories is the same: these are known, well-understood controls. The frameworks are explicit about what's required. The tooling to implement them correctly has existed for years. They fail because compliance programs treat initial implementation as completion, without ongoing validation that the implementation remains correct.

What Makes Cloud Governance Programs Succeed

After working inside enough environments to develop a strong prior on this question, we can clearly identify the factors that distinguish governance programs that produce durable compliance from those that perpetually struggle.

Enforcement, not documentation. The governance programs that work are operationally anchored in technical enforcement. Their OPA policies, AWS SCPs, and Azure Policy definitions aren't artifacts of a compliance exercise — they're actively running in enforcement mode, blocking non-compliant deployments, generating findings for runtime drift, and providing the authoritative source of truth for compliance posture. Documentation follows enforcement; it doesn't substitute for it.

This sounds obvious. In practice, most compliance programs are built the other way: documentation is the primary artifact, and technical implementation is secondary. The evidence for this is the prevalence of the documentation-reality gap described above.

Continuous validation over periodic audits. Governance programs that maintain durable compliance treat every configuration state as provisional — valid until the next validation cycle, which runs continuously. This requires investment in continuous monitoring tooling and automation, but it eliminates the category of compliance drift that accumulates between audit cycles.

Programs that rely on periodic validation — quarterly scans, annual assessments — are always operating with a stale picture of their compliance posture. Configuration drift that happens in week 3 of a quarter doesn't show up until the next scan. By assessment time, months of drift have accumulated.

Remediation as a first-class capability. Governance programs that work have explicit, operationalized answers to the question: when we find a misconfiguration, how do we fix it, and how fast? The answer isn't "we create a ticket." It's a defined workflow with SLAs, ownership, and tooling.

The programs that struggle have rich detection capabilities and anemic remediation. They produce increasingly long findings backlogs that grow faster than they're cleared, because detection is automated and remediation is manual. Over time, the signal-to-noise ratio in the findings queue degrades, and the finding backlog becomes ambient noise rather than actionable signal.
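"Defined workflow with SLAs and ownership" can itself be expressed as data rather than process documents. A hypothetical sketch — the severity tiers and SLA hours are assumptions, not a standard:

```python
# Hypothetical sketch: remediation SLAs as data. Each finding carries an
# owner, severity, and creation time; breaches surface automatically instead
# of aging silently in a ticket queue. Tier names and hours are assumed.

from datetime import datetime, timedelta

SLA_HOURS = {"critical": 24, "high": 72, "medium": 168}  # assumed SLA tiers

def sla_breaches(findings, now):
    """Findings still open past their severity-based remediation deadline."""
    return [
        f for f in findings
        if f["status"] == "open"
        and now > f["created"] + timedelta(hours=SLA_HOURS[f["severity"]])
    ]

findings = [
    {"id": 1, "status": "open", "severity": "critical",
     "owner": "platform-team", "created": datetime(2026, 3, 1)},
    {"id": 2, "status": "closed", "severity": "critical",
     "owner": "platform-team", "created": datetime(2026, 3, 1)},
]
# Two days later, finding 1 is a breach; finding 2 was remediated in time
```

Once breaches are queryable, the backlog stops being ambient noise: the oldest overdue critical finding is always one query away.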

Scope precision. Governance programs that work have a precise, technically enforced definition of their compliance scope — which resources, accounts, and workloads are subject to which requirements. This allows policy to be calibrated to requirements rather than applied uniformly, which reduces both false positives and the cost of maintaining the governance program.

Scope confusion is common and expensive. Organizations that haven't defined their CUI boundary precisely are either under-enforcing (CUI systems that aren't subject to the right controls) or over-enforcing (applying CMMC-level controls to systems that don't need them, increasing cost and operational friction without improving compliance).
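One way to make scope technically enforceable is to drive it from resource tags, so every resource resolves to exactly one requirement set. A sketch — the tag name and control baselines are hypothetical:

```python
# Sketch: a tag-driven compliance scope. A resource's tags determine which
# control baseline applies, so policy is calibrated rather than uniform.
# The "compliance-scope" tag and baseline contents are hypothetical.

BASELINES = {
    "cui": {"encryption", "mfa", "flow_logs", "audit_logging"},
    "internal": {"encryption", "audit_logging"},
}

def required_controls(resource_tags):
    """Resolve a resource's tags to its control baseline (default: internal)."""
    scope = resource_tags.get("compliance-scope", "internal")
    return BASELINES.get(scope, BASELINES["internal"])

# A CUI-tagged resource gets the full control set; an untagged dev resource
# gets the lighter baseline instead of CMMC-level friction.
```

Pairing this lookup with a policy that rejects untagged resources at deployment closes the loop: nothing can exist without a declared scope, so nothing can be silently under- or over-enforced.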

Ownership that outlives personnel changes. Governance programs built around specific individuals fail when those individuals leave. The programs that persist have documented ownership, runbooks that don't assume expert knowledge, and policy-as-code that can be read and modified by engineers who didn't write it.

This is a governance-as-software problem. IaC repositories need pull request reviews and documentation. OPA policy files need comments explaining the compliance requirement they implement. Monitoring configurations need README files explaining the alert logic. The investment in making governance infrastructure legible to future maintainers is consistently underprioritized.

The Meta-Lesson: Governance Is an Operational Problem, Not a Documentation Problem

The organizing insight from all of this is that cloud governance programs fail when they're treated primarily as documentation exercises — as activities that produce evidence for auditors rather than controls that produce security outcomes.

The behaviors that characterize documentation-first programs are recognizable: SSP maintenance consumes more effort than technical control validation; finding management means updating spreadsheets rather than fixing configurations; compliance team skills lean toward writing and policy interpretation rather than cloud engineering; and audit preparation is a project that happens before assessments rather than a continuous operational capability.

The behaviors that characterize enforcement-first programs are different: policy-as-code is the authoritative record of compliance requirements; remediation is automated for the majority of finding types; compliance posture is always current because it's measured continuously; and audit preparation means generating reports from existing evidence, not scrambling to gather it.

The gap between these two operational models is wide. Moving from one to the other requires more than tooling — it requires re-architecting the compliance function around continuous technical enforcement rather than periodic documentation review.

The organizations that make that transition find that compliance becomes cheaper over time, not more expensive. Continuous enforcement prevents the accumulation of findings that drive remediation projects. Automated remediation eliminates the human bottleneck in the finding-to-fix workflow. Evidence generated continuously is richer and more defensible than evidence assembled under pre-assessment pressure.

That's the most important cloud governance lesson: the tool and process investments required for enforcement-first compliance generate returns that compound, while documentation-first compliance is a cost that grows with regulatory complexity.

Ready to automate your cloud governance?

See how PolicyCortex replaces your disconnected compliance tools with one autonomous platform.
