Really: Policy language for infra that doesn't suck

Using Rego for cloud configuration is awful. Use Really: policy-as-code built for humans.
Author: Travis McPeak
Publish date: May 16, 2024

Announcing Really

When we started building Resourcely, the global configuration engine for cloud infrastructure, we slowly realized that the status quo of policy-as-code was broken. Our first Resourcely guardrails took hours to write in Rego and even more time to maintain. Writing new policies was extremely tedious and time-intensive, especially because we wanted them to be flexible. To achieve our mission of making infrastructure more secure, it became evident that we needed a new policy language, one that allowed policy to be written and maintained without headaches.

The Resourcely team has spent the last 6 months developing that new policy language, and today we are introducing Really: a policy language and enforcement engine built to be written and read by humans - not machines. With Really, policy can be created in minutes rather than hours, with a human-readable syntax that looks remarkably like SQL. Because the language is structured and readable, writing, deciphering, and adjusting policy isn’t a nightmare, and your policy team can take vacation without worrying about their pager going off.

Yet Another Config Language (YACL)

Why did we invest millions of dollars in time to build Yet Another Config Language (YACL)? I’ve had some version of the below asked of me, or running through my head, daily for the last 6 months:

Travis, why the fuck do we need another config language? Can’t we just use Rego like normal, pain loving people?

See relevant xkcd

First, Resourcely

To understand why we created Really, you first need to understand Resourcely. We started this company with one mission: preventing misconfiguration. To achieve this mission, we knew that we needed to standardize the development experience around infrastructure configuration for a variety of parties:

  • software engineers, who are most often actually configuring and deploying infrastructure
  • platform teams, who want to give software engineers systems to build faster
  • security teams, who want to limit incidents and embed secure-by-default into existing development workflows

A platform like Resourcely is needed because best practices, security requirements, and parameter nuances aren’t common knowledge among your typical engineers. How many developers know which port you should set the target group to when deploying an ECS cluster, and how many can write you optimal Terraform code for setting it?

The status quo is too messy: developers without the skills (or the desire) to be infrastructure experts, security teams stuck in ops mode but worried about slowing engineers down, and platform teams busy writing one-off PRs or triaging accidental deletions.

Infrastructure misconfiguration spans all of computing, from security to cost efficiency to reliability and incidents. Every time a developer is asked to deploy, we are asking them to make critical decisions without enough context or information. Any of those decisions could cause a vulnerability, business downtime, improper access, or excess cost.

So why hasn’t an infrastructure configuration platform that tackles this problem been built yet? Why is configuration still as messy a problem as ever? As we’ll explore below, that platform didn’t exist before Resourcely because Rego sucks.

The world of Resourcely is split into two parts:

  • Blueprints: Easily configurable templates that can help developers ship faster without screwing up configuration
  • Guardrails: Interpreting security policy and making sure that rules aren’t violated

We knew we could create a templating language and a first-class developer experience around configuration blueprints. The real problem we encountered was that custom guardrails, created and enforced across a variety of services, were incredibly painful to read and write in Rego. We found that it would be faster to invest millions of dollars in time and months of effort to write a new policy language than to suffer through writing and maintaining Rego.

Policy-as-code: Rego and OPA

The existing world of policy-as-code is primarily served by Rego, the policy language used with Open Policy Agent (OPA). Organizations around the world use languages like Rego, or custom implementations based on Python or YAML, to define and enforce security policies.

At the end of the day, OPA evaluates Rego against your data to identify whether a policy has been violated or not.

The flow looks like this: a policy is written in Rego, OPA evaluates a query against that policy and the provided data, and OPA returns a decision.

Let’s consider a basic example of policy enforcement. Your company only wants to allow software engineers from a particular group to create new EC2 instances. A security team will write the appropriate Rego and configure their Terraform runner to apply it to every Terraform plan. When an end user wants to create a new EC2 instance, the Rego policy determines whether their plan passes or fails.
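To make the shape of that check concrete, here is a minimal sketch in plain Python (not Rego and not Really). It assumes the plan has been exported with `terraform show -json`, which lists planned changes under `resource_changes`. The names `ALLOWED_GROUPS` and `evaluate_plan`, and the idea that the runner passes in the user's groups as a set, are hypothetical and only for illustration.

```python
from typing import Any

# Hypothetical: the groups allowed to create EC2 instances.
ALLOWED_GROUPS = {"platform-engineering"}


def evaluate_plan(plan: dict[str, Any], user_groups: set[str]) -> dict[str, Any]:
    """Return an allow/deny decision for a Terraform plan in JSON form."""
    violations = []
    for rc in plan.get("resource_changes", []):
        creates = "create" in rc.get("change", {}).get("actions", [])
        if rc.get("type") == "aws_instance" and creates:
            # Deny when the user shares no group with the allowed set.
            if not (user_groups & ALLOWED_GROUPS):
                violations.append(rc.get("address", "<unknown>"))
    return {"allow": not violations, "violations": violations}


# A trimmed-down plan: one EC2 instance and one S3 bucket being created.
plan = {
    "resource_changes": [
        {"address": "aws_instance.web", "type": "aws_instance",
         "change": {"actions": ["create"]}},
        {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
         "change": {"actions": ["create"]}},
    ]
}

print(evaluate_plan(plan, {"platform-engineering"}))
print(evaluate_plan(plan, {"marketing"}))
```

The first call allows the plan; the second denies it and reports `aws_instance.web` as the violating resource. A policy engine does the same job (policy plus input data in, decision out), only with the rule expressed in a policy language rather than hand-written code.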

Writing this policy in Rego would be convoluted, slow to build, and unmaintainable. We’ll observe some concrete examples below of the differences between Rego and Really.

The problems with Rego

It is hard to read & write

The overhead to learn and maintain Rego is very high: its syntax produces queries of extraordinary length and complexity. These are difficult to read & write, especially for people used to other, less convoluted declarative languages.

Here is a real example of Rego, from Terrascan’s open source repo:

...and the equivalent Really code:

Why is Rego this complex, especially when implementing security policy? It really wasn’t designed for it. Rego was designed to evaluate and return IAM policy checks but evolved into a general-purpose policy language as cloud services proliferated. Rego has a variety of nuances that make it a poor choice for infrastructure policy-as-code.

  • Data types are confusing
  • Heavily biased towards single statements
    • Checking across groups means creating lists and iterating through them
    • Imagine reading a nested SQL query…it is worse than that
  • No support for recursive rules
  • Policies need to take into account how security-relevant metadata is structured hierarchically
    • Consider the relationship between VMs, containers, applications, gateways, etc.
    • To achieve this in Rego, you suddenly need to be a Terraform plan expert AND a cloud services schema expert
  • It wasn’t built for real-time feedback
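To illustrate the hierarchy point above, here is a rough Python sketch (again, not Rego or Really) of what a policy author has to know about the plan format just to check a tagging rule. It assumes the `planned_values.root_module` layout of `terraform show -json` output, where modules nest under `child_modules`. The tag name `owner` and the function names are hypothetical, and real policies would also need to skip resource types that do not support tags.

```python
from typing import Any, Iterator

REQUIRED_TAG = "owner"  # hypothetical tag name, for illustration only


def iter_resources(module: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield every resource in a module and, recursively, in its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from iter_resources(child)


def untagged(plan: dict[str, Any]) -> list[str]:
    """Return addresses of planned resources that lack the required tag."""
    root = plan.get("planned_values", {}).get("root_module", {})
    return [
        r["address"]
        for r in iter_resources(root)
        # "tags" may be missing or null in the plan, so fall back to {}.
        if REQUIRED_TAG not in (r.get("values", {}).get("tags") or {})
    ]


plan = {
    "planned_values": {
        "root_module": {
            "resources": [
                {"address": "aws_instance.web", "type": "aws_instance",
                 "values": {"tags": {"owner": "team-a"}}},
            ],
            "child_modules": [
                {
                    "address": "module.storage",
                    "resources": [
                        {"address": "module.storage.aws_s3_bucket.logs",
                         "type": "aws_s3_bucket", "values": {"tags": None}},
                    ],
                }
            ],
        }
    }
}

print(untagged(plan))
```

Even this simple rule needs a recursive walk and knowledge of where tags live in the plan, which is the kind of structural detail the bullets above describe.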

These issues result in a complex, unreadable language with a steep learning curve. When policies are painful to write and maintain, even simple best practices like tagging and naming conventions are skipped in favor of simply getting working (and usually misconfigured) infrastructure deployed. This limits the adoption of policy-as-code and ultimately makes misconfiguration even more likely to happen at scale.

Policy-as-code, for humans

We created Really to be human-readable and writeable, to help with our development of Resourcely. When starting Resourcely we hired Rego experts and set them loose on using Rego to create customizable guardrails at scale as part of the Resourcely platform: it wasn’t sustainable. So we spent 6+ months and millions of dollars to build a policy language that could power a configuration engine of the future, and what we built is Really.

Really is an opinionated policy language built for articulating rules that deal with configuration settings, most commonly for infrastructure. Writing policy with Really takes our team 5-10% of the time it takes in Rego, because it is easy to both read and write. Here is an example Really policy that is easy to understand:

...with the equivalent Rego (again from Terrascan - note the templating is for taking inputs for different AWS instance types):

Here is a gnarly example for requiring GCP encryption:

...and the equivalent Really:

Really is readable and maintainable by humans

Why is Really so easy to read and write? As an opinionated framework, it abstracts away complexity that Rego cannot:

  • Relationships with resources are automatically accounted for
  • Uses a declarative format that engineers who are familiar with languages like SQL, Terraform, or markup languages will be inherently comfortable with
    • Rego may be “declarative”, but it hardly feels that way when you get into the world of configuring cloud resources
  • Evaluates policies when they are written

Writing and maintaining customizable guardrails for all of the cloud services that we need to support in order to build a true configuration platform using Rego would have been untenable. That’s why we built Really: it gives our team, and anyone writing policy-as-code, the ability to move with speed and accuracy that they can’t with Rego.

The future of Really

The security of cloud infrastructure is a critical problem for organizations around the world. Writing policy in a frustrating language can mean your policies are poorly implemented and limited in scope, and when (or if) they’re updated, doing so takes a significant time investment.

If you’d like to write policy for your organization quickly and maintain it easily at scale, you will love Resourcely. You can see some examples of Really here.

I invite you to try Resourcely out on your own. Sign up for our self-service waitlist here, or get in touch and we can talk all about it.
