Streamline AI adoption in an inexpensive, safe way
LLMs are powerful…and expensive. If you’re using a managed LLM provider like OpenAI’s ChatGPT or Anthropic’s Claude, you pay for each call you make to their API. Ultimately, their pricing is based on input and output tokens. I asked ChatGPT to explain what a token is in the context of their pricing:
A token is a chunk of text that represents part of a word, a whole word, or sometimes punctuation or spaces. For example, the word “ChatGPT” is one token, and “Chat GPT is great!” might be six tokens, with each word and punctuation mark counted. Tokens are used in language models to track processing costs, as each model processes a certain number of tokens to generate responses.
In general, 1,000 tokens are roughly equivalent to 750 words, although this varies depending on the language and complexity of the text.
It is generally accepted that LLMs are a groundbreaking technology…but they’re also more expensive than traditional computing techniques. This is because LLMs use GPUs as their core computing unit, which consume more power than traditional CPUs.
Self-hosting your own LLM is an option, especially if you are concerned about data privacy. Many companies don’t want to use managed LLMs because they are worried about their data being exposed to and used by the AI companies.
Bedrock and considerations around scaling
AWS Bedrock is one of the leading cloud-based services for building and scaling Generative AI (GenAI). With Bedrock, you can start with foundational models, customize and fine-tune them, and embed them into enterprise applications.
There are inherent risks when using a powerful service like Bedrock, some of which we’ve already touched on. Specifically, you may want to put limits and controls around:
- Costs - LLMs are expensive, and embedding GenAI into an application could have a serious financial impact on your company
- Data privacy - Putting limits in place on what models can be used, what data can be used, etc.
- Access - You want to control who can create, interact with, and manage your Bedrock service
- Complexity - You may want to make AI accessible within your organization, but writing the Terraform required to do so sustainably is prohibitive for most
- Scalability - Making your GenAI application enterprise-grade requires durability and scalability. Perhaps you want to limit the size of a model, or maybe you want to make sure it can scale to support the most complex of prompts
Streamline and govern GenAI with Resourcely
Resourcely is a configuration platform for streamlining and governing infrastructure configuration. With it, you can build customized paved roads for deploying the infrastructure you care about, while keeping control over what developers can do.
In this blog we’ll walk through how Resourcely Blueprints can give developers a simple way to deploy AWS Bedrock, while Resourcely Guardrails put controls in place over what they can do. With our Blueprint and Guardrails, developers interact with a form that generates properly configured Terraform, creating a Bedrock model and the required associated infrastructure.
Building blocks for AI
It may come as no surprise that you can’t create a Bedrock custom model on its own. You need the following resources at a minimum:
- An `aws_bedrock_custom_model`, for creating a custom model based on a foundational model that Bedrock supports. This resource creates a customization job, whose results are then stored in an S3 bucket.
- An `aws_s3_bucket`, for storing customization job results, and an `aws_s3_bucket_public_access_block` for restricting access to those results
- An `aws_iam_role` for interacting with our S3 bucket and Bedrock model
Let’s consider the AWS Bedrock example from their Terraform registry:
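The registry example looks roughly like this (abridged from the `aws_bedrock_custom_model` page in the Terraform registry; names and values follow that example):

```hcl
resource "aws_bedrock_custom_model" "example" {
  custom_model_name     = "example-model"
  job_name              = "example-job-1"
  base_model_identifier = data.aws_bedrock_foundation_model.example.model_arn
  role_arn              = aws_iam_role.example.arn

  # Hyperparameters are passed as strings; valid ranges vary by foundational model
  hyperparameters = {
    "epochCount"              = "1"
    "batchSize"               = "1"
    "learningRate"            = "0.005"
    "learningRateWarmupSteps" = "0"
  }

  # Where the customization job writes its results
  output_data_config {
    s3_uri = "s3://${aws_s3_bucket.output.id}/data/"
  }

  # Where the training data lives
  training_data_config {
    s3_uri = "s3://${aws_s3_bucket.training.id}/data/train.jsonl"
  }
}
```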
This example doesn’t include the complementary resources discussed above (S3 bucket, IAM role). We’ll include those when building our Blueprint.
Presumably, developers will know the valid inputs for each of these parameters. But what if they don’t? It turns out that each of the models supported by Bedrock has different parameter requirements. We can bake these into the form developers will interact with to deploy the model above.
Creating an AI Blueprint
Let’s start by creating a Blueprint version of a Bedrock Terraform resource, with S3 and IAM baked in. You can see the full Blueprint on GitHub.
We start with frontmatter to define all of the variables that we want to take, and organize those into groups. Any variables defined in the frontmatter (or even only inline) are exposed as input variables in our form, decorated with guidance features such as descriptions or defaults.
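As a rough sketch, the top of the Blueprint might look like the following. The field names and structure here are illustrative only — see the full Blueprint on GitHub for the exact frontmatter schema Resourcely uses:

```yaml
---
variables:
  model_name:
    description: A name for your custom Bedrock model
    group: Model
  base_model:
    description: The foundational model to customize
    group: Model
  bucket_name:
    description: S3 bucket for training inputs and customization outputs
    group: Storage
groups:
  Model:
    order: 1
  Storage:
    order: 2
---
```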
Adding Hyperparameter Guardrails
To keep our developers on track so their models don’t fail, let’s create some Hyperparameter Guardrails. Since these values differ by model, we’ll want to make our Guardrails contextual, based on the model the individual chooses.
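In pseudocode modeled on Resourcely’s guardrail language (treat this as a sketch — the exact syntax, including the model-matching condition, is documented by Resourcely), the Titan Guardrail might read:

```
GUARDRAIL "titan-hyperparameters"
  WHEN aws_bedrock_custom_model
    # Only apply when a Titan foundational model is selected (illustrative condition)
    AND base_model_identifier CONTAINS "titan"
  REQUIRE hyperparameters.epochCount BETWEEN 1 AND 5
  REQUIRE hyperparameters.batchSize = 1
  REQUIRE hyperparameters.learningRate BETWEEN 0.0000001 AND 0.00001
```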
We use this Guardrail to customize guidance for Titan models. When a user chooses the Titan model on our form, each of the hyperparameters will have guidance that is custom to that specific foundational model:
- Epoch Count between 1 and 5
- Batch Size = 1
- Learning Rate between 0.0000001 and 0.00001
Here’s another Guardrail, specific to Claude:
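Again in sketch form, with the same caveat that the exact guardrail syntax is defined by Resourcely:

```
GUARDRAIL "claude-hyperparameters"
  WHEN aws_bedrock_custom_model
    # Only apply when a Claude foundational model is selected (illustrative condition)
    AND base_model_identifier CONTAINS "claude"
  REQUIRE hyperparameters.epochCount BETWEEN 1 AND 10
  REQUIRE hyperparameters.batchSize BETWEEN 4 AND 256
```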
This enforces the following hyperparameter guidance that is unique to Claude:
- Epoch Count between 1 and 10
- Batch Size between 4 and 256
With these Guardrails in place, users will be given feedback when they are configuring their model…not after it has failed.
Secure-by-default AI storage
Asking a developer to configure a Bedrock model on their own could result in them using S3 buckets or IAM roles that are unsafe.
In our Blueprint, we have added the following resources:
S3 bucket
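A sketch of the Blueprint’s S3 resources follows; the resource names and the `{{ bucket_name }}` placeholder (Resourcely’s form-variable templating) are illustrative:

```hcl
# The only developer-facing input is the bucket name
resource "aws_s3_bucket" "training_output" {
  bucket = "{{ bucket_name }}"
}

# Secure-by-default: all public access is blocked, and these
# settings are not exposed as form inputs
resource "aws_s3_bucket_public_access_block" "training_output" {
  bucket                  = aws_s3_bucket.training_output.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```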
…and we are referencing this bucket within the aws_bedrock_custom_model resource:
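Roughly like so, assuming the resource names from the sketch above:

```hcl
resource "aws_bedrock_custom_model" "this" {
  # ... model name, base model, role, hyperparameters ...

  # Both configs point at the Blueprint-managed bucket, so the
  # developer never wires this up by hand
  output_data_config {
    s3_uri = "s3://${aws_s3_bucket.training_output.id}/outputs/"
  }

  training_data_config {
    s3_uri = "s3://${aws_s3_bucket.training_output.id}/training/train.jsonl"
  }
}
```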
Within the S3 bucket resource, the only variable we are making available is the bucket name. We have automatically implemented secure-by-default access policies with the public access block. Developers can’t make changes via the form to these secure settings.
The custom model resource automatically references these resources, without requiring the developer to make that connection. This means that our training and output data will remain safe.
IAM role
Similarly, we create an assumable IAM role that is specific to Bedrock as part of our Blueprint. It restricts access to the Bedrock model and the associated S3 bucket. You could customize this role based on your company’s requirements (more or fewer permissions).
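A minimal sketch of such a role, assuming the bucket resource name from earlier (the policy scope here is illustrative — the real Blueprint may grant more or fewer permissions):

```hcl
resource "aws_iam_role" "bedrock_customization" {
  name = "bedrock-model-customization"

  # Only the Bedrock service can assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "bedrock.amazonaws.com" }
    }]
  })
}

# Scope S3 access to the single bucket used for training data
# and customization job results
resource "aws_iam_role_policy" "s3_access" {
  role = aws_iam_role.bedrock_customization.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
      Resource = [
        aws_s3_bucket.training_output.arn,
        "${aws_s3_bucket.training_output.arn}/*",
      ]
    }]
  })
}
```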
Streamlining the fine-tuning of AI models with Bedrock and Resourcely
Driving value creation with AI is the dream, and many companies are looking to achieve this by fine-tuning some of the more popular foundational models to their unique use cases.
Doing this can be costly, unsafe, and just plain hard to execute. With Resourcely, you can streamline and simplify AWS Bedrock while governing the parameters of your models. This frees your developers from thinking about Terraform or Bedrock-specific complexity, or wondering if they’re deploying in a safe and cost-effective way.
To customize your own paved roads for AI, sign up for your free account here.
See also:
- Video version of this blog
- Tutorial in the documentation