
Cloud object storage

Overview

Dotenv files are commonly kept in cloud object storage, but environment variable management packages typically don't integrate with object storage clients. Additional logic is therefore required to download the files from object storage prior to loading environment variables. This project offers integration with S3-compatible object storage. AWS S3, Backblaze B2, and Cloudflare R2 are directly supported and tested.

Why not Boto3?

fastenv uses its own object storage client. Why implement a client here instead of using Boto3?

  • Async. fastenv uses HTTPX for asynchronous HTTP operations. Boto3's methods use synchronous I/O.
  • Simple. fastenv is a small, simple project that provides the necessary features without the bloat of Boto3. Why install all of Boto3 if you just need a few of the features? And if you actually want to understand what your code is doing, you can try sifting through Boto's subpackages and dynamically-generated objects, but wouldn't you rather just look at a few hundred lines of code right in front of you?
  • Type-annotated. fastenv is fully type-annotated. Boto3 is not type-annotated. Its objects are dynamically generated at runtime using factory methods, making the code difficult to annotate and read. Some attempts are being made to add type annotations (see alliefitter/boto3_type_annotations, boto/botostubs, vemel/mypy_boto3_builder, and vemel/boto3-ide), but these attempts are still works-in-progress.

Building an object storage client from scratch

Configuration

fastenv provides a configuration class to manage credentials and other information related to cloud object storage buckets.

  • Buckets can be specified in "virtual-hosted-style", like <BUCKET_NAME>.s3.<REGION>.amazonaws.com for AWS S3 or <BUCKET_NAME>.s3.<REGION>.backblazeb2.com for Backblaze B2.
  • If credentials are not provided as arguments, this class will auto-detect configuration from the default AWS environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN, and the region from either AWS_S3_REGION, AWS_REGION, or AWS_DEFAULT_REGION, in that order.
  • Boto3 detects credentials from several other locations, including credential files and instance metadata endpoints. These other locations are not currently supported.
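
For example, with credentials already exported as the default AWS environment variables, a configuration instance only needs the bucket host and region. This is a minimal sketch; the "Uploading files" section below walks through a complete example.

Creating a configuration instance from environment variables

# assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are already set
import fastenv

config = fastenv.ObjectStorageConfig(
    bucket_host="<BUCKET_NAME>.s3.<REGION>.backblazeb2.com",
    bucket_region="<REGION>",
)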

AWS Signature Version 4

AWS Signature Version 4 is the secret sauce that authenticates requests to AWS services. fastenv uses its own implementation of AWS Signature Version 4 to connect to AWS S3 and other S3-compatible platforms like Backblaze B2.

Creating a signature is a four-step process:

  1. Create a canonical request. "Canonical" just means that the string has a standard set of fields. These fields provide request metadata like the HTTP method and headers.
  2. Create a string to sign. In this step, a SHA256 hash of the canonical request is calculated, and combined with some additional authentication information to produce a new string called the "string to sign." The Python standard library package hashlib makes this straightforward.
  3. Calculate a signature. To set up this step, a signing key is derived with successive rounds of HMAC hashing. The concept behind HMAC ("Keyed-Hashing for Message Authentication" or "Hash-based Message Authentication Codes") is to generate hashes with mostly non-secret information, along with a small amount of secret information that both the sender and recipient have agreed upon ahead of time. The secret information here is the secret access key. The signature is then calculated with another round of HMAC, using the signing key and the string to sign. The Python standard library package hmac does most of the hard work here.
  4. Add the signature to the HTTP request. The hex digest of the signature is included with the request.
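
Here's a minimal sketch of this process using the Python standard library. The canonical request is shown only as a placeholder, and the variable names are illustrative rather than fastenv's internal API.

Sketching the string to sign, signing key, and signature

import datetime
import hashlib
import hmac

secret_access_key = "<YOUR_SECRET_KEY_HERE>"
region = "<REGION>"
service = "s3"
canonical_request = "GET\n/.env\n..."  # step 1: placeholder canonical request

# step 2: hash the canonical request and combine it with authentication metadata
now = datetime.datetime.now(datetime.timezone.utc)
timestamp, date = now.strftime("%Y%m%dT%H%M%SZ"), now.strftime("%Y%m%d")
credential_scope = f"{date}/{region}/{service}/aws4_request"
string_to_sign = "\n".join(
    (
        "AWS4-HMAC-SHA256",
        timestamp,
        credential_scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    )
)

# step 3: derive the signing key with successive rounds of HMAC-SHA256,
# then calculate the signature with one more round of HMAC
key = f"AWS4{secret_access_key}".encode()
for message in (date, region, service, "aws4_request"):
    key = hmac.new(key, message.encode(), hashlib.sha256).digest()
signature = hmac.new(key, string_to_sign.encode(), hashlib.sha256).hexdigest()

# step 4: the hex digest of the signature is added to the HTTP request
print(signature)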

Object storage operations

Once the AWS Signature Version 4 process is in place, it can be used to authorize object storage operations. There are three categories of operations: download, upload, and list.

Download

The download method generates a presigned URL, uses it to download file contents, and either saves the contents to a file or returns the contents as a string.

Downloads with GET can be authenticated by including AWS Signature Version 4 information either with request headers or query parameters. fastenv uses query parameters to generate presigned URLs. The advantage of signing with query parameters is that the resulting presigned URLs are self-contained: they can be shared and used on their own, without any additional headers.
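
As an illustration, any HTTP client can use a presigned URL directly. The URL below is a made-up example showing the typical query parameters, not output from fastenv.

Using a presigned URL with query parameters

import httpx

# illustrative presigned URL: the query string carries the algorithm,
# credential scope, timestamp, expiration, signed headers, and signature
presigned_url = (
    "https://<BUCKET_NAME>.s3.<REGION>.backblazeb2.com/uploads/fastenv-docs/.env"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Credential=<ACCESS_KEY>%2F<DATE>%2F<REGION>%2Fs3%2Faws4_request"
    "&X-Amz-Date=<TIMESTAMP>"
    "&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Signature=<SIGNATURE>"
)
response = httpx.get(presigned_url)
response.raise_for_status()
print(response.text)  # the contents of the .env file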

A related operation is head_object, which can be used to check if an object exists. The request is the same as a GET, except the HEAD HTTP request method is used. fastenv does not provide an implementation of head_object at this time, but it could be considered in the future.

Upload

The upload method uploads source contents to an object storage bucket, selecting the appropriate upload strategy based on the cloud platform being used. Uploads can be done with either POST or PUT.

Uploads with PUT can use presigned URLs. Unlike downloads with GET, presigned PUT URL query parameters do not necessarily contain all the required information. Additional information may need to be supplied in request headers. In addition to being sent as request headers with the HTTP request, the header keys should be signed into the URL in the X-Amz-SignedHeaders query string parameter. These request headers can specify:

  • Object encryption. Encryption information can be specified with headers including X-Amz-Server-Side-Encryption. Note that, although similar headers like X-Amz-Algorithm are included as query string parameters in presigned URLs, X-Amz-Server-Side-Encryption is not. If X-Amz-Server-Side-Encryption is included in query string parameters, it may be silently ignored by the object storage platform. AWS S3 now automatically encrypts all objects and Cloudflare R2 does also, but Backblaze B2 will only automatically encrypt objects if the bucket has default encryption enabled.
  • Object metadata. Headers like Content-Disposition, Content-Length, and Content-Type can be supplied in request headers.
  • Object integrity checks. The Content-MD5 header, defined by RFC 1864, can supply a base64-encoded MD5 checksum. After the upload is completed, the object storage platform server will calculate a checksum for the object in the same manner. If the client and server checksums are the same, this means that all expected information was successfully sent to the server. If the checksums are different, this may mean that object information was lost in transit, and an error will be reported. Note that, although Backblaze B2 accepts and processes the Content-MD5 header, it will report a SHA1 checksum to align with uploads to the B2-native API.
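
Here's a rough sketch of a PUT upload to a presigned URL with httpx, assuming the URL has already been generated with the content-md5 and content-type header keys signed into it.

Sketch of an upload with PUT

import base64
import hashlib

import anyio
import httpx


async def put_to_presigned_url(
    presigned_put_url: str, source: str = ".env"
) -> httpx.Response:
    content = await anyio.Path(source).read_bytes()
    headers = {
        # base64-encoded MD5 checksum for the integrity check described above
        "Content-MD5": base64.b64encode(hashlib.md5(content).digest()).decode(),
        "Content-Type": "text/plain",
    }
    async with httpx.AsyncClient() as client:
        response = await client.put(presigned_put_url, content=content, headers=headers)
        response.raise_for_status()
        return response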

Uploads with POST work differently than GET or PUT operations. A typical back-end engineer might ask, "Can't I just POST binary data to an API endpoint with a bearer token or something?" To which AWS might respond, "No, not really. Here's how you do it instead: pretend like you're submitting a web form." "What?"

Anyway, here's how it works:

  1. Create a POST policy. A POST policy is a security policy with a list of conditions under which uploads are allowed. It is used instead of the "canonical request" that would be used in query string auth.
  2. Create a string to sign. The policy is dumped to a JSON string, encoded to UTF-8 bytes, Base64-encoded, and then decoded back to a string.
  3. Calculate a signature. This step is basically the same as for query string auth. A signing key is derived with HMAC, and then used with the string to sign for another round of HMAC to calculate the signature.
  4. Add the signature to the HTTP request. For POST uploads, the signature is provided with other required information as form data, rather than as URL query parameters. An advantage of this approach is that it can also be used for browser-based uploads, because the form data can be used to populate the fields of an HTML web form. There is some overlap between items in the POST policy and fields in the form data, but they are not exactly the same.
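
Here's a minimal sketch of the POST policy flow. The policy contents, form fields, and signing key are illustrative placeholders; the signing key would be derived as in the signature sketch above.

Sketch of a POST policy and form data

import base64
import hashlib
import hmac
import json

# step 1: a POST policy listing the conditions under which uploads are allowed
policy = {
    "expiration": "<EXPIRATION_TIMESTAMP>",
    "conditions": [
        {"bucket": "<BUCKET_NAME>"},
        {"key": "uploads/fastenv-docs/.env"},
        {"x-amz-algorithm": "AWS4-HMAC-SHA256"},
        {"x-amz-credential": "<ACCESS_KEY>/<DATE>/<REGION>/s3/aws4_request"},
        {"x-amz-date": "<TIMESTAMP>"},
    ],
}

# step 2: dump the policy to a string, encode to UTF-8 bytes, Base64-encode,
# then decode back to a string to produce the string to sign
string_to_sign = base64.b64encode(json.dumps(policy).encode()).decode()

# step 3: sign with a key derived as shown in the signature sketch above
signing_key = b"<SIGNING_KEY>"
signature = hmac.new(signing_key, string_to_sign.encode(), hashlib.sha256).hexdigest()

# step 4: the policy and signature are sent as form data, not query parameters
form_data = {
    "key": "uploads/fastenv-docs/.env",
    "policy": string_to_sign,
    "x-amz-signature": signature,
}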

Backblaze uploads with POST are different, though there are good reasons for that (helps keep costs low). fastenv includes an implementation of the Backblaze B2 POST upload process.

List

fastenv does not currently have methods for listing bucket contents.

Perhaps someone who is willing to spend their free time parsing XML can implement this.

Getting started

Set up a virtual environment

To get started, let's set up a virtual environment and install fastenv from the command line. If you've been through the environment variable docs, the only change here is installing the optional extras like python -m pip install fastenv[httpx].

Setting up a virtual environment

python3 -m venv .venv
. .venv/bin/activate
python -m pip install fastenv[httpx]

Save a .env file

We'll work with an example .env file that contains variables in various formats. Copy the code block below using the "Copy to clipboard" icon in the top right of the code block, paste the contents into a new file in your text editor, and save it as .env.

Example .env file

# .env
AWS_ACCESS_KEY_ID_EXAMPLE=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY_EXAMPLE=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLE
CSV_VARIABLE=comma,separated,value
EMPTY_VARIABLE=''
# comment
INLINE_COMMENT=no_comment  # inline comment
JSON_EXAMPLE='{"array": [1, 2, 3], "exponent": 2.99e8, "number": 123}'
PASSWORD='64w2Q$!&,,[EXAMPLE'
QUOTES_AND_WHITESPACE='text and spaces'
URI_TO_DIRECTORY='~/dev'
URI_TO_S3_BUCKET=s3://mybucket/.env
URI_TO_SQLITE_DB=sqlite:////path/to/db.sqlite
URL_EXAMPLE=https://start.duckduckgo.com/

These environment variables are formatted as described in the environment variable docs.

Create a bucket

We'll also need to create a bucket in cloud object storage. Backblaze B2 and AWS S3 are directly supported and tested.

Backblaze B2 offers 10 GB for free, so consider signing up and creating a bucket there.

You can also create a bucket on AWS S3 if you prefer.

Create credentials

Credentials are usually required in order to connect to an object storage bucket.

Credentials for cloud object storage have two parts: a non-secret portion and a secret portion.

AWS

AWS calls these credentials "access keys," and commonly stores them in environment variables named AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_SESSION_TOKEN. After configuring the AWS CLI, access keys can be retrieved programmatically.

Retrieving AWS access keys programmatically with the AWS CLI

AWS_ACCESS_KEY_ID=$(aws configure get fastenv.aws_access_key_id)
AWS_SECRET_ACCESS_KEY=$(aws configure get fastenv.aws_secret_access_key)

AWS session token support

AWS session tokens are used when resources obtain temporary security credentials. The authorization flow works like this:

  • IAM roles, such as service-linked roles or Lambda execution roles, are set up and linked to infrastructure resources. These roles can have two kinds of IAM policies attached:
    1. Resource-based policies called "role trust policies" define how the role can be assumed.
    2. Identity-based policies define what the role can do once it has been assumed (interactions with other resources on AWS).
  • The AWS runtime (Fargate, Lambda, etc.) requests authorization to use the role by calling the STS AssumeRole API.
  • If the requesting entity has permissions to assume the role, STS responds with temporary security credentials that have permissions based on the identity-based policies associated with the IAM role.
  • The AWS runtime stores the temporary security credentials, typically by setting environment variables:
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • AWS_SESSION_TOKEN
  • AWS API calls with temporary credentials must include the session token.
  • The AWS runtime will typically rotate the temporary security credentials before they expire.

fastenv supports session tokens. The session_token argument can be passed to fastenv.ObjectStorageConfig or fastenv.ObjectStorageClient. If the session token is not provided as an argument, fastenv will check for the environment variable AWS_SESSION_TOKEN.

It is important to keep session token expiration in mind: fastenv will not automatically rotate tokens. Developers are responsible for updating client attributes or instantiating new clients when temporary credentials expire. This is particularly important when generating S3 presigned URLs. As explained in the AWS docs, "If you created a presigned URL using a temporary token, then the URL expires when the token expires, even if the URL was created with a later expiration time."
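
For example, temporary credentials can be supplied to the configuration class directly. The access key arguments are omitted in this sketch so they are detected from the environment variables listed above; the session token could likewise be detected from AWS_SESSION_TOKEN instead of being passed in.

Passing a session token to the configuration class

import fastenv

# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are detected from the environment
config = fastenv.ObjectStorageConfig(
    bucket_host="<BUCKET_NAME>.s3.<REGION>.amazonaws.com",
    bucket_region="<REGION>",
    session_token="<AWS_SESSION_TOKEN>",
)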

Backblaze

Backblaze calls these credentials "application keys." Backblaze doesn't specify environment variable names, so it's easiest to use the same environment variable names as for AWS.

Setting Backblaze credentials using AWS variable names

AWS_ACCESS_KEY_ID="<YOUR_BACKBLAZE_B2_ACCESS_KEY_HERE>"
AWS_SECRET_ACCESS_KEY="<YOUR_BACKBLAZE_B2_SECRET_KEY_HERE>"

Omitting credentials from shell history

It's preferable to avoid storing sensitive credentials like AWS_SECRET_ACCESS_KEY in your shell history. Thankfully, most shells offer the ability to omit commands from the shell history by prefixing the command with one or more spaces. In Bash, this behavior can be enabled with the HISTCONTROL and HISTIGNORE environment variables. In Zsh, this behavior can be enabled with HIST_IGNORE_SPACE or setopt histignorespace.

Uploading files

Now that we have a bucket, let's upload the .env file to it. It's a three-step process:

  1. Create a configuration instance. To instantiate fastenv.ObjectStorageConfig, provide a bucket and a region. Buckets can be specified with the bucket_host argument in "virtual-hosted-style," like <BUCKET_NAME>.s3.<REGION>.amazonaws.com for AWS S3 or <BUCKET_NAME>.s3.<REGION>.backblazeb2.com for Backblaze B2. For AWS S3 only, the bucket can also be provided with the bucket_name argument as just <BUCKET_NAME>. If credentials are not provided as arguments, fastenv.ObjectStorageConfig will auto-detect configuration from the default AWS environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN, and the region from either AWS_S3_REGION, AWS_REGION, or AWS_DEFAULT_REGION, in that order. Boto3 detects credentials from several other locations, including credential files and instance metadata endpoints. These other locations are not currently supported.
  2. Create a client instance. fastenv.ObjectStorageClient instances have two attributes: an instance of fastenv.ObjectStorageConfig, and an instance of httpx.AsyncClient. They can be automatically instantiated if not provided as arguments. We've instantiated the fastenv.ObjectStorageConfig instance separately in step 1 to see how it works, but we'll let fastenv.ObjectStorageClient instantiate its httpx.AsyncClient automatically. As a shortcut, you could skip step 1 and just provide the configuration arguments to fastenv.ObjectStorageClient, like fastenv.ObjectStorageClient(bucket_host="<BUCKET_NAME>.s3.<REGION>.backblazeb2.com", bucket_region="<REGION>").
  3. Use the client's upload method to upload the file. To upload, we need to specify a source and a destination path. The destination path is like a file path. AWS uses the term "key" for these bucket paths because buckets don't have actual directories. The "file path" inside the bucket is just a virtual path, not a concrete file path.

Here's an example of how the code might look. Save the code snippet below as example.py.

Uploading a .env file to a bucket

#!/usr/bin/env python3
# example.py
from __future__ import annotations

import anyio
import fastenv
import httpx


async def upload_my_dotenv(
    bucket_host: str,
    bucket_region: str,
    bucket_path: str = "uploads/fastenv-docs/.env",
    source: anyio.Path | str = ".env",
) -> httpx.Response | None:
    config = fastenv.ObjectStorageConfig(  # (1)
        bucket_host=bucket_host,
        bucket_region=bucket_region,
    )
    client = fastenv.ObjectStorageClient(config=config)  # (2)
    return await client.upload(bucket_path, source)  # (3)


if __name__ == "__main__":
    bucket_host = "<BUCKET_NAME>.s3.<REGION>.backblazeb2.com"
    bucket_region = "<REGION>"
    anyio.run(upload_my_dotenv, bucket_host, bucket_region)
  1. Step 1: create a configuration instance
  2. Step 2: create a client instance
  3. Step 3: use the client's upload method to upload the file

Then set credentials and run the script from a shell. Remember to activate the virtualenv if you haven't already done so.

AWS_ACCESS_KEY_ID="<YOUR_ACCESS_KEY_HERE>" \
  AWS_SECRET_ACCESS_KEY="<YOUR_SECRET_KEY_HERE>" \
  python example.py

Downloading files

We now have a bucket with a .env file in it. Let's download the file. The steps are pretty much the same.

Downloading a .env file from a bucket

#!/usr/bin/env python3
# example.py
from __future__ import annotations

import anyio
import fastenv
import httpx


async def upload_my_dotenv(
    bucket_host: str,
    bucket_region: str,
    bucket_path: str = "uploads/fastenv-docs/.env",
    source: anyio.Path | str = ".env",
) -> httpx.Response | None:
    config = fastenv.ObjectStorageConfig(
        bucket_host=bucket_host,
        bucket_region=bucket_region,
    )
    client = fastenv.ObjectStorageClient(config=config)
    return await client.upload(bucket_path, source)


async def download_my_dotenv(
    bucket_host: str,
    bucket_region: str,
    bucket_path: str = "uploads/fastenv-docs/.env",
    destination: anyio.Path | str = ".env.download",
) -> anyio.Path:
    config = fastenv.ObjectStorageConfig(
        bucket_host=bucket_host,
        bucket_region=bucket_region,
    )
    client = fastenv.ObjectStorageClient(config=config)
    return await client.download(bucket_path, destination)


if __name__ == "__main__":
    bucket_host = "<BUCKET_NAME>.s3.<REGION>.backblazeb2.com"
    bucket_region = "<REGION>"
    # anyio.run(upload_my_dotenv, bucket_host, bucket_region)
    anyio.run(download_my_dotenv, bucket_host, bucket_region)

Then set credentials and run the script from a shell. Remember to activate the virtualenv if you haven't already done so.

AWS_ACCESS_KEY_ID="<YOUR_ACCESS_KEY_HERE>" \
  AWS_SECRET_ACCESS_KEY="<YOUR_SECRET_KEY_HERE>" \
  python example.py

Downloading multiple files

Sometimes applications use multiple .env files. For example, a team may have a common .env file that provides variables used across many applications. Each application may also have its own .env file to customize, or add to, the variables in the common file.

Here's an example of how this could be implemented.

Downloading multiple .env files

#!/usr/bin/env python3
# example.py
from __future__ import annotations

import anyio
import fastenv
import httpx


async def download_my_dotenvs(
    bucket_host: str,
    bucket_region: str,
    bucket_path_to_common_env: str = ".env.common",
    bucket_path_to_custom_env: str = ".env.custom",
) -> fastenv.DotEnv:
    config = fastenv.ObjectStorageConfig(
        bucket_host=bucket_host,
        bucket_region=bucket_region,
    )
    client = fastenv.ObjectStorageClient(config=config)
    env_common = await client.download(bucket_path_to_common_env)
    env_custom = await client.download(bucket_path_to_custom_env)
    return fastenv.DotEnv(env_common, env_custom)


if __name__ == "__main__":
    bucket_host = "<BUCKET_NAME>.s3.<REGION>.backblazeb2.com"
    bucket_region = "<REGION>"
    anyio.run(download_my_dotenvs, bucket_host, bucket_region)

Cloud object storage comparisons

AWS S3

Azure Blob Storage

Azure Blob Storage is not S3-compatible, and will not be directly supported by fastenv. In downstream projects that store .env file objects in Azure, users are welcome to download objects using the Azure Python SDK, and then load the files with fastenv.load_dotenv() after download.
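
For example, a downstream project might use something like the following sketch, assuming the azure-storage-blob package is installed. The connection string, container name, and blob name are placeholders, and this is not part of fastenv.

Loading a .env file stored in Azure Blob Storage

import anyio
import fastenv
from azure.storage.blob import BlobClient


async def load_dotenv_from_azure() -> fastenv.DotEnv:
    # download the blob with the Azure SDK, then load it with fastenv
    blob_client = BlobClient.from_connection_string(
        conn_str="<AZURE_STORAGE_CONNECTION_STRING>",
        container_name="<CONTAINER_NAME>",
        blob_name=".env",
    )
    destination = anyio.Path(".env.download")
    await destination.write_bytes(blob_client.download_blob().readall())
    return await fastenv.load_dotenv(destination)


if __name__ == "__main__":
    anyio.run(load_dotenv_from_azure)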

Backblaze B2

  • Pricing:
  • S3-compatible API*
    • Downloading and listing operations are S3-compatible
    • *Uploads are different, though there are good reasons for that (helps Backblaze keep pricing low)
  • URIs:
    • Path style URL: https://s3.<region>.backblazeb2.com/<bucketname>
    • Virtual-hosted-style URL: https://<bucketname>.s3.<region>.backblazeb2.com
    • Presigned URLs are supported
  • Identity and Access Management (IAM):
    • Simpler than AWS, while also providing fine-grained access controls.
    • No configuration of IAM users or roles needed. Access controls are configured on the access keys themselves.
    • Account master access key is separate from bucket access keys
    • Access key permissions can be scoped to individual buckets, and even object names within buckets.
  • Docs

Cloudflare R2

DigitalOcean Spaces

Google Cloud Storage

Linode Object Storage