Using DuckDB With AWS SSO Credentials
Wrapping duckdb with aws configure export-credentials provides a seamless workaround.
My colleague Raúl recently showed me the advantages of DuckDB, a powerful analytical database that can directly query data from Parquet, CSV, and JSON files, including those stored in S3. However, I ran into a problem when trying to use it with credentials managed by AWS SSO.
The Problem
The DuckDB AWS extension does not natively support the AWS SSO authentication flow. This results in credential errors when you try to query a file directly from an S3 bucket. There are several open GitHub issues detailing this problem.
The Workaround
After reading through the comments on the issue, I found a clean and effective workaround for Linux/macOS users running bash or zsh. It involves creating a shell function that wraps the duckdb command.
You can add the following function to your shell configuration file (e.g., ~/.bashrc or ~/.zshrc):
bash code snippet start
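# Wrap duckdb so it runs with temporary credentials exported by the AWS CLI.
function duckdb() {
  # The subshell ( ... ) keeps the exported variables out of the calling shell.
  (eval "$(aws configure export-credentials --format env)" && command duckdb "$@")
}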
bash code snippet end
How It Works
This simple function makes the integration seamless:
- function duckdb() { ... }: It creates a new function named duckdb. When you type duckdb in your terminal, this function is called instead of the original binary.
- aws configure export-credentials --format env: This command fetches temporary, short-lived credentials (access key, secret key, and session token) from your active AWS SSO session and outputs them in a format that can be used to set environment variables.
- eval "$(...)": This executes the output from the aws command, setting the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables.
- ( ... ): The entire command is wrapped in a subshell. This is a key benefit, as it ensures the exported credentials only exist for the scope of this single command and do not leak into your main shell session.
- command duckdb "$@": Finally, it executes the original duckdb command, passing along any arguments you provided ($@). The command builtin skips shell functions during lookup, so the wrapper runs the real binary instead of calling itself. The DuckDB AWS extension automatically picks up the standard AWS environment variables for authentication.
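The scoping behavior described above can be checked without touching AWS at all. The following is a minimal sketch for bash, not the wrapper itself: fake_export_credentials is a hypothetical stand-in for aws configure export-credentials --format env, the DEMO_ variable names and their values are placeholders, and printenv plays the role of the duckdb binary.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for `aws configure export-credentials --format env`.
# It only prints export statements with placeholder values; no AWS call is made.
fake_export_credentials() {
  echo 'export DEMO_ACCESS_KEY_ID=placeholder-key'
  echo 'export DEMO_SESSION_TOKEN=placeholder-token'
}

# Same shape as the duckdb wrapper: eval inside a subshell, then run a command.
# `printenv` stands in for the real duckdb binary.
run_with_demo_credentials() {
  (eval "$(fake_export_credentials)" && command printenv "$@")
}

# The variable is visible to the command started inside the subshell...
inside=$(run_with_demo_credentials DEMO_ACCESS_KEY_ID)

# ...but it was never set in the shell that called the wrapper.
outside=${DEMO_ACCESS_KEY_ID:-not set}

echo "inside the subshell:  $inside"
echo "in the calling shell: $outside"
```

Running the script should report placeholder-key inside the subshell and not set in the calling shell, which is the same isolation the real wrapper gives the AWS credentials.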
After reloading your shell configuration (for example, source ~/.zshrc), you can run duckdb as you normally would, and queries against S3 will authenticate with your SSO credentials:
shell code snippet start
duckdb -c "SELECT * FROM 's3://somebucket/somefile.csv'"
shell code snippet end
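A query passed with -c reaches DuckDB as a single argument because the wrapper forwards "$@" in double quotes. The sketch below illustrates the difference in bash using hypothetical helper functions (count_args, forward_quoted, forward_unquoted) and a placeholder query; it does not call duckdb. Note that zsh does not word-split unquoted parameters by default, so the unquoted variant behaves differently there.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Forwards arguments the way the wrapper does: each one stays intact.
forward_quoted() { count_args "$@"; }

# Forwards arguments without quotes: bash re-splits them on whitespace.
forward_unquoted() { count_args $@; }

forward_quoted   -c "SELECT 42 AS answer"   # prints 2: -c and the whole query
forward_unquoted -c "SELECT 42 AS answer"   # prints 5: the query is split into words
```

With the unquoted form, DuckDB would receive the query in pieces, so keeping the quotes around "$@" matters for any argument that contains spaces.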