Developer Guide
The full local-development workflow for Aegis-KMS — from cloning the repo to shipping a PR. If you only want to use Aegis, Quickstart is the better entry point. This document is for people writing code.
1. Prerequisites
| Tool | Version | Why |
|---|---|---|
| JDK | 21 (recommended) or 17 | Aegis is Scala 3.3.x; CI matrix tests both 17 and 21 |
| sbt | 1.10+ | Build tool; the repo ships an sbt wrapper but a system install is fine |
| Docker | any recent | Required for Testcontainers-based persistence tests |
| Git | 2.30+ | DCO sign-off uses git commit -s |
| GPG | optional | Only needed if you cut a release that publishes to Maven Central |
We strongly recommend SDKMAN for managing the JDK / sbt versions:
```bash
sdk install java 21.0.5-tem
sdk install sbt
```
If you're on macOS and prefer Homebrew: brew install temurin sbt.
2. First build
```bash
git clone https://github.com/sharma-bhaskar/aegis-kms.git
cd aegis-kms
sbt compile
```
The first compile downloads ~400MB of dependencies and takes 3-5 minutes on a cold cache. Every subsequent compile is incremental and finishes in seconds.
Verify the build:
```bash
sbt test                   # 160+ tests across all modules
sbt scalafmtCheckAll       # formatting gate
sbt "scalafixAll --check"  # lint gate
```
If sbt test fails on a clean checkout, file an issue — that's a real bug.
3. Project layout
Aegis is a multi-project Scala build with a strictly enforced two-tier module split. This is the most important rule in the codebase.
```
aegis-kms/
├── modules/
│   ├── aegis-core/          ← Library tier (NO Pekko)
│   ├── aegis-iam/           ← Library tier
│   ├── aegis-audit/         ← Library tier
│   ├── aegis-crypto/        ← Library tier
│   ├── aegis-persistence/   ← Library tier
│   ├── aegis-sdk-scala/     ← Library tier
│   ├── aegis-sdk-java/      ← Library tier
│   ├── aegis-http/          ← Server tier (Pekko-aware)
│   ├── aegis-agent-ai/      ← Server tier
│   ├── aegis-server/        ← Server tier (Docker image)
│   ├── aegis-cli/           ← Server-tier-adjacent (no Pekko, packaged app)
│   ├── aegis-kmip/          ← Server tier (skeleton)
│   └── aegis-mcp-server/    ← Server tier (skeleton)
├── docs/                    ← Rendered docs source (this site)
├── deploy/
│   ├── docker/              ← docker-compose.yml + Dockerfile context
│   └── helm/                ← Placeholder, v0.3.0
├── project/
│   ├── build.properties     ← sbt version
│   ├── plugins.sbt          ← sbt plugins (sbt-ci-release, sbt-mima, ...)
│   └── Dependencies.scala   ← Pinned dependency versions
└── build.sbt                ← Module definitions, settings, dependency overrides
```
The two-tier split, explained
Library tier modules MUST NOT depend on Pekko. They're embeddable in any JVM app: a Spring
Boot service, a Lambda function, an sbt-launched script. Adding a pekko import to any of them
will fail to compile because aegis-core and friends literally don't have Pekko on their
classpath.
Server tier modules add concurrency, HTTP, and runtime. They depend on Pekko Typed for actor-based state, pekko-http via Tapir for the REST plane, and a small handful of server-only deps (Micrometer for Prometheus, OpenTelemetry SDK, etc.).
Why this matters:
- The split lets enterprise users embed Aegis as a library without taking on the full server-tier dependency closure.
- It makes property tests in `aegis-core` cheap: no actor system to spin up, no HTTP server.
- It prevents vendor coupling from leaking into the core algebras.
If you're unsure which tier a piece of new code belongs to, ask in the PR. The default for library-tier additions is no I/O, no Pekko, no vendor SDK calls.
Module names: sbt vs. directory
The directory modules/aegis-core/ defines an sbt module named core (camelCase, no
prefix). This is unfortunate but intentional: sbt module identifiers are Scala values, which
can't contain hyphens, and the directory prefix communicates the project name to humans
browsing the repo.
| Directory | sbt module | Run a test |
|---|---|---|
| `modules/aegis-core/` | `core` | `sbt "core / test"` |
| `modules/aegis-persistence/` | `persistence` | `sbt "persistence / test"` |
| `modules/aegis-crypto/` | `crypto` | `sbt "crypto / test"` |
| `modules/aegis-sdk-scala/` | `sdkScala` | `sbt "sdkScala / test"` |
| `modules/aegis-sdk-java/` | `sdkJava` | `sbt "sdkJava / test"` |
| `modules/aegis-mcp-server/` | `mcpServer` | `sbt "mcpServer / test"` |
| `modules/aegis-agent-ai/` | `agentAi` | `sbt "agentAi / test"` |
| `modules/aegis-http/` | `http` | `sbt "http / test"` |
| `modules/aegis-server/` | `server` | `sbt "server / test"` |
| `modules/aegis-cli/` | `cli` | `sbt "cli / test"` |
4. Build & test cheatsheet
```bash
# Compile everything
sbt compile

# Run all tests
sbt test

# Run one module's tests
sbt "core / test"

# Run one suite
sbt "http / testOnly *HttpRoutesSpec"

# Run one test by name substring
sbt "http / testOnly *HttpRoutesSpec -- -z 'creates a key'"

# Format gate (CI runs this)
sbt scalafmtCheckAll scalafmtSbtCheck

# Lint gate (CI runs this)
sbt "scalafixAll --check"

# Auto-format
sbt scalafmtAll

# Auto-fix lint
sbt scalafixAll

# Run the server (forks a JVM; in-memory journal by default)
sbt "server / run"

# Build the local Docker image
sbt "server / Docker / publishLocal"

# Build the CLI tarball
sbt "cli / Universal / packageZipTarball"

# Binary-compat check vs. the previous release
sbt mimaReportBinaryIssues
```
Test / fork := true is set globally, and run / fork := true is set on server
specifically (Pekko's user-guardian dispatcher hangs in sbt's in-process classloader otherwise —
do not remove these settings).
5. Testing
Unit tests
We use ScalaTest with the AnyFlatSpec style. Tests follow the
naming convention <ClassName>Spec.scala and live next to the code under
modules/<module>/src/test/scala/<package>/.
The canonical reference is KeyServiceSpec —
copy its shape when adding a new test for the algebra. For decorators, see
AuditingKeyServiceSpec.
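For readers who have not used the AnyFlatSpec style before, the following is a minimal sketch of the shape such a spec takes. `Counter` and `CounterSpec` are hypothetical names invented for illustration and are not Aegis types; the real `KeyServiceSpec` remains the reference to copy. It assumes ScalaTest is on the test classpath.

```scala
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

// Hypothetical immutable type under test; not part of Aegis.
final case class Counter(value: Int):
  def increment: Counter = copy(value = value + 1)

// The file would be named CounterSpec.scala, following <ClassName>Spec.scala.
class CounterSpec extends AnyFlatSpec with Matchers:

  "Counter" should "increase by one on increment" in {
    Counter(0).increment.value shouldBe 1
  }

  it should "leave the original instance untouched" in {
    val original = Counter(5)
    original.increment
    original.value shouldBe 5
  }
```

Each `should ... in` clause becomes one named test, which is what makes the `-z 'substring'` filter from the cheatsheet usable.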
Integration tests with Testcontainers
Some aegis-persistence tests spin up a real Postgres container via
Testcontainers. They call assume(...) to skip themselves when
Docker isn't available, so sbt test succeeds on workstations without Docker. In CI we
always run them.
If a persistence test silently passes when you expect it to fail, check whether Docker is
running. The test output will say [info] Skipped (Docker unavailable).
Property tests
aegis-core algebra invariants (state-machine completeness, idempotence) are tested with
ScalaCheck. New algebra changes should add a property test — see KeySpecGenSpec for the
canonical pattern.
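As a rough illustration of what an idempotence property looks like in ScalaCheck, here is a small sketch. The `KeyState` enum and `Lifecycle.deactivate` function are hypothetical stand-ins, not the real `aegis-core` state machine; `KeySpecGenSpec` is still the pattern to follow in the repo. It assumes ScalaCheck is on the test classpath.

```scala
import org.scalacheck.Prop.forAll
import org.scalacheck.{Gen, Properties}

// Hypothetical lifecycle states, for illustration only.
enum KeyState:
  case PreActive, Active, Deactivated, Destroyed

object Lifecycle:
  // Only an Active key changes state; every other state is left as-is.
  def deactivate(state: KeyState): KeyState = state match
    case KeyState.Active => KeyState.Deactivated
    case other           => other

object LifecycleProps extends Properties("Lifecycle"):
  // Generator covering every enum case, so the property checks all states.
  private val anyState: Gen[KeyState] = Gen.oneOf(KeyState.values.toSeq)

  property("deactivate is idempotent") = forAll(anyState) { state =>
    Lifecycle.deactivate(Lifecycle.deactivate(state)) == Lifecycle.deactivate(state)
  }
```

Generating from `KeyState.values` means a newly added state is picked up by the property automatically.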
6. Code style & quality gates
Hard rules enforced by the build:
- Scala 3. No Scala 2 idioms. Use `enum`, `given`/`using`, `extension`, contextual abstractions.
- `-Xfatal-warnings` + `-Wunused:all`: every warning fails the build.
- scalafmt on every file. CI runs `scalafmtCheckAll` and `scalafmtSbtCheck`.
- scalafix for lint and import organization. CI runs `scalafixAll --check`.
Soft rules (PR review will surface these):
- Prefer immutable data; mutable state lives only in actor `Behavior` closures.
- `Option`/`Either` over null / exceptions for control flow. Exceptions are reserved for truly exceptional events (driver crashes, JVM-level failures).
- ADTs over boolean flags. If you have two booleans on a class, you probably want a 4-arm enum.
- Actor behaviors are always typed (`Behavior[T]` over a sealed command ADT). Untyped actors are not used anywhere in the codebase.
- DCO sign-off: every commit must be signed off (`git commit -s`).
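The "ADTs over boolean flags" and "`Either` over exceptions" rules can be sketched as follows. All names here (`KeyFlags`, `KeyStatus`, `EncryptError`, `checkEncrypt`) are hypothetical and chosen only to show the shape; they are not taken from the Aegis codebase.

```scala
// Two booleans admit four combinations, some of which may be meaningless.
final case class KeyFlags(enabled: Boolean, pendingDeletion: Boolean)

// A hypothetical 4-arm enum names each legal state explicitly instead.
enum KeyStatus:
  case Enabled, Disabled, PendingDeletion, Destroyed

// Expected failures are values, not exceptions.
enum EncryptError:
  case KeyUnusable(status: KeyStatus)

def checkEncrypt(status: KeyStatus): Either[EncryptError, Unit] = status match
  case KeyStatus.Enabled => Right(())
  case KeyStatus.Disabled | KeyStatus.PendingDeletion | KeyStatus.Destroyed =>
    Left(EncryptError.KeyUnusable(status))
```

Because the match is exhaustive over the enum, adding a fifth status produces a compiler warning at every call site, which `-Xfatal-warnings` turns into a build failure.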
Imports
scalafmt's AsciiSortImports rule and scalafix's OrganizeImports rule are deliberately
configured to agree on import order. Don't change the import order config in either tool
without verifying both still produce the same output — they used to disagree, and the disagreement
caused a long-running CI flake.
7. Architecture: what you need to know to navigate
(For the full architecture writeup see Concepts → Architecture. This section is the contributor's-eye summary.)
The decorator stack
Every request through aegis-server flows through six decorators in this order:
HTTP → Audit → Tracing → Metrics → Authorize → Actor → Persistence + RoT
Each decorator wraps the same KeyService[F[_]] algebra and adds one orthogonal concern:
| Decorator | Adds | Module |
|---|---|---|
| `AuditingKeyService` | Append-only audit log on every state change | aegis-audit |
| `TracingKeyService` | OpenTelemetry spans per operation | aegis-server |
| `MeteredKeyService` | Prometheus counters + latency histograms | aegis-server |
| `AuthorizingKeyService` | IAM allowlist check, returns `PermissionDenied` on miss | aegis-iam |
| `ActorBackedKeyService` | Single-writer state via Pekko Typed actor | aegis-server |
| `PostgresEventJournal` (or in-memory) | Persistent event log + RoT calls | aegis-persistence |
The order matters:
- Audit is outermost so it sees the final outcome including the IAM denial. Moving audit inside `AuthorizingKeyService` would cause denied requests to silently disappear from audit.
- Authorize is inside metrics so denies are countable as `aegis_keys_op_errors_total{code="PermissionDenied"}` rather than disappearing.
- The actor is innermost so all state-mutating calls serialize through one mailbox per process. Concurrent reads bypass the actor and hit the journal directly.
When adding a new concern, decide which layer it belongs to and slot a new decorator. Don't add the concern to an existing decorator — that's how this codebase becomes Vault.
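The wrapping order described above can be sketched with a deliberately tiny algebra. Everything below is a simplified stand-in: the single `describe` operation, the `Id` effect, the constructor parameters, and the log format are all assumptions made for illustration, and the real decorators in `aegis-audit` and `aegis-iam` will differ.

```scala
// Hypothetical, cut-down error type and algebra with one operation.
enum KmsError:
  case PermissionDenied, NotFound

trait KeyService[F[_]]:
  def describe(principal: String, id: String): F[Either[KmsError, String]]

// Identity effect keeps the sketch free of any effect library.
type Id[A] = A

// Innermost layer: stands in for the actor-backed service.
final class InMemoryKeyService(keys: Map[String, String]) extends KeyService[Id]:
  def describe(principal: String, id: String): Either[KmsError, String] =
    keys.get(id).toRight(KmsError.NotFound)

// Authorization decorator: rejects principals missing from the allowlist.
final class AuthorizingKeyService(inner: KeyService[Id], allowed: Set[String])
    extends KeyService[Id]:
  def describe(principal: String, id: String): Either[KmsError, String] =
    if allowed.contains(principal) then inner.describe(principal, id)
    else Left(KmsError.PermissionDenied)

// Audit decorator: outermost, so it records denials as well as successes.
final class AuditingKeyService(inner: KeyService[Id], log: String => Unit)
    extends KeyService[Id]:
  def describe(principal: String, id: String): Either[KmsError, String] =
    val outcome = inner.describe(principal, id)
    log(s"$principal describe $id -> ${outcome.fold(_.toString, _ => "Success")}")
    outcome

@main def decoratorDemo(): Unit =
  val core = InMemoryKeyService(Map("k1" -> "AES-256"))
  val service = AuditingKeyService(AuthorizingKeyService(core, Set("alice")), println)
  service.describe("alice", "k1")   // audited as Success
  service.describe("mallory", "k1") // audited as PermissionDenied
```

Swapping the two wrappers in `decoratorDemo` would make the denied call return before the audit layer ever runs, which is the failure mode the ordering rule guards against.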
Pekko Typed actor pattern (load-bearing)
aegis-server uses one core actor: KeyOpsActor. The pattern is unusual:
```scala
val system = ActorSystem(KeyOpsActor.behavior(deps), "aegis")
// system IS the actor; no user-guardian dance
val keyService = ActorBackedKeyService(system, ...)
```
This works because in Pekko Typed, ActorSystem[T] <: ActorRef[T]. The actor system is the
user guardian. Earlier versions used the more conventional Behaviors.setup { ctx => ... } +
Promise + Await.result pattern to expose an ActorRef to the main thread; that pattern
hangs on cold start on JDK 21 + sbt's classloader because the user guardian's setup block
is never dispatched.
If you see boot-time hangs, check whether someone has reintroduced the user-guardian +
Promise pattern. The fix is in PR #45
and there's a regression test (ServerWiringSpec) that asserts the actor system itself is the
actor.
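To make the "actor system is the actor" idea concrete, here is a minimal, self-contained sketch using Pekko Typed. The `Command` ADT and `behavior` function are hypothetical and far simpler than the real `KeyOpsActor` protocol; only the wiring shape is the point. It assumes `pekko-actor-typed` is on the classpath.

```scala
import org.apache.pekko.actor.typed.{ActorRef, ActorSystem, Behavior}
import org.apache.pekko.actor.typed.scaladsl.Behaviors

// Hypothetical sealed command ADT, for illustration only.
enum Command:
  case Touch(keyId: String)
  case Stop

// State lives in the behavior closure; each message returns the next behavior.
def behavior(seen: Set[String]): Behavior[Command] =
  Behaviors.receiveMessage {
    case Command.Touch(keyId) => behavior(seen + keyId)
    case Command.Stop         => Behaviors.stopped
  }

@main def guardianDemo(): Unit =
  // The root behavior is the actor itself, so no Promise / Await hand-off is needed.
  val system: ActorSystem[Command] = ActorSystem(behavior(Set.empty), "sketch")
  // ActorSystem[T] extends ActorRef[T], so it can be used wherever a ref is expected.
  val ref: ActorRef[Command] = system
  ref ! Command.Touch("k1")
  ref ! Command.Stop // stopping the guardian terminates the actor system
```

The main thread never blocks waiting for a reference, which is the property the boot-hang fix relies on.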
Resource[IO] boot scope
Server.scala uses IOApp.Simple + a single composed Resource[IO, Unit] chain to acquire
and release every long-lived dependency:
```scala
def run: IO[Unit] =
  (for
    // Acquired top to bottom; released bottom to top on shutdown.
    metricsRegistry <- MetricsRegistry.make
    journal         <- PostgresEventJournal.make(config.jdbc, metricsRegistry)
    actorSystem     <- ActorSystemResource.make(...)
    httpBinding     <- HttpServer.bind(...)
  yield ()).useForever
```
SIGTERM unwinds the stack in reverse — HTTP unbind (5 s grace) → actor system terminate →
journal pool close → meter registry close. Don't replace this with unsafeRunSync calls —
the previous pattern leaked the journal connection pool until JVM exit.
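The reverse-order release is a general property of composed `Resource` values in Cats Effect, and it can be observed in isolation. The sketch below uses a hypothetical `logged` helper and plain strings in place of the real journal, actor system, and HTTP binding; it assumes Cats Effect 3 is on the classpath.

```scala
import cats.effect.{IO, IOApp, Ref, Resource}

object ReleaseOrderSketch extends IOApp.Simple:

  // Hypothetical helper: a resource that records its own acquire and release.
  def logged(name: String, log: Ref[IO, Vector[String]]): Resource[IO, String] =
    Resource.make(log.update(_ :+ s"acquire $name").as(name))(_ =>
      log.update(_ :+ s"release $name")
    )

  // Same shape as the boot chain: later resources are released first.
  def stack(log: Ref[IO, Vector[String]]): Resource[IO, Unit] =
    for
      _ <- logged("journal", log)
      _ <- logged("actorSystem", log)
      _ <- logged("http", log)
    yield ()

  // Acquire the whole stack, release it, and return the recorded events.
  def events: IO[Vector[String]] =
    for
      log <- Ref.of[IO, Vector[String]](Vector.empty)
      _   <- stack(log).use(_ => IO.unit)
      out <- log.get
    yield out

  def run: IO[Unit] = events.flatMap(lines => IO.println(lines.mkString("\n")))
```

Running it prints the three acquires in declaration order followed by `release http`, `release actorSystem`, `release journal`, mirroring the shutdown sequence described above.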
Vendor isolation via SPIs
aegis-core MUST NOT contain vendor-specific code. Three SPIs in the library tier handle
adapter pluggability:
| SPI | Module | Today | v0.2.0 |
|---|---|---|---|
| `RootOfTrust` | aegis-crypto | AWS KMS | + GCP KMS, Azure Key Vault, HashiCorp Vault, PKCS#11 |
| `EventJournal` | aegis-persistence | Postgres + in-memory | + MySQL, SQLite |
| `AuditSink` | aegis-audit | stdout | + Kafka, S3, Webhook, OTel logs |
If you're adding a vendor adapter, the right place is one of these modules — never aegis-core.
8. Adding a new feature: walked example
Imagine adding a new crypto operation mac(id, message). The places you'd touch, in order:
- `aegis-core/src/main/scala/dev/aegiskms/core/KeyService.scala`: add `def mac(...)` to the algebra, implement in the in-memory reference impl.
- `aegis-core/src/main/scala/dev/aegiskms/core/Operation.scala`: add `Operation.Mac` to the IAM allowlist enum.
- `aegis-iam/src/main/scala/dev/aegiskms/iam/AuthorizingKeyService.scala`: route `mac` through the policy check.
- `aegis-audit/src/main/scala/dev/aegiskms/audit/AuditingKeyService.scala`: record a `Success`/`Failed` audit row carrying the relevant metadata (operation, principal, and message length, but not the message content).
- `aegis-crypto/src/main/scala/dev/aegiskms/crypto/RootOfTrust.scala`: add `mac` to the SPI.
- `aegis-crypto/src/main/scala/dev/aegiskms/crypto/aws/AwsKmsRootOfTrust.scala`: implement via the AWS KMS `GenerateMac` API.
- `aegis-server/src/main/scala/dev/aegiskms/app/ActorBackedKeyService.scala`: route through the actor mailbox if state-mutating, or bypass it if read-only.
- `aegis-http/src/main/scala/dev/aegiskms/http/Endpoints.scala`: add the Tapir endpoint `POST /v1/keys/{id}/mac`.
- `aegis-http/src/main/scala/dev/aegiskms/http/HttpRoutes.scala`: wire the endpoint to the route.
- `aegis-cli/src/main/scala/dev/aegiskms/cli/{Cli,Commands}.scala`: add the `aegis keys mac` verb.
- Tests across all of the above. Aim for at least: an algebra test, a decorator test, a REST integration test, a CLI parser test.
- `CHANGELOG.md`: under `## Unreleased`, an entry describing what shipped.
The existing sign / verify PR (#5)
is the canonical reference for this pattern. Read its diff before opening a similar PR.
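For the first step in the list, an in-memory reference implementation of `mac` might look roughly like the sketch below. The `MacOps` trait, the `KmsError.KeyNotFound` case, and the choice of HMAC-SHA256 are all assumptions made for illustration; the real algebra lives in `KeyService.scala` and its signature may differ. Only JDK classes are used.

```scala
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical, simplified error type; not the real dev.aegiskms.core one.
enum KmsError:
  case KeyNotFound(id: String)

// Hypothetical slice of the algebra containing only the new operation.
trait MacOps:
  def mac(id: String, message: Array[Byte]): Either[KmsError, Array[Byte]]

// In-memory reference implementation backed by the JDK's HMAC-SHA256.
final class InMemoryMacOps(keys: Map[String, Array[Byte]]) extends MacOps:
  def mac(id: String, message: Array[Byte]): Either[KmsError, Array[Byte]] =
    keys.get(id).toRight(KmsError.KeyNotFound(id)).map { raw =>
      val hmac = Mac.getInstance("HmacSHA256")
      hmac.init(new SecretKeySpec(raw, "HmacSHA256"))
      hmac.doFinal(message)
    }
```

A missing key surfaces as a `Left` rather than an exception, which keeps the decorators above it (authorization, audit) free to inspect the outcome as a value.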
9. Adding a new RootOfTrust adapter
The minimum viable shape is a class that implements the SPI:
```scala
class GcpKmsRootOfTrust(client: KeyManagementServiceClient) extends RootOfTrust[IO]:
  def generateDataKey(spec: KeySpec): IO[Either[KmsError, WrappedKey]] = ???
  def unwrap(wrapped: WrappedKey): IO[Either[KmsError, RawKey]] = ???
  def sign(id: KeyId, message: Array[Byte], alg: SigAlgorithm): IO[Either[KmsError, Signature]] = ???
  // ... and the rest of the SPI
```
Then add a config-driven boot path in Server.scala so AEGIS_ROT_KIND=gcp selects it. New
adapters should ship with:
- A unit test using a fake / mock client
- A live integration test gated on env vars (skipped on CI without credentials)
- An entry in `docs/operations/security.md` describing the IAM permissions required
- A new line in the comparison table in `docs/about/comparison.md`
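One possible shape for the config-driven boot path is to parse the environment variable into a small ADT before any adapter is constructed. In the sketch below, `RotKind`, `parse`, and `fromEnv` are hypothetical names, and both the `aws` value and the `aws` default are assumptions; only `AEGIS_ROT_KIND=gcp` is stated in this guide.

```scala
// Hypothetical ADT for the adapters a deployment can select.
enum RotKind:
  case Aws, Gcp

object RotKind:
  // Unknown values become a Left so boot can fail fast with a clear message.
  def parse(raw: String): Either[String, RotKind] =
    raw.trim.toLowerCase match
      case "aws" => Right(RotKind.Aws)
      case "gcp" => Right(RotKind.Gcp)
      case other => Left(s"Unsupported AEGIS_ROT_KIND: '$other'")

  // Reads the variable named in this guide; the "aws" default is an assumption.
  def fromEnv(env: Map[String, String] = sys.env): Either[String, RotKind] =
    parse(env.getOrElse("AEGIS_ROT_KIND", "aws"))
```

Taking the environment as a parameter keeps the selection logic unit-testable without mutating process state.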
10. Adding a new audit sink
Implement AuditSink:
```scala
trait AuditSink:
  def emit(record: AuditRecord): IO[Unit]
```
Wire into Server.scala selectable by AEGIS_AUDIT_KIND=stdout|kafka|s3|.... The
AuditingKeyService decorator does not care which sink is on the other end — keep that
invariant.
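A sink implementation can be very small. The sketch below shows an in-memory sink of the kind that is handy in decorator tests, plus a stdout-style sink. The three-field `AuditRecord` is a hypothetical, trimmed-down stand-in for the real type, and both sink classes are illustrative rather than taken from `aegis-audit`. It assumes Cats Effect 3 is on the classpath.

```scala
import cats.effect.{IO, Ref}

// Hypothetical, trimmed-down record; the real AuditRecord carries more fields.
final case class AuditRecord(operation: String, principal: String, outcome: String)

// Repeated from above so the sketch compiles on its own.
trait AuditSink:
  def emit(record: AuditRecord): IO[Unit]

// Test-friendly sink that appends records to an in-memory buffer.
final class InMemoryAuditSink(buffer: Ref[IO, Vector[AuditRecord]]) extends AuditSink:
  def emit(record: AuditRecord): IO[Unit] = buffer.update(_ :+ record)
  def records: IO[Vector[AuditRecord]] = buffer.get

// Stdout-style sink: one line per record.
object StdoutAuditSink extends AuditSink:
  def emit(record: AuditRecord): IO[Unit] =
    IO.println(s"${record.operation} ${record.principal} ${record.outcome}")
```

Because both classes expose only `emit` to the decorator, either can be swapped in without touching `AuditingKeyService`.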
11. Common debugging patterns
| Symptom | Likely cause | Fix |
|---|---|---|
| `sbt "server / run"` hangs at boot | Pekko user-guardian + Promise pattern reintroduced | Make the ActorSystem be the actor (`ActorSystem[T] <: ActorRef[T]`) |
| `[error] No source compatibility issues found, but binary compatibility issues exist` | A change broke MiMa | Either restore binary compat, or add a justified entry to `mimaBinaryIssueFilters` |
| Test passes locally, fails on CI | Likely a Testcontainers test that runs on CI's Docker but skips locally without Docker | Run with Docker locally to repro |
| `[error] Imports are not in scalafmt order` | scalafmt and scalafix disagree on import sort | Run `sbt scalafmtAll` then `sbt scalafixAll`; if they fight each other, see §6 |
| Postgres test fails with `connection refused` on CI | Testcontainers couldn't pull the postgres image | Look at CI logs for image-pull rate-limiting; pin to a specific Postgres tag |
| `[error] -Wunused:all` flags an import as unused | Probably a `given` import that scalafix doesn't see used | Use `import x.y.given` style; if it's still flagged, add `// scalafix:ok ImportsRule` |
| Audit row missing `context.source.ip` | The HTTP layer doesn't yet populate `AuditRecord.context` | Known v0.1.1 limitation; tracked as a follow-up to issue #13 |
12. Pull request flow
- Open or claim an issue. State your design before writing 500 lines.
- Branch: `feat/<thing>` for features, `fix/<thing>` for bugs, `docs/<thing>` for docs.
- Make commits small and atomic: one logical change per commit, DCO-signed.
- CHANGELOG entry under `## Unreleased` lands in the same PR as the change.
- Run the full local gate before pushing:
  ```bash
  sbt test scalafmtCheckAll scalafmtSbtCheck "scalafixAll --check" mimaReportBinaryIssues
  ```
- Push, open the PR, link the issue.
- CI runs the same gate against JDK 17 and JDK 21. Both must be green.
- At least one maintainer review; design-impacting PRs need two.
- Squash or rebase merge — the project keeps a linear history.
13. Releasing
Releases are tag-driven. The maintainer-facing runbook is in RELEASING.md. The short version:
```bash
# Promote ## Unreleased to ## X.Y.Z — date in CHANGELOG.md
git commit -s -m "chore(release): cut vX.Y.Z"
git push origin main
git tag -a vX.Y.Z -m "vX.Y.Z — short description"
git push origin vX.Y.Z
```
The tag push triggers .github/workflows/release.yml, which runs the full CI gate, then
publishes:
- Library jars to Maven Central (if PGP / Sonatype secrets are configured)
- `aegis-server` Docker image to GHCR
- CLI universal tarball to a GitHub Release with auto-generated notes
If Maven secrets aren't configured, the workflow logs ::notice::Skipping Maven Central
publish and ships Docker + CLI only — that's the correct semantics for a non-publishing
maintainer.
14. Where to ask for help
- GitHub Discussions — design questions, "is this a good idea?", "how do I…?"
- GitHub Issues — bugs, feature requests, anything actionable
- Email — see SECURITY.md for security disclosures only
- PRs — best place to ask "would you accept this change?" by opening it as a draft
15. Reference docs in this repo
- `docs/ARCHITECTURE.md`: the canonical architecture writeup, including the per-capability status table and how Aegis compares to AWS KMS / Vault / OpenBao.
- `docs/USAGE.md`: per-backend semantics, key lifecycle in practice.
- `ROADMAP.md`: per-release delivery plan and capability tracks.
- `CHANGELOG.md`: under `## Unreleased`, update before opening a PR.
- `CONTRIBUTING.md`: the short-form ground rules.
- `SECURITY.md`: responsible disclosure + deploy-time configuration matrix.
This guide is a living document. If anything here is wrong, out of date, or missing — open a PR. The fastest way to make Aegis better for the next contributor is to fix what tripped you up.