Developer Guide

The full local-development workflow for Aegis-KMS — from cloning the repo to shipping a PR. If you only want to use Aegis, Quickstart is the better entry point. This document is for people writing code.

1. Prerequisites

| Tool | Version | Why |
|------|---------|-----|
| JDK | 21 (recommended) or 17 | Aegis is Scala 3.3.x; CI matrix tests both 17 and 21 |
| sbt | 1.10+ | Build tool; the repo ships an sbt wrapper but a system install is fine |
| Docker | any recent | Required for Testcontainers-based persistence tests |
| Git | 2.30+ | DCO sign-off uses `git commit -s` |
| GPG | optional | Only needed if you cut a release that publishes to Maven Central |

We strongly recommend SDKMAN for managing the JDK / sbt versions:

```bash
sdk install java 21.0.5-tem
sdk install sbt
```

If you're on macOS and prefer Homebrew: brew install temurin sbt.

2. First build

```bash
git clone https://github.com/sharma-bhaskar/aegis-kms.git
cd aegis-kms
sbt compile
```

The first compile downloads ~400MB of dependencies and takes 3-5 minutes on a cold cache. Every subsequent compile is incremental and finishes in seconds.

Verify the build:

```bash
sbt test                   # 160+ tests across all modules
sbt scalafmtCheckAll       # formatting gate
sbt "scalafixAll --check"  # lint gate
```

If sbt test fails on a clean checkout, file an issue — that's a real bug.

3. Project layout

Aegis is a multi-project Scala build with a strictly enforced two-tier module split. This is the most important rule in the codebase.

```
aegis-kms/
├── modules/
│   ├── aegis-core/          ← Library tier (NO Pekko)
│   ├── aegis-iam/           ← Library tier
│   ├── aegis-audit/         ← Library tier
│   ├── aegis-crypto/        ← Library tier
│   ├── aegis-persistence/   ← Library tier
│   ├── aegis-sdk-scala/     ← Library tier
│   ├── aegis-sdk-java/      ← Library tier
│   ├── aegis-http/          ← Server tier (Pekko-aware)
│   ├── aegis-agent-ai/      ← Server tier
│   ├── aegis-server/        ← Server tier (Docker image)
│   ├── aegis-cli/           ← Server-tier-adjacent (no Pekko, packaged app)
│   ├── aegis-kmip/          ← Server tier (skeleton)
│   └── aegis-mcp-server/    ← Server tier (skeleton)
├── docs/                    ← Rendered docs source (this site)
├── deploy/
│   ├── docker/              ← docker-compose.yml + Dockerfile context
│   └── helm/                ← Placeholder, v0.3.0
├── project/
│   ├── build.properties     ← sbt version
│   ├── plugins.sbt          ← sbt plugins (sbt-ci-release, sbt-mima, ...)
│   └── Dependencies.scala   ← Pinned dependency versions
└── build.sbt                ← Module definitions, settings, dependency overrides
```

The two-tier split, explained

Library tier modules MUST NOT depend on Pekko. They're embeddable in any JVM app: a Spring Boot service, a Lambda function, an sbt-launched script. Any file in these modules that adds a pekko import fails to compile, because aegis-core and friends don't have Pekko on their classpath.

Server tier modules add concurrency, HTTP, and runtime. They depend on Pekko Typed for actor-based state, pekko-http via Tapir for the REST plane, and a small handful of server-only deps (Micrometer for Prometheus, OpenTelemetry SDK, etc.).

Why this matters:

The split lets enterprise users embed Aegis as a library without taking on the full server-tier dependency closure.
It makes property tests in aegis-core cheap — no actor system to spin up, no HTTP server.
It prevents vendor coupling from leaking into the core algebras.

If you're unsure which tier a piece of new code belongs to, ask in the PR. The default for library-tier additions is no I/O, no Pekko, no vendor SDK calls.
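
As a rough illustration of how the tier split can be enforced by the build itself, here is a minimal `build.sbt` sketch. The module names follow this guide, but the settings, the `pekkoVersion` value, and the exact dependency lists are assumptions for illustration; the real definitions live in `build.sbt` and `project/Dependencies.scala`.

```scala
// build.sbt (sketch, not the project's actual build definition)

// Hypothetical pin; the real version is defined in project/Dependencies.scala.
val pekkoVersion = "1.1.2"

// Library tier: no Pekko artifact appears anywhere in the dependency closure,
// so a Pekko import in these modules cannot resolve.
lazy val core = (project in file("modules/aegis-core"))
  .settings(name := "aegis-core")

lazy val persistence = (project in file("modules/aegis-persistence"))
  .dependsOn(core)
  .settings(name := "aegis-persistence")

// Server tier: the only place where Pekko is added to the classpath.
lazy val server = (project in file("modules/aegis-server"))
  .dependsOn(core, persistence)
  .settings(
    name := "aegis-server",
    libraryDependencies += "org.apache.pekko" %% "pekko-actor-typed" % pekkoVersion
  )
```

Because `dependsOn` only points from the server tier toward the library tier, the compiler does the policing: no review checklist is needed to keep Pekko out of `core`.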

Module names: sbt vs. directory

The directory modules/aegis-core/ defines an sbt module named core (camelCase, no prefix). This is unfortunate but intentional: sbt project IDs are Scala identifiers, which can't contain hyphens without backtick-quoting, and the directory prefix communicates the project name to humans browsing the repo.

| Directory | sbt module | Run a test |
|-----------|------------|------------|
| `modules/aegis-core/` | `core` | `sbt "core / test"` |
| `modules/aegis-persistence/` | `persistence` | `sbt "persistence / test"` |
| `modules/aegis-crypto/` | `crypto` | `sbt "crypto / test"` |
| `modules/aegis-sdk-scala/` | `sdkScala` | `sbt "sdkScala / test"` |
| `modules/aegis-sdk-java/` | `sdkJava` | `sbt "sdkJava / test"` |
| `modules/aegis-mcp-server/` | `mcpServer` | `sbt "mcpServer / test"` |
| `modules/aegis-agent-ai/` | `agentAi` | `sbt "agentAi / test"` |
| `modules/aegis-http/` | `http` | `sbt "http / test"` |
| `modules/aegis-server/` | `server` | `sbt "server / test"` |
| `modules/aegis-cli/` | `cli` | `sbt "cli / test"` |

4. Build & test cheatsheet

```bash
# Compile everything
sbt compile

# Run all tests
sbt test

# Run one module's tests
sbt "core / test"

# Run one suite
sbt "http / testOnly *HttpRoutesSpec"

# Run one test by name substring
sbt "http / testOnly *HttpRoutesSpec -- -z 'creates a key'"

# Format gate (CI runs this)
sbt scalafmtCheckAll scalafmtSbtCheck

# Lint gate (CI runs this)
sbt "scalafixAll --check"

# Auto-format
sbt scalafmtAll

# Auto-fix lint
sbt scalafixAll

# Run the server (forks a JVM; in-memory journal by default)
sbt "server / run"

# Build the local Docker image
sbt "server / Docker / publishLocal"

# Build the CLI tarball
sbt "cli / Universal / packageZipTarball"

# Binary-compat check vs. the previous release
sbt mimaReportBinaryIssues
```

`Test / fork := true` is set globally, and `run / fork := true` is set on `server` specifically. Do not remove these settings: without them, Pekko's user-guardian dispatcher hangs in sbt's in-process classloader.

5. Testing

Unit tests

We use ScalaTest with the AnyFlatSpec style. Test files follow the naming convention <ClassName>Spec.scala and live next to the code under modules/<module>/src/test/scala/<package>/.

The canonical reference is KeyServiceSpec — copy its shape when adding a new test for the algebra. For decorators, see AuditingKeyServiceSpec.

Integration tests with Testcontainers

Some aegis-persistence tests spin up a real Postgres container via Testcontainers. They call assume(...) so that they are skipped rather than failed when Docker isn't available, which means sbt test succeeds on workstations without Docker. CI always runs them.

If a persistence test silently passes when you expect it to fail, check whether Docker is running. The test output will say [info] Skipped (Docker unavailable).

Property tests

aegis-core algebra invariants (state-machine completeness, idempotence) are tested with ScalaCheck. New algebra changes should add a property test — see KeySpecGenSpec for the canonical pattern.
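
To make "state-machine completeness" and "idempotence" concrete, here is a dependency-free sketch. It is not the ScalaCheck pattern used in the repo: exhaustive enumeration over small enums stands in for generators, and `KeyState`, `KeyEvent`, `transition`, and `step` are hypothetical names invented for this illustration, not the real aegis-core types.

```scala
import scala.util.Try

// Hypothetical state machine; the real states and events live in aegis-core.
enum KeyState:
  case PreActive, Active, Deactivated, Destroyed

enum KeyEvent:
  case Activate, Deactivate, Destroy

// Total transition function: every (state, event) pair yields a new state or a rejection.
def transition(state: KeyState, event: KeyEvent): Either[String, KeyState] =
  (state, event) match
    case (KeyState.Destroyed, _)                 => Left("key is destroyed")
    case (_, KeyEvent.Destroy)                   => Right(KeyState.Destroyed)
    case (KeyState.PreActive, KeyEvent.Activate) => Right(KeyState.Active)
    case (KeyState.Active, KeyEvent.Deactivate)  => Right(KeyState.Deactivated)
    case (s, e)                                  => Left(s"$e is not valid in state $s")

// Applies an event, keeping the current state when the event is rejected.
def step(state: KeyState, event: KeyEvent): KeyState =
  transition(state, event).getOrElse(state)

// Every combination of state and event (4 x 3 = 12 pairs).
val allPairs: List[(KeyState, KeyEvent)] =
  for
    s <- KeyState.values.toList
    e <- KeyEvent.values.toList
  yield (s, e)

// Completeness: no pair may escape the match (guards against a MatchError
// if a case is removed or a new enum arm is added without a transition).
def completenessHolds: Boolean =
  allPairs.forall((s, e) => Try(transition(s, e)).isSuccess)

// Idempotence: applying the same event twice ends in the same state as applying it once.
def idempotenceHolds: Boolean =
  allPairs.forall((s, e) => step(step(s, e), e) == step(s, e))
```

With ScalaCheck, `allPairs` would be replaced by generators and each `...Holds` method by a `forAll` property; the invariants themselves stay the same.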

6. Code style & quality gates

Hard rules enforced by the build:

Scala 3. No Scala 2 idioms. Use enum, given/using, extension, contextual abstractions.
-Xfatal-warnings + -Wunused:all — every warning fails the build.
scalafmt on every file. CI runs scalafmtCheckAll and scalafmtSbtCheck.
scalafix for lint and import organization. CI runs scalafixAll --check.

Soft rules (PR review will surface these):

Prefer immutable data; mutable state lives only in actor Behavior closures.
Option / Either over null / exceptions for control flow. Exceptions are reserved for truly exceptional events (driver crashes, JVM-level failures).
ADTs over boolean flags. If you have two booleans on a class, you probably want a 4-arm enum.
Actor behaviors are always typed (Behavior[T] over a sealed command ADT). Untyped actors are not used anywhere in the codebase.
DCO sign-off: every commit must be signed off (git commit -s).
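
The "ADTs over boolean flags" rule can be illustrated with a short sketch. The types below (`UsageFlags`, `KeyUsage`) are hypothetical and not taken from the Aegis codebase; they only show the shape of the refactoring.

```scala
// Before: two flags, four combinations, and every call site has to
// remember what each pair of booleans means.
final case class UsageFlags(canEncrypt: Boolean, canDecrypt: Boolean)

// After: one enum with four named arms.
enum KeyUsage:
  case NoCrypto, EncryptOnly, DecryptOnly, EncryptAndDecrypt

// Migration helper from the old representation to the new one.
def usageFromFlags(flags: UsageFlags): KeyUsage = flags match
  case UsageFlags(false, false) => KeyUsage.NoCrypto
  case UsageFlags(true, false)  => KeyUsage.EncryptOnly
  case UsageFlags(false, true)  => KeyUsage.DecryptOnly
  case UsageFlags(true, true)   => KeyUsage.EncryptAndDecrypt

// Exhaustive match: if a fifth arm is added later, every call site that
// forgets it produces a compiler warning, which the build treats as an error.
def allowsDecrypt(usage: KeyUsage): Boolean = usage match
  case KeyUsage.DecryptOnly | KeyUsage.EncryptAndDecrypt => true
  case KeyUsage.NoCrypto | KeyUsage.EncryptOnly          => false
```

The payoff is at the call site: `allowsDecrypt(KeyUsage.EncryptOnly)` reads as a sentence, where `check(true, false)` does not.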

Imports

scalafmt's AsciiSortImports rule and scalafix's OrganizeImports rule are deliberately configured to agree on import order. Don't change the import order config in either tool without verifying both still produce the same output — they used to disagree, and the disagreement caused a long-running CI flake.

7. Architecture: what you need to know to navigate

(For the full architecture writeup see Concepts → Architecture. This section is the contributor's-eye summary.)

The decorator stack

Every request through aegis-server flows through six decorators in this order:

HTTP → Audit → Tracing → Metrics → Authorize → Actor → Persistence + RoT

Each decorator wraps the same KeyService[F[_]] algebra and adds one orthogonal concern:

| Decorator | Adds | Module |
|-----------|------|--------|
| `AuditingKeyService` | Append-only audit log on every state change | `aegis-audit` |
| `TracingKeyService` | OpenTelemetry spans per operation | `aegis-server` |
| `MeteredKeyService` | Prometheus counters + latency histograms | `aegis-server` |
| `AuthorizingKeyService` | IAM allowlist check, returns `PermissionDenied` on miss | `aegis-iam` |
| `ActorBackedKeyService` | Single-writer state via Pekko Typed actor | `aegis-server` |
| `PostgresEventJournal` (or in-memory) | Persistent event log + RoT calls | `aegis-persistence` |

The order matters:

Audit is outermost so it sees the final outcome including the IAM denial. Moving audit inside AuthorizingKeyService would cause denied requests to silently disappear from audit.
Authorize is inside metrics so denies are countable as aegis_keys_op_errors_total{code="PermissionDenied"} rather than disappearing.
The actor is innermost so all state-mutating calls serialize through one mailbox per process. Concurrent reads bypass the actor and hit the journal directly.

When adding a new concern, decide which layer it belongs to and slot in a new decorator. Don't add the concern to an existing decorator; that's how this codebase becomes Vault.
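
The ordering argument for audit and authorization can be checked with a small, dependency-free sketch. Only the names `KeyService`, `AuditingKeyService`, `AuthorizingKeyService`, and `PermissionDenied` come from this guide; the single `create` operation, the `Id` effect, the string-based audit log, and the two builder functions are simplifying assumptions, not the real signatures.

```scala
import scala.collection.mutable

enum KmsError:
  case PermissionDenied

// Identity effect so the sketch needs no effect library (an assumption of this example).
type Id[A] = A

trait KeyService[F[_]]:
  def create(principal: String, keyId: String): F[Either[KmsError, String]]

// Innermost layer: stands in for the actor and the journal.
final class InMemoryKeyService extends KeyService[Id]:
  def create(principal: String, keyId: String): Either[KmsError, String] = Right(keyId)

// Short-circuits with PermissionDenied when the principal is not on the allowlist.
final class AuthorizingKeyService(allowed: Set[String], inner: KeyService[Id]) extends KeyService[Id]:
  def create(principal: String, keyId: String): Either[KmsError, String] =
    if allowed.contains(principal) then inner.create(principal, keyId)
    else Left(KmsError.PermissionDenied)

// Records the final outcome of whatever it wraps.
final class AuditingKeyService(log: mutable.Buffer[String], inner: KeyService[Id]) extends KeyService[Id]:
  def create(principal: String, keyId: String): Either[KmsError, String] =
    val result  = inner.create(principal, keyId)
    val outcome = result.fold(_.toString, _ => "Success")
    log += s"create principal=$principal outcome=$outcome"
    result

// Order described in this guide: audit wraps authorize.
def auditOutermost(log: mutable.Buffer[String]): KeyService[Id] =
  AuditingKeyService(log, AuthorizingKeyService(Set("alice"), InMemoryKeyService()))

// The wrong order: authorize wraps audit, so denials never reach the log.
def auditInsideAuthorize(log: mutable.Buffer[String]): KeyService[Id] =
  AuthorizingKeyService(Set("alice"), AuditingKeyService(log, InMemoryKeyService()))

@main def decoratorOrderDemo(): Unit =
  val goodLog = mutable.ArrayBuffer.empty[String]
  val badLog  = mutable.ArrayBuffer.empty[String]
  auditOutermost(goodLog).create("mallory", "k1")
  auditInsideAuthorize(badLog).create("mallory", "k1")
  println(s"audit outermost:        ${goodLog.size} audit row(s)") // 1
  println(s"audit inside authorize: ${badLog.size} audit row(s)")  // 0
```

Both stacks return the same `Left(PermissionDenied)` to the caller; the only difference is whether the denial leaves a trace, which is why the order is treated as load-bearing.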

Pekko Typed actor pattern (load-bearing)

aegis-server uses one core actor: KeyOpsActor. The pattern is unusual:

```scala
val system = ActorSystem(KeyOpsActor.behavior(deps), "aegis")
// system IS the actor: no user-guardian dance
val keyService = ActorBackedKeyService(system, ...)
```

This works because in Pekko Typed, ActorSystem[T] <: ActorRef[T]: messages sent to the system are delivered to the user guardian, whose behavior is the one passed to the ActorSystem constructor. Earlier versions used the more conventional Behaviors.setup { ctx => ... } + Promise + Await.result pattern to expose an ActorRef to the main thread; that pattern hangs on cold start on JDK 21 + sbt's classloader because the user guardian's setup block is never dispatched.

If you see boot-time hangs, check whether someone has reintroduced the user-guardian + Promise pattern. The fix is in PR #45 and there's a regression test (ServerWiringSpec) that asserts the actor system itself is the actor.

Resource[IO] boot scope

Server.scala uses IOApp.Simple + a single composed Resource[IO, Unit] chain to acquire and release every long-lived dependency:

```scala
def run: IO[Unit] =
  (for
    metricsRegistry <- MetricsRegistry.make
    journal         <- PostgresEventJournal.make(config.jdbc, metricsRegistry)
    actorSystem     <- ActorSystemResource.make(...)
    httpBinding     <- HttpServer.bind(...)
  yield ()).useForever
```

SIGTERM unwinds the stack in reverse — HTTP unbind (5 s grace) → actor system terminate → journal pool close → meter registry close. Don't replace this with unsafeRunSync calls — the previous pattern leaked the journal connection pool until JVM exit.

Vendor isolation via SPIs

aegis-core MUST NOT contain vendor-specific code. Three SPIs in the library tier handle adapter pluggability:

| SPI | Module | Today | v0.2.0 |
|-----|--------|-------|--------|
| `RootOfTrust` | `aegis-crypto` | AWS KMS | + GCP KMS, Azure Key Vault, HashiCorp Vault, PKCS#11 |
| `EventJournal` | `aegis-persistence` | Postgres + in-memory | + MySQL, SQLite |
| `AuditSink` | `aegis-audit` | stdout | + Kafka, S3, Webhook, OTel logs |

If you're adding a vendor adapter, the right place is one of these modules — never aegis-core.

8. Adding a new feature: walked example

Imagine adding a new crypto operation mac(id, message). The places you'd touch, in order:

aegis-core/src/main/scala/dev/aegiskms/core/KeyService.scala — add def mac(...) to the algebra, implement in the in-memory reference impl.
aegis-core/src/main/scala/dev/aegiskms/core/Operation.scala — add Operation.Mac to the IAM allowlist enum.
aegis-iam/src/main/scala/dev/aegiskms/iam/AuthorizingKeyService.scala — route mac through the policy check.
aegis-audit/src/main/scala/dev/aegiskms/audit/AuditingKeyService.scala — record a Success / Failed audit row carrying the relevant metadata (operation, principal, message length, not the message content).
aegis-crypto/src/main/scala/dev/aegiskms/crypto/RootOfTrust.scala — add mac to the SPI.
aegis-crypto/src/main/scala/dev/aegiskms/crypto/aws/AwsKmsRootOfTrust.scala — implement via the AWS KMS GenerateMac API.
aegis-server/src/main/scala/dev/aegiskms/app/ActorBackedKeyService.scala — route through the actor mailbox if state-mutating, or bypass it if read-only.
aegis-http/src/main/scala/dev/aegiskms/http/Endpoints.scala — add the Tapir endpoint POST /v1/keys/{id}/mac.
aegis-http/src/main/scala/dev/aegiskms/http/HttpRoutes.scala — wire the endpoint to the route.
aegis-cli/src/main/scala/dev/aegiskms/cli/{Cli,Commands}.scala — add the aegis keys mac verb.
Tests across all of the above. Aim for at least: an algebra test, a decorator test, a REST integration test, a CLI parser test.
CHANGELOG.md — under ## Unreleased, an entry describing what shipped.

The existing sign / verify PR (#5) is the canonical reference for this pattern. Read its diff before opening a similar PR.
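
For orientation, the first steps of the walk-through (algebra method, in-memory reference implementation, audit metadata) might look roughly like the sketch below. It is deliberately simplified: the real `KeyService` is effect-polymorphic and has many more operations, and the choice of HMAC-SHA256, the string key IDs, and the `MacAuditRow` shape are assumptions made for this example only.

```scala
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Simplified, hypothetical error type; the real one has more cases.
enum KmsError:
  case NotFound

// Step 1: the new operation on the algebra (no effect type here, for brevity).
trait KeyService:
  def mac(id: String, message: Array[Byte]): Either[KmsError, Array[Byte]]

// Step 1, continued: in-memory reference implementation.
// HMAC-SHA256 over key material held in a Map is an assumption of this sketch.
final class InMemoryKeyService(keys: Map[String, Array[Byte]]) extends KeyService:
  def mac(id: String, message: Array[Byte]): Either[KmsError, Array[Byte]] =
    keys.get(id).toRight(KmsError.NotFound).map { material =>
      val hmac = Mac.getInstance("HmacSHA256")
      hmac.init(new SecretKeySpec(material, "HmacSHA256"))
      hmac.doFinal(message)
    }

// Step 4: the audit row carries the message length, never the message content.
final case class MacAuditRow(operation: String, principal: String, messageLength: Int, outcome: String)

def auditRowFor(principal: String, message: Array[Byte], result: Either[KmsError, Array[Byte]]): MacAuditRow =
  MacAuditRow("Mac", principal, message.length, if result.isRight then "Success" else "Failed")
```

The remaining steps (IAM routing, the `RootOfTrust` SPI, the actor, HTTP, and CLI wiring) follow the same pattern of threading one new method through each layer.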

9. Adding a new RootOfTrust adapter

The minimum viable shape is a class that implements the SPI:

```scala
class GcpKmsRootOfTrust(client: KeyManagementServiceClient) extends RootOfTrust[IO]:
  def generateDataKey(spec: KeySpec): IO[Either[KmsError, WrappedKey]] = ???
  def unwrap(wrapped: WrappedKey): IO[Either[KmsError, RawKey]] = ???
  def sign(id: KeyId, message: Array[Byte], alg: SigAlgorithm): IO[Either[KmsError, Signature]] = ???
  // ... and the rest of the SPI
```

Then add a config-driven boot path in Server.scala so AEGIS_ROT_KIND=gcp selects it. New adapters should ship with:

A unit test using a fake / mock client
A live integration test gated on env vars (skipped on CI without credentials)
An entry in docs/operations/security.md describing the IAM permissions required
A new line in the comparison table in docs/about/comparison.md

10. Adding a new audit sink

Implement AuditSink:

```scala
trait AuditSink:
  def emit(record: AuditRecord): IO[Unit]
```

Wire into Server.scala selectable by AEGIS_AUDIT_KIND=stdout|kafka|s3|.... The AuditingKeyService decorator does not care which sink is on the other end — keep that invariant.
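
One possible shape for the selection logic is sketched below. Only the variable name `AEGIS_AUDIT_KIND` and the kinds `stdout`, `kafka`, and `s3` come from this guide; the `AuditKind` enum, the parser, and the choice to default to `stdout` when the variable is unset are assumptions of this example, not the actual `Server.scala` code.

```scala
// Hypothetical enum of supported sink kinds.
enum AuditKind:
  case Stdout, Kafka, S3

// Parses the raw setting; unknown values are reported instead of silently ignored.
def parseAuditKind(raw: String): Either[String, AuditKind] =
  raw.trim.toLowerCase match
    case "stdout" => Right(AuditKind.Stdout)
    case "kafka"  => Right(AuditKind.Kafka)
    case "s3"     => Right(AuditKind.S3)
    case other    => Left(s"Unknown AEGIS_AUDIT_KIND: '$other'")

// Reads the setting from an environment map (injectable so it can be tested).
// Falling back to stdout when the variable is absent is an assumption of this sketch.
def auditKindFromEnv(env: Map[String, String] = sys.env): Either[String, AuditKind] =
  env.get("AEGIS_AUDIT_KIND").map(parseAuditKind).getOrElse(Right(AuditKind.Stdout))
```

Resolving the kind once at boot and handing the decorator an already-constructed `AuditSink` is what keeps `AuditingKeyService` unaware of which sink is in use.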

11. Common debugging patterns

| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| `sbt "server / run"` hangs at boot | Pekko user-guardian + Promise pattern reintroduced | Make the ActorSystem be the actor (`ActorSystem[T] <: ActorRef[T]`) |
| `[error] No source compatibility issues found, but binary compatibility issues exist` | A change broke MiMa | Either restore binary compat, or add a justified entry to `mimaBinaryIssueFilters` |
| Test passes locally, fails on CI | Likely a Testcontainers test that runs on CI's Docker but skips locally without Docker | Run with Docker locally to repro |
| `[error] Imports are not in scalafmt order` | scalafmt and scalafix disagree on import sort | Run `sbt scalafmtAll` then `sbt scalafixAll`; if they fight each other, see §6 |
| Postgres test fails with `connection refused` on CI | Testcontainers couldn't pull the postgres image | Look at CI logs for image-pull rate-limiting; pin to a specific Postgres tag |
| `[error] -Wunused:all` flags an import as unused | Probably a `given` import that scalafix doesn't see used | Use `import x.y.given` style; if it's still flagged, add `// scalafix:ok ImportsRule` |
| Audit row missing `context.source.ip` | Request reached the server without a resolvable client IP (e.g. no `X-Forwarded-For` behind a proxy that strips it) | The HTTP layer populates `source.ip` as of v0.2.0 (#78); check your proxy forwards the client address |

12. Pull request flow

Open or claim an issue. State your design before writing 500 lines.
Branch: feat/<thing> for features, fix/<thing> for bugs, docs/<thing> for docs.
Make commits small and atomic — one logical change per commit, DCO-signed.
CHANGELOG entry under ## Unreleased lands in the same PR as the change.
Run the full local gate before pushing: `sbt test scalafmtCheckAll scalafmtSbtCheck "scalafixAll --check" mimaReportBinaryIssues`
Push, open the PR, link the issue.
CI runs the same gate against JDK 17 and JDK 21. Both must be green.
At least one maintainer review; design-impacting PRs need two.
Squash or rebase merge — the project keeps a linear history.

13. Releasing

Releases are tag-driven. The maintainer-facing runbook is in RELEASING.md. The short version:

```bash
# Promote ## Unreleased to ## X.Y.Z — date in CHANGELOG.md, then stage it
git add CHANGELOG.md
git commit -s -m "chore(release): cut vX.Y.Z"
git push origin main
git tag -a vX.Y.Z -m "vX.Y.Z — short description"
git push origin vX.Y.Z
```

The tag push triggers .github/workflows/release.yml, which runs the full CI gate, then publishes:

Library jars to Maven Central (if PGP / Sonatype secrets are configured)
aegis-server Docker image to GHCR
CLI universal tarball to a GitHub Release with auto-generated notes

If Maven secrets aren't configured, the workflow logs ::notice::Skipping Maven Central publish and ships Docker + CLI only — that's the correct semantics for a non-publishing maintainer.

14. Where to ask for help

GitHub Discussions — design questions, "is this a good idea?", "how do I…?"
GitHub Issues — bugs, feature requests, anything actionable
Email — see SECURITY.md for security disclosures only
PRs — best place to ask "would you accept this change?" by opening it as a draft

15. Reference docs in this repo

docs/ARCHITECTURE.md — the canonical architecture writeup, including the per-capability status table and how Aegis compares to AWS KMS / Vault / OpenBao.
docs/USAGE.md — per-backend semantics, key lifecycle in practice.
ROADMAP.md — per-release delivery plan and capability tracks.
CHANGELOG.md — under ## Unreleased, update before opening a PR.
CONTRIBUTING.md — the short-form ground rules.
SECURITY.md — responsible disclosure + deploy-time configuration matrix.

This guide is a living document. If anything here is wrong, out of date, or missing — open a PR. The fastest way to make Aegis better for the next contributor is to fix what tripped you up.