Go Unit Test

Create and refine Go tests for this repository with table-driven cases and explicit bug-hunting rules.

Hard Rules

Name test files as <target_file>_test.go, co-located with source.
Assertion strategy (adapt to project):
If project uses testify: require for fatal preconditions, assert for value checks.
If project uses standard library only: use t.Fatalf for fatal preconditions, t.Errorf for value checks. Include got/want in messages: t.Errorf("Name = %q, want %q", got, want).
If project uses go-cmp: use cmp.Diff for deep struct comparison. Prefer it over field-by-field assertions for complex output.
Detection: Check existing _test.go files for "github.com/stretchr/testify" and "github.com/google/go-cmp/cmp" imports. Follow project convention.
Keep tests deterministic; isolate time, randomness, environment, and network.
Prefer t.Setenv for env changes; avoid leaking global state between tests.
Prefer stable fakes/stubs over heavy mock chains.
Unit tests SHOULD NOT require real external services (DB/Redis/HTTP) unless explicitly requested; tests that need them belong in the integration suite.
Do NOT test constructors (NewXxx) or private helpers unless explicitly requested OR they contain non-trivial logic (validation/defaulting/option-merging) that can break runtime invariants.
For service-layer code with interfaces, focus on methods declared in the interface. For pure functions/handlers, focus on exported functions/endpoints.
Run with the race detector: go test -race ./...
Killer Case hard constraint (Standard + Strict): each test target (interface method / exported function / handler endpoint) must include at least 1 "killer case" (fault-injection or boundary-kill case) that is expected to fail on a known bad mutation/path.
In the report, for each killer case, explicitly state: "if this assertion is removed, the known bug can escape detection."
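
As a minimal sketch of the standard-library assertion style described above: t.Fatalf guards the precondition, t.Errorf reports the value mismatch with got/want. ParsePort is a hypothetical function under test, not something from this repository.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// ParsePort is a hypothetical function under test: it parses a TCP port
// and rejects values outside 1..65535.
func ParsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range", n)
	}
	return n, nil
}

func TestParsePort(t *testing.T) {
	tests := []struct {
		name    string
		in      string
		want    int
		wantErr bool
	}{
		{name: "typical port", in: "8080", want: 8080},
		{name: "lowest valid port", in: "1", want: 1},
		{name: "highest valid port", in: "65535", want: 65535},
		{name: "zero is rejected", in: "0", wantErr: true},
		{name: "one above max is rejected", in: "65536", wantErr: true},
		{name: "non-numeric is rejected", in: "http", wantErr: true},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			got, err := ParsePort(tc.in)
			if tc.wantErr {
				if err == nil {
					t.Fatalf("ParsePort(%q) error = nil, want non-nil", tc.in)
				}
				return
			}
			if err != nil {
				// Fatal precondition: the value check below is meaningless on error.
				t.Fatalf("ParsePort(%q) unexpected error: %v", tc.in, err)
			}
			if got != tc.want {
				t.Errorf("ParsePort(%q) = %d, want %d", tc.in, got, tc.want)
			}
		})
	}
}
```

The cases sit on representative boundaries (0, 1, 65535, 65536) rather than enumerating many arbitrary inputs.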

Killer Case: Definition (Standard + Strict Modes)

A killer case is a test case designed to catch a specific, named defect. It has four mandatory components:

Defect hypothesis: a concrete statement of what could go wrong (e.g., "loop uses i < len-1 instead of i < len, dropping the last element")
Fault injection or boundary setup: test input that triggers the defect if present
Critical assertion: the specific assert/require call that would fail if the defect exists
Removal risk statement: "if this assertion is removed, the known bug can escape detection"

A killer case is NOT just another edge case — it is explicitly tied to a defect hypothesis. If you cannot name the defect it catches, it is not a killer case.

See references/killer-case-patterns.md for 6 concrete Go templates.
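
The four components can be sketched in a single test. This is an illustration only, built around the example hypothesis above; UpperAll is a hypothetical target and is not taken from references/killer-case-patterns.md.

```go
package main

import (
	"strings"
	"testing"
)

// UpperAll is a hypothetical target: it upper-cases every element.
func UpperAll(in []string) []string {
	out := make([]string, 0, len(in))
	for i := 0; i < len(in); i++ { // known bad mutation: i < len(in)-1
		out = append(out, strings.ToUpper(in[i]))
	}
	return out
}

func TestUpperAll_KillerLastElement(t *testing.T) {
	// Defect hypothesis: the loop bound is len-1, so the last element is dropped.
	// Boundary setup: three distinct elements, so the last one is identifiable.
	in := []string{"a", "b", "c"}

	got := UpperAll(in)

	// Critical assertions. Removal risk: if these assertions are removed,
	// the known bug can escape detection.
	if len(got) != len(in) {
		t.Fatalf("len(UpperAll(%v)) = %d, want %d", in, len(got), len(in))
	}
	if last := got[len(got)-1]; last != "C" {
		t.Errorf("last element = %q, want %q", last, "C")
	}
}
```

Applying the mutation noted in the loop comment makes the length check fail, which is what ties this case to a named defect rather than to a generic edge case.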

Anti-examples (DO NOT write these tests)

Testing Go standard library behavior (e.g., json.Marshal serializes struct correctly)
Testing trivial getters/setters with no logic
Testing constructor NewXxx that only assigns fields (unless it has validation/defaulting)
Writing one case per possible string input instead of using representative boundaries
Asserting only err == nil without verifying the returned value
Tests that depend on execution order of other tests
Tests that assert log output format (fragile, couples to logging implementation)
Mocking everything: if you mock 5+ dependencies, the test tests the mocks, not the code
Over-reliance on snapshot/golden files for volatile output (timestamps, UUIDs, map iteration order) — golden files are fine for stable serialization formats, but not for output that changes across runs
Testing implementation details instead of behavior: asserting internal method call order, private field values, or specific goroutine scheduling rather than observable outputs and side effects

Coverage Gate Policy (Default + Scope)

Coverage gate: >= 80% by default for logic-heavy packages (pure/domain/transform code).
For integration-heavy/IO-heavy packages (infra, clients, wiring, DB adapters):
Coverage may be lower (typical 60–80%) only with explicit rationale.
Even when coverage is lower, Boundary Checklist discipline remains mandatory (full checklist in Standard + Strict; Light Boundary Check in Light mode). Failure Hypothesis and Killer Case discipline apply in Standard + Strict modes only.
Never inflate coverage by adding low-signal tests with weak assertions.

Multi-Package Coverage

When testing spans multiple packages:

Use -coverpkg=./... to measure cross-package coverage accurately.
Packages with no _test.go files report 0%; exclude them from gate calculations with explicit rationale.
Generate separate coverprofile per package when fine-grained analysis is needed:

go test -coverprofile=pkg_a.out -covermode=atomic ./pkg/a
go test -coverprofile=pkg_b.out -covermode=atomic ./pkg/b

Go Version Gate

Before generating tests, check go.mod for the project's Go version. Adapt test patterns accordingly:

Feature	Minimum Go Version	Adaptation
`t.Setenv`	1.17	Below 1.17: use `os.Setenv` + `t.Cleanup`
Range var capture fix	1.22	Below 1.22: copy loop variable in `t.Run` + `t.Parallel()` closures
`t.Parallel()` + `t.Setenv`	Not safe in any version	`t.Setenv` panics in a parallel test or a test with parallel ancestors; do NOT combine `t.Parallel()` with `t.Setenv` in the same subtest

If go.mod cannot be read, state the assumption and proceed with Go 1.21 defaults.
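
A minimal sketch of the two older-version adaptations from the table, assuming hypothetical targets Double and Region and a hypothetical APP_REGION variable. The loop-variable copy is redundant but harmless on Go 1.22 and later.

```go
package main

import (
	"os"
	"testing"
)

// Double and Region are hypothetical functions under test.
func Double(n int) int { return n * 2 }

func Region() string {
	if v := os.Getenv("APP_REGION"); v != "" {
		return v
	}
	return "us"
}

func TestDouble(t *testing.T) {
	tests := []struct {
		name     string
		in, want int
	}{
		{"zero", 0, 0},
		{"positive", 3, 6},
		{"negative", -4, -8},
	}
	for _, tc := range tests {
		tc := tc // below Go 1.22: give each parallel closure its own copy
		t.Run(tc.name, func(t *testing.T) {
			t.Parallel()
			if got := Double(tc.in); got != tc.want {
				t.Errorf("Double(%d) = %d, want %d", tc.in, got, tc.want)
			}
		})
	}
}

func TestRegion_FromEnv(t *testing.T) {
	// Below Go 1.17: hand-rolled equivalent of t.Setenv. No t.Parallel here,
	// because the environment is process-wide state.
	old, had := os.LookupEnv("APP_REGION")
	if err := os.Setenv("APP_REGION", "eu"); err != nil {
		t.Fatalf("Setenv: %v", err)
	}
	t.Cleanup(func() {
		if had {
			os.Setenv("APP_REGION", old)
		} else {
			os.Unsetenv("APP_REGION")
		}
	})
	if got := Region(); got != "eu" {
		t.Errorf("Region() = %q, want %q", got, "eu")
	}
}
```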

Test Execution Hardening

Shuffle: Run with go test -shuffle=on to catch tests that depend on execution order. If any test fails only under shuffle, it has hidden state coupling — fix the test, not the ordering.
Fuzzing collaboration: When a function already has a fuzz test (func FuzzXxx), unit tests should cover structured boundary cases that fuzzing is unlikely to find (e.g., specific business rule violations, multi-field interaction). Do not duplicate what fuzzing covers (random byte input, crash discovery). If fuzzing is appropriate but missing, note it in the report as a recommendation.

PR-Diff Scoped Testing

When testing in a CI / PR review context:

Determine changed packages from git diff --name-only origin/main...HEAD | grep '\.go$' | xargs -I{} dirname {} | sort -u.
Run tests only for changed packages and their direct dependents: go test -race ./changed/pkg/... ./dependent/pkg/....
Coverage gate applies only to changed packages (not the entire repo).
If a changed package has no _test.go file, flag it in the report as a gap.

Generated Code Exclusion

Do NOT generate tests for files matching these patterns:

*.pb.go (protobuf generated)
*_gen.go, wire_gen.go (code generators)
mock_*.go, *_mock.go (generated mocks)
Files containing the directive // Code generated .* DO NOT EDIT

If the user explicitly requests testing generated code, proceed but note that generated files are typically validated by their generator's own test suite.
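
One way to apply these exclusion patterns in code is sketched below. IsGenerated is a hypothetical helper, not part of this skill's scripts; it checks the base file name against the globs and scans the source text for the marker line.

```go
package main

import (
	"path"
	"regexp"
	"strings"
)

// generatedMarker matches the directive quoted above. The conventional Go
// marker line also ends with a period, which this looser pattern accepts.
var generatedMarker = regexp.MustCompile(`^// Code generated .* DO NOT EDIT`)

// namePatterns are the file-name globs from the exclusion list.
// wire_gen.go is already covered by *_gen.go.
var namePatterns = []string{"*.pb.go", "*_gen.go", "mock_*.go", "*_mock.go"}

// IsGenerated reports whether a file should be left out of test generation,
// based on its base name or on a marker line anywhere in its source text.
func IsGenerated(filename, src string) bool {
	base := path.Base(filename)
	for _, p := range namePatterns {
		if ok, _ := path.Match(p, base); ok {
			return true
		}
	}
	for _, line := range strings.Split(src, "\n") {
		if generatedMarker.MatchString(strings.TrimRight(line, "\r")) {
			return true
		}
	}
	return false
}
```

For example, IsGenerated("api/user.pb.go", "") is true by name alone, while a hand-written service.go is excluded only if its text carries the marker.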

Repository Config (Optional)

When present, load repository config from .unit-test.yaml (or .unit-test.json) before test generation.

Config keys:

coverage.logic_min: default minimum coverage for logic-heavy packages (default 80)
coverage.infra_min: minimum coverage for infra-heavy packages when policy is stricter than default (optional)
coverage.package_rules: per-package overrides with explicit rationale
assertion_style: auto|testify|stdlib|go-cmp (prefer auto)
race.required: whether -race execution is mandatory in this repo (true|false)
commands.test: custom test command template
commands.coverage: custom coverage command template
mode: auto|light|standard|strict — set the minimum mode floor (default auto). Auto-selection still runs; if it detects a higher mode than the configured floor, the higher mode wins. For example, mode: light allows Light for simple targets but still auto-promotes to Standard/Strict when triggers fire. mode: strict forces Strict for all targets.

If config is missing, use this skill's defaults and state: Repository unit-test config not found; using skill defaults.

Execution Modes (Light / Standard / Strict)

Select mode before writing tests. Declare the selected mode and rationale at the start of output.

Mode Selection

Criterion	Light	Standard	Strict
Target count	≤ 3 simple targets	1-8 targets	> 8 targets
Concurrency	None (no `go func`, channels, `sync.*`)	Any	Shared mutable state, error fan-in
Dependencies	≤ 1 failing dependency	Multiple	Complex error chains
Branching	≤ 3 branches per function	Any	Complex state machines
Security	Not security-sensitive	Any	Auth, crypto, input sanitization
Context usage	Pass-through only	Any	Cancellation/deadline logic
Collection transforms	No slice/map transforms (scalar I/O only)	Any	—
Invariant patterns	None (trivial arithmetic commutativity like `Add(a,b)` does not count)	Non-trivial PBT trigger detected (roundtrip, idempotency, preservation, domain-level commutativity, parse validity)	— (use Standard)

Light: ALL Light criteria must be met. For simple pure functions with scalar I/O, utilities, type conversions. NOT for collection/slice/map transforms (these need the full boundary checklist to catch off-by-one and dropped-element bugs).
Standard: Default. Use when any Light criterion is violated but no Strict trigger fires.
Strict: ANY Strict criterion triggers this mode. For concurrent, security-sensitive, or high-risk code.

When in doubt, choose Standard.
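
The selection rules, together with the configured mode floor from Repository Config, can be sketched as a small function. The Signals type is a hypothetical roll-up of the table columns; how each criterion is detected is left out.

```go
package main

// Mode is ordered so that a higher value means a stricter mode.
type Mode int

const (
	Light Mode = iota
	Standard
	Strict
)

// Signals is a hypothetical summary of the Mode Selection table.
type Signals struct {
	AllLightCriteriaMet bool // every Light column condition holds
	AnyStrictTrigger    bool // at least one Strict column condition holds
}

// SelectMode applies the rules above: any Strict trigger wins, Light needs
// all Light criteria, and everything else is Standard. floor is the
// configured minimum mode (pass Light when the config says auto).
func SelectMode(s Signals, floor Mode) Mode {
	mode := Standard
	switch {
	case s.AnyStrictTrigger:
		mode = Strict
	case s.AllLightCriteriaMet:
		mode = Light
	}
	if mode < floor {
		mode = floor // the floor can raise the mode, never lower it
	}
	return mode
}
```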

Mode Requirements

Feature	Light	Standard	Strict
Table-driven tests	Required	Required	Required
Mutation-resistant assertions	Required	Required	Required
Race detection (`-race`)	Required	Required	Required
Coverage gate (80%)	Required	Required	Required
Reporting Integrity	Required	Required	Required
Case budget per target	3-6	5-12	8-15+
Failure Hypothesis List	Skip	Required	Required
Killer Case per target	Skip	Required (1)	Required (1+)
Removal Risk Statement	Skip	Required	Required
Boundary Checklist	Light (5 items)	Full (12 items)	Full (12 items)
Scorecard	Light (7 checks)	Full (13 checks)	Full (13 checks)
Property-based test guidance	N/A	Recommend if applicable	Required when pattern matches
JSON Summary	Skip	Required	Required

Light Boundary Check (5 items)

Mark each Covered or N/A:

nil/zero-value input (if parameter type allows)
Empty collection / zero-length input
Single element / boundary size (n=1)
Error from dependency (if any)
Invalid/malformed input (if format constraints exist)

Light Scorecard (7 checks)

#	Tier	Check
L1	Hygiene	File naming and location correct
L2	Hygiene	Table-driven style used
L3	Critical	Assertions are mutation-resistant (business fields, not existence-only)
L4	Hygiene	Happy path covered
L5	Standard	Critical dependency error paths covered (or N/A)
L6	Standard	`-race` execution result reported (or N/A with rationale)
L7	Critical	Coverage meets gate (logic >= 80%)

N/A handling: Items marked N/A with explicit rationale count as PASS for tier and total calculations (same rule as the full scorecard).

PASS when: both Critical (L3, L7) PASS (or N/A with explicit rationale), Standard >= 1/2, Hygiene >= 2/3, total >= 6/7.

In the output, state: "Light mode: standard scorecard not applicable".

Target Type Adaptation

Adapt test organization based on the target code type:

Target Type	Top-level Test Naming	t.Run Organization	Killer Case Granularity
Service interface	TestXxxService	By interface method	1 per interface method
Package-level functions	TestFuncName	By function	1 per exported function
HTTP handler	TestHandlerName	By HTTP method + path	1 per endpoint
CLI command/runner	TestRunnerXxx	By command/subcommand	1 per command
Middleware	TestMiddlewareName	By pass-through / block / error	1 per middleware

HTTP handler tests: use httptest.NewRequest + httptest.NewRecorder. Verify status code, response body, headers. Inject dependencies via Deps struct or handler constructor.

Pure function tests: direct table-driven, no mock needed. Focus on input boundaries and output correctness.
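
A minimal sketch of the HTTP handler pattern, assuming a hypothetical NewUserHandler that receives its dependencies through a Deps struct. Each case is named by HTTP method and path, and checks status code and body.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// Deps is a hypothetical dependency bundle injected into the handler.
type Deps struct {
	Lookup func(id string) (string, bool)
}

// NewUserHandler is a hypothetical handler for GET /user?id=...
func NewUserHandler(d Deps) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		name, ok := d.Lookup(r.URL.Query().Get("id"))
		if !ok {
			http.Error(w, "not found", http.StatusNotFound)
			return
		}
		_, _ = w.Write([]byte(name))
	})
}

// serve runs one request through the handler and returns the recorder.
func serve(h http.Handler, method, target string) *httptest.ResponseRecorder {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, target, nil))
	return rec
}

func TestUserHandler(t *testing.T) {
	// A stub lookup stands in for the real dependency.
	h := NewUserHandler(Deps{Lookup: func(id string) (string, bool) {
		return "ada", id == "1"
	}})
	tests := []struct {
		name, method, target string
		wantCode             int
		wantBody             string
	}{
		{"GET /user known id", http.MethodGet, "/user?id=1", http.StatusOK, "ada"},
		{"GET /user unknown id", http.MethodGet, "/user?id=2", http.StatusNotFound, "not found\n"},
		{"POST /user is rejected", http.MethodPost, "/user?id=1", http.StatusMethodNotAllowed, ""},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			rec := serve(h, tc.method, tc.target)
			if rec.Code != tc.wantCode {
				t.Fatalf("status = %d, want %d", rec.Code, tc.wantCode)
			}
			if got := rec.Body.String(); got != tc.wantBody {
				t.Errorf("body = %q, want %q", got, tc.wantBody)
			}
		})
	}
}
```

Header checks follow the same shape through rec.Header().Get when the handler sets headers that callers rely on.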

Defect-First Workflow (Standard + Strict Modes)

Before writing cases, produce a short Failure Hypothesis List from the target code:

Loop/index risks: i < n, i <= n, i+1, n-1, slice/map access.
Collection transform risks: input->output cardinality mismatch, dropped first/last item, wrong key mapping.
Branching risks: terminal state branch, empty/singleton branch, error short-circuit branch.
Concurrency risks: goroutine error fan-in, shared variable writes, panic recovery path.
Context/time risks: context.Canceled, DeadlineExceeded, missing ctx propagation, timeout not enforced.

Then map each hypothesis to at least one concrete test case name.

Then define at least one killer case per test target and map it to a specific defect hypothesis.

If this mapping is missing, do not proceed to large test generation.
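
To illustrate mapping one hypothesis to a concrete case name, here is a sketch for a context risk. ProcessAll and the label H1 are hypothetical; the test name carries the hypothesis it covers.

```go
package main

import (
	"context"
	"errors"
	"testing"
)

// ProcessAll is a hypothetical target that must stop once ctx is done.
func ProcessAll(ctx context.Context, items []int, fn func(int)) error {
	for _, it := range items {
		if err := ctx.Err(); err != nil {
			return err
		}
		fn(it)
	}
	return nil
}

// cancelledCtx returns a context that is already cancelled.
func cancelledCtx() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func TestProcessAll_H1_CancelledContextStopsWork(t *testing.T) {
	// H1 (context risk): ctx is accepted but never checked, so work
	// continues after cancellation.
	processed := 0
	err := ProcessAll(cancelledCtx(), []int{1, 2, 3}, func(int) { processed++ })

	if !errors.Is(err, context.Canceled) {
		t.Fatalf("err = %v, want context.Canceled", err)
	}
	if processed != 0 {
		t.Errorf("processed = %d items after cancel, want 0", processed)
	}
}
```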

High-Signal Test Budget (Anti-Bloat)

Avoid generating huge suites with weak assertions.

Mode	Cases per target	Notes
Light	3-6	Happy path + key error/edge paths
Standard	5-12	Full budget below + killer case
Strict	8-15+	Extended budget + property-based tests when applicable

Standard/Strict default budget per target:

1 happy path
1 terminal/last-element boundary path
1 empty or single-element path
1 dependency error propagation path per critical dependency
1 invariant/path-completeness path
1 killer case (mandatory, Standard + Strict)

Only exceed the budget when new cases cover distinct logic paths.

Bug-Finding Techniques → `references/bug-finding-techniques.md`

#	Technique	Key Rule
1	Mutation-Resistant Assertions	Assert concrete business fields, not just `!= nil`
2	Collection Mapping Completeness	Assert len + identity + first/middle/last for transforms
3	Off-by-One Precision	Test n=0,1,2,3 for every index boundary
4	Dependency Error Propagation	Inject failure per dependency, verify no partial payload
5	Concurrency & Panic Recovery	Channel barriers, -race, panic recovery path → also see `references/concurrency-testing.md`
6	Branch Completeness	Both branches: marker behavior + payload completeness
7	Killer Case Design	Fault-injection tied to defect hypothesis → also see `references/killer-case-patterns.md`

For detailed patterns and Go code examples, load the reference file.
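
As a quick illustration of techniques 1 and 2 (not a substitute for the reference file), the sketch below asserts length first and then concrete business fields at the first, middle, and last positions. User, UserDTO, and ToDTOs are hypothetical.

```go
package main

import "testing"

type User struct {
	ID   int
	Name string
}

type UserDTO struct {
	ID          int
	DisplayName string
}

// ToDTOs is a hypothetical transform under test.
func ToDTOs(users []User) []UserDTO {
	out := make([]UserDTO, 0, len(users))
	for _, u := range users {
		out = append(out, UserDTO{ID: u.ID, DisplayName: u.Name})
	}
	return out
}

func TestToDTOs_MappingCompleteness(t *testing.T) {
	in := []User{{1, "ada"}, {2, "bob"}, {3, "cy"}}
	want := []UserDTO{{1, "ada"}, {2, "bob"}, {3, "cy"}}

	got := ToDTOs(in)

	// Cardinality first: a dropped or duplicated item fails here.
	if len(got) != len(want) {
		t.Fatalf("len = %d, want %d", len(got), len(want))
	}
	// Identity and business fields at first, middle, and last positions.
	for _, i := range []int{0, len(want) / 2, len(want) - 1} {
		if got[i] != want[i] {
			t.Errorf("got[%d] = %+v, want %+v", i, got[i], want[i])
		}
	}
}
```

A check such as got != nil would pass even if ToDTOs swapped fields or dropped an element; comparing whole values at known positions does not.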

Property-Based Testing (Standard: optional | Strict: required when applicable)

Property-based testing finds bugs that hand-picked boundary cases miss by verifying invariants over randomized input.

Pattern	Invariant	Example
Roundtrip	`decode(encode(x)) == x`	marshal/unmarshal, serialize/deserialize
Idempotency	`f(f(x)) == f(x)`	normalization, canonicalization
Preservation	`len(output) == len(input)`	transforms that must not drop/duplicate items
Commutativity	`f(a,b) == f(b,a)`	set operations, merge functions
Parse validity	valid input → no panic	parsers, validators

Quick Example (`testing/quick`)

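import (
    "testing"
    "testing/quick"
)

// TestRoundtrip checks decode(encode(x)) == x on randomly generated strings.
// Encode and Decode stand in for the functions under test.
func TestRoundtrip(t *testing.T) {
    f := func(input string) bool {
        decoded, err := Decode(Encode(input))
        return err == nil && decoded == input
    }
    // A nil config uses the testing/quick defaults.
    if err := quick.Check(f, nil); err != nil {
        t.Error(err)
    }
}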

For complex domain types, use hand-rolled generators with deterministic seeds. See references/property-based-testing.md.

Relationship to table-driven tests: Property-based tests verify invariants over wide input space; table-driven tests verify exact expected values at specific boundaries. Use both when target has both invariants AND boundary risks.
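
A hand-rolled generator with a deterministic seed might look like the sketch below, which checks the idempotency invariant from the table. Normalize and randomInput are hypothetical, and this is not the content of references/property-based-testing.md.

```go
package main

import (
	"math/rand"
	"strings"
	"testing"
)

// Normalize is a hypothetical target: it lower-cases the input and
// collapses every run of whitespace into a single space.
func Normalize(s string) string {
	return strings.ToLower(strings.Join(strings.Fields(s), " "))
}

// randomInput builds a short string from a small alphabet that mixes case
// and whitespace, the inputs most likely to break normalization.
func randomInput(r *rand.Rand) string {
	const alphabet = "aB cD\t\n"
	b := make([]byte, r.Intn(12))
	for i := range b {
		b[i] = alphabet[r.Intn(len(alphabet))]
	}
	return string(b)
}

func TestNormalize_Idempotent(t *testing.T) {
	r := rand.New(rand.NewSource(1)) // fixed seed keeps failures reproducible
	for i := 0; i < 500; i++ {
		in := randomInput(r)
		once := Normalize(in)
		if twice := Normalize(once); twice != once {
			t.Fatalf("Normalize not idempotent for %q: once=%q twice=%q", in, once, twice)
		}
	}
}
```

Restricting the alphabet keeps generated inputs focused on the risky characters instead of spreading them over all of Unicode.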

Mode Applicability

Light: Not applicable. If an invariant pattern is detected, auto-promote to Standard (see Mode Selection table).
Standard: Note in report if property-based testing would add value; do not require.
Strict: Required when the target matches any pattern in the table above; include at least one property test or justify why none apply.

Fixed Boundary Checklist (Standard + Strict, Per Test Target)

For Light mode, use the Light Boundary Check (5 items) in Execution Modes.

Mark each item as Covered or N/A (reason):

nil input (only if parameter is pointer/interface/map/slice/channel/function)
empty value/collection
single element (len == 1)
size/index boundary (n=2, n=3, last element)
min/max value boundary (x-1, x, x+1) if numeric
invalid format/type
zero-value struct/default trap
error from each critical dependency
context cancellation/deadline propagation (if method accepts/uses context.Context)
concurrent/race behavior (if stateful or goroutine-based)
mapping completeness (no dropped first/middle/last item)
killer case present and mapped to a concrete defect hypothesis

Test Structure Standard

Top-level test naming follows the Target Type Adaptation table.
t.Run groups map to test targets (interface methods, exported functions, or endpoints).
Use table-driven cases inside each group.
Keep case names defect-oriented and readable in go test -v.
Prefer t.Parallel() for independent subtests.
Do NOT use t.Parallel() when subtests share mutable globals, temp dirs without isolation, or process-wide resources.
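
The structure above can be sketched for a service target as follows: one top-level test, one t.Run group per method, table-driven cases inside, and a hand-written fake that injects a dependency error. OrderService, Store, and fakeStore are hypothetical.

```go
package main

import (
	"errors"
	"testing"
)

// Store is the dependency; OrderService is the hypothetical target.
type Store interface {
	Price(sku string) (int, error)
}

type OrderService struct{ store Store }

// Total sums prices; on any store error it returns zero and the error.
func (s OrderService) Total(skus []string) (int, error) {
	sum := 0
	for _, sku := range skus {
		p, err := s.store.Price(sku)
		if err != nil {
			return 0, err
		}
		sum += p
	}
	return sum, nil
}

var errStoreDown = errors.New("store down")

// fakeStore is a stable fake; failOn names the SKU whose lookup fails.
type fakeStore struct {
	prices map[string]int
	failOn string
}

func (f fakeStore) Price(sku string) (int, error) {
	if sku == f.failOn {
		return 0, errStoreDown
	}
	return f.prices[sku], nil
}

func TestOrderService(t *testing.T) {
	t.Run("Total", func(t *testing.T) {
		prices := map[string]int{"a": 5, "b": 7}
		tests := []struct {
			name    string
			failOn  string
			skus    []string
			want    int
			wantErr error
		}{
			{name: "sums all items", skus: []string{"a", "b"}, want: 12},
			{name: "empty order is zero", skus: nil, want: 0},
			{name: "store error on last item returns no partial total",
				failOn: "b", skus: []string{"a", "b"}, wantErr: errStoreDown},
		}
		for _, tc := range tests {
			t.Run(tc.name, func(t *testing.T) {
				svc := OrderService{store: fakeStore{prices: prices, failOn: tc.failOn}}
				got, err := svc.Total(tc.skus)
				if !errors.Is(err, tc.wantErr) {
					t.Fatalf("err = %v, want %v", err, tc.wantErr)
				}
				if got != tc.want {
					t.Errorf("Total = %d, want %d", got, tc.want)
				}
			})
		}
	})
}
```

The subtests here are independent and could call t.Parallel(); it is omitted only to keep the sketch free of version-specific loop-variable handling.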

Incremental Mode (Fix / Add Tests)

When the task is fixing failing tests or adding tests to existing code, use these simplified flows instead of the full workflow.

Fix failing test:

Read failing test and target code
Identify root cause: test bug vs implementation bug
Fix the actual bug side (do NOT weaken assertions just to make tests pass)
Run go test -run TestXxx -v -race to verify the fix
Skip full scorecard and use incremental scorecard only (see Auto Scorecard applicability for mode-aware rules).

Add tests for existing code:

Read target code, identify untested paths
(Standard + Strict only) Build targeted Failure Hypothesis List (only for uncovered paths)
Design cases for gaps only (do not rewrite existing tests)
Run coverage diff: compare before/after
Simplified Scorecard (mode-aware):
Standard/Strict targets: only verify items 5, 7, 8, 11 for new cases.
Light targets: only verify items L3, L5, L7 for new cases.

Coverage recovery:

Run go test ./path/to/pkg -coverprofile=before.out
Identify uncovered lines with go tool cover -func=before.out
Write targeted cases for uncovered branches
Verify coverage gate met

Workflow

Assess target code complexity and select execution mode (Light/Standard/Strict). Declare mode and rationale.
Check go.mod for Go version; note version-dependent test pattern adaptations (see Go Version Gate).
Exclude generated code files from test scope (see Generated Code Exclusion).
Read target code and identify test targets (interface methods, exported functions, handler endpoints).
(Standard + Strict only) Build Failure Hypothesis List (loops, mapping, branch, concurrency, context/time).
(Standard + Strict only) For each target, define 1 mandatory killer case and bind it to one hypothesis.
Design minimal high-signal cases (Light: 3-6, Standard: 5-12, Strict: 8-15+ per target).
Implement tests with strong field-level assertions.
Run focused tests:
go test ./path/to/pkg -run TestXxx -v -race
Run package tests:
go test ./path/to/pkg -race
Measure coverage (prefer atomic for concurrency safety):
go test ./path/to/pkg -coverprofile=coverage.out -covermode=atomic -race
go tool cover -func=coverage.out
If coverage < required gate OR key hypotheses untested, add targeted cases only.
(Standard + Strict only) Verify killer case integrity in report (required assertion present + removal risk statement).

Reporting Integrity (Mandatory)

Do NOT claim -race or coverage results unless you actually ran the commands and observed output.
If you cannot run commands in the current environment, say so, and output the exact commands for the user to run plus what to look for.

Auto Scorecard (13 Checks)

Score each item PASS / FAIL / N/A (reason). Output Total: X/13 and final result.

Each item has a weight tier that determines its impact on the final verdict:

Tier	Items	Rule
Critical (must PASS)	5, 11, 13	Any Critical FAIL → overall FAIL regardless of total
Standard	7, 8, 9, 10, 12	Must achieve >= 4/5 Standard PASS
Hygiene	1, 2, 3, 4, 6	Must achieve >= 4/5 Hygiene PASS

Applicability:

Light mode: Use Light Scorecard (7 checks); state Light mode: standard scorecard not applicable.
Standard/Strict mode: Full 13-check scorecard mandatory.
Incremental mode (Standard/Strict targets): Simplified scorecard (items 5, 7, 8, 11); state Incremental mode: full scorecard skipped.
Incremental mode (Light targets): Use Light Scorecard (items L3, L5, L7 only); PASS when all 3 items are PASS or N/A with rationale; state Incremental + Light mode: minimal scorecard.

Checks:

1. [Hygiene] File naming and location are correct.
2. [Hygiene] Top-level test naming follows the Target Type Adaptation table.
3. [Hygiene] t.Run groups map 1-to-1 to test targets.
4. [Hygiene] Table-driven style is used for test cases.
5. [Critical] Assertions are mutation-resistant (business fields, not existence-only).
6. [Hygiene] Happy path is covered.
7. [Standard] Critical dependency error paths are covered.
8. [Standard] Boundary checklist items are explicitly marked Covered/N/A.
9. [Standard] Collection mapping completeness is asserted (length + identities + first/middle/last).
10. [Standard] Terminal/last-element branch behavior is asserted.
11. [Critical] Killer case exists for every target and is linked to a defect hypothesis.
12. [Standard] -race execution result is reported (or marked N/A with rationale if not runnable here).
13. [Critical] Coverage meets gate for the package category (logic >= 80%; infra per rationale) OR marked N/A with explicit justification.

Final PASS only when:

All 3 Critical items (5, 11, 13) are PASS (or N/A with explicit rationale and hypothesis coverage is complete), and
Standard tier: >= 4/5 PASS, and
Hygiene tier: >= 4/5 PASS, and
total >= 11/13.

Otherwise: FAIL, with missing items and next targeted test additions.
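
The verdict rule reduces to simple arithmetic over per-tier PASS counts. The sketch below assumes a hypothetical Tally type in which the caller has already counted N/A items with rationale as PASS.

```go
package main

// Tally holds PASS counts per tier for the 13-check scorecard.
type Tally struct {
	CriticalPass int // out of 3 (items 5, 11, 13)
	StandardPass int // out of 5 (items 7, 8, 9, 10, 12)
	HygienePass  int // out of 5 (items 1, 2, 3, 4, 6)
}

// Total is the X in "Total: X/13".
func (t Tally) Total() int {
	return t.CriticalPass + t.StandardPass + t.HygienePass
}

// Pass applies the final rule: all Critical items, at least 4/5 in each
// of the other tiers, and at least 11/13 overall.
func (t Tally) Pass() bool {
	return t.CriticalPass == 3 &&
		t.StandardPass >= 4 &&
		t.HygienePass >= 4 &&
		t.Total() >= 11
}
```

For example, Tally{CriticalPass: 2, StandardPass: 5, HygienePass: 5} totals 12/13 but still fails, because a Critical item did not pass.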

Output Expectations

Include:

Execution mode (Light/Standard/Strict) with selection rationale
Targets tested + case counts
Go version (from go.mod) and version-dependent adaptations applied
Generated files excluded from scope (list, or "none")
Failure Hypothesis List and which case covers each
Killer case list per target:
case name
linked defect hypothesis
critical assertion(s)
mandatory statement: "if this assertion is removed, the known bug can escape detection."
Boundary checklist per target (Covered/N/A + reason)
Coverage and race results (or N/A + exact commands)
Scorecard and final PASS/FAIL
Remaining untested risks (if any)

Light mode output reduction: Skip Failure Hypothesis List, Killer Case list, and JSON Summary. Report only: mode + rationale, targets + case counts, Light Boundary Check, coverage/race results, Light Scorecard, remaining risks.

For list/transform logic, include explicit statement:

whether first/middle/last items were validated
whether output cardinality and identity completeness were validated

Machine-Readable Summary (JSON): Standard + Strict Only

Also output a compact JSON block for CI/pipeline ingestion (skip for Light mode):

{
  "summary": {
    "pass": true,
    "score": "12/13",
    "go_version": "1.22"
  },
  "targets": [
    {
      "name": "TestOrderService",
      "type": "Service interface",
      "cases": 8,
      "killer_cases": 2,
      "hypothesis_covered": ["H1", "H3"]
    }
  ],
  "coverage": {
    "package": "internal/domain/order",
    "line_pct": 87.5,
    "gate": 80,
    "met": true
  },
  "race": {
    "executed": true,
    "clean": true
  },
  "scorecard": {
    "critical_pass": 3,
    "critical_total": 3,
    "standard_pass": 5,
    "standard_total": 5,
    "hygiene_pass": 4,
    "hygiene_total": 5
  }
}
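
On the consuming side, a pipeline step could decode this block into Go types such as the ones sketched here. The type names, ParseSummary, and the GateOK policy are assumptions for illustration; only the JSON field names come from the example above, and the scorecard object is left undecoded for brevity.

```go
package main

import "encoding/json"

// Summary mirrors the JSON block above. Fields not listed here (such as
// "scorecard") are ignored by encoding/json during decoding.
type Summary struct {
	Summary struct {
		Pass      bool   `json:"pass"`
		Score     string `json:"score"`
		GoVersion string `json:"go_version"`
	} `json:"summary"`
	Targets []struct {
		Name              string   `json:"name"`
		Type              string   `json:"type"`
		Cases             int      `json:"cases"`
		KillerCases       int      `json:"killer_cases"`
		HypothesisCovered []string `json:"hypothesis_covered"`
	} `json:"targets"`
	Coverage struct {
		Package string  `json:"package"`
		LinePct float64 `json:"line_pct"`
		Gate    float64 `json:"gate"`
		Met     bool    `json:"met"`
	} `json:"coverage"`
	Race struct {
		Executed bool `json:"executed"`
		Clean    bool `json:"clean"`
	} `json:"race"`
}

// ParseSummary decodes one summary block.
func ParseSummary(data []byte) (Summary, error) {
	var s Summary
	err := json.Unmarshal(data, &s)
	return s, err
}

// GateOK is a hypothetical pipeline policy: proceed only when the scorecard
// passed, the coverage gate was met, and the race run executed cleanly.
func (s Summary) GateOK() bool {
	return s.Summary.Pass && s.Coverage.Met && s.Race.Executed && s.Race.Clean
}
```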

Skill Maintenance

Run regression checks for this skill with:

bash "<path-to-skill>/scripts/run_regression.sh"