Go Unit Test¶
Create and refine Go tests for this repository with table-driven cases and explicit bug-hunting rules.
Hard Rules¶
- Name test files as `<target_file>_test.go`, co-located with source.
- Assertion strategy (adapt to project):
  - If project uses testify: `require` for fatal preconditions, `assert` for value checks.
  - If project uses standard library only: use `t.Fatalf` for fatal preconditions, `t.Errorf` for value checks. Include got/want in messages: `t.Errorf("Name = %q, want %q", got, want)`.
  - If project uses go-cmp: use `cmp.Diff` for deep struct comparison. Prefer it over field-by-field assertions for complex output.
  - Detection: Check existing `_test.go` files for `"github.com/stretchr/testify"` imports. Follow project convention.
- Keep tests deterministic; isolate time, randomness, environment, and network.
- Prefer `t.Setenv` for env changes; avoid leaking global state between tests.
- Prefer stable fakes/stubs over heavy mock chains.
- Unit tests SHOULD NOT require real external services (DB/Redis/HTTP) unless explicitly requested; that belongs to integration tests.
- Do NOT test constructors (`NewXxx`) or private helpers unless explicitly requested OR they contain non-trivial logic (validation/defaulting/option-merging) that can break runtime invariants.
- For service-layer code with interfaces, focus on methods declared in the interface. For pure functions/handlers, focus on exported functions/endpoints.
- Run with race detector: `go test -race ./...`.
- Killer Case hard constraint (Standard + Strict): each test target (interface method / exported function / handler endpoint) must include at least 1 "killer case" (fault-injection or boundary-kill case) that is expected to fail on a known bad mutation/path.
- In the report, for each killer case, explicitly state: "if this assertion is removed, the known bug can escape detection."
Killer Case — Definition (Standard + Strict Modes)¶
A killer case is a test case designed to catch a specific, named defect. It has four mandatory components:
- Defect hypothesis: a concrete statement of what could go wrong (e.g., "loop uses `i < len-1` instead of `i < len`, dropping the last element")
- Fault injection or boundary setup: test input that triggers the defect if present
- Critical assertion: the specific `assert`/`require` call that would fail if the defect exists
- Removal risk statement: "if this assertion is removed, the known bug can escape detection"
A killer case is NOT just another edge case — it is explicitly tied to a defect hypothesis. If you cannot name the defect it catches, it is not a killer case.
See references/killer-case-patterns.md for 6 concrete Go templates.
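As an illustration of the four components, here is a small sketch built around the off-by-one hypothesis quoted above. `SumAll` and the test name are hypothetical; this is not one of the templates in the reference file.

```go
package main

import "testing"

// SumAll is a hypothetical function under test.
func SumAll(xs []int) int {
	total := 0
	for i := 0; i < len(xs); i++ {
		total += xs[i]
	}
	return total
}

func TestSumAll_KillerLastElementDropped(t *testing.T) {
	// 1. Defect hypothesis: the loop uses i < len(xs)-1 and drops the last element.
	// 2. Boundary setup: only the last element is non-zero, so dropping it
	//    changes the result from 7 to 0.
	in := []int{0, 0, 7}

	got := SumAll(in)

	// 3. Critical assertion.
	// 4. Removal risk: if this assertion is removed, the known bug can
	//    escape detection.
	if want := 7; got != want {
		t.Errorf("SumAll(%v) = %d, want %d", in, got, want)
	}
}
```

The input is chosen so that the mutated loop and the correct loop produce different results; an input such as `{7, 0, 0}` would pass under both and would therefore not qualify as a killer case for this hypothesis.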
Anti-examples (DO NOT write these tests)¶
- Testing Go standard library behavior (e.g., json.Marshal serializes struct correctly)
- Testing trivial getters/setters with no logic
- Testing constructor NewXxx that only assigns fields (unless it has validation/defaulting)
- Writing one case per possible string input instead of using representative boundaries
- Asserting only `err == nil` without verifying the returned value
- Tests that depend on execution order of other tests
- Tests that assert log output format (fragile, couples to logging implementation)
- Mocking everything: if you mock 5+ dependencies, the test tests the mocks, not the code
- Over-reliance on snapshot/golden files for volatile output (timestamps, UUIDs, map iteration order) — golden files are fine for stable serialization formats, but not for output that changes across runs
- Testing implementation details instead of behavior: asserting internal method call order, private field values, or specific goroutine scheduling rather than observable outputs and side effects
Coverage Gate Policy (Default + Scope)¶
- Coverage gate: >= 80% by default for logic-heavy packages (pure/domain/transform code).
- For integration-heavy/IO-heavy packages (infra, clients, wiring, DB adapters):
- Coverage may be lower (typical 60–80%) only with explicit rationale.
- Even when coverage is lower, Boundary Checklist discipline remains mandatory (full checklist in Standard + Strict; Light Boundary Check in Light mode). Failure Hypothesis and Killer Case discipline apply in Standard + Strict modes only.
- Never inflate coverage by adding low-signal tests with weak assertions.
Multi-Package Coverage¶
When testing spans multiple packages:
- Use `-coverpkg=./...` to measure cross-package coverage accurately.
- Packages with no `_test.go` files report 0%; exclude them from gate calculations with explicit rationale.
- Generate separate coverprofile per package when fine-grained analysis is needed:

```bash
go test -coverprofile=pkg_a.out -covermode=atomic ./pkg/a
go test -coverprofile=pkg_b.out -covermode=atomic ./pkg/b
```
Go Version Gate¶
Before generating tests, check go.mod for the project's Go version. Adapt test patterns accordingly:
| Feature | Minimum Go Version | Adaptation |
|---|---|---|
| `t.Setenv` | 1.17 | Below 1.17: use `os.Setenv` + `t.Cleanup` |
| Range var capture fix | 1.22 | Below 1.22: copy loop variable in t.Run + t.Parallel() closures |
| `t.Parallel()` + `t.Setenv` | Not supported in any version | `t.Setenv` panics in a parallel test or a test with parallel ancestors; do NOT combine `t.Parallel()` with `t.Setenv` in the same subtest |
If go.mod cannot be read, state the assumption and proceed with Go 1.21 defaults.
Test Execution Hardening¶
- Shuffle: Run with `go test -shuffle=on` to catch tests that depend on execution order. If any test fails only under shuffle, it has hidden state coupling; fix the test, not the ordering.
- Fuzzing collaboration: When a function already has a fuzz test (`func FuzzXxx`), unit tests should cover structured boundary cases that fuzzing is unlikely to find (e.g., specific business rule violations, multi-field interaction). Do not duplicate what fuzzing covers (random byte input, crash discovery). If fuzzing is appropriate but missing, note it in the report as a recommendation.
PR-Diff Scoped Testing¶
When testing in a CI / PR review context:
- Determine changed packages from `git diff --name-only origin/main...HEAD | grep '\.go$' | xargs -I{} dirname {} | sort -u`.
- Run tests only for changed packages and their direct dependents: `go test -race ./changed/pkg/... ./dependent/pkg/...`.
- Coverage gate applies only to changed packages (not the entire repo).
- If a changed package has no `_test.go` file, flag it in the report as a gap.
Generated Code Exclusion¶
Do NOT generate tests for files matching these patterns:
- `*.pb.go` (protobuf generated)
- `*_gen.go`, `wire_gen.go` (code generators)
- `mock_*.go`, `*_mock.go` (generated mocks)
- Files containing the directive `// Code generated .* DO NOT EDIT`
If the user explicitly requests testing generated code, proceed but note that generated files are typically validated by their generator's own test suite.
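The exclusion patterns above could be applied with a small filter like the following sketch. `isGenerated` and its inputs are hypothetical, and anchoring the directive to the start of a line is an assumption of this example rather than something the rule states.

```go
package main

import (
	"path/filepath"
	"regexp"
)

// generatedGlobs mirrors the file-name patterns listed above
// (wire_gen.go is already matched by *_gen.go).
var generatedGlobs = []string{"*.pb.go", "*_gen.go", "mock_*.go", "*_mock.go"}

// generatedHeader matches the directive quoted above. Anchoring it to the
// start of a line is an assumption of this sketch.
var generatedHeader = regexp.MustCompile(`(?m)^// Code generated .* DO NOT EDIT`)

// isGenerated reports whether a file should be excluded from test
// generation, based on its base name or on its content.
func isGenerated(path, content string) bool {
	base := filepath.Base(path)
	for _, g := range generatedGlobs {
		if ok, _ := filepath.Match(g, base); ok {
			return true
		}
	}
	return generatedHeader.MatchString(content)
}
```

A caller would read each candidate `.go` file, pass its path and content to `isGenerated`, and list the excluded files in the report.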
Repository Config (Optional)¶
When present, load repository config from .unit-test.yaml (or .unit-test.json) before test generation.
Config keys:
- `coverage.logic_min`: default minimum coverage for logic-heavy packages (default `80`)
- `coverage.infra_min`: minimum coverage for infra-heavy packages when policy is stricter than default (optional)
- `coverage.package_rules`: per-package overrides with explicit rationale
- `assertion_style`: `auto|testify|stdlib|go-cmp` (prefer `auto`)
- `race.required`: whether `-race` execution is mandatory in this repo (`true|false`)
- `commands.test`: custom test command template
- `commands.coverage`: custom coverage command template
- `mode`: `auto|light|standard|strict`. Sets the minimum mode floor (default `auto`). Auto-selection still runs; if it detects a higher mode than the configured floor, the higher mode wins. For example, `mode: light` allows Light for simple targets but still auto-promotes to Standard/Strict when triggers fire. `mode: strict` forces Strict for all targets.
If config is missing, use this skill's defaults and state: Repository unit-test config not found; using skill defaults.
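The mode-floor rule for the `mode` key is easy to misread, so here is a minimal sketch of it. The `Mode` type, its constants, and `effectiveMode` are hypothetical names for this illustration; they are not part of any config loader.

```go
package main

// Mode is a hypothetical ordered enum: a larger value means a stricter mode.
type Mode int

const (
	ModeLight Mode = iota
	ModeStandard
	ModeStrict
)

// effectiveMode applies the configured floor to the auto-detected mode.
// hasFloor is false when the config says "mode: auto" or is missing.
// The stricter of the two always wins.
func effectiveMode(floor Mode, hasFloor bool, detected Mode) Mode {
	if hasFloor && floor > detected {
		return floor
	}
	return detected
}
```

With `mode: light` and a target that triggers Strict, `effectiveMode(ModeLight, true, ModeStrict)` yields `ModeStrict`; with `mode: strict` and a simple target, `effectiveMode(ModeStrict, true, ModeLight)` also yields `ModeStrict`.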
Execution Modes (Light / Standard / Strict)¶
Select mode before writing tests. Declare the selected mode and rationale at the start of output.
Mode Selection¶
| Criterion | Light | Standard | Strict |
|---|---|---|---|
| Target count | ≤ 3 simple targets | 1-8 targets | > 8 targets |
| Concurrency | None (no go func, channels, sync.*) | Any | Shared mutable state, error fan-in |
| Dependencies | ≤ 1 failing dependency | Multiple | Complex error chains |
| Branching | ≤ 3 branches per function | Any | Complex state machines |
| Security | Not security-sensitive | Any | Auth, crypto, input sanitization |
| Context usage | Pass-through only | Any | Cancellation/deadline logic |
| Collection transforms | No slice/map transforms (scalar I/O only) | Any | — |
| Invariant patterns | None (trivial arithmetic commutativity like Add(a,b) does not count) | Non-trivial PBT trigger detected (roundtrip, idempotency, preservation, domain-level commutativity, parse validity) | — (use Standard) |
- Light: ALL Light criteria must be met. For simple pure functions with scalar I/O, utilities, type conversions. NOT for collection/slice/map transforms (these need the full boundary checklist to catch off-by-one and dropped-element bugs).
- Standard: Default. Use when any Light criterion is violated but no Strict trigger fires.
- Strict: ANY Strict criterion triggers this mode. For concurrent, security-sensitive, or high-risk code.
When in doubt, choose Standard.
Mode Requirements¶
| Feature | Light | Standard | Strict |
|---|---|---|---|
| Table-driven tests | Required | Required | Required |
| Mutation-resistant assertions | Required | Required | Required |
| Race detection (`-race`) | Required | Required | Required |
| Coverage gate (80%) | Required | Required | Required |
| Reporting Integrity | Required | Required | Required |
| Case budget per target | 3-6 | 5-12 | 8-15+ |
| Failure Hypothesis List | Skip | Required | Required |
| Killer Case per target | Skip | Required (1) | Required (1+) |
| Removal Risk Statement | Skip | Required | Required |
| Boundary Checklist | Light (5 items) | Full (12 items) | Full (12 items) |
| Scorecard | Light (7 checks) | Full (13 checks) | Full (13 checks) |
| Property-based test guidance | N/A | Recommend if applicable | Required when pattern matches |
| JSON Summary | Skip | Required | Required |
Light Boundary Check (5 items)¶
Mark each Covered or N/A:
- `nil`/zero-value input (if parameter type allows)
- Empty collection / zero-length input
- Single element / boundary size (n=1)
- Error from dependency (if any)
- Invalid/malformed input (if format constraints exist)
Light Scorecard (7 checks)¶
| # | Tier | Check |
|---|---|---|
| L1 | Hygiene | File naming and location correct |
| L2 | Hygiene | Table-driven style used |
| L3 | Critical | Assertions are mutation-resistant (business fields, not existence-only) |
| L4 | Hygiene | Happy path covered |
| L5 | Standard | Critical dependency error paths covered (or N/A) |
| L6 | Standard | -race execution result reported (or N/A with rationale) |
| L7 | Critical | Coverage meets gate (logic >= 80%) |
N/A handling: Items marked N/A with explicit rationale count as PASS for tier and total calculations (same rule as the full scorecard).
PASS when: both Critical (L3, L7) PASS (or N/A with explicit rationale), Standard >= 1/2, Hygiene >= 2/3, total >= 6/7.
State `Light mode: standard scorecard not applicable` in output.
Target Type Adaptation¶
Adapt test organization based on the target code type:
| Target Type | Top-level Test Naming | t.Run Organization | Killer Case Granularity |
|---|---|---|---|
| Service interface | TestXxxService | By interface method | 1 per interface method |
| Package-level functions | TestFuncName | By function | 1 per exported function |
| HTTP handler | TestHandlerName | By HTTP method + path | 1 per endpoint |
| CLI command/runner | TestRunnerXxx | By command/subcommand | 1 per command |
| Middleware | TestMiddlewareName | By pass-through / block / error | 1 per middleware |
HTTP handler tests: use httptest.NewRequest + httptest.NewRecorder. Verify status code, response body, headers. Inject dependencies via Deps struct or handler constructor.
Pure function tests: direct table-driven, no mock needed. Focus on input boundaries and output correctness.
Defect-First Workflow (Standard + Strict Modes)¶
Before writing cases, produce a short Failure Hypothesis List from the target code:
- Loop/index risks: `i < n`, `i <= n`, `i+1`, `n-1`, slice/map access.
- Collection transform risks: input->output cardinality mismatch, dropped first/last item, wrong key mapping.
- Branching risks: terminal state branch, empty/singleton branch, error short-circuit branch.
- Concurrency risks: goroutine error fan-in, shared variable writes, panic recovery path.
- Context/time risks: `context.Canceled`, `DeadlineExceeded`, missing ctx propagation, timeout not enforced.
Then map each hypothesis to at least one concrete test case name.
Then define at least one killer case per test target and map it to a specific defect hypothesis.
If this mapping is missing, do not proceed to large test generation.
High-Signal Test Budget (Anti-Bloat)¶
Avoid generating huge suites with weak assertions.
| Mode | Cases per target | Notes |
|---|---|---|
| Light | 3-6 | Happy path + key error/edge paths |
| Standard | 5-12 | Full budget below + killer case |
| Strict | 8-15+ | Extended budget + property-based tests when applicable |
Standard/Strict default budget per target:
- 1 happy path
- 1 terminal/last-element boundary path
- 1 empty or single-element path
- 1 dependency error propagation path per critical dependency
- 1 invariant/path-completeness path
- 1 killer case (mandatory, Standard + Strict)
Only exceed the budget when new cases cover distinct logic paths.
Bug-Finding Techniques → references/bug-finding-techniques.md¶
| # | Technique | Key Rule |
|---|---|---|
| 1 | Mutation-Resistant Assertions | Assert concrete business fields, not just != nil |
| 2 | Collection Mapping Completeness | Assert len + identity + first/middle/last for transforms |
| 3 | Off-by-One Precision | Test n=0,1,2,3 for every index boundary |
| 4 | Dependency Error Propagation | Inject failure per dependency, verify no partial payload |
| 5 | Concurrency & Panic Recovery | Channel barriers, -race, panic recovery path → also see references/concurrency-testing.md |
| 6 | Branch Completeness | Both branches: marker behavior + payload completeness |
| 7 | Killer Case Design | Fault-injection tied to defect hypothesis → also see references/killer-case-patterns.md |
For detailed patterns and Go code examples, load the reference file.
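As a small illustration of technique 2 (collection mapping completeness), the sketch below asserts length first, then identity at the first, middle, and last positions. `User`, `UserDTO`, and `ToDTOs` are hypothetical types and functions, not taken from the reference file.

```go
package main

import (
	"strings"
	"testing"
)

// User and UserDTO are hypothetical input/output types.
type User struct {
	ID   int
	Name string
}

type UserDTO struct {
	ID          int
	DisplayName string
}

// ToDTOs is the hypothetical transform under test.
func ToDTOs(users []User) []UserDTO {
	out := make([]UserDTO, 0, len(users))
	for _, u := range users {
		out = append(out, UserDTO{ID: u.ID, DisplayName: strings.ToUpper(u.Name)})
	}
	return out
}

func TestToDTOs_MappingCompleteness(t *testing.T) {
	in := []User{{ID: 1, Name: "ann"}, {ID: 2, Name: "bob"}, {ID: 3, Name: "cy"}}
	want := []UserDTO{{ID: 1, DisplayName: "ANN"}, {ID: 2, DisplayName: "BOB"}, {ID: 3, DisplayName: "CY"}}

	got := ToDTOs(in)

	// Cardinality first: indexing below is only safe if lengths match.
	if len(got) != len(want) {
		t.Fatalf("len(ToDTOs) = %d, want %d", len(got), len(want))
	}
	// Identity at every position: index 0, 1, 2 are first, middle, last.
	for i := range want {
		if got[i] != want[i] {
			t.Errorf("ToDTOs[%d] = %+v, want %+v", i, got[i], want[i])
		}
	}
}
```

Three elements is the smallest input that has a distinct first, middle, and last item, which is why it is used here instead of a one- or two-element slice.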
Property-Based Testing (Standard: optional | Strict: required when applicable)¶
Property-based testing finds bugs that hand-picked boundary cases miss by verifying invariants over randomized input.
When to Recommend¶
| Pattern | Invariant | Example |
|---|---|---|
| Roundtrip | decode(encode(x)) == x | marshal/unmarshal, serialize/deserialize |
| Idempotency | f(f(x)) == f(x) | normalization, canonicalization |
| Preservation | len(output) == len(input) | transforms that must not drop/duplicate items |
| Commutativity | f(a,b) == f(b,a) | set operations, merge functions |
| Parse validity | valid input → no panic | parsers, validators |
Quick Example (testing/quick)¶
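```go
import (
	"testing"
	"testing/quick"
)

// TestRoundtrip checks decode(encode(x)) == x over randomly generated strings.
// Encode and Decode stand for the package's own functions under test.
func TestRoundtrip(t *testing.T) {
	f := func(input string) bool {
		decoded, err := Decode(Encode(input))
		return err == nil && decoded == input
	}
	if err := quick.Check(f, nil); err != nil {
		t.Error(err)
	}
}
```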
For complex domain types, use hand-rolled generators with deterministic seeds. See references/property-based-testing.md.
Relationship to table-driven tests: Property-based tests verify invariants over wide input space; table-driven tests verify exact expected values at specific boundaries. Use both when target has both invariants AND boundary risks.
Mode Applicability¶
- Light: Not applicable. If an invariant pattern is detected, auto-promote to Standard (see Mode Selection table).
- Standard: Note in report if property-based testing would add value; do not require.
- Strict: Required recommendation when target matches any trigger pattern above; include at least one property test or justify why none apply.
Fixed Boundary Checklist (Standard + Strict — Per Test Target)¶
For Light mode, use the Light Boundary Check (5 items) in Execution Modes.
Mark each item as Covered or N/A (reason):
- `nil` input (only if parameter is pointer/interface/map/slice/channel/function)
- empty value/collection
- single element (`len == 1`)
- size/index boundary (`n=2`, `n=3`, last element)
- min/max value boundary (`x-1`, `x`, `x+1`) if numeric
- invalid format/type
- zero-value struct/default trap
- error from each critical dependency
- context cancellation/deadline propagation (if method accepts/uses `context.Context`)
- concurrent/race behavior (if stateful or goroutine-based)
- mapping completeness (no dropped first/middle/last item)
- killer case present and mapped to a concrete defect hypothesis
Test Structure Standard¶
- Top-level test naming follows the Target Type Adaptation table.
- `t.Run` groups map to test targets (interface methods, exported functions, or endpoints).
- Use table-driven cases inside each group.
- Keep case names defect-oriented and readable in `go test -v`.
- Prefer `t.Parallel()` for independent subtests.
- Do NOT use `t.Parallel()` when subtests share mutable globals, temp dirs without isolation, or process-wide resources.
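Put together, the structure rules above could look like the following sketch for a service-style target. `Counter` and its methods are hypothetical; each `t.Run` group corresponds to one method, and the parallel subtests each own their state.

```go
package main

import "testing"

// Counter is a hypothetical service with two methods under test.
type Counter struct{ n int }

// Add adds d to the running total and returns the new total.
func (c *Counter) Add(d int) int {
	c.n += d
	return c.n
}

// Reset sets the running total back to zero.
func (c *Counter) Reset() { c.n = 0 }

func TestCounterService(t *testing.T) {
	t.Run("Add", func(t *testing.T) {
		tests := []struct {
			name   string
			deltas []int
			want   int
		}{
			{name: "no deltas keeps zero", deltas: nil, want: 0},
			{name: "single delta", deltas: []int{5}, want: 5},
			{name: "last delta is not dropped", deltas: []int{1, 2, 4}, want: 7},
		}
		for _, tc := range tests {
			tc := tc // needed below Go 1.22 because the subtest runs in parallel
			t.Run(tc.name, func(t *testing.T) {
				t.Parallel() // safe: each subtest owns its Counter
				c := &Counter{}
				got := 0
				for _, d := range tc.deltas {
					got = c.Add(d)
				}
				if got != tc.want {
					t.Errorf("total after Add(%v) = %d, want %d", tc.deltas, got, tc.want)
				}
			})
		}
	})
	t.Run("Reset", func(t *testing.T) {
		c := &Counter{}
		c.Add(3)
		c.Reset()
		if got := c.Add(0); got != 0 {
			t.Errorf("total after Reset = %d, want 0", got)
		}
	})
}
```

If the subtests shared one `Counter`, `t.Parallel()` would have to go, since the counter would then be shared mutable state.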
Incremental Mode (Fix / Add Tests)¶
When the task is fixing failing tests or adding tests to existing code, use these simplified flows instead of the full workflow.
Fix failing test:¶
- Read failing test and target code
- Identify root cause: test bug vs implementation bug
- Fix the actual bug side (do NOT weaken assertions just to make tests pass)
- Run `go test -run TestXxx -v -race` to verify the fix
- Skip full scorecard and use incremental scorecard only (see Auto Scorecard applicability for mode-aware rules).
Add tests for existing code:¶
- Read target code, identify untested paths
- (Standard + Strict only) Build targeted Failure Hypothesis List (only for uncovered paths)
- Design cases for gaps only (do not rewrite existing tests)
- Run coverage diff: compare before/after
- Simplified Scorecard (mode-aware):
- Standard/Strict targets: only verify items 5, 7, 8, 11 for new cases.
- Light targets: only verify items L3, L5, L7 for new cases.
Coverage recovery:¶
- Run `go test -coverprofile=before.out`
- Identify uncovered lines with `go tool cover -func=before.out`
- Write targeted cases for uncovered branches
- Verify coverage gate met
Workflow¶
- Assess target code complexity and select execution mode (Light/Standard/Strict). Declare mode and rationale.
- Check `go.mod` for Go version; note version-dependent test pattern adaptations (see Go Version Gate).
- Exclude generated code files from test scope (see Generated Code Exclusion).
- Read target code and identify test targets (interface methods, exported functions, handler endpoints).
- (Standard + Strict only) Build Failure Hypothesis List (loops, mapping, branch, concurrency, context/time).
- (Standard + Strict only) For each target, define 1 mandatory killer case and bind it to one hypothesis.
- Design minimal high-signal cases (Light: 3-6, Standard: 5-12, Strict: 8-15+ per target).
- Implement tests with strong field-level assertions.
- Run focused tests: `go test ./path/to/pkg -run TestXxx -v -race`
- Run package tests: `go test ./path/to/pkg -race`
- Measure coverage (prefer atomic for concurrency safety):
  - `go test ./path/to/pkg -coverprofile=coverage.out -covermode=atomic -race`
  - `go tool cover -func=coverage.out`
- If coverage < required gate OR key hypotheses untested, add targeted cases only.
- (Standard + Strict only) Verify killer case integrity in report (required assertion present + removal risk statement).
Reporting Integrity (Mandatory)¶
- Do NOT claim `-race` or coverage results unless you actually ran the commands and observed output.
- If you cannot run commands in the current environment, say so, and output the exact commands for the user to run plus what to look for.
Auto Scorecard (13 Checks)¶
Score each item PASS / FAIL / N/A (reason). Output Total: X/13 and final result.
Each item has a weight tier that determines its impact on the final verdict:
| Tier | Items | Rule |
|---|---|---|
| Critical (must PASS) | 5, 11, 13 | Any Critical FAIL → overall FAIL regardless of total |
| Standard | 7, 8, 9, 10, 12 | Must achieve >= 4/5 Standard PASS |
| Hygiene | 1, 2, 3, 4, 6 | Must achieve >= 4/5 Hygiene PASS |
Applicability:
- Light mode: Use Light Scorecard (7 checks); state `Light mode: standard scorecard not applicable`.
- Standard/Strict mode: Full 13-check scorecard mandatory.
- Incremental mode (Standard/Strict targets): Simplified scorecard (items 5, 7, 8, 11); state `Incremental mode: full scorecard skipped`.
- Incremental mode (Light targets): Use Light Scorecard (items L3, L5, L7 only); PASS when all 3 items are PASS or N/A with rationale; state `Incremental + Light mode: minimal scorecard`.

1. [Hygiene] File naming and location are correct.
2. [Hygiene] Top-level test naming follows the Target Type Adaptation table.
3. [Hygiene] `t.Run` groups map 1-to-1 to test targets.
4. [Hygiene] Table-driven style is used for test cases.
5. [Critical] Assertions are mutation-resistant (business fields, not existence-only).
6. [Hygiene] Happy path is covered.
7. [Standard] Critical dependency error paths are covered.
8. [Standard] Boundary checklist items are explicitly marked Covered/N/A.
9. [Standard] Collection mapping completeness is asserted (length + identities + first/middle/last).
10. [Standard] Terminal/last-element branch behavior is asserted.
11. [Critical] Killer case exists for every target and is linked to a defect hypothesis.
12. [Standard] `-race` execution result is reported (or marked N/A with rationale if not runnable here).
13. [Critical] Coverage meets gate for the package category (logic >= 80%; infra per rationale) OR marked N/A with explicit justification.
Final PASS only when:
- All 3 Critical items (5, 11, 13) are PASS (or N/A with explicit rationale and hypothesis coverage is complete), and
- Standard tier: >= 4/5 PASS, and
- Hygiene tier: >= 4/5 PASS, and
- total >= 11/13.
Otherwise: FAIL, with missing items and next targeted test additions.
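The verdict rule can be read as a small function. The sketch below is one possible encoding; the `Result` type, the item slices, and `finalPass` are hypothetical names, and it treats N/A with rationale as a pass, following the N/A handling rule stated for the Light scorecard.

```go
package main

// Result is a hypothetical per-item outcome. The zero value is Fail, so an
// item missing from the results map counts as a failure.
type Result int

const (
	Fail Result = iota
	Pass
	NA // N/A with explicit rationale; counted as a pass
)

// Item numbers per tier, as listed in the tier table.
var (
	criticalItems = []int{5, 11, 13}
	standardItems = []int{7, 8, 9, 10, 12}
	hygieneItems  = []int{1, 2, 3, 4, 6}
)

// passes counts how many of the given items are Pass or NA.
func passes(results map[int]Result, items []int) int {
	n := 0
	for _, it := range items {
		if r := results[it]; r == Pass || r == NA {
			n++
		}
	}
	return n
}

// finalPass applies the four conditions: all Critical items pass,
// Standard >= 4/5, Hygiene >= 4/5, and total >= 11/13.
func finalPass(results map[int]Result) bool {
	c := passes(results, criticalItems)
	s := passes(results, standardItems)
	h := passes(results, hygieneItems)
	return c == len(criticalItems) && s >= 4 && h >= 4 && c+s+h >= 11
}
```

Under this encoding the three tier conditions already imply a total of at least 11 (3 + 4 + 4), so the explicit total check acts as a safeguard rather than an extra constraint.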
Output Expectations¶
Include:
- Execution mode (Light/Standard/Strict) with selection rationale
- Targets tested + case counts
- Go version (from `go.mod`) and version-dependent adaptations applied
- Generated files excluded from scope (list, or "none")
- Failure Hypothesis List and which case covers each
- Killer case list per target:
- case name
- linked defect hypothesis
- critical assertion(s)
- mandatory statement: "if this assertion is removed, the known bug can escape detection."
- Boundary checklist per target (Covered/N/A + reason)
- Coverage and race results (or N/A + exact commands)
- Scorecard and final PASS/FAIL
- Remaining untested risks (if any)
Light mode output reduction: Skip Failure Hypothesis List, Killer Case list, and JSON Summary. Report only: mode + rationale, targets + case counts, Light Boundary Check, coverage/race results, Light Scorecard, remaining risks.
For list/transform logic, include explicit statement:
- whether first/middle/last items were validated
- whether output cardinality and identity completeness were validated
Machine-Readable Summary (JSON) — Standard + Strict Only¶
Also output a compact JSON block for CI/pipeline ingestion (skip for Light mode):
{
"summary": {
"pass": true,
"score": "12/13",
"go_version": "1.22"
},
"targets": [
{
"name": "TestOrderService",
"type": "Service interface",
"cases": 8,
"killer_cases": 2,
"hypothesis_covered": ["H1", "H3"]
}
],
"coverage": {
"package": "internal/domain/order",
"line_pct": 87.5,
"gate": 80,
"met": true
},
"race": {
"executed": true,
"clean": true
},
"scorecard": {
"critical_pass": 3,
"critical_total": 3,
"standard_pass": 5,
"standard_total": 5,
"hygiene_pass": 4,
"hygiene_total": 5
}
}
Skill Maintenance¶
Run regression checks for this skill with: