From def0f119cf2ad671c543e77919afd11f5ea448a4 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Sat, 21 Feb 2026 23:38:51 +0200
Subject: Update content for html

---
 gemfeed/2026-02-22-taskwarrior-autonomous-agent-loop.html | 7 ++++++-
 gemfeed/atom.xml | 9 +++++++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/gemfeed/2026-02-22-taskwarrior-autonomous-agent-loop.html b/gemfeed/2026-02-22-taskwarrior-autonomous-agent-loop.html
index f4ad68b2..c09ed84b 100644
--- a/gemfeed/2026-02-22-taskwarrior-autonomous-agent-loop.html
+++ b/gemfeed/2026-02-22-taskwarrior-autonomous-agent-loop.html
@@ -40,6 +40,7 @@
  • ⇢ ⇢ 4-annotate-update-task.md — progress tracking
  • ⇢ ⇢ 5-review-overview-tasks.md — picking the next task
  • The reflection and review loop
  • +Code review: human spot-check at the end
  • Measurable results
  • A real bug found by the review loop
  • Gotchas and lessons learned
@@ -430,6 +431,10 @@ tasks are in progress, show the next actionable (READY) task:
    The sub-agent reviews consistently caught things the main agent missed — tests that only asserted on mocks, missing edge cases, and even a real bug. Without the dual review loop, the agent tends to write tests that look correct but do not actually exercise real behavior.

    +Code review: human spot-check at the end


    +
    +On top of the agent's self-reflection and the two sub-agent reviews per task, I reviewed the result myself at the end. I did not read all ~5,000 lines one by one. Instead, I looked for repeating patterns across the test files and picked a few scenarios to go through manually in detail: for example, one integration test from the open/close family, one from the rename/link family, and one negative test. That was enough to satisfy me that the workflow had produced consistent, runnable tests and that the whole pipeline (task → implement → self-review → sub-agent review → fix → second review → commit) was working as intended.
    +

    Measurable results



    Here is what one day of autonomous Ampcode work produced:
    @@ -439,7 +444,7 @@ tasks are in progress, show the next actionable (READY) task:
  • 48 Taskwarrior tasks completed
  • 47 git commits
  • 87 files changed
  • -12,012 lines added, 1,543 removed
  • +~5,000 lines added, ~500 removed
  • 18 integration test files
  • 15 workload scenario files (one per syscall category)
  • 93 test scenarios total (happy-path and negative)
diff --git a/gemfeed/atom.xml b/gemfeed/atom.xml
index 9d7682a5..e33ff057 100644
--- a/gemfeed/atom.xml
+++ b/gemfeed/atom.xml
@@ -1,6 +1,6 @@
-2026-02-21T23:24:01+02:00
+2026-02-21T23:38:45+02:00
 foo.zone feed
 To be in the .zone!
@@ -47,6 +47,7 @@
  • ⇢ ⇢ 4-annotate-update-task.md — progress tracking
  • ⇢ ⇢ 5-review-overview-tasks.md — picking the next task
  • The reflection and review loop
  • +Code review: human spot-check at the end
  • Measurable results
  • A real bug found by the review loop
  • Gotchas and lessons learned
@@ -437,6 +438,10 @@ tasks are in progress, show the next actionable (READY) task:
    The sub-agent reviews consistently caught things the main agent missed — tests that only asserted on mocks, missing edge cases, and even a real bug. Without the dual review loop, the agent tends to write tests that look correct but do not actually exercise real behavior.

    +Code review: human spot-check at the end


    +
    +On top of the agent's self-reflection and the two sub-agent reviews per task, I reviewed the result myself at the end. I did not read all ~5,000 lines one by one. Instead, I looked for repeating patterns across the test files and picked a few scenarios to go through manually in detail: for example, one integration test from the open/close family, one from the rename/link family, and one negative test. That was enough to satisfy me that the workflow had produced consistent, runnable tests and that the whole pipeline (task → implement → self-review → sub-agent review → fix → second review → commit) was working as intended.
    +

    Measurable results



    Here is what one day of autonomous Ampcode work produced:
    @@ -446,7 +451,7 @@ tasks are in progress, show the next actionable (READY) task:
  • 48 Taskwarrior tasks completed
  • 47 git commits
  • 87 files changed
  • -12,012 lines added, 1,543 removed
  • +~5,000 lines added, ~500 removed
  • 18 integration test files
  • 15 workload scenario files (one per syscall category)
  • 93 test scenarios total (happy-path and negative)
--
cgit v1.2.3