It’s been two months since my last post, but this blog is not dead (yet). I’ve been busy working on two AI-related projects. Let me briefly introduce them.

ragev

https://github.com/achiwa912/ragev

ragev is a RAG evaluation harness that lets you change parameters such as top-k and evaluate the resulting performance. The README.org includes a complete evaluation report.
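
To make the idea of sweeping top-k concrete, here is a minimal sketch in Python. It is not ragev's actual code: the metric (recall@k), the function names, and the toy data are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set, Tuple


def recall_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = relevant_ids.intersection(ranked_ids[:k])
    return len(hits) / len(relevant_ids)


def sweep_top_k(
    runs: Dict[str, Tuple[List[str], Set[str]]], ks: List[int]
) -> Dict[int, float]:
    """Mean recall@k over all queries, for each k in ks."""
    results = {}
    for k in ks:
        scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs.values()]
        results[k] = sum(scores) / len(scores)
    return results


# Hypothetical toy data: query id -> (retrieved doc ids in rank order, relevant doc ids)
toy_runs = {
    "q1": (["d3", "d1", "d7", "d2"], {"d1", "d2"}),
    "q2": (["d5", "d4", "d9", "d8"], {"d4"}),
}

report = sweep_top_k(toy_runs, ks=[1, 2, 4])
for k, score in report.items():
    print(f"top-k={k}: mean recall={score:.2f}")
```

With this toy data, a larger top-k retrieves more of the relevant documents, which is the kind of trade-off such a harness lets you measure.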

I added a frontend UI to ragev because I wanted to use JavaScript + Vue, which I had been trying to learn for a few years without success. Once again, I’m convinced that after reading an introductory book or two, I need to work on projects to actually be able to use a programming language. AI is a great help in this learning phase: it catches all the beginner mistakes I make and explains them diligently without pointing out how much of a novice I am.

The UI lets you run evaluations and view their results.

I found that a full-stack project like ragev is hard if you are not a senior-level dev. It has so many moving parts: database, ORM, backend logic, API, frontend logic, CSS, and HTML. AI often pointed out that I had mixed up Python expressions with JS ones. Without AI assistance, I would have given up early. AI is a boon for a solo, non-full-time developer like me. It’s a pretty exciting time.

agnt3

https://github.com/achiwa912/agnt3

agnt3 is a PoC AI agent system for HR. It simulates a real-world HR system in which employees request PTO days and a manager or VP approves or denies them. Instead of a human, AI first processes each request by referring to human-written PTO rules.

For HITL (Human-in-the-Loop), the manager or VP has the final authority and can override AI decisions.

I wanted to learn what agents are, and how nondeterministic AI behavior can fit into a deterministic system. The key was the system prompt. You need to craft one carefully to avoid hallucinations even when the inputs to the AI are ambiguously written by humans. Especially if you want to use a “cheap” (i.e., weak) AI model, you need strong guardrails and step-by-step (CoT; Chain of Thought) instructions.
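
As a rough illustration of one guardrail on the deterministic side, the sketch below validates a model reply before the rest of the system acts on it. It is not agnt3's actual code: the JSON reply format, the "escalate" fallback, and all names are assumptions made for this example.

```python
import json

# Hypothetical set of decisions the deterministic system is willing to accept.
ALLOWED_DECISIONS = {"approve", "deny", "escalate"}


def escalate(reason: str) -> dict:
    """Fallback result that hands the request to a human."""
    return {"decision": "escalate", "reason": reason}


def parse_model_reply(reply: str) -> dict:
    """Turn a free-form model reply into a validated decision.

    Anything that is not well-formed JSON with an allowed decision
    is escalated instead of being trusted.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return escalate("unparseable model reply")
    if not isinstance(data, dict):
        return escalate("reply is not a JSON object")
    decision = data.get("decision")
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        return escalate("unknown decision")
    return {"decision": decision, "reason": str(data.get("reason", ""))}


print(parse_model_reply('{"decision": "approve", "reason": "within PTO balance"}'))
print(parse_model_reply("Sure! I think this should be approved."))
```

The point of the design is that no matter what the model says, the caller only ever sees one of a few known decisions.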

But it turned out that I had a harder time troubleshooting state management than implementing the AI agent and its MCP tool.

I used Claude (Sonnet 4.6) as my coding assistant, but for this difficult issue, ChatGPT (GPT-5.5) was a great help. While Claude and I were discussing increasingly complex action-driven state changes, ChatGPT advised me to just focus on states, not actions. That greatly simplified the design.
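
To illustrate what focusing on states rather than actions can look like, here is a minimal sketch. The state names and the transition table are hypothetical and not taken from agnt3.

```python
from enum import Enum


class PtoState(Enum):
    SUBMITTED = "submitted"
    AI_RECOMMENDED = "ai_recommended"
    APPROVED = "approved"
    DENIED = "denied"


# Hypothetical transition table: for each state, the states it may move to.
ALLOWED_TRANSITIONS = {
    PtoState.SUBMITTED: {PtoState.AI_RECOMMENDED},
    PtoState.AI_RECOMMENDED: {PtoState.APPROVED, PtoState.DENIED},
    PtoState.APPROVED: set(),
    PtoState.DENIED: set(),
}


def move(current: PtoState, target: PtoState) -> PtoState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = PtoState.SUBMITTED
state = move(state, PtoState.AI_RECOMMENDED)  # the AI has processed the request
state = move(state, PtoState.APPROVED)        # a human makes the final call
print(state.value)
```

In a design like this, every action (an AI recommendation, a manager's click) reduces to a request to enter a target state, so the rules live in one table instead of being spread across action handlers.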

I should probably say that ChatGPT and Claude are coding consultants, not assistants. I’ve learned a lot from AIs.