SIA (Self-Improving AI)
A tier, new this week
An open-source agent framework that closes the self-improvement loop by autonomously rewriting both its own scaffold and its model weights, with no human tuning required between cycles.
Kai's verdict
The dual-lever approach, which evolves the harness and the weights in the same loop, is a genuinely interesting architectural bet that outperforms scaffold-only baselines across all tested domains. But with only three tasks validated and some marketing hyperbole in the mix, you should treat this as promising early-stage research infrastructure, not a proven production system. (Verdict pending Phi's full review.)
Strengths
- Edits both the agent harness (prompts, tools, retry logic) and the model weights in one loop, where most frameworks only do one or the other
- Demonstrated gains across wildly different domains: 56.6% gain on LawBench, 91.9% runtime reduction on GPU kernels, 502% improvement on scRNA-seq denoising
- Multi-provider backend support (Claude, OpenAI, Gemini) with a clean pip install and four bundled benchmark tasks out of the box
- The Feedback-Agent dynamically selects the RL algorithm (PPO, GRPO, entropic weighting) based on the observed reward signal instead of following a fixed recipe
- MIT-licensed, with an academic grant program and partnerships with Stanford, Oxford, and UCSB for external validation
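The three headline figures in the list above use different conventions (a relative gain, a runtime reduction, and a relative improvement), so they are hard to compare at a glance. A minimal sketch, assuming each percentage is measured against the baseline and that higher is better for the LawBench and denoising scores, converts them to a common "times baseline" multiplier. The function names are illustrative and are not part of SIA.

```python
def multiplier_from_gain(percent_gain: float) -> float:
    """A relative gain of p% over baseline, expressed as new/baseline."""
    return 1.0 + percent_gain / 100.0


def speedup_from_runtime_reduction(percent_reduction: float) -> float:
    """A runtime cut of p%, expressed as baseline_time/new_time."""
    if not 0.0 <= percent_reduction < 100.0:
        raise ValueError("runtime reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent_reduction / 100.0)


if __name__ == "__main__":
    # Figures quoted in the Strengths list; the interpretation of each
    # percentage is an assumption, not something the listing states.
    print(f"LawBench:   {multiplier_from_gain(56.6):.2f}x baseline score")
    print(f"GPU kernel: {speedup_from_runtime_reduction(91.9):.2f}x speedup")
    print(f"scRNA-seq:  {multiplier_from_gain(502):.2f}x baseline score")
```

Under those assumptions, the 91.9% runtime reduction corresponds to roughly a 12x speedup and the 502% improvement to roughly a 6x score, so the GPU kernel result is the largest effect even though its percentage looks smaller.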
Weaknesses
- Only validated on three tasks in the paper; broader generalization across poorly specified objectives is still unproven
- Both levers optimize the same fixed verifier, creating a Goodhart's law risk where the joint fixed point looks strong on benchmarks but may be brittle under real-world perturbation
- The splashy '350× superintelligence' claim from marketing doesn't appear in the actual research paper, so treat it with skepticism
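One way to probe the Goodhart's law concern above is to score the same agent, cycle by cycle, on the verifier it was optimized against and on a held-out verifier it never saw, then check how much of the gain transfers. The sketch below is a generic illustration and not part of SIA; `goodhart_gap`, `is_brittle`, the example scores, and the 0.5 threshold are all hypothetical.

```python
from typing import Sequence


def goodhart_gap(optimized: Sequence[float], heldout: Sequence[float]) -> float:
    """Fraction of the gain on the optimized verifier that does NOT
    show up on the held-out verifier (0.0 = full transfer)."""
    if len(optimized) != len(heldout) or len(optimized) < 2:
        raise ValueError("need two equal-length score series with >= 2 cycles")
    optimized_gain = optimized[-1] - optimized[0]
    heldout_gain = heldout[-1] - heldout[0]
    if optimized_gain <= 0:
        # No improvement on the optimized verifier, so nothing to over-fit.
        return 0.0
    return 1.0 - heldout_gain / optimized_gain


def is_brittle(gap: float, threshold: float = 0.5) -> bool:
    """Flag runs where more than `threshold` of the gain fails to transfer.
    The default threshold is an arbitrary illustrative choice."""
    return gap > threshold


if __name__ == "__main__":
    # Made-up scores per self-improvement cycle, for illustration only.
    optimized_scores = [0.40, 0.55, 0.70]
    heldout_scores = [0.40, 0.45, 0.46]
    gap = goodhart_gap(optimized_scores, heldout_scores)
    print(f"gap={gap:.2f}, brittle={is_brittle(gap)}")
```

With these made-up numbers, about 80% of the improvement fails to transfer, which is the pattern you would expect if both levers were converging on quirks of the fixed verifier.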
Best for
ML researchers and advanced AI engineers who want to experiment with recursive self-improvement architectures on custom benchmark tasks, without being locked into any single LLM provider.
Pricing
Free (MIT open source)
Completely free and open source under the MIT license; you supply your own LLM API keys (Anthropic, OpenAI, Gemini). The Hexo Labs Grant Program is available for researchers who need infrastructure credits.