AI 'REASONINGBANK' Helps LLMs Learn from Mistakes and Evolve
Updated Oct 13, 2025 1:30 PM
Researchers from Google and the University of Illinois Urbana-Champaign have developed a new framework that allows AI agents to learn from their past experiences, a crucial step toward creating truly intelligent, self-evolving machines.
In a world where AI assistants are becoming increasingly common, a major limitation has been their inability to learn from their interactions. They often make the same mistakes repeatedly, failing to build on their experiences. This is akin to a student who never learns from their errors, doomed to repeat them in every exam. But what if AI could learn, adapt, and evolve based on its own unique history of successes and failures?
This is the promise of REASONINGBANK, a novel memory framework for AI agents. The system, detailed in a new research paper, acts as a "memory bank" that distills generalizable reasoning strategies from an agent's past actions. Unlike previous attempts that focused only on successful outcomes, REASONINGBANK also learns valuable lessons from failures.
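To make the idea concrete, here is a minimal sketch of what a distilled "memory item" might look like. The exact schema and field names are illustrative assumptions, not the paper's specification:

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    """One distilled reasoning strategy (schema is illustrative, not from the paper)."""
    title: str          # short name for the strategy
    description: str    # one-line summary of when it applies
    content: str        # the transferable lesson or guardrail
    from_success: bool  # distilled from a success (strategy) or a failure (guardrail)

# Hypothetical example of a lesson distilled from a failed trajectory.
guardrail = MemoryItem(
    title="Verify filters before trusting results",
    description="Check that applied search filters actually took effect",
    content="Re-read the page state before extracting items from a result list.",
    from_success=False,
)
```

The key point is that both successes and failures become first-class entries: a failure is stored not as a raw transcript but as a reusable warning.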
How REASONINGBANK Works: A Cycle of Learning
The REASONINGBANK framework operates as a continuous, closed-loop process:
- Memory Retrieval: When faced with a new task, the AI agent queries the REASONINGBANK for relevant past experiences. These memories are then used to inform the agent's decision-making process, providing it with a wealth of accumulated knowledge.
- Memory Construction: Once the task is completed, the agent, with the help of an AI judge, determines whether the outcome was a success or a failure. New "memory items" are then created from this experience. Successful experiences contribute validated strategies, while failed ones provide crucial insights into what not to do, acting as guardrails for future actions.
- Memory Consolidation: These newly created memory items are then integrated back into the REASONINGBANK, enriching the agent's knowledge base and ensuring that it is constantly learning and evolving.
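The three steps above can be sketched as one loop. All of the callables here (the retriever, agent, judge, and distiller) are stand-ins for components described in the article, not the authors' actual implementation:

```python
def run_with_memory(task, memory_bank, agent, judge, retrieve, distill):
    """One pass of the retrieve -> act -> judge -> distill -> consolidate cycle.

    A sketch under assumed interfaces; every callable is a placeholder.
    """
    # 1. Memory retrieval: fetch past experiences relevant to the new task.
    relevant = retrieve(memory_bank, task)
    # 2. Act: the agent attempts the task, conditioned on retrieved memories.
    trajectory = agent(task, relevant)
    # 3. An LLM-as-judge labels the outcome as success or failure.
    success = judge(task, trajectory)
    # 4. Memory construction: distill new memory items from the experience,
    #    validated strategies from successes, guardrails from failures.
    new_items = distill(trajectory, success)
    # 5. Memory consolidation: fold the new items back into the bank.
    memory_bank.extend(new_items)
    return trajectory, success
```

Run repeatedly over a stream of tasks, the bank grows with each outcome, which is what lets later tasks benefit from earlier mistakes.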
Scaling Up the Learning Process with MATTS
To accelerate and enhance this learning process, the researchers also introduced Memory-Aware Test-Time Scaling (MATTS). MATTS lets the agent explore multiple ways of solving a single problem, either in parallel or sequentially. This generates a rich, diverse set of experiences that provides strong signals for creating more robust and generalizable memories. The result is a powerful synergy: better memories lead to more effective problem solving, which in turn generates even better memories.
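A rough sketch of the parallel variant, under assumed interfaces (the sampling, judging, and selection logic here are illustrative, not the paper's exact method):

```python
def matts_parallel(task, memory_bank, agent, judge, distill, k=5):
    """Sample k rollouts of one task, judge each, and distill memories
    from the contrasting successes and failures.

    All callables are placeholders; this is a sketch, not the real system.
    """
    # Explore k independent attempts at the same task.
    rollouts = [agent(task, memory_bank) for _ in range(k)]
    labeled = [(r, judge(task, r)) for r in rollouts]
    # Contrasting good and bad rollouts is what yields the stronger
    # memory signals the article describes.
    for trajectory, success in labeled:
        memory_bank.extend(distill(trajectory, success))
    # Return a successful rollout if any exists, else fall back to the first.
    winners = [r for r, ok in labeled if ok]
    return winners[0] if winners else rollouts[0]
```

The sequential variant would instead refine one rollout step by step, feeding each attempt's lessons into the next.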
Putting REASONINGBANK to the Test: The Impressive Results
The researchers tested REASONINGBANK on a variety of challenging benchmarks, including web browsing (WebArena, Mind2Web) and software engineering (SWE-Bench Verified). The results were impressive, showing that agents equipped with REASONINGBANK consistently outperformed existing memory mechanisms.
Key statistics from the study:
- WebArena Benchmark:
- REASONINGBANK improved the overall success rate by +8.3%, +7.2%, and +4.6% with three different AI models compared to agents with no memory.
- It also proved to be more efficient, reducing the number of interaction steps by 16.0% on average.
- SWE-Bench-Verified (Software Engineering):
- With the Gemini-1.5 Pro model, REASONINGBANK achieved a 57.4% resolve rate, compared to 54.0% for the no-memory agent.
- MATTS Performance:
- On the WebArena Shopping subset, parallel scaling with MATTS increased the success rate from 49.7% (k=1) to 55.1% (k=5), while sequential scaling increased it from 49.7% to 54.5%.
The Future of Self-Evolving AI
The development of REASONINGBANK and MATTS represents a significant step forward in the field of artificial intelligence. By enabling AI agents to learn from their experiences, this research paves the way for more capable, efficient, and adaptable AI systems that can tackle complex, real-world problems. While the researchers acknowledge that there is still work to be done, particularly in reducing the reliance on AI judges for correctness signals, this work provides a strong foundation for the future of self-evolving AI. The dream of AI that can learn and grow like a human is now one step closer to reality.