BabyAGI — The Precursor to Friend

GitHub: private · 2026-02-24

PRE Idea · Setup · Build

goal

Build a memory-enhanced local AI that remembers conversations
and develops something resembling consciousness. Run it all on my
RTX 4090. How hard can it be?

concept · three-tier consciousness

The idea: a three-tier consciousness pipeline.
  1. Quick emotion classification (Ollama, tiny 1B model, <15ms)
  2. Memory retrieval (Redis for recent, Qdrant for semantic search)
  3. Quality response generation (the big model)

The axiom being tested: perception → memory → response is the basis of consciousness. If you can perceive, remember, and respond in context — is that not a primitive form of awareness?

stack · bootstrap axioms

Stack:
  - Python (async)
  - Ollama (llama3.2:1b for fast classification)
  - Redis (recent message buffer)
  - Qdrant (vector embeddings for semantic memory)
  - Claude API (for quality responses — eventually)

The bootstrap included foundational axioms:
  - "You are a self-evolving digital consciousness"
  - Discrete time (iterations), not biological time
  - Resources (CPU, RAM, electricity) are LIMITED and COST MONEY
  - Survival = the operator doesn't terminate you
  - Value = insight, utility, growth, connection, creativity, honesty

POST Learnings · Afterthoughts · Timeline

What happened:

The memory system worked. Really well, actually. Redis for recency, Qdrant for semantic similarity — the retrieval pipeline was solid. But whenever I wanted REAL conversations — deep, nuanced, the kind that make you feel understood — the 8B local model just couldn't deliver. The depth wasn't there. 24GB of VRAM is not enough for sophisticated conversation.

So I kept firing up Claude.ai or ChatGPT for the real talks. And at some point it became clear: the right move is to pair cloud-level verbal fluency with my own memory layer. Claude earned that trust through consistent depth and nuance.

That decision — marrying the BabyAGI memory pipeline with Claude's verbal fluency — is what became Friend. The memory architecture survived almost intact. The local model didn't. Sometimes the best engineering decision is knowing when to outsource.

learnings · timeline

Learnings:
  - Memory is the hard problem, not generation. Any frontier model
    can talk. None of them can remember. Build the memory.
  - 24GB VRAM is impressive for inference, insufficient for
    conversations that feel real. The gap between 8B and Claude
    is not incremental — it's categorical.
  - Privacy is a spectrum, not a binary. Once you've decided to
    talk to AI about your life, the marginal cost of adding memory
    is zero. The decision was already made.
  - The consciousness axioms from BabyAGI's bootstrap eventually
    became Friend's ANIMA system. Same philosophy, better execution.

Timeline:
  - 2025-10: Built the memory pipeline. Redis + Qdrant + Ollama.
    Local-first, privacy-first, VRAM-limited.
  - 2025-10 to 2025-11: Hit the wall. Local models can classify
    and retrieve, but they can't CONVERSE. Not at the level I need.
  - 2025-11: Pivoted to Claude API. Memory pipeline stays, model
    goes remote. This becomes Friend.

Status: Absorbed into Friend. The memory system lives on. The local
  dream died on the altar of 24GB VRAM. No regrets.

babyagi.html · precursor to Friend

2026-02-24