updates

a timeline of what's been built and improved in skillgraph.

v0 · November 5, 2025

initial release

skillgraph is out. it's experimental, it's not production-ready, but it works. here's what it has:

core agent

  • chat and streaming support
  • multi-llm support (anthropic, openai, azure, bedrock, deepseek, together, huggingface)
  • async-first architecture
  • handles 1000+ concurrent requests
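
a rough sketch of what "async-first" with streaming can look like, using only asyncio. `stream_reply`, `chat` and `serve_many` are made-up names for illustration, not skillgraph's api, and the echo generator stands in for a real llm stream.

```python
import asyncio
from typing import AsyncIterator


async def stream_reply(message: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for an LLM stream: yields the reply token by token."""
    for token in f"echo: {message}".split():
        await asyncio.sleep(0)  # hand control back to the event loop between tokens
        yield token


async def chat(message: str) -> str:
    """Consume the stream and return the full reply."""
    return " ".join([token async for token in stream_reply(message)])


async def serve_many(n: int = 1000) -> list[str]:
    """Run n chats concurrently on one event loop."""
    return await asyncio.gather(*(chat(f"msg {i}") for i in range(n)))


if __name__ == "__main__":
    replies = asyncio.run(serve_many(1000))
    print(len(replies), replies[0])
```

this only shows that many in-flight requests can share one event loop. it says nothing about real throughput against real providers.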

skills system

  • custom skill creation and registration
  • intent-based skill routing
  • skill execution with multi-tool orchestration
  • interactive skills (skill mode): native multi-turn workflows. skills can take control of the conversation for as many turns as needed. perfect for ticket booking, form filling, confirmation flows.
  • feedback collection for skill improvement
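
to make registration, routing and skill mode concrete, here's a minimal sketch. every name in it (`SkillRegistry`, `Skill`, `TicketBooking`) is hypothetical, and keyword overlap is only a stand-in for whatever intent detection skillgraph actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Skill:
    name: str
    keywords: set[str]
    handler: Callable[[str], str]


class SkillRegistry:
    """Hypothetical registry: register skills, then route a message to one."""

    def __init__(self) -> None:
        self._skills: dict[str, Skill] = {}

    def register(self, name: str, keywords: set[str]):
        def decorator(fn: Callable[[str], str]):
            self._skills[name] = Skill(name, {k.lower() for k in keywords}, fn)
            return fn
        return decorator

    def route(self, message: str) -> Optional[Skill]:
        # Keyword overlap as a toy intent score; None means "no skill matched".
        words = set(message.lower().split())
        best, best_score = None, 0
        for skill in self._skills.values():
            score = len(words & skill.keywords)
            if score > best_score:
                best, best_score = skill, score
        return best


class TicketBooking:
    """Toy skill-mode flow: keeps control until every field is filled."""

    FIELDS = ("destination", "date")

    def __init__(self) -> None:
        self.answers: dict[str, str] = {}

    def step(self, user_input: Optional[str] = None) -> tuple[str, bool]:
        pending = [f for f in self.FIELDS if f not in self.answers]
        if user_input is not None and pending:
            self.answers[pending[0]] = user_input
            pending = pending[1:]
        if pending:
            return f"what is the {pending[0]}?", False  # skill keeps the turn
        return f"booked: {self.answers}", True          # skill releases control


registry = SkillRegistry()


@registry.register("booking", {"book", "ticket"})
def start_booking(message: str) -> str:
    return TicketBooking().step()[0]
```

the second value returned by `step` is the "done" flag: while it's false, the next user message goes straight back to the skill instead of through routing.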

subject-object memory architecture

  • subject tracking (what the user wants) - accumulates over conversation
  • object tracking (what's being discussed) - switches when topic changes
  • fast utility LLM analysis (llama-3-8b, ~200ms)
  • fallback chain: utility llm → gamma llm → previous state
  • replaces complex rag systems with simple state tracking
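
the state tracking and fallback chain above could be sketched like this. `MemoryState`, `Analysis` and `update_memory` are hypothetical names, and the analyzers are plain callables standing in for the utility llm and gamma llm calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Sequence


@dataclass
class Analysis:
    """What one analyzer pass extracts from a message (assumed shape)."""
    subject_facts: list[str]
    topic: Optional[str]


@dataclass
class MemoryState:
    subject: list[str] = field(default_factory=list)  # accumulates over the conversation
    current_object: Optional[str] = None               # switches when the topic changes

    def apply(self, analysis: Analysis) -> None:
        for fact in analysis.subject_facts:
            if fact not in self.subject:
                self.subject.append(fact)
        if analysis.topic and analysis.topic != self.current_object:
            self.current_object = analysis.topic


Analyzer = Callable[[str], Analysis]


def update_memory(state: MemoryState, message: str, analyzers: Sequence[Analyzer]) -> bool:
    """Try each analyzer in order; if all fail, the previous state is left untouched."""
    for analyze in analyzers:
        try:
            analysis = analyze(message)
        except Exception:
            continue  # fall through to the next analyzer in the chain
        state.apply(analysis)
        return True
    return False
```

passing `[utility_llm, gamma_llm]` as `analyzers` gives the utility → gamma → previous state order, since a `False` return means nothing was changed.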

caching (cost + speed)

  • anthropic prompt caching: caches only the static system prompt (shared across all conversations). achieves an 89% cost reduction on system prompt tokens (~$7 saved per 500 messages)
  • redis conversation cache: 50ms → <5ms retrieval for conversation history
  • postgresql: source of truth for all data
  • smart invalidation (a conversation's cache entry is dropped when a new message is posted)
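
here's the read-through cache plus invalidation idea in miniature. plain dicts stand in for redis and postgresql so the sketch runs anywhere; `ConversationStore` and its methods are hypothetical names.

```python
class ConversationStore:
    """Toy read-through cache: dicts stand in for redis (cache) and postgresql (db)."""

    def __init__(self) -> None:
        self.db: dict[str, list[str]] = {}     # source of truth
        self.cache: dict[str, list[str]] = {}  # fast copy of conversation history
        self.db_reads = 0                      # counts cache misses, for illustration

    def get_history(self, conversation_id: str) -> list[str]:
        cached = self.cache.get(conversation_id)
        if cached is not None:
            return list(cached)  # cache hit: no database read
        self.db_reads += 1
        history = list(self.db.get(conversation_id, []))
        self.cache[conversation_id] = history
        return list(history)

    def add_message(self, conversation_id: str, message: str) -> None:
        # Write to the source of truth first, then drop the stale cache entry.
        self.db.setdefault(conversation_id, []).append(message)
        self.cache.pop(conversation_id, None)
```

dropping the entry on write (instead of updating it in place) keeps the database as the only place history is ever assembled, at the cost of one extra read after each new message.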

conversation intelligence

  • strategy selection: auto-detects conversation type (lightweight vs detailed)
  • conversation summarization: incremental summaries every 10 messages (configurable). enables indefinite conversation length without hitting context limits.
  • vector search recall: pgvector semantic search for past messages. scoped to current conversation for security.
  • message indexing with embeddings (sentencetransformer, 384 dims)
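
incremental summarization can be sketched as: buffer messages, and every n of them fold the buffer into a running summary. `IncrementalSummarizer` is a hypothetical name, and `summarize` is any callable you supply (an llm call in practice).

```python
from typing import Callable


class IncrementalSummarizer:
    """Folds every `every` messages into a running summary (hypothetical helper)."""

    def __init__(self, summarize: Callable[[str, list[str]], str], every: int = 10) -> None:
        self.summarize = summarize  # (previous_summary, new_messages) -> new_summary
        self.every = every
        self.summary = ""
        self.pending: list[str] = []

    def add(self, message: str) -> None:
        self.pending.append(message)
        if len(self.pending) >= self.every:
            self.summary = self.summarize(self.summary, self.pending)
            self.pending = []

    def context(self) -> list[str]:
        """What would be sent to the model: the summary plus unsummarized messages."""
        prefix = [f"summary so far: {self.summary}"] if self.summary else []
        return prefix + self.pending
```

because each pass only sees the previous summary and the newest batch, the context stays bounded at one summary plus fewer than `every` raw messages, however long the conversation runs.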

llm routing & fallback

  • complexity-based routing (simple → gamma, medium → beta, complex → alpha)
  • fallback chain: beta (haiku 4.5) → alpha (sonnet 4.5) → alpha retry → error
  • reliable operation even during partial outages
  • graceful degradation everywhere
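
a minimal sketch of the routing table and fallback chain above. the tier names come from the list; the word-count `classify` heuristic, its thresholds, and the `call` signature are assumptions made up for the example.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

ROUTES = {"simple": "gamma", "medium": "beta", "complex": "alpha"}
FALLBACK_CHAIN = ("beta", "alpha", "alpha")  # beta → alpha → alpha retry


class AllModelsFailed(RuntimeError):
    """Raised when every step of the fallback chain has failed."""


def classify(message: str) -> str:
    # Hypothetical heuristic: word count as a proxy for complexity.
    n = len(message.split())
    if n < 8:
        return "simple"
    if n < 40:
        return "medium"
    return "complex"


def pick_model(message: str) -> str:
    return ROUTES[classify(message)]


async def complete_with_fallback(
    prompt: str,
    call: Callable[[str, str], Awaitable[str]],  # (tier, prompt) -> reply
    chain: Sequence[str] = FALLBACK_CHAIN,
) -> str:
    errors = []
    for tier in chain:
        try:
            return await call(tier, prompt)
        except Exception as exc:
            errors.append(f"{tier}: {exc}")  # remember why, then try the next tier
    raise AllModelsFailed("; ".join(errors))
```

listing alpha twice is the simplest way to express "alpha retry": the loop doesn't need a separate retry mechanism.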

learning system

  • feedback collection (explicit and implicit)
  • reflexion for self-improvement
  • skill learning storage
  • note: improvement loop is basic, needs work

guardrails & security

  • rate limiting (per-minute and per-day)
  • sql injection detection
  • conversation-scoped vector search (no cross-conversation leaks)
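
per-minute and per-day limits can share one sliding-window log per user. this is a sketch under that assumption, not necessarily how skillgraph does it; `RateLimiter` and the injectable `clock` are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable

MINUTE, DAY = 60.0, 86400.0


class RateLimiter:
    """Sliding-window limiter enforcing a per-minute and a per-day cap per user."""

    def __init__(self, per_minute: int, per_day: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limits = ((MINUTE, per_minute), (DAY, per_day))
        self.clock = clock
        self.hits: dict[str, deque] = defaultdict(deque)  # timestamps of allowed requests

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self.hits[user_id]
        while hits and now - hits[0] >= DAY:
            hits.popleft()  # forget anything older than the longest window
        for window, limit in self.limits:
            recent = sum(1 for t in hits if now - t < window)
            if recent >= limit:
                return False  # rejected requests are not recorded
        hits.append(now)
        return True
```

in a multi-process deployment the timestamps would have to live somewhere shared (redis, for instance) rather than in process memory.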

frontend

  • react chat UI with streaming
  • web search result cards
  • interactive skill mode renderer (buttons, forms)
  • minimalist design (8 dependencies)

database

  • postgresql with async sqlalchemy
  • pgvector for semantic search
  • unified message table (merged messages + embeddings)
  • conversation state, summaries, skills, learning data
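
to show what "scoped to the current conversation" means for recall over the unified message table, here's a pure-python stand-in. in pgvector the same idea would be a `WHERE conversation_id = ...` filter plus ordering by the `<=>` cosine-distance operator; the `Message` shape and `recall` function below are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Message:
    conversation_id: str
    text: str
    embedding: list[float]  # 384 dims in practice; tiny vectors here for readability


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(messages: list[Message], conversation_id: str,
           query: list[float], k: int = 3) -> list[Message]:
    """Filter to one conversation first, then rank by similarity."""
    scoped = [m for m in messages if m.conversation_id == conversation_id]
    scoped.sort(key=lambda m: cosine(m.embedding, query), reverse=True)
    return scoped[:k]
```

filtering before ranking is the point: a message from another conversation can never be returned, even when it's the closest match overall.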

what's missing

  • observability (prometheus + grafana integration pending)
  • production testing (haven't tested at scale)
  • more unit tests
  • deployment guides (k8s configs)
  • better documentation for advanced features

this is experimental. it works, but it's not perfect. use it, break it, tell me what's wrong.