v0
November 5, 2025
initial release
skillgraph is out. it's experimental, it's not production-ready, but it works. here's what it has:
core agent
- chat and streaming support
- multi-llm support (anthropic, openai, azure, bedrock, deepseek, together, huggingface)
- async-first architecture
- handles 1000+ concurrent requests
skills system
- custom skill creation and registration
- intent-based skill routing
- skill execution with multi-tool orchestration
- interactive skills (skill mode): native multi-turn workflows. skills can take control of the conversation for as many turns as needed. perfect for ticket booking, form filling, confirmation flows.
- feedback collection for skill improvement
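a rough sketch of how intent routing and skill mode could fit together. this is not skillgraph's actual api: `Skill`, `SkillResult`, `SkillRouter` and the keyword-overlap scoring are all hypothetical names and simplifications made up for illustration. the point is only that a skill returning `done=False` keeps control of the next turn.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SkillResult:
    reply: str
    done: bool = True  # False means the skill keeps control of the next turn


@dataclass
class Skill:
    name: str
    keywords: set[str]
    handle: Callable[[str, dict], SkillResult]


@dataclass
class SkillRouter:
    skills: dict[str, Skill] = field(default_factory=dict)
    active: Optional[str] = None  # name of the skill currently in skill mode
    state: dict = field(default_factory=dict)

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def route(self, message: str) -> Optional[Skill]:
        # naive intent match: pick the skill sharing the most keywords with the message
        words = set(message.lower().split())
        best, best_score = None, 0
        for skill in self.skills.values():
            score = len(words & skill.keywords)
            if score > best_score:
                best, best_score = skill, score
        return best

    def handle(self, message: str) -> str:
        # an active skill bypasses routing until it reports done
        skill = self.skills[self.active] if self.active else self.route(message)
        if skill is None:
            return "no skill matched"
        result = skill.handle(message, self.state)
        if result.done:
            self.active, self.state = None, {}
        else:
            self.active = skill.name
        return result.reply


def book_ticket(message: str, state: dict) -> SkillResult:
    # three-turn flow: ask destination, confirm, finish
    if not state.get("asked"):
        state["asked"] = True
        return SkillResult("where to?", done=False)
    if "destination" not in state:
        state["destination"] = message.strip()
        return SkillResult(f"book a ticket to {state['destination']}? (yes/no)", done=False)
    confirmed = message.strip().lower() == "yes"
    return SkillResult("booked" if confirmed else "cancelled", done=True)


router = SkillRouter()
router.register(Skill("book_ticket", {"book", "ticket"}, book_ticket))
```

a real router would presumably use an llm or embeddings for intent detection instead of keyword overlap; the control handoff would look the same either way.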
subject-object memory architecture
- subject tracking (what the user wants) - accumulates over conversation
- object tracking (what's being discussed) - switches when topic changes
- fast utility LLM analysis (llama-3-8b, ~200ms)
- fallback chain: utility llm → gamma llm → previous state
- replaces complex rag systems with simple state tracking
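the state tracking and fallback chain above can be sketched roughly like this. it's a minimal sketch, not the real implementation: `MemoryState`, `update_state` and the analyzer output shape (`wants`, `topic`) are assumptions, and the analyzers are plain callables standing in for the utility and gamma llm calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class MemoryState:
    subject: list[str] = field(default_factory=list)  # what the user wants (accumulates)
    obj: Optional[str] = None                         # what's being discussed (switches)


# hypothetical analyzer contract: message -> {"wants": [...], "topic": "..."}
Analyzer = Callable[[str], dict]


def update_state(state: MemoryState, message: str, analyzers: list[Analyzer]) -> MemoryState:
    """try each analyzer in order; if all fail, keep the previous state."""
    for analyze in analyzers:
        try:
            result = analyze(message)
        except Exception:
            continue  # fall through to the next analyzer in the chain
        new_wants = [w for w in result.get("wants", []) if w not in state.subject]
        return MemoryState(
            subject=state.subject + new_wants,      # subject only grows
            obj=result.get("topic") or state.obj,   # object is replaced on topic change
        )
    return state  # last link in the chain: previous state
```

usage would be something like `update_state(state, message, [utility_llm, gamma_llm])`, where both callables are whatever wraps the actual model calls.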
caching (cost + speed)
- anthropic prompt caching: caches only the static system prompt (shared across all conversations). achieves 89% cost reduction on system prompt tokens (~$7 saved per 500 messages)
- redis conversation cache: 50ms → <5ms retrieval for conversation history
- postgresql: source of truth for all data
- smart invalidation (a conversation's cache entry is dropped when a new message is posted)
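the conversation cache is a read-through cache with invalidation on write. a minimal sketch of that pattern, using plain dicts as stand-ins for redis and postgresql so it runs anywhere (`ConversationStore` and its method names are hypothetical, not skillgraph's api):

```python
class ConversationStore:
    def __init__(self) -> None:
        self.db: dict[str, list[str]] = {}     # stand-in for postgresql (source of truth)
        self.cache: dict[str, list[str]] = {}  # stand-in for the redis conversation cache
        self.db_reads = 0                      # counts slow-path reads, for illustration

    def get_history(self, conversation_id: str) -> list[str]:
        # fast path: serve from cache
        if conversation_id in self.cache:
            return self.cache[conversation_id]
        # slow path: read from the database, then populate the cache
        self.db_reads += 1
        history = list(self.db.get(conversation_id, []))
        self.cache[conversation_id] = history
        return history

    def add_message(self, conversation_id: str, text: str) -> None:
        # write to the source of truth first, then bust the cached history
        self.db.setdefault(conversation_id, []).append(text)
        self.cache.pop(conversation_id, None)
```

deleting the key on write (instead of updating it in place) keeps the cache from ever holding something the database doesn't; the next read just pays the slow path once.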
conversation intelligence
- strategy selection: auto-detects conversation type (lightweight vs detailed)
- conversation summarization: incremental summaries every 10 messages (configurable). enables indefinite conversation length without hitting context limits.
- vector search recall: pgvector semantic search for past messages. scoped to current conversation for security.
- message indexing with embeddings (sentencetransformer, 384 dims)
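incremental summarization is what keeps the context bounded: every n messages get folded into a running summary, and the prompt only carries the summary plus the unsummarized tail. a sketch under assumptions (`Conversation`, `summarize` and `context` are made-up names; `summarize` stands in for an llm call):

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Conversation:
    # summarize(previous_summary, new_messages) -> updated summary (an llm call in practice)
    summarize: Callable[[str, list[str]], str]
    every: int = 10  # fold messages into the summary every `every` messages
    summary: str = ""
    pending: list[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.pending.append(message)
        if len(self.pending) >= self.every:
            # only the new batch is summarized; older text is never re-read
            self.summary = self.summarize(self.summary, self.pending)
            self.pending = []

    def context(self) -> list[str]:
        # what goes into the prompt: one summary line plus the recent tail
        head = [f"summary: {self.summary}"] if self.summary else []
        return head + self.pending


def fake_summarize(previous: str, messages: list[str]) -> str:
    # toy summarizer that just counts messages, so the sketch runs without an llm
    seen = int(previous.split()[0]) if previous else 0
    return f"{seen + len(messages)} messages summarized"
```

with `every=10`, the context never holds more than 9 raw messages plus one summary line, no matter how long the conversation runs.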
llm routing & fallback
- complexity-based routing (simple → gamma, medium → beta, complex → alpha)
- fallback chain: beta (haiku 4.5) → alpha (sonnet 4.5) → alpha retry → error
- reliable operation even during partial outages
- graceful degradation everywhere
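the routing table and fallback chain above, as a small sketch. the function names are hypothetical and the models are plain callables standing in for real client calls; only the tier mapping and the order beta → alpha → alpha retry → error come from the notes.

```python
from typing import Callable

Model = Callable[[str], str]  # prompt -> completion

TIERS = {"simple": "gamma", "medium": "beta", "complex": "alpha"}


def pick_tier(complexity: str) -> str:
    """map an estimated complexity label to a model tier."""
    return TIERS[complexity]


def call_with_fallback(prompt: str, beta: Model, alpha: Model) -> str:
    """try beta, then alpha, then alpha once more; raise if all three fail."""
    attempts = [("beta", beta), ("alpha", alpha), ("alpha retry", alpha)]
    errors = []
    for label, model in attempts:
        try:
            return model(prompt)
        except Exception as exc:
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

a real version would likely add a delay before the retry and only catch transient errors (timeouts, 5xx) instead of every exception.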
learning system
- feedback collection (explicit and implicit)
- reflexion for self-improvement
- skill learning storage
- note: improvement loop is basic, needs work
guardrails & security
- rate limiting (per-minute and per-day)
- sql injection detection
- conversation-scoped vector search (no cross-conversation leaks)
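per-minute plus per-day rate limiting can be done with two fixed-window counters per user. this is a sketch of the idea, not the shipped limiter: `RateLimiter` is a hypothetical class, counters live in process memory, and old windows are never cleaned up.

```python
import time
from collections import defaultdict
from typing import Callable


class RateLimiter:
    def __init__(self, per_minute: int, per_day: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limits = {60: per_minute, 86400: per_day}  # window seconds -> max requests
        self.clock = clock                               # injectable for testing
        self.counts: dict[tuple, int] = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        # one counter per (user, window size, window index)
        keys = [(user_id, window, int(now // window)) for window in self.limits]
        if any(self.counts[key] >= self.limits[key[1]] for key in keys):
            return False  # rejected requests don't consume quota
        for key in keys:
            self.counts[key] += 1
        return True
```

fixed windows allow short bursts at window boundaries; a sliding window or token bucket is tighter if that matters.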
frontend
- react chat UI with streaming
- web search result cards
- interactive skill mode renderer (buttons, forms)
- minimalist design (8 dependencies)
database
- postgresql with async sqlalchemy
- pgvector for semantic search
- unified message table (merged messages + embeddings)
- conversation state, summaries, skills, learning data
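since embeddings live on the message rows, conversation-scoped recall boils down to "filter by conversation id, then rank by similarity". a pure-python sketch of that logic (in the real thing pgvector would do the ranking in sql; the row shape and the `recall` function here are assumptions for illustration):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(rows: list[dict], conversation_id: str,
           query_vec: list[float], k: int = 3) -> list[str]:
    """return the k most similar messages, from this conversation only."""
    # scope first, so other conversations can never appear in the results
    scoped = [r for r in rows if r["conversation_id"] == conversation_id]
    scoped.sort(key=lambda r: cosine(r["embedding"], query_vec), reverse=True)
    return [r["content"] for r in scoped[:k]]


# toy rows with 2-dim vectors (the real embeddings are 384 dims)
rows = [
    {"conversation_id": "a", "content": "flight to berlin", "embedding": [1.0, 0.0]},
    {"conversation_id": "a", "content": "pizza order", "embedding": [0.0, 1.0]},
    {"conversation_id": "b", "content": "someone else's flight", "embedding": [1.0, 0.0]},
]
```

here `recall(rows, "a", [0.9, 0.1], k=2)` ranks "flight to berlin" first and never returns the row from conversation "b", even though it matches the query just as well.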
what's missing
- observability (prometheus + grafana integration pending)
- production testing (haven't tested at scale)
- more unit tests
- deployment guides (k8s configs)
- better documentation for advanced features
this is experimental. it works, but it's not perfect. use it, break it, tell me what's wrong.