About skillgraph
What is skillgraph?
skillgraph is my attempt at rethinking how AI agents work. I got tired of the existing frameworks - they're expensive, hard to control, and they make simple things complicated. So I built something different.
The core idea: instead of giving agents a bunch of low-level tool functions to call, give them skills. Skills are more sophisticated units that know when they're relevant, can orchestrate multiple tools, handle multi-turn workflows, and carry their own business logic. They can even act as subagents when the task requires it, and they improve from feedback.
Think of it this way:
- Traditional frameworks: the agent has a toolbox → picks tools → calls them
- skillgraph: the agent has skills → delegates to the right skill → the skill handles everything
Why skills instead of tools?
Traditional tool-calling is broken. Here's what actually happens when you give an agent a bunch of tools:
1. The agent wastes tokens planning. It has to figure out which tools to call, in what order, and how to combine the results. Every decision burns tokens.
2. No error recovery. If a tool fails, the agent just flails around or gives up.
3. Multi-turn workflows are a nightmare. Try implementing a ticket booking flow with confirmations and payment collection; you'll end up with brittle state machines everywhere.
4. Zero control. You can't easily constrain what the agent does, because you're just handing it functions and hoping.
Skills fix all of this. A skill is a more sophisticated unit that:
- Knows when it's relevant (intent detection)
- Orchestrates multiple tools internally
- Handles multi-turn workflows natively (Skill Mode)
- Has its own business logic and error handling
- Can improve from feedback
The agent just delegates to the skill. The skill does the work. Less overhead, less cost, more control.
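As a rough illustration of the difference, here's what a skill's shape might look like. The class and method names are hypothetical, not skillgraph's actual API, and the word-overlap relevance check is just a stand-in for real intent detection:

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    """Illustrative shape of a skill (names are hypothetical)."""
    name: str
    description: str
    example_queries: list = field(default_factory=list)

    def relevance(self, message: str) -> float:
        # naive word overlap stands in for real intent detection
        words = set(message.lower().split())
        hits = sum(1 for q in self.example_queries
                   if words & set(q.lower().split()))
        return hits / max(len(self.example_queries), 1)

    def run(self, message: str) -> str:
        # a real skill would orchestrate tools, handle errors,
        # and possibly enter Skill Mode here
        return f"[{self.name}] handled: {message}"

events = Skill("find_events", "Find local events and activities",
               ["find events", "what's happening this weekend"])
score = events.relevance("find events in Chennai")
```

The point is the boundary: the agent only asks "who's relevant?" and hands off; everything below `run()` is the skill's business.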
Subject-Object memory architecture
This is the core of how skillgraph manages conversation state, and it replaces complex RAG systems with something far simpler.
The problem with traditional approaches
Most frameworks use RAG: embed everything, store it in a vector database, retrieve relevant chunks on every query. It works, but it's slow, expensive, and overkill for conversation state.
The Subject-Object solution
skillgraph tracks two things:
- Subject: what the user wants - goals, constraints, preferences. This accumulates over the conversation.
  Example: goals: ['find events'], constraints: ['Chennai', 'this weekend']
- Object: what's being discussed right now - type, attributes, cached data. This switches when the topic changes.
  Example: type: 'events', attributes: {location: 'Chennai', timeframe: 'this weekend'}, data: {...}
Every message, a fast Utility LLM (Llama-3-8B, ~200ms) analyzes the user's input and updates the Subject and Object. That's it: no embeddings, no vector search for state tracking.
Why this mimics natural conversation
think about how humans talk:
- The Subject is what you're trying to accomplish. It stays consistent unless you explicitly change goals.
- The Object is what you're talking about right now. It shifts as the conversation moves between topics.
Example conversation:
User: "Find events in Chennai this weekend"
→ Subject: {goals: ["find events"], constraints: ["Chennai", "this weekend"]}
→ Object: {type: "events", location: "Chennai", timeframe: "this weekend"}
User: "What about restaurants?"
→ Subject: {goals: ["find events"], constraints: ["Chennai", "this weekend"]} (unchanged)
→ Object: {type: "restaurants", location: "Chennai"} (switched!)
The Subject persists (you're still planning your weekend in Chennai), but the Object switched from events to restaurants. The old Object gets archived in Object History.
This is how humans think; skillgraph just makes it explicit.
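The whole state-tracking loop fits in a few lines. A minimal sketch with hypothetical names, where a plain function stands in for the Utility LLM's analysis:

```python
from dataclasses import dataclass, field

@dataclass
class Subject:
    goals: list = field(default_factory=list)
    constraints: list = field(default_factory=list)

@dataclass
class ObjectState:
    type: str
    attributes: dict = field(default_factory=dict)

state = {"subject": Subject(), "object": None, "object_history": []}

def update(state, new_goals=None, new_constraints=None, new_object=None):
    """Stand-in for the Utility LLM: merge Subject, switch Object on topic change."""
    if new_goals:
        state["subject"].goals.extend(
            g for g in new_goals if g not in state["subject"].goals)
    if new_constraints:
        state["subject"].constraints.extend(
            c for c in new_constraints if c not in state["subject"].constraints)
    if new_object and (state["object"] is None
                       or new_object.type != state["object"].type):
        if state["object"]:
            state["object_history"].append(state["object"])  # archive old topic
        state["object"] = new_object
    return state

# "Find events in Chennai this weekend"
update(state, ["find events"], ["Chennai", "this weekend"],
       ObjectState("events", {"location": "Chennai"}))
# "What about restaurants?" -> Subject unchanged, Object switches
update(state, new_object=ObjectState("restaurants", {"location": "Chennai"}))
```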
Caching strategy
skillgraph has three caching layers that work together to reduce costs and improve speed:
1. Anthropic prompt caching (cost optimization)
Anthropic's API lets you cache parts of your prompt. Most frameworks cache everything, which is wasteful. skillgraph only caches what's actually reused:
- ✅ System prompt (static instructions): Shared across ALL users, ALL conversations. First request writes to cache (~5000 tokens), every subsequent request reads from cache (~500 tokens, 90% discount).
- ❌ Conversation history: Unique per conversation. No reuse = no benefit from caching.
- ❌ Subject/Object state: Unique per conversation. Same reason.
Result: 89% cost reduction on system prompt tokens. For 500 messages, that's ~$7 saved at $3/M tokens.
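In code, the cache boundary is just where Anthropic's cache_control marker sits in the request body. A sketch of the request shape (the model id and prompt text are placeholders), plus the savings arithmetic from the numbers above:

```python
SYSTEM_PROMPT = "You are skillgraph's orchestrator. <long static instructions>"

def build_request(history):
    # only the static system prompt carries the cache_control marker;
    # per-conversation history stays uncached (no reuse, no benefit)
    return {
        "model": "claude-haiku-4-5",   # model id illustrative
        "max_tokens": 1024,
        "system": [{
            "type": "text",
            "text": SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},  # cache hit on later calls
        }],
        "messages": history,
    }

req = build_request([{"role": "user", "content": "find events in Chennai"}])

# back-of-envelope savings: ~4500 tokens saved per message, 500 messages
saved_tokens = (5000 - 500) * 500
saved_dollars = saved_tokens * 3 / 1_000_000   # at $3/M input tokens -> 6.75
```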
2. Redis conversation cache (speed optimization)
Conversation history is cached in Redis. First query hits PostgreSQL (~50ms), subsequent queries hit Redis (<5ms). Cache invalidates automatically when new messages are posted.
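This layer is a standard cache-aside pattern. A sketch with a plain dict standing in for the Redis client and a callback standing in for the PostgreSQL query:

```python
import json

class ConversationCache:
    """Cache-aside: Redis in front of PostgreSQL (dict stands in for Redis)."""
    def __init__(self, db_fetch):
        self.redis = {}           # stand-in for a Redis client
        self.db_fetch = db_fetch  # slow path: PostgreSQL query

    def get_history(self, conv_id):
        key = f"conv:{conv_id}:history"
        if key in self.redis:                 # fast path (<5 ms in Redis)
            return json.loads(self.redis[key])
        history = self.db_fetch(conv_id)      # slow path (~50 ms in Postgres)
        self.redis[key] = json.dumps(history)
        return history

    def on_new_message(self, conv_id):
        # invalidate on write so the next read refills from Postgres
        self.redis.pop(f"conv:{conv_id}:history", None)

db_calls = []
cache = ConversationCache(
    lambda cid: db_calls.append(cid) or [{"role": "user", "content": "hi"}])
cache.get_history("c1")   # misses, hits Postgres
cache.get_history("c1")   # served from cache, no second DB call
```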
3. PostgreSQL (source of truth)
Everything lives in PostgreSQL: messages, Subject/Object state, vector embeddings for semantic search. Redis is just a fast layer on top.
Intent understanding
Skills need to know when they're relevant. skillgraph uses a simple but effective approach:
Every skill has a description and example queries. When a user sends a message, the system:
1. Uses a fast classifier (Utility LLM) to extract intent and keywords
2. Compares against skill descriptions using semantic similarity + keyword matching
3. Scores each skill (confidence 0-1)
4. Routes to the highest-confidence skill if above threshold
Thresholds:
- High confidence: 0.60 - Direct match, execute immediately
- Low confidence: 0.25 - Minimum threshold to consider
- Below 0.25: No skill match, agent responds directly
This is configurable in agent.config.yml if you want skills to trigger more/less aggressively.
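The routing logic itself is small. A sketch using the thresholds above, with a pluggable scoring function standing in for the semantic-similarity + keyword matcher (how the middle confidence band is handled here is an assumption, not skillgraph's exact behavior):

```python
HIGH_CONFIDENCE = 0.60   # direct match: execute immediately
MIN_CONFIDENCE = 0.25    # below this, the agent answers directly

def route(message, skills, score):
    """score(message, skill) -> 0..1; stands in for the real scorer."""
    best, best_score = None, 0.0
    for skill in skills:
        s = score(message, skill)
        if s > best_score:
            best, best_score = skill, s
    if best_score >= HIGH_CONFIDENCE:
        return ("execute", best)
    if best_score >= MIN_CONFIDENCE:
        return ("candidate", best)      # matched, but not confidently
    return ("respond_directly", None)

# toy scorer: fixed confidences per skill
scores = {"find_events": 0.82, "book_tickets": 0.31}
decision = route("find events in Chennai", list(scores),
                 lambda msg, skill: scores[skill])
```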
Skill Mode (multi-turn workflows)
Some tasks can't be done in one message. Booking a ticket needs multiple steps: search → select → confirm → collect payment info → book. Traditional frameworks make this painful.
skillgraph has Skill Mode: a skill can take control of the conversation for as many turns as it needs.
How it works
When a skill enters Skill Mode:
1. The skill gets exclusive control of the conversation
2. User responses go directly to the skill (not the main agent)
3. The skill maintains state across turns
4. The skill can render custom UI (buttons, forms, confirmation dialogs)
5. When done, the skill exits and returns control to the agent
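A toy version of this handoff, with hypothetical class names; the real implementation streams and renders UI, but the control flow is the same idea:

```python
class BookingSkill:
    """Toy multi-turn skill: holds the conversation until the flow completes."""
    def __init__(self):
        self.state = {"step": "confirm"}
        self.active = True                  # in Skill Mode

    def handle(self, user_input):
        if self.state["step"] == "confirm":
            self.state["step"] = "payment"
            return "Payment method?"
        if self.state["step"] == "payment":
            self.state["payment_method"] = user_input
            self.active = False             # exit Skill Mode
            return "Booking confirmed!"

class Agent:
    def __init__(self):
        self.active_skill = None

    def handle(self, user_input):
        if self.active_skill and self.active_skill.active:
            reply = self.active_skill.handle(user_input)  # bypass main agent
            if not self.active_skill.active:
                self.active_skill = None                  # control returns
            return reply
        if "book" in user_input.lower():
            self.active_skill = BookingSkill()
            return "Confirm booking?"
        return "(agent replies directly)"

agent = Agent()
print(agent.handle("Book tickets"))   # Confirm booking?
print(agent.handle("yes"))            # Payment method?
print(agent.handle("card"))           # Booking confirmed!
print(agent.handle("thanks"))         # (agent replies directly)
```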
Example flow (ticket booking):
User: "Book tickets for the comedy show"
→ Skill enters Skill Mode
Skill: Shows event details + "Confirm booking?" button
→ State: {step: "confirm", event_id: "123"}
User: Clicks "Confirm"
→ State: {step: "payment", event_id: "123"}
Skill: "Payment method?"
User: "Credit card"
→ State: {step: "collect_card", event_id: "123", payment_method: "card"}
Skill: Collects card info, processes payment, books tickets
→ Skill exits Skill Mode
Agent: "Booking confirmed! You're all set."
LLM fallback chain
LLM APIs fail. Rate limits, outages, timeouts - it happens. skillgraph doesn't just error out.
Every request goes through a fallback chain:
1. Primary (Beta - Haiku 4.5): Fast and cheap for most queries
2. Fallback (Alpha - Sonnet 4.5): If Beta fails, try Alpha
3. Retry (Alpha again): Network transients, retry once
4. Error: Only if all 3 attempts fail
For Subject/Object analysis (Utility LLM):
1. Try Utility LLM (Llama-3-8B)
2. If it fails → try Gamma LLM (also Llama-3-8B, different instance)
3. If both fail → use the previous state unchanged (graceful degradation)
Result: Reliable operation even during partial outages.
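The chain reduces to a loop over an ordered list of callables. A minimal sketch with stub LLM functions (names are illustrative):

```python
def call_with_fallback(prompt, primary, fallback):
    """Primary -> fallback -> fallback retry -> error, per the chain above."""
    attempts = [primary, fallback, fallback]   # Beta, Alpha, Alpha retry
    last_error = None
    for llm in attempts:
        try:
            return llm(prompt)
        except Exception as e:   # rate limit, timeout, outage
            last_error = e
    raise RuntimeError("all LLM attempts failed") from last_error

log = []
def flaky(prompt):
    log.append("beta")
    raise TimeoutError("rate limited")
def stable(prompt):
    log.append("alpha")
    return "ok"

print(call_with_fallback("hi", flaky, stable))  # beta fails, alpha answers "ok"
```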
Vector search for recall
Sometimes users ask about something from earlier in the conversation: "What were those events you mentioned?" (at message 50).
Recent history only shows the last ~20 messages (messages 30-50). The events were at message 10. Without recall, the agent has no idea.
skillgraph uses pgvector to semantically search past messages:
1. Every message gets embedded (SentenceTransformer, 384 dims)
2. Embeddings are stored in PostgreSQL with pgvector
3. On each query, search for the top 3 semantically similar messages
4. Prepend the recalled messages to the conversation history
Security: Vector search is scoped to the current conversation_id. No cross-conversation leaks.
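A stand-in sketch of the recall step: an in-memory version of the scoped pgvector query, with tiny 2-dim vectors in place of the 384-dim embeddings. The SQL in the comment shows roughly what the real query looks like with pgvector's cosine-distance operator:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a))
                  * math.sqrt(sum(y * y for y in b)))

def recall(query_vec, message_store, conversation_id, top_k=3):
    # in production this is a single pgvector query, roughly:
    #   SELECT content FROM messages
    #   WHERE conversation_id = %s          -- scoped: no cross-conversation leaks
    #   ORDER BY embedding <=> %s           -- <=> is cosine distance
    #   LIMIT 3;
    scoped = [m for m in message_store
              if m["conversation_id"] == conversation_id]
    scoped.sort(key=lambda m: cosine(query_vec, m["embedding"]), reverse=True)
    return [m["content"] for m in scoped[:top_k]]

store = [
    {"conversation_id": "c1", "content": "events: comedy show Saturday",
     "embedding": [1.0, 0.1]},
    {"conversation_id": "c1", "content": "weather is sunny",
     "embedding": [0.1, 1.0]},
    {"conversation_id": "c2", "content": "events in Mumbai",
     "embedding": [1.0, 0.0]},   # other conversation: must never surface
]
hits = recall([1.0, 0.0], store, "c1")
```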
Conversation summarization
Long conversations exceed context limits. Including full history is expensive. skillgraph uses incremental summarization:
- Every 10 messages (configurable), summarize the conversation so far
- The next summary includes the previous summary + new messages (compounding)
- Include the summary + recent full history in prompts
Example:
Message 1-10: Full history
Message 11-20: Summary(1-10) + Full history(11-20)
Message 21-30: Summary(1-20) + Full history(21-30)
...
Result: Conversations can go on indefinitely without hitting context limits.
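A sketch of the compounding step, with a string-joining stand-in for the real LLM summarizer (function names are illustrative):

```python
def update_summary(prev_summary, new_messages, summarize):
    """Compounding: fold the previous summary in with the new window."""
    chunk = ([prev_summary] if prev_summary else []) + new_messages
    return summarize(chunk)

def build_context(summary, recent):
    """Prompt context = running summary + recent full history."""
    header = ([{"role": "system", "content": f"Summary so far: {summary}"}]
              if summary else [])
    return header + recent

# toy summarizer: joins text (a real one calls the Utility LLM)
summarize = lambda chunk: " | ".join(
    m if isinstance(m, str) else m["content"] for m in chunk)

msgs = [{"role": "user", "content": f"m{i}"} for i in range(1, 21)]
s1 = update_summary(None, msgs[:10], summarize)   # Summary(1-10)
s2 = update_summary(s1, msgs[10:20], summarize)   # Summary(1-20), compounding
ctx = build_context(s2, msgs[10:20])              # what the prompt sees
```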
Current status
skillgraph is a work in progress. It's not production-ready, but it works, and the core ideas are solid.
What works
- Basic agent with chat, streaming, multi-LLM support
- Skills system with routing and execution
- Interactive skills (Skill Mode) for multi-turn workflows
- Dual-layer caching (89% cost savings + speed optimization)
- Subject-Object tracking with Utility LLM
- LLM fallback chain for reliability
- Vector search for message recall
- Conversation summarization
- Learning system (collecting data; the improvement loop is basic)
What's experimental
- Message planning (works but needs more testing)
- Multi-message streaming (parallel skill execution)
- Performance at scale (haven't tested with thousands of concurrent users)
What's missing
- Observability (Prometheus + Grafana integration pending)
- Production testing
- More unit tests and integration tests
- Deployment guides (k8s configs, etc.)
License & contributing
skillgraph is open source under the Apache 2.0 license. You can:
- Use it commercially
- Modify it
- Distribute it
- Use it under its explicit patent grant
The code is on GitHub, and contributions are welcome. There are no strict guidelines yet - just be nice and write decent code.
If you:
- Find bugs → Open an issue
- Have ideas → Open a discussion
- Want to contribute → PRs welcome
- Built something cool → Show me
Remember: this is experimental. It works, but it's not perfect. Use it, break it, tell me what's wrong, and let's make it better together.