Argentor

Live Demos

Skills Toolkit & Guardrails

18 utility skills (calculator, JSON query, web search, prompt guard, secret scanner, diff…) plus a guardrails pipeline that blocks PII and prompt injection in real time.

cargo run --example demo_skills_toolkit

16 real tool executions + 4 guardrail tests — zero API keys, zero mocks.

DevOps Pipeline

An Argentor agent running an automated 8-step DevOps pipeline — real tool execution, no API keys, no mocks.

cargo run --example demo_pipeline

Interactive player — click to pause, scroll to rewind.

Why Argentor?

Most agent frameworks trade security for flexibility, or intelligence for simplicity. Argentor gives you all three.

🧠

Genuinely Intelligent

ReAct reasoning loops, self-evaluation with quality scoring, cost-aware model routing, and adaptive memory that learns across sessions.

🛡️

Security-First

WASM sandboxing, capability-based permissions, encrypted credentials, SSRF prevention, RBAC, and human-in-the-loop approval gates.

⚡

Production-Grade

4498 tests, 0 failures. Persistent state, control plane, web dashboard, OpenTelemetry, and A2A protocol for agent interop.

Capability	Argentor	CrewAI	AutoGPT	LangGraph
ReAct Reasoning Engine	✓	✗	Basic	Manual
Self-Evaluation Loop	✓	✗	✗	✗
Cost-Aware Model Routing	✓	✗	✗	✗
A2A Protocol (Agent Interop)	✓	✗	✗	✗
WASM Sandboxed Plugins	✓	✗	✗	✗
Multi-Agent Patterns (6+)	✓	2	✗	✓
Compliance (GDPR, ISO)	✓	✗	✗	✗
Control Plane + Dashboard	✓	✗	✗	Plugin
Memory-Safe (Rust)	✓	Python	Python	Python

Intelligent Agent Core

Not just tool calling — structured reasoning, self-correction, and cost optimization built into every agent.

ReAct Reasoning Engine

Structured Think → Act → Observe → Reflect cycle. Agents decompose complex tasks into reasoning steps, track confidence per step, and know when to stop or ask for clarification.

Smart Tool Selection

TF-IDF relevance scoring filters tools before sending to the LLM, reducing token waste by up to 80%. Tracks per-tool success rates and adapts selection strategy automatically.

Self-Evaluation

Every response is scored on 4 dimensions: relevance, consistency, completeness, and clarity. Below-threshold responses trigger automatic refinement loops before delivery.

Cost-Aware Model Router

Routes simple tasks to fast/cheap models and complex tasks to powerful ones. 7-factor complexity estimation with 4 strategies: CostOptimized, QualityOptimized, Balanced, and Tiered.

Adaptive Memory

Cross-session memory that auto-extracts facts, tool patterns, and error resolutions. Importance decay over time keeps memory relevant. Keyword-based recall with configurable relevance thresholds.

14 LLM Providers

Claude, GPT-4, Gemini, Ollama, Mistral, xAI, Azure OpenAI, Cerebras, Together, DeepSeek, vLLM, OpenRouter, Groq, and more. Automatic failover across backends.

Capabilities

14 crates, 187K+ lines of code, 50+ built-in skills, every component tested and documented.

🧠

ReAct + Self-Evaluation

Structured reasoning with Think/Act/Observe/Reflect cycle. Quality scoring on 4 dimensions with automatic refinement loops. Agents know when their answer isn't good enough.

Intelligence

💰

Cost-Aware Routing

Route simple tasks to Haiku/GPT-4o-mini and complex ones to Opus/o1. Budget tracking, 7-factor complexity estimation, and 4 configurable routing strategies.

Intelligence

🔒

WASM Sandboxed Plugins

Skills run in WebAssembly (wasmtime + WASI) with capability-based permissions. No skill can escape its sandbox. SSRF prevention, path traversal blocking, and shell injection blocking.

Security

🤖

Multi-Agent Orchestration

10 specialized agent roles with DAG task queue, dependency resolution, inter-agent messaging via A2A MessageBus, and dynamic replanning with 6 recovery strategies.

Core

🌐

A2A Protocol

Google Agent-to-Agent interop via JSON-RPC 2.0. AgentCard discovery, task send/get/cancel/list, streaming SSE for real-time updates. Your agents talk to the world.

New

🔌

MCP Proxy + Credential Vault

Centralized MCP proxy with intelligent routing (round-robin, least-loaded, pattern-based), circuit breaker, credential vault with rotation, and token pool management.

Core

🧠

Semantic + Adaptive Memory

Hybrid BM25 + embedding vector search for long-term memory. Adaptive memory auto-extracts facts, tool patterns, and error resolutions across sessions.

Intelligence

⚖️

Compliance Built-in

GDPR, ISO 27001, ISO 42001, and DPGA compliance modules. Audit logging, consent tracking, bias monitoring, and automated reporting.

🔨

Code Generation & DevOps

API scaffold generator (Rust/Axum, Python/FastAPI, Node/Express). IaC generator (Docker, Helm, Terraform, GitHub Actions). Git operations, code analysis, test runner.

📡

Control Plane & Dashboard

17 REST API endpoints for deployment management. Web dashboard with real-time status, agent registry with 9 default definitions, health monitoring with auto-recovery.

New

📊

Observability

Prometheus /metrics, OpenTelemetry traces, token budget tracking per agent, structured audit logging, and real-time agent monitoring with health state machine.

🖥️

Full CLI

Deploy, scale, and monitor agents from the terminal. A2A discovery and task management. Compliance reports, config hot-reload, and skill management.

New

Agent Intelligence

10 new cognitive modules that make agents think deeper, recover faster, and learn continuously.

💭

Extended Thinking

Chain-of-thought reasoning with configurable thinking budgets. Agents break complex problems into deliberate reasoning steps before acting.

Reasoning

🔍

Self-Critique

Agents review their own outputs against quality criteria, identify weaknesses, and revise before delivery. Multi-pass refinement built in.

Quality

📦

Context Compaction

Intelligent summarization that compresses long conversations while preserving critical decisions and facts. Keeps agents coherent across long sessions.

Memory

🔎

Tool Discovery

Dynamic tool registry with semantic search. Agents discover and compose tools at runtime based on task requirements, not static configuration.

Tools

🤝

Agent Handoffs

Structured delegation between agents with context transfer. Agents know when to escalate, what context to pass, and how to resume.

Orchestration

💾

State Checkpointing

Save and restore agent execution state at any point. Enables rollback, branching, and resumption of interrupted workflows.

Reliability

📈

Trace Visualization

Full execution traces with reasoning steps, tool calls, and decision points. Export to OpenTelemetry or view in the built-in dashboard.

Observability

⚙️

Dynamic Tool Gen

Agents generate new tools on-the-fly from natural language descriptions. Created tools are sandboxed, tested, and optionally persisted for reuse.

Tools

🎯

Process Reward Scoring

Step-level reward signals that evaluate each reasoning step, not just the final answer. Guides agents toward better intermediate decisions.

Reasoning

📚

Learning Feedback

Continuous improvement from execution outcomes. Agents store success and failure patterns, adapting their strategies across sessions.

Learning

SDK — Build with Argentor

First-class SDKs for Python and TypeScript. Build agents in your language of choice with full type safety.

🐍

Python SDK

argentor-sdk

Async-first with asyncio support
Pydantic models for full type safety
Agent builder with fluent API
Built-in tool decorators
Streaming support for real-time output
Comprehensive error handling

pip install argentor-sdk

📌

TypeScript SDK

@argentor/sdk

Full TypeScript types and generics
Works in Node.js and browser
Zod schema integration for tool I/O
Reactive streams with RxJS
Tree-shakeable ESM package
Auto-generated from Rust types

npm install @argentor/sdk

💻

React Dashboard

@argentor/dashboard

Real-time agent status and health
Execution trace visualization
Token usage and cost analytics
Deployment management UI
Skill registry browser
Built with React + Tailwind CSS

npm install @argentor/dashboard

Defense-in-Depth Security

Security is not a feature — it is the architecture. Every layer enforces isolation, authentication, and auditability.

Plugin Isolation

WASM sandboxing via wasmtime + WASI
Capability-based permission model
Memory isolation per plugin
Progressive tool disclosure

Input Sanitization

SSRF prevention (private IP blocking)
Path traversal detection and blocking
Shell injection prevention
Prompt injection defense

Authentication

JWT auth (HMAC-SHA256)
OAuth2 support
TLS/mTLS connections
Encrypted credential store (AES-256-GCM)

Rate Limiting & Audit

Per-agent rate limiting
Token budget tracking with alerts
Structured audit log (JSONL)
Audit query API with filtering

Compliance Modules

GDPR data handling controls
ISO 27001 information security
ISO 42001 AI management
DPGA digital public goods

Access Control

Role-based access control (RBAC)
Human-in-the-loop approval gates
WebSocket approval workflows
Per-tool permission scoping

Architecture

Orchestrator-Workers pattern with intelligent core, centralized MCP proxy, A2A interop, and full control plane.

HTTP/WS Gateway ←→ Web Dashboard (JWT Auth + TLS) (Real-time SPA) | Prometheus /metrics + OpenTelemetry | Control Plane (17 REST endpoints) Deploy / Scale / Health / Registry | Orchestrator [Intelligent Core] Model Router → Tool Selector → ReAct → Evaluator | +------+------+------+------+------+ | | | | | | Spec Coder Tester Review Archi ... Worker Worker Worker Worker Worker +4 | | | | | +------+------+------+------+ | MCP Proxy (Credential Vault + Token Pool) Circuit Breaker + Intelligent Routing | +--------+---------+---------+--------+ | | | | | Skills External Audit CodeGen A2A (WASM) MCP Srvrs Log API/IaC Interop | | | Sandbox Encrypted JSON-RPC Isolation Credential + SSE Store Streaming

A2A Protocol — Agent Interoperability

Google's Agent-to-Agent protocol for cross-platform agent communication. Your Argentor agents can discover, delegate, and collaborate with any A2A-compatible agent.

🌐

Agent Discovery

AgentCard at /.well-known/agent.json
Capabilities, skills, and authentication advertised
AgentCardBuilder for fluent card construction
CLI: argentor a2a discover --url http://agent

📨

Task Management

JSON-RPC 2.0 — tasks/send, tasks/get, tasks/cancel, tasks/list
Task status tracking with artifacts and messages
Session-based task grouping
Typed client with full error handling

📡

Streaming (SSE)

tasks/sendSubscribe via Server-Sent Events
Real-time status updates, artifacts, and messages
StreamingTaskHandler trait for custom handlers
Automatic fallback for non-streaming handlers

6 Collaboration Patterns

Choose the right multi-agent pattern for your task. Mix and match within a single orchestration.

A → B → C → D

Pipeline

Sequential stages. Each agent transforms and passes output to the next. Ideal for linear workflows.

┌ B ┐ A ─┤ ├─ D └ C ┘

MapReduce

Fan out to parallel workers, then aggregate results. Great for divide-and-conquer problems.

A ⇄ B ⇄ C → Judge

Debate

Agents argue opposing positions. A judge agent synthesizes the best answer from the discourse.

┌ A ┐ ├ B ├ → Vote └ C ┘

Ensemble

Multiple agents solve independently. Results are combined via voting, ranking, or weighted merge.

Supervisor ┌──┼──┐ A B C

Supervisor

A supervisor agent monitors workers, re-assigns tasks on failure, and enforces quality gates.

A ─ B | | D ─ C

Swarm

Fully decentralized. Agents self-organize via a shared message bus with emergent coordination.

Code Generation & DevOps

Generate production-ready scaffolds, infrastructure, and CI/CD pipelines from agent conversations.

🔨

API Scaffold Generator

Rust / Axum — typed handlers, extractors, error handling
Python / FastAPI — Pydantic models, async routes
Node / Express — middleware, validation, TypeScript
OpenAPI spec generation included
Database migrations and models

☁️

Infrastructure as Code

Docker — multi-stage builds, compose files
Helm — Kubernetes charts with HPA, ingress
Terraform — AWS and GCP provider modules
GitHub Actions — CI/CD pipelines
Secrets management and env configuration

🔧

Developer Tools

Git operations — branch, commit, PR automation
Code analysis — AST parsing, dependency graphs
Test runner — execute and parse test results
Linter integration — clippy, eslint, ruff
Automated code review via SecurityAuditor agent

Observability & Monitoring

Know exactly what your agents are doing, how much they cost, and when they need attention.

📊

Prometheus Metrics

Built-in /metrics endpoint. Track latency, tokens, errors, and agent health in Grafana.

🔍

OpenTelemetry

Distributed tracing with OTLP export. Instrument the full request path from gateway to tool execution.

💰

Cost Tracking

Per-agent token budgets with real-time estimation. Track spend across 14 providers with cost-aware routing.

📝

Audit Logging

Every tool call, LLM request, and decision logged as structured JSONL. Query API with time-range filters.

# Example: /metrics endpoint output argentor_requests_total{agent="coder"} 1247 argentor_tokens_used{agent="coder",type="output"} 89432 argentor_model_cost_usd{model="sonnet",strategy="balanced"} 2.47 argentor_tool_calls_total{tool="shell"} 342 argentor_react_steps_total{agent="architect"} 156 argentor_evaluation_score{quantile="0.50"} 0.82

14 Workspace Crates

Modular architecture. Use only what you need, or the full stack.

Crate	Description
argentor-core	Core types, errors, Message, ToolCall, ToolResult, OpenTelemetry integration
argentor-agent	Agent runner, ReAct engine, self-evaluator, model router, tool selector, adaptive memory, 14 LLM backends, failover, streaming
argentor-security	Capabilities, RBAC, rate limiting, SSRF/path traversal prevention, audit, TLS/mTLS, encrypted store
argentor-orchestrator	Multi-agent engine, 6 collaboration patterns, DAG task queue, deployment manager, health checker, agent registry
argentor-mcp	MCP client, server, proxy with credential vault, token pool, circuit breaker, intelligent routing
argentor-a2a	Google A2A protocol: JSON-RPC 2.0 server/client, AgentCard, task management, SSE streaming
argentor-skills	Skill trait, WASM sandbox runtime (wasmtime + WASI), plugin registry, vetting pipeline
argentor-memory	Semantic vector memory, hybrid BM25 + embedding search, query expansion, JSONL persistence
argentor-builtins	Built-in skills: shell, file I/O, HTTP fetch, memory, browser automation, Docker sandbox, code gen
argentor-gateway	HTTP/WS gateway, REST API, control plane (17 endpoints), web dashboard, proxy management, Prometheus metrics
argentor-channels	Channel bridges for Slack, Discord, Telegram, and webchat integration
argentor-session	Session management, file and database persistence, conversation transcripts
argentor-compliance	GDPR, ISO 27001, ISO 42001, DPGA modules with automated reporting and hooks
argentor-cli	CLI binary: serve, deploy, agents, health, a2a, skill, compliance — full control from the terminal

Quick Start

Up and running in under a minute. The demo requires no API keys.

# Clone and build
git clone https://github.com/fboiero/Agentor.git
cd Agentor
cargo build --workspace

# Run the demo (no API keys needed)
cargo run --example demo_pipeline

# Run all 4498 tests
cargo test --workspace

# Start the gateway with dashboard
cargo run --bin argentor -- serve

# Deploy and manage agents
cargo run --bin argentor -- deploy create --name my-agent
cargo run --bin argentor -- deploy summary
cargo run --bin argentor -- health summary

# Discover a remote A2A agent
cargo run --bin argentor -- a2a --url http://remote:3000 discover

# Generate a compliance report
cargo run --bin argentor -- compliance report
    

Full Getting Started Guide View on GitHub

Live Demos

Skills Toolkit & Guardrails

DevOps Pipeline

Why Argentor?

Genuinely Intelligent

Security-First

Production-Grade

Intelligent Agent Core

ReAct Reasoning Engine

Smart Tool Selection

Self-Evaluation

Cost-Aware Model Router

Adaptive Memory

14 LLM Providers

Capabilities

ReAct + Self-Evaluation

Cost-Aware Routing

WASM Sandboxed Plugins

Multi-Agent Orchestration

A2A Protocol

MCP Proxy + Credential Vault

Semantic + Adaptive Memory

Compliance Built-in

Code Generation & DevOps

Control Plane & Dashboard

Observability

Full CLI

Agent Intelligence

Extended Thinking

Self-Critique

Context Compaction

Tool Discovery

Agent Handoffs

State Checkpointing

Trace Visualization

Dynamic Tool Gen

Process Reward Scoring

Learning Feedback

SDK — Build with Argentor

Python SDK

TypeScript SDK

React Dashboard

Defense-in-Depth Security

Plugin Isolation

Input Sanitization

Authentication

Rate Limiting & Audit

Compliance Modules

Access Control

Architecture

A2A Protocol — Agent Interoperability

Agent Discovery

Task Management

Streaming (SSE)

6 Collaboration Patterns

Pipeline

MapReduce

Debate

Ensemble

Supervisor

Swarm

Code Generation & DevOps

API Scaffold Generator

Infrastructure as Code

Developer Tools

Observability & Monitoring

Prometheus Metrics

OpenTelemetry

Cost Tracking

Audit Logging

14 Workspace Crates

Quick Start