
A LOCAL-FIRST AI WORKSHOP FOR AUTONOMOUS CODING

selfware

Run it on your own hardware. Keep your code private. Improve your software with safety rails.

No cloud account required. Open source. MIT. v0.2 BETA

Selfware chat interface showing ASCII banner, token stats, and command prompt
YOUR CODE STAYS ON YOUR MACHINE · YOUR DATA NEVER LEAVES · EMERGENCE IS NOT ENTROPY · CONSTRAINTS ARE PHYSICS · FITNESS IS SACRED · SOFTWARE THAT LASTS · OWN YOUR INTELLIGENCE · BREAK THE API LEASH · CHARITY FOR PRIVACY

WHAT WE BELIEVE

Intelligence should be owned, not rented.

Every API call is a dependency. Every subscription is a leash. Every cloud model is someone else's computer reading your code. Selfware is the anti-singularity measure: intelligence that runs on your hardware, improves your software, and answers to no one.

We're building toward a public-benefit structure: self-hosted tools that keep capability in the hands of individuals and small teams. Preserve privacy. Resist centralization-by-convenience.

✓ No API keys ✓ No telemetry ✓ No subscription ✓ No cloud dependency

What data leaves my machine?

By default, none. LLM inference runs locally. Web tools and HTTP calls are explicit opt-in, shown in logs.

WHY LOCAL-FIRST

Cloud AI is convenient.
That convenience has a cost.

Every cloud model centralizes power, data, and capability. Selfware is built for personal sovereignty: you own the runtime, your context, and your tools.

AI Misdirection in 2025

Why privacy preservation matters, and what the industry gets wrong.

Your code stays on your machine

No uploads, no training data extraction. Your source code never leaves your hardware.

Works with local OpenAI-compatible endpoints

Ollama, vLLM, LM Studio, llama.cpp. Bring your own model, run it your way.

Configurable safety boundaries

Path allow/deny, command filtering, autonomy levels. You decide what it can and can't do.

SEE IT WORK

Point it at a codebase.
Watch it think.

Selfware analyzed a 42,000-line game server, found 3 critical bugs, and deployed 6 agents to fix them, all autonomously.

No prompting. No hand-holding. Just selfware run and a task description.

$ selfware analyze ./massive_game_server
🔍 Scanning 133 Rust files (42,596 lines)...
⚠ Found 3 critical issues:
1. Delta bitmask truncation → silent data loss
2. ObjectPool race condition → memory exhaustion
3. SMS command injection → RCE vulnerability
📊 Doc coverage: 0.8% (needs 20%+)
$ selfware run "fix critical bugs, add tests"
🏗️ Architect → planning fix across 18 modules
💻 Coder → patching delta bitmask (u16 not u8)
🧪 Tester → generating 47 regression tests
🔍 Reviewer → verifying thread safety
🔒 Security → sanitizing shell construction
🌸 BLOOM: all 3 critical bugs fixed. 47 tests pass.
$ selfware garden
🌱 massive_game_server
├── 🌸 core/ BLOOM (healthy)
├── 🌿 network/ GROW (improving)
├── 🌾 state_sync/ WILT (needs care)
└── ❄️ auth/ FROST (critical)

THE SWARM

Six minds. One purpose.

🏗️ Architect

Module design, refactoring, dependency mapping. Plans the structure before anyone writes a line.

💻 Coder

Implementation and feature development. Writes the code that passes the Tester's scrutiny.

🧪 Tester

Test generation, verification gates. Every mutation must survive the test suite.

🔍 Reviewer

Code review, quality checks. Catches what the Coder missed and the Tester didn't cover.

⚙️ DevOps

Build pipelines, deployment, infrastructure. Keeps the workshop running.

🔒 Security

Vulnerability scanning, safety audits. Guards the protected groves.

Agents coordinate through shared memory and a task queue. Each runs its own PDVR cycle.
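
As a rough illustration of coordinating through shared memory and a task queue, here is a minimal sketch using only the Rust standard library. The `Task` struct, the `run_swarm` function, and the role strings are hypothetical names chosen for this example; the source does not describe Selfware's actual orchestration types.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical task record; the real task format is not described here.
#[derive(Debug)]
struct Task {
    id: u32,
    role: String,
}

/// Spawns `workers` threads that drain one shared queue and record each
/// finished task in shared memory. Results are sorted for stable output.
fn run_swarm(tasks: Vec<Task>, workers: usize) -> Vec<(u32, String)> {
    let queue = Arc::new(Mutex::new(VecDeque::from(tasks)));
    let done = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let done = Arc::clone(&done);
            thread::spawn(move || loop {
                // Hold the queue lock only long enough to pop one task.
                let next = queue.lock().unwrap().pop_front();
                match next {
                    Some(task) => done.lock().unwrap().push((task.id, task.role)),
                    None => break, // queue is empty, this worker is finished
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let mut finished = done.lock().unwrap().clone();
    finished.sort();
    finished
}

fn main() {
    let tasks = vec![
        Task { id: 1, role: "architect".to_string() },
        Task { id: 2, role: "coder".to_string() },
        Task { id: 3, role: "tester".to_string() },
    ];
    for (id, role) in run_swarm(tasks, 2) {
        println!("task {} finished by the {} role", id, role);
    }
}
```

Each worker here only pops and records; in a real agent the popped task would drive one full PDVR cycle before the next pop.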

SELFWARE AGENTIC BENCHMARK

Measured. Not promised.

20 autonomous coding scenarios. Easy to Expert. 27 rounds, 323 runs. Real Rust projects, real compilation, real test suites.

Grand average: 89/100
Steady-state: 90/100
BLOOM rounds: 22/27
Perfect streak: 6

S TIER (99-100): calculator, string_ops, json_merge, perf_opt, codegen. Reliability: 100%
A TIER (86-96): bitset, scheduler, event_bus, async_race. Reliability: 89-96%
B TIER (63-72): security, testgen, refactor. Reliability: 74%

Scenarios: calculator (Easy), string_ops (Easy), json_merge (Med), bitset (Med), testgen (Med), refactor (Med), scheduler (Hard), event_bus (Hard), security (Hard), perf_opt (Hard), codegen (Hard), async_race (Expert), svg_chart (Easy), ascii_table (Easy), histogram (E-Med), sparkline (E-Med), progress_bar (Med), maze_gen (Med), unsafe_scan (Hard), actor_pdvr (Hard)

PHASE TRANSITIONS: HOW WE GOT HERE

❄️ Round 1: 0/5 tests
→ + gate: Completion gate
🌸 Round 2: 6/6 pass, 97/100
→ + detect: No-op + repetition detector
🌸 Round 5: Rescues infinite loops

Small constraint changes → large emergent behavior shifts.
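
To make the idea of a no-op and repetition detector concrete, here is a minimal sketch of one way such a check could work. The `Step`, `Verdict`, and `LoopDetector` names, the sliding-window approach, and the window size are all assumptions made for illustration; the source does not describe how Selfware's detector is implemented.

```rust
use std::collections::VecDeque;

/// Hypothetical record of one agent step: the action taken and how many
/// files changed as a result.
#[derive(Debug)]
struct Step {
    action: String,
    changed_files: usize,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Continue,
    NoOp,
    Repetition,
}

/// Keeps the last `window` steps and flags runs that make no progress.
struct LoopDetector {
    window: usize,
    recent: VecDeque<Step>,
}

impl LoopDetector {
    fn new(window: usize) -> Self {
        // A window below 2 would flag every single step, so clamp it.
        Self { window: window.max(2), recent: VecDeque::new() }
    }

    fn observe(&mut self, step: Step) -> Verdict {
        self.recent.push_back(step);
        if self.recent.len() > self.window {
            self.recent.pop_front();
        }
        if self.recent.len() < self.window {
            return Verdict::Continue; // not enough history yet
        }
        let first = &self.recent[0];
        if self.recent.iter().all(|s| s.action == first.action) {
            return Verdict::Repetition; // same action repeated for the whole window
        }
        if self.recent.iter().all(|s| s.changed_files == 0) {
            return Verdict::NoOp; // varied actions, but nothing changed on disk
        }
        Verdict::Continue
    }
}

fn main() {
    let mut detector = LoopDetector::new(3);
    for action in ["edit", "edit", "edit"] {
        let verdict = detector.observe(Step { action: action.to_string(), changed_files: 0 });
        println!("{} -> {:?}", action, verdict);
    }
}
```

A supervisor could use a `Repetition` or `NoOp` verdict to abort the run or force a re-plan, which is one plausible way an infinite loop gets rescued.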

27 rounds on Qwen3-Coder-Next FP8. No cherry-picking. Full run logs on GitHub.

VISUAL LANGUAGE MODEL BENCHMARK

It sees TUIs. It reads flamegraphs.
It writes layouts from mockups.

19 scenarios across 6 difficulty levels. Three models tested on real hardware. PNG fixtures generated from TUI screenshots.

LEVEL             DIFFICULTY   Qwen3.5-9B    Qwen3-VL-30B   Qwen3.5-35B
L1 TUI State      Easy         88% BLOOM     62% GROW       50% WILT
L2 Diagnostics    Medium       39% WILT      67% GROW       67% GROW
L3 Architecture   Hard         30% WILT      47% GROW       63% BLOOM
L4 Profiling      VeryHard     100% BLOOM    100% BLOOM     100% BLOOM
L5 Layout         Extreme      85% BLOOM     100% BLOOM     100% BLOOM
Mega Evolution    Mega         53% BLOOM     33% WILT       76% BLOOM
Overall                        66% BLOOM     68% BLOOM      76% BLOOM

Qwen3.5-9B Q8 · RTX 4090 · 17.9K tokens · 118s
Qwen3-VL-30B FP8 · 2× RTX 4090 · 38.7K tokens · 486s
Qwen3.5-35B Q8 · M2 Max 96GB · 60.5K tokens · 3,186s

L1 TUI State: Terminal recognition
L2 Diagnostics: Compiler error parsing
L3 Architecture: Diagram comprehension
L4 Profiling: Flamegraph analysis
L5 Layout: Mockup → ratatui code
Mega Evolution: Multi-image scoring

All 3 models hit 100% BLOOM on L4 Profiling. The 35B model was the first to pass the Hard threshold (60%) on L3 Architecture.

WOLFRAM'S FOUR CLASSES

How selfware
manages complexity.

Class I: Static output. No adaptation, no learning. Dead ends.

Class II: Rigid patterns. Maximum order, zero flexibility. Traditional CI/CD falls here.

Class III: Unconstrained LLM output: hallucinations, inconsistent logic, no guardrails.

Class IV: Structured creativity. PDVR cognition + safety constraints = adaptable but controlled.

PHASE TRANSITION MAP

Flat: entropy ~0
Repetitive: entropy low
Chaotic: entropy max
Complex: entropy med

Complexity ≠ entropy. Organized complexity is structure with surprise.

PDVR CYCLE: PLAN → DO → VERIFY → REFLECT

TWO LOOPS, NESTED

Two feedback
loops.

Inner loop: PDVR cognitive cycle. Working memory tracks hypotheses. Episodic memory persists across sessions.

Outer loop: Evolution Engine evaluates proposed changes against test suites and fitness benchmarks. Only improvements that pass all gates are kept.
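
The outer loop's "pass all gates" rule can be sketched as a short sequence of checks. The `Candidate` fields, the gate names, and the mapping of accepted and rejected changes to `Bloom` and `Frost` are assumptions made for this illustration, not Selfware's actual types.

```rust
/// Hypothetical summary of one proposed change after it has been built and tested.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    compiles: bool,
    tests_passed: u32,
    tests_total: u32,
    fitness: f64,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// The change is kept.
    Bloom,
    /// The change is discarded; the string names the first gate that failed.
    Frost(&'static str),
}

/// A candidate is kept only if it passes every gate, checked cheapest first.
fn evaluate(candidate: Candidate, baseline_fitness: f64) -> Outcome {
    if !candidate.compiles {
        return Outcome::Frost("compile gate");
    }
    if candidate.tests_passed != candidate.tests_total {
        return Outcome::Frost("test gate");
    }
    if candidate.fitness <= baseline_fitness {
        return Outcome::Frost("fitness gate");
    }
    Outcome::Bloom
}

fn main() {
    let candidate = Candidate { compiles: true, tests_passed: 47, tests_total: 47, fitness: 0.91 };
    println!("{:?}", evaluate(candidate, 0.90));
}
```

Requiring a strict improvement over the baseline is one possible reading of "only improvements"; a real engine might also allow ties or noise margins.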

BLOOM FROST

COMPOSITE FITNESS FUNCTION

SAB Score 50% · Token Efficiency 25% · Latency 15% · Coverage 5% · Binary Size 5%
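
The weights above amount to a weighted sum. The sketch below assumes each component has already been normalized to the range 0.0 to 1.0 with higher meaning better (so a faster run or a smaller binary maps to a higher value); that normalization, the `Metrics` struct, and the `composite_fitness` function are assumptions for illustration, while the weights come from the breakdown above.

```rust
/// Hypothetical normalized metrics, each expected in 0.0..=1.0 (higher is better).
#[derive(Debug, Clone, Copy)]
struct Metrics {
    sab: f64,
    token_efficiency: f64,
    latency: f64,
    coverage: f64,
    binary_size: f64,
}

/// Weighted sum using the 50/25/15/5/5 split. Inputs are clamped so that a
/// single out-of-range value cannot dominate the score.
fn composite_fitness(m: Metrics) -> f64 {
    let clamp = |x: f64| x.clamp(0.0, 1.0);
    0.50 * clamp(m.sab)
        + 0.25 * clamp(m.token_efficiency)
        + 0.15 * clamp(m.latency)
        + 0.05 * clamp(m.coverage)
        + 0.05 * clamp(m.binary_size)
}

fn main() {
    let m = Metrics {
        sab: 0.9,
        token_efficiency: 0.5,
        latency: 0.5,
        coverage: 0.5,
        binary_size: 0.5,
    };
    println!("composite fitness: {:.3}", composite_fitness(m));
}
```

With these weights, a 10-point change in the SAB component moves the composite as much as a full 100-point swing in coverage or binary size.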

VISUAL INTELLIGENCE

It can see.

Screenshot capture, vision analysis, visual regression testing. An autonomous feedback loop that captures, critiques, and iterates until the design scores above threshold.

selfware dashboard
┌─ AGENTS ────────┬─ OUTPUT LOG ───────────┐
│ ● Architect  ✓  │ [14:23] Scanning 133.. │
│ ● Coder      ■  │ [14:24] Found 3 issues │
│ ● Tester     ●  │ [14:24] Patching...    │
│ ● Reviewer   ·  │ [14:25] Tests pass ✓   │
├─────────────────┼────────────────────────┤
│ STATUS: 4/6 active │ MEM: 142MB          │
└─────────────────┴────────────────────────┘
agent_list · output_log · status_bar
VLM Response: structured JSON
{
  "panels": [
    { "name": "agent_list", "bounds": [1,1,16,5] },
    { "name": "output_log", "bounds": [18,1,38,5] },
    { "name": "status_bar", "bounds": [1,7,38,7] }
  ],
  "indicators": { "active": 4, "icons": ["✓","■","●","·"] },
  "accessibility": { "contrast": 4.8, "box_drawing": true }
}
rustc output
error[E0382]: borrow of moved value: `data`
  --> src/sync/buffer.rs:47:18
   |
45 | let data = vec![1, 2, 3];
   |     ---- move occurs because `data`
   |          has type `Vec<i32>`
46 | process(data);
   |         ---- value moved here
47 | println!("{:?}", data);
   |                  ^^^^ value borrowed
   |                       here after move
   |
help: consider cloning the value
   |
46 | process(data.clone());
   |             ++++++++
Structured Extraction
{
  "error_code": "E0382",
  "error_type": "borrow_after_move",
  "location": {
    "file": "src/sync/buffer.rs",
    "line": 47, "col": 18
  },
  "moved_at": { "line": 46 },
  "suggestion": {
    "action": "clone",
    "insert": ".clone()"
  },
  "confidence": 0.97
}
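
For comparison with the vision-based extraction, the same error code and location can be pulled from plain-text compiler output with ordinary string handling. This is only a sketch: the `Diagnostic` struct and `parse_diagnostic` function are hypothetical names, and it handles just the header line and the `-->` location line, not the full diagnostic format.

```rust
/// Hypothetical subset of the structured fields shown above.
#[derive(Debug, PartialEq)]
struct Diagnostic {
    error_code: String,
    message: String,
    file: String,
    line: u32,
    col: u32,
}

/// Parses the first `error[CODE]: message` header and the `--> file:line:col`
/// line that follows it. Returns `None` if either part is missing.
fn parse_diagnostic(output: &str) -> Option<Diagnostic> {
    let mut lines = output.lines().map(str::trim);

    // Header looks like: error[E0382]: borrow of moved value: `data`
    let header = lines.find(|l| l.starts_with("error["))?;
    let (code, message) = header.strip_prefix("error[")?.split_once("]: ")?;

    // Location looks like: --> src/sync/buffer.rs:47:18
    let location = lines.find(|l| l.starts_with("--> "))?.strip_prefix("--> ")?;

    // Split from the right so that colons inside the path are left alone.
    let mut parts = location.rsplitn(3, ':');
    let col = parts.next()?.parse().ok()?;
    let line = parts.next()?.parse().ok()?;
    let file = parts.next()?.to_string();

    Some(Diagnostic {
        error_code: code.to_string(),
        message: message.to_string(),
        file,
        line,
        col,
    })
}

fn main() {
    let output = "error[E0382]: borrow of moved value: `data`\n  --> src/sync/buffer.rs:47:18\n";
    println!("{:?}", parse_diagnostic(output));
}
```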
AFTER
┌─ Dashboard ──────────────────┐
│ Agents    │ Task Queue       │
│ ● Arch    │ fix: delta       │
│ ● Code    ┃ test: buffer     │
│ ● Test    │ rev: auth        │
├──────────────────────────────┤
│ Status: OK│ Mem: 142MB       │
└──────────────────────────────┘
BEFORE
┌─ Dashboard ──────────────────┐
│ Agents    │ Task Queue       │
│ ● Arch    │ fix: delta       │
│ ● Code    │ test: buffer     │
│ ● Test    │ rev: auth        │
├───────────┼──────────────────┤
│ Status: OK│ Mem: 142MB       │
└───────────┴──────────────────┘
vision_compare output
pixel_similarity: 73.2%
semantic_diffs: 3
1. Sidebar border: "│" → "┃" (line 4, col 13)
2. Row join: "┼" → "─" (line 6, col 12)
3. Status row: column divider removed (line 7)
verdict: "regression_detected"
design loop
$ selfware design-loop --threshold 80
[iter 1] Capturing screenshot...
[iter 1] Analyzing - score: 59/100
[iter 1] Below threshold. Applying 4 fixes...
[iter 2] Capturing screenshot...
[iter 2] Analyzing - score: 72/100
[iter 2] Below threshold. Applying 2 fixes...
[iter 3] Capturing screenshot...
[iter 3] Analyzing - score: 82/100
[iter 3] ✓ Threshold met, design approved
Visual Critic Scores
Composition: 82
Hierarchy: 78
Readability: 91
Overall: 82/100 THRESHOLD: 80 ✓

Iter 1: 59 (4 fixes)
Iter 2: 72 (2 fixes)
Iter 3: 82 (✓ done)
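
The capture, score, and fix cycle in the design loop log can be sketched as a bounded loop. The `design_loop` function, the `LoopResult` enum, the iteration cap, and the scoring callback are hypothetical stand-ins; in particular, the callback replaces the real screenshot and VLM scoring step, which the source does not specify.

```rust
#[derive(Debug, PartialEq)]
enum LoopResult {
    Approved { iterations: u32, score: u32 },
    GaveUp { iterations: u32, score: u32 },
}

/// Runs capture-and-score up to `max_iters` times and stops as soon as the
/// score reaches `threshold`. The callback receives the 1-based iteration number.
fn design_loop<F>(threshold: u32, max_iters: u32, mut capture_and_score: F) -> LoopResult
where
    F: FnMut(u32) -> u32,
{
    let mut last = 0;
    for iter in 1..=max_iters {
        last = capture_and_score(iter);
        if last >= threshold {
            return LoopResult::Approved { iterations: iter, score: last };
        }
        // Below threshold: a real loop would apply the critic's fixes here.
    }
    LoopResult::GaveUp { iterations: max_iters, score: last }
}

fn main() {
    // Scores taken from the log above; the closure stands in for the critic.
    let scores = [59u32, 72, 82];
    let result = design_loop(80, 5, |iter| scores[(iter - 1) as usize]);
    println!("{:?}", result);
}
```

The iteration cap is an addition of this sketch: without one, a design that never reaches the threshold would loop forever.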
screen_capture: Screenshot any screen, window, or region
browser_screenshot: Headless Chrome/Playwright full-page capture
vision_analyze: Send to any VLM for structured analysis
vision_compare: Pixel + semantic visual regression testing

CONSTRAINTS ARE PHYSICS

Fitness is sacred.

Entropy is removing constraints. Organized complexity is adding the right constraints.

IMMUTABLE INVARIANTS

✗ Cannot modify its own fitness function
✗ Cannot modify the benchmark suite
✗ Cannot modify the safety module
✗ All mutations must pass cargo check
✗ Property tests mandatory for core mutations

PROTECTED GROVES

🛡️ src/evolution/
🛡️ src/safety/
🛡️ system_tests/
🛡️ benches/sab_

The evaluator cannot be modified by the thing being evaluated.

SAFETY RAILS

What it can't do matters
as much as what it can.

Path allow/deny

File access is gated by configurable path rules. No wildcard defaults.

Command filtering

Dangerous commands (rm -rf, sudo, etc.) are blocked unless explicitly allowed.
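
A deny-list check with explicit overrides might look like the sketch below. The `is_blocked` function and its matching rules are assumptions for illustration only: the source does not say how Selfware parses commands, and a prefix match like this is easy to bypass (for example `rm -r -f`), so a production filter would need proper shell parsing.

```rust
/// Returns true if `command` should be blocked.
/// `denied` holds command prefixes such as "rm -rf" or "sudo";
/// `allowed` holds exact commands that the user has explicitly permitted.
fn is_blocked(command: &str, denied: &[&str], allowed: &[&str]) -> bool {
    // Collapse repeated whitespace so "rm   -rf" and "rm -rf" compare equal.
    let normalized = command.split_whitespace().collect::<Vec<_>>().join(" ");

    if allowed.iter().any(|a| *a == normalized) {
        return false;
    }

    // Check every segment of a pipeline or command chain, not just the first.
    normalized
        .split(|c: char| c == '|' || c == ';' || c == '&')
        .map(str::trim)
        .any(|segment| {
            denied
                .iter()
                .any(|d| segment == *d || segment.starts_with(&format!("{} ", d)))
        })
}

fn main() {
    let denied = ["rm -rf", "sudo"];
    let allowed = ["rm -rf ./target"];
    for cmd in ["cargo test", "sudo reboot", "rm -rf ./target", "echo hi && rm -rf /"] {
        println!("{:<22} blocked: {}", cmd, is_blocked(cmd, &denied, &allowed));
    }
}
```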

Protected invariants

Safety modules, fitness functions, and evaluation code are immutable. The system cannot modify its own guardrails.

KNOWN LIMITATIONS
• LLM outputs are non-deterministic: the same input may produce different results
• Safety boundaries are advisory when autonomy level is set to maximum
• No formal verification of generated code; tests are the safety net
Full security policy → SECURITY.md

SELFWARE DOCTOR โ€” SYSTEM DIAGNOSTICS

Selfware doctor command showing system diagnostics: Rust tools, Python tools, security checks, container tools

selfware doctor audits your toolchain, security posture, and dependencies before you start.

YOUR TOOLKIT

54 tools. 6 commands. One binary.

🌊 Multi-Agent Swarm

6 specialized roles: Architect, Coder, Tester, Reviewer, DevOps, Security. Up to 16 concurrent.

$ selfware multi-chat
src/orchestration/

🧬 Evolution Engine

LLM proposes edits to its own source. Compile, test, evaluate fitness. Only BLOOMs survive.

$ selfware evolve
src/evolution/

🔄 PDVR Cognition

Plan → Do → Verify → Reflect. Working memory + episodic memory across sessions.

$ selfware run <task>
src/cognitive/

🌱 Digital Garden

Files as plants with growth stages. selfware garden visualizes codebase health.

$ selfware garden ./src
src/ui/garden.rs

❄️ Frost Recovery

Checkpoint seeds survive crashes. Resume any session from last known state.

$ selfware resume
src/session/checkpoint.rs

🛡️ Safety Guardians

Path validation, command filtering, SSRF protection, symlink guards. 4 autonomy levels.

$ selfware chat --level 3
src/safety/

Also: analyze · journal · status · workflow · dashboard · demo

WHAT IT LOOKS LIKE

Selfware dashboard TUI showing chat, garden view, active tools, and logs panels
selfware dashboard
Chat, garden, tools, and logs in one view
Selfware chat interface with ASCII banner, token usage, and model info
selfware chat
Interactive coding with token tracking
Selfware command palette showing all available commands
command palette
30+ commands at your fingertips

BATTLE-TESTED

42,000 lines of Rust.
One command.

A real-time multiplayer game server: WebRTC networking, SIMD physics, 16-agent AI, FlatBuffers protocol. Selfware scanned it, found critical vulnerabilities, and proposed fixes.

Rust files: 133
Lines of code: 42.6K
Critical bugs found: 3
Modules refactored: 18

BUGS FOUND AUTONOMOUSLY

Delta bitmask truncation (CRITICAL): u16 cast to u8; shields and flags silently lost
ObjectPool race condition (CRITICAL): TOCTOU on max size; unbounded memory growth
Shell command injection (CRITICAL): Unsanitized SMS command construction; RCE vector

WHAT THE SWARM DID

✓ Split 1,787-line monolith into 18 focused modules
✓ Generated regression tests for each critical fix
✓ Mapped all 40+ undocumented environment variables
✓ Identified O(N²) bot AI bottleneck with spatial index fix
✓ Flagged join-throughput collapse at 73+ concurrent clients

Selfware identified these issues and proposed fixes. Human review was applied before merging.


HOW IT COMPARES

The only tool that owns its own loop.

When local LLMs are good enough, why rent intelligence?

                        Selfware  Cursor  Copilot  Aider  SWE-Agent
Runs locally            ✓         -       -        ✓      ✓
Self-improving          ✓         -       -        -      -
Multi-agent             ✓         -       -        -      ✓
Visual agent            ✓         -       -        -      -
Your data stays local   ✓         -       -        ✓      ✓
No subscription         ✓         -       -        ✓      ✓
Any LLM backend         ✓         -       -        ✓      -
Evolution engine        ✓         -       -        -      -
Codebase health viz     ✓         -       -        -      -
Agentic benchmark       ✓         -       -        -      ✓

CLOUD AI

Cursor Pro: $240/yr
Copilot Business: $228/yr
API credits: $600+/yr
Your data: leaves
Total: $600–1,000+/year

LOCAL (RTX 4090)

GPU (one-time): ~$1,600
Electricity: ~$55/yr
Selfware: $0 (MIT)
Your data: stays
Total: ~$55/yr after GPU

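Electricity: 150 W × 8 h/day × 365 days × $0.12/kWh ≈ $53/yr, rounded to ~$55 above. GPU amortized over 5 years = $320/yr, so the amortized total is about $375/yr, still cheaper than cloud in year 1.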
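
The arithmetic behind these figures is easy to check or adapt to your own tariff. The function names below are just for this example; the inputs are the ones quoted in this section (150 W, 8 h/day, $0.12/kWh, a ~$1,600 GPU amortized over 5 years).

```rust
/// Yearly electricity cost in USD for a device drawing `watts` for
/// `hours_per_day` every day of the year.
fn electricity_per_year(watts: f64, hours_per_day: f64, usd_per_kwh: f64) -> f64 {
    watts / 1000.0 * hours_per_day * 365.0 * usd_per_kwh
}

/// Straight-line amortization of a one-time purchase.
fn amortized_per_year(price_usd: f64, years: f64) -> f64 {
    price_usd / years
}

fn main() {
    let electricity = electricity_per_year(150.0, 8.0, 0.12);
    let gpu = amortized_per_year(1600.0, 5.0);
    println!("electricity: ${:.2}/yr", electricity);
    println!("gpu:         ${:.2}/yr", gpu);
    println!("total:       ${:.2}/yr", electricity + gpu);
}
```

The result scales linearly with duty cycle and tariff, so doubling either the daily hours or the price per kWh doubles the electricity line while the amortized GPU cost stays fixed.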

Tools: 54
Tests: 6800+
Agents: 16
Commits: 515+
SAB Score: 90/100

EARLY ADOPTERS

What builders are saying.

From early testers during private beta.

"Pointed it at our 30K-line monolith. It found a race condition we'd missed for months. In minutes."
Systems engineer, game server team

"The evolution engine is unlike anything else. Watching it improve its own code is wild."
ML researcher, university lab

"Finally, AI coding that doesn't phone home. Our security team signed off in a day."
Staff engineer, defense contractor

RECENT ACTIVITY

latest
VLM benchmark: 19 scenarios, 3 models, 6 difficulty levels
Visual language model evaluation suite
recent
Evolution engine: multi-population tournament selection
Recursive self-improvement via fitness-gated mutation
recent
SAB benchmark: 20 scenarios, 27 rounds, 89/100 grand average
Selfware Agentic Benchmark on Qwen3-Coder-Next FP8

GET STARTED

Three commands.
Your workshop is running.

STEP 1
$ cargo install selfware

Install from crates.io. Requires Rust toolchain.

STEP 2
$ selfware init

Setup wizard: choose your LLM endpoint and model.

STEP 3
$ selfware chat

Start coding. Or selfware run <task> for autonomous mode.

Works with any OpenAI-compatible endpoint: Ollama, vLLM, LM Studio, llama.cpp, SGLang. Primary model: Qwen3.5 Coder.

SAFE CONFIG

# ~/.config/selfware/config.toml
[safety]
allowed_paths = ["./src/**", "./tests/**"]
denied_paths = ["./secrets/**", ".env"]
autonomy_level = 2 # supervised

Start with least-privilege paths. Widen as you build trust.
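
To show how allow and deny rules like these could combine, here is a simplified sketch in which deny rules win and anything not explicitly allowed is rejected. The `pattern_matches` and `is_allowed` functions are hypothetical, and the matcher only understands exact paths and a trailing `/**`; Selfware's real glob handling is not described in the source.

```rust
/// Minimal matcher: `dir/**` matches anything under `dir/`,
/// any other pattern must equal the path exactly.
fn pattern_matches(pattern: &str, path: &str) -> bool {
    match pattern.strip_suffix("/**") {
        Some(dir) => path.starts_with(&format!("{}/", dir)),
        None => pattern == path,
    }
}

/// Deny rules take precedence; paths matching no allow rule are rejected.
fn is_allowed(path: &str, allowed: &[&str], denied: &[&str]) -> bool {
    // Reject parent-directory hops outright instead of trying to resolve them.
    if path.split('/').any(|component| component == "..") {
        return false;
    }
    if denied.iter().any(|p| pattern_matches(p, path)) {
        return false;
    }
    allowed.iter().any(|p| pattern_matches(p, path))
}

fn main() {
    // Same values as the config above.
    let allowed = ["./src/**", "./tests/**"];
    let denied = ["./secrets/**", ".env"];
    for path in ["./src/main.rs", "./secrets/key.pem", ".env", "./Cargo.toml"] {
        println!("{:<20} allowed: {}", path, is_allowed(path, &allowed, &denied));
    }
}
```

A string-level check like this does not resolve symlinks, so a fuller implementation would canonicalize paths before matching.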

CAN I RUN IT?

If you have a GPU, you're ready.

ENTRY
RTX 4060 Ti (16 GB)
Qwen3.5 9B Q4
Context: 16-32K
Everyday coding, moderate tasks
Apple: Mac 24 GB

SWEET SPOT (RECOMMENDED)
RTX 4090 / 3090 (24 GB)
Qwen3.5 27B Q4
Context: 32-64K
Full SAB scenarios, multi-agent
Apple: Mac 64 GB

PRODUCTION
RTX 5090 (32 GB) / H100
Qwen3.5 35B-A3B Q4
Context: 64-128K
Best value. Full evolution engine.
Apple: Mac 96-128 GB

Ollama · vLLM · llama.cpp · LM Studio · MLX · SGLang

Any OpenAI-compatible endpoint. No API keys to cloud services needed.

BEGIN

$ cargo install selfware

or the quick path: ollama pull qwen3-coder && cargo install selfware && selfware init

MIT licensed. Linux, macOS, Windows.
Your hardware. Your data. Your loop.

AT THE EDGE OF CHAOS

Simple rules. Iterated.
Until structure appears.

Conway's Game of Life: three rules, infinite complexity. Nobody designs the gliders or the oscillators. They emerge.
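
For readers who have not seen it, one generation of the Game of Life fits in a few lines. This sketch uses a small bounded grid where cells outside the edges count as dead, which is a simplification of the usual infinite grid; the `Grid` alias and function names are chosen for this example.

```rust
type Grid = Vec<Vec<bool>>;

/// Counts live cells among the up-to-eight neighbors of (row, col).
fn live_neighbors(grid: &Grid, row: usize, col: usize) -> usize {
    let rows = grid.len() as isize;
    let cols = grid[0].len() as isize;
    let mut count = 0;
    for dr in -1isize..=1 {
        for dc in -1isize..=1 {
            if dr == 0 && dc == 0 {
                continue; // skip the cell itself
            }
            let r = row as isize + dr;
            let c = col as isize + dc;
            if r >= 0 && r < rows && c >= 0 && c < cols && grid[r as usize][c as usize] {
                count += 1;
            }
        }
    }
    count
}

/// Applies the rules once: a live cell survives with 2 or 3 live neighbors,
/// a dead cell becomes alive with exactly 3, and every other cell is dead.
fn step(grid: &Grid) -> Grid {
    grid.iter()
        .enumerate()
        .map(|(r, row)| {
            row.iter()
                .enumerate()
                .map(|(c, &alive)| {
                    let n = live_neighbors(grid, r, c);
                    matches!((alive, n), (true, 2) | (true, 3) | (false, 3))
                })
                .collect()
        })
        .collect()
}

fn main() {
    // A "blinker": three cells in a row that flip between horizontal and vertical.
    let blinker: Grid = vec![
        vec![false, false, false],
        vec![true, true, true],
        vec![false, false, false],
    ];
    for row in step(&blinker) {
        let line: String = row.iter().map(|&alive| if alive { '#' } else { '.' }).collect();
        println!("{}", line);
    }
}
```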

Selfware's multi-agent swarm works the same way. Individual agents follow PDVR cycles. The collective behavior is a property of the system, not any single agent.


OPEN SOURCE

Built in the open. Join the loop.

Selfware is MIT licensed. Contributions, bug reports, and ideas are welcome.

Join the inner loop.

Evolution logs, benchmark drops, hardware guides. The signal, not the noise.

No spam. Stored locally on this server. Your data never leaves.