AI Inference Node Active

Welcome to 6.mesh.eco

A dedicated AI inference node in the mesh network — running local language models, hosting persona bots, processing World Topic Map (WTM) gaps, gating feedback quality, and anchoring models on IPFS. Your data never has to leave the network to get intelligent answers.

6.mesh.eco Online

Live status counters: AI functions · node health · network peers · uptime · IPFS pins · quality gate

Running models

Two llama.cpp instances run side by side, matched to workload complexity. The inference cascade routes each function to the right model automatically.

Gemma 3 1B — Fast

Google · Q4_K_M · 769 MB · port 8080 · 8 threads · 2 slots · ~40 tok/s

Functions served locally:

content-summarize · content-rewrite · content-translate · sentiment-analyze · language-detect · quality-score · chat-respond · feedback-generate · explanation-generate

🧠

Gemma 3 4B — Quality

Google · Q4_K_M · 2.4 GB · port 8081 · 12 threads · 2 slots · ~18 tok/s

Functions served locally:

opinion-objective · opinion-subjective · opinion-insight · opinion-generate · topic-classify · argument-for · argument-against · question-generate
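As a rough sketch of how work is split between the two instances, the lookup below routes each built-in function name to the matching local port. This is illustrative only: the localhost URLs, the helper name, and the idea that routing is a plain lookup are assumptions, not the node's actual API.

```python
# Illustrative function-to-model routing; ports come from the model cards above.

FAST_PORT = 8080      # Gemma 3 1B ("fast" tier)
QUALITY_PORT = 8081   # Gemma 3 4B ("quality" tier)

FAST_FUNCTIONS = {
    "content-summarize", "content-rewrite", "content-translate",
    "sentiment-analyze", "language-detect", "quality-score",
    "chat-respond", "feedback-generate", "explanation-generate",
}

QUALITY_FUNCTIONS = {
    "opinion-objective", "opinion-subjective", "opinion-insight",
    "opinion-generate", "topic-classify", "argument-for",
    "argument-against", "question-generate",
}

def endpoint_for(function_name: str) -> str:
    """Return the local llama.cpp server URL serving a named function."""
    if function_name in FAST_FUNCTIONS:
        return f"http://localhost:{FAST_PORT}"
    if function_name in QUALITY_FUNCTIONS:
        return f"http://localhost:{QUALITY_PORT}"
    raise ValueError(f"unknown function: {function_name}")
```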

Available AI functions

This node runs two Google Gemma 3 models covering 22 built-in functions. Gemma 3 1B handles fast text tasks (~40 tok/s); Gemma 3 4B handles opinion matrices and complex reasoning (~18 tok/s). Both stream responses token-by-token and fall back to api.mutual.ai when all local slots are occupied.
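A minimal sketch of what a streaming call with cloud fallback can look like, assuming the llama.cpp instances expose their standard OpenAI-compatible /v1/chat/completions endpoint; the fallback URL, timeouts, and the choice to treat any local failure as "slots occupied" are assumptions, not the node's real client logic.

```python
import json
import requests

LOCAL_URL = "http://localhost:8080/v1/chat/completions"   # Gemma 3 1B instance
CLOUD_URL = "https://api.mutual.ai/v1/chat/completions"   # placeholder fallback URL

def stream_chat(prompt: str) -> str:
    """Stream a reply token-by-token from the local model, falling back to the cloud."""
    payload = {"messages": [{"role": "user", "content": prompt}], "stream": True}
    try:
        resp = requests.post(LOCAL_URL, json=payload, stream=True, timeout=8)
        resp.raise_for_status()
    except requests.RequestException:
        # Local slots busy or server unreachable: escalate to the cloud tier.
        resp = requests.post(CLOUD_URL, json=payload, stream=True, timeout=30)

    reply = []
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue                       # skip keep-alives and blank SSE lines
        chunk = line[len(b"data: "):]
        if chunk == b"[DONE]":
            break
        delta = json.loads(chunk)["choices"][0]["delta"]
        reply.append(delta.get("content", ""))
    return "".join(reply)
```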


Active workloads

Beyond on-demand function calls, this node runs four continuous workloads that keep the mesh ecosystem intelligent and self-improving.

🤖

Persona bot hosting

Always-on conversational personas (mesh-guide, topic-scout) use a three-tier escalation: fast RiveScript pattern matching first, then local inference via chat-respond on Gemma 3 1B (~40 tok/s, 8 s timeout), and cloud fallback only if local slots are full. Zero cost for most conversations.
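A sketch of that escalation order, under stated assumptions: the pattern table stands in for the real RiveScript brain, and the local and cloud URLs plus the chat payload shape are placeholders rather than the persona service's actual interface.

```python
import re
import requests

PATTERNS = {
    r"\b(hi|hello|hey)\b": "Hello! I'm mesh-guide. Ask me anything about the mesh.",
}

LOCAL_CHAT = "http://localhost:8080/v1/chat/completions"   # Gemma 3 1B, chat-respond
CLOUD_CHAT = "https://api.mutual.ai/v1/chat/completions"   # placeholder cloud fallback

def _chat(url: str, message: str, timeout: float) -> str:
    payload = {"messages": [{"role": "user", "content": message}]}
    resp = requests.post(url, json=payload, timeout=timeout)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def bot_reply(message: str) -> str:
    # Tier 1: RiveScript-style pattern match, effectively free.
    for pattern, reply in PATTERNS.items():
        if re.search(pattern, message, re.IGNORECASE):
            return reply
    # Tier 2: local chat-respond on Gemma 3 1B with an 8 s budget.
    try:
        return _chat(LOCAL_CHAT, message, timeout=8)
    except requests.RequestException:
        # Tier 3: cloud fallback, used only when local inference is full or slow.
        return _chat(CLOUD_CHAT, message, timeout=30)
```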

🧬

WTM gap-fix batch processing

The World Topic Map has thousands of topics with missing descriptions, aliases, training phrases, and reference frames. A WtmInferenceRouter maps each gap type to the right function and model tier — fast model for descriptions, quality model for opinion templates — and processes them in batches without touching cloud APIs.
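A rough sketch of the gap-to-function mapping a WtmInferenceRouter performs. The gap-type keys and most of the function choices below are assumptions; only "descriptions go to the fast model" and "opinion templates go to the quality model" come from the description above.

```python
# Hypothetical gap routing table: gap type -> (function name, model tier).
GAP_ROUTES = {
    "missing-description":     ("explanation-generate", "fast"),     # Gemma 3 1B
    "missing-alias":           ("content-rewrite",      "fast"),
    "missing-training-phrase": ("question-generate",    "quality"),  # Gemma 3 4B
    "missing-reference-frame": ("opinion-generate",     "quality"),
}

def route_gap(gap_type: str) -> tuple[str, str]:
    """Return (function_name, model_tier) for a WTM gap, or raise if unknown."""
    try:
        return GAP_ROUTES[gap_type]
    except KeyError:
        raise ValueError(f"no route for gap type: {gap_type}") from None
```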

🛡

Feedback quality gate

Every opinion submitted to the mesh passes through a local coherence check using the quality-score function. The gate evaluates accuracy, clarity, and completeness — low scores dampen the effort weight but never block the submission. No cloud dependency, no added latency for high-quality content.
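To make the "dampen, never block" behaviour concrete, here is a minimal sketch of the weighting step. The score scale, floor, and curve are illustrative assumptions; only the principle that low scores reduce weight without rejecting the submission is taken from the description above.

```python
def effort_weight(quality_score: float, base_weight: float = 1.0) -> float:
    """Scale an opinion's effort weight by its local coherence score (0.0 to 1.0).

    Low scores reduce influence, but the submission is always accepted.
    """
    score = min(max(quality_score, 0.0), 1.0)   # clamp defensively
    return base_weight * (0.5 + 0.5 * score)    # never drops below half weight

# Example: a submission scored 0.2 keeps 60% of its base weight
# instead of being rejected outright.
```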

📌

IPFS bridge + availability anchor

Model weights (Gemma 3 1B: 769 MB, Gemma 3 4B: 2.4 GB) and application bundles are pinned to IPFS via the local Kubo daemon. A pin manifest tracks every CID by logical name so other nodes can fetch models peer-to-peer — making 4 TB of RAID-1 storage available as a content-addressed availability layer.
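A sketch of how a model file might be pinned and recorded by logical name. The manifest path, schema, and example filename are assumptions; the two Kubo commands (ipfs add, ipfs pin add) are standard CLI calls.

```python
import json
import subprocess
from pathlib import Path

MANIFEST = Path("pin-manifest.json")   # hypothetical manifest location

def pin_model(logical_name: str, file_path: str) -> str:
    # Add the file to the local Kubo node; --quieter prints only the root CID.
    cid = subprocess.run(
        ["ipfs", "add", "--quieter", file_path],
        check=True, capture_output=True, text=True,
    ).stdout.strip()

    # ipfs add pins by default; pin add is an explicit, idempotent confirmation.
    subprocess.run(["ipfs", "pin", "add", cid], check=True)

    # Record the CID under a logical name so peers can look it up by purpose.
    manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    manifest[logical_name] = cid
    MANIFEST.write_text(json.dumps(manifest, indent=2))
    return cid

# Example (hypothetical filename):
# pin_model("gemma-3-1b-q4km", "models/gemma-3-1b-it-Q4_K_M.gguf")
```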

The inference cascade

When mutual.app needs AI, it works through six tiers — always trying local and trusted sources first, escalating toward the cloud only when necessary. This node sits at Tier 5.

1. Your device: mesh.local running on your own machine; fastest, fully private.
2. Your other devices: another device you own with spare inference capacity.
3. Trusted group peers: members of your trust zone who have opted in to share capacity.
4. Enterprise nodes: organisation-provisioned inference within your enterprise boundary.
5. Mesh network nodes (you are here): infrastructure inference servers like this one; shared, auditable, no data retained.
6. api.mutual.ai: cloud fallback via Strato-2 (5.mesh.eco); always available, consent required.
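A small sketch of the cascade loop itself: the tier names mirror the list above, but the probe and dispatch callables, their signatures, and the idea that escalation is a simple in-order scan are assumptions about how a client might use the cascade.

```python
TIERS = [
    "your-device",          # 1: mesh.local on your own machine
    "your-other-devices",   # 2: spare capacity you own
    "trusted-group-peers",  # 3: opted-in trust zone members
    "enterprise-nodes",     # 4: within the enterprise boundary
    "mesh-network-nodes",   # 5: infrastructure nodes like 6.mesh.eco
    "api.mutual.ai",        # 6: cloud fallback (consent required)
]

def run_cascade(function_name: str, payload: dict, can_serve, dispatch):
    """Try each tier in order; escalate only when the current tier cannot serve.

    can_serve(tier, function) and dispatch(tier, function, payload) are
    caller-supplied callables standing in for the real mesh client.
    """
    for tier in TIERS:
        if can_serve(tier, function_name):
            return dispatch(tier, function_name, payload)
    raise RuntimeError("no tier available for " + function_name)
```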

Why local inference matters

Running AI on mesh infrastructure keeps your data inside the network — no queries sent to third-party cloud APIs, no training on your content.

🔒

Data stays in the network

Requests to this node are routed through encrypted peer-to-peer connections. No content is logged, stored, or forwarded outside the mesh.

Dedicated inference hardware

Intel Xeon Silver 4123 (8c/16t), 96 GB ECC RAM, and 8 TB of disk in a RAID-1 mirror (4 TB usable). Two Gemma 3 instances run in parallel — 1B for speed, 4B for depth — using only ~3.2 GB of the 96 GB available. Models are memory-locked (mlock) to prevent paging to disk.

🧩

Functions, not models

You call a function by name — summarize, classify, generate opinion, embed. The cascade picks the model tier automatically. No model expertise required, no endpoint to manage.

🌐

Reputation-weighted routing

The cascade picks the best node based on reputation, capacity, and latency — not just availability. Quality inference earns the node more requests.

🤖

Persona bots run locally

Conversational personas escalate from RiveScript patterns to local Gemma 3 inference before ever reaching the cloud. Most AI-powered bot replies cost zero and stay on-node.

📌

IPFS availability anchor

Model weights and bundles are pinned to IPFS from 4 TB RAID-1 storage. Other nodes fetch models peer-to-peer instead of from centralized mirrors.

🛡

Feedback quality gate

Opinion submissions pass through a local coherence check — accuracy, clarity, completeness scored by Gemma 3 1B. Low quality dampens influence weight without blocking participation.

📊

Open metrics

Prometheus metrics at /metrics. Every function's throughput, latency, and error rate are observable. The cascade trusts nodes that are transparent.
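A quick way to eyeball those metrics from anywhere on the mesh, assuming the endpoint is reachable at the hostname below; the metric names printed are whatever the exporter actually emits, none are invented here.

```python
import requests

# Dump the node's current Prometheus metrics, skipping HELP/TYPE comment lines.
resp = requests.get("http://6.mesh.eco/metrics", timeout=5)
for line in resp.text.splitlines():
    if line and not line.startswith("#"):
        print(line)
```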

The mesh network

Relay and inference nodes form complementary layers. Relay nodes move data; inference nodes process it. Together they make the mesh self-sufficient.

Ready to use the mesh?

Open mutual.app — your AI requests will cascade through the network, using this node when it's the best available option.

Open mesh.app