A dedicated AI inference node in the mesh network — running local language models, hosting persona bots, processing WTM gaps, gating feedback quality, and anchoring models on IPFS. Your data never leaves the network to get intelligent.
Two llama.cpp instances run side by side, matched to workload complexity. The inference cascade routes each function to the right model automatically.
Gemma 3 1B · Google · Q4_K_M · 769 MB · port 8080 · 8 threads · 2 slots · ~40 tok/s
Functions served locally:
content-summarize · content-rewrite · content-translate · sentiment-analyze · language-detect · quality-score · chat-respond · feedback-generate · explanation-generate
Gemma 3 4B · Google · Q4_K_M · 2.4 GB · port 8081 · 12 threads · 2 slots · ~18 tok/s
Functions served locally:
opinion-objective · opinion-subjective · opinion-insight · opinion-generate · topic-classify · argument-for · argument-against · question-generate
This node runs two Google Gemma 3 models covering 22 built-in functions. Gemma 3 1B handles fast text tasks (~40 tok/s); Gemma 3 4B handles opinion matrices and complex reasoning (~18 tok/s). Both stream responses token-by-token and fall back to api.mutual.ai when all local slots are occupied.
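The two-tier routing described above can be sketched as a simple lookup. The function names and ports come from the cards; the router itself is an illustrative assumption, not the node's actual code.

```python
# Hypothetical sketch of the function-to-tier cascade routing.
# Fast tasks go to the 1B server, reasoning tasks to the 4B server.
FAST = {"content-summarize", "content-rewrite", "content-translate",
        "sentiment-analyze", "language-detect", "quality-score",
        "chat-respond", "feedback-generate", "explanation-generate"}
QUALITY = {"opinion-objective", "opinion-subjective", "opinion-insight",
           "opinion-generate", "topic-classify", "argument-for",
           "argument-against", "question-generate"}

def route(function_name: str) -> int:
    """Return the local llama.cpp port serving a function."""
    if function_name in FAST:
        return 8080   # Gemma 3 1B, ~40 tok/s
    if function_name in QUALITY:
        return 8081   # Gemma 3 4B, ~18 tok/s
    raise KeyError(f"unknown function: {function_name}")
```

Because the mapping lives in one place, adding a function or retiring a model tier is a one-line change.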
Beyond on-demand function calls, this node runs four continuous workloads that keep the mesh ecosystem intelligent and self-improving.
Always-on conversational personas (mesh-guide, topic-scout) use a three-tier escalation: fast RiveScript pattern matching first, then local inference via chat-respond on Gemma 3 1B (~40 tok/s, 8 s timeout), and cloud fallback only if local slots are full. Zero cost for most conversations.
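The three-tier escalation might look like the following sketch. The helper callables (`pattern_reply`, `local_chat_respond`, `cloud_chat_respond`) are hypothetical stand-ins for the persona runtime, not a published API.

```python
# Hedged sketch of persona escalation: patterns -> local model -> cloud.
LOCAL_TIMEOUT_S = 8  # matches the 8 s local-inference timeout above

def persona_reply(message, pattern_reply, local_chat_respond, cloud_chat_respond):
    # Tier 1: fast RiveScript-style pattern match (no inference cost).
    reply = pattern_reply(message)
    if reply is not None:
        return reply
    # Tier 2: chat-respond on Gemma 3 1B, bounded by a timeout.
    try:
        return local_chat_respond(message, timeout=LOCAL_TIMEOUT_S)
    except TimeoutError:
        # Tier 3: cloud fallback only when local slots are busy or slow.
        return cloud_chat_respond(message)
```

Most traffic terminates at tier 1 or 2, which is why most conversations cost nothing.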
The World Topic Map has thousands of topics with missing descriptions, aliases, training phrases, and reference frames. A WtmInferenceRouter maps each gap type to the right function and model tier — fast model for descriptions, quality model for opinion templates — and processes them in batches without touching cloud APIs.
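A gap-type router of this shape could back the WtmInferenceRouter. The specific gap-to-function pairings below are assumptions drawn from the prose, not the node's real routing table.

```python
# Hypothetical WTM gap routing: each gap type maps to a function and
# a model tier ("fast" = Gemma 3 1B, "quality" = Gemma 3 4B).
GAP_ROUTES = {
    "description":      ("content-summarize", "fast"),
    "alias":            ("content-rewrite",   "fast"),
    "training-phrase":  ("question-generate", "quality"),
    "opinion-template": ("opinion-generate",  "quality"),
}

def route_gap(gap_type: str):
    """Map a WTM gap type to (function, model tier)."""
    return GAP_ROUTES[gap_type]

def process_batch(gaps, run):
    """Fill a batch of gaps entirely on local models; `run` is the
    inference callable (function, tier, topic) -> generated text."""
    return [run(*route_gap(g["type"]), g["topic"]) for g in gaps]
```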
Every opinion submitted to the mesh passes through a local coherence check using the quality-score function. The gate evaluates accuracy, clarity, and completeness — low scores dampen the effort weight but never block the submission. No cloud dependency, no added latency for high-quality content.
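The dampen-but-never-block rule can be made concrete. This sketch assumes quality scores in [0, 1] and a linear dampening with a floor; the actual weighting curve is not published.

```python
# Sketch of the coherence gate: low scores reduce influence weight,
# but a floor guarantees the submission is never blocked outright.
def gated_effort_weight(base_weight: float, quality_score: float,
                        floor: float = 0.25) -> float:
    """Scale effort weight by the quality-score result, never below
    `floor` times the base weight."""
    score = max(0.0, min(1.0, quality_score))
    return base_weight * max(floor, score)
```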
Model weights (Gemma 3 1B: 769 MB, Gemma 3 4B: 2.4 GB) and application bundles are pinned to IPFS via the local Kubo daemon. A pin manifest tracks every CID by logical name so other nodes can fetch models peer-to-peer — making 4 TB of RAID-1 storage available as a content-addressed availability layer.
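A pin manifest keyed by logical name might look like this. The CIDs are placeholders, not the node's real content hashes; in the real system pinning goes through the local Kubo daemon (its HTTP RPC exposes `/api/v0/pin/add`).

```python
# Hedged sketch of a pin manifest: logical name -> pinned CID.
import json

PIN_MANIFEST = {
    "gemma-3-1b-q4_k_m": {"cid": "<CID>", "size_mb": 769},
    "gemma-3-4b-q4_k_m": {"cid": "<CID>", "size_mb": 2400},
}

def cid_for(name: str) -> str:
    """Resolve a logical model name to its CID so peer nodes can
    fetch the weights over IPFS instead of a centralized mirror."""
    return PIN_MANIFEST[name]["cid"]

def manifest_json() -> str:
    """Serialize the manifest for publication to other nodes."""
    return json.dumps(PIN_MANIFEST, indent=2)
```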
When mutual.app needs AI, it works through six tiers — always trying local and trusted sources first, escalating toward the cloud only when necessary. This node sits at Tier 5.
Running AI on mesh infrastructure keeps your data inside the network — no queries sent to third-party cloud APIs, no training on your content.
Requests to this node are routed through encrypted peer-to-peer connections. No content is logged, stored, or forwarded outside the mesh.
Intel Xeon Silver 4123 (8c/16t), 96 GB ECC RAM, 8 TB of storage mirrored in RAID-1 (4 TB usable). Two Gemma 3 instances run in parallel — 1B for speed, 4B for depth — using only ~3.2 GB of the 96 GB available. Models are memory-locked (mlock) so their weights never page to disk.
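The two-server layout reduces to two launch commands, built here as argument lists rather than executed. The flags are standard llama.cpp `llama-server` options; the model file paths are placeholders.

```python
# Hedged sketch of the two llama.cpp server launches.
def server_cmd(model_path, port, threads, slots):
    return ["llama-server", "-m", model_path,
            "--port", str(port),
            "--threads", str(threads),
            "--parallel", str(slots),  # concurrent request slots
            "--mlock"]                 # lock weights in RAM, no paging

fast_cmd = server_cmd("gemma-3-1b-q4_k_m.gguf", 8080, 8, 2)
quality_cmd = server_cmd("gemma-3-4b-q4_k_m.gguf", 8081, 12, 2)
```

With `--mlock` set, both models stay resident even under memory pressure, which keeps the ~40 and ~18 tok/s figures stable.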
You call a function by name — summarize, classify, generate opinion, embed. The cascade picks the model tier automatically. No model expertise required, no endpoint to manage.
The cascade picks the best node based on reputation, capacity, and latency — not just availability. Quality inference earns the node more requests.
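A node-selection score over reputation, capacity, and latency could be sketched as below. The weights and the latency decay are illustrative assumptions; the cascade's real formula is not published.

```python
# Hypothetical node scoring: favor reputable, uncongested, nearby nodes.
def node_score(reputation: float, free_slots: int, latency_ms: float) -> float:
    """Higher is better. Reputation dominates, matching the idea
    that quality inference earns a node more requests."""
    capacity = min(free_slots, 2) / 2            # saturates at 2 slots
    closeness = 1.0 / (1.0 + latency_ms / 100)   # decays with latency
    return 0.5 * reputation + 0.3 * capacity + 0.2 * closeness

def pick_node(nodes):
    """nodes: iterable of (name, reputation, free_slots, latency_ms)."""
    return max(nodes, key=lambda n: node_score(*n[1:]))[0]
```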
Conversational personas escalate from RiveScript patterns to local Gemma 3 inference before ever reaching the cloud. Most AI-powered bot replies cost zero and stay on-node.
Model weights and bundles are pinned to IPFS from 4 TB RAID-1 storage. Other nodes fetch models peer-to-peer instead of from centralized mirrors.
Opinion submissions pass through a local coherence check — accuracy, clarity, completeness scored by Gemma 3 1B. Low quality dampens influence weight without blocking participation.
Prometheus metrics at /metrics. Every function's throughput, latency, and error rate is observable. The cascade trusts nodes that are transparent.
Relay and inference nodes form complementary layers. Relay nodes move data; inference nodes process it. Together they make the mesh self-sufficient.
Open mutual.app — your AI requests will cascade through the network, using this node when it's the best available option.
Open mesh.app