Yaarbal — distributed LLM inference.

Credit-metered P2P inference for every model.

Yaarbal is a BitTorrent-inspired peer-to-peer network for LLM inference. Seeders host complete models and earn credits; running inference spends them. Install once — every model in the registry is one command away.

Terminal recording: pip install yaarbal, then yaarbal run qwen2.5-0.5b-instruct "hello", with streamed Qwen 2.5 token output.

Recorded with v0.1.3. The install command shown is identical for v0.1.4 and later; only the resolver underneath changed.

Try it in 2 minutes

Requires Python 3.11 or newer on macOS or Linux. The first run downloads a ~600 MB Qwen 2.5 0.5B model.

pip install yaarbal && yaarbal run qwen2.5-0.5b-instruct "hello"
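
If the install fails, it may help to confirm the stated requirements first. The following is a minimal sketch using only the Python standard library. It is not part of Yaarbal, and the function name `check_requirements` is hypothetical; it only checks the Python version and operating system listed above.

```python
import platform
import sys

MIN_PYTHON = (3, 11)
# platform.system() reports "Darwin" on macOS and "Linux" on Linux.
SUPPORTED_SYSTEMS = {"Darwin", "Linux"}


def check_requirements(version=None, system=None):
    """Return a list of problems; an empty list means the requirements are met.

    Hypothetical helper for illustration only. The arguments default to the
    running interpreter and OS, and can be overridden for testing.
    """
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    system = platform.system() if system is None else system
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found; 3.11 or newer is required"
        )
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"{system} is not listed as supported; use macOS or Linux")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Running the script prints nothing when both requirements are satisfied.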

How it works

Diagram: seeders host complete models, leechers run inference against the swarm, and a tracker matches them. Credits flow from leechers to seeders.

Seeders host complete models and earn credits. Leechers spend credits to run inference. The tracker matches them — no central inference provider.
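
The credit flow described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Yaarbal's implementation: the `Peer` and `Tracker` classes, the first-match policy, the flat per-request `cost`, and the idea that the tracker itself moves credits are all hypothetical choices made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    credits: int = 0
    models: set = field(default_factory=set)  # complete models this peer seeds


class Tracker:
    """Toy tracker (hypothetical): matches a leecher to a seeder and moves credits."""

    def __init__(self):
        self.peers = []

    def register(self, peer):
        self.peers.append(peer)

    def find_seeder(self, model, exclude):
        # Assumed policy: first registered peer, other than the requester,
        # that hosts the complete model.
        for peer in self.peers:
            if peer is not exclude and model in peer.models:
                return peer
        return None

    def run_inference(self, leecher, model, cost):
        seeder = self.find_seeder(model, exclude=leecher)
        if seeder is None:
            raise LookupError(f"no seeder for {model}")
        if leecher.credits < cost:
            raise ValueError("insufficient credits")
        # Credits flow from the leecher to the seeder.
        leecher.credits -= cost
        seeder.credits += cost
        return seeder


tracker = Tracker()
alice = Peer("alice", models={"qwen2.5-0.5b-instruct"})  # seeder
bob = Peer("bob", credits=10)                            # leecher
tracker.register(alice)
tracker.register(bob)

matched = tracker.run_inference(bob, "qwen2.5-0.5b-instruct", cost=3)
print(matched.name, alice.credits, bob.credits)
```

In this toy run the total number of credits is conserved: whatever the leecher spends, the matched seeder earns. How the real network prices a request or records balances is not specified here.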