EULLM
Legal IT — first vertical · dataset nearly complete · Q3 2026

7B specialists that outperform
70B generalists

on your regulated domain, on your EU infrastructure.

EuLLM distills and verticalizes open-weight models into lean 7B specialists for legal, medical, finance, and technical domains — EU-hosted, AI Act compliance cards included.

Built for legal teams, compliance officers, medical institutions, and engineering organizations that can't send data to US clouds.

70B → 7B
Model compression
~50×
Lower inference cost vs frontier
259 tok/s
Inference throughput
AI Act
Compliance cards built-in
EIC 2026
Accelerator applicant
Forge

The vertical model foundry

Take a 70B frontier model and distill it into a lean 7B or 4B domain specialist that outperforms the original in your target field. Less compute, more precision, full EU sovereignty.

  • Structural pruning — remove irrelevant capacity without retraining from scratch
  • Knowledge distillation — transfer domain expertise into a smaller model
  • Quantization — maximize throughput on your existing hardware
  • Identity fine-tuning — custom persona, brand voice, and instructions
  • GGUF export — runs instantly on Engine, no extra tooling
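The knowledge-distillation step above can be illustrated with a minimal, generic sketch: the student is trained to match the teacher's temperature-softened output distribution via a KL-divergence loss. This is textbook distillation math, not EuLLM's actual Forge training code; the function names are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and student's softened
    distributions — the standard soft-label distillation objective."""
    p = softmax(teacher_logits, temperature)  # teacher "soft labels"
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that already matches the teacher incurs zero loss;
# a diverging student incurs a positive loss to minimize.
matched = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
diverged = distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
```

A higher temperature flattens the teacher distribution, exposing the relative probabilities of wrong answers ("dark knowledge") that plain hard-label training discards.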
  1. Select base model — Any Apache 2.0 frontier model (70B, 32B…)
  2. Define your domain — Legal, medical, finance, technical…
  3. Run Forge pipeline — Prune → Distill → Quantize → Fine-tune
  4. Export & deploy — GGUF output, runs instantly on Engine
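The Quantize stage can be sketched in its simplest form: symmetric per-tensor int8 quantization, where each float weight is mapped to an integer in [-127, 127] with a single scale factor. This is a toy illustration of the idea GGUF-style quantizers build on, not Engine's actual quantization scheme.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: one per-tensor scale maps
    floats into [-127, 127]. A simplified sketch of the approach
    that block-wise GGUF quantizers refine."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values."""
    return [qi * scale for qi in q]

weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
# Each recovered weight lies within half a quantization step
# (scale / 2) of the original — the price paid for 4x smaller storage.
```

Production quantizers split tensors into small blocks with per-block scales to keep this rounding error low even when a few outlier weights are large.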

Hub

EU-based model registry

Pre-specialized vertical models for regulated European industries, hosted entirely within the EU. Every model ships with an AI Act compliance card.

First vertical — dataset ready · Q3 2026

Legal IT

Contract analysis, GDPR assessment, EU regulatory compliance — Italian jurisdiction

  • Trained on curated Italian jurisprudence, civil code, and EU regulatory corpus
  • 7B parameters — AI Act high-risk compliance card included
  • Distilled from a 70B frontier open-weight model

Medical

Clinical documentation, ICD coding, patient triage support

Coming later in 2026

Finance

Risk assessment, KYC automation, MiFID II compliance reporting

Coming later in 2026

Need a vertical we don't have yet?

We build custom specialist models for your regulated domain.

Talk to us →

All Hub models use exclusively Apache 2.0 licensed weights — white-label sovereignty for European businesses. See full roadmap →

Engine

The runtime that makes it all fast

A production-ready inference server built in Rust — a drop-in Ollama replacement with an OpenAI-compatible API. Engine is what powers every EuLLM vertical model at 259 tok/s with zero non-EU telemetry.

  • Continuous batching — 259 tok/s with 16 concurrent requests
  • GPU acceleration: NVIDIA CUDA, AMD ROCm, Vulkan, Apple Metal
  • TurboQuant KV cache compression — 131K context on 16 GB GPUs
  • Built-in audit logging for EU AI Act compliance
  • Transparent web browsing without function-calling overhead
  • Prebuilt binaries for Linux and macOS (x64 & ARM64)
terminal

# Download Engine (Linux x64)
curl -L https://github.com/eullm/eullm/releases/latest/download/eullm-linux-x64 -o eullm
chmod +x eullm

# Run a model
./eullm run ./model.gguf --batch-size 16

# OpenAI-compatible API on :11434
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3","messages":[{"role":"user","content":"Ciao!"}]}'
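Because the API is OpenAI-compatible, the same request works from any language with no SDK. A minimal Python sketch using only the standard library, assuming Engine is serving on localhost:11434 as in the curl example; `build_chat_request` is just an illustrative helper and `qwen3` mirrors the model name above.

```python
import json
import urllib.request

# Engine's OpenAI-compatible endpoint, per the curl example above.
API_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model, user_message):
    """Build an OpenAI-style chat completion request for Engine."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("qwen3", "Ciao!")
# To send it against a running Engine instance:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)["choices"][0]["message"]["content"]
```

Any existing OpenAI client library can be pointed at the same base URL instead, which is what makes Engine a drop-in replacement.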

Sovereign AI starts here

Your data stays in Europe. Your models carry your brand. No API dependencies, no vendor lock-in.