Semifly · LLMs

Compare large language models

Frontier hosted models and open-weight models you can self-host, side by side. Figures are as of July 2026 and are refreshed regularly.

Interactive

Head-to-head comparison

Pick two to four models and see how they stack up across coding, agentic, reasoning, knowledge and context, with the key spec differences called out.

Capability radar

BenchLM weighted scores (0–100) · June 2026

The Overall score refreshes automatically every week from BenchLM (latest: July 2026). The five radar axes (coding, agentic, reasoning, knowledge, plus a normalised context score) come from a manually curated BenchLM snapshot taken in June 2026. Some open-model axis values are indicative pending full per-benchmark data, and parameter counts for frontier models are undisclosed. Confirm specifics on each model’s official page.

Frontier

Frontier & proprietary models

Leading hosted models, accessed via API. Scores and prices move quickly; figures are indicative.

Model	Lab	Context	Notable	Access
Claude Fable 5	Anthropic	1M+	Tops the Artificial Analysis Intelligence Index (~65); first public Mythos-class model (Jun 9). Access intermittent under US export controls.	API
Claude Opus 4.8	Anthropic	1M+	Highest-scoring widely-available model (AA Index ~61)	API
GPT-5.5	OpenAI	1M+	Strong all-round; Pro / Instant variants (AA Index ~60)	API
Gemini 3.1 Pro	Google	1M+	Top reasoning & data analysis (94%+ GPQA Diamond)	API
Grok 4.3	xAI	~2.0M	Cheapest of the frontier four; strong agentic / tool use	API
Gemini 3.5 Flash	Google	1M	Flagship-level quality at ~4x speed (~$1.50 / 1M in)	API

Sources: LLM-Stats, Morph, LM Council (July 2026).
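
For prices quoted per 1M input tokens, such as the ~$1.50 / 1M figure listed for Gemini 3.5 Flash above, a rough input-cost estimate is simple arithmetic. The sketch below is illustrative: the function name is hypothetical, the price is the indicative figure from the table, and output-token charges (not listed on this page) are ignored.

```python
def estimate_input_cost(input_tokens: int, usd_per_million: float) -> float:
    """Return the approximate USD cost of sending `input_tokens` input tokens."""
    if input_tokens < 0 or usd_per_million < 0:
        raise ValueError("token count and price must be non-negative")
    return input_tokens / 1_000_000 * usd_per_million


# Indicative price from the table above (~$1.50 per 1M input tokens).
cost = estimate_input_cost(250_000, 1.50)
print(f"${cost:.4f}")  # prints $0.3750
```

Because prices move quickly, treat any such estimate as a planning figure and check the provider's current rate card.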

Open source

Open-source & open-weight models

Open-weight models you can download and self-host: run them on your own GPUs or on Semifly in one click.

Model	Developer	Params	Context	License	Download	Run
DeepSeek V4 Pro	DeepSeek	~1.6T (MoE)	1M	MIT	Hugging Face →	Run on Semifly
MiniMax M3	MiniMax	428B (23B active)	1M	Apache 2.0	Hugging Face →	Run on Semifly
GLM-5.1	Zhipu AI	MoE	200K	MIT	Hugging Face →	Run on Semifly
Kimi K2.7-Code	Moonshot AI	~1T (A32B, MoE)	256K	Modified MIT	Hugging Face →	Run on Semifly
Qwen3.5 (397B-A17B)	Alibaba	397B (17B active)	256K	Apache 2.0	Hugging Face →	Run on Semifly
Llama 4 Scout	Meta	MoE	10M	Llama 4	Hugging Face →	Run on Semifly
Llama 4 Maverick	Meta	MoE	1M	Llama 4	Hugging Face →	Run on Semifly
Mistral Small 4	Mistral AI	24B	256K	Apache 2.0	Hugging Face →	Run on Semifly
Gemma 4 (31B)	Google	31B	256K	Gemma	Hugging Face →	Run on Semifly

Confirm the current license on each model’s official page before deployment.
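
When sizing GPUs for self-hosting, a common first approximation is that the weights alone need about (parameter count) × (bytes per parameter) of memory. For mixture-of-experts models, the total parameter count drives memory, since all experts typically have to be resident, while the active count mainly affects per-token compute. The sketch below uses hypothetical helper names and an assumed 80 GB per GPU. It ignores KV cache, activations, and runtime overhead, so treat the result as a lower bound.

```python
import math

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "bf16": 2.0, "fp8": 1.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params_billions: float, precision: str = "bf16") -> float:
    """Approximate memory for weights only, in GB (10^9 bytes)."""
    # (billions of params * 1e9) * bytes per param / 1e9 bytes per GB
    return total_params_billions * BYTES_PER_PARAM[precision]


def min_gpus(weights_gb: float, gpu_memory_gb: float) -> int:
    """Smallest GPU count whose combined memory holds the weights."""
    return math.ceil(weights_gb / gpu_memory_gb)


# Example: a 428B-total-parameter MoE model, as listed for MiniMax M3.
bf16_gb = weight_memory_gb(428, "bf16")   # 856.0
int4_gb = weight_memory_gb(428, "int4")   # 214.0
print(bf16_gb, min_gpus(bf16_gb, 80))     # assumed 80 GB per GPU
print(int4_gb, min_gpus(int4_gb, 80))
```

Real deployments need headroom beyond this figure, and long context windows in particular can make the KV cache a significant share of total memory.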

Run any of these on Semifly

Tokens & API

Access hosted models through a simple, metered token API.

Get API access →

GPU servers

Buy or lease Supermicro GPU systems to self-host open-weight models.

Browse GPU servers →

AI Foundry

Managed compute for training, fine-tuning, and inference.

Explore AI Foundry →