The internet will happily talk you into a dual-RTX-5090 monster. You almost certainly don't need it. A genuinely useful local model — one that drafts emails, summarizes documents, and writes code — runs on hardware that costs less than a mid-range phone if you buy smart.
The one rule: VRAM > everything
For local LLMs, the amount of VRAM on your GPU sets the ceiling on what you can run, because the model weights, the KV cache, and some runtime overhead all have to fit in it. Raw speed comes second. A cheap card with 8 GB beats an expensive card with 6 GB for this job. So the cheapest useful LLM PC is really "the cheapest path to 8–12 GB of VRAM."
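To make the rule concrete, here is a rough back-of-the-envelope estimate of VRAM use as weights + KV cache + overhead. It is a sketch, not a measurement: the function name, the 4.5 bits per weight used for "Q4", the 1 GB overhead, and the example architecture (32 layers, 8 KV heads, head size 128, roughly the shape of a typical 8B model) are all assumptions chosen for illustration, and real usage varies by runtime.

```python
def estimate_vram_gb(params_billion, bits_per_weight, n_layers, n_kv_heads,
                     head_dim, context_len, kv_bytes=2, overhead_gb=1.0):
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes): weights + KV cache + overhead."""
    # Weights: parameter count times bits per weight, converted to bytes.
    weights_gb = params_billion * bits_per_weight / 8
    # KV cache: one key and one value vector per layer, per KV head, per token.
    kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes
    # overhead_gb is an assumed flat allowance for runtime buffers.
    return weights_gb + kv_cache_bytes / 1e9 + overhead_gb


# Illustrative 8B-class model at ~Q4 with an 8K context and a 16-bit KV cache.
estimate = estimate_vram_gb(8, 4.5, 32, 8, 128, 8192)
print(f"Estimated VRAM: {estimate:.1f} GB")
```

Under these assumptions the estimate comes to about 6.6 GB, which is why an 8 GB card handles this class of model while a 6 GB card does not.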
The budget tiers
| Budget tier | GPU pick | Runs |
|---|---|---|
| Rock-bottom | Used RTX 3060 12 GB | Up to a 14B model at Q4 — genuinely capable. |
| Value | RTX 5060 / 4060 8 GB | 7B–8B models fast, with new-card warranty. |
| No GPU | Any modern CPU + 16 GB RAM | 1B–4B models on CPU only. Slow but free. |
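The model sizes in the table can be sanity-checked by running the arithmetic the other way: start from a card's VRAM, set some aside for the KV cache and runtime overhead, and see how many parameters fit. This is a minimal sketch; `max_params_billion`, the 2.5 GB reserve, and the 4.5 bits per weight for "Q4" are illustrative assumptions, not figures from any specific runtime.

```python
def max_params_billion(vram_gb, bits_per_weight=4.5, reserve_gb=2.5):
    """Largest model (in billions of parameters) whose weights fit in VRAM
    after reserving an assumed allowance for KV cache and overhead."""
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # GB of weights -> billions of parameters at the given bits per weight.
    return usable_gb * 8 / bits_per_weight


for card, vram_gb in [("RTX 3060 12 GB", 12), ("RTX 5060 / 4060 8 GB", 8)]:
    print(f"{card}: about {max_params_billion(vram_gb):.1f}B parameters at ~Q4")
```

With these assumptions a 12 GB card tops out near 17B and an 8 GB card near 10B, which lines up with the 14B and 7B–8B picks above once you leave a little headroom.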
The rest of the build
Around the GPU, keep it boring: any 6-core (or better) CPU from the last few years, 16 GB of RAM minimum (32 GB if you want to offload bigger models), and a small NVMe SSD so model files load quickly. You do not need a top-tier CPU — for inference it mostly sits and waits on the GPU.
Where "cheap" starts to hurt
Two places. First, 6 GB cards: they technically work but box you into 1B–4B models and short contexts. Second, skimping on RAM: with only 8 GB of system RAM, most of which the operating system and your apps already use, you have almost no room to offload model layers, so the GPU's VRAM becomes a hard wall. Once the GPU is chosen, spend the extra few dollars on RAM before upgrading anything else.
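The RAM point is easier to see with numbers. The sketch below splits a model's layers between GPU and system RAM and checks whether the spilled layers fit. It assumes equally sized layers, a 1.5 GB VRAM reserve for KV cache and overhead, and 4 GB of RAM held back for the operating system; `plan_offload` and the example model (18 GB of weights across 64 layers) are hypothetical, and real runtimes split memory less evenly.

```python
def plan_offload(model_gb, n_layers, vram_gb, ram_gb,
                 vram_reserve_gb=1.5, ram_reserve_gb=4.0):
    """Split layers between GPU and CPU, assuming every layer is the same size."""
    per_layer_gb = model_gb / n_layers
    gpu_budget_gb = max(vram_gb - vram_reserve_gb, 0.0)
    # Put as many whole layers on the GPU as the VRAM budget allows.
    gpu_layers = min(n_layers, int(gpu_budget_gb // per_layer_gb))
    cpu_layers = n_layers - gpu_layers
    cpu_gb = cpu_layers * per_layer_gb
    ram_budget_gb = max(ram_gb - ram_reserve_gb, 0.0)
    return {
        "gpu_layers": gpu_layers,
        "cpu_layers": cpu_layers,
        "cpu_gb": cpu_gb,
        "fits": cpu_gb <= ram_budget_gb,
    }


# Hypothetical 18 GB model on a 12 GB card, with 8, 16, and 32 GB of system RAM.
for ram_gb in (8, 16, 32):
    plan = plan_offload(model_gb=18.0, n_layers=64, vram_gb=12, ram_gb=ram_gb)
    print(f"{ram_gb} GB RAM: {plan['gpu_layers']} layers on GPU, "
          f"{plan['cpu_gb']:.1f} GB offloaded, fits={plan['fits']}")
```

In this example the spilled layers need roughly 7.6 GB, which does not fit in an 8 GB machine but fits comfortably at 16 GB or 32 GB.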
We may partner with companies or groups to recommend hardware products based on user needs, and we may earn a commission from qualifying purchases. Recommendations are based on VRAM requirements (weights + KV cache + overhead) and may vary by runtime. Data current as of June 2026.