The Nvidia GTX 1650 has 4 GB of memory. At 4-bit quantization (8K context), 18 of 78 popular local LLMs fit comfortably. Full list below, smallest first.
| Model | Params | VRAM (Q4) | On 4 GB |
|---|---|---|---|
| nomic-embed-text v1.5 (137M) | 0.137B | 2.1 GB | Fits |
| jina-reranker-v2 (300M) | 0.3B | 2.2 GB | Fits |
| EmbeddingGemma (308M) | 0.308B | 2.2 GB | Fits |
| bge-large-en-v1.5 (335M) | 0.335B | 2.2 GB | Fits |
| bge-reranker-large (335M) | 0.335B | 2.2 GB | Fits |
| stella-en-400M (435M) | 0.435B | 2.3 GB | Fits |
| nomic-embed-text v2 MoE (475M) | 0.475B | 2.3 GB | Fits |
| Qwen 2.5 Coder (0.5B) | 0.5B | 2.3 GB | Fits |
| bge-m3 (567M) | 0.567B | 2.4 GB | Fits |
| jina-embeddings-v3 (570M) | 0.57B | 2.4 GB | Fits |
| Qwen 3 (0.6B) | 0.6B | 2.4 GB | Fits |
| Qwen 3 Embedding (0.6B) | 0.6B | 2.4 GB | Fits |
| Qwen 3 Reranker (0.6B) | 0.6B | 2.4 GB | Fits |
| Gemma 3 (1B) | 1B | 2.7 GB | Fits |
| DeepSeek-R1 Distill (1.5B) | 1.5B | 3.0 GB | Fits |
| Qwen 2.5 Coder (1.5B) | 1.5B | 3.0 GB | Fits |
| Qwen 3 (1.7B) | 1.7B | 3.1 GB | Fits |
| Gemma 3n (E2B) | 2B | 3.3 GB | Fits |
| SmolLM3 (3B) | 3B | 4.0 GB | Tight |
| Llama 3.2 (3B) | 3B | 4.0 GB | Tight |
| Qwen 2.5 Coder (3B) | 3B | 4.0 GB | Tight |
| StarCoder 2 (3B) | 3B | 4.0 GB | Tight |
| Qwen 2.5 VL (3B) | 3B | 4.0 GB | Tight |
| Phi-4 Mini (3.8B) | 3.8B | 4.5 GB | Tight |
| Gemma 3 (4B) | 4B | 4.7 GB | Tight |
| Qwen 3 (4B) | 4B | 4.7 GB | Tight |
| Qwen 3 VL (4B) | 4B | 4.7 GB | Tight |
| Phi-4 Multimodal (5.6B) | 5.6B | 5.8 GB | Won’t fit |
| DeepSeek-R1 Distill (7B) | 7B | 6.7 GB | Won’t fit |
| StarCoder 2 (7B) | 7B | 6.7 GB | Won’t fit |
| Qwen 2.5 Coder (7B) | 7.2B | 6.8 GB | Won’t fit |
| Qwen 2.5 VL (7B) | 7.2B | 6.8 GB | Won’t fit |
| Llama 3.1 (8B) | 8B | 7.4 GB | Won’t fit |
| DeepSeek-R1 Distill (8B) | 8B | 7.4 GB | Won’t fit |
| Qwen 3 VL (8B) | 8B | 7.4 GB | Won’t fit |
| InternVL3 (8B) | 8B | 7.4 GB | Won’t fit |
| LLaVA 1.6 (8B) | 8B | 7.4 GB | Won’t fit |
| Qwen 3 (8B) | 8.2B | 7.5 GB | Won’t fit |
| Gemma 2 (9B) | 9.2B | 8.2 GB | Won’t fit |
| Llama 3.2 Vision (11B) | 11B | 9.4 GB | Won’t fit |
| Gemma 3 (12B) | 12B | 10.0 GB | Won’t fit |
| Pixtral (12B) | 12B | 10.0 GB | Won’t fit |
| DeepSeek-R1 Distill (14B) | 14B | 11.4 GB | Won’t fit |
| Phi-4 (14B) | 14B | 11.4 GB | Won’t fit |
| Qwen 3 (14B) | 14.8B | 11.9 GB | Won’t fit |
| Qwen 2.5 Coder (14B) | 14.8B | 11.9 GB | Won’t fit |
| StarCoder 2 (15B) | 15B | 12.1 GB | Won’t fit |
| DeepSeek-Coder-V2 Lite (16B) | 16B | 12.7 GB | Won’t fit |
| gpt-oss (20B MoE) | 21B | 16.1 GB | Won’t fit |
| Devstral Small (24B) | 24B | 18.1 GB | Won’t fit |
| Codestral 25.01 (24B) | 24B | 18.1 GB | Won’t fit |
| Gemma 4 (26B MoE) | 26B | 19.4 GB | Won’t fit |
| Gemma 3 (27B) | 27B | 20.1 GB | Won’t fit |
| Qwen 3 (30B-A3B MoE) | 30.5B | 22.4 GB | Won’t fit |
| Qwen 3 Coder (30B-A3B MoE) | 30.5B | 22.4 GB | Won’t fit |
| DeepSeek-R1 Distill (32B) | 32B | 23.4 GB | Won’t fit |
| Qwen 3 VL (32B) | 32B | 23.4 GB | Won’t fit |
| Qwen 2.5 (32B) | 32.5B | 23.8 GB | Won’t fit |
| Qwen 2.5 Coder (32B) | 32.5B | 23.8 GB | Won’t fit |
| Qwen 3 (32B) | 32.8B | 24.0 GB | Won’t fit |
| Qwen 3.6 (35B-A3B MoE) | 35B | 25.4 GB | Won’t fit |
| Llama 3.3 (70B) | 70.6B | 49.3 GB | Won’t fit |
| DeepSeek-R1 (70B Distill) | 70.6B | 49.3 GB | Won’t fit |
| Qwen 2.5 VL (72B) | 72B | 50.2 GB | Won’t fit |
| InternVL3 (78B) | 78B | 54.3 GB | Won’t fit |
| Llama 3.2 Vision (90B) | 90B | 62.3 GB | Won’t fit |
| GLM-4.5 Air (106B-A12B) | 106B | 73.0 GB | Won’t fit |
| Llama 4 Scout (109B MoE) | 109B | 75.0 GB | Won’t fit |
| gpt-oss (120B MoE) | 117B | 80.4 GB | Won’t fit |
| Mistral Large 2 (123B) | 123B | 84.4 GB | Won’t fit |
| DeepSeek-Coder-V2 (236B) | 236B | 160.1 GB | Won’t fit |
| GLM-4.6 (355B-A32B) | 355B | 239.9 GB | Won’t fit |
| Llama 4 Maverick (400B MoE) | 400B | 270.0 GB | Won’t fit |
| Llama 3.1 (405B) | 405B | 273.4 GB | Won’t fit |
| Qwen 3 Coder (480B-A35B) | 480B | 323.6 GB | Won’t fit |
| DeepSeek-V3.1 (671B) | 671B | 451.6 GB | Won’t fit |
| DeepSeek-R1 (671B Full) | 671B | 451.6 GB | Won’t fit |
| Kimi K2 (1T MoE) | 1000B | 672.0 GB | Won’t fit |
"Tight" means it only fits with little headroom — close other GPU apps or expect some system-RAM offload. For models that won't fit, drop to a smaller model, use 2-bit, or step up VRAM.
Get the exact number for your setup
Pick your model, quantization, and context length — the calculator shows the full VRAM math and tells you precisely which hardware fits.
Open the Local AI Calculator →
VRAM figures are reproducible estimates (weights + KV cache + overhead) and vary by runtime and quant format. Data current as of 2026-06-15.