Question 1

Can I run Qwen 2.5 Coder 32B on an NVIDIA GeForce GTX 1660?

Accepted Answer

Partially. The NVIDIA GeForce GTX 1660 has 6 GB of VRAM, well below the 36 GB minimum for Qwen 2.5 Coder 32B at Q8_0, so the model runs only in hybrid CPU+GPU mode, with the layers that do not fit spilled to system RAM. 64 GB of RAM is enough to hold them, but expect significantly slower inference than on a card that holds the whole model.
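To make the "spilling layers to RAM" arithmetic concrete, here is a minimal sketch of how many layers might stay on the GPU. It assumes the model has 64 transformer layers of equal size and that about 1 GB of VRAM is reserved for the KV cache and runtime overhead; both numbers, and the helper name `gpu_layers_that_fit`, are illustrative assumptions rather than values from this page.

```python
import math


def gpu_layers_that_fit(file_size_gb, n_layers, vram_gb, reserved_gb=1.0):
    """Estimate how many layers fit in VRAM, assuming all layers are the same size."""
    per_layer_gb = file_size_gb / n_layers
    # VRAM left after the assumed reservation for KV cache and overhead.
    usable_gb = max(vram_gb - reserved_gb, 0.0)
    return min(n_layers, math.floor(usable_gb / per_layer_gb))


N_LAYERS = 64  # assumption: layer count for a 32B Qwen 2.5 model
VRAM_GB = 6    # GTX 1660

q8_layers = gpu_layers_that_fit(34, N_LAYERS, VRAM_GB)  # Q8_0 file size: 34 GB
q4_layers = gpu_layers_that_fit(19, N_LAYERS, VRAM_GB)  # Q4_K_M file size: 19 GB

print(f"Q8_0:   about {q8_layers} of {N_LAYERS} layers on the GPU")
print(f"Q4_K_M: about {q4_layers} of {N_LAYERS} layers on the GPU")
```

Under these assumptions only about 9 of 64 layers (Q8_0) or 16 of 64 layers (Q4_K_M) stay on the GPU, which is why generation is slow. A figure like this could be used as a starting point for a backend's GPU layer setting (for example llama.cpp's `--n-gpu-layers` option) and then tuned by trial.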

Question 2

What quantization of Qwen 2.5 Coder 32B should I use on an NVIDIA GeForce GTX 1660?

Accepted Answer

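Neither listed quantization fits in the GTX 1660's 6 GB of VRAM, so both run in hybrid CPU+GPU mode. Q8_0 is rated the best fit, at an estimated ~2 tokens/sec. Q4_K_M is smaller (19 GB versus 34 GB) and slightly faster, at an estimated ~3 tokens/sec.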

Question 3

How fast does Qwen 2.5 Coder 32B run on an NVIDIA GeForce GTX 1660?

Accepted Answer

Roughly 2 tokens/sec for Q8_0 and roughly 3 tokens/sec for Q4_K_M. Actual speed depends on context length, backend (Ollama, llama.cpp, LM Studio), and KV cache size.
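Since context length and KV cache size both affect speed and memory use, a rough sketch of how the cache grows with context may help. The architecture values below (64 layers, 8 key/value heads, head dimension 128, 16-bit cache entries) are assumptions about the model and should be checked against its published configuration; `kv_cache_bytes` is a hypothetical helper.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_tokens, n_layers=64, n_kv_heads=8, head_dim=128, bytes_per_value=2):
    """Approximate KV cache size: one key and one value vector per layer, per KV head, per token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_tokens


# The two context sizes listed for this model: 8K and 128K tokens.
for ctx in (8 * 1024, 128 * 1024):
    print(f"{ctx:>6} tokens: {kv_cache_bytes(ctx) / GIB:.0f} GiB of KV cache")
```

With these assumed values the cache takes about 2 GiB at 8K tokens and about 32 GiB at 128K tokens, on top of the model weights, so long contexts leave even less of the 6 GB card available for layers.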

Question 4

What if the NVIDIA GeForce GTX 1660 is not enough for Qwen 2.5 Coder 32B?

Accepted Answer

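Consider upgrading to an Apple M4 Pro with 48 GB of unified memory, which meets the recommended 40 GB target for Q8_0. Alternatively, switch to the smaller Q4_K_M quantization to stay on your current card. Its 21 GB minimum still exceeds 6 GB of VRAM, so it also runs in hybrid CPU+GPU mode, only somewhat faster.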

Quantization	File Size	Min VRAM	Recommended VRAM	Context	Verdict	Estimated tokens/sec
Q4_K_M	19 GB	21 GB	24 GB	8K / 128K	Hybrid CPU+GPU	~3
Q8_0 (best fit)	34 GB	36 GB	40 GB	8K / 128K	Hybrid CPU+GPU	~2
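The verdict column can be reproduced with a simple comparison of available memory against the minimum VRAM figure. The sketch below is one plausible rule, not necessarily the one this page uses: a quantization fits if VRAM alone covers the minimum, runs hybrid if VRAM plus system RAM covers it, and otherwise does not fit. The `Quant` class and `verdict` function are hypothetical names, and the 64 GB of system RAM is taken from the answer to Question 1.

```python
from dataclasses import dataclass


@dataclass
class Quant:
    name: str
    file_gb: float
    min_vram_gb: float
    rec_vram_gb: float


def verdict(q, vram_gb, ram_gb):
    """Classify a quantization for a machine (assumed rule, see lead-in)."""
    if vram_gb >= q.min_vram_gb:
        return "Fits in VRAM"
    if vram_gb + ram_gb >= q.min_vram_gb:
        return "Hybrid CPU+GPU"
    return "Does not fit"


# Rows taken from the table above.
QUANTS = [
    Quant("Q4_K_M", 19, 21, 24),
    Quant("Q8_0", 34, 36, 40),
]

# GTX 1660: 6 GB VRAM, with 64 GB of system RAM.
results = {q.name: verdict(q, vram_gb=6, ram_gb=64) for q in QUANTS}
for name, v in results.items():
    print(f"{name}: {v}")
```

Both rows come out as "Hybrid CPU+GPU" on the GTX 1660, matching the table, while a device with 48 GB of GPU-accessible memory would hold either quantization entirely.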

Can I Run Qwen 2.5 Coder 32B on NVIDIA GeForce GTX 1660?
