rtx-3060

Here is 1 public repository matching this topic...

Wierzbowski-Alien / qwen-coder-w4a16-demo

DeepSeek-R1 7B INT4 at 69.3 tok/s on a $300 RTX 3060. Faster than llama.cpp, vLLM, and NVIDIA TensorRT-LLM. Is one developer + AI really better than the entire industry?

inference-engine cachyos local-llm speculative-decoding deepseek-r1 cuda-optimization rtx-3060 w4a16

Updated May 19, 2026
Python
