Show HN: Open Access Qwen3.6-35B-A3B-UD-Q5_K_M with TurboQuant
Category: ai-ml
Tags: llm, model-serving, quantization, runpod
Score: 4.7/10 (Innovation: 3, Technical: 7, Documentation: 2, Utility: 4)
This project offers temporary open access to a quantized Qwen3.6 35B language model running on a cloud GPU instance. It is a practical demonstration of serving a large, long-context model with specialized quantization (TurboQuant) for faster inference, but it is one person sharing compute resources for a limited time, not a maintained software project.
Target audience: ai-researchers, ml-engineers, hobbyists