Show HN: Open Access Qwen3.6-35B-A3B-UD-Q5_K_M with TurboQuant

Category: ai-ml

Tags: llm, model-serving, quantization, runpod

Score: 4.7/10 (Innovation: 3, Technical: 7, Documentation: 2, Utility: 4)

This project offers temporary open access to a quantized Qwen3.6 35B language model running on a cloud GPU instance. It is interesting as a practical demonstration of serving a large, long-context model with specialized quantization (TurboQuant) intended to improve inference speed, but it is a personal sharing of compute resources rather than a maintained software project.

Target audience: ai-researchers, ml-engineers, hobbyists
