Qwen 27B on Blackwell: Local Coding at Hardware Cost
May 12, 2026 · 559 messages · 81 active members
@jcartu spent 4 days optimizing a vLLM fork (MTP-3 to D-Flash to D-Tree) on Blackwell, with Qwen 27B running well on a single RTX 5090 and a TP4 setup targeting 200 tok/s on the 397B model. Claims Qwen 27B + Opus planning delivers 4-6x faster coding…