Local GPU Rigs and Self-Hosted Inference
May 6, 2026 · 785 messages · 96 active members
@jcartu detailed his multi-GPU 'Rasputin' rig built around several RTX Pro 6000 Blackwell cards (Xeon w9-3495X, 256GB DDR5 ECC, dual 2800W PSUs, water-cooled), targeting 300+ tps on a 27B coding model with Opus scaffolding. He open-sourced an LLM stres…
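For context on the 300+ tps target: decode throughput is usually just generated tokens over wall-clock seconds. A minimal sketch of that calculation (the helper name and example numbers are illustrative, not from the rig's actual benchmarks):

```python
def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Decode throughput: generated tokens divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed_s

# Hypothetical run: 1536 tokens generated in 5.12 s of decoding
print(tokens_per_second(1536, 5.12))  # → 300.0, i.e. the 300 tps target
```

In practice a stress tool would time the generation loop itself (excluding prompt processing) and average over many requests rather than a single sample.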