GLM-5.2 Stack, Hermes vs OMP, Realtime Voice — AI Daily Jun 29
414 messages · 73 active members
Overview
Topics
@c_1media cancelled all but one Claude sub to make Codex the main driver, keeping Claude only for UI/copy. Multiple builders flagged Claude 4.8 as too restrictive, while GLM-5.2 and Qihoo 360's Tulongfeng drew genuine production praise — @jcartu's CTO ships GLM-5.2 code daily. z.ai and Fireworks lead on reliable API access at 200k–1M context, and @jcartu's NEONFALL asteroids game showed local GLM-5.2 via OMP can ship polished products in a day.
@samb69 crystallized the framing: 'Hermes = SOPs and workflows, OpenCode/Pi = to build stuff.' Kieran championed OMP after using it to reconcile Grok and Gemini Flash research agents into a 'fable-esque' founder-story ads framework. @aa12on detailed a full e2e creative pipeline but agreed that full automation only fits high-volume bid-cap plays, and that each component should be refined to 10/10 before chaining.
@samb69 tested both GPT realtime and ElevenLabs and was 'blown away' by GPT realtime's near-instant latency, while @jrizzolo countered that realtime-2 sounds robotic compared to ElevenLabs. Consensus: ElevenLabs v3 wins on quality (and ships a proper audiobook in 5 minutes), but GPT realtime is the latency king for live voice agents.
Seedance is becoming the go-to for AI video ad creatives, with members shipping landers and assets to account managers in under an hour. Lottie + Figma + Claude was pitched as a Canva replacement for motion social posts, and native AI ads were framed as the new Pixar-style production layer for ecom teams.
Rising memory, CPU, and Apple silicon prices have builders weighing local rigs as a hedge against Claude/Codex rate cuts. @jcartu showed off a waterblock-bound multi-GPU setup running Sero reaps toward a 75 Terminal-bench target. Cerebras hitting 1800 TPS on Gemma 4 31B sharpened the inference-cost story, though cloud still wins on productivity today.
Key Takeaways
- Consensus stack: Opus for scaffolding, GLM-5.2 local for coding, Codex as main driver — Claude 4.8 widely flagged as too restrictive on edge cases.
- Frame Hermes as your SOP/workflow layer and OMP/OpenCode as your build layer; don't chain components end-to-end until each is 10/10.
- GPT realtime wins on voice latency (freakishly instant), while ElevenLabs v3 still wins on quality and just shipped a 5-minute audiobook feature.
- Seedance + Claude pipelines and Lottie + Figma + Claude are replacing Canva-style workflows for ecom video and motion social assets.
- Cerebras serving Gemma 4 31B at 1800 TPS reshapes inference economics, but most builders still find cloud Claude/Codex too productive to leave for local rigs.
Hot Threads
NEONFALL asteroids game built locally with GLM-5.2 + OMP
OMP for founder-story ads research framework
Most reliable platform for GLM-5.2 API access