GLM-5.2 Stack, Hermes vs OMP, Realtime Voice — AI Daily Jun 29

414 messages · 73 active members

Top contributors: @jcartu, @samb69, @Kieran

Overview

Today's 414 messages from 73 active builders centered on stack consolidation and the rapid ascent of Chinese open-weight models. The Opus + Codex + GLM-5.2 combo hardened as the consensus stack: @c_1media cancelled all but one Claude sub to make Codex the main driver, and multiple builders flagged Claude 4.8 as more restrictive on edge-case prompts. GLM-5.2 and Qihoo 360's Tulongfeng drew strong reviews: @jcartu's CTO is shipping production code with GLM-5.2, while z.ai and Fireworks were recommended as reliable API hosts. @jcartu also shipped NEONFALL, a polished asteroids game built locally on GLM-5.2 via Oh My Pi (OMP) in a single day, then added mouse control and a leaderboard in response to community feedback.

Orchestration philosophy sharpened around Hermes vs OMP. @samb69 framed it cleanly: 'Hermes = SOPs and workflows, OpenCode/Pi = to build stuff.' Kieran championed OMP after using it to reconcile a Grok Twitter agent and Gemini Flash YT agent into a single 'fable-esque' founder-story ads framework. @aa12on detailed a full e2e creative pipeline (research → dissect → script → image/video → stitch) but warned that full automation only suits high-volume bid-cap plays: refine each component to 10/10 before chaining.

On voice, @samb69 was 'blown away' by GPT realtime's near-instant latency, while ElevenLabs v3 still wins on quality and just shipped a 5-minute audiobook feature.

Parallel threads covered local iron as a hedge against Claude/Codex caps (Cerebras hitting 1800 TPS on Gemma 4 31B reshaped the cost-per-token math), Seedance + Claude pipelines turning around ecom landers and video creatives in under an hour, and Lottie + Figma + Claude replacing Canva for motion social assets. Vibecoders are also increasingly hiring senior devs: @navuud confirmed bringing on @jaypozo to audit architecture and security before scaling.

Topics

@c_1media cancelled all but one Claude sub to make Codex the main driver, keeping Claude only for UI/copy. Multiple builders flagged Claude 4.8 as too restrictive, while GLM-5.2 and Qihoo 360's Tulongfeng drew genuine production praise — @jcartu's CTO ships GLM-5.2 code daily. z.ai and Fireworks lead on reliable API access at 200k–1M context, and @jcartu's NEONFALL asteroids game showed local GLM-5.2 via OMP can ship polished products in a day.

@samb69 crystallized the framing: 'Hermes = SOPs and workflows, OpenCode/Pi = to build stuff.' Kieran championed OMP after using it to reconcile Grok and Gemini Flash research agents into a 'fable-esque' founder-story ads framework. @aa12on detailed a full e2e creative pipeline but agreed full automation only fits high-volume bid-cap plays — refine each component to 10/10 before chaining.
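The "refine each component to 10/10 before chaining" rule can be sketched as a simple gate in front of the pipeline. This is a minimal illustration, not @aa12on's actual setup: the `Stage` class, the `score` field (assumed to be a manual 0-10 rating), and the placeholder stage functions are all hypothetical. Only the stage order comes from the thread.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    run: Callable[[str], str]  # takes the previous artifact, returns the next one
    score: int                 # hypothetical manual quality rating, 0-10


def unready_stages(stages: List[Stage], threshold: int = 10) -> List[str]:
    """Return the names of stages still rated below the threshold."""
    return [s.name for s in stages if s.score < threshold]


def run_chain(stages: List[Stage], brief: str, threshold: int = 10) -> str:
    """Run stages in order, but only if every stage has reached the threshold."""
    blockers = unready_stages(stages, threshold)
    if blockers:
        raise RuntimeError("refine before chaining: " + ", ".join(blockers))
    artifact = brief
    for stage in stages:
        artifact = stage.run(artifact)
    return artifact


def make_stage(name: str, score: int) -> Stage:
    """Placeholder stage that just tags the artifact with its own name."""
    return Stage(name=name, run=lambda artifact: artifact + " > " + name, score=score)


# Stage order as described in the thread.
PIPELINE_ORDER = ["research", "dissect", "script", "image/video", "stitch"]
```

With every stage rated 10, `run_chain` executes end to end; if any stage is rated lower, it refuses to run and names the stages that still need work, which keeps a weak component from silently degrading everything downstream.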

@samb69 tested GPT realtime against ElevenLabs and was 'blown away' by GPT realtime's near-instant latency, while @jrizzolo countered that realtime-2 sounds robotic compared to ElevenLabs. Consensus: ElevenLabs v3 wins on quality (and ships a proper audiobook in 5 minutes), but GPT realtime is the latency king for live voice agents.

Seedance is becoming the go-to for AI video ad creatives, with members shipping landers and assets to account managers in under an hour. Lottie + Figma + Claude was pitched as a Canva replacement for motion social posts, and native AI ads were framed as the new Pixar-style production layer for ecom teams.

Rising memory, CPU, and Apple silicon prices have builders weighing local rigs as a hedge against Claude/Codex rate cuts. @jcartu showed off a waterblock-bound multi-GPU setup running Sero reaps toward a 75 Terminal-bench target. Cerebras hitting 1800 TPS on Gemma 4 31B sharpened the inference-cost story, though cloud still wins on productivity today.
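For a rough feel of what 1800 TPS means for the cost-per-token math, here is a back-of-the-envelope sketch. It assumes the quoted figure is a sustained output rate, and the hourly cost is a placeholder input you supply yourself; no real pricing is implied. Both function names are hypothetical.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to emit `tokens` at a sustained output rate."""
    return tokens / tokens_per_second


def cost_per_million_tokens(hourly_cost: float, tokens_per_second: float) -> float:
    """Cost per 1M output tokens for hardware or hosting billed by the hour.

    `hourly_cost` is a placeholder (any currency); plug in your own number.
    Assumes the rate is sustained for the full hour.
    """
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


# At 1800 TPS, 90,000 output tokens take 50 seconds.
seconds = generation_seconds(90_000, 1800)
```

At 1800 TPS a single stream produces 6.48 million tokens per hour, so the per-million cost is simply the hourly cost divided by 6.48. Running the same function with a local rig's throughput and amortized hourly cost gives a like-for-like comparison.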

Key Takeaways

  • Consensus stack: Opus for scaffolding, GLM-5.2 local for coding, Codex as main driver — Claude 4.8 widely flagged as too restrictive on edge cases.
  • Frame Hermes as your SOP/workflow layer and OMP/OpenCode as your build layer; don't chain components end-to-end until each is 10/10.
  • GPT realtime wins on voice latency (freakishly instant), ElevenLabs v3 still wins on quality and just shipped a 5-minute audiobook feature.
  • Seedance + Claude pipelines and Lottie + Figma + Claude are replacing Canva-style workflows for ecom video and motion social assets.
  • Cerebras serving Gemma 4 31B at 1800 TPS reshapes inference economics, but most builders still find cloud Claude/Codex too productive to leave for local rigs.

Hot Threads

@jcartu started

NEONFALL asteroids game built locally with GLM-5.2 + OMP

28 replies · 12 participants

Kieran started

OMP for founder-story ads research framework

22 replies · 5 participants

@expadz started

Most reliable platform for GLM-5.2 API access

9 replies · 4 participants

Linked Items