Hermes Stack, GLM-5.2 Coding, Ornith-1.0 — AI Daily Jun 28
651 messages · 79 active members
Overview
Topics
Builders compared the three Hermes Mac apps vs Telegram (Telegram wins for speed), dug into the new /moa mixture-of-agents debate/synthesis mode (powerful but slow), and circulated a 15-level mastery roadmap covering foundation → leverage → autonomy. VPS + tmux + /remote-control became the standard pattern for keeping agents working when laptops are closed.
@jcartu's stack hardened into a community pattern: Opus 4.6/4.7 for scaffolding/PRDs (4.8 widely panned), GLM-5.2 locally for coding (rated on par with Opus 4.6–4.7), and Gemini Flash 2.5 (270 tps, 1M ctx, near-perfect tool calls) for Hermes orchestration. DeepReinforce also dropped Ornith-1.0, a self-scaffolding agentic-coding LLM using a frozen judge + deterministic monitor to block reward hacking. GLM-5.5 weights drop in August.
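The frozen judge + deterministic monitor idea can be illustrated with a small reward-gating sketch. This is not Ornith-1.0's actual implementation, which the thread did not describe in detail. The `Episode` fields, the two monitor rules, and the `judge` callable are all hypothetical stand-ins. The assumption is that the judge is a fixed scoring function (frozen, so the policy cannot train it into leniency) and that rule-based checks can veto its score.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Episode:
    """One agent rollout (hypothetical shape, for illustration only)."""
    touched_files: List[str]
    tests_passed: bool
    transcript: str


def monitor_violations(episode: Episode) -> List[str]:
    """Deterministic, rule-based checks. The rules here are assumed examples."""
    violations = []
    # Assumed rule 1: editing the test suite counts as tampering.
    for path in episode.touched_files:
        if path.startswith("tests/"):
            violations.append(f"modified protected file: {path}")
    # Assumed rule 2: no reward unless the tests actually pass.
    if not episode.tests_passed:
        violations.append("test suite did not pass")
    return violations


def gated_reward(episode: Episode, judge: Callable[[str], float]) -> float:
    """Return the judge's score only when the monitor finds no violation."""
    if monitor_violations(episode):
        return 0.0
    # Clamp so an out-of-range judge score cannot inflate the reward.
    return max(0.0, min(1.0, judge(episode.transcript)))
```

The design point is that the monitor runs first and is not learned, so a rollout that games the judge still earns nothing if it breaks a hard rule.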
@Tz1888 detailed a 13-hour GPT-5.5 chain (plan → subagent review → implement → simplify → E2E with HTML screenshots → PR review). @Anonymoushat pushed back that E2E suites are backward-looking and bloat context, and proposed instead instrumenting with OpenTelemetry and piping logs to a fine-tuned small language model (SLM) that triggers Codex/Claude fixes in real time. @thewildzeno argued that production systems handling customer money need both. Claude Code rate limits are biting even at 2 parallel sessions.
@drcopybymatt prototyped a conversational voice tutor in a 3.5-hour hackathon using GPT realtime + Azure pronunciation analysis. Group consensus: personalized AI tutoring at ~17¢/session is a genuinely useful LLM application (Stimuler highlighted), but the economics break at ~5 hrs/user/month on consumer pricing. Duolingo optimizes for retention, not fluency; Pimsleur and live conversation patterns work better.
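A rough way to sanity-check the per-user economics is sketched below. Only the ~17¢/session figure comes from the discussion. The session length and the subscription price are placeholder assumptions, since the thread gave neither, so treat the outputs as illustrative rather than as the group's numbers.

```python
def monthly_cost_usd(hours_per_month: float,
                     session_minutes: float,
                     cost_per_session_usd: float = 0.17) -> float:
    """Inference cost for one user, given usage and an assumed session length."""
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    sessions = hours_per_month * 60.0 / session_minutes
    return sessions * cost_per_session_usd


def break_even_hours(price_usd: float,
                     session_minutes: float,
                     cost_per_session_usd: float = 0.17) -> float:
    """Monthly usage at which inference cost equals the subscription price."""
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    sessions = price_usd / cost_per_session_usd
    return sessions * session_minutes / 60.0


# Hypothetical inputs: 5-minute sessions and a $10/month plan.
cost_at_5_hours = monthly_cost_usd(hours_per_month=5, session_minutes=5)
limit_hours = break_even_hours(price_usd=10.0, session_minutes=5)
```

Under those placeholder inputs, five hours of use costs roughly as much as the plan brings in, which is the shape of the problem the group described. Longer sessions at the same per-session cost would move the break-even point out.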
@jasonakatiff hit silent schema-skip issues merging worktrees. @seekersight outlined a 6-step gated flow (deploy → migrate staging → smoke → promote → migrate prod → verify) with separate CI jobs. @navuud shared his agent's pattern: SQL files in repo, transactional apply script, a _ff_migrations ledger table, and a drift-check command that diffs repo vs DB and fails CI on unapplied migrations.
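The ledger-plus-drift-check pattern @navuud described could look roughly like the sketch below. It is a minimal illustration, not his actual script. It uses SQLite from the standard library for portability, and the function names, the ledger columns, and the one-file-per-migration layout are assumptions. Only the `_ff_migrations` table name comes from the thread. It also assumes migration files contain no `BEGIN`/`COMMIT` of their own.

```python
import sqlite3
from pathlib import Path
from typing import List

LEDGER = "_ff_migrations"  # ledger table name mentioned in the thread


def connect(db_path: str) -> sqlite3.Connection:
    # isolation_level=None turns off sqlite3's implicit transactions,
    # so the BEGIN/COMMIT/ROLLBACK below are fully explicit.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {LEDGER} ("
        "name TEXT PRIMARY KEY, "
        "applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def unapplied(conn: sqlite3.Connection, migrations_dir: Path) -> List[str]:
    """Diff repo vs DB: .sql files present in the repo but not in the ledger."""
    applied = {row[0] for row in conn.execute(f"SELECT name FROM {LEDGER}")}
    in_repo = sorted(p.name for p in Path(migrations_dir).glob("*.sql"))
    return [name for name in in_repo if name not in applied]


def apply_pending(conn: sqlite3.Connection, migrations_dir: Path) -> List[str]:
    """Apply each pending file and its ledger row in a single transaction."""
    applied_now = []
    for name in unapplied(conn, migrations_dir):
        sql = (Path(migrations_dir) / name).read_text()
        try:
            conn.executescript("BEGIN;\n" + sql)
            conn.execute(f"INSERT INTO {LEDGER} (name) VALUES (?)", (name,))
            conn.execute("COMMIT")
        except sqlite3.Error:
            # Undo the partial migration so the schema and ledger stay in sync.
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            raise
        applied_now.append(name)
    return applied_now


def drift_check(conn: sqlite3.Connection, migrations_dir: Path) -> int:
    """Return a CI-style exit code: 1 if any migration is unapplied, else 0."""
    pending = unapplied(conn, migrations_dir)
    for name in pending:
        print(f"unapplied migration: {name}")
    return 1 if pending else 0
```

In CI, the value returned by `drift_check` would be passed to `sys.exit`, so a merge that brings in a migration nobody applied fails the job instead of being skipped silently.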
Key Takeaways
- Consensus production stack: Opus 4.6/4.7 for scaffolding, GLM-5.2 locally for coding, Gemini Flash 2.5 for Hermes orchestration — Opus 4.8 widely panned, GLM-5.5 weights drop in August.
- Run Hermes on a VPS with tmux + /remote-control to keep agents working when your laptop is closed; the new /moa mode adds debate/synthesis but is slow.
- Replace bloated E2E suites with OpenTelemetry + a fine-tuned SLM watching logs to trigger real-time Codex/Claude fixes, which keeps context windows focused on functionality; production systems handling customer money may still need both.
- Ornith-1.0 uses a frozen LLM judge + deterministic monitor to block reward hacking in agentic coding RL — worth evaluating against existing harnesses.
- Treat DB migrations as an explicit gated release step with a ledger table and drift-check CI command — never let deploys silently apply or skip schema changes.
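The gating idea behind the last takeaway can be sketched as a runner that stops at the first failed stage. The stage names follow the 6-step flow @seekersight outlined. Everything else is a hypothetical simplification: the stage bodies are stubs, and the thread described separate CI jobs rather than a single process.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_gated(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run stages in order; stop at the first one that fails or raises."""
    completed: List[str] = []
    for name, action in steps:
        try:
            ok = action()
        except Exception:
            ok = False  # an exception in a stage is treated as a failed gate
        if not ok:
            return False, completed  # later stages never run
        completed.append(name)
    return True, completed


def stub() -> bool:
    """Placeholder for a real stage (hypothetical; always succeeds)."""
    return True


PIPELINE: List[Step] = [
    ("deploy", stub),
    ("migrate_staging", stub),
    ("smoke", stub),
    ("promote", stub),
    ("migrate_prod", stub),
    ("verify", stub),
]

success, completed = run_gated(PIPELINE)
```

Because `migrate_prod` sits behind `smoke` and `promote`, a schema change that fails on staging never reaches production.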
Hot Threads
Hermes Mac apps vs Telegram, /moa mode, and the 15-level mastery roadmap
Building a voice-based AI language tutor — economics and approach
Why agents still miss edge cases — E2E chains vs OTEL + SLM observability