Opus 4.8 Token Burn, Hermes Agents, Claude Code Scaling — AI Daily May 29
386 messages · 69 active members
Overview
Topics
Builders reported massive token consumption on Opus 4.8 — @GuruTime hit 15M tokens / 182 agents / 27 min, @c_1media burned $850-equivalent in one ultracode run. @sibunting praised smoother context handling while @samtome and @mb29266 preferred 4.6 for ad copy. @Wootbro shared a full OMO team-mode orchestrator template (preflight → lane planning → swarm waves → fan-in → proof audit) that several members said mirrored what they were independently building.
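For readers who have not seen the template, the five stages @Wootbro describes can be sketched roughly as below. This is a minimal illustration, not the OMO template itself: every name (Lane, preflight, plan_lanes, run_wave, fan_in, proof_audit, orchestrate) is hypothetical, and run_task is a stand-in for a real agent call.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Lane:
    name: str
    tasks: list[str]
    results: list[str] = field(default_factory=list)


def preflight(goal: str) -> str:
    # Stage 1: reject an empty goal before any agent is launched.
    goal = goal.strip()
    if not goal:
        raise ValueError("goal must not be empty")
    return goal


def plan_lanes(goal: str) -> list[Lane]:
    # Stage 2: hypothetical fixed split; a real planner would ask a model.
    return [
        Lane("research", [f"research: {goal}"]),
        Lane("build", [f"build: {goal}", f"test: {goal}"]),
    ]


async def run_task(task: str) -> str:
    # Stand-in for one agent call (assumption: no real API is used here).
    await asyncio.sleep(0)
    return f"done({task})"


async def run_wave(lanes: list[Lane], max_parallel: int = 4) -> None:
    # Stage 3: run every task in every lane concurrently, capped by a semaphore.
    sem = asyncio.Semaphore(max_parallel)

    async def guarded(task: str) -> str:
        async with sem:
            return await run_task(task)

    async def run_lane(lane: Lane) -> None:
        lane.results = list(await asyncio.gather(*(guarded(t) for t in lane.tasks)))

    await asyncio.gather(*(run_lane(lane) for lane in lanes))


def fan_in(lanes: list[Lane]) -> dict[str, list[str]]:
    # Stage 4: merge per-lane results into one structure.
    return {lane.name: lane.results for lane in lanes}


def proof_audit(lanes: list[Lane], merged: dict[str, list[str]]) -> bool:
    # Stage 5: confirm every planned task produced a result.
    return all(len(merged[lane.name]) == len(lane.tasks) for lane in lanes)


def orchestrate(goal: str) -> dict[str, list[str]]:
    lanes = plan_lanes(preflight(goal))
    asyncio.run(run_wave(lanes))
    merged = fan_in(lanes)
    if not proof_audit(lanes, merged):
        raise RuntimeError("proof audit failed: missing results")
    return merged
```

The max_parallel cap is the knob most relevant to the token-burn reports above: lowering it slows a run but bounds how many agents are spending at once.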
@rmktg detailed spinning up new Hermes bot profiles in minutes via BotFather + orchestrator hand-off, with bots auto-reading prior notes. @kennyaronson shared a Klaviyo manager agent that builds 14-day sequences, plus a calendar-driven scheduling system. @navuud demoed 'Migi', a Telegram userbot on pyrofork + GPT-realtime for voice-prompted coding sessions. Hermes had another outage that hit traveling users.
@danfeldman runs ~40 Claude Code sessions on a dedicated machine and is hitting concurrency limits when sharing it with teammates over AnyDesk. Suggested fixes: per-user session scripts over SSH + tmux (@geilt), or Dropbox-shared project folders with a Telegram bridge (@weslindquist). AWS rolled out Claude on Bedrock plus Claude Cowork; @geilt clarified that only a true Bedrock deployment keeps prompts inside your AWS account, while the Anthropic Platform route still ships data to Anthropic.
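The SSH + tmux idea can be illustrated with a small command builder. This is a sketch under assumptions, not @geilt's actual scripts: the function names, host, and session naming scheme are hypothetical. It relies only on `tmux new-session -A -s NAME`, which attaches to the named session if it already exists, and `ssh -t`, which forces a terminal for the remote command.

```python
import re
import shlex


def tmux_attach_argv(user: str, project: str) -> list[str]:
    # tmux session names cannot contain "." or ":", so replace anything unusual.
    session = re.sub(r"[^A-Za-z0-9_-]", "-", f"{user}-{project}")
    # -A: attach if the session exists, otherwise create it.
    return ["tmux", "new-session", "-A", "-s", session]


def ssh_command(host: str, user: str, project: str) -> list[str]:
    # Each teammate lands in their own named session on the shared machine.
    remote = shlex.join(tmux_attach_argv(user, project))
    return ["ssh", "-t", f"{user}@{host}", remote]


if __name__ == "__main__":
    # Hypothetical host and project names, printed rather than executed.
    print(shlex.join(ssh_command("claude-box", "dan", "ads.v2")))
```

Because each person gets a separate named session, nobody competes for a single shared desktop the way they do over AnyDesk, and a dropped connection leaves the session running.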
@tidemid and @arielletolome hunted for the cheapest Seedance 2 API ($2.41/15s was deemed too expensive). @jlang123 recommended kie.ai at $1.16 for 15s at 480p with upscaling. @kingofgrowth surfaced MoneyPrinterTurbo (69k stars) as a candidate auto-video pipeline. @fmill1 tried Gemini for IG/TT video with no luck.
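To put the two quoted prices on the same footing, here is a quick per-second comparison. It uses only the figures above; the resolution behind the $2.41 quote was not stated in the thread, so the comparison assumes the two clips are otherwise equivalent.

```python
def per_second(price_usd: float, clip_seconds: float) -> float:
    # Normalize a per-clip price to dollars per second of video.
    return price_usd / clip_seconds


quoted = per_second(2.41, 15)   # the rate deemed too expensive
kie = per_second(1.16, 15)      # kie.ai at 480p
savings = 1 - kie / quoted      # fraction saved by switching

print(f"${quoted:.3f}/s vs ${kie:.3f}/s, {savings:.0%} cheaper")
# prints "$0.161/s vs $0.077/s, 52% cheaper"
```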
@jcartu kept evangelizing GLM 5.1 as a cheap Opus substitute, flagging Fireworks hosting GLM 5.1 and Kimi 2.6 at 250-350 tps. @jrizzolo announced Codex remote control and computer use on Windows. @mikeconner shared a case study on a company going 100% local with Codex + Ollama. In parallel, @pqbd1, @tounano, @bofu2u and others debated Oura vs Whoop vs Fitbit Air for sleep and recovery tracking.
Key Takeaways
- Opus 4.8 ultracode can burn 40% of a $200 weekly Claude quota in one run — reserve max effort for tasks that need it; harness + prompting usually beats raw effort.
- Dynamic workflows are converging on a standard pattern: preflight → lane planning → parallel swarm → fan-in verification → proof audit (see @Wootbro's OMO template).
- For Hermes-style multi-bot setups, point new bots at prior bot notes via the orchestrator — @rmktg gets new profiles cranking in minutes instead of hours.
- Claude via AWS Anthropic Platform still ships data to Anthropic — only true Bedrock deployment keeps prompts inside your AWS account.
- Cheapest viable AI video pipeline right now: kie.ai at ~$1.16 per 15s at 480p, then upscale to 720p only on winners.
Hot Threads
Opus 4.8 token consumption insanity (182 agents, 15M tokens, 27 min runs)
Oura ring vs Apple Watch vs Whoop for sleep and recovery tracking
How to share Claude Code across a team with 40+ active sessions