Claude Fable 5 Drops, Hermes Stack, Ralph Loops — AI Daily Jun 09
607 messages · 86 active members
Overview
Topics
Anthropic released Fable 5 mid-day via /model claude-fable-5 in Claude Code v2.1.169+, with a reported 91/100 senior engineer benchmark (Opus 4.8: 63, GPT-5.5: 62), 128k context, and 2x Opus pricing. Builders praised one-shot coding, agent orchestration, and security cleanups, but aggressive safety filters auto-downgrade full-codebase audits and biology contexts to Opus 4.8. Max-plan access ends June 22, then it shifts to API-only pricing.
@jcartu published his full Hermes config: Hindsight runs in the cloud on Cerebras gpt-oss-120b (2000 tps, with Groq as backup), and Cohere handles embeddings (embed-english-v3) and reranking (rerank-v3.5), which keeps the local GPU free for DeepSeek. @rmktg confirmed Hermes is driving $6-12k daily revenue on a $60 VPS + $220/mo AI subs, and will soon migrate to a Ryzen 9950 with 96GB RAM. The bottleneck is orchestration latency, not compute cost.
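The primary-plus-backup routing described above (Cerebras first, Groq if it fails) can be sketched roughly as follows. This is a minimal illustration, not @jcartu's actual config: the `Provider` type, `complete_with_fallback`, and the two stub functions are hypothetical names, and real provider clients would replace the stubs.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical provider type: takes a prompt, returns a completion string.
Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try each (name, provider) in order; return (name, output) of the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real setup would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stubs standing in for real clients (assumptions for illustration only).
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary unavailable")


def backup(prompt: str) -> str:
    return f"summary of: {prompt}"
```

Passing `[("primary", flaky_primary), ("backup", backup)]` returns the backup's output once the primary raises, which is the behavior a "Groq backup" arrangement would presumably want.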
@tounano shared a detailed Ralph loop methodology: engineering and functional specs, gap analysis between codebase and specs, sprint planning, then Ralph processing with embedded code review. Members discussed task sizing, nested loops, and using Codex for execution while Claude writes the specs. loops.elorm.xyz circulated as a starting resource as devs migrate from one-shot prompting to copy-paste AI coding loops.
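One way to picture the spec → gap analysis → task list → loop-with-review flow is the sketch below. It is an illustration under assumptions, not @tounano's implementation: `Task`, `gap_analysis`, `run_loop`, and the `execute` / `review` callables are hypothetical stand-ins for whatever agent and reviewer a team actually wires in.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Task:
    name: str
    done: bool = False
    attempts: int = 0


def gap_analysis(spec_items: Iterable[str], implemented: Iterable[str]) -> List[Task]:
    """Turn every spec item missing from the codebase into a scoped task."""
    have = set(implemented)
    return [Task(item) for item in spec_items if item not in have]


def run_loop(
    tasks: List[Task],
    execute: Callable[[Task], str],
    review: Callable[[Task, str], bool],
    max_attempts: int = 3,
) -> List[Task]:
    """Retry each task until review passes or attempts run out; return unfinished tasks."""
    for task in tasks:
        while not task.done and task.attempts < max_attempts:
            task.attempts += 1
            result = execute(task)            # e.g. an agent writing tests, then code
            task.done = review(task, result)  # embedded code review gates completion
    return [t for t in tasks if not t.done]
```

Keeping the review inside the loop means a task is only marked done when the check passes, and `max_attempts` caps how long the loop can spin on one task.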
@arielletolome shared a direct advertiser offer of 5% adspend at $150-175 CPA on GLP-1 weight loss. Group consensus: completely unrealistic. @Coybh noted brands are currently scaling at $400 CPA (Medvi paid ~$400), @WhiskeyATX targets $250, and standard agency rates are 15% of adspend. At 5%, the fee works out to just $7.50-$8.75 per sale, and only if you hit CPAs far below what the market is paying; you'd also need agency accounts on top.
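The fee arithmetic can be checked with a few lines. The sketch assumes each sale costs exactly the CPA in ad spend, so the fee per sale is simply rate × CPA; `fee_per_sale` is a name made up for this example.

```python
def fee_per_sale(cpa: float, fee_rate: float) -> float:
    """Fee earned per sale when paid a share of ad spend and each sale costs `cpa`."""
    return round(cpa * fee_rate, 2)


# Compare the 5% offer with the standard 15% rate at the CPAs mentioned in the thread.
for cpa in (150, 175, 250, 400):
    offer = fee_per_sale(cpa, 0.05)
    standard = fee_per_sale(cpa, 0.15)
    print(f"CPA ${cpa}: 5% -> ${offer:.2f}, 15% -> ${standard:.2f}")
```

Even at the market CPAs of $250-400, the 5% deal pays $12.50-$20.00 per sale versus $37.50-$60.00 at the standard 15%.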
@iannagy asked how to identify competitor payment processor stacks. @victorbrsss recommended letting Claude Code run actual transactions via Playwright or Chrome MCP with a safe card to analyze checkout flow, upsells, and network events. @mb29266 added BuiltWith as a free option, while @pqbd1 noted orchestrated payment setups are nearly impossible to fully reverse engineer.
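For the network-event side of that approach, a rough sketch: once request URLs from a checkout session have been collected (for example via Playwright's `page.on("request", ...)` handler), they can be matched against known processor domains. The domain table and `detect_processors` below are illustrative assumptions and far from exhaustive, and as @pqbd1 pointed out, anything routed server-side will not show up in browser traffic at all.

```python
from typing import Iterable, List
from urllib.parse import urlparse

# Illustrative, non-exhaustive mapping of domains to processors (an assumption).
PROCESSOR_DOMAINS = {
    "stripe.com": "Stripe",
    "paypal.com": "PayPal",
    "braintreegateway.com": "Braintree",
    "adyen.com": "Adyen",
    "authorize.net": "Authorize.Net",
}


def detect_processors(urls: Iterable[str]) -> List[str]:
    """Return the sorted processor names whose domains appear among the request URLs."""
    found = set()
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        for domain, name in PROCESSOR_DOMAINS.items():
            # Match the domain itself or any subdomain, but not lookalikes.
            if host == domain or host.endswith("." + domain):
                found.add(name)
    return sorted(found)
```

Matching on the parsed hostname rather than a substring of the URL avoids false positives from lookalike domains or query strings that merely mention a processor.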
Key Takeaways
- Claude Fable 5 is live via /model claude-fable-5 — 128k context, ~2x Opus pricing, 91/100 benchmark. Max-plan access ends June 22, then API-only — ship complex features and harden systems now.
- Fable 5's safety filters aggressively flag security and biology contexts and auto-downgrade to Opus 4.8 — frame audits as 'my own system' and split large audits into smaller scoped reviews.
- For Hermes performance: push memory hooks (Hindsight) to Cerebras (2000 tps, Groq as backup) and use Cohere for embeddings, keeping your GPU free for the orchestration model. A real stack is doing $6-12k/day on a $60 VPS + $220/mo subs.
- Loop engineering pattern: spec → gap analysis → scoped sprint task list → Ralph execution with tests-first and embedded code review. Codex for execution, Claude for specs.
- GLP-1 CPAs of $250-400 are the current market rate, so anyone offering 5% adspend at a $150-175 CPA is lowballing; demand 15% and agency account passthrough. Also: audit your Google API key exposure (3,000 sites currently leaking).
Hot Threads
Full Hermes + Hindsight + RAGFlow cloud setup with Cerebras orchestration at 2000 tps
Ralph loops methodology: specs, gap analysis, and nested loop design
GLP-1 direct advertiser offering 5% adspend at $150 CPA — scam or signal?