Claude Fable 5 Drops, Hermes Stack, Ralph Loops — AI Daily Jun 09

607 messages · 86 active members

Top contributors: @jasonakatiff, @rmktg, @jcartu

Overview

Today was dominated by Anthropic's mid-day launch of Claude Fable 5, the first 'Mythos-class' model, accessible via /model claude-fable-5 in Claude Code v2.1.169+. Early reports peg it at 2x Opus pricing with a 128k context window and a reported 91/100 on a senior engineer benchmark (vs Opus 4.8 at 63 and GPT-5.5 at 62). Builders rushed to test it on coding, security audits, and agent orchestration. One-shot results were strong, but aggressive safety filters flagged full-codebase security audits and biology contexts, triggering an auto-downgrade to Opus 4.8. With Fable included in Max plans only until June 22, the race is on to harden systems and ship complex features before API-only pricing kicks in.

Deep infrastructure talk centered on @jcartu's full SOTA Hermes + Hindsight + RAGFlow setup, which uses Cerebras gpt-oss-120b at 2000 tps with Cohere for embeddings and reranking. The thesis: heavy memory hooks should be pushed to wafer-scale cloud providers so your GPU stays free for orchestration. @rmktg confirmed Hermes is driving $6-12k daily revenue on a $60 VPS plus $220/mo in AI subs. Alongside, @tounano shared a detailed Ralph loop methodology: engineering specs, gap analysis, sprint planning, then Ralph execution with embedded code review.

Side threads covered GLP-1 affiliate economics (5% adspend offers unanimously called scams vs $250-400 CPA market reality), phone farming workflows, competitor payment processor recon via Claude Code + Playwright/MCP, Supabase migration from Airtable, and a TruffleSecurity warning that 3,000 sites are leaking Google API keys.

Topics

Anthropic released Fable 5 mid-day via /model claude-fable-5 in Claude Code v2.1.169+, with a reported 91/100 senior engineer benchmark (Opus 4.8: 63, GPT-5.5: 62), 128k context, and 2x Opus pricing. Builders praised one-shot coding, agent orchestration, and security cleanups, but aggressive safety filters auto-downgrade full-codebase audits and biology contexts to Opus 4.8. Max-plan access ends June 22, then it shifts to API-only pricing.
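The takeaways suggest splitting large audits into smaller scoped reviews when a full-codebase pass gets flagged. A minimal sketch of that batching step follows. The function name scope_audit_batches and the max_files cap are hypothetical, and whether smaller batches actually avoid the downgrade is a member suggestion from the chat, not something verified here.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def scope_audit_batches(paths, max_files=20):
    """Group file paths by top-level directory, then cap each batch size.

    max_files is an arbitrary illustrative limit, not a documented threshold.
    """
    by_dir = defaultdict(list)
    for p in paths:
        parts = PurePosixPath(p).parts
        # Files at the repo root are grouped under "."
        top = parts[0] if len(parts) > 1 else "."
        by_dir[top].append(p)

    batches = []
    for top in sorted(by_dir):
        files = sorted(by_dir[top])
        for i in range(0, len(files), max_files):
            batches.append((top, files[i:i + max_files]))
    return batches
```

Each (directory, files) batch can then be handed to a separate review prompt scoped to that directory.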

@jcartu published his full Hermes config: Hindsight on cloud via Cerebras gpt-oss-120b (2000 tps, Groq backup), Cohere for embeddings (embed-english-v3) and reranking (rerank-v3.5), keeping the local GPU free for DeepSeek. @rmktg confirmed Hermes is driving $6-12k daily revenue on a $60 VPS + $220/mo AI subs, soon migrating to a Ryzen 9950 with 96GB RAM. Bottleneck is orchestration latency, not compute cost.
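The "Cerebras primary, Groq backup" arrangement is a plain provider-failover pattern. The sketch below shows one way it could look. It is not @jcartu's actual config: call_with_fallback and both stub functions are hypothetical stand-ins and do not use any vendor SDK.

```python
def call_with_fallback(providers, prompt):
    """Try each (name, callable) in order and return (name, result)
    from the first provider that succeeds."""
    errors = []
    for name, fn in providers:
        try:
            return name, fn(prompt)
        except Exception as exc:  # sketch only: real code should catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for real client calls.
def cerebras_stub(prompt):
    raise TimeoutError("simulated outage")


def groq_stub(prompt):
    return f"summary of: {prompt}"


name, result = call_with_fallback(
    [("cerebras", cerebras_stub), ("groq", groq_stub)],
    "recent session notes",
)
print(name, result)
```

Keeping the provider list ordered makes the primary/backup preference explicit and easy to swap.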

@tounano shared a detailed methodology: engineering + functional specs, gap analysis between codebase and specs, sprint planning, then Ralph processing with embedded code review. Members discussed task sizing, nested loops, and using Codex for execution while Claude writes the specs. loops.elorm.xyz circulated as a starting resource as devs migrate from one-shot prompting to copy-paste AI coding loops.
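The execute-then-review part of that methodology can be sketched as a retry loop over a scoped task list. This is an illustration under assumptions, not @tounano's implementation: ralph_loop, execute, review, and max_attempts are hypothetical names, and the two callables stand in for whatever agent calls (for example Codex for execution) a team actually wires up.

```python
def ralph_loop(tasks, execute, review, max_attempts=3):
    """Run each task until its review passes or attempts run out.

    execute(task, feedback) -> result
    review(task, result)    -> (ok, feedback)
    Both callables are hypothetical placeholders for agent calls.
    """
    done, stuck = [], []
    for task in tasks:
        feedback = None
        for _ in range(max_attempts):
            result = execute(task, feedback)
            ok, feedback = review(task, result)
            if ok:
                done.append(task)
                break
        else:
            # No attempt passed review; surface the task for a human or re-spec.
            stuck.append(task)
    return done, stuck
```

Feeding the reviewer's feedback into the next attempt is what makes the code review "embedded" rather than a separate pass at the end of the sprint.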

@arielletolome shared a direct advertiser offer of 5% adspend at $150-175 CPA on GLP-1 weight loss. Group consensus: completely unrealistic. @Coybh noted brands are currently scaling at $400 CPA (Medvi paid ~$400), @WhiskeyATX targets $250, and standard agency rates are 15% of adspend. At 5%, the fee works out to just $7.50-$8.75 per sale even if you hit those heroic CPAs, plus you'd need agency accounts on top.
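The per-sale arithmetic behind that consensus is easy to check. The sketch assumes the fee is a flat percentage of ad spend and that ad spend per sale equals the CPA; fee_per_sale is a hypothetical helper name.

```python
def fee_per_sale(cpa, pct_of_adspend):
    """Fee earned per sale when paid pct_of_adspend percent of ad spend
    and each sale costs `cpa` dollars in ads."""
    return cpa * pct_of_adspend / 100


# Compare the 5% offer with the 15% agency rate cited in the thread.
for cpa in (150, 175, 250, 400):
    print(f"CPA ${cpa}: 5% -> ${fee_per_sale(cpa, 5):.2f}, "
          f"15% -> ${fee_per_sale(cpa, 15):.2f}")
```

At the offered $150-175 CPA the 5% deal pays $7.50-$8.75 per sale, while 15% at the same CPAs pays $22.50-$26.25.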

@iannagy asked how to identify competitor payment processor stacks. @victorbrsss recommended letting Claude Code run actual transactions via Playwright or Chrome MCP with a safe card to analyze checkout flow, upsells, and network events. @mb29266 added BuiltWith as a free option, while @pqbd1 noted orchestrated payment setups are nearly impossible to fully reverse engineer.
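Once network events from a checkout run are captured (for example, request URLs collected through Playwright's page.on("request", ...) hook), a first pass can simply match hostnames against known processor domains. The sketch below shows that matching step only. The PROCESSOR_HOSTS map is an illustrative, incomplete assumption, and as @pqbd1 noted, orchestrated setups may hide the real processor behind first-party endpoints.

```python
from urllib.parse import urlparse

# Illustrative, incomplete map of hostname suffixes to processors (assumption).
PROCESSOR_HOSTS = {
    "stripe.com": "Stripe",
    "braintreegateway.com": "Braintree",
    "paypal.com": "PayPal",
    "adyen.com": "Adyen",
    "checkout.com": "Checkout.com",
}


def detect_processors(urls):
    """Return the sorted processor names whose domains appear in the URLs."""
    found = set()
    for url in urls:
        host = urlparse(url).hostname or ""
        for suffix, name in PROCESSOR_HOSTS.items():
            # Match the domain itself or any subdomain, but not lookalikes.
            if host == suffix or host.endswith("." + suffix):
                found.add(name)
    return sorted(found)
```

Matching on the exact domain or a dot-prefixed suffix avoids false hits on lookalike hosts such as notstripe.com.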

Key Takeaways

  • Claude Fable 5 is live via /model claude-fable-5 — 128k context, ~2x Opus pricing, 91/100 benchmark. Max-plan access ends June 22, then API-only — ship complex features and harden systems now.
  • Fable 5's safety filters aggressively flag security and biology contexts and auto-downgrade to Opus 4.8 — frame audits as 'my own system' and split large audits into smaller scoped reviews.
  • For Hermes performance: push memory hooks (Hindsight) to Cerebras/Groq at 2000 tps and Cohere for embeddings — keep your GPU free for the orchestration model. Real stack doing $6-12k/day on $60 VPS + $220/mo subs.
  • Loop engineering pattern: spec → gap analysis → scoped sprint task list → Ralph execution with tests-first and embedded code review. Codex for execution, Claude for specs.
  • GLP-1 CPAs at $250-400 are current market — anyone offering 5% adspend at $150 CPA is lowballing; demand 15% and agency account passthrough. Also: audit your Google API key exposure (3,000 sites currently leaking).
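For the Google API key audit, a quick first check is to scan your own page source and bundles for key-shaped strings. The sketch assumes the commonly cited key shape ("AIza" followed by 35 URL-safe characters); find_google_keys is a hypothetical helper, and a match only means the string looks like a key, not that it is live or unrestricted.

```python
import re

# Commonly cited shape of a Google API key (assumption, not an official spec).
GOOGLE_KEY_RE = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_google_keys(text):
    """Return the unique key-shaped strings found in text, sorted."""
    return sorted(set(GOOGLE_KEY_RE.findall(text)))
```

Run it over fetched HTML and JavaScript from your own sites, then restrict or rotate anything it finds.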

Hot Threads

@jcartu started

Full Hermes + Hindsight + RAGFlow cloud setup with Cerebras orchestration at 2000 tps

18 replies · 6 participants

@tounano started

Ralph loops methodology: specs, gap analysis, and nested loop design

8 replies · 4 participants

@arielletolome started

GLP-1 direct advertiser offering 5% adspend at $150 CPA — scam or signal?

15 replies · 7 participants

Linked Items