Daily Digest — Saturday, April 25, 2026

569 messages · 70 active members

Top contributors: @jasonakatiff, @Wootbro, @jcartu

Overview

Saturday's 569-message firehose was dominated by tooling shifts, model performance debates, and an end-of-an-era project announcement. @jcartu archived RASPUTIN (72.4% LoCoMo) after Hindsight (89.6%) and Mem0's April rewrite (91.6%) leapfrogged it in just 60 days, and he now recommends a Hindsight + Honcho stack as the path forward.

Meanwhile, the Opus 4.7 vs GPT-5.5 battle raged on. 4.7 had a rocky launch week with reports of extreme slowness and bugs (5 hours to fix one bug for @zippi101), pushing many toward Codex/GPT-5.5 for orchestration, but by day's end several builders confirmed that 4.7 on xHigh/max effort now matches or exceeds peak 4.6 quality.

Workflow optimization was the other major thread: cmux emerged as the consensus terminal for managing 30-40 concurrent Claude Code/Codex sessions at a fraction of VSCode/Cursor's memory footprint. @Wootbro evangelized Hermes for autonomous overnight execution (a 5-6 hour ULTRAWORK PRD that didn't stop for permissions), while a nasty CC 2.1.120 regression sent users rolling back to 2.1.119.

The day's showstopper was @iamgalba's deep dive into his ~1M-line agentic Google/Bing Ads system spanning 114 skills, running on 5x Claude Code Max + 1x Codex Pro and managing ~$10M in ad spend. Deeper threads explored LLM reliability for high-stakes financial data (with tool calls plus thinking-disabled architectures as the proposed fix), the value of $5k/mo info products like StefanBrain (verdict: cheaper alternatives like Franky win), and Google Antigravity + Claude Max delivering night-and-day better visual dev than Cursor alone.

Topics

RASPUTIN Shutdown & Memory SOTA Pivot

18 msgs

@jcartu archived RASPUTIN (72.4% LoCoMo) because Hindsight (89.6%) and Mem0's April rewrite (91.6%, under 7K tokens/call) moved the field 18-20 points in 60 days. He recommends combining Hindsight (fact retrieval) with Honcho (theory-of-mind/psychological modeling) as the new stack, with existing Qdrant users pivoting directly to Mem0.

Opus 4.7 Performance Saga & GPT-5.5 Migration

43 msgs

After 2-3 rough days with widespread complaints about slowness and bugs (theorized as Anthropic compute rationing), 4.7 on xHigh/max effort is now reportedly matching or surpassing peak 4.6 for complex code. Mid-week workarounds included flipping to 4.6 mid-session and migrating orchestration to GPT-5.5 via Codex despite 2x cost and more hallucinations. Anthropic published a postmortem on the April 23 outages.

cmux + Hermes: New Terminal Stack for Multi-Agent Workflows

49 msgs

@jasonakatiff championed cmux as a low-memory alternative to running CC inside VSCode/Cursor, with users managing 30-40 concurrent sessions across project tabs. Running /rename at session start and recovering with /resume keeps sessions safe through crashes and SSH drops. @Wootbro reported Hermes ran a 5-6 hour autonomous ULTRAWORK PRD overnight where OpenCode would have halted for permissions. Note: roll back CC 2.1.120 → 2.1.119 to fix the broken session-resume bug.

iamgalba's 1M-Line Google Ads Agentic System

25 msgs

Detailed breakdown of a 114-skill system covering shopping feeds, RSAs, PMax, presell/offer pages, Demand Gen video pipelines, competitive intel, and self-improving prompt evals. Built over 2 years with $100k+ in token spend, runs on 5x Claude Code Max + 1x Codex Pro, manages ~$10M in ad spend, and includes a refinement learner that auto-patches prompts from human edits. Available at tegra.co.

$5k/Mo AI Courses vs. Cheaper Alternatives & LLM Reliability for Finance

44 msgs

StefanBrain ($5k/mo) and Evolve ($1.5k/mo) sparked debate, with @Coybh recommending Franky (sub-$100) and 0xRoas as better-value alternatives. The consensus: AI moves too fast for static courses, and the same foundational DR principles get recycled across all of them. Separately, @ericshaf argued LLMs are borderline useless for financial data due to silent hallucination corruption; @jcartu countered with deterministic tool calls, thinking-disabled modes, and variance-detection guardrails.
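The tool-call pattern @jcartu described can be sketched in a few lines: the model never computes figures itself, a deterministic function does, and a variance guardrail flags any output that drifts from the single source of truth. This is a minimal illustration, not code from the thread; all function and field names are hypothetical.

```python
# Sketch of "don't let the LLM BE the solution -- have it BUILD the solution":
# arithmetic stays in deterministic code, and a guardrail catches silent drift.
# All names here are illustrative, not from the actual system discussed.

def sum_revenue(rows):
    """Deterministic aggregation -- no LLM in the loop."""
    return sum(r["revenue"] for r in rows)

def variance_guardrail(reported, source_of_truth, tolerance=0.001):
    """Reject any reported figure that deviates from the ledger by more
    than `tolerance` (relative), catching silent hallucination corruption."""
    if source_of_truth == 0:
        return reported == 0
    return abs(reported - source_of_truth) / abs(source_of_truth) <= tolerance

rows = [{"revenue": 120.0}, {"revenue": 80.5}, {"revenue": 99.5}]
total = sum_revenue(rows)  # the tool-call result the LLM merely reports
assert variance_guardrail(total, 300.0)       # matches the source of truth
assert not variance_guardrail(301.0, 300.0)   # corrupted figure gets caught
```

The point of the design is that the model's only degree of freedom is which tool to call; the numbers themselves come from pandas/numpy-style deterministic code and are checked against a trusted total before anything downstream sees them.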

Key Takeaways

  • Memory SOTA jumped 18-20 points in 60 days—Hindsight (89.6%) + Honcho is the new recommended stack, with Mem0's April rewrite hitting 91.6% LoCoMo under 7K tokens/call.
  • cmux is the new go-to terminal for AI coding—dramatically lower memory than VSCode/Cursor when running multiple concurrent agents; use /rename + /resume from session start to survive crashes.
  • Claude 4.7 on xHigh/max effort has caught up to peak 4.6 for complex code after a rocky launch week—give it another shot if you bounced early. Roll back CC 2.1.120 → 2.1.119 for session resume.
  • For hallucination-sensitive domains, don't let the LLM BE the solution—have it BUILD the solution: deterministic tool calls with thinking disabled, single source of truth, pandas/numpy scripts.
  • Hermes is emerging as the preferred harness for long-running autonomous work because it doesn't halt for permissions like OpenCode does—ideal for overnight ULTRAWORK PRDs.

Hot Threads

@iamgalba started

Full breakdown of 1M-line Google/Bing Ads agentic system with 114 skills

25 replies · 8 participants
@jasonakatiff started

cmux vs VSCode/Cursor for managing Claude Code sessions

30 replies · 10 participants
@jcartu started

RASPUTIN shutdown and pivot to Hindsight + Honcho memory stack

12 replies · 4 participants

Linked Items