GLM-5.2 Drops, Claude Outage, Patter AI Voice — AI Daily Jun 16

705 messages · 86 active members

Top contributors: @jcartu, @arielletolome, @robinroy

Overview

Today's builder chatter centered on a major shift in the local and cloud LLM stack. Zhipu's GLM-5.2 dropped on Hugging Face at 750B parameters, reportedly topping design benchmarks and approaching Mythos/Fable-tier quality at 150 TPS. It needs 6 GPUs to run, which is pushing some builders toward the MiniMax M3 NVFP4 quant that fits on 4. @jcartu detailed his production stack: Kimi 2.6 turbo on Fireworks (150-170 TPS, ~10x cheaper than Opus 4.8), DS4f running locally at 270 TPS with 1M context, and Opus 4.8 as scaffolding/Oracle in OpenCode with Sisyphus as the coding agent. Kimi 2.7's 6x speed tier is imminent, potentially hitting 600-1000 TPS on Cerebras waferscale.

Claude went down mid-session, interrupting cowork flows and amplifying anxiety about Anthropic's API-only transition after June 22. Builders pushed back on the updated privacy policy, which requires documentation for premium intelligence access, and several declared they'll switch to GLM-5.2, Zen Black, or other Chinese models. OpenAI countered with free Codex rate limit resets and a referral program through June 24. A claim that a Gemma 4 12B finetune trained on Fable's CoT runs locally was dismissed as clickbait, with members arguing that sub-1T models can't approach Opus.

Voice and automation economics dominated side threads. @arielletolome shared Patter AI running at $0.025/min ($1.50/hour call center agents) and a $200k/mo ACA transfer playbook, sparking a TCPA debate. @weslindquist detailed continuous AI bookkeeping for 10 businesses via the QuickBooks API, Ramp, and Bill.com. Hermes Atlas launched as a curated map of 169+ open-source tools for the Hermes Agent ecosystem, while builders kept griping about SOTA coding agents over-engineering with defensive bloat.

Topics

GLM-5.2 Release & Kimi/Local LLM Stack at 270+ TPS · 55 msgs

Zhipu's GLM-5.2 dropped on Hugging Face at 750B params, reportedly approaching Mythos quality at 150 TPS but requiring 6 GPUs (MiniMax M3 NVFP4 fits on 4). @jcartu detailed his stack: Kimi 2.6 turbo on Fireworks at 150-170 TPS, DS4f local at 270 TPS with 1M context, Opus 4.8 for scaffolding via OpenCode + Sisyphus. Kimi 2.7's 6x speed tier could hit 600-1000 TPS on Cerebras.
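To put the quoted throughput numbers side by side, here is a rough sketch that converts tokens per second into wall-clock generation time. The TPS values come from the thread; the 10,000-token response size and the labels are assumptions for illustration, and the estimate ignores prefill, queueing, and network latency.

```python
def seconds_to_generate(tokens: int, tps: float) -> float:
    """Seconds to decode `tokens` at a steady `tps` (decode time only)."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return tokens / tps


# Throughput figures as quoted in the thread (low ends of ranges where given).
RATES = {
    "GLM-5.2 / Kimi 2.6 turbo (low end)": 150,
    "DS4f local": 270,
    "Kimi 2.7 on Cerebras (low estimate)": 600,
    "Kimi 2.7 on Cerebras (high estimate)": 1000,
}


def compare(tokens: int = 10_000) -> dict[str, float]:
    """Map each label to the decode time in seconds, rounded to 0.1 s.

    The default 10,000-token response size is an assumed example value.
    """
    return {
        name: round(seconds_to_generate(tokens, tps), 1)
        for name, tps in RATES.items()
    }


if __name__ == "__main__":
    for name, secs in compare().items():
        print(f"{name}: {secs} s")
```

On these assumptions, the jump from 150 TPS to the speculated 600-1000 TPS range takes a long agent response from roughly a minute down to 10-17 seconds.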

Claude Outage & Anthropic API-Only Transition · 33 msgs

Claude went down mid-day, hitting users who'd maxed out Codex as backup. Builders pushed back on Anthropic's updated privacy policy and the June 22 API-only transition, with several declaring they'll switch to GLM-5.2 or Zen Black. OpenAI rolled out free Codex rate limit resets with a referral program through June 24.

Patter AI Voice Agents & AI Trading Bots · 18 msgs

@arielletolome shared Patter AI at $0.025/min, which works out to roughly $1.50/hour for call center agents and undercuts Retell and VAPI, plus a $200k/mo ACA transfer playbook that raised TCPA concerns. She also shared a Claude-driven trading bot doing top-down analysis across S&P 500, Nasdaq, gold, silver, and FX with a 38% paper-trading win rate; @fewga893 flagged liquidity and slippage gaps that would show up in live trading.
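The per-hour figure follows directly from the per-minute rate. A small sketch of the arithmetic is below; the $0.025/min rate is the one quoted in the thread, while the 8 hours of daily talk time is a hypothetical input, and telephony, LLM, or compliance costs that may sit outside the quoted rate are not modeled.

```python
def hourly_cost(per_minute_usd: float) -> float:
    """Cost of one hour of continuous talk time at a per-minute rate."""
    return round(per_minute_usd * 60, 2)


def daily_cost(per_minute_usd: float, talk_hours_per_day: float) -> float:
    """Cost of one agent-day; talk_hours_per_day is an assumed input."""
    return round(per_minute_usd * 60 * talk_hours_per_day, 2)


if __name__ == "__main__":
    rate = 0.025  # USD per minute, as quoted in the thread
    print(f"per hour: ${hourly_cost(rate):.2f}")
    print(f"per 8-hour day (assumed): ${daily_cost(rate, 8):.2f}")
```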

Bookkeeping & Bank Transaction Automation · 22 msgs

@justingacina kicked off a thread on pulling Chase transactions for invoice tracking, with the group converging on QuickBooks API as easier than Plaid direct. @weslindquist shared a production setup running continuous AI bookkeeping for 10 businesses, recommending Ramp and Bill.com for vendor payments plus the $10/mo QB ledger tier for accounting firms.
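The invoice-tracking step can be sketched independently of how the transactions are fetched. The example below is only an illustration: it does not call the QuickBooks or Plaid APIs, the `Invoice` and `Transaction` record types are hypothetical, and matching on exact amount within a 45-day window is an assumed rule, not the setup anyone in the thread described.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Invoice:
    number: str
    amount_cents: int  # integer cents avoid float rounding issues
    issued: date


@dataclass(frozen=True)
class Transaction:
    txn_id: str
    amount_cents: int
    posted: date


def match_payments(invoices, transactions, window_days=45):
    """Return {invoice number: txn_id} for invoices with a matching deposit.

    A deposit matches when the amount is identical and it posted between
    0 and `window_days` days after the invoice date. Each transaction is
    used at most once; the oldest invoices are matched first.
    """
    unused = list(transactions)
    matches = {}
    for inv in sorted(invoices, key=lambda i: i.issued):
        for txn in unused:
            days = (txn.posted - inv.issued).days
            if txn.amount_cents == inv.amount_cents and 0 <= days <= window_days:
                matches[inv.number] = txn.txn_id
                unused.remove(txn)
                break  # stop scanning once this invoice is matched
    return matches
```

Invoices left out of the returned mapping are the ones still unpaid, or paid in a way this simple rule cannot see, such as partial or combined payments.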

Hermes Atlas, Agent Over-Engineering & Video Gen · 23 msgs

Hermes Atlas launched as a curated registry of 169+ open-source tools across 12 categories for the Hermes Agent ecosystem. @sibunting raised the problem of SOTA coding agents that pass QA but pile on defensive bloat, with ponytail floated as a mitigation. Builders are using Seedance for single-image animation and stacking prop libraries, and routing around OpenAI's tightening similarity guardrails with Stable Diffusion pipelines.

Key Takeaways

GLM-5.2 (750B) is live on Hugging Face and reportedly approaches Mythos/Fable quality at 150 TPS — needs 6 GPUs locally; MiniMax M3 NVFP4 quant is the 4-GPU fallback.
Anthropic's June 22 API-only transition is pushing power users to Chinese models; Codex now banks free rate limit resets and runs a referral program through June 24.
Kimi 2.6 turbo delivers 150-170 TPS at ~10x cheaper than Opus; pair with DS4f local (270 TPS, 1M context) and Opus 4.8 for planning in OpenCode + Sisyphus.
Patter AI at $0.025/min makes sub-$2/hour voice call centers viable — but TCPA exposure on outbound AI dialing is the real constraint.
For bookkeeping automation, QuickBooks API approval is easier than direct Plaid; Ramp/Bill.com handle vendor payments cleanly, and the $10/mo QB ledger tier works for firms managing multiple clients.

Hot Threads

@jcartu started

GLM-5.2 release plus Kimi/DS4f local stack and OpenCode workflow

38 replies · 8 participants

@justingacina started

Automating Chase transaction pulls and continuous AI bookkeeping

18 replies · 6 participants

@arielletolome started

Patter AI voice agent economics and $200k/mo ACA transfer playbook

8 replies · 4 participants

Linked Items

Alok on X

Instagram

FLAME | AI streaming network

Elon Musk on X