GLM-5.2 Hardware, Paperclip Agents, Mac-to-VPS — AI Daily Jun 19
432 messages · 71 active members
Overview
Topics
@jcartu broke down why GLM-5.2 (1.1TB at BF16) is unrealistic on Macs even with extreme quants: you need 6-8 GPUs at FP8, or an 8x RTX Pro 6000 setup (~100k EUR). Reported accuracy for Unsloth's GGUF quants was revised down from 82% to 76%, per a new Hugging Face discussion. For cheap experimentation, the consensus was to spend $20 on z.ai or OpenRouter. The community largely agreed that 2-bit quants of frontier models aren't worth running.
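The GPU counts above follow from simple weight-size arithmetic, sketched below. The 1.1TB BF16 figure is from the thread; the 96 GB of VRAM per RTX Pro 6000 and the `overhead` allowance for KV cache and activations are assumptions added for illustration, not numbers anyone posted.

```python
import math

# From the digest: GLM-5.2 weights are ~1.1TB at BF16 (16 bits per weight).
BF16_WEIGHTS_GB = 1100
BITS_PER_WEIGHT = {"bf16": 16, "fp8": 8, "4bit": 4, "2bit": 2}

# Assumption (not from the digest): 96 GB of VRAM per RTX Pro 6000 card.
VRAM_PER_GPU_GB = 96


def weights_gb(precision: str) -> float:
    """Approximate weight footprint in GB, scaled down from the BF16 size."""
    return BF16_WEIGHTS_GB * BITS_PER_WEIGHT[precision] / 16


def gpus_needed(precision: str, overhead: float = 0.0) -> int:
    """Cards needed to hold the weights plus a fractional runtime overhead.

    `overhead` is a hypothetical allowance for KV cache and activations,
    e.g. 0.3 means 30% on top of the raw weights.
    """
    total_gb = weights_gb(precision) * (1 + overhead)
    return math.ceil(total_gb / VRAM_PER_GPU_GB)
```

Under these assumptions, `gpus_needed("fp8")` gives 6 cards for the weights alone and `gpus_needed("fp8", overhead=0.3)` gives 8, which lines up with the 6-8 GPU range quoted in the thread. At BF16 the weights alone would need 12 such cards.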
Paperclip can spin up 15 concurrent subagents acting as a 'CEO' — @rmktg's test agent immediately spawned 5 agents to build landers chasing $50k/week TikTok spend. @jcartu warned not to point it at flat-rate subscriptions on expensive models. Separately, @jarvisballer's one-orchestrator-per-project model with Obsidian Kanban resonated with @sav310 and @samb69, who debated standardized 'machine packets' in Linear comments so agents resume cleanly.
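Nobody in the thread posted an actual 'machine packet' format, so the following is only a hypothetical sketch of what one could look like: a small JSON payload, prefixed with a marker line, that an agent writes into a ticket comment and the next agent parses to resume. The field names, the marker string, and the helper functions are all invented for illustration and are not part of Linear's or any other tool's API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

# Hypothetical marker so agents can tell packets apart from human comments.
MARKER = "<!-- machine-packet -->"


@dataclass
class MachinePacket:
    # All field names are illustrative assumptions, not a standard.
    task_id: str
    status: str
    done: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)


def render(packet: MachinePacket) -> str:
    """Serialize a packet into comment text an agent could post."""
    return MARKER + "\n" + json.dumps(asdict(packet), indent=2)


def parse(comment: str) -> Optional[MachinePacket]:
    """Return the packet in a comment, or None for ordinary comments."""
    if not comment.startswith(MARKER):
        return None
    return MachinePacket(**json.loads(comment[len(MARKER):]))
```

The point of a fixed shape is that a resuming agent reads `next_steps` and `blockers` directly instead of re-deriving state from free-form notes.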
There was near-unanimous endorsement of Hermes over OpenClaw (OC): @rmktg quit OC after 2 weeks, and @startropics noted Hermes was smooth from day one. @ecom2023 highlighted Hermes + Camoufox for scraping and browser automation as things OC simply can't do right now.
@danfeldman's home Mac mini Claude Code setup is going company-wide and needs proper infra. Advice ranged from letting the bots SSH-migrate themselves, to using VPS snapshots over S3, and pushing archival data to Glacier. Obsidian over Tailscale was flagged as not refreshing in real time, so docs should live on the work machine.
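For the Glacier suggestion, one common approach is an S3 lifecycle rule that moves older objects to the Glacier storage class automatically. The sketch below only builds the rule as a plain dictionary; the `archive/` prefix and the 30-day threshold are placeholder assumptions, not values from the thread.

```python
def glacier_lifecycle(prefix: str = "archive/", days: int = 30) -> dict:
    """Build an S3 lifecycle configuration that transitions objects under
    `prefix` to the GLACIER storage class after `days` days.

    The prefix and day count are illustrative defaults.
    """
    return {
        "Rules": [
            {
                "ID": f"{prefix.strip('/')}-to-glacier",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Applying it requires boto3 and AWS credentials, which this sketch assumes
# but does not import:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="<your-bucket>",
#       LifecycleConfiguration=glacier_lifecycle(),
#   )
```

Keeping the rule as data makes it easy to review or version alongside the rest of the migration notes before anything touches a real bucket.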
Palmier IO, an AI video editor reportedly at $4M ARR with a free BYO-key tier, drew mixed reactions — @jarvisballer called it shit and rebuilt a better version from its open GitHub. Separately, @namtalks asked how to demo Slack/Discord agent harnesses to non-technical clients; suggestions included split-screen scripted recordings or building a parallel n8n flow as a visual prop.
Key Takeaways
- GLM-5.2 (1.1TB at BF16) needs 6-8 GPUs at FP8, or roughly 8x RTX Pro 6000 (~100k EUR); running it on Mac Studios is wishful thinking. Spend $20 on z.ai or OpenRouter to experiment first, and skip 2-bit quants of frontier models.
- Unsloth GLM-5.2 GGUF accuracy was revised down from 82% to 76% — aggressive quantization loses real intelligence; NVFP4 is the cleaner path.
- Paperclip is a token shredder running 15 concurrent agents — never point it at flat-rate subscriptions on expensive models; use it as a ticketing/observability layer if you want safety.
- One orchestrator per project plus a shared Kanban/Obsidian board with standardized 'machine packets' beats letting each agent freelance its own notes.
- For Mac-to-VPS migrations, let agents SSH-migrate themselves, but rely on VPS snapshots for backups and push archival data to Glacier; Tailscale gets you remote access, but Obsidian over Tailscale doesn't refresh in real time, so keep docs on the work machine.
Hot Threads
GLM-5.2 local inference hardware reality + Mac Studio teardown
Paperclip orchestration, OMO config, and token-burn warnings
Migrating Claude Code work from a home Mac mini to a VPS