⚡️ Making DeepSeek v4 outperform Opus 4.7 with Taste - @Ahmad Awais, CommandCode.ai (40:41)
Ahmad Awais explains how his open-source CLI CommandCode uses a 'validate-then-repair' layer to fix tool-calling errors in open models like DeepSeek, allowing them to outperform premium models like Opus 4.7 in 6 of 10 evaluations. He argues that perceived weaknesses are harness/contract issues, not capability gaps, and extends the same repair logic to combat 'design slop' by encoding designer frameworks. Awais also shares plans to open-source CommandCode while keeping it focused on the best models.
https://x.com/MrAhmadAwais/status/2050956678502420612

We sit down with Ahmad Awais, CEO of CommandCodeAI, who developed a lightweight "tool-input repair layer" in their open-source AI CLI that dramatically improves tool-calling reliability for open models like DeepSeek. By analyzing failure patterns across billions of tokens, he shifted from rigid validation to a "validate-then-repair" approach, allowing cheaper open models (especially DeepSeek V4 Pro) to outperform premium ones like Opus 4.7 in 6 out of 10 internal evaluations. The core insight: most perceived "open model weaknesses" in tool calling are harness/contract issues rather than true capability gaps, fixable with targeted repairs, semantic hints, and transparent feedback instead of changing the underlying LLM.

Timestamps
0:03 Introduction and background of Ahmad Awais
1:12 The origins of CommandCode and AI coding agents
2:51 Introducing "Taste": A meta-neurosymbolic framework
4:48 Identifying the "Tool Confusion" phenomenon in open models
9:20 Deep-dive into tool-calling reliability and the "Repair Layer"
12:04 Why common coding agent harnesses struggle with open models
16:23 Proving open model performance and the "Go" plan
17:35 Applying repair logic to solve "Design Slop"
20:44 The role of OKLCH and design compositional frameworks
24:19 Demonstrating real-world design capabilities
26:52 How Taste manages skills and developer preferences
32:08 Skills vs. Taste: Understanding the hierarchy
37:05 Roadmap: Open-sourcing CommandCode and future philosophy
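
For readers who want a concrete picture of "validate-then-repair", here is a minimal sketch of the general idea in Python. It is not CommandCode's actual code: the tool contract, the alias table, and every function name below are hypothetical, chosen only to illustrate validating tool-call arguments, applying a few targeted repairs when validation fails, and reporting what was changed.

```python
import json

# Hypothetical tool contract: argument name -> expected Python type.
READ_FILE_SCHEMA = {"path": str, "max_lines": int}

# Hypothetical aliases a model might emit instead of the contract's names.
ALIASES = {"file_path": "path", "filename": "path", "limit": "max_lines"}


def validate(args, schema):
    """Return a list of readable contract violations (empty when valid)."""
    errors = []
    for name, expected in schema.items():
        if name not in args:
            errors.append(f"missing argument '{name}'")
        elif type(args[name]) is not expected:
            got = type(args[name]).__name__
            errors.append(f"'{name}' should be {expected.__name__}, got {got}")
    for name in args:
        if name not in schema:
            errors.append(f"unknown argument '{name}'")
    return errors


def repair(raw, schema):
    """Apply targeted repairs and return (args, notes describing each repair)."""
    notes = []
    args = raw
    if isinstance(args, str):
        # Repair 1: arguments arrived as a JSON-encoded string.
        try:
            args = json.loads(args)
            notes.append("parsed JSON-encoded arguments")
        except json.JSONDecodeError:
            return {}, notes
    if not isinstance(args, dict):
        return {}, notes
    fixed = {}
    for key, value in args.items():
        # Repair 2: map known aliases onto the contract's argument names.
        target = ALIASES.get(key, key)
        if target != key:
            notes.append(f"renamed '{key}' to '{target}'")
        # Repair 3: coerce numeric strings where the contract expects an int.
        if schema.get(target) is int and isinstance(value, str):
            try:
                value = int(value)
                notes.append(f"coerced '{target}' to int")
            except ValueError:
                pass
        fixed[target] = value
    return fixed, notes


def validate_then_repair(raw, schema):
    """Accept valid input as-is; otherwise repair, re-validate, and report."""
    if isinstance(raw, dict) and not validate(raw, schema):
        return {"ok": True, "args": raw, "repairs": [], "errors": []}
    args, repairs = repair(raw, schema)
    errors = validate(args, schema)
    return {"ok": not errors, "args": args, "repairs": repairs, "errors": errors}


result = validate_then_repair('{"file_path": "a.txt", "limit": "20"}', READ_FILE_SCHEMA)
print(result["args"], result["repairs"])
```

In this sketch the `repairs` and `errors` lists stand in for the "transparent feedback" mentioned above: rather than rejecting a near-miss tool call outright, a harness could return them to the model so it sees what was fixed and what still violates the contract.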
