Over the past few months, I've shipped dozens of projects using Cursor, Gemini, Opus 4.5, Claude Code, AI Studio, and various local models. The design stays consistent across all of them. Not because I found one magic model, but because I learned how each model responds to design constraints differently—and I stopped being vague.
🎨 The First Prompt Is a Contract
Before any code gets written, I tell the agent exactly what's allowed and what's forbidden. No gradients. No harsh shadows. Five-pixel border radius everywhere unless I say otherwise. Cards, modals, buttons, inputs—everything follows the same radius. I also require dark and light mode from the start, with exact background colors for both. Sidebars stay near-black in dark mode, and cards sit on a slightly lighter gray so they read as layers.
But here's the thing that really changed everything for me: I give it Dieter Rams' 10 principles of good design at the start of every conversation. These principles are also in my .cursorrules file and my DESIGN-RULES.md file, so the agent has access to them throughout the project. This gives the model a philosophy to follow, not just a list of rules. When it has to make a judgment call, it has a framework to reason from—less but better, unobtrusive, honest, as little design as possible.
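To make that concrete, here's roughly what the opening contract looks like. The wording and the hex values are illustrative placeholders—swap in your own:

```text
DESIGN CONTRACT (read fully before writing any code):
- No gradients. No harsh shadows.
- Border radius: 5px on everything (cards, modals, buttons, inputs)
  unless I explicitly say otherwise.
- Dark AND light mode from the start.
  Dark mode example values: sidebar near-black (e.g. #0a0a0a),
  cards slightly lighter (e.g. #171717) so they read as layers.
- Follow Dieter Rams' 10 principles of good design:
  less but better, unobtrusive, honest, as little design as possible.
```

Note that the exact colors live here, in the contract—the generated code itself should only ever reference them through tokens.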
🧠 The Persona Trick
My .cursorrules file starts with something like: "You are an experienced designer with 30 years of experience who follows the Dieter Rams philosophy of 'less, but better.'"
This sounds silly. It works extremely well.
LLMs respond to persona and experience level better than they respond to rules alone. The persona combined with the actual Rams principles gives the model something to reason from when your explicit rules don't cover a specific situation. It's not just following instructions—it's adopting a point of view.
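Here's a sketch of how the top of such a .cursorrules file can open—persona first, then the non-negotiable rules. The exact wording is mine to adapt, not a fixed template:

```text
You are an experienced designer with 30 years of experience who follows
the Dieter Rams philosophy of "less, but better."

CRITICAL rules (never violate these under any circumstances):
- ABSOLUTELY NEVER use gradients.
- Never hard-code colors. Reference design tokens only.
- Never add icons or emojis unless explicitly requested.

When a situation isn't covered by a rule, reason from the Rams
principles: unobtrusive, honest, as little design as possible.
```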
🎯 Design Tokens Over Everything
This is the single biggest fix for AI slop, and I cannot stress this enough.
Never let the agent use hard-coded values. No bg-blue-500. No #3b82f6. No rgb(). No hex codes. No Tailwind color utilities. No px values. Instead, everything must use design tokens: bg-card, bg-background, text-foreground, p-4, shadow-sm.
When you ban hard-coded values entirely, the agent literally cannot go rogue with color choices because there are no color choices to make. It references your system or it uses nothing. This one rule eliminates probably 80% of the AI slop problem on its own. Hard-coded colors are a dead giveaway that an interface was AI-generated, and they make theming impossible later.
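For context, token classes like bg-card and bg-background resolve to CSS custom properties—this is the shadcn/ui-style convention, and the variable names and values below are illustrative, not prescriptive:

```css
/* globals.css — the only place color values are allowed to exist.
   Utilities like bg-card resolve to var(--card) via the Tailwind config. */
:root {
  --background: 0 0% 100%;
  --card: 0 0% 98%;
  --foreground: 0 0% 9%;
}

.dark {
  --background: 0 0% 4%;  /* near-black sidebar/page background */
  --card: 0 0% 9%;        /* slightly lighter so cards read as layers */
  --foreground: 0 0% 98%;
}
```

Change a variable here and the entire interface rethemes; that's exactly what hard-coded values make impossible.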
📁 File Structure That Enforces the System
I have a .cursorrules file and a DESIGN-RULES.md file that I copy into every single project. Both files contain my design tokens, my constraints, the Dieter Rams principles, and specific instructions about what's allowed and what isn't.
Before the agent writes any code, my prompt says: "Read DESIGN-RULES.md before generating any code."
For tools like Cursor or Claude Code that can access my file system, I point them to my design system repo: "Review the design guardrails in /01-NEW-PROJECT-SYSTEMS/8-GUIDES/02_DESIGN-GUARDRAILS.md before proceeding."
This changes everything. The agent reads your documentation first, then generates code that follows it. No guessing, no interpretation, just execution against a spec you already wrote.
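If you want a starting point, a DESIGN-RULES.md can be as short as this skeleton—the section names and values are examples to replace with your own system:

```markdown
# DESIGN-RULES.md

## Philosophy
Dieter Rams' 10 principles. When in doubt: less, but better.

## Tokens (the ONLY values allowed in code)
- Colors: bg-background, bg-card, text-foreground, text-muted-foreground
- Spacing: p-2, p-4, p-6 only
- Radius: one radius everywhere (5px)

## Forbidden
Gradients, hex codes, rgb(), Tailwind color utilities,
raw px values, emojis, icons outside approved locations.
```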
🔍 Specific References Beat Vague Descriptions
"Make it look professional" means nothing. Every model interprets it differently, and usually gets it wrong.
"Make it look like namoslabs.com, Notion, Linear, or Vercel" gives the agent a real target to pattern-match against. It can reference actual interfaces instead of guessing what professional means to you.
I also tell it what to avoid by name: "No AI slop like gradients, excessive icons, or random emoji in the UI." Naming the anti-patterns explicitly makes them easier to catch. The model now knows exactly what you're watching for.
🤖 How Different Models Handle Design Systems
Not all models follow design rules the same way, and this is something I wish someone had told me earlier.
Claude Sonnet and Opus are the best at following complex multi-rule design systems. The longer context window means Claude actually remembers your guardrails throughout an entire session without needing reminders. If you have a detailed design system with many constraints, Claude will reference it consistently. This is my go-to for design-heavy work.
GPT-4o is solid at consistency but has this tendency to get "creative" with colors even when you've explicitly told it not to. I've had GPT-4o slip in subtle accents or slightly-too-bold shadows despite clear instructions. It follows the spirit of your rules but sometimes interprets them loosely. Always double-check color usage with this one.
Gemini 2.0 Flash and 2.5 Pro are surprisingly good at following token-based design systems, and they're fast. The tradeoff is that Gemini can forget rules mid-conversation. If you're working on a longer session, you'll need to remind it of your constraints every five to ten prompts. Not a dealbreaker, just something to plan for.
The context window matters more than people realize. Larger context models like Claude and newer Gemini can reference your entire design system repo throughout a session. Smaller context models need you to repeat the important rules periodically or they drift.
One trick that works across all models: add words like "CRITICAL" or "ABSOLUTELY NEVER" to your most important rules. Models weight these phrases heavily. "Never use gradients" works okay. "CRITICAL: Absolutely never use gradients under any circumstances" works much better. Also, put your design rules at the top of your .cursorrules file—the first 500 tokens matter most.
✂️ Cutting the Common AI Tells
Beyond colors and tokens, there are other giveaways that scream "no human touched this." Too many icons scattered randomly. Emojis throughout the UI for no reason. Inconsistent spacing. Decoration that doesn't support clarity.
I tell agents explicitly: never use emojis unless I specifically ask for them. Icons are only allowed in locations I define. If I haven't said to put an icon somewhere, don't put one there.
This level of control sounds tedious, but it's actually freeing. You make the decisions once, upfront, and then the agent executes consistently instead of making random choices you have to clean up later.
✅ The Self-Audit Step
At the end of major changes, I tell the agent: "List any gradients, hard-coded colors, or icon-only buttons you just added."
Most of the time it catches its own mistakes before I even have to review. When it doesn't catch them, at least I know exactly where to look. This turns the model into its own QA layer.
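The same audit can also run outside the model entirely. Here's a minimal sketch of a script that flags hard-coded styling in generated code—the patterns and labels are assumptions to extend for your own rule set:

```python
import re

# Patterns that indicate hard-coded styling instead of design tokens.
# Extend this dict with whatever your DESIGN-RULES.md forbids.
FORBIDDEN = {
    "hex color": re.compile(r"#[0-9a-fA-F]{3,8}\b"),
    "rgb()/rgba()": re.compile(r"\brgba?\("),
    "gradient": re.compile(r"\bgradient\b", re.IGNORECASE),
    "Tailwind color utility": re.compile(
        r"\b(?:bg|text|border)-(?:red|blue|green|gray|slate|zinc)-\d{2,3}\b"
    ),
}

def audit(source: str) -> list[str]:
    """Return one human-readable finding per forbidden pattern per line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in FORBIDDEN.items():
            if pattern.search(line):
                findings.append(f"line {lineno}: {label}: {line.strip()}")
    return findings

if __name__ == "__main__":
    snippet = '<div className="bg-blue-500" style={{ color: "#3b82f6" }}>'
    for finding in audit(snippet):
        print(finding)
```

Run it over the diff after each agent session and you get the same "list what you just added" check, but deterministic.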
🎯 The Real Lesson
AI agents aren't bad at design. They're bad at reading your mind.
The people complaining online that AI can't do design are usually the same people giving vague instructions and expecting the model to guess correctly. That's not how this works. You have to teach it your system the same way you'd onboard a junior designer—except the AI actually follows your rules once you explain them clearly.
The more constraints you give upfront, the less you argue about button colors later. Different models need different levels of hand-holding, but all of them respond to clarity.
Try these techniques. Come back and tell me if they worked for you.
