
Can you complete the frontend challenge that AI failed?

AI & Development · Jul 27 · 12 min read · Claudia

The same challenge that tripped up GPT-4.1. No hand-holding. No guardrails. Just real tasks and real scope creep. The cracks showed early.

Why I Created the Reality Check Challenge

When you're a self-taught frontend developer (or one without any professional experience yet), tutorials can only take you so far. Most projects online walk you through happy-path scenarios. Clean requirements. Fixed designs. Perfect APIs. But that's not how real software is built.

In the real world, the design changes. The scope expands. Product wants a new mode. Marketing wants different data. QA finds a bug in a state you didn't plan for. You're not just coding; you're architecting for change. That's what separates junior devs from pros.

I created the Frontend Reality Check Challenge to simulate that reality. A 10-day, email-driven challenge where you build a real feature and get hit with real curveballs, just like on the job. And then I had a thought...

What Happens When You Give This Challenge to AI?

We're in a golden age of AI-assisted development.

Tools like GitHub Copilot and Cursor are speeding up workflows and unlocking productivity for developers around the world. And rightly so: AI is doing some incredible things.

Let me be clear: I'm not anti-AI. I use it. I love it. I think it's a game-changer.

But I also believe this:

"If you don't know what good code looks like, you can't guide AI to write it."

You can't prompt your way into clean architecture if you don't understand why the architecture matters. You can't fix anti-patterns if you can't spot them. AI isn't the problem — lack of conceptual understanding is.

So I decided to put that idea to the test.

I ran the Reality Check Challenge through GPT-4.1 via Cursor, feeding it the raw prompts for all 10 days. No coaching, no commentary, no feedback from me. Just a straightforward developer workflow.

The result?

Let's just say, the cracks showed early.

Where the AI Struggled — and Why It Matters

I intentionally designed the Reality Check Challenge to surface the kinds of architectural and UX problems you only learn through experience, or painful trial and error.

Here's how Cursor AI (GPT-4.1) handled it:

1. No Component Extraction Until Day 10

For the first nine days, everything lived in a single file. The code grew with each requirement, stacking state logic, rendering conditions, and flags. But not once did it split things into components.

Even on Day 10, when prompted to reflect and refactor, it extracted some pieces, but the result was still tangled.

The component tree lacked clarity. Reusability? Minimal. Maintainability? Questionable.
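To make the contrast concrete, here's a minimal sketch of what extraction looks like. The names (ArticleCard, StatusBadge, and so on) are invented for illustration, not taken from the challenge code, and rendering is modeled as plain string-returning functions so the idea stays framework-agnostic; in React, each function would be its own component.

```typescript
// Hypothetical example: each concern becomes a small, focused piece
// instead of living inline in one giant render function.
type Status = "draft" | "published";

// Extracted: a badge that knows only about status.
function StatusBadge(status: Status): string {
  return status === "draft" ? "[DRAFT]" : "[LIVE]";
}

// Extracted: a header that composes the badge with the title.
function ArticleHeader(title: string, status: Status): string {
  return `${StatusBadge(status)} ${title}`;
}

// The parent stays a thin composition layer; each extracted piece
// can be reused and tested on its own.
function ArticleCard(title: string, status: Status, body: string): string {
  return `${ArticleHeader(title, status)}\n${body}`;
}
```

The payoff isn't fewer lines; it's that when Day 6's curveball changes how a badge looks, you touch one small function instead of untangling a monolith.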

2. Component Explosion via Props

By the end, the AI's main component was overloaded with optional props and conditionals for every feature: editable mode, preview mode, compact view, audit state...

Instead of creating separate components or layout modes, it bundled everything together with deeply nested if checks and && chains.

  • Small changes broke other states.
  • Preview mode design tweaks affected editable mode.
  • QA toggles impacted marketing views.

This is exactly the type of rigidity the challenge is meant to expose.
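One way out of that rigidity is to stop modeling modes as a pile of independent boolean props and instead make them mutually exclusive by construction. A minimal sketch, with invented names (UserCard-style props, not the challenge code) and rendering again reduced to a string-returning function:

```typescript
// The anti-pattern: every feature adds another optional flag, and nothing
// stops a caller from passing editable + preview + audit all at once.
type ExplodedProps = {
  name: string;
  editable?: boolean;
  preview?: boolean;
  audit?: boolean;
  compact?: boolean;
};

// One alternative: a discriminated union makes the modes explicit and
// impossible combinations unrepresentable at the type level.
type CardMode =
  | { kind: "view" }
  | { kind: "edit" }
  | { kind: "preview" }
  | { kind: "audit"; auditor: string };

type CardProps = { name: string; mode: CardMode; compact?: boolean };

function renderCard({ name, mode, compact }: CardProps): string {
  // Each branch maps to exactly one mode; in React these branches would
  // typically be separate components selected by the parent.
  const body =
    mode.kind === "edit" ? `<input value="${name}">` :
    mode.kind === "preview" ? `[preview] ${name}` :
    mode.kind === "audit" ? `${name} (audited by ${mode.auditor})` :
    name;
  return compact ? body : `== Card ==\n${body}`;
}
```

With the union, a preview-mode tweak physically cannot leak into the edit branch, which is exactly the breakage the bullet points above describe.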

3. No Concept of Future-Proofing

Despite clear indications that more modes and features were coming, the AI always optimized for today's requirements.

No foresight. No guardrails. Just patch after patch.

That might be okay in a toy project. But in real codebases, this mindset creates tech debt fast, and it becomes your job to clean it up later.

4. Ambiguous UX Interpretation

One email asked for a "compact view" toggle. The AI added it, but simply jammed another if (compactMode) check into the same component. No separation, no layout abstraction, no design consideration.

When the next task came in to add editable fields, it clashed with the compact mode layout and caused DOM inconsistencies.
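What a layout abstraction could look like instead, as a minimal sketch (the names Density, layoutFor, and renderFields are invented for this example): density decisions live in one lookup, and the field-rendering code never checks which mode is active.

```typescript
// Hypothetical example: spacing and label visibility are layout concerns,
// pulled out of the component into a single mode-to-layout mapping.
type Density = "comfortable" | "compact";

const layoutFor: Record<Density, { separator: string; showLabels: boolean }> = {
  comfortable: { separator: "\n\n", showLabels: true },
  compact: { separator: " | ", showLabels: false },
};

// The renderer reads its layout from the mapping; adding a third density
// later means adding one entry, not another if-branch per component.
function renderFields(
  fields: [label: string, value: string][],
  density: Density
): string {
  const { separator, showLabels } = layoutFor[density];
  return fields
    .map(([label, value]) => (showLabels ? `${label}: ${value}` : value))
    .join(separator);
}
```

Under this shape, adding editable fields would change the field renderers, not the density logic, so the two features can't collide the way they did here.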

And the list goes on. The AI produced working code, yes. But clean? Maintainable? Scalable?

Not even close.

What This Experiment Proves

This isn't a dunk on AI. It's a reality check for developers.

AI can generate code — fast. But it can't think like a senior developer unless you tell it how to. And you can't do that if you don't have the mental model in place yourself.

Code generation is easy.

Code architecture is learned.

And you can't learn it by just watching tutorial after tutorial.

  • You learn it by making mistakes.
  • By feeling the pain of a design change.
  • By trying to scale messy code.
  • By realizing you trapped yourself with your own component logic.

That's what the Reality Check Challenge gives you.

It's not about getting it perfect. It's about learning why it broke and how to do it better next time.

Can You Complete the Challenge AI Failed?

Cursor AI couldn't structure a scalable frontend solution across 10 days of real-world requirements.

Can you?

Take the challenge, build your own version of the app, and see how well your decisions hold up.

You'll come out of it with:

  • Sharper architecture instincts
  • Realistic experience handling changing requirements
  • A stronger understanding of what good looks like

And if you want to go even deeper, submit your code for a detailed review. I'll show you what you missed — and how to level up fast.