Building Software with Claude and Windsurf: What Worked (and What Didn’t)

June 13, 2025

When it comes to AI-assisted development, sometimes known as vibe coding, there’s no shortage of hype. But what actually happens when you sit down and try to build something real? 

To find out, we ran an internal experiment using Claude 3.5 Sonnet and the Windsurf IDE to build Scopic People, an internal dashboard designed to centralize contractor information.

Our goal was to evaluate real productivity, output quality, and workflow usability using AI as a primary driver. 

In this blog, we will tell you all about what worked, what didn’t, and what we learned along the way. Let’s dive right in. 

*Note: All findings are drawn directly from Scopic’s whitepaper on AI-Powered Development: Promise and Perils. 

What Worked 

Let’s start with the good news. Throughout the development of Scopic People, several aspects of the AI-assisted workflow proved highly effective.  

These successes highlight where Claude 3.5 Sonnet and Windsurf offered tangible productivity gains and produced reliable, usable output with minimal intervention.  

Task Structuring for AI 

The AI produced the most reliable output when instructions were specific and tasks were clearly segmented. Attempting to generate full interfaces or multiple components in one step often led to errors. Breaking the work into smaller, logically discrete steps improved results significantly and reduced the need for rework. 

Prompt-Based Development 

Windsurf’s Cascade interface enabled direct prompting within the codebase, allowing developers to iterate on instructions in real time. The integration of natural language prompts and inline output made it easier to test changes, revise code, and minimize context-switching. 

Rapid Prototyping and Setup 

Claude performed well during initial project setup. It created the Git repository, added appropriate configuration files, and generated scaffolded user interface elements using Tailwind CSS and DaisyUI. These steps were completed quickly with minimal adjustment. 

Complex Logic and Architectural Patterns 

When given precise requirements, Claude successfully handled service layer design, data modeling, and role-based access control. In one case, it exceeded expectations by replacing hardcoded administrator roles with a more maintainable, database-driven access system. 
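To illustrate the kind of improvement described above, here is a minimal sketch of a database-driven role check replacing a hardcoded administrator list. The table and column names are illustrative assumptions, not Scopic People's actual schema:

```python
import sqlite3

# Hypothetical sketch: roles live in a table instead of a hardcoded list,
# so granting admin rights becomes a data change, not a code change.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_roles (user_id INTEGER, role TEXT)")
conn.execute("INSERT INTO user_roles VALUES (1, 'admin'), (2, 'viewer')")

def has_role(user_id: int, role: str) -> bool:
    """Check a user's role against the database at request time."""
    row = conn.execute(
        "SELECT 1 FROM user_roles WHERE user_id = ? AND role = ?",
        (user_id, role),
    ).fetchone()
    return row is not None

print(has_role(1, "admin"))   # True
print(has_role(2, "admin"))   # False
```

With this pattern, promoting a user to administrator is a single `INSERT`, with no redeploy required.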

Containerization and DevOps Tasks 

The AI completed DevOps-related work effectively, including: 

  • Generating Docker and docker-compose files 
  • Configuring PostgreSQL integration 
  • Implementing caching logic 
  • Creating schema migration scripts 
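For context, a typical artifact from this step looks something like the following docker-compose file wiring an application container to PostgreSQL. Service names, ports, and credentials here are illustrative assumptions, not the project's actual configuration:

```yaml
# Illustrative docker-compose.yml; names and values are placeholders.
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgres://app:app@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```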

What Didn’t Work 

Despite the overall time savings and functional output, several challenges emerged when working with Claude and Windsurf. These limitations made clear when human input was still essential.

Overly Complex Prompts 

Broad or multi-part instructions led to incomplete or incorrect output. Attempts to generate large interface sections or multi-step workflows in one prompt were often unsuccessful and had to be restructured. 

Looping and Repetitive Output 

In some cases, the LLM entered a loop – reapplying the same change or reversing previous code. Developers had to stop the process, revert to the last stable version, and rewrite the prompt with greater specificity. 
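The recovery workflow described above maps naturally onto ordinary git commands. A sketch, with a placeholder commit hash:

```shell
# Checkpoint before each AI prompt so a looping run can be rolled back.
git add -A
git commit -m "checkpoint: before prompt"

# If the model starts reapplying or reversing its own changes,
# discard everything since the last stable commit:
git reset --hard HEAD

# Or roll back to an earlier known-good commit (hash is a placeholder):
# git reset --hard <last-stable-sha>
```

Committing before every prompt keeps the blast radius of a bad generation to a single revert.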

API Integrations Without Good Docs 

Integration with the Zoho People API could not be completed independently by Claude due to limited documentation. A developer manually explored the API, gathered endpoint details, and then provided that context to the model to complete the integration. 

Minor UI Adjustments and Fixes 

For simple layout changes – such as spacing or button positioning – it was faster to edit the code manually than to prompt the AI. The overhead of crafting a precise prompt often outweighed the benefits. 

The Key: Knowing When to Step In 

Although the goal was to explore the full extent of AI-assisted development, the most efficient outcome came from a hybrid approach.

Claude 3.5 Sonnet was most effective when used for logic scaffolding, backend structure, and repetitive code generation.  

Developers remained essential for quality assurance, integration troubleshooting, and rapid micro-adjustments. 

Conclusion 

This experiment demonstrated that AI-powered development tools can significantly reduce time-to-completion – when used with structure and oversight.  

In our case, Claude and Windsurf helped complete a working internal application in just 9 hours, compared to an estimated 144–180 hours using traditional methods.  

Key recommendations for development teams: 

  • Break tasks into clearly defined steps 
  • Use AI for scaffolding, not precision changes 
  • Maintain version control throughout 
  • Step in manually for third-party APIs and UI tweaks 
  • Treat AI as a tool that extends – not replaces – developer expertise 

To learn more about how our experiment went, download the full whitepaper – AI-Powered Development: Promise and Perils.

Or book a free consultation to see how we can support your LLM strategy. 

About Creating the Article – Building Software with Claude and Windsurf: What Worked (and What Didn’t)

This guide was authored by Angel Poghosyan, and reviewed by Mladen Lazic, Chief Operations Officer at Scopic.

Scopic provides quality, informative content, powered by our deep-rooted expertise in software development. Our team of content writers and experts has deep knowledge of the latest software technologies, allowing them to break down even the most complex topics in the field. They also know how to tackle topics from a wide range of industries, capture their essence, and deliver valuable content across all digital platforms.

Note: This blog’s images are sourced from Freepik.

If you would like to start a project, feel free to contact us today.
Have more questions?

Talk to us about what you’re looking for. We’ll share our knowledge and guide you on your journey.