Building Software with Claude and Windsurf: What Worked (and What Didn’t)

June 13, 2025

When it comes to AI-assisted development, sometimes known as vibe coding, there’s no shortage of hype. But what actually happens when you sit down and try to build something real? 

To find out, we ran an internal experiment using Claude 3.5 Sonnet and the Windsurf IDE to build Scopic People, an internal dashboard designed to centralize contractor information.

Our goal was to evaluate real productivity, output quality, and workflow usability using AI as a primary driver. 

In this blog, we will tell you all about what worked, what didn’t, and what we learned along the way. Let’s dive right in. 

*Note: All findings are drawn directly from Scopic’s whitepaper on AI-Powered Development: Promise and Perils. 

What Worked 

Let’s start with the good news. Throughout the development of Scopic People, several aspects of the AI-assisted workflow proved highly effective.  

These successes highlight where Claude 3.5 Sonnet and Windsurf offered tangible productivity gains and produced reliable, usable output with minimal intervention.  

Task Structuring for AI 

The AI produced the most reliable output when instructions were specific and tasks were clearly segmented. Attempting to generate full interfaces or multiple components in one step often led to errors. Breaking the work into smaller, logically discrete steps improved results significantly and reduced the need for rework. 

Prompt-Based Development 

Windsurf’s Cascade interface enabled direct prompting within the codebase, allowing developers to iterate on instructions in real time. The integration of natural language prompts and inline output made it easier to test changes, revise code, and minimize context-switching. 

Rapid Prototyping and Setup 

Claude performed well during initial project setup. It created the Git repository, added appropriate configuration files, and generated scaffolded user interface elements using Tailwind CSS and DaisyUI. These steps were completed quickly with minimal adjustment. 

Complex Logic and Architectural Patterns 

When given precise requirements, Claude successfully handled service layer design, data modeling, and role-based access control. In one case, it exceeded expectations by replacing hardcoded administrator roles with a more maintainable, database-driven access system. 
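To illustrate the kind of improvement described above, here is a minimal sketch of a database-driven role check replacing a hardcoded administrator list. The table and column names are illustrative assumptions, not Scopic People's actual schema:

```python
import sqlite3

# Hypothetical sketch: roles live in a table instead of a hardcoded list,
# so granting admin rights becomes a data change, not a code change.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_roles (user_id INTEGER, role TEXT)")
conn.execute("INSERT INTO user_roles VALUES (1, 'admin'), (2, 'viewer')")

def has_role(user_id: int, role: str) -> bool:
    """Check a user's role against the database at request time."""
    row = conn.execute(
        "SELECT 1 FROM user_roles WHERE user_id = ? AND role = ?",
        (user_id, role),
    ).fetchone()
    return row is not None

print(has_role(1, "admin"))   # True
print(has_role(2, "admin"))   # False
```

With this pattern, promoting a user to administrator is a single `INSERT`, with no redeploy required.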

Containerization and DevOps Tasks 

The AI completed DevOps-related work effectively, including: 

  • Generating Docker and docker-compose files 
  • Configuring PostgreSQL integration 
  • Implementing caching logic 
  • Creating schema migration scripts 
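For context, a typical artifact from this step looks something like the following docker-compose file wiring an application container to PostgreSQL. Service names, ports, and credentials here are illustrative assumptions, not the project's actual configuration:

```yaml
# Illustrative docker-compose.yml; names and values are placeholders.
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgres://app:app@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```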

What Didn’t Work 

Despite the overall time savings and functional output, several challenges emerged when working with Claude and Windsurf. These limitations made clear when human input was still essential.

Overly Complex Prompts 

Broad or multi-part instructions led to incomplete or incorrect output. Attempts to generate large interface sections or multi-step workflows in one prompt were often unsuccessful and had to be restructured. 

Looping and Repetitive Output 

In some cases, the LLM entered a loop – reapplying the same change or reversing previous code. Developers had to stop the process, revert to the last stable version, and rewrite the prompt with greater specificity. 
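The recovery workflow described above maps naturally onto ordinary git commands. A sketch, with a placeholder commit hash:

```shell
# Checkpoint before each AI prompt so a looping run can be rolled back.
git add -A
git commit -m "checkpoint: before prompt"

# If the model starts reapplying or reversing its own changes,
# discard everything since the last stable commit:
git reset --hard HEAD

# Or roll back to an earlier known-good commit (hash is a placeholder):
# git reset --hard <last-stable-sha>
```

Committing before every prompt keeps the blast radius of a bad generation to a single revert.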

API Integrations Without Good Docs 

Integration with the Zoho People API could not be completed independently by Claude due to limited documentation. A developer manually explored the API, gathered endpoint details, and then provided that context to the model to complete the integration. 

Minor UI Adjustments and Fixes 

For simple layout changes – such as spacing or button positioning – it was faster to edit the code manually than to prompt the AI. The overhead of crafting a precise prompt often outweighed the benefits. 

The Key: Knowing When to Step In 

Although the goal was to explore the full extent of AI-assisted development, the most efficient outcome came from a hybrid approach.

Claude 3.5 Sonnet was most effective when used for logic scaffolding, backend structure, and repetitive code generation.  

Developers remained essential for quality assurance, integration troubleshooting, and rapid micro-adjustments. 

Conclusion 

This experiment demonstrated that AI-powered development tools can significantly reduce time-to-completion – when used with structure and oversight.  

In our case, Claude and Windsurf helped complete a working internal application in just 9 hours, compared to an estimated 144–180 hours using traditional methods.  

Key recommendations for development teams: 

  • Break tasks into clearly defined steps 
  • Use AI for scaffolding, not precision changes 
  • Maintain version control throughout 
  • Step in manually for third-party APIs and UI tweaks 
  • Treat AI as a tool that extends – not replaces – developer expertise 

To learn more about how our experiment went, download the full whitepaper – AI-Powered Development: Promise and Perils.

Or book a free consultation to see how we can support your LLM strategy. 

About Creating the Article – Building Software with Claude and Windsurf: What Worked (and What Didn’t)

This guide was authored by Angel Poghosyan, and reviewed by Mladen Lazic, Chief Operations Officer at Scopic.

Scopic provides quality, informative content, powered by our deep-rooted expertise in software development. Our team of content writers and experts has deep knowledge of the latest software technologies, allowing them to break down even the most complex topics in the field. They also know how to tackle topics from a wide range of industries, capture their essence, and deliver valuable content across all digital platforms.

Note: This blog’s images are sourced from Freepik.

If you would like to start a project, feel free to contact us today.
Have more questions?

Talk to us about what you’re looking for. We’ll share our knowledge and guide you on your journey.