1 - Overview

Setting the stage for the guide. Why use LLMs for spec-to-code? What are the core principles of a spec-centered workflow augmented by AI? This section introduces the philosophy and the guide’s structure.
  • Key Points:
    • The primacy of the specification as the single source of truth.
    • LLMs as powerful accelerators for code generation and translation, not replacements for human design or verification.
    • Balancing the speed of AI generation with the necessity of correctness, maintainability, and alignment with architectural goals.
    • Brief overview of the guide’s structure based on the Diátaxis framework (Tutorials, How-to Guides, Reference, Explanations).

2 - Tutorials

Step-by-step guides walking the user through specific spec-to-code tasks using LLMs. Each tutorial focuses on achieving a concrete, tangible outcome by following a guided path from a given specification snippet to functional code.
  • Learning Goals: Users will gain practical experience in applying LLMs to generate code from various specification types and integrating the output into a project, understanding the iterative nature of the process.

2.1 - First Steps: From Spec to First Code

Build a basic web app from a short functional spec using GitHub Copilot.

Key Points:

  • Introduction to LLMs and their role in coding
  • Setting up your development environment
  • Selecting the right spec chunks
  • Basic prompt crafting
  • Managing output iterations
  • Hands-on exercise: Generating a simple function from a specification

Assets:

  • 📄 Starter Spec Template (Markdown)
  • 📄 Prompt Template: “Draft a React component from this spec…”
  • 📄 Prompt Engineering Flowchart (Spec ➔ Prompt ➔ Output ➔ Test)
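
To make the loop concrete, here is a hedged sketch of what a first pass might produce. The one-line spec and the TypeScript function are both illustrative, not drawn from the starter template:

```typescript
// Illustrative spec line: "The system shall return a user's display name,
// falling back to their email address when no name is set."

interface User {
  name?: string;
  email: string;
}

// A plausible first LLM draft. Expect to iterate: the whitespace-only
// name case below is exactly the kind of edge a first output may miss.
function displayName(user: User): string {
  const trimmed = user.name?.trim();
  return trimmed ? trimmed : user.email;
}

console.log(displayName({ email: "ada@example.com" })); // "ada@example.com"
console.log(displayName({ name: "Ada", email: "ada@example.com" })); // "Ada"
```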

2.2 - Building a Simple REST API Endpoint from an Interface Spec

Guide on using an LLM (e.g., via Copilot or Cursor) to generate code for a basic CRUD API endpoint based on a simplified Interface/API Specification snippet. Focuses on Python (Flask/FastAPI) or TypeScript (Node.js/Express).

Key Points:

  • Understanding how the endpoint fits into the broader business requirements.
  • Breaking the requirements down into actionable implementation tasks.
  • Analyzing the Interface/API Specification to identify key elements (endpoints, methods, request/response formats).
  • Mapping requirements to code components.
  • Using LLMs to generate initial code drafts.
  • Crafting the initial prompt to request the API endpoint code, providing necessary context (language, framework, spec details).
  • Iterative prompting to add details like input validation, basic error handling, and database interaction placeholders.
  • Integrating the generated code into a minimal project structure.
  • Basic manual testing or using a simple script to verify the endpoint’s response structure.

Assets:

  • Example Interface/API Specification snippet (Markdown or simple text format).
  • Prompt Template: “Generate API Endpoint from Spec” (Reusable Markdown template).
  • Example LLM interaction flow (showing initial prompt, refinement prompts, and responses).
  • Code Snippets: Generated endpoint code, minimal application setup code.
  • Callout (Inline Warning): “Common Pitfall: Over-reliance on the first LLM output. Always expect to iterate and refine based on the spec.”
  • Callout (Inline Tip): “Tip: Be explicit about the desired libraries and framework versions in your prompt.”
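
For orientation, a minimal sketch of the kind of endpoint this tutorial produces, assuming the TypeScript/Express path. The `/tasks` route, the `Task` shape, and the in-memory array are illustrative stand-ins for the spec snippet and the database placeholders above:

```typescript
import express from "express";

const app = express();
app.use(express.json());

// In-memory store standing in for the database interaction placeholder.
interface Task { id: number; title: string; done: boolean; }
const tasks: Task[] = [];

// POST /tasks — create a task per the (hypothetical) interface spec.
app.post("/tasks", (req, res) => {
  const { title } = req.body;
  if (typeof title !== "string" || title.trim() === "") {
    return res.status(400).json({ error: "title is required" }); // basic input validation
  }
  const task: Task = { id: tasks.length + 1, title: title.trim(), done: false };
  tasks.push(task);
  res.status(201).json(task);
});

// GET /tasks/:id — read a single task.
app.get("/tasks/:id", (req, res) => {
  const task = tasks.find((t) => t.id === Number(req.params.id));
  return task ? res.json(task) : res.status(404).json({ error: "not found" });
});

app.listen(3000);
```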

2.3 - Creating a React Component from a UI/Functional Spec

Guide on using an LLM to generate a React component based on a UI description or Functional Specification snippet, emphasizing styling with Tailwind CSS.

Key Points:

  • Translating visual descriptions and functional requirements from the spec into prompt requirements (component name, props, state, behavior, styling).
  • Specifying desired libraries (React) and styling framework (Tailwind CSS) explicitly.
  • Prompting for component structure, handling state/props as defined in the spec.
  • Refining generated JSX and applying/correcting Tailwind classes based on the spec’s visual requirements.
  • Integrating the component into a sample React application and verifying its appearance and basic functionality.

Assets:

  • Example UI/Functional Specification snippet (could include text descriptions, simple wireframe ASCII art, or reference to a diagram).
  • Prompt Template: “Generate React Component from Spec” (Reusable Markdown template).
  • Example LLM interaction flow.
  • Code Snippets: Generated React component code (JSX with Tailwind classes).
  • Callout (Inline Tip): “Tip: For complex UI, break it down into smaller components and generate them individually.”
  • Callout (Inline Warning): “Warning: LLMs might not perfectly replicate complex layouts from text descriptions alone. Visual review is essential.”
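
As a reference point, a hedged sketch of the kind of component this tutorial targets. The `AlertBanner` name, its props, and the Tailwind classes are assumptions for illustration, not taken from the example spec:

```typescript
import React from "react";

// Props derived from a hypothetical spec: a dismissible alert banner
// whose severity level controls its Tailwind color classes.
interface AlertBannerProps {
  message: string;
  severity: "info" | "warning";
  onDismiss: () => void;
}

export function AlertBanner({ message, severity, onDismiss }: AlertBannerProps) {
  const colors =
    severity === "warning"
      ? "bg-amber-100 text-amber-800"
      : "bg-blue-100 text-blue-800";
  return (
    <div className={`flex items-center justify-between rounded p-4 ${colors}`}>
      <span>{message}</span>
      <button onClick={onDismiss} aria-label="Dismiss" className="font-bold">
        ×
      </button>
    </div>
  );
}
```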

2.4 - Generating Database Schema and CRUD Operations from a Data Model Spec

Guide on using an LLM to create database schema definitions (SQL CREATE TABLE statements) and basic CRUD functions (e.g., Python with SQLAlchemy or TypeScript with Mongoose) based on a Data Model or relevant section of a Software Requirements Specification.

Key Points:

  • Identifying entities, attributes, relationships, and data types from the spec.
  • Representing data relationships (one-to-one, one-to-many, many-to-many) clearly in prompts.
  • Prompting for SQL schema definitions or ORM model classes.
  • Generating basic functions for Create, Read, Update, and Delete operations for one or more entities.
  • Choosing between generating raw SQL or ORM code based on project needs and prompting accordingly.

Assets:

  • Example Data Model snippet from an SRS or a dedicated Data Specification.
  • Prompt Template: “Generate Database Schema/CRUD from Spec” (Reusable Markdown template).
  • Example LLM interaction flow.
  • Code Snippets: Generated SQL schema, generated ORM models, generated CRUD function stubs.
  • Callout (Inline Warning): “Warning: Always review generated schema definitions and raw SQL for potential security vulnerabilities (e.g., lack of proper escaping, injection risks).”
  • Callout (Inline Note): “Note: LLMs are good at generating standard CRUD patterns, but complex query logic often requires significant human guidance or writing.”
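
For the ORM route, a minimal sketch using TypeScript with Mongoose. The `Book`/`Author` entities and fields are hypothetical, standing in for whatever the data model spec actually defines:

```typescript
import mongoose from "mongoose";

// Illustrative one-to-many relationship: an Author has many Books,
// expressed as a reference from each Book back to its Author.
const bookSchema = new mongoose.Schema({
  title: { type: String, required: true },
  publishedYear: Number,
  author: { type: mongoose.Schema.Types.ObjectId, ref: "Author", required: true },
});

export const Book = mongoose.model("Book", bookSchema);

// Generated CRUD stubs — review before use; error handling is omitted here.
export const createBook = (data: { title: string; author: string }) =>
  Book.create(data);
export const getBook = (id: string) => Book.findById(id).exec();
export const updateBook = (
  id: string,
  changes: Partial<{ title: string; publishedYear: number }>
) => Book.findByIdAndUpdate(id, changes, { new: true }).exec();
export const deleteBook = (id: string) => Book.findByIdAndDelete(id).exec();
```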

2.5 - Writing Unit Tests Based on Acceptance Criteria

Guide on using an LLM to generate unit tests (e.g., Jest for JS/TS, Pytest for Python) for existing code based on User Stories and their associated Acceptance Criteria.

Key Points:

  • Understanding how Acceptance Criteria translate into test cases (Given-When-Then).
  • Providing the LLM with the code to be tested and the relevant User Story/Acceptance Criteria.
  • Prompting for specific test functions or methods covering each criterion.
  • Refining generated test code, adding necessary setup/teardown, and ensuring assertions are correct.
  • Running the generated tests and verifying their correctness and coverage against the criteria.

Assets:

  • Example User Story and Acceptance Criteria.
  • Example code snippet (e.g., a function or class) to be tested.
  • Prompt Template: “Generate Unit Tests from Acceptance Criteria” (Reusable Markdown template).
  • Example LLM interaction flow.
  • Code Snippets: Generated unit test code.
  • Callout (Inline Tip): “Tip: Provide the LLM with the signature or interface of the code being tested for better results.”
  • Callout (Inline Note): “Note: LLMs can generate test structures and ideas based on criteria, but human expertise is vital for ensuring comprehensive test coverage and correct assertions.”
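
A small hedged example of the Given-When-Then mapping in Jest. The `Cart` module and the criterion itself are invented for illustration:

```typescript
// Acceptance criterion (illustrative): "Given a cart with items,
// when all items are removed, then the total is zero."
import { Cart } from "./cart"; // hypothetical module under test

describe("Cart total", () => {
  it("returns zero after all items are removed", () => {
    // Given
    const cart = new Cart();
    cart.add({ sku: "A1", price: 9.99 });
    // When
    cart.remove("A1");
    // Then
    expect(cart.total()).toBe(0);
  });
});
```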

2.6 - Evolving Code as Specs Evolve

Iteratively update a Python backend when specs change mid-development.

Key Points:

  • Version control practices
  • Updating LLM prompts with deltas
  • Regression testing LLM changes

Assets:

  • 📄 Prompt Template: “Update existing code to reflect these changes…”
  • ⚡ Callout: Warning about code drift
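
One possible shape for the delta-update prompt; the session-expiry change is an invented example:

```markdown
You are updating existing, working code. Do not rewrite unrelated parts.

## Current code
<paste the affected function or module>

## Spec change (delta only)
- OLD: Sessions expire after 30 minutes.
- NEW: Sessions expire after 15 minutes of inactivity.

## Task
Apply the minimal change needed, list every line you modified, and flag
anything the delta leaves ambiguous instead of guessing.
```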

2.7 - Cross-Spec Challenge: Multi-Spec Integration

Combine a Product Requirements Doc and a System Design Spec to build a feature.

Key Points:

  • Reconciling different spec types
  • Layering prompts
  • Managing ambiguity

Assets:

  • 📄 Example PRD + System Design Spec
  • 📄 Diagram: “Spec Flow Merging”

3 - How-to Guides

Practical guides focused on achieving specific tasks or solving common problems that arise when using LLMs in the spec-to-code workflow. These guides assume the user knows what they want to do and provide steps on how to do it effectively.
  • Goals: Users will learn how to perform key actions like setting up tools, validating output, managing changes, integrating LLMs into their workflow, and applying effective prompting strategies.

3.1 - Setting Up Your LLM Coding Environment

Step-by-step instructions for getting started with different LLM coding platforms and integrating them into your development setup.

Key Points:

  • Installing and configuring popular IDE extensions (e.g., GitHub Copilot in VSCode, JetBrains IDEs).
  • Getting started with AI-native IDEs (e.g., Cursor, Windsurf - covering installation and basic usage).
  • Setting up and running local models via Ollama for privacy or offline use.
  • Managing API keys, authentication, and usage limits for cloud-based tools.

Assets:

  • Configuration code snippets (e.g., VSCode settings JSON, basic Ollama run commands).
  • Callout (Inline Tip): “Tip: Consider starting with an IDE extension for minimal workflow disruption.”
  • Callout (Inline Note): “Note: Local models offer privacy benefits but may require more setup and computational resources.”
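
For the local-model path, the basic Ollama loop looks like this (the model name is an example; substitute whichever open model suits your hardware):

```sh
# Download a model once, then run prompts against it locally.
ollama pull llama3
ollama run llama3 "Generate a TypeScript function that validates an ISO 8601 date string."
```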

3.2 - Validating and Verifying LLM-Generated Code

Practical steps and recommended practices for ensuring the quality, correctness, security, and performance of code produced by LLMs before integrating it into your codebase.

Key Points:

  • Implementing a structured manual code review process specifically for LLM output.
  • Leveraging static analysis tools (linters, formatters like Black, ESLint, Prettier) to enforce code style and catch basic errors.
  • Running and extending LLM-generated tests (as shown in Tutorial 2.5).
  • Using security scanning tools (SAST) on generated code snippets.
  • Basic performance considerations and profiling generated code if necessary.
  • Mapping output back to spec acceptance criteria.
  • Writing lightweight validation prompts.
  • Using checklists to catch missing functionality.

Assets:

  • Example configurations for linters/formatters.
  • Reusable Template: Code Review Checklist for LLM Output (Markdown template).
  • Callout (Inline Warning): “Warning: LLMs can confidently generate insecure code patterns. Security review is non-negotiable.”
  • Callout (Inline Tip): “Tip: Integrate linting and formatting into a pre-commit hook to automate checks.”
  • 📄 Mini flowchart: “Spec ➔ Code ➔ Review ➔ Confirm”
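
A typical local validation pass for a TypeScript project might chain the tools above, assuming ESLint, Prettier, and Jest are already configured:

```sh
npx eslint .            # static analysis: catch errors and style violations
npx prettier --check .  # formatting: fail if files deviate from project style
npx jest                # run the test suite, including LLM-generated tests
```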

3.3 - Managing Changes and Spec Evolution

Strategies and techniques for handling updates and changes to specifications over time and aligning them with previously generated, potentially human-modified code. Addresses the ‘lifecycle management challenge’ and strategies for working under uncertainty.

Key Points:

  • Identifying and assessing the impact of changes in an updated specification.
  • Deciding whether to attempt regenerating the affected code section or manually modifying the existing code.
  • Using version control (Git) and diffing tools effectively to understand changes.
  • Strategies for merging LLM-generated code updates with existing human-written code, minimizing conflicts.
  • Documenting which parts of the codebase originated from LLM generation and which were human-modified.
  • Triaging missing information.
  • Hypothesis-driven prompting.
  • Annotating assumptions.

Assets:

  • Flowchart: “Spec Change Handling Workflow with LLMs” (Placeholder for a visual diagram).
  • Callout (Inline Note): “Note: Regenerating large sections of code can be faster initially but may lead to more complex merges later.”
  • Callout (Inline Tip): “Tip: Use clear commit messages indicating when code was LLM-generated vs. human-written.”
  • ⚡ Callout: Risk of hallucinations when specs are weak
  • 📄 “Assumption Tracker” Template
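
One possible shape for the Assumption Tracker; the columns and the sample row are illustrative:

```markdown
| # | Assumption | Source of ambiguity | How it was prompted | Status |
|---|------------|---------------------|---------------------|--------|
| 1 | “User” means an authenticated user | Spec §2.1 is silent on guests | “Assume authenticated users only” | Awaiting product confirmation |
```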

3.4 - Crafting Effective Prompts for Spec-to-Code

Advanced techniques and best practices for writing prompts that maximize the chances of getting accurate, relevant, and usable code from specifications. This is core ‘prompt engineering’ for this specific use case.

Key Points:

  • Structuring your prompts logically: Role/Persona, Context (project, libraries, relevant code), Instruction (the task), Examples (few-shot prompting), Format (desired output structure).
  • Providing sufficient and relevant context without overwhelming the model.
  • Translating abstract spec requirements into concrete coding tasks and constraints.
  • Using negative constraints effectively (“Do not use library X”, “Avoid pattern Y”).
  • Techniques for iterative prompting and refining output through conversation.
  • Handling ambiguity in specifications through clarifying prompts.
  • Decomposing specs into promptable units.
  • Using “show, don’t tell” in prompts.
  • Prompt chaining for large features.

Assets:

  • Prompt Engineering Template: General Structure for Spec-to-Code Prompts (Reusable Markdown template).
  • Examples: Side-by-side comparison of ineffective vs. effective prompts for a given spec snippet.
  • Callout (Inline Tip): “Tip: Start with a clear, concise instruction and add context/constraints iteratively if the initial output is insufficient.”
  • Callout (Inline Misconception): “Common Misconception: More detail is always better. Focus on relevant detail and clear structure.”
  • 📄 Prompt Skeleton Template
  • ⚡ Callout: Common Mistakes (e.g., vague verbs)
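
A hedged version of the prompt skeleton, following the Role/Context/Instruction/Examples/Format structure above; every placeholder is to be filled per task:

```markdown
## Role
You are a senior TypeScript developer on <project>.

## Context
- Framework and versions: <e.g., Express 4, Node 20>.
- Relevant spec excerpt: <paste only the section that applies>.
- Existing patterns to follow: <link or paste a representative file>.

## Instruction
Implement <specific task>. Do not use <forbidden library or pattern>.

## Examples
<one short input/output pair, if available>

## Format
Return a single code block, followed by a bullet list of assumptions you made.
```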

3.5 - Writing LLM-Friendly Specifications

Guide on creating machine-parsable specifications that work well with language models.

Key Points:

  • Atomic user stories
  • Testable acceptance criteria
  • API contract syntax examples (OpenAPI/YAML)

Hands-on exercise: converting vague requirements into structured prompts.
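
As a taste of the API-contract format, a minimal illustrative OpenAPI fragment; the `Tasks` resource is an example, not part of the exercise:

```yaml
openapi: 3.0.3
info:
  title: Tasks API
  version: 1.0.0
paths:
  /tasks/{id}:
    get:
      summary: Fetch a single task
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: integer }
      responses:
        "200":
          description: The requested task
        "404":
          description: Task not found
```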

3.6 - Integrating LLM Assistance into Your Development Workflow

Guidance on incorporating LLM tools seamlessly into existing development practices, including IDE usage, version control workflows, and potential (cautious) integration into CI/CD pipelines.

Key Points:

  • Leveraging IDE features for inline code completion and chat-based assistance.
  • Establishing team conventions for using and reviewing LLM-generated code within a version control system (e.g., specific branch naming, required reviews).
  • Exploring possibilities for automating boilerplate or test generation in CI/CD (emphasizing the need for rigorous validation steps afterward).
  • Strategies for team collaboration when some members are using LLM tools and others are not.

Assets:

  • Diagram: “LLM-Assisted Development Workflow” (Placeholder for a visual diagram showing integration points).
  • Callout (Inline Warning): “Warning: Fully automating code generation and deployment in CI/CD without human oversight is risky and not recommended for critical systems.”
  • Callout (Inline Note): “Note: Consistency in how LLM tools are used and reviewed is key for team effectiveness.”
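
A sketch of a cautious CI gate for a Node project, assuming GitHub Actions: every pull request, LLM-assisted or not, passes the same automated checks before human review:

```yaml
name: validate-pr
on:
  pull_request:
    branches: [main]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test   # generated code gets no shortcut past the test suite
```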

3.7 - Handling Platform Differences

Adjusting practices between Copilot, Cursor, Windsurf, and local LLMs.

Key Points:

  • Specific prompt tuning for each platform
  • Copilot vs. Cursor context window differences
  • Benefits of local models (privacy, control)

Assets:

  • 📄 Comparison Table: LLM IDE Platforms

4 - Reference

Provides factual information, definitions, and technical details relevant to LLM-assisted spec-to-code. This section is designed to be a lookup source for specific terms, tools, or concepts.
  • Goal: Serve as a quick, accurate source of information.

4.1 - Glossary of Terms

Definitions of key terms used throughout the guide and commonly found in the LLM/AI coding space.

Key Points:

  • Definitions for: LLM, Prompt Engineering, Specification (and brief definition of each type covered), Hallucination, Fine-tuning, Token, Context Window, Temperature (LLM parameter), Few-shot/Zero-shot prompting.
  • Definitions of llm.txt and llmd.txt (proposed standard usage):
    • llm.txt: A project-level file (in the root directory) containing guidelines, constraints, and context for LLMs assisting with the project (e.g., preferred libraries, architectural patterns to follow/avoid, general coding standards).
    • llmd.txt: Directory-specific files containing context, examples, or specific instructions relevant only to the code within that directory (e.g., details about a specific module, examples of patterns used in that area).
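
To make the proposal concrete, an illustrative llm.txt; the specific rules are examples of the kind of content such a file would hold:

```text
# llm.txt — project-level guidance for LLM assistants (illustrative)
Preferred stack: TypeScript, Express 4, Jest.
Follow the repository's ESLint/Prettier config; do not add new linters.
Avoid: global mutable state, default exports, hand-rolled SQL strings.
Every new endpoint needs input validation and a matching unit test.
```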

4.2 - Overview of LLM Coding Tools and Platforms

A summary of the capabilities, target use cases, and characteristics of the LLM coding tools and platforms mentioned in the guide.

Key Points:

  • IDE Extensions (Copilot)
    • GitHub Copilot: Focus (inline completion, chat), Integration (IDE extension), Model (OpenAI/Microsoft), Pros/Cons.
  • AI-Native IDEs
    • Cursor: Focus (AI-native editor, chat, code analysis), Integration (Standalone IDE), Model (various, including GPT-4), Pros/Cons.
    • Windsurf: Focus (specific features), Integration, Model, Pros/Cons (based on available information).
  • Local Deployment
    • Ollama: Focus (running local models), Integration (API, CLI, IDE extensions), Model (various open-source models), Pros/Cons (privacy vs. setup/resource needs).
  • Comparison summary (e.g., table format) highlighting key differences (cloud vs. local, IDE integration vs. native, features).

Assets:

  • 📄 LLM Platform Feature Table

4.3 - Types of Specifications

A brief overview and characteristics of the different specification types featured in the guide, focusing on the kind of information they contain that is valuable for LLM-assisted code generation.

Key Points:

  • Business Requirements Specification (BRS): High-level goals, “Why”. Useful for context.
  • Software Requirements Specification (SRS): Functional and non-functional requirements. Detailed “What”. Core input.
  • Functional Specification Document (FSD): Detailed behavior, user interaction flows. “How it works”. Core input.
  • Technical Specification (TS): Implementation details, data structures, algorithms. “How it’s built”. Useful for guiding LLM implementation choices.
  • System Design Specification (SDS): Architecture, component interaction. High-level “How”. Useful for ensuring LLM output fits the design.
  • Interface/API Specifications: Endpoints, data models, request/response formats. Precise input for API code.
  • Product Requirements Documents (PRDs): Market context, user stories, features. Focus on user value. Useful for context and user stories.
  • User Stories and Acceptance Criteria: User perspective, testable conditions. Excellent input for generating features and tests.
  • Test Specifications: Detailed test cases, setup, expected results. Input for test automation or verifying generated tests.
  • Differences
    • Business vs. Technical vs. Functional Specs
    • User Stories vs. Acceptance Criteria
  • What makes a spec “LLM-friendly”?
    • Language modeling ≠ semantic understanding
    • Bias toward plausible completion
    • Clarity, atomicity, consistency
    • Examples of good vs. bad specs

Assets:

  • 📄 Example spec snippets
  • 📄 Table of Spec Characteristics
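
A quick illustration of the good-vs-bad contrast (both snippets invented):

```markdown
**Vague:** “The system should handle errors gracefully.”

**LLM-friendly:** “When `POST /tasks` receives a body without a `title`
field, the API returns HTTP 400 with body `{ "error": "title is required" }`.”
```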

4.4 - Common Prompt Patterns

Collection of prompt templates for different scenarios.

Key Points:

  • Single Component Prompts
  • Multi-Step Feature Prompts
  • Refactoring Prompts

Assets:

  • 📄 Prompt Pattern Library (Markdown)
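
For example, a refactoring pattern from the library might read (wording illustrative):

```markdown
Refactor the function below to <goal, e.g., remove duplication> without
changing its observable behavior. Keep the existing public signature.
List anything you were unsure about instead of guessing.

<paste function>
```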

4.5 - Validation Checklists

Ready-to-use checklists for spec-to-code QA.

Key Points:

  • Functionality validation
  • Security/Safety checks
  • Style and readability reviews

Assets:

  • 📄 Downloadable Markdown QA Lists
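
A few sample items to suggest the checklists’ shape (extend per project):

```markdown
## Functionality
- [ ] Every acceptance criterion maps to at least one test.
- [ ] Edge cases named in the spec (empty input, limits) are handled.

## Security/Safety
- [ ] No secrets, credentials, or tokens appear in generated code.
- [ ] All external input is validated before use.

## Style and readability
- [ ] Naming and structure match the surrounding codebase.
```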

5 - Explanations

Provides background information, conceptual understanding, and rationale to help users build a mental model of how LLMs work in the spec-to-code context, why certain practices are recommended, and the broader implications.
  • Goal: Help users understand the “why” behind the techniques and tools, fostering deeper understanding rather than just rote application.

5.1 - Why Spec-Centered Development with LLMs?

The fundamental rationale behind prioritizing specifications when using AI for code generation. Explains the benefits compared to purely code-centric or ad-hoc AI usage.

Key Points:

  • Specifications as the stable, human-readable source of truth that transcends specific code implementations.
  • How a clear spec reduces ambiguity and potential LLM “hallucinations” or misinterpretations before code is generated.
  • Improving alignment and communication between different roles (Product, Design, QA, Engineering) by centering discussions on the spec.
  • LLMs as a powerful translation layer between structured human language (specifications) and code.
  • How this approach supports better maintainability and reduces technical debt compared to generating code without a clear, verifiable source.
  • Separation of concerns.
  • Managing human+AI collaboration.
  • Reducing technical debt early.

5.2 - How LLMs Process and Generate Code from Text

A simplified, accessible explanation of the underlying mechanisms of LLMs relevant to understanding their behavior in code generation. Avoids deep technical jargon where possible.

Key Points:

  • LLMs as pattern-matching engines trained on vast amounts of text and code data.
  • The concept of tokens and the context window: understanding the limitations on how much information an LLM can effectively process at once.
  • The probabilistic nature of output: LLMs predict the next token, which explains variations in output and the possibility of errors or “hallucinations.”
  • How training data influences code style, patterns, and potential biases in generated code.
  • The difference between understanding syntax/patterns and understanding complex system architecture or business logic without explicit guidance.
  • Diagram: Simplified LLM process flow (Input -> Processing -> Output).
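
A toy sketch of the temperature idea. This is not how an LLM is implemented; it only illustrates the probabilistic next-token behavior described above, with invented logits:

```typescript
// Softmax sampling with temperature: lower T sharpens the distribution
// (more deterministic output), higher T flattens it (more varied output).
function sampleNextToken(logits: Record<string, number>, temperature: number): string {
  const entries = Object.entries(logits);
  const scaled = entries.map(
    ([tok, logit]) => [tok, Math.exp(logit / temperature)] as const
  );
  const total = scaled.reduce((sum, [, w]) => sum + w, 0);
  let r = Math.random() * total;
  for (const [tok, w] of scaled) {
    r -= w;
    if (r <= 0) return tok;
  }
  return scaled[scaled.length - 1][0];
}

// E.g. with logits { "foo(": 2.0, "bar(": 1.0 }, temperature 0.2 almost
// always yields "foo(", while temperature 1.5 mixes the two candidates.
```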

5.3 - The Limitations and Capabilities of LLMs in Coding

A realistic and balanced look at what LLMs are currently good at in the context of software development and where their inherent limitations require human intervention and expertise. Addresses common misconceptions.

Key Points:

  • Capabilities: Generating boilerplate code, writing simple functions, translating code between languages, suggesting syntax, refactoring small code blocks, writing initial drafts of tests, explaining code snippets.
  • Limitations: Difficulty with complex algorithms, making architectural decisions, understanding the nuances of large, unfamiliar codebases, guaranteeing functional correctness or security, creative problem-solving beyond training patterns, handling ambiguous or incomplete specifications effectively without human clarification.
  • The problem of “hallucination” – generating plausible-looking but incorrect or non-existent code/APIs.
  • The critical and irreplaceable need for human oversight, review, and domain expertise.
  • Callout (Inline Misconception): “Common Misconception: LLMs can replace architects or senior engineers. They are tools that augment, not replace, human expertise.”

5.4 - The Importance of Validation and Human Oversight

Reinforcing the critical role of human review, automated checks, and testing in the LLM-assisted workflow. Explains why generated code cannot be blindly trusted and the potential consequences of skipping validation steps.

Key Points:

  • Why LLM output is a starting point, not a finished product.
  • The risks of unchecked generated code: bugs, security vulnerabilities, performance issues, technical debt, non-compliance with standards.
  • The human role in verifying alignment with the specification, overall system design, performance requirements, and security policies.
  • The necessity of comprehensive testing (unit, integration, end-to-end) regardless of how the code was generated.
  • Ethical considerations related to using LLMs for code (bias in training data, security implications).

6 - Appendices

Supplementary materials that provide additional context or resources.
  • Key Points:
    • Further Reading / Resources: Links to relevant research papers, articles, tools, and communities.
    • Case Studies: Brief examples or summaries of how LLMs have been successfully applied in spec-to-code scenarios (can be hypothetical or real-world examples if available and permissible).