Methodology

30 days. Seven stages. Zero guesswork.

Every AI consulting engagement follows our battle-tested SHIP-7 framework. You know exactly what happens on each day, what you'll receive, and what risks we're mitigating — before we write a single line of code.

Book a Technical Assessment

Why a repeatable framework matters.

Most AI projects fail not because the technology doesn't work — but because the process is ad-hoc. Teams skip discovery, jump into implementation, and spend months fixing problems that a proper architecture phase would have prevented.

Our 7-stage framework eliminates this. Every stage has clear objectives, concrete deliverables, and built-in risk mitigation. It's the same process whether we're building an AI copilot, hardening an existing system, or automating document processing.

The 30-day timeline

Implementation

Testing

Day 1Day 10Day 22Day 30

Stage 1

Days 1-3

Discovery

10% of engagement

Objectives

✓Understand the business problem, not just the technical request
✓Map the current workflow and identify where AI creates the highest impact
✓Define measurable success criteria and acceptance benchmarks
✓Align all stakeholders on scope, timeline, and expected outcomes

Activities

›Stakeholder interviews (CTO, product, engineering, end users)
›Current workflow analysis and pain point mapping
›Data availability and quality assessment
›Success metrics definition with baseline measurement
›Competitive and prior art analysis

Deliverables

■Discovery Report with business context and technical requirements
■Success Criteria Document with measurable KPIs
■Data Readiness Assessment
■Stakeholder Alignment Summary (signed off)

Tools

Loom, Notion, Linear, custom intake questionnaire

Risks & mitigation

Unclear business objectives → Structured intake questionnaire completed before Day 1
Missing stakeholder buy-in → Require CTO sign-off on discovery report before proceeding
Insufficient data → Data readiness gate: if data quality is below threshold, we pause and advise

Stage 2

Days 4-6

Technical Audit

10% of engagement

Objectives

✓Evaluate the existing codebase, infrastructure, and integration points
✓Identify technical constraints, security requirements, and compliance needs
✓Benchmark current system performance as a baseline for improvement
✓Surface hidden technical debt that could block AI integration

Activities

›Codebase review (architecture, patterns, tech debt)
›Infrastructure audit (cloud, CI/CD, monitoring, security)
›API and data pipeline assessment
›Performance benchmarking (latency, throughput, error rates)
›Security and compliance review (SOC2, GDPR, HIPAA as applicable)

Deliverables

■Technical Audit Report (architecture, gaps, recommendations)
■Infrastructure Readiness Scorecard
■Performance Baseline Document
■Security and Compliance Checklist

Tools

GitHub, SonarQube, Datadog/Grafana, AWS Well-Architected Tool, custom audit scripts

Risks & mitigation

Codebase too large to audit in 3 days → Focus on integration-relevant modules only, flag rest for future audit
No existing monitoring → Deploy lightweight observability in implementation phase
Compliance blockers discovered → Escalate immediately with mitigation plan, adjust scope if needed

Stage 3

Days 7-9

Architecture

10% of engagement

Objectives

✓Design the AI integration architecture with production constraints in mind
✓Select models, frameworks, and infrastructure based on requirements — not hype
✓Define the data pipeline, prompt strategy, and evaluation approach
✓Get architectural sign-off before writing any implementation code

Activities

›Architecture design (system diagrams, data flow, integration points)
›Model selection and evaluation (Claude, GPT, open-source, cost/performance trade-offs)
›Prompt engineering strategy and template design
›Fallback chain and error handling design
›Cost modeling at 1x, 5x, and 10x scale
›Architecture Decision Record (ADR) documentation

Deliverables

■Architecture Decision Record (ADR) with rationale for every choice
■System Architecture Diagram (C4 model)
■Model Selection Report with benchmarks
■Cost Projection Model (1x, 5x, 10x)
■Prompt Strategy Document

Tools

Excalidraw, LangSmith/Braintrust for model eval, custom cost calculator, Notion ADR templates

Risks & mitigation

Wrong model selection → Run structured evaluation with 50+ test cases before committing
Over-engineering → Apply YAGNI principle: design for current requirements, document future extensibility
Cost estimates miss reality → Include 40% buffer in cost projections, validate with spike during implementation

Stage 4

Days 10-22

Implementation

43% of engagement

Objectives

✓Build the AI feature in the client's codebase, not in isolation
✓Follow the client's PR process, coding standards, and deployment pipeline
✓Implement with production hardening from day one — not as an afterthought
✓Maintain daily progress visibility through async updates

Activities

›Core AI feature development (in client's repository)
›Prompt engineering, iteration, and optimization
›Integration with existing APIs, databases, and authentication
›Error handling, fallback chains, and circuit breakers
›Streaming response implementation (where applicable)
›Daily async progress updates (Slack/Loom)
›Mid-project check-in call (Day 16)

Deliverables

■Production code in client repository (via PRs)
■Prompt library with version control
■Integration layer with error handling
■Mid-project status report

Tools

Client's stack (Java/Python/Node.js/Go), OpenAI/Anthropic SDKs, pgvector, Redis, client's CI/CD pipeline

Risks & mitigation

Scope creep during implementation → Strict adherence to signed architecture document, change requests go through formal process
API rate limits or model degradation → Build multi-provider fallback chain from day one
Integration conflicts with existing code → Daily PRs with small, reviewable changes instead of large merges
Client team unavailable for reviews → Define review SLA in kickoff, escalate blockers within 24 hours

Stage 5

Days 23-26

Testing

13% of engagement

Objectives

✓Validate AI quality with a structured evaluation suite, not manual spot-checking
✓Load test under realistic conditions to verify performance at scale
✓Run adversarial testing to find edge cases before users do
✓Verify all acceptance criteria from the discovery phase are met

Activities

›Evaluation suite creation (50-100+ test cases across categories)
›Automated regression test pipeline
›Load testing and latency profiling under production-like traffic
›Adversarial and edge case testing (prompt injection, unexpected inputs)
›Acceptance criteria validation against discovery document
›User acceptance testing with client stakeholders

Deliverables

■Evaluation Suite (50-100+ test cases with expected outputs)
■Test Results Report with pass/fail rates per category
■Load Test Report (throughput, latency at p50/p95/p99)
■Adversarial Test Results with mitigations applied
■Acceptance Criteria Sign-Off Document

Tools

Braintrust/LangSmith for eval, k6/Locust for load testing, custom adversarial test harness, pytest/Jest

Risks & mitigation

Evaluation shows quality below threshold → Built-in buffer days for prompt iteration and fixes
Performance degrades under load → Implement caching, streaming, and request queuing before deployment
Edge cases discovered late → Adversarial testing runs in parallel with functional testing from Day 23

Stage 6

Days 27-28

Deployment

7% of engagement

Objectives

✓Deploy to production with full monitoring and observability from minute one
✓Configure alerting rules for cost, latency, error rate, and quality drift
✓Validate production behavior matches staging environment results
✓Establish rollback procedures and incident response protocols

Activities

›Production deployment via client's CI/CD pipeline
›Monitoring dashboard setup (latency, cost, error rate, usage, quality metrics)
›Alerting configuration (PagerDuty/Slack/email thresholds)
›Feature flag or gradual rollout configuration
›Rollback procedure verification
›Production smoke tests

Deliverables

■Production-deployed feature with monitoring
■Monitoring Dashboards (4+ dashboards: performance, cost, quality, usage)
■Alerting Configuration Document
■Rollback Procedure Playbook
■Incident Response Playbook for AI-specific failures

Tools

Datadog/Grafana/CloudWatch, PagerDuty/OpsGenie, LaunchDarkly/custom feature flags, client's CI/CD

Risks & mitigation

Production environment differs from staging → Deploy to staging-prod first, run full test suite before user traffic
Unexpected cost spike at scale → Per-request cost tracking with automated alerts at 80% of projected budget
Silent quality degradation → Automated quality sampling (5% of requests) with drift detection alerts

Stage 7

Days 29-30

Handoff

7% of engagement

Objectives

✓Transfer complete ownership and operational knowledge to the client's team
✓Ensure the client's engineers can maintain, modify, and extend the system independently
✓Document everything — architecture decisions, operational procedures, and troubleshooting guides
✓Establish the 30-day async support window for post-handoff questions

Activities

›90-minute recorded knowledge transfer session with engineering team
›Complete documentation review and walkthrough
›Operational playbook review (monitoring, alerting, incident response)
›Q&A session with engineering team
›30-day async support kickoff (Slack channel or email)

Deliverables

■Recorded Handoff Session (90 minutes, searchable, timestamped)
■Complete Technical Documentation (architecture, code, prompts, evaluation)
■Operational Runbook (monitoring, alerting, incident response, cost management)
■Maintenance Guide (how to update prompts, retrain evaluations, scale infrastructure)
■30-Day Async Support Agreement

Tools

Loom for recording, Notion/Confluence for docs, Slack for async support, GitHub for code documentation

Risks & mitigation

Knowledge gaps in client team → Recorded session enables async re-learning, documentation covers all operational scenarios
Issues discovered after handoff → 30-day async support included, critical issues addressed within 24 hours
Team turnover post-handoff → All documentation is self-contained and doesn't require tribal knowledge

Our guarantee.

Every engagement follows this exact framework. No shortcuts. No skipped stages. If we can't meet the 30-day timeline for your project, we'll tell you before we start — not after.

Fixed scope. Fixed price. Fixed timeline. The risk is on us, not on you.

Ready to see how this framework applies to your project?

Book a 30-minute technical assessment. We'll walk through your architecture, identify where AI fits, and show you exactly how the 30-day framework maps to your specific requirements.

Book a Technical Assessment

No commitment. You'll talk directly to the engineer who'll run the engagement.