Methodology
30 days. Seven stages. Zero guesswork.
Every AI consulting engagement follows our battle-tested SHIP-7 framework. You know exactly what happens on each day, what you'll receive, and what risks we're mitigating — before we write a single line of code.
Why a repeatable framework matters.
Most AI projects fail not because the technology doesn't work, but because the process is ad hoc. Teams skip discovery, jump straight into implementation, and spend months fixing problems that a proper architecture phase would have prevented.
Our 7-stage framework eliminates this. Every stage has clear objectives, concrete deliverables, and built-in risk mitigation. It's the same process whether we're building an AI copilot, hardening an existing system, or automating document processing.
The 30-day timeline
Discovery
10% of engagement
Objectives
- Understand the business problem, not just the technical request
- Map the current workflow and identify where AI creates the highest impact
- Define measurable success criteria and acceptance benchmarks
- Align all stakeholders on scope, timeline, and expected outcomes
Activities
- Stakeholder interviews (CTO, product, engineering, end users)
- Current workflow analysis and pain point mapping
- Data availability and quality assessment
- Success metrics definition with baseline measurement
- Competitive and prior art analysis
Deliverables
- Discovery Report with business context and technical requirements
- Success Criteria Document with measurable KPIs
- Data Readiness Assessment
- Stakeholder Alignment Summary (signed off)
Tools
Loom, Notion, Linear, custom intake questionnaire
Risks & mitigation
- Unclear business objectives → Structured intake questionnaire completed before Day 1
- Missing stakeholder buy-in → Require CTO sign-off on discovery report before proceeding
- Insufficient data → Data readiness gate: if data quality is below threshold, we pause and advise
Technical Audit
10% of engagement
Objectives
- Evaluate the existing codebase, infrastructure, and integration points
- Identify technical constraints, security requirements, and compliance needs
- Benchmark current system performance as a baseline for improvement
- Surface hidden technical debt that could block AI integration
Activities
- Codebase review (architecture, patterns, tech debt)
- Infrastructure audit (cloud, CI/CD, monitoring, security)
- API and data pipeline assessment
- Performance benchmarking (latency, throughput, error rates)
- Security and compliance review (SOC 2, GDPR, HIPAA as applicable)
Deliverables
- Technical Audit Report (architecture, gaps, recommendations)
- Infrastructure Readiness Scorecard
- Performance Baseline Document
- Security and Compliance Checklist
Tools
GitHub, SonarQube, Datadog/Grafana, AWS Well-Architected Tool, custom audit scripts
Risks & mitigation
- Codebase too large to audit in 3 days → Focus on integration-relevant modules only, flag the rest for a future audit
- No existing monitoring → Deploy lightweight observability in implementation phase
- Compliance blockers discovered → Escalate immediately with mitigation plan, adjust scope if needed
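As one illustration of the performance-benchmarking step, a minimal latency-percentile harness might look like the sketch below. The measured callable and run count are placeholders; a real audit would replay production-like traffic against the actual system.

```python
import statistics
import time


def benchmark(fn, *, runs=100):
    """Time repeated calls to `fn` and report latency percentiles in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    # quantiles(n=100) returns the 1st..99th percentile cut points
    q = statistics.quantiles(samples, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}
```

In practice the callable would wrap a real request, e.g. `benchmark(lambda: client.get("/api/summarize"))` with a hypothetical HTTP client.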
Architecture
10% of engagement
Objectives
- Design the AI integration architecture with production constraints in mind
- Select models, frameworks, and infrastructure based on requirements — not hype
- Define the data pipeline, prompt strategy, and evaluation approach
- Get architectural sign-off before writing any implementation code
Activities
- Architecture design (system diagrams, data flow, integration points)
- Model selection and evaluation (Claude, GPT, open-source, cost/performance trade-offs)
- Prompt engineering strategy and template design
- Fallback chain and error handling design
- Cost modeling at 1x, 5x, and 10x scale
- Architecture Decision Record (ADR) documentation
Deliverables
- Architecture Decision Record (ADR) with rationale for every choice
- System Architecture Diagram (C4 model)
- Model Selection Report with benchmarks
- Cost Projection Model (1x, 5x, 10x)
- Prompt Strategy Document
Tools
Excalidraw, LangSmith/Braintrust for model eval, custom cost calculator, Notion ADR templates
Risks & mitigation
- Wrong model selection → Run structured evaluation with 50+ test cases before committing
- Over-engineering → Apply YAGNI principle: design for current requirements, document future extensibility
- Cost estimates miss reality → Include 40% buffer in cost projections, validate with spike during implementation
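At its core, the cost model is a per-request token calculation scaled to monthly traffic. A minimal sketch, with all prices and volumes as illustrative placeholders, that bakes in the 40% buffer described above:

```python
def project_costs(requests_per_day, tokens_in, tokens_out,
                  price_in_per_1k, price_out_per_1k, buffer=0.40):
    """Project monthly spend at 1x, 5x, and 10x traffic with a risk buffer.

    Prices are per 1,000 tokens; all inputs are placeholders to be replaced
    with measured traffic and the selected provider's actual pricing.
    """
    per_request = (tokens_in / 1000) * price_in_per_1k \
                + (tokens_out / 1000) * price_out_per_1k
    monthly = per_request * requests_per_day * 30
    return {f"{m}x": round(monthly * m * (1 + buffer), 2) for m in (1, 5, 10)}
```

For example, `project_costs(1000, 1000, 1000, 0.001, 0.002)` projects a buffered monthly spend at each scale tier from 1,000 daily requests averaging 1,000 tokens in and out.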
Implementation
43% of engagement
Objectives
- Build the AI feature in the client's codebase, not in isolation
- Follow the client's PR process, coding standards, and deployment pipeline
- Implement with production hardening from day one — not as an afterthought
- Maintain daily progress visibility through async updates
Activities
- Core AI feature development (in client's repository)
- Prompt engineering, iteration, and optimization
- Integration with existing APIs, databases, and authentication
- Error handling, fallback chains, and circuit breakers
- Streaming response implementation (where applicable)
- Daily async progress updates (Slack/Loom)
- Mid-project check-in call (Day 16)
Deliverables
- Production code in client repository (via PRs)
- Prompt library with version control
- Integration layer with error handling
- Mid-project status report
Tools
Client's stack (Java/Python/Node.js/Go), OpenAI/Anthropic SDKs, pgvector, Redis, client's CI/CD pipeline
Risks & mitigation
- Scope creep during implementation → Strict adherence to the signed architecture document; change requests go through a formal process
- API rate limits or model degradation → Build multi-provider fallback chain from day one
- Integration conflicts with existing code → Daily PRs with small, reviewable changes instead of large merges
- Client team unavailable for reviews → Define review SLA in kickoff, escalate blockers within 24 hours
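A skeletal version of the multi-provider fallback chain with a per-provider circuit breaker is sketched below. The provider call functions are stand-ins for real SDK wrappers (OpenAI, Anthropic, etc.), and the thresholds are illustrative.

```python
import time


class CircuitBreaker:
    """Open the circuit after `max_failures` consecutive errors; retry after `cooldown` seconds."""

    def __init__(self, max_failures=3, cooldown=30.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def available(self):
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            # half-open: allow a trial call after the cooldown
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, ok):
        if ok:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()


def complete(prompt, providers):
    """Try each (name, call_fn, breaker) in order, skipping providers whose circuit is open."""
    for name, call, breaker in providers:
        if not breaker.available():
            continue
        try:
            result = call(prompt)
            breaker.record(ok=True)
            return name, result
        except Exception:
            breaker.record(ok=False)
    raise RuntimeError("all providers failed or unavailable")
```

The design choice here is to fail over within a single request rather than returning an error, so a provider outage degrades latency, not availability.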
Testing
13% of engagement
Objectives
- Validate AI quality with a structured evaluation suite, not manual spot-checking
- Load test under realistic conditions to verify performance at scale
- Run adversarial testing to find edge cases before users do
- Verify all acceptance criteria from the discovery phase are met
Activities
- Evaluation suite creation (50-100+ test cases across categories)
- Automated regression test pipeline
- Load testing and latency profiling under production-like traffic
- Adversarial and edge case testing (prompt injection, unexpected inputs)
- Acceptance criteria validation against discovery document
- User acceptance testing with client stakeholders
Deliverables
- Evaluation Suite (50-100+ test cases with expected outputs)
- Test Results Report with pass/fail rates per category
- Load Test Report (throughput, latency at p50/p95/p99)
- Adversarial Test Results with mitigations applied
- Acceptance Criteria Sign-Off Document
Tools
Braintrust/LangSmith for eval, k6/Locust for load testing, custom adversarial test harness, pytest/Jest
Risks & mitigation
- Evaluation shows quality below threshold → Built-in buffer days for prompt iteration and fixes
- Performance degrades under load → Implement caching, streaming, and request queuing before deployment
- Edge cases discovered late → Adversarial testing runs in parallel with functional testing from Day 23
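The skeleton of a per-category evaluation suite can be quite small. In this sketch the grader is a simple callable; a real suite would swap in regex checks or an LLM-as-judge call, and the case schema is an assumption for illustration:

```python
from collections import defaultdict


def run_eval(cases, model_fn, grader):
    """Run test cases through the model and report pass rates per category.

    Each case is a dict: {"category": ..., "input": ..., "expected": ...}.
    `grader(output, expected)` returns True/False.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for case in cases:
        total[case["category"]] += 1
        if grader(model_fn(case["input"]), case["expected"]):
            passed[case["category"]] += 1
    return {cat: passed[cat] / total[cat] for cat in total}
```

Reporting pass rates per category (rather than one global score) is what makes regressions visible: a prompt change that lifts the overall rate can still tank a single category.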
Deployment
7% of engagement
Objectives
- Deploy to production with full monitoring and observability from minute one
- Configure alerting rules for cost, latency, error rate, and quality drift
- Validate production behavior matches staging environment results
- Establish rollback procedures and incident response protocols
Activities
- Production deployment via client's CI/CD pipeline
- Monitoring dashboard setup (latency, cost, error rate, usage, quality metrics)
- Alerting configuration (PagerDuty/Slack/email thresholds)
- Feature flag or gradual rollout configuration
- Rollback procedure verification
- Production smoke tests
Deliverables
- Production-deployed feature with monitoring
- Monitoring Dashboards (4+ dashboards: performance, cost, quality, usage)
- Alerting Configuration Document
- Rollback Procedure Playbook
- Incident Response Playbook for AI-specific failures
Tools
Datadog/Grafana/CloudWatch, PagerDuty/OpsGenie, LaunchDarkly/custom feature flags, client's CI/CD
Risks & mitigation
- Production environment differs from staging → Deploy to a production-parity staging environment first and run the full test suite before serving user traffic
- Unexpected cost spike at scale → Per-request cost tracking with automated alerts at 80% of projected budget
- Silent quality degradation → Automated quality sampling (5% of requests) with drift detection alerts
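The silent-degradation mitigation can be sketched as a rolling-window monitor over graded requests. Window size, tolerance, and the grading itself are assumptions; in production the scores would come from an automated grader run on roughly 5% of sampled traffic:

```python
from collections import deque


class DriftMonitor:
    """Alert when the rolling mean quality of sampled requests drops below baseline."""

    def __init__(self, baseline, window=200, tolerance=0.05):
        self.baseline = baseline      # quality score established during testing
        self.tolerance = tolerance    # allowed drop before alerting
        self.scores = deque(maxlen=window)

    def record(self, score):
        """Record one graded quality score in [0, 1]."""
        self.scores.append(score)

    def drifting(self):
        """True once the window is full and its mean falls below baseline - tolerance."""
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough samples yet
        mean = sum(self.scores) / len(self.scores)
        return mean < self.baseline - self.tolerance
```

Waiting for a full window before alerting trades detection latency for fewer false alarms from a handful of bad samples.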
Handoff
7% of engagement
Objectives
- Transfer complete ownership and operational knowledge to the client's team
- Ensure the client's engineers can maintain, modify, and extend the system independently
- Document everything — architecture decisions, operational procedures, and troubleshooting guides
- Establish the 30-day async support window for post-handoff questions
Activities
- 90-minute recorded knowledge transfer session with engineering team
- Complete documentation review and walkthrough
- Operational playbook review (monitoring, alerting, incident response)
- Q&A session with engineering team
- 30-day async support kickoff (Slack channel or email)
Deliverables
- Recorded Handoff Session (90 minutes, searchable, timestamped)
- Complete Technical Documentation (architecture, code, prompts, evaluation)
- Operational Runbook (monitoring, alerting, incident response, cost management)
- Maintenance Guide (how to update prompts, retrain evaluations, scale infrastructure)
- 30-Day Async Support Agreement
Tools
Loom for recording, Notion/Confluence for docs, Slack for async support, GitHub for code documentation
Risks & mitigation
- Knowledge gaps in client team → The recorded session supports asynchronous review, and the documentation covers all operational scenarios
- Issues discovered after handoff → 30-day async support included, critical issues addressed within 24 hours
- Team turnover post-handoff → All documentation is self-contained and doesn't require tribal knowledge
Our guarantee.
Every engagement follows this exact framework. No shortcuts. No skipped stages. If we can't meet the 30-day timeline for your project, we'll tell you before we start — not after.
Fixed scope. Fixed price. Fixed timeline. The risk is on us, not on you.
Ready to see how this framework applies to your project?
Book a 30-minute technical assessment. We'll walk through your architecture, identify where AI fits, and show you exactly how the 30-day framework maps to your specific requirements.
No commitment. You'll talk directly to the engineer who'll run the engagement.