
Implementation Guide: Synthesize stakeholder interview data into themes and findings for deliverables

Step-by-step implementation guide for deploying AI that synthesizes stakeholder interview data into themes and findings for client deliverables, aimed at Professional Services firms.

Hardware Procurement

Jabra Speak2 75 Wireless Speakerphone

Vendor: Jabra (GN Audio) | SKU: 2775-419 (USB-A variant) / 2775-429 (USB-C variant) | Qty: 5

$175 per unit MSP cost / $250 suggested resale

Primary interview recording device for conference rooms and in-person stakeholder interviews. Features 4 noise-cancelling beamforming microphones and super-wideband audio optimized for clear voice capture. Connects via USB or Bluetooth to laptops running transcription software. Critical for ensuring high-quality audio input to AI transcription pipeline.

Jabra Speak 510 Portable Speakerphone

Vendor: Jabra (GN Audio) | SKU: 7510-209 (USB-A) | Qty: 2

$95 per unit MSP cost / $140 suggested resale

Portable backup and field interview recording device. Compact form factor for consultants conducting on-site stakeholder interviews at client facilities. USB/Bluetooth connectivity for laptop-based recording.

RØDE NT-USB Mini USB Condenser Microphone

Vendor: RØDE Microphones | SKU: NTUSB-MINI | Qty: 2

$80 per unit MSP cost / $115 suggested resale

High-quality desk-based microphone for dedicated interview recording setups in the office. Provides broadcast-quality audio for solo interviewer recording when a speakerphone pickup pattern is not needed. Ideal for phone/VoIP interview recording scenarios.

Software Procurement

Otter.ai Business

Vendor: Otter.ai | Plan: Business | Qty: 10 seats

$20/user/month billed annually ($240/user/year). 10 seats = $200/month or $2,400/year

Primary transcription and meeting intelligence platform. Provides automatic transcription of Zoom and Teams meetings with speaker identification, search, and export. 6,000 minutes/month per user. Admin features including usage analytics. Feeds raw transcripts to downstream analysis pipeline.

Dovetail Professional

Vendor: Dovetail | Plan: Professional | Qty: 5 researcher seats

$15/user/month — $75/month or $900/year

Cloud-native qualitative data analysis platform. Provides AI-driven transcription tagging, automated summarization, collaborative theme coding, and insight repositories. Researchers upload transcripts, tag themes, and generate structured findings. Serves as the central analysis workspace between raw transcripts and final deliverables.

OpenAI API (GPT-5.4)

Vendor: OpenAI | Model: GPT-5.4

$2.50 per 1M input tokens, $10 per 1M output tokens. Estimated $25/month for typical consulting firm workload (~2M tokens/month)

Powers the custom thematic synthesis pipeline. Processes cleaned transcripts through structured prompts to extract themes, generate cross-interview findings, identify contradictions, and draft deliverable sections. Used via API for programmatic, repeatable analysis at scale.

OpenAI Whisper API

Vendor: OpenAI | Model: Whisper

$0.006/minute of audio. 40 hours of interviews = $14.40 per project

Backup and high-accuracy transcription engine for audio files not captured through Otter.ai (e.g., in-person recordings, phone interviews). Provides raw transcription output that feeds into the synthesis pipeline. Supports 98+ languages.
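The API costs quoted above can be sanity-checked with a small helper. The rates are hardcoded from the pricing listed in this section ($0.006/min for Whisper, $2.50 per 1M input and $10 per 1M output tokens for GPT-5.4); the function name and token mix are illustrative:

```python
def project_ai_cost(audio_minutes: float, input_tokens: int, output_tokens: int) -> float:
    """Rough per-project API spend in USD: Whisper transcription at
    $0.006/minute plus GPT token costs at the published per-1M rates."""
    whisper = audio_minutes * 0.006
    gpt = (input_tokens / 1_000_000) * 2.50 + (output_tokens / 1_000_000) * 10.00
    return round(whisper + gpt, 2)

# 40 hours of audio plus an assumed ~2M input / 0.4M output synthesis tokens:
# project_ai_cost(40 * 60, 2_000_000, 400_000) -> 23.4
```

This matches the $14.40 transcription figure above for a 40-hour project, with synthesis tokens adding single-digit dollars on top.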

Microsoft 365 Business Standard

Vendor: Microsoft | Licensing: SaaS per-seat annual subscription (CSP) | Qty: 10 seats

$12.50/user/month via CSP. 10 seats = $125/month or $1,500/year

Foundation platform providing Microsoft Teams (for remote interview recording and built-in transcription), SharePoint (document storage and transcript repository), Word and PowerPoint (deliverable creation), and OneDrive (individual file storage). Assumed pre-existing at most professional services firms.

Microsoft 365 Copilot

Vendor: Microsoft | Licensing: SaaS per-seat monthly add-on (CSP) | Qty: 5 seats (senior consultants/partners)

$30/user/month add-on = $150/month or $1,800/year

AI assistant embedded in Word and PowerPoint for drafting final client deliverables from synthesized themes and findings. Copilot ingests structured synthesis outputs and generates draft report sections, executive summaries, and presentation slides. Significant time savings on deliverable production.

Notion Business

Vendor: Notion Labs | Licensing: SaaS per-seat annual subscription | Qty: 5 seats

$20/user/month for Business plan with AI. 5 seats = $100/month or $1,200/year. Optional — can substitute Confluence or SharePoint.

Optional research knowledge management wiki. Stores interview repositories organized by engagement, maintains prompt templates and analysis playbooks, and serves as a collaborative workspace for building findings before final deliverable creation. Includes Notion AI for additional summarization.

Zapier Professional

Vendor: Zapier | Plan: Professional | Qty: 1

$49/month (2,000 tasks/month)

Integration orchestration layer connecting Otter.ai, Dovetail, SharePoint, and the custom synthesis pipeline. Automates transcript routing, triggers synthesis workflows, and pushes notifications to Slack/Teams when analysis is complete.

Prerequisites

  • Microsoft 365 Business Standard (or higher) tenant fully provisioned with Teams, SharePoint, and OneDrive configured for all users
  • Active Microsoft CSP relationship with the MSP for license management and Copilot provisioning
  • Zoom Business ($199.90/user/year) OR Microsoft Teams (included in M365) deployed and configured for meeting recording — confirm which platform the client uses for remote interviews
  • Minimum 25 Mbps internet upload bandwidth at primary office location; 10 Mbps minimum at any remote interview sites
  • Outbound HTTPS (port 443) allowed through firewall to: otter.ai, dovetail.com, api.openai.com, login.microsoftonline.com, zapier.com, notion.so
  • WPA3 or WPA2-Enterprise Wi-Fi at interview recording locations for reliable connectivity
  • Modern web browsers (Chrome 120+, Edge 120+, Firefox 120+) on all workstations
  • Standard business laptops with minimum 8GB RAM, USB-A or USB-C ports for recording hardware, and functional audio drivers
  • Client has signed engagement letter or MSA that permits use of AI tools for data processing — critical for compliance
  • OpenAI Platform account created with billing configured and API key generated (MSP should create under client's organization or MSP's managed tenant)
  • Python 3.10+ runtime available on at least one workstation or Azure VM for running the custom synthesis pipeline scripts
  • Client has existing templates (Word/PowerPoint) for their standard deliverable formats — needed for Copilot template configuration
  • Identified project lead / power user within the client organization who will own the interview analysis workflow
  • Written interview consent form template approved by client's legal counsel that discloses AI processing of interview data

Installation Steps

Step 1: Provision and Configure Otter.ai Business Workspace

Create the Otter.ai Business workspace for the client organization. This is the primary transcription engine that will automatically capture and transcribe all stakeholder interviews conducted via Zoom or Teams. Configure SSO if the client uses Azure AD, set up team structure, and enable key integrations.

1. Navigate to https://otter.ai/signup and create a Business workspace.
2. Under Admin > Settings > Authentication, configure SSO: Identity Provider: Azure AD | SAML SSO URL: https://login.microsoftonline.com/{tenant-id}/saml2 | Certificate: upload from the Azure AD Enterprise Application.
3. Under Admin > Settings > Integrations: enable the Zoom integration (OAuth connection to the client's Zoom account), the Microsoft Teams integration, and Google Calendar sync (if applicable).
4. Under Admin > Settings > Security: enable the 2FA requirement for all users, set data retention to 90 days (configurable per compliance needs), and review and accept the DPA at https://otter.ai/dpa.
5. Invite all 10 users via Admin > Team > Invite Members.
Note

If the client uses Microsoft Teams exclusively (no Zoom), consider whether Otter.ai is needed or if Teams built-in transcription + Whisper API for higher accuracy is sufficient. Otter.ai adds value with its search, highlight, and summary features beyond raw transcription. Ensure the DPA is signed before any interview data flows through the platform. For HIPAA-covered clients, confirm Otter.ai Business tier includes BAA — if not, use Enterprise tier or switch to Azure OpenAI Whisper.

Step 2: Deploy Recording Hardware

Unbox, configure, and deploy Jabra Speak2 75 speakerphones and RØDE NT-USB Mini microphones. Test audio quality with each device connected to the client's laptops. Ensure firmware is updated and devices are recognized by Otter.ai and Zoom/Teams.

Jabra Speak2 75 setup:

1. Connect the Jabra Speak2 75 via USB-A/USB-C to the laptop.
2. Download Jabra Direct from https://www.jabra.com/software-and-services/jabra-direct
3. Install Jabra Direct and run the firmware update.
4. In Zoom/Teams, set the Jabra Speak2 75 as the default microphone and speaker.
5. Test recording: start a test Otter.ai recording, speak from a 6 ft distance, and verify transcript accuracy.

Run the firmware update for the Jabra Speak2 75 via the Jabra Direct CLI:

```bash
jabra-direct --check-firmware --update
```

RØDE NT-USB Mini setup:

1. Connect the RØDE NT-USB Mini via USB-C to the laptop.
2. Download RØDE Central from https://www.rode.com/software/rode-central
3. Update the firmware via RØDE Central.
4. Set it as the input device in the OS audio settings.
5. In the Otter.ai desktop app, select the RØDE NT-USB Mini as the microphone source.
Note

Label each device with an asset tag for the client's inventory. Position Jabra Speak2 75 units centrally on conference tables — optimal pickup range is 6-8 feet. The RØDE NT-USB Mini should be positioned 6-12 inches from the interviewer for best results. Test each device in the actual rooms where interviews will be conducted, as room acoustics significantly impact transcription accuracy.

Step 3: Configure Dovetail Professional Workspace

Set up the Dovetail qualitative analysis workspace where researchers will organize, tag, and synthesize interview transcripts. Create project templates, configure tag taxonomies, and set up team permissions. Dovetail serves as the central analysis hub between raw transcripts and final deliverables.

1. Navigate to https://dovetail.com and create a new workspace.
2. Workspace Settings > Team: invite 5 researcher seats.
3. Configure SSO via SAML (Azure AD integration): go to Settings > Security > SAML SSO. Entity ID: https://dovetail.com/saml/{workspace-id}; ACS URL: provided by Dovetail; configure in Azure AD as an Enterprise Application.
4. Create a project template:
   - Project name: [Client Name] - [Engagement Name]
   - Tag taxonomy: Themes (auto-generated by AI, refined by researcher); Sentiment (Positive / Neutral / Negative / Mixed); Stakeholder Type (Executive / Manager / Staff / External); Priority (High / Medium / Low); Finding Type (Pain Point / Opportunity / Strength / Risk)
5. Create a 'Master Tag Set' under workspace settings for reuse across projects.
6. Enable AI features: Settings > AI > enable Summaries and Auto-tagging.
7. Configure data export: enable CSV and PDF export for all projects.
Note

Dovetail's AI auto-tagging works best when you seed the tag taxonomy with 10-15 initial tags relevant to the client's typical engagement themes (e.g., 'Process Inefficiency', 'Technology Gap', 'Change Management', 'Stakeholder Alignment'). These will be refined per project. If Dovetail is not available or the client prefers an alternative, Looppanel ($30/mo) or Insight7 (custom pricing) are viable substitutes.
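The seed taxonomy described above can be kept in version control and rendered to a simple CSV for import or manual entry. The group and tag names below mirror this step; the two-column CSV layout is illustrative, not a Dovetail API schema:

```python
import csv
import io

# Seed tag taxonomy from Step 3 (illustrative; refine per engagement).
SEED_TAGS = {
    "Themes": ["Process Inefficiency", "Technology Gap",
               "Change Management", "Stakeholder Alignment"],
    "Sentiment": ["Positive", "Neutral", "Negative", "Mixed"],
    "Stakeholder Type": ["Executive", "Manager", "Staff", "External"],
    "Priority": ["High", "Medium", "Low"],
    "Finding Type": ["Pain Point", "Opportunity", "Strength", "Risk"],
}

def taxonomy_csv(tags: dict) -> str:
    """Render the tag taxonomy as a two-column CSV (group, tag)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["group", "tag"])
    for group, names in tags.items():
        for name in names:
            writer.writerow([group, name])
    return buf.getvalue()
```

Keeping the taxonomy in a file like this makes it easy to reuse the same 'Master Tag Set' across projects and to diff changes over time.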

Step 4: Set Up OpenAI API Access and Custom Synthesis Pipeline

Configure the OpenAI API account, set up the custom Python-based synthesis pipeline that processes transcripts through GPT-5.4 for deep thematic extraction, and deploy the pipeline on an Azure VM or local workstation. This is the core intelligence engine that goes beyond Dovetail's built-in AI to provide consulting-grade thematic analysis.

1. Create an OpenAI API organization (or use an existing one): navigate to https://platform.openai.com/settings/organization
2. Create an API key: Settings > API Keys > Create new secret key. Name: interview-synthesis-prod. Store it securely in Azure Key Vault or the client's password manager.
3. Set usage limits: Settings > Limits > set the monthly budget cap to $100 (adjustable).
4. Install the Python environment on the synthesis workstation or Azure VM.
5. Clone the synthesis pipeline repository (MSP-maintained): create the pipeline files as specified in the custom_ai_components section.
6. Configure environment variables.
7. Test API connectivity.
Install a Python 3.11 virtual environment and the required packages:

```bash
sudo apt update && sudo apt install python3.11 python3.11-venv -y
python3.11 -m venv ~/interview-synthesis-env
source ~/interview-synthesis-env/bin/activate
pip install openai==1.40.0 tiktoken==0.7.0 python-docx==1.1.0 pandas==2.2.0 pyyaml==6.0.1 python-pptx==0.6.23
```

Create the synthesis pipeline directory:

```bash
mkdir -p ~/interview-synthesis
cd ~/interview-synthesis
```

Write environment variables to a .env file:

```bash
cat > .env << 'EOF'
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL=gpt-5.4
OPENAI_ORG_ID=org-xxxxxxxxxxxxxxxxxxxx
MAX_TOKENS_PER_REQUEST=16000
TEMPERATURE=0.3
OUTPUT_DIR=./outputs
TRANSCRIPT_DIR=./transcripts
EOF
```

Test OpenAI API connectivity:

```bash
python3 -c "from openai import OpenAI; client = OpenAI(); print(client.models.list().data[0].id)"
```
Note

For enterprise clients requiring data residency guarantees, use Azure OpenAI Service instead of OpenAI directly. The Azure OpenAI endpoint would replace the standard API URL and data stays within the client's Azure tenant. API keys should NEVER be committed to version control. Use .env files with .gitignore or Azure Key Vault for production deployments. Set a conservative monthly budget cap initially — typical usage for a 10-person firm is $15–$30/month.
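Note that the pip install list in this step does not include python-dotenv, so something has to load the .env file before the pipeline reads `os.getenv`. A minimal stdlib loader is enough; this sketch handles KEY=VALUE lines and '#' comments but ignores quoting and `export` syntax:

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments skipped.
    Existing environment variables are not overwritten."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Alternatively, install python-dotenv and call `load_dotenv()`; the hand-rolled version just avoids one more dependency.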

Step 5: Deploy Custom Synthesis Pipeline Scripts

Create and deploy the core Python scripts that implement the multi-stage interview synthesis pipeline. These scripts process transcripts through a series of GPT-5.4 prompts to extract individual interview summaries, cross-interview themes, supporting evidence, contradictions, and structured findings for deliverables. See the custom_ai_components section for full source code.

Set up directory structure, scaffold pipeline files, and run a test execution

1. Change to the pipeline directory: `cd ~/interview-synthesis`
2. Create the directory structure: `mkdir -p transcripts outputs templates config`
3. Create the main pipeline files (content provided in custom_ai_components):
   - config/prompts.yaml (prompt templates)
   - synthesis_pipeline.py (main orchestration script)
   - transcript_processor.py (individual transcript analysis)
   - theme_synthesizer.py (cross-interview theme extraction)
   - findings_generator.py (structured findings output)
   - deliverable_drafter.py (Word/PPT draft generation)
   - utils.py (helper functions)
4. Make the pipeline executable: `chmod +x synthesis_pipeline.py`
5. Run a test with a sample transcript: `python3 synthesis_pipeline.py --input transcripts/sample_interview.txt --project 'Test Project' --output outputs/test_run/`
Note

The pipeline is designed to be run by a consultant or analyst after interviews are completed and transcripts exported from Otter.ai. It can also be triggered automatically via Zapier when new transcripts appear in a designated SharePoint folder. Processing time for a typical 20-interview project is 5-15 minutes. Always review AI-generated themes and findings before including in client deliverables — human validation is essential for consulting quality.
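The test command in this step implies a small argument surface for `synthesis_pipeline.py`. A sketch of what that entry point might look like, with flag names taken from the command above and the wiring to the pipeline modules omitted:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI surface matching the Step 5 test command (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="synthesis_pipeline.py",
        description="Run the interview synthesis pipeline on a transcript or folder.")
    parser.add_argument("--input", required=True,
                        help="Transcript file or directory of transcripts")
    parser.add_argument("--project", required=True,
                        help="Project name used to label outputs")
    parser.add_argument("--output", default="outputs/",
                        help="Directory for synthesis outputs")
    return parser
```

Keeping the CLI thin like this also makes it easy to wrap the same logic in a web endpoint later for the Zapier-triggered workflow.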

Step 6: Configure Microsoft 365 Copilot for Deliverable Drafting

Provision Microsoft 365 Copilot licenses for the 5 senior consultants/partners who will produce client deliverables. Configure Copilot to access the SharePoint document library where synthesized findings are stored, and create custom Copilot prompts for generating report sections and presentation slides from the synthesis outputs.

1. Provision Copilot licenses via the Microsoft 365 Admin Center: navigate to https://admin.microsoft.com > Billing > Purchase Services and add the 'Microsoft 365 Copilot' add-on for 5 users. Assign licenses to designated users under Users > Active Users > [User] > Licenses.
2. Configure a SharePoint document library for synthesis outputs: navigate to SharePoint Admin Center > Sites > Create Site. Site name: 'Interview Intelligence Hub'. Create document libraries: /Transcripts (raw transcripts from Otter.ai), /Synthesis Outputs (JSON/Markdown from the synthesis pipeline), /Deliverable Drafts (Word/PPT outputs), /Templates (standard deliverable templates).
3. Configure Copilot access to the SharePoint site: in Microsoft 365 Admin Center > Settings > Copilot, ensure the 'Interview Intelligence Hub' site is indexed by Microsoft Search. Verify Copilot can access the site with a test query in Word.
4. Upload the client's standard deliverable templates to /Templates: Assessment Report Template.docx, Strategy Recommendations Template.pptx, Interview Summary Template.docx.
5. Create a Copilot prompt library in SharePoint: upload prompt reference documents that Copilot can use as context.
Note

Copilot works best when synthesis outputs are stored as structured Markdown or Word documents in SharePoint, not as raw JSON. The deliverable_drafter.py component of the custom pipeline generates Word documents specifically formatted for Copilot consumption. Copilot requires Microsoft 365 E3/E5 or Business Standard/Premium as a prerequisite — verify the client's base license tier before purchasing the Copilot add-on.

Step 7: Set Up Integration Automation with Zapier

Configure Zapier workflows to automate the flow of data between Otter.ai, SharePoint, the synthesis pipeline, Dovetail, and notification channels. This reduces manual steps and ensures transcripts flow automatically from recording to analysis.

1. Navigate to https://zapier.com and create a Professional account.
2. Create the following Zaps:

ZAP 1: Otter.ai → SharePoint (New Transcript → Upload)

  • Trigger: Otter.ai - New Transcript
  • Action: SharePoint - Upload File to /Transcripts library
  • Filter: Only trigger for recordings tagged 'stakeholder-interview'

ZAP 2: SharePoint → Webhook (New File → Trigger Synthesis)

  • Trigger: SharePoint - New File in /Transcripts
  • Action: Webhooks - POST to synthesis pipeline endpoint
  • URL: https://{azure-vm-ip}:8443/api/synthesize
ZAP 2 — Webhook POST body to the synthesis pipeline:

```json
{"file_path": "{{file_url}}", "project": "{{folder_name}}"}
```

ZAP 3: Synthesis Complete → Teams Notification

  • Trigger: Webhooks - Catch Hook (synthesis pipeline calls back)
  • Action: Microsoft Teams - Send Channel Message
  • Channel: #interview-insights
  • Message: 'Synthesis complete for {{project_name}}. {{theme_count}} themes identified. View results: {{output_url}}'

ZAP 4: Synthesis Output → Dovetail (Upload for Collaborative Review)

  • Trigger: SharePoint - New File in /Synthesis Outputs
  • Action: Dovetail - Create Note (via API)
  • Map synthesis JSON fields to Dovetail note fields
Note

Zapier Professional plan ($49/month) provides 2,000 tasks/month which is sufficient for most consulting firms processing 5-15 interviews per month. For higher volume or if the client prefers Microsoft-native automation, replace Zapier with Power Automate (included in M365) — the flows are conceptually identical but use Power Automate connectors instead. The webhook-based synthesis trigger requires the synthesis pipeline to be running as a web service (Flask/FastAPI) on the Azure VM rather than as a CLI script.
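As the note says, ZAP 2's webhook needs a small web service in front of the pipeline. The sketch below is a stdlib stand-in that shows the expected contract only; a production deployment would use Flask or FastAPI behind TLS as described above. The endpoint path matches the Zap configuration, and the queued-response shape is an assumption:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class SynthesisWebhook(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != '/api/synthesize':
            self.send_error(404)
            return
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length) or b'{}')
        # A real service would enqueue a pipeline run here; this stand-in
        # just acknowledges what would be queued.
        body = json.dumps({'queued': True,
                           'file_path': payload.get('file_path'),
                           'project': payload.get('project')}).encode()
        self.send_response(202)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo server quiet

def run_server(port: int = 0) -> HTTPServer:
    """Start the webhook listener on a background thread (port 0 = ephemeral)."""
    server = HTTPServer(('127.0.0.1', port), SynthesisWebhook)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Returning 202 Accepted (rather than blocking until synthesis completes) keeps the Zapier task fast; the pipeline then calls back to ZAP 3's catch hook when it finishes.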

Step 8: Configure Compliance and Data Governance Controls

Implement the compliance framework required for processing stakeholder interview data through AI systems. This includes consent management, data processing agreements, retention policies, access controls, and audit logging. Professional services firms have strict obligations around client confidentiality and data handling.

1. Configure SharePoint retention policies: Microsoft 365 Compliance Center > Data lifecycle management > Retention policies. Create the policy 'Interview Data Retention':
   - Applies to: Interview Intelligence Hub SharePoint site
   - Retain for: 90 days after project completion (configurable)
   - Then: delete automatically
   - Label: 'Stakeholder Interview Data - Confidential'
2. Configure a Data Loss Prevention (DLP) policy: Microsoft 365 Compliance Center > Data loss prevention. Create the policy 'Interview Data Protection':
   - Detect: PII (SSN, financial data, health information) in transcripts
   - Action: alert the compliance officer and block external sharing
3. Configure Otter.ai data retention: Otter.ai Admin > Settings > Data Retention. Set auto-delete of recordings and transcripts after 90 days, and enable the audit log for all access and exports.
4. Configure OpenAI API data handling: verify at https://platform.openai.com/settings/organization that 'API data is not used for training' is shown. Review the OpenAI DPA at https://openai.com/policies/data-processing-addendum and sign it if not already executed.
5. Generate and store the consent form template in SharePoint /Templates (template provided in the custom_ai_components section).
6. Configure Azure AD Conditional Access (if not already in place): require MFA for all access to the Interview Intelligence Hub SharePoint site and block access from non-compliant devices.
Critical

The consent form must be signed by every interview participant BEFORE recording begins. Train all consultants on this requirement. For clients in healthcare consulting, verify HIPAA compliance — Otter.ai Business does NOT include BAA by default; you may need to upgrade to Enterprise or switch entirely to Azure OpenAI Whisper + Azure-hosted pipeline. For EU stakeholders, GDPR requires explicit consent for AI processing and the right to erasure. Implement a documented process for handling deletion requests that covers all systems (Otter, Dovetail, SharePoint, OpenAI logs).

Step 9: Create Deliverable Templates and Copilot Prompt Library

Build the Word and PowerPoint templates that the synthesis pipeline and Copilot will populate with findings. These templates should match the client's existing branding and deliverable formats. Create a library of Copilot prompts that consultants can use to generate specific deliverable sections.

1. Obtain the client's existing deliverable templates (Word/PPT) and upload them to SharePoint: /Interview Intelligence Hub/Templates/
2. Create structured sections in the Word template for synthesis output: Executive Summary (auto-populated from findings_generator.py), Methodology & Approach, Key Themes (auto-populated, with supporting quotes), Detailed Findings by Theme, Stakeholder Alignment Analysis, Recommendations (human-drafted with Copilot assistance), Appendix: Interview Participant List.
3. Create a Copilot prompt reference document and save it as 'Copilot Prompts for Interview Synthesis.docx' in /Templates. Include prompts such as: 'Using the synthesis output in [file], draft an executive summary of key themes'; 'Create a findings section for the theme [X] with supporting evidence quotes'; 'Generate a stakeholder alignment matrix based on the synthesis data'; 'Draft 3-5 strategic recommendations based on the identified pain points'.
4. Create a PowerPoint template with pre-built slide layouts: Title Slide, Methodology Slide, Theme Overview Slide (auto-populated), Individual Theme Deep-Dive Slides, Stakeholder Quote Highlight Slides, Recommendations Summary Slide.
Note

Templates are the bridge between AI-generated analysis and client-ready deliverables. Invest time in getting these right — poorly structured templates negate much of the time savings from AI synthesis. The deliverable_drafter.py script in the custom pipeline generates a pre-populated Word document, but Copilot is used for the final 'polish pass' and for generating sections that require more creative interpretation (e.g., recommendations). Always include a human review step before any deliverable goes to a client.
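Since Copilot works best on structured Markdown or Word input, the section list from this step can be emitted as a Markdown skeleton for the /Synthesis Outputs library. The section titles come from this step; the function itself is an illustrative sketch of what deliverable_drafter.py might produce:

```python
# Report sections from Step 9 (Word template structure).
SECTIONS = [
    "Executive Summary",
    "Methodology & Approach",
    "Key Themes",
    "Detailed Findings by Theme",
    "Stakeholder Alignment Analysis",
    "Recommendations",
    "Appendix: Interview Participant List",
]

def report_skeleton(project: str, sections=SECTIONS) -> str:
    """Emit a Markdown skeleton with one H2 per deliverable section."""
    lines = [f"# {project}: Interview Synthesis", ""]
    for section in sections:
        lines += [f"## {section}", "", "_To be populated._", ""]
    return "\n".join(lines)
```

Auto-populated sections (Executive Summary, Key Themes) would be filled in by the pipeline; the placeholder sections are left for the consultant and Copilot.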

Step 10: End-User Training and Workflow Validation

Conduct hands-on training sessions with the client's consulting team covering the complete interview-to-deliverable workflow. Validate the entire pipeline end-to-end with a real or realistic test case. Ensure all users understand their role in the workflow and can operate the tools independently.

  • Training Session 1: Recording & Transcription (1 hour, all 10 users)
      • Hardware setup: Jabra Speak2 75 positioning and connection
      • Otter.ai: starting recordings, tagging interviews, exporting transcripts
      • Consent workflow: when and how to obtain participant consent
      • Teams/Zoom recording settings for remote interviews
  • Training Session 2: Analysis & Synthesis (2 hours, 5 researchers)
      • Dovetail: uploading transcripts, using AI auto-tagging, manual refinement
      • Running the synthesis pipeline: CLI usage and output interpretation
      • Reviewing AI-generated themes: what to accept, modify, or reject
      • Cross-interview analysis: identifying patterns and contradictions
  • Training Session 3: Deliverable Creation (1.5 hours, 5 senior consultants)
      • Using synthesis outputs in Word/PowerPoint templates
      • Copilot prompts for draft generation
      • Quality assurance: reviewing AI-drafted content for accuracy
      • Finalizing and formatting deliverables
End-to-End Validation Exercise:

1. Record a mock stakeholder interview (15 minutes)
2. Transcribe via Otter.ai
3. Export to SharePoint
4. Run the synthesis pipeline
5. Review themes in Dovetail
6. Generate a deliverable draft
7. Polish with Copilot

Total exercise time: ~45 minutes (vs. 4+ hours manual)
Note

Schedule training sessions across 1 week, not all in one day. The most common failure mode is consultants reverting to manual processes because they don't trust the AI output — address this by showing side-by-side comparisons of AI vs. manual synthesis quality during training. Create a quick-reference card (1-page PDF) summarizing the workflow steps for each role. Record all training sessions for future onboarding of new team members.

Custom AI Components

Interview Transcript Processor

Type: prompt

A structured GPT-5.4 prompt that processes individual interview transcripts to extract per-interview summaries, key quotes, sentiment indicators, and preliminary theme tags. This is Stage 1 of the synthesis pipeline: it runs once per transcript and produces a structured JSON output that feeds into the cross-interview Theme Synthesizer.

Implementation

config/prompts.yaml — transcript_analysis section

```yaml
system_prompt: |
  You are an expert qualitative research analyst working for a professional
  services consulting firm. Your task is to analyze a stakeholder interview
  transcript and extract structured insights. You must:
  1. Identify the interviewee's role, perspective, and key concerns
  2. Extract direct quotes that are particularly insightful or representative
  3. Identify preliminary themes discussed in the interview
  4. Assess sentiment (positive, negative, neutral, mixed) for each topic discussed
  5. Note any specific pain points, opportunities, or recommendations mentioned
  6. Flag any contradictions or tensions within the interviewee's responses
  Output your analysis as valid JSON matching the schema below. Do not include
  any text outside the JSON.

user_prompt_template: |
  Analyze the following stakeholder interview transcript.

  **Project Context:** {project_name}
  **Project Description:** {project_description}
  **Interviewee:** {interviewee_name}
  **Interviewee Role:** {interviewee_role}
  **Interview Date:** {interview_date}
  **Interviewer:** {interviewer_name}

  **Transcript:**
  {transcript_text}

  Produce a JSON analysis with the following structure:
  {{
    "interviewee_summary": {{
      "name": "string",
      "role": "string",
      "overall_sentiment": "positive|negative|neutral|mixed",
      "key_perspective": "2-3 sentence summary of this person's overall perspective",
      "engagement_level": "high|medium|low"
    }},
    "themes_identified": [
      {{
        "theme_name": "string (concise theme label)",
        "description": "string (1-2 sentence description)",
        "sentiment": "positive|negative|neutral|mixed",
        "confidence": "high|medium|low",
        "supporting_quotes": [
          {{"quote": "exact quote from transcript", "context": "brief context for the quote"}}
        ]
      }}
    ],
    "pain_points": [
      {{"description": "string", "severity": "high|medium|low", "quote": "supporting quote"}}
    ],
    "opportunities": [
      {{"description": "string", "potential_impact": "high|medium|low", "quote": "supporting quote"}}
    ],
    "recommendations_from_interviewee": [
      {{"recommendation": "string", "quote": "supporting quote"}}
    ],
    "contradictions_or_tensions": [
      {{"description": "string", "quotes": ["quote 1", "quote 2"]}}
    ],
    "notable_quotes": [
      {{"quote": "exact quote", "significance": "why this quote is notable", "potential_use": "executive summary|theme evidence|callout box|appendix"}}
    ]
  }}
```
transcript_processor.py — TranscriptProcessor class:

```python
import json
import os
import yaml
from openai import OpenAI
import tiktoken

class TranscriptProcessor:
    def __init__(self, config_path='config/prompts.yaml'):
        self.client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
        self.model = os.getenv('OPENAI_MODEL', 'gpt-5.4')
        try:
            self.encoding = tiktoken.encoding_for_model(self.model)
        except KeyError:
            # Model not in tiktoken's registry; fall back to a current
            # encoding so token counts stay approximately correct.
            self.encoding = tiktoken.get_encoding('o200k_base')
        with open(config_path, 'r') as f:
            self.prompts = yaml.safe_load(f)

    def count_tokens(self, text: str) -> int:
        return len(self.encoding.encode(text))

    def chunk_transcript(self, transcript: str, max_tokens: int = 12000) -> list:
        """Split long transcripts into overlapping chunks for processing."""
        paragraphs = transcript.split('\n\n')
        chunks = []
        current_chunk = []
        current_tokens = 0
        for para in paragraphs:
            para_tokens = self.count_tokens(para)
            if current_tokens + para_tokens > max_tokens and current_chunk:
                chunks.append('\n\n'.join(current_chunk))
                # Keep the last 2 paragraphs for overlap
                current_chunk = current_chunk[-2:]
                current_tokens = sum(self.count_tokens(p) for p in current_chunk)
            current_chunk.append(para)
            current_tokens += para_tokens
        if current_chunk:
            chunks.append('\n\n'.join(current_chunk))
        return chunks

    def process_transcript(self, transcript_text: str, metadata: dict) -> dict:
        """Process a single interview transcript and return structured analysis."""
        system_prompt = self.prompts['system_prompt']
        user_prompt = self.prompts['user_prompt_template'].format(
            project_name=metadata.get('project_name', 'Unknown Project'),
            project_description=metadata.get('project_description', ''),
            interviewee_name=metadata.get('interviewee_name', 'Unknown'),
            interviewee_role=metadata.get('interviewee_role', 'Unknown'),
            interview_date=metadata.get('interview_date', 'Unknown'),
            interviewer_name=metadata.get('interviewer_name', 'Unknown'),
            transcript_text=transcript_text
        )

        token_count = self.count_tokens(system_prompt + user_prompt)
        if token_count > 120000:  # GPT-5.4 context limit safety margin
            return self._process_chunked(transcript_text, metadata)

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {'role': 'system', 'content': system_prompt},
                {'role': 'user', 'content': user_prompt}
            ],
            temperature=0.3,
            response_format={'type': 'json_object'},
            max_tokens=int(os.getenv('MAX_TOKENS_PER_REQUEST', 16000))
        )

        result = json.loads(response.choices[0].message.content)
        result['_metadata'] = {
            'model': self.model,
            'tokens_used': response.usage.total_tokens,
            'input_tokens': response.usage.prompt_tokens,
            'output_tokens': response.usage.completion_tokens,
            'cost_estimate': self._estimate_cost(response.usage)
        }
        return result

    def _process_chunked(self, transcript_text: str, metadata: dict) -> dict:
        """Process very long transcripts by chunking and merging results."""
        chunks = self.chunk_transcript(transcript_text)
        chunk_results = []
        for i, chunk in enumerate(chunks):
            metadata_copy = metadata.copy()
            metadata_copy['chunk_info'] = f'Part {i+1} of {len(chunks)}'
            result = self.process_transcript(chunk, metadata_copy)
            chunk_results.append(result)
        return self._merge_chunk_results(chunk_results, metadata)

    def _merge_chunk_results(self, results: list, metadata: dict) -> dict:
        """Merge multiple chunk analyses into a single coherent result."""
        merged = {
            'interviewee_summary': results[0].get('interviewee_summary', {}),
            'themes_identified': [],
            'pain_points': [],
            'opportunities': [],
            'recommendations_from_interviewee': [],
            'contradictions_or_tensions': [],
            'notable_quotes': []
        }
        seen_themes = set()
        for r in results:
            for theme in r.get('themes_identified', []):
                if theme['theme_name'] not in seen_themes:
                    merged['themes_identified'].append(theme)
                    seen_themes.add(theme['theme_name'])
            merged['pain_points'].extend(r.get('pain_points', []))
            merged['opportunities'].extend(r.get('opportunities', []))
            merged['recommendations_from_interviewee'].extend(r.get('recommendations_from_interviewee', []))
            merged['contradictions_or_tensions'].extend(r.get('contradictions_or_tensions', []))
            merged['notable_quotes'].extend(r.get('notable_quotes', []))
        return merged

    def _estimate_cost(self, usage) -> str:
        """Estimate request cost in USD at $2.50/1M input, $10/1M output tokens."""
        input_cost = (usage.prompt_tokens / 1_000_000) * 2.50
        output_cost = (usage.completion_tokens / 1_000_000) * 10.00
        return f"${input_cost + output_cost:.4f}"
```
        output_cost = (usage.completion_tokens / 1_000_000) * 10.00
        return f'${input_cost + output_cost:.4f}'

if __name__ == '__main__':
    import sys
    from pathlib import Path
    processor = TranscriptProcessor()
    transcript_file = sys.argv[1]
    with open(transcript_file, 'r') as f:
        text = f.read()
    metadata = {
        'project_name': sys.argv[2] if len(sys.argv) > 2 else 'Test',
        'interviewee_name': Path(transcript_file).stem
    }
    result = processor.process_transcript(text, metadata)
    print(json.dumps(result, indent=2))

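One caveat with `_merge_chunk_results`: it deduplicates on the exact theme name, so near-duplicates produced by different chunks ("Data Silos" vs. "data silos") slip through. A sketch of a more forgiving merge, written as a standalone helper (not part of the listing above), normalizes the key first:

```python
def dedupe_themes(chunk_theme_lists):
    """Merge theme lists from chunk analyses, keeping the first occurrence
    of each theme name, compared case- and whitespace-insensitively."""
    merged, seen = [], set()
    for themes in chunk_theme_lists:
        for theme in themes:
            key = theme['theme_name'].strip().casefold()
            if key not in seen:
                seen.add(key)
                merged.append(theme)
    return merged

chunk_a = [{'theme_name': 'Data Silos'}, {'theme_name': 'Change Fatigue'}]
chunk_b = [{'theme_name': 'data silos '}, {'theme_name': 'Tooling Gaps'}]
print([t['theme_name'] for t in dedupe_themes([chunk_a, chunk_b])])
# → ['Data Silos', 'Change Fatigue', 'Tooling Gaps']
```

The same normalization could be applied inside `_merge_chunk_results` by keying `seen_themes` on the normalized name.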
Cross-Interview Theme Synthesizer

Type: prompt

Stage 2 of the pipeline. Takes the structured outputs from multiple individual transcript analyses (from Stage 1) and synthesizes them into consolidated cross-cutting themes with frequency analysis, sentiment patterns, stakeholder alignment mapping, and evidence triangulation. This is the core intelligence component that produces consulting-grade thematic findings.

Implementation:

Prompt Template (config/prompts.yaml - theme_synthesis section)

theme_synthesis_system_prompt: |
  You are a senior consulting analyst synthesizing findings from multiple
  stakeholder interviews. Your task is to identify cross-cutting themes,
  patterns of agreement and disagreement, and generate strategic insights
  that will form the basis of a client deliverable.

  Principles:
  - Themes should be substantive and actionable, not generic
  - Always ground themes in evidence (specific quotes from specific stakeholders)
  - Identify where stakeholders agree AND where they diverge
  - Note the strength of each theme (how many stakeholders mentioned it, how prominently)
  - Distinguish between symptoms and root causes
  - Provide enough context for a consultant to draft deliverable content directly from your output

theme_synthesis_user_prompt: |
  You are synthesizing {interview_count} stakeholder interviews for the project: {project_name}.

  Project Description: {project_description}

  Here are the individual interview analyses:

  {interview_analyses_json}

  Synthesize these into a comprehensive thematic analysis.

  Output valid JSON:
  {{
    "synthesis_metadata": {{
      "project_name": "string",
      "total_interviews": integer,
      "stakeholder_roles": ["list of unique roles"],
      "synthesis_date": "string"
    }},
    "executive_summary": "3-4 paragraph executive summary of key findings suitable for a client deliverable",
    "major_themes": [
      {{
        "theme_id": "T1",
        "theme_name": "Concise theme label",
        "theme_description": "2-3 sentence description of this theme",
        "strength": "strong|moderate|emerging",
        "frequency": {{
          "stakeholders_mentioning": integer,
          "total_stakeholders": integer,
          "percentage": float
        }},
        "sentiment_distribution": {{"positive": integer, "negative": integer, "neutral": integer, "mixed": integer}},
        "stakeholder_perspectives": [
          {{
            "stakeholder_name": "string",
            "role": "string",
            "perspective": "1-2 sentence summary",
            "key_quote": "exact quote"
          }}
        ],
        "sub_themes": [{{"name": "string", "description": "string"}}],
        "implications": "What this theme means for the client/project",
        "recommended_actions": ["list of potential actions"]
      }}
    ],
    "alignment_analysis": {{
      "areas_of_consensus": [
        {{"topic": "string", "description": "What stakeholders broadly agree on", "stakeholders": ["names"]}}
      ],
      "areas_of_divergence": [
        {{
          "topic": "string",
          "perspective_a": {{"position": "string", "stakeholders": ["names"], "quotes": ["supporting quotes"]}},
          "perspective_b": {{"position": "string", "stakeholders": ["names"], "quotes": ["supporting quotes"]}},
          "significance": "Why this divergence matters"
        }}
      ]
    }},
    "cross_cutting_insights": [
      {{
        "insight": "A strategic insight not directly stated but emerging from patterns across interviews",
        "evidence": "What patterns support this insight",
        "confidence": "high|medium|low"
      }}
    ],
    "priority_findings": [
      {{
        "finding_id": "F1",
        "finding": "Clear, concise finding statement",
        "supporting_themes": ["T1", "T3"],
        "evidence_strength": "strong|moderate|limited",
        "urgency": "immediate|short-term|long-term",
        "impact": "high|medium|low"
      }}
    ],
    "recommended_deliverable_structure": {{
      "suggested_sections": ["Ordered list of sections for the final deliverable"],
      "key_messages": ["3-5 key messages for the executive audience"]
    }}
  }}
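Because downstream stages assume this JSON shape, it is worth checking the model's response before Stage 3 consumes it. A minimal shape check is sketched below (key names are taken from the schema above; production code might prefer a `jsonschema` validator instead):

```python
REQUIRED_TOP_LEVEL = {
    'synthesis_metadata', 'executive_summary', 'major_themes',
    'alignment_analysis', 'cross_cutting_insights', 'priority_findings',
    'recommended_deliverable_structure',
}

def validate_synthesis(data: dict) -> list:
    """Return a list of problems; an empty list means the shape looks usable."""
    problems = [f'missing key: {k}' for k in sorted(REQUIRED_TOP_LEVEL - data.keys())]
    for i, theme in enumerate(data.get('major_themes', [])):
        for k in ('theme_id', 'theme_name', 'theme_description'):
            if k not in theme:
                problems.append(f'major_themes[{i}] missing {k}')
    return problems

print(validate_synthesis({'major_themes': [{'theme_id': 'T1'}]}))
```

Run this on the parsed response and fail fast (or retry the API call) if the problem list is non-empty, rather than letting a malformed synthesis reach the deliverable generator.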
Python implementation (theme_synthesizer.py)
python
import json
import os
import yaml
from openai import OpenAI
from datetime import datetime

class ThemeSynthesizer:
    def __init__(self, config_path='config/prompts.yaml'):
        self.client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
        self.model = os.getenv('OPENAI_MODEL', 'gpt-5.4')
        with open(config_path, 'r') as f:
            self.prompts = yaml.safe_load(f)

    def synthesize_themes(self, interview_analyses: list, project_metadata: dict) -> dict:
        """Synthesize themes across multiple interview analyses."""
        # Prepare condensed version of analyses to fit in context window
        condensed = self._condense_analyses(interview_analyses)

        system_prompt = self.prompts['theme_synthesis_system_prompt']
        user_prompt = self.prompts['theme_synthesis_user_prompt'].format(
            interview_count=len(interview_analyses),
            project_name=project_metadata.get('project_name', 'Unknown'),
            project_description=project_metadata.get('project_description', ''),
            interview_analyses_json=json.dumps(condensed, indent=2)
        )

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {'role': 'system', 'content': system_prompt},
                {'role': 'user', 'content': user_prompt}
            ],
            temperature=0.3,
            response_format={'type': 'json_object'},
            max_tokens=16000
        )

        result = json.loads(response.choices[0].message.content)
        result['_processing_metadata'] = {
            'model': self.model,
            'total_tokens': response.usage.total_tokens,
            'cost_estimate': self._estimate_cost(response.usage),
            'processed_at': datetime.now().isoformat(),
            'interview_count': len(interview_analyses)
        }
        return result

    def _condense_analyses(self, analyses: list) -> list:
        """Remove verbose fields to fit more interviews in context."""
        condensed = []
        for a in analyses:
            c = {
                'interviewee': a.get('interviewee_summary', {}),
                'themes': [{
                    'name': t['theme_name'],
                    'description': t.get('description', ''),
                    'sentiment': t.get('sentiment', ''),
                    'quotes': [q['quote'] for q in t.get('supporting_quotes', [])[:3]]
                } for t in a.get('themes_identified', [])],
                'pain_points': a.get('pain_points', []),
                'opportunities': a.get('opportunities', []),
                'recommendations': a.get('recommendations_from_interviewee', []),
                'contradictions': a.get('contradictions_or_tensions', []),
                'top_quotes': [q['quote'] for q in a.get('notable_quotes', [])[:5]]
            }
            condensed.append(c)
        return condensed

    def _estimate_cost(self, usage) -> str:
        input_cost = (usage.prompt_tokens / 1_000_000) * 2.50
        output_cost = (usage.completion_tokens / 1_000_000) * 10.00
        return f'${input_cost + output_cost:.4f}'

    def generate_theme_summary_markdown(self, synthesis: dict) -> str:
        """Generate a human-readable Markdown summary from the synthesis."""
        md = f"# Thematic Analysis: {synthesis.get('synthesis_metadata', {}).get('project_name', 'Project')}\n\n"
        md += f"## Executive Summary\n\n{synthesis.get('executive_summary', '')}\n\n"
        md += f"## Major Themes\n\n"
        for theme in synthesis.get('major_themes', []):
            freq = theme.get('frequency', {})
            md += f"### {theme['theme_id']}: {theme['theme_name']}\n\n"
            md += f"{theme.get('theme_description', '')}\n\n"
            md += f"**Strength:** {theme.get('strength', 'N/A')} | "
            md += f"**Mentioned by:** {freq.get('stakeholders_mentioning', '?')}/{freq.get('total_stakeholders', '?')} stakeholders\n\n"
            for sp in theme.get('stakeholder_perspectives', []):
                md += f'- **{sp["stakeholder_name"]}** ({sp["role"]}): "{sp["key_quote"]}"\n'
            md += f"\n**Implications:** {theme.get('implications', '')}\n\n"
        md += f"## Areas of Consensus\n\n"
        for area in synthesis.get('alignment_analysis', {}).get('areas_of_consensus', []):
            md += f"- **{area['topic']}**: {area['description']}\n"
        md += f"\n## Areas of Divergence\n\n"
        for div in synthesis.get('alignment_analysis', {}).get('areas_of_divergence', []):
            md += f"### {div['topic']}\n\n{div.get('significance', '')}\n\n"
        md += f"## Priority Findings\n\n"
        for f in synthesis.get('priority_findings', []):
            md += f"- **[{f['finding_id']}]** {f['finding']} (Impact: {f.get('impact', 'N/A')}, Urgency: {f.get('urgency', 'N/A')})\n"
        return md

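Language models are unreliable at arithmetic, so the frequency percentages they report are worth recomputing deterministically before they reach a deliverable. A standalone sketch (field names follow the synthesis schema above):

```python
def recompute_frequencies(synthesis: dict, total_stakeholders: int) -> dict:
    """Overwrite model-reported frequency figures with exact arithmetic."""
    for theme in synthesis.get('major_themes', []):
        freq = theme.setdefault('frequency', {})
        mentioned = freq.get('stakeholders_mentioning', 0)
        freq['total_stakeholders'] = total_stakeholders
        freq['percentage'] = (
            round(100.0 * mentioned / total_stakeholders, 1)
            if total_stakeholders else 0.0
        )
    return synthesis

s = {'major_themes': [{'frequency': {'stakeholders_mentioning': 7, 'percentage': 58.0}}]}
print(recompute_frequencies(s, 12)['major_themes'][0]['frequency']['percentage'])
# → 58.3
```

Calling this right after `synthesize_themes` (passing `len(interview_analyses)` as the total) keeps the counts in the Word and PowerPoint outputs honest.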
Deliverable Draft Generator

Type: workflow

Stage 3 of the pipeline. Takes the synthesized themes from Stage 2 and generates pre-populated Word and PowerPoint documents using the client's branded templates. These drafts serve as the starting point for consultant review and refinement, and can also be further polished using Microsoft 365 Copilot.

Implementation:

deliverable_drafter.py
python
# Python implementation (deliverable_drafter.py)

import json
import os
from docx import Document
from docx.shared import Inches, Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
from pptx import Presentation
from pptx.util import Inches as PptxInches, Pt as PptxPt
from datetime import datetime

class DeliverableDrafter:
    def __init__(self, template_dir='templates'):
        self.template_dir = template_dir

    def generate_word_report(self, synthesis: dict, output_path: str, template_path: str = None):
        """Generate a Word document from synthesis results."""
        if template_path and os.path.exists(template_path):
            doc = Document(template_path)
        else:
            doc = Document()

        # Title
        doc.add_heading('Stakeholder Interview Findings', level=0)
        subtitle = doc.add_paragraph()
        subtitle.text = f"{synthesis.get('synthesis_metadata', {}).get('project_name', 'Project')}\n"
        subtitle.text += f"Date: {datetime.now().strftime('%B %d, %Y')}\n"
        subtitle.text += f"Interviews Conducted: {synthesis.get('synthesis_metadata', {}).get('total_interviews', 'N/A')}"
        subtitle.style = doc.styles['Subtitle']

        # Executive Summary
        doc.add_heading('Executive Summary', level=1)
        exec_summary = synthesis.get('executive_summary', '')
        for para in exec_summary.split('\n\n'):
            doc.add_paragraph(para.strip())

        # Key Messages Box
        rec_structure = synthesis.get('recommended_deliverable_structure', {})
        key_messages = rec_structure.get('key_messages', [])
        if key_messages:
            doc.add_heading('Key Messages', level=2)
            for msg in key_messages:
                p = doc.add_paragraph(msg, style='List Bullet')

        # Major Themes
        doc.add_heading('Major Themes', level=1)
        for theme in synthesis.get('major_themes', []):
            doc.add_heading(f"{theme['theme_id']}: {theme['theme_name']}", level=2)
            doc.add_paragraph(theme.get('theme_description', ''))

            # Frequency info
            freq = theme.get('frequency', {})
            freq_text = f"Mentioned by {freq.get('stakeholders_mentioning', '?')} of {freq.get('total_stakeholders', '?')} stakeholders ({freq.get('percentage', 0):.0f}%). Theme strength: {theme.get('strength', 'N/A').upper()}"
            p = doc.add_paragraph(freq_text)
            p.runs[0].italic = True

            # Stakeholder perspectives with quotes
            if theme.get('stakeholder_perspectives'):
                doc.add_heading('Stakeholder Perspectives', level=3)
                for sp in theme['stakeholder_perspectives']:
                    p = doc.add_paragraph()
                    runner = p.add_run(f"{sp['stakeholder_name']} ({sp['role']}): ")
                    runner.bold = True
                    p.add_run(sp.get('perspective', ''))
                    if sp.get('key_quote'):
                        quote_para = doc.add_paragraph()
                        quote_run = quote_para.add_run(f'"{sp["key_quote"]}"')
                        quote_run.italic = True
                        quote_para.paragraph_format.left_indent = Inches(0.5)

            # Implications
            if theme.get('implications'):
                doc.add_heading('Implications', level=3)
                doc.add_paragraph(theme['implications'])

            # Recommended Actions
            if theme.get('recommended_actions'):
                doc.add_heading('Recommended Actions', level=3)
                for action in theme['recommended_actions']:
                    doc.add_paragraph(action, style='List Bullet')

        # Alignment Analysis
        alignment = synthesis.get('alignment_analysis', {})
        doc.add_heading('Stakeholder Alignment Analysis', level=1)

        if alignment.get('areas_of_consensus'):
            doc.add_heading('Areas of Consensus', level=2)
            for area in alignment['areas_of_consensus']:
                doc.add_paragraph(f"{area['topic']}: {area['description']}")

        if alignment.get('areas_of_divergence'):
            doc.add_heading('Areas of Divergence', level=2)
            for div in alignment['areas_of_divergence']:
                doc.add_heading(div['topic'], level=3)
                doc.add_paragraph(div.get('significance', ''))
                pa = div.get('perspective_a', {})
                pb = div.get('perspective_b', {})
                if pa:
                    p = doc.add_paragraph()
                    p.add_run('View A: ').bold = True
                    p.add_run(f"{pa.get('position', '')} (Held by: {', '.join(pa.get('stakeholders', []))})")
                if pb:
                    p = doc.add_paragraph()
                    p.add_run('View B: ').bold = True
                    p.add_run(f"{pb.get('position', '')} (Held by: {', '.join(pb.get('stakeholders', []))})")

        # Priority Findings
        doc.add_heading('Priority Findings', level=1)
        for finding in synthesis.get('priority_findings', []):
            p = doc.add_paragraph()
            p.add_run(f"[{finding['finding_id']}] ").bold = True
            p.add_run(finding['finding'])
            detail = f"Evidence: {finding.get('evidence_strength', 'N/A')} | Impact: {finding.get('impact', 'N/A')} | Urgency: {finding.get('urgency', 'N/A')}"
            det_p = doc.add_paragraph(detail)
            det_p.runs[0].italic = True

        # Cross-cutting Insights
        if synthesis.get('cross_cutting_insights'):
            doc.add_heading('Cross-Cutting Insights', level=1)
            for insight in synthesis['cross_cutting_insights']:
                doc.add_paragraph(insight['insight'])
                if insight.get('evidence'):
                    ev = doc.add_paragraph(f"Evidence: {insight['evidence']}")
                    ev.runs[0].italic = True

        # Save
        doc.save(output_path)
        return output_path

    def generate_pptx_summary(self, synthesis: dict, output_path: str, template_path: str = None):
        """Generate a PowerPoint summary from synthesis results."""
        if template_path and os.path.exists(template_path):
            prs = Presentation(template_path)
        else:
            prs = Presentation()

        # Title slide
        slide = prs.slides.add_slide(prs.slide_layouts[0])
        slide.shapes.title.text = 'Stakeholder Interview Findings'
        slide.placeholders[1].text = f"{synthesis.get('synthesis_metadata', {}).get('project_name', '')}\n{datetime.now().strftime('%B %Y')}"

        # Executive Summary slide
        slide = prs.slides.add_slide(prs.slide_layouts[1])
        slide.shapes.title.text = 'Executive Summary'
        exec_text = synthesis.get('executive_summary', '')
        # Truncate for slide
        if len(exec_text) > 800:
            exec_text = exec_text[:797] + '...'
        slide.placeholders[1].text = exec_text

        # Key Messages slide
        key_messages = synthesis.get('recommended_deliverable_structure', {}).get('key_messages', [])
        if key_messages:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = 'Key Messages'
            body = '\n'.join([f'• {msg}' for msg in key_messages])
            slide.placeholders[1].text = body

        # Theme overview slide
        themes = synthesis.get('major_themes', [])
        if themes:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = f'Themes Overview ({len(themes)} Themes Identified)'
            theme_list = '\n'.join([f"• {t['theme_id']}: {t['theme_name']} ({t.get('strength', 'N/A')})" for t in themes])
            slide.placeholders[1].text = theme_list

        # Individual theme slides
        for theme in themes[:8]:  # Limit to 8 theme slides
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = f"{theme['theme_id']}: {theme['theme_name']}"
            freq = theme.get('frequency', {})
            content = f"{theme.get('theme_description', '')}\n\n"
            content += f"Mentioned by {freq.get('stakeholders_mentioning', '?')}/{freq.get('total_stakeholders', '?')} stakeholders\n\n"
            perspectives = theme.get('stakeholder_perspectives', [])[:3]
            for sp in perspectives:
                content += f'• "{sp.get("key_quote", "")}" — {sp["stakeholder_name"]}\n'
            if theme.get('implications'):
                content += f"\nImplications: {theme['implications']}"
            slide.placeholders[1].text = content

        # Priority Findings slide
        findings = synthesis.get('priority_findings', [])
        if findings:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = 'Priority Findings'
            findings_text = '\n'.join([f"• [{f['finding_id']}] {f['finding']}" for f in findings[:6]])
            slide.placeholders[1].text = findings_text

        prs.save(output_path)
        return output_path

if __name__ == '__main__':
    import sys
    synthesis_file = sys.argv[1]
    with open(synthesis_file, 'r') as f:
        synthesis = json.load(f)
    drafter = DeliverableDrafter()
    drafter.generate_word_report(synthesis, 'outputs/findings_report.docx')
    drafter.generate_pptx_summary(synthesis, 'outputs/findings_summary.pptx')
    print('Deliverables generated successfully.')

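One small refinement worth considering: the slide generator truncates the executive summary at a fixed character count, which can cut mid-word. A word-boundary variant (a drop-in alternative, not part of the listing above):

```python
def truncate_for_slide(text: str, limit: int = 800) -> str:
    """Truncate at the last word boundary before `limit`, appending an ellipsis."""
    if len(text) <= limit:
        return text
    cut = text[:limit - 3]
    if ' ' in cut:
        cut = cut[:cut.rfind(' ')]
    return cut + '...'

print(truncate_for_slide('alpha beta gamma', 10))  # → alpha...
print(truncate_for_slide('short', 10))             # → short
```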
Main Synthesis Pipeline Orchestrator

Type: workflow

The master orchestration script that ties together all three stages: (1) process individual transcripts, (2) synthesize cross-interview themes, and (3) generate deliverable drafts. It provides a CLI for consultants and can also be exposed as a REST API for Zapier integration.

Implementation:

synthesis_pipeline.py
python
#!/usr/bin/env python3
# synthesis_pipeline.py - Main orchestrator for interview synthesis pipeline

import json
import os
import sys
import argparse
import glob
from pathlib import Path
from datetime import datetime
from transcript_processor import TranscriptProcessor
from theme_synthesizer import ThemeSynthesizer
from deliverable_drafter import DeliverableDrafter

def load_project_config(config_path: str) -> dict:
    """Load project configuration from JSON file."""
    with open(config_path, 'r') as f:
        return json.load(f)

def discover_transcripts(transcript_dir: str) -> list:
    """Find all transcript files in the specified directory."""
    extensions = ['*.txt', '*.md', '*.vtt', '*.srt']
    files = []
    for ext in extensions:
        files.extend(glob.glob(os.path.join(transcript_dir, ext)))
    return sorted(files)

def parse_transcript_metadata(filepath: str) -> dict:
    """Extract metadata from transcript filename convention.
    Expected format: YYYY-MM-DD_IntervieweeName_Role.txt
    Falls back to filename as interviewee name if convention not followed."""
    stem = Path(filepath).stem
    parts = stem.split('_')
    if len(parts) >= 3:
        return {
            'interview_date': parts[0],
            'interviewee_name': parts[1].replace('-', ' '),
            'interviewee_role': ' '.join(parts[2:]).replace('-', ' ')
        }
    return {
        'interview_date': 'Unknown',
        'interviewee_name': stem.replace('-', ' ').replace('_', ' '),
        'interviewee_role': 'Unknown'
    }

def main():
    parser = argparse.ArgumentParser(description='AI Interview Synthesis Pipeline')
    parser.add_argument('--input', '-i', required=True, help='Directory of transcript files OR single transcript file')
    parser.add_argument('--project', '-p', required=True, help='Project name')
    parser.add_argument('--description', '-d', default='', help='Project description for context')
    parser.add_argument('--output', '-o', default='outputs', help='Output directory')
    parser.add_argument('--config', '-c', default=None, help='Path to project config JSON')
    parser.add_argument('--word-template', default=None, help='Path to Word template .docx')
    parser.add_argument('--pptx-template', default=None, help='Path to PowerPoint template .pptx')
    parser.add_argument('--skip-transcripts', action='store_true', help='Skip Stage 1 (use existing analyses)')
    parser.add_argument('--skip-deliverables', action='store_true', help='Skip Stage 3 (no deliverable generation)')
    args = parser.parse_args()

    # Setup output directory
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    output_dir = os.path.join(args.output, f"{args.project.replace(' ', '_')}_{timestamp}")
    os.makedirs(output_dir, exist_ok=True)
    os.makedirs(os.path.join(output_dir, 'individual_analyses'), exist_ok=True)

    project_metadata = {
        'project_name': args.project,
        'project_description': args.description,
        'processed_at': datetime.now().isoformat()
    }

    if args.config:
        project_metadata.update(load_project_config(args.config))

    print(f'\n=== Interview Synthesis Pipeline ===')
    print(f'Project: {args.project}')
    print(f'Output: {output_dir}')

    # Stage 1: Process Individual Transcripts
    individual_analyses = []
    if not args.skip_transcripts:
        print(f'\n--- Stage 1: Individual Transcript Analysis ---')
        processor = TranscriptProcessor()

        if os.path.isdir(args.input):
            transcript_files = discover_transcripts(args.input)
        else:
            transcript_files = [args.input]

        print(f'Found {len(transcript_files)} transcript(s)')

        for i, filepath in enumerate(transcript_files, 1):
            print(f'  Processing [{i}/{len(transcript_files)}]: {Path(filepath).name}')
            with open(filepath, 'r', encoding='utf-8') as f:
                transcript_text = f.read()

            metadata = parse_transcript_metadata(filepath)
            metadata.update(project_metadata)

            analysis = processor.process_transcript(transcript_text, metadata)
            individual_analyses.append(analysis)

            # Save individual analysis
            analysis_path = os.path.join(output_dir, 'individual_analyses', f'{Path(filepath).stem}_analysis.json')
            with open(analysis_path, 'w') as f:
                json.dump(analysis, f, indent=2)
            print(f'    -> Saved: {analysis_path}')
            if analysis.get('_metadata', {}).get('cost_estimate'):
                print(f'    -> Cost: {analysis["_metadata"]["cost_estimate"]}')
    else:
        print('\n--- Stage 1: SKIPPED (loading existing analyses) ---')
        # With --skip-transcripts, point --input at a directory of previously
        # saved *_analysis.json files (e.g. a prior run's individual_analyses
        # folder); the freshly created output_dir cannot contain them yet.
        for fpath in sorted(glob.glob(os.path.join(args.input, '*_analysis.json'))):
            with open(fpath, 'r') as fh:
                individual_analyses.append(json.load(fh))
        print(f'Loaded {len(individual_analyses)} existing analyses')

    if not individual_analyses:
        print('ERROR: No transcript analyses found. Exiting.')
        sys.exit(1)

    # Stage 2: Cross-Interview Theme Synthesis
    print(f'\n--- Stage 2: Cross-Interview Theme Synthesis ---')
    print(f'Synthesizing {len(individual_analyses)} interviews...')
    synthesizer = ThemeSynthesizer()
    synthesis = synthesizer.synthesize_themes(individual_analyses, project_metadata)

    synthesis_path = os.path.join(output_dir, 'theme_synthesis.json')
    with open(synthesis_path, 'w') as f:
        json.dump(synthesis, f, indent=2)
    print(f'  -> Saved: {synthesis_path}')

    # Save markdown version
    md_path = os.path.join(output_dir, 'theme_synthesis.md')
    md_content = synthesizer.generate_theme_summary_markdown(synthesis)
    with open(md_path, 'w') as f:
        f.write(md_content)
    print(f'  -> Markdown: {md_path}')

    theme_count = len(synthesis.get('major_themes', []))
    finding_count = len(synthesis.get('priority_findings', []))
    print(f'  -> {theme_count} major themes, {finding_count} priority findings identified')

    # Stage 3: Generate Deliverable Drafts
    if not args.skip_deliverables:
        print(f'\n--- Stage 3: Deliverable Draft Generation ---')
        drafter = DeliverableDrafter()

        word_path = os.path.join(output_dir, f"{args.project.replace(' ', '_')}_Findings_Report.docx")
        drafter.generate_word_report(synthesis, word_path, args.word_template)
        print(f'  -> Word Report: {word_path}')

        pptx_path = os.path.join(output_dir, f"{args.project.replace(' ', '_')}_Findings_Summary.pptx")
        drafter.generate_pptx_summary(synthesis, pptx_path, args.pptx_template)
        print(f'  -> PowerPoint Summary: {pptx_path}')

    # Summary
    print(f'\n=== Pipeline Complete ===')
    print(f'All outputs saved to: {output_dir}')
    total_cost = synthesis.get('_processing_metadata', {}).get('cost_estimate', 'N/A')
    print(f'Estimated API cost for synthesis stage: {total_cost}')
    print(f'Review the theme_synthesis.md file for a human-readable summary.')
    print(f'Open the Word/PPTX files and refine with Microsoft 365 Copilot for final polish.')

if __name__ == '__main__':
    main()

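The filename convention handled by `parse_transcript_metadata` is easy to get wrong when naming files, so a worked example helps. The following is a standalone reimplementation of the same parse, for illustration only:

```python
from pathlib import Path

def parse_stem(stem: str) -> dict:
    # Mirrors the pipeline convention: YYYY-MM-DD_IntervieweeName_Role
    parts = stem.split('_')
    if len(parts) >= 3:
        return {
            'interview_date': parts[0],
            'interviewee_name': parts[1].replace('-', ' '),
            'interviewee_role': ' '.join(parts[2:]).replace('-', ' '),
        }
    return {
        'interview_date': 'Unknown',
        'interviewee_name': stem.replace('-', ' ').replace('_', ' '),
        'interviewee_role': 'Unknown',
    }

print(parse_stem(Path('2025-03-14_Jane-Doe_VP-Operations.txt').stem))
# → {'interview_date': '2025-03-14', 'interviewee_name': 'Jane Doe', 'interviewee_role': 'VP Operations'}
```

Note that hyphens inside the name and role segments become spaces, so `Jane-Doe` must be written with a hyphen, not a space, in the filename.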
Interview Consent Form Template

Type: prompt

A standardized consent form template that must be signed by all interview participants before recording begins. This is a critical compliance component for the solution. The MSP should customize this with the client's specific branding and legal language.

Implementation:

1. Save as: templates/interview_consent_form.md
2. Convert to a branded Word document using the client's template
  • Project: [PROJECT NAME]
  • Client: [CLIENT ORGANIZATION]
  • Conducted by: [CONSULTING FIRM NAME]
  • Date: [DATE]

Purpose

You have been invited to participate in a stakeholder interview as part of [PROJECT NAME]. The purpose of this interview is to gather your perspectives, experiences, and insights on [TOPIC AREA]. Your input will help inform [DELIVERABLE TYPE, e.g., 'strategic recommendations', 'assessment findings', 'transformation roadmap'].

Recording & Transcription

With your permission, this interview will be:

  • Audio recorded for accuracy purposes
  • Automatically transcribed using AI-powered transcription software (Otter.ai / Microsoft Teams)
  • Analyzed using AI tools (including large language models) to identify themes and patterns across all stakeholder interviews

How Your Data Will Be Used

  • Your responses will be aggregated with other stakeholder interviews to identify common themes
  • Direct quotes may be used in deliverables but will be attributed by role only (e.g., 'Senior Manager') unless you grant explicit permission for name attribution
  • AI tools will process your transcript to extract themes; no AI system will make decisions based on your individual responses
  • Raw transcripts and recordings will be deleted within [90] days of project completion

Confidentiality

  • All data is stored on encrypted, SOC 2-certified cloud platforms
  • Access is restricted to the project team at [CONSULTING FIRM NAME]
  • Your data will not be sold, shared with third parties, or used for any purpose beyond this project
  • [CONSULTING FIRM NAME] has executed Data Processing Agreements with all technology vendors

Your Rights

  • You may decline to answer any question
  • You may withdraw your consent at any time by contacting [CONTACT EMAIL]
  • You may request deletion of your interview data at any time
  • You may opt out of AI processing while still participating in the interview (manual analysis only)

By signing below, I confirm that:

  • [ ] I have read and understood the information above
  • [ ] I consent to the audio recording and AI-assisted transcription of this interview
  • [ ] I consent to AI-assisted analysis of my transcript as described above
  • [ ] I understand that role-attributed quotes may appear in project deliverables

Optional:

  • [ ] I grant permission for quotes to be attributed to me by name

Participant Name: ________________________________

Participant Signature: ________________________________

Date: ________________________________

Interviewer Name: ________________________________

Note

This form should be reviewed and approved by your organization's legal counsel before use. Modify retention periods, attribution policies, and data handling procedures to match your specific regulatory requirements (GDPR, CCPA, HIPAA, etc.).

Zapier-to-Pipeline Webhook API

Type: integration

A lightweight Flask API that receives webhook calls from Zapier when new transcripts appear in SharePoint, triggers the synthesis pipeline, and sends a callback when processing is complete. It runs on the same VM or workstation as the synthesis pipeline.

Implementation:

webhook_api.py
python
# webhook_api.py - Flask API for Zapier integration
# Run with: python webhook_api.py (or deploy via gunicorn)

import hmac
import os
import subprocess
import threading
from datetime import datetime

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

# Configuration
PIPELINE_SCRIPT = os.getenv('PIPELINE_SCRIPT', './synthesis_pipeline.py')
TRANSCRIPT_DIR = os.getenv('TRANSCRIPT_DIR', './transcripts')
OUTPUT_DIR = os.getenv('OUTPUT_DIR', './outputs')
CALLBACK_URL = os.getenv('ZAPIER_CALLBACK_URL', '')  # Zapier webhook catch URL
API_KEY = os.getenv('WEBHOOK_API_KEY', 'change-me-to-a-secure-key')

def verify_api_key(req):
    """Timing-safe API key check (header or query parameter)."""
    key = req.headers.get('X-API-Key') or req.args.get('api_key') or ''
    return hmac.compare_digest(key, API_KEY)

def run_pipeline_async(project_name, transcript_dir, description=''):
    """Run the synthesis pipeline in a background thread."""
    def _run():
        try:
            cmd = [
                'python3', PIPELINE_SCRIPT,
                '--input', transcript_dir,
                '--project', project_name,
                '--description', description,
                '--output', OUTPUT_DIR
            ]
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)

            # Send callback to Zapier
            if CALLBACK_URL:
                payload = {
                    'project_name': project_name,
                    'status': 'success' if result.returncode == 0 else 'error',
                    'output': result.stdout[-500:] if result.stdout else '',
                    'error': result.stderr[-500:] if result.stderr else '',
                    'completed_at': datetime.now().isoformat()
                }
                requests.post(CALLBACK_URL, json=payload, timeout=30)
        except Exception as e:
            if CALLBACK_URL:
                requests.post(CALLBACK_URL, json={
                    'project_name': project_name,
                    'status': 'error',
                    'error': str(e),
                    'completed_at': datetime.now().isoformat()
                }, timeout=30)

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()

@app.route('/api/health', methods=['GET'])
def health():
    return jsonify({'status': 'healthy', 'timestamp': datetime.now().isoformat()})

@app.route('/api/synthesize', methods=['POST'])
def synthesize():
    if not verify_api_key(request):
        return jsonify({'error': 'Unauthorized'}), 401

    data = request.get_json(silent=True) or {}  # tolerate missing or non-JSON bodies
    project_name = data.get('project', 'Unknown Project')
    transcript_dir = data.get('transcript_dir', TRANSCRIPT_DIR)
    description = data.get('description', '')

    if not os.path.exists(transcript_dir):
        return jsonify({'error': f'Transcript directory not found: {transcript_dir}'}), 400

    run_pipeline_async(project_name, transcript_dir, description)

    return jsonify({
        'status': 'processing',
        'project': project_name,
        'message': 'Pipeline started. Callback will be sent to Zapier when complete.',
        'started_at': datetime.now().isoformat()
    }), 202

@app.route('/api/status', methods=['GET'])
def status():
    if not verify_api_key(request):
        return jsonify({'error': 'Unauthorized'}), 401
    # List recent output directories
    outputs = []
    if os.path.exists(OUTPUT_DIR):
        for d in sorted(os.listdir(OUTPUT_DIR), reverse=True)[:10]:
            dir_path = os.path.join(OUTPUT_DIR, d)
            if os.path.isdir(dir_path):
                has_synthesis = os.path.exists(os.path.join(dir_path, 'theme_synthesis.json'))
                outputs.append({'project': d, 'complete': has_synthesis})
    return jsonify({'recent_runs': outputs})

if __name__ == '__main__':
    # For production, run via: gunicorn -w 2 -b 0.0.0.0:8443 webhook_api:app
    port = int(os.getenv('PORT', 8443))
    app.run(host='0.0.0.0', port=port, debug=False)
Install required Python dependencies.
bash
pip install flask==3.0.0 gunicorn==22.0.0 requests==2.32.3

Testing & Validation

  • AUDIO QUALITY TEST: Record a 5-minute test conversation using each Jabra Speak2 75 in the actual conference rooms where interviews will be conducted. Play back the recording and verify all speakers are audible, no significant background noise, and speech is clear. Transcribe with Otter.ai and verify >95% word accuracy rate.
  • OTTER.AI TRANSCRIPTION ACCURACY TEST: Conduct a 15-minute mock stakeholder interview with two participants. Record via Otter.ai. After transcription completes, manually compare 3 random 2-minute segments against the recording. Verify speaker identification is correct (speakers labeled distinctly) and overall word error rate is below 10%.
  • OTTER.AI TO SHAREPOINT INTEGRATION TEST: Record a test meeting tagged 'stakeholder-interview' in Otter.ai. Verify the Zapier zap triggers and the transcript file appears in the SharePoint /Transcripts library within 5 minutes. Confirm the file is readable and contains the full transcript text.
  • TRANSCRIPT PROCESSOR (STAGE 1) TEST: Run transcript_processor.py against a sample 45-minute interview transcript. Verify the output JSON contains: (a) valid interviewee_summary with correct role and name, (b) at least 3 themes_identified with supporting quotes that actually appear in the transcript, (c) at least 1 pain_point and 1 opportunity, (d) all quotes in the output are verbatim from the source transcript (no hallucinated quotes).
  • THEME SYNTHESIZER (STAGE 2) TEST: Process 5+ sample interview transcripts through Stage 1, then run theme_synthesizer.py. Verify: (a) major_themes are distinct and non-overlapping, (b) frequency counts are mathematically correct (stakeholders_mentioning ≤ total_stakeholders), (c) alignment_analysis identifies at least one area of consensus and one area of divergence, (d) executive_summary is coherent and references actual themes, (e) total processing time is under 3 minutes.
  • DELIVERABLE DRAFT GENERATION TEST: Run deliverable_drafter.py with synthesis output. Open the generated .docx file and verify: (a) document opens without errors in Word, (b) all sections (Executive Summary, Themes, Alignment Analysis, Priority Findings) are populated, (c) quotes in the document match source transcripts, (d) formatting is clean with proper heading hierarchy. Open the .pptx file and verify all slides render correctly.
  • END-TO-END PIPELINE TEST: Run the full synthesis_pipeline.py with --input pointing to a directory of 3-5 sample transcripts. Verify all three stages complete without errors, output directory contains individual_analyses/*.json, theme_synthesis.json, theme_synthesis.md, and Word/PPTX files. Total execution time should be under 10 minutes for 5 interviews.
  • MICROSOFT 365 COPILOT INTEGRATION TEST: Open the generated Word deliverable in Microsoft Word with Copilot enabled. Test these Copilot prompts: (a) 'Summarize the key themes in this document' — verify response references actual themes, (b) 'Rewrite the executive summary for a C-suite audience' — verify output is appropriate, (c) 'Create a table comparing stakeholder perspectives on [Theme 1]' — verify table is generated with relevant data.
  • ZAPIER WEBHOOK INTEGRATION TEST: Send a POST request to the webhook API endpoint with a test payload. Verify: (a) API returns 202 status, (b) pipeline begins processing, (c) Zapier callback webhook receives completion notification with correct project name and status. Test with invalid API key and verify 401 rejection.
  • DATA RETENTION AND COMPLIANCE TEST: Verify that SharePoint retention policy is applied to the Interview Intelligence Hub site. Upload a test document, confirm it is labeled correctly. Verify Otter.ai admin panel shows correct retention settings. Verify OpenAI API organization settings confirm data is not used for training. Test a deletion request workflow end-to-end: request deletion of a specific interview's data and confirm removal from Otter.ai, SharePoint, and Dovetail within 48 hours.
  • CONCURRENT USER TEST: Have 3 consultants simultaneously record and process interviews through the pipeline. Verify there are no conflicts in file storage, API rate limiting is handled gracefully, and all three analyses complete successfully with correct data segregation.
  • SECURITY TEST: Verify all API keys are stored in environment variables (not hardcoded). Confirm the webhook API rejects requests without valid API key. Verify SharePoint permissions restrict Interview Intelligence Hub access to authorized project team members only. Confirm MFA is enforced for all users accessing the platform.
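The Stage 1 "no hallucinated quotes" check can be partially automated rather than done entirely by eye. A minimal sketch — the function name and the whitespace-normalized matching are assumptions, not part of the pipeline itself:

```python
import re

def non_verbatim_quotes(quotes, transcript):
    """Return the quotes that do NOT appear verbatim in the transcript
    (case- and whitespace-insensitive), i.e. candidate hallucinations."""
    def norm(s):
        # Collapse all runs of whitespace so line breaks don't cause misses
        return re.sub(r'\s+', ' ', s).strip().lower()
    haystack = norm(transcript)
    return [q for q in quotes if norm(q) not in haystack]
```

An empty return value means every extracted quote was found in the source transcript; any quotes returned should be manually investigated before the output passes the test.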

Client Handoff

Client Handoff Checklist

Training Topics to Cover

1. Interview Recording Best Practices (all consultants): Hardware setup, Otter.ai recording initiation, consent form workflow, room acoustics tips, handling in-person vs. remote interviews
2. Transcript Quality Review (researchers/analysts): How to spot and correct transcription errors, when to use Whisper API for re-transcription, naming conventions for transcript files
3. Running the Synthesis Pipeline (2-3 power users): CLI usage, project configuration, interpreting JSON and Markdown outputs, troubleshooting common errors
4. Dovetail Analysis Workflow (researchers): Uploading transcripts, reviewing AI tags, manual refinement of themes, creating and managing tag taxonomies, exporting findings
5. Deliverable Creation with Copilot (senior consultants): Using synthesis outputs as Copilot context, effective prompting for draft generation, human review and quality assurance process
6. Compliance Procedures (all staff): Consent form requirements, data handling obligations, deletion request process, what NOT to put into AI tools
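The naming-convention topic above lends itself to an automated check that can be handed to trainees. A minimal sketch — the `YYYY-MM-DD_Client_Interviewee.ext` pattern below is hypothetical; substitute whatever convention the team standardizes on:

```python
import re

# Hypothetical convention: YYYY-MM-DD_Client_Interviewee.ext
# Adjust the pattern to the naming standard agreed during training.
TRANSCRIPT_NAME = re.compile(
    r'^\d{4}-\d{2}-\d{2}_[A-Za-z0-9-]+_[A-Za-z0-9-]+\.(txt|docx|vtt)$')

def valid_transcript_name(filename):
    """True if the filename matches the agreed transcript naming convention."""
    return bool(TRANSCRIPT_NAME.match(filename))
```

Running this over the SharePoint /Transcripts library before a pipeline run catches misnamed files early, before they cause confusing downstream errors.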

Documentation to Deliver

  • Quick Reference Card (1-page PDF): Step-by-step workflow from recording to deliverable
  • Pipeline User Guide (5-10 pages): Detailed instructions for running the synthesis pipeline with screenshots
  • Troubleshooting Guide: Common errors and resolutions (API rate limits, transcription quality issues, file format problems)
  • Prompt Library: Saved Copilot and GPT-5.4 prompts for common deliverable types (assessment reports, strategy recommendations, stakeholder alignment analyses)
  • Compliance Handbook: Consent form template, data retention policy, vendor DPA status, deletion request procedure
  • Admin Guide (for IT contact): Account administration for Otter.ai, Dovetail, OpenAI API; license management; usage monitoring
  • Video Recordings: All training sessions recorded and saved to SharePoint for future onboarding

Success Criteria to Review Together

Transition Details

  • MSP retains admin access to Otter.ai, Dovetail, and OpenAI accounts for ongoing management
  • Client designates 1-2 internal 'AI Champions' for first-line support
  • 30-day hypercare period with weekly check-in calls included in implementation
  • After hypercare, transition to standard managed services agreement

Maintenance

Ongoing Maintenance Plan

Monthly Tasks (MSP Responsibility)

  • Usage & Cost Monitoring: Review OpenAI API usage dashboard for unexpected spikes; verify Otter.ai minutes consumption is within plan limits; check Dovetail seat utilization. Target: keep API costs under $50/month for typical firm.
  • Prompt Quality Review: Review a sample of recent synthesis outputs for quality degradation (theme relevance, quote accuracy, finding actionability). LLM behavior can drift with model updates.
  • Software Updates: Check for updates to Python dependencies (openai, tiktoken, python-docx, python-pptx, flask). Apply security patches within 7 days of release. Run pip list --outdated and test updates in staging before deploying.
  • Backup Verification: Confirm SharePoint retention policies are active and functioning. Verify synthesis output archives are being maintained.
  • License Reconciliation: Review Otter.ai, Dovetail, and Copilot seat assignments. Remove departed employees, add new hires. Optimize plan tier if usage patterns change.
Check for outdated Python dependencies before applying updates in staging:
bash
pip list --outdated

Quarterly Tasks

  • Prompt Optimization Session (2 hours): Review accumulated synthesis outputs with client power users. Refine prompt templates based on what's working well and what needs improvement. Update config/prompts.yaml accordingly.
  • Compliance Audit: Verify all DPAs are current. Check vendor SOC 2 report validity dates. Review data retention compliance (are old transcripts being deleted per policy?). Audit access logs for unauthorized access attempts.
  • Performance Benchmarking: Time the full pipeline for a recent project and compare to baseline. If processing time has increased >25%, investigate API performance or data volume changes.
  • Template Refresh: Update Word/PPTX templates if client has rebranded or changed deliverable formats.
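The >25% benchmarking threshold above reduces to a one-line comparison that can live in the benchmarking script. A sketch — the function name is an assumption:

```python
def pipeline_regressed(baseline_seconds, current_seconds, threshold=0.25):
    """True if the current run is more than `threshold` (fractionally)
    slower than the recorded baseline."""
    return (current_seconds - baseline_seconds) / baseline_seconds > threshold
```

For example, a pipeline that took 400 seconds at baseline and now takes 520 seconds is 30% slower and trips the threshold; 440 seconds (10% slower) does not.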

Trigger-Based Maintenance

  • OpenAI Model Updates: When OpenAI releases new GPT-5.4 versions, test synthesis quality with the new model on 2-3 existing projects before switching. Update OPENAI_MODEL environment variable only after validation. Expected frequency: 2-4 times per year.
  • Vendor Pricing Changes: Monitor Otter.ai, Dovetail, and OpenAI pricing announcements. Renegotiate or switch vendors if cost increases exceed 20%. Maintain vendor comparison matrix.
  • Client Workflow Changes: If client adds new engagement types, interview methodologies, or deliverable formats, create new prompt templates and test thoroughly before deployment.
  • Security Incidents: If any vendor reports a data breach, immediately assess exposure, notify client, and execute incident response per the compliance handbook. Have a vendor replacement plan ready.
  • API Rate Limiting or Outages: If OpenAI API experiences repeated rate limiting, implement exponential backoff in the pipeline (already built into the OpenAI Python SDK). For extended outages, have Anthropic Claude API as a fallback (requires prompt adaptation).
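For calls outside the OpenAI SDK (which retries with backoff automatically), a generic jittered exponential-backoff wrapper covers the same ground. This is a sketch, not the pipeline's actual code:

```python
import random
import time

def with_backoff(fn, max_attempts=5, base=1.0, cap=30.0):
    """Call fn(); on failure, retry with jittered exponential delays
    (base, 2*base, 4*base, ... seconds, capped at `cap`). Re-raises
    the last exception once max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = min(cap, base * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter
```

Wrapping the fallback-provider call path (e.g. an Anthropic client) in the same helper keeps retry behavior consistent across vendors.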

SLA Considerations

  • Response Time: 4-hour response for pipeline errors blocking active engagements; 24-hour response for non-urgent issues
  • Availability: Pipeline should be available during business hours (M-F 8am-6pm client local time). SaaS vendor uptime is governed by their respective SLAs (typically 99.9%)
  • Data Recovery: All synthesis outputs recoverable from SharePoint within 4 hours. Individual transcript re-processing can be triggered at any time.

Escalation Path

1. Level 1 (Client AI Champion): Basic troubleshooting — restart pipeline, re-export transcript, verify file naming conventions
2. Level 2 (MSP Tier 2 Technician): Software configuration issues, integration failures, Zapier troubleshooting, license management
3. Level 3 (MSP Solutions Architect / Developer): Prompt engineering changes, API migration, custom code modifications, vendor replacement
4. Level 4 (Vendor Support): OpenAI, Otter.ai, Dovetail, Microsoft support tickets for platform-level issues
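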

Cost Monitoring Alerts

  • Set OpenAI API monthly budget alert at $75 (warning) and $100 (hard cap)
  • Monitor Otter.ai minutes usage; alert at 80% of plan capacity
  • Review monthly MSP invoice against expected baseline; investigate variances >15%
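The $75 warning / $100 hard-cap thresholds above can be encoded in whatever monitoring script the MSP already runs. A minimal sketch — the month-to-date spend figure would come from the billing dashboard or exported usage data, not from this function:

```python
def budget_status(month_to_date, warn=75.0, cap=100.0):
    """Classify month-to-date API spend against the alert thresholds."""
    if month_to_date >= cap:
        return 'hard-cap'
    if month_to_date >= warn:
        return 'warning'
    return 'ok'
```

A 'warning' result should trigger a review of recent pipeline runs for unexpected volume; 'hard-cap' should pause non-urgent processing until the cause is understood.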

Alternatives

Turnkey SaaS Approach (Dovetail or Insight7 Only)

Replace the custom Python synthesis pipeline with a single all-in-one platform like Dovetail Professional or Insight7. These platforms handle transcription, AI-powered thematic analysis, collaborative coding, and basic reporting within a single web interface. No custom code, no API integrations, no VM required. The MSP simply provisions accounts, uploads transcripts, and trains users on the platform's built-in AI features.

  • COST: Lower total cost (~$75-150/month vs. ~$300+/month for the full stack) and zero development cost.
  • COMPLEXITY: Dramatically simpler — Tier 1 MSP technician can deploy in 1-2 weeks vs. 4-6 weeks for the primary approach.
  • CAPABILITY: Less customizable synthesis prompts; dependent on vendor's AI quality which varies; weaker deliverable generation (no auto-populated Word/PPTX). Limited integration options.
  • RECOMMENDATION: Best for small consulting firms (1-5 people) with straightforward interview analysis needs and low desire for customization. Not recommended for firms that need branded deliverable automation or have complex thematic analysis requirements.

Microsoft-Native Stack (Teams + Azure OpenAI + Copilot)

Build the entire solution within the Microsoft ecosystem. Use Microsoft Teams for interview recording and built-in transcription, Azure OpenAI Service (GPT-5.4) for the synthesis pipeline with enterprise data residency, Azure Blob Storage for transcript storage, Power Automate for workflow orchestration (replacing Zapier), and Microsoft 365 Copilot for deliverable creation. The custom Python pipeline runs on an Azure App Service or Azure Functions instead of a standalone VM.

  • COST: Potentially higher software costs (Azure OpenAI has the same token pricing, but Azure compute adds $30-100/month; Power Automate Premium is $15/user/month), though it eliminates the Otter.ai, Dovetail, and Zapier subscriptions.
  • COMPLEXITY: Medium — requires Azure expertise (Tier 2-3 technician) but all components are within a single vendor ecosystem.
  • CAPABILITY: Strongest data governance and compliance story (all data stays in Azure tenant, SOC 2/HIPAA/FedRAMP ready). Teams transcription quality is slightly below Otter.ai. No dedicated QDA platform means less collaborative analysis UX.
  • RECOMMENDATION: Best for firms already deep in the Microsoft ecosystem, those with strict data sovereignty requirements (government consulting, healthcare), or MSPs with strong Azure practices. Choose this when compliance is the primary concern over user experience.

Enterprise QDA Platform Approach (NVivo or ATLAS.ti)

Use a traditional enterprise qualitative data analysis platform like NVivo 15 (Lumivero, ~$1,350-$2,500/license) or ATLAS.ti (~€1,100/year) as the primary analysis tool, supplemented by their newly added AI-powered auto-coding features. These platforms offer the deepest qualitative analysis capabilities including mixed-methods research, advanced coding frameworks, and publication-quality outputs.

  • COST: Significantly higher software cost ($1,350-$2,500 per license vs. $15-$20/month for modern alternatives). Desktop-based licensing is less flexible than SaaS.
  • COMPLEXITY: High learning curve — these are academic-grade research tools that require substantial training (budget 10-20 hours per user).
  • CAPABILITY: Most powerful analysis features available; handles complex mixed-methods research that simpler tools cannot; AI features are newer and less mature than purpose-built AI platforms; strong export and visualization capabilities.
  • RECOMMENDATION: Only for professional services firms doing rigorous academic-style research (policy consulting, evaluation firms, market research) where analytical depth and methodological rigor are more important than speed and cost. Overkill for typical strategy consulting interview synthesis.

Build-from-Scratch Custom Application

Develop a fully custom web application using the OpenAI Whisper API for transcription, GPT-5.4 for analysis, a React/Next.js frontend for the analysis interface, and a PostgreSQL database for storing all interview data and analysis results. This gives maximum customization and can be white-labeled by the MSP for multiple clients.

Warning

COST: Highest upfront investment ($15,000-$40,000 in development) but potentially lowest per-client marginal cost once built.

Warning

COMPLEXITY: Very high — requires a full-stack developer, 8-16 weeks of development, and ongoing maintenance.

Note

CAPABILITY: Maximum flexibility — can implement any analysis methodology, any output format, any integration. Can be white-labeled and resold across the MSP's client base as a proprietary product.

Critical

RECOMMENDATION: Only viable if the MSP has development resources and plans to deploy this across 5+ professional services clients. The ROI threshold is approximately 5 clients at $500/month to recover development costs within 12 months. Not recommended for a single-client engagement.
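The ROI threshold above follows from simple break-even arithmetic; a sketch, assuming revenue from this product is applied entirely to recovering the build cost:

```python
def months_to_breakeven(dev_cost, clients, monthly_fee):
    """Months of product revenue needed to recover the upfront
    development cost (ignores hosting and maintenance overhead)."""
    return dev_cost / (clients * monthly_fee)
```

At a mid-range $30,000 build cost, 5 clients paying $500/month break even in 12 months; at the $40,000 high end the same client base needs 16 months.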
