
Implementation Guide: Synthesize stakeholder interview data into themes and findings for deliverables

Step-by-step implementation guide for deploying AI that synthesizes stakeholder interview data into themes and findings for client deliverables, aimed at Professional Services firms.

Hardware Procurement

Jabra Speak2 75 Wireless Speakerphone

Vendor: Jabra (GN Audio) | SKU: 2775-419 (USB-A variant) / 2775-429 (USB-C variant) | Qty: 5

$175 per unit MSP cost / $250 suggested resale

Primary interview recording device for conference rooms and in-person stakeholder interviews. Features 4 noise-cancelling beamforming microphones and super-wideband audio optimized for clear voice capture. Connects via USB or Bluetooth to laptops running transcription software. Critical for ensuring high-quality audio input to AI transcription pipeline.

Jabra Speak 510 Portable Speakerphone

Vendor: Jabra (GN Audio) | SKU: 7510-209 (USB-A) | Qty: 2

$95 per unit MSP cost / $140 suggested resale

Portable backup and field interview recording device. Compact form factor for consultants conducting on-site stakeholder interviews at client facilities. USB/Bluetooth connectivity for laptop-based recording.

RØDE NT-USB Mini USB Condenser Microphone

Vendor: RØDE Microphones | SKU: NTUSB-MINI | Qty: 2

$80 per unit MSP cost / $115 suggested resale

High-quality desk-based microphone for dedicated interview recording setups in the office. Provides broadcast-quality audio for solo interviewer recording when a speakerphone pickup pattern is not needed. Ideal for phone/VoIP interview recording scenarios.

Software Procurement

Otter.ai Business

Vendor: Otter.ai | Plan: Business | Qty: 10 seats

$20/user/month billed annually ($240/user/year). 10 seats = $200/month or $2,400/year

Primary transcription and meeting intelligence platform. Provides automatic transcription of Zoom and Teams meetings with speaker identification, search, and export. 6,000 minutes/month per user. Admin features including usage analytics. Feeds raw transcripts to downstream analysis pipeline.

Dovetail Professional

Vendor: Dovetail | Plan: Professional | Qty: 5 researcher seats

$15/user/month — $75/month or $900/year

Cloud-native qualitative data analysis platform. Provides AI-driven transcription tagging, automated summarization, collaborative theme coding, and insight repositories. Researchers upload transcripts, tag themes, and generate structured findings. Serves as the central analysis workspace between raw transcripts and final deliverables.

OpenAI API (GPT-5.4)

Vendor: OpenAI | Model: GPT-5.4

$2.50 per 1M input tokens, $10 per 1M output tokens. Estimated $25/month for typical consulting firm workload (~2M tokens/month)

Powers the custom thematic synthesis pipeline. Processes cleaned transcripts through structured prompts to extract themes, generate cross-interview findings, identify contradictions, and draft deliverable sections. Used via API for programmatic, repeatable analysis at scale.

OpenAI Whisper API

Vendor: OpenAI | Model: Whisper

$0.006/minute of audio. 40 hours of interviews = $14.40 per project

Backup and high-accuracy transcription engine for audio files not captured through Otter.ai (e.g., in-person recordings, phone interviews). Provides raw transcription output that feeds into the synthesis pipeline. Supports 98+ languages.
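The API costs quoted above can be sanity-checked with a small helper. The rates are hardcoded from the pricing listed in this section ($0.006/min for Whisper, $2.50 per 1M input and $10 per 1M output tokens for GPT-5.4); the function name and token mix are illustrative:

```python
def project_ai_cost(audio_minutes: float, input_tokens: int, output_tokens: int) -> float:
    """Rough per-project API spend in USD: Whisper transcription at
    $0.006/minute plus GPT token costs at the published per-1M rates."""
    whisper = audio_minutes * 0.006
    gpt = (input_tokens / 1_000_000) * 2.50 + (output_tokens / 1_000_000) * 10.00
    return round(whisper + gpt, 2)

# 40 hours of audio plus an assumed ~2M input / 0.4M output synthesis tokens:
# project_ai_cost(40 * 60, 2_000_000, 400_000) -> 23.4
```

This matches the $14.40 transcription figure above for a 40-hour project, with synthesis tokens adding single-digit dollars on top.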

Microsoft 365 Business Standard

Vendor: Microsoft | Licensing: SaaS per-seat annual subscription (CSP) | Qty: 10 seats

$12.50/user/month via CSP. 10 seats = $125/month or $1,500/year

Foundation platform providing Microsoft Teams (for remote interview recording and built-in transcription), SharePoint (document storage and transcript repository), Word and PowerPoint (deliverable creation), and OneDrive (individual file storage). Assumed pre-existing at most professional services firms.

Microsoft 365 Copilot

Vendor: Microsoft | Licensing: SaaS per-seat monthly add-on (CSP) | Qty: 5 seats (senior consultants/partners)

$30/user/month add-on = $150/month or $1,800/year

AI assistant embedded in Word and PowerPoint for drafting final client deliverables from synthesized themes and findings. Copilot ingests structured synthesis outputs and generates draft report sections, executive summaries, and presentation slides. Significant time savings on deliverable production.

Notion Business

Vendor: Notion Labs | Licensing: SaaS per-seat annual subscription | Qty: 5 seats

$20/user/month for Business plan with AI. 5 seats = $100/month or $1,200/year. Optional — can substitute Confluence or SharePoint.

Optional research knowledge management wiki. Stores interview repositories organized by engagement, maintains prompt templates and analysis playbooks, and serves as a collaborative workspace for building findings before final deliverable creation. Includes Notion AI for additional summarization.

Zapier Professional

Vendor: Zapier | Plan: Professional | Qty: 1

$49/month (2,000 tasks/month)

Integration orchestration layer connecting Otter.ai, Dovetail, SharePoint, and the custom synthesis pipeline. Automates transcript routing, triggers synthesis workflows, and pushes notifications to Slack/Teams when analysis is complete.

Prerequisites

  • Microsoft 365 Business Standard (or higher) tenant fully provisioned with Teams, SharePoint, and OneDrive configured for all users
  • Active Microsoft CSP relationship with the MSP for license management and Copilot provisioning
  • Zoom Business ($199.90/user/year) OR Microsoft Teams (included in M365) deployed and configured for meeting recording — confirm which platform the client uses for remote interviews
  • Minimum 25 Mbps internet upload bandwidth at primary office location; 10 Mbps minimum at any remote interview sites
  • Outbound HTTPS (port 443) allowed through firewall to: otter.ai, dovetail.com, api.openai.com, login.microsoftonline.com, zapier.com, notion.so
  • WPA3 or WPA2-Enterprise Wi-Fi at interview recording locations for reliable connectivity
  • Modern web browsers (Chrome 120+, Edge 120+, Firefox 120+) on all workstations
  • Standard business laptops with minimum 8GB RAM, USB-A or USB-C ports for recording hardware, and functional audio drivers
  • Client has signed engagement letter or MSA that permits use of AI tools for data processing — critical for compliance
  • OpenAI Platform account created with billing configured and API key generated (MSP should create under client's organization or MSP's managed tenant)
  • Python 3.10+ runtime available on at least one workstation or Azure VM for running the custom synthesis pipeline scripts
  • Client has existing templates (Word/PowerPoint) for their standard deliverable formats — needed for Copilot template configuration
  • Identified project lead / power user within the client organization who will own the interview analysis workflow
  • Written interview consent form template approved by client's legal counsel that discloses AI processing of interview data

Installation Steps

Step 1: Provision and Configure Otter.ai Business Workspace

Create the Otter.ai Business workspace for the client organization. This is the primary transcription engine that will automatically capture and transcribe all stakeholder interviews conducted via Zoom or Teams. Configure SSO if the client uses Azure AD, set up team structure, and enable key integrations.

1. Navigate to https://otter.ai/signup and create a Business workspace.
2. Under Admin > Settings > Authentication, configure SSO: Identity Provider: Azure AD | SAML SSO URL: https://login.microsoftonline.com/{tenant-id}/saml2 | Certificate: upload from the Azure AD Enterprise Application.
3. Under Admin > Settings > Integrations: enable the Zoom integration (OAuth connection to the client's Zoom account), the Microsoft Teams integration, and Google Calendar sync (if applicable).
4. Under Admin > Settings > Security: enable the 2FA requirement for all users, set data retention to 90 days (configurable per compliance needs), and review and accept the DPA at https://otter.ai/dpa.
5. Invite all 10 users via Admin > Team > Invite Members.
Note

If the client uses Microsoft Teams exclusively (no Zoom), consider whether Otter.ai is needed or if Teams built-in transcription + Whisper API for higher accuracy is sufficient. Otter.ai adds value with its search, highlight, and summary features beyond raw transcription. Ensure the DPA is signed before any interview data flows through the platform. For HIPAA-covered clients, confirm Otter.ai Business tier includes BAA — if not, use Enterprise tier or switch to Azure OpenAI Whisper.

Step 2: Deploy Recording Hardware

Unbox, configure, and deploy Jabra Speak2 75 speakerphones and RØDE NT-USB Mini microphones. Test audio quality with each device connected to the client's laptops. Ensure firmware is updated and devices are recognized by Otter.ai and Zoom/Teams.

Jabra Speak2 75 setup:

1. Connect the Jabra Speak2 75 via USB-A/USB-C to the laptop.
2. Download Jabra Direct from https://www.jabra.com/software-and-services/jabra-direct
3. Install Jabra Direct and run the firmware update.
4. In Zoom/Teams, set the Jabra Speak2 75 as the default microphone and speaker.
5. Test recording: start a test Otter.ai recording, speak from a 6 ft distance, and verify transcript accuracy.

Run the firmware update for the Jabra Speak2 75 via the Jabra Direct CLI:

```bash
jabra-direct --check-firmware --update
```

RØDE NT-USB Mini setup:

1. Connect the RØDE NT-USB Mini via USB-C to the laptop.
2. Download RØDE Central from https://www.rode.com/software/rode-central
3. Update the firmware via RØDE Central.
4. Set it as the input device in the OS audio settings.
5. In the Otter.ai desktop app, select the RØDE NT-USB Mini as the microphone source.
Note

Label each device with an asset tag for the client's inventory. Position Jabra Speak2 75 units centrally on conference tables — optimal pickup range is 6-8 feet. The RØDE NT-USB Mini should be positioned 6-12 inches from the interviewer for best results. Test each device in the actual rooms where interviews will be conducted, as room acoustics significantly impact transcription accuracy.

Step 3: Configure Dovetail Professional Workspace

Set up the Dovetail qualitative analysis workspace where researchers will organize, tag, and synthesize interview transcripts. Create project templates, configure tag taxonomies, and set up team permissions. Dovetail serves as the central analysis hub between raw transcripts and final deliverables.

1. Navigate to https://dovetail.com and create a new workspace.
2. Workspace Settings > Team: invite 5 researcher seats.
3. Configure SSO via SAML (Azure AD integration): go to Settings > Security > SAML SSO. Entity ID: https://dovetail.com/saml/{workspace-id}; ACS URL: provided by Dovetail; configure in Azure AD as an Enterprise Application.
4. Create a project template:
   - Project name: [Client Name] - [Engagement Name]
   - Tag taxonomy: Themes (auto-generated by AI, refined by researcher); Sentiment (Positive / Neutral / Negative / Mixed); Stakeholder Type (Executive / Manager / Staff / External); Priority (High / Medium / Low); Finding Type (Pain Point / Opportunity / Strength / Risk)
5. Create a 'Master Tag Set' under workspace settings for reuse across projects.
6. Enable AI features: Settings > AI > enable Summaries and Auto-tagging.
7. Configure data export: enable CSV and PDF export for all projects.
Note

Dovetail's AI auto-tagging works best when you seed the tag taxonomy with 10-15 initial tags relevant to the client's typical engagement themes (e.g., 'Process Inefficiency', 'Technology Gap', 'Change Management', 'Stakeholder Alignment'). These will be refined per project. If Dovetail is not available or the client prefers an alternative, Looppanel ($30/mo) or Insight7 (custom pricing) are viable substitutes.
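The seed taxonomy described above can be kept in version control and rendered to a simple CSV for import or manual entry. The group and tag names below mirror this step; the two-column CSV layout is illustrative, not a Dovetail API schema:

```python
import csv
import io

# Seed tag taxonomy from Step 3 (illustrative; refine per engagement).
SEED_TAGS = {
    "Themes": ["Process Inefficiency", "Technology Gap",
               "Change Management", "Stakeholder Alignment"],
    "Sentiment": ["Positive", "Neutral", "Negative", "Mixed"],
    "Stakeholder Type": ["Executive", "Manager", "Staff", "External"],
    "Priority": ["High", "Medium", "Low"],
    "Finding Type": ["Pain Point", "Opportunity", "Strength", "Risk"],
}

def taxonomy_csv(tags: dict) -> str:
    """Render the tag taxonomy as a two-column CSV (group, tag)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["group", "tag"])
    for group, names in tags.items():
        for name in names:
            writer.writerow([group, name])
    return buf.getvalue()
```

Keeping the taxonomy in a file like this makes it easy to reuse the same 'Master Tag Set' across projects and to diff changes over time.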

Step 4: Set Up OpenAI API Access and Custom Synthesis Pipeline

Configure the OpenAI API account, set up the custom Python-based synthesis pipeline that processes transcripts through GPT-5.4 for deep thematic extraction, and deploy the pipeline on an Azure VM or local workstation. This is the core intelligence engine that goes beyond Dovetail's built-in AI to provide consulting-grade thematic analysis.

1. Create an OpenAI API organization (or use an existing one): navigate to https://platform.openai.com/settings/organization
2. Create an API key: Settings > API Keys > Create new secret key. Name: interview-synthesis-prod. Store it securely in Azure Key Vault or the client's password manager.
3. Set usage limits: Settings > Limits > set the monthly budget cap to $100 (adjustable).
4. Install the Python environment on the synthesis workstation or Azure VM.
5. Clone the synthesis pipeline repository (MSP-maintained): create the pipeline files as specified in the custom_ai_components section.
6. Configure environment variables.
7. Test API connectivity.
Install a Python 3.11 virtual environment and the required packages:

```bash
sudo apt update && sudo apt install python3.11 python3.11-venv -y
python3.11 -m venv ~/interview-synthesis-env
source ~/interview-synthesis-env/bin/activate
pip install openai==1.40.0 tiktoken==0.7.0 python-docx==1.1.0 pandas==2.2.0 pyyaml==6.0.1 python-pptx==0.6.23
```

Create the synthesis pipeline directory:

```bash
mkdir -p ~/interview-synthesis
cd ~/interview-synthesis
```

Write environment variables to a .env file:

```bash
cat > .env << 'EOF'
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL=gpt-5.4
OPENAI_ORG_ID=org-xxxxxxxxxxxxxxxxxxxx
MAX_TOKENS_PER_REQUEST=16000
TEMPERATURE=0.3
OUTPUT_DIR=./outputs
TRANSCRIPT_DIR=./transcripts
EOF
```

Test OpenAI API connectivity:

```bash
python3 -c "from openai import OpenAI; client = OpenAI(); print(client.models.list().data[0].id)"
```
Note

For enterprise clients requiring data residency guarantees, use Azure OpenAI Service instead of OpenAI directly. The Azure OpenAI endpoint would replace the standard API URL and data stays within the client's Azure tenant. API keys should NEVER be committed to version control. Use .env files with .gitignore or Azure Key Vault for production deployments. Set a conservative monthly budget cap initially — typical usage for a 10-person firm is $15–$30/month.
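Note that the pip install list in this step does not include python-dotenv, so something has to load the .env file before the pipeline reads `os.getenv`. A minimal stdlib loader is enough; this sketch handles KEY=VALUE lines and '#' comments but ignores quoting and `export` syntax:

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments skipped.
    Existing environment variables are not overwritten."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Alternatively, install python-dotenv and call `load_dotenv()`; the hand-rolled version just avoids one more dependency.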

Step 5: Deploy Custom Synthesis Pipeline Scripts

Create and deploy the core Python scripts that implement the multi-stage interview synthesis pipeline. These scripts process transcripts through a series of GPT-5.4 prompts to extract individual interview summaries, cross-interview themes, supporting evidence, contradictions, and structured findings for deliverables. See the custom_ai_components section for full source code.

Set up directory structure, scaffold pipeline files, and run a test execution

1. Change to the pipeline directory: `cd ~/interview-synthesis`
2. Create the directory structure: `mkdir -p transcripts outputs templates config`
3. Create the main pipeline files (content provided in custom_ai_components):
   - config/prompts.yaml (prompt templates)
   - synthesis_pipeline.py (main orchestration script)
   - transcript_processor.py (individual transcript analysis)
   - theme_synthesizer.py (cross-interview theme extraction)
   - findings_generator.py (structured findings output)
   - deliverable_drafter.py (Word/PPT draft generation)
   - utils.py (helper functions)
4. Make the pipeline executable: `chmod +x synthesis_pipeline.py`
5. Run a test with a sample transcript: `python3 synthesis_pipeline.py --input transcripts/sample_interview.txt --project 'Test Project' --output outputs/test_run/`
Note

The pipeline is designed to be run by a consultant or analyst after interviews are completed and transcripts exported from Otter.ai. It can also be triggered automatically via Zapier when new transcripts appear in a designated SharePoint folder. Processing time for a typical 20-interview project is 5-15 minutes. Always review AI-generated themes and findings before including in client deliverables — human validation is essential for consulting quality.
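The test command in this step implies a small argument surface for `synthesis_pipeline.py`. A sketch of what that entry point might look like, with flag names taken from the command above and the wiring to the pipeline modules omitted:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI surface matching the Step 5 test command (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="synthesis_pipeline.py",
        description="Run the interview synthesis pipeline on a transcript or folder.")
    parser.add_argument("--input", required=True,
                        help="Transcript file or directory of transcripts")
    parser.add_argument("--project", required=True,
                        help="Project name used to label outputs")
    parser.add_argument("--output", default="outputs/",
                        help="Directory for synthesis outputs")
    return parser
```

Keeping the CLI thin like this also makes it easy to wrap the same logic in a web endpoint later for the Zapier-triggered workflow.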

Step 6: Configure Microsoft 365 Copilot for Deliverable Drafting

Provision Microsoft 365 Copilot licenses for the 5 senior consultants/partners who will produce client deliverables. Configure Copilot to access the SharePoint document library where synthesized findings are stored, and create custom Copilot prompts for generating report sections and presentation slides from the synthesis outputs.

1. Provision Copilot licenses via the Microsoft 365 Admin Center: navigate to https://admin.microsoft.com > Billing > Purchase Services and add the 'Microsoft 365 Copilot' add-on for 5 users. Assign licenses to designated users under Users > Active Users > [User] > Licenses.
2. Configure a SharePoint document library for synthesis outputs: navigate to SharePoint Admin Center > Sites > Create Site. Site name: 'Interview Intelligence Hub'. Create document libraries: /Transcripts (raw transcripts from Otter.ai), /Synthesis Outputs (JSON/Markdown from the synthesis pipeline), /Deliverable Drafts (Word/PPT outputs), /Templates (standard deliverable templates).
3. Configure Copilot access to the SharePoint site: in Microsoft 365 Admin Center > Settings > Copilot, ensure the 'Interview Intelligence Hub' site is indexed by Microsoft Search. Verify Copilot can access the site with a test query in Word.
4. Upload the client's standard deliverable templates to /Templates: Assessment Report Template.docx, Strategy Recommendations Template.pptx, Interview Summary Template.docx.
5. Create a Copilot prompt library in SharePoint: upload prompt reference documents that Copilot can use as context.
Note

Copilot works best when synthesis outputs are stored as structured Markdown or Word documents in SharePoint, not as raw JSON. The deliverable_drafter.py component of the custom pipeline generates Word documents specifically formatted for Copilot consumption. Copilot requires Microsoft 365 E3/E5 or Business Standard/Premium as a prerequisite — verify the client's base license tier before purchasing the Copilot add-on.

Step 7: Set Up Integration Automation with Zapier

Configure Zapier workflows to automate the flow of data between Otter.ai, SharePoint, the synthesis pipeline, Dovetail, and notification channels. This reduces manual steps and ensures transcripts flow automatically from recording to analysis.

1. Navigate to https://zapier.com and create a Professional account.
2. Create the following Zaps:

ZAP 1: Otter.ai → SharePoint (New Transcript → Upload)

  • Trigger: Otter.ai - New Transcript
  • Action: SharePoint - Upload File to /Transcripts library
  • Filter: Only trigger for recordings tagged 'stakeholder-interview'

ZAP 2: SharePoint → Webhook (New File → Trigger Synthesis)

  • Trigger: SharePoint - New File in /Transcripts
  • Action: Webhooks - POST to synthesis pipeline endpoint
  • URL: https://{azure-vm-ip}:8443/api/synthesize
ZAP 2 — Webhook POST body to the synthesis pipeline:

```json
{"file_path": "{{file_url}}", "project": "{{folder_name}}"}
```

ZAP 3: Synthesis Complete → Teams Notification

  • Trigger: Webhooks - Catch Hook (synthesis pipeline calls back)
  • Action: Microsoft Teams - Send Channel Message
  • Channel: #interview-insights
  • Message: 'Synthesis complete for {{project_name}}. {{theme_count}} themes identified. View results: {{output_url}}'

ZAP 4: Synthesis Output → Dovetail (Upload for Collaborative Review)

  • Trigger: SharePoint - New File in /Synthesis Outputs
  • Action: Dovetail - Create Note (via API)
  • Map synthesis JSON fields to Dovetail note fields
Note

Zapier Professional plan ($49/month) provides 2,000 tasks/month which is sufficient for most consulting firms processing 5-15 interviews per month. For higher volume or if the client prefers Microsoft-native automation, replace Zapier with Power Automate (included in M365) — the flows are conceptually identical but use Power Automate connectors instead. The webhook-based synthesis trigger requires the synthesis pipeline to be running as a web service (Flask/FastAPI) on the Azure VM rather than as a CLI script.
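As the note says, ZAP 2's webhook needs a small web service in front of the pipeline. The sketch below is a stdlib stand-in that shows the expected contract only; a production deployment would use Flask or FastAPI behind TLS as described above. The endpoint path matches the Zap configuration, and the queued-response shape is an assumption:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class SynthesisWebhook(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != '/api/synthesize':
            self.send_error(404)
            return
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length) or b'{}')
        # A real service would enqueue a pipeline run here; this stand-in
        # just acknowledges what would be queued.
        body = json.dumps({'queued': True,
                           'file_path': payload.get('file_path'),
                           'project': payload.get('project')}).encode()
        self.send_response(202)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo server quiet

def run_server(port: int = 0) -> HTTPServer:
    """Start the webhook listener on a background thread (port 0 = ephemeral)."""
    server = HTTPServer(('127.0.0.1', port), SynthesisWebhook)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Returning 202 Accepted (rather than blocking until synthesis completes) keeps the Zapier task fast; the pipeline then calls back to ZAP 3's catch hook when it finishes.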

Step 8: Configure Compliance and Data Governance Controls

Implement the compliance framework required for processing stakeholder interview data through AI systems. This includes consent management, data processing agreements, retention policies, access controls, and audit logging. Professional services firms have strict obligations around client confidentiality and data handling.

1. Configure SharePoint retention policies: Microsoft 365 Compliance Center > Data lifecycle management > Retention policies. Create the policy 'Interview Data Retention':
   - Applies to: Interview Intelligence Hub SharePoint site
   - Retain for: 90 days after project completion (configurable)
   - Then: delete automatically
   - Label: 'Stakeholder Interview Data - Confidential'
2. Configure a Data Loss Prevention (DLP) policy: Microsoft 365 Compliance Center > Data loss prevention. Create the policy 'Interview Data Protection':
   - Detect: PII (SSN, financial data, health information) in transcripts
   - Action: alert the compliance officer and block external sharing
3. Configure Otter.ai data retention: Otter.ai Admin > Settings > Data Retention. Set auto-delete of recordings and transcripts after 90 days, and enable the audit log for all access and exports.
4. Configure OpenAI API data handling: verify at https://platform.openai.com/settings/organization that 'API data is not used for training' is shown. Review the OpenAI DPA at https://openai.com/policies/data-processing-addendum and sign it if not already executed.
5. Generate and store the consent form template in SharePoint /Templates (template provided in the custom_ai_components section).
6. Configure Azure AD Conditional Access (if not already in place): require MFA for all access to the Interview Intelligence Hub SharePoint site and block access from non-compliant devices.
Critical

The consent form must be signed by every interview participant BEFORE recording begins. Train all consultants on this requirement. For clients in healthcare consulting, verify HIPAA compliance — Otter.ai Business does NOT include BAA by default; you may need to upgrade to Enterprise or switch entirely to Azure OpenAI Whisper + Azure-hosted pipeline. For EU stakeholders, GDPR requires explicit consent for AI processing and the right to erasure. Implement a documented process for handling deletion requests that covers all systems (Otter, Dovetail, SharePoint, OpenAI logs).

Step 9: Create Deliverable Templates and Copilot Prompt Library

Build the Word and PowerPoint templates that the synthesis pipeline and Copilot will populate with findings. These templates should match the client's existing branding and deliverable formats. Create a library of Copilot prompts that consultants can use to generate specific deliverable sections.

1. Obtain the client's existing deliverable templates (Word/PPT) and upload them to SharePoint: /Interview Intelligence Hub/Templates/
2. Create structured sections in the Word template for synthesis output: Executive Summary (auto-populated from findings_generator.py), Methodology & Approach, Key Themes (auto-populated, with supporting quotes), Detailed Findings by Theme, Stakeholder Alignment Analysis, Recommendations (human-drafted with Copilot assistance), Appendix: Interview Participant List.
3. Create a Copilot prompt reference document and save it as 'Copilot Prompts for Interview Synthesis.docx' in /Templates. Include prompts such as: 'Using the synthesis output in [file], draft an executive summary of key themes'; 'Create a findings section for the theme [X] with supporting evidence quotes'; 'Generate a stakeholder alignment matrix based on the synthesis data'; 'Draft 3-5 strategic recommendations based on the identified pain points'.
4. Create a PowerPoint template with pre-built slide layouts: Title Slide, Methodology Slide, Theme Overview Slide (auto-populated), Individual Theme Deep-Dive Slides, Stakeholder Quote Highlight Slides, Recommendations Summary Slide.
Note

Templates are the bridge between AI-generated analysis and client-ready deliverables. Invest time in getting these right — poorly structured templates negate much of the time savings from AI synthesis. The deliverable_drafter.py script in the custom pipeline generates a pre-populated Word document, but Copilot is used for the final 'polish pass' and for generating sections that require more creative interpretation (e.g., recommendations). Always include a human review step before any deliverable goes to a client.
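Since Copilot works best on structured Markdown or Word input, the section list from this step can be emitted as a Markdown skeleton for the /Synthesis Outputs library. The section titles come from this step; the function itself is an illustrative sketch of what deliverable_drafter.py might produce:

```python
# Report sections from Step 9 (Word template structure).
SECTIONS = [
    "Executive Summary",
    "Methodology & Approach",
    "Key Themes",
    "Detailed Findings by Theme",
    "Stakeholder Alignment Analysis",
    "Recommendations",
    "Appendix: Interview Participant List",
]

def report_skeleton(project: str, sections=SECTIONS) -> str:
    """Emit a Markdown skeleton with one H2 per deliverable section."""
    lines = [f"# {project}: Interview Synthesis", ""]
    for section in sections:
        lines += [f"## {section}", "", "_To be populated._", ""]
    return "\n".join(lines)
```

Auto-populated sections (Executive Summary, Key Themes) would be filled in by the pipeline; the placeholder sections are left for the consultant and Copilot.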

Step 10: End-User Training and Workflow Validation

Conduct hands-on training sessions with the client's consulting team covering the complete interview-to-deliverable workflow. Validate the entire pipeline end-to-end with a real or realistic test case. Ensure all users understand their role in the workflow and can operate the tools independently.

  • Training Session 1: Recording & Transcription (1 hour, all 10 users)
      • Hardware setup: Jabra Speak2 75 positioning and connection
      • Otter.ai: starting recordings, tagging interviews, exporting transcripts
      • Consent workflow: when and how to obtain participant consent
      • Teams/Zoom recording settings for remote interviews
  • Training Session 2: Analysis & Synthesis (2 hours, 5 researchers)
      • Dovetail: uploading transcripts, using AI auto-tagging, manual refinement
      • Running the synthesis pipeline: CLI usage and output interpretation
      • Reviewing AI-generated themes: what to accept, modify, or reject
      • Cross-interview analysis: identifying patterns and contradictions
  • Training Session 3: Deliverable Creation (1.5 hours, 5 senior consultants)
      • Using synthesis outputs in Word/PowerPoint templates
      • Copilot prompts for draft generation
      • Quality assurance: reviewing AI-drafted content for accuracy
      • Finalizing and formatting deliverables
End-to-End Validation Exercise:

1. Record a mock stakeholder interview (15 minutes)
2. Transcribe via Otter.ai
3. Export to SharePoint
4. Run the synthesis pipeline
5. Review themes in Dovetail
6. Generate a deliverable draft
7. Polish with Copilot

Total exercise time: ~45 minutes (vs. 4+ hours manual)
Note

Schedule training sessions across 1 week, not all in one day. The most common failure mode is consultants reverting to manual processes because they don't trust the AI output — address this by showing side-by-side comparisons of AI vs. manual synthesis quality during training. Create a quick-reference card (1-page PDF) summarizing the workflow steps for each role. Record all training sessions for future onboarding of new team members.

Custom AI Components

Interview Transcript Processor

Type: prompt

A structured GPT-5.4 prompt that processes individual interview transcripts to extract per-interview summaries, key quotes, sentiment indicators, and preliminary theme tags. This is Stage 1 of the synthesis pipeline: it runs once per transcript and produces a structured JSON output that feeds into the cross-interview Theme Synthesizer.

Implementation

config/prompts.yaml — transcript_analysis section

```yaml
system_prompt: |
  You are an expert qualitative research analyst working for a professional
  services consulting firm. Your task is to analyze a stakeholder interview
  transcript and extract structured insights. You must:
  1. Identify the interviewee's role, perspective, and key concerns
  2. Extract direct quotes that are particularly insightful or representative
  3. Identify preliminary themes discussed in the interview
  4. Assess sentiment (positive, negative, neutral, mixed) for each topic discussed
  5. Note any specific pain points, opportunities, or recommendations mentioned
  6. Flag any contradictions or tensions within the interviewee's responses
  Output your analysis as valid JSON matching the schema below. Do not include
  any text outside the JSON.

user_prompt_template: |
  Analyze the following stakeholder interview transcript.

  **Project Context:** {project_name}
  **Project Description:** {project_description}
  **Interviewee:** {interviewee_name}
  **Interviewee Role:** {interviewee_role}
  **Interview Date:** {interview_date}
  **Interviewer:** {interviewer_name}

  **Transcript:**
  {transcript_text}

  Produce a JSON analysis with the following structure:
  {{
    "interviewee_summary": {{
      "name": "string",
      "role": "string",
      "overall_sentiment": "positive|negative|neutral|mixed",
      "key_perspective": "2-3 sentence summary of this person's overall perspective",
      "engagement_level": "high|medium|low"
    }},
    "themes_identified": [
      {{
        "theme_name": "string (concise theme label)",
        "description": "string (1-2 sentence description)",
        "sentiment": "positive|negative|neutral|mixed",
        "confidence": "high|medium|low",
        "supporting_quotes": [
          {{"quote": "exact quote from transcript", "context": "brief context for the quote"}}
        ]
      }}
    ],
    "pain_points": [
      {{"description": "string", "severity": "high|medium|low", "quote": "supporting quote"}}
    ],
    "opportunities": [
      {{"description": "string", "potential_impact": "high|medium|low", "quote": "supporting quote"}}
    ],
    "recommendations_from_interviewee": [
      {{"recommendation": "string", "quote": "supporting quote"}}
    ],
    "contradictions_or_tensions": [
      {{"description": "string", "quotes": ["quote 1", "quote 2"]}}
    ],
    "notable_quotes": [
      {{"quote": "exact quote", "significance": "why this quote is notable", "potential_use": "executive summary|theme evidence|callout box|appendix"}}
    ]
  }}
```
transcript_processor.py — TranscriptProcessor class:

```python
import json
import os
import yaml
from openai import OpenAI
import tiktoken

class TranscriptProcessor:
    def __init__(self, config_path='config/prompts.yaml'):
        self.client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
        self.model = os.getenv('OPENAI_MODEL', 'gpt-5.4')
        try:
            self.encoding = tiktoken.encoding_for_model(self.model)
        except KeyError:
            # Model not in tiktoken's registry; fall back to a current
            # encoding so token counts stay approximately correct.
            self.encoding = tiktoken.get_encoding('o200k_base')
        with open(config_path, 'r') as f:
            self.prompts = yaml.safe_load(f)

    def count_tokens(self, text: str) -> int:
        return len(self.encoding.encode(text))

    def chunk_transcript(self, transcript: str, max_tokens: int = 12000) -> list:
        """Split long transcripts into overlapping chunks for processing."""
        paragraphs = transcript.split('\n\n')
        chunks = []
        current_chunk = []
        current_tokens = 0
        for para in paragraphs:
            para_tokens = self.count_tokens(para)
            if current_tokens + para_tokens > max_tokens and current_chunk:
                chunks.append('\n\n'.join(current_chunk))
                # Keep the last 2 paragraphs for overlap
                current_chunk = current_chunk[-2:]
                current_tokens = sum(self.count_tokens(p) for p in current_chunk)
            current_chunk.append(para)
            current_tokens += para_tokens
        if current_chunk:
            chunks.append('\n\n'.join(current_chunk))
        return chunks

    def process_transcript(self, transcript_text: str, metadata: dict) -> dict:
        """Process a single interview transcript and return structured analysis."""
        system_prompt = self.prompts['system_prompt']
        user_prompt = self.prompts['user_prompt_template'].format(
            project_name=metadata.get('project_name', 'Unknown Project'),
            project_description=metadata.get('project_description', ''),
            interviewee_name=metadata.get('interviewee_name', 'Unknown'),
            interviewee_role=metadata.get('interviewee_role', 'Unknown'),
            interview_date=metadata.get('interview_date', 'Unknown'),
            interviewer_name=metadata.get('interviewer_name', 'Unknown'),
            transcript_text=transcript_text
        )

        token_count = self.count_tokens(system_prompt + user_prompt)
        if token_count > 120000:  # GPT-5.4 context limit safety margin
            return self._process_chunked(transcript_text, metadata)

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {'role': 'system', 'content': system_prompt},
                {'role': 'user', 'content': user_prompt}
            ],
            temperature=0.3,
            response_format={'type': 'json_object'},
            max_tokens=int(os.getenv('MAX_TOKENS_PER_REQUEST', 16000))
        )

        result = json.loads(response.choices[0].message.content)
        result['_metadata'] = {
            'model': self.model,
            'tokens_used': response.usage.total_tokens,
            'input_tokens': response.usage.prompt_tokens,
            'output_tokens': response.usage.completion_tokens,
            'cost_estimate': self._estimate_cost(response.usage)
        }
        return result

    def _process_chunked(self, transcript_text: str, metadata: dict) -> dict:
        """Process very long transcripts by chunking and merging results."""
        chunks = self.chunk_transcript(transcript_text)
        chunk_results = []
        for i, chunk in enumerate(chunks):
            metadata_copy = metadata.copy()
            metadata_copy['chunk_info'] = f'Part {i+1} of {len(chunks)}'
            result = self.process_transcript(chunk, metadata_copy)
            chunk_results.append(result)
        return self._merge_chunk_results(chunk_results, metadata)

    def _merge_chunk_results(self, results: list, metadata: dict) -> dict:
        """Merge multiple chunk analyses into a single coherent result."""
        merged = {
            'interviewee_summary': results[0].get('interviewee_summary', {}),
            'themes_identified': [],
            'pain_points': [],
            'opportunities': [],
            'recommendations_from_interviewee': [],
            'contradictions_or_tensions': [],
            'notable_quotes': []
        }
        seen_themes = set()
        for r in results:
            for theme in r.get('themes_identified', []):
                if theme['theme_name'] not in seen_themes:
                    merged['themes_identified'].append(theme)
                    seen_themes.add(theme['theme_name'])
            merged['pain_points'].extend(r.get('pain_points', []))
            merged['opportunities'].extend(r.get('opportunities', []))
            merged['recommendations_from_interviewee'].extend(r.get('recommendations_from_interviewee', []))
            merged['contradictions_or_tensions'].extend(r.get('contradictions_or_tensions', []))
            merged['notable_quotes'].extend(r.get('notable_quotes', []))
        return merged

    def _estimate_cost(self, usage) -> str:
        """Estimate request cost in USD at $2.50/1M input, $10/1M output tokens."""
        input_cost = (usage.prompt_tokens / 1_000_000) * 2.50
        output_cost = (usage.completion_tokens / 1_000_000) * 10.00
        return f"${input_cost + output_cost:.4f}"
```
        output_cost = (usage.completion_tokens / 1_000_000) * 10.00
        return f'${input_cost + output_cost:.4f}'

if __name__ == '__main__':
    import sys
    from pathlib import Path
    processor = TranscriptProcessor()
    transcript_file = sys.argv[1]
    with open(transcript_file, 'r') as f:
        text = f.read()
    metadata = {
        'project_name': sys.argv[2] if len(sys.argv) > 2 else 'Test',
        'interviewee_name': Path(transcript_file).stem
    }
    result = processor.process_transcript(text, metadata)
    print(json.dumps(result, indent=2))

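One caveat with `_merge_chunk_results`: it deduplicates on the exact theme name, so near-duplicates produced by different chunks ("Data Silos" vs. "data silos") slip through. A sketch of a more forgiving merge, written as a standalone helper (not part of the listing above), normalizes the key first:

```python
def dedupe_themes(chunk_theme_lists):
    """Merge theme lists from chunk analyses, keeping the first occurrence
    of each theme name, compared case- and whitespace-insensitively."""
    merged, seen = [], set()
    for themes in chunk_theme_lists:
        for theme in themes:
            key = theme['theme_name'].strip().casefold()
            if key not in seen:
                seen.add(key)
                merged.append(theme)
    return merged

chunk_a = [{'theme_name': 'Data Silos'}, {'theme_name': 'Change Fatigue'}]
chunk_b = [{'theme_name': 'data silos '}, {'theme_name': 'Tooling Gaps'}]
print([t['theme_name'] for t in dedupe_themes([chunk_a, chunk_b])])
# → ['Data Silos', 'Change Fatigue', 'Tooling Gaps']
```

The same normalization could be applied inside `_merge_chunk_results` by keying `seen_themes` on the normalized name.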
Cross-Interview Theme Synthesizer

Type: prompt

Stage 2 of the pipeline. Takes the structured outputs from multiple individual transcript analyses (from Stage 1) and synthesizes them into consolidated cross-cutting themes with frequency analysis, sentiment patterns, stakeholder alignment mapping, and evidence triangulation. This is the core intelligence component that produces consulting-grade thematic findings.

Implementation:

Prompt Template (config/prompts.yaml - theme_synthesis section)

theme_synthesis_system_prompt: |
  You are a senior consulting analyst synthesizing findings from multiple
  stakeholder interviews. Your task is to identify cross-cutting themes,
  patterns of agreement and disagreement, and generate strategic insights
  that will form the basis of a client deliverable.

  Principles:
  - Themes should be substantive and actionable, not generic
  - Always ground themes in evidence (specific quotes from specific stakeholders)
  - Identify where stakeholders agree AND where they diverge
  - Note the strength of each theme (how many stakeholders mentioned it, how prominently)
  - Distinguish between symptoms and root causes
  - Provide enough context for a consultant to draft deliverable content directly from your output

theme_synthesis_user_prompt: |
  You are synthesizing {interview_count} stakeholder interviews for the project: {project_name}.

  Project Description: {project_description}

  Here are the individual interview analyses:

  {interview_analyses_json}

  Synthesize these into a comprehensive thematic analysis.

  Output valid JSON:
  {{
    "synthesis_metadata": {{
      "project_name": "string",
      "total_interviews": integer,
      "stakeholder_roles": ["list of unique roles"],
      "synthesis_date": "string"
    }},
    "executive_summary": "3-4 paragraph executive summary of key findings suitable for a client deliverable",
    "major_themes": [
      {{
        "theme_id": "T1",
        "theme_name": "Concise theme label",
        "theme_description": "2-3 sentence description of this theme",
        "strength": "strong|moderate|emerging",
        "frequency": {{
          "stakeholders_mentioning": integer,
          "total_stakeholders": integer,
          "percentage": float
        }},
        "sentiment_distribution": {{"positive": integer, "negative": integer, "neutral": integer, "mixed": integer}},
        "stakeholder_perspectives": [
          {{
            "stakeholder_name": "string",
            "role": "string",
            "perspective": "1-2 sentence summary",
            "key_quote": "exact quote"
          }}
        ],
        "sub_themes": [{{"name": "string", "description": "string"}}],
        "implications": "What this theme means for the client/project",
        "recommended_actions": ["list of potential actions"]
      }}
    ],
    "alignment_analysis": {{
      "areas_of_consensus": [
        {{"topic": "string", "description": "What stakeholders broadly agree on", "stakeholders": ["names"]}}
      ],
      "areas_of_divergence": [
        {{
          "topic": "string",
          "perspective_a": {{"position": "string", "stakeholders": ["names"], "quotes": ["supporting quotes"]}},
          "perspective_b": {{"position": "string", "stakeholders": ["names"], "quotes": ["supporting quotes"]}},
          "significance": "Why this divergence matters"
        }}
      ]
    }},
    "cross_cutting_insights": [
      {{
        "insight": "A strategic insight not directly stated but emerging from patterns across interviews",
        "evidence": "What patterns support this insight",
        "confidence": "high|medium|low"
      }}
    ],
    "priority_findings": [
      {{
        "finding_id": "F1",
        "finding": "Clear, concise finding statement",
        "supporting_themes": ["T1", "T3"],
        "evidence_strength": "strong|moderate|limited",
        "urgency": "immediate|short-term|long-term",
        "impact": "high|medium|low"
      }}
    ],
    "recommended_deliverable_structure": {{
      "suggested_sections": ["Ordered list of sections for the final deliverable"],
      "key_messages": ["3-5 key messages for the executive audience"]
    }}
  }}
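Because downstream stages assume this JSON shape, it is worth checking the model's response before Stage 3 consumes it. A minimal shape check is sketched below (key names are taken from the schema above; production code might prefer a `jsonschema` validator instead):

```python
REQUIRED_TOP_LEVEL = {
    'synthesis_metadata', 'executive_summary', 'major_themes',
    'alignment_analysis', 'cross_cutting_insights', 'priority_findings',
    'recommended_deliverable_structure',
}

def validate_synthesis(data: dict) -> list:
    """Return a list of problems; an empty list means the shape looks usable."""
    problems = [f'missing key: {k}' for k in sorted(REQUIRED_TOP_LEVEL - data.keys())]
    for i, theme in enumerate(data.get('major_themes', [])):
        for k in ('theme_id', 'theme_name', 'theme_description'):
            if k not in theme:
                problems.append(f'major_themes[{i}] missing {k}')
    return problems

print(validate_synthesis({'major_themes': [{'theme_id': 'T1'}]}))
```

Run this on the parsed response and fail fast (or retry the API call) if the problem list is non-empty, rather than letting a malformed synthesis reach the deliverable generator.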
Python implementation (theme_synthesizer.py)
python
import json
import os
import yaml
from openai import OpenAI
from datetime import datetime

class ThemeSynthesizer:
    def __init__(self, config_path='config/prompts.yaml'):
        self.client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
        self.model = os.getenv('OPENAI_MODEL', 'gpt-5.4')
        with open(config_path, 'r') as f:
            self.prompts = yaml.safe_load(f)

    def synthesize_themes(self, interview_analyses: list, project_metadata: dict) -> dict:
        """Synthesize themes across multiple interview analyses."""
        # Prepare condensed version of analyses to fit in context window
        condensed = self._condense_analyses(interview_analyses)

        system_prompt = self.prompts['theme_synthesis_system_prompt']
        user_prompt = self.prompts['theme_synthesis_user_prompt'].format(
            interview_count=len(interview_analyses),
            project_name=project_metadata.get('project_name', 'Unknown'),
            project_description=project_metadata.get('project_description', ''),
            interview_analyses_json=json.dumps(condensed, indent=2)
        )

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {'role': 'system', 'content': system_prompt},
                {'role': 'user', 'content': user_prompt}
            ],
            temperature=0.3,
            response_format={'type': 'json_object'},
            max_tokens=16000
        )

        result = json.loads(response.choices[0].message.content)
        result['_processing_metadata'] = {
            'model': self.model,
            'total_tokens': response.usage.total_tokens,
            'cost_estimate': self._estimate_cost(response.usage),
            'processed_at': datetime.now().isoformat(),
            'interview_count': len(interview_analyses)
        }
        return result

    def _condense_analyses(self, analyses: list) -> list:
        """Remove verbose fields to fit more interviews in context."""
        condensed = []
        for a in analyses:
            c = {
                'interviewee': a.get('interviewee_summary', {}),
                'themes': [{
                    'name': t['theme_name'],
                    'description': t.get('description', ''),
                    'sentiment': t.get('sentiment', ''),
                    'quotes': [q['quote'] for q in t.get('supporting_quotes', [])[:3]]
                } for t in a.get('themes_identified', [])],
                'pain_points': a.get('pain_points', []),
                'opportunities': a.get('opportunities', []),
                'recommendations': a.get('recommendations_from_interviewee', []),
                'contradictions': a.get('contradictions_or_tensions', []),
                'top_quotes': [q['quote'] for q in a.get('notable_quotes', [])[:5]]
            }
            condensed.append(c)
        return condensed

    def _estimate_cost(self, usage) -> str:
        input_cost = (usage.prompt_tokens / 1_000_000) * 2.50
        output_cost = (usage.completion_tokens / 1_000_000) * 10.00
        return f'${input_cost + output_cost:.4f}'

    def generate_theme_summary_markdown(self, synthesis: dict) -> str:
        """Generate a human-readable Markdown summary from the synthesis."""
        md = f"# Thematic Analysis: {synthesis.get('synthesis_metadata', {}).get('project_name', 'Project')}\n\n"
        md += f"## Executive Summary\n\n{synthesis.get('executive_summary', '')}\n\n"
        md += f"## Major Themes\n\n"
        for theme in synthesis.get('major_themes', []):
            freq = theme.get('frequency', {})
            md += f"### {theme['theme_id']}: {theme['theme_name']}\n\n"
            md += f"{theme.get('theme_description', '')}\n\n"
            md += f"**Strength:** {theme.get('strength', 'N/A')} | "
            md += f"**Mentioned by:** {freq.get('stakeholders_mentioning', '?')}/{freq.get('total_stakeholders', '?')} stakeholders\n\n"
            for sp in theme.get('stakeholder_perspectives', []):
                md += f'- **{sp["stakeholder_name"]}** ({sp["role"]}): "{sp["key_quote"]}"\n'
            md += f"\n**Implications:** {theme.get('implications', '')}\n\n"
        md += f"## Areas of Consensus\n\n"
        for area in synthesis.get('alignment_analysis', {}).get('areas_of_consensus', []):
            md += f"- **{area['topic']}**: {area['description']}\n"
        md += f"\n## Areas of Divergence\n\n"
        for div in synthesis.get('alignment_analysis', {}).get('areas_of_divergence', []):
            md += f"### {div['topic']}\n\n{div.get('significance', '')}\n\n"
        md += f"## Priority Findings\n\n"
        for f in synthesis.get('priority_findings', []):
            md += f"- **[{f['finding_id']}]** {f['finding']} (Impact: {f.get('impact', 'N/A')}, Urgency: {f.get('urgency', 'N/A')})\n"
        return md

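Language models are unreliable at arithmetic, so the frequency percentages they report are worth recomputing deterministically before they reach a deliverable. A standalone sketch (field names follow the synthesis schema above):

```python
def recompute_frequencies(synthesis: dict, total_stakeholders: int) -> dict:
    """Overwrite model-reported frequency figures with exact arithmetic."""
    for theme in synthesis.get('major_themes', []):
        freq = theme.setdefault('frequency', {})
        mentioned = freq.get('stakeholders_mentioning', 0)
        freq['total_stakeholders'] = total_stakeholders
        freq['percentage'] = (
            round(100.0 * mentioned / total_stakeholders, 1)
            if total_stakeholders else 0.0
        )
    return synthesis

s = {'major_themes': [{'frequency': {'stakeholders_mentioning': 7, 'percentage': 58.0}}]}
print(recompute_frequencies(s, 12)['major_themes'][0]['frequency']['percentage'])
# → 58.3
```

Calling this right after `synthesize_themes` (passing `len(interview_analyses)` as the total) keeps the counts in the Word and PowerPoint outputs honest.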
Deliverable Draft Generator

Type: workflow

Stage 3 of the pipeline. Takes the synthesized themes from Stage 2 and generates pre-populated Word and PowerPoint documents using the client's branded templates. These drafts serve as the starting point for consultant review and refinement, and can also be further polished using Microsoft 365 Copilot.

Implementation:

deliverable_drafter.py
python
# Python implementation (deliverable_drafter.py)

import json
import os
from docx import Document
from docx.shared import Inches, Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
from pptx import Presentation
from pptx.util import Inches as PptxInches, Pt as PptxPt
from datetime import datetime

class DeliverableDrafter:
    def __init__(self, template_dir='templates'):
        self.template_dir = template_dir

    def generate_word_report(self, synthesis: dict, output_path: str, template_path: str = None):
        """Generate a Word document from synthesis results."""
        if template_path and os.path.exists(template_path):
            doc = Document(template_path)
        else:
            doc = Document()

        # Title
        doc.add_heading('Stakeholder Interview Findings', level=0)
        subtitle = doc.add_paragraph()
        subtitle.text = f"{synthesis.get('synthesis_metadata', {}).get('project_name', 'Project')}\n"
        subtitle.text += f"Date: {datetime.now().strftime('%B %d, %Y')}\n"
        subtitle.text += f"Interviews Conducted: {synthesis.get('synthesis_metadata', {}).get('total_interviews', 'N/A')}"
        subtitle.style = doc.styles['Subtitle']

        # Executive Summary
        doc.add_heading('Executive Summary', level=1)
        exec_summary = synthesis.get('executive_summary', '')
        for para in exec_summary.split('\n\n'):
            doc.add_paragraph(para.strip())

        # Key Messages Box
        rec_structure = synthesis.get('recommended_deliverable_structure', {})
        key_messages = rec_structure.get('key_messages', [])
        if key_messages:
            doc.add_heading('Key Messages', level=2)
            for msg in key_messages:
                p = doc.add_paragraph(msg, style='List Bullet')

        # Major Themes
        doc.add_heading('Major Themes', level=1)
        for theme in synthesis.get('major_themes', []):
            doc.add_heading(f"{theme['theme_id']}: {theme['theme_name']}", level=2)
            doc.add_paragraph(theme.get('theme_description', ''))

            # Frequency info
            freq = theme.get('frequency', {})
            freq_text = f"Mentioned by {freq.get('stakeholders_mentioning', '?')} of {freq.get('total_stakeholders', '?')} stakeholders ({freq.get('percentage', 0):.0f}%). Theme strength: {theme.get('strength', 'N/A').upper()}"
            p = doc.add_paragraph(freq_text)
            p.runs[0].italic = True

            # Stakeholder perspectives with quotes
            if theme.get('stakeholder_perspectives'):
                doc.add_heading('Stakeholder Perspectives', level=3)
                for sp in theme['stakeholder_perspectives']:
                    p = doc.add_paragraph()
                    runner = p.add_run(f"{sp['stakeholder_name']} ({sp['role']}): ")
                    runner.bold = True
                    p.add_run(sp.get('perspective', ''))
                    if sp.get('key_quote'):
                        quote_para = doc.add_paragraph()
                        quote_run = quote_para.add_run(f'"{sp["key_quote"]}"')
                        quote_run.italic = True
                        quote_para.paragraph_format.left_indent = Inches(0.5)

            # Implications
            if theme.get('implications'):
                doc.add_heading('Implications', level=3)
                doc.add_paragraph(theme['implications'])

            # Recommended Actions
            if theme.get('recommended_actions'):
                doc.add_heading('Recommended Actions', level=3)
                for action in theme['recommended_actions']:
                    doc.add_paragraph(action, style='List Bullet')

        # Alignment Analysis
        alignment = synthesis.get('alignment_analysis', {})
        doc.add_heading('Stakeholder Alignment Analysis', level=1)

        if alignment.get('areas_of_consensus'):
            doc.add_heading('Areas of Consensus', level=2)
            for area in alignment['areas_of_consensus']:
                doc.add_paragraph(f"{area['topic']}: {area['description']}")

        if alignment.get('areas_of_divergence'):
            doc.add_heading('Areas of Divergence', level=2)
            for div in alignment['areas_of_divergence']:
                doc.add_heading(div['topic'], level=3)
                doc.add_paragraph(div.get('significance', ''))
                pa = div.get('perspective_a', {})
                pb = div.get('perspective_b', {})
                if pa:
                    p = doc.add_paragraph()
                    p.add_run('View A: ').bold = True
                    p.add_run(f"{pa.get('position', '')} (Held by: {', '.join(pa.get('stakeholders', []))})")
                if pb:
                    p = doc.add_paragraph()
                    p.add_run('View B: ').bold = True
                    p.add_run(f"{pb.get('position', '')} (Held by: {', '.join(pb.get('stakeholders', []))})")

        # Priority Findings
        doc.add_heading('Priority Findings', level=1)
        for finding in synthesis.get('priority_findings', []):
            p = doc.add_paragraph()
            p.add_run(f"[{finding['finding_id']}] ").bold = True
            p.add_run(finding['finding'])
            detail = f"Evidence: {finding.get('evidence_strength', 'N/A')} | Impact: {finding.get('impact', 'N/A')} | Urgency: {finding.get('urgency', 'N/A')}"
            det_p = doc.add_paragraph(detail)
            det_p.runs[0].italic = True

        # Cross-cutting Insights
        if synthesis.get('cross_cutting_insights'):
            doc.add_heading('Cross-Cutting Insights', level=1)
            for insight in synthesis['cross_cutting_insights']:
                doc.add_paragraph(insight['insight'])
                if insight.get('evidence'):
                    ev = doc.add_paragraph(f"Evidence: {insight['evidence']}")
                    ev.runs[0].italic = True

        # Save
        doc.save(output_path)
        return output_path

    def generate_pptx_summary(self, synthesis: dict, output_path: str, template_path: str = None):
        """Generate a PowerPoint summary from synthesis results."""
        if template_path and os.path.exists(template_path):
            prs = Presentation(template_path)
        else:
            prs = Presentation()

        # Title slide
        slide = prs.slides.add_slide(prs.slide_layouts[0])
        slide.shapes.title.text = 'Stakeholder Interview Findings'
        slide.placeholders[1].text = f"{synthesis.get('synthesis_metadata', {}).get('project_name', '')}\n{datetime.now().strftime('%B %Y')}"

        # Executive Summary slide
        slide = prs.slides.add_slide(prs.slide_layouts[1])
        slide.shapes.title.text = 'Executive Summary'
        exec_text = synthesis.get('executive_summary', '')
        # Truncate for slide
        if len(exec_text) > 800:
            exec_text = exec_text[:797] + '...'
        slide.placeholders[1].text = exec_text

        # Key Messages slide
        key_messages = synthesis.get('recommended_deliverable_structure', {}).get('key_messages', [])
        if key_messages:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = 'Key Messages'
            body = '\n'.join([f'• {msg}' for msg in key_messages])
            slide.placeholders[1].text = body

        # Theme overview slide
        themes = synthesis.get('major_themes', [])
        if themes:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = f'Themes Overview ({len(themes)} Themes Identified)'
            theme_list = '\n'.join([f"• {t['theme_id']}: {t['theme_name']} ({t.get('strength', 'N/A')})" for t in themes])
            slide.placeholders[1].text = theme_list

        # Individual theme slides
        for theme in themes[:8]:  # Limit to 8 theme slides
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = f"{theme['theme_id']}: {theme['theme_name']}"
            freq = theme.get('frequency', {})
            content = f"{theme.get('theme_description', '')}\n\n"
            content += f"Mentioned by {freq.get('stakeholders_mentioning', '?')}/{freq.get('total_stakeholders', '?')} stakeholders\n\n"
            perspectives = theme.get('stakeholder_perspectives', [])[:3]
            for sp in perspectives:
                content += f'• "{sp.get("key_quote", "")}" — {sp["stakeholder_name"]}\n'
            if theme.get('implications'):
                content += f"\nImplications: {theme['implications']}"
            slide.placeholders[1].text = content

        # Priority Findings slide
        findings = synthesis.get('priority_findings', [])
        if findings:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = 'Priority Findings'
            findings_text = '\n'.join([f"• [{f['finding_id']}] {f['finding']}" for f in findings[:6]])
            slide.placeholders[1].text = findings_text

        prs.save(output_path)
        return output_path

if __name__ == '__main__':
    import sys
    synthesis_file = sys.argv[1]
    with open(synthesis_file, 'r') as f:
        synthesis = json.load(f)
    drafter = DeliverableDrafter()
    drafter.generate_word_report(synthesis, 'outputs/findings_report.docx')
    drafter.generate_pptx_summary(synthesis, 'outputs/findings_summary.pptx')
    print('Deliverables generated successfully.')

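One small refinement worth considering: the slide generator truncates the executive summary at a fixed character count, which can cut mid-word. A word-boundary variant (a drop-in alternative, not part of the listing above):

```python
def truncate_for_slide(text: str, limit: int = 800) -> str:
    """Truncate at the last word boundary before `limit`, appending an ellipsis."""
    if len(text) <= limit:
        return text
    cut = text[:limit - 3]
    if ' ' in cut:
        cut = cut[:cut.rfind(' ')]
    return cut + '...'

print(truncate_for_slide('alpha beta gamma', 10))  # → alpha...
print(truncate_for_slide('short', 10))             # → short
```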
Main Synthesis Pipeline Orchestrator

Type: workflow

The master orchestration script that ties together all three stages: (1) process individual transcripts, (2) synthesize cross-interview themes, and (3) generate deliverable drafts. It provides a CLI for consultants and can also be exposed as a REST API for Zapier integration.

Implementation:

synthesis_pipeline.py
python
#!/usr/bin/env python3
# synthesis_pipeline.py - Main orchestrator for interview synthesis pipeline

import json
import os
import sys
import argparse
import glob
from pathlib import Path
from datetime import datetime
from transcript_processor import TranscriptProcessor
from theme_synthesizer import ThemeSynthesizer
from deliverable_drafter import DeliverableDrafter

def load_project_config(config_path: str) -> dict:
    """Load project configuration from JSON file."""
    with open(config_path, 'r') as f:
        return json.load(f)

def discover_transcripts(transcript_dir: str) -> list:
    """Find all transcript files in the specified directory."""
    extensions = ['*.txt', '*.md', '*.vtt', '*.srt']
    files = []
    for ext in extensions:
        files.extend(glob.glob(os.path.join(transcript_dir, ext)))
    return sorted(files)

def parse_transcript_metadata(filepath: str) -> dict:
    """Extract metadata from transcript filename convention.
    Expected format: YYYY-MM-DD_IntervieweeName_Role.txt
    Falls back to filename as interviewee name if convention not followed."""
    stem = Path(filepath).stem
    parts = stem.split('_')
    if len(parts) >= 3:
        return {
            'interview_date': parts[0],
            'interviewee_name': parts[1].replace('-', ' '),
            'interviewee_role': ' '.join(parts[2:]).replace('-', ' ')
        }
    return {
        'interview_date': 'Unknown',
        'interviewee_name': stem.replace('-', ' ').replace('_', ' '),
        'interviewee_role': 'Unknown'
    }

def main():
    parser = argparse.ArgumentParser(description='AI Interview Synthesis Pipeline')
    parser.add_argument('--input', '-i', required=True, help='Directory of transcript files OR single transcript file')
    parser.add_argument('--project', '-p', required=True, help='Project name')
    parser.add_argument('--description', '-d', default='', help='Project description for context')
    parser.add_argument('--output', '-o', default='outputs', help='Output directory')
    parser.add_argument('--config', '-c', default=None, help='Path to project config JSON')
    parser.add_argument('--word-template', default=None, help='Path to Word template .docx')
    parser.add_argument('--pptx-template', default=None, help='Path to PowerPoint template .pptx')
    parser.add_argument('--skip-transcripts', action='store_true', help='Skip Stage 1 (use existing analyses)')
    parser.add_argument('--skip-deliverables', action='store_true', help='Skip Stage 3 (no deliverable generation)')
    args = parser.parse_args()

    # Setup output directory
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    output_dir = os.path.join(args.output, f"{args.project.replace(' ', '_')}_{timestamp}")
    os.makedirs(output_dir, exist_ok=True)
    os.makedirs(os.path.join(output_dir, 'individual_analyses'), exist_ok=True)

    project_metadata = {
        'project_name': args.project,
        'project_description': args.description,
        'processed_at': datetime.now().isoformat()
    }

    if args.config:
        project_metadata.update(load_project_config(args.config))

    print(f'\n=== Interview Synthesis Pipeline ===')
    print(f'Project: {args.project}')
    print(f'Output: {output_dir}')

    # Stage 1: Process Individual Transcripts
    individual_analyses = []
    if not args.skip_transcripts:
        print(f'\n--- Stage 1: Individual Transcript Analysis ---')
        processor = TranscriptProcessor()

        if os.path.isdir(args.input):
            transcript_files = discover_transcripts(args.input)
        else:
            transcript_files = [args.input]

        print(f'Found {len(transcript_files)} transcript(s)')

        for i, filepath in enumerate(transcript_files, 1):
            print(f'  Processing [{i}/{len(transcript_files)}]: {Path(filepath).name}')
            with open(filepath, 'r', encoding='utf-8') as f:
                transcript_text = f.read()

            metadata = parse_transcript_metadata(filepath)
            metadata.update(project_metadata)

            analysis = processor.process_transcript(transcript_text, metadata)
            individual_analyses.append(analysis)

            # Save individual analysis
            analysis_path = os.path.join(output_dir, 'individual_analyses', f'{Path(filepath).stem}_analysis.json')
            with open(analysis_path, 'w') as f:
                json.dump(analysis, f, indent=2)
            print(f'    -> Saved: {analysis_path}')
            if analysis.get('_metadata', {}).get('cost_estimate'):
                print(f'    -> Cost: {analysis["_metadata"]["cost_estimate"]}')
    else:
        print('\n--- Stage 1: SKIPPED (loading existing analyses) ---')
        # With --skip-transcripts, point --input at a directory of previously
        # saved *_analysis.json files (e.g. a prior run's individual_analyses
        # folder); the freshly created output_dir cannot contain them yet.
        for fpath in sorted(glob.glob(os.path.join(args.input, '*_analysis.json'))):
            with open(fpath, 'r') as fh:
                individual_analyses.append(json.load(fh))
        print(f'Loaded {len(individual_analyses)} existing analyses')

    if not individual_analyses:
        print('ERROR: No transcript analyses found. Exiting.')
        sys.exit(1)

    # Stage 2: Cross-Interview Theme Synthesis
    print(f'\n--- Stage 2: Cross-Interview Theme Synthesis ---')
    print(f'Synthesizing {len(individual_analyses)} interviews...')
    synthesizer = ThemeSynthesizer()
    synthesis = synthesizer.synthesize_themes(individual_analyses, project_metadata)

    synthesis_path = os.path.join(output_dir, 'theme_synthesis.json')
    with open(synthesis_path, 'w') as f:
        json.dump(synthesis, f, indent=2)
    print(f'  -> Saved: {synthesis_path}')

    # Save markdown version
    md_path = os.path.join(output_dir, 'theme_synthesis.md')
    md_content = synthesizer.generate_theme_summary_markdown(synthesis)
    with open(md_path, 'w') as f:
        f.write(md_content)
    print(f'  -> Markdown: {md_path}')

    theme_count = len(synthesis.get('major_themes', []))
    finding_count = len(synthesis.get('priority_findings', []))
    print(f'  -> {theme_count} major themes, {finding_count} priority findings identified')

    # Stage 3: Generate Deliverable Drafts
    if not args.skip_deliverables:
        print(f'\n--- Stage 3: Deliverable Draft Generation ---')
        drafter = DeliverableDrafter()

        word_path = os.path.join(output_dir, f"{args.project.replace(' ', '_')}_Findings_Report.docx")
        drafter.generate_word_report(synthesis, word_path, args.word_template)
        print(f'  -> Word Report: {word_path}')

        pptx_path = os.path.join(output_dir, f"{args.project.replace(' ', '_')}_Findings_Summary.pptx")
        drafter.generate_pptx_summary(synthesis, pptx_path, args.pptx_template)
        print(f'  -> PowerPoint Summary: {pptx_path}')

    # Summary
    print(f'\n=== Pipeline Complete ===')
    print(f'All outputs saved to: {output_dir}')
    total_cost = synthesis.get('_processing_metadata', {}).get('cost_estimate', 'N/A')
    print(f'Estimated API cost for synthesis stage: {total_cost}')
    print(f'Review the theme_synthesis.md file for a human-readable summary.')
    print(f'Open the Word/PPTX files and refine with Microsoft 365 Copilot for final polish.')

if __name__ == '__main__':
    main()

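The filename convention handled by `parse_transcript_metadata` is easy to get wrong when naming files, so a worked example helps. The following is a standalone reimplementation of the same parse, for illustration only:

```python
from pathlib import Path

def parse_stem(stem: str) -> dict:
    # Mirrors the pipeline convention: YYYY-MM-DD_IntervieweeName_Role
    parts = stem.split('_')
    if len(parts) >= 3:
        return {
            'interview_date': parts[0],
            'interviewee_name': parts[1].replace('-', ' '),
            'interviewee_role': ' '.join(parts[2:]).replace('-', ' '),
        }
    return {
        'interview_date': 'Unknown',
        'interviewee_name': stem.replace('-', ' ').replace('_', ' '),
        'interviewee_role': 'Unknown',
    }

print(parse_stem(Path('2025-03-14_Jane-Doe_VP-Operations.txt').stem))
# → {'interview_date': '2025-03-14', 'interviewee_name': 'Jane Doe', 'interviewee_role': 'VP Operations'}
```

Note that hyphens inside the name and role segments become spaces, so `Jane-Doe` must be written with a hyphen, not a space, in the filename.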
Interview Consent Form Template

Type: prompt

A standardized consent form template that must be signed by all interview participants before recording begins. This is a critical compliance component for the solution. The MSP should customize this with the client's specific branding and legal language.

Implementation:

1. Save as: templates/interview_consent_form.md
2. Convert to a branded Word document using the client's template
  • Project: [PROJECT NAME]
  • Client: [CLIENT ORGANIZATION]
  • Conducted by: [CONSULTING FIRM NAME]
  • Date: [DATE]

Purpose

You have been invited to participate in a stakeholder interview as part of [PROJECT NAME]. The purpose of this interview is to gather your perspectives, experiences, and insights on [TOPIC AREA]. Your input will help inform [DELIVERABLE TYPE, e.g., 'strategic recommendations', 'assessment findings', 'transformation roadmap'].

Recording & Transcription

With your permission, this interview will be:

  • Audio recorded for accuracy purposes
  • Automatically transcribed using AI-powered transcription software (Otter.ai / Microsoft Teams)
  • Analyzed using AI tools (including large language models) to identify themes and patterns across all stakeholder interviews

How Your Data Will Be Used

  • Your responses will be aggregated with other stakeholder interviews to identify common themes
  • Direct quotes may be used in deliverables but will be attributed by role only (e.g., 'Senior Manager') unless you grant explicit permission for name attribution
  • AI tools will process your transcript to extract themes; no AI system will make decisions based on your individual responses
  • Raw transcripts and recordings will be deleted within [90] days of project completion

Confidentiality

  • All data is stored on encrypted, SOC 2-certified cloud platforms
  • Access is restricted to the project team at [CONSULTING FIRM NAME]
  • Your data will not be sold, shared with third parties, or used for any purpose beyond this project
  • [CONSULTING FIRM NAME] has executed Data Processing Agreements with all technology vendors

Your Rights

  • You may decline to answer any question
  • You may withdraw your consent at any time by contacting [CONTACT EMAIL]
  • You may request deletion of your interview data at any time
  • You may opt out of AI processing while still participating in the interview (manual analysis only)

By signing below, I confirm that:

  • [ ] I have read and understood the information above
  • [ ] I consent to the audio recording and AI-assisted transcription of this interview
  • [ ] I consent to AI-assisted analysis of my transcript as described above
  • [ ] I understand that role-attributed quotes may appear in project deliverables

Optional:

  • [ ] I grant permission for quotes to be attributed to me by name

Participant Name: ________________________________

Participant Signature: ________________________________

Date: ________________________________

Interviewer Name: ________________________________

Note

This form should be reviewed and approved by your organization's legal counsel before use. Modify retention periods, attribution policies, and data handling procedures to match your specific regulatory requirements (GDPR, CCPA, HIPAA, etc.).

Zapier-to-Pipeline Webhook API

Type: integration

A lightweight Flask API that receives webhook calls from Zapier when new transcripts appear in SharePoint, triggers the synthesis pipeline, and sends a callback when processing is complete. It runs on the same VM or workstation as the synthesis pipeline.

Implementation:

webhook_api.py
python
# webhook_api.py - Flask API for Zapier integration
# Run with: python webhook_api.py (or deploy via gunicorn)

import hmac
import os
import subprocess
import threading
from datetime import datetime

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

# Configuration
PIPELINE_SCRIPT = os.getenv('PIPELINE_SCRIPT', './synthesis_pipeline.py')
TRANSCRIPT_DIR = os.getenv('TRANSCRIPT_DIR', './transcripts')
OUTPUT_DIR = os.getenv('OUTPUT_DIR', './outputs')
CALLBACK_URL = os.getenv('ZAPIER_CALLBACK_URL', '')  # Zapier webhook catch URL
API_KEY = os.getenv('WEBHOOK_API_KEY', 'change-me-to-a-secure-key')

def verify_api_key(req):
    """Timing-safe API key check (header or query parameter)."""
    key = req.headers.get('X-API-Key') or req.args.get('api_key') or ''
    return hmac.compare_digest(key, API_KEY)

def run_pipeline_async(project_name, transcript_dir, description=''):
    """Run the synthesis pipeline in a background thread."""
    def _run():
        try:
            cmd = [
                'python3', PIPELINE_SCRIPT,
                '--input', transcript_dir,
                '--project', project_name,
                '--description', description,
                '--output', OUTPUT_DIR
            ]
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)

            # Send callback to Zapier
            if CALLBACK_URL:
                payload = {
                    'project_name': project_name,
                    'status': 'success' if result.returncode == 0 else 'error',
                    'output': result.stdout[-500:] if result.stdout else '',
                    'error': result.stderr[-500:] if result.stderr else '',
                    'completed_at': datetime.now().isoformat()
                }
                requests.post(CALLBACK_URL, json=payload, timeout=30)
        except Exception as e:
            if CALLBACK_URL:
                requests.post(CALLBACK_URL, json={
                    'project_name': project_name,
                    'status': 'error',
                    'error': str(e),
                    'completed_at': datetime.now().isoformat()
                }, timeout=30)

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()

@app.route('/api/health', methods=['GET'])
def health():
    return jsonify({'status': 'healthy', 'timestamp': datetime.now().isoformat()})

@app.route('/api/synthesize', methods=['POST'])
def synthesize():
    if not verify_api_key(request):
        return jsonify({'error': 'Unauthorized'}), 401

    data = request.get_json(silent=True) or {}  # tolerate missing or non-JSON bodies
    project_name = data.get('project', 'Unknown Project')
    transcript_dir = data.get('transcript_dir', TRANSCRIPT_DIR)
    description = data.get('description', '')

    if not os.path.exists(transcript_dir):
        return jsonify({'error': f'Transcript directory not found: {transcript_dir}'}), 400

    run_pipeline_async(project_name, transcript_dir, description)

    return jsonify({
        'status': 'processing',
        'project': project_name,
        'message': 'Pipeline started. Callback will be sent to Zapier when complete.',
        'started_at': datetime.now().isoformat()
    }), 202

@app.route('/api/status', methods=['GET'])
def status():
    if not verify_api_key(request):
        return jsonify({'error': 'Unauthorized'}), 401
    # List recent output directories
    outputs = []
    if os.path.exists(OUTPUT_DIR):
        for d in sorted(os.listdir(OUTPUT_DIR), reverse=True)[:10]:
            dir_path = os.path.join(OUTPUT_DIR, d)
            if os.path.isdir(dir_path):
                has_synthesis = os.path.exists(os.path.join(dir_path, 'theme_synthesis.json'))
                outputs.append({'project': d, 'complete': has_synthesis})
    return jsonify({'recent_runs': outputs})

if __name__ == '__main__':
    # For production, run via: gunicorn -w 2 -b 0.0.0.0:8443 webhook_api:app
    port = int(os.getenv('PORT', 8443))
    app.run(host='0.0.0.0', port=port, debug=False)
Install required Python dependencies.
bash
pip install flask==3.0.0 gunicorn==22.0.0 requests==2.32.3

Testing & Validation

  • AUDIO QUALITY TEST: Record a 5-minute test conversation using each Jabra Speak2 75 in the actual conference rooms where interviews will be conducted. Play back the recording and verify all speakers are audible, no significant background noise, and speech is clear. Transcribe with Otter.ai and verify >95% word accuracy rate.
  • OTTER.AI TRANSCRIPTION ACCURACY TEST: Conduct a 15-minute mock stakeholder interview with two participants. Record via Otter.ai. After transcription completes, manually compare 3 random 2-minute segments against the recording. Verify speaker identification is correct (speakers labeled distinctly) and overall word error rate is below 10%.
  • OTTER.AI TO SHAREPOINT INTEGRATION TEST: Record a test meeting tagged 'stakeholder-interview' in Otter.ai. Verify the Zapier zap triggers and the transcript file appears in the SharePoint /Transcripts library within 5 minutes. Confirm the file is readable and contains the full transcript text.
  • TRANSCRIPT PROCESSOR (STAGE 1) TEST: Run transcript_processor.py against a sample 45-minute interview transcript. Verify the output JSON contains: (a) valid interviewee_summary with correct role and name, (b) at least 3 themes_identified with supporting quotes that actually appear in the transcript, (c) at least 1 pain_point and 1 opportunity, (d) all quotes in the output are verbatim from the source transcript (no hallucinated quotes).
  • THEME SYNTHESIZER (STAGE 2) TEST: Process 5+ sample interview transcripts through Stage 1, then run theme_synthesizer.py. Verify: (a) major_themes are distinct and non-overlapping, (b) frequency counts are mathematically correct (stakeholders_mentioning ≤ total_stakeholders), (c) alignment_analysis identifies at least one area of consensus and one area of divergence, (d) executive_summary is coherent and references actual themes, (e) total processing time is under 3 minutes.
  • DELIVERABLE DRAFT GENERATION TEST: Run deliverable_drafter.py with synthesis output. Open the generated .docx file and verify: (a) document opens without errors in Word, (b) all sections (Executive Summary, Themes, Alignment Analysis, Priority Findings) are populated, (c) quotes in the document match source transcripts, (d) formatting is clean with proper heading hierarchy. Open the .pptx file and verify all slides render correctly.
  • END-TO-END PIPELINE TEST: Run the full synthesis_pipeline.py with --input pointing to a directory of 3-5 sample transcripts. Verify all three stages complete without errors, output directory contains individual_analyses/*.json, theme_synthesis.json, theme_synthesis.md, and Word/PPTX files. Total execution time should be under 10 minutes for 5 interviews.
  • MICROSOFT 365 COPILOT INTEGRATION TEST: Open the generated Word deliverable in Microsoft Word with Copilot enabled. Test these Copilot prompts: (a) 'Summarize the key themes in this document' — verify response references actual themes, (b) 'Rewrite the executive summary for a C-suite audience' — verify output is appropriate, (c) 'Create a table comparing stakeholder perspectives on [Theme 1]' — verify table is generated with relevant data.
  • ZAPIER WEBHOOK INTEGRATION TEST: Send a POST request to the webhook API endpoint with a test payload. Verify: (a) API returns 202 status, (b) pipeline begins processing, (c) Zapier callback webhook receives completion notification with correct project name and status. Test with invalid API key and verify 401 rejection.
  • DATA RETENTION AND COMPLIANCE TEST: Verify that SharePoint retention policy is applied to the Interview Intelligence Hub site. Upload a test document, confirm it is labeled correctly. Verify Otter.ai admin panel shows correct retention settings. Verify OpenAI API organization settings confirm data is not used for training. Test a deletion request workflow end-to-end: request deletion of a specific interview's data and confirm removal from Otter.ai, SharePoint, and Dovetail within 48 hours.
  • CONCURRENT USER TEST: Have 3 consultants simultaneously record and process interviews through the pipeline. Verify there are no conflicts in file storage, API rate limiting is handled gracefully, and all three analyses complete successfully with correct data segregation.
  • SECURITY TEST: Verify all API keys are stored in environment variables (not hardcoded). Confirm the webhook API rejects requests without valid API key. Verify SharePoint permissions restrict Interview Intelligence Hub access to authorized project team members only. Confirm MFA is enforced for all users accessing the platform.
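The Stage 1 "no hallucinated quotes" check can be partially automated rather than done entirely by eye. A minimal sketch — the function name and the whitespace-normalized matching are assumptions, not part of the pipeline itself:

```python
import re

def non_verbatim_quotes(quotes, transcript):
    """Return the quotes that do NOT appear verbatim in the transcript
    (case- and whitespace-insensitive), i.e. candidate hallucinations."""
    def norm(s):
        # Collapse all runs of whitespace so line breaks don't cause misses
        return re.sub(r'\s+', ' ', s).strip().lower()
    haystack = norm(transcript)
    return [q for q in quotes if norm(q) not in haystack]
```

An empty return value means every extracted quote was found in the source transcript; any quotes returned should be manually investigated before the output passes the test.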

Client Handoff

Client Handoff Checklist

Training Topics to Cover

1. Interview Recording Best Practices (all consultants): Hardware setup, Otter.ai recording initiation, consent form workflow, room acoustics tips, handling in-person vs. remote interviews
2. Transcript Quality Review (researchers/analysts): How to spot and correct transcription errors, when to use Whisper API for re-transcription, naming conventions for transcript files
3. Running the Synthesis Pipeline (2-3 power users): CLI usage, project configuration, interpreting JSON and Markdown outputs, troubleshooting common errors
4. Dovetail Analysis Workflow (researchers): Uploading transcripts, reviewing AI tags, manual refinement of themes, creating and managing tag taxonomies, exporting findings
5. Deliverable Creation with Copilot (senior consultants): Using synthesis outputs as Copilot context, effective prompting for draft generation, human review and quality assurance process
6. Compliance Procedures (all staff): Consent form requirements, data handling obligations, deletion request process, what NOT to put into AI tools
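The naming-convention topic above lends itself to an automated check that can be handed to trainees. A minimal sketch — the `YYYY-MM-DD_Client_Interviewee.ext` pattern below is hypothetical; substitute whatever convention the team standardizes on:

```python
import re

# Hypothetical convention: YYYY-MM-DD_Client_Interviewee.ext
# Adjust the pattern to the naming standard agreed during training.
TRANSCRIPT_NAME = re.compile(
    r'^\d{4}-\d{2}-\d{2}_[A-Za-z0-9-]+_[A-Za-z0-9-]+\.(txt|docx|vtt)$')

def valid_transcript_name(filename):
    """True if the filename matches the agreed transcript naming convention."""
    return bool(TRANSCRIPT_NAME.match(filename))
```

Running this over the SharePoint /Transcripts library before a pipeline run catches misnamed files early, before they cause confusing downstream errors.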

Documentation to Deliver

  • Quick Reference Card (1-page PDF): Step-by-step workflow from recording to deliverable
  • Pipeline User Guide (5-10 pages): Detailed instructions for running the synthesis pipeline with screenshots
  • Troubleshooting Guide: Common errors and resolutions (API rate limits, transcription quality issues, file format problems)
  • Prompt Library: Saved Copilot and GPT-5.4 prompts for common deliverable types (assessment reports, strategy recommendations, stakeholder alignment analyses)
  • Compliance Handbook: Consent form template, data retention policy, vendor DPA status, deletion request procedure
  • Admin Guide (for IT contact): Account administration for Otter.ai, Dovetail, OpenAI API; license management; usage monitoring
  • Video Recordings: All training sessions recorded and saved to SharePoint for future onboarding

Success Criteria to Review Together

Transition Details

  • MSP retains admin access to Otter.ai, Dovetail, and OpenAI accounts for ongoing management
  • Client designates 1-2 internal 'AI Champions' for first-line support
  • 30-day hypercare period with weekly check-in calls included in implementation
  • After hypercare, transition to standard managed services agreement

Maintenance

Ongoing Maintenance Plan

Monthly Tasks (MSP Responsibility)

  • Usage & Cost Monitoring: Review OpenAI API usage dashboard for unexpected spikes; verify Otter.ai minutes consumption is within plan limits; check Dovetail seat utilization. Target: keep API costs under $50/month for typical firm.
  • Prompt Quality Review: Review a sample of recent synthesis outputs for quality degradation (theme relevance, quote accuracy, finding actionability). LLM behavior can drift with model updates.
  • Software Updates: Check for updates to Python dependencies (openai, tiktoken, python-docx, python-pptx, flask). Apply security patches within 7 days of release. Run pip list --outdated and test updates in staging before deploying.
  • Backup Verification: Confirm SharePoint retention policies are active and functioning. Verify synthesis output archives are being maintained.
  • License Reconciliation: Review Otter.ai, Dovetail, and Copilot seat assignments. Remove departed employees, add new hires. Optimize plan tier if usage patterns change.
Check for outdated Python dependencies before applying updates in staging:
bash
pip list --outdated

Quarterly Tasks

  • Prompt Optimization Session (2 hours): Review accumulated synthesis outputs with client power users. Refine prompt templates based on what's working well and what needs improvement. Update config/prompts.yaml accordingly.
  • Compliance Audit: Verify all DPAs are current. Check vendor SOC 2 report validity dates. Review data retention compliance (are old transcripts being deleted per policy?). Audit access logs for unauthorized access attempts.
  • Performance Benchmarking: Time the full pipeline for a recent project and compare to baseline. If processing time has increased >25%, investigate API performance or data volume changes.
  • Template Refresh: Update Word/PPTX templates if client has rebranded or changed deliverable formats.
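The >25% benchmarking threshold above reduces to a one-line comparison that can live in the benchmarking script. A sketch — the function name is an assumption:

```python
def pipeline_regressed(baseline_seconds, current_seconds, threshold=0.25):
    """True if the current run is more than `threshold` (fractionally)
    slower than the recorded baseline."""
    return (current_seconds - baseline_seconds) / baseline_seconds > threshold
```

For example, a pipeline that took 400 seconds at baseline and now takes 520 seconds is 30% slower and trips the threshold; 440 seconds (10% slower) does not.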

Trigger-Based Maintenance

  • OpenAI Model Updates: When OpenAI releases new GPT-5.4 versions, test synthesis quality with the new model on 2-3 existing projects before switching. Update OPENAI_MODEL environment variable only after validation. Expected frequency: 2-4 times per year.
  • Vendor Pricing Changes: Monitor Otter.ai, Dovetail, and OpenAI pricing announcements. Renegotiate or switch vendors if cost increases exceed 20%. Maintain vendor comparison matrix.
  • Client Workflow Changes: If client adds new engagement types, interview methodologies, or deliverable formats, create new prompt templates and test thoroughly before deployment.
  • Security Incidents: If any vendor reports a data breach, immediately assess exposure, notify client, and execute incident response per the compliance handbook. Have a vendor replacement plan ready.
  • API Rate Limiting or Outages: If OpenAI API experiences repeated rate limiting, implement exponential backoff in the pipeline (already built into the OpenAI Python SDK). For extended outages, have Anthropic Claude API as a fallback (requires prompt adaptation).
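For calls outside the OpenAI SDK (which retries with backoff automatically), a generic jittered exponential-backoff wrapper covers the same ground. This is a sketch, not the pipeline's actual code:

```python
import random
import time

def with_backoff(fn, max_attempts=5, base=1.0, cap=30.0):
    """Call fn(); on failure, retry with jittered exponential delays
    (base, 2*base, 4*base, ... seconds, capped at `cap`). Re-raises
    the last exception once max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = min(cap, base * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter
```

Wrapping the fallback-provider call path (e.g. an Anthropic client) in the same helper keeps retry behavior consistent across vendors.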

SLA Considerations

  • Response Time: 4-hour response for pipeline errors blocking active engagements; 24-hour response for non-urgent issues
  • Availability: Pipeline should be available during business hours (M-F 8am-6pm client local time). SaaS vendor uptime is governed by their respective SLAs (typically 99.9%)
  • Data Recovery: All synthesis outputs recoverable from SharePoint within 4 hours. Individual transcript re-processing can be triggered at any time.

Escalation Path

1. Level 1 (Client AI Champion): Basic troubleshooting — restart pipeline, re-export transcript, verify file naming conventions
2. Level 2 (MSP Tier 2 Technician): Software configuration issues, integration failures, Zapier troubleshooting, license management
3. Level 3 (MSP Solutions Architect / Developer): Prompt engineering changes, API migration, custom code modifications, vendor replacement
4. Level 4 (Vendor Support): OpenAI, Otter.ai, Dovetail, Microsoft support tickets for platform-level issues
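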

Cost Monitoring Alerts

  • Set OpenAI API monthly budget alert at $75 (warning) and $100 (hard cap)
  • Monitor Otter.ai minutes usage; alert at 80% of plan capacity
  • Review monthly MSP invoice against expected baseline; investigate variances >15%
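The $75 warning / $100 hard-cap thresholds above can be encoded in whatever monitoring script the MSP already runs. A minimal sketch — the month-to-date spend figure would come from the billing dashboard or exported usage data, not from this function:

```python
def budget_status(month_to_date, warn=75.0, cap=100.0):
    """Classify month-to-date API spend against the alert thresholds."""
    if month_to_date >= cap:
        return 'hard-cap'
    if month_to_date >= warn:
        return 'warning'
    return 'ok'
```

A 'warning' result should trigger a review of recent pipeline runs for unexpected volume; 'hard-cap' should pause non-urgent processing until the cause is understood.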

Alternatives

Turnkey SaaS Approach (Dovetail or Insight7 Only)

Replace the custom Python synthesis pipeline with a single all-in-one platform like Dovetail Professional or Insight7. These platforms handle transcription, AI-powered thematic analysis, collaborative coding, and basic reporting within a single web interface. No custom code, no API integrations, no VM required. The MSP simply provisions accounts, uploads transcripts, and trains users on the platform's built-in AI features.

  • COST: Lower total cost (~$75-150/month vs. ~$300+/month for the full stack) and zero development cost.
  • COMPLEXITY: Dramatically simpler — Tier 1 MSP technician can deploy in 1-2 weeks vs. 4-6 weeks for the primary approach.
  • CAPABILITY: Less customizable synthesis prompts; dependent on vendor's AI quality which varies; weaker deliverable generation (no auto-populated Word/PPTX). Limited integration options.
  • RECOMMENDATION: Best for small consulting firms (1-5 people) with straightforward interview analysis needs and low desire for customization. Not recommended for firms that need branded deliverable automation or have complex thematic analysis requirements.

Microsoft-Native Stack (Teams + Azure OpenAI + Copilot)

Build the entire solution within the Microsoft ecosystem. Use Microsoft Teams for interview recording and built-in transcription, Azure OpenAI Service (GPT-5.4) for the synthesis pipeline with enterprise data residency, Azure Blob Storage for transcript storage, Power Automate for workflow orchestration (replacing Zapier), and Microsoft 365 Copilot for deliverable creation. The custom Python pipeline runs on an Azure App Service or Azure Functions instead of a standalone VM.

  • COST: Potentially higher software costs (Azure OpenAI has the same token pricing, but Azure compute adds $30-100/month; Power Automate Premium is $15/user/month), though it eliminates the Otter.ai, Dovetail, and Zapier subscriptions.
  • COMPLEXITY: Medium — requires Azure expertise (Tier 2-3 technician) but all components are within a single vendor ecosystem.
  • CAPABILITY: Strongest data governance and compliance story (all data stays in Azure tenant, SOC 2/HIPAA/FedRAMP ready). Teams transcription quality is slightly below Otter.ai. No dedicated QDA platform means less collaborative analysis UX.
  • RECOMMENDATION: Best for firms already deep in the Microsoft ecosystem, those with strict data sovereignty requirements (government consulting, healthcare), or MSPs with strong Azure practices. Choose this when compliance is the primary concern over user experience.

Enterprise QDA Platform Approach (NVivo or ATLAS.ti)

Use a traditional enterprise qualitative data analysis platform like NVivo 15 (Lumivero, ~$1,350-$2,500/license) or ATLAS.ti (~€1,100/year) as the primary analysis tool, supplemented by their newly added AI-powered auto-coding features. These platforms offer the deepest qualitative analysis capabilities including mixed-methods research, advanced coding frameworks, and publication-quality outputs.

  • COST: Significantly higher software cost ($1,350-$2,500 per license vs. $15-$20/month for modern alternatives). Desktop-based licensing is less flexible than SaaS.
  • COMPLEXITY: High learning curve — these are academic-grade research tools that require substantial training (budget 10-20 hours per user).
  • CAPABILITY: Most powerful analysis features available; handles complex mixed-methods research that simpler tools cannot; AI features are newer and less mature than purpose-built AI platforms; strong export and visualization capabilities.
  • RECOMMENDATION: Only for professional services firms doing rigorous academic-style research (policy consulting, evaluation firms, market research) where analytical depth and methodological rigor are more important than speed and cost. Overkill for typical strategy consulting interview synthesis.

Build-from-Scratch Custom Application

Develop a fully custom web application using the OpenAI Whisper API for transcription, GPT-5.4 for analysis, a React/Next.js frontend for the analysis interface, and a PostgreSQL database for storing all interview data and analysis results. This gives maximum customization and can be white-labeled by the MSP for multiple clients.

Warning

COST: Highest upfront investment ($15,000-$40,000 in development) but potentially lowest per-client marginal cost once built.

Warning

COMPLEXITY: Very high — requires a full-stack developer, 8-16 weeks of development, and ongoing maintenance.

Note

CAPABILITY: Maximum flexibility — can implement any analysis methodology, any output format, any integration. Can be white-labeled and resold across the MSP's client base as a proprietary product.

Critical

RECOMMENDATION: Only viable if the MSP has development resources and plans to deploy this across 5+ professional services clients. The ROI threshold is approximately 5 clients at $500/month to recover development costs within 12 months. Not recommended for a single-client engagement.
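The ROI threshold above follows from simple break-even arithmetic; a sketch, assuming revenue from this product is applied entirely to recovering the build cost:

```python
def months_to_breakeven(dev_cost, clients, monthly_fee):
    """Months of product revenue needed to recover the upfront
    development cost (ignores hosting and maintenance overhead)."""
    return dev_cost / (clients * monthly_fee)
```

At a mid-range $30,000 build cost, 5 clients paying $500/month break even in 12 months; at the $40,000 high end the same client base needs 16 months.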
