
Implementation Guide: Synthesize stakeholder interview data into themes and findings for deliverables
Step-by-step guide for deploying an AI pipeline that synthesizes stakeholder interview data into themes and findings for client deliverables at Professional Services firms.
Hardware Procurement
Jabra Speak2 75 Wireless Speakerphone
$175 per unit MSP cost / $250 suggested resale
Primary interview recording device for conference rooms and in-person stakeholder interviews. Features 4 noise-cancelling beamforming microphones and super-wideband audio optimized for clear voice capture. Connects via USB or Bluetooth to laptops running transcription software. Critical for ensuring high-quality audio input to AI transcription pipeline.
Jabra Speak 510 Portable Speakerphone
$95 per unit MSP cost / $140 suggested resale
Portable backup and field interview recording device. Compact form factor for consultants conducting on-site stakeholder interviews at client facilities. USB/Bluetooth connectivity for laptop-based recording.
RØDE NT-USB Mini USB Condenser Microphone
$80 per unit MSP cost / $115 suggested resale
High-quality desk-based microphone for dedicated interview recording setups in the office. Provides broadcast-quality audio for solo interviewer recording when a speakerphone pickup pattern is not needed. Ideal for phone/VoIP interview recording scenarios.
Software Procurement
Otter.ai Business
$20/user/month billed annually ($240/user/year). 10 seats = $200/month or $2,400/year
Primary transcription and meeting intelligence platform. Provides automatic transcription of Zoom and Teams meetings with speaker identification, search, and export. 6,000 minutes/month per user. Admin features including usage analytics. Feeds raw transcripts to downstream analysis pipeline.
Dovetail Professional
$15/user/month. 5 seats = $75/month or $900/year
Cloud-native qualitative data analysis platform. Provides AI-driven transcription tagging, automated summarization, collaborative theme coding, and insight repositories. Researchers upload transcripts, tag themes, and generate structured findings. Serves as the central analysis workspace between raw transcripts and final deliverables.
OpenAI API (GPT-5.4)
$2.50 per 1M input tokens, $10 per 1M output tokens. A typical consulting firm workload (~2M tokens/month) works out to roughly $5–$20 depending on the input/output mix; budget $25/month to be safe
Powers the custom thematic synthesis pipeline. Processes cleaned transcripts through structured prompts to extract themes, generate cross-interview findings, identify contradictions, and draft deliverable sections. Used via API for programmatic, repeatable analysis at scale.
OpenAI Whisper API
$0.006/minute of audio. 40 hours of interviews = $14.40 per project
Backup and high-accuracy transcription engine for audio files not captured through Otter.ai (e.g., in-person recordings, phone interviews). Provides raw transcription output that feeds into the synthesis pipeline. Supports 98+ languages.
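The per-project arithmetic above, and the transcription call this fallback path relies on, can be sketched as follows. The cost helper and the audio path are illustrative; `whisper-1` is the Whisper model identifier in the official OpenAI Python SDK.

```python
WHISPER_PER_MINUTE = 0.006  # USD per minute of audio, from the pricing above

def project_cost(interview_hours: float) -> float:
    """Estimated Whisper API cost for one project's worth of audio."""
    return interview_hours * 60 * WHISPER_PER_MINUTE

def transcribe(audio_path: str) -> str:
    """Transcribe one audio file via the OpenAI Whisper API."""
    from openai import OpenAI  # requires `pip install openai` and OPENAI_API_KEY
    client = OpenAI()
    with open(audio_path, "rb") as f:
        return client.audio.transcriptions.create(model="whisper-1", file=f).text

print(f"${project_cost(40):.2f}")  # 40 hours of interviews -> $14.40
```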
Microsoft 365 Business Standard
$12.50/user/month via CSP. 10 seats = $125/month or $1,500/year
Foundation platform providing Microsoft Teams (for remote interview recording and built-in transcription), SharePoint (document storage and transcript repository), Word and PowerPoint (deliverable creation), and OneDrive (individual file storage). Assumed pre-existing at most professional services firms.
Microsoft 365 Copilot
$30/user/month add-on. 5 seats = $150/month or $1,800/year
AI assistant embedded in Word and PowerPoint for drafting final client deliverables from synthesized themes and findings. Copilot ingests structured synthesis outputs and generates draft report sections, executive summaries, and presentation slides. Significant time savings on deliverable production.
Notion Business
$20/user/month for Business plan with AI. 5 seats = $100/month or $1,200/year. Optional — can substitute Confluence or SharePoint.
Optional research knowledge management wiki. Stores interview repositories organized by engagement, maintains prompt templates and analysis playbooks, and serves as a collaborative workspace for building findings before final deliverable creation. Includes Notion AI for additional summarization.
Zapier Professional
$49/month (2,000 tasks/month)
Integration orchestration layer connecting Otter.ai, Dovetail, SharePoint, and the custom synthesis pipeline. Automates transcript routing, triggers synthesis workflows, and pushes notifications to Slack/Teams when analysis is complete.
Prerequisites
- Microsoft 365 Business Standard (or higher) tenant fully provisioned with Teams, SharePoint, and OneDrive configured for all users
- Active Microsoft CSP relationship with the MSP for license management and Copilot provisioning
- Zoom Business ($199.90/user/year) OR Microsoft Teams (included in M365) deployed and configured for meeting recording — confirm which platform the client uses for remote interviews
- Minimum 25 Mbps internet upload bandwidth at primary office location; 10 Mbps minimum at any remote interview sites
- Outbound HTTPS (port 443) allowed through firewall to: otter.ai, dovetail.com, api.openai.com, login.microsoftonline.com, zapier.com, notion.so
- WPA3 or WPA2-Enterprise Wi-Fi at interview recording locations for reliable connectivity
- Modern web browsers (Chrome 120+, Edge 120+, Firefox 120+) on all workstations
- Standard business laptops with minimum 8GB RAM, USB-A or USB-C ports for recording hardware, and functional audio drivers
- Client has signed engagement letter or MSA that permits use of AI tools for data processing — critical for compliance
- OpenAI Platform account created with billing configured and API key generated (MSP should create under client's organization or MSP's managed tenant)
- Python 3.10+ runtime available on at least one workstation or Azure VM for running the custom synthesis pipeline scripts
- Client has existing templates (Word/PowerPoint) for their standard deliverable formats — needed for Copilot template configuration
- Identified project lead / power user within the client organization who will own the interview analysis workflow
- Written interview consent form template approved by client's legal counsel that discloses AI processing of interview data
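The outbound HTTPS prerequisite above can be verified with a quick probe before installation day. A minimal sketch: it performs a TCP connect only and does not detect TLS interception or proxy-level filtering.

```python
import socket

# Hosts from the firewall allowlist in the prerequisites
ALLOWLIST = [
    "otter.ai", "dovetail.com", "api.openai.com",
    "login.microsoftonline.com", "zapier.com", "notion.so",
]

def can_reach(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # DNS failure, refusal, or timeout
        return False

if __name__ == "__main__":
    for host in ALLOWLIST:
        print(f"{host:30s} {'OK' if can_reach(host) else 'BLOCKED'}")
```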
Installation Steps
Step 1: Provision and Configure Otter.ai Business Workspace
Create the Otter.ai Business workspace for the client organization. This is the primary transcription engine that will automatically capture and transcribe all stakeholder interviews conducted via Zoom or Teams. Configure SSO if the client uses Azure AD, set up team structure, and enable key integrations.
If the client uses Microsoft Teams exclusively (no Zoom), consider whether Otter.ai is needed or if Teams built-in transcription + Whisper API for higher accuracy is sufficient. Otter.ai adds value with its search, highlight, and summary features beyond raw transcription. Ensure the DPA is signed before any interview data flows through the platform. For HIPAA-covered clients, confirm Otter.ai Business tier includes BAA — if not, use Enterprise tier or switch to Azure OpenAI Whisper.
Step 2: Deploy Recording Hardware
Unbox, configure, and deploy Jabra Speak2 75 speakerphones and RØDE NT-USB Mini microphones. Test audio quality with each device connected to the client's laptops. Ensure firmware is updated and devices are recognized by Otter.ai and Zoom/Teams.
jabra-direct --check-firmware --update
Label each device with an asset tag for the client's inventory. Position Jabra Speak2 75 units centrally on conference tables (optimal pickup range is 6-8 feet). The RØDE NT-USB Mini should be positioned 6-12 inches from the interviewer for best results. Test each device in the actual rooms where interviews will be conducted, as room acoustics significantly impact transcription accuracy.
Step 3: Configure Dovetail Professional Workspace
Set up the Dovetail qualitative analysis workspace where researchers will organize, tag, and synthesize interview transcripts. Create project templates, configure tag taxonomies, and set up team permissions. Dovetail serves as the central analysis hub between raw transcripts and final deliverables.
Dovetail's AI auto-tagging works best when you seed the tag taxonomy with 10-15 initial tags relevant to the client's typical engagement themes (e.g., 'Process Inefficiency', 'Technology Gap', 'Change Management', 'Stakeholder Alignment'). These will be refined per project. If Dovetail is not available or the client prefers an alternative, Looppanel ($30/mo) or Insight7 (custom pricing) are viable substitutes.
Step 4: Set Up OpenAI API Access and Custom Synthesis Pipeline
Configure the OpenAI API account, set up the custom Python-based synthesis pipeline that processes transcripts through GPT-5.4 for deep thematic extraction, and deploy the pipeline on an Azure VM or local workstation. This is the core intelligence engine that goes beyond Dovetail's built-in AI to provide consulting-grade thematic analysis.
sudo apt update && sudo apt install python3.11 python3.11-venv -y
python3.11 -m venv ~/interview-synthesis-env
source ~/interview-synthesis-env/bin/activate
pip install openai==1.40.0 tiktoken==0.7.0 python-docx==1.1.0 pandas==2.2.0 pyyaml==6.0.1 python-pptx==0.6.23
mkdir -p ~/interview-synthesis
cd ~/interview-synthesis
cat > .env << 'EOF'
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxx
OPENAI_MODEL=gpt-5.4
OPENAI_ORG_ID=org-xxxxxxxxxxxxxxxxxxxx
MAX_TOKENS_PER_REQUEST=16000
TEMPERATURE=0.3
OUTPUT_DIR=./outputs
TRANSCRIPT_DIR=./transcripts
EOF
python3 -c "from openai import OpenAI; client = OpenAI(); print(client.models.list().data[0].id)"
For enterprise clients requiring data residency guarantees, use Azure OpenAI Service instead of OpenAI directly. The Azure OpenAI endpoint replaces the standard API URL, and data stays within the client's Azure tenant. API keys should NEVER be committed to version control. Use .env files with .gitignore or Azure Key Vault for production deployments. Set a conservative monthly budget cap initially; typical usage for a 10-person firm is $15–$30/month.
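The budget guidance follows directly from the token pricing; a small helper makes the arithmetic explicit. The 1.8M-input / 0.2M-output split is an illustrative assumption, not a measured workload.

```python
# Pricing from the OpenAI API procurement entry
INPUT_PER_M = 2.50    # USD per 1M input tokens
OUTPUT_PER_M = 10.00  # USD per 1M output tokens

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly API spend in USD."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# ~2M tokens/month, assumed 90% input / 10% output
print(f"${monthly_cost(1_800_000, 200_000):.2f}")  # -> $6.50
```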
Step 5: Deploy Custom Synthesis Pipeline Scripts
Create and deploy the core Python scripts that implement the multi-stage interview synthesis pipeline. These scripts process transcripts through a series of GPT-5.4 prompts to extract individual interview summaries, cross-interview themes, supporting evidence, contradictions, and structured findings for deliverables. See the custom_ai_components section for full source code.
Set up directory structure, scaffold pipeline files, and run a test execution
The pipeline is designed to be run by a consultant or analyst after interviews are completed and transcripts exported from Otter.ai. It can also be triggered automatically via Zapier when new transcripts appear in a designated SharePoint folder. Processing time for a typical 20-interview project is 5-15 minutes. Always review AI-generated themes and findings before including in client deliverables — human validation is essential for consulting quality.
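The run-after-export workflow described above could be wired together roughly as follows. The module names (`transcript_processor`, `theme_synthesizer`, `deliverable_drafter`) and the `<project>__<interviewee>.txt` filename convention are assumptions for illustration; the class names match the components defined later in this guide.

```python
from pathlib import Path

def collect_transcripts(transcript_dir: str) -> list[tuple[str, dict]]:
    """Gather exported .txt transcripts and derive per-interview metadata
    from an assumed <project>__<interviewee>.txt filename convention."""
    items = []
    for path in sorted(Path(transcript_dir).glob("*.txt")):
        project, _, interviewee = path.stem.partition("__")
        items.append((path.read_text(), {
            "project_name": project,
            "interviewee_name": interviewee or path.stem,
        }))
    return items

def run_pipeline(transcript_dir: str, output_dir: str) -> None:
    """Stage 1 -> Stage 2 -> Stage 3, per the pipeline design in this guide."""
    # Imported lazily so collect_transcripts stays usable without API keys
    from transcript_processor import TranscriptProcessor
    from theme_synthesizer import ThemeSynthesizer
    from deliverable_drafter import DeliverableDrafter
    processor = TranscriptProcessor()
    analyses = [processor.process_transcript(text, meta)
                for text, meta in collect_transcripts(transcript_dir)]
    synthesis = ThemeSynthesizer().synthesize_themes(
        analyses, {"project_name": Path(transcript_dir).name})
    DeliverableDrafter().generate_word_report(
        synthesis, f"{output_dir}/findings_draft.docx")
```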
Step 6: Configure Microsoft 365 Copilot for Deliverable Drafting
Provision Microsoft 365 Copilot licenses for the 5 senior consultants/partners who will produce client deliverables. Configure Copilot to access the SharePoint document library where synthesized findings are stored, and create custom Copilot prompts for generating report sections and presentation slides from the synthesis outputs.
Copilot works best when synthesis outputs are stored as structured Markdown or Word documents in SharePoint, not as raw JSON. The deliverable_drafter.py component of the custom pipeline generates Word documents specifically formatted for Copilot consumption. Copilot requires Microsoft 365 E3/E5 or Business Standard/Premium as a prerequisite — verify the client's base license tier before purchasing the Copilot add-on.
Step 7: Set Up Integration Automation with Zapier
Configure Zapier workflows to automate the flow of data between Otter.ai, SharePoint, the synthesis pipeline, Dovetail, and notification channels. This reduces manual steps and ensures transcripts flow automatically from recording to analysis.
ZAP 1: Otter.ai → SharePoint (New Transcript → Upload)
- Trigger: Otter.ai - New Transcript
- Action: SharePoint - Upload File to /Transcripts library
- Filter: Only trigger for recordings tagged 'stakeholder-interview'
ZAP 2: SharePoint → Webhook (New File → Trigger Synthesis)
- Trigger: SharePoint - New File in /Transcripts
- Action: Webhooks - POST to synthesis pipeline endpoint
- URL: https://{azure-vm-ip}:8443/api/synthesize
- Body: {"file_path": "{{file_url}}", "project": "{{folder_name}}"}
ZAP 3: Synthesis Complete → Teams Notification
- Trigger: Webhooks - Catch Hook (synthesis pipeline calls back)
- Action: Microsoft Teams - Send Channel Message
- Channel: #interview-insights
- Message: 'Synthesis complete for {{project_name}}. {{theme_count}} themes identified. View results: {{output_url}}'
ZAP 4: Synthesis Output → Dovetail (Upload for Collaborative Review)
- Trigger: SharePoint - New File in /Synthesis Outputs
- Action: Dovetail - Create Note (via API)
- Map synthesis JSON fields to Dovetail note fields
Zapier Professional plan ($49/month) provides 2,000 tasks/month which is sufficient for most consulting firms processing 5-15 interviews per month. For higher volume or if the client prefers Microsoft-native automation, replace Zapier with Power Automate (included in M365) — the flows are conceptually identical but use Power Automate connectors instead. The webhook-based synthesis trigger requires the synthesis pipeline to be running as a web service (Flask/FastAPI) on the Azure VM rather than as a CLI script.
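A dependency-free sketch of that web-service trigger, using only the Python standard library (the guide suggests Flask/FastAPI for production; TLS for the port-8443 endpoint would be terminated in front of this process). The endpoint path and payload fields match Zap 2 above; everything else is illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_zap_payload(body: bytes) -> dict:
    """Validate the Zap 2 payload: {"file_path": ..., "project": ...}."""
    data = json.loads(body)
    missing = [k for k in ("file_path", "project") if not data.get(k)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data

class SynthesisHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/synthesize":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_zap_payload(self.rfile.read(length))
        except ValueError:  # covers malformed JSON and missing fields
            self.send_error(400, "bad payload")
            return
        # A real service would fetch the transcript from SharePoint here and
        # start the synthesis pipeline asynchronously before acknowledging.
        self.send_response(202)
        self.end_headers()
        self.wfile.write(json.dumps({"accepted": payload["project"]}).encode())

# To run: HTTPServer(("0.0.0.0", 8443), SynthesisHandler).serve_forever()
```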
Step 8: Configure Compliance and Data Governance Controls
Implement the compliance framework required for processing stakeholder interview data through AI systems. This includes consent management, data processing agreements, retention policies, access controls, and audit logging. Professional services firms have strict obligations around client confidentiality and data handling.
The consent form must be signed by every interview participant BEFORE recording begins. Train all consultants on this requirement. For clients in healthcare consulting, verify HIPAA compliance — Otter.ai Business does NOT include BAA by default; you may need to upgrade to Enterprise or switch entirely to Azure OpenAI Whisper + Azure-hosted pipeline. For EU stakeholders, GDPR requires explicit consent for AI processing and the right to erasure. Implement a documented process for handling deletion requests that covers all systems (Otter, Dovetail, SharePoint, OpenAI logs).
Step 9: Create Deliverable Templates and Copilot Prompt Library
Build the Word and PowerPoint templates that the synthesis pipeline and Copilot will populate with findings. These templates should match the client's existing branding and deliverable formats. Create a library of Copilot prompts that consultants can use to generate specific deliverable sections.
Templates are the bridge between AI-generated analysis and client-ready deliverables. Invest time in getting these right — poorly structured templates negate much of the time savings from AI synthesis. The deliverable_drafter.py script in the custom pipeline generates a pre-populated Word document, but Copilot is used for the final 'polish pass' and for generating sections that require more creative interpretation (e.g., recommendations). Always include a human review step before any deliverable goes to a client.
Step 10: End-User Training and Workflow Validation
Conduct hands-on training sessions with the client's consulting team covering the complete interview-to-deliverable workflow. Validate the entire pipeline end-to-end with a real or realistic test case. Ensure all users understand their role in the workflow and can operate the tools independently.
- Training Session 1: Recording & Transcription (1 hour, all 10 users)
- Hardware setup: Jabra Speak2 75 positioning and connection
- Otter.ai: Starting recordings, tagging interviews, exporting transcripts
- Consent workflow: When and how to obtain participant consent
- Teams/Zoom recording settings for remote interviews
- Training Session 2: Analysis & Synthesis (2 hours, 5 researchers)
- Dovetail: Uploading transcripts, using AI auto-tagging, manual refinement
- Running the synthesis pipeline: CLI usage and output interpretation
- Reviewing AI-generated themes: what to accept, modify, or reject
- Cross-interview analysis: identifying patterns and contradictions
- Training Session 3: Deliverable Creation (1.5 hours, 5 senior consultants)
- Using synthesis outputs in Word/PowerPoint templates
- Copilot prompts for draft generation
- Quality assurance: reviewing AI-drafted content for accuracy
- Finalizing and formatting deliverables
Schedule training sessions across 1 week, not all in one day. The most common failure mode is consultants reverting to manual processes because they don't trust the AI output — address this by showing side-by-side comparisons of AI vs. manual synthesis quality during training. Create a quick-reference card (1-page PDF) summarizing the workflow steps for each role. Record all training sessions for future onboarding of new team members.
Custom AI Components
Interview Transcript Processor
Type: prompt
A structured GPT-5.4 prompt that processes individual interview transcripts to extract per-interview summaries, key quotes, sentiment indicators, and preliminary theme tags. This is Stage 1 of the synthesis pipeline: it runs once per transcript and produces a structured JSON output that feeds into the cross-interview Theme Synthesizer.
Implementation:
Prompt Template (config/prompts.yaml - transcript_analysis section)
import json
import os
import yaml
from openai import OpenAI
from pathlib import Path
import tiktoken
class TranscriptProcessor:
def __init__(self, config_path='config/prompts.yaml'):
self.client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
self.model = os.getenv('OPENAI_MODEL', 'gpt-5.4')
try:
    self.encoding = tiktoken.encoding_for_model(self.model)
except KeyError:
    # Model name unknown to this tiktoken release; fall back to a recent encoding
    self.encoding = tiktoken.get_encoding('cl100k_base')
with open(config_path, 'r') as f:
self.prompts = yaml.safe_load(f)
def count_tokens(self, text: str) -> int:
return len(self.encoding.encode(text))
def chunk_transcript(self, transcript: str, max_tokens: int = 12000) -> list:
"""Split long transcripts into overlapping chunks for processing."""
paragraphs = transcript.split('\n\n')
chunks = []
current_chunk = []
current_tokens = 0
for para in paragraphs:
para_tokens = self.count_tokens(para)
if current_tokens + para_tokens > max_tokens and current_chunk:
chunks.append('\n\n'.join(current_chunk))
# Keep last 2 paragraphs for overlap
current_chunk = current_chunk[-2:]
current_tokens = sum(self.count_tokens(p) for p in current_chunk)
current_chunk.append(para)
current_tokens += para_tokens
if current_chunk:
chunks.append('\n\n'.join(current_chunk))
return chunks
def process_transcript(self, transcript_text: str, metadata: dict) -> dict:
"""Process a single interview transcript and return structured analysis."""
system_prompt = self.prompts['system_prompt']
user_prompt = self.prompts['user_prompt_template'].format(
project_name=metadata.get('project_name', 'Unknown Project'),
project_description=metadata.get('project_description', ''),
interviewee_name=metadata.get('interviewee_name', 'Unknown'),
interviewee_role=metadata.get('interviewee_role', 'Unknown'),
interview_date=metadata.get('interview_date', 'Unknown'),
interviewer_name=metadata.get('interviewer_name', 'Unknown'),
transcript_text=transcript_text
)
token_count = self.count_tokens(system_prompt + user_prompt)
if token_count > 120000: # GPT-5.4 context limit safety margin
return self._process_chunked(transcript_text, metadata)
response = self.client.chat.completions.create(
model=self.model,
messages=[
{'role': 'system', 'content': system_prompt},
{'role': 'user', 'content': user_prompt}
],
temperature=0.3,
response_format={'type': 'json_object'},
max_tokens=int(os.getenv('MAX_TOKENS_PER_REQUEST', 16000))
)
result = json.loads(response.choices[0].message.content)
result['_metadata'] = {
'model': self.model,
'tokens_used': response.usage.total_tokens,
'input_tokens': response.usage.prompt_tokens,
'output_tokens': response.usage.completion_tokens,
'cost_estimate': self._estimate_cost(response.usage)
}
return result
def _process_chunked(self, transcript_text: str, metadata: dict) -> dict:
"""Process very long transcripts by chunking and merging results."""
chunks = self.chunk_transcript(transcript_text)
chunk_results = []
for i, chunk in enumerate(chunks):
metadata_copy = metadata.copy()
metadata_copy['chunk_info'] = f'Part {i+1} of {len(chunks)}'
result = self.process_transcript(chunk, metadata_copy)
chunk_results.append(result)
return self._merge_chunk_results(chunk_results, metadata)
def _merge_chunk_results(self, results: list, metadata: dict) -> dict:
"""Merge multiple chunk analyses into a single coherent result."""
merged = {
'interviewee_summary': results[0].get('interviewee_summary', {}),
'themes_identified': [],
'pain_points': [],
'opportunities': [],
'recommendations_from_interviewee': [],
'contradictions_or_tensions': [],
'notable_quotes': []
}
seen_themes = set()
for r in results:
for theme in r.get('themes_identified', []):
if theme['theme_name'] not in seen_themes:
merged['themes_identified'].append(theme)
seen_themes.add(theme['theme_name'])
merged['pain_points'].extend(r.get('pain_points', []))
merged['opportunities'].extend(r.get('opportunities', []))
merged['recommendations_from_interviewee'].extend(r.get('recommendations_from_interviewee', []))
merged['contradictions_or_tensions'].extend(r.get('contradictions_or_tensions', []))
merged['notable_quotes'].extend(r.get('notable_quotes', []))
return merged
def _estimate_cost(self, usage) -> str:
input_cost = (usage.prompt_tokens / 1_000_000) * 2.50
output_cost = (usage.completion_tokens / 1_000_000) * 10.00
return f'${input_cost + output_cost:.4f}'
if __name__ == '__main__':
import sys
processor = TranscriptProcessor()
transcript_file = sys.argv[1]
with open(transcript_file, 'r') as f:
text = f.read()
metadata = {
'project_name': sys.argv[2] if len(sys.argv) > 2 else 'Test',
'interviewee_name': Path(transcript_file).stem
}
result = processor.process_transcript(text, metadata)
print(json.dumps(result, indent=2))
Cross-Interview Theme Synthesizer
Type: prompt
Stage 2 of the pipeline. Takes the structured outputs from multiple individual transcript analyses (from Stage 1) and synthesizes them into consolidated cross-cutting themes with frequency analysis, sentiment patterns, stakeholder alignment mapping, and evidence triangulation. This is the core intelligence component that produces consulting-grade thematic findings.
Implementation:
Prompt Template (config/prompts.yaml - theme_synthesis section)
import json
import os
import yaml
from openai import OpenAI
from datetime import datetime
class ThemeSynthesizer:
def __init__(self, config_path='config/prompts.yaml'):
self.client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
self.model = os.getenv('OPENAI_MODEL', 'gpt-5.4')
with open(config_path, 'r') as f:
self.prompts = yaml.safe_load(f)
def synthesize_themes(self, interview_analyses: list, project_metadata: dict) -> dict:
"""Synthesize themes across multiple interview analyses."""
# Prepare condensed version of analyses to fit in context window
condensed = self._condense_analyses(interview_analyses)
system_prompt = self.prompts['theme_synthesis_system_prompt']
user_prompt = self.prompts['theme_synthesis_user_prompt'].format(
interview_count=len(interview_analyses),
project_name=project_metadata.get('project_name', 'Unknown'),
project_description=project_metadata.get('project_description', ''),
interview_analyses_json=json.dumps(condensed, indent=2)
)
response = self.client.chat.completions.create(
model=self.model,
messages=[
{'role': 'system', 'content': system_prompt},
{'role': 'user', 'content': user_prompt}
],
temperature=0.3,
response_format={'type': 'json_object'},
max_tokens=16000
)
result = json.loads(response.choices[0].message.content)
result['_processing_metadata'] = {
'model': self.model,
'total_tokens': response.usage.total_tokens,
'cost_estimate': self._estimate_cost(response.usage),
'processed_at': datetime.now().isoformat(),
'interview_count': len(interview_analyses)
}
return result
def _condense_analyses(self, analyses: list) -> list:
"""Remove verbose fields to fit more interviews in context."""
condensed = []
for a in analyses:
c = {
'interviewee': a.get('interviewee_summary', {}),
'themes': [{
'name': t['theme_name'],
'description': t.get('description', ''),
'sentiment': t.get('sentiment', ''),
'quotes': [q['quote'] for q in t.get('supporting_quotes', [])[:3]]
} for t in a.get('themes_identified', [])],
'pain_points': a.get('pain_points', []),
'opportunities': a.get('opportunities', []),
'recommendations': a.get('recommendations_from_interviewee', []),
'contradictions': a.get('contradictions_or_tensions', []),
'top_quotes': [q['quote'] for q in a.get('notable_quotes', [])[:5]]
}
condensed.append(c)
return condensed
def _estimate_cost(self, usage) -> str:
input_cost = (usage.prompt_tokens / 1_000_000) * 2.50
output_cost = (usage.completion_tokens / 1_000_000) * 10.00
return f'${input_cost + output_cost:.4f}'
def generate_theme_summary_markdown(self, synthesis: dict) -> str:
"""Generate a human-readable Markdown summary from the synthesis."""
md = f"# Thematic Analysis: {synthesis.get('synthesis_metadata', {}).get('project_name', 'Project')}\n\n"
md += f"## Executive Summary\n\n{synthesis.get('executive_summary', '')}\n\n"
md += f"## Major Themes\n\n"
for theme in synthesis.get('major_themes', []):
freq = theme.get('frequency', {})
md += f"### {theme['theme_id']}: {theme['theme_name']}\n\n"
md += f"{theme.get('theme_description', '')}\n\n"
md += f"**Strength:** {theme.get('strength', 'N/A')} | "
md += f"**Mentioned by:** {freq.get('stakeholders_mentioning', '?')}/{freq.get('total_stakeholders', '?')} stakeholders\n\n"
for sp in theme.get('stakeholder_perspectives', []):
md += f'- **{sp["stakeholder_name"]}** ({sp["role"]}): "{sp["key_quote"]}"\n'
md += f"\n**Implications:** {theme.get('implications', '')}\n\n"
md += f"## Areas of Consensus\n\n"
for area in synthesis.get('alignment_analysis', {}).get('areas_of_consensus', []):
md += f"- **{area['topic']}**: {area['description']}\n"
md += f"\n## Areas of Divergence\n\n"
for div in synthesis.get('alignment_analysis', {}).get('areas_of_divergence', []):
md += f"### {div['topic']}\n\n{div.get('significance', '')}\n\n"
md += f"## Priority Findings\n\n"
for f in synthesis.get('priority_findings', []):
md += f"- **[{f['finding_id']}]** {f['finding']} (Impact: {f.get('impact', 'N/A')}, Urgency: {f.get('urgency', 'N/A')})\n"
return md
Deliverable Draft Generator
Type: workflow
Stage 3 of the pipeline. Takes the synthesized themes from Stage 2 and generates pre-populated Word and PowerPoint documents using the client's branded templates. These drafts serve as the starting point for consultant review and refinement, and can also be further polished using Microsoft 365 Copilot.
Implementation:
# Python implementation (deliverable_drafter.py)
import json
import os
from docx import Document
from docx.shared import Inches, Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
from pptx import Presentation
from pptx.util import Inches as PptxInches, Pt as PptxPt
from datetime import datetime
class DeliverableDrafter:
def __init__(self, template_dir='templates'):
self.template_dir = template_dir
def generate_word_report(self, synthesis: dict, output_path: str, template_path: str = None):
"""Generate a Word document from synthesis results."""
if template_path and os.path.exists(template_path):
doc = Document(template_path)
else:
doc = Document()
# Title
doc.add_heading('Stakeholder Interview Findings', level=0)
subtitle = doc.add_paragraph()
subtitle.text = f"{synthesis.get('synthesis_metadata', {}).get('project_name', 'Project')}\n"
subtitle.text += f"Date: {datetime.now().strftime('%B %d, %Y')}\n"
subtitle.text += f"Interviews Conducted: {synthesis.get('synthesis_metadata', {}).get('total_interviews', 'N/A')}"
subtitle.style = doc.styles['Subtitle']
# Executive Summary
doc.add_heading('Executive Summary', level=1)
exec_summary = synthesis.get('executive_summary', '')
for para in exec_summary.split('\n\n'):
doc.add_paragraph(para.strip())
# Key Messages Box
rec_structure = synthesis.get('recommended_deliverable_structure', {})
key_messages = rec_structure.get('key_messages', [])
if key_messages:
doc.add_heading('Key Messages', level=2)
for msg in key_messages:
p = doc.add_paragraph(msg, style='List Bullet')
# Major Themes
doc.add_heading('Major Themes', level=1)
for theme in synthesis.get('major_themes', []):
doc.add_heading(f"{theme['theme_id']}: {theme['theme_name']}", level=2)
doc.add_paragraph(theme.get('theme_description', ''))
# Frequency info
freq = theme.get('frequency', {})
freq_text = f"Mentioned by {freq.get('stakeholders_mentioning', '?')} of {freq.get('total_stakeholders', '?')} stakeholders ({freq.get('percentage', 0):.0f}%). Theme strength: {theme.get('strength', 'N/A').upper()}"
p = doc.add_paragraph(freq_text)
p.runs[0].italic = True
# Stakeholder perspectives with quotes
if theme.get('stakeholder_perspectives'):
doc.add_heading('Stakeholder Perspectives', level=3)
for sp in theme['stakeholder_perspectives']:
p = doc.add_paragraph()
runner = p.add_run(f"{sp['stakeholder_name']} ({sp['role']}): ")
runner.bold = True
p.add_run(sp.get('perspective', ''))
if sp.get('key_quote'):
quote_para = doc.add_paragraph()
quote_run = quote_para.add_run(f'"{sp["key_quote"]}"')
quote_run.italic = True
quote_para.paragraph_format.left_indent = Inches(0.5)
# Implications
if theme.get('implications'):
doc.add_heading('Implications', level=3)
doc.add_paragraph(theme['implications'])
# Recommended Actions
if theme.get('recommended_actions'):
doc.add_heading('Recommended Actions', level=3)
for action in theme['recommended_actions']:
doc.add_paragraph(action, style='List Bullet')
# Alignment Analysis
alignment = synthesis.get('alignment_analysis', {})
doc.add_heading('Stakeholder Alignment Analysis', level=1)
if alignment.get('areas_of_consensus'):
doc.add_heading('Areas of Consensus', level=2)
for area in alignment['areas_of_consensus']:
doc.add_paragraph(f"{area['topic']}: {area['description']}")
if alignment.get('areas_of_divergence'):
doc.add_heading('Areas of Divergence', level=2)
for div in alignment['areas_of_divergence']:
doc.add_heading(div['topic'], level=3)
doc.add_paragraph(div.get('significance', ''))
pa = div.get('perspective_a', {})
pb = div.get('perspective_b', {})
if pa:
p = doc.add_paragraph()
p.add_run('View A: ').bold = True
p.add_run(f"{pa.get('position', '')} (Held by: {', '.join(pa.get('stakeholders', []))})")
if pb:
p = doc.add_paragraph()
p.add_run('View B: ').bold = True
p.add_run(f"{pb.get('position', '')} (Held by: {', '.join(pb.get('stakeholders', []))})")
# Priority Findings
doc.add_heading('Priority Findings', level=1)
for finding in synthesis.get('priority_findings', []):
p = doc.add_paragraph()
p.add_run(f"[{finding['finding_id']}] ").bold = True
p.add_run(finding['finding'])
detail = f"Evidence: {finding.get('evidence_strength', 'N/A')} | Impact: {finding.get('impact', 'N/A')} | Urgency: {finding.get('urgency', 'N/A')}"
det_p = doc.add_paragraph(detail)
det_p.runs[0].italic = True
# Cross-cutting Insights
if synthesis.get('cross_cutting_insights'):
doc.add_heading('Cross-Cutting Insights', level=1)
for insight in synthesis['cross_cutting_insights']:
doc.add_paragraph(insight['insight'])
if insight.get('evidence'):
ev = doc.add_paragraph(f"Evidence: {insight['evidence']}")
ev.runs[0].italic = True
# Save
doc.save(output_path)
return output_path
    def generate_pptx_summary(self, synthesis: dict, output_path: str, template_path: str = None):
        """Generate a PowerPoint summary from synthesis results."""
        if template_path and os.path.exists(template_path):
            prs = Presentation(template_path)
        else:
            prs = Presentation()

        # Title slide
        slide = prs.slides.add_slide(prs.slide_layouts[0])
        slide.shapes.title.text = 'Stakeholder Interview Findings'
        slide.placeholders[1].text = f"{synthesis.get('synthesis_metadata', {}).get('project_name', '')}\n{datetime.now().strftime('%B %Y')}"

        # Executive Summary slide
        slide = prs.slides.add_slide(prs.slide_layouts[1])
        slide.shapes.title.text = 'Executive Summary'
        exec_text = synthesis.get('executive_summary', '')
        # Truncate for slide
        if len(exec_text) > 800:
            exec_text = exec_text[:797] + '...'
        slide.placeholders[1].text = exec_text

        # Key Messages slide
        key_messages = synthesis.get('recommended_deliverable_structure', {}).get('key_messages', [])
        if key_messages:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = 'Key Messages'
            body = '\n'.join([f'• {msg}' for msg in key_messages])
            slide.placeholders[1].text = body

        # Theme overview slide
        themes = synthesis.get('major_themes', [])
        if themes:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = f'Themes Overview ({len(themes)} Themes Identified)'
            theme_list = '\n'.join([f"• {t['theme_id']}: {t['theme_name']} ({t.get('strength', 'N/A')})" for t in themes])
            slide.placeholders[1].text = theme_list

        # Individual theme slides
        for theme in themes[:8]:  # Limit to 8 theme slides
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = f"{theme['theme_id']}: {theme['theme_name']}"
            freq = theme.get('frequency', {})
            content = f"{theme.get('theme_description', '')}\n\n"
            content += f"Mentioned by {freq.get('stakeholders_mentioning', '?')}/{freq.get('total_stakeholders', '?')} stakeholders\n\n"
            perspectives = theme.get('stakeholder_perspectives', [])[:3]
            for sp in perspectives:
                content += f'• "{sp.get("key_quote", "")}" — {sp["stakeholder_name"]}\n'
            if theme.get('implications'):
                content += f"\nImplications: {theme['implications']}"
            slide.placeholders[1].text = content

        # Priority Findings slide
        findings = synthesis.get('priority_findings', [])
        if findings:
            slide = prs.slides.add_slide(prs.slide_layouts[1])
            slide.shapes.title.text = 'Priority Findings'
            findings_text = '\n'.join([f"• [{f['finding_id']}] {f['finding']}" for f in findings[:6]])
            slide.placeholders[1].text = findings_text

        prs.save(output_path)
        return output_path
if __name__ == '__main__':
    import sys
    synthesis_file = sys.argv[1]
    with open(synthesis_file, 'r') as f:
        synthesis = json.load(f)
    drafter = DeliverableDrafter()
    drafter.generate_word_report(synthesis, 'outputs/findings_report.docx')
    drafter.generate_pptx_summary(synthesis, 'outputs/findings_summary.pptx')
    print('Deliverables generated successfully.')

Main Synthesis Pipeline Orchestrator
Type: workflow
The master orchestration script that ties together all three stages: (1) process individual transcripts, (2) synthesize cross-interview themes, and (3) generate deliverable drafts. It provides a CLI interface for consultants and can also be exposed as a REST API for Zapier integration.
Implementation:
#!/usr/bin/env python3
# synthesis_pipeline.py - Main orchestrator for interview synthesis pipeline
import json
import os
import sys
import argparse
import glob
from pathlib import Path
from datetime import datetime

from transcript_processor import TranscriptProcessor
from theme_synthesizer import ThemeSynthesizer
from deliverable_drafter import DeliverableDrafter


def load_project_config(config_path: str) -> dict:
    """Load project configuration from JSON file."""
    with open(config_path, 'r') as f:
        return json.load(f)


def discover_transcripts(transcript_dir: str) -> list:
    """Find all transcript files in the specified directory."""
    extensions = ['*.txt', '*.md', '*.vtt', '*.srt']
    files = []
    for ext in extensions:
        files.extend(glob.glob(os.path.join(transcript_dir, ext)))
    return sorted(files)
def parse_transcript_metadata(filepath: str) -> dict:
    """Extract metadata from transcript filename convention.

    Expected format: YYYY-MM-DD_IntervieweeName_Role.txt
    Falls back to filename as interviewee name if convention not followed."""
    stem = Path(filepath).stem
    parts = stem.split('_')
    if len(parts) >= 3:
        return {
            'interview_date': parts[0],
            'interviewee_name': parts[1].replace('-', ' '),
            'interviewee_role': ' '.join(parts[2:]).replace('-', ' ')
        }
    return {
        'interview_date': 'Unknown',
        'interviewee_name': stem.replace('-', ' ').replace('_', ' '),
        'interviewee_role': 'Unknown'
    }
def main():
    parser = argparse.ArgumentParser(description='AI Interview Synthesis Pipeline')
    parser.add_argument('--input', '-i', required=True, help='Directory of transcript files OR single transcript file')
    parser.add_argument('--project', '-p', required=True, help='Project name')
    parser.add_argument('--description', '-d', default='', help='Project description for context')
    parser.add_argument('--output', '-o', default='outputs', help='Output directory')
    parser.add_argument('--config', '-c', default=None, help='Path to project config JSON')
    parser.add_argument('--word-template', default=None, help='Path to Word template .docx')
    parser.add_argument('--pptx-template', default=None, help='Path to PowerPoint template .pptx')
    parser.add_argument('--skip-transcripts', action='store_true', help='Skip Stage 1 (use existing analyses)')
    parser.add_argument('--skip-deliverables', action='store_true', help='Skip Stage 3 (no deliverable generation)')
    args = parser.parse_args()

    # Setup output directory
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    output_dir = os.path.join(args.output, f"{args.project.replace(' ', '_')}_{timestamp}")
    os.makedirs(output_dir, exist_ok=True)
    os.makedirs(os.path.join(output_dir, 'individual_analyses'), exist_ok=True)

    project_metadata = {
        'project_name': args.project,
        'project_description': args.description,
        'processed_at': datetime.now().isoformat()
    }
    if args.config:
        project_metadata.update(load_project_config(args.config))

    print('\n=== Interview Synthesis Pipeline ===')
    print(f'Project: {args.project}')
    print(f'Output: {output_dir}')

    # Stage 1: Process Individual Transcripts
    individual_analyses = []
    if not args.skip_transcripts:
        print('\n--- Stage 1: Individual Transcript Analysis ---')
        processor = TranscriptProcessor()
        if os.path.isdir(args.input):
            transcript_files = discover_transcripts(args.input)
        else:
            transcript_files = [args.input]
        print(f'Found {len(transcript_files)} transcript(s)')
        for i, filepath in enumerate(transcript_files, 1):
            print(f'  Processing [{i}/{len(transcript_files)}]: {Path(filepath).name}')
            with open(filepath, 'r', encoding='utf-8') as f:
                transcript_text = f.read()
            metadata = parse_transcript_metadata(filepath)
            metadata.update(project_metadata)
            analysis = processor.process_transcript(transcript_text, metadata)
            individual_analyses.append(analysis)
            # Save individual analysis
            analysis_path = os.path.join(output_dir, 'individual_analyses', f'{Path(filepath).stem}_analysis.json')
            with open(analysis_path, 'w') as f:
                json.dump(analysis, f, indent=2)
            print(f'  -> Saved: {analysis_path}')
            if analysis.get('_metadata', {}).get('cost_estimate'):
                print(f'  -> Cost: {analysis["_metadata"]["cost_estimate"]}')
    else:
        print('\n--- Stage 1: SKIPPED (loading existing analyses) ---')
        analysis_dir = os.path.join(output_dir, 'individual_analyses')
        for path in sorted(glob.glob(os.path.join(analysis_dir, '*_analysis.json'))):
            with open(path, 'r') as fh:
                individual_analyses.append(json.load(fh))
        print(f'Loaded {len(individual_analyses)} existing analyses')

    if not individual_analyses:
        print('ERROR: No transcript analyses found. Exiting.')
        sys.exit(1)

    # Stage 2: Cross-Interview Theme Synthesis
    print('\n--- Stage 2: Cross-Interview Theme Synthesis ---')
    print(f'Synthesizing {len(individual_analyses)} interviews...')
    synthesizer = ThemeSynthesizer()
    synthesis = synthesizer.synthesize_themes(individual_analyses, project_metadata)
    synthesis_path = os.path.join(output_dir, 'theme_synthesis.json')
    with open(synthesis_path, 'w') as f:
        json.dump(synthesis, f, indent=2)
    print(f'  -> Saved: {synthesis_path}')

    # Save markdown version
    md_path = os.path.join(output_dir, 'theme_synthesis.md')
    md_content = synthesizer.generate_theme_summary_markdown(synthesis)
    with open(md_path, 'w') as f:
        f.write(md_content)
    print(f'  -> Markdown: {md_path}')

    theme_count = len(synthesis.get('major_themes', []))
    finding_count = len(synthesis.get('priority_findings', []))
    print(f'  -> {theme_count} major themes, {finding_count} priority findings identified')

    # Stage 3: Generate Deliverable Drafts
    if not args.skip_deliverables:
        print('\n--- Stage 3: Deliverable Draft Generation ---')
        drafter = DeliverableDrafter()
        word_path = os.path.join(output_dir, f"{args.project.replace(' ', '_')}_Findings_Report.docx")
        drafter.generate_word_report(synthesis, word_path, args.word_template)
        print(f'  -> Word Report: {word_path}')
        pptx_path = os.path.join(output_dir, f"{args.project.replace(' ', '_')}_Findings_Summary.pptx")
        drafter.generate_pptx_summary(synthesis, pptx_path, args.pptx_template)
        print(f'  -> PowerPoint Summary: {pptx_path}')

    # Summary
    print('\n=== Pipeline Complete ===')
    print(f'All outputs saved to: {output_dir}')
    total_cost = synthesis.get('_processing_metadata', {}).get('cost_estimate', 'N/A')
    print(f'Estimated API cost for synthesis stage: {total_cost}')
    print('Review the theme_synthesis.md file for a human-readable summary.')
    print('Open the Word/PPTX files and refine with Microsoft 365 Copilot for final polish.')


if __name__ == '__main__':
    main()

Interview Consent Form Template
Type: prompt
A standardized consent form template that must be signed by all interview participants before recording begins. This is a critical compliance component for the solution. The MSP should customize this with the client's specific branding and legal language.
Implementation:
STAKEHOLDER INTERVIEW CONSENT FORM
- Project: [PROJECT NAME]
- Client: [CLIENT ORGANIZATION]
- Conducted by: [CONSULTING FIRM NAME]
- Date: [DATE]
Purpose
You have been invited to participate in a stakeholder interview as part of [PROJECT NAME]. The purpose of this interview is to gather your perspectives, experiences, and insights on [TOPIC AREA]. Your input will help inform [DELIVERABLE TYPE, e.g., 'strategic recommendations', 'assessment findings', 'transformation roadmap'].
Recording & Transcription
With your permission, this interview will be:
- Audio recorded for accuracy purposes
- Automatically transcribed using AI-powered transcription software (Otter.ai / Microsoft Teams)
- Analyzed using AI tools (including large language models) to identify themes and patterns across all stakeholder interviews
How Your Data Will Be Used
- Your responses will be aggregated with other stakeholder interviews to identify common themes
- Direct quotes may be used in deliverables but will be attributed by role only (e.g., 'Senior Manager') unless you grant explicit permission for name attribution
- AI tools will process your transcript to extract themes; no AI system will make decisions based on your individual responses
- Raw transcripts and recordings will be deleted within [90] days of project completion
Confidentiality
- All data is stored on encrypted, SOC 2-certified cloud platforms
- Access is restricted to the project team at [CONSULTING FIRM NAME]
- Your data will not be sold, shared with third parties, or used for any purpose beyond this project
- [CONSULTING FIRM NAME] has executed Data Processing Agreements with all technology vendors
Your Rights
- You may decline to answer any question
- You may withdraw your consent at any time by contacting [CONTACT EMAIL]
- You may request deletion of your interview data at any time
- You may opt out of AI processing while still participating in the interview (manual analysis only)
Consent
By signing below, I confirm that:
Optional:
Participant Name: ________________________________
Participant Signature: ________________________________
Date: ________________________________
Interviewer Name: ________________________________
This form should be reviewed and approved by your organization's legal counsel before use. Modify retention periods, attribution policies, and data handling procedures to match your specific regulatory requirements (GDPR, CCPA, HIPAA, etc.).
Zapier-to-Pipeline Webhook API
Type: integration
A lightweight Flask API that receives webhook calls from Zapier when new transcripts appear in SharePoint, triggers the synthesis pipeline, and sends a callback when processing completes. Runs on the same VM or workstation as the synthesis pipeline.
Implementation:
# webhook_api.py - Flask API for Zapier integration
# Run with: python webhook_api.py (or deploy via gunicorn)
import os
import json
import subprocess
import threading
from datetime import datetime

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

# Configuration
PIPELINE_SCRIPT = os.getenv('PIPELINE_SCRIPT', './synthesis_pipeline.py')
TRANSCRIPT_DIR = os.getenv('TRANSCRIPT_DIR', './transcripts')
OUTPUT_DIR = os.getenv('OUTPUT_DIR', './outputs')
CALLBACK_URL = os.getenv('ZAPIER_CALLBACK_URL', '')  # Zapier webhook catch URL
API_KEY = os.getenv('WEBHOOK_API_KEY', 'change-me-to-a-secure-key')


def verify_api_key(req):
    """Simple API key authentication."""
    key = req.headers.get('X-API-Key') or req.args.get('api_key')
    return key == API_KEY


def run_pipeline_async(project_name, transcript_dir, description=''):
    """Run the synthesis pipeline in a background thread."""
    def _run():
        try:
            cmd = [
                'python3', PIPELINE_SCRIPT,
                '--input', transcript_dir,
                '--project', project_name,
                '--description', description,
                '--output', OUTPUT_DIR
            ]
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
            # Send callback to Zapier
            if CALLBACK_URL:
                payload = {
                    'project_name': project_name,
                    'status': 'success' if result.returncode == 0 else 'error',
                    'output': result.stdout[-500:] if result.stdout else '',
                    'error': result.stderr[-500:] if result.stderr else '',
                    'completed_at': datetime.now().isoformat()
                }
                requests.post(CALLBACK_URL, json=payload, timeout=30)
        except Exception as e:
            if CALLBACK_URL:
                requests.post(CALLBACK_URL, json={
                    'project_name': project_name,
                    'status': 'error',
                    'error': str(e),
                    'completed_at': datetime.now().isoformat()
                }, timeout=30)

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()


@app.route('/api/health', methods=['GET'])
def health():
    return jsonify({'status': 'healthy', 'timestamp': datetime.now().isoformat()})


@app.route('/api/synthesize', methods=['POST'])
def synthesize():
    if not verify_api_key(request):
        return jsonify({'error': 'Unauthorized'}), 401
    data = request.json or {}
    project_name = data.get('project', 'Unknown Project')
    transcript_dir = data.get('transcript_dir', TRANSCRIPT_DIR)
    description = data.get('description', '')
    if not os.path.exists(transcript_dir):
        return jsonify({'error': f'Transcript directory not found: {transcript_dir}'}), 400
    run_pipeline_async(project_name, transcript_dir, description)
    return jsonify({
        'status': 'processing',
        'project': project_name,
        'message': 'Pipeline started. Callback will be sent to Zapier when complete.',
        'started_at': datetime.now().isoformat()
    }), 202


@app.route('/api/status', methods=['GET'])
def status():
    if not verify_api_key(request):
        return jsonify({'error': 'Unauthorized'}), 401
    # List recent output directories
    outputs = []
    if os.path.exists(OUTPUT_DIR):
        for d in sorted(os.listdir(OUTPUT_DIR), reverse=True)[:10]:
            dir_path = os.path.join(OUTPUT_DIR, d)
            if os.path.isdir(dir_path):
                has_synthesis = os.path.exists(os.path.join(dir_path, 'theme_synthesis.json'))
                outputs.append({'project': d, 'complete': has_synthesis})
    return jsonify({'recent_runs': outputs})


if __name__ == '__main__':
    port = int(os.getenv('PORT', 8443))
    app.run(host='0.0.0.0', port=port, debug=False)
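For the Zapier webhook step (or manual testing), a request shaped like the following exercises the `/api/synthesize` endpoint. The endpoint path, `X-API-Key` header, and JSON field names come from the API above; the URL, key, and project values are placeholders.

```python
# Example request for POST /api/synthesize. URL, key, and project values are
# placeholders; substitute your own before use.
import json

url = "http://localhost:8443/api/synthesize"   # placeholder host/port
headers = {
    "X-API-Key": "change-me-to-a-secure-key",  # must match WEBHOOK_API_KEY
    "Content-Type": "application/json",
}
payload = {
    "project": "Example Client Assessment",            # hypothetical project
    "transcript_dir": "./transcripts/example-client",  # hypothetical path
    "description": "Stakeholder interviews, Q3 assessment",
}
body = json.dumps(payload)
# With a live server: requests.post(url, headers=headers, data=body, timeout=10)
# A successful call returns HTTP 202 with a {"status": "processing", ...} body.
```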
# For production: gunicorn -w 2 -b 0.0.0.0:8443 webhook_api:app
# Dependencies: pip install flask==3.0.0 gunicorn==22.0.0 requests==2.32.0

Testing & Validation
- AUDIO QUALITY TEST: Record a 5-minute test conversation using each Jabra Speak2 75 in the actual conference rooms where interviews will be conducted. Play back the recording and verify all speakers are audible, no significant background noise, and speech is clear. Transcribe with Otter.ai and verify >95% word accuracy rate.
- OTTER.AI TRANSCRIPTION ACCURACY TEST: Conduct a 15-minute mock stakeholder interview with two participants. Record via Otter.ai. After transcription completes, manually compare 3 random 2-minute segments against the recording. Verify speaker identification is correct (speakers labeled distinctly) and overall word error rate is below 10%.
- OTTER.AI TO SHAREPOINT INTEGRATION TEST: Record a test meeting tagged 'stakeholder-interview' in Otter.ai. Verify the Zapier zap triggers and the transcript file appears in the SharePoint /Transcripts library within 5 minutes. Confirm the file is readable and contains the full transcript text.
- TRANSCRIPT PROCESSOR (STAGE 1) TEST: Run transcript_processor.py against a sample 45-minute interview transcript. Verify the output JSON contains: (a) valid interviewee_summary with correct role and name, (b) at least 3 themes_identified with supporting quotes that actually appear in the transcript, (c) at least 1 pain_point and 1 opportunity, (d) all quotes in the output are verbatim from the source transcript (no hallucinated quotes).
- THEME SYNTHESIZER (STAGE 2) TEST: Process 5+ sample interview transcripts through Stage 1, then run theme_synthesizer.py. Verify: (a) major_themes are distinct and non-overlapping, (b) frequency counts are mathematically correct (stakeholders_mentioning ≤ total_stakeholders), (c) alignment_analysis identifies at least one area of consensus and one area of divergence, (d) executive_summary is coherent and references actual themes, (e) total processing time is under 3 minutes.
- DELIVERABLE DRAFT GENERATION TEST: Run deliverable_drafter.py with synthesis output. Open the generated .docx file and verify: (a) document opens without errors in Word, (b) all sections (Executive Summary, Themes, Alignment Analysis, Priority Findings) are populated, (c) quotes in the document match source transcripts, (d) formatting is clean with proper heading hierarchy. Open the .pptx file and verify all slides render correctly.
- END-TO-END PIPELINE TEST: Run the full synthesis_pipeline.py with --input pointing to a directory of 3-5 sample transcripts. Verify all three stages complete without errors, output directory contains individual_analyses/*.json, theme_synthesis.json, theme_synthesis.md, and Word/PPTX files. Total execution time should be under 10 minutes for 5 interviews.
- MICROSOFT 365 COPILOT INTEGRATION TEST: Open the generated Word deliverable in Microsoft Word with Copilot enabled. Test these Copilot prompts: (a) 'Summarize the key themes in this document' — verify response references actual themes, (b) 'Rewrite the executive summary for a C-suite audience' — verify output is appropriate, (c) 'Create a table comparing stakeholder perspectives on [Theme 1]' — verify table is generated with relevant data.
- ZAPIER WEBHOOK INTEGRATION TEST: Send a POST request to the webhook API endpoint with a test payload. Verify: (a) API returns 202 status, (b) pipeline begins processing, (c) Zapier callback webhook receives completion notification with correct project name and status. Test with invalid API key and verify 401 rejection.
- DATA RETENTION AND COMPLIANCE TEST: Verify that SharePoint retention policy is applied to the Interview Intelligence Hub site. Upload a test document, confirm it is labeled correctly. Verify Otter.ai admin panel shows correct retention settings. Verify OpenAI API organization settings confirm data is not used for training. Test a deletion request workflow end-to-end: request deletion of a specific interview's data and confirm removal from Otter.ai, SharePoint, and Dovetail within 48 hours.
- CONCURRENT USER TEST: Have 3 consultants simultaneously record and process interviews through the pipeline. Verify there are no conflicts in file storage, API rate limiting is handled gracefully, and all three analyses complete successfully with correct data segregation.
- SECURITY TEST: Verify all API keys are stored in environment variables (not hardcoded). Confirm the webhook API rejects requests without valid API key. Verify SharePoint permissions restrict Interview Intelligence Hub access to authorized project team members only. Confirm MFA is enforced for all users accessing the platform.
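Several of the Stage 2 checks above (correct frequency arithmetic, distinct themes, consensus and divergence both present) can be automated. A minimal sketch, assuming the theme_synthesis.json schema used throughout this guide:

```python
# Automated sanity checks for a Stage 2 synthesis output.
# Assumes the theme_synthesis.json schema used in this guide.
import json

def validate_synthesis(synthesis: dict) -> list:
    """Return a list of human-readable validation failures (empty = pass)."""
    failures = []
    themes = synthesis.get('major_themes', [])
    if not themes:
        failures.append('No major_themes present')
    seen_names = set()
    for t in themes:
        name = t.get('theme_name', '').lower()
        if name in seen_names:
            failures.append(f"Duplicate theme name: {t.get('theme_name')}")
        seen_names.add(name)
        freq = t.get('frequency', {})
        m = freq.get('stakeholders_mentioning')
        n = freq.get('total_stakeholders')
        if isinstance(m, int) and isinstance(n, int) and m > n:
            failures.append(f"{t.get('theme_id', '?')}: mentioning {m} > total {n}")
    align = synthesis.get('alignment_analysis', {})
    if not align.get('areas_of_consensus'):
        failures.append('No areas_of_consensus identified')
    if not align.get('areas_of_divergence'):
        failures.append('No areas_of_divergence identified')
    return failures
```

Run it against theme_synthesis.json after each Stage 2 test; an empty list means the structural checks pass (quote-verbatim checks still require manual review).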
Client Handoff
Client Handoff Checklist
Training Topics to Cover
Documentation to Deliver
- Quick Reference Card (1-page PDF): Step-by-step workflow from recording to deliverable
- Pipeline User Guide (5-10 pages): Detailed instructions for running the synthesis pipeline with screenshots
- Troubleshooting Guide: Common errors and resolutions (API rate limits, transcription quality issues, file format problems)
- Prompt Library: Saved Copilot and GPT-5.4 prompts for common deliverable types (assessment reports, strategy recommendations, stakeholder alignment analyses)
- Compliance Handbook: Consent form template, data retention policy, vendor DPA status, deletion request procedure
- Admin Guide (for IT contact): Account administration for Otter.ai, Dovetail, OpenAI API; license management; usage monitoring
- Video Recordings: All training sessions recorded and saved to SharePoint for future onboarding
Success Criteria to Review Together
Transition Details
- MSP retains admin access to Otter.ai, Dovetail, and OpenAI accounts for ongoing management
- Client designates 1-2 internal 'AI Champions' for first-line support
- 30-day hypercare period with weekly check-in calls included in implementation
- After hypercare, transition to standard managed services agreement
Maintenance
Ongoing Maintenance Plan
Monthly Tasks (MSP Responsibility)
- Usage & Cost Monitoring: Review the OpenAI API usage dashboard for unexpected spikes; verify Otter.ai minutes consumption is within plan limits; check Dovetail seat utilization. Target: keep API costs under $50/month for a typical firm.
- Prompt Quality Review: Review a sample of recent synthesis outputs for quality degradation (theme relevance, quote accuracy, finding actionability). LLM behavior can drift with model updates.
- Software Updates: Check for updates to Python dependencies (openai, tiktoken, python-docx, python-pptx, flask). Apply security patches within 7 days of release. Run pip list --outdated and test updates in staging before deploying.
- Backup Verification: Confirm SharePoint retention policies are active and functioning. Verify synthesis output archives are being maintained.
- License Reconciliation: Review Otter.ai, Dovetail, and Copilot seat assignments. Remove departed employees, add new hires. Optimize plan tier if usage patterns change.
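The dependency-review step can be scripted against pip's JSON output. A sketch, assuming a pip version that supports `pip list --outdated --format=json`:

```python
# Helper for the monthly dependency review: report which of the pipeline's
# watched packages have newer releases available.
import json
import subprocess

WATCHED = {'openai', 'tiktoken', 'python-docx', 'python-pptx', 'flask'}

def filter_watched(pip_json: str, watched=WATCHED) -> list:
    """Return names of watched packages present in pip's outdated JSON."""
    return sorted(p['name'] for p in json.loads(pip_json)
                  if p['name'].lower() in watched)

def outdated_watched() -> list:
    """Run `pip list --outdated --format=json` and filter to watched packages."""
    out = subprocess.run(['pip', 'list', '--outdated', '--format=json'],
                         capture_output=True, text=True, check=True).stdout
    return filter_watched(out)
```

Anything the script reports should be patched in staging first, per the 7-day patch window above.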
Quarterly Tasks
- Prompt Optimization Session (2 hours): Review accumulated synthesis outputs with client power users. Refine prompt templates based on what's working well and what needs improvement. Update config/prompts.yaml accordingly.
- Compliance Audit: Verify all DPAs are current. Check vendor SOC 2 report validity dates. Review data retention compliance (are old transcripts being deleted per policy?). Audit access logs for unauthorized access attempts.
- Performance Benchmarking: Time the full pipeline for a recent project and compare to baseline. If processing time has increased >25%, investigate API performance or data volume changes.
- Template Refresh: Update Word/PPTX templates if client has rebranded or changed deliverable formats.
Trigger-Based Maintenance
- OpenAI Model Updates: When OpenAI releases new GPT-5.4 versions, test synthesis quality with the new model on 2-3 existing projects before switching. Update OPENAI_MODEL environment variable only after validation. Expected frequency: 2-4 times per year.
- Vendor Pricing Changes: Monitor Otter.ai, Dovetail, and OpenAI pricing announcements. Renegotiate or switch vendors if cost increases exceed 20%. Maintain vendor comparison matrix.
- Client Workflow Changes: If client adds new engagement types, interview methodologies, or deliverable formats, create new prompt templates and test thoroughly before deployment.
- Security Incidents: If any vendor reports a data breach, immediately assess exposure, notify client, and execute incident response per the compliance handbook. Have a vendor replacement plan ready.
- API Rate Limiting or Outages: If OpenAI API experiences repeated rate limiting, implement exponential backoff in the pipeline (already built into the OpenAI Python SDK). For extended outages, have Anthropic Claude API as a fallback (requires prompt adaptation).
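If the SDK's built-in retries prove insufficient, a generic jittered exponential-backoff wrapper can be added around pipeline API calls. A sketch only; the retryable exception types depend on the client library in use:

```python
# Jittered exponential backoff: delay doubles each attempt, capped at
# max_delay, with up to 10% random jitter to avoid synchronized retries.
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0,
                 retryable=(Exception,)):
    """Call fn(), retrying on retryable errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay + random.uniform(0, delay * 0.1))
```

In practice the `retryable` tuple would be narrowed to the client's rate-limit and timeout exceptions rather than bare `Exception`.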
SLA Considerations
- Response Time: 4-hour response for pipeline errors blocking active engagements; 24-hour response for non-urgent issues
- Availability: Pipeline should be available during business hours (M-F 8am-6pm client local time). SaaS vendor uptime is governed by their respective SLAs (typically 99.9%)
- Data Recovery: All synthesis outputs recoverable from SharePoint within 4 hours. Individual transcript re-processing can be triggered at any time.
Escalation Path
Cost Monitoring Alerts
- Set OpenAI API monthly budget alert at $75 (warning) and $100 (hard cap)
- Monitor Otter.ai minutes usage; alert at 80% of plan capacity
- Review monthly MSP invoice against expected baseline; investigate variances >15%
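A back-of-envelope check shows how the $75 warning and $100 hard-cap thresholds relate to actual usage. The per-token rates below are placeholders; substitute current OpenAI pricing:

```python
# Estimated monthly API spend for the synthesis pipeline.
# Rates are placeholders, not actual OpenAI pricing.
INPUT_RATE = 5.00 / 1_000_000    # $ per input token (placeholder)
OUTPUT_RATE = 15.00 / 1_000_000  # $ per output token (placeholder)

def monthly_api_cost(interviews, input_tokens_each, output_tokens_each):
    """Monthly cost = interviews x (input + output token cost per interview)."""
    return interviews * (input_tokens_each * INPUT_RATE
                         + output_tokens_each * OUTPUT_RATE)
```

At 20 interviews/month with roughly 15k input and 3k output tokens each, this yields about $2.40, well under the warning threshold; the alerts mainly guard against runaway loops or oversized transcripts.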
Alternatives
Turnkey SaaS Approach (Dovetail or Insight7 Only)
Replace the custom Python synthesis pipeline with a single all-in-one platform like Dovetail Professional or Insight7. These platforms handle transcription, AI-powered thematic analysis, collaborative coding, and basic reporting within a single web interface. No custom code, no API integrations, no VM required. The MSP simply provisions accounts, uploads transcripts, and trains users on the platform's built-in AI features.
- COST: Lower total cost (~$75-150/month vs. ~$300+/month for the full stack) and zero development cost.
- COMPLEXITY: Dramatically simpler — Tier 1 MSP technician can deploy in 1-2 weeks vs. 4-6 weeks for the primary approach.
- CAPABILITY: Less customizable synthesis prompts; dependent on vendor's AI quality which varies; weaker deliverable generation (no auto-populated Word/PPTX). Limited integration options.
- RECOMMENDATION: Best for small consulting firms (1-5 people) with straightforward interview analysis needs and low desire for customization. Not recommended for firms that need branded deliverable automation or have complex thematic analysis requirements.
Microsoft-Native Stack (Teams + Azure OpenAI + Copilot)
Build the entire solution within the Microsoft ecosystem. Use Microsoft Teams for interview recording and built-in transcription, Azure OpenAI Service (GPT-5.4) for the synthesis pipeline with enterprise data residency, Azure Blob Storage for transcript storage, Power Automate for workflow orchestration (replacing Zapier), and Microsoft 365 Copilot for deliverable creation. The custom Python pipeline runs on an Azure App Service or Azure Functions instead of a standalone VM.
- COST: Potentially higher software costs (Azure OpenAI has same token pricing but Azure compute adds $30-100/month; Power Automate Premium is $15/user/month) but eliminates Otter.ai, Dovetail, and Zapier subscriptions.
- COMPLEXITY: Medium — requires Azure expertise (Tier 2-3 technician) but all components are within a single vendor ecosystem.
- CAPABILITY: Strongest data governance and compliance story (all data stays in Azure tenant, SOC 2/HIPAA/FedRAMP ready). Teams transcription quality is slightly below Otter.ai. No dedicated QDA platform means less collaborative analysis UX.
- RECOMMENDATION: Best for firms already deep in the Microsoft ecosystem, those with strict data sovereignty requirements (government consulting, healthcare), or MSPs with strong Azure practices. Choose this when compliance is the primary concern over user experience.
Enterprise QDA Platform Approach (NVivo or ATLAS.ti)
Use a traditional enterprise qualitative data analysis platform like NVivo 15 (Lumivero, ~$1,350-$2,500/license) or ATLAS.ti (~€1,100/year) as the primary analysis tool, supplemented by their newly added AI-powered auto-coding features. These platforms offer the deepest qualitative analysis capabilities including mixed-methods research, advanced coding frameworks, and publication-quality outputs.
- COST: Significantly higher software cost ($1,350-$2,500 per license vs. $15-$20/month for modern alternatives). Desktop-based licensing is less flexible than SaaS.
- COMPLEXITY: High learning curve — these are academic-grade research tools that require substantial training (budget 10-20 hours per user).
- CAPABILITY: Most powerful analysis features available; handles complex mixed-methods research that simpler tools cannot; AI features are newer and less mature than purpose-built AI platforms; strong export and visualization capabilities.
- RECOMMENDATION: Only for professional services firms doing rigorous academic-style research (policy consulting, evaluation firms, market research) where analytical depth and methodological rigor are more important than speed and cost. Overkill for typical strategy consulting interview synthesis.
Build-from-Scratch Custom Application
Develop a fully custom web application using the OpenAI Whisper API for transcription, GPT-5.4 for analysis, a React/Next.js frontend for the analysis interface, and a PostgreSQL database for storing all interview data and analysis results. This gives maximum customization and can be white-labeled by the MSP for multiple clients.
- COST: Highest upfront investment ($15,000-$40,000 in development) but potentially lowest per-client marginal cost once built.
- COMPLEXITY: Very high — requires a full-stack developer, 8-16 weeks of development, and ongoing maintenance.
- CAPABILITY: Maximum flexibility — can implement any analysis methodology, any output format, any integration. Can be white-labeled and resold across the MSP's client base as a proprietary product.
- RECOMMENDATION: Only viable if the MSP has development resources and plans to deploy this across 5+ professional services clients. The ROI threshold is approximately 5 clients at $500/month to recover development costs within 12 months. Not recommended for a single-client engagement.
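The ROI threshold above checks out arithmetically; a one-line helper makes the break-even explicit:

```python
# Months until cumulative per-client fees cover the development investment.
def months_to_recover(dev_cost, clients, monthly_fee):
    return dev_cost / (clients * monthly_fee)
```

Five clients at $500/month yield $2,500/month, so a $30,000 build is recovered in 12 months; the $15k-40k development range maps to roughly 6-16 months at that run rate.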