
Implementation Guide: Transcribe daily production standup meetings and generate shift handoff notes
Step-by-step implementation guide for deploying AI to transcribe daily production standup meetings and generate shift handoff notes for Manufacturing clients.
Hardware Procurement
Jabra Speak2 75 Wireless Speakerphone
$250 MSP cost via distribution (Ingram/D&H) / $349 suggested resale to client
Primary audio capture device for standup meetings. Features 4 beamforming noise-cancelling microphones, IP64 dust and splash resistance (critical for manufacturing-adjacent environments), up to 32 hours wireless battery, and 98-ft wireless range via the included Link 390 dongle. Supports up to 8 participants in a standup circle. Microsoft Teams and Zoom certified.
Intel NUC 13 Pro Mini PC (or equivalent)
$350 MSP cost / $500 suggested resale to client
Dedicated capture station that runs the recording client application, connects to the Jabra speakerphone via USB, records audio locally, and sends it to the transcription API. Runs a lightweight Python service or scheduled task. Mounted in or near the standup meeting area with ethernet or Wi-Fi connectivity. Can also serve as a local buffer if network connectivity is intermittent.
Wall-Mount Bracket and Cable Lock Kit
$40 MSP cost / $75 suggested resale
Secure the Mini PC to the wall or underside of a shelf in the standup area to prevent theft or accidental damage in a manufacturing environment. The cable lock secures the Jabra speakerphone to the table or shelf.
Recording In Progress LED Sign
$30 MSP cost / $60 suggested resale
Compliance requirement: clearly visible indicator that the meeting is being recorded. Powered via USB from the Mini PC and activated by the recording script when capture begins. Required for employee notification in all-party consent states and as best practice everywhere.
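The recording script can switch the sign through an inexpensive USB relay. A minimal sketch, assuming a relay that speaks the common LCUS-1-style 4-byte serial protocol (the specific relay model and COM port are assumptions; verify against the device actually purchased, and note that pyserial is a third-party dependency):

```python
# Sketch: drive a 'Recording in Progress' sign via a serial USB relay.
# Assumed protocol (LCUS-1 style): frame = [0xA0, channel, state, checksum],
# where checksum = (sum of the first three bytes) & 0xFF.
def relay_frame(channel: int = 1, on: bool = True) -> bytes:
    """Build the command frame for the given relay channel and state."""
    body = [0xA0, channel, 0x01 if on else 0x00]
    return bytes(body + [sum(body) & 0xFF])

def set_recording_sign(on: bool, port: str = 'COM3') -> None:
    """Write the frame to the relay's serial port (requires pyserial)."""
    import serial  # third-party: pip install pyserial
    with serial.Serial(port, 9600, timeout=1) as s:
        s.write(relay_frame(1, on))
```

The capture agent would call set_recording_sign(True) just before recording starts and set_recording_sign(False) in a finally block so the sign never stays lit after a crash.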
Software Procurement
Deepgram Nova-3 Speech-to-Text API
$0.0043/minute (~$1.50/month). $200 free signup credits.
Primary speech-to-text engine. Nova-3 provides industry-leading accuracy with built-in speaker diarization (essential for identifying which speaker reported which issue), punctuation, and paragraphing. Supports streaming and pre-recorded modes. Manufacturing-specific vocabulary can be added via keywords boosting. License type: Usage-based API (pay-per-minute). Diarization included at no extra cost. Estimated usage: 15-min standup × 22 workdays = ~330 minutes/month.
OpenAI GPT-5.4 API
$2.50/1M input tokens, $10/1M output tokens. A 15-min transcript ≈ 3,000 tokens input + ~800 tokens output = ~$0.015 per meeting. ~$0.33/month for 22 meetings.
Summarization and structuring engine. Takes the raw diarized transcript and applies a manufacturing-specific prompt to extract structured shift handoff data: equipment status, safety issues, production targets, quality concerns, action items with owners, and open issues for the next shift.
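The per-meeting and monthly figures follow directly from the published per-token and per-minute rates; a quick sanity check of the arithmetic (the exact cents round slightly differently from the figures above):

```python
# Cost sanity check for the estimates above (USD).
INPUT_PER_M, OUTPUT_PER_M = 2.50, 10.00   # $/1M tokens, per the pricing above
tokens_in, tokens_out = 3_000, 800        # estimated per 15-min meeting

per_meeting = tokens_in / 1e6 * INPUT_PER_M + tokens_out / 1e6 * OUTPUT_PER_M
monthly_llm = per_meeting * 22            # 22 workdays/month
monthly_stt = 330 * 0.0043                # Deepgram: ~330 min/month at $0.0043/min

print(f'{per_meeting:.4f}')   # per-meeting summarization cost, ~0.0155
print(f'{monthly_llm:.2f}')   # monthly summarization cost, ~0.34
print(f'{monthly_stt:.2f}')   # monthly transcription cost, ~1.42
```

Even with generous padding for retries and longer meetings, total API spend stays well under the $20/month usage limits recommended in Step 2.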
Microsoft 365 Business Premium
$22/user/month (assumed existing). Only the service account and shift supervisors need licenses.
Provides Microsoft Teams (delivery channel for shift handoff notes), SharePoint Online (archival and search of transcripts and summaries), Power Automate (workflow orchestration), and Azure AD/Entra ID (authentication). Assumed to already be in place at the client site.
Microsoft Power Automate (included in M365 Business Premium)
$0 if M365 Business Premium is in place; $15/user/month for Premium connectors (e.g., HTTP webhook trigger, custom connectors to ERP)
Workflow automation engine. Triggers on new files in SharePoint (the structured JSON handoff note), posts formatted Adaptive Card to the Teams shift channel, archives transcript, and optionally pushes data to ERP/MES/CMMS via HTTP connector.
Python 3.11+ Runtime
Free
Runtime for the capture agent script running on the Mini PC. Handles audio recording, file management, API calls to Deepgram and OpenAI, and output of structured JSON to SharePoint.
Azure Blob Storage (optional, for audio archival)
$0.018/GB/month for Hot tier. ~4 GB/year of audio = ~$0.07/month. Negligible cost.
Long-term archival of raw audio files for compliance, dispute resolution, or re-processing. Lifecycle policies auto-tier to Cool ($0.01/GB) after 30 days and Archive ($0.002/GB) after 90 days.
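The tiering schedule above can be expressed as an Azure Storage lifecycle management policy. A sketch of the rule JSON (the container name standup-audio is an assumption), which could be applied with az storage account management-policy create --policy @policy.json:

```json
{
  "rules": [
    {
      "name": "standup-audio-tiering",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["standup-audio/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": {"daysAfterModificationGreaterThan": 30},
            "tierToArchive": {"daysAfterModificationGreaterThan": 90}
          }
        }
      }
    }
  ]
}
```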
Prerequisites
- Microsoft 365 Business Premium (or Standard + Power Automate Premium) deployed and operational for at least the shift supervisors and a service account
- Reliable network connectivity (wired Ethernet preferred, Wi-Fi acceptable) in or near the standup meeting area with at least 5 Mbps upload bandwidth
- A Microsoft Teams team and channel structure for production/shift communication already established (or willingness to create one)
- Client legal/HR review and sign-off on employee recording notification policy — MSP will provide a template but client legal must approve for their specific state(s) and any union/CBA requirements
- If the facility is in a two-party/all-party consent state (CA, DE, FL, IL, MD, MA, MT, NH, OR, PA, WA), written employee consent forms must be executed before go-live
- Azure AD / Entra ID tenant with Global Admin or Application Admin access for registering the service application and creating a service account
- SharePoint Online site designated for transcript and handoff note storage, with appropriate permissions for the service account
- If ERP/MES/CMMS integration is desired in Phase 2: API documentation and credentials for the target system, plus a test environment
- OpenAI API account created and funded (minimum $10 prepaid credit) at https://platform.openai.com/
- Deepgram API account created at https://console.deepgram.com/ ($200 free credits available on signup)
- Physical access to the standup meeting area for hardware installation, including a 120V AC outlet within 6 feet and a flat surface or wall-mount location for the Mini PC
- Client-designated 'champion' — typically a shift supervisor or production manager — who will participate in testing and provide feedback on handoff note quality
Installation Steps
...
Step 1: Compliance and Legal Setup
Before any technology is deployed, complete the legal and HR prerequisites for recording employee meetings. This step is non-negotiable and must be completed before hardware installation.
CRITICAL: If the facility is in CA, DE, FL, IL, MD, MA, MT, NH, OR, PA, or WA, all-party consent is required. If the workforce is unionized, the recording policy may need to be bargained with the union under NLRA Section 7 — consult labor counsel. For ITAR/defense manufacturing, STOP and assess whether standup discussions may contain CUI — if so, skip to the 'On-Premises Whisper' alternative approach. Do NOT proceed with cloud-based transcription for ITAR-controlled content.
Step 2: Provision Cloud API Accounts
Create and configure the Deepgram and OpenAI API accounts that will power transcription and summarization. Generate API keys and store them securely.
Store API keys in a password manager (e.g., IT Glue, Hudu, or Keeper). Never hardcode keys in scripts — we will use environment variables on the Mini PC. Set usage limits in both platforms: Deepgram > Settings > Usage Limits (set to $20/month to start); OpenAI > Settings > Limits (set monthly budget to $20).
Step 3: Configure Microsoft 365 Infrastructure
Create a dedicated service account, Teams channel, and SharePoint document library for the solution. This service account will be used by the capture agent to upload files and trigger automations.
If the client has Conditional Access policies, ensure the service account is excluded or has appropriate exceptions for non-interactive sign-in. For smaller deployments, you can use a webhook-based approach (incoming webhook on the Teams channel) instead of Graph API — this is simpler but less flexible. The retention policy should be reviewed with client legal; SOX-relevant manufacturers may need 7-year retention.
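For the webhook-based approach, the agent simply POSTs an Adaptive Card wrapped in the Teams incoming-webhook message envelope. A minimal sketch (the two-field card body is illustrative, not the production template used by delivery.py):

```python
def build_webhook_payload(summary: str, shift: str) -> dict:
    """Wrap a minimal Adaptive Card in the Teams incoming-webhook envelope."""
    card = {
        '$schema': 'http://adaptivecards.io/schemas/adaptive-card.json',
        'type': 'AdaptiveCard',
        'version': '1.4',
        'body': [
            {'type': 'TextBlock', 'size': 'Large', 'weight': 'Bolder',
             'text': f'{shift.title()} Shift Handoff'},
            {'type': 'TextBlock', 'wrap': True, 'text': summary},
        ],
    }
    return {
        'type': 'message',
        'attachments': [{
            'contentType': 'application/vnd.microsoft.card.adaptive',
            'content': card,
        }],
    }

def post_handoff(webhook_url: str, summary: str, shift: str) -> None:
    """POST the card to the channel's incoming webhook (requires requests)."""
    import requests  # third-party: pip install requests
    resp = requests.post(webhook_url,
                         json=build_webhook_payload(summary, shift), timeout=30)
    resp.raise_for_status()
```

The trade-off versus Graph API: no app registration or service-account token handling, but the webhook can only post to its one channel and cannot read, update, or archive anything.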
Step 4: Hardware Installation — Mini PC and Speakerphone
Physically install the Intel NUC Mini PC and Jabra Speak2 75 speakerphone in the standup meeting area. Configure the Mini PC with the operating system and base software.
winget install Python.Python.3.11
python --version
pip --version
pip install deepgram-sdk openai python-dotenv sounddevice soundfile numpy requests msal
python -c "import sounddevice as sd; print(sd.query_devices())"
python -c "import sounddevice as sd; import soundfile as sf; data = sd.rec(int(5*16000), samplerate=16000, channels=1, device=2); sd.wait(); sf.write('test.wav', data, 16000); print('Test recording saved.')"
Position the Jabra Speak2 75 centrally where standup participants gather — typically on a high table or mounted shelf at chest height. The 4-beamforming mic array works best when participants are within 8 feet. Avoid placing near loud machinery (compressors, conveyors) — if the standup area is directly on the production floor, consider relocating to a nearby breakroom or office. The IP64 rating protects against dust and splashing but not immersion. Run a 30-second test recording during active production to verify noise cancellation is adequate. If audio quality is poor, upgrade to the EPOS EXPAND 80T ($489) which has 6 adaptive beamforming microphones.
Step 5: Configure Environment Variables and Secrets
Set up the environment variables on the Mini PC that the capture agent will use for API credentials and configuration. This keeps secrets out of the codebase.
mkdir C:\StandupCapture
cd C:\StandupCapture
echo DEEPGRAM_API_KEY=your_deepgram_key_here > .env
echo OPENAI_API_KEY=your_openai_key_here >> .env
echo AUDIO_DEVICE_INDEX=2 >> .env
echo RECORDING_DURATION_MINUTES=20 >> .env
echo SAMPLE_RATE=16000 >> .env
echo SHAREPOINT_SITE_ID=your_sharepoint_site_id >> .env
echo SHAREPOINT_TENANT_ID=your_tenant_id >> .env
echo SHAREPOINT_CLIENT_ID=your_app_client_id >> .env
echo SHAREPOINT_CLIENT_SECRET=your_app_client_secret >> .env
echo TEAMS_WEBHOOK_URL=https://clientdomain.webhook.office.com/webhookb2/xxx >> .env
echo STANDUP_SCHEDULE=06:45,14:45,22:45 >> .env
echo MAX_RECORDING_MINUTES=30 >> .env
echo OUTPUT_DIR=C:\StandupCapture\recordings >> .env
icacls .env /inheritance:r /grant:r "%USERNAME%:F" /grant:r "SYSTEM:F"
mkdir recordings
mkdir transcripts
mkdir handoff_notes
The STANDUP_SCHEDULE variable holds the expected start times for each shift's standup meeting. The capture agent can be triggered manually or auto-started at these times via Windows Task Scheduler. RECORDING_DURATION_MINUTES=20 provides a buffer beyond the typical 15-minute standup. The agent includes silence detection to auto-stop if the meeting ends early. For Linux deployments, use chmod 600 .env and store in /opt/standup-capture/.
Step 6: Deploy the Capture Agent Application
Install the main Python application that orchestrates the entire pipeline: audio recording, transcription via Deepgram, summarization via GPT-5.4, and output delivery to Teams and SharePoint. The application is deployed as a Windows service or systemd service.
echo deepgram-sdk==3.4.0 > requirements.txt
echo openai==1.40.0 >> requirements.txt
echo python-dotenv==1.0.1 >> requirements.txt
echo sounddevice==0.4.7 >> requirements.txt
echo soundfile==0.12.1 >> requirements.txt
echo numpy==1.26.4 >> requirements.txt
echo requests==2.32.3 >> requirements.txt
echo msal==1.28.0 >> requirements.txt
echo schedule==1.2.2 >> requirements.txt
pip install -r requirements.txt
python capture_agent.py --test --duration 60
nssm install StandupCaptureAgent "C:\Python311\python.exe" "C:\StandupCapture\capture_agent.py --service"
nssm set StandupCaptureAgent AppDirectory "C:\StandupCapture"
nssm set StandupCaptureAgent Start SERVICE_AUTO_START
nssm set StandupCaptureAgent AppStdout "C:\StandupCapture\logs\service.log"
nssm set StandupCaptureAgent AppStderr "C:\StandupCapture\logs\error.log"
nssm start StandupCaptureAgent
The capture_agent.py script runs as a persistent service. It uses the 'schedule' library to trigger recordings at the configured standup times, or it can be triggered manually via a desktop shortcut or physical button press. The --test flag runs a single short capture cycle for validation. See the custom_ai_components section for the complete source code of all Python modules. On Linux, use a systemd unit file instead of NSSM.
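For Linux deployments, a sketch of an equivalent systemd unit (the install path /opt/standup-capture and the service user standup are assumptions to adapt per site):

```ini
# /etc/systemd/system/standup-capture.service (hypothetical paths)
[Unit]
Description=Manufacturing Standup Capture Agent
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=standup
WorkingDirectory=/opt/standup-capture
ExecStart=/usr/bin/python3 /opt/standup-capture/capture_agent.py --service
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target
```

Enable with: sudo systemctl enable --now standup-capture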
Step 7: Configure Windows Task Scheduler (Alternative to Service)
If running as a Windows service is not preferred, configure Task Scheduler to launch the capture agent at each shift's standup time. This is simpler but less flexible than the service approach.
schtasks /create /tn "StandupCapture-Morning" /tr "C:\Python311\python.exe C:\StandupCapture\capture_agent.py --once" /sc daily /st 06:45 /ru SYSTEM
schtasks /create /tn "StandupCapture-Afternoon" /tr "C:\Python311\python.exe C:\StandupCapture\capture_agent.py --once" /sc daily /st 14:45 /ru SYSTEM
schtasks /create /tn "StandupCapture-Night" /tr "C:\Python311\python.exe C:\StandupCapture\capture_agent.py --once" /sc daily /st 22:45 /ru SYSTEM
schtasks /query /tn "StandupCapture-Morning" /v
The --once flag tells the agent to record a single meeting and then exit, as opposed to --service which runs continuously. Adjust times to match the client's actual standup schedule. Schedule each task 5 minutes before the typical meeting start so the recording captures the beginning. The agent's silence detection will auto-stop recording after 2 minutes of continuous silence.
Step 8: Configure Power Automate Workflows
Set up Microsoft Power Automate flows to process the structured handoff notes generated by the capture agent and distribute them through additional channels beyond the direct Teams webhook.
Flow 1: Archive Notification Flow
Flow 2: Weekly Digest Flow
Flow 3 (Optional): CMMS Work Order Creation
Power Automate flows are configured in the browser at https://make.powerautomate.com/. The capture agent posts the structured JSON handoff note to SharePoint, which triggers these downstream flows. For clients on M365 Business Premium, standard connectors (SharePoint, Teams, Outlook, Planner) are included. The HTTP connector for CMMS/ERP integration requires Power Automate Premium ($15/user/month) for the service account only. Export flow definitions as ZIP packages and store in the MSP's documentation system for redeployment.
Step 9: Manufacturing Vocabulary and Speaker Profile Configuration
Optimize transcription accuracy by configuring Deepgram keyword boosting for manufacturing-specific terminology and setting up speaker profiles for regular standup attendees.
keywords = [
'CNC:2', 'PLC:2', 'OEE:2', 'TPM:1.5',
'changeover:1.5', 'downtime:2', 'scrap rate:2',
'first pass yield:2', 'cycle time:1.5',
'work order:1.5', 'lot number:2', 'batch:1.5',
'preventive maintenance:1.5', 'lockout tagout:2',
'PPE:2', 'near miss:2', 'safety incident:2',
'quality hold:2', 'nonconformance:2',
# Add client-specific terms:
'Line 1:1.5', 'Line 2:1.5', 'Line 3:1.5',
'Robot Cell A:1.5', 'Fanuc:1.5', 'Haas:1.5',
# Add product names, part numbers, etc.
]
# Speaker diarization settings:
# diarize=True, diarize_version='latest',
# utterances=True, detect_language=True
# speaker_map.json: maps Deepgram speaker labels to actual attendee names
{
"Speaker 0": "Mike R. (Day Shift Supervisor)",
"Speaker 1": "Sarah L. (Night Shift Supervisor)",
"Speaker 2": "Tom K. (Maintenance Lead)",
"Speaker 3": "Jenny W. (Quality Manager)"
}
Deepgram's keyword boosting uses a numeric intensity from -10 to 10 (default 1.5 is moderate boost). Start with the terms listed above plus client-specific equipment, product, and personnel names. Review the first 5 transcripts with the client champion to identify misrecognized terms and add them to the boosting list. Speaker diarization labels speakers as 'Speaker 0', 'Speaker 1', etc. — the speaker_map.json file maps these to actual names. This mapping must be recalibrated when new regular attendees join standup meetings. The summarization prompt instructs GPT-5.4 to use these mapped names in the handoff notes.
Step 10: End-to-End Integration Test and Go-Live
Conduct a full end-to-end test with actual production standup participants, validate all outputs, and transition to production operation.
python capture_agent.py --once --verbose
- Check C:\StandupCapture\recordings\ for the WAV file
- Check C:\StandupCapture\transcripts\ for the raw transcript JSON
- Check C:\StandupCapture\handoff_notes\ for the structured handoff JSON
- Check Teams > Shift Handoff Notes channel for the posted Adaptive Card
- Check SharePoint > Standup Transcripts Archive for uploaded files
- Check the incoming shift supervisor's email for the handoff notification
- Calculate Word Error Rate (WER) informally: count errors per 100 words
- Target: < 10% WER for manufacturing-specific content
- If WER > 15%, investigate: mic placement, background noise, keyword boosting
Plan for a 1-week parallel operation period where both the AI-generated handoff notes and the existing manual process run simultaneously. This builds confidence with shift supervisors and allows for prompt tuning. During this week, the client champion should review every handoff note and provide feedback. Common issues to watch for: crosstalk causing speaker misattribution, manufacturing jargon not being recognized, and the summarizer missing implicit action items. All of these are addressed by prompt iteration and keyword boosting adjustments.
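The informal WER check described above can be scripted with a word-level edit distance; a minimal sketch (the reference transcript must come from a human listener, and the sample phrases are illustrative):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference words."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Standard Levenshtein dynamic program over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, 1):
            cur[j] = min(prev[j] + 1,             # deletion
                         cur[j - 1] + 1,          # insertion
                         prev[j - 1] + (r != h))  # substitution
        prev = cur
    return prev[-1] / max(len(ref), 1)

# Example: one substituted word out of eight reference words -> 12.5% WER
print(wer('line two is down for a bearing change',
          'line two is down for a baring change'))  # 0.125
```

Run this over 3-5 human-corrected transcripts during the parallel week; if the result stays above the 15% threshold, revisit mic placement and keyword boosting before go-live.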
Custom AI Components
Capture Agent (capture_agent.py)
Type: agent The core orchestration agent that runs on the Mini PC. It manages the recording lifecycle: starts audio capture at scheduled times or on manual trigger, monitors for silence to detect meeting end, saves the audio file, sends it to Deepgram for transcription, passes the transcript to the summarizer, and triggers the delivery module to post results to Teams and SharePoint. Runs as a Windows service or scheduled task.
Implementation:
# Full recording, transcription, summarization, and delivery pipeline
#!/usr/bin/env python3
"""Standup Meeting Capture Agent for Manufacturing Shift Handoffs."""
import os
import sys
import json
import logging
import argparse
import time
from datetime import datetime, timedelta
from pathlib import Path
import numpy as np
import sounddevice as sd
import soundfile as sf
import schedule
from dotenv import load_dotenv
from deepgram import DeepgramClient, PrerecordedOptions, FileSource
from summarizer import generate_handoff_note
from delivery import post_to_teams, upload_to_sharepoint
# Load environment
load_dotenv()
# Configuration
AUDIO_DEVICE_INDEX = int(os.getenv('AUDIO_DEVICE_INDEX', '2'))
SAMPLE_RATE = int(os.getenv('SAMPLE_RATE', '16000'))
MAX_RECORDING_MINUTES = int(os.getenv('MAX_RECORDING_MINUTES', '30'))
SILENCE_THRESHOLD = 0.01 # RMS threshold for silence detection
SILENCE_TIMEOUT_SECONDS = 120 # Stop after 2 min continuous silence
OUTPUT_DIR = Path(os.getenv('OUTPUT_DIR', './recordings'))
TRANSCRIPT_DIR = Path('./transcripts')
HANDOFF_DIR = Path('./handoff_notes')
LOG_DIR = Path('./logs')
# Ensure directories exist
for d in [OUTPUT_DIR, TRANSCRIPT_DIR, HANDOFF_DIR, LOG_DIR]:
d.mkdir(parents=True, exist_ok=True)
# Logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s [%(levelname)s] %(message)s',
handlers=[
logging.FileHandler(LOG_DIR / 'capture_agent.log'),
logging.StreamHandler()
]
)
logger = logging.getLogger(__name__)
def get_shift_name() -> str:
"""Determine current shift based on time of day."""
hour = datetime.now().hour
if 5 <= hour < 13:
return 'day'
elif 13 <= hour < 21:
return 'afternoon'
else:
return 'night'
def record_meeting(duration_minutes: int = None) -> Path:
"""Record audio from the Jabra speakerphone with silence detection."""
duration = duration_minutes or MAX_RECORDING_MINUTES
max_samples = int(duration * 60 * SAMPLE_RATE)
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
shift = get_shift_name()
filename = OUTPUT_DIR / f'standup_{shift}_{timestamp}.wav'
logger.info(f'Starting recording: {filename} (max {duration} min, device {AUDIO_DEVICE_INDEX})')
# Record in chunks for silence detection
chunk_duration = 5 # seconds per chunk
chunk_samples = int(chunk_duration * SAMPLE_RATE)
all_audio = []
silence_start = None
recording_started = False
try:
total_recorded = 0
while total_recorded < max_samples:
chunk = sd.rec(
chunk_samples,
samplerate=SAMPLE_RATE,
channels=1,
dtype='float32',
device=AUDIO_DEVICE_INDEX
)
sd.wait()
rms = np.sqrt(np.mean(chunk**2))
if rms > SILENCE_THRESHOLD:
recording_started = True
silence_start = None
all_audio.append(chunk)
total_recorded += chunk_samples
logger.debug(f'Audio chunk recorded. RMS: {rms:.4f}, Total: {total_recorded/SAMPLE_RATE:.0f}s')
else:
all_audio.append(chunk) # Keep silence in recording for natural flow
total_recorded += chunk_samples
if recording_started:
if silence_start is None:
silence_start = time.time()
elif time.time() - silence_start > SILENCE_TIMEOUT_SECONDS:
logger.info(f'Silence detected for {SILENCE_TIMEOUT_SECONDS}s. Stopping recording.')
break
if not all_audio:
logger.warning('No audio captured.')
return None
audio_data = np.concatenate(all_audio)
sf.write(str(filename), audio_data, SAMPLE_RATE)
duration_actual = len(audio_data) / SAMPLE_RATE
logger.info(f'Recording saved: {filename} ({duration_actual:.1f} seconds)')
return filename
except Exception as e:
logger.error(f'Recording error: {e}')
return None
def transcribe_audio(audio_path: Path) -> dict:
"""Send audio to Deepgram Nova-3 for transcription with speaker diarization."""
logger.info(f'Transcribing: {audio_path}')
try:
deepgram = DeepgramClient(os.getenv('DEEPGRAM_API_KEY'))
with open(audio_path, 'rb') as f:
buffer_data = f.read()
payload = {'buffer': buffer_data}
options = PrerecordedOptions(
model='nova-3',
language='en-US',
smart_format=True,
punctuate=True,
paragraphs=True,
diarize=True,
utterances=True,
detect_language=True,
keywords=[
'CNC:2', 'PLC:2', 'OEE:2', 'TPM:1.5',
'changeover:1.5', 'downtime:2', 'scrap rate:2',
'first pass yield:2', 'cycle time:1.5',
'work order:1.5', 'lot number:2', 'batch:1.5',
'preventive maintenance:1.5', 'lockout tagout:2',
'PPE:2', 'near miss:2', 'safety incident:2',
'quality hold:2', 'nonconformance:2'
]
)
response = deepgram.listen.rest.v('1').transcribe_file(payload, options)
result = response.to_dict()
# Save raw transcript
transcript_path = TRANSCRIPT_DIR / f'{audio_path.stem}_transcript.json'
with open(transcript_path, 'w') as f:
json.dump(result, f, indent=2)
logger.info(f'Transcript saved: {transcript_path}')
return result
except Exception as e:
logger.error(f'Transcription error: {e}')
return None
def load_speaker_map() -> dict:
"""Load speaker name mapping from configuration file."""
map_path = Path('./speaker_map.json')
if map_path.exists():
with open(map_path) as f:
return json.load(f)
return {}
def format_diarized_transcript(deepgram_result: dict) -> str:
"""Convert Deepgram diarized output to a readable speaker-labeled transcript."""
speaker_map = load_speaker_map()
utterances = deepgram_result.get('results', {}).get('utterances', [])
lines = []
for utt in utterances:
speaker_id = f"Speaker {utt.get('speaker', '?')}"
speaker_name = speaker_map.get(speaker_id, speaker_id)
text = utt.get('transcript', '')
start_time = utt.get('start', 0)
minutes = int(start_time // 60)
seconds = int(start_time % 60)
lines.append(f'[{minutes:02d}:{seconds:02d}] {speaker_name}: {text}')
return '\n'.join(lines)
def run_pipeline(duration_minutes: int = None, verbose: bool = False):
"""Execute the full capture-transcribe-summarize-deliver pipeline."""
pipeline_start = datetime.now()
shift = get_shift_name()
logger.info(f'=== Pipeline started: {shift} shift standup ===')
# Step 1: Record
audio_path = record_meeting(duration_minutes)
if not audio_path:
logger.error('Recording failed. Pipeline aborted.')
return
# Step 2: Transcribe
transcript_result = transcribe_audio(audio_path)
if not transcript_result:
logger.error('Transcription failed. Pipeline aborted.')
return
# Step 3: Format diarized transcript
formatted_transcript = format_diarized_transcript(transcript_result)
if verbose:
print('\n--- TRANSCRIPT ---')
print(formatted_transcript)
# Step 4: Summarize into structured handoff note
handoff_note = generate_handoff_note(
transcript=formatted_transcript,
shift=shift,
meeting_date=pipeline_start.strftime('%Y-%m-%d'),
meeting_time=pipeline_start.strftime('%H:%M')
)
if not handoff_note:
logger.error('Summarization failed. Pipeline aborted.')
return
# Save handoff note
handoff_path = HANDOFF_DIR / f'handoff_{shift}_{pipeline_start.strftime("%Y%m%d_%H%M%S")}.json'
with open(handoff_path, 'w') as f:
json.dump(handoff_note, f, indent=2)
if verbose:
print('\n--- HANDOFF NOTE ---')
print(json.dumps(handoff_note, indent=2))
# Step 5: Deliver
try:
post_to_teams(handoff_note)
logger.info('Posted to Teams successfully.')
except Exception as e:
logger.error(f'Teams delivery failed: {e}')
try:
upload_to_sharepoint(audio_path, handoff_path, formatted_transcript)
logger.info('Uploaded to SharePoint successfully.')
except Exception as e:
logger.error(f'SharePoint upload failed: {e}')
elapsed = (datetime.now() - pipeline_start).total_seconds()
logger.info(f'=== Pipeline completed in {elapsed:.1f}s ===')
def run_service():
"""Run as a persistent service, triggering at scheduled times."""
standup_times = os.getenv('STANDUP_SCHEDULE', '06:45,14:45').split(',')
for t in standup_times:
t = t.strip()
schedule.every().day.at(t).do(run_pipeline)
logger.info(f'Scheduled standup capture at {t}')
logger.info('Capture agent service started. Waiting for scheduled times...')
while True:
schedule.run_pending()
time.sleep(30)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Manufacturing Standup Capture Agent')
parser.add_argument('--service', action='store_true', help='Run as persistent service')
parser.add_argument('--once', action='store_true', help='Run single capture cycle')
parser.add_argument('--test', action='store_true', help='Run test capture (short duration)')
parser.add_argument('--duration', type=int, default=None, help='Recording duration in seconds')
parser.add_argument('--verbose', action='store_true', help='Print outputs to console')
args = parser.parse_args()
if args.service:
run_service()
elif args.test:
run_pipeline(duration_minutes=(args.duration or 60) / 60, verbose=True)
elif args.once:
run_pipeline(duration_minutes=args.duration, verbose=args.verbose)
else:
parser.print_help()
Shift Handoff Summarizer (summarizer.py)
Type: prompt The GPT-5.4-powered summarization module that takes a raw diarized transcript from a manufacturing standup meeting and produces a structured JSON shift handoff note. The prompt is specifically engineered for manufacturing contexts, extracting equipment status, safety issues, production metrics, quality concerns, and action items with accountability assignments.
Implementation:
#!/usr/bin/env python3
"""Manufacturing Shift Handoff Note Generator using GPT-5.4."""
import os
import json
import logging
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
logger = logging.getLogger(__name__)
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
SYSTEM_PROMPT = """You are an expert manufacturing shift handoff report generator. You analyze transcribed production standup meeting discussions and produce structured shift handoff notes.
You must extract and organize information into the following categories:
1. **Production Status**: Current production rates, targets, variances, lot/batch numbers discussed
2. **Equipment Status**: Machine status, breakdowns, scheduled maintenance, changeovers
3. **Safety Issues**: Near misses, incidents, PPE concerns, lockout/tagout status, hazard observations
4. **Quality Issues**: Defects, scrap rates, holds, nonconformances, customer complaints
5. **Action Items**: Tasks assigned to specific people with deadlines
6. **Open Issues for Next Shift**: Unresolved problems the incoming shift must address
7. **Staffing Notes**: Absences, overtime, temporary workers, training needs
Rules:
- Only include information explicitly discussed in the transcript. Do NOT infer or fabricate.
- Attribute statements to the speaker who said them using the speaker labels in the transcript.
- For action items, clearly state WHO is responsible and WHAT they must do.
- Flag any safety issue with severity: 'critical' (immediate danger), 'high' (needs attention this shift), 'medium' (monitor), or 'low' (informational).
- Flag any equipment issue with severity: 'down' (not running), 'degraded' (running with issues), 'scheduled' (planned maintenance), or 'ok'.
- If production numbers are mentioned, include them with units.
- Use plain, direct language appropriate for a factory floor audience.
- Output valid JSON only. No markdown, no commentary."""
USER_PROMPT_TEMPLATE = """Analyze the following production standup meeting transcript and generate a structured shift handoff note.
Meeting Date: {meeting_date}
Meeting Time: {meeting_time}
Shift: {shift} shift
--- TRANSCRIPT START ---
{transcript}
--- TRANSCRIPT END ---
Generate the shift handoff note as a JSON object with this exact structure:
{{
"handoff_metadata": {{
"date": "{meeting_date}",
"time": "{meeting_time}",
"outgoing_shift": "{shift}",
"incoming_shift": "<determine from context or leave as next shift>",
"meeting_duration_minutes": <estimated from timestamps>,
"attendees": ["list of speaker names from transcript"]
}},
"production_status": [
{{
"line_or_area": "<production line or area>",
"status": "<running/down/changeover>",
"current_output": "<number with units if mentioned>",
"target_output": "<number with units if mentioned>",
"variance": "<ahead/behind/on-target>",
"notes": "<additional context>",
"reported_by": "<speaker name>"
}}
],
"equipment_status": [
{{
"equipment_name": "<machine or system name>",
"status": "<down/degraded/scheduled/ok>",
"issue_description": "<what's wrong or what's planned>",
"estimated_resolution": "<when it should be fixed if mentioned>",
"reported_by": "<speaker name>"
}}
],
"safety_issues": [
{{
"description": "<what happened or was observed>",
"severity": "<critical/high/medium/low>",
"location": "<where in the facility>",
"action_required": "<what needs to be done>",
"reported_by": "<speaker name>"
}}
],
"quality_issues": [
{{
"description": "<defect, hold, or nonconformance details>",
"affected_product": "<part number, lot, or product name if mentioned>",
"action_required": "<what needs to be done>",
"reported_by": "<speaker name>"
}}
],
"action_items": [
{{
"task": "<specific task description>",
"assigned_to": "<person name>",
"deadline": "<when it's due if mentioned, otherwise 'this shift'>",
"priority": "<high/medium/low>",
"context": "<why this task matters>"
}}
],
"open_issues_for_next_shift": [
{{
"issue": "<description of unresolved issue>",
"context": "<background information>",
"recommended_action": "<what the next shift should do>"
}}
],
"staffing_notes": [
{{
"note": "<absence, overtime, training, or staffing change>",
"impact": "<how this affects operations>"
}}
],
"summary": "<2-3 sentence executive summary of the most important points from this standup for the incoming shift supervisor>"
}}
IMPORTANT: If a category has no relevant items from the transcript, use an empty array []. Do not omit any category. Output ONLY the JSON object."""
def generate_handoff_note(
    transcript: str,
    shift: str,
    meeting_date: str,
    meeting_time: str
) -> dict:
    """Generate a structured shift handoff note from a meeting transcript."""
    user_prompt = USER_PROMPT_TEMPLATE.format(
        transcript=transcript,
        shift=shift,
        meeting_date=meeting_date,
        meeting_time=meeting_time
    )
    try:
        response = client.chat.completions.create(
            model='gpt-5.4',
            messages=[
                {'role': 'system', 'content': SYSTEM_PROMPT},
                {'role': 'user', 'content': user_prompt}
            ],
            temperature=0.1,  # Low temperature for consistent, factual extraction
            max_tokens=4000,
            response_format={'type': 'json_object'}
        )
        content = response.choices[0].message.content
        handoff_note = json.loads(content)
        # Validate required keys; backfill the correct empty type for any that are missing
        required_keys = [
            'handoff_metadata', 'production_status', 'equipment_status',
            'safety_issues', 'quality_issues', 'action_items',
            'open_issues_for_next_shift', 'staffing_notes', 'summary'
        ]
        for key in required_keys:
            if key not in handoff_note:
                if key == 'summary':
                    handoff_note[key] = ''
                elif key == 'handoff_metadata':
                    handoff_note[key] = {}
                else:
                    handoff_note[key] = []
        logger.info(f'Handoff note generated: {len(handoff_note.get("action_items", []))} action items, '
                    f'{len(handoff_note.get("safety_issues", []))} safety issues, '
                    f'{len(handoff_note.get("equipment_status", []))} equipment items')
        # Log token usage for cost tracking
        usage = response.usage
        logger.info(f'Token usage - Input: {usage.prompt_tokens}, Output: {usage.completion_tokens}, '
                    f'Est. cost: ${(usage.prompt_tokens * 2.5 + usage.completion_tokens * 10) / 1_000_000:.4f}')
        return handoff_note
    except json.JSONDecodeError as e:
        logger.error(f'Failed to parse GPT-5.4 response as JSON: {e}')
        logger.error(f'Raw response: {content[:500]}')
        return None
    except Exception as e:
        logger.error(f'Summarization error: {e}')
        return None
Delivery Module (delivery.py)
Type: integration
Handles posting the formatted shift handoff note to Microsoft Teams via an incoming webhook (formatted as an Adaptive Card) and uploading the raw audio, transcript, and handoff note files to SharePoint Online via the Microsoft Graph API. Also supports optional email delivery.
Implementation
#!/usr/bin/env python3
"""Delivery module: Teams webhook, SharePoint upload, and email notification."""
import os
import json
import logging
from datetime import datetime
from pathlib import Path
import requests
from msal import ConfidentialClientApplication
from dotenv import load_dotenv
load_dotenv()
logger = logging.getLogger(__name__)
# Configuration
TEAMS_WEBHOOK_URL = os.getenv('TEAMS_WEBHOOK_URL')
SHAREPOINT_SITE_ID = os.getenv('SHAREPOINT_SITE_ID')
TENANT_ID = os.getenv('SHAREPOINT_TENANT_ID')
CLIENT_ID = os.getenv('SHAREPOINT_CLIENT_ID')
CLIENT_SECRET = os.getenv('SHAREPOINT_CLIENT_SECRET')
GRAPH_API_BASE = 'https://graph.microsoft.com/v1.0'
def get_graph_token() -> str:
    """Acquire a Microsoft Graph API token using MSAL."""
    app = ConfidentialClientApplication(
        CLIENT_ID,
        authority=f'https://login.microsoftonline.com/{TENANT_ID}',
        client_credential=CLIENT_SECRET
    )
    result = app.acquire_token_for_client(scopes=['https://graph.microsoft.com/.default'])
    if 'access_token' in result:
        return result['access_token']
    raise Exception(f'Failed to acquire Graph token: {result.get("error_description", "Unknown error")}')
def build_adaptive_card(handoff_note: dict) -> dict:
    """Build a Teams Adaptive Card from the structured handoff note."""
    metadata = handoff_note.get('handoff_metadata', {})
    summary = handoff_note.get('summary', 'No summary available.')
    safety_issues = handoff_note.get('safety_issues', [])
    equipment_status = handoff_note.get('equipment_status', [])
    action_items = handoff_note.get('action_items', [])
    production_status = handoff_note.get('production_status', [])
    open_issues = handoff_note.get('open_issues_for_next_shift', [])
    # Build safety section with color coding
    safety_facts = []
    for issue in safety_issues:
        severity_emoji = {'critical': '🔴', 'high': '🟠', 'medium': '🟡', 'low': '🟢'}.get(issue.get('severity', ''), '⚪')
        safety_facts.append({
            'type': 'TextBlock',
            'text': f"{severity_emoji} **{issue.get('severity', 'unknown').upper()}**: {issue.get('description', '')} — *{issue.get('action_required', '')}*",
            'wrap': True,
            'size': 'Small'
        })
    # Build equipment section
    equipment_facts = []
    for eq in equipment_status:
        status_emoji = {'down': '🔴', 'degraded': '🟠', 'scheduled': '🔵', 'ok': '🟢'}.get(eq.get('status', ''), '⚪')
        equipment_facts.append({
            'type': 'TextBlock',
            'text': f"{status_emoji} **{eq.get('equipment_name', '')}**: {eq.get('issue_description', 'No issues')} (ETA: {eq.get('estimated_resolution', 'N/A')})",
            'wrap': True,
            'size': 'Small'
        })
    # Build action items section
    action_facts = []
    for item in action_items:
        priority_emoji = {'high': '🔴', 'medium': '🟡', 'low': '🟢'}.get(item.get('priority', ''), '⚪')
        action_facts.append({
            'type': 'TextBlock',
            'text': f"{priority_emoji} **{item.get('assigned_to', 'Unassigned')}**: {item.get('task', '')} (Due: {item.get('deadline', 'TBD')})",
            'wrap': True,
            'size': 'Small'
        })
    card = {
        'type': 'message',
        'attachments': [{
            'contentType': 'application/vnd.microsoft.card.adaptive',
            'contentUrl': None,
            'content': {
                '$schema': 'http://adaptivecards.io/schemas/adaptive-card.json',
                'type': 'AdaptiveCard',
                'version': '1.4',
                'body': [
                    {
                        'type': 'TextBlock',
                        'text': f"🏭 Shift Handoff: {metadata.get('outgoing_shift', '').title()} → {metadata.get('incoming_shift', 'Next')} Shift",
                        'weight': 'Bolder',
                        'size': 'Large',
                        'wrap': True
                    },
                    {
                        'type': 'TextBlock',
                        'text': f"📅 {metadata.get('date', '')} at {metadata.get('time', '')} | Attendees: {', '.join(metadata.get('attendees', []))}",
                        'size': 'Small',
                        'isSubtle': True,
                        'wrap': True
                    },
                    {
                        'type': 'TextBlock',
                        'text': f"**Summary:** {summary}",
                        'wrap': True,
                        'separator': True
                    },
                    {
                        'type': 'TextBlock',
                        'text': '⚠️ **Safety Issues**' if safety_facts else '✅ **No Safety Issues Reported**',
                        'weight': 'Bolder',
                        'separator': True
                    },
                    *safety_facts,
                    {
                        'type': 'TextBlock',
                        'text': '🔧 **Equipment Status**',
                        'weight': 'Bolder',
                        'separator': True
                    },
                    *(equipment_facts if equipment_facts else [{'type': 'TextBlock', 'text': 'All equipment running normally.', 'size': 'Small'}]),
                    {
                        'type': 'TextBlock',
                        'text': '📋 **Action Items**',
                        'weight': 'Bolder',
                        'separator': True
                    },
                    *(action_facts if action_facts else [{'type': 'TextBlock', 'text': 'No action items assigned.', 'size': 'Small'}]),
                    {
                        'type': 'TextBlock',
                        'text': f"📌 **Open Issues for Next Shift:** {len(open_issues)} item(s)",
                        'weight': 'Bolder',
                        'separator': True
                    },
                    *[{'type': 'TextBlock', 'text': f"• {oi.get('issue', '')} — *{oi.get('recommended_action', '')}*", 'wrap': True, 'size': 'Small'} for oi in open_issues]
                ]
            }
        }]
    }
    return card
def post_to_teams(handoff_note: dict):
    """Post the handoff note as an Adaptive Card to Teams via incoming webhook."""
    if not TEAMS_WEBHOOK_URL:
        logger.warning('TEAMS_WEBHOOK_URL not configured. Skipping Teams delivery.')
        return
    card = build_adaptive_card(handoff_note)
    response = requests.post(
        TEAMS_WEBHOOK_URL,
        json=card,
        headers={'Content-Type': 'application/json'},
        timeout=30
    )
    if response.status_code in (200, 202):
        logger.info('Handoff note posted to Teams successfully.')
    else:
        logger.error(f'Teams webhook failed: {response.status_code} - {response.text}')
        raise Exception(f'Teams webhook returned {response.status_code}')
def upload_to_sharepoint(audio_path: Path, handoff_path: Path, transcript_text: str):
    """Upload audio, handoff note, and transcript to SharePoint document libraries."""
    if not all([SHAREPOINT_SITE_ID, TENANT_ID, CLIENT_ID, CLIENT_SECRET]):
        logger.warning('SharePoint credentials not fully configured. Skipping upload.')
        return
    token = get_graph_token()
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/octet-stream'
    }
    date_folder = datetime.now().strftime('%Y/%m')
    # Upload audio file
    if audio_path and audio_path.exists():
        audio_url = f"{GRAPH_API_BASE}/sites/{SHAREPOINT_SITE_ID}/drive/root:/Audio-Archive/{date_folder}/{audio_path.name}:/content"
        with open(audio_path, 'rb') as f:
            resp = requests.put(audio_url, headers=headers, data=f, timeout=120)
        if resp.status_code in (200, 201):
            logger.info(f'Audio uploaded to SharePoint: {audio_path.name}')
        else:
            logger.error(f'Audio upload failed: {resp.status_code} - {resp.text[:200]}')
    # Upload handoff note JSON
    if handoff_path and handoff_path.exists():
        note_url = f"{GRAPH_API_BASE}/sites/{SHAREPOINT_SITE_ID}/drive/root:/Handoff-Notes/{date_folder}/{handoff_path.name}:/content"
        with open(handoff_path, 'rb') as f:
            resp = requests.put(note_url, headers=headers, data=f, timeout=30)
        if resp.status_code in (200, 201):
            logger.info(f'Handoff note uploaded to SharePoint: {handoff_path.name}')
        else:
            logger.error(f'Handoff note upload failed: {resp.status_code} - {resp.text[:200]}')
    # Upload transcript as text file
    if transcript_text:
        transcript_filename = f"{handoff_path.stem.replace('handoff_', 'transcript_')}.txt"
        transcript_url = f"{GRAPH_API_BASE}/sites/{SHAREPOINT_SITE_ID}/drive/root:/Raw-Transcripts/{date_folder}/{transcript_filename}:/content"
        headers_text = {**headers, 'Content-Type': 'text/plain'}
        resp = requests.put(transcript_url, headers=headers_text, data=transcript_text.encode('utf-8'), timeout=30)
        if resp.status_code in (200, 201):
            logger.info(f'Transcript uploaded to SharePoint: {transcript_filename}')
        else:
            logger.error(f'Transcript upload failed: {resp.status_code} - {resp.text[:200]}')
if __name__ == '__main__':
    # Test with a sample handoff note
    sample_note = {
        'handoff_metadata': {
            'date': '2025-01-15',
            'time': '06:50',
            'outgoing_shift': 'night',
            'incoming_shift': 'day',
            'meeting_duration_minutes': 14,
            'attendees': ['Mike R.', 'Sarah L.', 'Tom K.']
        },
        'summary': 'Night shift completed 450 of 500 target units on Line 1. CNC-04 is down awaiting a spindle bearing replacement. Near miss reported in packaging area — forklift clearance issue being addressed.',
        'safety_issues': [{'description': 'Forklift near miss in packaging', 'severity': 'high', 'location': 'Packaging Area B', 'action_required': 'Add floor markings for pedestrian path', 'reported_by': 'Sarah L.'}],
        'equipment_status': [{'equipment_name': 'CNC-04', 'status': 'down', 'issue_description': 'Spindle bearing failure', 'estimated_resolution': 'Noon today — parts on order', 'reported_by': 'Tom K.'}],
        'action_items': [{'task': 'Call bearing supplier to confirm delivery ETA', 'assigned_to': 'Tom K.', 'deadline': '8:00 AM', 'priority': 'high', 'context': 'CNC-04 down'}],
        'production_status': [{'line_or_area': 'Line 1', 'status': 'running', 'current_output': '450 units', 'target_output': '500 units', 'variance': 'behind', 'notes': '10% behind due to CNC-04 downtime', 'reported_by': 'Mike R.'}],
        'quality_issues': [],
        'open_issues_for_next_shift': [{'issue': 'CNC-04 spindle bearing replacement pending', 'context': 'Parts ordered, ETA noon', 'recommended_action': 'Redirect CNC-04 jobs to CNC-02 until repair complete'}],
        'staffing_notes': []
    }
    post_to_teams(sample_note)
    print('Test card posted to Teams.')
Weekly Production Summary Prompt
Type: prompt
A Power Automate-compatible prompt template used in the weekly digest flow. It takes the week's accumulated handoff notes and produces a weekly production summary for plant management. Designed to be called via Power Automate's HTTP action to the OpenAI API.
Implementation
SYSTEM PROMPT:
You are a manufacturing operations analyst. You analyze a week's worth of shift handoff notes and produce a concise weekly production summary for plant management.
Focus on:
1. Overall production performance vs. targets (calculate weekly totals and variances)
2. Recurring equipment issues (identify patterns across the week)
3. Safety trend analysis (are incidents increasing or decreasing?)
4. Quality metrics summary
5. Key accomplishments
6. Recommendations for next week
Format the output as a clear, professional report suitable for a plant manager. Use bullet points, not paragraphs. Include specific numbers where available.
USER PROMPT TEMPLATE (Power Automate dynamic content in curly braces):
Analyze the following shift handoff notes from the week of {WeekStartDate} to {WeekEndDate}:
{ConcatenatedHandoffNotes}
Produce a weekly production summary report.
HTTP ACTION REQUEST BODY:
{
  "model": "gpt-5.4",
  "messages": [
    {"role": "system", "content": "<system prompt above>"},
    {"role": "user", "content": "<user prompt with dynamic content>"}
  ],
  "temperature": 0.2,
  "max_tokens": 3000
}
HTTP ACTION SETTINGS:
- Method: POST
- URI: https://api.openai.com/v1/chat/completions
- Headers: Authorization: Bearer {OpenAI_API_Key}, Content-Type: application/json
Estimated cost per weekly summary: ~$0.03 (approximately 5,000 input tokens from concatenated notes + 1,500 output tokens).
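The estimate above works out as follows, assuming the same GPT-5.4 rates used in summarizer.py's cost logging ($2.50 per million input tokens, $10.00 per million output tokens — verify current pricing before quoting the client):

```python
def estimate_summary_cost(input_tokens: int, output_tokens: int,
                          input_rate: float = 2.50, output_rate: float = 10.00) -> float:
    """Estimate per-call cost in USD; rates are USD per million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# A week of concatenated handoff notes: ~5,000 tokens in, ~1,500 tokens out
weekly = estimate_summary_cost(5_000, 1_500)  # 0.0125 + 0.0150 = $0.0275, i.e. ~$0.03
print(f'${weekly:.4f} per weekly summary, ~${weekly * 52:.2f} per year')
```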
CMMS Work Order Auto-Creator
Type: workflow
A Power Automate flow that monitors newly uploaded handoff notes in SharePoint for high-severity equipment issues and automatically creates work orders in the client's CMMS system. This is an optional Phase 2 enhancement that demonstrates downstream integration value.
Implementation
- Trigger: When a file is created (SharePoint) — Library: Handoff-Notes
Step 1: Get File Content
- Action: Get file content (SharePoint)
- Site: Standup Transcripts Archive
- File Identifier: @{triggerOutputs()?['body/{Identifier}']}
Step 2: Parse JSON
- Action: Parse JSON
- Content: @{body('Get_file_content')}
- Schema: (use the handoff note JSON schema from summarizer.py)
Step 3: Filter Equipment Issues
- Action: Filter Array
- From: @{body('Parse_JSON')?['equipment_status']}
- Condition: @item()?['status'] is equal to 'down' OR @item()?['status'] is equal to 'degraded'
Step 4: Apply to Each (filtered equipment issues)
- For each item in @{body('Filter_array')}
Step 4a: HTTP POST to CMMS API
- Method: POST
- URI: https://{cmms_instance}.upkeep.com/api/v2/work-orders (example for UpKeep CMMS)
- Headers: Session-Token: {CMMS_API_Key}, Content-Type: application/json
{
  "title": "[Auto] @{items('Apply_to_each')?['equipment_name']} - @{items('Apply_to_each')?['status']}",
  "description": "Reported during @{body('Parse_JSON')?['handoff_metadata']?['outgoing_shift']} shift standup on @{body('Parse_JSON')?['handoff_metadata']?['date']}\n\nIssue: @{items('Apply_to_each')?['issue_description']}\nReported by: @{items('Apply_to_each')?['reported_by']}\nEstimated Resolution: @{items('Apply_to_each')?['estimated_resolution']}",
  "priority": 1,
  "category": "Breakdown",
  "dueDate": "@{addDays(utcNow(), 1)}"
}
Step 4b: Post Notification to Teams
- Action: Post message in a chat or channel (Teams)
- Team: Plant Floor Operations
- Channel: Maintenance Alerts
- Message: "🔧 Auto-created work order for @{items('Apply_to_each')?['equipment_name']} (@{items('Apply_to_each')?['status']}): @{items('Apply_to_each')?['issue_description']}"
Step 5: Condition — Safety Critical Check
- Condition: length(body('Parse_JSON')?['safety_issues']) is greater than 0
- If yes: Send email to EHS Manager with all safety issues formatted in a table
- Priority: High
- Subject: "⚠️ Safety Issue(s) Reported — @{body('Parse_JSON')?['handoff_metadata']?['outgoing_shift']} Shift Standup @{body('Parse_JSON')?['handoff_metadata']?['date']}"
Adapt the CMMS API endpoint and payload to the client's specific CMMS platform (UpKeep, Fiix, Limble, etc.). The flow requires Power Automate Premium plan ($15/month) for the HTTP connector.
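For clients who would rather avoid the Power Automate Premium license, the same filter-and-create logic can run server-side on the Mini PC immediately after summarization. This is a sketch only: `CMMS_URL`, `CMMS_API_KEY`, and the payload fields are illustrative placeholders modeled on the UpKeep-style example above, not a real client library — adapt them to the client's CMMS API.

```python
CMMS_URL = 'https://example.upkeep.com/api/v2/work-orders'  # illustrative endpoint
CMMS_API_KEY = 'your-session-token'                          # placeholder credential

def equipment_needing_work_orders(handoff_note: dict) -> list:
    """Mirror the flow's Filter Array step: equipment that is down or degraded."""
    return [eq for eq in handoff_note.get('equipment_status', [])
            if eq.get('status') in ('down', 'degraded')]

def create_work_orders(handoff_note: dict) -> int:
    """POST one work order per filtered equipment issue; return how many succeeded."""
    import requests  # deferred so the filter logic is usable without the dependency
    meta = handoff_note.get('handoff_metadata', {})
    created = 0
    for eq in equipment_needing_work_orders(handoff_note):
        payload = {
            'title': f"[Auto] {eq.get('equipment_name', 'Unknown')} - {eq.get('status')}",
            'description': (f"Reported during {meta.get('outgoing_shift', '?')} shift standup "
                            f"on {meta.get('date', '?')}\n\nIssue: {eq.get('issue_description', '')}\n"
                            f"Reported by: {eq.get('reported_by', '')}"),
            'priority': 1,
            'category': 'Breakdown',
        }
        resp = requests.post(CMMS_URL, json=payload,
                             headers={'Session-Token': CMMS_API_KEY}, timeout=30)
        if resp.ok:
            created += 1
    return created
```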
Testing & Validation
- AUDIO QUALITY TEST: Record a 2-minute test with 3+ people speaking at normal standup distance (4-8 feet from the Jabra Speak2 75) while production equipment is running in the background. Play back the recording and verify all speakers are clearly audible and background noise is suppressed. If any speaker is unintelligible, reposition the microphone or upgrade to EPOS EXPAND 80T.
- TRANSCRIPTION ACCURACY TEST: Record a scripted 5-minute test meeting where participants read a prepared script containing 20 manufacturing-specific terms (CNC, OEE, changeover, lockout tagout, scrap rate, lot number, etc.). Compare the Deepgram transcript to the script and calculate Word Error Rate. Target: < 10% WER overall and < 15% WER on manufacturing-specific terms. If WER exceeds thresholds, add terms to the keyword boosting list and retest.
- SPEAKER DIARIZATION TEST: Have 4 participants take turns speaking (each says 2-3 sentences). Verify the transcript correctly identifies speaker changes. Diarization accuracy target: > 90% of speaker transitions correctly identified. Note: Deepgram diarization works best when speakers don't talk over each other — coach participants to avoid crosstalk.
- SUMMARIZATION QUALITY TEST: Take a real (or realistic) 15-minute standup transcript and run it through the GPT-5.4 summarizer. Have the shift supervisor who participated in the meeting review the output and confirm: (a) no fabricated information, (b) all discussed topics are captured, (c) action items are correctly attributed, (d) safety issues are properly flagged. Repeat with 3 different meetings.
- TEAMS DELIVERY TEST: Run the delivery module with the sample handoff note (included in delivery.py). Verify the Adaptive Card appears correctly in the designated Teams channel with proper formatting, emoji indicators, and all sections populated. Test on both desktop and mobile Teams clients.
- SHAREPOINT UPLOAD TEST: Verify that after a full pipeline run, three files appear in the correct SharePoint document libraries: (1) WAV audio file in Audio-Archive/{YYYY}/{MM}/, (2) transcript TXT in Raw-Transcripts/{YYYY}/{MM}/, (3) handoff JSON in Handoff-Notes/{YYYY}/{MM}/. Verify file metadata and permissions are correct.
- SILENCE DETECTION TEST: Start the capture agent and let it run for 3 minutes with no one speaking. Verify that it auto-stops after the configured SILENCE_TIMEOUT_SECONDS (120 seconds) and does not produce a useless empty transcript.
- END-TO-END LATENCY TEST: Time the full pipeline from meeting end to Teams notification. Target: < 3 minutes for a 15-minute recording (transcription ~30-60 seconds, summarization ~10-20 seconds, delivery ~5 seconds). If latency exceeds 5 minutes, check network bandwidth and API response times.
- POWER AUTOMATE FLOW TEST: Manually upload a sample handoff JSON to the SharePoint Handoff-Notes library and verify that the Power Automate flows trigger correctly: (1) email notification sent to incoming shift supervisor, (2) Planner tasks created for each action item, (3) if safety issues are present, urgent email sent to EHS manager.
- FAILURE RECOVERY TEST: Disconnect the network during a recording and verify the agent saves the audio locally, then retries transcription and delivery when connectivity is restored. Also test: what happens if the Deepgram API returns an error? If the OpenAI API is rate-limited? The agent should log errors and retry with exponential backoff.
- COST TRACKING TEST: After one week of daily standups, check the Deepgram and OpenAI usage dashboards. Verify actual costs align with estimates (~$1.50/month Deepgram, ~$0.33/month OpenAI for 22 meetings). If costs are significantly higher, investigate: are recordings running longer than expected? Is the summarization prompt too verbose?
- CONCURRENT MEETING TEST: If the client runs standup meetings at shift overlap times, verify the capture agent handles back-to-back recordings correctly (finishes processing the first before starting the second). Ensure scheduled tasks have adequate buffer time.
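The silence auto-stop behavior exercised in the silence detection test above amounts to an RMS energy gate over incoming PCM frames. A minimal sketch, assuming 16-bit little-endian mono audio; `SILENCE_TIMEOUT_SECONDS` matches the capture agent's configured value, while the RMS threshold and frame duration are illustrative and must be tuned against actual plant-floor noise:

```python
import math
import struct

SILENCE_TIMEOUT_SECONDS = 120   # matches the capture agent's configured timeout
SILENCE_RMS_THRESHOLD = 500     # illustrative; tune against real background noise
FRAME_SECONDS = 0.5             # audio duration represented by each frame

def frame_rms(frame: bytes) -> float:
    """RMS energy of a 16-bit little-endian PCM frame."""
    samples = struct.unpack(f'<{len(frame) // 2}h', frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

class SilenceGate:
    """Track consecutive quiet time and signal when recording should stop."""
    def __init__(self):
        self.quiet_seconds = 0.0

    def should_stop(self, frame: bytes) -> bool:
        if frame_rms(frame) < SILENCE_RMS_THRESHOLD:
            self.quiet_seconds += FRAME_SECONDS
        else:
            self.quiet_seconds = 0.0  # any speech resets the timer
        return self.quiet_seconds >= SILENCE_TIMEOUT_SECONDS
```

In the real agent this gate would sit in the audio read loop, with the threshold calibrated during the audio quality test while production equipment is running.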
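The retry behavior called for in the failure recovery test can be sketched as a generic exponential-backoff decorator wrapped around the Deepgram and OpenAI calls. The attempt count and delays below are illustrative defaults, not the agent's actual configuration:

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)

def with_backoff(max_attempts: int = 4, base_delay: float = 2.0):
    """Retry a flaky call (e.g., a transcription or summarization request)
    with exponentially increasing delays: 2s, 4s, 8s, ..."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as e:
                    if attempt == max_attempts:
                        raise  # out of retries; let the caller log and queue for later
                    delay = base_delay * (2 ** (attempt - 1))
                    logger.warning(f'{fn.__name__} failed ({e}); '
                                   f'retry {attempt}/{max_attempts - 1} in {delay}s')
                    time.sleep(delay)
        return wrapper
    return decorator
```

Pair this with local audio buffering: if all retries fail, the WAV stays on disk and the next scheduled run picks it up.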
Client Handoff
Client Handoff Meeting Agenda (60-90 minutes with Shift Supervisors and Plant Manager):
Maintenance
Ongoing MSP Maintenance Responsibilities:
Weekly (15 minutes)
- Review the capture agent logs (C:\StandupCapture\logs\) for errors or warnings
- Check Deepgram and OpenAI API usage dashboards against expected consumption
- Verify the Teams channel is receiving handoff notes daily (quick visual check)
Monthly (30-60 minutes)
- Review transcription accuracy with the client champion — have them flag any persistent errors
- Update keyword boosting list with any new manufacturing terms, product names, or equipment identified by the client
- Update speaker_map.json if personnel have changed
- Review and apply Windows updates to the Mini PC (schedule during a non-standup window)
- Rotate API keys if required by the client's security policy (recommended every 90 days minimum)
- Check and update the Jabra Speak2 75 firmware via the Jabra Direct software
- Review Power Automate flow run history for failures
Quarterly (1-2 hours)
- Prompt optimization review: analyze a sample of 5 recent handoff notes with the client and refine the GPT-5.4 system prompt if output quality has drifted
- Review API pricing changes from Deepgram and OpenAI (these evolve frequently) and adjust client billing if necessary
- Test the full pipeline end-to-end to ensure all integrations are still functioning
- Review data retention: ensure old audio files are being archived/deleted per the retention policy
- Upgrade Python packages to latest stable versions (pip install --upgrade -r requirements.txt) and test
- Review Deepgram model updates (Nova-3 → future Nova-4) and test new models for accuracy improvements
Annually
- Hardware inspection: check the Jabra speakerphone for physical wear (manufacturing environments are hard on equipment), replace if needed
- Full compliance audit: verify recording consent records are current for all participating employees, update signage if needed
- Strategic review with client: discuss expanding the system to additional meeting types (safety committee meetings, quality reviews), additional shifts, or additional facilities
SLA Considerations
- Recommended SLA: 99% uptime for scheduled standup captures (allowing ~3 missed captures per year)
- Response time: 4-hour response for capture agent outages during business hours, next-business-day for non-critical issues
- Escalation path: L1 (client champion restarts the service) → L2 (MSP remote troubleshooting via RMM) → L3 (MSP on-site visit for hardware or network issues)
- Monthly cost to client: $199-$299/month managed service fee covers all of the above (API costs, monitoring, prompt tuning, quarterly reviews)
Monitoring (via RMM tool)
- Set up a monitoring check on the Windows service StandupCaptureAgent — alert if the service stops
- Monitor disk space on the Mini PC (audio files accumulate — ensure cleanup script runs weekly)
- Monitor SharePoint storage quotas
- Set up Deepgram and OpenAI billing alerts at 150% of expected monthly usage to catch anomalies
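The weekly audio cleanup mentioned in the disk-space check above can be a short scheduled script. This sketch assumes a hypothetical local recordings folder under C:\StandupCapture and a 90-day retention window — align both with the client's actual layout and retention policy, since SharePoint holds the archival copy:

```python
import time
from pathlib import Path

AUDIO_DIR = Path(r'C:\StandupCapture\recordings')  # assumed local audio folder
RETENTION_DAYS = 90                                # match the client's retention policy

def purge_old_audio(audio_dir: Path = AUDIO_DIR,
                    retention_days: int = RETENTION_DAYS) -> int:
    """Delete WAV files older than the retention window; return the count removed."""
    cutoff = time.time() - retention_days * 86_400
    removed = 0
    for wav in audio_dir.glob('*.wav'):
        if wav.stat().st_mtime < cutoff:
            wav.unlink()
            removed += 1
    return removed
```

Run it from Task Scheduler weekly and log the return value so the RMM disk-space alert has context.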
Alternatives
Microsoft Teams Premium + Copilot (Turnkey SaaS Approach)
Instead of building a custom pipeline, leverage Microsoft Teams for the standup meeting itself (even if in-person, join a Teams meeting from the room for recording). Teams Premium ($10/user/month add-on) provides automatic transcription and intelligent recap. Microsoft 365 Copilot ($30/user/month add-on) adds AI-generated meeting summaries, action items, and follow-up suggestions directly within Teams. No custom code, no separate audio hardware (use any Teams-certified room device).
Otter.ai Business (SaaS Transcription Platform)
Use Otter.ai Business ($20/seat/month) as the transcription and summarization platform instead of building a custom Deepgram + GPT-5.4 pipeline. Otter provides real-time transcription, speaker identification, AI-generated summaries and action items, and integrations with Zapier for downstream workflow automation. The Jabra speakerphone connects to a laptop running the Otter desktop app or a participant joins via the Otter mobile app.
On-Premises Whisper + Local LLM (Air-Gapped / ITAR Deployment)
For defense manufacturers or facilities with strict data sovereignty requirements, deploy the entire pipeline on-premises with no cloud dependencies. Use OpenAI Whisper (or Faster-Whisper) running on a local GPU server for transcription, and a locally hosted LLM (e.g., Llama 3.1 8B or Mistral 7B via Ollama) for summarization. All data stays within the facility's network perimeter.
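The on-premises variant can be sketched in a few functions. Everything here is an assumption-laden illustration: the Faster-Whisper model size, the Ollama endpoint and model tag, and the prompt wording are all placeholders to be validated against the facility's hardware and accuracy requirements:

```python
import json
import urllib.request

OLLAMA_URL = 'http://localhost:11434/api/generate'  # Ollama's default local endpoint
SUMMARY_INSTRUCTIONS = ('You are a manufacturing operations assistant. Extract safety '
                        'issues, equipment status, and action items from this shift '
                        'standup transcript as JSON.')

def build_summary_prompt(transcript: str) -> str:
    """Combine the handoff instructions with the transcript for the local LLM."""
    return f'{SUMMARY_INSTRUCTIONS}\n\nTRANSCRIPT:\n{transcript}'

def transcribe_local(audio_path: str) -> str:
    """Transcribe on-prem with Faster-Whisper; no audio leaves the facility."""
    from faster_whisper import WhisperModel  # pip install faster-whisper
    model = WhisperModel('medium.en', device='cuda')  # model size per GPU budget
    segments, _info = model.transcribe(audio_path)
    return ' '.join(seg.text for seg in segments)

def summarize_local(transcript: str, model: str = 'llama3.1:8b') -> str:
    """Summarize with a locally hosted model via Ollama's REST API."""
    payload = json.dumps({'model': model,
                          'prompt': build_summary_prompt(transcript),
                          'stream': False}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={'Content-Type': 'application/json'})
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.loads(resp.read())['response']
```

Expect the local 7B-8B models to need more prompt iteration than GPT-5.4 to produce reliably structured handoff notes, and budget GPU hardware accordingly.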
Fireflies.ai Business + Zapier Automation
Similar to the Otter.ai alternative but using Fireflies.ai ($19/seat/month billed annually) which offers superior integration options. Fireflies captures audio via its mobile app or desktop client, provides AI-generated summaries with topic tracking, and offers native integrations with Slack, Asana, Monday.com, HubSpot, and others. Zapier integration enables pushing data to manufacturing-specific systems.
Budget/Starter Approach: Voice Recorder + Manual Upload
The simplest possible implementation: use a quality portable voice recorder (e.g., Sony ICD-UX570 at ~$80) to record the standup, then have the shift supervisor manually upload the audio file to a SharePoint folder. A Power Automate flow detects the new file, sends it to the Deepgram API for transcription, passes it through GPT-5.4 for summarization, and posts the result to Teams. No dedicated Mini PC, no always-on service, no automated scheduling.