
Implementation Guide: Monitor supplier lead times and proactively adjust production schedules when delays are detected
Step-by-step implementation guide for deploying AI to monitor supplier lead times and proactively adjust production schedules when delays are detected, for Manufacturing clients.
Hardware Procurement
Dell PowerEdge T360 Tower Server
$4,200 MSP cost / $5,500 suggested resale
Primary on-premises server for hosting the n8n agent orchestration platform, CrewAI agent runtime, PostgreSQL database for historical lead-time analytics, and reverse proxy. Tower form factor is ideal for SMB manufacturing sites without dedicated server rooms. 64GB RAM supports concurrent agent processes and local data caching.
APC Smart-UPS 1500VA LCD RM 2U
$850 MSP cost / $1,100 suggested resale
Uninterruptible power supply to protect the agent server from power fluctuations common in manufacturing environments. Provides 15–20 minutes of runtime for graceful shutdown during outages and includes network management card for remote monitoring.
Cisco Catalyst 1300-24T-4G Managed Switch
$350 MSP cost / $475 suggested resale
Managed Gigabit switch for VLAN segmentation between IT network (agent server, ERP) and OT network (shop floor systems, IoT sensors). Enables QoS policies to prioritize ERP API traffic and agent communications.
Fortinet FortiGate 40F Next-Gen Firewall
$750 MSP cost / $1,000 suggested resale (includes 1-year UTP bundle)
Network security appliance providing firewall, IPS, web filtering, and SSL inspection for outbound API calls to cloud LLM services (OpenAI/Azure). Essential for CMMC/NIST compliance in defense supply chain manufacturers. Supports IPSec VPN for secure MSP remote management.
Software Procurement
n8n Self-Hosted (Business License)
$50/month — bundle into managed service fee
Visual workflow automation platform serving as the primary orchestration layer for all agent workflows. Handles EDI data ingestion, ERP API polling, webhook triggers, and human-in-the-loop approval flows. Self-hosted on client server for data sovereignty.
CrewAI Framework (Open Source + AMP)
$0 for framework + $99/month for AMP cloud observability dashboard
Multi-agent orchestration framework powering the core AI agents: Supplier Monitor Agent, Delay Analyzer Agent, Schedule Optimizer Agent, and Notification Agent. AMP provides agent performance monitoring, token usage tracking, and decision audit logs.
Azure OpenAI Service (GPT-5.4)
$100–$250/month estimated based on ~500 agent decisions/day ($2.50/M input tokens, $10.00/M output tokens) — mark up 25% in managed service
Large language model API providing reasoning capabilities for delay impact analysis, schedule optimization recommendations, and natural language notification generation. Azure deployment ensures enterprise compliance, data residency controls, and integration with Entra ID.
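The monthly estimate can be sanity-checked with basic token arithmetic. A minimal sketch, where the per-decision token counts are illustrative assumptions (only the per-million prices and the ~500 decisions/day figure come from the line item above):

```python
# Hypothetical sizing check for the Azure OpenAI line item.
# Token counts per agent decision are assumptions for illustration only.
def monthly_llm_cost(decisions_per_day, in_tokens, out_tokens,
                     in_price_per_m=2.50, out_price_per_m=10.00, days=30):
    """Estimate monthly LLM spend in USD from per-decision token usage."""
    per_decision = (in_tokens * in_price_per_m + out_tokens * out_price_per_m) / 1_000_000
    return round(decisions_per_day * per_decision * days, 2)

# ~500 decisions/day at an assumed ~3,000 input / ~600 output tokens each:
print(monthly_llm_cost(500, 3000, 600))  # 202.5 -> inside the quoted $100-$250 range
```

Re-run the calculation after Phase 2 with real token usage from the AMP dashboard before finalizing the managed-service markup.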
Microsoft 365 Business Premium (Teams integration)
$22/user/month (assumed already deployed; no incremental cost for webhook integration)
Microsoft Teams serves as the primary notification and human-approval interface. Production planners receive delay alerts and approve/reject schedule changes directly within Teams adaptive cards. Assumed pre-existing in client environment.
SPS Commerce Fulfillment (EDI Integration)
$200–$500/month depending on number of trading partners
EDI integration platform providing structured inbound ASN (Advanced Shipping Notice), PO Acknowledgment, and Ship Notice data from suppliers. Pre-built connectors for Epicor Kinetic, Dynamics 365, and Acumatica eliminate custom EDI parsing. Critical data source for the Supplier Monitor Agent.
PostgreSQL 16 (Open Source)
$0 (included in server deployment)
Relational database storing historical supplier lead time data, agent decision logs, schedule change audit trail, and performance metrics. Serves as the analytical backbone for delay pattern recognition and compliance audit reporting.
Azure Entra ID (P1)
$6/user/month (often bundled with M365 Business Premium)
Identity and access management for RBAC on the agent dashboard, SSO for n8n web interface, and conditional access policies ensuring only authorized planners can approve autonomous schedule changes.
Grafana OSS
$0 (self-hosted)
Monitoring and visualization dashboard for agent system health, API latency, decision throughput, and supplier performance KPIs. Connects to PostgreSQL and Prometheus for real-time operational visibility.
Prerequisites
- Client must have a modern, API-enabled ERP system with REST API access: Epicor Kinetic 2021+, Acumatica 2023 R1+, Dynamics 365 Finance & Operations, or SAP Business One 10.0+. Legacy ERPs without REST APIs require a middleware adapter (adds 4-6 weeks and $10,000-$20,000).
- Minimum 3 years of historical Purchase Order data accessible in the ERP, including: PO creation dates, requested delivery dates, actual receipt dates, supplier IDs, part numbers, and quantities. This data trains the delay prediction baseline.
- Standardized supplier master data in ERP with unique supplier IDs, primary contact information, and categorization (critical/non-critical). Data cleansing may be required as a pre-project engagement.
- Stable internet connectivity: minimum 50 Mbps symmetric with <50ms latency to Azure data centers. Redundant WAN (e.g., primary fiber + LTE failover) strongly recommended for production-critical agent operations.
- Internal Gigabit Ethernet LAN connecting the agent server to the ERP server/cloud gateway with <1ms latency. VLAN-capable switching infrastructure for IT/OT network segmentation.
- Microsoft 365 environment with Teams deployed to production planning staff. At minimum, a Teams channel dedicated to supply chain alerts must be created prior to Phase 2.
- Azure subscription (Pay-As-You-Go or EA) for Azure OpenAI Service access. Azure OpenAI requires a separate application/approval process — submit at least 2 weeks before Phase 2 begins.
- Designated production planner or supply chain manager as project champion with authority to define schedule change approval thresholds and participate in weekly feedback sessions throughout implementation.
- Network firewall rules allowing outbound HTTPS (port 443) to: api.openai.com or [customer].openai.azure.com, graph.microsoft.com, SPS Commerce endpoints, and ERP cloud endpoints (if applicable).
- Physical or rack space for the Dell PowerEdge T360 server: adequate ventilation, dedicated 120V/20A circuit, ambient temperature below 35°C. Manufacturing floor placement is NOT recommended — server room or office closet preferred.
- IT administrator credentials with API access to the ERP system, including permissions to read Purchase Orders, Work Orders, BOM data, and Inventory levels. Write access to Work Orders and Production Schedules required for Phase 4.
- Compliance requirements documented: if client is a DoD contractor (CMMC), ITAR-regulated, or publicly traded (SOX), additional audit logging and data residency constraints must be identified before architecture finalization.
Installation Steps
Step 1: Server Hardware Setup and OS Installation
Unbox and rack/place the Dell PowerEdge T360 server. Connect to UPS, managed switch, and KVM. Install Ubuntu Server 22.04 LTS as the base operating system. Configure RAID-1 mirroring across the two NVMe SSDs using the Dell PERC controller during BIOS setup. Set static IP address on the management VLAN.
sudo nano /etc/netplan/00-installer-config.yaml

network:
  version: 2
  ethernets:
    eno1:
      addresses: [192.168.10.50/24]
      routes:
        - to: default
          via: 192.168.10.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]

sudo netplan apply
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git wget htop net-tools ufw openssh-server
sudo ufw allow 22/tcp
sudo ufw allow 443/tcp
sudo ufw allow 5678/tcp comment 'n8n web UI'
sudo ufw allow 3000/tcp comment 'Grafana'
sudo ufw enable

Ensure BIOS is updated to the latest version via Dell Support before OS install. Enable Secure Boot if CMMC compliance is required. Document iDRAC IP and credentials in the client's password manager. Configure iDRAC email alerts to MSP monitoring inbox.
Step 2: Install Docker and Container Runtime
Install Docker Engine and Docker Compose on the server. All application components (n8n, PostgreSQL, CrewAI agents, Grafana, Prometheus) will run as Docker containers for easy deployment, isolation, and maintenance.
sudo apt install -y ca-certificates gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo useradd -m -s /bin/bash agentops
sudo usermod -aG docker agentops
docker --version
docker compose version
sudo mkdir -p /opt/supply-chain-agent/{n8n,crewai,postgres,grafana,prometheus,nginx,logs}
sudo chown -R agentops:agentops /opt/supply-chain-agent

Do NOT install Docker via snap — use the official apt repository for production stability. The agentops service account will own all agent processes. Document the directory structure in the client runbook.
Step 3: Deploy PostgreSQL Database
Deploy PostgreSQL 16 as a Docker container to store historical supplier lead time data, agent decision logs, and audit trails. Create the database schema with tables for supplier performance tracking, delay events, schedule changes, and agent activity logging.
cd /opt/supply-chain-agent

cat > docker-compose.yml << 'EOF'
version: '3.8'
services:
  postgres:
    image: postgres:16-alpine
    container_name: scm-postgres
    restart: unless-stopped
    environment:
      POSTGRES_DB: supply_chain_agent
      POSTGRES_USER: scm_admin
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - ./postgres/data:/var/lib/postgresql/data
      - ./postgres/init:/docker-entrypoint-initdb.d
    ports:
      - '5432:5432'
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U scm_admin -d supply_chain_agent']
      interval: 10s
      timeout: 5s
      retries: 5
EOF

cat > .env << 'EOF'
POSTGRES_PASSWORD=<GENERATE_STRONG_PASSWORD_HERE>
OPENAI_API_KEY=<AZURE_OPENAI_KEY_HERE>
ERP_API_BASE_URL=<CLIENT_ERP_API_ENDPOINT>
ERP_API_KEY=<CLIENT_ERP_API_KEY>
TEAMS_WEBHOOK_URL=<TEAMS_INCOMING_WEBHOOK_URL>
CLIENT_DOMAIN=<CLIENT_DOMAIN_HERE>
EOF
chmod 600 .env

cat > postgres/init/01-schema.sql << 'SQLEOF'
-- Supplier master data cache
CREATE TABLE suppliers (
  supplier_id VARCHAR(50) PRIMARY KEY,
  supplier_name VARCHAR(255) NOT NULL,
  category VARCHAR(50) DEFAULT 'standard',
  criticality VARCHAR(20) DEFAULT 'medium' CHECK (criticality IN ('critical','high','medium','low')),
  avg_lead_time_days NUMERIC(6,1),
  lead_time_std_dev NUMERIC(6,2),
  on_time_delivery_rate NUMERIC(5,2),
  last_synced_at TIMESTAMPTZ DEFAULT NOW(),
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Purchase order tracking
CREATE TABLE purchase_orders (
  po_id VARCHAR(50) PRIMARY KEY,
  erp_po_number VARCHAR(50) NOT NULL,
  supplier_id VARCHAR(50) REFERENCES suppliers(supplier_id),
  part_number VARCHAR(100),
  part_description TEXT,
  quantity NUMERIC(12,2),
  order_date DATE NOT NULL,
  requested_delivery_date DATE NOT NULL,
  confirmed_delivery_date DATE,
  actual_receipt_date DATE,
  current_status VARCHAR(30) DEFAULT 'open',
  delay_days INTEGER GENERATED ALWAYS AS (
    CASE WHEN actual_receipt_date IS NOT NULL THEN
      (actual_receipt_date - requested_delivery_date)
    WHEN confirmed_delivery_date IS NOT NULL THEN
      (confirmed_delivery_date - requested_delivery_date)
    ELSE NULL END
  ) STORED,
  last_checked_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_po_supplier ON purchase_orders(supplier_id);
CREATE INDEX idx_po_status ON purchase_orders(current_status);
CREATE INDEX idx_po_delivery ON purchase_orders(requested_delivery_date);

-- Delay events detected by agents
CREATE TABLE delay_events (
  event_id SERIAL PRIMARY KEY,
  po_id VARCHAR(50) REFERENCES purchase_orders(po_id),
  supplier_id VARCHAR(50) REFERENCES suppliers(supplier_id),
  detected_at TIMESTAMPTZ DEFAULT NOW(),
  delay_type VARCHAR(30) NOT NULL CHECK (delay_type IN ('confirmed_late','predicted_late','shipment_delayed','no_confirmation')),
  original_date DATE NOT NULL,
  new_expected_date DATE,
  delay_magnitude_days INTEGER,
  confidence_score NUMERIC(4,3),
  detection_method VARCHAR(50),
  raw_signal JSONB,
  resolved_at TIMESTAMPTZ,
  resolution_action VARCHAR(100)
);

-- Schedule adjustment recommendations and actions
CREATE TABLE schedule_adjustments (
  adjustment_id SERIAL PRIMARY KEY,
  delay_event_id INTEGER REFERENCES delay_events(event_id),
  work_order_id VARCHAR(50) NOT NULL,
  original_start_date DATE,
  original_end_date DATE,
  recommended_start_date DATE,
  recommended_end_date DATE,
  impact_description TEXT,
  priority VARCHAR(20) DEFAULT 'medium',
  status VARCHAR(30) DEFAULT 'pending' CHECK (status IN ('pending','approved','rejected','auto_executed','rolled_back')),
  approved_by VARCHAR(100),
  approved_at TIMESTAMPTZ,
  executed_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Agent decision audit log (compliance-critical)
CREATE TABLE agent_audit_log (
  log_id BIGSERIAL PRIMARY KEY,
  timestamp TIMESTAMPTZ DEFAULT NOW(),
  agent_name VARCHAR(100) NOT NULL,
  action_type VARCHAR(50) NOT NULL,
  input_summary JSONB,
  output_summary JSONB,
  llm_model VARCHAR(50),
  token_usage JSONB,
  decision_rationale TEXT,
  confidence_score NUMERIC(4,3),
  human_override BOOLEAN DEFAULT FALSE,
  execution_time_ms INTEGER
);
CREATE INDEX idx_audit_timestamp ON agent_audit_log(timestamp);
CREATE INDEX idx_audit_agent ON agent_audit_log(agent_name);

-- Supplier performance metrics (daily aggregation)
CREATE TABLE supplier_metrics_daily (
  metric_date DATE,
  supplier_id VARCHAR(50) REFERENCES suppliers(supplier_id),
  total_open_pos INTEGER,
  delayed_pos INTEGER,
  avg_delay_days NUMERIC(6,1),
  risk_score NUMERIC(4,3),
  PRIMARY KEY (metric_date, supplier_id)
);
SQLEOF

docker compose up -d postgres
sleep 10
docker exec scm-postgres psql -U scm_admin -d supply_chain_agent -c '\dt'

Generate a cryptographically strong password for POSTGRES_PASSWORD (use 'openssl rand -base64 32'). Store all credentials in the client's approved password manager (e.g., IT Glue, Hudu). The .env file contains secrets — ensure it is excluded from any version control and has restrictive file permissions (chmod 600).
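As a quick sanity check on the schema: the delay_days generated column prefers the actual receipt date, falls back to the supplier-confirmed date, and is NULL when neither is known. The same rule in Python, with hypothetical dates:

```python
from datetime import date

def delay_days(requested, confirmed=None, actual=None):
    """Mirror of the delay_days generated column: actual receipt wins,
    then the supplier-confirmed date; None if neither is known yet."""
    if actual is not None:
        return (actual - requested).days
    if confirmed is not None:
        return (confirmed - requested).days
    return None

print(delay_days(date(2025, 3, 1), actual=date(2025, 3, 8)))      # 7  (late)
print(delay_days(date(2025, 3, 1), confirmed=date(2025, 2, 27)))  # -2 (early)
print(delay_days(date(2025, 3, 1)))                               # None (open, unconfirmed)
```

Negative and zero values count as on-time deliveries in the downstream metrics.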
Step 4: Deploy n8n Workflow Automation Platform
Deploy n8n as a Docker container connected to the PostgreSQL database. n8n serves as the visual workflow orchestration layer, handling scheduled ERP API polling, webhook receivers for EDI/supplier updates, human-in-the-loop approval flows, and triggering the CrewAI agent crew. Configure SSL via Nginx reverse proxy.
cat >> docker-compose.yml << 'EOF'
  n8n:
    image: docker.n8n.io/n8nio/n8n:latest
    container_name: scm-n8n
    restart: unless-stopped
    environment:
      - N8N_HOST=agent.${CLIENT_DOMAIN}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://agent.${CLIENT_DOMAIN}/
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=supply_chain_agent
      - DB_POSTGRESDB_USER=scm_admin
      - DB_POSTGRESDB_PASSWORD=${POSTGRES_PASSWORD}
      - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
      - N8N_USER_MANAGEMENT_JWT_SECRET=${N8N_JWT_SECRET}
      - EXECUTIONS_DATA_PRUNE=true
      - EXECUTIONS_DATA_MAX_AGE=168
    volumes:
      - ./n8n/data:/home/node/.n8n
    ports:
      - '5678:5678'
    depends_on:
      postgres:
        condition: service_healthy
    links:
      - postgres
EOF
echo "N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)" >> .env
echo "N8N_JWT_SECRET=$(openssl rand -hex 32)" >> .env
docker compose up -d n8n
sleep 15
curl -s http://localhost:5678/healthz

After first launch, access n8n at http://<server-ip>:5678 to create the admin account. Use the client's IT admin email. For production, configure Nginx reverse proxy with Let's Encrypt SSL (step 5). The N8N_ENCRYPTION_KEY is critical — if lost, all stored credentials in n8n are irrecoverable. Back it up to the MSP's secure vault.
Step 5: Configure Nginx Reverse Proxy with SSL
Deploy Nginx as a reverse proxy to provide HTTPS access to n8n and Grafana. Use Let's Encrypt certificates via Certbot for automatic SSL provisioning and renewal. This secures all web-based management interfaces and webhook endpoints.

# Add nginx to docker-compose.yml
cat >> docker-compose.yml << 'EOF'
  nginx:
    image: nginx:alpine
    container_name: scm-nginx
    restart: unless-stopped
    ports:
      - '80:80'
      - '443:443'
    volumes:
      - ./nginx/conf.d:/etc/nginx/conf.d
...
Step 6: Configure ERP API Integration
Establish the bidirectional API connection to the client's ERP system. This step varies by ERP platform. We provide configurations for the three most common SMB manufacturing ERPs: Epicor Kinetic, Acumatica, and Dynamics 365. The integration must be able to read Purchase Orders, Work Orders, BOM data, and Supplier master data, and write schedule changes to Work Orders.
mkdir -p /opt/supply-chain-agent/crewai/integrations

# Epicor Kinetic: test read access to Purchase Orders
curl -X GET 'https://<epicor-server>/api/v2/odata/<company>/Erp.BO.POSvc/POs?$top=5&$select=PONum,VendorNum,OrderDate,DueDate,OpenOrder' \
  -H 'Authorization: Basic <base64-encoded-credentials>' \
  -H 'X-API-Key: <epicor-api-key>' \
  -H 'Content-Type: application/json'

# Acumatica: authenticate, then query Purchase Orders with the session cookie
curl -X POST 'https://<acumatica-instance>/entity/auth/login' \
  -H 'Content-Type: application/json' \
  -d '{"name":"apiuser","password":"<password>","company":"<company>"}'
curl -X GET 'https://<acumatica-instance>/entity/Default/24.200.001/PurchaseOrder?$top=5' \
  -H 'Content-Type: application/json' \
  --cookie 'cookie-from-login'

# Dynamics 365 F&O: obtain an OAuth2 token via client credentials
curl -X POST 'https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token' \
  -d 'client_id=<app-id>&scope=https://<d365-instance>.operations.dynamics.com/.default&client_secret=<secret>&grant_type=client_credentials'

# saved to /opt/supply-chain-agent/crewai/integrations/erp_config.yaml
erp_system: epicor_kinetic   # Options: epicor_kinetic, acumatica, dynamics365
base_url: https://<erp-server>/api/v2
auth_method: api_key         # Options: api_key, basic, oauth2
company_id: <COMPANY_ID>
api_endpoints:
  purchase_orders:
    list: /odata/{company}/Erp.BO.POSvc/POs
    detail: /odata/{company}/Erp.BO.POSvc/POs('{po_num}')
    fields: [PONum, VendorNum, VendorName, OrderDate, DueDate, PromiseDt, OpenOrder, OrderQty, ReceivedQty]
  work_orders:
    list: /odata/{company}/Erp.BO.JobEntrySvc/JobHeads
    detail: /odata/{company}/Erp.BO.JobEntrySvc/JobHeads('{job_num}')
    update: /odata/{company}/Erp.BO.JobEntrySvc/JobHeads('{job_num}')
    fields: [JobNum, PartNum, ProdQty, StartDate, DueDate, JobClosed, JobReleased, SchedCode]
  suppliers:
    list: /odata/{company}/Erp.BO.VendorSvc/Vendors
    fields: [VendorNum, VendorID, Name, Address1, City, State, Country]
  bom:
    list: /odata/{company}/Erp.BO.EngWorkBenchSvc/ECORevs
polling_interval_minutes: 15
rate_limit_requests_per_minute: 60

ERP API setup is the most client-specific step. Allocate 2–3 days for ERP integration testing. Work with the client's ERP administrator to create a dedicated API service account with read access to POs, Work Orders, BOMs, Suppliers, and Inventory, plus write access to Work Orders (Phase 4 only). Document all API endpoints and test with Postman before coding the agent integration. For Epicor Kinetic, ensure the REST API v2 is enabled in the Epicor Admin Console.
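To confirm that erp_config.yaml maps cleanly onto request URLs, a minimal sketch follows; the dict mirrors the config above (the real script would parse the YAML file with pyyaml), and the host and company ID are placeholders:

```python
# Minimal sketch: resolve erp_config.yaml endpoint templates into request URLs.
# The dict mirrors the config file above; host and company_id are placeholders.
cfg = {
    "base_url": "https://erp.example.local/api/v2",
    "company_id": "DEMO01",
    "api_endpoints": {
        "purchase_orders": {
            "list": "/odata/{company}/Erp.BO.POSvc/POs",
            "detail": "/odata/{company}/Erp.BO.POSvc/POs('{po_num}')",
        }
    },
}

def endpoint_url(cfg, entity, action, **params):
    """Fill the {company} placeholder (and any extras such as {po_num})."""
    path = cfg["api_endpoints"][entity][action].format(
        company=cfg["company_id"], **params)
    return cfg["base_url"] + path

print(endpoint_url(cfg, "purchase_orders", "list"))
# https://erp.example.local/api/v2/odata/DEMO01/Erp.BO.POSvc/POs
print(endpoint_url(cfg, "purchase_orders", "detail", po_num="PO-1001"))
```

Keeping endpoint templates in the config (rather than hard-coded in the agents) is what lets the same agent code target Epicor, Acumatica, or D365.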
Step 7: Load Historical Supplier Lead Time Data
Extract 3+ years of historical Purchase Order data from the ERP and load it into the PostgreSQL database. This historical data is essential for establishing supplier performance baselines, calculating average lead times, standard deviations, and on-time delivery rates. These baselines enable the delay prediction agent to distinguish normal variation from genuine delays.
# ETL script for loading historical PO data into PostgreSQL
cat > /opt/supply-chain-agent/crewai/scripts/historical_etl.py << 'PYEOF'
import os
import requests
import psycopg2
from psycopg2.extras import execute_values
from datetime import datetime, timedelta
import json
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

# Configuration
ERP_BASE_URL = os.environ['ERP_API_BASE_URL']
ERP_API_KEY = os.environ['ERP_API_KEY']
DB_CONN = f"host=localhost port=5432 dbname=supply_chain_agent user=scm_admin password={os.environ['POSTGRES_PASSWORD']}"

def fetch_erp_data(endpoint, params=None):
    """Generic ERP API fetcher with OData pagination."""
    headers = {'X-API-Key': ERP_API_KEY, 'Content-Type': 'application/json'}
    all_records = []
    url = f"{ERP_BASE_URL}{endpoint}"
    while url:
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()
        data = resp.json()
        all_records.extend(data.get('value', []))
        url = data.get('@odata.nextLink')
        params = None  # nextLink already includes query params
        logger.info(f"Fetched {len(all_records)} records so far...")
    return all_records

def load_suppliers(conn):
    """Load supplier master data."""
    suppliers = fetch_erp_data('/odata/COMPANY/Erp.BO.VendorSvc/Vendors',
                               {'$select': 'VendorNum,VendorID,Name'})
    with conn.cursor() as cur:
        execute_values(cur,
            "INSERT INTO suppliers (supplier_id, supplier_name) VALUES %s "
            "ON CONFLICT (supplier_id) DO UPDATE SET supplier_name = EXCLUDED.supplier_name",
            [(str(s['VendorNum']), s['Name']) for s in suppliers])
    conn.commit()
    logger.info(f"Loaded {len(suppliers)} suppliers")

def load_purchase_orders(conn):
    """Load 3 years of PO history."""
    cutoff = (datetime.now() - timedelta(days=1095)).strftime('%Y-%m-%dT00:00:00Z')
    pos = fetch_erp_data('/odata/COMPANY/Erp.BO.POSvc/POs',
                         {'$filter': f"OrderDate ge {cutoff}",
                          '$select': 'PONum,VendorNum,OrderDate,DueDate,PromiseDt,OpenOrder'})
    with conn.cursor() as cur:
        for po in pos:
            cur.execute("""
                INSERT INTO purchase_orders (po_id, erp_po_number, supplier_id, order_date,
                    requested_delivery_date, confirmed_delivery_date, current_status)
                VALUES (%s, %s, %s, %s, %s, %s, %s)
                ON CONFLICT (po_id) DO NOTHING
            """, (
                str(po['PONum']), str(po['PONum']), str(po['VendorNum']),
                po['OrderDate'][:10], po['DueDate'][:10],
                po['PromiseDt'][:10] if po.get('PromiseDt') else None,
                'open' if po.get('OpenOrder', True) else 'closed'
            ))
    conn.commit()
    logger.info(f"Loaded {len(pos)} purchase orders")

def calculate_supplier_metrics(conn):
    """Calculate baseline metrics per supplier."""
    with conn.cursor() as cur:
        cur.execute("""
            UPDATE suppliers s SET
                avg_lead_time_days = sub.avg_lt,
                lead_time_std_dev = sub.std_lt,
                on_time_delivery_rate = sub.otd
            FROM (
                SELECT supplier_id,
                       AVG(actual_receipt_date - order_date) as avg_lt,
                       STDDEV(actual_receipt_date - order_date) as std_lt,
                       (COUNT(*) FILTER (WHERE delay_days <= 0)::NUMERIC / NULLIF(COUNT(*),0) * 100) as otd
                FROM purchase_orders
                WHERE actual_receipt_date IS NOT NULL
                GROUP BY supplier_id
            ) sub
            WHERE s.supplier_id = sub.supplier_id
        """)
    conn.commit()
    logger.info("Supplier baseline metrics calculated")

if __name__ == '__main__':
    conn = psycopg2.connect(DB_CONN)
    try:
        load_suppliers(conn)
        load_purchase_orders(conn)
        calculate_supplier_metrics(conn)
        logger.info("Historical ETL complete!")
    finally:
        conn.close()
PYEOF

sudo apt install -y python3-pip python3-venv
cd /opt/supply-chain-agent/crewai
python3 -m venv venv
source venv/bin/activate
pip install requests psycopg2-binary pyyaml

set -a; source /opt/supply-chain-agent/.env; set +a   # export vars so Python can read them
python scripts/historical_etl.py

This script is a template — it must be customized based on the specific ERP system and API schema confirmed in Step 6. Replace 'COMPANY' with the actual Epicor company ID. For Acumatica/D365, adapt the endpoint URLs and field names per their respective APIs. Run the ETL during off-peak hours as it may generate significant API load. Verify record counts against ERP reports: total POs loaded should match ERP PO count for the same date range (±5% tolerance for API pagination edge cases). This step typically takes 1–4 hours for 3 years of data depending on ERP API performance.
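The ±5% reconciliation check can be scripted rather than eyeballed. A minimal sketch, with illustrative counts:

```python
# Sketch of the post-ETL reconciliation check; counts are illustrative.
def counts_reconcile(erp_count, loaded_count, tolerance=0.05):
    """Return True when the loaded PO count is within +/- tolerance of the
    ERP-reported count for the same date range."""
    if erp_count == 0:
        return loaded_count == 0
    return abs(loaded_count - erp_count) / erp_count <= tolerance

# Example: ERP reports 12,400 POs over 3 years; the ETL loaded 12,250.
print(counts_reconcile(12400, 12250))  # True  (~1.2% variance, acceptable)
print(counts_reconcile(12400, 11000))  # False (~11% variance, investigate)
```

In practice, feed erp_count from an ERP report and loaded_count from SELECT COUNT(*) FROM purchase_orders over the same date window, and fail the runbook step when the check returns False.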
Step 8: Deploy CrewAI Agent Framework
Install and configure the CrewAI multi-agent system that powers the core intelligence. This includes four specialized agents: (1) Supplier Monitor Agent that polls for PO status changes, (2) Delay Analyzer Agent that assesses impact severity and predicts cascading effects, (3) Schedule Optimizer Agent that generates production reschedule recommendations, and (4) Notification Agent that formats and delivers alerts via Teams. Deploy as a Docker container with the agent definitions, tools, and tasks.
mkdir -p /opt/supply-chain-agent/crewai/{agents,tools,tasks,config}

cat > /opt/supply-chain-agent/crewai/requirements.txt << 'EOF'
crewai[tools]==0.86.0
langchain-openai==0.3.0
psycopg2-binary==2.9.9
requests==2.32.3
pyyaml==6.0.2
python-dotenv==1.0.1
pydantic==2.9.0
EOF

cat > /opt/supply-chain-agent/crewai/Dockerfile << 'DOCKEREOF'
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "main.py"]
DOCKEREOF

cat >> /opt/supply-chain-agent/docker-compose.yml << 'EOF'
  crewai-agents:
    build:
      context: ./crewai
      dockerfile: Dockerfile
    container_name: scm-crewai
    restart: unless-stopped
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - OPENAI_API_BASE=https://<azure-openai-endpoint>.openai.azure.com/
      - OPENAI_MODEL_NAME=gpt-5.4
      - POSTGRES_HOST=postgres
      - POSTGRES_PORT=5432
      - POSTGRES_DB=supply_chain_agent
      - POSTGRES_USER=scm_admin
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - ERP_API_BASE_URL=${ERP_API_BASE_URL}
      - ERP_API_KEY=${ERP_API_KEY}
      - TEAMS_WEBHOOK_URL=${TEAMS_WEBHOOK_URL}
      - AGENT_MODE=monitor_and_recommend
      - SCHEDULE_CHANGE_THRESHOLD_DAYS=3
      - AUTO_EXECUTE_ENABLED=false
      - POLLING_INTERVAL_MINUTES=15
    volumes:
      - ./crewai:/app
      - ./logs:/app/logs
    depends_on:
      postgres:
        condition: service_healthy
    links:
      - postgres
EOF

cd /opt/supply-chain-agent
docker compose build crewai-agents
docker compose up -d crewai-agents
docker logs scm-crewai --tail 50

The AGENT_MODE environment variable controls autonomy level: 'monitor_only' for Phase 2, 'monitor_and_recommend' for Phase 3, 'semi_autonomous' for Phase 4. AUTO_EXECUTE_ENABLED must remain 'false' until Phase 4 is formally approved by the client's operations manager. The SCHEDULE_CHANGE_THRESHOLD_DAYS controls how many days of delay trigger agent activation (start conservative at 3 days). The CrewAI agent code files are detailed in the custom_ai_components section.
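The phase-gated autonomy can be expressed as a simple permission check. A sketch with hypothetical action names (the real gate lives in the CrewAI code in the custom_ai_components section):

```python
import os

# Hypothetical permission matrix: which actions each AGENT_MODE allows.
MODE_PERMISSIONS = {
    "monitor_only":          {"detect_delay", "log_event"},
    "monitor_and_recommend": {"detect_delay", "log_event", "recommend_schedule"},
    "semi_autonomous":       {"detect_delay", "log_event", "recommend_schedule",
                              "execute_schedule_change"},
}

def action_allowed(action, mode=None, auto_execute=None):
    """Gate agent actions on AGENT_MODE; ERP writes additionally require
    AUTO_EXECUTE_ENABLED=true, mirroring the compose environment above."""
    mode = mode or os.environ.get("AGENT_MODE", "monitor_only")
    if auto_execute is None:
        auto_execute = os.environ.get("AUTO_EXECUTE_ENABLED", "false") == "true"
    if action == "execute_schedule_change" and not auto_execute:
        return False
    return action in MODE_PERMISSIONS.get(mode, set())

print(action_allowed("recommend_schedule", "monitor_and_recommend", False))  # True
print(action_allowed("execute_schedule_change", "semi_autonomous", False))   # False: flag off
print(action_allowed("execute_schedule_change", "semi_autonomous", True))    # True
```

Requiring both the mode and the flag means a single misconfigured variable cannot unlock ERP writes on its own.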
Step 9: Configure Microsoft Teams Notification Channel
Set up the Microsoft Teams integration for agent notifications and human-in-the-loop approval workflows. Create a dedicated 'Supply Chain AI Alerts' channel in Teams, configure an incoming webhook for notifications, and set up an Adaptive Card-based approval flow for schedule change recommendations.
curl -X POST ${TEAMS_WEBHOOK_URL} \
-H 'Content-Type: application/json' \
-d '{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "FF0000",
"summary": "Supply Chain Agent Test",
"sections": [{
"activityTitle": "🤖 Supply Chain AI Agent - Test Notification",
"activitySubtitle": "Agent system connectivity verified",
"facts": [
{"name": "Status", "value": "Connected"},
{"name": "Server", "value": "scm-agent-01"},
{"name": "Timestamp", "value": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"}
],
"markdown": true
}]
}'

Teams incoming webhooks are being deprecated by Microsoft in favor of Power Automate Workflows connectors. For new deployments, use the 'Post to a channel when a webhook request is received' Power Automate template instead. For the human-in-the-loop approval flow (Phase 3), a Power Automate Premium license ($15/user/month) is needed for the HTTP connector — or use n8n's native Microsoft Teams node with Graph API permissions. Test the webhook immediately to confirm notifications appear in the correct channel.
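The test payload above can also be generated programmatically, which is roughly what the Notification Agent does at runtime. A sketch using the same legacy MessageCard shape, with hypothetical PO details (for Power Automate workflow endpoints an Adaptive Card wrapper is required instead; same facts, different envelope):

```python
import json

def delay_alert_card(po_number, supplier, delay_days, severity="High"):
    """Build a legacy MessageCard payload matching the webhook test above.
    PO details here are hypothetical examples."""
    color = {"Critical": "FF0000", "High": "FF8C00"}.get(severity, "FFD700")
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "themeColor": color,
        "summary": f"Supplier delay detected on {po_number}",
        "sections": [{
            "activityTitle": "🤖 Supply Chain AI Agent - Delay Alert",
            "facts": [
                {"name": "PO", "value": po_number},
                {"name": "Supplier", "value": supplier},
                {"name": "Delay", "value": f"{delay_days} days"},
                {"name": "Severity", "value": severity},
            ],
            "markdown": True,
        }],
    }

card = delay_alert_card("PO-1001", "Acme Metals", 5)
print(json.dumps(card, indent=2))
# Post with: requests.post(os.environ["TEAMS_WEBHOOK_URL"], json=card)
```

Keep the payload builder separate from the transport so the same facts can be re-wrapped for a Power Automate endpoint later.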
Step 10: Deploy Grafana Monitoring Dashboard
Deploy Grafana OSS and Prometheus for monitoring agent system health, visualizing supplier performance metrics, and providing an executive KPI dashboard. Connect Grafana to the PostgreSQL database for supply chain analytics and to Prometheus for system metrics.
cat >> /opt/supply-chain-agent/docker-compose.yml << 'EOF'
  prometheus:
    image: prom/prometheus:latest
    container_name: scm-prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus/data:/prometheus
    ports:
      - '9090:9090'
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=90d'
  grafana:
    image: grafana/grafana-oss:latest
    container_name: scm-grafana
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}
      - GF_SERVER_ROOT_URL=https://grafana.${CLIENT_DOMAIN}
    volumes:
      - ./grafana/data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    ports:
      - '3000:3000'
    depends_on:
      - prometheus
      - postgres
EOF

cat > /opt/supply-chain-agent/prometheus/prometheus.yml << 'PROMEOF'
global:
  scrape_interval: 30s
  evaluation_interval: 30s
scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['host.docker.internal:9100']
  - job_name: 'postgres-exporter'
    static_configs:
      - targets: ['postgres-exporter:9187']
  - job_name: 'crewai-metrics'
    static_configs:
      - targets: ['crewai-agents:8000']
    metrics_path: /metrics
PROMEOF

mkdir -p /opt/supply-chain-agent/grafana/provisioning/datasources
cat > /opt/supply-chain-agent/grafana/provisioning/datasources/datasources.yml << 'DSEOF'
apiVersion: 1
datasources:
  - name: PostgreSQL
    type: postgres
    url: postgres:5432
    database: supply_chain_agent
    user: scm_admin
    secureJsonData:
      password: ${POSTGRES_PASSWORD}
    jsonData:
      sslmode: disable
      maxOpenConns: 5
      maxIdleConns: 2
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    isDefault: true
DSEOF

echo "GRAFANA_ADMIN_PASSWORD=$(openssl rand -base64 16)" >> .env
docker compose up -d prometheus grafana
curl -s http://localhost:3000/api/health

After deployment, log into Grafana and import the custom dashboards defined in the custom_ai_components section. Create two dashboards: (1) Agent Operations — showing agent execution count, success rate, API latency, token costs, and error rates, and (2) Supply Chain KPIs — showing supplier on-time delivery rates, average delays by supplier, open delay events, and schedule adjustment history. Set up Grafana alerting to notify the MSP's NOC if agent error rate exceeds 5% or if the agent hasn't executed in 30+ minutes.
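The Supply Chain KPIs panels reduce to simple aggregations over purchase_orders. A Python sketch of the on-time delivery calculation the PostgreSQL panel would run, with illustrative sample data:

```python
# Sketch of the on-time delivery KPI; sample delay_days values are illustrative.
def on_time_delivery_rate(delay_days_list):
    """Percentage of received POs with delay_days <= 0, matching the
    on_time_delivery_rate column in the suppliers table.
    None entries represent POs not yet received and are excluded."""
    received = [d for d in delay_days_list if d is not None]
    if not received:
        return None
    on_time = sum(1 for d in received if d <= 0)
    return round(100 * on_time / len(received), 2)

# Hypothetical history for one supplier: negative = early, None = not yet received.
print(on_time_delivery_rate([-2, 0, 3, 7, -1, None, 0]))  # 66.67
```

The equivalent SQL already appears in the ETL script's calculate_supplier_metrics function; keeping both in sync makes dashboard numbers auditable against the database.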
Step 11: Configure n8n Orchestration Workflows
Build the n8n workflows that orchestrate the entire agent system. Three primary workflows: (1) Scheduled ERP Polling workflow that runs every 15 minutes to check for PO status updates, (2) Delay Detection & Agent Trigger workflow that invokes CrewAI when delays are detected, and (3) Human Approval workflow that processes planner responses from Teams and executes approved schedule changes.
docker exec scm-n8n n8n import:workflow --input=/app/workflows/erp_polling.json
docker exec scm-n8n n8n import:workflow --input=/app/workflows/delay_detection.json
docker exec scm-n8n n8n import:workflow --input=/app/workflows/human_approval.json

n8n workflows are the glue that connects ERP data polling, agent execution, and human interaction. The workflow JSON files are provided in the custom_ai_components section. When importing, you'll need to re-map credentials to the ones configured in n8n's credential store. Test each workflow individually before activating the full pipeline. The ERP polling workflow should start in 'manual trigger' mode during testing, then switch to 'Schedule Trigger' (every 15 minutes) for production.
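The delay-detection workflow's trigger condition can be sketched as a pure function; the 3-day threshold mirrors SCHEDULE_CHANGE_THRESHOLD_DAYS, while the lower bar for critical suppliers is an assumed policy choice, not part of the workflow JSON itself:

```python
# Sketch of the trigger condition the delay-detection workflow evaluates.
# The critical-supplier rule is an assumed policy; tune per client.
def should_trigger_agents(delay_days, supplier_criticality="medium",
                          threshold_days=3):
    """Decide whether to invoke the CrewAI crew for a detected delay.
    Critical suppliers trigger at half the normal threshold (assumed policy)."""
    if delay_days is None or delay_days <= 0:
        return False
    if supplier_criticality == "critical":
        return delay_days >= max(1, threshold_days // 2)
    return delay_days >= threshold_days

print(should_trigger_agents(2, "medium"))    # False: below the 3-day threshold
print(should_trigger_agents(2, "critical"))  # True: critical supplier, lower bar
print(should_trigger_agents(4, "medium"))    # True
```

Encoding the rule as one function keeps the n8n IF-node expression, the agent code, and the client-approved policy document describing the same behavior.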
Step 12: Security Hardening and Compliance Configuration
Harden the server and application stack for manufacturing compliance requirements. Configure audit logging, access controls, encryption at rest, and network segmentation. This step is critical for SOX, CMMC, and general cybersecurity best practices.

# Enable automatic security updates
sudo apt install -y unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades

# Configure fail2ban for SSH protection
sudo apt install -y fail2ban
sudo systemctl enable fail2ban
sudo systemctl start...
Custom AI Components
Supplier Monitor Agent
Type: agent
CrewAI agent responsible for continuously polling the ERP system for Purchase Order status changes, comparing confirmed delivery dates against requested dates, detecting anomalies in supplier communication patterns (e.g., no confirmation received within the expected window), and flagging potential delays. This is the primary data-gathering agent that feeds all downstream analysis.
Implementation
# File: /opt/supply-chain-agent/crewai/agents/supplier_monitor.py
from crewa...
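The agent module above is truncated. The core date comparison it performs can be sketched as plain Python; the PO field names and the 2-day threshold are illustrative assumptions, not the shipped defaults:

```python
from datetime import date
from typing import Optional

# Hypothetical threshold: flag confirmed dates slipping more than 2 days
DELAY_THRESHOLD_DAYS = 2

def detect_delay(po: dict) -> Optional[dict]:
    """Compare a PO's confirmed delivery date against the requested date
    and return a delay event when the slip exceeds the threshold."""
    requested = date.fromisoformat(po["requested_date"])
    confirmed = po.get("confirmed_date")
    if confirmed is None:
        # No confirmation received yet: handled separately as a communication anomaly
        return None
    slip_days = (date.fromisoformat(confirmed) - requested).days
    if slip_days > DELAY_THRESHOLD_DAYS:
        return {
            "po_number": po["po_number"],
            "supplier_id": po["supplier_id"],
            "delay_magnitude_days": slip_days,
        }
    return None

event = detect_delay({
    "po_number": "PO-1001",
    "supplier_id": "V-42",
    "requested_date": "2024-06-01",
    "confirmed_date": "2024-06-06",
})
print(event)  # → {'po_number': 'PO-1001', 'supplier_id': 'V-42', 'delay_magnitude_days': 5}
```

In the real agent this comparison runs inside a CrewAI tool against live ERP data; the sketch only shows the decision rule.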
Delay Analyzer Agent
Type: agent
CrewAI agent that receives delay events from the Supplier Monitor Agent and performs deep impact analysis. It traces the affected parts through the Bill of Materials (BOM) to identify which work orders and finished goods are impacted, calculates criticality based on customer delivery commitments, and ranks delay events by business impact severity.
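The inventory-buffer check this agent performs (step 2 of its task definition) reduces to simple arithmetic. A standalone sketch, with field names assumed:

```python
def buffer_days(on_hand_qty: float, allocated_qty: float, daily_usage_rate: float) -> float:
    """available_buffer_days = (on_hand_qty - allocated_qty) / daily_usage_rate"""
    if daily_usage_rate <= 0:
        # Part not currently consumed, so any delay is absorbable
        return float("inf")
    return (on_hand_qty - allocated_qty) / daily_usage_rate

def absorbed_by_inventory(delay_magnitude_days: float, buffer: float) -> bool:
    # If the buffer covers the delay, the event is downgraded rather than escalated
    return buffer > delay_magnitude_days

b = buffer_days(on_hand_qty=500, allocated_qty=200, daily_usage_rate=50)
print(b)                            # → 6.0
print(absorbed_by_inventory(5, b))  # → True
```

A 5-day supplier slip with 6 days of unallocated stock is thus downgraded, matching the "5-day delay on a commodity fastener" example in the agent's backstory.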
Implementation
from crewai import Agent
from tools.erp_connector import ERPWorkOrderTool, ERPBOMTool, ERPInventoryTool
from tools.database_connector import SupplierMetricsTool
from tools.llm_analyzer import DelayImpactAnalysisTool
def create_delay_analyzer_agent():
return Agent(
role='Supply Chain Delay Impact Analyst',
goal=(
'For each detected supplier delay, perform a comprehensive impact analysis by: '
'(1) Identifying all Work Orders that depend on the delayed material via BOM explosion, '
'(2) Checking current inventory levels to determine if safety stock can absorb the delay, '
'(3) Calculating the production impact window — which production days are affected, '
'(4) Assessing customer delivery risk — which customer orders are at risk of late delivery, '
'(5) Assigning a severity score (Critical/High/Medium/Low) based on customer impact and '
'financial exposure.'
),
backstory=(
'You are a production planning expert who understands MRP logic, BOM structures, '
'and the cascading effects of material shortages in discrete manufacturing. You know '
'that not all delays are equal — a 5-day delay on a commodity fastener with 30 days '
'of safety stock is a non-event, while a 1-day delay on a sole-sourced machined '
'component on the critical path can halt an entire production line. You always '
'check inventory levels before escalating.'
),
tools=[
ERPWorkOrderTool(),
ERPBOMTool(),
ERPInventoryTool(),
SupplierMetricsTool(),
DelayImpactAnalysisTool()
],
verbose=True,
memory=True,
max_iter=8,
allow_delegation=False
)
from crewai import Task
def create_impact_analysis_task(agent, context=None):
return Task(
description=(
'For each delay event provided by the Supplier Monitor Agent:\n\n'
'1. MATERIAL TRACING: Use the BOM tool to find all parent assemblies and '
'finished goods that require the delayed part. Build a dependency tree.\n\n'
'2. INVENTORY CHECK: Query current on-hand inventory and allocated inventory '
'for the delayed part. Calculate: available_buffer_days = (on_hand_qty - allocated_qty) '
'/ daily_usage_rate. If available_buffer_days > delay_magnitude_days, the delay '
'can be absorbed by inventory — downgrade severity.\n\n'
'3. WORK ORDER IMPACT: Identify all open/released Work Orders that consume this part '
'and have a scheduled start date within (delay_magnitude_days + available_buffer_days) '
'from today. These are the directly impacted work orders.\n\n'
'4. CUSTOMER IMPACT: For each impacted Work Order, identify the associated Sales Order '
'and customer. Flag any work orders tied to key accounts or contractual delivery penalties.\n\n'
'5. SEVERITY SCORING:\n'
' - CRITICAL: Customer delivery at risk, no safety stock buffer, critical path item\n'
' - HIGH: Customer delivery at risk but partial buffer exists, or high-value order\n'
' - MEDIUM: Production disruption likely but customer delivery achievable with overtime\n'
' - LOW: Inventory buffer absorbs delay, no customer impact expected\n\n'
'6. OUTPUT: For each delay event, produce a structured impact report.'
),
expected_output=(
'A JSON array of impact assessments, each containing: delay_event_id, severity '
'(critical/high/medium/low), affected_work_orders (list with WO IDs, part numbers, '
'customer names, original due dates), inventory_buffer_days, production_days_at_risk, '
'estimated_financial_impact_usd, recommended_urgency (immediate/today/this_week/monitor), '
'and a human_readable_summary (2-3 sentences explaining the impact in plain language).'
),
agent=agent,
context=context
)
Schedule Optimizer Agent
Type: agent CrewAI agent that takes impact assessments from the Delay Analyzer and generates specific, actionable production schedule adjustment recommendations. It considers machine capacity constraints, labor availability, existing work order priorities, and material availability to propose optimal rescheduling scenarios.
Implementation
# File: /opt/supply-chain-agent/crewai/agents/schedule_optimizer.py
from crewai import Agent
from tools.erp_connector import ERPWorkOrderTool, ERPCapacityTool
from tools.database_connector import ScheduleAdjustmentTool
from tools.llm_analyzer import ScheduleOptimizationTool
def create_schedule_optimizer_agent():
return Agent(
role='Production Schedule Optimizer',
goal=(
'Generate specific, implementable production schedule adjustments that minimize '
'the impact of detected supplier delays on customer deliveries and overall plant '
'efficiency. For each impacted work order, propose concrete date changes, work order '
'resequencing, or alternative actions (overtime, alternate supplier, partial shipment '
'to customer). Always present options with tradeoff analysis, never a single rigid solution.'
),
backstory=(
'You are a master production scheduler with deep expertise in finite capacity '
'scheduling for discrete manufacturing. You understand that schedule changes have '
'ripple effects — moving one work order affects machine utilization, labor allocation, '
'and downstream assembly operations. You always consider: (1) Can we pull forward '
'other work to fill the gap? (2) Is overtime a viable option? (3) Can we split the '
'work order? (4) Is there an alternate supplier who can expedite? You present 2-3 '
'options ranked by total cost impact.'
),
tools=[
ERPWorkOrderTool(),
ERPCapacityTool(),
ScheduleAdjustmentTool(),
ScheduleOptimizationTool()
],
verbose=True,
memory=True,
max_iter=10,
allow_delegation=False
)
# CrewAI task definition with option generation, constraint checking, and
# guardrails
# File: /opt/supply-chain-agent/crewai/tasks/optimization_tasks.py
from crewai import Task
def create_schedule_adjustment_task(agent, context=None):
return Task(
description=(
'For each impact assessment with severity HIGH or CRITICAL:\n\n'
'1. OPTION GENERATION: Create 2-3 schedule adjustment options:\n'
' Option A (Conservative): Delay affected work orders by the minimum necessary '
'time, maintaining original sequence. Calculate new dates.\n'
' Option B (Optimized): Resequence work orders to minimize customer impact — '
'pull forward non-affected work to fill gaps, push delayed work to earliest feasible slot.\n'
' Option C (Aggressive): Propose overtime/weekend shifts or alternate sourcing '
'to maintain original schedule. Estimate additional cost.\n\n'
'2. CONSTRAINT CHECKING: For each option, verify:\n'
' - Machine/work center capacity is not exceeded (check ERP capacity tool)\n'
' - Required materials for resequenced work are available\n'
' - Labor constraints are considered (no more than 20% overtime increase)\n'
' - Downstream assembly operations are not orphaned\n\n'
'3. COST-BENEFIT: For each option estimate:\n'
' - Additional labor cost (overtime hours × avg rate)\n'
' - Customer penalty risk (if contractual penalties exist)\n'
' - Inventory carrying cost change\n'
' - Opportunity cost of displaced work\n\n'
'4. RECOMMENDATION: Rank options and provide a clear recommendation with rationale.\n\n'
'5. For MEDIUM severity: Generate a monitoring recommendation (no schedule change yet, '
'but set a re-check date).\n\n'
'6. For LOW severity: Log the event and note that no action is needed.\n\n'
'GUARDRAILS: Never recommend changes that would delay a customer order by more than '
'the original supplier delay. Never propose schedule changes more than 30 days out. '
'Never modify work orders that are currently in-process (status = Started/Active).'
),
expected_output=(
'A JSON array of schedule adjustment recommendations, each containing: '
'delay_event_id, severity, recommendation_type (reschedule/monitor/no_action), '
'options (array of {option_label, affected_work_orders [{work_order_id, '
'original_start, original_end, new_start, new_end, change_reason}], '
'estimated_additional_cost, customer_delivery_impact, confidence_score}), '
'recommended_option, recommendation_rationale, '
'requires_human_approval (boolean — true for all CRITICAL and HIGH, '
'configurable for MEDIUM).'
),
agent=agent,
context=context
)
Notification Agent
Type: agent
CrewAI agent responsible for formatting analysis results into clear, actionable notifications for production planners via Microsoft Teams. Generates context-rich Adaptive Cards with severity-coded alerts, impact summaries, and one-click approval buttons for schedule changes.
Implementation
# File: /opt/supply-chain-agent/crewai/agents/notification_agent.py
from crewai import Agent
from tools.teams_connector import TeamsNotificationTool, TeamsApprovalCardTool
from too...
ERP Connector Tools
Type: integration
CrewAI tool classes that wrap the ERP REST API calls for Purchase Orders, Work Orders, BOM data, Inventory levels, and Capacity data. Supports Epicor Kinetic, Acumatica, and Dynamics 365 through a configurable adapter pattern.
Implementation:
# File: /opt/supply-chain-agent/crewai/tools/erp_connector.py
import os
import json
import requests
import yaml
from crewai.tools import BaseTool
from pydantic import BaseModel, Field
from typing import Optional, List
import logging
logger = logging.getLogger(__name__)
# Load ERP configuration
def load_erp_config():
config_path = os.path.join(os.path.dirname(__file__), '..', 'integrations', 'erp_config.yaml')
with open(config_path, 'r') as f:
return yaml.safe_load(f)
class ERPClient:
"""Generic ERP REST API client with adapter pattern."""
def __init__(self):
self.config = load_erp_config()
self.base_url = os.environ.get('ERP_API_BASE_URL', self.config['base_url'])
self.api_key = os.environ.get('ERP_API_KEY', '')
self.company_id = self.config.get('company_id', '')
self.session = requests.Session()
self.session.headers.update({
'Content-Type': 'application/json',
'X-API-Key': self.api_key
})
def _resolve_endpoint(self, endpoint_template: str, **kwargs) -> str:
"""Replace placeholders in endpoint URLs."""
url = endpoint_template.replace('{company}', self.company_id)
for key, value in kwargs.items():
url = url.replace(f'{{{key}}}', str(value))
return f"{self.base_url}{url}"
def get(self, endpoint_key: str, params: dict = None, **kwargs) -> dict:
"""Execute a GET request against the ERP API."""
endpoints = self.config['api_endpoints']
category, action = endpoint_key.split('.')
url = self._resolve_endpoint(endpoints[category][action], **kwargs)
resp = self.session.get(url, params=params, timeout=30)
resp.raise_for_status()
return resp.json()
def patch(self, endpoint_key: str, data: dict, **kwargs) -> dict:
"""Execute a PATCH request to update ERP records."""
endpoints = self.config['api_endpoints']
category, action = endpoint_key.split('.')
url = self._resolve_endpoint(endpoints[category][action], **kwargs)
resp = self.session.patch(url, json=data, timeout=30)
resp.raise_for_status()
return resp.json()
erp_client = ERPClient()
class POQueryInput(BaseModel):
status_filter: Optional[str] = Field(default='open', description='Filter: open, closed, all')
supplier_id: Optional[str] = Field(default=None, description='Filter by supplier ID')
days_ahead: Optional[int] = Field(default=30, description='Look-ahead window in days')
class ERPPurchaseOrderTool(BaseTool):
name: str = 'query_purchase_orders'
description: str = 'Queries the ERP system for Purchase Orders. Can filter by status (open/closed), supplier, and delivery date window.'
args_schema: type[BaseModel] = POQueryInput
def _run(self, status_filter: str = 'open', supplier_id: str = None, days_ahead: int = 30) -> str:
from datetime import datetime, timedelta
filters = []
if status_filter == 'open':
filters.append('OpenOrder eq true')
cutoff = (datetime.now() + timedelta(days=days_ahead)).strftime('%Y-%m-%dT00:00:00Z')
filters.append(f'DueDate le {cutoff}')
if supplier_id:
filters.append(f"VendorNum eq {supplier_id}")
filter_str = ' and '.join(filters)
fields = ','.join(erp_client.config['api_endpoints']['purchase_orders']['fields'])
try:
result = erp_client.get('purchase_orders.list',
params={'$filter': filter_str, '$select': fields, '$orderby': 'DueDate'})
records = result.get('value', [])
return json.dumps({'count': len(records), 'purchase_orders': records}, default=str)
except Exception as e:
logger.error(f'ERP PO query failed: {e}')
return json.dumps({'error': str(e), 'count': 0, 'purchase_orders': []})
class WOQueryInput(BaseModel):
job_num: Optional[str] = Field(default=None, description='Specific Work Order/Job number')
part_number: Optional[str] = Field(default=None, description='Filter by part number')
status: Optional[str] = Field(default='released', description='Filter: released, firm, active')
class ERPWorkOrderTool(BaseTool):
name: str = 'query_work_orders'
description: str = 'Queries the ERP for Work Orders/Job entries. Can filter by status, part number, or specific job number.'
args_schema: type[BaseModel] = WOQueryInput
def _run(self, job_num: str = None, part_number: str = None, status: str = 'released') -> str:
filters = []
if job_num:
filters.append(f"JobNum eq '{job_num}'")
if part_number:
filters.append(f"PartNum eq '{part_number}'")
if status == 'released':
filters.append('JobReleased eq true and JobClosed eq false')
filter_str = ' and '.join(filters) if filters else None
fields = ','.join(erp_client.config['api_endpoints']['work_orders']['fields'])
try:
params = {'$select': fields, '$orderby': 'StartDate'}
if filter_str:
params['$filter'] = filter_str
result = erp_client.get('work_orders.list', params=params)
records = result.get('value', [])
return json.dumps({'count': len(records), 'work_orders': records}, default=str)
except Exception as e:
logger.error(f'ERP WO query failed: {e}')
return json.dumps({'error': str(e), 'count': 0, 'work_orders': []})
class SupplierQueryInput(BaseModel):
vendor_id: Optional[str] = Field(default=None, description='Specific vendor ID')
class ERPSupplierTool(BaseTool):
name: str = 'query_suppliers'
description: str = 'Queries the ERP for supplier/vendor master data.'
args_schema: type[BaseModel] = SupplierQueryInput
def _run(self, vendor_id: str = None) -> str:
try:
params = {'$select': ','.join(erp_client.config['api_endpoints']['suppliers']['fields'])}
if vendor_id:
params['$filter'] = f"VendorNum eq {vendor_id}"
result = erp_client.get('suppliers.list', params=params)
return json.dumps({'suppliers': result.get('value', [])}, default=str)
except Exception as e:
return json.dumps({'error': str(e)})
class BOMQueryInput(BaseModel):
part_number: str = Field(description='Part number to explode BOM for')
class ERPBOMTool(BaseTool):
name: str = 'query_bom'
description: str = 'Queries the Bill of Materials for a given part number, showing all parent assemblies and finished goods that depend on it.'
args_schema: type[BaseModel] = BOMQueryInput
def _run(self, part_number: str) -> str:
try:
result = erp_client.get('bom.list', params={'$filter': f"PartNum eq '{part_number}'"})
return json.dumps({'bom_records': result.get('value', [])}, default=str)
except Exception as e:
return json.dumps({'error': str(e)})
class InventoryQueryInput(BaseModel):
part_number: str = Field(description='Part number to check inventory for')
class ERPInventoryTool(BaseTool):
name: str = 'query_inventory'
description: str = 'Queries current inventory levels for a part number, including on-hand, allocated, and available quantities.'
args_schema: type[BaseModel] = InventoryQueryInput
def _run(self, part_number: str) -> str:
try:
# This endpoint varies significantly by ERP — adjust per client
result = erp_client.get('purchase_orders.list',
params={'$filter': f"PartNum eq '{part_number}'", '$select': 'PartNum,OnHandQty,AllocatedQty,AvailableQty'})
return json.dumps({'inventory': result.get('value', [])}, default=str)
except Exception as e:
return json.dumps({'error': str(e)})
class ERPCapacityTool(BaseTool):
name: str = 'query_capacity'
description: str = 'Queries work center capacity and utilization for scheduling feasibility checks.'
def _run(self) -> str:
try:
result = erp_client.get('work_orders.list',
params={'$filter': 'JobReleased eq true and JobClosed eq false',
'$select': 'JobNum,StartDate,DueDate,SchedCode'})
return json.dumps({'capacity_data': result.get('value', [])}, default=str)
except Exception as e:
return json.dumps({'error': str(e)})
Database Connector Tools
Type: integration
CrewAI tool classes for reading and writing to the PostgreSQL database — querying supplier metrics baselines, recording delay events, storing schedule adjustment recommendations, and writing to the compliance audit log.
Implementation
# File: /opt/supply-chain-agent/crewai/tools/database_connector.py
import os
import json
import psycopg2
from psycopg2.extras import RealDictCursor
from crewai.tools import BaseTool
from pydantic import BaseModel, Field
from typin...
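The connector module above is truncated. One of its write paths, recording a delay event, can be sketched as a parameterized-SQL builder; the table and column names are assumptions to align with the actual schema, and the resulting (sql, params) pair would be passed to psycopg2's `cursor.execute`:

```python
# Hypothetical column set for the delay_events table; adjust to the deployed schema
DELAY_EVENT_COLUMNS = ("po_number", "supplier_id", "delay_magnitude_days", "severity")

def build_delay_event_insert(event: dict) -> tuple:
    """Compose a parameterized INSERT (placeholders, never string interpolation,
    so values are never concatenated into SQL)."""
    cols = ", ".join(DELAY_EVENT_COLUMNS)
    placeholders = ", ".join(["%s"] * len(DELAY_EVENT_COLUMNS))
    sql = f"INSERT INTO delay_events ({cols}) VALUES ({placeholders}) RETURNING id"
    params = tuple(event[c] for c in DELAY_EVENT_COLUMNS)
    return sql, params

sql, params = build_delay_event_insert({
    "po_number": "PO-1001", "supplier_id": "V-42",
    "delay_magnitude_days": 5, "severity": "high",
})
print(params)  # → ('PO-1001', 'V-42', 5, 'high')
```

Keeping the SQL composition in a pure function also makes this path unit-testable without a live database.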
CrewAI Main Orchestrator
Type: workflow
The main entry point that assembles all four agents into a CrewAI Crew with sequential task processing. Implements the polling loop, error handling, and phase-aware execution (monitor-only, recommend, or semi-autonomous modes).
Implementation
# File: /opt/supply-chain-agent/crewai/main.py
import os
import sys
import json
import time
import logging
from datetime import datetime
from crewai import Crew, Process
from dotenv import load_dotenv
# Import agents
from ag...
n8n ERP Polling Workflow
Type: workflow
n8n workflow JSON that runs on a 15-minute schedule to poll the ERP for PO status changes and trigger the CrewAI agent crew when changes are detected. Acts as the event loop that connects the ERP to the AI agent system.
Implementation
// File: /opt/supply-chain-agent/crewai/workflows/erp_polling.json
// Import this JSON into n8n via the UI or CLI
// This is a simplified representation — full n8n JSON is verbose
// Workflow: ERP PO Status Polling
//
// Node 1: Sched...
Grafana Supply Chain Dashboard
Type: workflow
Grafana dashboard configuration for visualizing supplier performance, delay events, agent activity, and schedule adjustment status. Provides both operational visibility for the client's planners and system health monitoring for the MSP.
Implementation
// Grafana Dashboard JSON — Import via Grafana UI > Dashboards > Import
// Dashboard: Supply Chain AI Agent Operations
//
// Panel 1: Supplier On-Time Delivery Rate (Gauge)
// Data Source: PostgreSQL
// Query:
// SELE...
Testing & Validation
CONNECTIVITY TEST: Execute the following command with API credentials and verify a 200 response with valid PO JSON data. Document the response time (should be <2 seconds).
curl -X GET "<ERP_API_URL>/odata/<company>/Erp.BO.POSvc/POs?\$top=1" -H "X-API-Key: <ERP_API_KEY>"
DATABASE TEST: Connect to PostgreSQL and verify the historical data load count matches expected records (±5% of ERP PO count for the same date range).
docker exec scm-postgres psql -U scm_admin -d supply_chain_agent -c "SELECT COUNT(*) FROM purchase_orders;"
SUPPLIER BASELINE TEST: Run the following query and verify that the worst-performing suppliers align with the client's known problem suppliers. Present results to the production planner for validation.
SELECT supplier_id, supplier_name, avg_lead_time_days, on_time_delivery_rate
FROM suppliers
WHERE avg_lead_time_days IS NOT NULL
ORDER BY on_time_delivery_rate ASC
LIMIT 10;
AGENT SMOKE TEST: Trigger a single monitoring cycle and verify: (a) no Python exceptions in logs, (b) agent_audit_log has new entries, (c) execution completes within 5 minutes.
docker exec scm-crewai python -c "from main import SupplyChainAgentCrew; c = SupplyChainAgentCrew(); c.execute_cycle()"
DELAY DETECTION TEST: Manually modify a test PO's confirmed delivery date in the ERP to be 5 days later than the requested date. Wait for the next polling cycle (or trigger manually). Verify: (a) a delay_event record is created in the database, (b) the Teams channel receives a notification with the correct PO number, supplier name, and delay magnitude.
FALSE POSITIVE TEST: Verify that POs with confirmed dates matching requested dates (on-time) do NOT generate delay events. Run the following query after a clean cycle — should return 0 if all POs are on time.
SELECT COUNT(*) FROM delay_events WHERE detected_at > NOW() - INTERVAL '1 hour';
TEAMS NOTIFICATION TEST: Send a test notification and verify it appears in the correct Teams channel within 5 seconds.
curl -X POST $TEAMS_WEBHOOK_URL \
-H "Content-Type: application/json" \
-d "{\"@type\":\"MessageCard\",\"summary\":\"Test\",\"sections\":[{\"activityTitle\":\"Test Alert\"}]}"
RECOMMENDATION TEST (Phase 3): After a delay is detected, verify that the schedule_adjustments table contains a recommendation with: valid work_order_id, reasonable date adjustments, priority rating, and status='pending'. Review the impact_description with the production planner for accuracy.
APPROVAL FLOW TEST (Phase 3): Trigger a schedule adjustment recommendation. Verify the Adaptive Card appears in Teams with Approve/Reject buttons. Click Approve. Verify the schedule_adjustments record updates to status='approved' with the approver's name and timestamp.
AUDIT LOG COMPLETENESS TEST: After running 5 monitoring cycles, run the following query and verify every agent has log entries. This is critical for SOX/CMMC compliance.
SELECT agent_name, action_type, COUNT(*)
FROM agent_audit_log
GROUP BY agent_name, action_type;
PERFORMANCE TEST: Run 10 consecutive monitoring cycles and measure the following. Document baseline for ongoing SLA monitoring.
- Average cycle duration (target: <3 minutes)
- API token consumption per cycle (target: <10,000 tokens)
- Database query count per cycle
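Evaluating these targets from collected cycle metrics is straightforward; a sketch, where the shape of each metric record is an assumption:

```python
def check_performance(cycles: list) -> dict:
    """Average the measured cycles and compare against the stated targets."""
    n = len(cycles)
    avg_duration = sum(c["duration_sec"] for c in cycles) / n
    avg_tokens = sum(c["tokens"] for c in cycles) / n
    return {
        "avg_cycle_duration_sec": avg_duration,
        "avg_tokens_per_cycle": avg_tokens,
        "meets_duration_target": avg_duration < 180,  # target: <3 minutes
        "meets_token_target": avg_tokens < 10_000,    # target: <10,000 tokens
    }

result = check_performance([
    {"duration_sec": 150, "tokens": 8200},
    {"duration_sec": 170, "tokens": 9100},
])
print(result["meets_duration_target"], result["meets_token_target"])  # → True True
```

The returned dict doubles as the documented baseline for ongoing SLA monitoring.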
FAILOVER TEST: Stop the PostgreSQL container and trigger an agent cycle. Verify: (a) the agent fails gracefully with a logged error, (b) an error notification is sent to Teams, (c) no data corruption occurs when PostgreSQL is restarted.
# Stop the PostgreSQL container
docker stop scm-postgres
# After verifying graceful failure, restart the container
docker start scm-postgres
SECURITY TEST: Perform the following verifications to confirm access controls and credential security are properly configured.
- Attempt to access the n8n web UI without authentication — verify redirect to login page.
- Attempt API calls to the agent server from outside the firewall — verify they are blocked.
- Verify all .env credentials are not readable by non-root users — the file should show -rw------- permissions.
ls -la /opt/supply-chain-agent/.env
END-TO-END INTEGRATION TEST: Create a realistic scenario with the client — identify a real open PO with a known late supplier. Monitor through the full pipeline: detection → analysis → recommendation → Teams notification → planner review. Document the full cycle time and planner feedback.
Client Handoff
Client Handoff Checklist
Training Sessions (Schedule 3 sessions over 2 weeks)
Session 1: Production Planners (2 hours)
- How to read and interpret Teams delay alerts (severity levels, what each field means)
- How to approve/reject schedule change recommendations via Teams Adaptive Cards
- When to override agent recommendations (e.g., customer relationship context the agent doesn't have)
- How to access the Grafana dashboard for supplier performance visibility
- Practice exercise: walk through 3 real delay scenarios from recent history
Session 2: Operations Manager (1.5 hours)
- Overview of agent autonomy levels and how to request changes (Phase 2→3→4 progression)
- How to adjust agent thresholds: delay detection sensitivity, auto-execute rules
- Monthly KPI review process using Grafana dashboards
- Escalation path: when to call MSP vs. handle internally
- Compliance: understanding the audit log and how to pull reports for auditors
Session 3: IT Administrator (1.5 hours)
- System architecture overview: what's running where, container management basics
- How to restart the agent system (see command below)
- How to check system health: Grafana alerts, Docker logs, disk space
- Backup verification: how to confirm nightly backups completed
- Security: access controls, credential rotation schedule, firewall rules
docker compose restart crewai-agents
Documentation to Deliver
Success Criteria Review (Joint session with client stakeholders)
Phase Gate Sign-Off
- Client operations manager signs off on Phase 2 (monitoring) before proceeding to Phase 3 (recommendations)
- Minimum 4-week observation period between phases
- Phase 4 (semi-autonomous) requires written approval from operations VP with defined guardrails document
Maintenance
Ongoing Maintenance Plan
MSP Responsibilities
Daily (Automated — verify via Grafana alerts)
- Agent system health check: all Docker containers running, no error spikes
- API connectivity verification: ERP API, Azure OpenAI, Teams webhook all responding
- Database backup completion verification
- Token consumption monitoring (alert if daily spend exceeds 150% of 7-day average)
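The token-spend alert condition in the last item can be expressed directly; a sketch, assuming daily spend figures are already aggregated:

```python
def token_spend_alert(today_usd: float, last_7_days_usd: list) -> bool:
    """Alert when today's spend exceeds 150% of the trailing 7-day average."""
    avg = sum(last_7_days_usd) / len(last_7_days_usd)
    return today_usd > 1.5 * avg

# Trailing average is 7.00 USD, so the alert fires above 10.50 USD
print(token_spend_alert(12.0, [6.0, 7.0, 8.0, 7.0, 6.5, 7.5, 7.0]))  # → True
```

In practice this check would run as a Grafana alert rule or a scheduled n8n node over the spend table.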
Weekly (15 minutes — MSP tech reviews dashboard)
- Review agent execution logs for recurring errors or warnings
- Check disk space utilization (alert at 80%)
- Review any rejected schedule recommendations — may indicate threshold tuning needed
- Verify SSL certificate status (auto-renew should handle, but verify)
Monthly (1-2 hours — MSP senior engineer)
- Review supplier metrics trends: are baselines still accurate? Retrigger baseline calculation if supplier behavior has significantly changed
- Azure OpenAI API usage review and cost optimization: are prompts efficient? Can any be simplified?
- Docker image updates: pull latest versions of n8n, PostgreSQL, Grafana, apply security patches
- Generate and deliver compliance audit report from agent_audit_log
- Review and rotate API keys/credentials per security policy
- 30-minute check-in call with client operations manager to review KPIs and gather feedback
Quarterly (4-6 hours — MSP senior engineer + AI specialist)
- Agent prompt tuning: review agent decisions from past quarter, refine backstory and task descriptions for better accuracy
- Supplier baseline recalculation: re-run historical ETL to incorporate latest 3 years of data
- Capacity planning: review server resource utilization, plan upgrades if needed
- CrewAI and dependency version updates (test in staging first)
- Evaluate phase progression: is client ready to move from Phase 2→3 or Phase 3→4?
- Security audit: review access logs, verify no unauthorized access, confirm compliance posture
Annually
- Full system architecture review: evaluate if newer tools/platforms would improve performance or reduce cost
- ERP API compatibility check: verify agent integrations still work after annual ERP updates
- Disaster recovery drill: simulate server failure, verify backup restoration, measure RTO
- Contract and SLA review with client
SLA Framework
Model/Prompt Retraining Triggers
- False positive rate exceeds 10% (measured by rejected recommendations / total recommendations)
- New supplier onboarded (add to supplier master data and allow 90 days for baseline)
- Client adds new product line or significantly changes BOM structure
- ERP system upgraded to new major version (re-test all API integrations)
- Azure OpenAI model version update (test with current prompts before switching)
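The false-positive trigger in the first bullet can be computed directly from the schedule_adjustments table; a sketch using the 10% threshold stated above:

```python
def needs_prompt_retuning(rejected: int, total: int, threshold: float = 0.10) -> bool:
    """False positive rate = rejected recommendations / total recommendations."""
    if total == 0:
        return False  # no recommendations yet, nothing to measure
    return rejected / total > threshold

print(needs_prompt_retuning(rejected=3, total=20))  # → True (15% > 10%)
```

Feeding this from the monthly audit report makes the retraining decision auditable rather than ad hoc.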
Escalation Path
Alternatives
ERP-Native AI Agents (Epicor Prism / Acumatica AI Studio)
Instead of building a custom CrewAI agent system, leverage the AI agent capabilities built directly into the client's ERP platform. Epicor Prism provides pre-built supplier communication automation with AI-driven RFQ processing. Acumatica AI Studio enables no-code AI workflow creation within the ERP. Microsoft Dynamics 365 Copilot agents offer supply chain insights natively within the D365 ecosystem.
Low-Code n8n-Only Approach (No CrewAI)
Eliminate the CrewAI multi-agent framework entirely and implement the entire monitoring, analysis, and notification pipeline using n8n workflows with direct OpenAI API calls. n8n's built-in AI agent nodes, code nodes, and database connectors can replicate most of the agent logic without the complexity of a Python-based agent framework.
Kinaxis RapidResponse or o9 Solutions Overlay
Deploy a purpose-built supply chain planning platform (Kinaxis RapidResponse or o9 Solutions) that sits on top of the existing ERP and provides AI-powered concurrent planning, scenario modeling, and automated schedule optimization. These platforms have mature supply chain AI capabilities built by domain experts.
Microsoft Dynamics 365 Supply Chain Management + Copilot Agents
For clients already running Microsoft Dynamics 365, leverage the native Supply Chain Management module with Copilot AI capabilities. D365 SCM includes AI-powered demand forecasting, predictive lead time analysis, and production planning optimization — all integrated with the Microsoft 365 ecosystem (Teams, Power BI, Power Automate).
Phased Spreadsheet-to-Agent Migration
Start with a semi-automated approach: use n8n or Power Automate to extract PO status data from the ERP into a shared Excel/Google Sheet dashboard. Production planners manually review the dashboard daily. Over 3–6 months, add automated delay detection alerts. Only then invest in AI agent capabilities for recommendation and autonomous scheduling.