
Implementation Guide: Analyze contracts to flag unusual clauses, missing provisions, or risk terms

Step-by-step implementation guide for deploying AI to analyze contracts to flag unusual clauses, missing provisions, or risk terms for Legal Services clients.

Hardware Procurement

Attorney Workstation Desktop

Dell OptiPlex 7020 Tower (Intel Core i5-14500, 16GB RAM, 512GB SSD), Qty: 5

$950 per unit (MSP cost) / $1,200 suggested resale

Primary attorney workstation for running Microsoft Word with contract AI add-ins (goHeather/Spellbook), web-based AI review dashboards, and dual-monitor contract comparison workflows. 16GB RAM and i5-14500 ensure smooth performance with multiple browser tabs, Word documents, and AI tools running simultaneously.

QHD Dual Monitor

Dell P2423D 24-inch QHD, Qty: 10

$290 per unit (MSP cost) / $375 suggested resale

Dual-monitor setup per attorney enables side-by-side contract comparison—original document on one screen, AI-flagged analysis on the other. QHD resolution (2560x1440) provides the text clarity essential for legal document review.

Mobile Attorney Laptop

Lenovo ThinkPad X1 Carbon Gen 12 (Intel i7, 16GB RAM, 512GB SSD), Qty: 2

$1,500 per unit (MSP cost) / $1,900 suggested resale

For attorneys who review contracts remotely or in client offices. Lightweight form factor with enterprise-grade security (TPM 2.0, fingerprint reader) and sufficient power for Word add-ins and browser-based AI tools.

Network Attached Storage

RS1221+ 8-Bay Rackmount NAS

Synology RS1221+ 8-Bay Rackmount NAS, Qty: 1

$1,200 (MSP cost) / $1,600 suggested resale (diskless); add $300–$400 per Seagate IronWolf 4TB drive x4 in RAID 10

Local contract document cache and backup repository. Stores copies of contracts before and after AI analysis for audit trail purposes. Provides fast local access to frequently reviewed templates and playbook documents. Not required if the firm is 100% cloud-based on SharePoint/NetDocuments.

Firewall / Security Appliance


Fortinet FortiGate 40F, Qty: 1

$400 (hardware MSP cost) + $350/year FortiGuard UTM licensing / $700 hardware + $500/year suggested resale

Secures the encrypted HTTPS tunnel between the law firm network and SaaS AI platforms. Provides UTM (Unified Threat Management) including IPS, web filtering, and application control to prevent data exfiltration of sensitive contract data. Essential for ABA Rule 1.6 confidentiality compliance.

Uninterruptible Power Supply

APC Smart-UPS 1500VA (SMT1500RM2U), Qty: 1

$550 (MSP cost) / $700 suggested resale

Protects NAS, firewall, and network switch from power interruptions during contract upload/analysis operations. Prevents data corruption on the local contract document cache.

Software Procurement

goHeather Pro

goHeather, Qty: 5 attorneys

$39.99/user/month; ~$200/month MSP cost with volume negotiation / $300/month suggested resale

Primary AI contract review platform for SMB law firms. Provides AI redlining in Microsoft Word, drag-and-drop PDF analysis, custom legal playbooks, jurisdiction-aware clause analysis, and risk flagging with traffic-light severity indicators. Lowest barrier to entry for firms new to AI contract review.

LegalSifter

LegalSifter, SaaS subscription (2-year term), Qty: 5 users

$29/month per user for 2-year commitment; 5 users = ~$145/month / $250/month suggested resale

Alternative or complementary platform to goHeather. Pre-built 'Sifters' (AI advisors) for 150+ specific clause types across NDAs, leases, employment agreements, and commercial contracts. Reduces review time by up to 80% and identifies missing provisions and gotcha terms. Best for firms wanting rapid time-to-value with minimal playbook configuration.

Microsoft 365 Business Premium

Microsoft, SaaS per-seat monthly (CSP resale), Qty: 5 users

$22/user/month via CSP; 5 users = $110/month (MSP cost) / $150/month suggested resale

Foundation platform providing Microsoft Word (required for contract AI add-ins like goHeather and Spellbook), SharePoint/OneDrive (document storage and DMS), Entra ID (SSO authentication for AI platforms), Exchange Online (email), and Intune (endpoint management). Microsoft Defender for Office 365 provides email security for contract attachments.

Azure OpenAI Service (GPT-5.4)

Microsoft Azure, GPT-5.4

$2.50/million input tokens, $10.00/million output tokens; estimated $50–$200/month for 50–200 contracts/month / $150–$500/month suggested resale

Powers the custom AI orchestration layer for supplemental contract analysis using the firm's proprietary legal playbooks. Handles analysis that goes beyond the SaaS platform's built-in capabilities—custom risk scoring, jurisdiction-specific clause comparison, and firm-specific standard position matching. Deployed within Azure tenancy for data residency compliance.
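The monthly estimate above can be sanity-checked with a few lines of arithmetic. The per-contract token counts below are illustrative assumptions back-solved to match the $50–$200 range (multi-pass playbook analysis sends each contract through the model several times), not measured figures:

```python
# Rough Azure OpenAI spend estimate at the published per-token prices.
# avg_input_tokens / avg_output_tokens are illustrative assumptions for
# multi-pass contract analysis, not measured values.

INPUT_PRICE_PER_M = 2.50    # USD per million input tokens
OUTPUT_PRICE_PER_M = 10.00  # USD per million output tokens

def monthly_cost(contracts_per_month: int,
                 avg_input_tokens: int = 300_000,
                 avg_output_tokens: int = 25_000) -> float:
    """Estimated monthly USD spend for a given contract volume."""
    input_cost = contracts_per_month * avg_input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = contracts_per_month * avg_output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return round(input_cost + output_cost, 2)

print(monthly_cost(50))   # low end of the 50-200 contracts/month range
print(monthly_cost(200))  # high end
```

Re-run the calculation with the firm's real token usage (visible in the analysis_results table once the system is live) to refine the budget.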

Azure Virtual Machine (Orchestration Server)

Microsoft Azure, D4s v5 (4 vCPU, 16GB RAM), Qty: 1

~$140/month / $200/month suggested resale

Hosts the custom LangChain/FastAPI orchestration layer that coordinates between the firm's document repository, Azure OpenAI, and the playbook database. Runs the RAG pipeline, manages prompt chains, and serves the internal API that connects the custom AI components.

Azure Blob Storage

Microsoft Azure

$0.018/GB/month; estimated $5–$20/month for typical contract volumes / $15–$40/month suggested resale

Cloud storage for the contract document repository used by the custom AI pipeline. Stores original contracts, AI analysis results, and audit trail documents. Integrated with Azure AD for access control.

Azure Database for PostgreSQL

Microsoft Azure, Flexible Server Burstable B1ms

~$50/month / $80/month suggested resale

Stores contract analysis metadata, playbook definitions, risk scoring rules, clause libraries, and complete audit trails of all AI-generated analysis for ABA compliance. Supports the custom orchestration layer's state management and reporting.
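For illustration, one simple way such risk scoring rules could map per-item severities onto a 0–10 overall score is sketched below. The weights and the cap are hypothetical placeholders, not the firm's actual rules (those live in the playbook database):

```python
# Hypothetical aggregation of flagged-item severities into an overall
# 0-10 risk score. The weights and saturation cap are illustrative
# assumptions, not the firm's configured scoring rules.

SEVERITY_WEIGHTS = {"info": 0.0, "low": 0.5, "medium": 1.0, "high": 2.0, "critical": 4.0}

def overall_risk_score(severities: list[str]) -> float:
    """Sum severity weights and cap at 10 to fit a DECIMAL(3,1) column."""
    total = sum(SEVERITY_WEIGHTS[s] for s in severities)
    return round(min(total, 10.0), 1)

print(overall_risk_score(["low", "medium", "high"]))  # 3.5
print(overall_risk_score(["critical"] * 4))           # capped at 10.0
```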

Clio Manage

Clio (Themis Solutions), Advanced Plan, Qty: 5 users

Advanced plan: $109/user/month; 5 users = $545/month (referral commission to MSP: 10-15%) / recommend client purchases directly via Clio Partner Program referral

Practice management system that serves as the central hub for matter management, client records, time tracking, and billing. Contract AI analysis results are linked to specific matters via Clio's REST API, enabling attorneys to see flagged risks in context of their case files. Time entries for AI-assisted review are logged appropriately per ABA Rule 1.5.

Cisco Umbrella DNS Security

Cisco Umbrella DNS Security, Qty: 10 users (attorneys + staff)

~$3–$5/user/month; ~$40/month (MSP cost) / $75/month suggested resale

DNS-layer security filtering to prevent accidental data exfiltration of sensitive contract data. Blocks access to unauthorized cloud storage, shadow AI tools, and known malicious sites. Provides visibility into which cloud services are accessing contract data. Supports ABA Rule 1.6 confidentiality obligations.

Prerequisites

  • Microsoft 365 Business Premium licenses provisioned and active for all attorneys and relevant staff, with Entra ID (Azure AD) configured for the tenant
  • Stable internet connection: minimum 50 Mbps download / 10 Mbps upload with <50ms latency to Azure East US or relevant regional datacenter
  • Modern web browser installed on all workstations (Microsoft Edge or Google Chrome, latest version)
  • Microsoft Word desktop application (not web-only) installed via Microsoft 365 Apps on all attorney workstations—required for Word add-in contract AI tools
  • Existing document management approach identified: SharePoint/OneDrive, iManage, or NetDocuments—contracts must be accessible in a centralized location, not scattered across local drives
  • Practice management system (Clio recommended) active with matters and client records populated—API access credentials obtained from Clio admin console
  • At least one senior attorney designated as 'AI Champion' who will collaborate on playbook development, validate AI output quality, and serve as internal advocate during rollout
  • Sample contract corpus prepared: minimum 50 representative contracts across the firm's primary practice areas (e.g., 15 NDAs, 10 commercial leases, 10 employment agreements, 15 vendor/service agreements) with attorney annotations of known issues for testing
  • Firm's malpractice insurance carrier contacted to confirm coverage extends to AI-assisted contract review—document this confirmation in writing
  • ABA Formal Opinion 512 reviewed by managing partner; firm-level AI usage policy drafted or in progress
  • Azure subscription provisioned under the MSP's CSP partner account with the client as a named tenant—billing relationship established
  • FortiGate 40F firewall installed and operational at the firm's primary office with current FortiGuard UTM subscription, or equivalent perimeter security confirmed
  • Client has executed Data Processing Agreements (DPAs) with all AI vendors that include explicit no-training clauses and SOC 2 Type II attestation verification
  • DNS filtering (Cisco Umbrella or equivalent) deployed across the network to prevent unauthorized cloud AI tool usage by staff

Installation Steps

...

Step 1: Environment Assessment and Security Baseline

Conduct a thorough assessment of the client's current IT environment: document all existing software, the network topology, and the security posture. Verify that all prerequisites are met. Run a network speed test to confirm bandwidth. Audit existing document storage to determine where contracts currently reside (local drives, shared folders, DMS, email attachments). Catalog the types and volumes of contracts the firm handles monthly. Document findings in the MSP's project management system.

Network speed test and Microsoft 365 environment discovery commands
powershell
# Network speed test from attorney workstation
speedtest-cli --simple

# Verify Microsoft 365 licensing via PowerShell
Connect-MgGraph -Scopes 'User.Read.All','Organization.Read.All'
Get-MgSubscribedSku | Select-Object SkuPartNumber, ConsumedUnits, PrepaidUnits

# Check Entra ID configuration
Get-MgOrganization | Select-Object DisplayName, VerifiedDomains

# Enumerate existing SharePoint sites (if using SharePoint as DMS)
Get-MgSite -Search '*' | Select-Object DisplayName, WebUrl
Note

This assessment typically takes 4-6 hours on-site. Bring a checklist of all prerequisites. If contracts are scattered across local drives and email, add 1-2 weeks to the timeline for document consolidation. Identify any contracts containing PII from EU data subjects—these trigger GDPR DPA requirements.

Step 2: Network and Security Infrastructure Setup

Install and configure the FortiGate 40F firewall (if not already present). Configure firewall policies to allow outbound HTTPS (port 443) to AI vendor endpoints while blocking unauthorized cloud storage services. Deploy Cisco Umbrella DNS agents on all workstations. Configure the Synology NAS for local contract caching if the firm requires on-premises document backup. Set up the APC UPS and connect critical infrastructure.

shell
# FortiGate initial configuration via CLI (after factory reset)
config system interface
  edit port1
    set ip 192.168.1.1 255.255.255.0
    set allowaccess ping https ssh
  next
end

# Create address objects for AI vendor endpoints
config firewall address
  edit goheather-saas
    set type fqdn
    set fqdn app.goheather.com
  next
  edit azure-openai
    set type fqdn
    set fqdn *.openai.azure.com
  next
  edit legalsifter-saas
    set type fqdn
    set fqdn app.legalsifter.com
  next
end

# Create outbound policy allowing AI vendor traffic
config firewall policy
  edit 10
    set srcintf internal
    set dstintf wan1
    set srcaddr all
    set dstaddr goheather-saas azure-openai legalsifter-saas
    set action accept
    set schedule always
    set service HTTPS
    set utm-status enable
    set ssl-ssh-profile certificate-inspection
    set av-profile default
    set ips-sensor default
    set logtraffic all
  next
end

# Cisco Umbrella - deploy roaming client via Intune
# Upload the OrgInfo.json file and Umbrella MSI to Intune
# Create Win32 app deployment in Endpoint Manager
# Install command:
msiexec /i UmbrellaRoamingClient.msi /qn ORG_ID=<your_org_id> ORG_FINGERPRINT=<your_fingerprint>

# Synology NAS - create shared folder for contracts
# Via DSM web interface or SSH:
synoshare --add ContractArchive 'Contract Document Archive' /volume1/ContractArchive '' '' rw @administrators
Note

The FortiGate FQDN objects resolve dynamically, so no IP whitelisting is needed. Enable SSL certificate inspection (not deep inspection) to avoid breaking certificate pinning on AI vendor apps. For Cisco Umbrella, use the MSP multi-tenant dashboard to manage this client alongside others. The Synology NAS should use RAID 10 with 4x Seagate IronWolf drives for redundancy and performance.

Step 3: Azure Infrastructure Provisioning

Provision the Azure resources needed for the custom AI orchestration layer. This includes the Azure OpenAI Service instance with GPT-5.4 deployment, the D4s v5 virtual machine for the orchestration server, Azure Blob Storage for the contract repository, and Azure Database for PostgreSQL for metadata and audit trails. All resources are deployed in a single resource group with appropriate RBAC and network security groups.

bash
# Login to Azure CLI
az login

# Set subscription context
az account set --subscription '<client-subscription-id>'

# Create resource group
az group create --name rg-legalai-prod --location eastus

# Create Azure OpenAI resource
az cognitiveservices account create \
  --name oai-legalai-prod \
  --resource-group rg-legalai-prod \
  --kind OpenAI \
  --sku S0 \
  --location eastus \
  --custom-domain oai-legalai-prod

# Deploy GPT-5.4 model
az cognitiveservices account deployment create \
  --name oai-legalai-prod \
  --resource-group rg-legalai-prod \
  --deployment-name gpt-5.4-contract-review \
  --model-name gpt-5.4 \
  --model-version 2024-08-06 \
  --model-format OpenAI \
  --sku-capacity 30 \
  --sku-name Standard

# Create storage account for contract documents
az storage account create \
  --name stlegalaidocs \
  --resource-group rg-legalai-prod \
  --location eastus \
  --sku Standard_LRS \
  --kind StorageV2 \
  --min-tls-version TLS1_2 \
  --allow-blob-public-access false

# Create blob container
az storage container create \
  --name contracts \
  --account-name stlegalaidocs \
  --auth-mode login

# Create PostgreSQL Flexible Server
az postgres flexible-server create \
  --name psql-legalai-prod \
  --resource-group rg-legalai-prod \
  --location eastus \
  --sku-name Standard_B1ms \
  --tier Burstable \
  --version 16 \
  --admin-user legalai_admin \
  --admin-password '<GENERATE-STRONG-PASSWORD>' \
  --storage-size 32

# Create the application database
az postgres flexible-server db create \
  --resource-group rg-legalai-prod \
  --server-name psql-legalai-prod \
  --database-name contractanalysis

# Create VM for orchestration server
az vm create \
  --resource-group rg-legalai-prod \
  --name vm-legalai-orch \
  --image Ubuntu2204 \
  --size Standard_D4s_v5 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --public-ip-sku Standard \
  --nsg-rule SSH

# Create NSG rule to allow HTTPS inbound (for API access)
az network nsg rule create \
  --resource-group rg-legalai-prod \
  --nsg-name vm-legalai-orchNSG \
  --name AllowHTTPS \
  --priority 100 \
  --destination-port-ranges 443 \
  --protocol Tcp \
  --access Allow
Note

Use a strong, randomly generated password for PostgreSQL—store it in Azure Key Vault immediately. The GPT-5.4 sku-capacity of 30 represents 30K tokens-per-minute; increase if the firm processes high contract volumes. Consider enabling Azure Private Link for the OpenAI and PostgreSQL endpoints to keep traffic on the Azure backbone. All resources should be tagged with client name and project code for billing clarity.

Step 4: Orchestration Server Setup and Application Deployment

SSH into the Azure VM and install the Python environment, required packages, and the custom LangChain-based contract analysis application. Configure the application as a systemd service for automatic startup and reliability. Set up Nginx as a reverse proxy with TLS termination.

bash
# SSH into the orchestration VM
ssh azureuser@<vm-public-ip>

# Update system and install dependencies
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3.11 python3.11-venv python3-pip nginx certbot python3-certbot-nginx postgresql-client

# Create application directory and virtual environment
sudo mkdir -p /opt/legalai
sudo chown azureuser:azureuser /opt/legalai
cd /opt/legalai
python3.11 -m venv venv
source venv/bin/activate

# Install Python packages
pip install --upgrade pip
pip install langchain langchain-openai langchain-community openai fastapi uvicorn psycopg2-binary python-multipart azure-storage-blob azure-identity pdfplumber python-docx tiktoken pydantic sqlalchemy alembic httpx tenacity structlog

# Create application directory structure
mkdir -p /opt/legalai/{app,playbooks,templates,logs}
touch /opt/legalai/app/__init__.py

# Create environment configuration file
cat > /opt/legalai/.env << 'EOF'
AZURE_OPENAI_ENDPOINT=https://oai-legalai-prod.openai.azure.com/
AZURE_OPENAI_API_KEY=<your-api-key>
AZURE_OPENAI_DEPLOYMENT=gpt-5.4-contract-review
AZURE_OPENAI_API_VERSION=2024-08-06
AZURE_STORAGE_ACCOUNT=stlegalaidocs
AZURE_STORAGE_CONTAINER=contracts
DATABASE_URL=postgresql://legalai_admin:<password>@psql-legalai-prod.postgres.database.azure.com:5432/contractanalysis
LOG_LEVEL=INFO
MAX_TOKENS_PER_REQUEST=128000
EOF
chmod 600 /opt/legalai/.env

# Create systemd service
sudo tee /etc/systemd/system/legalai.service > /dev/null << 'EOF'
[Unit]
Description=Legal AI Contract Analysis Service
After=network.target

[Service]
Type=simple
User=azureuser
WorkingDirectory=/opt/legalai
Environment=PATH=/opt/legalai/venv/bin
EnvironmentFile=/opt/legalai/.env
ExecStart=/opt/legalai/venv/bin/uvicorn app.main:app --host 127.0.0.1 --port 8000 --workers 2
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable legalai

# Configure Nginx reverse proxy
sudo tee /etc/nginx/sites-available/legalai > /dev/null << 'EOF'
server {
    listen 443 ssl;
    server_name legalai.<client-domain>.com;
    
    client_max_body_size 50M;
    
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
    }
}
EOF

sudo ln -s /etc/nginx/sites-available/legalai /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx

# Obtain TLS certificate
sudo certbot --nginx -d legalai.<client-domain>.com --non-interactive --agree-tos -m admin@<msp-domain>.com
Note

Replace all placeholder values (<your-api-key>, <password>, <client-domain>, <vm-public-ip>) with actual values. Store the .env file secrets in Azure Key Vault and retrieve them at runtime for production hardening. The proxy_read_timeout of 300s is necessary because contract analysis of long documents can take 30-60 seconds. Consider setting up Azure Application Gateway instead of Nginx if the firm requires WAF capabilities.

Step 5: Database Schema Initialization

Connect to the Azure PostgreSQL instance and create the database schema for storing contract analysis results, playbook definitions, clause libraries, and audit trails. This schema supports the full lifecycle of contract review from upload through analysis to attorney validation.

PostgreSQL schema creation for contract analysis, playbooks, flagged items, and audit logging
sql
-- Connect first from the shell:
--   psql 'host=psql-legalai-prod.postgres.database.azure.com port=5432 dbname=contractanalysis user=legalai_admin sslmode=require'

-- Schema creation
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE playbooks (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    name VARCHAR(255) NOT NULL,
    contract_type VARCHAR(100) NOT NULL,
    jurisdiction VARCHAR(100),
    version INTEGER DEFAULT 1,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    created_by VARCHAR(255) NOT NULL,
    is_active BOOLEAN DEFAULT TRUE
);

CREATE TABLE playbook_clauses (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    playbook_id UUID REFERENCES playbooks(id) ON DELETE CASCADE,
    clause_name VARCHAR(255) NOT NULL,
    clause_category VARCHAR(100) NOT NULL,
    standard_position TEXT NOT NULL,
    risk_level VARCHAR(20) DEFAULT 'medium' CHECK (risk_level IN ('low', 'medium', 'high', 'critical')),
    is_required BOOLEAN DEFAULT FALSE,
    sample_language TEXT,
    red_flag_patterns TEXT[],
    notes TEXT
);

CREATE TABLE contracts (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    filename VARCHAR(500) NOT NULL,
    file_hash VARCHAR(64) NOT NULL,
    blob_path VARCHAR(1000) NOT NULL,
    contract_type VARCHAR(100),
    parties TEXT[],
    effective_date DATE,
    uploaded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    uploaded_by VARCHAR(255) NOT NULL,
    matter_id VARCHAR(100),
    clio_matter_id VARCHAR(100),
    status VARCHAR(50) DEFAULT 'pending' CHECK (status IN ('pending', 'analyzing', 'completed', 'error', 'reviewed'))
);

CREATE TABLE analysis_results (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    contract_id UUID REFERENCES contracts(id) ON DELETE CASCADE,
    playbook_id UUID REFERENCES playbooks(id),
    analysis_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    overall_risk_score DECIMAL(3,1) CHECK (overall_risk_score BETWEEN 0 AND 10),
    summary TEXT NOT NULL,
    model_version VARCHAR(100) NOT NULL,
    token_usage_input INTEGER,
    token_usage_output INTEGER,
    processing_time_seconds DECIMAL(8,2),
    raw_response JSONB
);

CREATE TABLE flagged_items (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    analysis_id UUID REFERENCES analysis_results(id) ON DELETE CASCADE,
    item_type VARCHAR(50) NOT NULL CHECK (item_type IN ('unusual_clause', 'missing_provision', 'risk_term', 'deviation', 'ambiguity')),
    severity VARCHAR(20) NOT NULL CHECK (severity IN ('info', 'low', 'medium', 'high', 'critical')),
    clause_reference VARCHAR(255),
    original_text TEXT,
    description TEXT NOT NULL,
    recommendation TEXT,
    playbook_clause_id UUID REFERENCES playbook_clauses(id),
    attorney_reviewed BOOLEAN DEFAULT FALSE,
    attorney_action VARCHAR(50) CHECK (attorney_action IN ('accepted', 'dismissed', 'modified')), -- NULL (not yet actioned) passes the CHECK automatically
    attorney_notes TEXT,
    reviewed_at TIMESTAMP,
    reviewed_by VARCHAR(255)
);

CREATE TABLE audit_log (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    action VARCHAR(100) NOT NULL,
    user_id VARCHAR(255),
    contract_id UUID REFERENCES contracts(id),
    details JSONB,
    ip_address INET
);

CREATE INDEX idx_contracts_status ON contracts(status);
CREATE INDEX idx_contracts_matter ON contracts(clio_matter_id);
CREATE INDEX idx_flagged_severity ON flagged_items(severity);
CREATE INDEX idx_flagged_type ON flagged_items(item_type);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
CREATE INDEX idx_analysis_contract ON analysis_results(contract_id);
Note

The playbook_clauses table is the heart of the system—it stores the firm's standard positions against which contracts are compared. The flagged_items table captures attorney feedback (accepted/dismissed/modified), which is essential for measuring AI accuracy and refining playbooks over time. Always use parameterized queries in the application code to prevent SQL injection. Enable PostgreSQL audit logging for compliance.
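The parameterized-query requirement in the note above looks like this in practice. The example uses Python's built-in sqlite3 so it is self-contained; with psycopg2 against the Azure PostgreSQL instance the placeholder style is %s rather than ?, but the principle is identical:

```python
import sqlite3

# Parameterized queries: user-supplied text is bound as data, never
# spliced into the SQL string, so injection payloads are stored inertly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flagged_items (id INTEGER PRIMARY KEY, severity TEXT, description TEXT)"
)

hostile_description = "'; DROP TABLE flagged_items; --"
conn.execute(
    "INSERT INTO flagged_items (severity, description) VALUES (?, ?)",
    ("high", hostile_description),
)

row = conn.execute("SELECT severity, description FROM flagged_items").fetchone()
print(row)  # ('high', "'; DROP TABLE flagged_items; --")
```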

Step 6: SaaS Platform Provisioning - goHeather

Set up the primary SaaS contract AI platform (goHeather) for the firm. Create the organization account, provision user seats, configure SSO via Entra ID, and install the Microsoft Word add-in on all attorney workstations. goHeather provides the user-friendly front-end that attorneys interact with daily, while the custom Azure-based pipeline handles supplemental analysis.

1. Navigate to https://goheather.com and create the organization account. Use the firm's admin email (managing partner or IT admin). Select the Pro plan with 5 user seats.
2. Configure SSO with Entra ID: in Azure Portal > Entra ID > Enterprise Applications > New Application, search for 'goHeather' or configure SAML manually. The Identifier (Entity ID) and Reply URL are provided by goHeather during setup; Sign-on URL: https://app.goheather.com/sso/<org-id>
3. Assign users in Entra ID: Azure Portal > Entra ID > Enterprise Applications > goHeather > Users and groups. Add the 5 attorney users.
4. Install the Word add-in on each workstation. Option A: deploy centrally via the Microsoft 365 admin center (admin.microsoft.com > Settings > Integrated apps > Upload custom app), or from the Office Add-ins store (in Word > Insert > Get Add-ins > search 'goHeather' > Add). Option B: deploy via Intune for managed devices, using the Microsoft 365 centralized deployment method.
5. Verify the add-in appears in Word: open Word > Home tab and look for the goHeather ribbon section. Click 'Analyze Contract' to verify connectivity.
Note

goHeather SSO configuration requires coordination with their support team—initiate this during the vendor procurement phase to avoid delays. The Word add-in requires Word desktop (not Word Online) for full functionality. Test with a single attorney first before rolling out to all 5. If goHeather SSO setup is complex, SAML configuration may require their enterprise support tier. Alternative: LegalSifter uses a similar deployment model and can be substituted 1:1 in this step.

Step 7: Practice Management Integration - Clio API Configuration

Configure the bidirectional integration between the contract AI system and Clio practice management. This enables attorneys to trigger contract analysis from within their matter workflow and automatically links flagged items back to the relevant matter/client record. Also configures proper time entry tracking for AI-assisted review.

Clio API registration, credential configuration, and connectivity testing
bash
# 1. Register an application in Clio Developer Portal
# Navigate to https://app.clio.com/nc/#/settings/developer_applications
# Create new application:
#   Name: 'Legal AI Contract Analyzer'
#   URL: https://legalai.<client-domain>.com
#   Redirect URI: https://legalai.<client-domain>.com/auth/clio/callback
#   Scopes: matters:read, documents:read, documents:write, activities:write, contacts:read

# 2. Store Clio OAuth credentials in .env file on orchestration server
echo 'CLIO_CLIENT_ID=<your-clio-client-id>' >> /opt/legalai/.env
echo 'CLIO_CLIENT_SECRET=<your-clio-client-secret>' >> /opt/legalai/.env
echo 'CLIO_REDIRECT_URI=https://legalai.<client-domain>.com/auth/clio/callback' >> /opt/legalai/.env

# 3. Test Clio API connectivity
curl -X GET 'https://app.clio.com/api/v4/matters.json?limit=5' \
  -H 'Authorization: Bearer <test-access-token>' \
  -H 'Content-Type: application/json'

# 4. Verify document retrieval
curl -X GET 'https://app.clio.com/api/v4/documents.json?matter_id=<test-matter-id>' \
  -H 'Authorization: Bearer <test-access-token>' \
  -H 'Content-Type: application/json'
Note

Clio uses OAuth 2.0 for authentication. The access token expires and must be refreshed using the refresh token—the application handles this automatically. Clio rate limits API calls to 600 requests per minute per user. The 'activities:write' scope allows the system to create time entries tagged as 'AI-Assisted Review' to support ABA Rule 1.5 fee transparency. If the firm uses PracticePanther or MyCase instead of Clio, equivalent REST API integrations are available.
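The token-refresh handling described above follows the standard OAuth 2.0 refresh grant. A minimal stdlib-only sketch is below; the token endpoint URL is an assumption based on the standard pattern, so confirm it against Clio's current API documentation before relying on it:

```python
import json
import urllib.parse
import urllib.request

# RFC 6749 refresh_token grant. The endpoint URL below is an assumption;
# verify it in Clio's developer documentation.
CLIO_TOKEN_URL = "https://app.clio.com/oauth/token"

def build_refresh_request(client_id: str, client_secret: str, refresh_token: str) -> dict:
    """Form-encoded body for a refresh_token grant."""
    return {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }

def refresh_access_token(client_id: str, client_secret: str, refresh_token: str) -> dict:
    """POST the refresh grant and return the new token payload."""
    body = urllib.parse.urlencode(
        build_refresh_request(client_id, client_secret, refresh_token)
    ).encode()
    req = urllib.request.Request(CLIO_TOKEN_URL, data=body, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # access_token, refresh_token, expires_in

payload = build_refresh_request("<client-id>", "<client-secret>", "<refresh-token>")
print(sorted(payload))
```

In production, persist the rotated refresh token immediately after each exchange, since the old one may be invalidated.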

Step 8: Custom Playbook Development and Loading

Work with the designated attorney champion to develop legal playbooks that define the firm's standard positions, required clauses, and risk thresholds for each contract type they commonly review. This is the most critical step for solution effectiveness. Load completed playbooks into both the goHeather platform and the custom PostgreSQL database.

bash
# Connect to the orchestration server
ssh azureuser@<vm-public-ip>
cd /opt/legalai
source venv/bin/activate

# Load playbook data using the management script
python -m app.management.load_playbook --file playbooks/nda_standard.json
python -m app.management.load_playbook --file playbooks/commercial_lease.json
python -m app.management.load_playbook --file playbooks/employment_agreement.json
python -m app.management.load_playbook --file playbooks/vendor_services.json
python -m app.management.load_playbook --file playbooks/msa_standard.json

# Verify playbooks loaded
python -m app.management.list_playbooks

# Expected output:
# ID | Name                          | Contract Type | Clauses | Active
# -- | ----------------------------- | ------------- | ------- | ------
# 1  | Standard NDA Playbook         | NDA           | 18      | Yes
# 2  | Commercial Lease Playbook     | Lease         | 32      | Yes
# 3  | Employment Agreement Playbook | Employment    | 25      | Yes
# 4  | Vendor Services Playbook      | Services      | 22      | Yes
# 5  | Master Services Agreement     | MSA           | 28      | Yes
Critical

CRITICAL: Playbook quality determines the entire ROI of this project. Budget 2-4 weeks and 20-40 hours of attorney time for this step. Each playbook should define: (1) required clauses that must be present, (2) standard acceptable language for each clause, (3) red-flag patterns that indicate risk, (4) severity ratings for deviations. The MSP should provide the JSON template structure; the attorney champion fills in the legal substance. See the custom_ai_components section for the complete playbook JSON schema. goHeather also has its own playbook builder in the web UI—configure both systems with identical standards.
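As a hypothetical illustration only (the authoritative JSON schema lives in the custom_ai_components section), a playbook file consistent with the playbook_clauses columns from Step 5 might look like:

```json
{
  "name": "Standard NDA Playbook",
  "contract_type": "NDA",
  "jurisdiction": "New York",
  "clauses": [
    {
      "clause_name": "Confidentiality Term",
      "clause_category": "term",
      "standard_position": "Confidentiality obligations survive for 3 years after termination.",
      "risk_level": "high",
      "is_required": true,
      "sample_language": "The obligations of confidentiality shall survive for three (3) years...",
      "red_flag_patterns": ["perpetual", "indefinite", "in perpetuity"],
      "notes": "Flag any survival period longer than 5 years for partner review."
    }
  ]
}
```

Each clause entry pairs a machine-checkable part (red_flag_patterns) with the legal substance (standard_position, sample_language) that the attorney champion supplies.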

Step 9: Deploy Custom AI Application Code

Deploy the FastAPI-based contract analysis application to the orchestration server. This application provides the API endpoints for contract upload, analysis triggering, result retrieval, and Clio integration. It uses LangChain to orchestrate multi-step AI analysis against the firm's playbooks.

bash
# SSH into orchestration server
ssh azureuser@<vm-public-ip>
cd /opt/legalai

# Clone or copy application files (see custom_ai_components for full source)
# The main application files should be placed as follows:
# /opt/legalai/app/main.py          - FastAPI application entry point
# /opt/legalai/app/analyzer.py      - Core LangChain contract analysis engine
# /opt/legalai/app/models.py         - SQLAlchemy ORM models
# /opt/legalai/app/schemas.py        - Pydantic request/response schemas
# /opt/legalai/app/clio_client.py    - Clio API integration
# /opt/legalai/app/document_loader.py - PDF/DOCX text extraction
# /opt/legalai/app/prompts/          - Prompt templates directory
# /opt/legalai/app/management/       - CLI management scripts

# Run database migrations
cd /opt/legalai
source venv/bin/activate
alembic upgrade head

# Start the service
sudo systemctl start legalai
sudo systemctl status legalai

# Verify the API is running
curl -s https://legalai.<client-domain>.com/health | python3 -m json.tool
# Expected: {"status": "healthy", "database": "connected", "openai": "connected"}

# Run a test analysis
curl -X POST https://legalai.<client-domain>.com/api/v1/analyze \
  -H 'Authorization: Bearer <test-token>' \
  -F 'file=@/opt/legalai/test_contracts/sample_nda.pdf' \
  -F 'contract_type=NDA' \
  -F 'playbook_id=1'
Note

The complete application source code is provided in the custom_ai_components section. Deploy using Git (recommended) or SCP the files directly. After initial deployment, set up a CI/CD pipeline using GitHub Actions or Azure DevOps for future updates. Monitor the systemd journal for errors: journalctl -u legalai -f. The /health endpoint should be added to the MSP's monitoring system (e.g., ConnectWise Automate or Datto RMM).
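Before the analyzer.py engine can prompt the model, long contracts extracted by document_loader.py must be split into model-sized chunks. A minimal character-window chunker is sketched below; the 4-characters-per-token heuristic and the window sizes are assumptions for illustration (the production code would use tiktoken for exact counts):

```python
def chunk_text(text: str, max_tokens: int = 1_000, overlap_tokens: int = 100) -> list[str]:
    """Split contract text into overlapping windows, sizing them with a
    rough 4-characters-per-token heuristic. Overlap preserves clause
    context that straddles a chunk boundary."""
    max_chars = max_tokens * 4
    overlap_chars = overlap_tokens * 4
    step = max_chars - overlap_chars
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + max_chars]
        if chunk:
            chunks.append(chunk)
    return chunks

sample = "Indemnification. " * 2_000  # ~34,000 characters of dummy clause text
chunks = chunk_text(sample)
print(len(chunks), len(chunks[0]))
```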

Step 10: Configure SharePoint Integration for Document Flow

Set up the automated document flow between the firm's SharePoint document library and the contract AI system. When contracts are uploaded to a designated SharePoint folder, they are automatically queued for analysis. Results are written back as metadata and linked comments.

powershell
# Register Azure AD app for SharePoint access
az ad app create `
  --display-name 'LegalAI-SharePoint-Connector' `
  --sign-in-audience AzureADMyOrg `
  --required-resource-accesses '@sharepoint-permissions.json'

# Grant admin consent
az ad app permission admin-consent --id <app-id>

# Create SharePoint document library structure
# Via SharePoint admin or PnP PowerShell:
Install-Module PnP.PowerShell -Force
Connect-PnPOnline -Url 'https://<tenant>.sharepoint.com/sites/contracts' -Interactive

# Create the contracts library and add metadata columns
New-PnPList -Title 'Contracts' -Template DocumentLibrary
Add-PnPField -List 'Contracts' -DisplayName 'AI Analysis Status' -InternalName 'AIStatus' -Type Choice -Choices 'Pending','Analyzing','Completed','Error'
Add-PnPField -List 'Contracts' -DisplayName 'Risk Score' -InternalName 'RiskScore' -Type Number
Add-PnPField -List 'Contracts' -DisplayName 'Flagged Items' -InternalName 'FlaggedItems' -Type Number
Add-PnPField -List 'Contracts' -DisplayName 'Contract Type' -InternalName 'ContractType' -Type Choice -Choices 'NDA','Lease','Employment','Services','MSA','Other'
Add-PnPField -List 'Contracts' -DisplayName 'Analysis Report' -InternalName 'AnalysisReport' -Type URL

# Create Power Automate flow to trigger analysis
# Flow: When a file is created in SharePoint 'Contracts for Review' folder
# -> HTTP POST to https://legalai.<client-domain>.com/api/v1/analyze/sharepoint
# -> With file URL, metadata, and user context
# (Configure this in Power Automate web UI - see notes)
Note

If the firm uses iManage or NetDocuments instead of SharePoint, substitute the appropriate API connector. iManage uses REST API with OAuth 2.0; NetDocuments uses their REST API with HMAC authentication. The Power Automate flow provides a no-code trigger mechanism that attorneys understand—they simply upload a contract to the designated folder and the AI analysis happens automatically. Create a 'Contracts for Review' subfolder and a 'Reviewed' subfolder for completed analyses.
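The guide doesn't show the flow's POST body, only that it carries the file URL, metadata, and user context. A sketch of one plausible payload shape for the /api/v1/analyze/sharepoint call — the field names are assumptions for illustration, not the application's documented API contract; the allowed types mirror the Contract Type Choice column created above:

```python
# Mirrors the 'Contract Type' SharePoint Choice column from the PnP script above
CONTRACT_TYPES = {'NDA', 'Lease', 'Employment', 'Services', 'MSA', 'Other'}

def build_sharepoint_payload(file_url: str, contract_type: str, uploaded_by: str) -> dict:
    """Assemble the JSON body the flow would POST when a file lands in 'Contracts for Review'."""
    if contract_type not in CONTRACT_TYPES:
        raise ValueError(f"contract_type must be one of {sorted(CONTRACT_TYPES)}")
    return {
        "file_url": file_url,           # SharePoint file URL from the trigger
        "contract_type": contract_type,
        "uploaded_by": uploaded_by,     # user context for the audit trail
    }
```

Validating the contract type on the sending side keeps malformed uploads out of the analysis queue and surfaces the error in Power Automate's run history rather than in the API logs.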

Step 11: Entra ID SSO and RBAC Configuration

Configure single sign-on for the custom AI application using Entra ID (Azure AD) and set up role-based access control to differentiate between attorney (reviewer), partner (admin), and staff (read-only) access levels. This ensures all access is authenticated and audited.

Register LegalAI app in Entra ID, define RBAC roles, and apply environment configuration
bash
# Register the LegalAI application in Entra ID
az ad app create \
  --display-name 'Legal AI Contract Analyzer' \
  --web-redirect-uris 'https://legalai.<client-domain>.com/auth/callback' \
  --enable-id-token-issuance true

# Define app roles
az ad app update --id <app-id> --app-roles '[
  {
    "allowedMemberTypes": ["User"],
    "description": "Attorney - can upload contracts, view analysis, provide feedback",
    "displayName": "Attorney",
    "isEnabled": true,
    "value": "Attorney"
  },
  {
    "allowedMemberTypes": ["User"],
    "description": "Partner - full access including playbook management and reporting",
    "displayName": "Partner",
    "isEnabled": true,
    "value": "Partner"
  },
  {
    "allowedMemberTypes": ["User"],
    "description": "Staff - read-only access to completed analyses",
    "displayName": "Staff",
    "isEnabled": true,
    "value": "Staff"
  }
]'

# Add Entra ID configuration to application .env
echo 'AZURE_AD_TENANT_ID=<tenant-id>' >> /opt/legalai/.env
echo 'AZURE_AD_CLIENT_ID=<app-client-id>' >> /opt/legalai/.env
echo 'AZURE_AD_CLIENT_SECRET=<app-client-secret>' >> /opt/legalai/.env

# Restart the application to pick up new config
sudo systemctl restart legalai
  • Assign users to roles via Azure Portal: Entra ID > Enterprise Applications > Legal AI Contract Analyzer > Users and groups > Add each attorney/partner/staff member with appropriate role
Note

The application uses the MSAL (Microsoft Authentication Library) Python package for Entra ID integration. All API endpoints validate the JWT token and check role claims before processing requests. Partners can manage playbooks and view firm-wide analytics; attorneys can upload contracts and review flagged items; staff can only view completed analyses. This RBAC model supports ABA Rule 1.1 by ensuring only authorized attorneys make decisions based on AI output.
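The role semantics described above reduce to a small permission table that each endpoint can consult after MSAL validates the token. A sketch under the assumption that role claims arrive as a list of strings — the action names are illustrative; the role-to-capability mapping follows the description in this note:

```python
# Partner: full access; Attorney: upload/review/feedback; Staff: read-only
ROLE_PERMISSIONS = {
    "Partner":  {"upload", "review", "feedback", "manage_playbooks", "view_analytics", "view_completed"},
    "Attorney": {"upload", "review", "feedback", "view_completed"},
    "Staff":    {"view_completed"},
}

def authorize(roles: list[str], action: str) -> bool:
    """True if any of the token's role claims grants the requested action."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)

print(authorize(["Attorney"], "upload"))           # True
print(authorize(["Staff"], "manage_playbooks"))    # False
```

Centralizing the table in one place keeps the RBAC model auditable, which matters when demonstrating ABA Rule 1.1 supervision controls to the client.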

Step 12: Testing with Sample Contract Corpus

Execute a structured testing phase using the firm's 50-contract sample corpus. Compare AI analysis output against the attorney champion's manual annotations to measure accuracy, identify false positives/negatives, and tune the system before go-live. Document all results for the client handoff.

Batch analysis and accuracy report generation for test corpus
bash
# Upload test contracts in batch
cd /opt/legalai
source venv/bin/activate

# Run batch analysis on test corpus
python -m app.management.batch_analyze \
  --input-dir /opt/legalai/test_contracts/ \
  --playbook-auto-detect \
  --output-report /opt/legalai/test_results/batch_report.json

# Generate accuracy report comparing AI output to attorney annotations
python -m app.management.accuracy_report \
  --ai-results /opt/legalai/test_results/batch_report.json \
  --attorney-annotations /opt/legalai/test_contracts/attorney_annotations.json \
  --output /opt/legalai/test_results/accuracy_report.html
  • Review key metric: True positive rate (correctly flagged issues)
  • Review key metric: False positive rate (incorrectly flagged non-issues)
  • Review key metric: False negative rate (missed actual issues) — MOST CRITICAL
  • Review key metric: Clause extraction accuracy
  • Review key metric: Risk score correlation with attorney assessment
  • Upload 10 sample contracts through the Word add-in
  • Compare goHeather flags against the custom pipeline flags
  • Document any discrepancies
Note

Target metrics: True positive rate >85%, False negative rate <10% (missing real issues is the highest-risk failure mode), False positive rate <30% (some over-flagging is acceptable and preferred over under-flagging). If false negative rate exceeds 10%, review and strengthen the relevant playbook clauses and prompt templates before go-live. This testing phase typically requires 2-3 iterations of playbook tuning. Involve the attorney champion in reviewing every flagged item during testing to calibrate the system.
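Under the hood, the accuracy report reduces to set comparisons between AI flags and attorney annotations. A minimal sketch of the three headline rates, assuming each flag is keyed by a (contract, clause) pair — the keying scheme is an assumption, and the rates follow this guide's informal usage (share of attorney-identified issues caught or missed, share of AI flags that were non-issues) rather than the classical TN-based definitions:

```python
def accuracy_rates(ai_flags: set, attorney_flags: set) -> dict:
    """Compute headline rates from keyed flag sets."""
    tp = len(ai_flags & attorney_flags)   # correctly flagged issues
    fp = len(ai_flags - attorney_flags)   # flagged non-issues
    fn = len(attorney_flags - ai_flags)   # missed real issues (most critical)
    return {
        "true_positive_rate": tp / len(attorney_flags) if attorney_flags else 1.0,
        "false_negative_rate": fn / len(attorney_flags) if attorney_flags else 0.0,
        "false_positive_rate": fp / len(ai_flags) if ai_flags else 0.0,
    }

rates = accuracy_rates(
    ai_flags={("c1", "indemnification"), ("c1", "term"), ("c2", "liability")},
    attorney_flags={("c1", "indemnification"), ("c2", "liability"), ("c2", "assignment")},
)
# Compare against the targets above: TPR > 0.85, FNR < 0.10, FPR < 0.30
```

With these definitions, every missed attorney annotation raises the false negative rate directly, which is why tightening the playbook (rather than the prompt temperature) is the first lever when FNR exceeds 10%.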

Step 13: Attorney Training and Go-Live

Conduct structured training sessions for all attorneys and relevant staff. Cover the goHeather Word add-in workflow, the custom analysis dashboard, how to interpret risk scores and flagged items, the attorney feedback loop, and ABA compliance requirements. Then execute a phased go-live starting with a single practice group.

1. Overview: What the AI does and doesn't do (30 min) — AI as first-pass reviewer, not decision-maker; ABA Formal Opinion 512 compliance requirements; attorney's ongoing duty of competence and supervision
2. goHeather Word Add-in Demo (45 min) — Opening a contract in Word; clicking 'Analyze' in the goHeather ribbon; reading the traffic-light analysis (red/amber/green); accepting or modifying suggested redlines; custom playbook selection per contract type
3. Custom Analysis Dashboard Demo (45 min) — Uploading contracts via SharePoint (auto-trigger); reviewing the analysis report; understanding risk scores (0-10 scale); acting on flagged items (accept/dismiss/modify); linking analysis to Clio matters
4. Feedback Loop & Continuous Improvement (15 min) — How to report false positives/negatives; monthly playbook review process; escalation to MSP for technical issues
5. Q&A and Practice Exercises (30 min) — Each attorney analyzes a practice contract; group discussion of results
Note

Schedule training in 2 sessions: one for the pilot group (2-3 attorneys) who go live first, and a second session 2 weeks later for remaining attorneys after the pilot group validates the workflow. Provide a printed quick-reference card (laminated, desk-sized) showing the most common workflows. Record the training session for future onboarding of new attorneys. Create a shared OneNote or Confluence page for FAQs. The attorney champion should be available for peer support during the first 2 weeks of go-live.

Custom AI Components

Contract Analysis Engine

Type: agent. The core LangChain-based agent that performs multi-step contract analysis. It accepts a contract document (PDF or DOCX), extracts text, identifies the contract type, matches it against the appropriate playbook, and performs clause-by-clause analysis to flag unusual clauses, missing provisions, and risk terms. Returns a structured JSON analysis report with severity-rated flagged items.

Implementation:

/opt/legalai/app/analyzer.py
python
# /opt/legalai/app/analyzer.py

import json
import hashlib
import time
from typing import Optional
from datetime import datetime

from langchain_openai import AzureChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field
from sqlalchemy.orm import Session
import structlog

logger = structlog.get_logger()

# --- Output Schema ---
class FlaggedItem(BaseModel):
    item_type: str = Field(description="One of: unusual_clause, missing_provision, risk_term, deviation, ambiguity")
    severity: str = Field(description="One of: info, low, medium, high, critical")
    clause_reference: str = Field(description="Section/clause number or heading where the issue was found")
    original_text: str = Field(description="The exact text from the contract that triggered this flag")
    description: str = Field(description="Clear explanation of why this item was flagged")
    recommendation: str = Field(description="Specific recommended action for the reviewing attorney")

class ContractAnalysisResult(BaseModel):
    contract_type: str = Field(description="Identified contract type")
    parties: list[str] = Field(description="Names of all parties to the contract")
    effective_date: Optional[str] = Field(description="Contract effective date if found")
    overall_risk_score: float = Field(description="Overall risk score from 0.0 (low risk) to 10.0 (extreme risk)")
    executive_summary: str = Field(description="2-3 paragraph summary of the contract and key findings")
    flagged_items: list[FlaggedItem] = Field(description="All flagged issues found in the contract")
    missing_provisions: list[str] = Field(description="Standard provisions expected for this contract type that are missing")
    positive_observations: list[str] = Field(description="Clauses that are well-drafted or favorable to the client")

# --- Analysis Engine ---
class ContractAnalyzer:
    def __init__(self, db_session: Session):
        self.db = db_session
        self.llm = AzureChatOpenAI(
            azure_deployment="gpt-5.4-contract-review",
            api_version="2024-10-21",
            temperature=0.1,  # Low temperature for consistent, precise analysis
            max_tokens=8000,
        )
        self.parser = PydanticOutputParser(pydantic_object=ContractAnalysisResult)

    def _load_playbook(self, playbook_id: str) -> dict:
        """Load playbook and its clauses from the database."""
        from app.models import Playbook, PlaybookClause
        playbook = self.db.query(Playbook).filter(Playbook.id == playbook_id, Playbook.is_active == True).first()
        if not playbook:
            raise ValueError(f"Playbook {playbook_id} not found or inactive")
        clauses = self.db.query(PlaybookClause).filter(PlaybookClause.playbook_id == playbook_id).all()
        return {
            "name": playbook.name,
            "contract_type": playbook.contract_type,
            "jurisdiction": playbook.jurisdiction,
            "clauses": [
                {
                    "name": c.clause_name,
                    "category": c.clause_category,
                    "standard_position": c.standard_position,
                    "risk_level": c.risk_level,
                    "is_required": c.is_required,
                    "sample_language": c.sample_language,
                    "red_flag_patterns": c.red_flag_patterns or []
                }
                for c in clauses
            ]
        }

    def _build_analysis_prompt(self, contract_text: str, playbook: dict) -> tuple[ChatPromptTemplate, dict]:
        """Build the multi-part analysis prompt and its template variables."""
        system_template = """You are an expert legal contract analyst AI assistant working for a law firm. Your role is to perform thorough contract review by comparing the provided contract against the firm's legal playbook.

You must:
1. Identify ALL clauses in the contract and compare each against the playbook's standard positions
2. Flag any clause that deviates from the firm's standard position, noting the severity of deviation
3. Identify any REQUIRED clauses from the playbook that are MISSING from the contract
4. Flag risk terms including but not limited to: unlimited liability, unilateral termination rights, automatic renewal without notice, broad indemnification, non-mutual obligations, overly broad non-compete/non-solicit, unfavorable governing law, waiver of jury trial, limitation on consequential damages exclusion, assignment without consent
5. Identify ambiguous language that could be interpreted adversely
6. Note any unusual or non-standard clauses not typically found in this type of contract
7. Assess an overall risk score from 0.0 to 10.0

IMPORTANT GUIDELINES:
- Be thorough: it is better to over-flag than to miss a genuine risk
- Be specific: always cite the exact contract language and section references
- Be actionable: every flagged item must include a concrete recommendation
- Be objective: note both risks and favorable provisions
- Severity scale: 'critical' = potential significant financial/legal exposure; 'high' = material deviation requiring negotiation; 'medium' = notable deviation worth discussing; 'low' = minor deviation, likely acceptable; 'info' = observation, no action needed
- This analysis will be reviewed by a licensed attorney who makes all final decisions

{format_instructions}"""

        human_template = """## LEGAL PLAYBOOK: {playbook_name}
Contract Type: {contract_type}
Jurisdiction: {jurisdiction}

### Standard Positions and Required Clauses:
{playbook_clauses}

---

## CONTRACT TO ANALYZE:
{contract_text}

---

Perform a comprehensive analysis of this contract against the playbook above. Identify all flagged items, missing provisions, and risk terms. Provide your analysis in the specified JSON format."""

        playbook_clauses_text = ""
        for clause in playbook["clauses"]:
            required_marker = " [REQUIRED]" if clause["is_required"] else ""
            red_flags = ""
            if clause["red_flag_patterns"]:
                red_flags = f"\n    Red flags: {', '.join(clause['red_flag_patterns'])}"
            playbook_clauses_text += f"""\n- **{clause['name']}** ({clause['category']}) - Risk Level: {clause['risk_level']}{required_marker}
    Standard Position: {clause['standard_position']}{red_flags}\n"""

        prompt = ChatPromptTemplate.from_messages([
            ("system", system_template),
            ("human", human_template)
        ])
        return prompt, {
            "format_instructions": self.parser.get_format_instructions(),
            "playbook_name": playbook["name"],
            "contract_type": playbook["contract_type"],
            "jurisdiction": playbook.get("jurisdiction", "Not specified"),
            "playbook_clauses": playbook_clauses_text,
            "contract_text": contract_text
        }

    async def analyze_contract(
        self,
        contract_text: str,
        playbook_id: str,
        contract_id: str,
        uploaded_by: str
    ) -> ContractAnalysisResult:
        """Execute the full contract analysis pipeline."""
        start_time = time.time()
        logger.info("contract_analysis_started", contract_id=contract_id, playbook_id=playbook_id)

        # Load playbook
        playbook = self._load_playbook(playbook_id)

        # Build prompt
        prompt, variables = self._build_analysis_prompt(contract_text, playbook)

        # Execute LLM analysis
        chain = prompt | self.llm | self.parser
        result = await chain.ainvoke(variables)

        processing_time = time.time() - start_time
        logger.info(
            "contract_analysis_completed",
            contract_id=contract_id,
            risk_score=result.overall_risk_score,
            flagged_count=len(result.flagged_items),
            missing_count=len(result.missing_provisions),
            processing_seconds=round(processing_time, 2)
        )

        # Persist results to database
        from app.models import AnalysisResult, FlaggedItemDB, AuditLog
        analysis_record = AnalysisResult(
            contract_id=contract_id,
            playbook_id=playbook_id,
            overall_risk_score=result.overall_risk_score,
            summary=result.executive_summary,
            model_version="gpt-5.4",
            processing_time_seconds=round(processing_time, 2),
            raw_response=result.model_dump()
        )
        self.db.add(analysis_record)
        self.db.flush()

        for item in result.flagged_items:
            flagged_record = FlaggedItemDB(
                analysis_id=analysis_record.id,
                item_type=item.item_type,
                severity=item.severity,
                clause_reference=item.clause_reference,
                original_text=item.original_text,
                description=item.description,
                recommendation=item.recommendation
            )
            self.db.add(flagged_record)

        # Audit log
        audit_entry = AuditLog(
            action="contract_analyzed",
            user_id=uploaded_by,
            contract_id=contract_id,
            details={
                "playbook_id": playbook_id,
                "risk_score": result.overall_risk_score,
                "flagged_items": len(result.flagged_items),
                "processing_seconds": round(processing_time, 2)
            }
        )
        self.db.add(audit_entry)
        self.db.commit()

        return result

Contract Type Classifier

Type: skill. A lightweight AI skill that automatically identifies the type of contract (NDA, lease, employment agreement, MSA, vendor services, etc.) from its text content. This enables automatic playbook matching so attorneys don't need to manually specify the contract type before analysis. Runs as a pre-processing step before the main analysis engine.

Implementation:

python
# /opt/legalai/app/classifier.py

from langchain_openai import AzureChatOpenAI
from langchain.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from langchain.output_parsers import PydanticOutputParser
from sqlalchemy.orm import Session

class ClassificationResult(BaseModel):
    contract_type: str = Field(description="The identified contract type. One of: NDA, Lease, Employment, Services, MSA, Purchase, License, Partnership, Consulting, Settlement, Other")
    confidence: float = Field(description="Confidence score from 0.0 to 1.0")
    parties: list[str] = Field(description="Names of all parties identified in the contract")
    effective_date: str | None = Field(description="Contract effective date if found, in YYYY-MM-DD format")
    jurisdiction: str | None = Field(description="Governing law jurisdiction if specified")
    reasoning: str = Field(description="Brief explanation of why this classification was chosen")

class ContractClassifier:
    SUPPORTED_TYPES = ['NDA', 'Lease', 'Employment', 'Services', 'MSA', 'Purchase', 'License', 'Partnership', 'Consulting', 'Settlement', 'Other']

    def __init__(self):
        # Classification reuses the contract-review deployment with a small token cap;
        # substitute a cheaper, faster deployment here if one is provisioned
        self.llm = AzureChatOpenAI(
            azure_deployment="gpt-5.4-contract-review",
            api_version="2024-10-21",
            temperature=0.0,
            max_tokens=500,
        )
        self.parser = PydanticOutputParser(pydantic_object=ClassificationResult)

    async def classify(self, contract_text: str) -> ClassificationResult:
        """Classify a contract and extract key metadata."""
        # Only send first 3000 chars for classification (cost optimization)
        text_sample = contract_text[:3000]

        prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a legal document classifier. Analyze the provided contract text and identify:
1. The type of contract
2. All named parties
3. The effective date
4. The governing law jurisdiction

Supported contract types: {supported_types}

{format_instructions}"""),
            ("human", "Classify this contract:\n\n{text}")
        ])

        chain = prompt | self.llm | self.parser
        result = await chain.ainvoke({
            "supported_types": ", ".join(self.SUPPORTED_TYPES),
            "format_instructions": self.parser.get_format_instructions(),
            "text": text_sample
        })
        return result

    def match_playbook(self, classification: ClassificationResult, db: Session) -> str | None:
        """Match the classified contract type to the best available playbook."""
        from app.models import Playbook

        # Try exact match first
        playbook = db.query(Playbook).filter(
            Playbook.contract_type == classification.contract_type,
            Playbook.is_active == True
        )

        # If jurisdiction is known, prefer jurisdiction-specific playbook
        if classification.jurisdiction:
            jurisdiction_match = playbook.filter(
                Playbook.jurisdiction == classification.jurisdiction
            ).first()
            if jurisdiction_match:
                return str(jurisdiction_match.id)

        # Fall back to general playbook for this type
        general_match = playbook.filter(
            Playbook.jurisdiction.is_(None) | (Playbook.jurisdiction == '')
        ).first()
        if general_match:
            return str(general_match.id)

        # If no match, use the first active playbook of this type
        any_match = playbook.first()
        return str(any_match.id) if any_match else None

Risk Scoring Workflow

Type: workflow. An end-to-end workflow that orchestrates the complete contract analysis pipeline: document ingestion, text extraction, contract classification, playbook matching, AI analysis, risk scoring, result persistence, and notification. This is the main entry point that ties all components together.

Implementation:

/opt/legalai/app/workflows.py
python
# /opt/legalai/app/workflows.py

import hashlib
from datetime import datetime
from typing import Optional

import pdfplumber
from docx import Document as DocxDocument
from azure.storage.blob import BlobServiceClient
from azure.identity import DefaultAzureCredential
from sqlalchemy.orm import Session
import structlog

from app.analyzer import ContractAnalyzer, ContractAnalysisResult
from app.classifier import ContractClassifier
from app.models import Contract, AuditLog
from app.clio_client import ClioClient

logger = structlog.get_logger()

class ContractReviewWorkflow:
    def __init__(self, db: Session):
        self.db = db
        self.classifier = ContractClassifier()
        self.analyzer = ContractAnalyzer(db)
        self.clio = ClioClient()

    def _extract_text(self, file_bytes: bytes, filename: str) -> str:
        """Extract text from PDF or DOCX files."""
        import io

        if filename.lower().endswith('.pdf'):
            text_parts = []
            with pdfplumber.open(io.BytesIO(file_bytes)) as pdf:
                for page in pdf.pages:
                    page_text = page.extract_text()
                    if page_text:
                        text_parts.append(page_text)
            full_text = "\n\n".join(text_parts)
            if len(full_text.strip()) < 100:
                logger.warning("pdf_low_text_extraction", filename=filename, chars=len(full_text))
                # May be a scanned PDF - would need OCR (Azure Document Intelligence)
                raise ValueError("PDF appears to be scanned/image-based. OCR processing required. Please upload a text-based PDF or DOCX.")
            return full_text
        elif filename.lower().endswith('.docx'):
            doc = DocxDocument(io.BytesIO(file_bytes))
            return "\n\n".join([para.text for para in doc.paragraphs if para.text.strip()])
        else:
            raise ValueError(f"Unsupported file type: {filename}. Please upload PDF or DOCX.")

    def _upload_to_blob(self, file_bytes: bytes, filename: str, contract_id: str) -> str:
        """Upload contract to Azure Blob Storage for archival."""
        import os
        blob_service = BlobServiceClient(
            account_url=f"https://{os.environ['AZURE_STORAGE_ACCOUNT']}.blob.core.windows.net",
            credential=DefaultAzureCredential()
        )
        container_client = blob_service.get_container_client(os.environ['AZURE_STORAGE_CONTAINER'])
        blob_path = f"{datetime.utcnow().strftime('%Y/%m')}/{contract_id}/{filename}"
        container_client.upload_blob(name=blob_path, data=file_bytes, overwrite=True)
        return blob_path

    async def execute(
        self,
        file_bytes: bytes,
        filename: str,
        uploaded_by: str,
        playbook_id: Optional[str] = None,
        contract_type: Optional[str] = None,
        clio_matter_id: Optional[str] = None
    ) -> dict:
        """Execute the complete contract review workflow."""

        # Step 1: Create contract record
        file_hash = hashlib.sha256(file_bytes).hexdigest()
        contract = Contract(
            filename=filename,
            file_hash=file_hash,
            blob_path="pending",
            uploaded_by=uploaded_by,
            clio_matter_id=clio_matter_id,
            status="analyzing"
        )
        self.db.add(contract)
        self.db.flush()
        contract_id = str(contract.id)

        logger.info("workflow_started", contract_id=contract_id, filename=filename)

        try:
            # Step 2: Extract text
            contract_text = self._extract_text(file_bytes, filename)
            logger.info("text_extracted", contract_id=contract_id, chars=len(contract_text))

            # Step 3: Upload to blob storage
            blob_path = self._upload_to_blob(file_bytes, filename, contract_id)
            contract.blob_path = blob_path

            # Step 4: Classify contract (if type not specified)
            if not playbook_id:
                classification = await self.classifier.classify(contract_text)
                contract.contract_type = classification.contract_type
                contract.parties = classification.parties
                if classification.effective_date:
                    try:
                        contract.effective_date = datetime.strptime(classification.effective_date, '%Y-%m-%d').date()
                    except ValueError:
                        pass

                # Step 5: Match playbook
                playbook_id = self.classifier.match_playbook(classification, self.db)
                if not playbook_id:
                    logger.warning("no_playbook_match", contract_type=classification.contract_type)
                    # Use a generic/default playbook
                    from app.models import Playbook
                    default = self.db.query(Playbook).filter(Playbook.contract_type == 'Other', Playbook.is_active == True).first()
                    if default:
                        playbook_id = str(default.id)
                    else:
                        raise ValueError(f"No playbook found for contract type: {classification.contract_type}")

            elif contract_type:
                contract.contract_type = contract_type

            # Step 6: Run AI analysis
            result = await self.analyzer.analyze_contract(
                contract_text=contract_text,
                playbook_id=playbook_id,
                contract_id=contract_id,
                uploaded_by=uploaded_by
            )

            # Step 7: Update contract status
            contract.status = "completed"
            self.db.commit()

            # Step 8: Link to Clio matter (if provided)
            if clio_matter_id:
                try:
                    await self.clio.create_note(
                        matter_id=clio_matter_id,
                        subject=f"AI Contract Analysis: {filename}",
                        body=self._format_clio_note(result)
                    )
                    logger.info("clio_note_created", matter_id=clio_matter_id)
                except Exception as e:
                    logger.error("clio_integration_error", error=str(e))

            # Step 9: Send notification for high-risk contracts
            if result.overall_risk_score >= 7.0:
                logger.warning(
                    "high_risk_contract_detected",
                    contract_id=contract_id,
                    risk_score=result.overall_risk_score,
                    critical_items=len([i for i in result.flagged_items if i.severity == 'critical'])
                )
                # TODO: Implement email/Teams notification for high-risk contracts

            return {
                "contract_id": contract_id,
                "status": "completed",
                "risk_score": result.overall_risk_score,
                "flagged_items_count": len(result.flagged_items),
                "missing_provisions_count": len(result.missing_provisions),
                "analysis": result.model_dump()
            }

        except Exception as e:
            contract.status = "error"
            self.db.add(AuditLog(
                action="analysis_error",
                user_id=uploaded_by,
                contract_id=contract_id,
                details={"error": str(e)}
            ))
            self.db.commit()
            logger.error("workflow_error", contract_id=contract_id, error=str(e))
            raise

    def _format_clio_note(self, result: ContractAnalysisResult) -> str:
        """Format analysis results as a readable note for Clio."""
        lines = [
            f"## AI Contract Analysis Report",
            f"**Overall Risk Score: {result.overall_risk_score}/10**",
            f"",
            f"### Executive Summary",
            result.executive_summary,
            f"",
            f"### Flagged Items ({len(result.flagged_items)} total)",
        ]
        for item in sorted(result.flagged_items, key=lambda x: ['critical','high','medium','low','info'].index(x.severity)):
            lines.append(f"- **[{item.severity.upper()}]** {item.clause_reference}: {item.description}")
        if result.missing_provisions:
            lines.append(f"\n### Missing Provisions")
            for provision in result.missing_provisions:
                lines.append(f"- {provision}")
        lines.append(f"\n---")
        lines.append(f"*This analysis was generated by AI and must be reviewed by a licensed attorney before any action is taken. See ABA Formal Opinion 512.*")
        return "\n".join(lines)

Playbook JSON Schema and Loader

Defines the JSON schema for legal playbooks and provides the management script for loading playbooks into the database. Each playbook defines the firm's standard positions for a specific contract type, including required clauses, acceptable language, risk levels, and red-flag patterns. This is the configuration artifact that attorneys collaborate on to customize the system.

Example: /opt/legalai/playbooks/nda_standard.json
json
{
  "name": "Standard NDA Playbook",
  "contract_type": "NDA",
  "jurisdiction": "New York",
  "version": 1,
  "clauses": [
    {
      "clause_name": "Definition of Confidential Information",
      "clause_category": "definitions",
      "standard_position": "Confidential information should be broadly defined to include all non-public information disclosed by either party, whether written, oral, or visual. Should include carve-outs for information that is: (a) publicly known, (b) already known to the recipient, (c) independently developed, or (d) received from a third party without restriction.",
      "risk_level": "high",
      "is_required": true,
      "sample_language": "'Confidential Information' means any and all non-public information, whether in written, oral, electronic, or other form, disclosed by the Disclosing Party to the Receiving Party...",
      "red_flag_patterns": [
        "excludes oral disclosures",
        "requires written marking as sole condition",
        "one-way definition favoring only one party",
        "no standard exceptions listed"
      ]
    },
    {
      "clause_name": "Non-Disclosure Obligations",
      "clause_category": "obligations",
      "standard_position": "Mutual obligations preferred. Receiving party must protect confidential information using at least the same degree of care it uses for its own confidential information, but no less than reasonable care. Must restrict access to need-to-know employees and advisors.",
      "risk_level": "high",
      "is_required": true,
      "sample_language": null,
      "red_flag_patterns": [
        "no standard of care specified",
        "unrestricted sharing with affiliates",
        "unilateral obligations only"
      ]
    },
    {
      "clause_name": "Term and Duration",
      "clause_category": "term",
      "standard_position": "Agreement term of 2-3 years. Confidentiality obligations should survive for 3-5 years after termination. Trade secrets should be protected indefinitely.",
      "risk_level": "medium",
      "is_required": true,
      "sample_language": null,
      "red_flag_patterns": [
        "perpetual obligations with no end date",
        "survival period less than 2 years",
        "no trade secret exception"
      ]
    },
    {
      "clause_name": "Permitted Disclosures",
      "clause_category": "exceptions",
      "standard_position": "Must include exceptions for legally compelled disclosures (subpoena, court order, regulatory request). Should require prompt notice to the disclosing party before such disclosure when legally permissible.",
      "risk_level": "high",
      "is_required": true,
      "sample_language": null,
      "red_flag_patterns": [
        "no exception for legal compulsion",
        "no notice requirement before compelled disclosure",
        "overly broad permitted disclosure categories"
      ]
    },
    {
      "clause_name": "Return or Destruction of Information",
      "clause_category": "termination",
      "standard_position": "Upon termination or request, receiving party must return or destroy all confidential information and certify destruction in writing. Should allow retention of copies required by law or internal compliance policies.",
      "risk_level": "medium",
      "is_required": true,
      "sample_language": null,
      "red_flag_patterns": [
        "no return/destruction obligation",
        "no certification requirement",
        "no compliance retention exception"
      ]
    },
    {
      "clause_name": "Remedies",
      "clause_category": "remedies",
      "standard_position": "Should acknowledge that breach may cause irreparable harm and that injunctive relief may be sought without proving actual damages. Should not waive right to other remedies.",
      "risk_level": "medium",
      "is_required": false,
      "sample_language": null,
      "red_flag_patterns": [
        "liquidated damages with cap below potential exposure",
        "waiver of injunctive relief",
        "limitation on consequential damages"
      ]
    },
    {
      "clause_name": "Non-Solicitation / Non-Compete",
      "clause_category": "restrictive_covenants",
      "standard_position": "Non-solicitation of employees may be acceptable for 12-24 months. Non-compete clauses should generally be avoided in NDAs. If present, must be narrowly tailored in scope, geography, and duration.",
      "risk_level": "high",
      "is_required": false,
      "sample_language": null,
      "red_flag_patterns": [
        "broad non-compete in an NDA",
        "non-solicitation exceeding 24 months",
        "no geographic limitation on non-compete",
        "non-compete without consideration"
      ]
    },
    {
      "clause_name": "Governing Law and Dispute Resolution",
      "clause_category": "dispute_resolution",
      "standard_position": "Governing law should be the firm client's home jurisdiction (New York). Federal and state courts in New York County preferred. Mandatory arbitration acceptable if with a recognized institution (AAA, JAMS).",
      "risk_level": "medium",
      "is_required": true,
      "sample_language": null,
      "red_flag_patterns": [
        "foreign governing law",
        "mandatory arbitration in unfavorable jurisdiction",
        "waiver of jury trial without reciprocity",
        "loser-pays attorney fee provision"
      ]
    },
    {
      "clause_name": "Assignment",
      "clause_category": "general",
      "standard_position": "Neither party should be able to assign rights or obligations without the other's prior written consent, except in connection with a merger, acquisition, or sale of substantially all assets.",
      "risk_level": "low",
      "is_required": false,
      "sample_language": null,
      "red_flag_patterns": [
        "unilateral assignment right",
        "assignment to affiliates without consent",
        "no assignment clause at all"
      ]
    },
    {
      "clause_name": "No License or IP Transfer",
      "clause_category": "ip",
      "standard_position": "Should explicitly state that no license to intellectual property is granted by disclosure of confidential information. Ownership of all IP remains with the disclosing party.",
      "risk_level": "high",
      "is_required": true,
      "sample_language": "Nothing in this Agreement shall be construed as granting any rights, by license or otherwise, to any Confidential Information...",
      "red_flag_patterns": [
        "no IP reservation clause",
        "implied license language",
        "work product ownership assigned to recipient"
      ]
    }
  ]
}
Playbook Loader Script
python
# /opt/legalai/app/management/load_playbook.py

import json
import sys
import argparse
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
import os

sys.path.insert(0, '/opt/legalai')
from app.models import Playbook, PlaybookClause

def load_playbook(filepath: str):
    engine = create_engine(os.environ['DATABASE_URL'])
    Session = sessionmaker(bind=engine)
    session = Session()

    with open(filepath, 'r') as f:
        data = json.load(f)

    # Check if playbook already exists
    existing = session.query(Playbook).filter(
        Playbook.name == data['name'],
        Playbook.contract_type == data['contract_type']
    ).first()

    if existing:
        print(f"Playbook '{data['name']}' already exists (ID: {existing.id}). Updating...")
        existing.jurisdiction = data.get('jurisdiction')
        existing.version = data.get('version', existing.version + 1)
        from datetime import datetime
        existing.updated_at = datetime.utcnow()
        # Delete existing clauses and reload
        session.query(PlaybookClause).filter(PlaybookClause.playbook_id == existing.id).delete()
        playbook_id = existing.id
    else:
        playbook = Playbook(
            name=data['name'],
            contract_type=data['contract_type'],
            jurisdiction=data.get('jurisdiction'),
            version=data.get('version', 1),
            created_by='system'
        )
        session.add(playbook)
        session.flush()
        playbook_id = playbook.id
        print(f"Created new playbook '{data['name']}' (ID: {playbook_id})")

    # Load clauses
    for clause_data in data['clauses']:
        clause = PlaybookClause(
            playbook_id=playbook_id,
            clause_name=clause_data['clause_name'],
            clause_category=clause_data['clause_category'],
            standard_position=clause_data['standard_position'],
            risk_level=clause_data.get('risk_level', 'medium'),
            is_required=clause_data.get('is_required', False),
            sample_language=clause_data.get('sample_language'),
            red_flag_patterns=clause_data.get('red_flag_patterns', []),
            notes=clause_data.get('notes')
        )
        session.add(clause)

    session.commit()
    print(f"Loaded {len(data['clauses'])} clauses for playbook '{data['name']}'")
    session.close()

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Load a legal playbook into the database')
    parser.add_argument('--file', required=True, help='Path to playbook JSON file')
    args = parser.parse_args()
    load_playbook(args.file)
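The loader trusts whatever JSON it is handed, so a pre-flight validation pass can catch missing keys or bad risk levels before anything touches the database. A minimal sketch (the `validate_playbook` helper and its key sets are illustrative, not part of the shipped loader):

```python
# Hypothetical pre-flight check for playbook JSON files; run before load_playbook.py.

REQUIRED_TOP = {'name', 'contract_type', 'clauses'}
REQUIRED_CLAUSE = {'clause_name', 'clause_category', 'standard_position'}
VALID_RISK = {'low', 'medium', 'high'}

def validate_playbook(data: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file looks loadable."""
    problems = [f"missing top-level key: {k}" for k in sorted(REQUIRED_TOP - data.keys())]
    for i, clause in enumerate(data.get('clauses', [])):
        for k in sorted(REQUIRED_CLAUSE - clause.keys()):
            problems.append(f"clause {i}: missing key {k}")
        risk = clause.get('risk_level', 'medium')  # loader defaults to 'medium'
        if risk not in VALID_RISK:
            problems.append(f"clause {i}: invalid risk_level '{risk}'")
    return problems
```

Run it against a parsed playbook file and refuse to load when the returned list is non-empty.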

FastAPI Application Entry Point

Type: integration. The main FastAPI application exposes REST API endpoints for contract upload, analysis, result retrieval, attorney feedback, and Clio integration. It includes authentication middleware that validates Entra ID JWT tokens and enforces role-based access control.

Implementation:

/opt/legalai/app/main.py
python
# /opt/legalai/app/main.py

import os
from contextlib import asynccontextmanager
from fastapi import FastAPI, UploadFile, File, Form, Depends, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session
from typing import Optional
import structlog
import jwt
from datetime import datetime

from app.workflows import ContractReviewWorkflow
from app.models import Contract, AnalysisResult, FlaggedItemDB, AuditLog

logger = structlog.get_logger()

# Database setup
engine = create_engine(os.environ['DATABASE_URL'], pool_pre_ping=True, pool_size=10)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

# JWT Token validation for Entra ID
TENANT_ID = os.environ.get('AZURE_AD_TENANT_ID')
CLIENT_ID = os.environ.get('AZURE_AD_CLIENT_ID')
JWKS_URL = f"https://login.microsoftonline.com/{TENANT_ID}/discovery/v2.0/keys"

async def get_current_user(request: Request) -> dict:
    auth_header = request.headers.get('Authorization')
    if not auth_header or not auth_header.startswith('Bearer '):
        raise HTTPException(status_code=401, detail="Missing or invalid authorization header")
    token = auth_header.split(' ')[1]
    try:
        # In production, cache the JWKS keys
        import httpx
        async with httpx.AsyncClient() as client:
            resp = await client.get(JWKS_URL)
            jwks = resp.json()
        
        unverified_header = jwt.get_unverified_header(token)
        rsa_key = None
        for key in jwks['keys']:
            if key['kid'] == unverified_header['kid']:
                rsa_key = jwt.algorithms.RSAAlgorithm.from_jwk(key)
                break
        if not rsa_key:
            raise HTTPException(status_code=401, detail="Unable to find appropriate key")
        
        payload = jwt.decode(
            token, rsa_key, algorithms=['RS256'],
            audience=CLIENT_ID,
            issuer=f"https://login.microsoftonline.com/{TENANT_ID}/v2.0"
        )
        return {
            'user_id': payload.get('preferred_username', payload.get('sub')),
            'name': payload.get('name'),
            'roles': payload.get('roles', []),
        }
    except jwt.ExpiredSignatureError:
        raise HTTPException(status_code=401, detail="Token expired")
    except jwt.InvalidTokenError as e:
        raise HTTPException(status_code=401, detail=f"Invalid token: {str(e)}")

def require_role(required_roles: list[str]):
    async def role_checker(user: dict = Depends(get_current_user)):
        if not any(role in user.get('roles', []) for role in required_roles):
            raise HTTPException(status_code=403, detail="Insufficient permissions")
        return user
    return role_checker

# FastAPI app
app = FastAPI(
    title="Legal AI Contract Analyzer",
    description="AI-powered contract review for legal services",
    version="1.0.0"
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://<client-domain>.sharepoint.com", "https://localhost:3000"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/health")
async def health_check(db: Session = Depends(get_db)):
    from sqlalchemy import text
    try:
        db.execute(text("SELECT 1"))
        db_status = "connected"
    except Exception:
        db_status = "disconnected"
    return {"status": "healthy", "database": db_status, "timestamp": datetime.utcnow().isoformat()}

@app.post("/api/v1/analyze")
async def analyze_contract(
    file: UploadFile = File(...),
    contract_type: Optional[str] = Form(None),
    playbook_id: Optional[str] = Form(None),
    clio_matter_id: Optional[str] = Form(None),
    user: dict = Depends(require_role(['Attorney', 'Partner'])),
    db: Session = Depends(get_db)
):
    """Upload and analyze a contract document."""
    if not file.filename or not file.filename.lower().endswith(('.pdf', '.docx')):
        raise HTTPException(status_code=400, detail="Only PDF and DOCX files are supported")
    
    file_bytes = await file.read()
    if len(file_bytes) > 50 * 1024 * 1024:  # 50MB limit
        raise HTTPException(status_code=400, detail="File size exceeds 50MB limit")

    workflow = ContractReviewWorkflow(db)
    result = await workflow.execute(
        file_bytes=file_bytes,
        filename=file.filename,
        uploaded_by=user['user_id'],
        playbook_id=playbook_id,
        contract_type=contract_type,
        clio_matter_id=clio_matter_id
    )
    return JSONResponse(content=result)

@app.get("/api/v1/contracts")
async def list_contracts(
    status: Optional[str] = None,
    limit: int = 50,
    offset: int = 0,
    user: dict = Depends(require_role(['Attorney', 'Partner', 'Staff'])),
    db: Session = Depends(get_db)
):
    """List analyzed contracts with optional status filter."""
    query = db.query(Contract).order_by(Contract.uploaded_at.desc())
    if status:
        query = query.filter(Contract.status == status)
    contracts = query.offset(offset).limit(limit).all()
    return [{"id": str(c.id), "filename": c.filename, "contract_type": c.contract_type,
             "status": c.status, "uploaded_at": c.uploaded_at.isoformat(), "clio_matter_id": c.clio_matter_id}
            for c in contracts]

@app.get("/api/v1/contracts/{contract_id}/analysis")
async def get_analysis(
    contract_id: str,
    user: dict = Depends(require_role(['Attorney', 'Partner', 'Staff'])),
    db: Session = Depends(get_db)
):
    """Get the full analysis results for a contract."""
    analysis = db.query(AnalysisResult).filter(
        AnalysisResult.contract_id == contract_id
    ).order_by(AnalysisResult.analysis_timestamp.desc()).first()
    if not analysis:
        raise HTTPException(status_code=404, detail="Analysis not found")
    flagged = db.query(FlaggedItemDB).filter(FlaggedItemDB.analysis_id == analysis.id).all()
    return {
        "contract_id": contract_id,
        "analysis_id": str(analysis.id),
        "risk_score": float(analysis.overall_risk_score),
        "summary": analysis.summary,
        "model_version": analysis.model_version,
        "analyzed_at": analysis.analysis_timestamp.isoformat(),
        "flagged_items": [
            {"id": str(f.id), "type": f.item_type, "severity": f.severity,
             "clause_reference": f.clause_reference, "original_text": f.original_text,
             "description": f.description, "recommendation": f.recommendation,
             "attorney_reviewed": f.attorney_reviewed, "attorney_action": f.attorney_action}
            for f in flagged
        ]
    }

@app.put("/api/v1/flagged-items/{item_id}/review")
async def review_flagged_item(
    item_id: str,
    action: str = Form(...),  # accepted, dismissed, modified
    notes: Optional[str] = Form(None),
    user: dict = Depends(require_role(['Attorney', 'Partner'])),
    db: Session = Depends(get_db)
):
    """Record attorney review decision on a flagged item."""
    item = db.query(FlaggedItemDB).filter(FlaggedItemDB.id == item_id).first()
    if not item:
        raise HTTPException(status_code=404, detail="Flagged item not found")
    if action not in ('accepted', 'dismissed', 'modified'):
        raise HTTPException(status_code=400, detail="Action must be: accepted, dismissed, or modified")
    item.attorney_reviewed = True
    item.attorney_action = action
    item.attorney_notes = notes
    item.reviewed_at = datetime.utcnow()
    item.reviewed_by = user['user_id']
    db.add(AuditLog(
        action="flagged_item_reviewed",
        user_id=user['user_id'],
        contract_id=item.analysis.contract_id if item.analysis else None,
        details={"item_id": item_id, "action": action, "notes": notes}
    ))
    db.commit()
    return {"status": "updated", "item_id": item_id, "action": action}

@app.get("/api/v1/analytics/summary")
async def analytics_summary(
    user: dict = Depends(require_role(['Partner'])),
    db: Session = Depends(get_db)
):
    """Dashboard analytics for partners - review volumes, risk distribution, accuracy metrics."""
    from sqlalchemy import func
    total_contracts = db.query(func.count(Contract.id)).filter(Contract.status == 'completed').scalar()
    avg_risk = db.query(func.avg(AnalysisResult.overall_risk_score)).scalar()
    flagged_by_severity = db.query(
        FlaggedItemDB.severity, func.count(FlaggedItemDB.id)
    ).group_by(FlaggedItemDB.severity).all()
    reviewed_items = db.query(func.count(FlaggedItemDB.id)).filter(FlaggedItemDB.attorney_reviewed == True).scalar()
    dismissed_items = db.query(func.count(FlaggedItemDB.id)).filter(FlaggedItemDB.attorney_action == 'dismissed').scalar()
    false_positive_rate = (dismissed_items / reviewed_items * 100) if reviewed_items > 0 else 0
    return {
        "total_contracts_analyzed": total_contracts,
        "average_risk_score": round(float(avg_risk or 0), 1),
        "flagged_by_severity": {s: c for s, c in flagged_by_severity},
        "attorney_review_stats": {
            "total_reviewed": reviewed_items,
            "dismissed_as_false_positive": dismissed_items,
            "false_positive_rate_pct": round(false_positive_rate, 1)
        }
    }

Clio Practice Management Integration Client

Type: integration. API client for bidirectional integration with Clio practice management: it retrieves matter and document information, creates activity notes with analysis results, and logs time entries for AI-assisted review. The client handles the OAuth 2.0 refresh-token flow; rate-limit handling (HTTP 429 backoff) is left to the caller.

Implementation:

/opt/legalai/app/clio_client.py
python
# /opt/legalai/app/clio_client.py

import os
import httpx
from datetime import datetime
from typing import Optional
import structlog

logger = structlog.get_logger()

class ClioClient:
    BASE_URL = "https://app.clio.com/api/v4"

    def __init__(self):
        self.client_id = os.environ.get('CLIO_CLIENT_ID')
        self.client_secret = os.environ.get('CLIO_CLIENT_SECRET')
        self.redirect_uri = os.environ.get('CLIO_REDIRECT_URI')
        self._access_token = None
        self._refresh_token = os.environ.get('CLIO_REFRESH_TOKEN')  # Set after initial OAuth flow

    async def _ensure_token(self):
        """Refresh the access token if needed."""
        if self._access_token:
            return
        if not self._refresh_token:
            logger.warning("clio_no_refresh_token", msg="Clio integration not authenticated. Run OAuth flow first.")
            return
        async with httpx.AsyncClient() as client:
            resp = await client.post("https://app.clio.com/oauth/token", data={
                "grant_type": "refresh_token",
                "refresh_token": self._refresh_token,
                "client_id": self.client_id,
                "client_secret": self.client_secret,
            })
            if resp.status_code == 200:
                data = resp.json()
                self._access_token = data['access_token']
                self._refresh_token = data.get('refresh_token', self._refresh_token)
                # Persist new refresh token
                # In production, store in Azure Key Vault
                logger.info("clio_token_refreshed")
            else:
                logger.error("clio_token_refresh_failed", status=resp.status_code, body=resp.text)

    async def _request(self, method: str, path: str, **kwargs) -> dict:
        await self._ensure_token()
        if not self._access_token:
            raise ConnectionError("Clio authentication not available")
        async with httpx.AsyncClient() as client:
            resp = await client.request(
                method,
                f"{self.BASE_URL}{path}",
                headers={
                    "Authorization": f"Bearer {self._access_token}",
                    "Content-Type": "application/json"
                },
                **kwargs
            )
            if resp.status_code == 401:
                # Access token expired mid-session: force a refresh and retry once
                self._access_token = None
                await self._ensure_token()
                if not self._access_token:
                    raise ConnectionError("Clio re-authentication failed")
                resp = await client.request(
                    method, f"{self.BASE_URL}{path}",
                    headers={"Authorization": f"Bearer {self._access_token}", "Content-Type": "application/json"},
                    **kwargs
                )
            resp.raise_for_status()
            return resp.json()

    async def get_matter(self, matter_id: str) -> dict:
        """Retrieve a matter from Clio."""
        result = await self._request("GET", f"/matters/{matter_id}.json", params={"fields": "id,display_number,description,client,status"})
        return result.get('data', {})

    async def create_note(self, matter_id: str, subject: str, body: str) -> dict:
        """Create an activity note on a Clio matter with analysis results."""
        payload = {
            "data": {
                "subject": subject,
                "body": body,
                "type": "Note",
                "matter": {"id": int(matter_id)},
                "date": datetime.utcnow().strftime("%Y-%m-%d")
            }
        }
        result = await self._request("POST", "/activities.json", json=payload)
        return result.get('data', {})

    async def create_time_entry(self, matter_id: str, description: str, duration_seconds: int, user_id: Optional[str] = None) -> dict:
        """Log a time entry for AI-assisted contract review. Marked as non-billable by default per ABA Rule 1.5."""
        payload = {
            "data": {
                "type": "TimeEntry",
                "quantity": duration_seconds,
                "note": f"[AI-Assisted Review] {description}",
                "matter": {"id": int(matter_id)},
                "date": datetime.utcnow().strftime("%Y-%m-%d"),
                "non_billable": True  # AI processing time is non-billable per ABA guidance
            }
        }
        if user_id:
            payload["data"]["user"] = {"id": int(user_id)}
        result = await self._request("POST", "/activities.json", json=payload)
        return result.get('data', {})
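Clio's API enforces rate limits and returns HTTP 429 when they are exceeded. The client above does not handle this itself; a small retry-with-backoff wrapper that `_request` calls could be routed through might look like the following sketch (the `RateLimited` exception and the `attempts`/`base_delay` defaults are illustrative):

```python
# Illustrative retry-with-backoff helper for rate-limited API calls (HTTP 429).
import asyncio

class RateLimited(Exception):
    """Raised by the wrapped call when the API returns HTTP 429."""

async def with_backoff(call, attempts: int = 4, base_delay: float = 1.0):
    """Run an async callable, retrying on RateLimited with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error to the caller
            await asyncio.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

In `_request`, a 429 response would raise `RateLimited` instead of calling `raise_for_status()`, and the caller would invoke the request through `with_backoff`.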

Testing & Validation

  • NETWORK TEST: From each attorney workstation, verify connectivity to all SaaS endpoints by running 'curl -s -o /dev/null -w "%{http_code}" https://app.goheather.com' and confirming HTTP 200 responses for goHeather, Azure OpenAI, and Clio endpoints. Verify DNS filtering is active by attempting to access an unauthorized AI tool (e.g., chat.openai.com) and confirming it is blocked by Cisco Umbrella policy.
  • SSO TEST: Each attorney logs into goHeather via the Entra ID SSO flow—verify successful authentication, correct role assignment, and that the goHeather dashboard loads with the firm's organization settings. Log in to the custom API at https://legalai.<client-domain>.com/api/v1/contracts using a Bearer token obtained from Entra ID and confirm the response returns an empty contract list with HTTP 200.
  • WORD ADD-IN TEST: Open Microsoft Word on each workstation, verify the goHeather add-in appears in the ribbon, open a sample NDA, click 'Analyze Contract', and confirm the traffic-light analysis appears within 30 seconds showing red/amber/green clause assessments.
  • DOCUMENT UPLOAD TEST: Upload a sample NDA PDF to the SharePoint 'Contracts for Review' folder and verify that: (1) the Power Automate flow triggers within 2 minutes, (2) the contract appears in the custom API with status 'analyzing', (3) status changes to 'completed' within 3 minutes, and (4) the SharePoint metadata columns (AI Analysis Status, Risk Score, Flagged Items) are automatically populated.
  • CONTRACT CLASSIFICATION TEST: Upload 10 contracts of different types (2 NDAs, 2 leases, 2 employment agreements, 2 service agreements, 2 MSAs) through the API without specifying contract_type. Verify the classifier correctly identifies at least 9 out of 10 contract types and matches the appropriate playbook.
  • FALSE NEGATIVE TEST (CRITICAL): Take 5 contracts from the test corpus that have known, attorney-annotated issues (e.g., an NDA with a missing confidentiality exception, a lease with an unusual auto-renewal clause). Run analysis and verify the AI flags at least 85% of the known issues. Document any missed issues and adjust playbook red_flag_patterns accordingly.
  • FALSE POSITIVE TEST: Run analysis on 5 contracts from the test corpus that the attorney champion has certified as 'clean' (no significant issues). Verify the false positive rate is below 30%—meaning no more than 30% of flagged items are dismissed by the attorney as non-issues. If the rate exceeds 30%, tune playbook standard_position descriptions to be more specific.
  • RISK SCORING CALIBRATION TEST: Analyze a set of 10 contracts pre-scored by the attorney champion on a 1-10 risk scale. Compare the AI risk scores to attorney scores—they should correlate within ±2 points on the 10-point scale for at least 80% of contracts. If correlation is poor, review the risk scoring prompt and adjust severity weightings.
  • CLIO INTEGRATION TEST: Analyze a contract with a valid clio_matter_id parameter. Verify that: (1) an activity note is created on the correct Clio matter, (2) the note contains the executive summary and all flagged items, (3) a non-billable time entry is logged tagged as 'AI-Assisted Review', and (4) the Clio note includes the ABA Formal Opinion 512 disclaimer.
  • AUDIT TRAIL TEST: Perform a complete contract review workflow, then query the audit_log table to verify entries exist for: contract upload, analysis completed, and any attorney review actions. Verify that all entries include the user_id, timestamp, and relevant contract_id. This audit trail is essential for ABA Rule 1.1 competence documentation.
  • PERFORMANCE TEST: Upload 5 contracts simultaneously (or in rapid sequence) and verify the system processes all 5 without errors and completes each within 5 minutes. Monitor the Azure VM CPU and memory during the test—CPU should stay below 80% sustained. Check Azure OpenAI rate limits are not hit (error 429).
  • ROLE-BASED ACCESS TEST: Authenticate as a 'Staff' role user and attempt to upload a contract via POST /api/v1/analyze—verify HTTP 403 Forbidden is returned. Authenticate as an 'Attorney' role and verify the same endpoint returns HTTP 200. Authenticate as a 'Partner' and verify access to GET /api/v1/analytics/summary.
  • DATA SECURITY TEST: Verify that the Azure Blob Storage account has public access disabled (az storage account show --name stlegalaidocs --query allowBlobPublicAccess). Verify PostgreSQL requires SSL connections. Verify the FortiGate firewall is logging all outbound HTTPS traffic to AI vendor endpoints. Confirm goHeather's DPA is on file with the no-training clause explicitly stated.
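The false-positive and calibration thresholds above can be computed directly from the review data. A minimal sketch (field names mirror the analysis API response earlier in this guide; the helper names are illustrative):

```python
def false_positive_rate(flagged_items: list[dict]) -> float:
    """Percent of attorney-reviewed items that were dismissed as non-issues."""
    reviewed = [i for i in flagged_items if i.get('attorney_reviewed')]
    if not reviewed:
        return 0.0
    dismissed = sum(1 for i in reviewed if i.get('attorney_action') == 'dismissed')
    return 100.0 * dismissed / len(reviewed)

def calibration_agreement(pairs: list[tuple[float, float]], tolerance: float = 2.0) -> float:
    """Percent of (ai_score, attorney_score) pairs agreeing within +/- tolerance on the 10-point scale."""
    if not pairs:
        return 0.0
    hits = sum(1 for ai, atty in pairs if abs(ai - atty) <= tolerance)
    return 100.0 * hits / len(pairs)
```

The FALSE POSITIVE TEST passes when `false_positive_rate` stays below 30; the RISK SCORING CALIBRATION TEST passes when `calibration_agreement` is at least 80.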

Client Handoff

Client Handoff Checklist

Training Delivered

1
Attorney Training Session (recorded, 2-3 hours): goHeather Word add-in usage, custom dashboard walkthrough, interpreting risk scores and flagged items, attorney feedback workflow (accept/dismiss/modify), ABA Formal Opinion 512 obligations, when to trust vs. override AI recommendations
2
Staff Training Session (1 hour): How to upload contracts to the SharePoint 'Contracts for Review' folder, how to check analysis status, how to access completed analysis reports (read-only)
3
Partner/Admin Training (1 hour): Analytics dashboard usage, playbook management overview, user access management via Entra ID, monthly accuracy reporting

Documentation Delivered

1
Quick Reference Card (laminated, desk-sized): Step-by-step for the two most common workflows—(a) analyze a contract in Word via goHeather, (b) upload via SharePoint for automated analysis
2
Playbook Reference Guide: Printout of all active playbooks showing standard positions, required clauses, and risk levels for each contract type
3
AI Usage Policy Template: Firm-specific policy document covering ABA Formal Opinion 512 requirements—attorney supervision obligations, client disclosure language, fee billing guidelines for AI-assisted work
4
System Architecture Diagram: Visual showing data flow from document upload through analysis to Clio integration, including where data is stored and encrypted
5
Vendor Security Summary: One-page summary of goHeather's security certifications (SOC 2, no-training clause), Azure data residency, and data processing agreements on file
6
Escalation Procedures: Contact information and process for reporting AI errors, requesting playbook changes, and technical support tickets with the MSP
7
Runbook for IT Staff: Covers password resets, adding/removing users in Entra ID, checking system health, and restarting the orchestration service if needed

Success Criteria Review

Maintenance

Ongoing Maintenance Responsibilities

Weekly (MSP - 1 hour/week)

  • Monitor Azure VM health, CPU, memory, and disk usage via Azure Monitor alerts (threshold: CPU >80% sustained, disk >85%)
  • Review application logs (journalctl -u legalai) for errors or anomalies
  • Check Azure OpenAI API consumption and rate limit events in Azure Monitor
  • Verify the /health endpoint returns 'healthy' status (integrate with MSP RMM tool—ConnectWise Automate or Datto RMM)
  • Review Cisco Umbrella DNS logs for any unauthorized AI tool access attempts

Monthly (MSP + Attorney Champion - 2-3 hours/month)

  • Generate and review the accuracy report: pull false positive rate from the analytics/summary endpoint; if false positive rate exceeds 30%, schedule a playbook tuning session
  • Review attorney feedback on flagged items: analyze patterns in dismissed items to identify playbook over-flagging; analyze patterns in items marked 'modified' to strengthen weak playbook positions
  • Update playbook clauses based on feedback: adjust red_flag_patterns, standard_positions, and risk_levels as needed
  • Review Azure costs and token consumption: ensure API spend is within budget ($50-200/month range); optimize prompts if costs are trending high
  • Apply OS security patches to the Azure VM (sudo apt update && sudo apt upgrade)
  • Verify the quarterly API key and secret rotation schedule is on track (rotate every quarter; store in Azure Key Vault)

Quarterly (MSP - 4 hours/quarter)

  • Comprehensive compliance review: verify all DPAs are current, confirm SOC 2 attestations are still valid, check for new state bar AI ethics opinions that may affect the firm's obligations
  • Update the AI usage policy if new guidance has been issued (monitor ABA, state bar associations, and EU AI Act developments)
  • Review and update goHeather/LegalSifter to latest version; test any new features in a sandbox before enabling for the firm
  • Performance benchmarking: re-run the test corpus through the system and compare accuracy metrics to baseline; investigate any degradation
  • Capacity planning: review contract analysis volume trends; scale Azure VM or OpenAI throughput if volume is growing
  • Backup verification: test restore of PostgreSQL database and Blob Storage contracts from backup
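For the benchmarking step, the baseline comparison can be automated so "degradation" is a defined condition rather than a judgment call. A minimal sketch, assuming metrics are stored as name-to-score dictionaries and that a 5-point drop is the investigation threshold (the tolerance value is an assumption, not from the runbook):

```python
def accuracy_degraded(baseline: dict, current: dict,
                      tolerance: float = 0.05) -> list[str]:
    """Return the metric names whose current score fell more than
    `tolerance` below the recorded baseline for the test corpus."""
    return [metric for metric, base_score in baseline.items()
            if current.get(metric, 0.0) < base_score - tolerance]
```

Any non-empty result from the quarterly run would open an investigation ticket, e.g. a precision drop from 0.90 to 0.80 is flagged while a recall move from 0.85 to 0.84 is within tolerance.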

Annually (MSP - 8 hours + attorney stakeholders)

  • Full playbook review with the firm's practice group leaders: add playbooks for new contract types, retire unused playbooks, update standard positions based on legal developments
  • Azure OpenAI model version assessment: evaluate newer GPT model versions for accuracy improvements; test with the standard corpus before upgrading
  • Security audit: penetration test of the custom API, review Entra ID access logs for anomalies, verify all former employees have been deprovisioned
  • Client satisfaction review: survey attorneys on AI accuracy, usability, and time savings; calculate ROI metrics (hours saved × billing rate vs. system cost)
  • Contract renewal negotiation: review goHeather/LegalSifter pricing; negotiate volume discounts based on usage data
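The ROI metric from the satisfaction review (hours saved × billing rate vs. system cost) is straightforward to compute; a sketch with illustrative numbers, not actual firm data:

```python
def annual_roi(hours_saved: float, billing_rate: float,
               system_cost: float) -> float:
    """ROI = (hours saved x billing rate - annual system cost) / system cost."""
    value_recovered = hours_saved * billing_rate
    return (value_recovered - system_cost) / system_cost

# Illustrative: 400 attorney-hours saved at a $350/hr billing rate
# against a $30,000 annual system cost gives roughly 3.7x ROI.
```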

SLA Considerations

  • System availability target: 99.5% uptime during business hours (Mon-Fri 7am-8pm client local time)
  • Analysis completion time: <5 minutes for standard contracts (<50 pages); <15 minutes for complex contracts (50-200 pages)
  • Critical issue response: 2-hour response time for system outages; 4-hour response for analysis errors
  • Playbook updates: Completed within 5 business days of attorney-approved change request
  • Escalation path: Level 1 (MSP help desk) → Level 2 (MSP integration engineer) → Level 3 (AI vendor support + MSP lead architect)
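Because the 99.5% target applies only to the business-hours window (13 hours per weekday), the monthly downtime budget is smaller than a 24/7 SLA would suggest. A quick calculation:

```python
def allowed_downtime_minutes(business_days: int, hours_per_day: float = 13.0,
                             availability: float = 0.995) -> float:
    """Downtime budget in minutes for a business-hours-only SLA window.
    Defaults reflect the guide's Mon-Fri 7am-8pm window and 99.5% target."""
    total_minutes = business_days * hours_per_day * 60
    return total_minutes * (1 - availability)

# A 21-business-day month has 16,380 in-scope minutes, so the
# 99.5% target allows roughly 82 minutes of business-hours downtime.
```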

Model Retraining / Update Triggers

  • False positive rate exceeds 35% for two consecutive months
  • False negative rate exceeds 15% (immediate playbook review required)
  • New contract type requested by the firm (new playbook development: $2,000-$5,000)
  • Major legal development (e.g., new regulation, landmark case) affecting standard clause positions
  • Azure OpenAI announces new model version with significant accuracy improvements
  • goHeather/LegalSifter releases major platform update affecting analysis capabilities
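The two metric-based triggers above can be evaluated automatically from the monthly accuracy reports; the remaining triggers (new contract types, legal developments, vendor releases) are inherently manual. A sketch of the metric checks:

```python
def retraining_triggers(monthly_fp_rates: list[float],
                        fn_rate: float) -> list[str]:
    """Evaluate the runbook's two metric-based update triggers.
    `monthly_fp_rates` is the chronological series of monthly FP rates;
    `fn_rate` is the most recent false-negative rate."""
    triggers = []
    if fn_rate > 0.15:
        triggers.append("false-negative rate above 15%: immediate playbook review")
    if len(monthly_fp_rates) >= 2 and all(r > 0.35 for r in monthly_fp_rates[-2:]):
        triggers.append("false-positive rate above 35% for two consecutive months")
    return triggers
```

For example, FP rates of 36% and 40% in the last two months fire the tuning trigger even if the false-negative rate is healthy.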

Alternatives

Enterprise Platform Approach (Luminance or Kira)

Instead of goHeather/LegalSifter for the SaaS layer, deploy Luminance or Kira Systems (by Litera) as the primary contract AI platform. These are purpose-built enterprise legal AI platforms with proprietary machine learning models (not just LLM wrappers). Luminance uses unsupervised ML for anomaly detection across 80+ languages with its proprietary Legal-Grade LLM. Kira specializes in M&A due diligence with 90%+ extraction accuracy and pre-trained models for 1,000+ clause types.

Fully Custom Build with Azure OpenAI Only (No SaaS)

Eliminate the goHeather/LegalSifter SaaS platform entirely and build the complete solution using Azure OpenAI GPT-5.4, LangChain, and a custom web frontend. All contract analysis runs through the custom pipeline with firm-specific playbooks. Includes building a React/Next.js web dashboard for attorneys and a Microsoft Word add-in (Office.js) for in-document analysis.

LegalSifter as Primary Platform (Instead of goHeather)

Use LegalSifter as the primary SaaS platform instead of goHeather. LegalSifter offers pre-built 'Sifters' (AI advisors) for 150+ specific clause types and is designed to work for both lawyers and non-lawyers reviewing contracts. Available at $29/user/month on a 2-year commitment, making it slightly cheaper than goHeather.

Spellbook (Multi-LLM Word-Native Approach)

Deploy Spellbook as the primary contract AI tool. Spellbook operates as a native Microsoft Word add-in and uses multiple LLMs (including GPT-5 and Claude) for contract review. It serves 4,000+ teams in 80+ countries and is GDPR, CCPA, and PIPEDA compliant. The entire workflow stays within Microsoft Word—no separate dashboard or web portal needed.

Note

WHEN TO RECOMMEND: Firms where attorney adoption is the primary concern; firms that have rejected previous technology tools due to workflow disruption; firms focused on commercial/transactional work where the Word-centric workflow is ideal.

Microsoft Copilot for Microsoft 365 + SharePoint Premium

Use Microsoft's own AI ecosystem—Copilot for Microsoft 365 for general contract drafting and review, combined with SharePoint Premium (formerly Syntex) for automated document processing, classification, and metadata extraction. No third-party AI vendor required; everything runs within the Microsoft tenant.
