
Implementation Guide: Fuse Multi-Source Intelligence Data — Pattern-of-Life Anomalies, Link Analysis & Threat Actor Profiles
Step-by-step implementation guide for deploying AI to fuse multi-source intelligence data — pattern-of-life anomalies, link analysis & threat actor profiles for Government & Defense clients.
Software Procurement
Microsoft Azure OpenAI Service (Azure Government) — IL4
GPT-5.4: ~$0.005/1K input, ~$0.015/1K output. Entity extraction from 100-page report: ~$2–$5. Threat actor profile synthesis: ~$5–$15 per profile.
For CUI//INTEL environments, Azure OpenAI running in Azure Government at IL4 authorization level is the required platform. Confirm with the client's ISSO that the specific data types being analyzed are within the IL4 boundary. Some intelligence-related CUI categories (e.g., CUI//SP-CTI — Controlled Technical Information for intelligence purposes) may require IL5 — verify before deployment.
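The per-document cost figures above can be sanity-checked with a quick token-based estimate. A sketch only: the chunk size, prompt overhead, and tokens-per-page figures below are planning assumptions, and the rates come from the pricing line above.

```python
# Rough cost estimator for a chunked entity-extraction pass.
# Chunk size, prompt overhead, and tokens-per-page are planning assumptions.
INPUT_RATE = 0.005 / 1000   # $ per input token (from the pricing line above)
OUTPUT_RATE = 0.015 / 1000  # $ per output token

def estimate_extraction_cost(pages: int, tokens_per_page: int = 500,
                             chunk_tokens: int = 1500, prompt_overhead: int = 800,
                             output_per_chunk: int = 3000) -> float:
    """Estimate the cost of one chunked extraction pass over a report."""
    doc_tokens = pages * tokens_per_page
    chunks = max(1, -(-doc_tokens // chunk_tokens))  # ceiling division
    per_chunk = ((chunk_tokens + prompt_overhead) * INPUT_RATE
                 + output_per_chunk * OUTPUT_RATE)
    return chunks * per_chunk
```

Under these assumptions a 100-page report lands near $2, consistent with the range quoted above; adjust the parameters to the client's actual document mix.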
Microsoft Sentinel (Azure Government)
~$2.46/GB ingested (Azure Government pricing); typical OSINT pipeline: 10–50GB/month = $25–$125/month
Cloud-native SIEM and SOAR platform running in Azure Government. Used as the aggregation layer for multi-source data feeds — ingests OSINT feeds, commercial threat intelligence, social media monitoring data, and structured data exports from other tools. Provides the data lake from which Azure OpenAI analysis is triggered.
Maltego (Entity Link Analysis)
Maltego
Maltego Enterprise: $5,000–$10,000/seat/year. Government pricing available.
Industry-standard platform for open-source intelligence (OSINT) collection and link analysis. Visualizes relationships between persons, organizations, IP addresses, domains, email addresses, phone numbers, and locations. Used by IC contractors, law enforcement, and cybersecurity analysts. Transforms (data queries) connect to OSINT data sources and commercial intelligence feeds. Output is a visual graph showing entity relationships and connection paths.
Maltego's cloud-based transforms route data through Maltego's servers — verify data sensitivity before using cloud transforms. For sensitive CUI, use locally-deployed transforms or the Maltego on-premises deployment.
Recorded Future Intelligence Cloud (Commercial Threat Intel Feed)
Recorded Future Intelligence Cloud
$50,000–$200,000+/year depending on modules and entity count
Recorded Future aggregates OSINT, dark web, technical threat intelligence, and geopolitical intelligence into structured, machine-readable threat intelligence. Provides APIs for automated ingestion into the Azure Sentinel pipeline. Used for threat actor profile enrichment, indicator of compromise (IOC) data, and vulnerability intelligence. FedRAMP Moderate authorized.
Babel Street (OSINT and Social Media Intelligence)
Babel Street
Contact vendor; government pricing available
Government-focused OSINT platform providing real-time multilingual social media monitoring, dark web monitoring, and geospatial intelligence data aggregation. FedRAMP authorized. Used by DHS, DoD, and IC contractors for pattern-of-life analysis and situational awareness. Provides structured API output that feeds into the Azure Sentinel aggregation layer.
Palantir Gotham (Enterprise Intelligence Platform — Optional)
Typically $1M+/year for enterprise deployments; available via DoD enterprise license
Enterprise intelligence fusion and analysis platform widely deployed across DoD and IC. If the client already has Palantir Gotham, integrate the Azure OpenAI analysis pipeline as an enrichment layer rather than replacing Palantir. Palantir AIP (AI Platform) also provides FedRAMP High authorized LLM capabilities that can substitute for the Azure OpenAI components described here.
Prerequisites
- Cleared personnel requirement: Even for unclassified OSINT analysis, IC contractor environments typically require cleared personnel (minimum Secret, often TS) to access the analysis systems and review outputs. The MSP technicians deploying this system must hold the appropriate clearances for the environment. Confirm with the client's FSO before beginning deployment.
- Data source agreements: OSINT and commercial intelligence feeds require licensing agreements. Confirm the client has active, paid subscriptions to all data sources before configuring ingestion pipelines. Verify the licensing agreement covers the intended use case (some intelligence data licenses restrict automated processing or AI analysis).
- Terms of service compliance for OSINT sources: Many public OSINT sources (social media platforms, public records databases) have Terms of Service that restrict automated bulk collection. The client's legal counsel must confirm that the collection methodology complies with platform ToS, relevant laws (Computer Fraud and Abuse Act, state equivalents), and any agency-specific collection authorities.
- Collection authority documentation (government clients): For government agency clients conducting intelligence collection activities, the client must have documented legal authority for each data collection activity (e.g., Executive Order 12333, relevant statutes, agency authorities). The MSP does not determine collection legality — but must not configure pipelines for data the client is not legally authorized to collect.
- IL4/IL5 boundary confirmation: Work with the client's ISSO to formally determine whether the analysis data is within IL4 (CUI) or requires IL5 (CUI with more stringent controls). This determination drives platform selection and must be documented before go-live.
- IT admin access: Azure Government subscription (Owner), Microsoft Sentinel workspace, Maltego Enterprise admin, API credentials for commercial intelligence feeds.
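Before beginning Step 1, a quick pre-flight script can confirm that the admin credentials and API keys above are actually configured. A sketch; the variable names match those used in the installation steps below.

```python
# Pre-flight check: confirm required environment variables are set before deployment.
# The names mirror those used in the installation steps below.
import os

REQUIRED_VARS = [
    "AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET",
    "SENTINEL_WORKSPACE_ID", "DATA_COLLECTION_ENDPOINT", "DATA_COLLECTION_RULE_ID",
    "CUSTOM_STREAM_NAME", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_KEY",
    "AZURE_OPENAI_DEPLOYMENT", "RECORDED_FUTURE_API_KEY",
]

def check_environment(required=REQUIRED_VARS) -> list:
    """Return the list of missing environment variables (empty list = ready)."""
    return [name for name in required if not os.environ.get(name)]

missing = check_environment()
if missing:
    print("Missing configuration:", ", ".join(missing))
```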
Installation Steps
Step 1: Configure the Multi-Source Data Ingestion Pipeline in Microsoft Sentinel
Set up Sentinel (Azure Government) as the aggregation layer, ingesting data from OSINT feeds, commercial threat intelligence, and structured data exports.
# sentinel_data_connectors.py
# Configure data connectors for intelligence data sources.
# Note: Most connector configuration is done via Azure Portal (portal.azure.us)
# or ARM templates. The code below handles custom data ingestion via the
# Azure Monitor Logs Ingestion API for sources without native Sentinel connectors.
import requests
import json
import os
import datetime
from azure.identity import ClientSecretCredential
TENANT_ID = os.environ["AZURE_TENANT_ID"]
CLIENT_ID = os.environ["AZURE_CLIENT_ID"]
CLIENT_SECRET = os.environ["AZURE_CLIENT_SECRET"]
SENTINEL_WORKSPACE_ID = os.environ["SENTINEL_WORKSPACE_ID"]
DCE_ENDPOINT = os.environ["DATA_COLLECTION_ENDPOINT"] # Azure Government DCE
DCR_IMMUTABLE_ID = os.environ["DATA_COLLECTION_RULE_ID"]
STREAM_NAME = os.environ["CUSTOM_STREAM_NAME"] # e.g., "Custom-OsintEvents"
def get_access_token() -> str:
    """Get bearer token for Azure Monitor Logs Ingestion API."""
    credential = ClientSecretCredential(TENANT_ID, CLIENT_ID, CLIENT_SECRET)
    token = credential.get_token("https://monitor.azure.us/.default")
    return token.token

def ingest_osint_events(events: list) -> bool:
    """Ingest OSINT events into Sentinel custom table via Logs Ingestion API."""
    token = get_access_token()
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    # Normalize events to Sentinel schema
    normalized = []
    for event in events:
        normalized.append({
            "TimeGenerated": event.get("timestamp", datetime.datetime.utcnow().isoformat()),
            "EventSource": event.get("source", "Unknown"),
            "EntityType": event.get("entity_type", "Unknown"),
            "EntityValue": event.get("entity_value", ""),
            "EventType": event.get("event_type", ""),
            "Description": event.get("description", ""),
            "Confidence": event.get("confidence", 0),
            "Tags": json.dumps(event.get("tags", [])),
            "RawData": json.dumps(event.get("raw", {}))[:32000]
        })
    url = f"{DCE_ENDPOINT}/dataCollectionRules/{DCR_IMMUTABLE_ID}/streams/{STREAM_NAME}?api-version=2023-01-01"
    resp = requests.post(url, headers=headers, json=normalized)
    if resp.status_code == 204:
        return True
    else:
        print(f"Ingestion error: {resp.status_code} — {resp.text}")
        return False

def fetch_recorded_future_indicators(query: str, limit: int = 100) -> list:
    """Fetch threat indicators from the Recorded Future API."""
    RF_API_KEY = os.environ["RECORDED_FUTURE_API_KEY"]
    headers = {
        "X-RFToken": RF_API_KEY,
        "Content-Type": "application/json"
    }
    resp = requests.get(
        "https://api.recordedfuture.com/v2/indicator/search",
        headers=headers,
        params={"query": query, "limit": limit, "fields": "risk,entity,timestamps,evidence"}
    )
    resp.raise_for_status()
    indicators = []
    for ind in resp.json().get("data", {}).get("results", []):
        indicators.append({
            "source": "Recorded Future",
            "entity_type": ind.get("entity", {}).get("type", ""),
            "entity_value": ind.get("entity", {}).get("name", ""),
            "risk_score": ind.get("risk", {}).get("score", 0),
            "risk_rules": [r.get("rule", "") for r in ind.get("risk", {}).get("rules", [])],
            "first_seen": ind.get("timestamps", {}).get("firstSeen", ""),
            "last_seen": ind.get("timestamps", {}).get("lastSeen", ""),
            "evidence": ind.get("evidence", [])
        })
    return indicators

Microsoft Sentinel in Azure Government supports native data connectors for many common sources (Microsoft Threat Intelligence, TAXII servers, CEF/Syslog). Use native connectors where available — they are more reliable and require less maintenance than custom ingestion pipelines. Custom ingestion via the Logs Ingestion API is only needed for sources without a native Sentinel connector. Configure all Sentinel workspaces with RBAC restricting access to authorized analysts only — Sentinel workspaces in intelligence environments should have zero public access and all access logged.
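The two helpers above can be wired into a simple batch job, for example pulling Recorded Future indicators on a schedule and pushing them into the custom table in bounded batches. A sketch: the 500-event batch size and the field mapping are illustrative choices, not Sentinel requirements.

```python
# Batch job sketch: pull Recorded Future indicators and push them into Sentinel
# in bounded batches, using fetch_recorded_future_indicators() and
# ingest_osint_events() defined above. Batch size and mapping are illustrative.
def map_rf_to_osint_event(ind: dict) -> dict:
    """Map a Recorded Future indicator onto the OSINT event schema used above."""
    return {
        "source": ind.get("source", "Recorded Future"),
        "entity_type": ind.get("entity_type", "Unknown"),
        "entity_value": ind.get("entity_value", ""),
        "event_type": "threat_indicator",
        "description": "; ".join(ind.get("risk_rules", [])),
        "confidence": ind.get("risk_score", 0),
        "tags": ["recorded_future"],
        "raw": ind,
    }

def run_rf_ingestion(query: str, batch_size: int = 500) -> int:
    """Fetch indicators, normalize, and ingest in batches. Returns events ingested."""
    indicators = fetch_recorded_future_indicators(query, limit=1000)
    events = [map_rf_to_osint_event(i) for i in indicators]
    ingested = 0
    for start in range(0, len(events), batch_size):
        batch = events[start:start + batch_size]
        if ingest_osint_events(batch):
            ingested += len(batch)
    return ingested
```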
Step 2: Build the Entity Extraction and Link Analysis Pipeline
Extract named entities from ingested OSINT data and identify relationships for link analysis visualization.
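Because the extraction output comes from an LLM, it is worth validating the JSON before it reaches the Maltego export step. A minimal consistency check, written against the schema this step's prompt defines (a sketch; field names are assumptions matching that prompt):

```python
# Consistency check for LLM extraction output before Maltego export (sketch).
# Field names assume the JSON schema defined in ENTITY_EXTRACTION_PROMPT.
def validate_extraction(result: dict) -> list:
    """Return a list of problems found in an extraction result (empty = clean)."""
    problems = []
    entities = result.get("entities", [])
    relationships = result.get("relationships", [])
    ids = {e.get("entity_id") for e in entities}
    for e in entities:
        if not e.get("entity_id") or not e.get("value"):
            problems.append(f"entity missing id or value: {e}")
        conf = e.get("confidence")
        if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
            problems.append(f"entity {e.get('entity_id')} has invalid confidence")
    for r in relationships:
        for key in ("subject_entity_id", "object_entity_id"):
            if r.get(key) not in ids:
                problems.append(f"relationship {r.get('relationship_id')} references unknown {key}")
        if not r.get("evidence"):
            problems.append(f"relationship {r.get('relationship_id')} lacks supporting evidence")
    return problems
```

Rejecting documents with a non-empty problem list prevents hallucinated entity references from reaching the analyst's link graph.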
# entity_extraction.py
# Extract entities and relationships from OSINT text for link analysis
from openai import AzureOpenAI
import os, json
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-08-01-preview"
)
ENTITY_EXTRACTION_PROMPT = """You are an intelligence analyst performing entity extraction
and relationship mapping for link analysis.
Extract all named entities and relationships from the following intelligence text.
ENTITY TYPES TO EXTRACT:
- PERSON: Named individuals (full name, aliases, titles/roles if mentioned)
- ORGANIZATION: Companies, government agencies, military units, terrorist organizations, criminal groups
- LOCATION: Countries, cities, specific facilities, coordinates if mentioned
- FACILITY: Specific named buildings, bases, ports, infrastructure
- VESSEL: Ships, aircraft, vehicles if named
- DEVICE: Specific electronic devices, weapons systems if named
- IP_ADDRESS: Internet protocol addresses
- DOMAIN: Web domains and URLs
- EMAIL: Email addresses
- PHONE: Phone numbers
- DATE: Specific dates or date ranges mentioned
- EVENT: Named events, operations, incidents
For each entity:
{{
"entity_id": "unique ID for this document (E001, E002...)",
"type": "PERSON|ORGANIZATION|LOCATION|...",
"value": "the entity name/value as it appears in text",
"normalized": "standardized form (e.g., 'USA' → 'United States')",
"aliases": ["other names used for same entity in this document"],
"context": "brief context from document (max 50 words)",
"confidence": 0.0-1.0
}}
For each RELATIONSHIP between entities:
{{
"relationship_id": "R001, R002...",
"subject_entity_id": "E001",
"relationship_type": "MEMBER_OF|LOCATED_AT|COMMUNICATES_WITH|AFFILIATED_WITH|COMMANDS|OWNS|OPERATES|ATTENDED|FUNDED_BY|...",
"object_entity_id": "E002",
"evidence": "quote from text supporting this relationship",
"confidence": 0.0-1.0,
"date_of_relationship": "date or date range if determinable"
}}
Return JSON only with keys "entities" and "relationships".
INTELLIGENCE TEXT:
{text}"""
def extract_entities_and_relationships(text: str) -> dict:
    response = client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=[{
            "role": "user",
            "content": ENTITY_EXTRACTION_PROMPT.format(text=text[:6000])
        }],
        temperature=0.0,
        max_tokens=3000,
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)
def export_to_maltego_csv(entities: list, relationships: list, output_file: str):
    """Export extracted entities and relationships to Maltego-importable CSV format."""
    import csv
    # Entity CSV
    entity_file = output_file.replace(".csv", "_entities.csv")
    with open(entity_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(["Entity Type", "Entity Value", "Alias", "Context", "Confidence"])
        for e in entities:
            writer.writerow([
                e.get("type"), e.get("value"), "|".join(e.get("aliases", [])),
                e.get("context", ""), e.get("confidence", "")
            ])
    # Relationship CSV (Maltego link format)
    rel_file = output_file.replace(".csv", "_relationships.csv")
    entity_map = {e["entity_id"]: e["value"] for e in entities}
    with open(rel_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(["From Entity", "To Entity", "Relationship", "Evidence", "Confidence"])
        for r in relationships:
            writer.writerow([
                entity_map.get(r.get("subject_entity_id"), "Unknown"),
                entity_map.get(r.get("object_entity_id"), "Unknown"),
                r.get("relationship_type"),
                r.get("evidence", ""),
                r.get("confidence", "")
            ])
    print(f"Exported: {entity_file}, {rel_file}")
    return entity_file, rel_file

Step 3: Build the Threat Actor Profile Generator
Generate and maintain structured threat actor profiles from aggregated intelligence sources.
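Section 5 of the profile template below asks analysts to flag IOCs older than roughly six months as stale. That check can also be enforced in code before indicators are handed to the model. A sketch: the 180-day threshold and the ISO-8601 date format (as returned by the Recorded Future helper in Step 1) are assumptions.

```python
# Flag stale IOCs (>180 days old) before they feed a profile (sketch).
# Assumes ISO-8601 date strings; the 180-day threshold is a policy assumption.
import datetime

STALE_AFTER_DAYS = 180

def is_stale(last_seen: str, today: datetime.date = None) -> bool:
    """True when an indicator's last-seen date is more than ~6 months old."""
    today = today or datetime.date.today()
    try:
        seen = datetime.date.fromisoformat(last_seen[:10])
    except (ValueError, TypeError):
        return True  # unparseable or missing dates are treated as stale
    return (today - seen).days > STALE_AFTER_DAYS
```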
# threat_actor_profiler.py
# Reuses the AzureOpenAI `client` and imports configured in entity_extraction.py (Step 2).
THREAT_ACTOR_PROFILE_PROMPT = """You are an intelligence analyst specializing in
threat actor characterization. Generate a comprehensive threat actor profile from
the following multi-source intelligence data.
This profile covers an UNCLASSIFIED / CUI threat actor (e.g., publicly reported
cyber threat actor, sanctioned entity, publicly known terrorist organization).
Do not include or imply any classified intelligence.
THREAT ACTOR PROFILE FORMAT:
## THREAT ACTOR PROFILE
**Classification:** UNCLASSIFIED // CUI
**Profile ID:** {profile_id}
**Last Updated:** {date}
**Confidence Level:** [High/Medium/Low] — [rationale]
### 1. IDENTITY AND ATTRIBUTION
- Common Names / Aliases
- Attributed Nation-State (if applicable, cite public source)
- Organizational Type (APT group / criminal org / hacktivist / state actor)
- Known Members (publicly named individuals only)
- Affiliated Organizations
### 2. OBJECTIVES AND MOTIVATIONS
- Primary objectives (financial / espionage / disruption / ideology)
- Target sectors (government / defense / financial / energy / healthcare)
- Geographic focus areas
- Long-term strategic goals (inferred from observed activity)
### 3. CAPABILITIES
- Technical sophistication level (1-5)
- Known malware and tools (publicly reported)
- Infrastructure patterns (hosting providers, TLD preferences, C2 methods)
- Operational Security (OPSEC) level
- Estimated resources and backing
### 4. TACTICS, TECHNIQUES, AND PROCEDURES (TTPs)
Format as MITRE ATT&CK framework references where applicable:
- Initial Access techniques
- Execution techniques
- Persistence techniques
- Defense Evasion techniques
- Collection techniques
- Exfiltration techniques
- Impact techniques
### 5. INDICATORS OF COMPROMISE (IOCs)
List only publicly reported IOCs:
- IP addresses (with source and date)
- Domains (with source and date)
- File hashes (with source and date)
- Email patterns
- [Flag any IOCs that may be stale — >6 months old]
### 6. OBSERVED CAMPAIGNS
Timeline of publicly reported campaigns:
- [Date]: [Campaign name/description] — [Source]
### 7. PATTERN OF LIFE
- Operating hours (timezone-based activity patterns)
- Communication patterns (if publicly known)
- Operational tempo (frequency of activity)
- Known gaps in activity (holidays, operational pauses)
### 8. ASSESSMENT
- Current threat level to [client sector]: [Critical/High/Medium/Low]
- Likelihood of targeting client organization: [High/Medium/Low] — [rationale]
- Most probable attack vectors against client
- Recommended defensive priorities
### 9. INTELLIGENCE GAPS
- What key questions about this actor remain unanswered?
- What collection priorities would improve this profile?
### 10. SOURCES
List all sources used (publicly available only for UNCLASSIFIED profile):
- [Source name, date, URL if applicable]
---
[DRAFT — REQUIRES SENIOR ANALYST REVIEW AND CLASSIFICATION REVIEW BEFORE DISTRIBUTION]
INTELLIGENCE DATA INPUTS:
{intelligence_data}"""
def generate_threat_actor_profile(
    actor_name: str,
    intelligence_data: list,
    profile_id: str = None
) -> str:
    import datetime
    profile_id = profile_id or f"TA-{actor_name.replace(' ', '-').upper()}-{datetime.date.today().year}"
    data_formatted = "\n\n".join([
        f"SOURCE: {item.get('source', 'Unknown')} | DATE: {item.get('date', 'Unknown')}\n{item.get('content', '')}"
        for item in intelligence_data
    ])
    response = client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=[
            {"role": "system", "content": "You are an intelligence analyst. Generate factual, source-cited threat actor profiles. Do not fabricate IOCs, attribution, or intelligence not provided in the source data."},
            {"role": "user", "content": THREAT_ACTOR_PROFILE_PROMPT.format(
                profile_id=profile_id,
                date=datetime.date.today().isoformat(),
                intelligence_data=data_formatted[:6000]
            )}
        ],
        temperature=0.1,
        max_tokens=4000
    )
    return response.choices[0].message.content
def update_threat_actor_profile(
    existing_profile: str,
    new_intelligence: list
) -> str:
    """Update an existing threat actor profile with new intelligence."""
    import datetime
    today = datetime.date.today().isoformat()
    new_data = "\n\n".join([
        f"NEW SOURCE: {item.get('source')} | DATE: {item.get('date')}\n{item.get('content', '')}"
        for item in new_intelligence
    ])
    update_prompt = f"""Update the following threat actor profile with new intelligence.
EXISTING PROFILE:
{existing_profile[:4000]}
NEW INTELLIGENCE TO INCORPORATE:
{new_data[:2000]}
INSTRUCTIONS:
- Update sections where new intelligence provides new or contradictory information
- Add new IOCs, TTPs, and campaign data from new sources
- Update the confidence level if new intelligence strengthens or weakens attribution
- Mark newly added content with [NEW — {today}]
- Mark content that is now stale or contradicted with [STALE — verify]
- Update the "Last Updated" date
- Preserve all prior source citations
- Return the complete updated profile
[DRAFT — REQUIRES ANALYST REVIEW AFTER UPDATE]"""
    response = client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=[{"role": "user", "content": update_prompt}],
        temperature=0.1,
        max_tokens=4000
    )
    return response.choices[0].message.content

Step 4: Configure the Pattern-of-Life Analysis Dashboard
Build the Sentinel workbook and KQL queries that surface pattern-of-life anomalies from aggregated entity activity data.
// Sentinel KQL: Pattern-of-Life Baseline and Anomaly Detection
// Run in Azure Sentinel (Azure Government) Log Analytics Workspace
// 1. Establish entity activity baseline (90-day rolling)
let baseline_window = 90d;
let analysis_window = 24h;
let entity_baseline = OsintEvents_CL
| where TimeGenerated > ago(baseline_window)
| where EntityType_s in ("PERSON", "ORGANIZATION", "IP_ADDRESS")
| summarize
avg_daily_events = count() / 90.0,
typical_hours = make_set(hourofday(TimeGenerated)),
typical_days = make_set(dayofweek(TimeGenerated)),
typical_sources = make_set(EventSource_s)
by EntityValue_s, EntityType_s;
// 2. Current period activity
let current_activity = OsintEvents_CL
| where TimeGenerated > ago(analysis_window)
| summarize
current_events = count(),
current_hours = make_set(hourofday(TimeGenerated)),
current_sources = make_set(EventSource_s)
by EntityValue_s, EntityType_s;
// 3. Join and identify anomalies
entity_baseline
| join kind=leftouter current_activity on EntityValue_s, EntityType_s
// Entities with no current-period rows come back null from the leftouter join;
// coalesce to 0 so a total "going dark" is still scored as a drop
| extend current_events = coalesce(current_events, 0)
| extend
activity_ratio = toreal(current_events) / avg_daily_events,
hour_overlap = set_intersect(typical_hours, current_hours),
new_sources = set_difference(current_sources, typical_sources)
| extend
anomaly_score =
case(
activity_ratio > 5.0, 3, // 5x baseline activity
activity_ratio > 3.0, 2, // 3x baseline activity
activity_ratio < 0.1, 2, // Sudden drop in activity (going dark)
array_length(new_sources) > 0, 1, // New data sources appearing
0
)
| where anomaly_score > 0
| extend
anomaly_description =
case(
activity_ratio > 5.0, strcat("Spike: ", round(activity_ratio, 1), "x baseline activity"),
activity_ratio < 0.1, "Sudden drop: entity may have gone dark",
array_length(new_sources) > 0, strcat("New source detected: ", tostring(new_sources)),
"Anomaly detected"
)
| project
EntityValue_s, EntityType_s,
avg_daily_events = round(avg_daily_events, 1),
current_events,
activity_ratio = round(activity_ratio, 2),
anomaly_score,
anomaly_description,
new_sources
| order by anomaly_score desc, activity_ratio desc

// 4. Network graph query — entities appearing together
// Identifies entity pairs sharing 3 or more common sources over 30 days,
// suitable for Maltego import or graph visualization
OsintEvents_CL
| where TimeGenerated > ago(30d)
| where EntityType_s == "PERSON" or EntityType_s == "ORGANIZATION"
| summarize co_occurrences = count() by EntityValue_s, EventSource_s
| join kind=inner (
OsintEvents_CL
| where TimeGenerated > ago(30d)
| summarize count() by EntityValue_s, EventSource_s
) on EventSource_s
| where EntityValue_s != EntityValue_s1
| summarize shared_sources = count() by EntityValue_s, EntityValue_s1
| where shared_sources >= 3 // Co-appear in at least 3 sources = meaningful link
| order by shared_sources desc
| take 100

Custom AI Components
Pattern-of-Life Summary Analyst Report
Type: Prompt
Translates raw anomaly detection output into an analyst-ready intelligence summary with assessed significance and recommended collection priorities.
Implementation
SYSTEM PROMPT:
You are a senior intelligence analyst reviewing pattern-of-life anomaly data.
Translate the following technical anomaly data into an analyst-ready intelligence
summary for dissemination to intelligence consumers.
FOR EACH ANOMALY:
1. Describe what changed in plain, non-technical language
2. Assess significance: what does this change in behavior suggest?
3. Consider alternative explanations for the change
4. Rate analytic confidence: High/Medium/Low with rationale
5. Recommend specific collection priorities to confirm or deny your assessment
6. Assign a priority for follow-up: Urgent (within 24h) / Priority (within 72h) / Routine
FORMATTING:
- Write in IC analytic style (direct, declarative, sourced)
- Use the Key Judgments format for significant findings
- Mark assessments of likelihood: almost certainly, likely, possibly, unlikely
- All content is UNCLASSIFIED // CUI unless otherwise noted
ANOMALY DATA:
{anomaly_data}
ANALYST CONTEXT:
{analyst_context}

Testing & Validation
- Data ingestion verification: Ingest a test batch of 100 synthetic OSINT events and verify all events appear in the Sentinel custom table within 5 minutes. Verify field mapping is correct for all fields (entity type, value, source, timestamp).
- Entity extraction accuracy test: Run the extraction pipeline on 10 publicly available, unclassified intelligence reports (open-source threat reports from vendors like Mandiant, CrowdStrike, or Microsoft MSTIC). Have a trained analyst manually extract entities from the same documents. Compare: target ≥85% entity recall (not missing significant entities) and ≥90% precision (not hallucinating entities not in the text).
- Relationship extraction quality test: From the same 10 reports, evaluate whether extracted relationships are accurate and supported by the source text. All relationships must be traceable to a specific quote or passage in the source document.
- Threat actor profile factual accuracy test: Generate a profile for a well-documented, publicly-known threat actor (e.g., APT28/Fancy Bear). Have an experienced threat intelligence analyst fact-check all claims against publicly available sources. Zero tolerance for fabricated IOCs, false attribution claims, or invented TTPs.
- Pattern-of-life baseline test: Seed the Sentinel workspace with 90 days of synthetic entity activity data following known patterns. Inject 5 anomalous events (spike, drop, new source, unusual hour). Verify the KQL query detects all 5 anomalies with correct anomaly scores.
- Maltego export test: Export 50 entities and 30 relationships from the extraction pipeline to Maltego CSV format. Import into Maltego and verify the graph renders correctly with correct entity types and relationship labels.
- Access control test: Attempt to access the Sentinel workspace and SharePoint intelligence library from an account not in the authorized analyst security group. Verify access is denied and the attempt is logged.
- Data sovereignty test: Verify all data (OSINT inputs, analysis outputs, Sentinel logs) remains within Azure Government regions (USGov Virginia/Arizona). Use Azure Policy to enforce geographic restrictions and confirm no data egress to commercial Azure regions.
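For the entity extraction accuracy test, the recall and precision targets can be computed mechanically once the analyst's ground-truth entity list exists. A sketch: matching here is exact on normalized (type, value) pairs, so near-miss matches count as errors and the scorer is deliberately strict.

```python
# Scorer sketch for the entity extraction accuracy test above.
# Matches on exact (type, lowercased value) pairs; fuzzy matching is left out.
def score_extraction(predicted: list, ground_truth: list) -> dict:
    """Compute precision/recall of extracted entities against analyst ground truth."""
    pred = {(e["type"], e["value"].strip().lower()) for e in predicted}
    truth = {(e["type"], e["value"].strip().lower()) for e in ground_truth}
    true_pos = len(pred & truth)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "hallucinated": sorted(v for _, v in pred - truth),  # extracted, not in truth
        "missed": sorted(v for _, v in truth - pred),        # in truth, not extracted
        "meets_targets": recall >= 0.85 and precision >= 0.90,
    }
```

Review the "hallucinated" list by hand before failing a run: exact-match scoring will flag legitimate alias variants as errors.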
Client Handoff
Handoff Meeting Agenda (90 minutes — Intelligence Team Lead + ISSO + IT Lead)
1. Architecture and boundary review (15 min)
- Review data flow: OSINT sources → Sentinel ingestion → Azure OpenAI analysis → SharePoint outputs
- Confirm all data stays within IL4/IL5 boundary as determined by ISSO
- Review access controls and audit logging configuration
2. Analysis workflow demonstration (25 min)
- Show live Sentinel data ingestion from one active data source
- Demonstrate entity extraction on a sample OSINT document
- Generate a threat actor profile update from new intelligence
- Show the pattern-of-life anomaly dashboard and KQL alert rules
3. Analyst tool training (20 min)
- Walk through the Maltego export/import workflow
- Demonstrate the pattern-of-life summary prompt for analyst report generation
- Review the threat actor profile update workflow
4. Classification and handling review (15 min)
- Review CUI//INTEL marking requirements for all outputs
- Confirm distribution list and approval chain for intelligence products
- Review the policy against using this system for classified data
5. Documentation handoff
Maintenance
Daily Tasks (Automated)
- Sentinel analytics rules run continuously — alert on high-confidence anomalies
- Pattern-of-life dashboard refreshes hourly
Weekly Tasks
- Review Sentinel ingestion health — verify all data sources are feeding correctly
- Review false positive rate on anomaly alerts — adjust KQL thresholds if needed
Monthly Tasks
- Review and update threat actor profiles for actively tracked actors
- Verify commercial intelligence feed API keys are valid and subscriptions current
- Azure OpenAI consumption review
Quarterly Tasks
Annual Tasks
- Full data source review — confirm all ingested data sources are still authorized and ToS-compliant
- ISSO boundary review — confirm the system remains within the documented IL4/IL5 boundary as data types and sources evolve
- Recorded Future and other commercial feed contract renewals
Alternatives
Palantir AIP for Government (Enterprise Intelligence Platform)
AWS GovCloud — Amazon Comprehend + SageMaker + OpenSearch
AWS GovCloud provides FedRAMP High-authorized alternatives: Amazon Comprehend for entity extraction (NER), SageMaker for custom ML models, and OpenSearch for entity graph storage and search. Best for: Organizations standardized on AWS GovCloud. Tradeoffs: Requires more custom ML development than the Azure OpenAI approach; SageMaker custom model training is a significant engineering effort vs. prompt-based extraction.
On-Premises (Air-Gapped) — For IL5+ or Classified Environments