
Implementation Guide: Fuse Multi-Source Intelligence Data — Pattern-of-Life Anomalies, Link Analysis & Threat Actor Profiles
Step-by-step implementation guide for deploying AI to fuse multi-source intelligence data — pattern-of-life anomalies, link analysis & threat actor profiles for Government & Defense clients.
Software Procurement
Microsoft Azure OpenAI Service (Azure Government) — IL4
GPT-5.4: ~$0.005/1K input, ~$0.015/1K output. Entity extraction from 100-page report: ~$2–$5. Threat actor profile synthesis: ~$5–$15 per profile.
For CUI//INTEL environments, Azure OpenAI running in Azure Government at IL4 authorization level is the required platform. Confirm with the client's ISSO that the specific data types being analyzed are within the IL4 boundary. Some intelligence-related CUI categories (e.g., CUI//SP-CTI — Controlled Technical Information for intelligence purposes) may require IL5 — verify before deployment.
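The per-document cost figures above can be sanity-checked with a quick token-based estimate. A sketch only: the chunk size, prompt overhead, and tokens-per-page figures below are planning assumptions, and the rates come from the pricing line above.

```python
# Rough cost estimator for a chunked entity-extraction pass.
# Chunk size, prompt overhead, and tokens-per-page are planning assumptions.
INPUT_RATE = 0.005 / 1000   # $ per input token (from the pricing line above)
OUTPUT_RATE = 0.015 / 1000  # $ per output token

def estimate_extraction_cost(pages: int, tokens_per_page: int = 500,
                             chunk_tokens: int = 1500, prompt_overhead: int = 800,
                             output_per_chunk: int = 3000) -> float:
    """Estimate the cost of one chunked extraction pass over a report."""
    doc_tokens = pages * tokens_per_page
    chunks = max(1, -(-doc_tokens // chunk_tokens))  # ceiling division
    per_chunk = ((chunk_tokens + prompt_overhead) * INPUT_RATE
                 + output_per_chunk * OUTPUT_RATE)
    return chunks * per_chunk
```

Under these assumptions a 100-page report lands near $2, consistent with the range quoted above; adjust the parameters to the client's actual document mix.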
Microsoft Sentinel (Azure Government)
~$2.46/GB ingested (Azure Government pricing); typical OSINT pipeline: 10–50GB/month = $25–$125/month
Cloud-native SIEM and SOAR platform running in Azure Government. Used as the aggregation layer for multi-source data feeds — ingests OSINT feeds, commercial threat intelligence, social media monitoring data, and structured data exports from other tools. Provides the data lake from which Azure OpenAI analysis is triggered.
Maltego (Entity Link Analysis)
Maltego
Maltego Enterprise: $5,000–$10,000/seat/year. Government pricing available.
Industry-standard platform for open-source intelligence (OSINT) collection and link analysis. Visualizes relationships between persons, organizations, IP addresses, domains, email addresses, phone numbers, and locations. Used by IC contractors, law enforcement, and cybersecurity analysts. Transforms (data queries) connect to OSINT data sources and commercial intelligence feeds. Output is a visual graph showing entity relationships and connection paths.
Maltego's cloud-based transforms route data through Maltego's servers — verify data sensitivity before using cloud transforms. For sensitive CUI, use locally-deployed transforms or the Maltego on-premises deployment.
Recorded Future Intelligence Cloud (Commercial Threat Intel Feed)
Recorded Future Intelligence Cloud
$50,000–$200,000+/year depending on modules and entity count
Recorded Future aggregates OSINT, dark web, technical threat intelligence, and geopolitical intelligence into structured, machine-readable threat intelligence. Provides APIs for automated ingestion into the Azure Sentinel pipeline. Used for threat actor profile enrichment, indicator of compromise (IOC) data, and vulnerability intelligence. FedRAMP Moderate authorized.
Babel Street (OSINT and Social Media Intelligence)
Babel Street
Contact vendor; government pricing available
Government-focused OSINT platform providing real-time multilingual social media monitoring, dark web monitoring, and geospatial intelligence data aggregation. FedRAMP authorized. Used by DHS, DoD, and IC contractors for pattern-of-life analysis and situational awareness. Provides structured API output that feeds into the Azure Sentinel aggregation layer.
Palantir Gotham (Enterprise Intelligence Platform — Optional)
Typically $1M+/year for enterprise deployments; available via DoD enterprise license
Enterprise intelligence fusion and analysis platform widely deployed across DoD and IC. If the client already has Palantir Gotham, integrate the Azure OpenAI analysis pipeline as an enrichment layer rather than replacing Palantir. Palantir AIP (AI Platform) also provides FedRAMP High authorized LLM capabilities that can substitute for the Azure OpenAI components described here.
Prerequisites
- Cleared personnel requirement: Even for unclassified OSINT analysis, IC contractor environments typically require cleared personnel (minimum Secret, often TS) to access the analysis systems and review outputs. The MSP technicians deploying this system must hold the appropriate clearances for the environment. Confirm with the client's FSO before beginning deployment.
- Data source agreements: OSINT and commercial intelligence feeds require licensing agreements. Confirm the client has active, paid subscriptions to all data sources before configuring ingestion pipelines. Verify the licensing agreement covers the intended use case (some intelligence data licenses restrict automated processing or AI analysis).
- Terms of service compliance for OSINT sources: Many public OSINT sources (social media platforms, public records databases) have Terms of Service that restrict automated bulk collection. The client's legal counsel must confirm that the collection methodology complies with platform ToS, relevant laws (Computer Fraud and Abuse Act, state equivalents), and any agency-specific collection authorities.
- Collection authority documentation (government clients): For government agency clients conducting intelligence collection activities, the client must have documented legal authority for each data collection activity (e.g., Executive Order 12333, relevant statutes, agency authorities). The MSP does not determine collection legality — but must not configure pipelines for data the client is not legally authorized to collect.
- IL4/IL5 boundary confirmation: Work with the client's ISSO to formally determine whether the analysis data is within IL4 (CUI) or requires IL5 (CUI with more stringent controls). This determination drives platform selection and must be documented before go-live.
- IT admin access: Azure Government subscription (Owner), Microsoft Sentinel workspace, Maltego Enterprise admin, API credentials for commercial intelligence feeds.
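Before beginning Step 1, a quick pre-flight script can confirm that the admin credentials and API keys above are actually configured. A sketch; the variable names match those used in the installation steps below.

```python
# Pre-flight check: confirm required environment variables are set before deployment.
# The names mirror those used in the installation steps below.
import os

REQUIRED_VARS = [
    "AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET",
    "SENTINEL_WORKSPACE_ID", "DATA_COLLECTION_ENDPOINT", "DATA_COLLECTION_RULE_ID",
    "CUSTOM_STREAM_NAME", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_KEY",
    "AZURE_OPENAI_DEPLOYMENT", "RECORDED_FUTURE_API_KEY",
]

def check_environment(required=REQUIRED_VARS) -> list:
    """Return the list of missing environment variables (empty list = ready)."""
    return [name for name in required if not os.environ.get(name)]

missing = check_environment()
if missing:
    print("Missing configuration:", ", ".join(missing))
```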
Installation Steps
Step 1: Configure the Multi-Source Data Ingestion Pipeline in Microsoft Sentinel
Set up Sentinel (Azure Government) as the aggregation layer, ingesting data from OSINT feeds, commercial threat intelligence, and structured data exports.
# sentinel_data_connectors.py
# Configure data connectors for intelligence data sources.
# Note: Most connector configuration is done via Azure Portal (portal.azure.us)
# or ARM templates. The code below handles custom data ingestion via the
# Azure Monitor Logs Ingestion API for sources without native Sentinel connectors.
import requests
import json
import os
import datetime
from azure.identity import ClientSecretCredential
TENANT_ID = os.environ["AZURE_TENANT_ID"]
CLIENT_ID = os.environ["AZURE_CLIENT_ID"]
CLIENT_SECRET = os.environ["AZURE_CLIENT_SECRET"]
SENTINEL_WORKSPACE_ID = os.environ["SENTINEL_WORKSPACE_ID"]
DCE_ENDPOINT = os.environ["DATA_COLLECTION_ENDPOINT"] # Azure Government DCE
DCR_IMMUTABLE_ID = os.environ["DATA_COLLECTION_RULE_ID"]
STREAM_NAME = os.environ["CUSTOM_STREAM_NAME"] # e.g., "Custom-OsintEvents"
def get_access_token() -> str:
    """Get bearer token for Azure Monitor Logs Ingestion API."""
    credential = ClientSecretCredential(TENANT_ID, CLIENT_ID, CLIENT_SECRET)
    token = credential.get_token("https://monitor.azure.us/.default")
    return token.token

def ingest_osint_events(events: list) -> bool:
    """Ingest OSINT events into Sentinel custom table via Logs Ingestion API."""
    token = get_access_token()
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    # Normalize events to Sentinel schema
    normalized = []
    for event in events:
        normalized.append({
            "TimeGenerated": event.get("timestamp", datetime.datetime.utcnow().isoformat()),
            "EventSource": event.get("source", "Unknown"),
            "EntityType": event.get("entity_type", "Unknown"),
            "EntityValue": event.get("entity_value", ""),
            "EventType": event.get("event_type", ""),
            "Description": event.get("description", ""),
            "Confidence": event.get("confidence", 0),
            "Tags": json.dumps(event.get("tags", [])),
            "RawData": json.dumps(event.get("raw", {}))[:32000]
        })
    url = f"{DCE_ENDPOINT}/dataCollectionRules/{DCR_IMMUTABLE_ID}/streams/{STREAM_NAME}?api-version=2023-01-01"
    resp = requests.post(url, headers=headers, json=normalized)
    if resp.status_code == 204:
        return True
    else:
        print(f"Ingestion error: {resp.status_code} — {resp.text}")
        return False

def fetch_recorded_future_indicators(query: str, limit: int = 100) -> list:
    """Fetch threat indicators from the Recorded Future API."""
    RF_API_KEY = os.environ["RECORDED_FUTURE_API_KEY"]
    headers = {
        "X-RFToken": RF_API_KEY,
        "Content-Type": "application/json"
    }
    resp = requests.get(
        "https://api.recordedfuture.com/v2/indicator/search",
        headers=headers,
        params={"query": query, "limit": limit, "fields": "risk,entity,timestamps,evidence"}
    )
    resp.raise_for_status()
    indicators = []
    for ind in resp.json().get("data", {}).get("results", []):
        indicators.append({
            "source": "Recorded Future",
            "entity_type": ind.get("entity", {}).get("type", ""),
            "entity_value": ind.get("entity", {}).get("name", ""),
            "risk_score": ind.get("risk", {}).get("score", 0),
            "risk_rules": [r.get("rule", "") for r in ind.get("risk", {}).get("rules", [])],
            "first_seen": ind.get("timestamps", {}).get("firstSeen", ""),
            "last_seen": ind.get("timestamps", {}).get("lastSeen", ""),
            "evidence": ind.get("evidence", [])
        })
    return indicators

Microsoft Sentinel in Azure Government supports native data connectors for many common sources (Microsoft Threat Intelligence, TAXII servers, CEF/Syslog). Use native connectors where available — they are more reliable and require less maintenance than custom ingestion pipelines. Custom ingestion via the Logs Ingestion API is only needed for sources without a native Sentinel connector. Configure all Sentinel workspaces with RBAC restricting access to authorized analysts only — Sentinel workspaces in intelligence environments should have zero public access and all access logged.
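The two helpers above can be wired into a simple batch job, for example pulling Recorded Future indicators on a schedule and pushing them into the custom table in bounded batches. A sketch: the 500-event batch size and the field mapping are illustrative choices, not Sentinel requirements.

```python
# Batch job sketch: pull Recorded Future indicators and push them into Sentinel
# in bounded batches, using fetch_recorded_future_indicators() and
# ingest_osint_events() defined above. Batch size and mapping are illustrative.
def map_rf_to_osint_event(ind: dict) -> dict:
    """Map a Recorded Future indicator onto the OSINT event schema used above."""
    return {
        "source": ind.get("source", "Recorded Future"),
        "entity_type": ind.get("entity_type", "Unknown"),
        "entity_value": ind.get("entity_value", ""),
        "event_type": "threat_indicator",
        "description": "; ".join(ind.get("risk_rules", [])),
        "confidence": ind.get("risk_score", 0),
        "tags": ["recorded_future"],
        "raw": ind,
    }

def run_rf_ingestion(query: str, batch_size: int = 500) -> int:
    """Fetch indicators, normalize, and ingest in batches. Returns events ingested."""
    indicators = fetch_recorded_future_indicators(query, limit=1000)
    events = [map_rf_to_osint_event(i) for i in indicators]
    ingested = 0
    for start in range(0, len(events), batch_size):
        batch = events[start:start + batch_size]
        if ingest_osint_events(batch):
            ingested += len(batch)
    return ingested
```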
Step 2: Build the Entity Extraction and Link Analysis Pipeline
Extract named entities from ingested OSINT data and identify relationships for link analysis visualization.
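Because the extraction output comes from an LLM, it is worth validating the JSON before it reaches the Maltego export step. A minimal consistency check, written against the schema this step's prompt defines (a sketch; field names are assumptions matching that prompt):

```python
# Consistency check for LLM extraction output before Maltego export (sketch).
# Field names assume the JSON schema defined in ENTITY_EXTRACTION_PROMPT.
def validate_extraction(result: dict) -> list:
    """Return a list of problems found in an extraction result (empty = clean)."""
    problems = []
    entities = result.get("entities", [])
    relationships = result.get("relationships", [])
    ids = {e.get("entity_id") for e in entities}
    for e in entities:
        if not e.get("entity_id") or not e.get("value"):
            problems.append(f"entity missing id or value: {e}")
        conf = e.get("confidence")
        if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
            problems.append(f"entity {e.get('entity_id')} has invalid confidence")
    for r in relationships:
        for key in ("subject_entity_id", "object_entity_id"):
            if r.get(key) not in ids:
                problems.append(f"relationship {r.get('relationship_id')} references unknown {key}")
        if not r.get("evidence"):
            problems.append(f"relationship {r.get('relationship_id')} lacks supporting evidence")
    return problems
```

Rejecting documents with a non-empty problem list prevents hallucinated entity references from reaching the analyst's link graph.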
# entity_extraction.py
# Extract entities and relationships from OSINT text for link analysis
from openai import AzureOpenAI
import os, json
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-08-01-preview"
)
ENTITY_EXTRACTION_PROMPT = """You are an intelligence analyst performing entity extraction
and relationship mapping for link analysis.
Extract all named entities and relationships from the following intelligence text.
ENTITY TYPES TO EXTRACT:
- PERSON: Named individuals (full name, aliases, titles/roles if mentioned)
- ORGANIZATION: Companies, government agencies, military units, terrorist organizations, criminal groups
- LOCATION: Countries, cities, specific facilities, coordinates if mentioned
- FACILITY: Specific named buildings, bases, ports, infrastructure
- VESSEL: Ships, aircraft, vehicles if named
- DEVICE: Specific electronic devices, weapons systems if named
- IP_ADDRESS: Internet protocol addresses
- DOMAIN: Web domains and URLs
- EMAIL: Email addresses
- PHONE: Phone numbers
- DATE: Specific dates or date ranges mentioned
- EVENT: Named events, operations, incidents
For each entity:
{{
"entity_id": "unique ID for this document (E001, E002...)",
"type": "PERSON|ORGANIZATION|LOCATION|...",
"value": "the entity name/value as it appears in text",
"normalized": "standardized form (e.g., 'USA' → 'United States')",
"aliases": ["other names used for same entity in this document"],
"context": "brief context from document (max 50 words)",
"confidence": 0.0-1.0
}}
For each RELATIONSHIP between entities:
{{
"relationship_id": "R001, R002...",
"subject_entity_id": "E001",
"relationship_type": "MEMBER_OF|LOCATED_AT|COMMUNICATES_WITH|AFFILIATED_WITH|COMMANDS|OWNS|OPERATES|ATTENDED|FUNDED_BY|...",
"object_entity_id": "E002",
"evidence": "quote from text supporting this relationship",
"confidence": 0.0-1.0,
"date_of_relationship": "date or date range if determinable"
}}
Return JSON only with keys "entities" and "relationships".
INTELLIGENCE TEXT:
{text}"""
def extract_entities_and_relationships(text: str) -> dict:
    response = client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=[{
            "role": "user",
            "content": ENTITY_EXTRACTION_PROMPT.format(text=text[:6000])
        }],
        temperature=0.0,
        max_tokens=3000,
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)
def export_to_maltego_csv(entities: list, relationships: list, output_file: str):
    """Export extracted entities and relationships to Maltego-importable CSV format."""
    import csv
    # Entity CSV
    entity_file = output_file.replace(".csv", "_entities.csv")
    with open(entity_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(["Entity Type", "Entity Value", "Alias", "Context", "Confidence"])
        for e in entities:
            writer.writerow([
                e.get("type"), e.get("value"), "|".join(e.get("aliases", [])),
                e.get("context", ""), e.get("confidence", "")
            ])
    # Relationship CSV (Maltego link format)
    rel_file = output_file.replace(".csv", "_relationships.csv")
    entity_map = {e["entity_id"]: e["value"] for e in entities}
    with open(rel_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(["From Entity", "To Entity", "Relationship", "Evidence", "Confidence"])
        for r in relationships:
            writer.writerow([
                entity_map.get(r.get("subject_entity_id"), "Unknown"),
                entity_map.get(r.get("object_entity_id"), "Unknown"),
                r.get("relationship_type"),
                r.get("evidence", ""),
                r.get("confidence", "")
            ])
    print(f"Exported: {entity_file}, {rel_file}")
    return entity_file, rel_file

Step 3: Build the Threat Actor Profile Generator
Generate and maintain structured threat actor profiles from aggregated intelligence sources.
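Section 5 of the profile template below asks analysts to flag IOCs older than roughly six months as stale. That check can also be enforced in code before indicators are handed to the model. A sketch: the 180-day threshold and the ISO-8601 date format (as returned by the Recorded Future helper in Step 1) are assumptions.

```python
# Flag stale IOCs (>180 days old) before they feed a profile (sketch).
# Assumes ISO-8601 date strings; the 180-day threshold is a policy assumption.
import datetime

STALE_AFTER_DAYS = 180

def is_stale(last_seen: str, today: datetime.date = None) -> bool:
    """True when an indicator's last-seen date is more than ~6 months old."""
    today = today or datetime.date.today()
    try:
        seen = datetime.date.fromisoformat(last_seen[:10])
    except (ValueError, TypeError):
        return True  # unparseable or missing dates are treated as stale
    return (today - seen).days > STALE_AFTER_DAYS
```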
# threat_actor_profiler.py
# Reuses the AzureOpenAI `client` and imports configured in entity_extraction.py (Step 2).
THREAT_ACTOR_PROFILE_PROMPT = """You are an intelligence analyst specializing in
threat actor characterization. Generate a comprehensive threat actor profile from
the following multi-source intelligence data.
This profile covers an UNCLASSIFIED / CUI threat actor (e.g., publicly reported
cyber threat actor, sanctioned entity, publicly known terrorist organization).
Do not include or imply any classified intelligence.
THREAT ACTOR PROFILE FORMAT:
## THREAT ACTOR PROFILE
**Classification:** UNCLASSIFIED // CUI
**Profile ID:** {profile_id}
**Last Updated:** {date}
**Confidence Level:** [High/Medium/Low] — [rationale]
### 1. IDENTITY AND ATTRIBUTION
- Common Names / Aliases
- Attributed Nation-State (if applicable, cite public source)
- Organizational Type (APT group / criminal org / hacktivist / state actor)
- Known Members (publicly named individuals only)
- Affiliated Organizations
### 2. OBJECTIVES AND MOTIVATIONS
- Primary objectives (financial / espionage / disruption / ideology)
- Target sectors (government / defense / financial / energy / healthcare)
- Geographic focus areas
- Long-term strategic goals (inferred from observed activity)
### 3. CAPABILITIES
- Technical sophistication level (1-5)
- Known malware and tools (publicly reported)
- Infrastructure patterns (hosting providers, TLD preferences, C2 methods)
- Operational Security (OPSEC) level
- Estimated resources and backing
### 4. TACTICS, TECHNIQUES, AND PROCEDURES (TTPs)
Format as MITRE ATT&CK framework references where applicable:
- Initial Access techniques
- Execution techniques
- Persistence techniques
- Defense Evasion techniques
- Collection techniques
- Exfiltration techniques
- Impact techniques
### 5. INDICATORS OF COMPROMISE (IOCs)
List only publicly reported IOCs:
- IP addresses (with source and date)
- Domains (with source and date)
- File hashes (with source and date)
- Email patterns
- [Flag any IOCs that may be stale — >6 months old]
### 6. OBSERVED CAMPAIGNS
Timeline of publicly reported campaigns:
- [Date]: [Campaign name/description] — [Source]
### 7. PATTERN OF LIFE
- Operating hours (timezone-based activity patterns)
- Communication patterns (if publicly known)
- Operational tempo (frequency of activity)
- Known gaps in activity (holidays, operational pauses)
### 8. ASSESSMENT
- Current threat level to [client sector]: [Critical/High/Medium/Low]
- Likelihood of targeting client organization: [High/Medium/Low] — [rationale]
- Most probable attack vectors against client
- Recommended defensive priorities
### 9. INTELLIGENCE GAPS
- What key questions about this actor remain unanswered?
- What collection priorities would improve this profile?
### 10. SOURCES
List all sources used (publicly available only for UNCLASSIFIED profile):
- [Source name, date, URL if applicable]
---
[DRAFT — REQUIRES SENIOR ANALYST REVIEW AND CLASSIFICATION REVIEW BEFORE DISTRIBUTION]
INTELLIGENCE DATA INPUTS:
{intelligence_data}"""
def generate_threat_actor_profile(
    actor_name: str,
    intelligence_data: list,
    profile_id: str = None
) -> str:
    import datetime
    profile_id = profile_id or f"TA-{actor_name.replace(' ', '-').upper()}-{datetime.date.today().year}"
    data_formatted = "\n\n".join([
        f"SOURCE: {item.get('source', 'Unknown')} | DATE: {item.get('date', 'Unknown')}\n{item.get('content', '')}"
        for item in intelligence_data
    ])
    response = client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=[
            {"role": "system", "content": "You are an intelligence analyst. Generate factual, source-cited threat actor profiles. Do not fabricate IOCs, attribution, or intelligence not provided in the source data."},
            {"role": "user", "content": THREAT_ACTOR_PROFILE_PROMPT.format(
                profile_id=profile_id,
                date=datetime.date.today().isoformat(),
                intelligence_data=data_formatted[:6000]
            )}
        ],
        temperature=0.1,
        max_tokens=4000
    )
    return response.choices[0].message.content
def update_threat_actor_profile(
    existing_profile: str,
    new_intelligence: list
) -> str:
    """Update an existing threat actor profile with new intelligence."""
    import datetime
    today = datetime.date.today().isoformat()
    new_data = "\n\n".join([
        f"NEW SOURCE: {item.get('source')} | DATE: {item.get('date')}\n{item.get('content', '')}"
        for item in new_intelligence
    ])
    update_prompt = f"""Update the following threat actor profile with new intelligence.
EXISTING PROFILE:
{existing_profile[:4000]}
NEW INTELLIGENCE TO INCORPORATE:
{new_data[:2000]}
INSTRUCTIONS:
- Update sections where new intelligence provides new or contradictory information
- Add new IOCs, TTPs, and campaign data from new sources
- Update the confidence level if new intelligence strengthens or weakens attribution
- Mark newly added content with [NEW — {today}]
- Mark content that is now stale or contradicted with [STALE — verify]
- Update the "Last Updated" date
- Preserve all prior source citations
- Return the complete updated profile
[DRAFT — REQUIRES ANALYST REVIEW AFTER UPDATE]"""
    response = client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        messages=[{"role": "user", "content": update_prompt}],
        temperature=0.1,
        max_tokens=4000
    )
    return response.choices[0].message.content

Step 4: Configure the Pattern-of-Life Analysis Dashboard
Build the Sentinel workbook and KQL queries that surface pattern-of-life anomalies from aggregated entity activity data.
// Sentinel KQL: Pattern-of-Life Baseline and Anomaly Detection
// Run in Azure Sentinel (Azure Government) Log Analytics Workspace
// 1. Establish entity activity baseline (90-day rolling)
let baseline_window = 90d;
let analysis_window = 24h;
let entity_baseline = OsintEvents_CL
| where TimeGenerated > ago(baseline_window)
| where EntityType_s in ("PERSON", "ORGANIZATION", "IP_ADDRESS")
| summarize
avg_daily_events = count() / 90.0,
typical_hours = make_set(hourofday(TimeGenerated)),
typical_days = make_set(dayofweek(TimeGenerated)),
typical_sources = make_set(EventSource_s)
by EntityValue_s, EntityType_s;
// 2. Current period activity
let current_activity = OsintEvents_CL
| where TimeGenerated > ago(analysis_window)
| summarize
current_events = count(),
current_hours = make_set(hourofday(TimeGenerated)),
current_sources = make_set(EventSource_s)
by EntityValue_s, EntityType_s;
// 3. Join and identify anomalies
entity_baseline
| join kind=leftouter current_activity on EntityValue_s, EntityType_s
// Entities with no current-period rows come back null from the leftouter join;
// coalesce to 0 so a total "going dark" is still scored as a drop
| extend current_events = coalesce(current_events, 0)
| extend
activity_ratio = toreal(current_events) / avg_daily_events,
hour_overlap = set_intersect(typical_hours, current_hours),
new_sources = set_difference(current_sources, typical_sources)
| extend
anomaly_score =
case(
activity_ratio > 5.0, 3, // 5x baseline activity
activity_ratio > 3.0, 2, // 3x baseline activity
activity_ratio < 0.1, 2, // Sudden drop in activity (going dark)
array_length(new_sources) > 0, 1, // New data sources appearing
0
)
| where anomaly_score > 0
| extend
anomaly_description =
case(
activity_ratio > 5.0, strcat("Spike: ", round(activity_ratio, 1), "x baseline activity"),
activity_ratio < 0.1, "Sudden drop: entity may have gone dark",
array_length(new_sources) > 0, strcat("New source detected: ", tostring(new_sources)),
"Anomaly detected"
)
| project
EntityValue_s, EntityType_s,
avg_daily_events = round(avg_daily_events, 1),
current_events,
activity_ratio = round(activity_ratio, 2),
anomaly_score,
anomaly_description,
new_sources
| order by anomaly_score desc, activity_ratio desc

// 4. Network graph query — entities appearing together
// Identifies entity pairs sharing 3 or more common sources over 30 days,
// suitable for Maltego import or graph visualization
OsintEvents_CL
| where TimeGenerated > ago(30d)
| where EntityType_s == "PERSON" or EntityType_s == "ORGANIZATION"
| summarize co_occurrences = count() by EntityValue_s, EventSource_s
| join kind=inner (
OsintEvents_CL
| where TimeGenerated > ago(30d)
| summarize count() by EntityValue_s, EventSource_s
) on EventSource_s
| where EntityValue_s != EntityValue_s1
| summarize shared_sources = count() by EntityValue_s, EntityValue_s1
| where shared_sources >= 3 // Co-appear in at least 3 sources = meaningful link
| order by shared_sources desc
| take 100

Custom AI Components
Pattern-of-Life Summary Analyst Report
Type: Prompt
Translates raw anomaly detection output into an analyst-ready intelligence summary with assessed significance and recommended collection priorities.
Implementation
SYSTEM PROMPT:
You are a senior intelligence analyst reviewing pattern-of-life anomaly data.
Translate the following technical anomaly data into an analyst-ready intelligence
summary for dissemination to intelligence consumers.
FOR EACH ANOMALY:
1. Describe what changed in plain, non-technical language
2. Assess significance: what does this change in behavior suggest?
3. Consider alternative explanations for the change
4. Rate analytic confidence: High/Medium/Low with rationale
5. Recommend specific collection priorities to confirm or deny your assessment
6. Assign a priority for follow-up: Urgent (within 24h) / Priority (within 72h) / Routine
FORMATTING:
- Write in IC analytic style (direct, declarative, sourced)
- Use the Key Judgments format for significant findings
- Mark assessments of likelihood: almost certainly, likely, possibly, unlikely
- All content is UNCLASSIFIED // CUI unless otherwise noted
ANOMALY DATA:
{anomaly_data}
ANALYST CONTEXT:
{analyst_context}

Testing & Validation
- Data ingestion verification: Ingest a test batch of 100 synthetic OSINT events and verify all events appear in the Sentinel custom table within 5 minutes. Verify field mapping is correct for all fields (entity type, value, source, timestamp).
- Entity extraction accuracy test: Run the extraction pipeline on 10 publicly available, unclassified intelligence reports (open-source threat reports from vendors like Mandiant, CrowdStrike, or Microsoft MSTIC). Have a trained analyst manually extract entities from the same documents. Compare: target ≥85% entity recall (not missing significant entities) and ≥90% precision (not hallucinating entities not in the text).
- Relationship extraction quality test: From the same 10 reports, evaluate whether extracted relationships are accurate and supported by the source text. All relationships must be traceable to a specific quote or passage in the source document.
- Threat actor profile factual accuracy test: Generate a profile for a well-documented, publicly-known threat actor (e.g., APT28/Fancy Bear). Have an experienced threat intelligence analyst fact-check all claims against publicly available sources. Zero tolerance for fabricated IOCs, false attribution claims, or invented TTPs.
- Pattern-of-life baseline test: Seed the Sentinel workspace with 90 days of synthetic entity activity data following known patterns. Inject 5 anomalous events (spike, drop, new source, unusual hour). Verify the KQL query detects all 5 anomalies with correct anomaly scores.
- Maltego export test: Export 50 entities and 30 relationships from the extraction pipeline to Maltego CSV format. Import into Maltego and verify the graph renders correctly with correct entity types and relationship labels.
- Access control test: Attempt to access the Sentinel workspace and SharePoint intelligence library from an account not in the authorized analyst security group. Verify access is denied and the attempt is logged.
- Data sovereignty test: Verify all data (OSINT inputs, analysis outputs, Sentinel logs) remains within Azure Government regions (USGov Virginia/Arizona). Use Azure Policy to enforce geographic restrictions and confirm no data egress to commercial Azure regions.
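For the entity extraction accuracy test, the recall and precision targets can be computed mechanically once the analyst's ground-truth entity list exists. A sketch: matching here is exact on normalized (type, value) pairs, so near-miss matches count as errors and the scorer is deliberately strict.

```python
# Scorer sketch for the entity extraction accuracy test above.
# Matches on exact (type, lowercased value) pairs; fuzzy matching is left out.
def score_extraction(predicted: list, ground_truth: list) -> dict:
    """Compute precision/recall of extracted entities against analyst ground truth."""
    pred = {(e["type"], e["value"].strip().lower()) for e in predicted}
    truth = {(e["type"], e["value"].strip().lower()) for e in ground_truth}
    true_pos = len(pred & truth)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "hallucinated": sorted(v for _, v in pred - truth),  # extracted, not in truth
        "missed": sorted(v for _, v in truth - pred),        # in truth, not extracted
        "meets_targets": recall >= 0.85 and precision >= 0.90,
    }
```

Review the "hallucinated" list by hand before failing a run: exact-match scoring will flag legitimate alias variants as errors.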
Client Handoff
Handoff Meeting Agenda (90 minutes — Intelligence Team Lead + ISSO + IT Lead)
1. Architecture and boundary review (15 min)
- Review data flow: OSINT sources → Sentinel ingestion → Azure OpenAI analysis → SharePoint outputs
- Confirm all data stays within IL4/IL5 boundary as determined by ISSO
- Review access controls and audit logging configuration
2. Analysis workflow demonstration (25 min)
- Show live Sentinel data ingestion from one active data source
- Demonstrate entity extraction on a sample OSINT document
- Generate a threat actor profile update from new intelligence
- Show the pattern-of-life anomaly dashboard and KQL alert rules
3. Analyst tool training (20 min)
- Walk through the Maltego export/import workflow
- Demonstrate the pattern-of-life summary prompt for analyst report generation
- Review the threat actor profile update workflow
4. Classification and handling review (15 min)
- Review CUI//INTEL marking requirements for all outputs
- Confirm distribution list and approval chain for intelligence products
- Review the policy against using this system for classified data
5. Documentation handoff
Maintenance
Daily Tasks (Automated)
- Sentinel analytics rules run continuously — alert on high-confidence anomalies
- Pattern-of-life dashboard refreshes hourly
Weekly Tasks
- Review Sentinel ingestion health — verify all data sources are feeding correctly
- Review false positive rate on anomaly alerts — adjust KQL thresholds if needed
Monthly Tasks
- Review and update threat actor profiles for actively tracked actors
- Verify commercial intelligence feed API keys are valid and subscriptions current
- Azure OpenAI consumption review
Quarterly Tasks
Annual Tasks
- Full data source review — confirm all ingested data sources are still authorized and ToS-compliant
- ISSO boundary review — confirm the system remains within the documented IL4/IL5 boundary as data types and sources evolve
- Recorded Future and other commercial feed contract renewals
Alternatives
Palantir AIP for Government (Enterprise Intelligence Platform)
AWS GovCloud — Amazon Comprehend + SageMaker + OpenSearch
AWS GovCloud provides FedRAMP High-authorized alternatives: Amazon Comprehend for entity extraction (NER), SageMaker for custom ML models, and OpenSearch for entity graph storage and search. Best for: Organizations standardized on AWS GovCloud. Tradeoffs: Requires more custom ML development than the Azure OpenAI approach; SageMaker custom model training is a significant engineering effort vs. prompt-based extraction.
On-Premises (Air-Gapped) — For IL5+ or Classified Environments