# SOP: Country-Level OSINT Project Setup
**Version:** 2.0
**Created:** 2026-03-03
**Updated:** 2026-03-03
**Based on:** Burkina Faso project workflow

---

## Purpose
Standardized procedure for standing up a new country-level OSINT reconnaissance project focused on critical infrastructure, government, and public-facing web assets. Passive discovery only — no active exploitation.

---

## Phase 1: Project Structure

### 1.1 Create Root Folder
```
Desktop/<Country Name>/
```

### 1.2 Create Subfolders & Files
```
<Country Name>/
├── INDEX.md                          # Master report — all findings, intel, next steps
├── SOP-COUNTRY-OSINT-SETUP.md       # This file (copy to each new project)
│
├── targets/                          # Domain lists ONLY — no reports, no data
│   └── <country>-websites.txt        # Master domain list, one per line, with count header
│
├── DUMP/                             # Per-target analysis reports & raw data dumps
│   ├── <TARGET-1>/                   # One subfolder per target org
│   │   ├── <TARGET-1>.md             # Detailed recon report
│   │   └── (screenshots, raw files, cached pages, DNS dumps, etc.)
│   ├── <TARGET-2>/
│   │   └── <TARGET-2>.md
│   └── ...
│
├── TOOLS/                            # Fresh copies of recon tools for this project
│   ├── Huntr/                        # Domain intelligence scanner
│   └── (other tools as needed)
│
├── EXPOSED CREDENTIALS/              # Any discovered credentials, API keys, tokens
│   └── (organized by target/source)
│
└── REPORTS/                          # Finished reports, summaries, deliverables
    └── (final write-ups, exported findings)
```
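
The layout above can be created in one step. The following is a minimal sketch, assuming a bash shell; `scaffold_project` and its two arguments are illustrative names, not part of any existing tool. It leaves `TOOLS/Huntr/` to the copy step in Phase 5 and never overwrites files that already exist.

```shell
# Sketch: create the project skeleton from section 1.2 under a given root.
# scaffold_project is a hypothetical helper name chosen for this example.
scaffold_project() {
  local root="$1"      # e.g. "$HOME/Desktop/<Country Name>"
  local country="$2"   # lowercase slug used in the target list file name

  # TOOLS/Huntr is intentionally not created here; Phase 5 copies it in.
  mkdir -p "$root/targets" "$root/DUMP" "$root/TOOLS" \
           "$root/EXPOSED CREDENTIALS" "$root/REPORTS"

  # Create empty placeholders only when missing, so re-running is safe.
  [ -e "$root/INDEX.md" ] || : > "$root/INDEX.md"
  [ -e "$root/targets/$country-websites.txt" ] || : > "$root/targets/$country-websites.txt"
}
```

Example call: `scaffold_project "$HOME/Desktop/<Country Name>" <country>`.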

### 1.3 Folder Purposes

| Folder | Purpose | Contents |
|--------|---------|----------|
| `targets/` | Domain lists only | `<country>-websites.txt` — the master target list |
| `DUMP/` | Raw recon data & per-target reports | Subfolders per target org with .md reports, screenshots, cached pages, DNS/WHOIS dumps, raw scan output |
| `TOOLS/` | Fresh copies of recon tools | Huntr, THOT output, any other tools copied for this engagement |
| `EXPOSED CREDENTIALS/` | Discovered credentials | API keys, tokens, passwords, .env contents found through passive OSINT |
| `REPORTS/` | Final deliverables | Polished findings reports, summaries, exported data |

### 1.4 Target List Format (`targets/<country>-websites.txt`)
```
# ============================================
# <COUNTRY> - COMPREHENSIVE TARGET LIST
# Generated: <DATE>
# Updated: <DATE> (<reason>)
# Source: OSINT passive discovery + THOT Domain Harvester
# TOTAL DOMAINS: <COUNT>
# ============================================

# === CATEGORY NAME ===
domain1.tld
domain2.tld

# === INTERESTING SUBDOMAINS (from THOT/crt.sh) ===
# These reveal internal infrastructure
subdomain.domain.tld
```
- One domain per line
- Comments for category headers (`# === CATEGORY ===`)
- **Always maintain TOTAL DOMAINS count at the top** — update every time you add/remove
- No protocols (no `https://`), just bare domains/subdomains
- Separate section at the bottom for interesting subdomains (mail, intranet, admin, webmail, cpanel, autodiscover, ebank, owa, vpn, zabbix, git)
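
The formatting rules above are easy to break by hand, especially the running count. The sketch below shows one way to enforce them, assuming bash with standard `sed`, `grep`, and `tr`; the function names `normalize_domain`, `count_domains`, and `refresh_total` are hypothetical helpers introduced for this example.

```shell
# normalize_domain: strip scheme, path/query/fragment, and trailing dot; lowercase.
normalize_domain() {
  printf '%s\n' "$1" \
    | sed -E 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||; s|[/?#].*$||; s|\.$||' \
    | tr '[:upper:]' '[:lower:]'
}

# count_domains: count lines that are neither blank nor comments.
count_domains() {
  grep -Ecv '^[[:space:]]*(#|$)' "$1"
}

# refresh_total: rewrite the "# TOTAL DOMAINS:" header with the current count.
# Uses a temp file instead of `sed -i` so it behaves the same on GNU and BSD sed.
refresh_total() {
  local file="$1" total tmp
  total="$(count_domains "$file")"
  tmp="$(mktemp)"
  sed -E "s|^# TOTAL DOMAINS:.*|# TOTAL DOMAINS: ${total}|" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Running `refresh_total targets/<country>-websites.txt` after every add or remove keeps the header in step with the list.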

### 1.5 Per-Target Report Template (`DUMP/<TARGET>/<TARGET>.md`)
```markdown
# <AGENCY NAME>
**Sector:** <sector>
**Date:** <date>

## Domains
| Domain | Status |
|--------|--------|
| `domain.tld` | UP / DOWN / UNKNOWN |

## Tech Stack
### CMS / Framework
### Frontend / JavaScript
### External Services
### Analytics & Tracking

## Interesting Findings
- <bullet points>

## TODO
- [ ] <next steps specific to this target>
```
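
A stub report can be generated from this template so every target starts with the same headings. This is a minimal sketch, assuming bash; `new_target_report` is a hypothetical helper, and the date is filled in with the current day via `date +%F`.

```shell
# new_target_report DUMP_DIR TARGET SECTOR
# Writes DUMP_DIR/TARGET/TARGET.md from the template unless it already exists.
new_target_report() {
  local dump="$1" target="$2" sector="$3"
  local report="$dump/$target/$target.md"
  mkdir -p "$dump/$target"
  if [ -e "$report" ]; then
    return 0   # never overwrite an existing report
  fi
  cat > "$report" <<EOF
# $target
**Sector:** $sector
**Date:** $(date +%F)

## Domains
| Domain | Status |
|--------|--------|

## Tech Stack
### CMS / Framework
### Frontend / JavaScript
### External Services
### Analytics & Tracking

## Interesting Findings

## TODO
EOF
}
```

Example call: `new_target_report DUMP "<TARGET-1>" "<sector>"`.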

---

## Phase 2: Domain Discovery

### 2.1 Identify Target Sectors
Collect domains for ALL of the following categories (in priority order):

**HIGH PRIORITY:**
1. Presidency / Head of State
2. Military / Defense / Intelligence
3. Internal Security / Police / Gendarmerie
4. Cybersecurity agencies (national CERT/CIRT)
5. Energy (electricity, grid, nuclear if applicable)
6. Hydrocarbons / Fuel / Oil & Gas
7. Water & Sanitation
8. Telecommunications (regulator + all operators + ISPs)

**MEDIUM PRIORITY:**
9. All government ministries (every single one)
10. National Assembly / Parliament / Senate
11. Judiciary / Constitutional Court
12. Oversight bodies (audit, anti-corruption)
13. Government agencies & services (tax, customs, employment, land registry)
14. State-owned enterprises (postal, rail, mining, agriculture, lottery, social security)
15. Aviation / Transport authorities
16. Banking & Finance (central bank, major banks)

**LOW PRIORITY:**
17. Universities / Research institutions
18. State media / Broadcasting
19. Investment / Economic development portals
20. Hospitals / Health institutions
21. Private sector / Commercial (discovered via THOT)
22. NGO / Civil society
23. Religious / Community organizations

### 2.2 Discovery Methods (Manual — Claude)
Run these searches in parallel for speed:

1. **Web search:** `<Country> government websites .<tld> list ministries agencies`

2. **Web search:** `<Country> military defense police gendarmerie intelligence website`
3. **Web search:** `<Country> energy electricity water telecom critical infrastructure websites`
4. **Web search:** `<Country> banks financial institutions websites`
5. **Web search:** `<Country> universities hospitals public institutions websites`
6. **Web search:** `<Country> state owned enterprises SOE websites`
7. **Web search:** `site:gov.<tld>` to find all government subdomains
8. **Web search:** `<Country> cybersecurity CERT CIRT national agency`
9. **Fetch:** Country's main government portal — extract all ministry links
10. **Fetch:** `servicepublic.gov.<tld>` or equivalent public admin directory — scrape all agency links
11. **Fetch:** Presidential/PM website — look for `/ministerial-websites/` or similar directory pages
12. **Fetch:** Wikipedia — "Cabinet of <Country>", "List of companies of <Country>"
13. **Web search:** TLD domain lists (e.g., `webatla.com/tld/<tld>/`)
14. **Web search:** Domain-specific guesses: `"domain1.<tld>" OR "domain2.<tld>"` to confirm existence

### 2.3 Discovery Methods (Automated — THOT Domain Harvester)
After manual discovery, run THOT on VM106 to massively expand the list via certificate transparency logs:

```bash
# Connect through Proxmox host (VM106 may not be directly SSH-reachable)
ssh root@10.0.0.245

# Launch harvester in background on VM106
qm guest exec 106 -- bash -c 'nohup /home/kali/Documents/Thot-Odint/LINUX/thot_domain_harvester -t .<tld> > /tmp/<tld>-harvest.txt 2>&1 &'

# Check progress
qm guest exec 106 -- bash -c 'wc -l /tmp/<tld>-harvest.txt; ps aux | grep thot_domain_harvester | grep -v grep | wc -l'

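# When done (process count = 0), extract clean unique domains.
# Caution: `n?` and `sed "s/^n//"` appear intended to strip a stray leading "n" from the output,
# but they also remove the first letter of real domains that begin with "n"; spot-check those.
qm guest exec 106 -- bash -c 'grep -oP "\[\+\]\s+n?\K[a-zA-Z0-9._-]+\.<tld>" /tmp/<tld>-harvest.txt | sed "s/^n//" | sed "s/^\.//" | sort -u'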
```

- **THOT pulls from:** crt.sh (certificate transparency) and other OSINT sources
- **Expected yield:** 5-10x more domains than manual discovery alone
- **Key finds:** mail servers, intranet portals, autodiscover endpoints, admin panels, e-banking, OWA, cpanel, webmail, VPN endpoints, monitoring (zabbix), git repos
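
Once the harvest has been pulled back to the local machine, only the domains not already in the target list need review. The following is a minimal sketch of that comparison, assuming the harvest has been saved as a plain file with one bare domain per line; `new_domains` is a hypothetical helper name.

```shell
# new_domains TARGET_LIST HARVEST_LIST
# Prints domains present in HARVEST_LIST but absent from TARGET_LIST.
new_domains() {
  local targets="$1" harvest="$2"
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from the target list.
  # Comment and blank lines in the target list are harmless as patterns:
  # with -x they can only match identical comment or blank lines in the harvest.
  sort -u "$harvest" | grep -Fxv -f "$targets"
}
```

Write the result to a separate file for review (for example `new_domains targets/<country>-websites.txt harvest.txt > new.txt`), then paste the entries under the right category headers and update the `TOTAL DOMAINS` count.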

### 2.4 Country Context to Gather
- Regime type (democracy, junta, monarchy, etc.)
- Recent political changes (coups, elections, restructuring)
- TLD info (who operates the registry, how many domains registered)
- Geopolitical alignment (affects digital infra partnerships)
- Language(s) of government sites
- Internet penetration / electrification rates

---

## Phase 3: Initial Tech Stack Recon

### 3.1 For Each Live Target, Fetch and Extract:
- CMS (WordPress, Joomla, Drupal, TYPO3, custom)
- CMS version if detectable
- Plugins / themes / page builders
- JavaScript libraries and versions
- CDN / WAF presence (Cloudflare, Akamai, etc.)
- Analytics (Google Analytics, Matomo, etc.)
- External services (Google Fonts, social embeds, etc.)
- SSL/TLS status
- Interesting artifacts (dev references, exposed endpoints, debug info)

### 3.2 Save Reports to DUMP
For each analyzed target, create `DUMP/<TARGET>/<TARGET>.md` using the template from 1.5. Drop any supporting files (screenshots, cached pages, raw output) in the same subfolder.

---

## Phase 4: INDEX.md Master Report

### 4.1 Structure
The INDEX.md should contain:

1. **Header** — project name, date, scope, link to target list with count
2. **Country Context** — regime, TLD, language, geopolitical notes
3. **Interesting Findings** — numbered sections by theme:
   - Military / Defense posture
   - Cybersecurity posture
   - Tech stack patterns
   - Web hygiene issues
   - Government structure intelligence
   - Critical infrastructure gaps
   - Domain & hosting intelligence
   - Telecom landscape
   - State-owned enterprises
4. **Target Summary Table** — sector, domain count, priority level
5. **Per-Target Report Links** — table linking to `DUMP/<TARGET>/<TARGET>.md` reports
6. **Next Steps** — checklist of remaining recon tasks

### 4.2 Update Rules
- Update INDEX.md after every major discovery phase
- Always keep findings current — don't let it go stale
- Link back to target list and per-target reports in DUMP/
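
The per-target links table in INDEX.md can be regenerated from the contents of `DUMP/` rather than maintained by hand. This is a minimal sketch, assuming bash and target folder names without spaces (spaces would need URL-encoding inside Markdown links); `report_links` is a hypothetical helper.

```shell
# report_links DUMP_DIR
# Prints a Markdown table with one row per DUMP/<TARGET>/<TARGET>.md that exists.
report_links() {
  local dump="$1" dir target
  printf '| Target | Report |\n|--------|--------|\n'
  for dir in "$dump"/*/; do
    [ -d "$dir" ] || continue          # skip the literal pattern when DUMP is empty
    target="$(basename "$dir")"
    if [ -f "$dir$target.md" ]; then
      printf '| %s | [%s.md](DUMP/%s/%s.md) |\n' "$target" "$target" "$target" "$target"
    fi
  done
}
```

Running `report_links DUMP` from the project root prints a table that can be pasted into the "Per-Target Report Links" section.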

---

## Phase 5: Tool Deployment

### 5.1 Huntr
Copy a fresh instance of Huntr into `TOOLS/Huntr/`:
```bash
cp -r "C:\Users\Squir\Desktop\Huntr" "<Project>/TOOLS/Huntr"
```
Before running against the new target list:
- Delete old `huntr.db`, `huntr.db-shm`, `huntr.db-wal`
- Clear `logs/` directory
- Point Huntr at `targets/<country>-websites.txt`
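
The cleanup steps above can be wrapped in a small helper. This is a sketch only: it assumes the three database files sit in the top level of the Huntr folder and that `logs/` is a direct subfolder, which should be checked against the actual Huntr layout. `reset_huntr` is a hypothetical name.

```shell
# reset_huntr HUNTR_DIR
# Removes the old database files and empties logs/ in a fresh Huntr copy.
reset_huntr() {
  local dir="$1"   # path to the fresh copy, e.g. "<Project>/TOOLS/Huntr"
  rm -f "$dir/huntr.db" "$dir/huntr.db-shm" "$dir/huntr.db-wal"
  # Empty logs/ but keep the directory itself.
  if [ -d "$dir/logs" ]; then
    find "$dir/logs" -mindepth 1 -delete
  fi
}
```

Run it against the project copy only, never against the original Huntr folder on the Desktop.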

### 5.2 THOT (on VM106)
THOT lives on VM106 at `/home/kali/Documents/Thot-Odint/`. Do not copy it into the project; run it remotely via `qm guest exec` from the Proxmox host (see 2.3) and pull the results back to the local machine.

---

## Phase 6: Next Steps Checklist (Post-Setup)

After the project structure and initial discovery are complete, the standard follow-up tasks are:

- [ ] Validate all domains — confirm live, geo-blocked, or dead
- [ ] DNS enumeration (A, AAAA, MX, NS, TXT, CNAME) on all targets
- [ ] Subdomain brute-force on `gov.<tld>` and key domains (active technique, outside the passive-only scope stated in Purpose)
- [ ] Passive recon (Shodan, Censys) for IP ranges, open ports
- [ ] robots.txt / sitemap.xml harvesting
- [ ] Wayback Machine historical snapshots
- [ ] Google dorking: `site:<tld> filetype:pdf|doc|xls|env|sql|bak|conf`
- [ ] CMS version fingerprinting
- [ ] SSL/TLS certificate transparency log analysis
- [ ] WHOIS on all domains
- [ ] ASN mapping (map IPs to autonomous systems)
- [ ] Check for exposed admin panels (/wp-admin, /administrator, /typo3)
- [ ] Enumerate user-facing portals (recruitment, e-services, e-banking)
- [ ] Run Huntr against full target list
- [ ] Document exposed credentials in `EXPOSED CREDENTIALS/`
- [ ] Write final report in `REPORTS/`

---

## Quick Reference: New Country Kickoff

When starting a new country, tell Claude:

> "Set up a country OSINT project for [COUNTRY]. Follow the SOP at
> `Desktop/Burkina Faso/SOP-COUNTRY-OSINT-SETUP.md`"

This should trigger:
1. Folder creation on Desktop (`<Country>/` with `targets/`, `DUMP/`, `TOOLS/`, `EXPOSED CREDENTIALS/`, `REPORTS/`)
2. Parallel web searches across all sector categories
3. Fetch government portals and directories for links
4. Build the full domain list with count in `targets/<country>-websites.txt`
5. Run THOT Domain Harvester on VM106 for crt.sh expansion
6. Merge THOT results into target list
7. Copy fresh Huntr into `TOOLS/`
8. Initial tech stack recon on priority targets → save to `DUMP/`
9. Compile INDEX.md with all findings

**Expected output:** Fully organized project folder with 200+ domains categorized, initial tech stack analysis on priority targets, per-target reports in DUMP/, tools ready to run, and a comprehensive INDEX.md with all passive findings.
