How to Build an Autonomous OSINT Agent in Python Using Claude's Tool Use API
When I started studying OSINT, I always felt I was just putting random values into software without deeply understanding what I was doing. After months in the field, I realized I wasn't really investigating; I was just executing steps that follow a predictable pattern. That's exactly what an AI agent is good at. So I built one.

In this tutorial you'll learn how to set up OpenOSINT, an open-source Python OSINT framework with an AI agent at its core. You'll learn how Claude's native tool use API works, how to run autonomous investigations from the terminal using the interactive AI REPL, how to use the direct CLI for scripting, and how to expose all the tools to Claude Code or Claude Desktop via an MCP server.

- What Is OSINT and Why Manual Workflows Break Down
- What You'll Build
- Prerequisites
- How Claude's Tool Use API Works
- How to Install OpenOSINT
- How to Use the Interactive AI REPL
- How to Run Individual Tools from the CLI
- How to Set Up the MCP Server
- How the Agent Loop Works Under the Hood
- Project Architecture
- Conclusion

What Is OSINT and Why Manual Workflows Break Down

Open Source Intelligence (OSINT) is the practice of collecting and analyzing information from publicly available sources. Security researchers use it during penetration tests. Journalists use it to verify identities and trace connections. Threat analysts use it to profile infrastructure.

A typical OSINT workflow looks like this:

- You have a target email address
- You run `holehe` to find which platforms that email is registered on
- You notice a username in the output
- You manually copy that username and run `sherlock` to search 300+ platforms
- You switch to a browser to check HaveIBeenPwned
- You open another tab for a WHOIS lookup
- You take notes and repeat

Every tool is a silo. Every pivot is manual. The investigation logic (what to run next, what to chain, what the findings mean) lives entirely in your head. When you close the terminal, it's gone.

This tutorial walks you through OpenOSINT, an open-source Python framework that replaces that fragmented workflow with an AI agent that chains tools autonomously, executes them against real binaries, and saves a structured Markdown report. More importantly, you'll learn the core design principle that makes it trustworthy for security research: hallucination in tool results is structurally impossible.

What You'll Build

By the end of this tutorial, you'll have a working OSINT agent that you can use in three ways:

- Interactive AI REPL: type a target in natural language and the agent decides what to run
- Direct CLI: run individual tools without AI, useful for scripting
- MCP Server: expose all tools to Claude Code or Claude Desktop

Here's what a real session looks like:

```
$ openosint
openosint ❯ investigate [email protected]
  → generate_dorks('[email protected]')
  → search_email('[email protected]')
  ✓ Found: Spotify, WordPress, Gravatar, Office365
  → search_breach('[email protected]')
  ✓ Found in 2 breaches: LinkedIn (2016), Adobe (2013)
  → search_username('target_handle')
  ✓ Found on: GitHub, Reddit, HackerNews, Twitter
╭──────────────── Report ────────────────╮
│ ## Online Presence                     │
│ Spotify · WordPress · Gravatar         │
│                                        │
│ ## Data Breaches                       │
│ LinkedIn (2016) · Adobe (2013)         │
╰────────────────────────────────────────╯
✓ Report saved → reports/2026-05-11_report.md
```

The agent went from email → linked accounts → username pivot → cross-platform search with no human orchestration at any step.

Prerequisites

To follow this tutorial, you'll need:

- Python 3.10 or later installed on your machine
- Basic familiarity with the command line
- An Anthropic API key (only required for the AI REPL, not for the CLI or MCP server)
- Git installed

You don't need prior experience with OSINT tools or the Anthropic SDK.

How Claude's Tool Use API Works

Before you dive into installation, it's worth understanding the mechanism that makes this framework trustworthy for security research.

Many AI applications that wrap external tools work by generating text that describes what a tool would return. That's a problem when accuracy matters: the model can hallucinate plausible-looking usernames, fake subdomains, or data breaches that never happened.

Claude's tool use API works differently. When the model decides it needs to call a tool, it does not generate the output. It stops and emits a structured `tool_use` block containing the tool name and the arguments it wants to pass. Your code then runs the actual binary (`holehe`, `sherlock`, or whatever else) and sends the real output back as a `tool_result`. The model reads that real output and decides its next step.

Here's the flow:

```
User prompt
    ↓
Model decides to call search_email()
    ↓
Hard stop: model emits tool_use block
    ↓
Your code runs holehe against the real target
    ↓
Real output sent back as tool_result
    ↓
Model reads actual results, decides next step
    ↓
Repeat until investigation is complete
```

The model never generates tool output. It only ever reads it. If `sherlock` finds 12 profiles, those 12 URLs go back into the context verbatim. The model cannot add a 13th to the tool result.

This is not a prompting trick or a system prompt instruction. It is how the API is architected. Keep this in mind as you read through the agent loop code later in this tutorial.

How to Install OpenOSINT

Start by cloning the repository and installing the package:

```bash
git clone https://github.com/OpenOSINT/OpenOSINT.git
cd OpenOSINT
pip install -e .
```

Alternatively, if you just want to use the tool without modifying the source, install it directly from PyPI:

```bash
pip install openosint
```

Next, set your Anthropic API key. This is only required for the interactive AI REPL; the direct CLI and MCP server work without it:

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

How to Install the External Tool Dependencies

OpenOSINT wraps several standalone OSINT tools. Install the ones you plan to use:

```bash
pip install holehe            # email account enumeration
pip install sherlock-project  # username search across 300+ platforms
pip install sublist3r         # subdomain enumeration
```

For phone intelligence, `phoneinfoga` is a standalone binary. Download the release for your platform from its GitHub releases page and place it somewhere in your PATH.

How to Configure Optional API Keys

Two tools read API keys from the environment. The HaveIBeenPwned key is needed for breach checks, while the ipinfo.io token only raises rate limits:

```bash
export HIBP_API_KEY=your_key    # required for breach checks via HaveIBeenPwned v3
export IPINFO_TOKEN=your_token  # optional: raises ipinfo.io rate limits
```

If a binary is missing or an API key is not configured, that specific tool returns a descriptive error string. All other tools continue to work normally.

How to Use the Interactive AI REPL

Run `openosint` with no arguments to start the AI-powered REPL. You can also use `openosint shell`, which is equivalent:

```bash
$ openosint
# or
$ openosint shell
```

If you prefer to pass the API key inline rather than via environment variable, use the `--api-key` flag:

```bash
$ openosint --api-key sk-ant-...
```

You'll get a prompt where you can type targets or questions in natural language:

```
openosint ❯ investigate [email protected]
openosint ❯ find all accounts for johndoe99
openosint ❯ what subdomains does example.com have?
openosint ❯ check if +14155552671 is a mobile number
```

The agent decides which tools to run based on your input. You don't need to specify which tools to use or in what order. If you type an email address, the agent will run email enumeration. If it finds a linked username, it may pivot and search that username across platforms.

Reports are saved automatically to the `reports/` directory after every investigation that produces structured findings.

Here are the commands available inside the REPL:

| Command | Description |
| --- | --- |
| `clear` | Reset the conversation memory |
| `save` | Manually save the last report |
| `tools` | Show available tools and their status |
| `config` | Show current configuration |
| `help` | List all commands |
| `exit` or Ctrl-D | Quit |

How to Run Individual Tools from the CLI

If you want to run a single tool without the AI layer (for scripting, automation, or quick lookups), use the direct CLI:

```bash
# Email account enumeration (default timeout: 120s)
openosint email [email protected]

# With a custom timeout in seconds
openosint email [email protected] -t 60

# Username search across 300+ platforms (default timeout: 180s)
openosint username johndoe99

# Enable verbose output for debugging
openosint -v email [email protected]
```

The direct CLI doesn't require an Anthropic API key. It runs the underlying binary and prints the output to the terminal. This mode is useful when you need predictable, scriptable behavior, for example piping output into another tool or running automated checks.

How to Set Up the MCP Server

OpenOSINT also ships as a Model Context Protocol (MCP) server. This exposes all 9 tools to any MCP-compatible AI client.

How to Register with Claude Code

```bash
claude mcp add openosint python /absolute/path/to/OpenOSINT/openosint/mcp_server.py
```

Verify the registration worked:

```bash
claude mcp list
```

Once registered, you can drive investigations from the Claude Code prompt:

```
> Investigate [email protected]. If you find a linked username, trace it across other platforms and compile a full report.
```

How to Configure Claude Desktop

Add the following to your Claude Desktop config at `~/Library/Application Support/Claude/claude_desktop_config.json` (the macOS location):

```json
{
  "mcpServers": {
    "openosint": {
      "command": "python",
      "args": ["/absolute/path/to/OpenOSINT/openosint/mcp_server.py"]
    }
  }
}
```

Restart Claude Desktop after saving the file. The tools will appear in Claude's tool list.

The MCP server uses stdio transport and does not need a persistent background process. Claude Code or Claude Desktop starts it on demand.

How the Agent Loop Works Under the Hood

Here is a simplified version of the agent loop from `openosint/agent.py`:

```python
import anthropic
import asyncio

# TOOL_SCHEMAS, execute_tool() and extract_text() are not shown in this simplified version
client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY from the environment

async def run_investigation(user_prompt: str) -> str:
    messages = [{"role": "user", "content": user_prompt}]

    while True:
        response = client.messages.create(
            model="claude-...",
            max_tokens=4096,
            tools=TOOL_SCHEMAS,  # JSON schemas for all 9 tools
            messages=messages,
        )

        # Agent is done: extract and return the final report
        if response.stop_reason == "end_turn":
            return extract_text(response)

        # Agent needs a tool: run the real binary
        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    # Runs holehe, sherlock, etc. as real subprocesses
                    real_output = await execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": real_output,  # real output, never generated
                    })

            # Append assistant turn and real tool results to conversation
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
            continue

        # Any other stop reason (for example "max_tokens"): return what exists
        # instead of resending the same messages forever
        return extract_text(response)
```

There are a few important things to understand in this code.

- The loop runs until `stop_reason == "end_turn"`: The agent decides when it has gathered enough information to write the final report. It may call one tool or ten, depending on what it finds.
- `execute_tool()` runs real subprocesses: It's a thin async wrapper around Python's `asyncio.create_subprocess_exec()` with a configurable timeout. There's no simulation and no mocked data at any point.
- Conversation history is maintained across the entire loop: Each tool result goes back into `messages`, so the model always has full context of what it found when deciding what to run next.
- Tool schemas are defined as JSON: Each tool has a name, description, and parameter schema. The model uses these to know what tools exist and what arguments they accept.

Here's a simplified example for `search_email`:

```python
{
    "name": "search_email",
    "description": (
        "Enumerates online services and social accounts "
        "associated with an email address using holehe."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "Target email address"
            }
        },
        "required": ["email"]
    }
}
```

The same pattern applies to all 9 tools. The schemas are sent with every request, and the model uses them to decide what's available and how to call it.

Project Architecture

The codebase is organized in five layers. The hard rule across the codebase is that no layer imports from a layer above it:

- `openosint/tools/` (core tools): Async wrappers around external binaries and APIs. Stateless. No AI. No CLI. Pure functions.
- `openosint/agent.py` (AI agent): Anthropic tool use loop. Per-session conversation history. Imports from `tools/`. Nothing imports from `agent.py`.
- `openosint/repl.py`: Interactive REPL (prompt_toolkit + Rich)
- `openosint/mcp_server.py`: MCP server (stdio transport)
- `openosint/cli.py`: CLI entry point

This separation makes each layer independently testable. The core tools are pure async functions that take a string and return a string, so you can unit test them without touching the agent or the CLI.

It also means the AI layer is entirely optional. If you don't have an Anthropic API key, you use the CLI and bypass the agent. The MCP server also operates independently of the agent.

The 9 Available Tools

| Tool | Backend | What it returns |
| --- | --- | --- |
| `search_email` | holehe | Social accounts linked to an email |
| `search_username` | sherlock | Accounts across 300+ platforms |
| `search_breach` | HaveIBeenPwned v3 | Breach names, dates, leaked data types |
| `search_whois` | python-whois | Registrant, registrar, creation/expiry |
| `search_ip` | ipinfo.io | Geolocation, ASN, hostname, org |
| `search_domain` | sublist3r | Subdomain enumeration |
| `generate_dorks` | built-in | 12 targeted Google dork URLs, no network calls |
| `search_paste` | psbdmp.ws | Pastebin dump mentions |
| `search_phone` | phoneinfoga | Carrier, country, line type |

Conclusion

In this tutorial, you learned how to set up and use OpenOSINT, a Python OSINT framework built on Claude's tool use API.

The key takeaway is the design principle: by using native tool use, the agent never generates tool output. It only reads real output from real binaries. This makes it suitable for security research where accuracy matters and hallucination isn't an acceptable failure mode.

To recap the three interfaces:

- Run `openosint` for the interactive AI REPL: best for full investigations with automatic chaining
- Run `openosint email` or `openosint username` for direct CLI access: best for scripting and automation
- Register the MCP server in Claude Code or Claude Desktop to run investigations inside your existing AI environment

The full source code is available on GitHub under the MIT license. Contributions and issues are welcome.

Legal note: OpenOSINT is for authorized security research, penetration testing, and investigative journalism only. Users are solely responsible for compliance with applicable law, including GDPR, CCPA, and the CFAA. See the DISCLAIMER.md for the full notice.
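As a companion to the explanation of how Claude's tool use API works, here is a minimal offline sketch of the hand-off from `tool_use` blocks to `tool_result` blocks. It uses plain dictionaries in place of SDK response objects, and `build_tool_results` and `fake_search_email` are hypothetical names that are not part of OpenOSINT. The point it tries to show is that the result content is whatever the executor returned, copied through unchanged.

```python
from typing import Callable

def build_tool_results(content_blocks: list[dict],
                       executors: dict[str, Callable[..., str]]) -> list[dict]:
    """Pair every tool_use block with a tool_result holding the executor's output."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # text blocks carry no tool call
        executor = executors.get(block["name"])
        if executor is None:
            output = f"error: unknown tool {block['name']!r}"
        else:
            output = executor(**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the id of the tool_use block
            "content": output,           # copied through verbatim
        })
    return results

def fake_search_email(email: str) -> str:
    # Stand-in for a real subprocess call, used only to keep the sketch offline
    return f"2 services found for {email}"

blocks = [
    {"type": "text", "text": "Checking the address."},
    {"type": "tool_use", "id": "toolu_01", "name": "search_email",
     "input": {"email": "someone@example.org"}},
]
tool_results = build_tool_results(blocks, {"search_email": fake_search_email})
print(tool_results)
```

In a real loop the list returned here would be sent back as the content of the next `user` message, which is the step the agent loop performs after each tool call.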
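The installation notes say that a missing binary or an unset API key only disables the affected tool. A small preflight check along those lines might look like the sketch below. The executable names in `BINARIES` are assumptions based on the packages named in this tutorial, and `preflight` is a hypothetical helper, not an OpenOSINT function.

```python
import os
import shutil
from typing import Mapping

# Assumed executable names; adjust them if your installs differ
BINARIES = ["holehe", "sherlock", "sublist3r", "phoneinfoga"]
API_KEYS = ["HIBP_API_KEY", "IPINFO_TOKEN"]

def preflight(binaries: list[str] = BINARIES,
              env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Report which external binaries are on PATH and which API keys are set."""
    status = {}
    for name in binaries:
        status[name] = "ok" if shutil.which(name) else "missing from PATH"
    for key in API_KEYS:
        status[key] = "set" if env.get(key) else "not set"
    return status

for item, state in preflight().items():
    print(f"{item:<14} {state}")
```

Inside the REPL, the `tools` command already reports tool status; this standalone version is only meant for scripts that run before the REPL starts.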
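The agent loop section describes `execute_tool()` as a thin async wrapper around `asyncio.create_subprocess_exec()` with a configurable timeout. The sketch below shows one way such a wrapper could be written. It is not the project's actual implementation: `run_binary` is a hypothetical name, the error strings are invented for illustration, and the demo runs the current Python interpreter in place of an OSINT binary.

```python
import asyncio
import sys

async def run_binary(argv: list[str], timeout: float = 120.0) -> str:
    """Run a command and return its output, or a descriptive error string."""
    try:
        proc = await asyncio.create_subprocess_exec(
            *argv,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,  # fold stderr into the same stream
        )
    except FileNotFoundError:
        return f"error: {argv[0]} is not installed or not in PATH"
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # stop the child so it does not outlive the call
        await proc.wait()  # reap it to avoid a zombie process
        return f"error: {argv[0]} timed out after {timeout:g}s"
    return stdout.decode(errors="replace").strip()

# Demo with the current Python interpreter standing in for an OSINT binary
demo_output = asyncio.run(run_binary([sys.executable, "-c", "print('12 profiles')"]))
missing_output = asyncio.run(run_binary(["definitely-not-a-real-binary-xyz"]))
print(demo_output)
print(missing_output)
```

Returning an error string instead of raising keeps the failure visible to the model as an ordinary tool result, which matches the behavior the tutorial describes for missing binaries.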
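If your Claude Desktop config already lists other MCP servers, editing the JSON by hand risks overwriting them. The following sketch merges the `openosint` entry from the MCP setup section into an existing config dictionary. `add_mcp_server` is a hypothetical helper and the `other` server entry is made-up sample data; reading and writing the actual file is left out.

```python
import json

def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of a Claude Desktop config with one MCP server entry added."""
    updated = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    updated.setdefault("mcpServers", {})[name] = {"command": command, "args": args}
    return updated

# Sample existing config with an unrelated, made-up server entry
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
merged = add_mcp_server(
    existing,
    "openosint",
    "python",
    ["/absolute/path/to/OpenOSINT/openosint/mcp_server.py"],
)
print(json.dumps(merged, indent=2))
```

The helper returns a new dictionary and leaves the input untouched, so a failed write cannot leave the in-memory config half edited.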