A compilation of links to datajournalism & OSINT tools, guides and resources I find useful to keep at hand. PRs welcome!
- 🌐 = online tool/service/resource
- 💻 = software
- 📖 = guide/tutorial
- 📋 = list of tools/resources
- 🐍 = Python module
- 💲 = paid or paid-only tool/service
- APIs
- Archival
- Breached Data
- Companies
- Data Analysis & Manipulation
- Lists of tools & resources
- Location, Maps, Satellite Imagery
- Multi-purpose tools
- Phone numbers
- Pictures, Photos, Videos
- Social Networks
- Text & Documents
- Visualization
- Weather
- Websites
- Misc
- Postman 💻 - API development environment offering useful tools for crafting and debugging API requests.
- ProgrammableWeb 🌐 - A good API directory.
- Public APIs 📋 - A categorized list of APIs.
- archive.today 🌐 - Saves pages as screenshots, useful for websites the Wayback Machine can't handle.
- Firefox Screenshots 💻 - Firefox can take a screenshot of a full page (i.e. a 'scrolling' screenshot).
- How to Archive Open Source Materials 📖 (Bellingcat)
- Hunch.ly 🌐💲 - Web capture tool designed for online investigations ($129.99/year).
- Internet Archive Wayback Machine 🌐
- waybackpack 💻🐍 - Command-line utility & Python library to download content from the Wayback Machine. See this example.
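As a companion to the Wayback Machine entries above, here is a minimal sketch of how snapshot lookups can be addressed by URL. It does not use waybackpack; it only builds request URLs with the standard library. The two endpoint constants reflect the publicly documented Wayback Machine URL patterns as I understand them, and the function names are hypothetical helpers introduced for illustration.

```python
from datetime import datetime
from urllib.parse import urlencode

# Assumed endpoints: the Wayback availability API and the snapshot URL prefix.
WAYBACK_AVAILABILITY_API = "https://archive.org/wayback/available"
WAYBACK_SNAPSHOT_PREFIX = "https://web.archive.org/web"


def availability_query(url, when=None):
    """Build a query URL asking whether an archived copy of `url` exists.

    If `when` is given, the closest snapshot to that moment is requested.
    """
    params = {"url": url}
    if when is not None:
        params["timestamp"] = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_AVAILABILITY_API}?{urlencode(params)}"


def snapshot_url(url, when):
    """Build the address of the snapshot of `url` closest to `when`."""
    return f"{WAYBACK_SNAPSHOT_PREFIX}/{when.strftime('%Y%m%d%H%M%S')}/{url}"


if __name__ == "__main__":
    print(availability_query("example.com", datetime(2019, 1, 1)))
    print(snapshot_url("http://example.com/", datetime(2019, 1, 1)))
```

Fetching these URLs (with `urllib.request` or a similar client) is left out on purpose so the sketch stays free of network access.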
- Breach Data Search Engines Comparison 📖 (IntelTechniques)
- Dehashed 🌐💲 - Find cleartext & hashed passwords from data breaches (paid, $4/week, $11/mo).
- GhostProject 🌐 - Check if an email appears in a breach. Shows the first 3 characters of the password for free.
- h8mail 💻 - Find passwords through different breach and reconnaissance services. Can also search the BreachedCompilation torrent.
- Have I Been Pwned? 🌐 - Check if an email appears in a breach, set up alerts.
- CompaniesHouse Short Guide 📖 (Bellingcat) - A guide to the UK online company registry.
- ICIJ's Offshore Leaks Database 🌐 - Data on offshore companies, foundations and trusts from the Panama Papers, the Offshore Leaks, the Bahamas Leaks and the Paradise Papers investigations.
- List of company registers 📋 (Wikipedia) - A list of company registers, by country.
- OCCRP Data 🌐 - Fantastic search tool & resources made available by OCCRP. Public records, leaks, scraped business registers, and more.
- OpenCorporates 🌐 - A very comprehensive companies database. Has an API.
- csvkit 💻 - A suite of command-line tools for converting to and working with CSV files.
- OpenRefine 💻 - Clean & transform messy data.
- pandas 🐍 - Powerful Python data analysis library. Best used in a Jupyter notebook.
- tabula 💻 - An open-source tool for extracting data tables from PDF documents.
- Infoga 💻 - Gather email account information (IP, hostname, country, etc.) from different public sources.
- theHarvester 💻 - Python command-line tool that queries several search engines for email addresses from a particular domain.
- The most complete guide to finding anyone's email 📖 (Blurbiz)
- IntelTechniques.com 📋 - Resources, links & OSINT tools, organized by target data.
- Guides 📖 (Bellingcat) - OSINT & Datajournalism how-tos.
- Online Investigation Toolkit 📋 (Bellingcat)
- awesome-osint 📋 - A curated list of open source intelligence tools and resources.
- OSINT framework 📋 - Tree list of OSINT tools & resources.
- OSINT Collection 📋 - Collection of OSINT related resources.
- I-Intelligence's Open Source Intelligence Tools and Resources Handbook 2018 📋 - Very complete list of OSINT tools & resources, organized by category. No descriptions.
- AutomatedOSINT.com 🌐 - A blog about automating OSINT techniques using Python.
- netbootcamp 📋 - Custom search forms and lists of resources by theme.
- How To Use Google Earth's Three Dimensional View 📖 (Bellingcat)
- Identify Burnt Villages on Satellite Imagery 📖 (Bellingcat)
- Photo Interpretation Student Handbook 📖 (US Defense Mapping Agency, 1996) - Old unclassified handbook on analyzing aerial & satellite imagery. General principles & specifics for buildings, industries, transportation & communication facilities.
- Using Time Lapse Satellite Imagery to Detect Infrastructure Changes 📖 (Bellingcat)
- Baidu Maps 🌐 - Streetview = Panorama (百度全景)
- Bing Maps 🌐
- GeoNames 🌐 - Geographical database.
- Google Maps 🌐
- Google Earth 🌐
  - Google Earth Outreach - Advanced Google Earth tutorials. Example: Image & Photos Overlay
  - Google Earth Engine - Datasets, case studies, etc.
  - GEarth Blog - Resources & how-tos about Google Earth.
- Satellite imagery providers:
  - Copernicus Open Access Hub 🌐 - Free access to imagery from the European Sentinel satellites.
  - Descartes Labs 🌐💲
  - DigitalGlobe Discover 🌐 - Search for satellite imagery of a particular location. Ability to download images (low-resolution compared to Google Earth).
  - NASA EarthData & NASA EarthViewer 🌐
  - USGS Earth Explorer 🌐 - NASA Landsat imagery.
  - SentinelHub 🌐 - Satellite imagery, historical data from several sources, vegetation infrared & index, image exports & comparison. 2 products:
    - Playground - Data discovery, playing around.
    - EO Browser - Compare full resolution images from several sources (Landsat, Sentinel), make time lapses & export to GIF (free signup required).
    - See also the custom scripts to highlight fire, snow, metals, type of terrain, etc.
- Yandex Maps 🌐 - Has a "Streetview" feature.
- Geographic Bounding Box Drawing Tool 🌐 - Draw a rectangle over a map and get the coordinates of its points & center.
- Shadows and Angles: Measuring Object Heights from Satellite Imagery 📖 (GISLounge)
- SunCalc 🌐 - Historical solar data (sun orientation & elevation, shadow length, etc).
- TerraPattern 🌐 - Scan large geographical areas for specific visual features using machine learning. Only available for 7 cities.
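The shadow-measurement entries above (Shadows and Angles, SunCalc) rest on simple trigonometry: on flat ground, an object's height equals its shadow length multiplied by the tangent of the sun's elevation angle. The sketch below illustrates that relationship only. The function names are hypothetical, and it assumes level terrain, a vertical object, and a shadow length already converted to metres.

```python
import math


def height_from_shadow(shadow_length_m, sun_elevation_deg):
    """Estimate object height from its shadow length and the sun's elevation.

    Assumes flat ground and a vertical object: height = shadow * tan(elevation).
    """
    if not 0 < sun_elevation_deg < 90:
        raise ValueError("sun elevation must be strictly between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))


def shadow_from_height(height_m, sun_elevation_deg):
    """Inverse relation: expected shadow length for a known object height."""
    if not 0 < sun_elevation_deg < 90:
        raise ValueError("sun elevation must be strictly between 0 and 90 degrees")
    return height_m / math.tan(math.radians(sun_elevation_deg))


if __name__ == "__main__":
    # Hypothetical measurement: a 25 m shadow with the sun 30 degrees above the horizon.
    print(round(height_from_shadow(25.0, 30.0), 2))
```

The sun elevation for a given place and time would come from a tool such as SunCalc, and the shadow length from measuring on the imagery. Sloped terrain or an off-nadir satellite view will add error that this sketch ignores.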
- EchoSec 🌐💲 - Search and analyze social media data based on location ($499/mo).
- GeoCreepy 💻 - Geolocation information gathering through social networking platforms (discontinued).
- OpenStreetMap 🌐 - User-generated locations & maps. Use taginfo and/or overpass-turbo.eu to search a location by key/value tags (see OSM's Wiki).
- Social networks (see category)
- Tourism & review websites: Foursquare, TripAdvisor, Yelp, etc. 🌐
- Vkontakte 🌐 - Use `near:<coordinates>` in a search.
- Wikimapia 🌐 - User-generated locations & descriptions. Has an API. Also allows switching between satellite imagery from Google, Bing, OSM.
- Belati 💻 - Command-line OSINT tool with whois, subdomain enumeration, mail harvesting, and more.
- Buscador 💻 - A very handy VM with plenty of pre-installed & pre-configured OSINT tools.
- DataSploit 💻 - A collection of Python scripts which automate open source intelligence searches about domain names, email addresses, IP addresses and usernames.
- Maltego CE 💻 - Interactive data mining & mapping tool.
- Spiderfoot 💻 - Open source intelligence automation tool. Gathers intelligence about a given target, which may be an IP address, domain name, hostname, network subnet, ASN, e-mail address or person's name.
- NumberWay 🌐 - International directory of white pages and yellow pages phone books.
- PhoneInfoga 💻 - Information gathering & OSINT reconnaissance tool for phone numbers.
- Exiftool 💻 - Read and edit metadata. Linode Tutorial
- Exif Viewer (Firefox/Chrome) 💻
- Ghiro 💻 - Automated image forensics tool.
- Jeffrey's Image Metadata Viewer 🌐
- StolenCameraFinder 🌐 - Search the web for pictures taken with a specific camera serial number.
- CalibreObscura 🌐 - A blog about weapons & their uses in Middle East conflicts.
- CamoPedia 🌐 - Camouflage encyclopedia. Search & compare camouflage patterns.
- How to Digitally Verify Combatant Affiliation in Middle East Conflicts 📖 (Bellingcat)
- ICUS Camouflage Index 🌐
- Imagga 🌐💲 - Picture auto-tagging tool (demos + free API plan for 2,000 images/mo, or $14/mo).
- International Encyclopedia of Uniform Insignia 🌐
- List of Comparative Military Ranks 📋 (Wikipedia)
- Small Arms Survey's Weapon ID database 🌐 - Search for small arms by caliber, type, location, etc.
- Bing Images 🌐 - Can search for part of an image.
- CitizenEvidence 🌐 - Google Images reverse search on YouTube thumbnails.
- Google Images 🌐
- IntelTechniques.com 🌐 - Various image search & reverse search tools and lists of resources.
- TinEye 🌐
- Yandex Images 🌐
- How to Conduct Comprehensive Video Collection 📖 (Bellingcat)
- IntelTechniques.com 🌐 - Various video search & reverse search tools and lists of resources.
- PimEyes 🌐 - Face-recognition matching search engine.
- SearchFace.ru 🌐 - Face recognition search engine for the Russian VK social network. See this guide from Bellingcat for a tutorial.
- SocialMapper 🐍 - Social media mapping tool that correlates profiles via facial recognition. Supports LinkedIn, Facebook, Twitter, Instagram, VKontakte, Weibo, Douban.
- Youtube Geo Search Tool 🌐 - Search YouTube videos by location & time frame.
- Advanced Guide on Verifying Video Content 📖 (Bellingcat)
- How to verify photos and videos on social media networks 📖 (France24)
- InVID Verification Plugin 💻 - Verification "Swiss army knife" Firefox extension.
- Photo Verification Cheatsheet & Video Verification Cheatsheet 📖 (FirstDraftNews)
- Verification 101 📖 - Storyful's advice for checking out material from social media, and putting it into practice.
- Verification Handbook 📖 - Handbook by the European Journalism Centre about verifying digital content in emergency coverage.
- Sherlock 💻 - Search for a username across 135 social media sites.
- Scrape Facebook with your browser 📖 (J++)
- gitrob 💻 - Find potentially sensitive files pushed to public repositories on GitHub. Requires a GitHub access token.
- Zen 💻 - Find emails of GitHub users.
- InstaLooter 💻 - Download all pictures & videos from an Instagram profile. No API key needed.
- An Investigative Guide To LinkedIn 📖 (Bellingcat)
- LinkedIn Operators Tip Sheet 📖
- raven 💻 - LinkedIn information gathering tool. Extracts employee data for a given company.
- The Endorser 💻 - Draw out relationships between people on LinkedIn via endorsements/skills.
- Reddit Insight 🌐 - Collect info on a Reddit profile, list all posts & comments.
- Reddit Investigator 🌐 - Collect info on a Reddit profile.
- tinfoleak 💻 - Very complete open-source tool for Twitter intelligence analysis. Needs API credentials.
- twarc 💻🐍 - A command line tool and Python library for archiving Twitter in JSON format.
- Tweetdeck 🌐
- Tweetdeck Location Search Tutorial 📖
- Tweets Analyzer 💻 - Twitter profile analyzer: tweet activity, locations, most used hashtags, etc. Can save tweets to JSON. Requires a Twitter API key.
- TWINT (Twitter Intelligence Tool) 💻 - Advanced Twitter scraping tool, no API key needed. Can export to text, CSV, JSON, SQLite, Elasticsearch. Can detect emails, phone numbers, profiles.
- Aleph 💻 - A toolkit for data search, management and analysis in investigative reporting.
- Blacklight 💻 - Open source Solr user interface discovery platform.
- Datashare 💻 - Index & search documents on your computer, automatically detect people, organizations and locations with NLP.
- DumpsterDiver 💻 - Analyze big volumes of various file types in search of secrets, credentials, etc.
- ICIJ Extract 💻 - A command line tool for parallelized, distributed content-extraction.
- searchbox 💻 - A simple out-of-the-box web interface to search through thousands of unstructured documents using Solr.
- NewOCR.com 🌐 - Recognizes several languages, can resize images, shortcuts to Google & Bing Translate.
- Tesseract 💻 - Open-source OCR engine.
- PDF Text Extraction with PyPDF2, Tika & PDF Miner 💻
- topia 🐍 - Python module to determine important terms within a given piece of content.
- TXM 💻 - Lexicometry and text statistical analysis for large bodies of text.
- DataWrapper 🌐💲 - Easy to use graph & map tool. Free plan available.
- Google Fusion Tables - Create maps & charts from data. Will shut down in Dec. 2019.
- Matplotlib 🐍 - Python 2D plotting library. Best used with pandas in a Jupyter notebook.
- ArcGIS 💻💲 - Mapping & analysis software (proprietary, paid, 21-day trial).
- Folium 🐍 - Python library to create Leaflet.js maps. Can be used in a Jupyter Notebook to map data from pandas.
- Geopy 🐍 - Python geocoding library. Supports OSM Nominatim, Google, Bing, GeoNames & many more.
- Google:
  - MyMaps 🌐
  - Earth 🌐
  - Earth Pro 💻
  - Earth Studio 🌐💻
- Humanitarian Data Exchange 🌐 - Useful source of shapefiles, especially for administrative boundaries.
- KML Interactive Sampler 🌐 - Lots of KML templates.
- QGIS 💻 - Free & open-source alternative to ArcGIS.
- Gephi 💻 - Powerful visualization and exploration software.
- Visual Investigative Scenarios 🌐 (OCCRP)
- yEd Graph Editor 💻
- Tik Tok 💻 - JavaScript tool to easily create simple, mobile-friendly, vertical timelines. Open-source.
- TimelineJS 💻
- Wolfram Alpha 🌐 - Weather history. Example query: "What was the weather in New York on January 1st 2017?"
- Wunderground History 🌐 - Weather history.
- OnionScan 💻
- Photon 💻 - Crawl a website (or its archive from the Wayback Machine) and extract URLs, emails, social media accounts, files, keys, subdomains, etc.
- Python scraping libraries:
  - BeautifulSoup 🐍
  - cloudflare-scrape 🐍
  - Selenium 🐍
  - Scrapy 🐍
- Scrape Interactive Geospatial Data 📖 (Bellingcat)
- Advanced Google searches:
  - Google Search Operators 📖 (moz.com)
  - Mastering Google Search Operators in 67 steps 📖 (moz.com)
  - Google Hacking Database 🌐 (Exploit.db)
  - Google Search Operators: The Complete List 📖 (ahrefs.com)
- NerdyData Search 🌐 - Search the source code of pages.
- PublicWWW 🌐 - Search the source code of pages.
- SpyOnWeb 🌐 - Search by URL, IP address, analytics codes. API with free plan. See this Bellingcat how-to for automation.
- Sublist3r 💻 - Subdomain enumeration tool.
- Unveiling hidden site connections with Google Analytics IDs 📖 (Bellingcat)
- Whois 🌐/💻 - Get registrar, owner info.
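The analytics-code entries above (SpyOnWeb, the Bellingcat guide on Google Analytics IDs) rely on one idea: sites that embed the same Google Analytics account number are often run by the same operator. The sketch below illustrates that idea on page source you have already downloaded. It assumes the classic `UA-<account>-<property>` ID format. The regular expression and function names are illustrative choices, not taken from any of the listed tools.

```python
import re
from collections import defaultdict

# Classic Google Analytics IDs look like UA-1234567-1:
# "UA-", an account number, then a per-property suffix.
GA_ID_PATTERN = re.compile(r"\b(UA-\d{4,10})-\d{1,4}\b")


def extract_account_ids(html):
    """Return the set of GA account IDs (without property suffix) found in html."""
    return set(GA_ID_PATTERN.findall(html))


def shared_accounts(pages):
    """Map each GA account ID to the sites using it, keeping only shared ones.

    `pages` is a dict of {site name: page source}.
    """
    sites_by_account = defaultdict(set)
    for site, html in pages.items():
        for account in extract_account_ids(html):
            sites_by_account[account].add(site)
    return {
        account: sorted(sites)
        for account, sites in sites_by_account.items()
        if len(sites) > 1
    }


if __name__ == "__main__":
    # Made-up page fragments for illustration only.
    pages = {
        "a.example": "ga('create', 'UA-1234567-1', 'auto');",
        "b.example": "<script>gtag('config', 'UA-1234567-2');</script>",
        "c.example": "ga('create', 'UA-7654321-1', 'auto');",
    }
    print(shared_accounts(pages))  # {'UA-1234567': ['a.example', 'b.example']}
```

A shared ID is a lead, not proof: templates, agencies and copied snippets can all cause unrelated sites to carry the same code.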
- awesome-selfhosted 📋 - A list of Free Software network services and web applications which can be hosted locally.
- grayhatwarfare 🌐 - Search the content of open Amazon S3 buckets.
- Shodan 🌐 - Internet of Things search engine.
This list is licensed under the Creative Commons Attribution-NonCommercial 4.0 International Public License.