How to use Photon.py? | RECON
Photon is a Python-based OSINT web crawling and data extraction tool. It shines when you want to gather information from a bird's-eye view: it uncovers hard-to-find pages, external links, broken links, and changes to web pages over time. This makes it very useful during the reconnaissance phase of an assessment, where it helps you build multiple leads.
Specifically for reconnaissance, Photon extracts data points from a website such as email addresses, names, contact information, social media links, and documents. The full list of potentially extracted information includes:

- URLs (in-scope & out-of-scope)
- URLs with parameters (example.com/gallery.php?id=2)
- Intel (emails, social media accounts, Amazon buckets etc.)
- Files (pdf, png, xml etc.)
- Secret keys (auth/API keys & hashes)
- JavaScript files & Endpoints present in them
- Strings matching custom regex pattern
- Subdomains & DNS related data
Photon.py Options
Option | Description |
---|---|
-h, --help | show this help message and exit |
-u ROOT, --url ROOT | root url |
-c COOK, --cookie COOK | cookie |
-r REGEX, --regex REGEX | regex pattern |
-e EXPORT, --export EXPORT | export format |
-o OUTPUT, --output OUTPUT | output directory |
-l LEVEL, --level LEVEL | levels to crawl |
-t THREADS, --threads THREADS | number of threads |
-d DELAY, --delay DELAY | delay between requests |
-v, --verbose | verbose output |
-s SEEDS [SEEDS ...], --seeds SEEDS [SEEDS ...] | additional seed URLs |
--stdout STD | send variables to stdout |
--user-agent USER_AGENT | custom user agent(s) |
--exclude EXCLUDE | exclude URLs matching this regex |
--timeout TIMEOUT | http request timeout |
--clone | clone the website locally |
--headers | add headers |
--dns | enumerate subdomains and DNS data |
--ninja | ninja mode (removed in v1.3.0) |
--keys | find secret keys |
--update | update photon |
--only-urls | only extract URLs |
--wayback | fetch URLs from archive.org as seeds |
GitHub: https://github.com/s0md3v/Photon
Crawling a Single Website with Photon
To perform a basic crawl on a single website, use the -u or --url option.
$ python photon.py -u "http://example.com"

Cloning a website with Photon.py locally
To clone a website locally, use the --clone option. The crawled web pages are saved locally so they can be reviewed later, offline:
$ python photon.py -u http://192.168.1.1 --clone

Depth of Crawling
To configure the crawl depth, use the -l or --level option, which sets a recursion limit for crawling. For example, a depth of 2 means Photon will find all the URLs on the homepage and seed URLs (level 1) and then crawl those pages as well (level 2). The default crawl depth is 2.
$ python photon.py -u http://192.168.1.1 --level 3

Photon.py Number of Threads
To configure the number of threads, use the -t or --threads option, which sets how many concurrent requests Photon makes to the target. The default is 2 threads. While more threads can speed up crawling, they may also trigger security mechanisms, and a high thread count can even bring down small websites.
$ python photon.py -u 192.168.1.1 --threads 10

Photon Delay Between Each HTTP Request
To configure the delay between HTTP requests, use the -d or --delay option, which specifies the number of seconds to wait between each HTTP(S) request. The value must be an integer; for instance, 1 means one second and 2 means two seconds. The default is 0, meaning requests are made without any delay.
$ python photon.py -u http://192.168.1.1 --delay 5
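The delay adds up quickly on a large crawl: a rough lower bound on wall time is simply the number of queued URLs times the delay. A back-of-the-envelope sketch (the URL count here is illustrative):

```shell
# Minimum runtime estimate: 300 queued URLs with --delay 2
urls=300; delay=2
echo "at least $((urls * delay)) seconds"   # prints: at least 600 seconds
```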

Photon Seconds before Timeout
To specify the number of seconds to wait before considering an HTTP(S) request timed out, use the --timeout option. The default is 5 seconds.
$ python photon.py -u http://192.168.1.1 --timeout=4

Add Cookies Header in Photon
To set a cookie in the HTTP requests made by Photon, use the -c or --cookie option. This adds a Cookie header to each HTTP request (in non-ninja mode).
It can be used when certain parts of the target website require cookie-based authentication. By default, no Cookie header is sent.
$ python photon.py -u http://192.168.1.1 -c "PHPSESSID=u8668v9009s0854qk2977v9qw4"
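If you have a raw Set-Cookie header saved from a browser session or a curl response, you can trim it down to the name=value pair that -c expects. A small sketch (the header below is a hypothetical sample):

```shell
# Reduce a saved Set-Cookie header to the "name=value" pair for -c/--cookie
printf 'Set-Cookie: PHPSESSID=u8668v9009s0854qk2977v9qw4; path=/\n' |
  sed -n 's/^Set-Cookie: \([^;]*\).*/\1/p'
# prints: PHPSESSID=u8668v9009s0854qk2977v9qw4
```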

Specify Output Directory with Photon
By default, Photon saves the results in a directory named after the target's domain name. You can override this with the -o or --output option.
$ python photon.py -u http://192.168.1.1 --output "testdir"

Photon Verbose Output
To enable verbose mode, use the -v or --verbose option; all pages, keys, files etc. are then printed as they are found.
$ python photon.py -u http://192.168.1.1 --verbose

Photon Exclude Specific URLs
To keep URLs matching a given regex from being crawled or shown in the results at all, use the --exclude option.
$ python photon.py -u http://192.168.1.1 --exclude="/blog/202[12]"
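Before launching a long crawl, it can help to sanity-check an exclude pattern against a few sample URLs; grep -E serves as a rough stand-in for the regex matching (the URLs and pattern here are illustrative):

```shell
# Only the blog URL matches the exclude pattern; login.php would still be crawled
printf '%s\n' \
  'http://192.168.1.1/blog/2021/post' \
  'http://192.168.1.1/login.php' |
  grep -E '/blog/202[12]'
# prints: http://192.168.1.1/blog/2021/post
```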

Photon Specify Seed URL(s)
To add custom seed URLs, use the -s or --seeds option with a comma-separated list.
$ python photon.py -u http://192.168.1.1 --seeds "http://192.168.1.1/blog/2022,http://192.168.1.1/login.php"
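If your seed URLs live in a file, one URL per line, you can join them into the comma-separated form shown above (seeds.txt is a hypothetical file created here for illustration):

```shell
# Join a one-URL-per-line seeds.txt into the comma-separated list --seeds expects
printf '%s\n' 'http://192.168.1.1/blog/2022' 'http://192.168.1.1/login.php' > seeds.txt
paste -sd, seeds.txt
# prints: http://192.168.1.1/blog/2022,http://192.168.1.1/login.php
```

The result can then be passed inline, e.g. `--seeds "$(paste -sd, seeds.txt)"`.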

Photon Specify User-agent(s)
Option: --user-agent | Default: entries from user-agents.txt
Photon allows you to set specific user agents with the --user-agent option. This option exists to let you supply a specific user agent without modifying the default user-agents.txt file.
$ python photon.py -u http://192.168.1.1 --user-agent "curl/7.35.0,Wget/1.15 (linux-gnu)"

Photon Custom regex Pattern
To extract strings matching a regex pattern during crawling, use the -r or --regex option.
$ python photon.py -u http://192.168.1.1 --regex "\d{10}"
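Since Photon is written in Python, its patterns follow Python's re syntax, so you can test a pattern the same way before a crawl (the sample text is illustrative):

```shell
# \d{10} matches exactly ten consecutive digits, e.g. a phone number
python3 -c 'import re; print(re.findall(r"\d{10}", "call 0123456789 today"))'
# prints: ['0123456789']
```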

Photon Export Formats
Option: -e or --export
With the -e or --export option you can specify an output format in which the data will be saved. The currently supported formats are json and csv.
$ python photon.py -u "http://example.com" --export=json

Photon Use URLs from archive.org as seeds
The --wayback option fetches archived URLs for the target from archive.org and uses them as seeds.
Only URLs archived within the current year are fetched, to help ensure they aren't dead.
$ python photon.py -u http://192.168.1.1 --wayback

Photon Skip Data Extraction
Option: --only-urls
The --only-urls option skips the extraction of data such as intel and JavaScript files. It comes in handy when your goal is only to crawl the target.
$ python photon.py -u http://192.168.1.1 --only-urls

Photon Extract Secret Keys
The --keys switch tells Photon to look for high-entropy strings, which can be auth tokens, API keys, or hashes.
$ python photon.py -u http://192.168.1.1 --keys
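The idea behind this flag is an entropy heuristic: random-looking key material scores much higher per character than ordinary text. A minimal illustration of the concept (not Photon's actual implementation; the key-like string is AWS's documented example access key):

```shell
python3 - <<'PY'
# Shannon entropy: a rough per-character "randomness" score for a string
import math
from collections import Counter

def entropy(s):
    n = len(s)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(s).values())

print(entropy("AKIAIOSFODNN7EXAMPLE") > 3.5)   # key-like string: high entropy
print(entropy("aaaaaaaaaaaaaaaaaaaa") < 0.1)   # repetitive string: near zero
PY
# prints: True
#         True
```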

Photon Ninja Mode
The --ninja option enables ninja mode. In this mode, Photon uses third-party websites to make requests on your behalf instead of contacting the target directly. Contrary to what the name might suggest, it does not stop requests from reaching the target; they are simply relayed.
$ python photon.py -u http://192.168.1.1 --ninja
IMPORTANT: As of version 1.3.0, ninja mode has been removed. Further release updates with Photon can be viewed here: https://github.com/s0md3v/Photon/releases
Photon Dumping DNS Data & Map
The --dns option saves discovered subdomains in 'subdomains.txt' and also generates an image displaying the target domain's DNS data.
$ python photon.py -u http://192.168.1.1 --dns
