On day two of our Google Hacking Week we are going to combine an interesting Google search query (or Google Dork) with a command-line tool to find AND download any file type you want.
Find the storage room in the back of the store.
Websites on the net consist of more than just webpages with information. They also link to files and folders containing interesting information like PDFs, MP3s and more. Most of the time these files aren’t ‘visible’ when you visit a specific site, but our little friends, the Google Search Bots, DO index them. All you need is the right string to find them.
- intitle:"index of" <filetypehere> <title/genre/artist>
This search query tells Google to look for pages with “index of” in the title. These pages usually don’t contain a lot of text, but instead contain links to folders and files. Since you are looking for a specific type of file (for example MP3s, PDFs or something else), you can add that to the query too. Finally, you might be looking for MP3s of Hannah Montana or tangos (I don’t know what you like): that can also be added to the search string. In the end it will look something like this.
- intitle:"index of" mp3 acdc
- intitle:"index of" pdf bookkeeping
- intitle:"index of" epub scott sigler
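If you find yourself typing these queries a lot, the pattern is easy to script. Here is a minimal sketch, assuming a POSIX shell; the function name make_dork is made up for this example and only prints the search string for you to paste into Google.

```shell
# make_dork FILETYPE KEYWORDS...
# Hypothetical helper: prints an "index of" search string for Google.
make_dork() {
    filetype=$1
    shift
    # "$*" joins the remaining arguments with single spaces
    printf 'intitle:"index of" %s %s\n' "$filetype" "$*"
}

make_dork mp3 acdc
make_dork epub scott sigler
```

Note there is no space between intitle: and the opening quote, and the quotes are plain straight quotes.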
Using these queries you might find a real treasure trove of files and info to download. Some of them might even be meant to sit behind a login/password page (or even a pay wall), but when the webmasters don’t do their homework right, you can find the ‘good stuff’ this way.
So download them one by one?
If you are just looking for one specific file you can use your browser to find and download it. If you want to download the ENTIRE collection of files on that page, you need the power of a command-line tool called Wget.
Wget is available for Linux, Mac and even Windows machines. Most Linux distributions ship with it; on Mac and Windows you will probably have to install it first. Not all the advanced ‘switches’ in the command below might work on Windows, but you can give it a try. The command is
- wget -r -l1 -H -t1 -nd -N -np -A<filetype> -erobots=off <url of website>
Replace <filetype> with the type of file you want to download (.mp3, .pdf, .epub) and <url of website> with the website’s URL you found using the Google search. Leave out the angle brackets: the shell would treat them as redirections. Completed, the command might look something like this.
- wget -r -l1 -H -t1 -nd -N -np -A.mp3 -erobots=off http://tiobiloute59.free.fr/tiesto/
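Those short switches are hard to read, so here is the same command spelled out with Wget's long option names, one per line, with a comment on what each does. Treat it as a sketch, assuming a POSIX shell: the function name fetch_index and the WGET variable (which lets you swap in another program for a dry run) are made up for this example.

```shell
# fetch_index FILETYPE URL
# Hypothetical wrapper around the wget command above, using long option names.
fetch_index() {
    ext=${1#.}                       # accept "mp3" as well as ".mp3"
    url=$2
    set -- --recursive --level=1     # -r -l1: follow links, but only one level deep
    set -- "$@" --span-hosts         # -H:  also follow links that point to other hosts
    set -- "$@" --tries=1            # -t1: try each file once, no retries
    set -- "$@" --no-directories     # -nd: save everything into the current folder
    set -- "$@" --timestamping       # -N:  skip files that are not newer than your local copy
    set -- "$@" --no-parent          # -np: never climb above the starting folder
    set -- "$@" --accept=".$ext"     # -A:  keep only files ending in this suffix
    set -- "$@" -e robots=off        # ignore the site's robots.txt
    # WGET is an assumed override for dry runs; it defaults to the real wget
    "${WGET:-wget}" "$@" "$url"
}

# Example (this really downloads, so it is commented out):
# fetch_index mp3 http://tiobiloute59.free.fr/tiesto/
```

Setting WGET=echo before calling the function prints the options instead of downloading, which is a handy way to check what would be run.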
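The download is RECURSIVE, but the -l1 switch limits it to one level of links below the URL you give it. Raise that number (for example -l3) if you want it to “deep dive” into the subfolders as well. Beware: this can get you a LOT of data. So make sure you have the bandwidth and the storage capacity before you start sucking down the internet. Good luck!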
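If you are worried about filling your disk or hogging the connection, Wget has switches for that too. The sketch below is not part of the original command: it adds --quota, which stops a recursive download from starting new files once the total passes the limit, and --limit-rate, which caps the download speed. The function name fetch_capped and the WGET override are assumptions made for this example.

```shell
# fetch_capped QUOTA RATE URL
# Hypothetical helper: recursive download with a size quota and a speed cap.
fetch_capped() {
    quota=$1    # e.g. 500m: no new files are started after 500 MB in total
    rate=$2     # e.g. 200k: limit the download speed to roughly 200 KB/s
    url=$3
    # WGET is an assumed override for dry runs; it defaults to the real wget
    "${WGET:-wget}" -r -l1 -np -nd --quota="$quota" --limit-rate="$rate" "$url"
}

# Example (commented out because it really downloads):
# df -h .                                    # check free space first
# fetch_capped 500m 200k http://tiobiloute59.free.fr/tiesto/
```

The quota is checked between files, so a single large file can still push the total past the limit.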