updated Readme.md
Readme.md
@@ -1,6 +1,6 @@
 # pdfgrab
 
-* Version 0.4.4
+* Version 0.4.7
 
 ## What is it?
 
@@ -9,6 +9,14 @@ Basically it analyses PDF files for Metadata. You can direct it to a file or dir
 You can show it the url of a pdf or use the integrated googlesearch (thanx to mario vilas class)
 to search for pdfs at target site, download and analyse them.
 
+## What is new in 0.4.7 release?
+
+* Added support for html output file, this will be placed in the outdir path and is more clear then a text or json file
+* Added basic logging support, logfile is placed in pdfgrab.py directory
+* Reordered Codebase, exported functionality to some libraries
+* PDF XMP Metadata is grabbed now as well, but not yet saved in output files
+* added docs/ section with Changelog and Todo
+
 ## What information can be gathered?
 
 This depends on the software used to create the pdf. And if it has been cleaned.
@@ -148,25 +156,6 @@ File: pdfgrab/bpf_global_data_and_static_keys.pdf
 /PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) kpathsea version 6.2.2
 ```
 
-## TODO
-* ~~fixed some bugs with *uncommon* pdfs~~
-* add socks proxy
-* ~~add queues~~ for threading
-* ~~add url list to output~~
-* ~~json file-output~~
-* ~~txt file-output~~
-* catch conn refused connections
-* ~~set option for certificate verification, default is true~~
-* ~~complete analyse.txt~~
-* clean up code
-* ~~do more testing~~
-* do even more testing
-* ~~add random useragent for google and website pdf gathering~~
-* ~~add decryption routine~~
-* ~~catch ssl exceptions~~
-
-
-
 ## Google
 
 * Search: filetype:pdf site:com