ilovezfs/pdftoedn

pdftoedn

A poppler-based PDF processing tool to extract document data and save it in EDN format. It supports:

  • Font and glyph remapping via user-defined font map configurations (in JSON format), allowing glyph substitutions for Type 1 or TrueType fonts with invalid or incorrect Unicode tables, and even for embedded CID fonts with missing tables.
  • Path data extraction.
  • Transformed image output, written directly to disk in PNG format.
  • Annotations.
  • PDF outlines.

Usage

Process a PDF document and write its output to output_file.edn:

pdftoedn -o output_file.edn input_file.pdf
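
To convert several documents at once, the same invocation can be wrapped in a loop. The following is a minimal sketch, not part of pdftoedn itself: it relies only on the `-o` option shown above, the `edn_name` helper is a hypothetical name introduced for illustration, and it assumes pdftoedn follows the usual convention of returning a non-zero exit status on failure.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdftoedn): derive the output file name
# by replacing a trailing ".pdf" with ".edn".
edn_name() {
    printf '%s\n' "${1%.pdf}.edn"
}

# Convert every PDF in the current directory, one output file per input.
for pdf in ./*.pdf; do
    # If no PDF matches, the pattern stays literal; skip it.
    [ -e "$pdf" ] || continue
    # Assumption: a non-zero exit status signals a failed conversion.
    pdftoedn -o "$(edn_name "$pdf")" "$pdf" || echo "failed: $pdf" >&2
done
```

Each input such as report.pdf yields report.edn next to it, and inputs that fail are reported on standard error without stopping the loop.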

Further reading

Refer to the wiki for
