Chris Cannam, 2014-08-04 12:50 PM


Wiki

Other tools / better ways to do this

Tools that appear to support text and/or metadata extraction from PDFs:

  • Apache Tika (text + metadata, Java)
  • Grobid (biblio metadata, Java + native)
  • Textract (text, Python wrapper for other utilities?)