Chris Cannam, 2014-08-04 12:50 PM
Wiki
Other tools / better ways to do this
Tools that look like they might do text and/or metadata extraction from PDFs:
- Apache Tika (text + metadata, Java)
- Grobid (bibliographic metadata, Java + native)
- Textract (text, Python wrapper for other utilities?)