Poppler-0.68.0-x86 Today
`pdftotext -raw -eol dos corrupted.pdf output.txt`

Librarians and archivists use `pdfimages` (with `-png`) to extract figures from scientific papers stored on a 32-bit NAS.
| Test (100 MB PDF, 500 pages) | Poppler 0.68.0-x86 (i686) | Poppler 0.68.0-x86_64 |
|------------------------------|---------------------------|-----------------------|
| Text extraction (`pdftotext`) | 12.4 seconds | 8.2 seconds |
| Image extraction (`pdfimages`) | 45 images in 6.1 seconds | 45 images in 4.3 seconds |
| Memory peak (resident) | 312 MB | 298 MB |
| Binary size (`pdftotext`) | 892 KB | 1.1 MB |
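The article does not say how the figures in the table were measured. As a rough sketch of one way to take a comparable timing, the helper below runs a command and prints its wall-clock duration. The function name `elapsed_seconds` and the file name `sample.pdf` are placeholders invented for this example, and `date +%s` only gives whole-second resolution, so it is coarser than the values shown in the table.

```shell
#!/bin/sh
# Sketch only: not the benchmark method used for the table above,
# which the article does not describe.

# Run a command, discard its output, and print the elapsed
# wall-clock time in whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example calls (need poppler-utils and a real PDF; sample.pdf is hypothetical):
#   elapsed_seconds pdftotext sample.pdf -
#   elapsed_seconds pdfimages -png sample.pdf img
#
# Where GNU time is installed, peak resident memory can be read from the
# "Maximum resident set size" line of:
#   /usr/bin/time -v pdftotext sample.pdf -
```

Running the same call against the i686 and x86_64 builds on the same file gives a like-for-like comparison, provided the file is already in the page cache for both runs.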
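To extract the images from every PDF in a directory with `pdfimages`, loop over the files and use each file name, minus its `.pdf` suffix, as the image root:

`for f in *.pdf; do pdfimages -png "$f" "${f%.pdf}"; done`

A headless Raspberry Pi 1 (32-bit ARM, but with similar constraints) running an x86 emulator such as QEMU-user can use `pdftohtml` to generate static HTML for intranet servers.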
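The text does not give a command for the `pdftohtml` case, so the following is a minimal sketch of one way it could be batched, not the author's procedure. The function names `html_root` and `convert_all`, the directory layout, and the choice of `-noframes` (one HTML file per PDF instead of a frameset) and `-q` (quiet) are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: batch-convert PDFs to static HTML with pdftohtml.
# Function names, directories, and option choices are assumptions.

# Print the output root for one PDF: drop the directory part
# and the .pdf suffix.
html_root() {
    base=${1##*/}
    printf '%s\n' "${base%.pdf}"
}

# Convert every *.pdf in src_dir, writing <name>.html into out_dir.
convert_all() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for f in "$src_dir"/*.pdf; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        # -noframes: single HTML page per PDF; -q: suppress messages.
        pdftohtml -noframes -q "$f" "$out_dir/$(html_root "$f")"
    done
}
```

A call such as `convert_all /srv/papers /srv/www/papers` (both paths hypothetical) would leave one HTML file per PDF in the second directory, ready to be served as static content.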