DEV Community

Andy

How to install pdf2htmlEX in Ubuntu 20.04

pdf2htmlEX is a tool that converts PDF to HTML without losing text or formatting. It renders PDF files in HTML using modern Web technologies, which makes it very useful for converting academic papers with lots of formulas and figures to HTML.

This post will show you how to install pdf2htmlEX on Ubuntu 20.04 LTS.

At the time of writing, pdf2htmlEX is no longer packaged by Debian/Ubuntu, so you will need to install it from the pdf2htmlEX Debian archives (*.deb).

To get started you will need to install the dependencies:

sudo apt update
sudo apt install -y libfontconfig1 libcairo2 libjpeg-turbo8
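
To confirm that the three dependencies above are actually present, a small check along these lines may help. This is only a sketch: `missing_packages` is a hypothetical helper name, and it assumes `dpkg-query` reports a fully installed package with the status string `install ok installed`.

```shell
# Print each named package that dpkg does not report as fully installed.
# missing_packages is a hypothetical helper, not part of apt or dpkg.
missing_packages() {
    for pkg in "$@"; do
        # dpkg-query fails for unknown packages; "|| true" keeps the
        # function usable in scripts that run under "set -e".
        state=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null || true)
        case "$state" in
            "install ok installed"*) ;;   # present, nothing to report
            *) echo "$pkg" ;;             # absent or only partly installed
        esac
    done
}

# Usage: missing_packages libfontconfig1 libcairo2 libjpeg-turbo8
```

An empty result means all listed packages are installed; any name that is printed still needs `sudo apt install`.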

If you get an error about unmet dependencies, run the following to fix broken packages:

sudo apt --fix-broken install

Download the latest *.deb package from the pdf2htmlEX repository:

wget https://github.com/pdf2htmlEX/pdf2htmlEX/releases/download/v0.18.8.rc1/pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-bionic-x86_64.deb
mv pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-bionic-x86_64.deb pdf2htmlEX.deb
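
Before installing, it can be worth checking that the download is a readable Debian package and seeing which dependencies it declares. The sketch below uses `dpkg-deb -f`, which prints control fields from an archive; `inspect_deb` is a hypothetical helper name introduced here for illustration.

```shell
# Show selected control fields of a local .deb, or fail if the file is missing.
# inspect_deb is a hypothetical helper, not part of dpkg.
inspect_deb() {
    deb=$1
    if [ ! -f "$deb" ]; then
        echo "inspect_deb: no such file: $deb" >&2
        return 1
    fi
    # dpkg-deb -f prints the requested control fields from the archive.
    dpkg-deb -f "$deb" Package Version Architecture Depends
}

# Usage: inspect_deb pdf2htmlEX.deb
```

If the file is truncated or is not a .deb at all, `dpkg-deb` reports an error here rather than halfway through the installation.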

Install the package:

sudo apt install ./pdf2htmlEX.deb

It is very important that you use a (relative or absolute) path to the *.deb file. It is the ./ in front of the pdf2htmlEX.deb file name which tells apt install that it is supposed to install a local file rather than a package name in apt install's internal package database.
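
In a script, one way to avoid passing a bare file name by accident is to normalize the argument first. The following is a minimal sketch, where `as_apt_path` is a hypothetical helper: it leaves anything that already starts with /, ./ or ../ untouched and prefixes everything else with ./.

```shell
# Return the argument in a form that is explicitly a file path.
# as_apt_path is a hypothetical helper, not part of apt.
as_apt_path() {
    case "$1" in
        /*|./*|../*) printf '%s\n' "$1" ;;    # already explicitly a path
        *)           printf './%s\n' "$1" ;;  # bare or nested name: anchor to the current directory
    esac
}

# Usage: sudo apt install "$(as_apt_path pdf2htmlEX.deb)"
```

With this, `as_apt_path pdf2htmlEX.deb` yields `./pdf2htmlEX.deb`, which is the form used in the install command above.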

Alternatively you could use the following commands:

sudo dpkg -i pdf2htmlEX.deb
sudo apt install -f

Test your installation:

pdf2htmlEX -v

You should see something like this:

pdf2htmlEX version 0.18.8.rc1
Copyright 2012-2015 Lu Wang <coolwanglu@gmail.com> and other contributors
Libraries: 
  poppler 0.89.0
  libfontforge (date) 20200314
  cairo 1.16.0
Default data-dir: /usr/local/share/pdf2htmlEX
Poppler data-dir: /usr/local/share/pdf2htmlEX/poppler
Supported image format: png jpg svg
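
Beyond the version check, a real conversion is a more thorough test. The sketch below wraps a single conversion in a function; `convert_pdf`, `paper.pdf` and `html-out` are placeholder names chosen for illustration, and `--dest-dir` is the pdf2htmlEX option that sets the output directory.

```shell
# Convert one PDF to HTML inside a destination directory.
# convert_pdf is a hypothetical helper; html-out is an assumed default.
convert_pdf() {
    input=$1
    dest=${2:-html-out}
    if [ ! -f "$input" ]; then
        echo "convert_pdf: input not found: $input" >&2
        return 1
    fi
    mkdir -p "$dest"
    # --dest-dir tells pdf2htmlEX where to write the generated HTML.
    pdf2htmlEX --dest-dir "$dest" "$input"
}

# Usage: convert_pdf paper.pdf html-out
```

Opening the generated file in a browser then confirms that fonts, formulas and figures survived the conversion.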

Top comments (5)

Adam Olszewski

Thank you! I was trying to install it from scratch with no success, but your tutorial helped me a lot!

Muhammad Aslam

Thank you so much

Endre

Hi there! Do you know if it is possible to build an AWS Lambda package for this library running on the amazonlinux2023 Python 3.13 runtime?

Bartlomiej Mankowski • Edited

Thank you! Your tutorial helped me as well.
There's a typo (double apt) in: sudo apt apt --fix-broken install

Conary

Great post! Helped me a lot.