DEV Community

Alexis

Java – How to Convert PDF to Excel with Formatting

There are many reasons to convert PDF to Excel. The most important is probably that Excel's calculation, visualization and analysis tools make the data in a PDF much easier to manipulate and analyze. For example, you can apply formulas to the data, build charts from it, or highlight values with conditional formatting. In this article, I will explain how to programmatically convert PDF to Excel with formatting in Java.

Add Dependencies

To convert PDF to Excel, this article uses a third-party library named Spire.PDF for Java. Before coding, you need to add it to your Java project as a dependency. There are two ways to do that.

Method 1: If you are using Maven, you can import the Spire.PDF for Java JAR into your application by adding the following code to your project’s pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>8.12.6</version>
    </dependency>
</dependencies>

Method 2: If you are not using Maven, you can download the latest version of Spire.PDF for Java from the e-iceblue website, extract the zip file, and then import the Spire.Pdf.jar file under the lib folder into your project as a dependency.

Convert PDF to Excel with Formatting in Java

The PdfDocument.saveToFile(String, FileFormat) method in Spire.PDF for Java is used to convert a PDF document to other file formats. You can use this method to easily convert a PDF to Excel with formatting by specifying the FileFormat as XLSX.

The following are the detailed steps:

  • Initialize an instance of the PdfDocument class.
  • Load a PDF document using the PdfDocument.loadFromFile() method.
  • Save the PDF document to Excel XLSX format using the PdfDocument.saveToFile(String, FileFormat) method.
import com.spire.pdf.FileFormat;
import com.spire.pdf.PdfDocument;

public class ConvertPdfToExcel {
    public static void main(String[] args) {
        // Initialize an instance of the PdfDocument class
        PdfDocument pdf = new PdfDocument();
        // Load a PDF document
        pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\Sample.pdf");

        // Save the PDF document to XLSX format
        pdf.saveToFile("PdfToExcel.xlsx", FileFormat.XLSX);

    }
}
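
The example above assumes that the file at the given path exists and really is a PDF. As a minimal sketch of a pre-flight check you could run before calling loadFromFile(), the helper below uses only the Java standard library. PdfPreflight and looksLikePdf are hypothetical names, not part of Spire.PDF for Java. The check is deliberately strict: it requires the file to start with the "%PDF-" header, so a file with stray bytes before the header is rejected even though some lenient readers would accept it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper (not part of Spire.PDF): a cheap check before loading a file.
final class PdfPreflight {
    // Every PDF file is expected to begin with this ASCII header.
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfPreflight() {}

    /** Returns true if the path is a readable regular file whose first bytes are "%PDF-". */
    static boolean looksLikePdf(Path file) throws IOException {
        if (!Files.isRegularFile(file) || !Files.isReadable(file)) {
            return false;
        }
        byte[] head = new byte[MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF
            int total = 0;
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return total == head.length && Arrays.equals(head, MAGIC);
        }
    }
}
```

A caller could skip or report any file for which PdfPreflight.looksLikePdf(Paths.get(inputPath)) returns false, and only then pass the same path to pdf.loadFromFile().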

Convert a Multi-Page PDF to a Single Excel Worksheet in Java

If your PDF document has multiple pages and you want all of them on a single Excel worksheet, set the PDF to XLSX conversion options with the PdfDocument.getConvertOptions().setPdfToXlsxOptions() method before saving. The PdfDocument.saveToFile(String, FileFormat) method then converts the PDF to XLSX format using those options.

The following are the detailed steps:

  • Initialize an instance of the PdfDocument class.
  • Load a PDF document using the PdfDocument.loadFromFile() method.
  • Set the PDF to XLSX conversion options using the PdfDocument.getConvertOptions().setPdfToXlsxOptions() method.
  • Save the PDF document to Excel XLSX format using the PdfDocument.saveToFile(String, FileFormat) method.
import com.spire.pdf.FileFormat;
import com.spire.pdf.PdfDocument;
import com.spire.pdf.conversion.XlsxLineLayoutOptions;

public class ConvertMultiPagePdfToSingleExcelWorksheet {
    public static void main(String[] args) throws Exception {
        // Initialize an instance of the PdfDocument class
        PdfDocument pdf = new PdfDocument();
        // Load a PDF document
        pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\Members.pdf");

        // Set the PDF to XLSX conversion options: render all pages on a single worksheet.
        // The three arguments are, in order: convertToMultipleSheet, rotatedText, splitCell
        pdf.getConvertOptions().setPdfToXlsxOptions(new XlsxLineLayoutOptions(false, true, true));

        // Save the PDF document to XLSX format
        pdf.saveToFile("PdfToOneSheet.xlsx", FileFormat.XLSX);
    }
}
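
If you have a whole folder of PDFs to convert, the load, set options, save sequence can be repeated per file. The following is a rough sketch of that idea, not something provided by Spire.PDF for Java: PdfBatch and convertAll are hypothetical names, and the conversion itself is passed in as a callback so that the snippet compiles with the Java standard library alone.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.BiConsumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Spire.PDF): applies a converter to every PDF in a folder.
final class PdfBatch {
    private PdfBatch() {}

    /**
     * Calls converter.accept(input, output) for each *.pdf file directly inside dir,
     * where output is the input path with its extension replaced by .xlsx.
     * Returns the list of output paths in the order they were processed.
     */
    static List<Path> convertAll(Path dir, BiConsumer<Path, Path> converter) throws IOException {
        List<Path> pdfs;
        try (Stream<Path> entries = Files.list(dir)) {
            pdfs = entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(".pdf"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<Path> outputs = new ArrayList<>();
        for (Path pdf : pdfs) {
            String name = pdf.getFileName().toString();
            String base = name.substring(0, name.length() - ".pdf".length());
            Path xlsx = pdf.resolveSibling(base + ".xlsx");
            converter.accept(pdf, xlsx); // the real conversion happens in the callback
            outputs.add(xlsx);
        }
        return outputs;
    }
}
```

In a real project, the callback would wrap the calls shown in the example above: create a PdfDocument, call loadFromFile(in.toString()), set the XLSX options, and call saveToFile(out.toString(), FileFormat.XLSX).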

Conclusion

This article demonstrated how to convert a PDF to Excel, and how to convert a multi-page PDF to a single Excel worksheet, in Java using Spire.PDF for Java. Apart from Excel, you can also use it to convert PDFs to other file formats such as DOCX, HTML and PPTX by specifying the corresponding FileFormat value.
