Java Find, Highlight and Replace Text in PDF

#find #highlight #replace #pdf

Sometimes you need to search a PDF file for particular text, but the search may return many identical matches. To locate them more easily, you can highlight the matches with a background color. Spire.PDF for Java also supports replacing the searched text with new text in the PDF file. In this article, you will learn how to search, highlight, and replace text in PDF files programmatically in Java, covering the following two tasks:

Find and highlight the searched text on all pages of a PDF file
Find and replace a text string in a PDF document with a new text string

Install Spire.PDF for Java

First, add the Spire.PDF.jar file as a dependency in your Java project. The JAR file can be downloaded from this link. If you use Maven, you can import it by adding the following configuration to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>8.11.8</version>
    </dependency>
</dependencies>

Highlight searched text in PDF

Spire.PDF for Java can find particular text on every page of a PDF file and highlight each occurrence with a background color. Here are the steps:

Create a PdfDocument instance.
Load a PDF document using the PdfDocument.loadFromFile() method.
Loop through the pages in the PDF document and call PdfPageBase.findText().getFinds() on each page to find the specified text, saving the search results in a PdfTextFind array.
Loop through the array and call the PdfTextFind.highLight(Color color) method to highlight each occurrence of the text with a color.
Save the result document using the PdfDocument.saveToFile() method.

import com.spire.pdf.*;
import com.spire.pdf.general.find.PdfTextFind;

import java.awt.*;


public class FindandHighlight {

    public static void main(String[] args) throws Exception {
        //Create a PdfDocument instance
        PdfDocument pdf = new PdfDocument();

        //Load a PDF sample document
        pdf.loadFromFile("Test00.pdf");

        for (Object pageObj : pdf.getPages()) {
            PdfPageBase page = (PdfPageBase) pageObj;
            //Find all occurrences of the text on this page
            PdfTextFind[] result = page.findText("Wilde", false).getFinds();
            for (PdfTextFind find : result) {
                //Highlight searched text
                find.highLight(Color.green);
            }
        }

        //Save the result file
        pdf.saveToFile("HighlightText.pdf");
    }
}
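If you need to highlight several search terms, it can help to give each one its own color. The following is a minimal sketch in plain Java of one way to do that; it does not call Spire.PDF itself. The class HighlightPalette, the method assignColors, the palette, and the extra keywords are all hypothetical names chosen for illustration.

```java
import java.awt.Color;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HighlightPalette {

    // Illustrative palette; any java.awt.Color values would work
    private static final Color[] PALETTE = { Color.GREEN, Color.YELLOW, Color.CYAN };

    // Hypothetical helper (not part of Spire.PDF): assigns each keyword a color,
    // cycling through the palette when there are more keywords than colors
    static Map<String, Color> assignColors(List<String> keywords) {
        Map<String, Color> colors = new LinkedHashMap<>();
        for (String keyword : keywords) {
            // putIfAbsent keeps the first color if a keyword is repeated
            colors.putIfAbsent(keyword, PALETTE[colors.size() % PALETTE.length]);
        }
        return colors;
    }

    public static void main(String[] args) {
        // Example keywords are assumptions, not taken from a real document
        Map<String, Color> colors = assignColors(Arrays.asList("Wilde", "Oscar", "Dublin", "London"));
        colors.forEach((keyword, color) -> System.out.println(keyword + " -> " + color));
    }
}
```

Inside the page loop of the highlighting example, each map entry could then drive one page.findText(keyword, false).getFinds() call, with find.highLight(color) applied to every result.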

Find and Replace the text in PDF

Spire.PDF for Java offers the PdfPageBase.findText() method to find specified text on a PDF page. To replace the text, you cover each match with a white rectangle and then draw the new text string over it in the font and size you choose. Here are the steps:

Create a PdfDocument object and load a sample PDF document using the PdfDocument.loadFromFile() method.
Get the first page using the PdfDocument.getPages().get() method, and find the selected text on the page using the PdfPageBase.findText() method.
Get the bounds of each find result, and draw a white rectangle over the bounds using the PdfPageBase.getCanvas().drawRectangle() method.
Draw the new string at that position using the PdfPageBase.getCanvas().drawString() method.
Save the document to another file using the PdfDocument.saveToFile() method.

import com.spire.ms.System.Collections.Generic.List;
import com.spire.pdf.*;
import com.spire.pdf.general.find.*;
import com.spire.pdf.graphics.*;

import java.awt.*;
import java.awt.geom.Dimension2D;
import java.awt.geom.Rectangle2D;


public class FindandReplace {

    public static void main(String[] args) throws Exception {
        //Create a PdfDocument instance
        PdfDocument doc = new PdfDocument();

        //Load a PDF sample document
        doc.loadFromFile("Test00.pdf");

        //Get the first page of the PDF
        PdfPageBase page = doc.getPages().get(0);

        //Find the specific string from the page
        PdfTextFindCollection collection = page.findText("Oscar Wilde",false);

        //Define new text for replacing
        String newText = "Oscar Wilde--playwright, novelist, poet, critic";

        //Create a PdfTrueTypeFont from a system font and measure the new text
        PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Times New Roman",  Font.BOLD, 24));
        Dimension2D dimension2D = font.measureString(newText);
        double fontWidth = dimension2D.getWidth();
        double height = dimension2D.getHeight();

        for (Object findObj : collection.getFinds()) {
            PdfTextFind find=(PdfTextFind)findObj;
            List<Rectangle2D> textBounds = find.getTextBounds();

            //Draw a white rectangle over the first bounding box of the old text
            Rectangle2D rectangle2D = textBounds.get(0);
            page.getCanvas().drawRectangle(PdfBrushes.getWhite(), rectangle2D);

            //Draw new text at the position of the old text
            page.getCanvas().drawString(newText, font, PdfBrushes.getBlack(), rectangle2D.getX(), rectangle2D.getY());
        }

        //Save the document to file
        String result = "FindandReplace.pdf";
        doc.saveToFile(result, FileFormat.PDF);
    }
}
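In the example above, the replacement string is longer than the text it replaces, so the new text extends past the old text's bounds. One option is to size the white rectangle to whichever is larger, the old bounds or the measured new text. The following is a minimal sketch of that calculation using only java.awt.geom; the class CoverBounds and the method coverFor are hypothetical names, not part of Spire.PDF, and the sample coordinates are made up.

```java
import java.awt.geom.Rectangle2D;

public class CoverBounds {

    // Hypothetical helper (not part of Spire.PDF): returns a rectangle that starts
    // at the old text's origin and is large enough for both the old text bounds
    // and the measured size of the replacement text
    static Rectangle2D coverFor(Rectangle2D oldBounds, double newTextWidth, double newTextHeight) {
        double width = Math.max(oldBounds.getWidth(), newTextWidth);
        double height = Math.max(oldBounds.getHeight(), newTextHeight);
        return new Rectangle2D.Double(oldBounds.getX(), oldBounds.getY(), width, height);
    }

    public static void main(String[] args) {
        // Made-up bounds standing in for a search result's first bounding box
        Rectangle2D oldBounds = new Rectangle2D.Double(72, 100, 120, 20);

        // Made-up measurements standing in for the measured width and height of the new text
        Rectangle2D cover = coverFor(oldBounds, 310.5, 27.6);
        System.out.println(cover.getWidth() + " x " + cover.getHeight());
    }
}
```

The resulting rectangle could be passed to drawRectangle() in place of the original bounds. Note that a wider rectangle also hides any other content that sits to the right of the old text, so whether this is desirable depends on the page layout.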

Conclusion

In this article, we demonstrated how to use Spire.PDF for Java to find, highlight, and replace text in a PDF. Spire.PDF for Java also supports other PDF operations in Java applications, such as editing PDF files and converting them to other file formats, including Word, Excel, and PowerPoint. You can check the PDF forum for more features.
