Find and Remove Watermarks from Documents in Java

#java #webdev

This article is for Java developers who want to find and remove text or image watermarks from PDF, Word, Excel, PowerPoint, Visio, and email documents.

The GroupDocs.Watermark for Java API supports adding text and image watermarks to a wide range of document formats. It can also find and remove watermarks from documents, including watermark objects that were added with third-party tools. The steps below show how to remove watermarks from a document in Java.

Before we begin, have a look at the following PDF document, which contains both a text watermark and an image watermark. We’ll use this document and remove the watermarks from it.

Steps to remove watermarks from a document

1. Create a new project.

2. Add the following imports.

 import com.groupdocs.watermark.Document; 
 import com.groupdocs.watermark.ImageDctHashSearchCriteria;
 import com.groupdocs.watermark.ImageSearchCriteria; 
 import com.groupdocs.watermark.PossibleWatermarkCollection;
 import com.groupdocs.watermark.SearchCriteria;
 import com.groupdocs.watermark.TextSearchCriteria;

3. Create an instance of the Document class and load the source document.

 Document doc = Document.load("watermarked.pdf");

4. Find the watermarks that match your search criteria using the findWatermarks method. If you don’t pass any search criteria, findWatermarks returns all the possible watermark objects.

 // configure the search criteria for the image watermark
 ImageSearchCriteria imageSearchCriteria = new ImageDctHashSearchCriteria("watermark.png");
 // configure the search criteria for the text watermark
 TextSearchCriteria textSearchCriteria = new TextSearchCriteria("CONFIDENTIAL");
 // combine the search criteria
 SearchCriteria combinedSearchCriteria = imageSearchCriteria.or(textSearchCriteria);
 // find possible watermarks
 PossibleWatermarkCollection possibleWatermarks = doc.findWatermarks(combinedSearchCriteria);
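
To make the effect of or concrete, here is a small stand-alone sketch that models the same idea with java.util.function.Predicate from the standard library. It does not use the GroupDocs classes: the Candidate class, the find helper, and the sample values are hypothetical stand-ins, and the image check compares file names only for illustration, whereas the library's image criteria compare image content.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CriteriaSketch {

    // Hypothetical stand-in for a candidate watermark: either a text or an image name.
    static final class Candidate {
        final String text;       // null for image candidates
        final String imageName;  // null for text candidates

        Candidate(String text, String imageName) {
            this.text = text;
            this.imageName = imageName;
        }
    }

    // Keeps only the candidates accepted by the given criteria.
    static List<Candidate> find(List<Candidate> all, Predicate<Candidate> criteria) {
        List<Candidate> matches = new ArrayList<>();
        for (Candidate c : all) {
            if (criteria.test(c)) {
                matches.add(c);
            }
        }
        return matches;
    }

    // Builds sample data, combines two criteria with or(), and returns the match count.
    static int countMatches() {
        List<Candidate> all = Arrays.asList(
                new Candidate("CONFIDENTIAL", null),
                new Candidate("Page 1", null),
                new Candidate(null, "watermark.png"),
                new Candidate(null, "chart.png"));

        Predicate<Candidate> textCriteria = c -> "CONFIDENTIAL".equals(c.text);
        Predicate<Candidate> imageCriteria = c -> "watermark.png".equals(c.imageName);

        // A candidate matches when either criterion accepts it.
        Predicate<Candidate> combined = imageCriteria.or(textCriteria);
        return find(all, combined).size();
    }

    public static void main(String[] args) {
        System.out.println("matches: " + countMatches());
    }
}
```

With the sample data, one text candidate and one image candidate match, so the combined criteria select both and leave the unrelated page text and chart image alone.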

5. Iterate over the watermark collection and remove each watermark using the removeAt method.

 // iterate through the collection and remove watermarks
 while (possibleWatermarks.getCount() > 0)
 {
     if (possibleWatermarks.get_Item(0).getImageData() != null)
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed image watermark.");
     }
     else
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed text watermark.");
     }
 }
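
The loop above always removes the item at index 0 and checks the count again, instead of walking the collection with an increasing index. The following sketch shows why that pattern is convenient, using a plain java.util.ArrayList as a stand-in (an assumption for illustration, not the GroupDocs collection): after each removal the remaining elements shift down, so a forward index would skip items.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DrainSketch {

    // Removes every element by always taking index 0; returns how many were removed.
    static int drainFromFront(List<String> items) {
        int removed = 0;
        while (items.size() > 0) {
            items.remove(0); // remaining elements shift down, so index 0 stays valid
            removed++;
        }
        return removed;
    }

    // Naive forward loop: i advances while the list shrinks, so elements are skipped.
    static int drainWithForwardIndex(List<String> items) {
        int removed = 0;
        for (int i = 0; i < items.size(); i++) {
            items.remove(i);
            removed++;
        }
        return removed;
    }

    // Fresh mutable sample list for each run.
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("text", "image", "text", "image"));
    }

    public static void main(String[] args) {
        System.out.println("front drain removed: " + drainFromFront(sample()));
        System.out.println("forward loop removed: " + drainWithForwardIndex(sample()));
    }
}
```

On the four-element sample, draining from the front removes all four items, while the forward-index loop stops after removing only two.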

6. Save the resulting document using the save method, then close it.

 doc.save("without_watermark.pdf");
 doc.close();
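
If save throws, the close call on the next line never runs. One common way to guard against that in Java is a try/finally block. The sketch below demonstrates the pattern with a hypothetical FakeDocument class, not the GroupDocs Document class, so treat the class and its behavior as assumptions made purely for illustration.

```java
public class CloseSketch {

    // Hypothetical stand-in for a document object that holds a resource.
    static final class FakeDocument {
        boolean closed = false;

        // Pretends to save; rejects an empty path to simulate a failure.
        void save(String path) {
            if (path == null || path.isEmpty()) {
                throw new IllegalArgumentException("output path is empty");
            }
        }

        void close() {
            closed = true;
        }
    }

    // Saves the document and closes it even when save() throws.
    // Returns true when the save succeeded, false when it failed.
    static boolean saveAndClose(FakeDocument doc, String path) {
        try {
            doc.save(path);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        } finally {
            doc.close(); // runs on both the success and the failure path
        }
    }

    public static void main(String[] args) {
        FakeDocument ok = new FakeDocument();
        System.out.println(saveAndClose(ok, "without_watermark.pdf") + " " + ok.closed);

        FakeDocument bad = new FakeDocument();
        System.out.println(saveAndClose(bad, "") + " " + bad.closed);
    }
}
```

In both runs the document ends up closed, whether or not the save succeeded.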

Complete Code

 Document doc = Document.load("watermarked.pdf");
 // configure the search criteria for the image watermark
 ImageSearchCriteria imageSearchCriteria = new ImageDctHashSearchCriteria("watermark.png");
 // configure the search criteria for the text watermark
 TextSearchCriteria textSearchCriteria = new TextSearchCriteria("CONFIDENTIAL");
 // combine the search criteria
 SearchCriteria combinedSearchCriteria = imageSearchCriteria.or(textSearchCriteria);
 // find possible watermarks
 PossibleWatermarkCollection possibleWatermarks = doc.findWatermarks(combinedSearchCriteria);
 // iterate through the collection and remove watermarks
 while (possibleWatermarks.getCount() > 0)
 {
     if (possibleWatermarks.get_Item(0).getImageData() != null)
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed image watermark.");
     }
     else
     {
         possibleWatermarks.removeAt(0);
         System.out.println("removed text watermark.");
     }
 }
 // save the result and release the document
 doc.save("without_watermark.pdf");
 doc.close();

Results

The following screenshot shows the resulting PDF document after the watermarks have been removed.