Description:
This Python script extracts text from PDF files using the PyPDF2 library. It reads each page of the PDF and compiles the extracted text into a single string.
# Python script to extract text from PDFs
import PyPDF2
def extract_text_from_pdf(file_path):
with open(file_path, 'rb') as f:
pdf_reader = PyPDF2.PdfFileReader(f)
text = ''
for page_num in range(pdf_reader.numPages):
page = pdf_reader.getPage(page_num)
text += page.extractText()
return text
Top comments (0)