Extract PDF Pages in Python
Extract Single Page from a PDF
Extract a specific page from a PDF document and save it as a new file. Using the Aspose.PDF library, the script copies the desired page to a new PDF, leaving the original document unchanged. This is useful for splitting PDFs or isolating important pages for distribution.
- Load the source PDF using the Document API (ap.Document()).
- Create a new Document to hold the extracted page.
- Add the desired Page from the source document to the new PDF using the destination document's PageCollection (dst_document.pages.add(...)).
  - In this example, page 2 is extracted (1-based indexing).
- Save the new Document with the extracted page to the specified output file.
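import aspose.pdf as ap

def extract_page(input_file_name: str, output_file_name: str) -> None:
    # Open the source PDF; it is only read, never modified.
    src_document = ap.Document(input_file_name)
    # Start with an empty destination document.
    dst_document = ap.Document()
    # Copy page 2 into the destination (page numbers are 1-based).
    dst_document.pages.add(src_document.pages[2])
    dst_document.save(output_file_name)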
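Because page numbers are 1-based, an off-by-one or out-of-range value is an easy mistake. The following is a minimal sketch, not part of the original example, that checks the page number before copying. The names check_page_number and extract_page_checked are hypothetical, and the sketch assumes that len() on the page collection returns the number of pages.

```python
def check_page_number(page_number: int, page_count: int) -> int:
    """Return page_number unchanged if it is a valid 1-based page number."""
    if not 1 <= page_number <= page_count:
        raise IndexError(f"page {page_number} is outside the range 1..{page_count}")
    return page_number


def extract_page_checked(input_file_name: str, output_file_name: str, page_number: int) -> None:
    # Imported lazily so the pure helper above works without Aspose.PDF installed.
    import aspose.pdf as ap

    src_document = ap.Document(input_file_name)
    # Assumption: len() on the page collection gives the page count.
    check_page_number(page_number, len(src_document.pages))

    dst_document = ap.Document()
    dst_document.pages.add(src_document.pages[page_number])
    dst_document.save(output_file_name)
```

Failing early with a clear message keeps the bad page number visible, rather than leaving it to whatever error the library reports for an invalid index.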
Extract Multiple Pages from a PDF
Extract several specific pages from a PDF document and save them into a new file. Using the Aspose.PDF library, the script copies the selected pages to a new PDF, leaving the original document intact. This is useful for creating smaller PDFs containing only the relevant sections of a larger document.
- Load the source PDF using the Document API (ap.Document()).
- Create a new Document to hold the extracted pages.
- Select the pages to extract (in this example, pages 2 and 3 using 1-based indexing).
- Add each selected Page from the source document to the new PDF using its PageCollection.
- Save the new Document with the extracted pages to the specified output file.
import aspose.pdf as ap

def extract_multiple_pages(input_file_name: str, output_file_name: str) -> None:
    document = ap.Document(input_file_name)
    # 1-based numbers of the pages to copy, in output order.
    pages = [2, 3]
    another_document = ap.Document()
    for page_index in pages:
        # Append each selected page to the new document.
        another_document.pages.add(document.pages[page_index])
    another_document.save(output_file_name)
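The hard-coded page list can also be built from a user-supplied string such as "2-3,5". The sketch below is one possible way to do that. Both parse_page_spec and extract_pages_by_spec are hypothetical helpers, not part of Aspose.PDF, and the spec format itself is an assumption made for illustration.

```python
from typing import List


def parse_page_spec(spec: str) -> List[int]:
    """Expand a spec like "2-3,5" into a list of 1-based page numbers."""
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first, last = int(first_text), int(last_text)
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers are 1-based and must be >= 1")
    return pages


def extract_pages_by_spec(input_file_name: str, output_file_name: str, spec: str) -> None:
    # Imported lazily so parse_page_spec can be used without Aspose.PDF installed.
    import aspose.pdf as ap

    document = ap.Document(input_file_name)
    another_document = ap.Document()
    for page_index in parse_page_spec(spec):
        another_document.pages.add(document.pages[page_index])
    another_document.save(output_file_name)
```

For example, parse_page_spec("2-3,5") returns [2, 3, 5], and parse_page_spec("2, 3") yields the same list the example above hard-codes. Pages are copied in the order the spec lists them.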