Extract PDF Pages in Python

Extract Single Page from a PDF

Extract a specific page from a PDF document and save it as a new file. Using the Aspose.PDF library, the script copies the desired page to a new PDF, leaving the original document unchanged. This is useful for splitting PDFs or isolating important pages for distribution.

  1. Load the source PDF using the Document API (ap.Document()).
  2. Create a new Document to hold the extracted page.
  3. Add the desired Page from the source document to the new PDF using the destination document’s PageCollection (dst_document.pages.add(...)).
    • In this example, page 2 is extracted (1-based indexing).
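  4. Save the new Document with the extracted page to the specified output file.

import aspose.pdf as ap

def extract_page(input_file_name: str, output_file_name: str) -> None:
    # Open the source PDF; it is only read, never modified.
    src_document = ap.Document(input_file_name)
    # Start with an empty document to receive the page.
    dst_document = ap.Document()
    # Page numbers are 1-based, so pages[2] is the second page.
    dst_document.pages.add(src_document.pages[2])
    dst_document.save(output_file_name)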
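
Because page numbers are 1-based, 0 is never a valid page, and a request for page 2 cannot succeed on a one-page file. The following is a minimal sketch of a guard that could run before the copy step. validate_page_number is a hypothetical helper, not part of Aspose.PDF, and how the page count is obtained (for example, len(src_document.pages)) is an assumption to confirm against the library reference.

```python
def validate_page_number(page_number: int, page_count: int) -> int:
    """Return page_number unchanged if it is a valid 1-based page number.

    Hypothetical helper for illustration; page_count is supplied by the
    caller, so this function does not depend on any PDF library.
    """
    if page_count < 1:
        raise ValueError("the document has no pages")
    if not 1 <= page_number <= page_count:
        raise ValueError(
            f"page {page_number} is outside the valid range 1..{page_count}"
        )
    return page_number
```

Checking the number up front produces an error message that names the valid range, rather than whatever exception the library raises for a missing page.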

Extract Multiple Pages from a PDF

Extract multiple specific pages from a PDF document and save them into a new file. Using the Aspose.PDF library, selected pages are copied to a new PDF while leaving the original document intact. This is useful for creating smaller PDFs containing only relevant sections of a larger document.

  1. Load the source PDF using the Document API (ap.Document()).
  2. Create a new Document to hold the extracted pages.
  3. Select the pages to extract (in this example, pages 2 and 3 using 1-based indexing).
  4. Add each selected Page from the source document to the new PDF using its PageCollection.
  5. Save the new Document with the extracted pages to the specified output file.

import aspose.pdf as ap

def extract_multiple_pages(input_file_name: str, output_file_name: str) -> None:
    src_document = ap.Document(input_file_name)
    # 1-based page numbers to copy, in the order they should appear.
    page_numbers = [2, 3]
    dst_document = ap.Document()
    for page_number in page_numbers:
        # Append each selected page to the end of the new document.
        dst_document.pages.add(src_document.pages[page_number])
    dst_document.save(output_file_name)
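
The page list in the example above is hard-coded as [2, 3]. As a sketch of how a caller might supply the selection instead, the helper below expands a specification string such as "2-3,5" into a list of 1-based page numbers. parse_page_spec and its string format are hypothetical and not part of Aspose.PDF; the page count is passed in by the caller, so the function makes no assumption about how the library reports it.

```python
from typing import List


def parse_page_spec(spec: str, page_count: int) -> List[int]:
    """Expand a spec such as "2-3,5" into a list of 1-based page numbers.

    Hypothetical helper for illustration. Raises ValueError for malformed
    parts, descending ranges, and pages outside 1..page_count.
    """
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "2,,3"
        first_text, separator, last_text = part.partition("-")
        first = int(first_text)
        # A part without "-" is a single page, so the range is first..first.
        last = int(last_text) if separator else first
        if first > last:
            raise ValueError(f"descending range: {part!r}")
        if first < 1 or last > page_count:
            raise ValueError(f"{part!r} is outside pages 1..{page_count}")
        pages.extend(range(first, last + 1))
    return pages
```

The returned list keeps the order of the specification, so it can stand in for the hard-coded [2, 3] and the pages are appended to the new document in the order requested.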