Convert PDF to Word in Java

Aspose.PDF for Java can export PDF documents to Microsoft Word formats with different recognition and layout options. Use DocSaveOptions to control how PDF text, lists, and images are mapped into Word output.

Convert PDF to DOC

Use this example to export a PDF document to the legacy DOC format. The code creates DocSaveOptions, sets the format to Doc, and passes the options to document.save.

  1. Open the source PDF in a Document instance.
  2. Create DocSaveOptions and set the format to Doc.
  3. Call document.save(outputFile.toString(), saveOptions) to write the file in the Microsoft Word binary document format.
  4. Let the try-with-resources block close the Document once the DOC file has been written.
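import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;
import java.nio.file.Path;

public static void convertPdfToDoc(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        DocSaveOptions saveOptions = new DocSaveOptions();
        // Doc selects the legacy binary .doc format.
        saveOptions.setFormat(DocSaveOptions.DocFormat.Doc);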
        document.save(outputFile.toString(), saveOptions);
    }
    System.out.println(inputFile + " converted into " + outputFile);
}
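
The conversion methods on this page take both an input path and an output path. When the Word file should sit next to the source PDF, the output path can be derived by swapping the file extension. The following is a minimal sketch that uses only java.nio; WordOutputPaths and withExtension are hypothetical names introduced for illustration and are not part of Aspose.PDF.

```java
import java.nio.file.Path;

final class WordOutputPaths {
    private WordOutputPaths() {}

    /**
     * Returns a sibling of input whose extension is replaced by the given
     * extension (passed without a leading dot). Hypothetical helper.
     */
    static Path withExtension(Path input, String extension) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Keep the whole name when there is no extension or the name starts with a dot.
        String base = dot > 0 ? name.substring(0, dot) : name;
        return input.resolveSibling(base + "." + extension);
    }
}
```

With this helper, a call such as convertPdfToDoc(input, WordOutputPaths.withExtension(input, "doc")) would write the DOC file into the same directory as the source PDF.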

Convert PDF to DOCX

Use this example to export a PDF document as a DOCX file. DOCX is usually the better choice for new word-processing workflows: it is the default format of current Microsoft Word versions and is widely supported by other editors.

  1. Open the source PDF in a Document instance.
  2. Create DocSaveOptions and set the format to DocX.
  3. Call document.save(outputFile.toString(), saveOptions) to write the PDF content as an Office Open XML Word document.
  4. Let the try-with-resources block close the Document once the DOCX file has been written.
public static void convertPdfToDocx(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        DocSaveOptions saveOptions = new DocSaveOptions();
        // DocX selects the Office Open XML (.docx) format.
        saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
        document.save(outputFile.toString(), saveOptions);
    }
    System.out.println(inputFile + " converted into " + outputFile);
}
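
One quick sanity check after a conversion is to look at the container signature of the output file: binary DOC files are stored as OLE2 compound files, while DOCX files are ZIP archives. The sketch below compares the leading bytes against those two well-known signatures. WordFileSignature and its Container enum are hypothetical names, not Aspose.PDF API, and a matching signature only identifies the container; it does not prove that the file is a valid Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class WordFileSignature {
    enum Container { OLE2_COMPOUND, ZIP, UNKNOWN }

    // Header of an OLE2 compound file, the container used by binary .doc files.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header of a ZIP archive, the container used by .docx files.
    private static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    private WordFileSignature() {}

    /** Classifies the leading bytes of a file. */
    static Container detect(byte[] head) {
        if (startsWith(head, OLE2)) return Container.OLE2_COMPOUND;
        if (startsWith(head, ZIP)) return Container.ZIP;
        return Container.UNKNOWN;
    }

    /** Reads just enough bytes from the file to classify it. */
    static Container detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(OLE2.length));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

Under these assumptions, a file produced with DocFormat.Doc would be expected to report OLE2_COMPOUND and one produced with DocFormat.DocX would be expected to report ZIP.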

Convert PDF to DOCX with enhanced flow recognition

Use this example when the Word export should favor flowing editable content instead of fixed visual layout.

  1. Open the source PDF in a Document instance.
  2. Create DocSaveOptions for DocX output.
  3. Call setMode(DocSaveOptions.RecognitionMode.EnhancedFlow) so the converter uses enhanced flow recognition when it generates the DOCX.
  4. Call document.save(outputFile.toString(), saveOptions) to write the DOCX file.
public static void convertPdfToDocxAdvanced(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        DocSaveOptions saveOptions = new DocSaveOptions();
        saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
        // EnhancedFlow favors flowing, editable text over a fixed visual layout.
        saveOptions.setMode(DocSaveOptions.RecognitionMode.EnhancedFlow);
        document.save(outputFile.toString(), saveOptions);
    }
    System.out.println(inputFile + " converted into " + outputFile);
}

Convert PDF to DOCX with preserved line breaks

Use this example when line endings from the source PDF should be retained in the Word output.

  1. Open the source PDF in a Document instance.
  2. Create DocSaveOptions for DocX export.
  3. Call setAddReturnToLineEnd(true) so explicit line breaks are preserved during conversion.
  4. Call document.save(outputFile.toString(), saveOptions) to write the DOCX file.
public static void convertPdfToDocxWithLineBreaks(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        DocSaveOptions saveOptions = new DocSaveOptions();
        saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
        // Keep the line endings of the source PDF as explicit breaks in the output.
        saveOptions.setAddReturnToLineEnd(true);
        document.save(outputFile.toString(), saveOptions);
    }
    System.out.println(inputFile + " converted into " + outputFile);
}

Convert PDF to DOCX with bullet recognition

Use this example when list bullets from the source PDF should be recognized and preserved as list structures in Word.

  1. Open the source PDF in a Document instance.
  2. Create DocSaveOptions for DocX export.
  3. Call setRecognizeBullets(true) so list-like PDF content is recognized as bullet lists during conversion.
  4. Call document.save(outputFile.toString(), saveOptions) to write the DOCX file.
public static void convertPdfToDocxWithBulletRecognition(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        DocSaveOptions saveOptions = new DocSaveOptions();
        saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
        // Map list-like PDF content to Word list structures.
        saveOptions.setRecognizeBullets(true);
        document.save(outputFile.toString(), saveOptions);
    }
    System.out.println(inputFile + " converted into " + outputFile);
}
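
Bullet recognition exists because bullets in a PDF are typically just glyphs placed at the start of a line rather than real list structures. To make the idea of "list-like" content concrete, the sketch below flags lines that begin with a common bullet glyph followed by whitespace. It is purely an illustration of the concept: BulletLineSketch and its pattern are hypothetical, and nothing here describes how Aspose.PDF actually detects lists.

```java
import java.util.regex.Pattern;

final class BulletLineSketch {
    // Optional indent, one bullet-like glyph, whitespace, then some text.
    // The glyph set is an arbitrary illustrative choice, not Aspose's rule.
    private static final Pattern BULLET =
        Pattern.compile("^\\s*[\\u2022\\u25E6\\u25AA*-]\\s+\\S.*");

    private BulletLineSketch() {}

    /** Returns true when a single line of text looks like a bullet item. */
    static boolean looksLikeBullet(String line) {
        return BULLET.matcher(line).matches();
    }
}
```

A line such as "- nested item" is flagged, while "-5 degrees" is not, because no whitespace follows the dash.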

Convert PDF to DOCX with custom image resolution

Use this example when image fidelity inside the generated DOCX should be controlled during conversion.

  1. Open the source PDF in a Document instance.
  2. Create DocSaveOptions for DocX export.
  3. Call setImageResolutionX(300) and setImageResolutionY(300) so raster content is generated at the requested resolution.
  4. Call document.save(outputFile.toString(), saveOptions) to write the DOCX file.
public static void convertPdfToDocxWithImageResolution(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        DocSaveOptions saveOptions = new DocSaveOptions();
        saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
        // Resolution for images generated during conversion, set per axis.
        saveOptions.setImageResolutionX(300);
        saveOptions.setImageResolutionY(300);
        document.save(outputFile.toString(), saveOptions);
    }
    System.out.println(inputFile + " converted into " + outputFile);
}
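
Higher resolution values trade file size for fidelity. As a rough worked example, and assuming the resolution values are dots per inch, the pixel dimensions of a rasterized region are its physical size in inches multiplied by the resolution on each axis. RasterSizeEstimate is a hypothetical helper for this arithmetic only; the real size of images inside the DOCX depends on compression and on how the converter handles each image.

```java
final class RasterSizeEstimate {
    private RasterSizeEstimate() {}

    /** Pixels along one axis for a physical length at a given resolution (assumed DPI). */
    static long pixels(double inches, int dpi) {
        return Math.round(inches * dpi);
    }

    /** Uncompressed 24-bit RGB size in bytes: width x height x 3 bytes per pixel. */
    static long uncompressedRgbBytes(double widthInches, double heightInches, int dpiX, int dpiY) {
        return pixels(widthInches, dpiX) * pixels(heightInches, dpiY) * 3L;
    }
}
```

For a 4 x 3 inch image, 300 on both axes gives 1200 x 900 pixels, or 3,240,000 bytes before compression; halving the resolution to 150 gives 600 x 450 pixels and a quarter of that size.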