
Aspose.Words for Python via .NET 22.5 Release Notes

Major Features

There are 126 improvements and fixes in this regular monthly release. The most notable are:

  • Added support for loading EPUB documents.
  • Added support for loading XML documents.
  • Added support for the “Envelope No. 10” page size for printing.
  • Implemented rendering of a border box around the MathML formulas and the strike lines.
  • Improved font detection when rendering characters in MathML formulas.
  • Improved text wrapping for RTL paragraphs with custom left indent.

Full List of Issues Covering all Changes in this Release (Reported by .NET Users)

Key | Summary | Category
WORDSNET-3822 | Table headers are not wrapped properly | New Feature
WORDSNET-8319 | Table column widths are calculated incorrectly during rendering | New Feature
WORDSNET-8487 | Paragraphs followed by Tightly wrapped Shapes render incorrectly in PDF | New Feature
WORDSNET-8838 | Support loading EPUB file format | New Feature
WORDSNET-8931 | Tab spacing is not respected in fixed page formats | New Feature
WORDSNET-9253 | Shaping issues with Telugu, Tamil, and Chinese characters | New Feature
WORDSNET-10869 | Add feature to format page number | New Feature
WORDSNET-12720 | Table contents do not render correctly in output PDF | New Feature
WORDSNET-14941 | FILLIN fields are lost in output PDF and print | New Feature
WORDSNET-22284 | Text position is changed after DOC to PDF conversion | New Feature
WORDSNET-22697 | Add support for loading of XML documents | New Feature
WORDSNET-22887 | Add loading progress notification | New Feature
WORDSNET-23577 | Add .NET 6.0 assemblies to the release build | New Feature
WORDSNET-7128 | Text wrapping in Cell is not correct in PDF | Enhancement
WORDSNET-8325 | WordML to PDF conversion issue with table rendering | Enhancement
WORDSNET-9075 | Table column widths are calculated incorrectly during rendering | Enhancement
WORDSNET-12186 | Picture and Textbox cause Aspose.Words to render content on one additional page | Enhancement
WORDSNET-13405 | Table width in percent is not honored when converted from DOCX to XPS | Enhancement
WORDSNET-5460 | Table inside header of RTF was not rendered in PDF | Bug
WORDSNET-5619 | Table widths are disturbed upon rendering to PDF | Bug
WORDSNET-8037 | WordML to PDF conversion issue with text rendering | Bug
WORDSNET-8327 | WordML to Pdf conversion issue with shape rendering | Bug
WORDSNET-9172 | DOCX to PDF conversion issue with table formatting | Bug
WORDSNET-9788 | DOC to PDF conversion issue with text (date) alignment | Bug
WORDSNET-10017 | DrawingML TextBoxes are pushed to the left beyond the left boundary in fixed page formats | Bug
WORDSNET-10410 | Table indentation is not preserved during rendering | Bug
WORDSNET-10700 | RTF to PDF conversion issue with table rendering | Bug
WORDSNET-10947 | Incorrect tab positioning causes incorrect text wrapping | Bug
WORDSNET-11123 | Table widths are not calculated correctly during rendering to PDF | Bug
WORDSNET-11500 | Incorrect position of wrapped text on conversion to PDF | Bug
WORDSNET-11641 | Widths of Tables and cells are not preserved during rendering to PDF | Bug
WORDSNET-11806 | DOC to PDF conversion issue with table layout | Bug
WORDSNET-12099 | Table layouts are not correct in PDF | Bug
WORDSNET-12381 | Table Cells widths are incorrect in rendered PDF | Bug
WORDSNET-12750 | Table Cells widths are incorrect in rendered PDF | Bug
WORDSNET-12979 | RenderedDocument and lines issue within table cells | Bug
WORDSNET-13196 | Thai font is displayed in the wrong way in PDF | Bug
WORDSNET-14989 | Thai characters are not preserved when rendered to PDF | Bug
WORDSNET-16037 | Field.isDirty value always false | Bug
WORDSNET-16742 | Arabic text is not rendered correctly in output PDF | Bug
WORDSNET-18524 | Conversion RTF to PDF inconsistent table width | Bug
WORDSNET-19215 | OfficeMath enclosing formula is crushed when outputting PDF | Bug
WORDSNET-19798 | Cells in Table gets misplaced during open/save a DOC | Bug
WORDSNET-22023 | Text alignments in narrow cells of PDF differs from Word after conversion | Bug
WORDSNET-22605 | Split string in LINQ Reporting not working as expected | Bug
WORDSNET-22669 | Table Content Pushed Down from its Original Position in PDF | Bug
WORDSNET-22725 | Table Cut off Issue when converting Html to Word | Bug
WORDSNET-22726 | Exception is thrown while converting from DOCX to HTML | Bug
WORDSNET-22733 | Extra vertical spacing added between Rows of a Table with Merged Cells | Bug
WORDSNET-22736 | Image position is changed after MHTML to PDF Conversion | Bug
WORDSNET-22843 | Incorrect rendering of Column3D in PDF | Bug
WORDSNET-22987 | Import differs from what is in browser | Bug
WORDSNET-23025 | ArgumentException: Incorrect hex length | Bug
WORDSNET-23225 | Aspose.Words hangs on document rendering | Bug
WORDSNET-23279 | Horizontal axis labels are wrapped improperly | Bug
WORDSNET-23330 | Image is not visible after import from AZW3 | Bug
WORDSNET-23332 | Aspose.Words hangs when loading a MOBI document | Bug
WORDSNET-23370 | UpdatePageLayout throws exception | Bug
WORDSNET-23371 | Structured Document Tag gets removed | Bug
WORDSNET-23394 | Document.UpdatePageLayout() throws System.InvalidOperationException : Infinite loop detected | Bug
WORDSNET-23396 | Text wrapping does not match Word | Bug
WORDSNET-23485 | Tab is lost upon converting document to HTML | Bug
WORDSNET-23500 | Content is shifted upon rendering document | Bug
WORDSNET-23504 | Text is wrapped improperly upon rendering | Bug
WORDSNET-23505 | Aspose.Words improperly selects paper source upon printing. | Bug
WORDSNET-23511 | RemoveEmptyParagraphs cleanup option does not work in case of nested IF fields | Bug
WORDSNET-23527 | Graphics is lost on PDF import | Bug
WORDSNET-23531 | Math equations alignment issue | Bug
WORDSNET-23535 | Consider disabling LoadOptions.ResourceLoadingCallback invocations for data URLs | Bug
WORDSNET-23536 | FileCorruptedException is thrown upon loading HTML document | Bug
WORDSNET-23540 | DOCX to PDF: Text overlapping the document layout | Bug
WORDSNET-23545 | Problem when editing PDF form field in Chrome | Bug
WORDSNET-23563 | Content is lost upon loading PDF document | Bug
WORDSNET-23565 | Numbers are rendered as tofu when use NumeralFormat.ArabicIndic | Bug
WORDSNET-23578 | Inaccurate vertical alignment in equations when saving to PDF | Bug
WORDSNET-23588 | ArgumentException is thrown upon loading MHTML document | Bug
WORDSNET-23596 | Text alignment in table is incorrect | Bug
WORDSNET-23604 | List numbering is wrong for lists from HTML altChunk’s | Bug
WORDSNET-23607 | “Unsupported file format: Unknown” on loading TXT file | Bug
WORDSNET-23642 | DOCX to PDF conversion causes layout issues in output PDF file | Bug
WORDSNET-23643 | Chart series are lost after DOCX to PDF conversion | Bug
WORDSNET-23644 | Bar charts’ height decreases after DOCX to PDF conversion | Bug
WORDSNET-23660 | AW does not imitate MS Word handling of an unsupported xml element | Bug
WORDSNET-23661 | ReportingEngine.BuildReport throws an exception on .NET 6 when reflection optimization is on | Bug
WORDSNET-23665 | Text in category labels is not wrapped | Bug
WORDSNET-23667 | Font name and size does not match MS Word on WML to DOCX conversion | Bug
WORDSNET-23668 | Extra paragraph in header on WML to DOCX conversion | Bug
WORDSNET-23672 | Incorrect shape positions on WML to DOCX conversion | Bug
WORDSNET-23677 | Do not invoke ResourceLoadingCallback for empty URLs | Bug
WORDSNET-23685 | Document.ExtractPages() causes line numbers restarting | Bug
WORDSNET-23693 | InvalidOperationException: Sequence contains more than one matching element | Bug
WORDSNET-23696 | TestSaveOdt performance test fails on net5 and net6 CLR | Bug
WORDSNET-23698 | DOC to PDF: Text with Shadow effect not correctly converted | Bug
WORDSNET-23699 | RTL paragraph is positioned incorrectly inside an inline table with different left and right spacings | Bug
WORDSNET-23703 | Font is changed after appending document with KeepSourceFormatting | Bug
WORDSNET-23707 | DOC Compare System.InvalidOperationException: Custom XML part is not found. | Bug
WORDSNET-23715 | FileCorruptedException is thrown upon loading DOCX document | Bug
WORDSNET-23717 | SVG letter-spacing style gets ignored when converting DOCX to PDF | Bug
WORDSNET-23718 | Document.ExtractPages changes list numbering | Bug
WORDSNET-23725 | Wrong paragraph format when adding an image after Pdf2Word conversion | Bug
WORDSNET-23730 | Fix StringComparison warnings | Bug
WORDSNET-23732 | Fix StringComparison warnings | Bug
WORDSNET-23733 | Fix StringComparison warnings | Bug
WORDSNET-23735 | Wrong list numbering due to loss and non-use of DurableId attribute values | Bug
WORDSNET-23743 | Part of content is moved into table upon reading RTF | Bug
WORDSNET-23745 | Fix StringComparison warnings in fields/mailmerge domain | Bug
WORDSNET-23757 | Comments anchor is misplaced after the saving | Bug
WORDSNET-23760 | PDF can’t be loaded because of “Sequence contains more than one matching element” error | Bug
WORDSNET-23791 | Fix customer issues using SonarQube analysis | Bug

Full List of Issues Covering all Changes in this Release (Reported by Java Users)

Key | Summary | Category
WORDSNET-15581 | RTF to PDF conversion issue with table’s cell width | New Feature
WORDSNET-19386 | Text-shift observed during Word to PDF conversion | New Feature
WORDSNET-17061 | Wrong Font for certain Arabic Characters used in PDF | Bug
WORDSNET-19196 | Text position is changed in output PDF | Bug
WORDSNET-20866 | DOC to HTML conversion throws System.NullReferenceException | Bug
WORDSNET-21486 | Imported SVG-based 3D Pie Chart Renders Incorrectly in Word | Bug
WORDSNET-22835 | Unexpected Column Widths after HTML with Merged Cells is Converted to DOCX | Bug
WORDSNET-23277 | Axis labels are wrapped improperly | Bug
WORDSNET-23569 | FileCorruptedException is thrown upon loading HTML document | Bug
WORDSNET-23571 | Uppercase text is rendered as regular text | Bug
WORDSNET-23592 | UpdateFields() fails with NPE | Bug
WORDSNET-23658 | System.InvalidOperationException: Stack empty. is thrown on Range.Replace | Bug
WORDSNET-23673 | FileCorruptedException is thrown upon loading DOCX document | Bug
WORDSNET-23678 | Aspose.Words hangs upon rendering document | Bug
WORDSNET-23695 | System.InvalidOperationException: Infinite loop detected. exception thrown | Bug
WORDSNET-23716 | Images are lost after loading word 2003 XML document | Bug
WORDSNET-23766 | Ident of list item is incorrect after comparing documents | Bug

Public API and Backward Incompatible Changes

This section lists the public API changes introduced in Aspose.Words 22.5. It covers new and obsoleted public methods as well as changes in internal behavior that may affect existing code. Any change that modifies existing behavior and could be seen as a regression is especially important and is documented here.

Added support for loading EPUB documents

Related issue: WORDSNET-8838

Aspose.Words can now load EPUB 2.0 documents.

EPUB is an e-book file format that uses the “.epub” file extension. An EPUB document is a collection of XHTML documents. Currently, Aspose.Words always loads all XHTML files from an EPUB document in the order in which they appear in the content file (OPF).
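To make "the order in which they appear in the content file" concrete, the following standard-library sketch lists the XHTML entries of an EPUB in the order the OPF manifest declares them. It does not use Aspose.Words and is not its implementation. It assumes the order meant is manifest order, which the notes do not state; a reader that follows the OPF spine could produce a different order. The function name and the in-memory sample book are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def xhtml_items_in_opf_order(epub_file):
    """Return hrefs of XHTML manifest items in the order the OPF lists them."""
    with zipfile.ZipFile(epub_file) as archive:
        # container.xml points at the OPF content file.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        package = ET.fromstring(archive.read(rootfile.get("full-path")))
    items = package.findall("opf:manifest/opf:item", OPF_NS)
    return [
        item.get("href")
        for item in items
        if item.get("media-type") == "application/xhtml+xml"
    ]


# Build a tiny in-memory EPUB skeleton (hypothetical content) to try it out.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as book:
    book.writestr(
        "META-INF/container.xml",
        '<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf"/></rootfiles>'
        "</container>",
    )
    book.writestr(
        "OEBPS/content.opf",
        '<package xmlns="http://www.idpf.org/2007/opf"><manifest>'
        '<item id="c2" href="ch2.xhtml" media-type="application/xhtml+xml"/>'
        '<item id="css" href="style.css" media-type="text/css"/>'
        '<item id="c1" href="ch1.xhtml" media-type="application/xhtml+xml"/>'
        "</manifest></package>",
    )
sample.seek(0)
ordered = xhtml_items_in_opf_order(sample)
```

With the sample above, `ordered` holds the two XHTML entries in manifest order and skips the stylesheet.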

The following publicly visible enum values were added:

The FileFormatUtil class can now be used to determine if a file is an EPUB document. For example, the following call

info = aw.FileFormatUtil.detect_file_format("book.epub")

will return an info instance with the FileFormatInfo.load_format property set to LoadFormat.EPUB.

An EPUB document can be loaded as follows:

doc = aw.Document("book.epub")

Added support for loading XML documents

Related issue: WORDSNET-22697

Aspose.Words can now load XML documents. The Extensible Markup Language (XML) is a simple text-based format for representing structured information: documents, data, configuration, books, transactions, invoices, and much more. Aspose.Words mimics MS Word behavior when importing XML documents.

The following publicly visible enum value was added:

The FileFormatUtil class can now be used to determine if a file is an XML document. For example, the following call

info = aw.FileFormatUtil.detect_file_format("sample.xml")

will return an info instance with the FileFormatInfo.load_format property set to LoadFormat.XML.

An XML document can be loaded as follows:

doc = aw.Document("sample.xml")
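The detection and loading calls for the two new formats can be combined into one guarded conversion step. The sketch below is only an illustration: the helper name `convert_new_format_to_docx` and its `aw` parameter are hypothetical, and the parameter exists only so the logic can be exercised with a stand-in module. The calls it makes (`FileFormatUtil.detect_file_format`, `load_format`, `LoadFormat.EPUB`, `LoadFormat.XML`, `Document`, `save`) are the ones named in these notes or in the regular Aspose.Words API.

```python
def convert_new_format_to_docx(src_path, dst_path, aw=None):
    """Load an EPUB or XML file and save it under dst_path.

    `aw` is a hypothetical injection point; when omitted, the real
    aspose.words package is imported.
    """
    if aw is None:
        import aspose.words as aw

    # Ask the library what it thinks the file is before loading it.
    info = aw.FileFormatUtil.detect_file_format(src_path)
    if info.load_format not in (aw.LoadFormat.EPUB, aw.LoadFormat.XML):
        raise ValueError(f"{src_path} was not detected as EPUB or XML")

    doc = aw.Document(src_path)
    doc.save(dst_path)
    return doc
```

A call such as `convert_new_format_to_docx("book.epub", "book.docx")` would then refuse files that are detected as some other format instead of converting them silently.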

Introduced ChapterPageSeparator enum and added PageSetup.chapter_page_separator and PageSetup.heading_level_for_chapter properties

Related issue: WORDSNET-10869

The ChapterPageSeparator enum is introduced:

class ChapterPageSeparator(enum.IntEnum):
    """Defines the separator character that appears between the chapter and page number."""

    # A hyphen.
    HYPHEN = 0
    
    # A period.
    PERIOD = 1
    
    # A colon.
    COLON = 2
    
    # An emphasized dash.
    EM_DASH = 3

    # A standard dash.
    EN_DASH = 4

The following public properties are added to PageSetup class:

class PageSetup:
    ...

    @property
    def heading_level_for_chapter(self) -> int:
        """Gets or sets the heading level style that is applied to the chapter titles in the document.
        
        Can be a number from 0 through 9. 0 means no chapter number if applied to page number.
        Before you can create page numbers that include chapter numbers, the document headings must have a numbered outline format applied."""
        ...

    @property
    def chapter_page_separator(self) -> ChapterPageSeparator:
        """Gets or sets the separator character that appears between the chapter number and the page number.
        
        Before you can create page numbers that include chapter numbers, the document headings must have a numbered outline format applied."""
        ...

Use Case:

doc = aw.Document(file_name)
 
page_setup = doc.first_section.page_setup
 
page_setup.page_number_style = aw.NumberStyle.UPPERCASE_ROMAN
page_setup.chapter_page_separator = aw.ChapterPageSeparator.COLON
page_setup.heading_level_for_chapter = 1
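As a rough picture of what the separator setting controls, the sketch below composes a chapter and page label in plain Python. It is not how Aspose.Words renders page numbers. The enum is a local mirror of the values listed above, and the mapping of each member to a character as well as the `chapter_page_label` helper are assumptions made for illustration.

```python
import enum


class ChapterPageSeparator(enum.IntEnum):
    """Local mirror of the enum described in the notes (illustration only)."""
    HYPHEN = 0
    PERIOD = 1
    COLON = 2
    EM_DASH = 3
    EN_DASH = 4


# Assumed character for each separator value.
SEPARATOR_TEXT = {
    ChapterPageSeparator.HYPHEN: "-",
    ChapterPageSeparator.PERIOD: ".",
    ChapterPageSeparator.COLON: ":",
    ChapterPageSeparator.EM_DASH: "\u2014",
    ChapterPageSeparator.EN_DASH: "\u2013",
}


def chapter_page_label(chapter_number, page_text, separator):
    """Join a chapter number and an already formatted page number."""
    return f"{chapter_number}{SEPARATOR_TEXT[separator]}{page_text}"


# Chapter 1, page five in uppercase Roman numerals, colon separator.
label = chapter_page_label(1, "V", ChapterPageSeparator.COLON)
```

Under these assumptions, the settings in the use case above would correspond to labels of the form `1:V`.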

Slight changes in markup nodes typed collection

Related issue: WORDSNET-23774

The default indexer of the structured document tag collection has been changed. It now takes the index number of a structured document tag in the collection rather than the identifier of the tag.

class StructuredDocumentTagCollection:
    ...

    def __getitem__(self, index: int) -> IStructuredDocumentTag:
        """Returns the structured document tag at the specified index.
        
        :param index: An index into the collection."""
        ...

Along with this, it has become possible to remove a structured document tag at the specified index number, as well as remove a structured document tag by its identifier.

class StructuredDocumentTagCollection:
    ...

    def remove(self, id: int):
        """Removes the structured document tag with the specified identifier.
        
        :param id: The structured document tag identifier."""
        ...

    def remove_at(self, index: int):
        """Removes a structured document tag at the specified index.
        
        :param index: An index into the collection."""
        ...

The lookup by ID that the indexer previously performed is now available through the get_by_id() method.

class StructuredDocumentTagCollection:
    ...

    def get_by_id(self, id: int) -> IStructuredDocumentTag:
        """Returns the structured document tag by identifier.
        
        Returns None if the structured document tag with the specified identifier cannot be found.
        
        :param id: The structured document tag identifier."""
        ...

Use Case:

structured_document_tags = doc.range.structured_document_tags
# We iterate through all collection elements, getting each element by its index number.
for i in range(structured_document_tags.count):
    sdt = structured_document_tags[i]
    print(sdt.title)

# Get the structured document tag by its Id.
sdt = structured_document_tags.get_by_id(1160505028)
if sdt is not None:
    print(sdt.title)

# Remove the structured document tag by its Id.
structured_document_tags.remove(1160505028)

# Remove the structured document tag at position 0.
structured_document_tags.remove_at(0)
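When several tags are removed by position in one pass, the order of removal matters, because each `remove_at()` call shifts the indexes that follow it. A minimal sketch of one safe pattern is to walk the indexes from last to first. The helper name and the small stand-in classes are hypothetical and exist only to make the example runnable without the library. It assumes the real collection reflects removals immediately, as a list would.

```python
def remove_tags_by_title(tags, title):
    """Remove every tag whose title matches; return how many were removed.

    `tags` is any object exposing `count`, index access and `remove_at()`,
    such as the collection described above.
    """
    removed = 0
    # Walk from the last index to the first so that a removal never
    # shifts an index that has not been visited yet.
    for index in reversed(range(tags.count)):
        if tags[index].title == title:
            tags.remove_at(index)
            removed += 1
    return removed


class FakeTag:
    """Stand-in for a structured document tag (illustration only)."""
    def __init__(self, title):
        self.title = title


class FakeTagCollection:
    """Stand-in for the collection, backed by a plain list."""
    def __init__(self, titles):
        self._items = [FakeTag(t) for t in titles]

    @property
    def count(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def remove_at(self, index):
        del self._items[index]


demo_tags = FakeTagCollection(["Name", "Draft", "Date", "Draft"])
removed_count = remove_tags_by_title(demo_tags, "Draft")
remaining = [demo_tags[i].title for i in range(demo_tags.count)]
```

With the real API, the same helper would be called as `remove_tags_by_title(doc.range.structured_document_tags, "Draft")`.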

Added “NUMBER_10_ENVELOPE” value to PaperSize enum

Related issue: WORDSNET-23505

Added support for the “Envelope No. 10” page size (4.125 x 9.5 inches) for printing.

Use Case:

# This value is used to set the page size as follows:
doc = aw.Document(file_name)
doc.first_section.page_setup.paper_size = aw.PaperSize.NUMBER_10_ENVELOPE
 
# Or in a similar way using DocumentBuilder:
builder = aw.DocumentBuilder(doc)
builder.page_setup.paper_size = aw.PaperSize.NUMBER_10_ENVELOPE
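Page dimensions in Aspose.Words are generally expressed in points, so it can help to see the envelope size in that unit. The arithmetic below is a plain-Python sketch that uses the standard 72 points per inch. The helper name is hypothetical, and the notes do not state the exact values the library stores for this paper size.

```python
POINTS_PER_INCH = 72.0


def inches_to_points(inches):
    """Convert a length in inches to typographic points."""
    return inches * POINTS_PER_INCH


# Envelope No. 10 is 4.125 x 9.5 inches.
envelope_width_pt = inches_to_points(4.125)   # 297.0
envelope_height_pt = inches_to_points(9.5)    # 684.0
```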

HtmlSaveOptions.export_text_box_as_svg was marked as obsolete

Related issue: WORDSNET-23514

The HtmlSaveOptions.export_text_box_as_svg property is now obsolete. Use the HtmlSaveOptions.export_shapes_as_svg property instead, which also affects text boxes.
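A minimal migration sketch follows, assuming existing code set export_text_box_as_svg to True before saving to HTML. The helper name `save_html_with_svg_shapes` and its optional `options` parameter are hypothetical. The parameter lets a caller pass options that are already configured, and it also allows the helper to be tried without the library installed.

```python
def save_html_with_svg_shapes(doc, out_path, options=None):
    """Save `doc` as HTML with shapes, text boxes included, exported as SVG."""
    if options is None:
        # Imported lazily so the helper can be exercised with stand-ins.
        import aspose.words as aw
        options = aw.saving.HtmlSaveOptions()

    # Replacement for the obsolete export_text_box_as_svg property.
    options.export_shapes_as_svg = True
    doc.save(out_path, options)
    return options
```

Because export_shapes_as_svg affects all shapes, not only text boxes, the resulting HTML may differ from output produced with the old property for documents that contain other shape types.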