Aspose.Words for Java 21.4 Release Notes

Major Features

There are 102 improvements and fixes in this regular monthly release. The most notable are:

  • Added the ability to remove unused built-in styles.
  • The Fill class was extended with functionality for processing solid fills.
  • The public API of the Structured Document Tag range was extended.
  • Bulk bug fixes and improvements for Document.getRange().replace().
  • JavaDoc documentation update and bug fixes.

Full List of Issues Covering all Changes in this Release

Key | Summary | Category
WORDSNET-21246 | Provide more properties/methods in StructuredDocumentTagRangeStart Class | New Feature
WORDSNET-21991 | Support advanced typography in SkiaSharp image renderer | New Feature
WORDSNET-5643 | Consider providing a way to save user specified page ranges when saving to flow formats | Enhancement
WORDSNET-20580 | Improve list item padding simulation in Html list writer | Enhancement
WORDSNET-21597 | Add “IsDecorative” flag for DML shapes | Enhancement
WORDSNET-3449 | Improve support of large files | Enhancement
WORDSJAVA-2259 | Saving docx to pdf with Arabic font issue | Bug
WORDSJAVA-2314 | TableSubstitutionRule.save should take outputStream | Bug
WORDSJAVA-2455 | Document.getRange().replace() with FindReplaceOptions set to useSubstitutions(true) doesn’t replace text. | Bug
WORDSJAVA-2505 | Document.getRange().replace() with useSubstitutions FindReplaceOptions failed with NullPointerException | Bug
WORDSJAVA-2528 | insertOleObject failed when using InputStream parameter | Bug
WORDSJAVA-2541 | NetInputStream throws Exception if skip number is bigger than stream length | Bug
WORDSJAVA-2546 | Document.getRange().replace() returns one extra symbol when regular expressions with substitutions are used. | Bug
WORDSNET-21966 | DOCX to PDF conversion: System.ArgumentOutOfRangeException: ‘Index was out of range.’ | Bug
WORDSNET-21801 | URL is changed after Word to PDF conversion | Bug
WORDSNET-20987 | Unexpected format change revision appears for a Table after Compare | Bug
WORDSNET-20190 | Chart series are lost after DOCX to PDF conversion | Bug
WORDSNET-20191 | Chart X-axis and Axis Title render incorrectly in output PDF | Bug
WORDSNET-22024 | System.InvalidOperationException: NC sync failed occurs upon comparing DOCX files | Bug
WORDSNET-22016 | Provide example to process BARCODE fields | Bug
WORDSNET-21400 | Aspose.Words enters an infinite loop when converting file from DOCX to PDF | Bug
WORDSNET-21636 | Content formatting changed after delete XmlMapping | Bug
WORDSNET-21977 | Unused styles are not cleaned from DOCX | Bug
WORDSNET-21979 | Does not update SDT content in document.xml parts after calling UpdatePageLayout() | Bug
WORDSNET-21981 | FileCorruptedException is thrown while importing DOC | Bug
WORDSNET-21983 | Unexpected FileCorruptedException is thrown while importing document | Bug
WORDSNET-20758 | Table does not export correctly in output PDF | Bug
WORDSNET-21986 | Track changes - Ghost formatted table | Bug
WORDSNET-21959 | DOCX to HTML: Incorrect SVG image rendering | Bug
WORDSNET-21997 | w:instrText is added in document header after re-saving DOCX | Bug
WORDSNET-21999 | System.ArgumentOutOfRangeException is thrown while saving DOCX to PDF | Bug
WORDSNET-21998 | w:instrText is added in comments.xml after re-saving DOCX | Bug
WORDSNET-16679 | Metafile is rendered improperly | Bug
WORDSNET-16689 | Spacing between Asian and Latin numbers is rendered improperly | Bug
WORDSNET-4539 | Setting FormField text to empty does not display default text in MS Word | Bug
WORDSNET-16698 | When MetafileRenderingMode.Bitmap is used, quality of image is not good | Bug
WORDSNET-20597 | Incorrect automatic font color inside VML shape | Bug
WORDSNET-21663 | Vertical Chinese Text Collapses into a Single Line in HTML Fixed | Bug
WORDSNET-16509 | Not able to save as PDF | Bug
WORDSNET-16971 | Cell background color is incorrect in .NET Standard | Bug
WORDSNET-21838 | Docx -> PDF: conversion never ends for a file | Bug
WORDSNET-21862 | Conversion from ODT to PDF throws IndexOutOfRangeException | Bug
WORDSNET-21680 | Missed footnotes during conversion between DOCX and Markdown | Bug
WORDSNET-22025 | LINQ Reporting Engine throws NullPointerException when ReportBuildOptions.RemoveEmptyParagraphs is used | Bug
WORDSNET-21875 | Table rows are lost after PDF to DOCX conversion | Bug
WORDSNET-18770 | Unable to save the HTML version of DOCX file | Bug
WORDSNET-18092 | Document.UpdateFields method throws System.InvalidOperationException | Bug
WORDSNET-21330 | Rendering of some Japanese characters changes their orientation | Bug
WORDSNET-21886 | Hyperlink DisplayResult not present in result DOTX document | Bug
WORDSNET-21859 | Text from paragraphs with zero line spacing becomes visible in rendered documents | Bug
WORDSNET-18969 | List bullet image is duplicated after conversion to HTML | Bug
WORDSNET-21685 | Image dimension is changed while importing Markdown | Bug
WORDSNET-21691 | ArgumentOutOfRangeException is thrown when calling UpdatePageLayout | Bug
WORDSNET-15169 | Direction of Text Characters is changed in DOCX to JPG conversion | Bug
WORDSNET-21096 | Infinite loop on call of UpdatePageLayout | Bug
WORDSNET-21899 | DOC to PDF conversion issue with Chinese text rendering | Bug
WORDSNET-21722 | Some tests fail sometimes while running in parallel | Bug
WORDSNET-21435 | Styles are not imported correctly after HTML to DOCX conversion | Bug
WORDSNET-21733 | Combo Custom Combination Chart - Updating a Combined Chart Corrupts Document | Bug
WORDSNET-21151 | Bar chart is not rendered correctly | Bug
WORDSNET-21910 | System.IndexOutOfRangeException occurs upon DOCX to PDF conversion | Bug
WORDSNET-21911 | System.ArgumentOutOfRangeException occurs upon Word DOCX to PDF conversion | Bug
WORDSNET-21912 | Word to PDF conversion never ends for a DOC File | Bug
WORDSNET-20607 | Contents are pushed down to next page in output PDF | Bug
WORDSNET-21188 | An interval between list label end and first line text is different depending on HtmlSaveOptions.PrettyFormat | Bug
WORDSNET-21602 | Table row is lost when HTML and Table is inserted into document using DocumentBuilder | Bug
WORDSNET-21604 | Infinite loop during call of Document.UpdatePageLayout | Bug
WORDSNET-21748 | System.ArgumentOutOfRangeException for UpdatePageLayout method | Bug
WORDSNET-21484 | Incorrectly displayed form fields in PDF when viewed in Google Chrome Web Browser | Bug
WORDSNET-21844 | Line extends out of bounds in PDF file after saving DOCX to PDF | Bug
WORDSNET-21763 | DOCX to TIFF conversion issue with simplified Arabic text rendering | Bug
WORDSNET-21929 | Text is moved to previous line after DOCX to PDF conversion | Bug
WORDSNET-21928 | Barcode position is changed after DOC to PDF conversion | Bug
WORDSNET-21800 | Japanese top to down text issue when converting DOCX to HTML | Bug
WORDSNET-21607 | New style introduced after adding a new SDT | Bug
WORDSNET-21592 | Links are broken when rendering CHM files to HTML | Bug
WORDSNET-13730 | System.TypeInitializationException is thrown while loading Document | Bug
WORDSNET-21930 | DOCX to PDF conversion issue with background | Bug
WORDSNET-21642 | Zny.Common.Document uses our AW.NET, check is it legal or not | Bug
WORDSNET-18507 | DOC to HtmlFixed conversion issue with table rendering | Bug
WORDSNET-21935 | Table row is shifted towards left side of page after DOCX to PDF conversion | Bug
WORDSNET-21649 | Bookmarks are lost after initiating LayoutEnumerator | Bug
WORDSNET-21939 | FileCorruptedException occurs upon loading a DOCX | Bug
WORDSNET-21797 | DOCX to PDF/A conversion and validation fails: Table spanned over 2 pages is tagged as two tables instead of one. | Bug
WORDSNET-21761 | Conversion process hangs on Linux | Bug
WORDSNET-19297 | Layout difference for the document | Bug
WORDSNET-21967 | Extract text from PDF file line by line and save data values inside SQL server database | Bug
WORDSNET-21978 | Document.Cleanup() not removing all unused Styles and Lists | Bug
WORDSNET-13487 | Watermark positioning is incorrect in PDF | Bug
WORDSNET-16697 | TryParse does not work for “fr-CH” | Bug
WORDSNET-20979 | text-align : left style inside table column is not exported in output HTML | Bug
WORDSNET-16683 | Document has incorrect tab size | Bug
WORDSNET-16705 | Original date value is incorrect | Bug
WORDSNET-21518 | Nested Textboxes inside Tables are missing in generated HTML | Bug
WORDSNET-12069 | Extra empty page appears after conversion from DOCX to PDF | Bug
WORDSNET-12153 | Contents position is changed after conversion from DOCX to PDF | Bug
WORDSNET-12393 | Some Tables are pushed to next page in rendered document | Bug
WORDSNET-20242 | DOC to HtmlFixed conversion is incorrect when HtmlFixedSaveOptions.ResourcesFolder is used | Bug
WORDSNET-13531 | Incorrect conversion from DOCX to PDF | Bug
WORDSNET-20729 | MHTML to PDF conversion hangs | Bug
WORDSNET-16722 | Document has incorrect field values | Bug
WORDSNET-21590 | PDF to MS word formatting issues | Bug
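
Several of the WORDSJAVA fixes listed above (WORDSJAVA-2455, WORDSJAVA-2505, WORDSJAVA-2546) concern replacements that use substitutions, that is, replacement strings that refer back to groups captured by the search pattern. The sketch below illustrates only that general idea with the JDK's java.util.regex package. It does not call the Aspose.Words API, and the class name, method name, pattern, and sample text are assumptions made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: plain JDK regex, not the Aspose.Words replace() API.
class SubstitutionSketch {
    // Swaps "first last" word pairs by referring to capture groups ($1, $2)
    // in the replacement string.
    static String swapNames(String text) {
        Pattern pattern = Pattern.compile("(\\w+) (\\w+)");
        Matcher matcher = pattern.matcher(text);
        return matcher.replaceAll("$2, $1");
    }

    public static void main(String[] args) {
        System.out.println(swapNames("Jane Doe"));
    }
}
```

With the input "Jane Doe" this prints "Doe, Jane"; text that does not match the pattern is returned unchanged.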

Public API and Backward Incompatible Changes

This section lists public API changes that were introduced in Aspose.Words 21.4. It includes not only new and obsoleted public methods, but also a description of any changes in the behavior behind the scenes in Aspose.Words which may affect existing code. Any behavior introduced that could be seen as regression and modifies the existing behavior is especially important and is documented here.

Aspose.Words.Comparing namespace was introduced

Due to refactoring of the Aspose.Words namespaces, the CompareOptions, ComparisonTargetType, and Granularity classes were moved to the new Aspose.Words.Comparing namespace. If this causes a compilation error, add using Aspose.Words.Comparing; to the affected files.

Aspose.Words.Notes namespace was introduced

Due to refactoring of the Aspose.Words namespaces, the Footnote, EndnoteOptions, FootnoteOptions, EndnotePosition, FootnotePosition, FootnoteType, and FootnoteNumberingRule classes were moved to the new Aspose.Words.Notes namespace. If this causes a compilation error, add using Aspose.Words.Notes; to the affected files.

Load options classes were moved to Aspose.Words.Loading namespace

Due to refactoring of the Aspose.Words namespaces, the LoadOptions, PdfLoadOptions, RtfLoadOptions, and TxtLoadOptions classes and the corresponding enums were moved to the Aspose.Words.Loading namespace. If this causes a compilation error, add using Aspose.Words.Loading; to the affected files.

Added a new public property CleanupOptions.UnusedBuiltinStyles

Related issue: WORDSNET-21977

Added a new public property to CleanupOptions:

/// <summary>
/// Specifies that unused BuiltIn styles should be removed from document.
/// </summary>
public bool UnusedBuiltinStyles { get; set; }

Use Case: Explains how to use UnusedBuiltinStyles property.

Document doc = new Document("input.docx");
CleanupOptions cleanupOptions = new CleanupOptions();
cleanupOptions.UnusedBuiltinStyles = true;
doc.Cleanup(cleanupOptions);

Advanced typography supported when saving to images on .NET and .NET Standard

Related issues: WORDSNET-21330, WORDSNET-21991

Advanced typography is now supported when saving to images with GDI+ or SkiaSharp (i.e. on all .NET platforms and .NET Standard).

Use Case: Saving document to image with advanced typography features.

Document doc = new Document("input.docx");
doc.LayoutOptions.TextShaperFactory = HarfBuzzTextShaperFactory.Instance;
doc.Save("output.png");

Fill.Solid() method was introduced

Related issue: WORDSNET-21808

The following new public methods were added to the Fill class:

/// <summary>
/// Sets the fill to a uniform color.
/// </summary>
/// <remarks>
/// Use this method to convert any of the fills back to solid fill.
/// </remarks>
public void Solid()
 
/// <summary>
/// Sets the fill to a specified uniform color.
/// </summary>
/// <remarks>
/// Use this method to convert any of the fills back to solid fill.
/// </remarks>
public void Solid(Color color)

Use Case: Explains how to change fill to Solid.

// Open some document with text effects.
Document doc = new Document("TextTwoColorGradient.docx");
 
// Get Fill object for Font of the first Run.
Fill fill = doc.FirstSection.Body.FirstParagraph.Runs[0].Font.Fill;
 
// Check Fill properties of the Font.
Console.WriteLine("The type of the fill is: {0}", fill.FillType);
Console.WriteLine("The foreground color of the fill is: {0}", fill.ForeColor);
Console.WriteLine("The fill is transparent at {0}%", fill.Transparency * 100);
 
// Change type of the fill to Solid with uniform green color.
fill.Solid(Color.Green);
Console.WriteLine("\nThe fill is changed:");
Console.WriteLine("The type of the fill is: {0}", fill.FillType);
Console.WriteLine("The foreground color of the fill is: {0}", fill.ForeColor);
Console.WriteLine("The fill transparency is {0}%", fill.Transparency * 100);
 
doc.Save("TextSolidOut.docx");
 
/*
This code example produces the following results:
 
The type of the fill is: Gradient
The foreground color of the fill is: Color [A=255, R=0, G=128, B=128]
The fill is transparent at 65%
 
The fill is changed:
The type of the fill is: Solid
The foreground color of the fill is: Color [A=255, R=0, G=128, B=0]
The fill transparency is 0%
*/

Public API of Structured Document Tag range was extended

Related issue: WORDSNET-21246

The constructors of the StructuredDocumentTagRangeStart and StructuredDocumentTagRangeEnd classes have been made public, so instances of these classes can now be created directly.

/// <summary>
/// Initializes a new instance of the <b>Structured document tag range start</b> class.
/// </summary>
/// <remarks>
/// <para>The following types of SDT can be created:</para>
/// <list type="bullet">
/// <item><see cref="Markup.SdtType.Checkbox"/></item>
/// <item><see cref="Markup.SdtType.DropDownList"/></item>
/// <item><see cref="Markup.SdtType.ComboBox"/></item>
/// <item><see cref="Markup.SdtType.Date"/></item>
/// <item><see cref="Markup.SdtType.BuildingBlockGallery"/></item>
/// <item><see cref="Markup.SdtType.Group"/></item>
/// <item><see cref="Markup.SdtType.Picture"/></item>
/// <item><see cref="Markup.SdtType.RichText"/></item>
/// <item><see cref="Markup.SdtType.PlainText"/></item>
/// </list>
/// </remarks>
/// <param name="doc">The owner document.</param>
/// <param name="type">Type of SDT node.</param>
public StructuredDocumentTagRangeStart(DocumentBase doc, SdtType type)
 
/// <summary>
/// Initializes a new instance of the <b>Structured document tag range end</b> class.
/// </summary>
/// <param name="doc">The owner document.</param>
/// <param name="id">Identifier of the corresponding structured document tag range start.</param>
public StructuredDocumentTagRangeEnd(DocumentBase doc, int id)

Use Case: Explains how to create a structured document tag range using the public constructors.

Document doc = new Document("input.docx");
 
// The type must be one of the SDT types the constructor supports (see the list above).
StructuredDocumentTagRangeStart start = new StructuredDocumentTagRangeStart(doc, SdtType.PlainText);
StructuredDocumentTagRangeEnd end = new StructuredDocumentTagRangeEnd(doc, start.Id);
 
doc.FirstSection.Body.InsertAfter(start, doc.FirstSection.Body.FirstParagraph);
doc.LastSection.Body.InsertBefore(end, doc.LastSection.Body.LastParagraph);
 
doc.Save("output.docx");

The IEnumerable<Node> interface is now implemented in the StructuredDocumentTagRangeStart class for full LINQ support (for example Last(), LastOrDefault(), and other methods).

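// LINQ extension methods such as LastOrDefault() require: using System.Linq;
Document doc = new Document("input.docx");
StructuredDocumentTagRangeStart start = (StructuredDocumentTagRangeStart)doc.FirstSection.Body.GetChild(NodeType.StructuredDocumentTagRangeStart, 0, false);

// Print the text of the last node enumerated from the range.
Console.WriteLine(start.LastOrDefault().GetText());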

A new public RemoveAllChildren() method has been added to the StructuredDocumentTagRangeStart class.

/// <summary>
/// Removes all the nodes between this range start node and the range end node.
/// </summary>
public void RemoveAllChildren()

Use Case: Explains how to use RemoveAllChildren() method.

Document doc = new Document("input.docx");
 
int nodeCountBefore = doc.GetChildNodes(NodeType.Any, true).Count;
StructuredDocumentTagRangeStart start = (StructuredDocumentTagRangeStart)doc.FirstSection.Body.GetChild(NodeType.StructuredDocumentTagRangeStart, 0, false);
start.RemoveAllChildren();
 
int nodeCountAfter = doc.GetChildNodes(NodeType.Any, true).Count;
 
Console.WriteLine(nodeCountBefore);
Console.WriteLine(nodeCountAfter);

A new public RemoveSelfOnly() method has been added to the StructuredDocumentTagRangeStart class.

/// <summary>
/// Removes this range start and appropriate range end nodes of the structured document tag,
/// but keeps its content inside the document tree.
/// </summary>
public void RemoveSelfOnly()

Use Case: Explains how to use RemoveSelfOnly() method.

Document doc = new Document("input.docx");
StructuredDocumentTagRangeStart start = (StructuredDocumentTagRangeStart)doc.FirstSection.Body.GetChild(NodeType.StructuredDocumentTagRangeStart, 0, false);
start.RemoveSelfOnly();
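
The difference between RemoveAllChildren() and RemoveSelfOnly() can be pictured with a flat list of sibling nodes in which the range start and range end act as markers. The following Java sketch is a toy model only: SdtRangeModel, its marker strings, and its methods are hypothetical names that mimic the behavior described in the summaries above, and they are not part of the Aspose.Words API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical, not Aspose.Words): a range is two markers in a node list.
class SdtRangeModel {
    static final String START = "<sdt-start>";
    static final String END = "<sdt-end>";

    // Mimics RemoveAllChildren(): drops the nodes between the markers, keeps the markers.
    // Assumes both markers are present and START precedes END.
    static List<String> removeAllChildren(List<String> nodes) {
        int start = nodes.indexOf(START);
        int end = nodes.indexOf(END);
        List<String> result = new ArrayList<>(nodes.subList(0, start + 1));
        result.addAll(nodes.subList(end, nodes.size()));
        return result;
    }

    // Mimics RemoveSelfOnly(): drops both markers, keeps the content where it was.
    static List<String> removeSelfOnly(List<String> nodes) {
        List<String> result = new ArrayList<>(nodes);
        result.remove(START);
        result.remove(END);
        return result;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("intro", START, "para1", "para2", END, "outro");
        System.out.println(removeAllChildren(nodes));
        System.out.println(removeSelfOnly(nodes));
    }
}
```

In this model the first call leaves the markers with nothing between them, while the second leaves "para1" and "para2" in place with no markers around them.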

Removed obsolete PdfSaveOptions.EscapeUri property

This option is no longer needed: writing of URIs to PDF was improved, and the cases that previously required escaping to be disabled are now handled correctly.