Aspose.Words for Java 20.5 Release Notes

Major Features

  • Added the ability to show or hide grammatical and spelling errors.
  • Introduced a new helper class for working with watermarks in a document.
  • Added a feature to set the compression level for OOXML documents.
  • Made the conversion of semi-broken MHTML files to PDF more resilient.

Full List of Issues Covering all Changes in this Release

Key | Summary | Category
WORDSNET-10403 | Add feature to ‘Hide spelling errors in this document only’ | New Feature
WORDSNET-4879 | Add a helper method to insert a watermark into the header | New Feature
WORDSNET-10404 | Add feature to ‘Hide grammar errors in this document only’ | New Feature
WORDSNET-20094 | Add an option to remove duplicate styles to Document.Cleanup feature | New Feature
WORDSNET-20169 | Aspose.Words writes incorrect bytes for DOCX | New Feature
WORDSJAVA-1587 | Aspose Words 17.5 Giving Incorrect Rendition | Bug
WORDSJAVA-1604 | Paragraph’ line is rendered on the previous page in output PDF | Bug
WORDSJAVA-2333 | Document.getPageCount throws java.lang.IllegalStateException with JDK 1.8 (32 bit) | Bug
WORDSJAVA-2346 | On MHTML to PDF conversion the ArrayIndexOutOfBoundsException has been thrown | Bug
WORDSJAVA-2348 | On MHTML to PDF conversion, the IllegalStateException has been thrown | Bug
WORDSJAVA-2349 | On MHTML to PDF conversion the NegativeArraySizeException has been thrown | Bug
WORDSJAVA-2350 | On MHTML to PDF conversion, the NullPointerException has been thrown | Bug
WORDSJAVA-2352 | On MHTML to PDF conversion, the “OutOfMemoryError Requested array size exceeds VM” limit has been thrown | Bug
WORDSJAVA-2354 | On MHTML to PDF conversion, the UnsupportedOperationException has been thrown | Bug
WORDSJAVA-2371 | UpdateFields feature creates the wrong ordering of names. | Bug
WORDSJAVA-2375 | Black/white bi-tonal image is inverted after RTF to TIFF conversion. | Bug
WORDSJAVA-2376 | Wrong count of labels in ChartDataLabelCollection. | Bug
WORDSJAVA-2378 | The code throws an exception when using Aspose.Words.Shaping.Harfbuzz plugin with JDK 1.6 | Bug
WORDSJAVA-2379 | Implement platform-neutral StringList to remove the link to org.hsqldb.* from production. | Bug
WORDSJAVA-2383 | Java Collator’s rules do not match to .NET’s | Bug
WORDSNET-13640 | Font “Simplified Arabic” is changed to “Arial” in output PDF | Bug
WORDSNET-20352 | Spaces with the font DotumChe are not shrunk | Bug
WORDSNET-19093 | Follow up for WORDSNET-18543, implement notification | Bug
WORDSNET-19105 | Arabic words take up more horizontal space | Bug
WORDSNET-20187 | UpdateFields does not process TOC fields ( formatting changed & an entry missing) | Bug
WORDSNET-19240 | Improve markdown emphases parsing | Bug
WORDSNET-20257 | Import of complex StructuredDocumentTag | Bug
WORDSNET-15509 | Arabic text is not rendered correctly in output PDF | Bug
WORDSNET-18294 | Some characters looks different in PDF rendition | Bug
WORDSNET-12125 | Locales should be expanded from document defaults when exported to DOC format | Bug
WORDSNET-19254 | Incorrect inline shape height causes layout differences | Bug
WORDSNET-13162 | Rtf to Pdf conversion issue with Thai text rendering | Bug
WORDSNET-20208 | Rendering of the fraction. Height calculation of the fraction | Bug
WORDSNET-18923 | Paragraph shading is stretched up to TextBox bottom | Bug
WORDSNET-20211 | DOCX is corrupted after re-saving it | Bug
WORDSNET-20212 | Range.Replace does not replace the numbers | Bug
WORDSNET-20353 | ArgumentOutOfRangeException is thrown while exporting document with multiple numbered paragraphs inside a cell into Markdown | Bug
WORDSNET-19920 | Document.UpdateFields does not update Index entries under Swedish language | Bug
WORDSNET-13133 | TXT to PDF conversion issue with Thai Characters | Bug
WORDSNET-20219 | Extra data points appear in the chart in the output PDF | Bug
WORDSNET-20222 | Convert to PDF output font issue | Bug
WORDSNET-19038 | Content control missing after resave | Bug
WORDSNET-15446 | Font formatting of Arabic text is changed in output PDF | Bug
WORDSNET-15858 | Traditional Arabic font is not rendered correctly in output PDF | Bug
WORDSNET-17001 | Equations have incorrect layout in PDF | Bug
WORDSNET-20168 | On conversion from MHTML to PDF the exception “Parameter is not valid” has been thrown | Bug
WORDSNET-20046 | Document.UpdateFields does not update the IF field correctly | Bug
WORDSNET-20052 | Line shape is lost after DOC to PDF conversion | Bug
WORDSNET-20229 | GetStartPageIndex returns incorrect page number for table | Bug
WORDSNET-20234 | Reading PDF documents does not work with Aspose.Words for .NET 20.4 (.NET 4.6.1.) | Bug
WORDSNET-20236 | Diagram text is cropped if the paragraph line spacing less then one | Bug
WORDSNET-20237 | LayoutCollector incorrectly returns page index | Bug
WORDSNET-20238 | LayoutCollector incorrectly returns page index | Bug
WORDSNET-20224 | Inline images being shifted in HTML Fixed format | Bug
WORDSNET-19826 | CPU hangs during processing Mail Merge involving INCLUDETEXT fields | Bug
WORDSNET-19405 | DOCX to PDF file conversion gets one page more | Bug
WORDSNET-20244 | Image’s height width change during open save a RTF | Bug
WORDSNET-19202 | Extra step numbers are being generated in Word to PDF transformation | Bug
WORDSNET-20116 | Hyperlinks in the Table of Contents do not work | Bug
WORDSNET-20249 | Unsupported EOT fonts from DOC format during roundtrip conversion | Bug
WORDSNET-20251 | After conversion to PDF some text with radio buttons is mixed up | Bug
WORDSNET-20247 | Document.Compare throws System.InvalidOperationException | Bug
WORDSNET-20186 | Document.Range.Replace regex string anchors not working | Bug
WORDSNET-20104 | WordArt Shape turns into boxes in rendered document | Bug
WORDSNET-20232 | Image renders incorrectly after DOC to HtmlFixed conversion | Bug
WORDSNET-20028 | Conversion to HTML results in wrong spacing | Bug
WORDSNET-20172 | On conversion from MHTML to PDF the exception “NotSupportedException” has been thrown | Bug
WORDSNET-20268 | Offset shapes in SmartArt when using the Advanced mode | Bug
WORDSNET-20118 | Second attempt to save in RTF file format seems to hang | Bug
WORDSNET-19863 | SVG is rendered incorrectly in output DOCX/PDF | Bug
WORDSNET-19352 | DOCX to PDF conversion issue with heading numbers | Bug
WORDSNET-19647 | SVG images not displayed correctly | Bug
WORDSNET-20069 | Table header is repeated after conversion from DOTX to DOCX | Bug
WORDSNET-19885 | HTML file load wrong encoding | Bug
WORDSNET-20288 | Chinese text is overlapping in icon caption of OleObject | Bug
WORDSNET-20290 | The width of the fraction is incorrect after converting to PDF | Bug
WORDSNET-20292 | Number of type string unexpectedly changes value in LINQ Reporting Engine | Bug
WORDSNET-20223 | System.ArgumentOutOfRangeException when saving document on Xamarin | Bug
WORDSNET-20134 | Pageref appearing in PDF Table of Contents | Bug
WORDSNET-20139 | Infinite loop while converting DOCX to PDF | Bug
WORDSNET-19929 | An Arial bold formatting has been invalidated in PDF | Bug
WORDSNET-20152 | Ignore fonts from %WINDIR%\Fonts\Deleted folder | Bug
WORDSNET-20271 | Invalid “d” attribute value of SVG Path leads to an exception | Bug
WORDSNET-20153 | DOCX to HTML conversion issue with image position | Bug
WORDSNET-20163 | Word file containing images doesn’t get properly converted to PDF file | Bug
WORDSNET-19664 | IncludePicture Image does not scale properly | Bug
WORDSNET-20164 | NullReferenceException is thrown upon converting HTML to MD | Bug
WORDSNET-20181 | System.InvalidOperationException is thrown while inserting document into another | Bug
WORDSNET-20320 | Aspose.Words.Document constructor hangs for ODT under netcoreapp3.1 | Bug
WORDSNET-20171 | On conversion from MHTML to PDF the exception “NullReferenceException” has been thrown | Bug
WORDSNET-18360 | Text is overlapped after conversion from DOCX to PDF | Bug
WORDSNET-11448 | Thai Text rendering issue in output TIFF/PDF | Bug
WORDSNET-19688 | Paragraph spacing has been automatically changed to 10 | Bug
WORDSNET-15503 | Paragraph’ line is rendered on previous page in output PDF | Bug
WORDSNET-19603 | Just open and save DOCX document, the style is modified | Bug
WORDSNET-15463 | DOC to PDF - Output is incorrect on long content | Bug
WORDSNET-15928 | Blank page is inserted after conversion from DOCX to PDF | Bug
WORDSNET-14528 | Shapes do not render correctly in output HtmlFixed | Bug
WORDSNET-20346 | Build project target to Framework 4.6.1 adds netstandard.dll and other facade libs to debug folder | Bug

Public API and Backward Incompatible Changes

This section lists the public API changes introduced in Aspose.Words 20.5. It covers new and obsoleted public methods as well as internal behavior changes in Aspose.Words that may affect existing code. Changes that modify existing behavior and could be seen as a regression are especially important and are documented here.

Added a new public property OoxmlSaveOptions.CompressionLevel

Related issue: WORDSNET-20169

A new public option has been added to the ‘OoxmlSaveOptions’ class:

.NET

/// <summary>
/// Specifies the compression level used to save document.
/// </summary>
public CompressionLevel CompressionLevel

The corresponding public ‘CompressionLevel’ enum has been added to the ‘Aspose.Words.Saving’ namespace:

.NET

/// <summary>
/// The compression level for OOXML files.
/// <para>
/// DOCX and DOTX files are internally ZIP archives; this property controls the compression level of the archive.
/// </para>
/// <para>
/// Note that a FlatOpc file is not a ZIP archive, so this property does not affect FlatOpc files.
/// </para>
/// </summary>
public enum CompressionLevel
{
    /// <summary>
    /// Normal compression level. Default compression level used by Aspose.Words.
    /// </summary>
    Normal = 0,
    /// <summary>
    /// Maximum compression level.
    /// </summary>
    Maximum = 1,
    /// <summary>
    /// Fast compression level.
    /// </summary>
    Fast = 2,
    /// <summary>
    /// Super Fast compression level. Microsoft Word uses this compression level.
    /// </summary>
    SuperFast = 3
}

Use Case: Save the document with the ‘SuperFast’ compression level (the level Microsoft Word uses).

.NET

Document doc = new Document("in.docx");
OoxmlSaveOptions so = new OoxmlSaveOptions(SaveFormat.Docx);
so.CompressionLevel = CompressionLevel.SuperFast;
doc.Save("out.docx", so);
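
Because DOCX and DOTX files are ZIP archives, the effect of a compression level can be illustrated without Aspose.Words at all. The following is a minimal JDK-only sketch, not the library's implementation: the class name `ZipLevelSketch`, its helper methods, and the sample payload are hypothetical, and these notes do not state how the `CompressionLevel` values map to Deflater levels, so no such mapping is assumed here. It only shows that the same part produces archives of different sizes depending on the level.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration class; it does not use or mirror any Aspose.Words API.
class ZipLevelSketch {

    // Writes a single entry into an in-memory ZIP at the given Deflater level (0-9)
    // and returns the size of the resulting archive in bytes.
    static int zippedSize(byte[] content, int deflaterLevel) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.setLevel(deflaterLevel);
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(content);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Builds a repetitive XML-like payload that stands in for a DOCX part.
    static byte[] samplePart() {
        StringBuilder xml = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            xml.append("<w:p><w:r><w:t>Paragraph ").append(i).append("</w:t></w:r></w:p>");
        }
        return xml.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] part = samplePart();
        System.out.println("raw bytes:        " + part.length);
        System.out.println("no compression:   " + zippedSize(part, Deflater.NO_COMPRESSION));
        System.out.println("best speed:       " + zippedSize(part, Deflater.BEST_SPEED));
        System.out.println("best compression: " + zippedSize(part, Deflater.BEST_COMPRESSION));
    }
}
```

Faster levels generally trade a larger archive for less CPU time, which is the trade-off the new option exposes.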

Added page layout callback

Related issue: WORDSNET-19093

A new Callback property has been added to Document.LayoutOptions:

.NET

/// <summary>
/// Gets or sets <see cref="IPageLayoutCallback"/> implementation used by page layout model.
/// </summary>
public IPageLayoutCallback Callback

The interface to be implemented by the application:

.NET

public interface IPageLayoutCallback
{
    /// <summary>
    /// This is called to notify of layout build and rendering progress.
    /// </summary>
    /// <remarks>
    /// An exception thrown by the implementation aborts the layout build process.
    /// </remarks>
    void Notify(PageLayoutCallbackArgs args);
}

Event arguments

.NET

public class PageLayoutCallbackArgs
{
    /// <summary>
    /// Gets event.
    /// </summary>
    public PageLayoutEvent Event { get; }
    /// <summary>
    /// Gets document.
    /// </summary>
    public Document Document { get; }
    /// <summary>
    /// Gets 0-based index of the page in the document this event relates to.
    /// Returns negative value if there is no associated page, or if page was removed during reflow.
    /// </summary>
    public int PageIndex { get; }
}

Event codes

.NET

public enum PageLayoutEvent
{
    /// <summary>
    /// Default value
    /// </summary>
    None,
    /// <summary>
    /// Corresponds to a checkpoint in code which is often visited and which is suitable to abort process.<para/>
    /// While inside <see cref="IPageLayoutCallback.Notify(PageLayoutCallbackArgs)"/> throw custom exception to abort process.<para/>
    /// You can throw when handling any callback event to abort process.<para/>
    /// Note that if process is aborted the page layout model remains in undefined state. If process is aborted upon reflow of a complete page,
    /// however, it should be possible to use layout model up to the end of that page.<para/>
    /// </summary>
    WatchDog,
    /// <summary>
    /// Build of the page layout has started. Fired once.
    /// This is the first event which occurs when <see cref="Document.UpdatePageLayout"/> is called.
    /// </summary>
    BuildStarted,
    /// <summary>
    /// Build of the page layout has finished. Fired once.
    /// This is the last event which occurs when <see cref="Document.UpdatePageLayout"/> is called.
    /// </summary>
    BuildFinished,
    /// <summary>
    /// Conversion of document model to page layout has started. Fired once.
    /// This occurs when layout model starts pulling document content.
    /// </summary>
    ConversionStarted,
    /// <summary>
    /// Conversion of document model to page layout has finished. Fired once.
    /// This occurs when layout model stops pulling document content.
    /// </summary>
    ConversionFinished,
    /// <summary>
    /// Reflow of the page layout has started. Fired once.
    /// This occurs when layout model starts reflowing document content.
    /// </summary>
    ReflowStarted,
    /// <summary>
    /// Reflow of the page layout has finished. Fired once.
    /// This occurs when layout model stops reflowing document content.
    /// </summary>
    ReflowFinished,
    /// <summary>
    /// Reflow of the page has started.
    /// Note that page may reflow multiple times and that reflow may restart before it is finished.
    /// <seealso cref="PageLayoutCallbackArgs.PageIndex"/>
    /// </summary>
    PartReflowStarted,
    /// <summary>
    /// Reflow of the page has finished.
    /// Note that page may reflow multiple times and that reflow may restart before it is finished.
    /// <seealso cref="PageLayoutCallbackArgs.PageIndex"/>
    /// </summary>
    PartReflowFinished,
    /// <summary>
    /// Rendering of page has started. This is fired once per page.
    /// </summary>
    PartRenderingStarted,
    /// <summary>
    /// Rendering of page has finished. This is fired once per page.
    /// </summary>
    PartRenderingFinished,
}
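
The abort-by-exception pattern described for the WatchDog event can be sketched in plain Java. This is a toy model under stated assumptions, not the Aspose.Words API: `LayoutWatchdogSketch`, its `Event` enum, the `Callback` interface, and the `simulateLayout` driver are hypothetical stand-ins for `PageLayoutEvent`, `IPageLayoutCallback`, and the real layout engine. The sketch records finished pages and throws a custom exception from the watchdog checkpoint once a page limit is reached.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; none of these types belong to Aspose.Words.
class LayoutWatchdogSketch {

    // Reduced, hypothetical mirror of a few PageLayoutEvent values.
    enum Event { BUILD_STARTED, WATCH_DOG, PART_REFLOW_FINISHED, BUILD_FINISHED }

    // Hypothetical mirror of the callback interface.
    interface Callback {
        void onLayoutEvent(Event event, int pageIndex);
    }

    // Custom exception used to abort the (simulated) layout process.
    static class LayoutAbortedException extends RuntimeException {
        LayoutAbortedException(String message) {
            super(message);
        }
    }

    // Aborts at the next watchdog checkpoint once maxPages pages have finished reflow.
    static class PageLimitCallback implements Callback {
        private final int maxPages;
        final List<Integer> finishedPages = new ArrayList<>();

        PageLimitCallback(int maxPages) {
            this.maxPages = maxPages;
        }

        @Override
        public void onLayoutEvent(Event event, int pageIndex) {
            // A negative index means the event has no associated page.
            if (event == Event.PART_REFLOW_FINISHED && pageIndex >= 0) {
                finishedPages.add(pageIndex);
            }
            if (event == Event.WATCH_DOG && finishedPages.size() >= maxPages) {
                throw new LayoutAbortedException("page limit reached");
            }
        }
    }

    // Toy driver: fires events for pageCount pages; returns false if the callback aborted.
    static boolean simulateLayout(int pageCount, Callback callback) {
        try {
            callback.onLayoutEvent(Event.BUILD_STARTED, -1);
            for (int page = 0; page < pageCount; page++) {
                callback.onLayoutEvent(Event.WATCH_DOG, -1);
                callback.onLayoutEvent(Event.PART_REFLOW_FINISHED, page);
            }
            callback.onLayoutEvent(Event.BUILD_FINISHED, -1);
            return true;
        } catch (LayoutAbortedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        PageLimitCallback callback = new PageLimitCallback(3);
        boolean completed = simulateLayout(10, callback);
        System.out.println("completed: " + completed + ", pages seen: " + callback.finishedPages);
    }
}
```

Aborting only after a page has finished reflow matches the note that the layout model should remain usable up to the end of that page.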

Added public property CleanupOptions.DuplicateStyle

Related issue: WORDSNET-20094

A new public property DuplicateStyle has been added to the CleanupOptions class:

.NET

/// <summary>
/// Gets/sets a flag indicating whether duplicate styles should be removed from document.
/// Default value is <b>false</b>.
/// </summary>
public bool DuplicateStyle { get; set; }

Use Case:

.NET

Document doc = new Document(fileName);
CleanupOptions options = new CleanupOptions();
options.DuplicateStyle = true;
doc.Cleanup(options);
doc.Save(outFileName);

Added a new public method FontInfo.GetEmbeddedFontAsOpenType()

Related issue: WORDSNET-20249

A new method GetEmbeddedFontAsOpenType() has been added to the FontInfo class. It converts embedded fonts in Embedded OpenType (EOT) format, which come from .doc documents, to OpenType.

.NET

/// <summary>
/// Gets an embedded font file in OpenType format. Fonts in Embedded OpenType format are converted to OpenType.
/// </summary>
/// <param name="style">Specifies the font style to retrieve.</param>
/// <returns>Returns <c>null</c> if the specified font is not embedded.</returns>
public byte[] GetEmbeddedFontAsOpenType(EmbeddedFontStyle style)
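
When consuming the returned bytes, a quick sanity check can confirm that they look like an OpenType font before they are written to disk or passed on. The sketch below is a JDK-only helper, not part of Aspose.Words: the class `SfntTagSketch` and its method are hypothetical, and it relies only on the OpenType convention that a font file starts with the sfnt version tag 0x00010000 (TrueType outlines) or 'OTTO' (CFF outlines). These notes do not state which of the two the method produces.

```java
// Hypothetical helper; it does not call any Aspose.Words API.
class SfntTagSketch {

    // Returns true if the data starts with an OpenType sfnt version tag:
    // 0x00010000 (TrueType outlines) or 'OTTO' (CFF outlines).
    static boolean looksLikeOpenType(byte[] fontData) {
        if (fontData == null || fontData.length < 4) {
            return false;
        }
        // Read the first four bytes as a big-endian 32-bit tag.
        int tag = ((fontData[0] & 0xFF) << 24)
                | ((fontData[1] & 0xFF) << 16)
                | ((fontData[2] & 0xFF) << 8)
                | (fontData[3] & 0xFF);
        return tag == 0x00010000 || tag == 0x4F54544F;
    }

    public static void main(String[] args) {
        byte[] trueTypeHeader = {0x00, 0x01, 0x00, 0x00};
        byte[] zipHeader = {0x50, 0x4B, 0x03, 0x04};
        System.out.println(looksLikeOpenType(trueTypeHeader));
        System.out.println(looksLikeOpenType(zipHeader));
    }
}
```

A null result from the method (the font is not embedded) is treated here as "not OpenType", so callers can handle both cases with one check.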

Added a new helper class to work with watermarks inside a document

Related issue: WORDSNET-4879

The new property Watermark has been added to the Document class.

.NET

/// <summary>
/// Provides access to the document watermark.
/// </summary>
public Watermark Watermark { get; }

The new Watermark class lets you add a watermark to a document or remove it. A watermark can be created from text or from an image.

.NET

/// <summary>
/// Represents a class for working with a document watermark.
/// </summary>
public sealed class Watermark
{
    /// <summary>
    /// Adds Text watermark into the document.
    /// </summary>
    /// <param name="text">Text that displays as a watermark.</param>
    /// <remarks>
    /// The text length should be in the range from 1 to 200 inclusive.
    /// The text cannot be null or consist only of whitespaces.
    /// </remarks>
    /// <exception cref="ArgumentOutOfRangeException">
    /// Throws when the text length is out of range or the text consists only of whitespaces.
    /// </exception>
    /// <exception cref="ArgumentNullException">
    /// Throws when the text is null.
    /// </exception>
    public void SetText(string text)
    /// <summary>
    /// Adds Text watermark into the document.
    /// </summary>
    /// <param name="text">Text that displays as a watermark.</param>
    /// <param name="options">Defines additional options for the text watermark.</param>
    /// <remarks>
    /// The text length should be in the range from 1 to 200 inclusive.
    /// The text cannot be null or consist only of whitespaces.
    /// </remarks>
    /// <exception cref="ArgumentOutOfRangeException">
    /// Throws when the text length is out of range or the text consists only of whitespaces.
    /// </exception>
    /// <exception cref="ArgumentNullException">
    /// Throws when the text is null.
    /// </exception>
    /// <remarks>If options is null, the watermark will be set with default properties.</remarks>
    public void SetText(string text, TextWatermarkOptions options)
    /// <summary>
    /// Adds Image watermark into the document.
    /// </summary>
    /// <param name="image">Image that displays as a watermark.</param>
    /// <exception cref="ArgumentNullException">
    /// Throws when the image is null.
    /// </exception>
    public void SetImage(Image image)
    /// <summary>
    /// Adds Image watermark into the document.
    /// </summary>
    /// <param name="image">Image that displays as a watermark.</param>
    /// <param name="options">Defines additional options for the image watermark.</param>
    /// <exception cref="ArgumentNullException">
    /// Throws when the image is null.
    /// </exception>
    /// <remarks>If options is null, the watermark will be set with default properties.</remarks>
    public void SetImage(Image image, ImageWatermarkOptions options)
    /// <summary>
    /// Removes watermark.
    /// </summary>
    public void Remove()
    /// <summary>
    /// Returns watermark type.
    /// </summary>
    public WatermarkType Type { get; }
}
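
The documented constraints on watermark text (not null, 1 to 200 characters, not whitespace only) can be checked up front so that invalid input is rejected before SetText is called. The following is a minimal sketch of those rules in plain Java. The class `WatermarkTextRules` and its `check` method are hypothetical and are not part of Aspose.Words, and Java's notion of whitespace may differ slightly from the one the library applies.

```java
// Hypothetical pre-validation helper mirroring the documented SetText constraints.
class WatermarkTextRules {

    static final int MAX_LENGTH = 200;

    // Returns null when the text satisfies the documented rules,
    // otherwise a short description of the violated rule.
    static String check(String text) {
        if (text == null) {
            return "text is null";
        }
        if (text.isEmpty() || text.length() > MAX_LENGTH) {
            return "text length must be between 1 and " + MAX_LENGTH;
        }
        // Assumption: Character.isWhitespace approximates the library's whitespace test.
        if (text.chars().allMatch(Character::isWhitespace)) {
            return "text consists only of whitespace";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("Test"));
        System.out.println(check("   "));
        System.out.println(check(null));
    }
}
```

Returning a description instead of throwing keeps the helper usable in input forms, where the caller decides how to report the problem.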

The new enum is provided to determine the type of watermark inside the document.

.NET

/// <summary>
/// Specifies the watermark type.
/// </summary>
public enum WatermarkType
{
    /// <summary>
    /// Indicates that the text will be used as a watermark.
    /// <p>Such a watermark corresponds to a WordArt object.</p>
    /// </summary>
    Text,
    /// <summary>
    /// Indicates that the image will be used as a watermark.
    /// <p>Such a watermark corresponds to a shape with image.</p>
    /// </summary>
    Image,
    /// <summary>
    /// Indicates that the watermark is not set.
    /// </summary>
    None
}

The following option classes are provided to customize the watermark.

For Text watermark:

.NET

/// <summary>
/// Contains options that can be specified when adding a watermark with text.
/// </summary>
public class TextWatermarkOptions
{
    /// <summary>
    /// Gets or sets font family name. The default value is "Calibri".
    /// </summary>
    public string FontFamily { get; set; }
    /// <summary>
    /// Gets or sets font color. The default value is Color.Silver.
    /// </summary>
    public Color Color { get; set; }
    /// <summary>
    /// Gets or sets a font size. The default value is 0 - auto.
    /// </summary>
    /// <remarks>
    /// <p>Valid values range from 0 to 65.5 inclusive.</p>
    /// <p> Auto font size means that the watermark will be scaled to its max width and max height relative to
    /// the page margins.</p>
    /// </remarks>
    /// <exception cref="ArgumentOutOfRangeException">
    /// Throws when argument was out of the range of valid values.
    /// </exception>
    public float FontSize { get; set; }
    /// <summary>
    /// Gets or sets a boolean value which is responsible for opacity of the watermark.
    /// The default value is <code>true</code>.
    /// </summary>
    public bool IsSemitrasparent { get; set; }
    /// <summary>
    /// Gets or sets layout of the watermark. The default value is <see cref="WatermarkLayout.Diagonal"/>.
    /// </summary>
    public WatermarkLayout Layout { get; set; }
}

The new enum is provided to set the text watermark in a diagonal or horizontal layout.

.NET

/// <summary>
/// Defines layout of the watermark relative to the watermark center.
/// </summary>
public enum WatermarkLayout
{
    /// <summary>
    /// Horizontal watermark layout. Corresponds to 0 degrees of rotation.
    /// </summary>
    Horizontal = 0,
    /// <summary>
    /// Diagonal watermark layout. Corresponds to 315 degrees of rotation.
    /// </summary>
    Diagonal = 315
}

For Image watermark:

.NET

/// <summary>
/// Contains options that can be specified when adding a watermark with image.
/// </summary>
public class ImageWatermarkOptions
{
    /// <summary>
    /// Gets or sets the scale factor expressed as a fraction of the image. The default value is 0 - auto.
    /// </summary>
    /// <remarks>
    /// <p>Valid values range from 0 to 65.5 inclusive.</p>
    /// <p>Auto scale means that the watermark will be scaled to its max width and max height relative to
    /// the page margins.</p>
    /// </remarks>
    /// <exception cref="ArgumentOutOfRangeException">
    /// Throws when argument was out of the range of valid values.
    /// </exception>
    public double Scale { get; set; }
    /// <summary>
    /// Gets or sets a boolean value which is responsible for washout effect of the watermark.
    /// The default value is <code>true</code>.
    /// </summary>
    public bool IsWashout { get; set; }
}

Use Case: Add Text watermark with specific options.

.NET

Document doc = new Document(pathFile);
TextWatermarkOptions options = new TextWatermarkOptions()
{
    FontFamily = "Arial",
    FontSize = 36,
    Color = Color.Black,
    Layout = WatermarkLayout.Horizontal,
    IsSemitrasparent = false
};
doc.Watermark.SetText("Test", options);

Use Case: Add Image watermark with specific options.

.NET

Document doc = new Document(pathFile);
ImageWatermarkOptions options = new ImageWatermarkOptions()
{
    Scale = 5,
    IsWashout = false
};
doc.Watermark.SetImage(Image.FromFile(filePath), options);

Use Case: Remove the watermark from the document.

.NET

Document doc = new Document(pathFile);
if (doc.Watermark.Type == WatermarkType.Text)
    doc.Watermark.Remove();

Added a new public property Document.ShowGrammaticalErrors

Related issue: WORDSNET-10404

A new public option has been added to the ‘Document’ class:

.NET

/// <summary>
/// Specifies whether to display grammar errors in this document.
/// </summary>
public bool ShowGrammaticalErrors

Use Case: Show grammar errors in the document.

.NET

Document doc = new Document("in.doc");
doc.ShowGrammaticalErrors = true;
doc.Save("out.doc");

Added a new public property Document.ShowSpellingErrors

Related issue: WORDSNET-10403

A new public option has been added to the ‘Document’ class:

.NET

/// <summary>
/// Specifies whether to display spelling errors in this document.
/// </summary>
public bool ShowSpellingErrors

Use Case: Show spelling errors in the document.

.NET

Document doc = new Document("in.doc");
doc.ShowSpellingErrors = true;
doc.Save("out.doc");