Specify Load Options
When loading a document, you can set some advanced properties. Aspose.Words provides you with the LoadOptions class, which allows more precise control of the load process. Some load formats have a corresponding class that holds load options for this load format, for example, there is PdfLoadOptions for loading to PDF format or TxtLoadOptions for loading to TXT. This article provides examples of working with options of the LoadOptions class.
Set Microsoft Word Version to Change the Appearance
Different versions of the Microsoft Word application can display documents in differently. For example, there is a well-known problem with OOXML documents such as DOCX or DOTX produced using WPS Office. In such cases, essential document markup elements may be missing or may be interpreted differently causing Microsoft Word 2019 to show such a document differently compared to Microsoft Word 2010.
By default Aspose.Words opens documents using Microsoft Word 2019 rules. If you need to to make document loading appear as it would happen in one of the previous Microsoft Word application versions, you should explicitly specify the desired version using the MswVersion property of the LoadOptions class.
The following code example shows how to set the Microsoft Word version with load options:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Specify load option to specify MS Word version | |
LoadOptions loadOptions = new LoadOptions(); | |
loadOptions.setMswVersion(MsWordVersion.WORD_2003); | |
Document doc = new Document(dataDir + "document.doc", loadOptions); | |
doc.save(dataDir + "Word2003_out.docx"); |
Set Language Preferences to Change the Appearance
The details of displaying a document in Microsoft Word depend not only on the application version and the MswVersion property value but also on the language settings. Microsoft Word may show documents differently depending on the “Office Language Preferences” dialog settings, that can be found in “File → Options → Languаge”. Using this dialog a user can select, for example, primary language, proofing languages, display languages, and so on. Aspose.Words provides the LanguagePreferences property as the equivalent of this dialog. If Aspose.Words output differs from the Microsoft Word output, set the appropriate value for EditingLanguage – this can improve the output document.
The following code example shows how to set Japanese as EditingLanguage:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Specify LoadOptions to add Editing Language | |
LoadOptions loadOptions = new LoadOptions(); | |
loadOptions.getLanguagePreferences().addEditingLanguage(EditingLanguage.JAPANESE); | |
Document doc = new Document(dataDir + "languagepreferences.docx", loadOptions); | |
int localeIdFarEast = doc.getStyles().getDefaultFont().getLocaleIdFarEast(); | |
if (localeIdFarEast == (int) EditingLanguage.JAPANESE) | |
System.out.println("The document either has no any FarEast language set in defaults or it was set to Japanese originally."); | |
else | |
System.out.println("The document default FarEast language was set to another than Japanese language originally, so it is not overridden."); |
Use WarningCallback to Control Problems While Loading a Document
Some documents may be corrupted, contain invalid entries, or have features not currently supported by Aspose.Words. If you want to know about problems that occurred while loading a document, Aspose.Words provides the IWarningCallback interface.
The following code example shows the implementation of the IWarningCallback interface:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
private static class DocumentLoadingWarningCallback implements IWarningCallback { | |
public void warning(WarningInfo info) { | |
// Prints warnings and their details as they arise during document loading. | |
System.out.println("WARNING: " + info.getWarningType() + " source:" + info.getSource()); | |
System.out.println("\tDescription: " + info.getDescription()); | |
} | |
} |
To get information about all problems throughout the load time, use the WarningCallback property.
The following code example shows how to use this property:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Create a new LoadOptions object and set its WarningCallback property. | |
LoadOptions loadOptions = new LoadOptions(); | |
loadOptions.setWarningCallback(new DocumentLoadingWarningCallback()); | |
Document doc = new Document(dataDir + "input.docx", loadOptions); |
Use ResourceLoadingCallback to Control the External Resources Loading
A document may contain external links to images located somewhere on a local disk, network, or Internet. Aspose.Words automatically loads such images into a document, but there are situations when this process needs to be controlled. For example, to decide whether we really need to load a certain image or perhaps skip it. The ResourceLoadingCallback load option allows you to control this.
The following code example shows the implementation of the IResourceLoadingCallback interface:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
private static class HtmlLinkedResourceLoadingCallback implements IResourceLoadingCallback { | |
public int resourceLoading(ResourceLoadingArgs args) throws Exception { | |
switch (args.getResourceType()) { | |
case ResourceType.CSS_STYLE_SHEET: { | |
System.out.println("External CSS Stylesheet found upon loading: " + args.getOriginalUri()); | |
// CSS file will don't used in the document | |
return ResourceLoadingAction.SKIP; | |
} | |
case ResourceType.IMAGE: { | |
// Replaces all images with a substitute | |
String newImageFilename = "Logo.jpg"; | |
System.out.println("\tImage will be substituted with: " + newImageFilename); | |
BufferedImage newImage = ImageIO | |
.read(new File(Utils.getDataDir(LoadOptionsCallbacks.class) + newImageFilename)); | |
ByteArrayOutputStream baos = new ByteArrayOutputStream(); | |
ImageIO.write(newImage, "jpg", baos); | |
baos.flush(); | |
byte[] imageBytes = baos.toByteArray(); | |
baos.close(); | |
args.setData(imageBytes); | |
// New images will be used instead of presented in the document | |
return ResourceLoadingAction.USER_PROVIDED; | |
} | |
case ResourceType.DOCUMENT: { | |
System.out.println("External document found upon loading: " + args.getOriginalUri()); | |
// Will be used as usual | |
return ResourceLoadingAction.DEFAULT; | |
} | |
default: | |
throw new Exception("Unexpected ResourceType value."); | |
} | |
} | |
} |
The following code example shows how to use the ResourceLoadingCallback property:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Create a new LoadOptions object and set its ResourceLoadingCallback attribute | |
// as an instance of our IResourceLoadingCallback implementation | |
LoadOptions loadOptions = new LoadOptions(); | |
loadOptions.setResourceLoadingCallback(new HtmlLinkedResourceLoadingCallback()); | |
// When we open an Html document, external resources such as references to CSS | |
// stylesheet files and external images | |
// will be handled in a custom manner by the loading callback as the document is | |
// loaded | |
Document doc = new Document(dataDir + "Images.html", loadOptions); | |
doc.save(dataDir + "Document.LoadOptionsCallback_out.pdf"); |
Use TempFolder to Avoid a Memory Exception
Aspose.Words supports extremely large documents that have thousands of pages full of rich content. Loading such documents may require much RAM. In the process of loading, Aspose.Words needs even more memory to hold temporary structures used to parse a document.
If you have a problem with the Out of Memory exception while loading a document, try to use the TempFolder property. In this case, Aspose.Words will store some data in temporary files instead of memory, and this can help avoid such an exception.
The following code example shows how to set TempFolder:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Specify LoadOptions to set Temp Folder | |
LoadOptions lo = new LoadOptions(); | |
lo.setTempFolder("C:\\TempFolder\\"); | |
Document doc = new Document(dataDir + "document.doc", lo); |
Set the Encoding Explicitly
Most modern document formats store their content in Unicode and do not require special handling. On the other hand, there are still many documents that use some pre-Unicode encoding and sometimes either miss encoding information or do not even support encoding information by nature. Aspose.Words tries to automatically detect the appropriate encoding by default, but in a rare case you may need to use an encoding different from the one detected by our encoding recognition algorithm. In this case, use the Encoding property to get or set the encoding.
The following code example shows how to set the encoding to override the automatically chosen encoding:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Set the Encoding attribute in a LoadOptions object to override the | |
// automatically chosen encoding with the one we know to be correct | |
LoadOptions loadOptions = new LoadOptions(); | |
loadOptions.setEncoding(java.nio.charset.Charset.forName("UTF-8")); | |
Document doc = new Document(dataDir + "Encoded in UTF-8.txt", loadOptions); |
Load Encrypted Documents
You can load Word documents encrypted with a password. To do this, use a special constructor overload, which accepts a LoadOptions object. This object contains the Password property, which specifies the password string.
The following code example shows how to load a document encrypted with a password:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// For complete examples and data files, please go to | |
// https://github.com/aspose-words/Aspose.Words-for-Java | |
// Load the encrypted document from the absolute path on disk. | |
Document doc = new Document(dataDir + "LoadEncrypted.docx", new LoadOptions("aspose")); |
If you do not know in advance whether the file is encrypted, you can use the FileFormatUtil class, which provides utility methods for working with file formats, such as detecting the file format or converting file extensions to/from file format enumerations. To detect if the document is encrypted and requires a password to open it, use the IsEncrypted property.
The following code example shows how to verify OpenDocument either it is encrypted or not:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
FileFormatInfo info = FileFormatUtil.detectFileFormat(dataDir + "encrypted.odt"); | |
System.out.println(info.isEncrypted()); |