Clean Up a Document

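Sometimes you may need to remove unused or duplicate information to reduce the size of the output document and the time it takes to process.

Although you can find and remove unused data (such as styles or lists) or duplicate information manually, it is more convenient to use the features that Aspose.Words provides for this purpose.

The CleanupOptions class lets you specify options for document cleanup. To remove duplicate styles, or only unused styles or lists, from a document, pass a CleanupOptions object to the document's cleanup method.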

Remove Unused Information from a Document

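You can use the unused_styles and unused_builtin_styles properties to detect and remove styles marked as "unused".

You can use the unused_lists property to detect and remove lists and list definitions marked as "unused".

The following code example shows how to remove only unused styles from a document: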

# For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Python-via-.NET
import aspose.words as aw

# docs_base is not defined in this snippet; it is presumably a helper from the
# examples repository that supplies the input (my_dir) and output (artifacts_dir) folders.
doc = aw.Document(docs_base.my_dir + "Unused styles.docx")
# A custom style is marked as "used" only while some text in the document
# is formatted in that style.
print(f"Count of styles before Cleanup: {doc.styles.count}\n" +
      f"Count of lists before Cleanup: {doc.lists.count}")
# Remove unused styles but keep unused lists, as specified by the CleanupOptions.
cleanup_options = aw.CleanupOptions()
cleanup_options.unused_lists = False
cleanup_options.unused_styles = True
doc.cleanup(cleanup_options)
print(f"Count of styles after Cleanup was decreased: {doc.styles.count}\n" +
      f"Count of lists after Cleanup is the same: {doc.lists.count}")
doc.save(docs_base.artifacts_dir + "WorkingWithDocumentOptionsAndSettings.cleanup_unused_styles_and_lists.docx")
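
To make the difference between the unused_styles and unused_builtin_styles flags concrete, here is a minimal plain-Python model of the idea. It is not Aspose.Words code and does not claim to reproduce the library's internal logic: the Style class and the remove_unused_styles function are hypothetical names invented for this illustration, and the sketch assumes that a style counts as "used" simply when its name appears in a set of referenced names.

```python
from dataclasses import dataclass


@dataclass
class Style:
    """Hypothetical stand-in for a document style (not an Aspose.Words type)."""
    name: str
    built_in: bool = False


def remove_unused_styles(styles, used_names,
                         unused_styles=True, unused_builtin_styles=False):
    """Return the styles that survive cleanup under the given flags."""
    kept = []
    for style in styles:
        if style.name in used_names:
            kept.append(style)   # referenced styles are always kept
        elif style.built_in and not unused_builtin_styles:
            kept.append(style)   # unused built-in style, flag is off
        elif not style.built_in and not unused_styles:
            kept.append(style)   # unused custom style, flag is off
    return kept


styles = [
    Style("Normal", built_in=True),
    Style("Heading 1", built_in=True),
    Style("MyStyle1"),
    Style("MyStyle2"),
]
used = {"Normal", "MyStyle1"}

# Only unused custom styles are dropped.
after_custom = remove_unused_styles(styles, used)
print([s.name for s in after_custom])   # ['Normal', 'Heading 1', 'MyStyle1']

# Unused built-in styles are dropped as well.
after_all = remove_unused_styles(styles, used, unused_builtin_styles=True)
print([s.name for s in after_all])      # ['Normal', 'MyStyle1']
```

In this model the two flags are independent, which is why the example above can remove custom styles while leaving built-in styles and lists untouched.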

Remove Duplicate Information from a Document

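You can also use the duplicate_style property to replace all duplicate styles with the original one and remove the duplicates from the document.

The following code example shows how to remove duplicate styles from a document: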

# For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Python-via-.NET
import aspose.words as aw

# docs_base is not defined in this snippet; it is presumably a helper from the
# examples repository that supplies the input (my_dir) and output (artifacts_dir) folders.
doc = aw.Document(docs_base.my_dir + "Document.docx")
# Count of styles before Cleanup.
print(doc.styles.count)
# Cleans duplicate styles from the document.
options = aw.CleanupOptions()
options.duplicate_style = True
doc.cleanup(options)
# Count of styles after Cleanup was decreased.
print(doc.styles.count)
doc.save(docs_base.artifacts_dir + "WorkingWithDocumentOptionsAndSettings.cleanup_duplicate_style.docx")
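
The phrase "replace all duplicate styles with the original one" can be sketched in plain Python as follows. This is a conceptual model only, not the Aspose.Words implementation: merge_duplicate_styles is a hypothetical helper, styles are modelled as dictionaries of formatting attributes, and two styles are assumed to be duplicates when their formatting is identical, with the first one encountered treated as the original.

```python
def merge_duplicate_styles(styles, paragraph_styles):
    """Collapse styles with identical formatting onto the first one seen.

    styles: dict mapping style name -> dict of formatting attributes
    paragraph_styles: list of style names, one per paragraph
    Returns (cleaned_styles, remapped_paragraph_styles).
    """
    first_seen = {}    # formatting key -> name of the original style
    replacement = {}   # duplicate style name -> original style name
    for name, formatting in styles.items():
        key = tuple(sorted(formatting.items()))
        if key in first_seen:
            replacement[name] = first_seen[key]
        else:
            first_seen[key] = name

    # Drop the duplicates and point every paragraph at the original style.
    cleaned = {n: f for n, f in styles.items() if n not in replacement}
    remapped = [replacement.get(n, n) for n in paragraph_styles]
    return cleaned, remapped


styles = {
    "Body": {"font": "Arial", "size": 11},
    "Body Copy": {"font": "Arial", "size": 11},   # same formatting as "Body"
    "Title": {"font": "Arial", "size": 24},
}
paragraphs = ["Title", "Body", "Body Copy"]

cleaned, remapped = merge_duplicate_styles(styles, paragraphs)
print(list(cleaned))   # ['Body', 'Title']
print(remapped)        # ['Title', 'Body', 'Body']
```

The point of the sketch is that content formatted with a duplicate is re-pointed to the original before the duplicate is dropped, so the style count decreases without any text losing its formatting.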