Working with Table of Contents
Often you will work with documents containing a table of contents (TOC). Using Aspose.Words you can insert your own table of contents or completely rebuild the existing table of contents in the document using just a few lines of code.
This article outlines how to work with the table of contents field and demonstrates:
- How to insert a brand new TOC.
- How to update new or existing TOCs in the document.
- How to specify switches to control the formatting and overall structure of the TOC.
- How to modify the styles and appearance of the table of contents.
- How to remove an entire TOC field along with all entries from the document.
Insert Table of Contents Programmatically
You can insert a TOC (table of contents) field into the document at the current position by calling the InsertTableOfContents method.
A table of contents in a Word document can be built in several ways and formatted using a variety of options. The field switches that you pass to the method control the way the table is built and displayed in your document.
The default switches used in a TOC inserted in Microsoft Word are \o "1-3" \h \z \u. Descriptions of these switches, as well as a list of supported switches, can be found later in the article. You can either use that guide to obtain the correct switches or, if you already have a document containing a TOC similar to the one you want, show the field codes (ALT+F9) and copy the switches directly from the field.
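When the switches are written as a Java string literal, every backslash and every quotation mark has to be escaped, which is easy to get wrong. The sketch below assembles the switch text from parts instead. TocSwitchBuilder and its methods are hypothetical helpers introduced only for illustration; they are not part of Aspose.Words.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Aspose.Words) that assembles TOC switch text. */
final class TocSwitchBuilder {
    private final List<String> parts = new ArrayList<>();

    /** Adds \o "from-to": the range of heading levels to include. */
    TocSwitchBuilder headingLevels(int from, int to) {
        parts.add("\\o \"" + from + "-" + to + "\"");
        return this;
    }

    /** Adds a switch that takes no argument, for example 'h', 'z' or 'u'. */
    TocSwitchBuilder flag(char name) {
        parts.add("\\" + name);
        return this;
    }

    /** Joins the collected switches with single spaces. */
    String build() {
        return String.join(" ", parts);
    }

    /** The Microsoft Word defaults quoted above. */
    static String wordDefaults() {
        return new TocSwitchBuilder()
                .headingLevels(1, 3)
                .flag('h').flag('z').flag('u')
                .build();
    }

    public static void main(String[] args) {
        // Prints the switch text exactly as it would appear in the field code.
        System.out.println(wordDefaults());
    }
}
```

The resulting string is the kind of value that would be passed as the switches argument when inserting the TOC field.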
The following code example shows how to insert a table of contents (TOC) field into a document using heading styles as entries:
The code inserts the new table of contents into a blank document. The DocumentBuilder class is then used to insert some sample content formatted with the appropriate heading styles, which mark the content to be included in the TOC. The next lines then populate the TOC by updating the fields and the page layout of the document.
Without these update calls, the document contains a TOC field, but with no visible content. This is because the TOC field has been inserted but is not populated until it is updated in the document. Further information about this is given in the next section.
Update Table of Contents
Aspose.Words allows you to completely update a TOC with only a few lines of code. This can be done to populate a newly inserted TOC or to update an existing TOC after changes to the document have been made.
The following two methods must be used to update the TOC fields in the document:
- Document.updateFields()
- Document.updatePageLayout()
Please note that these two update methods must be called in that order. If the order is reversed, the table of contents will be populated but no page numbers will be displayed. A document can contain any number of TOCs; these methods automatically update all of them.
The following code example shows how to completely rebuild TOC fields in the document by invoking field updates:
The first call to Document.updateFields() builds the TOC: all text entries are populated and the TOC appears almost complete. The only thing missing is the page numbers, which for now are displayed as "?".
The second call to Document.updatePageLayout() will build the layout of the document in memory. This needs to be done to gather the page numbers of the entries. The correct page numbers calculated from this call are then inserted into the TOC.
Use Switches to Control Table of Contents Behavior
As with any other field, the TOC field can accept switches defined within the field code that control how the table of contents is built. Certain switches control which entries are included and at what level, while others control the appearance of the TOC. Switches can be combined to produce a complex table of contents.
By default, the switches \o "1-3" \h \z \u are included when inserting a default TOC in the document. A TOC with no switches will include content from the built-in heading styles (as if the \O switch is set).
The TOC switches that are supported by Aspose.Words are listed below and their uses are described in detail. They are divided into separate sections based on their type. The switches in the first section define what content to include in the TOC and the switches in the second section control the appearance of the TOC.
A switch that is not listed here is currently unsupported. Support for the remaining switches is planned for future versions, and further support is added with every release.
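When switches are copied from an existing field code, it can help to list them before comparing them against the supported switches. The sketch below is a minimal, library-free parser. TocSwitchParser is a hypothetical helper, not an Aspose.Words class, and it assumes each switch is a backslash followed by a single letter with at most one quoted or bare argument.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Aspose.Words) that lists the switches in a field code. */
final class TocSwitchParser {
    // A backslash and one letter, optionally followed by a quoted or bare argument.
    private static final Pattern SWITCH =
            Pattern.compile("\\\\([A-Za-z])(?:\\s+(?:\"([^\"]*)\"|([^\\s\\\\\"]+)))?");

    /** Maps each lower-cased switch letter to its argument ("" when it has none). */
    static Map<String, String> parse(String fieldCode) {
        Map<String, String> switches = new LinkedHashMap<>();
        Matcher m = SWITCH.matcher(fieldCode);
        while (m.find()) {
            String arg = "";
            if (m.group(2) != null) {
                arg = m.group(2);   // argument given in quotation marks
            } else if (m.group(3) != null) {
                arg = m.group(3);   // argument given without quotation marks
            }
            switches.put(m.group(1).toLowerCase(), arg);
        }
        return switches;
    }

    public static void main(String[] args) {
        // Lists the default switches with their arguments.
        System.out.println(parse("TOC \\o \"1-3\" \\h \\z \\u"));
    }
}
```

Switch letters are lower-cased so that \O and \o are treated as the same switch.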
Entry Marking Switches
Switch | Description |
---|---|
Heading Styles (\O Switch) | This switch defines that the TOC is built from the built-in heading styles. |
Outline Levels (\U Switch) | Each paragraph can define an outline level under Paragraph options. With this switch, paragraphs that have an outline level set are collected as TOC entries. Note that built-in heading styles such as Heading 1 have an outline level that is always set in the style settings. |
Custom Styles (\T Switch) | This switch allows custom styles to be used when collecting entries for the TOC. It is often used in conjunction with the \O switch to include custom styles along with built-in heading styles. For example, the switch can specify that content styled with CustomHeading1 is used as level 1 content in the TOC. |
Use TC Fields (\F and \L Switches) | In older versions of Microsoft Word, the only way to build a TOC was to use TC fields. These fields can be inserted into a document at any position like any other field. With the \F switch, the TOC is built from TC fields; when an identifier follows the switch, only TC fields with the same identifier are included. The TC field itself accepts the following switches: \F, explained above; \L, which defines the level in the TOC at which the entry appears; \N, which omits the page numbering for the entry. |
Appearance Related Switches
Switch | Description |
---|---|
Omit Page Numbers (\N Switch) | This switch is used to hide page numbers for certain levels of the TOC. For example, you can define the range "3-4" and the page numbers on the entries of levels 3 and 4 will be hidden along with the leader dots (if there are any). To specify only one level, a range should still be used; for example, "1-1" will exclude page numbers only for the first level. |
Insert As Hyperlinks (\H Switch) | This switch specifies that TOC entries are inserted as hyperlinks. |
Set Separator Character (\P Switch) | This switch allows the content separating the title of the entry and the page number to be changed in the TOC. The separator to use should be specified after this switch and enclosed in quotation marks. |
Preserve Tab Entries (\W Switch) | This switch specifies that any entries that contain a tab character, for instance a heading that has a tab at the end of the line, will retain it as a proper tab character when the TOC is populated. This means the function of the tab character will be present in the TOC. |
Preserve New Line Entries (\X Switch) | Similar to the switch above, this switch specifies that headings spanning multiple lines (using newline characters, not separate paragraphs) will be preserved as they are in the generated TOC. For example, a heading that is spread across multiple lines can use the new line character (Shift + Enter in Microsoft Word). |
Insert TC Fields
You can insert a new TC field at the current position of the DocumentBuilder by calling the DocumentBuilder.InsertField method and specifying the field name as "TC" along with any switches that are needed.
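The field code passed to that method is plain text, so it can be composed separately. The sketch below builds such a string for an entry text and an optional \l level. TcFieldCode is a hypothetical helper, not an Aspose.Words class, and it assumes the entry text contains no quotation marks.

```java
/** Hypothetical helper (not part of Aspose.Words) that composes the text of a TC field code. */
final class TcFieldCode {
    /**
     * @param entryText text of the entry; assumed to contain no quotation marks
     * @param level     TOC level written with the \l switch (1 to 9), or 0 to omit the switch
     */
    static String of(String entryText, int level) {
        if (entryText.indexOf('"') >= 0) {
            throw new IllegalArgumentException("Entry text must not contain quotation marks");
        }
        if (level < 0 || level > 9) {
            throw new IllegalArgumentException("Level must be between 0 and 9");
        }
        // Field name first, then the entry text in quotation marks.
        StringBuilder code = new StringBuilder("TC \"").append(entryText).append('"');
        if (level > 0) {
            code.append(" \\l ").append(level);
        }
        return code.toString();
    }

    public static void main(String[] args) {
        // Prints: TC "Entry Text" \l 1
        System.out.println(of("Entry Text", 1));
    }
}
```

Rejecting quotation marks keeps the sketch simple; it avoids having to decide how they should be escaped inside the field argument.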
The following code example shows how to insert a TC field into the document using DocumentBuilder.
Often a specific line of text is designated for the TOC and is marked with a TC field. The easy way to do this in MS Word is to highlight the text and press ALT+SHIFT+O. This automatically creates a TC field using the selected text. The same result can be achieved through code. The code below finds text matching the input and inserts a TC field at the same position as the text. The code is based on the same technique used in the article. The following code example shows how to find text in a document and insert a TC field at its position.
Modify a Table of Contents
Change the Formatting of Styles
The formatting of entries in the TOC does not use the original styles of the marked entries. Instead, each level is formatted using an equivalent TOC style. For example, the first level in the TOC is formatted with the TOC1 style, the second level with the TOC2 style, and so on. This means that to change the look of the TOC, these styles must be modified. In Aspose.Words these styles are represented by the locale-independent StyleIdentifier.TOC1 through to StyleIdentifier.TOC9 and can be retrieved from the Document.Styles collection using these identifiers.
Once the appropriate style of the document has been retrieved the formatting for this style can be modified. Any changes to these styles will be automatically reflected in the TOCs in the document.
The following code example changes a formatting property used in the first level TOC style.
It is also useful to note that any direct formatting of a paragraph (defined on the paragraph itself and not in the style) marked for inclusion in the TOC will be copied over to the entry in the TOC. For example, suppose the Heading 1 style is used to mark content for the TOC, this style has bold formatting, and the paragraph also has italic formatting applied directly to it. The resulting TOC entry will not be bold, as that is part of the style formatting, but it will be italic, as this is applied directly to the paragraph.
You can also control the formatting of the separators used between each entry and the page number. By default, this is a dotted line that is spread across to the page numbering using a tab character and a right tab stop lined up close to the right margin.
Using the Style class retrieved for the particular TOC level you want to modify, you can also modify how these appear in the document.
To change how this appears, first access Style.ParagraphFormat to retrieve the paragraph formatting for the style. From this, the tab stops can be retrieved through ParagraphFormat.TabStops and the appropriate tab stop modified. Using this same technique, the tab itself can be moved or removed altogether.
The following code example shows how to modify the position of the right tab stop in TOC related paragraphs.
Remove a Table of Contents from the Document
A table of contents can be removed from the document by removing all nodes found between the FieldStart and FieldEnd nodes of the TOC field.
The code below demonstrates this. Removing the TOC field is simpler than removing a normal field because we do not keep track of nested fields. Instead, we check whether the FieldEnd node is of type FieldType.FieldTOC, which means we have encountered the end of the current TOC. This technique can be used here without worrying about nested fields, as we can assume that any properly formed document will have no TOC field fully nested inside another TOC field.
First, the FieldStart nodes of each TOC are collected and stored. The specified TOC is then enumerated so that all nodes within the field are visited and stored. The nodes are then removed from the document. The following code example demonstrates how to remove a specified TOC from a document.
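The traversal itself can be illustrated without the library. The sketch below models a document as a flat list of nodes and applies the three steps described above: collect the TOC field starts, walk from the chosen start to the TOC field end, and remove that span. TocRemovalSketch, Kind and Node are hypothetical stand-ins invented for this illustration; they are not Aspose.Words classes, and a real document is a node tree rather than a flat list.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a document's node sequence; not Aspose.Words types. */
final class TocRemovalSketch {
    enum Kind { TEXT, FIELD_START_TOC, FIELD_END_TOC }

    static final class Node {
        final Kind kind;
        final String text;

        Node(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Removes the TOC with the given zero-based index, including its start and end markers. */
    static void removeToc(List<Node> nodes, int tocIndex) {
        // Step 1: collect the position of every TOC field start.
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i++) {
            if (nodes.get(i).kind == Kind.FIELD_START_TOC) {
                starts.add(i);
            }
        }
        if (tocIndex < 0 || tocIndex >= starts.size()) {
            throw new IllegalArgumentException("TOC index out of range: " + tocIndex);
        }

        // Step 2: walk forward from the chosen start until the TOC field end.
        int start = starts.get(tocIndex);
        int end = start;
        while (end < nodes.size() && nodes.get(end).kind != Kind.FIELD_END_TOC) {
            end++;
        }
        if (end == nodes.size()) {
            throw new IllegalStateException("TOC field has no end marker");
        }

        // Step 3: remove everything from the start marker through the end marker.
        nodes.subList(start, end + 1).clear();
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node(Kind.TEXT, "Title"));
        nodes.add(new Node(Kind.FIELD_START_TOC, ""));
        nodes.add(new Node(Kind.TEXT, "Entry 1"));
        nodes.add(new Node(Kind.FIELD_END_TOC, ""));
        nodes.add(new Node(Kind.TEXT, "Body"));
        removeToc(nodes, 0);
        // Only the nodes outside the TOC field remain.
        for (Node n : nodes) {
            System.out.println(n.text);
        }
    }
}
```

Stopping at the first TOC field end relies on the same assumption as the text: no TOC field is fully nested inside another.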
Extract Table of Contents
If you want to extract a table of contents from any Word document, the following code sample can be used.