Working with Artifacts in Python via .NET

Artifacts in PDF are graphics objects or other elements that are not part of the actual content of the document. They are usually used for decoration, layout, or background purposes. Examples of artifacts include page headers, footers, separators, or images that do not convey any meaning.

The purpose of artifacts in PDF is to allow the distinction between content and non-content elements. This is important for accessibility, as screen readers and other assistive technologies can ignore artifacts and focus on the relevant content. Artifacts can also improve the performance and quality of PDF documents, as they can be omitted from printing, searching, or copying.

To create an element as an artifact in PDF, you need to use the Artifact class. It contains following useful properties:

  • custom_type - Gets name of artifact type. May be used if artifact type is non standard.
  • custom_subtype - Gets name of artifact subtype. May be used if artifact subtype is not standard subtype.
  • type - Gets artifact type.
  • subtype - Gets artifact subtype. If artifact has non-standard subtype, name of the subtype may be read via CustomSubtype.
  • contents - Gets collection of artifact internal operators.
  • form - Gets XForm of the artifact (if XForm is used).
  • rectangle - Gets rectangle of the artifact.
  • position - Gets or sets artifact position. If this property is specified, then margins and alignments are ignored.
  • right_margin - Right margin of artifact.If position is specified explicitly (in Position property) this value is ignored.
  • left_margin - Left margin of artifact.If position is specified explicitly (in Position property) this value is ignored.
  • top_margin - Top margin of artifact. If position is specified explicitly (in Position property) this value is ignored.
  • bottom_margin - Bottom margin of artifact.If position is specified explicitly (in Position property) this value is ignored.
  • artifact_horizontal_alignment - Horizontal alignment of artifact. If position is specified explicitly (in Position property) this value is ignored.
  • artifact_vertical_alignment - Vertical alignment of artifact. If position is specified explicitly (in Position property) this value is ignored.
  • rotation - Gets or sets artifact rotation angle.
  • text - Gets text of the artifact.
  • image - Gets image of the artifact (if presents).
  • opacity - Gets or sets opacity of the artifact. Possible values are in range 0..1.
  • lines - Lines of multiline text artifact.
  • text_state - Text state for artifact text.
  • is_background - If true Artifact is placed behind page contents.

The following classes may also be useful for work with artifacts:

Working with Existing Watermarks

A watermark created with Adobe Acrobat is called an artifact (as described in 14.8.2.2 Real Content and Artifacts of the PDF specification).

In order to get all Watermarks on a particular page, the Page class has the ‘artifacts’ property.

The following code snippet shows how to get all watermarks on the first page of a PDF file:


    import aspose.pdf as ap

    path_infile = self.data_dir + infile

    # Open PDF document
    with ap.Document(path_infile) as document:
        # Get the watermarks from the first page artifacts
        watermarks = [
            artifact for artifact in document.pages[0].artifacts
            if artifact.type == ap.Artifact.ArtifactType.PAGINATION and
                artifact.subtype == ap.Artifact.ArtifactSubtype.WATERMARK
        ]

        # Iterate through the found watermark artifacts and print details
        for watermark_item in watermarks:
            print(f"{watermark_item.text} {watermark_item.rectangle}")

The above details are also depicted in the figures below:

Example of Watermarks

Working with Backgrounds as Artifacts

Background images can be used to add a watermark, or other subtle design, to documents. In Aspose.PDF for Python via .NET, each PDF document is a collection of pages and each page contains a collection of artifacts. The BackgroundArtifact class can be used to add a background image to a page object.

The following code snippet shows how to add a background image to PDF pages using the BackgroundArtifact object.


    import aspose.pdf as ap

    # Open PDF document
    with Document(path_infile) as pdf_document:
        # Create a new BackgroundArtifact and set the background image
        background_artifact = ap.BackgroundArtifact()
        background_artifact.background_image = open(path_imagefile, 'rb')

        # Add the background image to the first page's artifacts
        pdf_document.pages[1].artifacts.add(background_artifact)

        # Save PDF document with the added background
        pdf_document.save(path_outfile)

If you want, for some reason, to use a solid color background, please change the previous code in the following manner:


    import aspose.pdf as ap

    # Open PDF document
    with ap.Document(path_infile) as document:
        # Create a new BackgroundArtifact and set the background color
        background_artifact = ap.BackgroundArtifact()
        background_artifact.background_color = ap.Color.dark_khaki

        # Add the background color to the first page's artifacts
        document.pages[1].artifacts.add(background_artifact)

        # Save PDF document
        document.save(path_outfile)

Counting Artifacts of a Particular Type

To calculate the total count of artifacts of a particular type (for example, the total number of watermarks), use the following code:


    import aspose.pdf as ap

    # Open PDF document
    with ap.Document(path_infile) as document:
        # Get pagination artifacts from the first page
        pagination_artifacts = [artifact for artifact in document.pages[1].artifacts
                                if artifact.type == ap.Artifact.ArtifactType.PAGINATION]

        # Count and display the number of each artifact type
        print("Watermarks: {}".format(
            sum(1 for artifact in pagination_artifacts
                if artifact.subtype == ap.Artifact.ArtifactSubtype.WATERMARK)))
        print("Backgrounds: {}".format(
            sum(1 for artifact in pagination_artifacts
                if artifact.subtype == ap.Artifact.ArtifactSubtype.BACKGROUND)))
        print("Headers: {}".format(
            sum(1 for artifact in pagination_artifacts
                if artifact.subtype == ap.Artifact.ArtifactSubtype.HEADER)))
        print("Footers: {}".format(
            sum(1 for artifact in pagination_artifacts
                if artifact.subtype == ap.Artifact.ArtifactSubtype.FOOTER)))

Adding Bates Numbering Artifact

This example illustrates how to programmatically add Bates numbering to a PDF document using Aspose.PDF for Python via .NET. By configuring the BatesNArtifact with desired settings and applying it to the document’s pages, you can automate the process of adding standardized identifiers to each page.

To add a Bates numbering artifact to a document, call the AddBatesNumbering(BatesNArtifact) extension method on the PageCollection, passing the BatesNArtifact object as a parameter:


    import aspose.pdf as ap

    # Create or open PDF document
    with ap.Document() as document:
        # Add 10 pages
        for page_index in range(10):
            document.pages.add()

        # Add Bates numbering to all pages
        document.pages.add_bates_numbering(ap.BatesNArtifact(
            # These properties are set to their default values, as if they were not specified
            start_page=1,
            end_page=0,
            subset=ap.Subset.ALL,
            number_of_digits=6,
            start_number=1,
            prefix="",
            suffix="",
            artifact_vertical_alignment=ap.VerticalAlignment.BOTTOM,
            artifact_horizontal_alignment=ap.HorizontalAlignment.RIGHT,
            right_margin=72,
            left_margin=72,
            top_margin=36,
            bottom_margin=36
        ))

        # Save PDF document
        document.save(path_outfile)

Or, you can pass a collection of PaginationArtifacts:


    import aspose.pdf as ap

    # Create or open PDF document
    with ap.Document() as document:
        # Add 10 pages
        for page_index in range(10):
            document.pages.add()

        # Add Bates numbering to all pages
        document.pages.add_pagination([
            ap.BatesNArtifact(
                # These properties are set to their default values, as if they were not specified
                start_page=1,
                end_page=0,
                subset=ap.Subset.ALL,
                number_of_digits=6,
                start_number=1,
                prefix="",
                suffix="",
                artifact_vertical_alignment=ap.VerticalAlignment.BOTTOM,
                artifact_horizontal_alignment=ap.HorizontalAlignment.RIGHT,
                right_margin=72,
                left_margin=72,
                top_margin=36,
                bottom_margin=36
            )
        ])

        # Save PDF document
        document.save(path_outfile)

Add a Bates numbering artifact using an action delegate:


    import aspose.pdf as ap

    # Create or open PDF document
    with ap.Document() as document:
        # Add 10 pages
        for page_index in range(10):
            document.pages.add()

        # Add Bates numbering to all pages
        document.pages.add_bates_numbering(lambda bates_numbering: (
            # These properties are set to their default values, as if they were not specified
            setattr(bates_numbering, 'start_page', 1),
            setattr(bates_numbering, 'end_page', 0),
            setattr(bates_numbering, 'subset', Subset.All),
            setattr(bates_numbering, 'number_of_digits', 6),
            setattr(bates_numbering, 'start_number', 1),
            setattr(bates_numbering, 'prefix', ""),
            setattr(bates_numbering, 'suffix', ""),
            setattr(bates_numbering, 'artifact_vertical_alignment', ap.VerticalAlignment.BOTTOM),
            setattr(bates_numbering, 'artifact_horizontal_alignment', ap.HorizontalAlignment.RIGHT),
            setattr(bates_numbering, 'right_margin', 72),
            setattr(bates_numbering, 'left_margin', 72),
            setattr(bates_numbering, 'top_margin', 36),
            setattr(bates_numbering, 'bottom_margin', 36),
            setattr(bates_numbering.text_state, 'font_size', 10)
        ))

        # Save PDF document
        document.save(path_outfile)

To delete Bates numbering, use the following code:


    import aspose.pdf as ap

    # Open PDF document
    with ap.Document(path_infile) as document:
        # Delete Bates numbering from all pages

        document.pages.delete_bates_numbering()

        # Save PDF document
        document.save(path_outfile)