Extract Image and Signature Information

The following code snippets also work with the Aspose.PDF.Drawing library.

Extracting Image from Signature Field

Aspose.PDF for .NET can digitally sign PDF files using the SignatureField class, and while signing the document you can also set an image for the SignatureAppearance. The API can also extract signature information, as well as the image associated with a signature field.

To extract the image, use the ExtractImage method of the SignatureField class. The following code snippet demonstrates the steps to extract an image from a SignatureField object:

// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ExtractImagesFromSignatureField()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_SecuritySignatures();

    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "ExtractingImage.pdf"))
    {
        // Searching for signature fields
        foreach (var field in document.Form)
        {
            var sf = field as Aspose.Pdf.Forms.SignatureField;
            if (sf == null)
            {
                continue;
            }

            using (Stream imageStream = sf.ExtractImage())
            {
                // Skip signature fields that have no image
                if (imageStream == null)
                {
                    continue;
                }

                using (System.Drawing.Image image = System.Drawing.Bitmap.FromStream(imageStream))
                {
                    // Save the image
                    image.Save(dataDir + "output_out.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);
                }
            }
        }
    }
}

Extract Signature Information

Aspose.PDF for .NET supports digitally signing PDF files using the SignatureField class. Currently, you can determine the validity of the certificate, but you cannot extract the whole certificate. The information that can be extracted includes the public key, thumbprint, issuer, etc.

To extract signature information, use the ExtractCertificate method of the SignatureField class. The following code snippet demonstrates the steps to extract the certificate from a SignatureField object:

// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ExtractCertificate()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_SecuritySignatures();

    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "ExtractSignatureInfo.pdf"))
    {
        // Searching for signature fields
        foreach (var field in document.Form)
        {
            var sf = field as Aspose.Pdf.Forms.SignatureField;
            if (sf == null)
            {
                continue;
            }
            // Extract certificate
            Stream cerStream = sf.ExtractCertificate();
            if (cerStream == null)
            {
                continue;
            }
            // Save certificate
            using (cerStream)
            // FileMode.Create overwrites an existing file, so the sample can be run more than once
            using (FileStream fs = new FileStream(dataDir + "input.cer", FileMode.Create))
            {
                // CopyTo reads to the end of the stream; a single Read call may return fewer bytes than requested
                cerStream.CopyTo(fs);
            }
        }
    }
}
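
Once the certificate has been saved, the fields mentioned above (public key, thumbprint, issuer) can be read with the standard .NET X509Certificate2 class. The following is a minimal sketch and not part of the Aspose.PDF API. It assumes the saved file holds certificate data in a format that X509Certificate2 can load, and the CertificateInfoExample and PrintCertificateInfo names are hypothetical.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;

internal static class CertificateInfoExample
{
    // Hypothetical helper: prints basic fields of a certificate file,
    // for example the "input.cer" file written by the snippet above
    internal static void PrintCertificateInfo(string certificatePath)
    {
        // Assumption: the file contains a certificate that X509Certificate2 can parse.
        // On .NET 9 and later this constructor is marked obsolete in favor of X509CertificateLoader.
        using (var certificate = new X509Certificate2(certificatePath))
        {
            Console.WriteLine("Subject:    " + certificate.Subject);
            Console.WriteLine("Issuer:     " + certificate.Issuer);
            Console.WriteLine("Thumbprint: " + certificate.Thumbprint);
            Console.WriteLine("Valid from: " + certificate.NotBefore);
            Console.WriteLine("Valid to:   " + certificate.NotAfter);
            // Public key as a hexadecimal string
            Console.WriteLine("Public key: " + certificate.GetPublicKeyString());
        }
    }
}
```

A call such as CertificateInfoExample.PrintCertificateInfo(dataDir + "input.cer") would print the details of the certificate extracted earlier.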

You can use the PdfFileSignature.TryExtractCertificate method to extract a certificate stream or an X509Certificate2 object.
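
A possible way to call TryExtractCertificate is sketched below. The exact signature is an assumption: the sketch supposes an overload that takes a signature name returned by GetSignatureNames and returns the certificate through an out X509Certificate2 parameter, so check the API reference for your version. The class and method names in the sketch are hypothetical.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;

internal static class TryExtractCertificateExample
{
    // Hypothetical helper: prints the thumbprint of each signature's certificate
    internal static void PrintSignerCertificates(string pdfPath)
    {
        using (var document = new Aspose.Pdf.Document(pdfPath))
        using (var signature = new Aspose.Pdf.Facades.PdfFileSignature(document))
        {
            foreach (var signatureName in signature.GetSignatureNames())
            {
                // Assumption: an overload (signatureName, out X509Certificate2) exists
                if (signature.TryExtractCertificate(signatureName, out X509Certificate2 certificate))
                {
                    Console.WriteLine(certificate.Thumbprint);
                }
                else
                {
                    Console.WriteLine("No certificate could be extracted for " + signatureName);
                }
            }
        }
    }
}
```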

You can also get information about the signature algorithms used in a document with the PdfFileSignature.GetSignaturesInfo method:

// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private void GetSignaturesInfo()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_SecuritySignatures();

    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "signed_rsa.pdf"))
    {
        using (var signature = new Aspose.Pdf.Facades.PdfFileSignature(document))
        {
            var sigNames = signature.GetSignatureNames();
            var signaturesInfoList = signature.GetSignaturesInfo();
            foreach (var sigInfo in signaturesInfoList)
            {
                Console.WriteLine(sigInfo.DigestHashAlgorithm);
                Console.WriteLine(sigInfo.AlgorithmType);
                Console.WriteLine(sigInfo.CryptographicStandard);
                Console.WriteLine(sigInfo.SignatureName);
            }
        }
    }
}

Sample output for the example above:

Sha256
Rsa
Pkcs7
Signature1

Checking signatures for compromise

You can use the SignaturesCompromiseDetector class to verify digital signatures for compromise. Call the Check() method to check the document's signatures. It returns true if no signature compromise is detected. If it returns false, use the HasCompromisedSignatures property to check whether any signatures are compromised, and retrieve the list of compromised signatures through the CompromisedSignatures property.

To verify whether the existing signatures cover the entire document, use the SignaturesCoverage property. This property can have the following values:

  • Undefined – if one of the signatures is explicitly compromised or the coverage check failed.
  • EntirelySigned – if the signatures cover the entire document.
  • PartiallySigned – if the signatures do not cover the entire document and there is unsigned content.
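
The check described above might look like the following sketch. Several details are assumptions rather than confirmed API: that the detector lives in the Aspose.Pdf namespace and is constructed from a Document, that Check returns its details through an out parameter, and that the result type is named CompromiseCheckResult. The property names come from the description above; the class and method names of the sketch itself are hypothetical.

```csharp
using System;

internal static class CompromiseCheckExample
{
    // Hypothetical helper: reports compromised signatures and signature coverage
    internal static void CheckSignatures(string pdfPath)
    {
        using (var document = new Aspose.Pdf.Document(pdfPath))
        {
            // Assumption: the detector is created from the document
            var detector = new Aspose.Pdf.SignaturesCompromiseDetector(document);

            // Assumption: Check exposes the details through an out parameter
            if (detector.Check(out Aspose.Pdf.CompromiseCheckResult result))
            {
                Console.WriteLine("No signature compromise detected");
                return;
            }

            // List the signatures that were found to be compromised
            if (result.HasCompromisedSignatures)
            {
                foreach (var compromisedSignature in result.CompromisedSignatures)
                {
                    Console.WriteLine("Compromised signature: " + compromisedSignature);
                }
            }

            // Undefined, EntirelySigned, or PartiallySigned
            Console.WriteLine("Coverage: " + result.SignaturesCoverage);
        }
    }
}
```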

Extracting Unsigned Content

You can extract unsigned content from a document using the UnsignedContentAbsorber class. If a signed document was modified non-incrementally, the objects in the PDF structure may be rearranged, making it impossible to determine which parts were originally signed and which were not. In such cases, the absorber will indicate that the document was modified non‑incrementally.

If the document was modified incrementally, then we can detect what has changed. Unfortunately, since page content streams are compressed, it’s impossible to determine which exact portion of a page’s text was changed, so the absorber will mark the entire page as modified.

Other changes that are tracked:

  • changes to form fields
  • changes to annotations
  • changes to images
  • changes to XForms displayed on pages

If annotations, fields, or XForms are modified in a way that impacts their appearance, the absorber will also report those changes. The processing result is represented by the UnsignedContentAbsorber.Result class.

1. UnsignedContentAbsorber.Result

Represents the result of attempting to obtain unsigned content from a PDF document.

| Property | Type | Description |
| --- | --- | --- |
| Success | bool | true if the operation completed successfully and unsigned content was identified (or if the document has no signatures). |
| UnsignedContent | UnsignedContentAbsorber.UnsignedContent | An object containing collections of detected pages, form fields, XForms, and annotations that are not covered by the signature. |
| Message | string | A textual message describing the result. |
| Coverage | SignaturesCoverage | The coverage level of the signature: Undefined, EntirelySigned, PartiallySigned. |

2. UnsignedContentAbsorber.UnsignedContent

A container that holds all PDF document elements not covered by the digital signature. This class is used inside Result and provides convenient access to the detected objects.

| Property | Type | Description |
| --- | --- | --- |
| Pages | List<Page> | A list of pages whose content is not covered by the signature (or was modified after signing). If a page is considered modified, its XForms are not checked and do not appear in the XForms list. |
| Forms | List<WidgetAnnotation> | A list of form fields (text fields, checkboxes, etc.) that were added or modified without a signature. |
| XForms | Dictionary<int, XForm> | A dictionary where the key is the page number and the value is an XForm object that was modified after signing. |
| Annotations | Dictionary<int, Annotation> | A dictionary of annotations, where the key is the page number and the value is an annotation modified without a signature. |