Converting Documents with Microsoft Azure Functions
This article provides step-by-step instructions for converting PDF documents in Microsoft Azure using Aspose.PDF for .NET and an Azure Function.
Prerequisites
- Visual Studio 2022 Community Edition with the Azure development workload installed, or Visual Studio Code.
- Azure account: you need an Azure subscription. Create a free account before you begin.
- .NET 6 SDK.
- Aspose.PDF for .NET.
Create Azure Resources
Create Storage Account
- Go to Azure Portal (https://portal.azure.com).
- Click “Create a resource”.
- Search for “Storage account”.
- Click “Create”.
- Fill in the details:
- Subscription: Choose your subscription.
- Resource group: Create new or select existing.
- Storage account name: Enter a unique name.
- Region: Choose the nearest region.
- Performance: Standard.
- Redundancy: LRS (Locally redundant storage).
- Click “Review + create”.
- Click “Create”.
Create Container
- Open your storage account.
- Go to “Containers” under “Data storage”.
- Click “+ Container”.
- Name it “pdfdocs”.
- Set public access level to “Private”.
- Click “Create”.
Create Project
Create Visual Studio Project
- Open Visual Studio 2022.
- Click “Create a new project”.
- Select “Azure Functions”.
- Name your project “PdfConverterAzure”.
- Choose “.NET 6.0” or later and “HTTP trigger”.
- Click “Create”.
Create Visual Studio Code Project
Install Prerequisites
- Visual Studio Code extensions:
code --install-extension ms-dotnettools.csharp
code --install-extension ms-azuretools.vscode-azurefunctions
code --install-extension ms-vscode.azure-account
- Install Azure Functions Core Tools:
npm install -g azure-functions-core-tools@4 --unsafe-perm true
- Install Azure CLI:
- Windows: Download from Microsoft’s website.
- macOS:
brew install azure-cli
- Linux (Debian/Ubuntu):
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
Configure Project
- Open project in Visual Studio Code:
code .
- Add NuGet packages by creating or updating PdfConverterAzure.csproj:
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net6.0</TargetFramework>
    <AzureFunctionsVersion>v4</AzureFunctionsVersion>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.NET.Sdk.Functions" Version="4.1.1" />
    <!-- Needed for the FunctionsStartup class used in Startup.cs -->
    <PackageReference Include="Microsoft.Azure.Functions.Extensions" Version="1.1.0" />
    <PackageReference Include="Aspose.PDF" Version="24.10.0" />
    <PackageReference Include="Azure.Storage.Blobs" Version="12.14.1" />
    <PackageReference Include="Microsoft.Azure.WebJobs.Extensions.Storage" Version="5.0.1" />
  </ItemGroup>
</Project>
Install Required NuGet Packages
In Visual Studio, open the Package Manager Console and run:
Install-Package Aspose.PDF
Install-Package Azure.Storage.Blobs
Install-Package Microsoft.Azure.WebJobs.Extensions.Storage
Install-Package Microsoft.Azure.Functions.Extensions
In Visual Studio Code, run:
dotnet restore
Configure Azure Storage Connection
In the Azure Portal, open the storage account and go to “Access keys” to copy its connection string. The application uses this connection string to authenticate with the storage account.
- Open local.settings.json:
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "YOUR_STORAGE_CONNECTION_STRING",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "ContainerName": "pdfdocs"
  }
}
- Replace YOUR_STORAGE_CONNECTION_STRING with your actual storage connection string from the Azure Portal.
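The function code later reads these values through environment variables. As a minimal sketch, a small helper can fail fast with a clear message when a setting is missing. The AppSettings class and its GetRequired method are hypothetical names introduced here for illustration; they are not part of the original project.

```csharp
using System;

// Hypothetical helper (not part of the original project): reads a required
// application setting and throws a descriptive error when it is absent.
public static class AppSettings
{
    public static string GetRequired(string name)
    {
        // When running locally, the Functions host exposes the "Values"
        // section of local.settings.json as environment variables; in Azure,
        // application settings are exposed the same way.
        string value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Application setting '{name}' is missing or empty.");
        }
        return value;
    }
}
```

For example, AppSettings.GetRequired("ContainerName") could stand in for a direct Environment.GetEnvironmentVariable call, so that a missing setting surfaces as a configuration error rather than a null reference later on.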
Configure Aspose License
In Visual Studio:
- Copy your Aspose.PDF license file to the project.
- Right-click on the license file, and select “Properties”.
- Set “Copy to Output Directory” to “Copy always”.
- Add license initialization code that runs at startup (for example, in the Configure method of Startup.cs shown later in this article):
var license = new Aspose.Pdf.License();
license.SetLicense("Aspose.PDF.lic");
Create Code
Create a new file PdfConverter.cs:
using Aspose.Pdf;
using Aspose.Pdf.Devices; // JpegDevice and Resolution
using Azure.Storage.Blobs;
using System;
using System.IO;
using System.Threading.Tasks;

public class PdfConverter
{
    private readonly BlobContainerClient _containerClient;

    public PdfConverter(string connectionString, string containerName)
    {
        _containerClient = new BlobContainerClient(connectionString, containerName);
    }

    public async Task<string> ConvertToFormat(string sourceBlobName, string targetFormat)
    {
        // Download source PDF
        var sourceBlob = _containerClient.GetBlobClient(sourceBlobName);
        using var sourceStream = new MemoryStream();
        await sourceBlob.DownloadToAsync(sourceStream);
        sourceStream.Position = 0;

        // Load PDF document
        using var document = new Document(sourceStream);

        // Create output stream
        using var outputStream = new MemoryStream();
        string targetBlobName = Path.GetFileNameWithoutExtension(sourceBlobName);

        // Convert based on format
        switch (targetFormat.ToLowerInvariant())
        {
            case "docx":
                targetBlobName += ".docx";
                document.Save(outputStream, SaveFormat.DocX);
                break;
            case "html":
                targetBlobName += ".html";
                document.Save(outputStream, SaveFormat.Html);
                break;
            case "xlsx":
                targetBlobName += ".xlsx";
                document.Save(outputStream, SaveFormat.Excel);
                break;
            case "pptx":
                targetBlobName += ".pptx";
                document.Save(outputStream, SaveFormat.Pptx);
                break;
            case "jpeg":
            case "jpg":
                // A JPEG file holds a single image, so only the first page is
                // rendered; writing every page into one stream would not
                // produce a valid multi-page image.
                targetBlobName += ".jpg";
                var jpegDevice = new JpegDevice(new Resolution(300));
                jpegDevice.Process(document.Pages[1], outputStream); // pages are 1-based
                break;
            default:
                throw new ArgumentException($"Unsupported format: {targetFormat}");
        }

        // Upload converted file
        outputStream.Position = 0;
        var targetBlob = _containerClient.GetBlobClient(targetBlobName);
        await targetBlob.UploadAsync(outputStream, true);
        return targetBlob.Uri.ToString();
    }
}
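The switch in ConvertToFormat only discovers an unsupported format after the source PDF has already been downloaded. As a sketch, the check can be moved into a small lookup that runs before any storage call. The TargetFormats class and its TryGetExtension method are hypothetical names; the table simply mirrors the cases handled in PdfConverter.cs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps each accepted targetFormat value to the file
// extension used for the output blob, mirroring the switch in PdfConverter.
public static class TargetFormats
{
    private static readonly Dictionary<string, string> Extensions =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["docx"] = ".docx",
            ["html"] = ".html",
            ["xlsx"] = ".xlsx",
            ["pptx"] = ".pptx",
            ["jpeg"] = ".jpg",
            ["jpg"] = ".jpg",
        };

    // Returns false for null, empty, or unknown formats instead of throwing.
    public static bool TryGetExtension(string targetFormat, out string extension)
    {
        extension = null;
        return !string.IsNullOrEmpty(targetFormat)
            && Extensions.TryGetValue(targetFormat, out extension);
    }
}
```

A caller could test TargetFormats.TryGetExtension(targetFormat, out var extension) first and return a 400 response when it is false, which avoids a download that is certain to be wasted.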
Create a new file ConvertPdfFunction.cs:
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;
using System.Threading.Tasks;
using System;
using System.IO;
using Newtonsoft.Json;
using Microsoft.AspNetCore.Mvc;

public static class ConvertPdfFunction
{
    [FunctionName("ConvertPdf")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "convert")] HttpRequest req,
        ILogger log)
    {
        try
        {
            // Read request body
            string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
            dynamic data = JsonConvert.DeserializeObject(requestBody);
            string sourceBlob = data?.sourceBlob;
            string targetFormat = data?.targetFormat;
            if (string.IsNullOrEmpty(sourceBlob) || string.IsNullOrEmpty(targetFormat))
            {
                return new BadRequestObjectResult("Please provide sourceBlob and targetFormat");
            }

            // Get configuration
            string connectionString = Environment.GetEnvironmentVariable("AzureWebJobsStorage");
            string containerName = Environment.GetEnvironmentVariable("ContainerName");

            // Convert PDF
            var converter = new PdfConverter(connectionString, containerName);
            string resultUrl = await converter.ConvertToFormat(sourceBlob, targetFormat);
            return new OkObjectResult(new { url = resultUrl });
        }
        catch (Exception ex)
        {
            log.LogError(ex, "Error converting PDF");
            return new StatusCodeResult(500);
        }
    }
}
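In the function above, a malformed JSON body makes JsonConvert.DeserializeObject throw, so the caller receives a 500 response rather than a 400. One possible alternative is a typed request model with a tolerant parser. The ConvertRequest class and its Parse method below are hypothetical; only the JSON property names come from this article.

```csharp
using Newtonsoft.Json;

// Hypothetical request model; the JSON property names match the request
// body used in this article's curl examples.
public class ConvertRequest
{
    [JsonProperty("sourceBlob")]
    public string SourceBlob { get; set; }

    [JsonProperty("targetFormat")]
    public string TargetFormat { get; set; }

    // Returns null when the body is empty or cannot be parsed as a JSON object.
    public static ConvertRequest Parse(string requestBody)
    {
        if (string.IsNullOrWhiteSpace(requestBody))
        {
            return null;
        }
        try
        {
            return JsonConvert.DeserializeObject<ConvertRequest>(requestBody);
        }
        catch (JsonException)
        {
            // Covers both syntax errors and bodies with an unexpected JSON shape.
            return null;
        }
    }
}
```

With this in place, the function could return a BadRequestObjectResult whenever Parse returns null or either property is empty, and keep the 500 path for genuine conversion failures.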
Create a new file Startup.cs (it requires the Microsoft.Azure.Functions.Extensions NuGet package):
using Azure.Storage.Blobs;
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection;

[assembly: FunctionsStartup(typeof(PdfConverterAzure.Functions.Startup))]
namespace PdfConverterAzure.Functions
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            // Read configuration
            var config = builder.GetContext().Configuration;

            // Register services
            builder.Services.AddLogging();

            // Register Azure Storage
            builder.Services.AddSingleton(x =>
                new BlobServiceClient(config["AzureWebJobsStorage"]));

            // Configure Aspose License
            var license = new Aspose.Pdf.License();
            license.SetLicense("Aspose.PDF.lic");
        }
    }
}
Test Locally
In Visual Studio:
- Start the Azure Storage Emulator (or Azurite, which replaces it) if AzureWebJobsStorage points to local storage.
- Run the project in Visual Studio.
- Use Postman or curl to test:
curl -X POST http://localhost:7071/api/convert \
-H "Content-Type: application/json" \
-d '{"sourceBlob": "sample.pdf", "targetFormat": "docx"}'
In Visual Studio Code:
- Start the function app:
func start
- Upload a PDF to test:
az storage blob upload \
--account-name $AccountName \
--container-name pdfdocs \
--name sample.pdf \
--file /path/to/your/sample.pdf
- Use Postman or curl to test:
curl -X POST http://localhost:7071/api/convert \
-H "Content-Type: application/json" \
-d '{"sourceBlob": "sample.pdf", "targetFormat": "docx"}'
Deploy to Azure
In Visual Studio:
- Right-click the project in Visual Studio.
- Select “Publish”.
- Choose “Azure Function App”.
- Select your subscription.
- Create new or select existing Function App.
- Click “Publish”.
In Visual Studio Code:
- Press F1 or Ctrl+Shift+P.
- Select “Azure Functions: Deploy to Function App”.
- Choose your subscription.
- Select an existing function app or create a new one.
- Click “Deploy”.
Configure Azure Function App
- Go to Azure Portal.
- Open your Function App.
- Go to “Configuration”.
- Add application settings:
- Key: “ContainerName”.
- Value: “pdfdocs”.
- Save changes.
Test the Deployed Service
Use Postman or curl to test:
curl -X POST "https://your-function.azurewebsites.net/api/convert" \
-H "x-functions-key: your-function-key" \
-H "Content-Type: application/json" \
-d '{"sourceBlob": "sample.pdf", "targetFormat": "docx"}'
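The same request can also be sent from .NET code. The sketch below assumes a hypothetical ConvertClient helper; the function URL and key are the same placeholders as in the curl command and are supplied by the caller.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical client helper (not part of the original project).
public static class ConvertClient
{
    // Posts a conversion request and returns the raw JSON response body,
    // which the function in this article shapes as {"url": "..."}.
    public static async Task<string> ConvertAsync(
        HttpClient http,
        string functionUrl,   // e.g. https://your-function.azurewebsites.net/api/convert
        string functionKey,   // value for the x-functions-key header
        string sourceBlob,
        string targetFormat)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, functionUrl);
        request.Headers.Add("x-functions-key", functionKey);

        // The anonymous type serializes to {"sourceBlob": ..., "targetFormat": ...}.
        string json = JsonSerializer.Serialize(new { sourceBlob, targetFormat });
        request.Content = new StringContent(json, Encoding.UTF8, "application/json");

        using HttpResponseMessage response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode(); // throws HttpRequestException on 4xx/5xx
        return await response.Content.ReadAsStringAsync();
    }
}
```

Passing the HttpClient in, rather than creating one per call, lets the caller reuse a single instance across requests.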
Supported Formats
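This example handles the docx, html, xlsx, pptx, and jpeg/jpg target formats. The full list of formats supported by Aspose.PDF for .NET is available in the Aspose.PDF documentation.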
Troubleshooting
Important Configuration Options
- Add authentication:
[FunctionName("ConvertPdf")]
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Function, "post", Route = "convert")] HttpRequest req,
    ClaimsPrincipal principal, // requires: using System.Security.Claims;
    ILogger log)
{
    // Check authentication
    if (principal?.Identity?.IsAuthenticated != true)
        return new UnauthorizedResult();
    // ...
}
- For large files, consider:
- Increasing function timeout.
- Using a Premium or Dedicated (App Service) plan, which offers more memory than the Consumption plan.
- Implementing chunked upload/download.
- Adding progress tracking.
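For large files, one further option is to avoid buffering both the source PDF and the converted output in MemoryStream objects. The following is a rough sketch under stated assumptions, not a tuned implementation: LargeFileConverter is a hypothetical class, the conversion is fixed to DOCX for brevity, and it assumes the function host has enough temporary disk space for both files.

```csharp
using Aspose.Pdf;
using Azure.Storage.Blobs;
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical variant of PdfConverter that stages files on local disk
// instead of holding the input and output in memory streams.
public static class LargeFileConverter
{
    public static async Task<Uri> ConvertToDocxAsync(
        BlobContainerClient container, string sourceBlobName)
    {
        // Use a unique working directory so parallel invocations do not collide.
        string tempDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(tempDir);
        string inputPath = Path.Combine(tempDir, "input.pdf");
        string outputPath = Path.Combine(tempDir, "output.docx");

        try
        {
            // Download the blob straight to a file.
            await container.GetBlobClient(sourceBlobName).DownloadToAsync(inputPath);

            // Convert file to file; dispose the document before uploading.
            using (var document = new Document(inputPath))
            {
                document.Save(outputPath, SaveFormat.DocX);
            }

            // Upload the result from disk, overwriting any existing blob.
            string targetName = Path.GetFileNameWithoutExtension(sourceBlobName) + ".docx";
            BlobClient target = container.GetBlobClient(targetName);
            await target.UploadAsync(outputPath, overwrite: true);
            return target.Uri;
        }
        finally
        {
            // Always remove the temporary files, even when conversion fails.
            Directory.Delete(tempDir, recursive: true);
        }
    }
}
```

Whether this lowers peak memory in practice depends on how the library processes the document internally, so it is worth measuring against the in-memory version before adopting it.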