Convert a File to PDF Format
XPS, XML Paper Specification, is a Microsoft file format used to integrate document creation and viewing into Windows. With Aspose.PDF for Java, it is possible to convert XPS files to PDF, the portable file format from Adobe.
An XPS file is essentially a ZIP package of XML files, used primarily for distribution and storage. It is difficult to edit, and most implementations come from Microsoft.
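Because an XPS file is a ZIP-based package, its parts can be listed with the standard java.util.zip classes before any conversion takes place. The following is a minimal sketch that does not use Aspose.PDF; the class and method names (XpsPackageInspector, listParts) are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Aspose.PDF: lists the parts stored in an XPS package.
class XpsPackageInspector {

    /** Returns the names of all entries stored in the ZIP-based package. */
    static List<String> listParts(Path xpsFile) throws IOException {
        List<String> names = new ArrayList<>();
        // ZipFile is closed automatically by try-with-resources.
        try (ZipFile zip = new ZipFile(xpsFile.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }
}
```

For a typical XPS document the list would be expected to include a [Content_Types].xml part alongside the page and resource parts.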
To convert an XPS file to PDF using Aspose.PDF for Java, use the XpsLoadOptions class to initialize a LoadOptions object. This object is then passed as an argument when the Document object is initialized, and it tells the PDF rendering engine the input format of the source document.
In both Windows XP and Windows 7, an XPS printer should be pre-installed; look under Printers in the Control Panel. To create XPS files, select that printer as the output device. In Windows 7, double-clicking an XPS file should open it in an XPS viewer. An XPS viewer can also be downloaded from Microsoft's website.
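In code, the conversion takes three steps: create an XpsLoadOptions instance, construct a Document from the XPS file path and that instance, and call save() on the Document with the output PDF path.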
To allow conversion from PCL to PDF, Aspose.PDF for Java has the class PclLoadOptions which is used to initialize the LoadOptions object. This object is then passed as an argument during Document object initialization and helps the PDF rendering engine to determine the source document’s input format.
In code, this means constructing a Document from the PCL file path and a PclLoadOptions instance, then calling save() with the output PDF path.
Starting with Aspose.PDF for Java 9.5.0, the API also supports HP-GL to PDF conversion, so PCL files that contain HP-GL can be converted using the same approach.
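PCL switches into HP-GL/2 mode with the escape sequence ESC % # B (and back to PCL with ESC % # A), so a rough way to tell whether a PCL file embeds HP-GL/2 is to scan its bytes for that sequence. The sketch below is a heuristic only: the class and method names are hypothetical, it is not part of Aspose.PDF, and binary raster data could contain the same bytes by coincidence.

```java
// Hypothetical helper, not part of Aspose.PDF: heuristic scan for the
// "Enter HP-GL/2 Mode" escape sequence (ESC % # B) in raw PCL bytes.
class HpglDetector {

    static boolean containsHpgl2(byte[] pcl) {
        // The shortest match (ESC % digit B) is four bytes long.
        for (int i = 0; i + 3 < pcl.length; i++) {
            if (pcl[i] == 0x1B && pcl[i + 1] == '%') {
                int j = i + 2;
                // The numeric parameter may carry a leading minus sign.
                if (pcl[j] == '-') {
                    j++;
                }
                int digitsStart = j;
                while (j < pcl.length && pcl[j] >= '0' && pcl[j] <= '9') {
                    j++;
                }
                // At least one digit followed by 'B' marks entry into HP-GL/2 mode.
                if (j > digitsStart && j < pcl.length && pcl[j] == 'B') {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A check like this could be used to decide whether the version note above matters for a given input file; it does not replace the converter's own parsing.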
PCL to PDF: Supported Features
|Sets of Commands||Support||Exceptions||Description|
|Job control commands||+||Duplex printing mode||Control the print process: number of copies, output bin, simplex/duplex printing, left and top offsets, etc.|
|Page control commands||+||Perforation Skip command||Specify page size, margins, page orientation, inter-line and inter-character distances, etc.|
|Cursor positioning commands||+|| ||Specify the cursor position and, hence, the origins of text, raster or vector images, and details.|
|Font selection commands||+||1. Transparent Print Data command. 2. Embedded soft fonts: in the current version, instead of creating a soft font, the library selects a suitable font from the existing “hard” TrueType fonts installed on the target machine. 3. User-defined symbol sets.||Allow loading soft (embedded) fonts from the PCL file and managing them in memory.|
|Raster graphics commands||+||Only black & white||Allow loading raster images from the PCL file into memory and specifying raster parameters such as width, height, compression type, resolution, etc.|
|Color commands||+|| ||Allow coloring of all printable objects.|
|Print Model commands||+|| ||Allow filling text, raster images and rectangular areas with predefined and user-defined raster patterns, and specifying the transparency mode for patterns and the source raster image. Predefined patterns are hatching, cross-hatch and shading.|
|Rectangle area fill commands||+|| ||Allow creating rectangular areas and filling them with patterns.|
|HP-GL/2 vector graphics commands||+||The Screened Vector (SV), Transparency Mode (TR), Transparent Data (TD), Rotate Coordinate System (RO), Scalable or Bitmap Fonts (SB), Character Slant (SL) and Extra Space (ES) commands are not implemented; the Define Variable Text Path (DV) command is implemented in a beta version.||Allow loading HP-GL/2 vector images from the PCL file into memory. A vector image has its origin at the lower-left corner of the printable area and can be scaled, translated, rotated and clipped. A vector image can contain text (as labels) and geometric figures such as rectangles, circles, ellipses, lines, arcs, Bezier curves and complex figures composed of the simple ones. Closed figures, including the letters of labels, can be filled with a solid fill or a vector pattern. The pattern can be hatching, cross-hatch, shading, raster user-defined, PCL hatching or cross-hatch, or PCL user-defined; PCL patterns are raster. Labels can be individually rotated, scaled, and directed in four directions: up, down, left and right. Left and right directions arrange letters one after another; up and down directions arrange letters one under another.|
|Macros||―|| ||Allow loading a sequence of PCL commands into memory and reusing it many times, for example to print a page header or to apply the same formatting to a set of pages.|
|Unicode text||―|| ||Allow printing non-ASCII characters. Not implemented due to lack of sample files.|
|PCL6 (PCL-XL)|| ||Implemented only in the beta version because of a lack of test files. Embedded fonts are also not supported. The JetReady extension is not supported because the JetReady specification is not available.||Binary file format.|
PCL to PDF: Known Issues
- The origin of text strings and images can differ slightly from that in the source PCL file if the print direction is not 0°. The same applies to vector images if the coordinate system of the vector plot is rotated (preceded by an RO command).
- The origin of labels in vector images can differ from that in the source PCL file if the labels are affected by a sequence of commands: Label Origin (LO), Define Variable Text Path (DV), Absolute Direction (DI) or Relative Direction (DR).
- Text can be read incorrectly if it must be rendered with a Bitmap or TrueType soft (embedded) font, because these fonts are currently only partially supported (see the exceptions in the Supported Features table). In this situation, text is read correctly only if the character codes in the soft font correspond to the default ones. The style of the text can also differ from that in the source PCL file, because a soft font header is not required to set a style.
- If the parsed PCL file contains Intellifont or Universal soft fonts, an exception is thrown, because Intellifont and Universal fonts are not supported at all.
- If the parsed PCL file contains macro commands, the result of parsing will differ strongly from the source file, because macro commands are not supported.
To convert a PostScript file to PDF, Aspose.PDF for Java offers the PsLoadOptions class, which is used to initialize the LoadOptions object. This object is then passed as an argument to the Document constructor and tells the PDF rendering engine the format of the source document. Calling save() on the resulting Document with the output path writes the PDF file.
Converting an XML document tree into one of several supported output formats is a common need in applications, and it is usually a two-step process:
- Convert an XML source to an XSL-FO representation.
- Convert an XSL-FO representation to the target format.
This transformation allows the structure of the result tree to differ from the structure of the original tree. While constructing the result tree, the transformation process may also add the information needed to format the result data. For example, a table of contents may be added as a filtered selection of the original document, or source data may be rearranged into a numbered tabular view.
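The first of the two steps (XML source to XSL-FO) can be sketched with the XSLT support built into the JDK (javax.xml.transform). The stylesheet, the book/chapter/title source vocabulary, and the class name XmlToXslFo below are all hypothetical; the example keeps only the chapter titles, in the spirit of the table-of-contents case mentioned above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example, not part of Aspose.PDF: step one of the two-step process.
class XmlToXslFo {

    // Illustrative stylesheet: emits one fo:block per chapter title and drops everything else.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/book\">"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name=\"only\" page-height=\"29.7cm\" page-width=\"21cm\">"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference=\"only\">"
        + "<fo:flow flow-name=\"xsl-region-body\">"
        + "<xsl:for-each select=\"chapter/title\">"
        + "<fo:block><xsl:value-of select=\".\"/></fo:block>"
        + "</xsl:for-each>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the source XML and returns the XSL-FO result as text. */
    static String toXslFo(String sourceXml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(sourceXml)), new StreamResult(out));
        return out.toString();
    }
}
```

The returned XSL-FO text is the input to the second step, the conversion of the XSL-FO representation to the target format.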
The following simple XSL-FO file is used as the input for XSL-FO to PDF conversion with Aspose.PDF for Java.
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="only" page-height="29.7cm" page-width="21cm" margin-top="1cm" margin-bottom="2cm" margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="3cm" margin-bottom="1.5cm" margin-left="2cm" margin-right="2cm"/>
      <fo:region-before precedence="true" extent="3cm"/>
      <fo:region-after precedence="true" extent="1.5cm"/>
      <fo:region-start extent="1cm"/>
      <fo:region-end extent="1cm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="only" initial-page-number="1">
    <fo:static-content flow-name="xsl-region-before">
      <fo:block text-align="end" font-size="10pt" font-family="serif" line-height="14pt">
        XML Recommendation - p. <fo:page-number/>
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center" font-size="10pt" font-family="serif" line-height="14pt">
        After
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-start">
      <fo:block-container border-color="black" border-style="solid" border-width="1pt" height="22.2cm" width="1cm" top="0cm" left="0cm" position="absolute">
        <fo:block text-align="start" font-size="8pt" font-family="serif" line-height="10pt">Start</fo:block>
      </fo:block-container>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-end">
      <fo:block-container border-color="black" border-style="solid" border-width="1pt" height="22.2cm" width="1cm" top="0cm" left="0cm" position="absolute">
        <fo:block text-align="start" font-size="8pt" font-family="serif" line-height="10pt">End</fo:block>
      </fo:block-container>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="18pt" font-family="sans-serif" line-height="24pt" space-after.optimum="15pt" background-color="blue" color="white" text-align="center" padding-top="0pt">
        Extensible Markup Language (XML) 1.0
      </fo:block>
      <fo:block font-size="16pt" font-family="sans-serif" line-height="20pt" space-before.optimum="10pt" space-after.optimum="10pt" text-align="start" padding-top="0pt">
        Abstract
      </fo:block>
      <fo:block font-size="12pt" font-family="sans-serif" line-height="15pt" space-after.optimum="3pt" text-align="start">
        The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML. For further information go to <fo:basic-link external-destination="normal.pdf">normal.pdf</fo:basic-link>
      </fo:block>
      <fo:block font-size="16pt" font-family="sans-serif" line-height="20pt" space-before.optimum="10pt" space-after.optimum="10pt" text-align="start" padding-top="0pt">
        Status of this document
      </fo:block>
      <fo:block font-size="12pt" font-family="sans-serif" line-height="15pt" space-after.optimum="3pt" text-align="start">
        This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web. For further information go to <fo:basic-link external-destination="normal.pdf">normal.pdf</fo:basic-link>
      </fo:block>
      <fo:block font-size="12pt" font-family="sans-serif" line-height="15pt" space-after.optimum="3pt" text-align="start">
        This document specifies a syntax created by subsetting an existing, widely used international text processing standard (Standard Generalized Markup Language, ISO 8879:1986(E) as amended and corrected) for use on the World Wide Web. It is a product of the W3C XML Activity.
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
Figure: XSL-FO file to PDF (preview of the PDF document generated after XSL-FO conversion)
Scalable Vector Graphics (SVG) is a family of specifications of an XML-based file format for two-dimensional vector graphics, both static and dynamic (interactive or animated). The SVG specification is an open standard that has been under development by the World Wide Web Consortium (W3C) since 1999.
SVG images and their behaviors are defined in XML text files. This means that they can be searched, indexed, scripted and, if required, compressed. As XML files, SVG images can be created and edited with any text editor, but it is often more convenient to create them with drawing programs such as Inkscape.
To convert SVG files to PDF, use the SvgLoadOptions class to initialize the LoadOptions object. This object is then passed as an argument when the Document object is initialized, and it tells the PDF rendering engine the input format of the source document.
In code, this means constructing a Document from the SVG file path and an SvgLoadOptions instance, then calling save() with the output PDF path.
We converted this tiger.svg to PDF.
Figure: SVG Tiger (a complex SVG graphic)
SVG to PDF: Supported Features
|SVG Tag||Sample Use|
|SVG Tag Property||Sample Use|
|SVG Tag Attribute||Sample Use|
SVG to PDF: Known Issues
- Shortened path strings that are not covered by the specification, although browsers render them correctly.
- Elliptical arc segments with transformations involve complex 2D geometry calculations and require more testing.
- Big SVG files (megabytes in size) take more than a few seconds to parse; code optimization is required.
- The graphical filters API in the PDF layer was extended recently, which makes fuller integration of the corresponding SVG graphics functions possible.
The LaTeX file format is a text file format with markup in LaTeX, a derivative of the TeX family of languages. LaTeX (/ˈleɪtɛk/, lay-tek or lah-tek) is a document preparation system and document markup language. It is widely used for the communication and publication of scientific documents in many fields, including mathematics, physics, and computer science. It also has a prominent role in the preparation and publication of books and articles that contain complex multilingual materials, such as Sanskrit and Arabic, including critical editions. LaTeX uses the TeX typesetting program for formatting its output and is itself written in the TeX macro language.
Aspose.PDF for Java can convert TeX files to PDF. For this purpose, the com.aspose.pdf package has a class named LatexLoadOptions, which loads LaTeX files so that the Document class can render the output in PDF format. The following code snippet shows the process of converting a LaTeX file to PDF format.
import com.aspose.pdf.LatexLoadOptions;

// Instantiate the LaTeX load options object
LatexLoadOptions latexOptions = new LatexLoadOptions();
// Create the Document object from the source .tex file
com.aspose.pdf.Document doc = new com.aspose.pdf.Document("samplefile.tex", latexOptions);
// Save the output as a PDF file
doc.save("TeXToPDF_out.pdf");
Aspose.PDF for Java provides the capability to convert Text files to PDF format. In this article, we demonstrate how easily and efficiently we can convert a text file to PDF using Aspose.PDF.
To convert a text file to PDF, first read the source file with a reader; this article collects the contents in a StringBuilder. Then instantiate a Document object and add a new page to its Pages collection. Create a TextFragment object, passing the StringBuilder contents to its constructor. Finally, add the TextFragment to the page's Paragraphs collection and save the resulting PDF using the save method of the Document class.
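The first of these steps, reading the text file into a StringBuilder, needs only the standard library. The sketch below covers that step alone and assumes the file is UTF-8 encoded; the class and method names (TextFileReader, readAll) are hypothetical, and the remaining steps rely on the Aspose.PDF classes described above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Aspose.PDF: reads a text file into a StringBuilder.
class TextFileReader {

    /** Reads the whole file line by line; UTF-8 encoding is an assumption. */
    static StringBuilder readAll(Path textFile) throws IOException {
        StringBuilder content = new StringBuilder();
        try (BufferedReader reader = Files.newBufferedReader(textFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() strips the line terminator, so it is added back here.
                content.append(line).append(System.lineSeparator());
            }
        }
        return content;
    }
}
```

The text obtained from content.toString() is what would then be handed to the TextFragment in the steps described above.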
To convert EPUB files to PDF, Aspose.PDF for Java has a class named EpubLoadOptions, which is used to load the source EPUB file. The object is then passed as an argument when the Document object is initialized, and it tells the PDF rendering engine the input format of the source document.
In code, this means constructing a Document from the EPUB file path and an EpubLoadOptions instance, then calling save() with the output PDF path.
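The load-options classes named throughout this article can be summarized as a lookup keyed by file extension. The sketch below is only a cheat sheet in code: it returns class names as strings so that it runs without Aspose.PDF on the classpath, the file extensions are assumptions about how the inputs are named, and LoadOptionsLookup is a hypothetical helper. Real code would create an instance of the named class and pass it to the Document constructor.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Aspose.PDF: maps a file extension to the
// simple name of the load-options class this article names for that format.
class LoadOptionsLookup {

    private static final Map<String, String> BY_EXTENSION = new LinkedHashMap<>();

    static {
        // The extensions are assumptions; the class names come from the article.
        BY_EXTENSION.put("xps", "XpsLoadOptions");
        BY_EXTENSION.put("pcl", "PclLoadOptions");
        BY_EXTENSION.put("ps", "PsLoadOptions");
        BY_EXTENSION.put("svg", "SvgLoadOptions");
        BY_EXTENSION.put("tex", "LatexLoadOptions");
        BY_EXTENSION.put("epub", "EpubLoadOptions");
    }

    /** Returns the class name for the file's extension, or null if it is not listed. */
    static String loadOptionsFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension present
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(extension);
    }
}
```

Plain text and Aspose.PDF XML templates are absent from the map because the article converts them through other mechanisms (TextFragment and bindXML respectively) rather than through a load-options class.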
Aspose.PDF for Java can convert an XML file into a PDF document, provided that the input XML file follows the Aspose.PDF for Java schema.
Access TextFragment and TextSegment elements from an XML file
The bindXML(..) method loads the contents of an XML file, and the Document.save() method saves the output in PDF format. During conversion, individual elements inside the XML template can also be accessed by their XML id property using the getObjectById() method. The XML template below declares two TextSegment elements, with the ids boldHtml and strongHtml, that can be accessed this way.
<?xml version="1.0" encoding="utf-8" ?>
<Document xmlns="Aspose.PDF">
  <Page id="mainSection">
    <TextFragment>
      <TextSegment id="boldHtml">segment1</TextSegment>
    </TextFragment>
    <TextFragment>
      <TextSegment id="strongHtml">segment2</TextSegment>
    </TextFragment>
  </Page>
</Document>
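To make the id-based lookup concrete without relying on Aspose.PDF, the sketch below finds a TextSegment in the template by its id attribute using the JDK's DOM parser. It only illustrates the idea of addressing template elements by id; it is not how getObjectById() is implemented, and the class and method names (TemplateIdLookup, segmentTextById) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Aspose.PDF: looks up a TextSegment by its id attribute.
class TemplateIdLookup {

    /** Returns the text of the TextSegment with the given id, or null if there is none. */
    static String segmentTextById(String templateXml, String id) throws Exception {
        // The default factory is not namespace-aware, so elements match by plain tag name.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(templateXml.getBytes(StandardCharsets.UTF_8)));
        NodeList segments = doc.getElementsByTagName("TextSegment");
        for (int i = 0; i < segments.getLength(); i++) {
            Element segment = (Element) segments.item(i);
            if (id.equals(segment.getAttribute("id"))) {
                return segment.getTextContent();
            }
        }
        return null;
    }
}
```

Applied to the template above, looking up boldHtml would yield segment1 and strongHtml would yield segment2.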