Project Usage
Our project needed to generate PDFs at various stages of the application and upload them to the FileNet FTP server. PD4ML generated each PDF and saved it locally on the server, from where it was uploaded to FileNet. We chose PD4ML because our application used the Struts framework: Struts supplied the data needed in the PDF, while the layout was designed in HTML and CSS. This gave us a clean separation between the presentation and business layers for dynamic, online PDF generation.
Introduction to PD4ML
PD4ML is a powerful PDF generating tool that uses HTML and CSS (Cascading Style Sheets) as page layout and content definition format. Written in 100% pure Java, it allows users to easily add PDF generation functionality to end products. PD4ML can be used either as a command line operation or in Web applications for online PDF generation from HTML and JSP templates.
### PD4ML as a Command Line Operation
PD4ML can be used for HTML to PDF transformation from a command line application. There are many ways to achieve this conversion; the most commonly used methods are as follows:
#### Creating a PDF from a URL String
A PDF can be generated from an HTML file by passing the file's URL to the render() method.
import org.zefer.pd4ml.PD4ML;
import org.zefer.pd4ml.PD4Constants;
........
File f = new File("D:/tools/test.pdf");
java.io.FileOutputStream fos = new java.io.FileOutputStream(f);
PD4ML pd4ml = new PD4ML();
pd4ml.render( urlstring, fos );
Steps Involved
- Import the PD4ML converter class
- Define HTML-to-PDF conversion parameter values if needed, such as user space width, arrangement of HTML elements, vertical size, etc.
- Prepare the output stream for PDF generation.
- Instantiate the PD4ML converter.
- Pass the HTML-to-PDF conversion parameters to it.
- Perform the HTML-to-PDF translation.
Converting HTML obtained from input stream to PDF
Using a URL to convert HTML into a PDF is not mandatory. PD4ML can also read the source HTML from an input stream and convert that stream's content into the PDF.
File f = new File("D:/tools/test.pdf");
java.io.FileOutputStream fos = new java.io.FileOutputStream(f);
File fz = new File("D:/tools/yahoo.htm");
java.io.FileInputStream fis = new java.io.FileInputStream(fz);
InputStreamReader isr = new InputStreamReader( fis, "UTF-8" );
PD4ML html = new PD4ML();
URL base = new URL( "file:D:/tools/" );
html.render( isr, fos, base );
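When the HTML is assembled in memory rather than read from a file, it can be wrapped in a reader in much the same way. The sketch below uses only the JDK. The class and method names (HtmlReaderSketch, toUtf8Reader, readAll) are illustrative and not part of PD4ML, and the render call is left as a comment because it needs the PD4ML jar on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class HtmlReaderSketch {

    // Wraps an in-memory HTML string in a UTF-8 reader.
    static InputStreamReader toUtf8Reader(String markup) {
        byte[] bytes = markup.getBytes(StandardCharsets.UTF_8);
        return new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
    }

    // Drains a reader into a string; used here only to check the round trip.
    static String readAll(Reader reader) {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String markup = "<html><body><p>Gr\u00fc\u00dfe</p></body></html>";
        InputStreamReader isr = toUtf8Reader(markup);
        // With PD4ML available, isr would be handed to the converter as in the
        // example above: html.render( isr, fos, base );
        System.out.println(readAll(isr).equals(markup));
    }
}
```

Encoding and decoding with the same explicit charset avoids depending on the platform default, which matters when the HTML contains non-ASCII text.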
Formatting the PDF document generated
The generated PDF can be formatted using various methods. Some of the most commonly used ones are given below:
PD4ML html = new PD4ML();
html.setPageSize( new Dimension(450, 450) );
// defines page size in points. A set of predefined page format constants is available in the PD4Constants interface.
html.setPageInsets( new Insets(20, 50, 10, 10) );
// specifies page insets in points
html.setHtmlWidth( 750 );
// defines the desired HTML page width in screen pixels. It can be thought of as resizing a web browser window horizontally
html.enableImgSplit( false );
// disables image splitting by page breaks. By default the option is true (splitting enabled).
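Since page sizes and insets are given in points, it can help to convert from more familiar units first. A PDF point is 1/72 of an inch, so the 450 x 450 point page above is 6.25 inches square. The helper below is a small sketch using only the JDK; the class and method names are illustrative and not part of PD4ML.

```java
import java.awt.Dimension;

class PageUnits {
    static final double POINTS_PER_INCH = 72.0;
    static final double MM_PER_INCH = 25.4;

    static double inchesToPoints(double inches) {
        return inches * POINTS_PER_INCH;
    }

    static double mmToPoints(double mm) {
        return mm / MM_PER_INCH * POINTS_PER_INCH;
    }

    // Builds a page size in whole points from a size given in millimetres.
    static Dimension pageSizeFromMm(double widthMm, double heightMm) {
        int w = (int) Math.round(mmToPoints(widthMm));
        int h = (int) Math.round(mmToPoints(heightMm));
        return new Dimension(w, h);
    }

    public static void main(String[] args) {
        Dimension a4 = pageSizeFromMm(210, 297);
        System.out.println(a4.width + " x " + a4.height); // prints "595 x 842"
    }
}
```

The resulting Dimension could then be passed to setPageSize() in place of the hard-coded values.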
For Generating Text-Only Header and Footer
Static or template text can be used for the header and footer of the PDF document. The header and footer can be configured in various ways. A few of them are:
PD4PageMark header = new PD4PageMark();
header.setAreaHeight( 20 );
// defines the height of the header or footer area
header.setTitleTemplate( "title: $[title]" );
// defines a template for page title representation.
// No title is printed if the titleTemplate is set to null. Default value is null.
header.setTitleAlignment( PD4PageMark.CENTER_ALIGN );
// defines alignment for the page title string in the document's header or footer
header.setPageNumberAlignment( PD4PageMark.LEFT_ALIGN );
// defines alignment for the page numbers in the document's header or footer area
header.setPageNumberTemplate( "#$[page]" );
// defines a template for page number representation
PD4PageMark footer = new PD4PageMark();
footer.setAreaHeight( 30 );
// already explained above
footer.setFontSize( 20 );
// sets the font size for the header or footer
footer.setColor( Color.red );
// sets the color of the header or footer text
footer.setPagesToSkip( 1 );
// defines the number of pages from the beginning of the document that should not be marked with the header or footer info
footer.setTitleTemplate( "[ $[title] ]" );
// already explained above
footer.setPageNumberTemplate( "page: $[page]" );
// already explained above
footer.setTitleAlignment( PD4PageMark.RIGHT_ALIGN );
// already explained above
footer.setPageNumberAlignment( PD4PageMark.LEFT_ALIGN );
// already explained above
pd4ml.setPageHeader( header );
pd4ml.setPageFooter( footer );
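To make the effect of the templates and of pagesToSkip concrete, the sketch below mimics how a template such as "page: $[page]" might expand for a given page. It is an illustration only, written with plain string replacement and assuming 1-based page numbers. It is not PD4ML's actual implementation, and the class and method names are hypothetical.

```java
class PageMarkTemplateSketch {

    // Replaces the $[title] and $[page] placeholders in a template.
    // A null template yields no text, mirroring the "no title is printed" rule.
    static String expand(String template, String title, int pageNumber) {
        if (template == null) {
            return "";
        }
        return template
                .replace("$[title]", title)
                .replace("$[page]", String.valueOf(pageNumber));
    }

    // Assumption: pages are numbered from 1, and the first pagesToSkip
    // pages carry no header or footer.
    static boolean isMarked(int pageNumber, int pagesToSkip) {
        return pageNumber > pagesToSkip;
    }

    public static void main(String[] args) {
        System.out.println(expand("page: $[page]", "pd4ml test", 2)); // prints "page: 2"
        System.out.println(expand("[ $[title] ]", "pd4ml test", 2));  // prints "[ pd4ml test ]"
        System.out.println(isMarked(1, 1));                           // prints "false"
    }
}
```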
Protecting PDF documents
A PDF document can be encrypted to protect its contents from unauthorized access. PD4ML supports the PDF access permissions concept and allows a password to be specified for a document. If any passwords or access restrictions are specified with PD4ML.setPermissions(), the document is encrypted, and the permissions and the information required to validate the passwords are stored in the resulting document.
The possible restrictions are:
- Modifying the document’s contents
- Copying or otherwise extracting text and graphics from the document
- Adding or modifying text annotations
- Printing the document
The various types of pre-set Permissions available in the API are:
- AllowAssembly
- AllowContentExtraction
- AllowCopy
- AllowDegradedPrint
- AllowModify
- AllowPrint
The PDF document produced by PD4ML can be protected with 40-bit or 128-bit encryption using the various Permission levels given above.
String password = "empty";
boolean strongEncryption = true;
int permissions = PD4Constants.AllowPrint | PD4Constants.AllowCopy;
pd4ml.setPermissions( password, permissions, strongEncryption );
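The permissions argument above is built by combining constants with the bitwise OR operator. The sketch below shows that pattern in isolation, assuming each permission is a single-bit flag. The numeric values are hypothetical and chosen only for illustration; the real values are defined in PD4Constants and may differ.

```java
class PermissionFlagsSketch {
    // Hypothetical bit values for illustration only, NOT the PD4Constants values.
    static final int ALLOW_PRINT = 1 << 0;
    static final int ALLOW_MODIFY = 1 << 1;
    static final int ALLOW_COPY = 1 << 2;

    // True when every bit of the flag is set in the permissions mask.
    static boolean isAllowed(int permissions, int flag) {
        return (permissions & flag) == flag;
    }

    public static void main(String[] args) {
        int permissions = ALLOW_PRINT | ALLOW_COPY;
        System.out.println(isAllowed(permissions, ALLOW_PRINT));  // prints "true"
        System.out.println(isAllowed(permissions, ALLOW_MODIFY)); // prints "false"
    }
}
```

Any permission left out of the OR expression stays unset, so in the example above printing and copying are allowed but modification is not.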
Some of the other salient features available with PD4ML are:
- Converting HTML headings or named anchors to PDF bookmarks
- Named anchors
- Inserting page breaks
- Generating and sending a PDF by email
Using PD4ML in Web applications for online PDF generation
PD4ML can be used in Web applications for online PDF generation from HTML, JSP and Servlet templates. A simple example is given below:
<%@ taglib uri="http://pd4ml.com/tlds/pd4ml/2.5" prefix="pd4ml" %>
<%@ page contentType="text/html; charset=UTF-8" %>
<pd4ml:transform
screenWidth="400"
pageFormat="A5"
pageOrientation="landscape"
pageInsets="100,100,100,100,points"
enableImageSplit="false">
<html>
<head>
<title>pd4ml test</title>
<style type="text/css">
body {color: red;
background-color: #FFFFFF;
font-family: Tahoma, "Sans-Serif";
font-size: 10pt;
}</style>
</head>
<body>
<img src="images/logos.gif" width="125" height="74">
<p>
Hello, World!<pd4ml:page.break/>
<table width="100%" style="background-color: #f4f4f4; color: #000000">
<tr>
<td>
Hello, New Page!</td>
</tr>
</table>
</body>
</html>
</pd4ml:transform>
To get PDF output, the HTML or JSP content must be surrounded with the PD4ML tags. The key points in the example above are:
- PD4ML JSP taglib declaration and opening transform tag. JSP content surrounded by the <pd4ml:transform> and </pd4ml:transform> tags is passed to the PD4ML converter.
- Images should be referenced with a relative path. Absolute URLs, like src="http://myserver:80/path/to/img.gif", are allowed as well, but src="/path/to/img.gif" is not allowed.
- The <pd4ml:page.break/> directive forces the PD4ML converter to insert a page break into the output PDF.
- Closing of the transformation tag. Any content that appears after the tag is ignored.
#### Defining PDF document footer (or header) with JSP custom tag
The header and/or footer for the PDF can be declared in the JSP in the following fashion.
<pd4ml:footer
titleTemplate="[${title}]"
pageNumberTemplate="page ${page}"
titleAlignment="left"
pageNumberAlignment="right"
color="#008000"
initialPageNumber="1"
pagesToSkip="1"
fontSize="14"
areaHeight="18"/>
Description
- titleTemplate: title template definition. A string that can optionally contain the placeholders ${title}, for a title value taken from the HTML TITLE tag, and ${page}, for a page counter value.
- pageNumberTemplate: page number template definition. A string with the placeholder ${page} for a page counter value.
- initialPageNumber: initializes the internal page counter with the given value.
- pagesToSkip: with the value 1, as here, the first page does not carry footer information.
- areaHeight: footer area height in points.
Adding Dynamic data
Dynamic data, such as session data or values computed in scriptlets, can be used in PDF generation. A simple example is given below.
<% String template = getFormattedDate() + ", page ${page} "; %>
<pd4ml:footer
pageNumberTemplate="<%=template%>"
.......
/>
This means that, with presentation frameworks such as Struts, the entire form can be generated just as in a normal JSP. This provides a clear separation, and seamless integration, between the presentation (format/layout) of the PDF document and the business logic behind its generation.
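The getFormattedDate() helper used in the scriptlet above is not shown in this article. One possible implementation is sketched below. The date pattern, the locale, and the choice to pass the date in as a parameter (which makes the helper easy to test) are all assumptions, not part of the original application.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class FooterTemplateSketch {
    // Assumed format; any pattern could be used here.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd MMM yyyy", Locale.ENGLISH);

    static String getFormattedDate(LocalDate date) {
        return date.format(FORMAT);
    }

    // Builds the pageNumberTemplate value; ${page} is left for PD4ML to fill in.
    static String footerTemplate(LocalDate date) {
        return getFormattedDate(date) + ", page ${page} ";
    }

    public static void main(String[] args) {
        // prints "15 Jan 2024, page ${page} " (with a trailing space)
        System.out.println(footerTemplate(LocalDate.of(2024, 1, 15)));
    }
}
```

In the JSP, the scriptlet would call such a helper with the current date, for example LocalDate.now().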
#### Temporarily saving the generated PDF to the hard drive
With the <pd4ml:savefile> tag, the just-generated PDF can be stored on the hard drive. The user's browser can then either be redirected to read the PDF as a static resource, or be redirected to another URL for PDF post-processing. The tag should be nested within <pd4ml:transform> and should not have a body. The two ways of redirecting the browser are described below.
#### Routing the browser to the PDF generated
Once the PDF is generated, the user can be directed to it using the following piece of code.
<pd4ml:savefile
uri="/WEB/savefile/saved/"
dir="D:/spool/generated_pdfs"
redirect="pdf"
debug="false"/>
The tag above forces PD4ML to save the generated PDF to D:/spool/generated_pdfs with an auto-generated name. It is expected that the local directory D:/spool/generated_pdfs corresponds to the URL http://yourserver.com/WEB/savefile/saved/ (as given in the "uri" attribute).
After generation, PD4ML sends the client's browser a redirect command with a URL like this:
http://yourserver.com/WEB/savefile/saved/generated_name.pdf
where:
- http://yourserver.com is the context path
- /WEB/savefile/saved is the URI given
- generated_name.pdf is the auto-generated file name
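The way the three parts combine into the redirect URL can be sketched as a small helper. This is only an illustration of the composition described above, not PD4ML code, and the class and method names are hypothetical. It also tolerates a missing or doubled slash between the parts.

```java
class RedirectUrlSketch {

    // Joins server base, uri attribute and file name into one URL.
    static String buildRedirectUrl(String serverBase, String uri, String fileName) {
        String base = serverBase.endsWith("/")
                ? serverBase.substring(0, serverBase.length() - 1)
                : serverBase;
        String path = uri.startsWith("/") ? uri : "/" + uri;
        if (!path.endsWith("/")) {
            path = path + "/";
        }
        return base + path + fileName;
    }

    public static void main(String[] args) {
        // prints "http://yourserver.com/WEB/savefile/saved/generated_name.pdf"
        System.out.println(buildRedirectUrl(
                "http://yourserver.com", "/WEB/savefile/saved/", "generated_name.pdf"));
    }
}
```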
Routing the browser to the next page
However, if the browser needs to be redirected to the next page instead of to the generated PDF, this can be done in the following way.
<pd4ml:savefile
dir="D:/spool/generated_pdfs"
redirect="/mywebapp/send_pdf_by_email.jsp"
debug="false"/>
The tag above forces PD4ML to save the generated PDF to D:/spool/generated_pdfs with an auto-generated name. After that, it forwards to /mywebapp/send_pdf_by_email.jsp with a request parameter filename=<pdfname>, so send_pdf_by_email.jsp can read the file name using:
String fileName = request.getParameter("filename");
//Building the full path of the PDF generated
String path = "D:/spool/generated_pdfs" + "/" + fileName;
That JSP can then read the just-generated PDF file and perform post-processing or any other action (such as e-mailing or uploading the file).
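Because the file name arrives as a request parameter, it is worth checking it before building the path, so that a value such as "../something" cannot point outside the spool directory. The sketch below is one way to do that with java.nio.file. It is an added precaution, not something PD4ML requires, and the class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SpoolPathSketch {

    // Resolves fileName inside spoolDir and rejects anything that escapes it.
    static Path resolveGeneratedPdf(String spoolDir, String fileName) {
        if (fileName == null || fileName.isEmpty()) {
            throw new IllegalArgumentException("missing filename parameter");
        }
        Path base = Paths.get(spoolDir).toAbsolutePath().normalize();
        Path candidate = base.resolve(fileName).normalize();
        if (!candidate.startsWith(base) || candidate.equals(base)) {
            throw new IllegalArgumentException(
                    "filename escapes the spool directory: " + fileName);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Path pdf = resolveGeneratedPdf("D:/spool/generated_pdfs", "report.pdf");
        System.out.println(pdf.getFileName()); // prints "report.pdf"
    }
}
```

In send_pdf_by_email.jsp, the value of request.getParameter("filename") would be passed as the second argument.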
In both cases above, the PDF file name can be predefined with the "name" attribute. If a file with that name already exists in D:/spool/generated_pdfs, then an auto-incremented numeric value is appended to the new file name.
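The collision handling described above can be pictured with the following sketch. The exact naming scheme PD4ML uses is not documented here, so the format below (a counter inserted before the extension) is an assumption, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class UniqueNameSketch {

    // Returns requestedName if it is free, otherwise the first free variant
    // with a counter before the extension (assumed scheme, not PD4ML's).
    static String nextFreeName(Set<String> existingNames, String requestedName) {
        if (!existingNames.contains(requestedName)) {
            return requestedName;
        }
        int dot = requestedName.lastIndexOf('.');
        String stem = dot < 0 ? requestedName : requestedName.substring(0, dot);
        String ext = dot < 0 ? "" : requestedName.substring(dot);
        for (int i = 1; ; i++) {
            String candidate = stem + i + ext;
            if (!existingNames.contains(candidate)) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) {
        // In practice the set could be filled from File.list() on the spool directory.
        Set<String> existing = new HashSet<>(Arrays.asList("report.pdf", "report1.pdf"));
        System.out.println(nextFreeName(existing, "report.pdf")); // prints "report2.pdf"
    }
}
```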
Instructions for Installation
PD4ML is intended to be used with JDK 1.3.1 and above. To deploy PD4ML either as a console application or for online generation, use the following jars, available at the PD4ML site (given in the references):
- pd4ml.jar
- pd4ml_tl.jar (for the tag library)
Professional Version Features
Apart from the various features discussed above, the licensed professional version includes lots of additional features such as:
- TTF embedding
- Configuring Fonts directory
- Embedding fonts to PDF from Java API
- Embedding fonts to PDF from JSP
- Watermark images
- Table of contents
- General notes
Other libraries
A few other libraries available for PDF generation are Apache FOP and iText.
Apache FOP
Apache FOP (Formatting Objects Processor) is a print formatter driven by XSL formatting objects (XSL-FO) and an output independent formatter. It is a Java application that reads a formatting object (FO) tree and renders the resulting pages to a specified output. Output formats currently supported include PDF, PS, PCL, AFP, XML (area tree representation), Print, AWT and PNG, and to a lesser extent, RTF and TXT. The primary output target is PDF.
iText
> “iText is an open source library that allows you to create and manipulate PDF documents. It enables developers looking to enhance web and other applications with dynamic PDF document generation and/or manipulation.”
> - http://itextpdf.com/