iTextSharp – HTML to PDF – Prerequisites


Before we get into the nitty-gritty of parsing HTML so that we can create a PDF from it, we need to understand how text layout works in iTextSharp.  Today we will cover those basics.

The first element we need when we turn our HTML into a PDF is the Paragraph.

When we get to actually parsing our HTML, we will use the Paragraph object for all of our block elements.  A Paragraph can contain other Paragraphs and Chunks, each of which we can format separately.

The Chunk is the second object we will be using.  It is the main object that lets us set the font.  In fact, even if our block element specifies a particular font, that font doesn’t actually get applied in the code until we add the text as a Chunk.

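Typical code to place text into a PDF document looks something like this:

// The font is attached to the Chunk, not to the Paragraph that holds it.
// Color here is iTextSharp.text.Color (named BaseColor in iTextSharp 5.x).
Paragraph p = new Paragraph(new Chunk("text that needs a font",
    FontFactory.GetFont("Arial", 10, Font.NORMAL, Color.BLACK)));
p.Alignment = Element.ALIGN_CENTER;
ct.AddElement(p);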

where “ct” is the ColumnText object that we discussed last week.
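Since a block element often mixes several styles, here is a minimal sketch of one Paragraph holding two Chunks with different fonts.  The class name, method name, and sample strings are made up for illustration; only the iTextSharp calls themselves come from the library.

```csharp
using iTextSharp.text;

// Hypothetical helper class, for illustration only.
public static class ParagraphSketch
{
    public static Paragraph BuildMixedParagraph()
    {
        // Start with an empty Paragraph to stand in for the block element.
        Paragraph p = new Paragraph();

        // Each Chunk carries its own font, so formatting changes
        // inside the block become separate Chunks.
        p.Add(new Chunk("Plain text, then ",
            FontFactory.GetFont("Arial", 10, Font.NORMAL)));
        p.Add(new Chunk("bold text",
            FontFactory.GetFont("Arial", 10, Font.BOLD)));

        return p;
    }
}
```

The result can be handed to the ColumnText the same way as before, with ct.AddElement(ParagraphSketch.BuildMixedParagraph()).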

The only other two classes we need to discuss are the list classes, List and ListItem.  A List handles both the OL and UL tags, and a ListItem handles each individual item within the list.  The first parameter of the List constructor, numbered, selects which kind of list we get: true for a numbered list (OL) and false for a bulleted list (UL).
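As a rough sketch of how these two classes fit together, the helper below builds either kind of list from a set of strings.  The ListSketch class, the BuildList method, and the 20 point symbol indent are assumptions chosen for the example, not part of the parser code in this series.

```csharp
using iTextSharp.text;

// Hypothetical helper class, for illustration only.
public static class ListSketch
{
    // numbered == true  -> ordered list (OL)
    // numbered == false -> unordered list (UL)
    public static List BuildList(bool numbered, params string[] items)
    {
        // Second argument is the symbol indent in points; 20 is an arbitrary choice.
        List list = new List(numbered, 20f);

        foreach (string text in items)
        {
            // One ListItem per LI element in the HTML.
            list.Add(new ListItem(text));
        }

        return list;
    }
}
```

A List goes into the ColumnText just like a Paragraph does, for example ct.AddElement(ListSketch.BuildList(true, "First", "Second")).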

I have not yet added table handling to my HTML parser, mainly because I have not needed it.  Once I show you how to create tables and how to parse HTML, you should be able to add the table parsing code yourself.
