Archive for July 20th, 2009
iTextSharp – HTML to PDF – Cleaning HTML
The last prerequisite step prior to actually converting our HTML into PDF code is to clean up the HTML.
The method I use takes advantage of the XML parser in .NET, but in order to use it the HTML has to be well-formed XHTML.
For this exercise, what I am most concerned about is that every HTML tag has a matching closing tag, that the tags are properly nested, and that the tag names are all lower case.
Some of this we will have to rely on the user to provide, like properly nesting the tags. But some of this we can attempt to clean up in our code. If you know you will have complete control over your HTML, you might be able to skip this step. But I think the code is simple enough that you’ll want to add it anyhow.
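As a rough illustration of the kind of cleanup described above, here is a minimal C# sketch. It is not the code from this series: the class name `HtmlCleaner` and the methods `Clean` and `TryParse` are hypothetical, and the list of tags treated as self-closing (`br`, `hr`, `img`, `input`) is an assumption you would adjust for your own markup. It lower-cases tag names, self-closes those empty tags, and then lets the .NET XML parser confirm that what remains is well formed. Proper nesting is not repaired, only detected.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class HtmlCleaner
{
    // "<" or "</" followed by a tag name; used to lower-case the name.
    private static readonly Regex TagName =
        new Regex(@"<(/?)([A-Za-z][A-Za-z0-9]*)", RegexOptions.Compiled);

    // Empty elements that HTML authors often leave unclosed.
    // Assumption: this short list is enough for the input at hand.
    // Limitation: an attribute value containing ">" will confuse it.
    private static readonly Regex VoidTag =
        new Regex(@"<(br|hr|img|input)\b([^>]*?)\s*/?>",
                  RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Lower-cases tag names (attribute names are left alone) and
    // rewrites <br>, <hr>, <img ...>, <input ...> as self-closing tags.
    public static string Clean(string html)
    {
        string lowered = TagName.Replace(
            html,
            m => "<" + m.Groups[1].Value + m.Groups[2].Value.ToLowerInvariant());

        return VoidTag.Replace(lowered, "<$1$2 />");
    }

    // Wraps the fragment in a single root element and asks the XML
    // parser whether it is well formed. Returns false on bad nesting
    // or missing closing tags; nothing is repaired here.
    public static bool TryParse(string xhtml, out XmlDocument doc)
    {
        doc = new XmlDocument();
        try
        {
            doc.LoadXml("<root>" + xhtml + "</root>");
            return true;
        }
        catch (XmlException)
        {
            doc = null;
            return false;
        }
    }
}
```

Calling `HtmlCleaner.Clean("<P>Hello<BR>world</P>")` yields `<p>Hello<br />world</p>`, which `TryParse` accepts. Input such as `<b><i>x</b></i>` is still rejected, which is the part we have to rely on the user to get right.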

