iTextSharp – HTML to PDF – Finishing Up

In the last post I mentioned there were a few topics we still needed to close up. Two remain: popping the attribute information off the stack when we hit a closing element, and dealing with the gap that normally appears between paragraph elements.

The first thing you’ll want to do when you hit a closing element is to retrieve its name again, just like we did for the opening element. Once you have that, you can pop the attribute information off the stack(s).
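Here is a minimal sketch of that push-on-open, pop-on-close pattern. It is not the code from this series: the class name, the single stack of tag/style pairs, and the use of XmlReader are all assumptions made to keep the example self-contained. Note that an empty element such as <br/> never produces a closing node, so nothing is pushed for it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical stand-in for the attribute stack(s) built in the earlier posts.
public static class ClosingElementSketch
{
    // Returns "tag:style" for each element, in the order the elements close.
    public static List<string> Walk(string xhtml)
    {
        var attributes = new Stack<KeyValuePair<string, string>>();
        var closed = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xhtml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && !reader.IsEmptyElement)
                {
                    // Opening element: remember its name and attribute information.
                    attributes.Push(new KeyValuePair<string, string>(
                        reader.Name.ToLowerInvariant(),
                        reader.GetAttribute("style") ?? string.Empty));
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    // Closing element: retrieve the name again, then pop.
                    string tagName = reader.Name.ToLowerInvariant();
                    KeyValuePair<string, string> top = attributes.Pop();
                    if (top.Key != tagName)
                    {
                        throw new InvalidOperationException(
                            "Stack out of sync at </" + tagName + ">");
                    }
                    closed.Add(tagName + ":" + top.Value);
                }
            }
        }
        return closed;
    }
}
```

The name check on the way out is cheap insurance: if the stack ever gets out of step with the markup, you find out at the closing tag rather than several paragraphs later.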

You’ll also want to undo any indentation that you applied when you processed the opening element.
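One way to keep the indentation honest is to record exactly what each opening element added, so the closing element can subtract the same amount. The sketch below assumes that approach; IndentTracker and its members are hypothetical names, not part of iTextSharp or of the code in this series.

```csharp
using System.Collections.Generic;

// Hypothetical helper: tracks the running left indentation across nested elements.
public class IndentTracker
{
    // One entry per open element, holding the amount that element added.
    private readonly Stack<float> applied = new Stack<float>();

    // The indentation currently in effect.
    public float Current { get; private set; }

    // Call at the opening element (pass 0 for tags that do not indent).
    public void Open(float amount)
    {
        applied.Push(amount);
        Current += amount;
    }

    // Call at the closing element: removes exactly what Open added.
    public void Close()
    {
        Current -= applied.Pop();
    }
}
```

Current is then the value you would hand to whatever paragraph you create next.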

To handle the paragraph break, I defined a _crlfAtEnd entry (keyed as the tag name plus "_crlfAtEnd") in my resource file. If it is defined as true for the tag being closed, I add an extra line feed at the end to account for the gap.

    // Only block-level tags end the current paragraph.
    isBlock = Resources.html2pdf
        .ResourceManager
        .GetString(tagName + "_isBlock");
    if (isBlock != null &&
        isBlock.ToLower() == "true")
    {
        // Tags flagged with <tagName>_crlfAtEnd get an extra line feed
        // to reproduce the gap between paragraphs.
        string crlfAtEnd = Resources.html2pdf
            .ResourceManager
            .GetString(tagName + "_crlfAtEnd");
        if (crlfAtEnd != null &&
            crlfAtEnd.ToLower() == "true")
        {
            et = stack.Peek();
            Font f = getCurrentFont();
            if (et is Phrase)
            {
                ((Phrase)(et)).Add(
                    new Chunk("\n\r", f));
                stack.Pop();
            }
        }
        // Start a fresh, empty paragraph for whatever comes next.
        p = new Paragraph();
        ((Paragraph)p).Add("");
        ((Paragraph)p).SetLeading(m_leading, 1);
        list.Add(p);
        stack.Push(p);
    }
 

One problem I’ve had with this in the past is that the cr/lf gets added at the end even when the block is the last block. I really need to find some way to detect that this is the last place it occurs, whether the block is nested or outermost. But I’ll leave that enhancement for you.
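If you want a starting point for that enhancement, one option is to stop trying to detect the last block at all and instead defer the line feed until more content actually arrives. The sketch below only models the idea with a list of strings; DeferredGapWriter and its members are hypothetical, and I have not tried this against iTextSharp itself.

```csharp
using System.Collections.Generic;

// Hypothetical model: the gap is remembered at the end of a block and
// only written when more text follows, so the final block gets none.
public class DeferredGapWriter
{
    // Stand-in for the PDF content; a real version would add Chunks to a Phrase.
    public readonly List<string> Output = new List<string>();

    private bool gapPending;

    public void AddText(string text)
    {
        if (gapPending)
        {
            Output.Add("\n");
            gapPending = false;
        }
        Output.Add(text);
    }

    // Call at a closing block element; crlfAtEnd is the flag from the resource file.
    public void EndBlock(bool crlfAtEnd)
    {
        if (crlfAtEnd)
        {
            gapPending = true;
        }
    }
}
```

Because the flag is a simple boolean, a nested block that closes right before its parent closes still produces only one gap, and a block at the very end of the document produces none.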

  • LeeZilla

    I am replacing some really REALLY bad office automation stuff. Interop = evil in a production system. The existing app stores the data to be printed in HTML. This is exactly what I would want to do for my usage as well. Is this the most up to date article? Since writing this, have you found any glaring limitations or problems with the HTML parsing?

    Thanks for the great information. Building on this work will save me a massive amount of time.

    • Dave

      Wrote this 8/12/09, you can’t get much more recent than that

  • LeeZilla

    Sure enough! The current year IS 2009! I blame it on my general lack of coffee so far today…

    Still, my question stands. Even at this early juncture, do you feel you should have / could have done something different? I must add, I like the use of the XMLTextReader object to parse the HTML. That was a pretty inspired bit of engineering you did!

  • Ankit

    Hi, I want to view or display a PDF file in the browser using iTextSharp. Is that possible? If yes, how? If not, do you have an alternative solution? Thanks.

    • Dave

      Did you read the whole series and you still have this question?