iTextSharp – HTML to PDF – Finishing Up

In the last post I mentioned that there were a few topics we still needed to close up.  The two we’ve left undone are popping the attribute information off the stack when we hit a closing element, and dealing with the gap that normally appears between paragraph elements.

The first thing you’ll want to do when you hit a closing element is retrieve its name again, just as we did for the opening element.  Once you have the name, you can pop the attribute information off the stack(s).

You’ll also want to undo any indentation that you applied during the opening element.
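
Here is a minimal sketch of that bookkeeping, separate from the real converter. It assumes the markup is well-formed enough for XmlReader to walk, and the names ClosingElementSketch, IndentedTags, IndentStep and attributeStack are made up for illustration; they are not part of iTextSharp or of the code from the earlier posts.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ClosingElementSketch
{
    // Hypothetical values: which tags indent their content, and by how much.
    private const float IndentStep = 20f;
    private static readonly HashSet<string> IndentedTags =
        new HashSet<string> { "ul", "ol", "blockquote" };

    // Walks the markup and returns the indentation still in effect at the
    // end. A result of 0 means every opening element was undone on close.
    public static float Walk(string xhtml, List<string> log)
    {
        var attributeStack = new Stack<Dictionary<string, string>>();
        float indent = 0f;

        using (var reader = XmlReader.Create(new StringReader(xhtml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    string tagName = reader.Name.ToLowerInvariant();
                    // Must be read before moving onto the attributes.
                    bool isEmpty = reader.IsEmptyElement;

                    var attributes = new Dictionary<string, string>();
                    while (reader.MoveToNextAttribute())
                        attributes[reader.Name] = reader.Value;
                    reader.MoveToElement();

                    // <br/> and friends never produce an EndElement node,
                    // so nothing is pushed for them.
                    if (isEmpty) continue;

                    attributeStack.Push(attributes);
                    if (IndentedTags.Contains(tagName)) indent += IndentStep;
                    log.Add("open " + tagName);
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    // Retrieve the name again, just like on the way in.
                    string tagName = reader.Name.ToLowerInvariant();

                    // Pop what the opening element pushed...
                    attributeStack.Pop();
                    // ...and undo any indentation it applied.
                    if (IndentedTags.Contains(tagName)) indent -= IndentStep;
                    log.Add("close " + tagName);
                }
            }
        }
        return indent;
    }
}
```

The point of the sketch is the symmetry: whatever the opening element pushed or added, the closing element pops or subtracts, keyed off the same tag name.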

To handle the paragraph break, I defined a _crlfAtEnd entry for the tag in my resource file (looked up as the tag name plus "_crlfAtEnd").  If it is set to true, I add an extra line break at the end of the block to account for the gap.

    // Only block-level tags (tagName + "_isBlock" set to "true") are handled here.
    isBlock = Resources.html2pdf
        .ResourceManager
        .GetString(tagName + "_isBlock");
    if (isBlock != null &&
        isBlock.ToLower() == "true")
    {
        // Does this tag want an extra line break after it?
        string crlfAtEnd = Resources.html2pdf
            .ResourceManager
            .GetString(tagName + "_crlfAtEnd");
        if (crlfAtEnd != null &&
            crlfAtEnd.ToLower() == "true")
        {
            et = stack.Peek();
            Font f = getCurrentFont();
            if (et is Phrase)
            {
                // Append the line break that produces the gap, then
                // take the finished element off the stack.
                ((Phrase)(et)).Add(
                    new Chunk("\n\r", f));
                stack.Pop();
            }
        }
        // Start a fresh paragraph for whatever content comes next.
        p = new Paragraph();
        ((Paragraph)p).Add("");
        ((Paragraph)p).SetLeading(m_leading, 1);
        list.Add(p);
        stack.Push(p);
    }
 

One problem I’ve had with this in the past is that the cr/lf gets added even when the block is the last one in the document.  I really need some way to detect that a block is the final one, whether it is nested or in the outermost block, and skip the extra line break there.  But I’ll leave that enhancement for you.
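
One possible way to approach that enhancement, sketched here without iTextSharp so it stands alone: instead of writing the gap when a block closes, remember that a gap is owed and only write it when more content actually arrives. The DeferredGapWriter class and its members are hypothetical names for illustration, and plain strings stand in for the Phrase and Chunk objects.

```csharp
using System;
using System.Collections.Generic;

public class DeferredGapWriter
{
    private readonly List<string> lines = new List<string>();
    private bool gapPending;

    // Call whenever real content is about to be written. If an earlier
    // block asked for a gap, it is written now, just before the content.
    public void AddText(string text)
    {
        if (gapPending)
        {
            lines.Add("");      // stands in for the extra line-break chunk
            gapPending = false;
        }
        lines.Add(text);
    }

    // Call from the closing-element handler. The gap is only recorded,
    // so several nested blocks closing in a row still owe a single gap.
    public void CloseBlock(bool crlfAtEnd)
    {
        if (crlfAtEnd) gapPending = true;
    }

    public IList<string> Lines
    {
        get { return lines; }
    }
}
```

Because a gap that is still pending when the document ends is simply never written, the last block gets no trailing cr/lf, whether it is nested or outermost. In the real converter, the flag would be checked wherever the next piece of content is added to the current Phrase.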


5 Responses to “iTextSharp – HTML to PDF – Finishing Up”

  • LeeZilla:

    I am replacing some really REALLY bad office automation stuff. Interop = evil in a production system. The existing app stores the data to be printed in HTML. This is exactly what I would want to do for my usage as well. Is this the most up to date article? Since writing this, have you found any glaring limitations or problems with the HTML parsing?

    Thanks for the great information. Building on this work will save me a massive amount of time.

  • LeeZilla:

    Sure enough! The current year IS 2009! I blame it on my general lack of coffee so far today…

    Still, my question stands. Even at this early juncture, do you feel you should have / could have done something different? I must add, I like your use of the XMLTextReader object to parse the HTML. That was a pretty inspired bit of engineering you did!

  • Ankit:

    Hi, I want to view or display a PDF file in the browser using iTextSharp. Is it possible? If yes, then how, and if no, do you have an alternate solution for that? Thanks
