Monday 28 November 2011

Merge Multiple PDF files using the iTextSharp library

I had a bit of an issue today. I'd been given an e-book split across multiple PDF files, and when I say multiple, I mean multiple: one per chapter of a twenty-six-chapter book, plus a couple of appendices, the index and the table of contents. All in all, over 30 files, and I wanted them merged into one.

I did a bit of searching to see whether I could merge them with OpenOffice, and a link took me to the iText library, a free Java library with a C# port (iTextSharp) that can read, parse and do all manner of things with PDF files.

I know there are myriad applications that join, merge and otherwise manipulate PDF files, not least Adobe Acrobat, which you can use for free for 30 days, but sometimes it's fun to tinker a little bit.

I found this sample, but it looks like it's from 2006 and the code does not work with the current version (5.1.2.0). Somebody posted an example that looked more promising but, alas, it did not work either, so I modified it to work with the current library. I also modified it so that it would join any number of files.

Here is the code (hopefully it is self-explanatory):

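    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Windows.Forms;   // only needed for MessageBox
    using iTextSharp.text;
    using iTextSharp.text.pdf;

    // Both methods go inside a class of your own.

    // Merges the PDFs listed in "files", in list order, into "destinationfile".
    public static void MergePdfFiles(string destinationfile, List<string> files)
    {
        List<PdfReader> readers = new List<PdfReader>();

        try
        {
            // Open a reader for every source file.
            foreach (string file in files)
            {
                readers.Add(new PdfReader(file));
            }

            // The first page of the first file sets the initial page size;
            // WritePage resets the size for every page it imports.
            Document document = new Document(readers[0].GetPageSizeWithRotation(1));

            using (FileStream output = new FileStream(destinationfile, FileMode.Create))
            {
                PdfWriter writer = PdfWriter.GetInstance(document, output);

                document.Open();

                foreach (PdfReader reader in readers)
                {
                    WritePage(reader, document, writer);
                }

                // Close the document while the stream is still open so the
                // end of the PDF gets written.
                document.Close();
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show("An error occurred: " + ex.Message);
        }
        finally
        {
            // Release the source files whether or not the merge succeeded.
            foreach (PdfReader reader in readers)
            {
                reader.Close();
            }
        }
    }

    // Copies every page of "reader" into the open document, compensating for
    // each page's rotation so that it is displayed the right way up.
    // Errors are left to propagate to the handler in MergePdfFiles.
    private static void WritePage(PdfReader reader, iTextSharp.text.Document document, PdfWriter writer)
    {
        PdfContentByte cb = writer.DirectContent;

        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            // Page size as displayed, i.e. with the rotation already applied.
            iTextSharp.text.Rectangle size = reader.GetPageSizeWithRotation(i);
            document.SetPageSize(size);
            document.NewPage();

            PdfImportedPage page = writer.GetImportedPage(reader, i);
            int rotation = reader.GetPageRotation(i);

            if (rotation == 90)
            {
                cb.AddTemplate(page, 0, -1f, 1f, 0, 0, size.Height);
            }
            else if (rotation == 180)
            {
                // Matrix suggested in the comments below.
                cb.AddTemplate(page, -1f, 0, 0, -1f, size.Width, size.Height);
            }
            else if (rotation == 270)
            {
                // Reusing the 90 degree matrix here would draw the page
                // upside down, so 270 gets its own matrix.
                cb.AddTemplate(page, 0, 1f, -1f, 0, size.Width, 0);
            }
            else
            {
                cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
            }
        }
    }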
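
As an aside, iTextSharp also has a PdfCopy class that copies whole pages rather than redrawing them, so no rotation matrix is needed and each page should keep its own size and rotation. What follows is only a minimal sketch, assuming iTextSharp 5.x; the class name PdfCopyMerger and the method name MergeWithPdfCopy are made up for illustration and are not part of the library or of the code above.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfCopyMerger
{
    // Hypothetical helper: merges the files in list order by copying
    // whole pages with PdfCopy instead of drawing them as templates.
    public static void MergeWithPdfCopy(string destinationFile, List<string> files)
    {
        using (FileStream output = new FileStream(destinationFile, FileMode.Create))
        {
            Document document = new Document();
            PdfCopy copy = new PdfCopy(document, output);
            document.Open();

            foreach (string file in files)
            {
                PdfReader reader = new PdfReader(file);
                try
                {
                    for (int i = 1; i <= reader.NumberOfPages; i++)
                    {
                        // The page is copied as it is, rotation included.
                        copy.AddPage(copy.GetImportedPage(reader, i));
                    }

                    // Write out what was imported from this reader
                    // before the reader is closed.
                    copy.FreeReader(reader);
                }
                finally
                {
                    reader.Close();
                }
            }

            // Closing the document finishes the output PDF.
            document.Close();
        }
    }
}
```

Note that this sketch does no error reporting, and that closing a document with no pages throws, so the list of files should not be empty.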

The code joins the PDF files in the order in which the files list stores them, so if you need a different order you can sort the list with the List<T>.Sort method before calling MergePdfFiles.
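
A plain alphabetical sort would put "chapter10.pdf" before "chapter2.pdf", so for a book split by chapter it may be worth sorting on the number in the file name. Here is a rough sketch of that idea; the class ChapterOrder, its methods and the file naming scheme are assumptions made for illustration, so adjust them to however your files are actually named.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class ChapterOrder
{
    // Returns the first run of digits in the file name as a number.
    // Files without a usable number (index, table of contents, ...)
    // get int.MaxValue so that they sort after the numbered chapters.
    private static int ChapterNumber(string path)
    {
        Match m = Regex.Match(Path.GetFileNameWithoutExtension(path), @"\d+");
        int n;
        return m.Success && int.TryParse(m.Value, out n) ? n : int.MaxValue;
    }

    // Sorts the list in place: by chapter number first, then by name.
    public static void SortByChapter(List<string> files)
    {
        files.Sort(delegate(string a, string b)
        {
            int byNumber = ChapterNumber(a).CompareTo(ChapterNumber(b));
            return byNumber != 0
                ? byNumber
                : string.Compare(a, b, StringComparison.OrdinalIgnoreCase);
        });
    }
}
```

Call ChapterOrder.SortByChapter(files) first and then pass the same list to MergePdfFiles.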

7 comments:

  1. Works as it is, thanks a lot for this post.
    Just one addition, for the case where rotation is 180:
    else if (rotation == 180)
    {
    cb.AddTemplate(page, -1f, 0, 0,-1f, reader.GetPageSizeWithRotation(i).Width, reader.GetPageSizeWithRotation(i).Height);
    }

  2. Fillable form fields in the source documents will not be editable in the destination PDF...

  3. Thanks for this code, good job. Just one thing: where are the comments? :(

  4. Brilliant: the best, most straightforward example on the net. Thank you.

  5. You can try this free online pdf to image converter to convert pdf to multipage tiff.
