Monday 28 November 2011

Merge Multiple PDF files using the iTextSharp library

I had a bit of an issue today. I'd been given an e-book split across multiple PDF files, and when I say multiple, I mean multiple: one per chapter of a twenty-six chapter book, plus a couple of appendices, the index and the table of contents. All in all, over 30 files that I wanted to merge into a single PDF.

I did a bit of searching to see whether I could merge them with OpenOffice, and a link took me to the iText library, a free Java and C# library (the C# port is called iTextSharp) that can read, parse and do all manner of things with PDF files.

I know that there are a myriad of applications that join, merge, etc. PDF files, not least Adobe Acrobat, which you can use for free for 30 days, but sometimes it's fun to tinker a little bit.

I found a sample, but it looks like it's from 2006 and the code does not work with the current version of the library. Somebody posted an example that looked more promising, but alas it did not work either, so I modified it to work with the current library. I also modified it so that it would join any number of files.

Here is the code (hopefully self-explanatory):

    // Place these directives at the top of the file and both methods inside a class.
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Windows.Forms;
    using iTextSharp.text;
    using iTextSharp.text.pdf;

    // Merges the given PDF files, in list order, into destinationfile.
    public static void MergePdfFiles(string destinationfile, List<string> files)
    {
        Document document = null;
        List<PdfReader> readers = new List<PdfReader>();

        try
        {
            // Open a reader for every source file.
            foreach (string file in files)
            {
                readers.Add(new PdfReader(file));
            }

            // Size the document after the first page of the first file.
            document = new Document(readers[0].GetPageSizeWithRotation(1));

            PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(destinationfile, FileMode.Create));

            document.Open();

            // Append the pages of each source file in turn.
            foreach (PdfReader reader in readers)
            {
                WritePage(reader, document, writer);
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show("An error occurred: " + ex.Message);
        }
        finally
        {
            // document is still null if the list was empty or a file could not be read.
            if (document != null)
            {
                document.Close();
            }

            foreach (PdfReader reader in readers)
            {
                reader.Close();
            }
        }
    }

    // Appends every page of reader to document. Exceptions propagate to the caller.
    private static void WritePage(PdfReader reader, iTextSharp.text.Document document, PdfWriter writer)
    {
        PdfContentByte cb = writer.DirectContent;

        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            // Start a new output page with the size of the source page as displayed.
            iTextSharp.text.Rectangle size = reader.GetPageSizeWithRotation(i);
            document.SetPageSize(size);
            document.NewPage();

            PdfImportedPage page = writer.GetImportedPage(reader, i);
            int rotation = reader.GetPageRotation(i);

            // The imported content is unrotated, so apply the page's rotation here.
            if (rotation == 90)
            {
                cb.AddTemplate(page, 0, -1f, 1f, 0, 0, size.Height);
            }
            else if (rotation == 180)
            {
                cb.AddTemplate(page, -1f, 0, 0, -1f, size.Width, size.Height);
            }
            else if (rotation == 270)
            {
                // Mirror image of the 90 degree case; reusing that matrix would flip the page.
                cb.AddTemplate(page, 0, 1f, -1f, 0, size.Width, 0);
            }
            else
            {
                cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
            }
        }
    }

The code will join the PDF files in the order in which they are stored in the files list, so if you need a different order you can sort the list with the List<T>.Sort method before calling MergePdfFiles.
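
As a rough sketch of that sorting step: a plain files.Sort() orders names alphabetically, which would put "chapter10.pdf" before "chapter2.pdf". The example below sorts by the first number found in each file name instead. The ChapterNumber helper, the MergeOrderExample class and the file names are all made up for illustration and assume that each chapter file has its number in its name; adjust to however your files are actually named.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MergeOrderExample
{
    // Hypothetical helper: returns the first run of ASCII digits in the file
    // name as a number, or int.MaxValue if there is none, so that unnumbered
    // files (index, appendices, ...) end up after the numbered chapters.
    public static int ChapterNumber(string path)
    {
        string name = Path.GetFileName(path);
        int value = 0;
        bool found = false;

        foreach (char c in name)
        {
            if (c >= '0' && c <= '9')
            {
                value = value * 10 + (c - '0');
                found = true;
            }
            else if (found)
            {
                break; // the first run of digits has ended
            }
        }

        return found ? value : int.MaxValue;
    }

    public static void Main()
    {
        // Made-up file names; substitute the real paths.
        List<string> files = new List<string>
        {
            "chapter10.pdf", "chapter2.pdf", "chapter1.pdf", "index.pdf"
        };

        // Sort with a comparison delegate instead of the default string order.
        files.Sort((a, b) => ChapterNumber(a).CompareTo(ChapterNumber(b)));

        Console.WriteLine(string.Join(", ", files));
        // prints: chapter1.pdf, chapter2.pdf, chapter10.pdf, index.pdf

        // MergePdfFiles("book.pdf", files);
    }
}
```

Note that List<T>.Sort is not a stable sort, so files that share the same key (for example several unnumbered files) may come out in any order relative to each other; give those a more specific key if their order matters.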


  1. Works as it is... thanks a lot for this post...
    Just want to add a bit for when rotation is 180:
    else if (rotation == 180)
        cb.AddTemplate(page, -1f, 0, 0, -1f, reader.GetPageSizeWithRotation(i).Width, reader.GetPageSizeWithRotation(i).Height);

  2. The form fields of a fillable document will not be editable in the destination PDF...

  3. Thanks for this code, good job. Just one thing: where are the comments? :(

  4. Brilliant- best, most straightforward example on the net. Thank you.

