Tuesday, December 17, 2013

Processing PDF files in Linux

When preparing research papers on Linux with open source tools such as LaTeX, it is sometimes necessary to handle files in PDF format. For example, for a research paper submission, I had to attach all the diagrams and graphs included in the paper as a separate annex in the PDF file. However, my LaTeX manuscript needs the diagrams and graphs in EPS format to include them in the paper. So I had to process the PDF of the paper and the EPS diagrams separately and then merge them. For future reference, I'm noting down the tools I used and how I used them.

I prepared my diagrams with the Dia tool and exported them as EPS files for use in the LaTeX manuscript. To convert such an EPS file to a PDF file, I can use a command line tool called "epstopdf". For example, let's say I have an EPS file named "collision_ratio.eps". I can convert it to PDF, producing "collision_ratio.pdf", on the command line as follows.

epstopdf collision_ratio.eps
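
When a paper has many diagrams, the same conversion can be run in a loop. The following is only a minimal sketch: the function name convert_all_eps and the directory argument are my own assumptions, and it relies on the default epstopdf behaviour of writing the PDF next to the input file with the extension replaced.

```shell
#!/bin/sh
# convert_all_eps is a hypothetical helper name, not part of epstopdf.
# It converts every .eps file in the given directory (default: the
# current directory) to PDF.
convert_all_eps() {
    dir="${1:-.}"
    for eps in "$dir"/*.eps; do
        # If no file matches, the pattern is left unexpanded; skip it.
        [ -e "$eps" ] || continue
        # By default epstopdf writes name.pdf next to name.eps.
        epstopdf "$eps"
    done
}

# Example usage (assumes the diagrams are in the current directory):
#   convert_all_eps .
```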

A new requirement arose after this conversion. The diagram in the resulting PDF file is small and does not fill an A4 page. So I resized it to A4 using the "Ghostscript" tool as follows. The input file is "collision_ratio.pdf" and the resized output file is "collision_ratio_A4.pdf".

gs  -sOutputFile=collision_ratio_A4.pdf  -sDEVICE=pdfwrite  -sPAPERSIZE=a4  -dCompatibilityLevel=1.4  -dNOPAUSE  -dBATCH  -dPDFFitPage  collision_ratio.pdf
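
To avoid retyping the options for every diagram, the command above can be wrapped in a small shell function. This is a sketch under my own assumptions: the name resize_to_a4 is hypothetical, and the output name is derived by appending "_A4" to the input name, as in the example above. The Ghostscript options are exactly the ones already shown.

```shell
#!/bin/sh
# resize_to_a4 is a hypothetical helper name.
# It scales the given PDF to fit an A4 page and writes name_A4.pdf
# next to the input file name.pdf.
resize_to_a4() {
    in="$1"
    out="${in%.pdf}_A4.pdf"
    gs -sOutputFile="$out" -sDEVICE=pdfwrite -sPAPERSIZE=a4 \
       -dCompatibilityLevel=1.4 -dNOPAUSE -dBATCH -dPDFFitPage "$in"
}

# Example usage:
#   resize_to_a4 collision_ratio.pdf
```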

Finally, I needed to merge multiple PDF files and sometimes rearrange the pages within a single PDF file. There are many open source tools on Linux for this kind of task. In my case I used a tool called "PDFMod". It can be installed on Ubuntu Linux with the following command.

sudo apt-get install pdfmod

PDFMod is a GUI tool, so users can graphically add as many PDF files as they need, rearrange the pages and finally export the result as a new PDF file. It is a really useful tool for manipulating PDF files. These are the tools I used for my recent work, but there are many other tools available for processing and manipulating PDF files.
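
As one command line alternative for the merging step, Ghostscript itself can concatenate PDF files when several inputs are given to the pdfwrite device. This is not what I used for my submission; it is only a sketch, and the function name merge_pdfs and the file names in the usage line are hypothetical. The pages appear in the output in the order the input files are listed.

```shell
#!/bin/sh
# merge_pdfs is a hypothetical helper name.
# First argument: the output file. Remaining arguments: the input
# PDF files, in the order their pages should appear.
merge_pdfs() {
    out="$1"
    shift
    gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -sOutputFile="$out" "$@"
}

# Example usage (paper.pdf and submission.pdf are assumed names):
#   merge_pdfs submission.pdf paper.pdf collision_ratio_A4.pdf
```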