Help:Splitting and joining PDF, DjVu and images
This page explains how to split and join (merge) PDF, DjVu and TIFF files, which may be required as part of a conversion process.
Enhancing the command line
Splitting and joining are usually done with command line tools. The standard Windows command shell is enough for the task, but some programs make the command line more convenient to use. One well-received example is ConEmu.
Splitting/joining PDF
Using command line tools
One of the command line tools that can split and join PDF files (among other features) is Coherent PDF (cpdf).
Splitting:
- Specifying the page range:
cpdf input.pdf 1-10,20,30,100-end -o output.pdf
will produce the file with the pages specified: 1 to 10, then 20, 30, then all the pages from 100 to the end.
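To make the range notation concrete, the following Python sketch expands such a specification into explicit page numbers. It is an illustration only and not cpdf's own parser: the function name `expand_range` is hypothetical, and it handles just the forms used above (single pages, `a-b` spans and the keyword `end`), whereas cpdf accepts more.

```python
from typing import List


def expand_range(spec: str, total_pages: int) -> List[int]:
    """Expand a range such as '1-10,20,30,100-end' into page numbers.

    Illustrative only: covers single pages, 'a-b' spans and the keyword
    'end'. This is not how cpdf itself is implemented.
    """
    def page(token: str) -> int:
        # 'end' stands for the last page of the document.
        return total_pages if token == "end" else int(token)

    pages: List[int] = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            pages.extend(range(page(first), page(last) + 1))
        else:
            pages.append(page(part))
    return pages


if __name__ == "__main__":
    # For a 102-page document this selects pages 1-10, 20, 30 and 100-102.
    print(expand_range("1-10,20,30,100-end", total_pages=102))
```

The pages are listed in the order the specification names them, which is also the order in which they appear in the output file.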
- Breaking the files into individual pages:
cpdf -split input.pdf -o output%%%.pdf
will give files output001.pdf, output002.pdf etc.
- Splitting on bookmarks:
cpdf -split-bookmarks 0 a.pdf -o out%%%.pdf
breaks the file on the top-level bookmarks.
cpdf -split-bookmarks 1 a.pdf -o out%%%.pdf
breaks the file on the top-level bookmarks and also on the first-level child bookmarks.
cpdf -split-bookmarks 0 a.pdf -o @B.pdf
uses the bookmarks for file names.
Joining:
cpdf input1.pdf input2.pdf [...] -o output.pdf
- All in the current directory:
cpdf *.pdf -o output.pdf
Doing a combined operation:
cpdf input1.pdf 1-10 input2.pdf 1,5,10 -o output.pdf
will create a file with pages 1-10 from input1.pdf and pages 1, 5 and 10 from input2.pdf.
More details are in the manual.
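When the same kind of combined operation has to be repeated for many books, the command can be assembled by a script. The Python sketch below builds the argument list in the form shown above; the function name `build_merge_command` is hypothetical, and actually running the command assumes that cpdf is installed and on the PATH.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple


def build_merge_command(
    parts: Sequence[Tuple[str, Optional[str]]], output: str
) -> List[str]:
    """Return the cpdf argument list for joining (file, range) pairs.

    A range of None means "all pages" and is simply omitted, which
    matches the plain joining form shown above.
    """
    command = ["cpdf"]
    for filename, page_range in parts:
        command.append(filename)
        if page_range is not None:
            command.append(page_range)
    command += ["-o", output]
    return command


if __name__ == "__main__":
    cmd = build_merge_command(
        [("input1.pdf", "1-10"), ("input2.pdf", "1,5,10")], "output.pdf"
    )
    print(" ".join(cmd))
    # Uncomment to run it for real (assumes cpdf is on the PATH):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than as one string avoids quoting problems with file names that contain spaces.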
Splitting using virtual printers
A PDF virtual printer is a piece of software that installs itself as a printer, which appears on the list of printers in the Print dialog box. When 'printing' with that printer, the result is saved as a PDF file on your computer.
PDF-XChange Lite Printer is one example of a free virtual PDF printer; many others exist.
To split a PDF with a virtual printer, open it in any PDF reader, 'print' it to the virtual printer, and specify the page numbers or ranges you need in the Print dialog.
The resulting PDF file will not be a byte-for-byte copy of the original pages, because the virtual printer re-encodes them in its own way. Depending on the algorithm it uses, the output file may end up better or worse in quality and larger or smaller in size.
Splitting/joining DjVu
Splitting and joining DjVu files is done with the djvm and djvmcvt tools from the DjVuLibre package. They are less flexible than cpdf. Joining (merging) is done with djvm, whose help text is self-explanatory:
DjVu multipage document manipulation utility
Usage:
To compose a multipage document:
    djvm -c[reate] <doc.djvu> <page_1.djvu> ... <page_n.djvu>
  where <doc.djvu> is the name of the BUNDLED document to be created, <page_n.djvu> are the names of the page files to be packed together.
To insert a new page into an existing document:
    djvm -i[nsert] <doc.djvu> <page.djvu> [<page_num>]
  where <doc.djvu> is the name of the BUNDLED DjVu document to be modified, <page.djvu> is the name of the single-page DjVu document file to be inserted as page <page_num> (page numbers start from 1). Negative or omitted <page_num> means to append the page. <page.djvu> can be another multipage DjVu document, in which case all pages of that document will be inserted into <doc.djvu> starting starting at page <page_num>
To delete a page from an existing document:
    djvm -d[elete] <doc.djvu> <page_num>
  where <doc.djvu> is the name of the docyment to be modified and <page_num> is the number of the page to be deleted
To list document contents:
    djvm -l[ist] <doc.djvu>
For example, to join all the DjVu files in the current directory, type
djvm -c book.djvu *.djvu
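If the page files are numbered without leading zeros, plain alphabetical ordering puts page10.djvu before page2.djvu, and on a second run book.djvu itself would also match *.djvu. The Python sketch below shows one way to build the same djvm -c command with the pages in natural (numeric) order and with the output file left out of the inputs. The names `natural_key` and `build_join_command` are hypothetical, and running the command assumes that djvm is on the PATH.

```python
import glob
import re
import subprocess
from typing import Iterable, List


def natural_key(name: str) -> list:
    """Sort key that compares runs of digits as numbers, not as text."""
    # re.split with a capturing group alternates text and digit runs,
    # so the pieces at odd positions are always the numbers.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def build_join_command(page_files: Iterable[str], output: str) -> List[str]:
    """Return the 'djvm -c' argument list with pages in natural order.

    The output file is excluded from the inputs in case it already
    exists in the same directory.
    """
    pages = sorted((f for f in page_files if f != output), key=natural_key)
    return ["djvm", "-c", output] + pages


if __name__ == "__main__":
    cmd = build_join_command(glob.glob("*.djvu"), "book.djvu")
    print(" ".join(cmd))
    # Uncomment to run it for real (assumes djvm is on the PATH):
    # subprocess.run(cmd, check=True)
```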
djvmcvt can be used to split a DjVu file into individual pages. From its help:
DjVu multipage document conversion utility
Usage:
To convert any DjVu multipage document into the new INDIRECT format:
    djvmcvt -i[ndirect] <doc_in.djvu> <dir_out> <idx_fname.djvu>
  where <dir_out> is the name of the output directory, and <idx_fname.djvu> is the name of the top-level document index file. The <doc_in.djvu> specifies the document to be converted.
For example,
djvmcvt -i input.djvu folder index.djvu
will create a series of one-page files in the 'folder' directory. You can select some of them and join them using djvm as described above.
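Another way to extract a page range, using only the djvm -d option quoted above, is to make a copy of the document and delete the unwanted pages from the copy. The Python sketch below works out which pages to delete and in what order; it is a sketch under stated assumptions rather than a tested recipe. The function names are hypothetical, the total page count has to be supplied by the caller, and the generated commands assume that djvm is on the PATH.

```python
from typing import List


def pages_to_delete(total_pages: int, first: int, last: int) -> List[int]:
    """Pages to remove so that only first..last (1-based, inclusive) remain.

    Returned highest first: deleting from the end never changes the
    numbers of the pages that are still waiting to be deleted.
    """
    if not 1 <= first <= last <= total_pages:
        raise ValueError("expected 1 <= first <= last <= total_pages")
    unwanted = list(range(1, first)) + list(range(last + 1, total_pages + 1))
    return sorted(unwanted, reverse=True)


def build_delete_commands(doc_copy: str, total_pages: int,
                          first: int, last: int) -> List[List[str]]:
    """One 'djvm -d <doc> <page_num>' argument list per unwanted page."""
    return [["djvm", "-d", doc_copy, str(n)]
            for n in pages_to_delete(total_pages, first, last)]


if __name__ == "__main__":
    # Keep pages 3-4 of a 6-page copy named part.djvu (hypothetical name).
    for cmd in build_delete_commands("part.djvu", 6, 3, 4):
        print(" ".join(cmd))
```

Because djvm -d modifies the document in place, the commands should be run against a copy, never against the original file.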
Splitting using virtual printers
A virtual printer can be used to print selected pages or ranges of a DjVu book to PDF, so the pages are both split off and converted to PDF in one step. It is also possible to print to DjVu this way, if you can find a DjVu virtual printer.
Splitting/joining TIFF
Multipage TIFF images can of course be split and joined too. Anyone familiar with the process is welcome to edit this page and describe it.