PDFBox: merge adds unused fonts, how to remove them?












I am merging two PDF files into one with PDFBox version 2.
The first file has these fonts:



name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
XXMGEM+Arial-BoldMT                  TrueType          WinAnsi          yes yes yes     15  0
XXMGEM+ArialMT                       TrueType          WinAnsi          yes yes yes     19  0
XXMGEM+ArialMT                       CID TrueType      Identity-H       yes yes yes     27  0
XXMGEM+ArialNarrow-Bold              TrueType          WinAnsi          yes yes yes     40  0
XXMGEM+ArialNarrow                   TrueType          WinAnsi          yes yes yes     44  0


and the second file has these:



name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
UNTWVR+HelveticaLTCom-Roman          CID TrueType      Identity-H       yes yes yes     25  0
UNTYID+HelveticaLTCom-Bold           CID TrueType      Identity-H       yes yes yes     26  0
UNTZUP+ArialMT                       CID TrueType      Identity-H       yes yes yes     27  0
UNUBHB+Arial-BoldMT                  CID TrueType      Identity-H       yes yes yes     28  0
Helvetica-Bold                       Type 1            WinAnsi          no  no  no      29  0
UNXPUH+HelveticaLTCom-Roman          CID TrueType      Identity-H       yes yes yes     50  0
UNXRGT+HelveticaLTCom-Bold           CID TrueType      Identity-H       yes yes yes     51  0
UNXSTF+ArialMT                       CID TrueType      Identity-H       yes yes yes     52  0
UNXUFR+Arial-BoldMT                  CID TrueType      Identity-H       yes yes yes     53  0


After merging, the result contains these fonts:



name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
SRWYVL+HelveticaLTCom-Roman          CID TrueType      Identity-H       yes yes yes    420  0
SRXAHX+HelveticaLTCom-Bold           CID TrueType      Identity-H       yes yes yes    421  0
SRXBUJ+ArialMT                       CID TrueType      Identity-H       yes yes yes    422  0
SRXDGV+Arial-BoldMT                  CID TrueType      Identity-H       yes yes yes    423  0
Helvetica-Bold                       Type 1            WinAnsi          no  no  no     424  0
SRWYVL+HelveticaLTCom-Roman          CID TrueType      Identity-H       yes yes yes    425  0
SRXAHX+HelveticaLTCom-Bold           CID TrueType      Identity-H       yes yes yes    426  0
SRXBUJ+ArialMT                       CID TrueType      Identity-H       yes yes yes    427  0
SRXDGV+Arial-BoldMT                  CID TrueType      Identity-H       yes yes yes    428  0
SRWYVL+ArialMT                       CID TrueType      Identity-H       yes yes yes    429  0
SRXAHX+HelveticaLTCom-Roman          CID TrueType      Identity-H       yes yes yes    430  0
SRXBUJ+HelveticaLTCom-Bold           CID TrueType      Identity-H       yes yes yes    431  0
SRXDGV+Arial-BoldMT                  CID TrueType      Identity-H       yes yes yes    432  0
WDEGAT+Arial-BoldMT                  TrueType          WinAnsi          yes yes yes    436  0
GSEDXU+ArialMT                       TrueType          WinAnsi          yes yes yes    437  0
Arial                                TrueType          WinAnsi          yes no  no     416  0
ZapfDingbats                         TrueType          WinAnsi          yes no  yes    419  0
ArialNarrow                          TrueType          WinAnsi          yes no  no     417  0
ACHRDX+ZapfDingbats                  TrueType          WinAnsi          yes yes yes    618  0
ACHRDX+ZapfDingbats                  TrueType          WinAnsi          yes yes yes    619  0
ACHRDX+ZapfDingbats                  TrueType          WinAnsi          yes yes yes    620  0
ACHRDX+ZapfDingbats                  TrueType          WinAnsi          yes yes yes    621  0
ACHRDX+ZapfDingbats                  TrueType          WinAnsi          yes yes yes    622  0
GSEDXU+ArialNarrow-Bold              TrueType          WinAnsi          yes yes yes    560  0
NVGLHQ+ArialNarrow                   TrueType          WinAnsi          yes yes yes    561  0
KWHHMM+ArialMT                       CID TrueType      Identity-H       yes yes yes    578  0


My Java code:



final PDFMergerUtility pdfMerger = new PDFMergerUtility();
pdfMerger.setDestinationStream(outputStream);
pdfMerger.addSources(additionalPdfStreams);
pdfMerger.addSource(inputStreamPdDocument);
pdfMerger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());


The problem is that a third-party vendor's API has trouble with these fonts.
So: what am I doing wrong, and how can I remove the unused and duplicated fonts?










This question has an open bounty worth +350
reputation from danny117 ending in 3 days.


This question has not received enough attention.


Consolidate fonts, consolidate backgrounds, consolidate images. Optimize for web viewing. The same things Acrobat Standard does when a user opens a PDF and then saves it as a PDF.












  • Please also share the source PDF files to allow reproducing the issue. In particular I'm surprised that your test run seems to indicate that PDFBox renames embedded subsets. It is possible I missed that but I don't consider it probable.

    – mkl
    Nov 15 '18 at 10:29











  • PDFBox doesn't rename fonts. What PDFBox version do you use? Are you sure that the result file was font-analysed directly after the merge, and not after something else? Is it the correct file?

    – Tilman Hausherr
    Nov 15 '18 at 12:27











  • Hi, I cannot upload the PDF files; they are not public. @TilmanHausherr: Yes, the PDF was analyzed directly after PDFBox merged it. We are using 2.0.11.

    – Skary
    Nov 16 '18 at 7:45











  • Current version is 2.0.12. Can you reproduce the problem by using the command line merge utility? If yes, could you try to reproduce the problem with two non-confidential PDF files?

    – Tilman Hausherr
    Nov 16 '18 at 9:01











  • This is easily reproduced: just copy mypdf.pdf to "copy of mypdf.pdf", then merge the two together. The result carries double fonts, double images, double backgrounds.

    – danny117
    Feb 27 at 22:18
















java pdfbox






asked Nov 15 '18 at 9:27









Skary






1 Answer
The "duplication" issue seems to come from the document having multiple pages, because each page carries its own font resources. If you iterate over the pages and collect the font names, you will see duplicates in the output whenever a font is used on more than one page.



Something seems very wrong with the details in the question, though. Neither of the source files has a ZapfDingbats font, so how did it get into the merged document?



First, I wrote a couple of helper methods:



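    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.io.MemoryUsageSetting;
    import org.apache.pdfbox.multipdf.PDFMergerUtility;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDResources;

    // Merges two PDFs into a new file in the temp directory and returns its path.
    static String mergePdfs(InputStream is1, InputStream is2) throws IOException {
        PDFMergerUtility pdfMerger = new PDFMergerUtility();
        pdfMerger.addSource(is1);
        pdfMerger.addSource(is2);

        // Build the path with File so it works whether or not java.io.tmpdir ends with a separator.
        String destFile = new File(System.getProperty("java.io.tmpdir"), System.nanoTime() + ".pdf").getPath();
        pdfMerger.setDestinationFileName(destFile);
        pdfMerger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());

        return destFile;
    }

    // Lists the fonts found in each page's top-level resources, page by page.
    // Duplicates are kept on purpose, so a font used on two pages appears twice.
    static List<String> getFontNames(PDDocument doc) throws IOException {
        List<String> result = new ArrayList<>();
        for (PDPage page : doc.getPages()) {
            PDResources res = page.getResources();
            if (res == null) {
                continue; // page without a resource dictionary
            }
            for (COSName fontName : res.getFontNames()) {
                result.add(res.getFont(fontName).toString());
            }
        }

        return result;
    }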


Then I created three test PDF documents. The first two, test-pdf-1.pdf and test-pdf-2.pdf, contain one page each and use the same two fonts: PDTrueTypeFont BAAAAA+ArialMT and PDTrueTypeFont CAAAAA+Roboto-Black. The third, test-pdf-3.pdf, contains both pages from the first two documents and was created with a text editor, not with PDFBox.



Then I added the following test code:



    Class<?> clazz = Test.class;
    String src1, src2, src3;
    src1 = "/test-pdf-1.pdf";
    src2 = "/test-pdf-2.pdf";
    src3 = "/test-pdf-3.pdf";

    InputStream is1, is2, is3;
    is1 = clazz.getResourceAsStream(src1);
    is2 = clazz.getResourceAsStream(src2);

    // Merge the first two documents into a temp file.
    String merged = mergePdfs(is1, is2);

    PDDocument doc1, doc2, doc3, doc4;

    // Open fresh streams for the sources; the ones above were handed to the merger.
    is1 = clazz.getResourceAsStream(src1);
    doc1 = PDDocument.load(is1);

    is2 = clazz.getResourceAsStream(src2);
    doc2 = PDDocument.load(is2);

    is3 = clazz.getResourceAsStream(src3);
    doc3 = PDDocument.load(is3);

    doc4 = PDDocument.load(new File(merged));

    // Print each file name followed by its per-page font list on the next line.
    System.out.println(src1 + " >\n\t" + getFontNames(doc1));
    System.out.println(src2 + " >\n\t" + getFontNames(doc2));
    System.out.println(src3 + " >\n\t" + getFontNames(doc3));
    System.out.println(merged + " >\n\t" + getFontNames(doc4));


The output is as follows (I truncated the last file name for readability and easier comparison):



    /test-pdf-1.pdf >
        [PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]
    /test-pdf-2.pdf >
        [PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]
    /test-pdf-3.pdf >
        [PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black, PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]
    C:\Temp\..9.pdf >
        [PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black, PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]


You can see that the file created by PDFBox's merge, "C:\temp\7193671804393899.pdf" (abbreviated in the output for readability), and the file "test-pdf-3.pdf", which was created with an editor, produce the same font output: each font is listed twice, once for each page.



Opening the merged file in Acrobat Reader confirms that only one copy of the fonts exists:



[Screenshot: C:\temp\7193671804393899.pdf Properties > Fonts]
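As a rough illustration of why a page-by-page listing repeats names, here is a minimal sketch that uses plain Java collections instead of PDFBox. The class name PageFontCount and the method pagesPerFont are made up for this example, and the two hard-coded page lists only stand in for the merged test document above; nothing here reads a real PDF.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PageFontCount {

    // Walks the per-page font lists the way a page-by-page loop would
    // and counts how many times each font name is listed overall.
    static Map<String, Integer> pagesPerFont(List<List<String>> fontsByPage) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> pageFonts : fontsByPage) {
            for (String font : pageFonts) {
                counts.merge(font, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the two-page merged test document.
        List<List<String>> merged = Arrays.asList(
                Arrays.asList("BAAAAA+ArialMT", "CAAAAA+Roboto-Black"),
                Arrays.asList("BAAAAA+ArialMT", "CAAAAA+Roboto-Black"));
        System.out.println(pagesPerFont(merged));
    }
}
```

A count of 2 for a name only says it was listed on two pages. It does not say whether both pages point at the same font object or at two separate copies; that would have to be checked on the real document.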






  • Your code gets only the top level resources. There could be more fonts in form xobjects, in field widgets, etc. Btw if you'd use the result as a set and not as a list you'd eliminate the duplicates.

    – Tilman Hausherr
    Feb 28 at 10:29











  • The intent here was to address the question specifically so using a Set would have defeated the purpose. I created the test documents so I know that there are no forms etc. I kept the code as simple as possible for clarity and readability.

    – isapir
    Feb 28 at 15:00
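To see how many differently tagged subsets of the same base font a listing contains, one option is to strip the subset tag (six uppercase letters followed by "+") from each name and group by what remains. This is only a sketch: SubsetTags, baseName and groupByBaseName are hypothetical names, the input is a handful of names copied from the merged listing in the question, and grouping by name alone ignores the type and encoding columns.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class SubsetTags {

    // A PDF font subset tag is six uppercase letters followed by '+'.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+");

    // Returns the font name without its subset tag, or unchanged if it has none.
    static String baseName(String fontName) {
        return SUBSET_TAG.matcher(fontName).replaceFirst("");
    }

    // Groups the distinct tagged names under their base font name.
    // Using a Set per group drops exact repeats of the same tagged name.
    static Map<String, Set<String>> groupByBaseName(List<String> fontNames) {
        Map<String, Set<String>> groups = new TreeMap<>();
        for (String name : fontNames) {
            groups.computeIfAbsent(baseName(name), k -> new TreeSet<>()).add(name);
        }
        return groups;
    }

    public static void main(String[] args) {
        // A few names copied from the merged listing in the question.
        List<String> names = Arrays.asList(
                "SRXBUJ+ArialMT", "SRXBUJ+ArialMT", "SRWYVL+ArialMT",
                "GSEDXU+ArialMT", "KWHHMM+ArialMT", "Helvetica-Bold");
        groupByBaseName(names).forEach((base, subsets) ->
                System.out.println(base + " -> " + subsets));
    }
}
```

Several tags under one base name suggest that the same typeface was embedded as more than one subset. Whether those subsets could be combined is a separate question that this sketch does not address.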











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53316182%2fpdfbox-merge-adds-unused-fonts-how-to-remove-it%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














The "duplication" issue seems like it's coming from multiple pages, because each page contains its own font metadata. If you iterate over the pages and get the font names, then you will see duplicates in the output if a font is used in more than one page.



Something seems very wrong with the details in the question though. Neither of the source files have ZapfDingbats font, so where did it come from into the merged document?



First, I wrote a couple of helper methods:



static String mergePdfs(InputStream is1, InputStream is2) throws IOException {
PDFMergerUtility pdfMerger = new PDFMergerUtility();
pdfMerger.addSource(is1);
pdfMerger.addSource(is2);

String destFile = System.getProperty("java.io.tmpdir") + System.nanoTime() + ".pdf";
pdfMerger.setDestinationFileName(destFile);
pdfMerger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());

return destFile;
}

static List<String> getFontNames(PDDocument doc) throws IOException {
List<String> result = new ArrayList<>();
for (int i=0; i < doc.getNumberOfPages(); i++){
PDPage page = doc.getPage(i);
PDResources res = page.getResources();
for (COSName fontName : res.getFontNames()) {
result.add(res.getFont(fontName).toString());
}
}

return result;
}


Then I created 3 test PDF documents. The first 2, test-pdf-1.pdf and test-pdf-2.pdf contain one page each and use the same two fonts: PDTrueTypeFont BAAAAA+ArialMT and PDTrueTypeFont CAAAAA+Roboto-Black. The 3rd one, test-pdf-3.pdf, contains 2 pages from the first two documents, and was created with a text editor and not with PDFBox.



And then added the following test code:



Class clazz = Test.class;
String src1, src2, src3;
src1 = "/test-pdf-1.pdf";
src2 = "/test-pdf-2.pdf";
src3 = "/test-pdf-3.pdf";

InputStream is1, is2, is3;
is1 = clazz.getResourceAsStream(src1);
is2 = clazz.getResourceAsStream(src2);

String merged = mergePdfs(is1, is2);

PDDocument doc1, doc2, doc3, doc4;

is1 = clazz.getResourceAsStream(src1);
doc1 = PDDocument.load(is1);

is2 = clazz.getResourceAsStream(src2);
doc2 = PDDocument.load(is2);

is3 = clazz.getResourceAsStream(src3);
doc3 = PDDocument.load(is3);

doc4 = PDDocument.load(new File(merged));

System.out.println(src1 + " >nt" + getFontNames(doc1));
System.out.println(src2 + " >nt" + getFontNames(doc2));
System.out.println(src3 + " >nt" + getFontNames(doc3));
System.out.println(merged + " >nt" + getFontNames(doc4));


The output is as follows (I truncated the last file name for readability and easier comparison):



/test-pdf-1.pdf >
[PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]
/test-pdf-2.pdf >
[PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]
/test-pdf-3.pdf >
[PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black, PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]
C:Temp..9.pdf >
[PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black, PDTrueTypeFont BAAAAA+ArialMT, PDTrueTypeFont CAAAAA+Roboto-Black]


You can see that both the file created by PDFBox's merge, "C:temp7193671804393899.pdf" (abbreviated in the output for readability), and the file "test-pdf-3.pdf" which was created with an editor have the same output for fonts, showing each font twice, one for each page.



Opening the merged file in Acrobat Reader confirms that only one copy of the fonts exists:



[Screenshot: C:\temp\7193671804393899.pdf Properties > Fonts]






  • Your code gets only the top level resources. There could be more fonts in form xobjects, in field widgets, etc. Btw if you'd use the result as a set and not as a list you'd eliminate the duplicates.

    – Tilman Hausherr
    Feb 28 at 10:29











  • The intent here was to address the question specifically so using a Set would have defeated the purpose. I created the test documents so I know that there are no forms etc. I kept the code as simple as possible for clarity and readability.

    – isapir
    Feb 28 at 15:00
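
The two points raised in the first comment (fonts nested below the page level, and a set instead of a list) can be sketched as follows. This is a simplified model in plain Java, not PDFBox code: the Resources class is a hypothetical stand-in for a resource dictionary, and the "Helvetica-Bold" entry inside the nested form is invented for the example. In PDFBox 2.x the nested resources would be reached through PDResources.getXObjectNames() and PDFormXObject.getResources(); appearance streams of annotations and field widgets would need a similar walk and are not modeled here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class NestedFontCollector {

    // Hypothetical stand-in for a resource dictionary: the fonts it lists
    // directly plus the resources of any nested form XObjects.
    static final class Resources {
        final List<String> fonts;
        final List<Resources> nested;

        Resources(List<String> fonts, List<Resources> nested) {
            this.fonts = fonts;
            this.nested = nested;
        }
    }

    // Walks every page's resources depth-first and gathers the font names
    // into a Set, so a font referenced from several places is reported once.
    static Set<String> collectFonts(List<Resources> pages) {
        Set<String> result = new LinkedHashSet<>();
        for (Resources page : pages) {
            collect(page, result);
        }
        return result;
    }

    // Assumes the nesting forms a tree (no cycles), which this model guarantees.
    private static void collect(Resources res, Set<String> out) {
        out.addAll(res.fonts);
        for (Resources child : res.nested) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        List<Resources> none = Collections.<Resources>emptyList();
        // Invented nested form that uses a font the page itself does not list.
        Resources form = new Resources(Arrays.asList("Helvetica-Bold"), none);
        Resources page1 = new Resources(
            Arrays.asList("BAAAAA+ArialMT", "CAAAAA+Roboto-Black"), Arrays.asList(form));
        Resources page2 = new Resources(
            Arrays.asList("BAAAAA+ArialMT", "CAAAAA+Roboto-Black"), none);

        // Prints [BAAAAA+ArialMT, CAAAAA+Roboto-Black, Helvetica-Bold]
        System.out.println(collectFonts(Arrays.asList(page1, page2)));
    }
}
```

A LinkedHashSet keeps the order of first appearance, which makes the result easy to compare with the per-page lists shown in the answer.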
















answered Feb 28 at 1:59, edited Feb 28 at 2:37 – isapir












