Posted

I am looking at a PDF in Intella Connect that is 53,439 bytes in size. Intella identifies an embedded thumbnail in it that is 69,417 bytes in size. How is it possible that the embedded item is larger than the PDF itself? If I download the PDF and the embedded image, their file size is indeed as reported by Intella Connect.

If I use the Linux tool pdfimages to extract the embedded images from the PDF, I get the same image that Intella shows, but it is only 27,552 bytes in size, which would make a lot more sense. All sizes are logical size, not size on disk.

Anyone have any idea how/why an embedded image would be larger than the PDF itself?

Posted

 

This could be due to compression or encoding. A PDF might store certain content (text, fonts, vector graphics) with more efficient compression, whereas the item exported as an embedded image could be less compressed or saved in a higher-quality format. Also, when an embedded image is extracted from a PDF, it might be re-encoded in a different format or with different settings, which would change its size. In this specific case, my guess is that compression is the cause. Different tools might also use different algorithms for extracting images, so they need not produce the same bytes.
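As a rough illustration of the compression point, here is a minimal Python sketch. It assumes the data is stored in the container with zlib/deflate (the algorithm behind PDF's FlateDecode filter); the pixel data is a made-up stand-in, not taken from the PDF in question, and this is not necessarily how Intella or pdfimages handle the image.

```python
import zlib

# Hypothetical stand-in for decoded image data: 200 x 200 RGB pixels of one colour.
raw_pixels = bytes([200, 180, 160]) * (200 * 200)

# What a container could store if the stream is deflate-compressed.
stored_stream = zlib.compress(raw_pixels, 9)

# Decoding the stored stream gives back the original, larger data.
decoded = zlib.decompress(stored_stream)

print("decoded size:", len(decoded))
print("stored size: ", len(stored_stream))
```

The stored stream is far smaller than the decoded data here, so a tool that decodes an item and writes it out with a weaker compression setting can produce a file larger than the container it came from.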

You can share the PDF (either privately or publicly) if you'd like, and tell us what command line you used to extract the images.

 

 

Posted

Thanks. That thought crossed my mind. Unfortunately, the PDF relates to an investigation, so I can't share it publicly. I'll check whether it can be shared privately, although that may not be possible either. I would have thought that pdfimages would extract it in the same manner. The difference may be due to the switch that I used.

I used Kali in Windows Subsystem for Linux (WSL) and ran the pdfimages Linux command. I tried it with the -jpg switch, as well as with the -all switch. In both cases, it produced a JPG of the same size (about 26.9 KB).
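One way to check whether the two copies differ only in encoding is to compare their logical size, hash, and leading signature bytes. The sketch below is only an assumption about how one might do that; the `describe` helper and the file paths passed on the command line are hypothetical, and it only recognises a few common signatures.

```python
import hashlib
import sys
from pathlib import Path

# Leading "magic" bytes for a few common formats.
MAGIC = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"%PDF-": "PDF",
}

def describe(path):
    """Return logical size, SHA-256, and a guessed format for one file."""
    data = Path(path).read_bytes()
    kind = next((name for magic, name in MAGIC.items()
                 if data.startswith(magic)), "unknown")
    return {
        "size": len(data),  # logical size in bytes, not size on disk
        "sha256": hashlib.sha256(data).hexdigest(),
        "kind": kind,
    }

if __name__ == "__main__":
    # Example: python compare.py intella_export.jpg pdfimages_export.jpg
    for name in sys.argv[1:]:
        print(name, describe(name))
```

If the two files report different formats, or the same format with different hashes, that would be consistent with one of the tools re-encoding the image rather than copying the stored stream as-is.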
