Discussion:
How to extract embedded JPEGs from supplied PDF document?
g***@adobeforums.com
2006-12-09 11:42:55 UTC
Hello there. I'm receiving requests from clients to extract photo images from PDFs for potential use in other layouts.

I'm having trouble figuring out how to extract the bitmap image from a PDF so that I can open it in Photoshop and assess its size, etc.

I imagine that, with PDF as its native file type, AICS is the programme for this, tho' perhaps InDesign CS would do it?

Help much appreciated,
Gill Keeley.
J***@adobeforums.com
2006-12-09 13:36:14 UTC
Use the Extract All Images command in Acrobat Pro (renamed Export All Images in version eight).

"I imagine with PDF as its native file type AICS is the programme for this"

You are being victimized by marketing hype obfuscation that Adobe should never have foisted upon its users. For practical purposes, all this nonsense about PDF being "AI's native format" means this:

The PDF format allows its file to contain a "marked off" place where whatever program creates the PDF can write its native format stuff. Other programs just "ignore" that stuff.

Similarly, AI's format allows its file to contain a "marked off" place where another version of the file in PDF format can be contained. So other programs unable to import the AI-native stuff can see the PDF stuff and import that.

So in Adobe-speak, "A contains B" means "A equals B". Not exactly logical, but then Illustrator isn't really renowned for its logic, either.

When you save a native AICS file from AI, you have an option to include in that file another complete copy of the file's content in PDF format.

When you "save" (should be export) a PDF from AI, you have an option to include in that PDF another complete copy of the file's content in native AI format.

If PDF really were AI's "native format" (in the common-sense way that users understand such language), all this double-version stuff would not be necessary. PDF and AI file formats just provide a place to hold another copy in each other's file. Adobe takes that and uses it as justification to say "PDF is AI's native format."

Bottom line is, AI has no feature to export embedded raster images contained in the file with no changes. It should, but it doesn't. Acrobat does.
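
For anyone without Acrobat, here is a rough sketch of the same idea in Python: scan the PDF's bytes for data streams that begin with the JPEG start-of-image marker and keep them untouched. It is an illustration under assumptions, not a description of what Acrobat does internally. The function name extract_jpeg_streams is made up for this example, and the approach will miss images that are encrypted, stored inside compressed object streams, or wrapped in an additional filter.

```python
import re
from typing import List

# Matches the raw bytes between a "stream" keyword and the next "endstream".
# Assumption: the image data does not itself contain the word "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def extract_jpeg_streams(pdf_bytes: bytes) -> List[bytes]:
    """Return every stream in the PDF that looks like an unmodified JPEG."""
    jpegs = []
    for match in STREAM_RE.finditer(pdf_bytes):
        data = match.group(1)
        # JPEG files start with the SOI marker FF D8 ...
        if not data.startswith(b"\xff\xd8"):
            continue
        # ... and end with the EOI marker FF D9; trim anything after it.
        end = data.rfind(b"\xff\xd9")
        if end != -1:
            jpegs.append(data[: end + 2])
    return jpegs
```

Read the PDF with open(path, "rb").read(), pass the bytes to the function, and write each returned item to its own file in binary mode. The saved files can then be opened in Photoshop to check their pixel dimensions.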

JET
Phos....
2006-12-09 15:43:26 UTC
Gill...

Regarding JET's admonishment ("You are being victimized by marketing hype obfuscation that Adobe should never have foisted upon its users.") and his explanation of what that really means, you may find the following article by author and former Illustrator Product Manager Mordy Golding enlightening.

"What's in a file?" <http://rwillustrator.blogspot.com/2006/11/whats-in-file.html>

"Confusion abounds when people talk about file formats:
Which one should I use? But my printer told me to always use this format? I heard that the other format isn't good. Illustrator's native file format is PDF. InDesign can read native Illustrator files. EPS is dead to me. Always save your file as EPS. PDF solves all problems. PDF results aren't high quality...

But the real question you should be asking is:
What is actually *IN* a file anyway?

If you understand what a file is, and what's in it, you can answer all of the questions above, and then some."
g***@adobeforums.com
2006-12-09 21:44:40 UTC
Hi James and Phos....
Thank you for your comments, really helpful.
I don't have Acrobat, so I guess I'm stuck there. I don't really want to buy Acrobat just because my clients are unable to supply images in the hi-res JPEG or TIFF formats I request for print jobs. I don't think I need it for anything else... unless you could suggest a reason?

I export all my PDFs through InDesign and that works well for the most part, though occasionally there are frustrating typographic anomalies which appear in the course of the conversion and are hard to resolve. (Could Acrobat convert my InDesign docs to PDF better than InDesign itself?)

How odd that I can't extract the embedded image from the PDF in Illustrator...
Clients also forward images embedded in Word documents. I seem to have no way of extracting those either, to assess their suitability.

I think at the end of the day this is not my problem. As time goes on I'm more inclined to say a polite, firm 'no' to my clients when they forward weird and wonderful content files for me to figure out. I let them know what I need to work with and they just ignore me, do their own thing and give me the problem. Grumpily (re: clients) Grateful (re: you) Gill
g***@adobeforums.com
2006-12-09 23:17:17 UTC
Re: Acrobat....
I've never really understood what it's for...?
J***@adobeforums.com
2006-12-10 04:05:00 UTC
"I've never really understood what it's for...?"

"I don't think I need it for anything else... unless you could suggest a reason?"

There's plenty to complain about with Acrobat. I hate its interface design. I hate the fact that Adobe just can't seem to get it in version parity with the rest of the "suite" that it is bundled with. Nonetheless, I sure wouldn't want to be without Acrobat Pro these days. It's essential for any graphics shop.

PDF is the preferred delivery format for print. It's the ultimate "output bundle", self-containing the file, the fonts, the images. I can't imagine a graphics shop not wanting to have Acrobat on hand to check, modify, and/or enhance PDFs exported from whatever program was used to create a project. If a project makes more sense for me to work it in Canvas or Corel Designer or any other non-Adobe app, I can do that without worrying whether the print shop or other recipient has the software I used for authoring. I want to check those PDFs out in Acrobat before sending them on.

I can deliver a CMYK and/or spot color PDF to a small specialty shop, like a T-shirt imprinter, who doesn't have decent software or the expertise to use it. I just "print" separations to Adobe PDF (installed with Acrobat). The silkscreen shop can print the PDF pages to burn its positives, using nothing more than Reader.

The last color comp printer I bought was a grossly overpriced, over-rated Epson 3000. I have two (count 'em; two) of 'em. Haven't used the slow, unreliable, cheaply-built pieces of junk in years. PDFs have been my approval comps ever since. But I never send a PDF comp exactly as it is exported from the document's native program. I always open it to check the font embedding, set initial view, crop away the bleeds, etc.

Other things I do routinely with Acrobat:

Set the initial view properties of each PDF I make, so it opens the way I want.

Open any PDF and use the Preflight interface to find out the actual resolution of each raster image it contains, and export those raster images if I need to.

Downsample images in existing print-res PDFs to make them bandwidth practical to repurpose them for web delivery.

Combine multiple PDFs into single PDFs, and split multi-page PDFs into individual PDFs.

Batch convert a whole folder full of images into PDFs in one go.

Batch export PNG images from a folder containing several thousand PDFs (tech manual illustrations), so those images can be linked to and displayed in FileMaker container fields.

Add interactivity to PDFs with JavaScript to create training aids that require nothing more than Reader on the end user's side.

Build 700+ page tech manuals, delivered on CD, consisting of individual PDFs per chapter, but acting like a single book, with the same set of bookmarks serving as a hot-linked table of contents in each.

Build on-page links in things like wiring diagrams which span several pages. Links at the edges of the page allow the user to continue tracing a circuit when it jumps to another page. Clicking the link opens the needed page and zooms it right to where the circuit picks up and continues.

Build presentations as PDF files (rather than PowerPoint files). They can contain interactivity, including Flash files.

Each time I build a client a placement ad, I can now open the delivery PDF in Acrobat Pro 8, add a few fields to contain imprint info, and send it to the client's several Dealers. The Dealers can key in their own contact info in those fields and save them using only Reader. They can then use the PDFs for localized advertising.

Create design/illustration-intensive price sheets or other projects containing information subject to change. The client can open the PDFs in Reader, edit the prices, phone numbers, etc., and save it.
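
The resolution check and the downsampling step in the list above both come down to simple arithmetic, which can be sketched in Python. This is only an illustration: the function names are invented for this example, the placed size is assumed to be known in PDF points (72 points to the inch), and the image is assumed to be scaled proportionally, so only the width is used to measure resolution.

```python
from typing import Tuple

POINTS_PER_INCH = 72.0  # PDF user space defaults to 72 points per inch


def effective_ppi(pixels: int, placed_points: float) -> float:
    """Resolution along one axis: pixel count divided by placed size in inches."""
    return pixels / (placed_points / POINTS_PER_INCH)


def downsampled_size(
    width_px: int,
    height_px: int,
    placed_width_pt: float,
    target_ppi: float,
) -> Tuple[int, int]:
    """Pixel dimensions needed to bring an image down to target_ppi.

    Images already at or below the target are returned unchanged,
    since upsampling adds no real detail.
    """
    current = effective_ppi(width_px, placed_width_pt)
    if current <= target_ppi:
        return width_px, height_px
    new_width = max(1, round(width_px * target_ppi / current))
    new_height = max(1, round(height_px * target_ppi / current))
    return new_width, new_height
```

For example, a 1500-pixel-wide image placed 360 points (5 inches) wide has an effective resolution of 300 ppi, and taking it down to 96 ppi for the web means resampling it to 480 pixels wide.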

And it's quite clear Adobe has all kinds of intentions for further integration of Flash and PDF. Again, I can't imagine any graphics person these days not wanting to be (and stay) up to speed with Acrobat.

JET
D***@adobeforums.com
2006-12-11 01:24:38 UTC
I'm in Prepress, and a few years ago I was very angry at people sending PDF files. They were always low res crap. But now, with the newer versions of Acrobat (and more user awareness, I guess), things are working out fine.

I use Acrobat Pro every day, since we changed to direct to plate and a PDF workflow.

I just took a course for Acrobat, and was amazed at what it can actually do. We also have a plugin for it called PitStop, for editing, but I rarely use it, now that our RIP handles all the trapping.

Acrobat is really worth looking into. It's so much better than a few years ago. Not to mention that more and more people are going that route.
S***@adobeforums.com
2006-12-11 05:50:47 UTC
Gill,

You didn't say what version of Photoshop you were using, but in CS you might try launching Photoshop and going to File>Import>PDF Image. It will give you a list of embedded images and the option to open them at their saved resolutions directly in Photoshop, one at a time.

In CS2 you can just try to open them like a normal document and when the screen comes up that asks which page you want and what resolution you want and so on, change the drop down next to "Select" from "Page" to "Image" and you'll pretty much get the same thing.

The time this most notably won't work is if you are dealing with a PDF that was saved flattened which has drop shadows and other transparency effects in use. As a flattened PDF, the only way to preserve these effects is to sort of burn them into the underlying images and when this is done, the images get 'sliced & diced'... If you don't understand what I'm talking about, you will the first time you try to do the import trick in Photoshop and you've run across one.

Oh, one other thing to watch out for - if the image was scaled out of proportion in the layout program (like it was made thinner to fit a space or to make a model look a little less rocky-road), that scaling will be lost on import. It's not always a big deal, but it's something to watch out for because some clients are really picky about things looking like the originals they send in.
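
The out-of-proportion scaling mentioned above can be checked with a little arithmetic if the placement matrix of the image is known. The Python sketch below assumes that a, b, c and d are the first four numbers of the transformation matrix in effect when the image is painted in the PDF page content, and that the image fills a unit square before that matrix is applied. The function names are made up for this example.

```python
import math
from typing import Tuple


def placed_size(a: float, b: float, c: float, d: float) -> Tuple[float, float]:
    """Width and height on the page of a unit-square image after the matrix.

    The width is the length of the transformed horizontal edge (a, b);
    the height is the length of the transformed vertical edge (c, d).
    """
    return math.hypot(a, b), math.hypot(c, d)


def stretch_factor(
    width_px: int, height_px: int, a: float, b: float, c: float, d: float
) -> float:
    """Ratio of the placed aspect ratio to the pixel aspect ratio.

    1.0 means the image was scaled proportionally; below 1.0 it was
    squeezed narrower, above 1.0 it was stretched wider.
    """
    placed_w, placed_h = placed_size(a, b, c, d)
    return (placed_w / placed_h) / (width_px / height_px)
```

A 1000 x 500 pixel image placed with a = 180 and d = 100 (b and c zero) gives a factor of 0.9, meaning the layout showed it 10% narrower than the extracted file will appear.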
g***@adobeforums.com
2006-12-13 16:49:36 UTC
Hey Sepen, many thanks for that! I extracted the images from the PDF with File>Import>PDF Image. Voila!

Many thanks James for your notes. Like you, I certainly work with PDFs every day. All the artwork I supply is in PDF format, only I convert my files in InDesign. I'm interested that you feel Acrobat is essential kit and I'll look into it more. Still, the only problem I've had exporting PDFs from InDesign is occasional typographic anomalies, like the appearance of invisible characters and tab leaders failing to appear. So it seems like a big investment to make for a problem with a few full stops and a rectangle. I imagine I wouldn't experience these using Acrobat? Still, you've got me thinking about Acrobat much more.
Thanks Gill
:-)
