Thanks, that was super helpful! In case anyone has this problem in the future, here's the working code I came up with (this is inside a loop, hence the $_ and next):
my $property = $pdf->getProperty($pagenum, $_);
next if(!defined($property));
my $propval = $pdf->getValue($property);
my $type = $propval->{Type}->{value};
next if(!defined($type) || $type ne 'XObject');
# decodeOne() expects a hash whose "type" key is "dictionary" and whose
# "value" key holds the dictionary itself, so that's what we give it.
my %dictionary = ('type' => 'dictionary',
                  'value' => $propval);
my $content = $pdf->decodeOne(\%dictionary);
my $pagetree = CAM::PDF::Content->new($content);
As you can see, it's a bit of a hack. Ideally there'd be a getParseTreeFromXObject() function (or something like it) that takes a page number and resource name and returns the parse tree of the associated XObject stream. I'd submit a patch, but I'm not sure I understand the CAM::PDF internals well enough to produce something usable.
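For what it's worth, here is a rough, untested sketch of what such a helper might look like if it simply wrapped the steps above. The name get_parse_tree_from_xobject is made up and is not part of CAM::PDF. It uses only the calls from my snippet, and it assumes getValue() hands back a plain hash ref for the resource.

```perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::Content;

# Hypothetical helper (NOT part of CAM::PDF): returns a CAM::PDF::Content
# object for the named resource on the given page, or undef if the
# resource is missing or is not an XObject.
sub get_parse_tree_from_xobject {
    my ($pdf, $pagenum, $name) = @_;

    my $property = $pdf->getProperty($pagenum, $name);
    return undef if !defined $property;

    # Assumption: the resolved value is the resource's dictionary hash.
    my $propval = $pdf->getValue($property);
    return undef if ref $propval ne 'HASH';

    my $type = $propval->{Type};
    return undef
        if !defined $type
        || !defined $type->{value}
        || $type->{value} ne 'XObject';

    # decodeOne() wants a dictionary node, so wrap the raw hash in one.
    my $content = $pdf->decodeOne({ type => 'dictionary', value => $propval });
    return CAM::PDF::Content->new($content);
}
```

Inside the loop, the body would then shrink to something like: my $pagetree = get_parse_tree_from_xobject($pdf, $pagenum, $_); next if(!defined($pagetree));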
Thanks again for your help. This saved me a ton of time and frustration.
-Tim