I'm trying to write code to extract the bounding rectangles of text in PDF documents. I get a page tree using getPageContentTree and then pass my callback function to its render() method. My renderer then extracts most of the information needed to compute the text's position on the page, including the Tm, cm, Tfs, Tc, Tw, and Tz parameters. From these I seem to be able to estimate the upper, lower, and left bounds of each text object. However, to get the right bound, I need to know the widths of the glyphs in the text object, and since many fonts are variable-width, that requires reading the width data from the font itself.
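To make the missing piece concrete, here is a rough sketch of the width calculation I have in mind, following the glyph displacement rule in the PDF specification. It is written in Python only to keep the arithmetic readable and is not CAM::PDF code. The `text_advance` function and the `widths` table are hypothetical: the width values are made up for illustration, and in a real document they would have to come from the font's width data, which is exactly what I can't find. The sketch also ignores TJ positioning adjustments and vertical writing.

```python
def text_advance(text, widths, tfs, tc=0.0, tw=0.0, tz=100.0):
    """Return the horizontal advance of `text` in text space units.

    widths: glyph widths in 1/1000 of a text space unit (hypothetical table)
    tfs:    font size (Tfs)
    tc:     character spacing (Tc)
    tw:     word spacing (Tw), applied only to the space character
    tz:     horizontal scaling in percent (Tz)
    """
    th = tz / 100.0  # horizontal scaling as a fraction
    total = 0.0
    for ch in text:
        w0 = widths[ch] / 1000.0  # glyph width in text space units
        spacing = tc + (tw if ch == " " else 0.0)
        total += (w0 * tfs + spacing) * th
    return total


# Illustrative widths only; these are not read from a real font.
widths = {"H": 722, "i": 278, " ": 278}
advance = text_advance("Hi Hi", widths, tfs=12.0, tc=0.5, tw=1.0)
```

The result is still in text space, so it would have to be transformed by Tm and cm, like the other three bounds, to get the right edge on the page.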
Does the object that the rendering code passes to callback functions include information that would let me compute the width of a text object? If not, can you recommend another way to do this? I think CAM::PDF must do this kind of calculation somewhere, since you need to know how wide one text object is to figure out the starting position of the next one on that line. But I've spent some time poking through the code and haven't found it.
Thanks a lot!