Posted on 2005-08-24 14:51:29-07 by ampulla
start & end event offset & offset_end attributes appear bogus
When using a subroutine callback with an instance of HTML::Parser, the offset and offset_end attributes come out about 1000 bytes too high. For an 800-byte "myfile.foo", the following code reports "start" tag offsets of around 1840 and 1853, which are clearly too high:
use HTML::Parser;

my $Hparser = HTML::Parser->new(api_version => 3);
# Report the tag name plus the byte offsets of each start tag.
$Hparser->handler(
    start => sub {
        my ($tag, $start, $end) = @_;
        printf("%s starts at %d and ends at %d\n", $tag, $start, $end);
    },
    "tagname, offset, offset_end"
);
$Hparser->parse_file("myfile.foo");
I got the same results by reading the whole file into a single variable and parsing that. The file is ordinary one-byte-per-character ASCII, 67 lines long, with CR/LF line endings. Has anybody seen this problem on W2k (current release) with ActiveState ActivePerl 5.8 and been able to resolve it? Thanks for taking a look, Pat Ampulla
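One way to check whether the reported offsets are really wrong is to slice the document at each offset/offset_end pair and compare the slice with the "text" argspec value for the same event. The following is only a minimal sketch under stated assumptions: the sample document and the @seen and @mismatches arrays are made up for illustration, and the document is parsed from an in-memory string rather than from "myfile.foo".

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample document; the CR/LF line endings mimic the file described above.
my $doc = "<html>\r\n<body>\r\n<p class=\"x\">hello</p>\r\n</body>\r\n</html>\r\n";

my @seen;        # [tagname, offset, offset_end] for every start tag
my @mismatches;  # start tags whose offsets do not map back to their source text

my $p = HTML::Parser->new(api_version => 3);
$p->handler(
    start => sub {
        my ($tag, $start, $end, $text) = @_;
        push @seen, [$tag, $start, $end];
        # If the offsets are right, this slice is exactly the tag's source text.
        my $slice = substr($doc, $start, $end - $start);
        push @mismatches, $tag if $slice ne $text;
    },
    "tagname, offset, offset_end, text",
);
$p->parse($doc);
$p->eof;

printf "%s starts at %d and ends at %d\n", @$_ for @seen;
print @mismatches ? "mismatched: @mismatches\n" : "all offsets match\n";
```

If every slice matches here but the offsets from the real file still look too high, the difference would more likely lie in how the file is read than in the handler itself.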