Posted on 2010-09-25 03:16:52.794842-07 by mirod in response to 12958
Re: Question: Handling large number of XML with XML-Twig

Well, you have a single output file, C:\xmlperl\output.txt, so the code outputs everything to it.

You need to open a new output file for each input file.

    my @TranscriptsList = glob( "$xml_dir/*.xml"); # easier than using readdir

    foreach my $xml_file (@TranscriptsList) {
        # create a text file name from the input file name and open it
        my $text_file= $xml_file;
        $text_file=~ s{\.xml$}{.txt};
        open( my $text_fh, '>', $text_file) or die "cannot create $text_file: $!";

        # I assume you only want the text, not the markup (tags), otherwise you could do
        # $_->print( $text_fh) to also print the markup
        # (note: no comma between the filehandle block and the list to print)
        my $twig= XML::Twig->new( twig_roots => { Texte => sub { print {$text_fh} $_->text; } })
                            ->parsefile( $xml_file);
    }

From the tag name 'Texte' I suspect you might run into encoding problems, so you might need to open the output file in utf8 mode.
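
Here is a minimal sketch of what opening the output in utf8 mode could look like, assuming you want UTF-8 encoded text files. The file name sample.txt and the sample string are made up for illustration; in the loop above you would apply the same open mode to $text_file.

```perl
use strict;
use warnings;
use utf8;   # this source file contains a literal non-ASCII character

# Hypothetical file name and sample text, for illustration only.
my $text_file = 'sample.txt';
my $text      = "Texte accentué\n";

# The :encoding(UTF-8) layer encodes Perl character strings as UTF-8 on output.
# Without it, characters above 255 trigger a "Wide character in print" warning.
open( my $text_fh, '>:encoding(UTF-8)', $text_file)
  or die "cannot create $text_file: $!";
print {$text_fh} $text;
close $text_fh or die "cannot close $text_file: $!";
```

Whether this is needed depends on what is in your documents and on what the program reading the text files expects, so treat it as a starting point.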

You may also want to read a bit about modern Perl style: bareword filehandles (XMLOUT), indirect object notation (new XML::Twig) and opendir/readdir are not used much these days.
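
To make the three style points concrete, here is a small sketch using only core modules. The scratch directory and the file name example.txt are hypothetical, and File::Temp stands in for XML::Twig only so the snippet runs without extra modules. It also assumes the directory path contains no spaces or glob metacharacters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);   # core module, used here only for a scratch directory

# Hypothetical scratch directory and file name, for illustration only.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/example.txt";

# Lexical filehandle with 3-argument open, instead of: open( XMLOUT, ">$file")
open( my $out_fh, '>', $file) or die "cannot create $file: $!";
print {$out_fh} "hello\n";
close $out_fh or die "cannot close $file: $!";

# glob instead of opendir/readdir; the returned paths already include the directory
my @txt_files = glob( "$dir/*.txt");
print "$_\n" for @txt_files;

# Arrow method call, Class->new( ...), instead of indirect object
# notation (new Class ...)
my $tmp = File::Temp->new( SUFFIX => '.txt');
print $tmp->filename, "\n";
```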
