SPUG: Using regex rather than split for item sorting
Joel Grow
joel at largest.org
Fri Mar 12 01:00:17 CST 2004
On Thu, 11 Mar 2004, Tim Maher wrote:
> I tried to convert my earlier unappealing split()-based solution to one
> that uses a matching operator, but I guess I've bungled the
> positive-lookahead somehow, because it doesn't extract the last
> label/item pair.
>
> Can somebody see the problem?
(?:...) is a non-capturing group, not a lookahead: the parens only group,
but whatever they match is still consumed. So your regex is matching
'=item' and the chapter title and text for this entry like you want, but
it then keeps going and matches up to and including the next '=item'.
The following match has to start after that consumed '=item', so it can
only begin at the entry after next. You'll notice when running your code
that it actually skips every other entry, not just the last one. To see
what I mean, add a set of parens around the second '=item':
@fields= /^(=item\b.*?)(?:^(=item)\b|\Z)/smg;
You'll see you're matching the second '=item' but throwing it away.
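
One possible fix, sketched here under some assumptions rather than as the
definitive answer, is to swap the non-capturing group for a true zero-width
lookahead, (?=...), so the next '=item' is checked but not consumed. The
$pod variable, the @skipped/@fields names, and the inline sample data are
made up for illustration; they mirror the four entries in the original
script.

```perl
use strict;
use warnings;

# Hypothetical sample input, mirroring the four entries in the post
my $pod = join "\n\n",
    "=item This",       "and stuff was written",
    "=item What",       "more drivel here",
    "=item Other",      "getting the idea?",
    "=item That's all", "the end\n";

# Original pattern: (?:...) consumes the next '=item',
# so every other entry is lost
my @skipped = $pod =~ /^(=item\b.*?)(?:^=item\b|\Z)/smg;

# Lookahead variant: (?=...) only peeks at the next '=item'
# (or the end of the string) without consuming it
my @fields = $pod =~ /^(=item\b.*?)(?=^=item\b|\Z)/smg;

print scalar(@skipped), " entries with (?:...)\n";
print scalar(@fields),  " entries with (?=...)\n";

# Print just the labels of the entries the lookahead version found
print "$_\n" for map { /^=item\s+([^\n]+)/ } @fields;
```

With this data the (?:...) pattern should yield only the 'This' and 'Other'
entries, while the lookahead version yields all four.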
In the spirit of TMTOWTDI:
use strict;
# Quick and dirty program to sort POD list items into ASCIIbetical order
# Doesn't handle nested lists yet, and is in dire need of a more elegant
# way to get item-labels hooked together with their contents
# %file will look like:
#
# %file = ('Chapter 1' => 'text of chapter 1',
# 'Chapter 2' => 'text of chapter 2');
my %file;
{
    local $/ = '=item ';
    # this assumes '=item' will be flush-left
    %file = map  { split("\n", $_, 2) }  # first line becomes key, rest is value
            grep { $_ }                  # gets rid of singleton '=item ' at beginning
            map  { chomp; $_ }
            <DATA>;
}
for my $chapter (sort keys %file) {
    print "Chapter '$chapter'.\n\n";
    print "$file{$chapter}\n\n";
}
__DATA__
=item This
and stuff was written
=item What
more drivel here
=item Other
getting the idea?
=item That's all
the end
Joel
> #! /usr/bin/perl -wl
> # sort_pod_items2
> # Tim Maher, tim at teachmeperl.com
> # Thu Mar 11 21:59:09 PST 2004
>
> # Quick and dirty program to sort POD list items into ASCIIbetical order
> # Doesn't handle nested lists yet, and is in dire need of a more elegant way
> # to get item-labels hooked together with their contents
>
> # NOTE: Runs in file-slurping mode, so all data presented at once to implicit loop
>
> $/=undef;
> $_=<DATA>;
>
> # First, split lines into '=item' labels for POD-list items,
> # followed by associated contents
>
> # Something wrong with following; doesn't extract last list item
> @fields= /^(=item\b.*?)(?:^=item\b|\Z)/smg;
>
> $i=1;
> foreach ( @fields ) {
> print "$i: $_\n";
> $i++;
> }
> exit;
>
>
> print map { $_->[1] }
>       sort { $a->[0] cmp $b->[0] }
>       map {
>           if ( $_ =~ /^=item\s+([^\n]+)\n/ ) { # $1 is label for list item
>               [ $1 , $_ ]
>           } else {
>               die "Bad data: records must start with =item\n";
>           }
>       } @fields;
>
> __DATA__
> =item This
>
> and stuff was written
>
> =item What
>
> more drivel here
>
> =item Other
>
> getting the idea?
>
> =item That's all
>
> the end