SPUG: Using regex rather than split for item sorting

Joel Grow joel at largest.org
Fri Mar 12 01:00:17 CST 2004


On Thu, 11 Mar 2004, Tim Maher wrote:

> I tried to convert my earlier unappealing split()-based solution to one
> that uses a matching operator, but I guess I've bungled the
> positive-lookahead somehow, because it doesn't extract the last
> label/item pair.
>
> Can somebody see the problem?

(?:...) only makes the parens non-capturing (they group, but don't save
what they match).  It isn't a lookahead, so the text it matches is still
consumed.  Your regex matches '=item' plus the title and text for that
entry like you want, but it then keeps going and matches up to and
including the next '=item'.  The following search starts after that
consumed '=item', so the entry it belongs to can never match.  You'll
notice when running your code that it actually skips every other entry,
not just the last one.  To see what I mean, add a set of parens around
the second '=item':

 @fields= /^(=item\b.*?)(?:^(=item)\b|\Z)/smg;

You'll see you're matching the second '=item' but throwing it away.
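
If a lookahead is what you were after, the spelling is (?=...) rather
than (?:...).  Here's a minimal sketch comparing the two.  The $pod string
is only my stand-in for your DATA section, and I've assumed \z in place of
\Z so the last item keeps its trailing newline:

```perl
use strict;
use warnings;

# Sample text standing in for the slurped DATA section (an assumption
# made for illustration; the labels match the original post).
my $pod = join '',
    "=item This\n\nand stuff was written\n\n",
    "=item What\n\nmore drivel here\n\n",
    "=item Other\n\ngetting the idea?\n\n",
    "=item That's all\n\nthe end\n";

# Non-capturing group: still consumes the following '=item'.
my @consumed = $pod =~ /^(=item\b.*?)(?:^=item\b|\z)/smg;

# Positive lookahead: checks for the following '=item' without consuming it,
# so the next match can start right there.
my @fields = $pod =~ /^(=item\b.*?)(?=^=item\b|\z)/smg;

printf "non-capturing: %d items\n", scalar @consumed;
printf "lookahead:     %d items\n", scalar @fields;
```

On these four sample items the non-capturing version returns 2 fields
('This' and 'Other') and the lookahead version returns all 4.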

In the spirit of TMTOWTDI:

 use strict;
 use warnings;

 # Quick and dirty program to sort POD list items into ASCIIbetical order
 # Doesn't handle nested lists yet, and is in dire need of a more elegant
 # way to get item-labels hooked together with their contents

 # %file will look like:
 #
 #  %file = ('Chapter 1' => 'text of chapter 1',
 #           'Chapter 2' => 'text of chapter 2');
 my %file;
 {
   local $/ = '=item ';

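   # this assumes the data starts with a flush-left '=item ', so the first
   # record chomps down to an empty string

   %file = map  { split("\n", $_, 2) } # first line becomes key, rest is value
           grep { $_ }           # gets rid of singleton '=item ' at beginning
           map  { chomp; $_ }    # chomp strips the current $/, i.e. '=item '
           <DATA>;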
 }

 for my $chapter (sort keys %file) {
    print "Chapter '$chapter'.\n\n";
    print "$file{$chapter}\n\n";
 }

 __DATA__
 =item This

 and stuff was written

 =item What

 more drivel here

 =item Other

 getting the idea?

 =item That's all

 the end
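
In case the local $/ trick is unfamiliar, here's a small sketch of what
the records look like when the input record separator is '=item '.  The
$text string and the in-memory filehandle are assumptions for
illustration only; the program above reads from DATA instead:

```perl
use strict;
use warnings;

# Hypothetical two-item sample, read through an in-memory filehandle.
my $text = "=item This\n\nfirst body\n\n=item What\n\nsecond body\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = '=item ';   # each read now ends at the next '=item '
    @records = <$fh>;
    chomp @records;        # chomp strips the current $/, i.e. '=item '
}
close $fh;

# $records[0] is ''  (nothing precedes the first '=item ')
# $records[1] is "This\n\nfirst body\n\n"
# $records[2] is "What\n\nsecond body\n"
printf "%d records, first is %s\n", scalar @records,
    length($records[0]) ? 'non-empty' : 'empty';
```

That empty first record is why the grep is in the pipeline: without it,
split would hand map nothing for that record, which happens to be
harmless, but any stray leading text would become a bogus key.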

Joel


> *--------------------------------------------------------------------------*
> | Tim Maher, CEO     (206) 781-UNIX      (866) DOC-PERL     (866) DOC-UNIX |
> | tim(AT)Consultix-Inc.Com  http://TeachMePerl.Com  http://TeachMeUnix.Com |
> *+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-*
> |  Classes: Perl Prog. plus Perl Modules, 4/5-4/8 ;  Template Toolkit: TBA |
> |  Watch for my Manning book: "Minimal Perl for Shell Users & Programmers" |
> *--------------------------------------------------------------------------*
>
> #! /usr/bin/perl -wl
> # sort_pod_items2
> # Tim Maher, tim at teachmeperl.com
> # Thu Mar 11 21:59:09 PST 2004
>
> # Quick and dirty program to sort POD list items into ASCIIbetical order
> # Doesn't handle nested lists yet, and is in dire need of a more elegant way
> # to get item-labels hooked together with their contents
>
> # NOTE: Runs in file-slurping mode, so all data presented at once to implicit loop
>
> $/=undef;
> $_=<DATA>;
>
> # First, split lines into '=item' labels for POD-list items,
> # followed by associated contents
>
> # Something wrong with following; doesn't extract last list item
> @fields= /^(=item\b.*?)(?:^=item\b|\Z)/smg;
>
> $i=1;
> foreach ( @fields ) {
> 	print "$i: $_\n";
> 	$i++;
> }
> exit;
>
>
> print map { $_->[1] }
>  sort { $a->[0] cmp $b->[0] }
>  map {
> 	 if ( $_ =~ /^=item\s+([^\n]+)\n/ ) {	# $1 is label for list item
>  		[ $1 , $_ ]
> 	 } else {
> 	 	die "Bad data: records must start with =item\n";
> 	 }
>  } @fields;
>
>  __DATA__
> =item This
>
> and stuff was written
>
> =item What
>
> more drivel here
>
> =item Other
>
> getting the idea?
>
> =item That's all
>
> the end
> _____________________________________________________________
> Seattle Perl Users Group Mailing List
> POST TO: spug-list at mail.pm.org  http://spugwiki.perlocity.org
> ACCOUNT CONFIG: http://mail.pm.org/mailman/listinfo/spug-list
> MEETINGS: 3rd Tuesdays, Location Unknown
> WEB PAGE: http://www.seattleperl.org
>


