[Phoenix-pm] Meeting yesterday (June 30) wrapup
Scott Walters
scott at illogics.org
Fri Jul 1 12:55:33 PDT 2005
Hi everyone,
Thanks for coming, and thanks for listening to me talk. As usual, I talked
for longer than I meant to. Whoops. Nello's was unusually busy. We'll
probably do Nello's again, but not for a while, and not for code
presentations... regardless, I think having food at the meetings made
things a lot easier (we're all busy people) so I think I'll see about
ordering out for pizza for our regular meetings, whereever they wind up.
Good to meet all of the new people. I'm sorry I didn't a chance to chat with
you guys more and I hope you'll be back. It's hard to get to know people
in two hours with so much chaos.
I completely forgot to give the door prize, CGI Programming with Perl, to
Brock to give away. D'oh! Next meeting, we'll just have to have two door prizes.
Sorry to everyone who only came because of the door prize. Next time, remind
me, or Brock, or someone.
Er, ehm, without further ado, here's yafro.pl.
Again, you shouldn't use http.pm or TransientBaby -- they're for educational
purposes only. If you actually do any Web scraping, use HTML::Parser,
HTML::TableExtractor, or something sane. Which means you'll have
to modify this to use another HTML parser. That shouldn't be hard to
do if you use an event based one.
I was expecting people to chime in and comment on how *they* scraped
Web content but instead Michael just gave a lot of examples of how he
*blocks* robots. Heh, heh, heh.
For the benefit of people not at the meeting, here are a few comments on
the code: would have been easier to just extract all images with URLs
matching a certain pattern, and the *get_page = http::generate_get_page;
this is odd and would have been better done with the Exporter (module).
Okay. Talk to ya'll later.
-scott
On 0, Brock <awwaiid at thelackthereof.org> wrote:
>
> We had a lovely meeting yesterday, with 10 people enjoying dinner and
> perl-talk at Nello's Pizza. In addition to random perl-related
> conversation, Mike Friedman spoke of HighWire Press [1] and their use of
> Perl, and Scott Walters told us about how he does web-scraping (and he
> will post code soon we hope :) ).
>
> We'll have the website up soon enough. In the meantime keep an eye out
> here for the next meeting topic/time/location, which I will have picked
> out within the next two weeks. Please send topic requests and
> volunteership here to the list. I like the idea of doing two talks like
> we did this time so that we can cover a potentially wider range of
> interest and experience.
>
> Have a good (long) weekend!
> --Brock
>
> [1] http://highwire.stanford.edu/
>
> _______________________________________________
> Phoenix-pm mailing list
> Phoenix-pm at pm.org
> http://mail.pm.org/mailman/listinfo/phoenix-pm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: yafro.pl
Type: application/x-perl
Size: 10120 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/phoenix-pm/attachments/20050701/933934af/yafro-0001.bin
-------------- next part --------------
0. About
a. This is how Scott scrapes Web content
b. This is not necessarily how you should scrape Web content
c. Minimal examples are for illustration of the grammars and protocols
1. Fetching
a. LWP
b. LWP::Simple
c. http.pm
**** 1st example ****
I. Handles POST + GET in IIS
II. Note the socket read timeouts as an alternative to alarm
III. Browser sessions are closures with cookies and the referer bound in
IV. Referer tag is automagically correct -- great for stubborn sites
V. Knows how to deal with ASP/IIS applications that expect GET and POST
data to be separate
VI. HTTP consists of a connection, sending the handshake, sending
headers, waiting for a reply, then closing the connection
2. Parsing
a. Every novice wants to use regex -- regexen are:
I. Fragile
II. Non-reenterant, therefore non-recursive
III. Not stateful, therefore not a grammar -- HTML requires a grammar
IV. Doomed to failure
b. HTML::LinkExtor, HTML::LinkExtactor for easy spiders -- don't forget
robots.txt
c. HTML::TableExtractor, HTML::TableContentParser
d. HTML::TreeBuilder
I. Insert nodes
simple recurse through tree
II. Remove nodes
III. Turn everything under a node back into HTML
IV. Easy ton navigate the structure of the document
V. Easy to modify document
VI. Hard to extract repeated structures in the document
e. HTML::Parser
I. Easy to extract repeated non-table structures
II. Event driven -- HTML tags generate callback events
f. my minimal HTML parser -- like HTML::Parser --
***** 3rd example ******
I. This simple, minimal parser is incomplete
II. The various states illustrate statefulness of the HTML grammar
. Between tags
. Inside of tag
. Inside of quoted values inside of tags
g. yafro.pl -- an exampling of using an event driven HTML parser
I. Extracts data in a hidden field from the login form
II. Logs in, keeping cookies, returning hidden field data
III. parse_index() parses photo index pages
. performs a callback operation for each image link found
. identifies link to previous month's index
. identifies link to next "with-in month" index page
. returns link to next "with-in month" page if it exists, otherwise
falls back on previous month's index page
IV. parse_index() uses coroutines to repeatedly fall back on the HTML
parser without having to return -- coroutines do for 'return' what
funcation calls did for 'goto' -- this lets us keep state implicitly
V. parse_index()'s caller starts off with a known URL for the latest
index page and loops as long parse_index() returns it another index
page
VI. Each iteration, all discovered images are downloaded and the list
of queued images is zeroed
VII. Cheats and computes the full image name/location from the thumbnail
name.
IIX. yafro wasn't a hard site to crack, but a lot of this
infrastructure was created for sites that are
h. table tweakers... no, not today
3. Normalizing
a. Most dynamic data comes from a database
b. A lot of dynamic data comes from a relational, structured database
c. Why not reconstruct the original database?
I. Web output is the result of a join'ing several tables together
II. To normalize it again, identify the keys, supporting values,
and relations
III. By virtue of being on the same row of output, values relate
to each other
4. Other scrapers
a. budweiser.com
b. Thomas Register (geekpac)
c. Google (geekdate)
d. I forget...
1 ----- Fetching -------
2
3 1.a. LWP
4
5 # Create a user agent object
6 use LWP::UserAgent;
7 $ua = LWP::UserAgent->new;
8 $ua->agent("MyApp/0.1 ");
9
10 # Create a request
11 my $req = HTTP::Request->new(POST =>
'http://search.cpan.org/search');
12 $req->content_type('application/x-www-form-urlencoded');
13 $req->content('query=libwww-perl&mode=dist');
14
15 # Pass request to the user agent and get a response back
16 my $res = $ua->request($req);
17
18 # Check the outcome of the response
19 if ($res->is_success) {
20 print $res->content;
21 }
22
23 1.b. LWP::Simple
24
25 use LWP::Simple;
26 $content = get("http://www.sn.no/");
27 die "Couldn't get it!" unless defined $content;
28
29 1.c. http.pm
30
31 use http;
32 *get_page = http::generate_get_page;
33 ($status, $headers, $html) = get_page(
34 $next_page,
35 state => $hidden_state_value,
36 login => 'Login',
37 %login_info,
38 );
39
40
41 ----- yafro.pl ------
42
43 (Excerpts)
44
45 #!/usr/bin/perl
46
47 #
48 # Downloads all pictures for a given user from yafro.com.
49 #
50 # Usage: perl yafro.pl --user <username>
51 #
52
53 use strict;
54 use warnings;
55
56 use Socket;
57
58 use IO::Handle;
59 use IO::Socket::INET;
60
61 use Coro;
62 use Coro::Cont;
63
64 use TransientBaby::Forms;
65 use TransientBaby;
66
67 use http;
68
69 my $debug = 1;
70
71 my %login_info = (
72 email => 'scott at slowass.net',
73 passwd => 'XX',
74 );
75
76 # ... (stuff here)
77
78 $http::ua = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8)
Gecko/20050511 Firefox';
79
80 *get_page = http::generate_get_page;
81
82 sub parse_index {
83
84 my $page = shift;
85 my $imagehit = shift; ref $imagehit eq 'CODE' or die;
86
87 my $prev_month;
88 my $next_month;
89 my $next_within_month;
90
91 TransientBaby::Forms::parse_html($page, csub {
92
93 # This crawls index pages, picks out photos, and calls-back links
to the
94 # photos/photo pages (haven't decided) back. It yields links to
the next
95 # index page. This is the next page within the month or else the
previous
96 # month, when those are exhausted.
97
98 no warnings 'uninitialized';
99
100 my $accessor;
101 my %ent;
102
103 # Previous month (one of):
104
105 # <tr class="row0"><td class="col0"><table style="border: solid
0px black" width=100% cellspacing=0 cellpadding=0><tr
class="headerRow"><td class="col0" align=left><font size=2>Show x's
Photos in:</font>   <font size=2><b><nobr><a
href='http://x.yafro.com/photos/2005/5/0'><<</a> June 2005 <font
size=2 color=#999999>>></font></nobr></b></font></td></tr>
106
107 # <tr class="row0"><td class="col0"><table style="border: solid
0px black" width=100% cellspacing=0 cellpadding=0><tr
class="headerRow"><td class="col0" align=left><font size=2>Show x's
Photos in:</font>   <font size=2><b><nobr><a
href='http://x.yafro.com/photos/2005/5/0'><<</a> June 2005 <font
size=2 color=#999999>>></font></nobr></b></font></td></tr>
108
109 do {
110 yield; ($accessor, %ent) = @_;
111 } until $ent{tag} eq 'lit' and $ent{text} =~ m/Photos in/;
112
113 $debug and print "1. 'Photos in'\n";
114
115 for(;;) {
116 yield; ($accessor, %ent) = @_;
117 if($ent{tag} eq 'a') {
118 $prev_month = $ent{href};
119 last;
120 } elsif($ent{tag} eq 'font' and $ent{color} eq '#999999') {
121 # <font size=2 color=#999999> separates prev and next
links as far as we're concerned
122 $prev_month = undef;
123 last;
124 }
125 }
126
127 $debug and print "2. \$prev_month: $prev_month\n";
128
129 for(;;) {
130 yield; ($accessor, %ent) = @_;
131 if($ent{tag} eq 'a') {
132 $next_month = $ent{href};
133 last;
134 } elsif($ent{tag} eq '/nobr') {
135 # the </nobr> means this section is all over
136 $next_month = undef;
137 last;
138 }
139 }
140
141 $debug and print "3. \$next_month: $next_month\n";
142
143 # Image:
144
145 # <a href="http://x.yafro.com/photo/897523"
onmouseout="hideddrivetip()" onmouseover="ddrivetip('........')">
146 # <img src='http://photos.yafro.com/pics3/i/20050627/07/3/4
a/34ab316226ff678b4a49a26bd50_thumb.jpg' border=0>
147
148 # or:
149
150 # <a href="http://x.yafro.com/photo/896918"><img
src='http://photos.yafro.com/pics3/i/20050628/22/f/b/0/fb04c3dab88ed
dff08821f95b9d400_thumb.jpg' border=0></a><br><nobr><font
size=1>June 28</font><font size=1>, </font><font size=1>10:09
PM</font></nobr><br><center><a href='http://x.yafro.com/photo/896918
>2 comments</a>
151
152 # "Photos in" seems to appear only before the prev/next month
selected... not sure in which case this worked
153 # do {
154 # yield; ($accessor, %ent) = @_;
155 # } until $ent{tag} eq 'lit' and $ent{text} =~ m/Photos posted
in/;
156
157 $debug and print "4. 'Photos posted in'\n";
158
159 for(;;) {
160 yield; ($accessor, %ent) = @_;
161 last if $ent{tag} eq 'lit' and $ent{text} =~ m/Pages for/;
162 # $debug and print "debug: looking for image hits: tag is
$ent{tag}\n";
163 # if($ent{tag} eq 'a' and $ent{onmouseout} and
$ent{onmouseover}) # images without labels don't have these!
164 if($ent{tag} eq 'a') {
165 yield; ($accessor, %ent) = @_;
166 last if $ent{tag} eq 'lit' and $ent{text} =~ m/Pages
for/;
167 if($ent{tag} eq 'img') {
168 $debug and print "5. Image hit: $ent{src}\n";
169 $imagehit->($ent{src});
170 } elsif($ent{tag} eq 'lit') {
171 next; # that's okay -- a href surrounds the image and
an identical one surrounds any text after
172 # the image.
173 } else {
174 print "debug: in middle of thumbnails, just after a
href, expected img src, instead got $ent{tag}!\n";
175 }
176 }
177 }
178
179 $debug and print "6. 'Pages for'\n";
180
181 # Next index page within current month:
182
183 # <tr class="row0"><td class="col0">Pages for June: <font size=1
color=#999999><b><</b></font> <font size=3
color=#990033><b>1</b></font> <a href='http://x.yafro.com/photos/200
/6/1'><font size=2><b>2</b></font></a> <a
href='http://x.yafro.com/photos/2005/6/2'><font
size=2><b>3</b></font></a> <a href='http://x.yafro.com/photos/2005/6
3'><font size=2><b>4</b></font></a>
184
185 for(;;) {
186 yield; ($accessor, %ent) = @_;
187 last if $ent{tag} eq '/tr';
188 # this sequence marks the current page: <font size=3
color=#990033> <b>1</b></font>
189 if($ent{tag} eq 'font' and $ent{color} eq '#990033') {
190 do {
191 yield; ($accessor, %ent) = @_;
192 last if $ent{tag} eq '/tr';
193 if($ent{tag} eq 'a') {
194 $debug and print "7. Next within-month page:
$ent{href}\n";
195 $next_within_month = $ent{href};
196 last;
197 }
198 } until 0;
199 }
200 }
201
202 });
203
204 return $next_within_month if $next_within_month;
205 return $prev_month if $prev_month;
206 return undef; # no more indices to crawl
207
208 }
209
210 # ...
211
212 my $status;
213 my $headers;
214 my $html;
215
216 my $next_page;
217
218 # ...
219
220 # log-in as self so I can see the pr0n
221
222 ($status, $headers, $html) = get_page(
223 'http://www.yafro.com/',
224 );
225
226 my $hidden_state_value;
227
228 TransientBaby::Forms::parse_html($html, sub {
229 my $accessor = shift;
230 my %ent = @_;
231 $next_page = $ent{action} if $ent{tag} eq 'form' and exists
$ent{name} and $ent{name} eq 'loginbox';
232 $hidden_state_value = $ent{value} if $ent{tag} eq 'input' and exists
$ent{name} and $ent{name} eq 'state';
233 });
234
235 # ...
236
237 ($status, $headers, $html) = get_page(
238 $next_page,
239 state => $hidden_state_value,
240 login => 'Login',
241 %login_info,
242 );
243
244 $html =~ m/My Photos posted in/ or die "Login apparently failed - string
'My Photos posted in' not found in results";
245
246 # ...
247
248 $next_page = "http://$user.yafro.com";
249
250 do {
251 ($status, $headers, $html) = get_page($next_page);
252 die unless $status == 200;
253 my @imagehits;
254 $next_page = parse_index(
255 $html, sub { push @imagehits, shift(); },
256 );
257 printf "\n\nimage hits: %d\nnext page: %s\n\n", scalar @imagehits,
$next_page || '<no next page>';
258 foreach my $image (@imagehits) {
259 my $local_image;
260 $image =~ s/_thumb\.jpg$/_full.jpg/;
261 $image =~ m{.*/(.*)} ? $local_image = "jpg/$user/$1" : die;
262 -f $local_image and do {
263 $debug and print "$local_image already exists -- not
re-downloading\n";
264 next;
265 };
266 (my $status, my $headers, my $jpg) = get_page($image);
267 open my $f, '>', $local_image or die "$local_image: $!";
268 $f->print($jpg);
269 $f->close;
270 }
271 } while $next_page;
272
273
274 --- http.pm ------
275
276 (Exceprts)
277
278 package http;
279
280 use strict;
281 use warnings;
282
283 use Socket;
284 use IO::Handle;
285 use IO::Socket::INET;
286 use POSIX;
287
288 our $ua;
289 our $debug = 1;
290
291 sub generate_get_page {
292
293 my %cookies;
294 my $referer;
295 $ua or die "set \$http::ua or something";
296
297 return sub {
298
299 (my $page, my @args) = @_;
300
301 my @headers;
302 push @headers, "Referer: $referer" if $referer;
303
304 my $cookies = '';
305 foreach my $k (keys %cookies) { $cookies .= '; ' if $cookies;
$cookies .= "$k=$cookies{$k}"; }
306 print "Cookies! $cookies\n";
307
308 (my $status, my $headers, my $html) = get_page($page, $cookies,
$ua, \@headers, \@args);
309
310 %cookies = (%cookies, map { (split /=/, $_) } map { (split /; +/,
$_)[0] } map { $_->[1] } grep { $_->[0] eq 'Set-Cookie' }
@$headers);
311 # delete $cookies{$_} for qw/expires path domain/;
312
313 $referer = $page if grep { lc($_->[0]) eq 'content-type' and
$_->[1] =~ m<text/html>i } @$headers;
314 $referer = $page unless grep { lc($_->[0]) eq 'content-type' }
@$headers; # No Content-Type? Must be text/html.
315 $debug > 0 and printf "debug: referer is %s\n", $referer || '<not
set>';
316
317 return ($status, $headers, $html);
318
319 };
320 }
321
322 sub get_page {
323
324 my $url = shift;
325 my $cookies = shift;
326 my $ua = shift;
327 my @headers = @{ shift() };
328 my @form = @{ shift() };
329
330 # XXX kind of inconsistant that @headers here is a list of the format
"Foo: Bar" where it's returned as ['Foo', 'Bar']
331
332 my $postdata = '';
333 my $port = 80;
334 my $output = '';
335
336 $debug > 0 and print "\n\ndebug: get_page: $url\n\n";
337
338 while(@form) {
339 my $key = shift @form;
340 my $value = escape(shift @form);
341 $postdata .= '&' if $postdata;
342 $postdata .= $key . '=' . $value;
343 }
344
345 eval {
346
347 (my $host, my $page) = parse_url($url);
348 my $hostaddr = inet_aton($host) or die "couldn't look up host:
$host - giving up on url $url\n";
349
350 # my $sh = IO::Socket::INET->new(
351 # PeerAddr => $host,
352 # PeerPort => $port,
353 # Proto => 'tcp',
354 # ) or die $!;
355
356 socket(my $sh, PF_INET, SOCK_STREAM, getprotobyname('tcp')) or
die $!;
357
358 setsockopt($sh, SOL_SOCKET, SO_SNDTIMEO, pack 'LL', 15, 0 ) or
die $!;
359 setsockopt($sh, SOL_SOCKET, SO_RCVTIMEO, pack 'LL', 15, 0 ) or
die $!;
360
361 connect($sh, sockaddr_in($port, $hostaddr)) or
362 die "connect failed: $! host: $host page: $page url: $url\n";
363
364 push @headers, "Cookie: $cookies" if $cookies;
365 push @headers, "Content-Length: " . length $postdata if length
$postdata;
366 push @headers, "Content-Type: application/x-www-form-urlencoded"
if length $postdata;
367
368 my $getstr = join('', map "$_\015\012",
369 (length $postdata ? "POST $page HTTP/1.0" : "GET $page
HTTP/1.0"),
370 "User-Agent: $ua",
371 "Host: $host",
372 # "Referer: $url", # should be sent as an argument now
373 "Accept: image/gif; image/jpeg; text/html; text/plain; */*",
374 "Connection: close",
375 "Accept-Language: en",
376 @headers,
377 # XXX Content-type: multipart/form-data; boundary="abcdefg" #
used for file uploads
378 );
379 $getstr .= "\015\012";
380 $getstr .= $postdata if length $postdata;
381 $debug > 0 and print "debug: getstr:\n$getstr";
382 $sh->print($getstr) or die "print failed: $! host: $host page:
$page url: $url\n";
383
384 $sh->flush();
385
386 read $sh, $output, 4096, length $output while not eof $sh;
387
388 close $sh;
389
390 };
391
392 if($@) {
393 $debug > 0 and print "debug: get_page failed: $@\n";
394 return undef;
395 } else {
396 $debug > 0 and print "debug: get_page successful: ", length
$output, " bytes\n";
397 return parse_headers($output);
398 }
399
400 }
401
402 1;
403
404 ---- parse_html() ----
405
406 Please excuse my HTML parsing code. It's sooooo 2002. Let this be a
lesson
407 to you -- don't try to parse HTML with regex!
408
409 sub parse_html {
410 my $file = shift;
411 my $callback = shift; $callback ||= sub { return 0; };
412 my $callerpackage = shift;
413
414 # if $callback->($accessor, %namevaluepairs) returns true, we use
that return value in
415 # place of the text that triggered the callback, allowing the
callback to filter the HTML.
416
417 my $name;
418 my $text;
419 my $state; # 0-outside of tag; 1-inside of tag; 2-expecting
name of new section
420 my %keyvals;
421 my $highwater; # where in the text the last tag started
422
423 my $accessor = sub { ... };
424
425 eval { while(1) {
426
427 if($file =~ m{\G(<!--.*?-->)}sgc) {
428 $text .= $1;
429 print "debug: comment\n" if($debug);
430 my $x = $callback->($accessor, tag=>'comment', text=>$1);
if(defined $x) {
431 $text .= $x;
432 } else {
433 $text .= $1;
434 }
435
436 } elsif($file =~ m{\G<([a-z0-9]+)}isgc) {
437 # start of tag
438 print "debug: tag-start\n" if($debug);
439 $highwater = length($text);
440 %keyvals = (tag => lc($1));
441 $state=1;
442 if(lc($1) eq 'div') {
443 $state=2;
444 }
445 $text .= "<" . cc($1);
446
447 } elsif($file =~ m{\G<(/[a-z0-9]*)>}isgc) {
448 # end tag
449 $keyvals{'tag'} = lc($1);
450 my $x = $callback->($accessor, %keyvals); if(defined $x) {
451 $text .= $x;
452 } else {
453 $text .= "<".cc($1).">";
454 }
455 %keyvals=();
456 print "debug: end-tag\n" if($debug);
457
458 } elsif($file =~ m{\G(\s+)}sgc) {
459 # whitespace, in or outside of tags
460 if($state == 0) {
461 my $x = $callback->($accessor, tag=>'lit', text=>$1);
if(defined $x) {
462 $text .= $x;
463 } else {
464 $text .= $1;
465 }
466 } else {
467 $text .= $1;
468 }
469 print "debug: whitespace\n" if($debug);
470
471 } elsif(($state == 1 || $state == 2) and
472 ($file =~ m{\G([a-z0-9_-]+)\s*=\s*(['"])(.*?)\2}isgc or
473 $file =~ m{\G([a-z0-9_-]+)\s*=\s*()([^ >]*)}isgc)) {
474 # name=value pair, where value may or may not be quoted
475 $keyvals{lc($1)} = $3;
476 $text .= cc($1) . qq{="$3"}; # XXX need to preserve whitespace
477 print "debug: name-value pair\n" if($debug);
478
479 } elsif(($state == 1 || $state == 2) and
480 ($file =~ m{\G([a-z0-9_-]+)}isgc)) {
481 # name without a =value attached. if above doesnt match this is
the fallthrough.
482 $keyvals{lc($1)} = 1;
483 $text .= cc($1); # correct case if needed
484 print "debug: name-value pair without a value\n" if($debug);
485
486 } elsif($file =~ m{\G>}sgc) {
487 # end of tag
488 $state=0;
489 my $x = $callback->($accessor, %keyvals); if(defined $x) {
490 # overwrite the output with callback's return, starting from
the beginning of the tag
491 # $text may have changed (or been deleted) since $highwater
was recorded
492 substr($text, $highwater) = $x if($highwater && length($text)
> $highwater);
493 } else {
494 $text .= '>';
495 }
496 print "debug: tag-end\n" if($debug);
497
498 } elsif($file =~ m{\G([^<]+)}sgc and $state != 1) {
499 # between tag literal data
500 # $text .= $1 unless($state == 2);
501 my $x = $callback->($accessor, tag=>'lit', text=>$1);
if(defined $x) {
502 $text .= $x;
503 } else {
504 $text .= $1;
505 }
506 print "debug: lit data\n" if($debug);
507
508 } elsif($file =~ m{\G<!([^>]+)}sgc and $state != 1) {
509 # DTD
510 print "debug: dtd\n" if($debug);
511 $highwater = length($text);
512 $text .= '<!' . cc($1);
513 %keyvals = (tag => lc($1));
514 $state=1;
515
516 } elsif($file =~ m{($macro)}sgc) { # 5.004 has issues with this
517 # an escape of whatever format we're using for escapes
518 print "debug: template escape\n" if($debug);
519 # XXX if this appears in a tag, no mention will be passed to
handler,
520 # which may rewrite the tag wtihout it
521 $text .= $1;
522
523 } else {
524 # this should only ever happen on end-of-string, or we have a
logic error
525 (my $foo) = $file =~ m{\G(.*)}sgc;
526 print "stopped at: -->$foo\n" if($debug);
527 if($foo) {
528 # this is an error condition
529 $callback->($accessor, tag=>'stopped', text=>$foo)
530 }
531 return $text;
532 }
533 } };
534 # shouldnt reach this point
535 print $@ if($debug && $@);
536 return $text;
537 }
More information about the Phoenix-pm
mailing list