<div dir="ltr"><br><div class="gmail_extra">Hi, </div><div class="gmail_extra"><br></div><div class="gmail_extra">Thanks everyone for your replies.<br><br><div class="gmail_quote">On Tue, Jan 21, 2014 at 1:21 PM, Matthew Phillips wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><br></div><div class="gmail_extra"><div class="gmail_quote">
On Tue, Jan 21, 2014 at 12:29 PM, Antonio Sun <span dir="ltr"><<a href="mailto:tpm.ats@spamgourmet.com" target="_blank">tpm.ats@spamgourmet.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><span style="font-family:arial,sans-serif;font-size:13px">Hi,</span><br style="font-family:arial,sans-serif;font-size:13px">
<br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">I have a script that works on XML content. A thousand-times simplified version is:</span><br style="font-family:arial,sans-serif;font-size:13px">
<br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px"> xml_output | perl -n000e </span><span style="font-family:arial,sans-serif;font-size:13px">'s,(?<=">)(.*?)(?=</</span><span style="font-family:arial,sans-serif;font-size:13px">HttpBody>),</span><font face="arial, sans-serif">`echo $1 | wc -c`,eg; print'</font><br style="font-family:arial,sans-serif;font-size:13px">
<br style="font-family:arial,sans-serif;font-size:13px"></div></div></blockquote></div></div></blockquote><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span style="font-family:arial,sans-serif;font-size:13px">why are you doing `echo $1 | wc -c` instead of simply doing «length($1)»?</span></blockquote><div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">As I said before, this is only a thousand-times simplified version, meant to highlight what the problem is instead of distracting you with unrelated code. </span><br style="font-family:arial,sans-serif;font-size:13px">
</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><font face="arial, sans-serif">I.e., I need to pipe the matching string ($1, the content inside the </font><span style="font-family:arial,sans-serif;font-size:13px">HttpBody tag) to an external program via the shell. As you can tell, </span><span style="font-family:arial,sans-serif;font-size:13px">if the </span><span style="font-family:arial,sans-serif">matching </span><span style="font-family:arial,sans-serif;font-size:13px">content</span><span style="font-family:arial,sans-serif;font-size:13px"> is too big for the shell's argument length limit, my script will fail.</span></div>
</div></blockquote></div></div></blockquote><div><br></div><div><br></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Take a look at perldoc perlipc (search for Using open() for IPC)<br>Alternatively, use IO::All and something like ... <br></blockquote><div><br></div><div>Previously, I thought the last resort would be to bring in the big guns and do something like the following:</div>
<div><pre class="" style="margin-left:10px;margin-right:10px;background-color:rgb(238,238,221);border:1px solid rgb(204,204,204);word-wrap:break-word;white-space:pre-wrap;padding:3px;font-size:1.2em;color:rgb(81,81,81)"><ol style="background-color:rgb(216,216,216);color:rgb(63,63,63);margin-top:0px;margin-bottom:0px">
<li style="background-color:rgb(238,238,221);padding-left:5px;color:rgb(38,38,38);padding-bottom:2px"> <a class="" href="http://perldoc.perl.org/functions/open.html" style="color:rgb(102,102,102);font-weight:bold">open</a><span class="" style="color:rgb(0,0,0)">(</span><span class="" style="color:rgb(0,0,0)">SPOOLER</span><span class="" style="color:rgb(0,0,0)">,</span> <span class="" style="color:rgb(205,85,85)">"| cat -v | lpr -h 2>/dev/null"</span><span class="" style="color:rgb(0,0,0)">)</span></li>
<li style="background-color:rgb(238,238,221);padding-left:5px;color:rgb(38,38,38);padding-bottom:2px"> || <a class="" href="http://perldoc.perl.org/functions/die.html" style="color:rgb(102,102,102);font-weight:bold">die</a> <span class="" style="color:rgb(205,85,85)">"can't fork: $!"</span><span class="" style="color:rgb(0,0,0)">;</span></li>
<li style="background-color:rgb(238,238,221);padding-left:5px;color:rgb(38,38,38);padding-bottom:2px"> <a class="" href="http://perldoc.perl.org/functions/local.html" style="color:rgb(102,102,102);font-weight:bold">local</a> <span class="" style="color:rgb(0,104,139)">$SIG</span>{<span class="" style="color:rgb(0,0,0)">PIPE</span>} = <a class="" href="http://perldoc.perl.org/functions/sub.html" style="color:rgb(102,102,102);font-weight:bold">sub</a> <span class="" style="color:rgb(0,0,0)">{</span> <a class="" href="http://perldoc.perl.org/functions/die.html" style="color:rgb(102,102,102);font-weight:bold">die</a> <span class="" style="color:rgb(205,85,85)">"spooler pipe broke"</span> <span class="" style="color:rgb(0,0,0)">}</span><span class="" style="color:rgb(0,0,0)">;</span></li>
<li style="background-color:rgb(238,238,221);padding-left:5px;color:rgb(38,38,38);padding-bottom:2px"> <a class="" href="http://perldoc.perl.org/functions/print.html" style="color:rgb(102,102,102);font-weight:bold">print</a> <span class="" style="color:rgb(0,104,139)">SPOOLER</span> <span class="" style="color:rgb(205,85,85)">"stuff\n"</span><span class="" style="color:rgb(0,0,0)">;</span></li>
<li style="background-color:rgb(238,238,221);padding-left:5px;color:rgb(38,38,38);padding-bottom:2px"> <a class="" href="http://perldoc.perl.org/functions/close.html" style="color:rgb(102,102,102);font-weight:bold">close</a> <span class="" style="color:rgb(0,0,0)">SPOOLER</span> || <a class="" href="http://perldoc.perl.org/functions/die.html" style="color:rgb(102,102,102);font-weight:bold">die</a> <span class="" style="color:rgb(205,85,85)">"bad spool: $! $?"</span><span class="" style="color:rgb(0,0,0)">;</span></li>
<li></li></ol></pre></div><div>I.e., instead of using the shell's "<span style="font-family:arial,sans-serif">echo $1 |</span><span style="font-family:arial,sans-serif"> ", I'll write to my own filehandle opened by Perl, just like above. </span></div>
<div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:arial,sans-serif">However, after trying that, I realized that it is not working. </span></div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">
<div class="gmail_extra"><span style="font-family:arial,sans-serif;font-size:13px"></span></div><div class="gmail_extra"><span style="font-family:arial,sans-serif;font-size:13px">I *have to* process the </span><span style="font-family:arial,sans-serif">matching string via the </span><span style="font-family:arial,sans-serif;font-size:13px">external program. Is there any way I can get around this? </span></div>
</div></blockquote></div></div></blockquote><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<span style="font-family:arial,sans-serif;font-size:13px">Why not just print the extracted $1 to stdout ... that gets piped into wc ?</span></blockquote><div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div>
<div><span style="font-family:arial,sans-serif;font-size:13px">See below.</span></div><div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div>The problem is that I not only need to <span style="font-size:13px;font-family:arial,sans-serif">process the </span><span style="font-family:arial,sans-serif">matching string via the </span><span style="font-size:13px;font-family:arial,sans-serif">external program, but I also need to replace the </span><span style="font-family:arial,sans-serif">matching string with the result of </span><span style="font-family:arial,sans-serif">the </span><span style="font-size:13px;font-family:arial,sans-serif">external process. Putting the two together is where the difficulty lies. </span></div>
<div><span style="font-size:13px;font-family:arial,sans-serif"><br></span></div><div><span style="font-size:13px;font-family:arial,sans-serif">This is my previous code:</span></div><div><span style="font-size:13px;font-family:arial,sans-serif"><br>
</span></div><div><span style="font-size:13px;font-family:arial,sans-serif"> perl -n000e </span><span style="font-size:13px;font-family:arial,sans-serif">'s,(?<=">)(.*?)(?=</</span><span style="font-size:13px;font-family:arial,sans-serif">HttpBody>),</span><font face="arial, sans-serif">`echo $1 | wc -c`,eg; print'</font><span style="font-size:13px;font-family:arial,sans-serif"><br>
</span></div><div><font face="arial, sans-serif"><br></font></div><div>This is what I tried just now:</div><div><br></div><div> <span style="background-color:transparent;font-family:Calibri;font-size:13px;white-space:pre-wrap">perl -n000e 'BEGIN { open(SPOOLER, "| tee /dev/tty | wc -c") || die "cannot fork: $!"; local $SIG{PIPE} = sub { die "spooler pipe broke" }; sub process { print SPOOLER "$_[0]"; }; }; s,(?<=">)(.*?)(?=</HttpBody>),process $1,eg; print' </span></div>
<div><br></div><div>Can you see what's wrong? The replacement is not the output of "<span style="font-family:arial,sans-serif">wc -c"; instead, every match is replaced with 1. </span></div><div><span style="font-family:arial,sans-serif"><br>
</span></div><div><span style="font-family:arial,sans-serif">So to recap, </span></div><div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:arial,sans-serif">I need to pick out a big chunk of the input string (>200K), feed it to </span><span style="font-family:arial,sans-serif;font-size:13px">an external program (which is pipe after </span><span style="font-family:arial,sans-serif;font-size:13px">pipe after</span><span style="font-family:arial,sans-serif;font-size:13px"> </span><span style="font-family:arial,sans-serif;font-size:13px">pipe)</span><span style="font-size:13px;font-family:arial,sans-serif">, then replace the </span><span style="font-family:arial,sans-serif">matching string with the </span><span style="font-size:13px;font-family:arial,sans-serif">processed result. What's the proper way to do it (for big </span><span style="font-family:arial,sans-serif">matching </span><span style="font-family:arial,sans-serif">chunks</span><span style="font-family:arial,sans-serif">)</span><span style="font-size:13px;font-family:arial,sans-serif">? </span></div>
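<div><span style="font-family:arial,sans-serif"><br></span></div><div><span style="font-family:arial,sans-serif">For what it's worth, the 1s in the attempt above are simply the return value of print: the output of wc -c goes to the terminal and never comes back into Perl. Below is a minimal sketch of one possible way around this, assuming a POSIX shell and a temp file path without shell metacharacters. The helper name filter_through is hypothetical. It writes the match to the command through a pipe opened by Perl, so the shell's argument length limit does not apply, and it captures the command's stdout in a File::Temp file that is read back as the replacement.</span></div>
<div><pre>
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper (not from perlipc): feed $data to the shell pipeline
# $cmd on its stdin and return whatever the pipeline wrote to stdout.
sub filter_through {
    my ($cmd, $data) = @_;

    # The pipeline's stdout is redirected into a temporary file, so Perl only
    # writes while the command runs and cannot deadlock on a full pipe.
    # Assumption: the temp file path contains no shell metacharacters.
    my $tmp  = File::Temp->new;    # removed when $tmp goes out of scope
    my $path = $tmp->filename;

    local $SIG{PIPE} = 'IGNORE';   # a broken pipe then surfaces in close()
    open(my $to_cmd, '|-', "( $cmd ) > $path") or die "cannot fork: $!";
    print {$to_cmd} $data;
    close($to_cmd) or die "command failed: $! $?";

    # Read the captured output back through the handle File::Temp opened.
    seek($tmp, 0, 0) or die "cannot rewind $path: $!";
    my $out = do { local $/; readline($tmp) };
    return defined $out ? $out : '';
}

# Same substitution as the one-liner, reading paragraphs (-000) from STDIN.
{
    local $/ = '';
    while (defined(my $chunk = readline(STDIN))) {
        $chunk =~ s{(?<=">)(.*?)(?=<\/HttpBody>)}{filter_through('wc -c', $1)}eg;
        print $chunk;
    }
}
</pre></div>
<div><span style="font-family:arial,sans-serif">Note that, unlike "echo $1", print adds no trailing newline, so the byte count from wc -c comes out one lower than with the backtick version. Any longer pipeline can be passed in place of 'wc -c', since the whole command is wrapped in a subshell.</span></div>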
<div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div><span style="font-family:arial,sans-serif;font-size:13px">Thanks</span></div><div><br></div></div></div></div>