<div dir="ltr"><div dir="ltr"><div>Interesting challenge. A quick search on CPAN led me to the Text::Mining package. Has anyone used it for this type of project?</div><div><a href="https://metacpan.org/pod/Text::Mining">https://metacpan.org/pod/Text::Mining</a></div><div>With the government shutdown, <a href="http://regulations.gov">regulations.gov</a> may not be approving new API keys. If anyone needs some example comments, I can put together a small archive.</div><div><br></div><div>Thanks,<br></div><div>dsk<br></div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 24, 2019 at 1:27 PM Mark Senn &lt;<a href="mailto:mark@purdue.edu">mark@purdue.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Purdue Perl Mongers,<br>
<br>
A person (I don't know offhand whether they want to be identified)<br>
demonstrated how to get information from <a href="http://federalregister.gov" rel="noreferrer" target="_blank">federalregister.gov</a> and/or (I<br>
don't remember which) <a href="http://regulations.gov" rel="noreferrer" target="_blank">regulations.gov</a> during our last meeting using an API.<br>
<br>
From<br>
<a href="https://www.regulations.gov/document?D=EPA-HQ-OAR-2017-0355-21117" rel="noreferrer" target="_blank">https://www.regulations.gov/document?D=EPA-HQ-OAR-2017-0355-21117</a><br>
  EPA received more than 270,000 comments on the ANPRM, which have<br>
  informed this proposed rulemaking.<br>
<br>
From<br>
<a href="https://www.wolframalpha.com/input/?i=270000+seconds" rel="noreferrer" target="_blank">https://www.wolframalpha.com/input/?i=270000+seconds</a><br>
  [270000 seconds is] 3.125 days<br>
<br>
Challenge problem: figure out how to use the API for <a href="http://regulations.gov" rel="noreferrer" target="_blank">regulations.gov</a> and<br>
"sentiment analysis" (google it) to automatically classify comments. I<br>
understand <a href="http://regulations.gov" rel="noreferrer" target="_blank">regulations.gov</a> limits the rate at which one can download<br>
information but if some "sentiment analysis" software can automatically<br>
classify comments faster/better/cheaper than humans or other existing<br>
software on a small trial, <a href="http://regulations.gov" rel="noreferrer" target="_blank">regulations.gov</a> may be interested in that. I<br>
certainly wouldn't want to read 270K comments and summarize them.<br>
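<br>
To make the classification half of the challenge concrete, here is a<br>
minimal sketch in Python of lexicon-based sentiment scoring. It does not<br>
touch the regulations.gov API at all. The word lists, the sample comments,<br>
and the names classify and tally are all made up for illustration; a real<br>
attempt would need a curated lexicon or a trained model, plus handling<br>
for negation ("do not support"), which this sketch ignores.<br>

```python
import re
from collections import Counter

# Hypothetical toy lexicon, invented for this sketch only.
POSITIVE = {"support", "approve", "benefit", "good", "protect", "favor"}
NEGATIVE = {"oppose", "reject", "harm", "bad", "repeal", "against"}


def classify(comment):
    """Label a comment by counting positive and negative lexicon hits."""
    words = re.findall(r"[a-z']+", comment.lower())
    # Each positive word adds 1, each negative word subtracts 1.
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    if score == 0:
        return "neutral"
    return "positive" if score > 0 else "negative"


def tally(comments):
    """Count how many comments fall under each label."""
    return Counter(classify(c) for c in comments)


# Invented example comments, not taken from any real docket.
SAMPLE = [
    "I strongly support this rule; it will protect public health.",
    "I oppose the proposal and urge the agency to reject it.",
    "Please extend the comment period by 30 days.",
]

if __name__ == "__main__":
    for label, count in tally(SAMPLE).items():
        print(label, count)
```

On a small trial, hand-labeling a few hundred comments and comparing<br>
them against the labels from classify would show whether something this<br>
crude is even in the right ballpark before scaling up to 270K comments.<br>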
<br>
-mark<br>
_______________________________________________<br>
Purdue-pm mailing list<br>
<a href="mailto:Purdue-pm@pm.org" target="_blank">Purdue-pm@pm.org</a><br>
<a href="https://mail.pm.org/mailman/listinfo/purdue-pm" rel="noreferrer" target="_blank">https://mail.pm.org/mailman/listinfo/purdue-pm</a><br>
</blockquote></div>