[Purdue-pm] challenge problem: sentiment analysis

dsk zeewfo at gmail.com
Mon Jan 28 09:38:51 PST 2019


I put together a couple of files that hold 2438 comment records from the
https://www.regulations.gov/docket?D=EPA-HQ-OAR-2017-0355 docket.

In https://x646b.org/projects/ppm/sentiment/ , the .csv file is a
comma-separated values file and the .db file is a SQLite database with the
same information. Both have two columns of interest, comment_text and
attachments. comment_text holds the text of the comment record. If the
comment text says something like "See attached" or "See attached file(s)",
then the actual comment is only available as an attachment, and the name of
the attachment is stored in the attachments column. The attachments have
been archived and can be downloaded as a separate .tgz file from the same
directory as the other two files.
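
As a starting point, here is a rough Python sketch of how the records could
be split into those with inline text and those whose text lives in an
attachment. Only the column names comment_text and attachments come from the
files described above. The sample rows, the attachment file names, the
function names and the exact "See attached" pattern are assumptions made for
illustration, and the real data may word things differently.

```python
import csv
import io
import re

# Assumed pattern: comment text that only points at an attachment, such as
# "See attached" or "See attached file(s)". Adjust it after inspecting the data.
SEE_ATTACHED = re.compile(
    r"^\s*see attached(\s+file(\(s\)|s)?)?\s*\.?\s*$", re.IGNORECASE
)


def needs_attachment(row):
    """Return True when comment_text only refers the reader to an attachment."""
    return bool(SEE_ATTACHED.match(row.get("comment_text") or ""))


def split_records(rows):
    """Split rows into (inline, attached) lists based on comment_text."""
    inline, attached = [], []
    for row in rows:
        (attached if needs_attachment(row) else inline).append(row)
    return inline, attached


def load_rows(handle):
    """Read CSV rows as dicts keyed by the header line."""
    return list(csv.DictReader(handle))


# Made-up sample standing in for the real .csv file (hypothetical contents).
SAMPLE_CSV = """comment_text,attachments
I support this rule.,
See attached file(s),comment-0001.pdf
See attached,letter.docx
"""

rows = load_rows(io.StringIO(SAMPLE_CSV))
inline, attached = split_records(rows)
print(len(inline), "inline;", len(attached), "need an attachment")
```

To run this against the real data, pass an open file handle for the
downloaded .csv file to load_rows in place of the in-memory sample.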

Thanks,
dsk