<div dir="ltr"><div>Team,</div><div><br></div><div>  Sharing a recap of our impromptu discussion. We can certainly revisit it for a deeper dive in the coming months.</div><div><br></div><div>Thanks,<br>Ramana </div><div><b><br></b></div><div><b><br></b></div><div><b><br></b></div><div><b>A) dt - a data transformation/tracking framework<br></b><br></div><div>Goal<b style="font-weight:bold">:</b><span style="font-weight:normal"> The ability to compose tabular data from multiple sources in an automation-friendly framework, and to track the results in a revision-controlled fashion.</span><br></div><div><br></div><div>Sources can be structured (e.g. a relational DB, or xls/csv/tsv/ods and similar formats) or unstructured (e.g. an HTML table in a webpage).</div><div>Jobs can be scheduled (say, via cron), or results can be viewed interactively.</div><div>The ability to combine tables (e.g. join, pivot, group, order, subset, union, intersection) is essential.</div><div>Persistence of results (final as well as intermediate).</div><div>Tracking to find trends over time, e.g. how a row (or a field of a table) evolved.</div><div><br></div><div>Imagine a use case: combine diverse data (from an MS Excel file on Dropbox, a Numbers worksheet on a Mac, and an HTML table in a webpage), enrich it using web services (e.g. to look up a stock price), put the result into a SQLite database, and push it to a web page and a Google Sheet for users to view. The same result can be viewed interactively from a terminal, with periodic refresh.</div><div><br></div><div>Wow... that seems like a valuable weapon that saves tons of cycles, at least in my world. :-)</div><div><br></div><div>We reviewed version 0.01 of the tool during yesterday's OCPM meeting. 
The tool was built using easydatabase (<a href="https://sites.google.com/site/easydatabase/">https://sites.google.com/site/easydatabase/</a>) and sqlite3, using Perl's ability to glue diverse data sources together.</div><div><br></div><div><span style="background-color:rgb(255,255,0)">[snip]</span></div><div>$ dt infile=test.csv [outformat=psv]</div><div>              Formats input as uniform-width 'psv' (pipe-separated values)</div><div>$ cat test.csv | dt</div><div>              Takes input from a pipe</div><div>$ dt infile=test.csv infile=test1.csv command='$dt->[0]=$dt->[0]->join($dt->[1], 0, ["Name"], ["Name"], {renameCol => 1})' outformat=xls outfile=t.xls</div><div>               Composes tables and stores the result</div><div>$ cat t.csv | dt informat=csv command='dt2db($dt->[0], undef, "t.db", "t")'</div><div>                Creates persistent DB tables from in-memory Data::Table objects</div><div>$ sqlite3 -header t.db "select * from t"</div><div>                DB and SQL access</div><div>$ cat t.csv | dt informat=csv command='dt2db($dt->[0], undef, "t.db", "t")'</div><div>                 When the data changes, only the changed rows are updated, giving a time history of the data</div><div><span style="background-color:rgb(255,255,0)">[/snip]</span></div><div><br></div><div>Question: Is there a tool out there that solves this class of problems? If so, we can learn from it and adopt it; otherwise we can refine and release this tool.</div><div><br></div><div><br></div><div><b>B) Webscraping question</b></div><div><br></div><div>How do we fetch the JSON from a URL, e.g. "<a href="https://www.tipranks.com/api/stockInfo/getDetails/?name=aapl">https://www.tipranks.com/api/stockInfo/getDetails/?name=aapl</a>", via a Perl script?</div><div>It looks like requests sent with the usual headers run into some challenge.</div></div>