[Chicago-talk] Data for Charting

Warren Lindsey warren.lindsey at gmail.com
Mon May 18 14:09:19 PDT 2009


I've done this before.  Most graphing/charting modules do not handle
time-series datasets with gaps very well.  I think gnuplot can handle
time series with gaps, but I don't remember for certain.  rrdtool can
do it if you load your data into it, but that wasn't the solution I
wanted.  I just wanted to parse logfiles and build graphs without a
database step in between.

My solution was to load all of my data into a hash, then walk every
possible timestamp with nested for loops, printing the stored value
where a key exists and a null value where it does not.  This spaced
out the entries correctly in my graphs.

Completely untested code:

my %hash;

foreach my $line ( @logfile ) {
  # m/regex/ is a placeholder; it must capture these eight fields in order.
  my ($year, $month, $day, $hour, $minute, $second, $hostname, $entry)
    = ($line =~ m/regex/) or next;
  # Strip leading zeros so the keys match the loop counters below.
  $_ += 0 for $month, $day, $hour, $minute, $second;
  $hash{$hostname}{$year}{$month}{$day}{$hour}{$minute}{$second} = $entry;
}

# CHART is assumed to be a filehandle already opened for writing.
foreach my $host ( sort keys %hash ) {
  foreach my $year ( sort keys %{ $hash{$host} } ) {
    foreach my $month (1..12) {
      foreach my $day (1..31) {
        foreach my $hour (0..23) {
          foreach my $minute (0..59) {
            foreach my $second (0..59) {
              my $entry = $hash{$host}{$year}{$month}{$day}{$hour}{$minute}{$second};
              print CHART join(",", $host, $year, $month, $day, $hour,
                $minute, $second, defined $entry ? $entry : "null"), "\n";
            }
          }
        }
      }
    }
  }
}

The actual code had $min and $max values passed in for the time fields
and allowed a user to drill down in a logfile and generate a graph of
events across logfiles from multiple servers.  Obviously, the dataset
was not too big, so the $hash size in memory was not an issue.
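
A rough sketch of a bounded walk, for illustration only.  It is not the
actual code: instead of separate $min/$max values per time field it
assumes entries are keyed by epoch seconds, which also avoids visiting
dates that do not exist (February 31 and so on).  The fill_gaps name,
the $step parameter and the sample data are all made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: visit every $step seconds from $min to $max
# (epoch seconds, inclusive) and return one CSV row per slot, using
# "null" where the host has no entry for that slot.
sub fill_gaps {
    my ($data, $host, $min, $max, $step) = @_;
    # Fall back to an empty hash so an unknown host is not autovivified.
    my $series = $data->{$host} || {};
    my @rows;
    for (my $t = $min; $t <= $max; $t += $step) {
        my $value = defined $series->{$t} ? $series->{$t} : 'null';
        push @rows, join(',', $host, $t, $value);
    }
    return @rows;
}

# Made-up sample: two entries with a one-second gap between them.
my %data = ( web1 => { 100 => 'up', 102 => 'down' } );
print "$_\n" for fill_gaps(\%data, 'web1', 100, 103, 1);
```

With the sample data this prints four rows, with "null" in the slots
for 101 and 103.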

If you're pulling data from a database, a better solution would be to
do an outer join to a sequence and your data would be returned with
these null values already.
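
The same join-against-a-sequence idea can also be done in Perl after
the query returns.  Here is a minimal sketch using the quarterly sample
from the question quoted below; it fills gaps with 0 as the question
asks.  The quarter_sequence and fill_series helpers are hypothetical
names, and the first and last quarters are hard-coded where real code
would presumably take them from the data.

```perl
use strict;
use warnings;

# Sample data copied from the quoted question: quarter label => value.
my %series = (
    series1 => { 'Q3-2008' => 4716, 'Q4-2008' => 1025, 'Q2-2009' => 73 },
    series2 => { 'Q3-2008' => 1024, 'Q4-2008' => 445 },
    series3 => { 'Q4-2008' => 777 },
);

# Hypothetical helper: every quarter label from $first to $last, inclusive.
sub quarter_sequence {
    my ($first, $last) = @_;
    my ($q, $y)   = $first =~ /^Q([1-4])-(\d{4})$/ or die "bad quarter: $first";
    my ($lq, $ly) = $last  =~ /^Q([1-4])-(\d{4})$/ or die "bad quarter: $last";
    my @seq;
    while ($y < $ly || ($y == $ly && $q <= $lq)) {
        push @seq, "Q$q-$y";
        # Step to the next quarter, rolling over into the next year after Q4.
        ($q, $y) = $q == 4 ? (1, $y + 1) : ($q + 1, $y);
    }
    return @seq;
}

# Hypothetical helper: "outer join" one series against the sequence,
# returning [index, value] pairs with 0 where the series has no row.
sub fill_series {
    my ($data, @xaxis) = @_;
    return map { [ $_, defined $data->{ $xaxis[$_] } ? $data->{ $xaxis[$_] } : 0 ] }
           0 .. $#xaxis;
}

my @xaxis = quarter_sequence('Q3-2008', 'Q2-2009');
print "xaxis = @xaxis\n";
for my $name (sort keys %series) {
    my @line = fill_series($series{$name}, @xaxis);
    print "$name = ", join(' ', map { "[$_->[0],$_->[1]]" } @line), "\n";
}
```

Because the sequence is generated rather than collected from the data,
Q1-2009 shows up on the x axis even though no series has a row for it.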

Cheers,
Warren

On Mon, May 18, 2009 at 2:26 PM, David J. Young <younda at rcn.com> wrote:
> Mongers,
>
> Does anyone know of a good module to extract/create data suitable for use in a charting module/program?
>
> For example:
>
> The original dataset may look like this from a database query:
>
> series1:
> Q3-2008  4716
> Q4-2008  1025
> Q2-2009    73
>
> series2:
> Q3-2008: 1024
> Q4-2008:  445
>
> series3:
> Q4-2008:  777
>
> (Note, dataset above is not explicitly how it comes out of database.  This is just to illustrate the gaps in the data that need to be set to zero).
>
> From this, I'd like to get this:
> xaxis = qw(Q3-2008 Q4-2008 Q1-2009 Q2-2009);
> line1 = qw([0,4716] [1,1025] [2,0] [3,73]);
> line2 = qw([0,1024] [1,445] [2,0] [3,0]);
> line3 = qw([0,0] [1,777] [2,0] [3,0]);
>
> If you don't know of any modules, can you recommend any good strategies to do this?
>
> ydy

