APM: [Lopsa-us-tx-austin] Musings: Current state of log capture and analysis...

jameschoate at austin.rr.com
Thu Jul 1 12:29:50 PDT 2010


---- Mark Farver <mfarver at mindbent.org> wrote: 
> My experience boiled down to two choices.
> 
> 1. Splunk.  Which is just awesome, scales ok up to a couple hundred
> gigs a day and is easy to use.  It is priced by the GB, and the price
> is heart attack inducing.

Yep, that kills it right there.
 
> 2. Roll your own... basically a bunch of syslog collectors writing to
> Hadoop/HDFS (if you expect to actually analyze all of that data)

I guess it's time to talk IP then...

> Either way, you'll need to build a rack or two of high disk capacity
> machines to house the data on.  The nice thing is Hadoop works pretty
> well on generic server hardware and consumer grade disks.  Stuff a
> machine with 2TB disks and you can pack as much as 12TB into 1U.  I
> recommend starting with about 10-20 machines and scaling up... much
> less than that and you'll have the disk space but probably not enough
> CPU to do analysis.
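
For anyone following along, the sizing above is easy to sanity-check. This is only a back-of-the-envelope sketch. The 12TB per machine and the 10-20 machine range come from Mark's note. The replication factor of 3 is the HDFS default. The 200 GB/day ingest rate is my own assumption, borrowed from the "couple hundred gigs a day" figure. It ignores compression, filesystem overhead and scratch space for jobs.

```python
def usable_tb(machines, raw_tb_per_machine=12, replication=3):
    """Usable capacity in TB once every block is stored `replication` times."""
    return machines * raw_tb_per_machine / replication


def retention_days(machines, ingest_gb_per_day, **kwargs):
    """Days of logs that fit, using 1 TB = 1000 GB and no compression."""
    return usable_tb(machines, **kwargs) * 1000 / ingest_gb_per_day


if __name__ == "__main__":
    for n in (10, 20):
        # 200 GB/day is an assumed ingest rate, not a measured one.
        print(n, "machines:", usable_tb(n), "TB usable,",
              round(retention_days(n, 200)), "days at 200 GB/day")
```

With those assumptions, 10 machines give about 40 TB usable (roughly 200 days of logs) and 20 machines about double that.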

The current plan is a single machine to store the files from the various head-ends, keeping five days' worth of each. The analysis will get done on other boxes and at this point isn't my problem.

I'm opting for a pull mechanism from the server; I see a push from the clients as taking too much maintenance.
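
A minimal sketch of what that pull-and-prune job might look like, not the actual implementation. It assumes rsync over ssh is available on the head-ends. The host name, remote path and local root in the usage note are hypothetical, and the function names are mine. Because rsync -a preserves modification times, the prune step ages files by when they were written on the head-end, not by when they were pulled.

```python
import subprocess
import time
from pathlib import Path

RETENTION_DAYS = 5  # keep five days' worth per head-end


def build_pull_command(host, remote_dir, local_root):
    """Return the rsync argv that copies remote_dir on host into local_root/host/."""
    dest = Path(local_root) / host
    src = f"{host}:{remote_dir.rstrip('/')}/"
    return ["rsync", "-az", "--timeout=60", src, f"{dest}/"]


def pull(host, remote_dir, local_root):
    """Pull one head-end's logs; raises CalledProcessError if rsync fails."""
    (Path(local_root) / host).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pull_command(host, remote_dir, local_root), check=True)


def prune(local_root, days=RETENTION_DAYS, now=None):
    """Delete files whose mtime is older than `days` days; return removed paths."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    removed = []
    # Materialize the listing first so we are not deleting while walking.
    for path in list(Path(local_root).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Run from cron on the collector, something like pull("headend1", "/var/log", "/srv/logs") for each head-end followed by one prune("/srv/logs") would cover it. The clients need nothing beyond an ssh key, which is the whole point of pulling.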

> Expect that this system is going to require at least a full time
> employee seat or two.  Probably a Hadoop admin, and a
> programmer/report writer.  Hadoop is pretty easy to setup, but actual
> data analysis takes some skill.  I can give you some pointers, or you
> might be able to find a Rackspace Hadoop person (there are quite a few
> in SA) that would moonlight.

In one of these comments I mentioned the fiscal responsibility of cable companies...

Thanks for the feedback Mark.

--
 -- -- -- --
Venimus, Vidimus, Dolavimus

jameschoate at austin.rr.com
james.choate at g.austincc.edu
james.choate at twcable.com
h: 512-657-1279
w: 512-845-8989
http://hackerspaces.org/wiki/Confusion_Research_Center

Adapt, Adopt, Improvise
 -- -- -- --

