APM: [Lopsa-us-tx-austin] Musings: Current state of log capture and analysis...
jameschoate at austin.rr.com
Thu Jul 1 12:29:50 PDT 2010
---- Mark Farver <mfarver at mindbent.org> wrote:
> My experience boiled down to two choices.
>
> 1. Splunk. Which is just awesome, scales ok up to a couple hundred
> gigs a day and is easy to use. It is priced by the GB, and the price
> is heart attack inducing.
Yep, that kills it right there.
> 2. Roll your own... basically a bunch of syslog collectors writing to
> Hadoop/HDFS (if you expect to actually analyze all of that data)
I guess it's time to talk IP then...
> Either way, you'll need to build a rack or two of high disk capacity
> machines to house the data on. The nice thing is Hadoop works pretty
> well on generic server hardware and consumer grade disks. Stuff a
> machine with 2TB disks and you can pack as much as 12TB into 1U. I
> recommend starting with about 10-20 machines and scaling up. Much
> less than that and you'll have the disk space but probably not enough
> CPU to do analysis.
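As a rough sanity check on the figures quoted above, here is a back-of-the-envelope sketch in Python. It takes the 12TB-per-1U and 10-20 machine numbers from Mark's message; the replication factor of 3 (the HDFS default) and the 200 GB/day ingest rate are my assumptions, not numbers anyone in this thread has measured.

```python
# Back-of-the-envelope capacity estimate for the cluster sizes quoted above.
# Assumptions (not from the thread): replication factor 3 (the HDFS default)
# and 200 GB/day standing in for "a couple hundred gigs a day".

TB_PER_NODE = 12          # 1U box filled with 2TB disks, per the quoted figure
REPLICATION = 3           # assumed
INGEST_GB_PER_DAY = 200   # assumed


def usable_tb(nodes, tb_per_node=TB_PER_NODE, replication=REPLICATION):
    """Raw capacity divided by the number of copies kept of each block."""
    return nodes * tb_per_node / replication


def retention_days(nodes, ingest_gb_per_day=INGEST_GB_PER_DAY):
    """How many days of logs fit, using 1 TB = 1000 GB."""
    return usable_tb(nodes) * 1000 / ingest_gb_per_day


if __name__ == "__main__":
    for nodes in (10, 20):
        print(f"{nodes} nodes: {usable_tb(nodes):.0f} TB usable, "
              f"about {retention_days(nodes):.0f} days of logs")
```

Under those assumptions the disk is not the constraint at this scale, which fits Mark's point that the machine count is driven by CPU for analysis rather than by storage.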
The current plan is a single machine that stores the files from the various head-ends, keeping five days' worth of each. The analysis will be done on other boxes and at this point isn't my problem.
I'm opting for a pull mechanism driven from the server; I see a push from the clients as taking too much maintenance.
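The five-day retention on the collection box could be handled by a small cron job. This is only a minimal sketch, assuming each head-end's files land in a single directory and that a file's modification time reflects when it was pulled. The names `expired` and `prune` are hypothetical, and the pull itself is left out.

```python
import os
import time

KEEP_DAYS = 5  # retention window from the plan above
SECONDS_PER_DAY = 86400


def expired(paths_with_mtime, now, keep_days=KEEP_DAYS):
    """Return the paths whose mtime is older than the retention window."""
    cutoff = now - keep_days * SECONDS_PER_DAY
    return [path for path, mtime in paths_with_mtime if mtime < cutoff]


def prune(directory, now=None, keep_days=KEEP_DAYS):
    """Delete regular files in `directory` older than `keep_days`.

    Assumes a flat directory per head-end (hypothetical layout).
    Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((path, os.path.getmtime(path)))
    doomed = expired(entries, now, keep_days)
    for path in doomed:
        os.remove(path)
    return doomed
```

Keeping the age test in `expired` separate from the deletion in `prune` makes it easy to dry-run the selection before anything is actually removed.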
> Expect that this system is going to require at least a full time
> employee seat or two. Probably a Hadoop admin, and a
> programmer/report writer. Hadoop is pretty easy to setup, but actual
> data analysis takes some skill. I can give you some pointers, or you
> might be able to find a Rackspace Hadoop person (there are quite a few
> in SA) that would moonlight.
In one of these comments I mentioned the fiscal responsibility of cable companies...
Thanks for the feedback Mark.
--
-- -- -- --
Venimus, Vidimus, Dolavimus
jameschoate at austin.rr.com
james.choate at g.austincc.edu
james.choate at twcable.com
h: 512-657-1279
w: 512-845-8989
http://hackerspaces.org/wiki/Confusion_Research_Center
Adapt, Adopt, Improvise
-- -- -- --