[Madrid-pm] Help with a script

Host Host greathost at gmail.com
Wed Feb 18 07:59:36 PST 2009


Hi guys, could anyone give me a hand?


This project is a health check of the flow of SQL statements from each
client node to a main node.
The structure is as follows:

One main node plus 3 client nodes that report events via SQL (MySQL) to the
main node. The process that sends the events is written in Java, so I have
no mechanism to detect whether that application has failed or frozen.
I need a workaround that checks whether the process is still sending these
SQL statements, and that checks the log files.

So I need the following:

Flow:
1. INSERT a given event, in SQL format, into the SQL flow from the 3 slave
nodes to the master node every 5 minutes.

2. Check with an SQL statement that the event arrived fine (in time). If it
does not arrive, something went wrong.
This can be caused by a micro-cut in communications, or by something the
product's technical team is still investigating.
So the solution is to check whether these events arrive. If all is fine, I
do nothing; if it fails, I call a script on the system that kills the
process and starts the node process again:
a) I stop the slave node,
b) I stop the master node and wait until the log file says it has stopped
(I look for that word),
c) I start the master node (reading the log again to see that it is
completely up and running), and
d) I start every slave node in order: first 1, then 2, then 3.
On every node I read its log file to check that it is up and running (for
example, I read "I'm ready" in the log file).
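
The "wait until the log says so" part of steps b), c) and d) could be
sketched roughly like this. It is only a minimal sketch: wait_for_phrase and
its parameters (timeout, polling interval, starting offset) are names
invented for illustration, not part of the product.

```perl
use strict;
use warnings;

# Poll a log file until $phrase appears or $timeout seconds have passed.
# Returns 1 if the phrase was seen, 0 on timeout.
# $offset (hypothetical) is the byte position to start reading from, so that
# messages left over from a previous start are not matched again.
sub wait_for_phrase {
    my ($log_path, $phrase, $timeout, $poll, $offset) = @_;
    $poll   = 5 unless defined $poll;
    $offset = 0 unless defined $offset;
    my $deadline = time() + $timeout;
    while (1) {
        if (open my $fh, '<', $log_path) {
            seek $fh, $offset, 0;
            while (my $line = <$fh>) {
                if (index($line, $phrase) >= 0) {
                    close $fh;
                    return 1;
                }
            }
            close $fh;
        }
        return 0 if time() >= $deadline;
        sleep $poll;
    }
}
```

For instance, wait_for_phrase($log, 'Server Successfully Shut Down', 120)
before starting the master, where $log stands for whatever path the node
log really has.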

Sometimes when I read the log files on the slave nodes I find the following
text:
"Connection lost with the Central Server"
This sometimes happens for no apparent reason, so I need to correct it by
calling kill-my-process.sh to kill the process completely, and then wait
until I read in the log "Please connect your client to the web server on
port: 7070".
Before I send the kill signal, the connection sometimes re-establishes by
itself, so I have to "sleep 60" and then check whether the log files
contain "Connection re-established with the Central Server". If they do, I
do nothing and just keep sending the SQL events every 5 minutes.
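
The lost / re-established decision could look roughly like this. It is a
sketch only: connection_state and needs_kill are made-up names, and
$read_log is assumed to be a code reference that returns the current log
lines of the slave node.

```perl
use strict;
use warnings;

# Return 'lost' or 'ok' depending on which connection message appears last.
sub connection_state {
    my (@lines) = @_;
    my $state = 'ok';
    for my $line (@lines) {
        if ($line =~ /Connection lost with the Central Server/) {
            $state = 'lost';
        }
        elsif ($line =~ /Connection re-established with the Central Server/) {
            $state = 'ok';
        }
    }
    return $state;
}

# Return 1 only if the connection is still lost after the grace period.
sub needs_kill {
    my ($read_log, $grace_seconds) = @_;
    return 0 if connection_state($read_log->()) eq 'ok';
    sleep $grace_seconds;    # give the node time to reconnect by itself
    return connection_state($read_log->()) eq 'lost' ? 1 : 0;
}
```

With a grace period of 60 this mirrors the "sleep 60" above; only when
needs_kill returns 1 would kill-my-process.sh be called.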

3. When the application has fully restarted, it writes a log line that
shows its status. This is the phrase: "Please connect your client to the
web server on port: 9090". I need to write an auxiliary file that always
holds the current status of the nodes (master and slaves), so I can read it
before acting.
The statuses are the following, depending on what I read in the last lines
of the log file (usually the last 2 or 3 phrases):
"Please connect your client to the web server on port: 7070" -->
STATUS:UP&RUNNING
"Server Successfully Shut Down" --> STATUS:DOWN
"Server modules started with problems" --> STATUS:PROBLEMS

I need to copy the last status line from the original log file to an
auxiliary file that holds the latest status of the process (on the main
node and on the slave nodes too).
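
A rough sketch of that phrase-to-status mapping and of the auxiliary file.
The names status_from_log and write_status, the 'UNKNOWN' fallback and the
one-line file format are all assumptions made for the example. The port
number is left out of the match on purpose, since both 7070 and 9090 appear
in the phrases above.

```perl
use strict;
use warnings;

# Walk the log lines from newest to oldest and return the status that
# belongs to the first recognised phrase ('UNKNOWN' if none is found).
sub status_from_log {
    my (@lines) = @_;
    my @phrases = (
        [ 'Please connect your client to the web server on port', 'UP&RUNNING' ],
        [ 'Server Successfully Shut Down',                        'DOWN'       ],
        [ 'Server modules started with problems',                 'PROBLEMS'   ],
    );
    for my $line (reverse @lines) {
        for my $pair (@phrases) {
            my ($phrase, $status) = @$pair;
            return $status if index($line, $phrase) >= 0;
        }
    }
    return 'UNKNOWN';
}

# Overwrite the auxiliary file with the latest status of one node.
sub write_status {
    my ($status_path, $node, $status) = @_;
    open my $fh, '>', $status_path or die "Cannot write $status_path: $!";
    print {$fh} "$node STATUS:$status\n";
    close $fh or die "Cannot close $status_path: $!";
}
```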

When a node is up and running, I do nothing.
When it is down, I check that the other nodes are stopped before starting
the main node again.
When I read problems, I restart the whole process to bring all nodes up
again (stop main node, stop slave nodes, start main node, start slave
node 1, slave node 2, slave node 3) and go back to waiting in the loop.
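
These three rules could be written down as an ordered list of steps. This
is only one reading of the rules: plan_for_status is a made-up name, and
for the DOWN case it assumes the slaves are started again after the main
node.

```perl
use strict;
use warnings;

# Return the ordered [action, node] steps for a given status of the main
# node. An empty list means "do nothing".
sub plan_for_status {
    my ($status, $master, @slaves) = @_;
    my @stop_slaves  = map { [ stop  => $_ ] } @slaves;
    my @start_slaves = map { [ start => $_ ] } @slaves;

    return () if $status eq 'UP&RUNNING';
    return (@stop_slaves, [ start => $master ], @start_slaves)
        if $status eq 'DOWN';
    return ([ stop => $master ], @stop_slaves, [ start => $master ], @start_slaves)
        if $status eq 'PROBLEMS';
    die "Unknown status: $status";
}
```

Each step would then be carried out one at a time, reading the log of the
node after every step before moving on to the next one.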


The idea of the flow is that every slave node sends a witness (testigo: an
SQL statement) to the main node, for example every 5 minutes, to check that
it arrives correctly. The main node receives this witness, prints
"node 1 are ok" in the log and does nothing else; the same goes for nodes
2 and 3.

If I have problems receiving this witness, I assume there is a problem with
node x, and I need to restart the nodes in order: 1st the master node, 2nd
slave node 1, 3rd slave node 2, 4th slave node 3.
When they are all down (I read their status from the auxiliary log file) I
can restart them in order: first node 1, then node 2, stopping to check
that node 2 is up, and then node 3.
To send the kill command I must make a remote system call. I don't care
whether it is done with RSH::Net or system_call($command); it calls a file
named kill_node_process.sh. This script kills the node and starts it up
again, so all I need to do is wait for it to restart and read the log file
that tells me all is fine.
And of course I will check the node's SQL events again to confirm that the
node is synchronizing again.
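
The remote call could be as small as the sketch below. It assumes ssh is
available between the nodes (rsh would be called the same way) and that
kill_node_process.sh can be found on the remote side; remote_kill_command
and kill_remote_node are names invented for the example.

```perl
use strict;
use warnings;

# Build the argument list that runs the restart script on a remote node.
sub remote_kill_command {
    my ($host, $script) = @_;
    $script = 'kill_node_process.sh' unless defined $script;
    return ('ssh', $host, $script);
}

# Run the command. system() in list form does not go through the local
# shell. Returns true when the remote script exits with status 0.
sub kill_remote_node {
    my ($host) = @_;
    my $rc = system(remote_kill_command($host));
    return $rc == 0;
}
```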

At any moment I may read the following in the master node log:
"Connection lost with the Client Server". Then I must send the kill process
to slave nodes 1, 2 and 3, and then wait to see whether events arrive fine.




Event table in MySQL, from a node server


desc Event;

+-----------+--------------+------+-----+---------+-------+
| Field     | Type         | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| ID        | int(11)      |      | PRI | 0       |       |
| TEXT      | text         | YES  |     | NULL    |       |
| CATEGORY  | varchar(100) | YES  |     | NULL    |       |
| DDOMAIN   | varchar(100) | YES  |     | NULL    |       |
| NETWORK   | varchar(100) | YES  |     | NULL    |       |
| NODE      | varchar(100) | YES  |     | NULL    |       |
| ENTITY    | varchar(150) |      | MUL |         |       |
| SEVERITY  | int(11)      | YES  |     | NULL    |       |
| TTIME     | bigint(20)   | YES  | MUL | NULL    |       |
| SOURCE    | varchar(100) | YES  |     | NULL    |       |
| HELPURL   | varchar(100) | YES  |     | NULL    |       |
| WEBNMS    | varchar(100) | YES  |     | NULL    |       |
| GROUPNAME | varchar(100) | YES  |     | NULL    |       |
| OWNERNAME | varchar(25)  |      | PRI |         |       |
+-----------+--------------+------+-----+---------+-------+
14 rows in set (0.01 sec)


How can I make the INSERT into the table?
As you can see, I need to read the LAST ID first and then add 1 ($ID+1):

select id from Event order by ttime desc limit 1;

This select gives me the ID to use later in the SQL statement.
The time (TTIME) looks like this: 1234343883752.
I think it is produced with time() * 1000 when the insert is made, but I'm
not sure (and I don't mind). What I want is to check it afterwards: verify
that the SQL arrived in less than 5 minutes (300 seconds) by subtracting it
from the local time, this way:

$node1_time = 1234343883752;               # SELECT TTIME from table...
$local_time = time() * 1000;               # now, in milliseconds
$difference = $local_time - $node1_time;   # age of the last witness, in ms
if ($difference > 300 * 1000) {            # more than 5 minutes late
    sign_kill_signal_to_node1();
    ...
}
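
The same check for all three nodes at once, as a sketch. late_nodes and
MAX_DELAY_MS are made-up names, and the hash is assumed to hold the newest
TTIME (in milliseconds) that the main node has received from each node.

```perl
use strict;
use warnings;

use constant MAX_DELAY_MS => 300 * 1000;    # 5 minutes, in milliseconds

# Return the names of the nodes whose last witness is older than 5 minutes.
# $latest_ttime is a hash reference: node name => newest TTIME in ms.
sub late_nodes {
    my ($latest_ttime, $now_seconds) = @_;
    $now_seconds = time() unless defined $now_seconds;
    my $now_ms = $now_seconds * 1000;
    my @late = grep { $now_ms - $latest_ttime->{$_} > MAX_DELAY_MS() }
               sort keys %$latest_ttime;
    return @late;
}
```

Every name returned would be a candidate for the remote kill.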




This is a real SELECT from the table Event:

     | 10.33.109.189.barajas_dms_1  | 1234465208249 |
|  55558 | Process Memory usage threshold exceeded Of Instance
-->oracle.4134 Value: 339320 Data:
ProcessMemoryUsage : 10.33.109.189.barajas_dms_1 : 25.5.1.1.2 Threshold
Type: maxMajor Threshold : 204800Major
Rearm Value: 204799


What I really care about are the following values:

ID: (generated with the select given before)

select id from Event order by TTIME desc limit 1;  (remember: INSERT $ID+1)
+-------+
| id    |
+-------+
| 70857 |
+-------+

NODE:
bar_dms_1 for slave node 1
leg_dms_2 for slave node 2
cas_dms_3 for slave node 3

TTIME: generated by $time=time()*1000; (or maybe 999, to match the numbers
I get from the select). Later I subtract it from the local time, and if
the result is more than 300 seconds it means I have been waiting for the
witness for more than 5 minutes, so I send the kill process remotely.

TEXT: I need the following text: "Alert: Filesystem: /cosa : 100 % used"
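
Putting the four values together, the INSERT could be built roughly like
this. It is a sketch under assumptions: witness_insert is a made-up helper,
and the columns it leaves out (ENTITY, OWNERNAME and the rest) are assumed
to fall back to the defaults shown by desc Event.

```perl
use strict;
use warnings;

# Build the INSERT for one witness event.
# Returns the SQL with placeholders followed by the values to bind.
sub witness_insert {
    my ($last_id, $node, $now_seconds) = @_;
    $now_seconds = time() unless defined $now_seconds;
    my $text = 'Alert: Filesystem: /cosa : 100 % used';
    my $sql  = 'INSERT INTO Event (ID, TEXT, NODE, TTIME) VALUES (?, ?, ?, ?)';
    my @bind = ($last_id + 1, $text, $node, $now_seconds * 1000);
    return ($sql, @bind);
}
```

With a connected DBI handle this would be run as
$dbh->do($sql, undef, @bind), once for each of the three node names.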







---------- Script that adds the SQL to a temporary file (FAILING) ----------

#!/usr/bin/perl

use strict;
use warnings;
use DBI;

# Main server DB connection data
my $db       = "CentralDB";
my $host     = "localhost";
my $puerto   = "3306";
my $usuario  = "root";
my $password = "";

# Connect to the database
my $DatosConexion = "DBI:mysql:database=$db;host=$host;port=$puerto";
my $dbh = DBI->connect($DatosConexion, $usuario, $password)
    or die "Can't connect: $DBI::errstr";

# QUERY: the latest event whose text contains 'cosa'
my $sth = $dbh->prepare(q{ SELECT ID,TTIME,NODE,TEXT from Event where
 lower(text) like '%cosa%' Order By TTIME desc limit 1 });

$sth->execute();

### Open the output file
open FILE, ">results.txt" or die "Can't open results.txt: $!";

### Dump the formatted results to the file
# <-- This is the call that fails: the row does not end up in the
#     auxiliary (status) file
my $rows = $sth->dump_results(1, \*FILE);

### Close the output file
close FILE or die "Error closing result file: $!\n";

---------------------------------------------------------------------------

OUTPUT:
MAIN_NODE# perl printmefiles.pl
'1...', '1...', '1...', 'A...'
1 rows
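
One likely cause, going by the DBI documentation: dump_results takes its
arguments as ($maxlen, $lsep, $fsep, $fh). Called as
dump_results(1, \*FILE), it gets 1 as the maximum field length and the
filehandle as the line separator, so the rows still go to STDOUT, cut down
to '1...'. A call such as $sth->dump_results(255, "\n", ", ", \*FILE)
should reach the file. Below is a sketch of an alternative that writes the
rows itself, assuming an already connected $dbh; format_row and
dump_latest_event are names made up here.

```perl
use strict;
use warnings;

# Format one fetched row as a tab-separated line; NULL columns become "NULL".
sub format_row {
    my ($row) = @_;
    return join("\t", map { defined $_ ? $_ : 'NULL' } @$row) . "\n";
}

# Run the query on a connected DBI handle and write every row to $path.
# Returns the number of rows written.
sub dump_latest_event {
    my ($dbh, $path) = @_;
    my $sth = $dbh->prepare(q{
        SELECT ID, TTIME, NODE, TEXT FROM Event
        WHERE lower(TEXT) LIKE '%cosa%'
        ORDER BY TTIME DESC LIMIT 1
    });
    $sth->execute();
    open my $out, '>', $path or die "Can't open $path: $!";
    my $rows = 0;
    while (my $row = $sth->fetchrow_arrayref) {
        print {$out} format_row($row);
        $rows++;
    }
    close $out or die "Error closing $path: $!";
    return $rows;
}
```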

---------------------------------------------------------------------------------------------------------------