doco cleanups
nabbi committed Oct 14, 2021
1 parent 4ce1d70 commit 76c613f
Showing 5 changed files with 92 additions and 71 deletions.
7 changes: 0 additions & 7 deletions INSTALL

This file was deleted.

13 changes: 13 additions & 0 deletions INSTALL.md
@@ -0,0 +1,13 @@
# Installation
If you change these paths, you will have to patch the script and/or update the references in syslog-ng.conf.
See USAGE.md for configuration setup.

## configuration files
```
cp -iv alert*.conf /etc/syslog-ng/
```

## script
```
cp -iv syslog-alert.tcl /usr/local/sbin/
```
6 changes: 4 additions & 2 deletions README.md
@@ -1,12 +1,14 @@

# syslog-alert
I wrote this to dabble with TclOO while improving my solution for throttling email alerts from syslog-ng.

* Provides per host message throttling and is intended to run on a centralized log collector.
* Configurable sub exclusions to filter out false alarm noise.
* Selectively alert different support groups, including shorter messages to pagers or mobile devices.

See USAGE file for more information
## concept
* Reads standard input from syslog-ng OSE by using the program() driver.
* Alerted logs are tracked within SQLite so recent occurrences can be discarded
* Sendmail recipients are compiled from group memberships within SQLite
* Two configuration files define the contacts and log pattern alert actions

See USAGE.md and INSTALL.md for more information
129 changes: 71 additions & 58 deletions USAGE → USAGE.md
@@ -1,164 +1,170 @@
### copyright 2020 nic@boet.cc
# Usage

I originally wrote this as I needed a solution for throttling sendmail events piped from syslog-ng
which provided custom per host+message throttling yet was flexible to # selectively send specific
events to a different distribution list or after hours pager/mobile.
which provided custom per host+message throttling yet was flexible to selectively send specific
events to a different distribution list or an after hours pager/mobile.

Delightfully written in Tcl as an experiment with TclOO
SQLite is used for an in memory database of tracking events

This has gone through various rewrites over the years. It started as a Perl script over a decade ago
which flocked a file for event tracking, it was a crude database (if you could even call it that).
Refreshed a few years ago in TclOO to leverage SQLite. For me, this was one of those scripts which
silently ran in the background and forgotten about. So this say around in this state for another year or
three before picking it up again.
silently ran in the background and was forgotten about. So it stayed in this state for another year or
three before I picked it up again.
This latest update expands filtering functionality, cleaner per group alerting, and leverages configuration
files instead of hacking the script.



### Notes on testing and troubleshooting
## Notes on testing and troubleshooting

As a result of using only an in memory database, if this program crashes or syslog-ng restarts,
you may see alerts which notify again within X threshold. syslog-ng expects said program not to exit and
continue to accept stdio -- you will see the pid change if this script happens to crash as syslog-ng will try
to restart it
continually accept stdio -- you will see the pid change if this script happens to crash as syslog-ng will restart it

syslog.info syslog-ng[]: Child program exited, restarting; cmdline='/usr/local/sbin/syslog-alert.tcl', status='256'
* syslog.info syslog-ng[]: Child program exited, restarting; cmdline='/usr/local/sbin/syslog-alert.tcl', status='256' *

This usually indicates a syntax issues introduced into the code base. Since the switch condition is now generated
from the user configuration file, errors have been reduced from manually editing the code.
There is still high probability of a bad confg to load or miss parse. Very minimal non-existent validations are performed.
This usually indicates a syntax issue introduced into the code base. Since the switch condition is now generated
from the user configuration file, errors have been reduced from manually editing conditions within code.
There is still a high probability of a bad config failing to load or being misparsed. Minimal to non-existent validation is performed
when reading these conf files.
Read up on Tcl lists if this is foreign to you.

That being said, what I have seen more likely is frequent restarts to syslog-ng itself triggering alerts to trigger again.
That being said, what I have seen more often is frequent restarts of syslog-ng itself causing alerts to trigger again;
that is the anticipated design.


You should, and are encouraged, to run this program directly to validate the behaviors.
You can run this program directly to validate the behaviors.
Toggle debug and trace flags to log more info to stdout
Paste or pipe in some test log messages to see what happens.
Paste or pipe in some test log messages to see what happens; these must match the syslog-ng template.

```
{datetime stamp} {sandbox} {error} {info} {smartd[12334]:} {test message}
{datetime stamp} {sandbox} {debug} {info} {smartd[555]:} {test message}
{datetime stamp} {sandbox} {debug} {info} {test[3333]:} {test message garbage}
```
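
A minimal shell sketch for producing such test lines without typing the braces by hand. The helper name `make_test_line` is hypothetical and not part of this repository; it only assumes the six-field order of the `t_alert` template described further down.

```shell
# Hypothetical helper: wrap six fields in braces, in the order the t_alert
# template emits them (ISODATE, HOST, FACILITY, LEVEL, MSGHDR, MSG).
make_test_line() {
    printf '{%s} {%s} {%s} {%s} {%s} {%s}\n' "$@"
}
```

The output can then be piped straight into the script, for example `make_test_line "datetime stamp" sandbox error info 'smartd[12334]:' 'test message' | /usr/local/sbin/syslog-alert.tcl`.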

Or to test this with syslog-ng generage events with logger (or trigger the real condition on the source)
Or, to test this with syslog-ng, generate events with logger (or trigger the real condition on the source)
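
As a rough sketch of the logger approach: the function name `send_test_event` and the `LOGGER_BIN` override are assumptions for illustration, and the defaults (`daemon.err`, tag `smartd`) are only picked so the event would pass an `err..emerg` level filter. Whether the event reaches the script depends on your own sources and filters.

```shell
# Hypothetical helper: send one test event through the local syslog socket.
# LOGGER_BIN is an assumed override so the command can be swapped for a dry run.
send_test_event() {
    # $1 = facility.level, $2 = program tag, $3 = message text
    "${LOGGER_BIN:-logger}" -p "${1:-daemon.err}" -t "${2:-smartd}" "${3:-test message}"
}
```

Calling `send_test_event daemon.err smartd "test message"` twice within the configured DELAY should produce only one alert if throttling behaves as described.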


### alert-contacts.conf
## alert-contacts.conf

copy into /etc/syslog-ng/
This file defines who, based on a group name, should receive an alert
This file defines, based on a group label name, who should receive an alert

syntax tcl list
```
{name} {group} {email@example.com} {pager@example.com}
{name} {group} {one@example.com, two@example.com} {pager1@example.com, pager2@example.com}
```

NAME is unused, config doco
GROUP is a label for a team or sme for who should receive a particular type of event
NAME is unused, exists for your config doco purposes
GROUP is a label for a team or SME for who should receive a particular type of event
EMAIL and PAGE/mobile both expect valid email addresses.
Multiple email addresses can be added as csv (ie ", " separator)
We aren't doing any SMS integration, so look up their carrier's phonenumber@domain online

Both are optional. Pages get a shorter formatted message, not the entire raw log like email.
If someone only has one type of contact method, leave the other blank (ie "{}")

```
{user1} {admin} {user1@example.com} {1111111111@carrier.com}
{user2} {admin} {user2@example.com, user2other@exampleother.com} {}
{user3} {disk} {} {3333333333@carrier.com}
{user4} {admin} {user4@example.com} {4444444444@carrier.com}
{user4} {disk} {user4@example.com} {4444444444@carrier.com}
```
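
Since the script does little validation of its own, a quick lint before deploying a config can catch a missing brace or field. The following is only a sketch under stated assumptions: `check_conf` is a hypothetical helper, it counts top-level `{...}` groups per line with awk rather than using a real Tcl list parser, and it assumes no braces appear inside the field values themselves.

```shell
# Hypothetical lint: report lines whose number of top-level {...} groups
# differs from the expected count, or whose braces are unbalanced.
# Reads the config on stdin; $1 is the expected number of fields.
check_conf() {
    expected=$1
    awk -v expected="$expected" '
        /^#/ || /^$/ { next }    # skip comments and empty lines
        {
            depth = 0; fields = 0
            for (i = 1; i <= length($0); i++) {
                c = substr($0, i, 1)
                if (c == "{") { if (depth == 0) fields++; depth++ }
                else if (c == "}") { depth-- }
            }
            if (fields != expected || depth != 0)
                printf "line %d: found %d fields, expected %d\n", NR, fields, expected
        }'
}
```

For this file that would be `check_conf 4 < /etc/syslog-ng/alert-contacts.conf`; no output means every entry has four fields. The same helper could be pointed at alert.conf with `8`, matching the eight elements in its syntax line.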


### alert.conf
## alert.conf

copy into /etc/syslog-ng/
This file defines what events to alert on by dynamically generating the tcl switch conditions
This file defines which events to alert on by dynamically generating the tcl switch conditions
The first match wins, so order matters

syntax tcl list, some elements are lists themselves
{{pattern1} {pattern2}} {{exclude1} {exclude2}} {hash} {delay} {email} {page} {ignore} {custom tcl code}

PATTERN nocase glob matches against $log(all)
all=is the complete reassembled log message
Must escape with double quotes to match tcl switch; unless it's the default (you define that here too)
Must escape with double quotes to match tcl switch; unless it's the default (you must define that here too)
These are a list of lists to allow the same throttling and action to occur with minimizing config lines
and minimizing duplicate switch bodies in memory.

```
{{""}}
{{"*mdadm*"}}
{{"*alert*"} {"*crit*"}}
{{default}}
```


EXCLUDE patten sub negates what matched at PATTERN
Also nocase glob matches but this filters to which section of the log message
EXCLUDE pattern sub negates what matched at PATTERN
Also nocase glob matches but this filters to a specific section of the log message
Multiple conditions are treated as OR
if you need AND then use all= and string together the template order with glob wildcards

```
{}
{{}}
{{host="foo*"} {level="debug"}}
{{all="*crit*cron**some event*"}}
{{msg="*some other event all daemons*"}}

```
HASH controls how we throttle an alert
This is also the default subject for email and pages
Usually I include the host which allows similar events from other nodes to still alert
This can be anything, as it does not patterned matching the real message,
This can be anything, as it does not pattern match against the real message,
although they are linked so this needs to be unique for the log event and if too generic
or matched too soon, it could suppress other events

```
{"$log(host) label"}
{"common message from all sources"}

```
DELAY sets how long to wait before the same alert can be sent again
Defined as an integer in seconds

```
{600}
{3600}
{86400}

```
EMAIL these groups
multiple can be listed separated with a space
can be omitted, then no action is taken
(ie maybe use with IGNORE or CUSTOM)

(ie maybe use with IGNORE or CUSTOM if no EMAIL action is desired)
```
{}
{admin disk}
{oncall}

```
PAGE these groups
same as EMAIL

IGNORE is a boolean true false
This does not process the pattern for alerting
Consider filter noise in syslog-ng but if you cannot

Consider filtering heavy noise within syslog-ng itself
```
{}
{0}
{1}

```
CUSTOM eval as tcl code
This extends further flexibility of the program by tclsh injection
This extends further flexibility of the program by tclsh code injection
customize the action behavior, positioned before email/page actions

A good reminder to limit modification to the config files and tcl script
this will run as the UID of syslog-ng

this will exec as the UID of syslog-ng
```
{}
{set subject "$log(host) event"}
{exec /usr/local/bin/something-cool.sh}
```

### tcl 8.6 (tested with 8.6.10 and 8.6.8)
## tcl 8.6 (tested with 8.6.11)

Originally there was dependency on tcllib (1.20) to provide csv (0.8.1)
While I included logic to import csv without this, I decided to rewrite the config
using lists instead as I found the statements easier to read.
Simpler. Possibly more portable as some distros ship with dated packages

### sqlite3 (tested with 3.33.0 and 3.32.0))
## sqlite3 (tested with 3.35.5)

compiled with --enable-tcl

@@ -167,49 +173,56 @@ as the database calls are not complex



### sendmail
## sendmail

expects to find a "sendmail" compatible binary in the system path
I am using sSMTP (2.64)


### syslog-ng (tested with 3.28.1 and 3.13.2))
## syslog-ng (tested with 3.32.2)

This program depends on adjusting your /etc/syslog-ng/syslog-ng.conf
Below describes how to integrate this into your config as an external program call



# Template
### Template
In order to parse log events into variables, a predictable and parsable structure
needs to be established. We escape each section with {} to create a tcl list.
needs to be established. We escape each message section with {} to create a tcl list.

```
template t_alert {
template( "{${ISODATE}} {${HOST}} {${FACILITY}} {${LEVEL}} {${MSGHDR}} {${MSG}}\n" );
template_escape(no);
};#
```


# Destination
### Destination
define where you placed this script so syslog-ng can pipe events to it
Note that this is where you link the template formatting
Note that this is where you bind the template formatting

```
destination d_alert { program("/usr/local/sbin/syslog-alert.tcl" template(t_alert) mark-freq(0) ); };
```


# Filters
### Filters
So. This script was written with the mindset of pre-filtering events within syslog-ng
This results in some duplication of config mgmt; in syslog-ng.conf and within this script
The expectation is, if you pipe messages to this alert script then you intend to send emails.

Think of syslog-ng as the coarse comb and this script as the fine side of the comb.
While it may be possible to handle forward all events, I haven't extensively load tested.
This results in some duplication of config mgmt; in syslog-ng.conf and within this script
While it appears capable of handling all events being forwarded, I haven't extensively load tested this
inverse behavior.

filter f_level3 { level (err..emerg); };


# Log
### Log
This links your filters and destination together

```
log { source(s_net); source(s_local); filter(f_level3); destination(d_alert); };
```
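
After editing syslog-ng.conf it may be worth checking the syntax before reloading, since a reload restarts the child program and clears the in-memory throttle state. A minimal sketch, assuming `syslog-ng` and `syslog-ng-ctl` are in the path; the function name and the `SYSLOG_NG` / `SYSLOG_NG_CTL` overrides are hypothetical and exist only so the commands can be substituted.

```shell
# Hypothetical helper: validate the config, then ask the running daemon to reload.
check_and_reload() {
    conf=${1:-/etc/syslog-ng/syslog-ng.conf}
    # --syntax-only parses the config and exits without starting the daemon
    "${SYSLOG_NG:-syslog-ng}" --syntax-only --cfgfile "$conf" || {
        echo "syntax check failed for $conf" >&2
        return 1
    }
    "${SYSLOG_NG_CTL:-syslog-ng-ctl}" reload
}
```

Run it as the user that manages syslog-ng, for example `check_and_reload /etc/syslog-ng/syslog-ng.conf`.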


8 changes: 4 additions & 4 deletions syslog-alert.tcl
@@ -45,7 +45,7 @@ oo::class create Contacts {

foreach l $lines {

if { [string index $l 0] == "#" || [string index $l 0] == " " || [string length $l] == 0 } {
if { [string index $l 0] == "#" || [string length $l] == 0 } {
continue
}

@@ -84,7 +84,7 @@ oo::class create Contacts {

set to [my contacts_group $g "page"]

#silently fail as we do not want to exit. check configs for valid entire
#silently fail as we do not want to exit. check configs for valid entry
if { [string length $to] > 0 } {
my sendmail "$to" $s $b
}
@@ -97,7 +97,7 @@ oo::class create Contacts {

set to [my Group $g "email"]

#silently fail as we do not want to exit. check configs for valid entire
#silently fail as we do not want to exit. check configs for valid entry
if { [string length $to] > 0 } {
my Sendmail $to "Subject: $s" $b
}
@@ -203,7 +203,7 @@ oo::class create Alert {

foreach l $lines {

if { [string index $l 0] == "#" || [string index $l 0] == " " || [string length $l] == 0 } {
if { [string index $l 0] == "#" || [string length $l] == 0 } {
continue
}
