Remote Logging with SSH and Syslog-NG

Hal Pomeranz, Deer Run Associates

 

One of the points I make repeatedly in my training classes is the value of centralized logging. Keeping an off-line copy of your site's logs on some central, secure log server not only gives you greater visibility from a systems management perspective, but can prove invaluable after a security incident when the local copies of the log files on the target system(s) have been compromised by the attacker.

 

The difficulty is that the standard Unix Syslog daemon uses unauthenticated UDP messages to transmit log messages to remote servers. This makes drilling holes in your firewalls to accept Syslog messages from remote locations very undesirable, to say nothing of the security implications of having critical system log messages traveling in clear text over public networks. Use of IPSEC or some other strong VPN product can certainly help mitigate these concerns, but if all you care about is obtaining logging information from some remote site then firing up a full-bore VPN session may seem like overkill.

 

There is a second problem as well: UDP is not a guaranteed delivery protocol, so important log messages can be dropped entirely. While lack of guaranteed delivery can be a factor for Syslog messages in LAN environments, the risk becomes much greater when trying to drive remote log messages across highly congested public networks. Simply using a VPN to protect the security of the remote log stream does nothing to address the guaranteed delivery concern. This is where Syslog-NG becomes attractive, because two Syslog-NG servers can share remote logging information using TCP rather than UDP. And once you're logging via TCP, it is also possible to tunnel this TCP communication via SSH rather than firing up a full VPN-- the "best of both worlds" if you're looking for a quick and dirty solution.

 

The rest of this article covers the basic configuration for establishing an SSH tunnel between two servers and configuring Syslog-NG at both ends to send log messages down this tunnel. Syslog-NG can both accept UDP-based log messages from standard Unix Syslog daemons and forward those messages to another machine, so it is possible to set up a single Syslog-NG server at a remote site which acts as a collector and relay for the log messages generated by all machines at that location. That configuration is largely outside the scope of this article, though I'll give you some pointers in that direction as we go along.

Start with SSH

The first step is to get the SSH tunnel set up between the two machines. My personal preference is to originate the SSH tunnel on my central "loghost" machine at the primary site, and have it connect to the machine at the "remote" site that I want to get logs from. Typically this involves drilling out through the firewall at the primary site-- often the site's default firewall rules will allow this connection without any reconfiguration-- and allowing the connection "inward" through the firewall at the remote end, which usually requires some firewall ruleset tweaks on the remote site's firewall.

 

However, since we want the remote log server to be sending logs back to the central loghost at the primary site, we need to use a reverse tunnel (that's the "-R" option on the SSH command line) to get things working properly. This is actually one of very few places where I find reverse tunnels to be useful. Figure A, below, shows a high-level picture of how the traffic is flowing in this design.

 

 

We need to make sure that the SSH session and tunnel are set up automatically when the central log host boots. If the SSH session dies for some reason (intermittent network outage, system administration "accident", etc) we'd also like the connection to be re-established as quickly as possible. In situations like this, I like to have the init process fire off the SSH connection with a line like this in /etc/inittab:

 

        log1:3:respawn:/usr/bin/ssh -nNTx
            -R 514:loghost.domain.com:514
            remote.domain.com >/dev/null 2>&1

 

The example above must appear as a single long line in /etc/inittab-- I've just broken it onto multiple lines for clarity.

 

Let's examine the SSH command line first. The "-R 514:loghost.domain.com:514" on the second line of the example sets up the reverse tunnel from 514/tcp on the remote server to "loghost.domain.com:514"-- in other words, port 514/tcp on the central loghost machine. While it seems natural to use 514/tcp for Syslog-NG logging, you have to remember that 514/tcp is the reserved port for the Unix rsh service (rlogin uses 513/tcp), so you're going to run into a port conflict if you still have rsh enabled. Also note that because 514 is a privileged port, SSH will only set up this forwarding if you log in as root on the remote machine. I generally turn off unencrypted network protocols like telnet, FTP, rlogin/rsh/rcp on my servers and use SSH instead, so the port conflict is not an issue for me, but you can run this tunnel over any free ports if there is a conflict at your site.

 

As for the other SSH command line options above, the "-n" flag tells SSH to associate the standard input with /dev/null. There won't be any command line input since we're essentially going to be running the SSH client as a "daemon" via init. As you can see at the end of the command line in the example, we're also sending the standard output and standard error to /dev/null as well ("... >/dev/null 2>&1"). Since we're never going to be issuing remote commands via this SSH connection (we only care about the tunnel), the "-N" option to SSH tells the SSH client to only set up the tunnel and to not bother preparing a command stream for issuing commands on the remote system, while "-T" says to not bother allocating a pseudo-tty on the remote system. The "-x" option disables X11 forwarding, just as a defense-in-depth gesture.

 

Turning our attention to the rest of the /etc/inittab entry, the first field ("log1") is just an identifier for this entry in the inittab file. These identifiers can be any sequence of 2-4 alphanumeric characters; the only requirement is that each one be different from all other identifiers used in the file. I've chosen "log1" here because it's usually the case that I have multiple SSH tunnels set up to different remote log sources, and I typically name the inittab entries "log1", "log2", etc. The second field in the inittab file ("3") is the run level where this entry should be fired. Make sure to start this SSH process after the network interfaces have been initialized but before the Syslog-NG daemon is started.

 

The "respawn" option in the third field is the reason I like to use init for spawning processes like this. When the "respawn" option is enabled, the init process will automatically fire off a new SSH process if the old one dies for any reason. In other words, init acts like a "watchdog" type daemon and makes sure that the SSH tunnel is always up and running. This is an extremely useful technique, but one that a lot of system admins seem to have forgotten.
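
If you manage several remote sites, the inittab entries differ only in their identifier and the remote host name, so they are easy to generate. The following shell sketch prints one entry per remote host, numbered "log1", "log2", and so on. The function name and the host names in the usage line are placeholders chosen for illustration, and the sketch assumes the same ssh path, run level, and port numbers as the example entry. It also assumes that root on the loghost can authenticate to each remote machine without typing a password (for example with an SSH key), since nobody is around to answer prompts when init starts the process.

```shell
#!/bin/sh
# Print one /etc/inittab entry per remote log source.
# Assumes the ssh path, run level, and ports from the example entry.
# Identifiers are limited to four characters, so "logN" only works
# for up to nine tunnels; pick a shorter prefix beyond that.
make_inittab_entries() {
    n=1
    for remote in "$@"; do
        printf 'log%d:3:respawn:/usr/bin/ssh -nNTx -R 514:loghost.domain.com:514 %s >/dev/null 2>&1\n' \
            "$n" "$remote"
        n=$((n + 1))
    done
}

# Hypothetical remote hosts; review the output before adding it to /etc/inittab.
make_inittab_entries remote1.domain.com remote2.domain.com
```

Every entry can use 514 on both sides because each remote server gets its own listening endpoint, and all of them feed the same 514/tcp listener on the central loghost.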

 

Once you've got your inittab entry all set up, HUP the init process ("kill -HUP 1"). This should cause the init process to re-read the inittab file and spawn the SSH connection. You should be able to verify that the SSH client is running with the ps command and verify the existence of the tunnel using netstat. Once you've got all that working, it's time to turn our attention to configuring Syslog-NG.
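
As a rough illustration of these checks: with a reverse tunnel the SSH client process lives on the central loghost, but the listening socket for the tunnel is created by the SSH daemon on the remote server, normally bound to the loopback address. The sketch below is a small helper that reads "netstat -an" output on standard input and reports whether anything is listening on a given TCP port. The function name is made up, and the column layout assumed here is the Linux netstat format; other platforms print addresses differently.

```shell
#!/bin/sh
# Succeed if netstat output on stdin shows a TCP listener on port $1.
# Assumes Linux "netstat -an" columns:
#   proto, recv-q, send-q, local address, foreign address, state
has_tcp_listener() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# On the central loghost: is the ssh client for the tunnel running?
#   ps -ef | grep '[s]sh -nNTx'
# On the remote server: is the tunnel endpoint listening?
#   netstat -an | has_tcp_listener 514 && echo "tunnel endpoint is up"
```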

Configuring Syslog-NG

In general, configuration of Syslog-NG is well covered by Balazs Scheidler's reference manual[1] and Nate Campi's excellent FAQ[2]. So allow me to just present complete configuration examples for the main loghost and remote log server and point out the critical bits.

 

First let's take a look at the configuration for the main loghost:

 

        options { check_hostname(yes);
                  keep_hostname(yes);
                  chain_hostnames(no); };

        source inputs { internal();
                        unix-stream("/dev/log");
                        udp();
                        tcp(max_connections(100)); };

        destination logpile {
           file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
           owner(root) group(root) perm(0600)
           create_dirs(yes) dir_perm(0700)); };

        log { source(inputs); destination(logpile); };

 

As far as the options go, "check_hostname(yes)" forces Syslog-NG to do a little bit of sanity checking on the incoming remote hostname in the log message. In our destination directive we'll be creating directories for each system's logs by hostname and it wouldn't be good if an attacker could embed shell meta-characters in the hostname to cause us problems. "keep_hostname(yes)" means to use the hostname that's presented in the actual message from the remote log server rather than using the hostname we get by resolving the source of the remote Syslog connection. After all, since we expect remote messages to be coming down our SSH tunnel, the source IP address of these messages will be the loopback address (127.0.0.1), and having all messages tagged with "localhost" is not what we want. "chain_hostnames(no)" causes Syslog-NG just to show the original hostname in the message rather than a chain of all the hops the message has been through to get to its final destination. This becomes a lot more relevant when you start relaying messages through multiple servers.

 

The inputs cover all of the various places we can get logging information from. "internal()" is internal messages from the Syslog-NG daemon itself. "unix-stream("/dev/log")" is the normal /dev/log socket that Linux systems use for local logging. Note that if you're on a non-Linux platform like Solaris, HP-UX, or one of the *BSD operating systems then your local log channel is likely to be very different (examples of appropriate configurations for various operating systems can be found in the Syslog-NG source distribution). Some sites actually run the vendor Syslog in parallel with Syslog-NG rather than having to deal with the problem of emulating the standard vendor Syslog interfaces-- the vendor Syslog daemon can just relay messages to Syslog-NG via the standard UDP Syslog channel, even within the same machine. The "udp()" line means to listen on the standard 514/udp Syslog channel and "tcp()" means to listen on 514/tcp for messages from another Syslog-NG server (or in our case, the SSH tunnel). Note that both "tcp()" and "udp()" accept the "port()" option to specify a different port. For example, if you wanted your Syslog-NG server to listen on port 5014/tcp to avoid conflicts with the rsh daemon you would write:

 

        tcp(port(5014) max_connections(100));

 

Note also the use of the "max_connections()" option to increase the number of simultaneous TCP sessions the logging daemon can handle.

 

The destination clause allows us to specify a "log sink", or place where we want our logs to end up. Here we're using some built-in Syslog-NG macros to force incoming log messages to be divided out into directories: first by hostname, and then by year and month. Within each directory, messages will go into log files named for the Syslog facility the message was logged to (mail, auth, kern, local0, etc), with each file having a date stamp attached. Notice that with Syslog-NG automatically creating a new file for each day of logs, we don't even need a separate log rotation program! This is just one more useful feature of Syslog-NG. The other options to the "file()" directive make sure that directories will be created as needed and set sensible ownerships and permissions on the newly created files and directories.
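
To make the macro expansion concrete, here is a small shell sketch that mimics how the file name template above is filled in. It is only an imitation of what Syslog-NG does internally; the function name and the sample host, facility, and date are invented for illustration.

```shell
#!/bin/sh
# Imitate the expansion of
#   /logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY
# for one message. Arguments: host facility year month day.
expand_log_path() {
    printf '/logs/%s/%s/%s/%s.%s%s%s\n' "$1" "$3" "$4" "$2" "$3" "$4" "$5"
}

# Hypothetical message from host "mailhost", facility "mail", on 15 March 2004.
expand_log_path mailhost mail 2004 03 15
# prints /logs/mailhost/2004/03/mail.20040315
```

A message logged the next day would land in mail.20040316 in the same directory, which is why no separate rotation step is needed.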

 

Having defined our inputs and destination directives, we combine them into log declarations to actually tell the Syslog-NG daemon what to do with the incoming messages. Here we're just doing the trivial rule that sends all of our incoming messages from all sources into the log file directory hierarchy we defined in the destination directive above.

 

With the basic configuration of the central loghost out of the way, let's take a look at a sample configuration for the remote log server on the other end of the SSH tunnel. It's actually not too much different from the configuration for the central loghost:

 

        options { check_hostname(yes);
                  keep_hostname(yes);
                  chain_hostnames(no); };

        source inputs { internal();
                        unix-stream("/dev/log");
                        udp();
                        tcp(max_connections(100)); };

        destination logpile {
           file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
           owner(root) group(root) perm(0600)
           create_dirs(yes) dir_perm(0700)); };

        destination remote { tcp("localhost"); };

        log { source(inputs); destination(logpile); };
        log { source(inputs); destination(remote); };

 

Basically, all we've done here is added an additional destination directive and an additional log directive. The "remote" destination says to log via TCP to "localhost" using the default port 514/tcp (since we didn't specify an alternate port). "localhost:514" should be the location of our reverse tunnel endpoint. Note that if you used an alternate port for the tunnel endpoint, you can specify it:

 

        destination remote { tcp("localhost" port(5014)); };

 

Our first log declaration keeps a local copy of all log messages received in a directory structure on the remote log server that parallels the one on the central loghost. The second log directive also relays a copy of all messages back to the central log server via the SSH tunnel. It's up to you whether you keep a local copy of the logs on the remote log server, but most likely the admins at the remote site will appreciate having this copy of the logs.

 

Note that in the inputs section above, we've configured the standard "udp()" input for normal UDP Syslog messages. This means that other hosts at the remote site can send Syslog messages to the remote log server and those messages will be relayed by the Syslog-NG server back through the SSH tunnel to the central log host at home base. We've also configured the remote log server to listen for messages on the "tcp()" input channel. Maybe there are other Syslog-NG servers at the remote location, or perhaps there is an SSH tunnel from the remote log server to some other remote site and we're chaining log messages through multiple hops!
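
Once both daemons are running, a quick end-to-end check is to log a uniquely tagged message on the remote log server and look for it in the loghost's directory tree. The helper functions below are a sketch of that idea; their names, the "local0.notice" priority, and the example file path are assumptions for illustration, and "logger" is assumed to write to the local /dev/log channel that Syslog-NG is reading.

```shell
#!/bin/sh
# Build a marker string that is unlikely to appear in the logs already.
make_marker() {
    printf 'tunnel-test-%s-%s\n' "$(date +%Y%m%d%H%M%S)" "$$"
}

# Succeed if the marker (treated as a fixed string) appears in the log file.
marker_arrived() {
    grep -F -q -e "$1" "$2"
}

# On the remote log server (note the marker that is printed):
#   marker=$(make_marker); logger -p local0.notice "$marker"; echo "$marker"
# On the central loghost, using that marker and a hypothetical path:
#   marker_arrived "$marker" /logs/remote/2004/03/local0.20040315 && echo "arrived"
```

If the marker shows up in the local copy on the remote log server but not on the central loghost, the tunnel or the "remote" destination is the place to look.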

Conclusion

I think you'll find this a very easy little recipe to implement, and yet it achieves a very powerful goal. Of course, once you have this big pile of logs you're going to want some sort of tool that actually reads the logs for you and sends you the "interesting" events. You could use a simple tool like Logcheck[3] or Swatch[4], or investigate some of the newer, fancier tools out there like Logsurfer+[5], SEC[6], or Lire[7]. Whatever solution you end up with, let me assure you that I never regret the effort I expend to set up centralized logging and log monitoring, because the visibility I get as far as what's happening on my networks is enormously useful.

References

[1] Syslog-NG Reference Manual, http://www.balabit.com/products/syslog_ng/reference/book1.html

 

[2] Syslog-NG FAQ, http://www.campin.net/syslog-ng/faq.html

 

[3] Logcheck, http://sourceforge.net/projects/sentrytools/

 

[4] Swatch, http://swatch.sourceforge.net/

 

[5] Logsurfer+, http://www.crypt.gen.nz/logsurfer/

 

[6] SEC, http://kodu.neti.ee/~risto/sec/

 

[7] Lire, http://logreport.org/lire/

About the Author

Hal Pomeranz (hal@deer-run.com) has been doing IT for more than 15 years. His favorite activity is being up at midnight on New Year's Eve so he can hear the disk drives on his log servers spin as the logging directory hierarchy for the new year is created.