Remote Logging with Python

If you find yourself developing a distributed system, it’s very useful to have a logging subsystem which allows you to see output from multiple nodes in one place.

Python’s Logging module is quite comprehensive, it gives a lot of ways to format messages and a good level of control over what happens. For any given Logger object, there can be multiple Handlers which control where log messages go.

For remote logging Python provides two handlers StreamHandler and DatagramHandler, for TCP and UPD connections respectively. This example focuses on using the DatagramHandler for a UDP connection.

Step 1: Adding the Handler

This part is easy, if you have a Logger object already instantiated then get the DatagramHandler class from logging.handlers and create an instance with the parameters for the IP address and the port of the remote listener.

from logging.handlers import DatagramHandler

ip_address = "127.0.0.1"  # Localhost, for demonstration
port = 5000  # Any port that is not in use

udp_handler = DatagramHandler(ip_adress, port)  # Create the handler
log.addHandler(udp_handler)  # Add the handler to the logger

So that bit’s easy enough! Note that no formatting information is needed at this point, that is handled by the receiver.

Step 2: Receiving the message

This is the part that gets difficult. The receiver will get the message, but rather than being a nice formatted string, it will receive a serialised (In Python terminology: pickled) object of type dict (A Python dictionary). This contains all message parameters we need to format the output.

First: we’re going to make use of four modules: socket, struct, cPickle, and logging.

Second: a socket is needed to receive information sent to it. We create the socket for use with IPv4 addresses and with intent to use UDP as shown. When the socket is bound using the bind() method, we can start accepting connections.

import socket
import struct
import cPickle
import logging

ip_address = "127.0.0.1"  # Localhost, for demonstration
port = 5000  # Any port that is not in use

# socket.AF_INET: Means this socket will operate using IPv4
# socket.SOCK_DGRAM: This socket will receive UDP Datagrams.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind((ip_address, port))

Third: We need a Logger object to work with the LogRecord object we will decode. At this point any formatting rules can be used. In this example, a simple StreamHandler is used to print messages to StdOut.

log = logging.getLogger()  # New Logger instance
handle = logging.StreamHandler()  # Prints to StdOut
handle.setLevel(logging.DEBUG)  # Set effective level
log.addHandler(handle)  # Add to logger

As a tip, if you find messages that you expect to be displayed are being lost, ensure that the proper effective level has been set on both the Logger objects and the handlers.

At this point, the assets we need are in place and we can enter a loop to receive data.

while True:
    data = sock.recv(2048)  # Recieve 2048 bytes
    data_length = struct.unpack(">L", data[:4])[0]  # Get packet size
    try:
        unpickled_dict = cPickle.loads(data[4:data_length + 4])  # Unpickle
    except EOFError:
        print("Error while receiving data")  # Thrown in case of unpickle error
        continue
    record = logging.makeLogRecord(unpickled_dict)  # Dict -> Record
    log.handle(record)  # Print record out

The socket waits on the recv() method until data is sent to it, after which the socket tries to read 2KB of data from the stream. The stream of data contains two things: the length of the data packet (first 4 bytes) and the pickled information. Using the struct module, the first four bytes are translated into an integer, then the cPickle module is used to unpickle that data, from this we get the dictionary. The dictionary is simply turned into a LogRecord object by the logging module, then the Logger object can handle it like any other message.

As a note: if you expect datagrams to ever be longer than 2kb, you will need to use a loop to ensure you receive all data else you will fail to unpickle the data.

The recv() method takes data from the socket, but if the address of the other client is needed, use recvfrom() instead, which returns a tuple containing the data and the address of the client. If more information is needed, consider the following…

Step 3: Sending Additional Data (Optional)

Sometimes, certainly with distributed systems, it’s important to be able to show identifying information in the log. This could be placed in every single log message individually but that’s tedious and may result in a lot of unnecessary code. My solution was to add a dictionary of static information to the handler which is put in ever record it handles.

from logging.handlers import DatagramHandler

class CustomUDPHandler(DatagramHandler):

    def __init__(self, host, port, record_additions={}):
        super(CustomUDPHandler, self).__init__(host, port)
        self.record_additions = record_additions

    def handle(self, record):
        for key, value in self.record_additions.iteritems():
            if key in record.__dict__:
                raise AttributeError(
                    "Attempt to overwrite attribute %s in record." % (key,))
            record.__setattr__(key, value)
        super(CustomUDPHandler, self).handle(record)

Attributes of the LogRecord object are kept in its namespace, overriding the handle method in this way allows us to put all our attributes in the record before it’s pickled and sent. An AttributeError will be raised if you inadvertently attempt to overwrite any of these existing attributes.

On the other end, a formatter can be used to display the information. Say we create an instance of our CustomUDPHandler with the parameters: (‘127.0.0.1’, 3000, {‘ident’: “Client #5”}) then the attribute ident will be a part of the record sent to the other side. Access it as such:

style = "%(ident)> %(message)"  # Ident formatted into string
formatter = logging.Formatter(fmt=style)

stream = logging.StreamHandler()
stream.setFormatter(formatter)
log.addHandler(stream)

Allowing us to have a nice formatted message output from the receiving end that looks like this:

 Client #5> Everything is awesome!

Advertisements