Sorting Apache log files

Last week I had to sort several months' worth of Apache log files to feed them to Webalizer. The logs came from several servers, so it was fairly difficult to get the data together and in the correct format. I ended up writing a small Python sorting script specifically for this. Maybe it'll help someone else, so I'm sharing it here. Just pipe the logs through the script and redirect the output to a file.

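import sys
from functools import cmp_to_key

data = sys.stdin.readlines()

def compare_apache_dates(date1, date2):
    # The fourth field of a log line is the timestamp, e.g. "[10/Oct/2000:13:55:36"
    str1 = date1.split()[3]
    str2 = date2.split()[3]
    months = ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]

    # Compare year, month, day, hour, minute and second, in that order
    if str1[8:12] > str2[8:12]:
        return 1
    elif str1[8:12] < str2[8:12]:
        return -1
    elif months.index(str1[4:7]) > months.index(str2[4:7]):
        return 1
    elif months.index(str1[4:7]) < months.index(str2[4:7]):
        return -1
    elif str1[1:3] > str2[1:3]:
        return 1
    elif str1[1:3] < str2[1:3]:
        return -1
    elif str1[13:15] > str2[13:15]:
        return 1
    elif str1[13:15] < str2[13:15]:
        return -1
    elif str1[16:18] > str2[16:18]:
        return 1
    elif str1[16:18] < str2[16:18]:
        return -1
    elif str1[19:21] > str2[19:21]:
        return 1
    elif str1[19:21] < str2[19:21]:
        return -1
    return 0

# The listing breaks off after the comparator; sorting and writing to
# stdout is assumed here, matching the usage described above.
data.sort(key=cmp_to_key(compare_apache_dates))
sys.stdout.writelines(data)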
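
For comparison, here is a minimal sketch of the same ordering done with a sort key instead of a comparison function. It is not the script I used; the names `apache_timestamp` and `sort_log_lines` are made up for illustration. It assumes every line has the timestamp as its fourth whitespace-separated field, that `%b` matches English month abbreviations (true in the default C locale), and, like the script above, it ignores the UTC offset that follows the timestamp.

```python
from datetime import datetime

def apache_timestamp(line):
    """Parse the timestamp of one log line into a datetime (UTC offset ignored)."""
    # The fourth field looks like "[10/Oct/2000:13:55:36"; drop the leading "[".
    stamp = line.split()[3].lstrip("[")
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")

def sort_log_lines(lines):
    """Return the lines ordered by timestamp.

    sorted() is stable, so lines with identical timestamps keep
    their original relative order.
    """
    return sorted(lines, key=apache_timestamp)
```

To use it as a filter, read the lines with `sys.stdin.readlines()` and pass the result of `sort_log_lines` to `sys.stdout.writelines`. Parsing each line once as a key is usually cheaper than comparing string slices on every comparison, but a malformed line will raise a `ValueError` here rather than being sorted somewhere arbitrary.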

