httperf and visitor click replay

We're using httperf to load-test the webserver. To make the test more "lifelike", we had a user browse the site and read the parts he was interested in. From that visit, we needed to create a list of URLs and waiting times that correctly reflects the visitor's actual behaviour.

For this, I created a small Python script. You feed it a log that contains lines like:

[17/Jul/2007:17:37:34 /mytest/page.html

As you can see, I simply ran the Apache log of the visit through awk '{print $4, $7}' and redirected the output to a file. I then read that file with the following Python script to create a file in the format httperf expects. Keep in mind that:

  • Each request made within one second of the previous request is treated as part of the same group (for example, an HTML page and the JPEGs it pulls in are fetched together).
  • Any longer gap is counted as "think time" of the visitor.
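The two rules above can be sketched as a pair of small functions. This is only an illustration, not the script itself: the names group_requests, render and max_gap are made up for the example, and it assumes the timestamps have already been parsed into datetime objects.

```python
from datetime import datetime, timedelta

def group_requests(requests, max_gap=timedelta(seconds=1)):
    """Split (timestamp, url) pairs into groups of requests.

    Returns a list of (urls, think_seconds) tuples. think_seconds is the gap
    before the next group starts, or None for the last group.
    """
    groups = []
    current = []
    previous = None
    for when, url in requests:
        # A gap longer than max_gap closes the current group.
        if previous is not None and when - previous > max_gap:
            groups.append((current, int((when - previous).total_seconds())))
            current = []
        current.append(url)
        previous = when
    if current:
        groups.append((current, None))
    return groups

def render(groups):
    """Format groups as text: first URL flush left, the rest tab-indented."""
    lines = []
    for urls, think in groups:
        first = urls[0] if think is None else "%s think=%d" % (urls[0], think)
        lines.append(first)
        lines.extend("\t" + url for url in urls[1:])
    return "".join(line + "\n" for line in lines)

if __name__ == "__main__":
    start = datetime(2007, 7, 17, 17, 37, 34)
    visit = [
        (start, "/mytest/page.html"),
        (start + timedelta(seconds=1), "/mytest/photo.jpg"),
        (start + timedelta(seconds=10), "/mytest/other.html"),
    ]
    print(render(group_requests(visit)))
```

Keeping the grouping separate from the formatting makes it easy to try a different threshold than one second without touching the output code.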

The program is kinda sloppy, but it works, which is all I really care about :)

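import sys
from datetime import datetime

# The date is hard-coded; only the time of day is read from each line.
lasttime = datetime(2007, 7, 17, 0, 0, 0)
burst = []  # URLs of the group currently being collected
logfile = open(sys.argv[1])
newfile = open("new-test", "w")

for line in logfile:
    newtime = datetime(2007, 7, 17, int(line[13:15]), int(line[16:18]), int(line[19:21]))
    delta = newtime - lasttime
    if delta.seconds > 1:
        # Gap of more than a second: write out the previous group with its think time.
        if len(burst) > 0:
            burst[0] = burst[0] + " think=" + str(delta.seconds) + "\n"
            for writeline in burst:
                newfile.write(writeline)
        # URL starts at column 22; -4 trims the last four characters, so adjust it to your line endings.
        burst = [line[22:-4]]
    else:
        burst.append("\t" + line[22:-4] + "\n")
    lasttime = newtime

# Write the final group, which has no think time (skipped if the log was empty).
if burst:
    burst[0] = burst[0] + "\n"
    for writeline in burst:
        newfile.write(writeline)
logfile.close()
newfile.close()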
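The script relies on fixed column positions and a hard-coded date. If that bothers you, a line could probably be parsed with strptime instead. The sketch below is an assumption on my part rather than what the script does, and parse_line is a made-up name:

```python
from datetime import datetime

def parse_line(line):
    """Parse '[17/Jul/2007:17:37:34 /mytest/page.html' into (datetime, url)."""
    # Split on the first run of whitespace: timestamp first, URL second.
    stamp, url = line.split(None, 1)
    # %b matches English month abbreviations under the default C locale.
    when = datetime.strptime(stamp, "[%d/%b/%Y:%H:%M:%S")
    return when, url.strip()

if __name__ == "__main__":
    print(parse_line("[17/Jul/2007:17:37:34 /mytest/page.html\n"))
```

This way the date comes from the log itself, so a visit that spans midnight or a different day needs no changes.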


