Kreuvfs Allerweltsblog

2008-10-06

wz_logs_get_referers.sh

Filed by Kreuvf at 18:48:15


#!/bin/sh

# Author: Steven "Kreuvf" Koenig
# Date : 2008-10-05
# Last update: 2008-10-06
# Purpose: Kicking out all the cruft of the log files provided by all-inkl.com
#
# What it does:
# Blank out every entry that we know is uninteresting so that only the interesting entries remain.
# The emptied lines are then removed with ed, as that seems easier to me.

whole_cmd=$0
file=$1

# Get every referer entry
# We have three strings enclosed in quotes and we just want the second one
sed -i -r 's/.*".*".*"(.*)".*".*"/\1/' "$file"

# Blank out every empty referer ("-") and every referer from Google, warzone2100.de
# (we are not interested in how people move around within the site), Yahoo search and Wikipedia.
# Note: these commands only empty the matching lines; the empty lines are removed further down.
sed -i -r 's/^-$//' "$file"
sed -i -r 's|.*google\.[^./]*/.*||' "$file"
sed -i -r 's|.*warzone2100\.de/.*||' "$file"
sed -i -r 's|.*search\.yahoo\.[^./]*/.*||' "$file"
sed -i -r 's|.*wikipedia\.org/.*||' "$file"

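# I am too lazy to understand how to remove empty lines with sed, so ed does that.
# The ed script lives next to this script: cut the script name off $0 to get the directory.
set -- "$(echo "$whole_cmd" | sed -r 's|(.*/).*|\1|')"

# $1 is overwritten by the set command, which is why I "save" the original $1 as $file above
ed "$file" < "$1/wz_log_edscript"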
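
The same cleanup can also be done with sed alone. The following is only a minimal sketch, not the script as posted. It assumes log lines that contain exactly three quoted strings, as in the Apache "combined" layout, and a sed that understands -E (GNU sed treats it the same as -r). The function names extract_referers, filter_referers and script_dir are made up for illustration. Instead of blanking lines and removing them later, the d command deletes matching lines directly, and dirname replaces the set/sed detour for finding the script's directory.

```shell
#!/bin/sh
# Sketch under assumptions: the function names are hypothetical helpers and
# are not part of the original script.

# Print the second quoted string (the referer) of each log line read from stdin.
extract_referers() {
    sed -E 's/.*".*".*"(.*)".*".*"/\1/'
}

# Drop empty referers ("-" or blank lines) and the known uninteresting sources.
# The d command deletes the whole line, so no blank lines are left behind.
filter_referers() {
    sed -E \
        -e '/^-?$/d' \
        -e '/google\.[^./]*\//d' \
        -e '/warzone2100\.de\//d' \
        -e '/search\.yahoo\.[^./]*\//d' \
        -e '/wikipedia\.org\//d'
}

# Print the directory part of a script path such as $0.
# dirname prints "." when the path contains no slash.
script_dir() {
    dirname "$1"
}
```

With a hypothetical log file named access.log, the two filters could be chained as `extract_referers < access.log | filter_referers`, which writes the remaining referers to standard output and leaves the log file itself untouched.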