Find all unique URLs from Apache log files
Posted on Tue 05 February 2013 in misc
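I needed to build a list of all the unique URLs that had been requested from a website served by Apache.
Here's what I came up with using awk and sed. It keeps any request that returned an HTTP 2xx or 3xx status and strips off any query-string parameters. In the default Apache log format, field 9 is the status code and field 7 is the request path.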
awk '$9 ~ /^(2|3)/ {print $7}' somelogs* | sed 's/\?.*$//' | sort | uniq
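
As a variation, the same idea can be sketched as a small shell function that strips the query string inside awk and lets `sort -u` handle the de-duplication. The function name `unique_urls` is made up for illustration, and the sketch assumes the default common or combined log format; a custom `LogFormat` would need different field numbers.

```shell
# unique_urls (hypothetical helper): print each distinct request path that
# returned a 2xx or 3xx status, with any query string removed.
# Assumes the default common/combined log format:
#   field 7 = request path, field 9 = HTTP status code.
unique_urls() {
    awk '$9 ~ /^[23][0-9][0-9]$/ {
        sub(/\?.*/, "", $7)   # drop everything from the first "?" onward
        print $7
    }' "$@" | sort -u         # sort and remove duplicate paths in one step
}
```

It can be called the same way as the one-liner, for example `unique_urls somelogs*`.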