Blog internal - How IPs are masked

14 Oct 2022

An IP address can, to some extent, be used to identify an individual. This server logs every access. This post describes how the logged IP addresses are masked far enough that they can no longer be used for identification.

The solution is to log from httpd to syslogd, pipe each entry to a program that masks the last octet, and pass the result back to syslog. This can be accomplished with syslogd and Perl alone:

!blog
*.info /var/log/httpd_blog
!*
!httpd
*.* |/usr/bin/env perl -MSys::Syslog=:standard,:macros -ne 'openlog("blog","",LOG_LOCAL0);s/^.*[[:space:]](([[:digit:]]{1,3}\.){3})[[:digit:]]{1,3}(.*$)/$1XXX$3/;syslog("info","%s",$_);'
!*

Deconstructing the one-liner:

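use Sys::Syslog qw(:standard :macros);
while (<>) {
  # Log under the program name "blog" so the !blog block picks the line up.
  openlog("blog", "", LOG_LOCAL0);
  # Drop the old prefix and replace the last octet with XXX.
  s/^.*[[:space:]](([[:digit:]]{1,3}\.){3})[[:digit:]]{1,3}(.*$)/$1XXX$3/;
  syslog("info", "%s", $_);
}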

The while loop iterates over every line that arrives on STDIN. The openlog and syslog functions pass the result back to syslog, where the !blog block picks it up.

The substitution does the actual masking. The first part, ^.*[[:space:]], matches everything in front of the IP address. That prefix is deleted, since the Sys::Syslog module adds similar content of its own. Next, the first three octets including their dots are stored in capture group 1: (([[:digit:]]{1,3}\.){3}). The inner parentheses form capture group 2, which is not used. The last octet, [[:digit:]]{1,3}, is deliberately left outside of any group so it gets dropped, and the rest of the line is stored in capture group 3: (.*$). Finally, the whole match is replaced with the first three octets, the mask for the last octet, and the rest of the line: $1XXX$3.

To compare before and after:

Before:
Oct 12 09:06:10 webhost blog: 127.0.0.1 - - [12/Oct/2022:09:06:10 +0200] "GET / HTTP/1.1" 200 1173
After:
Oct 12 09:06:10 webhost blog: 127.0.0.XXX - - [12/Oct/2022:09:06:10 +0200] "GET / HTTP/1.1" 200 1173
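
To try the substitution without a running syslogd, it can be wrapped in a small function and fed the Before line from above. This is only a sketch: the name mask_line is made up for this example and is not part of the setup described here. Note that the function returns the line without the syslog prefix, because in the real pipeline Sys::Syslog adds a fresh one when the line is logged again.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# mask_line is a hypothetical helper for this sketch. It applies the same
# substitution as the one-liner and returns the result instead of logging it.
sub mask_line {
    my ($line) = @_;
    # Group 1: first three octets with dots, group 3: rest of the line.
    $line =~ s/^.*[[:space:]](([[:digit:]]{1,3}\.){3})[[:digit:]]{1,3}(.*$)/$1XXX$3/;
    return $line;
}

my $before = 'Oct 12 09:06:10 webhost blog: 127.0.0.1 - - [12/Oct/2022:09:06:10 +0200] "GET / HTTP/1.1" 200 1173';

# Prints: 127.0.0.XXX - - [12/Oct/2022:09:06:10 +0200] "GET / HTTP/1.1" 200 1173
print mask_line($before), "\n";
```

A line that contains no IPv4 address passes through unchanged, since the substitution simply does not match.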