A Python script that analyzes a web server access log: given a log file, it reports the IP addresses with the most hits and the country each one comes from.

yousafkhamza/log-analyzer-pyscript

Access Log Analyzer + Location Finder


Description

This Python script finds which IP addresses hit our servers the most, based on the access log, and looks up the location of each one with the help of ipstack.


Features

  • Sorts IP addresses by hit count and shows the top 10
  • Prints the location (country) of each IP address next to its hit count

Prerequisites

  • An ipstack account and API key for the location lookup
  • The API key must be set in "apikey.py" before running the script: get the key from the ipstack site and put it in that file
  • Python 3 must be installed

How to get IPstack API

  • Go to the ipstack site and click "GET FREE API KEY" in the top right corner.

How to get the script

Steps (Amazon Linux):

sudo yum install git -y
sudo yum install python3 -y
git clone https://github.com/yousafkhamza/log-analyzer-pyscript.git
cd log-analyzer-pyscript

Running the script (demonstration)

$ python3 log.py
Enter your log file name (absolute path): ../Downloads/Python/access.log
193.106.31.130      :    313055  [Ukraine]
197.52.128.37       :     40777  [Egypt]
45.133.1.60         :      7514  [Netherlands]
173.255.176.5       :      5220  [United States]
172.93.129.211      :      4195  [United States]
178.44.47.170       :      2824  [Russia]
51.210.183.78       :      2684  [France]
84.17.45.105        :      2360  [United States]
193.9.114.182       :      2205  [Belgium]
45.15.143.155       :      1927  [United States]

Output format: IP address (most hits first) : hit count [location]
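The column layout in the demonstration comes from the format string used in log.py. The snippet below is a small sketch of that formatting on its own; the helper name format_row is hypothetical and the sample rows are copied from the output above.

```python
# Sample rows (IP, hit count, country) copied from the demonstration output.
rows = [
    ("193.106.31.130", 313055, "Ukraine"),
    ("197.52.128.37", 40777, "Egypt"),
]


def format_row(ip, hits, country):
    # {:20} pads the IP string to 20 columns (strings are left-aligned);
    # {:10} pads the integer hit count to 10 columns (numbers are right-aligned).
    return "{:20}:{:10}  [{}]".format(ip, hits, country)


for row in rows:
    print(format_row(*row))
```

Strings and integers have different default alignment in str.format, which is why the IPs line up on the left and the counts on the right.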


Modules used

  • ipstack (custom module for the ipstack lookup)
  • logparser (custom module for log parsing)
  • apikey (custom module that holds the API key)
  • requests (HTTP library used to call the ipstack API)
  • re (regular expression module)
  • os (used to check that the log file exists)

Behind the code

# cat apikey.py (passes the API key to the script)

api = '<enter your apikey from ipstack site>'            # <--- Replace with the API key you got from ipstack
# e.g.:
# api = 'a37f9a05417225606d6650e16167'
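If you would rather not keep the key in a file inside the repository, one possible alternative is to read it from an environment variable. This is only a sketch and not part of the script: the variable name IPSTACK_API_KEY and the helper load_api_key are assumptions made for illustration.

```python
import os


def load_api_key(env_var="IPSTACK_API_KEY", default=None):
    """Return the API key from the environment, falling back to `default`.

    IPSTACK_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    key = os.environ.get(env_var, default)
    if not key:
        raise RuntimeError(
            "No ipstack API key found; set {} or edit apikey.py".format(env_var)
        )
    return key
```

With this helper, apikey.py could contain `api = load_api_key()` and the rest of the script would stay unchanged.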

# cat ipstack.py (connects to the ipstack API and gets the country name)

import requests

def get_country(ip=None, key=None):
    # Returns the country name for the IP, or None if the IP or key is missing.
    if ip is not None and key is not None:
        url_ipstack = "http://api.ipstack.com/{}?access_key={}".format(ip, key)
        response = requests.get(url=url_ipstack)
        geodata = response.json()
        return geodata['country_name']
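get_country reads geodata['country_name'] directly, so a response without that field (for example when the key is rejected) raises a KeyError. Below is a rough sketch of a more forgiving lookup on the decoded JSON. The helper name extract_country is hypothetical, and the error shape it checks for (a "success" field set to false) is an assumption about the ipstack response, not something confirmed by this repository.

```python
def extract_country(geodata):
    """Return the country name from a decoded ipstack response, or 'Unknown'."""
    if not isinstance(geodata, dict):
        return "Unknown"
    # Assumed error shape: {"success": False, "error": {...}}
    if geodata.get("success") is False:
        return "Unknown"
    # country_name can be missing or null (e.g. for private addresses).
    return geodata.get("country_name") or "Unknown"


print(extract_country({"ip": "197.52.128.37", "country_name": "Egypt"}))
print(extract_country({"success": False, "error": {"info": "bad key"}}))
```

In get_country, the last line could then become `return extract_country(geodata)`, so that one failed lookup does not stop the whole report.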

# cat logparser.py (used for log parsing; it is a third-party script, many thanks to its author)

#!/usr/bin/env python3

import re

regex_host = r'(?P<host>.*?)'
regex_identity = r'(?P<identity>\S+)'
regex_user = r'(?P<user>\S+)'
regex_time = r'\[(?P<time>.*?)\]'
regex_request = r'\"(?P<request>.*?)\"'
regex_status = r'(?P<status>\d{3})'
regex_size = r'(?P<size>\S+)'
regex_referer = r'\"(?P<referer>.*?)\"'
regex_agent = r'\"(?P<agent>.*?)\"'
regex_space = r'\s'

pattern = regex_host + regex_space + regex_identity + regex_space + \
          regex_user + regex_space + regex_time + regex_space + \
          regex_request + regex_space + regex_status + regex_space + \
          regex_size + regex_space + regex_referer + regex_space + \
          regex_agent


def parser(s):
    """
    return type : dict()
    return format: {
        host:str , identity:str , user:str ,
        time:str , request:str , status:str ,
        size:str , referer:str , agent:str
    }
    returns None if failed.
    """
    try:
        parts = re.match(pattern, s)
        return parts.groupdict()
    except Exception as err:
        print(err)
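To see what the parser returns, here is a self-contained sketch that applies the same pattern to one line in the combined log format. The sample line is an illustrative one (not taken from a real log of this project) and the helper name parse_line is hypothetical.

```python
import re

# Same fields as logparser.py, written as a single compiled regex.
PATTERN = re.compile(
    r'(?P<host>.*?)\s(?P<identity>\S+)\s(?P<user>\S+)\s'
    r'\[(?P<time>.*?)\]\s"(?P<request>.*?)"\s'
    r'(?P<status>\d{3})\s(?P<size>\S+)\s'
    r'"(?P<referer>.*?)"\s"(?P<agent>.*?)"'
)

# Illustrative combined-format line.
SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)


def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


fields = parse_line(SAMPLE)
print(fields["host"], fields["status"], fields["request"])
```

Every value comes back as a string, including status and size, so convert them with int() if you need numbers.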

# cat log.py (sorts IPs by hit count and looks up the location of each one)

import os

import ipstack
import logparser
import apikey

def get_hit(t):
    # Sort key: the hit count of an (ip, hits) pair.
    return t[1]

path = input('Enter your log file name (absolute path): ')

if path.lower().split('/')[-1].endswith('log') and os.path.isfile(path):
    ipcount = {}
    with open(path, 'r') as file:
        for line in file:
            part = logparser.parser(line)
            if part is None:
                # Skip lines that do not match the access log pattern.
                continue
            ip = part['host']
            if ip not in ipcount:
                ipcount[ip] = 1
            else:
                ipcount[ip] += 1

    # Keep the 10 IPs with the most hits, highest first.
    result = sorted(ipcount.items(), key=get_hit, reverse=True)[:10]
    for item in result:
        ip, hit = item
        country = ipstack.get_country(ip=ip, key=apikey.api)
        print("{:20}:{:10}  [{}]".format(ip, hit, country))
else:
    print('This is not an access log')
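The counting and sorting step in log.py can also be written with collections.Counter from the standard library. This is only an alternative sketch, not the code the script uses; the helper name top_hosts and the sample addresses are made up for illustration.

```python
from collections import Counter


def top_hosts(hosts, n=10):
    """Return the n most frequent hosts as (host, hits) pairs, highest first."""
    return Counter(hosts).most_common(n)


# Made-up host values, as they would come out of part['host'] line by line.
sample = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1", "10.0.0.2"]
print(top_hosts(sample, n=2))  # [('10.0.0.1', 3), ('10.0.0.2', 2)]
```

Counter replaces the manual "if ip not in ipcount" bookkeeping, and most_common(10) replaces the sorted(...)[:10] call.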

Conclusion

This is a simple Python script for analyzing an access log: you give it the log file, and it shows the IP addresses with the most hits together with their locations.
