
Machine Learning Model for Passive OS Fingerprinting

OS fingerprinting is the process of detecting a remote server's OS (and version) by communicating with it and analyzing its response. This process matters to security experts (and attackers), since knowing a server's OS reveals which security vulnerabilities it is likely exposed to.

The most common fingerprinting tools (Nmap, NetworkMiner, Satori, p0f) rely on a database of "network signatures" (a signature can be thought of as the 'accent' or 'body language' of an OS). The database is maintained manually by security experts and has not been updated in a long time (most tools rely on p0f's database).

This project is an attempt to create an ML model for OS fingerprinting.

Background on OS Fingerprinting

There are 2 types of fingerprinting:

  • Active fingerprinting takes advantage of known security flaws: if a vulnerability existed in version X of the Linux kernel and was fixed in version Y, then attempting the exploit helps determine the server's kernel version ("exploit completed successfully" --> "server runs version X"). Nmap is a common tool for active fingerprinting.

  • Passive fingerprinting only analyzes packets of 'typical/legitimate' communication (mainly the TCP/IP headers). p0f is a common tool for passive fingerprinting.

The trade-off between the two methods: the active method has better accuracy, but its 'aggressive' nature makes it much easier for firewalls to detect.

In this project my models perform the passive version. To be precise, they only look at the server's TCP SYN-ACK message, which makes the process extremely stealthy and fast.

Related Work: I found an IEEE paper about a similar project:
      A Machine Learning-based Tool for Passive OS Fingerprinting with TCP Flavor as a Novel Feature

Data Generation

I collected data on ~1,000,000 servers (chosen from a list of popular websites).

Establishing Ground Truth

Since I don't have a datacenter's worth of my own servers, finding labeled servers felt like a 'chicken and egg' problem. I decided to use Nmap's analysis as my ground truth: it may not be 100% accurate, but it harnesses the precision of active fingerprinting, and it's an industry standard.

Nmap usually reports 85%-90% certainty for its output, returned as a list of guesses in descending order of certainty. For this reason I aimed for 85%-90% accuracy with my models, and decided that the most relevant accuracy metric would be top-2 accuracy.

Feature Selection

I chose the features by reading p0f's documentation, the paper mentioned above, and the TCP and IP RFCs.
Some of the most helpful fields are IP's "Don't Fragment" flag, IP's TTL value, TCP's MSS value, and TCP's options.
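
For illustration, here is a minimal sketch of pulling these fields out of a captured SYN-ACK with Scapy (field names follow Scapy's IP/TCP layers; the exact feature set and encoding used in the notebooks may differ):

```python
from scapy.all import IP, TCP

def extract_features(pkt):
    """Pull the TCP/IP header fields used as features from a sniffed SYN-ACK."""
    ip, tcp = pkt[IP], pkt[TCP]
    opts = dict(tcp.options)  # e.g. {'MSS': 1460, 'SAckOK': b'', 'WScale': 7, ...}
    return {
        "ip_df":         bool(ip.flags.DF),   # IP "Don't Fragment" flag
        "ip_ttl":        ip.ttl,              # observed TTL (hints at the initial TTL)
        "tcp_window":    tcp.window,          # advertised window size
        "tcp_mss":       opts.get("MSS", 0),  # Maximum Segment Size option
        # Order of TCP options as one string, e.g. "MSS,SAckOK,Timestamp,NOP,WScale"
        "tcp_opt_order": ",".join(str(name) for name, _ in tcp.options),
    }
```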

Data Collection

The process of retrieving labels and the process of retrieving features were run separately using different tools.

Label retrieval: Python has a wrapper for Nmap, so automating the scan was relatively trivial. Another advantage of Nmap is its built-in ability to scan multiple hosts concurrently.
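
For reference, a minimal sketch of automating such a scan with the python-nmap wrapper (the arguments and result parsing here are illustrative, not necessarily what this repository uses; OS detection requires root privileges):

```python
import nmap  # python-nmap, a wrapper around the Nmap binary

def get_os_labels(host):
    """Run Nmap OS detection (-O) against one host and return its guesses."""
    scanner = nmap.PortScanner()
    scanner.scan(hosts=host, arguments="-O")
    results = []
    for scanned in scanner.all_hosts():  # keys are the resolved IP addresses
        # Each match is a dict with a 'name' and an 'accuracy' (0-100), best guess first.
        for match in scanner[scanned].get("osmatch", []):
            results.append((match["name"], int(match["accuracy"])))
    return results
```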

Feature retrieval: to analyze a server's SYN-ACK message, I sent an HTTP request while sniffing the communication with Scapy (a sniffer & packet manipulation tool). I used multithreading to probe multiple hosts simultaneously.
(Initially I only sent a TCP SYN message, as it's simpler & faster than sending a full HTTP request. I noticed there was almost no variety in the response's TCP options, and suspected it may be due to the 'synthetic' nature of the probe. Switching to a full HTTP request resulted in the variety I was hoping for.)
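
A rough sketch of this probe, assuming port 80 and Scapy's AsyncSniffer running alongside an ordinary HTTP request (sniffing requires root; the actual collection script may be structured differently):

```python
import time
import requests
from scapy.all import AsyncSniffer, TCP

def capture_syn_ack(host, port=80, timeout=5):
    """Send a normal HTTP request and capture the server's SYN-ACK with Scapy."""
    # 'host' is assumed to be an IP address (or a hostname resolving to one address).
    sniffer = AsyncSniffer(filter=f"tcp and src host {host} and src port {port}",
                           timeout=timeout)
    sniffer.start()
    time.sleep(0.5)  # give the sniffer a moment to start before probing
    try:
        requests.get(f"http://{host}", timeout=timeout)  # triggers a real TCP handshake
    except requests.RequestException:
        pass  # only the handshake matters, not a successful HTTP response
    for pkt in sniffer.stop():
        if pkt.haslayer(TCP) and pkt[TCP].flags.S and pkt[TCP].flags.A:
            return pkt  # the SYN-ACK of the three-way handshake
    return None
```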

My scan found the following operating systems:

OS              # Samples
Linux 5.X           12392
Linux 4.X          110824
Linux 3.X           88485
Linux 2.6.X         50978
Linux (Other)        5634
OpenBSD 4.X          7041
FreeBSD 6.X         72072
embedded            76809
Windows 2016         6224
Windows 2012         9014


Model Comparison

The Models:

  • SVM: in some of the features, different operating systems produce different value ranges (for example, Windows systems tend to have an initial TTL of 128, while Linux systems tend to have an initial TTL of 64). I believed this property might call for a linear classifier.
  • Gradient Boosting: this is simply a typical choice for tabular data.

  • Neural Network: adding this model was mostly for my own curiosity. The network has 4 fully-connected layers.
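
A rough sketch of how these three models could be instantiated with scikit-learn (hyperparameters and layer sizes below are placeholders, not the settings used in the training notebook):

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

models = {
    # Linear classifier, motivated by the range-separated features (e.g. initial TTL).
    "svm": make_pipeline(StandardScaler(), LinearSVC()),
    # A typical choice for tabular data.
    "gradient_boosting": GradientBoostingClassifier(),
    # Small fully-connected network; layer sizes are illustrative only.
    "neural_net": MLPClassifier(hidden_layer_sizes=(64, 64, 64, 64)),
}
```

For top-2 scoring, the SVM's decision_function values (or the other models' predict_proba outputs) can be ranked per sample.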

The Metric:
    As I wrote under Establishing Ground Truth, the metric that best fits my data is top-2 accuracy.
    Note that it does not hinder the user experience too much: receiving two guesses isn't so bad when looking for exploits.
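
For concreteness, scikit-learn can compute this metric directly; a toy example with three classes:

```python
from sklearn.metrics import top_k_accuracy_score

# A prediction counts if the true class is among the two highest-scoring classes.
y_true = [0, 1, 2]
y_score = [[0.5, 0.3, 0.2],    # true class 0 ranked 1st -> counts
           [0.4, 0.35, 0.25],  # true class 1 ranked 2nd -> counts
           [0.6, 0.3, 0.1]]    # true class 2 ranked 3rd -> misses
print(top_k_accuracy_score(y_true, y_score, k=2))  # 0.666...
```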

The Results:
    All 3 models reached a top-2 accuracy of around 85%. Graphs are available in the Model Training Notebook.
