Network anomaly detection via machine learning

netml

netml is a network anomaly detection library written in Python.

This library contains two primary submodules:

  • pcap parser: pparser
    pparser parses pcap files into flow features, using Scapy.

  • novelty detection modeling: ndm
    ndm detects novelties and anomalies using various models, such as OCSVM.
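
To make the idea of a "flow" concrete, the sketch below groups packets that share the same five-tuple (source address, destination address, source port, destination port, protocol). It is only an illustration, not netml's implementation: the packet records, the function name `group_into_flows`, and the `min_packets` threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical packet records: (timestamp, src_ip, dst_ip, src_port, dst_port, protocol).
# A real parser would read these from a pcap file, for example with Scapy.
packets = [
    (0.00, '10.0.0.1', '10.0.0.2', 5000, 80, 'TCP'),
    (0.05, '10.0.0.3', '10.0.0.2', 5001, 53, 'UDP'),
    (0.10, '10.0.0.1', '10.0.0.2', 5000, 80, 'TCP'),
    (0.30, '10.0.0.1', '10.0.0.2', 5000, 80, 'TCP'),
]


def group_into_flows(packets, min_packets=2):
    """Group packet timestamps by five-tuple and drop flows with too few packets."""
    flows = defaultdict(list)
    for timestamp, *five_tuple in packets:
        flows[tuple(five_tuple)].append(timestamp)
    return {
        key: sorted(timestamps)
        for key, timestamps in flows.items()
        if len(timestamps) >= min_packets
    }


flows = group_into_flows(packets)
for key, timestamps in flows.items():
    print(key, timestamps)
```

With this toy input, only the TCP flow survives, because the UDP flow contains a single packet.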

Installation

netml is available on PyPI:

pip install netml

Or, from a repository clone:

pip install .
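
To confirm that the installation succeeded, one option is to query the installed distribution's metadata with the standard library. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of netml.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version('netml')
print('netml version:', version if version else 'not installed')
```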

Use

PCAP to features

import os

from netml.pparser.parser import PCAP
from netml.utils.tool import dump_data

RANDOM_STATE = 42

pcap_file = 'data/demo.pcap'
pp = PCAP(pcap_file, flow_ptks_thres=2, verbose=10, random_state=RANDOM_STATE)

# extract flows from pcap
pp.pcap2flows(q_interval=0.9)

# assign a label to each flow, using the label file
label_file = 'data/demo.csv'
pp.label_flows(label_file=label_file)

# extract features from each flow given feat_type
feat_type = 'IAT'
pp.flow2features(feat_type, fft=False, header=False)

# dump data to disk
X, y = pp.features, pp.labels
out_dir = os.path.join('out', os.path.dirname(pcap_file))
dump_data((X, y), out_file=f'{out_dir}/demo_{feat_type}.dat')

print(pp.features.shape, pp.pcap2flows.tot_time, pp.flow2features.tot_time)
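
The 'IAT' feature type refers to packet inter-arrival times. The sketch below shows one way such features could be computed from a flow's packet timestamps and brought to a fixed length so that all flows yield vectors of equal size. It is an illustration under stated assumptions, not netml's code: the function names, the zero-padding strategy, and the choice of `dim=3` are hypothetical, and how netml itself picks the feature dimension is not shown here.

```python
def inter_arrival_times(timestamps):
    """Return the gaps between consecutive packet timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def fixed_length(values, dim, pad_value=0.0):
    """Truncate or pad a feature vector to exactly `dim` entries."""
    return list(values[:dim]) + [pad_value] * max(0, dim - len(values))


# Hypothetical arrival times (in seconds) of the packets in two flows.
flow_timestamps = [
    [0.0, 0.5, 1.5, 4.0],
    [10.0, 10.25],
]

features = [fixed_length(inter_arrival_times(ts), dim=3) for ts in flow_timestamps]
print(features)  # [[0.5, 1.0, 2.5], [0.25, 0.0, 0.0]]
```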

Novelty detection

import os

from sklearn.model_selection import train_test_split

from netml.ndm.model import MODEL
from netml.ndm.ocsvm import OCSVM
from netml.utils.tool import dump_data, load_data

RANDOM_STATE = 42

# load data
data_file = 'out/data/demo_IAT.dat'
X, y = load_data(data_file)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=RANDOM_STATE)

# create detection model
model = OCSVM(kernel='rbf', nu=0.5, random_state=RANDOM_STATE)
model.name = 'OCSVM'
ndm = MODEL(model, score_metric='auc', verbose=10, random_state=RANDOM_STATE)

# train the model on the train set
ndm.train(X_train, y_train)

# evaluate the learned model
ndm.test(X_test, y_test)

# dump the model and its results to disk
out_dir = os.path.dirname(data_file)
dump_data((model, ndm.history), out_file=f'{out_dir}/{ndm.model_name}-results.dat')

print(ndm.train.tot_time, ndm.test.tot_time, ndm.score)
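
The example above scores the model with 'auc'. As a reminder of what that metric measures, the sketch below computes the area under the ROC curve by pairwise comparison: the fraction of (anomalous, normal) pairs in which the anomalous sample receives the higher score. This is not netml's implementation; the function name `auc_score`, the label convention (1 = anomalous, 0 = normal), and the sample data are assumptions made for illustration.

```python
def auc_score(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    y_true: 1 for anomalous samples, 0 for normal ones (assumed convention).
    scores: higher means "more anomalous".
    """
    positives = [s for s, y in zip(scores, y_true) if y == 1]
    negatives = [s for s, y in zip(scores, y_true) if y == 0]
    if not positives or not negatives:
        raise ValueError('AUC needs at least one sample of each class')
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and anomaly scores for six test flows.
y_true = [0, 0, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.3, 0.2]
auc = auc_score(y_true, scores)
print(auc)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of anomalous from normal samples.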

For more examples, see the examples/ directory in the source repository.

Architecture

  • docs/: includes all documentation (such as API docs)
  • examples/: includes toy examples and datasets to experiment with
  • ndm/: includes different detection models (such as OCSVM)
  • pparser/: includes pcap processing (feature extraction from pcap)
  • scripts/: miscellaneous scripts (such as xxx.sh, make)
  • tests/: includes test cases
  • utils/: includes common functions (such as load data and dump data)
  • visul/: includes visualization functions
  • LICENSE.txt
  • readme.md
  • requirements.txt
  • setup.py

To Do

The current version implements only basic functionality, which still needs further evaluation and optimization.

  • Evaluate 'pparser' performance on different pcaps
  • Add test cases
  • Add license
  • Add more examples
  • Generate docs from docstrings automatically

Comments that help make the library more robust and easier to use are welcome!

Development

Development dependencies may be installed via the dev extras (below assuming a source checkout):

pip install --editable .[dev]

(Note: the installation flag --editable is also used above to instruct pip to place the source checkout directory itself onto the Python path, to ensure that any changes to the source are reflected in Python imports.)

Development tasks are then managed via argcmdr sub-commands of manage … (as defined by the repository module manage.py), e.g.:

manage bump patch -m "initial release of netml" --build --release

Thanks

netml is based on the initial work of the "Outlier Detection" library odet 🙌

Download files

Source distribution: netml-0.0.2.tar.gz (20.2 kB)

Built distribution (Python 3): netml-0.0.2-py3-none-any.whl (23.2 kB)
