How it works

This page explains how data is extracted from the captured traffic.

Extraction of the data

Ways of data extraction


There are three alternative ways to generate the SQLite base, so it is not necessary to use Python (although it is recommended) to generate it. Once the base is created, a Python script, or a program in any other language, can easily examine it.

Storing the results of extraction

SQLite base

It is possible to store the extraction results (in this case timestamp, source and destination IP) in a database.
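
As a minimal sketch of this storage step, the snippet below writes a few (timestamp, source IP, destination IP) tuples into an SQLite table. The table name `ip_link` and the columns `tts`, `ip_src` and `ip_dst` are taken from the query shown further down this page; the column types, the helper name `store_packets` and the sample packets are assumptions made for illustration, not the schema IP-Link actually creates.

```python
import sqlite3

def store_packets(conn, packets):
    """Store (timestamp, source IP, destination IP) tuples in the ip_link table."""
    # Column types are an assumption; only the names appear on this page.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ip_link (tts REAL, ip_src TEXT, ip_dst TEXT)"
    )
    conn.executemany(
        "INSERT INTO ip_link (tts, ip_src, ip_dst) VALUES (?, ?, ?)", packets
    )
    conn.commit()

# Made-up sample packets (addresses from the documentation range 192.0.2.0/24).
packets = [
    (1232053200.0, "192.0.2.1", "192.0.2.2"),
    (1232053201.5, "192.0.2.1", "192.0.2.3"),
    (1232053202.0, "192.0.2.2", "192.0.2.1"),
]

conn = sqlite3.connect(":memory:")  # in-memory base, nothing is written to disk
store_packets(conn, packets)
row_count = conn.execute("SELECT COUNT(*) FROM ip_link").fetchone()[0]
print(row_count)
```

Using `?` placeholders lets SQLite handle quoting of the values, which is safer than building the INSERT statement by string concatenation.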

Python serialized object

Structure of the Python serialized object

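The object is a dictionary of dictionaries: each key is a source IP, and its value maps every IP contacted by that source to the number of contacts.

>>> dic_ip = {'' : {'' : 20, '' : 16,
...                 '' : 451},
...           '' : {'' : 48, '' : 2}}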
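
To illustrate how such a structure can be produced, here is a small sketch that counts contacts from a list of (source, destination) pairs. The function name `build_contact_dict` and the sample addresses are hypothetical; this is not necessarily how IP-Link builds its object.

```python
import pickle
from collections import defaultdict

def build_contact_dict(pairs):
    """Map each source IP to a {destination IP: number of contacts} dictionary."""
    counts = defaultdict(lambda: defaultdict(int))
    for src, dst in pairs:
        counts[src][dst] += 1
    # Convert to plain dicts: a defaultdict built on a lambda cannot be pickled.
    return {src: dict(dsts) for src, dsts in counts.items()}

# Made-up (source, destination) pairs.
pairs = [
    ("192.0.2.1", "192.0.2.2"),
    ("192.0.2.1", "192.0.2.2"),
    ("192.0.2.1", "192.0.2.3"),
    ("192.0.2.2", "192.0.2.1"),
]

dic_ip = build_contact_dict(pairs)
# The plain dictionary survives a pickle round trip unchanged.
restored = pickle.loads(pickle.dumps(dic_ip))
print(dic_ip)
```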

Some tests

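>>> import pickle
>>> dic_obj = open("./dic.pyobj", "rb")  # binary mode is the safe choice for pickle files
>>> dic_ip = pickle.load(dic_obj)
>>> dic_obj.close()
>>> a = 0  # total number of contacts made by this source
>>> for i in dic_ip['']:
...     a += dic_ip[''][i]
...
>>> print(a)
2815911
>>> len(dic_ip[''])
738585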

We see here that this source has contacted 738,585 different IPs, for a total of 2,815,911 contacts.
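
The two figures above (distinct IPs and total contacts) can also be computed without an explicit loop. The sketch below assumes the dictionary-of-dictionaries layout described earlier; the helper name `contact_summary` and the addresses are made up, and only the counts are borrowed from the structure example.

```python
def contact_summary(dic_ip, src):
    """Return (number of distinct IPs contacted, total number of contacts) for src."""
    contacts = dic_ip[src]
    return len(contacts), sum(contacts.values())

# Made-up addresses; the counts mirror the structure example on this page.
dic_ip = {
    "192.0.2.1": {"192.0.2.2": 20, "192.0.2.3": 16, "192.0.2.4": 451},
    "192.0.2.2": {"192.0.2.1": 48, "192.0.2.5": 2},
}

distinct, total = contact_summary(dic_ip, "192.0.2.1")
print(distinct, total)
```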

>>> (len(dic_ip[''])/(len(dic_ip)*1.0))

This already represents a significant share of the source IPs.

>>> import operator
>>> liste = list(dic_ip[''].items()) # list of IPs contacted by this source
>>> liste.sort(key=operator.itemgetter(1), reverse=True)
>>> liste[0]
('', 204909) # most contacted by this source
>>> liste[1]
('', 114881)
>>> liste[-1]
('', 1) # least contacted by this source

>>> liste[-43527]
('', 1)
>>> liste[-43528]
('', 2)

This source has contacted 43,527 different IPs exactly one time.
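
Rather than searching the sorted list by hand for the index where the count changes, the number of IPs contacted exactly once can be counted directly. This is a sketch with hypothetical helper names (`rank_contacts`, `count_contacted_once`) and made-up data.

```python
def rank_contacts(contacts):
    """Return (IP, count) pairs sorted from most to least contacted."""
    return sorted(contacts.items(), key=lambda item: item[1], reverse=True)

def count_contacted_once(contacts):
    """Count the destination IPs that were contacted exactly one time."""
    return sum(1 for count in contacts.values() if count == 1)

# Made-up contacts of a single source IP.
contacts = {
    "192.0.2.2": 5,
    "192.0.2.3": 1,
    "192.0.2.4": 12,
    "192.0.2.5": 1,
}

ranked = rank_contacts(contacts)
once = count_contacted_once(contacts)
print(ranked[0], ranked[-1], once)
```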

What can we do with this object?

As we have seen, it is very simple to work with this object to obtain what we want. This object represents the part of the base that you want to exploit, and it is created from the SQLite base. Remember that the SQLite base contains all the information from the Pcap. So, if you want, you can filter this information before visualizing it. For example:

cedric@debian:~/IP-Link/source$ python -i data/ip.sql -r time -p 2009-1-15-22-00-00:2009-1-16-02-00-00
DB connect
Request sent to the base :
    SELECT ip_src, ip_dst FROM ip_link WHERE tts >= 1232053200.0 AND tts <=  1232067600.0
Creating object...
Reading the result of the query...
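
The time filter in the run above can be sketched as follows: the period string is split into two dates, each date is converted to a Unix timestamp, and both bounds are passed to the SELECT query. The timestamps in the logged query match the period read as UTC+1 local time, so a fixed UTC+1 offset is assumed here to reproduce them. The function names `parse_period` and `select_period`, the column types and the sample rows are also assumptions.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def parse_period(period, tz):
    """Turn 'Y-M-D-h-m-s:Y-M-D-h-m-s' into a (start, end) pair of Unix timestamps."""
    start, end = period.split(":")
    fmt = "%Y-%m-%d-%H-%M-%S"
    return tuple(
        datetime.strptime(part, fmt).replace(tzinfo=tz).timestamp()
        for part in (start, end)
    )

def select_period(conn, start, end):
    """Return the (ip_src, ip_dst) pairs seen between the two timestamps."""
    query = "SELECT ip_src, ip_dst FROM ip_link WHERE tts >= ? AND tts <= ?"
    return conn.execute(query, (start, end)).fetchall()

utc_plus_1 = timezone(timedelta(hours=1))  # assumed capture time zone
start, end = parse_period("2009-1-15-22-00-00:2009-1-16-02-00-00", utc_plus_1)

# Small in-memory base with made-up rows: one inside the period, one after it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ip_link (tts REAL, ip_src TEXT, ip_dst TEXT)")
conn.executemany(
    "INSERT INTO ip_link VALUES (?, ?, ?)",
    [
        (1232060000.0, "192.0.2.1", "192.0.2.2"),
        (1232070000.0, "192.0.2.2", "192.0.2.1"),
    ],
)
rows = select_period(conn, start, end)
print(start, end, rows)
```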

Here, all the traffic between 2009/01/15 22:00:00 and 2009/01/16 02:00:00 is extracted. Now you can, for example, generate the Circos matrix and a MooWheel graph:

cedric@debian:~/IP-Link/source$ python -i jub-dic.pyobj -o ip.circos
Loading objet...
Searching IP that are source and destination...
Circos matrix generation...
Saving the matrix...
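
The log above mentions a search for IPs that are both source and destination, followed by the matrix generation. One way to read these two steps is sketched below: keep only the IPs that appear in both roles, then build a square table of contact counts. This is an interpretation of the log messages, not the IP-Link implementation, and the layout of the file written for Circos is deliberately left out. The function name `contact_matrix` and the data are hypothetical.

```python
def contact_matrix(dic_ip):
    """Build a square count matrix over IPs that are both source and destination."""
    destinations = {dst for contacts in dic_ip.values() for dst in contacts}
    # Keep only the IPs that appear as a source AND as a destination.
    labels = sorted(ip for ip in dic_ip if ip in destinations)
    # Row = source, column = destination, cell = number of contacts (0 if none).
    matrix = [[dic_ip[src].get(dst, 0) for dst in labels] for src in labels]
    return labels, matrix

# Made-up contact dictionary.
dic_ip = {
    "192.0.2.1": {"192.0.2.2": 3, "192.0.2.3": 1},
    "192.0.2.2": {"192.0.2.1": 2},
    "192.0.2.4": {"192.0.2.1": 5},
}

labels, matrix = contact_matrix(dic_ip)
print(labels)
print(matrix)
```

In this sample, 192.0.2.3 is never a source and 192.0.2.4 is never a destination, so both are left out of the matrix.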

cedric@debian:~/IP-Link/source$ python
Loading dictionary...
Creating MooWheel file...
Writting file.