
AIL Framework for Analysis of Information Leaks

Workshop - A generic analysis information leak open source software

Alexandre Dulaunoy

alexandre.dulaunoy@circl.lu

Sami Mokaddem

sami.mokaddem@circl.lu info@circl.lu

January 19, 2018

(2)

Objectives of the workshop

(3)

Our objectives of the workshop

Demonstrate why data-analysis is critical in information security

Explain challenges and the design of the AIL framework

Learn how to install and start AIL

Learn how to properly feed AIL with custom data

Learn how to manage current modules

Learn how to create new modules

Practical part

(4)

Your objectives of the workshop

What are your expectations?

(5)

Sources of leaks

(6)

Sources of leaks: Paste monitoring

Example: http://pastebin.com/

Easily storing and sharing text online

Used by programmers and legitimate users

Source code & information about configurations

Abused by attackers to store:

List of vulnerable/compromised sites

Software vulnerabilities (e.g. exploits)

Database dumps

User data

Credentials

Credit card details

More and more ...

(8)

Examples of pastes

(9)

Sources of leaks: Others

Mistakes from users

https://github.com/search?q=remove password&type=Commits&ref=searchresults

(11)

Are leaks frequent?

Yes!

And it’s important to detect them.

(12)

Paste monitoring at CIRCL: Statistics

Monitored paste sites: 27

pastebin.com

ideone.com

...

Table: Statistics for 2016

Pastes 2016       Monthly average    Total
Fetched pastes    1 547 094          18 565 124

(13)

AIL Framework

(14)

From a requirement to a solution: AIL Framework

History:

AIL started as an internship project (2014) to evaluate the feasibility of automating the analysis of (un)structured information to find leaks.

As of 2017, the AIL framework is open source software written in Python.

The software is actively used (and maintained) by CIRCL.

(15)

AIL Framework: A framework for Analysis of Information Leaks

"AIL is a modular framework to analyse potential information leaks from unstructured data sources like pastes from Pastebin."

Other leaks

(16)

AIL Framework: Current capabilities

Extending AIL to add a new analysis module can be done in 50 lines of Python

The framework supports multi-processors/cores by default.

Any analysis module can be started multiple times to support faster processing during peak times or bulk import

Multiple concurrent data inputs

(17)

AIL Framework: Current features

Extracting credit card numbers, credentials, phone numbers, ...

Extracting and validating potential hostnames

Keeps track of duplicates

Full-text indexer to index unstructured information

Tracking of terms, sets and regexes, and of their occurrences

Sentiment/Mood analyser for incoming data

Modules manager

And many more
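
To make the extraction features more concrete, here is a minimal sketch of how credit card candidates could be pulled out of a paste and validated. It is an illustration only, not AIL's code: the regular expression, `luhn_valid` and `extract_card_numbers` are hypothetical names, and using the Luhn checksum to discard random digit runs is an assumption about a reasonable validation step.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit, starting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def extract_card_numbers(text: str) -> list:
    """Return the Luhn-valid candidates found in text, as digit-only strings."""
    found = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found
```

Calling `extract_card_numbers` on a paste keeps well-known test numbers such as 4111 1111 1111 1111 and drops digit runs that fail the checksum.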

(18)

Example: Following a notification (0) - Dashboard

(19)

Example: Following a notification (1) - Searching

(20)

Example: Following a notification (2) - Metadata

(21)

Example: Following a notification (3) - Browsing content

(23)

Pystemon

(24)

Pystemon: A monitoring tool for Pastebin-like sites

Other leaks

(25)

Pystemon: Current capabilities

Flexible design, minimal effort to add another paste* site

Use custom download functions for complex pastie sites

Uses multiple threads per unique site to download the pastes

(optional) Uses random User-Agents

(optional) Uses random proxies

Removes a proxy if it is unreliable (fails 5 times)

(optional) Compresses saved files with gzip (no zip, to limit external dependencies)

And more...
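
The proxy handling described above can be sketched as follows. This is not Pystemon's code: `ProxyPool` and its methods are hypothetical, and the only detail taken from the slide is that a proxy is dropped after 5 failures.

```python
import random


class ProxyPool:
    """Hypothetical pool that drops a proxy once it has failed too often."""

    def __init__(self, proxies, max_failures=5):
        # Failure counter per proxy; a proxy stays in the dict while it is usable.
        self.failures = {proxy: 0 for proxy in proxies}
        self.max_failures = max_failures

    def pick(self):
        """Return a random usable proxy, or None when the pool is empty."""
        if not self.failures:
            return None
        return random.choice(list(self.failures))

    def report_failure(self, proxy):
        """Count one failure and remove the proxy when the limit is reached."""
        if proxy not in self.failures:
            return
        self.failures[proxy] += 1
        if self.failures[proxy] >= self.max_failures:
            del self.failures[proxy]
```

A downloader would call `pick()` before each request and `report_failure()` whenever the request fails through that proxy.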

(26)

Setting up the framework

(27)

Setting up AIL-Framework from source or virtual machine

Setting up AIL-Framework from source

    git clone https://github.com/CIRCL/AIL-framework.git
    cd AIL-framework
    ./installing_deps.sh
    cd var/www/
    ./update_thirdparty.sh

Using the virtual machine:

1. Download https://www.circl.lu/assets/files/ail-training/AIL_v@4986352.ova

2. Start VirtualBox

(28)

AIL ecosystem - Challenges and design

(29)

AIL ecosystem: Technologies used

Programming language: mostly Python 2 (slowly migrating to Python 3)

Databases: Redis and Redis-levelDB

Server: Flask

Data message passing: ZMQ and Redis Publisher/Subscriber

(30)

AIL global architecture

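[Figure: the data sources (Pystemon, import_dir.py) send data over ZMQ to the AIL Mixer, which fills Redis sets 1, 2 and 3; the sets are consumed by the modules (Module x, Module y, Module z). Labels also shown: Redis PubSub 1: port 6380, channel queuing; Redis PubSub 2: port 6380, channel script.]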

(31)

Data feeder: Gathering pastes with pystemon

Pystemon global architecture

[Figure: Pystemon1, Pystemon2 and Pystemon3 feed a Redis set; a dispatcher publishes the pastes on ZMQ:5555, and SOCAT:5555 relays them to the organisations (Org). Labels also shown: Redis PubSub 1: port 6380, channel queuing; Redis PubSub 2: port 6380, channel script.]

(32)

AIL global architecture: Data streaming between modules

(33)

AIL global architecture: Data streaming between modules (Credential example)

(34)

Message consuming

[Figure: Module x adds messages to a Redis set with SADD; two instances of Module y each take messages from that set with SPOP.]
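
The SADD/SPOP pattern can be sketched without a running Redis server. In the example below, `InMemorySet` is a stand-in that only mimics the two set commands, and `drain` and the worker names are hypothetical; a real module would issue the same commands against Redis instead.

```python
import random


class InMemorySet:
    """Stand-in for a Redis set, exposing only sadd and spop."""

    def __init__(self):
        self._items = set()

    def sadd(self, item):
        """Add an item; return 1 if it was new, 0 if it was already present."""
        if item in self._items:
            return 0
        self._items.add(item)
        return 1

    def spop(self):
        """Remove and return a random item, or None if the set is empty."""
        if not self._items:
            return None
        item = random.choice(sorted(self._items))
        self._items.remove(item)
        return item


def drain(queue, worker_names):
    """Let the workers take turns popping until the queue is empty."""
    handled = {name: [] for name in worker_names}
    while True:
        progressed = False
        for name in worker_names:
            message = queue.spop()
            if message is not None:
                handled[name].append(message)
                progressed = True
        if not progressed:
            return handled


queue = InMemorySet()
for path in ["2018/01/19/a.gz", "2018/01/19/b.gz", "2018/01/19/c.gz"]:
    queue.sadd(path)  # producer side (Module x)

# Two instances of the same consumer module share the work.
result = drain(queue, ["module_y_1", "module_y_2"])
```

Because a pop removes the item it returns, each message is handled by exactly one instance, which is what allows a module to be started several times during peak load.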

(35)

Starting the framework

(36)

Running your own instance from source

Accessing the environment and starting AIL

    # Activate the virtualenv
    . ./AILENV/bin/activate

    # Launch the system
    cd bin/
    ./LAUNCH
    # check options 1->5

    # Start web interface

(37)

Running your own instance using the virtual machine

Login and passwords:

    Web interface (default network settings):
    http://192.168.56.51:7000/
    Shell/SSH:
    ail/Password1234

(38)

Managing the framework

(39)

Managing AIL: The old-fashioned way

Access the script screen

    screen -r Script

Table: GNU screen shortcuts

Shortcut    Action
C-a d       Detach screen
C-a c       Create new window
C-a n       Next window

(40)

Managing your modules: Using the helper

(41)

Feeding the framework

(42)

Feeding AIL

There are different ways to feed AIL with data:

1. Become a CIRCL partner and ask for access to our feed: info@circl.lu

2. Set up pystemon and use the custom feeder

pystemon will collect pastes for you

3. Feed your own data using the import_dir.py script

(44)

Plug AIL into the CIRCL feed

You can freely access the CIRCL feed during this workshop!

In the file bin/packages/config.cfg,

set ZMQ_Global -> address to tcp://crf.circl.lu:5556
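
The same change can be scripted. The sketch below uses Python's standard configparser and assumes that the section is spelled `ZMQ_Global` and that the file is a plain INI file; `set_zmq_address` and the sample text (including the `channel` value) are hypothetical and only serve the example.

```python
import configparser
import io


def set_zmq_address(config_text, address):
    """Return config_text with the ZMQ_Global address replaced by address."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section("ZMQ_Global"):
        parser.add_section("ZMQ_Global")
    parser.set("ZMQ_Global", "address", address)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Illustrative sample, not a copy of the real configuration file.
sample = "[ZMQ_Global]\naddress = tcp://127.0.0.1:5556\nchannel = 102\n"
updated = set_zmq_address(sample, "tcp://crf.circl.lu:5556")
```

Note that configparser drops comments when it rewrites a file, so editing the value by hand is the safer choice for a configuration you want to keep annotated.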

(45)

Feeding AIL with your own data - import_dir.py (1)

/!\ 2 requirements:

1. Data to be fed must follow this path hierarchy:

    year/month/day/(textfile/gzfile)

    This is due to the internal representation of pastes in AIL

2. Each file to be fed must be of a reasonable size:

    3 MB is already large

    This is because some modules perform regex matching

    If you want to feed a large file, it is better to split it into several smaller ones
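
Both requirements can be met with a small staging step before the import. The sketch below is not part of AIL: `split_lines`, `stage_for_import`, the file naming scheme and the 3 MiB ceiling are assumptions chosen to match the hints above.

```python
import datetime
import tempfile
from pathlib import Path

MAX_BYTES = 3 * 1024 * 1024  # assumed ceiling, based on the "3 MB is already large" hint


def split_lines(text, max_bytes=MAX_BYTES):
    """Split text on line boundaries into chunks of at most max_bytes (UTF-8).

    A single line longer than max_bytes is kept whole in its own chunk.
    """
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode("utf-8"))
        if current and size + line_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        chunks.append("".join(current))
    return chunks


def stage_for_import(text, name, root, day, max_bytes=MAX_BYTES):
    """Write text under root/YYYY/MM/DD/, one file per chunk; return the paths."""
    target = Path(root) / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, chunk in enumerate(split_lines(text, max_bytes)):
        path = target / f"{name}_{index:03d}.txt"
        path.write_text(chunk, encoding="utf-8")
        paths.append(path)
    return paths


# Tiny demonstration with an artificially small limit of 4 bytes per file.
with tempfile.TemporaryDirectory() as tmp:
    staged = stage_for_import("a\nb\nc\n", "dump", tmp,
                              datetime.date(2018, 1, 19), max_bytes=4)
    relative = [p.relative_to(tmp).as_posix() for p in staged]
```

The resulting root directory is what would then be handed to the import script.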

(46)

Feeding AIL with your own data - import_dir.py (2)

1. Change your local configuration bin/packages/config.cfg

In the file bin/packages/config.cfg,

add 127.0.0.1:5556 in ZMQ_Global

(should already be set by default)

2. Launch import_dir.py with the directory you want to import

    import_dir.py -d dir_path

3. Watch your data being fed to AIL

(49)

Creating new features

(50)

Developing new features: Plug a module into the system

Choose where to locate your module in the data flow:

(51)

Writing your own modules - /bin/template.py

    import time
    from pubsublogger import publisher
    from Helper import Process

    if __name__ == '__main__':
        # Port of the redis instance used by pubsublogger
        publisher.port = 6380
        # Script is the default channel used for the modules.
        publisher.channel = 'Script'
        # Section name in bin/packages/modules.cfg
        config_section = '<section name>'
        # Setup the I/O queues
        p = Process(config_section)
        # Send a description of the module to the logger
        publisher.info("<description of the module>")
        # Endless loop getting messages from the input queue
        while True:
            # Get one message from the input queue
            message = p.get_from_set()
            if message is None:
                publisher.debug("{} queue is empty, waiting".format(config_section))
                time.sleep(1)
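
The template stops at the point where a message has been received. As a rough sketch of what the processing step of a module could look like, the example below extracts e-mail addresses from a message. Everything in it is hypothetical: `process_message`, `run_once` and the regular expression are illustrative names, and it assumes the message carries the text itself, whereas in AIL the message may instead reference a paste.

```python
import re

# Deliberately simple, hypothetical pattern for e-mail addresses.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def process_message(content):
    """Return the sorted, de-duplicated e-mail addresses found in content."""
    return sorted(set(EMAIL.findall(content)))


def run_once(get_message, handle):
    """Pull one message and process it; return False when the queue is empty."""
    message = get_message()
    if message is None:
        return False
    handle(process_message(message))
    return True


# A plain list plays the role of the input queue for this demonstration.
pending = ["contact: alice@example.org, bob@example.org", "no address here"]
results = []
while run_once(lambda: pending.pop(0) if pending else None, results.append):
    pass
```

Keeping the processing in a pure function such as `process_message` makes the module easy to test without starting the whole framework.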

(52)

AIL - Add your own web interface

1. Launch var/www/create_new_web_module.py

2. Enter the module's name

3. A template and a Flask skeleton are created for your new webpage in var/www/modules/

4. You can start coding server-side in:

var/www/modules/your_module_name/Flask_your_module_name.py

5. You can start coding client-side in:

var/www/modules/your_module_name/templates/your_module_name.html

(53)

Case study: Push alert to MISP

(54)

Push alert to MISP

Goal: every alert concerning Credential.py and CreditCards.py is pushed to MISP

(55)

Case study: Finding the best place in the system

Best place to put it?

(57)

Case study: Updating alertHandler.py

alertHandler.py - commit 83e082e62a20a49e1ca2d546e4f19209135ac59d

    [...]
    if message is not None:
        message = message.decode('utf8')  # decode because of python3
        module_name, p_path = message.split(';')
        [...]

        # Create MISP AIL-leak object and push it
        if flag_misp:  # MISP is connected
            allowed_modules = ['credential', 'creditcards']
            if module_name in allowed_modules:
                # create and setup the MISP object
                wrapper.add_new_object(module_name, p_path)
                wrapper.pushToMISP()
            else:
                print('not pushing to MISP:', module_name, p_path)
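
The filtering logic of the excerpt can be tried in isolation. In the sketch below, `RecordingWrapper` is a stand-in that only records what would be pushed, and `handle_alert` is a hypothetical helper; only the method names `add_new_object` and `pushToMISP`, the `module;path` message format and the allowed module list come from the excerpt.

```python
ALLOWED_MODULES = ("credential", "creditcards")


class RecordingWrapper:
    """Stand-in for the MISP wrapper; it only records what would be pushed."""

    def __init__(self):
        self.pushed = []
        self._pending = None

    def add_new_object(self, module_name, p_path):
        self._pending = (module_name, p_path)

    def pushToMISP(self):
        self.pushed.append(self._pending)
        self._pending = None


def handle_alert(raw_message, wrapper, misp_connected=True):
    """Decode one 'module;path' alert and push it if the module is allowed."""
    # maxsplit=1 keeps a path that itself contains ';' in one piece.
    module_name, p_path = raw_message.decode("utf8").split(";", 1)
    if misp_connected and module_name in ALLOWED_MODULES:
        wrapper.add_new_object(module_name, p_path)
        wrapper.pushToMISP()
        return True
    return False
```

Feeding it `b"credential;2018/01/19/a.gz"` records one push, while an alert from any other module is ignored.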

(58)

Practical part

(59)

Practical part: Pick your choice

1. Improve module keys.py to support other types of keys (ssh, ...)

https://github.com/veorq/blueflower/blob/master/blueflower/constants.py

2. Graph database on Credential.py

Top used passwords, most compromised users, ...

3. Webpage scraper

Download HTML from URLs found in pastes

Re-inject HTML as pastes in AIL

4. Integration of truffleHog

Searches through git repositories for high-entropy strings and secrets, digging deep into commit history

https://github.com/dxa4481/truffleHog
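
For the last exercise, "high-entropy strings" can be made concrete with Shannon entropy. The sketch below is a generic illustration and not truffleHog's code: `shannon_entropy` and `high_entropy_tokens` are hypothetical names, and the minimum length and threshold are arbitrary assumptions that would need tuning.

```python
import math
from collections import Counter


def shannon_entropy(token):
    """Return the Shannon entropy of token, in bits per character."""
    if not token:
        return 0.0
    counts = Counter(token)
    length = len(token)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def high_entropy_tokens(text, min_length=20, threshold=4.0):
    """Return whitespace-separated tokens that are long and look random."""
    return [
        token
        for token in text.split()
        if len(token) >= min_length and shannon_entropy(token) > threshold
    ]
```

A string made of one repeated character has an entropy of 0 bits per character, while a long token of distinct characters scores high and is reported as a candidate secret.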

(60)

Contribution rules

(61)

How to contribute

(62)

How to contribute

Feel free to fork the code, play with it, make some patches or add additional analysis modules.

Feel free to make a pull request for your contribution

That’s it!

(65)

Conclusion

Building AIL helped us find additional leaks that cannot be found through manual analysis, and shortened the time needed to detect duplicate/recycled leaks.

→ This means a quicker response time when assisting and/or proactively informing affected constituents.
