OARC 2014 Spring Workshop (Warsaw)

Name: OARC 2014 Spring Workshop (Warsaw)
Start: 2014-05-10T09:00:00+02:00
End: 2014-05-11T18:00:00+02:00
Location: Sofitel Warsaw Victoria

Europe/Warsaw timezone

Large scale regular expression recognition on the DITL data-set by using similarity search

10 May 2014, 11:45

20m

Opera (Sofitel Warsaw Victoria)

11 Królewska Street 00-065 Warsaw

Members-Only Session

Dr Arnoldo Muller-Molina (simMachines)

The "Day in the Life of the Internet" (DITL) data-set is collected to study and improve the integrity of the root server system. Among the properties recorded in the data-set, we focus on second-level domain (SLD) strings. In this study, we introduce a method that automatically infers regular expressions from over-represented SLD strings. First, we identify random strings and remove them from the data pipeline. Then, we find common string seeds that guide the elucidation process. Finally, we perform similarity search on strings that do not exceed a certain entropy level to generate a weight matrix, which is then converted into regular expressions and their corresponding visualizations. Similarity search is a very expensive operation, but we achieve fast results by using the simMachines R-01 similarity engine. The method may be used to preemptively discover security or performance issues in the infrastructure. During the talk, we will show a sample of collected regular expressions so that the community may identify familiar and unfamiliar SLD patterns.
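To make the last two steps of the abstract more concrete, here is a minimal sketch of an entropy filter followed by a weight-matrix-to-regex conversion. It is not the speaker's implementation and does not use the simMachines R-01 engine: the similarity search is skipped entirely, and a group of similar, equal-length strings is simply assumed as input. The function names, the `max_entropy` and `min_fraction` thresholds, and the sample strings are all hypothetical.

```python
import math
import re
from collections import Counter


def shannon_entropy(s):
    """Shannon entropy of the character distribution of s, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def drop_high_entropy(strings, max_entropy):
    """Keep only strings whose entropy does not exceed max_entropy (assumed threshold)."""
    return [s for s in strings if shannon_entropy(s) <= max_entropy]


def weight_matrix(strings):
    """Per-position character counts for a group of equal-length strings."""
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("this sketch assumes equal-length strings")
    return [Counter(s[i] for s in strings) for i in range(length)]


def matrix_to_regex(matrix, min_fraction=0.1):
    """Turn each column into a literal or a character class.

    Characters seen in fewer than min_fraction of the strings are ignored;
    a column with no character above the threshold becomes '.'.
    """
    parts = []
    for column in matrix:
        total = sum(column.values())
        kept = sorted(ch for ch, c in column.items() if c / total >= min_fraction)
        if not kept:
            parts.append(".")
        elif len(kept) == 1:
            parts.append(re.escape(kept[0]))
        else:
            parts.append("[" + "".join(re.escape(ch) for ch in kept) + "]")
    return "".join(parts)


# Hypothetical group of similar SLD strings (not taken from DITL data).
group = ["mail01", "mail02", "mail07", "mall03"]
pattern = matrix_to_regex(weight_matrix(group), min_fraction=0.2)
print(pattern)  # ma[il]l0[1237]
```

A real pipeline would need an alignment step for strings of different lengths and a similarity index to form the groups; both are left out here to keep the sketch short.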

Slides

patrec.pdf

patrec.pptx
