Quantcast
Channel: Noise
Viewing all articles
Browse latest Browse all 39990

SANS Internet Storm Center, InfoCON: green: freq.py super powers?, (Fri, Jul 10th)

$
0
0

Look, up in the URL. Its a BASE64, its a URI. This looks like a job for freq.py

This diary is a follow up to yesterdays post on using freq.py to find DGA (Domain Generation Algorithm) host names in your logs using frequency tables. If you didnt see that post you should review it first before continuing. You can find that diary here: https://isc.sans.edu/forums/diary/Detecting+Random+Finding+Algorithmically+chosen+DNS+names+DGA/19893/

Fellow SANS Instructor/GSE Kevin Fiscus suggested another use for the tool to me last week when I showed him what I had been working on. Kevin pointed out that it is difficult to find BASE64 encoded strings programmatically. Because BASE64 uses uppercase letters, lowercase letters, numbers, slashes and plus signs a program has a hard time telling the difference between a BASE64 encoded string and a URI (The part of the URL after the domain name).">String 1: Q09NRSBUTyBNWSBQWVRIT04gQ0xBU1MgSU4gVkVHQVMhISEh
String 2: forums/diary/Detecting+Random+Finding+Algorithmically+chosen+DNS+names+DGA/19893

Both of these are properly formatted BASE64 encoded strings. But only one of them is intentionally a BASE64 encoded string and only one of them will decode to something other than garbage random data. Our frequency table will work nicely to determine the difference between the BASE64 strings and the URI. This time a high probability (above 5) indicates that it is NOT BASE64 encoded and a LOW probability (below 5) indicates that it is BASE64 encoded. You can do this from either the command line with ">python freq --measure suspect base 64 string frequency_table.freq">freq.exe on Windows using the tools available for download here. But we used the command line yesterday and I want to show you another way to use the Python script. This time lets import freq.py as a Python module. First I start up my Python interactive mode session. Then I import the freq module.">RED.">python
Python 2.7.6 (default, Mar 22 2014, 22:59:38)
[GCC 4.8.2] on linux2
Type help, copyright, credits or license for more information.
">from freq import *

Next I will assign the variable fc to hold a frequency counter object, then load my prebuilt english_lowercase.freq">">fc = FreqCounter()
">fc.load(english_lowercase.freq)

Next I use the .probability() method to measure the probability of the two strings.">">fc.probability(forums/diary/Detecting+Random+Finding+Algorithmically+chosen+DNS+names+DGA/19893)
9.490788394012279
">fc.probability(Q09NRSBUTyBNWSBQWVRIT04gQ0xBU1MgSU4gVkVHQVMhISEh)
2.578357325221811

The URI scores well above 5 indicating that it is not random text and in this case not BASE64. But the 2nd string scores a 2.5 indicating that it more likely to be a BASE64 encoded string. Once again this isnt perfect. You could still have URI with random ASCII strings in then that score low, but it does help to differentiate between common URI strings and BASE64 encoded strings. This is confirmed when we try to BASE64 decode each of hte strings.">">forums/diary/Detecting+Random+Finding+Algorithmically+chosen+DNS+names+DGA/19893.decode(BASE64)
~x8axeex9axcfxddx89xaaxf2xfc7xadyxcbbx9ex0fx91jwhx9bxe1bx9dxd8xa7x83xe0%x82x8axe2xb6x19xa2qxa9excbxe7!xa2xc7xa7xf83Rxfavxa6zxcfx83x18x0fxf5xf7xcfw
">Q09NRSBUTyBNWSBQWVRIT04gQ0xBU1MgSU4gVkVHQVMhISEh.decode(BASE64)
COME TO MY PYTHON CLASS IN VEGAS!!!!

Follow me on twitter @MarkBaggett

Want to learn to use this code in your own script or build tools of your own? Join me for PythonSEC573 in Las Vegas this September 14th! Click here for more information.

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Viewing all articles
Browse latest Browse all 39990

Trending Articles