Friday 31 August 2012

Task saturated

Most LE forensicator types will know what it is like to be overwhelmed with urgent stuff, a situation not helped when your supervisor is shouting at you to get stuff done and then, once you have gone home, phoning you up to shout at you some more (and we love that).  So it is nice to remember that, no matter how crushing the challenges facing you, things could be worse.  Thus, I commend to you the fictional(?) zombie stylings of STEPHEN KNIGHT.

This all too credible tale of zombie mayhem is a milestone in the canon of the walking dead.  All too often zombies are unfairly portrayed as thoughtless automatons, unable to perform any tasks beyond the standard ripping and tearing of flesh.   In this book, a detachment of soldiers is heavily outnumbered and trapped in a tower block (a block of flats for those who speak the Queen's English); their efforts to escape are hampered by the appearance of zombie special forces soldiers who have managed to retain some memories of their training.   In fact, these uber-zombies are so adept that they are able to reduce the most complicated and seemingly insoluble problems to very simple solutions... blow stuff up!!!

Relentlessly gripping, this is essential reading.  This is the first part of a trilogy: will the tale end happily with the walking dead ascending to global dominance, or will the jammy living score an unlikely and undeserved victory?


Thursday 30 August 2012

Facebook artifacts

It is widely known that Facebook artifacts can be cached to disk; a couple of years ago the chat artifacts were written as plain text files that could be found in the web browser cache for IE users.   This is no longer the case; however, some artifacts can still be recovered from a hard disk, particularly from the swap file.
The artifacts are in json format, but facebook are fond of updating their infrastructure, so the internal structure of the json artifacts may change frequently.  Do you have a tool for recovering such artifacts?  Is it keeping pace with changes to the structure of the artifacts?  I am going to show you a way not only to recover the artifacts and parse them to generate more user-friendly output, but also to ensure you can stay up-to-date with changes facebook make to the structure of the artifacts.   This technique will also allow you to recover other chat/messaging artifacts such as Yahoo IM2.

What we need to do is recover ALL the json artifacts from the disk, save them to file, then write a script to parse out any messages.  I use the mighty and utterly essential bulk_extractor tool to recover the json artifacts (amongst many other things).  If you haven't used this tool then you absolutely MUST get hold of it; there are virtually no cases where I don't deploy it.  I will cover more uses for the tool in future posts (there is a Windows version with a nice GUI on the download page), but for now we'll look at the recovery and parsing of json data.  We can do this on the suspect system, having customised our boot disk and generated the .iso as per my PREVIOUS POST.  Alternatively you could run bulk_extractor against a disk image.
You will need to run bulk_extractor as root from the terminal.  There are lots of options for bulk_extractor; we are just going to run it with the json scanner enabled.
The command would be:
bulk_extractor -E json -o /home/fotd/bulk /dev/sda
The -E json option turns off all the scanners except the json scanner; we then specify an output directory and the physical device that we want to process.
Upon completion we find a text file called json.txt, which contains all the json strings, each preceded by the disk offset at which it was found.
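Each line of json.txt starts with the decimal disk offset, followed by the json string itself.  Purely as an illustration (the offset and field values below are invented, and real facebook strings are vastly longer), a line of interest might look something like this:

2147483648	{"msg_body":"braaains...","timestamp":1346321234000,"sender_fbid":123456789,"sender_name":"Joe Zombie", ... }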

The json strings can be very long, sometimes tens of thousands of characters in length... so I can't really show you any of the interesting strings here.   Ideally you want to view the strings without line wrapping; the gedit text editor can do this for you.  Trying to manually review the json and identify facebook artifacts is the road to insanity, so we'll script the processing of the json strings.   What we are going to do is search our json strings for facebook artifacts; the newer ones will have the string "msg_body" within the first 200 characters.   So we can read our json file, line by line, looking for the term "msg_body"; if a json string matches this criterion we will search it a bit deeper, looking for other landmarks in the string that are indicators of facebook artifacts.  We can use those landmarks as field separators for awk, to isolate structures in the string such as the message content, message date, author id etc.  Here is a chunk of code that is representative of the script:

jbproc () {
  # Process one json string; newer facebook messages carry "msg_body"
  # within the first 200 characters of the string.
  if test `echo $CHATFILE | head -c 200 | egrep -m 1 -o msg_body | head -n 1`
   then
     echo "new single message found"
     MSGTYPE=OFFLINE_MESSAGE
     # Landmarks in the json are used as awk field separators to isolate each structure
     SUBJ=`echo $CHATFILE | awk -F'5Cu003Cp>' '{print $2}' | awk -F'5Cu003C' '{print $1}'`
     UTIME=`echo $CHATFILE | awk -Ftimestamp '{print $2}' | awk -F, '{print $1}' | awk -F: '{print $2}'`
     # The json timestamp is unixtime x 1000, so divide by 1000 before converting
     HTIME=`date -d @$(($UTIME/1000))`
     # Strip commas from the message text so the csv columns stay aligned
     TEXT=`echo $CHATFILE | awk -Fcontent\ noh '{print $2}' | awk -F5Cu003Cp\> '{print $2}' | awk -F5Cu003C '{print $1}' | sed 's/,/ /g'`
     SNDID=`echo $CHATFILE | awk -Fsender_fbid '{print $2}' | awk -F, '{print $1}'| awk -F: '{print $2}'`
     SNDNME=`echo $CHATFILE | awk -Fsender_name '{print $2}' | awk -F, '{print $1}'| awk -F: '{print $2}'`
     RCPTID=NONE
     RCPNME=NONE
     MSGID=NONE
     # bulk_extractor prefixes each json string with the disk offset it was found at
     OFFSET=`echo $CHATFILE | awk '{print $1}'`
     echo "$SNDNME,$SNDID,$SUBJ,$TEXT,$MSGTYPE,$HTIME,$MSGID,$RCPTID,$RCPNME,$OFFSET," >> $OUTFILE
  fi
}
OUTFILE=FACEBOOK_MSGS.csv
echo "Sender Name,Sender ID,Msg Subject,Message Content,Msg Type,Message Date/Time,Message ID,Recipient ID,Recipient Name,Offset," > $OUTFILE
cat json.txt | while read CHATFILE ; do jbproc $CHATFILE ; done




The final line submits each line of our json.txt file to a function called jbproc.  The first line of the function checks to see if the term msg_body appears in the first 200 characters of the line.  Note that we pipe the result of egrep to the head command; if we didn't do this and our test found two instances of "msg_body" then our script would fall over, as the test command will only accept a single result.  The rest of the script is fairly straightforward.  In the TEXT variable you want to make sure you remove any commas, as the output is going to a comma delimited file - otherwise your formatting is going to be messed up.   The time stamp in the json is a unixtime value multiplied by a thousand, so we need to divide the number by 1000 then convert the value to human readable form with the date command.  All our variable values are echoed out to our spreadsheet.   The code snippet above just deals with one type of facebook artifact; you can download the full script that processes all the various facebook artifacts HERE.  Save the script into /usr/local/bin, make it executable, then change into the directory containing your json.txt file, run the script and you will find a spreadsheet containing all the parsed output in the same directory.
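To make that concrete, assuming you have saved the downloaded script as fbparse.sh (the name is just a placeholder - use whatever you called it), the steps are along these lines:

sudo cp fbparse.sh /usr/local/bin
sudo chmod a+x /usr/local/bin/fbparse.sh
cd /home/fotd/bulk     # the bulk_extractor output directory containing json.txt
fbparse.sh             # the spreadsheet appears in this directory when it finishes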

The big advantage of this approach is that if facebook change their json output, you can quickly see what the changes are by checking the bulk_extractor-generated json.txt file, then simply edit the script to reflect them.
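A quick way to eyeball the current structure is to pull the first few matching strings straight out of json.txt at the terminal; something along these lines does the job:

egrep -m 5 msg_body json.txt | cut -c 1-300     # show the first 300 characters of the first 5 matching strings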

If you are going to use the script, let me know how the results compare to any other tools that you might be using to recover facebook artifacts.


Wednesday 29 August 2012

Customising your boot CD

Trying to roll your own version of Linux used to be more stressful than being trapped in a basement during the zombie apocalypse with someone who suddenly develops flu-like symptoms.   Mercifully, this is no longer the case (depending on what distro you are using); a set of 'remastersys' scripts is included on some distros to simplify the process.  I am assuming that you know some basic linux for this post!

The engine of my preview system is the CAINE bootable CD (version 2.0).   To create your own system you will need to get the dd image (if you want the simplest solution).  Once you have downloaded and unzipped the image, you just need to write the image to a 2GB thumbdrive (or larger).   Remember that you are sending the image to the physical device, not to a file in a file system on your thumbdrive; the first sector of the image needs to be written to the first sector of the thumbdrive.   To do this you can use any linux distro, just open a terminal and type:
sudo dd if=/path/to/image of=/dev/sdb
This assumes that your thumbdrive is identified as /dev/sdb. You will be prompted for your password; type it in and wait several minutes until the copying is complete.   You now have a bootable thumbdrive that you can boot any computer with - assuming the BIOS supports USB booting.  You can go ahead and boot any machine with the thumbdrive; simply configure the BIOS to boot from the USB drive.
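If you are not sure what device name your thumbdrive has been given, it is well worth checking before you unleash dd on the wrong disk; either of these will point you in the right direction:

sudo fdisk -l     # list all attached disks - identify the thumbdrive by its size
dmesg | tail      # the most recently attached device shows up at the end of the kernel log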

Bear in mind that it is unlikely that any wireless card drivers are going to be available, so you should anticipate using an ethernet cable to access the internet.
Once the system is up and running, you can add and remove programs using the Synaptic package manager, or by downloading and installing programs manually.
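For instance, to pull a package down from the command line instead of using Synaptic (gedit is just an example here - it may well already be on the disk), something like:

sudo apt-get update
sudo apt-get install gedit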

Any scripts that I publish here, or that you download from other sites, I recommend putting in the /usr/local/bin directory.
Remember to make all scripts executable by changing the permissions, like this:
sudo chmod a+x /usr/local/bin/scriptname.sc
The root password is "caine".

All you need to do now is generate the .iso image so you can create a bootable CD of your distro.   You can do this with the remastersys scripts - there is a nice GUI available in the menu to help you along.   The screenshots show the path to the GUI and the resulting dialog boxes:
Make sure you select the modify option for the first run; we are going to have to configure remastersys with some parameters.   Once you select modify, the following dialog box appears:


What we need to do with the above options is select the directory where we want the .iso created.  The .iso will be about 600MB, although other temporary files will be written, so figure you will need 1.2GB of space.  You probably won't have space on your thumbdrive, so we can send the .iso image to a directory where we are going to mount an external drive or thumbdrive. The "working directory" field is the FULL PATH we want to send our .iso file to.  In my example, I have a directory at /stuff where I mount an external drive to receive the .iso image.  It is VERY IMPORTANT that you also put the same path into the "Files to Exclude" field - otherwise all the data on your external drive will be included in the .iso!!  Give your .iso file a name in the Filename field, then select the "Go back to main menu" option and you will be taken to this dialog box:
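For what it's worth, the GUI is simply writing these values into the remastersys configuration file (/etc/remastersys.conf on the versions I have used), so you can set them there directly if you prefer; with my /stuff example the relevant lines end up looking something like this:

WORKDIR="/stuff"
EXCLUDES="/stuff"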



Select the dist option and press OK.   A terminal will open and you will be able to follow the progress of your .iso being created.  You will find the .iso in a directory called "remastersys" in the path that you specified previously.  Make sure that you have mounted your external drive in READ/WRITE mode at the directory you specified in the settings.
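If you would rather skip the GUI, the equivalent from the terminal (the device name for the external drive is just an example) is along these lines:

sudo mount /dev/sdc1 /stuff      # mount the external drive read/write at the working directory
sudo remastersys dist            # build the .iso, which lands under /stuff/remastersys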

Your .iso image can now be burned to as many CDs as you like.

Now I must feed...but when my grim business is done we will start on some scripting.


Forensic previewing with Linux

I s'pose the first question is why do any form of previewing?   Most LE agencies that I know of experience very high volumes of requests for digital forensic services, which often leads to backlogs of cases.  To combat this, agencies have adopted a wide variety of responses, including combinations of outsourcing, prioritising cases, triaging, introducing KPIs, and setting policies to only view the files in the live set on some systems.   All of these approaches are practical solutions; however, they do have drawbacks such as cost or the risk of evidence being missed.   The squeeze on budgets is already impacting many agencies' ability to outsource cases, hire more staff or buy new hardware/software.  My approach is to make sure every piece of digital storage is processed, using open source tools.  Often my processes go deeper than what is done during a "full" forensic examination in many labs.   Costs of this approach are minimal - some external drives and some 4-port KVM switches so that many systems can be previewed in parallel.   You really want to make sure that your costly forensic tools are being focussed on media that is known to contain evidence.   Of course, the commercial forensic tools (many of which I like A LOT!) do give you the ability to do some previewing; however, this ties up your software dongles and your forensic workstations.  You also need to be at the keyboard, as the approach often involves configuring a process, running it, then configuring the next process and running it and so on...

I leverage the processing power of the suspect system, processing it in a forensically sound manner by booting the suspect system with a forensic CD then running a single program to do the analysis that I require, selecting various options depending on circumstances.  The processing may take many hours (sometimes up to 18 hours); however, my forensic workstation and software are free to tackle systems that I have previously processed and found evidence on.  I can view the output from my previewing in a couple of hours and establish whether there is evidence on the system or not.   Therefore, I overcome the problem of potential evidence being missed that exists in some other approaches to reducing backlogs of cases.  Most linux forensic boot disks can be installed to a workstation to process loose hard disks and USB storage devices.  Ultimately the majority of storage devices can be eliminated from the need to undergo costly and time-consuming forensic examinations; only disks/media known to contain evidence are processed, and everything that is seized gets looked at... double gins all round.

So, that is about as serious and po-faced as this blog is going to get.   The next post will look at customising your forensic boot CD for your own needs (it's A LOT simpler than most people realise).

Tuesday 28 August 2012

Dawn of the forensicator of the dead

YALG!  Yet Another Linux Geek blogging.   I'm afraid I no longer have a functioning cerebrum so can't compete with most of the very intelligent (and annoyingly alive) linux DFIR types out there.  Hopefully this blog will have a vague appeal to those who have dipped their toe into the murky world of Linux and then pathetically run away screaming at the incomprehensibility of it all, or indeed those who haven't yet had the sheer life-affirming joy of spending years in the trenches of front line forensics and want to know a smidgeon more.
 
I have developed a system of "enhanced previewing" of computer systems and storage devices that allowed my team to get rid of the soul-destroying weight of computer backlogs.  As it is a Linux system unencumbered with dingle-dangle-dongles and suchlike, it can be deployed on as many systems as space allows.   You simply boot the suspect's machine with the CD or thumbdrive, and then in a forensically sound manner (or at least I hope it is, or I am going to look VERY foolish) plunder the drive for potential evidence, which is then exported out to an external drive, by invoking a single command (or pressing an icon for you gui jockeys!).   The output can be viewed on any PC; those machines that don't contain any evidence can be eliminated from further examination.   Those that do contain evidence can then be tortured by the forensic tools of your choice - however, the granular nature of the output of my system means that you know where to look for the evidence, e.g. in zip files in unallocated space, and can therefore do your forensic evidential recovery much more quickly.   The system is essentially a Linux boot CD which I have customised + 50,000 lines of ham-fisted bash scripting wot I writ.
  
I have learned many painful lessons (at least they would have been painful if I had a working parietal lobe) and some interesting stuff.  So if you want to know how to recover specific file types PROPERLY, identify encrypted data, recover encryption keys, review hundreds of hours of movie footage in minutes, classify files according to the language that they are written in, process all types of email, recover facebook artifacts reliably and loads of other stuff, then stay tuned.  I will be sharing bash/shell code that you can mock unrelentingly (and possibly use in your cases).   So there will be some very basic stuff and some more challenging stuff... now what first?