Sunday, 19 May 2013

Unwinding the dead

Your humble zombie forensicator has been quiet lately.  This was partly due to the prolonged bitterly cold conditions that generally render us un-ambulatory until the thaw.  However I have also been trying to get my python chops in order and have been studying the iTunes backup format.
So, I have written some scripts to help parsing the iTunes backup that I am going to share with you.

I have had a run of cases where significant information has been found in the iTunes backups on computers that I have looked at.  If you weren't aware, owners of iPhone/iPad/iPod mobile devices can hook them up to their computers for backing-up purposes.   The resulting backups are a nuisance to process - all the files have been re-named based on hash values and have no file extension.  However, the backups contain significant amounts of data that can be useful to forensic examiner.   The main files of note are generally sqlite databases that can contain chat/messaging/email messages.  There are a number of popular chat/messaging/email apps on the iphone that appear over and over in the backups that I have looked at, they are generally:
skype
whatsapp
yahoo
Apple sms
google email

Luckily, there is lots of info on the 'net about how to deal with the sqlite databases that are created by the iPhone apps.   You should really start with the utterly essential Linux Sleuthing blog.  There are an essential series of articles on processing sqlite databases.   I suppose the main thing to bear in mind when dealing with sqlite databases is NOT to rely on automated tools.  By all means, use them to get an overview of the type of data available, but the most effective approach is to process them manually.   Understand the table and schema layouts and execute your own queries to extract data across tables.  Often automated tools will just dump all the data for each table sequentially.  It is only by doing cross table queries that you can marry up phone numbers to the owners of those phone numbers or screen names to unique user names.  There is no way I can improve on the info available in the Linux Sleuthing blog so please visit him to get the skinny on sqlite queries.

The main problem with processing iTunes backups is that there are so may apps that may be used for messaging/chat/email that it nearly impossible to keep up with what ones are out there, what format they store their data in and where it can be found in the backup.   The first step for me in examining an iTunes backup is to try and establish what kind of device it is that has been backed up.  This is fairly simple, you just need to view the contents of the Info.plist file which is XML so it can be viewed in any text viewer.  We know that XML is simply a series of key-value pairs, so you just need to find the "Product Type" key and look at the corresponding value.   If you can't find the Info.plist then look for the Manifest.plist.   This is in more recent backups and is a binary plist file.   Just download the plutil tool and generate the the Info.plist like this:
plutil -i Manifest.plist -o Info.plist
You now have a text version of the Manifest.plist which you can examine in a text editor.

Next thing to do is recreate the original directory structure of the backup and restore the files to their original locations and file names.
The whole point of these backups is that the iTunes software can restore it your original device in the case that your device encounters a problem.  So how does iTunes know the original directory structure and file names of the files in the backup?   When the initial backup is performed a file called Manifest.mbdb is created.  This is a binary database file that maps the original path and file names to their new hash-based file name in the backup.   So, all we need to do is create a text version of the Manifest.mbdb, so we can read it and understand original structure on of the backup.   We can generate our text version of the Manifest database with THIS python script.

The it is a case of recreating the original directory structure and file names - more on this later.

Once we have the original directory/file structure we can browse it in our file browser of choice.   It is important to remember that not all chat is saved in sqlite databases.   Some services such as ICQ save the chat in text based log files.  To complicate matters there is one format for the ICQ-free chat service and anther format for the ICQ-paid chat service.   The ICQ-free chat logs appear to be in a psuedo-JSON type format and not at all easily readable.   I say psuedo-JSON type because my python interpreter could not recognise it as JSON (using the json module) and even online JSON validators stated that the data was not valid JSON.   The ICQ-paid logs are a bit more user friendly and very interesting to forensicators.  They appear to be a record of everything that happens within the ICQ environment once the ICQ app is launched.  This means that the chat messages are buried a long way down in the log file and are not straight forward to read.   No matter, both ICQ formats were amenable to scripting via a couple of python scripts that I wrote.

As a salutary example of not relying on automated tools.  I processed an  iTune backup of an iPhone in a commercial forensic tool (mainly used for processing mobile phones).  In the "chat" tab of the output of the commercial tool, it listed 140 ICQ-Free chat messages and 0 ICQ-Paid chat messages.  Using my technique on the same backup  I recovered 3182 ICQ-Free chat messages and 187 ICQ-Paid chat messages.

Having warned against relying on automated tools, I have produced an err....automated tool for processing the itunes backups in the way I have described.  It comprises one bash script + 2 python scripts.  What it does in "unbackup" the backup and executes sqlite queries on some of the more well known sqlite databases.  However, it also copies out any processed sqlite databases and any unprocessed databases so that you can manual interrogate them.

So, I'll let you have the scripts if you promise to manually process the sqlite databases that are recovered (and visit the Linux Sleuthing blog)!   BTW, if you do find any interesting databases, if you could let me know the path, file name and sqlite query, I can add the functionality to my script.

My project is hosted at google code, HERE

I have, of course, added the scripts to my previewing home brew, so now any iTunes backups should be automatically detected and processed.