A study on metadata tagging for tracking original file information within the cloud

Richard Crossley 1, Eleana Asimakopoulou 1, Stelios Sotiriadis 2,1, Nik Bessis 1

1 School of Computing & Maths, University of Derby, Derby, United Kingdom

2 Dept. of Electronic & Computer Engineering, Technical University of Crete (TUC), Chania, Greece

Abstract— This paper discusses cloud computing data storage and the impact of metadata tagging as a potential method for tracking original file information. We focus on current digital forensic methods and approaches and their applicability to cloud computing. In particular, we explore works that discuss cloud computing in relation to digital forensics for disaster management cases. We propose possible solutions to the problems facing a forensic examiner who analyzes Exchangeable Image File (EXIF) data in .jpg images that have been stored within the cloud, in order to identify file information in a disaster management emergency situation.

Keywords— Cloud computing; Exchangeable image file formats; Virtualization; Metadata tags; Forensics for disaster management

I. INTRODUCTION

Cloud computing has had a massive surge in popularity over the past few years [9]. This is set to increase because of a combination of ease of use from a user perspective and potential extra revenue for the business hosting the data. All of the major companies have started to focus on encouraging users to adopt cloud capacity [11]. In terms of storage, providers offer users a free allowance (e.g. 7 GB) with the option to purchase additional storage on a monthly or annual basis. In the past year there has been a large increase in the number of companies offering cloud computing.

Another paradigm, the so-called metadata description, has proven efficient in various cases, from data descriptions to job scheduling [12]. In [13], metadata descriptions have been used to identify cloud resources in decentralized settings. In [14], such descriptions have been used in message exchanging to achieve optimized performance in clouds. The work of [15] extends this study to include the optimization of energy efficiency through efficient message distributions that use metadata for identification. Metadata tools are likewise vital for computer and mobile phone examiners, as they expose hidden properties of files that can provide critical information in an investigation and can help prove ownership of a file. This includes information such as where the file was created and a plethora of other evidence. The author of [3] defines metadata as follows: ‘Metadata is a structured description of objects, which contains certain properties useful to the user as well as the program on which the document was created. More succinctly, metadata is data that describes other data’ [3]. This paper focuses specifically on how cloud storage affects the Exchangeable Image File (EXIF) format for JPEG and TIF files, in which ‘when a digital picture is taken, information about the camera, such as model, make, and serial number, and settings, such as shutter speed, focal length, resolution, date, and time, are stored in the graphics file’ [3].

Computer and mobile phone forensics has been fine-tuned over the years, and although practitioners may use a variety of different tools and techniques, there have been formal investigative departments since the 1970s. As a result, formal guidelines defined by the Association of Chief Police Officers (ACPO) [1] have been in place in the United Kingdom. These are summarized below. First, no action taken by law enforcement agencies or their agents should change data held on a computer or storage media which may subsequently be relied upon in court. Secondly, in circumstances where a person finds it necessary to access original data held on a computer or on storage media, that person must be competent to do so and be able to give evidence explaining the relevance and the implications of their actions. Thirdly, an audit trail or other record of all processes applied to computer-based electronic evidence should be created and preserved. An independent third party should be able to examine those processes and achieve the same result. Fourthly, the person in charge of the investigation (the case officer) has overall responsibility for ensuring that the law and these principles are adhered to.
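The third principle (a preserved audit trail that a third party can re-examine) can be illustrated with a small sketch. The ACPO guide does not prescribe any data structure; the hash-chained log below, and the names AuditTrail, record and verify, are assumptions made purely for illustration. Each entry stores the SHA-256 hash of the previous entry, so a later alteration of any recorded step becomes detectable.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(body: dict) -> str:
    # Hash a canonical JSON encoding so the result is reproducible.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass
class AuditTrail:
    """Hypothetical append-only log; each entry commits to the one before it."""
    entries: list = field(default_factory=list)

    def record(self, examiner: str, action: str, timestamp: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"examiner": examiner, "action": action,
                "timestamp": timestamp, "prev": prev_hash}
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; False if an entry was altered or reordered."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

An independent reviewer who receives the log can call verify() to confirm that the recorded sequence of actions has not been edited after the fact; it does not, of course, prove that the actions themselves were carried out correctly.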

These guidelines have been refined over the years. Although operating systems and hardware have changed fundamentally since the 1970s, and examiners now regularly deal with 2 or 3 terabytes (TB) of data [5] as opposed to a maximum of about 70 megabytes (MB), the fundamental way this data is stored has remained constant. An in-depth explanation of what a hard drive is made up of and how it works can be found in [4]. As a result, computer forensic examiners have been able to follow these guidelines to achieve forensically sound evidence that is suitable for presentation at Crown Court. The development of cloud computing, however, has completely changed the way users store data.

Thus, in Section II we present a review of how metadata tagging (specifically EXIF data) is currently used in forensic examination. In Section III we discuss how cloud computing stores data and the impact of this upon traditional computer forensic methods. In Section IV we present the challenges posed by cloud computing with regard to metadata tagging. We identify such needs in cases of physical disaster management.

2013 Eighth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing

978-0-7695-5094-7/13 $31.00 © 2013 IEEE

DOI 10.1109/3PGCIC.2013.76

II. HOW METADATA (EXIF) IS USED IN COMPUTER FORENSICS EXAMINATIONS

EXIF data is of fundamental importance to a computer or mobile phone examiner as it provides crucial information such as the source, location, date, time, device and software version of a file. Before examining any file for EXIF data, it is important to realize that a forensic examiner would usually have created an exact one-to-one copy of the whole storage media using a forensic acquisition tool (referred to as imaging a drive). The most common tools for carrying out this process are FTK by AccessData and EnCase by Guidance Software. Imaging is carried out so that a forensic examiner can adhere to the first two principles of the ACPO guidelines [1].

By imaging the storage media, the original media is preserved in the same condition as when it was seized, and the examiner has a complete copy of all of the user’s files and of the operating system files. While this paper is focused upon EXIF data from digital images, it is important to understand that registry entries, log files, temporary files and the ability to recover deleted files from unallocated space are all important in forensic examination, and they can only be accessed by working from a physical acquisition (a bit-for-bit copy) of the storage media.
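The paper does not say how an examiner confirms that an image really is a bit-for-bit copy. One widely used check is to compare cryptographic hashes of the source and of the image. The following is a minimal sketch under that assumption; the function names, the choice of SHA-256 and the idea that both source and image are readable as ordinary files are illustrative choices, not a description of how FTK or EnCase work internally.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_source(source_path: str, image_path: str) -> bool:
    """True when both files have identical content, as judged by SHA-256."""
    return sha256_of(source_path) == sha256_of(image_path)
```

Recording both hash values at acquisition time also gives a third party a simple way to repeat the comparison later, which fits the audit requirement of the third ACPO principle.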

To document how the EXIF data contained within an image can be of use to a forensic examiner, an image was taken using a mobile phone camera with a built-in Global Positioning System (GPS) receiver and examined using the freely available tools WinHex (version 16.7) and IrfanView (version 4.35). While a normal user will see just the image, a forensic examiner would use either EnCase or FTK to view the .JPG or .JPEG file in hexadecimal view to see the EXIF data and gain access to a wealth of information. Both of these tools allow the examiner to create reports on these images and are forensically verified so that their output can be used as evidence in legal proceedings; they also contain a vast number of other forensic features but are expensive. By using a combination of WinHex and IrfanView similar results can be achieved, but they would not be admissible as evidence because these tools are not forensically verified; for the purpose of this paper they will prove adequate. Fig. 1 shows the image when viewed in WinHex, which gives access to the file type (the first four bytes of the .jpg file, FF D8 FF E0 in hexadecimal, are displayed as ÿØÿà in the text column), the make and model of the camera, and the date and time the image was created.

In the case of this image the naming convention is taken from the time and date contained within the EXIF data. It is important to remember that while the actual name of the .jpg is easily edited by the user, the metadata contained within it is much less likely to be altered by a typical user, which is why it can prove so valuable to a forensic examiner. Time and date information is simultaneously one of the most useful and most problematic tools in a forensic examiner’s arsenal: whilst it can be key information, it is set from the clock settings of a computer or device, which can be manually adjusted. However, one welcome side effect of the increasing adoption of smartphones (and now cameras) based upon the Google Android or Apple iOS operating systems is that these devices obtain time and date information from the mobile network (or GPS) automatically by default. This reduces the number of users who input incorrect time and date information or leave default settings active (which are dictated by the product’s manufacture date or a company default).

Fig. 1. Image view in WinHex
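What a hex viewer shows at the start of a .jpg file can also be checked programmatically. The sketch below is a simplified illustration, not the method used by WinHex or by any forensic suite: it confirms the FF D8 start-of-image marker and then walks the marker segments that precede the compressed image data, looking for an APP1 segment (marker FF E1) whose payload begins with the identifier Exif followed by two zero bytes. The function name find_exif_segment is hypothetical, and the walk assumes every segment before the start-of-scan marker carries a two-byte length, which holds for typical camera files.

```python
import struct

JPEG_SOI = b"\xff\xd8"        # start-of-image marker
APP1 = 0xE1                   # segment type that normally carries EXIF
START_OF_SCAN = 0xDA          # compressed image data follows this marker
EXIF_ID = b"Exif\x00\x00"     # identifier at the start of an EXIF payload


def find_exif_segment(data: bytes):
    """Return the raw EXIF payload from JPEG bytes, or None if absent."""
    if not data.startswith(JPEG_SOI):
        raise ValueError("not a JPEG: missing FF D8 start-of-image marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break                         # not at a marker: stop scanning
        marker = data[pos + 1]
        if marker == START_OF_SCAN:
            break                         # metadata segments are finished
        # The big-endian length counts its own two bytes but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == APP1 and payload.startswith(EXIF_ID):
            return payload[len(EXIF_ID):]
        pos += 2 + length
    return None
```

The returned bytes are the TIFF-structured block in which make, model, and date and time tags are stored; decoding the individual tags is a further step that is not attempted here.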

When the same .JPG file is viewed in IrfanView, a large amount of additional information contained within the EXIF data can be viewed, including everything from Flash and Light Source data to GPS information. One of the biggest developments in recent years has been the inclusion of GPS within mobile devices, which opens new opportunities for a forensic examiner. Whether this facility is enabled by default depends on the device, but if it is enabled then the location information contained in the EXIF data can be used. The process of adding location information to media and embedding it within the image data is known as Geotagging.
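EXIF stores a geotag as degrees, minutes and seconds, each written as a rational number (a numerator and denominator pair), together with a reference letter: N or S for latitude, E or W for longitude. Mapping tools generally expect signed decimal degrees, so a conversion step is needed. The following is a minimal sketch of that conversion; the function name dms_to_decimal and the tuple layout of its input are assumptions for illustration.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref: str) -> float:
    """Convert ((deg_num, deg_den), (min_num, min_den), (sec_num, sec_den))
    plus a hemisphere letter into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value          # south and west are negative by convention
    return float(value)
```

For instance, a hypothetical longitude of 1 degree 30 minutes west, written as ((1, 1), (30, 1), (0, 1)) with reference "W", converts to -1.5.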

A Google Maps search compared against the Geotagging information contained within the EXIF data shows a certain amount of variance, but the information is generally quite accurate. Using Geotagging information is relatively new in forensics and depends on it being enabled; until recently Geotagging was automatically enabled on the Apple iPhone. During one author’s time as a mobile phone forensic examiner with the Derbyshire Constabulary, it was used to provide evidence that an offender who was not allowed within 100 meters of any public area that may contain children had taken a picture of a slide in a local park. While proof that the owner was using the device was still required, the offender had already admitted to sole usage, so this shows the potential value of Geotagging information to a forensic examiner. Another relevant scenario is that of a child (as a dependent person) who is lost in a particular space (e.g. a shopping center). Here, Geotagging combined with original file tracking can help trace the dependent person [10].
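The 100 meter restriction in the example above amounts to a distance test between the geotag of a photograph and the location of a protected site. A minimal sketch of such a test is given here using the haversine great-circle formula; the function names, the mean Earth radius constant and the default radius are illustrative assumptions, and nothing here describes the procedure actually followed in the case mentioned.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_exclusion_zone(photo, protected_site, radius_m: float = 100.0) -> bool:
    """photo and protected_site are (latitude, longitude) pairs."""
    return haversine_m(*photo, *protected_site) <= radius_m
```

Because consumer GPS readings show the variance noted above, a result close to the threshold should be treated as indicative rather than conclusive.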

EXIF data (with specific reference to .JPG images), and metadata for all files more generally, is critical to traditional computer and mobile device forensic examination for proving ownership, date, time and location, and for many other parts of the digital forensic process, as can be seen from the example provided. When extracted using forensic software following the ACPO guidelines [1] and used in combination with registry entries and log files, EXIF data provides evidence whose validity and integrity are very difficult to question, and evidence recovered from seized media has been the mainstay of digital forensics since the 1970s. The way this evidence is recovered and presented has been refined to as near perfection as can be achieved. However, computer forensics is very much a reactionary industry, because the evidence it provides must be beyond question, and all of its established methods and ideology are about to be challenged by cloud computing.

III. HOW CLOUD COMPUTING DATA STORAGE MOVES THE DIGITAL FORENSIC GOALPOSTS

Regardless of which cloud provider a user decides to adopt, they all work by virtualizing user data so that it is protected by hosting the information remotely across different datacenters and even different sites. The data storage is transient and constantly being moved around, and a large number of different users will be contained on each server. This makes the traditional forensic method of seizing an offender’s storage media and creating a bit-for-bit copy impossible, and additionally there is no ability to access registry entries, log files or temporary files. In fact, because storage is the premium product of the cloud, any temporary files are very quickly discarded to free the space for a different user.

The whole process of how cloud storage operates, including the virtualization process, is described in detail in [6]. The authors observe that ‘although cloud computing has many benefits to offer, there is still a degree of speculation over its security (or lack of security). More particularly, there are still questions to be answered relating to its ability to support forensic investigations’ [6]. This is especially important as current guidelines (referring to ACPO) state that ‘digital evidence (once gathered) must satisfy the same legal requirements as conventional evidence’ [6], with set requirements that it must be authentic, reliable, complete, believable and admissible. Providing the required certainty on each of these points is more difficult when dealing with data stored in the cloud. This is further complicated because ‘cloud computing has carefully considered security and indeed it was forced to right from the very outset. The reason for this is that security is an essential requirement for any IT application’ [6]. However, ‘on the other hand, computer forensics or forensic readiness is not an essential requirement, it is seen as more of a luxury’ [6].

The work of [7] takes a slightly different approach and focuses on the impact that cloud computing forensics will have on the individual. The authors state that ‘when a user stores some sensitive information in a cloud, the confidentiality of these sensitive information is of concern to the user’ [7] and that ‘Without any protection on these sensitive information, e.g., personal financial information, health records, a user won’t have confidence in storing his/her sensitive information in cloud’ [7]. This issue of trust is a major factor in the long-term success of cloud computing, as ‘Besides the confidentiality of these sensitive information, the user’s identity privacy, a fundamental right to privacy, is also expected in cloud computing’ [7]. Law-abiding users of cloud computing should have every right to expect that these statements are adhered to when the issues related to cloud computing, and the potential for collateral intrusion caused by virtualization of data within the cloud, are tackled. The work of [7] goes on to state that in an ideal world cloud computing would adopt a model of ‘anonymous authentication’ to provide user identity privacy, but that this approach is ‘a two-edged sword’: if ‘a group of users are authorized to some financial computing or data-intensive scientific collaborations in a cloud, if an important data modified by someone is disputed, it is hard to track the real user due to complete anonymous authentication’ [7]. With this in mind, ‘cloud computing should also provide provenance to record ownership and process history of data objects in cloud in order for wide acceptance to the public’ [7].

Both of these papers take broad perspectives on how cloud computing will impact traditional computer forensic methods but make little mention of the specific impact of cloud storage upon metadata. On the topic of metadata in cloud computing, [2] states that ‘digital provenance, meaning meta-data that describes the ancestry or history of a digital object, is a crucial feature for forensic investigations. In combination with a suitable authentication scheme, it provides information about who created and who modified what kind of data in the cloud’ [2]. However, the same work notes that ‘unfortunately, the aspects of forensic investigations in distributed environment have so far been mostly neglected by the research community’ [2].
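The notion of digital provenance quoted from [2] (metadata describing the ancestry of an object, including who created and who modified it) can be made concrete with a small data structure. The sketch below is an assumption-laden illustration rather than the scheme proposed in [2] or [7]: each record names an object version, the actor, the action and the version it was derived from, and the hypothetical function ancestry walks back from a given version to the original.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    object_id: str                   # identifier of this version of the object
    actor: str                       # authenticated identity that acted
    action: str                      # e.g. "created" or "modified"
    parent_id: Optional[str] = None  # version this one was derived from


def ancestry(records: Dict[str, ProvenanceRecord], object_id: str) -> List[ProvenanceRecord]:
    """Return the chain of records from object_id back to the original."""
    chain = []
    seen = set()
    current = object_id
    while current is not None and current not in seen:
        seen.add(current)            # guards against a malformed cyclic chain
        record = records[current]
        chain.append(record)
        current = record.parent_id
    return chain
```

In a real deployment the actor field would have to come from the authentication scheme that [2] mentions, and the records themselves would need protection against tampering; neither is modeled here.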

Most computer and mobile phone users have not yet made a total migration of storage to cloud computing, as a result of the premiums currently charged by cloud providers. Nevertheless, with the increasing popularity of the Google Chromebook (at the time of writing the number 1 selling laptop computer on Amazon) and Samsung’s deal with Dropbox to provide Galaxy S III users with 50 GB of free storage for two years, the shift has already started. What becomes obvious is that, despite a large number of theoretical papers on the subject, there has been little active engagement between the law enforcement and forensic community and cloud providers, who in turn have given little thought to the social and ethical implications of the way cloud computing might be exploited by the criminal fraternity.

IV. TACKLING THE CHALLENGES POSED BY CLOUD COMPUTING TO METADATA TAGGING

The main focus of this paper has been the theoretical effect that cloud computing has upon metadata tagging as a potential method for tracking original file information. This would not be possible without a basic explanation of metadata (specifically EXIF data) itself, how it is used by a forensic examiner, traditional forensic protocol, and how cloud storage impacts upon all three. The next stage will be to turn the theoretical into the practical: a large number of images containing EXIF data from a variety of different sources will be uploaded to each of four cloud providers. The files will be left on these servers for a defined period of time and then downloaded directly from each. They will then be viewed in IrfanView and WinHex and the results compared to see the impact upon the EXIF data. If the EXIF data from the original file is retained by some or all of the cloud servers, the validity of the file as evidence will then be discussed.
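The comparison step of the planned experiment could be automated once the tags of the original and of the downloaded file have been read out by whichever tool is used. The sketch below assumes the tags are already available as plain dictionaries mapping tag names to values; the function name compare_exif and the four report categories are hypothetical and are not part of the study design described above.

```python
def compare_exif(original: dict, downloaded: dict) -> dict:
    """Classify each tag by what happened to it during the cloud round trip."""
    report = {"retained": [], "changed": [], "stripped": [], "added": []}
    for tag, value in original.items():
        if tag not in downloaded:
            report["stripped"].append(tag)   # removed by the provider
        elif downloaded[tag] == value:
            report["retained"].append(tag)   # survived unchanged
        else:
            report["changed"].append(tag)    # present but with a new value
    report["added"] = [tag for tag in downloaded if tag not in original]
    for tags in report.values():
        tags.sort()                          # stable order for reporting
    return report
```

Running the same comparison for every image and every provider would give a simple table of which tags each service retains, alters or strips, which is the evidence the later discussion of validity would rest on.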


Although EXIF data has been the main focus of this paper, the ability to recover files from the cloud and the conflicts this causes with traditional forensic methods have also been highlighted. Therefore, a software solution to recover EXIF data from .JPG files held on these cloud servers will be investigated, as in most cases it will not be possible to seize and isolate a single user’s data because of virtualization. It may be that ‘there is no foolproof, universal method for extracting evidence in an admissible fashion from cloud-based applications, and in some cases, very little evidence is available to extract’ [8]. However, this does not mean that a tool for extracting whatever information can be retrieved, within a possibly limited time window, would not prove of value.

The final step in investigating metadata tagging as a potential method for tracking original file information within the cloud should be the simplest but will actually prove the most challenging: encouraging communication between the cloud service providers and the law enforcement community to highlight the importance of metadata to digital forensic examinations. If this does not prove possible, then the law enforcement community will need to realize that it cannot follow a reactionary path; it must be proactive in testing and developing software solutions and guidelines that are ready for the shift to cloud storage, as the currently available choices are inadequate.

V. CONCLUSION AND FUTURE STEPS

Investigation of metadata tagging as a potential method for tracking original file information within the cloud is a challenging but important area of research for digital forensic investigation. The information contained within this paper highlights the need for further work in this field, especially as the majority of the papers quoted provide mainly a general theoretical approach to the topic. If the points raised in Section IV on tackling the challenges posed by cloud computing to metadata tagging can be proactively addressed, then the potential complications caused by cloud storage in metadata (EXIF) forensic examination can be limited.

REFERENCES

[1] ACPO. (2007). Good Practice Guide for Computer-Based Electronic Evidence. Retrieved January 13, 2013, from http://www.7safe.com/electronic_evidence/ACPO_guidelines_computer_evidence.pdf

[2] Birk, D., Wegener, C. (2011). Technical Issues of Forensic Investigations in Cloud Computing Environments. Retrieved January 13, 2013, from http://code-foundation.de/stuff/2011-birk-cloud-forensics.pdf

[3] Jones, J. (2006). Document Metadata and Computer Forensics. Retrieved January 13, 2013, from http://www.infosec.jmu.edu/reports/jmu-infosec-tr-2006-003.pdf

[4] Kruse II, W.G., Heiser, J. G. (2007). Computer Forensics Incident Response Essentials. Indianapolis, USA. Addison-Wesley.

[5] Nelson, B., Phillips, A., Steuart, C. (2010). Guide to Computer Forensics and Investigations 4th edition. Boston, USA. Cengage Learning.

[6] Sotiriadis, S., Bessis, N., Huang, Y., Sant, P. and Maple, C. (2010). Defining Minimum Requirements of Inter-collaborated Nodes by Measuring the Weight of Node Interactions, 4th International Conference on Complex, Intelligent and Software Intensive Systems (CISIS-2010), 15th-18th February, Krakow, ISBN: 978-0-7695-3967-6/10, p.p.: 291-298

[7] Rongxing Lu, Xiaodong Lin, Xiaohui Liang, and Xuemin (Sherman) Shen. Secure Provenance: The Essential of Bread and Butter of Data Forensics in Cloud Computing. Retrieved January 13, 2013, from http://bbcr.uwaterloo.ca/~rxlu/paper/asiaccs185-lu.pdf

[8] Slusky, L., Partow-Navid, P., Doshi, M. Cloud computing and computer forensics for business applications. Retrieved January 13, 2013, from http://www.aabri.com/manuscripts/11935.pdf

[9] Bessis, N., Asimakopoulou, E., French, T., Norrington, P., and Xhafa, F. (2010), The Big Picture, from Grids and Clouds to Crowds: A Data Collective Computational Intelligence Case Proposal for Managing Disasters, 1st International Workshop on Emerging Data Technologies for Collective Intelligence (EDTCI-2010), in conjunction with the 5th IEEE 3PGCIC-2010, Nov 4-6, Fukuoka ISBN: 978-0-7695-4237-9 pp: 351-356

[10] Asimakopoulou, E. and Bessis, N., (2011) Buildings and Crowds: Forming Smart Cities for More Effective Disaster Management, 5th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2011) Jun 30-Jul 2, Seoul ISBN: 978-0-7695-4372-7 pp: 128-135

[11] Sotiriadis, S., Bessis, N., Xhafa, F., and Antonopoulos, N. (2012) From Meta-computing to Interoperable Infrastructures: A Review of Meta-schedulers for HPC, Grid and Cloud, 26th IEEE International Conference on Advanced Information Networking and Applications (AINA-2012), Mar 26-29, Fukuoka ISBN 978-1-7695-4651-3 pp: 874-883

[12] Sotiriadis, S., Bessis, N. and Antonopoulos, N. (2011) Towards inter-Cloud Schedulers: A Survey of meta-Scheduling Approaches, 6th IEEE International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC-2011), Oct 26-30, Barcelona ISBN: 978-0-7695-4531-8 pp: 59-66

[13] Sotiriadis, S., Bessis, N. and Antonopoulos, N. (2012). Decentralized Meta-brokers for Inter-Cloud: Modeling Brokering Coordinators for Interoperable Resource Management, 9th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD'12), Chongqing, May 29 – 31 2012, ISBN 978-1-4673-0024-7/10, p.p.: 2475-2481.

[14] Bessis, N., Sotiriadis, S., Pop, F. and Cristea, V. (2012). Optimizing the Energy Efficiency of Message Exchanging for Service Distribution in Interoperable Infrastructures, 4th IEEE International Conference on Intelligent Networking and Collaborative Systems (INCoS-2012), September 19-21 2012, Bucharest, Romania, ISBN: 978-0-7695-4808-1, p.p.: 105-112

[15] Bessis, N., Sotiriadis, S., Pop, F. and Cristea, V. (2013). Using a Novel Message-Exchanging Optimization (MEO) Model to Reduce Energy Consumption in Distributed Systems, Simulation Modeling Practice and Theory, Elsevier, ISSN: 1569-190X
