CHAPTER II: RANDOM IMAGE STEGANOGRAPHY– REVIEW

2.1. INTRODUCTION

In the current corporate scenario, information security is of prime importance, because loss of information leads to financial and market loss, which in turn can end a business. Though safeguards such as cryptography, watermarking and steganography protect electronic communication channels against hackers, concern over data protection keeps growing in parallel with advances in electronic technology. This review analyzes the role, strengths and weaknesses of steganography and of different random image steganography techniques in protecting data, and explores how random techniques can be made smarter and more effective.

Gone are the days when images were only about memories of the past. Images now carry more than that because of the advent of image steganography [4], which embeds confidential information in images in a way that is imperceptible to the naked eye. From time immemorial, new techniques for clandestine communication have been given high importance, depending on the level of confidentiality required. Three interlinked techniques, namely cryptography [1], steganography and watermarking [2], form the base for secure communications. While cryptography makes the content undecipherable, the other two are information hiding methods in which the mere presence of the information is hidden [27]. Since these three techniques are interlinked and can be confusing for readers from other disciplines, it is better to distinguish cryptography, steganography and watermarking in the initial phase of the review. Table 2.1 details the differences among steganography, watermarking and cryptography.

A schematic classification of the existing information security disciplines is given in Fig. 2.1.

Table 2.1. Differences among Steganography, Watermarking and Cryptography

Property: Carrier, secret data, key and output
    Steganography: The payload is embedded in any digital media with an optional key, and the output is called the stego-file.
    Watermarking: The watermark is embedded in image/audio files, and the output is called the watermarked-file.
    Cryptography: The information is encrypted in text or image files, and the output is called the cipher-text.

Property: Selection of cover
    Steganography: Any cover can be chosen.
    Watermarking: Restriction in cover selection.
    Cryptography: N/A

Property: Objective and concern
    Steganography: Capacity is a major concern for the secret communication aided by steganography.
    Watermarking: Robustness is a necessary feature of copyright preservation.
    Cryptography: Robustness is essential for data protection.

Property: Detection and retrieval
    Steganography: The cover is not needed for recovery, and full retrieval of data is possible.
    Watermarking: Data is retrieved by cross-correlation, and the original cover is required for the same.
    Cryptography: Full retrieval of data without the need of the cover.

Property: Relation to cover and visibility
    Steganography: The information is not generally related to the cover and is never perceptible to normal human vision.
    Watermarking: Watermarks are sometimes visible to the human eye and usually become an attribute of the image.
    Cryptography: Due to encryption, we can easily know that there is hidden data, but deciphering is difficult.

Property: Attacks
    Steganography: Steganalysis detects the presence of information.
    Watermarking: Image processing aids in removal/replacement of watermarks.
    Cryptography: Cryptanalysis deciphers the encrypted information.

2.1.1. The triplets of security

Three technologies define the possibilities of data hiding. They are:

Cryptography

Steganography

Watermarking

Figure 2.1. Various sub-disciplines of information security
[Figure labels: INFORMATION SECURITY; STEGANOGRAPHY (LINGUISTIC, TECHNICAL; COVERS: AUDIO, VIDEO, TEXT, IMAGE; METHODS: TIME DOMAIN, FREQUENCY DOMAIN); WATERMARKING (FRAGILE, ROBUST); CRYPTOGRAPHY (SYMMETRIC KEY, PUBLIC KEY)]

Cryptography is a technique in which the secret message is encrypted and sent in an unintelligible format. It scrambles the confidential data in such a way that it appears to be gibberish to any unintended user. The confidential data to be communicated is transformed by a mixture of permutations and substitutions, and hence no third party other than the legitimate user can read the message. Furthermore, cryptography [1] can be carried out using a single key (symmetric key cryptography) or using two keys (public key cryptography). Symmetric key cryptography employs the same key for encryption and decryption of the plain text, whereas asymmetric cryptography, better known as public key cryptography, uses a public key to encrypt the plain text and a private key to decrypt it. The Data Encryption Standard (DES) is the best-known example of symmetric key cryptography; it uses a 56-bit key and a 64-bit input block with 16 rounds. Triple-DES, the International Data Encryption Algorithm (IDEA) and Blowfish are other well-known alternatives. Later, in 1976, Whitfield Diffie and Martin Hellman [28] introduced public key cryptography, in which it is computationally infeasible to deduce the private key from the public key; this separation of the two keys is what makes the scheme secure. The RSA algorithm, developed by Ron Rivest, Adi Shamir and Leonard Adleman in 1978, is another model of public key cryptography [29]. Public key cryptography is mainly used for encryption and for digital signatures [30, 31].
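The public key idea described above can be illustrated with a toy RSA computation. This is only a sketch under stated assumptions: the primes, the exponent and the message value are small textbook numbers chosen for illustration, the helper names `encrypt` and `decrypt` are hypothetical, and nothing here is secure or taken from the cited works.

```python
# Toy RSA with tiny primes, for illustration only (insecure by design).
p, q = 61, 53                 # assumed small primes
n = p * q                     # public modulus (3233)
phi = (p - 1) * (q - 1)       # Euler totient of n (3120)
e = 17                        # public exponent, chosen coprime to phi
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+)

def encrypt(m: int) -> int:
    """Encrypt with the public key (e, n)."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Decrypt with the private key (d, n)."""
    return pow(c, d, n)

message = 65
cipher = encrypt(message)     # anyone holding (e, n) can compute this
recovered = decrypt(cipher)   # only the holder of d can reverse it
```

Anyone may encrypt with the published pair (e, n), while recovery requires d, which in a real system is protected by the difficulty of factoring a much larger n.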

Steganography is the art of embedding confidential information within some other file, generally known as the cover [32, 33]. The main objective of steganography is to provide covert communication between two users such that an unintended person does not gain access to the information by just glancing at the cover file [5, 33]. Steganography differs from cryptography [27]: the latter scrambles the data, while the former hides its very presence. In this respect steganography can offer more security than cryptography, because an unintended user has no reason to suspect that a message is being sent, whereas with cryptography there will always be a suspicion that a message is being sent. Hence encrypted messages are more prone to be attacked or suppressed.

Watermarking is generally used for authentication and copyright protection [34-42]. It can be used to mark an image so that it is recognizable. A digital file can be marked so that the mark is intended to be visible (visible watermarking) or detectable only by its creator (invisible watermarking). The main purpose of watermarking is to prevent illegal copying or false claims of ownership of digital media. Cryptography, the earlier technique, and steganography, the more recent one, are used for private communication, usually peer to peer, whereas watermarking is employed one to many, i.e., the same watermark is embedded in many covers. Fingerprinting is a special type of watermarking that embeds a label and serial number to identify a unique copy among several. A number of surveys on watermarking methods [37-42] are available in the literature, each highlighting the growth of watermarking in a particular area: multimedia [38], wavelet transforms [39], digital images [40], authentication [41] and digital watermarking [42].

2.1.2. Characteristic comparison of the triplet

The common characteristic of steganography, cryptography and watermarking is that they transmit secret information in such a way that only the intended receiver can decrypt or extract the confidential data [6, 43]. These techniques, which were prevalent in ancient times, have been carried over to the digital world, where it has become very difficult to extract or detect the secret messages.

In the digital domain, steganography and watermarking are closely related and are extensively used in digital images, although they have other uses as well. Neither can exist by itself; both require cover objects. Steganography requires a cover medium to carry the secret information, and watermarking requires a carrier object, which is the object intended to be protected. These similarities link the two techniques, and some modifications can turn one technique into the other. Because of these similarities the two are difficult to tell apart, but there is a remarkable difference between them.

Cryptographic systems fall into two classes: secure or unbreakable systems (e.g. the one-time pad) and systems that are breakable in principle (e.g. RSA). With both, the fact that communication is taking place is known to all. However, cracking the code is time consuming and often fruitless. The robustness of the code lies in the difficulty of reversing it by trying different permutations and combinations. Due to this robustness, cryptography is used for security purposes such as online shopping and banking: the credit card number, expiry date and other crucial information are encrypted before being sent so that an unintended user cannot access the details.

Steganography offers high carrier capacity while keeping the embedded message invisible, thus maintaining the fidelity of the cover media. A steganographic method is effective only if an observer cannot tell that the media file has been altered for embedding. If a malicious user learns that there is some alteration, the steganographic method is defeated. The embedded message is very fragile; if any modification is made to the stego image, the whole secret message can be corrupted. The effectiveness lies in the ability to fool an unintended user. There can be more than one layer of communication: a secret message can be embedded in a digital image, which in turn can be embedded within another digital medium such as a video clip.

Watermarking is required for the authentication and copyright protection of digital files. The watermark is embedded in an object in such a way that it is impossible to remove completely. If the embedded watermark is removed, the marked object is either distorted or destroyed, making it useless to anyone. This is why watermarking must be robust [2] against image processing operations such as compression, cropping and rotation. Hence, even if only a small part of the watermark can be extracted after modification and tampering, the rightful owner can still claim ownership. Unlike steganography, it is acceptable for everyone to know that a watermark is embedded, even when the watermark is invisible.

2.1.3. A clever mix of the triplet - An illustration

Cryptography is used as a companion to the other two data hiding techniques. Data is encrypted in both techniques: in steganography to resist statistics-based attacks and to increase randomness, and in watermarking to protect the hidden data. In general, confidential information is encrypted prior to embedding.

The importance of watermarking can be illustrated as follows. Suppose Rs.100 bills are introduced in December 2009; watermarking is then implemented in order to prevent illegal copies. To tell an original from a fake, the bill is held up to the light, and a small image appears within the large image. The watermark is actually a part of the paper and is visible from both sides. Hence, it becomes difficult to produce a paper with such features. In addition to these features, some tiny writing that is invisible to the human eye is present on the paper.

A banker having the necessary equipment (a magnifying glass) can tell the difference between the original bill and the fake bill. Steganography comes into play here. The tiny printing on the bill represents steganography. It cannot be copied, since commercial printers are incapable of reproducing such fine print and produce black spots instead. These are the reasons why steganography is used for high security.

Cryptography is also implemented in the bill. A serial number is printed on the bill, which may encode information about the location and date the bill was printed and any other confidential information. The unique serial number of each bill can be used for tracking purposes. With steganography, cryptography and watermarking combined, it becomes extremely difficult to reproduce a Rs.100 bill. It must be kept in mind that all three are different and have different functionality.

Since the main objective of this review is to explore random image steganography, further research on cryptography and watermarking is not explained in detail.

2.1.4. History of steganography

Steganography, derived from Greek words meaning 'covered writing', has been in use for thousands of years [2, 4, 6 - 10]. For example, a secret writing method of the Chinese was reinvented by the Italian mathematician Jerome Cardan. It uses a paper mask with holes, shared by the sender and the receiver; the secret message is written through the holes with the mask placed on a blank sheet. Later, the remaining blanks are filled in so that the whole appears as innocuous text. This method is called the Cardan Grille [44].

The Nazis used several steganographic methods during World War II, including invisible ink, null ciphers and microdots. An example is the following message sent by a German spy: "Apparently neutral's protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by-products, ejecting suets and vegetable oils." Taking the second letter of each word reveals the secret message 'Pershing sails from NY June 1'. Another interesting case is one in which Morse code was concealed in a drawing in the form of long grass and short grass in the year 1945 [45, 46].
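The null cipher above can be decoded mechanically. The sketch below is illustrative only: the function name `second_letters` is hypothetical, and the cover text is the commonly quoted wording of the message, in which "embargo" is not preceded by an article (an extra word would shift the decoded letters).

```python
import string

def second_letters(text: str) -> str:
    """Return the second character of every whitespace-separated word."""
    letters = []
    for word in text.split():
        word = word.strip(string.punctuation)   # drop leading/trailing punctuation
        if len(word) >= 2:
            letters.append(word[1].lower())
    return "".join(letters)

cover_text = ("Apparently neutral's protest is thoroughly discounted and ignored. "
              "Isman hard hit. Blockade issue affects pretext for embargo on "
              "by-products, ejecting suets and vegetable oils.")

hidden = second_letters(cover_text)
print(hidden)   # pershingsailsfromnyjunei
```

The trailing "i" is read as the numeral 1, giving "Pershing sails from NY June 1".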

Various methods have been used to conceal the existence of the secret message, including invisible ink, character arrangement, microdots, digital signatures and spread spectrum. More details on the history of steganography are available in [2, 4, 6 - 10].

2.1.5. Digital steganography

This art of secret communication, termed steganography, means "covered writing". From the aforementioned ancient methods, steganography has made a giant leap to the digital form, owing to the enormous improvement in computing power, the internet and advancements in digital signal processing.

While the ambit of steganography is immense, it has mainly been used for secret communication. Although steganography and cryptography both serve the same purpose, the advantage of the former over the latter is that it conceals the very existence of the confidential message.

Steganography, one of the most promising areas in information security, has had a marked impact on the digital world. Digital steganography was first formulated by Simmons in his famous prisoners' problem paper [11]. Later, in 1997, the first International Conference on Information Hiding organized by Ross Anderson defined the necessities for steganography [13]. Even today people suspect that hidden communication could help terrorists share an illegal plan or an attack [6]. Regardless of such concerns, it is evident that steganography is practiced not only in still images but in audio and video as well. Examples exist in [47, 48] for hiding data in music files, and even in simpler forms such as Hyper Text Markup Language (HTML), executable files (EXE) and Extensible Markup Language (XML) [49]. At the same time, steganography has given rise to several applications and corporate vigilance standards and measures.

Contemporary methods of information hiding are due to Simmons [11]. Kurak and McHugh discussed a method [50] that is similar to embedding into the 4 least significant bits (LSBs). They examined image downgrading and contamination, which is now known as image-based steganography [4, 14].

A detailed survey of steganographic tools in other media from a forensic investigator's perspective is available in the literature [51]. Steganography is widely used for secret communication, and its applications ensure its continued evolution.

2.1.6. Steganography applications

Steganography finds application in almost all domains, since security and confidentiality are of prime importance, be it for a simple email to a friend or a company's confidential data [2, 4, 6]. The various applications are as follows:

For copyright control of materials [40, 51].

To maintain the confidentiality and integrity of a company's secrets [52-54].

For making smart IDs with personal details embedded in the photographs [55].

To aid in video-audio synchronisation [57-60].

To append additional information in TV broadcasting [61, 62].

In TCP/IP packets, wherein a unique ID is embedded into an image to analyze the network traffic of specific users [63-66].

Petitcolas' contribution to the medical field with the help of steganography is remarkable; it helps maintain patients' records more safely and avoids mix-ups that lead to confusion [2]. His main application in medical imaging systems was embedding the patient's data into the image data, such as X-ray images. This maintains a link between the image data and the personal information and removes the chance of the reports of two or more patients getting mixed up, thus providing the authentication that is essential. An LSB embedding technique for electronic patient records [67] is based on multiple-base data hiding, wherein the base is the pixel value difference between the original image and its JPEG version. A few more methods of concealing patient data in digital images are detailed in the literature [68, 69]. A review of the impact of data privacy and confidentiality on developing telemedicine applications is available [70].

A more sophisticated application of information hiding is embedding data into a printed picture such that it is imperceptible to the naked eye, while a mobile phone with a camera can decode it. The Japanese firm Fujitsu is developing a technology based on transforming the image colour scheme into its Hue, Saturation and Value (HSV) components and embedding in the Hue domain, to which human vision is insensitive [4, 71]. Decoding takes less than one second, as the embedded data is merely 12 bytes. This could be implemented for "doctor's prescriptions, food wrappers, billboards, business cards and printed media such as magazines and pamphlets" [72], or to replace barcodes.

"Appearances may be deceptive" is an apt phrase for contemporary digital technology, in which the chances of forgery are high [73]. This has led to a vast new field of research, namely digital document forensics. Abbas Cheddad has proposed a security scheme that protects scanned documents from forgery using self-embedding techniques [74]. This method detects forgery and also allows legal or forensic experts to gain access to the original document despite the manipulation.

Any advancement in science is likely to have a downside. In this case, the concern was about the use of steganography by terrorists for secret plots, which is also called cyber planning or digital menace [2]. Provos and Honeyman [75] therefore subjected three million images from popular websites to intense scrutiny for any presence of hidden data, but found none.

To justify the popularity of the chosen review topic, a simple search was conducted to count the articles published in refereed and peer-reviewed journals and the patents filed in various patent offices throughout the world, with the search keyword steganography. The results are presented in Tables 2.2, 2.3 and 2.4.

Search Key word: Steganography

Web search sources: Science Direct, SCOPUS, IEEE, Springer, DOAJ, Citeseer,

Sciverse, ACM Portal and Google Scholar

Table 2.2. Year wise publication details

YEAR            1997  ’98  ’99  2000  ’01  ’02  ’03  ’04  ’05  ’06  ’07  ’08  ’09  ’10  ’11/12  TOTAL NO OF PAPERS
Science Direct     8   15    7     9   17   60   42   33   60   53   76  101  128  117  132/13  883
Scopus             4   17   16    20   40   54   78  155  165  215  260  325  433  600     453  2835
IEEE               2   11    9    17   29   36   55   77   75  123  168  242  317  299     157  1617
Springer           8   15    5    19    7    6   19   55  167  118  112  137  163  158     272  1261

Table 2.3. Total number of publications in web search

Steganography as keyword    Total number of papers
Springer                    1261
Scopus                      2835
IEEE                        1618
Science Direct              883
DOAJ                        280
Citeseer                    2543
Sciverse                    30225
ACM Digital Library         508
Google Scholar              19100

Table 2.4. Total number of patents filed in various patent offices

Patent Offices (PO)    4070
USA PO                 1604
WIPO                   277
Europe PO              114
UK PO                  23
Japan PO               17

Tables 2.2, 2.3 and 2.4 confirm that, year by year, the number of papers published in the chosen field of study has considerably increased.

In addition, there are two popular review / survey papers available in the existing literature for steganography [4, 6], but neither says anything about random image steganography. Hence in this chapter, the present status and future direction of random image steganography are presented.

2.2. REVIEW ON STEGANOGRAPHY

As a part of the survey, this section aims at providing a bird's eye view of the most important steganographic techniques available for digital images. Most of the techniques exploit the structures of formats like GIF and JPEG, and a few make use of the BMP format for its simple data structure.

A simple data hiding framework is shown in Fig. 2.2, which depicts the embedding process in the cover image. Secret data is the confidential information to be hidden, and the same is recovered as the extracted data; Embed is the steganographic function; the stego-image is transferred through the channel; and Key represents an optional key.

2.2.1. Steganography classification - Stefan C. Katzenbeisser

In the steganography literature, three standard protocols are distinguished, namely pure steganography, secret key steganography and public key steganography [2], as shown in Fig. 2.3.

Figure 2.2. A simple schematic diagram for data hiding framework
[Figure labels: Cover Object, Secret Data, Key, Embed, Stego object in Open Channel, Extract, Secret Data]

Pure steganography

This system follows the pure, unadulterated working principle of steganography, wherein there is no prior exchange of a shared secret key, yet the purpose can still be achieved.

The embedding process in this mode of steganography is described as a mapping in set theory. Let C be the set of all cover images and M be the set of all possible messages. The embedding process E is defined as E: C×M → C, while the extraction process D is the mapping D: C → M, which extracts the secret message from the cover.

It is mandatory that |C| >= |M|, and the embedding and extraction algorithms must be accessible to both transmitter and receiver, but to no one else.

Figure 2.3. Katzenbeisser classification on steganography

Secret key steganography

As pure steganography relies completely on the secrecy of the algorithm, it cannot always be banked upon: once the embedding process E and the extraction process D become known to a third party, secrecy is lost. Such a system is not secure, as it violates Kerckhoffs's principle [12]. Secret key steganography is therefore implemented with three objects: a cover image "C", a secret message "M" and a secret key "K". The sender chooses a random cover image and embeds the secret message into it using the secret key K. At the receiver end, if the secret key is known, the secret message can be extracted from the stego image. Anyone unaware of the secret key should not be able to find out the secret message.

Mathematically, with set notation, this is given as

EK: C×M×K → C and DK: C×K → M

where EK denotes embedding with the secret key K and DK denotes extraction with the secret key K.
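The mappings EK and DK can be sketched with a key-seeded random selection of pixel positions, which is also the basic idea behind random image steganography. This is a minimal sketch under several assumptions, not a method from the cited works: the names `embed` and `extract`, the toy cover list and the key value are hypothetical; Python's `random.Random` is not a cryptographically secure generator; and the extractor is given the message length as an extra argument, which the set notation above does not include.

```python
import random

def embed(cover, bits, key):
    """E_K: hide `bits` in the LSBs of key-selected positions of `cover`."""
    if len(bits) > len(cover):
        raise ValueError("message longer than cover")
    # The key seeds the generator, so sender and receiver derive the same positions.
    positions = random.Random(key).sample(range(len(cover)), len(bits))
    stego = list(cover)
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & ~1) | bit   # overwrite only the least significant bit
    return stego

def extract(stego, n_bits, key):
    """D_K: read the LSBs back from the same key-selected positions."""
    positions = random.Random(key).sample(range(len(stego)), n_bits)
    return [stego[pos] & 1 for pos in positions]

cover = [160, 37, 201, 88, 14, 255, 0, 123, 76, 190, 45, 232]   # toy 8-bit pixels
secret = [1, 0, 1, 1, 0, 1]
stego = embed(cover, secret, key=2024)
recovered = extract(stego, len(secret), key=2024)
```

With the correct key the receiver regenerates the same positions and recovers the bits; without it, the embedding positions are unknown, and each pixel differs from the cover by at most one intensity level.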

Public key steganography

As the name suggests, the encoding key is a public key, which is stored in a public database so that everyone has access to it. The embedding process is therefore visible to everyone. The true receiver additionally uses a secret key D, which is the matching key for decoding the secret message. So although the public key is known to a third person, the message cannot be decoded unless that person also knows the secret key D.

2.2.2. Steganography classification - Johnson and Katzenbeisser

In addition, a popular survey on steganography by Johnson & Katzenbeisser [2] is completely dedicated to image steganography and classifies the techniques into six different methodologies, as shown in Fig. 2.4.

Figure 2.4. Johnson and Katzenbeisser classification on steganography

Substitution systems [76-80] – Confidential information is substituted into redundant parts of a cover image through a random or raster scan.

Transform domain techniques [81-85] – Confidential data is embedded into selected transform coefficients (frequency domain) of the cover image.

Spread spectrum techniques [86-90] – Spread spectrum concepts are adapted to embed the confidential information.

Statistical methods [91-95] – Confidential information is embedded by varying the statistical properties of the cover; extraction is carried out through hypothesis testing.

Distortion techniques [96-100] – Confidential information is embedded by distorting the cover; for extraction, the receiver compares the stego object with the original cover object.

Cover generation methods [101-104] – A new cover is generated from the confidential information.

This survey neither provides test images for the analysis nor discusses steganography's evolution or applications, but it classifies steganography methodologies according to the chosen cover object.

In the survey of Bailey et al. [105], software operating in the spatial domain and supporting the GIF format is evaluated. Here, steganography assumes the unavailability of the original image, and evaluation is done by a direct comparison of the original and the stego images.

In the survey by Li et al. [106], image steganography is classified into spatial and JPEG steganography methods, as shown in Fig. 2.5.

Spatial domain techniques are further classified into six methods: Least Significant Bit (LSB) based steganography [5], Multiple Bit-planes Based Steganography (MBPS) [107], Noise-adding Based Steganography (NABS) [108-110], Prediction Error Based Steganography (PEBS) [111], Modulo Operation Based Steganography (Mod) [112] and Quantization Based Steganography (QBS) [113]. The other class is JPEG steganography, which is further classified into JSteg [114] / JPHide [115], the F5 steganographic algorithm [19], OutGuess [116], model-based steganography (MB) [117] and Yet Another Steganographic Scheme (YASS) [118]. This review suggests how to select pixel locations adaptively for embedding [119], how to reduce embedding distortion and increase embedding efficiency, embedding data in the image creation process, sacrificing imperceptibility, and how to preserve the statistics while embedding during image creation, but it fails to suggest how to improve the randomness while embedding.

Figure 2.5. Li et al. classification on steganography

In addition, Cheddad et al. [4] categorized steganographic methods as spatial domain, frequency domain and adaptive methods, wherein adaptive methods can be applied in both the spatial and frequency domains. They also evaluated the performance of commercially available software and its drawbacks, and highlighted the need for novel methods in image steganography. In the next section, some of the dominant techniques exploiting the image formats are discussed.

2.2.3. Steganography methods in images

One simple image steganographic method is to append the secret data after the End of File (EOF) tag of the image. The image remains intact, and image viewers give no indication that some data is hidden. However, when the file is opened in a text editor such as Notepad, the data is displayed in the clear after some random-looking characters (the image data), which makes the method highly prone to steganalysis attack [4].
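The append-after-EOF idea can be sketched on raw bytes. The example assumes a JPEG-style carrier, whose End Of Image marker is the byte pair FF D9; the function names and the toy carrier bytes are hypothetical, and the extractor simply takes everything after the first marker it finds, which would not hold for real JPEG files that contain embedded thumbnails.

```python
EOI = b"\xff\xd9"   # JPEG End Of Image marker (assumed carrier format)

def append_payload(image_bytes: bytes, payload: bytes) -> bytes:
    """Append the payload after the end-of-image marker of the carrier."""
    if not image_bytes.endswith(EOI):
        raise ValueError("carrier does not end with an EOI marker")
    return image_bytes + payload

def read_payload(stego_bytes: bytes) -> bytes:
    """Return whatever follows the first EOI marker (toy assumption)."""
    end = stego_bytes.find(EOI)
    if end == -1:
        raise ValueError("no EOI marker found")
    return stego_bytes[end + len(EOI):]

carrier = b"\xff\xd8" + b"toy image data" + EOI   # stand-in for a real image file
stego = append_payload(carrier, b"meet at dawn")
recovered = read_payload(stego)
```

Because the payload sits in the file in the clear, comparing the file length with the position of the marker is enough to reveal it, which is the weakness noted above.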

Another implementation involves placing the data to be hidden in the image's Exchangeable Image File Format (EXIF) fields, a standard for digital cameras that stores metadata about the image (such as the make and model of the camera, the time the picture was taken and digitized, the resolution of the image, the exposure time and the focal length) in the image header. This suffers from drawbacks similar to those of the previous technique [120].

2.2.4. Image steganography in spatial domain

Spatial domain methods of steganography operate on both the secret data and the cover medium, embedding the data in the LSBs of the cover pixels. This has an impact on the visual quality of the image, and as the value of 'k' in kth bit embedding increases (i.e., embedding data only in the kth bit position), the level of distortion also increases.

Fig. 2.6 and Fig. 2.7 both show that a greater depth of alteration considerably decreases imperceptibility.

Figure 2.6. Binary representation of a one byte gray pixel

Figure 2.7. (a) Cover image

Figure 2.7. (b-e) Stego images, kth position 8, 7, 6, 5, for 256×256 bits embedding

Figure 2.7. (f-i) Stego images, kth position 4, 3, 2, 1, for 256×256 bits embedding

A practical example of embedding the confidential information only in the kth bit position, from the 1st LSB to the 8th MSB, is illustrated in Fig. 2.7. It can be seen that embedding in the 8th, 7th, 6th and 5th MSBs and the 4th LSB generates more visual distortion in the cover image, as the hidden information is seen as "non-natural". Hence it is suggested to embed data only up to the 4th LSB, and definitely not in the MSBs.

Consider a cover pixel with intensity value 160, whose binary representation is 10100000. Assume that a single bit of the pixel is altered during embedding, for each bit position in turn. Altering the 1st bit gives 161, the 2nd bit 162, the 3rd bit 164, the 4th bit 168, the 5th bit 176, the 6th bit 128, the 7th bit 224 and finally the 8th bit 32. The differences for k = 1 to 8 are 1, 2, 4, 8, 16, -32, 64 and -128 respectively.
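The effect of altering each bit of the pixel value 160 can be checked with a short computation. This is an illustrative sketch; the helper name `flip_bit` is hypothetical. Note that flipping the 8th bit (weight 128) of 160 yields 32, a difference of -128.

```python
def flip_bit(pixel: int, k: int) -> int:
    """Flip the k-th bit of an 8-bit pixel (k = 1 is the LSB, k = 8 the MSB)."""
    return pixel ^ (1 << (k - 1))

pixel = 160                                      # binary 10100000
altered = [flip_bit(pixel, k) for k in range(1, 9)]
differences = [a - pixel for a in altered]

print(altered)       # [161, 162, 164, 168, 176, 128, 224, 32]
print(differences)   # [1, 2, 4, 8, 16, -32, 64, -128]
```

The magnitude of the change doubles with each bit position, which is why embedding in the higher bit planes distorts the cover far more visibly than embedding in the LSBs.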

Potdar et al. [121] proposed a spatial domain technique that is robust against image cropping attacks: the cover image is split into segments, the secret is embedded in them, and a Lagrange interpolating polynomial along with an encryption algorithm is used for data recovery.

Shirali-Shahreza and Shirali-Shahreza [122] exploit the numerous dotted letters of the Arabic and Persian alphabets; the dots are modified based on the binary values of the secret data. Al-Azawi and Fadhil [123] proposed a text steganography technique suitable for Arabic characters, which hides information by inserting extension characters (Kashida). Huffman coding is employed in the initial phase to compress the text into binary format.

Colour palette based steganography [8] involves modifying the LSBs based on their positions in the colour palette, but it offers little resistance to compression and statistical counter attacks [10, 124-127]. BMP, GIF and more recently JPEG image files are being used for this [129].

Jung and Yoo [129] proposed a method that first down-samples the image to half its size, then up-samples it using the Neighbour Mean Interpolation (NMI) technique, and finally embeds data in 2 × 2 non-overlapping blocks of the up-sampled image. Its drawbacks are that the secret cannot be recovered without errors and that the naturally strong correlation between adjacent pixels is destroyed; in addition, its similarity to pixel-value differencing techniques renders it prone to histogram analysis attacks.

Histogram-based lossless data hiding using the difference value of adjacent pixels is described in [130]. Such methods have the advantage that the original cover image can be recovered from the stego-image, but the embedding capacity is restricted.

Nur Mohammad et al. [131] proposed a transparent, high-capacity hiding algorithm which uses Block Truncation Coding (BTC) by employing a two-level (one-bit) nonparametric quantizer together with Human Visual System (HVS) masking characteristics, ensuring high visual quality of the stego image. Zanganeh and Ibrahim [132] proposed a substitution-based technique which embeds data into uncertain and higher LSB layers, giving the flexibility to embed a large amount of secret message.

2.2.5. Image steganography in frequency domain

Though LSB embedding in the spatial domain is a good method for deceiving the HVS, it offers the least resistance to attacks, and hence the idea was carried over to the frequency domain. Basically, in these techniques the Discrete Cosine Transform (DCT) is first taken and the DCT coefficients of each block are quantized using a specific Quantization Table (QT). After quantization, Huffman's algorithm is used to compress the result; since most of the redundant data and noise are lost in this process, it is termed lossy compression [25]. A work by Li and Wang [133] modifies the QT and inserts the hidden bits in the middle-frequency coefficients, thus increasing the payload.

In the DCT techniques [134], data is inserted into the insignificant bits of the coefficients and, since the change is made in the frequency domain, there is very little visible change in the spatial domain. Raja et al. [135] argued that Fast Fourier Transform (FFT) methods are not suitable for covert communication as they introduce round-off errors, whereas Johnson and Jajodia [8] made use of the FFT among other transformations, and [136] showed that FFT-based steganography was used in movies as well.

The JSteg algorithm employed for JPEG images is resistant to visual attacks but leaves a significant statistical signature. Wayner, in his book Disappearing Cryptography [137], states that the hidden information distorts the bell-shaped curve of the DCT coefficients. Also, Manikopoulos et al. [138] discussed an algorithm which uses the Probability Density Function (PDF) to detect hidden data.

OutGuess uses a pseudo-random number generator to select DCT coefficients [10]; the χ²-test does not detect its presence, but an extended version of the same test could defeat this technique as well. Andreas Westfeld's “F5” algorithm [19], better known as syndrome coding, makes use of subtraction and matrix encoding and proved robust against the χ²-test and its extended version, but could not resist the steganalysis method by Fridrich [128], which exploits the natural distribution of DCT coefficients.

Perturbed Quantization (PQ) [139] aims at higher efficiency with minimal distortion. It assigns a scalar value to each coefficient indicating how much distortion embedding would cause, so that the steganographer can carefully choose the locations with the least distortion, leading to high imperceptibility. The main drawback of using DCT is that the embedded data cannot be retrieved if the stego-image is re-compressed.

Steganography in the Discrete Wavelet Transform (DWT) domain, which is still in its infancy, is discussed in detail in [40, 140, 141]. Abdulaziz and Pang reaffirm that modifying data using a wavelet transformation preserves good quality with few perceptual artifacts; their methods involve Bose-Chaudhuri-Hocquenghem (BCH) error-correcting codes and 1-stage discrete Haar wavelet transforms [142].

Abdelwahab and Hassan proposed a DWT domain steganographic method which takes the DWT of both the secret and cover images and divides them into 4 × 4 blocks. The secret image blocks are matched to the cover image blocks, and the error blocks generated are embedded into the best-matched blocks in the horizontal sub-band of the cover image [143]. The Inverse Discrete Fourier Transform (IDFT) introduces round-off error, thus rendering the DFT unsuitable for steganography applications.

2.2.6. Adaptive image steganography

Adaptive steganography, also known as “Statistics-Aware Embedding” [10], “Masking” [8] or “Model-Based” [117], takes global statistical features of the image into account before attempting to interact with its LSB/DCT coefficients. Any change is made according to these statistics [144, 145], and the approach is characterized by a random adaptive selection of pixels that avoids uniformly coloured smooth areas.


Data embedding in noise [140] has proven to be robust with respect to compression, cropping and image processing [146, 147, 148]. The model-based method MB1 [117] generates a stego-image based on a given distribution model, using a generalized Cauchy distribution, which results in minimum distortion; however, it can be detected using first-order statistics [149, 150] and also, reliably, by the difference in “blockiness” between a stego-image and its estimated image [151]. Edge embedding is robust and maintains good imperceptibility by locating edge segments in a fixed-block fashion, with each block centred on an edge pixel [152]. Chin-Chen et al. proposed an LSB substitution method that uses the correlation between neighbouring pixels to estimate the degree of smoothness, and its payload is high [153].

“A Block Complexity based Data Embedding” (ABCDE) [154] converts the secret data into noisy blocks and embeds them in place of selected noisy blocks in the image. The hidden message is more a part of the image than just added noise [155], and the method has a very high embedding capacity. The chief drawback is that certain control parameters have to be configured manually, rendering it unsuitable for automatic processes. This algorithm was developed as an improvement over Bit Plane Complexity Segmentation (BPCS) [156], which compensated for the drawbacks of LSB manipulation techniques.

In a method explained in [155], wavelet transforms are used to map integers to

integers and this overcomes the difficulty of floating point conversion that occurs after

embedding. The payload is embedded in non-overlapping 4 × 4 blocks of lower

frequency choosing two pixels on either side of the diagonal at a time.

In [157], a method to restore the marked image to its original state after data extraction is proposed; this is done by using the histogram peak point in the difference image and generating the inverse transformation in the spatial domain. The selection of the local histogram's peak point bp directs the embedding process and the matrix manipulation. One drawback is that the authors have not explained the effect of homogeneous, dark, bright and edged blocks on the efficiency of the algorithm.

Wu and Shih proposed a Genetic Algorithm (GA) based method which generates a stego-image that resists detection by spatial and frequency domain steganalysis systems by artificially counterfeiting the statistical features; the process is repeated until some predefined condition is satisfied. It is not stated whether the predefined condition is generated automatically or has to be declared manually. Their claim that their paper was the first to use evolutionary algorithms is not true; prior works include [160]. Yu et al. [151] extend the conventional ±1 algorithm to JPEG images using genetic algorithms.

Kong et al. proposed a content-based image embedding scheme in which a watershed method coupled with Fuzzy C-Means (FCM) [161] is used to segment the homogeneous grayscale areas, and the entropy is calculated for each segment. This entropy determines the embedding capacity, and accordingly either four or two LSBs are embedded. The sensitivity of the method to intensity changes affects the extraction of the correct secret data.

Chao et al. proposed a 3D steganography scheme that embeds the secret messages in the vertices of 3D polygon models [162]. On a similar note, Bogomjakov et al. hide a message in the indexed representation of a mesh by permuting the order in which the faces and the vertices are stored [163]. The difficulties faced include the time complexity of generating the mesh and the fact that 3D graphics are not as easy to port as digital images.

Wien Hong et al. proposed a method to improve the stego image quality by using a reversible contrast mapping data hiding scheme that uses the variance feature of the cover image [164]. Hao Luo et al. proposed a method that incorporates secret sharing and data hiding into block truncation coding for image compression. Under this method, two quantization levels are hidden in two shared images to increase the level of compression [165].

Xiang et al. designed a novel steganography method that selects a series of Multiple Choice Questions (MCQs) that can generate the secret data. Compared against the existing linguistic steganographic techniques, this technique appeared superior [166]. Luo et al. presented an algorithm based on directed Hamiltonian path selection in the complete digraph mapped from multi-blogs [167]. This work was found to possess good imperceptibility and high security.

An adaptive data hiding method based on edge pixels in the spatial domain is available in [168]. Chen and Chang proposed an Optimal Pixel Adjustment Process (OPAP) to improve the imperceptibility, but the drawback of this method is that it is not adaptive [169]. Yang proposed an inverted pattern (IP) LSB substitution approach [168] to increase the quality of the stego image. Under this method, some sections of the secret message are inverted before embedding and the others are left unchanged; a simple strategy is used to decide whether a section of the message is inverted, and a bit string named IP is used to record the inverting actions. OPAP is also used to further increase the quality of the stego image. The results are promising, as the final image quality obtained was better than that of the optimal LSB substitution approach [14, 169] and the OPAP LSB substitution approach [171].

2.2.7. Steganography methods - An illustration

For a better understanding of the aforementioned concepts and the succeeding chapters, this section discusses the LSB, IP, Pixel Value Differencing (PVD) and Pixel Indicator (PI) methods in detail.

Least Significant Bit data embedding scheme

The LSB substitution method [169] is popular mainly for the following reasons. Firstly, it is computationally simple because of its straightforward implementation. Secondly, a large amount of information (payload) can be hidden in the cover image without visibly distorting it. The human eye is sensitive only to changes made in the smooth areas of an image; it overlooks, or cannot perceive, alterations made to the less sensitive edge areas of the image.

But the traditional LSB substitution method has the following disadvantages:
• Since it is a well-known method, a message embedded using it is vulnerable.
• The number of message bits embedded in each pixel is the same for all pixels. Hence decoding the embedded message is very easy and the security of the message is at stake.
• Visual degradation is possible if a larger number of message bits is embedded in the edge areas of the image; hence for a fully embedded image the visual degradation will be high.
• The payload is uniform in all the pixels.
• This method is not robust, i.e., when subjected to image processing it will lose the confidential information.
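For illustration, plain k-bit LSB substitution could be sketched as follows. This is a minimal sketch and not the implementation of [169]; the function names, the flat list of 8-bit grayscale pixels and the zero padding of the last chunk are assumptions made only for this example.

```python
def embed_lsb(pixels, bits, k):
    """Replace the k least significant bits of successive pixels with message bits."""
    if len(bits) > k * len(pixels):
        raise ValueError("message does not fit into the cover")
    mask = (1 << k) - 1
    stego = list(pixels)
    for i in range(0, len(bits), k):
        chunk = bits[i:i + k]
        chunk = chunk + [0] * (k - len(chunk))      # assumption: pad with zeros
        value = int("".join(map(str, chunk)), 2)    # k message bits as an integer
        idx = i // k
        stego[idx] = (stego[idx] & ~mask) | value
    return stego

def extract_lsb(stego, n_bits, k):
    """Read back the first n_bits message bits from the k LSBs of each pixel."""
    mask = (1 << k) - 1
    bits = []
    for p in stego:
        bits.extend(int(c) for c in format(p & mask, f"0{k}b"))
    return bits[:n_bits]

cover = [160, 97, 203, 54]
message = [1, 0, 1, 1, 0, 1, 1, 0]
stego = embed_lsb(cover, message, k=2)
print(stego)                       # [162, 99, 201, 54]
print(extract_lsb(stego, 8, k=2))  # [1, 0, 1, 1, 0, 1, 1, 0]
```

Because every pixel carries exactly k bits in a fixed order, anyone who knows k can read the message back, which is the weakness listed above.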

Optimum Pixel Adjustment Procedure (OPAP)

OPAP reduces the distortion caused by the LSB substitution method [169]. Here the pixel value is adjusted after the secret data has been hidden, to improve the quality of the stego image without disturbing the hidden data.

Procedure for hiding

First, a few least significant bits are substituted with the data to be hidden. Then the bits of the pixel above the hidden bits are adjusted, if necessary, to give a smaller error.

Let n LSBs be substituted in each pixel, and let
d = decimal value of the pixel after the substitution,
d1 = decimal value of the last n bits of the pixel before the substitution,
d2 = decimal value of the n bits hidden in that pixel.

• If |d1 − d2| <= (2^n)/2, then no adjustment is made in that pixel.
• Else
   If (d1 < d2), d = d − 2^n.
   If (d1 > d2), d = d + 2^n.

This d is converted to binary and written back to the pixel.

Retrieval

The retrieval follows the extraction of the least significant bits (LSB) as hiding is

done using simple LSB substitution.
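The OPAP rule can be sketched for a single 8-bit pixel as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, d1 is taken from the original pixel, and the check that keeps the result within 0 to 255 is an addition of this example that the procedure above does not state.

```python
def opap_embed(pixel, secret, n):
    """Hide the n-bit value `secret` in `pixel` and apply the OPAP adjustment."""
    step = 1 << n
    d1 = pixel & (step - 1)        # value of the original last n bits
    d2 = secret                    # value of the n bits to hide
    d = pixel - d1 + d2            # plain LSB substitution
    if abs(d1 - d2) > step // 2:
        # Moving by 2^n leaves the n LSBs untouched but reduces the error.
        if d1 < d2 and d - step >= 0:        # range check: assumption
            d -= step
        elif d1 > d2 and d + step <= 255:    # range check: assumption
            d += step
    return d

print(opap_embed(160, 7, 3))  # 159 (plain substitution would give 167)
print(opap_embed(167, 0, 3))  # 168 (plain substitution would give 160)
print(opap_embed(159, 7, 3) & 0b111)  # 7, retrieval is a plain LSB read
```

In both adjusted cases the error drops from 7 to 1 while the three hidden bits stay intact.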

Advantages

1. Simple methodology.
2. Easy retrieval.
3. Better stego-image quality than plain LSB substitution.

Inverted Pattern Approach (IP)

This IP LSB substitution approach uses the idea of processing the secret message prior to embedding [170]. In this method it is decided for each section of the secret data, before it is embedded, whether it should be inverted or not. In addition, the bits which are used to record the transformation are treated as secret keys or as extra data to be re-embedded.

The embedding procedure

Let S be the string to be embedded and R the string it replaces, and let the bit string to be embedded be divided into P parts. For n-bit LSB substitution, S and R are n bits long.

For each part i = 1 to P
   If MSE(Si, Ri) ≤ MSE(S'i, Ri)
      Choose Si for embedding
      Mark key(i) as logic ‘0’
   Else
      Choose S'i for embedding
      Mark key(i) as logic ‘1’
End

where S is the data to be hidden, S' is the data to be hidden in inverted form and MSE is the Mean Square Error.

Procedure for retrieval

The stego-image and the key file are required at the retrieval side. First, the corresponding number of LSBs is retrieved from the stego-image. If the key is ‘0’, the retrieved bits are kept as such; else, if the key is ‘1’, the bits are inverted. The bits retrieved in this manner from every pixel of the stego-image give the hidden data.
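The IP decision rule and its retrieval can be sketched as below. This is only an illustrative sketch, not the implementation of [170]: each part is represented as a list of n-bit values (one per pixel), inversion is taken to be the bitwise complement within n bits, and all function names are hypothetical.

```python
def mse(a, b):
    """Mean square error between two equally long lists of values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def ip_embed(secret_parts, replaced_parts, n):
    """Choose, per part, the plain or the inverted form that is closer to the cover bits."""
    full = (1 << n) - 1
    embedded, key = [], []
    for s, r in zip(secret_parts, replaced_parts):
        s_inv = [full - v for v in s]          # n-bit complement of every value
        if mse(s, r) <= mse(s_inv, r):
            embedded.append(list(s))
            key.append(0)
        else:
            embedded.append(s_inv)
            key.append(1)
    return embedded, key

def ip_retrieve(embedded_parts, key, n):
    """Undo the inversion of every part whose key bit is 1."""
    full = (1 << n) - 1
    return [[full - v for v in part] if k else list(part)
            for part, k in zip(embedded_parts, key)]

secret = [[3, 3, 0], [1, 2, 1]]      # two parts of 2-bit values
replaced = [[0, 1, 3], [1, 2, 0]]    # the cover LSBs they would overwrite
embedded, key = ip_embed(secret, replaced, n=2)
print(embedded, key)                 # [[0, 0, 3], [1, 2, 1]] [1, 0]
print(ip_retrieve(embedded, key, n=2) == secret)  # True
```

The first part is inverted because its complement differs far less from the cover bits it replaces, which is what lowers the overall distortion.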

Pixel Value Differencing (PVD) methods

The HVS approach is applied to embed data. For example, Wu and Tsai [111] applied the HVS and determined that more data could be embedded into the edge areas than into the smooth portions of the image. The side match approach used by Chang and Tseng is also worth a mention, as they used the difference between the side pixels to find the embedding capacity [147]. Lee and Chen proposed a method which uses the contrast and luminance of the neighbourhood pixels [172]. They also introduced the additional concept that the greater a grayscale value is, the more change of the grayscale can be tolerated. This was also utilised by Wang [173]. PVD is able to provide a high-quality stego image in spite of the high capacity of the concealed information [80, 111, 147, 172, 173]. That is, the number of inserted bits depends on whether the pixel is in an edge area or a smooth area. In an edge area the difference between adjacent pixels is larger, whereas in a smooth area it is smaller; more bits are inserted in edge areas because human perception is less sensitive to subtle changes there.

PVD – An illustration [80]

This method hides the data in the target pixel by using the characteristics of the four pixels surrounding it, as indicated in Table 2.5 below.

Table 2.5 Pixel arrangement in spatial domain

g(x-1,y-1) top left pixel | g(x-1,y) top pixel | g(x-1,y+1) top right pixel
g(x,y-1) left pixel | g(x,y) target pixel |

g(x-1,y-1), g(x-1,y), g(x-1,y+1) and g(x,y-1) are the gray values of the pixels surrounding the target pixel g(x,y).

Embedding procedure

Select the maximum and the minimum values among the four neighbouring pixel values that have already finished the embedding process. Calculate the difference value d between the maximum pixel value and the minimum pixel value using the following formula

\[ d = g_{max} - g_{min} \]

where

\[ g_{max} = \max\bigl(g(x-1,y-1),\ g(x-1,y),\ g(x-1,y+1),\ g(x,y-1)\bigr) \]
\[ g_{min} = \min\bigl(g(x-1,y-1),\ g(x-1,y),\ g(x-1,y+1),\ g(x,y-1)\bigr) \]

Use the above equations to judge whether the target pixel lies in an edge area or a smooth area. The number of bits n inserted into the target pixel is determined by the value d:

\[ n = \begin{cases} \log_2 d - 1, & \text{if } d > 3 \\ 1, & \text{otherwise} \end{cases} \]

Calculate a temporary value

\[ t(x,y) = b - \bigl(g(x,y) \bmod 2^n\bigr) \]

where b is the data to be hidden. Calculate

\[ t_1(x,y) = \begin{cases} t(x,y), & \text{if } -\frac{2^n - 1}{2} \le t(x,y) \le \frac{2^n - 1}{2} \\ t(x,y) + 2^n, & \text{if } -2^n + 1 \le t(x,y) < -\frac{2^n - 1}{2} \\ t(x,y) - 2^n, & \text{if } \frac{2^n - 1}{2} < t(x,y) < 2^n \end{cases} \]

\[ g_1(x,y) = g(x,y) + t_1(x,y) \]

where g1(x,y) is the new pixel value.

Retrieval

n is calculated in the same way as on the sender side.
The received target pixel value is g1(x,y).
The hidden data is b = g1(x,y) mod 2^n.
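The embedding and retrieval steps for one target pixel can be sketched as follows. This is a minimal sketch and not the code of [80]: the function names are hypothetical, the logarithm is rounded down so that n is a whole number of bits (an assumption), and no check is made that the new pixel value stays within 0 to 255.

```python
def bits_for_pixel(upper_left, upper, upper_right, left):
    """Number of bits n for the target pixel, from its four processed neighbours."""
    neighbours = (upper_left, upper, upper_right, left)
    d = max(neighbours) - min(neighbours)
    # d.bit_length() - 1 equals floor(log2(d)) for d >= 1 (rounding is an assumption).
    return (d.bit_length() - 1) - 1 if d > 3 else 1

def pvd_embed(g, b, n):
    """Return the new pixel value g1 that carries the n-bit value b."""
    m = 1 << n
    t = b - (g % m)
    if 2 * t > m - 1:          # t > (2^n - 1)/2
        t -= m
    elif 2 * t < -(m - 1):     # t < -(2^n - 1)/2
        t += m
    return g + t

def pvd_extract(g1, n):
    """Recover the hidden value: b = g1 mod 2^n."""
    return g1 % (1 << n)

n = bits_for_pixel(100, 120, 90, 110)   # d = 30, so n = 3
print(n)                                # 3
print(pvd_embed(107, 6, n))             # 110
print(pvd_embed(105, 7, n))             # 103
print(pvd_extract(103, n))              # 7
```

In the second call the wrap-around branch turns a change of +6 into a change of -2, which is the purpose of the three-case rule for t1.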

Pixel Indicator (PI) Method

Gutub proposed the PI method [174] and implemented it for random image steganography. One channel is fixed as an indicator, and the number of bits specified by the user (say k bits, for example 2 bits) is then embedded in the other two channels depending upon the last two bits of the indicator channel; the details are given in Table 2.6.

Table 2.6 Meaning of indicator values

INDICATOR | CHANNEL 1 | CHANNEL 2
00 | No data embedded | No data embedded
01 | No data embedded | k bits of data embedded
10 | k bits of data embedded | No data embedded
11 | k bits of data embedded | k bits of data embedded

Thus if the ‘RED’ plane is selected as the indicator and its last two bits are ‘11’, then k bits (as defined by the user) of data are embedded in each of the blue and the green channels. The specialty of this method is the variable payload for the chosen cover, where the last two bits of the indicator decide the embedding capacity.

In the following section, transform domain methods, namely a simple DCT method [4] and the Integer Wavelet Transform (IWT) method [175], are discussed.
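The indicator rule of Table 2.6 for the Pixel Indicator method can be sketched for a single RGB pixel as follows. This is an illustrative sketch, not the implementation of [174]: the function names are hypothetical, "channel 1" and "channel 2" are assumed to be the two non-indicator channels in R, G, B order, and the caller is assumed to supply at least 2k message bits.

```python
def pi_embed(pixel, bits, k, indicator=0):
    """Embed up to 2k bits in an (R, G, B) pixel; return (new_pixel, bits_used)."""
    channels = list(pixel)
    data_idx = [i for i in range(3) if i != indicator]   # channel 1, channel 2
    flags = channels[indicator] & 0b11                    # last two indicator bits
    active = (bool(flags & 0b10), bool(flags & 0b01))
    mask = (1 << k) - 1
    used = 0
    for idx, on in zip(data_idx, active):
        if on:
            value = int("".join(map(str, bits[used:used + k])), 2)
            channels[idx] = (channels[idx] & ~mask) | value
            used += k
    return tuple(channels), used

def pi_extract(pixel, k, indicator=0):
    """Read back the bits carried by one pixel."""
    data_idx = [i for i in range(3) if i != indicator]
    flags = pixel[indicator] & 0b11
    mask = (1 << k) - 1
    bits = []
    for idx, on in zip(data_idx, (flags & 0b10, flags & 0b01)):
        if on:
            bits.extend(int(c) for c in format(pixel[idx] & mask, f"0{k}b"))
    return bits

print(pi_embed((203, 100, 57), [1, 0, 0, 1], k=2))  # ((203, 102, 57), 4)
print(pi_embed((201, 100, 57), [1, 0, 0, 1], k=2))  # ((201, 100, 58), 2)
print(pi_extract((203, 102, 57), k=2))              # [1, 0, 0, 1]
```

The indicator channel itself is never modified, so the receiver can read the same two bits and knows which channels carry data.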

DCT based embedding

In this transform domain technique, the DCT is used to hide messages in significant areas of the cover image. Here the pixels are split into 8 × 8 blocks. Then all blocks are DCT transformed, and each block encodes exactly one secret message bit.

Procedure for hiding

The embedding process starts with selecting a block bi which will be used to code the ith message bit.
Let Bi = D{bi} be the DCT-transformed image block.
Before the communication starts, both sender and receiver have to agree on the location of two DCT coefficients, which will be used in the embedding process.
Let us denote these two indices by (u1, v1) and (u2, v2).
Let m(i) be the ith message bit.
If m(i) = 0,
   if Bi(u1, v1) > Bi(u2, v2) then
      swap Bi(u1, v1) and Bi(u2, v2)
else if m(i) = 1,
   if Bi(u1, v1) < Bi(u2, v2) then
      swap Bi(u1, v1) and Bi(u2, v2)
The last step is to take the inverse DCT of the blocks to obtain the stego image.
During retrieval, the stego image is again split into 8 × 8 pixel blocks and DCT transformed.
The predetermined pair of DCT coefficients is then compared in every block:
if Bi(u1, v1) > Bi(u2, v2) then the message bit = 1,
else 0.

Procedure for Retrieval

A block bi is selected in the stego-image.
Then the DCT is performed on the block, Bi = D{bi}.
Then the coefficients at the two indices (u1, v1) and (u2, v2) chosen by both sender and receiver are compared.
If Bi(u1, v1) > Bi(u2, v2)
   Then data hidden = ‘1’
Else
   Then data hidden = ‘0’
This procedure is repeated for all the blocks in the image.
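The coefficient-swapping rule can be sketched for one 8 × 8 block as shown below. This is a minimal sketch under stated assumptions: the coefficient positions (4, 1) and (3, 2) are arbitrary example values that sender and receiver would have to agree on, the pure-Python DCT helpers are written only for this example, and the stego block is kept real-valued, whereas a practical scheme would have to survive rounding to integer pixels.

```python
import math

N = 8  # block size used in the text

# Orthonormal DCT-II matrix: C[k][n] = a(k) * cos(pi * (2n + 1) * k / (2N))
C = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(block):
    """2-D DCT of an 8 x 8 block: B = C * b * C^T."""
    return matmul(matmul(C, block), transpose(C))

def idct2(coeffs):
    """Inverse 2-D DCT: b = C^T * B * C."""
    return matmul(matmul(transpose(C), coeffs), C)

def embed_bit(block, bit, p1=(4, 1), p2=(3, 2)):
    """Encode one bit in the order of two agreed DCT coefficients."""
    B = dct2(block)
    (u1, v1), (u2, v2) = p1, p2
    wrong_order = ((bit == 0 and B[u1][v1] > B[u2][v2]) or
                   (bit == 1 and B[u1][v1] < B[u2][v2]))
    if wrong_order:
        B[u1][v1], B[u2][v2] = B[u2][v2], B[u1][v1]
    return idct2(B)

def extract_bit(block, p1=(4, 1), p2=(3, 2)):
    B = dct2(block)
    (u1, v1), (u2, v2) = p1, p2
    return 1 if B[u1][v1] > B[u2][v2] else 0

# Synthetic test block built from three known coefficients (illustrative values).
coeffs = [[0.0] * N for _ in range(N)]
coeffs[0][0], coeffs[4][1], coeffs[3][2] = 1000.0, 50.0, -20.0
block = idct2(coeffs)
for bit in (0, 1):
    print(bit, extract_bit(embed_bit(block, bit)))  # embedded bit, recovered bit
```

If the two coefficients happen to be equal, the rule cannot encode a 1; the sketch leaves this case as in the procedure above.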

Advantages

This method is more robust to attacks, such as compression, cropping etc.

Though the embedding capacity is low, the quality of image is good.

IWT based embedding

Step 1: Read the cover image as a 2D file with size of 256×256 pixels.

Step 2: R, G and B planes are separated

Step 3: Consider a secret data as text file. Here each character will take 8 bits.

Step 4: Histogram modification is done in all planes because the secret data is to be embedded in all the planes, and embedding in the integer wavelet coefficients may otherwise produce stego-image pixel values greater than 255 or less than 0. So all the pixel values are restricted to the range 15 to 240.

Step 5: Each plane is divided into 8 × 8 blocks.

Step 6: Apply the Haar integer wavelet transform to the 8 × 8 blocks of all the planes. This process results in the LL1, LH1, HL1 and HH1 sub-bands.

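Step 7: Using Key-1 (K1), calculate the bit length (L) for the corresponding wavelet coefficients (C0):

\[ L = \begin{cases} k + 3, & \text{if } C_0 \ge 2^{k+3} \\ k + 2, & \text{if } 2^{k+2} \le C_0 < 2^{k+3} \\ k + 1, & \text{if } 2^{k+1} \le C_0 < 2^{k+2} \\ k, & \text{if } C_0 < 2^{k+1} \end{cases} \]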

Step 8: Using Key-2, select the positions and coefficients for embedding the L-length data using LSB substitution [159]. Here data is embedded only in the LH1, HL1 and HH1 sub-bands. Data is not embedded in LL1 because its coefficients are highly sensitive, and in order to maintain good visual quality after embedding.

Step 9: Apply the Optimal Pixel Adjustment Procedure (OPAP) to reduce the error caused by the LSB substitution method [159].

Step 10: Take the inverse wavelet transform of each 8 × 8 block and combine the R, G and B planes to produce the stego image.

Extraction algorithm

Step 1: Read the Stego image as a 2D file with size of 256 × 256 pixels.

Step 2: R, G and B planes are separated

Step 3: Each plane is divided into 8 × 8 blocks

Step 4: Apply the Haar integer wavelet transform to the 8 × 8 blocks of all the planes. This process results in the LL1, LH1, HL1 and HH1 sub-bands.

Step 5: Using Key-1, calculate the bit length (L) for the corresponding wavelet coefficients (C0), using the bit length equation of the embedding procedure.

Step 6: Using Key-2, select the positions and coefficients for extracting the L-length data.

Step 7: Combine all the bits and divide them into groups of 8 bits to get the text message.
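Two building blocks of the IWT procedure can be sketched as follows: the reversible integer Haar step on a pair of values, and the bit length rule of Step 7. This is a minimal sketch under stated assumptions: the lifting form used here (the S-transform) is one common way to obtain an integer Haar transform and is not confirmed by the source, the bit length rule is assumed to be the four-case rule L = k + 3, k + 2, k + 1 or k compared against powers of two, k is assumed to be derived from Key-1, and all function names are hypothetical.

```python
def haar_forward(a, b):
    """Integer Haar step on one pair: floor average (low) and difference (high)."""
    return (a + b) >> 1, a - b

def haar_inverse(s, d):
    """Exact inverse of haar_forward; no rounding error is introduced."""
    a = s + ((d + 1) >> 1)
    return a, a - d

def bit_length(c0, k):
    """Number of bits L to embed in coefficient c0 (assumed four-case rule)."""
    # Applied to c0 as written; a practical version might use its magnitude.
    if c0 >= 1 << (k + 3):
        return k + 3
    if c0 >= 1 << (k + 2):
        return k + 2
    if c0 >= 1 << (k + 1):
        return k + 1
    return k

print(haar_forward(5, 2))       # (3, 3)
print(haar_inverse(3, 3))       # (5, 2)
print(haar_forward(2, 5))       # (3, -3)
print(haar_inverse(3, -3))      # (2, 5)
print([bit_length(c, 3) for c in (5, 20, 40, 70)])  # [3, 4, 5, 6]
```

Because the transform maps integers to integers and is exactly invertible, the floating point conversion problem mentioned earlier for wavelet embedding does not arise.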

To summarize, this review suggests that a simple classification of steganography methods could be based on spatial or transform domain techniques. In addition, steganography methods can be classified by cover type, i.e., video steganography, text-based steganography, audio steganography and image steganography [5]. The former could be further classified into substitution, transform, statistical, spread spectrum and cover generation methods [2]. Another classification of steganography is into pure steganography, secret key steganography and public key steganography [2]. The comparative performance of the aforementioned methods is given in Table 2.7, and the cover and stego images are given in Fig 2.8 (a-n). In addition, this survey suggests a classification of random image steganography based on imperceptibility, capacity and robustness, which is given in Fig 2.9.

Table 2.7 Comparison of the aforementioned methods

Methods | Domain | Imperceptibility | Capacity | Robustness | Complexity/Security | Adaptive Embedding
Simple LSB | Spatial | Fair till k = 3 bit | k bits/Pixel | Low | Low | Constant k bits/Pixel
OPAP | Spatial | Good even for k = 4 bit | k bits/Pixel | Low | Low | Constant k bits/Pixel
IP | Spatial | Best even for k = 4 bit | k bits/Pixel | Low | Good | Constant k bits/Pixel
PVD | Spatial | Good | Lesser than k bits/Pixel | Low | Good | Adaptive k bits/Pixel
DCT | Transform | Fair | Low | Medium | Excellent | Depends on cover & method
IWT | Transform | Best | Moderate | High | Excellent | Adaptive

Figure 2.8. Selected cover images 256 × 256: (a) Lena, (b) Baboon

Figure 2.8. (c-h) Stego images, Lena: (c) simple LSB, (d) OPAP, (e) PVD, (f) IP, (g) DCT, (h) IWT

Figure 2.8. (i-n) Stego images, Baboon: (i) simple LSB, (j) OPAP, (k) PVD, (l) IP, (m) DCT, (n) IWT

Figure 2.9. Classification on random image steganography (diagram labels: Random Image Steganography; Spatial Domain Methods; Frequency Domain Methods; Imperceptibility; Capacity; Robustness; Adaptive Random Image Steganography; Randomness through Preprocessing; Randomness while Embedding)

2.2.8. Pseudo random permutation steganography

This method is highly deceptive: it makes complete use of the cover image, so that no bit is left unconsidered, and it embeds the data in a random fashion governed by a secret key, which leaves the hacker with no hint. Hence it is extremely tedious to break into data hidden with such methods.

This random fashion of encoding can be achieved with multiple keys (k1, k2, k3 and so on) or by the creation of multiple element indices (J1, …, Jm).

2.2.9. Recent trends suggested in random image steganography

Random image steganography is defined as image steganography which offers a cryptic effect. So far, the literature expects and suggests that well-defined cryptographic algorithms be employed prior to embedding the confidential information in order to offer the cryptic effect. Random image steganography, however, preprocesses the data, then encrypts it with well-defined cryptographic algorithms, and then adopts any one of the possible ways listed here to improve the complexity.

Possible ways of random image steganography

Method 1: K bit embedding with a key 1 [0 0 0 0 1 1 0 1] in a pixel

Method 2: Key 2 [1 0 1 0 1 1 0 1] in a cover

Method 3: Using the Fibonacci series 1, 2, 3, 5, 8, 13, …; if a value exceeds the length of the cover, use it modulo the length of the cover (problem: collision attacks)

Method 4: Variable bit embedding on the pixel (PVD)

Method 5: Encrypt the secret then embed

Else, better, change the “ROUTE” for embedding...

Method 1: k - bit embedding with a key 1 [0 0 0 0 1 1 0 1] in a pixel

It embeds data into the pixels based on the four LSBs of the key.

1) If key = [0 0 0 0 1 1 0 1], then embedding should be in the 1st (2^0), 3rd (2^2) and 4th (2^3) LSBs.

2) If key = [0 0 0 0 0 1 0 1], then embedding should be in the 1st (2^0) and 3rd (2^2) LSBs.

Merits: The embedding capacity, MSE and PSNR depend on the key. Randomness is introduced in this method if the keys are other than [0 0 0 0 0 0 0 1], [0 0 0 0 0 0 1 1], [0 0 0 0 0 1 1 1] and [0 0 0 0 1 1 1 1], as in these cases it is normal LSB substitution with 1, 2, 3 and 4 bits respectively.

Complexity: If the data is encrypted using DES, it will introduce a complexity of (2^64).

Probability of embedding 0 bits in the last four LSBs is ((4C0)/16) = (1/16)

Probability of embedding 1 bit in the last four LSBs is ((4C1)/16) = (4/16)

Probability of embedding 2 bits in the last four LSBs is ((4C2)/16) = (6/16)

Probability of embedding 3 bits in the last four LSBs is ((4C3)/16) = (4/16)

So the final complexity of embedding 1-bit is: (2^64)*(1/16)*(1+4+6+4)

To improve complexity: good when the keys [0 0 0 0 0 0 0 1], [0 0 0 0 0 0 1 1], [0 0 0 0 0 1 1 1] and [0 0 0 0 1 1 1 1] are not used.

Imperceptibility: Visible to some extent if the key used is [0 0 0 0 1 1 1 1].

Suggestion: Do not embed more than 3 bits in a gray image and 8 bits in a colour image.
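Method 1 can be sketched as follows for a single pixel. This is an illustrative sketch only; the function names are hypothetical, and the key is assumed to be written most significant bit first, as in the examples above.

```python
from math import comb

def key_positions(key):
    """Bit positions (0 = LSB) selected by the four LSBs of an 8-bit key."""
    low4 = key[-4:]                       # e.g. [1, 1, 0, 1]
    return [pos for pos in range(4) if low4[3 - pos] == 1]

def embed_with_key(pixel, bits, key):
    """Write one message bit into every bit position selected by the key."""
    for pos, bit in zip(key_positions(key), bits):
        pixel = (pixel & ~(1 << pos)) | (bit << pos)
    return pixel

def extract_with_key(pixel, key):
    return [(pixel >> pos) & 1 for pos in key_positions(key)]

key = [0, 0, 0, 0, 1, 1, 0, 1]
print(key_positions(key))                 # [0, 2, 3]: 1st, 3rd and 4th LSBs
stego = embed_with_key(160, [1, 0, 1], key)
print(stego)                              # 169
print(extract_with_key(stego, key))       # [1, 0, 1]

# Number of 4-bit key patterns that select 0, 1, 2, 3 or 4 bit positions.
print([comb(4, j) for j in range(5)])     # [1, 4, 6, 4, 1]
```

The binomial counts reproduce the numerators 1, 4, 6 and 4 used in the probability lines above.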

Method 2: Key 2 [1 0 1 0 1 1 0 1] in a cover

It embeds data into the pixels selected on the basis of the key.

Example: 1) If key = [1 0 1 0 1 1 0 1], then embedding should be done in pixels 1, 3, 5, 6 and 8. This sequence of embedding should be repeated for every block of eight pixels for complete embedding.

2) If key = [1 1 0 0 0 1 1 1], then embedding should be done in pixels 1, 2, 6, 7 and 8. This sequence of embedding should be repeated for every block of eight pixels for complete embedding.

Merits: It has a relatively lower MSE and a relatively higher PSNR. Randomness is introduced in this method if the key is other than [1 1 1 1 1 1 1 1], as in this case it is normal raster scan LSB substitution with zero randomness.

Complexity: If the data is encrypted using DES, it will introduce a complexity of (2^64).

To improve complexity: good when the key [1 1 1 1 1 1 1 1] is not used.

Imperceptibility: visible to some extent if the key [1 1 1 1 1 1 1 1] is used with 4-bit embedding in each pixel.
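Method 2 can be sketched in a similar way. This is a minimal sketch; the function names are hypothetical, pixel indices are 0-based in the code (the text counts from 1), and one bit per selected pixel is assumed for simplicity.

```python
def select_pixels(num_pixels, key):
    """Indices of the pixels picked by repeating the key over the cover."""
    return [i for i in range(num_pixels) if key[i % len(key)] == 1]

def embed_keyed(pixels, bits, key):
    """Write one message bit into the LSB of every selected pixel."""
    stego = list(pixels)
    for idx, bit in zip(select_pixels(len(pixels), key), bits):
        stego[idx] = (stego[idx] & ~1) | bit
    return stego

def extract_keyed(stego, n_bits, key):
    return [stego[i] & 1 for i in select_pixels(len(stego), key)][:n_bits]

key = [1, 0, 1, 0, 1, 1, 0, 1]
print(select_pixels(8, key))                 # [0, 2, 4, 5, 7], i.e. pixels 1, 3, 5, 6, 8
cover = [10, 11, 12, 13, 14, 15, 16, 17]
stego = embed_keyed(cover, [1, 1, 0, 0, 1], key)
print(stego)                                 # [11, 11, 13, 13, 14, 14, 16, 17]
print(extract_keyed(stego, 5, key))          # [1, 1, 0, 0, 1]
```

With the all-ones key every pixel is selected and the scheme reduces to raster scan LSB substitution, as noted above.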

Method 3: Using the Fibonacci series 1, 2, 3, 5, 8, 13, …. If a value exceeds the length of the cover, it is taken modulo the length of the cover. The problem with this is collision attacks.

In this method, pixels are selected for embedding according to whether their row and column numbers belong to the Fibonacci series. Two methods are possible. In the first methodology, embedding is done in all the row and column numbers that are part of the Fibonacci series. In the second methodology, embedding is done in all the row and column numbers that are not part of the Fibonacci series.

Example: In method 1, rows 1, 2, 3, 5, 8, … are selected, and in each such row pixels 1, 2, 3, 5, 8, … are selected for embedding.

In method 2, rows 1, 2, 3, 5, 8, … are not selected, and in each remaining row pixels 1, 2, 3, 5, 8, … are not selected either. All other pixels are selected for embedding.

Merits: Method 1 gives more randomness and imperceptibility, while method 2 gives more embedding capacity.

Complexity analysis: If the data is encrypted using DES, it will introduce a complexity of (2^64).

For method 1 (for a 256 × 256 image; the complexity varies with the size of the image):

The probability of selecting a row which is part of the Fibonacci series is (12/256).

The probability of selecting a column which is part of the Fibonacci series is (12/256).

The 4 bits for embedding in the four LSBs can be selected in 1 way.

So the final complexity of embedding 1-bit is: (2^64)*(256/12)*(256/12)*1

For method 2 (for a 256 × 256 image; the complexity varies with the size of the image):

The probability of selecting a row which is not part of the Fibonacci series is (244/256).

The probability of selecting a column which is not part of the Fibonacci series is (244/256).

The 4 bits for embedding in the four LSBs can be selected in 1 way.

So the final complexity of embedding 1-bit is: (2^64)*(256/244)*(256/244)*1
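The counts 12 and 244 used in this analysis can be reproduced with a short sketch. The function names are hypothetical, rows and columns are numbered from 1 as in the text, and the variant that selects only non-Fibonacci rows and columns is an assumption about how the second methodology is meant.

```python
def fibonacci_indices(limit):
    """Distinct Fibonacci numbers 1, 2, 3, 5, 8, ... that do not exceed limit."""
    out, a, b = [], 1, 2
    while a <= limit:
        out.append(a)
        a, b = b, a + b
    return out

def select_positions(size, use_fibonacci=True):
    """(row, column) pairs chosen by the first (True) or second (False) variant."""
    fib = set(fibonacci_indices(size))
    chosen = [i for i in range(1, size + 1) if (i in fib) == use_fibonacci]
    return [(r, c) for r in chosen for c in chosen]

fib = fibonacci_indices(256)
print(fib)                                   # [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]
print(len(fib), 256 - len(fib))              # 12 244
print(len(select_positions(256, True)))      # 144 pixels for the first variant
print(len(select_positions(256, False)))     # 59536 pixels for the second variant
```

The pixel counts illustrate the trade-off stated in the merits: the first variant is sparse, while the second offers far more embedding capacity.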


Method 4: Variable bit embedding on the pixel (PVD)

This method embeds data into the pixels based on the nature of the image. If the region has an edge, more bits are embedded, and if it is a smooth area, fewer bits are embedded.

Merits:

Since it takes the image texture into consideration, the imperceptibility is very high. It is also complex due to the variable bit embedding.

Complexity analysis:

If the data is encrypted using DES, it will introduce a complexity of (2^64). The variable bit embedding increases the complexity further.

Imperceptibility: the visibility of the embedding is low because the method takes the image texture into consideration.

Method 5: Encrypt the secret then embed

Here the data is simply encrypted first using some encryption algorithm, and then embedding is done with the encrypted data and not with the original data. Security increases because of the encryption, but it also depends on the embedding algorithm. This method can be combined with any of the methods above.

2.3. A REVIEW ON STEGANALYSIS

Steganalysis, in a nutshell, tests the strength of a stego system. Technically, however, it aims at detecting the presence of secret messages [176]. Deriving its roots from cryptanalysis, it has since its origin been at constant war with its rival in principle, steganography. Steganalysis systems usually apply various image processing techniques such as cropping, filtering and resizing, or are designed to deduce the presence of a stego system by computing statistical properties, e.g. histograms, correlations and chi-square values. Steganalysts use two kinds of analysis, namely passive and active. Passive steganalysis does not preserve the payload; it merely distorts or corrupts the secret information present in the cover, defeating the aim of safe transmission of hidden messages. Active steganalysis, in contrast, aims to detect the exact algorithm that is employed in hiding the data and extracts the payload rather than destroying it.

The methods that these systems follow can be broadly classified into

steganalysis for specific embedding and universal blind steganalysis [45, 93, 103, 177].

Specific steganalysis is used to efficiently determine the secret data and the bit

embedding ratio. Steganalysis in particular has been found to be primarily effective

against spatial steganography. It has been able to detect any trace of unusual noise,

relations between indexed colours and patterns between colour palettes. The method is

rather fragile, though [178].

LSB shows weak resistance to filtering, distortion, scaling, rotation,

cropping, addition of noise and lossy compression, and hence is completely vulnerable to

any kind of passive analysis. In fact, the entire message can be destroyed by removing the

entire LSB plane causing minimal perceivable difference [124]. Algorithms like RS,

SPA, DIH and LSM can detect the spatial LSB steganography reliably.
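
The remark that removing the LSB plane destroys the message while barely changing the image can be illustrated with a few lines of code. This is a sketch under simple assumptions: the stego data is a flat list of 8-bit pixel values with made-up contents, and the helper names are hypothetical.

```python
def clear_lsb_plane(pixels):
    """Zero the least significant bit of every 8-bit pixel value."""
    return [p & 0xFE for p in pixels]

def read_lsbs(pixels):
    """Return the least significant bit of every pixel."""
    return [p & 1 for p in pixels]

stego = [200, 201, 57, 58, 13, 12, 255, 0]   # illustrative stego pixel values
cleaned = clear_lsb_plane(stego)

bits_before = read_lsbs(stego)     # the hidden bits carried by the LSB plane
bits_after = read_lsbs(cleaned)    # all zero once the plane is removed
max_change = max(abs(a - b) for a, b in zip(stego, cleaned))
print(bits_before, bits_after, max_change)
```

No pixel changes by more than one grey level, yet every embedded bit is overwritten.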

Fridrich et al. [179] were able to extract messages embedded in LSB with

embedding capacities as small as 0.03 bpp, by understanding the inner structure of

LSBs. Kong et al. pointed out that a “pair effect” takes place in LSB substitution,

which can be observed from its histogram, particularly in cases which use the modulus

operator [127].

Chi-square (χ²) and pair analysis algorithms are very effective in the spatial

domain. Chi square is a non-parametric statistical characteristic of an image that accounts

for the confidence of data present to be uncorrupted and random. This particular

characteristic can determine whether the image intensities follow any distributed pattern

or random pattern. If the intensity levels pertaining to a specific distributed pattern are

identified, then any pixels that belong to these intensity levels can be reliably marked as

corrupted or as pixels with a high probability of data embedding. However, a simple way to

beat this algorithm would be by embedding random chunks of data distributed in a

pseudo-random fashion, destroying the true significance of the chi-square data received from

the altered image. Bohne et al. developed a method to detect randomly scattered data in

the LSB domain which made use of the Preserving Statistical Properties (PSP) algorithm [150].

The chi-square statistic can be computed as

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed pixel value and $E_i$ is the expected pixel value for the ith pixel.

Jessica et al. [181] presented a statistical method which operates on higher-order statistics, called RS statistics, and provides a rough estimate of the number of pixels flipped by embedding.
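
The chi-square statistic described above can be sketched in code. The version below is an assumption about how it is commonly applied to LSB substitution: the observed value is the count of each even intensity 2k, and the expected value is the mean of the counts of 2k and 2k+1, since full LSB substitution tends to equalise each such pair (the "pair effect" mentioned earlier). The cover is artificial, with all-even values chosen only to make the imbalance obvious, and `chi_square_pairs` is a hypothetical name.

```python
import random
from collections import Counter

def chi_square_pairs(pixels):
    """Chi-square statistic over the intensity pairs (2k, 2k+1)."""
    hist = Counter(pixels)
    statistic = 0.0
    for k in range(128):
        observed = hist[2 * k]
        expected = (hist[2 * k] + hist[2 * k + 1]) / 2
        if expected > 0:                      # skip pairs that do not occur
            statistic += (observed - expected) ** 2 / expected
    return statistic

rng = random.Random(0)                        # fixed seed for repeatability
cover = [2 * rng.randrange(128) for _ in range(4000)]        # artificial cover
stego = [(p & ~1) | rng.getrandbits(1) for p in cover]       # LSBs replaced

stat_cover = chi_square_pairs(cover)
stat_stego = chi_square_pairs(stego)
print(stat_cover, stat_stego)
```

A small statistic means the pairs are nearly equalised, which is what full sequential LSB substitution produces. Spreading a short message pseudo-randomly over the image leaves most pairs untouched, so the statistic stays closer to that of the cover; this is the weakness of the test noted above.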

Li et al. [182] exploited the weakness of the YASS algorithm [118], observing the

extra zero coefficients in the embedded host blocks caused by the use of quantization

index modulation (QIM) and the contrasting statistical features derived from stego image

blocks.

However, the practicality of specific embedding steganalysis remains widely

questioned, since analysers would actually find it difficult to zero in on a particular

embedding pattern. The embedding method, being the steganographer's choice, would be

difficult to determine. Hence a more practical approach to steganalysis is taken,

namely universal blind steganalysis [177]. In this approach, primary importance is paid to

the flexibility or the adaptability of the method to improvise and train itself, accustoming

itself to the rather unknown stego system. It is a meta detection system in the sense that it

can be adjusted, after training on cover and stego images, to detect any steganographic

method regardless of the embedding domain.
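
The train-then-detect idea can be sketched with a toy classifier. Everything here is an assumption made for illustration: the covers are synthetic lists of even values, the stego images are produced by plain LSB replacement, only two hand-picked features are used, and the classifier is a nearest-centroid rule. Practical blind steganalysers use far richer feature sets and stronger classifiers.

```python
import random

def features(pixels):
    """Two toy features: mean imbalance of (2k, 2k+1) pairs, fraction of odd pixels."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    imbalances = []
    for k in range(128):
        total = hist[2 * k] + hist[2 * k + 1]
        if total:
            imbalances.append(abs(hist[2 * k] - hist[2 * k + 1]) / total)
    odd_fraction = sum(p & 1 for p in pixels) / len(pixels)
    return (sum(imbalances) / len(imbalances), odd_fraction)

def centroid(vectors):
    """Component-wise mean of a list of feature vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def classify(vector, centroids):
    """Return the label of the nearest centroid (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(vector, centroids[label]))

rng = random.Random(1)                         # fixed seed for repeatability

def make_cover(n=2048):
    return [2 * rng.randrange(128) for _ in range(n)]        # synthetic cover

def make_stego(cover):
    return [(p & ~1) | rng.getrandbits(1) for p in cover]    # LSB replacement

# Training: learn one centroid per class from labelled cover and stego images.
centroids = {
    "cover": centroid([features(make_cover()) for _ in range(20)]),
    "stego": centroid([features(make_stego(make_cover())) for _ in range(20)]),
}

# Detection on unseen images.
test_set = [("cover", make_cover()) for _ in range(10)]
test_set += [("stego", make_stego(make_cover())) for _ in range(10)]
correct = sum(classify(features(img), centroids) == label for label, img in test_set)
accuracy = correct / len(test_set)
print(accuracy)
```

The detector never sees the embedding rule itself, only labelled examples, which is what makes the approach adaptable to an unknown stego system. The clean separation here is a consequence of the artificial covers and should not be read as typical performance.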

Universal blind steganalysis has received keen attention in recent

times. These methods can further be broadly classified into two groups. In the first,

cover and stego images are distinguished using the original images, employing a training set

and extracted features to detect the embedded data. Since this particular category of

methods remains in the dark about the methods employed in embedding, they are labelled as

blind steganalysis methods. Jessica et al. [183] designed a steganalysis system for the

frequency domain which comprised DCT features and calibrated Markov features.

The other group largely deals with a training set built by applying various

steganography methods to the original image and comparing the resultant images against

the stego object. Analysers in this method have narrowed the suspicion down to a few

specific methods but remain unsure of the method that has been particularly used on the

object. These methods are hence known as half-blind methods. It is worth mentioning that,

since these methods are based on developments in certain generalized methods, they can

be used to detect and develop new steganography methods.

Of late, trained classifiers are being developed to detect secret messages. Avcibas

et al. proposed the framework for classifier-based steganalysis using image quality

metrics [184, 185]. Farid also investigated the problem faced by classifier-based

steganalysis and argued that the classifier scheme was effective in dealing with the

variable image statistics and algorithms of unknown stego systems [45, 186].

Research also points out that a single steganalysis system is usually incapable of

detecting the data by itself, and strengthening the system would actually imply increasing

the number of features present in the system. The work of Liu et al., consisting of a few

single-feature steganalysis systems, went on to validate the same [187]. Avcibas et al. used ten

image quality metrics as a feature set and employed 18 features from binary similarity

measures of the seventh and eighth bit planes in an image for classification [184].

Similarly, Fridrich et al. extracted 23 calibrated features from the DCT and

spatial domains [188]. These works helped in subsequently converging into a universal

classifier which exploited 81 features extracted from the higher-order absolute moments

of residual noise in the wavelet domain. In the light of more recent developments, Chen

et al. adapted 324 features from the statistical moments of wavelet characteristic

functions [189].

However, oversizing the steganalysis method with more features could adversely

affect its performance. Hence, the concept of feature selection was brought to light,

wherein selection of optimised features helped to improve the performance of the systems yet

still deal with a variety of stego methods. Feature selection ruled out redundant features,

limiting the set to features that could help steganalysts observe even the smallest sensitivities and

subsequently aid in attacking the system, strengthening the entire model.
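
One simple way to rank features for selection is sketched below. It is an assumption, not the procedure of any cited work: each feature is scored by a Fisher-style ratio (squared gap between the class means divided by the sum of the class variances) and the highest-scoring features are kept. The feature names `f0`, `f1`, `f2` and their values are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)

def fisher_score(cover_values, stego_values):
    """Squared gap between class means divided by the summed class variances."""
    gap = (mean(cover_values) - mean(stego_values)) ** 2
    spread = variance(cover_values) + variance(stego_values)
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread

# Made-up feature values measured on four cover and four stego images.
cover = {"f0": [0.9, 1.0, 1.1, 1.0], "f1": [0.5, 0.4, 0.6, 0.5], "f2": [2.0, 2.1, 1.9, 2.0]}
stego = {"f0": [0.2, 0.3, 0.1, 0.2], "f1": [0.5, 0.6, 0.4, 0.5], "f2": [2.2, 2.3, 2.1, 2.2]}

scores = {name: fisher_score(cover[name], stego[name]) for name in cover}
ranked = sorted(scores, key=scores.get, reverse=True)
selected = ranked[:2]            # keep the two most discriminative features
print(selected)
```

Here `f1` has the same mean in both classes, so it contributes nothing to separating cover from stego and is dropped, while `f0` and `f2` are retained.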

2.4. SUMMARY

In this chapter, a brief survey ranging from infant steganography to mature

steganography has been presented. Starting with the definition, differences from other

security guards like cryptography and watermarking have been highlighted. In addition,

this chapter also discusses the major steganographic algorithms used for digital imaging.

The emerging techniques like DCT, DWT, IWT and Adaptive Steganography alter

coefficients in the transform domain, thus keeping the image distortion at a minimum

level. This property makes them less prone to attacks with the drawback of lesser payload

in comparison with the spatial domain algorithms (LSB, Modified LSB, PVD), further

highlighting the point of tradeoff between robustness and payload. Methods like

compression or correlated steganography which are based on the conditional entropy of

the message given the cover can be used to reduce the number of bits required to encode

the hidden message.

Scholars have conflicting views regarding the importance of robustness for the

steganographic system design. While Cox felt that watermarking would be differentiated

from steganography mainly by the high robustness characteristic of watermarking

[190], Katzenbeisser dedicated a sub-section to robust steganography, mentioning

robustness to be a practical requirement for a steganographic system: “Many

steganography systems are designed to be robust against a specific class of mapping” [2].

Both of their views are based upon their personal experience in the field and opinions. In

general, if somebody suspects a covert communication, then the goal of steganography is

defeated. Hence, robustness is needed for watermarking, but definitely not for

steganography.

This chapter offered some guidelines and recommendations on the design of

possible ways of random image steganography, which is the major motivation of this

review, and these have been enumerated and explained with practical examples. As a

finishing touch to the review of steganography, the significance of steganalysis has also been

highlighted.
