Steganography and Block-based Quantitative Steganalysis
Haiqiang Wang
Advisor: C.-C. Jay Kuo
Viterbi School of Engineering, USC
07/26/2014
Haiqiang Wang Steganography and Steganalysis 1 / 38
Outline
1 Introduction
    Definition
    Information Hiding
2 Research on steganography
    Transform domain
    Spatial domain
3 Research on Steganalysis
    Machine Learning Approach
    Curse of Dimensionality
4 Block-based Quantitative Steganalysis
    Why block-based
    Block-based steganalysis approach
Introduction Definition
What are steganography and steganalysis?
Definition
Steganography is the art of communicating in a way that hides the existence of the communication.
Steganalysis is the science of detecting or estimating hidden data from observed data, with little knowledge of the steganography algorithm.
Figure 1: Communication with invisible ink
Framework of Steganography and Steganalysis
Figure 2: Components of the steganographic channel
Cover X: input image; Stego Y: output image
Payload p: relative message length (bpp: bits per pixel)
HOLUB, V. CONTENT ADAPTIVE STEGANOGRAPHY: DESIGN AND DETECTION. Diss. State University of New York, 2014.
Introduction Information Hiding
Common information hiding techniques
Figure 3: Information hiding classification
Nagaraj, V., et al. "Overview of Digital Steganography Methods and Its Applications." IJAST 60 (2013): 45-58.
Examples of Watermarking
Figure 4: Watermarking used in documentation
Examples of Watermarking
Figure 5: Perceptible and Imperceptible Watermarking used in video
Example of Steganography
Figure 6: Steganography using UNIWARD
Comparison with Watermarking
Table 1: Comparison of Watermarking and Steganography
              Watermarking                       Steganography
Goal          copyright protection               covert communication
Information   host or owner                      any kind of information
Perceptible   either visible or imperceptible    statistically undetectable
Receiver      point-to-multiple points           point-to-point
Capacity      not important                      important
Robustness    important                          not necessary
Why do we study Steganography and Steganalysis?
Used for military purposes, by intelligence services, or by terrorists via public access channels
  In October 2001, an article in the New York Times claimed that the terrorists used covert communication to prepare and execute the 11 September 2001 terrorist attack
Figure 7: Number of publications
Research on steganography Transform domain
How to embed information while minimizing distortion?
Transform domain (wavelet, DCT) and spatial domain
  Transform domain: JP Hide&Seek, Jsteg, MBS1, MMX, nsF5, OutGuess and Perturbed Quantization (PQ)
  Spatial domain: LSB, HUGO, WOW and UNIWARD
Brute-force embedding and empirical embedding
  Content-adaptive spatial domain embedding performs better
Typical algorithms (PQ, LSB) and state-of-the-art (UNIWARD).
Pevny, T., et al. "From blind to quantitative steganalysis." SPIE Electronic Imaging. 2009.
Perturbed Quantization in DCT domain
Think of EE 669 homework 2 (JPEG compression)
8 × 8 image blocks, level-shifted by −128, two-dimensional DCT, divided by the quantization matrix, rounded to the nearest integer, then entropy coded
When the fractional part is close to 0.5, the coefficient can be rounded either up or down, and the choice carries a message bit
Disadvantage: a single embedding change can alter pixel values by more than 1
Figure 8: Key idea of Perturbed Quantization
Fridrich, Jessica, et al. "Perturbed quantization steganography." Multimedia Systems 11.2 (2005): 98-107.
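The quantization steps above, and the selection of coefficients whose fractional part lies near 0.5, can be sketched roughly as follows. This is a minimal illustration, not the algorithm from the cited paper: the flat quantization table, the tolerance `tol`, the parity-based bit mapping and all function names are assumptions made for this example, and the coding layer that lets the receiver locate the changed coefficients is left out.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so the 2-D DCT of a block B is C @ B @ C.T."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    x = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def changeable_coefficients(block, q_table, tol=0.1):
    """Return the unrounded quantized DCT coefficients of one 8x8 block and a
    mask of coefficients whose fractional part is within tol of 0.5."""
    c = dct_matrix(8)
    shifted = block.astype(np.float64) - 128.0   # level shift
    scaled = (c @ shifted @ c.T) / q_table       # DCT, then divide by the table
    frac = scaled - np.floor(scaled)
    return scaled, np.abs(frac - 0.5) < tol

def embed_bits(scaled, mask, bits):
    """Round all coefficients; at masked positions choose the rounding
    direction whose parity equals the message bit (assumed bit mapping)."""
    rounded = np.rint(scaled).astype(np.int64)
    for (u, v), bit in zip(np.argwhere(mask), bits):
        low = int(np.floor(scaled[u, v]))
        rounded[u, v] = low if low % 2 == bit else low + 1
    return rounded

# Hypothetical usage on a random block with a flat quantization table.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8))
q_table = np.full((8, 8), 16.0)
scaled, mask = changeable_coefficients(block, q_table)
bits = rng.integers(0, 2, size=int(mask.sum()))
stego = embed_bits(scaled, mask, bits)
```

Because only near-0.5 coefficients are touched, the rounding error at a changed position stays close to the 0.5 that ordinary rounding could already produce.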
Research on steganography Spatial domain
Least Significant Bit Replacement
Figure 9: LSB replacement embedding
Ker, Andrew D. "Steganalysis of embedding in two least-significant bits." IEEE Transactions on Information Forensics and Security 2.1 (2007): 46-54.
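A minimal sketch of LSB replacement on an 8-bit grayscale image is given below. The sequential (row-major) embedding path and the function names are assumptions for illustration; practical schemes usually visit pixels along a key-dependent pseudo-random path.

```python
import numpy as np

def lsb_embed(cover, bits):
    """Overwrite the least significant bit of the first len(bits) pixels
    (row-major order) with the message bits."""
    flat = cover.astype(np.uint8).ravel().copy()
    bits = np.asarray(bits, dtype=np.uint8)
    if bits.size > flat.size:
        raise ValueError("message longer than the number of pixels")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(cover.shape)

def lsb_extract(stego, n_bits):
    """Read the message back from the LSBs of the first n_bits pixels."""
    return stego.ravel()[:n_bits] & 1

# Hypothetical usage: hide 40 random bits in an 8x8 image.
rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
message = rng.integers(0, 2, size=40, dtype=np.uint8)
stego = lsb_embed(cover, message)
recovered = lsb_extract(stego, message.size)
```

Each pixel changes by at most 1, which is why the change is invisible to the eye even though it is statistically detectable.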
Weakness of LSB
Figure 10: Embedding changes statistics
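One well-known statistical side effect of LSB replacement is that it only swaps values within the pairs (2k, 2k+1), so the two histogram bins of each pair are pulled toward each other. The sketch below measures that effect on a synthetic cover; the deliberately skewed cover (even values only) and the function name are assumptions chosen to make the effect obvious.

```python
import numpy as np

def pair_imbalance(image):
    """Sum over all value pairs (2k, 2k+1) of the absolute difference
    between the two histogram bins."""
    hist = np.bincount(image.ravel(), minlength=256)
    return int(np.abs(hist[0::2] - hist[1::2]).sum())

rng = np.random.default_rng(0)
# Synthetic cover with even values only, so every pair is fully unbalanced.
cover = (2 * rng.integers(0, 16, size=(100, 100))).astype(np.uint8)
# Full-payload LSB replacement with random message bits.
bits = rng.integers(0, 2, size=cover.shape, dtype=np.uint8)
stego = (cover & 0xFE) | bits

before = pair_imbalance(cover)
after = pair_imbalance(stego)
```

On this cover the imbalance drops sharply after embedding, which is the kind of histogram change a detector can exploit.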
UNIWARD–UNIversal WAvelet Relative Distortion
Minimizes a well-defined embedding distortion function
  Embeds in noisy regions and complex textures, and avoids smooth regions
Figure 11: Steganography using UNIWARD
Denemark, Tomas, et al. "Further Study on the Security of S-UNIWARD." SPIE Electronic Imaging. International Society for Optics and Photonics, 2014.
Filter banks
$\beta = \{K^{(1)}, K^{(2)}, K^{(3)}\}$ to evaluate smoothness in multiple directions
$K^{(1)} = h \cdot g^{T}, \quad K^{(2)} = g \cdot h^{T}, \quad K^{(3)} = g \cdot g^{T}$
Figure 12: low-pass and high-pass filter
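The three kernels are outer products of a 1-D low-pass filter h and a 1-D high-pass filter g. A small sketch follows; it uses the 2-tap Haar pair purely as a stand-in so the numbers are easy to read, which is an assumption of this example and not necessarily the wavelet shown in the figure.

```python
import numpy as np

def filter_bank(h, g):
    """Build the directional kernels K1 = h g^T, K2 = g h^T, K3 = g g^T
    from 1-D low-pass (h) and high-pass (g) filters."""
    h = np.asarray(h, dtype=float)
    g = np.asarray(g, dtype=float)
    return [np.outer(h, g), np.outer(g, h), np.outer(g, g)]

# Haar filters, used here only for illustration.
haar_h = np.array([1.0, 1.0]) / np.sqrt(2.0)
haar_g = np.array([1.0, -1.0]) / np.sqrt(2.0)
k1, k2, k3 = filter_bank(haar_h, haar_g)
```

Since g sums to zero, every kernel sums to zero as well, so a perfectly flat image region produces zero residual in all three directions.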
Wavelet decomposition
The directional residual is the convolution between the filter and the image:
$$W^{(k)} = K^{(k)} \star X \tag{1}$$
The distortion between the cover X and the stego image Y is defined as:
$$D(X,Y) = \sum_{k=1}^{3} \sum_{u,v} \frac{|W^{(k)}_{uv}(X) - W^{(k)}_{uv}(Y)|}{\varepsilon + |W^{(k)}_{uv}(X)|} \tag{2}$$
$W^{(k)}_{uv}$ is the $(u,v)$-th wavelet coefficient of the $k$-th decomposition
For each filter, compute the change of each wavelet coefficient relative to its magnitude in the cover image
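The distortion above can be evaluated directly from its definition. The sketch below is an illustration under stated assumptions: Haar kernels stand in for the filter bank, the stabilizing constant is set to 1, borders use symmetric padding, and all function names are made up for this example.

```python
import numpy as np

def conv2_same(image, kernel):
    """2-D convolution with symmetric padding; output has the image's shape."""
    kh, kw = kernel.shape
    top, left = kh // 2, kw // 2
    padded = np.pad(image.astype(float),
                    ((top, kh - 1 - top), (left, kw - 1 - left)),
                    mode="symmetric")
    flipped = kernel[::-1, ::-1]          # flip for true convolution
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for a in range(kh):
        for b in range(kw):
            out += flipped[a, b] * padded[a:a + rows, b:b + cols]
    return out

def uniward_distortion(x, y, kernels, eps=1.0):
    """Sum over kernels and positions of |W(X) - W(Y)| / (eps + |W(X)|)."""
    total = 0.0
    for k in kernels:
        wx, wy = conv2_same(x, k), conv2_same(y, k)
        total += np.sum(np.abs(wx - wy) / (eps + np.abs(wx)))
    return float(total)

# Haar kernels (assumption) and two 16x16 covers: flat and noisy.
h = np.array([1.0, 1.0]) / np.sqrt(2.0)
g = np.array([1.0, -1.0]) / np.sqrt(2.0)
kernels = [np.outer(h, g), np.outer(g, h), np.outer(g, g)]

rng = np.random.default_rng(0)
smooth = np.full((16, 16), 128.0)
noisy = rng.integers(0, 256, size=(16, 16)).astype(float)

def plus_one_at_center(img):
    changed = img.copy()
    changed[8, 8] += 1.0
    return changed

d_smooth = uniward_distortion(smooth, plus_one_at_center(smooth), kernels)
d_noisy = uniward_distortion(noisy, plus_one_at_center(noisy), kernels)
```

The same +1 change costs more in the flat image than in the noisy one, because the cover residuals in the denominator are zero there.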
Embedding cost function
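The additive approximation (with subscript "A") of the distortion function is:
$$D_A(X,Y) = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \rho_{ij}(X, Y_{ij}) \, [X_{ij} \neq Y_{ij}] \tag{3}$$
$\rho_{ij}(X, Y_{ij})$ is the distortion of changing only the $ij$-th pixel:
$$\rho_{ij}(X, Y_{ij}) = D(X, Y_{ij}) \tag{4}$$
Here the second argument of $D$ denotes the cover $X$ with only its $ij$-th pixel changed to $Y_{ij}$.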
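A brute-force sketch of the per-pixel cost and the additive approximation follows. It is meant only to make the definitions concrete on a tiny image: Haar kernels, a stabilizing constant of 1, symmetric border padding, a +1 change for every cost, and all function names are assumptions of this example.

```python
import numpy as np

H = np.array([1.0, 1.0]) / np.sqrt(2.0)    # Haar low-pass (assumption)
G = np.array([1.0, -1.0]) / np.sqrt(2.0)   # Haar high-pass (assumption)
KERNELS = [np.outer(H, G), np.outer(G, H), np.outer(G, G)]

def residual(image, kernel):
    """Convolve image with a 2x2 kernel, symmetric padding, same output size."""
    padded = np.pad(image, ((1, 0), (1, 0)), mode="symmetric")
    flipped = kernel[::-1, ::-1]
    rows, cols = image.shape
    out = np.zeros((rows, cols))
    for a in range(2):
        for b in range(2):
            out += flipped[a, b] * padded[a:a + rows, b:b + cols]
    return out

def distortion(x, y, eps=1.0):
    """Non-additive distortion: relative residual change summed over kernels."""
    return float(sum(
        np.sum(np.abs(residual(x, k) - residual(y, k)) / (eps + np.abs(residual(x, k))))
        for k in KERNELS))

def pixel_costs(x, eps=1.0):
    """rho[i, j] = distortion of changing only pixel (i, j) by +1."""
    rho = np.zeros(x.shape)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            y = x.copy()
            y[i, j] += 1.0   # pixel range clipping is ignored in this sketch
            rho[i, j] = distortion(x, y, eps)
    return rho

def additive_distortion(x, y, rho):
    """Sum the costs of all pixels that differ between x and y."""
    return float(rho[x != y].sum())

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(10, 10)).astype(float)
rho = pixel_costs(x)
y = x.copy()
y[2, 2] += 1.0
y[6, 6] += 1.0   # far from (2, 2), so the two changes do not interact
d_true = distortion(x, y)
d_additive = additive_distortion(x, y, rho)
```

For changes that are far apart the additive form matches the full distortion; it becomes an approximation only when the filter supports of changed pixels overlap.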
Embedding cost
Figure 13: Cover image and embedding distortion
Why does it work?
$$D(X,Y) = \sum_{k=1}^{3} \sum_{u,v} \frac{|W^{(k)}_{uv}(X) - W^{(k)}_{uv}(Y)|}{\varepsilon + |W^{(k)}_{uv}(X)|} \tag{5}$$
A pixel in a noisy region has large wavelet coefficients, so its embedding distortion is small
Even one smooth direction (a small $|W^{(k)}_{uv}(X)|$) leads to a large embedding distortion
The selected pixels are hard to model in all directions
Embedding distribution of UNIWARD
Figure 14: Content adaptivity of UNIWARD
Research on Steganalysis Machine Learning Approach
Two approaches
Statistical signal detection
  Derives the detector from a statistical model
  Attacks a specific embedding algorithm and needs sufficient samples
Standard machine learning approach
  Ensemble classifier
  Support Vector Machine (SVM): binary estimation
  Support Vector Regression (SVR): quantitative estimation
Ker, Andrew D., et al. "Moving steganography and steganalysis from the laboratory into the real world." Proceedings of the first ACM workshop on Information hiding and multimedia security. ACM, 2013.
Ensemble classifier
Lower training complexity, yet able to handle large feature sets
Built on bootstrap aggregation (bagging)
Figure 15: Diagram of ensemble classifier
HOLUB, V. CONTENT ADAPTIVE STEGANOGRAPHY: DESIGN AND DETECTION. Diss. State University of New York, 2014.
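A rough sketch of a bagged ensemble is shown below: each base learner is a Fisher linear discriminant trained on a bootstrap sample and a random subset of the features, and the final decision is a majority vote. The class name, the default sizes, the midpoint threshold and the synthetic data are assumptions for illustration and are not taken from the cited thesis.

```python
import numpy as np

class FLDEnsemble:
    """Majority vote over Fisher linear discriminants (cover = 0, stego = 1)."""

    def __init__(self, n_learners=31, subspace_dim=5, seed=0):
        self.n_learners = n_learners
        self.subspace_dim = subspace_dim
        self.seed = seed
        self.learners = []

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        n, d = X.shape
        self.learners = []
        for _ in range(self.n_learners):
            feats = rng.choice(d, size=min(self.subspace_dim, d), replace=False)
            rows = rng.integers(0, n, size=n)            # bootstrap sample
            Xb, yb = X[rows][:, feats], y[rows]
            mu0, mu1 = Xb[yb == 0].mean(axis=0), Xb[yb == 1].mean(axis=0)
            sw = np.cov(Xb[yb == 0], rowvar=False) + np.cov(Xb[yb == 1], rowvar=False)
            sw = sw + 1e-6 * np.eye(len(feats))          # small regularizer
            w = np.linalg.solve(sw, mu1 - mu0)
            b = -0.5 * w @ (mu0 + mu1)                   # midpoint threshold (simplification)
            self.learners.append((feats, w, b))
        return self

    def predict(self, X):
        votes = sum((X[:, f] @ w + b > 0).astype(int) for f, w, b in self.learners)
        return (2 * votes > len(self.learners)).astype(int)

# Synthetic stand-in for cover/stego features: two shifted Gaussian classes.
rng = np.random.default_rng(1)

def make_set(n):
    y = np.repeat([0, 1], n // 2)
    X = rng.normal(size=(n, 20)) + 1.5 * y[:, None]
    return X, y

X_train, y_train = make_set(400)
X_test, y_test = make_set(400)
model = FLDEnsemble().fit(X_train, y_train)
accuracy = float((model.predict(X_test) == y_test).mean())
```

Each base learner only ever inverts a small scatter matrix, which is what keeps training cheap even when the full feature vector has thousands of dimensions.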
Support vector regression
Principle: Use regression tools to learn the relationship between the feature vector and the payload
Assumption: Features change predictably with payload
$$\hat{\psi} = \arg\min_{\psi \in \mathcal{F}} \frac{1}{L} \sum_{i=1}^{L} e(\psi(x_i), y_i) \tag{6}$$
$x_i = f(c_i) \in \mathbb{R}^d$ is the feature vector computed from image $c_i$ embedded with rate $y_i \in [0, 1]$.
Find a mapping function $\hat{\psi} : \mathbb{R}^d \to [0, 1]$ that minimizes the estimation error
Pevny, T., et al. "From blind to quantitative steganalysis." SPIE Electronic Imaging. 2009.
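The minimization above can be made concrete with a deliberately simple stand-in. The sketch below takes the error e to be the squared error and restricts the mapping to linear functions, which gives ridge regression; this is not SVR (SVR uses an epsilon-insensitive loss and typically a kernel), and the synthetic features, function names and regularization weight are assumptions for illustration only.

```python
import numpy as np

def fit_payload_regressor(features, payloads, ridge=1e-3):
    """Minimize mean squared error over linear maps (ridge regression).
    A bias column is appended; for simplicity it is regularized too."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ payloads)

def predict_payload(weights, features):
    """Apply the learned map and clip the estimate to the valid range [0, 1]."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.clip(X @ weights, 0.0, 1.0)

# Synthetic features that drift linearly with payload, plus noise. This
# mirrors the assumption that features change predictably with payload.
rng = np.random.default_rng(0)
dim = 10
direction = np.linspace(0.5, 1.5, dim)

def synth(n):
    y = rng.uniform(0.0, 1.0, size=n)
    x = y[:, None] * direction + 0.05 * rng.normal(size=(n, dim))
    return x, y

x_train, y_train = synth(500)
x_test, y_test = synth(200)
weights = fit_payload_regressor(x_train, y_train)
estimates = predict_payload(weights, x_test)
mae = float(np.mean(np.abs(estimates - y_test)))
```

Swapping in SVR would change only the fitting step; the interface of feature vectors in and a payload estimate in [0, 1] out stays the same.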
Support vector regression
Figure 16: SVR used in quantitative steganalysis
Research on Steganalysis Curse of Dimensionality
Feature dimension problem
Table 2: Features for Steganalysis (Spatial domain)
Name   Dimension   Year published   Authors
SPAM   548         2010             T. Pevny and J. Fridrich
CDF    1234        2010             J. Kodovsky and J. Fridrich
SRM    34671       2012             J. Fridrich and J. Kodovsky
PSRM   12870       2013             V. Holub and J. Fridrich
CSR    1183        2014             T. Denemark and J. Fridrich
Block-based Quantitative Steganalysis Why block-based
Why block-based approach
Ideas from video coding: 16*16, 16*8, 8*4, ...
Extremely high-dimensional features are needed to handle heterogeneous images
Group homogeneous blocks together
Knowledge about the embedding distribution (ROI)
Embedding targets noisy regions, edges and complex texture, and avoids smooth regions
Ability to achieve better payload estimation
Region of Interest
(a) ROI at payload = 0.25 (b) Embedding dist. at payload = 0.25
Figure 17: ROI and embedding distribution at a large payload
Block-based Quantitative Steganalysis Block-based steganalysis approach
Block-based steganalysis approach
1 Block classification
2 Different features for different block groups
3 Next: support vector regression
4 Future work: adaptive block partition
Block classification
Three block types: smooth, edge and texture
Figure 18: Cover image and embedding distortion
Block classification
Figure 19: Distortion map and classification grid
Feature extraction
1183-dimensional CSR (content-selective residual) feature
Figure 20: CSR feature components
Denemark, Tomas, et al. "Further Study on the Security of S-UNIWARD." SPIE Electronic Imaging. International Society for Optics and Photonics, 2014.
Feature extraction
Residual order and truncation (Th and Tc):
Embedding probability
$$p_{ij} = \frac{\exp(-\lambda \rho_{ij})}{1 + \exp(-\lambda \rho_{ij})} \tag{7}$$
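The sketch below evaluates this probability for a cost map and finds the parameter lambda by bisection so that the total binary entropy of the change probabilities matches a target payload. Tying lambda to the payload through the entropy is the usual model for binary embedding with near-optimal coding and is stated here as an assumption; the cost map is random stand-in data and the function names are hypothetical.

```python
import numpy as np

def change_probabilities(rho, lam):
    """p = exp(-lam*rho) / (1 + exp(-lam*rho)), written as 1 / (1 + exp(lam*rho))
    with the exponent capped to avoid overflow."""
    return 1.0 / (1.0 + np.exp(np.minimum(lam * rho, 700.0)))

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) change, elementwise."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def solve_lambda(rho, payload_bpp, iters=60):
    """Bisection on lambda so that the mean entropy equals payload_bpp.
    Assumes all costs are positive and payload_bpp is below 1 bit per pixel."""
    target = payload_bpp * rho.size
    total = lambda lam: binary_entropy(change_probabilities(rho, lam)).sum()
    lo, hi = 0.0, 1.0
    while total(hi) > target and hi < 1e9:   # grow the bracket if needed
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) > target:
            lo = mid                          # entropy too high: increase lambda
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Random stand-in costs for a 32x32 image and a payload of 0.25 bpp.
rng = np.random.default_rng(0)
rho = rng.uniform(0.1, 10.0, size=(32, 32))
lam = solve_lambda(rho, 0.25)
p = change_probabilities(rho, lam)
achieved_bpp = float(binary_entropy(p).sum() / rho.size)
```

Pixels with a low cost end up with a change probability near 0.5, while expensive pixels are almost never changed.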
Feature extraction
Pixel classes: compare $p_{ij}$ with two parameters
  Type L: when $p_{ij}(X, \alpha) > t_L$
  Type s: when $p_{ij}(X, \alpha) < t_s$
Example: 1st order, 1D histogram
  $2^{d+1} = 4$ pixel classes: [s s], [s L], [L s] and [L L]
  Classes reduce from $2^2 = 4$ to 3 in 1st order, $2^3 = 8$ to 6 in 2nd order, and $2^4 = 16$ to 10 in 3rd order
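The class counts above can be checked with a short enumeration. The sketch assumes that two class patterns are merged when one is the reverse of the other (a direction symmetry); the slides do not state the merging rule, but this assumption reproduces the counts 3, 6 and 10. The threshold values in the usage lines are hypothetical.

```python
from itertools import product

def pixel_type(p, t_small, t_large):
    """Label a pixel by its embedding probability: 'L' above t_large,
    's' below t_small, None if it falls between the two thresholds."""
    if p > t_large:
        return "L"
    if p < t_small:
        return "s"
    return None

def merged_classes(length):
    """All s/L patterns of the given length, with each pattern merged
    with its reverse (assumed symmetry)."""
    return sorted({min(c, c[::-1]) for c in product("sL", repeat=length)})

# Patterns of length 2, 3 and 4 before and after merging.
raw_counts = [2 ** n for n in (2, 3, 4)]
merged_counts = [len(merged_classes(n)) for n in (2, 3, 4)]

# Hypothetical thresholds, for illustration only.
labels = [pixel_type(p, t_small=0.05, t_large=0.2) for p in (0.01, 0.1, 0.3)]
```

Under this symmetry the mixed patterns [s L] and [L s] share one histogram, which is where the reduction from 4 to 3 comes from.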
Feature reduction
The feature dimension is too large for 32*32 blocks
Figure 21: CSR feature of smooth, edge and texture blocks
Feature reduction
PCA to reduce each feature subset
ANOVA to select the feature set for each block group
Figure 22: Reduced feature dimension with PCA
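A minimal PCA sketch for reducing one feature subset is shown below. It uses a plain SVD of the centered feature matrix; the synthetic low-rank data, the number of components and the function names are assumptions for illustration, and the ANOVA selection step is not covered.

```python
import numpy as np

def pca_fit(features, n_components):
    """Return the feature mean, the top principal directions (as rows) and
    the fraction of variance explained by each of them."""
    mean = features.mean(axis=0)
    _, s, vt = np.linalg.svd(features - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]

def pca_transform(features, mean, components):
    """Project centered features onto the principal directions."""
    return (features - mean) @ components.T

# Synthetic 50-dimensional features driven by 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
features = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

mean, components, explained = pca_fit(features, n_components=3)
reduced = pca_transform(features, mean, components)
```

In practice the number of retained components would be chosen per feature subset, for example from the cumulative explained variance.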
Future work
Quantitative Steganalysis using SVR
Adaptive block partition