Image Level Forgery Identification and Pixel Level Forgery Localization via a Convolutional Neural Network

Haitao D. Deng¹ & Yitao Qiu²
¹Department of Materials Science and Engineering
²Department of Civil and Environmental Engineering

Introduction

The ease of manipulating digital data with editing and cropping tools such as Photoshop and Photo Editor has often negatively impacted the credibility of information. While several successful cases of forgery detection [3, 6] have been demonstrated [5, 7], progress toward a generic detection technique has been stagnant for two main reasons. One, there are various fundamentally different forgery types. Two, it is difficult to pinpoint the location of forged regions [3,5,6,7,9].

Category   Features/Models   # Parameters   Forgery Types
ML [4]     CFA               <10            S, CM, R
ML [11]    ELA               <10            S, CM, R
ML [8]     NOI               <10            S, CM, R
DL [1]     Bayar             ~20M           S
DL [2]     SRM               ~50k           S, CM, R
DL [10]    Artificial        7M             S, CM, R, E
S: splicing, CM: copy-move, R: removal, E: enhancement

Here, we report on a light-weight (5k and 700k parameters) VGG-derived convolutional neural network architecture that allows for image-level forgery detection with an accuracy of 91% and an AUC of 85% on test data, and for pixel-level forgery detection with an accuracy of 93% and an AUC of 79%.

Dataset and Features

Dataset   Forged images   Forged pixels
Train1    31.09%          9.10%
Dev1      29.20%          8.42%
Test1     30.20%          8.02%
Train2    98.39%          16.13%
Dev2      97.13%          16.85%

All of the data was obtained from The Image Manipulation Dataset¹ and the COCO Dataset². A total of 10000 images (224 pixels × 224 pixels) were obtained and split into a training set, a development set, and a test set at a ratio of 8:1:1. In this work, the forgery types include a) copy-move, b) local enhancement, and c) splicing.
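
The 8:1:1 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_indices`, the shuffling step, and the fixed seed are assumptions made for the example.

```python
import random

def split_indices(n_images, ratios=(8, 1, 1), seed=0):
    """Shuffle image indices and split them by the given ratios (hypothetical helper)."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # shuffling and the seed are assumptions
    total = sum(ratios)
    n_train = n_images * ratios[0] // total
    n_dev = n_images * ratios[1] // total
    train = indices[:n_train]
    dev = indices[n_train:n_train + n_dev]
    test = indices[n_train + n_dev:]  # the remainder goes to the test set
    return train, dev, test

train_idx, dev_idx, test_idx = split_indices(10000)
print(len(train_idx), len(dev_idx), len(test_idx))  # 8000 1000 1000
```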

Image Level Forgery Identification

Following the VGG network architecture and using a far-to-near approach, we built our network as below.
For training, images were split into smaller 56×56 patches to increase the number of training samples.
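
As a rough sketch of the patching step, the snippet below tiles a 224×224 image into 56×56 patches with NumPy. It assumes non-overlapping tiles and a channels-last layout, neither of which is stated in the text; `extract_patches` is a hypothetical helper name.

```python
import numpy as np

def extract_patches(image, patch=56):
    """Cut an (H, W, C) image into non-overlapping patch x patch tiles."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be a multiple of patch"
    # Split each spatial axis into (tile index, offset inside tile).
    tiles = image.reshape(h // patch, patch, w // patch, patch, c)
    # Bring the two tile indices to the front, then flatten them into one axis.
    tiles = tiles.transpose(0, 2, 1, 3, 4)
    return tiles.reshape(-1, patch, patch, c)

image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
patches = extract_patches(image)
print(patches.shape)  # (16, 56, 56, 3)
```

Under these assumptions each 224×224 image yields 16 patches, ordered row by row.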
We augmented the training data by manually enhancing local pixel colors, multiplying the local pixel values by a random coefficient (a.k.a. enhancement) in non-private regions (Train1 to Train2). This increased the ratio of forged image samples and forged pixels, and achieved better performance on the same test data. This suggests that our classification network was able to capture the common features shared between different manipulation types.
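
The enhancement augmentation can be illustrated roughly as below. The square region, the coefficient range of 0.5 to 1.5, and the helper name `enhance_region` are assumptions chosen for the example; the text only states that local pixel values are multiplied by a random coefficient.

```python
import numpy as np

def enhance_region(image, mask, top, left, size, rng, low=0.5, high=1.5):
    """Scale a square region by a random coefficient and mark it as forged.

    The region shape and the (low, high) range are illustrative assumptions.
    """
    forged = image.astype(np.float32).copy()
    new_mask = mask.copy()
    coeff = rng.uniform(low, high)
    region = (slice(top, top + size), slice(left, left + size))
    forged[region] = np.clip(forged[region] * coeff, 0, 255)
    new_mask[region] = 1  # pixel-level ground truth: 1 = forged
    return forged.astype(np.uint8), new_mask, coeff

rng = np.random.default_rng(0)
image = np.full((56, 56, 3), 100, dtype=np.uint8)
mask = np.zeros((56, 56), dtype=np.uint8)
forged, forged_mask, coeff = enhance_region(image, mask, 10, 20, 16, rng)
print(forged_mask.mean())  # fraction of pixels now labeled as forged
```

Applying such a step to otherwise unmodified images raises both the share of forged images and the share of forged pixels, which is the direction of change between Train1 and Train2.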
By adjusting the hyperparameters, including the learning rate and the optimizer, we found that Adam with an initial learning rate of 0.001 gave the highest training accuracy and the lowest loss.

Pixel Level Image Forgery Localization

In the image-level forgery identification network, the pooling layers condense the spatial information down to fewer pixels for the final classification. To achieve pixel-wise prediction, the pooling layers were discarded, producing a pixel-wise feature extractor. The output of the feature extractor is then fed into the local anomaly detection network proposed by Wu et al. [11].
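
To make the effect of discarding the pooling layers concrete, the sketch below tracks the side length of a feature map through a stack of blocks. It assumes 'same'-padded convolutions, 2×2 pooling with stride 2, and three blocks; these values are illustrative and are not taken from the poster.

```python
import numpy as np

def max_pool_2x2(features):
    """2x2 max pooling with stride 2 on an (H, W, C) feature map."""
    h, w, c = features.shape
    blocks = features[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2, c)
    return blocks.max(axis=(1, 3))

def spatial_sizes(size, n_blocks, use_pooling):
    """Side length after each block, assuming convolutions keep the spatial size."""
    sizes = [size]
    for _ in range(n_blocks):
        if use_pooling:
            size //= 2  # each pooling layer halves the resolution
        sizes.append(size)
    return sizes

print(spatial_sizes(56, 3, use_pooling=True))   # [56, 28, 14, 7]
print(spatial_sizes(56, 3, use_pooling=False))  # [56, 56, 56, 56]
print(max_pool_2x2(np.zeros((56, 56, 8))).shape)  # (28, 28, 8)
```

Without pooling, the output keeps one feature vector per input pixel, which is what a pixel-wise localization head needs.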
Our full model achieves better performance than the untrained model from [10] and similar performance to the partially trained one.
To avoid the potential problem of vanishing gradients and to expedite the training process, we also modified the network in C by adding shortcut paths every two convolution layers in the intermediate blocks (Model D), and found similar performance with less training time.
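
A shortcut path around two convolution layers can be sketched as follows. To keep the example short it uses 1×1 convolutions (per-pixel linear maps) and 8 channels, both of which are simplifying assumptions; the actual kernel sizes and widths of Model D are not given in the text.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: a per-pixel linear map over the channels of an (H, W, C) map."""
    return x @ weight

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Two convolution layers with an identity shortcut around them."""
    out = relu(conv1x1(x, w1))
    out = conv1x1(out, w2)
    return relu(out + x)  # shortcut: add the block input back before the activation

rng = np.random.default_rng(0)
x = rng.standard_normal((56, 56, 8))
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # (56, 56, 8)
```

Because the input is added back unchanged, the gradient has a direct path through the block even when the two convolution layers contribute little, which is the usual motivation for this design.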

Model                    Trained Parameters       Accuracy   AUC
ManTraNet                0 (7M in total)          0.93       0.67
LADN trained ManTraNet   0.2M (7M in total)       0.91       0.798
Our Model                0.27M (0.27M in total)   0.92       0.794
Our model with ResNet    0.27M (0.27M in total)   0.93       0.78
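
The table reports both accuracy and AUC. One possible reason to look at both is that forged pixels are a small minority (about 8% in Test1), so accuracy alone can look high for a predictor that flags nothing. The sketch below uses a hypothetical 1000-pixel mask with 8% forged pixels and a pairwise definition of AUC; it is an illustration, not the evaluation code behind the table.

```python
import numpy as np

def accuracy(labels, scores, threshold=0.5):
    """Fraction of pixels whose thresholded score matches the label."""
    return float(np.mean((scores >= threshold) == (labels == 1)))

def auc(labels, scores):
    """Probability that a random forged pixel scores above a random authentic one
    (ties count as one half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

# Hypothetical mask with 8% forged pixels.
labels = np.zeros(1000, dtype=int)
labels[:80] = 1
all_authentic = np.zeros(1000)  # a predictor that never flags any pixel

print(accuracy(labels, all_authentic))  # 0.92
print(auc(labels, all_authentic))       # 0.5
```

In this toy case the accuracy is 0.92 while the AUC stays at chance level, so the AUC column is the more informative one when comparing the rows above.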

Our full model is demonstrated below (source image from [10]):

Conclusion and Future Work

Here we deliver a light-weight network architecture that achieves high performance in both image-level identification and pixel-level localization. Future effort can be focused on condensing the LADN network, as well as on incorporating more features, using filtering kernels that generate features such as CFA and ELA, to improve network performance.

References

[1] Bayar & Stamm, ACM, 2016, pp. 5–10.
[2] Bayar & Stamm, IEEE Trans. Inf. Forensics Secur., 13.11 (2018), pp. 2691–2706.
[3] Birajdar & Mankar, Digit. Invest., 10.3 (2013), pp. 226–245.
[4] Ferrara et al., IEEE Trans. Inf. Forensics Secur., 7.5 (2012), pp. 1566–1577.
[5] Hill & Rager, "Image Forgery Detection".
[6] Hsu & Chang, ICME, Toronto, Canada, 2006.
[7] Huh et al., ECCV, 2018, pp. 101–117.
[8] Mahdian & Saic, Image. Vision. Comput., 27.10 (2009), pp. 1497–1503.
[9] Rao & Ni, WIFS, IEEE, 2016, pp. 1–6.
[10] Simonyan & Zisserman, arXiv:1409.1556 (2014).
[11] Wu et al., CVPR, 2019, pp. 9543–9552.
[12] Zhou et al., CVPR, 2018, pp. 1053–1061.

¹ https://www5.cs.fau.de/research/data/image-manipulation/
² http://cocodataset.org/#home