Citation: Lambourne, AD and Elliott, JR and Miller, S and Collins, L and Schreuders, ZC (2018) Software Pilot and User Guide EWT: Chat Log Grooming Detection. Manual. CSI Centre Leeds Beckett University. (Unpublished)
Link to Leeds Beckett Repository record: https://eprints.leedsbeckett.ac.uk/id/eprint/5078/
Document Version: Monograph (Published Version)
Software Pilot and User Guide EWT: Chat Log Grooming Detection
Andrew Lambourne, John Elliott, Stephen Miller, Lewis Collins, and Z. Cliffe Schreuders
Leeds Beckett University and West Yorkshire Police
This software is currently available for police use. See below for details on obtaining EWT and participating in the pilot.
The CARI Project
The CARI Project is a large-scale collaboration between West Yorkshire Police and the Cybercrime and Security Innovation Centre (CSI Centre) at Leeds Beckett University. The CARI Project aims to improve and incorporate an evidence-based approach into the policing of digital forensics and cybercrime investigations. An extensive needs assessment of UK policing and cybercrime and digital evidence was conducted to understand the current situation, and to identify needs across the force. The CARI Project also involved implementing a training and research programme that has impacted the capability of the digital forensics and cyber units within West Yorkshire Police to engage in research. This needs assessment and research training led to the development of a set of research proposals, which were scored and selected. Subsequently, academics and police staff co-produced 9 research and development workstreams: a framework for seizure, preservation and preservation of cloud evidence; automated forensic analysis; image linkage for victim identification and framework for image fingerprint management; automated grooming detection; frontline officer awareness development and decision support mobile app; assessment of methods of cyber training; an evaluation of the role of the Digital Media Investigator within WYP; and characteristics of victims of cybercrime. Each of these projects was designed to address needs within law enforcement and outputs include evidence-based procedures, new capabilities such as software/algorithms, and actionable intelligence.
This work was supported by a Police Knowledge Fund grant, administered by the Home Office, College of Policing, and the Higher Education Funding Council for England (HEFCE).
EWT: Chat Log Grooming Detection
The aim of this work is the creation of tools and techniques to detect and flag evidence of predatory behaviour by scanning chat and other social media logs extracted from seized equipment. Although the research literature proposes techniques for modelling and detecting “luring” dialogues (Olson et al., 2007; Leatherman, 2009), Digital Forensics Units (DFU) typically rely on manual review, searching, and simple keyword lists. Based on analysis of this information and the associated academic papers, a series of predatory speech acts were identified as being relevant to the automated detection of grooming. The EWT scanning algorithm was tuned and refined first against PJ logs and also against real world data.
The result is a software tool that can be used by investigators to automate log-file screening to quickly filter through chat logs to identify evidence. The approach is sufficiently generic to allow the same tool to be used with different lexicons to detect other dialogues of concern such as cyber-stalking and radicalisation or terrorist recruitment.
Current status: we are piloting the software in a number of forces (started May 2018). UK forces are invited to contact DCI Vanessa Smith (vanessa.smith@westyorkshire.pnn.police.uk) and Dr Z. Cliffe Schreuders (c.schreuders@leedsbeckett.ac.uk) to join the pilot.
Contents
How to install
Starting EWT
How the software is used
Step 1) Import chat logs.
Step 2) Process chat logs.
Step 3) View results.
Step 4) Providing feedback.
Information sheet
Consent statement
How to install
EWT has been developed and tested on Linux and Windows.
Step 1) Install Anaconda, Python [version illegible] version
Anaconda can be downloaded from here:
https://www.anaconda.com/download
This includes all the various libraries you need to run EWT.
Alternatively, you could manually install Python [version illegible], Qt [version illegible], and all the required Python modules, including xlsxwriter, argparse, and hashlib. However, installing Anaconda is the recommended approach.
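For a manual install, a quick check that the listed modules can be imported may save time before launching EWT. The following is a minimal sketch, not part of EWT itself: the helper name `missing_modules` is hypothetical, and the Qt binding is left out because the guide does not name the Python package that provides it.

```python
import importlib.util

# Module names taken from the guide. The Qt binding is omitted because
# the guide does not say which Python package provides it.
REQUIRED_MODULES = ["xlsxwriter", "argparse", "hashlib"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are available.")
```

If anything is reported missing, installing Anaconda as recommended above is likely the simplest fix.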
Step 2) Save the EWT software to a location of your choosing (such as a folder in your home directory). If you received EWT as a zip file, this involves extracting the contents of that file to a PC.
Note: you will launch the software from this folder. If you want to add a shortcut to your desktop, you can copy the ewt_windows shortcut to your desktop, then edit the shortcut by right clicking the copy of the shortcut, selecting Properties, and the Target field can be edited to insert the full path where you have saved EWT just before ewt_gui.py. For example: it becomes ... start python C:\EWT\ewt_gui.py ...
Starting EWT
On Windows with Anaconda, simply double click on ewt_windows
On Linux with Anaconda, simply run ewt_linux.sh
Or on Windows or Linux with manual install, start a bash shell in the EWT directory and run:
python ewt_gui.py
How the software is used
First, create a case.
Step 1) Import chat logs. The tool imports TSV exports, which most forensic tools can easily export chat logs to.
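To illustrate what a TSV export looks like to a program, here is a minimal sketch that reads tab-separated chat messages with Python's standard csv module. The column names (`timestamp`, `sender`, `message`) and the function name `read_chat_tsv` are assumptions for illustration only; the guide does not specify the columns EWT expects, and exports from forensic tools will differ.

```python
import csv
import io


def read_chat_tsv(handle):
    """Read tab-separated rows into a list of dicts keyed by the header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical two-message export; real exports will have other columns.
sample = io.StringIO(
    "timestamp\tsender\tmessage\n"
    "2018-05-01 10:00\tuser_a\thello\n"
    "2018-05-01 10:01\tuser_b\thi there\n"
)
messages = read_chat_tsv(sample)
print(len(messages), messages[0]["sender"])
```

To read a file on disk instead, pass a handle opened with `open(path, newline="", encoding="utf-8")`.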
Step 3) View results. Simply double click to open each results file, which are spreadsheets with filters for quickly viewing and making sense of the chat logs.
Every message has been analysed and assigned a matching speech act category. Categories include:
● Direct Sexual: explicitly sexual
● Indirect Sexual: sexual innuendo and desensitisation
● Age: related to age
● Pleading/Demanding: begging, pleading, and demanding
● Approach: in person meet ups and initiating contact
● Groomers: groomers talking to groomers
● Personal: exchanging personal information, including discussing appearance, family, home, and discussing photos, webcams
● Compliments: appearance compliments
● Negative: refusals, and negative responses to advances
● Trust: related to building trust, and reassuring
● Isolation: isolation from others
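The guide does not describe how EWT assigns these categories internally, beyond noting that it works with lexicons. As a rough illustration of lexicon-based matching in general, the sketch below checks a message against phrase lists per category. The phrases, the `LEXICON` structure, and the `categorise` function are placeholders invented for this example, and this sketch can return several categories for one message; it is not the EWT lexicon or algorithm.

```python
import re

# Placeholder lexicon for illustration only; not the EWT lexicon.
LEXICON = {
    "Age": ["how old", "years old"],
    "Approach": ["meet up", "come over"],
    "Compliments": ["you look nice"],
}


def categorise(message, lexicon=LEXICON):
    """Return the sorted list of categories whose phrases occur in message."""
    text = message.lower()
    hits = set()
    for category, phrases in lexicon.items():
        for phrase in phrases:
            # Word boundaries stop a phrase matching inside longer words.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                hits.add(category)
                break
    return sorted(hits)


print(categorise("How old are you? We could meet up."))
```

Swapping in a different dictionary is all it takes to target another kind of dialogue, which is the sense in which a lexicon-driven approach is generic.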
A summary document (...summary.xlsx) provides an overview of the results of analysis. This view can help to provide insights such as one sided conversations, and chat files with a high number of hits against these categories.
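The kind of overview described here can be pictured as simple counting. The sketch below is an assumption about what such a summary might compute, not a description of the EWT summary file: it tallies category hits per file and messages per sender per file, so that files with many hits and one sided conversations stand out. The tuple layout and the name `summarise` are hypothetical.

```python
from collections import Counter


def summarise(coded_messages):
    """Count category hits per file and messages per sender per file.

    coded_messages: iterable of (file_name, sender, category) tuples,
    where category is None for a message with no hit.
    """
    hits = {}
    senders = {}
    for file_name, sender, category in coded_messages:
        hits.setdefault(file_name, Counter())
        senders.setdefault(file_name, Counter())[sender] += 1
        if category is not None:
            hits[file_name][category] += 1
    return hits, senders


# Hypothetical coded messages from two chat files.
coded = [
    ("chat1.tsv", "user_a", "Age"),
    ("chat1.tsv", "user_a", "Approach"),
    ("chat1.tsv", "user_a", None),
    ("chat1.tsv", "user_b", None),
    ("chat2.tsv", "user_c", None),
]
hits, senders = summarise(coded)
print(sum(hits["chat1.tsv"].values()), senders["chat1.tsv"].most_common(1))
```

In this example chat1.tsv has two hits and is dominated by one sender, while chat2.tsv has none.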
Drilling down, each chat log is stored in a separate spreadsheet, coded against the speech acts:
Contents in these images have been censored.
We classify the first six categories (Direct Sexual, Indirect Sexual, Age, Pleading/Demanding, Approach, and Groomers) as Red (highest concern), the others as Amber.
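The Red/Amber split can be written down as a small lookup. This is a sketch of the rule as stated in this guide, using the category names listed earlier; the function name `traffic_light` and the example messages are hypothetical, and returning `None` for an unrecognised category is a choice made for this example.

```python
# The first six categories are Red (highest concern); the rest are Amber.
RED_CATEGORIES = {
    "Direct Sexual", "Indirect Sexual", "Age",
    "Pleading/Demanding", "Approach", "Groomers",
}
AMBER_CATEGORIES = {
    "Personal", "Compliments", "Negative", "Trust", "Isolation",
}


def traffic_light(category):
    """Map a category name to 'Red' or 'Amber'; None if unrecognised."""
    if category in RED_CATEGORIES:
        return "Red"
    if category in AMBER_CATEGORIES:
        return "Amber"
    return None


# Hypothetical coded messages, filtered down to the Red ones.
coded = [("msg 1", "Approach"), ("msg 2", "Trust"), ("msg 3", "Age")]
red_only = [text for text, cat in coded if traffic_light(cat) == "Red"]
print(red_only)
```

In the spreadsheets themselves the same effect is achieved with the built-in column filters.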
For a quick view of the highest concern messages, messages can be filtered to Red traffic light:
Likewise, messages can be filtered down at the category level, to detect attempts to meet:
Or focusing on sexual comments:
Step 4) Providing feedback. We ask that during the pilot, police users provide us with feedback regarding their use of EWT.
Please complete a short survey within the software, for each and every case during the pilot period. There are only a few questions to answer, and this input is extremely valuable for continued research and development.
Please let us know whether this is a live or historical case; whether it has or could have assisted the investigation; and how much time EWT saved compared to typical manual/keyword searches.
Based on the judgement of the investigator, state which files were grooming (that is, evidence or directly relevant to the investigation) or suspicious (related or helpful but not directly useful as evidence).
There are just two questions where we ask for some written feedback:
Please provide some written feedback, with any positive or negative feedback or suggestions. A sentence or two at minimum please.
Finally, whether there are any chat messages that have been incorrectly flagged or missed. The user is prompted to redact any sensitive content (such as names and places) and provide some examples.
The program saves these feedback responses along with redacted versions of the analysis outputs to a feedback folder. The redacted versions do not contain any message contents or participant names. We ask that the feedback folder be provided back to us for research purposes. Over the following months we will arrange for these to be returned to us.
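Forces reviewing the feedback folder may find it helpful to picture redaction as dropping the sensitive fields from each coded row. The sketch below illustrates that idea only; the field names and the `redact` function are hypothetical, and the actual layout of the EWT redacted output is not described in this guide.

```python
# Hypothetical field names; the real EWT output columns may differ.
SENSITIVE_FIELDS = ("sender", "recipient", "message")


def redact(rows, sensitive_fields=SENSITIVE_FIELDS):
    """Return copies of rows with the sensitive fields removed."""
    return [
        {key: value for key, value in row.items() if key not in sensitive_fields}
        for row in rows
    ]


rows = [
    {"timestamp": "2018-05-01 10:00", "sender": "user_a",
     "message": "hello", "category": "Approach"},
]
redacted = redact(rows)
print(redacted)
```

Only the fields outside the sensitive list survive, and the original rows are left unchanged for police use.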
This information will enable us to evaluate and improve the system, and develop further techniques to automate via machine learning the process of flagging chat logs most likely to contain grooming.
Information sheet
Project aims
Thank you for your participation in this study. The aim of this work is the creation of tools and techniques to detect and flag evidence of predatory behaviour by scanning chat and other social media logs extracted from seized equipment. Although the research literature proposes techniques for modelling and detecting “luring” dialogues (Olson et al., 2007; Leatherman, 2009), Digital Forensics Units (DFU) typically rely on manual review, searching, and simple keyword lists. Based on analysis of this information and the associated academic papers, a series of predatory speech acts were identified as being relevant to the automated detection of grooming. The EWT scanning algorithm was tuned and refined first against PJ logs and also against real world data.
The result is a software tool that can be used by investigators to automate log-file screening to quickly filter through chat logs to identify evidence. The approach is sufficiently generic to allow the same tool to be used with different lexicons to detect other dialogues of concern such as cyber-stalking and radicalisation or terrorist recruitment.
Current status: we are piloting the software in a number of forces (started May 2018). UK forces are invited to contact DCI Vanessa Smith (vanessa.smith@westyorkshire.pnn.police.uk) and Dr Z. Cliffe Schreuders (c.schreuders@leedsbeckett.ac.uk) to join the pilot.
The results from this pilot will help us to evaluate our existing approach, and to continue research and development into techniques for digital investigation and detection of predatory behaviour.
Your participation
You have been chosen to take part in this study, because you are involved in police cases and processes that can potentially involve digital evidence and cybercrime, processing chat logs for predatory behaviour as part of police investigations.
Your input is very valuable to the project, and we hope you will continue to participate. However, you are under no obligation to take part and can withdraw involvement at any stage.
What do I have to do?
Forces who participate in the pilot will use the EWT software to triage chat logs as part of digital investigations, and provide feedback on the effectiveness of the approach. Forces can choose to trial the software on historical cases, or deploy onto live cases, as they deem fit, in combination with any tools and techniques they traditionally use.
The identity of investigators, victims, and offenders
We ask police to ensure the information they provide via the feedback survey contains no sensitive data, including their own identity.
The feedback data set sent back to us should include no sensitive data; the software automatically creates two copies of its output, a sensitive version for police use (including message contents and participants' names), and a redacted version for research purposes. Forces are welcome to review the raw data and software code before the feedback directory is sent to us for analysis.
No sensitive personally identifiable data will be published. Results are normally presented in terms of groups of individuals, and overall statistics and findings.
What are the benefits of taking part?
As a force engaged in the pilot, you will have free access to the EWT software (and you can continue using this software for free after the pilot). The EWT software is intended to provide you with tools to save time and increase the effectiveness of digital investigations involving large quantities of chat logs.
You will be helping us to identify areas for improvement for yourselves in your day-to-day roles. These results will be used to understand how these tools can be improved, and we will use this information to continue development of these techniques and software, potentially impacting your own working environment and the effectiveness of cybercrime investigation.
This work is also intended to benefit other police forces and researchers: results will be published in academic venues, such as peer-reviewed conferences and journals. Results will also be summarised and disseminated in presentations, on websites, and in training materials.
Who is funding the research?
Development of EWT was supported by a Police Knowledge Fund grant, administered by the Home Office, College of Policing, and the Higher Education Funding Council for England (HEFCE). Leeds Beckett University and West Yorkshire Police led the CARI Project and are continuing to collaborate on the research.
CARI Project overview
The CARI Project is a large-scale collaboration between West Yorkshire Police and the
Cybercrime and Security Innovation Centre (CSI Centre) at Leeds Beckett University. The
CARI Project aims to improve and incorporate an evidence-based approach into the
policing of digital forensics and cybercrime investigations. An extensive needs assessment
of UK policing and cybercrime and digital evidence was conducted to understand the
current situation, and to identify needs across the force. The CARI Project also involved
implementing a training and research programme that has directly impacted the capability
of the digital forensics and cyber units within West Yorkshire Police to engage in research.
This needs assessment and research training led to the development of a set of research
proposals, which were scored and selected. Subsequently, academics and police staff
co-produced 9 research and development workstreams: a framework for seizure,
preservation and preservation of cloud evidence; automated forensic analysis; image
linkage for victim identification and framework for image fingerprint management;
automated grooming detection; frontline officer awareness development and decision
support mobile app; assessment of methods of cyber training; an evaluation of the role of
the Digital Media Investigator within WYP; and characteristics of victims of cybercrime.
Each of these projects was designed to address needs within law enforcement and
outputs include evidence-based procedures, new capabilities such as software/algorithms,
and actionable intelligence.
Consent statement
By participating in the EWT pilot and completing the feedback surveys:
I confirm that I have read and understand the above.
I understand that all personal information will remain confidential and that all efforts will be made to ensure I cannot be identified (except as might be required by law).
I agree that data gathered in this study may be stored anonymously and securely, and may be used for future research.
I understand that my participation is voluntary and that I am free to withdraw at any time without giving a reason.
I agree to take part in this study.