جعفر منصور

Jordan Customs & Border Control

Encouraging investment and enabling industry to be competitive, to enhance the competitiveness of the national economy.

Facilitating trade exchange between the Kingdom and other countries.

Collecting revenues for the treasury.

Controlling the movement and transport of passengers and goods crossing the Kingdom’s borders, in conformity with the department’s authority under the regulations in force.

Combating smuggling.

Protecting local society and the environment from hazardous materials.

Contributing to the control of commercial activities to prevent illegal business, under the regulations in force.

Problem Statement.

• The Customs Department in Jordan (JC) operates in an environment of uncertainty and change. JC cannot manually check every person and every shipment of goods entering the country.

• In the past, 100 percent of declarations were examined: everything that entered the country was inspected. Goods were processed in days, not hours, and developing competent customs officers took years of training.

• The ASYCUDA system was implemented in 2001. It includes a risk management module that can randomly sample imports for compliance, together with rules and risk lists built from the experience of JC officials.
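The combination of random sampling and experience-based rules described in the last bullet can be illustrated with a minimal sketch. This is not ASYCUDA's actual interface: the function `select_for_inspection`, the record fields, and the single example rule are all hypothetical.

```python
import random

def select_for_inspection(declarations, risk_rules, sample_rate, seed=None):
    """Select declarations that match any risk rule or are randomly sampled."""
    rng = random.Random(seed)
    selected = []
    for declaration in declarations:
        matches_rule = any(rule(declaration) for rule in risk_rules)
        randomly_sampled = rng.random() < sample_rate
        if matches_rule or randomly_sampled:
            selected.append(declaration)
    return selected

# Hypothetical declarations and one hypothetical expert rule.
declarations = [
    {"id": 1, "origin": "A", "declared_tax": 50.0},
    {"id": 2, "origin": "B", "declared_tax": 900.0},
    {"id": 3, "origin": "A", "declared_tax": 700.0},
]
risk_rules = [lambda d: d["origin"] == "B"]

rules_only = select_for_inspection(declarations, risk_rules, sample_rate=0.0)
everything = select_for_inspection(declarations, risk_rules, sample_rate=1.0)
```

With `sample_rate=0.0` only rule matches are selected; raising the rate adds random compliance checks on top of the rules.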

Problem Statement.

• JC wants a way to predict which people and goods crossing the border are most likely to break the law, for example by smuggling drugs or by not paying the correct customs tariffs.

• At the same time, the Customs Department should not constrain legitimate trade.

• JC needs to identify the cross-border activities or transactions with the highest potential to pose a risk.

The Required Solution

Jordan Customs asked to significantly improve its performance in facilitating trade and to demonstrate positive cost-benefit outcomes through automation, prediction, pre-arrival processing, and post-clearance audit.

Mining Data Set

The dataset we used for this study covers the period between January 2009 and June 2009. The constraints and assumptions we used to build the data set are listed at the end of this presentation.

Declarations Data Model

Declaration

Declaration Items

Item Block

General Taxes

Taxes UOM

Total tax

Date, etc.

Mining Models

[Figure: mining models diagram; every recovered label reads "Decision Tree"]

Input Variables

We used the following variables to study fraud behavior:

• Export country
• Origin country
• HS code at level 11 (tariff item)
• Company (the trader or importing company)
• Clearance company (the declarant)
• Whether there is an exemption and, if so, what kind of exemption (exemption type, or general status if there is no exemption)
• Whether there is an agreement and, if so, what kind of agreement
• Declaration type (type of customs declaration)
• Customs status (detailed customs status)
• Customs center
• TotalitemTaxAmount (total declared duties)

Used Mining Algorithms

Decision Trees: a classification technique that splits the data in a tree structure. Each node contains the transactions that share a certain state of an attribute, chosen because that value distinguishes those transactions from the rest of the dataset.
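As an illustration of how a tree picks the attribute to split on, here is a minimal sketch that scores candidate attributes by information gain. Entropy-based gain is an assumption made for this example; the study does not state which split criterion its tool used, and the records and field names below are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target="fraud"):
    """Entropy reduction obtained by splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical declarations: fraud depends on origin, not on customs center.
rows = [
    {"origin": "A", "center": "X", "fraud": True},
    {"origin": "A", "center": "Y", "fraud": True},
    {"origin": "B", "center": "X", "fraud": False},
    {"origin": "B", "center": "Y", "fraud": False},
]
best = max(["origin", "center"], key=lambda a: information_gain(rows, a))
```

In this toy data the split on `origin` separates the two classes completely, so it is chosen over `center`.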

Naïve Bayes: a classification algorithm that treats the input attributes as if they were completely independent and looks at the effect of one variable at a time on the predicted output. It is useful for comparing the importance of each input variable and identifying the most important ones.
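The independence assumption can be seen in a minimal categorical Naïve Bayes sketch: each attribute contributes its own factor to the class score. The add-one (Laplace) smoothing, the function names, and the sample records are assumptions made for illustration, not details from the study.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, features, target):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (feature, class) -> value counts
    vocab = defaultdict(set)             # feature -> distinct values seen
    for r in rows:
        for f in features:
            value_counts[(f, r[target])][r[f]] += 1
            vocab[f].add(r[f])
    return class_counts, value_counts, vocab

def predict(model, row, features):
    """Return normalized class probabilities for one record."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for c, n in class_counts.items():
        score = n / total  # class prior
        for f in features:
            # One independent factor per attribute, with add-one smoothing.
            score *= (value_counts[(f, c)][row[f]] + 1) / (n + len(vocab[f]))
        scores[c] = score
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}

# Hypothetical declarations.
rows = [
    {"origin": "A", "agreement": "none", "fraud": "yes"},
    {"origin": "A", "agreement": "fta", "fraud": "yes"},
    {"origin": "B", "agreement": "fta", "fraud": "no"},
    {"origin": "B", "agreement": "none", "fraud": "no"},
    {"origin": "A", "agreement": "none", "fraud": "no"},
]
features = ["origin", "agreement"]
model = train_naive_bayes(rows, features, "fraud")
probs_a = predict(model, {"origin": "A", "agreement": "none"}, features)
probs_b = predict(model, {"origin": "B", "agreement": "none"}, features)
```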

The following algorithms were used:

Logistic Regression: a classification algorithm that represents a special form of neural network. It is used where the output is one of two possible states. It is not recommended where there are multiple classes; in that case neural networks can be used instead.
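The "special form of neural network" can be pictured as a single sigmoid unit with no hidden layer. The sketch below trains such a unit by stochastic gradient descent on one-hot inputs; the learning rate, epoch count, and toy data are assumptions for illustration only.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_logistic(X, y, learning_rate=0.5, epochs=500):
    """Fit weights and bias of a single sigmoid unit with per-sample updates."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = p - label  # gradient of the log loss w.r.t. the logit
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

def predict_proba(model, features):
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

# Hypothetical one-hot inputs: [origin_is_A, origin_is_B]; label 1 = fraud.
X = [[1, 0], [1, 0], [0, 1], [0, 1]]
y = [1, 1, 0, 0]
model = train_logistic(X, y)
```

Each distinct attribute state becomes one input column, which is why attributes with many states, such as an 11-digit HS code, produce very wide inputs.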

Association Rules: an association algorithm that assesses to what extent a group of itemsets (values or states that appear together in a certain number of transactions) is associated with a specific value of the predicted variable.
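One common way to quantify such an association is with support and confidence. The sketch below computes both for a rule of the form "itemset implies target"; these two measures, the item encoding, and the sample transactions are assumptions for illustration, since the study does not say which measures its tool reported.

```python
def rule_stats(transactions, itemset, target):
    """Return (support, confidence) for the rule itemset -> target."""
    itemset = set(itemset)
    with_items = [t for t in transactions if itemset <= t]
    with_both = [t for t in with_items if target in t]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_items) if with_items else 0.0
    return support, confidence

# Hypothetical declarations encoded as sets of "attribute=value" items.
transactions = [
    {"origin=A", "agreement=none", "fraud=yes"},
    {"origin=A", "agreement=none", "fraud=yes"},
    {"origin=A", "agreement=fta", "fraud=no"},
    {"origin=B", "agreement=none", "fraud=no"},
]
support, confidence = rule_stats(
    transactions, {"origin=A", "agreement=none"}, "fraud=yes"
)
```

Here the itemset appears in two of four transactions, both fraudulent, giving a support of 0.5 and a confidence of 1.0.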

Results Visualization - 1

Decision tree


Using Naïve Bayes

[Figure: Naïve Bayes results, with callouts numbered 1 to 7]

Product HS code is the most important attribute.


Using Association rules.


Using Logistic Regression

Mining Accuracy Chart

Decision Tree performed best; Association Rules performed worst.

Testing Results – One month later

We ran the same model on a dataset of declarations from one month later; here are the results.

Conclusions and Results Recommendations

The scores (importance results) of the LR model were high for attribute values that have very few cases, so we recommend not relying on the score values when creating risk rules.

A neural network can be used. However, it shares the same issues as LR (similar behavior, specifically for scaled networks). In addition, neural networks can run into performance issues (very long processing and exploration times).

To produce better results, networks should be unscaled. This can be achieved by avoiding input attributes that have a large number of possible states, such as HS code at level 11 or company name. If neural networks are chosen as the model, deeper studies might try using Chapters rather than HS codes at level 11.
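Replacing detailed codes with Chapters reduces the number of input states, because a Chapter is the first two digits of an HS code. A minimal sketch, assuming "level 11" means an 11-digit tariff code stored as a string; the helper name and sample codes are hypothetical.

```python
def hs_chapter(hs_code: str) -> str:
    """Return the two-digit HS chapter for a tariff code given as a string."""
    digits = "".join(ch for ch in hs_code if ch.isdigit())
    if len(digits) < 2:
        raise ValueError(f"not a valid HS code: {hs_code!r}")
    return digits[:2]

# Hypothetical 11-digit codes: three distinct codes collapse into two chapters.
codes = ["87032310000", "87032490000", "08051000000"]
chapters = sorted({hs_chapter(code) for code in codes})
```

Codes are kept as strings so that leading zeros, as in chapter 08, are preserved.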

A further study of neural networks and logistic regression could build a dataset that ignores input states with few supporting cases, for example fewer than 10.
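One way to read this recommendation is to drop records whose attribute state has too few supporting cases before training. The sketch below does that for a single attribute; dropping rows (rather than, say, grouping rare states together) is an assumption, and the function and field names are hypothetical.

```python
from collections import Counter

def drop_rare_states(rows, attribute, min_support=10):
    """Keep only rows whose value for `attribute` occurs at least min_support times."""
    counts = Counter(r[attribute] for r in rows)
    return [r for r in rows if counts[r[attribute]] >= min_support]

# Hypothetical data: company C1 has 12 declarations, company C2 only 3.
rows = [{"company": "C1"}] * 12 + [{"company": "C2"}] * 3
filtered = drop_rare_states(rows, "company", min_support=10)
```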

Conclusions and Results Recommendations - 2 -

Naïve Bayes is a good algorithm for studying one attribute at a time and determining the most important attributes. Unlike the other algorithms, however, it cannot study more than one attribute at a time.

The decision tree is a very effective algorithm. A deep analysis of its details can be a good guide for identifying patterns in customs data and detecting fraud behavior.

Association rules can be useful for identifying patterns in data. However, analysts should keep in mind that the technique differs from classification techniques.

Decision trees could be used to segment a dataset, and the data belonging to a single node could then be analyzed further with a discriminative algorithm such as a neural network. The result will be more useful if the resulting network is unscaled.

If model cascading is used and models need to be created from Excel, we recommend doing so in a separate Analysis Services database that can be accessed from Excel. This keeps the models built in Excel from being affected if reprocessing takes place from SQL data tools.

Conclusions and Results Recommendations - 3 -

In this study we used six months of data. In further studies we recommend a data set covering a longer period. Short periods have the advantage of detecting recent trends, whereas longer periods are more accurate in determining persistent trends.

In this study we tested on the month immediately after the study period (July 2009). In further studies it is a good idea to leave a gap month after the analysis and to assess the model over a short period after that. Study periods should also be much longer than testing periods. For example, a two-year study could be done; then, after a one-month gap, the results for the following two months could be collected and statistics similar to the ones we produced for July 2009 could be used to assess the models' reliability. We also recommend that customs run the studies over different periods to assess how stable the models' results are, in order to find the most suitable study period and to decide on actions based on the studies.
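The gap-month evaluation can be sketched as a date-based split: records up to the end of the study period train the model, one month is skipped, and the period after the gap is used for testing. The dates follow the study's January to June 2009 window; the function name, the `date` field, and the sample records are hypothetical.

```python
from datetime import date

def split_by_period(rows, train_end, test_start, test_end):
    """Split records into a study (training) set and a later test set."""
    train = [r for r in rows if r["date"] <= train_end]
    test = [r for r in rows if test_start <= r["date"] <= test_end]
    return train, test

# Hypothetical declarations in May, July (the gap month), and August 2009.
rows = [
    {"id": 1, "date": date(2009, 5, 14)},
    {"id": 2, "date": date(2009, 7, 3)},
    {"id": 3, "date": date(2009, 8, 20)},
]
train, test = split_by_period(
    rows,
    train_end=date(2009, 6, 30),
    test_start=date(2009, 8, 1),
    test_end=date(2009, 8, 31),
)
```

Records from the gap month fall into neither set, so the test set measures how the model holds up after a delay rather than on the days immediately following the study.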

In this study we used all import declarations. In further studies it would be useful to build more specific models, such as models for certain chapters or models that exclude certain chapters. Customs could also run studies for specific customs centers to gain insight into the behaviors and patterns at each center.

Questions and Answers

Date post:	05-Aug-2015
Category:	Technology
Upload:	talal-al-shammari