Avoiding Electricity Theft using
Smart Meters in Smart Grid
By
Muhammad Anas
Registration Number: CIIT/FA10-REE-041/ISB
MS Thesis
In
Electrical Engineering
COMSATS Institute of Information Technology
Islamabad - Pakistan
Spring, 2012
COMSATS Institute of Information Technology
Avoiding Electricity Theft using Smart Meters in Smart Grid
A Thesis Presented to
COMSATS Institute of Information Technology Islamabad
In Partial fulfilment
of the requirement of the degree of
MS (Electrical Engineering)
By
Muhammad Anas
CIIT/FA10-REE-041/ISB
Spring, 2012
Avoiding Electricity Theft using Smart Meters in Smart Grid.
A Post Graduate Thesis submitted to the Department of Electrical Engineering
as partial fulfilment of the requirement for the award of Degree of M.S (Electrical
Engineering).
Name: Muhammad Anas
Registration Number: CIIT/FA10-REE-041/ISB
Supervisor:
Dr. Nasrullah Khan
Professor,
Department of Electrical Engineering,
COMSATS Institute of Information Technology (CIIT)
Islamabad Campus
June, 2012.
Final Approval
This Thesis titled
Avoiding Electricity Theft using
Smart Meters in Smart Grid
By
Muhammad Anas
CIIT/FA10-REE-041/ISB
Has been approved
For COMSATS Institute of Information Technology, Islamabad Campus.
External Examiner:
Name:
Supervisor: Dr. Nasrullah Khan / Professor
Department of Electrical Engineering, Islamabad Campus.
Co-Supervisor: Dr. Nadeem Javaid / Assistant Professor
Department of Electrical Engineering, Islamabad Campus.
HOD: Dr. Shafayat Abrar / Associate Professor
Head of Department of Electrical Engineering, Islamabad Campus.
Declaration
I, Muhammad Anas, Registration Number FA10-REE-041/ISB, hereby declare that I have produced the work presented in this thesis during the scheduled period of study. I also declare that I have not taken any material from any source except where referred to, and that the amount of plagiarism is within the acceptable range. If a violation of the HEC rules on research has occurred in this thesis, I shall be liable to punishable action under the plagiarism rules of the HEC.
Date: Signature of the student:
Muhammad Anas
FA10-REE-041/ISB
Certificate
It is certified that Muhammad Anas, FA10-REE-041/ISB, has carried out all the work related to this thesis under my supervision at the Department of Electrical Engineering, COMSATS Institute of Information Technology, Islamabad Campus, and the work fulfills the requirement for award of the MS degree.
Date: Supervisor:
Dr. Nasrullah Khan / Professor,
Department of Electrical Engineering,
CIIT Islamabad Campus.
Head Of Department:
Dr. Shafayat Abrar / Associate Professor,
HoD Electrical Engineering.
DEDICATION
Dedicated to my Loving Parents.
ACKNOWLEDGEMENTS
I am heartily grateful to my supervisor, Dr. Nasrullah Khan, whose patient encouragement, guidance and kindness from the beginning to the final stage enabled me to develop an understanding of the thesis.
I offer my profound regards and express my deepest gratitude to my co-supervisor, Dr. Nadeem Javaid, for his noble guidance, tremendous cooperation and help in every respect during the completion of my thesis.
I wish to express my sincere thanks to my father, who guided me and helped me in my mathematical work. I very much honor all my fellows and friends who helped me and discussed issues with me in a very good way.
Special thanks to my co-supervisor, Dr. Nadeem Javaid, who offered much assistance regarding the publications from this thesis. I deeply appreciate your support. Thank you so much. I am also thankful to all my good teachers, from whom I learned many things, including my studies and real guidance for everyday life.
Muhammad Anas
FA10-REE-041/ISB
ABSTRACT
Global energy crises are intensifying. The world is trying to shift from non-renewable energy resources towards renewable energy resources. Much attention is directed at producing more energy and at saving it, that is, at minimizing the loss or misuse of electrical energy. Electricity can be produced in many ways. In hydro power plants, after electricity is produced, it is synchronized on a single bus bar and then sent for transmission from the switch yard. The main theme of this thesis is the study of losses in the electrical system. Generation and transmission losses are considered technical and can be calculated easily, whereas I am interested in finding non-technical, or commercial, losses. To find non-technical losses, it is important to know the ways and methods by which they occur. The causes and effects of electricity theft are another important point. If the government provides subsidies and incentives to its users, commercial losses can be reduced to a great extent.
There are different kinds of energy meters. The smart meter can be the best option to minimize electricity theft because of its high efficiency, its accurate and precise results, and its excellent resistance to many of the theft techniques used on other energy meters; that is, it offers higher security than electromechanical meters. An infrastructure must be present behind smart meters to support them, that is, to handle the data safely and pass it to the grid or utility system for further processing such as billing. Data can be sent through wireless and wired connections; underground fiber optic cable can be used for data transmission. Using different regression methods, including fitting a regression line, a non-parametric test (Spearman's rank correlation coefficient test), Karl Pearson's correlation test, and hypothesis testing, I have analyzed practical data and compared the results of these methods. There are other classification techniques as well, such as the tree-based optimum path forest algorithm, the support vector machine (linear or radial basis function), genetic algorithms and linear programming techniques.
Contents
1 Introduction
1.1 Overview of problem
1.2 Goals And Objectives
1.3 Major Functions
1.4 Smart Meter
1.5 Other Meters
1.5.1 Multi-tariff meters
1.5.2 Time of Use Meters
1.5.3 Pre-payment meters
1.5.4 Energy meters in Pakistan
2 Electricity Theft Issues, Methods, and Data Communication to Utility
2.1 Introduction
2.1.1 Electromechanical Meters
2.2 Related Work and Motivation
2.3 Losses Due To Electricity Theft
2.3.1 Theft in Electromechanical Meters
2.3.2 Theft in smart meters
2.3.3 Engineered ways of Theft
2.4 To Communicate Data To Utility Safely
2.5 Causes And Effects Of Electricity Theft
3 Regression Based Technique for Estimating Electricity Theft
3.1 Introduction
3.2 Related Work and Motivation
3.3 Techniques for Estimating Electricity Theft
3.3.1 Fitting a Regression line
3.3.2 Non-Parametric Statistical Methods
3.3.2.1 Spearman Rank Correlation Coefficient Test
3.3.3 Karl Pearson's Approximation
3.3.4 Hypothesis Testing in Regression Model
3.4 Linear Support Vector Machine
4 Conclusion
References
List of Figures
1.1 Power Flow in Advanced Metering Infrastructure
1.2 Smart meter basic Configuration
1.3 Single Phase Electronic multi tariff Meter [20]
1.4 Peak and off Peak timings in Pakistan
1.5 Basic Time of use meter [21]
1.6 Prepay Electromechanical meter [22]
2.1 To find lambda using lambda-iteration method
2.2 Month wise Graphical Representation of losses in a populated city in year 2010-2012
2.3 Neutral Grounded
2.4 Communicating Data To Utility
3.1 Capacitor coupled voltage transformer [www.wikipedia.com]
3.2 Regression Line fitted on Data of 2010-2011
3.3 Regression Line fitted on Data of 2011-2012
3.4 Linear data classification
3.5 Linear data classification with slack variables
List of Tables
2.1 Energy Losses in year 2010 till 2012, Data taken from Lahore Electric Supply Company
3.1 Energy Losses in a populated city of Lahore Pakistan in year 2010 till 2012 [www.lesco.gov.pk]
3.2 Regional Progressive energy Losses as updated on 29-02-2012 on www.lesco.gov.pk
3.3 Showing ranking of data
Chapter 1
Introduction
1.1 Overview of problem
Pakistan is an economically weak country, and we cannot bear any loss of our
economic resources, whether through fraud, corruption or stealing, all of which
damage the country. Every individual has to do something for the well-being of
his or her country. I have discussed the problem of electricity theft from many
aspects. To study theft, we must know how electricity flows from the point of
power generation, through transmission over high voltage power lines, to
distribution. Theft is generally carried out at the distribution level, although
it is possible to steal electricity at any stage. Stealing at the transmission
level requires extra knowledge and experience, as it is very risky. Because
power is generated at low voltage, it is stepped up at the power generation
level. The highest system voltage in Pakistan is 500kV, so the voltage is
stepped up to 500kV by a step up transformer at the transformer deck of the
power generation system. This power is then transferred to the switch yard, from
where it is sent for transmission through high voltage power lines. Power can be
transmitted at several voltage levels: 500kV, 220kV, 132kV, 66kV and even 33kV.
The 66kV and 33kV levels are almost obsolete nowadays because of system
upgradation, although a few 66kV and 33kV grid stations are still working in
Azad Kashmir.
Electricity in usable form must have a specific voltage, which is 220V in
Pakistan, because all devices are made to work at this specification, while the
current drawn by each device is different. This is the reason why theft is
generally performed at the distribution level, where the voltage has already
been stepped down to 220V. The power flow diagram in figure 1.1 summarizes the
flow explained above.
There are different kinds of transformers for different purposes. A step up
transformer is used at power generation. At a primary grid station, where the
input voltage is 500kV, the output voltage can be any lower voltage, such as
220kV, 132kV or 220V, and different power transformers are used depending on
the situation. These are all step down transformers with varying step-down
capability. Power transmission is done on a three phase three wire system, while
distribution is generally done on a three phase four wire system; both are
shown in figure 1.1.
Figure 1.1: Power Flow in Advanced Metering Infrastructure
1.2 Goals And Objectives
In electric power systems, the development and use of IT based systems instead
of manual systems is the need of the day because of their efficiency, stability,
accuracy and timely operation. Such a system has many advantages, as less
manpower and effort are required to operate it. It is cost effective because,
once installed, it will work for a long period of time; it will need
maintenance, but not as frequently as today's manual system requires. Time is
another important factor, and it can be saved as well if smart meters are
installed instead of manual electromechanical meters. The chances of error in
such a system are minimal. Smart meters can be controlled remotely from a grid
station or utility. Their readings will be accurate and precise because clerical
mistakes are minimized: delayed meter readings, meters that could not be read,
incorrect readings, and rounding of data by the meter readers. Readings taken
automatically are more accurate and precise than manual meter readings. Data
will not be lost when smart meters are installed, and further processing of that
data, such as billing, will be timely and accurate. Remote security can also be
established, and billing will not be affected even if employees are on leave.
Data stored on the servers of the utility can easily be used in experiments and
research work, because it is easy to process.
1.3 Major Functions
The following major functions become available if the existing system is changed
to the advanced power grid system called the smart grid. A smart grid includes
an advanced metering infrastructure, which in turn includes smart meters. This
system supports remote control and monitoring from the grid or utility, that is,
remote on/off switching of any device at homes or in industry. The smart meter
can also be turned off completely if it is suspected that the meter is being
misused or that electricity is being taken by illegal means. Meter readings can
be recorded on cell phones if GSM is used. On/off operations can likewise be
performed from cell phones through SMS messages and through emails. The data can
also be updated on a web site from the control centers in the utility grid.
1.4 Smart Meter
A smart meter records power consumption every hour and communicates that
information to the distribution company for billing purposes. It performs the
following main functions:
• Register electricity both generated and consumed.
• Offer possibility to read meter both remotely and locally.
• Have the ability to read nearby or consumer premises gas and water meters.
• Control electricity consumed by consumer.
• Switch the consumer off remotely.
Smart meters communicate by means of a modem. Power line carrier (PLC), wireless
modem (GSM/GPRS) and ADSL are the communication interfaces, which connect the
home appliances and the display of the meter. The display shows the energy
consumption and the corresponding cost. From the above discussion it is clear
that smart metering provides the following benefits:
• Reliability of supply.
• Variable tariff scheme to attract the attention of new customers.
• Reduced metering cost.
• Power saving.
• Detection of power theft and fraud.
Domestic, commercial and industrial users waste energy and draw more than their
sanctioned load. This misuse of energy can be controlled by smart metering: if
the power flow exceeds the sanctioned limit for a certain time, the smart meter
cuts off the power supply and restores it when the power returns to its
prescribed limits. This feature will greatly help to accomplish the government's
goals regarding energy.
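The cut-off behaviour described above can be sketched as follows. This is only an illustration, not meter firmware: the class name `LoadLimiter`, the 5 kW sanctioned load and the three-sample grace period are assumptions chosen for the example and do not come from any meter specification.

```python
class LoadLimiter:
    """Hypothetical sketch: disconnect supply after a sustained overload."""

    def __init__(self, sanctioned_kw, grace_samples):
        self.sanctioned_kw = sanctioned_kw    # assumed sanctioned load in kW
        self.grace_samples = grace_samples    # overload samples tolerated
        self.over_count = 0
        self.connected = True

    def update(self, demand_kw):
        """Feed one demand sample; return True if the supply is connected."""
        if demand_kw > self.sanctioned_kw:
            self.over_count += 1
            if self.over_count >= self.grace_samples:
                self.connected = False        # sustained overload: cut off
        else:
            self.over_count = 0
            self.connected = True             # demand back within limits
        return self.connected


def simulate(samples, sanctioned_kw=5.0, grace_samples=3):
    """Run a list of demand samples (kW) through a fresh limiter."""
    limiter = LoadLimiter(sanctioned_kw, grace_samples)
    return [limiter.update(sample) for sample in samples]
```

With the assumed defaults, three consecutive samples above 5 kW disconnect the supply, and the first sample back within the limit restores it.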
The great challenge faced by the distribution companies is to replace all old
meters with smart meters and to update their systems, because in smart metering
data is continuously received, processed and transmitted. Once smart metering is
in place, distribution companies can easily locate the regions where power
consumption is high and control or manage them. Smart metering reduces the cost
of meter reading, bill collection and the labor needed to disconnect a faulty
consumer. It will also improve billing accuracy, tighten billing and collection,
and reduce power theft.
The basic configuration of a smart meter is shown in figure 1.2:
Figure 1.2: Smart meter basic Configuration
1.5 Other Meters
Besides electromechanical meters and smart meters, certain other kinds of meters
are also used.
1.5.1 Multi-tariff meters
Distribution companies consider that it is not cost effective to charge the
consumer at the same rate during a period of high demand as during a period of
low demand. They therefore charge the consumer at different rates during
different periods of time, and multi-tariff meters were introduced for this
purpose. They are mostly industrial meters and are generally installed at
premises where the load is above 5 kW. These multi-tariff meters have a "peak"
and an "off-peak" tariff register. Such meters are also called time of use (TOU)
meters. They contain a time register along with other registers. A single phase
electronic multi-tariff meter is shown in figure 1.3.
Figure 1.3: Single Phase Electronic multi tariff Meter [20]
1.5.2 Time of Use Meters
Time of use (TOU) meters divide the day according to the load demand. The period
of high demand is called the peak period, and during it the distribution company
charges the consumer at a high rate. The remaining period is called off peak,
and during it the consumer is charged at a relatively low rate. The basic
purpose is that the consumer automatically controls the consumption of
electricity and pays accordingly. The peak and off peak hours in Pakistan are
listed in figure 1.4:
Figure 1.4: Peak and off Peak timings in Pakistan
Figure 1.5: Basic Time of use meter [21]
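The peak and off peak registers of a TOU meter can be illustrated with a small billing sketch. The peak window (18:00 to 21:59) and both rates below are placeholder assumptions for illustration only; they are not the timings of figure 1.4 or any actual tariff.

```python
PEAK_HOURS = range(18, 22)   # assumed peak window: 18:00 to 21:59
PEAK_RATE = 15.0             # assumed price per kWh during peak hours
OFF_PEAK_RATE = 9.0          # assumed price per kWh during off peak hours


def tou_bill(hourly_kwh):
    """Return (peak_kwh, off_peak_kwh, cost) for 24 hourly readings."""
    if len(hourly_kwh) != 24:
        raise ValueError("expected 24 hourly readings")
    # Energy accumulated in the "peak" register
    peak = sum(kwh for hour, kwh in enumerate(hourly_kwh) if hour in PEAK_HOURS)
    # Everything else goes to the "off-peak" register
    off_peak = sum(hourly_kwh) - peak
    cost = peak * PEAK_RATE + off_peak * OFF_PEAK_RATE
    return peak, off_peak, cost
```

For a flat load of 1 kWh in every hour, the sketch books 4 kWh to the peak register and 20 kWh to the off peak register under the assumed window.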
1.5.3 Pre-payment meters
The distribution companies install prepayment meters at premises where they
think the consumer will create problems in billing. Some consumers ask for
prepayment meters because they help greatly in managing the budget. With such
meters, payment is made in advance, just as with mobile phones. When the amount
for which electricity was purchased is used up, the relay cuts off the supply.
Prepayment meters are of three types:
• Smart card meters.
• Key meters.
• Token meters.
The tariff of these prepayment meters is much higher because it includes fixing
and maintenance charges as well as the cost of collecting money from pay points
or post offices.
Figure 1.6: Prepay Electromechanical meter [22]
1.5.4 Energy meters in Pakistan
Our distribution companies procure energy meters from both national and
international companies. Some of the major energy meter suppliers are as
follows:
• Pak-Elektron Limited.
• MicroTech Industries (Pvt.) Ltd.
• Syed Bhai (Pvt.) Ltd.
• Creative Engineering Group Lahore.
• S.B. Electronics and Control Engineering.
• ESCORT Pakistan Ltd. Lahore
Chapter 2
Electricity Theft Issues, Methods,
and Data Communication to
Utility
2.1 Introduction
Electricity is generated in many ways and is synchronized on a single bus bar of
the grid for transmission. Before it is utilized, electricity passes through
certain phases. It is first generated, stepped up at the transformer deck, and
passed through the switch yard for transmission over power lines. After
transmission it is distributed to the customers for utilization. This energy
also needs to be billed. Two types of devices are mainly used for billing:
1. Electromechanical kWh meters.
2. Smart meters.
2.1.1 Electromechanical Meters
Electromechanical meters consist of the following parts:
• Counting mechanism.
• Series electromagnet.
• Shunt electromagnet.
• Brake magnet.
• Aluminium rotor disk.
The shunt electromagnet is wound with a fine wire of many turns and is connected
across the supply, so that the current through it is proportional to the supply
voltage. Since the coil of the shunt magnet has a large number of turns and the
reluctance of its magnetic circuit is very small, owing to the small air gap,
the coil is highly inductive. Thus the current lags the supply voltage by 90°.
The series magnet, in comparison, is wound with a heavy wire of few turns and is
connected in series with the load so that it carries the load current. Since the
coil of the series magnet is practically non-inductive, the angle of lead or lag
is determined by the load.
Our energy resources are strained to the utmost nowadays, so using energy
efficiently is one of the issues that need urgent attention. That is why
electricity must be handled with great care. As far as security is concerned,
there is no password that cannot be cracked, but the best password is the one
that takes the longest time to crack. This is one basic reason why the whole
world is shifting from analog devices to digital devices, and why analog
electromechanical meters are being replaced by smart meters. Digital devices
provide better security and control options. Better detection and control of
losses is one of the reasons for the substitution of smart meters.
Everything occurs for a reason, and the reason for this substitution is losses
in electrical systems. There are mainly two types of losses:
1. Technical losses.
2. Non-Technical/Commercial losses.
In developing countries electricity theft is a common practice, especially in
remote areas, where people often do not pay utility bills for electricity or gas
to the government company. To address this problem, governments should consider
providing help in the form of subsidies.
2.2 Related Work and Motivation
In [1,3] the authors explained theft control well by proposing a model. In this
model NTL is calculated in an external control section, and if NTL > 5%, legal
customers are disconnected for some interval. A harmonic generator is operated
during this period, which destroys the electrical equipment of all the illegal
consumers. Normal supply is then reconnected for the genuine customers. Although
this model addresses electricity theft by making the equipment of illegal users
malfunction, it could be improved so that the equipment of illegal users simply
stops functioning, whether by using smart meters or any other technique.
S. McLaughlin et al. explained some forms of energy theft in Advanced Metering
Infrastructure (AMI) and proposed a communication architecture from smart meter
to grid that uses meter to meter communication, with collectors and receptors to
boost the data signals. The network that transports the data to the utility is
known as the backhaul network. However, a technically skilled person can still
steal energy from a smart meter: if he removes the µ-controller from his meter,
the meter will not be able to measure readings and send them to the utility for
further processing.
In [4,5] the authors elaborated the ways in which the data of a smart meter can
be transmitted to the utility. S. S. S. R. Depuru et al. showed how an
electromechanical meter works and why a smart meter is better than an
electromechanical meter.
In [6] a central observer meter is used, which is cost effective because a smart
meter is placed at the secondary side of the transformer. A matrix based
approach in Excel is used to distinguish the electricity theft case from the
normal case. However, if a large amount of data has to be managed, larger
matrices will be required, so the memory requirements and the time needed to
solve the matrices will increase.
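The idea behind a central observer meter can be illustrated with a simple energy balance. This sketch is not the matrix based method of [6]; it only compares the observer reading at the transformer with the sum of the consumer meters. The function names, the 3% technical loss allowance and the 1 kWh tolerance are assumptions made for the example.

```python
def unaccounted_energy(observer_kwh, consumer_kwh, loss_fraction=0.03):
    """Energy seen by the observer meter that no consumer meter explains.

    loss_fraction is an assumed allowance for technical losses in the
    low voltage feeder, expressed as a fraction of the observer reading.
    """
    allowed_loss = loss_fraction * observer_kwh
    return observer_kwh - sum(consumer_kwh) - allowed_loss


def theft_suspected(observer_kwh, consumer_kwh, loss_fraction=0.03,
                    tolerance_kwh=1.0):
    """Flag the transformer if the unexplained energy exceeds a tolerance."""
    gap = unaccounted_energy(observer_kwh, consumer_kwh, loss_fraction)
    return gap > tolerance_kwh
```

A flag raised this way only says that energy is missing somewhere behind the transformer; locating the individual consumer needs further analysis, such as the per-consumer approach of [6].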
[8,9] present mathematical modeling techniques that help to detect and control
electricity theft using classifiers. [8] discusses Graphical User Interface
(GUI) based software implemented in Malaysia.
2.3 Losses Due To Electricity Theft
Electricity theft is basically an illegal way of obtaining energy for different
uses, resulting in a loss for utility companies. Losses consist of technical and
non-technical losses. Losses amount to about $25 billion annually worldwide [1].
Total losses can be computed by taking the energy supplied and subtracting the
amount of energy billed/paid [3]. To calculate non-technical losses (NTL), one
simple way is to calculate the technical losses (TL) first. This can be
expressed as follows.
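\[ \text{Total Energy Losses} = \text{Energy Supplied} - \text{Bills Paid} \tag{2.1} \]
\[ \text{Total Energy Losses} = \text{NTL} + \text{TL} \tag{2.2} \]
Combining equations 2.1 and 2.2, we get
\[ \text{NTL} = \text{Energy Supplied} - \text{Bills Paid} - \text{TL} \tag{2.3} \]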
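Equations 2.1 to 2.3 translate directly into a short calculation. The function name and the numbers in the usage note are hypothetical and serve only to show the arithmetic; all quantities must be in the same energy unit.

```python
def non_technical_loss(energy_supplied, energy_billed, technical_loss):
    """Return NTL from equation 2.3 (all arguments in the same unit)."""
    total_loss = energy_supplied - energy_billed   # equation 2.1
    return total_loss - technical_loss             # equations 2.2 and 2.3
```

As a made-up example, with 1000 units supplied, 850 units billed and 60 units of technical loss, the total loss is 150 units and the NTL is 90 units.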
Table 2.1 below shows the monthly losses in Lahore, Pakistan, from July to
February of 2010-11 and 2011-12 [7]. Percentage losses are calculated as:
\[ \text{Percentage Loss} = \left( \frac{\text{Received Value} - \text{Sold Value}}{\text{Received Value}} \right) \times 100 \]
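The percentage loss formula can be checked against the figures of table 2.1 with a few lines of code. The function name is chosen for this example; the two input values are the July 2010 received and sold energy from the table.

```python
def percentage_loss(received, sold):
    """Percentage of received energy that was not sold."""
    if received <= 0:
        raise ValueError("received energy must be positive")
    return (received - sold) / received * 100.0


# July 2010 from table 2.1: received 1764.81 MKWH, sold 1508.41 MKWH
july_2010 = round(percentage_loss(1764.81, 1508.41), 2)   # 14.53
```

The result agrees with the 14.53% listed for July 2010 in the table.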
As we are interested in calculating NTL, we can find it from equation 2.3. If
smart meters are installed, the energy supplied is recorded in the control
centers along with the bill payments by the consumers, so the only unknown left
in the equation is TL. Technical losses can be found in numerous ways; here we
find them using the Lagrange function.
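The Lagrange function can also be used to study load flow in an electrical power
system towards distribution. The basic Lagrange function without losses is given
as:
\[ \mathcal{L} = \sum_{i=0}^{N} F_i + \lambda \phi \tag{2.4} \]
where $\mathcal{L}$ is the Lagrange function, $\sum_{i=0}^{N} F_i$ is the cost
function and $\lambda$ is the Lagrange multiplier.
\[ \phi = 0 = \left[ P_d - \sum_{i=0}^{N} P_{gi} \right] \tag{2.5} \]
\[ \mathcal{L} = \sum_{i=0}^{N} F_i + \lambda \left[ P_d - \sum_{i=0}^{N} P_{gi} \right] \tag{2.6} \]
\[ \frac{d\mathcal{L}}{dP_{gi}} = \frac{dF_i}{dP_{gi}} - \lambda = 0 \tag{2.7} \]
Taking the derivative of equation 2.6 with respect to the generation of a
certain operating unit and equating it to zero gives equation 2.7. The same
approach can be used to find technical losses: if equation 2.6 is written
including losses, it becomes:
\[ \mathcal{L} = \sum_{i=0}^{N} F_i + \lambda \left[ P_d + P_L - \sum_{i=0}^{N} P_{gi} \right] \tag{2.8} \]
Now taking the derivative of equation 2.8, we get the following equations in
terms of losses.
\[ \frac{d\mathcal{L}}{dP_{gi}} = \frac{dF_i}{dP_{gi}} - \lambda \left( 1 - \frac{\partial P_L}{\partial P_i} \right) = 0 \tag{2.9} \]
\[ \frac{dF_i}{dP_{gi}} + \lambda \frac{\partial P_L}{\partial P_i} - \lambda = 0 \tag{2.10} \]
\[ \frac{dF_i}{dP_{gi}} + \lambda \frac{\partial P_L}{\partial P_i} = \lambda \tag{2.11} \]
\[ \frac{\partial P_L}{\partial P_i} = 1 - \frac{dF_i}{\lambda \, dP_{gi}} \tag{2.12} \]
Integrating both sides, we get the following expression for TL = $P_L$.
\[ P_L = \int \left( 1 - \frac{dF_i}{\lambda \, dP_{gi}} \right) dP_i \tag{2.13} \]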
The value of λ can then be found using the lambda-iteration method; the flow
chart for finding λ is given in figure 2.1:
Figure 2.1: To find lambda using lambda-iteration method
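A textbook form of lambda iteration for the lossless case of equation 2.7 can be sketched as follows. It is not a transcription of the flow chart in figure 2.1. It assumes quadratic cost curves, so that the marginal cost of unit i is dF_i/dP_gi = b_i + 2 c_i P_gi, ignores upper generator limits and losses, and searches for λ by bisection; the unit data in the usage note are invented.

```python
def lambda_iteration(units, demand, tol=1e-6, max_iter=200):
    """Find lambda such that the total generation meets the demand.

    units  : list of (b, c) pairs with marginal cost b + 2*c*P and c > 0
    demand : total load P_d (must be positive)
    Returns (lambda, [P_g1, P_g2, ...]).
    """
    if demand <= 0:
        raise ValueError("demand must be positive")

    def outputs(lam):
        # From dF_i/dP_gi = lambda: P_gi = (lambda - b_i) / (2 c_i), not below 0
        return [max(0.0, (lam - b) / (2.0 * c)) for b, c in units]

    lo = min(b for b, _ in units)       # total generation is zero here
    hi = lo + 1.0
    while sum(outputs(hi)) < demand:    # widen until the demand is bracketed
        hi = lo + 2.0 * (hi - lo)

    lam = 0.5 * (lo + hi)
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        mismatch = sum(outputs(lam)) - demand
        if abs(mismatch) < tol:
            break
        if mismatch > 0:
            hi = lam                    # too much generation: lower lambda
        else:
            lo = lam                    # too little generation: raise lambda
    return lam, outputs(lam)
```

For two invented units with (b, c) = (2.0, 0.01) and (3.0, 0.02) and a demand of 100, solving equation 2.7 by hand gives λ = 11/3, which the iteration approaches to within the tolerance.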
As mentioned earlier, the Lagrange function method can be used in load flow
analysis for the economic dispatch of power from the utility to the distribution
or consumer level. Other methods used for economic dispatch include the gradient
search method, Newton's method and dynamic programming.
Several mathematical methods are used to identify electricity theft, such as
Support Vector Machine with a linear kernel (SVM-LINEAR), Support Vector Machine
with a Radial Basis Function kernel (SVM-RBF), Artificial Neural Network with
Multi-Layer Perceptrons (ANN-MLP), and the Optimum Path Forest classifier (OPF)
[8]. SVM is a regression based technique in which dependent and independent
variables are considered. It uses certain parameters to define a graph, or
compares the data to standard data or a standard graph. With this method
particular kinds of theft are recognized: if an abrupt change occurs in the load
flow, the method flags that change and stores the data as faulty [7].
Another technique is ANN-MLP, which is based on modeling techniques and follows
non-linear statistical modeling or a tree diagram. Its other part is the MLP,
which is a type of classifier that selects the better output among the outputs
produced from its input.
Ramos, C. C. O. et al. proposed an OPF based technique [8]. In this approach a
better output replaces the previously selected value, and this is repeated in
order to identify theft. It needs no parameters to be assumed, and its training
phase is very fast. The evaluation in [8] showed that OPF has a higher theft hit
rate and better accuracy than SVM-LINEAR, SVM-RBF and ANN-MLP [8].
Table 2.1: Energy Losses in year 2010 till 2012, Data taken from Lahore Electric Supply Company

Months                    July     Aug      Sept     Oct      Nov      Dec      Jan      Feb
2010-2011 Energy (MKWH)
  Received                1764.81  1777.29  1518.89  1461.89  1136.25  1179.97  1169.85  1058.03
  Sold                    1508.41  1513.76  1311.82  1282.98  1047.91  1060.11  1057.74  1009.38
  Percentage Losses       14.53    14.83    13.63    12.24    7.77     10.16    9.58     4.60
2011-2012 Energy (MKWH)
  Received                1693.09  1768.82  1570.68  1509.01  1199.71  1179.12  1127.43  1140.52
  Sold                    1449.12  1510.51  1365.99  1329.63  1106.89  1115.94  1024.03  1085.20
  Percentage Losses       14.41    14.60    13.03    11.89    7.74     5.36     9.17     4.85
Decrease                  0.12     0.22     0.60     0.35     0.04     4.80     0.41     -0.25
Figure 2.2: Month wise Graphical Representation of losses in a populated city in year 2010-2012. Panels: (a) Data of 2010-2011, (b) Data of 2011-2012. Each panel plots received values vs sold values against the months July to February.
The decrease, or difference, between 2010-2011 and 2011-2012 is found by
subtracting the percentage loss of the present year from that of the previous
year in table 2.1; it is shown graphically in figure 2.2 (see footnote 1).
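The subtraction described above can be reproduced from the percentage losses of table 2.1. The variable and function names are chosen for this sketch; the two lists are the percentage loss rows of the table for July to February.

```python
# Percentage losses from table 2.1, July to February
LOSS_2010_11 = [14.53, 14.83, 13.63, 12.24, 7.77, 10.16, 9.58, 4.60]
LOSS_2011_12 = [14.41, 14.60, 13.03, 11.89, 7.74, 5.36, 9.17, 4.85]


def yearly_decrease(previous, present):
    """Previous-year loss minus present-year loss, month by month."""
    if len(previous) != len(present):
        raise ValueError("both years must cover the same months")
    return [round(p - q, 2) for p, q in zip(previous, present)]


decrease = yearly_decrease(LOSS_2010_11, LOSS_2011_12)
```

A negative value, as for February, means that the loss increased. Because the inputs here are already rounded to two decimals, an entry can differ by 0.01 from the Decrease row of the table, which was presumably computed from unrounded percentages.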
There are several methods of stealing electricity. The core reason for stealing
is a lack of awareness among people, due to which this unpleasant act is
performed in different areas of the world. Meter tampering can be done in
electromechanical meters and in smart meters as well. Tampering in
electromechanical meters is explained in detail below.
2.3.1 Theft in Electromechanical Meters
Some methods of stealing electricity are:
• Taking connections directly from distribution lines.
• Grounding the neutral wire.
• Putting a magnet, such as a neodymium magnet, on the electromechanical meter [1].
• Inserting a disc to stop the rotation of the coil.
• Hitting the meter to damage the rotating coil [2].
• Interchanging input and output connections.
Footnote 1: Area of the city = 684 sq mile, population of the city = 11,000,000.
These problems can be minimized by using smart meters. Even with a smart meter,
one can take a connection directly from the distribution system, but smart
meters are able to record a zero reading and inform the utility system by
sending data through different techniques. These techniques include Bluetooth,
Power Line Carrier (PLC) and Internet protocols. Session Initiation Protocol
(SIP) can be used for controlling Voice over Internet Protocol (VoIP), and
Zigbee (IEEE 802.15.4) can be used in Home Area Networks (HANs) [4].
As for the second point, if the neutral wire is grounded, the energy meter
behaves as if the circuit were not complete and does not register a reading. The
moving coil in an electromechanical meter is easily affected by magnetic field
lines. If a magnet is put on the meter, its magnetic field affects the motion of
the coil and causes it to move slowly, or even to stop if the magnet is strong,
such as a neodymium magnet. If someone inserts an x-ray disc into the meter, it
also interferes with the coil and affects its performance. Hitting the meter has
the same result by damaging the coil. As for the last point, when the input and
output connections are interchanged, the electromechanical meter starts moving
in the reverse direction, which is also a method of producing a lower reading by
the end of the month.
2.3.2 Theft in smart meters
Smart grid is a very general term that includes diverse kinds of
sub-infrastructures. One of the important infrastructures is AMI, discussed in
[2]. Because of the many advantages of AMI, every community wants to install
this system. AMI has many functions, and it can also be used to control
electricity theft. AMI is an infrastructure, whereas a smart meter is an entity
that can be placed at each home or industry, replacing electromechanical
kilowatt-hour (kWh) meters.
AMI provides a new sensor based approach. If sensors are installed in the electrical
equipment, then AMI can be useful in a way that utility or power distributing
companies can predict load of a specific area. This is useful for utility in a way that
they will design a correct and efficient load flow to certain area. This technique
is efficient to save many of economic issues for installing an infrastructure for any
area.
A smart meter is a digital device that uses a microcontroller and certain other
digital instruments.
[Figure 2.3: Neutral Grounded. Diagram labels: Phase, Neutral, Main Switch, Energy
Meter, Incoming Electricity To Energy Meter, Outgoing Electricity To User, Lamp 1,
Lamp 2, Neutral Grounded.]
The functions of a smart meter include:
1. Self billing.
2. Avoiding outages in HANs.
3. Remote connect and disconnect.
4. Remote authentication, such as sending control messages.
Data tampering can occur during authentication through software hacking: a false
authentication can be used to get past the password check and extract data from
the smart meter.
Hardware designed specifically for fraud also exists. Descrambler boxes, for
example, are built by professionals to read data from smart meters for illegal
purposes.
AMI also supports time-of-use billing, in which the tariff during peak load hours
is somewhat higher than during off-peak hours.
The theft methods described for electromechanical meters can be applied to smart
meters as well, except for placing a neodymium magnet, inserting a disc, or hitting
the meter; a mechanical shock only makes the meter stop working properly.
One objection raised by consumers is that smart meters act as spies in their homes
and disclose private information, which is not ethically acceptable. Consumers also
object that the meters emit radiation that is harmful to human health, and that
they interfere with radio frequencies and disturb radio transmissions. Mobile
police units also use radio frequencies, which are said to be interrupted by the
emissions of smart meters [2].
2.3.3 Engineered ways of Theft
Some of the more sophisticated ways of stealing electricity [3] are:
1. Tampering with the secondary side of the current transformer (CT) of the energy
meter, which is generally insulated. CTs are used to measure the current flowing
through the meter. If a CT is tampered with, the meter cannot measure the correct
current passing to the consumer, or it records low readings.
2. Incorrect internal calibration of an electromechanical energy meter, in which
the coil is not calibrated correctly.
3. In three phase meters, keeping the neutral open and using only one of the three
phases, so that an electromechanical meter assumes that no energy is flowing
through it to the customer. This kind of theft is easily detected by a smart
meter through its "EL" indicator.
EL (Earth Leakage) is an indicator on the smart meter whose Light Emitting Diode
(LED) flashes when a mismatch between the phase current and the neutral current is
detected. A glowing "EL" LED means one of the following:
• The neutral of the home is connected to the neutral of a neighbor, or vice versa.
• The phase of the home is connected to the phase of a neighbor, or vice versa.
• The neutral is connected to the ground.
If the "EL" LED flashes, the condition is also visible to the utility, which can
then check the problem manually to control theft.
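The phase and neutral comparison behind the "EL" indicator can be sketched in a few lines. This is only an illustration: the function name `earth_leakage_flag`, the 0.05 A tolerance and the sample readings are assumptions made for the example, not values taken from any meter specification.

```python
def earth_leakage_flag(phase_current_a, neutral_current_a, tolerance_a=0.05):
    """Return True when phase and neutral currents disagree by more than the tolerance.

    The tolerance is a hypothetical value chosen for this sketch.
    """
    return abs(phase_current_a - neutral_current_a) > tolerance_a


# Hypothetical (phase, neutral) current readings in amperes.
readings = [
    (5.20, 5.19),  # healthy installation: currents agree
    (5.20, 0.00),  # neutral grounded or open: no return current through the meter
    (3.10, 4.60),  # neutral shared with a neighbor: extra return current
]

flags = [earth_leakage_flag(phase, neutral) for phase, neutral in readings]
print(flags)
```

A real meter would apply the comparison to sampled currents over time; the single-sample check here only shows the decision rule.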
2.4 To Communicate Data To Utility Safely
Communication with the utility follows a stepwise procedure, shown in Fig. 2.4. The
smart meter measures the energy flowing through it, records the values using its
microcontroller and updates them in its registers. If a problem occurs in the
wireless data transfer, the meter re-checks the wireless device, resolves the issue
and updates the data again. One technical way of theft is to make a meter read
slow: part of the electricity is taken by illegal means and used to operate high
energy consumption devices such as motors. Such theft can be examined and checked
physically, so that all devices operate through the legal connection. After that,
the data is transferred to the utility without letting an intruder hack or distort
it, for example through descrambler boxes. This data transfer is an important phase
that needs attention. The next step is to store the data at the server and make it
available for technical computation, such as the billing procedure.
A smart meter can take readings repeatedly and send them using different
techniques, such as wireless data transfer over different protocols. Bluetooth,
which follows the 802.15.1 standard, can collect data over short distances, for
example as a metering link within a HAN. Broadband Power Line communication (BPL)
is another way of communicating with the grid; it can carry protocols such as
Transmission Control Protocol/Internet Protocol (TCP/IP). It is an advanced form of
Power Line Communication (PLC) and uses part of the radio frequency spectrum, so
one disadvantage of BPL is that it causes interference in radio communication.
Wired data lines can also carry data from an industry or home to a central device
such as a smart meter, which then sends the data wirelessly to the server at the
utility using Wi-Fi (802.11g) or WiMAX (802.16) [4].
Other protocols include SIP, which supports Voice over Internet Protocol (VoIP) and
controls both video and audio sessions. SIP is a very common protocol and works
together with other protocols such as Transmission Control Protocol (TCP),
Hypertext Transfer Protocol (HTTP) and User Datagram Protocol (UDP). Zigbee, which
uses the 802.15.4 standard, is another protocol that can be used for a HAN. Global
System for Mobile communication (GSM) and General Packet Radio Service (GPRS)
provide another way to send the data to the utility [5].
With all these ways of communication, transferring a huge volume of data through
the networks becomes a problem. Wired and wireless services currently use IPv4,
which provides a total of 2^32 addresses, equal to 4294967296. These addresses are
insufficient to cover all the devices of all homes and industries of the world.
IPv6, on the other hand, uses addresses of 128 bits, written as a 32 digit
hexadecimal code, which gives 2^128 addresses, of which 48 bits are for the node
address alone. This is a very large number of addresses and can fulfil the
requirement of the said scenario. Another approach to transmitting the data is
Power Line Carrier (PLC). If the link is instead carried over optical fiber in the
ground wire of the line, which is called Optical Ground Wire (OPGW), it would be a
once and for all investment.
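The address counts quoted above follow directly from the address widths, as the short check below shows. The variable names are chosen for this illustration only.

```python
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_addresses = 2 ** IPV4_BITS      # total IPv4 address space
ipv6_addresses = 2 ** IPV6_BITS      # total IPv6 address space
ipv6_hex_digits = IPV6_BITS // 4     # each hexadecimal digit encodes 4 bits

print(ipv4_addresses)                # 4294967296
print(ipv6_hex_digits)               # 32
print(ipv6_addresses // ipv4_addresses == 2 ** 96)  # True
```

Every IPv4 address therefore corresponds to 2^96 IPv6 addresses, which is why one address per metered device is not a constraint under IPv6.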
Managing all of this data is feasible with either one way or two way communication
between the consumer and the utility. Two way communication has the added advantage
that the smart meter of any home or industry can be turned on and off remotely.
Bandim, C. J., et al. proposed a low cost methodology for sending the data to the
utility: a smart meter placed on the secondary side of the distribution transformer
monitors the data of the home meters locally, records the readings, sends the data
through a wireless device, and maintains two way communication with the utility.
[Figure 2.4: Communicating Data To Utility. Flowchart blocks: Smart Meter; Update
Reading; Missing; Slow Reading; Problem in Wireless Data Transfer; Re-Check; Taking
Partial Electricity By Illegal Means; Check internal connections Physically; Turning
off High energy Devices; Data Transfer; Data Storage; Record at Utility Server;
decision branches labeled Yes/No.]
2.5 Causes And Effects Of Electricity Theft
Theft is a serious crime. It creates a shortfall, increases the load and lowers the
frequency, which is not acceptable, and it causes load shedding and an increase of
tariff for the legal customers [1]. The main reasons behind electricity theft are
a low literacy rate and lack of awareness. Circular debt is one of the serious
consequences of electricity theft. It can be explained as follows. Electricity is
produced in many ways, for example by burning oil to operate turbines. Oil is
supplied to the utility to run its turbines, and the utility produces electricity
from that oil and sells it to customers. Losses then come into play: non-paying
customers and theft mean that the utility is not paid back, so the utility in turn
does not pay its oil suppliers. This chain of unpaid dues is called circular debt.
In third world countries, many people are not able to pay their utility bills. By
avoiding electricity theft, the government can help deserving people through
subsidies that minimize the per unit cost. To improve the electrical power system
further, the respective government needs to give incentives to capable people to
focus on their own electricity production, using emerging technologies such as
solar cells, wind power and hydel. These producers can fulfil their own needs, sell
the surplus to the utility for their own good, and benefit from the utility by
synchronizing their systems with it successfully.
The power flow mechanism in AMI, shown in fig. 4, can be described briefly as
follows. Power is generated at hydro power plants. Control rooms and control
sections are an important part of every stage: power generation, primary substation
and secondary substation. The generation plant is especially important because all
of the electricity is generated there from water. Water head and turbine size are
the mechanical quantities that need to be controlled, and the controls for them are
present on the mimic board in the control room. The water level and the flow of
water through the gates are also monitored. Water is kept in reservoirs for running
the turbines at peak hours. A Supervisory Control and Data Acquisition (SCADA)
system is used for control at the generation plant and at remote areas, and
Distributed Control Systems (DCS) are also used for control and supervision at
generation plants. Power flows towards the primary substation through transmission
lines, where it is monitored from the control room.
At the primary substation the voltage is stepped down to a certain level.
Electricity can be stolen at any point along the way. At the generation stage,
producing less electricity than the system is able to produce, and wasting water,
is also a form of theft. Electricity can also be stolen on its way to the primary
substation, which is why control centers are deployed and the data is observed
from time to time. After primary distribution, power is transmitted to the
secondary substation, which receives data from the end users through smart meters
or PLC. Home appliances can be controlled using Zigbee.
Chapter 3
Regression Based Technique for
Estimating Electricity Theft
3.1 Introduction
According to the law of conservation of energy, energy can neither be created nor
destroyed; it can only be changed from one form to another. Electricity is one
form of energy and can be obtained by many methods. Every populated area has its
own seasonal and environmental conditions, which determine the way of getting
electricity that is efficient and sufficient for that area. The energy planning for
an area can be managed with different statistical methods and optimization
techniques, and computations are performed so that economic resources are used in
the best way.
Electricity usage is another main problem to be considered, because different
types of losses are associated with the distribution of electricity to consumers.
Losses are mainly of two types: technical and non-technical. Technical losses
include core or iron losses (hysteresis and eddy-current losses), copper or
electrical losses, etc., whereas non-technical losses arise from unauthorized or
illegal means of getting electricity, also called stray load losses.
In this thesis, I first collected data on non-technical losses from the utility
using smart meters, which are more suitable than other devices for minimizing the
possibility of electricity theft. I then applied different estimation methods to
calculate the variation and dispersion in the data. In [10-12], the authors
discussed different probability and regression analysis based techniques to
estimate non-technical losses. In this thesis, I mainly used a regression based
approach, Spearman's rank test, the sign test, etc., and also studied how a linear
support vector machine is used to classify electricity theft.
Once electricity is generated, it is transmitted primarily over 500kV lines, and in
some countries over 700kV lines. After the long transmission lines it is stepped
down to 220kV lines and then 132kV lines passing through specific grid stations.
Stealing electricity from very high voltage and high voltage transmission lines is
not an easy job. It requires trained personnel, because equipment at home or in
industry works on a 220V supply at its input. Coupling Capacitor Voltage
Transformers (CCVTs) are used for theft on high voltage transmission lines. A CCVT
has three main parts, a capacitor, an inductor and a transformer, and is used to
step a voltage down to a usable range. If theft takes place on a single 220kV
phase, a disturbance will occur in the transmission lines and the capacitive
losses in the lines will increase.
Figure 3.1: Capacitor coupled voltage transformer [www.wikipedia.com]
3.2 Related Work and Motivation
In [10], the authors arranged the data sets in the form of windows and performed
operations on those windows to extract their results. To detect electricity theft,
data samples are selected first, and a decision is taken about which method each
sample should be tested with: decision trees, a Bayesian approach or Karl Pearson's
approach. As a result, theft can be detected while it is in progress. The results
are merged and refiltered, and inspection is carried out on that basis. However,
Karl Pearson's approach relates the chi-square distribution to the discrete
multinomial distribution, and this relationship is not explained by the authors.
Technical losses in low voltage distribution systems are evaluated by the authors
of [11]. They used regression analysis on defined data sets to estimate electricity
theft, treating some variables as independent and the others as dependent on them.
In total, sixteen (16) data sets are defined in this work and checked on seven (7)
different variables. However, the correlation coefficients are not discussed
clearly in [11]. In [12], the authors discussed the MIDAS project.
3.3 Techniques for Estimating Electricity Theft
In this section, different mathematical models for detecting particular kinds of
patterns are discussed. Certain classifiers can also be used for detection. They
include the linear Support Vector Machine (SVM-LINEAR), the Support Vector Machine
with Radial Basis Function (SVM-RBF), the Artificial Neural Network with Multi
Layer Perceptrons (ANN-MLP) and the Optimum Path Forest classifier (OPF).
The ANN-MLP is based on non-linear statistical modeling. Its MLP part is built from
layers of perceptrons, each of which is a linear classifier, and selects the best
output for a given input. An OPF based technique was proposed in [13]. In this
approach a better output replaces the previously selected value used to identify
theft. The advantages of OPF are [13]:
• It needs no parameters to be assumed.
• Its training phase is very fast, and it has a higher hit rate for theft and
better accuracy than ANN-MLP.
Other optimization methods that are also used are Quadratic Programming (QP) and
the Genetic Algorithm (GA).
3.3.1 Fitting a Regression line
Table 3.1: Energy Losses in a populated city of Lahore, Pakistan, in year 2010 till 2012 [www.lesco.gov.pk]

Energy (MkWH)                 July     Aug      Sept     Oct      Nov      Dec      Jan      Feb
2010-2011  Received           1764.81  1777.29  1518.89  1461.89  1136.25  1179.97  1169.85  1058.03
2010-2011  Sold               1508.41  1513.76  1311.82  1282.98  1047.91  1060.11  1057.74  1009.38
2010-2011  Percentage Losses  14.53    14.83    13.63    12.24    7.77     10.16    9.58     4.60
2011-2012  Received           1693.09  1768.82  1570.68  1509.01  1199.71  1179.12  1127.43  1140.52
2011-2012  Sold               1449.12  1510.51  1365.99  1329.63  1106.89  1115.94  1024.03  1085.20
2011-2012  Percentage Losses  14.41    14.60    13.03    11.89    7.74     5.36     9.17     4.85
Decrease                      0.12     0.22     0.60     0.35     0.04     4.80     0.41     -0.25
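The percentage loss rows of Table 3.1 are consistent with the ratio of unsold to received energy. The sketch below recomputes the 2010-2011 row under that assumption; the formula and the function name `percentage_loss` are inferred from the figures, not stated in the table.

```python
# Received and sold energy in MkWH for July to February 2010-2011 (Table 3.1).
received = [1764.81, 1777.29, 1518.89, 1461.89, 1136.25, 1179.97, 1169.85, 1058.03]
sold = [1508.41, 1513.76, 1311.82, 1282.98, 1047.91, 1060.11, 1057.74, 1009.38]


def percentage_loss(received_mkwh, sold_mkwh):
    """Share of the received energy that was not billed, in percent."""
    return 100.0 * (received_mkwh - sold_mkwh) / received_mkwh


losses = [round(percentage_loss(r, s), 2) for r, s in zip(received, sold)]
print(losses)
```

For July this gives 100 × (1764.81 − 1508.41) / 1764.81, which rounds to the 14.53 shown in the table.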
The data can be tested with a linear regression model using a non-parametric test.
There are two types of tests: parametric tests and non-parametric tests. In the
former, parameters of the data such as the mean and standard deviation are known.
In the latter, no parameter of the data, i.e. mean or standard deviation, is known
in advance. These parameters may belong to sample data or to population data.
After performing the tests, the data can be processed to obtain results. In
regression analysis through a non-parametric test, we find the average values of
the variables x and y, where x is the energy generated by the power house, i.e.
the energy consumed, and y is the energy paid back to the utility in the form of
bills. The correlation between x and y is given as:
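\[ \rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \tag{3.1} \]
where ρ denotes the coefficient of correlation, \(\sigma_{xy}\) is the covariance of
x and y, \(\sigma_x\) is the standard deviation of x and \(\sigma_y\) is the standard
deviation of y. We can also write this equation as follows:
\[ \rho = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y} \tag{3.2} \]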
Covariance shows how x and y vary together. Covariance and the coefficient of
correlation differ in the following way: the coefficient of correlation is a
unitless quantity, because it is a ratio, and it ranges from -1 to +1, whereas
covariance has units. Covariance can be calculated as:
\[ \mathrm{Cov}(x, y) = E[(x - \bar{x})(y - \bar{y})] \tag{3.3} \]
where E denotes expectation. The general equation for the variance, whose square
root is the standard deviation, is:
\[ \sigma^2 = \frac{\sum x^2}{N} - \left( \frac{\sum x}{N} \right)^2 \tag{3.4} \]
Fig. 3.2 shows the 2010-11 data of power consumed against power sold in an electric
supply company for one year. We fitted a straight line to this data by regression.
The deviation of the data points from the fitted line, i.e. their dispersion
around it, is taken as the theft values.
Suppose we have two variables x and y. Plotting them against each other gives a
scatter diagram, and fitting a straight line through the points gives the
regression line:
\[ y = a + bx \tag{3.5} \]
or, at the mean values,
\[ \bar{y} = a + b\bar{x} \tag{3.6} \]
Therefore,
\[ a = \bar{y} - b\bar{x} \tag{3.7} \]
From here we find the value of a once b is known. The parameter b is given by:
\[ b = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^2 - (\sum x)^2} \tag{3.8} \]
Putting the values into the above equation gives b. Hence, the estimated
regression line of y on x is:
\[ \hat{y} = a + bx \tag{3.9} \]
The coefficient of correlation R is
\[ R = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{[n \sum y^2 - (\sum y)^2][n \sum x^2 - (\sum x)^2]}} \tag{3.10} \]
where \(R^2\) is the coefficient of determination.
\[ \text{Explained Variation} = \sum (\hat{y} - \bar{y})^2 \tag{3.11} \]
\[ \text{Total Variation} = \sum (y - \bar{y})^2 \tag{3.12} \]
\[ R^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} \tag{3.13} \]
The variability among the values of the dependent variable y is called the total
variation. The coefficient of determination \(R^2\) measures the proportion of this
variability that is explained by the linear relation of y with the independent
variable x. For the 2010-11 data shown in Table 3.1, 96.3% of the variation in y is
explained by x, which means that 3.7% of the electricity is being stolen (see
footnote 1).
The non-parametric regression based technique is quite simple, yet its results are
similar to those of other computational methods. Because of its simplicity, it can
be used for the large data sets of any utility system.
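The fit can be reproduced from Eqs. (3.7), (3.8) and (3.10) with a few lines of code. The sketch below uses the twelve monthly (x, y) pairs listed in Table 3.3 for 2010-11; the variable names are chosen for this illustration, and small differences in the last digits relative to the hand computation are to be expected.

```python
from math import sqrt

# Monthly energy received (x) and sold (y) in MkWH, 2010-11, as listed in Table 3.3.
x = [1764.81, 1777.29, 1518.89, 1461.89, 1136.25, 1179.97,
     1169.85, 1058.03, 1250.70, 1270.03, 1671.51, 1763.28]
y = [1508.41, 1513.76, 1311.82, 1282.98, 1047.91, 1060.11,
     1057.74, 1009.38, 1066.10, 1078.91, 1413.41, 1390.34]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

# Slope, Eq. (3.8), and intercept, Eq. (3.7).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Coefficient of correlation, Eq. (3.10), and coefficient of determination.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_y2 - sum_y ** 2) * (n * sum_x2 - sum_x ** 2))
r_squared = r ** 2

print(f"y = {a:.3f} + {b:.5f} * x, R^2 = {r_squared:.3f}")
```

The slope and intercept come out close to the fitted line reported for Figure 3.2, and the coefficient of determination close to the 96.3% quoted above.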
3.3.2 Non-Parametric Statistical Methods
Non-parametric tests are considered very good methods for testing hypotheses. The
first advantage is that the computations involved in these techniques are very
quick and easy to carry out. The second advantage can be clarified with an
example: if two judges have to decide on the five best cement industries,
non-parametric methods can be used to find whether or not the two judges agree.
The third advantage is that these methods use less restrictive assumptions than
parametric methods. One of the non-parametric methods is used below to estimate
electricity theft in the data.
1 Area of a city = 684 sq mile, Population of city = 11,000,000
[Figure 3.2: Regression Line fitted on Data of 2010-2011. x-axis: Power Consumed;
y-axis: y; fitted line: y = 236.783 + 0.69904 * x]
3.3.2.1 Spearman Rank Correlation Coefficient Test
Sometimes the actual measurements or counts of individual objects are either not
available, or accurate assessment is not possible. The objects are then arranged
in order according to some characteristic of interest. Such an ordered arrangement
is called a ranking, and the position given to an individual or object is called
its rank. The correlation between two such sets of rankings is known as rank
correlation.
[Figure 3.3: Regression Line fitted on Data of 2011-2012. x-axis: Power Consumed;
y-axis: y; fitted line: y = 269.1164 + 0.700225 * x]
Let a set of n objects be ranked with respect to characteristic A as
\(x_1, x_2, ..., x_i, ..., x_n\), and with respect to characteristic B as
\(y_1, y_2, ..., y_i, ..., y_n\). We assume that no two or more objects are given
the same rank (that is, there are no ties). Then \(x_i\) and \(y_i\) are each some
number from 1 to n, and both sets consist of the first n natural numbers.
Therefore,
\[ \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i = \sum_{i=1}^{n} i = 1 + 2 + 3 + ... + n = \frac{n(n+1)}{2} \tag{3.14} \]
\[ \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} i^2 = 1^2 + 2^2 + 3^2 + ... + n^2 = \frac{n(n+1)(2n+1)}{6} \tag{3.15} \]
\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - \frac{(\sum y_i)^2}{n} \tag{3.16} \]
\[ = \frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)^2}{4} \tag{3.17} \]
\[ = \frac{n(n^2 - 1)}{12} \tag{3.18} \]
Let \(d_i\) denote the difference in the ranks assigned to the ith individual, i.e.,
\[ d_i = x_i - y_i \tag{3.19} \]
Table 3.2: Regional Progressive energy Losses as updated on 29-02-2012 on www.lesco.gov.pk

            2010-2011 (MKWH)                    2011-2012 (MKWH)
Months      Received   Sold      %age losses   Received   Sold      %age losses   Decrease
JULY        1764.81    1508      14.5          1693.09    1449.12   14.4          0.1
AUGUST      3542.1     3022.17   14.7          3461.91    2959.64   14.5          0.2
SEPTEMBER   5061.99    4333.98   14.4          5032.29    4325.63   14            0.4
OCTOBER     6523.88    5616.96   13.9          6541.6     5655.11   13.6          0.3
NOVEMBER    7660.33    6664.87   13            7741.31    6762      12.7          0.3
DECEMBER    8840.92    7724.98   12.6          8920.43    7830.96   12.2          0.4
JANUARY     10010.15   8782.71   12.3          10047.86   8854.99   11.9          0.4
FEBRUARY    11068.8    9792.1    11.5          11188.38   9940.19   11.2          0.3
MARCH       12319.5    10858.19  11.9
APRIL       13589.53   11937.1   12.2
MAY         15261.04   13350.51  12.5
JUNE        16964.32   14740.85  13.1
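and
\[ \sum d_i^2 = \sum (x_i - y_i)^2 \tag{3.20} \]
\[ = \sum x_i^2 + \sum y_i^2 - 2 \sum x_i y_i \tag{3.21} \]
Substituting for \(\sum x_i^2\) and \(\sum y_i^2\), we get
\[ \sum d_i^2 = \frac{n(n+1)(2n+1)}{6} + \frac{n(n+1)(2n+1)}{6} - 2 \sum x_i y_i \tag{3.22} \]
so that
\[ \sum x_i y_i = \frac{n(n+1)(2n+1)}{6} - \frac{1}{2} \sum d_i^2 \tag{3.23} \]
The product moment coefficient of correlation between the two sets of rankings is
\[ r = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\left[ \sum x^2 - \frac{(\sum x)^2}{n} \right] \left[ \sum y^2 - \frac{(\sum y)^2}{n} \right]}} \tag{3.24} \]
Substitution gives
\[ r_s = \frac{\left[ \frac{n(n+1)(2n+1)}{6} - \frac{1}{2} \sum d_i^2 \right] - \frac{n(n+1)^2}{4}}{\frac{n(n^2 - 1)}{12}} \tag{3.25} \]
\[ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \tag{3.26} \]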
The formula is usually denoted by \(r_s\), to distinguish it from r, and is called
Spearman's coefficient of correlation. Note that \(\sum d_i^2\) takes its minimum
value, zero, when the two rankings are in complete agreement. When the rankings
are completely reversed, \(\sum d_i^2\) takes its maximum value, \(n(n^2 - 1)/3\).
Putting these values into Eq. (3.26), we get:
\[ r_s = 1, \quad \text{for } \sum d_i^2 = 0 \tag{3.27} \]
and
\[ r_s = -1, \quad \text{for } \sum d_i^2 = \frac{n(n^2 - 1)}{3} \tag{3.28} \]
This shows that \(r_s\) lies in the interval [-1, 1]. If, for example, Spearman's
coefficient of correlation comes out to be 0.8, it indicates high correlation.

Table 3.3: Showing ranking of data

x         x Rank   y         y Rank   di   di^2
1764.81   11       1508.41   11       0    0
1777.29   12       1513.76   12       0    0
1518.89   8        1311.82   8        0    0
1461.89   7        1282.98   7        0    0
1136.25   2        1047.91   2        0    0
1179.97   4        1060.11   4        0    0
1169.85   3        1057.74   3        0    0
1058.03   1        1009.38   1        0    0
1250.7    5        1066.1    5        0    0
1270.03   6        1078.91   6        0    0
1671.51   9        1413.41   10       -1   1
1763.28   10       1390.34   9        1    1
Total                                      2

Using Eq. (3.26) with the values of Table 3.3 (n = 12 and \(\sum d_i^2 = 2\)), we get
\[ r_s = 1 - \frac{6(2)}{12(12^2 - 1)} \tag{3.29} \]
Here \(r_s = 0.993007\), i.e., 99.3% correspondence between the two sets of data.
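The ranking and Eq. (3.26) can be checked with a short script. It is a sketch that assumes no tied values, as the derivation does; the helper name `ranks` is chosen for this illustration.

```python
# Monthly energy received (x) and sold (y) in MkWH, as listed in Table 3.3.
x = [1764.81, 1777.29, 1518.89, 1461.89, 1136.25, 1179.97,
     1169.85, 1058.03, 1250.70, 1270.03, 1671.51, 1763.28]
y = [1508.41, 1513.76, 1311.82, 1282.98, 1047.91, 1060.11,
     1057.74, 1009.38, 1066.10, 1078.91, 1413.41, 1390.34]


def ranks(values):
    """Rank values from 1 (smallest) to n; valid only when there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


x_rank, y_rank = ranks(x), ranks(y)
d_squared_sum = sum((rx - ry) ** 2 for rx, ry in zip(x_rank, y_rank))

n = len(x)
r_s = 1 - 6 * d_squared_sum / (n * (n ** 2 - 1))   # Eq. (3.26)

print(x_rank)
print(y_rank)
print(d_squared_sum, round(r_s, 6))
```

Only the last two months swap ranks between x and y, which is why the sum of squared rank differences is as small as 2.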
3.3.3 Karl Pearson’s Approximation
Karl Pearson established a relationship between the discrete multinomial
distribution and the chi-square distribution by transforming the multinomial
distribution so that it approaches a χ2-distribution as n approaches infinity.
This approximation is widely used to test agreement between observed data and the
expected (or hypothesized) results. The authors of [10] explained models based on
Pearson's coefficient of correlation.
3.3.4 Hypothesis Testing in Regression Model
Hypothesis testing is used here to check the relationship between two sets of data
taken from the electricity loss figures. The losses can be estimated from the
regression of y on x. The first variable, x, is the power consumed, i.e. the
utility's reading of the power distributed in a specific month. The second
variable, y, is the MkWH sold, i.e. the energy for which the utility is paid back
after consumption. We have to estimate the correlation and interdependence between
the two sets of values. We use analysis of variance to get the desired results. In
this method, we assume a hypothesis and test it afterwards.
As the hypothesis we suppose β = 0, where β is the slope of the population
regression line. If the hypothesis is accepted, β is taken to be zero; the
hypothesis is rejected if β ≠ 0. We test β using the t-distribution, also called
Student's t-distribution, with the following statistic:
\[ t_c = \frac{b - \beta}{S_b} \tag{3.30} \]
This gives the calculated value of t, where b is the slope of the fitted regression
line, β is the value assumed in the hypothesis and \(S_b\) is the standard
deviation calculated for b. The calculated t value is compared, at a chosen level
of significance α, with a t-table given in fig. The smaller the level of
significance at which the hypothesis is rejected, the stronger the interdependence
between the data.
Most frequently, we are interested in testing the hypothesis \(H_0 : \beta = 0\)
against \(H_1 : \beta \neq 0\). It is important to note that testing the hypothesis
β = 0 is equivalent to testing the hypothesis that the variable y is independent
of the variable x (in a linear sense). The test statistic then becomes
\[ t_c = \frac{b - \beta}{S_b} = \frac{b}{S_b} \tag{3.31} \]
If we reject \(H_0 : \beta = 0\), we conclude that the two variables are linearly
related. If we accept \(H_0 : \beta = 0\), the two variables are not linearly
related. To find \(S_b\):
\[ s_b^2 = \frac{s_{yx}^2}{\sum (x - \bar{x})^2} = \frac{\sum (y - \hat{y})^2}{(n - 2) \sum (x - \bar{x})^2} \tag{3.32} \]
To check the data using hypothesis testing in the regression model, we need the
following quantities:
\(\sum x = 17022.5\), \(\sum y = 14740.87\), \(\sum x^2 = 24969161.78\),
\(\sum xy = 21485177.67\) and \(\sum y^2 = 18524679.07\).
Testing the population regression coefficient, β = 0:
• We state our null and alternative hypotheses as \(H_0 : \beta = 0\) and
\(H_1 : \beta \neq 0\).
• The significance level is set at α = 0.01.
• The test statistic, under \(H_0\), is
\[ t = \frac{b - \beta}{S_b} = \frac{0.69904}{S_b} \tag{3.33} \]
which has a Student's t-distribution with ν = 12 − 2, i.e., 10 degrees of
freedom.
• We have already found the values of a and b, which are 236.78 and 0.6990,
respectively. The computations for \(s_b\) are:
\[ s_{yx}^2 = \frac{\sum y^2 - a \sum y - b \sum xy}{n - 2} \tag{3.34} \]
\[ s_{yx}^2 = \frac{18524679 - 235.7(14740) - 0.699(21485178)}{12 - 2} \tag{3.35} \]
\[ s_{yx} = 41.9303 \tag{3.36} \]
\[ \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n} \tag{3.37} \]
\[ \sum (x - \bar{x})^2 = 822036.4792 \tag{3.38} \]
\[ s_b = \frac{s_{yx}}{\sqrt{\sum (x - \bar{x})^2}} \tag{3.39} \]
\[ s_b = 0.04625 \tag{3.40} \]
\[ t = \frac{0.69904}{0.04625} \tag{3.41} \]
\[ t = 15.11536 \tag{3.42} \]
• The critical region is \(|t| \geq t_{0.005}(10) = 3.169\).
• Since the computed value t = 15.11536 falls in the critical region, we reject
the null hypothesis and conclude that there is sufficient reason to say, at the
1% level of significance, that consumed and sold MKWH are related.
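The steps of the test can be sketched in code using the residual form of Eq. (3.32). The data are the twelve monthly pairs of Table 3.3, the critical value 3.169 is the one quoted in the text, and the variable names are assumptions of this illustration. Because the hand computation rounds a and b before subtracting large sums, the full precision statistic need not match Eq. (3.42) digit for digit; the conclusion depends only on whether |t| exceeds the critical value.

```python
from math import sqrt

# Monthly energy received (x) and sold (y) in MkWH, as listed in Table 3.3.
x = [1764.81, 1777.29, 1518.89, 1461.89, 1136.25, 1179.97,
     1169.85, 1058.03, 1250.70, 1270.03, 1671.51, 1763.28]
y = [1508.41, 1513.76, 1311.82, 1282.98, 1047.91, 1060.11,
     1057.74, 1009.38, 1066.10, 1078.91, 1413.41, 1390.34]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)                       # Eq. (3.37)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = sxy / sxx                 # slope of the fitted line
a = mean_y - b * mean_x       # intercept of the fitted line

# Residual sum of squares and standard error of the slope, Eq. (3.32).
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = sqrt(sse / (n - 2))
s_b = s_yx / sqrt(sxx)

t = b / s_b                   # test statistic under H0: beta = 0
T_CRITICAL = 3.169            # t_0.005 with 10 degrees of freedom, as quoted in the text
reject_null = abs(t) >= T_CRITICAL

print(f"b = {b:.5f}, s_b = {s_b:.5f}, t = {t:.3f}, reject H0: {reject_null}")
```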
3.4 Linear Support Vector Machine
Different classifiers can be used for this optimization problem. In this section,
we focus on the SVM based technique. Its aim is to find a hyperplane that separates
the data points belonging to different groups. In a linear SVM, we define certain
parameters on the basis of the optimization problem. The training data \(T_d\), as
in [17], is given as follows:
\[ T_d = \{(x_i, y_i)\}, \quad i = 1, 2, 3, \dots, n, \quad y_i \in \{-1, 1\} \tag{3.43} \]
where \(x_i\) are the data points separated into an upper class and a lower class;
-1 denotes the lower class and 1 the upper class. The classified data can be
written as:
\[ \text{for } y_i = 1, \quad wx + b \geq 1 \tag{3.44} \]
\[ \text{for } y_i = -1, \quad wx + b \leq -1 \tag{3.45} \]
where w is the weight vector applied to x, and b determines whether the hyperplane
is biased or unbiased. For b = 0 the hyperplane is called unbiased, i.e., it passes
through the origin [9]. For b ≠ 0, the distance from the origin to the hyperplane
is \(|b| / \|w\|\), where \(\| \cdot \|\) denotes the norm. Combining the above
constraint equations, we get
\[ f_y = y_i (w x_i + b) \geq 1, \quad \text{for } i = 1, \dots, n \tag{3.46} \]
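The points \(x_i\) lying on the boundary lines are called the support vectors.
\(X^{+}\) denotes the vectors lying on the upper class boundary and \(X^{-}\) the
vectors lying on the lower class boundary, as shown in Fig. 3.4. The distance d of
the support vectors \(X^{+}\) and \(X^{-}\) from the hyperplane can be found as:
\[ d = \frac{w x^{+} + b}{\|w\|}, \quad (\text{for } y_i = +1) \tag{3.47} \]
\[ d = \frac{w x^{-} + b}{\|w\|}, \quad (\text{for } y_i = -1) \tag{3.48} \]
The hyperplane with the largest margin minimizes the overfitting problem in the
training data. To find the hyperplane that separates the classes with the largest
margin, let the margin be denoted by ρ and calculated as:
\[ \rho(w, x, b) = \frac{w x^{+} + b}{\|w\|} - \frac{w x^{-} + b}{\|w\|} \tag{3.49} \]
\[ \rho(w, x, b) = \frac{1}{\|w\|} [(w x^{+} + b) - (w x^{-} + b)] \tag{3.50} \]
Since \(w x^{+} + b = 1\) and \(w x^{-} + b = -1\) on the two boundaries,
\[ \rho(w, x, b) = \frac{2}{\|w\|} \tag{3.51} \]
[Figure 3.4: Linear data classification. The hyperplane wx+b=0 lies between the
boundaries wx+b=+1 (through the X+ points) and wx+b=-1 (through the X- points);
the gap between the two boundaries is the margin.]
Hence, the hyperplane that optimally separates the data is the one that minimizes
\[ f(x) = \frac{\|w\|^2}{2} \tag{3.52} \]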
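This optimization problem, with a non-linear objective and linear constraints, is
solved with Lagrange multipliers. The Lagrangian function \(\mathcal{L}\) is
expressed as:
\[ \mathcal{L}(x_i, \alpha) = f(x) + \sum_{i=1}^{n} \alpha_i f_y \tag{3.53} \]
where α represents the Lagrange multipliers and \(f_y\) represents the constraint
function written in the form \(f_y \leq 0\). Here each constraint is
\(1 - y_i(w x_i + b) \leq 0\) with \(\alpha_i \geq 0\), so the Lagrangian form of
this problem is
\[ \mathcal{L}(x_i, \alpha_i) = \frac{\|w\|^2}{2} - \sum_{i=1}^{n} \alpha_i [y_i (w x_i + b) - 1] \tag{3.54} \]
We can find w and b by taking the partial derivatives of the Lagrangian function
with respect to w and b and setting them to zero:
\[ \frac{\partial \mathcal{L}}{\partial w} = 0 \tag{3.55} \]
\[ \frac{\partial \mathcal{L}}{\partial b} = 0 \tag{3.56} \]
First we solve for w:
\[ \frac{\partial \mathcal{L}}{\partial w} = \frac{\partial}{\partial w} \frac{\|w\|^2}{2} - \sum_{i=1}^{n} \left[ \frac{\partial (\alpha_i y_i w x_i)}{\partial w} + \frac{\partial (\alpha_i y_i b)}{\partial w} - \frac{\partial \alpha_i}{\partial w} \right] \tag{3.57} \]
\[ 0 = \frac{1}{2} 2w - \sum_{i=1}^{n} (\alpha_i y_i x_i) + 0 + 0 \tag{3.58} \]
\[ w = \sum_{i=1}^{n} (\alpha_i y_i x_i) \tag{3.59} \]
Taking the partial derivative of the Lagrangian function with respect to b, we get:
\[ \sum_{i=1}^{n} (\alpha_i y_i) = 0 \tag{3.60} \]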
In all conditions, where yi(wxi + b) = 1 it must be the case αi = 0. Solving for
w, b and all αi, it is still complicated. lagrangian equation with function f(x) and
constraints to be transformed to its dual form, which is easier to solve. Using the
fact ∥w∥2 = wwT , the lagrangian equation can be written as:
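\[ \mathcal{L}(x_i, \alpha) = \frac{1}{2} w w^{T} - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^{T} x_i + b) - 1 \right] \tag{3.61} \]
\[ w = \sum_{i=1}^{n} (\alpha_i y_i x_i) \tag{3.62} \]
\[ w^{T} = \sum_{j=1}^{n} (\alpha_j y_j x_j) \tag{3.63} \]
\[ \mathcal{L}(x_i, \alpha_i, x_j, \alpha_j) = \frac{1}{2} \sum_{i=1}^{n} (\alpha_i y_i x_i) \sum_{j=1}^{n} (\alpha_j y_j x_j) - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( \sum_{j=1}^{n} (\alpha_j y_j x_j)\, x_i + b \right) - 1 \right] \tag{3.64} \]
\[ = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (\alpha_i \alpha_j y_i y_j x_i x_j) - \sum_{i=1}^{n} \alpha_i y_i \left( \sum_{j=1}^{n} (\alpha_j y_j x_j)\, x_i + b \right) + \sum_{i=1}^{n} \alpha_i \tag{3.65} \]
\[ = \frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i \alpha_j y_i y_j x_i x_j) - \sum_{i,j=1}^{n} (\alpha_i \alpha_j y_i y_j x_i x_j) - b \sum_{i=1}^{n} \alpha_i y_i + \sum_{i=1}^{n} \alpha_i \tag{3.66} \]
As we know from Eq. (3.60),
\[ \sum_{i=1}^{n} \alpha_i y_i = 0 \]
Substituting this into the above equation gives:
\[ = \left[ \frac{1}{2} - 1 \right] \sum_{i,j=1}^{n} (\alpha_i \alpha_j y_i y_j x_i x_j) + 0 + \sum_{i=1}^{n} \alpha_i \tag{3.67} \]
\[ = -\frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i \alpha_j y_i y_j x_i x_j) + 0 + \sum_{i=1}^{n} \alpha_i \tag{3.68} \]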
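The dual expression in Eq. (3.68) depends on the data only through the products x_i x_j, which makes it simple to evaluate. The sketch below is an illustration under assumed values, not the procedure used in this thesis: it evaluates the dual expression for a hypothetical two-point data set with hand-picked multipliers and compares it with the primal objective ||w||²/2 for the w built from the same multipliers.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def dual_objective(alphas, ys, xs):
    """sum_i alpha_i - 1/2 * sum_ij alpha_i alpha_j y_i y_j (x_i . x_j)."""
    n = len(alphas)
    quad = 0.0
    for i in range(n):
        for j in range(n):
            quad += alphas[i] * alphas[j] * ys[i] * ys[j] * dot(xs[i], xs[j])
    return sum(alphas) - 0.5 * quad

def primal_objective(w):
    """f = ||w||^2 / 2."""
    return 0.5 * dot(w, w)

# Hypothetical toy data and multipliers (assumed, satisfying sum_i alpha_i y_i = 0).
xs = [[1.0, 1.0], [-1.0, -1.0]]
ys = [1, -1]
alphas = [0.25, 0.25]

# w rebuilt from the multipliers: w = sum_i alpha_i y_i x_i.
w = [sum(a * y * x[k] for a, y, x in zip(alphas, ys, xs)) for k in range(2)]

dual_value = dual_objective(alphas, ys, xs)
primal_value = primal_objective(w)
print(dual_value, primal_value)
```

For this toy problem the two values coincide, which is what is expected when the multipliers are optimal; for other multiplier values the dual expression would not exceed the optimal primal value.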
In 1995, Corinna Cortes and Vladimir N. Vapnik proposed an adapted maximum
margin idea which allows for misclassified data points. The soft margin method
selects a hyperplane that splits the data points as cleanly as possible. This method
introduces the slack variables, $\xi_i$, which measure the degree of
misclassification of the data point $x_i$.
\[ f_y = y_i (w x_i + b) \ge 1 - \xi_i, \quad \text{for } i = 1, \ldots, n \text{ and } \xi_i \ge 0 \tag{3.69} \]
If the penalty function is linear, the optimization problem becomes:
\[ \min_{w, b, \xi_i} \left[ \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{n} \xi_i \right] \tag{3.70} \]
where the parameter C controls the overfitting problem.
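To make the role of the slack variables and of C concrete, the sketch below evaluates the soft margin objective for a fixed hyperplane. The data points, the hyperplane and the values of C are hypothetical. The slack of each point is taken as max(0, 1 − y_i(w x_i + b)), the smallest value that satisfies the constraint in Eq. (3.69).

```python
def slack(w, b, x, y):
    """Smallest xi >= 0 with y (w.x + b) >= 1 - xi."""
    score = sum(wk * xk for wk, xk in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

def soft_margin_objective(w, b, xs, ys, C):
    """Return (1/2 ||w||^2 + C * sum_i xi_i, list of slacks)."""
    slacks = [slack(w, b, x, y) for x, y in zip(xs, ys)]
    value = 0.5 * sum(wk * wk for wk in w) + C * sum(slacks)
    return value, slacks

# Hypothetical hyperplane and data: two points on the class boundaries,
# one inside the margin, and one on the wrong side of the hyperplane.
w = [0.5, 0.5]
b = 0.0
xs = [[1.0, 1.0], [-1.0, -1.0], [0.5, 0.0], [1.0, 0.0]]
ys = [1, -1, 1, -1]

value_small_c, slacks = soft_margin_objective(w, b, xs, ys, C=1.0)
value_large_c, _ = soft_margin_objective(w, b, xs, ys, C=10.0)
print(slacks, value_small_c, value_large_c)
```

Points on or outside their class boundary get zero slack, the point inside the margin gets a slack below 1, and the misclassified point gets a slack above 1. A larger C makes the same violations more expensive, so the optimizer tolerates fewer of them.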
Figure 3.5: Linear data classification with slack variables (separating hyperplane
wx + b = 0, class boundaries wx + b = +1 and wx + b = −1, the margin, and the
slack variables of points lying beyond their class boundary)
Chapter 4
Conclusion
Worldwide, electricity losses amount to about $25 billion, which is a very high
amount. To cope with this situation, the primary and necessary action must be
taken by governments. They must subsidize electricity prices. Governments can
also give incentives in the form of funds that allow users to work on small
projects. In these projects, users produce electricity using the methods most
favorable for their area and supply it back to the utility and grid stations.
Another major point is to make people aware of good and bad usage through media
such as television, radio and newspapers. In the second part of this thesis,
estimation, I tested data taken from Lahore Electric Supply Company using
regression analysis. I fitted a regression model and plotted the Lahore Electric
Supply Company data against it. Some of the values deviated from the fitted model;
these are essentially electricity theft. After finding the coefficient of
determination, R², I concluded that 96.3 out of 100 values show interdependence
and correlation with the fitted model, while 3.7% of the values are theft values.
Spearman's coefficient of correlation test gives a result of 99.3%, as it does not
explain all the data points correlated with the model, though it is an easy way to
find the correlation between the non-theft and theft scenarios. For Karl Pearson's
coefficient of correlation, the chi-square method and Student's t-distribution are
used to find the correlation in the data. In hypothesis testing, I first assumed
the null and alternative hypotheses, then set up the significance level and applied
the test statistic to the null hypothesis. The critical region value from the
Student's t-distribution table is 3.169. Since the computed value of
t = 15.11536 falls in the critical region, we reject the null hypothesis and may
conclude that there is sufficient reason to say, at the 1% level of significance,
that consumed and sold MKWH are related.
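The regression and hypothesis-testing steps summarized above can be sketched as follows. This is only an illustration under stated assumptions: the twelve data points are invented and are not the Lahore Electric Supply Company figures, the t statistic is assumed to be the usual one for a correlation coefficient, t = r√(n − 2)/√(1 − r²), and twelve points are used so that the degrees of freedom (n − 2 = 10) match the tabulated critical value 3.169 for a two-sided test at the 1% level.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept and Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Hypothetical monthly figures in MKWH, invented for illustration only.
consumed_mkwh = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210]
sold_mkwh = [92, 96, 109, 117, 124, 138, 143, 155, 162, 168, 181, 187]

slope, intercept, r = fit_line(consumed_mkwh, sold_mkwh)
r_squared = r * r  # coefficient of determination for a simple linear fit

# Deviations of each observation from the fitted model.
residuals = [y - (slope * x + intercept)
             for x, y in zip(consumed_mkwh, sold_mkwh)]

CRITICAL_T = 3.169  # critical value quoted in the text
t_value = t_statistic(r, len(consumed_mkwh))
reject_null = abs(t_value) > CRITICAL_T

print(r_squared, t_value, reject_null)
```

Observations with large residuals are the ones that deviate from the fitted model; how large a deviation must be before it is treated as theft is a threshold this sketch leaves open.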