
Understanding Log Lines using Development Knowledge

Date post: 09-Apr-2017
Upload: sailqu
1 Understanding Log Lines Using Development Knowledge Ahmed E. Hassan Meiyappan Nagappan Weiyi Shang Zhen Ming Jiang
Transcript

1

Understanding Log Lines Using Development Knowledge

Ahmed E. Hassan, Meiyappan Nagappan, Weiyi Shang, Zhen Ming Jiang

2

Practitioners have challenges in understanding log lines

Fetch failure

What exactly does this message mean?

What could be the cause?

Is it affecting my data?

3

Practitioners either ask experts to help or search online for log inquiries

4

We performed an exploratory study on 3 large software systems

Zookeeper

5,641 logging statements

1,080 logging statements

1,163 logging statements

5

We manually examined real-life inquiries about log lines from 3 sources

User mailing lists

Randomly sampled logs

6

Log inquiries ask for 5 types of information

Meaning: What exactly does this message mean?

Context: When does this occur?

Cause: What could be the cause?

Solution: How can I avoid this message/problem?

Impact: Is it affecting my data?

7

Experts are crucial in resolving log inquiries

[Bar chart: number of inquiries (0 to 6) for Hadoop, Cassandra, and Zookeeper. Resolved inquiries are split into resolved by expert and resolved by non-expert; un-resolved inquiries are split into replied by expert, only replied by non-expert, and not answered.]

8

Experts are crucial in resolving log inquiries

8 out of 11 resolved inquiries are resolved by experts.


9

Experts are crucial in resolving log inquiries


Inquiries are always resolved if experts reply.

10

Looking for an expert is not the optimal approach to resolve log inquiries

Over 20% of the inquiries have no reply.

Wrong answers may be posted in reply to inquiries.

Identifying the expert of a log line is challenging.

First reply can take up to 210 hours.

11

Can we document the inquired logs?

12

Nothing in common between inquired logs

An on-demand approach is needed to assist in understanding logs.

Different log verbosity levels

0 to 2 degrees of fan-in

0 to 200 prior code changes

Real-life inquiries

13

We propose to attach development knowledge to logs

Code commit

Issue reports

Source code

/*…*/

Call graph

Code comments

14

Code commit Issue reports

Source code

/*…*/

Code comments

Call graph

fetch failure

From method checkAndInformJobTracker of file ShuffleScheduler.java

An example of using development knowledge to resolve inquiries of log “fetch failure”

15

Code commit Issue reports

Source code

/*…*/

Code comments

Call graph

fetch failure

Notify the JobTracker after every read error, if `reportReadErrorImmediately' is true or after every `maxFetchFailuresBeforeReporting' failures

An example of using development knowledge to resolve inquiries of log “fetch failure”

16

Code commit Issue reports

Source code

/*…*/

Code comments

Call graph

fetch failure

Called by method copyFailed in class ShuffleScheduler

An example of using development knowledge to resolve inquiries of log “fetch failure”

17

Code commit Issue reports

Source code

/*…*/

Code comments

Call graph

fetch failure

Allow shuffle retries and read-error reporting to be configurable. Contributed by Amareshwari Sriramadasu.

An example of using development knowledge to resolve inquiries of log “fetch failure”

18

Code commit Issue reports

Source code

/*…*/

Code comments

Call graph

fetch failure

MAPREDUCE-1171.… This is caused by a behavioral change in hadoop 0.20.1. ……One solution I could see is "Provide a config option... ”…

An example of using development knowledge to resolve inquiries of log “fetch failure”

19

Code commit Issue reports

Source code

/*…*/

Code comments

Call graph

fetch failure

Meaning: There is a data reading error.

Cause: One of the possible reasons is a configuration.

Context: The event happens during the shuffle period, while copying data.

Impact: The event impacts the jobtracker.

Solution: Changing a configuration option would solve the issue.

Amareshwari Sriramadasu is the expert to go to.

An example of using development knowledge to resolve inquiries of log “fetch failure”

Resolve the inquiry using development knowledge.

Go to the expert for help.

20

Overview of our approach

Version control system

Generating templates for logs

Matching logs with log templates

Attaching development knowledge to logs

Source code

Log templates

Development knowledge

21

Step 1: Generating templates for logs

Version control system

foo() { … Log_statement("time=%d, Trying to launch, TaskID=%s", time, taskid); … }

time=\d+, Trying to launch, TaskID=\S+
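The conversion on this slide can be sketched in Python. This is a minimal illustration, not the authors' tool: the function name `format_to_template` and the specifier table are assumptions, and only the `%d` and `%s` specifiers shown on the slide are handled.

```python
import re

# Assumed mapping from printf-style specifiers to regex fragments.
# Only %d and %s appear on the slide; other specifiers are not handled.
SPECIFIER_PATTERNS = {"%d": r"\d+", "%s": r"\S+"}


def format_to_template(format_string: str) -> str:
    """Turn a logging format string into a regular-expression template."""
    # Splitting with a capturing group keeps the specifiers in the result.
    parts = re.split(r"(%[ds])", format_string)
    return "".join(
        SPECIFIER_PATTERNS[part] if part in SPECIFIER_PATTERNS else re.escape(part)
        for part in parts
    )


template = format_to_template("time=%d, Trying to launch, TaskID=%s")
```

The static text is passed through `re.escape`, so the generated pattern may contain extra backslashes compared with the template on the slide, but it accepts the same log lines.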

22

Step 2: Matching logs with log templates

time=\d+, Trying to launch, TaskID=\S+

time=1, Trying to launch, TaskID=task_1

time=2, launch task, TaskID=task_1…

time=10, task finished, TaskID=task_1

Log template

Logs
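A minimal Python sketch of the matching step, assuming templates are stored as compiled regular expressions keyed by an identifier. The table `TEMPLATES`, the identifier `"launch"`, and the function `match_template` are hypothetical names used only for illustration.

```python
import re
from typing import Optional

# Hypothetical template table: template identifier -> compiled regex.
TEMPLATES = {
    "launch": re.compile(r"time=\d+, Trying to launch, TaskID=\S+"),
}


def match_template(log_line: str) -> Optional[str]:
    """Return the identifier of the first template matching the whole line."""
    for template_id, pattern in TEMPLATES.items():
        if pattern.fullmatch(log_line):
            return template_id
    return None


logs = [
    "time=1, Trying to launch, TaskID=task_1",
    "time=2, launch task, TaskID=task_1",
    "time=10, task finished, TaskID=task_1",
]
matches = [match_template(line) for line in logs]
```

Only the first log line fits the template; the other two would need templates generated from their own logging statements.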

23

Step 3: Attaching development knowledge to logs

Code commit

Issue reports

Source code

/*…*/

Call graph Code comments

Version control system

Issue tracking system
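One way to picture the attachment step is a record per log template that gathers the five knowledge sources. The sketch below is an assumption about how such a record could look, not the authors' data model: the class `DevelopmentKnowledge`, the lookup `explain`, and the pattern used for "fetch failure" are hypothetical, while the field values are taken from the "fetch failure" example slides.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DevelopmentKnowledge:
    """Development knowledge attached to one log template (hypothetical layout)."""
    source_location: str
    code_comment: str = ""
    callers: List[str] = field(default_factory=list)
    commit_messages: List[str] = field(default_factory=list)
    issue_ids: List[str] = field(default_factory=list)


# The real log template is not shown on the slides; this pattern is assumed.
KNOWLEDGE = [
    (
        re.compile(r"fetch failure", re.IGNORECASE),
        DevelopmentKnowledge(
            source_location="checkAndInformJobTracker in ShuffleScheduler.java",
            code_comment=(
                "Notify the JobTracker after every read error, if "
                "'reportReadErrorImmediately' is true or after every "
                "'maxFetchFailuresBeforeReporting' failures"
            ),
            callers=["ShuffleScheduler.copyFailed"],
            commit_messages=[
                "Allow shuffle retries and read-error reporting to be "
                "configurable. Contributed by Amareshwari Sriramadasu."
            ],
            issue_ids=["MAPREDUCE-1171"],
        ),
    ),
]


def explain(log_line: str) -> Optional[DevelopmentKnowledge]:
    """Return the knowledge attached to the first template found in the line."""
    for pattern, knowledge in KNOWLEDGE:
        if pattern.search(log_line):
            return knowledge
    return None
```

With such a record in hand, a practitioner who sees "Fetch failure" can read the comment and issue report directly, or follow the commit message to the developer to ask.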

24

Can development knowledge complement logging statements?

Complementing logging statements

Resolving real-life log inquiries

Can development knowledge help resolve real-life inquiries?

25

We compare our approach against Google and mailing lists for resolving real-life log inquiries

Real-life inquiries

26

[Bar chart: percentage of resolved log inquiries, axis from 0% to 80%]

Our approach outperforms Google and is comparable to mailing lists in resolving log inquiries

27

[Bar chart: percentage of each type of inquired information (meaning, cause, context, solution, impact) provided by our approach, axis from 0% to 100%]

Our approach provides 62% of inquired log information

28

Complementing logging statements

Resolving Log Inquiries

Can Development Knowledge Help Resolve Real-life Inquiries?

YES!

Can development knowledge complement logging statements?


30

We complement a random sample of logging statements using our approach

Zookeeper

300 randomly sampled logging statements

31

Development knowledge can complement logging statements

[Bar chart: percentage of logging statements complemented by our approach (0 to 100) for meaning, cause, context, solution, and impact, across Hadoop, Cassandra, and Zookeeper]

Issue reports are the best development knowledge to complement logging statements.

32

Complementing logging statements

Resolving Log Inquiries

Can Development Knowledge Help Resolve Real-life Inquiries?

YES! YES!

Can development knowledge complement logging statements?

33

Practitioners have challenges in understanding log lines

Fetch failure

What exactly does this message mean?

… could this be the cause?

Is it affecting my data?

34

35

Log inquiries ask for 5 types of information

Meaning: What exactly does this message mean?

Context: When does this occur?

Cause: … could this be the cause?

Solution: It will be great if some one can point to the direction how to solve this?

Impact: Is it affecting my data?

36

37

We propose to attach development knowledge to logs

Code commit

Issue reports

Source code

/*…*/

Call graph

Code comments

38

39

Complementing logging statements

Resolving Log Inquiries

Can Development Knowledge Help Resolve Real-life Inquiries?

YES! YES!

Can development knowledge complement logging statements?

40

http://tinyurl.com/hirePhD

