Date post: 06-Feb-2018

Static Branch Frequency and Program Profile Analysis

Divino César Soares Lucas

[email protected]

Laboratório de Sistemas de Computação

Instituto de Computação

UNICAMP

Youfeng Wu

[email protected]

Intel Labs

James R. Larus

[email protected]

University of Wisconsin

Schedule

1. Introduction

2. Related Work

3. Key Idea

4. Branch Prediction

5. Branch Probabilities

6. Combining Predictions

7. Local Block and Edge Frequency

8. From Local to Global Frequencies

9. Results

10. Conclusion

11. References

Introduction

• What is a program profile?

• Dynamic profile

• Static profile

• Why do we need a profile?

• Instruction scheduling

• Identifying program bottlenecks

• Enhance memory locality

Related Work

• Dynamic profile

• Work centered on reducing profiling overhead [3, 6]

• Static profile

• Simple estimation heuristics [4]

• Estimation based on Markov models [5]

Key Idea [1]

• Predict Branches

• Use heuristics

• Compute Probabilities

• Use heuristic hit rates

• Compute Frequency

• Use probabilities

Branch Prediction

• A branch prediction states whether a branch will be taken or not taken. It is a binary decision!

• Some static heuristics [2]:

• LBH - Loop Branch Heuristic

• PH - Pointer Heuristic

• OH - Opcode Heuristic

• GH - Guard Heuristic

• LEH - Loop Exit Heuristic

• LHH - Loop Header Heuristic

• CH - Call Heuristic

• SH - Store Heuristic

• RH - Return Heuristic

Branch Probabilities

• A branch probability is an estimate of how likely the branch is to be taken. It is a continuous value in [0, 1].

Heuristic                Hit rate
Loop Branch Heuristic    88%
Pointer Heuristic        60%
Opcode Heuristic         84%
Guard Heuristic          62%
Loop Exit Heuristic      80%
Loop Header Heuristic    75%
Call Heuristic           78%
Store Heuristic          55%
Return Heuristic         72%

• We will use these hit rates as branch probabilities.
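
A minimal sketch of that step, assuming a heuristic's hit rate becomes the taken-probability when it predicts "taken" and one minus the hit rate when it predicts "not taken". The names `HIT_RATES` and `taken_probability` are illustrative and do not come from the slides.

```python
# Hit rates from the table above, keyed by the abbreviations used in these slides.
HIT_RATES = {
    "LBH": 0.88, "PH": 0.60, "OH": 0.84, "GH": 0.62, "LEH": 0.80,
    "LHH": 0.75, "CH": 0.78, "SH": 0.55, "RH": 0.72,
}


def taken_probability(heuristic: str, predicts_taken: bool) -> float:
    """Probability that the branch is taken, according to one heuristic.

    Assumption: a heuristic that predicts "not taken" with hit rate r
    contributes a taken-probability of 1 - r.
    """
    rate = HIT_RATES[heuristic]
    return rate if predicts_taken else 1.0 - rate
```

For example, `taken_probability("RH", False)` gives the probability that a branch is taken when the Return Heuristic predicts the other successor.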

Combining Predictions

• What happens if two or more heuristics are applicable?

if (k < 0) then
    k = y;
else
    return;
end-if

• OH predicts the then part (with an 84% hit rate).

• RH predicts the else part (with a 72% hit rate).

• In these situations we use the Dempster-Shafer algorithm…

Combining Predictions

• Each branch has a set of possible targets. In our case there are two, taken or not taken:

\[ B = \{t_1, t_2\} \]

• Each heuristic gives evidence for each target:

\[ h_1(t_1) = a, \qquad h_1(t_2) = 1 - a \]

\[ h_2(t_1) = b, \qquad h_2(t_2) = 1 - b \]

• The Dempster-Shafer algorithm combines this evidence:

\[ (h_1 \oplus h_2)(t_1) = \frac{h_1(t_1)\,h_2(t_1)}{h_1(t_1)\,h_2(t_1) + h_1(t_2)\,h_2(t_2)} \]

\[ (h_1 \oplus h_2)(t_2) = \frac{h_1(t_2)\,h_2(t_2)}{h_1(t_1)\,h_2(t_1) + h_1(t_2)\,h_2(t_2)} \]
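
The two-target combination rule above can be sketched in a few lines of Python. This is only an illustration of the formula as written on the slide; the function names `combine` and `combine_all` are hypothetical, and each argument is the probability a heuristic assigns to the same target t1.

```python
from functools import reduce


def combine(a: float, b: float) -> float:
    """Combine two estimates for target t1 with the two-target rule.

    a and b are the probabilities that two heuristics assign to t1;
    each heuristic assigns 1 - a (or 1 - b) to t2.
    """
    agree_t1 = a * b
    agree_t2 = (1.0 - a) * (1.0 - b)
    total = agree_t1 + agree_t2
    if total == 0.0:
        # One heuristic is certain of t1 and the other is certain of t2.
        raise ValueError("heuristics are in total conflict")
    return agree_t1 / total


def combine_all(probabilities):
    """Fold a non-empty sequence of estimates for t1 into one value."""
    return reduce(combine, probabilities)
```

An estimate of 0.5 carries no evidence either way, so `combine(0.5, b)` returns `b` up to rounding.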

Combining Predictions

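Example:

\[ h_1(t_1) = 0.5, \qquad h_1(t_2) = 0.5 \]

\[ h_2(t_1) = 0.7, \qquad h_2(t_2) = 0.3 \]

\[ h_3(t_1) = 0.6, \qquad h_3(t_2) = 0.4 \]

\[ (h_1 \oplus h_2)(t_1) = \frac{0.5 \times 0.7}{0.5 \times 0.7 + 0.5 \times 0.3} = 0.7 \]

\[ (h_1 \oplus h_2)(t_2) = \frac{0.5 \times 0.3}{0.5 \times 0.7 + 0.5 \times 0.3} = 0.3 \]

\[ (h_2 \oplus h_3)(t_1) = \frac{0.7 \times 0.6}{0.7 \times 0.6 + 0.3 \times 0.4} = 0.778 \]

\[ (h_2 \oplus h_3)(t_2) = \frac{0.3 \times 0.4}{0.7 \times 0.6 + 0.3 \times 0.4} = 0.222 \]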
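
The same rule can be applied to the earlier if/else example. This worked calculation is not on the slides. It takes the slide's predictions as given and assumes t1 is the then part, so OH contributes 0.84 for t1 and RH, which predicts the else part, contributes 1 − 0.72 = 0.28 for t1:

\[ (h_{OH} \oplus h_{RH})(t_1) = \frac{0.84 \times 0.28}{0.84 \times 0.28 + 0.16 \times 0.72} = \frac{0.2352}{0.3504} \approx 0.671 \]

\[ (h_{OH} \oplus h_{RH})(t_2) = \frac{0.16 \times 0.72}{0.84 \times 0.28 + 0.16 \times 0.72} = \frac{0.1152}{0.3504} \approx 0.329 \]

Under these assumptions the then part stays the more likely target, but with less confidence than OH alone would give.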

Local Block and Edge Frequency

• The block/edge frequency is an estimate of how often a block is executed or an edge is taken.

• We calculate local block/edge frequencies by propagating branch probabilities, that is:

\[ \mathit{bfreq}(b_i) = \begin{cases} 1 & \text{if } b_i \text{ is the entry block} \\ \sum_{b_p \in \mathit{pred}(b_i)} \mathit{freq}(b_p \to b_i) & \text{otherwise} \end{cases} \]

\[ \mathit{freq}(b_i \to b_j) = \mathit{bfreq}(b_i) \cdot \mathit{prob}(b_i \to b_j) \]

• But these formulas don't work when we have a cycle!
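
For a control-flow graph without cycles, the propagation can be sketched as a single pass in topological order. This is a simplified illustration, not the algorithm from [1]: the function name `local_frequencies`, the edge-dictionary representation, and the diamond-shaped example graph are all assumptions made for the sketch.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def local_frequencies(entry, prob):
    """Propagate branch probabilities through an acyclic control-flow graph.

    prob maps (src, dst) edges to branch probabilities.
    Returns (bfreq, freq): block frequencies and edge frequencies.
    TopologicalSorter raises graphlib.CycleError if the graph has a cycle.
    """
    preds = {}
    for src, dst in prob:
        preds.setdefault(dst, set()).add(src)
        preds.setdefault(src, set())

    bfreq, freq = {}, {}
    # The sorter takes node -> predecessors and yields predecessors first.
    for block in TopologicalSorter(preds).static_order():
        if block == entry:
            bfreq[block] = 1.0
        else:
            bfreq[block] = sum(freq[(p, block)] for p in preds[block])
        for (src, dst), p in prob.items():
            if src == block:
                freq[(src, dst)] = bfreq[block] * p
    return bfreq, freq


# Hypothetical diamond: entry branches to a or b, both fall through to exit.
diamond = {
    ("entry", "a"): 0.84, ("entry", "b"): 0.16,
    ("a", "exit"): 1.0, ("b", "exit"): 1.0,
}
bfreq, freq = local_frequencies("entry", diamond)
```

In the diamond, the two paths rejoin, so the frequency of `exit` adds back up to the entry frequency.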

Local Block and Edge Frequency

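\[
\begin{aligned}
\mathit{bfreq}(b_0) &= \mathit{in\_freq}(b_0) + \sum_{i=1}^{k} \mathit{freq}(b_i \to b_0) \\
&= \mathit{in\_freq}(b_0) + \sum_{i=1}^{k} \mathit{bfreq}(b_i)\,\mathit{prob}(b_i \to b_0) \\
&= \mathit{in\_freq}(b_0) + \sum_{i=1}^{k} \mathit{bfreq}(b_0)\,r_i\,\mathit{prob}(b_i \to b_0) \\
&= \mathit{in\_freq}(b_0) + \mathit{bfreq}(b_0) \sum_{i=1}^{k} r_i\,\mathit{prob}(b_i \to b_0)
\end{aligned}
\]

The third line substitutes \( \mathit{bfreq}(b_i) = r_i\,\mathit{bfreq}(b_0) \).

Let

\[ \mathit{cp}(b_0) = \sum_{i=1}^{k} r_i\,\mathit{prob}(b_i \to b_0) \]

\[ \mathit{bfreq}(b_0) = \mathit{in\_freq}(b_0) + \mathit{bfreq}(b_0)\,\mathit{cp}(b_0) \]

\[ \mathit{bfreq}(b_0) = \frac{\mathit{in\_freq}(b_0)}{1 - \mathit{cp}(b_0)} \]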

Local Block and Edge Frequency

Example:

\[ \mathit{bfreq}(b_0) = \frac{1}{1 - 0.88 - 0.88 \times 0.12 - 0.88 \times 0.12 \times 0.12} = 578.70 \]
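
The arithmetic in this example can be checked with a short Python sketch of the closed form bfreq(b0) = in_freq(b0) / (1 − cp(b0)). The function name `loop_header_frequency` is hypothetical, and the three terms are simply the ones that appear in the denominator above.

```python
def loop_header_frequency(in_freq: float, cyclic_terms) -> float:
    """Evaluate bfreq(b0) = in_freq(b0) / (1 - cp(b0)).

    cyclic_terms holds the r_i * prob(b_i -> b0) contributions,
    one per edge that leads back to b0; their sum is cp(b0).
    """
    cp = sum(cyclic_terms)
    if cp >= 1.0:
        raise ValueError("cyclic probability must be below 1")
    return in_freq / (1.0 - cp)


# The three back-edge contributions from the example above.
terms = [0.88, 0.88 * 0.12, 0.88 * 0.12 * 0.12]
example = loop_header_frequency(1.0, terms)
print(f"{example:.2f}")  # prints 578.70
```

The guard on `cp` matters because the closed form only makes sense when the cyclic probability is strictly below 1.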

From Local to Global Frequencies

• Considering one invocation of f, the frequency with which function f calls another function g is:

\[ \mathit{lfreq}(f, g) = \sum_{b_i \in f} \mathit{bfreq}(b_i)\,\mathit{calls}(b_i, g) \]

• The global frequency of f calling g is:

\[ \mathit{gfreq}(f, g) = \mathit{cfreq}(f)\,\mathit{lfreq}(f, g) \]

• Where:

\[ \mathit{cfreq}(f) = \begin{cases} 1 & \text{if } f \text{ is the main function} \\ \sum_{p \in \mathit{pred}(f)} \mathit{freq}(p, f) & \text{otherwise} \end{cases} \]

• Global block/edge frequency can be calculated by multiplying the function's execution frequency by the local block/edge frequency.
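
A minimal sketch of the local-to-global step for a call graph without recursion. Two assumptions are made here: `freq(p, f)` in the definition of cfreq is read as the global call frequency gfreq(p, f), and the call graph is acyclic. The function name `global_call_frequencies` and the example functions `parse` and `emit` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def global_call_frequencies(main, lfreq):
    """Compute cfreq and gfreq from local call frequencies.

    lfreq maps (caller, callee) to the local call frequency per
    invocation of the caller. Recursive call graphs are not handled:
    TopologicalSorter raises graphlib.CycleError on a cycle.
    """
    callers = {}
    for f, g in lfreq:
        callers.setdefault(g, set()).add(f)
        callers.setdefault(f, set())

    cfreq, gfreq = {}, {}
    # Visit callers before callees so every gfreq(p, f) is ready when needed.
    for f in TopologicalSorter(callers).static_order():
        if f == main:
            cfreq[f] = 1.0
        else:
            cfreq[f] = sum(gfreq[(p, f)] for p in callers[f])
        for (caller, callee), local in lfreq.items():
            if caller == f:
                gfreq[(caller, callee)] = cfreq[f] * local
    return cfreq, gfreq


# Hypothetical program: main calls parse twice and emit once;
# each invocation of parse calls emit three times.
lfreq = {("main", "parse"): 2.0, ("main", "emit"): 1.0, ("parse", "emit"): 3.0}
cfreq, gfreq = global_call_frequencies("main", lfreq)
```

Here `emit` is reached once directly from `main` and six times through `parse`, so its invocation frequency is the sum of both contributions.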

Results

• Scores of SPEC92 local block frequency:

Results

• Scores of SPEC92 local edge frequency:

Results

• Scores of SPEC92 local edge frequency:

Results

• Results came from the SPECint92 C benchmarks and some Unix applications.

• The system used was a Sequent S2000/750 with i486 processors and the Sequent DYNIX/ptx C compiler 2.1.

• Scores are Wall's [5] weighted and unweighted match scores.

Results

• Scores of SPEC92 global function call frequency:

Results

• Scores of SPEC92 global block frequency:

Results

• Scores of SPEC92 global edge frequency:

Results

• Scores for Unix commands:

Conclusion

• A new technique for static profiling was presented.

• The technique introduced a new way to combine multiple pieces of evidence for a branch outcome.

• Although the heuristics' hit rates were measured in a different environment, they still gave good results.

References

[1] Y. Wu and J. R. Larus. Static Branch Frequency and Program Profile Analysis. In Proceedings of the 27th Annual International Symposium on Microarchitecture, pages 1-11, 1994.

[2] T. Ball and J. R. Larus. Branch prediction for free. In SIGPLAN Conference on Programming Language Design and Implementation, pages 300-313, 1993.

[3] T. Ball and J. R. Larus. Optimally profiling and tracing programs. ACM Transactions on Programming Languages and Systems, 16(4):1319-1360, July 1994.

[4] T. A. Wagner, V. Maverick, S. L. Graham, and M. A. Harrison. Accurate static estimators for program optimization. In Proceedings of the ACM SIGPLAN'94 Conference on Programming Language Design and Implementation, pages 85-96. ACM Press, 1994.

References

[5] D. W. Wall. Predicting Program Behavior Using Real or Estimated Profiles. In Proceedings of the ACM SIGPLAN'91 Conference on Programming Language Design and Implementation, pages 59-70, 1991.

[6] V. Sarkar. Determining average program execution times and their variance. In SIGPLAN Conference on Programming Language Design and Implementation, pages 298-312, 1989.

