Tcl/Tk Conference 2012 Pulling Out All The Stops – Part II
Phil Brooks
November 15, 2012
© 2010 Mentor Graphics Corp. Company Confidential www.mentor.com
Purpose of the talk
Experiences developing for maximum performance with Threads in Tcl
- Recap of Pulling Out All The Stops – Part I
- Changing customer usage patterns
- Threads and performance
- Observations
IIT Engr - August 2011 D2S Operations Meeting
Context
Application: Calibre LVS Device Extraction
- Uses geometric analysis to identify devices in a graphical database
- High performance
- Multi-threaded execution
- User programmability
  - SVRF calculator
  - Tcl
Pulling Out All The Stops – Part I
Developed in 2005
Device TVF – Tcl-based extension to the then-existing SVRF Calculator
Achieved excellent performance by:
- Using TCL_EVAL_GLOBAL
- Pre-run compilation using Tcl_EvalObjv
- Pre-run data access setup via cached Tcl_Obj arguments and results
- End users strongly encouraged to write efficient Tcl code
  - (remember expr { ... })
Interpreter was single-threaded, and computation threads accessed it serially through a lock.
Initial Use
Originally, Tcl was added to avoid having to add looping constructs to our calculator.
Simple interface: call with parameters and return a single result.
A few Tcl calls per device.
p1 = TVF_NUMERIC_FUNCTION::libname::procname( parm1, parm2, parm3)
p2 = TVF_NUMERIC_FUNCTION::libname::procname( parm4, parm5, parm6)
Early user code tended to look like this:
proc do_calculation my_arr {
    set entry_count [ $my_arr entry_count ]
    # iterate using an index
    set val 0.0
    for { set i 0 } { $i < $entry_count } { incr i } {
        set val [ expr { .... } ]
    }
    return $val
}
(where ... was a very long expression)
8 years later
- Dozens of parameters
- Complex multi-step calculations
- Returning long string results instead of numbers
- Dozens of calls per device
So we went from:
- 2005: good scaling across 4-8 processors using a single Tcl interpreter
- 2012: poor scaling across 8-32 processors using a single Tcl interpreter
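One way to read this shift, as a sketch rather than a measurement: if a fraction s of the per-device work runs inside the single locked interpreter, Amdahl's law bounds the speedup on N processors. The values of s below are illustrative assumptions, not figures from the talk, and real lock contention usually does worse than this bound.

```latex
S(N) = \frac{1}{s + \dfrac{1 - s}{N}}

% Assumed s = 0.05 (a few short Tcl calls per device), N = 8:
S(8) = \frac{1}{0.05 + 0.95/8} \approx 5.9

% Assumed s = 0.30 (dozens of longer Tcl calls per device), N = 32:
S(32) = \frac{1}{0.30 + 0.70/32} \approx 3.1
\qquad \lim_{N \to \infty} S(N) = \frac{1}{s} \approx 3.3
```

Under these assumed numbers, adding processors beyond 8 buys almost nothing once the serialized share grows, which matches the direction of the change described above.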
So Let's turn on Multi-Threading
Look for information on Tcl threads:
"At the C programming level, Tcl's threading model requires that a Tcl interpreter be managed by only one thread." p. 322
“Practical Programming in Tcl and Tk” - Brent B. Welch, Ken Jones, with Jeffrey Hobbs, Prentice Hall, 2003
More about Calibre threading
revision 1.1
date: 1998/04/10 17:41:31; author: ****; state: Exp;
routines related to flat drc thread are going be in this file. Currently there is not much in it.
Task Queue and thread pool pattern
[Figure: thread pool diagram, from the Wikipedia article on the Thread Pool pattern]
Task Queue and Thread Pool API
Basic work flow:
- Define a task (data and code)
- Put it in the queue
- Wait until it is done
While working on a task:
- Create thread-specific storage (Tcl_CreateInterp)
- Do calculations (Tcl_EvalObjv)
- Return results
Where does Tcl_DeleteInterp go?
How to clean up the interpreters?
Loop through them and delete them from the main thread, right?
Threads would individually calculate using Tcl, and they ran nicely in parallel.
At the end of the processing, once all of the tasks were done and the threads were all quietly parked in their Calibre threading model parking slots, I would loop through on the main thread and destroy each interpreter.
This is where things would go terribly wrong: crashes, memory leaks, unclosed files, etc.
I was able to reproduce the behavior in a small C/Tcl testcase.
Gerald spotted the problem
Gerald W. Lester:
...
> I also found that by deleting the interpreter inside the same execution thread that it was created in,
Where else would you be deleting it from -- an interpreter is not supposed to be accessed by more than one thread. You may have many interpreters per thread (i.e. a thread can access/use many interpreters), but only one thread per interpreter (i.e. only a single thread should be accessing a given interpreter).
Review the documentation
"At the C programming level, Tcl's threading model requires that a Tcl interpreter be managed by only one thread." p. 322
"At the C programming level, Tcl's threading model requires that a Tcl interpreter be created, used and destroyed by only one thread. Interpreters cannot be used across multiple threads."
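The "created, used and destroyed by only one thread" rule can be sketched with a small thread pool. This is a minimal illustration that assumes POSIX threads rather than the Calibre threading model. Interp, interp_create, interp_eval, interp_delete, run_pool, and delete_from_other_thread are hypothetical stand-ins, not the Tcl C API; the comments mark where Tcl_CreateInterp, Tcl_EvalObjv, and Tcl_DeleteInterp would sit. The stand-in "evaluation" simply squares its input.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Tcl_Interp: it only records its owner thread. */
typedef struct { pthread_t owner; } Interp;

static int owned_by_caller(const Interp *ip) {
    return pthread_equal(ip->owner, pthread_self());
}

/* Stands in for Tcl_CreateInterp: the calling thread becomes the owner. */
Interp *interp_create(void) {
    Interp *ip = malloc(sizeof *ip);
    if (ip != NULL) ip->owner = pthread_self();
    return ip;
}

/* Stands in for Tcl_EvalObjv: squares x, but only for the owner thread. */
int interp_eval(Interp *ip, int x, long *result) {
    if (!owned_by_caller(ip)) return -1;
    *result = (long)x * x;
    return 0;
}

/* Stands in for Tcl_DeleteInterp: refuses (returns -1) for a non-owner. */
int interp_delete(Interp *ip) {
    if (!owned_by_caller(ip)) return -1;
    free(ip);
    return 0;
}

/* Task queue: a mutex-protected cursor over the input array. */
typedef struct {
    pthread_mutex_t lock;
    const int *in;
    long *out;
    int next, count, errors;
} TaskQueue;

static void *worker(void *arg) {
    TaskQueue *q = arg;
    Interp *ip = interp_create();                 /* created in this thread */
    int errors = (ip == NULL);
    while (ip != NULL) {
        pthread_mutex_lock(&q->lock);
        int i = (q->next < q->count) ? q->next++ : -1;
        pthread_mutex_unlock(&q->lock);
        if (i < 0) break;                         /* queue is empty */
        if (interp_eval(ip, q->in[i], &q->out[i]) != 0) errors++;  /* used here */
    }
    if (ip != NULL && interp_delete(ip) != 0) errors++;  /* deleted here too */
    pthread_mutex_lock(&q->lock);
    q->errors += errors;
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Runs count tasks on nworkers threads; returns the number of errors. */
int run_pool(int nworkers, const int *in, long *out, int count) {
    assert(nworkers > 0 && count >= 0);
    TaskQueue q = { .in = in, .out = out, .next = 0, .count = count, .errors = 0 };
    pthread_t *tids = malloc(sizeof *tids * (size_t)nworkers);
    if (tids == NULL) return -1;
    pthread_mutex_init(&q.lock, NULL);
    int started = 0;
    for (int t = 0; t < nworkers; t++)
        if (pthread_create(&tids[started], NULL, worker, &q) == 0) started++;
    for (int t = 0; t < started; t++) pthread_join(tids[t], NULL);
    pthread_mutex_destroy(&q.lock);
    free(tids);
    return q.errors + (nworkers - started);
}

static void *foreign_delete(void *arg) {
    return interp_delete(arg) != 0 ? arg : NULL;  /* non-NULL means refused */
}

/* Repeats the kind of mistake described in the talk: delete from a thread
   that is not the owner. Returns 1 if refused, 0 if not, -1 on thread error. */
int delete_from_other_thread(Interp *ip) {
    pthread_t tid;
    void *refused = NULL;
    if (pthread_create(&tid, NULL, foreign_delete, ip) != 0) return -1;
    pthread_join(tid, &refused);
    return refused != NULL;
}
```

The explicit ownership check is part of the stand-in, added so that the mistake is visible as a return code. With real interpreters, the talk reports that deleting from the wrong thread showed up as crashes, memory leaks, and unclosed files instead. The design point is that each worker tears down its own interpreter before it exits, so the main thread never touches one.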
MT Performance
[Figure: four charts (Tcl 8.4, Tcl 8.5, Tcl 8.6, C++), each plotting Real, CPU, and SYS for 1, 2, 4, 8, and 16 on the horizontal axis]