Page 1: Detailed Results from measurements and simulation

Detailed Results from measurements and simulation

Status Report on the Combined L1&DAQ implementation
Wednesday, April 16
Niko Neufeld

Page 2: Detailed Results from measurements and simulation


Key Technical issues

• Gigabit Ethernet Bit Error Rate (BER)
• Context switching latency
• Latency due to event queuing in the sub-farm
• Latency due to the L1 decision sorter
• Performance of event merging in the NP
• Performance of the L1 decision sorter (if done in an NP)

The work presented here was done by BJ, JPD, AB and NN

Page 3: Detailed Results from measurements and simulation


Context Switching Latency

• What is it?
– On a multi-tasking OS, whenever the OS switches from one process to another, it needs a certain time to do so
• Why do we worry?
– Because we run the L1 and the HLT algorithms concurrently on each CPU node
• Why do we want this concurrency?
– We want to minimise the idle time of the CPUs
– We cannot use double-buffering in the L1 (the latency budget would be halved!)

Page 4: Detailed Results from measurements and simulation


Priority and Latency

• Using Linux 2.5.55 we have established two facts about the scheduler:
– Realtime priorities work: the L1 task will never be interrupted until it finishes
– The context-switch latency is low: 10.1 ± 0.2 µs
• These measurements were done on a high-end server (2.4 GHz PIV Xeon, 400 MHz FSB); we should have machines at least 2x faster in 2007
• Conclusion: the scheme of running both tasks concurrently is sound (a sketch of the priority setup follows)
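
To make the priority mechanism concrete, here is a minimal sketch in C of putting a task on the Linux realtime scheduling policy. It is an illustration under assumptions (the priority value and the task body are placeholders, not the production L1 code); it uses only the standard sched_setscheduler(2) interface.

/* Minimal sketch (assumption, not the actual L1 code): put the calling
 * process on the realtime SCHED_FIFO policy, so an ordinary-priority
 * task (here: the HLT) can never preempt it. Requires root privileges. */
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    struct sched_param sp;
    sp.sched_priority = 50;   /* illustrative; any realtime value 1..99
                                 outranks the default SCHED_OTHER tasks */

    /* pid 0 = this process; SCHED_FIFO = run until it blocks or exits */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        return EXIT_FAILURE;
    }

    /* ... run the L1 algorithm to completion here ... */
    return EXIT_SUCCESS;
}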

Page 5: Detailed Results from measurements and simulation


Bit Error Rate (BER)

• Gigabit Ethernet is specified to work over UTP Cat-5e cables (1000BaseT)
• The BER is specified to be < 10^-11, i.e. about one bad packet per 100 s at full rate (see the worked example below). Real equipment is much better.
• Re-transmission (as in TCP/IP) does not cure the problem: 1% BER → 80% loss in effective bandwidth
• The BER depends not only on the cable, but in particular also on the end-points (MAC/PHY)
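
The quoted rate follows from the numbers on this slide; a few lines of C reproduce the arithmetic (the program itself is only a worked example, not part of the system):

/* Worked example: convert the specified BER into an expected
 * bad-packet interval at full Gigabit Ethernet line rate.
 * (At such low error rates, one bit error spoils one packet.) */
#include <stdio.h>

int main(void)
{
    const double ber       = 1e-11;   /* specified bit error rate   */
    const double line_rate = 1e9;     /* bits per second, 1000BaseT */

    double errors_per_second = ber * line_rate;        /* = 1e-2 */
    printf("expected: one bad packet per %.0f s\n",
           1.0 / errors_per_second);                   /* 100 s  */
    return 0;
}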

Page 6: Detailed Results from measurements and simulation


Is BER a problem?

• LHCb is based on 1000BaseT for cost reasons: fibre NICs and switch ports are still 3x more expensive than copper
• The Marvell PHY on the GigeFE card works up to 160 m
• Preliminary tests show

Page 7: Detailed Results from measurements and simulation


[Diagram: Combined L1&DAQ readout architecture. Front-end Electronics (FE) feed a Multiplexing Layer of 28 edge switches (Fast/Gb Ethernet event builder), then NPs, then the Readout Network (Gb Ethernet), then SFCs, then the farm CPUs (~1200 CPUs); TRM, TFC System and the L1-Decision Sorter are attached to the Readout Network.
Level-1 traffic: 1.1 MHz, 9.5-17.5 GB/s on 133-235 links. HLT traffic: 40 kHz, 2 GB/s on 333 links. Mixed traffic downstream.
Link and unit counts in the diagram: 19 NPs and 63-111 NPs (38-66 NPs for mixed traffic); 64-126 links at 7-13 GB/s; 19 links at 1.2 GB/s; 75-131 links at 8.2-14.2 GB/s; 57-99 links at 6.2-11 GB/s; 57-99 SFCs.
Annotated latency: queuing latency (in the sub-farm).]

Page 8: Detailed Results from measurements and simulation


“Local” latencies

• Latencies which arise as a feature of an isolated component of the system: an event/fragment takes a certain time to pass through the component, independent of the other fragments in the system
• Examples: forwarding latency in the switch, event-building latency in the NP
• They will be covered by a global budget of a few ms
• They will be measured as soon as the final software and candidate hardware are available

Page 9: Detailed Results from measurements and simulation


Global latencies

• Latencies which arise from the architecture of the system itself, where an event has to wait because of other events
• When an event arriving in the sub-farm finds all nodes busy, it is "punished" with extra latency
• When a decision arrives in the L1 decision sorter, it must wait for all previous decisions (except those in time-out) to arrive before it can go out (see the sketch below)
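
The sorter's in-order rule can be sketched in a few lines of C. This is an illustrative reorder buffer, not the actual sorter implementation: the window size, types and timeout handling are assumptions. It only shows why a decision's latency is set by the slowest earlier decision.

/* Illustrative sketch (not the actual sorter): a reorder buffer that
 * releases L1 decisions strictly in event-number order. Decisions may
 * arrive in any order; a decision leaves only once all earlier ones
 * have arrived. Timed-out events would be entered with a "timeout"
 * decision so they do not block the queue. Assumes event numbers stay
 * within WINDOW of the release point. */
#include <stdbool.h>
#include <stdio.h>

#define WINDOW 1024u                 /* assumed reorder-window size */

static bool     present[WINDOW];     /* slot holds a decision?      */
static int      decision[WINDOW];    /* the buffered decision       */
static unsigned next_out;            /* next event number to emit   */

void sorter_accept(unsigned evt, int dec)
{
    present[evt % WINDOW]  = true;
    decision[evt % WINDOW] = dec;

    /* Drain the head while it is contiguous: this wait for earlier
     * decisions is exactly the "global" latency described above.   */
    while (present[next_out % WINDOW]) {
        present[next_out % WINDOW] = false;
        printf("event %u -> decision %d\n", next_out,
               decision[next_out % WINDOW]);
        next_out++;
    }
}

int main(void)
{
    sorter_accept(1, +1);            /* arrives early: buffered     */
    sorter_accept(0, -1);            /* releases event 0, then 1    */
    return 0;
}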

Page 10: Detailed Results from measurements and simulation


Latency due to decision sorting

[Plot: additional time an event needs to spend in the RS before it is dispatched [ns], versus the processing time assumed for the L1 trigger ~ 1/x [ns]]

