Date post: | 17-Dec-2015 |
Upload: | jack-johns |
Page 5
What is “bring-up”?
Process of making a new piece of hardware boot $YOUR_OS properly
In our case $YOUR_OS == Linux
Page 12
Tada!
Bootdata ok (command line is root=UUID=260F-12F2 ro google)
Linux version 2.6.18.5-gg26 ([email protected]) (gcc version 4.1.1) #1
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009c800 (usable)
BIOS-e820: 000000000009c800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000007fbe1800 (usable)
BIOS-e820: 000000007fbe1800 - 000000007fbe8c00 (ACPI data)
BIOS-e820: 000000007fbe8c00 - 000000007fbe9000 (ACPI NVS)
BIOS-e820: 000000007fbe9000 - 00000000a0000000 (reserved)
BIOS-e820: 00000000f8000000 - 00000000fa000000 (reserved)
BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000480000000 (usable)
ACPI: PM-Timer IO Port: 0x4008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 15:1 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: IOAPIC (id[0x0e] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 14, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
Page 16
That sounds easy!
There are a few catches
  – Nothing ever works the first time
  – Something will never work at all
  – At least 1 major issue is guaranteed to show up as soon as your customer is looking
Page 18
Drummond's Law
“The universe hates you.”
Page 20
Bring-up eats time
“Is it a hardware problem or a software problem?”
“Or both?”
Page 22
Drummond's Law in action
It only happens on the customer's workload
It only happens on < 1% of test runs
It goes away when instrumented
Page 24
What can we do?
Gather as much data as we can
Streamline testing and debugging
Learn about the hardware
Build free tools!
Page 30
Problem: Console output
Many servers do not have VGA
Most servers do have serial ports
Linux supports serial console
Many BIOSes do not support serial console
Many BIOSes that support serial console consider it a “premium” feature
Page 35
SGABIOS
Tiny real-mode option ROM
Hooks BIOS console interrupts
  – Int 10h, Int 16h
Provides full console redirection
Transparent to most BIOSes and legacy apps
Page 38
Why is it better?
Free, in both senses of the word
Optimized for slow serial links
It works!
Page 39
Features
Passes control to VGA, if present
Provides hooks for smarter BIOS or EFI logging
Supports up to 255x255 consoles
Recent-write caching
Page 40
How can I use it?
Talk to your platform or BIOS vendor
Look for BIOS-editing tools for $YOUR_BIOS
Load it as an option-ROM on a card
http://sgabios.googlecode.com
Page 46
Hypothetical
Your system boots, but the BIOS did not enable FeatureX
• You really, really want FeatureX
CPUCorp tells you that FeatureX is enabled by bit 20 of MSR 0x12345678
How do you do it?
Page 47
You could...
Get a new BIOS from Mobo Inc.
  – Slow turnaround time
  – High risk
  – Don't you want to test it first?
Page 48
You could...
Tweak and rebuild a custom kernel
  – Where does a hack like this go?
  – Pretty heavy-handed
Page 49
You could...
Write a C/Perl/Python program
  – A much better answer
  – Doesn't scale well for frequent use cases
Page 50
Enter iotools
A simple suite of tools to provide access to hardware registers
Page 51
Enable FeatureX
$ rdmsr 0 0x12345678
0x0000045e0302126a
$ wrmsr 0 0x12345678 0x45e0312126a
$ rdmsr 0 0x12345678
0x0000045e0312126a
Page 52
Enable FeatureX
$ X=$(rdmsr 0 0x12345678)
$ X=$(($X | 1<<20))
$ wrmsr 0 0x12345678 $X
Page 53
Enable FeatureX
$ wrmsr 0 0x12345678 \
    $(($(rdmsr 0 0x12345678) | 1<<20))
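The arithmetic in that read-modify-write can be checked without touching any hardware. A minimal sketch, assuming a POSIX shell with 64-bit arithmetic; `set_bit` is a hypothetical helper (not part of iotools) that only computes the value you would hand to wrmsr:

```shell
# set_bit VALUE BIT: print VALUE with bit BIT set, as 16 hex digits.
# Hypothetical helper for illustration; it performs no hardware access.
set_bit() {
    printf '0x%016x\n' $(( $1 | (1 << $2) ))
}

# The value read back from the MSR on the earlier slide.
set_bit 0x0000045e0302126a 20
```

This prints 0x0000045e0312126a, the same value the slide writes by hand. Doing the OR in a helper keeps the other 63 bits exactly as the BIOS left them.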
Page 54
“But that's a rare case!”
You must not work in bring-up! We need to do stuff like this all the time.
Page 55
“But isn't that risky?”
Why yes, it is. Welcome to the wonderful world of bring-up.
Page 56
Device support
PCI
IO (ports)
MMIO (memory)
MSR
TSC
CPUID
SMBus
CMOS
Page 57
Other support
AND
OR
XOR
NOT
SHL
SHR
BTR
BTS
Page 58
Why iotools is great
Very small
  – ~18 KBytes (dynamic, ~400 KBytes static)
  – Busybox-style install
Very simple
  – All the tools work alike
Very easy to extend or build on
  – New commands can be written in minutes
Page 59
A real example
BIOS did not enable PCI error reporting
function set_serr {
    OLD=$(pci_read16 $1 $2 $3 0x4)
    NEW=$(or $OLD 0x100)    # SERR is bit 8 (0x100) of register 0x4
    pci_write16 $1 $2 $3 0x4 $NEW    # write back the same 16-bit width that was read
}
for B in $(seq 0 255); do            # for each bus
    for D in $(seq 0 31); do         # for each device
        for F in $(seq 0 7); do      # for each function
            pci_read32 $B $D $F 0 >/dev/null 2>&1
            if [ $? = 0 ]; then      # the read succeeded, so a function is present
                set_serr $B $D $F
            fi
        done
    done
done
http://iotools.googlecode.com
Page 63
Hypothetical
You just updated the BIOS and now half of your memory doesn't show up
What changed?
Actually, this one isn't hypothetical
Page 66
Where do you start?
BIOS release notes?
  – “Broke the memory controller” is not the sort of thing that normally gets documented
Kernel output?
  – The kernel only knows what it is told by the BIOS
BIOS output?
  – If they knew it was broken, they wouldn't have shipped it (I hope)
Page 68
Go straight to the hardware
Compare settings between the two BIOSes
What settings?
  – PCI registers are a good place to start
Page 69
Boot the old BIOS
cd /proc/bus/pci/
find ?? -type f | while read DEV; do
    mkdir -p $DIR/old/$(dirname $DEV)
    hexdump -v $DEV > $DIR/old/$DEV
done
Page 70
Boot the new BIOS
cd /proc/bus/pci/
find ?? -type f | while read DEV; do
    mkdir -p $DIR/new/$(dirname $DEV)
    hexdump -v $DEV > $DIR/new/$DEV
done
Page 71
Compare
diff -ruN old/00.0 new/00.0
--- old/00.0 2008-06-03 13:43:16.00000000 -0700
+++ new/00.0 2008-06-03 13:55:07.00000000 -0700
@@ -5,8 +5,8 @@
 0000040 3bff 0004 0040 4a50 0000 0000 0000 0000
 0000050 fff8 7fff 0002 0000 100d 0016 7d41 59e4
 0000060 0001 0000 0000 0000 00c0 0000 0000 0000
-0000070 0113 5100 8011 5000 3800 0800 220a 0000
-0000080 0000 2307 2113 6513 0000 0000 0000 0020
+0000070 0111 5102 8011 5000 3800 0800 222a 0001
+0000080 0000 2307 2113 6113 0000 0000 0000 0000
 0000090 0009 0000 0040 0000 4000 0276 0000 0000
 00000a0 0000 0000 0000 0000 0000 0000 0000 0000
 00000b0 0000 0000 0000 0000 0000 0000 a000 0371
diff -ruN old/00.1 new/00.1
--- old/00.1 2008-06-03 13:43:16.00000000 -0700
+++ new/00.1 2008-06-03 13:55:07.00000000 -0700
@@ -4,8 +4,8 @@
 0000030 0000 0000 0000 0100 0dd0 0000 a000 0000
 0000040 46e0 0e0e fff0 f4f1 e000 d000 eff0 dff0
 0000050 873b 7dbb 0102 0100 0000 d065 0706 0504
 0000060 010f 0fff 0e00 0000 0020 0050 7770 0000
-0000070 f011 e500 d018 c500 3b00 0aa0 9909 0000
+0000070 f111 e500 d018 c000 3b00 0aa0 1909 0001
 0000090 0009 0000 0040 0000 4000 0276 0000 0000
 00000a0 0000 0000 0000 0000 0000 0000 0000 0000
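A diff of hexdump text shows whole 16-byte rows, so you still have to hunt for the bytes that changed. One possible shortcut, sketched under the assumption that you kept raw copies of the config-space files (for example with cp) rather than hexdump text: `diff_regs` is a hypothetical helper built on cmp(1) that lists each differing byte by register offset.

```shell
# diff_regs OLD NEW: list differing bytes of two raw config-space dumps
# as "offset: old -> new" in hex. Hypothetical helper built on cmp(1).
diff_regs() {
    # cmp -l prints a 1-based decimal offset and both byte values in octal.
    cmp -l "$1" "$2" | while read -r off old new; do
        # A leading 0 makes the shell parse the byte values as octal.
        printf '0x%02x: %02x -> %02x\n' $((off - 1)) $((0$old)) $((0$new))
    done
}

# Demo with two made-up 4-byte "dumps" (octal escapes keep this POSIX).
tmp=$(mktemp -d)
printf '\023\001\000\121' > "$tmp/old"
printf '\021\001\002\121' > "$tmp/new"
diff_regs "$tmp/old" "$tmp/new"
rm -r "$tmp"
```

The offsets are byte addresses, so they can be looked up in the datasheet directly. Note that hexdump's default output groups bytes into 16-bit words, which is why the byte order in the slide's dump looks swapped compared with a per-byte listing.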
Page 72
Wow, that looks like fun!
Look up each difference in the datasheet
If you're lucky, you find something
Hopefully you'll never have to do this again
You will have to do this again, I promise
Page 76
Write a program
No real APIs to build on
No standard output format makes parsing hard
Adding new devices takes a lot of code and time
Page 77
Realizations
There are a LOT of devices out there and they are all very different
We need an API first – make it easy to add new devices
Page 79
So I tried...
... and failed
It still takes too much code for each register
Page 83
Emerging concepts
Registers are fixed-width
Registers do not map 1:1 with hardware settings
We want to look at “fields” more than registers
While we're on this, can we write to fields, too?
Page 85
What we really want is...
A “language” to describe devices, registers, and fields
Let's write a language!
Page 88
More lessons learned
Hardware designers are evil
There are some pretty bizarre devices out there
We need to explore more devices before we can really think about a language
Page 89
New goal
Produce something useful in the short term that allows us to explore the problem space.
Page 94
Start with the back-end
Focus on evolution, not revolution
Figure out the evil devices
Find a language that works
Learn C++ better
  – Yes, C++
Page 95
Some nouns
pp_value
  – An arbitrarily large integer
  – Used throughout PP
  – Based on the GNU MP library
pp_datatype
  – A way of evaluating a pp_value
  – Produces human-friendly output
  – Simple numbers, bitmasks, enums, bools
pp_binding
  – A link to a particular piece of hardware
Page 96
Some nouns
pp_register
  – A register
  – Fixed width (8, 16, 32, 64, 128 bits)
  – Can read() and write()
pp_field
  – Arbitrary width
  – Has a pp_datatype
pp_scope
  – A container for registers, fields, and scopes
  – Can have a pp_binding
Page 97
The PP tree
Very much like a filesystem
pp_scopes == directories
pp_registers, pp_fields == files
Generically: pp_dirents
Dirents can be addressed by pp_paths
  – e.g. "/foo/bar/thing"
Page 98
Where does it come from?
We do not have a “real” language (yet)
We do have a backend API
Simplify and wrap that API
Result: a “fake language” on top of C++
Page 99
Scopes
OPEN_SCOPE(name)
OPEN_SCOPE(name, binding)
CLOSE_SCOPE()
Scope names are C tokens, plus '.'
Bindings come from BIND()
Example:
    OPEN_SCOPE("foo")
Page 100
Bindings
BIND(driver_name, ARGS(args))
Driver names are C tokens
Args are 1 or more pp_values, sent to the driver
Bindings are in effect until the scope is closed
Example:
    OPEN_SCOPE("foo", BIND("cpu", ARGS(0)))
Page 101
Registers
REG8(name, address)
REG16(name, address)
. . .
Register names start with '%'
Address is a pp_value
Register is bound through the current binding
Example:
    REG8("%foo", 16)
Page 102
Datatypes
INT(name, units?)
HEX(name, width?, units?)
BITMASK(name, KV(pair), KV(pair) . . .)
ENUM(name, KV(pair), KV(pair) . . .)
Types are scoped like C
Can be named or anonymous
Example:
    ENUM("enum_t", KV("k1", 1), KV("k2", 2))
Page 103
Global types
int_t
hex_t, hex4_t, hex8_t, hex12_t, hex16_t, hex20_t, hex32_t, hex64_t, hex128_t
addr16_t, addr32_t, addr64_t
yesno_t, truefalse_t, onoff_t, enabledisable_t
bitmask_t
Page 104
Simple fields
FIELD(name, datatype, regbits)
Field names are C tokens, plus '.'
Datatype can be a type name (string) or an anonymous type
Regbits come from BITS()
Example:
    FIELD("foo", "int_t", BITS("%reg1",15,7))
Page 105
Regbits
BITS(reg_name)
BITS(reg_name, bit)
BITS(reg_name, hi_bit, lo_bit)
Regbits can be all of a register or just some bits
Regbits can be joined using '+'
Joins always happen at the LSB
Example:
    BITS("%reg1",5,3) + BITS("%reg2",6,2)
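What BITS() and '+' compute can be tried out with plain shell arithmetic. This is only a sketch of the idea, not PP code: `bits` and `join_bits` are hypothetical helpers, the register contents are made up, and it assumes "joins happen at the LSB" means each later operand is appended below the earlier one (which matches how %hi precedes %lo in the BAR example). It does not handle a full 64-bit-wide slice.

```shell
# bits VALUE HI LO: print bits HI..LO of VALUE as a decimal number.
bits() {
    echo $(( ($1 >> $3) & ((1 << ($2 - $3 + 1)) - 1) ))
}

# join_bits UPPER LOWER LOWER_WIDTH: append LOWER below UPPER, so each
# joined piece lands at the least significant end of the result.
join_bits() {
    echo $(( ($1 << $3) | $2 ))
}

reg1=0x28   # made-up register contents: bits 5..3 are 101
reg2=0x7c   # made-up register contents: bits 6..2 are 11111
hi=$(bits "$reg1" 5 3)
lo=$(bits "$reg2" 6 2)
printf '0x%x\n' "$(join_bits "$hi" "$lo" 5)"
```

With these values the three bits from reg1 sit above the five bits from reg2, giving 0xbf.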
Page 106
Flow control
The “fake language” is C++, so standard flow control works
Use pp_value for register and field values
Can't switch on pp_value
Use functions where appropriate
Page 107
Evaluating fields
FIELD_EQ(path, value)
FIELD_NE(path, value)
FIELD_GT(path, value)
FIELD_BOOL(path)
. . .
Values can be pp_values or strings
  – e.g. enum keys
Example:
    if (FIELD_EQ("foo", "closed")) {
Page 108
Testing a dirent
DEFINED(path)
Some dirents are defined conditionally
Example:
    if (DEFINED("../foo/bar"))
Page 109
Other wonderful things
Procedure registers and fields
Arrays of dirents
Bookmarks
Magic registers
Constant fields
Regfields
XFORM datatypes
More!
Page 110
Real code
REG16("%command", 0x04);
OPEN_SCOPE("command");
    FIELD("io", "yesno_t", BITS("../%command", 0));
    FIELD("mem", "yesno_t", BITS("../%command", 1));
    FIELD("bm", "yesno_t", BITS("../%command", 2));
    FIELD("special", "yesno_t", BITS("../%command", 3));
    FIELD("mwinv", "yesno_t", BITS("../%command", 4));
    FIELD("vgasnoop", "yesno_t", BITS("../%command", 5));
    FIELD("perr", "yesno_t", BITS("../%command", 6));
    FIELD("step", "yesno_t", BITS("../%command", 7));
    FIELD("serr", "yesno_t", BITS("../%command", 8));
    FIELD("fbb", "yesno_t", BITS("../%command", 9));
    FIELD("intr", "yesno_t", BITS("../%command", 10));
CLOSE_SCOPE();
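To get a feel for the output such a description is meant to produce, here is a rough shell equivalent for a single 16-bit value. It is a sketch only: `decode_command` is a hypothetical helper, not part of PP or iotools, and the field names and bit positions are simply copied from the slide.

```shell
# decode_command VALUE: print each PCI command-register flag as yes/no,
# using the bit positions from the slide. Hypothetical helper, not PP.
decode_command() {
    bit=0
    for name in io mem bm special mwinv vgasnoop perr step serr fbb intr; do
        if [ $(( ($1 >> bit) & 1 )) -eq 1 ]; then
            echo "$name: yes"
        else
            echo "$name: no"
        fi
        bit=$((bit + 1))
    done
}

decode_command 0x0147   # made-up value: io, mem, bm, perr, serr set
```

The shell version hard-codes one register; the point of the PP description is that the same field list can be bound to any device through the enclosing scope.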
Page 111
More real code
static void
BAR(const string &name, const pp_value &address)
{
    OPEN_SCOPE(name);
    REG32("%lo", address);
    FIELD("type", ANON_ENUM(KV("mem", 0), KV("io", 1)), BITS("%lo", 0));
    if (FIELD_EQ("type", "mem")) {
        // the enum keys must match the names tested with FIELD_EQ below
        FIELD("width", ANON_ENUM(KV("bits32", 0), KV("bits20", 1), KV("bits64", 2)), BITS("%lo", 2, 1));
        FIELD("prefetch", "yesno_t", BITS("%lo", 3));
    }
    if (FIELD_EQ("type", "io")) {
        FIELD("address", "addr16_t", BITS("%lo", 15, 2) + BITS("%0", 1, 0));
    } else if (FIELD_EQ("width", "bits32") || FIELD_EQ("width", "bits20")) {
        FIELD("address", "addr32_t", BITS("%lo", 31, 4) + BITS("%0", 3, 0));
    } else {
        REG32("%hi", address+4);
        FIELD("address", "addr64_t", BITS("%hi", 31, 0) + BITS("%lo", 31, 4) + BITS("%0", 3, 0));
    }
    CLOSE_SCOPE();
}
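The address math that BAR() describes can be sanity-checked with shell arithmetic. A minimal sketch, assuming a shell with 64-bit arithmetic: `decode_bar` is a hypothetical helper, the register values are made up, and the 20-bit case is folded into the 32-bit branch for brevity.

```shell
# decode_bar LO [HI]: print the type and base address encoded in a BAR.
# Hypothetical helper; LO and HI are the raw 32-bit register values.
decode_bar() {
    lo=$(( $1 )); hi=$(( ${2:-0} ))
    if [ $(( lo & 1 )) -eq 1 ]; then
        # I/O BAR: address bits 15..2, low two bits read as zero.
        printf 'io 0x%04x\n' $(( lo & 0xfffc ))
    elif [ $(( (lo >> 1) & 3 )) -eq 2 ]; then
        # 64-bit memory BAR: the next register holds the upper 32 bits.
        printf 'mem64 0x%016x\n' $(( (hi << 32) | (lo & 0xfffffff0) ))
    else
        # 32-bit (or below-1MB) memory BAR: address bits 31..4.
        printf 'mem32 0x%08x\n' $(( lo & 0xfffffff0 ))
    fi
}

decode_bar 0x0000e001              # made-up I/O BAR
decode_bar 0xfe000000              # made-up 32-bit memory BAR
decode_bar 0xd000000c 0x00000001   # made-up 64-bit prefetchable BAR
```

Masking the low bits here plays the role of joining with the all-zero "%0" bits in the field description: the flag bits are replaced by zeros so the result is a properly aligned address.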
Page 112
Drivers
pci(segment, bus, device, function)
io(base, size)
mem(base, size)
msr(cpu)
cpuid(cpu)
Page 113
Discovery
pci(vendor_id, device_id)
cpuid(vendor, min_family, max_family, min_model, max_model, min_stepping, max_stepping)
Page 114
Current state
Under active development
Focusing on device support
  – PCI
  – CPUID
  – K8
  – MSRs
Exploring more devices
Page 115
What's next?
Refining the model
  – Can it be simpler?
  – Does it accommodate “every” device?
A “real” language
  – Can we piggyback on an existing language?
Bit-level read/write flags
More more more devices!
http://prettyprint.googlecode.com