A Model-Driven Automatically-Retargetable Debug Tool for Embedded Systems


Max R. de O. Schultz, Alexandre K.I. Mendonça, Felipe G. Carvalho, Olinto J.V. Furtado, and Luiz C.V. Santos

Federal University of Santa Catarina, Computer Science Department, Florianópolis, SC, Brazil

{max, mendonca, fgcarval, olinto, santos}@inf.ufsc.br

Abstract. Contemporary SoC designs ask for system-level debugging tools suitable to heterogeneous platforms. Such tools will have to rely on some low-level model-driven debugging engine that must be retargetable, since embedded code may run on distinct processors within the same platform. This paper describes a technique for automatically retargeting debugging tools for embedded code inspection. The technique relies on two key ideas: automatic extraction of machine-dependent information from a formal model of the processor and reuse of a conventional binary utility package as implementation infrastructure. The retargetability of the technique was experimentally validated for targets MIPS, SPARC, PowerPC and i8051.

1 Introduction

Modern embedded systems are often implemented as systems-on-chip (SoCs) whose optimization requires design space exploration. Alternative CPUs may be explored so as to minimize code size and power consumption, while ensuring enough performance to fulfill real-time constraints. Therefore, design space exploration requires the generation, inspection and evaluation of embedded code for distinct target processors. Besides, contemporary SoC designs ask for system-level debugging tools suitable to heterogeneous platforms. Such tools will have to rely on some low-level model-driven debugging engine that must be retargetable, since embedded code may run on distinct processors within the same platform.

As manual retargeting is unacceptable under time-to-market pressure, automatically retargetable tools are mandatory. Retargetable tools [1] automatically extract machine-dependent information from a processor model, usually written in some architecture description language (ADL).

To prevent the tools from being tied to a given ADL, an abstract processor model could be envisaged. To be practical, such a model should be synthesizable from a description written in some ADL. Figure 1 describes a typical model-driven tool chain. It summarizes distinct classes of information flow (tool generation, code generation, code inspection and code evaluation). Exploration consists of four major steps, as follows.

First, given a target processor model, code generation tools (compiler backend, assembler and link editor), code inspection tools (disassembler and debugger) and an instruction-set simulator are automatically generated.

S. Vassiliadis et al. (Eds.): SAMOS 2007, LNCS 4599, pp. 13–23, 2007. © Springer-Verlag Berlin Heidelberg 2007


Fig. 1. Model-driven tool flows

Then, the application source code can be compiled, assembled and linked, resulting in executable code.

In a third step, the executable code can be run on the instruction-set simulator and its functionality can be observed with the help of disassembling and debugging tools. These tools allow the code to be executed incrementally (step) or to be stopped at certain code locations (breakpoints) so as to monitor program values (watchpoints).

Finally, as soon as proper functionality is guaranteed by removing existing bugs, continuous execution on the simulator allows the evaluation of code quality with respect to design requirements. If some requirement isn’t met, an alternative instruction-set architecture (ISA) may be envisaged to induce a new solution. If the current processor is an application-specific instruction-set processor (ASIP), its ISA may deserve further customization. Otherwise, a new candidate processor may be selected.

This paper focuses on a technique for generating debugging tools from an arbitrary processor model. The technique relies on two key ideas. First, ISA-dependent information is automatically extracted from the model of the target processor. Second, the well-known GNU Binutils [2] and GNU debugger [3] packages are employed as implementation infrastructure: ISA-independent libraries are reused, while target-specific libraries are automatically generated.


The remainder of this paper is organized as follows. Section 2 briefly reviews related work. Section 3 formalizes the processor model that drives tool retargeting. Section 4 discusses implementation aspects. Experimental results are provided in Section 5. In Section 6, we draw our conclusions and comment on future work.

2 Related Work

2.1 Manually Retargetable Tools

Manually retargetable binary utilities are available within the popular GNU Binutils package [2]: assembler (gas), linker (ld), debugger (gdb) [3] and disassembler (objdump). Essentially, the Binutils package consists of an invariant ISA-independent core library and a few ISA-dependent libraries that must be rewritten for each new target CPU. Among the ISA-dependent libraries, there are two main libraries, namely Opcodes and BFD, which require retargeting.

The Opcodes library describes the ISA of a CPU (instruction encoding, register encoding, assembly syntax). Unfortunately, there is no standard for ISA description within this library.

The BFD library provides a format-independent (ELF, COFF, A.OUT, etc.) object file manipulation interface. It is split into two blocks: a front-end, which is the library’s abstract interface with the application, and a back-end, which implements that abstract interface for distinct object file formats.

2.2 Automatically Retargetable Tools

Many contemporary retargetable tools rely on automatic generation from a CPU model, written in some ADL, such as nML [4], ISDL [5], and LISA [6].

Although disassemblers and debuggers are available for most ADLs, it is unclear to what extent they are automatically generated or simply hand-retargeted. For instance, once a simulator is generated in the LISA tool chain, it can be linked to a debugging graphical user interface, but there is no indication of how the underlying mechanism actually works.

It has been acknowledged that novel assembly-level optimization approaches, like SALTO [7] and PROPAN [8], deserve further investigation [1]. Such techniques allow conventional compiler infrastructure to be reused by enabling post-compiling machine-dependent optimizations to further improve code quality.

Although such post-compiling optimizations are promising, they may inadvertently introduce flaws. Code inspection tools could lose track of breakpoints and watchpoints due to optimizations not connected to the source code (in the face of new locations and distinct register usage). Therefore, conventional debuggers are likely to overlook flaws introduced by post-compiling optimizations.

A technique for retargeting assemblers and linkers to the GNU package was presented in [9]. It relies on a formal notation to describe both the target ISA and its relocation information. Although the formalism is solid, experimental results are scarce. Besides, it is not possible to foresee whether the proposed framework is able to address retargetable debugging tools.


Two facts motivated the work described in this paper: first, the lack of information reporting how code inspection tools are made retargetable and to what extent this is performed automatically; second, the scanty experimental results providing evidence of proper retargetability.

Although we pragmatically reuse a conventional binary-utility package as implementation infrastructure (as in [9]), we rely on an ADL-independent processor model.

3 Processor Model

This section formalizes the ISA aspects of the processor model in the well-known BNF notation. To ease its interpretation, an example is also provided. Figure 2 specifies the formal structure for the information typically available in processor manuals, which relies on the notions of instruction, operand and modifiers.

A modifier is a function that transforms the value of a given operand. It is written in the C language and it has four pre-defined variables to specify the transformation: input is the original operand value, address represents the instruction location, parm is a parameter that may contain an auxiliary value (such as required for evaluating the target address for PC-relative branches), and output returns the transformed operand value.

An operand type oper-type specifies the nature of an instruction field and it is tied to a binary value encoded within a given field. Examples of operand types are imm for immediate values, addr for symbolic addresses and exp for expressions involving immediate values and symbols.

Figure 3 shows an illustrative example of the processor model, according to the specified syntax. Lines 1 to 5 describe the mapping for the operand reg, where the symbols $0, $1, ..., $90 are mapped to the values 0, 1, ..., 90. Note that many-to-one mappings are allowed. For instance, the symbols $sp, $fp, $ra and $pc are mapped to values already mapped in line 1. Line 7 defines the modifier R, a function to be applied for PC-relative transformations. The modifier’s result (output) is evaluated by adding the current location (address) to the operand value (input) and to an offset (parm). Lines 9 to 13 define the instruction beq. Line 10 defines its instruction format as a list of fields and their associated bit sizes. Line 11 defines its assembly syntax: reg, reg and exp are tied to instruction fields rs, rt and imm (beq is the instruction mnemonic). The modifier R (whose offset is 2) is applied to the exp operand tied to field imm, thereby specifying that the resulting value is PC-relative and shifted 2 bits to the left. Finally, in line 12, the constant value 0x04 is assigned to the instruction’s op field.
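To make the example more concrete, the following C fragment is a minimal sketch of how the beq description could be turned into bits. It is not taken from the generated code: modifier_R simply mirrors the formula of modifier R in Figure 3, while struct field and pack_fields are hypothetical helpers that assume fields are packed from the most significant bit downwards into a 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Modifier R as written in Figure 3: output = input + address + parm. */
static uint32_t modifier_R(uint32_t input, uint32_t address, uint32_t parm)
{
    return input + address + parm;
}

/* One entry of a format list such as "op:6, rs:5, rt:5, imm:16"
   (hypothetical type, introduced only for this sketch). */
struct field {
    unsigned width;  /* field size in bits */
    uint32_t value;  /* value to be stored in the field */
};

/* Pack the fields into a 32-bit instruction word, first field in the
   most significant bits. Assumes the widths add up to at most 32. */
static uint32_t pack_fields(const struct field *f, unsigned n)
{
    uint32_t word = 0;
    unsigned used = 0;

    for (unsigned i = 0; i < n; i++) {
        assert(f[i].width > 0 && f[i].width < 32);
        uint32_t mask = (1u << f[i].width) - 1u;
        used += f[i].width;
        assert(used <= 32);
        word |= (f[i].value & mask) << (32 - used);
    }
    return word;
}
```

Under these assumptions, packing op=0x04 with all other fields set to zero yields 0x10000000, which is the partial binary image reported for beq.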

From the processor model, a table of instructions is generated as a starting point forthe retargeting algorithms. Each table entry is a tuple defined as follows:

table-entry = (mnemonic, opinfo, image, mask, pseudo, format-id)

Let’s illustrate the meaning of its elements by means of an example. From the model in Figure 3, the following table entry would be generated for the instruction beq:

{"beq", "%reg:1:,%reg:2:,%exp:3:", 0x10000000, 0xFC000000, 0, Type_I}

The first element is the instruction’s mnemonic (beq). The second stores information like type (reg, reg, exp) and instruction field location (1, 2, 3). The third element stores the partial binary image of the instruction (0x10000000). The fourth element stores a mask (0xFC000000) to be used by the disassembling algorithm in order to identify the instruction. The fifth element specifies whether the entry refers to a pseudo-instruction or not (0 = not). Finally, the last element stores the instruction format identifier (Type_I, in this case).

<isa-def> ::= <list-operand> <list-modifier> <list-instruction>

<list-operand> ::= <operand-def> <list-operand> | <operand-def>

<operand-def> ::= operand oper-id { "mapping definition" }

<list-modifier> ::= <modifier-def> <list-modifier> | empty

<modifier-def> ::= modifier modifier-id { "modifier code" }

<list-instruction> ::= <instruction-def> <list-instruction> | <instruction-def>

<instruction-def> ::= instruction insn-id { <format-desc> ; (<syntax-desc>) : (<operand-decoding>) ; <opcode-decoding> }

<format-desc> ::= field-id : constant , <format-desc> | field-id : constant

<syntax-desc> ::= mnemonic-id <oper-type-list>

<oper-type-list> ::= <qualifier> <oper-type> , <oper-type-list> | <qualifier> <oper-type>

<oper-type> ::= oper-id | imm | addr <modifier> | exp <modifier>

<modifier> ::= << modifier-id ( constant ) | empty

<operand-decoding> ::= field-id , <operand-decoding> | field-id

<opcode-decoding> ::= field-id = constant , <opcode-decoding> | field-id = constant

<qualifier> ::= # | $ | empty

Fig. 2. Processor model specification

 1. operand reg { $[0..90] = [0..90];
 2.               $sp = 29;
 3.               $fp = 30;
 4.               $ra = 31;
 5.               $pc = 37; }
 6.
 7. modifier R { output = input + address + parm; }
 8.
 9. instruction beq {
10.   op:6, rs:5, rt:5, imm:16,
11.   (beq reg, reg, exp << R(2)) : (rs, rt, imm);
12.   op=0x04
13. }

Fig. 3. A segment of the MIPS model
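The role of the image and mask elements can be illustrated by the sketch below. It is only an assumption about how such a table might be searched, not the code of the generated library: struct table_entry is a hypothetical C rendering of the tuple, the numeric value given to Type_I is arbitrary, and find_entry is a made-up name. Pseudo-instructions are skipped, in line with the statement in Section 4.1 that disassembling does not use them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of the table-entry tuple. */
struct table_entry {
    const char *mnemonic;   /* e.g. "beq" */
    const char *opinfo;     /* operand types and field locations */
    uint32_t    image;      /* partial binary image */
    uint32_t    mask;       /* bits that identify the instruction */
    int         pseudo;     /* non-zero for pseudo-instructions */
    int         format_id;  /* instruction format identifier */
};

enum { Type_I = 1 };        /* value chosen arbitrarily for this sketch */

/* Return the first non-pseudo entry whose masked bits equal its image,
   or NULL if the word matches no entry. */
static const struct table_entry *
find_entry(const struct table_entry *table, size_t n, uint32_t word)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].pseudo)
            continue;
        if ((word & table[i].mask) == table[i].image)
            return &table[i];
    }
    return NULL;
}
```

With the beq entry shown above, any word whose six most significant bits equal 000100 (for instance 0x1022FFFF) is identified as beq, whatever the contents of the rs, rt and imm fields.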


4 Implementation

Our generation technique reuses the GNU Binutils and the GNU gdb packages as much as possible. The structure of the disassembling and debugging tools is depicted in Figure 4, where the generated machine-dependent libraries are marked with an asterisk.

Observe that both tools share the BFD and Opcodes libraries. Besides, note that each tool consists of a target-specific library and a machine-independent core library. Therefore, the key to automatic tool retargeting is to generate both shared libraries and both target-specific libraries automatically, as will be described in the next subsections. The ISA-dependent information is automatically extracted from the model of the target CPU.

Note that a retargeted tool is obtained by simply compiling the generated target-specific libraries together with the respective core library. Each generated library consists of a few files, whose organization is summarized in Figure 5, where [arch] represents a given ISA. The remainder of this section focuses on the main generated files.

Fig. 4. Tools structure

4.1 Generation of Library Opcodes

The file include/opcode/[arch].h declares three data structures supporting instruction decoding and encoding, the mapping between register names and actual encodings, and pseudo-instruction manipulation. (It should be noted that disassembling doesn’t make use of pseudo-instructions to avoid ambiguity.)

The corresponding opcodes/[arch]-opc.c file contains the above-mentioned data structures, which are fed with the information extracted from the processor model.
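As an illustration of the register mapping, the sketch below shows one possible shape for such a table and its two lookup directions. The type and function names (struct reg_map, reg_value, reg_name) are hypothetical and do not come from the generated files. The entries are a small excerpt in the spirit of Figure 3, and the first-match policy used when several names share an encoding is an assumption made here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical register-mapping entry: assembly name and encoding. */
struct reg_map {
    const char *name;
    unsigned    value;
};

/* Excerpt only: a few numbered registers plus the aliases of Figure 3. */
static const struct reg_map regs[] = {
    { "$29", 29 }, { "$30", 30 }, { "$31", 31 }, { "$37", 37 },
    { "$sp", 29 }, { "$fp", 30 }, { "$ra", 31 }, { "$pc", 37 },
};
static const size_t nregs = sizeof regs / sizeof regs[0];

/* Assembly direction: name to encoding, or -1 if the name is unknown. */
static int reg_value(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < nregs; i++)
        if (strcmp(regs[i].name, name) == 0)
            return (int)regs[i].value;
    return -1;
}

/* Disassembly direction: encoding to a name. Since the mapping is
   many-to-one, the first matching entry wins (assumed policy). */
static const char *reg_name(unsigned value)
{
    for (size_t i = 0; i < nregs; i++)
        if (regs[i].value == value)
            return regs[i].name;
    return NULL;
}
```

In this sketch both $sp and $29 assemble to 29, whereas disassembling the value 29 always prints $29; a generated tool could just as well prefer the alias.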

4.2 Generation of Library BFD

ISA attributes extracted from the processor model are encoded within this library. Since we have adopted the ELF format, only the ELF-related files are generated. Among them, the most important file is bfd/cpu-[arch].c, which contains information such as architecture name, word length and address length.


- binutils                // GNU Binutils
  - bfd                   // library BFD
    . cpu-[arch].c
    . elf32-[arch].c
  - opcodes               // library Opcodes
    . [arch]-opc.c
    . [arch]-dis.c
  - include               // general include files
    - elf
      . [arch].h
    - opcode
      . [arch].h
- gdb                     // GNU Debugger
  - bfd                   // library BFD
    . cpu-[arch].c
    . elf32-[arch].c
  - opcodes               // library Opcodes
    . [arch]-opc.c
    . [arch]-dis.c
  - include               // general include files
    - elf
      . [arch].h
    - opcode
      . [arch].h
  - gdb                   // debugger files
    . [arch]-tdep.c
    - config
      - [arch]
        . [arch].mt

Fig. 5. Generated file tree

4.3 Target-Specific Disassembler Library

The main file for the disassembling process is opcodes/[arch]-dis.c. It manipulates the data structures mentioned in Section 4.1 and invokes BFD interface methods to read object files.

4.4 Target-Specific Debugger Library

Within this library, the most important file is gdb/[arch]-tdep.c. It contains functions handling subroutine calls and giving access to general-purpose and specific registers (e.g. program counter, stack pointer and frame pointer), so as to allow breakpoint control and value watching.

5 Experimental Results

For the sake of tool validation, we have adopted the well-known Mibench [10] benchmark. In order to validate our tool generators, conventional manually-retargeted tools were used to set reference files and values. Then, we compared results produced by the generated tools with the reference values obtained from conventional tools.

Tool validation is achieved, not only by observing proper functionality of the generated tools, but also by observing the retargetability of the generating tool. To check for retargetability, the validation procedures described in the following subsections were repeated for four distinct targets: PowerPC, MIPS, SPARC and i8051.

5.1 Validation of Disassembling Tools

To validate generated disassembling tools, we employed the following key idea: given a reference object file, if it is disassembled and then re-assembled, the resulting file should match the reference file. Figure 6 shows the adopted validation flow. Rectangles represent tools and ellipses denote files. The procedure starts from a reference object file, which is fed to the generated disassembler (to be validated), giving rise to an output assembly file. Then, this file is submitted to an assembler, resulting in an output object file. In the end, the input file (reference) and output file (under validation) are compared to check whether they match.

Fig. 6. Validation flow for disassembling

It could be argued that such a validation procedure should compare assembly codes, instead of object codes. However, the direct comparison of assembly codes is hampered by the presence of pseudo-instructions or instructions admitting multiple assembly syntaxes. For instance, the MIPS instruction "jump at register" can be written in two different ways: "jr $1" or "j $1". That’s why reversed matching was used instead of direct matching, without loss of generality. We repeated the validation procedure for each target CPU and for every benchmark program. As a result, all the comparisons matched, therefore providing evidence of proper functionality.
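The last step of this flow, the comparison between the reference and the re-assembled object code, could be sketched as follows. This is a hypothetical helper, not part of the validation scripts used in the experiments: it assumes that the two byte sequences (for instance, the contents of the ".text" sections) have already been loaded into memory, and it reports where they first diverge so that the offending instruction can be located.

```c
#include <assert.h>
#include <stddef.h>

/* Compare the reference bytes with the bytes under validation.
   Returns -1 if both sequences are identical (same length, same
   contents), otherwise the offset of the first difference. */
static long first_mismatch(const unsigned char *ref, size_t ref_len,
                           const unsigned char *out, size_t out_len)
{
    assert(ref != NULL && out != NULL);

    size_t n = ref_len < out_len ? ref_len : out_len;
    for (size_t i = 0; i < n; i++)
        if (ref[i] != out[i])
            return (long)i;

    /* One sequence is a prefix of the other: they differ at offset n. */
    return ref_len == out_len ? -1 : (long)n;
}
```

A result of -1 corresponds to a successful match in the sense of Figure 6; any other value points at the first byte that the disassemble-then-assemble round trip failed to reproduce.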

5.2 Validation of Debugging Tools

To validate generated debugging tools, we defined a set of breakpoints and watchpoints for a given executable file and observed the resulting values and control for both conventional and generated debuggers. Figure 7 shows the adopted validation flow. Rectangles represent tools, while ellipses represent either a file or a set of observed values. The procedure starts from a given executable file, which is run on an instruction-set simulator of the target CPU. First, breakpoints and watchpoints are inserted in the code by a conventional debugger. As a result of running the instrumented code, watchpoint values are set as a reference. Then, a generated debugger is used to repeat the procedure for exactly the same breakpoints and watchpoints. In the end, the values under validation are compared to the reference values.

Fig. 7. Validation flow for debugging

We repeated the validation procedure for each target CPU and for every benchmark program. Since all comparisons matched, they indicate that the generated debuggers are equivalent to their manually retargeted counterparts.

5.3 Tool Efficiency

We provide some quantitative evidence of tool efficiency by showing the relation between program size and runtime for the disassembling tool. Since the debugging tool intensively invokes the disassembling engine, those results serve to assess the efficiency of both tools.

To check for proper retargetability of the generating tool, the procedure above was repeated for RISC (PowerPC, MIPS, SPARC) and CISC (i8051) targets, whose results are shown in Tables 1 and 2, respectively. The first two columns show the benchmark programs and respective number of files. The remaining columns show the sizes of ".text" sections and disassembling runtimes for each distinct target processor. On average, our disassembling runtimes are 1.15 times slower than the GNU native disassembling tool (objdump). Although this could be seen as the price to pay for the benefit of achieving automatic retargetability, we already detected opportunities to optimize the prototype tool so as to reduce our runtimes.

Table 1. Results for RISC targets

                    Size [Kb]                 Runtime [s] (our | objdump)
Program   Files   MIPS    SPARC   PowerPC   MIPS            SPARC           PowerPC
typeset   1       29.7    32.6    25.3      0.049 | 0.035   0.071 | 0.031   0.050 | 0.039
bitcount  9       4.9     4.1     4.1       0.010 | 0.006   0.010 | 0.009   0.009 | 0.008
susan     1       64.7    59.4    52.7      0.099 | 0.074   0.139 | 0.057   0.104 | 0.095
jpeg      60      284.8   239.8   228.2     0.442 | 0.317   0.537 | 0.219   0.437 | 0.406
fft       3       5.9     5.5     5.3       0.010 | 0.007   0.014 | 0.010   0.012 | 0.012

Table 2. Results for CISC target (i8051)

Program   Files   Size [b]   Runtime [s]
int2bin   1       188        0.002
cast      1       213        0.002
sort      1       425        0.003
xram      1       214        0.003

6 Conclusions

The relevance of the proposed technique lies in the tracks opened by promising assembly-level post-compiling optimizations and by the need for contemporary system-level debugging tools on heterogeneous platforms. The proposed technique fits in a pragmatic approach for automatic tool retargeting. Its underlying mechanism was clearly described, as opposed to related work.

Experimental validation gives evidence of proper functionality and actual retargetability for all tested cases. In particular, our technique was able to generate a disassembling tool for a processor with no pre-existing GNU port (the i8051).

We first intend to improve the code of the prototype tool so as to reduce runtimes and then perform experiments with new targets like Motorola ColdFire and Altera Nios2. As future work, we intend to elaborate an API to the retargeting engine so as to enable tool generation from an arbitrary ADL. Also, we want to address mechanisms to tie the retargetable debugger to a system-level debugging tool.

References

1. Leupers, R., Marwedel, P.: Retargetable Compiler Technology for Embedded Systems - Tools and Applications. Kluwer Academic Publishers, Dordrecht (2001)

2. Pesch, R.H., Osier, J.M.: The GNU binary utilities. Free Software Foundation, Inc. (1993)

3. GNU: The GNU Project Debugger, http://www.gnu.org/software/gdb

4. Hartoog, M.R., Rowson, J.A., Reddy, P.D., Desai, S., Dunlop, D.D., Harcourt, E.A., Khullar, N.: Generation of software tools from processor descriptions for hardware/software codesign. In: Proceedings of the 34th Annual Conference on Design Automation, pp. 303–306. ACM Press, New York (1997)

5. Hadjiyiannis, G., Hanono, S., Devadas, S.: ISDL: an instruction set description language for retargetability. In: Proceedings of the 34th Annual Conference on Design Automation, pp. 299–302. ACM Press, New York (1997)

6. Pees, S., Hoffmann, A., Zivojnovic, V., Meyr, H.: LISA – machine description language for cycle-accurate models of programmable DSP architectures. In: Proceedings of the 36th ACM/IEEE Conference on Design Automation, pp. 933–938. ACM Press, New York (1999)

7. SALTO Project, http://www.irisa.fr/caps/projects/Salto

8. Kästner, D.: Propan: A retargetable system for postpass optimizations and analyses. In: Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems, pp. 63–80. ACM Press, New York (2000)

9. Abbaspour, M., Zhu, J.: Retargetable binary utilities. In: Proceedings of the 39th Conference on Design Automation, pp. 331–336. ACM Press, New York (2002)

10. Guthaus, M.R., Ringenberg, J.S., Ernst, D., Austin, T.M., Mudge, T., Brown, R.B.: A free, commercially representative embedded benchmark suite. In: Proceedings of the 4th Annual IEEE Workshop on Workload Characterization, pp. 3–14 (2001)

