Kelayakan Lab DSP Thesis

Date post: 29-May-2018
Upload: jadur-rahman
  • 8/8/2019 Kelayakan Lab DSP Thesis

    1/39

    MASTER THESIS

    A Prototype Laboratory Environment for Digital Signal Processing

    Using Simulink and a Texas Instrument DSP Device

    Calle Gustavsson

    March 2002

    IR-SB-EX-0207

    Department of Signals, Sensors & Systems,

    Royal Institute of Technology

    Abstract

    Normally, when a model is designed from building blocks in Simulink, the simulation is
    performed within the Simulink environment. A test of the design in a real-time
    environment requires that source code be generated, compiled and downloaded to the
    target hardware. As a first attempt to bridge this software gap, this thesis describes and
    evaluates a prototype laboratory environment that directly links Simulink to a Texas
    Instruments DSP device. The prototype system converts graphical models and makes
    available various real-time signal processing algorithms, such as adders, delays, FFTs,
    IIR filters and multipliers. Future work includes modifying the prototype to allow for
    feedback in the graphical models and finding an efficient way of handling signal
    processing algorithms that require variable buffer lengths.

    Acknowledgements

    I would like to thank Ph.D. Mats Bengtsson for his encouraging comments on my early

    work and for his excellent guidance throughout the project. Personally, I gained a lot of

    practical software experience and valuable insights into signal processing. Also, I would

    like to thank Ph.D. student Georg Jöngren for good advice concerning DSP issues.

    CONTENTS

    TABLE OF FIGURES

    1 INTRODUCTION
    1.1 BACKGROUND
    1.2 THE PROBLEM
    1.3 THE OBJECTIVE OF THE PROJECT
    1.4 EARLIER WORK

    2 SYSTEM OVERVIEW
    2.1 SYSTEM OVERVIEW
    2.2 COMMENTS ON THE SYSTEM OVERVIEW
    2.3 PROGRAMMING METHODOLOGY
        2.3.1 Block-wise Processing
        2.3.2 The Sampling Rate
    2.4 EQUIPMENT

    3 THE MAIN PARTS OF THE SYSTEM
    3.1 THE GRAPHICAL INTERFACE
        3.1.1 The Model Description Language/List-File
    3.2 THE LINK PROGRAM
        3.2.1 Class Diagram
    3.3 THE DSP PROGRAM
        3.3.1 Hardware
        3.3.2 DMA and Signal Processing
        3.3.3 The Amount of Time Available for Signal Processing
        3.3.4 Buffer Handling in the Signal Processing Unit
    3.4 MAIN LOOP OF THE DSP PROGRAM

    4 OPTIMIZATION AND TEST METHOD
    4.1 OPTIMIZATION
        4.1.1 C or Assembly?
    4.2 MEMORY USAGE
    4.3 TEST METHODS

    5 RESULTS
    5.1 SPEED PERFORMANCE OF THE SYSTEM
    5.2 CASE STUDIES
    5.3 REQUIREMENTS
    5.4 LIMITATIONS
    5.5 CONCLUSION AND FUTURE WORK

    6 APPENDIXES
    APPENDIX A: BRIEF OUTLINE FOR RECURSIVE SEARCH
    APPENDIX B: C6701 INTERRUPTS
    APPENDIX C: DIRECT MEMORY ACCESS (DMA)
    APPENDIX D: USER'S GUIDE

    7 REFERENCES

    TABLE OF FIGURES

    FIGURE 1: PROTOTYPE LABORATORY SYSTEM OVERVIEW
    FIGURE 2: MODULE LIBRARY FOR THE GRAPHICAL INTERFACE
    FIGURE 3: A BASIC GRAPHICAL MODEL AND AN EXTRACT FROM ITS .MDL-FILE
    FIGURE 4: CLASS DIAGRAM. THE MOST IMPORTANT CLASSES IN THE LINK PROGRAM AND THEIR ASSOCIATIONS
    FIGURE 5: THIRTEEN-BLOCK MODEL. THE RECURSIVE ALGORITHM IN THE LINK PROGRAM RECYCLES BUFFERS
    FIGURE 6: EXECUTION ORDER. THE LINK PROGRAM WRITES DATA TO THE EXTERNAL MEMORY OF THE DSP PROGRAM TO INFORM THE DSP PROGRAM WHAT MODULE TO RUN AND WHAT BUFFERS TO USE
    FIGURE 7: OVERVIEW OF A FEW OF THE PERIPHERALS ON THE EVM BOARD
    FIGURE 8: DSP PROGRAM OVERVIEW
    FIGURE 9: MEMORY STRUCTURE. SAMPLES FOR THE DELAY MODULE AND THE IIR FILTER ARE PRESERVED IN A SEPARATE MEMORY AREA
    FIGURE 10: PSEUDO CODE FOR THE MAIN LOOP IN THE ISR
    FIGURE 11: EXTERNAL MEMORY ACCESS
    FIGURE 12: BENCHMARKING. THE SCREEN DUMP ILLUSTRATES THE BENCHMARKING PROCEDURE IN CODE COMPOSER STUDIO
    FIGURE 13: FEEDBACK MODEL. THIS MODEL IS NOT HANDLED BY THE SYSTEM
    FIGURE 14: INTERRUPT RESPONSE PROCEDURE

    1 INTRODUCTION

    This Master Thesis tests the feasibility of an idea through the design of a prototype
    laboratory environment for digital signal processing. The thesis begins with an
    introduction to the problem of controlling digital signal processors (DSPs). In particular,
    the first chapter discusses the problem of directly linking a high-level tool for modeling
    and simulation, Simulink (The MathWorks Inc. [1]), to a DSP. The second chapter
    presents an overview of the proposed system and the subsequent chapter describes its
    three parts: the graphical interface, the link program and the DSP implementation.
    Chapter four discusses optimization, memory usage, and the test methods utilized.
    Finally, the last chapter evaluates the performance of the system and identifies future
    work.

    1.1 Background

    There are many different ways of controlling a digital signal processor:

    1. By writing a computer program in a DSP-compatible language, for example C,
       and then compiling and downloading it directly to the target hardware.

    2. By describing a system graphically using Simulink, and then generating C code
       using special toolboxes sold by The MathWorks. The generated code is more or
       less ready for compilation.

    3. By developing special programs for DSP control. These programs are run from the
       MATLAB prompt or from the DOS prompt. Each program starts a specific
       algorithm, and parameters can be set before the function call.

    Such programs form a vital part of the current laboratory exercise in the course Digital

    Signal Processing at the Royal Institute of Technology.

    The first alternative above is the most general, and the third the least general. The idea
    behind this project is to explore a kind of DSP control that fits in between the second
    and the third alternative as far as flexibility is concerned. To avoid the complication of
    having to compile source code, the real-time application is implemented as a DSP
    program controlled via parameter settings. To create a pedagogically sound and intuitive
    laboratory environment, the proposed system has a graphical user interface (similar to
    Simulink), where a number of building blocks can be connected in a block diagram.

    1.2 The Problem

    Primarily, this project considers the problem of linking Simulink to a Texas
    Instruments DSP device. Normally, when a model is developed from building blocks in
    Simulink, the simulation is performed within the Simulink environment. A test of the
    design in a real-time environment requires that source code be generated, compiled and
    downloaded to the target hardware.

    Accordingly, a direct conversion of a Simulink model to some kind of parameter-table
    representation, which a DSP program can interpret, involves a number of problems. To
    begin with, the analysis of a Simulink block diagram that describes a signal processing
    system consists of these tasks:

    Determining the building blocks (modules) of the graphical system and retrieving their parameters.

    Identifying the input and output ports of the system.

    Determining the connections between the modules.

    Determining the execution order of the modules in the graphical model.

    After the analysis of the graphical model these tasks remain:

    Conveying the results from the analysis of the model to a DSP program.

    Creating the DSP program and a toolbox library.

    1.3 The Objective of the Project

    The objective of this project is to develop a prototype laboratory environment for digital

    signal processing.

    The implementation is divided into three parts:

    1) A drawing board, which displays a model of the desired digital signal processing
       system. By double-clicking on a building block on the drawing board, parameter
       values should be adjustable. A pre-defined limited toolbox library should contain
       modules, such as FFT, linear filter, delay, multiplier and adder.

    In digital signal processing the system typically performs some kind of filtering to extract information from a signal.

    2) A Link program, which interprets and converts the graphical representation of the
       model, its component parts and their interconnections, to some kind of
       parameter-table representation. The table defines what building blocks and
       what parameter values are used, and how the input and output ports are
       interconnected.

    3) A DSP program, which interprets the parameter-table description of the system,
       and performs simulation, analysis and possible visualization. Part of the DSP
       program is a toolbox library consisting of the modules listed under
       implementation, part 1.

    The typical sampling rate of the system should be at least 8 kHz. It should be possible to
    use sound (voice) as the input signal to the analog-to-digital converter (ADC) on the
    evaluation module board (EVM Board); refer to figure 1, System Overview, below.

    1.4 Earlier Work

    The attempt to directly bridge the software gap between Simulink and a DSP device is, to
    the author's knowledge, a new design approach. However, several successful approaches
    have been taken to convert Simulink models to VHDL code. Grout and Keane [2], for
    example, describe the development of a software toolbox that can analyze a Simulink
    block model in order to produce a VHDL representation of the model. The resulting data
    from the toolbox is a model description language/list file (.mdl-file) for the complete
    system, and a second model file that can be processed to create the VHDL code.
    Similarly, Krukowski and Kale [3] outline the direct mapping of a Simulink structure into
    one described in VHDL by generating a VHDL equivalent model.

    Further, MATLAB's Real-Time Workshop [4] allows for C code generation directly from
    Simulink models. By combining such code generation tools with real-time systems
    hardware, it is possible to simulate and analyze signal processing designs in real time. In
    this paper, however, code creation is not considered.

    2 SYSTEM OVERVIEW

    Chapter 2 presents an overview of the implemented laboratory system.

    2.1 System Overview

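    Figure 1: Prototype Laboratory System Overview

    [Figure: three levels connected in sequence. High level: block diagram in Simulink,
    started with ">>RUN LINK_PROGRAM". Medium level: the Link program (C++ interface,
    get_param(), create table()). Low level: the DSP program (assembler, C; real-time
    application, Dsp(vector* table) with a case per module, for example adder) running on
    the EVM module, with signal input and output through the ADC/DAC.]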

    2.2 Comments on the System Overview

    The process begins in Simulink; refer to figure 1 above. You develop a model of a
    system on a drawing board in Simulink, using ready-made modules from a toolbox
    library. Parameter settings can be changed by right-clicking on the building blocks.

    You run the Link program from the prompt in MATLAB, or by double-clicking on the
    file link.exe in the Windows Explorer. The Link program converts the graphical
    representation to a kind of parameter-table representation of the model. The Link
    program automatically launches the DSP program, which interprets the parameter table
    and starts the simulation in real time by calling various DSP toolbox modules.

    Finally, a loudspeaker or an oscilloscope connected to the digital-to-analog converter on

    the Evaluation Module (EVM Board) conveys the result of the simulation.

    2.3 Programming Methodology

    A prototype is, according to [5]:

    An original thing in relation to a copy, imitation, representation, later specimen, improved form etc., a trial model, a preliminary version

    This prototype laboratory system was developed quickly. It has limited functionality;
    time was not spent on details, and it is not intended to be complete or accurate in all
    details. However, the project aims at designing a dynamic system, in the sense that it
    should be easy to expand and modify. Emphasis has therefore been placed on making
    the C and C++ code easy to understand. All functions are commented in the source code
    and the programs are divided into modules, so that improving the system and adding
    new DSP modules should be straightforward.

    The three parts of the laboratory system were developed and tested separately. The

    toolbox library for the graphical interface was developed from ready-made building

    blocks in Simulink, whereas Microsoft Visual C++ 6.0 was the essential tool for writing

    the link program.

    Code Composer Studio and the TMS320C6701 DSP platform were used for the initial
    design of the DSP program. As for literature, [6] and [7] served as a starting point for the
    design of the DSP modules. Also, [8] was at times used for inspiration.

    2.3.1 Block-wise Processing

    In order to simplify the first version of the conversion program, the DSP program utilizes
    block-wise processing. It continuously captures N samples first and then performs
    operations on all N samples, not only when the discrete Fourier transform (DFT) or FFT
    is performed, but also during filtering. Operations such as filtering can be done on every
    incoming sample and do not require that a frame or block of data be available at the time
    of processing. However, to avoid having to shift between various buffer operations,
    block-wise processing is utilized for filtering too.
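
    Block-wise processing can be illustrated with a minimal C++ sketch. The names
    multiply_block and DelayModule are hypothetical and are not taken from the thesis
    code. The sketch assumes that every module receives a complete block of N samples and
    that a module with memory, such as a delay, keeps the samples it needs from earlier
    blocks in its own state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block-wise multiplier: scales all N samples of one block.
void multiply_block(const std::vector<float>& in, std::vector<float>& out, float factor) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = factor * in[i];
    }
}

// Hypothetical block-wise delay of d samples. The d most recent input samples
// are kept in `history` so that the delay continues across block boundaries.
struct DelayModule {
    std::vector<float> history;  // samples carried over from earlier blocks, oldest first

    explicit DelayModule(std::size_t d) : history(d, 0.0f) {}

    void process(const std::vector<float>& in, std::vector<float>& out) {
        assert(in.size() == out.size());
        const std::size_t d = history.size();
        // Concatenate the carried-over samples with the new block.
        std::vector<float> joined(history);
        joined.insert(joined.end(), in.begin(), in.end());
        // Output sample i is the input sample that arrived d samples earlier.
        for (std::size_t i = 0; i < in.size(); ++i) {
            out[i] = joined[i];
        }
        // Keep the newest d samples for the next block.
        history.assign(joined.end() - static_cast<std::ptrdiff_t>(d), joined.end());
    }
};
```

    A module without memory, such as the multiplier, needs no state between blocks, whereas
    the delay has to preserve samples from one block to the next.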

    2.3.2 The Sampling Rate

    The sampling rate of the system affects the speed requirements on the DSP program. For
    the system to work properly, the time the DSP program spends processing one block of
    samples has to be shorter than the time it takes to fill an input buffer. To achieve this
    goal, economical memory usage, buffer sizes and the number of blocks allowed in a
    model are key issues that have been considered.
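
    The timing constraint can be made concrete with a small calculation, sketched here in
    C++. The function names are hypothetical. The 8 kHz sampling rate is the figure named in
    section 1.3; the buffer length of 256 samples and the 133 MHz clock used in the note
    below are assumed values chosen only for illustration.

```cpp
#include <cassert>

// Time in seconds needed to fill an input buffer of n_samples at sampling rate fs_hz.
double buffer_fill_time(int n_samples, double fs_hz) {
    assert(n_samples > 0 && fs_hz > 0.0);
    return n_samples / fs_hz;
}

// Number of CPU cycles that elapse while one buffer fills.
double cycle_budget(int n_samples, double fs_hz, double clock_hz) {
    return buffer_fill_time(n_samples, fs_hz) * clock_hz;
}

// True if processing one block fits into the time it takes to fill the next one.
bool meets_deadline(double processing_cycles, int n_samples, double fs_hz, double clock_hz) {
    return processing_cycles < cycle_budget(n_samples, fs_hz, clock_hz);
}
```

    With the assumed values, a 256-sample buffer takes 32 ms to fill at 8 kHz, which
    corresponds to roughly 4.26 million cycles at 133 MHz. All modules in a model together
    have to finish within that budget.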

    2.4 Equipment

    The following equipment was available for design and implementation:

    One PC (Pentium III 500 MHz)

    One DSP-card (Texas Instruments C6701)

    Code Composer Studio

    Matlab 6.0, Release 12

    Microsoft Visual C++ 6.0

    Function generator

    Headphones, adapters

    3 THE MAIN PARTS OF THE SYSTEM

    This chapter describes the graphical interface, the link program and the DSP program

    more extensively.

    3.1 The Graphical Interface

    The graphical interface utilizes Simulink, a flexible design tool provided by The
    MathWorks [1]. Simulink allows efficient testing and verification of signal processing
    algorithms [1], [3]. In particular, the high-level circuit description makes quick changes
    and corrections possible, which would be impractical, if not impossible, to carry out if a
    low-level tool were used from the start of the design process.

    Figure 2: Module Library for the Graphical Interface

    As can be seen from figure 2, a few ready-made modules form a module library. It should
    be noted that Simulink offers various options and settings for each module. The
    proposed interface, on the other hand, puts some constraints on these options. More
    often than not, it is possible to adjust only one parameter (Multiplier, Delay). As for the
    IIR filter, the numerator and denominator are adjustable. For professional use this
    limitation is unsatisfactory, but for pedagogical purposes, for example if the interface is
    to be used as an introduction to signal processing for university students, this constraint
    might not be considered a serious flaw.

    Note the FFT-Out module in the module library. If an FFT module is included in a

    block diagram, the program immediately interrupts signal processing and the result of the

    FFT is sent to the output.

    Once the user has described an algorithm by connecting a number of given library
    modules, the model is stored in a Model Description Language/List-File, which is
    described in the next section.

    3.1.1 The Model Description Language/List-File

    The Model Description Language/List-file (.mdl-file) is a low-tech format file that

    defines a syntax for storing simple data in text and binary files [9]. The data is arranged

    into chunks.

    (informal) A chunk is a part of something, esp. a large part: a chunk of text [5]

    There are several different chunk-types. In a graphical model developed in Simulink, the

    major chunk-types are blocks and lines. The block sections, which describe the

    components of the design model, are stored in alphabetical order, followed by the line

    sections, each equivalent to a single wire connector.

    Figure 3 shows a basic graphical model of a communication channel, where one branch is
    delayed and attenuated. The last part of this model's .mdl-file illustrates the idea of
    chunks:

    Figure 3: A basic graphical model and an extract from its .mdl-file

    Block {

    BlockType Inport

    Name "In"

    Position [30, 38, 60, 52]

    Port "1"

    Interpolate on

    }

    Block {

    BlockType Fcn

    Name "Delay"

    Position [95, 80, 155, 110]

    ForegroundColor "red"

    Expr "2"

    }

    Block {

    BlockType Fcn

    Name "Multiplier"

    Position [195, 80, 255, 110]

    ForegroundColor "green"

    Expr "0.2"

    }

    Block {

    BlockType Sum

    Name "Sum"

    Ports [2, 1]

    Position [275, 35, 295, 55]

    ForegroundColor "blue"

    ShowName off

    IconShape "round"

    Inputs "|++"

    SaturateOnIntegerOverflow on

    }

    Block {

    BlockType Outport

    Name "Out"

    Position [335, 38, 365, 52]

    Port "1"

    OutputWhenDisabled "held"

    InitialOutput "[]"

    }

    Line {

    SrcBlock "Sum"

    SrcPort 1

    DstBlock "Out"

    DstPort 1

    }

    Line {

    SrcBlock "Delay"

    SrcPort 1

    DstBlock "Multiplier"

    DstPort 1

    }

    Line {

    SrcBlock "Multiplier"

    SrcPort 1

    Points [25, 0]

    DstBlock "Sum"

    DstPort 2

    }

    Line {

    SrcBlock "In"

    SrcPort 1

    Points [5, 0]

    Branch {

    DstBlock "Sum"

    DstPort 1

    }

    Branch {

    Points [0, 50]

    DstBlock "Delay"

    DstPort 1

    }

    }

    }

    }

    Each chunk has a keyword telling what type of chunk it is and a sequence of data items,
    each of which can be an integer, a float, a string, or a new chunk. Each block section, for
    instance, has a string that identifies what type of block it is, and other strings
    identifying its parameters. In the .mdl-file all text constants are enclosed in double
    quotes. For the Multiplier block above, the multiplier is 0.2.

    As for the line-chunks, the integers and strings are nested. The line with SrcBlock
    "In" has, for example, two branches, ending at the Sum block and the Delay block
    respectively. Thus, it is possible to connect a line to multiple destinations, but Simulink
    does not handle two branches ending at the same block without an intermediate adder.

    The .mdl-format is simple. The format was intended to create a practical way of storing
    models that was fast to read and write, easy to use in programs and reasonably
    space-efficient [9].
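
    The nested chunk structure lends itself to a small recursive reader. The following C++
    sketch is not the thesis code; the names Chunk, parse_mdl and get are hypothetical. It
    assumes that every data item sits on its own line, that a chunk opens with a line ending
    in "{" and closes with a line containing only "}", and that string values do not span
    several lines.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One chunk: a keyword, its "key value" items and any nested chunks.
struct Chunk {
    std::string type;  // e.g. "Block", "Line", "Branch"
    std::vector<std::pair<std::string, std::string>> items;
    std::vector<Chunk> children;
};

static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Reads items into `chunk` until its closing "}" or the end of the input.
static void parse_body(std::istream& in, Chunk& chunk) {
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty()) continue;
        if (line == "}") return;
        if (line.back() == '{') {
            // A nested chunk starts here; parse it recursively.
            Chunk child;
            child.type = trim(line.substr(0, line.size() - 1));
            parse_body(in, child);
            chunk.children.push_back(std::move(child));
        } else {
            // A plain item: the first word is the key, the rest is the value.
            const std::string::size_type sp = line.find_first_of(" \t");
            std::string key = line.substr(0, sp);
            std::string value = (sp == std::string::npos) ? "" : trim(line.substr(sp));
            if (value.size() >= 2 && value.front() == '"' && value.back() == '"') {
                value = value.substr(1, value.size() - 2);  // strip the double quotes
            }
            chunk.items.emplace_back(key, value);
        }
    }
}

// Parses a whole text into a root chunk whose children are the top-level chunks.
Chunk parse_mdl(const std::string& text) {
    std::istringstream in(text);
    Chunk root;
    root.type = "root";
    parse_body(in, root);
    return root;
}

// Returns the value stored under `key`, or an empty string if it is absent.
std::string get(const Chunk& c, const std::string& key) {
    for (const auto& kv : c.items) {
        if (kv.first == key) return kv.second;
    }
    return "";
}
```

    Applied to the extract above, the Line chunk with SrcBlock "In" would yield two Branch
    children, one with DstBlock "Sum" and one with DstBlock "Delay".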

    3.2 The Link Program

    This section describes the Link program. Basically, what the Link program does is to
    open an .mdl-file to retrieve information about the various building blocks and their wire
    connections in the graphical model. Then, this information is encoded and conveyed to
    the DSP program.

    The core of the link program is an iterative and recursive process, which establishes in

    what order to read from and write to what buffers.

    Whilst the amount of processing time spent on signal processing in the DSP program has

    to be shorter than the time it takes to fill an input buffer, there are no speed requirements

    on the Link program. The function of the Link program is to convert an .mdl-file to an

    execution order for ready-made library DSP modules, and this conversion is not supposed

    to be carried out in real time.

    The link program may be operated directly from the Matlab prompt, or via the file
    link.exe in the Windows Explorer.

    3.2.1 Class Diagram

    This section describes the most important classes in the link program and how they

    interact.

    Classes shown in the diagram, with their attributes and operations:

    Block: Module: string; Startpoint: string; Endpoint: string; Param: string

    Line: Start: string; End: string

    Chunk_finder: open_file(); search_mdl_file(); store_in_vec(); block_vec: vector; line_vec: vector; ex_vec: vector

    Vec_handler: vec_search(); create_module_list(); fill_ex_vec(); branch_flag: integer; dsp_box: dsp_box; encoder: encoder

    Encoder: code_name(); code_param; code table: int2float

    Dsp_box: module: integer; param: integer; readbuffer: integer; writebuffer: integer

    Dsp_ctrl: dsp_init(); dsp_run(); write_module_list(); EVM Board Handle; HPI Handle; Event Handle

    Association labels in the diagram: "Works for", "Inits and Supplies", "Overload Messages". One region is labelled "DSP INTERFACE".

    Figure 4: Class Diagram. The Most Important Classes in the Link Program and their Associations

    In the initial conversion stage, class Chunk_finder opens an .mdl-file and reads it in
    text mode. Chunk_finder searches the .mdl-file for two kinds of chunks: blocks and lines.
    It ignores the rest of the data in the file, whereas the block-chunks and the line-chunks,
    encapsulated in separate classes, are stored in two separate vectors, together with the
    data items defining them.

    In the second conversion stage, class Vec_handler takes over. Put simply, Vec_handler
    combines the elements in the two vectors, which Chunk_finder has filled with blocks and
    lines, into a new vector, which contains the execution order for the DSP program. More
    specifically, class Vec_handler manages the core of the link program, that is, the
    recursive search algorithm, which handles the ordering of input and output buffers for the
    different signal processing modules of the graphical model.

    Figure 5: Thirteen-block model. The recursive algorithm in the link program recycles buffers

    The picture above illustrates how the execution order of the DSP program is established,
    and how buffers are reused during the analysis of the graphical model. Block 1, block 2
    and block 3, for example, have to write to separate buffers, since buffer 1 is to be read by
    an adder later on in the model. When all modules connected to buffer 1 have finished
    reading, buffer 1 can be reused. Therefore, buffer 1 is used as the output buffer for the
    adder, which reads buffer 2 and buffer 1, and so on. As can be seen above, the recursive
    search algorithm in Vec_handler always starts and ends on buffer 1 to simplify buffer
    handling in the DSP program. Refer to Appendix A for a brief outline of the recursive
    search algorithm.
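
    The buffer recycling idea can be sketched in C++ as follows. This is not the recursive
    search from Appendix A but a simpler substitute based on counting pending readers; the
    function name assign_buffers is hypothetical. The sketch assumes that the modules are
    already listed in execution order and that a module may write to a buffer it has just
    finished reading, as the adder does in the description above.

```cpp
#include <cassert>
#include <set>
#include <vector>

// inputs[i] lists the modules whose output module i reads. Modules are assumed
// to be given in execution order, so every input index is smaller than i.
// Returns the 1-based number of the buffer that each module writes to.
std::vector<int> assign_buffers(const std::vector<std::vector<int>>& inputs) {
    const int n = static_cast<int>(inputs.size());

    // Count how many modules still have to read each module's output.
    std::vector<int> readers(n, 0);
    for (int i = 0; i < n; ++i) {
        for (int j : inputs[i]) {
            assert(j >= 0 && j < i);
            ++readers[j];
        }
    }

    std::vector<int> buffer_of(n, 0);
    std::set<int> free_buffers;  // released buffers, lowest number first
    int next_new = 1;
    for (int i = 0; i < n; ++i) {
        // Module i reads its inputs; a buffer without remaining readers is released.
        for (int j : inputs[i]) {
            if (--readers[j] == 0) free_buffers.insert(buffer_of[j]);
        }
        if (!free_buffers.empty()) {
            buffer_of[i] = *free_buffers.begin();  // reuse the lowest free buffer
            free_buffers.erase(free_buffers.begin());
        } else {
            buffer_of[i] = next_new++;             // no free buffer: take a new one
        }
    }
    return buffer_of;
}
```

    For the model in figure 3, listed in the order In, Delay, Multiplier, Sum, the sketch
    assigns buffers 1, 2, 2 and 1: the Delay block needs a second buffer because the adder
    still has to read buffer 1, and the adder finally writes its result back to buffer 1.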

    Class Encoder works for class Vec_handler. The Encoder converts string-names and

    string parameters to float parameters, according to a predefined table. The conversion is

    carried out before the names and the parameters are stored in the execution vector.

    Once the conversion has been completed, class Dsp_ctrl initiates the DSP-program, and

    downloads an executable file to the target hardware. Class Dsp_ctrl supplies the DSP-program with the execution order of the library modules by writing numbers to an

    external memory on the EVM Board. Then, this class initializes the C6701 memory space

    through the Host Port Interface (HPI), and the external memory registration registers.

    When the boot process has ended, the CPU is taken out of reset and starts executing code

    from address zero.

    The picture below illustrates the list of data the link program writes to the external
    memory of the DSP program. Each module has five memory positions at its disposal.

    Position 1: Block identifier

    Position 2: Parameter, for example, delay factor

    Position 3: Buffer to read from

    Position 4: Buffer to write to

    Position 5: Empty; reserved for future work

    Position 6: Block identifier (of the next module)

    Etcetera.

    [Figure: memory contents by offset: Module 1, Module 2, ..., Module N, Stop Signal]

    Figure 6: Execution order. The link program writes data to the external memory of the DSP
    program to inform the DSP program what module to run and what buffers to use.

    Position number one tells the DSP program what module to run. A float value of 2, for
    example, identifies the multiplier module, a value of 3 corresponds to the adder module,
    a value of 4 to the delay module, and so on, according to a predefined coding table. The
    second position in the external memory conveys the parameter chosen for the module. In
    the case of the multiplier module, that is module 1 in the picture, the second position
    identifies the multiplier factor, 10. Positions 3 and 4 identify what buffer to read data
    from and what buffer to write the processed data to, respectively. Position 5 is reserved
    for future work. After the last module in the list, a stop signal is included so that the DSP
    program knows where to stop reading the memory. Section 3.4 below describes how the
    DSP program interprets the data written to the external memory.

    3.3 The DSP Program

    Section 3.3 presents the DSP program utilized by the laboratory system.

    3.3.1 Hardware

    This section presents the C6701 processor and the Evaluation Module (EVM) board.

    The floating-point processor C6701 has a peak performance of 1000 million floating-point
    operations per second and can operate at 167 MHz (6 ns cycle time) [7]. It executes
    up to eight 32-bit instructions every cycle. It has 64 KB of internal program memory or
    cache, and 64 KB of internal data memory.

    The C6x EVM board is a complete DSP system, which provides quad DSP clock support
    up to 133 MHz. Apart from the processor chip, the EVM board includes memory, A/D
    capabilities and PC interfacing components [10], [7]. The peripherals include an External
    Memory Interface (EMIF), Direct Memory Access (DMA), Multi-channel Buffered
    Serial Ports (McBSP) and Host Port Interface (HPI). The 32-bit EMIF handles the
    communication with external memory and supports SDRAM, SBSRAM and
    asynchronous memories. The DMA has two 32-bit DMA data buses and two 32-bit DMA
    address buses. Refer to Appendix C for a description of DMA. The McBSP provides a
    high-speed communication link with external devices. The Host Port Interface (HPI)
    provides a low-cost interface through which a host processor can directly access the
    CPU's internal memory [6].

    [Figure 7 diagram; labels: EMIF, DMA, CPU, McBSP, HPI, External Memory, A/D Converter
    (CODEC), PC Host]

    Figure 7: Overview of a few of the peripherals on the EVM Board

    To enable communication with external peripherals, the EVM Board also includes a CD
    quality, 16-bit audio interface. The coder/decoder (CODEC) in the interface supports
    sample rates from 5.5 kHz to 48 kHz [10]. The CODEC is connected to the C6701
    processor through two serial ports. The most time-efficient way of transferring data
    between the serial ports and the internal memory is to utilize the DMA.

    3.3.2 DMA and Signal Processing

    This section describes the DMA channels and their interaction with the signal processing
    unit.

    The part of the DSP program that handles the communication with the McBSPs was
    developed from a skeleton program, found on [11]. The DSP program makes use of two
    DMA channels, each configured to transfer sample data continuously, alternating between
    two blocks (buffers). Refer to figure 8 below. DMA channel two copies data to InBuffer1 or to
    InBuffer2. When the input buffer is full, an interrupt is posted to the CPU and the
    contents of its registers are stored. Refer to Appendix B for a brief description of interrupts.

    During each interrupt service routine some signal processing, involving one or more
    buffer transfers, has to be completed on the sample data. Once the signal processing is
    completed, the processed data is copied to OutBuffer1 or to OutBuffer2. Then, the
    procedure starts all over again. DMA channel two copies data from the receive register of
    the McBSP, while DMA channel one copies the processed data to the transmit register of
    the McBSP.

    At the end of each block transfer, all necessary registers are restored so that the DMA can
    perform another block transfer.
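    The alternation between InBuffer1 and InBuffer2 can be read as double (ping-pong)
    buffering. The sketch below models that idea in portable C, with an ordinary function
    call standing in for the DMA transfer and a returned pointer standing in for the
    interrupt. The type PingPong, the function pingpong_push and the buffer length are
    hypothetical and are not taken from the DSP program.

```c
#include <stddef.h>

#define BUF_LEN 4  /* small for illustration; the text uses 512 */

typedef struct {
    short  buf[2][BUF_LEN]; /* stand-ins for InBuffer1 and InBuffer2      */
    int    active;          /* index of the buffer currently being filled */
    size_t fill;            /* samples written to the active buffer       */
} PingPong;

/* Store one incoming sample. Returns a pointer to the buffer that just
 * became full (the moment an interrupt would be raised), or NULL. */
const short *pingpong_push(PingPong *pp, short sample)
{
    pp->buf[pp->active][pp->fill++] = sample;
    if (pp->fill < BUF_LEN)
        return NULL;

    const short *full = pp->buf[pp->active];
    pp->active ^= 1;  /* continue filling the other buffer */
    pp->fill = 0;
    return full;
}
```

    While the full buffer is being processed, new samples keep arriving in the other
    buffer, which is why the processing must finish before that buffer fills up as well.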

    [Figure 8 diagram: McBSP Receive Register -> InBuffer1 / InBuffer2 -> Signal Processing
    Unit -> OutBuffer1 / OutBuffer2 -> McBSP Transmit Register; interrupt when buffer full]

    Figure 8: DSP program overview

    3.3.3 The Amount of Time Available for Signal Processing

    It should be clear by now that the amount of time spent on signal processing one buffer of

    sample data in the DSP program has to be shorter than the time it takes to fill an input

    buffer. Thus, it is necessary to consider the amount of time available for the signal

    processing to be performed.

    The formula for calculating the amount of time available for signal processing was found
    in [7]. Given a sampling frequency, fs, of 8 kHz and an EVM frequency, fEVM, of 133
    MHz, there are 16 625 clock cycles (fEVM/fs) between consecutive samples. If each
    input buffer contains 512 samples, where even data is right channel data and odd data is
    left channel data, there will be roughly 8.5·10^6 (512 times 16 625) clock cycles available
    for signal processing. A sampling frequency of 44.1 kHz results in 1.5·10^6 available cycles.

    In other words, since one sample corresponds to 0.000125 seconds (1/fs), it takes 0.064
    seconds (512 times 0.000125 s) to fill an input buffer. Within this time all signal
    processing has to be performed. Consequently, a higher sampling frequency, for example
    44.1 kHz, will decrease the available time by a factor of about 5.5 (44.1/8).
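    The arithmetic above can be reproduced with a few lines of C. The function names are
    chosen for illustration only; the formulas are the ones used in the text (clock
    frequency divided by sampling frequency, multiplied by the number of samples per buffer).

```c
/* Clock cycles between two consecutive samples: f_clk / f_s. */
double cycles_per_sample(double f_clk_hz, double fs_hz)
{
    return f_clk_hz / fs_hz;
}

/* Clock cycles available while one buffer of n samples is being filled. */
double cycles_per_buffer(double f_clk_hz, double fs_hz, unsigned n)
{
    return n * cycles_per_sample(f_clk_hz, fs_hz);
}

/* Time in seconds needed to fill one buffer of n samples. */
double buffer_fill_time(double fs_hz, unsigned n)
{
    return n / fs_hz;
}
```

    With a 133 MHz clock and 512 samples per buffer, cycles_per_buffer gives 8 512 000
    cycles at 8 kHz and about 1.54 million cycles at 44.1 kHz.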

    Both channels are sampled and put into the input buffer. Only one channel's data is
    processed during the interrupt service routine, though. Naturally, if both channels were to
    be processed, the amount of time available for signal processing would decrease by
    almost a factor of 2.

    3.3.4 Buffer Handling in the Signal Processing Unit

    This section describes the buffer handling in the signal processing unit of the DSP

    program.

    DSP programs, such as the delay or the IIR filter, where the last values of one buffer
    constitute the first values in the next buffer, require preservation of buffer values so as
    not to cause a gap in between buffers. To allow for more than one delay and one IIR filter
    in the model, the samples that are preserved until the next buffer arrives are stored in a
    separate memory space, with only an offset in between the different filters or delays. This
    solution makes it possible to configure each delay and each filter independently of the
    other filters and delays in the model. An offset pointer keeps track of the offset to each
    filter's start address. The filter coefficients for the different filters are stored in a similar
    fashion in a separate memory and with an offset in between.

    [Figure 9 diagram: Preserved Samples for filter1 | Preserved Samples for filter2 | ... |
    Preserved Samples for filterN, with an offset between consecutive areas]

    Figure 9: Memory structure. Samples for the delay module and the IIR filter are preserved in a
    separate memory area

    Since the IIR filter output depends both on input and output samples, the buffers used
    whilst IIR filtering data are slightly longer than the buffers used in, for example, the
    delay program. The lengths of the input and output buffers are the same as for all the other
    modules in the library, though. If the modules are to be connected, as they are in a
    graphical model, the length of the buffers has to be the same.
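    The preservation of samples between buffers, as described above for the delay module,
    can be illustrated with the following sketch. It is not the code of the prototype:
    the type DelayState and the function delay_process are hypothetical, and each delay
    here owns its own small state structure instead of an offset into a shared memory area.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DELAY 10  /* the prototype allows delay factors 1-10 */

typedef struct {
    size_t delay;            /* delay in samples, 1..MAX_DELAY            */
    float  saved[MAX_DELAY]; /* last `delay` inputs of the previous buffer */
} DelayState;

/* Delay one buffer of n samples (n >= st->delay) by st->delay samples.
 * in and out must not overlap. */
void delay_process(DelayState *st, const float *in, float *out, size_t n)
{
    size_t d = st->delay;

    /* The first d outputs come from the tail of the previous buffer. */
    memcpy(out, st->saved, d * sizeof(float));
    /* The remaining outputs are the current inputs shifted by d. */
    memcpy(out + d, in, (n - d) * sizeof(float));
    /* Preserve the tail of this buffer for the next call. */
    memcpy(st->saved, in + (n - d), d * sizeof(float));
}
```

    An array of such state structures, indexed per delay module, plays the same role as
    the offset pointer into the separate memory area.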

    As seen above, the preservation of a few samples in between buffer transfers is easily
    dealt with. More demanding problems crop up when up sampling and down sampling are
    considered.

    The implementation of up sampling and down sampling requires variable buffer lengths.
    If a program resamples the N values in a buffer at a rate L times higher than the input
    sample rate, by inserting L-1 zeros between consecutive samples, the output
    buffer will consequently be L times longer. Similarly, since down sampling involves
    discarding a number of consecutive samples following each sample, the result will be a
    shorter output buffer. Dexterity in handling buffers is needed to solve these problems on
    a general level. Most certainly, the simple program structure of this prototype cannot be
    used.
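    The change in buffer length can be made concrete with a small sketch. The functions
    upsample and downsample below are illustrative only and are not part of the prototype,
    which does not support these operations. The return values show that N input samples
    become N times L output samples, or roughly N divided by M.

```c
#include <stddef.h>

/* Insert L-1 zeros after every input sample.
 * out must hold n * L samples; returns the output length. */
size_t upsample(const float *in, size_t n, size_t L, float *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i * L] = in[i];
        for (size_t k = 1; k < L; k++)
            out[i * L + k] = 0.0f;
    }
    return n * L;
}

/* Keep every M-th sample, starting with the first.
 * out must hold (n + M - 1) / M samples; returns the output length. */
size_t downsample(const float *in, size_t n, size_t M, float *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i += M)
        out[m++] = in[i];
    return m;
}
```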

    3.4 Main Loop of the DSP Program

    This section describes how the main loop in the Interrupt Service Routine (ISR) is

    designed. The pseudo code below illustrates how the DSP program interprets the

    execution order, which the C++ interface writes to the external memory.

    The signal process function is called each time the processor interrupts its normal
    program flow, that is, when an input buffer is full of data. Initially, in the signal process
    function, the content of a full input buffer is converted from short integers (16 bits) to
    floating point values (32 bits) via a function call to a separate function.

    Typecasting of the data that is to be processed is necessary. Whilst all signal processing
    algorithms are performed on float values, the audio interface on the C6701 EVM Board
    delivers 16-bit samples. The speed penalty for using floats instead of shorts
    on the C6701 is typically only a factor of 2 [12].

    Convert short input data to floating point values
    Set float pointer to first position in external memory

    MAIN LOOP
    do
    {
        set buffer to read
        set buffer to write
        run signal process:
        switch(module)
        {
            case 2: call multiplier
                    break;
            case 5: call IIR filter
                    break;
        }
    }while(!stop signal)

    Signal processing done!

    Figure 10: Pseudo code for the main loop in the ISR


    Next, a pointer is set to the first position in external memory, where the execution order
    for the DSP program resides.

    At the outset of the main loop, the buffer to read from and the buffer to write to are
    established. Then, one of the various signal processing functions is called. Since each
    module occupies five positions, the second turn in the while clause starts reading five
    positions into the external memory, the third ten positions in, and so on. The loop is
    repeated until a stop signal is found.
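    A runnable version of the loop described above might look as follows. This is a sketch
    under stated assumptions and not the DSP program itself: the buffer table, the value of
    STOP_SIGNAL and the function names are hypothetical, only the multiplier module is
    implemented, and a while loop replaces the do-while so that an empty list is handled.

```c
#include <stddef.h>

#define NUM_BUFFERS 4
#define BUF_LEN     4        /* small for illustration               */
#define ENTRY_LEN   5        /* float positions per module entry     */
#define STOP_SIGNAL (-1.0f)  /* assumed value, not given in the text */

float buffers[NUM_BUFFERS][BUF_LEN];

/* Module 2: multiply every sample by a constant factor. */
void multiplier(const float *in, float *out, float factor)
{
    for (size_t i = 0; i < BUF_LEN; i++)
        out[i] = factor * in[i];
}

/* Walk the execution order and run each module; returns modules run. */
int signal_process(const float *order)
{
    int count = 0;
    const float *p = order;

    while (p[0] != STOP_SIGNAL) {
        int    module = (int)p[0];
        float  param  = p[1];
        float *rd     = buffers[(int)p[2]];
        float *wr     = buffers[(int)p[3]];

        switch (module) {
        case 2:
            multiplier(rd, wr, param);
            break;
        /* case 5 (IIR filter) and the other modules would go here */
        default:
            break;
        }
        p += ENTRY_LEN;  /* next module entry */
        count++;
    }
    return count;
}
```

    Adding a module then amounts to adding one case to the switch statement and one code
    to the coding table.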


    4 OPTIMIZATION AND TEST METHOD

    Chapter 4 discusses optimization in general and whether to use C code or assembly code
    in particular. It also comments on the preparation of memory in the DSP program and the
    test methods utilized.

    4.1 Optimization

    In general, the purpose of optimization is to create the smallest, or the fastest, object code
    possible. To achieve these goals, the compiler performs various changes to the assembly
    code. For example, it eliminates dead code, removes redundant expressions,
    optimizes loops, and uses inline functions [19].

    Generally, inline means to place something directly in the source code. More specifically,
    in the words of [13]:

    "When an inline function is called, the C/C++ source code for the function is inserted at
    the point of the call. This is known as inline function expansion. Inline function
    expansion is advantageous in short functions for the following reasons:

    It saves the overhead of a function call.

    Once inlined, the optimizer is free to optimize the function in context with the
    surrounding code."
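    As an illustration of the quotation, the sketch below uses the standard C99 form
    static inline. The functions scale and scale_buffer are invented for this example;
    whether the call is actually expanded is up to the compiler and its optimization level.

```c
/* Short helper that is a good candidate for inline expansion. */
static inline float scale(float x, float factor)
{
    return x * factor;
}

/* After inlining, the loop body contains no function call and the
 * optimizer can treat `factor` as a loop invariant. */
void scale_buffer(float *buf, int n, float factor)
{
    for (int i = 0; i < n; i++)
        buf[i] = scale(buf[i], factor);
}
```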

    The compiler in Code Composer Studio allows four levels of optimization: -o0, -o1, -o2,
    and -o3. Level -o0 corresponds to no optimization. At level -o3, the C compiler and
    optimizer are claimed by Texas Instruments to generate code that is 80% efficient, that is,
    code that reaches up to 80% of the performance of handwritten assembly. Refer to [17] for details.

    4.1.1 C or Assembly?

    The DSP program is almost entirely written in C. The user is normally able to use C for
    all programs and functions [12]. With many functions, the code generated by the C
    compiler and optimizer in Code Composer makes full use of the processor's resources
    [12]. In those cases, no benefit at all would come from writing the code in low-level assembly.

    However, FFT assembly code was downloaded from Texas Instruments' ftp-site [14] to
    improve the performance. The FFT is an example of a particularly complicated signal
    processing task which utilizes manipulation of the real and imaginary parts of complex
    samples. The FFT assembly routine suits the instruction set of the C6701,
    which allows simultaneous manipulation of the real and imaginary parts of complex
    samples that are stored as a single 32-bit entity.

    Texas Instruments' ftp-site also makes available IIR filter assembly code. But this code
    works on the assumption that interrupts are disabled before the filter function is called
    [14]. Therefore, the IIR filter module in this prototype utilizes the C routine provided
    together with the assembly code.


    4.2 Memory Usage

    There are two methods for preparing the memory: static allocation and dynamic
    allocation. Static allocation refers to allocation of memory before the program starts. The
    locations of objects are decided at compile-time. Dynamic allocation, on the other hand,
    refers to allocation and deallocation of storage in arbitrary order, normally determined by
    the choices the user makes, at run-time [15]. Static allocation is typically faster than
    dynamic allocation and is sometimes used in real-time systems for that reason.

    This section discusses how the memory is prepared in the DSP program. It also

    comments on the advantages and disadvantages of internal and external memory access.

    For an algorithm to run efficiently on the C6701, the code and data must reside in the
    DSP's internal program and data memory. If data has to be retrieved from the external
    memory, these transfers can slow down the execution by two to six times [16].

    [Figure 11 diagram; labels: Internal Program Memory, Internal Data Memory, External
    Memory, ALU, 2-6 cycle latency]

    Figure 11: External memory access

    The program code for the DSP program fits entirely in the internal program memory.

    Similarly, flags for communication with the host-PC reside in the internal data memory.

    But only a limited number of signal buffers can be statically allocated in the internal memory. Since the aim of this project is to design a user-friendly and dynamic system,

    memory is dynamically allocated at run time for filter coefficients, for signal processing

    buffers, and for buffers preserving samples in between buffer transfers. After

    compilation, at run-time, the user can make certain choices as far as sampling frequency

    and buffer sizes are concerned.

    To explore the performance of the speed-critical DSP program, two separate DSP applications
    were implemented, though: one utilizing the internal memory and another utilizing the
    external memory. Refer to chapter 5 for details.

    It should be noted that with the C6701, dynamic memory allocation is not possible in the
    internal data memory. Consequently, at the expense of speed (refer to section 5.1), filter
    coefficients and signal buffers reside in an external memory space (SDRAM0). The
    SDRAM devices are always clocked at one-half the CPU rate [10]. In this application the
    DSP core runs at 133 MHz, that is, the SDRAM runs at 66.5 MHz (15 ns). In return, the
    size of the dynamically allocatable memory section (.sysmem), which utilizes the
    SDRAM0 memory, can be extended up to roughly 5 Mbytes if a linker option (heap
    size) is changed. Accordingly, there is practically no upper limit on how many buffers
    can be used in a model.
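    Run-time allocation of signal buffers, as discussed above, can be sketched with the
    standard C library. The functions alloc_buffers and free_buffers are hypothetical
    helpers written for this example. On the target, the heap used by calloc would be the
    .sysmem section, but that placement is decided by the linker and is not visible in
    the code.

```c
#include <stdlib.h>

/* Allocate `count` zero-initialised float buffers of `len` samples each.
 * Returns NULL if any allocation fails. */
float **alloc_buffers(size_t count, size_t len)
{
    float **bufs = calloc(count, sizeof *bufs);
    if (bufs == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        bufs[i] = calloc(len, sizeof **bufs);
        if (bufs[i] == NULL) {
            /* Undo the allocations made so far. */
            while (i > 0)
                free(bufs[--i]);
            free(bufs);
            return NULL;
        }
    }
    return bufs;
}

/* Release everything obtained from alloc_buffers(). */
void free_buffers(float **bufs, size_t count)
{
    if (bufs == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(bufs[i]);
    free(bufs);
}
```

    Because count and len are ordinary run-time values, the user can choose the number of
    buffers and the buffer size after compilation, which is the flexibility the text refers to.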

    The sequence of numbers conveying the order in which to run the various DSP modules
    resides in the external memory as well. The retrieval of a few float values from the
    external memory before running each DSP module did not have any major impact on the
    amount of time required by the various signal processing options.

    Finally, the assembly code for the Infinite Impulse Response (IIR) filter module makes

    use of buffers that must be aligned in certain ways in the memory. The IIR filter utilizes

    opposite (even and odd) double word (64 bit) boundaries to avoid memory bank hits.

    Refer to [17] and [18]. But since this prototype utilizes the C routine provided together
    with the assembly code, no such alignment precautions were taken when the filter buffers were
    dynamically allocated in the external memory.

    4.3 Test Methods

    This section describes the two test methods utilized in the project.

    Initially, in the project, benchmarking was used to estimate the number of clock cycles
    needed for each DSP module.

    Figure 12: Benchmarking. The screen dump illustrates the benchmarking procedure in Code

    Composer Studio.


    The screen dump above illustrates the benchmarking procedure in Code Composer. The
    clock is enabled. Breakpoints and profile points are inserted on and immediately after the
    function call to signal_process(). A return statement is added early in the multiplier
    module to determine the overhead of the signal process function itself. Note the dead
    code in the figure, end = end, which is added to make it possible to set
    breakpoints.

    During the benchmarking procedure, optimization level -o3 and "Speed most critical"
    were chosen under Options in Code Composer Studio. Still, the results obtained from the
    initial measurements may be overly pessimistic, since debugging and full-scale
    optimization cannot be done together [7]. In debugging, information is added to
    enhance the debugging process. In optimization, on the other hand, information is
    minimized or removed to increase code efficiency.

    A slightly different approach was taken to estimate the amount of processing time needed
    for various DSP programs. These tests were conducted from the C++ interface and with
    the DSP programs utilizing both DMA and interrupts. Four different signal-processing
    algorithms (test cases) were studied:

    1 multiplier module

    1 IIR filter

    10 IIR filters (serially)

    1 FFT

    The amount of time required to complete the signal processing for each test case was
    calculated indirectly. The program resides in an infinite loop while it is waiting for the
    next interrupt:

    while(!DONE)

    COUNTER++;

    A counter variable in the infinite loop counts the number of additions performed while
    the CPU is waiting for the next interrupt. When the program leaves the infinite loop to
    service an ISR, a mailbox message communicates the obtained number of additions to the
    PC. By resetting the counter immediately after finishing the signal processing in the ISR,
    the amount of time needed for each test case can be estimated.

    The received number of additions actually corresponds to the amount of time left over
    during one interrupt period. To obtain the amount of time required to signal
    process each test case, the received number of additions must be subtracted from a
    reference value, that is, the maximum number of additions performed during one
    interrupt. This reference value is obtained by commenting out the function call to the signal
    processing function, whereby nothing of value is actually performed during the ISR, apart
    from the saving and the restoring of the contents of registers and flags.


    Further, to obtain the amount of time needed for each algorithm, the maximum number of
    additions was correlated with the amount of time available during one interrupt, as
    calculated in section 3.3.3. After a simple calculation in the C++ interface, the result was
    written to stdout.
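    The "simple calculation" is not spelled out in the text. One plausible form, assuming
    that the idle-loop count is proportional to idle time, is sketched below. The function
    names and the proportionality assumption are illustrative and are not taken from the
    C++ interface.

```c
/* Fraction of the interrupt period spent on signal processing, given the
 * idle-loop count with processing enabled and the reference count obtained
 * with the processing call removed. */
double busy_fraction(unsigned long idle_count, unsigned long reference_count)
{
    return (double)(reference_count - idle_count) / (double)reference_count;
}

/* Processing time in milliseconds for a given buffer period. */
double processing_time_ms(unsigned long idle_count,
                          unsigned long reference_count,
                          double period_ms)
{
    return busy_fraction(idle_count, reference_count) * period_ms;
}
```

    For example, if the reference count is 1000 and 750 additions are counted with a
    module enabled, the module occupies a quarter of the period, or 16 ms of a 64 ms period.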

    To make it possible to compare the different DSP programs, the sampling frequency was
    8 kHz and the buffer size 256 throughout the measurements.


    5 RESULTS

    This chapter evaluates the performance of the system. Initially, the speed performance of
    a few individual algorithms is evaluated. Then, the speed performance of four different
    DSP programs is considered. The programs utilize static memory allocation in the
    internal memory or dynamic memory allocation in the external memory. Then, the
    constraints put on the prototype are accounted for. Finally, future studies are identified.

    5.1 Speed Performance of the System

    This section presents the speed performance of individual algorithms in a DSP program

    utilizing static memory allocation in the internal memory.

    The table below lists the number of clock cycles required to perform signal processing for
    a few of the algorithms in the DSP program. The test utilized benchmarking, as described
    in section 4.3, with the DMA and the interrupt routine disabled. Number of modules (Nof
    Modules) refers to one, two or three algorithms serially:

    Table 1: CLOCK CYCLES

    Nof Modules    1        2        3
    Multiplier     9606     12326    15043
    Delay          9048     11204    14626
    FFT            27259    -        -
    IIR            36223    50905    65663

    Table 1: Number of clock cycles needed for one, two and three modules, including overhead (7014
    clock cycles)

    One clock tick in C6701 is equivalent to 7.5 ns. Table 2 lists the amount of time needed

    for each module.

    Table 2: TIME [ms]

    Nof Modules    1         2         3
    Multiplier     0.0724    0.0923    0.1128
    Delay          0.0678    0.0841    0.1096
    FFT            0.2044    -         -
    IIR            0.271     0.381     0.4924

    Table 2: Time needed for one, two and three modules, including overhead (0.052 ms)

    The multiplier module above utilizes 0.1 % of the amount of time available during one
    interrupt. The IIR filter utilizes 0.4 %. These results were an indication that it might be
    feasible to implement a laboratory system that requires a number of buffer transfers
    between various modules for each algorithm.

    With a buffer size of 256, the signal process function in the program has an overhead of
    about 7014 clock cycles, corresponding to 52 microseconds (7.5 ns/clock cycle times
    7014). This overhead includes the int2float typecasting for left channel data and the
    copying of right channel data (reference channel) to the output buffer.

    5.2 Case Studies

    This section presents the speed performance of one DSP program utilizing internal data memory, and three DSP programs utilizing the external memory.

    These DSP programs were tested:

    Intern: Intern utilizes static allocation of memory. Buffers for signal processing, as

    well as filter coefficients reside in the internal data memory; sampling frequency and

    buffer sizes are set before compilation.

    Extern: Extern utilizes dynamic memory allocation in the external memory for signal

    processing buffers; the user determines sampling frequency and buffer size at run time.

    FixedFs/N: FixedFs/N is the same application as Extern, but sampling frequency and

    buffer size are set before compilation.

    ExtIntern: ExtIntern is the same application as Extern, but memory for filter
    coefficients is not dynamically allocated in the external memory; the filter coefficients
    reside in the internal memory.

    In order to go through with the tests, a few test algorithms were created. The algorithm
    below, where 20 IIR filters are connected serially, is not a useful one; it is just there to
    illustrate the time needed to signal process such a block diagram or a similar
    time-consuming circuit.

    In the table below, None refers to the result when the function call to the signal
    processing unit is removed from the source code and nothing is performed during the
    actual interrupt routine.


    Table 3: % Cycle Usage

    Signal Processing            Intern    Extern    FixedFs/N    ExtIntern
                                 %         %         %            %
    None                         0         0         0            0
    One multiplier module        0.226     1.015     0.929        1.015
    One IIR filter module        0.681     4.725     3.992        4.854
    Ten IIR filters (serially)   5.355     40.263    33.718       41.526
    One FFT module               1.412     2.665     2.101        2.663

    Table 4: TIME [ms]

    Signal Processing            Intern    Extern    FixedFs/N    ExtIntern
                                 Time [ms] Time [ms] Time [ms]    Time [ms]
    None                         0         0         0            0
    One multiplier module        0.144     0.649     0.595        0.650
    One IIR filter module        0.436     3.023     2.555        3.106
    Ten IIR filters (serially)   3.426     25.771    21.577       26.576
    One FFT module               0.903     1.705     1.345        1.705

    As seen from the table above, the amount of time needed for signal processing the
    algorithms varies a lot between the different applications. With 10 IIR filters serially in one
    algorithm, Intern utilizes a little more than 5 % of the available time. Extern utilizes
    slightly more than 40 % of the available time, whereas the extern system with sampling
    frequency and buffer size fixed at the time of compilation uses about 34 % of the
    available time. The extern system became slightly faster when the filter coefficients were
    moved to the external memory, instead of having them reside in the internal memory. It
    might be fair to say that it is not always better to cram the internal data memory full of
    data, instead of reallocating certain data to the external memory.

    The obvious generalization to be made with respect to these results is of course that there
    is a tradeoff between speed and flexibility. If speed is an issue, choose static allocation in
    the internal memory; if flexibility is your main concern, choose dynamic memory
    allocation and the external memory.

    A maximum test was also performed to see how much signal processing the most flexible
    DSP program, Extern, could handle within one interrupt.


    Table 5: MAXIMUM NUMBER OF IIR FILTERS (Extern)

    Signal Processing            %       T [ms]
    20 IIR filters (serially)    80      51.2
    25 IIR filters (serially)    99.8    63.8

    As seen below, this test motivated the constraint on the maximum number of
    IIR filters that should be allowed in one model.

    5.3 Requirements

    Certain knowledge on the part of the user is required to run the laboratory system.
    Primarily, the first version of the prototype has been designed with some constraints put
    on the Simulink model:

    "In" and "Out" blocks define the start point and the end point for the graphical
    model and have to be included. Only one input and one output block are allowed.

    Adders must be connected by both inputs to be handled by the link program.

    As seen from the module library in section 3.1, the program immediately
    interrupts the signal processing if an FFT module is included in the model. The
    result of the FFT is sent to the output.

    The maximum number of IIR filters allowed in the model is 20, each individually
    configured. That is, the filters may have different orders and different parameters.
    Maximum order is 30.

    The maximum number of characters in the numerator and in the denominator for the
    IIR filters is 150 each.

    The maximum number of delay modules is 100, each having an integer delay factor in the
    range 1-10.

    The maximum number of filters and delays in the model is by no means fixed, nor is it an
    upper limit for what the laboratory system might handle. Rather, it is just the way this
    prototype has been set up to work. If required, the user may, with a small modification to
    the source code, increase the number of allowed filters and delays.

    5.4 Limitations

    The major drawback of the prototype is that it cannot handle feedback in graphical
    models. A simple model, such as the one depicted in figure 13, could easily be dealt with, but a
    general solution to the problem of handling feedback models requires an approach where
    every incoming sample is treated separately.

    Figure 13: Feedback model. This model is not handled by the system

    5.5 Conclusion and Future Work

    Though several constraints have been put on the models, the proposed system combines,
    on a small scale, the benefits of Simulink's intuitiveness and user-friendliness with the
    real-time capabilities of a proper DSP implementation. The system allows for intricate
    models to be converted and makes available various signal-processing algorithms, such
    as adders, delays, FFTs, IIR filters and multipliers.

    As for the DSP implementation, there is a trade-off between flexibility and speed. Having
    data reside in the internal memory puts more constraints on the graphical models. In
    return, it allows for a much higher sampling frequency. The first version of the prototype
    utilizes the external memory. Considering that the recursive search algorithm in the link
    program recycles buffers quite effectively, memory allocation in the internal memory is
    an interesting alternative, though.

    Apart from thorough tests, future work is to consider the modification of the C++
    interface and the DSP program to allow for feedback in the graphical models. This
    improvement of the system will require a new design approach, where individual
    processing of each incoming sample replaces the block-wise processing. Once this is
    complete, more modules should be added to the DSP program and to the block library. In
    particular, future work is to consider an efficient and general way of handling up-sampling
    and down-sampling, that is, signal processing algorithms where variable buffer
    lengths are required.


    6 APPENDIXES

    Appendix A: Brief outline for recursive search

    1. Find a line which has the "In"-block as its start block. Follow the line. When the
    end block of that line is found, retrieve its parameter and pick a buffer number.
    Then, store the block in a vector containing the execution order for the DSP
    program.

    2. Follow the line from the new block. If more than one line has its origin at the new
    output, store the block and its output as a "loose end" in a temporary vector. Then
    ignore the rest of the branch and resume the recursive search from (1). If only one
    line has its origin at the block, continue searching for the next block. An iterative
    function call is needed to handle several blocks serially.

    3. If an adder is found, store buffer number and name of adder (Sum, or Sum1, etc.)
    in a temporary adder vector and ignore the rest of the branch. Else, if the adder
    already exists in the temporary adder vector, set "readbuffer" and "writebuffer"
    for the adder, find its parameter and store it in the vector containing the execution
    order for the DSP program.

    4. Repeat the procedure from (1) with "loose end"-blocks and adders as starting
    points instead of "In". Continue until there are no loose ends and the "Out"-block
    is found.

    If possible, reuse buffers.

    A few more tests are included to make things work; refer to source code for details.
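    The rule behind the outline above is that a block, in particular an adder, can only be
    placed in the execution order once all of its inputs have been produced. The sketch
    below enforces the same rule with Kahn's topological sort, which is a different and
    simpler technique than the recursive search of the link program. The Line type, the
    block numbering and the function execution_order are hypothetical, and buffer
    assignment is left out entirely.

```c
#include <stddef.h>

#define MAX_BLOCKS 16

typedef struct {
    int from;  /* index of the source block      */
    int to;    /* index of the destination block */
} Line;

/* Fill `order` with block indices such that every block appears after all
 * blocks feeding it. Returns the number of blocks ordered; a result smaller
 * than n_blocks means that the model contains feedback. */
int execution_order(const Line *lines, int n_lines, int n_blocks, int *order)
{
    int pending[MAX_BLOCKS] = {0}; /* inputs not yet available per block */
    int n_ordered = 0;

    for (int i = 0; i < n_lines; i++)
        pending[lines[i].to]++;

    /* Blocks without inputs (the "In" block) can run first. */
    for (int b = 0; b < n_blocks; b++)
        if (pending[b] == 0)
            order[n_ordered++] = b;

    /* Release successors once all their inputs have been produced. */
    for (int head = 0; head < n_ordered; head++) {
        int b = order[head];
        for (int i = 0; i < n_lines; i++)
            if (lines[i].from == b && --pending[lines[i].to] == 0)
                order[n_ordered++] = lines[i].to;
    }
    return n_ordered;
}
```

    A model with a feedback line never releases the blocks inside the loop, so the
    function returns fewer blocks than the model contains. This mirrors the limitation of
    block-wise processing noted in section 5.4.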


    Appendix B: C6701 Interrupts

    The brief descriptions in appendixes B and C of the Interrupt Service Routine (ISR) and
    the Direct Memory Access (DMA) are based on [6].

    Most microprocessors, including the C6701, have one or more inputs for stopping
    (interrupting) the normal program flow. The interrupt can come from an external or
    internal peripheral, or simply from a special instruction in the program. As mentioned
    earlier, in the laboratory system an interrupt is posted to the processor when an input
    buffer is full of sample data.

    When an interrupt occurs, the CPU finishes the current instruction. Refer to figure 14
    below. Then, the hardware in the processor branches to a fixed address, predefined during
    the construction of the computer. At this address, the programmer has placed an interrupt
    service routine. The program code in the ISR of this application performs signal
    processing on the sample data during the interrupt.

    [Figure 14 diagram: program flow (Instruction 1, Instruction 2, ..., Instruction n) ->
    interrupt occurs -> save contents of registers and flags -> service the interrupt ->
    restore the contents of the registers -> resume original process (Instruction n+1,
    Instruction n+2)]

    Figure 14: Interrupt response procedure


    It should be noted that an interrupt might occur at any time during the program flow; it is

    impossible to predict between what two instructions in the program flow the interrupt

    will occur. Therefore, the user must save the contents of the registers and all flags, before

    servicing the interrupt task. Then, the user must restore the registers and the context of

    the process before the program is allowed to resume its original process. You can think of an ISR as an ordinary function with no arguments and void return value that saves and

    restores the CPU state.
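
    The save, service, restore and resume sequence can be illustrated with a small host-side simulation in C. This is only a sketch and not the code of the laboratory system: the register file cpu_context, the buffer length and the placeholder processing (halving each sample) are assumptions made for illustration. On the target, the TI compiler's interrupt keyword would normally be used on the ISR so that the compiler generates the save and restore code; here the function simulate_interrupt plays that role.

```c
#include <stddef.h>

#define BUFFER_SIZE 4   /* hypothetical buffer length */
#define NUM_REGS    4   /* hypothetical, far fewer registers than a real CPU */

/* Simulated CPU state that the interrupted program relies on. */
typedef struct {
    int regs[NUM_REGS];
    int flags;
} cpu_context;

cpu_context cpu;

/* Data shared between the main program and the ISR. */
volatile short input_buffer[BUFFER_SIZE];
volatile short output_buffer[BUFFER_SIZE];
volatile unsigned buffers_processed = 0;

/* The ISR: no arguments and no return value. All data is exchanged
   through global variables. */
void buffer_full_isr(void)
{
    size_t i;

    /* The ISR is free to use registers and flags as scratch space. */
    cpu.regs[0] = 12345;
    cpu.flags = 1;

    /* Placeholder signal processing: halve each sample. */
    for (i = 0; i < BUFFER_SIZE; i++) {
        output_buffer[i] = (short)(input_buffer[i] / 2);
    }
    buffers_processed++;
}

/* Stand-in for the interrupt hardware and the save/restore code. */
void simulate_interrupt(void (*isr)(void))
{
    cpu_context saved = cpu;   /* save contents of registers and flags */
    isr();                     /* service the interrupt */
    cpu = saved;               /* restore the registers and flags */
}                              /* returning resumes the original process */
```

    Calling simulate_interrupt(buffer_full_isr) after filling input_buffer leaves cpu exactly as it was before the call, which is the property the interrupted program depends on.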


    Appendix C: Direct Memory Access (DMA)

    Load and store instructions can be used for transferring data from one part of memory to another in a central processing unit (CPU). However, data transfers keep the CPU busy

    and prevent it from performing other tasks. In fact, if the CPU is used for data transfers,

    most of the processor time will be wasted while the processor is waiting for new data to

    arrive.

    A less time-consuming (CPU-time) way of transferring data between internal and

    external memory is to use Direct Memory Access (DMA). The DMA acts as a co-

    processor, which moves data from one part of memory into another without interfering with the CPU. Therefore, the DMA-method leaves the CPU free to perform other tasks. Once the CPU has specified which data transfers are to be carried out, the DMA-unit

    can operate independently.

    C6701 has four DMA channels. Each channel has its own memory-mapped control

    registers that can be set up to move data from one place in memory to another. These

    registers contain information regarding source and destination locations in memory,

    number of transfers, and format of transfers. To avoid memory conflicts when more than

    one DMA channel tries to access the same resource in a given clock cycle, a priority scheme has to be established. In C6701, the four DMA channels have fixed priorities,

    with channel 0 having the highest and channel 3 the lowest priority. In this application,

    DMA channel 2 is programmed to generate an interrupt of the CPU when an input buffer

    is full. DMA channel 1 copies the data to the transmit register of the McBSP. Channel 0

    is used for reset.
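
    The division of work between CPU and DMA, and the fixed priority scheme, can be sketched as a small software model in C. This is an illustration under stated assumptions, not the C6701 register map: the struct dma_channel, its field names and the functions dma_setup, dma_arbitrate and dma_step are hypothetical, and a whole transfer is completed in one step, whereas the real unit arbitrates per clock cycle.

```c
#include <stddef.h>
#include <string.h>

#define NUM_CHANNELS 4

/* Hypothetical software model of one channel's control registers. */
typedef struct {
    const void *source;        /* where to read from */
    void *destination;         /* where to write to */
    size_t transfer_count;     /* number of elements to move */
    size_t element_size;       /* bytes per element (transfer format) */
    int interrupt_when_done;   /* raise a CPU interrupt on completion */
    int pending;               /* non-zero while a transfer is requested */
} dma_channel;

dma_channel channels[NUM_CHANNELS];
int cpu_interrupts_raised = 0;

/* CPU side: write the transfer parameters, then carry on with other work. */
void dma_setup(int ch, const void *src, void *dst,
               size_t count, size_t element_size, int interrupt_when_done)
{
    channels[ch].source = src;
    channels[ch].destination = dst;
    channels[ch].transfer_count = count;
    channels[ch].element_size = element_size;
    channels[ch].interrupt_when_done = interrupt_when_done;
    channels[ch].pending = 1;
}

/* Fixed priorities: channel 0 is highest, channel 3 lowest.
   Returns the winning channel, or -1 if no transfer is pending. */
int dma_arbitrate(void)
{
    int ch;
    for (ch = 0; ch < NUM_CHANNELS; ch++) {
        if (channels[ch].pending) {
            return ch;
        }
    }
    return -1;
}

/* DMA side: serve the winning channel without involving the CPU.
   Returns the channel served, or -1 if there was nothing to do. */
int dma_step(void)
{
    int ch = dma_arbitrate();
    if (ch >= 0) {
        dma_channel *c = &channels[ch];
        memcpy(c->destination, c->source,
               c->transfer_count * c->element_size);
        c->pending = 0;
        if (c->interrupt_when_done) {
            cpu_interrupts_raised++;
        }
    }
    return ch;
}
```

    If channels 1 and 2 both have a transfer pending, dma_step serves channel 1 first because of its higher fixed priority. Channel 2 follows on the next call and, if so configured, raises the interrupt that would start the ISR.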


    Appendix D: Users Guide

    Start the Simulink program from the Matlab prompt:

    >simulink

    Develop a model from the ready-made building blocks in the sl_library, found in the

    folder C:\BRIDGE_PROJECT. Choose parameters for the building blocks, by double

    clicking on the blocks. Store the model as sl_model.mdl in the folder

    C:\BRIDGE_PROJECT.

    To start the conversion program, execute the file LINKPROGRAM.EXE in the folder C:\BRIDGE_PROJECT\cpp. You can either double click on the file in Windows Explorer, or you can create a shortcut to the program and move it to your desktop area.

    You may also execute the program from the Matlab prompt by typing:

    >linkprogram

    At start up, the visual control panel will appear and prompt you to enter sampling

    frequency and buffer size for the DSP program. Once a valid frequency and buffer sizehas been entered (you will be guided by the program) and the Enter key pressed, the

    DSP program is launched automatically.

    If overload occurs in the DSP program, a warning will appear before the program is

    halted. If an overload warning appears, firstly check your input level (Maximum Signal

    Level: 6 Vpp, 2.1 Vrms). Secondly, check your multiplier factors.
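
    The two figures for the maximum signal level agree with each other if the input is assumed to be a pure sine wave (an assumption made here for illustration; other waveforms have a different ratio between peak-to-peak and RMS values):

```latex
V_{\text{peak}} = \frac{V_{\text{pp}}}{2} = \frac{6\ \text{V}}{2} = 3\ \text{V},
\qquad
V_{\text{rms}} = \frac{V_{\text{peak}}}{\sqrt{2}} = \frac{3\ \text{V}}{\sqrt{2}} \approx 2.1\ \text{V}
```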

    To stop the signal-processing, press any key.

    Important Note:

    If you recompile the program and want to run the new version from outside the Microsoft

    Visual C++ program, i.e. from Windows Explorer or from the Matlab prompt, the

    LINKPROGRAM.EXE file has to be copied from folder ..\cpp\Debug into folder ..\cpp.


    7 REFERENCES

    [1] The MathWorks Inc., www.mathworks.com/products/

    [2] A Matlab to VHDL Conversion Toolbox for Digital Control, I.A. Grout, K. Keane, Department of Electronic and Computer Engineering, University of Limerick,

    Limerick, Ireland, 2000, www.ece.ul.ie/homepage/ian_grout/paper1.pdf

    [3] Simulink/Matlab-to-VHDL Route for Full-Custom/FPGA Rapid Prototyping of

    DSP Algorithms, Artur Krukowski, Izzet Kale, University of Westminster, United

    Kingdom, November 1999, www.cmsa.wmin.ac.uk/~artur/papers/Paper18.pdf

    [4] www.mathworks.com/products/controldesign/cgrp.shtml

    [5] Definition from Cambridge International Dictionary of English or Concise Oxford

    English Dictionary, www.ordboken.nu

    [6] Digital Signal Processing Implementation using the TMS320C6000 DSP Platform,

    Naim Dahnoun, Prentice Hall, 2000.

    [7] C6x-Based Digital Signal Processing, Nasser Kehtarnavaz, Burc Simsek, Prentice

    Hall 2000.

    [8] A C Test: The 0x10 Best Questions for would be Embedded Programmers,

    Nigel Jones, www.embedded.com/2000/0005/0005feat2.htm

    [9] The MDL File Format, Cornell University Program of Computer Graphics,

    Ithaca, New York, May 1998, www.graphics.cornell.edu/online/formats/mdl/

    [10] TMS 320C6201/6701 Evaluation Module, Technical Reference, Texas Instruments

    [11] Department of Signals, Sensors and Systems, Royal Institute of Technology (KTH),

    Stockholm, Sweden, www.s3.kth.se

    [12] Comparing the C4x and the C6x, Mark Siggins, Johan Thie, Horizon

    Technologies, Oslo, Norway, www.horizon-tech.fr/articles/dsp/hunt/compare.htm

    [13] TMS320C6000 Code Generation Tools Online Documentation (SPRH014E)

    (c)1998-2000 Texas Instruments Incorporated

    [14] Texas Instruments, TMS320C67x DSP Software Support Files,

    www-k.ext.ti.com/sc/technical-support/tools/dsp/ftp/c67x.htm

    [15] The Memory Management Glossary, Ravenbrook Limited, Cambridge,

    www.memorymanagement.org/glossary/

    [16] An Approach for Quick Development of High Performance Telecom Applications

    on the TI TMS320C620X DSP, Manish Kasliwal, RadiSys Corporation,

    http://www.radisys.com/files/Task_article_euromagazine.pdf

    [17] The TMS 320C6X Optimizing C Compilers Users Guide (SPRU 187), Texas

    Instruments

    [18] The TMS320C62x/C67x CPU and Instruction Set Reference Guide (SPRU189), Texas Instruments

    [19] Run-Time Debugging with Microsoft Visual Studio and Rational Purify, Goran Begic,

    www.therationaledge.com/content/apr_01/t_debug_gb.html

