Kelayakan Lab DSP Thesis


MASTER THESIS

A Prototype Laboratory Environment for Digital Signal Processing

Using Simulink and a Texas Instrument DSP Device

Calle Gustavsson

March 2002

IR-SB-EX-0207

Department of Signals, Sensors & Systems,

Royal Institute of Technology



Abstract

Normally, when a model is designed from building blocks in Simulink, the simulation is performed within the Simulink environment. A test of the design in a real-time

environment requires that source code is generated, compiled and downloaded to the

target hardware. As a first attempt to bridge this software gap, this thesis describes and

evaluates a prototype laboratory environment, which directly links Simulink to a Texas

Instruments DSP device. The prototype system converts graphical models and makes

available various real-time signal processing algorithms, such as adders, delays, FFTs,

IIR filters and multipliers. Future work is to consider modification of the prototype to

allow for feedback in the graphical models and to find an efficient way of handling signal processing algorithms where variable buffer lengths are required.

Acknowledgements

I would like to thank Ph.D. Mats Bengtsson for his encouraging comments on my early

work and for his excellent guidance throughout the project. Personally, I gained a lot of

practical software experience and valuable insights into signal processing. Also, I would

like to thank Ph.D. Student Georg Jrngren for good advice concerning DSP issues.



CONTENTS

TABLE OF FIGURES

1 INTRODUCTION
1.1 BACKGROUND
1.2 THE PROBLEM
1.3 THE OBJECTIVE OF THE PROJECT
1.4 EARLIER WORK

2 SYSTEM OVERVIEW
2.1 SYSTEM OVERVIEW
2.2 COMMENTS ON THE SYSTEM OVERVIEW
2.3 PROGRAMMING METHODOLOGY
2.3.1 Block-wise Processing
2.3.2 The Sampling Rate
2.4 EQUIPMENT

3 THE MAIN PARTS OF THE SYSTEM
3.1 THE GRAPHICAL INTERFACE
3.1.1 The Model Description Language/List-File
3.2 THE LINK PROGRAM
3.2.1 Class Diagram
3.3 THE DSP PROGRAM
3.3.1 Hardware
3.3.2 DMA and Signal Processing
3.3.3 The Amount of Time Available for Signal Processing
3.3.4 Buffer Handling in the Signal Processing Unit
3.4 MAIN LOOP OF THE DSP PROGRAM

4 OPTIMIZATION AND TEST METHOD
4.1 OPTIMIZATION
4.1.1 C or Assembly?
4.2 MEMORY USAGE
4.3 TEST METHODS

5 RESULTS
5.1 SPEED PERFORMANCE OF THE SYSTEM
5.2 CASE STUDIES
5.3 REQUIREMENTS
5.4 LIMITATIONS
5.5 CONCLUSION AND FUTURE WORK

6 APPENDIXES
APPENDIX A: BRIEF OUTLINE FOR RECURSIVE SEARCH
APPENDIX B: C6701 INTERRUPTS
APPENDIX C: DIRECT MEMORY ACCESS (DMA)
APPENDIX D: USERS GUIDE

7 REFERENCES



TABLE of FIGURES

FIGURE 1: PROTOTYPE LABORATORY SYSTEM OVERVIEW
FIGURE 2: MODULE LIBRARY FOR THE GRAPHICAL INTERFACE
FIGURE 3: A BASIC GRAPHICAL MODEL AND AN EXTRACT FROM ITS .MDL-FILE
FIGURE 4: CLASS DIAGRAM. THE MOST IMPORTANT CLASSES IN THE LINK PROGRAM AND THEIR ASSOCIATIONS
FIGURE 5: THIRTEEN-BLOCK MODEL. THE RECURSIVE ALGORITHM IN THE LINK PROGRAM RECYCLES BUFFERS
FIGURE 6: EXECUTION ORDER. THE LINK PROGRAM WRITES DATA TO THE EXTERNAL MEMORY OF THE DSP PROGRAM TO INFORM THE DSP PROGRAM WHAT MODULE TO RUN AND WHAT BUFFERS TO USE
FIGURE 7: OVERVIEW OF A FEW OF THE PERIPHERALS ON THE EVM BOARD
FIGURE 8: DSP PROGRAM OVERVIEW
FIGURE 9: MEMORY STRUCTURE. SAMPLES FOR THE DELAY MODULE AND THE IIR FILTER ARE PRESERVED IN A SEPARATE MEMORY AREA
FIGURE 10: PSEUDO CODE FOR THE MAIN LOOP IN THE ISR
FIGURE 11: EXTERNAL MEMORY ACCESS
FIGURE 12: BENCHMARKING. THE SCREEN DUMP ILLUSTRATES THE BENCHMARKING PROCEDURE IN CODE COMPOSER STUDIO.
FIGURE 13: FEEDBACK MODEL. THIS MODEL IS NOT HANDLED BY THE SYSTEM
FIGURE 14: INTERRUPT RESPONSE PROCEDURE



1 INTRODUCTION

This Master Thesis tests the feasibility of an idea through the design of a prototype laboratory environment for digital signal processing. The thesis begins with an

introduction to the problem of controlling digital signal processors (DSPs). In particular, the first chapter discusses the problem of directly linking a high-level tool for modeling

and simulation, Simulink (The MathWorks Inc. [1]), to a DSP. The second chapter presents an overview of the proposed system and the subsequent chapter describes its

three parts: the graphical interface, the link program and the DSP implementation.

Chapter four discusses optimization, memory usage, and the test methods utilized.

Finally, the last chapter evaluates the performance of the system and identifies future

work.

1.1 Background

There are many different ways of controlling a digital signal processor:

1. By writing a computer program in a DSP-compatible language, for example C,

and then compiling and downloading it directly to the target hardware.

2. By describing a system graphically using Simulink, and then generating C code

using special toolboxes sold by The MathWorks. The generated code is more or less

ready for compilation.

3. By developing special programs for DSP-control. You run the programs from the

MATLAB-prompt or from the DOS-prompt. Each program starts a specific

algorithm and parameter settings are possible before the function call.

Such programs form a vital part of the current laboratory exercise in the course Digital

Signal Processing at the Royal Institute of Technology.

The first alternative above is the most general, and the third the least general. The idea

behind this project is to explore a kind of DSP-control, which fits in between the second

and the third alternative as far as flexibility is concerned. To avoid the complication of

having to compile source code, the real-time application is implemented as a DSP-

program, controlled via parameter settings. To create a pedagogically sound and intuitive

laboratory environment, the proposed system has a graphical user interface (similar to Simulink), where a number of building blocks can be connected in a block diagram.

1.2 The Problem

Primarily, this project considers the linking problem between Simulink and a Texas

Instruments DSP device. Normally, when a model is developed from building blocks in Simulink, the simulation is performed within the Simulink environment. A test of the

design in a real-time environment requires that source code is generated, compiled and

downloaded to the target hardware.



Accordingly, a direct conversion of a Simulink model to some kind of parameter-table

representation, which a DSP-program can interpret, involves a number of problems. To

begin with, the analysis of a Simulink block diagram, which describes a signal processing

system, consists of these tasks:

Determining the building blocks (modules) of the graphical system and retrieving their parameters.

Identifying the input and output ports of the system.

Determining the connections between the modules.

Determining the execution order of the modules in the graphical model.

After the analysis of the graphical model these tasks remain:

Conveying the results from the analysis of the model to a DSP program.

Creating the DSP-program and a toolbox library.

1.3 The Objective of the Project

The objective of this project is to develop a prototype laboratory environment for digital

signal processing.

The implementation is divided into three parts:

1) A drawing board, which displays a model of the desired digital signal processing system. By double-clicking on a building block on the drawing board, parameter

values should be adjustable. A pre-defined limited toolbox library should contain

modules, such as FFT, linear filter, delay, multiplier and adder.

In digital signal processing the system typically performs some kind of filtering to extract information from a signal.

2) A Link-program, which interprets and converts the graphical representation of the model, its component parts and their interconnections, to some kind of

parameter-table-representation. The table defines what building blocks and

what parameter values are used, and how the input and output ports are

interconnected.

3) A DSP-program, which interprets the parameter table description of the system,

and performs simulation, analysis and possible visualization. Part of the DSP-

program is a toolbox library consisting of the modules listed under

implementation, part 1.

The typical sampling rate of the system should be at least 8 kHz. It should be possible to

use sound (voice) as the input signal to the analog-to-digital converter (ADC) on the

evaluation module board (EVM Board); refer to figure 1, System Overview, below.



1.4 Earlier Work

The attempt to directly bridge the software gap between Simulink and a DSP device is, to the author's knowledge, a new design approach. However, several successful approaches have been taken to convert Simulink models to VHDL code. Grout and Keane [2], for

example, describe the development of a software toolbox that can analyze a Simulink

block model in order to produce a VHDL representation of the model. The resulting data

from the toolbox is a model description language/list file (.mdl-file) for the complete

system, and a second model file that can be processed to create the VHDL code.

Similarly, Krukowski and Kale [3] outline the direct mapping of a Simulink structure into

one described in VHDL by generating a VHDL equivalent model.

Further, MATLAB's Real-Time Workshop [4] allows for C code generation directly from Simulink models. By combining such code generation tools with real-time systems

hardware, it is possible to simulate and analyze signal processing designs in real time. In

this paper, however, code generation is not considered.



2 SYSTEM OVERVIEW

Chapter 2 presents an overview of the implemented laboratory system.

2.1 System Overview

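Figure 1: Prototype Laboratory System Overview

[Figure: three levels. High level (block diagram): the Simulink model, from which the Link program is started (>>RUN LINK_PROGRAM). Medium level (C++ interface): the Link program, with get_param() and create table(). Low level (Assembler, C; real-time application): the DSP program, Dsp(vector* table) with one case per module such as the adder, running on the EVM module with ADC/DAC input and output.]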



2.2 Comments on the System Overview

The process begins in Simulink - refer to figure 1 above. You develop a model of a

system on a drawing board in Simulink, using ready-made modules from a toolbox

library. Parameter settings can be changed by right-clicking on the building blocks.

You run the Link program from the prompt in MATLAB, or by double clicking on the

file link.exe in the Windows Explorer. The Link program converts the graphical

representation to a kind of parameter table-representation of the model. The Link

program automatically launches the DSP-program, which interprets the parameter table and starts the simulation in real-time by calling various DSP toolbox modules.

Finally, a loudspeaker or an oscilloscope connected to the digital-to-analog converter on

the Evaluation Module (EVM Board) conveys the result of the simulation.

2.3 Programming Methodology

A prototype is, according to [5]:

An original thing in relation to a copy, imitation, representation, later specimen, improved form etc., a trial model, a preliminary version

This prototype laboratory system was developed quickly. It has limited functionality -

time was not wasted on details. Nor is it intended to be complete or accurate in all details.

However, the project aims at designing a dynamic system, in the sense that it should be easy to expand and modify the system. Emphasis has therefore been laid on making the C

and C++ code easy to understand. Comments on all functions are incorporated in the

source code and the programs are divided into modules; it should not be a nightmare to

improve the system and to add new DSP-modules.

The three parts of the laboratory system were developed and tested separately. The

toolbox library for the graphical interface was developed from ready-made building

blocks in Simulink, whereas Microsoft Visual C++ 6.0 was the essential tool for writing

the link program.

Code Composer Studio and the TMS320C6701 DSP platform were used for the initial design of the DSP program. As for literature, [6] and [7] served as a starting point for the

design of the DSP modules. Also, [8] was at times used for inspiration.



2.3.1 Block-wise Processing

In order to simplify the first version of the conversion program, the DSP-program utilizes block-wise processing. It continuously captures N samples first and then performs operations on all N samples, not only when the discrete Fourier transform (DFT) or FFT

is performed, but also during filtering. Operations, such as filtering, can be done on every

incoming sample and do not require that a frame or block of data is available at the time

of processing. However, to avoid having to shift between various buffer operations,

block-wise processing is utilized for filtering too.
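
The idea of block-wise processing can be sketched in C as follows. This is a minimal illustration, not the thesis code: the function names `multiply_block` and `add_block` and their signatures are assumptions made for this example. Each module receives a complete block of N samples and writes a complete block, so every module can share the same buffer interface.

```c
#include <stddef.h>

/* Hypothetical "multiplier" module: scale every sample of one block. */
void multiply_block(const float *in, float *out, size_t n, float factor)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}

/* Hypothetical "adder" module: add two blocks sample by sample. */
void add_block(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Because every module consumes and produces whole blocks, the program that chains them only has to decide which buffer each module reads and which it writes.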

2.3.2 The Sampling Rate

The sampling rate of the system affects the speed requirements on the DSP program. For the system to work properly, the amount of processing time spent on signal processing

the samples in the DSP program has to be shorter than the time it takes to fill an input

buffer. To achieve this goal, economic memory usage, buffer sizes and number of blocks

allowed in a model, are key issues that have been considered.

2.4 Equipment

The following equipment was available for design and implementation:

One PC (Pentium III 500 MHz)

One DSP-card (Texas Instruments C6701)

Code Composer Studio

Matlab 6.0, Release 12

Microsoft Visual C++ 6.0

Function generator

Headphones, adapters



3 THE MAIN PARTS OF THE SYSTEM

This chapter describes the graphical interface, the link program and the DSP program

more extensively.

3.1 The Graphical Interface

The graphical interface utilizes Simulink, a flexible design tool provided by Matlab [1].

Simulink allows efficient testing and verification of signal processing algorithms [1], [3]. In particular, the high-level circuit description makes possible quick changes and

corrections, which would be impractical, if not impossible, to carry out if a low-level tool

were to be used from the start of the design process.

Figure 2: Module Library for the Graphical Interface

As can be seen from figure 2, a few ready-made modules form a module library. It should

be noted that MATLAB's Simulink offers various options and settings for each module. The

proposed interface, on the other hand, has put some constraints on these options. More

often than not, it is possible to adjust only one parameter (Multiplier, Delay). As for the

IIR filter, the numerator and denominator are adjustable. For professional use this limitation is unsatisfactory, but for pedagogical purposes, for example if the interface is

to be used as an introduction to signal processing for university students, this constraint

might not be considered a serious flaw.



Note the FFT-Out module in the module library. If an FFT module is included in a

block diagram, the program immediately interrupts signal processing and the result of the

FFT is sent to the output.

Once the user has described an algorithm by connecting a number of given library modules, the model is stored in a Model Description Language/List-File, which is

described in the next section.

3.1.1 The Model Description Language/List-File

The Model Description Language/List-file (.mdl-file) is a low-tech format file that

defines a syntax for storing simple data in text and binary files [9]. The data is arranged

into chunks.

(informal) A chunk is a part of something, esp. a large part: a chunk of text [5]

There are several different chunk-types. In a graphical model developed in Simulink, the

major chunk-types are blocks and lines. The block sections, which describe the

components of the design model, are stored in alphabetical order, followed by the line

sections, each equivalent to a single wire connector.

Figure 3 shows a basic graphical model of a communication channel, where one branch is

delayed and attenuated. The last part of this model's .mdl-file illustrates the idea of chunks:

Figure 3: A basic graphical model and an extract from its .mdl-file



Block {
  BlockType Inport
  Name "In"
  Position [30, 38, 60, 52]
  Port "1"
  Interpolate on
}
Block {
  BlockType Fcn
  Name "Delay"
  Position [95, 80, 155, 110]
  ForegroundColor "red"
  Expr "2"
}
Block {
  BlockType Fcn
  Name "Multiplier"
  Position [195, 80, 255, 110]
  ForegroundColor "green"
  Expr "0.2"
}
Block {
  BlockType Sum
  Name "Sum"
  Ports [2, 1]
  Position [275, 35, 295, 55]
  ForegroundColor "blue"
  ShowName off
  IconShape "round"
  Inputs "|++"
  SaturateOnIntegerOverflow on
}
Block {
  BlockType Outport
  Name "Out"
  Position [335, 38, 365, 52]
  Port "1"
  OutputWhenDisabled "held"
  InitialOutput "[]"
}
Line {
  SrcBlock "Sum"
  SrcPort 1
  DstBlock "Out"
  DstPort 1
}
Line {
  SrcBlock "Delay"
  SrcPort 1
  DstBlock "Multiplier"
  DstPort 1
}
Line {
  SrcBlock "Multiplier"
  SrcPort 1
  Points [25, 0]
  DstBlock "Sum"
  DstPort 2
}
Line {
  SrcBlock "In"
  SrcPort 1
  Points [5, 0]
  Branch {
    DstBlock "Sum"
    DstPort 1
  }
  Branch {
    Points [0, 50]
    DstBlock "Delay"
    DstPort 1
  }
}
}
}



Each chunk has a keyword telling what type of chunk it is and a sequence of data items,

each of which can be an integer, a float, a string, or a new chunk. Each block section, for

instance, has a string, which identifies what type of block it is, and other strings

identifying its parameters. In the .mdl-file all text constants are enclosed in double

quotes. For the Multiplier-block above, the multiplier is 0.2.

As for the line-chunks, the integers and strings are nested. The line with SrcBlock

"In" has, for example, two branches, ending at the Sum block and the Delay block respectively. Thus, it is possible to branch a line to multiple destination blocks, but Simulink does

not handle two branches ending at the same block without an intermediate adder.

The .mdl-format is simple. The format was intended to create a practical way of storing

models that was fast to read and write, easy to use in programs and reasonably space-efficient [9].
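
Since each chunk starts with its keyword followed by an opening brace, a first pass over the file can be as simple as counting those keywords. The following C sketch shows one way this could look; the function name `count_chunks` and its fixed-size pattern buffer are assumptions made for this example and are not taken from the link program.

```c
#include <stddef.h>
#include <string.h>

/* Count chunks of one type in .mdl text. A chunk is recognized as the
 * keyword followed by " {", at the start of the text or after whitespace,
 * so that "Block {" does not match inside a longer word. */
size_t count_chunks(const char *text, const char *keyword)
{
    char pattern[64];
    size_t klen = strlen(keyword);
    size_t count = 0;
    const char *p = text;

    if (klen + 3 > sizeof pattern) {
        return 0;  /* keyword too long for this sketch */
    }
    memcpy(pattern, keyword, klen);
    memcpy(pattern + klen, " {", 3);  /* also copies the terminating NUL */

    while ((p = strstr(p, pattern)) != NULL) {
        if (p == text || p[-1] == ' ' || p[-1] == '\t' || p[-1] == '\n') {
            ++count;
        }
        p += klen + 2;  /* continue after this match */
    }
    return count;
}
```

Applied to the extract above, `count_chunks(text, "Block")` would report five block chunks and `count_chunks(text, "Line")` four line chunks. A real reader would also have to track brace nesting to find where each chunk ends.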

3.2 The Link Program

This section describes the Link program. Basically, what the Link program does is to open an .mdl-file to retrieve information about the various building blocks and their wire

connections in the graphical model. Then, this information is encoded and conveyed to

the DSP program.

The core of the link program is an iterative and recursive process, which establishes in

what order to read from and write to what buffers.

Whilst the amount of processing time spent on signal processing in the DSP program has

to be shorter than the time it takes to fill an input buffer, there are no speed requirements

on the Link program. The function of the Link program is to convert an .mdl-file to an

execution order for ready-made library DSP modules, and this conversion is not supposed

to be carried out in real time.

The link program may be operated directly from the Matlab prompt, or via the file

link.exe in the Windows Explorer.

3.2.1 Class Diagram

This section describes the most important classes in the link program and how they

interact.



[Class diagram; the recoverable contents are listed below.]

Block: Module, Startpoint, Endpoint, Param (all strings)

Line: Start, End (strings)

Chunk_finder: open_file(), search_mdl_file(), store_in_vec(); block_vec, line_vec, ex_vec (vectors)

Vec_handler: vec_search(), create_module_list(), fill_ex_vec(); branch_flag (integer), dsp_box (dsp_box), encoder (encoder)

Encoder: code_name(), code_param; code table (int2float)

Dsp_box: module, param, readbuffer, writebuffer (integers)

Dsp_ctrl: dsp_init(), dsp_run(), write_module_list(); EVM Board Handle, HPI Handle, Event Handle

Association labels in the diagram: "Works for", "Inits and Supplies" (towards the DSP interface) and "Overload Messages".

Figure 4: Class Diagram. The Most important Classes in the Link Program and their Associations

In the initial conversion stage, class Chunk_finder opens an .mdl-file and reads it in text

mode. Chunk_finder searches the .mdl-file for two kinds of chunks: blocks and lines. It

ignores the rest of the data in the file, whereas the block-chunks and the line-chunks, encapsulated in separate classes, are stored in two separate vectors, together with the data

items defining them.

In the second conversion stage, class Vec_handler takes over. Put simply, Vec_handler combines the elements in the two vectors, which Chunk_finder has filled with blocks and lines, into a new vector, which contains the execution order for the DSP-program. More

specifically, class Vec_handler manages the core of the link program, that is, the

recursive search algorithm, which handles the ordering of input and output buffers for the

different signal processing modules of the graphical model.



Figure 5: Thirteen-block model. The recursive algorithm in the link program recycles buffers

The picture above illustrates how the execution order of the DSP program is established,

and how buffers are reused during the analysis of the graphical model. Block1, block2 and block3, for example, have to write to separate buffers, since buffer 1 is to be read by

an adder later on in the model. When all modules connected to buffer 1 have finished

reading, buffer 1 can be reused. Therefore, buffer 1 is used as the output buffer for the

adder, which reads buffer 2 and buffer 1, and so on. As can be seen above, the recursive search algorithm in Vec_handler always starts and ends on buffer 1 to simplify buffer

handling in the DSP program. Refer to Appendix A for a brief outline of the recursive search algorithm.
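
To make the notion of an execution order concrete, the C sketch below orders the blocks of a feed-forward model so that every block runs after all blocks that feed it. It is not the recursive search of Appendix A: it swaps in a simple iterative dependency count and does not recycle buffers. The type `Wire`, the function `execution_order` and the limit `MAX_BLOCKS` are assumptions made for this example.

```c
#include <stddef.h>

#define MAX_BLOCKS 32  /* arbitrary limit chosen for this sketch */

typedef struct {
    int src;  /* index of the block that drives the wire */
    int dst;  /* index of the block that reads the wire  */
} Wire;

/* Fill `order` with block indices such that each block appears after all
 * blocks feeding it. Indices in `wires` must lie in [0, n_blocks).
 * Returns the number of blocks placed; a value smaller than n_blocks
 * means the model contains a feedback loop. */
size_t execution_order(const Wire *wires, size_t n_wires,
                       size_t n_blocks, int *order)
{
    int pending[MAX_BLOCKS];  /* inputs not yet produced, per block */
    int placed[MAX_BLOCKS];
    size_t count = 0;
    int progress = 1;

    if (n_blocks > MAX_BLOCKS) {
        return 0;
    }
    for (size_t b = 0; b < n_blocks; ++b) {
        pending[b] = 0;
        placed[b] = 0;
    }
    for (size_t w = 0; w < n_wires; ++w) {
        pending[wires[w].dst]++;
    }
    while (progress) {
        progress = 0;
        for (size_t b = 0; b < n_blocks; ++b) {
            if (!placed[b] && pending[b] == 0) {
                placed[b] = 1;
                order[count++] = (int)b;
                /* Outputs of b are now available to its readers. */
                for (size_t w = 0; w < n_wires; ++w) {
                    if (wires[w].src == (int)b) {
                        pending[wires[w].dst]--;
                    }
                }
                progress = 1;
            }
        }
    }
    return count;
}
```

With the model of figure 3 numbered In = 0, Delay = 1, Multiplier = 2, Sum = 3 and Out = 4, the five line chunks yield the order In, Delay, Multiplier, Sum, Out. A model with feedback leaves some blocks unplaced, which matches the limitation that feedback models are not handled by the system.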

Class Encoder works for class Vec_handler. The Encoder converts string-names and

string parameters to float parameters, according to a predefined table. The conversion is

carried out before the names and the parameters are stored in the execution vector.

Once the conversion has been completed, class Dsp_ctrl initiates the DSP-program, and

downloads an executable file to the target hardware. Class Dsp_ctrl supplies the DSP-program with the execution order of the library modules by writing numbers to an

external memory on the EVM Board. Then, this class initializes the C6701 memory space

through the Host Port Interface (HPI), and the external memory registration registers.

When the boot process has ended, the CPU is taken out of reset and starts executing code

from address zero.



The picture below illustrates the list of data the link program writes to the external

memory in the DSP program. Each module has five memory positions at its disposal.

Position 1: Block identifier

Position 2: Parameter, for example, delay factor

Position 3: Buffer to read from

Position 4: Buffer to write to

Position 5: Empty; reserved for future work

Position 6: Block identifier

Etcetera.

[Figure: starting at an offset, the external memory holds one five-position record each for Module 1, Module 2, ..., Module N, followed by a stop signal.]

Figure 6: Execution order. The link program writes data to the external memory of the DSP

program to inform the DSP program what module to run and what buffers to use.

Position number one tells the DSP program what module to run. A float value of 2, for example, identifies the multiplier module, a value of 3 corresponds to the adder module,

a value of 4 to the delay module, and so on, according to a predefined coding table. The

second position in the external memory conveys the parameter chosen for the module. In the case of the multiplier module, that is module 1 in the picture, the second position

identifies the multiplier factor, 10. Positions 3 and 4 identify what buffer to read data from and what buffer to write the processed data to, respectively. Position 5 is reserved

for future work. After the last module in the list, a stop signal is included so that the DSP

program knows where to stop reading the memory. Section 3.4 below describes how the DSP program interprets the data written to the external memory.
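
The record layout described above can be sketched in C as follows. The identifiers 2, 3 and 4 and the five positions per module are taken from the text; everything else is an assumption made for this example, in particular the value used as stop signal (the text does not state it), the struct `ModuleEntry` and the function names.

```c
#include <stddef.h>

enum { RECORD_LEN = 5 };      /* five memory positions per module */

#define ID_MULTIPLIER 2.0f    /* identifiers quoted in the text */
#define ID_ADDER      3.0f
#define ID_DELAY      4.0f
#define STOP_SIGNAL  -1.0f    /* hypothetical value, not given in the text */

typedef struct {
    float id;         /* position 1: block identifier    */
    float param;      /* position 2: module parameter    */
    float read_buf;   /* position 3: buffer to read from */
    float write_buf;  /* position 4: buffer to write to  */
} ModuleEntry;

/* Link side: write one record per module, then the stop signal.
 * Returns the number of floats written, or 0 if `table` is too small. */
size_t encode_module_list(const ModuleEntry *mods, size_t n,
                          float *table, size_t capacity)
{
    size_t need = n * RECORD_LEN + 1;
    if (need > capacity) {
        return 0;
    }
    for (size_t i = 0; i < n; ++i) {
        float *rec = table + i * RECORD_LEN;
        rec[0] = mods[i].id;
        rec[1] = mods[i].param;
        rec[2] = mods[i].read_buf;
        rec[3] = mods[i].write_buf;
        rec[4] = 0.0f;  /* position 5: reserved for future work */
    }
    table[n * RECORD_LEN] = STOP_SIGNAL;
    return need;
}

/* DSP side: step through the records until the stop signal is found. */
size_t count_modules(const float *table)
{
    size_t n = 0;
    while (table[n * RECORD_LEN] != STOP_SIGNAL) {
        ++n;
    }
    return n;
}
```

A fixed record length keeps the reader on the DSP side trivial: it only has to advance five positions at a time and compare the first position of each record with the stop value.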

3.3 The DSP Program

Section 3.3 presents the DSP program utilized by the laboratory system.

3.3.1 Hardware

This section presents the C6701 processor and the Evaluation Module (EVM) board.

The floating point processor C6701 has a peak performance of 1000 million floating

point operations per second and can operate at 167 MHz (6 ns cycle time) [7]. It executes



up to eight 32-bit instructions every cycle. It has 64 KB of internal program memory or

cache, and 64 KB of internal data memory.

The C6x EVM board is a complete DSP system, which provides quad DSP clock support

up to 133 MHz. Apart from the processor chip, the EVM board includes memory, A/D capabilities and PC interfacing components [10], [7]. The peripherals include an External

Memory Interface (EMIF), Direct Memory Access (DMA), Multi-channel Buffered

Serial Ports (McBSP) and Host Port Interface (HPI). The 32-bit EMIF handles the communication with external memory and supports SDRAM, SBSRAM and

asynchronous memories. The DMA has two 32-bit DMA data buses and two 32-bit DMA

address buses. Refer to Appendix C for a description of DMA. The McBSP provides a

high-speed communication link with external devices. The Host Port Interface (HPI) provides a

low-cost interface through which a host processor can directly access the CPU's internal

memory [6].

[Figure: the CPU with the peripherals EMIF, DMA, McBSP and HPI, together with the external memory, the A/D converter (CODEC) and the PC host.]

Figure 7: Overview of a few of the peripherals on the EVM Board

To enable communication with external peripherals, the EVM Board also includes a CD

quality, 16-bit audio interface. The coder/decoder (CODEC) in the interface supports

sample rates from 5.5 kHz to 48 kHz [10]. The CODEC is connected to the C6701

processor through two serial ports. The most time-efficient way of transferring data between the serial ports and the internal memory is to utilize the DMA.

3.3.2 DMA and Signal Processing

The DMA channels and their interaction with the signal processing unit, are described in

this section.

The part of the DSP program that handles the communication with the McBSPs was

developed from a skeleton program, found in [11]. The DSP program makes use of two

DMA channels, each configured to transfer sample data continuously using two blocks (buffers). Refer to figure 8 below. DMA channel two copies data to InBuffer1 or to

InBuffer2. When the input buffer is full, an interrupt is posted to the CPU and the contents of its registers are stored. Refer to Appendix B for a brief description of interrupts.



During each interrupt service routine some signal processing, involving one or more

buffer transfers, has to be completed on the sample data. Once the signal processing is

completed, the processed data is copied to OutBuffer1 or to OutBuffer2. Then, the

procedure starts all over again. DMA channel two copies data from the receive register of

the McBSP, while DMA channel one copies the processed data to the transmit register of the McBSP.

At the end of each block transfer, all necessary registers are restored so that the DMA can perform another block transfer.
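
The alternation between the two input buffers and the two output buffers can be sketched in plain C. This is a software simulation under stated assumptions, not the DSP program: on the real hardware the DMA fills and drains the buffers, whereas here the struct `PingPong`, the function `on_buffer_full`, the tiny `BUF_LEN` and the gain used as a stand-in for the signal processing are all hypothetical.

```c
#include <stddef.h>

#define BUF_LEN 4  /* tiny for illustration; the text uses 512 samples */

typedef struct {
    float in[2][BUF_LEN];   /* stands in for InBuffer1 / InBuffer2   */
    float out[2][BUF_LEN];  /* stands in for OutBuffer1 / OutBuffer2 */
    int   active;           /* index of the input buffer just filled */
} PingPong;

/* Called when the "buffer full" interrupt fires: process the filled
 * input buffer into the matching output buffer, then switch to the
 * other buffer pair. Returns the index of the pair that was processed. */
int on_buffer_full(PingPong *pp, float gain)
{
    int k = pp->active;
    for (size_t i = 0; i < BUF_LEN; ++i) {
        pp->out[k][i] = pp->in[k][i] * gain;
    }
    pp->active = 1 - k;
    return k;
}
```

While one pair is being processed, the other pair is free for the transfers to and from the serial port, which is why the processing has to finish before the next buffer is full.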

[Figure: McBSP receive register -> InBuffer1 / InBuffer2 -> signal processing unit -> OutBuffer1 / OutBuffer2 -> McBSP transmit register; an interrupt is raised when an input buffer is full.]

Figure 8: DSP program overview

3.3.3 The Amount of Time Available for Signal Processing

It should be clear by now that the amount of time spent on signal processing one buffer of

sample data in the DSP program has to be shorter than the time it takes to fill an input

buffer. Thus, it is necessary to consider the amount of time available for the signal

processing to be performed.

The formula for calculating the amount of time available for signal processing was found

in [7]. Given a sampling frequency, $f_s$, of 8 kHz and an EVM frequency, $f_{EVM}$, of 133

MHz, there are 16 625 clock cycles ($f_{EVM}/f_s$) between consecutive samples. If each

input buffer contains 512 samples, where even data is right channel data and odd data is

left channel data, there will be roughly $8.5 \times 10^6$ (512 times 16 625) clock cycles available for signal

processing. A sampling frequency of 44.1 kHz results in roughly $1.5 \times 10^6$ available cycles.

In other words, since one sample corresponds to 0.000125 ($1/f_s$) seconds, it takes 0.064

(512 times 0.000125 s) seconds to fill an input buffer. Within this time all signal processing has to be performed. Consequently, a higher sampling frequency, for example

44.1 kHz, will decrease the available time by a factor of about 5.5.

Both channels are sampled and put into the input buffer. Only one channel's data is

processed during the interrupt service routine though. Naturally, if both channels were to



be processed, the amount of time available for signal processing would decrease by

almost a factor of 2.
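The arithmetic in this section can be reproduced with a few lines of C. The sketch below is only an illustration of the formulas quoted above; the function names are hypothetical and not taken from the prototype.

```c
#include <assert.h>

/* Clock cycles between consecutive samples: f_EVM / f_s. */
static double cycles_per_sample(double f_evm_hz, double fs_hz)
{
    assert(fs_hz > 0.0);
    return f_evm_hz / fs_hz;
}

/* Clock cycles available while one input buffer of n samples fills. */
static double cycles_per_buffer(double f_evm_hz, double fs_hz, unsigned n)
{
    return n * cycles_per_sample(f_evm_hz, fs_hz);
}

/* Time in seconds needed to fill one input buffer of n samples. */
static double buffer_fill_time(double fs_hz, unsigned n)
{
    assert(fs_hz > 0.0);
    return n / fs_hz;
}
```

With the figures used in the text, cycles_per_sample(133e6, 8e3) gives 16 625 and cycles_per_buffer(133e6, 8e3, 512) gives 8 512 000, while buffer_fill_time(8e3, 512) gives 0.064 s.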

3.3.4 Buffer Handling in the Signal Processing Unit

This section describes the buffer handling in the signal processing unit of the DSP

program.

DSP programs, such as the delay or the IIR filter, where the last values of one buffer

constitute the first values in the next buffer, require preservation of buffer values so as

not to cause a gap in between buffers. To allow for more than one delay and one IIR filter

in the model, the samples that are preserved until the next buffer arrives are stored in a

separate memory space, with only an offset in between the different filters or delays. This solution makes it possible to configure each delay and each filter independently of the

other filters and delays in the model. An offset pointer keeps track of the offset to each

filter's start address. The filter coefficients for the different filters are stored in a similar

fashion in a separate memory area, with an offset in between.

[Figure 9 diagram: one contiguous memory area holding the preserved samples for filter 1, filter 2, ..., filter N, each located by its offset.]

Figure 9: Memory structure. Samples for the delay module and the IIR filter are preserved in a

separate memory area
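One way to picture the memory structure of figure 9 is a single pool of floats with a table of offsets. The following C sketch is an assumption about how such a layout could look, not the prototype's actual data structure; StatePool, pool_add, pool_state, pool_save_tail and the two size limits are hypothetical names and values.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UNITS 16   /* hypothetical limit on filters and delays */
#define POOL_SIZE 256  /* hypothetical size of the shared area, in floats */

typedef struct {
    float  pool[POOL_SIZE];        /* preserved samples for all units */
    size_t offset[MAX_UNITS + 1];  /* offset[i] = start of unit i, offset[count] = end */
    size_t count;                  /* number of units registered so far */
} StatePool;

/* Reserve room for n preserved samples; returns the unit index, or -1 if full. */
static int pool_add(StatePool *sp, size_t n)
{
    size_t start = sp->offset[sp->count];
    if (sp->count >= MAX_UNITS || start + n > POOL_SIZE)
        return -1;
    sp->offset[sp->count + 1] = start + n;
    return (int)sp->count++;
}

/* Start address of the preserved samples that belong to unit i. */
static float *pool_state(StatePool *sp, int i)
{
    assert(i >= 0 && (size_t)i < sp->count);
    return &sp->pool[sp->offset[i]];
}

/* Keep the last samples of a processed buffer until the next buffer arrives. */
static void pool_save_tail(StatePool *sp, int i, const float *buf, size_t len)
{
    float *state = pool_state(sp, i);
    size_t n_keep = sp->offset[i + 1] - sp->offset[i];
    assert(len >= n_keep);
    for (size_t k = 0; k < n_keep; k++)
        state[k] = buf[len - n_keep + k];
}
```

Because each unit only knows its own offset and length, a delay of three samples and a filter needing five preserved samples can share the same area without interfering with each other.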

Since the IIR filter output depends both on input and output samples, the buffers used

whilst IIR filtering data are slightly longer than the buffers used in, for example, the

delay program. The length of the input and output buffers is the same as for all the other

modules in the library though. If the modules are to be connected, as they are in a graphical model, the length of the buffers has to be the same.

As seen above, the preservation of a few samples in between buffer transfers is easily

dealt with. More demanding problems crop up when up sampling and down sampling are

considered.

The implementation of up sampling and down sampling requires variable buffer lengths.

If a program resamples the N values in a buffer at a rate L times higher than the input sample rate, by inserting L-1 zeros between consecutive samples, the size of the output



buffer will consequently be L times longer. Similarly, since down sampling involves

discarding a number of consecutive samples following each sample, the result will be a

shorter output buffer. Dexterity in handling buffers is needed to solve these problems on

a general level. Most certainly, the simple program structure of this prototype cannot be

used.
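The change in buffer length can be made concrete with a short C sketch of zero-insertion up sampling and sample-discarding down sampling. The function names are hypothetical, and the sketch ignores the interpolation and anti-aliasing filters that a complete resampler would normally include.

```c
#include <assert.h>
#include <stddef.h>

/* Up sample by L: insert L-1 zeros after every input sample.
 * out must have room for n * L values. Returns the output length. */
static size_t upsample(const float *in, size_t n, unsigned L, float *out)
{
    assert(L >= 1);
    for (size_t i = 0; i < n; i++) {
        out[i * L] = in[i];
        for (unsigned k = 1; k < L; k++)
            out[i * L + k] = 0.0f;
    }
    return n * L;
}

/* Down sample by M: keep one sample, discard the M-1 samples that follow.
 * out must have room for (n + M - 1) / M values. Returns the output length. */
static size_t downsample(const float *in, size_t n, unsigned M, float *out)
{
    size_t m = 0;
    assert(M >= 1);
    for (size_t i = 0; i < n; i += M)
        out[m++] = in[i];
    return m;
}
```

The returned lengths differ from the input length n, which is exactly what a program structure with one fixed buffer size for all modules cannot accommodate.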

3.4 Main Loop of the DSP Program

This section describes how the main loop in the Interrupt Service Routine (ISR) is

designed. The pseudo code below illustrates how the DSP program interprets the

execution order, which the C++ interface writes to the external memory.

The signal process function is called each time the processor interrupts its normal

program flow, that is, when an input buffer is full of data. Initially, in the signal process function, the content of a full input buffer is converted from short integers (16 bits) to

floating point values (32 bits) via a function call to a separate function.

Typecasting of the data that is to be processed is necessary. Whilst all signal processing algorithms are performed on float values, the audio interface on the C6701 EVM Board

delivers 16-bit samples. The speed penalty for using floats instead of shorts

on the C6701 is typically only a factor of 2 [12].
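A minimal sketch of the conversion step is given here in C. It assumes the interleaved layout described in section 3.3.3 (odd indices hold left channel data); the function names are hypothetical, and the saturation on the way back to 16 bits is an assumption added for illustration rather than a documented feature of the prototype.

```c
#include <assert.h>
#include <stddef.h>

/* Convert the left channel samples (odd indices) of an interleaved
 * 16-bit buffer to floats. Returns the number of values written. */
static size_t left_to_float(const short *in, size_t n, float *out)
{
    size_t m = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 1; i < n; i += 2)
        out[m++] = (float)in[i];
    return m;
}

/* Convert a float back to a 16-bit sample, saturating at the range limits
 * (assumed behaviour, chosen to avoid wrap-around on overload). */
static short float_to_short(float x)
{
    if (x > 32767.0f)
        return 32767;
    if (x < -32768.0f)
        return -32768;
    return (short)x;  /* truncates toward zero */
}
```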

Convert short input data to floating point values
Set float pointer to first position in external memory

MAIN LOOP
do
{
    set buffer to read
    set buffer to write
    run signal process:
    switch (module)
    {
        case 2: call multiplier
                break;
        case 5: call IIR filter
                break;
    }
} while (!stop signal);

Signal processing done!

Figure 10: Pseudo code for the main loop in the ISR



Next, a pointer is set to the first position in external memory, where the execution order

for the DSP program resides.

At the outset of the main loop, the buffer to read from and the buffer to write to are

established. Then, one of the various signal processing functions is called. The second pass through the while loop starts reading at the fifth position in the external memory, the third at

the tenth position, and so on. The loop is repeated until a stop signal is found.
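The interpretation of the execution order can be sketched in C as follows. The module numbers 2 and 5 come from figure 10 and the record length of five positions from the text above; everything else is an assumption made for illustration: the record layout {module, read buffer, write buffer, parameter, unused}, the stop code 0, the buffer sizes and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RECORD_LEN 5  /* positions per module, as implied by the text */
#define MOD_STOP   0  /* hypothetical stop code */
#define MOD_MULT   2  /* module numbers taken from figure 10 */
#define MOD_IIR    5
#define BUF_LEN    4  /* hypothetical, kept small for illustration */
#define NUM_BUFS   3

static float work[NUM_BUFS][BUF_LEN];

static void multiplier(const float *src, float *dst, float gain)
{
    for (size_t i = 0; i < BUF_LEN; i++)
        dst[i] = gain * src[i];
}

/* Walk the execution order until the stop code is found.
 * Returns the number of modules that were run. */
static int run_execution_order(const float *order)
{
    const float *p = order;
    int executed = 0;
    while ((int)p[0] != MOD_STOP) {
        int rd = (int)p[1];
        int wr = (int)p[2];
        assert(rd >= 0 && rd < NUM_BUFS && wr >= 0 && wr < NUM_BUFS);
        switch ((int)p[0]) {
        case MOD_MULT:
            multiplier(work[rd], work[wr], p[3]);
            break;
        case MOD_IIR:
            /* IIR filter left out of this sketch */
            break;
        default:
            break;
        }
        executed++;
        p += RECORD_LEN;  /* the next record starts five positions later */
    }
    return executed;
}
```

Storing the order as plain numbers means the host only has to write a new sequence to memory to change the model; the DSP program itself does not need to be recompiled.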



4 OPTIMIZATION AND TEST METHODS

Chapter 4 discusses optimization in general and whether to use C code or assembly code

in particular. It also comments on the preparation of memory in the DSP program and the test methods utilized.

4.1 Optimization

In general, the purpose of optimization is to create the smallest, or the fastest, object code

possible. To achieve these goals, the compiler performs various changes to the assembly code. For example, it eliminates dead code, removes redundant expressions,

optimizes loops, and uses inline functions [19].

Generally, inline means to place something directly in the source code. More specifically, in the words of [13]:

"When an inline function is called, the C/C++ source code for the function is inserted at the point of the call. This is known

as inline function expansion. Inline function expansion is advantageous in short functions for the following reasons:

- It saves the overhead of a function call.

- Once inlined, the optimizer is free to optimize the function in context with the surrounding code."
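As a small illustration of the quoted passage, the C fragment below declares a short function as inline and calls it from a loop. The names scale and scale_buffer are hypothetical, and whether a given compiler actually expands the call depends on its optimization settings.

```c
#include <assert.h>

/* A short function of the kind that benefits from inline expansion. */
static inline float scale(float x, float gain)
{
    return gain * x;
}

/* If the call is expanded inline, the loop body reduces to one multiply
 * and the optimizer can treat it together with the surrounding loop. */
static void scale_buffer(float *buf, int n, float gain)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        buf[i] = scale(buf[i], gain);
}
```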

The compiler in Code Composer Studio allows four levels of optimization: -o0, -o1, -o2,

and -o3. Level -o0 is the lowest level of optimization. At level -o3, the C compiler and

optimizer are claimed by Texas Instruments to generate code that is 80% efficient, that is,

up to 80% of the performance of handwritten assembly is possible. Refer to [17] for details.

4.1.1 C or Assembly?

The DSP program is almost entirely written in C. The user is normally able to use C for all programs and functions [12]. With many functions, the code generated by the C

compiler and optimizer in Code Composer makes full use of the processor's resources

[12]. In such cases, no benefit would come from writing the code in low-level assembly.

However, FFT assembly code was downloaded from Texas Instruments' FTP site [14] to

improve the performance. The FFT is an example of a particularly complicated signal

processing task which utilizes manipulation of the real and imaginary parts of complex samples. The use of the FFT assembly routine suits the instruction set on the C6701,

which allows simultaneous manipulation of the real and imaginary parts of complex

samples that are stored as a single 32-bit entity.

Texas Instruments' FTP site also makes available IIR filter assembly code. But this code

works on the assumption that interrupts are disabled before the filter function is called

[14]. Therefore, the IIR filter module in this prototype utilizes the C routine provided

together with the assembly code.



4.2 Memory Usage

There are two methods for preparing the memory: static allocation and dynamic allocation. Static allocation refers to allocation of memory before the program starts. The locations of objects are decided at compile-time. Dynamic allocation, on the other hand,

refers to allocation and deallocation of storage in arbitrary order, normally determined by

the choices the user makes, at run-time [15]. Static allocation is typically faster than

dynamic allocation and is sometimes used in real-time systems.

This section discusses how the memory is prepared in the DSP program. It also

comments on the advantages and disadvantages of internal and external memory access.

For an algorithm to run efficiently on the C6701, the code and data must reside in the DSP's internal program and data memory. If data has to be retrieved from the external

memory, these transfers can slow down the execution by two to six times [16].

[Figure 11 diagram: the ALU connected to the internal program memory, the internal data memory and the external memory; access to the external memory has a 2-6 cycle latency.]

Figure 11: External memory access

The program code for the DSP program fits entirely in the internal program memory.

Similarly, flags for communication with the host-PC reside in the internal data memory.

However, only a limited number of signal buffers can be statically allocated in the internal memory. Since the aim of this project is to design a user-friendly and dynamic system,

memory is dynamically allocated at run time for filter coefficients, for signal processing

buffers, and for buffers preserving samples in between buffer transfers. After

compilation, at run-time, the user can make certain choices as far as sampling frequency

and buffer sizes are concerned.
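Run-time allocation of signal buffers could look roughly like the C sketch below. It uses the standard library heap on a host; on the DSP the heap would be the .sysmem section discussed in this section. SignalBuffer, alloc_buffers and free_buffers are hypothetical names, not taken from the prototype's source code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    float  *data;
    size_t  len;
} SignalBuffer;

/* Allocate count zero-initialized buffers of len floats each.
 * Returns NULL if any allocation fails; nothing is leaked in that case. */
static SignalBuffer *alloc_buffers(size_t count, size_t len)
{
    SignalBuffer *set;
    assert(count > 0 && len > 0);
    set = calloc(count, sizeof *set);
    if (set == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        set[i].data = calloc(len, sizeof *set[i].data);
        if (set[i].data == NULL) {
            while (i-- > 0)
                free(set[i].data);
            free(set);
            return NULL;
        }
        set[i].len = len;
    }
    return set;
}

/* Release everything obtained from alloc_buffers. */
static void free_buffers(SignalBuffer *set, size_t count)
{
    if (set == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(set[i].data);
    free(set);
}
```

The number of buffers and their length are ordinary function arguments here, which mirrors the idea that the user chooses buffer size after compilation.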

To explore the performance of the speed-critical DSP program, however, two separate DSP applications

were implemented: one utilizing the internal memory and another utilizing the external memory. Refer to chapter 5 for details.

It should be noted that with the C6701, dynamic memory allocation is not possible in the

internal data memory. Consequently, at the expense of speed (refer to section 5.1), filter

coefficients and signal buffers reside in an external memory space (SDRAM0). The

SDRAM devices are always clocked at one-half the CPU rate [10]. In this application the

DSP core runs at 133 MHz, that is, the SDRAM runs at 66.5 MHz (15 ns). In return, the

size of the dynamically allocatable memory section (.sysmem), which utilizes the

SDRAM0 memory, can be extended up to roughly 5 Mbytes, if a linker option (heap



size) is changed. Accordingly, there is almost no upper limit on how many buffers

can be used in a model.

The sequence of numbers conveying the order in which to run the various DSP modules

resides in the external memory as well. The retrieval of a few float values from the external memory before running each DSP module did not have any major impact on the

amount of time required by the various signal processing options.

Finally, the assembly code for the Infinite Impulse Response (IIR) filter module makes

use of buffers that must be aligned in certain ways in the memory. The IIR filter utilizes

opposite (even and odd) double word (64 bit) boundaries to avoid memory bank hits.

Refer to [17] and [18]. But since this prototype utilizes the C routine provided together

with the assembly code, no such precautions were taken when the filter buffers were

dynamically allocated in the external memory.

4.3 Test Methods

This section describes the two test methods utilized in the project.

Initially, in the project, Benchmarking was used to estimate the number of clock cycles

needed for each DSP module.

Figure 12: Benchmarking. The screen dump illustrates the benchmarking procedure in Code

Composer Studio.



The screen dump above illustrates the benchmarking procedure in Code Composer. The

clock is enabled. Breakpoints and Profile points are inserted on and immediately after the

function call to signal_process(). A return statement is added early in the multiplier

module to determine the overhead of the signal process function itself. Note the dead

code in the figure, end = end, which is added to make it possible to set breakpoints.

During the benchmarking procedure, optimization level -o3 and "Speed most critical" were chosen under Options in Code Composer Studio. Still, the results obtained from the

initial measurements may be overly pessimistic, since debugging and full-scale

optimizations cannot be done together [7]. In debugging, information is added to

enhance the debugging process. In optimization, on the other hand, information is

minimized or removed to increase code efficiency.

A slightly different approach was taken to estimate the amount of processing time needed for various DSP programs. These tests were conducted from the C++ interface and with

the DSP programs utilizing both DMA and interrupt. Four different signal-processing

algorithms (test cases) were studied:

- 1 multiplier module
- 1 IIR filter
- 10 IIR filters (serially)
- 1 FFT

The amount of time required to complete the signal processing for each test case was

calculated indirectly. The program resides in an infinite loop while it is waiting for the next interrupt:

while(!DONE)

COUNTER++;

A counter-variable in the infinite loop counts the number of additions performed while

the CPU is waiting for the next interrupt. When the program leaves the infinite loop to

service an ISR, a mailbox message communicates the obtained number of additions to the

PC. By resetting the counter immediately after finishing the signal processing in the ISR, the amount of time needed for each test case can be estimated.

The received number of additions actually corresponds to the amount of time available

during one interrupt in the C++ program. To obtain the amount of time required to signal

process each test case, the received number of additions must be subtracted from a

reference value, that is, the maximum number of additions performed during one

interrupt. This reference value is obtained by commenting out the function call to the signal processing function, whereby nothing of value is actually performed during the ISR, apart from the saving and the restoring of the contents of registers and flags.



Further, to obtain the amount of time needed for each algorithm, the maximum number of

additions was correlated with the amount of time available during one interrupt, as

calculated in section 3.3.3. After a simple calculation in the C++ interface, the result was

written to stdout.
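The indirect calculation can be summarized in a few lines of C. This is a sketch of the arithmetic described above, not the code of the C++ interface; the function names are hypothetical and the counter values in the usage note are invented for illustration.

```c
#include <assert.h>

/* Fraction of one interrupt period spent on signal processing.
 * reference: additions counted when the ISR performs no signal processing
 * counted:   additions counted while the test case is running */
static double busy_fraction(unsigned long reference, unsigned long counted)
{
    assert(reference > 0 && counted <= reference);
    return (double)(reference - counted) / (double)reference;
}

/* Processing time in milliseconds, given the time available per interrupt. */
static double processing_time_ms(unsigned long reference, unsigned long counted,
                                 double available_ms)
{
    return busy_fraction(reference, counted) * available_ms;
}
```

For example, with an invented reference value of 1 000 000 additions, a received count of 950 000 and 64 ms available per interrupt, the estimate is 5 % of the period, or 3.2 ms.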

To make it possible to compare the different DSP programs, the sampling frequency was

8 kHz and the buffer size 256 throughout the measurements.



5 RESULTS

This chapter evaluates the performance of the system. Initially, the speed performance of a few individual algorithms is evaluated. Then, the speed performance of four different

DSP programs is considered. The programs utilize static memory allocation in the internal memory or dynamic memory allocation in the external memory. Then, the

constraints put on the prototype are accounted for. Finally, future studies are identified.

5.1 Speed Performance of the System

This section presents the speed performance of individual algorithms in a DSP program

utilizing static memory allocation in the internal memory.

The table below lists the number of clock cycles required to perform signal processing for

a few of the algorithms in the DSP program. The test utilized benchmarking, as described in section 4.3, with the DMA and the interrupt routine disabled. Number of modules (Nof

Modules) refers to one, two or three algorithms connected serially:

Table 1: CLOCK CYCLES

Nof Modules        1        2        3
Multiplier      9606    12326    15043
Delay           9048    11204    14626
FFT            27259        -        -
IIR            36223    50905    65663

Table 1: Number of clock cycles needed for one, two and three modules, including overhead (7014 clock cycles)

One clock tick on the C6701 is equivalent to 7.5 ns. Table 2 lists the amount of time needed

for each module.

Table 2: TIME [ms]

Nof Modules          1         2         3
Multiplication    0.0724    0.0923    0.1128
Delay             0.0678    0.0841    0.1096
FFT               0.2044       -         -
IIR               0.271     0.381     0.4924

Table 2: Time needed for one, two and three modules, including overhead (0.052 ms)

The multiplier module above utilizes 0.1 % of the amount of time available during one

interrupt. The IIR filter utilizes 0.4 %. These results were an indication that it might be



feasible to implement a laboratory system, which required a number of buffer transfers

between various modules for each algorithm.

With a buffer size of 256, the signal process function in the program has an overhead of

about 7014 clock cycles, corresponding to 52 microseconds (7.5 ns/clock cycle times 7014). This overhead includes the int2float typecasting for left channel data and the

copying of right channel data (reference channel) to the output buffer.
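The conversion from clock cycles to time, and from time to a share of the interrupt period, can be checked with a short C sketch. The 7.5 ns tick is taken from the text; the 64 ms period used in the usage note is the buffer fill time from section 3.3.3 and is assumed here to be the period behind the percentages. The function names are hypothetical.

```c
#include <assert.h>

#define NS_PER_CYCLE 7.5  /* one clock tick on the C6701, as stated in the text */

/* Convert a number of clock cycles to milliseconds. */
static double cycles_to_ms(unsigned long cycles)
{
    return (double)cycles * NS_PER_CYCLE * 1e-6;
}

/* Share of the time available during one interrupt, in percent. */
static double percent_of_period(double time_ms, double period_ms)
{
    assert(period_ms > 0.0);
    return 100.0 * time_ms / period_ms;
}
```

cycles_to_ms(7014) gives about 0.0526 ms, in line with the 52 microseconds of overhead quoted above, and percent_of_period(cycles_to_ms(36223), 64.0) gives about 0.42 %, consistent with the 0.4 % stated for one IIR filter.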

5.2 Case Studies

This section presents the speed performance of one DSP program utilizing internal data memory, and three DSP programs utilizing the external memory.

These DSP programs were tested:

Intern: Intern utilizes static allocation of memory. Buffers for signal processing, as

well as filter coefficients reside in the internal data memory; sampling frequency and

buffer sizes are set before compilation.

Extern: Extern utilizes dynamic memory allocation in the external memory for signal

processing buffers; the user determines sampling frequency and buffer size at run time.

FixedFs/N: FixedFs/N is the same application as Extern, but sampling frequency and

buffer size are set before compilation.

ExtIntern: ExtIntern is the same application as Extern, but memory for filter

coefficients is not dynamically allocated in the external memory; the filter coefficients

reside in the internal memory.

In order to go through with the tests, a few test algorithms were created. The algorithm

below, where 20 IIR filters are connected serially, is not a useful one; it is just there to

illustrate the time needed to signal process such a block diagram or a similarly time-consuming circuit.

In the table below, None refers to the result when the function call to the signal processing unit is removed from the source code and nothing is performed during the

actual interrupt routine.



Table 3: % Cycle Usage

Signal Processing           Intern [%]  Extern [%]  FixedFs/N [%]  ExtIntern [%]
None                        0           0           0              0
One multiplier module       0.226       1.015       0.929          1.015
One IIR filter module       0.681       4.725       3.992          4.854
Ten IIR filters (serially)  5.355       40.263      33.718         41.526
One FFT module              1.412       2.665       2.101          2.663

Table 4: TIME [ms]

Signal Processing           Intern [ms]  Extern [ms]  FixedFs/N [ms]  ExtIntern [ms]
None                        0            0            0               0
One multiplier module       0.144        0.649        0.595           0.650
One IIR filter module       0.436        3.023        2.555           3.106
Ten IIR filters (serially)  3.426        25.771       21.577          26.576
One FFT module              0.903        1.705        1.345           1.705

As seen from the tables above, the amount of time needed for signal processing the

algorithms varies considerably between the different applications. With 10 IIR filters serially in one

algorithm, Intern utilizes a little more than 5 % of the available time. Extern utilizes

slightly more than 40 % of the available time, whereas the extern system with sampling frequency and buffer size fixed at the time of compilation reaches about 34 % of

available time. The extern system became slightly faster when the filter coefficients were

moved to the external memory, instead of having them reside in the internal memory. It

might be fair to say that it is not always better to cram the internal data memory full of data, instead of reallocating certain data to the external memory.

The obvious generalization to be made with respect to these results is of course that there

is a tradeoff between speed and flexibility. If speed is an issue, choose static allocation in

the internal memory; if flexibility is your main concern, choose dynamic memory

allocation and the external memory.

A maximum test was also performed to see how much signal processing the most flexible

DSP program, Extern, could handle within one interrupt.



Table 5: MAXIMUM NUMBER OF IIR_FILTERS (Extern)

Signal Processing            %       T [ms]
20 IIR filters (serially)    80      51.2
25 IIR filters (serially)    99.8    63.8

As seen below, this test motivated the constraint on the maximum number of

IIR filters that should be allowed in one model.

5.3 Requirements

Certain knowledge on the part of the user is required to run the laboratory system.

Primarily, the first version of the prototype has been designed with some constraints put

on the Simulink model:

- "In" and "Out" blocks define the start point and the end point for the graphical model and have to be included. Only one input and one output block are allowed.

- Adders must be connected by both inputs to be handled by the Link program.

- As seen from the module library in section 3.1, the program immediately interrupts the signal processing if an FFT module is included in the model. The result of the FFT is sent to the output.

- The maximum number of IIR filters allowed in the model is 20, each individually configured. That is, the filters may have different orders and different parameters. Maximum order is 30.

- The maximum number of characters in the numerator and the denominator for the IIR filters is 150 each.

- The maximum number of delay modules is 100, each having a delay factor in the range 1-10, integer value.

The maximum number of filters and delays in the model is by no means fixed, nor is it an

upper limit for what the laboratory system might handle. Rather it is just the way this prototype has been set up to work. If required, the user may, with a small modification to

the source code, increase the number of allowed filters and delays.
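Limits of this kind are commonly kept as named constants so that they can be raised in one place. The C sketch below shows one possible arrangement together with a simple check; the constant names, the ModelSummary structure and the function model_within_limits are hypothetical and do not reproduce the prototype's source code.

```c
#include <assert.h>
#include <stddef.h>

/* Limits of the prototype as listed above; raise them here if required. */
#define MAX_IIR_FILTERS   20
#define MAX_IIR_ORDER     30
#define MAX_DELAYS        100
#define MIN_DELAY_FACTOR  1
#define MAX_DELAY_FACTOR  10

typedef struct {
    size_t     n_iir;
    const int *iir_order;     /* n_iir entries */
    size_t     n_delay;
    const int *delay_factor;  /* n_delay entries */
} ModelSummary;

/* Returns 1 if the model respects the limits, 0 otherwise. */
static int model_within_limits(const ModelSummary *m)
{
    assert(m != NULL);
    if (m->n_iir > MAX_IIR_FILTERS || m->n_delay > MAX_DELAYS)
        return 0;
    for (size_t i = 0; i < m->n_iir; i++)
        if (m->iir_order[i] < 0 || m->iir_order[i] > MAX_IIR_ORDER)
            return 0;
    for (size_t i = 0; i < m->n_delay; i++)
        if (m->delay_factor[i] < MIN_DELAY_FACTOR ||
            m->delay_factor[i] > MAX_DELAY_FACTOR)
            return 0;
    return 1;
}
```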

5.4 Limitations

The major drawback of the prototype is that it cannot handle feedback in graphical

models. A simple model, such as the one depicted in figure 13, could easily be dealt with, but a

general solution to the problem of handling feedback models requires an approach where

every incoming sample is treated separately.

Figure 13: Feedback model. This model is not handled by the system
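Why feedback calls for sample-wise processing can be seen from a one-pole loop, y[n] = x[n] + a * y[n-1]: each output sample is needed before the next one can be computed, so a block cannot be handed from module to module as a whole. The C sketch below illustrates this with a hypothetical loop; it is not claimed to be the model of figure 13, and the names FeedbackLoop and feedback_step are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* State of the loop y[n] = x[n] + a * y[n-1]. */
typedef struct {
    float a;       /* feedback gain (hypothetical parameter) */
    float y_prev;  /* previous output, carried from sample to sample */
} FeedbackLoop;

/* Process exactly one incoming sample and update the loop state. */
static float feedback_step(FeedbackLoop *fb, float x)
{
    float y;
    assert(fb != NULL);
    y = x + fb->a * fb->y_prev;
    fb->y_prev = y;
    return y;
}
```

Feeding the impulse 1, 0, 0 through a loop with a = 0.5 produces 1, 0.5, 0.25: every output depends on the one just computed.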

5.5 Conclusion and Future Work

Though several constraints have been put on the models, the proposed system combines,

on a small scale, the benefits of Simulink's intuitiveness and user-friendliness with the

real-time capabilities of a proper DSP implementation. The system allows for intricate models to be converted and makes available various signal processing algorithms, such

as adders, delays, FFTs, IIR filters and multipliers.

As for the DSP implementation, there is a trade-off between flexibility and speed. Having

data reside on the internal memory puts more constraints on the graphical models. In

return, it allows for a much higher sampling frequency. The first version of the prototype

utilizes the external memory. Considering that the recursive search algorithm in the link

program recycles buffers quite effectively, memory allocation in the internal memory is an interesting alternative though.

Apart from thorough tests, future work is to consider the modification of the C++

interface and the DSP program to allow for feedback in the graphical models. This

improvement of the system will require a new design approach, where individual

processing of each incoming sample replaces the block-wise processing. Once this is

complete, more modules should be added to the DSP program and to the block library. In particular, future work is to consider an efficient and general way of handling up-sampling and down-sampling, that is, signal processing algorithms where variable buffer lengths are required.



6 APPENDIXES

Appendix A: Brief outline for recursive search

1. Find a line which has the "In"-block as its start block. Follow the line. When the

end block of that line is found, retrieve its parameter and pick a buffer number.

Then, store the block in a vector containing the execution order for the DSP program.

2. Follow the line from the new block. If more than one line has its origin at the new

output, store the block and its output as a "loose end" in a temporary vector. Then

ignore the rest of the branch and resume the recursive search from (1). If only one

line has its origin at the block, continue searching for the next block. An iterative function call is needed to handle several blocks serially.

3. If an adder is found, store buffer number and name of adder (Sum, or Sum1, etc)

in a temporary adder vector and ignore the rest of the branch. Else, if the adder

already exists in the temporary adder vector, set readbuffer and writebuffer

for the adder, find its parameter and store it in the vector containing the execution

order for the DSP program.

4. Repeat the procedure from (1) with "loose end" blocks and adders as starting

points instead of "In". Continue until there are no loose ends and the "Out"-block

is found.

If possible, reuse buffers.

A few more tests are included to make things work; refer to the source code for details.
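The goal of the search, an execution order in which every block runs after the blocks that feed it, can be illustrated with a much simpler technique than the recursive search outlined above. The C sketch below uses a plain repeated scan (a basic topological ordering) and is explicitly not the algorithm of the Link program; the Block structure, the limits and the function name are hypothetical, and buffer numbering and reuse are left out.

```c
#include <assert.h>

#define MAX_BLOCKS 16
#define MAX_INPUTS 2  /* adders have two inputs, other blocks at most one */

typedef struct {
    int n_inputs;
    int input[MAX_INPUTS];  /* indices of the blocks feeding this block */
} Block;

/* Fill order[] so that every block appears after the blocks that feed it.
 * Returns the number of blocks scheduled; this is less than n if the
 * model contains a feedback loop, which this scheme cannot resolve. */
static int execution_order(const Block *blocks, int n, int *order)
{
    int done[MAX_BLOCKS] = {0};
    int scheduled = 0;
    int progress = 1;
    assert(n >= 0 && n <= MAX_BLOCKS);
    while (progress) {
        progress = 0;
        for (int b = 0; b < n; b++) {
            int ready = 1;
            if (done[b])
                continue;
            for (int k = 0; k < blocks[b].n_inputs; k++)
                if (!done[blocks[b].input[k]])
                    ready = 0;
            if (ready) {
                done[b] = 1;
                order[scheduled++] = b;
                progress = 1;
            }
        }
    }
    return scheduled;
}
```

An adder is scheduled only once both of its inputs have been produced, which corresponds to step 3 of the outline, and a model with feedback leaves blocks unscheduled, which is the limitation discussed in section 5.4.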



Appendix B: C6701 Interrupts

The brief descriptions in appendixes B and C of the Interrupt Service Routine (ISR) and the Direct Memory Access (DMA) are based on [6].

Most microprocessors, including the C6701, have one or more inputs for stopping

(interrupting) the normal program flow. The interrupt can come from an external or

internal peripheral, or simply from a special instruction in the program. As mentioned

earlier, in the laboratory system an interrupt is posted to the processor when an input

buffer is full of sample data.

When an interrupt occurs, the CPU finishes the current instruction. Refer to figure 14 below. Then, the hardware in the processor branches to a fixed address, predefined during the construction of the computer. At this address, the programmer has placed an interrupt

service routine. The program code in the ISR of this application performs signal-

processing on the sample data during the interrupt.

[Figure 14 diagram: the program flow executes Instruction 1, Instruction 2, ..., Instruction n; an interrupt occurs; the contents of registers and flags are saved; the interrupt is serviced; the contents of the registers are restored; the original process resumes with Instruction n+1, Instruction n+2.]

Figure 14: Interrupt response procedure



It should be noted that an interrupt might occur at any time during the program flow; it is

impossible to predict between what two instructions in the program flow the interrupt

will occur. Therefore, the user must save the contents of the registers and all flags, before

servicing the interrupt task. Then, the user must restore the registers and the context of

the process before the program is allowed to resume its original process. You can think of an ISR as an ordinary function with no arguments and void return value that saves and

restores the CPU state.



Appendix C: Direct Memory Access (DMA)

Load and store instructions can be used for transferring data from one part of memory to another in a central processing unit (CPU). However, data transfers keep the CPU busy

and prevent it from performing other tasks. In fact, if the CPU is used for data transfers,

most of the processor time will be wasted while the processor is waiting for new data to

arrive.

A less time-consuming (CPU-time) way of transferring data between internal and

external memory is to use Direct Memory Access (DMA). The DMA acts as a co-processor, which moves data from one part of memory into another without interfering with the CPU. Therefore, the DMA method leaves the CPU free to perform other tasks. Once the CPU has specified which data transfers are to be carried out, the DMA unit

can operate independently.

The C6701 has four DMA channels. Each channel has its own memory-mapped control

registers that can be set up to move data from one place in memory to another. These

registers contain information regarding source and destination locations in memory,

number of transfers, and format of transfers. To avoid memory conflicts when more than

one DMA channel tries to access the same resource in a given clock cycle, a priority scheme has to be established. In the C6701, the four DMA channels have fixed priorities,

with channel 0 having the highest and channel 3 the lowest priority. In this application,

DMA channel 2 is programmed to interrupt the CPU when an input buffer

is full. DMA channel 1 copies the data to the transmit register of the McBSP. Channel 0

is used for reset.



Appendix D: User's Guide

Start the Simulink program from the Matlab prompt:

>simulink

Develop a model from the ready-made building blocks in the sl_library, found in the

folder C:\BRIDGE_PROJECT. Choose parameters for the building blocks by double-clicking

on the blocks. Store the model as sl_model.mdl in the folder

C:\BRIDGE_PROJECT.

To start the conversion program, execute the file LINKPROGRAM.EXE in the folder C:\BRIDGE_PROJECT\cpp. You can either double-click on the file in Windows Explorer, or you can create a shortcut to the program and move it to your desktop area.

You may also execute the program from the Matlab prompt by typing:

>linkprogram

At start up, the visual control panel will appear and prompt you to enter sampling

frequency and buffer size for the DSP program. Once a valid frequency and buffer size have been entered (you will be guided by the program) and the Enter key pressed, the

DSP program is launched automatically.

If overload occurs in the DSP program, a warning will appear before the program is

halted. If an overload warning appears, first check your input level (Maximum Signal

Level: 6 Vpp, 2.1 Vrms). Second, check your multiplier factors.

To stop the signal-processing, press any key.

Important Note:

If you recompile the program and want to run the new version from outside the Microsoft

Visual C++ program, i.e. from Windows Explorer or from the Matlab prompt, the

LINKPROGRAM.EXE file has to be copied from folder ..\cpp\Debug into folder ..\cpp.



7 REFERENCES

[1] The MathWorks Inc., www.mathworks.com/products/

[2] A Matlab to VHDL Conversion Toolbox for Digital Control, I.A. Grout, K. Keane, Department of Electronic and Computer Engineering, University of Limerick, Limerick, Ireland, 2000, www.ece.ul.ie/homepage/ian_grout/paper1.pdf

[3] Simulink/Matlab-to-VHDL Route for Full-Custom/FPGA Rapid Prototyping of DSP Algorithms, Artur Krukowski, Izzet Kale, University of Westminster, United Kingdom, November 1999, www.cmsa.wmin.ac.uk/~artur/papers/Paper18.pdf

[4] www.mathworks.com/products/controldesign/cgrp.shtml

[5] Definition from Cambridge International Dictionary of English or Concise Oxford English Dictionary, www.ordboken.nu

[6] Digital Signal Processing Implementation using the TMS320C6000 DSP Platform, Naim Dahnoun, Prentice Hall, 2000.

[7] C6x-Based Digital Signal Processing, Nasser Kehtarnavaz, Burc Simsek, Prentice Hall, 2000.

[8] A C Test: The 0x10 Best Questions for Would-be Embedded Programmers, Nigel Jones, www.embedded.com/2000/0005/0005feat2.htm

[9] The MDL File Format, Cornell University Program of Computer Graphics, Ithaca, New York, May 1998, www.graphics.cornell.edu/online/formats/mdl/

[10] TMS320C6201/6701 Evaluation Module, Technical Reference, Texas Instruments

[11] Department of Signals, Sensors and Systems, Royal Institute of Technology (KTH), Stockholm, Sweden, www.s3.kth.se

[12] Comparing the C4x and the C6x, Mark Siggins, Johan Thie, Horizon Technologies, Oslo, Norway, www.horizon-tech.fr/articles/dsp/hunt/compare.htm

[13] TMS320C6000 Code Generation Tools Online Documentation (SPRH014E), (c) 1998-2000 Texas Instruments Incorporated

[14] Texas Instruments, TMS320C67x DSP Software Support Files, www-k.ext.ti.com/sc/technical-support/tools/dsp/ftp/c67x.htm

[15] The Memory Management Glossary, Ravenbrook Limited, Cambridge, www.memorymanagement.org/glossary/

[16] An Approach for Quick Development of High Performance Telecom Applications on the TI TMS320C620X DSP, Manish Kasliwal, RadiSys Corporation, http://www.radisys.com/files/Task_article_euromagazine.pdf

[17] The TMS320C6X Optimizing C Compiler User's Guide (SPRU187), Texas Instruments

[18] The TMS320C62x/C67x CPU and Instruction Set Reference Guide (SPRU189), Texas Instruments

[19] Run-Time Debugging with Microsoft Visual Studio and Rational Purify, Goran Begic, www.therationaledge.com/content/apr_01/t_debug_gb.html
