EECS 452 – Lecture 23, University of Michigan

Today: TI MSP430 and Piccolo.

Handouts: printed copy of today’s lecture slides

Read: about DSP!

References:

Last one out should close the lab door!!!!

Please keep the lab clean and organized.

Where a calculator on the ENIAC is equipped with 18,000 vacuum tubes and weighs 30 tons, computers in the future may have only 1,000 vacuum tubes and perhaps weigh 1.5 tons. – Popular Mechanics, March 1949

EECS 452 – Winter 2010 Lecture 23 – Page 1/62 Friday – March 12, 2010

Actually . . .

Actually there were 18800 vacuum tubes and of those 6550

were 6SN7s.

The 6SN7 was/is a dual triode and was used to implement the

20 digit signed decimal accumulators. By not turning off the

power to ENIAC the average failure rate was 1 tube about every

two days. The longest up period was 116 hours.

A portion of ENIAC is located in the lobby of the CSE building.

The tubes that you see are very likely 6SN7s.

ENIAC’s active lifetime was 9 years, 1947–1955.


Overview of today’s lecture

Unfortunately, likely to be fragmented and rambling.

◮ Comments on single supply operation.

◮ The MSP430

◮ Multiplying without a multiplier.

◮ An IIR filter for the MSP430

◮ The MSP430 SPI interface.

◮ Linking MSP430 SPI to C5505 I2S.

◮ The TI Piccolo


Thinking about single supply operation

[Figure: supply arrangements compared: split supplies of +V/2 and −V/2 about ground, and a single +V supply with a V/2 reference formed by two resistors R between +V and ground.]

Bypass capacitors not shown.

An alternative name for ground is common. Maybe a better choice.


Focusing now on the MSP430™

EECS 452 has a couple of eZ430-F2013 Development tools and several

Z-Accel wireless kits (uses F2274).

The development tool F2012/13 boards execute programs out of flash.

The boards can operate stand-alone; projects have used them in this manner.

The F2012/F2013 boards have been used to interface to XBee wireless

modules via UART and to the C5505 via SPI.

The three most important documents are:

◮ The data manual for the F20xx microcontrollers.

◮ The MSP430x2xx Family User’s Guide, SLAU144E.

◮ The eZ430-F2012 Development Tool User’s Guide, SLAU176B.


Where used?

http://www.ti.com/ww/en/mcu/valueline/index.shtml?DCMP=Value_Line&HQS=Other+BA+430value-promo.

All these applications likely involve the use of Digital Signal Processing!

I don’t understand how the new value line differs from the existing low end units other than in part number and price.



What is low power?

◮ There are six low power modes of operation.

◮ Standby (asleep) at 3V with self wake up with RAM retention,

< 0.6µA, about 1.8 microwatts.

◮ 250µA per MIPS when active (MSP430x2xx family). This is 3/4 milliwatt per MIPS at 3 volts.

◮ Wake up time < 1µs.


Comments

http://focus.ti.com/graphics/mcu/ulp/battery-life.gif.



eZ430-Development Tool

The debugging interface shown is the old version. I believe that we only

have the 6 pin version. For the F2012/13 boards simply use the center

four pins.

Note that the 14 pin pattern mirror images the physical pin positions on

the F2012/13 packages. BEWARE!

SLAU176B documents the tool and the F2013 board. (Figure from there.)


MSP430 generic block diagram

[Figure: MSP430 generic block diagram: clock system (ACLK, SMCLK, MCLK), 16-bit RISC CPU with JTAG/debug, Flash/ROM, RAM, watchdog and peripherals on the 16-bit MAB and MDB, with a bus converter to an 8-bit MDB for the remaining peripherals.]

From the MSP430X2XX Family User’s Guide.


MSP430 CPU block diagram

◮ RISC architecture.

◮ 27 core instructions.

◮ Plus 24 emulated instructions.

◮ 7 addressing modes.

◮ Every instruction usable with every addressing mode.

◮ Single-cycle register operations.

◮ Constant generator for six most commonly used values.

◮ Direct memory-to-memory transfers.

◮ Instruction times depend on the addressing mode used.

◮ Instructions can take from 1 to 6 cycles.

From the MSP430X2XX Family User’s Guide.

[Figure: MSP430 CPU block diagram: sixteen 16-bit registers (R0/PC program counter, R1/SP stack pointer, R2/SR/CG1 status, R3/CG2 constant generator, R4 to R15 general purpose) feeding the src and dst inputs of a 16-bit ALU with Zero (Z), Carry (C), Overflow (V) and Negative (N) flags; connected to the memory data bus (MDB) and memory address bus (MAB); clocked by MCLK.]


How to do DSP without a multiplier?

Here is the problem that I want to address:

◮ Manufacturers, such as TI, sell low cost, low power microcomputers,

essentially by the millions.

◮ Many of these do not possess a multiplier, let alone a MAC unit.

◮ In spite of this, there are likely many applications that would benefit (result in a more desirable product) by use of some DSP.

◮ Just as floating point arithmetic is emulated in the C5505 by software, one can emulate multiplier hardware in software.

◮ Implementation of multiplication in a multiplierless processor can be divided into two basic categories: general purpose multiplication and hard coded multiplication.

◮ The general multiplier is the more flexible but is also the more costly in terms of execution time.

◮ The hard coding of the computation steps assumes multiplication by fixed values (such as filter coefficients). It is the fastest but requires significant code space.


So what would I like to cover?

Disclaimer: this is a work in progress. Some has been done, some not. I

accidentally lost my MSP430 test codes when upgrading to CCS4. Some

of the outline below is fantasy, at this point, but should provide hints to

anyone interested in delving into this topic on their own.

◮ Pencil and paper unsigned binary multiplication.

◮ Pencil and paper two’s complement binary multiplication.

◮ Multiplier block diagrams.

◮ Coding a general multiplier in the MSP430. TI likely supplies code

for such.

◮ Booth’s algorithm.

◮ Signed Digit (SD) and Canonical Signed Digit (CSD) representation.

◮ Testing.

◮ An IIR filter code generator.


Will knowing how to do this be useful?

◮ The lowest cost MSP430 having a multiplier appears to be the

MSP430F2330 at $1.75 at 1ku. It has a slope A/D and lives in a 40

pin flat pack.

◮ If one could use a $0.60 part (e.g., the F2011) at the 1ku level the

savings would be $1150 and at the 10ku level $11,500, etc.

◮ There likely will be many situations where knowing how to do this

will be useful and make economic sense.

◮ Someone will benefit from knowing how to do this. Just who and

when? It might be you.


Relevant TI application notes

Efficient Multiplication and Division Using MSP430, Kripasagar Venkat,

Application Report slaa329, 9/2006.

Efficient MSP430 Code Synthesis for an FIR Filter, Kripasagar Venkat,

Application Report slaa357, 3/2007.

Combines Horner’s method of polynomial evaluation with the Canonical

Signed Digit (CSD) number representation to “efficiently” (as well as one

can) implement DSP.

The focus is on the multiplierless MSP430 devices but the method will

work on any computer or FPGA. The source files are also available.

This pair of notes is what started me on this effort.


Comments on the application notes

◮ Author assumes use of Q15.

◮ Develops a right to left algorithm.

◮ Relates process to use of Horner’s method of polynomial evaluation.

◮ Hard codes the shift and add steps for constant multiplier values.

◮ Uses signed digit representation for multipliers.

◮ Essentially equivalent to the basic shift and add multiplier.

◮ Recall that Q15 is a state of mind, not a function of a hardware

binary point.


Doing pencil and paper multiplication

                            a4 a3 a2 a1 a0
                       ×    b4 b3 b2 b1 b0
    --------------------------------------
      b0 ×   a4 a4 a4 a4 a4 a4 a3 a2 a1 a0
    + b1 ×   a4 a4 a4 a4 a4 a3 a2 a1 a0
    + b2 ×   a4 a4 a4 a4 a3 a2 a1 a0
    + b3 ×   a4 a4 a4 a3 a2 a1 a0
    − b4 ×   a4 a4 a3 a2 a1 a0
    --------------------------------------
             p9 p8 p7 p6 p5 p4 p3 p2 p1 p0

The multiplicand sign bit is extended for each row.
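The row-summing scheme can be sketched in C as follows. This is a minimal illustration rather than the lecture's code; the function name rows_multiply and the nbits parameter are assumptions made for the example. Each row is the sign-extended multiplicand moved one place left, and the row belonging to the multiplier sign bit is subtracted.

```c
#include <stdint.h>

/* Two's complement multiply by summing sign-extended rows.
 * a, b  : operands that fit in nbits bits (2 <= nbits <= 16)
 * The row for the multiplier sign bit is subtracted because that bit
 * carries a weight of -2^(nbits-1). */
int32_t rows_multiply(int16_t a, int16_t b, int nbits)
{
    int32_t  sum = 0;
    int32_t  row = a;              /* multiplicand, sign extended to 32 bits */
    uint16_t ub  = (uint16_t)b;    /* multiplier bits, examined lsb first */
    int i;

    for (i = 0; i < nbits; i++) {
        if ((ub >> i) & 1u) {
            if (i == nbits - 1)
                sum -= row;        /* sign bit row is subtracted */
            else
                sum += row;
        }
        row += row;                /* move the row one place left */
    }
    return sum;
}
```

For the 5-bit layout above, rows_multiply(-3, 5, 5) sums the b0 and b2 rows and returns -15.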


Summing rows signed multiplier logic

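[Figure: block diagram of the row-summing signed multiplier. Labeled blocks: register a, shift register b, AND gating controlled by the lsb, add/subtract unit (with subtract control S), p-register (high bits) and shift register (low bits) holding a × b.]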


C simulation: unsigned shift and add multiplication

// FPGA and MSP430 simulated unsigned shift and add multiply

#include <stdint.h>

uint32_t u_sanda(uint16_t a, uint16_t b)
{
    uint16_t ctr;
    uint32_t sum;

    sum = 0;
    for (ctr=0; ctr<16; ctr++) {
        if (b & 0x0001) {
            sum = sum&0xFFFF;              // ensure carry is 0
            sum += a;                      // carry out lands in bit 16
        }
        b = ((sum&0x0001)<<15) + (b>>1);   // low product bit shifts into b
        sum = sum>>1;                      // shift right including carry
    }
    return ((uint32_t)sum<<16)+(uint32_t)b;
}


C simulation: signed shift and add multiplication

// signed shift and add multiply

#include <stdint.h>

int32_t fs_sanda(int16_t a, int16_t b)
{
    uint16_t ctr, pr, low, carry, sign_a, sign_b;

    sign_a = a&0x8000;
    sign_b = b&0x8000;

    pr = 0;
    low = 0;
    carry = 0;

    for (ctr=0; ctr<16; ctr++) {
        if ((b&0x0001) != 0) {
            carry = sign_a;                // sign extend once a row has been added
            if (ctr == 15) pr -= a;        // sign bit row is subtracted
            else pr += a;
        }
        b = b>>1;
        if ((pr&0x0001) != 0) low = 0x8000+(low>>1);
        else low = (low>>1);
        pr = (pr>>1)|carry;
    }

    if (a != 0) pr = pr^(sign_b);
    // assemble in unsigned arithmetic to avoid shifting into the sign bit
    return (int32_t)(((uint32_t)pr<<16)+low);
}


Comments

These simulations were written in conjunction with MSP430 code, which they mimic.

Multiplies two 16-bit values with a 32 bit result.

Exhaustively tested using all possible multiplier and multiplicand values.


Working with Q15 values.

◮ Basically do integer multiplication.

◮ Product is 32 bits (two words).

◮ Left shift result by 1 and retain only the top 16 bits. Round first?

◮ Only need to do the multiplication keeping the top 16 bits. The low

bits can be discarded as generated. Might complicate rounding.

◮ For the shown algorithm what if we don’t do the last right shift?

◮ Code and TEST. My norm is to exhaustively test wherever possible.

◮ When not possible, test end/special cases then use random values,

lots of random values.
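The Q15 steps listed above (integer multiply, shift left by one, keep the top 16 bits, optionally round first) might look like this in C. This is a sketch under stated assumptions: the name q15_mul is hypothetical, rounding is done by adding half of the retained lsb, the single overflow case (−1 × −1) is saturated, and the right shift of a negative value is assumed to be arithmetic, as it is on the usual compilers.

```c
#include <stdint.h>

/* Q15 multiply with rounding and saturation.
 * The 32-bit integer product is Q30; shifting left by one and keeping the
 * top 16 bits is the same as shifting right by 15 and keeping the low 16. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* 32-bit integer product */

    p += 0x4000;                           /* round: half of the retained lsb */
    p >>= 15;                              /* assumes arithmetic right shift */
    if (p > 32767)                         /* only (-1.0) * (-1.0) gets here */
        p = 32767;
    return (int16_t)p;
}
```

With this sketch, 0.5 × 0.5 (16384 × 16384) gives 8192, which is 0.25 in Q15. A multiplierless version would replace the product with one of the shift and add routines.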


Signed digit number representation

◮ Instead of representing values with 0 and 1 digit values, use digit

values of -1, 0, 1.

◮ Awkward on a binary processor. However, if one is hard coding the steps in a multiplication operation, it is easily done.

◮ Not a unique representation. Lots of ways of writing a given value

using signed digits.
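As a small illustration of hard coding with signed digits, consider multiplying by the constant 119. The constant and the function names are assumptions chosen for the example; unsigned values are used so that the left shifts are well defined in C. In plain binary 119 = 0111 0111 has six ones, while one signed digit form is 128 − 8 − 1 with three non-zero digits.

```c
#include <stdint.h>

/* x * 119 using the binary digits of 119 (0111 0111b): five adds. */
uint32_t times119_binary(uint16_t x)
{
    uint32_t v = x;
    return (v << 6) + (v << 5) + (v << 4) + (v << 2) + (v << 1) + v;
}

/* x * 119 using signed digits, 119 = 128 - 8 - 1: two subtracts. */
uint32_t times119_sd(uint16_t x)
{
    uint32_t v = x;
    return (v << 7) - (v << 3) - v;
}
```

Both routines return the same product; the signed digit version simply needs fewer add/subtract steps.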


Canonical SD representation

Uses the minimum number of non-zero digits.

◮ Reduces the instructions needed to hard code multiplication.

◮ Where to find an algorithm for generating CSD? Try Computer

Arithmetic Algorithms, by Israel Koren.

◮ How much efficiency is obtained?


Converting an integer to CSD form

/* File name: Int2CSD.c

   Two’s complement integer to canonical signed digit.
   Algorithm from Koren ...

   16Feb2009 .. initial version .. K.Metzger

*/

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

void Int2CSD(int32_t value,  // integer value to convert
             int nbits,      // number of bits in value to convert
             int *bits,      // bits array...nbits+1 elements
             int *digits)    // digits array...nbits elements
{
    int idx, cin=0, which;

    for (idx=0; idx<nbits; idx++) {
        bits[idx] = value & 0x1;
        value >>= 1;
    }
    bits[idx]= bits[idx-1];  // sign extend one extra bit

    for (idx=0; idx<nbits; idx++) {
        which = (bits[idx+1]*2+bits[idx])*2+cin;
        switch(which) {
            case 0: digits[idx] =  0; cin = 0; break;
            case 1: digits[idx] =  1; cin = 0; break;
            case 2: digits[idx] =  1; cin = 0; break;
            case 3: digits[idx] =  0; cin = 1; break;
            case 4: digits[idx] =  0; cin = 0; break;
            case 5: digits[idx] = -1; cin = 1; break;
            case 6: digits[idx] = -1; cin = 1; break;
            case 7: digits[idx] =  0; cin = 1; break;
            default: printf("Int2CSD: oops!\n"); exit(1);
        } // end of switch
    } // end of for

} // end of function
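One way to use a digits array such as the one Int2CSD fills in is to evaluate the product by Horner's method, most significant digit first: double the accumulator, then add or subtract the multiplicand according to the digit. The sketch below is an assumption about how this could be done, not the lecture's or TI's code; the name sd_multiply and the ops counter are hypothetical. A code generator would unroll the same steps into straight-line instructions.

```c
#include <stdint.h>

/* Multiply x by the constant whose signed digits are digits[0..ndigits-1]
 * (digits[0] least significant, each digit -1, 0 or +1).
 * *ops returns the number of add/subtract steps that were needed. */
int32_t sd_multiply(int32_t x, const int *digits, int ndigits, int *ops)
{
    int32_t acc = 0;
    int idx;

    *ops = 0;
    for (idx = ndigits - 1; idx >= 0; idx--) {
        acc += acc;                          /* shift left one place */
        if (digits[idx] > 0) {
            acc += x;
            (*ops)++;
        } else if (digits[idx] < 0) {
            acc -= x;
            (*ops)++;
        }
    }
    return acc;
}
```

For the constant 7, the CSD digits are {-1, 0, 0, 1} (that is, 8 − 1), so two add/subtract steps are used, versus three for the binary digits {1, 1, 1, 0}.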


Implementing an IIR filter

Assume 16-bit values and a uniform distribution of ones and zeros.

◮ On the average there will be 8 ones and 8 zeros in the multiplier.

◮ Each one will be coded as a shift and an add. Eight shifts and eight adds.

◮ Each zero will be coded as a shift. Eight shifts.

◮ On the average (assuming that we are not doing Voodoo statistics here) a

multiplication will need 16 shifts and 8 adds. Twenty four machine cycles.

◮ On a MSP430 running at 16 MHz a hard coded multiplication will take on

the order of 1.5µs.

◮ To be conservative let’s use a value of 3µs.

◮ To implement an 8th order biquad filter we need five multiplications per

biquad and four biquads.

◮ The nominal, very hand wavy, time required to filter a sample is on the

order of 60µs.

◮ It might be possible to sample using a sample rate of 16 kHz and filter.
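The hand-wavy timing estimate above can be written out as a few lines of integer arithmetic. The helper names are assumptions made for this sketch; the numbers are the ones used in the bullets.

```c
/* Cycles for one hard coded multiply: one shift per bit plus one add per one bit. */
unsigned long mult_cycles(unsigned long bits, unsigned long ones)
{
    return bits + ones;
}

/* Convert machine cycles to nanoseconds for a clock given in MHz. */
unsigned long cycles_to_ns(unsigned long cycles, unsigned long clock_mhz)
{
    return cycles * 1000UL / clock_mhz;
}

/* Nominal time, in nanoseconds, to filter one sample. */
unsigned long filter_time_ns(unsigned long biquads,
                             unsigned long mults_per_biquad,
                             unsigned long ns_per_mult)
{
    return biquads * mults_per_biquad * ns_per_mult;
}

/* Highest sample rate, in Hz, that leaves time_ns per sample. */
unsigned long max_sample_rate_hz(unsigned long time_ns)
{
    return 1000000000UL / time_ns;
}
```

With 16 bits and 8 ones the multiply takes 24 cycles, or 1500 ns at 16 MHz. Using the conservative 3000 ns per multiply, four biquads of five multiplications need 60000 ns per sample, which allows a sample rate of about 16.6 kHz.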


Is this reasonable and can we do better?

◮ A 16-bit FPGA multiplier implementation should only need about 16 clock ticks. The multiplier footprint should be small enough to allow all 20 multipliers to be implemented. In this case a nominal 16 clock ticks would be needed per input sample for each filter output. (This is an aside, sorry.)

◮ There exists a non-unique number representation called signed digit. When placed into canonical form this representation contains the minimum possible number of non-zero values. These non-zero values are either +1 or −1.

◮ There is the possibility of speeding up hard coded multiplications.

◮ A reasonable question is “by how much”.


Implementing multiplication in an MSP430

When updating to CCS V4 I deleted my old Code Composer Essentials.

Oops.

I had meant to back this work up.


Canonical heresy

What are the maximum values associated

with the w1 and the w2?

What are the maximum values associated

with the w3 and w4? (Assuming our usual

scaling scheme.)

Where does overflow occur? Is this important? (Combine the two top adders into one.)

Is this truly real?

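[Figure: second-order IIR section with four z−1 delays (states w1, w2, w3, w4), feedforward coefficients b0, b1, b2, feedback coefficients −a1, −a2, input x and output y.]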
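A four-delay second-order section of this kind can be sketched in C as below. Several things here are assumptions made for illustration and are not taken from the slide: the delays are named x1, x2 (past inputs) and y1, y2 (past outputs) rather than mapped onto w1 to w4, the coefficients are stored in Q14 so that magnitudes up to 2 fit in 16 bits, and a 64-bit accumulator is used so that the sketch cannot overflow, which sidesteps rather than answers the overflow question posed above. On a multiplierless MSP430 each product would be replaced by hard coded shift and add steps.

```c
#include <stdint.h>

#define COEF_ONE 16384   /* 1.0 in the assumed Q14 coefficient format */

typedef struct {
    int16_t b0, b1, b2;   /* feedforward coefficients, Q14 */
    int16_t a1, a2;       /* feedback coefficients, Q14 */
    int16_t x1, x2;       /* previous two inputs */
    int16_t y1, y2;       /* previous two outputs */
} biquad_df1;

/* Process one sample: y = b0*x + b1*x1 + b2*x2 - a1*y1 - a2*y2 */
int16_t biquad_df1_step(biquad_df1 *s, int16_t x)
{
    int64_t acc;

    acc  = (int64_t)s->b0 * x;
    acc += (int64_t)s->b1 * s->x1;
    acc += (int64_t)s->b2 * s->x2;
    acc -= (int64_t)s->a1 * s->y1;
    acc -= (int64_t)s->a2 * s->y2;

    acc /= COEF_ONE;                   /* remove the coefficient scaling */
    if (acc >  32767) acc =  32767;    /* saturate to 16 bits */
    if (acc < -32768) acc = -32768;

    s->x2 = s->x1;  s->x1 = x;         /* advance the input delays */
    s->y2 = s->y1;  s->y1 = (int16_t)acc;
    return (int16_t)acc;
}
```

As a quick check, with b0 = 1.0 and a1 = −0.5 (so y = x + 0.5·y1) an input of 1000 followed by zeros produces 1000, 500, 250.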


Is the result worth the effort?

◮ I wrote a C simulation for the lab 8th order IIR filter.

◮ The straight shift and add multiplication algorithm takes 164 adds

per sample.

◮ The CSD multiplication algorithm takes 112.

◮ The nominal CSD version does 0.68 times the number of add/subtracts as the normal algorithm.

◮ In a final form filter there will also be additional overheads that will mute the speedup amount. Maybe by a factor on the order of two. This still gives a speed up on the order of 16%.

◮ Of course, I’m assuming that I’ve done everything correctly.

The only really good way to answer this question is to build both

versions and run them.


Moving onto the MSP430 SPI

◮ Two versions have existed. The current one can optionally do 8- or 16-bit transfers.

◮ A versatile device.

◮ Can be used to program a UART transmitter.

◮ Have programmed to communicate to C5505 via I2S.

◮ Used I2S mono mode. “Hand generated” frame sync.

◮ Last week TI issued an application note showing how to use a couple of chips external to the MSP430 to do the I2S link. Their solution is more general than what I did.


F2012/13 USI SPI block diagram

[Figure: USI block diagram in SPI mode (USII2C = 0): 8/16-bit shift register USISR with SDI input and SDO output; bit counter (USICNTx) that sets USIIFG; shift clock selected by USISSELx from SCLK, ACLK, SMCLK, USISWCLK, TA0, TA1 or TA2 and passed through a clock divider (/1, /2, /4, /8 ... /128, USIDIVx). Control bits shown: USIGE, USIOE, USICKPL, USICKPH, USIMST, USI16B, USILSB, USISWRST, USIIFGCC, USIPE5, USIPE6, USIPE7.]

From slau144e.pdf.


F2012/13 USI SPI timing diagram

[Figure: USI SPI timing: SCLK for each USICKPH/USICKPL setting, SDO/SDI bits running from MSB to LSB as USICNTx counts 8, 7, ... 1, 0 after being loaded, and USIIFG.]

From slau144e.pdf.


Can use SPI as a UART transmitter

◮ UART uses 10 bit frame.

◮ SPI has 16 bits in frame.

◮ Have to slow UART down some because sending 16 bits per item

versus 10.

◮ Have to bit reverse order in SPI frame because UART is lsb to msb.
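The frame packing described above might be sketched as follows. The layout is an assumption for illustration: the SPI is taken to shift bit 15 out first, the data line is taken to idle high, and the six bit times left over after the 10-bit UART frame are filled with ones so that they look like idle time. The name uart_frame is hypothetical.

```c
#include <stdint.h>

/* Pack one byte into a 16-bit SPI word that looks like a UART frame.
 *   bit 15      : start bit (0)
 *   bits 14..7  : data bits, d0 first (UART sends lsb first)
 *   bits 6..0   : stop bit plus idle fill (all 1) */
uint16_t uart_frame(uint8_t c)
{
    uint16_t frame = 0x007F;      /* stop bit and idle fill */
    int bit;

    for (bit = 0; bit < 8; bit++) {
        if (c & (1u << bit))
            frame |= (uint16_t)(0x4000u >> bit);   /* d0 -> bit 14 ... d7 -> bit 7 */
    }
    return frame;
}
```

For example, the byte 0x01 becomes 0x407F: a start bit, a one for d0, seven zeros for d1 to d7, then ones for the stop bit and fill.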


Application Examples

1. Moving 16-bit values from an F2012 using the MSP430 SPI interface

to the C5505 using the C5505 I2S interface. The one available eZdsp

SPI “channel” is used to interface FPGA display support to the

C5505. Three I2S channels are available. Our intent is to use one of

these.

This is a slightly contrived example. The C5505 itself has four A/D

input channels that could be used for this application.

2. Moving 8-bit values from an F2012 using the MSP430 SPI interface to the C5505 using the C5505 UART interface. Useful when sending values from an MSP430 to an XBee wireless device.


MSP430 Master SPI to C5505 slave I2S1

An example application would be to measure the positions of four variable

resistors (either rotary or slider) to be used as control inputs to an audio

special effects processor running on a C5505.


F2012 pin use

Pin   Signal
1     VCC
2     P1.0/TACLK/ACLK/A0
3     P1.1/TA0/A1
4     P1.2/TA1/A2
5     P1.3/ADC10CLK/A3/VREF-/VeREF-
6     P1.4/SMCLK/A4/VREF+/VeREF+/TCK
7     P1.5/TA0/A5/SCLK/TMS
8     P1.6/TA1/A6/SDO/SCL/TDI/TCLK
9     P1.7/A7/SDI/SDA/TDO/TDI
10    RST/NMI/SBWTDIO
11    TEST/SBWTCK
12    XOUT/P2.7
13    XIN/P2.6/TA1
14    VSS

◮ The F2012 package has 14 pins. Pins 1 and 14 are used for power and ground. Pins 10 and 11 are used by JTAG (Spy-Bi-Wire). This leaves 10 for signals.

◮ Need to use three signals to interface to I2S: frame sync, clock (pin 7) and data (pin 8). The MSP430 SPI hardware does not generate frame sync. Have to use an output port pin and generate it ourselves.

◮ Available A/D channels are on pins 2, 3, 4, 5 and 9. Pin 2 is connected to an LED. Pins 3, 4, 5 and 9 are available as A/D inputs.

We will have to use either pin 12 or 13 as frame sync. This locks out

possible use of a 32768 Hz crystal. Will use pin 12 (port 2 pin 7).

From the TI MSP430F2012 data sheet.


C5505 and other considerations

◮ C5505 has X SPI ports of which only one is brought out and is

generally used to drive the S3SB graphics.

◮ There are four I2S ports. Port I2S0 is used with the CODEC.

Ports I2S1 and I2S2 are brought to the eZdsp connector. Port I2S3 is

shared with the UART.

◮ When I2S is a slave the transfer timing is controlled by the master and can be “bursty”.

◮ Will use I2S1 to support the slave input.

◮ Will use DSP mono-mode.

◮ The F2012/3 SPI output does not include a frame sync waveform.

One can be generated using a port pin.

◮ Need at least one additional clock pulse to allow the C5505 to sample the frame sync transition.


F2012 main

#include <msp430x20x3.h>

volatile unsigned int i, value;

void main(void)
{
    WDTCTL = WDTPW + WDTHOLD;        // Stop watchdog timer

    // 12 MHz
    BCSCTL1 = CALBC1_12MHZ;          // Set range
    DCOCTL = CALDCO_12MHZ;           // Set DCO step + modulation

    P1DIR = 0x01;                    // P1.0 output, else input
    P1DIR |= 0x20;                   // also P1.5 output
    USICTL0 |= USIPE7 + USIPE6 + USIPE5 + USIMST + USIOE; // Port, SPI master
    USICTL1 |= USIIE;                // Counter interrupt, flag remains set
    USICKCTL = USIDIV_4 + USISSEL_2; // SMCLK/16
    USICTL0 &= ~USISWRST;            // USI released for operation
    USISRL = 0;                      // initial load data value

    P2SEL = 0x00;                    // set up IO use on port 2
    P2DIR = 0x80;                    // use port 2 pin 7 as frame sync output
    P2OUT &= ~0x80;                  // set sync low
    value = 0;                       // initialize output value
    USICNT = 16 | USI16B;            // init-load counter--starts SPI running
    _BIS_SR(LPM0_bits + GIE);        // Enter LPM0 w/ interrupt
}


F2012 SPI interrupt support

// USI interrupt service routine

#pragma vector=USI_VECTOR
__interrupt void universal_serial_interface(void)
{
    for (i = 0xF; i > 0; i--);   // delay between values
    USISRL = value;              // load low 8 bits
    USISRH = value >> 8;         // load high 8 bits
    value++;                     // increment value

    USICTL0 &= ~USIPE5;          // generate two clock pulses manually
    P1OUT |= 0x20;               // clock rising edge
    P2OUT |= 0x80;               // sync rising edge
    P1OUT &= ~0x20;              // clock falling edge
    P1OUT |= 0x20;               // clock rising edge
    P2OUT &= ~0x80;              // sync falling edge
    P1OUT &= ~0x20;              // clock falling edge
    USICTL0 |= USIPE5;           // return pin to the SPI

    USICNT = 16 | USI16B;        // load counter which starts transfer
}


This is strange looking code

The main appears to start, run and then exit.

The main sets up the F2012/3, loads a value into the USI counter and

enters low power mode with interrupts (whatever that means).

A “normal” program would then exit back to the system. The F2012/3

doesn’t have a system to exit back to.

The USI/SPI hardware continues to run in low power mode. When the

counter decrements to 0, the CPU is powered back on and the interrupt

support routine is entered.

The shown interrupt routine delays a while to space values for looking at

on an oscilloscope. Loads a new 16-bit value into the shift registers,

loads the counter with a count of 16 and puts the processor back to

sleep.

In our nominal resistor application the A/D clock would control events and the given interrupt routine would be recast as a function.


C5505 test main

#include <stdlib.h>
#include <stdio.h>
#include "..\c5505_support\data_types.h"

#define FOREVER 1

unsigned int I2S1_receive();
void I2S1_transmit(unsigned int);
void InitI2S1();
void InitSystem();
void ConfigPort();

void main(void)
{
    unsigned int value, next_value, value_ctr, loop_ctr, bad_ctr;

    // CPU initialization
    InitSystem();
    ConfigPort();
    InitI2S1();

    loop_ctr = 0;
    bad_ctr = 0;

    while(FOREVER) {
        value = I2S1_receive();            // discard first value
        next_value = I2S1_receive()+1;     // get initial test value
        value_ctr = 0;
        while(value_ctr++ != 0xFFFF) {
            value = I2S1_receive();
            if (next_value != value) {
                printf("expected: %04X received: %04X\n", next_value, value);
                bad_ctr++;
                break;
            }
            next_value++;
        }
        printf("loop %6u completed, bad = %3u\n", loop_ctr++, bad_ctr);
    }
}

C5505 initialization and support

// File name: I2S1_support
//
// 14Jan2010 .. initial version .. KMetzger
//

#include <stdlib.h>
#include "..\c5505_support\data_types.h"
#include "..\c5505_support\c5505.h"

void InitI2S1(void)
{
    PCGCR1 &= ~I2S1CG;            // enable the I2S1 peripheral clock (0 enables)
    I2S1SCTRL = 0;                // reset I2S1
    I2S1SCTRL = I2SENABLE | I2SMONO | I2SDATADLY | I2SWDLENGTH16 | I2SFRMT;
    I2S1INTMASK = I2SRCVMONFL;    // enable the done flag--WARNING enables interrupt too!
}

unsigned int I2S1_receive(void)
{
    while((I2S1INTFL & I2SRCVMONFL) == 0);  // wait for received value
    return I2S1RXLT1;                       // then return it
}


F2013 and C5505 waveforms

C5505 I2S timing in DSP mode:

[Figure: I2S_CLK, I2S_FS and DATA waveforms for the left and right channels, with the bits of each channel numbered N-1, N-2, N-3 ... 3, 2, 1, 0.]

LD(n) = n'th sample of left channel data

RD(n) = n'th sample of right channel data

From sprufp4.pdf.

MSP430F2012/3 SPI timing:

[Figure: the USI SPI timing diagram shown earlier (SCLK for each USICKPH/USICKPL setting, SDO/SDI from MSB to LSB, USICNTx and USIIFG), repeated for comparison with the I2S timing.]

From TMS320F20xx data sheet.


C5505 I2S1 registers

CPU Word Address   Acronym      Description
2900h              I2SSCTRL     I2S Serializer Control Register
2904h              I2SSRATE     I2S Sample Rate Generator Register
2908h              I2STXLT0     I2S Transmit Left Data 0 Register
2909h              I2STXLT1     I2S Transmit Left Data 1 Register
290Ch              I2STXRT0     I2S Transmit Right Data 0 Register
290Dh              I2STXRT1     I2S Transmit Right Data 1 Register
2910h              I2SINTFL     I2S Interrupt Flag Register
2914h              I2SINTMASK   I2S Interrupt Mask Register
2928h              I2SRXLT0     I2S Receive Left Data 0 Register
2929h              I2SRXLT1     I2S Receive Left Data 1 Register
292Ch              I2SRXRT0     I2S Receive Right Data 0 Register
292Dh              I2SRXRT1     I2S Receive Right Data 1 Register

From sprufp4.pdf.


Configuration and flag register bits

I2SnSCTRL register:

Bit(s)   Field      Access
15       ENABLE     R/W-0
14-13    Reserved   R-0
12       MONO       R/W-0
11       LOOPBACK   R/W-0
10       FSPOL      R/W-0
9        CLKPOL     R/W-0
8        DATADLY    R/W-0
7        PACK       R/W-0
6        SIGN_EXT   R/W-0
5-2      WDLNGTH    R/W-0
1        MODE       R/W-0
0        FRMT       R/W-0

I2SnINTFL register:

Bit(s)   Field       Access
15-8     Reserved    R-0
7-6      Reserved    R-0
5        XMITSTFL    R-0
4        XMITMONFL   R-0
3        RCVSTFL     R-0
2        RCVMONFL    R-0
1        FERRFL      R-0
0        OUERR       R-0

LEGEND: R/W = Read/Write; R = Read only; -n = value after reset

From sprufp4.pdf.


MSP430-C5505 SPI signals

Frame Sync

Bit Clock

Data Bits

Captured from an oscilloscope.


Time axis expanded

Frame Sync

Bit Clock

Data Bits

Captured from an oscilloscope. Different scan.


Comments about the waveforms

◮ Only those edges that are needed are generated.

◮ The clock dwell times are not relevant.

◮ Clock edge positions relative to the data dwells are relevant.

◮ How were the important edges decided upon? Careful reading of the C5505 I2S documentation. Asking the question, "How would I implement this in an FPGA?". Cut and try.

◮ Note that the last bit sent stays in the shift register and thus on the

data line. For the two waveforms shown, the last bit sent was a logic

one.


Focusing now on the Piccolo™

This is of interest because:

◮ Very fast (≈ 5 MSPS) A/D.

◮ Dual track and holds.

◮ Ultra high resolution pulse width modulators that make it easy to implement D/A converters.

◮ Low cost development tools.


TI TMS320C2000 microcontrollers

TMS320C2000™ Microcontrollers combine control peripheral

integration with the processing power of a 32-bit architecture. All

C28x™ microcontrollers are 100% software compatible and offer

high-speed 12-bit Analog to Digital converters and advanced PWM

generators.

From TI C2000 web pages.


Piccolo controlSTICK

The big chip to the left is the USB interface and the big chip to the right is the F28027, $39. From a TI document.


TI controlSTICK overview

The new Piccolo controlSTICK USB tool allows quick and easy

evaluation of all the advanced capabilities of TI’s Piccolo 32-bit MCU

for just $39. Slightly larger than a memory stick, the Piccolo

controlSTICK features onboard JTAG emulation and access to all

control peripherals. Example projects walk through the advanced

functionality of Piccolo, from simply blinking an LED to configuring

the high resolution ePWM peripherals. Included in the kit is the

Piccolo controlSTICK, USB extension cable, jumpers and patch cords

necessary for example projects, full version of Code Composer Studio

with 32kB code size limit, example projects showcasing Piccolo MCU

features and full hardware documentation, including bill of materials,

schematics and Gerber files.

From a TI web site.


What is a Piccolo

◮ Member of TI’s C2000 32-bit family of microcontrollers.

◮ Uses TI’s fixed point C28x core.

◮ 40-60 MIPS operation.

◮ Single 3.3 Volt supply.

◮ Family members vary in
   ◮ the amount of on-chip RAM and flash EPROM.
   ◮ the peripheral mix and characteristics.

◮ Low cost. The F28027 is priced at ≈ $3.60 qty 100.

◮ Currently there are three family members. More on the way.


Piccolo block diagram

From the TI Piccolo web site.


F28027 block diagram in detail

[Figure: F28027 functional block diagram: C28x 32-bit CPU on the memory bus with M0 and M1 SARAM (1K x 16 each, 0-wait), SARAM (1K/3K/4K x 16, 0-wait, secure), FLASH (16K/32K x 16, secure), OTP (1K x 16, secure), boot ROM (8K x 16, 0-wait) and the code security module; PIE with 3 external interrupts; CPU timers 0, 1 and 2; OSC1, OSC2, external clock, PLL, low-power modes and watchdog; POR/BOR and VREG; ADC and comparators (COMP1, COMP2) behind the AIO mux; SCI, SPI and I2C (each with 4-level FIFO), ePWM with HRPWM, and eCAP on the 16-bit and 32-bit peripheral buses, brought out through the GPIO mux.]

A. Not all peripheral pins are available at the same time due to multiplexing.


Yet again

TMS320F2802x/3x Block Diagram

[Figure: C28x CPU (32x32-bit multiplier, R-M-W atomic ALU, 32-bit auxiliary registers, three 32-bit timers, real-time JTAG emulation) on the program, data and register buses with sectored flash, RAM and boot ROM; PIE interrupt manager; peripherals: 12-bit ADC, watchdog, GPIO, ePWM, eCAP, eQEP, I2C, SCI, SPI, CAN 2.0B and LIN; CLA with its own CLA bus.]

Available only on TMS320F2803x devices: CLA, QEP, CAN, LIN


The F28027 has what?

◮ 16× 16, 32× 32 and dual 16× 16 MAC.

◮ Harvard architecture but with unified memory map.

◮ 2 internal, 1% accurate oscillators.

◮ On-chip temperature sensor.

◮ Clock phase-lock-loop multiplier.

◮ Watchdog timer module.

◮ Missing clock detection circuitry.

◮ Up to 22 individually programmable GPIO pins.

◮ Three 32-bit timers.

◮ One enhanced pulse width modulator (ePWM). Eight outputs.

◮ Independent 16-bit timer per ePWM module.

◮ Four high resolution PWMs (HRPWM).

◮ 1/2 analog comparator.

◮ 7/13 channel, 4.6 MHz, 12-bit A/D converter

◮ 128 bit security lock.

◮ Serial peripherals, one SCI, one SPI, one I2C.

◮ three external interrupts.


C28x processor block diagram

[Figure: C28x CPU block diagram: program address bus PAB(0:21) and program-read data bus PRDB(0:31); data-read address bus DRAB(0:31) and data-read data bus DRDB(0:31); data-write address bus DWAB(0:31) and data-/program-write data bus DWDB(0:31); program-address generation logic and program control logic; ARAU with XAR0 to XAR7, DP and SP; multiplier, barrel shifter and ALU; registers AH:AL, PH:PL, T:TL, IER, DBGIER, IFR, ST0, ST1, PC and RPC; data-read and data-write buffer registers; operand and result buses.]


F28027 on-chip memory

◮ On-chip flash – 32 K 16-bit words.

◮ On-chip SARAM – 6 K 16-bit words.

◮ Boot ROM – 8 K 16-bit words.

Included (free) CCS has limit of 32 kB code size.

Why is this considered a 32-bit MCU?

No provision for adding external memory, easily.


F28027 memory map

Start address   Region
0x00 0000       M0 Vector RAM (Enabled if VMAP = 0)
0x00 0040       M0 SARAM (1K x 16, 0-Wait)
0x00 0400       M1 SARAM (1K x 16, 0-Wait)
0x00 0800       Peripheral Frame 0
0x00 0D00       PIE Vector - RAM (256 x 16) (Enabled if VMAP = 1, ENPIE = 1)
0x00 0E00       Peripheral Frame 0
0x00 2000       Reserved
0x00 6000       Peripheral Frame 1 (4K x 16, Protected)
0x00 7000       Peripheral Frame 2 (4K x 16, Protected)
0x00 8000       L0 SARAM (4K x 16) (0-Wait, Secure Zone + ECSL, Dual Mapped)
0x00 9000       Reserved
0x3D 7800       User OTP (1K x 16, Secure Zone + ECSL)
0x3D 7C00       Reserved
0x3D 7C80       Calibration Data
0x3D 7CC0       Reserved
0x3D 8000       Reserved
0x3F 0000       FLASH (32K x 16, 4 Sectors, Secure Zone + ECSL)
0x3F 7FF8       128-Bit Password
0x3F 8000       L0 SARAM (4K x 16) (0-Wait, Secure Zone + ECSL, Dual Mapped)
0x3F 9000       Reserved
0x3F E000       Boot ROM (8K x 16, 0-Wait)
0x3F FFC0       Vector (32 Vectors, Enabled if VMAP = 1)

The figure has separate Data Space and Prog Space columns and marks the Low 64K (24x/240x equivalent data space) and the High 64K (24x/240x equivalent program space).

Figure 3-5. 28023/28027 Memory Map


Flash memory addresses

Table 3-1. Addresses of Flash Sectors in F28021/28023/28027

ADDRESS RANGE           PROGRAM AND DATA SPACE
0x3F 0000 - 0x3F 1FFF   Sector D (8K x 16)
0x3F 2000 - 0x3F 3FFF   Sector C (8K x 16)
0x3F 4000 - 0x3F 5FFF   Sector B (8K x 16)
0x3F 6000 - 0x3F 7F7F   Sector A (8K x 16)
0x3F 7F80 - 0x3F 7FF5   Program to 0x0000 when using the Code Security Module
0x3F 7FF6 - 0x3F 7FF7   Boot-to-Flash Entry Point (program branch instruction here)
0x3F 7FF8 - 0x3F 7FFF   Security Password (128-Bit) (Do not program to all zeros)

Please DO NOT change any of the security codes or passwords.

Don’t even think about doing so.