EECS 452 – Lecture 23
Today: TI MSP430 and Piccolo.
Handouts: printed copy of today’s lecture slides
Read: about DSP!
References:
Last one out should close the lab door!!!!
Please keep the lab clean and organized.
Where a calculator on the ENIAC is equipped with 18,000 vacuum tubes and weighs 30 tons, computers in the future may have only 1,000 vacuum tubes and perhaps weigh 1.5 tons. – Popular Mechanics, March 1949
EECS 452 – Winter 2010 Lecture 23 – Page 1/62 Friday – March 12, 2010
Actually . . .
Actually there were 18800 vacuum tubes and of those 6550
were 6SN7s.
The 6SN7 was/is a dual triode and was used to implement the
20 digit signed decimal accumulators. By not turning off the
power to ENIAC the average failure rate was 1 tube about every
two days. The longest up period was 116 hours.
A portion of ENIAC is located in the lobby of the CSE building.
The tubes that you see are very likely 6SN7s.
ENIAC’s active lifetime was 9 years, 1947–1955.
Overview of today’s lecture
Unfortunately, likely to be fragmented and rambling.
◮ Comments on single supply operation.
◮ The MSP430.
◮ Multiplying without a multiplier.
◮ An IIR filter for the MSP430
◮ The MSP430 SPI interface.
◮ Linking MSP430 SPI to C5505 I2S.
◮ The TI Piccolo
Thinking about single supply operation
[Figure: circuits drawn for split supplies (+V/2, ground, −V/2) next to their single-supply equivalents (+V, ground), with the V/2 reference taken from an R–R divider between +V and ground.]
Bypass capacitors not shown.
An alternative name for ground is common. Maybe a better choice.
Focusing now on the MSP430™
EECS 452 has a couple of eZ430-F2013 Development tools and several
Z-Accel wireless kits (uses F2274).
The development tool F2012/13 boards execute programs out of flash.
The boards can operate stand-alone; projects have used them in
this manner.
The F2012/F2013 boards have been used to interface to XBee wireless
modules via UART and to the C5505 via SPI.
The three most important documents are:
◮ The data manual for the F20xx microcontrollers.
◮ The MSP430x2xx Family User’s Guide, SLAU144E.
◮ The eZ430-F2012 Development Tool User’s Guide, SLAU176B.
Where used?
http://www.ti.com/ww/en/mcu/valueline/index.shtml?DCMP=Value_Line&HQS=Other+BA+430value-promo.
All these applications likely involve the use of Digital Signal Processing!
I don’t understand how the new value line differs from the existing low end units other than in part number and price.
What is low power?
◮ There are six low power modes of operation.
◮ Standby (asleep) at 3V with self wake up with RAM retention,
< 0.6µA, about 1.8 microwatts.
◮ 250µA per MIPS when active. (MSP430x2xx family.) This is 3/4
milliwatt per MIPS at 3 Volts.
◮ Wake up time < 1µs.
Comments
http://focus.ti.com/graphics/mcu/ulp/battery-life.gif.
eZ430-Development Tool
The debugging interface shown is the old version. I believe that we only
have the 6 pin version. For the F2012/13 boards simply use the center
four pins.
Note that the 14 pin pattern mirror images the physical pin positions on
the F2012/13 packages. BEWARE!
SLAU176B documents the tool and the F2013 board. (Figure from there.)
MSP430 generic block diagram
[Figure: MSP430 generic block diagram: 16-bit RISC CPU with JTAG/debug, clock system (ACLK, SMCLK, MCLK), flash/ROM, RAM, watchdog and peripherals on the 16-bit MAB and MDB, with a bus converter to an 8-bit MDB for further peripherals.]
From the MSP430X2XX Family User’s Guide.
MSP430 CPU block diagram
◮ RISC architecture.
◮ 27 core instructions.
◮ Plus 24 emulated instructions.
◮ 7 addressing modes.
◮ Every instruction usable with every addressing mode.
◮ Single-cycle register operations.
◮ Constant generator for six most commonly used values.
◮ Direct memory-to-memory transfers.
◮ Instruction times depend on the addressing mode used.
◮ Instruction can take from 1 to 6 cycles.
From the MSP430X2XX Family User’s Guide.
[Figure: MSP430 CPU block diagram: sixteen 16-bit registers (R0/PC program counter, R1/SP stack pointer, R2/SR/CG1 status, R3/CG2 constant generator, R4 to R15 general purpose) feeding the src and dst inputs of a 16-bit ALU with Zero (Z), Carry (C), Overflow (V) and Negative (N) flags; connected to the memory data bus (MDB) and memory address bus (MAB); clocked by MCLK.]
How to do DSP without a multiplier?
Here is the problem that I want to address:
◮ Manufacturers, such as TI, sell low cost, low power microcomputers,
essentially by the millions.
◮ Many of these do not possess a multiplier, let alone a MAC unit.
◮ In spite of this, there are likely many applications that would benefit
(result in a more desirable product) from the use of some DSP.
◮ Just as floating point arithmetic is emulated in software on the C5505, one
can emulate the operation of multiplier hardware in software.
◮ Implementation of multiplication on a multiplierless processor can be divided into
two basic categories: general purpose multiplication and hard coded
multiplication.
◮ The general multiplier is the more flexible but is also the more costly in
terms of execution time.
◮ Hard coding the computation steps assumes multiplication by fixed
values (such as filter coefficients). It is the fastest approach but requires significant code
space.
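The two categories can be illustrated with a small C sketch. This is an illustration only, not code from the lecture: the function names and the constant 10 are assumptions chosen for the example. `mul_general` decides at run time which shifted copies of `a` to add; `mul_by_10` has the decisions for 10 = 1010b baked into the code.

```c
#include <stdint.h>

/* General-purpose multiply: examines every bit of b at run time. */
uint32_t mul_general(uint16_t a, uint16_t b)
{
    uint32_t product = 0;
    uint32_t shifted = a;              /* a shifted to the current bit position */
    for (int bit = 0; bit < 16; bit++) {
        if (b & 1u)
            product += shifted;
        shifted <<= 1;
        b >>= 1;
    }
    return product;
}

/* Hard-coded multiply by the constant 10 = 1010b: x*8 + x*2. */
uint32_t mul_by_10(uint16_t x)
{
    return ((uint32_t)x << 3) + ((uint32_t)x << 1);
}
```

The general routine always loops 16 times; the hard-coded one costs two shifts and one add but needs a separate code sequence for every constant.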
So what would I like to cover?
Disclaimer: this is a work in progress. Some has been done, some not. I
accidentally lost my MSP430 test codes when upgrading to CCS4. Some
of the outline below is fantasy, at this point, but should provide hints to
anyone interested in delving into this topic on their own.
◮ Pencil and paper unsigned binary multiplication.
◮ Pencil and paper two’s complement binary multiplication.
◮ Multiplier block diagrams.
◮ Coding a general multiplier in the MSP430. TI likely supplies code
for such.
◮ Booth’s algorithm.
◮ Signed Digit (SD) and Canonical Signed Digit (CSD) representation.
◮ Testing.
◮ An IIR filter code generator.
Will knowing how to do this be useful?
◮ The lowest cost MSP430 having a multiplier appears to be the
MSP430F2330 at $1.75 at 1ku. It has a slope A/D and lives in a 40
pin flat pack.
◮ If one could use a $0.60 part (e.g., the F2011) at the 1ku level the
savings would be $1150 and at the 10ku level $11,500, etc.
◮ There likely will be many situations where knowing how to do this
will be useful and make economic sense.
◮ Someone will benefit from knowing how to do this. Just who and
when? It might be you.
Relevant TI application notes
Efficient Multiplication and Division Using MSP430, Kripasagar Venkat,
Application Report slaa329, 9/2006.
Efficient MSP430 Code Synthesis for an FIR Filter, Kripasagar Venkat,
Application Report slaa357, 3/2007.
Combines Horner’s method of polynomial evaluation with the Canonical
Signed Digit (CSD) number representation to “efficiently” (as well as one
can) implement DSP.
The focus is on the multiplierless MSP430 devices but the method will
work on any computer or FPGA. The source files are also available.
This pair of notes is what started me on this effort.
Comments on the application notes
◮ Author assumes use of Q15.
◮ Develops a right to left algorithm.
◮ Relates process to use of Horner’s method of polynomial evaluation.
◮ Hard codes the shift and add steps for constant multiplier values.
◮ Uses signed digit representation for multipliers.
◮ Essentially equivalent to a basic shift and add multiplier.
◮ Recall that Q15 is a state of mind, not a function of a hardware
binary point.
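A minimal sketch of what a hard coded, right to left (Horner style) signed digit multiply can look like in C. The coefficient 0.4375 and the function name are assumptions chosen for illustration; they are not taken from the application notes. In signed digit form 0.4375 = 2^-1 − 2^-4, so the work starts at the least significant nonzero digit and shifts right on the way up. The sketch assumes `>>` on a negative value is an arithmetic shift, which is what the usual compilers do.

```c
#include <stdint.h>

/* y = x * 0.4375 (truncated), with 0.4375 = 2^-1 - 2^-4 in signed digits. */
int16_t mul_0p4375(int16_t x)
{
    int32_t acc = -(int32_t)x;   /* digit -1 at weight 2^-4             */
    acc = x + (acc >> 3);        /* digit +1 at weight 2^-1, 3 places up */
    acc >>= 1;                   /* final alignment to weight 2^-1       */
    return (int16_t)acc;
}
```

Two shifts and two add/subtracts replace a 16-step general multiply. The plain binary form 0.0111b would need three adds.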
Doing pencil and paper multiplication
                                 a4 a3 a2 a1 a0
                            ×    b4 b3 b2 b1 b0
    ---------------------------------------------
      b0 ×    a4 a4 a4 a4 a4 a4 a3 a2 a1 a0
    + b1 ×    a4 a4 a4 a4 a4 a3 a2 a1 a0
    + b2 ×    a4 a4 a4 a4 a3 a2 a1 a0
    + b3 ×    a4 a4 a4 a3 a2 a1 a0
    − b4 ×    a4 a4 a3 a2 a1 a0
    ---------------------------------------------
              p9 p8 p7 p6 p5 p4 p3 p2 p1 p0
The multiplicand sign bit is extended for each row.
Summing rows signed multiplier logic
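[Figure: signed shift and add multiplier logic: a held in a register; b in a shift register whose lsb gates a (AND) into an add/subtract unit, with S selecting subtract; the product a × b accumulates in the p-register (high bits) and a shift register (low bits).]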
C simulation: unsigned shift and add multiplication
#include <stdint.h>

// FPGA and MSP430 simulated unsigned shift and add multiply
uint32_t u_sanda(uint16_t a, uint16_t b)
{
  uint16_t ctr;
  uint32_t sum;

  sum = 0;
  for (ctr=0; ctr<16; ctr++) {
    if (b & 0x0001) {
      sum = sum&0xFFFF;               // ensure carry is 0
      sum += a;                       // add multiplicand to the high part
    }
    b = ((sum&0x0001)<<15) + (b>>1);  // low product bits shift into b
    sum = sum>>1;                     // shift right including carry
  }
  return ((uint32_t)sum<<16)+(long)b; // high word : low word
}
C simulation: signed shift and add multiplication
// signed shift and add multiply
int32_t fs_sanda(int16_t a, int16_t b)
{
  uint16_t ctr, pr, low, carry, sign_a, sign_b;

  sign_a = a&0x8000; sign_b = b&0x8000;
  pr = 0; low = 0; carry = 0;

  for (ctr=0; ctr<16; ctr++) {
    if ((b&0x0001) != 0) {            // parentheses needed: != binds tighter than &
      carry = sign_a;
      if (ctr == 15) pr -= a; else pr += a;  // sign bit of b subtracts
    }
    b = b>>1;
    if ((pr&0x0001) != 0) low = 0x8000+(low>>1); else low = (low>>1);
    pr = (pr>>1)|carry;               // shift right, extending the sign of a
  }

  if (a != 0) pr = pr^(sign_b);
  return (int32_t)(((uint32_t)pr<<16)+low);  // shift as unsigned, then reinterpret
}
Comments
These simulations were written in conjunction with, and mimic, the MSP430 code.
Each multiplies two 16-bit values, giving a 32-bit result.
Exhaustively tested using all possible multiplier and multiplicand values.
Working with Q15 values.
◮ Basically do integer multiplication.
◮ Product is 32 bits (two words).
◮ Left shift result by 1 and retain only the top 16 bits. Round first?
◮ Only need to do the multiplication keeping the top 16 bits. The low
bits can be discarded as generated. Might complicate rounding.
◮ For the shown algorithm what if we don’t do the last right shift?
◮ Code and TEST. My norm is to exhaustively test wherever possible.
◮ When not possible, test end/special cases then use random values,
lots of random values.
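The Q15 steps in the list above can be sketched as follows. This is a reference model using the host's native multiply, useful as the "known good" answer when testing a shift and add version; the function name, the round-to-nearest choice and the saturation are my assumptions, not something the slide specifies. It also assumes `>>` on a negative value is an arithmetic shift.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q15 with rounding and saturation. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* 32-bit integer (Q30) product      */
    p += 0x4000;                          /* round: add half an output lsb     */
    p >>= 15;                             /* same as shift left 1, keep top 16 */
    if (p > 32767) p = 32767;             /* only (-1)*(-1) can overflow       */
    return (int16_t)p;
}
```

For example, 0.5 × 0.5 is `q15_mul(16384, 16384)`, which gives 8192 (0.25). The end case −1 × −1 is exactly the sort of special value that belongs in the test set.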
Signed digit number representation
◮ Instead of representing values with 0 and 1 digit values, use digit
values of -1, 0, 1.
◮ Awkward on a binary processor. However, when the steps of a
multiplication are hard coded, it is easily done.
◮ Not a unique representation. Lots of ways of writing a given value
using signed digits.
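To make the non-uniqueness concrete, here is a small sketch that evaluates a signed digit string. The helper name and the three example spellings of 7 are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Value of a signed-digit string. digits[0] is the least significant
   digit; each digit is -1, 0 or +1. */
int32_t sd_value(const int *digits, int ndigits)
{
    int32_t value = 0;
    for (int i = ndigits - 1; i >= 0; i--)
        value = 2 * value + digits[i];
    return value;
}

/* Three different signed-digit spellings of 7 (stored lsb first). */
static const int seven_binary[4] = {  1,  1, 1, 0 };  /* 0 1 1 1   = 4 + 2 + 1 */
static const int seven_csd[4]    = { -1,  0, 0, 1 };  /* 1 0 0 -1  = 8 - 1     */
static const int seven_other[4]  = {  1, -1, 0, 1 };  /* 1 0 -1 1  = 8 - 2 + 1 */
```

All three evaluate to 7, but the second needs only two nonzero digits, so a hard coded multiply by 7 becomes one shift and one subtract.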
Canonical SD representation
Uses the minimum number of non-zero digits.
◮ Reduces the instructions needed to hard code multiplication.
◮ Where to find an algorithm for generating CSD? Try Computer
Arithmetic Algorithms, by Israel Koren.
◮ How much efficiency is obtained?
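One way to get a feel for the efficiency question is to count nonzero digits. The sketch below is not the Koren table method used in the lecture's Int2CSD code; it computes the same canonical (non-adjacent) form with the simpler "look at value mod 4" rule, and the function names are assumptions. It is intended for 16-bit coefficient values held in a 32-bit integer.

```c
#include <stdint.h>

/* Writes the CSD digits of value (lsb first) into digits[0..maxdigits-1]
   and returns the number of nonzero digits. */
int csd_nonzero(int32_t value, int *digits, int maxdigits)
{
    int nonzero = 0;
    for (int i = 0; i < maxdigits; i++) {
        int d = 0;
        if (value & 1) {
            d = 2 - (int)(value & 3);  /* +1 if value mod 4 == 1, -1 if == 3 */
            value -= d;
            nonzero++;
        }
        digits[i] = d;
        value /= 2;                    /* value is even here, so this is exact */
    }
    return nonzero;
}

/* Number of ones in the ordinary 16-bit binary form, for comparison. */
int binary_ones(uint16_t v)
{
    int ones = 0;
    for (; v != 0; v >>= 1)
        ones += (int)(v & 1u);
    return ones;
}
```

For 0x7FFF the binary form has 15 ones while the CSD form has 2 nonzero digits (2^15 − 1). For an alternating pattern such as 0x5555 there is no gain at all, so the saving depends heavily on the coefficient values.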
Converting an integer to CSD form
/* File name: Int2CSD.c

   Two’s complement integer to canonical signed digit.
   Algorithm from Koren ...

   16Feb2009 .. initial version .. K.Metzger
*/

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

void Int2CSD(int32_t value, // integer value to convert
             int nbits,     // number of bits in value to convert
             int *bits,     // bits array...nbits+1 elements
             int *digits)   // digits array...nbits elements
{
  int idx, cin=0, which;

  for(idx=0; idx<nbits; idx++) {
    bits[idx] = value & 0x1;
    value >>= 1;
  }
  bits[idx]= bits[idx-1]; // sign extend one extra bit

  for (idx=0; idx<nbits; idx++) {
    which = (bits[idx+1]*2+bits[idx])*2+cin;
    switch(which) {
      case 0: digits[idx] = 0; cin = 0; break;
      case 1: digits[idx] = 1; cin = 0; break;
      case 2: digits[idx] = 1; cin = 0; break;
      case 3: digits[idx] = 0; cin = 1; break;
      case 4: digits[idx] = 0; cin = 0; break;
      case 5: digits[idx] = -1; cin = 1; break;
      case 6: digits[idx] = -1; cin = 1; break;
      case 7: digits[idx] = 0; cin = 1; break;
      default: printf("Int2CSD: oops!\n"); exit(1);
    } // end of switch
  } // end of for
} // end of function
Implementing an IIR filter
Assume 16-bit values. Assuming a uniform distribution on the ones and zeros.
◮ On the average there will be 8 ones and 8 zeros in the multiplier.
◮ Each one will be coded as a shift and an add. Eight shifts and eight adds.
◮ Each zero will be coded as a shift. Eight shifts.
◮ On the average (assuming that we are not doing Voodoo statistics here) a
multiplication will need 16 shifts and 8 adds. Twenty four machine cycles.
◮ On a MSP430 running at 16 MHz a hard coded multiplication will take on
the order of 1.5µs.
◮ To be conservative let’s use a value of 3µs.
◮ To implement an 8th order biquad filter we need five multiplications per
biquad and four biquads.
◮ The nominal, very hand wavy, time required to filter a sample is on the
order of 60µs.
◮ It might be possible to sample using a sample rate of 16 kHz and filter.
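The arithmetic behind these estimates can be written out as a few one-line C helpers. The function names are assumptions; the numbers plugged in are the ones from the list above.

```c
/* Back-of-envelope timing for the hard-coded biquad cascade. */

/* Time for one multiplication: cycles divided by cycles per microsecond. */
double mult_time_us(int cycles, double clock_mhz)
{
    return cycles / clock_mhz;
}

/* Time to filter one sample, counting multiplications only. */
double sample_time_us(int biquads, int mults_per_biquad, double us_per_mult)
{
    return biquads * mults_per_biquad * us_per_mult;
}

/* Highest sample rate that leaves us_per_sample of compute per sample. */
double max_sample_rate_khz(double us_per_sample)
{
    return 1000.0 / us_per_sample;
}
```

`mult_time_us(24, 16.0)` gives 1.5, `sample_time_us(4, 5, 3.0)` gives 60, and `max_sample_rate_khz(60.0)` is about 16.7. A 16 kHz sample rate allows 62.5 µs per sample, so the estimate fits with very little margin.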
Is this reasonable and can we do better?
◮ A 16-bit FPGA multiplier implementation should only need about 16
clock ticks. The multiplier footprint should be small enough to allow
all 20 multipliers to be implemented. In this case a nominal 16 clock
ticks would be needed per input sample for each filter output. (This
is an aside, sorry.)
◮ There exists a non-unique number representation called signed
digit. When placed into canonical form this representation contains
the minimum possible number of non-zero digits. These non-zero
digits are either +1 or −1.
◮ There is the possibility of speeding up hard coded multiplications.
◮ A reasonable question is “by how much”.
Implementing multiplication in an MSP430
When updating to CCS V4 I deleted my old Code Composer Essentials.
Oops.
I had meant to back this work up.
Canonical heresy
What are the maximum values associated
with the w1 and the w2?
What are the maximum values associated
with the w3 and w4? (Assuming our usual
scaling scheme.)
Where does overflow occur? Is this important?
(Combine the two top adders into one.)
Is this truly real?
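[Figure: second-order (biquad) section with input x, output y, feedforward coefficients b0, b1, b2, feedback coefficients −a1, −a2, four z−1 delays and internal nodes w1, w2, w3, w4.]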
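A minimal C sketch of such a section, for experimenting with the questions above. Several things here are assumptions: that w1, w2 are the delayed inputs and w3, w4 the delayed outputs, that coefficients are stored in Q14 (so magnitudes up to 2 fit), and all the names. A 64-bit accumulator is used because five products of up to 2^30 each do not fit in 32 bits; `>>` on a negative value is assumed to be an arithmetic shift.

```c
#include <stdint.h>

typedef struct {
    int16_t b0, b1, b2, a1, a2;  /* coefficients in Q14 (assumed scaling) */
    int16_t w1, w2;              /* x[n-1], x[n-2] */
    int16_t w3, w4;              /* y[n-1], y[n-2] */
} biquad_t;

int16_t biquad_step(biquad_t *f, int16_t x)
{
    /* y = b0 x + b1 w1 + b2 w2 - a1 w3 - a2 w4, summed in one wide adder */
    int64_t acc = (int64_t)f->b0 * x
                + (int64_t)f->b1 * f->w1
                + (int64_t)f->b2 * f->w2
                - (int64_t)f->a1 * f->w3
                - (int64_t)f->a2 * f->w4;
    acc >>= 14;                          /* back from Q14 coefficient scaling */
    if (acc > 32767)  acc = 32767;       /* saturate the output */
    if (acc < -32768) acc = -32768;

    f->w2 = f->w1;  f->w1 = x;           /* advance the input delay line  */
    f->w4 = f->w3;  f->w3 = (int16_t)acc;/* advance the output delay line */
    return (int16_t)acc;
}
```

With this arrangement the delay elements only ever hold input and output samples, so the single place overflow can occur is the final sum, which is where the saturation sits.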
Is the result worth the effort?
◮ I wrote a C simulation for the lab 8th order IIR filter.
◮ The straight shift and add multiplication algorithm takes 164 adds
per sample.
◮ The CSD multiplication algorithm takes 112.
◮ The nominal CSD version does 0.68 times the number of add/subtracts
of the normal algorithm.
◮ In a final form filter there will also be additional overheads that will
mute the speedup, maybe by a factor on the order of two.
This still gives a speedup on the order of 16%.
◮ Of course, I’m assuming that I’ve done everything correctly.
The only really good way to answer this question is to build both
versions and run them.
Moving onto the MSP430 SPI
◮ Two versions have existed. The current one can optionally do 8 or 16
bit transfers.
◮ A versatile device.
◮ Can be programmed to act as a UART transmitter.
◮ Have programmed it to communicate with the C5505 via I2S.
◮ Used I2S mono mode. “Hand generated” frame sync.
◮ Last week TI issued an application note showing how to use a
couple of chips external to the MSP430 to do the I2S link. Their
solution is more general than what I did.
F2012/13 USI SPI block diagram
[Figure: USI block diagram in SPI mode (USII2C = 0): an 8/16-bit shift register (USISR, 16-bit when USI16B is set) between SDI and SDO; a bit counter (USICNTx) that sets USIIFG; a shift clock selected by USISSELx from SCLK, ACLK, SMCLK, USISWCLK, TA0, TA1 or TA2, divided by 1 to 128 (USIDIVx), with polarity and phase set by USICKPL and USICKPH; control bits USIMST, USIOE, USIGE, USILSB, USISWRST, USIIFGCC; port enables USIPE5, USIPE6, USIPE7.]
From slau144e.pdf.
F2012/13 USI SPI timing diagram
[Figure: USI SPI timing: SCLK for each USICKPH/USICKPL combination, SDO/SDI shifted from MSB to LSB while USICNTx counts 8, 7, ... 1, 0, and USIIFG; loading USICNTx starts the transfer.]
From slau144e.pdf.
Can use SPI as a UART transmitter
◮ UART uses 10 bit frame.
◮ SPI has 16 bits in frame.
◮ Have to slow UART down some because sending 16 bits per item
versus 10.
◮ Have to bit reverse order in SPI frame because UART is lsb to msb.
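The frame packing can be sketched in C. This is an illustration under stated assumptions, not the code used in the lab: the SPI is assumed to shift msb first, the UART format is assumed to be 8 data bits, no parity, one stop bit, and the idle line is high. The function name is made up.

```c
#include <stdint.h>

/* Build the 16-bit SPI word that puts one UART character on the wire.
   Bit 15 goes out first: start bit (0), then data bits d0..d7 (lsb first,
   i.e. bit reversed within the word), then stop bit (1) and idle fill (1s). */
uint16_t uart_frame_for_spi(uint8_t byte)
{
    uint16_t frame = 0x007F;   /* bit 15 = start = 0; bits 6..0 = stop + idle = 1 */
    for (int i = 0; i < 8; i++) {
        if (byte & (1u << i))
            frame |= (uint16_t)(1u << (14 - i));  /* data bit i lands at bit 14-i */
    }
    return frame;
}
```

Each character occupies 16 bit times instead of 10, which is the slowdown noted above. For example, the byte 0x55 becomes the SPI word 0x557F.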
Application Examples
1. Moving 16-bit values from a F2012 using the MSP430 SPI interface
to the C5505 using the C5505 I2S interface. The one available eZdsp
SPI “channel” is used to interface FPGA display support to the
C5505. Three I2S channels are available. Our intent is to use one of
these.
This is a slightly contrived example. The C5505 itself has four A/D
input channels that could be used for this application.
2. Moving 8-bit values from a F2012 using the MSP430 SPI interface to
the C5505 using the C5505 UART interface. Useful when sending
values from a MSP430 to a XBee wireless device.
MSP430 Master SPI to C5505 slave I2S1
An example application would be to measure the positions of four variable
resistors (either rotary or slider) to be used as control inputs to an audio
special effects processor running on a C5505.
F2012 pin use
Pin  Signal
1    VCC
2    P1.0/TACLK/ACLK/A0
3    P1.1/TA0/A1
4    P1.2/TA1/A2
5    P1.3/ADC10CLK/A3/VREF−/VeREF−
6    P1.4/SMCLK/A4/VREF+/VeREF+/TCK
7    P1.5/TA0/A5/SCLK/TMS
8    P1.6/TA1/A6/SDO/SCL/TDI/TCLK
9    P1.7/A7/SDI/SDA/TDO/TDI
10   RST/NMI/SBWTDIO
11   TEST/SBWTCK
12   XOUT/P2.7
13   XIN/P2.6/TA1
14   VSS
◮ The F2012 package has 14 pins. Pins 1 and 14 are used for power
and ground. Pins 10 and 11 are used by JTAG (Spy-Bi-Wire). This
leaves 10 for signals.
◮ Need three signals to interface to I2S: frame sync, clock (pin
7) and data (pin 8). The MSP430 SPI hardware does not generate frame
sync. Have to use an output port pin and generate it ourselves.
◮ Available A/D channels are on pins 2, 3, 4, 5 and 9. Pin 2 is connected
to an LED. Pins 3, 4, 5 and 9 are available as A/D inputs.
We will have to use either pin 12 or 13 as frame sync. This locks out
possible use of a 32768 Hz crystal. Will use pin 12 (port 2 pin 7).
From the TI MSP430F2012 data sheet.
C5505 and other considerations
◮ C5505 has X SPI ports of which only one is brought out and is
generally used to drive the S3SB graphics.
◮ There are four I2S ports. Port I2S0 is used with the CODEC.
Ports I2S1 and I2S2 are brought to the eZdsp connector. Port I2S3 is
shared with the UART.
◮ When I2S is a slave the transfer timing is controlled by the master
and can be “bursty”.
◮ Will use I2S1 to support the slave input.
◮ Will use DSP mono-mode.
◮ The F2012/3 SPI output does not include a frame sync waveform.
One can be generated using a port pin.
◮ Need at least one additional clock pulse to allow the C5505 to
sample the frame sync transition.
F2012 main
#include <msp430x20x3.h>

volatile unsigned int i, value;

void main(void)
{
  WDTCTL = WDTPW + WDTHOLD;   // Stop watchdog timer

  //12Mhz
  BCSCTL1 = CALBC1_12MHZ;     // Set range
  DCOCTL = CALDCO_12MHZ;      // Set DCO step + modulation

  P1DIR = 0x01;               // P1.0 output, else input
  P1DIR |= 0x20;              // also P1.5 output
  USICTL0 |= USIPE7 + USIPE6 + USIPE5 + USIMST + USIOE; // Port, SPI master
  USICTL1 |= USIIE;           // Counter interrupt, flag remains set
  USICKCTL = USIDIV_4 + USISSEL_2; // SMCLK/16
  USICTL0 &= ~USISWRST;       // USI released for operation
  USISRL = 0;                 // initial load data value
  P2SEL = 0x00;               // set up IO use on port 2
  P2DIR = 0x80;               // use port 2 pin 7 as frame sync output
  P2OUT &= ~0x80;             // set sync low
  value = 0;                  // initialize output value
  USICNT = 16 | USI16B;       // init-load counter--starts SPI running
  _BIS_SR(LPM0_bits + GIE);   // Enter LPM0 w/ interrupt
}
F2012 SPI interrupt support
// USI interrupt service routine
#pragma vector=USI_VECTOR
__interrupt void universal_serial_interface(void)
{
  for (i = 0xF; i > 0; i--);  // delay between values
  USISRL = value;             // load low 8 bits
  USISRH = value >> 8;        // load high 8 bits
  value++;                    // increment value

  USICTL0 &= ~USIPE5;         // generate two clock pulses manually
  P1OUT |= 0x20;              // clock rising edge
  P2OUT |= 0x80;              // sync rising edge
  P1OUT &= ~0x20;             // clock falling edge
  P1OUT |= 0x20;              // clock rising edge
  P2OUT &= ~0x80;             // sync falling edge
  P1OUT &= ~0x20;             // clock falling edge
  USICTL0 |= USIPE5;          // return pin to the SPI

  USICNT = 16 | USI16B;       // load counter which starts transfer
}
This is strange looking code
The main appears to start, run and then exit.
The main sets up the F2012/3, loads a value into the USI counter and
enters low power mode with interrupts (whatever that means).
A “normal” program would then exit back to the system. The F2012/3
doesn’t have a system to exit back to.
The USI/SPI hardware continues to run in low power mode. When the
counter decrements to 0, the CPU is powered back on and the interrupt
support routine is entered.
The shown interrupt routine delays a while to space values for viewing
on an oscilloscope, loads a new 16-bit value into the shift register,
loads the counter with a count of 16 and puts the processor back to
sleep.
In our nominal resistor application the A/D clock would control events
and the given interrupt routine would be recast as a function.
C5505 test main
#include <stdlib.h>
#include <stdio.h>
#include "..\c5505_support\data_types.h"

#define FOREVER 1

unsigned int I2S1_receive();
void I2S1_transmit(unsigned int);
void InitI2S1();
void InitSystem();
void ConfigPort();

void main(void)
{
  unsigned int value, next_value, value_ctr, loop_ctr, bad_ctr;

  // CPU initialization
  InitSystem();
  ConfigPort();
  InitI2S1();

  loop_ctr = 0;
  bad_ctr = 0;

  while(FOREVER) {
    value = I2S1_receive();          // discard first value
    next_value = I2S1_receive()+1;   // get initial test value
    value_ctr = 0;
    while(value_ctr++ != 0xFFFF) {
      value = I2S1_receive();
      if (next_value != value) {
        printf("expected: %04X received: %04X\n", next_value, value);
        bad_ctr++;
        break;
      }
      next_value++;
    }
    printf("loop %6u completed, bad = %3u\n", loop_ctr++, bad_ctr);
  }
}
C5505 initialization and support
// File name: I2S1_support
//
// 14Jan2010 .. initial version .. KMetzger
//

#include <stdlib.h>
#include "..\c5505_support\data_types.h"
#include "..\c5505_support\c5505.h"

void InitI2S1(void)
{
  PCGCR1 &= ~I2S1CG;           // enable the I2S1 peripheral clock (0 enables)
  I2S1SCTRL = 0;               // reset I2S1
  I2S1SCTRL = I2SENABLE | I2SMONO | I2SDATADLY | I2SWDLENGTH16 | I2SFRMT ;
  I2S1INTMASK = I2SRCVMONFL;   // enable the done flag--WARNING enables interrupt too!
}

unsigned int I2S1_receive(void)
{
  while((I2S1INTFL & I2SRCVMONFL) == 0); // wait for received value
  return I2S1RXLT1;                      // then return it
}
F2013 and C5505 waveforms
C5505 I2S timing in DSP mode:
[Figure: I2S_CLK, I2S_FS and DATA in DSP mode; the left channel word LD(n) is followed by the right channel word RD(n), then LD(n+1), with bits numbered N−1, N−2, N−3, ... 3, 2, 1, 0.]
LD(n) = n'th sample of left channel data
RD(n) = n'th sample of right channel data
From sprufp4.pdf.
MSP430F2012/3 SPI timing:
[Figure: USI SPI timing diagram, the same one shown on the "F2012/13 USI SPI timing diagram" slide: SCLK for each USICKPH/USICKPL combination, SDO/SDI from MSB to LSB, USICNTx counting 8 down to 0, and USIIFG.]
From TMS320F20xx data sheet.
C5505 I2S1 registers
CPU Word Address   Acronym      Description
2900h              I2SSCTRL     I2S Serializer Control Register
2904h              I2SSRATE     I2S Sample Rate Generator Register
2908h              I2STXLT0     I2S Transmit Left Data 0 Register
2909h              I2STXLT1     I2S Transmit Left Data 1 Register
290Ch              I2STXRT0     I2S Transmit Right Data 0 Register
290Dh              I2STXRT1     I2S Transmit Right Data 1 Register
2910h              I2SINTFL     I2S Interrupt Flag Register
2914h              I2SINTMASK   I2S Interrupt Mask Register
2928h              I2SRXLT0     I2S Receive Left Data 0 Register
2929h              I2SRXLT1     I2S Receive Left Data 1 Register
292Ch              I2SRXRT0     I2S Receive Right Data 0 Register
292Dh              I2SRXRT1     I2S Receive Right Data 1 Register
From sprufp4.pdf.
Configuration and flag register bits
I2SnSCTRL register:
Bits    Field      Access
15      ENABLE     R/W-0
14-13   Reserved   R-0
12      MONO       R/W-0
11      LOOPBACK   R/W-0
10      FSPOL      R/W-0
9       CLKPOL     R/W-0
8       DATADLY    R/W-0
7       PACK       R/W-0
6       SIGN_EXT   R/W-0
5-2     WDLNGTH    R/W-0
1       MODE       R/W-0
0       FRMT       R/W-0
LEGEND: R/W = Read/Write; R = Read only; -n = value after reset

I2SnSINTFL register:
Bits    Field      Access
15-8    Reserved   R-0
7-6     Reserved   R-0
5       XMITSTFL   R-0
4       XMITMONFL  R-0
3       RCVSTFL    R-0
2       RCVMONFL   R-0
1       FERRFL     R-0
0       OUERR      R-0
LEGEND: R/W = Read/Write; R = Read only; -n = value after reset
From sprufp4.pdf.
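The bit positions on this slide translate directly into C masks. The sketch below is only an illustration: the `SCTRL_`/`INTFL_` names and the two helper functions are made up here (the lab code gets its own names, such as I2SENABLE, from c5505.h, which is not shown), and the WDLNGTH field code is left as a parameter because its encoding is not given on the slide.

```c
#include <stdint.h>

/* I2SnSCTRL fields, positions taken from the register layout on this slide. */
#define SCTRL_ENABLE         (1u << 15)
#define SCTRL_MONO           (1u << 12)
#define SCTRL_LOOPBACK       (1u << 11)
#define SCTRL_FSPOL          (1u << 10)
#define SCTRL_CLKPOL         (1u << 9)
#define SCTRL_DATADLY        (1u << 8)
#define SCTRL_PACK           (1u << 7)
#define SCTRL_SIGN_EXT       (1u << 6)
#define SCTRL_WDLNGTH(code)  (((code) & 0xFu) << 2)   /* 4-bit field, bits 5-2 */
#define SCTRL_MODE           (1u << 1)
#define SCTRL_FRMT           (1u << 0)

/* Interrupt flag register bits. */
#define INTFL_XMITSTFL       (1u << 5)
#define INTFL_XMITMONFL      (1u << 4)
#define INTFL_RCVSTFL        (1u << 3)
#define INTFL_RCVMONFL       (1u << 2)
#define INTFL_FERRFL         (1u << 1)
#define INTFL_OUERR          (1u << 0)

/* Control word with the same options InitI2S1 selects: enable, mono,
   data delay, DSP format, plus a caller-supplied word length code. */
uint16_t sctrl_word(unsigned wdlngth_code)
{
    return (uint16_t)(SCTRL_ENABLE | SCTRL_MONO | SCTRL_DATADLY |
                      SCTRL_WDLNGTH(wdlngth_code) | SCTRL_FRMT);
}

/* Nonzero when the mono receive flag is set in a flag register value. */
int mono_value_ready(uint16_t intfl)
{
    return (intfl & INTFL_RCVMONFL) != 0;
}
```

Polling `mono_value_ready()` on the flag register value is the same test that I2S1_receive performs before reading the receive data register.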
MSP430-C5505 SPI signals
Frame Sync
Bit Clock
Data Bits
Captured from an oscilloscope.
Time axis expanded
Frame Sync
Bit Clock
Data Bits
Captured from an oscilloscope. Different scan.
Comments about the waveforms
◮ Only those edges that are needed are generated.
◮ The clock dwell times are not relevant.
◮ Clock edge positions relative to the data dwells are relevant.
◮ How were the important edges decided upon? Careful reading of
the C5505 I2S documentation. Asking the question, "How would I
implement this in a FPGA?". Cut and try.
◮ Note that the last bit sent stays in the shift register and thus on the
data line. For the two waveforms shown, the last bit sent was a logic
one.
Focusing now on the Piccolo™
This is of interest because:
◮ Very fast (≈ 5 MSPS) A/D.
◮ Dual track and holds.
◮ Ultra high resolution pulse width modulators that make it easy to
implement D/A converters.
◮ Low cost development tools.
TI TMS320C2000 microcontrollers
TMS320C2000™ Microcontrollers combine control peripheral
integration with the processing power of a 32-bit architecture. All
C28x™ microcontrollers are 100% software compatible and offer
high-speed 12-bit Analog to Digital converters and advanced PWM
generators.
From TI C2000 web pages.
Piccolo controlSTICK
The big chip to the left is the USB interface and the big chip to the right is the F28027, $39. From a TI document.
TI controlSTICK overview
The new Piccolo controlSTICK USB tool allows quick and easy
evaluation of all the advanced capabilities of TI’s Piccolo 32-bit MCU
for just $39. Slightly larger than a memory stick, the Piccolo
controlSTICK features onboard JTAG emulation and access to all
control peripherals. Example projects walk through the advanced
functionality of Piccolo, from simply blinking an LED to configuring
the high resolution ePWM peripherals. Included in the kit is the
Piccolo controlSTICK, USB extension cable, jumpers and patch cords
necessary for example projects, full version of Code Composer Studio
with 32kB code size limit, example projects showcasing Piccolo MCU
features and full hardware documentation, including bill of materials,
schematics and Gerber files.
From a TI web site.
What is a Piccolo
◮ Member of TI’s C2000 32-bit family of microcontrollers.
◮ Uses TI’s fixed point C28x core.
◮ 40-60 MIPS operation.
◮ single 3.3 Volt supply.
◮ Family members vary in
  ◮ the amount of on-chip RAM and flash EPROM.
  ◮ the peripheral mix and characteristics.
◮ Low cost. The F28027 is priced at ≈ $3.60 qty 100.
◮ Currently there are three family members. More on the way.
Piccolo block diagram
From the TI Piccolo web site.
F28027 block diagram in detail
[Figure: F28027 functional block diagram: C28x 32-bit CPU with PIE, 3 external interrupts, CPU Timers 0 to 2 and JTAG (TCK, TDI, TMS, TDO, TRST); on the memory bus, M0 and M1 SARAM (1K x 16 each, 0-wait), secure SARAM (1K/3K/4K x 16, 0-wait), secure flash (16K/32K x 16), secure OTP (1K x 16), boot ROM (8K x 16, 0-wait) and the code security module; on the peripheral buses, SCI, SPI and I2C (4-level FIFOs), ePWM/HRPWM, eCAP, ADC (A7:0, B7:0) and comparators (COMP1, COMP2); GPIO and AIO muxes; oscillators (OSC1, OSC2, external), PLL, low-power modes, watchdog, VREG and POR/BOR.]
A. Not all peripheral pins are available at the same time due to multiplexing.
Yet again
TMS320F2802x/3x Block Diagram
[Figure: C28x CPU (32x32 bit multiplier, R-M-W atomic ALU, 32-bit auxiliary registers, register bus, real-time JTAG emulation) on the program and data buses with sectored flash, RAM and boot ROM; PIE interrupt manager and three 32-bit timers; peripherals: 12-bit ADC, watchdog, GPIO, ePWM, eCAP, eQEP, SCI, SPI, I2C, CAN 2.0B, LIN; CLA on the CLA bus.]
Available only on TMS320F2803x devices: CLA, QEP, CAN, LIN
The F28027 has what?
◮ 16× 16, 32× 32 and dual 16× 16 MAC.
◮ Harvard architecture but with unified memory map.
◮ 2 internal, 1% accurate oscillators.
◮ On-chip temperature sensor.
◮ Clock phase-lock-loop multiplier.
◮ Watchdog timer module.
◮ Missing clock detection circuitry.
◮ Up to 22 individually programmable GPIO pins.
◮ Three 32-bit timers.
◮ One enhanced pulse width modulator (ePWM). Eight outputs.
◮ Independent 16-bit timer per ePWM module.
◮ Four high resolution PWM (HRPWM).
◮ 1/2 analog comparator.
◮ 7/13 channel, 4.6 MHz, 12-bit A/D converter.
◮ 128 bit security lock.
◮ Serial peripherals: one SCI, one SPI, one I2C.
◮ Three external interrupts.
C28x processor block diagram
[Figure: C28x processor block diagram: program control logic and program-address generation logic driving the program address bus PAB(0:21) and program-read data bus PRDB(0:31); data-read address and data buses DRAB(0:31) and DRDB(0:31); data-write address bus DWAB(0:31) and data-/program-write data bus DWDB(0:31) with read and write buffer registers; ARAU with XAR0 to XAR7, DP and SP; multiplier, barrel shifter and ALU with registers AH:AL, PH:PL and T:TL; status and control registers ST0, ST1, IER, DBGIER, IFR, PC and RPC; operand bus and result bus.]
F28027 on-chip memory
◮ On-chip flash – 32 K 16-bit words.
◮ On-chip SARAM – 6 K 16-bit words.
◮ Boot ROM – 8 K 16-bit words.
Included (free) CCS has limit of 32 kB code size.
Why is this considered a 32-bit MCU?
No provision for adding external memory, easily.
F28027 memory map
Start address   Block (data space and program space unless noted)

Low 64K (24x/240x equivalent data space):
0x00 0000       M0 Vector RAM (Enabled if VMAP = 0)
0x00 0040       M0 SARAM (1K x 16, 0-Wait)
0x00 0400       M1 SARAM (1K x 16, 0-Wait)
0x00 0800       Peripheral Frame 0 (data space; reserved in program space)
0x00 0D00       PIE Vector - RAM (256 x 16) (Enabled if VMAP = 1, ENPIE = 1)
0x00 0E00       Peripheral Frame 0
0x00 2000       Reserved
0x00 6000       Peripheral Frame 1 (4K x 16, Protected)
0x00 7000       Peripheral Frame 2 (4K x 16, Protected)
0x00 8000       L0 SARAM (4K x 16) (0-Wait, Secure Zone + ECSL, Dual Mapped)
0x00 9000       Reserved

0x3D 7800       User OTP (1K x 16, Secure Zone + ECSL)
0x3D 7C00       Reserved
0x3D 7C80       Calibration Data
0x3D 7CC0       Reserved
0x3D 8000       Reserved

High 64K (24x/240x equivalent program space):
0x3F 0000       FLASH (32K x 16, 4 Sectors, Secure Zone + ECSL)
0x3F 7FF8       128-Bit Password
0x3F 8000       L0 SARAM (4K x 16) (0-Wait, Secure Zone + ECSL, Dual Mapped)
0x3F 9000       Reserved
0x3F E000       Boot ROM (8K x 16, 0-Wait)
0x3F FFC0       Vector (32 Vectors, Enabled if VMAP = 1)
Figure 3-5. 28023/28027 Memory Map
Flash memory addresses
Table 3-1. Addresses of Flash Sectors in F28021/28023/28027
ADDRESS RANGE           PROGRAM AND DATA SPACE
0x3F 0000 - 0x3F 1FFF   Sector D (8K x 16)
0x3F 2000 - 0x3F 3FFF   Sector C (8K x 16)
0x3F 4000 - 0x3F 5FFF   Sector B (8K x 16)
0x3F 6000 - 0x3F 7F7F   Sector A (8K x 16)
0x3F 7F80 - 0x3F 7FF5   Program to 0x0000 when using the Code Security Module
0x3F 7FF6 - 0x3F 7FF7   Boot-to-Flash Entry Point (program branch instruction here)
0x3F 7FF8 - 0x3F 7FFF   Security Password (128-Bit) (Do not program to all zeros)
Please DO NOT change any of the security codes or passwords.
Don’t even think about doing so.