Hindawi Publishing CorporationEURASIP Journal on Embedded SystemsVolume 2009, Article ID 897023, 12 pagesdoi:10.1155/2009/897023
Research Article
Prototyping Advanced Control Systems on FPGA
Stephane Simard, Jean-Gabriel Mailloux, and Rachid Beguenane
Department of Applied Sciences, University of Quebec at Chicoutimi, 555 boul. de l’Universite, Chicoutimi, QC, Canada G7H 2B1
Correspondence should be addressed to Rachid Beguenane, [email protected]
Received 19 June 2008; Accepted 3 March 2009
Recommended by Miriam Leeser
In advanced digital control and mechatronics, FPGA-based systems on a chip (SoCs) promise to supplant older technologies, suchas microcontrollers and DSPs. However, the tackling of FPGA technology by control specialists is complicated by the need forskilled hardware/software partitioning and design in order to match the performance requirements of more and more complexalgorithms while minimizing cost. Currently, without adequate software support to provide a straightforward design flow, theamount of time and efforts required is prohibitive. In this paper, we discuss our choice, adaptation, and use of a rapid prototypingplatform and design flow suitable for the design of on-chip motion controllers and other SoCs with a need for analog interfacing.The platform consists of a customized FPGA design for the Amirix AP1000 PCI FPGA board coupled with a multichannel analogI/O daughter card. The design flow uses Xilinx System Generator in Matlab/Simulink for system design and test, and XilinxPlatform Studio for SoC integration. This approach has been applied to the analysis, design, and hardware implementation ofa vector controller for 3-phase AC induction motors. It also has contributed to the development of CMC’s MEMS prototypingplatform, now used by several Canadian laboratories.
Copyright © 2009 Stephane Simard et al. This is an open access article distributed under the Creative Commons AttributionLicense, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properlycited.
1. Introduction
The use of advanced control algorithms depends upon beingable to perform complex calculations within demandingtiming constraints, where system dynamics can requirefeedback response in as short as a couple tens of microsec-onds. Developing and implementing such capable feedbackcontrollers is currently a hard goal to achieve, and there ismuch technological challenge in making it more affordable.Thanks to major technological breakthroughs in recent years,and to sustained rapid progress in the fields of very largescale integration (VLSI) and electronic design automation(EDA), electronic systems are increasingly powerful [1,2]. In the latter paper, it is rightly stated that FPGAdevices have reached a level of development that putsthem on the edge of microelectronics fabrication technologyadvancements. They provide many advantages with respectto their nonreconfigurable counterparts such as the generalpurpose micropocessors and DSP processors. In fact, FPGA-based digital processing systems achieve better performance-cost compromise, and with a moderate design effort theycan afford the implementation of a powerful and flexible
embedded SoCs. Exploiting the FPGA technology benefitsfor industrial electrical control systems has been the source ofintensive research investigations during last decade in orderto boost their performances at lower cost [3, 4]. There isstill, however, much work to be done to bring such powerin the hands of control specialists. In [5], it is stated that thepotential of implementing one FPGA chip-based controllerhas not been fully exploited in the complicated motorcontrol or complex converter control applications. Untilnow, most related research works using FPGA devices arefocusing on designing specific parts mainly to control powerelectronic devices such as space vector pulse width modula-tion (SVPWM) and power factor correction [6, 7]. Usuallythese are implemented on small FPGAs while the maincontrol tasks are realised sequentially by the supervisingprocessor system, basically the DSP. Important and constantimprovement in FPGA devices, synthesis, place-and-routetools, and debug capabilities has made FPGA prototypingmore available and practical to ASIC/SoC designers thanever before. The validation of their hardware and softwareon a common platform can be accomplished using FPGA-based prototypes. Thanks to the existing and mature tools
2 EURASIP Journal on Embedded Systems
that provide automation while maintaining flexibility, theFPGA prototypes make it now possible for ASIC/SoC designsto be delivered on time at minimal budget. Consequently,FPGA-based prototypes could be efficiently exploited formotion control applications to permit an easy modificationof the advanced control algorithms through short-designcycles, simple simulation, and rapid verification. Still theimplementation of FPGA-based SoCs for motion controlresults in very complex tasks involving SW and HW skilleddevelopers. The efficient IP integration constitutes the maindifficulty from hardware perspective while in software sidethe issue is the complexity of debugging the software thatruns under real-time operating system (RTOS), in real hard-ware. This paper discusses the choice, adaptation, and use ofa rapid prototyping platform and design flow suitable for thedesign of on-chip motion controllers and other SoCs with aneed for analog interfacing. Section 2 describes the chosenprototyping platform and the methodology that supportsembedded application software coupled with custom FPGAlogic and analog interfacing. Section 3 presents the strategyfor simulating and prototyping any control algorithm usingXilinx system Generator (XSG) along with Matlab/Simulink.A vector control for induction motor is taken as a runningexample to explain some features related to the cosimulation.Section 4 describes the process of integrating the designedcontroller, once completely debugged, within an SoC archi-tecture using Xilinx Platform Studio (XPS) and targetingthe chosen FPGA-based platform. Section 5 discusses thecomplex task of PCI initialization of the analog I/O cardand controller setup by software under embedded Linuxoperating system. Section 6 takes the induction motor vectorcontrol algorithm as an application basis to demonstratethe usefulness of the chosen FPGA-based SoC platform todesign/verify on-chip motion controllers. The last sectionconcludes the paper.
2. The FPGA-Based Prototyping Platform forOn-ChipMotion Controllers
With the advent of a new generation of high-performanceand high-density FPGAs offering speeds in the 100 secondsof MHz and complexities of up to 2 megagates, the FPGA-based prototyping becomes appropriate for verification ofSoC and ASIC designs. Consequently the increasing designcomplexities and the availability of high-capacity FPGAsin high-pin-count packages are motivating the need forsophisticated boards. Board development has become a taskthat demands unique expertise. That is one reason whycommercial off-the-shelf (COTS) boards are quickly becom-ing the solution of choice because they are closely relatedto the implementation and debugging tools. During manyyears, and under its System-on-Chip Research Network(SOCRN) program, CMC Microsystems provided canadianuniversities with development tools, various DSP/EmbeddedSystems/multimedia boards, and SoC prototyping boardssuch as Amirix AP1000 PCI FPGA development platform.In order to support our research on on-chip motioncontrollers, we have managed the former plateform, a host
Analog I/Qdaughter card
AP1000 board Digital outputsfrom the FPGA
Figure 1: Rapid prototyping station equiped with FPGA board andmultichannel analog I/O daughter card.
PC (3.4 GHz Xeon CPU with 2.75 GB of RAM) equipedwith the Amirix AP1000 PCI FPGA development board,to support a multichannel analog I/O PMC daughter card(Figure 1) to communicate with exterior world.
The AP1000 has lots of features to support complexsystem prototyping, including test access and expansioncapabilities. The PCB is a 64-bit PCI card that can beinserted in a standard expansion slot on a PC motherboardor PCI backplane. Use of the PMC site requires a secondchassis slot on the backside of the board and an optionalextender card to provide access to the board I/O. The AP1000platform includes a Xilinx Virtex-II Pro XC2VP100 FPGAand is connected to dual banks of DDR SDRAM (64 MB)and SRAM (2 MB), Flash Memory (16 MB), Ethernet andother interfaces. It is configured as a single board computerbased on two embedded IBM PowerPC processors, and it isproviding an advanced design starting point for the designerto improve time-to-market and reduce development costs.
The analog electronics are considered modular, and caneither be external or included on the same chip (e.g., whenfabricated into an ASIC). On the prototyping platform,of course, they are supplied by the PMC daughter card.It is a General Standards PMC66-16AISS8A04 analog I/Oboard featuring twelve 16-bit channels: eight simultaneouslysampled analog inputs, and four analog outputs, with inputsampling rates up to 2.0 MSPS per channel. It acts as a two-way analog interface between the FPGA and lab equipment,connected through an 80-pin ribbon cable and a breakeoutboard to the appropriate ports of the power module.
The application software is compiled with the freeEmbedded Linux Development Kit (ELDK) from DENXSoftware Engineering. Since it runs under such a completeoperating system as Linux, it can perform elaborated func-tions, including user interface management (via a seriallink or through networking), and real-time supervision andadaptation of a process such as adaptive control.
The overall platform is very well suited to FPGA-in-the-loop control and SoC controller prototyping (Figure 2).The controller can either be implemented completely indigital hardware, or executed on an application-specificinstruction set processor (ASIP). The hardware approach hasa familiar design flow, using the Xilinx System Generator
EURASIP Journal on Embedded Systems 3
ACinduction
motor
Powermodule
Digitaloutputs
(PMW,etc)
RJ45 RS232Xcvr
PMC Ethernet RJ45
PCIbridge
Bridge
Bridge
Externallocal
bridge
InterruptcontrollerUART
PLB
OPB
FPGA Virtex-II Pro XC2VP100
AP1000 FPGA board
PowerPC405
SDRAMcontroller
Userlogic
Interface
Applicationsoftware + HW logic driver
under Linux
General standards PMC analog I/Q card with 12 16 bit analog
channels: 4 outputs, and 8 simultaneously sampled inputs
-
Figure 2: Architecture of the embedded platform driving a powersystem (schematic not to scale).
(XSG) blockset and hardware/software cosimulation featuresin Matlab/Simulink. An ASIP specially devised for advancedcontrol applications is currently under development withinour laboratory.
3. Matlab/Simulink/XSG Controller Design
It is well known that simulation of large systems withinsystem analysis and modelling software environments takesa prohibitive amount of time. The main advantage of a rapidprototyping design flow with hardware/software cosimula-tion is that it provides the best of a system analysis andmodelling environment while offering adequate hardwareacceleration.
Hardware/software cosimulation has been introducedby major EDA vendors around year 2000, combiningMatlab/Simulink, the computing, and Model-Based Designsoftware, with synthesizable blocksets and automated hard-ware synthesis software such as DSP Builder from Altera,and System Generator from Xilinx (XSG). Such a designflow reduces the learning time and development risk forDSP developers, shortens the path from design concept toworking hardware, and enables engineers to rapidly createand implement innovative, high-performance DSP designs.
The XSG cosimulation feature allows the user to run adesign on the FPGA found on a certain platform. An impor-tant advantage of XSG is that it allows for quick evaluationof system response when making changes (e.g., changingcoefficient and data widths). As the AP1000 is not supportedby XSG among the preprogrammed cosimulation targets, weuse the Virtex-4 ML402 SX XtremeDSP Evaluation Platforminstead (Figure 3). The AP1000 is only targetted at the SoCintegration step (see Section 4).
Figure 3: Virtex-4 ML402 SX XtremeDSP evaluation platform.
We begin with a conventional, floating-point, simu-lated control system model, and corresponding fixed-pointhardware representation is then constructed using the XSGblockset, leading to a bit-accurate FPGA hardware model(Figure 4), and XSG generates synthesizable HDL targettingXilinx FPGAs. The XSG design, simulation, and test pro-cedure is briefly outlined below. Power systems includingmotor drives can be simulated using the SimPowerSystems(SPS) blockset in Simulink.
(1) Start by coding each system module individually withthe XSG blockset.
(2) Import any user-designed HDL cores.
(3) Adjust the fixed-point bit precisions (including bitwidths and binary point position) for each XSG blockof the system.
(4) Use the Xilinx Gateway blocks to interface a floating-point Simulink model with a fixed-point XSG design.The Gateway-in and Gateway-out blocks, respec-tively, convert inputs from Simulink to XSG andoutputs from XSG to Simulink.
(5) Test system response using the same input stimuli foran equivalent XSG design and Simulink model withautomatic comparision of their respective outputs.
Commonly, software simulation of a complete drivemodel, for a few seconds of results, could take a couple ofdays of computer time. Hardware/software cosimulation canbe used to accelerate the process of controller simulation,thus reducing the computing time to about a couple of hours.It also ensures that the design will respond correctly onceimplemented in hardware.
4. System-on-Chip Integration in XilinxPlatform Studio
FPGA design and the SoC architecture are managed withXilinx Platform Studio (XPS), targetting the AP1000. Wehave customized the CMC-modified Amirix baseline design
4 EURASIP Journal on Embedded Systems
Hardwarelibrary
component
BaselineSoC
architecture
Application-specification
hardwarecomponents
Hardwaredesign flow
Functionalsimulation
Integration of thecomponentsto the SoC
Controllersynthesisin XSG
Matlab/Simulinkmodeling
Hardware/softwareco-simulation
Softwarelibraries
and drivers
Source-levelintegration
Low-levelsoftware
simulation
Application-specific code
Embedded Linuxoperation system
Softwaredesign flow
FPGA prototype
Co-
sim
ula
tion
dat
a lin
k
Figure 4: Controller-on-chip design flow.
to support analog interfacing, user logic on the ProcessorLocal Bus (PLB), and communication with applicationsoftware under embedded Linux. XPS generates the corre-sponding .bin file, which is then transferred to the Flashconfiguration memory on the AP1000. The contents of thismemory is used to reconfigure the FPGA. We have foundan undocumented fact that, on the AP1000, this approachis the only practicable way to program the FPGA. JTAGprogramming is proved inconvenient, because it suppressesthe embedded Linux, which is essential to us for PCIinitialization. Once programmed, user logic awaits a startsignal from our application software following analog I/Ocard initialization.
To accelerate the logic synthesis process, the mapper andplace and route options are set to STD (standard) in theimplementation options file (etc/fast runtime.opt), found inthe Project Files menu. If the user wants a more aggressiveeffort, these options should be changed to HIGH, whichrequires much more time. Our experiments have shown thatit typically amounts to several hours.
4.1. Bus Interfacing. The busses implemented in FPGA logicfollow the IBM CoreConnect standard. It provides masterand slave operation modes for any instanciated hardwaremodule. The most important system busses are the ProcessorLocal Bus (PLB), and the On-chip Peripheral Bus (OPB).
The implementation of the vector control schemerequires much less of generality, and deletes some commu-nication stages that might be used in other applications. It iseasier to start from such a generic design, dropping unneededfeatures, than to start from scratch. This way, one can quicklyprogress from SoC architecture in XPS down to a workingcontroller on the AP1000.
4.1.1. Slave Model Register Read Mux. The baseline XPSdesign provides the developer with a slave model register
read multiplexer. This allows to decide which data is providedwhen a read request is sent to the user logic peripheral byanother peripheral in the system. While a greater numbermay be used, our pilot application, the vector control, onlyuse four slave registers. The user logic peripheral has aspecific base address (C BASEADDR), and the four 32-bit registers are accessed through C BASEADDR + registeroffset. In this example, C BASEADDR + 0x0 correspondsto the control and status register, which is composed of thefollowing bits:
0–7 : the DIP switches on the AP1000 fordebugging purposes,
8 : used by user software to reset, start, orstop the controller,
9–31 : reserved.
As for the other 3 registers, they correspond to
C BASEADDR + 0x4: Output to analogchannel 1
C BASEADDR + 0x8: Output to analogchannel 2
C BASEADDR + 0xC: Reserved (often used fordebugging purposes)
4.1.2. Master Model Control. The master model controlstate machine is used to control the requests and responsesbetween the user logic peripheral and the analog I/O card.The latter is used to read the input currents and voltagesfor vector control operation. The start signal previouslymentioned in slave register 0 is what gets the state machineout of IDLE mode, and thus starts the data acquisitionprocess. In this specific example, the I/O card is previouslyinitialized by the embedded application software, relievingthe state machine of any initialization code. Analog I/O
EURASIP Journal on Embedded Systems 5
initialization sets a lot of parameters, including how manyactive channels are to be read.
The state machine operates in the following way(Figure 5).
(1) The user logic waits for a start signal from the userthrough slave register 0.
(2) The different addresses to access the right AIO cardfields are set up, namely, the BCR and read buffer.
(3) A trigger is sent to the AIO card to buffer the valuesof all desired analog channels.
(4) A read cycle is repeated for the number of activechannels previously defined.
(5) Once all channels have been read, the state machinefalls back to trigger state, unless the user chooses tostop the process using slave register 0.
4.2. Creating or Importing User Cores. User-designed logicand other IPs can be created or imported into the XPS designfollowing this procedure.
(1) Select Create or Import Peripheral from the Hard-ware menu, and follow the wizard (unless otherwisestated below, the default options should be accepted).
(2) Choose the preferred bus. In the case of our vectorcontroller, it is connected to the PLB.
(3) For PLB interfacing, select the following IPIF ser-vices:
(a) burst and cacheline transaction support,
(b) master support,
(c) S/W register support.
(4) The User S/W Regitser data width should be 32.
(5) Accept the other wizard options as default, then clickFinish.
(6) You should find your newly created/imported core inthe Project Repository of the IP Catalog; right clickon it, and select Add IP.
(7) Finally go to the Assembly tab in the main SystemAssembly View, and set the base address (e.g.,0x2a001000), the memory size (e.g., 512), and the busconnection (e.g., plb bus).
4.3. Instantiating a Netlist Core. Using HDL generated bySystem Generator may be inconvenient for large controlsystems described with the XSG blockset, as it can requirea couple of days of synthesis time. System Generatorcan be asked to produce a corresponding NGC binarynetlist file instead, which is then treated as a black boxto be imported and integrated into an XPS project. Thisconsiderably reduces the synthesis time needed. The processof instantiating a Netlist Core in a custom peripheral (e.g.,user logic.vhd), performed following the steps documentedin XPS user guide.
IDLE
Adressessetup
AIOtrigger
PAUSE
Start signal
Trigger ACK
All channelsread
One active channelread
Read anotheractive channel
Stop signal
Readcycle
Setup completed
BCR andstatus
Figure 5: Master model state machine.
Table 1: The Two Intel StrataFlash Flash memory devices.
Bank Address Size Mode Description
1 0x20000000 0x1000000 (16 MB) 16 Program Flash
2 0x24000000 0x1000000 (16 MB) 8 Config. Flash
Table 2: AP1000 flash configurations.
Region Bank Sectors Description
0 2 0–39 Configuration 0
1 2 40–79 Configuration 1
2 2 80–127 Configuration 2 (Default Config.)
4.4. BIN File Generation and FPGA Configuration. To config-ure the FPGA, a BIN file must be generated from the XSGproject. Since JTAG programming disables the embeddedLinux, the BIN file must be downloaded directly to onboardFlash memory. There are two Intel Strataflash Flash memorydevices on the AP1000, one for the configuration, and onefor the U-boot bootstrap code (which should not be crushed)(Table 1).
The configuration memory (Table 2) is divided into threesections. Section 2 is the default Amirix configuration, andshould not be crushed. Downloading the BIN file to memoryis done through a network cable using the TFTP protocol.For this purpose, a TFTP server must be set up on thehost PC. The remote side of the protocol is managed byU-boot on the AP1000. Commands to U-boot to initiatethe transfer and to trigger FPGA reconfiguration from adesignated region are entered by the user through a serial linkterminal program. Here is the complete U-boot commandsequence:
setenv serverip 132.212.202.166setenv ipaddr 132.212.201.223erase 2 : 0–39Send tftp 00100000 download.binSend cp.b 00100000 24000000 00500000Send swrecon
6 EURASIP Journal on Embedded Systems
5. Application Software and Drivers
One of the main advantages of using an embedded Linuxsystem is the ability to perform the complex task of PCIinitialization. In addition, it allows for application softwareto provide elaborated interfacing and user monitoringthrough appropriate software drivers. Initialization of theanalog I/O card on the PMC site and controller setup areamong such tasks that are best performed by software.
5.1. Linux Device Drivers Essentials. Appropriate devicedrivers have to be written in order to use daughter cards(such as an analog I/O board) or custom hardware com-ponents on a bus internal to the SoC, and be able tocommunicate with them from the embedded Linux. Driversand application software for the AP1000 can be developedwith the free Embedded Linux Development Kit (ELDK)from DENX Software Engineering, Germany. The ELDKincludes the GNU cross development tools, along withprebuilt target tools and libraries to support the targetsystem. It comes with full source code, including all patches,extensions, programs, and scripts used to build the tools.A complete discussion on writing Linux device drivers isbeyond the scope of this paper, and this information may befound elsewhere, such as in [8]. Here, we only mention a fewimportant issues relevant to the pilot application.
To support all the required functions when creating aLinux device driver, the following includes are needed:
#include<linux/config.h>#include<linux/module.h>#include<linux/pci.h>#include<linux/init.h>#include<linux/kernel.h>#include<linux/slab.h>#include<linux/fs.h>#include<linux/ioport.h>#include<linux/ioctl.h>#include<linux/byteorder/
big endian.h>#include<asm/io.h>#include<asm/system.h>#include<asm/uaccess.h>
5.2. PCI Access to the Analog I/O Board . The pci finddevice() function begins or continues searching for a PCIdevice by vendor/device ID. It iterates through the list ofknown PCI devices, and if a PCI device is found witha matching vendor and device, a pointer to its devicestructure is returned. Otherwise, NULL is returned. Forthe PMC66-16AISS8A04, the vendor ID is 0x10e3, and thedevice ID is 0x8260. The device must then be initialized withpci initialize device() before it can be used by the driver.The start address of the base address registers (BARs) can beobtained using pci resource start(). In the example, we getBAR 2 which gives access to the main control registers of thePMC66-16AISS8A04.
volatile u32 ∗base addr;struct pci dev ∗dev;struc resource ∗ctrl res;
dev = pci find device(VENDORID,DEVICEID, NULL);
.
.
.pci enable device (dev);get revision (dev);base addr = (volatile u32 ∗)
pci resource start (dev, 2);ctrl res = request mem region (
(unsigned long)base addr,0x80L,"control");
bcr = (u32 ∗) ioremap nocache ((unsigned long)base addr,0x80L);
The readl() and writel() functions are defined to accessPCI memory space in units of 32 bits. Since the PowerPCis big-endian while the PCI bus is by definition little-endian, a byte swap occurs when reading and writing PCIdata. To ensure correct byte order, the le32 to cpu() andcpu to le32() functions are used on incoming and outgoingdata. The following code example defines some macros toread and write the Board Control Register, to read data fromthe analog input buffer, and to write to one of the four analogoutput channels.
volatile u32 ∗bcr;
#define GET BCR() (le32 to cpu(\\readl (bcr)))
#define SET BCR(x) writel(\\cpu to le32(x), bcr)
#define ANALOG IN()le32 to cpu(\\readl (&bcr[ANALOG INPUT BUF]))
#define ANALOG OUT(x,c) writel(\\cpu to le32(x), \\&bcr[ANALOG OUTPUT CHAN 00+c])
5.3. Cross-Compilation with the ELDK. To properly compilewith the ELDK, a makefile is required. Kernel source codeshould be available in KERNELDIR to provide for essentialincludes. The version of the preinstalled kernel on theAP1000 is Linux 1.4. Example of a minimal makefile:
TARGET= thetargetOBJS= myobj.o
#EDIT THE FOLLOWING TO POINT TO#THE TOP OF THE KERNEL SOURCE TREEKERNELDIR = ∼/kernel-sw-003996-01
CC = ppc 4xx−gccLD = ppc 4xx−ld
EURASIP Journal on Embedded Systems 7
DEFINES = −D—KERNEL—−DMODULE\\−DEXPORT SYMTAB
INCLUDES= −I$(KERNELDIR)/include\\−I$(KERNELDIR)/include/Linux\\
−I$(KERNELDIR)/include/asmFLAGS =−fno-strict-aliasing \\
−fno-common\\−fomit-frame-pointer\\−fsigned-char
CFLAGS = $(DEFINES) $(WARNINGS)\\$(INCLUDES) $(SWITCHES)\\$(FLAGS)
all: $(TARGET).o Makefile
$(TARGET).o: $(OBJS)$(LD) −r -o $@$∧
5.4. Software and Driver Installation on the AP1000. For easeof manipulation, user software and drivers are best carried ona CompactFlash card, which is then inserted in the back slotof the AP1000 and mounted into the Linux file system. Thedrivers are then intalled, and the application software started,as follows:
mount /dev/discs/disc0/part1 /mntinsmod /mnt/logic2/hwlogic.oinsmo /mnt/aio.ocd /devmknod hwlogic c 254 0mknod aio c 253 0/mnt/cvecapp
6. Application: AC InductionMotor Control
Given their well-known qualities of being cheap, highlyrobust, efficient, and reliable, AC induction motors currentlyconstitute the bulk of the motion industry park. From thecontrol point of view, however, these motors have highlynonlinear behavior.
6.1. FPGA-Based Induction Motor Vector Control. Theselected control algorithm for our pilot application is therotor-flux oriented vector control of a three-phase ACinduction motor of the squirrel-cage type. It is the firstmethod which makes it possible to artificially give somelinearity to the torque control of induction motors [9].
RFOC algorithm consists in partial linearization of thephysical model of the induction motor by breaking up thestator current is into its components in a suitable referenceframe (d, q). This frame is synchronously revolving alongwith the rotor flux space vector in order to get a separatecontrol of the torque and rotor flux. The overall strategythen consists in regulating the speed while maintainingthe rotor flux constant (e.g., 1 Wb). The RFOC algorithmis directly derived from the electromechanical model ofa three-phase, Y-connected, squirrel-cage induction motor.
This is described by equations in the synchronously rotatingreference frame (d, q) as
usd = Rsisd + σLsddtisd−σLsωisq +
M
Lr
ddtΨr ,
︸ ︷︷ ︸
Dd
usq = Rsisq + σLsddtisq+σLsωisd +
M
LrωΨr ,
︸ ︷︷ ︸
Dq
ddtΨr = Rr
Lr(Misd −Ψr),
ω = Ppωr +MRr
ΨrLrisq,
dωr
dt= 3
2Pp
M
JLrΨr isq − D
Jωr − Tl
J,
(1)
where usd and usq are d and q components of stator voltageus, isd, and isq are d and q components of stator current is, Ψr
is the modulus of rotor flux modulus, and θ is the angularposition of rotor flux, ω is the synchronous angular speed ofthe (d, q) reference frame (ω = dθ/dt), and Ls, Lr , and Mare stator, rotor, and mutual inductances, Rs, Rr are statorand rotor resistances, σ is the leakage coefficient of the motor,and Pp is the number of pole pairs, ωr is the mechanical rotorspeed, D is damping coefficient, J is the inertial momentum,and Tl is torque load.
6.2. RFOCAlgorithm. The derived expressions for each blockcomposing the induction motor RFOC scheme, as shown inFigure 6, are given as follows:
Speed PI Controller:
i∗sq = kpvεv + kiv
∫
εv dt; εv = ω∗r − ωr. (2)
Rotor Flux PI Controller:
i∗sd = kp f ε f + ki f
∫
ε f dt; ε f = Ψ∗r −Ψr . (3)
Rotor Flux Estimator:
Ψr =√
Ψ2rα + Ψ2
rβ, (4)
cos θ = Ψrα
Ψr, sin θ = Ψrβ
Ψr, (5)
with
Ψrα = LrM
(Ψsα − σLsisα), Ψrβ = LrM
(
Ψsβ − σLsisβ)
, (6)
Ψsα =∫
(usα − Rsisα), Ψsβ =∫(
usβ − Rsisβ)
, (7)
8 EURASIP Journal on Embedded Systems
V
v
v
u
uspsp
spsp
sp
u
u
i
sp
i
i
i
DC
sd
sq
sd
sqal
bh
ch
bl
cl
sa
sb
sa
ah
sb
sd
sq
Decoupling
Rotorflux
estimator
Speed PIcontroller
Rotor fluxPI controller
Q-currentPI controller
D-currentPI controller
Parktransform
Speed measure
Inversetransform
park
SVPWMmodulegating
Clarketransform
Clarketransform
IM
+
−
+
+
+
+ +
++
−
−
−
ω∗r
Ψ∗r
i∗sq
i∗sd
u∗sα
u∗sβ
cos θsin θΨr
ω
ω estimator
ωr
isαisβ
usαusβ
3-φ
voltagePWM
inverter
Figure 6: Conceptual block diagram of the system.
and using Clarke transformation
isα = isa, isβ = 1√3isa +
2√3isb, (8)
usα = usa, usβ = 1√3usa +
2√3usb. (9)
To be noticed that sine and cosine, of (5), sum up to adivision, and therefore do not have to be directly calculated.
Current PI Controller:
vsd = kpiεisd + kii
∫
εisd dt; εisd = i∗sd − isd, (10)
vsq = kpiεisq + kii
∫
εisq dt; εisq = i∗sq − isq. (11)
Decoupling:
usd = σLsvsd + Dd; usq = σLsvsq + Dq, (12)
with
Dd = −σLsωisq +M
Lr
ddtΨr , Dq = +σLsωisd +
M
LrωΨr .
(13)
Omega (ω) Estimator:
ω = Ppωr +MRr
ΨrLrisq. (14)
Park Transformation:
⎡
⎣
isd
isq
⎤
⎦ =⎡
⎣
cos θ sin θ
− sin θ cos θ
⎤
⎦
⎡
⎣
isα
isβ
⎤
⎦. (15)
Inverse Park Transformation:
⎡
⎣
u∗sαu∗sβ
⎤
⎦ =⎡
⎣
cos θ − sin θ
sin θ cos θ
⎤
⎦
⎡
⎣
usd
usq
⎤
⎦. (16)
In the above equations, for x standing for any variablesuch as voltage us, current is or rotor flux Ψr , we have thefollowing.
(x∗) Input reference corresponding to x.
(εx) Error signal corresponding to x.
(kpx , kix ) Proportional and integral parameters corre-sponding to the PI controller of x.
(xa, xb, xc) a, b, and c three-phase components of x in thestationary reference frame.
(xα, xβ) α and β two-phase components of x in thestationary reference frame.
(xd, xq) d and q components of x in the synchronouslyrotating frame.
The RFOC scheme features vector transformations(Clarke and Park), 4 IP regulators, and space-vector PWMgenerator (SVPWM). This algorithm is of interest for itsgood performances, and because it has a fair level ofcomplexity which benefits from a very-high-performanceFPGA implementation. In fact, FPGAs make it possible toexecute the loop of a complicated control algorithm in amatter of a few microseconds. The first prototype of sucha controller has been developed using the method andplatform described here, and has been implemented entirelyin FPGA logic [10].
Commonly used mediums prior to the advent of today’slarge FPGAs, including the use of DSPs alone and/or special-ized microcontrollers, led to a total cycle time of more than100 μs for vector control. This lead to switching frequencies
EURASIP Journal on Embedded Systems 9
Dynamo withoptical speed
encoder
Encodercable
Resistive load
Powersupply
Cable interfaceto analog I/O card
Digital I/Ofrom FPGA
High-voltagepower module
Squirrel-cageinduction motor
Figure 7: Experimental setup with power electronics, inductionmotor, and loads.
in the range of 1–5 kHz, which produced disturbing noise inthe audible band. With today’s FPGAs, it becomes possible tofit a very large control system on a single chip, and to supportvery high switching frequencies.
6.3. Validation of RFOC Using Cosimulation with XSG. Astrong hardware/software cosimulation environment andmethodology is necessary to allow validation of the hardwaredesign against a theoretical control system model.
As mentioned is Section 3, the design flow which hasbeen adopted in this research uses the XSG blockset inMatlab/Simulink. XSG model of RFOC block is built upfrom (2) to (16) and the global system architecture is shownin Figure 8 where Gateway-in and Gateway-out blocks pro-vide the necessary interface between the fixed-point FPGAhardware that include the RFOC and Space Vector PulseWidth Modulation (SVPWM) algorithms and the floating-point Simulink blocksets mainly the SimPowerSystems (SPS)models. In fact to make the simulations more realistic, thethree-phase AC induction motor and the correspondingVoltage Source Inverter were modelled in Simulink usingthe SPS blockset, which is robust and well proven. To benoticed that SVPWM is a widely used technique for three-phase voltage-source inverters (VSI), and is well suited forAC induction motors.
At runtime, the hardware design (RFOC and SVPWM)is automatically downloaded into the actual FPGA device,and its response can then be verified in real-time against thatof the theoretical model simulation done with floating-pointSimulink blocksets. An arbitrary load is induced by varyingthe torque load variable Tl as a time function. SPS receivesa reference voltage from the control through the inversePark transformation module. This voltage consists of twoquadrature voltages (u∗sα, u∗sβ), plus the angle (sine/cosine)of the voltage phasor usd corresponding to the rotor fluxorientation (Figure 6).
6.4. Reducing Cosimulation Times. In a closed loop setting,such as RFOC, hardware acceleration is only possible as long
as the replaced block does not require a lot of steps forcompletion. If the XSG design requires more steps to processthe data which is sent than what is necessary for the next datato be ready for processing, a costly (time wise) adjustmenthas to be made. The Simulink period for a given simulatedFPGA clock (one XSG design step) must be reduced, whilethe rest of the Simulink system runs at the same speed asbefore. In a fixed step Simulink simulation environment, thismeans that the fixed step size must be reduced enough sothat the XSG system has plenty of time to complete betweentwo data acquisitions. Obviously, such lenghty simulationsshould only be launched once the debugging process isfinished and the controller is ready to be thouroughly tested.
Once the control algorithm is designed with XSG, theHW/SW cosimulation procedure consists of the following.
(1) Building the interface between Simulink and FPGA-Based Cosimulation board.
(2) Making a hardware cosimulation design.
(3) Executing hardware cosimulation.
When using Simulink environment for cosimulation, oneshould distinguish between the single-step and free-runningmodes, in order for debugging purposes, to get much shortersimulations times.
Single-step cosimulation can improve simulation timewhen replacing one part of a bigger system. This is espe-cially true when replacing blocks that cannot be nativelyaccelerated by Simulink, like embedded Matlab functions.Replacing a block with an XSG cosimulated design shifts theburden from Matlab to the FPGA, and the block no longerremains the simulation’s bottleneck.
Free-running cosimulation means that the FPGA willalways be running at full speed. Simulink will no longerbe dictating the speed of an XSG step as was the casein single-step cosimulation. With the Virtex-4 ML402 SXXtremeDSP Evaluation Platform, that step will now be a fixed10 nanoseconds. Therefore, even a very complicated systemrequiring many steps for completion should have ample timeto process its data before the rest of the Simulink system doesits work. Nevertheless, a synchronization mechanism shouldalways be used for linking the free-running cosimulationblock with the rest of the design to ensure an exterior startsignal will not be mistakenly interpreted as more than onestart pulse. Table 3 shows the decrease of simulation timeafforded by the free-running mode for the induction motorvector control. This has been implemented using XSG withthe motor and its SVPWM-based drive being modeled usingSPS blockset from Simulink. For the same precision and thesame amount of data to be simulated (speed variations overa period of 7 seconds), a single-step approach would require100.7 times longer to complete, thus being an ineffectiveapproach. A more complete discussion of our methodologyfor rapid testing of an XSG-based controller using free-running cosimulation and SPS, has been given in [11].
6.5. Timing Analysis. Before actually generating a BIT fileto reconfigure the FPGA, and whether the cosimulation isdone through JTAG or Ethernet, the design must be able to
10 EURASIP Journal on Embedded Systems
fu_in
fv_in
fw_in
fu_pul_ou
fu_pulbar_o
fv_pulbar_o
fw_pulbar_o
fv_pul_ou
fw_pul_ou
Gating
fu
fv
ua
ub
prediction_uab
fire_u
fire_v
fire_wSTART
vqs
vds
Power system blockset domain(floating point)
System generator blockset domain(fixed point)
Firing_Signals
start_contr
uA
uB
READY
spd_ref
flux_ref
isa
usa
usb
aq_done
m_w
Vector_control
Gateway Out6
Gateway Out9
Gateway Out17
Gateway Out10
Gateway Out13
Gateway Out11
Out
OutOut2
Out3Out
Out
Out
Out
vds_in
vds_in2
vds_in1
vqs_in
vqs_in1
In
In
In
In
In
wbar
vbar
ubar
Motor_Drive
is_abc
wm
Te>
volt_mea>
Sensors
In1
In2
Wref
Speed_Ref
phiref
Rotor_Flux_Ref
Syestemgenerator
Resourceestimator
Discrete,Ts = 2.5e-006 s
y
x
u
v
w
Out1
Figure 8: Indcution motor RFOC drive, as modelled with XSG and SPS blocksets.
Table 3: Simulation times and methods
Type of simulation Simulation time
Free-running cosimulation 1734 s
Single-step cosimulation 174610 s (48 hours)
run at 100 MHz (10 nanoseconds step time). As long as thedesign is running inside Simulink, there are never any issueswith meeting timing requirements for the XSG model. Oncecompleted, the design will be synthesized, and simulated onFPGA. If the user launches the cosimulation block generationprocess, the timing errors will be mentioned quite far intothe operation. This means that, after waiting for a relativelylong delay (sometimes 20–30 minutes depending on thecomplexity of a design and the speed of the host computer),the user notices the failure to meet timing requirements withno extra information to quickly identify the problem. Thisis why the timing analysis tool must always be run prior tocosimulation. While it might seem a bit time-consuming,this tool will not simply tell you that your design does notmeet requirements, but it will give you the insight requiredto fix the timing problems. The control algorithm once
being fully designed, analysed (timing wise), and debuggedthrough the aforementioned FPGA-in-the-loop simulationplatform, the corresponding NGC binary netlist file orVHDL/Verilog code are automatically generated. Thesecould then be integrated within the SoC architecture usingXilinx Platform Studio (XPS) and targetting the AP1000platform. Next section describes the related steps.
6.6. Experimental Setup. Figure 7 shows the experimentalsetup with power electronics, induction motor, and loads.The power supply is taken from a 220 V outlet. The highvoltage power module, from Microchip, is connected to theanalog I/O card through the rainbow flex cable, and tothe expansion digital I/Os of the AP1000 through anotherparallel cable. Signals from a 1000-line optical speed encoderare among the digital signals fed to the FPGA. As for theloads, there is both a manually-controlled resistive load box,and a dynamo coupled to the motor shaft.
From the three motor phases, three currents and threevoltages (all prefiltered and prescaled) are fed to the analogI/O board to be sampled. Samples are stored in an internalinput buffer until fetched by the controller on FPGA. Data
EURASIP Journal on Embedded Systems 11
exchange between the FPGA and the I/O board proceedsthrough the PLB and the Dual Processor PCI Bus Bridge toand from the PMC site.
The process of generating SVPWM signals continuouslyruns in parallel with controller logic, but the speed at whichthese signals are generated is greater than the speed requiredfor the vector control processing. As a consequence, thesetwo processes are designed and tested separately before beingassembled and tested together.
Power gating and motor speed decoding are continuousprocesses that have critical clocking constraints beyond thecapabilities of bus operation to and from the I/O board.Therefore, even though the PMC66-16AISS8A04 board alsoprovides digital I/O, both the PWM gating signals and theinput pulses from the optical speed encoder are directlypassed through FPGA pins to be processed by dedicatedhardware logic. This is done by plugging a custom-madeadapter card with Samtec CON 0.8 mm connectors into theexpansion site on the AP1000. While the vector control usesdata acquired from the AIO card through a state machine,the PWM signals are constantly fed to the power module(Figure 6). Those signals are sent directly through the generalpurpose digital outputs on the AP1000 itself instead of goingthrough the AIO card. This ensures complete control overthe speed at which these signals are generated and sentwhile targeting a specific operating frequency (16 kHz inour example). This way, the speed calculations required forthe vector control algorithm are done using precise clockingwithout adding to the burden of the state machine whichdictates the communications between FPGA and the AIOcard. The number of transitions found on the signal linesbetween the FPGA and speed encoder are used to evaluatethe speed at which the motor is operating.
6.7. Timing Issues. Completion of one loop cycle of our vec-tor control design, takes 122 steps leading to a computationtime of less than 1.5 μs. To be noticed that for a sampling rateof 32 kHz, the SVPWM signal has 100 divisions (two zonesdivided by 50), which has been chosen as a good compromisebetween precision and simulation time. The simulationfixed-step size is then 625 nanoseconds, which is alreadysmall enough to hinder the performance of simulating theSPS model. Since PWM signal generation is divided intotwo zones, for every 50 steps of Simulink operations (PWMsignal generation and SPS model simulation), the 122 vectorcontrol steps must complete. The period of the XSG—Simulink system must be adjusted in order for the XSGmodel to run 2.44 times faster than the other Simulinkcomponents. The simulation fixed-step size becomes 2.56nanoseconds, thus prolonging simulation time. In otherwords, since the SPS model and PWM signals generation takelittle time (in terms of steps) to complete whereas the vectorcontrol scheme requires numerous steps, the coupling of thetwo forces the use of a very small simulation fixedstep size.
7. Conclusion
In this paper, we have discussed our choice, adaptation,and use of a rapid prototyping platform and design flow
suitable for the design of on-chip motion controllers andother SoCs with a need for analog interfacing. It supportsembedded application software coupled with custom FPGAlogic and analog interfacing, and is very well suited to FPGA-in-the-loop control and SoC controller prototyping. Suchplatform is suitable for academia and research communautythat cannot afford the expensive commercial solutions forFPGA-in-the-loop simulation [12, 13].
A convenient FPGA design, simulation, and test proce-dure, suitable for advanced feedback controllers, has beenoutlined. It uses the Xilinx System Generator blockset inMatlab/Simulink and a simulated motor drive described withthe SPS blockset. SoC integration of the resulting controller isdone in Xilinx Platform Studio. Our custom SoC design hasbeen described, with highlights on the state machine for businterfacing, NGC file integration, BIN file generation, andFPGA configuration.
Application software and drivers development forembedded Linux are often needed to provide for PCI andanalog I/O card initialization, interfacing, and monitoring.We have provided here some pointers along with essentialinformation not easily found elsewhere. The proposed designflow and prototyping platform have been applied to theanalysis, design, and hardware implementation of a vectorcontroller for three-phase AC induction motors, with verygood performance results. The resulting computation times,of about 1.5 μs, can in fact be considered record-breaking forsuch a controller.
Acknowledgments
This research is funded by a Grant from the NationalSciences and Engineering Research Council of Canada(NSERC). CMC Microsystems provided development toolsand support through the System-on-Chip Research Network(SOCRN) program.
References
[1] “Accelerating Canadian competitiveness through microsys-tems: strategic plan 2005–2010,” Tech. Rep., CMC Microsys-tems, Kingston, Canada, 2004.
[2] J. J. Rodriguez-Andina, M. J. Moure, and M. D. Valdes,“Features, design tools, and application domains of FPGAs,”IEEE Transactions on Industrial Electronics, vol. 54, no. 4, pp.1810–1823, 2007.
[3] R. Dubey, P. Agarwal, and M. K. Vasantha, “Programmablelogic devices for motion control—a review,” IEEE Transactionson Industrial Electronics, vol. 54, no. 1, pp. 559–566, 2007.
[4] E. Monmasson and M. N. Cirstea, “FPGA design methodologyfor industrial control systems—a review,” IEEE Transactions onIndustrial Electronics, vol. 54, no. 4, pp. 1824–1842, 2007.
[5] D. Zhang, A stochastic approach to digital control design andimplementation in power electronics, Ph.D. thesis, Florida StateUniversity College of Engineering, Tallahassee, Fla, USA, 2006.
[6] Y.-Y. Tzou and H.-J. Hsu, “FPGA realization of space-vectorPWM control IC for three-phase PWM inverters,” IEEETransactions on Power Electronics, vol. 12, no. 6, pp. 953–963,1997.
12 EURASIP Journal on Embedded Systems
[7] A. de Castro, P. Zumel, O. Garcıa, T. Riesgo, and J. Uceda,“Concurrent and simple digital controller of an AC/DCconverter with power factor correction based on an FPGA,”IEEE Transactions on Power Electronics, vol. 18, no. 1, part 2,pp. 334–343, 2003.
[8] “Developing device drivers for Linux Kernel 1.4.,” Tech. Rep.,CMC Microsystems, Kingston, Canada, 2006.
[9] B. K. Bose, Power Electronics and Variable-Frequency Drives:Technology and Applications, IEEE Press, New York, NY, USA,1996.
[10] J.-G. Mailloux, Prototypage rapide de la commande vectoriellesur FPGA a l’aide des outils Simulink—System Generator, M.S.thesis, Universite du Quebec a Chicoutimi, Quebec, Canada,January 2008.
[11] J.-G. Mailloux, S. Simard, and R. Beguenane, “Rapid testingof XSG-based induction motor vector controller using free-running hardware co-simulation and SimPowerSystems,” inProceedings of the 5th International Conference on Comput-ing, Communications and Control Technologies (CCCT ’07),Orlando, Fla, USA, July 2007.
[12] C. Dufour, S. Abourida, J. Belanger, and V. Lapointe, “Real-time simulation of permanent magnet motor drive on FPGAchip for high-bandwidth controller tests and validation,” inProceedings of the 32nd Annual Conference on IEEE Indus-trial Electronics (IECON ’06), pp. 4581–4586, Paris, France,November 2006.
[13] National Instruments, “Creating Custom Motion Control andDrive Electronics with an FPGA-based COTS System,” 2006.