
Partial Reconfiguration of a CPRI Implementation on an FPGA

DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2018

ALFRED SAMUELSON

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

© Alfred Samuelson, 27 February 2018

Abstract

Utilizing Partial Reconfiguration (PR) in Field Programmable Gate Arrays (FPGAs) is a digital hardware design concept that has gained in popularity and ease of implementation over the past decades. In short, it means that a limited region of the FPGA is reconfigured during run-time depending on which logic is needed at a given time. This way, the logic utilization of the FPGA can be reduced while still maintaining the same functionality in designs where certain logic blocks are not run in parallel. For example, it has previously proven to be useful in designs containing several types of hardware accelerators which are used by a Central Processing Unit (CPU).

Common Public Radio Interface (CPRI) is a communication interface between two components of a Radio Base Station (RBS): Radio Equipment (RE) and Radio Equipment Control (REC). The specification of the interface outlines a functional split between two different layers. In this master’s thesis, the potential benefits and challenges of applying the concept of Partial Reconfiguration to a CPRI layer 2 FPGA design are investigated. Using an Intel Arria 10 development board, a platform has been designed for evaluation of relevant parameters with focus on resource utilization, bitstream file size and reconfiguration time.

The results do not show clear benefits of utilizing PR in this particular block, mainly because no large reduction in logic utilization is achieved compared to a reference implementation of the block where PR is not utilized. However, important insights for future work on PR implementation of similar circuits have been obtained.

Summary (Sammanfattning)

Partial Reconfiguration (PR) in Field Programmable Gate Arrays (FPGAs) is a design concept for digital hardware whose popularity has increased over the past decades, while the support for the methodology in the implementation tools has improved. In short, it means that a limited region of the FPGA is reconfigured depending on which logic is needed at a given time, while the rest of the design keeps running. In this way, the logic utilization of the FPGA can be reduced with maintained functionality in designs where certain logic blocks are not run in parallel. It has, for example, proven useful in designs where several different types of hardware accelerators are used by a Central Processing Unit (CPU).

Common Public Radio Interface (CPRI) is an interface for communication between components of a radio base station, typically the radio equipment and the component that controls the radio equipment. The CPRI specification outlines a functional split between two different layers. In this degree project, the potential benefits and challenges of applying the concept of Partial Reconfiguration to an FPGA design of a CPRI layer 2 circuit are investigated. A platform based on an Intel Arria 10 development board is designed in order to evaluate relevant parameters with focus on resource utilization, bitstream file size and reconfiguration time.

The results do not show clear benefits of using PR for this particular block, mainly because no large savings in logic utilization were achieved compared to a reference implementation of the same block that did not use PR. However, useful insights for future work on PR implementation of similar circuits have been gained.

Acknowledgements

The author would like to sincerely thank the thesis supervisors at Ericsson (Emil Lundqvist) and KTH (Kalle Ngo) for invaluable support, insights and interesting discussions in a wide range of areas along the course of the thesis project. A special thanks to Martin Nilsson at Ericsson for implementation specific support, and to Marten Kidd at Intel for providing the development board. To the fellow master thesis students at the Baseband Interconnect unit of Ericsson: thank you for many laughs, good conversations and interesting insights into your thesis topics. Thank you also to Pierre Rohdin and Yousaf Gulzar at Ericsson for providing the author with the opportunity to write this thesis. Last but not least, the author would like to thank the examiner Johnny Oberg for offering great input in shaping the thesis topic as well as providing the necessary prior knowledge within FPGA design through his class at the department of Electronic System Design at KTH.

Contents

1 Introduction
  1.1 Problem
  1.2 Purpose
  1.3 Relevance
  1.4 Structure of the Thesis

2 Background
  2.1 Common Public Radio Interface
    2.1.1 Overview
    2.1.2 System Description
      2.1.2.1 System Components
      2.1.2.2 Example Configurations
      2.1.2.3 Protocol Layers
      2.1.2.4 Protocol Data Planes
      2.1.2.5 Interconnection
      2.1.2.6 Signal/Data Transfer
    2.1.3 CPRI Frame Structure
    2.1.4 CPRI over Ethernet
  2.2 Partial Reconfiguration of FPGAs
    2.2.1 Overview
    2.2.2 History
  2.3 Related works and application areas
    2.3.1 PR in General
    2.3.2 Hardware Accelerators
    2.3.3 Software Defined Radio
    2.3.4 Cognitive Radio
    2.3.5 Dynamic CPRI line rate
    2.3.6 Reconfigurable Ethernet Interface
  2.4 A Note on Platform Technology
  2.5 Logic Resources available for PR on Arria 10
    2.5.1 Adaptive Logic Modules
    2.5.2 Logic Array Blocks
    2.5.3 Embedded Memory Blocks
    2.5.4 DSP Blocks

3 PR Design Work Flow on Intel Arria 10
  3.1 Special Considerations for PR Designs
    3.1.1 Defining Personas
    3.1.2 Floor planning
    3.1.3 Storage of Personas
    3.1.4 PR Control Logic
    3.1.5 Reconfiguration time
    3.1.6 Compilation
  3.2 Work Flow
    3.2.1 Planning the Design
    3.2.2 Creating PR Partition(s)
    3.2.3 Floorplanning
    3.2.4 Instantiating PR IP Core
    3.2.5 Defining Personas
    3.2.6 Creating Revisions for Personas
    3.2.7 Compiling Design
    3.2.8 Programming FPGA and memory

4 CPRI L2 Block
  4.1 Function in top CPRI block
  4.2 Architecture
    4.2.1 SW Control Register
    4.2.2 Sync
    4.2.3 Service Interface Bridges
    4.2.4 CPRI Control
    4.2.5 CPRI RX
    4.2.6 CPRI TX
    4.2.7 Gearbox
    4.2.8 Reset Ctrl
  4.3 Creation of PR Personas

5 Implemented PR Design
  5.1 Development Board
  5.2 Architectural Design
    5.2.1 Overview
    5.2.2 PR Region
    5.2.3 Partial Reconfiguration IP Block
    5.2.4 Flash Memory Interface
    5.2.5 DMA/PR Controller
    5.2.6 Top Entity
  5.3 Implementation and Data Collection

6 Results
  6.1 Resource Utilization
    6.1.1 Visualization of Results
    6.1.2 Quartus Chip Planner Views
  6.2 Bitstream File Size
  6.3 Reconfiguration Time

7 Analysis
  7.1 Challenges Encountered During PR Design Development
    7.1.1 Persona Storage and Memory Access
    7.1.2 PR Control Logic
    7.1.3 Complexity of PR Design Work Flow
  7.2 Analysis of Results
    7.2.1 Memory Allocation
    7.2.2 Reconfiguration time
    7.2.3 Logic Utilization
  7.3 A Closer Look at the Sub Blocks

8 Conclusions
  8.1 Insights and suggestions for further work
    8.1.1 Port overhead
    8.1.2 PR applicability
    8.1.3 Constraining PR Region
    8.1.4 Handshaking functionality
    8.1.5 Memory interface

List of Figures

2.1 CPRI system example
2.2 Several REs connected to one REC (star topology)
2.3 Several RECs serving one RE
2.4 Several REs cascaded (chain topology)
2.5 REC/RE tree topology
2.6 REC/RE ring topology
2.7 Overview of the CPRI frame substructures
2.8 An overview horizontal slice of an Arria 10 FPGA
2.9 ALM architecture overview

4.1 CPRI data flow through different layers
4.2 Block diagram of CPRI L2 block

5.1 Overview of the system implemented during this thesis project
5.2 Overview of the DMA/PR IP block designed for this project

6.1 Resource utilization diagram
6.2 Chip Planner overview
6.3 Chip Planner detailed view
6.4 Bitstream file size statistics
6.5 Reconfiguration time with a PR region measuring 30x40 cells
6.6 Reconfiguration time with a PR region measuring 80x15 cells
6.7 Reconfiguration time with a PR region measuring 50x12 cells

List of Tables

2.1 Word length and Control word length for different line rates
4.1 Overview of the PR personas and their settings
5.1 Settings when instantiating the PR IP component in Qsys Pro 16.0
6.1 Bitstream file size statistics

List of Abbreviations

ASIC      Application Specific Integrated Circuit
AVMM      Avalon Memory-Mapped
BER       Bit Error Rate
CBR       Constant Bit Rate
CFI       Compact Flash Interface
CoE       CPRI over Ethernet
CPRI      Common Public Radio Interface
CPU       Central Processing Unit
CR        Cognitive Radio
DPD       Digital Pre-Distortion
DMA       Direct Memory Access
FEC       Forward Error Correction
FFT       Fast Fourier Transform
FPGA      Field Programmable Gate Array
GSM       Groupe Spécial Mobile
IP        Intellectual Property
L2        Layer 2
LTE       Long Term Evolution
MC-CDMA   Multi Carrier Code Division Multiple Access
OFDM      Orthogonal Frequency Division Multiplex
PSK       Phase Shift Keying
QAM       Quadrature Amplitude Modulation
RAN       Radio Access Network
RAT       Radio Access Technology
RE        Radio Equipment
REC       Radio Equipment Control
RX        Receive
SDR       Software Defined Radio
SNR       Signal-to-Noise Ratio
TCL       Tool Command Language
TX        Transmit
UMTS      Universal Mobile Telecommunications System

Chapter 1

Introduction

The concept of Partial Reconfiguration (PR) is a rather intuitive utilization of the reprogrammability attribute of Field Programmable Gate Arrays (FPGAs). The idea is to alter the configuration of the FPGA during runtime depending on the required functionality at a given moment [1]. This is particularly useful when only certain blocks are active for each operating mode of a device. Potential benefits of PR include the reduction of size, cost and power consumption of the FPGA.

1.1 Problem

Ericsson’s current Baseband FPGA designs are based on an “All features approach”, meaning that all functionality that could potentially need to be accessed in the field for a certain FPGA is included in the design. The Common Public Radio Interface (CPRI) [2] links in the baseband circuits can be taken as an example. These need to be able to operate at different line rates (and sometimes use different protocols) depending on the radio unit on the other side of the communication line, which gives rise to physically very large FPGA designs with drawbacks including:

• Long compile times

• High effort required to achieve timing closure

• High power consumption

• High hardware cost, since large, premium FPGAs that are generally not produced in large quantities have to be purchased from vendors.

1.2 Purpose

The purpose of this thesis is to evaluate the potential benefits and challenges of using Partial Reconfiguration in FPGA design, as well as to document the design flow for a particular hardware platform. The main focus will be on Ericsson’s baseband designs and work flow, particularly concerning the CPRI links. These are a good candidate for evaluation of the PR concept in terms of features and usage (see Section 2.1.2.6). The purpose is not to achieve a PR-based CPRI design actually running in hardware, but rather to compare relevant output from the design tool. The thesis mainly focuses on the following metrics: reconfiguration time, logic utilization and bitstream file size. Other metrics such as power consumption and effects on efforts for timing closure would also be of interest but were deemed outside of the scope of this project since they could not be analyzed from the compilation reports.

The documentation of the hands-on experience of the design flow is intended for later use as internal reference at Ericsson.

1.3 Relevance

The literature study outlined in Section 2.3 has shown that the aim of this thesis, namely to utilize PR in CPRI line rate configuration, fills a gap in the current state of the art of radio communication equipment. As can be seen in that section, PR has been applied in many other application areas related to CPRI within telecommunication. However, no other study has so far (to the knowledge of the author) explored a PR implementation of CPRI.

1.4 Structure of the Thesis

This thesis is structured as follows:

• Chapter 2 provides insight into topics that are necessary to understand in order to follow the rest of the thesis properly, namely CPRI, PR, Related Works and information about the logic resources available for PR on Arria 10 devices.

• Chapter 3 goes on to explain the PR work flow on Intel Arria 10 devices. Design considerations as well as a step-by-step presentation of the flow are given here.

• Chapter 4 goes into more detail about the logic block chosen for study during this thesis, namely Layer 2 of the CPRI block.

• Chapter 5 describes the implemented PR design.

• Chapter 6 outlines the results of the measurements done on the implemented design.

• Chapter 7 analyses the results and the PR work flow.

• Chapter 8 provides conclusions as well as suggestions for future work.

Chapter 2

Background

2.1 Common Public Radio Interface

2.1.1 Overview

CPRI is an industry cooperation between several major telecom equipment companies.∗ The purpose of the cooperation is to provide a standardized serial communication interface between the Radio Equipment (RE) and Radio Equipment Control (REC) in Radio Base Stations. RE and REC are often separated due to their different roles in the system, as well as the fact that they may be located at different geographical locations depending on the system architecture. In short, the most common design approach (at least historically) is that the REC handles the radio functions of the digital baseband domain and the RE handles the analog radio frequency functions. Thus, for a given application, it might be desirable to use RE and REC from different vendors and/or different generations of technology. A standardized communication interface enables such flexibility as well as independent technology evolution of REC and RE.

∗ Ericsson, Huawei, NEC, Alcatel and Nokia

2.1.2 System Description

2.1.2.1 System Components

[Figure 2.1: block diagram. The REC (with Network Interface) and the RE (with Antenna Interface) each contain Layer 1 and Layer 2 with Control & Mgmt., Sync. and User access points; the two units are joined by the interface labelled "Digitized Radio Base Station Internal Interface Specification".]

Figure 2.1: Overview of a system containing an RE and an REC unit as well as a CPRI link connecting them.

As stated in the CPRI Specification [3], “the RE provides the analogue and radio frequency functions such as filtering, modulation, frequency conversion and amplification.” The REC, on the other hand, “is concerned with the Network Interface transport, the Radio Base Station control and management as well as the digital baseband processing.” The CPRI specification covers only the point-to-point communication interface between two nodes, which can be of either the same or different type (RE or REC). However, each Radio Base Station must contain at least one of each type. The interface specification enables a variety of different parallel as well as chained topologies of two or more nodes.

2.1.2.2 Example Configurations

The CPRI specification [3] outlines a number of reference configurations for interconnection of RECs and REs, of which some examples are given here in Figures 2.2 through 2.6. The most basic configuration, namely a point-to-point connection between an REC and an RE, can be seen in Figure 2.1. It should be noted that even though a limited number of configurations are explicitly given in the specification, no other type of configuration is precluded given that the implemented system components have sufficient functionality.

Figure 2.2: Several REs connected to one REC (star topology)

Figure 2.3: Several RECs serving one RE

Figure 2.4: Several REs cascaded (chain topology). Cascading of RECs is also possible.

Figure 2.5: Tree topology

Figure 2.6: Ring topology

2.1.2.3 Protocol Layers

As visualised in Figure 2.1, the CPRI interface specifies two layers. Layer 1 is the physical layer and defines properties such as electrical and optical characteristics, time multiplexing of the different data flows and low level signalling. Layer 2 is a data link layer and defines media access control, flow control and data protection of the control and management information flow. The main motivation for the CPRI specification focusing on these two hardware dependent layers was to ensure hardware compatibility in order to facilitate independent technology evolution on both sides of the interface while not limiting product differentiation in higher and parallel layers of the RE or REC as a whole [3].

2.1.2.4 Protocol Data Planes

The data flow is divided into four groups:

Control Plane: Contains control data for call processing

Management Plane: Contains management information for the CPRI link system itself

User Plane: The actual user data that is to be transferred/received by the Radio Base Station (usually in the form of IQ data)

Synchronization: Used for synchronization and timing between nodes

These data flows are time division multiplexed by Layer 1 when sent over the CPRI link. The connections going into the top of Layer 2 in Figure 2.1 are called Service Access Points (SAPs). These can be used as reference points for performance measurements. As can be seen, the Control and Management data planes share a SAP while the other data planes have one each.

2.1.2.5 Interconnection

CPRI supports optical as well as electrical interconnection. A CPRI link is a bidirectional interface between two directly connected ports, using one transmission line per direction. One port acts as master and the other as slave. In the case of a single connection between an REC and an RE, the REC port shall act as master.

2.1.2.6 Signal/Data Transfer

CPRI is a serial interface that uses time division multiplexing of the different data flows. A wide range of line bit rates, from 614.4 Mbit/s to 24.3 Gbit/s, is included in version 7.0 of the specification. However, the specification only requires that each CPRI compliant RE or REC support at least one of the available line bit rates. Furthermore, two different bit rates are available for the Control and Management channel: a slower one adhering to the High-Level Data Link Control (HDLC) protocol and a faster one adhering to the Ethernet protocol. The actual bit rate of the Control and Management channel will depend on the line bit rate.

On startup, the two nodes perform negotiation in order to synchronize and agree on protocol parameters, starting with the line bit rate controlled by Layer 1 and moving up to higher-level parameters. It is necessary that each node in the link supports at least one protocol configuration that is compatible with at least one on the other side.
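The compatibility requirement can be illustrated with a minimal sketch. It assumes that each node simply advertises a set of supported line bit rates and that the highest common rate is preferred; the actual CPRI start-up procedure is a timed state machine and is not modelled here. The function name and the example capability lists are hypothetical.

```python
from typing import Iterable, Optional


def negotiate_line_rate(master_rates: Iterable[float],
                        slave_rates: Iterable[float]) -> Optional[float]:
    """Return the highest line bit rate (Mbit/s) that both nodes support.

    Returns None when the two nodes have no rate in common, in which
    case no link can be established.
    """
    common = set(master_rates) & set(slave_rates)
    return max(common) if common else None


# Hypothetical capability lists in Mbit/s (illustration only).
rec_rates = [2457.6, 4915.2, 9830.4]
re_rates = [1228.8, 2457.6, 4915.2]

print(negotiate_line_rate(rec_rates, re_rates))
```

Choosing the highest common rate is only one possible policy; the point of the sketch is that a link can be established only if the intersection of the two capability sets is non-empty.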

Note: Since the line bit rate is controlled by the hardware layers described in the CPRI specification, partially separate hardware circuits are necessary for each line bit rate supported by the RE or REC. This is a potential area of improvement using Partial Reconfiguration of CPRI links implemented in an FPGA; instead of loading all the necessary logic into the FPGA, only the logic corresponding to the current line bit rate is loaded during negotiation as well as operation.

2.1.3 CPRI Frame Structure

CPRI is a Constant Bit Rate (CBR) protocol, meaning that data is sent continuously at a fixed interval. This differs from, for example, Ethernet, which is a packet-based protocol that sends data sporadically and at different time intervals depending on the workload. Defining CPRI as a CBR protocol makes it easier to adhere to the strict timing and synchronization requirements outlined in [3], which are stipulated in order to ensure robust streaming of the IQ data between the REC and RE.

CPRI data is transmitted serially and arranged in a hierarchical frame structure with three levels: Basic frames, Hyperframes and CPRI frames. The structure and timing are designed to match the LTE frame structure in order to simplify the translation and encapsulation of LTE data, but CPRI can also be used for other Radio Access Technologies (RATs). The frame structure is visualised in Figure 2.7.

[Figure 2.7: hierarchy diagram. A CPRI frame contains hyperframes #0 to #149; a hyperframe contains basic frames #0 to #255; a basic frame contains 16 words, the first of which is the control word. The number of bytes per word depends on the line rate.]

Figure 2.7: Overview of the CPRI frame substructures.

• A basic frame consists of 16 words with word length depending on the line rate (see Table 2.1). Creation and transmission of one basic frame to the other side of the CPRI link is completed once every Tc = 1/3.84 MHz ≈ 260.417 ns. This is based on the Universal Mobile Telecommunications System (UMTS) clock rate, which is 3.84 MHz. For example, this value of Tc is suitable for transporting one Fast Fourier Transform (FFT) sample for an LTE channel bandwidth of 2.5 MHz, as outlined in [4].
  It should be noted that the data is transmitted serially on a single port bit by bit and byte by byte. Thus, the bytes within a word organized vertically in Figure 2.7 are transmitted in serial within Tc, and higher line rates allow for more bytes to be transferred during the same time period.

• A hyperframe consists of 256 basic frames, and thus creation and transmission of one hyperframe is completed once every 256 × Tc ≈ 66.67 µs.

• A CPRI frame consists of 150 hyperframes, which means that creation and transmission of one CPRI frame is completed every 10 ms. This corresponds to one LTE frame.
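The three frame periods in the list above all follow from the 3.84 MHz clock rate and the frame counts 256 and 150. The short calculation below is a sketch that derives them with exact rational arithmetic; the constant and variable names are chosen for illustration only.

```python
from fractions import Fraction

CHIP_RATE_HZ = 3_840_000              # UMTS clock rate, 3.84 MHz
BASIC_FRAMES_PER_HYPERFRAME = 256
HYPERFRAMES_PER_CPRI_FRAME = 150

# Exact periods in seconds, kept as fractions to avoid rounding drift.
t_c = Fraction(1, CHIP_RATE_HZ)                    # one basic frame
t_hyper = BASIC_FRAMES_PER_HYPERFRAME * t_c        # one hyperframe
t_frame = HYPERFRAMES_PER_CPRI_FRAME * t_hyper     # one CPRI frame

print(f"Basic frame: {float(t_c) * 1e9:.3f} ns")
print(f"Hyperframe:  {float(t_hyper) * 1e6:.2f} us")
print(f"CPRI frame:  {float(t_frame) * 1e3:.0f} ms")
```

Because 256 × 150 = 38 400 basic frames at 3.84 MHz take exactly 1/100 s, the CPRI frame period comes out as exactly 10 ms, which is why it lines up with the LTE frame.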

Table 2.1: Word length and control word length according to the CPRI specification [4] for different line rates

CPRI line bit rate (Gbit/s)   Word length T (bits)   Control word length Tcw (bits)
0.6                           8                      Tcw = T
1.2                           16                     Tcw = T
2.5                           32                     Tcw = T
3.1                           40                     Tcw = T
4.9                           64                     Tcw = T
6.1                           80                     Tcw = T
8.1, 9.8                      128                    Tcw = T
10.1                          160                    Tcw = 128
12.1                          192                    Tcw = 128
24.3                          384                    Tcw = 128

A basic frame always consists of 16 words, but the word length varies depending on the line rate. Table 2.1 shows the word length for the different line rates supported in version 7.0 of the CPRI specification. The first word of each basic frame is a designated control word, meaning that each hyperframe contains 256 control words. These are organized into different sub-channels used to carry the Control & Management as well as Synchronization data. The remaining 15 words of each basic frame are used to carry User plane IQ data.
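The relation between word length and line bit rate can be checked with a small calculation. The sketch below assumes, based on the CPRI specification rather than on the text above, that the serial line rate equals 16 words × T bits × 3.84 MHz multiplied by the line coding overhead (8B/10B for the rates up to 9.8 Gbit/s except 8.1 Gbit/s, 64B/66B for 8.1 Gbit/s and for 10.1 Gbit/s and above). The function and constant names are hypothetical.

```python
from fractions import Fraction

WORDS_PER_BASIC_FRAME = 16
BASIC_FRAME_RATE_HZ = 3_840_000       # one basic frame per 1/3.84 MHz

# Line coding overhead factors (assumption taken from the CPRI specification).
CODING_8B10B = Fraction(10, 8)
CODING_64B66B = Fraction(66, 64)


def line_rate_mbps(word_length_bits: int, coding: Fraction) -> float:
    """Serial line bit rate in Mbit/s for a given word length T and line coding."""
    payload_bps = WORDS_PER_BASIC_FRAME * word_length_bits * BASIC_FRAME_RATE_HZ
    return float(payload_bps * coding / 1_000_000)


# A few rows of Table 2.1: (T, coding)
for t_bits, coding in [(8, CODING_8B10B), (128, CODING_8B10B),
                       (128, CODING_64B66B), (384, CODING_64B66B)]:
    print(f"T = {t_bits:3d} bits -> {line_rate_mbps(t_bits, coding):.2f} Mbit/s")

# Share of each basic frame left for User plane IQ data (15 of 16 words).
iq_share = Fraction(WORDS_PER_BASIC_FRAME - 1, WORDS_PER_BASIC_FRAME)
```

Under these assumptions the two table rows that share T = 128 (8.1 and 9.8 Gbit/s) differ only in line coding, and 15/16 of the payload is available for IQ data regardless of line rate.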

2.1.4 CPRI over Ethernet

CPRI has been a successful industry cooperation and standardization project, allowing for interconnection between radio equipment from different vendors. However, one idea that has been suggested (for instance in [5]) in order to bring down the cost of both deployment and management as well as to offer more configuration flexibility is to encapsulate the CPRI protocol within a physical Ethernet connection. This approach is called CPRI over Ethernet (CoE). The physical interfaces for Ethernet are more widely available and less costly than the current CPRI interfaces. However, this approach introduces some considerations, especially when it comes to whether or not the jitter can be kept low enough to adhere to the CPRI standard. In one study from 2015, the feasibility of CoE is assessed [6]. The results in that report are not quite conclusive, but a later study which came out in 2017 argues strongly that it is indeed possible to meet the necessary requirements [7].

2.2 Partial Reconfiguration of FPGAs

2.2.1 Overview

Field Programmable Gate Arrays (FPGAs) are reprogrammable digital electronic hardware chips. Development of digital electronic circuits usually involves writing Hardware Description Language (HDL) code, which can then either be implemented by manufacturing a static Application Specific Integrated Circuit (ASIC) or synthesized to configure an FPGA. ASICs offer lower power consumption, more compact design and lower cost per chip when manufactured on a large scale. However, the development cost is very high and it is not possible to update the deployed hardware after manufacturing.

The reprogrammability of FPGAs has made them popular as prototyping and educational tools. They can also be placed together with ASICs on circuit boards in order to offer some added flexibility and ability to update parts of the design later on. Furthermore, if a digital circuit is not going to be manufactured on a large scale, full FPGA implementation can often yield a lower total cost.

The concept of Partial Reconfiguration (PR) means that only a small region of the FPGA chip is reconfigured. Partial Reconfiguration often also implies Run-Time Reconfiguration (RTR), meaning that the PR region is reprogrammed while the rest of the design (the static region) is still running. Another term sometimes used is Dynamic Partial Reconfiguration (DPR). In this report, PR will be used and will also imply RTR.

PR makes it possible to utilize the full potential of the reprogrammability feature of FPGAs. Both the duration and the power consumption of the reconfiguration phase are greatly reduced compared to reconfiguring the whole chip in order to alter the functionality. Furthermore, there is no period of time when the whole chip is unavailable, which greatly simplifies the integration with other components since the static region can handle the necessary communication mechanisms and signalling during the reconfiguration phase. With PR, it is possible to construct hardware circuits that adapt to dynamic conditions. It combines the flexibility of software with the speed and reliability of hardware.

Some FPGA designs are more suitable for PR implementation than others. In general, designs that offer a range of different modes of operation, and where different logic is used for the different modes, have the highest potential when it comes to resource savings using PR. Drawbacks such as overhead in terms of supporting logic and reconfiguration time (see Section 3) must be taken into consideration when deciding whether to use PR in an implementation or not.
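The trade-off between resource savings and supporting-logic overhead can be illustrated with a rough estimate. The following Python sketch is illustrative only: the function name, the ALM counts and the overhead figures are hypothetical and not taken from this thesis. It assumes that a flat design must hold the logic of every mode plus some switching logic, while a PR design holds only the largest persona plus the supporting PR logic.

```python
def alm_savings(persona_alms, switch_overhead_alms, pr_overhead_alms):
    """Return (flat, pr, saved_fraction) for a simple ALM-count model.

    persona_alms: ALM count of the logic needed for each mode (hypothetical).
    switch_overhead_alms: extra logic for mode switching in the flat design.
    pr_overhead_alms: extra logic for PR control in the PR design.
    """
    flat = sum(persona_alms) + switch_overhead_alms   # all modes present at once
    pr = max(persona_alms) + pr_overhead_alms         # region sized for the largest persona
    return flat, pr, (flat - pr) / flat


# Hypothetical numbers: three modes of similar size.
flat, pr, saved = alm_savings([4000, 3500, 3000], 200, 700)
print(flat, pr, f"{saved:.1%}")
```

With a single mode the same model yields a negative saving, which reflects the remark that the overhead has to be weighed against the benefit.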

2.2.2 History

Ideas about reconfigurable computing hardware can be found in publications written as early as [8] and [9], both from 1978. In those articles, the idea was to increase computation efficiency when the word length of the input was less than the full word length of the processor. Although the idea of dynamically adaptive hardware is related to PR, the actual implementation differed from the method discussed in this thesis.

During the last few decades, FPGA technology and PR capability have developed rapidly along with the rest of the electronic industry. During 2000-2010, a lot of research was conducted in order to develop reliable implementation techniques and design frameworks for FPGA designs utilizing PR. A few examples of publications from that time period are:

• J.H. Pan et alias’ IEEE conference paper from 2004 about a technique for compressing the bitstream used to program the PR region [10].

• The article from 2005 by Cindy Kao in Xilinx’s journal XCell about general benefits of PR [11].

• S. Liu et alias’ technical report from 2009 about an approach to reducing the reconfiguration time overhead by utilizing a Direct Memory Access (DMA) streaming engine [12].

• M. Liu et alias’ conference paper from 2009 which is also about reducing the reconfiguration time overhead but this time by utilizing parallel PR regions and virtual configurations, albeit at the cost of added design size [13].

• Another conference paper from 2009 headlined by M. Liu but with different co-authors, describing a design framework that takes aspects such as hardware processes, system interconnections, Operating Systems (OS), device drivers, scheduler software and context switching into consideration [14].

For quite some time, PR was not supported by all FPGA vendors. Furthermore, up until some years ago the design process in the tools available was rather cumbersome. During the second half of that decade, more reports that targeted actual use cases rather than development of the concept of PR itself started surfacing. A few examples targeting the telecom industry are [15], [16] and [17]. The use cases mentioned in these reports will be elaborated on in Section 2.3.

2.3 Related works and application areas

2.3.1 PR in General

One significant piece of work is the PhD thesis [18] from 2011 by Dr. Ming Liu. It contains a thorough investigation of design methodology, application areas, potential and limitations of PR. The main focus is to develop hardware for particle physics experiments, but the paper also includes case studies and designs that can be useful in other application areas. For example, a PR design including one configuration acting as a controller for an external SRAM memory and another acting as a controller for an external flash memory was implemented. This is a somewhat similar use case to the one investigated in this thesis, since they both discuss peripheral communication operating exclusively in a single mode at any given time. Liu’s results in regards to resource savings are encouraging. For example, 43.7% of the 4-input Look-Up Tables (LUTs) were saved when using PR instead of a static design implementation containing both controllers and switching mechanisms between the two. One significant difference between the approach in [18] and the approach in this thesis is that software was used for handling the reconfiguration flow in the former, while the latter aims to implement the whole reconfiguration flow in hardware. Some of the articles published during the work on the PhD thesis are [14], [12] and [13].

Another PhD thesis which has significantly contributed to the knowledge in the field is [19] from 2015 by Dr. Byron Navas. Here, a platform for easy implementation of PR designs is described and applied to interesting areas such as self-healing and cognitive Systems on Chip (SoC). A shorter insight into the RecoBlock system is provided in [20].

In the following subsections, previous work in common application areas for PR will be reviewed. For the interested reader, some articles that investigate PR in application areas that are intriguing but not quite connected to the focus of this thesis are: [21] (FPGA debugging), [22] (Artificial Neural Networks), [23] (Security issues when PR is controlled remotely) and [24] (fault tolerance for space applications).

2.3.2 Hardware Accelerators

One common application area for PR is custom, time-multiplexed hardware accelerators for SoCs. The idea is quite alluring: to be able to access a hardware accelerator customized for many different types of processes while maintaining a small circuit footprint. In this application area however, the reconfiguration time versus possible speedup is essential when it comes to how beneficial the PR technique is. Quite intuitively, PR is more beneficial in cases where many operations on the same type of accelerator are performed in a row than when context switching is frequent. [25] is one example of a report where those kinds of trade-offs are investigated for different use cases and architectures.
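The intuition about consecutive operations can be stated as a break-even condition. Under the simplifying assumption that an operation takes t_sw in software and t_hw on the accelerator, and that loading the accelerator costs t_rc once, using PR pays off for n consecutive operations when t_rc + n * t_hw < n * t_sw. The Python sketch below computes the smallest such n. The function name and all timing values are hypothetical and only serve as an illustration; times are given as integers in microseconds to keep the arithmetic exact.

```python
def break_even_ops(t_reconfig_us, t_sw_op_us, t_hw_op_us):
    """Smallest number of consecutive operations for which loading the
    accelerator is faster than staying in software, or None if it never is.

    All arguments are integer times in microseconds (hypothetical values).
    """
    gain = t_sw_op_us - t_hw_op_us        # time saved per accelerated operation
    if gain <= 0:
        return None                        # the accelerator is not faster at all
    # Need n * gain > t_reconfig, i.e. n > t_reconfig / gain.
    return t_reconfig_us // gain + 1


# Hypothetical case: 10 ms reconfiguration, 100 us in software, 20 us in hardware.
print(break_even_ops(10_000, 100, 20))
```

Frequent context switching corresponds to a small n between reconfigurations, which is why it erodes the benefit in this model.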

Since the use case studied in this thesis does not involve frequent context switching, the timing constraints differ from when PR is used to implement hardware accelerators. The reconfiguration time is certainly of interest in order to minimize system downtime, but context switching will occur no more frequently than at intervals on the order of seconds due to the line rate negotiation protocol of CPRI links.∗ However, hardware accelerators are frequently used in the closely related field of Software Defined Radio (SDR), see Section 2.3.3 below.

2.3.3 Software Defined Radio

SDR is a concept where certain radio functionality such as modulation, demodulation, encoding and decoding is handled by software rather than dedicated hardware. It has become increasingly popular due to its flexibility, the increased number of diverse radio protocols and the reduced size but increased computational power of CPUs. Since partially reconfigurable FPGAs offer the opportunity to load custom hardware accelerators depending on the current protocol while maintaining a compact design, it is a popular implementation technique for SDR.

In 2008, E.J. McDonald wrote an article [26] which has been cited by several later works in the area. Aside from describing aspects of the PR design method in general, it more specifically discusses the feasibility and benefits of using PR within SDR. An example is given of a simplex transceiver architecture where the Forward Error Correction (FEC) block is reconfigured during runtime. This work showed that even though the vendor tools for PR implementation were still under development, the technology had matured to a level where it was feasible to include the PR design method in industrial products and not just in academic experiments. Another article from around the same time discussing similar topics is [16] by Delahaye et al., published in 2007.

∗ On startup, a line rate negotiation takes place between RE and REC where a new line rate is tested if synchronization has not been reached within a pre-determined time limit. The exact time limit is allowed to vary between 0.9-1.1 seconds depending on the implementation [3].

In the last few years, there have been several case studies implementing different functional blocks and hardware accelerators for SDR by using PR. A few examples are:

• A.M. Lalge et alias’ conference paper from 2015 discussing Phase Shift Keying (PSK) Modems [27].

• A. Hassan et alias’ IEEE conference paper from 2015 reviewing the performance of different techniques used for programming the PR region applied to the use case of a Convolutional Encoder [28].

• A.K. Nahar et alias’ journal article from 2017 discussing Multi Carrier Code Division Multiple Access (MC-CDMA) [29].

One researcher who has published many case studies implementing different SDR function blocks using PR is Arun Kumar from the Centre for Development of Advanced Computing in Trivandrum, India. These include [30] about variable Quadrature Amplitude Modulation (QAM) modes, [31] about Digital Pre-Distortion (DPD), [32] where an OFDM transmitter is implemented with the use of a partially reconfigurable IFFT module, and lastly [33] which is another study of the implementation of PSK modems using PR added to the one in [27].

To the author’s knowledge, there has been no work published about CPRI implementation using PR. Since CPRI is intended as more of a communication interface between the RE (possibly utilizing SDR) and the REC, it does not necessarily fall under the SDR function block category. However, it does operate in close conjunction with the SDR blocks and shares applicable demands such as timing and space occupation constraints. Hence, the above mentioned works are highly relevant to this thesis, which in turn is relevant to the design of both software defined and hardware based radio systems.

2.3.4 Cognitive Radio

One field that can be seen as an evolution of SDR is Cognitive Radio (CR). The idea is that the radio system changes its parameters and operating modes depending on sensed dynamic network and user conditions such as Bit Error Rate (BER), Signal-to-Noise Ratio (SNR) and channel occupancy. The benefits of implementing PR in CR applications are similar to those for SDR, but the reconfiguration time overhead becomes more crucial in the case of CR due to the increasingly dynamic nature of the system. Articles exploring the CR concept can be found in [17], [34] and [35]. Furthermore, many of the studies regarding PR in SDR are relevant to CR as well. For example, [36] regarding OFDM in CR is closely related to [30], which has already been cited in Section 2.3.3.

2.3.5 Dynamic CPRI line rate

In addition to the CoE approach mentioned in Section 2.1.4 and in the same spirit as CR, a concept that has been explored is to dynamically alter the CPRI line rate of a connection depending on the current network demand. Currently, the CPRI standard in [3] defines a constant line bit rate after the initial start-up negotiation. This means that it has to be configured to be able to handle the worst-case scenario of network load and bandwidth. To reduce power consumption and operational cost, the possibility of reconfigurable CPRI line rate would be desirable. In [37] and [38], this idea combined with CoE is explored. The line rate is configured by sending a RESET message which re-initiates the line rate negotiation process. If dynamic CPRI line rate were implemented in future RANs, it would further increase the relevance and potential benefit of implementing CPRI (and more specifically the line rate configuration blocks) using PR.
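The line rate negotiation that a RESET message re-initiates can be modelled very roughly as a loop over candidate rates with a per-rate time limit. The Python sketch below is a simplified illustration under stated assumptions, not the procedure defined in [3]: the function name, the order of the candidate rates, the callback `sync_time_for` and the example timings are all hypothetical. Only the idea that a new rate is tried when synchronization is not reached within a time limit of roughly one second is taken from the text.

```python
def negotiate_line_rate(candidate_rates, sync_time_for, timeout_s=1.0):
    """Try each candidate rate in turn (simplified model).

    sync_time_for(rate) returns the time in seconds needed to synchronize at
    that rate, or None if synchronization is never reached (hypothetical hook).
    Returns (selected_rate, total_elapsed_seconds); the rate is None on failure.
    """
    elapsed = 0.0
    for rate in candidate_rates:
        t_sync = sync_time_for(rate)
        if t_sync is not None and t_sync <= timeout_s:
            return rate, elapsed + t_sync
        elapsed += timeout_s              # time limit expired, move to the next rate
    return None, elapsed


# Hypothetical peer that only synchronizes at 2457.6 Mbit/s, after 0.2 s.
peer = lambda rate: 0.2 if rate == 2457.6 else None
print(negotiate_line_rate([9830.4, 4915.2, 2457.6], peer))
```

In a PR-based design, each step to a new candidate rate is a point where a different persona might have to be loaded, so the per-rate time limit bounds the reconfiguration time that can be tolerated in this model.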

2.3.6 Reconfigurable Ethernet Interface

The most closely related implementation case to this thesis that has been found is [39], which describes a reconfigurable Ethernet controller utilizing PR in an FPGA. It has the capability of switching between two different Ethernet protocols. The implementation and verification of functionality is thorough, but unfortunately no data is provided regarding savings in power consumption or area utilization. Furthermore, there is no mention of the reconfiguration time overhead compared to a fully static design.

2.4 A Note on Platform Technology

In the vast majority of the works analysed during the literature study which contained an actual implementation of PR in an FPGA, Xilinx was the vendor. The reason for this can only be speculated on. It seems like Xilinx historically has offered the most tools and least restrictions when it comes to PR. This thesis does not aim to make a platform comparison or claim to be able to argue for one or the other, but it is clear that presently, Altera (recently acquired by Intel) also offers extensive PR capabilities and tools. There certainly exists a good amount of previously published articles with implementations on Altera FPGAs, for example [35], [39] and [40]. However, the value and relevance of this thesis is increased by the fact that published work for the specific platform used (Intel Arria 10) seems scarce, at least within this specific application area.

2.5 Logic Resources available for PR on Arria 10

Figure 2.8: An overview of a horizontal slice of an Arria 10 FPGA, showing how the resource blocks are distributed. The orientation of the chip has been rotated 90 degrees in order to make it easier for the reader to view the text in the columns. Resource columns shown in the figure: Transceiver Channels; Hard IP Per Transceiver (Standard PCS, PCIe Gen3 PCS, Enhanced PCS); PLLs; Variable Precision DSP Blocks; M20K Internal Memory Blocks; I/O PLLs; Hard Memory Controllers, General-Purpose I/O Cells, LVDS; Core Logic Fabric; PCI Express Gen 3 Hard IP.

Only the core logic of the FPGA can be part of the PR region on Arria 10 devices. This section outlines the different logic resources which can be used for Partial Reconfiguration.

Figure 2.9: ALM architecture overview. Elements shown in the figure: an Adaptive LUT with inputs In 1 to In 8, two Full Adders and four registers (Reg).

2.5.1 Adaptive Logic Modules

Adaptive Logic Modules (ALMs) are the basic building blocks of Intel FPGAs. As described in [41], they are small logic blocks that can be configured to perform a variety of combinational or sequential functions. The functionality of the whole FPGA is determined by the combined configuration of all of the ALMs it contains. As a reference, the Intel Arria 10 GX 1150 used for this thesis contains 427,200 ALMs [42].

Figure 2.9 shows an overview of the architecture of an ALM. The adaptive Look-Up Table (ALUT) module, the adders, the multiplexers (MUXes) and the registers are configured according to the compiled design files, which are optimised by the Quartus compiler to fit the ALM structure.

2.5.2 Logic Array Blocks

A Logic Array Block (LAB) consists of ten ALMs, organized in a column. It provides a local interconnect between its ALMs which enables fast communication for implementing functions that require more than one ALM. This is further improved by the fact that the ALMs within a LAB share carry chains and arithmetic chains between their adders and LUTs, respectively.

Up to a quarter of the available LABs on an Arria 10 FPGA can be configured as Memory LABs (MLABs), acting as a dual-port SRAM with a maximum size of 640 bits. This is done by configuring each ALM as a 32x2 LUT-based memory.
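The 640-bit figure follows directly from the numbers given in this section: ten ALMs per LAB, each configured as a 32x2 memory. A short Python check of that arithmetic (the constant names are chosen here for illustration):

```python
ALMS_PER_LAB = 10               # a LAB consists of ten ALMs
ALM_DEPTH, ALM_WIDTH = 32, 2    # each ALM configured as a 32x2 LUT-based memory

bits_per_alm = ALM_DEPTH * ALM_WIDTH        # storage contributed by one ALM
bits_per_mlab = ALMS_PER_LAB * bits_per_alm  # storage of one MLAB

print(bits_per_alm, bits_per_mlab)
```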

2.5.3 Embedded Memory Blocks

Aside from the MLABs mentioned above, there is a type of dedicated 20 Kb memory blocks available in the Arria 10 FPGAs called M20Ks. These are suitable for large memory arrays while the MLABs are more optimal for wide and shallow arrays used in, for example, shift registers for DSP applications and filter delay lines [41].

2.5.4 DSP Blocks

The Arria 10 devices include dedicated DSP blocks with reconfigurable logic supporting configurations that are optimized for certain arithmetic operations. Fixed as well as floating point is supported, and the supported arithmetic operations include, for example, real and complex multiplication, systolic FIR filters and vector operations. It should be noted that not all operations are supported for both fixed and floating point arithmetic.

Chapter 3

PR Design Work Flow on Intel Arria 10

3.1 Special Considerations for PR Designs

Implementing a PR FPGA design alters the regular design flow and introduces some special considerations. This subsection outlines the most significant ones. Many are technology-independent, but the statements here do not necessarily hold true for other platforms than Intel Arria 10 and other software versions than Quartus Pro 16.

3.1.1 Defining Personas

One of the most obvious differences when moving from a flat design∗ to a PR design is that the logic block which is going to be replaced depending on the mode of operation needs to be identified and isolated. Then, different versions of that logic block need to be defined and designed. The different versions are referred to as personas in the Intel documentation [1], and the same term will be used henceforth in this thesis.

One important aspect when designing the different personas is that the ports of the HDL entity need to be identical regardless of persona. This is necessary in order to ensure compatibility with the logic outside the PR region. The compiler takes care of mapping each port to the exact same location in the FPGA for each persona.

∗ A design where all different modes of operation are implemented in a single entity
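The requirement that every persona exposes identical ports can be checked mechanically before compilation. The Python sketch below is only an illustration of that idea: the data layout (a list of name, direction and width tuples per persona), the function name and the persona and port names are hypothetical, and the comparison is deliberately strict, including port order.

```python
def mismatching_personas(personas):
    """Return the names of personas whose port list differs from the first one.

    personas maps a persona name to a list of (port_name, direction, width)
    tuples describing the HDL entity (hypothetical representation).
    """
    names = list(personas)
    reference = personas[names[0]]
    return [name for name in names[1:] if personas[name] != reference]


# Hypothetical example: persona_c has a narrower data output.
personas = {
    "persona_a": [("clk", "in", 1), ("data_in", "in", 32), ("data_out", "out", 32)],
    "persona_b": [("clk", "in", 1), ("data_in", "in", 32), ("data_out", "out", 32)],
    "persona_c": [("clk", "in", 1), ("data_in", "in", 32), ("data_out", "out", 16)],
}
print(mismatching_personas(personas))
```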

3.1.2 Floor planning

The region(s) which shall be reconfigured depending on mode of operation (the PR region(s)) must be specified. This imposes additional floor planning efforts compared to flat designs. See Section 3.2 for more details.

3.1.3 Storage of Personas

The files containing the configuration data of the PR region for each persona need to be stored somewhere. Depending on the file size, the personas can be stored either on the FPGA itself or in an off-chip memory. For demonstration and experimentation purposes, the bitstream can be transferred from the Quartus Programmer on a PC via a JTAG programming cable.

3.1.4 PR Control Logic

Dedicated logic is necessary in order to facilitate the reconfiguration of the PR region. The main functionality that needs to be achieved by this logic is:

• Reading the persona bitstream from memory

• Using the persona bitstream to reconfigure the PR region

• Freezing the ports of the PR region to known values during reconfiguration∗

• Possibly, depending on whether it is necessary for the design at hand: performing handshaking with the logic block inside the PR region before the PR process is initiated.

When implementing PR designs on the Intel Arria 10 FPGA, a PR IP block is available in the Quartus 16 IP Catalog. This IP block handles the reconfiguration of the PR region and provides a single-bit freeze output which can be used as an enable signal for the I/O freeze logic, which itself however needs to be implemented by the designer†. Furthermore, custom control logic needs to be implemented in order to transfer the persona bitstream from memory to the PR IP block as well as sending the PR initiation signal to the PR IP block. If handshaking logic is necessary for robust functionality of the design at hand, that needs to be custom made as well‡.

∗ It is not required to freeze the input ports of the PR region on Intel Arria 10 PR designs.

† For AVMM interfaces, there are freeze bridge IP blocks in the Quartus 16 IP Catalogue.

‡ In Quartus 17, extended handshaking and control logic is available in the IP Catalogue.
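The responsibilities listed in this subsection can be summarized as an ordered sequence of actions. The Python sketch below is a behavioural illustration only, not the control logic implemented in this thesis: the action names and their exact ordering are assumptions, and on the actual device the freeze signal is an output of the PR IP block rather than something the custom controller drives directly.

```python
def pr_sequence(bitstream_words, use_handshake=False):
    """Yield (action, argument) pairs for one partial reconfiguration.

    bitstream_words: the persona bitstream as read from memory (any iterable).
    The action names are hypothetical labels for the steps described in the text.
    """
    if use_handshake:
        yield ("handshake", None)      # let the logic in the PR region finish cleanly
    yield ("freeze", True)             # hold PR region outputs at known values
    yield ("start_pr", None)           # PR initiation signal to the PR IP block
    for word in bitstream_words:
        yield ("write_word", word)     # stream the persona bitstream to the PR IP block
    yield ("freeze", False)            # release the outputs once the persona is loaded


for step in pr_sequence([0xA5, 0x5A], use_handshake=True):
    print(step)
```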

3.1.5 Reconfiguration time

It is important to analyze the design’s sensitivity and tolerance to the time it takes to re-configure the PR region, as well as how often reconfiguration will take place. This will have a great impact on the suitability of utilizing PR in the design. For example, it might be the case that certain deadlines in the system cannot be met if the logic inside the PR region is unavailable for too long. In that case, PR might have to be discarded altogether. In other cases where critical requirements can still be met, the usefulness of utilizing PR versus a flat design can still largely depend on the reconfiguration time of the PR region. One example of such a case is the application of reconfigurable hardware accelerators discussed in Section 2.3.2.
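A first-order feasibility check of this kind can be made before any implementation work. The Python sketch below assumes an idealized transfer in which one bus word of the persona bitstream is written every clock cycle, with no stalls or protocol overhead; the function names, the bus width, the clock frequency, the bitstream size and the deadline are all hypothetical values chosen for illustration.

```python
def reconfig_time_s(bitstream_bytes, bus_width_bits, clock_hz):
    """Ideal transfer time: one bus word written per clock cycle, no stalls."""
    bits = bitstream_bytes * 8
    words = (bits + bus_width_bits - 1) // bus_width_bits   # ceiling division
    return words / clock_hz


def meets_deadline(bitstream_bytes, bus_width_bits, clock_hz, deadline_s):
    """True if the ideal reconfiguration time fits within the given deadline."""
    return reconfig_time_s(bitstream_bytes, bus_width_bits, clock_hz) <= deadline_s


# Hypothetical case: 2 MB persona, 16-bit interface at 100 MHz, 0.9 s deadline.
print(reconfig_time_s(2_000_000, 16, 100_000_000))
print(meets_deadline(2_000_000, 16, 100_000_000, 0.9))
```

A real design would add the time needed to fetch the bitstream from its storage, so this estimate is a lower bound under the stated assumptions.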

3.1.6 Compilation

Using the Quartus Shell from a terminal window, a PR compilation TCL∗ script can be generated that will handle the whole compilation process.

Even though the static region (the area outside the PR region(s)) only needs to be compiled once, synthesis and placement of each persona in the PR region needs to be performed. This increases the compile time significantly compared to compiling a flat design (see Section 6.3). If only a certain persona (or only the static region) needs to be re-compiled, this can be done by passing certain arguments when executing the TCL script.

3.2 Work Flow

This section summarizes the PR design work flow for Intel Arria 10 in Quartus Pro 16, as described in [43]. An example design and walk-through can be found in [44].

3.2.1 Planning the Design

When planning the design, the first thing to do is to identify logical hierarchical boundaries which can be defined as reconfigurable partitions. The design hierarchy and source code should be set up to support this partitioning. Keep in mind that only core logic can be used in the reconfigurable partition(s), and not periphery resources such as I/O blocks and transceivers.

∗ Tool Command Language

3.2.2 Creating PR Partition(s)

A separate design partition must be created and set as reconfigurable for each PR region that is to be included in the design. A design partition in itself does not specify a physical area on the FPGA, but is merely a logical partitioning of the design. However, LogicLock Plus region assignments shall be used in order to specify the placement of the PR Region (see Section 3.2.3).

3.2.3 Floorplanning

In order to ensure that the location of the PR region and its associated ports remain the same between all different personas, the location of the PR partition must be specified via a LogicLock placement region assignment. The proportions of the assigned region affect the persona bit stream file size as well as partial reconfiguration time; increased height will yield increased reconfiguration time and file size even if the total region size is kept constant. This is due to the way that the logic resources are addressed by the FPGA configuration files, namely that the smallest addressable configuration segment is aligned along the rows and not the columns of the FPGA. A routing region which is at least one unit larger than the placement region must also be specified. As previously mentioned, only core logic can be included in the PR region. This must be considered during floorplanning.
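The relationship between the placement region and the routing region can be expressed as a simple geometric check. The Python sketch below is a hypothetical helper, not part of the Quartus tools. It assumes that regions are axis-aligned rectangles given by inclusive corner coordinates and interprets "at least one unit larger" as extending at least one unit beyond the placement region in every direction, which is an assumption about the rule rather than a statement from the tool documentation.

```python
from collections import namedtuple

# Inclusive corner coordinates of a rectangular region (hypothetical representation).
Region = namedtuple("Region", "x0 y0 x1 y1")


def routing_region_ok(placement, routing, margin=1):
    """True if the routing region extends at least `margin` units beyond the
    placement region on all four sides."""
    return (routing.x0 <= placement.x0 - margin and
            routing.y0 <= placement.y0 - margin and
            routing.x1 >= placement.x1 + margin and
            routing.y1 >= placement.y1 + margin)


placement = Region(10, 20, 40, 30)          # hypothetical coordinates
print(routing_region_ok(placement, Region(9, 19, 41, 31)))
print(routing_region_ok(placement, Region(10, 19, 41, 31)))
```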

3.2.4 Instantiating PR IP Core

The PR IP Core performs the reconfiguration of the PR region, and can be instantiated in the base revision via the IP Catalogue or in QSys. It can be configured to have either an AVMM, JTAG or conduit∗ interface. It must be controlled via either an external host, for example an off-chip processor, or an internal host, which can for example be custom logic designed to integrate well with the rest of the design. The host must also facilitate the transfer of the persona bitstreams from their memory storage to the PR IP Core.

3.2.5 Defining Personas

As previously described, a persona is one of the possible logic configurations of a PR region. They should all be described by different HDL code, but it is important that all the personas use the exact same set of ports to connect to the static region outside the PR region. The physical placement of the ports must be the same as well, but the compiler takes care of this† as long as the port declarations of the HDL code describing each persona of a PR region are identical.

The different personas are later associated with different project revisions, see Section 3.2.6. As a starting point, a Base Revision describing the full FPGA design and instantiating the most complex persona should be developed. The base revision will define the static region. It must include freeze logic which uses a control signal (one is available from the PR IP Core) to freeze the output ports of the PR region during partial reconfiguration. This is necessary due to the fact that the values of the output ports are unknown during the partial reconfiguration process. It is not required to implement freeze logic for the input ports of the PR region(s) in Arria 10 PR designs.

∗ Custom individual signal interface

† It is possible to specify location assignments for the I/O ports as well

3.2.6 Creating Revisions for Personas

In PR designs, there are three types of project revisions which must be created. The base revision is used to compile the static region, and is the revision on which all the other revisions are based. It describes the full FPGA design and should instantiate the most complex persona in order to maximize the possibility of discovering timing and/or fitting errors early.

One synthesis only revision must be created for each persona of a PR region. It is solely used to synthesize the logic within the PR region for a certain persona. Its top component shall be specified as the top component of the corresponding persona, as opposed to the base revision whose top component is the top component of the whole FPGA design.

One implementation revision must also be created for each persona of a PR region. The only difference from the base revision is the revision type assignment. An implementation revision is later associated to a specific synthesis revision in the partial reconfiguration compilation setup script, see below.
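The set of revisions described in this subsection grows with the number of personas: one base revision plus one synthesis only revision and one implementation revision per persona. The Python sketch below simply enumerates that set for a single PR region. The function name, the naming scheme with `_synth` and `_impl` suffixes and the persona names are hypothetical and are not Quartus conventions.

```python
def pr_revisions(base_name, personas):
    """List (revision_name, revision_type) pairs for one PR region.

    The '_synth' and '_impl' suffixes are a hypothetical naming scheme.
    """
    revisions = [(base_name, "base")]
    for persona in personas:
        revisions.append((f"{persona}_synth", "synthesis only"))
        revisions.append((f"{persona}_impl", "implementation"))
    return revisions


# Hypothetical project with two personas.
for name, kind in pr_revisions("top_base", ["persona_a", "persona_b"]):
    print(name, kind)
```

For N personas this gives 1 + 2N revisions, which gives a feel for how the project bookkeeping scales.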

3.2.7 Compiling Design

Using the Quartus shell in the command line, a partial reconfiguration compilation Tool Command Language (TCL) script template can be generated and then adapted to the current design. All the revisions must be referenced, and the synthesis revisions must be associated to their respective implementation revisions. This can be done by editing the example setup script that is generated along with the compilation script. The compilation script can then be run from Quartus in order to perform the necessary compilation steps.

3.2.8 Programming FPGA and memory

Running the compilation script generates files that can be used to program the FPGA and perform partial reconfiguration. Depending on the implementation, different files can be used. As an example, in the project for this thesis, RBFs (Raw Binary Files) generated by the compilation script and corresponding to each persona were converted into flash format for programming into an external flash memory on the same board as the FPGA. The FPGA was then programmed with a SOF (SRAM Object File) corresponding to either one of the implementation revisions.

Chapter 4

CPRI L2 Block

4.1 Function in top CPRI block

Figure 4.1: Overview of how the data flows through different streams in the different layers of the CPRI block. Elements shown in the figure: User Plane, Control & Management Plane and Sync; in Layer 2, IQ Data, Vendor Specific, Ethernet, HDLC and L1 Inband Protocol; in Layer 1, Time Division Multiplex, Electrical Transmission and Optical Transmission.

As can be seen in Figure 4.1, Layer 2 of the CPRI block is the interface between the higher layers of operation and the physical layer. Data flows in different formats and sub channels (IQ data, vendor specific, Ethernet etc.) from the three main channels or, in other words, logical connections: User Plane, Control & Management Plane and Sync. Layer 2 is responsible for arranging the different data flows into the correct frame format, outlined in Section 2.1.3. Layer 1 then handles the time multiplexing of the correctly arranged data.

As defined in [3] and outlined in Section 2.1.2, Layer 2 is the data link layer which shall handle media access control and flow control of the different data streams. It also handles data protection of the control and management information flow.

4.2 Architecture

Figure 4.2: Block diagram of the CPRI L2 component instantiated in the PR region in this project. Due to confidentiality, the block diagram has been simplified and some blocks have been anonymized. Labels shown in the figure: SW Control Register, Sync, CPRI TX, CPRI Control, CPRI RX, Reset Ctrl, Gearbox, Service Interface Bridges, and the interfaces Sync, HDLC, Config, Ethernet, Vendor, IQ, L1 inband, L1 Status and L1 Data.

Figure 4.2 shows an overview of the architecture of the CPRI L2 block instantiated in the PR Region in this project. In the transmitting direction, the higher-level data flows of the interfaces to the left are encoded and packed into the correct frame structure in order to be presented correctly to the L1 interfaces to the right. This process is done in reverse in the receiving direction; the signals from the L1 interfaces to the right are decoded, unpacked and sent out on the correct service interfaces to the left.

The following sections describe the function of the different sub blocks.

4.2.1 SW Control Register

This sub block is used for software configuration of the other sub blocks, and also provides status messages. Furthermore, it handles the IRQ generation from alarm signals of different blocks.

4.2.2 Sync

The Sync block handles the synchronization of the TX and RX signals, mainly by generating or receiving strobe signals (depending on if the CPRI block is in Master, Slave or Partner mode) and delaying these depending on the line rate.

4.2.3 Service Interface Bridges

The function of the Service Interface Bridges in the TX direction is to perform the necessary operations to arrange the data coming in from the service interfaces (HDLC, Ethernet, Vendor Specific, IQ and L1 inband) into the right format before it is forwarded to the CPRI TX block. The reverse operation is done in the RX direction; the respective signals from the RX block are received and re-arranged into the right format for the service interfaces.

There are several reasons why these service interface bridges are necessary. One example is that the IQ Service Interface Bridge folds/unfolds (depending on the direction) the IQ data in order to densify it and free up more space in the CPRI frame for other signals. This data format is not suitable for higher layers since they rely on deterministic data timing and frame structure. Another example is that certain data is scrambled in order to maintain a regular ratio of ones and zeros on the line, which is beneficial since the clock on the Slave side is derived from the signal itself.

4.2.4 CPRI Control

The CPRI Control block contains control functions that are necessary to meet some of the requirements of the CPRI Specification, mainly when it comes to timing and synchronization. For example, this is where CPRI line delay calculation and compensation is done. The block also filters the control word extracted from the CPRI frame. One example of this filtering is that hysteresis is applied to some control signals, in order to ensure that single bit errors do not cause unstable operation.
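The hysteresis filtering mentioned above can be illustrated with a small Python model. This is only a sketch of the general technique: the thesis does not describe how the filter is built, so the rule used here (a new value is accepted after a number of consecutive identical samples) and the threshold of three samples are assumptions made for illustration.

```python
class HysteresisFilter:
    """Accept a changed input only after `threshold` consecutive identical samples.

    Assumed behaviour for illustration; not taken from the CPRI L2 HDL.
    """

    def __init__(self, initial=0, threshold=3):
        self.output = initial
        self.threshold = threshold
        self._candidate = None  # value currently trying to replace the output
        self._run = 0           # consecutive samples seen for the candidate

    def update(self, sample):
        if sample == self.output:
            # Input agrees with the filtered value: drop any pending candidate.
            self._candidate = None
            self._run = 0
        else:
            if sample == self._candidate:
                self._run += 1
            else:
                self._candidate = sample
                self._run = 1
            if self._run >= self.threshold:
                self.output = sample
                self._candidate = None
                self._run = 0
        return self.output


f = HysteresisFilter(initial=0, threshold=3)
samples = [0, 1, 0, 0, 1, 1, 1, 1]   # one isolated bit error, then a real change
print([f.update(s) for s in samples])  # [0, 0, 0, 0, 0, 0, 1, 1]
```

The isolated 1 in the second sample is suppressed, while the sustained change at the end passes through after three samples.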

The timing functionality differs depending on the operating mode. For example, the transmission timing is driven by the received link timing and received link delay in Slave mode. However, for Master and Partner mode, the transmit timing is driven by a signal that is received from the Sync block mentioned above.

Furthermore, the CPRI Control block supervises the state of Layer 1 to detect critical status changes such as Loss of Signal and Line Synchronization status. The status of Layer 1 partly determines which state Layer 2 should be in, and the CPRI Control block takes care of the supervision and signalling required to ensure proper operation in this aspect. The block also handles the L1 Inband Signaling that can be used to communicate directly with Layer 1 of the unit on the other side of the CPRI link.

4.2.5 CPRI RX

For each basic frame received from Layer 1, the CPRI RX block separates the control word from the IQ data. Furthermore, the block extracts data from the control word and forwards the data to the correct service interface bridge. The data size and conditions for which data to forward to each channel depend on the line rate and operating mode.

4.2.6 CPRI TX

The main functionality of the CPRI TX block is to create basic frames to transmit. Similarly to the process in the CPRI RX block (but reversed), the block assembles a complete control word and then combines it in the correct order with the IQ data. Here too, the data size and conditions for which data to read from each channel depend on the line rate and operating mode.

Furthermore, this block contains some special functionality to ensure that the data stays aligned for the 10.1 Gbit/s line rate option even though a different encoding scheme is used there∗.

4.2.7 Gearbox

The Gearbox is used to convert between the different bus widths of the L1 and L2 blocks. However, the functionality of the block is more complex for the 10.1 Gbit/s line rate option. This is because, in certain cases, bits need to be stored before being sent out in the next cycle, due to the higher data rate and the different encoding scheme.

∗ 64b66b line coding instead of 8b10b line coding

4.2.8 Reset Ctrl

This block handles the reset functionality for the whole CPRI L2 block and all its sub blocks.

4.3 Creation of PR Personas

In the CPRI L2 block, line rate and operating mode depend on values in the SW Control Register. In order to reduce the logic utilization of the block and create personas for the PR region, different values of line rate and operating mode were hard coded in the HDL code. As described in the previous sections, the functionality of many of the sub blocks depends on one or both of these settings. Thus, the hard coded settings made the compiler remove logic that was not needed for the configuration of each persona, compared to the flat design. Creating a unique persona for each of the twelve possible combinations of settings, as well as an empty persona, yielded a total of thirteen different personas.

Table 4.1: Overview of the PR personas and their settings.

Line rate      Master    Slave     Partner
2.5 Gbit/s     m_2p5     s_2p5     p_2p5
4.9 Gbit/s     m_4p9     s_4p9     p_4p9
9.8 Gbit/s     m_9p8     s_9p8     p_9p8
10.1 Gbit/s    m_10p1    s_10p1    p_10p1
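As a cross-check of the persona count, the naming scheme of Table 4.1 can be enumerated in a few lines of Python. This is a sketch for illustration only; the label used for the empty persona is an assumption, since the thesis does not name it.

```python
from itertools import product

# Mode prefixes and line-rate suffixes as they appear in Table 4.1.
MODE_PREFIXES = {"Master": "m", "Slave": "s", "Partner": "p"}
LINE_RATE_SUFFIXES = ["2p5", "4p9", "9p8", "10p1"]


def persona_names(include_empty=True):
    """Return one persona name per (mode, line rate) pair, plus an optional empty persona."""
    names = [
        f"{prefix}_{rate}"
        for prefix, rate in product(MODE_PREFIXES.values(), LINE_RATE_SUFFIXES)
    ]
    if include_empty:
        names.append("empty")  # hypothetical label for the empty persona
    return names


print(len(persona_names()))  # 13
```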

Chapter 5

Implemented PR Design

In order to evaluate the aspects of interest, a PR system was implemented on an Intel Arria 10 development board. This provided an opportunity to go through the design flow from the bottom up, giving insight into special considerations and obstacles that arise when working with PR as opposed to flat FPGA designs. The implemented PR system then acted as a test and evaluation platform.

5.1 Development Board

The development board used for this thesis was the Intel Arria 10 GX Transceiver Signal Integrity Development Kit [45], featuring an Arria 10 GX1150 FPGA (10AX115F1932C) as well as a variety of off-chip components and accompanying evaluation software. The components mainly used in the implementation at hand are the flash memory (see Section 5.2.4) and the Embedded USB-Blaster, which is used for computer-to-FPGA communication as well as full reconfiguration of the whole FPGA. Furthermore, a few on-board LEDs were used in the initial, basic PR design.

5.2 Architectural Design

5.2.1 Overview

Figure 5.1: Overview of the system implemented during this thesis project. (Blocks shown: FPGA, external flash memory, flash memory interface, DMA/PR Controller, PR Region, PR IP Core.)

The PR system designed and implemented during the work on this thesis can be seen in Figure 5.1. It is designed for evaluation of and experimentation with PR designs on Intel Arria 10 FPGAs. The custom-designed DMA/PR controller block is the central control unit of the system, and can in turn be controlled via System Console∗ on an external computer. During partial reconfiguration of the PR region, the DMA/PR controller block reads data from the flash memory on the development board into an internal FIFO. It then forwards the data and handles the control of and communication with the PR IP block, which handles the actual reconfiguration of the PR region.

The PR system was designed using a combination of Qsys Pro† and pure HDL coding. The different on-chip logical blocks all communicate via Intel's Avalon Memory-Mapped (AVMM) interface standard.

∗ A built-in tool in Quartus Pro which can be used to communicate directly with circuits on the FPGA via a JTAG bridge.

† A built-in hardware system design tool in Quartus Pro.

5.2.2 PR Region

As previously described, the PR region is the constrained chip area which contains the logic of the currently configured persona. The size and shape can be altered depending on the resource demands of the different personas. The system designed as part of this thesis is agnostic of which kind of logic is implemented in the PR region, as long as the whole top design meets the constraints of the FPGA.

5.2.3 Partial Reconfiguration IP Block

Table 5.1: Settings when instantiating the PR IP component in Qsys Pro 16.0

Setting                                  Value
Use as PR Internal Host                  Enabled
Enable JTAG debug mode                   Disabled
Enable Avalon-MM slave interface         Enabled
Enable interrupt interface               Disabled
Input data width                         32 bits
Clock-to-Data ratio                      1
Divide error detection frequency by      1
Enable enhanced decompression            Disabled
Auto-instantiate PR block                Enabled
Auto-instantiate CRC block               Enabled
Generate timing constraints file         Enabled

As described in Section 3.2, the Partial Reconfiguration IP block handles the reconfiguration of the PR region and provides a freeze signal for the outputs∗ of the PR region. The PR IP block was generated as a stand-alone component via the Qsys IP Catalog. The settings can be seen in Table 5.1. Note in particular that no bitstream compression or encryption was enabled.

5.2.4 Flash Memory Interface

The development board is equipped with two 1-Gbit CFI compatible flash devices (Micron PC28F00AP30BF CFI Flash), whose 16-bit buses are combined to achieve a total bus width of 32 bits and a total memory capacity of 2 Gbit. The flash memory interface of the PR system designed for this thesis is based on the interface found in the source code for the Board Update Portal (BUP) Qsys system, an example design that came with the development board. Its main purpose is to act as a bridge between the Compact Flash Interface (CFI) of the off-chip flash memory and the component's AVMM slave interface, which is connected to an AVMM master interface on the DMA/PR controller block.

∗ Freezing the inputs is not necessary on Arria 10 [43]

5.2.5 DMA/PR Controller

Figure 5.2: Overview of the DMA/PR IP block designed for this project. (Blocks shown inside the DMA/PR IP Controller: AVMM Slave (CSR), AVMM Master (Memory), AVMM Master (PR IP), Memory FSM, FIFO, CSR, PR IP FSM.)

The HDL code for the DMA/PR controller was custom written for this particular system. Its main purpose is to read the PR bitstream from the flash memory into the PR IP block, as well as to control the PR IP block. The block features two AVMM master interfaces and one AVMM slave interface. In the system developed for this thesis, the slave interface is connected to System Console on an external computer via a JTAG-to-AVMM bridge in order to be able to observe the PR process. However, the AVMM slave could be connected to any other logic block containing an AVMM master interface for fully FPGA-internal control of the PR process. Of the two master interfaces on the DMA/PR controller, one is connected to the flash memory interface and the other is connected to the PR IP block.

An internal FIFO is used as intermediate storage for the PR bitstream in order to enable synchronization between the read operations from the flash memory and the write operations to the PR IP block. Separate state machines control the two AVMM masters. The block also includes a counter which measures how long it takes to perform the partial reconfiguration.
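The decoupling role of the FIFO can be illustrated with a small cycle-based model in Python. This is a behavioural sketch only, not the thesis HDL: the FIFO depth, the memory read period and the one-word-per-cycle write rate are assumed values chosen for illustration.

```python
from collections import deque


def transfer(bitstream_words, fifo_depth=16, read_period=3):
    """Move words from a memory model to a PR IP model through a bounded FIFO.

    Returns the words in the order they were delivered and the number of
    cycles counted, mimicking the reconfiguration timer.
    """
    fifo = deque()
    delivered = []   # words written to the PR IP model
    next_read = 0    # index of the next word to fetch from memory
    cycles = 0
    while len(delivered) < len(bitstream_words):
        # "PR IP FSM": write one word per cycle whenever the FIFO holds data.
        if fifo:
            delivered.append(fifo.popleft())
        # "Memory FSM": a read completes every `read_period` cycles,
        # but only if the FIFO has room for the word.
        if (next_read < len(bitstream_words)
                and cycles % read_period == 0
                and len(fifo) < fifo_depth):
            fifo.append(bitstream_words[next_read])
            next_read += 1
        cycles += 1
    return delivered, cycles


words = list(range(100))
delivered, cycles = transfer(words)
print(delivered == words, cycles)
```

In this model the slower side (here the memory reads) sets the overall transfer time, which is consistent with the observation later in the thesis that a faster memory interface could shorten the reconfiguration.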

Via the AVMM slave, a set of status and control registers can be accessed. In the current implementation, these are used to control and monitor the block, and thus the whole PR process, from System Console.

• The address of the PR IP block which handles the reconfiguration of the desired PR region is written to the DMA/PR controller block via JTAG from System Console on an external computer.

• The flash memory address of the PR bitstream corresponding to the desired persona is written to the DMA/PR controller block via JTAG from System Console on an external computer.

5.2.6 Top Entity

The top entity of the design connects the individual components internally and to external ports, while also implementing the freeze logic for the output ports of the PR region.

5.3 Implementation and Data Collection

The system described above was constructed in order to give insight into the process as well as the many challenges and considerations of PR design on the Arria 10. Furthermore, hard data could be collected from the compilation reports as well as from the reconfiguration timer implemented in the custom DMA/PR controller.

Initially, a basic "Blinking LED" PR design was implemented based on the tutorial found in [44]. The design was expanded from the tutorial, moving the PR process control and bitstream transfer from an external computer connected via JTAG to the custom DMA/PR controller block described in Section 5.2.5.

Once the Blinking LED design functionality had been verified, the design was altered so that the PR region contained a CPRI L2 block. Personas were created as described in Section 4.3.

Chapter 6

Results

6.1 Resource Utilization

6.1.1 Visualization of Results

Figure 6.1: Resource utilization divided into sub-blocks of the CPRI L2 design for the flat design as well as different personas. (Stacked bar chart; vertical axis: ALMs, 0 to 6000; bars: Flat, m_10p1, m_4p9, m_2p5, p_2p5; legend: JTAG, Top Component OH, Flash Interface, PR/DMA Controller, PR IP, CPRI L2 Component OH, CPRI L2.)

Figure 6.1 shows a comparison between the logic utilization of the flat design and the PR personas with the maximum, median and minimum logic utilization. The utilization is measured in the number of Adaptive Logic Modules (ALMs) used, and the data is taken from the compilation report of the 50x12 layout. This layout was chosen for comparison of logic utilization between personas since it was assessed to yield a good balance between minimizing the physical size of the PR region and avoiding routing congestion. The latter can lead to problems with timing closure.∗

The number of ALMs is counted as A+B-C, where

• A is the number of ALMs actually used in the final layout

• B is the number of ALMs unavailable due to inaccessibility after netlist placement

• C is an estimate of how many ALMs can be recovered through dense packing
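The A+B-C count above can be written as a one-line helper. The sketch below is in Python, and the numbers in the example call are made up for illustration; they are not values from the compilation reports.

```python
def alm_count(used_in_final_layout, unavailable_after_placement, recoverable_by_dense_packing):
    """ALM count as defined above: A + B - C."""
    return (used_in_final_layout
            + unavailable_after_placement
            - recoverable_by_dense_packing)


# Illustrative (made-up) values: A = 5200, B = 150, C = 300.
print(alm_count(5200, 150, 300))  # 5050
```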

The bars in Figure 6.1 are divided into the different sub blocks of the full design. The labels to the right in the figure can be explained as follows:

• JTAG - The logic needed for the JTAG communication between the computer and the AVMM slave interface of the DMA/PR IP controller block.

• Top Component OH - The overhead logic occupied by the design's top component itself and not by any of its sub blocks. This is, for example, routing logic interconnecting the instantiated sub components.

• Flash Interface - The communication interface to the external flash memory described in Section 5.2.4.

• PR/DMA Controller - The DMA/PR IP controller block described in Section 5.2.5.

• PR IP - The PR IP block described in Section 5.2.3.

• CPRI L2 OH - The overhead logic occupied by the top component of the CPRI L2 component and not by any of its sub blocks. The fact that this is so much bigger in the PR personas than in the flat implementation is further discussed in Section 7.2.3.

• CPRI L2 - The amount of logic occupied by the sub blocks of the CPRI L2 component.

∗ Timing closure was not analyzed in this thesis.

6.1.2 Quartus Chip Planner Views

This section contains a select few examples of output from the Quartus Chip Planner. They are provided here to illustrate the amount of physical area that the PR region occupied (see Figure 6.2), and to give a visual comparison of the resource utilization of the least complex and the most complex persona (see Figure 6.3).

Figure 6.2: Example of Chip Planner view in Quartus after compilation. 50x12 layout, m_10p1 persona.

Figure 6.3: Example of zoomed-in Chip Planner view in Quartus after compilation. 50x12 layout. Left: m_10p1 persona, right: p_2p5 persona. Each small rectangle represents a logic cell (LAB, DSP block or M20K), and darker colors mean higher resource utilization.

6.2 Bitstream File Size

Table 6.1: Bitstream file size statistics

FPGA Spec.   Layout   Parameter   Size (kB)   Persona
Stratix V    30x40    Median      19 148      s_9p8, s_2p5
                      Max         19 433      m_10p1
                      Min         19 122      p_4p9
                      Sum         230 315
             80x15    Median      8 221       p_10p1, m_2p5
                      Max         8 396       m_10p1
                      Min         8 137       p_4p9
                      Sum         90 510
             50x12    Median      7 091       s_2p5, m_2p5
                      Max         7 369       m_10p1
                      Min         6 998       p_2p5
                      Sum         85 652
Arria 10     50x12    Median      7 140       p_10p1, m_4p9
                      Max         7 351       m_10p1
                      Min         7 031       p_4p9
                      Sum         85 751

Table 6.1 shows the median, maximum and minimum bitstream file sizes of the PR personas for different LogicLock topological layouts (notated as "ALM rows x ALM columns"). Furthermore, the sum of the file sizes of all the personas has been calculated. The file size of the empty persona has been omitted when calculating the statistics. Since that leaves an even number of personas (twelve), the median becomes the average of the two middle personas, which are listed as the smaller one followed by the bigger one.
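The way the statistics above are formed, including the pair of personas that defines the median for an even number of entries, can be sketched in Python as follows. The persona names and sizes in the example are placeholders, not measured values; only the twelve-persona case in Table 6.1 is real data.

```python
from statistics import median


def bitstream_stats(sizes_kb):
    """Summarize persona file sizes given as a {persona_name: size_kB} mapping."""
    ordered = sorted(sizes_kb.items(), key=lambda item: item[1])
    sizes = [size for _, size in ordered]
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        # Even count: the median is the mean of the two middle entries,
        # reported as (smaller, bigger).
        median_personas = (ordered[mid - 1][0], ordered[mid][0])
    else:
        median_personas = (ordered[mid][0],)
    return {
        "median": median(sizes),
        "median_personas": median_personas,
        "max": ordered[-1],
        "min": ordered[0],
        "sum": sum(sizes),
    }


# Placeholder data for illustration only.
example = {"persona_a": 7000, "persona_b": 7100, "persona_c": 7200, "persona_d": 7400}
print(bitstream_stats(example))
```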

Figure 6.4: Graph of bitstream file size statistics for different topological layouts of the PR region for the Stratix V compilations. (Vertical axis: file size in MB, 0 to 25; horizontal axis: region constraint layout, horizontal x vertical: 30x40, 80x15, 50x12; series: Min, Median, Average, Max.)

At a very late stage in the project, a bug was discovered which made the compiler ignore the specified value of a VHDL generic parameter in the top component of the CPRI L2 block. The default value was Stratix V, and this affected some of the sub blocks. The bug had a large impact on the logic utilization, but did not largely affect the bitstream file size, as can be seen in Table 6.1. The column FPGA Spec. signifies which FPGA the design was compiled for. Since the bug was discovered so late in the project, there was limited time to re-collect the data for all topological layouts. The bitstream file size varied only slightly (by less than 1%) and the feature of interest was the effect of the topological layout on the bitstream size, so only the data for the 50x12 layout was re-collected after correcting the bug. The same line of reasoning was used for the reconfiguration time, where the results are from the Stratix V compilation. However, since the bug had a great impact on the resource utilization, the results there are all from the Arria 10 compilation.

6.3 Reconfiguration Time

As mentioned in Section 5.2.5, the custom DMA/PR controller block contains a counter which measures the time from when the PR process is initiated to when it is completed. The former happens when the DMA/PR controller block writes '1' to the PR_START register of the PR IP block, and the latter is confirmed by the DMA/PR controller block by reading the PR_STATUS register of the PR IP block after the last 16-bit word of the PR persona bitstream file has been written. However, during the reconfiguration time tests, it was discovered that the last 59 words of each persona bitstream file can be discarded without error, which made it possible to reduce the reconfiguration time. This remained true throughout tests with different personas and different PR region location layouts, regardless of which persona was previously programmed into the PR region. The PR process did fail once for one persona using this approach, but the error could not be reproduced. The author did not find any explanation in the Intel/Altera documentation of why it was possible to discard the last 59 words of the persona bitstream file. One qualified guess is that these words only contain metadata that is not actually used during the reconfiguration. This is, however, only speculation.

The graphs below show the results from measurements using the reconfiguration timer for different LogicLock location layouts for the same PR design (CPRI L2). It should be noted that the convention used here is that 1 MB = 10^6 bytes, and that the axes do not start at zero. The design was running at a clock speed of 50 MHz.
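Since the timer counts clock cycles, a reading is converted to seconds by dividing by the 50 MHz clock frequency. The Python sketch below shows that conversion and the resulting effective throughput; the cycle count in the example is a hypothetical reading, chosen only to be of the same order as the 50x12 measurements, and the file size follows the 1 MB = 10^6 bytes convention stated above.

```python
CLOCK_HZ = 50_000_000  # design clock stated in the text


def cycles_to_seconds(cycles, clock_hz=CLOCK_HZ):
    """Convert a reconfiguration-timer reading (clock cycles) to seconds."""
    return cycles / clock_hz


def throughput_mb_per_s(size_bytes, seconds):
    """Effective bitstream throughput, with 1 MB = 10**6 bytes."""
    return size_bytes / 1e6 / seconds


hypothetical_cycles = 23_500_000                 # not a measured value
t = cycles_to_seconds(hypothetical_cycles)
print(t)                                         # 0.47
print(throughput_mb_per_s(7_091_000, t))         # roughly 15 MB/s
```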

More samples could be obtained during a single test run for the 80x15 and 50x12 layouts than for the 30x40 layout, due to the limited size of the memory and the large file size of the 30x40 personas. The purpose of these tests was to assess the feasibility of implementing a CPRI L2 block using PR with respect to reconfiguration time (see Section 7.2.2 for analysis of the results). The 30x40 layout yielded longer reconfiguration times than both the 80x15 and 50x12 layouts, and thus has no impact on said feasibility. Therefore, it was not deemed necessary to re-run the test for the 30x40 layout with the remaining persona bitstream files which did not fit in the memory during the first test. However, the results from all the reconfiguration time tests are shared here in order to give the reader a sense of the correlation between bitstream file size and reconfiguration time.

Figure 6.5: Reconfiguration time for different personas with a PR region measuring 30x40 cells. (Vertical axis: reconfiguration time in seconds, 1.240 to 1.265; horizontal axis: persona file size in MB, 19.1 to 19.45.)

Figure 6.6: Reconfiguration time for different personas with a PR region measuring 80x15 cells. (Vertical axis: reconfiguration time in seconds, 0.520 to 0.550; horizontal axis: persona file size in MB, 8.05 to 8.45.)

Figure 6.7: Reconfiguration time for different personas with a PR region measuring 50x12 cells. (Vertical axis: reconfiguration time in seconds, 0.445 to 0.485; horizontal axis: persona file size in MB, 6.8 to 7.4.)

Chapter 7

Analysis

7.1 Challenges Encountered During PR Design Development

This section describes the main caveats and challenges introduced when implementing a PR design compared to a flat design, based on the hands-on experience of designing the system described in this report. It is more subjective than Section 3.1 and focuses more on obstacles that occurred during the design process.

7.1.1 Persona Storage and Memory Access

The fact that the persona bitstreams have to be stored in either external or internal memory introduces additional effort in design as well as in validation and debugging. Since an external memory was used in this project, significant time and effort was put into understanding the functionality of the memory itself, as well as interfacing it correctly to the FPGA and the DMA/PR controller inside. Furthermore, programming the memory with all the different personas was time-consuming and had to be done every time the design was changed, which increased the time spent on debugging.

7.1.2 PR Control Logic

The PR IP block provided in the Altera IP Catalog does simplify the PR design process by handling the programming of the PR region. However, the PR IP block itself needs to be controlled and the persona bitstream needs to be transferred to it. The best way to do this depends largely on the design at hand and on what resources are available on-chip as well as off-chip. The conditions that should trigger PR for a certain persona must be considered, and logic enabling this must be implemented accordingly. If possible, it is best to plan the design with PR functionality in mind from the beginning to enable smooth integration of the PR control functionality.

The effort of implementing the PR control functionality combined with the bitstream transfer functionality should not be underestimated. During this thesis project, migrating from a JTAG controlled PR IP block to an integrated solution that could be ported to other designs proved to be a much larger challenge than expected, and delayed the project by about three weeks.

7.1.3 Complexity of PR Design Work Flow

Even though the tools for and documentation on creating a PR design in Quartus are of great help, the complexity of using the tools and design methods is still fairly high. One example of this is the large number of project revisions that have to be created for synthesis and implementation, especially if there are many personas like in this project. Some changes, for example changes to LogicLock Region assignments, have to be made manually∗ to all revisions if done after the synthesis and implementation revisions are created based on the base revision. This increases the risk of introducing a bug in the design. The same is true when creating freeze logic for the ports of the top component of the PR region, if the component has a large number of ports like in this project. Going through all the design steps and considerations outlined in Section 3 will inevitably increase the man hours needed for development. Judging from the experience obtained during the work on this thesis project, the author would consider this increase non-negligible when weighing the benefits and drawbacks of using PR against each other.

Another aspect that needs to be taken into consideration is the increased compilation time that is introduced due to the fact that each persona needs to be (partially) compiled individually. As a reference, the compilation of the CPRI L2 PR design took approximately 4 hours, while the flat design took approximately 30 minutes to compile on the same machine.

7.2 Analysis of Results

7.2.1 Memory Allocation

As can be seen from the results in Section 6.2, the bitstream file size is largely impacted by the orientation of the PR region. The 30x40 layout is the same size as the 80x15 layout, but the latter yields a file size that is less than half of the former. This is expected due to the way the layout of the FPGA is divided when creating bitstream files, as explained in [43].

∗ To the author's knowledge. This was indicated by Altera's support.

Looking at the file sizes, the difference between the largest and smallest persona for a given topological layout is small but non-zero. This is unexpected. Since the bitstream file describes how each ALM in the PR region is to be configured, it would be expected that the file is the same size for each persona, especially since no compression was enabled. No explanation for this could be found in the documentation. One possible reason could be that some simple compression is implemented by default, for example leaving out the last part of the bitstream file if the corresponding ALMs are configured as empty. It can be noted from comparing the last four rows of Table 6.1 with Figure 6.1 that there is no strict correlation between bitstream file size and logic utilization.

There has been no attempt to find the minimum PR region size that would still be able to contain the necessary logic. That type of investigation was deemed to be outside the scope of this thesis, since the search space would be very large and timing closure would add more layers of complexity. Thus, the figures here should only be seen as examples, which can however give a rough insight into how big the persona bitstream files would be for this design. For example, the sum of the file sizes of all personas for the 50x12 layout can give some insight into design considerations when it comes to whether internal or external memory should be used for persona storage. The total amount of available RAM on the Arria 10 GX 1150 FPGA is only approximately 8405 kB, and the sum of the personas of the Arria 10 compilation of the 50x12 layout is 85755648 bytes. Thus, an external memory is necessary for the approach used in this project for applying PR to the CPRI L2 block.
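The storage argument can be made concrete with a small capacity check in Python. The sketch below compares persona sums against the 2 Gbit flash capacity given in Section 5.2.4. It assumes binary gigabits (1 Gbit = 2^30 bits) and 1 kB = 1000 bytes for the Table 6.1 figures, and it ignores anything else stored in the flash, so the headroom values are upper bounds rather than usable space.

```python
FLASH_BITS = 2 * 2**30          # two 1-Gbit devices; binary gigabits are an assumption
FLASH_BYTES = FLASH_BITS // 8


def headroom_bytes(persona_sum_bytes, capacity_bytes=FLASH_BYTES):
    """Bytes left in a memory after storing all personas (negative if they do not fit)."""
    return capacity_bytes - persona_sum_bytes


persona_sums = {
    "Arria 10, 50x12": 85_755_648,        # bytes, as stated in the text
    "Stratix V, 30x40": 230_315 * 1000,   # Table 6.1 sum in kB, assuming 1 kB = 1000 bytes
}

for layout, total in persona_sums.items():
    print(layout, headroom_bytes(total))
```

The same function can be called with an on-chip capacity as `capacity_bytes` to check whether internal storage would be an option for a given set of personas.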

7.2.2 Reconfiguration Time

Looking at the results in Section 6.3, the reconfiguration time appears to have a linear dependency on the file size of the bitstream. For the 50x12 layout, the reconfiguration time is below 500 ms. In the CPRI specification [3], it is stated that during line rate negotiation, a new line rate shall be tried if synchronization is not reached within 0.9 to 1.1 seconds. Hence, the reconfiguration time does not violate that specific requirement. Furthermore, there is potential for decreasing the memory access time by using a faster memory and interface than the flash memory with CFI which was used in this design, for example PCI-Express.

It should be noted, however, that no further investigation has been made into how quickly the CPRI L2 block must be up and running with the new line rate after changing the setting in order for synchronization to be established within the given time frame. Also, it is not clear how long the CPRI block as a whole would take to be up and running after the setting has been changed, including the time it takes to reconfigure the PR region as well as other delays already existing in the current implementation. There is also a possibility that it becomes less likely to achieve line rate synchronization in the given time interval, since the remaining time after changing the setting is reduced compared to the flat implementation.

Thus, it can be said that there are good indications that the timing requirements can be met given the results obtained during this project, but that more investigation is necessary in order to fully conclude that this is true.

7.2.3 Logic Utilization

As can be seen in Figure 6.1, the number of ALMs saved when compiling the largest persona (m_10p1) is not more than a few hundred. This does not compensate for the overhead that is introduced when moving from a flat to a PR design. Since the PR design needs to be dimensioned with regard to the largest persona in terms of floorplanning, memory storage etcetera, this means that the value of using PR in the case of this particular design is limited. However, when studying Figure 6.1, one should keep in mind that the JTAG logic would not need to be accounted for if this design were integrated into a real FPGA design. It was only implemented in this project in order to be able to connect a computer to the Avalon Memory-Mapped slave interface of the DMA/PR controller block for measurement and verification purposes. In an integrated design, that slave interface would be connected to a different control unit, for example a CPU.

Another interesting thing to note when studying Figure 6.1 is that the overhead of the CPRI L2 component itself is approximately the same size as the amount of logic saved when compiling the most complex persona (m_10p1). According to Intel Tech Support, one major contributing factor to the component overhead in PR designs is that every port of the PR block will occupy some logic in order to be coherently available to the static region across all different personas. Furthermore, the freeze logic that needs to be implemented for the ports introduces additional overhead. The CPRI L2 component has a large number of ports (over 700), which is a strongly contributing factor to the overhead of the CPRI L2 component when compiled as a PR persona.

7.3 A Closer Look at the Sub BlocksOne possible approach in order to reduce the overhead due to the large numberof ports in the CPRI L2 block would be to isolate one or more sub blocks wherea large amount of logic is saved, and constrain the PR region to only encapsulatethese. This could be a good suggestion for future work, but an initial investigationwas made in order to assess the feasibility of such an approach.

Unfortunately, the results of the investigation did not show much promise. The same problem remains: not much logic is saved for the most complex persona, especially when taking into account the overhead that will be introduced by the I/Os. The number of ports in relation to how much logic is saved remains high even for the sub blocks, largely due to wide data buses (often 40 bits or more). It could still be worth investigating further to see whether there is a combination of interconnected blocks that all yield large resource savings, thus increasing the ratio of saved logic to number of ports.
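
The idea of searching for a favourable combination of interconnected sub blocks can be illustrated with a minimal Python sketch. It is not part of the thesis work: the sub block names, ALM savings, port counts and link widths are hypothetical placeholders, and the model only assumes that a connection between two sub blocks stops counting as a PR region port once both ends lie inside the region.

```python
from itertools import combinations

# Hypothetical sub blocks: ALMs saved by hard coding the settings, and port
# bits that connect to logic outside the CPRI L2 block. Placeholder figures.
SAVED_ALMS = {"a": 120, "b": 90, "c": 15}
EXTERNAL_PORTS = {"a": 40, "b": 48, "c": 64}
# Port bits on connections between sub blocks (undirected, hypothetical).
LINKS = {frozenset(("a", "b")): 80, frozenset(("b", "c")): 40}


def boundary_ports(region):
    """Count port bits that cross the boundary of a candidate PR region."""
    region = set(region)
    ports = sum(EXTERNAL_PORTS[name] for name in region)
    for link, width in LINKS.items():
        # A link adds boundary ports only if exactly one end is inside.
        if len(link & region) == 1:
            ports += width
    return ports


def saved_per_port(region):
    """Ratio of saved ALMs to boundary port bits for a candidate region."""
    return sum(SAVED_ALMS[name] for name in region) / boundary_ports(region)


def rank_regions():
    """Return all non-empty candidate regions, best ratio first."""
    names = sorted(SAVED_ALMS)
    regions = [c for r in range(1, len(names) + 1)
               for c in combinations(names, r)]
    return sorted(regions, key=saved_per_port, reverse=True)
```

With these placeholder figures, the region containing the two tightly connected blocks "a" and "b" ranks first, because their 80-bit interconnection no longer crosses the region boundary. Real input would have to come from compilation reports for each sub block.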

Chapter 8

Conclusions

The CPRI L2 block studied in this thesis appears to be a good candidate for PR implementation at first glance. It has many different possible combinations of settings, and only one setting is used at a time. For example, there is no case where two different line rates are used simultaneously for a single CPRI L2 block. Furthermore, the requirements on reconfiguration time are lenient since switching between the different settings is done seldom. The line rate is changed approximately every second during line rate negotiation and then kept constant for the remainder of the session, and the operating mode is never changed during a session. This is very different from, for example, the case of the reconfigurable hardware accelerator described in [25], where the trade-off between reconfiguration time and accelerated execution is a strongly limiting factor when assessing the benefits of partial reconfiguration. As discussed in Section 7.2.2, the results obtained during the work on this thesis indicate that the reconfiguration time is not a limiting factor.

However, the results in Section 6.1 show that there is no benefit in using Partial Reconfiguration for this particular hardware block, as analyzed in Section 7.2.3. The main reason is that not much logic is saved compared to the flat design when hard coding the settings for the largest persona. The high number of ports decreases the gain even further, and more or less cancels out the savings introduced by hard coding the m 10p1 settings.

Thus, the investigation shows that using PR is not beneficial for this particular design. The project has, however, generated valuable insight into design considerations for future potential PR designs.

8.1 Insights and suggestions for further work

This thesis has yielded a good overview of the PR design flow when applied to a CPRI L2 design and highlighted some of the challenges and added complexity that come with it. A foundational platform for assessing the usefulness of applying the PR design method to a certain logic block has been constructed. This platform and the insights from this project could be used to investigate the usefulness of PR in other blocks of the CPRI system, or in radio/baseband FPGA designs in general.

8.1.1 Port overhead

One of the main insights is that the number of ports of the PR region greatly affects its logic utilization overhead. In this project, it was large enough to single-handedly cancel out the logic utilization savings of the largest persona. Hence, when analyzing whether a certain logic block is suitable for PR implementation, one of the first things that should be investigated is the number of ports. In this thesis project, the logic overhead in terms of ALMs was approximately half of the number of ports. This single observation certainly cannot be considered a dependable rule of thumb, but it gives an indication of the order of magnitude of the overhead generated by the ports.
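
As a rough screening aid, the observation above can be written as a small Python sketch. The factor of 0.5 ALMs per port is the single data point from this project and is used here only as an assumed default, not as an established rule. The function names and the example saving of 300 ALMs are hypothetical.

```python
def port_overhead_alms(num_ports, alms_per_port=0.5):
    """Estimate the ALM overhead caused by the ports of a PR region.

    alms_per_port defaults to the ratio observed in this project; it is an
    assumption and should be replaced by measured data where available.
    """
    return num_ports * alms_per_port


def net_saving_alms(saved_alms, num_ports, alms_per_port=0.5):
    """Logic saved by the largest persona minus the estimated port overhead."""
    return saved_alms - port_overhead_alms(num_ports, alms_per_port)


if __name__ == "__main__":
    # 700 ports as a round figure for the CPRI L2 block; 300 ALMs is a
    # hypothetical stand-in for a saving of "a few hundred" ALMs.
    print(port_overhead_alms(700))
    print(net_saving_alms(300, 700))
```

A negative net saving in such an estimate suggests that the block is unlikely to benefit from PR unless the port count can be reduced.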

8.1.2 PR applicability

Perhaps the most crucial factor when assessing the usefulness of PR for a certain block is the amount of logic that is saved in the most complex persona compared to the flat design. As discussed earlier, the design must be dimensioned after the largest persona in terms of logic utilization. The approach used in this project was to hard code two settings and let the compiler strip the unnecessary logic when compiling the PR persona. This approach is advantageous in the sense that the amount of logic saved can be assessed quickly by comparing the logic utilization when compiling the block as top component with and without the hard coded setting(s). In order to be certain which setting(s) generate the largest persona, this should be repeated for every potential persona. The drawback of this “set and compile” approach is that the amount of logic saved depends heavily on the compiler. Designing for PR implementation from the bottom up would probably decrease the logic utilization of the personas, and this could be worth exploring for CPRI designs in future work. However, it would significantly increase the design effort.
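
The bookkeeping behind the “set and compile” comparison can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tooling used in the project: the persona names and ALM counts are hypothetical, and in practice they would be read from the compilation reports of the block compiled as top component.

```python
def largest_persona(persona_alms):
    """Return the name of the persona with the highest ALM utilization."""
    return max(persona_alms, key=persona_alms.get)


def pr_saving_alms(flat_alms, persona_alms, pr_overhead_alms=0):
    """ALMs saved by PR relative to the flat design.

    The PR region is dimensioned after the largest persona, so only that
    persona determines the saving. pr_overhead_alms covers the extra logic
    of the PR design (ports, freeze logic, control) and is an input here.
    """
    worst = persona_alms[largest_persona(persona_alms)]
    return flat_alms - worst - pr_overhead_alms


if __name__ == "__main__":
    # Hypothetical ALM counts, one entry per hard coded setting combination.
    personas = {"persona_a": 4700, "persona_b": 3900, "persona_c": 4100}
    print(largest_persona(personas))
    print(pr_saving_alms(5000, personas, pr_overhead_alms=350))
```

Because only the largest persona matters, a single persona that strips little logic is enough to remove the benefit, regardless of how small the other personas are.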

8.1.3 Constraining PR Region

As discussed in Section 7.3, one possible direction for further work would be to investigate the potential benefits of restricting the PR region to contain only a subset of the CPRI L2 block, namely the sub blocks where the most logic is saved when hard coding the settings. Even though a shallow investigation of this did not show great potential, there could be combinations of interconnected sub blocks that constitute a beneficial configuration. One general insight from working on this project is that even if the “set and compile” approach is used, it helps if the person working on the PR implementation has a deep understanding of the architecture and functionality of the block at hand. This makes it easier to constrain the PR region to contain only the logic that is affected by the setting(s) in question, which can reduce logic utilization, compile times, memory allocation and reconfiguration times.

8.1.4 Handshaking functionality

One thing that was neither investigated during the work on this thesis nor implemented in the prototyping platform, due to time constraints, is the handshaking functionality between the PR region and the PR IP Controller, which may be necessary depending on the block in the PR region. It is needed, for example, if the PR region block must alert other connected blocks that it will be out of operation in order to avoid system failure. Quartus 16 was used for this project, and Quartus 17 includes a new PR IP suite which contains IP blocks enabling this type of functionality. It would be interesting to analyze the effects on logic utilization as well as system functionality if this were implemented.

8.1.5 Memory interface

Another potential improvement of the platform would be to implement a different type of memory and interface than Flash with CFI. It is possible that this is a bottleneck which increases the reconfiguration time. Using, for example, a memory with a PCI-Express interface could potentially be beneficial. During debugging of a problem in the reconfiguration process, the internal state of the FIFO in the DMA/PR controller was observed. It could be seen that during the reconfiguration process, the FIFO never contained more than one 32-bit word at a time. This indicates that the throughput of the PR IP block exceeds the speed with which the CFI can retrieve data from the Flash memory. Due to time limitations, this was not investigated further during the course of this thesis, but it would be interesting to analyze the potential improvements in the memory retrieval chain in future work. The theoretical maximum throughput of both the CFI and the PR IP block is 200 MB/s, since the circuit is running at 50 MHz and the data buses are 32 bits (4 bytes) wide. However, the actual throughput is limited by the memory access time on the CFI/Flash memory side and by the reconfiguration time for the received data on the PR IP side.
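
The reasoning behind the FIFO observation can be illustrated with a minimal word-level simulation in Python. It is a sketch only and does not model the actual DMA/PR controller: the FIFO is unbounded, and the producer and consumer periods (clock cycles per word on the memory side and on the PR IP side) are hypothetical parameters.

```python
def max_fifo_occupancy(producer_period, consumer_period, cycles=1000):
    """Return the highest FIFO occupancy seen during a simple simulation.

    The memory side writes one word every producer_period cycles. The PR
    side reads one word whenever the FIFO is non-empty and at least
    consumer_period cycles have passed since its previous read.
    """
    occupancy = 0
    highest = 0
    consumer_ready_at = 0
    for cycle in range(cycles):
        if cycle % producer_period == 0:
            occupancy += 1  # memory side delivers one word
        highest = max(highest, occupancy)
        if occupancy > 0 and cycle >= consumer_ready_at:
            occupancy -= 1  # PR side accepts one word
            consumer_ready_at = cycle + consumer_period
    return highest


if __name__ == "__main__":
    # Slow memory side, fast PR side: the FIFO never holds more than one word.
    print(max_fifo_occupancy(producer_period=8, consumer_period=1))
    # Fast memory side, slow PR side: words accumulate in the FIFO.
    print(max_fifo_occupancy(producer_period=1, consumer_period=4))
```

In this model, an occupancy that never exceeds one word means that the consumer keeps up with the producer, which is consistent with the interpretation that the memory retrieval side limits the throughput.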

Bibliography

[1] Altera. Partial Reconfiguration IP Core, 2015. URL https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/ug/ug_partrecon.pdf.

[2] Ericsson AB; Huawei Technologies Co. Ltd; NEC Corporation; Alcatel Lucent; Nokia Networks. Official CPRI website, 2017. URL http://www.cpri.info/.

[3] Ericsson AB; Huawei Technologies Co. Ltd; NEC Corporation; Alcatel Lucent; Nokia Networks. CPRI Specification V7.0, 2015. URL http://www.cpri.info/downloads/CPRI_v_7_0_2015-10-09.pdf.

[4] A. de la Oliva, J. A. Hernandez, D. Larrabeiti, and A. Azcorra. An overview of the CPRI specification and its application to C-RAN-based LTE scenarios. IEEE Communications Magazine, 54(2):152–159, February 2016. ISSN 0163-6804. doi: 10.1109/MCOM.2016.7402275.

[5] Nathan J. Gomes, Philippe Chanclou, Peter Turnbull, Anthony Magee, and Volker Jungnickel. Fronthaul evolution: From CPRI to ethernet. Optical Fiber Technology, 26, Part A:50–58, 2015. ISSN 1068-5200. doi: http://doi.org/10.1016/j.yofte.2015.07.009. URL http://www.sciencedirect.com/science/article/pii/S1068520015000942.

[6] T. Wan and P. Ashwood-Smith. A performance study of CPRI over ethernet with IEEE 802.1Qbu and 802.1Qbv enhancements. In 2015 IEEE Global Communications Conference (GLOBECOM), pages 1–6, Dec 2015. doi: 10.1109/GLOCOM.2015.7417599.

[7] D. Chitimalla, K. Kondepu, L. Valcarenghi, M. Tornatore, and B. Mukherjee. 5G fronthaul-latency and jitter studies of CPRI over ethernet. IEEE/OSA Journal of Optical Communications and Networking, 9(2):172–182, Feb 2017. ISSN 1943-0620. doi: 10.1364/JOCN.9.000172.

[8] S. I. Kartashev and S. P. Kartashev. Dynamic architectures: Problems and solutions. Computer, 11(7):26–40, July 1978. ISSN 0018-9162. doi: 10.1109/C-M.1978.218262.

[9] S. P. Kartashev and S. I. Kartashev. Software problems for dynamic architectures: Adaptive assignment of hardware resources. In Computer Software and Applications Conference, 1978. COMPSAC ’78. The IEEE Computer Society’s Second International, pages 775–780, 1978. doi: 10.1109/CMPSAC.1978.810566.

[10] Ju Hwa Pan, T. Mitra, and Weng-Fai Wong. Configuration bitstream compression for dynamically reconfigurable FPGAs. In IEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004., pages 766–773, Nov 2004. doi: 10.1109/ICCAD.2004.1382679.

[11] Cindy Kao. Benefits of partial reconfiguration. Xcell Journal, 55:65–67, Fourth Quarter 2005. URL https://www.xilinx.com/publications/archives/xcell/Xcell55.pdf.

[12] Shaoshan Liu, Neil Pittman, and Alessandro Forin. Minimizing partial reconfiguration overhead with fully streaming DMA engines and intelligent ICAP controller. Technical report, September 2009. URL https://www.microsoft.com/en-us/research/publication/minimizing-partial-reconfiguration-overhead-with-fully-streaming-dma-engines-and-intelligent-icap-controller/.

[13] M. Liu, W. Kuehn, Z. Lu, and A. Jantsch. Run-time partial reconfiguration speed investigation and architectural design space exploration. In 2009 International Conference on Field Programmable Logic and Applications, pages 498–502, Aug 2009. doi: 10.1109/FPL.2009.5272463.

[14] M. Liu, Z. Lu, W. Kuehn, S. Yang, and A. Jantsch. A reconfigurable design framework for FPGA adaptive computing. In 2009 International Conference on Reconfigurable Computing and FPGAs, pages 439–444, Dec 2009. doi: 10.1109/ReConFig.2009.39.

[15] Jean-Philippe Delahaye, Christophe Moy, Pierre Leray, and Jacques Palicot. Managing Dynamic Partial Reconfiguration on Actual Heterogeneous Platform. In SDR Forum Technical Conference’05, Anaheim, CA, United States, 2005. URL https://hal.archives-ouvertes.fr/hal-00084143.

[16] J. P. Delahaye, J. Palicot, C. Moy, and P. Leray. Partial reconfiguration of FPGAs for dynamical reconfiguration of a software radio platform. In 2007 16th IST Mobile and Wireless Communications Summit, pages 1–5, July 2007. doi: 10.1109/ISTMWC.2007.4299250.

[17] J. Delorme, J. Martin, A. Nafkha, C. Moy, F. Clermidy, P. Leray, and J. Palicot. A FPGA partial reconfiguration design approach for cognitive radio based on NoC architecture. In 2008 Joint 6th International IEEE Northeast Workshop on Circuits and Systems and TAISA Conference, pages 355–358, June 2008. doi: 10.1109/NEWCAS.2008.4606394.

[18] Ming Liu. Adaptive Computing based on FPGA Run-time Reconfigurability. PhD thesis, KTH, Electronic Systems, 2011. QC 20110531.

[19] Byron Navas. Cognitive and Self-Adaptive SoCs with Self-Healing Run-Time-Reconfigurable RecoBlocks. PhD thesis, KTH, Electronics and Embedded Systems, 2015. QC 20151201.

[20] B. Navas, I. Sander, and J. Oberg. The RecoBlock SoC platform: A flexible array of reusable run-time-reconfigurable IP-blocks. In 2013 Design, Automation Test in Europe Conference Exhibition, pages 833–838, March 2013. doi: 10.7873/DATE.2013.176.

[21] Jacob Siverskog. Evaluation of partial reconfiguration for FPGA debugging. LiTH-ISY-EX–10/4390–SE, Master’s thesis, Linköping University, Department of Electrical Engineering, 2010.

[22] Kalle Ngo. FPGA hardware acceleration of inception style parameter reduced convolution neural networks. Master’s thesis, KTH, School of Information and Communication Technology (ICT), 2016.

[23] Anju P. Johnson, Sayandeep Saha, Rajat Subhra Chakraborty, Debdeep Mukhopadhyay, and Sezer Goren. Fault attack on AES via hardware trojan insertion by dynamic partial reconfiguration of FPGA over ethernet. In Proceedings of the 9th Workshop on Embedded Systems Security, WESS ’14, pages 1:1–1:8, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2932-3. doi: 10.1145/2668322.2668323. URL http://doi.acm.org/10.1145/2668322.2668323.

[24] Naser Derakshan. Design and implementation of hardened reconfiguration controller for self-healing systems on SRAM-based FPGAs. Master’s thesis, KTH, School of Information and Communication Technology (ICT), 2013.

[25] Emilio Fazzoletto. Characterization of partial and run-time reconfigurable FPGAs. Master’s thesis, KTH, School of Information and Communication Technology (ICT), 2016.

[26] E. J. McDonald. Runtime FPGA partial reconfiguration. IEEE Aerospace and Electronic Systems Magazine, 23(7):10–15, July 2008. ISSN 0885-8985. doi: 10.1109/MAES.2008.4579286.

[27] A. M. Lalge, A. Shrivastav, and S. U. Bhandari. Implementing PSK modems on FPGA using partial reconfiguration. In 2015 International Conference on Computing Communication Control and Automation, pages 917–921, February 2015. doi: 10.1109/ICCUBEA.2015.182.

[28] A. Hassan, R. Ahmed, H. Mostafa, H. A. H. Fahmy, and A. Hussien. Performance evaluation of dynamic partial reconfiguration techniques for software defined radio implementation on FPGA. In 2015 IEEE International Conference on Electronics, Circuits, and Systems (ICECS), pages 183–186, Dec 2015. doi: 10.1109/ICECS.2015.7440279.

[29] Mohammed Moanes Ezzaldea Ali Kareem Nahar, Sabah A. Gitaffa and Hussain K. Khleaf. FPGA implementation of MC-CDMA wireless communication system based on SDR-a review. In Review of Information Engineering and Applications, pages 1–19, January 2017. doi: 10.18488/journal.79.2017.41.1.19.

[30] K. A. Arun Kumar. FPGA implementation of QAM modems using PR for reconfigurable wireless radios. In 2013 Annual International Conference on Emerging Research Areas and 2013 International Conference on Microelectronics, Communications and Renewable Energy, pages 1–6, June 2013. doi: 10.1109/AICERA-ICMiCR.2013.6575999.

[31] K. A. Arun Kumar. An implementation of DPD in FPGA with a soft proceesor using partial re-configuration for wireless radios. In 2013 IEEE Conference on Information Communication Technologies, pages 860–864, April 2013. doi: 10.1109/CICT.2013.6558215.

[32] K. A. A. Kumar. An OFDM transmitter implementation using CORDIC based partially reconfigurable IFFT module. In 2014 3rd International Conference on Eco-friendly Computing and Communication Systems, pages 266–270, Dec 2014. doi: 10.1109/Eco-friendly.2014.61.

[33] K. A. Arun Kumar. FPGA implementation of PSK modems using partial re-configuration for SDR and CR applications. In 2012 Annual IEEE India Conference (INDICON), pages 205–209, Dec 2012. doi: 10.1109/INDCON.2012.6420616.

[34] M. A. Rihani, J. C. Prevotet, F. Nouvel, M. Mroue, and Y. Mohanna. ARM-FPGA based platform for automated adaptive wireless communication systems using partial reconfiguration technique. In 2016 Conference on Design and Architectures for Signal and Image Processing (DASIP), pages 113–120, Oct 2016. doi: 10.1109/DASIP.2016.7853806.

[35] F. Shamani, R. Airoldi, T. Ahonen, and J. Nurmi. FPGA implementation of a flexible synchronizer for cognitive radio applications. In Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing, pages 1–8, Oct 2014. doi: 10.1109/DASIP.2014.7115603.

[36] Arun Kumar K A. An SoC based partially reconfigurable OFDM transmitters for cognitive radios. In 2013 International Conference on Control Communication and Computing (ICCC), pages 172–177, Dec 2013. doi: 10.1109/ICCC.2013.6731645.

[37] L. Valcarenghi, K. Kondepu, and P. Castoldi. Analytical and experimental evaluation of CPRI over ethernet dynamic rate reconfiguration. In 2016 IEEE International Conference on Communications (ICC), pages 1–6, May 2016. doi: 10.1109/ICC.2016.7510718.

[38] D. Chitimalla, K. Kondepu, L. Valcarenghi, and B. Mukherjee. Reconfigurable and efficient fronthaul of 5G systems. In 2015 IEEE International Conference on Advanced Networks and Telecommuncations Systems (ANTS), pages 1–5, Dec 2015. doi: 10.1109/ANTS.2015.7413609.

[39] He Fei, Zhao Yixin, and Huang Wei. Design and research of dynamic partial reconfiguration for industrial ethernet. In 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), pages 1146–1152, Oct 2016. doi: 10.1109/IMCEC.2016.7867391.

[40] Zhenzhong Xiao, D. Koch, and M. Lujan. A partial reconfiguration controller for Altera Stratix V FPGAs. In 2016 26th International Conference on Field Programmable Logic and Applications (FPL), pages 1–4, Aug 2016. doi: 10.1109/FPL.2016.7577349.

[41] Intel. Intel® Arria® 10 Core Fabric and General Purpose I/Os Handbook, 2017. URL https://www.altera.com/en_US/pdfs/literature/hb/arria-10/a10_handbook.pdf.

[42] Intel. Intel® Arria® 10 Device Overview, 2017. URL https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/hb/arria-10/a10_overview.pdf.

[43] Intel. Intel Quartus Prime Pro Edition Handbook Volume 1: Design and Compilation, 2017. URL https://www.altera.com/documentation/jbr1437426657605.html.

[44] Altera. AN797 - Partially Reconfiguring a Design on Arria 10 GX FPGA Development Board, 2016. URL https://www.altera.com/documentation/ihj1482170009390.html.

[45] Intel. Arria 10 GX Transceiver Signal Integrity Kit website, 2017. URL www.altera.com/products/boards_and_kits/dev-kits/altera/kit-a10-gx-si.html.

TRITA TRITA-EECS-EX-2018:65

www.kth.se

