EDA Tech Forum Journal: March 2009
Page 1: EDA Tech Forum Journal: March 2009

The Technical Journal for the Electronic Design Automation Community

Volume 6 Issue 1, March 2009 | www.edatechforum.com

Embedded | ESL/SystemC | Digital/Analog Implementation | Tested Component to System | Verified RTL to Gates | Design to Silicon

INSIDE: Managing your IP for profit

Embedded World bucks recession

Toward the power-aware OS

Turbocharging ESL design

Virtual prototypes for security

Page 2: EDA Tech Forum Journal: March 2009

COMMON PLATFORM TECHNOLOGY

Chartered Semiconductor Manufacturing, IBM and Samsung provide you with the access to innovation you need for industry-changing 32/28nm high-k/metal gate (HKMG) technology with manufacturing alignment, ecosystem design enablement, and flexibility of support through Common Platform technology.

Collaborating with some of the world’s premier IDMs to develop leading-edge technology as part of a joint development alliance, Chartered, IBM and Samsung provide access to this technology as well as qualified IP and robust ecosystem offerings to help you get to market faster, with less risk and more choice in your manufacturing options.

Visit www.commonplatform.com today to find out how you can get your access to innovation.

www.commonplatform.com

Industry availability of real innovation in materials science, process technology and manufacturing for differentiated customer solutions.


Page 3: EDA Tech Forum Journal: March 2009

EDA Tech Forum March 2009

contents

EDA Tech Forum, Volume 6, Issue 1, March 2009

EDA Tech Forum Journal is a quarterly publication for the Electronic Design Automation community including design engineers, engineering managers, industry executives and academia. The journal provides an ongoing medium in which to discuss, debate and communicate the electronic design automation industry’s most pressing issues, challenges, methodologies, problem-solving techniques and trends.

EDA Tech Forum Journal is distributed to a dedicated circulation of 50,000 subscribers.

EDA Tech Forum is a trademark of Mentor Graphics Corporation, and is owned and published by Mentor Graphics. Rights in contributed works remain the copyright of the respective authors. Rights in the compilation are the copyright of Mentor Graphics Corporation. Publication of information about third party products and services does not constitute Mentor Graphics’ approval, opinion, warranty, or endorsement thereof. Authors’ opinions are their own and may not reflect the opinion of Mentor Graphics Corporation.

< TECH FORUM >

18 Embedded: The power-aware OS (Mentor Graphics)

22 Embedded: FPGA-based speech encrypting and decrypting embedded system (Texas A&M University)

26 ESL/SystemC: Rapid design flows for advanced technology pathfinding (NXP-TSMC Research Center and NXP Semiconductors)

30 Verified RTL to gates: Multiple cross clock domain verification (eInfochips)

34 Digital/analog implementation: TrustMe-ViP: trusted personal devices virtual prototyping (University of Nice-Sophia Antipolis)

42 Design to silicon: Chemical mechanical polish: the enabling technology (Intel)

46 Tested component to system: Automating sawtooth tuning (Broadcom)

< COMMENTARY >

5 Start Here: Mainstream not niche. Our last chance on green design.

6 Analysis: Choose, but choose wisely. Some consumer markets remain busy in spite of the recession’s severity.

10 Design Management: A profitable discipline. Efficiently tracking the use of intellectual property and open source software is becoming increasingly vital.

14 Conference Preview: Ahead of the game. The Embedded World conference has actually grown in 2009.

Page 4: EDA Tech Forum Journal: March 2009


< EDITORIAL TEAM >
Editor-in-Chief: Paul Dempsey, +1 703 536 1609, [email protected]
Managing Editor: Marina Tringali, +1 949 226 2020, [email protected]
Copy Editor: Rochelle Cohn

< CREATIVE TEAM >
Creative Director: Jason Van Dorn, [email protected]
Art Director: Kirsten Wyatt, [email protected]
Graphic Designer: Christopher Saucier, [email protected]

< EXECUTIVE MANAGEMENT TEAM >
President: John Reardon, [email protected]
Vice President: Cindy Hickson, [email protected]
Vice President of Finance: Cindy Muir, [email protected]
Director of Corporate Marketing: Aaron Foellmi, [email protected]

< SALES TEAM >
Advertising Manager: Stacy Gandre, +1 949 226 2024, [email protected]
Advertising Manager: Lauren Trudeau, +1 949 226 2014, [email protected]
Advertising Manager: Shandi Ricciotti, +1 949 573 7660, [email protected]

(Advertisement: TeamEDA)

Supported vendors: Mentor, Mathworks, Cadence, Agilent, Synopsys, Synplicity, Denali, Magma, and more.

Track & monitor: licensed assets, costs, utilization, servers & daemons, expiration alerts, P.O.s, agreements, and more.

Consolidate, share, and protect all pertinent engineering software, vendor, and license data with TeamEDA’s License Asset Manager.

Please call for a web demonstration: www.teameda.com, (978) 251-7510, [email protected]


(Advertisement) Make sure you get every issue. Subscribe at: www.edatechforum.com. Shown: past issues of EDA Tech Forum, Volume 4 Issue 3 (September 2007) and Volume 4 Issue 4 (December 2007).

Page 5: EDA Tech Forum Journal: March 2009


start here

Mainstream not niche

Green design will be one of the enduring themes of 2009, or so we are told. As hard-faced as it might sound, though, it is hard to see this theme getting the attention it needs right now.

My problem with the whole concept is that it currently seems to attract a premium at just about every point in the design and supply chain, above and beyond what is truly merited by the extra functionality or physical qualities that a tool or component or board or finished product may possess. Slap on “green” and you can slap on 10%.

Companies are even encouraged to think along these lines by reputable research, such as that recently published by the Consumer Electronics Association. It said that just over a fifth of the John and Jane Qs out there would be prepared to pay 20% more on the retail price for an average flat-panel display, if it had been designed with the environment in mind.

Well, sorry, but even people from Greenpeace look at that skeptically. As Casey Harrell, one of the pressure group’s senior campaigners on electronics says, “There is always a big difference here between what people say and what they then go and do.” In other words, if you frog-marched all those people to Best Buy straight after getting their answers, would they still lay out the extra cash?

And there is another issue that goes begging here. Even if 22% of people will open their pocketbooks that wide for such goods and services, it’s not enough. Global warming, carbon footprints and related issues cannot be solved with niche products. The approach we take has to be mass-market.

Now this might sound as though I’m telling electronics companies to sacrifice margin. However, I don’t think it’s still called a “sacrifice” when you have no choice, and that is the way things are going.

President Obama has picked Steven Chu as his energy secretary, a Nobel Laureate who has made climate change the focus of his work, and who has also published and promoted some of the most disturbing research on the topic. The signal is clear—if industry won’t do something, here is a guy who will and has full executive backing.

So, unless electronics wants to face regulations that impose green thinking on its mainstream design processes, it needs to start developing those processes itself, and account for the environment in everyday products in a commercially viable way. Such a strategy will actually protect margins rather than sacrifice them to red tape and regulatory diktat.

Paul Dempsey
Editor-in-Chief

Page 6: EDA Tech Forum Journal: March 2009

< COMMENTARY > ANALYSIS

The consumer sector has become a reliable bellwether for the state of the electronics industry. However you slice it, the latest data does not paint a pretty picture. With the publication of its latest U.S. market forecast, the Consumer Electronics Association (CEA) provided data that put the typical knock-on effect of a recession into concrete terms.

“Economists estimate that for every dollar decline in wealth, consumption declines approximately 4-8¢,” said Steve Koenig, the association’s director for industry analysis.

Given that macro-economic formula, the further bad news is that the ‘wealth’ number (broadly analogous to disposable income) is now 40% off the base (Figure 1). And remember, consumer spending represents about 70% of the U.S. economy.
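The CEA’s rule of thumb lends itself to a one-line calculation. A minimal sketch in Python; the function name and the example wealth figure are ours for illustration, not numbers from the article:

```python
# Rule of thumb cited by the CEA: every $1 decline in household wealth cuts
# consumption by roughly 4 to 8 cents. The wealth-decline input below is
# purely illustrative.

def consumption_decline(wealth_decline, low=0.04, high=0.08):
    """Return the (low, high) range for the implied drop in consumption."""
    return wealth_decline * low, wealth_decline * high

# Illustrative only: a $1 trillion fall in household wealth.
lo, hi = consumption_decline(1_000_000_000_000)
print(f"Implied consumption decline: ${lo / 1e9:.0f}B to ${hi / 1e9:.0f}B")
```

On that illustrative input, the range works out to $40B–$80B of lost consumption, which is why a 40% fall in the wealth base matters so much to a consumption-driven economy.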

“[2009] will remain a very bad year,” said CEA economist Shawn DuBravac. “This will be the worst two-year period we’ve seen since the early eighties. We project that this year consumer spending will be down 0.3%, which will be a little bit worse than we’ve already experienced in the latter part of 2008.”

Overall, according to CEA chairman and CEO Gary Shapiro, his organization is forecasting a 0.6% year-on-year decline in U.S. consumer electronics factory sales for 2009. This will follow on from a 5.4% rise in 2008, against an originally forecast 6%.

If confirmed, this projection will mean that 2009 is only the fourth year since the seventies when CE revenues have contracted, alongside 2001 (a post-9/11 3% drop), 1991 (-0.8%), and 1974 and 1975 (both of which suffered double-digit declines).

Even before we entered 2009, retailer Tweeter had shuttered all its stores, Circuit City had filed for Chapter 11 bankruptcy protection and Polaroid Consumer Electronics had done the same. Circuit City has since also been forced to throw in the towel.

At least the CEA is looking for a relatively short and sharp downturn. “2010 should be a year of recovery, and we hope and expect to return to positive industry growth,” Shapiro said. However, in the meantime, should we all head for a dark, dank corner of the basement and pull on a paper bag?

Strangely enough, by mining the data you find evidence to suggest that while things are going to be tough, notions that electronics design activity is about to judder to a halt are misplaced.

Recession or not, 4,000 flat screen TVs are sold every day, 20,000 new products were launched at the Consumer Electronics Show in January, and American consumers are still set to buy more than one billion CE products over the course of 2009.

It is probably fair to say that we entered the age of ‘electronics everywhere’ several years ago, largely courtesy of the mobile phone. However, one positive aspect of that may only become clear now that we are in recession: many CE products are no longer viewed as luxuries but more as essentials, and are therefore less likely to fall out of people’s spending plans when belts must be tightened.

Consequently, there are numerous instances of products where the unit shipments are set to rise in 2009 even though price pressures and other market dynamics may cause revenues from them to fall. It is all a question of balance. One particularly fast-moving product shows how carefully that question must be considered.

The rise of the laptop ‘lite’

Netbooks, ultra-lights, mini-notebooks—call them what you will (notwithstanding that there are some trademarks in there). The fact is that if you are working on board design for any of these children spawned by the 7-inch-screen

Our annual review of consumer electronics trends finds there is still scope for well-targeted design activity.

Choose, but choose wisely

FIGURE 1 Large negative wealth effects. Recession periods plotted against S&P 500 year-on-year percentage change, 1980–2007; economists estimate that for every dollar decline in wealth, consumption declines approximately 4 to 8 cents. Source: Standard & Poor’s.

Page 7: EDA Tech Forum Journal: March 2009


Asustek (aka Asus) Eee PC, you are probably looking forward to a busy 2009. The same is true for those working on processors and other chipsets for this market.

It is hard to believe that the first of these products only shipped as recently as September 2007, less than 18 months ago. Already, the CEA says that they account for 7% of all laptop sales, and forecasts that they will move up to an 11% share by the end of the year (Figure 2). The Information Network, a Pennsylvania-based research firm, takes a broadly similar view on the units. It says that 11.4 million of these devices were sold last year and that 21.5 million will sell this year, implying a 12% share.

In many respects, the start of shipments for Intel’s Atom processor last April really ignited this segment, bringing in vendors who competed to follow Asus in shrinking form factors and applying better and better industrial design (rapidly becoming an even more powerful product differentiator here than processor power, OS, or memory). Dell, Lenovo and Acer have all launched their own Atom-based equivalents. HP has also entered the space, but using a Via Technologies processor.

At CES, a second wave of products, now based on rival ARM-powered processors, went on view. These ranged from Cortex A8-based offerings in CPUs from companies such as Freescale Semiconductor to Qualcomm’s Snapdragon engine that is based around an ARM architectural license. This new set of machines has brought a more problematic side of the ‘netbook’ market into focus.

The original Eee PC was Celeron-powered, had a very short battery life and was not intended for much beyond Web-surfing and email. As such, calling these devices ‘netbooks’ actually suited Intel quite well. It implied that this was a wholly separate product segment from that of notebooks, preserving the higher-end market for its more expensive, higher performance (and higher margin) Core processors. In short, the Eee PC (and its equivalents) were ‘cheap’ at around the $300 mark because they were limited.

Then, however, netbooks started to undergo ‘function creep’. The processing power of the Atom allowed OEMs to add Windows XP and programs from Microsoft Office or—to keep costs down—have their products run the OpenOffice suite on Linux. Now, the entry of ARM-based chips has upped the ante still further.

Here are the technical specifications for Qualcomm’s Snapdragon platform:

• 1GHz CPU
• 600MHz DSP
• Support for Linux and Windows Mobile
• WWAN, Wi-Fi and Bluetooth connectivity
• Seventh-generation gpsOne engine for standalone-GPS and assisted-GPS modes, as well as gpsOneXTRA Assistance
• High-definition video decode (720p)
• 3D graphics with up to 22m triangles/sec and 133m 3D pixels/sec
• High-resolution XGA display support
• 12-megapixel camera
• Support for multiple video codecs
• Audio codecs: AAC+, eAAC+, AMR, FR, EFR, HR, WB-AMR, G.729a, G.711, AAC stereo encode
• Support for broadcast TV (MediaFLO, DVB-H and ISDB-T)
• Fully tested, highly integrated solution including baseband, software, RF, PMIC, Bluetooth, broadcast and Wi-Fi

Notwithstanding XP’s absence from that list, this kind of machine looks very much like a traditional notebook. Indeed, the capability of these small form factor machines to take on more and more functionality is not only being driven by increasing competition and performance in the processor space, but also by trends in storage.

According to SanDisk, integrating 32GB of solid-state (typically flash) storage (SSD) into one of these machines now costs about the same as an equivalent capacity 2.5” hard disk drive (HDD), and a 64GB SSD is available at only a “small premium” to the equivalent 1.8” HDD.

Prototypes are in circulation that can quite comfortably carry three HD movies and—given that they are based around engines optimized for the low-power obsessed mobile communications space—have the battery life to show them. The portable computer that can run for an entire transatlantic flight is apparently with us—some designs now claim eight hours of battery life and above.

The even more important issue here, though, is cost. Netbook OEMs say that an Atom processor adds $30 to a bill of materials, and to date this has set the floor for this new market at about $250 retail. However, the Freescale i.MX515 used in a netbook launched at CES by Pegatron is said to add only $20, and also comes with software and other peripherals. Thus armed, Pegatron is targeting a $199 price point, the figure the far less well-featured Eee PC originally aimed for but could not hit.

FIGURE 2 Netbook market share of laptop sales: 0% (2007), 7% (2008), 11% (2009), 13% (2010), 13% (2011), 14% (2012). Source: CEA.


Page 8: EDA Tech Forum Journal: March 2009


Competition for this netbook space is going to intensify still further. Other ARM licensees such as Samsung, Texas Instruments and Marvell will release CPUs for it. AMD, Intel’s chief rival in higher-end PCs, now has the Athlon Neo. Via already has the ultra-low-voltage C7-M processor used by HP. And graphics specialist Nvidia has ambitions in this segment. It is an exciting space, then, and promises to be a very busy one for design in the months ahead. If there is one other thing that is now obvious about these netbooks, it is that their functionality is still in flux.

However, what about the bottom line? There are those who see Intel’s Atom cannibalizing its existing higher-end laptop business. According to the Information Network, consumers do not see netbooks as a separate category, but as a cheaper alternative to a ‘traditional’ laptop. As a result, the research firm has said that Atom may have actually cost Intel $1.14B in revenue in 2008, and could cost it another $2.16B, assuming 50% of netbook purchasers would otherwise have bought notebooks and a $200 price difference between Atom and Core devices.
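The Information Network’s estimate is simple to reproduce from the figures already quoted: netbook units sold, times the share of buyers assumed to have otherwise bought a notebook, times the Atom-versus-Core price difference. A quick sketch (the function name is ours):

```python
# Reproducing the cannibalization arithmetic: units x assumed 50%
# cannibalization rate x $200 Atom-vs-Core price gap.

def lost_revenue(netbook_units, cannibalization_rate=0.5, price_gap=200):
    """Revenue forgone if some netbook buyers would have bought notebooks."""
    return netbook_units * cannibalization_rate * price_gap

print(lost_revenue(11_400_000))  # 2008: 11.4M units -> $1.14B
print(lost_revenue(21_500_000))  # 2009 forecast: 21.5M units -> $2.15B
```

Note that 21.5 million units yields $2.15B, a hair under the $2.16B quoted, which presumably reflects rounding somewhere in the unit forecast.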

There is one further factor here, the insertion of mobile communications economics into a PC space. So far, Europe has been a far more voracious adopter of netbooks than North America. An important volume driver is that the machines are being bundled with data contracts by the continent’s 3G network operators. The first similar U.S. offer came in January from AT&T, which is discounting Dell’s Inspiron Mini 9 from $449 to $99 in return for a two-year 3G data contract.

When network operators become major customers, suppliers soon find that they are not willing to accept PC-level CPU margins. That assumption underpins the business model ARM and its licensees are applying to netbooks, seeing the space as an upwards evolution from cell phones, through smart phones, through mobile Internet devices and ultimately into something akin to the traditional computing space. Is Intel then trapped with Atom by a process under which it is evolving backwards, to some extent? And bear in mind that recessionary economics and their impact on consumer buying patterns need to be overlaid on all these existing market dynamics.

Just good enough?

In 1997’s The Innovator’s Dilemma, Clayton Christensen famously described and analyzed a number of the scenarios that have challenged and wounded technology companies, even when they have been apparently on top of their games. One tale in particular is being cited by a number of executives right now and concerns the former Digital Equipment Corporation (DEC).

DEC was, in its time, the prince of the minicomputer business—but then along came the PC, and for a number of reasons, including its structure and insistence on only high margin business, DEC missed the boat. However, another important factor was that while DEC continued to make the highest performance computers, that equipment’s specifications far exceeded the market’s need. The PC’s strength was that it combined being ‘just good enough’ (JGE) to do what its users needed with a much lower price point. Sound familiar?

To suggest that Intel may be at a similar crossroads now is something that the company’s competitors might well promote for obvious reasons—though you doubt that even they believe it entirely. Nevertheless, there may well be something in the idea that we are entering a period where JGE products are those that are especially well placed to win customers’ business.

Figure 3 shows the products that the CEA expects to post the greatest volume growth in 2009. In reviewing that list, one caveat always has to be made: any percentage ranking will always favor very new products that are growing from a low initial sales volume in the year before. Still, you can reasonably argue that five of the product groups on that list fit JGE criteria.

HD flash camcorders

Flip Video’s MinoHD has an incredibly straightforward iPod-like user interface but few of the effects that you will find on a traditional camcorder. It will capture high-definition H.264 video—about an hour’s worth on a 4GB SSD—and fit in your pocket. Accessing the content is via a simple USB connection.

The retail price is $229.99 against a previous entry level for HD camcorders of about $500, but the Mino will do what most people want it to do. More to the point, most established OEMs in this space are bringing their own ‘me too’ products to market.

Netbooks/subnotebooks

See above.

Next-generation DVD players/home-theater-in-a-box (HTIB) with Blu-ray

There was surprise in some quarters that although sales in this segment rose 118% in 2008, the figure should have been much higher given the comparatively low base and

FIGURE 3 2009’s fastest growing CE products (based on estimated shipment revenues):

1) OLED displays, 149%
2) E-readers, 110%
3) HD flash camcorders, 106%
4) Netbooks/subnotebooks, 80%
5) Climate systems/communicating thermostats, 71%
6) Next-generation DVD players, 62%
7) LCD TVs with 120Hz+ refresh rate, 57%
8) Portable navigation (traffic compatible), 52%
9) MP3 players with wireless connectivity, 41%
10) Home-theater-in-a-box with Blu-ray, 30%

Source: CEA

Page 9: EDA Tech Forum Journal: March 2009


Toshiba’s decision to terminate the HD-DVD format almost immediately after CES 2008. Blu-ray Disc should have done much better with the market to itself.

However, for most of last year, Sony concentrated on marketing the format through its PlayStation3 (with, apparently, limited results among non-gamers) while other manufacturers held back equipment that fully met the 2.0 specifications until later in the year.

More importantly though, low-cost Blu-ray players—such as a $199 model from Best Buy in-house brand Insignia—only reached retail shelves in the fourth quarter. As this price point continues to fall over the course of 2009, more take-up will occur as the format moves from being a premium, performance-based technology, to one whose JGE position is established by the cheaper kit. The bundling of Blu-ray technology in home theaters will also help.

LCD TV with 120Hz+ refresh rate

The LCD has long been perceived as a junior technology to a plasma display in performance terms, but once it does get to this level of screen refresh, the important point is that it will have improved sufficiently to convert more consumers.

The overarching point here is that none of these products would necessarily have the notion of ‘high-end’ associated with them. They are mass-market in the purest sense, but with perhaps more of an emphasis on the price point underpinning their growth than might be the case in a more robust economy.

FIGURE 4 CE sales outlook (unit sales in millions)

Segment                2008    2009    2010    2011    2012
Digital Displays       32.7    34.6    36.7    38.3    39.0    (5.8% growth in 2009)
Wireless Handsets     121.0   124.2   127.9   136.0   144.8    (2.6% growth in 2009)
Personal Computers     27.6    29.0    33.2    38.2    43.3    (5.1% growth in 2009)
Game Consoles          35.3    36.3    35.3    32.5    29.2    (2.8% growth in 2009)

The CEA expects its four core product groups to each post growth in unit sales during 2009 despite the recession. The two tipped for the greater increases are Digital Displays (5.8%) and PCs (5.1%).

The tumbling cost of displays is undoubtedly helping the sector maintain sales traffic, but a second factor that will continue to push the 2009 number is the impending switch-off of analog TV broadcasts in the U.S. PC numbers are largely being driven by the netbook subsection of the laptop market, as discussed in the main article.

If the CEA numbers here do provide one source of concern, it is in the games consoles segment, with 2009 now pinned as the peak year for the current generation.

The latest machines from Sony, Nintendo and Microsoft have been available for two years at least, so some maturing was to have been expected. However, given the PlayStation3’s Blu-ray capabilities and following its repositioning in 2008 as an entertainment center rather than simply a gaming console, Sony may have expected its product to show more longevity than these numbers suggest.

Source: CEA
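The sidebar’s growth percentages follow directly from the unit figures in Figure 4. A short sketch (data transcribed from the figure) confirms them:

```python
# Year-on-year 2009 growth for each CEA core product group is simply
# (2009 units / 2008 units) - 1, using the unit data from Figure 4.

units_2008_2009 = {
    "Digital Displays":   (32.7, 34.6),
    "Wireless Handsets":  (121.0, 124.2),
    "Personal Computers": (27.6, 29.0),
    "Game Consoles":      (35.3, 36.3),
}

for segment, (y2008, y2009) in units_2008_2009.items():
    growth = (y2009 / y2008 - 1) * 100
    print(f"{segment}: {growth:.1f}% growth in 2009")
```

Run, this reproduces the 5.8%, 2.6%, 5.1% and 2.8% figures quoted in the sidebar.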

Page 10: EDA Tech Forum Journal: March 2009

< COMMENTARY > DESIGN MANAGEMENT

The software food chain

Software has become a ubiquitous part of many products and services, some of which are distributed and used on a massive scale. Think of cloud computing, online banking and the engines for today’s cellphones and cameras. Competition requires that such new products are developed more quickly than ever before and yet brought to market at lower and lower cost.

Project managers therefore demand the design efficiency inherent in the reuse of existing, proven software code, be it internally generated or acquired from third parties. Most companies have lots of their own code files. Many, many more can now easily be sourced online and downloaded from open source libraries or acquired from external developers at a fraction of the time and cost it would take to do the work from scratch in-house.

Open source code is also attractive because it tends to grow exponentially in many respects—e.g., functionality, richness, stability, quality and security—while being subjected to rigorous peer review. Yet it is also thought to involve no monetary cost. However, such software is still governed by a wide range of complex license regulations and usage terms that require expert interpretation.

Keeping track of the use of such software is also difficult. Companies rarely create the entire software load in their products or services. Rather, most players in the ‘food chain’ (Figure 1) assemble software from a mixture of suppliers, contractors and open sources, then couple it with their own value-add code and pass it along to another stage or competence—at which point, the same process of creative software combination from various sources will take place all over again.

Today’s typical end product or service is the sum of code provided by many organizations and individuals at multiple stages of a supply chain. Software modules are treated as commodities, as has happened for longer with many hardware components. However, there is one major difference: software code comes laden with licensing and intellectual property (IP) terms that can seriously affect a product’s commercialization and/or certain corporate transactions.

Without a complete and up-to-date view of all the software components and contributors to its products or services, a company runs the risk of encountering unexpected costs, missed deadlines and significant business risks. In particularly egregious scenarios, software components may not meet specification and need to be repaired, or proper IP and copyright obligations may be overlooked, yet the problems they create only come to light after a product has been released. Imagine that a security fault is discovered in the protocol stack supplied by one of the players. The end-user now has to ascertain which products are affected (potentially thousands), and meet the cost of solving the problem and correcting it in each instance.
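The scenario just described is exactly what a machine-readable bill of materials makes tractable. A minimal illustrative sketch, with entirely hypothetical product and component names:

```python
# With per-product component records, "which products ship the faulty
# component?" becomes a lookup instead of an audit. All names here are
# hypothetical examples, not real products.

product_boms = {
    "camera-fw-2.1":  {"protocol-stack-1.4", "codec-lib-3.0"},
    "router-os-5.0":  {"protocol-stack-1.4", "crypto-lib-2.2"},
    "phone-base-1.7": {"protocol-stack-2.0", "codec-lib-3.0"},
}

def affected_products(boms, faulty_component):
    """Return, sorted, every product whose BOM lists the faulty component."""
    return sorted(p for p, parts in boms.items() if faulty_component in parts)

print(affected_products(product_boms, "protocol-stack-1.4"))
```

Without such records, answering the same question means re-auditing every code base after the fact, at far greater cost.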

Software governance and development discipline

The risks outlined above can be greatly reduced, even eliminated, if you have proper, accurate records for all the code in a product and know the pedigree of all its components. Alas, few companies have traditionally managed to capture such a ‘software bill of materials’ because those tools that could determine code’s content and manage IP correctly were mainly retrospective. They also either involved cumbersome manual audits or were expensive tools for automatic code analysis.

These problems can now be avoided by implementing real-time records-gathering and preventive IP management processes based on appropriate design flow policies. Moreover, recently released preventive tools allow for the real-time analysis, recording and compliance-checking of legacy code as well as any new code brought into a project.

Managing your software and IP usage is all about the bottom line, say Mahshad Koohgoli and Sorin Cohn-Sfetcu.

A profitable discipline

FIGURE 1 The software food chain: Open Source, Software Vendor, Chips/Sub-systems Supplier, Development Outsourcer, Product Company, Service Provider, End-User. Source: Protecode.

Page 11: EDA Tech Forum Journal: March 2009


While good software development practices have evolved to include systems for checking syntax, managing software versions and tracking software bugs, disciplines that are standard practice in structured hardware development flows have yet to be adopted. This latter group includes:

• Approved vendor lists that should contain approved components and licenses, including the commercial terms, vendor history, version details, historic pricing, and so on. A developer can then select components freely from the list without concern.

• Automatic notification in case attempts are made to use code with unapproved licenses, or code modules that are governed by incompatible licenses.

• A bill of materials that fully records which components feature in the final product and that includes necessary details to enable production, determine costs, set export terms, track vendor upgrades and manage other post-design activities.
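The second practice above, automatic notification for unapproved licenses, reduces at its simplest to a set-membership check at build or check-in time. An illustrative sketch; the approved list and the component records are hypothetical, not from the article:

```python
# Flag any component whose license is not on the organization's approved
# list. License identifiers are real SPDX names; the components and the
# policy itself are made-up examples.

APPROVED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}

components = [
    {"name": "json-parser", "license": "MIT"},
    {"name": "media-codec", "license": "GPL-3.0-only"},  # not approved here
    {"name": "http-client", "license": "Apache-2.0"},
]

def unapproved(components, approved=APPROVED_LICENSES):
    """Return the names of components whose licenses need legal review."""
    return [c["name"] for c in components if c["license"] not in approved]

violations = unapproved(components)
if violations:
    print("License review needed for:", ", ".join(violations))
```

In a real flow this check would run automatically in the development environment, so the notification arrives before the code enters the product rather than after release.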

Such advanced practices must be applied to software development processes in order to ensure proper software governance. They will give organizations a more effective overview of their source code assets and streamline product commercialization. To establish the necessary culture of software development discipline and good IP management, product and development managers must therefore:

• understand the economics of software governance;
• work with legal counsel to create and administer appropriate policies for software IP management;
• acquire full knowledge of licensing and other IP characteristics of the existing code in the institution and clean it where necessary; and
• automate the record keeping and IP management process as an integral part of the software development process going forward.

The latest automated tools can help managers take all these necessary steps and create an effective source code management flow (Figure 2).

The economics of software IP management

The economic justifications for software IP management include the cost incurred to fix bugs and delays in time-to-market caused by the amount of time that elapses between the detection of IP issues and their correction. The earlier any external IP obligations are detected and managed, the less expensive they are to address and fix.

Until recently, the industry had only limited options. The main preventive solutions are still often entirely manual, and basically involve training developers to avoid certain types of software code. This is time-consuming and expensive because the number and range of code suppliers and sources keep getting larger and more complex. And still, there is no certainty that the end software will be ‘clean’.

On the other hand, after-the-fact corrective measures waste resources on removing the offending code, identifying a replacement third-party code (or worse, rewriting code internally), and making changes to the application program interfaces (APIs) so that the new code fits the software. The time required for correction lengthens time-to-market and a project’s commercial window can even close before its release.

It is better to handle record gathering and software IP management during the original code development and quality assurance (QA) processes. Here, automatic tools now place minimal training demands on R&D staff with regards to the minutiae of IP issues. They are easy to adopt and structured so that they do not disturb normal creative development flows. Examples of these tools include the offerings from our company. As shown in Figure 3, they cover the time-efficiency spectrum, from On-Command to On-Schedule and Continuous Real-Time. These tools are:

The Enterprise IP Analyzer. It helps establish corporate records-keeping and corresponding IP policies, analyzes legacy code in a single project or across an entire organization, creates an associated pedigree database, and reports violations of declared policies.

FIGURE 2 A basic source code management flow (IP policy and compliance administration; enterprise legacy code IP assessment; market-ready load-build IP audit; preventive clean IP development; source code management)

Source: Protecode

FIGURE 3 Protecode tool options, from manual through automated retrospective to automated preventive, spanning on-command (Enterprise IP Analyzer), on-schedule (Build IP Analyzer) and continuous real-time (Developer IP Assistant)

Source: Protecode



< COMMENTARY > DESIGN MANAGEMENT

The Build IP Analyzer. It identifies all software modules that actually feature in a product and checks them against established IP policies.

The Developer IP Assistant. It automatically builds a record of the code’s pedigree as a project develops and provides real-time notifications to ensure disciplined records-keeping and that the software contains ‘clean’ IP.

Software IP management policies and compliance administration

Software managers who want to maintain disciplined development environments and deliver products with clean IP must collaborate toward these goals with legal counsel. Together, they can develop appropriate coding policies that ensure IP compliance and the presence of downstream safeguards, while enabling the constructive use and adoption of open source or third-party software. A sound, clear and enforceable IP policy will contain an effective list of approved software vendors, together with acceptable external-content license attributes and measures to be taken in various situations:

• What is allowed and what is restricted.
• What to do if a piece of code with a hitherto-unknown license is desired.
• What to do in case of unknown code.
• What are the potential distribution restrictions, including those on exports.

Such policies should be consistent with corporate goals, easy to enforce and allow compliance monitoring throughout a project’s lifetime from development to commercialization and on to post-sales support. An organization should also be able to adopt and enforce appropriate IP policies for each class of projects it handles.

Retrospective code mapping and IP issue corrections

To ascertain the IP cleanliness of software that has already been developed or acquired, users need to map the existing (legacy) code and ensure that it is in line with the appropriate IP policy. This can be done by engaging expert teams to perform an IP audit or by using automatic tools—e.g., Enterprise IP Analyzer—to perform the code mapping and check all the components.

Retrospective tools—available from companies like Protecode, BlackDuck and Palamida—establish the IP pedigree of existing code by analyzing it and decomposing it into modules that can be matched against extensive databases of known software.

Real-time preventive IP software management

The final and most cost- and time-effective measure is for management to embed the records-keeping and software IP management process as an intrinsic element of the company’s development and QA processes. Tools that operate unobtrusively at each development workstation are preferable because they do not perturb the normal process of software creation and assembly. Our Developer IP Assistant product is an example of this kind of approach.

Such tools automatically detect and log in real time each piece of code a developer brings into a project. Various techniques are used to determine the code’s pedigree (e.g., its ‘signature’ when compared against a database of known code modules, identifying which URL it came from, identifiers within the code itself, requesting the developer to provide a valid record of how the code was created and brought in). The resulting pedigree is captured in the overall code map and henceforth always associated with that piece of code. The Developer IP Assistant also checks the licensing obligations of the identified code against the IP Policy for that project and takes any necessary enforcement steps.
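A much-simplified sketch of the signature-matching idea follows; real tools use far richer fingerprints (token-level and partial matches, for instance), and the function names and database layout here are invented for illustration.

```python
import hashlib

# Simplified sketch of code-pedigree matching: fingerprint a source
# fragment and look it up in a database of known modules.
KNOWN_MODULES = {}  # fingerprint -> (module name, license)

def fingerprint(source: str) -> str:
    # Collapse whitespace so trivial reformatting does not defeat the match.
    normalized = " ".join(source.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def register(name: str, license_id: str, source: str) -> None:
    KNOWN_MODULES[fingerprint(source)] = (name, license_id)

def identify(source: str):
    """Return (name, license) for a known module, or None if unrecognized."""
    return KNOWN_MODULES.get(fingerprint(source))

register("md5-impl", "RSA-MD", "int md5(const char *buf) { /* ... */ }")
print(identify("int   md5(const char *buf)   { /* ... */ }"))  # matches despite spacing
```

Once a fragment is identified, its license can be checked against the project policy exactly as for any other bill-of-materials entry.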

The integration of various analysis and policy enforcement tools (Figure 4) ensures proper record keeping and clean IP software development. Such automatic tools are already being used in academia and by industry. Their utilization here will accelerate in the face of enhanced demands for IP indemnification, open source enforcement of copyright infringement and reduced product development cost.

Dr. Mahshad Koohgoli is CEO of Protecode, and Dr. Sorin Cohn-Sfetcu is an executive consultant to the company. More details about its products and services can be accessed online at www.protecode.com.

FIGURE 4 A fully integrated code and IP management flow: the software development process (edit source files, load build, test) is monitored by real-time software IP discipline development tools (IP policy administration, Enterprise IP Analyzer, Developer IP Assistant) feeding a software IP database, and yields the final software plus a pedigree record, BoM and IP reports

Source: Protecode





The 2009 Embedded World conference and exhibition in Nuremberg, Germany (March 3-5) has managed to maintain much of its scale despite the global downturn. Some speakers have apparently been pulled out by their companies—the most notable contingent, sadly, coming from the USA—but overall the exhibition has expanded by 5% in terms of its physical size and is also reporting a useful increase in international participants.

Meanwhile, its collocated conference, organized by German magazine Design&Elektronik, is set to address an admirably wide range of subjects, touching on all of the embedded sector’s existing and emerging concerns. This combination of an increasingly broad-based technical event with a remarkably healthy show provides sufficient justification for Embedded World’s claim to be its sector’s pre-eminent annual global gathering.

The technical side is overseen by Dr.-Ing Matthias Sturm, conference director and professor of Microcomputing and Electronics at the Leipzig University of Applied Sciences. It is a complex task, embracing all three key aspects of embedded technology—hardware, software and tools—and then weaving them into a diverse set of subjects. However, in Sturm, the organizers have very much the right man for the job.

Sturm’s enthusiasm for electronics dates back to his childhood, his research includes work in embedded systems during the 1970s (before they were even known as such), and he has made significant technological contributions himself. “One project of which I’m particularly proud was a microcontroller-based, embedded Web server, and we published the research on that in September 2001,” he says. “That application may not be seen as so spectacular today, but by describing the server starting with a circuit schematic through to complete software sources, we provided evidence that you can make a Web server with a 16-bit MCU. Numerous designers took up this suggestion, adapted it and developed it further.”

The embedded systems community gathers in Nuremberg this March for an event that is bucking the downturn.

Ahead of the game

< COMMENTARY > CONFERENCE PREVIEW

FIGURE 1 Prof. Sturm has been with Embedded World since 2003

Source: Embedded World

FIGURE 2 The conference emphasizes practical class sessions

Source: Embedded World

Sturm’s work today is looking toward the use of embedded systems in life science engineering, an area that, in the years to come, is likely to become a fruitful market segment. More immediately, though, there is this year’s conference, Sturm’s seventh. The length of his involvement puts Sturm in a good position to judge how Embedded World has and will develop, and what it offers to engineers.

“The combination of exhibition and conference, both organized and staged by highly motivated teams that do an extremely professional job, is only one reason for the success of the event,” he says. “What’s equally important is that despite Embedded World’s tremendous growth, the design community still sees it as its very own event. It’s no exaggeration to say that the family character of ‘Embedded’—as most of those attending call it—is a major factor in that success.”

Sturm also believes that Embedded World is a vital event because it completes a virtuous circle in terms of both its delegates and the stand personnel. “Both designers and decision-makers get their money’s worth. What’s noticeable is that the majority of exhibitors are very positive when judging the quality of attendees and the discussions they are able to have with them,” he says. “A large percentage of those visiting the exhibition booths to gather more information are development engineers. So, as the exhibitors have realized this, they are increasingly sending engineers from their development departments to man their booths because the technical discussions that result can often be very in-depth. Visitors aren’t interested in a show in the sense of entertainment. They’re looking for direct contact with other developers. They want technical information straight from the source, and close contact with someone who can offer them something.”

Meanwhile, the conference is developed to a very clear agenda that reflects the various different and practical priorities of the attendees. “The conference offers participants various possibilities for broadening and furthering their own knowledge, according to their expectations; it is split between an even balance of classes and sessions,” says Sturm.

Classes typically last a whole day and cover a specific topic. They are aimed primarily at participants who want to familiarize themselves thoroughly and efficiently with a particular field.

“There are options for direct dialog with experts to help attendees clarify a whole load of questions. The idea is to offer an excellent opportunity of deepening and widening your knowledge fast. The classes are also didactic in structure to guarantee a high level of learning success,” says Sturm.

Embedded World 2009 class topics

• Modeling embedded systems with UML
• Introduction to real-time operating systems
• Introduction to real-time Linux
• Linux in safety-related systems
• Solutions workshop—get the most out of Cortex M3 debugging
• Creating multitasking systems with real-world timing
• Open-source project management
• More busting bugs from birth to death of an embedded system running an RTOS
• IEC61508—developing safety-oriented software
• Cryptography and embedded security
• Design of safety-critical systems
• Using standards to analyze and implement multicore platforms
• Design and test of safety-critical systems
• Modeling real-time systems
• Software development in HLL I: JAVA
• Software development in HLL II: C++
• Software design for multicore systems
• Unified Design—innovative architectural design methodology for embedded systems
• USB host workshop with NXP
• ARM quick start workshop for LPC1700 (Cortex-M3)

Embedded World 2009 session topics

• Network technologies
• Wireless technologies
• Multicore processing
• Development tools
• Microprocessor architectures and cores
• Cryptography and embedded security
• Graphical user interface
• Memory in embedded systems
• M2M-communication
• Automotive applications
• System on chip
• Software development methods I, II
• CompactPCI Plus
• Safe and secure virtualization
• Green electronics
• Automotive software development & test
• Embedded system architecture
• Managing development projects successfully
• Model-based design
• Embedded Linux
• Debug methods
• Software quality/test & verification
• Successfully implementing ARM

Meanwhile, the sessions serve primarily for the presentation of more discrete ideas, solutions and experience in embedded system development. “They’re a lively forum for imparting knowledge, and the resulting talks are often just the starting point for further discussions at the exhibition booths. However, for the conference we stress that we want purely technical presentations without any distracting marketing,” says Sturm. “At the same time, the sessions will allow attendees to quickly acquire an overview of certain technologies and the latest trends.”

In that latter respect, Embedded World will this year build on the strong position it has always given to low-power applications and smart energy management with a session dedicated to green electronics.

“But there are many other areas of special interest,” says Sturm. “There we have, for example, sessions on safety-oriented systems, system design at a high abstraction level, and the effective development, programming and handling of multicore systems. There are issues that continue to revolve around RTOS, embedded Linux in different builds and open source. Embedded security and cryptography are increasing in importance, and are therefore also on the agenda.”

The conference also attracts name speakers for these areas. At this year’s event, consultant David Kalinsky will speak on RTOS, Prof. Nicholas McGuire of the University of Lanzhou will speak on embedded Linux, Prof. Christof Paar of the Ruhr-University of Bochum will speak on embedded security and Dr. Bruce Powel Douglass of iLogix will speak on model-based development.

Sturm says that Embedded World has—for now, at least—avoided the worst effects of the global recession. “Preparations for a conference always start straight after the preceding event. You need a certain run-up for the jury to review the submissions, to compose the program, and to publish the contents of the conference,” he says. “So the financial crisis hasn’t yet had any effect on arrangements for this year’s conference.

“Moreover, the financial crisis isn’t causing the innovative power of the community to dwindle. Engineers are still implementing innovative ideas, and making things possible that once seemed fantastic or quite impossible. It’ll just take a little longer. Aside from the financial crisis, as a professor at a university, I feel it’s important that engineers continue their education so that they’re well prepared to face future challenges, and that they let others know of the enthusiasm they put into their daily work, and pass it on to the next generation.”

More information on Embedded World’s conference and exhibition is available online at www.embedded-world.de.

FIGURE 3 Despite the downturn, this year’s exhibition is larger than in 2008

Source: Embedded World




< TECH FORUM > EMBEDDED

For system developers designing portable electronic devices, the main dilemma has always concerned power management. How to maximize use and functionality while at the same time minimizing the battery’s cost and size to fit the shrinking form factors and longer lifespan demanded by users? Furthermore, consumers are demanding more functionality with every successive product generation. If a mobile phone does not have a Web browser, camera, audio/video capabilities, speaker phone, Bluetooth and a GPS receiver, it is a candidate for the scrap pile. With such a range of functionality being packed into today’s electronic devices, power management has become intrinsic to success.

At the foundational embedded operating system (OS) layer—the interface between hardware and end-user applications—device manufacturers are demanding more complete software platforms to meet the requirements of increasingly sophisticated applications. Within the core of the system, this OS layer is uniquely positioned to exploit power-saving features offered by the underlying hardware, while at the same time controlling user applications in a power-efficient manner.

Addressing power consumption

When employed by the OS, an application drains a device of its power. To take a mobile phone as an example, an application that uses the display heavily is a candidate for excessive power consumption. A typical scenario would be where the LCD keeps the backlight active so the user can read the menu screen. Once a selection has been made—e.g., launch a video playback application—the LCD is on full power while the video is streamed to the phone. Is it high, medium, or low bandwidth video? Does the device support hardware acceleration? These variables impact how the system schedules power use. The CPU transfers the video and then decompresses it in either hardware or software. The end result is a user interface that displays the video, and a system that fully performs the task requested by the user—but in the process, too much power might have been consumed.

That was a simplified example, but it illustrates a key point. The OS ultimately controls all of the applications and as a result, must decide what is shut down and when. Power management raises a number of questions the OS must address. Which applications can be controlled? What power is necessary in the lower states and how much power needs to be saved when going into these states? The answers to these questions vary from device to device. The one variable not in question, however, is the fact that any good power-aware OS must be able to deal with a host of possibilities.

It is important to mention that many devices allow users the option to turn off or turn down the functionality of individual applications to extend battery life. While this may result in power savings, it is definitely not the type of power management the industry or consumers are demanding from next-generation portable devices. To truly tackle the problem, we must go deep into the OS and make fundamental changes to how a system is designed.

A greener OS?

For power management on a portable electronic device, we need to look not just at the hardware and software, but the entire system design. The hardware may well have features that reduce power consumption, but the software will often not take advantage of them. Therefore, in order for a battery-operated system to prolong its active life while in use, we must introduce energy-saving techniques at the OS level.

By definition, some embedded OSs are more power-efficient than others. All things being equal, an embedded OS that can be configured with the smallest amount of memory for a specific task is more power-efficient than one with a larger footprint. Where there is less code to execute and less memory required, you will conserve overall system energy. But beyond this basic consideration, a green OS had not been considered a design requirement for most engineers until recent developments in the industry and the increased concern about the global climate. Back in the day, the hardware was physically wired so that when a device powered up, a peripheral (e.g., an Ethernet controller) would also start up during system initialization. Regardless of its intended use, this ‘set and forget’ mode left the peripheral powered on as long as the system was on.

Stephen Olsen, Mentor Graphics

The Power-aware OS

Stephen Olsen is a product marketing manager in the Embedded Systems Division at Mentor Graphics. He has more than 20 years of embedded software experience with an emphasis on system architecture, embedded software and IP. Stephen holds a BS in Physics from Humboldt State University in California.

Further, it is easy to see why battery-operated device manufacturers have a vested interest in batteries that last longer, but what about manufacturers of electrically powered devices such as DVD players and set-top boxes? These too can benefit from improved power management. If they use their power more effectively there is less of a drain on the power grid.

With a green, power-aware OS in mind, let’s take a look at a couple of energy-saving techniques that can improve power management. At the system level, there are two ways to solve power drain and extend battery life: reactive and proactive power management.

The article describes the context and need for embedded operating systems that are more responsive to the power management demands placed on today’s electronic devices. It reviews the design objectives for the two main types of power management, reactive and proactive, and examines how both can be implemented.

FIGURE 1 The user interface power domain changing states (reactive power management): current draw in milliamps (approx. 290-380 mA) versus time in seconds as the system initializes, a touchpanel event turns the LCD and backlight on, inactivity timers fire, and the system halts

Source: Mentor Graphics

Reactive power management

We briefly touched upon reactive power management in the earlier discussion of the LCD example. This technique represents the most basic approach to power management. Essentially, it responds to each application in use at a given time, based on a set of preprogrammed conditions. For example, when there are no applications with open sockets, having an Ethernet controller powered on is wasteful because there is nothing using it. If the application opens a socket only when it needs to communicate, that is reactive power management.

A more sophisticated example might involve the process by which a USB drive is accessed by the software. If the file system is mounted and a file is open, and the application is reading and writing the file, then the device is active. But what if the task becomes busy in such a way that it does not read or write data into the file for several seconds? The system can reactively determine that the USB drive is not in use and, after an inactivity timeout, act accordingly.
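The inactivity-timeout behaviour described above can be sketched as follows; the device name, timeout value and explicit polling model are illustrative assumptions for this sketch, not Nucleus OS APIs.

```python
# Sketch of reactive power management via an inactivity timeout.
# A real OS would drive this from kernel timers and driver hooks
# rather than an explicit tick() call.
class Peripheral:
    def __init__(self, name: str, timeout_s: float):
        self.name = name
        self.timeout_s = timeout_s
        self.powered = True
        self.last_access = 0.0

    def access(self, now: float) -> None:
        # Any read/write marks the device active and powers it back up.
        self.last_access = now
        self.powered = True

    def tick(self, now: float) -> None:
        # Reactive rule: idle longer than the timeout -> cut the power.
        if self.powered and now - self.last_access > self.timeout_s:
            self.powered = False

usb = Peripheral("usb-drive", timeout_s=5.0)
usb.access(now=0.0)   # file system reads/writes the open file
usb.tick(now=3.0)     # still inside the window: stays powered
usb.tick(now=6.0)     # idle for more than 5 s: powered down reactively
print(usb.powered)    # False
```

Note that the decision is made only after the idle period has already elapsed, which is exactly the after-the-fact limitation discussed next.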

This technique is depicted in Figure 1 (p. 19). Although basic in its implementation, it remains a good first step toward effective power management. It does have one major drawback though—by the time reactive power management takes place, electrons have already been sent through the system. Reactive power management does nothing to scale the processor. The CPU does not get scheduled correctly and task prioritization is not taken into account.

So while reactive power management helps start to preserve battery life, it is by no means the final answer. An even more efficient power management system is one in which an application gets scheduled for power consumption and is prioritized by the device, before the application is launched. Proactive power management is one way of achieving this goal.

FIGURE 2 The system undergoing dynamic voltage and frequency scaling as it schedules CPU frequency along with scheduling tasks: CPU frequency (0-120%) versus time in seconds as the CPU wakes and the ISR schedules tasks, tasks 1, 2 and 3 are readied, tasks 1 and 2 complete, task 3 performs a file transfer at reduced frequency, and the CPU then idles and sleeps on timeout

Source: Mentor Graphics



Proactive power management

Proactive power management plays with the notion that developers can predict the future. Of course, that is not entirely possible; however, developers can use complex scheduling techniques to predict what the power use will be when the system is in operation. The data can be discovered manually by programming the system with a power-use scenario, or by dynamically measuring which domains are active and when.

For example, let’s say a mobile device has been asked to run the Bluetooth radio. The user is typing a text message on the keypad at the same time. What are the power requirements of the Bluetooth radio when the keypad is engaged? A proactive power management system might say, “Do not communicate at the highest speed, communicate at a lower speed.” The result is less bandwidth for the Bluetooth radio, but the system knows the keypad is in use, so high-speed bandwidth is not required. It’s all about trade-offs that do not affect the apparent device performance. With proactive power management, the user never notices any kind of performance degradation.

Proactive power management also works for file transfers. Developers can optimize the system so that a transfer still happens, but consumes less power by scaling back the voltage or frequency of the CPU. Proactive power management senses that instead of transferring a digital photo in five seconds, it can transmit it in 30 seconds with no discernible difference to the end-user experience.

Here is one final example. If a system has 10 tasks and all are ready to run, one would expect the system to be busy running these tasks. It makes sense to run the CPU at high power. It is important to note, though, that which ten tasks are running might make a significant difference. If the system integrator can establish that every time a certain task is made ready to run—regardless of whether it is actually scheduled—the system will increase its power usage, then dynamic voltage and frequency scaling (DVFS) can be used. As seen in Figure 2, DVFS provides enough cycles to get the job done without wasting electrons. Continuing with this line of thinking, it is sometimes better to consume a little more power now so as not to degrade the quality of the user experience while waiting for the power modes to change.
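A toy model of the DVFS decision helps make this concrete; the frequency table and cycles-per-task load model below are invented for illustration and do not correspond to any particular silicon or OS.

```python
# Toy model of DVFS frequency selection: run at the lowest operating
# point that still completes the ready work by its deadline.
FREQUENCIES_MHZ = [100, 200, 400, 800]  # available operating points

def pick_frequency_mhz(ready_tasks: int, cycles_per_task: int,
                       deadline_s: float) -> int:
    """Lowest frequency (MHz) that finishes the ready work in time."""
    required_hz = ready_tasks * cycles_per_task / deadline_s
    for f_mhz in FREQUENCIES_MHZ:
        if f_mhz * 1_000_000 >= required_hz:
            return f_mhz
    return FREQUENCIES_MHZ[-1]  # saturate at the top operating point

# Three light tasks ready: 60 MHz of work, so the 100 MHz point suffices.
print(pick_frequency_mhz(3, 20_000_000, 1.0))   # 100
# A heavy burst: 1.5 GHz of work cannot fit, so saturate at 800 MHz.
print(pick_frequency_mhz(30, 50_000_000, 1.0))  # 800
```

Running at the lowest adequate operating point is what lets the file transfer in the earlier example take 30 seconds instead of five while consuming less energy overall.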

Proactive power management techniques are the new wave in power management. Mentor Graphics, through its highly regarded Nucleus OS, is currently investigating several approaches in this area. These techniques will provide system developers and integrators the necessary elements to implement a more sophisticated power management policy.

Conclusion

The idea of a power-efficient portable electronic device must be taken seriously. System developers can begin to do this by taking a more holistic approach to power conservation, utilizing hardware and infrastructure designed to scale back power use and by using software that is capable of controlling a device’s overall power consumption.

At the very center of this approach is the concept of a power-aware OS that combines both reactive and proactive power management techniques. Whether it is the individual system developer or an electronic device manufacturer who is most actively pursuing a more power-efficient product line, it is the end-user who benefits most by having a device with improved battery life along with built-in capabilities that actually contribute to a greener planet.

Mentor Graphics
Corporate Office
8005 SW Boeckman Rd
Wilsonville, OR 97070
USA

T: +1 800 547 3000
W: www.mentor.com


< TECH FORUM > EMBEDDED

We have undertaken a design to assess the viability of using FPGAs in embedded systems with real-time requirements by using one as the basis for a digital speech encryption and decryption system (Figure 1). Digital speech encryption is one of the most powerful countermeasures against eavesdropping on telephonic communications, and was therefore a good test of both the FPGA and the surrounding infrastructural technology. For the purpose of the exercise, we selected the Xilinx Virtex II Pro platform FPGA.

FPGAs as building blocks

The use of FPGAs and configurable processors has become an increasingly interesting option for embedded system development. FPGAs offer all of the features needed to implement even the most complex designs. Clock management is facilitated by on-chip phase-locked loop (PLL) or delay-locked loop (DLL) circuitry. Dedicated memory blocks can be configured as basic single-port RAMs, ROMs, FIFOs, or CAMs. Data processing capabilities, as embodied in the devices’ logic fabric, can vary widely.

The ability to link an FPGA with backplanes, high-speed buses and memories is provided through both on-chip and development kit support for various single-ended and differential I/O standards. Also, today’s FPGAs feature such system-building resources as high-speed serial I/Os, arithmetic modules, embedded processors and large amounts of memory. There are also options from leading vendors for embedded cores and configurable cores.

We developed our FPGA-based embedded system as a completely programmed chip implementing a complex function, one that faced minimal delays during its development and that exploits all that FPGAs have to offer such complex systems, particularly their practical use in situations that demand the utmost accuracy and precision.

Speech encryption and decryption principles

Speech encryption has always been a very important military communications technology (Figure 2). Given the technology available today, digital encryption is considered the best technological approach. Here, an original speech signal, x, is first digitized into a sequence of bits, x(k), which is then encrypted digitally into a different sequence of bits, y(k), and finally transmitted.
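The x to x(k) to y(k) chain can be sketched in a few lines. This is an illustrative Python sketch, not the authors' implementation: the 8-bit depth, the tone used as input, and the XOR keystream are all assumptions made for the example.

```python
import math

def digitize(x, n_bits=8):
    """Quantize samples in [-1, 1) to unsigned n-bit integers: the x -> x(k) step."""
    levels = 1 << n_bits
    return [min(levels - 1, int((s + 1.0) / 2.0 * levels)) for s in x]

def encrypt(xk, keystream):
    """Digitally encrypt the sample words by XORing each with a keystream
    word: the x(k) -> y(k) step (XOR stands in for a real cipher)."""
    return [s ^ k for s, k in zip(xk, keystream)]

# A short 440 Hz tone sampled at 8 kHz (hypothetical input signal).
x = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8)]
xk = digitize(x)
yk = encrypt(xk, keystream=[0b10110010] * len(xk))

# XOR is its own inverse, so applying the same keystream decrypts.
assert encrypt(yk, [0b10110010] * len(yk)) == xk
```

The same keystream applied twice recovers x(k) exactly, which is why XOR-style schemes let one routine serve for both directions.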

But while digital encryption techniques can offer a very high degree of security, they are not entirely or immediately compatible with all of today's communications networks. Most telephone systems are still largely analog. Most practical speech digitizers operate at bit rates higher than those that can easily be transmitted over standard analog telephone channels. Meanwhile, low bit-rate speech digitizers

K.A.S. Srikanth Pentakota holds a Bachelor’s Degree from N.I.T Rourkela, India and is pursuing a Master’s in mixed signal devices at Texas A&M University. He has worked as a project assistant at the Indian Institute of Science, Bangalore and as a subject matter expert in DVCI, AMDOCS.

K.A.S. Srikanth Pentakota, Texas A&M University

FPGA-based speech encrypting and decrypting embedded system


FIGURE 1 Block diagram for standard encryption and decryption

Source: Texas A&M


FIGURE 2 Encryption and decryption

Source: Texas A&M


The Texas A&M team describes an FPGA embedded system design project intended to assess the technology's suitability for use in real-time, high-performance applications.

The test application is a speech encryption system intended for use in military and high-security environments. The team used a Xilinx Virtex II Pro platform FPGA for the project, and was also particularly interested in using the peripheral technologies made available.

still entail relatively high design complexity but offer relatively poor quality results.

Furthermore, almost all digital encryption techniques rely on accurate synchronization between the transmitter and the receiver. Exactly the same block of bits must be processed by the encryption and decryption devices. This not only increases design complexity, but also makes transmission much more sensitive to channel conditions—a slight synchronization error due to channel impairment can completely break the transmission.

There is another type of speech encryption technique, scrambling. The original speech signal, x, is scrambled directly into a different signal, y(t), in analog form before transmission.

Since the scrambled signal is analog, with similar bandwidth and characteristics to the original speech signal, this type of technique can be easily used with existing analog telephone systems. Some conventional scrambling techniques (e.g., frequency inversion, band splitting) do not require synchronization but today offer only relatively low levels of security. More advanced scrambling techniques have recently been developed (e.g., sample data scrambling) and are now used extensively because they continue to offer relative ease of implementation alongside improved levels of security.

A typical advanced scrambling sequence is as follows. Original speech, x, is first sampled into a series of sample data, x(n). This is then scrambled into a different series of sample data, y(n), and recovered into a different signal, y(t), for transmission.

These techniques offer a relatively high level of security and are compatible with today's technical environment. However, like digital techniques, there is again a heavy dependence on synchronization between transmitter and receiver. The transformation from x(n) into y(n) has to be performed frame-by-frame, and exactly the same frame of sample data has to be used in the scrambling and descrambling processes for the signal to be recovered. As with digital approaches, this complicates implementation and makes transmissions very sensitive to channel conditions.

Recently, two new sample data scrambling techniques have emerged. One scrambles the speech in the frequency domain, and the other scrambles it in the time domain. Both preserve the advantages of traditional sample data scrambling, while eliminating the requirement for synchronization in the receiver. This simplifies the system structure, and significantly improves the feasibility and reliability of sample data scrambling techniques. The basic point here is that the synchronization is only necessary as long as the scrambling and descrambling are performed frame-by-frame. It becomes unnecessary when a 'frame' is not defined in the operation.

Scrambling based on frequency band swapping of the analog signal can be used in a wide variety of analog and digital systems since the method can transmit speech signals over a standard telephone line with acceptable quality.


FIGURE 3 Symmetric-key cryptography system

Source: Texas A&M


FIGURE 4 Offline process

Source: Texas A&M


In the time domain method, a digital speech signal is encrypted. This is based on a redundant bit to protect speech information effectively. Although the method is secure, it is hard to apply to a conventional analog transmission line because the bandwidth is wide. To reduce the bit rate below the bandwidth of the analog line, a speech encryption system with a low bit-rate coding algorithm is necessary.

Cryptography principles

The main components involved in cryptography are:

• a sender;
• a receiver;
• plain text (the message before it is encrypted);
• cipher text (the message that has been encrypted);
• encryption and decryption algorithms; and
• a key or keys.

All methods of cryptographic encryption are then divided into two groups:

• symmetric-key cryptography (a.k.a. private key cryptography); and

• public-key cryptography.

In public-key cryptography, the encryption and decryption algorithms are public but the key is secret. Only the key needs to be protected rather than the encryption and decryption algorithms.

In symmetric-key cryptography (Figure 3, p. 23), a common key is shared by both sender and receiver. Particular advantages of a symmetric-key cryptography algorithm include the following:

• Less time is needed to encrypt a message than when using a public key algorithm.

• The key is usually smaller, so symmetric-key algorithms are used to encrypt and decrypt long messages.

Disadvantages include:

• Each pair of users must have a unique key.
• So many keys may be required that their distribution between the two parties becomes difficult.

Symmetric-key algorithms can be divided into traditional ciphers and block ciphers. Traditional ciphers encrypt the bits of the message one at a time. Block ciphers take several bits and encrypt them as a single unit. Blocks of 64 bits have been commonly used. Today, some advanced algorithms can encrypt blocks of 128 bits.
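The distinction between the two classes can be made concrete with a toy sketch. This is illustrative Python only: plain XOR stands in for a real cipher's mixing, and the 64-bit block size follows the text above.

```python
import random

def stream_encrypt(bits, keystream):
    """Traditional (stream-style) cipher: each message bit is combined
    with one keystream bit."""
    return [b ^ k for b, k in zip(bits, keystream)]

def block_encrypt(bits, key_block):
    """Toy block cipher: the message is processed one 64-bit block at a
    time. A real block cipher (e.g., DES or AES) mixes each block far
    more thoroughly; XOR is used here only to show the block structure."""
    n = len(key_block)
    out = []
    for i in range(0, len(bits), n):
        out.extend(b ^ k for b, k in zip(bits[i:i + n], key_block))
    return out

random.seed(1)
msg = [random.randint(0, 1) for _ in range(128)]   # two 64-bit blocks
key = [random.randint(0, 1) for _ in range(64)]    # one 64-bit block key
assert block_encrypt(block_encrypt(msg, key), key) == msg  # round trip
```

The stream version consumes one keystream bit per message bit; the block version reuses one fixed-size key block per 64-bit chunk, which is the structural difference the paragraph above describes.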

FPGA implementation

We worked on two simulation processes for our system:

• An offline simulation using the Xilinx 9.1i project navigator.

• An online simulation using the Virtex II Pro platform FPGA.

Offline process

The offline simulation process is shown in Figure 4 (p. 23). It did not include a hardware implementation but was intended only to allow for self-test of the encryption/decryption algorithm. Assuming this was successful, our plan was then to move on to the online process.

The following steps were required:

• In the recording process, first we took a voice source input. Then, using Matlab, we created a text (.txt) file. The parameters given during creation of the .txt file were voice duration and its bit rate. The Matlab code for conversion of speech (.wav file) to text (.txt) file was written.

• After successfully creating a text file, a testbench read it character-by-character and then these values were mapped for the encryption/decryption operation.

• After the encryption/decryption operation had been performed, the testbench again created a text file but this one contained the encrypted version of the original voice input.

• In the final step, Matlab code again read the encrypted version of the text file and played it at the specified bit rate.
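The file exchange in these steps can be sketched as a simple round trip through a text file. Python stands in for the authors' Matlab here, and the one-binary-word-per-line format and 12-bit width are assumptions for illustration.

```python
import os
import tempfile

samples = [512, 1023, 0, 2047]                # hypothetical 12-bit samples
path = os.path.join(tempfile.gettempdir(), "speech.txt")

# Recording side: write one fixed-width binary word per line.
with open(path, "w") as f:
    for s in samples:
        f.write(f"{s:012b}\n")

# Testbench side: read the file back value by value.
with open(path) as f:
    restored = [int(line, 2) for line in f]

assert restored == samples                    # lossless round trip
```

Any fixed, unambiguous text format works; the fixed width simply makes the testbench-side parsing trivial.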

Online Process

The online process (Figure 5) included the software as well as the hardware implementation. The board on which the VHDL code was burned contained built-in analog-to-digital and digital-to-analog converter (ADC and DAC) ports. So the VHDL code written at the transmitting end participates in three operations:

• It converts analog voice into digital data by an ADC.
• It encrypts the data using symmetric-key cryptography.
• The encrypted values are sent to the outside world via a DAC.


FIGURE 5 Online process

Source: Texas A&M


Similarly, at the receiver end, the code participates in these three tasks:

• analog-to-digital conversion;
• use of the standard decryption algorithm; and
• digital-to-analog conversion.

This is a real-time process. Input comes continuously from a microphone and is given to the ADC on a Virtex II Pro board. We preferred symmetric-key cryptography for encryption because here the same code performs both encryption and decryption operations. As noted, symmetric-key cryptography involves both sender and recipient using a common key. The sender uses an encryption algorithm and the same key for encryption of data, and the recipient uses a decryption algorithm and the same key for the decryption of data. In this process of cryptography, the algorithm used for decryption is the reverse of the encryption algorithm. The shared key must be set up in advance and kept secret from all other parties.

A stream cipher based on the Data Encryption Standard algorithm was used here. This is a class of ciphers in which encryption or decryption is performed using separate keys created, as needed, by a key generator.

In our context, the key space consists of 30 different keys for 30 data samples, and the key space repeats itself for each subsequent block of data samples. A shift algorithm is used to obtain these 30 different keys. The algorithm consists of generating the key space and XORing it with the data samples; the design was simulated, implemented on an FPGA, and then tested in hardware.
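A minimal sketch of this key-space scheme follows. The article does not give the exact shift algorithm or seed, so a 1-bit circular rotation of an assumed seed stands in; the 30-key space of 14-bit keys and the repeating XOR structure follow the description above.

```python
def make_key_space(seed, n_keys=30, width=14):
    """Build the key space with a simple shift algorithm (assumed here
    to be a 1-bit left rotation per step of a 14-bit seed)."""
    mask = (1 << width) - 1
    keys, k = [], seed & mask
    for _ in range(n_keys):
        keys.append(k)
        k = ((k << 1) | (k >> (width - 1))) & mask  # rotate left by one bit
    return keys

def crypt(samples, keys):
    """XOR each data sample with its key; the key space repeats for each
    block of len(keys) samples. The same routine also decrypts."""
    return [s ^ keys[i % len(keys)] for i, s in enumerate(samples)]

keys = make_key_space(seed=0x2A5B)             # seed is an assumption
data = [0, 1000, 8191, 16383, 777]             # 14-bit sample values
assert crypt(crypt(data, keys), keys) == data  # XOR is self-inverse
```

Because XOR is self-inverse, the receiver runs exactly the same `crypt` routine with the same key space, which is why one FPGA code base serves both ends.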

In this implementation, the keys used for encryption/decryption were each 14 bits long. Since the ADC gives a 14-bit output, our keys are confined to 14 bits. But the DAC present on the Virtex II Pro takes a 12-bit input, so we drop two bits of the ADC output. Then the DAC sends the encrypted version of the speech to the outside world.

Encryption and decryption in practice

Offline Process

In the offline process, we used Matlab to record and play speech, to store the recorded speech in a text file, and to play a speech waveform by reading it from a text file.

There are, as noted, Matlab codes that write to and read from text files. However, we then had to write some VHDL code. One module was for the speech encryption/decryption process and the others were needed to read the text file we had generated in Matlab through the Xilinx testbench.

We observed that the larger the number of bits used to represent the discrete values of a sampled speech signal, the greater the clarity of the speech, although a more complex algorithm was required for higher bit representations. Ultimately, we used 12-bit representations for each sample value and took 20 samples at a time for encryption with 20 different random bit sequences.

Online Process

In the online process, we first programmed the ADC/DAC converters on the Virtex II Pro board to sample the incoming analog speech signal at a frequency of 1MHz. Once we had the sampled data value for the speech signal, we encrypted it using a random bit sequence. We took a 20-state code for this purpose with a key space of 20 random bit sequences. The encrypted data values were then sent simultaneously through a DAC to speakers. For decryption, since we had used an XORing scheme, we could reuse the same code we had developed for encryption. The ADC output is a 14-bit number whereas the DAC input had to be 12 bits, so we had to convert the 14-bit input number to a 12-bit number by rounding off two LSBs.
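The 14-to-12-bit conversion described for the online process reduces to a two-bit right shift. A minimal sketch (the function name is hypothetical):

```python
def adc_to_dac(sample14):
    """Map a 14-bit ADC word to the 12-bit DAC input by discarding the
    two least-significant bits, i.e., a right shift by two."""
    assert 0 <= sample14 < (1 << 14)
    return sample14 >> 2

assert adc_to_dac(0b11111111111111) == 0b111111111111  # full scale maps to full scale
assert adc_to_dac(0b00000000000111) == 0b000000000001  # the two LSBs are dropped
```

Dropping LSBs loses only the two lowest quantization bits, so the audible effect is a slight increase in quantization noise rather than a change in signal level.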


Department of Electrical & Computer Engineering
Dwight Look College of Engineering
Texas A&M University
Zachry Engineering Center
College Station, TX 77843
USA

T: +1 979 845 7441
W: www.ece.tamu.edu


< TECH FORUM > ESL/SYSTEMC

NXP Semiconductors is one of Europe’s leading silicon design companies. The NXP-TSMC Research Center is an R&D collaboration between it and TSMC, the world’s largest foundry.

Introduction

The complexity of current design flows makes it extremely time-consuming to evaluate new device technologies in terms of the parameters that designers need (e.g., clock rate, die area, battery life). The process can take months. This research describes several simplifications to standard design flows that enable extremely short experiment turn-around times—often less than a day—while maintaining reasonable timing accuracy (better than 10%).

This approach is illustrated using two use-case examples. In the first example, the impact of two competing 15nm technologies on the clock-rate versus area trade-off of a large block of intellectual property (IP) is analyzed. In the second example, rapid design flows are then coupled to a design flow description language to enable unique experiments at the 45nm node that directly link process-level variability to timing variations, without recourse to perturbations of device model parameters.

Rapid design flow components

The components of an RDF are shown in Figure 1. Transistors incorporating new materials, architectures and transport mechanisms are designed using Technology Computer Aided Design (TCAD) tools and embedded in the RDF using a model designed to be extracted from only six I-V curves, with eight model parameters.

beta  gain factor for a 1um-wide device
vt0   threshold voltage at zero back-bias
m0    sub-threshold slope
gam0  drain-induced threshold shift for zero gate bias
gam1  drain-induced threshold shift for large gate bias
the1  mobility reduction due to vertical field
the3  mobility reduction due to lateral field
rs    source/drain series resistance

Although simple, the RDF model maintains the ability to model deep submicron device non-idealities, such as DIBL, velocity saturation, reduced sub-threshold slope and series resistance. Such dramatic model simplification is possible because the dominant pole (i.e., RC constant) of the standard cell frequency response is given by the product of the cell output resistance and the distributed interconnect capacitance. Only the DC properties of the device/cell then need to be modeled accurately [1].
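As a back-of-the-envelope illustration of this dominant-pole argument, the time constant is just the R·C product. The resistance and capacitance values below are invented for the example; the article quotes no specific numbers.

```python
import math

r_out = 5.0e3      # standard cell output resistance, ohms (assumed value)
c_int = 20.0e-15   # distributed interconnect capacitance, farads (assumed value)

tau = r_out * c_int                  # dominant-pole RC time constant
f_3db = 1.0 / (2.0 * math.pi * tau)  # corresponding -3dB frequency

print(f"tau = {tau * 1e12:.0f} ps, f_3dB = {f_3db / 1e9:.2f} GHz")
# prints: tau = 100 ps, f_3dB = 1.59 GHz
```

Because the pole depends only on a DC output resistance and a wiring capacitance, a model that captures I-V behavior accurately is enough for useful timing, which is the point made above.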

Table 1 summarizes the main results of a benchmarking exercise at the 65nm node, to compare a fully automated RDF-generated library with a commercially produced library. For this comparison, the RDF model was extracted from I-V curves generated from a 65nm BSIM4 model, and design rules for cell layout were generated automatically from the specifications for the lithography tools used at that node.

We observed that 10% timing accuracy was maintained at the cost of a 25% increase in cell area due to the automated cell compaction procedures. A similar exercise at the 45nm node also showed an area penalty of approximately 25%, indicating that the cell area offset is predictable and node-independent. Although a significant 30% advantage in run-time is observed, the real benefit of the RDF model is the rapid extraction time, 0.3s. This is exploited in the last section of the article.

Coupling to standard synthesis/timing flows

It is anticipated that at the 15nm node, die area will be more important than processing speed. An RDF was therefore used to calculate the area of a system L2 cache controller [2] (approximately 400,000 cells) at constant clock rate, using two competing 15nm technologies. The first was a fully depleted SOI (FDSOI) technology, and the second was a III-V NMOS/Ge PMOS technology.

P. Christie, A. Nackaerts & G. Doornbos, NXP-TSMC Research Center
A. Kumar & A. S. Terechko, NXP Semiconductors

Rapid design flows for advanced technology pathfinding

Extraction time: automated RDF model extracted in 0.3 sec
Area: RDF library has 25% larger area than reference
Delay: RDF library is 11% faster than reference
Output slope: RDF library has 13% larger slope than reference
Characterization time: RDF model is 30% faster than BSIM4 in same library

TABLE 1 RDF benchmark using 65nm technology node and a set of 15 combinatorial cells

Source: NXP/TSMC


In both cases, the RDF model was based on a bulk CMOS device and therefore the extracted parameters are empirically rather than physically based. Table 2 lists a selection of extracted parameters for each device. The predicted trade-offs between the area and clock rate for the two technologies are shown in Figure 2, where the III-V/Ge library implements the controller in half the area of the FDSOI library at a clock rate of 3GHz.

A design flow description language

For even more rapid technology assessment scenarios, we have developed a design flow description language, called PSYCHIC. This is implemented as a set of functions in a Matlab toolbox (Table 3).

The PSYCHIC approach relies on the use of defined functions to construct custom scripts tailored to each modeling problem, rather than developing a single compiled program. Here is the script for emulating the static timing of the critical path within the system L2 cache controller.

% STA script for path from system L2 cache controller
tech = GaAsGeTech ;           % imported technology information
lib = GaAsGeLib2 ;            % imported 15nm III-V/Ge RDF library

logicDepth = 9 ;              % number of cells in timing path
slew = zeros(1,logicDepth) ;  % starting transition times
delay = zeros(1,logicDepth) ; % starting delays

M = 630 ; N = 630 ;           % rows and columns of cells
height = GaAsGeLib2.INVD1.height ; width = height ; % square cells
peff = 0.8 ; reff = [0.0 0.7 0.7 0.7 0.7] ; % set place/route efficiency
tc = 3 ; tn = 2 ;             % set terminals per cell and net

% calculate average wire length within array
lav = savlength(height,width,peff,reff,tc,tn,tech) ;
% convert to capacitance
cload = cint(2,lav,tech).*ones(1,logicDepth) ;
initSlew = 0 ;                % first input transition time

[delay(1),slew(1)] = timing(GaAsGeLib2.DFCND1,cload(1),initSlew) ;
[delay(2),slew(2)] = timing(GaAsGeLib2.INVD1,cload(2),slew(1)) ;
[delay(3),slew(3)] = timing(GaAsGeLib2.AN2XD1,cload(3),slew(2)) ;
[delay(4),slew(4)] = timing(GaAsGeLib2.ND3D2,cload(4),slew(3)) ;
[delay(5),slew(5)] = timing(GaAsGeLib2.NR2D1,cload(5),slew(4)) ;
[delay(6),slew(6)] = timing(GaAsGeLib2.INVD2,cload(6),slew(5)) ;
[delay(7),slew(7)] = timing(GaAsGeLib2.NR2D1,cload(7),slew(6)) ;
[delay(8),slew(8)] = timing(GaAsGeLib2.NR2D1,cload(8),slew(7)) ;
[delay(9),slew(9)] = timing(GaAsGeLib2.IOA21D2,cload(9),slew(8)) ;
pathDelay = sum(delay) ;

Note that the script itself is node-independent, and technology and library information are 'fire-walled' within separate tech and lib files, respectively.

The toolbox makes it easy to couple RDF libraries with new design flow concepts such as statistical static timing analysis. In this context, the output load, input and output transition times and delay parameters of the timing function are not single values but probability density functions. Scripts based on this approach have been used to assess the impact of process-level variability on critical path timing.

Variations were introduced into a TCAD model of a 45nm device by varying the gate insulator thickness (σEOT=1Å) and gate length (σL=3nm), in order to produce 50 sets of NMOS and PMOS I-V curves (five hours processing time). RDF device models were then extracted for each device variant (extraction time 30 sec). Fifty libraries were then generated and characterized (total time one day) and imported into PSYCHIC. Statistical static timing experiments were carried out on the slowest timing path extracted from a 45nm memory concentrator block within a multimedia processor SoC. Figure 3 shows the results. The timing histograms for each cell in the path were convolved to produce the overall path delay.
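The convolution step can be sketched directly. The per-cell histograms below are invented three-bin toys (the real flow uses histograms from the 50 characterized libraries), and the sketch assumes the cell delays are statistically independent, which is what makes convolution the right operation for summing them.

```python
import numpy as np

# Toy per-cell delay PDFs on a shared, uniform bin grid (hypothetical values).
cell_pdfs = [
    np.array([0.2, 0.6, 0.2]),   # cell 1 delay distribution
    np.array([0.1, 0.8, 0.1]),   # cell 2
    np.array([0.3, 0.4, 0.3]),   # cell 3
]

# The PDF of a sum of independent random variables is the convolution
# of their PDFs, so fold the cells into the path one at a time.
path_pdf = np.array([1.0])
for pdf in cell_pdfs:
    path_pdf = np.convolve(path_pdf, pdf)

assert abs(path_pdf.sum() - 1.0) < 1e-9   # result is still normalized
```

The resulting histogram spreads over more bins than any single cell's, which is exactly the widening of the path-delay distribution that Figure 3 shows.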


FIGURE 2 Clock rate-area trade-offs for 15nm system L2 cache controller (400,000 cells, cell height 0.624um)

Source: NXP/TSMC

[Flow diagram: TCAD/silicon I-V data feed a reduced-order device model (mm9p8); core and complex design rules feed the Rapid Library Creation tool (RLC), which produces library files; these drive synthesis of the IP library (VHDL) and static timing/power analysis, alongside the PSYCHIC toolbox with its wire load models and timing/power scripts]

FIGURE 1 Schematic representation of RDF components

Source: NXP/TSMC

The paper describes several innovative modifications to standard design flows that enable new device technologies to be rapidly assessed at the system level. Cell libraries from these rapid flows are employed by a design flow description language (PSYCHIC) for the exploration of highly speculative 'what if' scenarios. These rapid design flows are used to explore the performance of two competing 15nm technologies in a system L2 cache controller, and a PSYCHIC analysis of statistical timing variations in a 45nm memory concentrator is presented.


This approach allows fundamental experiments to be performed on the effects of process-level variability on system-level timing, and avoids issues associated with varying individual parameters within a single compact model to generate timing statistics [3].

Conclusions

Benchmarking with commercially generated libraries shows that 10% timing accuracy and 30% run-time gain can be maintained with RDF libraries at the cost of a consistent, node-independent 25% cell area penalty. As an example, this approach was used to analyze the clock rate-area trade-off for a system L2 cache controller implemented using two competing 15nm technologies.

In order to explore more speculative 'what if' scenarios and to avoid costly synthesis and timing tools, a design flow description language was developed. Scripts written in this language were shown to reproduce critical path timing data from, for example, Cadence Encounter to 2% accuracy. The unique capabilities of RDF libraries and the ease of implementing new PSYCHIC functions were employed to analyze statistical timing variations in a 45nm memory concentrator.

References
[1] P. Christie, et al., Proc. IEDM, 2007.
[2] P. Stravers, et al., Proc. Int. Symp. VLSI-TSA, 2001.
[3] C. Visweswariah, Proc. Design Automation Conference, 2003.


FIGURE 3 PSYCHIC prediction for statistical timing delay for critical data path in memory concentrator IP block due to process-level variability

Source: NXP/TSMC

Parameter     Si NMOS    Si PMOS    III-V NMOS   Ge PMOS
Beta (A/V2)   0.0647     0.0284     0.2055       0.1023
vt0 (V)       0.5267     0.4719     0.5424       0.4996
m0            1.6593     1.6904     1.9846       2.0000
rs (Ω)        123.2600   146.4500   33.1310      24.9610

TABLE 2 Some key RDF model parameters for 15nm devices

Source: NXP/TSMC

Function Description

cint Interconnect capacitance

rint Interconnect resistance

savlength Average wire length

crosstalk Calculate induced cross-talk voltage

sintenergy Supply-side energy dissipation in interconnect during logic transition

dintenergy Demand-side energy dissipation in interconnect during logic transition

pplaceflat Pseudo-placement using site function

proute Allocate wire length distribution to layers

timing Cell static timing analysis

stiming Cell statistical static timing analysis

TABLE 3 Partial PSYCHIC function listing

Source: NXP/TSMC

NXP-TSMC Research Center
Kapeldreef 75
B-3001 Leuven
Belgium

NXP Semiconductors
High Tech Campus
5656 AE Eindhoven
The Netherlands

W: www.nxp.com
W: www.tsmc.com



< TECH FORUM > VERIFIED RTL TO GATES

The design of multi-million-gate system-on-chips (SoCs) is made still more complex where engineers must account for the presence of multiple asynchronous clocks. The problem is not uncommon—for example, different external interface standards such as PCI Express (PCIe) and Internal Bus (IB) use different clock frequencies. Another factor can be the additional presence of a single fast clock that is effectively distributed over the entire chip.

The verification of such designs has become especially tedious, time-consuming and therefore costly. Moreover, traditional functional simulation is proving inadequate for the verification of clock domain crossings (CDCs). Instead, this article describes a methodology for verifying the synchronization of different asynchronous clock domains that uses both structural and functional verification techniques. It also outlines the use of an assertion-based verification strategy that can be added to a simulation-based flow to improve design quality.

SoCs with interfaces like PCIe and IB have two different operating frequencies and require synchronization during data transfer. Failing to synchronize data and control transfers between asynchronous clock interfaces leads to timing violations (e.g., setup and hold time) that cause signals to enter a metastable state. These metastable timing errors are difficult to detect as they are randomly generated. To detect them, you need the right combination of clock edges and data.

Metastability

The proper operation of a clocked flip-flop depends on the input signal being stable for a certain period of time before and after the clock edge. If the requirements for setup and hold time are met, then a valid output will appear at the output of the flip-flop after a maximum delay. But if the requirements are not met, the signal will take much longer to reach a valid output level. We refer to the result as either an 'unstable state' or 'metastability' (Figure 1).

Metastability is avoided by adding various features to a design. Two of the most common such features are:

• two flip-flop circuits (this technique requires a stable design, since an incorrect design will lead to synchronization failure); and/or

• FIFO buffers (handshaking mechanisms).

Structural clock domain verification

Structure analysis checks the connectivity and combinational logic of a design. It should be performed on both the RTL code and the post-synthesis gate-level netlist. It can be performed using automated CDC tools, script-based verification and code review, and identifies the following issues:

Snehal Patel is a project leader in the ASIC Verification division of eInfochips, a company providing integrated silicon, embedded system and software design and development.

Snehal Patel, eInfochips

Multiple cross clock domain verification

[Waveform diagram: signal InA in the CLKA domain is sampled in the CLKB domain; clock B samples InA while it is changing, so InB enters a metastable state]

FIGURE 1 Metastability

Source: eInfochips


FIGURE 2 Two flip-flop circuit

Source: eInfochips


Today’s system-on-chip designs often need to encompass multiple asynchronous clocks. This raises the problem of verification for the resultant clock domain crossings.

It is becoming apparent that functional simulation alone is not up to the task. Instead, engineers need to consider hybrid methodologies, combining structural and functional verification approaches.

The use of assertions is also proving increasingly important to CDC verification, and the author provides examples of code he built to address the techniques and features demanded by increasingly complex designs.

• unresolved clocks;
• combinational logic failures (jitter and glitches); and
• insufficient synchronization of signals where synchronizers are based on flip-flops.

Functional clock domain verification

Structural analysis will find many errors. However, it alone will not verify whether synchronizers have been used correctly in the design. Here, we draw on functional analysis techniques, as they can identify the following types of issue:

• data stability between two different clock speeds (fast and slow);

• FIFO read and write when both pointers operate on different clock frequencies; and

• data stability with a handshake-based synchronizer.

To identify synchronizers, functional clock domain verification uses the same algorithm as structural verification. However, under functional verification, assertions can be implemented for each type of synchronizer and as checks on all possible input conditions. Clock domain analysis with assertions can also be performed at different levels (e.g., block level, full-chip level and gate level) and helps to detect real timing violations in gate-level simulation.

Let's now explore in more detail two of the design features used to overcome metastability.

1. Two flip-flop circuit

A two flip-flop circuit is used to add delay for synchronization on a signal path. For the example shown in Figure 2, input data values must be stable for three destination clock edges. There are three cases where the input will not be transferred to the synchronizer output—where it is sampled by only one clock edge, sampled by two clock edges, or not sampled by clocks at all.

The advantage of a two flip-flop synchronizer is that it filters the metastability state, and this process is shown in Figure 3. The first ‘d_in’ pulse is never sampled by the clock and will not be transferred to ‘d_out’. But if the rising edge of this pulse violates the previous hold time or the falling edge violates the current setup time, this value may propagate to the real circuit.

The second ‘d_in’ pulse is sampled by the clock, but if this signal changes just before the clock edge because of a setup time violation, the simulation will transfer the new value. In this case, the real circuit will become metastable and will transfer the original low value.

The third ‘d_in’ pulse is sampled by two consecutive clock edges and will be filtered by the synchronizer. Since the rising edge of ‘d_in’ violates the setup time for the first clock edge and the falling edge violates the hold time for the next clock edge, simulation will propagate two clocks.

Here are the assertions and related descriptions required during this step.

To check stability of the data:

property stability;
  @(posedge clk)
  !$stable(d_in) |=> $stable(d_in) [*2];
endproperty : stability

To check for a glitch:

property no_glitch;
  logic data;
  @(d_in) (1, data = !d_in) |=>
  @(posedge clk) (d_in == data);
endproperty : no_glitch

assert property(stability);
assert property(no_glitch);

The ‘stability’ assertion checks the stability of ‘d_in’ for two clocks. The ‘no_glitch’ assertion checks for a glitch on ‘d_in’.
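For readers working outside a SystemVerilog environment, the intent of the ‘stability’ property can be mirrored as a trace predicate. The Python sketch below is illustrative: the trace format (one sampled value per destination clock edge) and the function name are assumptions.

```python
def check_stability(trace):
    """Mirror of the 'stability' property: whenever d_in changes at a
    sampled clock edge, the new value must be held for that edge and the
    following two edges. Returns the indices of violating edges."""
    violations = []
    for i in range(1, len(trace)):
        if trace[i] != trace[i - 1]:          # !$stable(d_in)
            window = trace[i : i + 3]         # change edge + next two edges
            if len(window) == 3 and len(set(window)) != 1:
                violations.append(i)
    return violations

print(check_stability([0, 1, 1, 1, 0, 0, 0]))  # held for 3 edges -> []
print(check_stability([0, 1, 0, 0, 0]))        # dropped too soon -> [1]
```

A formal or simulation tool does the same bookkeeping edge by edge; the value of writing it down this way is that the pass/fail condition becomes an executable specification.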

2. FIFO buffer circuits
2.1. Handshake synchronizer
Different types of handshake synchronizers are used to avoid CDC. In the type shown in Figure 4, the transmitter sends a request that is synchronized by two flip-flops. Here, the number of flip-flops is not fixed as it depends on the frequency of the clock. Once the receiver receives a request, it latches the data and sends an acknowledgement back to the transmitter. In this case, the data must be valid when the request is asserted.

If the request is not synchronized properly because of flip-flop error, it can be caught during structural verification. That step identifies errors that arise when incorrect data is latched through glitches generated by combinational logic driving a flop-based synchronizer.

Insufficient synchronization is also a structural error. High-speed clocks require a greater number of flip-flops and any shortfall can be detected by structural verification.

For verification of the handshake synchronizer circuit, the following conditions can be asserted:

1. Every request gets acknowledged.
2. No acknowledgement without a request.
3. Data stability.
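The first two conditions can be prototyped as a simple trace checker before being coded as assertions. The Python sketch below is illustrative only: the (req, ack) tuple-per-edge trace format and the function name are assumptions, and the data-stability condition is left out.

```python
def check_handshake(trace):
    """Check a (req, ack) trace, one tuple per clock edge, for:
    every request eventually acknowledged, and no acknowledgement
    without a pending request. Returns a list of (edge, message)."""
    pending = False
    errors = []
    for i, (req, ack) in enumerate(trace):
        if ack and not pending:
            errors.append((i, "ack without request"))
        if ack:
            pending = False      # acknowledgement retires the request
        if req:
            pending = True       # a new request is now outstanding
    if pending:
        errors.append((len(trace), "request never acknowledged"))
    return errors

good = [(1, 0), (0, 0), (0, 1)]   # request, wait, acknowledge
bad  = [(0, 1), (1, 0)]           # spurious ack, then orphaned request
print(check_handshake(good))      # -> []
print(check_handshake(bad))
```

The same pending/retire bookkeeping is what the concurrent assertions shown later encode declaratively, with the added benefit that a formal tool can search all traces rather than one.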

FIGURE 3 Timing diagram
Source: eInfochips

Page 32: EDA Tech Forum Journal: March 2009

< TECH FORUM > VERIFIED RTL TO GATES

Here are the assertions and related descriptions required for this stage.

Every request gets acknowledged

sequence req_transfer;
  @(posedge clk) req ##1 !req [*1:max] ##0 ack;
endsequence : req_transfer

property req_gets_ack;
  @(posedge clk) req |-> req_transfer;
endproperty : req_gets_ack

No acknowledgement without request

property ack_had_req;
  @(posedge clk) ack |-> req_transfer.ended;
endproperty : ack_had_req

Data stability

property data_stability;
  @(posedge clk) req |=> data [*1:max] ##0 ack;
endproperty : data_stability

assert property(req_gets_ack);
assert property(ack_had_req);
assert property(data_stability);

The ‘req_gets_ack’ assertion checks whether every request gets acknowledged. The ‘ack_had_req’ assertion checks whether every acknowledgement has a request. The ‘data_stability’ assertion checks if the data is stable for the period of the request, including an acknowledgement.

2.2 Asynchronous FIFO circuit
A dual clock asynchronous FIFO circuit is used for CDC synchronization when high latency in the handshake protocol cannot be tolerated. The circuit changes according to the requirements of the SoC, but its basic operation is constant (Figure 5). Data is written into the FIFO in the source clock domain and read from the FIFO in the destination clock domain.

Read and write pointers are passed into different clock domains to generate full and empty status flags. Data write and read is kept in synchronization by the write and read pointer positions.

The following conditions can be asserted for verification of dual clock asynchronous FIFO circuits:

1. Data integrity.
2. Do not write when the FIFO is full.
3. Do not read when the FIFO is empty.

Here are the assertions and related descriptions for this step.

Don’t write when FIFO is full / Don’t read when FIFO is empty

property full_empty_access(clk, inc, empty_full_flag);
  @(posedge clk) inc |-> !empty_full_flag;
endproperty : full_empty_access

Data integrity

int write_cnt, read_cnt;
always @(posedge write_clk or negedge write_rst_n)
  if (!write_rst_n) write_cnt = 0;
  else if (winc) write_cnt = write_cnt + 1;
always @(posedge read_clk or negedge read_rst_n)
  if (!read_rst_n) read_cnt = 0;
  else if (rinc) read_cnt = read_cnt + 1;

property data_integrity;
  int cnt;
  logic [DSIZE-1:0] data;
  disable iff (!write_rst_n || !read_rst_n)
  @(posedge write_clk) (winc, cnt = write_cnt, data = wdata) |=>
  @(posedge read_clk) first_match(##[0:$] (rinc && (read_cnt == cnt)))
  ##0 (rdata == data);
endproperty : data_integrity

assert property(full_empty_access(write_clk, winc, wfull));
assert property(full_empty_access(read_clk, rinc, rempty));
assert property(data_integrity);

The ‘full_empty_access’ assertion ensures that there is no data access when the FIFO is full or when it is empty. The ‘data_integrity’ assertion starts a new thread with an initial count value whenever data is written into the FIFO, and checks the read data value against a local data variable whenever a corresponding read operation occurs.
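The data-integrity condition amounts to a scoreboard: record each word as it is written, then compare it against the word eventually read out on the other clock. A minimal Python sketch follows; the class and method names are hypothetical, not from any tool.

```python
from collections import deque

class FifoScoreboard:
    """Mirror of the data-integrity check: every word written into the
    FIFO must be read out unchanged and in order."""
    def __init__(self):
        self.expected = deque()

    def on_write(self, wdata):
        # Call on the write clock whenever the write enable is high.
        self.expected.append(wdata)

    def on_read(self, rdata):
        # Call on the read clock whenever the read enable is high.
        assert self.expected, "read from empty FIFO model"
        want = self.expected.popleft()
        return rdata == want   # False flags a data-integrity error

sb = FifoScoreboard()
for word in (0xA5, 0x3C):
    sb.on_write(word)
print(sb.on_read(0xA5))   # -> True
print(sb.on_read(0x00))   # -> False (corrupted read)
```

The deque plays the role of the assertion’s per-write local variables: one outstanding expectation per written word, retired in order by reads.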

FIGURE 4 Handshake synchronizer circuit
Source: eInfochips


The relevant assertion statement is shown here:

property check_graycoded = always ((gc == 8'h0) ||
  (((gc ^ prev(gc)) & ((gc ^ prev(gc)) - 8'h1)) == 8'h0));
assert check_graycoded;

All the assertion statements in the examples above verify that the receiver gets the expected data. Since the latency between sending and receiving data is not known, verifying the sequence will ensure that no data gets dropped.

The actual value in the data sequence should not make a difference to the verification; an assertion could be written using a more random sequence or specific data values.

Formal verification can use this assertion to verify that the control logic of the assertion does not allow any loss of data and will generate counter example traces if any failures are found.

Conclusion
Multiple cross clock domain verification is best performed using both structural and functional verification techniques. The verification process may start with structural verification to remove errors related to insufficient synchronization, combinational logic driving synchronizers or missing synchronizers. After structural verification, functional verification may be used to implement additional assertions and check the usage of the synchronizers.

CDC jitter emulation
One problem that can arise because of synchronization is CDC jitter. Even though all signals may be synchronized properly for the destination clock domain, the actual arrival time can become uncertain if the signal goes metastable in the first synchronization flip-flop. This CDC jitter can lead to functional failures in the destination logic.

Normally, this jitter would not show up in a simulation. Both formal and dynamic verification methodologies can be used to determine if the design still works with CDC jitter. Formal verification can prove that design properties are never violated for all possible jitter conditions, whereas simulation can only demonstrate that no assertion checks are violated for a particular jitter combination. Actual violations can also be difficult to debug in simulation. CDC jitter emulation provides an improvement in overall verification quality.

Gray encoding
In addition to handshake-based and FIFO-based synchronizers, another method of synchronizing data is first to gray encode it and then use multi-flop synchronizers to transfer it across domains.

For multibit signals, a gray code ensures that only a single bit changes when a group of signals counts. This makes it possible to detect whether all the bits have been captured by the receiving clock together or if they have been skewed across multiple clock cycles due to metastability.

Another example is in an asynchronous FIFO where the read and write pointers cross the write and read clock domains respectively. These pointers are control signals that are flop-synchronized. The signal is gray-encoded prior to the crossover.

This can be verified using assertions. The assertion is simply the exclusive-or of the data and the previous value of the data, with the condition that the result must have at most one active bit (i.e., it should be ‘one hot’).
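This ‘one hot’ check is easy to sketch outside an HDL. The Python below mirrors the XOR-and-mask trick used in the assertion; the function names are illustrative only.

```python
def gray(n):
    """Binary-to-gray encoding: adjacent counts differ in one bit."""
    return n ^ (n >> 1)

def at_most_one_bit_changed(prev, curr):
    """XOR old and new values; x & (x - 1) clears the lowest set bit,
    so the result is zero iff at most one bit changed."""
    diff = prev ^ curr
    return diff & (diff - 1) == 0

# Successive gray-coded counter values always pass the check:
print(all(at_most_one_bit_changed(gray(i), gray(i + 1))
          for i in range(255)))                    # -> True
# A skewed capture that flips two bits fails it:
print(at_most_one_bit_changed(0b0000, 0b0011))     # -> False
```

Note that an unchanged value (diff of zero) also passes, matching the assertion’s acceptance of a stable pointer.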

FIGURE 5 Asynchronous FIFO circuit
Source: eInfochips

eInfochips, Inc.
1230 Midas Way, Suite 200
Sunnyvale, CA 94085, USA

T: +1 408 496 1882
W: www.einfochips.com


< TECH FORUM > DIGITAL/ANALOG IMPLEMENTATION

Introduction
Like all electronic products, trusted personal devices (TPDs) have to be more and more competitive, delivering higher performance at lower cost. Reducing the bill of materials and time-to-market can best be achieved through the broadest and most complete integration, one that also includes elements such as the antenna and/or the packaging where smart cards or system-in-package (SiP) technologies are involved. This paper proceeds from the position that designing such complex systems requires the development of a single EDA framework composed of multi-engine simulators that are associated with a library of hierarchical models.

Figure 1 shows the different parts in a TPD and the different types of signal involved in a typical communication between two such devices. It is clear that preparing them to handle the different signal shapes and frequencies is no trivial task. Indeed, during different design phases, engineers have to account for such factors as bit error rate (BER) on one hand, and power consumption on the other. Block design requires that electrical properties are set with due consideration for factors such as noise, nonlinearity, gain or impedance matching. Each step requires different simulators, and all must be interoperable. Even then, there is usually a design gap between the system specification and the RF block design. It typically centers on:

• the use of different design frameworks and different simulators; and

• a huge frequency ratio between the RF (GHz) and the digital baseband (MHz).

Nevertheless, to meet an original system-level specification, engineers need to mix different levels of abstraction so that they can explore implementation architectures and then validate the final design at the circuit level. Luckily, existing behavioral models in the VHDL-AMS analog and mixed-signal language provide the basis for such a seamless system-to-transistor-level approach. Relevant features include:

• top-level functional simulations for architecture validation using a top-down methodology;

• bottom-up verification with accurately characterized models;

• test program development using tester resource models;
• traditional measurement, post-processing and/or the use of testbenches; and
• IP exchange and protection to help assess a model against a specification.

This paper describes a methodology for the design of TPDs that can interface to multiple terminals and networks. This methodology is based on the Mentor Graphics ADVance MS (ADMS) framework [1], Matlab/Simulink from The MathWorks, and a hierarchical analog and mixed-signal intellectual property (AMS-IP) library.

The virtual RF system platform
The virtual platform

The different parts of a complex embedded system are usually specified and designed separately by engineers working with various EDA tools. For TPDs and similar communications devices, one must account for critical design parameters (e.g., cost, power consumption, channel effects) at the system level. The exploration and evaluation of various system architectures then requires access to a complete, hierarchical AMS-IP model library and a design environment that supports the use of several design levels. Then, test vectors that have been initially applied at the system level should also be available for reuse at each subsequent level in the design flow.

Simulation time is very important. To control its influence on time-to-market, this methodology uses a four-level hierarchical model library (Figure 2, p.36): transistor (SPICE or physical level), structural (low level), behavioral (intermediate) and high level (ESL or specification). It is also tailored to the three main types of design process: bottom-up, top-down and meet-in-the-middle [2]. The overarching goal is to define the primitive components and objects, and to then develop them and the system structure simultaneously, so

Gilles Jacquemod received his PhD from INSA Lyon in 1989. In 2000, he joined the LEAT laboratory at the Ecole Polytechnique of Nice-Sophia Antipolis University as a full professor. His primary research interests include analog design and the behavioral modeling of mixed domain systems.

Gilles Jacquemod, LEAT, University of Nice-Sophia Antipolis

TrustMe-ViP: trusted personal devices virtual prototyping


As the market continues to push for higher performance at lower cost, trusted personal devices (TPDs) must also be able to exchange greater volumes of data, voice or streaming video at ever higher bit rates, while transmitting and receiving securely under multiple telecommunication standards. Reductions in cost and time-to-market can only be achieved by integrating the complete system design including such aspects as the antenna and the packaging.

The TrustMe-ViP project is part of a larger CIM-PACA (Centre Intégré de Microélectronique Provence Alpes Côte d’Azur) initiative that has developed a virtual RF system-level design platform, and it is specifically dedicated to the design of complex TPD systems. It provides a unified EDA framework that features multiple simulation engines based around dedicated hierarchical model libraries available for multiple levels of abstraction.

The benefits and structure of this methodology are described here and illustrated by way of the example of a Bluetooth transceiver design.

that the final system is constructed from these primitives in the middle of the design process. The same paradigm is used to develop the hierarchical models.

SystemC is used to provide a fast, high-level, C++ based framework for the simulation and to validate the system specifications [3]. Meanwhile, for non-electrical devices (e.g., micro-electromechanical systems [4], photonic components, etc.), finite element methods (FEMs) provide golden simulations [5] at the physical level, in the same way as SPICE does for electronic circuits. These simulations are used to validate higher-level models or to develop them. In fact, both top-down and bottom-up approaches are suitable in that the hierarchical library can be built by refinement or abstraction.

Limitations of the methodology
The main limitation concerns the development of the hierarchical library. Model development is difficult and needs a high level of expertise to be carried out effectively. Achieving an accurate RF implementation demands the full consideration of such factors as non-linearity effects, compression, noise, phase noise, frequency response, mismatch and so on. Moreover, RF designers also often use proprietary models during simulation, and this often means that there are limited opportunities for their reuse across different design environments.
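To make the kind of effect such models must capture concrete, here is a toy Python sketch of a behavioral amplifier exhibiting gain compression and supply clipping. All coefficients and names are hypothetical illustrations, not taken from the article’s library.

```python
# Toy behavioral RF amplifier: linear gain plus an odd-order
# nonlinearity (producing gain compression), then hard clipping.
def amp(v_in, gain=10.0, a3=2.0, v_clip=1.0):
    v = gain * v_in - a3 * (gain * v_in) ** 3   # third-order compression term
    return max(-v_clip, min(v_clip, v))         # saturation at the rail

# The effective gain drops as the input level rises:
for v_in in (0.001, 0.01, 0.05):
    print(round(amp(v_in) / v_in, 2))
```

Even this two-line model reproduces the qualitative behavior (small-signal gain, compression, clipping) that a proprietary transistor-level model captures quantitatively, which is why hierarchical behavioral libraries are worth the development effort.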

The problem of coupling through the substrate or the wires must also be considered: increased digital signal processing (DSP) activity degrades the quality of analog signal processing. Luckily, mixed-signal simulation is now a mature technology, and several unified simulators and co-simulation solutions are available [6]. This smooths our path to the objective that, wherever possible, one should reuse existing models and unify various platforms in ways that are transparent to the user. The key factors for success are:

• the ability to take a hybrid approach to the integration of different tools;

• the use of open databases using standard languages; and

• the availability of standard, compatible input/output formats and interfaces.

Tool flow
As noted earlier, the platform is based on the ADMS framework coupled with Matlab/Simulink. The main idea is to reuse models developed under Simulink, which offers an RF system baseband modeling simulation where the number of simulation steps is significantly reduced, as well as Simulink user-defined continuous models.

For example, in a Bluetooth [7] transceiver, the wireless system transmits a GFSK-modulated signal at 1Mb/s, using a frequency-hopping algorithm over 79 1MHz sub-carrier channels, with a center frequency around 2.4GHz. In this case,

FIGURE 1 TPD communication and signals characteristics
Source: LEAT



the established models only take into account the spectral peaks within a bandwidth around the carrier frequency. As a result, the first important drawback is the loss of simulated nonlinear behavior outside the bandwidth. A second is a limitation to the modulated steady state (MODSST) simulation. It requires the development of Simulink continuous models because the tool lacks appropriate baseband models. So, for behavioral and functional approaches, we need to run different RF simulations to analyze the complete system. The solution is to export and compile the Simulink models in ADMS, using the Real Time Workshop (RTW) toolbox [8] (Figure 3).

Simulink is a graphical ‘what-you-see-is-what-you-get’ editor based on the Matlab engine. This simulator covers the system level, delivering its results easily and quickly. It allows the simulation of complete transmission systems, including both the digital and analog elements. The simulation accuracy is admittedly limited for the analog portion, but that is counterbalanced by the tool’s system approach to verification and validation. Simulink has thus become a standard language by default, particularly for system-level designers.

VHDL-AMS (or Verilog-AMS) is used to describe the digital, analog and RF behavior. The ADMS framework can cope with mixtures of different languages (e.g., SystemC, VHDL, Verilog, VHDL-AMS, Verilog-AMS, SPICE netlists) and simulators (e.g., Questa, ModelSim, Eldo and Eldo RF) (Figure 4). To carry out high-level simulations, ADVance MS allows simulators that support VHDL or SystemC to be run. For the analog and RF parts of the circuit, Eldo and Eldo RF are best suited to the simulation, and both support the Commlib and Commlib RF libraries.

Hierarchical models
The design levels most commonly used in today’s methodologies are the system level, the block or register level, and the circuit or transistor level.

The system level is usually the first to be considered, and here the design’s specifications are extracted. It can be useful in determining the modulation to be used for a given design. It also allows for the definition of critical parts of the design, of the modeling level that will be used across the flow, and of the simulators to be used later on.

At the block level, one must provide more detail to increase simulation accuracy. One must mainly account for electrical considerations in more specific terms, rather than

FIGURE 2 Methodology for complex SoC development
Source: LEAT

FIGURE 3 Matlab/Simulink over ADVance MS
Source: LEAT


using the mathematical equations seen at higher abstractions. This does increase simulation time.

Finally, at the transistor level, one needs to link the performance of the components to the technology. Here, the necessary SPICE simulations are quite time-consuming and therefore must be tightly controlled.

To improve simulation efficiency, we recommend that engineers mix these description levels during both design and verification. The development of such hierarchical models is both possible and desirable whether the project uses a top-down or a bottom-up methodology, or a combination.

Bluetooth implementation and results
System considerations
Bluetooth dominates the wireless personal area network (WPAN) space and, as one would expect, has mainly been used to date in the mobile communications market, where both low cost and low power consumption are required. The latest specification introduced an enhanced data rate (EDR) mode, up from the original 1Mb/s (using Gaussian-frequency-shift-keying modulation) to 3Mb/s (using eight-differential-phase-shift-keying, or 8DPSK). It uses a slotted protocol with a frequency-hopping-spread-spectrum technique in the typically unlicensed ISM frequency band (2.402-2.483GHz) across 79 channels of 1MHz. The transmission channel changes 1,600 times per second. Therefore, time is divided into 625μs slots using a time-division-multiplexing (TDM) scheme.

The Bluetooth specification not only defines a radio interface but also an entire communication protocol stack with the following layers (also shown graphically in Figure 5):

• Applications/Profiles. This is the upper layer of the Open Systems Interconnection (OSI) protocol stack and deals with end-user processes. Profiles define how to use the lower Bluetooth layers to accomplish specific tasks.
• Logical Link Control and Adaptation Protocol (L2CAP). This is responsible for packet multiplexing, segmentation and reassembly.
• Host Control Interface (HCI) or Control. This defines the interface by which upper layers access lower layers of the stack.
• Link Manager Protocol (LMP). This is responsible for the link state and establishes the power control mode.
• BaseBand (BB). This deals with framing, flow control, medium access control (MAC) and mechanisms for timeout.
• Radio Frequency layer. This is concerned with the design of the Bluetooth transceiver.

Transceiver simulation from SystemC to RF
Figure 6 (p.38) illustrates the modeled architecture of a Bluetooth transceiver. We focus here on one transceiver. The baseband Simulink model includes modulation/demodulation and gathers the necessary information to generate a sequence at 1Mb/s before we address the RF part of the design. The master SystemC model of the Bluetooth device can be substituted with a random bit generator for the RF simulation. For the RF architecture, we use a heterodyne structure with two intermediate frequencies (IF) equal to 1/3 and 2/3 of the carrier to reduce classical coupling between the voltage-controlled oscillator (VCO) and the power amplifier (PA). The frequency synthesizer

FIGURE 4 ADVance MS framework
Source: LEAT

FIGURE 5 SystemC model architecture
Source: LEAT



is generated by a fractional phase-locked loop (PLL) [12]. The PA and the mixers are modeled both in Simulink and VHDL-AMS at different levels of abstraction.

An RF transceiver contains key models (e.g., low noise amplifiers (LNAs), mixers, channel descriptions, PAs, PLLs with VCOs, filters). To simulate these elements, we need to develop an RF library [13]. This is described in a combination of the VHDL-AMS behavioral language and Simulink on the ADVance System Model Extractor (ADSME) [1]. The main features of the different blocks are then characterized (e.g., the third-order intercept point and clipping effects, conversion gain, phase noise, noise figure and input/output impedances and matching).

After a first simulation, a preliminary study shows that the PA is a sensitive block. Its power consumption is by far the most important in the RF part of the design, and its power output can directly affect the transceiver (BER). For these reasons, a full SPICE description for this component is worth having. However, other blocks are described either with Simulink, VHDL-AMS or Verilog-AMS. Under the ADMS RF software, we can again mix different abstraction levels and description languages across various parts of one block.

Simulink allows us to simulate a complete transmission system that contains RF, MAC, modems and channel by combining the digital and analog parts. We start with Simulink at the system level, where it offers a baseband modeling simulation for RF systems that significantly reduces the number of simulation steps in the time domain (transient simulation).

Based on elapsed time for a 68μs Simulink simulation frame sequence, we model our transceiver by choosing from several possible simulation options (e.g., MODSST or transient) and behavioral language descriptions (e.g., SystemC, Simulink, VHDL-AMS, Verilog-AMS), and then use circuit SPICE for our PA. A final comparison with SPICE visibility can then be performed.

Several conclusions can be drawn: a Simulink design exported on ADMS runs 6x faster; one combining SystemC and Simulink on ADMS runs still 7x faster; and SystemC decreases the simulation time by 1x while behavioral languages reduce it by 2x. Armed with this information, we can determine the adequate abstraction level at which to analyze the design from SPICE to system models and estimate the bit error rate. We notice that MODSST can significantly reduce simulation time (reported as 34x [1]), but the SPICE device number must be increased to correctly use the MODSST simulation.

To validate our methodology, a system simulation is first done using SystemC, and the corresponding frame sequence is generated. Figure 7 shows the generation of the sequence of bits for an identification (ID) packet, the simplest packet in the Bluetooth standard.

To quickly extract an estimated BER from the transceiver, an analytic model has been developed. Based on the SystemC

FIGURE 6 Bluetooth transceiver architecture

Source: LEAT

FIGURE 7 SystemC generation of a bit train
Source: LEAT


Bluetooth frame, we simulate the transceiver including modulation, demodulation and RF architecture, with compression, frequency response and third-order harmonics characteristics extracted from previous simulations. This simulation will be done without noise in a relatively fast time. The next step will be the estimation of the BER using formula (1) [5]:

BER = Σᵢ p(i) · ½ erfc( |Iᵢ − D| / (√2 σᵢ) )    (1)

We can approximate this formula by equation (2):

BER ≈ (1/N) Σᵢ₌₁..N [ σᵢ / (√(2π) |Iᵢ − D|) ] · exp( −(Iᵢ − D)² / (2σᵢ²) )    (2)

where D represents the decision level, σᵢ is the variance of block i, N is the number of blocks and p(i) the sequence probability, linked to the total variance (see equation (3)):

σ² = σ²_mixers + σ²_LNA + σ²_PA + σ²_filters + σ²_PLL + σ²_channel    (3)

An approximation of phase noise can be done using the Allan variance [12] for the mixer and PLL models or by phase noise area computation. Figure 8 presents a section from a BER analyzer simulation with the ‘cleaned’ signals (before and after modulation), variance computation and the resulting estimate for the BER.

We compute the BER using different simulations based on different hierarchical models and languages. To compare the results, we give the elapsed simulation time per bit (Xeon PC Linux, 2.8GHz and 8 Gbyte RAM) and the corresponding estimated BER:

• Ideal Simulink models (without noise): 1.4s/bit simulation time with a BER equal to 0.
• Simulink models with RF noise for the PA: 55.1s/bit simulation time with a BER equal to 6.7×10⁻⁴.
• Simulink or SystemC models for the digital part, Simulink models for the MODEM part and VHDL-AMS models including noise for the RF part: 26s/bit simulation time with a BER equal to 8.1×10⁻⁴.
• Quasi-analytic BER estimation: 16ms/bit simulation time with a BER equal to 6.4×10⁻⁴.

Circuit modeling and design
Armed with the nominal BER, one can perform RF system simulations to determine the noise performance required for each RF block (e.g., oscillator, PLL, amplifier, mixers). For example, Figure 9 shows the performance range of a PLL that can be used in Bluetooth applications: notice that the Bluetooth specification’s requirement for phase noise of -119dBc/Hz@3MHz is respected. A time-domain simulation, which compares different hierarchical models (using a combination of Simulink and VHDL-AMS) for the settling time, is given in Jacquemod, Geynet et al [15]. Given adequate abstraction, this simulation time can be reduced by a ratio between five and six.
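The quasi-analytic estimate of equations (1) and (2) can be sketched directly in Python. The decision level and the per-block (Iᵢ, σᵢ, p(i)) values below are hypothetical illustrations, not figures from the article.

```python
import math

def ber_exact(levels, D):
    """Equation (1): BER = sum_i p(i) * 0.5*erfc(|I_i - D| / (sqrt(2)*sigma_i))."""
    return sum(p * 0.5 * math.erfc(abs(I - D) / (math.sqrt(2) * s))
               for I, s, p in levels)

def ber_approx(levels, D):
    """Equation (2): asymptotic approximation of (1) with p(i) = 1/N."""
    N = len(levels)
    return sum(s / (math.sqrt(2 * math.pi) * abs(I - D))
               * math.exp(-(I - D) ** 2 / (2 * s ** 2))
               for I, s, _ in levels) / N

# Two hypothetical decision regions: (I_i, sigma_i, p_i)
levels = [(1.0, 0.15, 0.5), (-1.0, 0.15, 0.5)]
print(f"{ber_exact(levels, 0.0):.2e}")
print(f"{ber_approx(levels, 0.0):.2e}")   # close to (1) when |I - D| >> sigma
```

The speed advantage reported for the quasi-analytic method comes from exactly this: once the per-block variances are characterized, the BER is a closed-form evaluation rather than a noisy transient simulation.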

FIGURE 8 BER estimation simulation from SystemC frame

Source: LEAT

FIGURE 9 PLL phase noise (phase noise vs. frequency shift in Hz; measurement and simulation)
Source: LEAT



[13] B. Nicolle, W. Tatinian, J-J. Mayol, J. Oudinot, G. Jacquemod, “RF library based on block diagram and behavioral description”, BMAS 2007, San Jose, CA, 2007

[14] Clark R.L., “Phase Noise measurements on vibrating crystal filters”, Frequency Control, 1990, Proceedings of the 44th Annual Symposium, pp. 493-497

[15] G. Jacquemod, L. Geynet, B. Nicolle, E. de Foucauld, W. Tatinian, P. Vincent, “Design and Modelling of a Multistandard Fractional PLL in CMOS/SOI Technology”, Microelectronics Journal, 2008, vol. 39, no. 9, 2008, pp. 1130-1139

Acknowledgments
The author would like to thank the CIM-PACA Design Platform for its support.

Conclusion
This paper demonstrated the feasibility and the efficiency of mixing analog RF and digital simulators in a single design environment. A dedicated RF design and verification platform was coupled with a SystemC emulation environment that enabled the simulation of a full Bluetooth transmit/receive flow. This started with protocol emulation followed by sent-data generation, modulation, and modeling and optimization of the RF front-end.

To achieve this, we developed a multi-language descrip-tion (SystemC, Simulink, VHDL-AMS and SPICE netlist) using multiple engines (ModelSim, Matlab, Eldo and EldoRF), across multiple domains (MAC, BB, and RF) and hierarchical description levels (system, behavioral, struc-tural and transistor) to simulate a BT transceiver.

In comparison with other generic solutions, the simula-tion time is significantly decreased while the precision is in-creased. In addition, most of the models used for the simula-tions were generic, enabling easy reuse of the methodology.

References[1] ADVance MS Users’ Manual, v4.5_1, AMS 2006.2[2] H. De Man, F. Catthoor, G. Goossens, J. Vanhoof, J. Van

Meerbergen, S. Note, J. Huisken, “Architecture-driven Synthesis Techniques for VLSI Implementation of DSP Algorithms”, Proc. IEEE, Vol. 78, no.2, 1990, pp. 319-334

[3] Younes Lahbib, Romain Kamdem, Mohamed-lyes Benalycherif, Rached Tourki, “An automatic ABV methodology enabling PSL assertions across SLD flow for SOCs modelled in SystemC”, Sciencedirect.com, Computers and Electrical Engineering 31 (2005), pp. 282-302, Elsevier.

[4] M. Zorzi, N. Speciale, G. Masetti, “A new VHDL-AMS simulation framework in Matlab”, BMAS 2002.

[5] P. Bontoux, I. O’Connor, F. Gaffiot, G. Jacquemod, “Behavioral modeling and simulation of optical integrated devices”, Analog Integrated Circuits and Signal Processing, vol. 29, issue 1/2, 2001, pp. 37-47

[6] G. Jacquemod, “Virtual RF System Platform- Myth or Reality?”, Design Automation and Test in Europe 2007, Exhibition Theatre, Technical Panel: H. Guegan, F. Lemery, Y. Deval, A. Tudose, W. Krenik, O. Vermesan, Nice, 2007

[7] http://www.bluetooth.com[8] Matlab/Simulink User’s Manual, v2006b[9] http://www.eda.org/vhdl-ams[10] Modeling and Simulation for RF system design, Ronny

Frevert et al., Springer 2005[11] BT SIG, “Specification of the Bluetooth System - Core” Core

Specification v2.1 + EDR, 2007, https://www.bluetooth.org/spec[12] Clark R.L., “Phase Noise measurements on vibrating crystal

filters”, Frequency Control, 1990, Proceedings of the 44th Annual Symposium, pp. 493-497

Electronics, Antennas and Telecommunications Laboratory (LEAT)
University of Nice-Sophia Antipolis
UMR CNRS 6071
Bâtiment 4
250 rue Albert Einstein
06560 Valbonne
France

T: +33 (0)4 92 94 28 00
W: www.elec.unice.fr



< TECH FORUM > DESIGN TO SILICON

When Chemical Mechanical Polish (CMP) was first introduced to IC manufacturing in the early 1990s, common sense insisted that the process was too crude and riddled with defects to use in the production of modern electron devices. However, CMP overcame predictions of an early demise, and as the decade passed, its use expanded considerably. Table 1 shows the insertion of new CMP steps into logic IC technology nodes by year of insertion as well as the technology element enabled by each CMP step. Introduction of these early CMP steps was key to the IC manufacturers’ ability to maintain scaling trends.

The insertion of new steps slowed during the first part of the current decade after the insertion of copper CMP at the 130nm node. A major reason for the slowdown was concern raised over CMP’s inadequacies: mainly that the technique was expensive, induced yield-limiting defect modes, and resulted in thickness variations that were inconsistent with the scaling trends of the IC industry. While the first implementations of CMP were required to enable technology scaling, the original concerns around its crudeness appeared to reassert themselves. During these years, methods were found to scale dimensions and improve transistor performance without requiring new CMP steps.

However, at the 45nm node, CMP is once again being used to enable a critical advancement in silicon technology. It is an integral component of the replacement metal gate (RMG) approach for defining metal gate structures required for high-k, metal gate (HKMG) dielectrics [1, 2]. Copper CMP has also been carried forward from the 65nm node to form Cu interconnects.

Requirements for RMG CMP
CMP technology is extensively utilized to create metal gate electrodes for the introduction of HKMGs at the 45nm technology node [1, 2]. Figure 1 shows the RMG process flow utilizing poly opening polish (POP) and metal gate polish steps. Because of the small dimensions and consequent small dimensional tolerance of the gate structure, traditional CMP processes are inadequate for these RMG steps. For functional devices and requisite yield, thickness control and defect performance have to be significantly improved over CMP processes used for previous technologies.

Table 2 shows integration issues associated with the insertion of the new RMG CMP steps. Many of these arise from insufficient thickness control, either during CMP or incoming to CMP. Low polish rates at the POP step result in tall gates that may not be properly filled with gate metals (due to high aspect ratio). Taller gates also require longer contact etch, potentially resulting in underetched contacts. Severe underpolish, such that the poly-Si is not exposed, causes poly to remain in the gate, preventing proper metal fill. The resultant poly-Si gate transistor will fail due to improper work function (Vt shift) and high gate resistance.

Underpolishing at the metal CMP step results in incomplete overburden metal removal and hence shorting. The metal gate polish step must undergo sufficient overpolishing to remove any topography evolved during poly opening. Excessive overpolishing at either of the RMG CMP

Joe Steigerwald is director of Chemical Mechanical Polish Technology, Intel Technology and Manufacturing Group. He received his bachelor’s in electrical engineering from Clarkson University, and his master’s and doctorate in electrical engineering from Rensselaer Polytechnic Institute.

Joseph M. Steigerwald, Intel

Chemical mechanical polish: the enabling technology


FIGURE 1 RMG process flow showing CMP steps: (a) ILD0 deposition post transistor formation, (b) POP CMP to expose the poly-Si gate, (c) poly etch, (d) metals deposition, (e) metal gate CMP

Source: Intel


Chemical mechanical polishing (CMP) has traditionally been considered an enabling technology. It was first used in the early 1990s for BEOL metallization to replanarize the wafer substrate, thus enabling advanced lithography, which was becoming ever more sensitive to wafer surface topography. Subsequent uses of CMP included density scaling via shallow trench isolation and interconnect formation via copper CMP.

As silicon devices scale to 45nm and beyond, many new uses of CMP are becoming attractive options to enable new transistor technologies. These new uses will demand improved CMP performance (uniformity, topography, low defects) at lower cost, and this will in turn require breakthroughs in hardware, software, metrology and materials (e.g., slurry, pad, cleaning chemicals).

This paper reviews the module level and integration challenges of applying traditional CMP steps to enable high-k metal gate for 45nm technology and to advance copper metallization from the 65nm to the 45nm node. These challenges are then considered with respect to new CMP applications for 32nm and beyond.

steps results in thin gates with high gate resistance and the potential for over-etched contacts. Severe overpolishing results in the exposure of the adjacent raised source/drain regions, which are then attacked in the post-CMP poly removal etch step. These integration concerns mean there is a narrow process window at both CMP steps.

Figure 2 shows the historical improvements of within-die (WID) thickness variation obtained for shallow trench isolation (STI)/POP CMP processes. Due to the thickness control issues listed in Table 2 (p.44), circuit yield declines precipitously for WID values above the dashed line. Note that the historic 70% scaling from previous technologies is inadequate to meet required WID control.

The required thickness control is gained via several key CMP innovations. First, WID/topography control is achieved by the selection of high selectivity slurry (HSS) and polish pad as well as the optimization of machine parameters around the selected set of consumables. With the HSS, the polish process slows significantly when the void-free interlayer dielectric (ILD0) overburden is cleared and the gate is exposed, resulting in an autostop to the process. Next, within-wafer (WIW) uniformity is optimized by structured experimental designs varying polish pressure, head and pad velocities, pad dressing and polish head design.

Thickness control within a standard edge exclusion is insufficient: poor polish control of the bevel region of the wafer potentially leads to subsequent redistribution of bevel films during poly etch and subsequent wet etch operations.

Significant CMP defect improvement is also required for satisfactory HKMG yield. Typical CMP defects are listed in Table 3 (p.44) (these modes were experienced during the development of RMG CMP processes). Because of the narrow dimensions at the gate layer, HKMG yields are particularly sensitive to CMP defects, and without significant improvements from the initial levels of RMG CMP defects, HKMG yield would not have been viable.

Node | Year | CMP | Enabling
0.8um | 1990 | ILD | Multilevel metallization
0.35um | 1995 | STI, PSP, W | Compact isolation; poly-Si patterning; yield/defect reduction
0.18um | 1999 | SiOF ILD | RC scaling
0.13um | 2001 | Cu | RC scaling
90nm | 2003 | SiOC ILD | RC scaling
65nm | 2005 | |
45nm | 2007 | RMG | High-k metal gate
<32nm | 2009+ | ??? | Continued scaling; new devices; new architecture

TABLE 1 Enabling CMP technologies

Source: Intel


[Figure 2 plots normalized CMP topography (log scale, 0.01 to 1) against technology node (350nm down to 45nm) for STI WID and POP WID, with a 70% scaling reference line. Annotations: values must be below the line to enable functional Hi-K/metal gate transistors; the achieved results scale faster than the 70% rate.]

FIGURE 2 Improvements in CMP topography by technology node

Source: Intel


Issue | Cause | Impact to device
Gate resistance variation | Poor polish rate control (WIW, WID, WTW) | Parametrics
Poor gate fill | Thick poly opening / large aspect ratio | High gate resistance/defects
Unexposed poly-Si gate | Low polish rate | Vt shift, high gate resistance
Residual material (underpolish) | Non-uniform polish; poor planarization | Shorting/opens; particle generation
Raised S/D exposure | Overpolish; thick epi S/D | S/D removal during poly etch
Contact etch window | Metal gate height determines etch depth | Opens/shorts for under/over etch
Bevel edge redistribution of poly/epi | Under/overpolish in bevel region generates defect source | Defects due to redistribution of underexposed gate or overpolished S/D regions
Structural damage to device layer | High CMP shear forces | Low yield/reliability
Strain/stress relaxation | High shear force during CMP | Loss of strain-induced carrier mobility

TABLE 2 Integration issues with RMG CMP

Source: Intel

Defect mode | Potential causes | Impact to device | Potential solutions
Particles | Slurry/pad residue; polish byproducts | Shorting/opens; pattern distortion | Cleaner tooling; clean chemistries
Macro scratches | Large/hard foreign particles on polish pad | Pattern removal over multiple die | Pad conditioning; pad cleaning; environment
Micro scratches | Slurry agglomeration; pad asperities | Shorting/opens | Slurry filters; pad/pad conditioning
Corrosion (metal CMP) | Slurry chemistry; clean chemistry | Opens; reliability | Passivating films; chemistry optimization
Film delamination | Weak adhesion; CMP shear force | Shorting/opens; device parametrics | Improve adhesion; low-pressure CMP
Organic residue | Inadequate cleaning; residual slurry components | Shorting/opens; disturbed patterning of next layer | Cleaner tooling; slurry optimization; clean chemistries

TABLE 3 CMP defect modes

Source: Intel

Back-end metallization requirements
The scaling of circuit dimensions at the 45nm node also requires significant improvement to the copper CMP process. As the metal line width scales, variation in the height of the line results in greater variation in its resistance and capacitance. Copper metal loss during CMP (dishing and erosion effects) is a primary cause of interconnect height variation for the 45nm node, and a significant reduction in copper loss is required to ensure proper functioning of the interconnect.
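To see why dishing and erosion matter more as lines narrow, note that the resistance of a rectangular line is R = ρL/(wh), so a fixed height loss Δh is a larger fraction of a thinner line. An illustrative calculation (our numbers, not Intel data):

```python
def line_resistance(rho, length, width, height):
    """Resistance of a rectangular interconnect: R = rho * L / (w * h)."""
    return rho * length / (width * height)

def dishing_penalty(height, delta_h):
    """Fractional resistance increase from losing delta_h of metal height."""
    return height / (height - delta_h) - 1.0
```

For example, losing 10nm from a 100nm-tall line raises its resistance by roughly 11%, and the same absolute copper loss hurts proportionally more at each shrink.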

Copper thickness loss is decreased as WIW and WID thickness control is raised, and also as improvements in surface topography at a given layer allow a reduction in the amount of oxide removed at subsequent layers. Underlying surface topography requires additional oxide removal to ensure that all of the metal overburden is removed from the low-lying areas of the surface topography. Hence improvements in front-end topography translate into a need for less oxide removal in the upper back-end metal layers.

Improvements in the uniformity of the copper CMP WIW and WID removal rate at the 45nm node are a result of improvements in slurry selectivity as well as polish pad, pad dressing and polish machine parameter optimizations.
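Because the copper CMP step is repeated at every metal layer of the stack, even a small per-layer defect-limited yield loss compounds multiplicatively. An illustrative back-of-envelope with made-up numbers:

```python
def stack_yield(per_layer_yield, layers):
    """Compound yield when the same CMP step is repeated at each metal layer."""
    return per_layer_yield ** layers
```

At 99% per layer, a 10-layer metal stack already loses close to 10% of die to CMP defects alone, which is why even ‘low’ defect densities matter.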

As with the RMG CMP steps, the defect modes listed in Table 3 are a challenge at copper CMP steps. Because modern IC technologies contain up to 10 layers of metal, even low levels of defect density in the copper CMP step can have a significant impact on yield.

New uses of CMP
Recent conference proceedings and journal articles are rich with new potential uses of CMP, under consideration for 32nm and beyond. Table 4 lists some of those intriguing possibilities as well as some of the potential challenges they can be expected to bring. It is evident from all that activity that CMP is still considered useful by process technology architects.

However, the difficulties experienced in introducing RMG steps at the 45nm node demonstrate that any new CMP steps introduced into the front end of the line will require exceedingly tight film thickness and defect control. Without such control, recently added RMG CMP steps would not have successfully advanced from R&D to high volume manufacturing. Emerging options can be expected to demand even greater control of thickness and defects, as will shrinking to dimensions of 32nm, 22nm and beyond. To continue the growth of CMP, the industry must find processing conditions (slurry, pad, tooling) that improve performance in critical areas.

Acknowledgements
The author thanks Francis Tambwe, Matthew Prince and Gary Ding for assistance in preparing data, and Tahir Ghani and Anand Murthy for valuable discussions in preparing the manuscript.

References
[1] K. Mistry et al., IEDM Tech. Dig., p. 247, 2007.
[2] C. Auth et al., Symp. VLSI Dig., p. 128, 2008.
[3] K. Kuhn, IEDM Tech. Dig., p. 471, 2007.
[4] H.Y. Yu et al., IEDM Tech. Dig., p. 638, 2005.
[5] C. Park et al., IEDM Tech. Dig., p. 299, 2004.
[6] A. Kaneko et al., IEDM Tech. Dig., p. 884, 2005.
[7] Y-S Kim et al., IEDM Tech. Dig., p. 315, 2005.
[8] K.N. Chen et al., IEDM Tech. Dig., 2006.
[9] S.M. Jung et al., IEDM Tech. Dig., 2006.
[10] E.K. Lai et al., IEDM Tech. Dig., 2006.
[11] D.C. Yoo et al., IEDM Tech. Dig., 2006.
[12] T. Nirschl et al., IEDM Tech. Dig., p. 461, 2007.

Intel
RA3-301
2501 NW 229th Ave
Hillsboro
OR 97124
USA

T: 1 503 613-8472
W: www.intel.com

Application | CMP enabling aspect | Potential challenges | Reference
FUSI replacement metal gates | CMP used to expose p and n gates | Inadvertent exposure of opposing gates | H.Y. Yu, et al. [4]
Novel FUSI metal gates | Enables differential silicidation of poly-Si gate | Requires CMP of dissimilar metals | C. Park, et al. [5]
FINFET devices | Planarization of poly-Si; reduction of topography by fins | Poly-Si thickness variation resulting in patterning issues | A. Kaneko, et al. [6]
FINFET devices | Damascene (inlaid material) approach to FINFET formation | Thickness variation caused by multiple CMP steps | Y-S Kim, et al. [7]
3D integration - chip stacking | Oxide CMP post Cu CMP to recess oxide and promote Cu bonding | Smearing of Cu bumps | K.N. Chen, et al. [8]
3D stacked NAND flash devices (2 papers) | 3D integration | Ultra-flat topography required for building multiple layers of devices | S.M. Jung, et al. [9]; E.K. Lai, et al. [10]
Novel memory - ferroelectric media | CMP eliminates roughness typical of ferroelectric material | Device layer thickness control | D.C. Yoo, et al. [11]
Phase change memory | Planarize and expose phase change element | Thickness control of small device region | T. Nirschl, et al. [12]

TABLE 4 Potential new CMP applications

Source: Intel


< TECH FORUM > TESTED COMPONENT TO SYSTEM

PCB designers at Broadcom were tasked with accomplishing length matching within a differential pair. The task involves applying serpentine or sawtooth routing to a trace. Doing this manually is tedious and time-consuming, whereas automation cuts the time and effort involved significantly. The automation layer in Mentor Graphics’ Expedition product provides all the tools needed to complete this job.

The paper demonstrates how the script form editor and automation object model were used to create an interface for applying a sawtooth route on an existing trace to achieve proper length matching for differential pairs.

Tuning requirements
All signal traces in the MAC section have controlled impedance and need to be routed with a specific trace width. They must follow 3-W minimum spacing from other traces and differential pairs on the same layer.

• 95Ω +/- 10% for the PCIe differential pairs (17 pairs total).
• 100Ω +/- 10% for the XAUI differential pairs (two sets of 16 pairs each).

The two traces within each PCIe differential pair must be length-matched so that the long trace is no more than 5mils longer than the short trace. The same 5mil matching requirement applies to the two traces within each XAUI differential pair.

Figure 1 shows how to achieve the 5mil-length matching within a differential pair. Apply this serpentine at the end of the trace run where the mismatch occurs for the PCIe and for the XAUI differential pairs.
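The number of bumps a given mismatch requires can be budgeted up front. A sketch under the assumption that each identical bump adds a known extra length (a parameter here; the article does not give a per-bump value):

```python
import math

def bumps_needed(length_mismatch, added_length_per_bump, tolerance=5.0):
    """Smallest number of identical sawtooth bumps that brings the
    pair mismatch (all values in mils) within tolerance."""
    excess = length_mismatch - tolerance
    if excess <= 0:
        return 0  # already within the 5mil matching requirement
    return math.ceil(excess / added_length_per_bump)
```

For instance, with a 23mil mismatch and 4mils gained per bump, bumps_needed(23.0, 4.0) returns 5.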

To achieve the above sawtooth serpentine, we had two options:

Option one
Create an AutoCad DXF file of a small two-bump sawtooth pattern following the requirements shown in Figure 1. Import this DXF file into Expedition as a drawn line object onto a temporary user layer where it will be copied-and-pasted until the desired length is met. Then select all the segments and convert them into a continuous poly line. Change this drawn poly line type from a ‘Draw Object’ to a ‘Trace’, changing the ‘Layer’ and ‘Net’ properties at the same time. Move the trace to the desired location and finish routing the differential pair.

Option two
Select the longest line in the differential pair net and ‘fix’ or ‘lock’ it down. Manually create a small segment on the shortest net of the differential pair and pull this segment away from the longest differential pair net. Adjust this small segment to meet the sawtooth requirement shown in Figure 1. Continue creating and adjusting these small segments until the desired length is met.

Either option demands a lot of time and effort. For one particular design, we chose option two because it best suited the board’s routing density. The initial routing pass of 49 differential pairs took almost 32 hours to match. By replacing this with sawtooth tuning automation, we cut the time required to route and match all 49 pairs to five hours.

Sawtooth tuning automation
The tuning form that we developed modifies the route of an existing trace to have sawtooth bumps according to user-defined parameters. It was designed using Expedition’s Script Form Editor. Figure 2 shows the form that is presented to the user.

Alex Diaz and John Mehlmauer are both on the senior staff, PCB Layout at Broadcom.

Alex Diaz, John Mehlmauer, Broadcom

Automating sawtooth tuning

FIGURE 1 Sawtooth serpentine for length matching (trace width w; sawtooth leg length >3w; pair spacing S; bump spacing S1 < 2S)

Source: Broadcom

Length matching within a differential pair can be one of the more tedious tasks facing a PCB designer. This article describes how a team at communications and consumer electronics semiconductor company Broadcom overcame this by using the automation options available within their design tools. While the article describes automation tailored to a specific task, the authors present it here as an example of what can be done more generally. The workflow they developed is described, followed by detailed description of the code that underpinned the automation process.

Differential pair section
In this section, the user must first select a differential pair net to work with using the ‘ComboBox’ control. On selection, the pair is highlighted in the Expedition workspace. Just under the ComboBox control is a checkbox labeled ‘Fit window when selected’. If this is checked, the program will fit the Expedition workspace area to the differential pair traces that are highlighted.

Once selected, the information for the net is automatically gathered through the Constraint Editor System (CES) and presented to the user in the form. This information is the two electrical net names for the differential pair, the length for each electrical net, the length difference between them, and the differential pair tolerance constraint value. Next to each electrical net name is a ‘Highlight’ button. If clicked, then all that net’s features are highlighted in the Expedition workspace.

Line width and spacing information for the differential pair needs to be manually entered in this section of the form. The spacing value is shown as variable ‘S’ in Figure 1. The ‘OK’ button is then clicked to enable the ‘Sawtooth’ and ‘Routing’ sections of the form.

Sawtooth section
This is where the user describes the parameters for the sawtooth bump that will be added to the trace. The parameters are spacing, length, quantity and type. The spacing parameter is the distance between the edge of the opposing trace segment and the edge of the ‘bumped’ sawtooth segment. It is shown in Figure 1 as variable S1. This value is automatically filled out to be just less than double the spacing value from the differential pair section.

The length parameter defines how long the ‘bumped’ sawtooth segment is to run after the angle. This is shown as variable >3w in Figure 1 (i.e., greater than three times the width of the trace).

The quantity (‘Qty’) parameter states how many sawtooth segments the user wants to add. This can be an integer number entered into the EditBox control, or the user can click on a ‘Maximum’ CheckBox control to keep adding them until the length matching is within tolerance or there is no more room on the trace, whichever comes first.
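In ‘Maximum’ mode the stopping rule is essentially the following (an illustrative Python rendering, not the article’s VBScript; the length gained and room consumed per bump are assumed parameters):

```python
def add_bumps_until_matched(mismatch, tolerance, room_left,
                            bump_length_gain, bump_room_cost):
    """Add sawtooth bumps while the pair is out of tolerance and the
    trace still has room; returns the number of bumps added."""
    bumps = 0
    while mismatch > tolerance and room_left >= bump_room_cost:
        mismatch -= bump_length_gain  # each bump lengthens the short trace
        room_left -= bump_room_cost   # and consumes space along the run
        bumps += 1
    return bumps
```

Whichever limit is hit first, tolerance met or room exhausted, ends the loop, matching the behavior described above.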

The type parameter is a RadioButton control and the value can only be positive or negative. This determines which way to ‘bump’ the trace with the sawtooth segment. ‘Positive’ will bump above horizontal and angled traces and bump to the right of vertical traces. ‘Negative’ will bump

FIGURE 2 The sawtooth tuning form

Source: Broadcom

01 ' Get the CES application object
02 Set objCesApp = GetObject(,"CES.Application")
03
04 ' Get the CES object
05 Set objCES = objCesApp.Tools("CES")
06
07 ' Get the Design object
08 Set objDesign = objCES.ObjectDefs.Design
09
10 For Each DiffPair In objDesign.DiffPairs
11     Set objDiffPairRow = objCES.ObjectDefs.Sheets("ENET").FindObject(DiffPair.Name)
12     ' Get the Pair Tol Max value
13     dpTol = objDiffPairRow.ConstraintEx("DP_TOL")
14     ' Get the Pair Tol Actual Value
15     dpAct = objDiffPairRow.ConstraintEx("ACT_DP_DELTA")
16     ' Check if this is in violation
17     If CDbl(dpAct) > CDbl(dpTol) Then
18         ' This is a violation so add it to ComboBoxDiffPairs
19         ComboBoxDiffPair.AddString(DiffPair.Name)
20     End If
21 Next

FIGURE 3 Populating the ComboBox

Source: Broadcom


in the Expedition workspace is a one-segment trace that belongs to the working differential pair that the user has selected in the form.

Programming
This section considers key programming techniques used in this program that the Automation Layer in the CES and Expedition provide. The language used for this program and in the examples to follow is VBScript.

Populating the ComboBox
Upon program start up, the ComboBox control in the Differential Pair section of the form is populated. In order to do this, the program will access the CES Application object. The code is shown in Figure 3 (p.47).

The key object needed here is a CES ‘Design’ object. This will let us access the ‘DiffPairs’ collection in order to check the Pair Tolerance constraint value as well as the actual difference in length between the lengths of the differential pair nets. Lines 1 through 8 do the steps necessary to retrieve the ‘Design’ object: first by getting the ‘Application’ object, then the CES object, and finally the ‘Design’ object itself on line 8.

Beginning on line 10, a ‘For Each’ loop is then used to traverse the ‘objDesign.DiffPairs’ collection. Within this loop, on line 11, the variable ‘objDiffPairRow’ is being set as a spreadsheet object, which is retrieved from the ‘ENET’ page in CES using the differential pair name as the argument for the ‘FindObject’ method. Using this object allows us to retrieve the constraint values we want through the ‘ConstraintEx’ property.

Line 13 gets the maximum pair tolerance and assigns the value to the variable ‘dpTol’, and line 15 gets the actual difference in length, assigning it to the variable ‘dpAct’. From

below horizontal and angled traces and bump to the left of vertical traces.

Routing section
In this section, the user fills in two parameters, ‘Start With’ and ‘Direction’.

‘Start With’ determines whether to start with the bump right away (‘Sawtooth’) or continue on the same path with a segment that is the length of the sawtooth and then apply the bump (‘Segment’).

The direction can be either right or left for horizontal and angled lines, and either up or down for vertical lines. This will determine the start/end-point of the working trace where the sawtooth segments begin to be added.

‘Apply’ button
This is at the bottom of the form and executes the sawtooth route modification on the selected trace in the Expedition document. It is not enabled unless the current selection

01 ' Get the CES application object
02 Set objCesApp = GetObject(,"CES.Application")
03
04 ' Get the CES object
05 Set objCES = objCesApp.Tools("CES")
06
07 ' Get the Design object
08 Set objDesign = objCES.ObjectDefs.Design
09
10 ' Get the Expedition Application object
11 Set objExpApp = GetObject(,"MGCPCB.Application")
12
13 ' Get the Expedition Document object
14 Set objExpDoc = objExpApp.ActiveDocument
15
16 ' Set the DiffPairName variable to be the value that is selected
17 DiffPairName = ComboBoxDiffPair.TextValue
18
19 ' Clear any select and highlight
20 objExpDoc.UnSelectAll()
21 objExpDoc.UnHighlightAll()
22
23 ' Get the DiffPair object
24 Set objDiffPair = objDesign.GetDiffPair(DiffPairName)
25
26 ' Get each Enet and fill out the EditName and EditLength controls appropriately
27 ct = 1 ' counter
28 For Each Enet In objDiffPair.Nets
29     ' Use Eval to get the control object based on the counter
30     Set EditName = Eval("EditDiffPairNet" & ct)
31
32     ' Use Eval to get the control object based on the counter
33     Set EditLength = Eval("EditDiffPairLength" & ct)
34
35     EditName.Text = Enet.Name
36
37     ' Get the CES spreadsheet row for this electrical net
38     Set objEnetRow = objCES.ObjectDefs.Sheets("ENET").FindObject(Enet.Name)
39
40     EditLength.Text = objEnetRow.ConstraintEx("ACT_LENGTH_TOF")
41
42     ' Highlight each physical net in the electrical net
43     For Each Pnet In Enet.PhysicalNets
44         ' Get the Net object from the Document Object
45         Set netObj = objExpDoc.FindNet(Pnet.Name)
46         ' Highlight the net
47         netObj.Highlighted = True
48     Next
49     ct = ct + 1
50 Next
51
52 ' Fill in the Length Difference and Pair Tolerance.
53 Set objDiffPairRow = objCES.ObjectDefs.Sheets("ENET").FindObject(objDiffPair.Name)
54 ' Get the Pair Tol Max value
55 dpTol = objDiffPairRow.ConstraintEx("DP_TOL")
56 ' Get the Pair Tol Actual Value
57 dpAct = objDiffPairRow.ConstraintEx("ACT_DP_DELTA")
58
59 EditLengthDifference.Text = dpAct
60 EditPairTolerance.Text = dpTol
61
62 ' Enable the Update and Ok buttons.
63 ButtonUpdateDiffPair.Enable = 1
64 ButtonPairOk.Enable = 1
65
66 ' Set the extents to the highlighted nets if the CheckFitWindow checkbox control is
67 ' checked.
68 If CheckFitWindow.Check = 1 Then
69     objExpDoc.ActiveView.SetExtentsToSelection True
70 End If

FIGURE 4 Selecting a differential pair name

Source: Broadcom

01 ' Get the CES application object
02 Set objCesApp = GetObject(,"CES.Application")
03
04 ' Get the CES object
05 Set objCES = objCesApp.Tools("CES")
06
07 ' Get the Design object
08 Set objDesign = objCES.ObjectDefs.Design
09
10 ' Get the Expedition Application object
11 Set objExpApp = GetObject(,"MGCPCB.Application")
12
13 ' Get the Expedition Document object
14 Set objExpDoc = objExpApp.ActiveDocument
15
16 Call Scripting.AttachEvents(objExpDoc, "docEvents")
17
18 Sub docEvents_OnSelectionChange(SelectionType)
19     Dim enable
20     Dim pcbTraces, netName, objDiffPair
21     Dim Enet, Pnet
22     enable = 0
23     Set pcbTraces = objExpDoc.Traces(epcbSelectSelected)
24     ' Make sure only 1 trace is selected
25     If pcbTraces.Count = 1 Then
26         netName = pcbTraces.Item(1).Net.Name
27         Set objDiffPair = objDesign.GetDiffPair(ComboBoxDiffPair.TextValue)
28         ' Verify the trace selected belongs to the working differential pair.
29         For Each Enet In objDiffPair.Nets
30             For Each Pnet In Enet.PhysicalNets
31                 If netName = Pnet.Name Then
32                     enable = 1
33                 End If
34             Next
35         Next
36     End If
37     ButtonApply.Enable = enable
38 End Sub

FIGURE 5 Enabling the apply button

Source: Broadcom


that has been selected in the ComboBox control. This is accomplished using the ‘ComboBox.TextValue’ attribute. With this differential pair name we can now get the ‘DiffPair’ object on line 24 from the ‘objDesign.GetDiffPair’ method.

Now that we have a ‘DiffPair’ object, we can loop through the child electrical nets within the ‘DiffPair’. Line 28 starts the appropriate ‘For Each’ loop, where the name of each electrical net is being placed in the appropriate ‘EditBox’ control on line 35. The length of the electrical net is then retrieved from the CES spreadsheet on line 38 and placed in the appropriate length ‘EditBox’ control on line 40. Next, the features for the electrical nets are highlighted in a nested ‘For Each’ loop starting on line 43. It loops through the child physical nets within the electrical net. First, the physical ‘Net’ object is retrieved from the Expedition Document object on line 45, and it is then highlighted on line 47.

After the electrical net information is filled out, we then must populate the length difference and pair tolerance ‘EditBox’ controls. This is done on lines 52 through 60: first, by retrieving the constraint values on lines 55 and 57 from the CES, and then by setting the text properties for the controls on lines 59 and 60.

We can now enable the ‘Update’ and ‘OK’ buttons within the differential pair section. This is done on lines 63 and 64 using their respective ‘Enable’ properties and setting them to 1.

There is a check on line 68 to see whether the 'Fit window when selected' checkbox is checked; if it is, the extents of the Expedition Workspace are set to the highlighted features on line 69.

Enabling the 'Apply' button

This button will not be enabled unless the current selection in the Expedition Workspace is a single trace that belongs to the working differential pair that the user has selected in the form. To check for this, the program uses the 'OnSelectionChange' event in the 'Document' object. Figure 5 shows the code.

First we must bind the document object event to our event handler function. This is done on line 7. The ‘AttachEvents’ method takes two parameters, the object reference and the event function prefix. We are specifying ‘docEvents’ as the function prefix that the automation layer will look for when an event occurs in the ‘Document’ object. The event handler function we define is ‘docEvents_OnSelectionChange’ on line 9. This event passes one argument, the ‘SelectionType’. Whenever a selection change occurs in the ‘Document’, this function will then execute. For our purposes, we want to verify that the selection is a single trace and is part of the differential pair currently selected in the form.

Line 14 gets the collection of 'Traces' selected in the 'Document'. Line 16 verifies that there is only one trace selected. Line 26 gets the physical net name of the selected trace. We then get the 'DiffPair' object from the CES on line 27 based

there, we can do the math on line 17 to see if this differential pair is in violation. If it is in violation, then the differential pair name is added to the 'ComboBox' control on line 19 using the 'AddString' method.
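The violation math referred to above is essentially a comparison of the pair's length difference against its tolerance. A minimal sketch, assuming the two net lengths and the pair tolerance have already been read from CES (the variable names and the 'objDiffPair.Name' property are illustrative assumptions):

```vbscript
' Hedged sketch of the violation check described above; net1Length,
' net2Length and pairTolerance are assumed to have been retrieved from
' CES already, and objDiffPair.Name is an assumed property.
lengthDelta = Abs(net1Length - net2Length)
If lengthDelta > pairTolerance Then
    ' The pair is in violation, so offer it in the ComboBox control
    ComboBoxDiffPair.AddString objDiffPair.Name
End If
```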

Selecting the differential pair

When the user selects a differential pair from the 'ComboBox' control, the electrical net names and lengths are then presented in the form automatically. Figure 4 shows the code. Along with the CES 'Application' object, we also use Expedition's 'Application' and 'Document' objects. 'Document' allows us to interact with the working PCB design. The object is set as the variable 'objExpDoc' on line 14 and is derived from the 'Application.ActiveDocument' property. This will be used to highlight the net's features and zoom to their extents in the Expedition Workspace.

Lines 10 through 14 retrieve the Document object: first by retrieving the 'Application' object on line 11 and then the 'Document' object on line 14. Line 17 sets a variable named 'DiffPairName' to be the differential pair name

01 ' Get the Trace object that is selected
02 '
03 Set pcbTraces = objExpDoc.Traces(epcbSelectSelected)
04 Set pcbTrace = pcbTraces.Item(1)
05
06 ' Layer that the trace belongs to
07 '
08 pcbTraceLayer = pcbTrace.Layer
09
10 ' Trace width
11 '
12 pcbTraceWidth = pcbTrace.Geometry.LineWidth
13
14 ' Get the Net object for the trace we're working on
15 '
16 pcbTraceNet = pcbTrace.Net.Name
17 Set pcbNetObj = objExpDoc.FindNet(pcbTraceNet)
18
19 ' Delete the original trace
20 '
21 pcbTrace.Delete
22
23 ' Get the parameters from the form
24 '
25 toothLength = CDbl(editToothLength.Text)
26 toothSpacing = CDbl(editToothSpacing.Text)
27 pairSpacing = CDbl(editPairSpacing.Text)
28 If radioToothType.Value = 0 Then
29     toothType = "positive" ' positive | negative: the side of the trace
30                            ' the sawtooth bump goes
31 Else
32     toothType = "negative"
33 End If
34 ' Number of teeth to add. "Max" for as many as the trace length allows
35 If editToothQty.Text = "Max" Then
36     addAmount = "Max"
37 Else
38     addAmount = CInt(editToothQty.Text)
39 End If
40 If radioStartWith.Value = 0 Then
41     startWith = "sawtooth" ' segment | sawtooth
42 Else
43     startWith = "segment"  ' segment | sawtooth
44 End If
45
46 Select Case radioDirection.Value
47
48     Case 0: routDirection = "right"
49     Case 1: routDirection = "left"
50     Case 2: routDirection = "up"
51     Case 3: routDirection = "down"
52
53 End Select
54
55 ' Do While room is left or the addAmount has been reached
56
57 ' Use the parameters and calculate the points array for this sawtooth
58 '
59 pntsArr(0,0) = sawtoothX1 : pntsArr(1,0) = sawtoothY1 : pntsArr(2,0) = 0.0
60 pntsArr(0,1) = sawtoothX2 : pntsArr(1,1) = sawtoothY2 : pntsArr(2,1) = 0.0
61 pntsArr(0,2) = sawtoothX3 : pntsArr(1,2) = sawtoothY3 : pntsArr(2,2) = 0.0
62 pntsArr(0,3) = sawtoothX4 : pntsArr(1,3) = sawtoothY4 : pntsArr(2,3) = 0.0
63 '
64 ' Add the sawtooth trace.
65 Set pcbAddTrace = objExpDoc.PutTrace(pcbTraceLayer, pcbNetObj, _
66                                      pcbTraceWidth, 4, pntsArr, _
67                                      Nothing, epcbUnitCurrent, epcbAnchorNone)

FIGURE 6 Applying the sawtooth modification

Source: Broadcom



< TECH FORUM > TESTED COMPONENT TO SYSTEM

on what is currently selected in the form's ComboBox control. From there, we can traverse the child electrical and physical nets to verify that the selected physical net is part of the 'DiffPair'. If it is, then the 'enable' variable is set to 1 on line 32 and will enable the 'Apply' button on line 37.

Applying the sawtooth modification

When the 'Apply' button is pressed, the program will then modify the selected trace to have the sawtooth bumps according to the parameters from the form. Figure 6 (p.49) shows the code.

First, information is gathered from the selected trace to be modified. This is done on lines 1 through 17. We need to know the layer the trace is on, the width of the trace, and finally we need to retrieve the 'Net' object that the trace belongs to. This information is necessary in order to execute the 'PutTrace' method for the sawtooth bump later down the line. The original trace is then deleted on line 21 using the 'Trace.Delete' method. The next step is to retrieve all the parameter values from the form's controls. This is done on lines 23 through 53.

Using all the parameters, we can now calculate where to start the sawtooth bumps and set a points array to use in the 'Document.PutTrace' method. This method is performed within a 'Do While' loop and executes only while there is enough room left to add a sawtooth and the user-defined number of sawtooth segments has not yet been reached. Lines 59 through 62 demonstrate setting a points array to be used, and line 65 executes the 'Document.PutTrace' method.
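The coordinate math behind lines 59 through 62 is not shown in Figure 6. As one way it could work, the sketch below computes the four points of a single sawtooth for a horizontal trace routed to the 'right'; the bump-height choice and the (curX, curY) bookkeeping are assumptions for illustration, not the article's actual geometry code.

```vbscript
' Hedged sketch: four points of one sawtooth bump on a horizontal trace
' routed to the "right". (curX, curY) tracks the current position along
' the trace; toothType picks which side of the trace the bump goes.
If toothType = "positive" Then
    bumpDir = 1.0
Else
    bumpDir = -1.0
End If
bumpHeight = pairSpacing ' assumed: bump height equals the pair spacing

sawtoothX1 = curX                              : sawtoothY1 = curY
sawtoothX2 = curX + toothLength / 2            : sawtoothY2 = curY + bumpDir * bumpHeight
sawtoothX3 = curX + toothLength                : sawtoothY3 = curY
sawtoothX4 = curX + toothLength + toothSpacing : sawtoothY4 = curY

' Advance so the next loop iteration starts after this tooth
curX = sawtoothX4
```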

Conclusion

This paper has demonstrated the power of the script form editor and the automation layer that Expedition and the CES have available. This problem is only one of many out there that can be solved using automation. We hope this paper will increase your interest in and knowledge of automation. If your PCB design group finds itself repeating time-consuming tasks, consider automation as the solution.

Broadcom
16215 Alton Parkway
Irvine, CA 92618
USA

T: +1 949 926 5000
W: www.broadcom.com


Attend a 2009 EDA Tech Forum® Event Near You

15 Worldwide Locations
• March 26 – Irvine, CA
• April 9 – Tempe, AZ
• May 5 – Austin, TX
• May 28 – Ottawa, ON
• June 16 – Munich, Germany
• August 25 – Hsin-Chu, Taiwan
• August 27 – Seoul, South Korea
• September 1 – Shanghai, China
• September 3 – Santa Clara, CA
• September 3 – Beijing, China
• September 4 – Tokyo, Japan
• September 8 – New Delhi, India
• September 10 – Bangalore, India
• October 1 – Denver, CO
• October 8 – Boston, MA

STAY ON THE FRONT LINE OF EE DESIGN

2009 Platinum Sponsors:



SECOND TO NONE
2 speed-grade advantage
2X the density
1/2 the power

www.altera.com

Copyright © 2008 Altera Corporation. All rights reserved.

Think AND, not OR. Our new 40-nm Stratix® IV FPGAs give you 2X the density AND a 2 speed-grade advantage AND half the power, with and without 8.5-Gbps transceivers. For production, combine them with our risk-free HardCopy® ASICs for all the benefits of FPGAs AND ASICs. Design with Quartus® II software for the highest logic utilization AND 3X faster compile times. When second is not an option, design with Stratix IV FPGAs.


