
A scaling problem survival guide

By Alexandre De Barros

Advanced Support Group

The MathWorks, Inc.


Copyright 2016 – The MathWorks Inc.

Introduction

This guide has been written to help Polyspace Code Prover users who face a “scaling problem”. It aims to be short, not too technical, but practical.

The document starts with an explanation of why these scaling problems occur, and then presents what you can do about them. You will see that there are actually several options you can use.

But first, let’s see what scaling problems are.

Why do scaling problems occur (and why it is not systematically a problem)?

Polyspace Code Prover is a code verifier. Its job is to statically prove the absence of run-time errors.

Exhaustively verifying the absence of run-time problems (i.e. on every operation) is a complex process, more complex than just finding bugs as other static analysis tools do.

This complexity increases with several factors, one of the most important being the size of the application. Logically, the more code you give to Polyspace, the more complex the verification is.

Other factors also impact the complexity, such as the complexity of the code itself: it is more difficult to prove the absence of run-time errors on code that involves pointers to structures nested five levels deep, with fields that are themselves pointers, than on code that only accesses scalar values.
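As a hypothetical illustration of the two extremes (the names and structures below are invented for this sketch, not taken from the guide):

```c
#include <assert.h>

/* Easy for a prover: only scalar values, so the range of the
   result can be tracked exactly. */
int scale(int x) {
    return x * 2 + 1;
}

/* Harder: several levels of indirection through structures whose
   fields are themselves pointers. The prover must track which
   object each pointer may reference at every call site, and each
   `->` dereference needs its own validity check. */
struct leaf   { int *value; };
struct inner  { struct leaf  *leaf; };
struct middle { struct inner *inner; };
struct outer  { struct middle *middle; };

int deep_read(struct outer *o) {
    return *o->middle->inner->leaf->value;
}
```

The second function is only four lines longer, but the number of pointer checks, and the number of objects each pointer may alias, grows with every level of indirection.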

But generally speaking, the size of the application is the first factor that can lead to a scaling problem.

There is so much computation to perform and so much data to handle that the verification process either fails with an out-of-memory error or shows no progress for hours or days.

This is even more true with today’s applications, which are bigger and bigger and often include libraries.

Some words about how Code Prover works

In order to compute the colors (a.k.a. checks) Polyspace Code Prover performs several “Verification

levels” (also called “Software safety analysis levels”).

You can find the verification levels in the project setup:


Figure 1 - the verification levels

They go from 0 to 4 (level 2 being the default). To be precise, you can even choose the level “other”, which corresponds to level 20 (!).

Polyspace Code Prover starts producing the colored checks at level 0, and each subsequent level uses the results of the previous one to prove more checks, i.e. to turn orange checks (also named “unproven checks”) into green, red or grey.
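As a small illustrative sketch (invented for this guide, not taken from the Polyspace documentation), the three colors can be pictured on a division check (ZDV):

```c
#include <assert.h>

/* Green ZDV: the divisor is provably non-zero.
   x % 100 is in [-99, 99], so d is always in [2, 201]. */
int always_ok(int x) {
    int d = (x % 100) + 101;
    return 1000 / d;
}

/* Orange (unproven) ZDV: d depends on the caller, and without
   knowing every call context the prover cannot decide whether
   d can be 0 here. */
int maybe_ok(int d) {
    return 1000 / d;
}

/* Red ZDV: the divisor is provably zero on this path, so the
   division is a certain run-time error. */
int never_ok(void) {
    int d = 0;
    return 1000 / d;
}
```

Later verification levels may turn an orange like the one in `maybe_ok` green (or red) once more is known about the values flowing into `d`.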

The statistics in terms of checks are given in the verification log, at the end of each verification level.

For example, here is what you can see in the log of the demo example “Demo_C”:

**** Software Safety Analysis Level 0 - 111 (P_R)

**** Software Safety Analysis Level 0 - 111 (P_R) took 0real, 0u + 0s

*mmu 109

Generating GUI files

Checks statistics: (including internal files)

- IRV => Green : 33, Orange : 0, Red : 0 (100%)

- OVFL => Green : 52, Orange : 8, Red : 0 (88%)

- STD_LIB => Green : 0, Orange : 0, Red : 1 (100%)

- NIP => Green : 16, Orange : 0, Red : 0 (100%)

- NIVL => Green : 105, Orange : 7, Red : 0 (94%)

- NIV => Green : 31, Orange : 0, Red : 0 (100%)

- OBAI => Green : 1, Orange : 1, Red : 1 (75%)

- ZDV => Green : 15, Orange : 1, Red : 0 (94%)

- IDP => Green : 9, Orange : 2, Red : 1 (86%)

- ASRT => Green : 0, Orange : 4, Red : 0 (0%)

TOTAL: => Green : 262, Orange : 23, Red : 3 (92%)

Number of NTL : 1

Number of NTC : 3

Number of UNR : 5

Number of functions defined but never called : 1 / 44

Number of functions called from unreachable code: 0 / 44


We can see that Code Prover was already able to detect three red errors and to prove 92% of the operations.

Verification level 0 is thus the least precise level, but it is already very interesting and able to produce meaningful results.

So it is time to say something very important when dealing with scaling problems:

Reaching the level 2 should not be a goal in itself!

The verification level is just a precision option, a parameter of the verification.

Coming back to Demo_C: interestingly, at the end of level 2 the statistics are almost identical, the only difference being that one orange check has turned green. Hence, if the scaling problem occurs in level 1, you should not consider it a scaling problem. It is just that, due to the complexity of the verification, Polyspace was not able to be more precise than level 0.

So if you are reading this guide because your application is stuck in level 1, realize that the verification has produced results: the precision is lower than the one specified in the project, but results are available (also, take a look at the log and note whether red errors were found in level 0).

Now, scaling problems can also occur before level 0.

What to do?

Logically enough, the first thing to do is to see why the Polyspace verification does not scale.

For that, you will have to take a look at the verification log, since it contains, among other things, the

basic code metrics of the application: the number of files, number of lines and number of lines

without comments.

Where to find the metrics in the log?

The log is made of several parts. Here is what you find at the beginning:

- Options used for the verification
- Verification starting date
- Verification of the sources (“Verifying C sources”)
- Code metrics

The code metrics will look like this:

Number of files : 939

Number of lines : 573568

Number of lines without comments : 316575


The last number (number of lines without comments) is the most interesting one.

As explained in the introduction, the complexity of a verification is impacted by several factors, so the number of lines is not the only criterion to take into account, but it is the most important one.

Another factor is the complexity of the code itself. Not surprisingly, proving code that manipulates lots of pointers (which could point to multiple objects at run-time), nested structures, and conditions everywhere (increasing the number of paths in the control flow) is a more complex task than proving more “basic” code.

This is even more true with C++ code and the nature of object-oriented programming: because of inheritance, even behind a single assignment there are potentially several function calls that Polyspace has to consider. The C++ libraries are also bigger and more complex.

For a C application, you can consider that 100 000 locwc (lines of code without comments) is huge and that scaling problems can appear.

An application of 200 000 locwc will very likely not succeed because the verification will not scale.

No need to say what can happen with an application of 300 000 locwc…

Now, things could change in the future, and the Polyspace development team is working on improving the product so it can handle more code, but for the moment these are the figures for what we can consider large and scaling-prone Polyspace verifications.

So, what options do I have?

There are several things you can try.

The first one should be to…

Use the Bug Finder

There are lots of good reasons to use Bug Finder. The first one is that there is no scaling problem with this tool. A Bug Finder analysis is indeed a much less complex process than a Code Prover verification (Bug Finder does not try to prove things). It can therefore handle huge applications.

Thanks to Bug Finder, users can find the future red checks, but much more quickly. Since it is better not to have run-time errors in the code before having all of it verified, it is more convenient to find them earlier with Bug Finder. Fixing the run-time errors earlier will actually save you time.

Moreover, Polyspace Bug Finder is able to find defects that are not detected by Code Prover like

double free, memory leaks… There are actually more than 140 defects that can be detected.
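Defects of this kind can be pictured with a small sketch (the function names below are invented for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* Memory leak: `buf` is never freed on the early-return path. */
char *copy_name(const char *name) {
    char *buf = malloc(64);
    if (buf == NULL)
        return NULL;
    if (strlen(name) >= 64) {
        return NULL;          /* defect: buf leaks here */
    }
    strcpy(buf, name);
    return buf;
}

/* Double free: p is released twice. */
void process(void) {
    char *p = malloc(16);
    free(p);
    /* ... */
    free(p);                  /* defect: double free */
}
```

Neither pattern is a run-time check that Code Prover colors, but a defect checker in the Bug Finder family reports both.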

Of course, the greatest benefit of Code Prover is its ability to prove the absence of run-time errors.

You may then still want to use Code Prover, even on a huge application.

Here are different strategies you can try.


Reduce the complexity

Because the root cause of the scaling issue is the complexity, we have to find a way to reduce it.

This can be achieved:

- by options
- by splitting the code
- by reducing the code

Using scaling options

Here the idea is to ask Polyspace to be less precise in its “check proving” process, so that the verification is less complex and therefore less “scaling-prone”.

Here are the options that you can set in your project.

The first one is the Precision level:

Figure 2 - Precision level

It goes from 0 to 3, 0 being the least precise mode.

This parameter does not target a specific construct; it impacts the precision in general.


Then there are the scaling options:

Figure 3 - The scaling options

“Optimize large static initializers” (-no-fold): reduces the complexity induced by large arrays.

“Inline”: the functions given as arguments to this option will be “inlined”, i.e. Polyspace will consider each call to such a function as a call to a new, dedicated function. Polyspace Code Prover will suggest in the log the functions that are worth inlining. You will find messages like:

* inlining WRITE_ARRAY_TO_TX_QUEUE could decrease the number of aliases of parameter #1 from 389 to 2

* inlining READ_EEPROM_LOOP_0 could decrease the number of aliases of parameter #3 from 161 to 12
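The alias counts in such messages can be pictured with a small hypothetical sketch (invented names): a shared helper is called with many different pointers, so its parameter may alias many objects at once; inlining gives each call site its own copy of the helper, with a single possible target per copy.

```c
#include <assert.h>

#define N 8

int buffers[N][4];

/* Without inlining, the prover analyzes write_slot once, with
   `dst` possibly aliasing any of the N buffers below (many
   "aliases of parameter #1"). */
static void write_slot(int *dst, int value) {
    *dst = value;
}

void fill_all(void) {
    /* Each call passes a different pointer. Inlining the helper
       lets the prover treat every call site separately. */
    for (int i = 0; i < N; i++)
        write_slot(&buffers[i][0], i);
}
```

The trade-off is that inlining duplicates the analysis of the function body, so it is worth applying only to the functions the log points at.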

Launch the verification in “unit-by-unit” mode

This mode is designed to verify the application in a context where, instead of considering all the files of the application as a whole, each file is treated as an independent unit. So if your application is made of three files, three independent verifications will be launched:


Figure 4 - whole application vs unit-by-unit mode

The first benefit of this mode is of course to prevent scaling problems, since only small verifications now take place.

This mode is also a good way to find local problems (i.e. problems in a single file) faster, and thus to fix them earlier, without having to wait for the end of the whole-application verification. This is why we usually recommend starting with a unit-by-unit verification, fixing the local problems, and eventually continuing with the “whole application” mode.

This mode is well suited to applications where the files are “naturally” independent, for example libraries. It is also a good opportunity to check whether some files take longer to verify than others. These files, probably complex since Polyspace needs time to verify them, could be the cause of a future scaling issue in the “whole application” mode. This approach has been adopted successfully by most of our customers, especially in the automotive sector.

This mode is also very easy to use. It is just an option to set:


Figure 5 - the "Verify file independently" option

Since R2016a, users can see the summary of statistics for the units directly in the graphical interface:

Figure 6 - statistics for unit verifications

As seen in the figure below, since each file is verified independently, there will be more stubbed objects, and hence more “full-range” values, leading to more unproven (orange) checks in the unit results.
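To illustrate why unit results contain more oranges, consider a sketch like the following (the names are invented; the stand-in body of `sensor_index` only makes the sketch self-contained, whereas in a real project it would live in another file):

```c
#include <assert.h>

/* In the real project this function is defined in another file
   and returns a value in [0, 9]. */
int sensor_index(void) { return 7; }

int table[10];

/* Unit-by-unit mode: the defining file is absent, so Polyspace
   stubs sensor_index() and assumes a full-range return value,
   making the array access an unproven (orange) check.
   Whole-application mode can prove the index stays in [0, 9]. */
int read_sensor(void) {
    return table[sensor_index()];
}

/* A defensive bound keeps the check green even in unit mode. */
int read_sensor_safe(void) {
    int i = sensor_index();
    if (i < 0 || i >= 10)
        return -1;
    return table[i];
}
```

This is the sense in which unit oranges can drive improvements: guarding the index makes the file robust regardless of what its callers and neighbors do.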


Figure 7 - independent verifications

But on the other hand, it is an opportunity to take a look at the unproven checks and see whether the file can be hardened against these problems.

Do module verifications

The idea here is similar to the unit-by-unit mode (splitting the code to divide the complexity), but with modules, i.e. sets of files that represent a functionality. The files that are part of a module are strongly coupled with each other, and less so with the rest of the application. It thus makes sense to verify these files together.

Modules can be easily created in the graphical interface:


Figure 8 - modules in a project

In the example above, the application has been split into three modules.

The idea when splitting the application is to create meaningful modules adapted to the architecture of the code, for example based on the application components.

Use the modularization

The idea of modularization is to help the user split the code automatically, thanks to a “smart split” of the verification into modules. This smart split is done by analyzing the dependencies between functions and variables, in order to partition the application into modules with low coupling. The tool will analyze the code and a MATLAB window will be displayed. Select the best module number, from 2 to 10 (click on a grey column). Select a number just after a big drop in the maximum complexity and before a big increase in the number of public functions and variables. If there is no clear choice, then choose “Separate functions and variables” in Menu > Public Entities.

After you have selected a module number, a new project called <project>_N_modules, with N modules to verify, is created. Finally, you just have to click ‘Batch Run’.

More information here:


http://www.mathworks.com/help/codeprover/ug/automatically-modularize-large-applications.html

If verification level 0 was not reached because of the scaling problem, it is still possible to launch the modularization manually.

As a prerequisite, your verification needs to have reached the end of the internal phase P_CMS in order to be able to launch the modularization. You can check that in the verification log:

**** C to intermediate language translation - 5 (P_CMS)

Then you need to copy all the files with the .mdg extension from the C-ALL folder to the ALL folder.

Remove parts of the code

Here, the idea is to stay in the “whole application” mode (i.e. no split and no modules) but to see whether some parts of the application can be removed.

For example: is it worth verifying the code of third-party libraries? Or: are there parts of the application that are not interesting to verify?

