An empirical comparison of dependency issues in open source software packaging ecosystems

Post on 20-Mar-2017


An Empirical Comparison of Dependency Issues in Packaging Ecosystems

Alexandre Decan, Tom Mens, Maelick Claes
Software Engineering Lab, Belgium

SANER – Klagenfurt, Austria, February 2017

Packaging Ecosystem

A large collection of interdependent software packages…
…that can be installed and distributed using a package manager

Selected Examples:

Open Source Packaging Ecosystems

Bogart, Kastner, Herbsleb & Thung (FSE 2016): How to break an API: Cost Negotiation and Community Values in Three Software Ecosystems


Package Dependencies

Are necessary
•  Increase modularity and evolution
•  Facilitate reuse
•  Reduce complexity


Package Dependencies

Are omnipresent

                    CRAN    RubyGems    npm
Language            R       Ruby        JavaScript
Packages            10K     123K        317K
Dependencies        22K     183K        728K
All pkg. releases   57K     685K        2000K
All dependencies    128K    1675K       7500K


Most packages depend on another one

April 2016
•  npm ~60%
•  RubyGems ~60%
•  CRAN ~70%


Package Dependencies

Are difficult to manage
– Transitive dependencies


Dealing with the deeply nested dependencies has caused us no end of frustrations. A dependency of a dependency of a dependency breaks and we’re left trying to trace the source of the error and figure out which repo to open an issue on.

The Problem of Transitive Package Dependencies

http://www.haneycodes.net/npm-left-pad-have-we-forgotten-how-to-program/


The Problem of Transitive Package Dependencies

This impacted many thousands of projects. [...] We began observing hundreds of failures per minute, as dependent projects – and their dependents, and their dependents... – all failed when requesting the now-unpublished package.


The Problem of Transitive Package Dependencies

>2% of all npm packages relied on left-pad.
Left-pad is not an exception:

Evolution of the number of packages having a relative impact >2%
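The notion of relative impact used on this slide can be illustrated with a minimal sketch. It assumes that the relative impact of a package is the share of all packages in the ecosystem that depend on it directly or transitively; the exact definition used in the study may differ. The toy ecosystem and the name `pad-lines` are hypothetical.

```python
from collections import defaultdict, deque


def transitive_dependents(dependencies):
    """Map each package to the set of packages depending on it, directly or transitively.

    `dependencies` maps a package name to the list of packages it directly requires.
    """
    # Invert the graph: for each required package, record who requires it directly.
    direct_dependents = defaultdict(set)
    packages = set(dependencies)
    for pkg, required in dependencies.items():
        for req in required:
            direct_dependents[req].add(pkg)
            packages.add(req)

    result = {}
    for pkg in packages:
        seen = set()
        queue = deque(direct_dependents[pkg])
        # Breadth-first walk over "is required by" edges.
        while queue:
            dependent = queue.popleft()
            if dependent in seen or dependent == pkg:
                continue
            seen.add(dependent)
            queue.extend(direct_dependents[dependent])
        result[pkg] = seen
    return result


def relative_impact(dependencies):
    """Share of all packages that (transitively) depend on each package (assumed definition)."""
    dependents = transitive_dependents(dependencies)
    total = len(dependents)
    return {pkg: len(deps) / total for pkg, deps in dependents.items()}


# Hypothetical toy ecosystem, not real registry data.
toy = {
    "app": ["pad-lines"],
    "pad-lines": ["left-pad"],
    "cli": ["left-pad"],
    "left-pad": [],
    "unrelated": [],
}
impact = relative_impact(toy)
print(impact["left-pad"])  # 0.6
```

In this toy graph, `app` never names `left-pad` itself, yet it still breaks if `left-pad` disappears, which is the situation the quotes on the previous slides describe.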


Package Dependencies

Some packages have very high impact (>30%)

[Figure: Relative number of (transitive) dependents for the most required package. X-axis: 2011 to 2016. Y-axis: ratio of packages (0.0 to 0.5). Series: npm, CRAN, RubyGems.]


[...] the risk of things breaking at some point due to the fact that a version of a dependency has changed without you knowing about it is immense. That actually cost us weeks and months in a couple of professional projects I was part of.

The Problem of Incompatible Package Updates

One recent example was the forced roll-back of the ggplot2 update to version 0.9.0, because the introduced changes caused several other packages to break.


The Problem of Incompatible Package Updates

41% of observed errors caused by incompatible updates.
On average, one backward incompatible update per 20 new releases

Decan, Mens, Claes & Grosjean, SANER 2016: “When GitHub meets CRAN: an analysis of inter-repository package dependency problems.”

In 2010, release 0.5.0 of i18n broke the popular ActiveRecord gem…
… on which relied 874 packages...
... which represents 5.2% of all packages!


Possible solutions to package dependency management

Solutions tend to be ecosystem-specific
1.  Package Update Policy
2.  Semantic Versioning
3.  Dependency Constraints
4.  Continuous Integration Tools


1. Package Update Policy

Possible solutions to package dependency management

Submitting updates should be done responsibly and with respect for the volunteers’ time. Once a package is established (which may take several rounds), “no more than every 1–2 months” seems appropriate.

Changes to CRAN packages causing significant disruption to other packages must be agreed with the CRAN maintainers well in advance of any publicity.


One recent example was the forced roll-back of the ggplot2 update to version 0.9.0, because the introduced changes caused several other packages to break.


How frequently are packages updated?

•  Packages tend to be updated shortly after a previous update.

•  Packages required by other packages are updated more frequently.


Possible solutions to package dependency management

2. Semantic versioning: MAJOR.MINOR.PATCH
– MAJOR = breaking changes are allowed
– MINOR = only backward compatible updates
– PATCH = only bug and security fixes

While semantic versioning can be suggested, it cannot be enforced!

release 0.5.0 of i18n broke 875 packages (i.e., 5% of the ecosystem)
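The MAJOR.MINOR.PATCH scheme above can be read as a classification of each update. The following is a minimal sketch, assuming plain three-part numeric versions with no pre-release or build suffixes; the function names are illustrative and not taken from any package manager.

```python
def parse_version(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def update_kind(old, new):
    """Classify an update by the most significant version component that changed."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {old}")
    if new_v[0] != old_v[0]:
        return "major"  # breaking changes are allowed
    if new_v[1] != old_v[1]:
        return "minor"  # only backward compatible updates are expected
    return "patch"      # only bug and security fixes are expected


print(update_kind("2.4.0", "3.0.0"))  # major
print(update_kind("2.4.0", "2.5.0"))  # minor
print(update_kind("2.4.0", "2.4.1"))  # patch
```

The label only states what the version number promises. As the slide notes, nothing enforces that a release numbered as a minor update is actually backward compatible.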


3. Dependency Constraints
•  Minimal constraint  pkg >= 2.4.0
•  Maximal constraint  pkg < 3.0.0
•  Strict constraint  pkg == 2.4.0

Proportion of packages (straight lines) and proportion of dependencies (dotted lines) that use a dependency constraint.
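The three kinds of constraint listed on this slide can be checked mechanically. The sketch below assumes three-part numeric versions and the generic operator syntax shown on the slide (`>=`, `<`, `==`); real package managers each have their own, richer constraint syntax, and the helper names here are hypothetical.

```python
import operator

# Comparison operators accepted by this sketch.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_version(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    return tuple(int(part) for part in version.strip().split("."))


def parse_constraint(text):
    """Split a constraint such as '>=2.4.0' into (comparison function, version tuple)."""
    text = text.strip()
    # Try two-character operators before their one-character prefixes.
    for symbol in (">=", "<=", "==", "<", ">"):
        if text.startswith(symbol):
            return OPERATORS[symbol], parse_version(text[len(symbol):])
    raise ValueError(f"unsupported constraint: {text!r}")


def satisfies(version, constraints):
    """Return True if `version` meets every constraint in `constraints`."""
    candidate = parse_version(version)
    return all(compare(candidate, bound)
               for compare, bound in map(parse_constraint, constraints))


print(satisfies("2.4.0", [">=2.4.0", "<3.0.0"]))  # True
print(satisfies("3.0.0", [">=2.4.0", "<3.0.0"]))  # False
print(satisfies("2.4.1", ["==2.4.0"]))            # False
```

Combining a minimal and a maximal constraint accepts later minor and patch releases while excluding the next major release; a strict constraint accepts exactly one release and therefore no later fixes.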

Possible solutions to package dependency management


3. Dependency Constraints
•  Minimal constraint  pkg >= 2.4.0
•  Maximal constraint  pkg < 3.0.0
•  Strict constraint  pkg == 2.4.0

Possible solutions to package dependency management

Proportion of packages with dependencies (straight lines) and dependencies (dotted lines) that specify a strict, minimal or maximal dependency constraint.


3. Dependency Constraints
•  Minimal constraint  pkg >= 2.4.0
•  Maximal constraint  pkg < 3.0.0
•  Strict constraint  pkg == 2.4.0

Possible solutions to package dependency management

“we continued to observe many errors. This happened because a number of dependency chains [...] explicitly requested 0.0.3.”


Possible solutions to package dependency management

Constraints that require a specific subset of accepted versions

Can lead to co-installability issues

May prevent a package from benefiting from updates
E.g.: security fixes in C 1.4.1

[Diagram: A depends on C with constraint <= 1.4.0 (C 1.4.0); B depends on C with constraint >= 1.4.1 (C 1.4.1)]
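The conflict between A and B on this slide can be reproduced in a few lines. This is a sketch under two assumptions: only one version of C can be installed at a time, and the list of available C releases is hypothetical (only 1.4.0 and 1.4.1 appear on the slide).

```python
# Versions as integer tuples so that comparison follows MAJOR.MINOR.PATCH order.
available_c = [(1, 3, 0), (1, 4, 0), (1, 4, 1)]  # 1.3.0 is a hypothetical extra release

# Each dependent package states which versions of C it accepts.
accepts = {
    "A": lambda v: v <= (1, 4, 0),  # A requires C <= 1.4.0
    "B": lambda v: v >= (1, 4, 1),  # B requires C >= 1.4.1
}


def co_installable(versions, dependents):
    """Versions of the shared dependency accepted by every listed dependent."""
    return [v for v in versions if all(accepts[name](v) for name in dependents)]


print(co_installable(available_c, ["A", "B"]))  # []
print(co_installable(available_c, ["A"]))       # [(1, 3, 0), (1, 4, 0)]
print(co_installable(available_c, ["B"]))       # [(1, 4, 1)]
```

The empty first result is the co-installability issue: no single version of C satisfies both A and B. The second result shows the other cost of the maximal constraint, since A can never pick up the fixes shipped in C 1.4.1.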


Possible solutions to package dependency management

4. Continuous integration management

Automated monitoring of dependency updates and security issues
e.g., Gemnasium, Requires.io, DependencyCI, GreenKeeper
… only monitor direct dependencies, not transitive ones

Automated testing for breaking changes
e.g., travis-ci, codeship
… help to detect breaking changes but not to address them


Empirical comparison of 3 packaging ecosystems

Need to find the right balance between
–  having up-to-date dependencies
–  facing the risk of backward incompatible changes

Requires a combination of
–  technical solutions (constraints, CI)
–  social responsibilities


Are micro-packages harmful?
– 11 lines of the left-pad package breaking >6000 packages?

Is installing packages directly from GitHub harmful?
– No specific notion of version (only commits and tags)
– Will make package management even more problematic
