
Growth and popularity in markets for free digital products

Gil Appel
Marshall School of Business
University of Southern California

Barak Libai
Arison School of Business
Interdisciplinary Center (IDC)

Eitan Muller
Stern School of Business
New York University
Arison School of Business
Interdisciplinary Center (IDC)

June 2016

The authors would like to thank Gal Elidan, Zvi Gilula, Jacob Goldenberg, Hema Yoganarasimhan, Scott Neslin and Oded Netzer for their advice and helpful comments during the research process.



Abstract

Free digital products (FDPs) dominate online markets, yet our knowledge and theories about

their growth are based mainly on conventional goods. We demonstrate how FDPs’ growth

dynamics differ from those observed for conventional new products, using a large-scale dataset

that documents the growth of close to 60,000 FDPs, and supported by an additional growth

analysis of thousands of mobile apps. We find that FDPs display three distinct patterns of

growth: bell-shaped pattern (“Diffuse”); exponential-type decline (“Slide”); and a combination

of the two (“Slide and Diffuse”). We further show a robust relationship between FDP popularity

and growth pattern ubiquity, providing the first evidence of a correlation between products’

popularity and growth patterns. We further show how FDP-related growth phenomena help to

explain the patterns that emerge, and elucidate the need to adapt our knowledge on new product

growth and its modeling to the fast-moving world of free digital products.

Keywords: diffusion of innovations; free products; mobile applications; product life cycle; social influence; software


1. Introduction

An intriguing development in the consumer market landscape is the substantial increase in

the number of digital products available for free (Anderson 2009). Free digital products (FDP)

have been available for a while for computer software products supplied via online platforms,

joined recently by similar FDPs for smartphones and web applications. Some of this availability

stems from the “freemium” business model, under which a certain percentage of adopters will

eventually upgrade to a less restricted version or purchase in-app byproducts (Kumar 2014). Yet

the increase in FDPs also follows other developments such as the rise of open-source software

collaboration projects, where many users join forces to produce software products that will be

free except for technical support (Mallapragada et al. 2012). Recent reports highlight the

ubiquity of the phenomenon: More than 90% of recently downloaded smartphone applications

were free, with this percentage expected to continue rising in the foreseeable future (Olson 2013;

AppBrain 2016). In established markets (e.g., task management tools and anti-virus programs), a

fierce battle is being waged between “freemium” and “premium” business models (Dunn 2011;

Woods 2013).

The question we put forth is whether the new product growth generalizations and insights

developed over the years in markets for conventional products apply also to FDPs. In particular,

we focus on the essential shape of growth. Previous research has maintained that growth in

digital environments in general (Rangaswamy and Gupta 2000) and FDP in particular (Jiang and

Sarkar 2010; Lee and Tan 2013) follows the commonly observed S-shaped diffusion patterns –

with bell-shaped non-cumulative growth – and can thus be analyzed using traditional diffusion

models. However, when we began to examine the growth patterns of tens of thousands of FDPs

(to be described later), the picture that emerged was different: While a bell-shaped pattern


implies growing demand early on, we find that in the FDPs examined, the correlation of month-

to-month growth in the first year was positive for only about 40% of the products. What further

stood out was that the extent of the phenomenon was highly correlated to the level of product

popularity: The percentage of positive correlations decreased monotonically with the popularity

level, down to 26% for products in the bottom 10% of popularity.

This is not the conventional pattern of growth we read about in the new product textbooks.

What can drive this phenomenon? Note that conventional products are associated with

significant R&D costs, as well as costs of manufacturing, marketing, and maintaining market

presence. Therefore, firms will invest in screening and testing of the products before market

launch, and will be motivated, internally or due to channel pressure, to take a product off the

shelf if it seems to fail. The case of FDP differs, in particular due to the low barriers to

development and introduction of many digital products into the market, which may be reflected

in two ways: First, the cost of adapting the product and offering it to small, specific niches is

low, which leads to a “long tail” of supply (Brynjolfsson et al. 2010). If past research

emphasized the ability of digital channels to enable a long tail of physical goods (Brynjolfsson et

al. 2011), then the fact that the goods themselves are digital enables even better tailoring to small

niches. Second, given the lack of barriers, there is an increased presence of small and less

experienced suppliers with low resources to invest in marketing, so that we can expect to find

many products whose low popularity stems from inability to reach larger audiences even if

targeted otherwise. Indeed, it is reported that a large share of FDPs are considered failures, and

eventually do not even cover development costs (Foresman 2012; Rubin 2013).

Overall, whether the niche market was intended at the outset or not, consumers should face

a large share of low-popularity products when considering supply in the FDP market, at least in


the absolute number of offerings1. Empirical data suggest that this is the case for markets such as

free PC software (Zhou and Duan 2012) and mobile applications (Zhong and Michahelles 2012).

In early 2016, for example, among the 1.87 million free Android apps available, more than 60%

had fewer than 1,000 downloads, and only about 1% were downloaded more than 1 million times

(AppBrain 2016).

This phenomenon raises an interesting question regarding the prevalent growth pattern of

FDPs. Our knowledge on the diffusion of new products has been largely shaped in markets such

as durables, pharmaceuticals, and services looking typically at highly popular cases of growth

(Peres et al. 2010). In fact, one of the essential concerns with the understanding of innovation

diffusion is that nearly all knowledge comes from successful innovations (Greve 2011; Rogers

2003). This lack of evidence on the growth pattern for what may be the majority of the FDP

market is an issue of significant managerial and theoretical importance. The shape of the growth

curve is considered “the most important and most widely reported finding about new product

diffusion” (Chandrasekaran and Tellis 2007). Studying growth patterns is a fundamental stepping

stone to the understanding of markets for new products: It is used to understand the driving

forces of new products’ success; as a base for modeling and optimizing firm behavior in the

context of new product introductions; for decisions of termination or further support for new

products; and for segmentation by adoption times (Golder and Tellis 1997; Peres et al. 2010).

Here we study the full spectrum of growth patterns in FDPs, providing comprehensive

evidence for a fundamental difference between the growth of highly studied superstars, and the

growth of the less popular majority. The ability to track information in the case of FDP markets

provides an opportunity to conduct a large-scale analysis in a way seldom available to past new

1 This is independent of the question of the share of downloads by various segments of the popularity curve in such markets (Brynjolfsson et al. 2010).


product growth researchers, and to overcome the problem of left-truncation bias due to lack of data

on the product’s early days (Jiang et al. 2006). We use data on the monthly level of downloads

from launch-day of a large number of software products in multiple categories, with downloads

per product ranging from a few hundred to millions, making this one of the largest new product

diffusion studies to date. Our main data source is the SourceForge database, which enables us to

study the growth of almost 60,000 free software products. We are able to complement this

analysis by also looking at data on the growth of close to 7,000 mobile apps, which shows

consistent results. The main insights can be summarized as follows:

Three pattern archetypes dominate the growth of FDPs in our datasets: a bell-shaped

curve (largely left skewed) that we label diffuse, an exponential-like decline starting at

launch labeled slide, and a combination of the first two – slide & diffuse. Diffuse patterns

represent about half of the cases in our database.

The dynamics that lead a product to the “underdog” part of the long tail differ from the

pattern that leads a product to become a superstar, as the ubiquity of the three archetypes

is strongly related to the popularity of the products. Bell shapes are dominant in popular

products, yet become a minority in small niche products. The fact that the very popular

products are almost exclusively bell shaped may help to explain how previous research,

which has been based on popular products, missed this relationship.

Two phenomena that characterize FDP markets help explain the shape of growth: The

first is the inception effect, representing disproportional early-onset external effects,

which explains the slide phenomenon in the presence of social influence. The second is

the recency effect, which implies that in free digital markets, recent adoptions (and not

only cumulative adoptions as traditionally used in diffusion models) help explain the

dynamic effect of social influence on growth.

Recency is in particular important in helping to differentiate between popular and less

popular products. The association of recency and growth is more than double among the

top popular 10% compared to the bottom 10% in popularity. We further find evidence

that recency level in a category is associated with the shape of the popularity curve, so


that higher average recency level in the category is associated with higher inequality,

captured by the Gini coefficient.

These findings are significant to our understanding of FDP growth, and to attempts to

model and optimize growth in such markets. In a broader theoretical sense, these findings imply

that generalizations that developed along the product life cycle, its turning points, and its drivers

(Golder and Tellis 2004) may need re-examining in the rapidly growing, dynamic world of free

digital products.

2. Background

2.1. Related literature

Our study relates to a number of research avenues:

Markets for free digital products: Research on FDPs has examined issues such as

optimal initial spread of freeware as part of profit maximization in the longer run (Cheng and Liu

2012; Niculescu and Wu 2014), free-riding and competitive dynamics (Haruvy and Prasad

2005), and the impact of the creation process on success (Grewal et al. 2006). Other research has

focused on the effect on demand of bestseller ranking and consumer ranking (Carare 2012; Lee

and Tan 2013), as well as other factors such as price discounts on in-app purchases (Ghose and

Han 2014). We add to this growing literature by providing the first large-scale analysis of the

growth patterns of FDPs, which is significant in particular given the assumption that FDPs grow

and should be modeled in a manner similar to other products typically described by the Bass

diffusion model (Jiang and Sarkar 2010; Yogev 2012; Lee and Tan 2013).

The long tail. From another angle, this work is also related to efforts to understand the

nature and significance of supply and demand inequality in electronic commerce, often


considered in the context of a “long tail”. Previous literature in that area has focused on the

factors that affect the pattern of sales, and in particular whether it leads to higher shares of sales

among low-selling niche products, or alternatively among high selling “superstars”. Looking at

both supply-side factors, such as broader product variety and distribution channel dynamics and

lower stocking costs, and demand-side factors, such as reduced search costs (Elberse and

Oberholzer-Gee 2007; Brynjolfsson et al. 2009, 2011; Hinz et al. 2011; Kumar et al. 2014),

considerable attention has been given to the inter-customer effect in the form of

recommendations and reviews in the creation of the long tail; yet also providing more “thrust” to

superstars (Fleder and Hosanagar 2009; Oestreicher-Singer and Sundararajan 2012; Hervas-

Drane 2015; Zhu and Zhang 2010).

We add to this literature an exploration of the dynamics at the individual product level

along the curve. If previous approaches have generally accepted the existence of “underdog

products” and “superstar products”, we ask how a product gets to become one or the other.

Patterns of innovation growth: In a more general sense, our effort is related to the

ongoing efforts to study the pattern of new product growth, which spans numerous disciplines

(Rogers 2003). The fact that the adoption rate of successful innovations follows a bell-shaped or

logistic-type curve, and a cumulative S-shaped curve, is considered one of the fundamental

discoveries of social science, and was largely attributed to the dynamic role of social influence

among customers in various forms (Young 2009; Peres et al. 2010). While there is evidence of

some exceptions to the S-shaped curve with a cumulative r-shaped (non-cumulative exponential

decline) pattern for entertainment goods such as movies and for supermarket goods (Gatignon

and Robertson 1985; Sawhney and Eliashberg 1996), the perception across disciplines is that

“the S-curves are everywhere” (Bejan and Lorente 2012). Indeed, these patterns form the bases


of diffusion-of-innovations theory and forecasting new product growth using consistent growth

shapes, such as the Bass model, Gompertz, or logistic curves (Meade and Islam 2006).

We add to this literature in two ways. First, we highlight FDPs as an additional, yet

separate category that is not necessarily dominated by S-shaped curves, and show how growth

characteristics of FDPs can explain the various shapes. Second, in a more general sense, we provide

initial evidence for the relationship between product popularity and the shape of growth, an

unexplored issue in a research stream that has focused on highly popular products.

2.2. Modeling FDP growth

Since our aim is to examine growth along the FDP popularity curve, we will need to model

the growth of an individual free digital product. Two fundamental effects that lead to the

commonly observed S-shaped curve are considered when modeling the growth of new products

(Mahajan et al. 1990): The internal influence captures the impact of previous adopters via word

of mouth, imitation, and network externalities, typically considered a function of the number of

cumulative adopters to date. The external influence captures influences outside of the group of

previous adopters, such as advertising and mass media. We argue that an adaptation is needed in

both types of influence to capture the growth in FDP markets, as follows:

Internal influence and recency effect. While diffusion modelers have largely used the

number of cumulative adopters as a sole indicator of internal influence, some recent work points

to a possible need to separate the effect of recent adopters from that of cumulative number of

adopters, attributed to the difference in intensity of word of mouth in the two groups (Hill et al.

2006; Iyengar et al. 2011). It has been suggested, for example, that recent adopters may be more

contagious than consumers who adopted less recently, as the former are more enthused and/or

credible (Risselada et al. 2014).


We contend that in particular the growth in FDPs should allow this distinction. First, it is

often reported that for many FDP users, usage and engagement center on the time right after

adoption (Danova 2015). Second, it is well accepted that adopters of FDPs (and other digital

goods) rely heavily on popularity ranking information as appears in social media, app stores, and

download sites (Carare 2012; Garg and Telang 2013; Ghose and Han 2014; Lee and Raghu

2014). Yet, as is clearly observable, rankings do not necessarily reflect cumulative downloads,

but rather reflect past period popularity (Neitz 2015). This means that the recent number of

downloads, and not only cumulative ones, may play a pivotal role in FDP download decision

making. In fact, popularity rankings may also affect users who do not consider this information

explicitly, but rely on search. For example, it is reported that search results of engines belonging

to Google and Apple also largely depend on recent popularity ranking when displaying results

(Walz 2015).

External influence and the Inception effect. External influence is traditionally a parameter

that captures the marketing mix in the industry, in particular that of advertising (Mahajan et al.

1990). In the absence of large-scale advertising support for many FDPs, and given the

dominance of social media, much of the external influence comes from social media articles and

experts’ recommendations and ratings. However, attention to new products may be short lived:

Given the large number of launched products, the attention given to a new product centers on the

beginning of its life cycle. In fact, even when considering firms that do invest in advertising to

promote FDPs, there is a strong motivation to focus on the early period of growth. It is argued

that FDP producers have a short window of time in which to generate the groundswell that can

lead to attention by sources such as the charts in the app stores, and thus they must act early on

(Rice 2013; Kimura 2014). Consequently, those FDP developers who invest in marketing may


often do so in “burst campaigns” that are meant to get them on consumers’ radar early in the

game (ADA 2014; Klein 2014). Overall, we can expect that for FDPs, external influence will be

particularly strong early on in the new product’s life, a phenomenon that we label the inception

effect. This effect can be reflected in decay in the external influence parameter’s value over time.

Following these, we will use an FDP growth model that takes into consideration the

inception and recency effects. We begin with the fundamental Bass product growth model,

which is widely used to model the growth of new products. Under this approach, expected

adoptions at time period t (between t and t+1) are assumed to be reflected in the following

equation, where N is the market potential, $X_t$ is the cumulative number of adoptions up to time t,

p is the force of external influence, and q is that of internal influence:

(1)  $X_{t+1} - X_t = \left(p + q \cdot X_t / N\right) \cdot \left(N - X_t\right)$

To capture the inception effect, we let the external parameter be a varying function of time

with an initial external influence parameter (p), using an external decay parameter (δ), to capture

the decay in marketing effect over time. If the external decay parameter is positive, external

influence intensity decays with time. To capture the recency effect, we separate the internal

influence into two sub-parameters: As in the classic diffusion approach, parameter q captures the

effect of cumulative adoptions. The recency parameter r captures the effect of recent adoptions,

so that we multiply the relative change in the past period, $X_t - X_{t-1}$, by r. We can now write the

model as follows:

(2)  $X_{t+1} - X_t = \left(p e^{-\delta \cdot t} + q \cdot X_t / N + r \cdot (X_t - X_{t-1}) / N\right) \cdot \left(N - X_t\right)$
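To make the roles of the inception and recency terms concrete, the following minimal Python sketch iterates equation (2) period by period. The function name `simulate_fdp`, the 60-period default, and every parameter value in the usage lines are illustrative assumptions, not estimates reported in this paper. The clamping of each period's adoptions to the remaining market is an added safeguard rather than part of the model.

```python
import math


def simulate_fdp(p, q, r, delta, market=1.0, periods=60):
    """Iterate equation (2) and return the per-period adoptions.

    p: initial external influence; delta: external decay (inception effect);
    q: effect of cumulative adoptions; r: effect of last period's adoptions.
    """
    x_prev, x = 0.0, 0.0  # cumulative adoptions at t-1 and at t
    adoptions = []
    for t in range(periods):
        hazard = (p * math.exp(-delta * t)
                  + q * x / market
                  + r * (x - x_prev) / market)
        new = hazard * (market - x)
        # Safeguard (an assumption, not part of the model): keep adoptions
        # non-negative and within the remaining market potential.
        new = min(max(new, 0.0), market - x)
        adoptions.append(new)
        x_prev, x = x, x + new
    return adoptions


# Hypothetical parameter values, chosen only to contrast two shapes.
bell = simulate_fdp(p=0.01, q=0.4, r=0.0, delta=0.0)   # classic Bass case
slide = simulate_fdp(p=0.3, q=0.0, r=0.0, delta=1.0)   # strong early decay
print(bell.index(max(bell)), slide.index(max(slide)))
```

With δ = 0 and r = 0 the loop reduces to equation (1). In the second call, a large δ with no internal influence makes the first period the peak and every later period smaller, which is the qualitative signature of a decline from launch.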


3. Growth Patterns at SourceForge

3.1. Dataset

Our primary source of data is SourceForge.net, a large, open-source software (OSS)

repository that empowers software developers to control and manage open-source software, and

enables users to download these products for free (Madey 2013). As of June 2013, when we

scraped the data, SourceForge offered about 400,000 registered projects, with 3.4 million

registered developers and 4 million downloads a day. As such, it is among the largest download

sites, and home to some well-known consumer software products such as VLC media player,

eMule, and 7-Zip. In fact, many users may not be aware that products they download from

various software download sites are actually hosted by SourceForge.

Scraping SourceForge, we retrieved the monthly history of downloads for a large number

of products. The number of downloads is largely used to assess the success of open-source

products (Grewal et al. 2006; Daniel et al. 2013) and in a broader sense acts as a proxy for the

success of free products (Chandrashekaran et al. 1999). While SourceForge contains a large

number of products (close to 400,000), many of them are inactive and had zero downloads, and

thus are not relevant to our analysis. We focused on the download patterns for the 59,343

products that met the following criteria:

Data from five years of growth. We looked at a 60-month window for all products.

Naturally, the life cycles of FDPs are considerably shorter than the typically analyzed

growth of durables (although for some products, the cycle may be longer). Thus, to

reduce cases of right censoring and to use a consistent time frame, we considered only

products launched before mid-2008. Nonetheless, our analysis suggests that we covered

the majority of downloads for the various products2.

2 SourceForge data is reported monthly. Because the first month is incomplete and with varying lengths, which can bias the results, we use the first full month for which we have data as the first month.


At least 200 downloads at the five-year window. This criterion enabled us to capture

actual growth processes that are not affected much by possible developers’ noise over the

product life cycle.

The distribution of downloads in our data points to a large variance in downloads among

the products (see Figure 1). In our dataset, 41% of products had fewer than 1,000 downloads, while

about 0.6% (329 products) had more than one million downloads. The Gini coefficient is 0.96,

which indicates a high concentration, larger than those reported for markets such as videos and

books (Oestreicher-Singer and Sundararajan 2012). As our focus of interest is the shape of the

growth, and in order to be able to compare between patterns, we scale each pattern to a (0,1)

scale by dividing each observation by the total sum of downloads. We further elaborate on this

scaling in Section 4.2.
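The two computations in this paragraph, scaling a download series to sum to one and measuring concentration with a Gini coefficient, can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and the rank-based Gini formula used here is one common estimator, assumed because the text does not say which variant was applied.

```python
def scale_to_unit_sum(downloads):
    """Divide each observation by the series total so the pattern sums to 1."""
    total = sum(downloads)
    if total <= 0:
        raise ValueError("series must have a positive total")
    return [value / total for value in downloads]


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical monthly downloads for one product, and hypothetical
# five-year totals for a handful of products.
print(scale_to_unit_sum([30, 50, 20]))
print(gini([200, 450, 900, 12000, 1500000]))
```

Scaling removes the level of downloads so that only the shape of the curve remains, while the Gini coefficient is computed on the unscaled totals across products.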

Figure 1: Distribution of popularity in SourceForge

3.2. Patterns and estimation of data and model

We break identifying patterns of growth into two stages: We first use the FDP model

presented in Section 2 to smooth the data, particularly essential given our use of monthly data,


which is much noisier than the classical annual diffusion data. Our analysis shows that not only

does using the FDP model have the advantage of being theoretically driven, but it also creates a

better smoothing algorithm than do alternatives such as HP filters. See Appendix A for a

discussion.

For the estimation we use general nonlinear optimization. Since we estimate scaled data

(with a sum of 1), we use the augmented Lagrange multiplier method to ensure that our

estimations always sum to one3. To examine our estimations’ fit, we consider two fit measures:

The first is $R^2$, with an average value of 47.7% (as we use nonlinear estimation, an adjusted $R^2$

coefficient could not be calculated). Recall, however, that large-scale monthly data can be very

noisy. We do find a positive correlation between the $R^2$ values and the log of each product’s

downloads (ρ = 0.35), suggesting that when downloads are few and the data tends to be more

noisy (as is the case with much of our data), the $R^2$ levels may be lower. However, the $R^2$

measure can be biased toward capturing the peaks (where the variance can be larger) compared

to the entire curve.

As an alternative to the $R^2$ measure, we also use the Kullback-Leibler divergence (KL

divergence) to measure the difference between our estimations and the data (Dzyabura and

Hauser 2011; Gilula and McCulloch 2013). KL divergence weighs each observation and thus is

less sensitive to the absolute value of the difference, weighing the relative difference instead. The

average KL divergence is 0.2 (with a standard deviation of 0.22, median of 0.14, and mode of

0.05), and most of the divergence values are close to zero.
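As a rough illustration of this fit measure, the sketch below computes the KL divergence between a scaled observed pattern and a scaled fitted pattern. The direction of the divergence (observed relative to fitted), the natural logarithm, and the function name are assumptions made for the example; the text does not specify them.

```python
import math


def kl_divergence(observed, fitted):
    """KL divergence sum(o * ln(o / f)) between two patterns that each sum to 1.

    Terms with o == 0 contribute zero; a zero in `fitted` where `observed`
    is positive makes the divergence infinite.
    """
    if len(observed) != len(fitted):
        raise ValueError("patterns must have the same length")
    total = 0.0
    for o, f in zip(observed, fitted):
        if o == 0:
            continue
        if f == 0:
            return math.inf
        total += o * math.log(o / f)
    return total


# Hypothetical scaled patterns over four periods.
observed = [0.40, 0.30, 0.20, 0.10]
fitted = [0.35, 0.30, 0.20, 0.15]
print(kl_divergence(observed, fitted))
```

Because each term is weighted by the observed share and depends on the ratio of observed to fitted values, the measure responds to relative rather than absolute gaps, and it is zero when the two patterns coincide.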

Classifying the patterns and matching parameters to patterns. In a second stage, we

determine the patterns that emerge. Consistent with past efforts to identify patterns and turning

3 This procedure can be applied using the Rsolnp package in R (Ghalanos and Theussl 2014) or the Solnp implementation in MATLAB (Ye 1987).


points in diffusion data, we use a peaks-and-valleys algorithm for the classification (Goldenberg

et al. 2002; Golder and Tellis 2004; Chandrasekaran and Tellis 2011) using these rules:

We count the number of peaks and troughs in the data.

We require peaks to be substantial (10% over the start period or the previous valley) as

well as troughs (a drop of 10% or more since the start period), or else they are ignored

(Goldenberg et al. 2002).

Using a difference algorithm, we find that there were at most two peaks and one trough in

each pattern, which leads us to three general patterns:

o If a pattern climbs from the start toward a peak, then it is a pattern that we label

Diffuse, as it is consistent with what we may expect given diffusion theory.

o If a pattern begins with a drop in adoptions with no peaks, it is labeled a Slide

pattern [named after playground slides].

o If a pattern begins with a drop in adoptions but has a later peak, it is labeled Slide

& Diffuse (S&D).
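The rules above can be sketched as a single pass over a smoothed series. This is a simplified reading for illustration, not the exact algorithm used in the analysis: it only looks for the first substantial move and for a later recovery from the running valley, and the function name and the "Unclassified" label for series that never move by 10% are hypothetical.

```python
def classify_pattern(series, threshold=0.10):
    """Label a smoothed adoption series as Diffuse, Slide, or Slide & Diffuse."""
    start = series[0]
    valley = None  # lowest value seen after a substantial initial drop
    for value in series[1:]:
        if valley is None:
            if value >= start * (1 + threshold):
                return "Diffuse"           # climbs from the start toward a peak
            if value <= start * (1 - threshold):
                valley = value             # substantial drop from the start
        else:
            valley = min(valley, value)
            if value > valley and value >= valley * (1 + threshold):
                return "Slide & Diffuse"   # substantial rise after the drop
    return "Slide" if valley is not None else "Unclassified"


# Hypothetical smoothed series.
print(classify_pattern([1, 2, 3, 2, 1]))
print(classify_pattern([5, 4, 3, 2, 1]))
print(classify_pattern([5, 3, 2, 3, 4, 2]))
```

Applying the 10% threshold relative to the start period or the preceding valley, as in the rules above, keeps small wiggles in the smoothed data from being counted as peaks or troughs.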

Table 1 presents some statistics on the resultant archetypes as well as the estimated model

parameters per archetype. As can be seen in the descriptive statistics in part a) of Table 1, the

Diffuse pattern is the most ubiquitous in the data, with 48% of the software exhibiting this

pattern (28,490 patterns); S&D patterns were nearly 28% of the data (16,583 patterns), and

Slides accounted for 24% of the data (14,270 patterns).

Table 1: Archetype pattern characteristics and estimation

a) Descriptive statistics        Diffuse    Slide     S&D
No. of patterns                  28,490     14,270    16,583
Share (%) of patterns            48%        24.1%     27.9%
Average no. of downloads         110,093    6,363     9,481
Median no. of downloads          2,698      996       900

b) Model parameter values        Diffuse    Slide     S&D
p (initial external effect)      0.009      0.067     0.032
q (cumulative effect)            0.04       0.038     0.053
r (recency effect)               0.489      0.184     0.318
δ (external decay parameter)     0.18       0.546     1.829

We next examine the relationship between popularity and the shape patterns. One way of

doing so is to divide the dataset by equal download bins in terms of number of products (top

10%, 11%-20%, etc.). Figure 2 shows the relationship between bin membership and the

percentage of each archetype in every popularity bin, in the case of equal-size bins, separating

the top 1% from the rest of the top bin. We observe a strong monotonic increase in the share of

Diffuse patterns from low-popularity products to high-popularity ones, and a monotonic decrease

in the share of Slide and S&D patterns as products increase in popularity. While Diffuse is the

clear majority (89% of the products at the top 1%), it represents a minority among the less

popular products (less than a third of the cases in the bottom 10% of the products).
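The binning step described in this paragraph can be sketched as follows. The sketch assumes each product is represented as a hypothetical (total downloads, archetype label) pair, and it omits the separate treatment of the top 1% for brevity; the function name is likewise illustrative.

```python
from collections import Counter


def archetype_shares_by_bin(products, n_bins=10):
    """Split products into equal-size popularity bins and return, per bin,
    the share of each archetype label.

    products: iterable of (total_downloads, archetype_label) pairs.
    Bins are ordered from least to most popular.
    """
    ranked = sorted(products, key=lambda item: item[0])
    n = len(ranked)
    shares = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        counts = Counter(label for _, label in chunk)
        shares.append({label: count / len(chunk)
                       for label, count in counts.items()})
    return shares


# Hypothetical toy data: six products split into three bins.
toy = [(250, "Slide"), (300, "S&D"), (900, "Slide"),
       (1200, "Diffuse"), (50000, "Diffuse"), (2000000, "Diffuse")]
for popularity_bin in archetype_shares_by_bin(toy, n_bins=3):
    print(popularity_bin)
```

Because the bins hold equal numbers of products, the shares are comparable across bins even though the download ranges they cover differ widely.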

We can also see this pattern visually going back to Figure 1 above. In Figure 1, Diffuse

patterns received a dark shade, while Slide and S&D received a pale shade. We can see how the

color of adoptions is dark around the area of high downloads, and lightens as we look at the long

tail of adoptions.


Figure 2: Growth patterns and popularity in SourceForge data

[Stacked bar chart: y-axis, pattern %; x-axis, download popularity, from the bottom 10% to the top 1%; series: Diffuse, Slide, S&D.]

* Note that the top 1% is presented separately.

3.3. A log scale analysis

One of the challenges of an equal decile analysis in a concentrated distribution is that the

range within some bins may be very large. As can be seen in the upper portion of Table 2 below,

when considering equal deciles, the range of downloads is very large in the upper decile, while in

the lower decile, the range is small. To limit this variance, we took a log scale of the range of

downloads (200 to over 300M) and divided it into 10 bins of download size. As can be seen in

the bottom portion of Table 2, the within-bin discrepancy is now lower; however, in the upper

bins there are far fewer products.
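A log-scale binning of this kind can be sketched as below, assuming the bins are of equal width in log space, which the text implies but does not spell out. The function names are hypothetical. The upper bound of 329.4M is the rounded maximum shown in Table 2, so the computed edges need not match the table's boundaries exactly.

```python
import math


def log_bin_edges(lowest, highest, n_bins=10):
    """Return n_bins + 1 edges equally spaced on a log scale."""
    ratio = highest / lowest
    return [lowest * ratio ** (k / n_bins) for k in range(n_bins + 1)]


def log_bin_index(downloads, lowest, highest, n_bins=10):
    """Zero-based log-scale bin of a download count (0 = least popular)."""
    position = math.log(downloads / lowest) / math.log(highest / lowest)
    return min(max(int(position * n_bins), 0), n_bins - 1)


edges = log_bin_edges(200, 329.4e6)
print([round(edge) for edge in edges])
print(log_bin_index(1000, 200, 329.4e6))
```

Each bin then spans the same multiplicative range of downloads, which is why the within-bin spread shrinks while the number of products per bin becomes very uneven.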


Table 2: Range of downloads in each bin with equal and log-based bins

Equal deciles
Bin               Frequency   Min        Max
Bin 1 - Bottom    5,934       200        302
Bin 2             5,935       302        447
Bin 3             5,934       447        649
Bin 4             5,934       649        957
Bin 5             5,934       957        1,463
Bin 6             5,935       1,463      2,322
Bin 7             5,934       649        957
Bin 8             5,934       4,040      8,027
Bin 9             5,935       8,027      23,059
Bin 10 - Top      5,934       23,066     329.4M

Log-based bins
Bin               Frequency   Min        Max
Bin 1 - Bottom    21,663      200        835
Bin 2             18,333      836        3,505
Bin 3             11,188      3,506      14,715
Bin 4             5,135       14,716     61,766
Bin 5             2,021       61,767     259,254
Bin 6             689         259,255    1.1M
Bin 7             231         1.1M       4.5M
Bin 8             62          4.5M       19.2M
Bin 9             16          19.2M      80.5M
Bin 10 - Top      5           80.5M      329.4M

Figure 3 presents the shape ubiquity of the log scale deciles. We see that the ubiquity

pattern seen in Figure 2 continues, displaying an even larger difference among the bins. The

two bins that contain the 21 products with 19.2M downloads or more consist entirely of Diffuse

patterns.

Figure 3: Growth patterns and popularity in SourceForge data (log-based bins)

[Stacked bar chart: y-axis, pattern %; x-axis, download popularity (log-based bins); series: Diffuse, S&D, Slide.]

3.4. Categories

The patterns we see above reflect a blend of many product categories. Is the pattern

ubiquity driven by a subset of the dataset, or is it consistent across product types? To see which,

we repeated the analysis of Figure 2 with the six most popular categories SourceForge uses.

Figure 4 presents the results for the larger categories, and in Appendix B we can see the

distribution of the patterns in all 16 categories. As can be seen in Figure 4, the ubiquity pattern

identified above generally remains stable.

Figure 4: Ubiquity of shapes in equal-download deciles by category (top six categories)


3.5. Does it work outside of SourceForge?

To what extent can our findings from open-source software be generalized to other

freeware environments? In particular, smartphones have become a prominent freeware

distribution outlet, to the extent that the vast majority of smartphone apps are freeware (Olson

2013). While data on large-scale smartphone app adoption over time is not readily available to


researchers (Garg and Telang 2013), we were able to obtain the cooperation of a global firm,

which we will call “Mobility” so as not to reveal its identity. Mobility operates in the market for

helping businesses create free smartphone apps that they can use as part of their operations. Under

this business model, Mobility creates the app and helps manage it for the client for a monthly

fee. Mobility clients are varied and include service providers such as restaurants, artists,

musicians, educational institutions, and non-profits. These clients offer free apps created on the

Mobility platform for their own end users, who are typically individual customers or prospects.

Mobility can track these apps’ downloads by end users over time.

Figure 5: Growth patterns and popularity in Mobility data

[Figure 5 chart: horizontal axis, download popularity, in tiers from Bottom 10% through 81%-90%, 71%-80%, 61%-70%, 51%-60%, 41%-50%, 31%-40%, 21%-30%, 11%-20%, and 2%-10% to Top 1%*; vertical axis, pattern %; series: Diffuse, Slide, S&D.]

* Note that the top 1% is presented separately.

The Mobility dataset is more limited than that of SourceForge in several aspects: The

Mobility apps are specific to certain service providers, so are naturally relevant to much smaller

market segments. In addition, unlike the case of open-source software, there is an entity

(Mobility clients) that may make dedicated efforts to push the freeware via external effects,

which we do not observe. While the time span we have for Mobility downloads is more limited,

it is more detailed, as we observe weekly data for downloaded apps (between February 2011 and

November 2013). Due to the smaller magnitude of adoption, we used only apps that had at

least 50 downloads and at least 52 weeks of data, and truncated each series at 52 weeks. We thus had

weekly adoption data for 6,914 smartphone apps.
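The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name select_apps, the dictionary layout (app identifier mapped to a list of weekly download counts starting at launch), and the toy data are hypothetical, and we assume here that the 50-download threshold is applied to the truncated 52-week window.

```python
def select_apps(weekly_downloads, min_total=50, n_weeks=52):
    """Keep apps with at least n_weeks of data and min_total downloads.

    weekly_downloads: dict mapping an app id to a list of weekly download
    counts, ordered from the first week after launch (hypothetical layout).
    """
    selected = {}
    for app_id, series in weekly_downloads.items():
        if len(series) < n_weeks:
            continue  # fewer than n_weeks observed weeks: drop the app
        window = list(series[:n_weeks])  # truncate at n_weeks
        # Assumption: the download threshold is checked on the truncated window.
        if sum(window) >= min_total:
            selected[app_id] = window
    return selected


# Toy data (hypothetical): only "a" and "d" satisfy both conditions.
toy = {
    "a": [1] * 52,   # 52 weeks, 52 downloads
    "b": [5] * 51,   # too short
    "c": [0] * 60,   # long enough, but no downloads
    "d": [2] * 60,   # kept, truncated to 52 weeks
}
kept = select_apps(toy)
```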

We repeated the analysis as in the first dataset, and found notably similar results to the

SourceForge case. The patterns that emerged were grouped again in the same order of size into

the archetypes of Diffuse (49%), Slide (30%), and S&D (21%). We can see that while the share

of Diffuse patterns is close to that of SourceForge, the share of Slides is higher at the expense of

S&D.

The Mobility data can also help us examine the data from another angle, which might

affect the archetypes: the issue of versions. FDP creators often release new versions (i.e.,

software updates) and in the SourceForge database, 63% of the products have released more than

one version over the examined life cycle. One might wonder if the demand for versions can

fundamentally affect the archetypes we see and their relationship to popularity. We looked at the

issue in two ways: First, the Mobility data includes only one version, and as we can see, the

extent and the pattern of archetypes remains the same. Second, in the SourceForge dataset, we

looked at products that had only one version to see whether, within this group, the dynamics reported above for the entire dataset change. Here too, we found that the relationship between archetype ubiquity and popularity shown in Figure 2 largely remains the same for single-version products. Thus, versioning does not appear to be the driver of the phenomena we identify here. The descriptive statistics and parameter estimates per archetype for Mobility’s data are found in Appendix C, Table C1, parts a) and b), respectively.

4. Recency, inception, and the share of patterns

4.1. Parameter values per shape

Our next aim was to see to what extent our data can help us to understand the relationship

between popularity and shape ubiquity identified above. We thus now turn to examine the

implication of parameter values that emerge from the FDP model we used, and see their

relationship to popularity.

Looking first at part b) of Table 1, we see a difference between parameter values of the

different shapes. The difference between each pair of archetypes was significant using a two-sample

Hotelling’s T² test, and similarly with a two-sample t-test. Following that, we want to

ensure that our model’s parameters actually define and drive these patterns, and that the

classification results are determined by the model and its parameters. We used a random forest

classifier (Breiman 2001) to see if we can correctly match the classified patterns using the

parameters of the freeware model. We indeed see that the random forest classifier shows a very

low out-of-bag error of 2.27%. The resultant confusion matrix is found in Table 3:

Table 3: Random forest confusion matrix results for the classification of the three patterns

Actual \ Predicted    Diffuse    Slide    S&D      Classification error
Diffuse               98.4%      0.7%     0.9%     1.6%
Slide                 1.4%       96.6%    2.0%     3.4%
S&D                   1.1%       1.4%     97.5%    2.5%

* Percentages are of the actual number of patterns.
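To make the layout of Table 3 concrete, the following Python sketch shows how a confusion matrix expressed as percentages of the actual number of patterns, together with a per-class classification error, can be computed from actual and predicted labels. It does not reproduce the random forest itself; the function name row_normalized_confusion and the toy labels are hypothetical and serve only as an illustration.

```python
from collections import Counter

LABELS = ("Diffuse", "Slide", "S&D")


def row_normalized_confusion(actual, predicted, labels=LABELS):
    """Return {actual label: {predicted label: % of row, "error": %}}."""
    counts = Counter(zip(actual, predicted))
    table = {}
    for a in labels:
        row_total = sum(counts[(a, p)] for p in labels)
        row = {
            p: (100.0 * counts[(a, p)] / row_total if row_total else 0.0)
            for p in labels
        }
        # Classification error: share of this actual class assigned elsewhere.
        row["error"] = 100.0 - row[a] if row_total else 0.0
        table[a] = row
    return table


# Toy labels (hypothetical), not the paper's data.
toy_actual = ["Diffuse"] * 4 + ["Slide"] * 2 + ["S&D"] * 2
toy_predicted = ["Diffuse", "Diffuse", "Diffuse", "Slide",
                 "Slide", "Slide",
                 "S&D", "Diffuse"]
toy_table = row_normalized_confusion(toy_actual, toy_predicted)
```

Each row of the result sums to 100% across the three predicted labels, which is the convention stated in the table note.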

4.2. Share of effects and popularity

We now turn to see if the effects represented by the parameters are related to product

popularity. An interesting feature of diffusion modeling is that it allows us to further understand

how the various parameters drive the distinctive shapes (Mahajan et al. 1990). Let $T$ be the time horizon ($T = 60$ in the SourceForge data, and $T = 52$ in the Mobility data). Thus we set $N = m \cdot X_T$, where $m$ is a scaling factor of the observed data and $X_T$ is the total number of downloads up to time $T$. In addition, in order to remove the effect of popularity on our shape analysis, each observation is divided by the sum of observations over time ($X_T$), and the equation has been translated into percentages in the standard manner by dividing both sides by $X_T$. We calculated the sources of growth by breaking down Equation 2 into the main components that drive adoptions as follows (see footnote 4):

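(3) Cumulative w-o-m effect $= q \cdot \frac{X_t}{m \cdot X_T} \cdot \left(m - \frac{X_t}{X_T}\right)$

(4) Recency effect $= r \cdot \left(\frac{X_t - X_{t-1}}{m \cdot X_T}\right) \cdot \left(m - \frac{X_t}{X_T}\right)$

(5) Inception effect $= p\,e^{-\delta \cdot t} \cdot \left(m - \frac{X_t}{X_T}\right)$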
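A minimal Python sketch of this decomposition is given below. It evaluates Equations 3-5 period by period on a normalized cumulative series $X_t / X_T$ and converts the three effects into shares that sum to one. The function name effect_shares, the toy series, and the parameter values are hypothetical; the sketch also assumes that the series starts with a zero entry before the first period, and it reports per-period shares only, since the way shares are aggregated over time is not specified in this section.

```python
import math


def effect_shares(cum_frac, p, q, r, delta, m):
    """Per-period shares of the three effects in Equations 3-5.

    cum_frac[t] is X_t / X_T for t = 0..T, with cum_frac[0] = 0 assumed
    as the value before the first period.
    """
    rows = []
    for t in range(1, len(cum_frac)):
        remaining = m - cum_frac[t]                       # (m - X_t / X_T)
        cumulative = q * (cum_frac[t] / m) * remaining    # Equation 3
        recency = r * ((cum_frac[t] - cum_frac[t - 1]) / m) * remaining  # Equation 4
        inception = p * math.exp(-delta * t) * remaining  # Equation 5
        total = cumulative + recency + inception
        if total == 0:
            rows.append({"cumulative": 0.0, "recency": 0.0, "inception": 0.0})
        else:
            rows.append({
                "cumulative": cumulative / total,
                "recency": recency / total,
                "inception": inception / total,
            })
    return rows


# Hypothetical series and parameter values, for illustration only.
toy_series = [0.0, 0.1, 0.3, 0.6, 1.0]
shares = effect_shares(toy_series, p=0.05, q=0.3, r=0.4, delta=0.1, m=1.2)
only_inception = effect_shares(toy_series, p=0.05, q=0.0, r=0.0, delta=0.1, m=1.2)
```

Setting $q = r = 0$ in the sketch leaves the inception effect as the only source of growth, so its share is 100% in every period.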

Turning to Table 4, we see a difference between the three archetypes in both parameter

value and share of patterns. While the inception effect is especially dominant for Slides, it has

the lowest share for the other two archetypes.

Table 4: Share of pattern attributed to each effect by archetype pattern

Share of pattern attributed to:    Diffuse    Slide    S&D
p + δ (inception effect)           26.8%      49.3%    11.5%
q (cumulative w-o-m effect)        41.2%      40.1%    67.5%
r (recency effect)                 32.0%      10.6%    21.0%

4 The parameter m represents the fact that the diffusion process has not ended after 60 periods. Thus, m is a scaling parameter that does not have an impact on the shape of a pattern. We repeated the random forest examination without using the m parameter, obtaining nearly identical results, with an out-of-bag error of 2.29%.

Similar results were found in Mobility’s data in Table C1 (part c) of Appendix C. Figure 6

further elucidates the relationship between share of effect and popularity. We see the average

share of influence of the external, recency, and cumulative effects in various popularity tiers,

generated from Equations 3-5. The direction is clear: The share of recency increases dramatically

from 15% among the least popular products to 40% in the top 10%, while the shares of the cumulative effect

and the inception effect decrease monotonically from the less popular to the more popular

product tiers.

Figure 6: Share of patterns and popularity

[Figure 6 chart: horizontal axis, download popularity, in tiers from Bottom 10% through 81-90%, 71-80%, 61-70%, 51-60%, 41-50%, 31-40%, 21-30%, and 11-20% to Top 10%; vertical axis, pattern's effect %; series: External, Recency, Cumulative.]

To further understand the source of difference among the patterns in this respect, consider

Figure 7, in which we graph the dynamics of the share of each effect over time for each pattern

archetype, using the average parameter values for each pattern presented in part b of Table 1.

Figure 7: The temporal dynamics’ shares of effects in the three patterns

[Figure 7 panels: Figure 7a, Diffuse; Figure 7b, Slide; Figure 7c, Slide & Diffuse. In each panel, the horizontal axis is months and the vertical axis is downloads.]

Consider the case of a Slide pattern (Figure 7b). An exponential-like decline in demand

has been considered in the past in two types of markets. In the case of low-involvement supermarket goods, the slide-like pattern was attributed to a lack of inter-customer social influence and a dominant role of external effects such as advertising (Fourt and Woodlock 1960; Gatignon and Robertson 1985). However, FDPs typically are not promotion driven, and there is

no reason to assume that they are unaffected by social influence (see Aharony et al. 2011 for the

role of social influence in such markets). In the case of entertainment goods such as movies, a

Slide pattern was identified in particular for blockbusters, and was explained by the anticipation

leading up to the movie’s release, which on the one hand can create a social influence process

pre-release, and on the other hand drives marketers to invest large resources in advertising and

screens early on in these blockbusters’ life cycles (Moe and Fader 2002; Ainslie et al. 2005). It is

unlikely, however, that such a phenomenon is relevant to the continuous flow of free products we

examine. In particular, we see an opposite effect to that of movies: For FDPs, it is the least

popular products that exhibit a Slide pattern, not the most popular ones, indicating a different

process.

What Figure 7b suggests instead is an inception effect that drives demand, particularly early

on. Later, however, it is joined by a cumulative effect, which has a large share (though not as

large as that of inception) in driving growth. Thus, while we don’t need to assume a lack of social

influence to explain the declining Slide pattern, the inception effect is not enough to create a

highly popular product. To create a bell-shaped Diffuse pattern, social influence should kick in

relatively early, and become dominant. What is particularly interesting in the Diffuse

dynamics in Figure 7a is the role of the recency effect. For products that enjoy social influence,

recency becomes the dominant social influence early on, immediately following the external

influence ignited by the inception effect. Only later when there are enough adopters does the

cumulative effect become dominant. As can be seen in Table 4, the cumulative effect has the

overall largest share in Diffuse, yet not much larger than the recency effect. If the recency effect were not there to start the social process that eventually brings in a larger number of adopters, the product could have remained a lower-popularity Slide.

As Figure 7c suggests, S&D begins with an inception effect that is stronger than that of a

Diffuse (yet weaker than that of a Slide), and a recency effect that is weaker than that of a

Diffuse (yet stronger than that of a Slide). In such a case, while the initial pattern is that of a

decline, the product creates enough social influence to turn the pattern around later on, i.e., social

influence begins to dominate due to the recency effect, and eventually the cumulative effect

becomes dominant. Overall, it seems that a recency effect is critical to create a popular FDP,

explaining why we see a large difference in the role of recency between the high- and low-

popularity products (Figure 6). Popular products create enough social influence early on, whether

by direct word of mouth or by ranking information to potential adopters, so that recency and later

on the cumulative effect begin to drive demand upward. What could have remained a less-popular

Slide thus becomes a more-popular Diffuse.

5. Discussion and Conclusion

5.1 The fundamental findings

The first core insight emerging from our findings is that free digital products have distinct growth patterns. A large body of research has used information on the adoption of traditional durables and services to teach us how new products grow, and it constitutes the base for managerial thinking on product introduction in general. The fundamentally different shape of growth for FDPs indicates that we need to be cautious in applying past diffusion knowledge to these environments. One could use the case of movies as an example in that sense: Given the unique dynamic of growth and profitability in the motion picture industry, its growth and dynamics have been largely analyzed separately from other categories (Eliashberg et al. 2006). The case of FDP growth may require a similar consideration.

A second essential issue relates to the relationship between popularity and growth. The

strong monotonic relationship between popularity and the shape of growth we witness suggests

that past focus on popular products when analyzing new product growth can lead to a real bias.

This issue is of particular importance for FDPs given the significant dispersion of demand and

the presence of a significant long tail. This finding may have major implications for other

categories as well, yet given that we have data only on FDPs, the generalizability of our finding

to other categories remains to be investigated. The issue of popularity is interesting particularly

in light of the rich research stream that has acknowledged the strong dispersion of product

popularity in digital environments, and the existence of a long tail of demand (Brynjolfsson et al.

2010). While the 60,000 FDPs analyzed here indeed show a large dispersion, the fact that we could use individual-level growth data, rather than look at products cross-sectionally based on overall popularity, gave us a unique opportunity to understand the creation of demand dispersion in digital environments.

Our analysis suggests that in FDP environments, sales or downloads may start with a

drop. The fact that products are free encourages potential users to download them even if they

are not completely certain they need them. Since people often hear of FDPs most at the time of

their release (the “inception effect”), the time after launch may be relatively high in adoptions.

Yet in order to appeal to a wider audience, external influence early on and even some word of

mouth afterwards may not be enough: The product has to create an engagement that will produce

social influence, which in turn will make it popular for a larger market potential. This stage will

be driven by the recency effect, which represents the effect of recent adopters on potential ones.

This effect is relevant to many product categories because of the high involvement and word of

mouth characteristics of recent adopters. But it should have a special meaning for free digital

products, as in FDP environments, consumers learn much from recommendation engines,

ranking tables, and search results. As we discussed, all of these may be largely affected by the

number of recent adopters rather than by cumulative adoption. If a product enjoys a strong

enough recency effect early on, it will quickly move to grow in adoption, with a growing

cumulative base of adopters that will join the recent ones. The growth then will be bell shaped,

and it is thus no wonder that bell shapes are more strongly associated with popular products.

In cases where word of mouth is not strong enough early on, the process may take some

time, and the product will begin with a slide. However, given enough time, the social process

will become dominant enough to spur a growth process that creates a Slide & Diffuse pattern.

Our results are thus consistent with studies that cite the communication process, and in particular

recommendation systems, as affecting the level of overall popularity (Fleder and Hosanagar

2009; Oestreicher-Singer and Sundararajan 2012). While the recency effect is driven by word of

mouth between individuals, important drivers for FDPs are online recommendation systems (and

search), which provide products with powerful enough social influence early on so that recent

adopters affect new ones and spur the process of real growth. If the product is not appealing

enough to draw people early on, the recency effect will not kick in and the product may remain

in a slide situation.

5.2 Individual product growth and the long tail phenomenon

So far we have not dealt with one of the main interests of the long tail literature – the

magnitude of the variance in popularity between best-selling products and slow-moving ones, and the nature of markets that increase this variance. While this highly discussed research question is not our focus, and our ability to examine the issue is partial given the limited product-specific information in our large-scale database, it is still of interest to investigate whether the product growth dynamics we highlighted here can help to explain the within product variance in popularity we see in digital markets.

We took a first look at the matter by considering the variance in popularity in different markets. Aiming to analyze markets that are as homogeneous as possible, we took advantage of the fact that, beyond the subcategories used above, SourceForge also divides products into sub-sub-categories (SSC). SSC are not always mutually exclusive for individual products, and in some the number of products is so small that within product variance is less applicable to examine. We examined 176 SSC (out of 316) in which we had at least 50 products. We wanted to

see if the value of within product parameters for the SSC can help explain the between-product

variance in popularity that is reflected in the Gini coefficient of the specific SSC.
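For readers less familiar with the Gini coefficient as a measure of inequality in popularity, the following Python sketch computes it for the download counts of the products in one SSC. The function name gini and the toy values are hypothetical, and the sketch uses one common sample formula (based on sorted values); the exact estimator used for the analysis is not specified here and may differ.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n, with i = 1..n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


# Toy download counts (hypothetical), not the paper's data.
equal_market = gini([5, 5, 5, 5])        # every product equally popular
superstar_market = gini([0, 0, 0, 10])   # one product takes all downloads
```

In the toy example, the market in which all products are equally popular has a coefficient of zero, while the market in which one of four products receives all downloads reaches 0.75, the maximum for four products under this formula.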

Using an OLS regression with the SSC Gini coefficient as the dependent variable, and the average values of the parameters of the FDP growth model and SSC size as the independent variables, two variables came out significant: SSC size, in that larger SSC are less equal, as may be expected (p < 0.05); and the recency parameter, in that higher recency is associated with a higher Gini coefficient and thus a less equal SSC (p < 0.001).

The effect of recency on inequality is consistent with the insights discussed above on the

importance of recency in FDP markets. Beyond the role of recency in the growth and success of

individual products, we see indications that in markets where recency is high the difference

between the less and more successful products is higher. Given our discussion above on the

possible relationship between recency and the effect of recommendation systems, we see this

result as supportive of research that highlights how recommendation systems can create

inequality in digital markets (Fleder and Hosanagar 2009; Oestreicher-Singer and Sundararajan

2012; Hervas-Drane 2015): In markets where recommendation systems play a stronger role, the recency effect may be higher, leading to higher inequality between the long tail and the superstars. Yet to further understand this relationship, additional analysis is needed that uses smaller-scale data but can dive into the specifics of markets. We believe this is a promising area for future research.

5.3 Conclusion

The ability to collect precise adoption data in a timely manner and for differing levels of

popularity renders digital environments an unprecedented source of knowledge on the growth of

new products. The abundance of data, in particular individual-level adoption data, social network

data flows, and location information, ensures that much of our knowledge on growth is yet to

come, and may demand updating of our beliefs and empirical generalizations created in times

when such data were not available. We hope this study took a sizeable step in this direction.

References

ADA. 2014. Discoverability: How to get noticed in a marketplace overflowing with apps. White Paper, Application Developers Alliance, Washington, DC.

Aharony, N., W. Pan, C. Ip, I. Khayal, A. Pentland. 2011. Social fMRI: Investigating and shaping social mechanisms in the real world. Pervasive and Mobile Comput. 7(6) 643-659.

Ainslie, A., X. Drèze, F. Zufryden. 2005. Modeling movie life cycles and market share. Marketing Sci. 24(3) 508-517.

Anderson, C. 2009. Free: The Future of a Radical Price, 1st ed. New York: Hyperion.

AppBrain. 2016. AppBrain Stats. AppBrain, March 26. Available at http://www.appbrain.com/stats

Bejan, A., S. Lorente. 2012. The S-curves are everywhere. Mech. Engrg. 134(5) 44-47.

Breiman, L. 2001. Random forests. Machine Learn. 45(1) 5-32.

Brynjolfsson, E., Y. J. Hu, M. D. Smith. 2010. Long tails vs. superstars: The effect of information technology on product variety and sales concentration patterns. Inform. Systems Res. 21(4) 736-747.

Brynjolfsson, E., Y. J. Hu, D. Simester. 2011. Goodbye Pareto Principle, Hello Long Tail: The effect of search costs on the concentration of product sales. Management Sci. 57(8) 1373-1386.

Brynjolfsson, E., Y. J. Hu, M. S. Rahman. 2009. Battle of the retail channels: How product selection and geography drive cross-channel competition. Management Sci. 55(11) 1755-1765.

Carare, O. 2012. The impact of bestseller rank on demand: Evidence from the app market. Internat. Econom. Rev. 53(3) 717-742.

Chandrasekaran, D., G. J. Tellis. 2007. A critical review of marketing research on diffusion of new products. N. K. Malhotra, ed. Rev. Marketing Res. 39-80.

—. 2011. Getting a grip on the saddle: Chasms, or cycles? J. Marketing 75(4) 21-34.

Chandrashekaran, M., R. Mehta, R. Chandrashekaran, R. Grewal. 1999. Market motives, distinctive capabilities, and domestic inertia: A hybrid model of innovation generation. J. Marketing Res. 36(1) 95-112.

Cheng, H. K., Y. Liu. 2012. Optimal software free trial strategy: The impact of network externalities and consumer uncertainty. Inform. Systems Res. 23(2) 488-504.

Daniel, S., R. Agarwal, K. J. Stewart. 2013. The effects of diversity in global, distributed collectives: A study of open-source project success. Inform. Systems Res. 24(2) 312-333.

Danova, T. 2015. The App-Store marketing report: User acquisition, retention, and strategies for getting apps to stand out. Business Insider, February 5. Available at http://www.businessinsider.com/app-store-marketing-strategies-and-stats-2015-2

Dunn, J. E. 2011. Free antivirus grabs more market share, claims Opswat survey. Techworld, June 8. Available at http://news.techworld.com/security/3284838/free-antivirus-grabs-more-market-share-claims-opswat-survey

Dzyabura, D., J. R. Hauser. 2011. Active machine learning for consideration heuristics. Marketing Sci. 30(5) 801-819.

Elberse, A., F. Oberholzer-Gee. 2007. Superstars and underdogs: An examination of the long tail phenomenon in video sales. Harvard Business School working paper.

Eliashberg, J., A. Elberse, M. Leenders. 2006. The motion picture industry: Critical issues in practice, current research, and new research directions. Marketing Sci. 25(6) 638-661.

Fleder, D., K. Hosanagar. 2009. Blockbuster culture’s next rise or fall: The impact of recommender systems on sales diversity. Management Sci. 55(5) 697-712.

Foresman, C. 2012. iOS app success is a “lottery”: 60% (or more) of developers don’t break even. Ars Technica, May 4. Available at http://arstechnica.com/apple/2012/05/ios-app-success-is-a-lottery-and-60-of-developers-dont-break-even

Fourt, L. A., J. W. Woodlock. 1960. Early prediction of market success for new grocery products. J. Marketing 25(2) 31-38.

Garg, R., R. Telang. 2013. Inferring app demand from publicly available data. MIS Quart. 37(4) 1253-1264.

Gatignon, H., T. S. Robertson. 1985. A propositional inventory for new diffusion research. J. Consumer Res. 11(4) 849-867.

Ghalanos, A., S. Theussl. 2014. Rsolnp: General non-linear optimization using augmented Lagrange Multiplier Method. R package version 1.15.

Ghose, A., S. P. Han. 2014. Estimating demand for mobile applications in the new economy. Management Sci. 60(6) 1470-1488.

Gilula, Z., R. McCulloch. 2013. Multi level categorical data fusion using partially fused data. Quant. Marketing Econom. 11(3) 353-377.

Goldenberg, J., B. Libai, E. Muller. 2002. Riding the saddle: How cross-market communications can create a major slump in sales. J. Marketing. 66(2) 1-16.

Golder, P. N., G. J. Tellis. 1997. Will it ever fly? Modeling the takeoff of really new consumer durables. Marketing Sci. 16(3) 256-270.

—. 2004. Growing, growing, gone: Cascades, diffusion, and turning points in the product life cycle. Marketing Sci. 23(2) 207-218.

Greve, H. R. 2011. Fast and expensive: The diffusion of a disappointing innovation. Strategic Management J. 32(9) 949-968.

Grewal, R., G. L. Lilien, G. Mallapragada. 2006. Location, location, location: How network embeddedness affects project success in open-source systems. Management Sci. 52(7) 1043-1056.

Haruvy, E., A. Prasad. 2005. Freeware as a competitive deterrent. Inform. Econom. & Policy 17(4) 513-534.

Hervas-Drane, A. 2015. Recommended for you: The effect of word of mouth on sales concentration. Internat. J. Res. Marketing 32(2) 207-218.

Hill, S., F. Provost, C. Volinsky. 2006. Network-based marketing: Identifying likely adopters via consumer networks. Statist. Sci. 21(2) 256-276.

Hinz, O., J. Eckert, B. Skiera. 2011. Drivers of the long tail phenomenon: An empirical analysis. J. Management Inform. Systems. 27(4) 43-70.

Iyengar, R., C. Van den Bulte, T. W. Valente. 2011. Opinion leadership and social contagion in new product diffusion. Marketing Sci. 30(2) 195-212.

Jiang, Z., F. M. Bass, P. I. Bass. 2006. Virtual Bass model and the left-hand data-truncation bias in diffusion of innovation studies. Internat. J. Res. Marketing 23(1) 93-106.

Jiang, Z., S. Sarkar. 2010. Speed matters: The role of free software offer in software diffusion. J. Management. Inform. Sys. 26(3) 207-240.

Kimura, H. 2014. Why app store keyword rankings drop dramatically seven days after launch. Sensor Tower, August 21. Available at https://blog.sensortower.com/blog/2014/08/21/why-app-store-keyword-rankings-drop-dramatically-seven-days-after-launch/

Klein, A. 2014. The Insider: Preparing your new app for launch. Tune, August 12. Available at http://www.tune.com/blog/the-insider-preparing-your-new-app-for-launch/

Kumar, A., M. D. Smith, R. Telang. 2014. Information discovery and the long tail of motion picture content. MIS Quart. 38(4) 1057-1078.

Kumar, V. 2014. Making “freemium” work. Harvard Bus. Rev. 92(5) 27-29.

Lee, G., T. S. Raghu. 2014. Determinants of mobile apps’ success: Evidence from the app store market. J. Management Inform. Sys. 31(2) 133-170.

Lee, Y. J., Y. Tan. 2013. Effects of different types of free trials and ratings in sampling of consumer software: An empirical study. J. Management Inform. Sys. 30(3) 213-246.

Madey, G. 2013. The SourceForge Research Data Archive (SRDA). University of Notre Dame, Feb. 14. Available at: http://srda.cse.nd.edu

Mahajan, V., E. Muller, F. M. Bass. 1990. New product diffusion models in marketing: A review and directions for research. J. Marketing. 54(1) 1-26.

Mallapragada, G., R. Grewal, G. Lilien. 2012. User-generated open-source products: Founder’s social capital and time to product release. Marketing Sci. 31(3) 474-492.

Meade, N., T. Islam. 2006. Modeling and forecasting the diffusion of innovation: A 25-year review. Internat. J. Forecasting 22(3) 519-545.

Moe, W. W., P. S. Fader. 2002. Using advance purchase orders to forecast new product sales. Marketing Sci. 21(3) 347-364.

Neitz, R. 2015. Extensive Guide to App Store Optimization (ASO) in 2015 – Part 2: Google Play Store. Trademob, June 12. Available at http://www.trademob.com/app-store-optimization-guide-google

Niculescu, M. F., D. J. Wu. 2014. Economics of free under perpetual licensing: Implications for the software industry. Inform. Sys. Res. 25(1) 173-199.

Oestreicher-Singer, G., A. Sundararajan. 2012. Recommendation networks and the long tail of electronic commerce. MIS Quart. 36(1) 65-83.

Olson, P. 2013. The win for games: They grab two-thirds of app store sales. Forbes, September 19. Available at http://www.forbes.com/sites/parmyolson/2013/09/19/the-win-for-games-they-grab-two-thirds-of-app-store-sales

Peres, R., E. Muller, V. Mahajan. 2010. Innovation diffusion and new product growth models: A critical review and research directions. Internat. J. Res. Marketing 27(2) 91-106.

Rangaswamy, A., S. Gupta. 2000. Innovation adoption and diffusion in the digital environment: Some research opportunities. V. Mahajan, E. Muller, Y. Wind, eds. New-Product Diffusion Models. Norwell, MA: Kluwer Academic Publishers, 75-96.

Rice, K. 2013. Why pre-launch hype is the key to app success. Kinvey, May 2. Available at http://www.kinvey.com/blog/2545/why-prelaunch-hype-is-the-key-to-app-success

Risselada, H., P. C. Verhoef, T. H. A. Bijmolt. 2014. Dynamic effects of social influence and direct marketing on the adoption of high-technology products. J. Marketing 78(2) 52-68.

Rogers, E. M. 2003. Diffusion of Innovations. New York: Free Press.

Rubin, B. F. 2013. The dirty secret of apps: Many go bust. Wall Street J., March 7. Available at http://online.wsj.com/news/articles/SB10001424127887324582804578346221047028366

Sawhney, M. S., J. Eliashberg. 1996. A parsimonious model for forecasting gross box-office revenues of motion pictures. Marketing Sci. 15(2) 113-131.

Walz, A. 2015. Deconstructing the app store rankings formula with a little mad science. Moz.com, May 27. Available at https://moz.com/blog/app-store-rankings-formula-deconstructed-in-5-mad-science-experiments

Woods, D. 2013. The battle of the freemium and enterprise business models in the task management market. Forbes, March 18. Available at http://www.forbes.com/sites/danwoods/2013/03/18/freemium-enterprise-business-models-task-management-attask-asana

Ye, Y. 1987. Interior Algorithms for Linear, Quadratic and Linearly Constrained Non-linear Programming. PhD thesis, Department of ESS, Stanford University.

Yogev, G. 2012. The Diffusion of Free Products: How Freemium Revenue Model Changes the Strategy and Growth of New Digital Products. Saarbrucken, Germany: Lap Lambert Academic Publishing.

Young, H. P. 2009. Innovation diffusion in heterogeneous populations: Contagion, social influence, and social learning. Amer. Econom. Rev. 99(5) 1899-1924.

Zhong, N., F. Michahelles. 2012. Long tail, or superstar? An analysis of app adoption on the Android market. LARGE 3.0 Conf. 11-14.

Zhou, W., W. Duan. 2012. Online user reviews, product variety, and the long tail: An empirical investigation of online software downloads. Electronic Commerce Res. Appl. 11(3) 275-289.

Zhu, F., X. Zhang. 2010. Impact of online consumer reviews on sales: The moderating role of product and consumer characteristics. J. Marketing 74(2) 133-148.

Appendices

Appendix A: Comparing model smoothing alternatives

We compare the R² and KL divergence measures of our model with those of other smoothing

alternatives to assess the goodness of fit of our model. First, we examine other variants of the

model. We examine the fit of a model without the recency and δ components separately, and

without both effects (effectively collapsing to the Bass model). Second, we examine other

smoothing methods used in time series modeling. The Hodrick-Prescott (HP) filter removes

short-term cyclical components from the filtered graph, allowing us to separate out short-term noise while

retaining the long-term trend (Hodrick and Prescott 1997; Chandrasekaran and Tellis 2011). We

use 129,600 and 14,400 as values of the smoothing coefficient (λ), both commonly used for monthly data

analysis with the HP filter. The Christiano-Fitzgerald filter (CF) has been examined as an

alternative to the HP filter, offering better control over high-frequency fluctuations and a better

fit to more granular (e.g., monthly) data (Christiano and Fitzgerald 1998; Lamey et al. 2007;

Van Heerde et al. 2013). We use two months as the minimum length of a software cycle in the

CF filter, and examine 40 and 60 months as the maximum length of the software cycle. We also

examine smoothing with locally weighted least squares regression (LOWESS, Rust and

Bornman 1982) and with penalized splines (Foutz and Jank 2010; Stremersch and Lemmens

2009). The results are in Table A1 below:

Table A1: Goodness-of-fit comparison between models

Model examined                          R²            KL divergence
FDP growth model                        0.48 (.26)    0.20 (.22)
FDP growth model (without recency)      0.40 (.26)    0.26 (.49)
FDP growth model (without δ)            0.34 (.26)    0.30 (.51)
Bass model                              0.33 (.25)    0.27 (.26)
HP filter (λ = 14,400)                  0.36 (.22)    0.27 (.27)
HP filter (λ = 129,600)                 0.28 (.21)    0.36 (.38)
CF filter (max cycle = 40)              0.39 (.23)    0.31 (.45)
CF filter (max cycle = 60)              0.32 (.23)    0.37 (.58)
LOWESS                                  0.34 (.23)    0.28 (.51)
Penalized splines                       0.73 (.16)    0.12 (.16)

Looking at Table A1, we see that the FDP growth model performs better than all other models

and smoothing methods but one. While the penalized splines model fits the data better than the

FDP growth model, the resulting penalized splines curve is too flexible and does not remove the

short-term trends and outliers that are inherent in monthly-level data; its fit is thus best read as a “ceiling” on the

fit that a smoothing algorithm can reach. If we increase the smoothing parameters of the

penalized splines model to take that into account (see Appendix A in Foutz and Jank 2010 for

further discussion), the fit drops rapidly.
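The comparison in this appendix can be illustrated with a minimal sketch, offered under stated assumptions rather than as the procedure used for Table A1. It smooths a monthly series with an HP filter at the two λ values mentioned above and then scores the smoothed curve with R² and KL divergence. The `downloads` series is hypothetical, the function names are illustrative, R² is assumed to be 1 − SS_res/SS_tot, and the KL divergence is assumed to be computed after normalizing both series to sum to one; the paper does not spell out these computational details here.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def hp_filter(y, lam):
    """HP trend: minimizes sum (y - t)^2 + lam * sum (second differences of t)^2."""
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    # Build A = I + lam * K'K, where K is the (n-2) x n second-difference operator.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        idx, coef = (r, r + 1, r + 2), (1.0, -2.0, 1.0)
        for i, ci in zip(idx, coef):
            for j, cj in zip(idx, coef):
                a[i][j] += lam * ci * cj
    return solve(a, [float(v) for v in y])

def r_squared(observed, fitted):
    """Assumed definition: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((o - mean) ** 2 for o in observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return 1.0 - ss_res / ss_tot

def kl_divergence(observed, fitted, eps=1e-12):
    """Assumed definition: KL(observed || fitted) after normalizing both to sum to 1."""
    p = [max(o, eps) for o in observed]  # clip to keep the logarithm defined
    q = [max(f, eps) for f in fitted]
    sp, sq = sum(p), sum(q)
    return sum((pi / sp) * math.log((pi / sp) / (qi / sq)) for pi, qi in zip(p, q))

# Hypothetical monthly download counts for one product (not from the paper's data).
downloads = [120, 340, 560, 610, 480, 390, 410, 300, 260, 180, 150, 140]

for lam in (14_400, 129_600):
    trend = hp_filter(downloads, lam)
    print(lam, round(r_squared(downloads, trend), 3),
          round(kl_divergence(downloads, trend), 3))
```

A larger λ penalizes curvature more heavily, so the trend moves closer to a straight line and its in-sample R² cannot increase. This matches the direction of the two HP filter rows in Table A1, where λ = 129,600 yields a lower R² than λ = 14,400.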

Appendix B: Distribution of pattern types by categories, SourceForge data

Table B1: Distribution of pattern types by categories

Category                                Diffuse %   Slide %   S&D %    Category size
Development                             47.2%       23.2%     29.7%    12,074
Internet                                47.1%       26.5%     26.3%    7,172
System Administration                   48.0%       23.2%     28.7%    6,267
Communications                          47.8%       27.2%     25.0%    4,987
Games                                   41.5%       26.7%     31.8%    4,910
Science & Engineering                   53.6%       16.2%     30.2%    4,317
Audio & Video                           51.1%       23.3%     25.6%    2,957
Security & Utilities                    46.3%       23.5%     30.2%    2,542
Business & Enterprise                   46.1%       27.0%     26.9%    2,387
Home & Education                        48.7%       19.6%     31.7%    1,607
Graphics                                48.9%       23.2%     27.9%    1,593
Desktop Environment                     48.4%       26.3%     25.3%    1,186
Other / Unlisted Topic                  46.5%       23.2%     30.4%    764
Multimedia                              47.6%       22.6%     29.8%    477
Mobile                                  47.1%       25.2%     27.7%    242
Formats and Protocols                   49.1%       33.9%     17.0%    112
Software without an assigned category   52.0%       25.0%     23.0%    5,749

Appendix C: Results for the Mobility dataset

Table C1: Statistics and average parameter values for the pattern archetypes

a) Descriptive statistics               Diffuse     Slide     S&D
No. of patterns                         3,412       2,032     1,470
% of patterns                           49%         30%       21%
Average no. of downloads                4,404       874       1,218
Median no. of downloads                 414         162       196

b) Model parameter values               Diffuse     Slide     S&D
p (initial external effect)             0.016       0.149     0.045
q (cumulative effect)                   0.083       0.059     0.075
r (recency effect)                      0.517       0.204     0.214
δ (external decay parameter)            0.232       0.746     1.475

c) Share of pattern attributed to:      Diffuse     Slide     S&D
p and δ (external effect)               28.6%       57.2%     15.2%
q (cumulative effect)                   41.5%       32.7%     71.7%
r (recency effect)                      29.9%       10.1%     13.1%
