Fast Fuzzy set Qualitative Comparative Analysis (Fast fsQCA)

Jerry M. Mendel
Signal and Image Processing Institute
Ming Hsieh Department of Electrical Engineering
University of Southern California
Los Angeles, CA 90089-2564
[email protected]

Mohammad M. Korjani
Signal and Image Processing Institute
Ming Hsieh Department of Electrical Engineering
University of Southern California
Los Angeles, CA 90089-2564
[email protected]

Abstract—Fuzzy set Qualitative Comparative Analysis (fsQCA) is a methodology for obtaining linguistic summarizations from data that are associated with cases. It has recently been described as a collection of 13 steps [3]. In this paper we focus on how to speed up some of the computationally intensive steps of fsQCA and how to use the speed-up equations to obtain some interesting properties of fsQCA.

Keywords-fuzzy sets; fuzzy set Qualitative Comparative Analysis; fsQCA; fast fsQCA

I. INTRODUCTION

Fuzzy Set Qualitative Comparative Analysis (fsQCA), developed by the social scientist Charles Ragin [4]-[6], [8, Ch. 5], is a methodology for obtaining linguistic summarizations from data that are associated with cases. Unlike more quantitative methods that are based on correlation, fsQCA seeks to establish logical connections between combinations of causal conditions (conjunctural causation) and an outcome, the result being rules that summarize the sufficiency between subsets of all of the possible combinations of the causal conditions (or their complements) and the outcome. The rules are connected by the word OR to the output. Each rule is a possible path from the causal conditions to the outcome and represents equifinal causation, i.e. different causal combinations leading to the same outcome.

fsQCA is not a methodology that is derived through mathematics, e.g. as the solution to an optimization problem, although it uses mathematics. We have spent more than two years (beginning in September 2009) studying fsQCA, and with the very generous help of Prof. Ragin, have been able to summarize it mathematically as a collection of 13 steps [3]. This description of fsQCA cannot be found in Ragin’s works, and is essential if fsQCA is to be used by engineers and computer scientists. For a less mathematical presentation of the steps of fsQCA, as well as some additional discussions about it, see [2].

Figure 1 shows a high-level overview of the fsQCA algorithm. fsQCA begins [3] with your substantive knowledge (1) about a problem. You specify a desired outcome (2) (a separate fsQCA is run for each such outcome) and then choose the N cases (3) [7] from which you hope to extract new knowledge about the potential causes for that outcome. Next you postulate a set of k potential causes (4) that you believe could have, either individually or in various combinations, led to the desired outcome. You might be wrong about postulating a cause and so you protect yourself against this by simultaneously considering each cause and its complement.

[Figure 1: block diagram with numbered stages; the only box labels recoverable from the extraction are "Best Instances" and "Coverage" (11).]

Fig. 1. fsQCA summarized. The numbers in this figure do not correspond to the numbered 13 steps of fsQCA, but are instead associated with a high-level overview of fsQCA [3].

fsQCA connects each of the $2^k$ possible (candidate) causal combinations to the desired outcome as a simple if-then rule, namely “if this causal combination, then the desired outcome.” Each causal combination contains exactly k terms (each a causal condition or its complement) connected to each other by AND. All $2^k$ candidate rules are for the same desired outcome and are therefore connected by the word OR (5).

fsQCA now uses the case-based data to reduce the number of rules from $2^k$ candidate rules to a much smaller number of rules, and it simplifies the rules so that they usually contain causal combinations with fewer than k terms (6). The latter happens because all of the rules are for the same desired outcome; hence, they can be logically combined using set theory reduction techniques, and by doing this it frequently happens that some or many causal conditions are absorbed (so they disappear from the final causal combination).

978-1-4673-2338-3/12/$31.00 ©2012 IEEE

There may not be enough cases (Ragin calls this “limited diversity”) to provide evidence (or enough evidence) about all $2^k$ candidate causal combinations, so more substantive knowledge is obtained from domain experts (7). This additional substantive knowledge is incorporated into the fsQCA computations (8).

At the end of fsQCA one has a small collection of simplified if-then rules (9) that provide at least one simplified causal combination for a desired outcome (unless no such rule can be found). It is then possible to connect cases to each rule that are its best instances (10), and to compute the coverage (11) of the cases by each rule.

Fuzzy sets are used in some of the fsQCA steps because things are not always black and white; instead, they are a matter of degree.

The 13 computational steps of fsQCA are summarized in Figs. 2 and 3. The numbers on these figures are associated with the following 13 steps of fsQCA [3]:

(1) Choose a desired outcome;
(2) Choose k causal conditions;
(3) Treat the desired outcome and causal conditions as fuzzy sets, and determine MFs for all of them;
(4) Evaluate these MFs for all available cases, obtaining derived MFs;
(5) Create $2^k$ candidate causal combinations (rules);
(6) Compute the MF of each of these candidate causal combinations in all of the available cases, and keep only the $R_S$ surviving causal combinations that have a sufficient number of cases (f) whose MF values are > 0.5;
(7) Compute the subsethoods (consistencies) of these $R_S$ surviving causal combinations, and keep only those $R_A$ actual causal combinations whose subsethoods are ≥ 0.80 (a design parameter advocated by Ragin);
(8) Use the QM algorithm to obtain the complex solutions (summarizations) and the parsimonious solutions (see [3] for references for the QM algorithm);
(9) Perform Counterfactual Analysis on the complex solutions, constrained by the parsimonious solutions, to obtain the intermediate solutions;
(10) Perform QM on the intermediate solutions to obtain the simplified intermediate solutions;
(11) Retain only those simplified intermediate solutions whose subsethoods are approximately ≥ 0.80, the believable simplified intermediate solutions;
(12) Connect the believable simplified intermediate solutions with their best instances; and
(13) Compute the coverage of each solution (summarization).

In the rest of this paper we focus on some of these steps and how to speed them up.

II. A BRIEF SUMMARY OF STEPS 1-6 OF FSQCA

In this section, we provide a more quantitative description of the first six steps of fsQCA (Fig. 2) because this is needed in order to understand the rest of the paper. The six steps are:

Step 1. Choose a desired outcome, O, and its appropriate cases, $S_{Cases} = \{1, 2, \dots, N\}$.

[Figure 2: flowchart of Steps 1-6. Box labels: Choose Desired Outcome; Postulate k Causal Conditions; Obtain MFs; Compute Derived MFs (N Data Cases); Create $2^k$ Candidate Rules; Compute Firing Levels for $2^k$ Candidate Rules; Compute $N_{F_i}$; decision "$N_{F_i} > f$?" (Yes/No); Make the Causal Combination a Remainder; $R_S$ Firing-Level Surviving Rules. The flow continues at connector A.]

Fig. 2. Flowchart for fsQCA; it is continued in Fig. 3.

[Figure 3: flowchart of Steps 7-13, entered at connector A. Box labels: Derived MF for Desired Outcome; Compute $R_S$ Subsethoods; decision "≥ 0.80?" (Yes/No); $R_A$ Actual Rules; QM Algorithm; Prime Implicants (Complex Summarizations); Minimal Prime Implicants (Parsimonious Summarizations); Counterfactual Analysis; Intermediate Summarizations; QM Algorithm; Simplified Intermediate Summarizations; Believable Simplified Intermediate Summarizations; Compute Consistency, Best Instances and Coverages.]

Fig. 3. Flowchart for fsQCA, continued.

Step 2. Choose k causal conditions (if a condition is described by more than one term, treat each term as an independent causal condition), $S_C = \{C_i,\ i = 1, \dots, k\}$.

Step 3. Treat the desired outcome and causal conditions as fuzzy sets, and determine membership functions (MFs) for all of them, $\mu_O(\xi)$ and $\mu_{C_i}(\xi_i)$, $i = 1, \dots, k$.

Step 4. Evaluate these MFs for all available cases, the results being derived MFs, $\mu_O(\xi(x)) \equiv \mu_O^D(x)$ and $\mu_{C_i}(\xi_i(x)) \equiv \mu_{C_i}^D(x)$, $x = 1, \dots, N$.

Step 5. Create $2^k$ candidate causal combinations (rules) and view each as a possible corner in a $2^k$-dimensional vector space

$$S_F = \{F_1, \dots, F_{2^k}\}, \qquad F_j = A_1^j \wedge A_2^j \wedge \dots \wedge A_k^j, \qquad A_i^j = C_i \text{ or } c_i \tag{1}$$

where $c_i$ denotes the complement of $C_i$.

Step 6. Compute the MF of each of the $2^k$ candidate causal combinations in all of the available cases, and keep only the ones (the $R_S$ surviving causal combinations, or firing-level surviving rules) whose MF values are > 0.5 for an adequate number of cases, i.e., keep the causal combinations that are closer to corners and not the ones that are farther away from corners ($j = 1, \dots, 2^k$, $x = 1, \dots, N$ and $l = 1, \dots, R_S$; see [3] for explanations of these equations):

$$\mu_{F_j}: (S_F, S_{Cases}) \to [0,1], \qquad x \to \mu_{F_j}(x) = \min\{\mu_{A_1^j}(x), \mu_{A_2^j}(x), \dots, \mu_{A_k^j}(x)\} \tag{2}$$

$$\mu_{A_i^j}(x) = \mu_{C_i}^D(x) \ \text{ or } \ \mu_{c_i}^D(x) = 1 - \mu_{C_i}^D(x), \qquad i = 1, \dots, k \tag{3}$$

$$t_{F_j}: ([0,1], S_{Cases}) \to \{0,1\}, \qquad x \to t_{F_j}(x) = \begin{cases} 1 & \text{if } \mu_{F_j}(x) > 0.50 \\ 0 & \text{if } \mu_{F_j}(x) \le 0.50 \end{cases} \tag{4}$$

$$N_{F_j}: \{0,1\} \to I, \qquad t_{F_j} \to N_{F_j} = \sum_{x=1}^{N} t_{F_j}(x) \tag{5}$$

$$F_l^S: (S_F, I) \to S_F^S, \qquad F_j \to F_l^S = \{F_j \ (j \to l) \mid N_{F_j} \ge f,\ j = 1, \dots, 2^k\} \tag{6}$$
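To make the cost of Step 6 concrete, the following is a minimal Python sketch of the brute-force computation in (2)-(6), not the authors' implementation. The function name `brute_force_survivors`, the boolean-tuple encoding of a causal combination, and the data in `MU` are illustrative assumptions.

```python
from itertools import product


def brute_force_survivors(mu, f):
    """Brute-force Step 6. mu[x][i] is the derived MF of case x in C_i.

    Returns {combo: N_F} for every causal combination with N_F >= f, where
    combo is a tuple of booleans (True selects C_i, False selects c_i).
    """
    k = len(mu[0])
    survivors = {}
    for combo in product((True, False), repeat=k):  # all 2^k candidates
        count = 0
        for row in mu:
            # Eq. (3): use mu for C_i and 1 - mu for its complement c_i.
            terms = [m if keep else 1.0 - m for m, keep in zip(row, combo)]
            # Eqs. (2), (4), (5): firing level is the min; count it if > 0.5.
            if min(terms) > 0.5:
                count += 1
        # Eq. (6): keep combinations supported by at least f cases.
        if count >= f:
            survivors[combo] = count
    return survivors


# Hypothetical data: N = 4 cases, k = 3 causal conditions.
MU = [[0.9, 0.2, 0.7], [0.8, 0.1, 0.6], [0.3, 0.6, 0.4], [0.7, 0.3, 0.9]]
print(brute_force_survivors(MU, f=2))
```

The two nested loops visit every (combination, case) pair, which is the $2^k N$ count of firing-level evaluations.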

III. FROM $2^k N$ TO $R_S N$ FIRING-LEVEL COMPUTATIONS

Step 6 may be a computational bottleneck because it requires $2^k N$ computations of $\mu_{F_j}(x)$ in (2), and even for modest k (e.g., k = 8) and N (e.g., N = 200), $2^k N$ is large (e.g., 51,200). Ragin [6, p. 131] observed in an example with four causal conditions: “… each case can have (at most) only a single membership score greater than 0.5 in the logically possible combinations from a given set of causal conditions.” This is true in general, and how to find that single membership score greater than 0.5 is provided in the following:

Theorem 1 (min-max Theorem): Given k causal conditions, $C_1, C_2, \dots, C_k$, and their respective complements, $c_1, c_2, \dots, c_k$, form the $2^k$ causal combinations ($j = 1, \dots, 2^k$) $F_j = A_1^j \wedge A_2^j \wedge \dots \wedge A_k^j$, where $A_i^j = C_i$ or $c_i$ and $i = 1, \dots, k$. Let $\mu_{F_j}(x) = \min\{\mu_{A_1^j}(x), \mu_{A_2^j}(x), \dots, \mu_{A_k^j}(x)\}$, $x = 1, 2, \dots, N$. Then for each x there is only one j, denoted $j^*$, for which $\mu_{F_{j^*}}(x)$ can be > 0.5, and $\mu_{F_{j^*}}(x)$ can be computed as:

$$\mu_{F_{j^*}}(x) = \min\{\max(\mu_{C_1}(x), \mu_{c_1}(x)), \dots, \max(\mu_{C_k}(x), \mu_{c_k}(x))\} \tag{7}$$

$F_{j^*}(x)$ is determined from the right-hand side of (7), as:

$$F_{j^*}(x) = \arg\max(\mu_{C_1}(x), \mu_{c_1}(x)) \cdots \arg\max(\mu_{C_k}(x), \mu_{c_k}(x)) \equiv A_1^{j^*} \wedge \dots \wedge A_k^{j^*} \tag{8}$$

In (8), $\arg\max(\mu_{C_i}(x), \mu_{c_i}(x))$ denotes the winner of $\max(\mu_{C_i}(x), \mu_{c_i}(x))$, namely $C_i$ or $c_i$.
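As an illustration of (7) and (8), here is a minimal Python sketch that finds the winning causal combination and its firing level for a single case. The function name `winning_combination`, the single-letter condition names, the uppercase/lowercase labelling (uppercase for $C_i$, lowercase for $c_i$) and the tie rule at a membership of exactly 0.5 are assumptions made for this example.

```python
def winning_combination(row, names):
    """Return (label, firing_level) of the winner F_{j*}(x) for one case.

    row[i] is the derived MF of the case in C_i; names[i] is a one-letter
    name for C_i. Uppercase in the label means C_i, lowercase means c_i.
    """
    label = []
    level = 1.0
    for m, name in zip(row, names):
        comp = 1.0 - m  # MF of the complement c_i
        # Eq. (8): pick whichever of C_i, c_i has the larger MF.
        # Assumption: a tie (m == 0.5) is resolved in favour of C_i.
        label.append(name.upper() if m >= comp else name.lower())
        # Eq. (7): running min of the per-condition max values.
        level = min(level, max(m, comp))
    return "".join(label), level


label, level = winning_combination([0.9, 0.2, 0.7], "ABC")
print(label, level)
```

Only k comparisons are needed per case, so the N winners are found without forming any of the other $2^k - 1$ firing levels.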

Proof (see footnote 1): If $A_i^j = C_i$ and $\mu_{C_i}(x) = \min\{\mu_{C_i}(x), \mu_{c_i}(x)\}$, then it is true that $\mu_{c_i}(x) = \max\{\mu_{C_i}(x), \mu_{c_i}(x)\}$. If, instead, $\mu_{C_i}(x) = \max\{\mu_{C_i}(x), \mu_{c_i}(x)\}$, then it is true that $\mu_{c_i}(x) = \min\{\mu_{C_i}(x), \mu_{c_i}(x)\}$. Consequently, choosing $A_i^j = C_i$ or $c_i$ is equivalent to choosing $\mu_{A_i^j}(x)$ as either $\min\{\mu_{C_i}(x), \mu_{c_i}(x)\}$ or $\max\{\mu_{C_i}(x), \mu_{c_i}(x)\}$, so that rather than thinking about the $2^k$ causal combinations for each case in terms of $C_i$ and $c_i$, with their associated MFs $\mu_{C_i}(x)$ and $\mu_{c_i}(x)$, one can think about the $2^k$ causal combinations for each case in terms of the following new ordering of the $F_j$ and their MFs, $\mu_{F_j}(x)$:

$$\begin{aligned}
\mu_{F_1}(x) &= \min\{\min(\mu_{C_1}(x), \mu_{c_1}(x)), \dots, \min(\mu_{C_k}(x), \mu_{c_k}(x))\} \\
\mu_{F_2}(x) &= \min\{\min(\mu_{C_1}(x), \mu_{c_1}(x)), \dots, \max(\mu_{C_k}(x), \mu_{c_k}(x))\} \\
&\ \ \vdots \\
\mu_{F_{2^k}}(x) &= \min\{\max(\mu_{C_1}(x), \mu_{c_1}(x)), \dots, \max(\mu_{C_k}(x), \mu_{c_k}(x))\}
\end{aligned} \tag{9}$$

Because $\mu_{c_i}(x) = 1 - \mu_{C_i}(x)$, it is always true that

$$\min\{\mu_{C_i}(x), \mu_{c_i}(x)\} \le 0.5, \tag{10}$$

so when such terms are evaluated in (9) the first $2^k - 1$ terms, each of which contains at least one such inner min, will always have a MF value that is also ≤ 0.5. It is only the last term in (9) that can have a MF value that is > 0.5, and that term is the one in (7).

Observe that (8) is an immediate consequence of the last line of (9) [which is the same as (7)], and that $j^* = 2^k$ in this ordering. ∎

We actually were motivated to study this result when we created tabular results such as the ones by Ragin [5], [6], [8, Ch. 5] in which the rows are the N cases, the columns are the $2^k$ causal combinations, and the entries are the MF values for these causal combinations. When N and k became “too large” then showing such a table was difficult to do, because it no longer fit on a single page. Theorem 1 demonstrates that it is not necessary to show such a table because most of its $2^k N$ entries are unimportant. Only N entries are important, and this theorem locates them.

Footnote 1: The proof of Theorem 1 was provided by Ms. Jhiin Joo.

IV. FAST FSQCA

In Fast fsQCA, Steps 1-4 of fsQCA are unchanged, Steps 5 and 6 are changed and Steps 7-13 are also unchanged. Unlike fsQCA, where all $2^k$ firing level fuzzy sets must actually be established in Step 5, so that their MF values can be computed in Step 6 for all cases (which requires $2^k N$ computations), Fast fsQCA does not require that those $2^k$ firing level fuzzy sets actually be established; it only requires that the concept of a firing level fuzzy set be established. It is not until (14) in Step 6New below that a specific subset of all of the $2^k$ firing level fuzzy sets is computed for all N appropriate cases. This requires only $N R_S$ computations, which is a vastly smaller number than $2^k N$, which is why this is called “Fast fsQCA.” The new Steps 5 and 6 are:

Step 5New. Conceptually, create $2^k$ candidate causal combinations (rules) and view each as a corner in a $2^k$-dimensional vector space. Let $S_F$ be the finite space of $2^k$ candidate causal combinations, called (by us) firing level fuzzy sets, $F_j$, as in (1).

Step 6New. Compute the $R_S$ surviving causal combinations (firing level surviving rules), whose MF values are > 0.5 for an adequate number of cases (this must be specified by the user), i.e., keep the adequately represented causal combinations that are closer to corners and not the ones that are farther away from corners, and then compute the MF of each of the $R_S$ surviving causal combinations in all the appropriate N cases. This is a mapping from $\{S_C, S_{Cases}\}$ into $S_F^S$ that makes use of $\mu_{C_i}(x)$ and $\mu_{c_i}(x)$, where $S_F^S$ is a subset of $S_F$, with $R_S$ elements, i.e. $F_{j^*}(x)$ is given by (8) ($S_{F_{j^*}} = \{F_{j^*}(1), \dots, F_{j^*}(N)\}$), after which the J uniquely different $F_{j^*}(x)$ are relabeled $F_{j'}$ ($j' = 1, \dots, J$, and $S_{F_{j'}} = \{F_1, \dots, F_J\}$), and then the following are computed ($x = 1, \dots, N$, $j' = 1, \dots, J$ and $l = 1, \dots, R_S$):

$$t_{F_{j'}}: (S_{F_{j^*}}, S_{F_{j'}}, S_{Cases}) \to \{0,1\}, \qquad x \to t_{F_{j'}}(x) = \begin{cases} 1 & \text{if } F_{j'} = F_{j^*}(x) \\ 0 & \text{otherwise} \end{cases} \tag{11}$$

$$N_{F_{j'}}: \{0,1\} \to I, \qquad t_{F_{j'}} \to N_{F_{j'}} = \sum_{x=1}^{N} t_{F_{j'}}(x) \tag{12}$$

$$F_l^S: (S_F, I) \to S_F^S, \qquad F_{j'} \to F_l^S = \begin{cases} F_{j'} \ (j' \to l) & \text{if } N_{F_{j'}} \ge f \\ 0 & \text{if } N_{F_{j'}} < f \end{cases} \tag{13}$$

$$\mu_{F_l^S}: (S_F^S, S_{Cases}) \to [0,1], \qquad x \to \mu_{F_l^S}(x) = \min\{\mu_{A_1^l}(x), \mu_{A_2^l}(x), \dots, \mu_{A_k^l}(x)\} \tag{14}$$
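Steps 5New and 6New can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: `fast_survivors`, the letter-based labels (uppercase for $C_i$, lowercase for $c_i$) and the tie rule at a membership of exactly 0.5 are assumptions, and, following (11), a case is counted toward its winner without separately checking that the winner's firing level exceeds 0.5.

```python
from collections import Counter


def fast_survivors(mu, names, f):
    """Fast fsQCA Step 6New for derived MFs mu[x][i] and one-letter names.

    Returns (counts, firing): counts maps each distinct winning label to
    N_F, and firing maps each surviving label to its MF over all N cases.
    """
    # Eq. (8): one winning label per case (ties at 0.5 go to C_i).
    winners = [
        "".join(n.upper() if m >= 1.0 - m else n.lower()
                for m, n in zip(row, names))
        for row in mu
    ]
    # Eqs. (11)-(12): count how many cases each distinct winner attracts.
    counts = dict(Counter(winners))
    # Eq. (13): keep winners supported by at least f cases.
    surviving = [label for label, c in counts.items() if c >= f]
    # Eq. (14): firing levels of the R_S survivors in all N cases.
    firing = {
        label: [
            min(m if ch.isupper() else 1.0 - m for m, ch in zip(row, label))
            for row in mu
        ]
        for label in surviving
    }
    return counts, firing


# Hypothetical data: N = 4 cases, k = 3 causal conditions.
MU = [[0.9, 0.2, 0.7], [0.8, 0.1, 0.6], [0.3, 0.6, 0.4], [0.7, 0.3, 0.9]]
counts, firing = fast_survivors(MU, "ABC", f=2)
print(counts)
print(firing)
```

The work is one pass over the N cases to find the winners, followed by firing levels for only the $R_S$ survivors, which is the source of the $N R_S$ count.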

V. WHEN THE NUMBER OF CAUSAL CONDITIONS CHANGES

Sometimes one wants to perform fsQCA for different combinations of causal conditions, by either including more causal conditions into the mix of the original k causal conditions, or by removing some of the original k causal conditions. Presently, doing any of these things requires treating each modified set of causal conditions as a totally new fsQCA. The results in this section show that there are much easier and faster ways to perform fsQCA once it has already been performed for k causal conditions.

Observe in (8) that, e.g., $\arg\max(\mu_{C_1}(x), \mu_{c_1}(x))$ is unchanged whether there are one, two, three, etc. causal conditions. This means that, for each case, the winning causal combination for k causal conditions includes the winning causal combination for $k'$ causal conditions, when $k' < k$, and is contained in the winning causal combination for $k''$ causal conditions, when $k'' > k$. It also means that if one knows the winning causal combination for $k''$ causal conditions, where $k'' > k$, and one wants to know the winning causal combination for k causal conditions, one simply deletes the undesired $k'' - k$ causal conditions from the winning causal combination of $k''$ causal conditions.

For example, if AbcdE is a winning causal combination for case 1 when five causal conditions are used, and if one wants to eliminate causal conditions B and D, then AcE is the winning causal combination for case 1 when the three causal conditions A, C and E are used. No new computations have to be performed to obtain this result, because the winning causal combination for case 1 when the three causal conditions A, C and E are used is contained in the winning causal combination for case 1 when five causal conditions are used.
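The deletion in this example is a pure string operation. A small Python sketch, assuming the hypothetical helper name `drop_conditions` and the one-letter-per-condition labelling used in the example:

```python
def drop_conditions(winner, dropped):
    """Delete the letters of the dropped conditions from a winning label."""
    dropped = {d.upper() for d in dropped}
    # Compare case-insensitively so that both B and b refer to condition B.
    return "".join(ch for ch in winner if ch.upper() not in dropped)


print(drop_conditions("AbcdE", "BD"))  # AcE
```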

These observations suggest that there are both a forward recursion and a backward recursion for (7) and (8).

In what follows, it is assumed that the smallest number of causal conditions for which an fsQCA is performed is two.

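Corollary 1-1 (Forward Recursion). For each case, it is true that ($k = 3, 4, \dots$):

$$F_{j^*}(x \mid C_1, C_2, \dots, C_k) = F_{j^*}(x \mid C_1, C_2, \dots, C_{k-1}) \wedge \arg\max(\mu_{C_k}(x), \mu_{c_k}(x)) \tag{15}$$

$$\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, C_k) = \min\{\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, C_{k-1}),\ \max(\mu_{C_k}(x), \mu_{c_k}(x))\} \tag{16}$$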

These results, arguably for the first time, connect fsQCA firing-level calculations for k and k - 1 causal conditions, provide an entirely new way to perform fsQCA computations when one wishes to study different combinations of causal conditions on a desired outcome, and should lead to a vast reduction in computation time for such a study.

Proof: It is easy to prove both (15) and (16) by using (7), (8) and mathematical induction. This is left to the reader.
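A minimal Python sketch of the forward recursion (15)-(16) for one case follows. The name `extend_winner` and the letter-based labels (uppercase for the condition, lowercase for its complement, ties at 0.5 resolved toward the condition) are assumptions for illustration.

```python
def extend_winner(label, level, m_new, name):
    """Add one causal condition to an existing winner for a single case.

    label, level: winning combination and firing level for k - 1 conditions.
    m_new: the case's derived MF in the new condition C_k; name: its letter.
    """
    comp = 1.0 - m_new
    # Eq. (15): append the winner of the new condition to the old label.
    new_label = label + (name.upper() if m_new >= comp else name.lower())
    # Eq. (16): the new firing level is the min of the old level and
    # the max over the new condition and its complement.
    new_level = min(level, max(m_new, comp))
    return new_label, new_level


print(extend_winner("Ab", 0.8, 0.7, "c"))
```

Each added condition costs one comparison and one min per case, and nothing already computed for the first k - 1 conditions has to be redone.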

Corollary 1-2 (Backward Recursion). Let $\hat{C}_i$ denote the suppression of causal condition $C_i$. Then it is true that:

$$F_{j^*}(x \mid C_1, C_2, \dots, \hat{C}_i, \dots, C_k) = F_{j^*}(x \mid C_1, C_2, \dots, C_{i-1}, C_{i+1}, \dots, C_k) \tag{17}$$

Proof: Obvious from (8).

This backward recursion can also lead to a vast reduction in computation time. For example, if the winning causal combination $F_{j^*}(C_1, C_2, \dots, C_k)$ has been determined for six causal conditions (k = 6), then it can be used to establish the winning causal combination for any combination of five, four, three, or two of the causal conditions, by inspection!

No way has yet been determined for computing $\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, \hat{C}_i, \dots, C_k)$ from $\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, C_i, \dots, C_k)$. It seems that once $F_{j^*}(C_1, C_2, \dots, \hat{C}_i, \dots, C_k)$ has been determined from (17), $\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, \hat{C}_i, \dots, C_k)$ must be computed directly from (7), as ($l = 1, 2, \dots, R_S$):

$$\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, \hat{C}_i, \dots, C_k) = \min\{\max(\mu_{C_1}(x), \mu_{c_1}(x)), \dots, \max(\mu_{C_{i-1}}(x), \mu_{c_{i-1}}(x)),\ \max(\mu_{C_{i+1}}(x), \mu_{c_{i+1}}(x)), \dots, \max(\mu_{C_k}(x), \mu_{c_k}(x))\} \tag{18}$$
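The following Python sketch illustrates the backward direction for one case: the label follows by deletion as in (17), while the firing level is recomputed from the remaining conditions as in (18). The name `winner_without`, the one-letter condition names and the tie rule at 0.5 are assumptions, and the sketch assumes at least one condition is kept.

```python
def winner_without(row, names, dropped):
    """Winner and firing level for one case after suppressing conditions.

    row[i] is the derived MF in C_i, names[i] its one-letter name, and
    dropped the letters of the suppressed conditions (at least one kept).
    """
    dropped = {d.upper() for d in dropped}
    kept = [(m, n) for m, n in zip(row, names) if n.upper() not in dropped]
    # Eq. (17): the label is the full winner with the suppressed letters gone.
    label = "".join(n.upper() if m >= 1.0 - m else n.lower() for m, n in kept)
    # Eq. (18): the min-max of (7) taken over the remaining conditions only.
    level = min(max(m, 1.0 - m) for m, _ in kept)
    return label, level


print(winner_without([0.9, 0.45, 0.7], "ABC", ""))   # all three conditions
print(winner_without([0.9, 0.45, 0.7], "ABC", "B"))  # B suppressed
```

In this hypothetical case the full firing level is set by condition B alone, so it carries no information about the level once B is suppressed, which is why the remaining max terms have to be revisited.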

Corollary 1-3 (Firing Levels are Bounded). If $\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, C_{k_1})$ has been computed for $k_1$ causal conditions, and one now considers $k_2$ causal conditions, where $k_2 > k_1$, then

$$\mu_{F_{j^*}}(x \mid C_1, C_2, \dots, C_{k_2}) \le \mu_{F_{j^*}}(x \mid C_1, C_2, \dots, C_{k_1}) \tag{19}$$

This means that when new causal conditions are added to an existing set of causal conditions, the firing level for the new winning causal combination (which, by Corollary 1-1, contains the prior winning causal combination) can never be larger than the prior firing level, i.e. firing levels tend to become weakened when more causal conditions are included.

Proof: (19) is obvious from (16), because $\min(a, b) = a$ if $a \le b$, or $\min(a, b) = b$ if $b < a$.

VI. RECURSIVE COMPUTATION OF CONSISTENCY

Subsethood is the major computation of Step 7 of fsQCA [3]. That computation is ($l = 1, 2, \dots, R_S$):

$$ss_K(F_l^S, O) = \frac{\sum_{x=1}^{N} \min(\mu_{F_l^S}(x), \mu_O(x))}{\sum_{x=1}^{N} \mu_{F_l^S}(x)} \tag{20}$$

Additionally, in Step 7 only those $F_l^S$ whose subsethoods are greater than or equal to 0.80 are kept and become the so-called actual causal combinations, $F_m^A$, $m = 1, \dots, R_A$.
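As a concrete reading of (20), here is a minimal Python sketch. The function name `subsethood` and the sample values are assumptions; the sketch assumes the firing levels do not all equal zero, so the denominator is positive.

```python
def subsethood(mu_f, mu_o):
    """Eq. (20): consistency of a surviving combination with the outcome.

    mu_f[x] is the firing level of the combination in case x and mu_o[x]
    is the derived MF of the desired outcome in case x.
    """
    numerator = sum(min(a, b) for a, b in zip(mu_f, mu_o))
    denominator = sum(mu_f)  # assumed > 0
    return numerator / denominator


# Hypothetical firing levels and outcome memberships for N = 4 cases.
ss = subsethood([0.7, 0.6, 0.3, 0.7], [0.9, 0.5, 0.2, 0.8])
print(ss, ss >= 0.80)  # the second value says whether Step 7 keeps the rule
```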

Consider two populations, one of size $N_1$ and the other of size $N_2$, where the $N_1$ cases are contained in the $N_2$ cases. In order to show the dependency of $ss_K(F_l^S, O)$ on the population size, we use a conditioning notation, i.e.:

$$ss_K(F_l^S, O \mid N_1) = \frac{\sum_{x=1}^{N_1} \min(\mu_{F_l^S}(x), \mu_O(x))}{\sum_{x=1}^{N_1} \mu_{F_l^S}(x)} \tag{21}$$

$$ss_K(F_l^S, O \mid N_2) = \frac{\sum_{x=1}^{N_2} \min(\mu_{F_l^S}(x), \mu_O(x))}{\sum_{x=1}^{N_2} \mu_{F_l^S}(x)} \tag{22}$$

The following theorem shows how $ss_K(F_l^S, O \mid N_2)$ is computed recursively from $ss_K(F_l^S, O \mid N_1)$, which can also save computations.

Theorem 2. Suppose $N_2 > N_1$. $ss_K(F_l^S, O \mid N_2)$ is computed recursively from $ss_K(F_l^S, O \mid N_1)$, as:

$$ss_K(F_l^S, O \mid N_2) = \frac{1}{1 + \dfrac{\sum_{x=N_1+1}^{N_2} \mu_{F_l^S}(x)}{\sum_{x=1}^{N_1} \mu_{F_l^S}(x)}} \times ss_K(F_l^S, O \mid N_1) + \frac{\sum_{x=N_1+1}^{N_2} \min(\mu_{F_l^S}(x), \mu_O(x))}{\sum_{x=1}^{N_2} \mu_{F_l^S}(x)} \tag{23}$$

Note that $\sum_{x=1}^{N_2} \mu_{F_l^S}(x)$ can also be computed recursively, but we do not pursue that here. Equation (23), which is obtained from a simple algebraic decomposition of (22), lets us compute $ss_K(F_l^S, O \mid N_2)$ in two steps: (1) resize $ss_K(F_l^S, O \mid N_1)$ and (2) add in a correction due to the new $N_2 - N_1$ cases.
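The two-step update in (23) can be sketched in Python as follows. The function name `update_subsethood`, its argument layout and the sample numbers are assumptions; the running sum of firing levels is carried along so that only the new cases are touched.

```python
def update_subsethood(ss_n1, sum_f_n1, new_f, new_o):
    """Eq. (23): update the consistency when N2 - N1 new cases arrive.

    ss_n1: consistency over the first N1 cases; sum_f_n1: sum of firing
    levels over those cases (assumed > 0); new_f, new_o: firing levels and
    outcome MFs of the new cases. Returns (ss_n2, sum_f_n2).
    """
    add_f = sum(new_f)
    add_min = sum(min(a, b) for a, b in zip(new_f, new_o))
    sum_f_n2 = sum_f_n1 + add_f
    # Step (1): resize the old consistency; step (2): add the correction.
    ss_n2 = ss_n1 / (1.0 + add_f / sum_f_n1) + add_min / sum_f_n2
    return ss_n2, sum_f_n2


# Hypothetical numbers: old consistency 2.1/2.3 over N1 = 4 cases.
ss_n2, total = update_subsethood(2.1 / 2.3, 2.3, [0.8, 0.6], [0.1, 0.2])
print(ss_n2, total)
```

The cost of the update depends only on the number of new cases, not on $N_1$.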

Equation (23) allows us to examine the relative sizes of $ss_K(F_l^S, O \mid N_2)$ and $ss_K(F_l^S, O \mid N_1)$. To begin, let

$$\frac{\sum_{x=N_1+1}^{N_2} \min(\mu_{F_l^S}(x), \mu_O(x))}{\sum_{x=N_1+1}^{N_2} \mu_{F_l^S}(x)} \equiv ss_K(F_l^S, O \mid N_2 - N_1) \tag{24}$$

$ss_K(F_l^S, O \mid N_2 - N_1)$ is the subsethood of $F_l^S$ in O, but only for the new $N_2 - N_1$ cases. Then:

Corollary 2-1. It is true that

$$ss_K(F_l^S, O \mid N_2) \gtrless ss_K(F_l^S, O \mid N_1) \tag{25}$$

if and only if

$$ss_K(F_l^S, O \mid N_2 - N_1) \gtrless ss_K(F_l^S, O \mid N_1) \tag{26}$$

Proof: Will appear in the journal version of this paper.

It is clear from this corollary that new cases can cause a rule to be obliterated if $ss_K(F_l^S, O \mid N_2 - N_1) < ss_K(F_l^S, O \mid N_1)$. But how much smaller must $ss_K(F_l^S, O \mid N_2 - N_1)$ be than $ss_K(F_l^S, O \mid N_1)$ for a rule to be obliterated? This question is answered in our next section.

VII. ON THE OBLITERATION OF A RULE

Corollary 2-2. Suppose $ss_K(F_l^S, O \mid N_1) = 0.8 + \varepsilon(N_1)$, where $0 \le \varepsilon(N_1) \le 0.2$. Let

$$\frac{\sum_{x=1}^{N_1} \mu_{F_l^S}(x)}{\sum_{x=N_1+1}^{N_2} \mu_{F_l^S}(x)} \equiv \rho \tag{27}$$

Then (see footnote 2)

$$ss_K(F_l^S, O \mid N_2) < 0.8, \tag{28}$$

if and only if

$$ss_K(F_l^S, O \mid N_2 - N_1) < 0.8 - \varepsilon(N_1)\rho \tag{29}$$

which means that $F_l^S$ is obliterated. Another way to express (29) is [substitute $ss_K(F_l^S, O \mid N_1) - 0.8$ for $\varepsilon(N_1)$ in (29), assuming that $\varepsilon(N_1) > 0$, and then solve the resulting inequality for $\rho$]:

$$\rho < \frac{0.8 - ss_K(F_l^S, O \mid N_2 - N_1)}{ss_K(F_l^S, O \mid N_1) - 0.8} \tag{30}$$

Proof: Will appear in the journal version of this paper.

Observe that (29) and (30) provide constructive tests to establish whether a rule that has survived the consistency threshold based on $N_1$ cases will be obliterated by the additional $N_2 - N_1$ cases. This requires computing both sides of (29) [or (30)] to see if it is (or is not) satisfied. If it is satisfied, then (28) will be true and the lth rule will be obliterated. If it is not satisfied, then the lth rule will not be obliterated.

$\rho$ is a very interesting parameter; it is the ratio of the sums of firing levels, one for the original $N_1$ cases and the other for the additional $N_2 - N_1$ cases. To date, we have no other physical meaning for this parameter, and we also have no physical meaning for the right-hand side of (30).
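The test in (29) can be sketched in Python as below. The function name `is_obliterated`, the `threshold` parameter (0.8 by default) and the sample numbers are assumptions; the sketch assumes the new cases have a positive sum of firing levels, so the ratio in (27) and the subsethood in (24) are defined.

```python
def is_obliterated(ss_n1, sum_f_n1, new_f, new_o, threshold=0.8):
    """Test (29): do the new cases push the consistency below the threshold?

    ss_n1 and sum_f_n1 describe the original N1 cases; new_f and new_o are
    the firing levels and outcome MFs of the additional N2 - N1 cases.
    """
    add_f = sum(new_f)  # assumed > 0
    excess = ss_n1 - threshold          # amount by which the rule passed
    ratio = sum_f_n1 / add_f            # eq. (27)
    ss_new = sum(min(a, b) for a, b in zip(new_f, new_o)) / add_f  # eq. (24)
    return ss_new < threshold - excess * ratio  # eq. (29)


# Hypothetical rule with consistency 2.1/2.3 over the original cases.
print(is_obliterated(2.1 / 2.3, 2.3, [0.8, 0.6], [0.1, 0.2]))
print(is_obliterated(2.1 / 2.3, 2.3, [0.8, 0.6], [0.7, 0.5]))
```

In the first call the new cases fire the rule strongly but have low outcome membership, so the combined consistency falls to 2.4/3.7 and the rule is lost; in the second call the combined consistency is 3.3/3.7 and the rule is kept.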

The results in this section cause one to think about the cases that are/will be used in fsQCA. Ragin [7] has a lot to say about cases and their choice (see also [2, Step 1]). If one insists on always using all of the cases for every desired outcome, then it is quite possible that important rules will be obliterated by those cases for which their MF value in the desired outcome is zero. On the other hand, if one includes cases that seem to support the desired outcome, then the likelihood of obliterating an important rule will be greatly reduced. So, the choice of cases should be made in Step 1 in a way that actively supports the goal of learning which causal combinations support (are the cause of) a desired outcome.

Footnote 2: The maximum value of consistency is 1. Ragin’s consistency threshold is 0.8. It is easy to modify the results in Corollary 2-2 for another value of this threshold; simply replace 0.8 by that value.

VIII. CONCLUSIONS

In this paper we have provided various ways to speed up the computations within fsQCA. The min-max Theorem 1 is by far our most useful result and is one that we now use all of the time. Doing so has led to a modification of Steps 5 and 6 in fsQCA [3] to Steps 5New and 6New in “Fast fsQCA.”

By using the recursive formula for consistency we have not only been able to speed up its computations, but have also gained an understanding about when a rule can be obliterated, which has helped us in choosing cases.

The embedding corollaries are very useful in that they also provide insights about fsQCA.

All of the results in this paper were possible only after we quantified fsQCA. Perhaps there is more to be learned about fsQCA from such quantification.

Finally, although fsQCA can be described as a collection of 13 steps, and although some of the steps can be made more efficient by using the results from this paper, there are still some challenges to using fsQCA for engineering or computer science applications, including obtaining the MFs, the robustness of fsQCA to its design parameters and to the substantive knowledge that is used in counterfactual analysis. These challenges as well as others are the subject of our present research, e.g., see [1].

ACKNOWLEDGEMENT

This study was funded by the Center of Excellence for Research and Academic Training on Interactive Smart Oilfield Technologies (CiSoft); CiSoft is a joint University of Southern California-Chevron initiative.

REFERENCES

[1] M. M. Korjani and J. M. Mendel, “Fuzzy Set Qualitative Comparative Analysis (fsQCA): challenges and application,” Proc. NAFIPS 2012, Berkeley, CA, August 2012.

[2] J. M. Mendel, “On Charles Ragin’s Fuzzy Set Qualitative Comparative Analysis (fsQCA),” Proc. of IEEE World Conference on Soft Computing, San Francisco, CA, May 2011.

[3] J. M. Mendel and M. Korjani, “Charles Ragin’s Fuzzy Set Qualitative Comparative Analysis (fsQCA) used for linguistic summarizations,” Information Sciences, 2012.

[4] C. C. Ragin, The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies, University of California Press, Berkeley, Los Angeles and London, 1987.

[5] C. C. Ragin, Fuzzy-set Social Science, Univ. of Chicago Press, Chicago, IL, 2000.

[6] C. C. Ragin, Redesigning Social Inquiry: Fuzzy Sets and Beyond, Univ. of Chicago Press, Chicago, IL, 2008.

[7] C. C. Ragin, “Reflections on casing and case-oriented research,” in D. Byrne and C. Ragin (eds.), The SAGE Handbook of Case-Based Methods, Chapter 31, pp. 522-534, SAGE Publications, Inc., Thousand Oaks, CA, 2009.

[8] B. Rihoux and C. C. Ragin (eds.), Configurational Comparative Methods: Qualitative Comparative Analysis (QCA) and Related Techniques, SAGE, Los Angeles, CA, 2009.

