Specificity in scikit-learn

I need specificity for my classification, which is defined as TN/(TN+FP).

I am writing a custom scorer function:

from sklearn.metrics import make_scorer
def specificity_loss_func(ground_truth, predictions):
    print(predictions)
    tp, tn, fn, fp = 0.0, 0.0, 0.0, 0.0
    for l, m in enumerate(ground_truth):
        if m == predictions[l] and m == 1:
            tp += 1
        if m == predictions[l] and m == 0:
            tn += 1
        if m != predictions[l] and m == 1:
            fn += 1
        if m != predictions[l] and m == 0:
            fp += 1
    return tn / (tn + fp)

score = make_scorer(specificity_loss_func, greater_is_better=True)

Then,

from sklearn.dummy import DummyClassifier
clf_dummy = DummyClassifier(strategy='most_frequent', random_state=0)
ground_truth = [0,0,1,0,1,1,1,0,0,1,0,0,1]
p  = [0,0,0,1,0,1,1,1,1,0,0,1,0]
clf_dummy = clf_dummy.fit(ground_truth, p)
score(clf_dummy, ground_truth, p)

When I run these commands, I get p printed as:

[0 0 0 0 0 0 0 0 0 0 0 0 0]
1.0

Why is my p changing to a series of zeros when I input p = [0,0,0,1,0,1,1,1,1,0,0,1,0]?

You could get specificity from the confusion matrix. For a binary classification problem, it would be something like:

from sklearn.metrics import confusion_matrix
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 1]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn+fp)
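To use this as a scorer, the confusion-matrix calculation can be wrapped in a function and passed to make_scorer. The following is only a sketch: the names specificity_score and specificity_scorer are made up for illustration, and labels=[0, 1] is an added assumption that the classes are encoded as 0 and 1. Specificity is also the recall of the negative class, so recall_score with pos_label=0 should give the same number.

```python
from sklearn.metrics import confusion_matrix, make_scorer, recall_score

def specificity_score(y_true, y_pred):
    # Hypothetical helper name. labels=[0, 1] keeps the matrix 2x2 even
    # when one of the classes does not appear in y_true or y_pred.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return tn / (tn + fp)

# Scorer object with the interface scorer(estimator, X, y).
specificity_scorer = make_scorer(specificity_score, greater_is_better=True)

y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 1]

spec = specificity_score(y_true, y_pred)
# Same quantity computed as recall of class 0.
spec_via_recall = recall_score(y_true, y_pred, pos_label=0)
print(spec, spec_via_recall)
```

The resulting specificity_scorer can be passed wherever scikit-learn accepts a scoring callable, for example as the scoring argument of cross_val_score.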

First of all, you need to know that:

DummyClassifier(strategy='most_frequent'...

will give you a classifier that always returns the most frequent label from your training set. It does not take the samples in X into consideration at all. You can pass anything in place of ground_truth in this line:

clf_dummy = clf_dummy.fit(ground_truth, p)

The result of training and the predictions will stay the same, because the majority of the labels in p are 0 (7 zeros versus 6 ones).
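A small sketch of this behaviour, assuming two made-up feature arrays X_a and X_b (they are not from the question): the classifier is fitted on the same labels p with different features, and both fits predict the majority label for every sample.

```python
from sklearn.dummy import DummyClassifier

p = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0]  # 7 zeros, 6 ones

# Two hypothetical feature arrays with completely different contents.
X_a = [[0] for _ in p]
X_b = [[i] for i in range(len(p))]

clf_a = DummyClassifier(strategy='most_frequent').fit(X_a, p)
clf_b = DummyClassifier(strategy='most_frequent').fit(X_b, p)

# Both classifiers ignore the features and return the majority label.
pred_a = clf_a.predict(X_a).tolist()
pred_b = clf_b.predict(X_b).tolist()
print(pred_a)
print(pred_b)
```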

Second, make_scorer returns a function with the interface scorer(estimator, X, y). This function calls the estimator's predict method on X and then evaluates your specificity function on y and the predicted labels.

So it calls clf_dummy.predict on whatever you pass as X (it does not matter which data, it will always return 0) and gets a vector of 0s. It then computes your specificity function between y (p in your call) and those predictions. The predictions are all 0 because 0 was the majority class in the training set. Your score equals 1 because there are no false positive predictions, so TN/(TN+FP) reduces to TN/TN.
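The steps the scorer performs can be reproduced by hand. This is a sketch under assumptions, not scikit-learn's internal code: the specificity helper below is a simplified stand-in for the function in the question, and X is an illustrative 2D feature array.

```python
from sklearn.dummy import DummyClassifier
from sklearn.metrics import make_scorer

def specificity(y_true, y_pred):
    # Simplified stand-in for the question's function: TN / (TN + FP).
    tn = sum(1 for t, q in zip(y_true, y_pred) if t == 0 and q == 0)
    fp = sum(1 for t, q in zip(y_true, y_pred) if t == 0 and q == 1)
    return tn / (tn + fp)

X = [[0], [0], [1], [0], [1], [1], [1], [0], [0], [1], [0], [0], [1]]
y = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0]

clf = DummyClassifier(strategy='most_frequent').fit(X, y)
scorer = make_scorer(specificity, greater_is_better=True)

# What the scorer returns ...
via_scorer = scorer(clf, X, y)

# ... and the same two steps done manually: predict on X, then score against y.
predictions = clf.predict(X)
by_hand = specificity(y, predictions)
print(via_scorer, by_hand)
```

Both values come out the same because the scorer only wraps the predict call and the metric call.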

I corrected your code to make it easier to follow, passing a proper 2D feature array X:

from sklearn.dummy import DummyClassifier
clf_dummy = DummyClassifier(strategy='most_frequent', random_state=0)
X = [[0],[0],[1],[0],[1],[1],[1],[0],[0],[1],[0],[0],[1]]
p  = [0,0,0,1,0,1,1,1,1,0,0,1,0]
clf_dummy = clf_dummy.fit(X, p)
score(clf_dummy, X, p)
