python - Scikit-learn: precision_recall_fscore_support returns strange results
I am doing text mining/classification and am trying to evaluate performance with the precision_recall_fscore_support function from the sklearn.metrics module. I am not sure how to create a small example that reproduces the problem, but maybe someone can help anyway because I am missing something obvious.
The aforementioned function returns, among other things, the support for each class. The documentation states:
support: int (if average is not None) or array of int, shape = [n_unique_labels]: the number of occurrences of each label in y_true.
But in my case, the number of classes for which support is returned is not the same as the number of different classes in the testing data.
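import numpy as np
from sklearn import svm
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# x, y: feature matrix and class labels (not shown here)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.5)
classifier = svm.SVC(kernel="linear")
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)
prec, rec, fbeta, supp = precision_recall_fscore_support(y_test, y_pred)
print(len(classifier.classes_))  # prints 18
print(len(supp))  # prints 19
print(len(np.unique(y_test)))  # prints 18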
How can this be? How can there be support for a class that is not in the data?
I am not sure what the problem is, but in my case there seems to be a mismatch between the classes learned by the classifier and the ones occurring in the test data. One can force the function to compute the performance measures for the right classes by explicitly naming them.
prec, rec, fbeta, supp = precision_recall_fscore_support(y_test, y_pred, labels=classifier.classes_)
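For what it is worth, here is a minimal sketch of how such a mismatch can arise. When labels is not given, the function uses every label that appears in either y_true or y_pred, so a class that is only ever predicted still gets an entry (with support 0). The toy arrays below are made up for illustration and are not the data from the question; zero_division=0 is only there to silence the warning about the undefined recall and assumes a reasonably recent scikit-learn version.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical toy labels: class 3 is predicted once but never occurs in y_true.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 3, 2, 2])

# Without `labels`, the label set is taken from y_true and y_pred together.
_, _, _, supp_default = precision_recall_fscore_support(
    y_true, y_pred, zero_division=0
)

# With `labels`, only the explicitly named classes are reported.
true_labels = np.unique(y_true)
_, _, _, supp_fixed = precision_recall_fscore_support(
    y_true, y_pred, labels=true_labels, zero_division=0
)

print(len(true_labels), len(supp_default), len(supp_fixed))  # prints 3 4 3
print(supp_default)  # the last entry is 0: class 3 never occurs in y_true
```

Note that passing labels=classifier.classes_ restricts the report to the classes seen during training. If the test set may contain classes that were absent from the training split, np.unique(y_test) might be the better choice, depending on which classes you want to evaluate.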