Evaluating Models with Multiple Metrics in Scikit-learn
郝伟 2021/07/20

1. Introduction

This article uses four Scikit-learn functions for evaluating the performance of a classification model:

  • sklearn.metrics.accuracy_score
  • sklearn.metrics.precision_score
  • sklearn.metrics.recall_score
  • sklearn.metrics.f1_score

The names describe what each function computes. Basic usage looks like this:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(y_true, y_pred, avgtype='binary'):
    '''Evaluate predictions with four metrics: accuracy, precision, recall and F1.'''
    # Banner line showing the number of samples
    print(' datasize= {} '.format(len(y_true)).center(60, '*'))
    print('acc=', accuracy_score(y_true, y_pred))
    print('pre=', precision_score(y_true, y_pred, average=avgtype))
    print('rec=', recall_score(y_true, y_pred, average=avgtype))
    print('f1=', f1_score(y_true, y_pred, average=avgtype))

# Binary labels: 1 is the positive class, 0 the negative class
y_true = [1,1,0,0,1,1,1,1,1,0,0,0,0,0,1,1,0]
y_pred = [1,0,0,1,0,1,1,1,1,0,0,0,0,0,1,1,0]
evaluate(y_true, y_pred)

The output is:

*********************** datasize= 17 ***********************
acc= 0.8235294117647058
pre= 0.875
rec= 0.7777777777777778
f1= 0.823529411764706
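
As a rough check, the binary scores above can be reproduced by hand from the true-positive, false-positive and false-negative counts. The sketch below uses plain Python and treats label 1 as the positive class, which matches the default pos_label=1 of the Scikit-learn functions; the variable names are illustrative only.

```python
# Same binary labels as in the example above
y_true = [1,1,0,0,1,1,1,1,1,0,0,0,0,0,1,1,0]
y_pred = [1,0,0,1,0,1,1,1,1,0,0,0,0,0,1,1,0]

pairs = list(zip(y_true, y_pred))

# Count outcomes with label 1 as the positive class
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
precision = tp / (tp + fp)          # of all predicted positives, how many are right
recall = tp / (tp + fn)             # of all true positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print('tp, fp, fn =', tp, fp, fn)   # 7, 1, 2
print('acc=', accuracy)
print('pre=', precision)
print('rec=', recall)
print('f1=', f1)
```

With 7 true positives, 1 false positive and 2 false negatives, precision is 7/8 = 0.875 and recall is 7/9 ≈ 0.778, in line with the values printed by Scikit-learn.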

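However, the default setting (average='binary') only suits binary labels. With multiclass labels, precision_score, recall_score and f1_score raise a ValueError unless a different average is chosen.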
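
A minimal sketch of that failure, assuming a small made-up set of three-class labels (the lists below are illustrative, not data from this article): accuracy_score still works, while precision_score with its default average rejects the multiclass target.

```python
from sklearn.metrics import accuracy_score, precision_score

# Hypothetical three-class labels, for illustration only
y_true = [1, 1, 0, 2, 0, 2]
y_pred = [2, 1, 0, 2, 0, 2]

# accuracy_score has no average parameter and handles multiclass labels directly
acc = accuracy_score(y_true, y_pred)
print('acc=', acc)

# precision_score defaults to average='binary', which is invalid for three classes
try:
    precision_score(y_true, y_pred)
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)
```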

2. Multiclass data

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(y_true, y_pred, avgtype='binary'):
    '''Evaluate predictions with four metrics: accuracy, precision, recall and F1.'''
    # Banner line showing the number of samples
    print(' datasize= {} '.format(len(y_true)).center(60, '*'))
    print('acc=', accuracy_score(y_true, y_pred))
    print('pre=', precision_score(y_true, y_pred, average=avgtype))
    print('rec=', recall_score(y_true, y_pred, average=avgtype))
    print('f1=', f1_score(y_true, y_pred, average=avgtype))

# Multiclass labels take three values: 0, 1 and 2
y_true = [1,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
y_pred = [2,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
# average=None returns one score per class instead of a single number
evaluate(y_true, y_pred, avgtype=None)

The output is:

*********************** datasize= 20 ***********************
acc= 0.95
pre= [1.      1.          0.83333333]
rec= [1.      0.875       1.        ]
f1=  [1.      0.93333333  0.90909091]

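With multiclass data, the metrics follow these rules:

  • Each distinct label value is treated as one class; in this example the values 0, 1 and 2 form three classes.
  • A prediction counts as correct when it equals the true label, and as an error otherwise.

With average=None, pre, rec and f1 are therefore each an array of three values, giving the scores for classes 0, 1 and 2 in that order.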
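
One way to see where the per-class numbers come from is to derive them from the confusion matrix, treating each class in turn as "positive" and the rest as "negative". The sketch below is an illustration under that one-vs-rest reading, not a description of Scikit-learn's internals; it uses sklearn.metrics.confusion_matrix and NumPy on the same labels as above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [1,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
y_pred = [2,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]

# Rows are true labels, columns are predicted labels
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm)

tp = np.diag(cm)                  # correct predictions for each class
precision = tp / cm.sum(axis=0)   # column sums = how often each class was predicted
recall = tp / cm.sum(axis=1)      # row sums = how often each class truly occurs
f1 = 2 * precision * recall / (precision + recall)

print('pre=', precision)
print('rec=', recall)
print('f1=', f1)
```

The single error (a true 1 predicted as 2) lowers the recall of class 1 to 7/8 and the precision of class 2 to 5/6, which matches the arrays printed earlier.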

3. Averaging options

According to the official documentation for precision_score, the average parameter accepts the following values:

  • 'binary' (default): Report results only for the class specified by pos_label. Applicable only when the targets (y_true and y_pred) are binary.
  • None: Return the score of each class separately, with no averaging.
  • 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives across all classes.
  • 'macro': Calculate metrics for each label and take their unweighted mean. This does not take label imbalance into account.
  • 'weighted': Calculate metrics for each label and average them weighted by support (the number of true instances for each label), so classes with more true samples contribute more. This alters 'macro' to account for label imbalance; it can result in an F-score that is not between precision and recall.
  • 'samples': Calculate metrics for each instance and average them (only meaningful for multilabel classification, where this differs from accuracy_score).
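
To make the differences concrete, the following sketch compares 'micro', 'macro' and 'weighted' on the three-class labels from section 2, then shows 'samples' on a tiny multilabel example. The multilabel arrays Y_true and Y_pred are made up for illustration and are not part of the original data.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Three-class labels from section 2
y_true = [1,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
y_pred = [2,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]

scores = {}
for avg in ('micro', 'macro', 'weighted'):
    scores[avg] = (
        precision_score(y_true, y_pred, average=avg),
        recall_score(y_true, y_pred, average=avg),
        f1_score(y_true, y_pred, average=avg),
    )
    print('{:<9} pre={:.4f} rec={:.4f} f1={:.4f}'.format(avg, *scores[avg]))

# Hypothetical multilabel data: each row is one sample, each column one label
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])
Y_pred = np.array([[1, 0, 0],
                   [0, 1, 1]])

# 'samples' scores each row separately, then averages over rows
samples_pre = precision_score(Y_true, Y_pred, average='samples')
print('samples   pre={:.4f}'.format(samples_pre))
```

On this data the micro-averaged precision is 0.95 (the same as the accuracy, since every sample has exactly one label), the macro average is about 0.9444, and the weighted average is about 0.9583. In the multilabel example the first row has precision 1/1 and the second 1/2, so the 'samples' average is 0.75.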
