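Multiclass Metric Evaluation for Models in Scikit-learn Hao Wei 2021/07/20 [TOC]
1. Introduction
Scikit-learn provides, among others, four functions for evaluating the performance of a classification model: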
sklearn.metrics.accuracy_score
sklearn.metrics.precision_score
sklearn.metrics.recall_score
sklearn.metrics.f1_score
The names are self-explanatory. They are used as follows:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(y_true, y_pred, avgtype='binary'):
    '''Evaluate the predictions with four metrics.'''
    # Header line showing the number of samples, centered in a row of asterisks
    print(' datasize= {} '.format(len(y_true)).center(60, '*'))
    print('acc=', accuracy_score(y_true, y_pred))
    print('pre=', precision_score(y_true, y_pred, average=avgtype))
    print('rec=', recall_score(y_true, y_pred, average=avgtype))
    print('f1=', f1_score(y_true, y_pred, average=avgtype))

# Binary labels: 1 is the positive class
y_true = [1,1,0,0,1,1,1,1,1,0,0,0,0,0,1,1,0]
y_pred = [1,0,0,1,0,1,1,1,1,0,0,0,0,0,1,1,0]
evaluate(y_true, y_pred)
The output is:
*********************** datasize= 17 ***********************
acc= 0.8235294117647058
pre= 0.875
rec= 0.7777777777777778
f1= 0.823529411764706
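To see where these numbers come from, the following minimal sketch recomputes them by hand from the counts of true positives, false positives, false negatives, and true negatives. The helper name `binary_counts` is hypothetical and is not part of Scikit-learn; the formulas are the standard definitions of the four metrics.

```python
# Same data as in the example above; label 1 is treated as the positive class.
y_true = [1,1,0,0,1,1,1,1,1,0,0,0,0,0,1,1,0]
y_pred = [1,0,0,1,0,1,1,1,1,0,0,0,0,0,1,1,0]

def binary_counts(y_true, y_pred, pos_label=1):
    """Hypothetical helper: count TP, FP, FN, TN for one positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn

tp, fp, fn, tn = binary_counts(y_true, y_pred)

acc = (tp + tn) / len(y_true)     # share of all predictions that are correct
pre = tp / (tp + fp)              # share of predicted positives that are right
rec = tp / (tp + fn)              # share of actual positives that were found
f1 = 2 * pre * rec / (pre + rec)  # harmonic mean of precision and recall

print(tp, fp, fn, tn)  # 7 1 2 7
print(round(acc, 4), round(pre, 4), round(rec, 4), round(f1, 4))
```

With 7 true positives, 1 false positive, and 2 false negatives, precision is 7/8 and recall is 7/9, which agrees with the Scikit-learn output above.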
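However, the default setting (average='binary') only suits binary (two-class) labels. With multiclass labels, precision_score, recall_score, and f1_score raise a ValueError unless a different average is chosen.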
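As a quick check of this limitation, the sketch below calls `precision_score` with its default `average='binary'` on a small three-class sample. The six-element data is made up for illustration only.

```python
from sklearn.metrics import accuracy_score, precision_score

# Illustrative three-class labels (0, 1, 2); not taken from the article's data.
y_true = [1, 1, 0, 2, 0, 2]
y_pred = [2, 1, 0, 2, 0, 2]

# accuracy_score has no average parameter and works for any number of classes.
acc = accuracy_score(y_true, y_pred)
print('acc=', acc)

# precision_score defaults to average='binary', which is rejected
# when the targets contain more than two classes.
try:
    precision_score(y_true, y_pred)
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)
```

The error message lists the average settings that are valid for multiclass targets, which is the subject of the next sections.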
2. Multiclass data
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(y_true, y_pred, avgtype='binary'):
    '''Evaluate the predictions with four metrics.'''
    # Header line showing the number of samples, centered in a row of asterisks
    print(' datasize= {} '.format(len(y_true)).center(60, '*'))
    print('acc=', accuracy_score(y_true, y_pred))
    print('pre=', precision_score(y_true, y_pred, average=avgtype))
    print('rec=', recall_score(y_true, y_pred, average=avgtype))
    print('f1=', f1_score(y_true, y_pred, average=avgtype))

# Multiclass labels take three values: 0, 1, 2
y_true = [1,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
y_pred = [2,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
# average=None returns one score per class instead of a single number
evaluate(y_true, y_pred, avgtype=None)
The output is:
*********************** datasize= 20 ***********************
acc= 0.95
pre= [1.         1.         0.83333333]
rec= [1.    0.875 1.   ]
f1= [1.         0.93333333 0.90909091]
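For multiclass data, the rules are:
- Each distinct label value is one class; in this example the values 0, 1, and 2 give three classes.
- A prediction is correct if it equals the true label and wrong otherwise.
Each metric is computed per class by treating that class as positive and all other classes as negative. So with average=None, pre, rec, and f1 are each returned as an array of three values, corresponding to classes 0, 1, and 2 in that order.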
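The per-class values can also be reproduced from a confusion matrix, which may make the one-class-versus-the-rest reasoning easier to follow. This is a minimal sketch using `sklearn.metrics.confusion_matrix` and NumPy. The variable names `pre_manual` and `rec_manual` are chosen here for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [1,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
y_pred = [2,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]

# Rows are true classes, columns are predicted classes, both ordered 0, 1, 2.
cm = confusion_matrix(y_true, y_pred)

tp = np.diag(cm)                   # correct predictions for each class
pre_manual = tp / cm.sum(axis=0)   # divide by how often each class was predicted
rec_manual = tp / cm.sum(axis=1)   # divide by how often each class truly occurs

print(cm)
# Both comparisons should report agreement with average=None.
print(np.allclose(pre_manual, precision_score(y_true, y_pred, average=None)))
print(np.allclose(rec_manual, recall_score(y_true, y_pred, average=None)))
```

The single off-diagonal entry is the one sample of class 1 that was predicted as class 2. It lowers the recall of class 1 to 7/8 and the precision of class 2 to 5/6, while class 0 is unaffected.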
3. Averaging options
According to the official documentation for precision_score, the average parameter accepts the following values:
- 'binary' (default): plain binary classification. Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.
- None: no averaging; the score of each class is computed and returned separately.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': a weighted mean of the per-class scores, in which classes with more true instances count more. Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters 'macro' to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).
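To compare the averaging modes concretely, the sketch below applies 'micro', 'macro', and 'weighted' to the three-class data from section 2. The `results` dictionary is just a convenience introduced here. 'samples' is left out because it applies only to multilabel targets.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]
y_pred = [2,1,0,2,0,2,0,1,1,1,2,1,0,0,2,0,0,1,1,2]

# Collect (precision, recall, f1) for each averaging mode.
results = {}
for avg in ('micro', 'macro', 'weighted'):
    results[avg] = (
        float(precision_score(y_true, y_pred, average=avg)),
        float(recall_score(y_true, y_pred, average=avg)),
        float(f1_score(y_true, y_pred, average=avg)),
    )
    print(avg, [round(v, 4) for v in results[avg]])
```

On this data, 'micro' gives 0.95 for all three metrics, the same as the accuracy, because every sample has exactly one label. The 'macro' precision is the plain mean of the per-class values, (1 + 1 + 0.8333) / 3 ≈ 0.9444. The 'weighted' precision uses the class supports 7, 8, and 5 as weights, (7·1 + 8·1 + 5·0.8333) / 20 ≈ 0.9583.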