Getting the APvt, APt, APs, and APm metrics with aitodpycocotools

Lenardo_00 · Published 2024-10-19 07:00:00

Evaluation code

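I recently came across these metrics in a paper and looked into how they are computed.

They come from aitodpycocotools. The official GitHub repository is jwwangchn/cocoapi-aitod (described there as "COCO API - Dataset @ http://cocodataset.org/").

Evaluating with aitodpycocotools works much like evaluating with pycocotools, so first look at how pycocotools produces its metrics (adapted from the cocoapi evaluation demo: https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb):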

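from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Load the ground-truth annotations (COCO format).
coco_true = COCO(annotation_file='/path/to/annotation.json')
# Load the detections; loadRes returns a COCO object tied to the ground truth.
coco_pre = coco_true.loadRes('/path/to/prediction.json')

cocoevaluator = COCOeval(cocoGt=coco_true, cocoDt=coco_pre, iouType="bbox")
cocoevaluator.evaluate()    # match detections to ground truth per image
cocoevaluator.accumulate()  # accumulate the per-image results
cocoevaluator.summarize()   # print the AP/AR table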

Evaluating with aitodpycocotools is the same; only the two import lines change:

# Same calls as before; only the package name in the imports differs.
from aitodpycocotools.coco import COCO
from aitodpycocotools.cocoeval import COCOeval

coco_true = COCO(annotation_file='/mnt/sdb2/ray/AI-TOD/annotations/aitodv2_test.json')
coco_pre = coco_true.loadRes('output/tod/prediction.json')
cocoevaluator = COCOeval(cocoGt=coco_true, cocoDt=coco_pre, iouType="bbox")
cocoevaluator.evaluate()
cocoevaluator.accumulate()
cocoevaluator.summarize()

Generating prediction.json

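In the code above, prediction.json is the only file you have to generate yourself. It is a JSON list of dictionaries, one per predicted box, in the following format (see the results format described on the COCO website, COCO - Common Objects in Context):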

[{"image_id": int, 
  "category_id": int, 
  "bbox": [x,y,width,height], 
  "score": float}, {......}, ...]

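Before handing the file to loadRes, it can help to check that every entry really has this shape. The following is only a minimal sketch under my own assumptions: check_predictions and REQUIRED_KEYS are hypothetical helper names, not part of pycocotools or aitodpycocotools, and the checks cover only the four fields listed above.

```python
REQUIRED_KEYS = {"image_id", "category_id", "bbox", "score"}


def check_predictions(predictions):
    """Return a list of human-readable problems found in a predictions list."""
    if not isinstance(predictions, list):
        return ["top-level JSON value must be a list"]
    problems = []
    for i, det in enumerate(predictions):
        if not isinstance(det, dict):
            problems.append(f"entry {i}: not a dictionary")
            continue
        missing = REQUIRED_KEYS - det.keys()
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
            continue
        bbox = det["bbox"]
        if len(bbox) != 4:
            problems.append(f"entry {i}: bbox must have 4 values, got {len(bbox)}")
            continue
        _, _, w, h = bbox
        if w <= 0 or h <= 0:
            problems.append(f"entry {i}: non-positive width or height in {bbox}")
    return problems


# Hypothetical example entries, made up for illustration.
good = [{"image_id": 1, "category_id": 3, "bbox": [10.0, 20.0, 8.0, 6.0], "score": 0.9}]
bad = [{"image_id": 1, "category_id": 3, "bbox": [10.0, 20.0, -8.0, 6.0], "score": 0.9}]
print(check_predictions(good))
print(check_predictions(bad))
```

In practice the list would come from json.load on the generated prediction.json.
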
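Note that x and y in bbox are the coordinates of the box's top-left corner (xmin, ymin), not its center. Taking the DETR-style model I use as an example, it produces 300 predicted boxes for every image in the test set; writing these four fields for each box yields the JSON file, and the evaluation then runs successfully. The aitodpycocotools output is: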

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=1500 ] = 0.133
Average Precision  (AP) @[ IoU=0.25      | area=   all | maxDets=1500 ] = -1.000
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1500 ] = 0.347
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1500 ] = 0.074
Average Precision  (AP) @[ IoU=0.50:0.95 | area=verytiny | maxDets=1500 ] = 0.035
Average Precision  (AP) @[ IoU=0.50:0.95 | area=  tiny | maxDets=1500 ] = 0.128
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1500 ] = 0.181
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1500 ] = 0.242
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.043
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.226
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1500 ] = 0.238
Average Recall     (AR) @[ IoU=0.50:0.95 | area=verytiny | maxDets=1500 ] = 0.056
Average Recall     (AR) @[ IoU=0.50:0.95 | area=  tiny | maxDets=1500 ] = 0.229
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1500 ] = 0.314
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1500 ] = 0.371
Optimal LRP             @[ IoU=0.50      | area=   all | maxDets=1500 ] = 0.883
Optimal LRP Loc         @[ IoU=0.50      | area=   all | maxDets=1500 ] = 0.311
Optimal LRP FP          @[ IoU=0.50      | area=   all | maxDets=1500 ] = 0.444
Optimal LRP FN          @[ IoU=0.50      | area=   all | maxDets=1500 ] = 0.629
# Class-specific LRP-Optimal Thresholds #
[0.51 0.47 0.59 0.53 0.44 0.51 0.42 0.4 ]

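I do not know why the second row is -1, but the other metrics should be correct: I also ran the evaluation with pycocotools, and the results are close. The remaining difference probably comes from maxDets (1500 here versus 100 in pycocotools). The two tools also print different area buckets, so rows that share a name may not be directly comparable. For reference, here is the pycocotools output: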

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.127
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.329
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.072
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.239
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.043
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.136
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.226
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.217
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.357
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
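
The maxDets effect can be illustrated with a small sketch. It assumes, as in pycocotools' per-image evaluation, that only the maxDets highest-scoring detections per image and category are kept; keep_top_detections is a hypothetical helper written for this illustration, and the 300 detections are made-up data standing in for one image's DETR output.

```python
from collections import defaultdict


def keep_top_detections(predictions, max_dets):
    """Keep the max_dets highest-scoring detections per (image, category)."""
    groups = defaultdict(list)
    for det in predictions:
        groups[(det["image_id"], det["category_id"])].append(det)
    kept = []
    for dets in groups.values():
        dets.sort(key=lambda d: d["score"], reverse=True)
        kept.extend(dets[:max_dets])
    return kept


# Made-up data: 300 boxes of one category on one image.
detections = [
    {"image_id": 1, "category_id": 1, "bbox": [0, 0, 10, 10], "score": i / 300}
    for i in range(300)
]
kept_100 = keep_top_detections(detections, max_dets=100)
kept_1500 = keep_top_detections(detections, max_dets=1500)
print(len(kept_100), len(kept_1500))
```

With a limit of 100, two thirds of the boxes in this toy case never reach the matching step, while a limit of 1500 keeps all 300, which is one way the two summaries can diverge.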

Because the objects in AI-TOD are all small, no ground truth falls into the area=large bucket, so those rows print -1.
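
How a -1 row comes about can be shown with a simplified sketch of the averaging step in a pycocotools-style summary. summarize_row is a hypothetical stand-in, not the library function, and the precision values are made up. The IoU=0.25 case is only a guess at what happens in aitodpycocotools; I have not checked it against the source.

```python
def summarize_row(precisions, iou_thrs, iou_thr=None):
    """Mean of the valid entries, or -1.0 when nothing is left to average.

    precisions[i] is the precision at iou_thrs[i]; -1 marks a setting
    for which no ground truth exists.
    """
    values = list(precisions)
    if iou_thr is not None:
        # Keep only the entry whose threshold matches the requested one.
        values = [p for p, t in zip(precisions, iou_thrs) if abs(t - iou_thr) < 1e-9]
    valid = [p for p in values if p > -1]
    if not valid:
        return -1.0
    return sum(valid) / len(valid)


iou_thrs = [0.5 + 0.05 * i for i in range(10)]  # 0.50:0.95
all_area = [0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.06, 0.03, 0.01, 0.0]
large_area = [-1] * 10  # no large objects in the ground truth

print(summarize_row(all_area, iou_thrs, iou_thr=0.5))    # a normal row
print(summarize_row(large_area, iou_thrs))               # empty bucket
print(summarize_row(all_area, iou_thrs, iou_thr=0.25))   # threshold never evaluated
```

Both the empty bucket and the unevaluated threshold end with nothing to average, so both print -1 without signalling a problem in the predictions themselves.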

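Notes

These are mainly the pitfalls I ran into.

The model predicts boxes as [cx, cy, w, h]. I wrote them to the JSON file without converting them, and every metric came out as 0 (the second row was still -1). Boxes come in three common formats, xyxy, cxcywh, and xywh; the JSON file needs xywh.
The -1 in the second row of the aitodpycocotools output does not affect use, but I have not yet worked out the cause. In pycocotools-style summaries, -1 is what gets printed when a row has no valid entries to average, so one unverified guess is that IoU=0.25 is simply not among the evaluated thresholds.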
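
The conversion that was missing can be sketched as follows. This is a minimal sketch, not the code I actually used: cxcywh_to_xywh and to_coco_results are hypothetical helper names, and the optional img_w and img_h arguments assume the model may output coordinates normalized to [0, 1], as DETR-style models commonly do. The category ids must also match those in the annotation file.

```python
import json


def cxcywh_to_xywh(box, img_w=1.0, img_h=1.0):
    """Convert [cx, cy, w, h] to COCO-style [xmin, ymin, w, h].

    For normalized model outputs, pass the image width and height to scale
    to pixels; for pixel outputs, leave the defaults.
    """
    cx, cy, w, h = box
    return [(cx - w / 2) * img_w, (cy - h / 2) * img_h, w * img_w, h * img_h]


def to_coco_results(image_id, boxes, scores, labels, img_w=1.0, img_h=1.0):
    """Build prediction.json entries for one image."""
    return [
        {
            "image_id": int(image_id),
            "category_id": int(label),
            "bbox": [round(v, 2) for v in cxcywh_to_xywh(box, img_w, img_h)],
            "score": float(score),
        }
        for box, score, label in zip(boxes, scores, labels)
    ]


# Made-up example: one normalized box on an 800x800 image.
results = to_coco_results(
    image_id=1, boxes=[[0.5, 0.5, 0.2, 0.1]], scores=[0.9], labels=[3],
    img_w=800, img_h=800,
)
print(json.dumps(results))
```

Collecting the entries for all test images into one list and writing it with json.dump gives the prediction.json that loadRes expects.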
