YOLO v2 Anchor Box

1. Anchor Box


YOLO V2 reference [15]

Faster R-CNN Explained

Changes to how bounding-box (bbox) coordinates are predicted (compared with the Faster R-CNN approach)

1.1 Convolutional with Anchor Boxes

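YOLO v2 adopts anchor boxes as in Faster R-CNN and changes the original YOLO network as follows:

Replaces the fully connected (FC) layers with convolutional layers (a fully convolutional network, FCN) and predicts the box and class for each anchor box
Removes one pooling layer
Shrinks the input image from 448x448 to 416x416 to produce a 13x13 grid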

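→ A 7x7 grid is too coarse a resolution for detection, and 98 bounding boxes (7 x 7 cells x 2 boxes) are too few for good recall

→ Large objects tend to sit at the center of the image, so an odd-sized grid is chosen so that a single cell falls exactly at the center

→ Removing a pooling layer gives a higher-resolution output feature map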
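
The arithmetic behind these choices can be checked with a small sketch. It assumes a total downsampling factor of 32 (the value given in the YOLO v2 paper) and, for illustration, 5 anchor boxes per cell. `grid_size` and `total_boxes` are hypothetical helper names, not part of any library.

```python
def grid_size(input_size: int, stride: int = 32) -> int:
    """Side length of the output feature map for a square input.

    stride=32 is an assumption: the overall downsampling factor of the network.
    """
    return input_size // stride


def total_boxes(grid: int, boxes_per_cell: int) -> int:
    """Number of boxes predicted per image for a grid x grid output."""
    return grid * grid * boxes_per_cell


print(grid_size(448))      # 14: even, so no single center cell
print(grid_size(416))      # 13: odd, so exactly one center cell
print(total_boxes(7, 2))   # 98: original YOLO, 7x7 grid with 2 boxes per cell
print(total_boxes(13, 5))  # 845: 13x13 grid, assuming 5 anchors per cell
```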

Two issues with anchor boxes

Anchor box dimensions previously had to be chosen by hand (hand picked). (Faster R-CNN uses aspect ratios 1:1, 2:1, 1:2) ⇒ Dimension Clusters
Model instability, especially during early iterations ⇒ Direct Location Prediction

1.2 Dimension Clusters

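Anchor box dimensions previously had to be chosen by hand. (Faster R-CNN uses aspect ratios 1:1, 2:1, 1:2)
YOLO v2 instead uses the k-means algorithm to learn anchor box sizes from the training data.
- Group the ground-truth (GT) boxes, then determine the anchor box sizes and aspect ratios from the groups
→ The anchor boxes should have high IOU with the actual bounding boxes. Choosing the anchor box that is closest in Euclidean distance can still produce anchor boxes with low IOU, because Euclidean distance does not measure overlap.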

⇒ IOU-based k-means clustering

Using K-means Algorithm to determine priors

YOLO v2 uses the k-means algorithm to learn the anchor box (prior) sizes from the ground-truth boxes.
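
IOU-based k-means can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the distance d(box, centroid) = 1 - IOU(box, centroid), compares boxes by width and height only (as if they shared a corner), and updates each centroid to the mean width and height of its cluster. The function names and the toy boxes are made up for the example.

```python
import random


def iou_wh(box, centroid):
    """IOU of two (width, height) boxes aligned at a shared corner."""
    w, h = box
    cw, ch = centroid
    inter = min(w, cw) * min(h, ch)
    union = w * h + cw * ch - inter
    return inter / union


def kmeans_iou(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs using d = 1 - IOU; returns k sorted centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # initialize from k distinct data points
    for _ in range(iters):
        # Assignment: highest IOU is the same as smallest 1 - IOU.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda j: iou_wh(box, centroids[j]))
            clusters[best].append(box)
        # Update: mean width and height of each cluster.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return sorted(centroids)


def avg_iou(boxes, centroids):
    """Mean IOU between each box and its best-matching centroid."""
    return sum(max(iou_wh(b, c) for c in centroids) for b in boxes) / len(boxes)


# Toy ground-truth (width, height) pairs, invented for illustration:
# three small square-ish boxes and three tall boxes.
boxes = [(10, 10), (12, 11), (11, 12), (20, 60), (22, 58), (21, 62)]
anchors = kmeans_iou(boxes, k=2)
print(anchors, round(avg_iou(boxes, anchors), 3))
```

The average IOU between the ground-truth boxes and their nearest centroid is a natural way to compare different values of k: more anchors raise the average IOU but also increase model complexity.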
