8.1.1.7.1.3.1.3. blueoil.networks.object_detection.yolo_v2

8.1.1.7.1.3.1.3.1. Module Contents

8.1.1.7.1.3.1.3.1.1. Classes

YoloV2

YOLO version 2.

YoloV2Loss

YOLO v2 loss function.

8.1.1.7.1.3.1.3.1.2. Functions

summary_boxes(tag, images, boxes, image_size, max_outputs=3, data_format='NHWC')

Draw images with bounding boxes on TensorBoard.

format_XYWH_to_CXCYWH(boxes, axis=1)

Format from (x, y, w, h) to (center_x, center_y, w, h) along a specific dimension.

format_CXCYWH_to_XYWH(boxes, axis=1)

Format from (center_x, center_y, w, h) to (x, y, w, h) along a specific dimension.

format_CXCYWH_to_YX(inputs, axis=1)

Format from (center_x, center_y, w, h) to (y1, x1, y2, x2) boxes along a specific dimension.

format_XYWH_to_YX(inputs, axis=1)

Format from (x, y, w, h) to (y1, x1, y2, x2) boxes along a specific dimension.

class blueoil.networks.object_detection.yolo_v2.YoloV2(num_max_boxes=5, anchors=[(0.25, 0.25), (0.5, 0.5), (1.0, 1.0)], leaky_relu_scale=0.1, object_scale=5.0, no_object_scale=1.0, class_scale=1.0, coordinate_scale=1.0, loss_iou_threshold=0.6, weight_decay_rate=0.0005, score_threshold=0.05, nms_iou_threshold=0.5, nms_max_output_size=100, nms_per_class=True, loss_warmup_steps=200, is_dynamic_image_size=False, use_cross_entropy_loss=True, change_base_output=False, *args, **kwargs)

Bases: blueoil.networks.base.BaseNetwork

YOLO version 2.

Paper: https://arxiv.org/abs/1612.08242

placeholders(self)

placeholders

summary(self, output, labels=None)

Summary.

Parameters
  • output – tensor from inference.

  • labels – labels tensor.

metrics(self, output, labels, thresholds=[0.3, 0.5, 0.7])

Metrics.

Parameters
  • output – tensor from inference.

  • labels – labels tensor.

convert_gt_boxes_xywh_to_cxcywh(self, gt_boxes)

Convert gt_boxes format from (x, y, w, h) to (center_x, center_y, w, h).

Parameters

gt_boxes – 3D tensor [batch_size, max_num_boxes, 5(x, y, w, h, class_id)]

static py_offset_boxes(num_cell_y, num_cell_x, batch_size, boxes_per_cell, anchors)

NumPy implementation of offset_boxes. Returns the YOLO-space offsets of x, y, w, and h.

Parameters
  • num_cell_y – Number of cell y. The spatial dimension of the final convolutional features.

  • num_cell_x – Number of cell x. The spatial dimension of the final convolutional features.

  • batch_size – int, Batch size.

  • boxes_per_cell – int, number of boxes per cell.

  • anchors – list of tuples.
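The following is a minimal NumPy sketch of how such offsets could be built, not the library implementation. It assumes that the x and y offsets are the column and row index of each cell, that the w and h offsets are the anchor sizes, and that each anchor tuple is ordered (w, h); none of this is confirmed by the text above. The name sketch_offset_boxes is hypothetical. The output shapes follow the ones documented for offset_boxes.

```python
import numpy as np


def sketch_offset_boxes(num_cell_y, num_cell_x, batch_size, boxes_per_cell, anchors):
    """Hypothetical stand-in for py_offset_boxes; not the library code."""
    anchors = np.asarray(anchors, dtype=np.float32)  # assumed [boxes_per_cell, 2(w, h)]
    assert anchors.shape == (boxes_per_cell, 2)
    shape = (batch_size, num_cell_y, num_cell_x, boxes_per_cell)

    # Assumption: x and y offsets are the column and row index of each cell.
    rows = np.arange(num_cell_y, dtype=np.float32).reshape(1, num_cell_y, 1, 1)
    cols = np.arange(num_cell_x, dtype=np.float32).reshape(1, 1, num_cell_x, 1)
    offset_y = np.broadcast_to(rows, shape)
    offset_x = np.broadcast_to(cols, shape)

    # Assumption: w and h offsets are the anchor sizes, repeated for every cell.
    offset_w = np.broadcast_to(anchors[:, 0].reshape(1, 1, 1, boxes_per_cell), shape)
    offset_h = np.broadcast_to(anchors[:, 1].reshape(1, 1, 1, boxes_per_cell), shape)
    return offset_x, offset_y, offset_w, offset_h
```
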

offset_boxes(self)

Return the YOLO-space offsets of x, y, w, and h.

Returns

offset_x: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]

offset_y: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]

offset_w: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]

offset_h: shape is [batch_size, num_cell[0], num_cell[1], boxes_per_cell]

convert_boxes_space_from_real_to_yolo(self, boxes)

Convert boxes space size from real to yolo.

Real space boxes coordinates are in the interval [0, image_size]. Yolo space boxes x,y are in the interval [-1, 1]. w,h are in the interval [-inf, +inf].

Parameters

boxes – 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].

Returns

resized_boxes: 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].

convert_boxes_space_from_yolo_to_real(self, predict_boxes)

Convert predict boxes space size from yolo to real.

Real space boxes coordinates are in the interval [0, image_size]. Yolo space boxes x,y are in the interval [-1, 1]. w,h are in the interval [-inf, +inf].

Parameters

predict_boxes – 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].

Returns

resized_boxes: 5D Tensor, shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].

_predictions(self, output)

_split_predictions(self, output)

Separate combined final convolution outputs to predictions.

Parameters

output – combined final convolution outputs 4D Tensor. shape is [batch_size, num_cell[0], num_cell[1], (num_classes + 5) * boxes_per_cell]

Returns

predict_classes (Tensor): [batch_size, num_cell[0], num_cell[1], boxes_per_cell, num_classes]

predict_confidence (Tensor): [batch_size, num_cell[0], num_cell[1], boxes_per_cell, 1]

predict_boxes (Tensor): [batch_size, num_cell[0], num_cell[1], boxes_per_cell, 4(center_x, center_y, w, h)]
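The shapes above can be illustrated with a small NumPy sketch. It assumes that the channel axis is laid out per box as (classes, confidence, box coordinates); the source does not state this ordering, so treat it as an assumption. The function name sketch_split_predictions is hypothetical.

```python
import numpy as np


def sketch_split_predictions(output, num_classes, boxes_per_cell):
    """Hypothetical illustration of the split; the channel order is assumed."""
    batch_size, cell_y, cell_x, channels = output.shape
    assert channels == (num_classes + 5) * boxes_per_cell

    # Give every box its own axis: [..., boxes_per_cell, num_classes + 5].
    per_box = output.reshape(batch_size, cell_y, cell_x, boxes_per_cell, num_classes + 5)

    # Assumed order inside each box: classes, then confidence, then 4 coordinates.
    predict_classes = per_box[..., :num_classes]
    predict_confidence = per_box[..., num_classes:num_classes + 1]
    predict_boxes = per_box[..., num_classes + 1:]
    return predict_classes, predict_confidence, predict_boxes
```

Concatenating the three parts along the last axis and reshaping back recovers the combined output, which is the inverse step that _concat_predictions describes.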

_concat_predictions(self, predict_classes, predict_confidence, predict_boxes)

Concatenate predictions into the inference output.

post_process(self, output)

_format_output(self, output)

Format yolov2 inference output to predict boxes list.

Parameters

output – Tensor of inference() outputs.

Returns

List of predict_boxes Tensor. The shape is [batch_size, num_predicted_boxes, 6(x, y, w, h, class_id, score)]. The score is calculated from each class probability and the confidence.

_exclude_low_score_box(self, formatted_output, threshold=0.05)

Exclude low score boxes. The score is calculated from each class probability and the confidence.

Parameters
  • formatted_output – Formatted predict_boxes Tensor. The Shape is [batch_size, num_predicted_boxes, 6(x, y, w, h, class_id, score)].

  • threshold – low threshold of predict score.

Returns

Python list of predict_boxes Tensor. predict_boxes shape is [num_predicted_boxes, 6(x, y, w, h, class_id, probability)]. The Python list length is the batch size.
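As a rough illustration of the filtering described above, the NumPy sketch below keeps only rows whose score column reaches the threshold. The function name is hypothetical, and whether the library compares with ">" or ">=" is not stated in the text, so the ">=" used here is an assumption.

```python
import numpy as np


def sketch_exclude_low_score_box(formatted_output, threshold=0.05):
    """Hypothetical illustration: drop boxes whose score is below the threshold.

    formatted_output is a sequence with one array per image,
    each of shape [num_predicted_boxes, 6(x, y, w, h, class_id, score)].
    """
    results = []
    for boxes in formatted_output:
        boxes = np.asarray(boxes)
        keep = boxes[:, 5] >= threshold  # assumed comparison; column 5 is the score
        results.append(boxes[keep])
    return results
```
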

_nms(self, formatted_output, iou_threshold, max_output_size, per_class)

Non Maximum Suppression.

Parameters
  • formatted_output – python list of predict_boxes Tensor. predict_boxes shape is [num_predicted_boxes, 6(x, y, w, h, class_id, probability)].

  • iou_threshold (float) – The threshold for deciding whether boxes overlap with respect to IOU.

  • max_output_size (int) – The maximum number of boxes to be selected

  • per_class (boolean) – Whether to apply NMS separately for each class.

Returns

Python list of predict_boxes Tensor. predict_boxes shape is [num_predicted_boxes, 6(x, y, w, h, class_id, probability)]. The Python list length is the batch size.
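For readers unfamiliar with non-maximum suppression, here is a minimal greedy NumPy sketch for a single image. It is not the library implementation. It assumes (x, y) is the top-left corner of each box, and it applies max_output_size to the total number of kept boxes; the library may apply the limit differently. The names sketch_nms and _iou_xywh are hypothetical.

```python
import numpy as np


def _iou_xywh(box, boxes):
    """IoU between one box and many, all as (x, y, w, h) with assumed top-left x, y."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[0] + box[2], boxes[:, 0] + boxes[:, 2])
    y2 = np.minimum(box[1] + box[3], boxes[:, 1] + boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = box[2] * box[3] + boxes[:, 2] * boxes[:, 3] - inter
    return inter / np.maximum(union, 1e-12)


def sketch_nms(boxes, iou_threshold=0.5, max_output_size=100, per_class=True):
    """Greedy NMS over rows of (x, y, w, h, class_id, score)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    order = np.argsort(-boxes[:, 5])  # highest score first
    suppressed = np.zeros(len(boxes), dtype=bool)
    keep = []
    for i in order:
        if suppressed[i]:
            continue
        keep.append(i)
        if len(keep) == max_output_size:
            break
        overlaps = _iou_xywh(boxes[i], boxes) > iou_threshold
        if per_class:
            # Only suppress boxes that share the kept box's class.
            overlaps &= boxes[:, 4] == boxes[i, 4]
        suppressed |= overlaps
    return boxes[np.array(keep, dtype=int)]
```

With per_class=True, two heavily overlapping boxes of different classes both survive; with per_class=False, only the higher-scoring one is kept.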

loss(self, output, gt_boxes)

Loss.

Parameters
  • output – 2D tensor. shape is [batch_size, self.num_cell * self.num_cell * (self.num_classes + self.boxes_per_cell * 5)]

  • gt_boxes – ground truth boxes 3D tensor. [batch_size, max_num_boxes, 5(x, y, w, h, class_id)].

inference(self, images, is_training)

Inference.

Parameters

images – images tensor. shape is (batch_num, height, width, channel)

_reorg(self, name, inputs, stride, data_format, use_space_to_depth=True, darknet_original=False)

base(self, images, is_training)

Base network.

Returns

Output. The output shape depends on the data_format parameter.

When data_format is NHWC, the shape is [batch_size, num_cell[0], num_cell[1], (num_classes + 5(x, y, w, h, confidence)) * boxes_per_cell(length of anchors)].

When data_format is NCHW, the shape is [batch_size, (num_classes + 5(x, y, w, h, confidence)) * boxes_per_cell(length of anchors), num_cell[0], num_cell[1]].

class blueoil.networks.object_detection.yolo_v2.YoloV2Loss(is_debug=False, anchors=[(1.0, 1.0), (2.0, 2.0)], num_cell=[4, 4], boxes_per_cell=2, object_scale=5.0, no_object_scale=1.0, class_scale=1.0, coordinate_scale=1.0, loss_iou_threshold=0.6, weight_decay_rate=0.0005, image_size=[448, 448], batch_size=64, classes=[], yolo=None, warmup_steps=100, use_cross_entropy_loss=True)

YOLO v2 loss function.

_iou_per_gtbox(self, boxes, box)

Calculate ious.

Parameters
  • boxes – 4-D np.ndarray [num_cell, num_cell, boxes_per_cell, 4(x_center, y_center, w, h)]

  • box – 1-D np.ndarray [4(x_center, y_center, w, h)]

Returns

iou: 3-D np.ndarray [num_cell, num_cell, boxes_per_cell]
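The IoU computation over the documented shapes can be sketched in NumPy as follows. This is an illustration of the standard intersection-over-union formula for center-format boxes, not the library code, and the function name sketch_iou_per_gtbox is hypothetical.

```python
import numpy as np


def sketch_iou_per_gtbox(boxes, box):
    """IoU of every cell box against one gt box, all as (x_center, y_center, w, h).

    boxes: [num_cell, num_cell, boxes_per_cell, 4]
    box:   [4]
    Returns an array of shape [num_cell, num_cell, boxes_per_cell].
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    box = np.asarray(box, dtype=np.float64)

    # Edges of the intersection rectangle.
    left = np.maximum(boxes[..., 0] - boxes[..., 2] / 2, box[0] - box[2] / 2)
    right = np.minimum(boxes[..., 0] + boxes[..., 2] / 2, box[0] + box[2] / 2)
    top = np.maximum(boxes[..., 1] - boxes[..., 3] / 2, box[1] - box[3] / 2)
    bottom = np.minimum(boxes[..., 1] + boxes[..., 3] / 2, box[1] + box[3] / 2)

    # Negative extents mean no overlap, so clip them to zero.
    inter = np.clip(right - left, 0, None) * np.clip(bottom - top, 0, None)
    union = boxes[..., 2] * boxes[..., 3] + box[2] * box[3] - inter
    return inter / np.maximum(union, 1e-12)
```
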

__iou_gt_boxes(self, boxes, gt_boxes_list, num_cell)

_iou_gt_boxes(self, boxes, gt_boxes_list)

Calculate the IoUs between each predicted box and each gt box, and choose the best IoU for each gt box.

Parameters
  • boxes – Predicted boxes in real space coordinate. 5-D tensor [batch_size, num_cell, num_cell, boxes_per_cell, 4(x_center, y_center, w, h)].

  • gt_boxes_list – 3-D tensor [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)]

Returns

iou: 4-D tensor [batch_size, num_cell, num_cell, boxes_per_cell]

_one_iou(self, box1, box2)

__calculate_truth_and_masks(self, gt_boxes_list, predict_boxes, num_cell, image_size, global_step, predict_classes=None, predict_confidence=None)

Calculate truth and masks for loss function from gt boxes and predict boxes.

1. When the global step is less than warmup_steps, set cell_gt_boxes and coordinate_masks so that the coordinate loss in the early training steps encourages predictions to match the anchors.

2. For the non-dummy gt_boxes, calculate the IoU between each gt box and the anchors, and select the best anchor.

3. For the best anchor, create cell_gt_boxes from the gt_boxes, calculate truth_confidence, and set the masks to true.

Parameters
  • gt_boxes_list (np.ndarray) – The ground truth boxes. Shape is [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)].

  • predict_boxes (np.ndarray) – Predicted boxes. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].

  • num_cell – Number of cell. num_cell[0] is y axis, num_cell[1] is x axis.

  • image_size – Image size(px). image_size[0] is height, image_size[1] is width.

  • global_step (int) – Number of current training step.

  • predict_classes – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, num_classes]

  • predict_confidence – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, 1]

Returns

cell_gt_boxes: The gt_boxes from gt_boxes_list that correspond to each cell anchor. Dummy cell gt boxes are zeros. Shape is [batch_size, num_cell, num_cell, box_per_cell, 5(center_x, center_y, w, h, class_id)].

truth_confidence: The confidence value for each cell anchor. Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].

object_masks: 1 for each cell anchor that has a gt box, 0 otherwise. Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].

coordinate_masks: 1 for each cell anchor that has a gt box, 0 otherwise. Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].

_calculate_truth_and_masks(self, gt_boxes_list, predict_boxes, global_step, predict_classes=None, predict_confidence=None)

Calculate truth and masks for loss function from gt boxes and predict boxes.

Parameters
  • gt_boxes_list – The ground truth boxes. Tensor shape is [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)].

  • predict_boxes – Predicted boxes. Tensor shape is [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)].

  • global_step – Integer tensor.

  • predict_classes – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, num_classes]

  • predict_confidence – Use only in debug. Shape is [batch_size, num_cell, num_cell, boxes_per_cell, 1]

Returns

cell_gt_boxes: The gt_boxes from gt_boxes_list that correspond to each cell anchor. Dummy cell gt boxes are zeros. Shape is [batch_size, num_cell, num_cell, box_per_cell, 5(center_x, center_y, w, h, class_id)].

truth_confidence: The confidence value for each cell anchor. Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].

object_masks: 1 for each cell anchor that has a gt box, 0 otherwise. Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].

coordinate_masks: 1 for each cell anchor that has a gt box, 0 otherwise. Tensor shape is [batch_size, num_cell, num_cell, box_per_cell, 1].

_weight_decay_loss(self)

L2 weight decay (regularization) loss.

__call__(self, predict_classes, predict_confidence, predict_boxes, gt_boxes, global_step)

Loss function.

Parameters
  • predict_classes – [batch_size, num_cell, num_cell, boxes_per_cell, num_classes]

  • predict_confidence – [batch_size, num_cell, num_cell, boxes_per_cell, 1]

  • predict_boxes – [batch_size, num_cell, num_cell, boxes_per_cell, 4(center_x, center_y, w, h)]

  • gt_boxes – ground truth boxes 3D tensor. [batch_size, max_num_boxes, 5(center_x, center_y, w, h, class_id)].

  • global_step – integer tensor.

Returns

loss value scalar tensor.

Return type

loss

blueoil.networks.object_detection.yolo_v2.summary_boxes(tag, images, boxes, image_size, max_outputs=3, data_format='NHWC')

Draw images with bounding boxes on TensorBoard.

Parameters
  • tag – name of summary tag.

  • images – Tensor of images [batch_size, height, width, 3].

  • boxes – Tensor of boxes. Assumed shape is [batch_size, num_boxes, 4(y1, x1, y2, x2)].

  • image_size – Python list of the image size [height, width].

blueoil.networks.object_detection.yolo_v2.format_XYWH_to_CXCYWH(boxes, axis=1)

Format from (x, y, w, h) to (center_x, center_y, w, h) along a specific dimension.

Parameters
  • boxes – a Tensor that includes boxes. [:, 4(x, y, w, h)]

  • axis – which dimension of the inputs Tensor is boxes.

blueoil.networks.object_detection.yolo_v2.format_CXCYWH_to_XYWH(boxes, axis=1)

Format from (center_x, center_y, w, h) to (x, y, w, h) along a specific dimension.

Parameters
  • boxes – a Tensor that includes boxes. [:, 4(center_x, center_y, w, h)]

  • axis – which dimension of the inputs Tensor is boxes.
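The two conversions between (x, y, w, h) and (center_x, center_y, w, h) can be sketched in NumPy as shown below. The sketch assumes that (x, y) is the top-left corner of the box, which the text does not state explicitly, and the function names are hypothetical rather than the library's own.

```python
import numpy as np


def sketch_xywh_to_cxcywh(boxes, axis=1):
    """(x, y, w, h) -> (center_x, center_y, w, h); x, y assumed to be the top-left corner."""
    x, y, w, h = np.split(np.asarray(boxes, dtype=np.float64), 4, axis=axis)
    return np.concatenate([x + w / 2, y + h / 2, w, h], axis=axis)


def sketch_cxcywh_to_xywh(boxes, axis=1):
    """(center_x, center_y, w, h) -> (x, y, w, h); inverse of the function above."""
    cx, cy, w, h = np.split(np.asarray(boxes, dtype=np.float64), 4, axis=axis)
    return np.concatenate([cx - w / 2, cy - h / 2, w, h], axis=axis)
```

The axis argument selects which dimension holds the four box values, so the same functions work for both [num_boxes, 4] and [batch_size, num_boxes, 4] inputs (with axis=2 in the latter case).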

blueoil.networks.object_detection.yolo_v2.format_CXCYWH_to_YX(inputs, axis=1)

Format from (center_x, center_y, w, h) to (y1, x1, y2, x2) boxes along a specific dimension.

Parameters
  • inputs – a Tensor that includes boxes.

  • axis – which dimension of the inputs Tensor is boxes.

blueoil.networks.object_detection.yolo_v2.format_XYWH_to_YX(inputs, axis=1)

Format from (x, y, w, h) to (y1, x1, y2, x2) boxes along a specific dimension.

Parameters
  • inputs – a Tensor that includes boxes.

  • axis – which dimension of the inputs Tensor is boxes.
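The corner-format conversions can be sketched in NumPy in the same way. As before, this assumes (x, y) is the top-left corner and that (y1, x1) and (y2, x2) are the top-left and bottom-right corners; the function names are hypothetical.

```python
import numpy as np


def sketch_xywh_to_yx(inputs, axis=1):
    """(x, y, w, h) -> (y1, x1, y2, x2); x, y assumed to be the top-left corner."""
    x, y, w, h = np.split(np.asarray(inputs, dtype=np.float64), 4, axis=axis)
    return np.concatenate([y, x, y + h, x + w], axis=axis)


def sketch_cxcywh_to_yx(inputs, axis=1):
    """(center_x, center_y, w, h) -> (y1, x1, y2, x2)."""
    cx, cy, w, h = np.split(np.asarray(inputs, dtype=np.float64), 4, axis=axis)
    return np.concatenate([cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2], axis=axis)
```
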