Human perception algorithms

terran.face.detection

terran.face.detection.face_detection = <Detection(RetinaFace)>

Default entry point to face detection.

This is an instantiation of the Detection class, lazily-loaded in order to avoid reading the checkpoints on import. Refer to that class’ __call__ method for more information.

class terran.face.detection.Detection(checkpoint=None, short_side=416, merge_method='padding', device=device(type='cpu'), lazy=False)

Initializes and loads the model for checkpoint.

Parameters:
  • checkpoint (str) – Checkpoint (and model) to use in order to perform face detection. If None, will use the default one for the task.
  • short_side (int) – Resize images’ short side to short_side before sending over to detection model.
  • merge_method ('padding', 'crop') –

    How to merge images together into a batch when receiving a list. Merge is done after resizing. Options are:

    • padding, which will add padding around images, possibly increasing total image size. If mixing portrait and landscape images, might be inefficient.
    • crop, which will center-crop the images to the smallest size. If images are of very different sizes, might end up cropping too much.
  • device (torch.Device) – Device to load the model on.
  • lazy (bool) – If set, will defer model loading until first call.
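The padding merge strategy can be sketched roughly as follows. This is an illustrative approximation, not Terran's actual implementation; the function name merge_with_padding is hypothetical:

```python
import numpy as np

def merge_with_padding(images):
    """Zero-pad a list of HWC images to a common size and stack into a batch."""
    h = max(im.shape[0] for im in images)
    w = max(im.shape[1] for im in images)
    batch = np.zeros((len(images), h, w, images[0].shape[2]), dtype=images[0].dtype)
    for i, im in enumerate(images):
        # Place each image in the top-left corner; the rest stays zero padding.
        batch[i, :im.shape[0], :im.shape[1]] = im
    return batch

# A portrait and a landscape image end up in a single (2, 416, 416, 3) batch,
# illustrating the possible increase in total image size mentioned above.
a = np.ones((416, 300, 3), dtype=np.uint8)
b = np.ones((320, 416, 3), dtype=np.uint8)
batch = merge_with_padding([a, b])
```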
__call__(images)

Performs face detection on images.

Delegates the actual prediction to the model the Detection object was initialized with.

Parameters:images (list of numpy.ndarray or numpy.ndarray) – Images to perform face detection on.
Returns:List of dictionaries containing face data for a single image, or a list of such lists if multiple images were passed in.

Each entry is of the form:

{
    'bbox': [x_min, y_min, x_max, y_max],
    'landmarks': ...,  # Array of shape (5, 2).
    'score': ... # Confidence score.
}
Return type:list of dicts, or list of lists of dicts
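A detection result for a single image can then be filtered by score. The call itself is shown as a comment, since it requires the model checkpoints; the sample faces below are made-up values following the documented entry structure:

```python
import numpy as np

# from terran.face.detection import face_detection
# faces = face_detection(image)  # `image` being an HWC numpy array.

# Made-up sample output, following the documented entry structure.
faces = [
    {'bbox': [40, 60, 120, 160], 'landmarks': np.zeros((5, 2)), 'score': 0.98},
    {'bbox': [200, 80, 260, 150], 'landmarks': np.zeros((5, 2)), 'score': 0.55},
]

# Keep confident detections only and compute their bounding-box areas.
confident = [f for f in faces if f['score'] >= 0.9]
areas = [
    (f['bbox'][2] - f['bbox'][0]) * (f['bbox'][3] - f['bbox'][1])
    for f in confident
]
```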

terran.face.recognition

terran.face.recognition.extract_features = <Recognition(ArcFace)>

Default entry point to face recognition.

This is an instantiation of the Recognition class, lazily-loaded in order to avoid reading the checkpoints on import. Refer to that class’ __call__ method for more information.

class terran.face.recognition.Recognition(checkpoint=None, device=device(type='cpu'), lazy=False)

Initializes and loads the model for checkpoint.

Parameters:
  • checkpoint (str) – Checkpoint (and model) to use in order to perform face recognition. If None, will use the default one for the task.
  • device (torch.Device) – Device to load the model on.
  • lazy (bool) – If set, will defer model loading until first call.
__call__(images, faces_per_image=None)

Performs face recognition on images.

Delegates the actual prediction to the model the Recognition object was initialized with.

Parameters:
  • images (list of numpy.ndarray or numpy.ndarray) – Images to perform face recognition on.
  • faces_per_image (list of lists of dicts) – Faces to compute embeddings for, one list per image. Each dict entry must contain bbox and landmarks keys, as returned by a Detection instance.
Returns:

One entry per image: a numpy array of shape (N_i, F), where N_i is the number of faces in the i-th image and F is the embedding size returned by the model.

Return type:

list of numpy.ndarray or numpy.ndarray
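A typical pipeline feeds Detection output into Recognition and then compares the resulting embeddings, for instance by cosine similarity. The model calls are commented out below since they require checkpoints; the short vectors stand in for real F-dimensional embeddings:

```python
import numpy as np

# from terran.face.detection import face_detection
# from terran.face.recognition import extract_features
# faces = face_detection(images)
# embeddings = extract_features(images, faces_per_image=faces)

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in embeddings (F = 4 here, purely for illustration).
emb_a = np.array([1.0, 0.0, 1.0, 0.0])
emb_b = np.array([1.0, 0.0, 1.0, 0.0])
emb_c = np.array([0.0, 1.0, 0.0, 1.0])

same = cosine_similarity(emb_a, emb_b)       # close to 1: likely same person
different = cosine_similarity(emb_a, emb_c)  # close to 0: likely different
```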

terran.pose

terran.pose.pose_estimation = <Estimation(OpenPose)>

Default entry point to pose estimation.

This is an instantiation of the Estimation class, lazily-loaded in order to avoid reading the checkpoints on import. Refer to that class’ __call__ method for more information.

class terran.pose.Estimation(checkpoint=None, short_side=184, merge_method='padding', device=device(type='cpu'), lazy=False)

Initializes and loads the model for checkpoint.

Parameters:
  • checkpoint (str or None) – Checkpoint (and model) to use in order to perform pose estimation. If None, will use the default one for the task.
  • short_side (int) – Resize images’ short side to short_side before sending over to the pose estimation model. Default is 184 to keep the model fast enough, though for better results 386 is an appropriate value.
  • merge_method ('padding', 'crop') –

    How to merge images together into a batch when receiving a list. Merge is done after resizing. Options are:

    • padding, which will add padding around images, possibly increasing total image size. If mixing portrait and landscape images, might be inefficient.
    • crop, which will center-crop the images to the smallest size. If images are of very different sizes, might end up cropping too much.
  • device (torch.Device) – Device to load the model on.
  • lazy (bool) – If set, will defer model loading until first call.
__call__(images)

Performs pose estimation on images.

Delegates the actual prediction to the model the Estimation object was initialized with.

Parameters:images (list or tuple or np.ndarray) – Images to perform pose estimation on.
Returns:List of dictionaries containing pose data for a single image, or a list of such lists if multiple images were passed in.
Return type:list
class terran.pose.Keypoint

An enumeration.

L_EAR = 17
L_ELBOW = 6
L_EYE = 15
L_FOOT = 13
L_HAND = 7
L_HIP = 11
L_KNEE = 12
L_SHOULDER = 5
NECK = 1
NOSE = 0
R_EAR = 16
R_ELBOW = 3
R_EYE = 14
R_FOOT = 10
R_HAND = 4
R_HIP = 8
R_KNEE = 9
R_SHOULDER = 2
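The enumeration values above act as row indices into a pose result's keypoint array. A sketch, using the documented indices on a hypothetical per-person array of (x, y, is_visible) rows; the exact output format of Estimation is an assumption here, made only for illustration:

```python
import numpy as np

# Indices as defined by terran.pose.Keypoint.
NOSE, NECK, R_SHOULDER, L_SHOULDER = 0, 1, 2, 5

# Hypothetical per-person keypoints array: one (x, y, is_visible) row per
# keypoint, 18 keypoints total. Unset rows stay at zero (not visible).
keypoints = np.zeros((18, 3))
keypoints[NOSE] = [110, 80, 1]
keypoints[R_SHOULDER] = [90, 140, 1]
keypoints[L_SHOULDER] = [130, 140, 1]

# For example, the midpoint between the two shoulders.
mid_x = (keypoints[R_SHOULDER][0] + keypoints[L_SHOULDER][0]) / 2
```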

terran.tracking.face

terran.tracking.face.face_tracking(*, video=None, max_age=None, min_hits=None, detector=None, return_unmatched=False)

Default entry point to face tracking.

This is a factory for an underlying FaceTracking instance, which will be tasked with keeping the state of the different identities available.

Once created, you can treat the resulting object as if it were an instance of the terran.face.detection.Detection class, but focused on working with same-size batches of frames, and returning an additional field on each face corresponding to its identity, or track.

The tracking utilities provided focus on filtering observations only. No smoothing or interpolation is performed, so every result you obtain can be traced back to a detection produced by the detector passed in. This is meant as a building block for more detailed face recognition over videos.

Parameters:
  • video (terran.io.reader.Video) –

    Video to derive max_age and min_hits from. The former will be set to one second's worth of frames, while the latter will be set to 1/5th of a second's worth.

    If those values are specified explicitly as well, they take precedence.

  • max_age (int) – Maximum number of frames to keep identities around for after no appearance.
  • min_hits (int) –

    Minimum number of observations required for an identity to be returned.

    For instance, if min_hits is 6, it means that only after a face is detected six times will it be returned on prediction. This is, in essence, adding latency to the predictions. So consider decreasing this value if you care more about latency than any possible noise you may get due to short-lived faces.

    You can also get around this latency by setting return_unmatched to True, but in that case the returned faces will not have an identity associated with them.

  • detector (terran.face.detection.Detection) – Face detector to get observations from. Default is using Terran’s default face detection class.
  • return_unmatched (boolean) – Whether to return observations (faces) that don’t have a matched identity or not.
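The video-derived defaults described above amount to roughly the following. This is a sketch: the helper name and the rounding behavior are assumptions, not Terran internals:

```python
def derive_tracking_params(frames_per_second):
    """Derive tracking defaults from a video's frame rate.

    max_age defaults to about one second's worth of frames, min_hits to
    about 1/5th of a second's worth, per the description above.
    """
    max_age = int(round(frames_per_second))
    min_hits = int(round(frames_per_second / 5))
    return max_age, min_hits

params = derive_tracking_params(30)  # (30, 6) at 30 fps
```

At 30 fps this matches the min_hits = 6 example given above: a face must be detected six times (1/5th of a second) before it is returned.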
class terran.tracking.face.FaceTracking(detector=None, tracker=None)

Object for performing face tracking.

This object is meant to be used as a substitute to a Detection object, behaving exactly the same way except for having an extra track field in the face dictionaries.

The object only encapsulates and calls the detector and tracker objects used, offering a __call__()-based interface. That is, it is simply a container around the main Sort class.

__call__(frames)

Performs face tracking on images.

The face detection itself will be done by the self.detector object, while the tracking by the self.tracker object.

Parameters:frames (list of numpy.ndarray or numpy.ndarray) – Frames to perform face tracking on.
Returns:List of dictionaries containing face data for a single image, or a list of such lists if multiple frames were passed in.

Each entry is of the form:

{
    'bbox': [x_min, y_min, x_max, y_max],
    'landmarks': ...,  # Array of shape (5, 2).
    'track': ...,  # Either an `int` or `None`.
    'score': ... # Confidence score.
}
Return type:list of dicts, or list of lists of dicts
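Downstream code typically groups the per-frame results by their track field. The tracker call is shown as a comment since it requires a video and checkpoints; the sample results below are made-up values following the documented entry structure (landmarks omitted for brevity):

```python
from collections import defaultdict

# tracking = face_tracking(video=video)
# faces_per_frame = [tracking(frames) for frames in video]

# Made-up per-frame results following the documented entry structure.
faces_per_frame = [
    [{'bbox': [10, 10, 50, 50], 'track': 1, 'score': 0.90}],
    [{'bbox': [12, 11, 52, 51], 'track': 1, 'score': 0.92},
     {'bbox': [200, 40, 240, 90], 'track': None, 'score': 0.60}],
]

# Collect each identity's bounding boxes across frames; unmatched faces
# (track is None, as returned when return_unmatched is set) are skipped.
tracks = defaultdict(list)
for frame_idx, faces in enumerate(faces_per_frame):
    for face in faces:
        if face['track'] is not None:
            tracks[face['track']].append((frame_idx, face['bbox']))
```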