How to Install Oz Software

Terms and Definitions

  • APM: analyses per minute. Please note:

    • An analysis is a request for a Quality (Liveness) or Biometry analysis using a single media file.

    • A single analysis with multiple media files counts as separate analyses in terms of APM.

    • Multiple analysis types applied to a single media file (two media files for Biometry) count as separate analyses in terms of APM.

  • PoC: Proof of Concept.

  • Node: a worker machine, either virtual or physical.

  • HA: high availability.

  • K8s: Kubernetes.

  • SC: StorageClass.

  • RWX: ReadWriteMany.
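To make the APM counting rules above concrete, here is a minimal sketch. The request size and analysis mix are hypothetical, and Biometry's two-media requirement is ignored for simplicity:

```python
# Hypothetical example of APM accounting: each (media, analysis type)
# pair counts as one analysis. The numbers below are illustrative only.
media_count = 3                           # media files in one request
analysis_types = ["quality", "biometry"]  # analyses requested per media file

analyses = media_count * len(analysis_types)
print(analyses)  # 6 analyses count toward APM
```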

Component Descriptions

Oz API components:

  • APP is the API front app that receives REST requests, performs preprocessing, and creates tasks for other API components.

  • Celery is the asynchronous task queue. The API uses the following Celery queues and auxiliary services:

    • Celery-default processes system-wide tasks.

    • Celery-maintenance processes maintenance tasks.

    • Celery-tfss processes analysis tasks.

    • Celery-resolution checks for completion of all nested analyses within a folder and changes folder status.

    • Celery-preview_convert creates a video preview for media.

    • Celery-beat is a CronJob for managing maintenance celery tasks.

    • Celery-Flower is a Celery metrics collector.

    • Celery-regula (optional) processes document analysis tasks.

  • Redis is a message broker and result backend for Celery.

  • RabbitMQ (optional) can be used as a message broker for Celery instead of Redis.

  • Nginx serves static media files for external HTTP(s) requests.

  • O2N (optional) processes the Blacklist analysis.

  • Statistic (optional) collects statistics for the Web UI.

  • Web UI provides the web interface.
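As a rough illustration of how tasks map onto the queues listed above, here is a minimal routing sketch in Python. The task names are invented for illustration and do not reflect actual Oz API internals; only the queue names come from this document.

```python
# Hypothetical sketch of Celery-style task routing for the queues above.
# In a real deployment, a dict like this would be assigned to
# app.conf.task_routes on a celery.Celery application object.
TASK_ROUTES = {
    "tasks.analyze_media": {"queue": "celery-tfss"},           # analysis tasks
    "tasks.resolve_folder": {"queue": "celery-resolution"},    # nested-analysis completion checks
    "tasks.make_preview": {"queue": "celery-preview_convert"}, # video previews
    "tasks.maintenance.cleanup": {"queue": "celery-maintenance"},
}
DEFAULT_QUEUE = "celery-default"  # system-wide tasks fall back here

def queue_for(task_name: str) -> str:
    """Return the queue a task would be routed to."""
    route = TASK_ROUTES.get(task_name)
    return route["queue"] if route else DEFAULT_QUEUE
```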

BIO-Updater checks for model updates and downloads new models.

Oz BIO (TFSS) runs TensorFlow with AI models and makes decisions for incoming media.

The BIO-Updater and BIO components require access to the following external resources:

Deployment Scenarios

The deployment scenario depends on the workload you expect.

Small Business or PoC

  • Use cases:

    • Testing/development purposes

    • Small installations with a low number of APM

  • Environment: Docker

  • HA: no

  • Pros:

    • Requires a minimal amount of computing resources

    • Low complexity, so no highly qualified engineers are needed on-site

    • Easy to manage and support

  • Cons:

    • Suitable only for low loads, no high APM

    • No scaling and no high availability

  • External resource requirements:

    • PostgreSQL

Medium Load

  • Use cases:

    • Typical usage with moderate load

  • Environment: Docker

  • HA: partial

  • Pros:

    • Partially supports HA

    • Can be scaled up to support a higher workload

  • Cons:

    • API HA requires precise balancing

    • Higher staff qualification requirements

  • External resource requirements:

    • PostgreSQL

High Load

  • Use cases:

    • High load with HA and autoscaling

    • Usage with a cloud provider

  • Environment: Kubernetes

  • HA: yes

  • Pros:

    • HA and autoscaling

    • Observability and manageability

    • Allows a high workload and can be scaled up

  • Cons:

    • High staff qualification requirements

    • Additional infrastructure requirements

  • External resource requirements:

    • PostgreSQL

    • For Kubernetes deployments:

      • K8s v1.25+

      • ingress-nginx

      • clusterIssuer

      • kube-metrics

      • Prometheus

      • clusterAutoscaler

Autoscaling is implemented on the basis of ClusterAutoscaler and must be supported by your infrastructure.

Small Business or PoC

Please find the installation guide here: Docker.

  • Type of containerization: Docker,

  • Type of installation: Docker compose,

  • Autoscaling/HA: none.

Requirements

Software

  • Docker 19.03+ or Podman 4.4+,

  • Python 3.4+.

Storage

  • Depends on image quality and required archive depth.

  • Can be estimated as: [average image size] * 2 * [analyses per day] * [archive depth in days].
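The formula above can be checked with a quick calculation; all input values here are hypothetical:

```python
# Storage estimate from the formula above:
# [average image size] * 2 * [analyses per day] * [archive depth in days]
# All input values below are hypothetical.
average_image_size_mb = 0.5   # average media size, in MB
analyses_per_day = 10_000     # expected daily load
archive_depth_days = 90       # retention period

storage_mb = average_image_size_mb * 2 * analyses_per_day * archive_depth_days
print(f"{storage_mb / 1024:.0f} GiB")  # about 879 GiB
```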

Staff qualification:

  • Basic knowledge of Linux and Docker.

Deployment

  1. Single node.

Resources:

  • 1 node,

  • 16 CPU/32 RAM.

  2. Two nodes.

Resources:

  • 2 nodes,

  • 16 CPU/32 RAM for the first node; 8 CPU/16 RAM for the second node.

Medium Load

Please find the installation guide here: Docker.

  • Type of containerization: Docker/Podman,

  • Type of installation: Docker compose,

  • Autoscaling/HA: manual scaling; HA is partially supported.

Requirements

Computational resources

Depending on load, you can change the number of nodes. However, for 5+ nodes, we recommend that you proceed to the High Load section.

  • From 2 to 4 Docker nodes (see schemes):

    • 2 Nodes:

      • 24 CPU/32 RAM per node.

    • 3 Nodes:

      • 16 CPU/24 RAM per node.

    • 4 Nodes:

      • 8 CPU/16 RAM for two nodes (each),

      • 16 CPU/24 RAM for two nodes (each).

We recommend using an external self-managed PostgreSQL database and an NFS share.

Software

  • Docker 19.03+ or Podman 4.4+,

  • Python 3.4+.

Storage

  • Depends on image quality and required archive depth.

  • Can be estimated as: [average image size] * 2 * [analyses per day] * [archive depth in days].

Staff qualification:

  • Advanced knowledge of Linux, Docker, and Postgres.

Deployment

2 nodes:

3 nodes:

4 nodes:

High Load

Please find the installation guide here: Kubernetes.

  • Type of containerization: Docker containers with Kubernetes orchestration,

  • Type of installation: Helm charts,

  • Autoscaling/HA: supports autoscaling; HA for most components.

Requirements

Computational resources

3-4 nodes. Depending on load, you can change the number of nodes.

  • 16 CPU/32 RAM nodes for the BIO pods,

  • 8+ CPU/16+ RAM nodes for all other workloads.

We recommend using an external self-managed PostgreSQL database.

Requires an RWX (ReadWriteMany) StorageClass or an NFS share.

Software

  • Docker 19.03+,

  • Python 3.4+.

Storage

  • Depends on image quality and required archive depth.

  • Can be estimated as: [average image size] * 2 * [analyses per day] * [archive depth in days].

Staff qualification:

  • Advanced knowledge of Linux, Docker, Kubernetes, and Postgres.

Deployment Scheme
