Automated video identification of marine species (AVIMS) - new application: report

A commissioned report on the development of a web-based computer application for machine learning-based (semi-)automated analysis of underwater video footage obtained during the monitoring of aquatic environments.


Hardware and Software Details

In the following two subsections, we give hardware and software details of the application website (AVIMS site) and of the GPU worker (AVIMS worker), where the time-consuming machine learning training and inference are performed.

AVIMS site

Server Hardware and Operating System (OS)

Our web server is a virtual machine with 4 GB of RAM running on a larger physical server. We use the Ubuntu Linux OS and the Nginx web server. Our data is stored in a PostgreSQL database. Large files, such as image and video files, are stored on the file system.
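As a minimal sketch, a Django project typically points at such a database through its settings module; the database name, credentials and media path below are placeholders rather than the production values:

    # settings.py (illustrative excerpt; names and credentials are placeholders)
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": "avims",
            "USER": "avims_user",
            "PASSWORD": "change-me",
            "HOST": "localhost",
            "PORT": "5432",
        }
    }

    # Large image and video files live on the file system, not in the database
    MEDIA_ROOT = "/srv/avims/media"
    MEDIA_URL = "/media/"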

Web application

Our web application uses the Django web application framework (djangoproject.com). Django provides an object-relational mapper (ORM) that gives object-oriented access to the rows stored in the PostgreSQL database. Its template system simplifies the design of HTML web pages that incorporate data extracted from the database.
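By way of illustration, the hypothetical Video model below shows how the ORM maps a database table to a Python class and turns queries into method calls; the model and field names are our own examples, not the actual AVIMS schema:

    from django.db import models

    class Video(models.Model):
        # Hypothetical model: names are illustrative, not the AVIMS schema
        title = models.CharField(max_length=255)
        source_file = models.FileField(upload_to="videos/")  # stored on the file system
        uploaded_at = models.DateTimeField(auto_now_add=True)

    # The ORM translates queries like this into SQL against PostgreSQL
    recent_videos = Video.objects.order_by("-uploaded_at")[:10]

A template can then iterate over a queryset such as recent_videos to render the page, which is the pattern the template system supports.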

We use the Twitter Bootstrap 4.3.1 framework (getbootstrap.com) to provide layout and user interface controls. AVIMS uses the django-labeller labelling tool (github.com/Britefury/django-labeller) to allow users to annotate images quickly and effectively.

Video handling

The video files provided by the Marine Directorate came in a variety of formats, depending on their source. Early in the project we found it necessary to convert the video files to a common format. For this, we adopted FFmpeg (ffmpeg.org). FFmpeg was also able to convert interlaced video to a non-interlaced format in the instances where this was necessary.
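As an illustration, a single FFmpeg invocation can both transcode to a common format and deinterlace. The sketch below, wrapped in Python's subprocess module, uses the standard yadif deinterlacing filter; the file names and codec choices are assumptions for illustration, not the project's exact settings:

    import subprocess

    # Transcode a source video to H.264/AAC in an MP4 container, deinterlacing
    # with the yadif filter. File names and codecs are illustrative choices.
    subprocess.run(
        [
            "ffmpeg",
            "-i", "input_source.avi",  # source file in its original format
            "-vf", "yadif",            # deinterlace interlaced footage
            "-c:v", "libx264",         # H.264 video
            "-c:a", "aac",             # AAC audio
            "output_common.mp4",
        ],
        check=True,
    )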

AVIMS worker

Hardware

Training object detection models and analysing video files requires the use of a GPU. We therefore performed these tasks on a separate machine with an NVIDIA GeForce GTX 1080 Ti GPU, 32 GB of RAM and an Intel Core i7 CPU.

Software

We used the Celery distributed task queue for Python (celeryq.dev) to queue and run the machine learning tasks required by AVIMS.
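A minimal sketch of how such a queue is set up with Celery follows; the broker URL and task name are assumptions for illustration, not the actual AVIMS configuration:

    from celery import Celery

    # Broker URL is a placeholder; any Celery-supported broker would work
    app = Celery("avims", broker="amqp://localhost")

    @app.task
    def train_model(training_job_id):
        # Placeholder body: the real task fetches annotated images from the
        # web server, trains the object detection model on the GPU, and
        # uploads the trained model on completion
        ...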

When an AVIMS user initiates a model training or video analysis job, the task is appended to the Celery queue, which runs on the web server. The GPU worker machine retrieves the job, downloads the relevant data from the web server, runs the task while periodically reporting progress, and uploads the results to the web server on completion.
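The sketch below illustrates this lifecycle under the same assumptions: a bound Celery task reports progress through update_state, and the web server enqueues it with delay and polls its state. The task body, backend and meta fields are illustrative only:

    from celery import Celery

    # Broker and result backend URLs are placeholders
    app = Celery("avims", broker="amqp://localhost", backend="rpc://")

    @app.task(bind=True)
    def analyse_video(self, video_id):
        # Illustrative loop standing in for batched GPU inference over frames
        total_frames = 1000
        for frame_no in range(0, total_frames, 100):
            # ... run inference on the next batch of frames ...
            # Periodically report progress back via the result backend
            self.update_state(state="PROGRESS",
                              meta={"frames_done": frame_no, "total": total_frames})
        # ... upload the results to the web server on completion ...
        return {"status": "complete"}

    # On the web server, the job is enqueued and its progress polled:
    # result = analyse_video.delay(video_id=42)
    # result.state, result.info  -> "PROGRESS", {"frames_done": ..., "total": ...}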

Contact

Email: craig.robinson@gov.scot
