Contest
Societies have always faced fire-related risks, but with the advent of the industrial era these dangers increased due to malfunctions and the improper use of machinery. Fire is now one of the major threats to life, infrastructure and ecosystems. To prevent disasters and protect the environmental heritage, local authorities are seeking advanced surveillance systems based on Computer Vision algorithms that can automatically and reliably detect fires. Early fire detection approaches based on colour and motion models struggled to handle scene variability. With advances in Machine Learning and Deep Learning, detection capabilities have improved significantly. However, challenges remain due to the complexity of fire phenomena and the limitations of existing datasets. Detection failures occur when fire appears differently from the samples in the training sets, for instance when fires are visible from greater distances. In addition, moving objects that resemble fire, which may not have been present during training, can confuse the system and increase the number of false alarms. These issues explain why fire detection cannot yet be considered definitively solved.
From an analysis of the literature, two main gaps emerge in existing methods. The first relates to the need to design methods around the application scenario. In simpler cases, where flames or smoke are clearly visible and no other moving objects are present, well-trained frame-wise fire detectors ensure good performance. However, when flames are small in the images or when many moving objects may resemble fire, more complex models are needed, including temporal analysis techniques. Adding scenario awareness to the methods and using approaches tailored to the specific situation allows for better performance in real-world deployments. The second gap concerns the optimal balance between precision and recall. Although existing methods are sensitive and capable of detecting fires with good recall, they are not always precise in distinguishing fire from similar moving objects. This issue was also observed in the first ONFIRE 2023 contest, where even the best methods were found to lack precision. It is worth pointing out that false positives not only compromise operational effectiveness and trust in the system, but also increase costs and inefficiency, as human operators must handle a high number of false alarms. Additionally, to comply with the strict requirements on energy, bandwidth and processing resources in remote areas, these systems must be suited for smart cameras with graphic accelerators or embedded devices with limited resources, while maintaining high detection performance.
ONFIRE 2025 is the second edition of an international competition organized to encourage participants to develop advanced methods for real-time fire detection in videos captured by fixed CCTV cameras, runnable on smart cameras or embedded systems. The goal of the contest is to stimulate innovative solutions that address these challenges with algorithms that are reliable across four application scenarios of varying difficulty:
- Low Activity - Short Range (LA-SR) – Easy level
- Low Activity - Long Range (LA-LR) – Intermediate level
- High Activity - Short Range (HA-SR) – Difficult level
- High Activity - Long Range (HA-LR) – Intermediate level
Each submitted method will be evaluated on the entire private test set, which includes unseen videos different from those in the training set but consistent with it and categorized by scenario. Participants will compete in both an overall ranking and scenario-based rankings. Additionally, the average frame processing rate and memory usage will be measured, to assess both the algorithm's speed and the computational resources required for efficient operation. A final score, combining the F1-score with the required processing resources, will determine the competition rankings. For training purposes, participants will have access to an expanded dataset compared to the one provided in the previous ONFIRE 2023 contest. This dataset includes over 300 videos from public sources for flame and smoke detection, each annotated with the exact moment the fire starts and categorized by scenario (LA-SR, LA-LR, HA-LR and HA-SR). Furthermore, participants will have the option to extend the training dataset, provided the additional data is made publicly available, thus encouraging the scientific community to develop new and better solutions. A reference baseline will also be provided to enable performance and computational comparisons.
Dataset
The ONFIRE 2025 dataset, compared to the ONFIRE 2023 version, presents a significant expansion of the training set. Each video will be labelled as positive or negative: videos showing flames or smoke will be considered positive, while those containing only fire-like or smoke-like elements (but no actual fire) will be classified as negative. The training set will include videos from publicly available datasets (such as MIVIA, RISE, KMU, Wildfire, D-Fire and others). One of the main innovations of the ONFIRE 2025 contest is the categorization system for the videos. While the annotation method will remain unchanged, using an integer index to indicate the exact moment of ignition instead of classic bounding boxes, the videos will now be categorized by scenarios of varying difficulty (LA-SR, LA-LR, HA-LR and HA-SR), allowing participants to choose the context in which they prefer to specialize their methods. Each video file will have its own text file with the annotation value, and the videos with their corresponding annotation files will be organized into folders named according to the scenario. Participants will have the option to extend the provided dataset if the new videos are then made publicly available, in order to encourage the scientific community to develop better solutions. Since the videos are collected under various conditions, the dataset will be highly heterogeneous.
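As an illustration, the snippet below sketches how the training data might be traversed once downloaded. It assumes that the scenario folder names match the labels above (LA-SR, LA-LR, HA-SR, HA-LR) and that each annotation file shares the stem of its video; the released archive may use slightly different conventions, so this is only a sketch to adapt, not the official loading code.

```python
from pathlib import Path

# Scenario folder names as described above; adjust if the released archive differs.
SCENARIOS = ["LA-SR", "LA-LR", "HA-SR", "HA-LR"]

def load_annotations(dataset_root: str):
    """Collect (video_path, scenario, ignition_index) triples.

    ignition_index is the integer frame index read from the per-video TXT file,
    or None for negative videos (empty or missing annotation file).
    """
    samples = []
    root = Path(dataset_root)
    for scenario in SCENARIOS:
        for video in sorted((root / scenario).glob("*")):
            if video.suffix.lower() == ".txt":
                continue  # skip the annotation files themselves
            ann_file = video.with_suffix(".txt")  # assumed naming: same stem as the video
            text = ann_file.read_text().strip() if ann_file.exists() else ""
            ignition_index = int(text) if text else None
            samples.append((video, scenario, ignition_index))
    return samples

# Example usage: list each video with its scenario and label
if __name__ == "__main__":
    for video, scenario, ignition in load_annotations("onfire2025_train"):
        label = "positive" if ignition is not None else "negative"
        print(video.name, scenario, label)
```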
Evaluation protocol
The fire detection accuracy of the competing methods will be evaluated in terms of $\textbf{Precision (P)}$, $\textbf{Recall (R)}$ and $\textbf{F1-Score (F)}$. To formalize these metrics, it is necessary to define the sets of $\textbf{True Positives (TP)}$, $\textbf{False Positives (FP)}$ and $\textbf{False Negatives (FN)}$. Our test set contains both positive and negative samples, i.e. videos depicting fire events or not, respectively. Each positive video shows one fire event only. Each prediction is evaluated by comparing the fire start instant $\mathbf{g}$ with the detection instant $\mathbf{p}$. The fire start instant is the first frame in which the fire is visible; it is manually labelled by a human operator after viewing the whole video and thus serves as ground truth. Nevertheless, we consider that automatic methods may anticipate human detection by a few seconds, namely $\mathbf{\Delta}_{\text{t}} = 5$ seconds. At the same time, we discard detections occurring later than $\mathbf{T}_{\text{max}} = 60$ seconds after the fire start. Consequently, we define:
- $\textbf{TP}$: all the detections occurring in positive videos at $\mathbf{\boldsymbol{g - \Delta_{\text{t}} \leq p \leq g + T_{\text{max}}}}$;
- $\textbf{FP}$: all the detections occurring at any instant in negative videos, or in positive videos at $\mathbf{\boldsymbol{p < \max(0, g - \Delta_{\text{t}})}}$;
- $\textbf{FN}$: the set of positive videos for which no fire detection occurs;
Having defined the sets $\textbf{TP}$, $\textbf{FP}$ and $\textbf{FN}$, we can compute $\textbf{P}$, $\textbf{R}$ and $\textbf{F}$ from their cardinalities $\textbf{|TP|}$, $\textbf{|FP|}$ and $\textbf{|FN|}$:
$\mathbf{\boldsymbol{P = \frac{|TP|}{|TP| + |FP|}}}$
$\mathbf{\boldsymbol{R = \frac{|TP|}{|TP| + |FN|}}}$
$\mathbf{\boldsymbol{F = \frac{2 \cdot (P \cdot R)}{P + R}}}$
- $\textbf{P}$ assumes values in the range $\mathbf{[0, 1]}$ and measures the capability of the method to reject false positives; the higher $\textbf{P}$ is, the higher the specificity of the method;
- $\textbf{R}$ assumes values in the range $\mathbf{[0, 1]}$ and evaluates the capability of the method to detect fires; the higher $\textbf{R}$ is, the higher the sensitivity of the method;
- $\textbf{F}$ assumes values in the range $\mathbf{[0, 1]}$ and measures the balance between $\textbf{P}$ and $\textbf{R}$; a sketch of how these quantities can be computed from per-video predictions is reported below.
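The following minimal sketch illustrates the matching rules and metrics defined above. It assumes at most one detection instant per video, with $g$ set to None for negative videos, instants expressed in seconds (converted from frame indices using the video frame rate, if needed), and $\Delta_{\text{t}} = 5$ s, $T_{\text{max}} = 60$ s; a detection later than $T_{\text{max}}$ is treated here as a miss, which is one reading of the protocol. It is an illustration, not the official evaluation code.

```python
DELTA_T = 5.0   # seconds a detection may anticipate the annotated ignition
T_MAX = 60.0    # seconds after ignition beyond which detections are discarded

def evaluate(results):
    """results: list of (g, p) pairs, one per test video.

    g: annotated fire start instant in seconds, or None for negative videos.
    p: earliest detection instant in seconds, or None if no fire was detected.
    """
    tp = fp = fn = 0
    for g, p in results:
        if g is None:                                   # negative video
            if p is not None:
                fp += 1                                 # any detection is a false positive
        else:                                           # positive video
            if p is None:
                fn += 1                                 # missed fire
            elif g - DELTA_T <= p <= g + T_MAX:
                tp += 1                                 # timely detection
            elif p < max(0.0, g - DELTA_T):
                fp += 1                                 # detection before the fire is visible
            else:
                fn += 1                                 # detection discarded as too late; counted as a miss

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```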
Then, we compute the $\textbf{Processing Frame Rate (PFR)}$, namely the average number of frames processed by the method in one second on a target $\textbf{GPU}$. In particular, let $\mathbf{N}$ be the total number of frames processed in the test set and $\mathbf{t}_{\text{i}}$ the processing time in seconds for the $i$-th frame; the $\textbf{PFR}$ is then computed as follows:
$\mathbf{\boldsymbol{PFR = \frac{N}{\sum_{i=1}^{N}{t_i}}}}$
The higher the $\mathbf{PFR}$, the higher the processing speed of the method. To normalize this value with respect to the minimum processing frame rate needed to achieve real-time performance, namely $\mathbf{PFR}_{\text{target}}$, we compute the $\mathbf{PFR}_{\text{delta}}$ score as follows:
$\mathbf{PFR_{\boldsymbol{delta}} = \max(0, \frac{PFR_{\boldsymbol{target}}}{PFR} - 1)}$
Then, we measure the $\textbf{Memory Usage (MEM)}$, namely the memory in GB occupied by the method on the target $\textbf{GPU}$. The lower the $\textbf{MEM}$, the lower the memory required on the processing device. To normalize this value with respect to the maximum $\textbf{GPU}$ memory available on the target processing device, namely $\mathbf{MEM}_{\text{target}}$, we compute the $\mathbf{MEM}_{\text{delta}}$ score as follows:
$\mathbf{MEM_{\boldsymbol{delta}} = \max(0, \frac{MEM}{MEM_{\boldsymbol{target}}} - 1)}$
Finally, we define the $\textbf{Constrained Fire Detection Score (CFDS)}$ computed as follows:
$\mathbf{CFDS = \frac{F}{(1 + PFR_{\boldsymbol{delta}}) \cdot (1 + MEM_{\boldsymbol{delta}})}}$
The method achieving the highest $\textbf{CFDS}$ will be crowned the winner of the ONFIRE 2025 contest, showcasing the best balance between fire detection accuracy across scenarios of differing difficulty, promptness of notification and resource efficiency. Winners will also be recognized for the scenario-specific rankings, in addition to the overall ranking.
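For clarity, the sketch below shows how the final score could be assembled from the quantities defined above. The values of $\mathbf{PFR}_{\text{target}}$ and $\mathbf{MEM}_{\text{target}}$ are left as parameters, since they depend on the target processing device and are not fixed in this description.

```python
def cfds(f1, frame_times, mem_gb, pfr_target, mem_target_gb):
    """Constrained Fire Detection Score from accuracy and resource measurements.

    f1            : F1-score on the private test set
    frame_times   : list of per-frame processing times in seconds (length N)
    mem_gb        : GPU memory occupied by the method, in GB
    pfr_target    : minimum frame rate required for real-time operation
    mem_target_gb : maximum GPU memory available on the target device
    """
    pfr = len(frame_times) / sum(frame_times)            # PFR = N / sum(t_i)
    pfr_delta = max(0.0, pfr_target / pfr - 1.0)         # penalty if slower than real time
    mem_delta = max(0.0, mem_gb / mem_target_gb - 1.0)   # penalty if above the memory budget
    return f1 / ((1.0 + pfr_delta) * (1.0 + mem_delta))
```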
Rules
- The deadline for the submission of the methods is 6th June, 2025. The submission must be done via email, in which the participants share (directly or through external links) the trained model, the code and the report. Please follow the detailed instructions reported here.
- The participants can receive the training set and its annotations by sending an email, indicating the name of their team.
- The participants can use these training samples and annotations, as well as additional videos.
- The participants must submit their trained model and their code by carefully following the detailed instructions reported here.
- The participants are strongly encouraged to submit a contest paper by the deadline of 13th June, 2025. The paper can be submitted through Easychair. Authors can find complete instructions on how to format their papers here. The maximum number of pages is 12, including references. Accepted papers will be included in the ICIAP 2025 Workshops Proceedings.
Instructions
The methods proposed by the participants will be executed on a private test set. To leave the participants totally free to use all the software libraries they prefer and to correctly reproduce their processing pipeline, the evaluation will be done on Google Colab (follow this tutorial) by running the code submitted by the participants on the samples of our private test set.
Therefore, the participants must submit an archive (download an example) including the following elements:
- A Python script test.py, which takes as input the folder of the test videos (--videos) and produces as output a TXT file with the detection time (or nothing) for each video in a folder (--results). Thus, the script may be executed with the following command:
python test.py --videos foo_videos/ --results foo_results/
- A Google Colab Notebook test.ipynb, which includes the commands for installing all the software requirements and executes the script test.py.
- All the files necessary for running the test, namely the trained model, additional scripts and so on.
The provided sample test.py also shows how to read the TXT annotation files: each file contains the fire ignition instant, or is empty if the video is negative. The result files must be formatted in exactly the same way, and the sample test.py shows how to write them.
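To make the expected interface concrete, the skeleton below sketches a test.py matching the command above. The detection logic is only a placeholder, and the output convention (one TXT file per video, containing the detection instant or nothing) follows the description above; it should be cross-checked against the provided example archive rather than taken as the official template.

```python
import argparse
from pathlib import Path

def detect_fire(video_path: Path):
    """Placeholder for the participant's detector.

    Should return the detection instant in the same format as the annotations
    (an integer frame index), or None if no fire is detected in the video.
    """
    raise NotImplementedError

def main():
    parser = argparse.ArgumentParser(description="ONFIRE 2025 test script skeleton")
    parser.add_argument("--videos", required=True, help="folder containing the test videos")
    parser.add_argument("--results", required=True, help="folder where the result TXT files are written")
    args = parser.parse_args()

    results_dir = Path(args.results)
    results_dir.mkdir(parents=True, exist_ok=True)

    for video in sorted(Path(args.videos).iterdir()):
        detection = detect_fire(video)
        # One TXT file per video: the detection instant, or an empty file if no fire is detected.
        out_file = results_dir / (video.stem + ".txt")
        out_file.write_text("" if detection is None else str(detection))

if __name__ == "__main__":
    main()
```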
The submission must be done by email. The archive file can be attached to the e-mail or shared through external links. We strongly recommend following the code example to prepare the submission.
Organizers

Diego Gragnaniello
Tenure-Track Assistant Professor
Dept. of Information and Electrical Engineering and Applied Mathematics (DIEM)
University of Salerno, Italy

Antonio Greco
Tenure-Track Assistant Professor
Dept. of Information and Electrical Engineering and Applied Mathematics (DIEM)
University of Salerno, Italy

Carlo Sansone
Full Professor
Dept. of Electrical Engineering and Information Technology (DIETI)
University of Napoli - Federico II, Italy

Bruno Vento
PhD Student
Dept. of Electrical Engineering and Information Technology (DIETI)
University of Napoli - Federico II, Italy
Contact
onfire2025@unisa.it
+39 089 963006