International Contest on Illegal Waste Dumping Detection


Contest

The Illegal Waste Dumping Detection (IWDD) contest is an international competition organized to encourage participants to develop advanced methods for real-time illegal waste dumping detection in videos captured by fixed CCTV cameras, runnable on smart cameras or embedded systems.

Global Impact

Illegal dumping poses a major global issue, burdening cities with cleanup costs, harming ecosystems, and threatening public health.

Data Scarcity

Comprehensive, well-annotated video datasets are scarce; most existing work relies on private datasets, hindering progress toward intelligent surveillance systems.

Technical Challenge

Variations in human behavior, waste types, and environmental factors reduce the effectiveness of conventional models.

Types of Dumping Actions

The challenge of detecting illegal waste dumping lies in understanding and recognizing different behavioral patterns. Dumping actions can be categorized into two distinct types, each presenting unique detection challenges and requiring specialized temporal modeling approaches.

STATIC DISPOSAL

Waste is deliberately placed and left in a specific location, such as abandoning garbage bags, furniture, or bulky items on the street or in unauthorized areas.

DYNAMIC DISPOSAL

Spontaneous, quick disposal actions performed while in motion, such as tossing waste from a moving vehicle, throwing litter while walking, or rapidly discarding objects during brief stops.


MIVIA-IWDD Dataset

The MIVIA-IWDD dataset is a comprehensive collection of video clips specifically designed to train and evaluate action recognition models for detecting illegal waste disposal events. The dataset is completely balanced to ensure unbiased model training and robust evaluation.

  • 400 Total Videos
  • 200 Positive Samples
  • 200 Negative Samples
  • Varied Resolutions
Temporal Annotations

Each positive video includes precise timestamp annotations marking the exact moment when illegal dumping occurs, enabling fine-grained temporal action localization and early detection training.

Real-World Diversity

Videos captured in diverse real-world conditions: day/night scenarios, multiple lighting conditions, various camera angles, and different environmental contexts to ensure robust model generalization.

Balanced Distribution

Perfectly balanced dataset with equal representation: 200 positive samples (100 static + 100 dynamic) and 200 negative samples, ensuring unbiased training and fair evaluation metrics.

Challenging Negative Samples

Negative samples featuring challenging scenarios without dumping events, helping models learn to distinguish normal activities from illegal waste disposal and reduce false positives.

Evaluation Protocol

Submitted methods are evaluated according to three key metrics: Precision, Recall, and F1-Score. The test set includes both positive samples (videos containing illegal waste dumping events) and negative samples (videos with no dumping activity).

Each prediction is evaluated by comparing the ground truth dumping start instant $g$ with the detection instant $p$. The ground truth represents the first frame where the dumping action becomes visible, manually annotated after reviewing the complete video. A tolerance of $\Delta t = 3$ seconds is allowed for early detection, while detections occurring later than $T_{max} = 10$ seconds after the ground truth instant are discarded.

To compute evaluation metrics, each detection is assigned to one of three categories:

  • True Positives $(TP)$: detections in positive videos that occur within the valid time window, i.e. $(g - \Delta t) \leq p \leq (g + T_{max})$;
  • False Positives $(FP)$: detections in negative videos, or detections in positive videos that occur too early or too late, namely when $p < (g - \Delta t)$ or $p > (g + T_{max})$;
  • False Negatives $(FN)$: the set of positive videos where no detection occurs.

From these sets, standard metrics are computed:

Precision

Measures the system's ability to reject false alarms. Higher precision indicates fewer false positives.

$P = \frac{|TP|}{|TP| + |FP|}$

Recall

Measures the system's sensitivity to detecting dumping events. Higher recall means fewer missed events.

$R = \frac{|TP|}{|TP| + |FN|}$

F1-Score

The harmonic mean of precision and recall, providing a balanced measure that combines both metrics.

$F_1 = 2 \times \frac{P \times R}{P + R}$
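The classification rules and metrics above can be sketched as a small scoring helper. This is an illustrative implementation of the described protocol, not the organizers' official scorer; the representation of each video as a pair of optional timestamps is an assumption for the example.

```python
# Sketch of the evaluation protocol (hypothetical helper, not the official
# scorer). Each video is a pair (g, p): g is the ground-truth dumping
# instant (None for negative videos), p is the method's detection instant
# (None if no detection was produced).

DELTA_T = 3.0  # early-detection tolerance Δt (seconds)
T_MAX = 10.0   # maximum accepted delay T_max (seconds)

def classify(g, p):
    """Return 'TP', 'FP', 'FN', or 'TN' for one video."""
    if p is None:
        # No detection: a miss on positive videos, correct on negatives.
        return "FN" if g is not None else "TN"
    if g is None:
        # Any detection in a negative video is a false alarm.
        return "FP"
    # Detection in a positive video counts only inside the time window.
    return "TP" if (g - DELTA_T) <= p <= (g + T_MAX) else "FP"

def precision_recall_f1(videos):
    """videos: list of (g, p) pairs. Returns (P, R, F1)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for g, p in videos:
        counts[classify(g, p)] += 1
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For example, a detection at $p = 8$ s for a ground truth at $g = 10$ s falls inside $[g - \Delta t, g + T_{max}] = [7, 20]$ and counts as a true positive.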

Final Ranking: The contest winner will be determined based on the highest F1-Score achieved on the test set, as it provides the most balanced evaluation of detection accuracy combining both precision and recall performance.

However, to emphasize the applicability of these methods in real-time scenarios, we also compute additional metrics that evaluate both detection speed and computational efficiency. These metrics help identify methods that can operate effectively in resource-constrained surveillance environments while maintaining high detection accuracy.

Notification Delay

Measures how quickly the method detects illegal dumping after it occurs. Lower notification delay means faster response, crucial for timely intervention in illegal waste disposal activities. The normalized score transforms the average delay into a value between 0 (worst performance, maximum delay) and 1 (best performance, instantaneous detection).

Variables:

  • $d_i = |p_i - g_i|$ (detection delay for video $i$)
    • $p_i$ = detection instant
    • $g_i$ = ground truth instant
  • $D = \frac{\sum_{i=1}^{|TP|} d_i}{|TP|}$ (average delay in seconds)

$D_{norm} = \max(0, 1 - \frac{D}{T_{max}})$
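The normalized delay score above can be computed as follows (a minimal sketch of the stated formula, with $T_{max} = 10$ s taken from the evaluation protocol):

```python
T_MAX = 10.0  # maximum accepted delay T_max (seconds)

def delay_score(delays):
    """delays: per-true-positive detection delays d_i = |p_i - g_i|, in
    seconds. Returns D_norm in [0, 1]; 1 means instantaneous detection."""
    if not delays:
        return 0.0  # no true positives: no delay score can be earned
    avg = sum(delays) / len(delays)          # D, average delay
    return max(0.0, 1.0 - avg / T_MAX)       # D_norm
```

For instance, an average delay of 5 s over $T_{max} = 10$ s yields $D_{norm} = 0.5$.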

Processing Frame Rate

The average number of frames processed by the method in one second on a target GPU. Higher PFR indicates faster processing speed, essential for real-time surveillance applications.

Variables:

  • $PFR_{target}$ = target processing frame rate;
  • $PFR = \frac{N}{\sum_{i=1}^{N} t_i}$ (frames per second);
    • $t_i$ = processing time for frame $i$ (seconds);
    • $N$ = total number of frames processed;

$PFR_{delta} = \max(0, \frac{PFR_{target}}{PFR} - 1)$
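The frame-rate penalty can be sketched directly from the definitions above; the target value used here is an illustrative assumption, not an official contest number:

```python
PFR_TARGET = 25.0  # example target frame rate in fps (assumed value)

def pfr_delta(frame_times):
    """frame_times: per-frame processing times t_i in seconds.
    Returns the PFR_delta penalty (0 when PFR meets the target)."""
    pfr = len(frame_times) / sum(frame_times)       # PFR = N / Σ t_i
    return max(0.0, PFR_TARGET / pfr - 1.0)          # penalty if too slow
```

A method averaging 40 ms per frame runs at 25 fps and incurs no penalty; one averaging 80 ms runs at 12.5 fps and is penalized.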

Memory Usage

The memory in GB occupied by the method on the target GPU during inference. Lower memory usage enables deployment on resource-constrained edge devices for smart city surveillance.

Variables:

  • $MEM_{target}$ = target memory threshold;
  • $MEM$ = peak GPU memory consumption (GB);

$MEM_{delta} = \max(0, \frac{MEM}{MEM_{target}} - 1)$
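Analogously, the memory penalty follows the formula above (sketch; the threshold here is an illustrative assumption, not an official contest value):

```python
MEM_TARGET = 4.0  # example GPU memory budget in GB (assumed value)

def mem_delta(peak_mem_gb):
    """peak_mem_gb: peak GPU memory consumption during inference (GB).
    Returns the MEM_delta penalty (0 when usage stays within the target)."""
    return max(0.0, peak_mem_gb / MEM_TARGET - 1.0)
```

A method peaking at 2 GB stays within the 4 GB budget (penalty 0); one peaking at 6 GB exceeds it by 50% (penalty 0.5).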

Rules

  1. The deadline for the submission of methods is 10th December, 2025. The submission must be made via email, in which the participants share (directly or through external links) the trained model, the code, and the report. Please follow the detailed instructions reported here.
  2. The participants can receive the training set and its annotations by sending an email, in which they also communicate the name of the team.
  3. The participants can use these training samples and annotations to train and validate their methods, and are also encouraged to use external data to improve the performance of their models.
  4. The participants must submit their trained model and their code by carefully following the detailed instructions reported here.
  5. The participants are strongly encouraged to submit a contest paper to WASTEVISION 2026, whose deadline is 15th December, 2025. To properly format their papers, authors must use the official WACV 2026 Author Kit Template, which contains all necessary LaTeX files, style sheets, and formatting guidelines required for submission. Authors can find complete instructions on how to format their papers on the workshop website, available here. Accepted papers will be included in the WACV 2026 Workshops Proceedings here.

Instructions

The methods proposed by the participants will be executed on a private test set. To leave participants free to use any software libraries they prefer, and to correctly reproduce their processing pipelines, the evaluation will be done on Google Colab (follow this tutorial) by running the submitted code on the samples of our private test set.

Therefore, the participants must submit an archive including the following elements:

  • A Python script test.py, which takes as input the folder of the test videos (--videos) and produces as output a TXT file with the detection time (or nothing) for each video in a folder (--results). Thus, the script may be executed with the following command:
    python test.py --videos foo_videos/ --results foo_results/
  • A Google Colab Notebook test.ipynb, which includes the commands for installing all the software requirements and executes the script test.py.
  • All the files necessary for running the test, namely the trained model, additional scripts and so on.
  • The provided sample test.py also shows how to read the annotation TXT files: each file contains the illegal waste dumping instant, or is empty if the video is negative. The result files must be formatted in exactly the same way, and the sample test.py shows how to write them.
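A minimal sketch of the expected test.py interface is given below, based on the command shown above. The detector itself is a placeholder (detect_dumping is a hypothetical stub to be replaced by the participant's trained model), and the one-timestamp-per-file output format follows the description of the annotation files.

```python
# Minimal sketch of the required test.py interface; detect_dumping is a
# placeholder for the participant's model, not a working detector.
import argparse
import os

def detect_dumping(video_path):
    """Placeholder: return the detection instant in seconds, or None
    when no illegal dumping is detected in the video."""
    return None  # replace with actual model inference

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--videos", required=True,
                        help="folder containing the test videos")
    parser.add_argument("--results", required=True,
                        help="output folder for the result TXT files")
    args = parser.parse_args()
    os.makedirs(args.results, exist_ok=True)
    for name in sorted(os.listdir(args.videos)):
        instant = detect_dumping(os.path.join(args.videos, name))
        out_path = os.path.join(args.results,
                                os.path.splitext(name)[0] + ".txt")
        # One TXT file per video: the detection instant, or empty
        # when the video is predicted as negative.
        with open(out_path, "w") as f:
            if instant is not None:
                f.write(f"{instant:.2f}\n")

if __name__ == "__main__":
    main()
```

Run as shown in the instructions: python test.py --videos foo_videos/ --results foo_results/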

The submission must be made by email. The archive file can be attached to the e-mail or shared via external links. We strongly recommend following the example code when preparing the submission.

Results

The final results of the IWDD Challenge will be published here after the evaluation of all submitted methods on the private test set. Stay tuned for updates on the leaderboard and detailed performance metrics of each method!