AgrI Challenge
A Data-Centric Competition Framework for Agricultural Machine Learning
Twelve interdisciplinary teams independently collected 50,673 field images of 6 tree species over a 2-day campaign. We introduce the Cross-Team Validation (CTV) framework, a new paradigm for evaluating real-world model generalization.
What is AgrI Challenge?
A participant-led AI competition where each team independently collects their own field data — generating authentic, real-world distributional diversity.
Interdisciplinary Teams
Each of the 12 teams combined students from computing backgrounds (computer science, AI, data science) with students from ecological or agronomic backgrounds (agriculture, forestry, plant sciences). This mixed composition guaranteed high-quality species labeling grounded in domain expertise while supporting effective ML system design.
Competition Phases
Data Collection Phase
Teams independently collected field images of 6 tree species at the experimental facilities of École Nationale Supérieure Agronomique. Each team had full freedom over their collection devices, sampling strategies, environmental coverage, and imaging protocols, generating authentic domain diversity.
Model Development Phase
Teams preprocessed, annotated, and trained machine learning models using only their own collected data at École Nationale Supérieure d'Intelligence Artificielle. AI specialists mentored teams on data preparation, model design, training, and evaluation.
Why AgrI Challenge?
Standard AI benchmarks fail in the field. AgrI Challenge was designed to close the gap between laboratory accuracy and real-world performance.
The Generalization Problem
Models trained on controlled laboratory datasets achieve over 99% accuracy on benchmarks, yet drop to 54% in real farm environments. This is not a model problem — it is a data problem.
The AgrI Challenge Approach
Rather than providing a fixed dataset, each team independently collects their own field data — generating authentic distributional diversity across devices, environments, and sampling strategies that mirrors real deployment conditions.
What the Results Show
Under single-team training (TOTO), models showed a mean validation-test gap of up to 16.2 percentage points — a direct consequence of narrow data diversity. When trained collaboratively across all 12 teams (LOTO), that gap collapsed by 82–84%, confirming that data diversity is the primary driver of robust AI generalization.
Where & Who
The AgrI Challenge was organized through a collaboration between two leading Algerian academic institutions, held across two campuses in Algiers.
École Nationale Supérieure Agronomique
A long-established agronomic institution in Algeria, founded in 1905. The experimental and teaching facilities at ENSA, El Harrach, provided access to representative agro-ecosystems and well-maintained plant collections.
ensa.dz

École Nationale Supérieure d'Intelligence Artificielle
A national center of excellence dedicated to education and research in artificial intelligence and data science. ENSIA specialists in AI and machine learning mentored teams throughout the modeling phase.
ensia.edu.dz

Contact
For questions about the dataset, access requests, or research collaboration:
mohamed.brahimi@ensia.edu.dz

Corresponding author: Dr. Mohammed Brahimi, ENSIA, Algiers, Algeria.
Event Gallery
A look at the AgrI Challenge in action — 12 teams in the field, collecting real agricultural data and building AI models from the ground up.
Cross-Team Validation
A novel evaluation paradigm that treats each team's independently collected dataset as a distinct domain, revealing how models actually generalize.
What is CTV?
Cross-Team Validation (CTV) treats datasets collected by different teams as distinct domains for training and testing. Unlike traditional cross-validation that splits data randomly, CTV preserves natural domain boundaries created by different collection methodologies, environmental contexts, and device characteristics.
Each team's dataset represents a unique combination of factors, capturing authentic inter-domain variation that simulates real-world deployment scenarios.
TOTO
Train-on-One-Team-Only
A model is trained on a single team's collected data (70% train / 30% val split) and then evaluated on every other team's data independently. Measures single-source generalization.
LOTO
Leave-One-Team-Out
A model is trained on the combined data of all teams except one, then evaluated on the held-out team's data. Measures collaborative multi-source generalization.
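The two protocols above can be sketched as simple split routines. This is a minimal illustration, not the released pipeline: the function and variable names (`toto_split`, `loto_split`, `team_data`) are hypothetical, and the 70/30 ratio follows the TOTO description above.

```python
# Illustrative sketch of the TOTO and LOTO splits (hypothetical helper
# names; the official AgrI Challenge pipeline may differ).
import random

def toto_split(team_data, team, val_frac=0.3, seed=0):
    """Train-on-One-Team-Only: 70/30 train/val split of one team's
    images; every other team's data becomes a separate test set."""
    images = list(team_data[team])
    random.Random(seed).shuffle(images)
    n_val = int(len(images) * val_frac)
    val, train = images[:n_val], images[n_val:]
    tests = {t: list(d) for t, d in team_data.items() if t != team}
    return train, val, tests

def loto_split(team_data, held_out):
    """Leave-One-Team-Out: pool every team except one for training;
    the held-out team's data is the test set."""
    train = [x for t, d in team_data.items() if t != held_out for x in d]
    test = list(team_data[held_out])
    return train, test

# Toy example: 12 teams with 10 placeholder images each.
data = {f"team{i}": [f"t{i}_img{j}" for j in range(10)] for i in range(12)}
toto_train, toto_val, toto_tests = toto_split(data, "team0")
loto_train, loto_test = loto_split(data, "team0")
```

Note that neither protocol ever mixes one team's images across the train/test boundary, which is exactly the domain separation that random cross-validation destroys.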
Why Standard Validation Is Not Enough
Standard cross-validation with random data splits does not expose a model's true generalization ability across genuinely different domains. CTV specifically reveals the validation–test gap that standard metrics miss.
“Validation accuracy was remarkably stable (≈98%) across all LOTO folds, while test accuracy varied by over 11 percentage points, confirming that validation accuracy alone is not a reliable proxy for real-world generalization.”
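The quantity CTV exposes can be computed directly from per-fold accuracies. The sketch below assumes a list of (validation, test) accuracy pairs, one per held-out team; the values shown are illustrative placeholders, not results from the paper.

```python
# Per-fold validation-test gap and test-accuracy spread: the metrics
# CTV is designed to surface. Accuracy values are illustrative only.
def gap_stats(folds):
    """folds: list of (val_acc, test_acc) pairs, one per held-out team."""
    gaps = [val - test for val, test in folds]
    mean_gap = sum(gaps) / len(gaps)
    # Spread of test accuracy across folds: large spread with flat
    # validation accuracy signals domain-dependent generalization.
    spread = max(t for _, t in folds) - min(t for _, t in folds)
    return mean_gap, spread

folds = [(0.98, 0.91), (0.98, 0.87), (0.99, 0.95), (0.98, 0.84)]
mean_gap, test_spread = gap_stats(folds)
print(f"mean val-test gap: {mean_gap:.3f}, test spread: {test_spread:.3f}")
```

A near-constant validation column paired with a wide test spread is the signature described in the quote above: validation accuracy alone cannot distinguish a robust model from one that merely memorized its source domain.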
Dataset
50,673 field images spanning 6 tree species, collected by 12 independent teams and curated into a 47,367-image clean benchmark.
Tree Species
Image Distribution by Team & Species
Full image count breakdown across all 12 teams and 6 species classes (Table 2 from the paper).
| Team | Carob | Oak | Peruvian Pepper | Ash | Pistachio | Tipuana | Total |
|---|---|---|---|---|---|---|---|
| AI-4o | 960 | 1,625 | 1,221 | 934 | 1,111 | 1,493 | 7,344 |
| AiGro | 686 | 604 | 595 | 572 | 541 | 635 | 3,633 |
| CACTUS | 400 | 597 | 694 | 803 | 755 | 552 | 3,801 |
| CHAJARA | 610 | 549 | 428 | 308 | 262 | 383 | 2,540 |
| GreenAI | 675 | 705 | 609 | 559 | 722 | 515 | 3,785 |
| PLT | 1,089 | 1,131 | 1,000 | 1,297 | 973 | 1,066 | 6,556 |
| RUSTICUS | 565 | 625 | 505 | 500 | 555 | 696 | 3,446 |
| SMART AGRICULTURES | 638 | 923 | 1,694 | 1,214 | 1,080 | 772 | 6,321 |
| Scorpions | 422 | 450 | 407 | 456 | 620 | 440 | 2,795 |
| Condimenteum | 1,048 | 1,000 | 655 | 1,074 | 606 | 1,121 | 5,504 |
| The Neural Ninjas | 234 | 262 | 256 | 236 | 293 | 361 | 1,642 |
| Organization team | 552 | 657 | 727 | 392 | 368 | 610 | 3,306 |
| Total | 7,879 | 9,128 | 8,791 | 8,345 | 7,886 | 8,644 | 50,673 |
Baseline Results
Dual architecture baselines evaluated under TOTO and LOTO protocols, providing benchmarks for future researchers using the dataset.
Baseline Architectures
DenseNet121
CNN
Uses dense connections between layers for efficient feature reuse. Pretrained on ImageNet-1K.
Swin Transformer
Vision Transformer
Computes self-attention within local windows with shifted windowing. Pretrained on ImageNet-1K.
TOTO Protocol — Validation vs. Test Gap
DenseNet121
The gap between validation and test accuracy emerged by epoch 5 and remained stable throughout training, indicating distributional shift rather than overfitting.
Swin Transformer
The gap between validation and test accuracy emerged by epoch 5 and remained stable throughout training, indicating distributional shift rather than overfitting.
LOTO Protocol — Collaborative Training Impact
DenseNet121
Swin Transformer
Request Dataset Access
The AgrI Challenge dataset is available for non-commercial research purposes. Submit a request and we will review it and contact you.
The dataset contains 50,673 field images of 6 tree species collected by 12 teams under the AgrI Challenge framework. Access is granted for academic and research use only. Upon submission, your request will be reviewed and you will receive a response at the provided email address.
By requesting access to the AgrI Challenge dataset you agree to all of the following conditions:
- Non-Commercial Use Only. The dataset may only be used for academic, educational, and non-commercial research purposes. Any commercial use, including but not limited to product development, commercial services, or for-profit applications, is strictly prohibited.
- No Redistribution. You may not share, publish, redistribute, sublicense, or make the dataset (or any portion of it) publicly available in any form. If a collaborator requires access, they must submit their own individual request to the authors.
- Mandatory Citation. Any publication, report, or presentation that uses or references this dataset must cite the AgrI Challenge paper. See the Citation section for the correct BibTeX and APA formats.
- No Derivative Datasets. You may not create and distribute datasets derived from the AgrI Challenge data without explicit written permission from the corresponding author (mohamed.brahimi@ensia.edu.dz).
- Responsible Use. The dataset must be used in a manner consistent with applicable privacy laws and ethical research standards. You are responsible for ensuring your use complies with your institution's policies.
Cite This Work
If you use the AgrI Challenge dataset or the Cross-Team Validation framework in your research, please cite our paper.
AgrI Challenge: Cross-Team Insights from a Data-Centric AI Competition in Agricultural Vision
Brahimi, Laabassi, Hadj Ameur, Boutorh, Siab-Farsi, Khouani, Zouak, Bouziane, Lakhdari & Benghanem · 2026 · arXiv preprint
arXiv:XXXX.XXXXX (the arXiv ID and link will be added upon publication.)