mmint-lab/VERA: VERA: Autonomous Replication and Validation of Machine Learning Experiments from Research Papers. Chu, A. January, 2026.
v1.0.0 - Initial Release

First public release of VERA (Verification Engine for Reproducible Analysis)

VERA is an autonomous agent that extracts, replicates, and validates machine learning experiments from research papers without manual intervention. Simply provide a PDF, and VERA handles extraction, execution, and comprehensive comparison reporting in ~35 seconds.

Highlights

100% extraction accuracy across 4 validated research papers (32 models total)
82% average Results Match score with paper-reported benchmarks
Zero manual intervention required during execution
Full transparency with reviewable standalone Python scripts (~250 lines per paper)
Web dashboard + CLI interface included

Features

Core Capabilities

Autonomous PDF extraction with hybrid rule-based + semantic NLP
UCI dataset auto-resolution (20+ datasets)
14+ ML models (SVM, Random Forest, XGBoost, LightGBM, CatBoost, etc.)
Ensemble model support (e.g., "AdaBoost+LR")
Hyperparameter extraction (30+ patterns)
4-component scoring: EQ, CQ, RM, E2E
Interactive visualizations (heatmaps, delta charts, slopegraphs)

Interfaces

CLI: python -m vera --pdf paper.pdf --output_dir results/
Web UI: React dashboard with FastAPI backend

Output Artifacts

COMPREHENSIVE_REPORT.html - Interactive report
replication_standalone.py - Reviewable code
results.json - Structured benchmarks
blueprint.md - Extraction details
figures/ - Publication-quality visualizations

Quick Start

Requirements

Python 3.11+
4GB RAM
Internet connection

Installation

# Create environment
conda create -n vera python=3.12 -y
conda activate vera

# Install dependencies
pip install -r requirements.txt

# Optional: Install boosting libraries for better RM scores
pip install xgboost lightgbm catboost

Run Demo

# CLI
python -m vera --pdf test_papers/HeartDiseaseProj.pdf --output_dir demo/heart_disease
open demo/heart_disease/article/COMPREHENSIVE_REPORT.html

# Web Interface
python -m vera web  # Terminal 1
cd frontend && npm install && npm run dev  # Terminal 2
# Open http://localhost:5173

Validation Results

| Metric | Value |
|--------|-------|
| Papers Tested | 4 |
| Models Extracted | 32 |
| Extraction Accuracy | 100% |
| Code Success Rate | 100% |
| Avg RM Score | 82% |
| Avg Execution Time | ~35s |

Demo Papers Included

Heart Disease (10 models, 82% RM)
Breast Cancer (12 models, 84% RM)
Cardiovascular Disease (9 models, 77% RM)
Diabetes SVM (1 model, 84% RM)

Known Limitations

UCI datasets only (no Kaggle, HuggingFace, custom data)
Tabular ML only (no PyTorch/TensorFlow deep learning)
Machine-readable PDFs only (no OCR)
Binary classification focus (limited multi-class)

Documentation

README.md - Quick start guide
BLOG.md - Vision and detailed walkthrough
TECHNICAL_BLOG.md - Technical deep-dive
demo/ - 4 pre-run demo outputs

What's Next

See ROADMAP.md for planned enhancements:

LLM extraction fallback
Deep learning support
OCR for scanned papers
Batch processing
Extended dataset coverage

Authors

Andre Chu and Warren Pettine
Medical Machine Intelligence Lab, Department of Psychiatry, University of Utah

Contact: andre.chu@utah.edu

License

Academic research tool. Not for clinical use.

Acknowledgments

This is a proof-of-concept demonstrating autonomous replication feasibility. It should complement, not replace, human review and expertise.
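
The summary figures under Validation Results can be cross-checked against the four demo papers listed in the release notes. The sketch below is not part of VERA. It only redoes the arithmetic on the numbers quoted above, and it assumes that "Avg RM Score" is an unweighted mean of the per-paper RM percentages, which the release notes do not state. The variable names are illustrative.

```python
# Per-paper figures copied from "Demo Papers Included":
# paper name -> (number of models, RM score in percent)
demo_papers = {
    "Heart Disease": (10, 82),
    "Breast Cancer": (12, 84),
    "Cardiovascular Disease": (9, 77),
    "Diabetes SVM": (1, 84),
}

# Total models across all papers (the release notes report 32).
total_models = sum(n_models for n_models, _ in demo_papers.values())

# Assumption: the reported average is an unweighted mean over papers.
mean_rm = sum(rm for _, rm in demo_papers.values()) / len(demo_papers)

# Alternative for comparison: weight each paper by its model count.
weighted_rm = sum(n * rm for n, rm in demo_papers.values()) / total_models

print(f"models: {total_models}")
print(f"unweighted mean RM: {mean_rm:.2f}%")
print(f"model-weighted mean RM: {weighted_rm:.2f}%")
```

Under that assumption, the model count and the unweighted mean both agree with the table once rounded to whole percentages, while the model-weighted mean comes out slightly lower. This suggests, but does not confirm, that the reported average is taken per paper rather than per model.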
@misc{chu_mmint-labvera_2026,
	title = {mmint-lab/{VERA}: {VERA}: {Autonomous} {Replication} and {Validation} of {Machine} {Learning} {Experiments} from {Research} {Papers}},
	shorttitle = {mmint-lab/{VERA}},
	url = {https://zenodo.org/records/18165553},
	doi = {10.5281/zenodo.18165553},
	abstract = {v1.0.0 - Initial Release



First public release of VERA (Verification Engine for Reproducible Analysis)


VERA is an autonomous agent that extracts, replicates, and validates machine learning experiments from research papers without manual intervention. Simply provide a PDF, and VERA handles extraction, execution, and comprehensive comparison reporting in {\textasciitilde}35 seconds.



Highlights



100\% extraction accuracy across 4 validated research papers (32 models total)

82\% average Results Match score with paper-reported benchmarks

Zero manual intervention required during execution

Full transparency with reviewable standalone Python scripts ({\textasciitilde}250 lines per paper)

Web dashboard + CLI interface included




Features

Core Capabilities



Autonomous PDF extraction with hybrid rule-based + semantic NLP

UCI dataset auto-resolution (20+ datasets)

14+ ML models (SVM, Random Forest, XGBoost, LightGBM, CatBoost, etc.)

Ensemble model support (e.g., "AdaBoost+LR")

Hyperparameter extraction (30+ patterns)

4-component scoring: EQ, CQ, RM, E2E

Interactive visualizations (heatmaps, delta charts, slopegraphs)


Interfaces



CLI: python -m vera --pdf paper.pdf --output\_dir results/

Web UI: React dashboard with FastAPI backend


Output Artifacts



COMPREHENSIVE\_REPORT.html - Interactive report

replication\_standalone.py - Reviewable code

results.json - Structured benchmarks

blueprint.md - Extraction details

figures/ - Publication-quality visualizations




Quick Start

Requirements



Python 3.11+

4GB RAM

Internet connection


Installation

\# Create environment
conda create -n vera python=3.12 -y
conda activate vera

\# Install dependencies
pip install -r requirements.txt

\# Optional: Install boosting libraries for better RM scores
pip install xgboost lightgbm catboost


Run Demo

\# CLI
python -m vera --pdf test\_papers/HeartDiseaseProj.pdf --output\_dir demo/heart\_disease
open demo/heart\_disease/article/COMPREHENSIVE\_REPORT.html

\# Web Interface
python -m vera web  \# Terminal 1
cd frontend \&\& npm install \&\& npm run dev  \# Terminal 2
\# Open http://localhost:5173




Validation Results

{\textbar} Metric {\textbar} Value {\textbar}
{\textbar}--------{\textbar}-------{\textbar}
{\textbar} Papers Tested {\textbar} 4 {\textbar}
{\textbar} Models Extracted {\textbar} 32 {\textbar}
{\textbar} Extraction Accuracy {\textbar} 100\% {\textbar}
{\textbar} Code Success Rate {\textbar} 100\% {\textbar}
{\textbar} Avg RM Score {\textbar} 82\% {\textbar}
{\textbar} Avg Execution Time {\textbar} {\textasciitilde}35s {\textbar}

Demo Papers Included



Heart Disease (10 models, 82\% RM)

Breast Cancer (12 models, 84\% RM)

Cardiovascular Disease (9 models, 77\% RM)

Diabetes SVM (1 model, 84\% RM)




Known Limitations



UCI datasets only (no Kaggle, HuggingFace, custom data)

Tabular ML only (no PyTorch/TensorFlow deep learning)

Machine-readable PDFs only (no OCR)

Binary classification focus (limited multi-class)




Documentation



README.md - Quick start guide

BLOG.md - Vision and detailed walkthrough

TECHNICAL\_BLOG.md - Technical deep-dive

demo/ - 4 pre-run demo outputs




What's Next

See ROADMAP.md for planned enhancements:



LLM extraction fallback

Deep learning support

OCR for scanned papers

Batch processing

Extended dataset coverage




Authors

Andre Chu and Warren Pettine
Medical Machine Intelligence Lab, Department of Psychiatry, University of Utah

Contact: andre.chu@utah.edu



License

Academic research tool. Not for clinical use.



Acknowledgments

This is a proof-of-concept demonstrating autonomous replication feasibility. It should complement, not replace, human review and expertise.},
	urldate = {2026-01-06},
	publisher = {Zenodo},
	author = {Chu, Andre},
	month = jan,
	year = {2026},
}
