c
🔄 Updated Apr 2026
🖥️ Self-hostable
Overview
chchenhui/mlrbench is an AI agent in the Evaluation Benchmarks category. — MLR-Bench: Evaluating AI agents on open-ended ML research. 201 tasks from NeurIPS/ICLR/ICML workshops.
Problem It Solves
This tool addresses challenges in the evaluation benchmarks domain.
Target Audience: Developers and teams working with evaluation benchmarks automation.
Inputs
- • User configuration
- • API credentials (if required)
- • Task parameters
Outputs
- • Automated task results
- • Status reports
- • Generated content or actions
Example Workflow
- 1 User configures the agent with required parameters
- 2 Agent receives input data or trigger
- 3 Agent processes the request using its core logic
- 4 Agent interacts with external services if needed
- 5 Results are returned to the user
Sample System Prompt
You are chchenhui/mlrbench, an AI assistant. Help the user accomplish their task efficiently.
Tools & Technologies
LLM APIs Python
Alternatives
- • AutoGPT
- • LangChain Agents
- • CrewAI
FAQs
- Is this agent open-source?
- Yes
- Can this agent be self-hosted?
- Yes
- What skill level is required?
- Intermediate
Rate This Agent
Your rating:
Reviews
Loading reviews...
Write a Review
Ready to try this agent?
chchenhui/mlrbench