LEGOEval: An Open-Source Toolkit for Dialogue System Evaluation via Crowdsourcing

by Yu Li et al.

We present LEGOEval, an open-source toolkit that lets researchers evaluate dialogue systems in a few lines of code on the online crowdsourcing platform Amazon Mechanical Turk. Compared to existing toolkits, LEGOEval offers flexible task design through a Python API that maps to commonly used React.js interface components. Researchers can easily personalize their evaluation procedures with our built-in pages, as if playing with LEGO blocks; LEGOEval thus provides a fast, consistent way to reproduce human evaluation results. Beyond flexible task design, LEGOEval also offers a simple API for reviewing collected data.
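To make the "LEGO block" idea concrete, here is a minimal, self-contained sketch of a block-based page builder in Python. All class and method names below are hypothetical illustrations of the composition pattern the abstract describes; they are not LEGOEval's actual API.

```python
# Hypothetical sketch of composing an evaluation page from reusable
# interface blocks. None of these names come from LEGOEval itself.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """A single interface block (conceptually, one React.js component)."""
    kind: str                               # e.g. "text", "radio", "chat_window"
    props: dict = field(default_factory=dict)

    def to_config(self) -> dict:
        # Serialize the block so a front end could render it.
        return {"kind": self.kind, "props": self.props}


@dataclass
class Page:
    """An ordered stack of components, assembled like LEGO blocks."""
    components: List[Component] = field(default_factory=list)

    def add(self, component: Component) -> "Page":
        self.components.append(component)
        return self  # allow chaining

    def to_config(self) -> list:
        return [c.to_config() for c in self.components]


# Compose a simple dialogue-rating page from built-in-style blocks.
page = (
    Page()
    .add(Component("text", {"content": "Read the dialogue below."}))
    .add(Component("chat_window", {"dialogue_id": "dlg-001"}))
    .add(Component("radio", {"question": "How coherent was the system?",
                             "options": ["1", "2", "3", "4", "5"]}))
)

print(page.to_config())
```

The design choice being illustrated: each block serializes to a small config dict, so a Python-side task definition can be handed to a JavaScript front end for rendering, which is how a Python-to-React.js mapping is typically realized.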


ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents

We present ADVISER - an open-source, multi-domain dialog system toolkit ...

SacreROUGE: An Open-Source Library for Using and Developing Summarization Evaluation Metrics

We present SacreROUGE, an open-source library for using and developing s...

Transfer Learning Toolkit: Primers and Benchmarks

The transfer learning toolkit wraps the codes of 17 transfer learning mo...

Quantus: An Explainable AI Toolkit for Responsible Evaluation of Neural Network Explanations

The evaluation of explanation methods is a research topic that has not y...

Open Source Vizier: Distributed Infrastructure and API for Reliable and Flexible Blackbox Optimization

Vizier is the de-facto blackbox and hyperparameter optimization service ...

Building Inspection Toolkit: Unified Evaluation and Strong Baselines for Damage Recognition

In recent years, several companies and researchers have started to tackl...

PyRep: Bringing V-REP to Deep Robot Learning

PyRep is a toolkit for robot learning research, built on top of the virt...

Code Repositories


A toolkit for dialogue system evaluation via crowdsourcing
