Our long-term goal is to develop a common methodology and framework for the reproducible co-design of efficient software/hardware stacks for emerging algorithms requested by our advisory board (inference, object detection, training, etc.), optimized in terms of speed, accuracy, energy, size, complexity, cost and other metrics. Open ReQuEST competitions bring together AI, ML and systems researchers to share complete algorithm implementations (code and data) as portable, customizable and reusable Collective Knowledge workflows. This helps other researchers and end-users quickly validate such results, reuse the workflows, and optimize/autotune the algorithms across different platforms, models, data sets, libraries, compilers and tools. We will also use our practical experience of reproducing experimental results from ReQuEST submissions to help set up artifact evaluation at the upcoming SysML 2019, and to suggest new algorithms for inclusion in the MLPerf benchmark.
ReQuEST aims to provide a scalable tournament framework, a common experimental methodology and an open repository for continuous evaluation and optimization of the quality vs. efficiency Pareto optimality of a wide range of real-world applications, libraries and models across the whole hardware/software stack on complete platforms. In contrast with other (deep learning) benchmarking challenges, where experimental results are submitted in the form of JSON, CSV or XLS files, ReQuEST participants are asked to submit a complete workflow artifact in a unified and automated form (i.e. not just an ad-hoc Docker/VM image) which encompasses toolchains, frameworks, algorithms, libraries and the target hardware platform; any of these can be fine-tuned or customized at will by the participant to implement their optimization technique. Such open infrastructure helps bring together multidisciplinary researchers in systems, compilers, architecture and machine learning to develop and share their algorithms, tools and techniques as portable, customizable and "plug&play" components with a common API.
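As a rough illustration of what such a unified submission enables, the sketch below shows how an evaluator could programmatically pull and run a shared CK workflow from Python; the repository and program names are hypothetical placeholders, not an actual submission:

```python
import ck.kernel as ck

# Pull a shared submission repository (the name is a placeholder).
r = ck.access({'action': 'pull', 'module_uoa': 'repo',
               'data_uoa': 'ck-request-demo', 'out': 'con'})
if r['return'] > 0:
    raise Exception(r.get('error', ''))

# Run its benchmarking workflow; CK detects the local software,
# resolves dependencies and adapts the workflow to the platform.
r = ck.access({'action': 'run', 'module_uoa': 'program',
               'data_uoa': 'image-classification-demo', 'out': 'con'})
if r['return'] > 0:
    raise Exception(r.get('error', ''))
```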
We then arrange open ReQuEST competitions on Pareto-efficient co-design of the whole software/hardware stack to continuously optimize such algorithms in terms of speed, accuracy, energy, costs and other metrics across diverse inputs and platforms, from IoT devices to supercomputers. All benchmarking results and winning SW/HW/model configurations will be visualized on a public interactive dashboard and grouped into categories (e.g. embedded vs. server). The winning artifacts will be discoverable via the ACM Digital Library, so that the community can reproduce, reuse, improve and compare against them thanks to the common experimental framework.
We hope that our approach will help automate research and accelerate innovation!
See the ReQuEST introduction report and the CK presentation about our long-term vision.
For the first iteration of ReQuEST at ASPLOS'18, we focus on Deep Learning. Our first step is to provide coverage for the ImageNet image classification challenge. Restricting the competition to a single application domain will allow us to prepare an open-source tournament infrastructure and validate it across multiple hardware platforms, deep learning frameworks, libraries, models and inputs. For future incarnations of ReQuEST, we will provide broader application coverage, based on the interests of the research community and the direction set by our industrial board.
Though our main focus is on end-to-end applications, we also plan to allow future submissions of (micro)kernels such as matrix multiplication, convolutions and transfer functions to facilitate participation from the compilers and computer architecture communities.
In general, we want to encourage participants to target accessible, off-the-shelf hardware to allow our artifact evaluation committee to conveniently reproduce their results. Example systems include:
In the longer term, we also plan to provide support for simulator-based evaluations for architecture/micro-architecture research.
We want to unify every submission to enable fair evaluation, which is why we decided to use the open-source Collective Knowledge workflow framework (CK). CK helps the community share artifacts (models, data sets, libraries, tools) as reusable and customizable components with a common JSON API and meta description. CK also helps implement portable workflows which can adapt to a user's environment on Linux, Windows, MacOS and Android. ACM is currently evaluating CK for sharing reusable and portable artifacts via the ACM Digital Library.
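Every CK component is driven through the same JSON-in/JSON-out API. As a minimal sketch (the 'program' module is real, but the tag is illustrative), here is how one could discover shared workflows from Python:

```python
import ck.kernel as ck

# All CK actions accept and return plain Python dictionaries (JSON).
r = ck.access({'action': 'search',
               'module_uoa': 'program',
               'tags': 'image-classification'})
if r['return'] > 0:
    raise Exception(r.get('error', ''))

# Print the names of all matching shared workflows.
for entry in r['lst']:
    print(entry['data_uoa'])
```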
The non-profit cTuning foundation will help authors convert their artifacts and experimental scripts to the CK format during evaluation, while reusing AI artifacts already shared by the community in the CK format (see CK AI repositories, CK modules (wrappers), CK software detection plugins, portable CK packages). Authors can also try to convert their workflows to the CK format themselves, using the distinguished artifact from ACM CGO'17 as an example (see Artifact repository at GitHub, Artifact Appendix, CK notes, CK portable workflows), though the learning curve is still quite steep; we plan to prepare CK tutorials based on feedback from the participants.
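For instance, the software detection plugins and portable packages mentioned above are invoked through the same API; the sketch below assumes the 'compiler.gcc' soft entry and the package tags are available in the CK repositories installed on your machine:

```python
import ck.kernel as ck

# Detect a GCC compiler on the host and register it as a CK
# environment (assumes the 'compiler.gcc' soft entry is available).
r = ck.access({'action': 'detect', 'module_uoa': 'soft',
               'data_uoa': 'compiler.gcc', 'out': 'con'})
if r['return'] > 0:
    raise Exception(r.get('error', ''))

# Install a portable package by tags (tags are illustrative);
# CK downloads or builds it and resolves its dependencies.
r = ck.access({'action': 'install', 'module_uoa': 'package',
               'tags': 'lib,tensorflow', 'out': 'con'})
if r['return'] > 0:
    raise Exception(r.get('error', ''))
```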
ReQuEST will not determine a single winner, as collapsing all of the metrics into one score across all platforms would result in over-engineered solutions. Instead, each ReQuEST tournament will expose a set of quality, performance and efficiency metrics to optimize against.
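To make the multi-objective selection concrete, below is a minimal, self-contained sketch of extracting a Pareto frontier from submitted results; the metric names and numbers are made up for illustration, and this is not the official ReQuEST scoring code:

```python
# Illustrative Pareto-frontier filter over (accuracy, latency, energy).

def dominates(a, b):
    """True if result a is at least as good as b on every metric and
    strictly better on at least one (higher accuracy, lower latency
    and lower energy are better)."""
    at_least_as_good = (a['accuracy'] >= b['accuracy'] and
                        a['latency'] <= b['latency'] and
                        a['energy'] <= b['energy'])
    strictly_better = (a['accuracy'] > b['accuracy'] or
                       a['latency'] < b['latency'] or
                       a['energy'] < b['energy'])
    return at_least_as_good and strictly_better

def pareto_frontier(results):
    # Keep every result that no other result dominates.
    return [r for r in results
            if not any(dominates(other, r) for other in results)]

results = [
    {'name': 'A', 'accuracy': 0.76, 'latency': 120.0, 'energy': 3.1},
    {'name': 'B', 'accuracy': 0.71, 'latency':  45.0, 'energy': 1.2},
    {'name': 'C', 'accuracy': 0.70, 'latency':  80.0, 'energy': 2.5},  # dominated by B
]
print([r['name'] for r in pareto_frontier(results)])  # -> ['A', 'B']
```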
Solutions do not have to be on the Pareto frontier to be accepted to the workshops and the open ReQuEST repository: a submission can also be praised for its originality, reproducibility, adaptability, scalability, portability, ease of use, and so on.
However, reproducible submissions on the Pareto frontier will have the option to be published in the ACM Digital Library with the ACM "available", "reusable" and "replicated" badges. This will make them discoverable via the ACM DL search engine; you can try this feature yourself (available since 2018) by selecting "Artifact Badge" as the field and then choosing any badge you wish in the ACM DL advanced search!
Members of the ReQuEST advisory/industrial board will review and comment on the results of our tournaments and workshops, collaborate on a common methodology for reproducible evaluation and optimization, suggest realistic workloads for future tournaments, arrange access to rare hardware for the Artifact Evaluation Committee, and provide prizes for the most efficient solutions.