FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud

Authors: Sagar Karandikar, Howard Mao, Donggyu Kim, David Biancolin, Alon Amid, Dayeol Lee, Nathan Pemberton, Emmanuel Amaro, Colin Schmidt, Aditya Chopra, Qijing Huang, Kyle Kovacs, Borivoje Nikolic, Randy Katz, Jonathan Bachrach, Krste Asanovic
Where published: 45th ACM/IEEE International Symposium on Computer Architecture (ISCA 2018), June 2018
Artifact development version: GitHub
Abstract URL: https://dl.acm.org/citation.cfm?id=3276543


We present FireSim, an open-source simulation platform that enables cycle-exact microarchitectural simulation of large scale-out clusters by combining FPGA-accelerated simulation of silicon-proven RTL designs with a scalable, distributed network simulation. Unlike prior FPGA-accelerated simulation tools, FireSim runs on Amazon EC2 F1, a public cloud FPGA platform, which greatly improves usability, provides elasticity, and lowers the cost of large-scale FPGA-based experiments. We describe the design and implementation of FireSim and show how it can provide sufficient performance to run modern applications at scale, to enable true hardware-software co-design. As an example, we demonstrate automatically generating and deploying a target cluster of 1,024 3.2 GHz quad-core server nodes, each with 16 GB of DRAM, interconnected by a 200 Gbit/s network with 2 microsecond latency, which simulates at a 3.4 MHz processor clock rate (less than 1,000x slowdown over real-time). In aggregate, this FireSim instantiation simulates 4,096 cores and 16 TB of memory, runs ~14 billion instructions per second, and harnesses 12.8 million dollars' worth of FPGAs, at a total cost of only ~$100 per simulation hour to the user. We present several examples to show how FireSim can be used to explore various research directions in warehouse-scale machine design, including modeling networks with high bandwidth and low latency, integrating arbitrary RTL designs for a variety of commodity and specialized datacenter nodes, and modeling a variety of datacenter organizations, as well as reusing the scale-out FireSim infrastructure to enable fast, massively parallel cycle-exact single-node microarchitectural experimentation.
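
The aggregate figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is not part of FireSim; the function name `cluster_summary` and its parameters are hypothetical and exist only for this illustration. It takes the per-node numbers stated in the abstract (1,024 nodes, 4 cores and 16 GB of DRAM per node, a 3.2 GHz target clock, a 3.4 MHz simulation rate) and derives the totals. Relating aggregate core cycles to the quoted instruction rate assumes roughly one instruction per simulated cycle per core, which the abstract does not state.

```python
# Back-of-the-envelope check of the cluster figures quoted in the abstract.
# All names here are hypothetical; this is not FireSim code.

NODES = 1024              # server nodes in the simulated cluster
CORES_PER_NODE = 4        # quad-core nodes
DRAM_PER_NODE_GB = 16     # DRAM per node, in GB
TARGET_CLOCK_HZ = 3.2e9   # modeled processor clock (3.2 GHz)
SIM_CLOCK_HZ = 3.4e6      # achieved simulation rate (3.4 MHz)


def cluster_summary(nodes=NODES,
                    cores_per_node=CORES_PER_NODE,
                    dram_per_node_gb=DRAM_PER_NODE_GB,
                    target_clock_hz=TARGET_CLOCK_HZ,
                    sim_clock_hz=SIM_CLOCK_HZ):
    """Derive aggregate figures from the per-node numbers."""
    total_cores = nodes * cores_per_node
    # Convert GB to TB using 1 TB = 1024 GB.
    total_dram_tb = nodes * dram_per_node_gb / 1024
    # How many times slower the simulation runs than the modeled hardware.
    slowdown = target_clock_hz / sim_clock_hz
    # Simulated core cycles completed per wall-clock second across the cluster.
    aggregate_core_cycles_per_s = total_cores * sim_clock_hz
    return {
        "total_cores": total_cores,
        "total_dram_tb": total_dram_tb,
        "slowdown": slowdown,
        "aggregate_core_cycles_per_s": aggregate_core_cycles_per_s,
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value:,.1f}")
```

Under these inputs the sketch gives 4,096 cores and 16 TB of memory, a slowdown of about 941x (consistent with "less than 1,000x"), and about 13.9 billion simulated core cycles per second, which lines up with the quoted ~14 billion instructions per second if each core retires about one instruction per cycle.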
