Disclaimer: The opinions expressed in this post are my own, based on personal experiences.
Almost every computer architecture researcher realizes the importance of simulators to the community. Architectural-level simulators allow us to model and evaluate new proposals in a timely fashion, without having to go through the pain of fabricating and testing a chip.
So, what makes for a good simulator? I started grad school in the good old days when SimpleScalar was the norm. It had a detailed, cycle-accurate pipeline model and, most importantly, it was pretty fast. Once you got over the learning curve, additions to the model were fairly easy. It did have a few drawbacks: there was little support for simulating CMP architectures, there was no coherence protocol in place, and a detailed DRAM model was missing. Moreover, the operating system's interference with an application's behavior was not considered. But these issues were less important back then.
Soon, it was apparent that CMPs (chip multi-processors) were here to stay. The face of applications changed as well. Web 2.0-based, data-intensive applications came into existence. To support these, a number of server-side applications needed to be run in backend datacenters. This made the performance of multi-threaded applications, and that of main memory, of paramount importance.
To keep up with these requirements, the focus of the architecture community changed as well. Gone were the days of optimizing the pipeline and extracting instruction-level parallelism (ILP). Thread-level parallelism (TLP), memory-level parallelism (MLP), and memory system performance (caches, DRAM) became important. One could also no longer ignore interference from the operating system when making design decisions. The community was now on the lookout for a simulation platform that could take all of these factors into account.
It was around this time that a number of full-system simulators came into being. Off the top of my head, I can count several that are popular with the community and have a fairly large user base: Wind River's Simics, M5, Zesto, Simflex, and SESC, to name a few. For community-wide adoption, a simulator platform needed to be fast, have a modular code base, and have a good support system (being cycle-accurate was a given). Simics was one of the first platforms that I tried personally, and I found its support to be extremely responsive, which also helped it attract wide participation from the academic community. With the release of the GEMS framework from Wisconsin, I didn't need a reason to look any further.
In spite of all the options out there, getting the infrastructure (simulator, benchmark binaries, workloads, and checkpoints) in place is a time-consuming and arduous process. As a result, groups seem to have gravitated towards the simulators that best suit their needs in terms of features and ease of use. The large number of options today also implies that different proposals on the same topic inevitably use different simulation platforms. Consequently, it is often difficult to compare results across papers and to exactly reproduce the results of prior work. This was not as significant a problem before, when nearly everyone used SimpleScalar.
In some sub-areas, it is common practice to compare an innovation against other state-of-the-art innovations (cache policies, DRAM scheduling, etc.). Faithfully modeling the innovations of others can be especially troublesome for new grad students learning the ropes of the process. I believe a large part of this effort can be reduced if these models (and by model I mean code :-) were already publicly available as part of a common simulator framework.
The Archer project, as some of you might know, is a recent effort in the direction of collaborative research in computer architecture. According to the project's website, it strives for a noble goal:
"To thoroughly evaluate a new computer architecture idea, researchers and students need access to high-performance computers, simulation tools, benchmarks, and datasets - which are not often readily available to many in the community. Archer is a project funded by the National Science Foundation CRI program to address this need by providing researchers, educators and students with a cyberinfrastructure where users in the community benefit from sharing hardware, software, tools, data, documentation, and educational material."
In its current form, Archer provides a large pool of batch-scheduled computing resources, a set of commonly used tools and simulators, and some benchmarks and datasets. It also has support for sharing files via NFS and a wiki-based infrastructure to aggregate shared knowledge and experiences.
If widely adopted, this will provide a solution to many of the issues I listed above. It can help push the academic community towards a common infrastructure while at the same time reducing the effort needed to set up simulation infrastructure and to reproduce prior work.
Although Archer is a great initial step, I believe it can still be improved upon. If I had my wish, I would like a SourceForge-like platform, where the model for a particular optimization is owned by a group of people (say, the authors of the research paper) and can be checked out from a version-control system. Anyone whose research on the Archer platform results in a peer-reviewed publication would be obliged to release their model publicly under the GPL. Bug reports would be sent to the owners, who would in turn release revised versions of the model(s).
In recent years, collaborative research efforts in computer science have been very successful. Emulab is a resource widely used by the networking community. The theoretical computer science (TCS) community too has been involved in successful collaborative research, e.g., the polymath project and the recent collaborative review of Deolalikar's paper. I believe that there is certainly room for larger collaborative efforts within the computer architecture community.