Wikipendium

This is an old version of the compendium, written Nov. 30, 2015, 10:49 p.m. Changes made in this revision were made by aleksanb.

TDT1: Architecture of Computing Systems

**This compendium is the 2015 version of TDT1**

The following papers are curriculum:

*Part 1* Modern computing architectures (Lasse): with focus on multicores and energy efficient computing.

- 1.1 **Models and metrics to enable energy-efficiency optimizations**, *Suzanne Rivoire et al.*, Computer, December 2007 (ITSL = Made available to students through Its Learning) [Presentation](https://docs.google.com/presentation/d/1znpJALl4ZqGZ37b8GSsSShTpxqChnYAz_un7OAvgCp4/pub?start=false&loop=false&delayms=30000&slide=id.p)
- 1.2 **Case Studies of Multi-core Energy Efficiency in Task Based Programs**, *Hallgeir Lien et al.*, In proc. of ICT-GLOW ICT against Global Warming, Vienna Sept 2012 (ITSL)
- 1.3 **Feedback-Driven Threading: Power-Efficient and High-Performance Execution of Multithreaded Workloads on CMPs**, *Suleman et al.*, ASPLOS 2008 (ITSL)
- 1.4 **Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction**, *Kumar et al.*, MICRO 2003 (ITSL)
- 1.6 **Optimized Hardware for Suboptimal Software: The Case for SIMD-aware Benchmarks**, *J M Cebrian et al.*, ISPASS 2014 [Presentation](https://docs.google.com/presentation/d/1vrm6qdNB7DQN5OherIT79_AmfVOzjikhRocNYIlbHBw/edit?usp=sharing)

*Part 2* Unconventional computing architectures (Stefano, www.nichele.eu, nichele@idi.ntnu.no): Current and future research on new alternative computing architectures that go beyond the traditional Turing/von Neumann paradigm:

- 2.1 *Moshe Sipper* - **The Emergence of Cellular Computing** (<http://www.cs.bgu.ac.il/~sipper/papabs/cellcomp.pdf>)
- 2.2 *Melanie Mitchell* - **Life and Evolution in Computers** (<http://web.cecs.pdx.edu/~mm/life-and-evolution.pdf>)
- 2.3 *Moshe Sipper et al.* - **A Phylogenetic, Ontogenetic and Epigenetic View of Bio-Inspired Hardware Systems** (<http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=585894&tag=1> also available on ITSLEARNING)
- 2.4 *Julian Miller et al.* - **Evolution-in-Materio: Evolving Computation in Materials** (<http://www.cartesiangp.co.uk/papers/ei2014-miller.pdf>) [Presentation](https://docs.google.com/presentation/d/1ZBLbhKdwICzH6C1Y6RZVUUGOQk5aJE73TP67s5gEecI/edit?usp=sharing)
- 2.5a **D-Wave Quantum Computer Architecture** (<http://www.dwavesys.com/sites/default/files/D-Wave-brochure-Aug2015B.pdf>)
- 2.5b **Programming with D-Wave, Map Coloring Problem** ([http://www.dwavesys.com/sites/default/files/Map Coloring WP2.pdf](http://www.dwavesys.com/sites/default/files/Map%20Coloring%20WP2.pdf))
- 2.6 *Lila Kari* - **DNA Computing: Arrival of Biological Mathematics** (<http://link.springer.com/article/10.1007%2FBF03024425> also available on ITSLEARNING)

# 1.1 Models and metrics to enable energy-efficiency optimizations

[Slides](https://docs.google.com/presentation/d/1znpJALl4ZqGZ37b8GSsSShTpxqChnYAz_un7OAvgCp4/pub?start=false&loop=false&delayms=30000&slide=id.p)

## Intro

Power consumption has become a major concern in data centers, as the consumption doubled between 2000 and 2006 (and a lot has probably happened since).
Better energy efficiency can reduce costs directly, because computers use less power. In addition, more efficient computers generate less heat and thus require less cooling. The paper introduces a new benchmark for measuring energy efficiency, namely JouleSort.

## Energy-efficiency metrics

The ideal benchmark for energy efficiency would balance power and performance perfectly, and its rules should be impossible to circumvent.

### Energy-delay product

Comparing processors based on energy alone (the product of execution time and power) could motivate processor designers to focus on lowering clock frequency alone to achieve "energy efficiency". This would be very bad for performance. The energy-delay product, however, multiplies energy by execution time, so it weighs power against the square of execution time. It therefore rewards designs that are genuinely more efficient rather than designs that merely run at a lower clock frequency.

## JouleSort design goals

- Should evaluate the trade-off between power and performance.
    - No reward for only high performance or only low power.
    - The metric measured is energy (not energy-delay).
        - This is because another benchmark that emphasizes performance is not needed, so JouleSort can emphasize energy alone.
        - Also, the energy-delay product doesn't make as much sense on the system level as it did on the processor level.
- Should be balanced
- Should be inclusive
    - As many systems as possible should be measurable with the benchmark
    - The metric and the workload should be applicable to many different technologies.

The workload in JouleSort is a standard external sort (the same as in many other benchmarks). JouleSort ranks systems by the number of records they can sort per joule consumed.
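
To make the difference between the metrics above concrete, here is a minimal sketch of energy, energy-delay product and records per joule. The function names and all numbers are assumptions made up for illustration; they are not taken from the paper.

```python
def energy_joules(avg_power_watts: float, time_seconds: float) -> float:
    """Energy is average power multiplied by execution time."""
    return avg_power_watts * time_seconds


def energy_delay_product(avg_power_watts: float, time_seconds: float) -> float:
    """Energy times delay, i.e. power times the square of execution time."""
    return energy_joules(avg_power_watts, time_seconds) * time_seconds


def records_per_joule(records: int, avg_power_watts: float, time_seconds: float) -> float:
    """JouleSort-style score: records sorted per joule consumed."""
    return records / energy_joules(avg_power_watts, time_seconds)


# Hypothetical processor at full clock vs. a lower clock (made-up numbers).
fast = {"power": 100.0, "time": 10.0}
slow = {"power": 40.0, "time": 20.0}

for name, cfg in (("fast", fast), ("slow", slow)):
    e = energy_joules(cfg["power"], cfg["time"])
    edp = energy_delay_product(cfg["power"], cfg["time"])
    print(f"{name}: energy={e} J, energy-delay={edp} J*s")
```

With these made-up numbers the slower configuration uses less energy, but the faster one has the lower energy-delay product, which is the effect the energy-delay metric is meant to expose.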
![](https://s3-eu-west-1.amazonaws.com/wikipendium-public/14465654390xab059.jpg)

## Results of JouleSort

- Winners of PennySort do well
- Cost efficiency has improved more than energy efficiency over the last years
- Laptops and file servers have the best results

In trying to create a machine custom-tailored for the JouleSort benchmark, the most important decision was to balance performance between the CPU and I/O. A fast CPU is of no use if it idles most of the time waiting for I/O. Improving the performance of computers that are already energy efficient seems to be a better approach than trying to make high-performance computers more energy efficient.

The authors created a "CoolSort machine" which consisted of a high-end mobile CPU and 13 SATA laptop disks. This ratio kept both the CPU and the disks utilized at maximum capacity.

# 1.2 Case Studies of Multi-core Energy Efficiency in Task Based Programs

Recently, we have seen a convergence between embedded systems and High Performance Computing. Both these market segments now have energy efficiency as a major design goal.

- Vectorization provides a significant improvement in on-chip energy efficiency.

# 1.3 Feedback-Driven Threading: Power-Efficient and High-Performance Execution of Multithreaded Workloads on CMPs

> Framework to find optimal threading for a program.

- FDT: Feedback-driven threading
- SAT: Synchronization-aware threading
- BAT: Bandwidth-aware threading

- The FDT system is used to implement SAT and BAT. These reduce execution time and power significantly.
- Shared data in a multi-threaded application is kept synchronized by using critical sections.
    - With critical sections, we can use SAT to boost performance. A critical section forces all other threads that want to enter it to wait until the thread inside has finished.
    - Training works by having the compiler insert a marker before and after each critical section. Then the time spent in critical sections is known.
- When an application is highly parallelized, fetching new data will often be the bottleneck.
    - In these cases, we can use BAT to optimize for off-chip bandwidth.
    - Training works by putting markers before and after each access to the external data. Then the time spent off-chip is known.
- One can combine SAT and BAT.
- FDT works like this:
    - Have a short training phase where different settings are tried out to see what should be done.
    - After the small time it takes to train, execute the rest of the program with the number of threads chosen from the training information.
    - If we combine SAT and BAT, do this:
        - Compute the number of threads for BAT and for SAT. The minimum of those two minimizes the overall execution time.
- For applications limited by data synchronization, increasing the number of threads significantly increases execution time and power.
- For applications limited by off-chip bandwidth, increasing the number of threads increases on-chip power without providing any performance improvement.
- Having a system that controls the number of threads can reduce both execution time and power.

# 1.4 Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction

The paper investigates whether we can achieve power savings while keeping performance by using single-ISA heterogeneous multi-core architectures. The answer? Yes, we can!

![](https://s3-eu-west-1.amazonaws.com/wikipendium-public/14489181110xd649b.jpg)

## How is this achieved?

The paper describes four different architectures with differing degrees of complexity. Task-switching may be done when the current process is scheduled out of the CPU. Power usage is estimated using Wattch and adjusted for architectural differences using peak power and typical power. Benchmarking is done using SPEC2000.

## Results

The study shows that there is a non-constant ratio between the performance of the different cores. By using a switching oracle you get very good results.
In some cases, using two cores may be sufficient to produce significant gains.

## Core switching

Every 100th time interval, one or more cores are sampled. Different heuristics are:

- neighbor
- neighbor-global
- random
- all

Then, a core is selected. These heuristics achieve up to 93% of the energy-delay gains of the oracle-based switcher.

## Conclusion

A sample heterogeneous multi-core design with four complexity-graded cores has the potential to increase energy efficiency (defined as energy-delay product, in this case) by a factor of three in one experiment, without dramatic losses in performance. Using the single-ISA heterogeneous multi-core architecture yields far better results for energy efficiency than simply reducing the clock frequency. Some performance loss is expected, but the energy efficiency gains outweigh the losses in performance.
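
The core-switching step described above can be sketched roughly as follows. This is one possible reading of three of the heuristic names, not the paper's definition: the exact sampling rules, the ordering of cores by complexity, the function names and the `measure_edp` callback are all assumptions, and neighbor-global is left out.

```python
import random
from typing import Callable, List, Optional


def candidate_cores(current: int, n_cores: int, heuristic: str,
                    rng: random.Random) -> List[int]:
    """Return the cores to sample, assuming cores are indexed by complexity."""
    others = [c for c in range(n_cores) if c != current]
    if heuristic == "all":
        return others                      # sample every other core
    if heuristic == "random":
        return [rng.choice(others)]        # sample one other core at random
    if heuristic == "neighbor":
        # Assumed meaning: sample one core adjacent in the complexity ordering.
        neighbors = [c for c in (current - 1, current + 1) if 0 <= c < n_cores]
        return [rng.choice(neighbors)]
    raise ValueError(f"unknown heuristic: {heuristic}")


def pick_core(current: int, n_cores: int, heuristic: str,
              measure_edp: Callable[[int], float],
              rng: Optional[random.Random] = None) -> int:
    """Sample candidate cores and keep the one with the lowest energy-delay."""
    rng = rng or random.Random()
    sampled = [current] + candidate_cores(current, n_cores, heuristic, rng)
    return min(sampled, key=measure_edp)


# Hypothetical measured energy-delay per core for the current program phase.
edp = [5.0, 3.0, 4.0, 9.0]
best = pick_core(0, len(edp), "all", lambda c: edp[c])
```

In a real system `measure_edp` would run the program briefly on the sampled core, which is why sampling fewer cores can be attractive despite finding a worse core now and then.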
# 1.6 Optimized Hardware for Suboptimal Software: The Case for SIMD-aware Benchmarks

Well why don't you write this section?
# 2.4 Evolution-in-Materio: Evolving Computation in Materials

[Link to the slides](https://docs.google.com/presentation/d/1ZBLbhKdwICzH6C1Y6RZVUUGOQk5aJE73TP67s5gEecI/edit?usp=sharing)

## Basically what it is

Evolution in materio is about applying evolutionary algorithms to a physical material to make the material produce a desired result. There are many reasons why this is done. We are approaching the limits of silicon, where more transistors would mean our processors melt, so we might want to find some other basis for computation than silicon. Tests have also shown that this technique is a good way of supplementing human thinking, as the human mind is often bound by its existing knowledge.

## Working examples

There are multiple examples of this actually working. The evolved antenna flown on NASA's ST5 spacecraft might be the best-known example.

## Measuring and changing problems

When working with evolution in materio, we _must_ be able to tell whether we are significantly changing the material in question. Matter is changing all the time, and it can be difficult to see if it is changing because of us or regardless of us. To combat this we need ways of measuring changes in the material. If we are not careful, we might be measuring floating pins and still see the results we expect from measuring the actual material. If our material spits out random numbers, we have created a random number generator, _not_ a Turing machine with some extra computations. We need to be on the edge of chaos.

## Ways of evolving

There are three main ways of using evolutionary algorithms in the context of evolution in materio.

- **Purely in software** You evolve your design in software to be used in software.
- **Evolution in software produces results in real life** You evolve in software, and take the result to be used in real life. This is often a blueprint or a design that is used to create a real-life item.
- **Physical evolution in actual materio** You have a material whose physical properties you change, and you read data off it.

## Some notes on the field itself

This is a very new field and it will take more research before it matures. The results so far have to be taken with a grain of salt, as they are not easy, or in some cases even possible, to replicate.
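
As a rough illustration of the "purely in software" variant listed under Ways of evolving, here is a minimal evolutionary loop. Everything in it is an assumption chosen for illustration: the bit-string genome, the toy fitness function (count of ones), the parameter values and the function names. It is a sketch, not the method used in the paper.

```python
import random
from typing import List, Tuple

Genome = List[int]


def fitness(genome: Genome) -> int:
    """Toy fitness: number of ones in the genome (hypothetical objective)."""
    return sum(genome)


def mutate(genome: Genome, rate: float, rng: random.Random) -> Genome:
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]


def evolve(genome_len: int = 20, offspring: int = 4, generations: int = 50,
           rate: float = 0.05, seed: int = 0) -> Tuple[Genome, int, int]:
    """Keep one parent, create mutated children, keep the best if not worse."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(genome_len)]
    initial = fitness(parent)
    for _ in range(generations):
        children = [mutate(parent, rate, rng) for _ in range(offspring)]
        best_child = max(children, key=fitness)
        if fitness(best_child) >= fitness(parent):
            parent = best_child  # the parent is only replaced by an equal or better child
    return parent, initial, fitness(parent)


genome, start_fitness, end_fitness = evolve()
```

For physical evolution in materio, the call to `fitness` would presumably be replaced by applying the genome as a configuration to the material and measuring its response, which is where the measurement problems described above come in.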
  • Wikipendium cc-by-sa