ECE519(00)  Microprocessor Microarchitecture

▣ Lecture outline
  The traditional speedup curve of Amdahl's law no longer applies to computer system performance. All recent high-performance designs from Intel, IBM, and Sun rely on multi-core technology. This technological shift from ILP (instruction-level parallelism) to TLP (thread-level parallelism) will reshape the design of future microprocessors. In this course, we will cover both ILP and TLP techniques. Topics include adaptive dynamic branch prediction, high-bandwidth instruction fetch, dynamic scheduling, multiple issue, speculation, multithreading, symmetric multiprocessors, distributed shared memory multiprocessors, synchronization and consistency, and cache and memory hierarchy design.

 Professor : Lynn Choi (Engineering Bldg, #411, 3290-3249)

 Assistant : HanJun Bae (Engineering Bldg, #236, 3290-3896)

 Time(Place) : Monday(6-8) West Building, College of Life Sciences & Biotechnology, #103

 Textbook : "Computer Architecture: A Quantitative Approach", John L. Hennessy and David A. Patterson, Morgan Kaufmann, 5th Edition, 2012

 Reference book : A Collection of Research Papers

▣ Bulletin Board :

 Class notice

1. Lecture Note 1 was updated on March 2.

2. Lecture Note 1 was updated again on March 5.

3. Lecture Note 1 was updated again on March 6.

4. Lecture Note 2 was updated on March 12.

5. Lecture Note 3 was updated on March 12.

6. Reading List 1 was updated on March 13.

7. Lecture Note 2 was updated again on March 20.

8. Lecture Note 4 was updated on March 26.

9. Lecture Note 5 was updated on March 31.

10. Lecture Note 6 was updated on April 8.

11. Lecture Note 7 was updated on April 17.

12. Lecture Note 8 was updated on May 1.

13. Lecture Note 9 was updated on May 7.

14. Lecture Note 9 was updated again on May 14.

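※ Lecture time: changed to 16:00 (final).

     Lecture location: changed to East Building, College of Life Sciences & Biotechnology, #104 (final).

※ The class on May 22 has been moved to 15:30.

Selected Papers List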

     2016 ISCA
     1. Dynamo: A Data Center-Wide Power Management System (Facebook Inc., University of Michigan)

     2. Future Vector Microprocessor Extensions for Data Aggregations

     3. Back to the Future: Leveraging Belady's Algorithm for Improved Cache Replacement

     4. Efficient Synonym Filtering and Scalable Delayed Translation for Hybrid Virtual Caching

     5. APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs

     2016 MICRO
     1. Quantifying and Improving the Efficiency of Hardware-based Mobile Malware Detectors (UT Austin)

     2. SABRes: Atomic Object Reads for In-Memory Rack-Scale Computing

     3. pTask: A Smart Prefetching Scheme for OS Intensive Applications

     4. Towards Efficient Server Architecture for Virtualized Network Function Deployment:
         Implications and Implementations


     2016 HPCA
     1. Improving Smartphone User Experience by Balancing Performance and Energy with Probabilistic
         QoS Guarantee (Arizona State University)

     2. Energy-Efficient Address Translation; A Graph-Based Program Representation for Analyzing
         Hardware Specialization Approaches

     3. Predicting the Memory Bandwidth and Optimal Core Allocations for Multi-threaded
         Applications on Large-scale NUMA Machines

     2015 MICRO

     1. Doppelganger: A Cache for Approximate Computing

     2015 ISCA

     1. A Variable Warp Size Architecture

     2015 HPCA

     1. Mascar: Speeding up GPU Warps by Reducing Memory Pitstops

▣ Lecture slide

       1. Microarchitecture_-_0._Introduction_revised_2

       2. Microarchitecture_-_1._Branch_Prediction_revised

       3. Microarchitecture_-_2._Instruction_Fetch

       4. Microarchitecture_-_3._Dynamic_Pipeline

       5. Microarchitecture_-_4._Interrupt_and_Precise_Exception
       6. Microarchitecture_-_5._Memory_Hierarchy_Optimization

       7. Microarchitecture_-_6._Limits_of_ILP

       8. Microarchitecture_-_7._Thread_Level_Parallelism

       9. Microarchitecture_-_8._Data_Level_Parallelism_rev


▣ Reference


 Paper Presentation

 Reading List