Computer Sciences
Computer Architecture Operating System Interface (CAOS)
Related links

HiPEAC Network of Excellence

Merasa

CAOS

Proartis
 

Computer Architecture Operating System Interface


OVERVIEW

Multi-core and/or multi-threaded architectures now dominate the market, from embedded systems to supercomputers. However, achieving high performance with these modern systems has become a complex task: as the number of cores per chip and/or the number of hardware threads per core continues to increase, new challenges arise in terms of scheduling, power, temperature, scalability, analyzability, design complexity, efficiency, throughput, and heterogeneity. Performance is no longer the only important metric: security, power, total throughput, and Quality of Service are becoming increasingly important. It seems clear that neither the hardware nor the software alone can achieve the desired performance and, at the same time, comply with these constraints. The answer to these new challenges comes from hardware-software co-design. Computer Architectures (CA) and Operating Systems (OS) should interact through a well-defined interface, exchanging run-time information, monitoring application progress and needs, and enforcing resource management.

OBJECTIVES

The Computer Architecture/Operating Systems group focuses mainly on real-time and high-performance computing. Our objectives are:

  • Proposing complexity-effective, low-power processor architectures, with special emphasis on multicore/multithreaded architectures and, in particular, on on-chip resources.
  • Developing tools for evaluating different alternatives at the hardware and software levels. Powerful and trustworthy simulation tools allow us to explore the design space and, hence, to make fair comparisons between different hardware/software designs.
  • Developing methodologies to fairly evaluate different processor designs. Understanding the bottlenecks of current processors is key to making proposals that are of interest to industry.
  • Designing and implementing power- and temperature-aware OS solutions for real-time and high-performance systems (scheduling, load balancing).
  • Improving the interaction between hardware (processor) and software (operating system and run time environment).
  • Deploying time-analyzable and low-power processor designs for the real-time arena.



PROJECTS/AREAS

  • Architectures for hard real-time systems: The increasing demand for functionality in current and future real-time embedded systems is driving an increase in processor performance. More powerful processors offer the opportunity to schedule a larger number of applications, potentially co-hosting several safety and non-safety applications on a common powerful platform, providing a better performance/Watt ratio than a single-core solution with similar performance. At the same time, however, developing safety-related real-time embedded systems requires proving that the timing requirements are met. We investigate new processor architectures that allow executing hard, soft, and non-real-time tasks simultaneously, providing timing predictability for hard real-time tasks and high performance for non-real-time tasks, under the stringent low-power constraints of embedded systems.
  • Probabilistic real-time systems: Aggressive hardware acceleration features such as caches, deep memory hierarchies, and multicore processors need to be employed to respond to the increasing demand for performance and computation power, and to reduce the number and cost of processing units in Critical Real-Time Embedded (CRTE) systems. Even though most CRTE systems are deployed on comparatively simple and old processor technologies whose temporal behaviour is relatively easy to understand, static analysis and extensive testing efforts (which account for a large proportion of total production time and cost) yield far from perfect results. There have been significant advances in this domain, both in static analysis methods and in hybrid measurement-based methods and testing. However, they cannot keep pace with current hardware trends. As long as current analysis techniques and testing processes are unable to scale up to the challenge, increased hardware complexity will lead to a significant degradation of the quality of the resulting products. Our strategy is to introduce Architectural Design Principles that, by construction, result in temporal behaviour for which statistical independence (or a clearly defined notion of independence) can be assumed, which in turn enables probabilistic analysis. This is done by moving away from deterministic behaviours towards more randomized behaviours.
  • Operating Systems and architectures for High-Performance Computing: Classic operating systems are designed according to. They have been implemented as an independent layer between hardware and applications. User programs communicate with the OS through a set of well-defined system calls, while the OS, in turn, communicates with the underlying architecture using control registers. Except for these interfaces, the three layers are practically independent and oblivious of each other. While this approach worked well in the past, the arrival of multicore/multithreaded architectures poses new challenges in terms of performance, power consumption, and system utilization. In this new scenario, the classic approach may not deliver optimal performance. High-performance systems are especially sensitive to these problems: to obtain optimal performance, the hardware, the operating system, and the applications can no longer remain isolated, and should instead communicate and cooperate to achieve high performance with minimal power consumption. The CAOS group addresses some of the problems of modern HPC systems, such as power- and temperature-aware scheduling, dynamic load balancing, resource utilization, and Quality of Service.
  • Low-Power and Complexity-Effective architectures: The evolution of embedded processors is leading to an increasing number of features being integrated into a single chip at lower power. This challenge requires a set of solutions to reduce the power consumption of those systems and to integrate heterogeneous systems into the same chip to reduce area, power, and delay. Because current techniques for saving power in multicore processors for real-time applications miss significant details, and because different tasks with diverse requirements must run on the same chip, the CAOS group addresses these issues from new perspectives. The group aims to propose new approaches to save power in multicore processors by smartly controlling the resources of the chip, as well as to devise new hybrid processor designs capable of running some tasks at high performance in an energy-efficient manner and others reliably at ultra-low power on the same hardware.
Previous projects

  • SOW on POWER5: In this project, IBM and BSC pursued a research collaboration to enable BSC to analyze, understand, and evaluate the behavior of SMT/CMP processor architectures, including but not limited to IBM's POWER5 processor. In particular, we (1) analyzed the interaction between the operating system and the IBM POWER5 processor; (2) studied the effect of the IBM POWER5 hardware prioritization on performance; (3) characterized the SMT/CMP behavior of workloads frequently executed at BSC; and (4) explored the design space of current and future SMT/CMP architectures.
  • Real-time CMT architectures: In this project, BSC and Sun Microsystems Inc. collaborated in the area of Chip Multithreading (CMT) systems, using boards based on the UltraSPARC T1 and T2 processors. In particular, the project focused on (1) task scheduling of low-layer network applications, such as IP forwarding, and (2) analyzing the virtualization capabilities of the UltraSPARC T1 and T2 processors.



PEOPLE
ABELLA, JAUME, SENIOR RESEARCHER
AHMED, MUHAMMAD ISMAIL, RESIDENT STUDENT
CAKAREVIC, VLADIMIR, RESIDENT STUDENT
CAZORLA ALMEIDA, FRANCISCO JAVIER, OPERATING SYSTEM GROUP MANAGER
GIOIOSA, ROBERTO, RESEARCHER
JIMENEZ, VICTOR JAVIER, RESIDENT STUDENT
KEDZIERSKI, KAMIL, RESIDENT STUDENT
KOSMIDIS, LEONIDAS, RESIDENT STUDENT
MARIC, BOJAN, RESIDENT STUDENT
MORARI, ALESSANDRO, RESIDENT STUDENT
MORETO, MIQUEL, ASSOCIATE RESIDENT STUDENT
PAOLIERI, MARCO, RESIDENT STUDENT
QUINONES MORENO, EDUARDO, RESEARCHER
RADOJKOVIC, PETAR, RESIDENT STUDENT
RUIZ LUQUE, JOSE CARLOS, RESIDENT STUDENT
SABBINENI, VIVEK, RESIDENT STUDENT


PUBLICATIONS AND COMMUNICATIONS


Publications

Journals

Marco Paolieri, Eduardo Quinones, Francisco J. Cazorla and Mateo Valero. An Analyzable Memory Controller for Hard Real-Time CMPs. In IEEE Embedded Systems Letters (ESL), January 2010.

Miquel Moreto, Francisco J. Cazorla, Alex Ramirez, Rizos Sakellariou and Mateo Valero. FlexDCP: a QoS framework for CMP architectures. In ACM SIGOPS Operating System Review, Special Issue on the Interaction among the OS, Compilers, and Multicore Processors, April 2009.

Carlos Luque, Miquel Moreto, Francisco J. Cazorla, Roberto Gioiosa, Alper Buyuktosunoglu, and Mateo Valero. CPU accounting in CMP Processors. In IEEE Computer Architecture Letters. Volume 9, February 2009.

Kyle J. Nesbit, Miquel Moreto, Francisco J. Cazorla, Alex Ramirez, Mateo Valero, and James E. Smith. Multicore Resource Management. In IEEE Micro, June 2008.

Miquel Moreto, Francisco J. Cazorla, Alex Ramirez and Mateo Valero. Dynamic Cache Partitioning Based on the MLP on Cache Misses. In Transactions on HiPEAC. Volume 3, Issue 1, May 2008.

Miquel Moreto, Francisco J. Cazorla, Alex Ramirez, and Mateo Valero. Explaining Dynamic Cache Partitioning Speed Ups. In IEEE Computer Architecture Letters. Volume 6, Issue 1, March 2007.

Francisco J. Cazorla, Peter M.W. Knijnenburg, Rizos Sakellariou, Enrique Fernandez, Alex Ramirez and Mateo Valero. Predictable Performance in SMT processors: Synergy Between the OS and SMTs. In IEEE Transactions on Computers. Volume 55, Issue 7, July 2006.



International Conferences

Kamil Kedzierski, Miquel Moreto, Francisco J. Cazorla and Mateo Valero. Adapting Cache Partitioning Algorithms to Real pseudo-LRU Replacement Policies. In 24th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Atlanta, Georgia, April 2010.

Petar Radojkovic, Vladimir Cakarevic, Javier Verdu, Alex Pajuelo, Francisco J. Cazorla, Mario Nemirovsky and Mateo Valero. Thread to Strand Binding of Parallel Network Applications in Massive Multi-Threaded Systems. In 15th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, Bangalore, India, January 2010.

Vladimir Cakarevic, Petar Radojkovic, Javier Verdu, Alex Pajuelo, Francisco J. Cazorla, Mario Nemirovsky and Mateo Valero. Characterizing the resource-sharing levels in the UltraSPARC T2 Processor. In 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), New York, USA, December 2009.

Carlos Boneti, Francisco J. Cazorla, Roberto Gioiosa, Chen-Yong Cher, Alper Buyuktosunoglu, Pradip Bose and Mateo Valero. A Dynamic Scheduler for Balancing HPC Applications. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC). Austin, USA, November 2009.

Carmelo Acosta, Francisco J. Cazorla, Alex Ramirez, and Mateo Valero. Thread to Core Assignment in SMT On-Chip Multiprocessors. In 21st Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Sao Paulo, Brazil, October 2009.

Marco Paolieri, Eduardo Quinones, Francisco J. Cazorla and Mateo Valero. Efficient Execution of Mixed Application Workloads in a Hard Real-Time. In Workshop on Reconciling Performance with Predictability (RePP), Oct. 15, 2009, during the ESWEEK, Grenoble, France, October 2009.

Carmelo Acosta, Francisco J. Cazorla, Alex Ramirez, and Mateo Valero. MFLUSH: Handling Long-latency loads in SMT On-Chip Multiprocessors. In International Conference on Parallel Processing. Portland, Oregon, USA, September 2009.

Carlos Luque, Miquel Moreto, Francisco J. Cazorla, Roberto Gioiosa, Alper Buyuktosunoglu and Mateo Valero. ITCA: Inter-Task Conflict-Aware CPU Accounting for CMPs. In International Symposium on Parallel Architectures and Compilation Techniques, North Carolina, USA, September 2009.

Eduardo Quinones, Emery Berger, Guillem Bernat and Francisco J. Cazorla. Using Randomized Caches in Probabilistic Real-Time Systems. In 21st Euromicro Conference on Real-Time Systems (ECRTS 09), Dublin, Ireland, July 2009.

Marco Paolieri, Eduardo Quinones, Francisco J. Cazorla, Guillem Bernat and Mateo Valero. Hardware Support for WCET Analysis of Multicore Systems. In International Symposium on Computer Architecture, Austin, USA, June 2009.

Petar Radojkovic, Vladimir Cakarevic, Javier Verdu, Alex Pajuelo, Roberto Gioiosa, Francisco J. Cazorla, Mario Nemirovsky and Mateo Valero. Measuring Operating System Overhead on CMT Processors. In 20th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Campo Grande, Brazil, October 2008.

Jesus Alastruey, Francisco J. Cazorla, Teresa Monreal, Victor Vinals and Mateo Valero. Selection of the Register File Size and the Resource Allocation Policy on SMT Processors. In 20th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Campo Grande, Brazil, October 2008.

Miquel Pericas, Ruben Gonzalez, Francisco J. Cazorla, Adrian Cristal, Alex Veidenbaum, Daniel A. Jimenez and Mateo Valero. A two-level Load/Store Queue based on Execution Locality. In International Symposium on Computer Architecture. Beijing, China, June 2008.

Carlos Boneti, Francisco J. Cazorla, Roberto Gioiosa, Chen-Yong Cher, Alper Buyuktosunoglu and Mateo Valero. Software-Controlled Priority Characterization of POWER5 Processor. In International Symposium on Computer Architecture. Beijing, China, June 2008.

P. A. Castillo, J. J. Merelo, M. Moreto, F. J. Cazorla, M. Valero, A. M. Mora, J. L. J. Laredo, and S.A. McKee. Evolutionary system for prediction and optimization of hardware architecture performance. In IEEE Congress on Evolutionary Computation (CEC). Hong Kong, June 2008.

Carlos Boneti, Francisco J. Cazorla, Roberto Gioiosa, Julita Corbalan, Jesus Labarta and Mateo Valero. Balancing HPC Applications Through Smart Allocation of Resources in MT Processors. In International Parallel & Distributed Processing Symposium (IPDPS). Miami, Florida, USA, April 2008.

Miquel Moreto, Francisco J. Cazorla, Alex Ramirez and Mateo Valero. MLP-aware dynamic cache partitioning. In International Conference on High Performance Embedded Architectures & Compilers. Goteborg, Sweden, January 2008.

Javier Vera, Francisco J. Cazorla, Alex Pajuelo, Oliverio J. Santana, Enrique Fernandez and Mateo Valero. FAME: FAirly MEasuring Multithreaded Architectures. In Parallel Architectures and Compilation Techniques (PACT). Brasov, Romania, September 2007.

Miquel Pericas, Ruben Gonzalez, Adrian Cristal, Francisco J. Cazorla, Daniel A. Jimenez and Mateo Valero. A Flexible Heterogeneous Multi-Core Architecture. In Parallel Architectures and Compilation Techniques (PACT). Brasov, Romania, September 2007.

Miquel Moreto, Francisco J. Cazorla, Alex Ramirez, and Mateo Valero. Online Prediction of Applications Cache Utility. In IEEE International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (IC-SAMOS), Samos, Greece, July 2007.

Francisco J. Cazorla, Peter M.W. Knijnenburg, Rizos Sakellariou, Enrique Fernandez, Alex Ramirez and Mateo Valero. On the Problem of Minimizing Workload Execution Time in SMT Processors. In IEEE International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (IC-SAMOS), Samos, Greece, July 2007.

Francisco J. Cazorla, Peter M.W. Knijnenburg, Rizos Sakellariou, Enrique Fernandez, Alex Ramirez and Mateo Valero. Architectural Support for Real-Time Task Scheduling in SMT Processors. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES-2005), San Francisco, USA, September 2005.

Adrian Cristal, Oliverio J. Santana, Francisco J. Cazorla, Marco Galluzi, Tanausu Ramirez, Miquel Pericas, and Mateo Valero. Kilo-instruction Processors: Overcoming the memory wall. In IEEE Micro, Volume 25, Issue 3, June 2005.

Francisco J. Cazorla, Enrique Fernandez, Alex Ramirez and Mateo Valero. Dynamically Controlled Resource Allocation in SMT Processors. In the 37th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Portland, December 2004.

Francisco J. Cazorla, Peter M.W. Knijnenburg, Rizos Sakellariou, Enrique Fernandez, Alex Ramirez and Mateo Valero. Implicit vs. Explicit Resource Allocation in SMT Processors. In EUROMICRO Symposium on Digital System Design. Invited Paper. Rennes, France, September 2004.



Workshops

Vladimir Cakarevic, Petar Radojkovic, Javier Verdu, Alejandro Pajuelo, Roberto Gioiosa, Francisco J. Cazorla, Mario Nemirovsky and Mateo Valero. Understanding the overhead of the spin-lock loop in CMT architectures. In Workshop on the Interaction between Operating Systems and Computer Architecture (WIOSCA). Beijing, China, June 2008.

P. A. Castillo, A. M. Mora, J. J. Merelo, J. L. J. Laredo, M. Moreto, F. J. Cazorla, M. Valero, and S.A. McKee. Architecture performance prediction using evolutionary artificial neural networks. In European Workshop on Hardware Optimization Techniques (EVOHot). Napoli, Italy, March 2008.

Carmelo Acosta, Francisco J. Cazorla, Alex Ramirez, and Mateo Valero. Core to Memory Interconnection Implications for Forthcoming On-Chip Multiprocessors. In Workshop on Chip Multiprocessor Memory Systems and Interconnects (in conjunction with the 13th Annual International Conference on High-Performance Architecture), Phoenix, USA, February 2007.

Javier Vera, Francisco J. Cazorla, Alex Pajuelo, Oliverio J. Santana, Enrique Fernandez, and Mateo Valero. Measuring the Performance of Multithreaded Processors. In SPEC Benchmark Workshop (in conjunction with the Annual Meeting of the Standard Performance Evaluation Corporation (SPEC)), Austin, USA, January 2007. Schaeffer Award for the technical quality of the paper.

Javier Vera, Francisco J. Cazorla, Alex Pajuelo, Oliverio J. Santana, Enrique Fernandez, and Mateo Valero. A Novel Evaluation Methodology to Obtain Fair Measurements in Multithreaded Architectures. In Workshop on Modeling, Benchmarking and Simulation (MoBS) 2006, held in conjunction with ISCA, Boston, USA, June 2006.



Communications

Carlos Boneti, Francisco J. Cazorla, Roberto Gioiosa and Mateo Valero. Scheduling Real-Time Systems With Explicit Resource Allocation Processors. In International Conference on Architecture of Computing Systems (ARCS). Dresden, Germany, February 2008.

Kamil Kedzierski, Miquel Moreto, Francisco J. Cazorla and Mateo Valero. pseudo-LRU based Cache Partitioning Algorithms. In International Symposium on Parallel Architectures and Compilation Techniques, North Carolina, USA, September 2009.

Miquel Moreto, Francisco J. Cazorla, Alex Ramirez, and Mateo Valero. MLP-aware dynamic cache partitioning. In Parallel Architectures and Compilation Techniques (PACT). Brasov, Romania, September 2007.



 
Barcelona Supercomputing Center, 2010