Architecture, Compilers and Embedded Systems (ACES) Laboratory

Publications @ ACES



This page links to the publications by the faculty @ ACES.

To view the publications of each faculty member in the ACES laboratory, follow the links below:

Sudhir Aggarwal
Ted Baker
Dr. Robert van Engelen
Dr. Gary Tyson
Dr. David Whalley
Dr. Xin Yuan

To view publications associated with specific projects, click here.

Below is a selection of publications sorted by year.

2005


  1. "The GSI plug-in for gSOAP: Enhanced Security, Performance, and Reliability", Giovanni Aloisio, Massimo Cafaro, Italo Epicoco, Daniele Lezzi, and Robert van Engelen, to appear in the proceedings of the IEEE International Conference on Information Technology Coding and Computing, April 2005.

2004


  1. "A Grid Workflow-Based Monte Carlo Simulation Environment", Yaohang Li, Michael Mascagni, Robert van Engelen, and Qin Cai, in the Journal of Neural, Parallel, and Scientific Computations, 2004, pages 439-454.

  2. "Parametric Timing Estimation With the Newton-Gregory Formulae", Robert A. van Engelen, Kyle Gallivan, and Burt Walsh, accepted for publication in the Journal of Concurrency and Computation: Practice and Experience, 2004.

  3. "Value Range Analysis of Conditionally Updated Variables and Pointers", Johnnie Birch, Robert van Engelen, and Kyle Gallivan, in the proceedings of Compilers for Parallel Computing (CPC), July 2004, pages 265-276.

  4. "Toward Characterizing the Performance of SOAP Toolkits", M. Govindaraju, A. Slominski, K. Chiu, P. Liu, R. van Engelen, and M. Lewis, in the proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, pages 365-372, Pittsburgh, USA, 2004.

  5. "Compiler Transformations for Effectively Exploiting a Zero Overhead Loop Buffer" by G. Uh, Y. Wang, D. Whalley, S. Jinturkar, V. Cao, C. Burns, Y. Paek, in Software Practice & Experience, accepted July 2004.

  6. "A Unified Framework for Nonlinear Dependence Testing and Symbolic Analysis", Robert van Engelen, Johnnie Birch, Yixin Shou, Burt Walsh, and Kyle Gallivan, in the proceedings of the ACM International Conference on Supercomputing (ICS), June 2004, pages 106-115.

  7. "Branch Elimination by Condition Merging" by W. Kreahling, D. Whalley, M. Bailey, X. Yuan, G. Uh, R. van Engelen, in Software Practice and Experience, accepted May 2004.

  8. "WCET Code Positioning" by W. Zhao, D. Whalley, C. Healy, F. Mueller in the Proceedings of the IEEE Real-Time Systems Symposium, December 2004.

  9. "Automatic Validation of Code-Improving Transformations on Low-Level Program Representations" by R. van Engelen, D. Whalley, X. Yuan, in Science of Computer Programming, August 2004, pages 257-280.

  10. "Fast Searches for Effective Optimization Phase Sequences" by P. Kulkarni, S. Hines, J. Hiser, D. Whalley, J. Davidson, D. Jones in the Proceedings of the ACM SIGPLAN Conference on Programming Language Design & Implementation, June 2004, pages 171-182.

  11. "Tuning the WCET of Embedded Applications" by W. Zhao, P. Kulkarni, D. Whalley, C. Healy, F. Mueller, G. Uh in the Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, May 2004, pages 472-481.

  12. "Constructing Finite State Automata for High Performance XML Web Services", Robert van Engelen, in the proceedings of the International Symposium on Web Services (ISWS), May 2004.

  13. "Code Generation Techniques for Developing Web Services for Embedded Devices", Robert van Engelen, in the proceedings of the 9th ACM Symposium on Applied Computing SAC, Nicosia, Cyprus, March 2004, pages 854-861.

  14. "Array Data Dependence Testing with the Chains of Recurrences Algebra", Robert van Engelen, Johnnie Birch, and Kyle Gallivan, in the proceedings of the IEEE International Workshop on Innovative Architectures for Future Generation High-Performance Processors and Systems (IWIA), January 2004, pages 70-81.

  15. "Fast Memory Bank Assignment for Fixed-point Digital Signal Processors" by J. Cho, Y. Paek, and D. Whalley, in ACM Transactions on Design Automation of Electronic Systems, vol 9, no 1, January 2004, pages 52-74.

2003


  1. "Parametric Intra-Task Dynamic Voltage Scheduling", Burt Walsh, Robert van Engelen, Kyle Gallivan, Johnnie Birch, and Yixin Shou, in proceedings of the PACT Workshop on Compilers and Operating Systems for Low Power, 2003.

  2. "Branch Elimination via Multi-Variable Condition Merging" by W. Kreahling, D. Whalley, M. Bailey, X. Yuan, G. Uh, and R. van Engelen in the Proceedings of the European Conference on Parallel and Distributed Computing, August 2003, pages 261-270.

  3. "Secure Web Services with Globus GSI and gSOAP", Giovanni Aloisio, Massimo Cafaro, Daniele Lezzi, and Robert van Engelen, in the proceedings of European Conference on Parallel and Distributed Computing, August 2003.

  4. "Finding Effective Optimization Phase Sequences" by P. Kulkarni, W. Zhao, H. Moon, K. Cho, D. Whalley, J. Davidson, M. Bailey, Y. Paek, K. Gallivan, D. Jones in the Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems, June 2003, pages 12-23. [slides]

  5. "Pushing the SOAP Envelope with Web Services for Scientific Computing", Robert A. van Engelen, in the proceedings of the International Conference on Web Services (ICWS), June 2003, pages 346-354.

  6. "CC-MPI: A Compiled Communication Capable MPI Prototype for Ethernet Switched Clusters" by A. Karwande, X. Yuan, and D. K. Lowenthal, in the Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), June 2003, pages 95-106.

  7. "Validation of Code-Improving Transformations for Embedded Systems" by R. van Engelen, D. Whalley, and X. Yuan, in the Proceedings of the ACM SIGAPP Symposium on Applied Computing, March 2003, pages 684-691.

  8. "Algorithms for Supporting Compiled Communication" by X. Yuan, R. Melhem, and R. Gupta, in IEEE Transactions on Parallel and Distributed Systems, Volume 14, No. 2, pages 107-118, February 2003.

  9. "Tight Timing Estimation With the Newton-Gregory Formulae", Robert A. van Engelen, Kyle Gallivan, and Burt Walsh, in proceedings of Compilers for Parallel Computing (CPC), January 2003, Amsterdam, Netherlands, pages 321-330.

2002


  1. "Efficient and Effective Branch Reordering Using Profile Data" by M. Yang, G. Uh, and D. Whalley, in ACM Transactions on Programming Languages and Systems, vol 24, no 6, November 2002, pages 667-697.

  2. "Communication Characteristics in the NAS Parallel Benchmarks" by A. Faraj and X. Yuan, in the Proceedings of the Fourteenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2002), November 4-6, 2002, pages 729-734.

  3. "Automatic Detection and Exploitation of Branch Constraints for Timing Analysis" by C. Healy and D. Whalley, in IEEE Transactions on Software Engineering, August 2002, pages 763-781.

  4. "VISTA: A System for Interactive Code Improvement" by W. Zhao, B. Cai, D. Whalley, M. Bailey, R. van Engelen, X. Yuan, J. Hiser, J. Davidson, K. Gallivan, and D. Jones, in the Proceedings of the ACM SIGPLAN Conference on Language, Compilers, and Tools for Embedded Systems, June 2002, pages 155-164.

  5. "Group Management Schemes for Implementing MPI Collective Communication over IP-Multicast" by X. Yuan, S. Daniels, A. Faraj and A. Karwande, in the Proceedings of The 6th International Conference on Computer Science and Informatics, March 8-14, 2002, pages 76-80.

2001


  1. Allocation by Conflict: A Simple, Effective Cache Management Scheme by Edward S. Tam, Stevan A. Vlaovic, Gary S. Tyson and Edward Davidson, in IEEE International Conference of Computer Design, Sept 2001.

  2. Evaluating the Use of Register Queues in Software Pipelined Loops by Gary Tyson, Mikhail Smelyanskiy and Edward Davidson, in IEEE Transactions on Computers, Vol. 50, No. 8, pp. 769-783, August 2001.

  3. Evaluating Design Tradeoffs in Dual Pipelines by Ramu Pyreddy and Gary Tyson, in the Workshop on Complexity-Effective Design in association with the 28th Annual Symposium on Computer Architecture, 2001.

  4. Stack Value File: Custom Microarchitecture for the Stack by Hsien-Hsin Lee, Mikhail Smelyanskiy, Chris Newburn and Gary Tyson, in the Seventh International Symposium on High Performance Computer Architecture (HPCA-7), pp. 5-14, Jan. 2001.

  5. Branch History Guided Instruction Prefetching by Viji Srinivasan, Edward Davidson and Gary Tyson, in the Seventh International Symposium on High Performance Computer Architecture (HPCA-7), pp. 291-300, Jan. 2001.

  6. "An Empirical Study of Reliable Multicast Protocols over Ethernet-Connected Networks" by R. G. Lane, S. Daniels, and X. Yuan, in the Proceedings of the International Conference on Parallel Processing (ICPP'01), September 3-7, 2001, pages 553-560.

  7. "Parametric Timing Analysis" by E. Vivancos, C. Healy, F. Mueller, and D. Whalley, in the Proceedings of the ACM SIGPLAN Workshop on Language, Compilers, and Tools for Embedded Systems, June 2001, pages 88-93.

2000


  1. Eager Writeback - a Technique for Improving Bandwidth Utilization by Hsien-Hsin Lee, Gary Tyson and Matthew Farrens, in the 33rd Annual International Symposium on Microarchitecture (Micro 33), pp. 11-20, Dec. 2000.

  2. Improving BTB performance in the presence of DLLs by Steve Vlaovic, Edward Davidson and Gary Tyson, in the 33rd Annual International Symposium on Microarchitecture (Micro 33), pp. 77-86, Dec. 2000.

  3. Region-based caching: An energy-delay efficient memory architecture for embedded processors by Hsien-Hsin Lee and Gary Tyson, in the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES 2000), Nov. 2000.

  4. Register Queues: A New Hardware/Software Approach to Efficient Software Pipelining by Mikhail Smelyanskiy, Gary Tyson and Edward Davidson, in the International Conference on Parallel Architectures and Compilation Techniques (PACT 2000), Oct 2000.

  5. Quantifying Instruction-Level Parallelism Limits on an EPIC Architecture by Hsien-Hsin Lee, Youfeng Wu, and Gary Tyson, in the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 21-27, April 2000.

  6. Instruction Overhead and Data Locality Effects in Superscalar Processors by Murali Annavaram, Gary S. Tyson and Edward S. Davidson, in the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 95-100, April 2000.

  7. "Supporting Timing Analysis by Automatic Bounding of Loop Iterations" by C. Healy, M. Sjodin, V. Rustagi, D. Whalley, and R. van Engelen, in Real-Time Systems, May 2000, pages 121-148.

1999


  1. Memory Renaming: Fast, Early and Accurate Processing of Memory Communication by Gary Tyson and Todd Austin, in the International Journal of Parallel Programming, 1999.

  2. Classifying Load and Store Instructions for Memory Renaming by Glenn Reinman, Brad Calder, Dean Tullsen, Gary Tyson and Todd Austin, in the ACM International Conference on Supercomputing, pp. 399-407, June 1999.

  3. Active Management of Data Caches by Exploiting Reuse Information by Edward S. Tam, Jude A. Rivers, Vijayalakshmi Srinivasan, Gary S. Tyson and Edward S. Davidson, in IEEE Transactions on Computers, Vol 48, No 11, pp. 1244-1259, Nov 1999.

  4. Performance Limits of Trace Caches by Matt Postiff, Gary Tyson and Trevor Mudge, in the Journal of Instruction Level Parallelism, 1999.

  5. Limits of Instruction Level Parallelism in SPEC95 Applications by Matthew A. Postiff, David A. Green, Gary S. Tyson and Trevor N. Mudge, in Computer Architecture News, Vol. 27, No. 1, March 1999.

  6. "Effectively Exploiting Indirect Jumps" by G. Uh and D. Whalley, in Software Practice & Experience, December 1999, pages 1061-1101.

  7. "Timing Analysis for Data Caches and Wrap-Around Fill Caches" by R. White, F. Mueller, C. Healy, D. Whalley, and M. Harmon, in Real-Time Systems, November 1999, pages 209-233.

  8. "A General Approach for Tight Timing Predictions of Non-Rectangular Loops" by C. Healy, R. van Engelen and D. Whalley, in the WIP Proceedings of the IEEE Real-Time Technology and Applications Symposium, June 1999, pages 11-14. (This was a Work in Progress (WIP) paper.)

  9. "Tighter Timing Predictions by Automatic Detection and Exploitation of Value-Dependent Constraints" by C. Healy and D. Whalley, in the Proceedings of the IEEE Real-Time Technology and Applications Symposium, June 1999, pages 79-88.

  10. "Compiler Analysis to Support Compiled Communication for HPF-like Programs" by X. Yuan, R. Gupta, and R. Melhem, in the Proceedings of the 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing, April 1999, pages 603-608.

  11. "Bounding Pipeline and Instruction Cache Performance" by C. A. Healy, R. D. Arnold, F. Mueller, D. B. Whalley, and M. G. Harmon, in IEEE Transactions on Computers, January 1999, pages 53-70.

  12. "Timing Constraint Specification and Analysis" by L. Ko, N. Al-Yaqoubi, C. Healy, E. Ratliff, R. Arnold, D. Whalley, and M. Harmon, in Software Practice & Experience, January 1999, pages 77-98.

1998


  1. Analyzing the Working Set Characteristics of Branch Execution by Sangwook P. Kim and Gary S. Tyson, in the Proceedings of the 31st Annual Symposium on Microarchitecture, Dec. 1998.

  2. MirvSim: A high level simulator integrated with the Mirv compiler by Krisztian Flautner, Gary S. Tyson and Trevor Mudge, in the Proc. 3rd Workshop Interaction between Compilers and Computer Architectures (INTERACT-3) at ASPLOS-VIII, Oct. 1998.

  3. Limits of Instruction Level Parallelism in SPEC95 Applications by Matthew A. Postiff, David A. Green, Gary S. Tyson and Trevor N. Mudge, in the Proc. 3rd Workshop Interaction between Compilers and Computer Architectures (INTERACT-3) at ASPLOS-VIII, Oct. 1998.

  4. Evaluating the Performance of Active Cache Management Schemes by Edward S. Tam, Jude A. Rivers, Vijayalakshmi Srinivasan, Gary S. Tyson and Edward S. Davidson, in the Proceedings of the 1998 IEEE International Conference on Computer Design, October 1998.

  5. Utilizing Reuse Information in Data Cache Management by Jude A. Rivers, Edward S. Tam, Gary S. Tyson, Edward S. Davidson and Matthew Farrens, in the Proceedings of the 12th ACM International Conference on Supercomputing, July 1998.

  6. mlcache: A Flexible Multi-Lateral Cache Simulator by Edward S. Tam, Gary S. Tyson and Edward S. Davidson, in the Proceedings of the 6th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS '98), July 1998.

  7. "Improving Performance by Branch Reordering" by M. Yang, G. Uh, and D. Whalley, in the Proceedings of the SIGPLAN '98 Conference on Programming Language Design and Implementation, June 1998, pages 130-141.

  8. "Bounding Loop Iterations for Timing Analysis" by C. Healy, M. Sjodin, V. Rustagi, and D. Whalley in the Proceedings of the IEEE Real-Time Technology and Applications Symposium, June 1998, pages 12-21.

1997


  1. "Coalescing Conditional Branches into Efficient Indirect Jumps" by G. Uh and D. Whalley in the Proceedings of the Static Analysis Symposium, September 1997, pages 315-329.

  2. "An Array Data Flow Analysis based Communication Optimizer" by X. Yuan, R. Gupta, and R. Melhem, in Proceedings of the Tenth Annual Workshop on Languages and Compilers for Parallel Computing (LCPC'97), LNCS 1366, August 1997, pages 246-260.

  3. "Timing Analysis for Data Caches and Set-Associative Caches" by R. White, F. Mueller, C. Healy, D. Whalley, and M. G. Harmon in the Proceedings of the IEEE Real-Time Technology and Applications Symposium, June 1997, pages 192-202.

1996


  1. "Compiled Communication for All-optical TDM Networks" by X. Yuan, R. Melhem, and R. Gupta, Supercomputing'96, November 17-22, 1996.

  2. "Demand-driven Data Flow Analysis for Communication Optimization" by X. Yuan, R. Gupta, and R. Melhem, in Proceedings of the Workshop on Challenges in Compiling for Scalable Parallel Systems, 8th IEEE Symposium on Parallel and Distributed Processing, Oct. 23-26, 1996.

  3. "Supporting the Specification and Analysis of Timing Constraints" by L. Ko, C. Healy, E. Ratliff, R. Arnold, D. Whalley, and M. G. Harmon in the Proceedings of the IEEE Real-Time Technology and Applications Symposium, June 1996, pages 170-178.

1995


  1. "Integrating the Timing Analysis of Pipelining and Instruction Caching" by C. A. Healy, D. B. Whalley, and M. G. Harmon in the Proceedings of the IEEE Real-Time Systems Symposium, December 1995, pages 288-297.

  2. "Supporting User-Friendly Analysis of Timing Constraints" by L. Ko, D. B. Whalley, M. G. Harmon in the Proceedings of the ACM SIGPLAN Workshop on Language, Compilers, and Tools for Real-Time Systems, June 1995, pages 107-115.

1994


  1. "Predicting Instruction Cache Behavior" by F. Mueller, D. B. Whalley, M. G. Harmon in the Proceedings of the ACM SIGPLAN Workshop on Language, Compiler, and Tool Support for Real-Time Systems, June 1994.


Comments about the presentation of these pages should be sent to webmaster@cs.fsu.edu.
Comments and questions about the content of these pages should be sent to aces@cs.fsu.edu.