Song Fu

Associate Professor
Discovery Park F250
940-565-2341
song.fu@unt.edu
  • Education
    • PhD, Wayne State University, 2008.
      Major: Computer Engineering
      Dissertation Title: Failure-Aware Reconfigurable Distributed Virtual Machine for Dependable and High Productivity Computing
    • MS, Nanjing University, 2002.
      Major: Computer Science
    • BS, Nanjing University of Aeronautics and Astronautics, 1999.
      Major: Computer Science
  • Research

    As parallel and distributed computing systems grow in scale and complexity, new foundations are needed for understanding and controlling their integral properties. Dr. Fu’s research is dedicated to the investigation, establishment, and experimental evaluation of new theoretical foundations and system artifacts that significantly improve system resilience, power and energy efficiency, and performance. His research interests are primarily in high-performance computing and distributed and cloud systems, including:

    • Resilience and Failure/Error Management
    • Anomaly Detection and Failure Diagnosis
    • Smart Storage and Storage Reliability
    • Power Management and Energy Efficiency
    • Autonomic Resource Management and Reconfiguration
    • High Performance Computing
    • Smart Cities and Smart Communities
  • Publications

    Peer-Reviewed Conference and Journal Publications

    • Song Fu, Hsing-Bung Chen and George Qiao
      A Machine Learning based Disk Health Status Assessment, Failure Prediction, and Pre-Failure Data Recovery Approach for Supporting Always-On Extreme Scale Storage Systems
      ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), July 2017.
    • Zongze Li, Matthew Davidson, Song Fu, Sean Blanchard and Michael Lang
      Event Block Analysis for Effective Anomaly Detection on Production HPC Systems
      ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), July 2017.
    • Song Huang, Song Fu, Weisong Shi and Devesh Tiwari
      Proactive Disk Failure Management and Data Protection for Highly Available Storage Systems
      ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), July 2017.
    • Mohit Kumar, Devesh Tiwari, Weisong Shi, Saurabh Gupta and Song Fu
      Towards Understanding Interconnect Failures in HPC Systems
      Greater Chicago Area Systems Research Workshop (GCASR), April 2017.
    • Hsing-Bung Chen and Song Fu
      Improving Coding Performance and Energy Efficiency of Erasure Coding Process for Storage Systems
      IEEE International Conference on Cloud Computing (CLOUD), July 2016.
    • Song Huang, Song Fu, Scott Pakin and Michael Lang
      Characterizing Power and Energy Efficiency of Legion Runtime and Applications: An Early Experience
      IEEE International Green and Sustainable Computing Conference (IGSC), November 2016.
    • Song Huang, Zhiang Deng and Song Fu
      Quantifying Topology Criticality for Fault Impact Analysis in Software-Defined Networks
      The 35th IEEE International Performance Computing and Communications Conference (IPCCC), December 2016.
    • Jacob Hochstetler, Lauren Hochstetler, and Song Fu
      An Optimal Police Patrol Planning Strategy for Smart City Safety
      The 14th IEEE International Conference on Smart City (SmartCity), December 2016.
    • Elisabeth Baseman, Sean Blanchard, Zongze Li, and Song Fu
      Relational Synthesis of Text and Numeric Data for Anomaly Detection on Computing System Logs
      The 15th IEEE International Conference on Machine Learning and Applications (ICMLA), December 2016.
    • Hsing-Bung Chen and Song Fu
      Parallel Erasure Coding: Exploring Task Parallelism in Erasure Coding for Enhanced Bandwidth
      IEEE International Conference on Networking, Architecture and Storage (NAS), August 2016.
    • Ziming Zhang, Michael Lang, Scott Pakin, and Song Fu
      TracSim: Simulating and Scheduling Trapped Power Capacity to Maximize Machine Room Throughput
      Parallel Computing (ParCo) Journal, Vol. 57: 108-124, September 2016.
    • Qiang Guan, Nathan DeBardeleben, Sean Blanchard and Song Fu
      Addressing Statistical Significance of Fault Injection: Empirical Studies of the Soft Error Susceptibility
      International Journal of High Performance Computing and Networking, June 2016.
    • Q. Guan, N. DeBardeleben, S. Blanchard, S. Fu, C. H. Davis, and W. M. Jones
      Analyzing the Robustness of HPC Applications Using a Fine-Grained Soft Error Fault Injection Tool
      Book Chapter, Innovative Research and Applications in Next-Generation High Performance Computing, pp. 277-305, IGI Global, July 2016.
    • S. Huang, S. Fu, Q. Zhang and W. Shi
      Characterizing Disk Failures with Quantified Disk Degradation Signatures: An Early Experience
      IEEE International Symposium on Workload Characterization (IISWC), 10 pages, October 2015.
    • Q. Guan, N. DeBardeleben, S. Blanchard and S. Fu
      Addressing Statistical Significance of Fault Injection: Empirical Studies of the Soft Error Susceptibility
      The 21st IEEE/IFIP Pacific Rim International Symposium on Dependable Computing (PRDC), 10 pages, November 2015.
    • S. Huang, S. Fu, N. DeBardeleben, Q. Guan and C. Xu
      Differentiated Failure Remediation with Action Selection for Resilient Computing
      The 21st IEEE/IFIP Pacific Rim International Symposium on Dependable Computing (PRDC), 10 pages, November 2015.
    • H.-B. Chen, S. Fu, Z. Qiao, S. Liang and S. Huang
      A Parallel, Reliable and Scalable Storage Software Infrastructure for Active Storage System and I/O Environments
      The 34th IEEE International Performance Computing and Communications Conference, December 2015.
    • Z. Qiao, S. Liang, H. Jiang and S. Fu
      A Customizable MapReduce Framework for Complex Data-Intensive Workflows on GPUs
      The 34th IEEE International Performance Computing and Communications Conference, December 2015.
    • S. Huang, M. Lang, S. Pakin and S. Fu
      Measurement and Characterization of Haswell Power and Energy Consumption
      Energy Efficient Supercomputing in IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC'15), 10 pages, November 2015.
    • Q. Guan, N. DeBardeleben, S. Blanchard and S. Fu
      Empirical Studies of the Soft Error Susceptibility of Sorting Algorithms to Statistical Fault Injection
      Fault Tolerance for HPC at eXtreme Scale in the 24th International ACM Symposium on High Performance Distributed Computing (HPDC), Pages 33-40, June 2015.
    • Q. Guan, N. DeBardeleben, S. Blanchard and S. Fu
      Soft Error Susceptibility of Sorting Algorithms
      IEEE International Workshop on Silicon Errors in Logic - System Effects (SELSE), March 2015.
    • Z. Qiao, S. Liang, H. Jiang, and S. Fu
      MR-Graph: a Customizable GPU MapReduce
      IEEE International Conference on Cyber Security and Cloud Computing, November 2015.
    • B. Arigong, M. Zhou, H. Ren, J. Shao, S. Fu, H. Kim and H. Zhang
      System Application of Planar Couplers
      IEEE Symposium on Wireless and Microwave Circuits and Systems, April 2015.
    • H. Ren, J. Shao, B. Arigong, M. Zhou, S. Fu, H. Kim and H. Zhang
      Simplified Doherty Power Amplifier Structures
      IEEE Symposium on Wireless and Microwave Circuits and Systems, April 2015.
    • J. Shao, H. Ren, M. Zhou, B. Arigong, J. Ding, S. Fu, H. Kim, and H. Zhang
      Design of a Dual-Band Sequential Power Amplifier
      Microwave and Optical Technology Letters, 2015.
    • J. Shao, H. Ren, M. Zhou, B. Arigong, J. Ding, S. Fu, H. Kim, and H. Zhang
      Design of a Tunable Sequential Power Amplifier
      Microwave and Optical Technology Letters, 2015.
    • B. Arigong, J. Ding, H. Ren, M. Zhou, J. Shao, H. Kim, S. Fu, and H. Zhang
      An Improved Design of Dual-band 3dB 180° Directional Coupler
      Progress in Electromagnetic Research (PIER), Vol. 56: 153-162, 2015.
    • J. Shao, R. Zhou, S. Yoon, S. Fu, H. Kim, and H. Zhang
      Design of Dual-Band GaN Doherty Power Amplifier Using a Simplified Structure
      Microwave and Optical Technology Letters, Vol. 57(4): 953-956, 2015.
    • Qiang Guan, Song Fu, Nathan DeBardeleben and Sean Blanchard
      F-SEFI: A Fine-grained Soft Error Fault Injection Tool for Profiling Application Vulnerability
      The 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp.1-10, May 2014.
    • Ziming Zhang, Michael Lang, Scott Pakin and Song Fu
      Trapped Capacity: Scheduling under a Power Cap to Maximize Machine-Room Throughput
      Workshop on Energy Efficient Supercomputing in conjunction with IEEE/ACM Supercomputing Conference (SC), pp.1-10, November 2014.
    • Qiang Guan, Nathan DeBardeleben, Sean Blanchard and Song Fu
      Towards Exploring the Soft Error Susceptibility of Heapsort Algorithms
      The 44th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), June 2014.
    • Xiajun Wang, Song Huang, Song Fu and Krishna Kavi
      Characterizing Workload of Web Applications on Virtualized Servers
      The 4th Workshop on Big Data Benchmarks, Performance Optimization, and Emerging Hardware in conjunction with the 19th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp.1-8, March 2014.
    • Bayaner Arigong, Jun Ding, Han Ren, Mi Zhou, Jin Shao, Rongguo Zhou, Hyoungsoo Kim, Yuankun Lin, Song Fu and Hualiang Zhang
      Transformation Optics for Microwave and Optical Device Design
      IEEE International Conference on Electromagnetics in Advanced Applications (ICEAA), August 2014.
    • Jin Shao, Rongguo Zhou, Sang-Woong Yoon, Song Fu, Hyoungsoo Kim and Hualiang Zhang
      Design of Dual-Band GaN Doherty Power Amplifier Using a Simplified Structure
      Microwave and Optical Technology Letters, in press, December 2014.
    • Qiang Guan and Song Fu
      Adaptive Anomaly Identification by Exploring Metric Subspace in Cloud Computing Infrastructures
      The 32nd IEEE International Symposium on Reliable Distributed Systems (SRDS), pp.1-10, 2013.
    • Qiang Guan and Song Fu
      Autonomic Failure Identification and Diagnosis for Building Dependable Computing Systems
      ACM/IEEE Supercomputing Conference (SC), 2013.
    • Qiang Guan, Song Fu, Nathan DeBardeleben and Sean Blanchard
      Exploring Time and Frequency Domains for Accurate and Automated Anomaly Detection in Cloud Computing Systems
      The 19th IEEE/IFIP Pacific Rim International Symposium on Dependable Computing (PRDC), pp.1-10, 2013.
    • Qiang Guan and Song Fu
      Wavelet-Based Multi-Scale Anomaly Identification in Cloud Computing Systems
      IEEE Global Communications Conference (GLOBECOM), pp.1-7, 2013.
    • Husanbir Pannu, Jianguo Liu and Song Fu
      AAD: Adaptive Anomaly Detection System for Cloud Computing Infrastructures
      The 31st IEEE International Symposium on Reliable Distributed Systems (SRDS), 2012.
    • Ziming Zhang, Qiang Guan and Song Fu
      An Adaptive Power Management Framework for Autonomic Resource Configuration in Cloud Computing Infrastructures
      The 31st IEEE International Performance Computing and Communications Conference (IPCCC), pp.1-10, 2012. 
    • Husanbir Pannu, Jianguo Liu, Qiang Guan and Song Fu
      An Autonomic Failure Detection System for Cloud Computing Infrastructures
      The 31st IEEE International Performance Computing and Communications Conference (IPCCC), pp.1-10, 2012. 
    • Qiang Guan, Chi-Chen Chiu and Song Fu
      A Cloud Dependability Analysis Framework for Assessing System Dependability in Cloud Computing Infrastructures
      The 18th IEEE/IFIP Pacific Rim International Symposium on Dependable Computing (PRDC), pp.1-10, 2012.
    • Husanbir Pannu, Jianguo Liu and Song Fu
      A Hybrid Anomaly Detection Framework in Cloud Computing using One-Class and Two-Class Support Vector Machines
      International Conference on Advanced Data Mining and Applications (ADMA), pp.1-12, 2012. 
    • Qiang Guan, Chi-Chen Chiu, Ziming Zhang and Song Fu
      Efficient and Accurate Anomaly Identification Using Reduced Metric Space in Utility Clouds
      IEEE International Conference on Networking, Architecture, and Storage (NAS), pp.1-10, 2012. 
    • Husanbir Pannu, Jianguo Liu and Song Fu
      Autonomic Anomaly Identification for Developing Highly Dependable Utility Clouds
      IEEE Global Communications Conference (GLOBECOM), pp.1-7, 2012. 
    • Qiang Guan, Ziming Zhang and Song Fu 
      Ensemble of Bayesian Predictors and Decision Trees for Proactive Failure Management in Cloud Computing Systems 
      Journal of Communications, Vol. 7, No. 1, pp. 52-61, 2012.
    • Qiang Guan, Ziming Zhang and Song Fu 
      A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers 
      International Journal of Computer Theory and Engineering, In press, 2012.
    • Ziming Zhang and Song Fu
      Characterizing Power and Energy Usage in Cloud Computing Systems
      IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 2011. 
    • Qiang Guan, Ziming Zhang and Song Fu
      Proactive Failure Management by Integrated Unsupervised and Semi-Supervised Learning for Dependable Cloud Systems
      IEEE International Conference on Availability, Reliability and Security (ARES), 2011. 
    • Song Fu, Qiang Guan, Ziming Zhang
      Failure Detection and Prediction for Dependable Cloud Computing Systems
      IEEE Global Communications Conference (GLOBECOM), 2011.
    • Nathan DeBardeleben, Sean Blanchard, Qiang Guan, Ziming Zhang, and Song Fu
      Experimental Framework for Injecting Logic Errors in a Virtual Machine to Profile Applications for Soft Error Resilience
      Resilience, the 17th International European Conference on Parallel and Distributed Computing (Euro-Par), September 2011.
    • Ziming Zhang and Song Fu
      macropower: A Coarse-Grain Power Profiling Framework for Energy-Efficient Cloud Computing
      The 30th IEEE International Performance Computing and Communications Conference (IPCCC), 2011.
    • Qiang Guan, Ziming Zhang and Song Fu
      Ensemble of Bayesian Predictors for Autonomic Failure Management in Cloud Computing
      The 20th IEEE International Conference on Computer Communications and Networks (ICCCN), 2011. 
    • Song Fu, Chengzhong Xu and Helen Shen 
      Randomized Load Balancing Strategies with Churn Resilience in Peer-to-Peer Networks 
      Journal of Network and Computer Applications, Elsevier, Vol. 34, No. 1, pp. 252-261, 2011.
    • Song Fu and Chengzhong Xu
      Failure-Aware Resource Management for High-Availability Computing Clusters with Distributed Virtual Machines 
      Journal of Parallel and Distributed Computing, Elsevier, Vol. 70, No. 4, pp. 384-393, 2010.
    • Song Fu and Chengzhong Xu 
      Quantifying Event Correlations for Proactive Failure Management in Networked Computing Systems 
      Journal of Parallel and Distributed Computing, Elsevier, Vol. 70, No. 11, pp. 1100-1109, 2010.
    • Ziming Zhang and Song Fu 
      A Hierarchical Failure Management Framework for Dependability Assurance in Compute Clusters 
      International Journal of Computational Science, Vol. 4, No. 4, pp. 313-326, 2010.
    • Ziming Zhang and Song Fu
      Failure Prediction for Autonomic Management of Networked Computer Systems with Availability Assurance
      DPDNS, IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2010.
    • Qiang Guan and Song Fu
      auto-AID: A Data Mining Framework for Autonomic Anomaly Identification in Networked Computer Systems
      The 29th IEEE International Performance Computing and Communications Conference (IPCCC), December 2010. 
    • Song Fu
      Dependability Enhancement for Coalition Clusters with Autonomic Failure Management
      The 15th IEEE International Symposium on Computers and Communications (ISCC), June 2010.
    • Ziming Zhang and Song Fu
      Proactive Failure Management for High Availability Computing in Computer Clusters
      IEEE International Conference on Computational Sciences and Optimization (CSO), May 2010.
    • Qiang Guan, Derek Smith and Song Fu
      Anomaly Detection in Large-Scale Coalition Clusters for Dependability Assurance
      The 17th IEEE International Conference on High Performance Computing (HiPC), December 2010. 
    • Song Fu
      Failure-Aware Construction and Reconfiguration of Distributed Virtual Machines for High Availability Computing
      The 9th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), 2009. 
    • Song Fu and Chengzhong Xu
      Proactive Resource Management for Failure Resilient High Performance Computing Clusters
      The IEEE International Conference on Availability, Reliability and Security (ARES), 2009. 
    • Song Fu, Chengzhong Xu and Haiying Shen 
      Random Choices for Churn Resilient Load Balancing in Peer-to-Peer Networks 
      The 22nd ACM/IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2008.  
    • Song Fu and Chengzhong Xu 
      Exploring Event Correlation for Failure Prediction in Coalitions of Clusters 
      The ACM/IEEE Supercomputing Conference (SC), 2007. (Acceptance rate: 20%) 
    • Song Fu and Chengzhong Xu 
      Quantifying Temporal and Spatial Correlation of Failure Events for Proactive Management 
      The 26th IEEE International Symposium on Reliable Distributed Systems (SRDS), 2007.  
    • Song Fu and Chengzhong Xu
      Coordinated Access Control with Spatial Constraints in Coalition Mobile Computing Systems 
      Journal of Future Generation Computer Systems, Elsevier, Vol. 23, No. 6, pp. 804-815, 2007.
    • Song Fu and Chengzhong Xu 
      Stochastic Modeling and Analysis of Hybrid Mobility in Reconfigurable Distributed Virtual Machines
      Journal of Parallel and Distributed Computing, Elsevier, Vol. 66, No. 11, pp. 1442-1454, 2006.
    • Song Fu, Chengzhong Xu, Brian Wims, and Ramzi Basharahil 
      Distributed Shared Arrays: A Distributed Virtual Machine with Mobility Support for Reconfiguration
      Journal of Cluster Computing, Vol. 9, No. 3, pp. 237-255, 2006.
    • Song Fu and Chengzhong Xu 
      Service Migration in Distributed Virtual Machines for Adaptive Grid Computing 
      The 34th IEEE International Conference on Parallel Processing (ICPP), 2005. (Best Paper Nominee)
    • Song Fu and Chengzhong Xu 
      A Coordinated Spatio-Temporal Access Control Model for Mobile Computing in Coalition Environments
      The 19th ACM/IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2005.
    • Song Fu and Chengzhong Xu
      Mobility Support for Adaptive Grid Computing
      Scalable and Secure Internet Services and Architecture, Chapman & Hall/CRC, 2005.
    • Song Fu and Chengzhong Xu
      Mobile Code and Protection
      Handbook of Information Security, John Wiley & Sons, 2005.
    • Song Fu and Chengzhong Xu
      Migration Decision for Hybrid Mobility in Reconfigurable Distributed Virtual Machines 
      The 33rd IEEE International Conference on Parallel Processing (ICPP), 2004.
    • Ramzi Basharahil, Brian Wims, Chengzhong Xu, and Song Fu 
      Distributed Shared Array: an Integration of Message Passing and Multithreading on SMP Clusters
      Journal of Supercomputing, Vol. 31, No. 2, pp. 161-184, 2004.
    • Chengzhong Xu and Song Fu 
      Privilege Delegation and Agent-oriented Access Control in Naplet 
      IEEE International Workshop on Mobile Distributed Computing (In conjunction with ICDCS), 2003.
    • Song Fu, Zhiquan Jin and Peipei Chen 
      A Rate-Based Multicast Protocol for Large-Scale Reliable Transport 
      The 17th IEEE International Conference on Advanced Information Networking and Applications, 2003. 

    Research Posters

    • Jason He (TAMS student)
      CODY: Characterizing Power and Energy Usage with Resource Auto-Configuration in Cloud Computing
      DFW Science and Engineering Fair, Dallas, Texas, 2013.
    • Husanbir Singh Pannu
      Adaptive Anomaly Detection System for Cloud Computing Infrastructures
      The 31st IEEE International Symposium on Reliable Distributed Systems (SRDS), 2012.
    • Song Fu
      Proactive Failure Management for Dependable Networked Computer Systems
      Department of Computer Science, University of North Texas, 2011.
  • Professional Experience
    • Panelist for U.S. National Science Foundation; Proposal Reviewer for Portuguese Foundation for Science and Technology, Canada Foundation for Innovation, Research Grants Council of Hong Kong, Kentucky Science and Engineering Foundation, South Carolina Institutions of Higher Education.
    • General Chair, 35th IEEE International Performance Computing and Communications Conference (IPCCC 2016)
    • Workshop Chair, 26th IEEE International Conference on Computer Communications and Networks (ICCCN 2017)
    • Travel Grant Chair, 24th IEEE International Conference on Computer Communications and Networks (ICCCN 2015)
    • General Vice-Chair, 32nd IEEE International Performance Computing and Communications Conference (IPCCC 2013)
    • Program Chair, 31st IEEE International Performance Computing and Communications Conference (IPCCC 2012)
    • Publication Chair, 30th IEEE International Performance Computing and Communications Conference (IPCCC 2011)
    • Track Chair, 20th IEEE International Conference on Computer Communications and Networks (ICCCN 2011)
    • Registration Chair, IEEE International Symposium on Electronic System Design (ISED 2011)
    • Poster Chair, 29th IEEE International Performance Computing and Communications Conference (IPCCC 2010)
    • Program Committee, IEEE/ACM IPDPS 2018, IEEE ISM 2015, IEEE CLOUD 2015, IEEE CLOUDNET 2015, IEEE CLOUDNET 2014, ACM BodyNet 2014, IARIA CLOUD COMPUTING 2014, ACM BodyNet 2013, IEEE NAS 2013, IARIA INTERNET 2013, IEEE COMPSAC 2012, IEEE NAS 2012, IEEE I-SPAN 2012, FTRA FutureTech 2012, IEEE ICPADS 2010, IEEE AINA 2010, IEEE CloudCom 2010, ACM IC3 2010, IEEE I-SPAN 2009, ACM Compute 2009, IEEE FCST 2009, IEEE AINA 2009, ACM IC3 2009, IEEE/IFIP EUC 2008.
    • Paper Reviewer, IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Computers (TOC), IEEE Transactions on Emerging Topics in Computing (TETC), IEEE Transactions on Services Computing (TSC), ACM Transactions on Autonomous and Adaptive Systems (TAAS), Journal of Parallel and Distributed Computing (JPDC), Journal of Systems and Software (JSS), Journal of Future Generation Computer Systems (FGCS), Journal of Supercomputing (JSC).
    • IEEE Senior Member
    • Member of ACM, ASEE and Sigma Xi.