Prof. Geoff Webb
IEEE Fellow

Monash University, Australia

Geoff Webb is a leading data scientist. He is Research Director of the Monash University Data Futures Institute and a Technical Advisor to data science startups FROOMLE and BigML Inc. The latter has incorporated his best-in-class association discovery software, Magnum Opus, as a core component of its advanced Machine Learning service. He developed many of the key mechanisms of support-confidence association discovery in the late 1980s. His OPUS search algorithm remains the state of the art in rule search. He pioneered multiple research areas as diverse as black-box user modelling, interactive data analytics and statistically sound pattern discovery. He has developed many useful machine learning algorithms that are widely deployed. He has applied machine learning in a broad range of applications including manufacturing, finance, medicine, biology and geoscience. He was editor in chief of the premier data mining journal, Data Mining and Knowledge Discovery, from 2005 to 2014. He has been Program Committee Chair of the two top data mining conferences, ACM SIGKDD and IEEE ICDM, as well as General Chair of ICDM. He is an IEEE Fellow. His many awards include the prestigious inaugural Australian Museum Eureka Prize for Excellence in Data Science. His 250+ academic papers have received over 14,000 citations, and he has an h-index of 60. Seven of his recent papers have been recognised as Clarivate Web of Science High Cite papers (top 1% of citations for the discipline).

  • Keynote Speech Abstract

    Speech Title: Time Series Classification at Scale

    Abstract: Time series classification is a fundamental data science task, providing understanding of dynamic processes as they evolve over time. The recent introduction of ensemble techniques has revolutionised this field, greatly increasing accuracy, but at a cost of increasing already burdensome computational overheads. I present new time series classification technologies that achieve the same accuracy as recent state-of-the-art developments, but with many orders of magnitude greater efficiency and scalability. These make time series classification feasible at hitherto unattainable scale.


Prof. Lei Chen
IEEE Fellow

The Hong Kong University of Science and Technology, Hong Kong, China

Lei Chen holds a BS degree in computer science and engineering from Tianjin University, Tianjin, China, an MA degree from the Asian Institute of Technology, Bangkok, Thailand, and a PhD in computer science from the University of Waterloo, Canada. He is a professor in the Department of Computer Science and Engineering, Hong Kong University of Science and Technology (HKUST). Currently, Prof. Chen serves as the director of the Big Data Institute at HKUST, director of the Master of Science program on Big Data Technology and director of the HKUST MOE/MSRA Information Technology Key Laboratory. Prof. Chen’s research interests include human-powered machine learning, crowdsourcing, blockchain, graph data analysis, probabilistic and uncertain databases, and time series and multimedia databases. Prof. Chen received the SIGMOD Test-of-Time Award in 2015. The system developed by Prof. Chen’s team won the excellent demonstration award at VLDB 2014. Prof. Chen served as VLDB 2019 PC Co-chair. Currently, Prof. Chen serves as Editor-in-Chief of the VLDB Journal and associate editor-in-chief of IEEE Transactions on Knowledge and Data Engineering. He is an IEEE Fellow, an ACM Distinguished Member and an executive member of the VLDB Endowment.

  • Keynote Speech Abstract

    Speech Title: Data Management for Effective and Efficient Deep Learning

    Abstract: In recent years, deep learning (DL) has been widely adopted in various fields of application, including facial recognition, strategy games (AlphaGo and Texas hold'em) and question answering. However, the effectiveness of the models and the efficiency of the training process strongly depend on how well the associated data is managed. It is very challenging to train an effective deep learning-based image classifier without properly labelled training data. Furthermore, training efficiency is severely affected by large amounts of training data, complex model structures and tons of hyperparameters. A lack of validation for result data and of explanation also seriously affects the applicability of trained models. In this talk, I will discuss three issues on how to manage data for effective and efficient deep learning: 1) how to prepare data for effective DL, which includes data extraction and integration as well as data labelling; 2) how to optimize DL training, including data compression and computation graph optimization; and 3) how to conduct explanation to make the model robust and transparent. Some future work will be highlighted at the end.


Prof. Ling Liu
IEEE Fellow

Georgia Institute of Technology, USA

Ling Liu is a Professor in the School of Computer Science at Georgia Institute of Technology. She directs the research programs in the Distributed Data Intensive Systems Lab (DiSL), examining various aspects of big data powered artificial intelligence (AI) systems, and machine learning (ML) algorithms and analytics, including performance, availability, privacy, security and trust. Prof. Liu is an elected IEEE Fellow, a recipient of the IEEE Computer Society Technical Achievement Award (2012), and a recipient of best paper awards from numerous top venues, including IEEE ICDCS, WWW, ACM/IEEE CCGrid, IEEE Cloud and IEEE ICWS. Prof. Liu has served on the editorial boards of over a dozen international journals, served as the editor in chief of IEEE Transactions on Services Computing (2013-2016), and is currently the editor in chief of ACM Transactions on Internet Technology (since 2019). Prof. Liu is a frequent keynote speaker at top-tier venues in Big Data, AI and ML systems and applications, Cloud Computing, Services Computing, Privacy, Security and Trust. Her current research is primarily supported by the US National Science Foundation under CISE programs, IBM and Cisco.

  • Keynote Speech Abstract

    Speech Title: Ensemble Learning Methods for Neural Networks

    Abstract: A neural network ensemble is a collaborative learning paradigm that utilizes multiple neural networks to solve a complex learning problem. Constructing predictive models with high generalization performance is an important yet highly challenging goal for robust intelligence systems. Given a target learning task, popular approaches have been dedicated to finding the top-performing model. However, it is difficult in general to estimate the best model when available data is finite and may not be sufficient for the problem. In this keynote, I will give an overview of a diversity-centric ensemble learning framework developed at Georgia Tech, including methodologies and algorithms for measuring, enforcing, and combining multiple neural networks by improving generalization performance of the overall system and maximizing ensemble utility.


Assoc. Prof. Bingsheng He

National University of Singapore, Singapore

Dr. Bingsheng He is currently a Dean’s Chair Associate Professor and Vice-Dean (Research) at the School of Computing, National University of Singapore. Before that, he was a faculty member at Nanyang Technological University, Singapore (2010-2016), and held a research position in the System Research group of Microsoft Research Asia (2008-2010), where his major research was building high performance cloud computing systems for Microsoft. He received his bachelor's degree from Shanghai Jiao Tong University (1999-2003) and his Ph.D. degree from Hong Kong University of Science & Technology (2003-2008). His current research interests include cloud computing, database systems and high performance computing. His papers are published in prestigious international journals (such as ACM TODS and IEEE TKDE/TPDS/TC) and proceedings (such as ACM SIGMOD, VLDB/PVLDB, ACM/IEEE SuperComputing, ACM HPDC, and ACM SoCC). His awards include the IBM Ph.D. Fellowship (2008), the NVIDIA Academic Partnership (2011), the Adaptive Compute Research Cluster from Xilinx (2020) and ACM Distinguished Member (class of 2020). Since 2010, he has (co-)chaired a number of international conferences and workshops, including IEEE CloudCom 2014/2015, BigData Congress 2018 and ICDCS 2020. He has served on the editorial boards of international journals, including IEEE Transactions on Cloud Computing (IEEE TCC), IEEE Transactions on Parallel and Distributed Systems (IEEE TPDS), IEEE Transactions on Knowledge and Data Engineering (TKDE), Springer Journal of Distributed and Parallel Databases (DAPD) and ACM Computing Surveys (CSUR).

  • Keynote Speech Abstract

    Speech Title: Parallel Graph Processing Systems on Heterogeneous Architectures

    Abstract: Graphs are de facto data structures for many data processing applications, and their volume is ever growing. Many graph processing tasks are computation-intensive and/or memory-intensive. Therefore, we have witnessed a significant amount of effort in accelerating graph processing tasks with heterogeneous architectures like GPUs, FPGAs and even ASICs. In this talk, we will first review the literature on large graph processing systems on heterogeneous architectures. Next, we present our research efforts and demonstrate the significant performance impact of hardware-software co-design on high performance graph computation systems and applications. Finally, we outline a research agenda on challenges and opportunities in the system and application development of future graph processing. More details about our research can be found at

Copyright © The 6th International Conference on Data Storage and Data Engineering (DSDE 2023) All rights reserved.