2022 7th International Conference on Big Data and Computing
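May 27-29, 2022 | Virtual Conference
On May 28-29, 2022, the 7th International Conference on Big Data and Computing was successfully held as an online conference. Papers accepted by ICBDC 2022 were published in the ICBDC 2022 proceedings by ACM (ISBN: 978-1-4503-9609-7). The proceedings have been included in the ACM Digital Library and were indexed by EI Compendex and Scopus about five months after the conference.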
Opening Remarks by General Chair:
Prof. Jianqiang Li, Shenzhen University, China
Keynote: Prof. Tok Wang Ling,
National University of Singapore, Singapore
Keynote: Prof. Hong Shen,
Sun Yat-sen University, China
Excellent Oral Presentation Winners of ICBDC 2022
Future of Artificial Intelligence and Machine learning in Marketing 4.0
Presenter: Rajeswari Sundararajan, Vellore Institute of Technology, India
Traffic Data Prediction with Geometric Algebra Spatial-Temporal Attention Neural Network
Presenter: Yongjie Ding, Tongji University, China
KARL: A Cost-effective Routing Algorithm in Fault Tolerant 3D Network-on-Chip via Kmeans Assisted Reinforcement Learning
Presenter: Lujian Chen, Shanghai Maritime University, China
Keynote Speakers of ICBDC 2022
Prof. Tok Wang Ling
National University of Singapore, Singapore
Dr. Tok Wang Ling is a professor in the Department of Computer Science at the National University of Singapore. He was Head of the IT Division, Deputy Head and Acting Head of the Department of Information Systems and Computer Science, and Vice Dean of the School of Computing of the University. Before joining the University as a lecturer in 1979, he was a member of the scientific staff at Bell Northern Research, Ottawa, Canada. He has served on the program committees of more than 160 international database conferences, including VLDB, CIKM, EDBT, ER, DASFAA, DOOD, DEXA, CoopIS, DOLAP, DaWaK, ADC, and BigComp. He organized and served as a program committee co-chair of 6 international database conferences, namely DASFAA 1995, DOOD 1995, ER 1998, WISE 2002, ER 2003, and ER 2011. He organized and served as a conference co-chair of several conferences and workshops, namely HSI 2001, 2003, and 2005, WAIM 2004, ER 2004, DASFAA 2005, DASFAA 2006 (Honorary Chair), SIGMOD 2007, VLDB 2010, DASFAA 2014 (Honorary Chair), and BigComp 2015. He has served on the steering committees of 5 international conferences, namely ER, DASFAA, DOOD, HSI, and BigComp. He was the steering committee chair of both ER and DASFAA and is currently the steering committee chair of BigComp. He is an editorial board member of several international journals, namely Data & Knowledge Engineering, Journal of Database Management, Journal of Data Semantics, Journal of Information and Data Management, and Journal of Computing Science and Engineering (International Advisory Board member). He is an ER Fellow, an ACM Distinguished Scientist, an IEEE Senior Life Member, and a Fellow of the Singapore Computer Society. He received the ACM Recognition of Service Award in 2007, the DASFAA Outstanding Contributions Award in 2010, and the Peter P. Chen Award in 2011.
Speech Title: Conceptual Modeling Views on Relational Databases vs Big Data
Abstract: Object class, relationship type, and attribute (of an object class or of a relationship type) are the three basic concepts of the Entity Relationship Model. Together they are termed ORA-semantics. Without knowledge of the ORA-semantics of a database, the quality of results in some database areas, such as keyword search and data/schema integration, is low. Neither the traditional database models nor the big data models can capture ORA-semantics explicitly.
We briefly present some restrictions of the relational model and performance issues of RDBMSs, such as: a fixed schema with flat relations; no complex data types or multi-valued attributes; normal forms; the Universal Relation Assumption; the use of FDs and MVDs to capture integrity constraints but not ORA-semantics; no class hierarchy or inheritance; the use of joins to process SQL programs; the use of ACID to handle online transactions; the use of persistent disk storage to store databases; and limited parallel processing capabilities.
We briefly discuss the four data models of NoSQL databases for big data applications. We compare the relational database model and the big data models using a set of characteristics. We describe some existing database techniques and concepts that can be used to improve the performance of certain database applications in RDBMSs and big data stores, such as materialized views, strong and weak FDs/MVDs, and partitioning of data for parallel processing.
We further present some important concepts involved in data and schema integration, such as primary key vs object identifier (OID), local OID vs global OID, entity resolution vs relationship resolution, schematic discrepancy, etc. Data integration is an important issue in big data applications because they access and integrate data from different data sources which may use different data models. ORA-semantics is essential for correctness of data integration; however, it cannot be captured by the schema and it also cannot be automatically discovered from data alone.
We summarize a set of criteria to help designers decide whether to use SQL or NoSQL for their applications.
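As a small illustration of one concept named in the abstract, the sketch below checks whether a functional dependency (FD) holds in a flat relation. It is not taken from the talk: the `enrolment` table, its column names, and the helper `fd_holds` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, Iterable, Sequence, Tuple

Row = Dict[str, Hashable]


def fd_holds(rows: Iterable[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in rows.

    The FD holds when any two rows that agree on every lhs attribute
    also agree on every rhs attribute.
    """
    seen: Dict[Tuple, Tuple] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value;
        # a later, different rhs value violates the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical flat relation (illustrative data only).
enrolment = [
    {"student_id": 1, "name": "Ann", "course": "DB", "grade": "A"},
    {"student_id": 1, "name": "Ann", "course": "AI", "grade": "B"},
    {"student_id": 2, "name": "Bob", "course": "DB", "grade": "A"},
]

print(fd_holds(enrolment, ["student_id"], ["name"]))   # True
print(fd_holds(enrolment, ["student_id"], ["grade"]))  # False
```

The check confirms that `name` is determined by `student_id`, but it says nothing about whether `name` belongs to an object class or `grade` to a relationship type. That is the kind of ORA-semantics the abstract says FDs do not capture.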
Prof. Hong Shen (沈鸿教授)
China National Endowed Expert, Sun Yat-sen University, China
Hong Shen is a specially-appointed Professor at Sun Yat-sen University, China, where he was the founding Director of SYSU's Institute for Advanced Computing. He is also an Adjunct Professor at the University of Adelaide, Australia, where he was a tenured Professor (Chair of Computer Science) for 15 years. He received his BS degree from Beijing University of Science and Technology, his MS degree from the University of Science and Technology of China, and his PhD degree from Abo Akademi University, Finland. With main research interests in parallel and distributed computing, privacy-preserving computing, and high-performance networks, he has led numerous research centers and projects in different countries. He has published 400+ papers, including over 100 papers in major international journals such as a variety of IEEE and ACM transactions. Prof. Shen has received many honors and awards, and has served in various roles in professional societies, journal editorial boards, and conference committees.
Speech Title: Bridge HPC and AI: Machine-Learning Enhanced Scheduling of High-Performance Computing Jobs
Abstract: How to effectively schedule massive numbers of jobs from different users is a core issue in today's high-performance computing datacenter operations, and it is especially important for managing shared resources in a cloud computing environment. High-performance computing jobs are characterized by large scale and complex correlations among computation tasks. The key to job scheduling is to optimize the scheduling of both the computation tasks in a job and the parallel data transmission flows (coflows) among the tasks according to their data correlations. Because these problems are NP-hard, various greedy, heuristic, and machine-learning-based approaches have been proposed to obtain sub-optimal solutions. In view of the performance bottlenecks of the existing scheduling methods, in this talk, as an example of bridging HPC and AI, I will present our recent work on combining different models and computation paradigms of greedy strategies and machine learning techniques for scheduling high-performance computing jobs. For computation task scheduling, I will first introduce our method of combining search with regression in the Bayesian optimization framework to accommodate jobs with weighted completion time for offline scheduling. I will then present our method of integrating greedy optimization with reinforcement learning to improve the overall performance guarantee for online scheduling. For coflow scheduling, I will first introduce our method of combining a neural network with a meta-learning mechanism to balance the minimization of coflow completion time against scheduling fairness for offline scheduling. Next, I will present our method of applying a pipelined graph neural network and deep reinforcement learning with a self-attention mechanism to improve scheduling efficiency without compromising quality for online scheduling.
Finally, I will discuss how to promote the application of AI techniques in high-performance computing in general, as well as the integration of HPC and AI computing, an emerging trend that is important to both fields.
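To make the notion of a greedy strategy for weighted completion time concrete, here is a minimal sketch of a classic textbook rule (Smith's rule, also called weighted shortest processing time first) for a single machine. It is not the speaker's method: it ignores task correlations, coflows, and learning entirely. The job names, processing times, and weights are hypothetical.

```python
from typing import List, Tuple

# (name, processing_time, weight); weights are assumed to be positive.
Job = Tuple[str, float, float]


def smith_order(jobs: List[Job]) -> List[Job]:
    """Order jobs by non-decreasing processing_time / weight (Smith's rule)."""
    return sorted(jobs, key=lambda job: job[1] / job[2])


def weighted_completion_time(schedule: List[Job]) -> float:
    """Sum of weight * completion time when jobs run back to back in order."""
    clock = 0.0
    total = 0.0
    for _, processing_time, weight in schedule:
        clock += processing_time  # completion time of this job
        total += weight * clock
    return total


# Hypothetical jobs (illustrative data only).
jobs = [("a", 3.0, 1.0), ("b", 1.0, 2.0), ("c", 2.0, 2.0)]
order = smith_order(jobs)

print([name for name, _, _ in order])    # ['b', 'c', 'a']
print(weighted_completion_time(order))   # 14.0
print(weighted_completion_time(jobs))    # 23.0 (original order)
```

On a single machine with no precedence constraints this ordering is optimal. The harder settings the abstract describes, with correlated tasks and coflows, are where such simple rules fall short and heuristic or learning-based methods come in.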