1. Title: Big Data Incremental Learning
Professor Zhi-Hua Zhou
Nanjing University
Time: 2016.9.24 9:00-10:00
Venue: Xi'an Jiaotong-Liverpool International Conference Center, Huijie Shengdi Hall (慧杰圣地厅)
Abstract: Traditional learning approaches usually try to collect all available data and then train a model. In big data applications, however, the data usually arrive in an accumulating or streaming fashion. Thus, it is more desirable to do incremental learning rather than training a new model from scratch when receiving new data. It is noteworthy that some important losses used in machine learning are quite challenging for incremental optimization. Moreover, in addition to new training samples, new classes may also occur. In this talk we will introduce some studies along this direction.
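As a rough illustration of the contrast drawn in the abstract (updating an existing model as data arrive instead of retraining from scratch), here is a minimal sketch. It is not the speaker's method: it assumes plain stochastic gradient descent on the logistic loss, which is one of the easy cases for incremental optimization, and the names IncrementalLogisticRegression, partial_fit, and stream_batches are hypothetical.

```python
import math


class IncrementalLogisticRegression:
    """Binary logistic regression updated one batch at a time with SGD (illustrative only)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, batch):
        """Update the current model with a new batch of (x, y) pairs, y in {0, 1}."""
        for x, y in batch:
            error = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
            for i, xi in enumerate(x):
                self.weights[i] -= self.learning_rate * error * xi
            self.bias -= self.learning_rate * error


def stream_batches():
    """Hypothetical data stream: the label is 1 when the single feature is positive."""
    yield [([-2.0], 0), ([1.5], 1)]
    yield [([-1.0], 0), ([2.5], 1)]
    yield [([-3.0], 0), ([0.5], 1)]


model = IncrementalLogisticRegression(n_features=1)
for batch in stream_batches():
    model.partial_fit(batch)  # earlier batches are never revisited
```

Each batch only adjusts the existing weights, so the cost of absorbing new data does not grow with the amount of data already seen. Losses that do not decompose over individual samples, and newly appearing classes, are not handled by this sketch.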
Short Biography: Zhi-Hua Zhou is a Professor and Founding Director of the LAMDA Group at Nanjing University. He authored the book "Ensemble Methods: Foundations and Algorithms", and has published more than 100 papers in top-tier journals and conference proceedings. His work has received more than 22,000 citations, with an h-index of 71. He also holds 14 patents and has good experience in industrial applications. He has received various awards, including the National Natural Science Award of China, the IEEE CIS Outstanding Early Career Award, the Microsoft Professorship Award, 12 international journal/conference paper/competition awards, etc. He serves as the Executive Editor-in-Chief of Frontiers of Computer Science, Associate Editor-in-Chief of Science China, and Associate Editor of ACM TIST, IEEE TNNLS, etc. He founded ACML (Asian Conference on Machine Learning) and served as General Chair of ICDM’16, PAKDD’14, etc., Program Chair of IJCAI’15 Machine Learning track, SDM’13, etc. He also serves as an Advisory Committee member for IJCAI 2015-2016, and a Steering Committee Member of PAKDD and PRICAI. He is a Fellow of the AAAI, IEEE, IAPR, IET/IEE, CCF, and an ACM Distinguished Scientist.
2. Title: Small and Sweet MapReduce Algorithms
Professor Yufei Tao
The University of Queensland
Time: 2016.9.25 9:00-10:00
Venue: Xi'an Jiaotong-Liverpool International Conference Center, Huijie Shengdi Hall (慧杰圣地厅)
Abstract: MapReduce has grown into a mature and powerful paradigm for large-scale parallel computing. This keynote will introduce principles for designing algorithms on this paradigm that are both (i) small, i.e., they can be implemented in a real system with reasonable effort, and (ii) sweet, i.e., they possess strong theoretical performance guarantees. Assuming little prior knowledge, we will start with the definition of the massively parallel computation (MPC) model, which has nowadays become a popular model in the database community for studying MapReduce algorithms. We will then move on to discuss MPC algorithms that can solve several fundamental database problems (in particular, sorting and joins) optimally. The talk will end with several open problems that the speaker finds exciting.
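To give a flavor of sorting in an MPC-style setting, the sketch below simulates a sample-based sort over a handful of in-memory "machines". It is only an illustration under stated assumptions, not the algorithm presented in the talk: it uses the familiar sample-and-partition idea, makes no claim about load balance or optimality, and the function name mpc_sample_sort and the parameter sample_size are hypothetical.

```python
import bisect
import random


def mpc_sample_sort(machines, sample_size=4, seed=0):
    """Simulate a sample-based parallel sort; `machines` is a list of per-machine lists."""
    rng = random.Random(seed)
    p = len(machines)

    # Round 1: every machine contributes a small random sample of its data.
    sample = []
    for local in machines:
        sample.extend(rng.sample(local, min(sample_size, len(local))))
    sample.sort()

    # Pick p - 1 splitters that cut the sorted sample into p key ranges.
    splitters = [sample[(i * len(sample)) // p] for i in range(1, p)] if sample else []

    # Round 2: route each element to the machine that owns its key range.
    buckets = [[] for _ in range(p)]
    for local in machines:
        for value in local:
            buckets[bisect.bisect_right(splitters, value)].append(value)

    # Each machine sorts locally; the ranges themselves are already in order.
    return [sorted(bucket) for bucket in buckets]


machines = [[9, 3, 7], [1, 8, 2], [6, 5, 4]]
sorted_parts = mpc_sample_sort(machines)
flattened = [v for part in sorted_parts for v in part]  # globally sorted sequence
```

The point of the sketch is the round structure: a constant number of communication rounds, with each machine doing only local work in between. How evenly the splitters divide the data depends on the sample, which is where the theoretical analysis of such algorithms comes in.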
Short Biography: Yufei Tao is a Professor at the School of Information Technology and Electrical Engineering, the University of Queensland (UQ). Prior to joining UQ, he held professorial positions at the City University of Hong Kong, the Chinese University of Hong Kong (CUHK), and the Korea Advanced Institute of Science and Technology (KAIST). He served as an associate editor of ACM Transactions on Database Systems (TODS) from 2008 to 2015, and of IEEE Transactions on Knowledge and Data Engineering (TKDE) from 2012 to 2014. He was a PC chair of the International Conference on Data Engineering (ICDE) 2014, and of the International Symposium on Spatial and Temporal Databases (SSTD) 2011. He was a keynote speaker at the International Conference on Database Theory (ICDT) 2016, and a winner of the SIGMOD best paper award in 2013 and 2015.
3. Title: Inference of Social Relationships from Location Data
Professor Cyrus Shahabi
University of Southern California (USC)
Time: 2016.9.24 13:30-14:30
Venue: Xi'an Jiaotong-Liverpool International Conference Center, Huijie Shengdi Hall (慧杰圣地厅)
Abstract: For decades, social scientists have been studying people's social behaviors by utilizing sparse datasets obtained by observations and surveys. These studies received a major boost in the past decade due to the availability of web data (e.g., social networks, blogs and review web sites). However, due to the nature of the utilized dataset, these studies were confined to behaviors that were observed mostly in the virtual world. Differing from all the earlier work, here, we aim to study social behaviors by observing people's behaviors in the real world. This is now possible due to the availability of large high-resolution spatio-temporal location data collected by GPS-enabled mobile devices through mobile apps (Google's Map/Navigation/Search/Chrome, Facebook, Foursquare, WhatsApp, Twitter) or through online services, such as geo-tagged contents (tweets from Twitter, pictures from Instagram, Flickr or Google+ Photo), etc.
In particular, we focus on inferring two specific social measures: 1) pair-wise strength: the strength of social connections between a pair of users, and 2) pair-wise influence: the amount of influence that an individual exerts on another, by utilizing the available high-fidelity location data representing people's movements.
Finally, we argue that due to the sensitivity of location data and user privacy concerns, these inferences cannot be carried out on a large scale on individually contributed data without privacy guarantees. Hence, we discuss open problems in protecting individuals' location information while enabling these inference analyses.
Short Biography: Cyrus Shahabi is a Professor of Computer Science and Electrical Engineering and the Director of the Information Laboratory (InfoLAB) at the Computer Science Department and also the Director of the NSF's Integrated Media Systems Center (IMSC) at the University of Southern California (USC). He is also the director of Informatics at USC's Viterbi School of Engineering. He was the CTO and co-founder of a USC spin-off, Geosemble Technologies, which was acquired in July 2012. Since then, he founded another company, ClearPath (recently rebranded as TallyGo), focusing on predictive path-planning for car navigation systems. He received his B.S. in Computer Engineering from Sharif University of Technology in 1989 and then his M.S. and Ph.D. degrees in Computer Science from the University of Southern California in May 1993 and August 1996, respectively. He authored two books and more than two hundred research papers in the areas of databases, GIS and multimedia, and holds more than 12 US patents.
Dr. Shahabi was an Associate Editor of IEEE Transactions on Parallel and Distributed Systems (TPDS) from 2004 to 2009, IEEE Transactions on Knowledge and Data Engineering (TKDE) from 2010 to 2013 and VLDB Journal from 2009 to 2015. He is currently on the editorial board of the ACM Transactions on Spatial Algorithms and Systems (TSAS) and ACM Computers in Entertainment. He is the founding chair of the IEEE NetDB workshop and also the general co-chair of SSTD'15, ACM GIS 2007, 2008 and 2009. He chaired the nomination committee of ACM SIGSPATIAL for the 2011-2014 terms. He is a PC co-Chair of BigComp'2016 and MDM'2016. In the past, he has been PC co-chair of DASFAA 2015, IEEE MDM 2013 and IEEE BigData 2013, and regularly serves on the program committee of major conferences such as VLDB, ACM SIGMOD, IEEE ICDE, ACM SIGKDD, IEEE ICDM, and ACM Multimedia.
Dr. Shahabi is a fellow of IEEE, and a recipient of the ACM Distinguished Scientist award in 2009, the 2003 U.S. Presidential Early Career Award for Scientists and Engineers (PECASE), the NSF CAREER award in 2002, and the 2001 Okawa Foundation Research Grant for Information and Telecommunications.
4. Title: Real-Time Analytics and Visualization on Large-Scale Spatial-Temporal-Textual Data
Professor Chen Li
University of California, Irvine
Time: 2016.9.24 10:30-12:00
Venue: Xi'an Jiaotong-Liverpool International Conference Center, Huijie Shengdi Hall (慧杰圣地厅)
Abstract: We are developing a system called Cloudberry to support analytics and visualization on large data sets with spatial, temporal, and textual attributes, such as social media data and query logs. It supports aggregation queries on various types of attributes, and allows efficient data exploration at different granularities (e.g., state, county, and city). It also supports real-time analytics, which can allow applications to monitor “what’s happening now.” To achieve a high speed, it includes an intelligent middleware for view materialization and cache management. As a general-purpose solution for large data sets, it uses the Apache AsterixDB big data management system that provides rich features and high performance, such as various indexes and data feeds. In this talk, we will give an overview of the system, our initial results, and open challenges in this direction. A live demonstration using tweets is available at //cloudberry.ics.uci.edu/ .
Short Biography: Chen Li is a professor in the Department of Computer Science at UC Irvine. He received his Ph.D. degree in Computer Science from Stanford University, and his M.S. and B.S. in Computer Science from Tsinghua University, China. His research interests are in the field of data management, including data cleaning, data integration, data-intensive computing, and text analytics. He was a recipient of an NSF CAREER Award, several test-of-time publication awards, and many other grants and industry gifts. He was once a part-time Visiting Research Scientist at Google. He founded a company, SRCH2, to develop an open source search engine with high performance and advanced features.
5. Title: Towards Interactive and Social-Aware Spatial Query Services
Professor Jianliang Xu
Hong Kong Baptist University
Time: 2016.9.25 10:30-12:00
Venue: Xi'an Jiaotong-Liverpool International Conference Center, Huijie Shengdi Hall (慧杰圣地厅)
Abstract: Location-based services (LBS) have been gaining in prominence, with about 40% of the world's population using smartphones today. As such, there is a growing need to continuously advance spatial database research for emerging LBS applications, which pose new challenges as well as new opportunities. For example, the convergence of location data and social media has enabled a new class of geo-social queries that combine location and social factors in query processing. In addition, to enhance system usability and user experience, it is important to support instantaneous and interactive responses to queries. In this talk, we will present several of our recent efforts on geo-social queries and "why-not"/"what-if" interactive queries that are aimed at improving the functionality, usability, and performance of spatial query services. We will also discuss some possible future research directions.
Short Biography: Jianliang Xu is a Professor in the Department of Computer Science, Hong Kong Baptist University (HKBU). He received the BEng degree from Zhejiang University and the PhD degree from Hong Kong University of Science and Technology. He held visiting positions at Pennsylvania State University and Fudan University. His current research interests include data management, database security & privacy, and location-aware computing. He has published more than 150 technical papers in these areas, most of which appeared in leading journals and conferences including SIGMOD, VLDB, ICDE, TODS, TKDE and VLDBJ, with an h-index of 38 (Google Scholar). He was a recipient of the IEEE ICDE Outstanding Reviewer Award (2010) and the HKBU Faculty Performance Award for Outstanding Young Researcher (2012). He has served as a program co-chair/vice chair for a number of major international conferences including IEEE ICDCS 2012, IEEE CPSNA 2015 and WAIM 2016. He is an Associate Editor of IEEE Transactions on Knowledge and Data Engineering (TKDE).
6. Title: Data Science for Epidemic Computing
Professor Kun-Ta Chuang
National Cheng Kung University
Time: 2016.9.24 10:30-12:00
Venue: Xi'an Jiaotong-Liverpool International Conference Center, Huijie Hall 6 (慧杰6号厅)
Abstract: Controlling the spread of epidemics has been a critical challenge for authorities in recent decades. As people move to live in urban areas, crowding inevitably increases the outbreak probability of contagious diseases such as flu and dengue fever. To prevent infections from getting out of control, it is necessary to develop new technologies that predict and evaluate prevention results as intervention strategies are dynamically deployed over time.
In this tutorial, we will introduce some mechanisms from data science and discuss how they were extended and applied to outbreak control during the spread of dengue fever in Taiwan in 2015. We will also discuss the intervention procedure in Taiwan and show how to incorporate data mining ideas for epidemic computing into the government's decision-making process. The audience will learn the basic concepts of public health and how to devise new computational algorithms for this critical challenge.
Short Biography: Kun-Ta Chuang currently serves as an assistant professor in the Department of Computer Science and Information Engineering at National Cheng Kung University. He was a senior engineer at EDA giant Synopsys during 2006-2011. He received the Ph.D. degree from the Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan in 2006. His research interests include data mining, web technology, mobile data management, and cloud computing.
7. Title: Workload-Aware Resource Management Technologies for Improving Server Performance
Professor Hyeonsang Eom
Seoul National University
Time: 2016.9.25 10:30-12:00
Venue: Xi'an Jiaotong-Liverpool International Conference Center, Huijie Hall 6 (慧杰6号厅)
Abstract: Datacenters, where various sorts of servers may run, have been growing larger and more heterogeneous, and may be highly distributed. It is crucial to manage many heterogeneous resources effectively in order to provide services efficiently and cost-effectively; it is necessary to allocate the “right” resources to Virtual Machines (VMs) in virtualized datacenters in order to decrease the cost of operation while meeting SLAs (Service Level Agreements), such as latency requirements. One of the most effective ways to allocate the “right” resources to a VM would be to consider the characteristics of the VM, such as the memory intensiveness of the workload executed in the VM. However, existing schedulers, including the Nova scheduler of OpenStack and DRS (Distributed Resource Scheduler) of VMware, do not consider these kinds of characteristics. In this tutorial, I will explain some workload-aware schedulers, including our own, which schedules VMs on OpenStack clusters of nodes while considering the characteristics of the workloads executed in the VMs. Our experimental study with Redis and Memcached, which may cache the data and links of Web servers, shows that our memory-intensiveness-aware scheduler may outperform both the default scheduler of OpenStack and DRS in terms of throughput and latency.
Short Biography: Hyeonsang Eom received the BS degree in computer science and statistics from Seoul National University (SNU), Seoul, Korea, in 1992, and the MS and PhD degrees in computer science from the University of Maryland at College Park, Maryland, USA, in 1996 and 2003, respectively. He is currently an associate professor in the Department of Computer Science and Engineering at SNU, where he has been a faculty member since 2005. He was an intern in the data engineering group at Sun Microsystems, California, USA, in 1997, and a senior engineer in the Telecommunication R&D Center at Samsung Electronics, Korea, from 2003 to 2004. His research interests include distributed systems, cloud computing, operating systems, high performance storage systems, energy efficient systems, fault-tolerant systems, security, and information dynamics.