暑期徐金辉系列学术报告预报
时间: 2015-07-15 发布者: 文章来源: 博彩平台 审核人: 浏览次数: 564

报告人简介:

专长和学术成就:算法设计,计算几何,组合优化,及他们在医学图像,治疗规划,及诊断 ,生物 , 网络与移动计算,大规模集成电路设计 等方面的应用。在这些方向的国际期刊和会议上发表了140余篇论文,大部分出现在国际顶级期刊和会议中。解决了几个长期公开的问题和猜想。推广了一类最基本的几何结构和经典问题。

工作单位及主要简历:于1992年和1995获得中国科技大学,计算机科学本科和硕士学位。于2000年获得美国圣母大学(University of Notre Dame)计算机科学与工程博士学位。同年获美国纽约州立大学(布法罗)计算机科学与工程系助理教授职位,现为该系终身正教授。

 

 

1. 报告题目: Clustering-Based Collaborative Filtering for Link Prediction

  报告时间:2015年7月6日下午14:00

  报告地点:理工楼555

 

Abstract:In this paper, we propose a novel collaborative filtering approachfor predicting the unobserved links in a network (orgraph) with both topological and node features. Our approachimproves the well-known compressed sensing basedmatrix completion method by introducing a new multipleindependent-Bernoulli-distribution model as the data samplingmask. It makes better link predictions since the modelis more general and better matches the data distributions inmany real-world networks, such as social networks like Facebook.As a result, a satisfying stability of the prediction canbe guaranteed. To obtain an accurate multiple-independent-Bernoulli-distribution model of the topological feature space,our approach adjusts the sampling of the adjacency matrixof the network (or graph) using the clustering informationin the node feature space. This yields a better performancethan those methods which simply combine the two types offeatures. Experimental results on several benchmark datasetssuggest that our approach outperforms the best existing linkprediction methods.

 

 

 

2.Title: A Unified Framework for Clustering Constrained Data without Locality Property

  报告时间:2015年7月8日下午14:00

  报告地点:理工楼555

 

Abstract: In this paper, we consider a class of constrained clustering problems of points in R^d space, where d could be rather high. A common feature of these problems is that their optimal clusterings no long have the locality property (due to the additional constraints), which is a

key property required by many algorithms for their unconstrained counterparts. To overcome the difficulty caused by the loss of locality, we present in this paper a unified framework, called

Peeling-and-Enclosing, to iteratively solve two variants of the constrained clustering problems, constrained k-means clustering (k-CMeans) and constrained k-median clustering (k-CMedian). Our framework is based on two standalone geometric techniques, called Simplex Lemma and Weaker Simplex Lemma, for k-CMeans and k-CMedian, respectively. Simplex lemma (or weaker simplex lemma) enables us to efficiently approximate the mean (or median) point of an unknown set of points by searching a small-size grid, independent of the dimensionality of the space, in a simplex (or the surrounding region of a simplex), and thus can be used to handle high dimensional data. With these techniques, our framework generates, in nearly linear time (i.e., O(nd(log n)^(k+1))), O((log n)^k) k-tuple candidates for the k mean or median points, and one of them induces a (1+epsilon)-approximation for k-CMeans or k-CMedian, where n is the number of points. Combining this unified framework with a problem-specific selection algorithm (which determines the best k-tuple candidate), we obtain a (1 + epsilon)-approximation for each of the constrained clustering problems. Our framework improves considerably the best known results for these problems. We expect that our technique will be applicable to other constrained clustering problems without locality.

 

 

 

3. Title: Finding Median Point-Set Using Earth Mover’s Distance

  报告时间:2015年7月13日下午14:00

  报告地点:理工楼555

 

Abstract: called Median Point-Set, whose objective is to constructa prototype for a set of given point-sets so as to minimizethe total Earth Mover’s Distances (EMD) betweenthe prototype and the point-sets, where EMD betweentwo point-sets is measured under affine transformation.For this problem, we present the first purely geometricapproach. Comparing to existing graph-based approaches(e.g., median graph, shock graph), our approachhas several unique advantages: (1) No encodingand decoding procedures are needed to map betweenobjects and graphs, and therefore avoid errors causedby information losing during the mappings; (2) Stayingonly in the geometric domain makes our approachcomputationally more efficient and robust to noise. Weevaluate the performance of our technique for prototypereconstruction on a random dataset and a benchmarkdataset, handwriting Chinese characters. Experimentssuggest that our technique

 

 

4. Title: On the Connectivity Preserving Minimum Cut Problem

  报告时间:2015年7月15日下午14:00

  报告地点:理工楼555

 

Abstract: In this paper, we study a generalization of the classical minimum cut problem, called Connectivity Preserving Minimum Cut (CPMC) problem, whichseeks a minimum cut to separate a pair (or pairs) of source and destinationnodes and meanwhile ensure the connectivity between the source and itspartner node(s). The CPMC problem is a rather powerful formulation for aset of problems and finds applications in many other areas, such as networksecurity, image processing, data mining, pattern recognition, and machinelearning. For this important problem, we consider two variants, connectivity preserving minimum node cut (CPMNC) and connectivity preservingminimum edge cut (CPMEC). For CPMNC, we show that it cannot be approximated within αlogn for some constant α unless P=NP, and cannot beapproximated within any poly(logn) unless NP has quasi-polynomial timealgorithms. The hardness results hold even for graphs with unit weight andbipartite graphs. Particularly, we show that polynomial time solutions existfor CPMEC in planar graphs and for CPMNC in some special planar graphs.The hardness of CPMEC in general graphs remains open, but the polynomialtime algorithm in planar graphs still has important practical applications.

 

 

 

5. Title: Sub-linear Time Hybrid Approximations for Least TrimmedSquares Estimator and Related Problems

  报告时间:2015年7月20日下午14:00

  报告地点:理工楼555

 

Abstract: Least Trimmed Squares (LTS) estimator is a statistical tool for estimating how well a set ofpoints fits a hyperplane. As a robust alternative to the classical least squares estimator, LTS takes asinput a set P of n points in Rd and a fitting parameter m £ n, and computes a non-vertical hyperplane Hso that the sum of the m smallest squared vertical distances from P to H is minimized. Previous researchhas indicated that although solving LTS (exactly or approximately) could be quite costly (i.e., it maytake W(nd-1) time to even approximate it when m/n is a positive constant c < 1), a hybrid version ofapproximation, which is a bi-criteria on residual approximation and quantile approximation, can be obtainedin linear time in any fixed dimensional space. In this paper, we further show that an (er; eq)-hybridapproximation of LTS can be computed in sub-linear time, whereer> 0 is the residual approximationratio and 0 <eq< 1 is the quantile approximation ratio. The running time is independent of the inputsize n, when m = Q(n). Comparing to existing result, our approach has quite a few advantages, e.g., ismuch simpler, has better robustness, takes only constant additional space, and can deal with big data(e.g., streaming data). Our result is based on new insights to the problem and several novel techniques,such as recursive slab partition, sequential orthogonal rotation, and symmetric sampling. Our techniquecan also be extended to achieve sub-linear time hybrid approximations for several related problems, suchas data-oblivious computation for LTS in Secure Multi-party Computation (SMC) protocol, LTS on uncertainand range data, and the Orthogonal Least Trimmed Squares (OLTS) problem. It is likely that ourtechnique will be applicable to other shape fitting problems.

 

 

 

6.Title: Random Gradient Descent Tree: A Combinatorial Approach for SVM with Outliers

  报告时间:2015年7月22日下午14:00

  报告地点:理工楼555

 

Abstract: Support Vector Machine (SVM) is a fundamental tech- nique in machine learning. A long time challenge facing SVM is how to deal with outliers (caused by mislabeling), as they could make the classes in SVM non-separable. Existing techniques, such as soft margin SVM, ν-SVM, and Core-SVM, can alleviate the problem to certain extent, but cannot completely resolve the issue. Recently, there are also techniques available for explicit outlier removal. But they suffer from high time complexity and cannot guarantee quality of solution. In this paper, we present a new combinatorial approach, called Random Gradient Descent Tree (or RGD-tree), to explicitly deal with outliers; this results in a new algorithm called RGD-SVM. Our technique yields provably good solution and can be efficiently implemented for practical purpose. The time and space complexities of our approach only linearly depend on the input size and the dimensionality of the space, which are significantly better than existing ones. Experiments on benchmark datasets suggest that our technique considerably outperforms several popular techniques in most of the cases.

 

 

 

7.Title: Influence Based Voronoi Diagrams of Clusters

  报告时间:2015年7月25日下午14:00

  报告地点:理工楼555

 

Abstract: In this paper, we study a generalization of Voronoi diagram, called Influence-based Voronoi Diagram (IVD). The input is a set C = {C1 , C2 , . . . , Cn } of point clusters in R^d and an influence function F (C, q) measuring the influence from a set C of points to any point q in R^d, and the goal is to construct an influence-based Voronoi diagram for C. By extending a recent work called Clustering Induced Voronoi Diagram (CIVD) for unclustered points, we are able to show that it is possible to utilize CIVD’s space-partition ability and combine it with a divide-and-conquer algorithm to simultaneously resolve the space partition and assignment problems for a large class of influence functions. This overcomes a main difficulty of CIVD on the assignment problem. Our technique yields a (1 + epsilon)-approximate IVD with size O(N log N) in O(T_2(N)N log^2 N + T_1(N)) time, where N is the total number of points in C, epsilon > 0 is a small constant, and T_1 and T_2 are functions measuring how efficiently F (C, q) can be evaluated.

 

 

 

8. Title: k-Prototype Learning for 3D Rigid Structures

  报告时间:2015年7月27日下午14:00

  报告地点:理工楼555

 

Abstract: In this paper, we study the following new variant of prototype learning, calledk-prototype learning problem for 3D rigid structures: Given a set of 3D rigidstructures, find a set of k rigid structures so that each of them is a prototype fora cluster of the given rigid structures and the total cost (or dissimilarity) is minimized.Prototype learning is a core problem in machine learning and has a widerange of applications in many areas. Existing results on this problem have mainlyfocused on the graph domain. In this paper, we present the first algorithm for learningmultiple prototypes from 3D rigid structures. Our result is based on a numberof new insights to rigid structures alignment, clustering, and prototype reconstruction,and is practically efficient with quality guarantee. We validate our approachusing two types of data sets, random data and 3D biomedical images. Experimentssuggest that our approach can effectively learn prototypes in both types of data.

 

 

 

9. Title: FPTAS for minimizing earth mover's distance underRigid transformations and related problems

  报告时间:2015年7月29日下午14:00

  报告地点:理工楼555

 

Abstract: In this paper, we consider the problem (denoted as EMDRT) of minimizing theearth mover's distance between two sets of weighted points A and B in a fixed dimensional Rdspace under rigid transformation. EMDRT is an important problem in both theory and applicationsand has received considerable attentions in recent years. Previous research on this problem hasresulted in only constant approximations and it has been an open problem for a long time to achievePTAS solution. In this paper, we present the first FPTAS algorithm for EMDRT. Our algorithmruns roughly in O((nm)d+2(lognm)2d) time (which is close to a lower bound on any PTAS for thisproblem), where n and m are the sizes of A and B, respectively. Our result is based on several newtechniques, such as the Sequential Orthogonal Decomposition (SOD) and Optimum Guided Base(OGB), and can be extended to several related problems, such as the problem of earth mover'sdistance under similarity transformation and the alignment problem, to achieve FPTAS for each ofthem.

 

 

 

10. Title: Gauging Association Patterns of Chromosome Territories via Chromatic Median

  报告时间:2015年8月3日下午14:00

  报告地点:理工楼555

 

Abstract: Computing accurate and robust organizational patternsof chromosome territories inside the cell nucleus is criticalfor understanding several fundamental genomic processes,such as co-regulation of gene activation, gene silencing, Xchromosome inactivation, and abnormal chromosome rearrangementin cancer cells. The usage of advanced fluorescencelabeling and image processing techniques has enabledresearchers to investigate interactions of chromosometerritories at large spatial resolution. The resulting highvolume of generated data demands for high-throughput andautomated image analysis methods. In this paper, we introducea novel algorithmic tool for investigating associationpatterns of chromosome territories in a population of cells.Our method takes as input a set of graphs, one for each cell,containing information about spatial interaction of chromosometerritories, and yields a single graph that contains essentialinformation for the whole population and stands asits structural representative. We formulate this combinatorialproblem as a semi-definite programming and presentnovel techniques to efficiently solve it. We validate our approachon both artificial and real biological data; the experimentalresults suggest that our approach yields a near-optimalsolution, and can handle large-size datasets, whichare significant improvements over existing techniques.