DEPT OF COMPUTER SCIENCE
Researcher : Bi B |
List of Research Outputs |
Bi B., Shang L. and Kao C.M., Collaborative Resource Discovery in Social Tagging Systems, Proceedings of The 18th ACM Conference on Information and Knowledge Management (ACM CIKM 2009). 2009. |
Researcher : Bo P |
List of Research Outputs |
Bo P., Free-Form Surface Modeling with Developables and Cyclides. Hong Kong, 2010. |
Researcher : Chan B |
List of Research Outputs |
Yao C., Wang B., Chan B., Yong J. and Paul J., Multi-Image Based Photon Tracing for Interactive Global Illumination of Dynamic Scenes, Computer Graphics Forum. 2010, 29(4): 1315-1324. |
Researcher : Chan HL |
Project Title: | Energy-efficient on-line scheduling for overloaded systems |
Investigator(s): | Chan HL |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 02/2010 |
Abstract: |
Motivation. Energy efficiency has become a major concern for many electronic devices. One reason is that mobile devices, which are usually powered by batteries, are gaining huge popularity. The energy efficiency of a device directly affects how long it can be used without recharging and hence its actual mobility. Another reason is that companies are now building large server farms with thousands of computers. The cost of powering and cooling these server farms is tremendous. In fact, it is reported that if the energy efficiency of computers remains unchanged, the cost of energy usage in these server farms will soon exceed the cost of the hardware [1]. This project aims at studying the energy issue from a theoretical point of view and designing algorithms that can be mathematically proved to be energy efficient. A popular technique for improving energy efficiency is dynamic voltage scaling. Major chip producers like Intel, AMD and IBM now produce processors whose speeds can be dynamically changed according to the load of the system. Running at a lower speed reduces the rate of energy usage, but at the same time reduces the rate at which jobs are processed. Given a sequence of jobs, scheduling them while maintaining both energy efficiency and user-experienced performance is a non-trivial problem. Our model. We study the scheduling problem using the following model, which has been used in [2,4]. We are given a single processor whose speed can be dynamically adjusted to any value in [0, T], where T is the maximum speed of the processor. When running at speed s, the processor processes jobs at rate s and consumes energy at rate P(s) = s^α, where α is some constant bigger than 1. The value of α is usually 3 due to the cube-root rule of CMOS-based devices [3]. P(s) is called the power function of the processor. Jobs arrive on-line, and each job has an arbitrary arrival time, size and deadline. 
On-line means that the information of a job is known only when it arrives, and the schedule already executed cannot be revoked. We consider the preemptive setting, where a job can be suspended at any time and later resumed from the point of preemption. In general, the system may be overloaded and no algorithm can complete all the jobs by their deadlines. An optimal schedule should maximize the total size of jobs completed by their deadlines, which is known as the throughput, and minimize the energy usage subject to this throughput. Our target is to design an on-line algorithm that is close to the optimal schedule for any input sequence of jobs. Precisely, an on-line algorithm A is said to be c1-competitive on throughput and c2-competitive on energy if, for any input sequence of jobs, the throughput of A is at least 1/c1 times that of the optimal and the energy usage is at most c2 times that of the optimal. c1 and c2 are called the competitive ratios on throughput and energy, respectively. Objectives. Two algorithms, FSA(OAT) and Slow-D, have been proposed for the above problem. FSA(OAT) is 14-competitive on throughput and (α^α + α²·4^α)-competitive on energy [4]. When α = 3, the competitive ratio on energy is 603, which means that the energy usage can be as bad as 603 times that of the optimal. Hence, the performance of the algorithm in terms of energy usage is far from practical. Slow-D is 4-competitive on throughput and (α^α + α²·4^α)-competitive on energy [2]. It can be shown that no on-line algorithm can be better than 4-competitive on throughput; hence Slow-D is the best possible on-line algorithm in terms of throughput. Yet, the competitive ratio on energy remains unchanged. In this project, we aim at an algorithm with a better competitive ratio on energy. In particular, we aim at an algorithm that is 4-competitive on throughput and (O(1))^α-competitive on energy, where (O(1))^α should be a small constant close to 10 when α is 3. 
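The 603 figure quoted above can be checked directly from the formula α^α + α²·4^α; a one-line Python sketch (nothing here beyond the arithmetic stated in the abstract):

```python
def energy_ratio(alpha):
    """Energy competitive ratio of FSA(OAT) and Slow-D: alpha^alpha + alpha^2 * 4^alpha."""
    return alpha ** alpha + alpha ** 2 * 4 ** alpha

# For the cube-root rule (alpha = 3): 3^3 + 9 * 64 = 27 + 576 = 603,
# matching the competitive ratio on energy quoted in the abstract.
print(energy_ratio(3))  # → 603
```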
The above problem assumes that the optimal schedule always maximizes the throughput. It corresponds to the situation where we optimize the user-experienced performance first and then minimize the energy usage subject to this performance. We extend the study to allow a more general trade-off between performance and energy usage. We consider another optimal schedule which, given an input sequence of jobs, optimizes the objective function H − βE, where H is the throughput, E is the energy usage and β is a weight assigned by the users specifying how important the energy usage is compared with the throughput. When β = 0, energy usage is not important and only throughput matters to the objective function. When β is sufficiently large, only the energy usage matters. Our target is to obtain an algorithm that is O(1)-competitive for this objective. Modern computing systems may involve a large number of computers working together. We further extend the study to the multi-processor setting. A scheduling algorithm needs to assign the jobs to the processors and schedule the jobs within each individual processor. We consider the setting where migration, i.e., moving a job from one processor to another, is not allowed. This reduces the overhead of migration and is common for many computing systems. We study both of the above objectives. The target is to obtain algorithms in the multi-processor setting with performance similar to their counterparts in the single-processor setting. |
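Because the power function P(s) = s^α is convex for α > 1, running at a constant speed never uses more energy than varying the speed for the same amount of work in the same time window. A toy numerical check (the job parameters are illustrative assumptions, not from the project):

```python
ALPHA = 3  # cube-root rule

def energy(schedule):
    """Total energy of a schedule given as (speed, duration) pairs; P(s) = s**ALPHA."""
    return sum(s ** ALPHA * t for s, t in schedule)

def work(schedule):
    """Total amount of job processed by the schedule."""
    return sum(s * t for s, t in schedule)

# A job of size 4 completed within 2 time units.
constant = [(2.0, 2.0)]              # speed 4/2 = 2 throughout
varying = [(1.0, 1.0), (3.0, 1.0)]   # same work and duration, uneven speeds

assert work(constant) == work(varying) == 4.0
print(energy(constant), energy(varying))  # → 16.0 28.0 (constant speed wins)
```

This convexity argument is why speed-scaling algorithms try to smooth speeds over time whenever deadlines permit.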
List of Research Outputs |
Bansal N., Chan H.L., Pruhs K. and Katz D., Improved Bounds for Speed Scaling in Devices Obeying the Cube-Root Rule, Automata, Languages and Programming, 36th International Colloquium (ICALP). 2009, 144-155. |
Bansal N., Chan H.L. and Pruhs K., Speed scaling with a solar cell, Theoretical Computer Science. 2009, 410 (45): 4580-4587. |
Bonifaci V., Chan H.L., Marchetti-Spaccamela A. and Megow N., Algorithms and Complexity for Periodic Real-time Scheduling, The Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). 2010, 1350-1359. |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Approximating frequent items in asynchronous data stream over a sliding window, The 7th Workshop on Approximation and Online Algorithms. Springer, 2009, 49 -- 61. |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window., 27th International Symposium on Theoretical Aspects of Computer Science, STACS 2010. LIPIcs, 179-190. |
Chan H.L., Chan J.W.T., Lam T.W., Lee L.K., Mak K.S. and Wong P.W.H., Optimizing Throughput and Energy in Online Deadline Scheduling, ACM Transactions on Algorithms. 2009, 6(1): -. |
Chan H.L., Edmonds J. and Pruhs K., Speed scaling of processes with arbitrary speedup curves on a multiprocessor, Proceedings of the 21st Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA). 2009, 1-10. |
Siu W.Y., Mamoulis N., Yiu S.M. and Chan H.L., A Data-Mining Approach for Multiple Structural Alignment of Proteins, Bioinformation. 2010, 4(8): 366 - 370. |
Researcher : Chan HTH |
Project Title: | Graphs, Information Networks, and Related Algorithms |
Investigator(s): | Chan HTH |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 10/2009 |
Abstract: |
A computer is nowadays rarely used in isolation. The Internet greatly enhances the power of computers because it allows a huge number of machines to communicate with one another. Indeed, computers are not the only objects that exhibit group behavior. Every person behaves like a node in a network, and people are connected to one another through emails, SMS’s, instant messengers, and so on. Recently, blogs and social network websites have gained tremendous popularity. Facebook claims that it has 250 million active users. It is conceivable that understanding the underlying social network is no easier than understanding the structure of the Internet itself. Fortunately, theoretical computer scientists have the right tools to form abstract models in order to analyze such complicated systems and to design algorithms for the problems that arise. On the other hand, people often have selfish wishes, and this complicates the situation further. Again, tools from the game theory community and privacy notions from the security community come to our rescue and enable a better understanding. The main objectives of this project are to develop appropriate models for such networks, understand their structure and mathematical properties, and explore the relevant algorithms. We describe the key issues in more detail below. Models of Graphs and Information Networks. The graphs that arise in real networks are not totally random and arbitrary. For example, in the web graph and many social networks, the fraction of high-degree nodes is much larger than in standard random graph models such as the Erdos-Renyi model, in which any pair of nodes is connected with some fixed probability, independently of any other connections. This heavy-tailed distribution exhibited in reality is known as power-law behavior. 
It has been shown that the preferential attachment model, in which a new node connects to an existing node with probability proportional to its degree, gives rise to graphs that obey the power law. This can be a reasonable model, because a node with a high degree is popular in some sense, and popular nodes are more likely to be connected to. However, it is arguable whether this is a natural model, as a new node can often only see a small portion of the graph and does not know the whole degree distribution of the existing graph. An interesting research direction is to investigate more “natural” models to explain the power-law behavior. Algorithms Adapting to the Growth Rate of Graphs. Apart from the degree distribution, researchers have studied other mathematical properties of graphs. One useful concept is the growth rate of a graph. Recall that shortest paths between nodes induce a notion of distance between nodes. Intuitively, a graph has bounded growth rate if there cannot be many nodes in the graph such that the distance between any two of them is within some fixed range (say between x and 2x for some number x). As an example, a two-dimensional grid has bounded growth rate, while a star has large growth rate. Using the growth rate as an extra parameter gives an algorithm designer more power to design better algorithms, since the performance of algorithms can degrade gracefully with respect to this parameter. For easy instances of problems, in which the growth rate is bounded, a good algorithm should be adaptive and be able to exploit the specific structure to provide better guarantees. Indeed, it has been shown that for a graph with bounded growth rate, the inter-point distances of the nodes can be preserved by another graph with a linear number of edges. Note that in general there exists a family of graphs such that, for any such graph, preserving inter-point distances requires a super-linear number of edges. 
Hence, this is further evidence that graphs with bounded growth rate could be easy instances of some problems. Moreover, Talwar has shown in a 2004 STOC paper that problems like the traveling salesman problem (TSP) and facility location become more tractable if the given problem instances have bounded growth rate. We pursue this research direction and will investigate adaptive algorithms for extensions of similar and related problems, for example distance labeling and compact routing. For distance labeling, the task is to assign short labels to each node so that, given the labels of two nodes, their distance can be estimated accurately. For compact routing, the task is to design a routing scheme so that the header size is small and the packet goes through a route with close to optimal latency. Emergent Behavior in Social Networks. Human beings have more concerns than just efficiency. Needless to say, each person has his own agenda when he participates in a network. With information easily available these days, protecting one’s own privacy has become more important than ever. We propose to study how a network evolves when the involved agents have such concerns. Convergent behavior among different agents can be commonly observed. For example, birds tend to fly in the same direction. When new products come out, consumers’ usage patterns influence one another. The way new information spreads in a network can be analyzed in the same way as how a virus spreads. Both mathematicians and computer scientists have studied models to explain such convergent behavior. Cucker and Smale studied emergent behavior in flocks and how agents reach a consensus without central direction. Kempe, Kleinberg and Tardos investigated models and algorithms for picking a small number of initial nodes targeted for advertisement in order to maximize the spread of a product in the network. Such models assume that communications between agents are truthful. 
However, a consumer might not want to reveal his consumption habits. Recently, the security community proposed a notion known as differential privacy, which can measure how much one’s privacy is compromised through participation in a network. Such a notion provides a clean and formal way to analyze privacy, and it has quickly become popular in the theory community as well. We wish to investigate the convergent behavior and related algorithms encountered in a network in which the differential privacy of each individual is preserved. |
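The differential privacy notion mentioned in the abstract is often realized via the Laplace mechanism: a numeric query answer is released with noise scaled to the query's sensitivity. A minimal sketch (the counting query and the parameter values are illustrative assumptions, not part of the project):

```python
import random

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release true_value plus Laplace(sensitivity/epsilon) noise, which gives
    epsilon-differential privacy for numeric queries of the given sensitivity."""
    scale = sensitivity / epsilon
    # Laplace(0, scale) is the difference of two scaled Exp(1) variables.
    noise = scale * (random.expovariate(1.0) - random.expovariate(1.0))
    return true_value + noise

# A counting query ("how many agents adopted the product?") has sensitivity 1:
# one person joining or leaving the data set changes the answer by at most 1.
noisy_count = laplace_mechanism(250.0, sensitivity=1.0, epsilon=0.5)
```

Smaller epsilon means stronger privacy but more noise; the released value is unbiased, so repeated independent releases would average back toward the true count.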
List of Research Outputs |
Abiteboul S., Chan H.T.H., Kharlamov E., Nutt W. and Senellart P., Aggregate queries for discrete and continuous probabilistic XML, Proceedings of the 13th International Conference on Database Theory. New York, NY, USA, ACM, 2010, 50-61. |
Chan H.T.H. and Elbassioni K., A QPTAS for TSP with Fat Weakly Disjoint Neighborhoods in Doubling Metrics, ACM-SIAM Symposium on Discrete Algorithms (SODA). 2010. |
Chan H.T.H., Gupta A. and Talwar K., Ultra-low-dimensional Embeddings for Doubling Metrics, Journal of the ACM (JACM). New York, NY, USA, ACM, 2010, 57. |
Researcher : Chan K |
List of Research Outputs |
Chan K., An Adaptive Software Transactional Memory Support for Multi-Core Programming. Hong Kong, 2009. |
Researcher : Chan KP |
Project Title: | Marker Identification in Example-Based Machine Translation |
Investigator(s): | Chan KP |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 01/2009 |
Completion Date: | 12/2009 |
Abstract: |
Objective of Research Proposal. Background. There are three main approaches to machine translation (MT), namely rule-based machine translation (RBMT), statistical machine translation (SMT) and example-based machine translation (EBMT). A good summary of the relationship between these approaches has been given by Wu [1]. However, there are very few purist systems, and some kind of linguistic knowledge usually has to be incorporated to enhance the performance of a system. EBMT has been applied to Chinese-English machine translation. Siu, Meng and Wong [2, 3] used parse structures as input to their EBMT system. A semi-automatically induced grammar was obtained from unannotated corpora, and appropriate inflectional forms were selected to enhance the performance of the translation system. Kit, Pan and Webster [4] worked on an EBMT system for Hong Kong legal texts: automatic text alignment was performed, a similarity measure was evaluated between sentence pairs, and finally target sentences were synthesized and smoothed. A similar project was conducted at Carnegie Mellon University, where an EBMT system was built on the bilingual Hong Kong legal code and various other bilingual resources [5]. The researchers used an improved segmenter, a term finder and a statistical dictionary on top of the baseline system. Ma of NCLT, Dublin worked on extracting equivalent chunks from a Chinese-English bilingual corpus to be used by an EBMT system [6, 7]. The marker-based approach [8] has reported the greatest success among the various EBMT approaches. Way and Gough [9] reported that their marker-based EBMT system outperformed a word-based SMT system. Groves and Way [10] showed that their EBMT system achieved higher translation quality than a phrase-based SMT system (PBSMT, which performs better than a word-based SMT system) constructed from freely available resources. 
Hybrid “statistical EBMT” and “example-based SMT” systems [8] were also built, and the hybrid systems performed better than their non-hybrid counterparts. This means that the statistical and example-based approaches have their own advantages and are complementary to each other. Some outstanding issues. One problem with Chinese is word segmentation. Although word segmentation performs reasonably well, any error in segmentation is disastrous. In fact, segmentation was causing problems for the EBMT system used in [5], and its authors adopted an “improved segmenter” in order to obtain better performance: “The mis-segmenting of Chinese words makes it very hard to build a statistical dictionary and properly index the EBMT corpus.” Another problem is that markers in Chinese are not as clear as in many Western languages. English marker words can be identified easily; this is not the case for Chinese. First, many Chinese characters are themselves words with multiple POS tags, and a character may also form words with other characters. For example, 在線 and 在線上 are very different syntactically, although 在 is given the tag of preposition in many applications. A character-based approach, although more complicated, has certain advantages in these respects. Also, because of the very different syntactic structures of Chinese and Western languages, a structure-level EBMT system, such as the one used by Siu, Meng and Wong [2, 3], may be more beneficial than a sentence-level one. Structure acquisition from a parallel corpus with the help of markers, which are obtained by automatic learning from the corpus given the English markers, may be a possible way forward. Objective of the Project. In this proposal, we concentrate on the example-based approach. Among all EBMT systems, the marker-based approach is one of the most successful. According to studies by linguists, markers are language units that are special in revealing syntactic structure. 
Because of the characteristics of the Chinese language, the identification of markers is not trivial. The same English marker may correspond to different Chinese markers, or even to a NULL marker. We propose an automatic identification of markers in Chinese from a bilingual corpus, which is the key process in applying marker-based EBMT. The only inputs are a bilingual corpus, a dictionary and the English markers, which are quite well defined. The proposed system will obtain the corresponding Chinese markers (which may include a NULL marker). Once markers are identified, chunks in sentences can be extracted. With the automatic learning of markers and transfer rules, we will be able to build a high-quality EBMT system efficiently from a parallel corpus, with little human intervention. References: [1] D. Wu, “MT Model Space: Statistical versus Compositional versus Example-based Machine Translation,” Machine Translation, Vol. 19, pp. 213-227, 2005. [2] K. C. Siu & H. Meng, “Semi-Automatic Grammar Induction for Bi-directional English-Chinese Machine Translation,” Proc. 7th European Conf. on Speech Communication & Technology, Scandinavia, Sept. 2001. [3] K. C. Siu, H. Meng & C. C. Wong, “Example-based Bi-directional Chinese-English Machine Translation with Semi-automatically Induced Grammars,” Proc. 8th European Conf. on Speech Communication & Technology, Geneva, Switzerland, Sept. 2003. [4] C. Kit, H. Pan & J. J. Webster, “Example-based Machine Translation: A New Paradigm,” Translation and Information Technology, S. W. Chan ed., pp. 57-78, CUHK Press, 2002. [5] Y. Zhang, R. D. Brown & R. E. Frederking, “Adapting an Example-based Translation System to Chinese,” Proc. of the 1st Int. Conf. on Human Language Technology Research, San Diego, pp. 1-4, 2001. [6] Y. Ma, Automatic Identification and Alignment of Chinese-English Phrases based on Multi-strategies, Master thesis, Tsinghua University, 2006. [7] Y. Ma, N. Stroppa & A. Way, “Alignment-Guided Chunking,” Proc. 
of the 11th Conference on Theoretical and Methodological Issues in Machine Translation (TMI-07), Sweden, September 2007. [8] D. Groves & A. Way, “Hybrid data-driven models of machine translation,” Machine Translation, Vol. 19, pp. 301-323, 2005. [9] A. Way & N. Gough, “Controlled Translation in an example-based environment: What do automatic evaluation metrics tell us?” Machine Translation, Vol. 19, pp. 1-36, 2005. [10] D. Groves & A. Way, “Hybrid example-based SMT: The best of both worlds?” Proc. ACL 2005 Workshop on Building & Using Parallel Texts, pp. 183-190, 2005. |
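The marker hypothesis behind marker-based EBMT segments a sentence into chunks that open at closed-class "marker" words. A minimal sketch for the English side (the marker word list here is a small illustrative assumption; the cited systems use much larger curated sets per marker category):

```python
# A few closed-class English marker words (determiners, prepositions,
# conjunctions, pronouns); real marker-based systems use far larger sets.
MARKERS = {"the", "a", "an", "in", "on", "of", "to", "and", "but",
           "with", "he", "she", "it", "that", "this"}

def marker_chunks(sentence):
    """Split a sentence into chunks, starting a new chunk at each marker word."""
    chunks, current = [], []
    for word in sentence.lower().split():
        if word in MARKERS and current:
            chunks.append(current)
            current = []
        current.append(word)
    if current:
        chunks.append(current)
    return [" ".join(c) for c in chunks]

print(marker_chunks("The judge signed the order in the morning"))
# → ['the judge signed', 'the order', 'in', 'the morning']
```

The project's task is precisely the part this sketch takes for granted on the Chinese side: learning which Chinese words (possibly NULL) play the role of the given English markers, so that aligned chunks can be extracted from the bilingual corpus.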
Project Title: | IEEE International Conference on Data Mining (ICDM 2009) Cross-domain Web Image Annotation |
Investigator(s): | Chan KP |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 12/2009 |
Completion Date: | 12/2009 |
Abstract: |
N/A |
Project Title: | Graphical Models for Facial Expression Recognition |
Investigator(s): | Chan KP |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 04/2010 |
Abstract: |
Facial Expression Recognition. Facial expression contains a lot of information that is not carried in spoken language. As reported in [1], facial expressions have the greatest effect on the receiver: facial expression accounts for about 55% of the effect, while 38% is conveyed by voice intonation and only 7% by the actual spoken words. Hence facial expression recognition is an important topic in Human-Computer Interaction (HCI). There are differences between face recognition and facial expression recognition, although both are important and challenging problems. In facial expression recognition, personal identity information, which is very important to face recognition, becomes unwanted variability. On the other hand, variability arising from facial expression is unwanted in face recognition. Facial expression recognition can be divided into two categories: still facial image recognition and facial image sequence analysis. Early research usually focused on still facial image recognition. However, psychological studies suggest that the spatio-temporal analysis of facial image sequences produces more accurate and robust recognition [2]. Graphical Models. Graphical models are tools for applying probability theory to graph representations. Graphs can represent the intuitive interdependence relationships between objects, while probability theory is used to model any uncertainty. Graphical models have found many applications in Machine Learning. In particular, dynamic graphical models can also capture temporal relationships, which are a crucial clue in facial expression recognition. Common graphical models include Kalman filters, Hidden Markov Models and Bayesian Networks. The Kalman filter is mainly used in linear dynamic systems. Hidden Markov Models (HMMs) have been applied in speech processing for more than two decades, and recently HMMs have been used in other disciplines such as Bioinformatics. The use of Bayesian Networks is still quite limited. 
In particular, the use of Bayesian Networks in facial expression recognition is still limited to the role of yet another recognizer. In this project, we would like to apply graphical models to facial expression recognition. We will explore the use of Bayesian Networks and factorial HMMs. Key issues. Facial expressions contain temporal information, in the form of transitions between expression state sequences. Models should be able to capture both temporal evolution and momentary facial expressions, which are more informative in human behaviour analysis. Facial expressions consist of action unit combinations and transient cues, with induced and nontransitive dependencies. The dependencies, uncertainties and temporal behaviour of facial expressions require dynamic models, which Neural Networks lack. Hidden Markov Models are one possible candidate: they can model uncertainties in time series. However, due to the nature of the HMM's underlying graphical model, they lack the ability to represent induced and nontransitive dependencies. Another challenge in facial expression recognition is the robustness of the system. The input to the system varies due to, for example, different lighting conditions, cameras, etc. The recognition system should be able to tolerate transient noise in the input features, such as errors induced in a particular frame by a change in the environment. Ideally, the system should also be capable of adaptation if the change in the input is longer-term. Graphical models such as Bayesian Networks have been developed for some time and have been found to be very useful tools in Pattern Recognition. However, their use in facial expression recognition remains limited to that of yet another classifier; the expressive power of Bayesian Networks has not been fully exploited. 
In this project, we propose a Bayesian Network architecture that connects the observations, the examples of different expressions, and the observation history together. We will also try the factorial HMM, which is itself a special kind of Bayesian Network, for estimating the prior probabilities. Objective of the Project. In this project, we are going to apply a graphical-models approach to facial expression recognition. Bayesian Networks are a very important research topic. We will combine an exemplar-based approach with Bayesian Networks. The resultant system can be applied not only to facial expression recognition, but also to other applications with spatio-temporal variations, such as Geographical Information Systems (GIS) or even real estate data. The project involves two major parts. 1. Exemplar-based modeling. Exemplar-based modeling can be applied beyond facial expression recognition. Finding exemplars can greatly simplify the recognition system and reduce the number of variables to be estimated in the later stages of the system. The exemplar-based modeling will act as an intermediary between the facial expressions and the facial features. 2. Dynamic spatio-temporal modeling. The modeling of facial expression is basically spatio-temporal. We propose to use a dynamic Bayesian network to capture the dynamics of the variation. References: [1] M. Pantic & L.J.M. Rothkrantz, “Automatic Analysis of Facial Expressions: the State of the Art,” IEEE Trans. on Pattern Analysis & Machine Intelligence, Vol. 22, No. 12, pp. 1424-1445, 2000. [2] J. Bassili, “Emotion Recognition: The Role of Facial Movement and the Relative Importance of Upper and Lower Areas of the Face,” J. Personality and Social Psychology, Vol. 37, pp. 2049-2059, 1979. |
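As a concrete instance of the dynamic modeling discussed above, the standard HMM forward algorithm computes the likelihood of an observed feature sequence under a model of expression-state dynamics. The states, transition and emission probabilities below are toy assumptions purely for illustration:

```python
# Toy HMM over expression states, e.g. a neutral -> onset -> apex progression.
STATES = ["neutral", "onset", "apex"]
TRANS = [[0.7, 0.3, 0.0],   # TRANS[i][j]: P(state j at t+1 | state i at t)
         [0.0, 0.6, 0.4],
         [0.1, 0.0, 0.9]]
EMIT = [[0.8, 0.2],          # EMIT[i][k]: P(feature symbol k | state i)
        [0.4, 0.6],
        [0.1, 0.9]]
INIT = [1.0, 0.0, 0.0]       # sequences start in the neutral state

def forward_likelihood(obs):
    """P(observation sequence | model) via the forward algorithm."""
    n = len(STATES)
    alpha = [INIT[i] * EMIT[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * TRANS[i][j] for i in range(n)) * EMIT[j][o]
                 for j in range(n)]
    return sum(alpha)

print(forward_likelihood([0, 1, 1]))  # likelihood of a short feature sequence
```

A factorial HMM, as proposed in the project, generalizes this by running several such hidden chains in parallel with a shared observation, which an ordinary HMM cannot represent compactly.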
List of Research Outputs |
Si S., Tao D. and Chan K.P., Cross-Domain Web Image Annotation, Workshop of Internet Multimedia Mining, Int. Conf. on Data Mining. Miami, U.S.A., 2009, 184-189. |
Si S., Tao D. and Chan K.P., Discriminative Hessian Eigenmaps for Face Recognition, Int. Conf. on Acoustic, Speech, Signal Processing, 2010. Dallas, U.S.A., 5586-5589. |
Si S., Tao D. and Chan K.P., Evolutionary Cross-Domain Discriminative Hessian Eigenmaps, IEEE Trans. on Image Processing. IEEE, 2010, 19: 1075-1086. |
Si S., Tao D. and Chan K.P., Transfer Discriminative Logmaps, 2009 IEEE Pacific Rim Conference on Multimedia. Bangkok, Springer, 2009, 131-143. |
Researcher : Chan PF |
List of Research Outputs |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Researcher : Chan SH |
List of Research Outputs |
Chan S.H., Lee L.K., Lam T.W. and Ting H.F., Non-clairvoyant scheduling for weighted flow time and energy on speed bounded processors, The 16th Computing: the Australasian Theory Symposium (CATS). 2010, 3-10. |
Researcher : Chan WK |
List of Research Outputs |
Chan W.K., Ho J.C.F. and Tse T.H., Finding failures from passed test cases: improving the pattern classification approach to the testing of mesh simplification programs, Software Testing, Verification and Reliability (ISI impact factor of journal: 1.632). 2010, 20 (2): 89−120. |
Jiang B., Zhang Z., Chan W.K. and Tse T.H., Adaptive random test case prioritization, Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering (ASE 2009), IEEE Computer Society Press, Los Alamitos, CA (acceptance rate: 17.1%). 2009, 233−244. |
Mei L., Chan W.K., Tse T.H. and Kuo F.C., An empirical study of the use of Frankl-Weyuker data flow testing criteria to test BPEL web services, Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, 81−88. |
Mei L., Chan W.K. and Tse T.H., Data flow testing of service choreography, Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC 2009/FSE-17), ACM Press, New York, NY (acceptance rate: 14.7%). 2009, 151−160. |
Mei L., Chan W.K., Tse T.H. and Merkel R.G., Tag-based techniques for black-box test case prioritization for service testing, Proceedings of the 9th International Conference on Quality Software (QSIC 2009), IEEE Computer Society Press, Los Alamitos, CA. 2009, 21−30. |
Wong W.E., Chan W.K., Tse T.H. and Kuo F.C., guest editors, Special Issue on Dynamic Analysis and Testing, Journal of Systems and Software. Elsevier, Amsterdam, 2010. |
Zhang Z., Chan W.K., Tse T.H., Jiang B. and Wang X., Capturing propagation of infected program states, Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC 2009/FSE-17), ACM Press, New York, NY (acceptance rate: 14.7%). 2009, 43−52. |
Zhang Z., Chan W.K., Tse T.H. and Hu P., Experimental study to compare the use of metamorphic testing and assertion checking, Journal of Software. 2009, 20 (10): 2637−2654. |
Zhang Z., Jiang B., Chan W.K., Tse T.H. and Wang X., Fault localization through evaluation sequences, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2010, 83 (2): 174−187. |
Zhang Z., Chan W.K., Tse T.H., Hu P. and Wang X., Is non-parametric hypothesis testing model robust for statistical fault localization?, Information and Software Technology (ISI impact factor of journal: 1.821). 2009, 51 (11): 1573−1585. |
Zhang Z., Chan W.K., Tse T.H., Lu H. and Mei L., Resource prioritization of code optimization techniques for program synthesis of wireless sensor network applications, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2009, 82 (9): 1376−1387. |
Researcher : Chan WT |
List of Research Outputs |
Chan W.T., Chin F.Y.L., Ting H.F. and Zhang Y., Online Tree Node Assignment with Resource Augmentation, The 15th International Computing and Combinatorics Conference (COCOON'2009). Niagara Falls, New York, USA, 2009, 358-367. |
Researcher : Chang JW |
List of Research Outputs |
Choi M.G., Ju E.J., Chang J.W., Lee J.H. and Kim Y.J., Linkless Octree Using Multi-Level Perfect Hashing, Computer Graphics Forum. 2009, 28: 1773-1780. |
Researcher : Chen J |
List of Research Outputs |
Chen J., Cheng C.K., Mokbel M. and Chow C.Y., Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data, In: Dan Suciu and Peter Haas, Very Large Databases Journal (VLDBJ), Special Issue. Springer, 2009, 18: 1219-1240. |
Cheng C.K. and Chen J., Probabilistic Spatial Queries, In: L. Liu and T. Ozsu, Encyclopedia of Database Systems. Springer-Verlag, 2009. |
Cheng C.K., Xie X., Yiu M.L., Chen J. and Sun L., UV-diagram: A Voronoi Diagram for Uncertain Data, The IEEE Intl. Conf. on Data Engineering (IEEE ICDE 2010). 2010. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, BMC Bioinformatics. London, United Kingdom, BioMed Central, 2010, 11 (Suppl 2): S5. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, The Third International Workshop on Data and Text Mining in Bioinformatics (DTMBIO 09) in the 18th ACM Conference on Information and Knowledge Management. Hong Kong, 2009. |
Researcher : Chen TY |
List of Research Outputs |
Chen T.Y., Kuo F.C., Merkel R.G. and Tse T.H., Adaptive random testing: the ART of test case diversity, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2010, 83 (1): 60−66. |
Chen T.Y., Tse T.H. and Zhou Z.Q., Semi-proving: an integrated method for program proving, testing, and debugging, IEEE Transactions on Software Engineering (ISI impact factor of journal: 3.750). 2010, doi:10.1109/TSE.2010.23. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., Best Paper Award, 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), IEEE Computer Society, Washington, DC, USA. 2009. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., How well do test case prioritization techniques support statistical fault localization? [Best Paper Award, COMPSAC 2009], Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, 99−106. |
Poon P.L., Tang S.F., Tse T.H. and Chen T.Y., CHOC'LATE: a framework for specification-based testing, Communications of the ACM (ISI impact factor of journal: 2.346). 2010, 53 (4): 113−118. |
Wong W.E., Tse T.H., Glass R.L., Basili V.R. and Chen T.Y., An assessment of systems and software engineering scholars and institutions (2002−2006), Journal of Systems and Software (ISI impact factor of journal: 1.340). 2009, 82 (8): 1370−1373. |
Researcher : Cheng CK |
Project Title: | Privacy Protection in Location-based Services with Location Cloaking |
Investigator(s): | Cheng CK |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2007 |
Abstract: |
1. [Purpose of Investigation] A location-based service (LBS) acquires locations of users through positioning technologies like GPS, WiFi and RFID. Examples of LBS include emergency call services (e.g., the U.S.’s E-911), cab-calling, location-based advertising, inventory management, person-finder and baggage tracking. Although these applications hold the promise of safety, convenience, and new business opportunities, the ability to locate objects accurately raises a new concern – intrusion of location privacy, i.e., learning about a person’s current or past location without her consent. For example, a user may not want people to know that she has visited a political party. However, this sensitive information may be disclosed to government agencies and public media without the user's permission. To safeguard location privacy, lowering the spatial and temporal resolution of the location data sent to the service provider has recently been proposed. This “location cloaking” technique allows a user to control the precision of the location data she reveals. It also prevents a service provider from obtaining the actual coordinates of the user, and the time at which she is at a particular location. Although this method is intuitive and effective in protecting privacy, it may be overkill, and the quality of the desired service can be severely affected. In this project, we study the system, database and language issues related to location cloaking. 2. Study a location cloaking system: By introducing spatial and temporal uncertainty to location data, cloaking allows a user’s location privacy to be better protected. However, the service provider may not have the most accurate information to provide the best service. Consequently, the service quality can be degraded. Thus, by adjusting the accuracy of the information sent to the service provider, a tradeoff can be achieved between: (1) the privacy of the user, and (2) the quality of the service requested. 
We propose a framework that takes into account the interaction among these factors. We develop a data model for cloaked locations, privacy metrics, and semantics of services for processing this kind of data. Based on this framework, we develop solutions that balance service quality and location privacy. We apply this framework to three classes of privacy-aware systems: anonymous, pseudonymous, and non-anonymous applications. A software agent will be implemented to support location cloaking. The issues of different types of attacks to such systems will also be studied. 3. Study Imprecise queries: Since a cloaked data value is uncertain (spatially and temporally), the semantics of queries that operate on these data in an LBS also need to be changed: the answers to these “imprecise queries” are inexact and give probabilistic guarantees about answer correctness. As these queries can be computationally expensive, we develop efficient imprecise evaluation algorithms for queries commonly used in an LBS, including range queries, nearest-neighbor queries, and joins. For each query type, we introduce “quality metrics” for its answer. These quality metrics quantify the ambiguity of query answers due to the uncertainty of a cloaked location, and help the user to decide if she should reduce her location ambiguity in order to get a more accurate result. We also investigate how the evaluation algorithms of these query answers and quality metrics can be rewritten in database languages such as SQL and PL/SQL, so that they can be supported in an existing database system. 4. Develop Cloaked data indexing: A typical LBS handles massive amounts of location data. These data keep changing, since the objects being tracked move continuously. Spatial-temporal database indexing techniques may be used to improve data retrieval performance. However, traditional indexes like R-trees and grids are not designed to support cloaked data. 
Thus an index specially designed for data uncertainty is desirable. We develop novel cloaked data indexing techniques for efficient evaluation of imprecise range queries, nearest neighbor queries, and joins. 5. Propose a privacy preference language: An advantage of location cloaking is that rather than relying on the service provider to protect privacy, a user can decide the appropriate precision of her location. Since a formal cloaked location data model can be difficult for a casual user to specify, we develop a “privacy-preference language” that allows flexible and high-level specification of the degree of location cloaking required. The language will also allow the user to specify privacy preferences with respect to locations (a user may want to keep her presence private only for certain locations) and with respect to other users and service providers (a user may want to make her presence in a given location known only to selected users and/or selected providers). A translator will be built to convert privacy preferences into the cloaked data model that can be readily processed by imprecise queries. We will revisit the current P3P language to determine whether extensions are needed to support LBS. 6. [Impact, Significance and Value] The outcome of this project is crucial to the development of privacy-aware LBSs in Hong Kong. It gives confidence to users who are concerned about their privacy when using LBSs, and it provides an intuitive privacy-preference language interface for users to trade off their privacy and service quality requirements. The system is supported by a high-performance database for providing efficient services. In terms of research merits, the project develops a privacy language specific to LBS, which has not been well studied. The study of cloaked data querying and indexing addresses some fundamental issues in uncertain data management, an important and emerging topic in the database field. |
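The cloaking idea in the abstract above can be sketched with a simple grid-based scheme: instead of exact coordinates, the user reports the bounding box of the grid cell containing her position, and the cell size controls the privacy/precision trade-off. The grid scheme and cell sizes below are illustrative assumptions only; the project's actual cloaking model is more general.

```python
# Hypothetical sketch of grid-based spatial cloaking: an exact point is
# replaced by the grid cell (a box) that contains it. Larger cells reveal
# less about the true position.

def cloak(x, y, cell_size):
    """Replace an exact point with the grid cell containing it."""
    cx = (x // cell_size) * cell_size
    cy = (y // cell_size) * cell_size
    return (cx, cy, cx + cell_size, cy + cell_size)

# The same point, cloaked at two privacy levels.
print(cloak(123.4, 67.8, 10))    # (120.0, 60.0, 130.0, 70.0)
print(cloak(123.4, 67.8, 100))   # (100.0, 0.0, 200.0, 100.0)
```

A service provider receiving only the box knows the user is somewhere inside it, which is exactly the spatial-resolution reduction the abstract describes.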
Project Title: | Adaptive Filters for Continuous Queries over Constantly-Evolving Data Streams |
Investigator(s): | Cheng CK |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2008 |
Abstract: |
1. We will first design query tolerance semantics that specify the maximum extent of error allowed in query answers. Users can control the degree of correctness depending on their needs. A novel feature of these tolerances is that they describe the necessary conditions required for guaranteeing correctness against data uncertainty. We will redefine the semantics of tolerances for common stream queries, including range queries and top-k queries. 2. Secondly, we will develop scalable algorithms that satisfy query tolerances. We will study how to translate query tolerances into some commands for the ACF associated with each stream. In each ACF, algorithms are designed to implement the commands received from the DSMS, with carefully tuned data acquisition and transmission rates for optimal energy use. These algorithms will be extended to support multiple concurrent queries with different tolerance requirements. We also plan to understand the performance of these approaches through experimental evaluation. 3. [Impact, Significance and Value] The topics of distributed stream processing and uncertainty management have attracted much research interest recently. Our project designs novel algorithms for scalable, resource-efficient and error-bounded evaluation of continuous queries over uncertain data streams, and thus it will be interesting to the research community. Our result will be highly practical to Hong Kong. Particularly, numerous applications including transportation systems, surveillance systems, and natural habitat monitoring applications, can be built on top of our prototype. |
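The tolerance mechanism sketched in point 2 can be illustrated with a simple "delta" rule at a stream source: the source forwards a new reading to the DSMS only when it differs from the last reported value by more than the query's tolerance, so the server-side answer stays correct to within that bound. The class name and the concrete rule below are illustrative assumptions, not the project's actual ACF algorithms.

```python
# Illustrative sketch of a tolerance-based stream filter (an assumed
# simplification of the ACF idea): forward a reading only when it breaks
# the tolerance bound relative to the last value sent to the server.

class ToleranceFilter:
    def __init__(self, tolerance):
        self.tolerance = tolerance
        self.last_reported = None
        self.sent = 0

    def observe(self, value):
        """Transmit to the DSMS only on a tolerance violation."""
        if (self.last_reported is None
                or abs(value - self.last_reported) > self.tolerance):
            self.last_reported = value   # transmit this reading
            self.sent += 1

f = ToleranceFilter(tolerance=1.0)
for reading in [20.0, 20.4, 20.9, 22.1, 22.8, 21.0]:
    f.observe(reading)
print(f.sent)  # 3 of the 6 readings were transmitted
```

Readings that stay within the tolerance are suppressed at the source, which is how the transmission rate (and hence energy use) is reduced without violating the answer-error guarantee.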
Project Title: | Probing Imprecise Sensor Data with Quality Guarantees |
Investigator(s): | Cheng CK |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 08/2008 |
Abstract: |
In location-aware environments, mobile objects obtain their locations through technologies like the Global-Positioning System (GPS), RFID systems or cellular networks. Sensor networks are also deployed in scientific applications, in order to acquire data (e.g., temperature, humidity) from external environments. Data collected from these external environments are used by the system to support applications like road-traffic control systems, emergency-call services, mobile commerce and habitat-monitoring. In these systems, the data captured from the physical world changes with time. However, system resources (e.g., battery power and wireless network bandwidth) are limited. Thus, it is infeasible to capture the data values at every point in time. Consequently, the data recorded by the system cannot capture exactly the state of the external world. It is important to handle the impreciseness that exists in these data, or else the quality of services provided to the users can be degraded. The main goal of this project is to study the intricate balance between (1) utilization of system resources; (2) freshness of sensor data; and (3) quality of queries that evaluate these data. In general, the current values of the data collected by sensor devices can be probed by the system, in order to acquire higher-quality data. Due to limited resources, the amount of probing is restricted. The problem, then, is to study how to achieve the highest query quality with a limited amount of resources available for probing. We address this problem by developing algorithms that efficiently determine the set of sensors to be probed. These algorithms are optimized for queries commonly found in sensor applications (e.g., range queries and nearest-neighbor queries). Specifically, our objective is to develop probing algorithms for imprecise data. We plan to design algorithms that allow the system to swiftly decide the set of data sources to be probed. 
We employ the data uncertainty model to capture the uncertain nature of the sensor data. Moreover, we employ probabilistic queries – those that process uncertain data and generate results with probabilistic guarantees – to support these systems. We develop probing algorithms for common probabilistic queries, such as the range query (e.g., what are the IDs of sensors whose values are between [10°C, 20°C]?) and the nearest-neighbor query (e.g., return the name of the friend who is closest to me). These queries are commonly found in sensor applications. They are also well studied in the literature. |
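The probabilistic range query mentioned above can be sketched under a simple assumption: each sensor's true value is uniformly distributed over an uncertainty interval, and the query returns each sensor ID with the probability that its value lies in the query range. The sensor data and the uniform-uncertainty model are illustrative, not the project's actual data model.

```python
# Hedged sketch of a probabilistic range query over imprecise sensor
# readings, assuming each value is uniform on an interval [lo, hi].

def range_query_prob(lo, hi, q_lo, q_hi):
    """P(value in [q_lo, q_hi]) for a value uniform on [lo, hi]."""
    if hi == lo:                       # degenerate: exact reading
        return 1.0 if q_lo <= lo <= q_hi else 0.0
    overlap = max(0.0, min(hi, q_hi) - max(lo, q_lo))
    return overlap / (hi - lo)

def probabilistic_range_query(sensors, q_lo, q_hi, threshold=0.0):
    """Return (id, probability) pairs whose probability exceeds threshold."""
    result = []
    for sid, (lo, hi) in sensors.items():
        p = range_query_prob(lo, hi, q_lo, q_hi)
        if p > threshold:
            result.append((sid, p))
    return sorted(result)

# Three hypothetical temperature sensors with uncertainty intervals.
sensors = {"s1": (8.0, 12.0), "s2": (15.0, 25.0), "s3": (30.0, 35.0)}
print(probabilistic_range_query(sensors, 10.0, 20.0, threshold=0.1))
```

Probing a sensor shrinks its interval, which raises or lowers its answer probability toward 0 or 1; the project's algorithms choose which intervals to shrink so that the overall answer quality improves the most.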
Project Title: | Scalable Continuous Query Processing on Imprecise Location Data |
Investigator(s): | Cheng CK |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2009 |
Abstract: |
1. [Incremental Query Algorithms] Since the objects being monitored may be moving, the result for a CPQ evaluated on these objects must be refreshed to reflect changes in their locations. However, computing a CPQ can be expensive. Therefore, we will develop efficient "incremental" algorithms for them. A property of these algorithms is that they reuse the previous query results and only update the portion of the answers that are affected by the location updates, obviating the need to execute a complete query. The challenge is to find out which part of the query result is affected by a location update. 2. [Probabilistic Filtering] To ensure that the database of the system is fresh, a mobile object usually reports its position periodically. However, these objects (e.g., cellular phones) may only have a limited amount of energy and network resources. To reduce the transmission overhead and at the same time maintain a fresh database, we develop “probabilistic filtering”. This method exploits the fact that an item does not need to be updated if no CPQs are interested in it. It also makes use of the processing capabilities of a mobile object. Particularly, the requirements of a CPQ are translated into a set of filter constraints. These conditions are used by a mobile object to decide whether its current location should be sent to the system. Since an object does not always have to report its location, transmission cost can be reduced. These algorithms will be extended to support multiple CPQs that are concurrently executed. 3. [Prototype Development] We will develop a system prototype to implement our methods. We will develop our methods on an open-source uncertain database system. A road traffic monitoring application will be built on top of the platform. 4. [Impact, Significance and Value] In terms of research merits, our project combines the techniques used in the area of data uncertainty and data stream management. 
The results of this research will help to reduce not only the workload of the location server, but also the use of resources (e.g., energy and network bandwidth). Although this project considers location uncertainty, our techniques can potentially be extended to address other types of data streams (e.g., stock ticker prices, sensor and RFID data). Our results will be highly practical to Hong Kong SAR and China. A transportation control system for traffic monitoring and analysis purposes can be built on top of our prototype. |
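The "probabilistic filtering" idea in point 2 above can be sketched as follows: the server translates a continuous query into a per-object filter region, and the object reports a location update only when it crosses that region's boundary. The rectangle representation and the names below are assumptions for illustration, not the project's actual protocol.

```python
# Illustrative sketch of object-side filtering for a continuous query:
# an update is transmitted only when the object's containment in its
# filter rectangle changes, since only then can the query answer change.

def inside(rect, x, y):
    x1, y1, x2, y2 = rect
    return x1 <= x <= x2 and y1 <= y <= y2

class FilteredObject:
    def __init__(self, filter_rect, x, y):
        self.filter_rect = filter_rect
        self.last_inside = inside(filter_rect, x, y)
        self.updates_sent = 0

    def move(self, x, y):
        """Report to the server only when the filter outcome changes."""
        now_inside = inside(self.filter_rect, x, y)
        if now_inside != self.last_inside:   # crossed the filter boundary
            self.updates_sent += 1           # transmit (x, y) to the server
            self.last_inside = now_inside

obj = FilteredObject(filter_rect=(0, 0, 10, 10), x=5, y=5)
for pos in [(6, 6), (7, 9), (12, 9), (13, 11), (8, 8)]:
    obj.move(*pos)
print(obj.updates_sent)  # only 2 of the 5 moves triggered a report
```

Movements that stay on the same side of the filter boundary are never transmitted, which is how the scheme saves energy and bandwidth while keeping every affected query answer fresh.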
Project Title: | Scalable Cleaning of Probabilistic Databases with Quality Guarantees |
Investigator(s): | Cheng CK |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2010 |
Abstract: |
1) [Develop PWS-quality] We develop a theoretically-sound metric, called PWS-quality, to quantify the ambiguity of query answers on probabilistic databases. This measure observes the Possible-World Semantics (PWS), a correctness criterion commonly used by these databases. It can be used by different types of queries. We will study efficient algorithms for computing this metric over large databases for common database queries; 2) [Design cleaning algorithms] Based on the PWS-quality, we design efficient algorithms to choose the set of objects to be cleaned, under the assumption that each cleaning operation is associated with a cost, and a limited cleaning budget is provided for the query involved. We develop algorithms to find the optimal set of objects to be cleaned, in order to achieve the highest gain in query answer quality. We then introduce efficient selection heuristics that yield close-to-optimal quality improvement. We also develop incremental evaluation techniques to update the query results efficiently after the database is cleaned. The solutions are extended to handle the situation when the ambiguity about an object is not completely removed; 3) [Extend solutions for multiple queries] We extend the study to environments where different queries are concurrently executed by multiple users. Two problems are studied: (1) all queries share a cleaning budget; (2) each query has a different cleaning budget. We will design efficient cleaning algorithms to achieve an optimal quality improvement for these two scenarios; 4) [Prototype development] We develop a cleaning engine, which incorporates our algorithms, on top of an open-source probabilistic database. The engine is designed to be decoupled from the database system, in order to minimize the changes made to the database. We also design a location-based application. We will use the cleaning engine to enhance the quality of services provided by this application. |
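The budgeted cleaning problem in point 2 above can be sketched with a simple stand-in: each uncertain object has a cleaning cost and an estimated quality gain (here, the Shannon entropy removed if its value becomes certain), and a greedy heuristic picks objects by gain per unit cost until the budget is exhausted. The entropy-based gain and the greedy rule are illustrative assumptions, not the project's PWS-quality metric or its optimal algorithms.

```python
# Hedged sketch of greedy budgeted cleaning: rank uncertain objects by
# (ambiguity removed) / (cleaning cost) and clean them in that order
# while the budget allows.
import math

def entropy(probs):
    """Shannon entropy of an uncertain object's alternatives (bits)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def greedy_clean(objects, budget):
    """objects: {id: (alternative_probs, cost)}. Return ids chosen to clean."""
    ranked = sorted(objects.items(),
                    key=lambda kv: entropy(kv[1][0]) / kv[1][1],
                    reverse=True)
    chosen, spent = [], 0
    for oid, (probs, cost) in ranked:
        if spent + cost <= budget:
            chosen.append(oid)
            spent += cost
    return chosen

objects = {
    "a": ([0.5, 0.5], 1),        # 1 bit of ambiguity, cheap to clean
    "b": ([0.9, 0.1], 1),        # ~0.47 bits, cheap
    "c": ([0.25] * 4, 3),        # 2 bits, but expensive
}
print(greedy_clean(objects, budget=2))  # ['a', 'b']
```

With a budget of 2, the expensive object "c" is skipped even though it is the most ambiguous, which is exactly the kind of cost/quality trade-off the project's close-to-optimal selection heuristics address.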
List of Research Outputs |
Chen J., Cheng C.K., Mokbel M. and Chow C.Y., Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data, In: Dan Suciu and Peter Haas, Very Large Databases Journal (VLDBJ), Special Issue. Springer, 2009, 18: 1219-1240. |
Cheng C.K., Cleaning Uncertain Data with Quality Guarantees, In: Jonas Mellin, StruFus: Infrastructure for Information Fusion (joint project with U. Skövde, HKPU, HKBU, IIT Bombay, U. Wuhan, 2008-09). 2009. |
Cheng C.K., Kao C.M., Kwan K.L., Prabhakar S. and Tu Y., Filtering Stream Data for Entity-based Continuous Queries, IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE). IEEE, 2010, 22: 234-248. |
Cheng C.K., Gong J. and Cheung D.W.L., Managing Uncertainty of XML Schema Matching, The IEEE Intl. Conf. on Data Engineering (IEEE ICDE 2010). 2010. |
Cheng C.K., Outstanding Service Award (as Registration Chair), The 18th ACM Conference on Information and Knowledge Management (CIKM 2009), November 2-6, 2009, Hong Kong. 2009. |
Cheng C.K. and Chen J., Probabilistic Spatial Queries, In: L. Liu and T. Ozsu, Encyclopedia of Database Systems. Springer-Verlag, 2009. |
Cheng C.K., Xie X., Yiu M.L., Chen J. and Sun L., UV-diagram: A Voronoi Diagram for Uncertain Data, The IEEE Intl. Conf. on Data Engineering (IEEE ICDE 2010). 2010. |
Cheng J., Yu J. and Cheng C.K., On-Line Preferential Nearest Neighbor Browsing in Large Attributed Graphs, 1st Intl. Workshop on Graph Data Management: Techniques and Applications (GDM 2010), associated with DASFAA. 2010. |
Farrell T., Rothermel K. and Cheng C.K., Processing Continuous Range Queries with Spatio-temporal Tolerance (accepted in June 2010), In: Mani Srivastava, IEEE Transactions on Mobile Computing (IEEE TMC). 2010. |
Prabhakar S. and Cheng C.K., Data Uncertainty Management in Sensor Networks, In: L. Liu and T. Ozsu, Encyclopedia of Database Systems. Springer-Verlag, 2009. |
Ren J., Lee S.D., Chen X., Kao C.M., Cheng C.K. and Cheung D.W.L., Naive Bayes Classification of Uncertain Data, The IEEE International Conference on Data Mining (IEEE ICDM 2009). 2009. |
Researcher : Cheng J |
List of Research Outputs |
Cheng J., Yu J. and Cheng C.K., On-Line Preferential Nearest Neighbor Browsing in Large Attributed Graphs, 1st Intl. Workshop on Graph Data Management: Techniques and Applications (GDM 2010), associated with DASFAA. 2010. |
Researcher : Cheung DWL |
Project Title: | An eLogistics Appliance with data exchange and conversion technologies for infrastructure connectivity |
Investigator(s): | Cheung DWL, Kwok WCH, Kao CM, Lee TYT, Yee KC |
Department: | Computer Science |
Source(s) of Funding: | Platform Research Projects - General Award |
Start Date: | 07/2007 |
Abstract: |
To extend CECID's B2B Connector technology to become an eLogistics Appliance to perform document exchange and conversion based on local and international e-commerce standards; to deploy the developed appliances in pilot projects with industry partners, such as DTTN, Kerry Logistics, iASPEC Technologies, Azeus Systems Ltd., and ICO Limited, to enable supply chain integration. |
Project Title: | Multi-site infrastructure for massive digital content collaboration |
Investigator(s): | Cheung DWL, Yee KC, Kao CM, Kwok WCH |
Department: | Computer Science |
Source(s) of Funding: | Guangdong-Hong Kong Technology Cooperation Funding Scheme |
Start Date: | 03/2008 |
Completion Date: | 03/2010 |
Abstract: |
File transmission is a common and recurring task in a collaborative development environment. Existing technologies, such as email attachments, HTTP, FTP, Web upload and VPN, have their shortcomings, especially when large files are involved. These technologies can no longer meet the current requirements of the DEI, such as the Massive Multi-player Online Game (MMOG) development sector, where an effective file transmission solution that can support large file transfer securely over the Internet is needed. The objective of this project is to establish an infrastructure that can support transmission of digital content of any size in a secure, reliable, transactional, and managed manner. Together with this technology, an extensible management platform will be developed to manage the transmission process and integrate the secure digital content transfer with conventional development tools pervasive in the DEI. Basic management features, including version control, access control, and audit trail, will be developed for the platform. |
Project Title: | Universal web service adapter to facilitate "software as a service" implementation |
Investigator(s): | Cheung DWL, Lee SD, Lee TYT |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Support Programme (Tier 3) |
Start Date: | 09/2008 |
Completion Date: | 02/2010 |
Abstract: |
The objectives of this project are as follows: 1. To study the common patterns of data import/export formats of legacy applications and the gaps between these formats and Web Service interfaces; 2. To research and develop the Universal Web Services Adapter technology, which can standardize and streamline the process of programming Web Services adaptors for legacy applications; and 3. To apply the research results in proof-of-concept projects. |
Project Title: | Universal web service adapter to facilitate "software as a service" implementation |
Investigator(s): | Cheung DWL |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Fund Internship Programme |
Start Date: | 11/2008 |
Abstract: |
The objectives of this project are as follows: 1. To study the common patterns of data import/export formats of legacy applications and the gaps between these formats and Web Service interfaces; 2. To research and develop the Universal Web Services Adapter technology, which can standardize and streamline the process of programming Web Services adaptors for legacy applications; and 3. To apply the research results in proof-of-concept projects. |
Project Title: | Multi-site infrastructure for massive digital content collaboration |
Investigator(s): | Cheung DWL |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Fund Internship Programme |
Start Date: | 12/2008 |
Abstract: |
File transmission is a common and recurring task in a collaborative development environment. Existing technologies, such as email attachments, HTTP, FTP, Web upload and VPN, have their shortcomings, especially when large files are involved. These technologies can no longer meet the current requirements of the DEI, such as the Massive Multi-player Online Game (MMOG) development sector, where an effective file transmission solution that can support large file transfer securely over the Internet is needed. The objective of this project is to establish an infrastructure that can support transmission of digital content of any size in a secure, reliable, transactional, and managed manner. Together with this technology, an extensible management platform will be developed to manage the transmission process and integrate the secure digital content transfer with conventional development tools pervasive in the DEI. Basic management features, including version control, access control, and audit trail, will be developed for the platform. |
Project Title: | An Intelligent Data Security Gateway for Multiparty Data Transfer Supported by Multi-Factor and Multi-Dimensional Data Protection Scheme |
Investigator(s): | Cheung DWL, Kwok WCH, Yee KC, Kwok RPM, Tsang WW |
Department: | Computer Science |
Source(s) of Funding: | Guangdong-Hong Kong Technology Cooperation Funding Scheme |
Start Date: | 04/2009 |
Abstract: |
Generally, IT security is brought to an enterprise by setting forth necessary governance policies and realizing the policies using appropriate technologies. These two factors must work together – neither governance policies nor technologies alone can uphold enterprise security in practice. In this project, we focus on the area of data security and develop technologies which can ensure that the governance policies are followed effectively. The objectives of this project are to: 1. Develop a centralized enterprise security framework and tools to bridge the gap between the enterprise data security governance policies and the technical enforcement of data protection. 2. Develop solutions, working with both the enterprise framework and consumer portable devices, to prevent leakage of sensitive enterprise and personal data. 3. Conduct research studies on a novel multi-factor data encryption mechanism, which enterprises can use to control the necessary conditions for the decryption of data. |
Project Title: | An Intelligent Data Security Gateway for Multiparty Data Transfer Supported by Multi-Factor and Multi-Dimensional Data Protection Scheme |
Investigator(s): | Cheung DWL |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Fund Internship Programme |
Start Date: | 04/2009 |
Abstract: |
Generally, IT security is brought to an enterprise by setting forth necessary governance policies and realizing the policies using appropriate technologies. These two factors must work together – neither governance policies nor technologies alone can uphold enterprise security in practice. In this project, we focus on the area of data security and develop technologies which can ensure that the governance policies are followed effectively. The objectives of this project are to: 1. Develop a centralized enterprise security framework and tools to bridge the gap between the enterprise data security governance policies and the technical enforcement of data protection. 2. Develop solutions, working with both the enterprise framework and consumer portable devices, to prevent leakage of sensitive enterprise and personal data. 3. Conduct research studies on a novel multi-factor data encryption mechanism, which enterprises can use to control the necessary conditions for the decryption of data. |
Project Title: | An Intelligent Data Security Gateway for Multiparty Data Transfer Supported by Multi-Factor and Multi-Dimensional Data Protection Scheme |
Investigator(s): | Cheung DWL |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Fund Internship Programme |
Start Date: | 07/2009 |
Abstract: |
Generally, IT security is achieved in an enterprise by setting forth the necessary governance policies and realizing those policies with appropriate technologies. The two must work together: neither governance policies nor technologies alone can uphold enterprise security in practice. In this project, we focus on the area of data security and develop technologies which ensure that the governance policies can be followed through effectively. The objectives of this project are to: 1. Develop a centralized enterprise security framework and tools to bridge the gap between the enterprise data security governance policies and the technical enforcement of data protection. 2. Develop solutions, working with both the enterprise framework and consumer portable devices, to prevent leakage of sensitive enterprise and personal data. 3. Conduct research studies on a novel multi-factor data encryption mechanism, which enterprises can use to control the necessary conditions for the decryption of data. |
Project Title: | 35th International Conference on Very Large Data Bases (VLDB 2009) An Audit Environment for Outsourcing of Frequent Itemset Mining |
Investigator(s): | Cheung DWL |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 08/2009 |
Completion Date: | 08/2009 |
Abstract: |
N/A |
List of Research Outputs |
Cheng C.K., Gong J. and Cheung D.W.L., Managing Uncertainty of XML Schema Matching, The IEEE Intl. Conf. on Data Engineering (IEEE ICDE 2010). 2010. |
Cheung D.W.L. and Kwok W.C.H., Consulting Services on B2B Connectors, Apacus Software Corporation Ltd. 2010. |
Cheung D.W.L. and Kwok W.C.H., Consulting Services on Hermes H2O, Apacus Software Corporation Ltd. 2010. |
Cheung D.W.L., Song I.L., Chu W., Hu X.H. and Lin J., Editor, Proceedings of the 18th ACM Conference on Information and Knowledge Management, CIKM 2009, Hong Kong, China. USA, Association for Computing Machinery, 2009, 1-2012. |
Cheung D.W.L., Editorial Board Member, International Journal of Database Theory and Application. SERSC, 2009. |
Cheung D.W.L., Editorial Board Member, International Journal of Future Generation Communication and Networking. SERSC, 2009. |
Cheung D.W.L., Editorial Board Member, International Journal of Data Mining and Bioinformatics. InderScience, 2009. |
Cheung D.W.L., Editorial Board Member, International Journal of Granular Computing, Rough Sets and Intelligent Systems. InderScience, 2009. |
Cheung D.W.L. and Kwok W.C.H., Enhancement on HKMA’s Data Transformer System, ICO Limited, and Hong Kong Monetary Authority. 2010. |
Cheung D.W.L., Invited Talk, Secure Query Computation on Encrypted Databases, Drexel University. 2009. |
Cheung D.W.L., Invited Talk, Secure Query Computation on Encrypted Databases, George Mason University, USA. 2009. |
Cheung D.W.L., Invited Talk, Secure Query Computation on Encrypted Databases, Virginia Tech, USA. 2009. |
Cheung D.W.L., Invited Talk, Secure Query Computation on Encrypted Databases, Japan Advanced Institute of Science and Technology. 2010. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, Curalia AB and Agresso AB, Sweden. 2009. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, Babelway. 2010. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, Connective-IT, France. 2009. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, OGCIO, HKSARG. 2009. |
Cheung D.W.L., Recognition of Service Award, Association for Computing Machinery. 2009. |
Cheung D.W.L. and Kwok W.C.H., Software License, B2B Connector, Kerry Logistics. 2010. |
Chui C.K., Lo E., Kao C.M. and Cheung D.W.L., S-OLAP: an OLAP system for analyzing sequence data, ACM SIGMOD International Conference on Management of Data 2010 (SIGMOD 2010). 2010. |
Kwok W.C.H. and Cheung D.W.L., Data Harmonization for ASEAN Single Window Project, Apacus Software Corporation Ltd. 2009. |
Kwok W.C.H. and Cheung D.W.L., Strategy Study on the Mass Rollout of the Insurance Portal System, Transport Department, HKSAR. 2009. |
Lee T.Y.T., Hon C.T. and Cheung D.W.L., XML Schema Design and Management for e-Government Data Interoperability, The Electronic Journal of e-Government. UK, Academic Publishing Limited, 2009, 7: 381-390. |
Ren J., Lee S.D., Chen X., Kao C.M., Cheng C.K. and Cheung D.W.L., Naive Bayes Classification of Uncertain Data, The IEEE International Conference on Data Mining (IEEE ICDM 2009). 2009. |
Wong W.K., Cheung D.W.L., Hung E., Kao C.M. and Mamoulis N., An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of the 35th International Conference on Very Large Data Bases (VLDB 2009). 2009. |
Wong W.K., Cheung D.W.L., Hung E. and Kao C.M., An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of the 35th Very Large Data Bases Conference (VLDB). Lyon, France, 2009, 1162-1172. |
Wong W.K., Mamoulis N. and Cheung D.W.L., Non-homogeneous Generalization in Privacy Preserving Data Publishing, Proceedings of the ACM Conference on Management of Data (SIGMOD), pp. 483-494, Indianapolis, IN. 2010. |
Yip Y.L., Cheung D.W.L., Jing L. and Ng M., A Semi-supervised Approach to Projected Clustering with Applications to Microarray Data, International Journal of Data Mining and Bioinformatics. InderScience, 2009, 3: 229-259. |
Researcher : Chim TW |
List of Research Outputs |
Chim T.W., Yiu S.M., Hui C.K. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications Schemes for VANETs, Ad Hoc Networks. 2010, 28: 160-175. |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, ADHOCNETS '09. Niagara Falls, Canada, 2009. |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, The First International Conference on Ad Hoc Networks (AdHocNets 2009). Canada, 2009. |
Dong Y., Chim T.W., Li V.O.K., Yiu S.M. and Hui C.K., ARMR: Anonymous Routing Protocol with Multiple Routes for Communications in Mobile Ad Hoc Networks, Ad Hoc Networks. 2009, 7(8): 1536-50. |
Researcher : Chin FYL |
Project Title: | MultiVision Fund |
Investigator(s): | Chin FYL |
Department: | Computer Science |
Source(s) of Funding: | MultiVision Intelligent Surveillance (Hong Kong) Ltd. - General Award |
Start Date: | 07/2003 |
Abstract: |
MultiVision Fund. |
Project Title: | A new motif representation based on position specific patterns |
Investigator(s): | Chin FYL |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2006 |
Completion Date: | 02/2010 |
Abstract: |
1. We want to introduce a good model to represent motifs and describe them in a precise and complete fashion. 2. Based on our new motif representation model, we will redefine the motif discovery problem and design efficient algorithms or heuristics to solve it. 3. The goal of the new model is to enable the discovery of a higher percentage of known motifs and some unknown motifs. |
Project Title: | Design and Analysis of Online Algorithms for Frequency/Code Assignment Problems in Cellular Networks |
Investigator(s): | Chin FYL |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2007 |
Completion Date: | 02/2010 |
Abstract: |
(1) The first objective of this project is to develop novel techniques to solve the fundamental online frequency/code assignment problem, where all calls are assumed to request the same fixed data rate, so that frequencies/codes can be labeled by integers. We consider the problem for a network with hexagonal cells or with a more general topology, and with different reuse distances. (2) The second objective is to consider third-generation (3G) networks that use W-CDMA/OVSF technology, which allows calls to request variable data rates. This is a much harder problem which involves assignment, and possibly reassignment, of orthogonal codes to calls, using a code tree structure. |
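The code tree in the second objective can be pictured with a minimal sketch (hypothetical and illustrative only, not the project's algorithm; the class and method names are invented). In an OVSF code tree, codes on the same root-to-leaf path are not mutually orthogonal, so assigning a code to a call blocks all of its ancestors and descendants:

```python
# Hypothetical sketch of OVSF code-tree assignment. Nodes are numbered
# heap-style (root = 1); level l holds nodes 2**l .. 2**(l+1) - 1, and
# deeper levels correspond to lower data rates.

class OVSFTree:
    def __init__(self, depth):
        self.depth = depth        # deepest level of the code tree
        self.assigned = set()     # codes currently held by calls

    def _ancestors(self, node):
        while node > 1:
            node //= 2
            yield node

    def _blocked(self, node):
        # a code is blocked if any ancestor or any descendant is assigned,
        # since those codes are not orthogonal to it
        if any(a in self.assigned for a in self._ancestors(node)):
            return True
        frontier = [node]
        while frontier:
            n = frontier.pop()
            if n in self.assigned:
                return True
            if n * 2 < 2 ** (self.depth + 1):
                frontier += [n * 2, n * 2 + 1]
        return False

    def assign(self, level):
        # first-fit: try every code at the requested level (data rate)
        for node in range(2 ** level, 2 ** (level + 1)):
            if node not in self.assigned and not self._blocked(node):
                self.assigned.add(node)
                return node
        return None               # no free code; reassignment would be needed
```

The online difficulty the abstract refers to shows up immediately: a greedy first-fit choice can fragment the tree so that a later high-rate (shallow-level) call finds every code blocked even though enough total capacity remains.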
Project Title: | Finding Conserved Patterns in Biological Networks |
Investigator(s): | Chin FYL |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 10/2008 |
Abstract: |
1. Finding some restricted common patterns. Even though the problem of finding common patterns is usually NP-complete, we can study restricted common patterns giving rise to problems which can be solved in polynomial time, without sacrificing the practicality of the solution. Usually the common patterns are of small size, so it is likely that we can design algorithms which solve the problem in practice without taking time exponential in the size of the networks. Furthermore, some common patterns can be modeled by trees or cycles of repeated vertices. We expect that finding common patterns, when restricted to these topologies and when given the one-one correspondence of the vertices in the networks, can be solved in polynomial time. This achieved, our goal thereafter would be to study the finding of restricted common patterns when the vertex mapping might not be a one-one correspondence (but with each vertex mapped to a small constant number of vertices) and for multiple networks. 2. Finding common patterns with missing edges. Because of noisy data and incomplete experiments, some edges in the network might be missing (different from mismatches). The identification of a common path with one missing edge reduces to the problem of finding disjoint paths whose union matches the common path well except for one edge. This problem can be extended to more missing edges, other subgraph topologies (other than paths) and multiple networks. 3. Verification by experiments on real data. Our proposed algorithms will be implemented and tested for effectiveness and efficiency on a number of available biological network databases. The accuracy of our algorithms will be determined by comparing our results with the known solutions. |
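The restricted case in objective 1, where a one-one vertex correspondence is given, can be illustrated with a small sketch (hypothetical, not the project's algorithm; function names are invented). Under such a correspondence, a pattern common to two networks must lie in the graph formed by the edges present in both, so finding a common path reduces to path search in that intersection graph, which is tractable for the small pattern sizes the text argues are typical:

```python
# Hypothetical sketch: common-path search in two networks under a given
# one-one vertex correspondence (vertices already share labels here).

def intersection_graph(edges_a, edges_b):
    # keep only (undirected) edges present in both networks
    norm = lambda es: {frozenset(e) for e in es}
    return norm(edges_a) & norm(edges_b)

def longest_common_path(edges_a, edges_b):
    # exhaustive DFS over the intersection graph; fine for small patterns
    edges = intersection_graph(edges_a, edges_b)
    adj = {}
    for e in edges:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    best = []
    def dfs(path):
        nonlocal best
        if len(path) > len(best):
            best = path[:]
        for w in adj.get(path[-1], ()):
            if w not in path:       # simple path: no repeated vertices
                dfs(path + [w])
    for v in adj:
        dfs([v])
    return best
```

The harder settings in the abstract relax exactly the assumptions this sketch relies on: a many-to-few vertex mapping enlarges the search space, and a missing edge means the common path need not lie entirely inside the intersection graph.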
Project Title: | Flexible Framework for GPGPU-based Video Decoding and Post-processing. |
Investigator(s): | Chin FYL, Chow KP |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Support Programme (Tier 3) |
Start Date: | 09/2009 |
Abstract: |
The objective of this project is to develop a flexible framework and toolset for utilizing the General Purpose Graphics Processing Unit (GPGPU) in high-definition video decoding and post-processing. Although there are software packages which make use of the GPGPU for high-definition video decoding and post-processing, they are usually proprietary and not flexible enough to allow further applications to be built upon them. Besides, from a technical perspective, most existing solutions focus on employing the GPGPU to decode a single video stream, overlooking cases where multiple video streams, for instance in video surveillance, have to be decoded simultaneously. In this project, we will explore more deeply how to utilize the GPGPU in a flexible way, and define a framework and develop a toolset so that decoding and post-processing of multiple high-definition video streams can be properly offloaded to the GPGPU. With this toolset, potential applications including high-definition video editing, transcoding, etc. can be realized; in particular, the video surveillance industry in Hong Kong will directly benefit from the toolset developed in this project, as it addresses the fundamental problems in decoding multiple high-definition video streams on a typical workstation. |
Project Title: | Combinatorial Phenotype Testing |
Investigator(s): | Chin FYL |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2010 |
Abstract: |
1) Finding sequential and non-adaptive algorithms for the phenotype testing problem with mixed AND/OR-phenotypes using fewer tests. |
Project Title: | The 14th International Conference on Research in Computational Molecular Biology (RECOMB 2010) IDBA - A Practical Iterative de Bruijn Graph De Novo Assembler |
Investigator(s): | Chin FYL |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 04/2010 |
Completion Date: | 04/2010 |
Abstract: |
N/A |
List of Research Outputs |
Chan J.W.T., Chin F.Y.L., Ting H.F. and Zhang Y., Online Problems for Frequency Assignment and OVSF Code Assignment in Wireless Communication Networks, SIGACT News. 2009, 40(3): 86-98. |
Chan W.T., Chin F.Y.L., Ting H.F. and Zhang Y., Online Tree Node Assignment with Resource Augmentation, The 15th International Computing and Combinatorics Conference (COCOON'2009). Niagara Falls, New York, USA, 2009, 358-367. |
Chin F.Y.L., Ting H.F. and Zhang Y., 1-Space Bounded Algorithms for 2-Dimensional Bin Packing, The 20th Annual International Symposium on Algorithms and Computation (ISAAC 2009). Hawaii, USA, 2009. |
Chin F.Y.L., Ting H.F. and Zhang Y., A constant-competitive algorithm for online OVSF code assignment, Algorithmica. 2010, 56: 89-104. |
Chin F.Y.L., Conserved Patterns in Bioinformatics, Department of Computer Science, King's College. United Kingdom, 2010. |
Chin F.Y.L., Conserved Patterns in Bioinformatics, Department of Computer Science, Memorial University of Newfoundland, Canada. 2009. |
Chin F.Y.L., Conserved Patterns in Bioinformatics, Department of Computer Science, University of Liverpool. United Kingdom, 2010. |
Chin F.Y.L., Conserved Patterns in Bioinformatics, School of Computer Science, University of Windsor. Ontario, Canada, 2009. |
Chin F.Y.L., Editor, Chinese Journal of Advanced Software Research. 2009. |
Chin F.Y.L., Editor, Current Bioinformatics. 2010. |
Chin F.Y.L., Editor, HKIE Transactions Review Panel. 2009. |
Chin F.Y.L., Editor, International Journal of Computer Processing of Oriental Languages. 2009. |
Chin F.Y.L., Editor, Journal of Information Processing Letters. 2009. |
Chin F.Y.L., Finding Conserved Patterns in Bioinformatics, The 2nd Mainland and Hong Kong Workshop on Bioinformatics. Xi Shuang Ban Na, Yunnan, China, 2010. |
Chin F.Y.L., Managing Editor, International Journal of Foundations of Computer Science. 2009. |
Chin F.Y.L., Online Frequency Assignment Problem in Cellular Network Communication, School of Computer Science, University of Windsor. Ontario, Canada, 2010. |
Chin F.Y.L., Space Bounded Online Bin Packing Problem, The 2nd International Workshop on Education Technology and Computer Science (ETCS 2010) (Keynote Speeches). Wuhan, China, 2010. |
Leung C.M., Leung S.Y., Yiu S.M. and Chin F.Y.L., Predicting Conserved Metabolic Pathways Leading to an Important Final Product (Poster), The 8th Annual International Conference on Computational Systems Bioinformatics Conference (CSB 2009). Stanford University, USA, 2009. |
Liu Y., Zhang D. and Chin F.Y.L., A clique-based algorithm for constructing feasible timetables, Optimization Methods and Software. Taylor & Francis, 2010. |
Peng Y., Leung C.M., Yiu S.M., Chin F.Y.L. and Li R., Assembling Short Reads with Much Less Memory, The 11th International Meeting on Human Genome Variation and Complex Genome Analysis (HGV 2009). Tallinn, Estonia, 2009. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, BMC Bioinformatics. London, United Kingdom, BioMed Central, 2010, 11 (Suppl 2): S5. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, The Third International Workshop on Data and Text Mining in Bioinformatics (DTMBIO 09) in the 18th ACM Conference on Information and Knowledge Management. Hong Kong, 2009. |
Zhang Y., Chin F.Y.L. and Zhu H., A 1-Local Asymptotic 13/9-Competitive Algorithm for Multicoloring Hexagonal Graphs , Algorithmica. Springer, 2009, 54(4): 557–567. |
Researcher : Choi YK |
List of Research Outputs |
Choi Y.K., Li X.Q., Rong F.G., Wang W.P. and Cameron S., Determining The Directional Contact Range Of Two Convex Polyhedra, Computer-Aided Design. Elsevier, 2010, 42(1): 27-35. |
Choi Y.K., Li Ka Shing Prize (2007-2008) – Best PhD theses in the Faculties of Dentistry, Engineering, Medicine and Science, The Graduate School, The University of Hong Kong.. 2009. |
Researcher : Chow KP |
Project Title: | Robust Face Detection and Reconstruction Method that Addresses Inherent Limitations in Face Recognition Technology |
Investigator(s): | Chow KP |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Fund Internship Programme |
Start Date: | 10/2008 |
Abstract: |
The objective of this project is to study and address the limitations in the state of the art of face recognition technologies. Although face recognition technologies are quite mature and have been deployed in many application areas, such as access control and identity verification at immigration control, they normally cannot work well with non-frontal-view face images, such as those captured by surveillance cameras. As a result, the application scenarios of face recognition technologies for use with the abundantly available surveillance cameras have not been fully exploited. In this project, we will develop technologies that specifically address this limitation in face recognition. Instead of reinventing the wheel in face recognition technology, the scope of this project focuses only on approaches that transform non-frontal-view face images into ones favorable to face recognition. We intend to develop enabling technologies so that many more application scenarios for face recognition can be exploited. |
Project Title: | Robust Face Detection and Reconstruction Method that Addresses Inherent Limitations in Face Recognition Technology |
Investigator(s): | Chow KP, Chin FYL, Wong KKY |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Support Programme (Tier 3) |
Start Date: | 10/2008 |
Abstract: |
The objective of this project is to study and address the limitations in the state of the art of face recognition technologies. Although face recognition technologies are quite mature and have been deployed in many application areas, such as access control and identity verification at immigration control, they normally cannot work well with non-frontal-view face images, such as those captured by surveillance cameras. As a result, the application scenarios of face recognition technologies for use with the abundantly available surveillance cameras have not been fully exploited. In this project, we will develop technologies that specifically address this limitation in face recognition. Instead of reinventing the wheel in face recognition technology, the scope of this project focuses only on approaches that transform non-frontal-view face images into ones favorable to face recognition. We intend to develop enabling technologies so that many more application scenarios for face recognition can be exploited. |
Project Title: | Intelligent Internet Surveillance using Image Analysis Technologies |
Investigator(s): | Chow KP |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Applied Research |
Start Date: | 01/2009 |
Completion Date: | 12/2009 |
Abstract: |
(a) Objectives of this project are as follows: 1. To research techniques for image analysis; 2. To research techniques for detecting specific images and text relating to taboo topics embedded in images; 3. To enhance the current capabilities of BTM to support graphical information analysis; 4. To develop a prototype system for intelligent Internet surveillance for data collection and analysis. (b) Technical challenge In 2007, the Customs and Excise Department of HKSAR adopted a research output, the BitTorrent Monitoring System (BTM), developed by the Center for Information Security and Cryptography (CISC), the University of Hong Kong. BTM is a software system designed for monitoring illegal BT traffic [2]. This software has been deployed to monitor the BT-related sub-forums in a number of public discussion forums, where actual BT users communicate and distribute torrent files. A rule-based alerting system, which supports keyword search, has been employed to locate the first uploaders in BT networks [1]. The BTM project has proved to be a successful and practical tool for monitoring illegitimate Internet information that is text-based in nature. However, multimedia information (such as pornographic photos, specific human faces or logos) cannot be automatically detected, which presents a technical challenge to law enforcement. As today's Internet content is highly varied and largely multimedia in nature, such as pictures and videos, image analysis techniques can be very helpful for Internet surveillance. For example, hundreds of thousands of images are uploaded to public forums every day. It is impossible to monitor the contents of the Internet manually when most of the information is embedded in text as well as in images and photos. Also, some people may try to propagate messages relating to taboo topics with text embedded in images. 
With an Internet surveillance system supporting image analysis, real-time detection of these illegitimate images can be performed around the clock. However, there is no practical solution nowadays that could be employed for the detection and monitoring of multimedia contents on the Internet. (c) Proposed solution In the past few years, CISC has conducted a number of research studies on analyzing pictures and images as outputs of ITF projects [3,4,5,6]. In this project, we shall extend the capabilities of BTM to analyze graphical information, with images and photos covered in the monitoring process. In particular, proposed research topics for image analysis include: • Detection of human faces; • Pornography detection; • Detection of signs or logos; • Detection of images embedded with text. We believe that with the prototype system we will be able to demonstrate the capabilities of an intelligent Internet surveillance system to potential funding sources, so as to attract investment to further develop and commercialize this preliminary research output. Reference [1] BitTorrent Protocol, BitTorrent.org, http://www.bittorrent.org/protocol.html; accessed on July 26, 2008. [2] K.P. Chow, K.Y. Cheng, L.Y. Man, Pierre K.Y. Lai, Lucas C.K. Hui, C.F. Chong, K.H. Pun, W.W. Tsang, H.W. Chan, S.M. Yiu, BTM – An Automated Rule-based BT Monitoring System for Piracy Detection, in Proceedings of the Second International Conference on Internet Monitoring and Protection (ICIMP 2007), July 1-6, 2007, Silicon Valley, USA. 
[3] Authentication for Surveillance System (2004-2006) (Principal Investigator, funded by Innovation and Technology Fund, HK$3,123,000) [4] Reliable Digital Video Surveillance Systems with Efficient Content-based Video Retrieval Support (2006-2007) (Principal Investigator and Project Manager, funded by Innovation and Technology Fund, HK$2,610,000) [5] Advanced Semantic Video Analysis for Content-based Video Retrieval (2006-2007) (Principal Investigator, funded by Innovation and Technology Fund, HK$944,565) [6] A Distributed Retrieval Architecture for Large Scale Surveillance System (2007-2008) (Principal Investigator and Project Manager, funded by Innovation and Technology Fund, HK$985,750) |
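The rule-based keyword alerting that BTM applies to text can be pictured with a small sketch (hypothetical; the rule names and keywords below are invented for illustration and are not BTM's actual rules). Each rule is a set of keywords, and a forum post triggers an alert when every keyword of some rule occurs in it:

```python
# Hypothetical sketch of rule-based keyword alerting for forum monitoring.
# A post matches a rule when all of the rule's keywords appear in it.

RULES = {
    "first-uploader": {"seed", "torrent"},   # invented example rule
    "taboo-topic": {"banned", "leak"},       # invented example rule
}

def match_rules(post, rules=RULES):
    words = set(post.lower().split())
    return [name for name, kws in rules.items() if kws <= words]
```

The limitation the abstract identifies is visible here: the matcher only ever sees tokenized text, so a taboo message rendered inside an uploaded image never reaches `words`, which is why the project proposes image analysis to complement this pipeline.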
Project Title: | MEMORY FORENSICS ON LIVE SYSTEM |
Investigator(s): | Chow KP |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 05/2009 |
Completion Date: | 04/2010 |
Abstract: |
INTRODUCTION In the conventional approach, we analyze memory processes by preserving the memory content via a dump of the physical memory [2, 4, 6]. The simplest way to analyze the memory is to perform a "strings" search to locate any meaningful information therein [9, 10]. There also exist various methodologies for reconstructing memory processes and carving memory-resident data [11, 12] to assist computer forensic examiners in better understanding the activities performed by a computer user. Notwithstanding, the aforementioned methods focus on the examination of a single snapshot of memory collected at a specific time. The integrity of the preserved memory data cannot be verified due to the ever-changing internal statuses and the volatile nature of the data [14]. Unlike traditional file system forensics, which allows verification of the image against the original data, memory forensics is a one-off examination where data integrity depends on the techniques and tools used in the memory acquisition process. For the acquired memory data to be admissible in legal proceedings, the confidence level of the data becomes a core issue [1]. With the advance in technologies, computer systems have become more complicated and are not limited to a single host. Most computers nowadays are connected by networks, with enormous network traffic flowing amongst them, either within the LAN or across the Internet. Network information plays an important role in a number of applications such as FTP, P2P file sharing, BotNet communication, etc. This kind of network information is sometimes vital in a computer forensics examination when it is required to find out which computers are currently connected and what sorts of information are being exchanged between them. Memory processes for handling those network data may provide more information about the activities being performed on the computer. 
However, dynamic data keeps changing, and the collection of a single snapshot in conventional live forensics is apparently inadequate for analyzing the flow of running data. Should we be able to continuously collect the memory content and conduct analysis on multiple memory dumps, more information on such dynamic activities would be obtained. OBJECTIVES The purpose of the research is to derive a new memory preservation and analysis model which is able to alleviate the current problems in the collection and analysis of volatile memory data. Moreover, the research attempts to explore new ways of analyzing dynamic memory activities from continuous memory snapshots collected at various time intervals. MEMORY FORENSICS While the concept of memory forensics is still young in computer forensics study, research in this area has been evolving with a view to finding viable ways to enhance its effectiveness. Recently, the collection and analysis of memory data has become one of the popular topics in computer forensics. A number of live incident response tools have been developed to address the need [2,3]. Existing toolkits are often automated programs to be run on the target live system to collect volatile data from memory. The collected data should be stored on an external storage device to minimize changes to the original system. The credibility of the acquired data relies solely on the reliability of the tool and the expertise of the investigator. However, if the tool is run on a compromised system, the integrity of the collected data can hardly be secured [4]. Some tools may even substantially alter the digital environment of the original system and cause an adverse impact on the preserved memory data. As a result, it is often required to study the effect of running such tools and determine if the alterations would affect the acquired data [5]. 
Carrier and Grand pointed out the potential flaws in acquiring volatile data via the original system and proposed a hardware-based approach for making an accurate and reliable copy of volatile memory contents [4]. Its purpose is to avoid the volatile data being compromised by untrusted code from the operating system or other applications. Antonio further discussed the problems of acquiring live data via a network-based model and suggested a forensically sound approach in which a FireWire device is used to acquire memory data through the Direct Memory Access (DMA) controller [6]. Notwithstanding, the aforementioned studies focus on the proper methods or techniques that should be used in collecting the memory data; few of them discuss how to verify the integrity and reliability of the acquired memory data. During legal proceedings, the above factors are extremely important for the evidence to be accepted in a court of law. As a result, it is essential to devise a better memory acquisition model for enhancing the confidence level of the collected data. REFERENCES [1] Judgment from Honorable Jacqueline Chooljian, Verdict of United States District Court, http://i.i.com.com/cnwk.1d/pdf/ne/2007/Torrentspy.pdf; accessed on October 14, 2008. [2] Harlan Carvey, “Windows Forensics and Incident Recovery”, Addison Wesley, 2005. [3] Kevin Mandia, Chris Prosise, and Matt Pepe, “Incident Response and Computer Forensics”, McGraw-Hill Osborne Media, 2nd edition, 2003. [4] Brian D. Carrier and Joe Grand, “A Hardware-Based Memory Acquisition Procedure for Digital Investigations”, Journal of Digital Investigations, 1(1), 2004. [5] A. Walters and N. Petroni, Jr, “Volatools: Integrating Volatile Memory Forensics into the Digital Investigation Process”, Komoku, Inc., College Park, MD, USA, Jan 2006, http://www.komoku.com/forensics/basic/bh-fed-07-walters-paper.pdf; accessed on October 14, 2008. [6] A. 
Martin, “FireWire Memory Dump of Windows XP: A Forensic Approach”, Boston University, April 2007, http://www.friendsglobal.com/papers/FireWire%20Memory%20Dump%20of%20Windows%20XP.pdf; accessed on October 14, 2008. [7] Golden G. Richard III and Vassil Roussev, “Next-generation digital forensics”, Communications of the ACM, 49(2):76–80, 2006. [8] Gabriela Limon Garcia, “Forensic physical memory analysis: an overview of tools and techniques”, Helsinki University of Technology, October 2007, http://www.tml.tkk.fi/Publications/C/25/papers/Limongarcia_final.pdf [9] Mark Russinovich, Strings v2.40, http://www.microsoft.com/technet/sysinternals/Miscellaneous/Strings.mspx; accessed on October 14, 2008. [10] Nicole Lang Beebe, Jan Guynes Clark, “Digital forensic text string searching”, Digital Forensic Research Workshop, 2007. [11] D. A. Solomon and M. E. Russinovich, “Inside Microsoft Windows 2000”, Third Edition, Microsoft Press, 2000. [12] H. Carvey, “Windows Forensic Analysis”, Syngress, 2007. [13] ManTech International Corporation, Memory DD v1.3, http://www.mantech.com/msma/mdd.asp; accessed on October 14, 2008. [14] Frank Y.W. Law, K.P. Chow, Michael Y.K. Kwan and Pierre K.Y. Lai, Consistency Issue on Live Systems Forensics, The 2007 International Workshop on Forensics for Future Generation Communication Environments (F2GC-07), Jeju Island, Korea, 6-8 December 2007, pp. 136-140. [15] Official web site of WinHex, http://www.x-ways.net/winhex/; accessed on October 14, 2008. |
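The multi-snapshot idea at the heart of the proposal can be pictured with a small sketch (hypothetical; function names are invented, and the simple "strings"-style extraction follows the approach cited in [9, 10]). Diffing the printable strings of consecutive memory dumps exposes dynamic activity that a single dump cannot show:

```python
# Hypothetical sketch: extract printable ASCII strings from two memory
# dumps taken at different times and diff them, so strings that appear
# or disappear between snapshots reveal dynamic activity.

import re

def extract_strings(dump: bytes, min_len: int = 4):
    # runs of min_len or more printable ASCII bytes, as "strings" would find
    return set(m.group().decode("ascii")
               for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, dump))

def snapshot_diff(dump_t0: bytes, dump_t1: bytes):
    s0, s1 = extract_strings(dump_t0), extract_strings(dump_t1)
    return {"appeared": s1 - s0, "disappeared": s0 - s1}
```

A real implementation would of course face exactly the problems the abstract raises: each dump perturbs the live system, and the integrity of every snapshot must be defensible before any cross-snapshot inference is admissible.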
Project Title: | Development of Bayesian Network Templates and Probabilities for Enhanced Processing and Prosecution of Frequent Digital Crimes |
Investigator(s): | Chow KP |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Applied Research |
Start Date: | 06/2009 |
Completion Date: | 06/2010 |
Abstract: |
1. Objectives This proposal requests a matching grant for an Innovation China UK (ICUK) Proof of Concept joint application. ICUK is the first UK-China collaboration programme to promote joint innovation and knowledge transfer between the UK and China. The Co-investigator, Dr Richard Overill, is a Senior Lecturer in the Department of Computer Science at King's College London (KCL), United Kingdom (UK), specializing in cyber-crime, cyber-terrorism, cyber-forensics and cyber-evidence. The ICUK proposal is included in the attachment, in which a sum of £64,360 is requested. The ICUK grant will be used to support research to be carried out at KCL, UK. This proposal requests a sum of HK$190,000 to support research to be conducted in Hong Kong. With the funding from ICUK and the Applied Research Seed Funds of HKU, the intellectual property generated in the project will be shared by KCL and HKU. This application is a direct follow-up to a successful ICUK-funded Partnership award. It will test the proof of concept that: (a) a close-to-optimal digital forensic investigation procedure can be specified and implemented under real-world laboratory conditions; and (b) a rigorously justified likelihood that the digital evidence found supports the case can be provided for the guidance of expert witnesses in court; by means of a prototype software system based on Bayesian networks [5] for each frequently occurring digital crime. 2. Background In 2008, the applicants' project "Development of power laws and Bayesian belief networks for a digital forensics software system" was awarded the ICUK-funded Partnership award, which permitted two extended research visits by Dr. Overill of King's College London to the PI's research group in Hong Kong, during August 2008 and January 2009. During these visits an international conference paper [4] was produced, submitted, accepted and presented, outlining the conceptual basis for the method to be employed in this joint applied research application. 
As an initial verification of the method, a Bayesian network for one class of digital crime (as described in [2]) has been analyzed and partially populated with probabilities generated using the proposed rigorous techniques. This funding application, if awarded, will complete this work for all five frequently occurring digital crimes in East Asia: 1) online auction fraud and auction sites selling offending goods, e.g. counterfeit products; 2) theft of online game weapons; 3) posting of offending items on "cyber-lockers"; 4) hacking and denial-of-service attacks; 5) distribution of offending items by means of peer-to-peer networks. The Bayesian network approach has been proposed to deal with information uncertainties in digital forensics by various researchers [1, 3, 6]. We believe the prototype system to be developed in this project will help justify the use of Bayesian networks in digital forensics. References: [1] R. Brown, B. Pham and O. de Vel, Design of a Digital Forensics Image Mining System, Workshop on Intelligent Information Hiding and Multimedia Signal Processing (IIHMSP05), Melbourne, September. [2] M. Kwan, K.P. Chow, F. Law and P. Lai, Reasoning About Evidence using Bayesian Networks, Fourth Annual IFIP WG 11.9 International Conference on Digital Forensics, Kyoto, Japan, January 27-30, 2008. [3] M. Kwan, K.P. Chow, P. Lai, F. Law and K. Tse, A Digital Forensic View of the Yahoo! Case, Fifth Annual IFIP WG 11.9 International Conference on Digital Forensics, Orlando, Florida, U.S.A., January 25-28, 2009 (to appear). [4] R.E. Overill, M. Kwan, K.P. Chow, P. Lai and F. Law, A Cost-Effective Digital Forensic Investigation Model, Fifth Annual IFIP WG 11.9 International Conference on Digital Forensics, Orlando, Florida, U.S.A., January 25-28, 2009 (to appear). [5] D. Poole, Probabilistic Horn abduction and Bayesian networks, Artificial Intelligence, 64(1):81-129, 1993. [6] O. de Vel, N. Liu, T. Caelli and T.S. Caetano, An Embedded Bayesian Network Hidden Markov Model for Digital Forensics, in Proc. Intelligence and Security Informatics, 2006, pp. 459-465. |
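The evidential-reasoning idea in the abstract above can be illustrated with a toy calculation. The sketch below is not the project's prototype: it is a minimal naive-Bayes-style network with a single hypothesis node and conditionally independent evidence nodes, and all probabilities are invented for illustration.

```python
# Minimal sketch (ours, not the project's system): one hypothesis node H
# (e.g. "the seized machine was used for the crime") and conditionally
# independent evidence nodes. All probabilities below are invented.

def posterior(prior_h, likelihoods, observations):
    """likelihoods[i] = (P(e_i=1 | H), P(e_i=1 | not H));
    observations[i] in {0, 1} records whether evidence item i was found."""
    p_h, p_not = prior_h, 1.0 - prior_h
    for (p_given_h, p_given_not), obs in zip(likelihoods, observations):
        p_h *= p_given_h if obs else (1.0 - p_given_h)
        p_not *= p_given_not if obs else (1.0 - p_given_not)
    # Normalize: P(H | evidence) by Bayes' rule.
    return p_h / (p_h + p_not)

# Two evidence traces, both found on the machine:
print(round(posterior(0.5, [(0.9, 0.2), (0.8, 0.1)], [1, 1]), 3))  # 0.973
```

A full Bayesian network additionally models dependencies between evidence items; this independence assumption is what makes the toy version a one-liner per item.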
Project Title: | Object-oriented Motion Deblurring Technology for Forensic Video Enhancement |
Investigator(s): | Chow KP, Chin FYL, Wong KKY |
Department: | Computer Science |
Source(s) of Funding: | Innovation and Technology Support Programme (Tier 3) |
Start Date: | 09/2009 |
Abstract: |
The objective of this project is to realize object-oriented motion deblurring technology for forensic video enhancement. Video footage for forensic purposes is normally captured from surveillance cameras, but the objects of interest are usually blurred, making identification of objects a difficult and tediously time-consuming process. This project aims to resolve this problem by developing a novel object-oriented motion deblurring method and a super-resolution method to enhance the clarity of the objects of interest. The outcome of this project is expected to substantially shorten investigation time in video-based forensics. |
Project Title: | The Fifteenth International Conference on Parallel and Distributed Systems (ICPADS 09) Privacy Reference Monitor - Computer Model for Law Compliant Privacy Protection |
Investigator(s): | Chow KP |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 12/2009 |
Completion Date: | 12/2009 |
Abstract: |
N/A |
List of Research Outputs |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., A 3D face reconstruction approach in enhancing the robustness and accuracy of face recognition, Shanghai and Hong Kong Symposium on Science & Technology. 2009. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Automatic 3D Face Texture Mapping Framework from Single Image, 1st International Conference on Internet Multimedia Computing and Service. Kunming, China, 2009, 123-128. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Super-Resolution of Faces Using Texture Mapping on a Generic 3D Model, International Conference on Image and Graphics. Xi’an, China, 2009, 361-365. |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K., Law Y.W., Tse K.S.H. and Tse W.H.K., Forensic Investigation of Peer-to-Peer Networks, In: Chang-Tsun Li (University of Warwick, UK), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solution. United Kingdom, IGI Global, 2009, 355-378. |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K. and Law Y.W., Is it an Initial Seeder? Derived Rules that Indicate a Seeder is within the Slow-Rising Period, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Kwan Y.K., Overill R.E., Chow K.P., Silomon J.A.M., Tse K.S.H., Law Y.W. and Lai K.Y., Evaluation of Evidence in Internet Auction Fraud Investigations, 6th Annual IFIP WG 11.9 International Conference on Digital Forensics, Advances in Digital Forensics VI, Ch.7. Hong Kong, Springer, 2010, 95-106. |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Overill R.E., Silomon J.A.M. and Chow K.P., A Complexity Based Model for Quantifying Forensic Evidential Probabilities, Third International Workshop on Digital Forensics (WSDF 2010). Krakow, Poland, IEEE, 2010. |
Tse W.H.K., Chow K.P., Law Y.W., Ieong S.C.R., Kwan Y.K., Tse K.S.H. and Lai K.Y., Analyzing Storage Media of Digital Camera, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : Chui CK |
List of Research Outputs |
Chui C.K., Lo E., Kao C.M. and Cheung D.W.L., S-OLAP: an OLAP system for analyzing sequence data, ACM SIGMOD International Conference on Management of Data 2010 (SIGMOD 2010). 2010. |
Chui C.K., Lo E., Kao C.M. and Ho W.S., Supporting Ranking Pattern-Based Aggregate Queries in Sequence Data Cubes, Proceedings of The 18th ACM Conference on Information and Knowledge Management (ACM CIKM 2009). 2009. |
Researcher : Chung HY |
List of Research Outputs |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., A 3D face reconstruction approach in enhancing the robustness and accuracy of face recognition, Shanghai and Hong Kong Symposium on Science & Technology. 2009. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Automatic 3D Face Texture Mapping Framework from Single Image, 1st International Conference on Internet Multimedia Computing and Service. Kunming, China, 2009, 123-128. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Super-Resolution of Faces Using Texture Mapping on a Generic 3D Model, International Conference on Image and Graphics. Xi’an, China, 2009, 361-365. |
Researcher : Cui Y |
List of Research Outputs |
Cui Y., A Study on Privacy-Preserving Clustering. Hong Kong, 2010. |
Researcher : Dai Z |
List of Research Outputs |
Dai Z., A Markov Random Field Approach for Multi-view Normal Integration. Hong Kong, 2009. |
Schnieders D., Wong K.K.Y. and Dai Z., Polygonal Light Source Estimation, Asian Conference on Computer Vision. 2009, III: 96-107. |
Researcher : Fang J |
List of Research Outputs |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Efficient key integrity verification for quantum cryptography using combinatorial group testing, SPIE Defense, Security, and Sensing Conference. USA, 2010. |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Hard Disk Integrity Check by Hashing with Combinatorial Group Testing, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : Fu X |
Project Title: | The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition (CVPR2010) Reconstruction of Display and Eyes from a Single Image |
Investigator(s): | Fu X |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 06/2010 |
Abstract: |
N/A |
List of Research Outputs |
Schnieders D., Fu X. and Wong K.K.Y., Reconstruction of Display and Eyes from a Single Image, IEEE International Conference on Computer Vision and Pattern Recognition. 2010, 1442-1449. |
Researcher : Gong J |
List of Research Outputs |
Cheng C.K., Gong J. and Cheung D.W.L., Managing Uncertainty of XML Schema Matching, The IEEE Intl. Conf. on Data Engineering (IEEE ICDE 2010). 2010. |
Researcher : He X |
Project Title: | The First International Conference on Internet Multimedia Computing and Service Automatic 3D Face Texture Mapping Framework from Single Image |
Investigator(s): | He X |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 11/2009 |
Completion Date: | 11/2009 |
Abstract: |
N/A |
Project Title: | Methodology and application of 3D segmentation for rigid objects from image and video sequences |
Investigator(s): | He X, Chow KP |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 01/2010 |
Abstract: |
(a) Project Objectives The objective of this project is to enhance and expedite the ongoing object segmentation research in the Department of Computer Science. Through this project, we expect to deliver a brand new approach to the problem of object segmentation, which is central to many computer vision applications. Specifically, shadow and occlusion are the main issues that current computer vision approaches fail to handle properly. We expect the project proposed herein to alleviate the central problems of the existing approaches by extracting and utilizing 3D information about objects. (b) Key issues and problems One of the central issues in computer vision is to detect, extract and analyze moving objects. In a sense, the fundamental problem is to extract moving objects in space (segmentation), after which a computer can determine the structure and motion of each object relative to the viewing system (interpretation). However, since real-world objects in a flow of image frames are usually associated with shadows or occlude one another, which almost all state-of-the-art computer vision approaches [1-6] fail to handle properly, the performance of automated computer vision tasks comes nowhere close to that of the human vision system. (c) Proposed Solution Upon closer inspection, we believe that 3D object information is very helpful for removing shadows and splitting occluded objects. However, most existing object segmentation approaches do not exploit the 3D object information recoverable from 2D video sequences to tackle the object segmentation problem properly [7, 8]. In view of this, employing a 3D reconstruction approach for object segmentation has the potential to address these issues. [1] A. Prati, I. Mikic, M. M. Trivedi, and R. Cucchiara, "Detecting moving shadows: Algorithms and evaluation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, pp. 918-923, 2003. [2] Y. 
Matsushita, K. Nishino, K. Ikeuchi, and M. Sakauchi, "Illumination normalization with time-dependent intrinsic images for video surveillance," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1336-1347, 2004. [3]S. Nadimi and B. Bhanu, "Physical models for moving shadow and object detection in video," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1079-1087, 2004. [4]C. C. C. Pang, W. W. L. Lam, and N. H. C. Yung, "A novel method for resolving vehicle occlusion in a monocular traffic-image sequence," IEEE Transactions on Intelligent Transportation Systems, vol. 5, pp. 129-141, 2004. [5]S. Kamijo, T. Nishida, and M. Sakauchi, "Occlusion robust and illumination invariant vehicle tracking for acquiring detailed statistics from traffic images," IEEE Transactions on Information and Systems, vol. E85D, pp. 1753-1766, 2002. [6]W. Zhang, Q. A. J. Wu, X. K. Yang, and X. Z. Fang, "Multilevel framework to detect and handle vehicle occlusion," IEEE Transactions on Intelligent Transportation Systems, vol. 9, pp. 161-174, 2008. [7]D. Cremers, M. Rousson, and R. Deriche, "A review of statistical approaches to level set segmentation: Integrating color, texture, motion and shape," International Journal of Computer Vision, vol. 72, pp. 195-215, 2007. [8]S. Lefevre, J. Holler, and N. Vincent, "A review of real-time segmentation of uncompressed video sequences for content-based search and retrieval," Real-Time Imaging, vol. 9, pp. 73-98, 2003. |
List of Research Outputs |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., A 3D face reconstruction approach in enhancing the robustness and accuracy of face recognition, Shanghai and Hong Kong Symposium on Science & Technology. 2009. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Automatic 3D Face Texture Mapping Framework from Single Image, 1st International Conference on Internet Multimedia Computing and Service. Kunming, China, 2009, 123-128. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Super-Resolution of Faces Using Texture Mapping on a Generic 3D Model, International Conference on Image and Graphics. Xi’an, China, 2009, 361-365. |
Researcher : Ho WS |
List of Research Outputs |
Chui C.K., Lo E., Kao C.M. and Ho W.S., Supporting Ranking Pattern-Based Aggregate Queries in Sequence Data Cubes, Proceedings of The 18th ACM Conference on Information and Knowledge Management (ACM CIKM 2009). 2009. |
Researcher : Hu P |
List of Research Outputs |
Zhang Z., Chan W.K., Tse T.H. and Hu P., Experimental study to compare the use of metamorphic testing and assertion checking, Journal of Software. 2009, 20 (10): 2637−2654. |
Researcher : Hua Q |
List of Research Outputs |
Hua Q., Wang Y., Yu D. and Lau F.C.M., Dynamic Programming Based Algorithms for Set Multicover and Multiset Multicover Problems, Theoretical Computer Science. Elsevier, 2010, 411: 2467-2474. |
Hua Q. and Lau F.C.M., Joint Link Scheduling and Topology Control for Wireless Sensor Networks with SINR Constraints, In: Jin Hai, Jiang Wenbin, Handbook of Research on Developments and Trends in Wireless Sensor Networks: From Principle to Practice. IGI Global, 2010, 184-208. |
Hua Q., Scheduling Wireless Links with SINR Constraints. Hong Kong, 2009. |
Hua Q., Wang Y., Yu D. and Lau F.C.M., Set Multi-Covering via Inclusion-Exclusion, Theoretical Computer Science. Elsevier, 2009, 410: 3882-3892. |
Researcher : Huang W |
List of Research Outputs |
Huang W., Wu C. and Lau F.C.M., The Performance and Locality Trade-off in BitTorrent-like P2P File-Sharing Systems, IEEE International Conference on Communications (IEEE ICC 2010). 2010. |
Qiu X., Huang W., Wu C., Lau F.C.M. and Lin X., InstantLeap: an Architecture for Fast Neighbor Discovery in Large-scale P2P VoD Streaming, In: Thomas Plagemann, Springer Multimedia Systems Journal. Springer-Verlag, 2010, 16: 183-198. |
Researcher : Hui CK |
Project Title: | Keeping a single private key for different joint projects |
Investigator(s): | Hui CK, Yiu SM |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 06/2008 |
Completion Date: | 05/2010 |
Abstract: |
1. To design an efficient and practical ID-based solution that allows users to hold only one key for access to confidential documents of different joint projects. 2. To study the revocation problem, i.e. revoking access rights from users if needed. |
Project Title: | Error Correction in Quantum Cryptography |
Investigator(s): | Hui CK |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 06/2009 |
Abstract: |
Quantum cryptography could be the first application of quantum information science, a new area integrating quantum mechanics and information theory. Quantum cryptography (secure key distribution) [1, 2] makes it possible to implement absolutely secure "one-time pad" cryptography systems (undecipherable by an eavesdropper, both practically and theoretically) [3, 4]. It is a technique that lets two parties - for example, Alice and Bob - agree on a shared random bit sequence, with a very low probability that eavesdroppers can successfully infer the values of the bits. Unlike public key cryptosystems based on computational complexity, the security of quantum cryptography relies on fundamental principles of quantum mechanics such as Heisenberg’s uncertainty principle and the no-cloning theorem. The basic idea of quantum cryptography is as follows: Alice sends a series of single photons to Bob through a quantum channel, each modulated with a random basis and a random value. Picture each photon as a two-sided card: Alice chooses one side of the card at random, writes a random 0 or 1 on that side, and sends the card to Bob. Bob also chooses a side at random and reads that side’s value. When Alice and Bob choose the same side, Bob reads exactly what Alice wrote. Otherwise, he reads a 0 or 1 completely at random. After Bob reads all the photons, he performs a sifting transaction with Alice on a public channel to discard all cases where he read the wrong side (basis). He sends Alice the list of random basis settings that he used, and she tells him which ones were correct. Then Alice and Bob discard all values where they disagreed on the basis and keep the remaining values as raw key material. About 50% of the bit string is discarded. The shorter key obtained after this basis reconciliation is called the sifted key. 
During this process, if an eavesdropper, say Eve, tries to measure or duplicate the quantum states of the photons transmitted, the laws of quantum physics guarantee that Eve cannot get any information about the communication without introducing perturbations, which would cause error bits in the sifted key and so reveal her presence. The issues: In principle, if no eavesdropping occurs, the sifted keys held by Alice and Bob should be identical. In reality, however, technical imperfections are inherently present and introduce noise into the messages transmitted, so error bits in the sifted key are unavoidable even if there is no eavesdropper. Hence, an error correction procedure is necessary for any quantum cryptography system that transmits raw keys over a noisy quantum communication channel. Error correction protocols must be efficient both in terms of the computational and communication resources used and in terms of the amount of extra information passed to correct the error bits. Limitations of existing solutions: Cascade [5] is one of the most famous protocols and has been applied in many quantum cryptographic systems. Its greatest drawback is that the protocol is highly interactive, with several rounds each involving several communication round-trips between the parties. It is very slow and can only deal with sifted key rates of less than 50,000 bits per second. Another protocol, Buttler et al.’s Winnow [6], addresses the problem of too many round-trip communications in Cascade. This protocol has a significantly smaller number of round-trips, though it is still quite interactive. Its problem is that, in the case of three or more errors in the same block, the method can introduce additional errors. In addition, both Cascade and Winnow need to discard the bits revealed in the interactive communications, which further reduces the rate at which secure keys are generated. 
For example, a typical experimental result of BBN’s Cascade protocol within the DARPA quantum network showed [7] that, for 4096-bit blocks with a 3% error rate, the number of discarded bits was 958 and the payload for interactive communications was 19200 bytes. The problem to be addressed: In summary, the performance of current error correction protocols is not entirely satisfactory in either efficiency or reliability. In this proposal, we aim to develop a better error correction protocol for quantum cryptographic systems: one that is faster than all existing protocols and corrects more errors precisely and robustly, with no interaction needed. |
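The basis-sifting step described in the abstract above can be simulated in a few lines. This is our own toy sketch of a noiseless channel (no eavesdropper, no technical imperfections), so the sifted keys come out identical; in practice the residual error bits are exactly what Cascade-style protocols must then correct.

```python
import random

# Toy BB84-style sifting simulation (ours, not part of the project).
# When bases match, Bob reads Alice's bit; otherwise his result is random.
# Sifting keeps only matched-basis positions, so ~50% of the bits survive.

def bb84_sift(n, rng):
    alice_bases = [rng.randrange(2) for _ in range(n)]
    alice_bits  = [rng.randrange(2) for _ in range(n)]
    bob_bases   = [rng.randrange(2) for _ in range(n)]
    bob_bits = [bit if ab == bb else rng.randrange(2)
                for bit, ab, bb in zip(alice_bits, alice_bases, bob_bases)]
    # Public sifting transaction: discard positions with mismatched bases.
    keep = [i for i in range(n) if alice_bases[i] == bob_bases[i]]
    return [alice_bits[i] for i in keep], [bob_bits[i] for i in keep]

rng = random.Random(42)
a_key, b_key = bb84_sift(10000, rng)
print(len(a_key), a_key == b_key)  # roughly n/2 bits; identical without noise
```

Adding a per-bit flip probability to the channel reproduces the error bits in the sifted key that motivate the error correction protocols discussed above.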
Project Title: | A Harddisk Integrity Problem with Applications to Computer Forensics |
Investigator(s): | Hui CK |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 12/2009 |
Abstract: |
1) To develop a feasible solution to verify the integrity of a harddisk before and after a law enforcement investigation of a suspect's computer, under the assumption that some sectors will be changed (due to bad sectors or legal procedure). 2) To extend the solution to a general harddisk integrity problem so that normal users can also use it to verify the integrity of a harddisk and make sure that no intrusion has occurred. This problem has an additional requirement: since the user keeps changing sectors, the solution should handle each update of the harddisk conveniently. |
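As a rough illustration of objective 1, the sketch below shows the naive baseline: hash every sector before the investigation, then re-hash afterwards to pinpoint exactly which sectors changed. This is our own example (the 512-byte sector size and SHA-256 are assumptions), not the project's scheme, which aims to do better than storing one digest per sector.

```python
import hashlib

# Naive per-sector integrity baseline (ours, for illustration only).
SECTOR = 512  # assumed bytes per sector

def sector_digests(disk_image: bytes):
    """One SHA-256 digest per sector of the image."""
    return [hashlib.sha256(disk_image[i:i + SECTOR]).digest()
            for i in range(0, len(disk_image), SECTOR)]

def changed_sectors(before, after):
    """Indices of sectors whose digest differs between the two snapshots."""
    return [i for i, (b, a) in enumerate(zip(before, after)) if b != a]

disk = bytearray(b"\x00" * SECTOR * 8)   # toy 8-sector image
baseline = sector_digests(bytes(disk))   # taken before the investigation
disk[SECTOR * 3] = 0xFF                  # one sector altered afterwards
print(changed_sectors(baseline, sector_digests(bytes(disk))))  # [3]
```

The storage cost here is one digest per sector; combinatorial-group-testing schemes (as in the related F2GC-09 output below) instead hash overlapping groups of sectors to identify a bounded number of changed sectors with far fewer stored digests.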
Project Title: | The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09) Hard Disk Integrity Check by Hashing with Combinatorial Group Testing |
Investigator(s): | Hui CK |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 12/2009 |
Completion Date: | 12/2009 |
Abstract: |
N/A |
Project Title: | Identifying News from Discussion Forums for Cyber Patrolling using Data Mining Techniques |
Investigator(s): | Hui CK, Yiu SM |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 01/2010 |
Abstract: |
Background and issues Roughly speaking, cyber patrolling refers to the monitoring of websites, peer-to-peer networks, discussion forums and other Internet-based public areas for the detection of threats, pornography, fraud, abnormal behavior, and other types of cyber crime. Cyber patrolling is a new approach to fighting cyber crime. However, it is still at the development stage, and there are many unsolved issues behind the idea. At least two fundamental problems make cyber patrolling a very challenging task. Firstly, there are millions of websites and forums; it is not easy to check and monitor all of these sites and forums every second. Secondly, automatically detecting abnormal behavior is a very difficult problem. Abnormal behavior (or activities related to cyber crime) may appear suddenly and may exist on the Internet for only a short period of time. For example, one may upload a pirated copy of a TV show to the Internet. Other users may rush to download the copy, and the original “seeder” may remove the uploaded copy after a few hours. Thus, if human ‘cyber cops’ are used to monitor the Internet, this kind of abnormal behavior may be missed. It is very important to detect such abnormal behavior in a timely manner. Recently, the European Union (EU) decided to spend £10 million to support a five-year research programme, called Project Indect (http://www.indect-project.eu), to develop an artificial-intelligence-based tool to automatically monitor and process information from websites and the public domain so as to detect threats and abnormal, violent behavior. This is strong evidence that cyber patrolling is a critical direction for investigating cyber crimes. Objectives Cyber patrolling is a very big problem. In this proposal, we do not aim to provide a comprehensive solution for cyber patrolling. 
Instead, we focus on one of the critical subproblems: how to automatically detect sudden abnormal behavior. To keep the scope reasonable and achievable, we consider the following problem in this seed funding project. We assume that the law enforcement party has already identified several popular discussion forums. Problem: We want to design a solution that downloads all postings from the discussion forums on a daily basis; the postings will then be analyzed by a data mining engine to locate a set of emerging terms/phrases that suddenly appear very frequently on the Internet. These terms/phrases may provide hints for the law enforcement parties to identify potential “targets” for further investigation. As an example, in the case of a Hong Kong fast-food worker whose rape of a colleague was filmed and uploaded onto the Internet, the term “Yoshinoya restaurant” suddenly appeared in the discussion forums with a very abnormal frequency. If the law enforcement party can be warned automatically in such cases, they can be more proactive. Of course, this instance was reported to the police by Internet users who viewed the video; however, there may be other cases that were not caught by law enforcement officers. Related work We found no existing solution to this problem, and identifying emerging terms/phrases that appear with abnormally high frequency within a short period of time is not trivial. On the other hand, data mining has been successfully applied to problems related to intrusion detection [1-3] and to criminal investigations, such as classification via discriminant analysis for determining the probability of intentionality associated with downloading contraband images (i.e. child pornography) [4], and content-based image retrieval [5] to identify potential evidence from a huge number of images. There are also fundamental text mining techniques developed by researchers for applying data mining in forensics investigation. 
For example, in [6], Shannon proposed a scoring scheme called “Forensics Relative Strength Scoring”, which includes two statistical measures, namely ASCII proportionality and entropy, to discover digital evidence. For our problem, detection of outliers, which are defined as observations that deviate substantially from other observations, may be more relevant, since these outliers may correspond to emerging/abnormal phrases which suddenly appear with high frequency in the discussion group. There are some existing data mining algorithms [7-9] for finding outliers which may be useful in our project. In particular, [10] defined a local outlier factor (LOF) which indicates the degree to which an object is an outlier. We may try to apply this idea to solve our problem (see the methodology section for more details). |
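As a crude stand-in for the mining engine discussed in the abstract above, the sketch below flags terms whose daily count jumps far above their historical average. It is our own illustration (the `min_count` and `factor` thresholds are invented), not the proposed LOF-based outlier method.

```python
from collections import Counter

# Toy emerging-term detector (ours, for illustration): flag terms whose
# count today greatly exceeds their average daily count in the history.

def emerging_terms(history, today, min_count=5, factor=10.0):
    """history: list of per-day term Counters; today: Counter for the new day."""
    days = max(len(history), 1)
    baseline = Counter()
    for day in history:
        baseline.update(day)
    flagged = []
    for term, count in today.items():
        avg = baseline[term] / days
        # max(avg, 0.1) keeps previously unseen terms from dividing by zero.
        if count >= min_count and count > factor * max(avg, 0.1):
            flagged.append(term)
    return flagged

history = [Counter({"bus": 30, "weather": 20}) for _ in range(7)]
today = Counter({"bus": 28, "weather": 22, "yoshinoya": 40})
print(emerging_terms(history, today))  # ['yoshinoya']
```

An LOF-style approach would instead score each term by how isolated it is relative to the local density of its neighbours, rather than by a fixed frequency ratio.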
Project Title: | Cryptanalysis and Implementation of Cipher |
Investigator(s): | Hui CK, Yiu SM |
Department: | Computer Science |
Source(s) of Funding: | NSFC/RGC Joint Research Scheme |
Start Date: | 01/2010 |
Abstract: |
1) Cryptanalyze hash functions, focusing mainly on the candidates in the SHA-3 competition organized by NIST. 2) Research fast implementation algorithms for hash functions, with applications to the newly developed research field of computer forensics. 3) Cryptanalyze block ciphers such as AES and other widely used block ciphers, and cryptanalyze block ciphers with special structures, such as those with non-XOR subkey operations or large S-boxes, as well as lightweight block ciphers designed in recent years for low-cost environments, such as PRESENT and DESL. 4) Research security proofs in the random oracle model based on block ciphers and hash functions. |
List of Research Outputs |
Chim T.W., Yiu S.M., Hui C.K. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications Schemes for VANETs, Ad Hoc Networks. 2010, 28: 160 - 175. |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, ADHOCNETS '09. Niagara Falls, Canada, 2009. |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, The First International Conference on Ad Hoc Networks (AdHocNets 2009). Canada, 2009. |
Dong Y., Chim T.W., Li V.O.K., Yiu S.M. and Hui C.K., ARMR: Anonymous Routing Protocol with Multiple Routes for Communications in Mobile Ad Hoc Networks, Ad Hoc Networks. 2009, 7(8): 1536-50. |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Efficient key integrity verification for quantum cryptography using combinatorial group testing, SPIE Defense, Security, and Sensing Conference. USA, 2010. |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Hard Disk Integrity Check by Hashing with Combinatorial Group Testing, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Jiang L., Yiu S.M., Dong Y., Hui C.K. and Wong H.Y., Secure Chained Threshold Proxy Signature without and with Supervision, Journal of Software Engineering and Applications. 2009, 2(4): 267 - 275. |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Researcher : Hung RYS |
List of Research Outputs |
Hung R.Y.S. and Ting H.F., Design and analysis of online batching systems, Algorithmica. 2010, 57: 217-228. |
Hung R.Y.S., Lee L.K. and Ting H.F., Finding frequent items over sliding windows with constant update time, Information processing letters. 2010, 110: 257-260. |
Researcher : Huo Q |
Project Title: | International Conference on Pattern Recognition (ICPR-2002) A Comparative Study of Several Modeling Approaches for Large Vocabulary Offline Recognition of Handwritten Chinese Characters |
Investigator(s): | Huo Q |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 08/2002 |
Abstract: |
N/A |
Project Title: | Chinese spoken language processing |
Investigator(s): | Huo Q |
Department: | Computer Science |
Source(s) of Funding: | Anhui USTC iFLYTEK Co. Ltd. - General Award |
Start Date: | 09/2005 |
Abstract: |
To research and develop state-of-the-art Chinese spoken language processing technology. |
Researcher : Ieong SCR |
List of Research Outputs |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K., Law Y.W., Tse K.S.H. and Tse W.H.K., Forensic Investigation of Peer-to-Peer Networks, In: Chang-Tsun Li (University of Warwick, UK), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solution. United Kingdom, IGI Global, 2009, 355-378. |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K. and Law Y.W., Is it an Initial Seeder? Derived Rules that Indicate a Seeder is within the Slow-Rising Period, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Tse W.H.K., Chow K.P., Law Y.W., Ieong S.C.R., Kwan Y.K., Tse K.S.H. and Lai K.Y., Analyzing Storage Media of Digital Camera, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : Jiang B |
List of Research Outputs |
Jiang B., Zhang Z., Chan W.K. and Tse T.H., Adaptive random test case prioritization, Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering (ASE 2009), IEEE Computer Society Press, Los Alamitos, CA (acceptance rate: 17.1%). 2009, 233−244. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., Best Paper Award, 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), IEEE Computer Society, Washington, DC, USA . 2009. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., How well do test case prioritization techniques support statistical fault localization [Best Paper Award, COMPSAC 2009], Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, 99−106. |
Zhang Z., Chan W.K., Tse T.H., Jiang B. and Wang X., Capturing propagation of infected program states, Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC 2009/FSE-17), ACM Press, New York, NY (acceptance rate: 14.7%). 2009, 43−52. |
Zhang Z., Jiang B., Chan W.K., Tse T.H. and Wang X., Fault localization through evaluation sequences, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2010, 83 (2): 174−187. |
Researcher : Jiang L |
List of Research Outputs |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, ADHOCNETS '09. Niagara Falls, Canada, 2009. |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, The First International Conference on Ad Hoc Networks (AdHocNets 2009). Canada, 2009. |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Efficient key integrity verification for quantum cryptography using combinatorial group testing, SPIE Defense, Security, and Sensing Conference. USA, 2010. |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Hard Disk Integrity Check by Hashing with Combinatorial Group Testing, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Jiang L., New cryptographic schemes with application in network security and computer forensics. Hong Kong, 2010. |
Jiang L., Yiu S.M., Dong Y., Hui C.K. and Wong H.Y., Secure Chained Threshold Proxy Signature without and with Supervision, Journal of Software Engineering and Applications. 2009, 2(4): 267-275. |
Researcher : Kao CM |
Project Title: | Online Analytical Processing on Sequence Data |
Investigator(s): | Kao CM, Cheung DWL |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2008 |
Abstract: |
1. System Architecture. Our first objective is to study the various components of a sequence OLAP system and their interactions. Our S-OLAP architecture consists of the following components: an event database, a sequence database, an S-cuboid computation unit, an S-cuboid storage system, and a user interface that allows a user to identify an S-cuboid for query processing. 2. S-OLAP Operations. Traditional OLAP systems provide a few cube operations such as roll-up, drill-down, slice, and dice, which transform a cuboid to another. An S-OLAP system should support similar operations, in particular, operations that are pertinent to pattern manipulations. Our second objective is to propose a set of S-OLAP operations and study how they can be implemented efficiently. 3. Cuboid Materialization. Our third objective addresses the issue of how and when S-cuboids are materialized, computed, stored, and retrieved. We note that the number of possible S-cuboids could be huge. For example, if event records have d dimension attributes, each expressible under k abstraction levels, then there are as many as (kd)^s possible S-cuboids of pattern length s. Materializing and maintaining these large number of S-cuboids is a challenging research problem. 4. Aggregation Models. Our fourth objective is to study various aggregation models in S-OLAP. For example, a sequence “aabbab” matches the pattern “ab” at multiple places, and thus an aggregation that counts multiplicity could be useful. Another aggregation model is to summarize certain “measure attributes” of the events that constitute a sequence. For example, Octopus sequences register payment events. A useful aggregate would be the sum of money spent in all the events of a sequence group. 5. Prototyping and Applications. Our fifth objective is to build a prototype S-OLAP system on which certain specific applications are developed. 
The investigators have conducted meetings with a local railway company to understand the data characteristics and user requirements of such an S-OLAP system. We expect to evaluate the performance of our S-OLAP system on an executable platform. |
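The multiplicity-counting aggregation model described above (a sequence such as "aabbab" matching the pattern "ab" at multiple places) can be illustrated with a short sketch. This is an illustrative example only, not part of the S-OLAP system; the function name is hypothetical. It counts, in a single pass, how many times a pattern occurs as a subsequence of an event sequence:

```python
def count_pattern_occurrences(sequence, pattern):
    """Count occurrences of `pattern` as a subsequence of `sequence`.

    dp[j] = number of ways to match the first j symbols of the pattern
    using the events scanned so far.
    """
    dp = [0] * (len(pattern) + 1)
    dp[0] = 1  # the empty pattern always matches exactly once
    for event in sequence:
        # scan pattern positions right-to-left so each event is used at most
        # once per match
        for j in range(len(pattern), 0, -1):
            if event == pattern[j - 1]:
                dp[j] += dp[j - 1]
    return dp[len(pattern)]

print(count_pattern_occurrences("aabbab", "ab"))  # 7 subsequence matches
```

For "aabbab" and pattern "ab", the three a's (at positions 0, 1 and 4) pair with the b's that follow them (3 + 3 + 1), giving a multiplicity count of 7.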
List of Research Outputs |
Bi B., Shang L. and Kao C.M., Collaborative Resource Discovery in Social Tagging Systems, Proceedings of The 18th ACM Conference on Information and Knowledge Management (ACM CIKM 2009). 2009. |
Cheng C.K., Kao C.M., Kwan K.L., Prabhakar S. and Tu Y., Filtering Stream Data for Entity-based Continuous Queries, IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE). IEEE, 2010, 22: 234-248. |
Chui C.K., Lo E., Kao C.M. and Cheung D.W.L., S-OLAP: an OLAP system for analyzing sequence data, ACM SIGMOD International Conference on Management of Data 2010 (SIGMOD 2010). 2010. |
Chui C.K., Lo E., Kao C.M. and Ho W.S., Supporting Ranking Pattern-Based Aggregate Queries in Sequence Data Cubes, Proceedings of The 18th ACM Conference on Information and Knowledge Management (ACM CIKM 2009). 2009. |
Ren J., Lee S.D., Chen X., Kao C.M., Cheng C.K. and Cheung D.W.L., Naive Bayes Classification of Uncertain Data, The IEEE International Conference on Data Mining (IEEE ICDM 2009). 2009. |
Wong W.K., Cheung D.W.L., Hung E., Kao C.M. and Mamoulis N., An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of the 35th International Conference on Very Large Data Bases (VLDB 2009). 2009. |
Wong W.K., Cheung D.W.L., Hung E. and Kao C.M., An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of the 35th Very Large Data Bases Conference (VLDB). Lyon, France, 2009, 1162-1172. |
Researcher : Kwan KL |
List of Research Outputs |
Cheng C.K., Kao C.M., Kwan K.L., Prabhakar S. and Tu Y., Filtering Stream Data for Entity-based Continuous Queries, IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE). IEEE, 2010, 22: 234-248. |
Researcher : Kwan YK |
List of Research Outputs |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K., Law Y.W., Tse K.S.H. and Tse W.H.K., Forensic Investigation of Peer-to-Peer Networks, In: Chang-Tsun Li (University of Warwick, UK), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solution. United Kingdom, IGI Global, 2009, 355-378. |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K. and Law Y.W., Is it an Initial Seeder? Derived Rules that Indicate a Seeder is within the Slow-Rising Period, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Kwan Y.K., Overill R.E., Chow K.P., Silomon J.A.M., Tse K.S.H., Law Y.W. and Lai K.Y., Evaluation of Evidence in Internet Auction Fraud Investigations, 6th Annual IFIP WG 11.9 International Conference on Digital Forensics, Advances in Digital Forensics VI, Ch.7. Hong Kong, Springer, 2010, 95-106. |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Tse W.H.K., Chow K.P., Law Y.W., Ieong S.C.R., Kwan Y.K., Tse K.S.H. and Lai K.Y., Analyzing Storage Media of Digital Camera, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : Kwok WCH |
List of Research Outputs |
Cheung D.W.L. and Kwok W.C.H., Consulting Services on B2B Connectors, Apacus Software Corporation Ltd. 2010. |
Cheung D.W.L. and Kwok W.C.H., Consulting Services on Hermes H2O, Apacus Software Corporation Ltd. 2010. |
Cheung D.W.L. and Kwok W.C.H., Enhancement on HKMA’s Data Transformer System, ICO Limited, and Hong Kong Monetary Authority. 2010. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, Curalia AB and Agresso AB, Sweden. 2009. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, Babelway. 2010. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, Connective-IT, France. 2009. |
Cheung D.W.L. and Kwok W.C.H., Open Source ebXML Gateway Hermes, OGCIO, HKSARG. 2009. |
Cheung D.W.L. and Kwok W.C.H., Software License, B2B Connector, Kerry Logistics. 2010. |
Kwok W.C.H. and Cheung D.W.L., Data Harmonization for ASEAN Single Window Project, Apacus Software Corporation Ltd. 2009. |
Kwok W.C.H. and Cheung D.W.L., Strategy Study on the Mass Rollout of the Insurance Portal System, Transport Department, HKSAR. 2009. |
Researcher : Lai KY |
List of Research Outputs |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K., Law Y.W., Tse K.S.H. and Tse W.H.K., Forensic Investigation of Peer-to-Peer Networks, In: Chang-Tsun Li (University of Warwick, UK), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solution. United Kingdom, IGI Global, 2009, 355-378. |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K. and Law Y.W., Is it an Initial Seeder? Derived Rules that Indicate a Seeder is within the Slow-Rising Period, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Kwan Y.K., Overill R.E., Chow K.P., Silomon J.A.M., Tse K.S.H., Law Y.W. and Lai K.Y., Evaluation of Evidence in Internet Auction Fraud Investigations, 6th Annual IFIP WG 11.9 International Conference on Digital Forensics, Advances in Digital Forensics VI, Ch.7. Hong Kong, Springer, 2010, 95-106. |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Tse W.H.K., Chow K.P., Law Y.W., Ieong S.C.R., Kwan Y.K., Tse K.S.H. and Lai K.Y., Analyzing Storage Media of Digital Camera, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : Lam KT |
List of Research Outputs |
Lam K.T., Luo Y. and Wang C.L., Adaptive Sampling-Based Profiling Techniques for Optimizing the Distributed JVM Runtime, The 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS2010) . 2010. |
Lam K.T. and Wang C.L., Web Application Server Clustering with Distributed Java Virtual Machine (Chapter 28), In: Kuan-ching Li, Ching-hsien Hsu, Laurence Tianruo Yang, Jack Dongarra, Hans Zima , Handbook of Research on Scalable Computing Technologies. IGI Global, 2009, 658-681. |
Luo Y., Lam K.T. and Wang C.L., Path-Analytic Distributed Object Prefetching, The 10th International Symposium on Pervasive Systems, Algorithms and Networks (I-SPAN 2009). 2009. |
Researcher : Lam TW |
Project Title: | Compressed indexes for approximate string matching, with applications to biological sequences |
Investigator(s): | Lam TW |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 08/2006 |
Abstract: |
The objectives of this proposal contain special fonts; they are included in the research background section as a combined PDF file. |
Project Title: | An Algorithmic Study of Energy Efficient Online Scheduling |
Investigator(s): | Lam TW |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 11/2008 |
Completion Date: | 10/2009 |
Abstract: |
Included in the attachment of Section VI. |
Project Title: | The 9th Workshop on Models and Algorithms for Planning and Scheduling Problems (MAPSP 2009) Deadline Scheduling and Power Management for Speed Bounded Processors |
Investigator(s): | Lam TW |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 06/2009 |
Completion Date: | 07/2009 |
Abstract: |
N/A |
Project Title: | An algorithmic study of online communication for maintaining data statistics |
Investigator(s): | Lam TW |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2009 |
Abstract: |
1) Basic counting: To derive communication-efficient algorithms to enable the controller to count a given subset of data items in all data streams; 2) Frequent items: To derive communication-efficient algorithms to enable the controller to report data items whose frequency exceeds a given fraction; 3) Quantile: To derive communication-efficient algorithms to enable the controller to report the data item of any given rank (when all data items are sorted); 4) To derive lower bounds on the communication required for maintaining the above statistics in different window models. |
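One classical building block for the frequent-items objective above is the Misra-Gries summary, which tracks candidate heavy hitters with a bounded number of counters. The sketch below is illustrative only and is not the communication-efficient distributed algorithm the project targets; it shows the single-stream guarantee that any item with frequency above a 1/k fraction survives in the summary:

```python
def misra_gries(stream, k):
    """Misra-Gries summary with at most k-1 counters.

    Any item whose frequency exceeds len(stream)/k is guaranteed to
    remain among the tracked counters at the end of the pass.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # decrement every counter; drop those that reach zero
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

stream = ["a", "b", "a", "c", "a", "b", "a", "d", "a"]
print(misra_gries(stream, 3))  # "a" (frequency 5/9 > 1/3) is retained
```

The distributed setting studied in the project additionally requires each remote site to decide when a counter change is worth communicating to the controller, which is where the communication lower and upper bounds come in.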
Project Title: | ALGO 2009 Approximating Frequent Items in Asynchronous Data Stream over a Sliding Window |
Investigator(s): | Lam TW |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 09/2009 |
Completion Date: | 09/2009 |
Abstract: |
N/A |
Project Title: | Outstanding Research Student Supervisor Award 2008-2009 |
Investigator(s): | Lam TW |
Department: | Computer Science |
Source(s) of Funding: | Outstanding Research Student Supervisor Award |
Start Date: | 12/2009 |
Abstract: |
For recognizing, rewarding and encouraging exceptional research achievements; and for strengthening the research culture of the University. |
List of Research Outputs |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Approximating frequent items in asynchronous data stream over a sliding window, The 7th Workshop on Approximation and Online Algorithms. Springer, 2009, 49-61. |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window, 27th International Symposium on Theoretical Aspects of Computer Science, STACS 2010. LIPIcs, 179-190. |
Chan H.L., Chan J.W.T., Lam T.W., Lee L.K., Mak K.S. and Wong P.W.H., Optimizing Throughput and Energy in Online Deadline Scheduling, ACM Transactions on Algorithms. 2009, 6(1): -. |
Chan S.H., Lee L.K., Lam T.W. and Ting H.F., Non-clairvoyant scheduling for weighted flow time and energy on speed bounded processors, The 16th Computing: the Australasian Theory Symposium (CATS). 2010, 3-10. |
Hon W.K., Lam T.W., Shah R., Tam S.L. and Vitter J., Succinct Index for Dynamic Dictionary Matching, The 20th International Symposium on Algorithms and Computation. Springer, 2009, LNCS 5878: 1034-1043. |
Lam T.W., Li R., Tam S.L., Wong C.K., Wu M.K.E. and Yiu S.M., High Throughput Short Read Alignment via Bi-directional BWT, The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2009). 2009, 31 - 36. |
Lam T.W., Journal of Applied Mathematics. Hindawi Publishing Corporation, 2009. |
Lam T.W., Online Scheduling and Energy Efficiency, The Theoretical Computer Science Technical Committee of the China Computer Federation. 2009. |
Lam T.W., Outstanding Research Student Supervisor Award, The University of Hong Kong. 2009. |
Lam T.W., Lee L.K., Ting H.F., To K.K. and Wong W.H., Sleep with guilt and work faster to minimize flow plus energy, In: S. Albers, The 36th International Colloquium on Automata, Languages and Programming (ICALP). Springer, 2009, 665-676. |
Li R., Yu C., Li Y., Lam T.W., Yiu S.M., Kristiansen K. and Wang J., SOAP2: an Improved Ultrafast Tool for Short Read Alignment, Bioinformatics. 2009, 25(15): 1966-67. |
Tam S.L., Wu M.K.E., Lam T.W. and Yiu S.M., Succinct Text Indexing with Wildcards, The 16th String Processing and Information Retrieval Symposium (SPIRE). Springer, 2009, 39-50. |
Wong K.F., Cheung W.Y., Lam T.W. and Yiu S.M., Local Structural Alignment of Regular RNA with Affine Gap Model, The 6th International Symposium on Bioinformatics Research and Applications (ISBRA’10). 2010. |
Wong K.F., Lam T.W., Sung W.K. and Yiu S.M., Structural Alignment of RNA with Complex Pseudoknot Structure, The 9th International Workshop on Algorithms in Bioinformatics (WABI). 2009. |
Yang S., Agrawal K.R., Lam T.W., Sham P.C., Cheah K.S.E. and Wang J.J., Novel Profile Hidden Markov Model to Predict MicroRNAs and their Targets Simultaneously, 14th Research Postgraduate Symposium, The University of Hong Kong, December 2-3. 2009. |
Yang S., Agrawal K.R., Lam T.W., Sham P.C., Cheah K.S.E. and Wang J.J., Novel Profile Hidden Markov Model to predict microRNAs in Degenerative Disc Disease. University Grants Council site visit for Area of Excellence Programme "Developmental Genomics & Skeletal Research". January 25, 2010. |
Researcher : Lau FCM |
Project Title: | The Hong Kong University grid point |
Investigator(s): | Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | Matching Fund for Hi-Tech Research and Development Program of China (863 Projects) |
Start Date: | 12/2002 |
Abstract: |
To study grid point of The University of Hong Kong. |
Project Title: | HP mobile technology for teaching initiative |
Investigator(s): | Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | Hewlett-Packard HK SAR Ltd. |
Start Date: | 10/2004 |
Abstract: |
To develop HP mobile technology for teaching initiative. |
Project Title: | UGC-Matching Grant Scheme (2nd Phase)-Process Migration and Runtime Scheduling for Parallel Tasks in Computational Grids |
Investigator(s): | Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | Other Funding Scheme |
Start Date: | 05/2006 |
Abstract: |
To study process migration and runtime scheduling for parallel tasks in computational grids. |
Project Title: | High-Performance Wireless Link Scheduling with Topology Control using Non-Linear Power Assignment |
Investigator(s): | Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2007 |
Completion Date: | 08/2009 |
Abstract: |
(1) To design algorithms (combining link scheduling, topology control and power control) that can minimize both the scheduling complexity and the energy complexity for the network connectivity property. (2) To design algorithms that can minimize both the scheduling complexity and the energy complexity for other important network properties, such as k-connectivity and t-spanner. (3) Allowing packet-level control of the transmit power, to design computationally efficient and energy-efficient wireless link scheduling algorithms with approximation bounds for arbitrary traffic demands. (“Arbitrary traffic demands” can be seen as a kind of arbitrary topology where the traffic demand is interpreted as a set of edges connecting two nodes. For simplicity, we can assume the traffic demands, or the arbitrary topology, are given in advance.) (4) To devise distributed approximation algorithms for the above problems. |
Project Title: | Nature-Inspired Particle Mechanics Algorithm |
Investigator(s): | Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2008 |
Abstract: |
1. To present Particle Mechanics Algorithm (PMA) as a new (physics-based) branch of Nature-inspired Algorithm (NA). 2. Based on the Particle Mechanics (PM) model, to propose a new distributed-parallel swarm intelligent model and its theory (in terms of growth, evolution, phase transformation, etc.), and based on the proposed model and theory, to deal with and explain a variety of complicated phenomena that might occur in dynamic environments, which are difficult for traditional models and approaches to handle. 3. By extending our previous approaches ([DsXf05a,XfFl06a,XfFl07b,XfFl07c,XfFl07d,XfFl07e]), to develop a generic parallel-distributed method based on PM models (of various granularities) for swarm intelligence. 4. To implement some particle mechanics algorithms to test our approaches for their ability to adapt to real-world NP-hard problems in real-time dynamic environments that involve various interactions, metabolism, phase transformation, etc., and to show that our approaches can indeed overcome some of the deficiencies of traditional approaches. |
Project Title: | 2nd ACM International Conference on Recommender Systems Personalized Online Document, Image and Video Recommendation via Commodity Eye-tracking |
Investigator(s): | Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 10/2008 |
Abstract: |
N/A |
Project Title: | HKU Grid Point for Systems Research and Applications in Multiple Disciplines |
Investigator(s): | Lau FCM, Wang CL, Ho PT, Chan GKY, Li WK |
Department: | Computer Science |
Source(s) of Funding: | UGC One-off Special Equipment Grant Scheme |
Start Date: | 12/2008 |
Abstract: |
1) To support High Performance Computing and Grid computing for compute-intensive research projects in multiple disciplines in science and engineering; 2) To provide a platform for Hong Kong’s participation in the China National Grid (CNGrid) development led by the Chinese Academy of Sciences; 3) To provide leadership in fostering collaborative development of Grid Computing among local universities, and to provide an enabling facility for regional and international collaborations. |
Project Title: | Throughput-Optimal Wireless Link Scheduling under the Physical Interference Model |
Investigator(s): | Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2009 |
Abstract: |
1) For the minimum-length scheduling (MLS) problem, we aim to first give a formal hardness analysis, and then to find whether there are centralized approximate link scheduling algorithms with performance guarantee better than O(n/log n). Second, for some practical link topologies, such as a tree, we want to design a better heuristic scheduling algorithm which has an upper bound lower than O(log^2 n). If possible, we also want to find if there exists some non-trivial lower bound; 2) For the link scheduling problem under stochastic-arrival traffic, we want to design centralized and efficient scheduling algorithms that have desirable performance guarantees over the optimal capacity region under SINR constraints for practical link topologies; 3) Based on (1), we want to design localized joint power control and link scheduling algorithms with performance guarantees for practical link topologies; 4) Based on (2), we want to design low-complexity distributed versions of these scheduling algorithms. |
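The SINR constraint underlying all four objectives above admits a simple feasibility test: a set of links may transmit in the same slot only if every receiver's signal-to-interference-plus-noise ratio meets the threshold. The sketch below is a minimal illustration under assumed parameters (path-loss exponent `alpha`, SINR threshold `beta`, noise floor); it is not one of the scheduling algorithms the project proposes:

```python
import math

def sinr_feasible(senders, receivers, power, alpha=3.0, beta=1.5, noise=1e-9):
    """Check whether all links can transmit simultaneously under the
    physical (SINR) interference model.

    senders/receivers: lists of (x, y) positions, link i being
    senders[i] -> receivers[i]; power: per-link transmit powers.
    """
    def gain(p, q):
        # path-loss model: received power falls off as distance^-alpha
        return 1.0 / (math.dist(p, q) ** alpha)

    for i, r in enumerate(receivers):
        signal = power[i] * gain(senders[i], r)
        interference = sum(power[j] * gain(senders[j], r)
                           for j in range(len(senders)) if j != i)
        if signal / (noise + interference) < beta:
            return False
    return True

# two well-separated links can share a slot; two adjacent links cannot
print(sinr_feasible([(0, 0), (100, 0)], [(1, 0), (101, 0)], [1.0, 1.0]))
print(sinr_feasible([(0, 0), (2, 0)], [(1, 0), (3, 0)], [1.0, 1.0]))
```

The minimum-length scheduling problem in objective 1 is then to partition a given link set into the fewest slots such that each slot passes this test, which is what makes the problem combinatorially hard.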
List of Research Outputs |
Hua Q., Wang Y., Yu D. and Lau F.C.M., Dynamic Programming Based Algorithms for Set Multicover and Multiset Multicover Problems, Theoretical Computer Science. Elsevier, 2010, 411: 2467-2474. |
Hua Q. and Lau F.C.M., Joint Link Scheduling and Topology Control for Wireless Sensor Networks with SINR Constraints, In: Jin Hai, Jiang Wenbin, Handbook of Research on Developments and Trends in Wireless Sensor Networks: From Principle to Practice. IGI Global, 2010, 184-208. |
Hua Q., Wang Y., Yu D. and Lau F.C.M., Set Multi-Covering via Inclusion-Exclusion, Theoretical Computer Science. Elsevier, 2009, 410: 3882-3892. |
Huang W., Wu C. and Lau F.C.M., The Performance and Locality Trade-off in BitTorrent-like P2P File-Sharing Systems, IEEE International Conference on Communications (IEEE ICC 2010). 2010. |
Qiu X., Huang W., Wu C., Lau F.C.M. and Lin X., InstantLeap: an Architecture for Fast Neighbor Discovery in Large-scale P2P VoD Streaming, In: Thomas Plagemann, Springer Multimedia Systems Journal. Springer-Verlag, 2010, 16: 183-198. |
Researcher : Law YW |
List of Research Outputs |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K., Law Y.W., Tse K.S.H. and Tse W.H.K., Forensic Investigation of Peer-to-Peer Networks, In: Chang-Tsun Li (University of Warwick, UK), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solution. United Kingdom, IGI Global, 2009, 355-378. |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K. and Law Y.W., Is it an Initial Seeder? Derived Rules that Indicate a Seeder is within the Slow-Rising Period, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Kwan Y.K., Overill R.E., Chow K.P., Silomon J.A.M., Tse K.S.H., Law Y.W. and Lai K.Y., Evaluation of Evidence in Internet Auction Fraud Investigations, 6th Annual IFIP WG 11.9 International Conference on Digital Forensics, Advances in Digital Forensics VI, Ch.7. Hong Kong, Springer, 2010, 95-106. |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Tse W.H.K., Chow K.P., Law Y.W., Ieong S.C.R., Kwan Y.K., Tse K.S.H. and Lai K.Y., Analyzing Storage Media of Digital Camera, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : Lee KF |
List of Research Outputs |
Lee K.F., Clustering Uncertain Data Using Voronoi Diagram. Hong Kong, 2009. |
Researcher : Lee KW |
List of Research Outputs |
Lee K.W., Mesh Denoising and Feature Extraction from Point Cloud. Hong Kong, 2009. |
Researcher : Lee LK |
List of Research Outputs |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Approximating frequent items in asynchronous data stream over a sliding window, The 7th Workshop on Approximation and Online Algorithms. Springer, 2009, 49-61. |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window, 27th International Symposium on Theoretical Aspects of Computer Science, STACS 2010. LIPIcs, 179-190. |
Chan H.L., Chan J.W.T., Lam T.W., Lee L.K., Mak K.S. and Wong P.W.H., Optimizing Throughput and Energy in Online Deadline Scheduling, ACM Transactions on Algorithms. 2009, 6(1): -. |
Chan S.H., Lee L.K., Lam T.W. and Ting H.F., Non-clairvoyant scheduling for weighted flow time and energy on speed bounded processors, The 16th Computing: the Australasian Theory Symposium (CATS). 2010, 3-10. |
Hung R.Y.S., Lee L.K. and Ting H.F., Finding frequent items over sliding windows with constant update time, Information Processing Letters. 2010, 110: 257-260. |
Lam T.W., Lee L.K., Ting H.F., To K.K. and Wong W.H., Sleep with guilt and work faster to minimize flow plus energy, In: S. Albers, The 36th International Colloquium on Automata, Languages and Programming (ICALP). Springer, 2009, 665-676. |
Researcher : Lee SD |
Project Title: | Spring Workshop on Mining and Learning (SML) 2010 Mining of Uncertain Data |
Investigator(s): | Lee SD |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 03/2010 |
Completion Date: | 03/2010 |
Abstract: |
N/A |
List of Research Outputs |
Ren J., Lee S.D., Chen X., Kao C.M., Cheng C.K. and Cheung D.W.L., Naive Bayes Classification of Uncertain Data, The IEEE International Conference on Data Mining (IEEE ICDM 2009). 2009. |
Researcher : Lee TYT |
List of Research Outputs |
Lee T.Y.T., Hon C.T. and Cheung D.W.L., XML Schema Design and Management for e-Government Data Interoperability , The Electronic Journal of e-Government. UK, Academic Publishing Limited, 2009, 7: 381-390. |
Researcher : Lee YT |
List of Research Outputs |
Lee Y.T., Formalisms on Semi-structured and Unstructured Data Schema Computations. Hong Kong, 2010. |
Researcher : Leung CM |
List of Research Outputs |
Kao M.Y., Leung C.M., Sun H. and Zhang Y., Deterministic Polynomial-Time Algorithms for Designing Short DNA Words, 7th Annual Conference on Theory and Applications of Models of Computation (TAMC 2010). 2010. |
Leung C.M., Leung S.Y., Yiu S.M. and Chin F.Y.L., Predicting Conserved Metabolic Pathways Leading to an Important Final Product (Poster), The 8th Annual International Conference on Computational Systems Bioinformatics Conference (CSB 2009). Stanford University, USA, 2009. |
Peng Y., Leung C.M., Yiu S.M., Chin F.Y.L. and Li R., Assembling Short Reads with Much Less Memory, The 11th International Meeting on Human Genome Variation and Complex Genome Analysis (HGV 2009). Tallinn, Estonia, 2009. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, BMC Bioinformatics. London, United Kingdom, BioMed Central, 2010, 11 (Suppl 2): S5. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, The Third International Workshop on Data and Text Mining in Bioinformatics (DTMBIO 09) in the 18th ACM Conference on Information and Knowledge Management. Hong Kong, 2009. |
Researcher : Leung SY |
List of Research Outputs |
Leung C.M., Leung S.Y., Yiu S.M. and Chin F.Y.L., Predicting Conserved Metabolic Pathways Leading to an Important Final Product (Poster), The 8th Annual International Conference on Computational Systems Bioinformatics Conference (CSB 2009). Stanford University, USA, 2009. |
Leung S.Y., Predicting Metabolic Pathways from Metabolic Networks. Hong Kong, 2009. |
Researcher : Liang C |
List of Research Outputs |
Liang C. and Wong K.K.Y., 3D Reconstruction Using Silhouettes from Unordered Viewpoints, Image and Vision Computing. 2010, 28(4): 579-589. |
Researcher : Lin Z |
List of Research Outputs |
Lin Z., Advanced Spatial Queries in Wireless Ad Hoc Networks. Hong Kong, 2009. |
Researcher : Liu M |
List of Research Outputs |
Liu M. and Wong K.K.Y., Shape from Shadings under Perspective Projection and Turntable Motion, International Conference on Computer Vision Theory and Applications. 2010, 28-35. |
Researcher : Liu Y |
List of Research Outputs |
Yan D., Levy B., Liu Y., Sun F. and Wang W.P., Isotropic Remeshing with Fast and Exact Computation of Restricted Voronoi Diagram, Computer Graphics Forum (Symposium on Geometry Processing 2009). Berlin, Germany, 1445-1454, 2009, 28(5): 10pp. |
Researcher : Lu H |
List of Research Outputs |
Zhang Z., Chan W.K., Tse T.H., Lu H. and Mei L., Resource prioritization of code optimization techniques for program synthesis of wireless sensor network applications, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2009, 82 (9): 1376−1387. |
Researcher : Lu L |
List of Research Outputs |
Liu Y., Wang W.P., Levy B., Sun F., Yan D., Lu L. and Yang C.L., On Centroidal Voronoi Tessellation -- Energy Smoothness And Fast Computation, ACM Transactions on Graphics. 2009, 28, No. 24. |
Researcher : Luo Y |
List of Research Outputs |
Lam K.T., Luo Y. and Wang C.L., Adaptive Sampling-Based Profiling Techniques for Optimizing the Distributed JVM Runtime, The 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS2010) . 2010. |
Luo Y., Lam K.T. and Wang C.L., Path-Analytic Distributed Object Prefetching, The 10th International Symposium on Pervasive Systems, Algorithms and Networks (I-SPAN 2009). 2009. |
Researcher : Mak KS |
List of Research Outputs |
Chan H.L., Chan J.W.T., Lam T.W., Lee L.K., Mak K.S. and Wong P.W.H., Optimizing Throughput and Energy in Online Deadline Scheduling, ACM Transactions on Algorithms. 2009, 6(1): -. |
Researcher : Mamoulis N |
Project Title: | Evaluation of advanced spatial queries in sensor networks |
Investigator(s): | Mamoulis N |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2007 |
Completion Date: | 12/2009 |
Abstract: |
(1) We aim at providing methods that continuously evaluate advanced spatial queries in a sensor network locally, avoiding as much as possible unnecessary transmission of information to (potentially remote) centralized servers (i.e., information collectors). We are especially interested in queries that combine information from neighboring nodes, instead of monitoring simple aggregate results in a spatial region. An interesting class of such queries is spatial-pattern queries (a.k.a. structural spatial queries [PMD98]), which are generalizations of spatial joins. An example of a query in this class is “a high-temperature reading (>50 degrees) close to and surrounded by at least four low-humidity readings (<40%)”. A local group of sensors whose readings satisfy this query could indicate an area that requires special attention (e.g., a high chance of fire in a green area). The objectives of this project can be summarized as follows: 
(2) The formal definition of spatial pattern queries, whose results are combinations of sensors (i.e., locations) and interesting readings thereof. We aim to formally define such queries and language extensions for expressing them. In addition, we will investigate variants and extensions of the basic class of pattern queries. We will deal with instantaneous and continuous versions of the queries (with expiration times). We will also consider queries with temporal predicates between sensor readings (e.g., “report cases where a high temperature is sensed at most 5 seconds after a low-humidity reading, nearby”). 
(3) The development of routing mechanisms and communication protocols for registering spatial pattern queries at sensors. Sensors should be aware of the queries that may involve them and be responsible for their evaluation (in their local neighborhood). Once a query is registered in the network, the affected sensors should be notified by appropriate routing mechanisms. In addition, we will develop mechanisms for communicating query results, computed locally, to a remote receiver (which may well be the query issuer). Moreover, we will deal with multiple-query optimization, i.e., how to compute results of multiple sub-queries simultaneously and then extend them. Another related issue is how to encode/index queries at the sensor nodes in order to effectively determine whether a sensed value is relevant to a query, and to which one. 
(4) The development of computationally efficient and power-aware query evaluation mechanisms. First, we should avoid continuous acquisition of sensor readings, which would flood the network and negatively affect the power-life of the sensors that generate or forward them. Instead, we will develop effective mechanisms for localized computation of query results. Leader sensor nodes will be chosen and will be responsible for the evaluation and transmission of query results, aiming at minimization of (i) communication between sensors (i.e., messages and their sizes) and (ii) processing of information at sensors. 
(5) The development of algorithms for approximate query evaluation in redundant networks. Sensor networks may be redundant; more nodes than necessary may be densely placed in a field in order to increase the accuracy of the sensed environmental readings and prevent total network failure when some nodes fail. We plan to apply query evaluation only at a small sample of the network, large enough to provide us with query results that may not be complete, but accurate enough based on guarantees derived from the sampling methodology. For this case, we will adapt previous mechanisms, or develop new ones depending on the nature of the queries, for node clustering, dynamic leader selection, and leader shutdown handling. 
(6) A successful outcome of this project will be of significant importance, since it may find application in several surveillance applications, including (1) emergency planning and disaster discovery, e.g., alerting fire departments on abnormal readings of sensor combinations; (2) real-time monitoring and analysis of movement patterns, indicated by thermal/motion sensors; and (3) environmental monitoring, e.g., predicting weather phenomena based on known combinations of sensor readings. |
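As a purely illustrative reading of the example pattern query quoted above, a centralized brute-force evaluator can be sketched in Python. The function name `pattern_matches`, the input layout, and the distance threshold `d` are all hypothetical; the project's in-network, localized evaluation is precisely what this naive centralized version is meant to avoid:

```python
import math

def pattern_matches(readings, d=10.0):
    """Report sensors with a high-temperature reading (>50 degrees) that have
    at least four low-humidity readings (<40%) within distance d.
    `readings` maps sensor id -> (x, y, temperature, humidity)."""
    hits = []
    for sid, (x, y, temp, _hum) in readings.items():
        if temp <= 50:
            continue
        # count nearby low-humidity sensors (excluding the sensor itself)
        nearby_dry = sum(
            1 for oid, (ox, oy, _t, hum) in readings.items()
            if oid != sid and hum < 40 and math.hypot(x - ox, y - oy) <= d
        )
        if nearby_dry >= 4:
            hits.append(sid)
    return hits
```

An in-network scheme would instead push this test to leader nodes near the registered query region, so that only matching sensor groups are ever transmitted.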
Project Title: | Top-k Dominating Queries on Multi-dimensional Data |
Investigator(s): | Mamoulis N |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2008 |
Completion Date: | 12/2009 |
Abstract: |
(1) The formal definition of top-k dominating queries and their variants. The top-k dominating query discovers the most popular objects in a dataset based on the number of other objects they dominate in all dimensions. A bi-chromatic version of the query seeks objects in one dataset that dominate a large number of points in another. In addition, ranking views in a top-k dominating context will be defined and their usefulness will be investigated. Finally, sliding-window based on-line identification and maintenance of top-k dominating query results will be defined and investigated over streaming data. 
(2) Extensive study of the basic class of top-k dominating queries on stored data. We will investigate the effectiveness of multidimensional indexes, like the R-tree, in speeding up top-k dominating query evaluation over naïve approaches. We aim at developing specialized algorithms that take full advantage of the index structure to minimize I/O and computational cost during processing. For instance, by partially browsing the tree, we could derive upper and lower bounds for the number of dominated objects for each top-k candidate, which may help reduce the search effort by pruning parts of the data space. 
(3) Investigation of techniques for top-k dominating queries on raw (non-indexed) data. The rankings of the objects at different dimensions may not be known in advance (for indexing), or queries may refer to arbitrary subsets of dimensions and indexing for all of them may not be possible. Although an index can be computed on-the-fly (by bulk loading) and used to evaluate the query, we will investigate more efficient approaches based on multi-dimensional data partitioning (i.e., hashing). Based on the number of objects in each partition, we can define an ordering for processing the data in them, while using the counters of other partitions to define search bounds. 
(4) Computation and maintenance of top-k dominating views. We will investigate the computation and use of ranking views for top-k dominating queries. Intuitively, maintaining top-k dominating views is harder than maintaining simple top-k views. The main problem is that a change in some object’s ranking values may affect the dominating score of a large number of other objects. Therefore, we need special mechanisms for maintaining the dominating relationships between objects (e.g., using links), in order to efficiently update their scores upon changes in the database. 
(5) Maintaining top-k dominating results over data streams. We will study the efficient maintenance of top-k dominating results in a streaming data scenario, where objects leave or enter a sliding window. Our objective is to devise techniques for the efficient on-line update of memory-resident top-k dominating results and minimization of memory requirements. |
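For reference, the semantics of the basic top-k dominating query can be captured by a naive quadratic-time scan; the sketch below (all names illustrative, smaller values taken as better) is only the baseline against which index-based algorithms would be compared:

```python
def dominates(p, q):
    """p dominates q if p is at least as good in every dimension and
    strictly better in at least one (here, smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def top_k_dominating(points, k):
    """Score each point by the number of other points it dominates,
    then return the k highest-scoring (score, point) pairs."""
    scored = [(sum(dominates(p, q) for q in points if q is not p), p)
              for p in points]
    scored.sort(key=lambda t: -t[0])
    return scored[:k]
```

The index-based methods sketched in objective (2) would avoid this O(n^2) cost by bounding each candidate's score from partially browsed R-tree nodes.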
Project Title: | Privacy Preservation in the Publication of Data Sequences |
Investigator(s): | Mamoulis N |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2009 |
Abstract: |
1. The definition of appropriate models for the privacy and utility of sequence data is a big challenge of this project. Regarding the security of the data, we need to consider different types of attack by an adversary, different background knowledge of the attacker, and different objectives of the attack. The quality of the published information can be quantified by its utility. General models for utility based on inaccuracy will be studied, as well as models that rely on particular tasks (like pattern extraction). 
2. Based on the definitions of privacy and utility, we will develop methods for generalizing detailed sequence elements to more abstract groups, in order to ensure that the privacy constraint is satisfied. At the same time, the methods should produce transformed information that is as accurate as possible compared to the original data. 
3. We will evaluate the developed techniques with the use of real data. Evaluation will be done (i) with the use of general accuracy functions and (ii) by applying sequence analysis techniques to both the original and transformed sequences, in order to find out how close the analysis results are in the two cases. Finally, the efficiency and scalability of the proposed methods will be tested. 
4. The developed models and transformation techniques will be adapted to different types of sequence data, from sequences of locations to sequences of multivariate data that may also include categorical attributes. In addition, the applicability of the research results will be tested against different databases (from sequences of customer transactions to histories of medical records of patients). |
Project Title: | Outstanding Young Researcher Award 2008-2009 |
Investigator(s): | Mamoulis N |
Department: | Computer Science |
Source(s) of Funding: | Outstanding Young Researcher Award |
Start Date: | 12/2009 |
Abstract: |
The Awards are intended to recognize, reward, and promote exceptional research accomplishments of academic and research staff. |
Project Title: | Advanced Similarity Search in Uncertain Databases |
Investigator(s): | Mamoulis N |
Department: | Computer Science |
Source(s) of Funding: | Germany/Hong Kong Joint Research Scheme |
Start Date: | 01/2010 |
Abstract: |
Refer to hard copy |
Project Title: | Algorithms for Large-Scale Matching Problems |
Investigator(s): | Mamoulis N |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2010 |
Abstract: |
1) Identification of large-scale instances of the problems in realistic applications. This will be assisted by a thorough bibliographic study, a survey of news articles, and discussions among the project members. In addition, real data will be collected for these problems wherever possible. 
2) Development of indexing and search techniques for the assignment problems. As existing solutions for large matching problems have prohibitive space and time complexity, it is imperative to devise appropriate data structures and incremental search algorithms, which operate on a subset of the problem input and rely on conditions that guarantee their correctness. For example, this is possible if the weights of the graph edges are based on a distance function and can be derived dynamically, while browsing the nearest neighbors around each object. Our methods will be based on such derivations. 
3) Development of techniques for the maintenance of matching results in dynamic environments. In many real applications, the objective is not only to compute a matching between two sets, but also to maintain it as the input data change. For example, consider a matching between mobile devices and WiFi access points, and assume that after a while served users move, new service requests arrive, and some users are disconnected. The objective is to update the matching in order to reflect the changes in the data and still satisfy the constraints and objectives of the assignment. For example, the matching objective could be to maximize the number of served users, while minimizing the average distance between servers and users in the assignment. 
4) Development of approximation algorithms for deriving a matching that may not be optimal (based on its objectives), but has quality similar to the optimal solution. For example, when an optimal assignment between WiFi access points and clients is hard to compute, a solution that guarantees a large percentage of served clients (compared to the optimal solution), while having quality similar to the optimal matching, would be acceptable. |
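A minimal sketch of capacity-constrained assignment under Euclidean costs, in the spirit of the WiFi example above. This globally greedy heuristic (process all user-server pairs by increasing distance) is only an illustrative baseline, not the incremental or approximation algorithms the project proposes; all names are hypothetical:

```python
import math

def greedy_capacitated_assignment(users, servers, capacity):
    """Assign each user (a 2-D point) to a server (a 2-D point) so that no
    server exceeds `capacity`, preferring short user-server distances."""
    load = {s: 0 for s in servers}
    assignment = {}
    # consider every user-server pair, cheapest first
    pairs = sorted(((math.dist(u, s), u, s) for u in users for s in servers),
                   key=lambda t: t[0])
    for _, u, s in pairs:
        if u not in assignment and load[s] < capacity:
            assignment[u] = s
            load[s] += 1
    return assignment
```

Maintaining such an assignment under user movement, as objective 3) requires, would amount to re-running only the affected pairs rather than this full scan.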
List of Research Outputs |
Chatzimilioudis G., Mamoulis N. and Gunopulos D., A Distributed Technique for Dynamic Operator Placement in Wireless Sensor Networks, Proceedings of the 11th International conference on Mobile Data Management Systems (MDM), Kansas City, Missouri. 2010. |
Mamoulis N., Seidl T., Pedersen T.B., Torp K. and Assent I., Advances in Spatial and Temporal Databases, 11th International Symposium, SSTD 2009. Aalborg, Denmark, Proceedings, Springer, 2009. |
Mamoulis N., Seidl T., Pedersen T.B., Torp K. and Assent I., Lecture Notes in Computer Science, In: Nikos Mamoulis, Thomas Seidl, Torben Bach Pedersen, Kristian Torp, Ira Assent, Advances in Spatial and Temporal Databases, 11th International Symposium, SSTD 2009, Aalborg, Denmark, July 8-10, 2009, Proceedings. Heidelberg, Germany, Springer, 2009, 5644. |
Mamoulis N., Outstanding Young Researcher Award, 2008-2009, University of Hong Kong. 2010. |
Siu W.Y., Mamoulis N., Yiu S.M. and Chan H.L., A Data-Mining Approach for Multiple Structural Alignment of Proteins, Bioinformation. 2010, 4(8): 366 - 370. |
U L.H., Mamoulis N. and Mouratidis K., A Fair Assignment Algorithm for Multiple Preference Queries, Proceedings of the 35th Very Large Data Bases Conference (VLDB). Lyon, France, 2009, 1054-1065. |
U L.H., Mouratidis K. and Mamoulis N., Continuous Spatial Assignment of Moving Users, The VLDB Journal. Springer, 2010, 19(2): 141-160. |
U L.H., Mamoulis N., Berberich K. and Bedathur S., Durable Top-k Search in Document Archives, Proceedings of the ACM Conference on Management of Data (SIGMOD), pp. 483-494, Indianapolis, IN. 2010. |
U L.H., Mouratidis K., Yiu M.L. and Mamoulis N., Optimal Matching between Spatial Datasets under Capacity Constraints, ACM Transactions on Database Systems (TODS). 2010, 35(2): 44. |
Wong W.K., Cheung D.W.L., Hung E., Kao C.M. and Mamoulis N., An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of the 35th International Conference on Very Large Data Bases (VLDB 2009). 2009. |
Wong W.K., Mamoulis N. and Cheung D.W.L., Non-homogeneous Generalization in Privacy Preserving Data Publishing, Proceedings of the ACM Conference on Management of Data (SIGMOD), pp. 483-494, Indianapolis, IN. 2010. |
Researcher : Mei L |
List of Research Outputs |
Mei L., Chan W.K., Tse T.H. and Kuo F.C., An empirical study of the use of Frankl-Weyuker data flow testing criteria to test BPEL web services, Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, 81−88. |
Mei L., Chan W.K. and Tse T.H., Data flow testing of service choreography, Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC 2009/FSE-17), ACM Press, New York, NY (acceptance rate: 14.7%). 2009, 151−160. |
Mei L., Chan W.K., Tse T.H. and Merkel R.G., Tag-based techniques for black-box test case prioritization for service testing, Proceedings of the 9th International Conference on Quality Software (QSIC 2009), IEEE Computer Society Press, Los Alamitos, CA. 2009, 21−30. |
Zhang Z., Chan W.K., Tse T.H., Lu H. and Mei L., Resource prioritization of code optimization techniques for program synthesis of wireless sensor network applications, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2009, 82 (9): 1376−1387. |
Researcher : Mitcheson GR |
List of Research Outputs |
Rasotto M.B., Sadovy Y.J. and Mitcheson G.R., Male body size predicts sperm number in the mandarinfish, Journal of Zoology. 2010, 281: 161-167. |
Researcher : Peng Y |
List of Research Outputs |
Peng Y., Leung C.M., Yiu S.M., Chin F.Y.L. and Li R., Assembling Short Reads with Much Less Memory, The 11th International Meeting on Human Genome Variation and Complex Genome Analysis (HGV 2009). Tallinn, Estonia, 2009. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, BMC Bioinformatics. London, United Kingdom, BioMed Central, 2010, 11 (Suppl 2): S5. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, The Third International Workshop on Data and Text Mining in Bioinformatics (DTMBIO 09) in the 18th ACM Conference on Information and Knowledge Management. Hong Kong, 2009. |
Researcher : Pun KH |
Project Title: | Community Legal Information Project |
Investigator(s): | Pun KH |
Department: | Computer Science |
Source(s) of Funding: | Department of Justice, HKSAR Government - General Award |
Start Date: | 04/2004 |
Abstract: |
To study community legal information. |
List of Research Outputs |
Cheung A.S.Y. and Pun K.H., Comparative Study on the Liability for Trade Mark Infringement of Online Auction Providers , In: Hugh Brett, European Intellectual Property Review. United Kingdom, Sweet and Maxwell, 2009, 31:11: 559-567. |
Researcher : Qiu X |
List of Research Outputs |
Qiu X., Huang W., Wu C., Lau F.C.M. and Lin X., InstantLeap: an Architecture for Fast Neighbor Discovery in Large-scale P2P VoD Streaming, In: Thomas Plagemann, Springer Multimedia Systems Journal. Springer-Verlag, 2010, 16: 183-198. |
Researcher : Ren J |
List of Research Outputs |
Ren J., Lee S.D., Chen X., Kao C.M., Cheng C.K. and Cheung D.W.L., Naive Bayes Classification of Uncertain Data, The IEEE International Conference on Data Mining (IEEE ICDM 2009). 2009. |
Researcher : Schnieders D |
List of Research Outputs |
Schnieders D., Wong K.K.Y. and Dai Z., Polygonal Light Source Estimation, Asian Conference on Computer Vision. 2009, III: 96-107. |
Schnieders D., Fu X. and Wong K.K.Y., Reconstruction of Display and Eyes from a Single Image, IEEE International Conference on Computer Vision and Pattern Recognition. 2010, 1442-1449. |
Researcher : Shang L |
List of Research Outputs |
Bi B., Shang L. and Kao C.M., Collaborative Resource Discovery in Social Tagging Systems, Proceedings of The 18th ACM Conference on Information and Knowledge Management (ACM CIKM 2009). 2009. |
Researcher : Si S |
List of Research Outputs |
Si S., Tao D. and Chan K.P., Cross-Domain Web Image Annotation, Workshop of Internet Multimedia Mining, Int. Conf. on Data Mining. Miami, U.S.A., 2009, 184-189. |
Si S., Tao D. and Chan K.P., Discriminative Hessian Eigenmaps for Face Recognition, Int. Conf. on Acoustic, Speech, Signal Processing, 2010. Dallas, U.S.A., 5586-5589. |
Si S., Tao D. and Chan K.P., Evolutionary Cross-Domain Discriminative Hessian Eigenmaps, IEEE Trans. on Image Processing. IEEE, 2010, 19: 1075-1086. |
Si S., Tao D. and Chan K.P., Transfer Discriminative Logmaps, 2009 IEEE Pacific Rim Conference on Multimedia. Bangkok, Springer, 2009, 131-143. |
Researcher : Siu WY |
List of Research Outputs |
Siu W.Y., Mamoulis N., Yiu S.M. and Chan H.L., A Data-Mining Approach for Multiple Structural Alignment of Proteins, Bioinformation. 2010, 4(8): 366 - 370. |
Researcher : Sun F |
List of Research Outputs |
Liu Y., Wang W.P., Levy B., Sun F., Yan D., Lu L. and Yang C.L., On Centroidal Voronoi Tessellation -- Energy Smoothness And Fast Computation, ACM Transactions on Graphics. 2009, 28, No. 24. |
Yan D., Levy B., Liu Y., Sun F. and Wang W.P., Isotropic Remeshing with Fast and Exact Computation of Restricted Voronoi Diagram, Computer Graphics Forum (Symposium on Geometry Processing 2009). Berlin, Germany, 1445-1454, 2009, 28(5): 10pp. |
Researcher : Sun L |
List of Research Outputs |
Cheng C.K., Xie X., Yiu M.L., Chen J. and Sun L., UV-diagram: A Voronoi Diagram for Uncertain Data, The IEEE Intl. Conf. on Data Engineering (IEEE ICDE 2010). 2010. |
Researcher : Tam SL |
List of Research Outputs |
Hon W.K., Lam T.W., Shah R., Tam S.L. and Vitter J., Succinct Index for Dynamic Dictionary Matching, The 20th International Symposium on Algorithms and Computation. Springer, 2009, LNCS 5878: 1034-1043. |
Lam T.W., Li R., Tam S.L., Wong C.K., Wu M.K.E. and Yiu S.M., High Throughput Short Read Alignment via Bi-directional BWT, The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2009). 2009, 31 - 36. |
Tam S.L., Linear-size Indexes for Approximate Pattern Matching and Dictionary Matching. Hong Kong, 2010. |
Tam S.L., Wu M.K.E., Lam T.W. and Yiu S.M., Succinct Text Indexing with Wildcards, The 16th String Processing and Information Retrieval Symposium (SPIRE). Springer, 2009, 39-50. |
Researcher : Ting HF |
Project Title: | Design and analysis of algorithms for identifying recently-frequent items in a data stream |
Investigator(s): | Ting HF |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2008 |
Completion Date: | 12/2009 |
Abstract: |
(1) Design and analyze streaming algorithms for the Basic counting problem. (2) Design and analyze streaming algorithms for the Heavy-Hitters problem. (3) Design and analyze streaming algorithms for the Top-k problem. (4) Implement and empirically test all algorithms designed in this project. |
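A classical starting point for the Heavy-Hitters problem of objective (2) is the Misra-Gries summary, sketched below. It is a textbook baseline (not necessarily the algorithm developed in this project): with at most k-1 counters, every item whose true frequency exceeds n/k is guaranteed to survive in the summary.

```python
def misra_gries(stream, k):
    """One-pass frequency summary using at most k-1 counters.
    Any item occurring more than n/k times in a stream of length n
    is guaranteed to remain in the returned dictionary."""
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # no free counter: decrement all, dropping those that hit zero
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

The sliding-window and recently-frequent variants studied in the project must additionally expire old items, which this static summary does not handle.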
Project Title: | Design and analysis of online algorithms for graphs coloring with applications in frequency assignments |
Investigator(s): | Ting HF |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2009 |
Abstract: |
1. Design online algorithms for the Conflict-free coloring problem and analyze their competitive ratios. 2. Implement all our algorithms and test them empirically to see how well they perform when applied to frequency assignment for cellular networks. |
Project Title: | The 36th International Colloquium on Automata, Languages and Programming (ICALP 2009): Sleep with guilt and work faster to minimize flow plus energy |
Investigator(s): | Ting HF |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 07/2009 |
Completion Date: | 07/2009 |
Abstract: |
N/A |
Project Title: | Online Optimization Algorithms on Wireless Communication Networks |
Investigator(s): | Ting HF, Zhang Y |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 01/2010 |
Abstract: |
1. For the OVSF code assignment problem, close the gap between the lower bound 2 and the upper bound 6, and consider OVSF code assignment in cellular networks. 2. For Unit Disk Graph Coloring, design online algorithms with competitive ratio less than 5. 3. In Network Discovery, design online algorithms that establish properties of the graph with a minimal number of queries. 4. In Computing with Uncertainty, derive efficient strategies for computing with mobile users. |
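To illustrate the online Unit Disk Graph Coloring setting of objective 2 (colors correspond to frequencies; two points within unit distance conflict), the natural baseline is first-fit: each arriving point takes the smallest color unused by its earlier neighbors. The sketch below is only this baseline, not an algorithm with the targeted sub-5 competitive ratio, and the names are illustrative:

```python
import math

def first_fit_color(points, radius=1.0):
    """Online first-fit coloring: points arrive in list order, and each
    gets the smallest color not used by any earlier point within `radius`."""
    colors = []
    for i, p in enumerate(points):
        used = {colors[j] for j, q in enumerate(points[:i])
                if math.dist(p, q) <= radius}
        c = 0
        while c in used:
            c += 1
        colors.append(c)
    return colors
```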
List of Research Outputs |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Approximating frequent items in asynchronous data stream over a sliding window, The 7th Workshop on Approximation and Online Algorithms. Springer, 2009, 49 -- 61. |
Chan H.L., Lam T.W., Lee L.K. and Ting H.F., Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window., 27th International Symposium on Theoretical Aspects of Computer Science, STACS 2010. LIPIcs, 179-190. |
Chan J.W.T., Chin F.Y.L., Ting H.F. and Zhang Y., Online Problems for Frequency Assignment and OVSF Code Assignment in Wireless Communication Networks, SIGACT News. 2009, 40(3): 86-98. |
Chan S.H., Lee L.K., Lam T.W. and Ting H.F., Non-clairvoyant scheduling for weighted flow time and energy on speed bounded processors, The 16th Computing: the Australasian Theory Symposium (CATS). 2010, 3-10. |
Chan W.T., Chin F.Y.L., Ting H.F. and Zhang Y., Online Tree Node Assignment with Resource Augmentation, The 15th International Computing and Combinatorics Conference (COCOON'2009). Niagara Falls, New York, USA, 2009, 358-367. |
Chin F.Y.L., Ting H.F. and Zhang Y., 1-Space Bounded Algorithms for 2-Dimensional Bin Packing, The 20th Annual International Symposium on Algorithms and Computation (ISAAC 2009). Hawaii, USA, 2009. |
Chin F.Y.L., Ting H.F. and Zhang Y., A constant-competitive algorithm for online OVSF code assignment, Algorithmica. 2010, 56: 89-104. |
Chin F.Y.L., Ting H.F., Tsin Y.H. and Zhang Y., Online uniformly inserting points on grid, The 6th International Conference on Algorithmic Aspect in Information and Management (AAIM). 2010, 282-292. |
Hung R.Y.S. and Ting H.F., Design and analysis of online batching systems, Algorithmica. 2010, 57: 217-228. |
Hung R.Y.S., Lee L.K. and Ting H.F., Finding frequent items over sliding windows with constant update time, Information processing letters. 2010, 110: 257-260. |
Lam T.W., Lee L.K., Ting H.F., To K.K. and Wong W.H., Sleep with guilt and work faster to minimize flow plus energy, In: S. Albers, The 36th International Colloquium on Automata, Languages and Programming (ICALP). Springer, 2009, 665-676. |
Researcher : To KK |
List of Research Outputs |
Lam T.W., Lee L.K., Ting H.F., To K.K. and Wong W.H., Sleep with guilt and work faster to minimize flow plus energy, In: S. Albers, The 36th International Colloquium on Automata, Languages and Programming (ICALP). Springer, 2009, 665-676. |
Researcher : Tsang WW |
Project Title: | Evaluating the CDF of the Kolmogorov Statistic for unknown parameter case |
Investigator(s): | Tsang WW, Chong CF |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 09/2005 |
Abstract: |
A goodness-of-fit test checks whether a set of samples follows a hypothetical distribution. The most common test for checking discrete samples is the chi-square test. The counterpart for real samples is the Kolmogorov-Smirnov (KS) test [3,5]. Consider an ordered set of samples, x1 ≤ x2 ≤ … ≤ xn, and the hypothetical distribution, F(x), where all parameters of F(x) are known. The empirical distribution function (EDF) is defined as Fn(x) = 0 for x < x1; Fn(x) = k/n for xk ≤ x < xk+1, k = 1, 2, …, n-1; and Fn(x) = 1 for xn ≤ x. The Kolmogorov statistic, Dn, measures the maximum absolute distance between F(x) and Fn(x). The test result is conventionally determined by looking up tables of the critical values of Dn obtained from simulation [6]. In 2003, we developed a program that evaluates the cumulative distribution function (CDF) of Dn with 13-digit accuracy for 2 ≤ n ≤ 16000 [4]. The test result can thus be determined from the CDF of Dn. However, the KS test is still not quite readily applicable, because the parameters of F(x) are often unspecified in real-world applications. For example, in testing exponentiality, we usually have no idea of the mean of the exponential distribution. Therefore, we need to first estimate the mean from the samples. Then we check whether the samples follow the exponential distribution with this mean. The Dn so computed is noticeably smaller than that of the case where the mean is known beforehand. Durbin was the first scholar to advocate studying the difference between the two cases [1,2]. Stephens has worked out a couple of approximation formulas for the CDFs of two unknown-parameter cases [7]. However, these formulas deviate from the true values by as much as 2%. Errors of this extent are not commensurate with modern computing capability. The objective of this project is to develop accurate computer programs that evaluate the CDFs of Dn for the tests of normality and exponentiality, the two most common goodness-of-fit tests, in the unknown parameter case. 
Reference [1] J. Durbin, Distribution theory for tests based on the sample distribution function, Regional Conference Series in Applied Mathematics (9), SIAM, Philadelphia, 1973. [2] J. Durbin, Kolmogorov-Smirnov tests when parameters are estimated, with applications to tests of exponentiality and tests on spacings, Biometrika, 62(1): 5-22, 1975. [3] A. Kolmogorov, Sulla determinazione empirica di una legge di distribuzione, Giornale dell'Istituto Italiano degli Attuari, 4: 83-91, 1933. [4] G. Marsaglia, W. W. Tsang and J. Wang, Evaluating Kolmogorov's distribution, Journal of Statistical Software, 8(18): 1-4, 2003. [5] N. V. Smirnov, On the deviation of the empirical distribution function, Recueil Mathématique (Mat. Sbornik), 6: 3-26, 1939. [6] M. A. Stephens, EDF statistics for goodness of fit and some comparisons, Journal of the American Statistical Association, 69(347): 730-737, 1974. [7] M. A. Stephens, Asymptotic results for goodness-of-fit statistics with unknown parameters, Annals of Statistics, 4: 357-369, 1976. |
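While evaluating the CDF of Dn to 13 digits is the hard part, the statistic Dn itself follows directly from the order statistics: at each jump point xk of the EDF, the distance to F(x) is largest just before or just after the jump. A small illustrative Python sketch (not the project's code):

```python
def kolmogorov_D(samples, F):
    """D_n = sup_x |F_n(x) - F(x)| for a fully specified CDF F.
    The supremum is attained at a jump of the EDF, so it suffices to
    check k/n - F(x_k) and F(x_k) - (k-1)/n at each order statistic."""
    xs = sorted(samples)
    n = len(xs)
    return max(max(k / n - F(x), F(x) - (k - 1) / n)
               for k, x in enumerate(xs, 1))
```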
Project Title: | Evaluating the CDFs of the Anderson-Darling Statistics for normality and exponentiality testing |
Investigator(s): | Tsang WW |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 12/2006 |
Abstract: |
A goodness-of-fit test checks whether a set of samples follows a hypothetical distribution. The most common test for checking discrete samples is Pearson's chi-square test. The most common one for checking real samples is the Kolmogorov-Smirnov (KS) test [6,8]. The Anderson-Darling (AD) test is another test for checking real samples [1,2]. Consider an ordered set of samples, x1 ≤ x2 ≤ … ≤ xn, and the hypothetical distribution, F(x), whose parameters are completely specified. The empirical distribution function (EDF) is denoted as Fn(x). Fn(x) equals 0 when x < x1, equals k/n when xk ≤ x < xk+1 for k = 1,2,...,n-1, and equals 1 when xn ≤ x. The KS statistic, Dn, measures the maximum absolute distance between F(x) and Fn(x). The AD statistic, An, measures the weighted sum of the squares of the differences between F(x) and Fn(x). The AD test is a special case of the Cramer-von Mises approach. It is more powerful than the KS test [9]. Both Dn and An are distribution-free. The cumulative distribution function (CDF) of An is difficult to evaluate. T. W. Anderson and D. A. Darling gave an integration formula for the limiting CDF [1]. In 1954, they computed critical values at three significance levels of the limiting CDF using numerical integration [2]. P. A. W. Lewis tabulated the CDF of An for n = 1 to 8 and n = ∞ in 1961 [11]. He only evaluated the first two terms of a series expansion of the CDF. In 1974, Stephens gave tables of critical values for the limiting CDF [9]. In 1988, C. D. Sinclair and B. D. Spurr gave two approximation formulas for the limiting CDF [7]. In 2000, D. E. A. Giles derived a saddlepoint approximation for the limiting CDF [5]. His formula has accuracy similar to Sinclair's in the right tail of the distribution but superior in the left tail. The accuracy of the formulas and critical values mentioned above is at best 4 to 5 digits. In real-life applications, the parameters of a hypothetical distribution are often not known. 
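The weighted-squared-difference definition of An above admits a standard computational form, A_n = -n - (1/n) Σ_{k=1..n} (2k-1)[ln F(x_k) + ln(1 - F(x_{n+1-k}))] for sorted samples. A minimal illustrative sketch (not the programs proposed in this project; names are our own):

```python
import math

def ad_statistic(samples, cdf):
    """Compute the Anderson-Darling statistic via the standard
    computational form
        A_n = -n - (1/n) * sum_{k=1}^{n} (2k-1) *
              [ln F(x_k) + ln(1 - F(x_{n+1-k}))]
    for sorted samples x_1 <= ... <= x_n, where F is fully specified.
    """
    xs = sorted(samples)
    n = len(xs)
    s = 0.0
    for k in range(1, n + 1):
        s += (2 * k - 1) * (math.log(cdf(xs[k - 1]))
                            + math.log(1.0 - cdf(xs[n - k])))
    return -n - s / n
```

Note the logarithmic weights: deviations in the tails of F contribute more than in the KS statistic, which is the source of the AD test's extra power cited above.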
For example, in the testing of normality, usually we first estimate the mean and variance from the samples. Then we check whether the samples follow the normal distribution with such mean and variance. A test statistic, Dn or An, so computed is noticeably smaller than that of the case where the mean and variance are known beforehand. J. Durbin was the first scholar to advocate the study of the difference between the known (completely specified) and unknown parameter cases [3,4]. The statistics in the latter are not distribution-free, i.e., the distribution of Dn or An depends on F(x). Durbin has derived a formula for the limiting CDF of An for exponentiality testing. However, his approach is computationally unstable in the crucial right-tail region. M. A. Stephens has tabulated critical values of An for normality and exponentiality testing from simulation results [10]. These values have about 2-digit accuracy in general but deviate from the true ones by as much as 2% in some cases. The objective of this project is to develop computer programs that evaluate the CDFs of the AD statistics in the two most common applications: 1. To develop a computer program that evaluates the CDF of the AD statistic for normality testing when the mean and variance are estimated from the samples being tested. No such program is currently available. A few critical values of the CDF found in the literature deviate from the true values by as much as 2%. We aim at achieving an accuracy of 3 digits. 2. To develop a computer program that evaluates the CDF of the AD statistic for exponentiality testing when the mean is estimated from the samples being tested. No such program is currently available. A few critical values found in the literature deviate from the true values by as much as 2%. We aim at achieving an accuracy of 3 digits. References [1] T. W. Anderson and D. A. Darling, Asymptotic theory of certain ‘goodness-of-fit’ criteria based on stochastic processes, Ann. Math. 
Stat., 23: 193-212, 1952. [2] T. W. Anderson and D. A. Darling, A test of goodness of fit, J. Amer. Stat. Assn., 49: 765-769, 1954. [3] J. Durbin, Distribution theory for tests based on the sample distribution function, Regional Conference Series in Applied Mathematics (9), SIAM, Philadelphia, 1973. [4] J. Durbin, Kolmogorov-Smirnov tests when parameters are estimated, with applications to tests of exponentiality and tests on spacings, Biometrika, 62(1): 5-22, 1975. [5] D. E. A. Giles, A saddlepoint approximation to the distribution function of the Anderson-Darling test statistic, Communications in Statistics: Simulation and Computation, 30(4), 2001. [6] A. Kolmogorov, Sulla determinazione empirica di una legge di distribuzione, Giornale dell’Istituto Italiano degli Attuari, 4: 83-91, 1933. [7] C. D. Sinclair and B. D. Spurr, Approximations to the distribution function of the Anderson-Darling test statistic, Journal of the American Statistical Association, 83(404), December 1988. [8] N. V. Smirnov, On the deviation of the empirical distribution function, Mat. Sbornik, 6: 3-26, 1939. [9] M. A. Stephens, EDF statistics for goodness of fit and some comparisons, Journal of the American Statistical Association, 69(347): 730-737, 1974. [10] M. A. Stephens, Asymptotic results for goodness-of-fit statistics with unknown parameters, Annals of Statistics, 4: 357-369, 1976. [11] P. A. W. Lewis, Distribution of the Anderson-Darling statistic, Ann. Math. Stat., 32: 1118-1124, 1961. |
Researcher : Tse KSH |
List of Research Outputs |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K., Law Y.W., Tse K.S.H. and Tse W.H.K., Forensic Investigation of Peer-to-Peer Networks, In: Chang-Tsun Li (University of Warwick, UK), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solution. United Kingdom, IGI Global, 2009, 355-378. |
Kwan Y.K., Overill R.E., Chow K.P., Silomon J.A.M., Tse K.S.H., Law Y.W. and Lai K.Y., Evaluation of Evidence in Internet Auction Fraud Investigations, 6th Annual IFIP WG 11.9 International Conference on Digital Forensics, Advances in Digital Forensics VI, Ch.7. Hong Kong, Springer, 2010, 95-106. |
Tse W.H.K., Chow K.P., Law Y.W., Ieong S.C.R., Kwan Y.K., Tse K.S.H. and Lai K.Y., Analyzing Storage Media of Digital Camera, The 2009 International Workshop on Forensics for Future Generation Communication Environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : Tse TH |
Project Title: | Towards an integrated method for program verification, testing and debugging |
Investigator(s): | Tse TH |
Department: | Computer Science |
Source(s) of Funding: | Incentive Award for RGC GRF Fundable But Not Funded Projects |
Start Date: | 07/2003 |
Abstract: |
N/A |
Project Title: | 27th Annual international computer software and applications conference |
Investigator(s): | Tse TH |
Department: | Computer Science |
Source(s) of Funding: | Croucher Foundation - Conference / Seminars |
Start Date: | 09/2004 |
Abstract: |
27th Annual international computer software and applications conference |
Project Title: | TIRAMISU: testing context-sensitive, concurrent and middleware-based ubiquitous software |
Investigator(s): | Tse TH |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2006 |
Abstract: |
1 The objectives of the project are as follows: (a) To provide integration testing techniques for context-sensitive concurrent middleware-based ubiquitous systems, (b) To provide unit testing techniques for context-sensitive concurrent middleware-based ubiquitous systems, (c) To provide software engineers with effective testing procedures and tools for context-sensitive software, and (d) To improve the quality and reliability of ubiquitous software. 2 The proposed project has important theoretical and practical implications: (i) State explosion is an open problem in software testing and verification. It becomes significantly worse in context-sensitive and concurrent ubiquitous software. This project will address the problem using an enhanced form of extension particularly suitable for conformance testing. (ii) Results of context-sensitive ubiquitous applications may be hidden behind a hardware-software architecture so that they cannot easily be observed or recorded. This project will address the problem of testing context-sensitive software without precise oracles. (iii) Formal methods such as model checking and static analysis have been gaining acceptance among practicing software engineers in recent years. This project will help accelerate this trend by demonstrating the usefulness of the CSP model for software engineering and software testing. (iv) The project will provide practical procedures and tools for software engineers to enhance the efficacy of software quality assurance, thus improving the cost-effectiveness of software development. (v) The usefulness of the testing techniques will be evaluated by experimental studies based on simulated and empirical data. (vi) The results of the project are important to the quality assurance of context-sensitive ubiquitous applications in Hong Kong as a logistics centre for the Pearl River Delta. |
Project Title: | Automatic Fault Localization for Wireless Sensor Network Software Applications: a Statistical Fault Divergence Approach |
Investigator(s): | Tse TH |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2007 |
Abstract: |
1. The objectives of the project are as follows: (a) To develop a statistically robust and reliable fault localization technique to identify faults from the dynamic behaviors of programs. (b) To reduce the cost of program debugging by providing a more accurate set of candidate fault positions. (c) To improve the quality of the software through program debugging. (d) To provide software engineers with effective fault localization tools for conventional and pervasive applications. 2. This project has important theoretical and practical impacts: (i) The project will demonstrate the value and importance of robust statistical techniques for program debugging. (ii) It will demonstrate the usefulness of statistical program behaviors for fault localization. (iii) It will boost the importance of data flow information for debugging pervasive applications. (iv) The techniques will provide practical procedures for software engineers to enhance the efficacy of program maintenance, thus improving the cost-effectiveness of software development. (v) The usefulness of the techniques will be evaluated by experimental studies based on empirical data as well as case studies in conventional programs and real-life wireless sensor network and context-aware applications. |
Project Title: | CACTES: a formal framework for CompositionAl Conformance TEsting of Service compositions |
Investigator(s): | Tse TH |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 03/2008 |
Completion Date: | 03/2010 |
Abstract: |
In service computing, a service usually collaborates with other services. It may also comprise other services and self-offering actions. The testing of a service is complex and needs to address both the concurrent and sequential dimensions of service composition. As service compositions may be complicated, it is attractive to apply compositional testing, which means assuring the overall quality of an entire service composition based on the testing of individual (or groups of) services. Previous work focuses on addressing the concurrent dimension, since the sequentiality aspect is thought to be relatively easy. However, we have identified quite a number of non-trivial problems in the latter aspect that cannot be handled satisfactorily by existing techniques. We propose, therefore, to address the sequential dimension in this project. We shall conduct a formal analysis of conformance testing in the context of sequential service composition. We propose to use Communicating Sequential Processes (CSP), a classic process algebra, as the platform to handle the issue, and to model a service as a process. We have identified a number of counterexamples showing that there are subtle problems that hinder compositional testing in the sequential composition of services. We propose the notion of sequential extension as the basis for tackling such difficulties. A service which sequentially extends another service will enable the latter to be decomposed sequentially into component services. These component services can be abstracted individually and compositionally and, at the same time, the former service should conform to the resultant service composition thus constructed. We aim to produce a series of mathematical theorems showing that these notions support compositional conformance testing. To show the practicality of our formal framework, we shall then implement it in a testing tool and conduct experiments to verify the viability of our proposal. 
The objectives of the project are as follows: (a) To study the theoretical aspects of service composition and conformance testing. (b) To propose a formal framework of the conformance testing of service compositions. (c) To develop a toolset for the conformance testing of service compositions. (d) To conduct experimentation to verify the viability of the proposed formal framework and implementation. |
Project Title: | CACTES: a formal framework for CompositionAL Conformance TEsting of Service compositions |
Investigator(s): | Tse TH |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2008 |
Abstract: |
1. To study the theoretical aspects of service composition and conformance testing. 2. To propose a formal framework of the conformance testing of service compositions. 3. To develop a toolset for the conformance testing of service compositions. 4. To conduct experimentation to verify the viability of the proposed formal framework and implementation. |
Project Title: | TRIFL: TowaRds Improved methodologies for Fault Localization |
Investigator(s): | Tse TH |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 04/2009 |
Abstract: |
Program debugging is laborious, and fault localization is recognized as its most difficult component. Popular statistical fault localization techniques such as Tarantula, CBI and SOBER automatically rank the statements or predicates of a faulty program to produce a suspicious list of program entities. Despite their popularity, our initial empirical studies find that such techniques can be improved in several aspects: (a) Existing techniques are either based on edge profiles at the predicate level, or based on vertex profiles at the statement level. It has been found that statement-level fault localization is more effective, while edge profiles provide more useful information. We propose an improved methodology of fault localization based on edge profiles at the statement and block levels. (b) Existing techniques rely on the statistics of both failure-causing and passed test cases. We propose an improved methodology that uses only the information from failure-causing test cases but outperforms current techniques. (c) Existing techniques assume that the execution behaviours of the test cases follow known statistical distributions. We propose that non-parametric statistical methods may give improved results. The objectives of the project can be itemized as follows: 1. To develop statistically robust and reliable fault localization methodologies to identify faults from edge profiles at the statement and block levels. 2. To reduce the cost of fault localization by capturing only the behaviours of failure-causing test cases. 3. To reduce the cost of software development by providing a more accurate set of suspicious statements and predicates. 4. To improve the quality of the software through effective fault localization. 5. To provide software engineers with advanced fault localization tools. |
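As background on the kind of ranking the abstract refers to, the Tarantula technique scores each statement by how often failed versus passed test cases execute it. A minimal sketch (the data layout and function names are our own assumptions, not the project's tools):

```python
def tarantula_suspiciousness(passed_hits, failed_hits, total_passed, total_failed):
    """Tarantula suspiciousness of one statement.

    passed_hits / failed_hits: number of passed / failed test cases
    that executed the statement.
    """
    if failed_hits == 0:
        return 0.0
    fail_ratio = failed_hits / total_failed
    pass_ratio = passed_hits / total_passed if total_passed else 0.0
    return fail_ratio / (fail_ratio + pass_ratio)

def rank_statements(coverage, outcomes):
    """Rank statements by suspiciousness, most suspicious first.

    coverage: dict statement -> set of test ids that executed it.
    outcomes: dict test id -> True if the test passed, False if failed.
    """
    total_passed = sum(1 for ok in outcomes.values() if ok)
    total_failed = len(outcomes) - total_passed
    scores = {}
    for stmt, tests in coverage.items():
        p = sum(1 for t in tests if outcomes[t])
        f = len(tests) - p
        scores[stmt] = tarantula_suspiciousness(p, f, total_passed, total_failed)
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

Improvement (b) in the abstract would drop the `passed_hits` statistics entirely and rank using failure-causing runs alone.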
List of Research Outputs |
Chan W.K., Ho J.C.F. and Tse T.H., Finding failures from passed test cases: improving the pattern classification approach to the testing of mesh simplification programs, Software Testing, Verification and Reliability (ISI impact factor of journal: 1.632). 2010, 20 (2): 89−120. |
Chen H.Y. and Tse T.H., Automatic generation of normal forms for testing object-oriented software, Proceedings of the 9th International Conference on Quality Software (QSIC 2009), IEEE Computer Society Press, Los Alamitos, CA. 2009, 108−116. |
Chen T.Y., Kuo F.C., Merkel R.G. and Tse T.H., Adaptive random testing: the ART of test case diversity, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2010, 83 (1): 60−66. |
Chen T.Y., Tse T.H. and Zhou Z.Q., Semi-proving: an integrated method for program proving, testing, and debugging, IEEE Transactions on Software Engineering (ISI impact factor of journal: 3.750). 2010, doi:10.1109/TSE.2010.23. |
Jiang B., Zhang Z., Chan W.K. and Tse T.H., Adaptive random test case prioritization, Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering (ASE 2009), IEEE Computer Society Press, Los Alamitos, CA (acceptance rate: 17.1%). 2009, 233−244. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., Best Paper Award, 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), IEEE Computer Society, Washington, DC, USA. 2009. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., How well do test case prioritization techniques support statistical fault localization [Best Paper Award, COMPSAC 2009], Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, 99−106. |
Mei L., Chan W.K., Tse T.H. and Kuo F.C., An empirical study of the use of Frankl-Weyuker data flow testing criteria to test BPEL web services, Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, 81−88. |
Mei L., Chan W.K. and Tse T.H., Data flow testing of service choreography, Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC 2009/FSE-17), ACM Press, New York, NY (acceptance rate: 14.7%). 2009, 151−160. |
Mei L., Chan W.K., Tse T.H. and Merkel R.G., Tag-based techniques for black-box test case prioritization for service testing, Proceedings of the 9th International Conference on Quality Software (QSIC 2009), IEEE Computer Society Press, Los Alamitos, CA. 2009, 21−30. |
Poon P.L., Tang S.F., Tse T.H. and Chen T.Y., CHOC'LATE: a framework for specification-based testing, Communications of the ACM (ISI impact factor of journal: 2.346). 2010, 53 (4): 113−118. |
Tse T.H., A Unifying Framework for Structured Analysis and Design Models: an Approach Using Initial Algebra Semantics and Category Theory, Cambridge Tracts in Theoretical Computer Science, vol. 11. Paperback edition, Cambridge University Press, Cambridge, 2009, vi + 179 pages. |
Tse T.H., Cover Design, Proceedings of the 9th International Conference on Quality Software (QSIC 2009), IEEE Computer Society Press, Los Alamitos, CA. 2009. |
Tse T.H., Ko C.C., Leung J. and Liang R., I phone, therefore I can, Invited Paper, 2010 Human Service Information Technology Applications (HUSITA 9), in 2010 Joint World Conference on Social Work and Social Development: The Agenda (SWSD 2010). 2010. |
Tse T.H., Position statement: conventional wisdom works in conventional circumstances: towards new solutions to new paradigms that challenge software testing, Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, pp. 1iv−1v. |
Tse T.H., editorial board member, Journal for Universal Computer Science. Springer, Berlin, 2010. |
Tse T.H., editorial board member, Journal of Systems and Software. Elsevier, Amsterdam, 2010. |
Tse T.H., editorial board member, Software Testing, Verification and Reliability. Wiley, New York, NY, 2010. |
Tse T.H., editorial board member, Software: Practice and Experience. Elsevier, Amsterdam, 2010. |
Wong W.E., Tse T.H., Glass R.L., Basili V.R. and Chen T.Y., An assessment of systems and software engineering scholars and institutions (2002−2006), Journal of Systems and Software (ISI impact factor of journal: 1.340). 2009, 82 (8): 1370−1373. |
Wong W.E., Chan W.K., Tse T.H. and Kuo F.C., guest editor, Special Issue on Dynamic Analysis and Testing, Journal of Systems and Software. Elsevier, Amsterdam, 2010. |
Zhang Z., Chan W.K., Tse T.H., Jiang B. and Wang X., Capturing propagation of infected program states, Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC 2009/FSE-17), ACM Press, New York, NY (acceptance rate: 14.7%). 2009, 43−52. |
Zhang Z., Chan W.K., Tse T.H. and Hu P., Experimental study to compare the use of metamorphic testing and assertion checking, Journal of Software. 2009, 20 (10): 2637−2654. |
Zhang Z., Jiang B., Chan W.K., Tse T.H. and Wang X., Fault localization through evaluation sequences, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2010, 83 (2): 174−187. |
Zhang Z., Chan W.K., Tse T.H., Hu P. and Wang X., Is non-parametric hypothesis testing model robust for statistical fault localization?, Information and Software Technology (ISI impact factor of journal: 1.821). 2009, 51 (11): 1573−1585. |
Zhang Z., Chan W.K., Tse T.H., Lu H. and Mei L., Resource prioritization of code optimization techniques for program synthesis of wireless sensor network applications, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2009, 82 (9): 1376−1387. |
Researcher : Tse WHK |
List of Research Outputs |
Ieong S.C.R., Lai K.Y., Chow K.P., Kwan Y.K., Law Y.W., Tse K.S.H. and Tse W.H.K., Forensic Investigation of Peer-to-Peer Networks, In: Chang-Tsun Li (University of Warwick, UK), Handbook of Research on Computational Forensics, Digital Crime and Investigation: Methods and Solution. United Kingdom, IGI Global, 2009, 355-378. |
Tse W.H.K., Chow K.P., Law Y.W., Ieong S.C.R., Kwan Y.K., Tse K.S.H. and Lai K.Y., Analyzing Storage Media of Digital Camera, The 2009 International Workshop on Forensics for Future Generation Communication Environments (F2GC-09). Jeju Island, Korea, 2009. |
Researcher : U LH |
List of Research Outputs |
U L.H., Mamoulis N. and Mouratidis K., A Fair Assignment Algorithm for Multiple Preference Queries, Proceedings of the 35th Very Large Data Bases Conference (VLDB). Lyon, France, 2009, 1054-1065. |
U L.H., Mouratidis K. and Mamoulis N., Continuous Spatial Assignment of Moving Users, The VLDB Journal. Springer, 2010, 19(2): 141-160. |
U L.H., Mamoulis N., Berberich K. and Bedathur S., Durable Top-k Search in Document Archives, Proceedings of the ACM Conference on Management of Data (SIGMOD), pp. 483-494, Indianapolis, IN. 2010. |
U L.H., Matching Problems in Large Databases. Hong Kong, 2010. |
U L.H., Mouratidis K., Yiu M.L. and Mamoulis N., Optimal Matching between Spatial Datasets under Capacity Constraints, ACM Transactions on Database Systems (TODS). 2010, 35(2): 44. |
Researcher : Wang CL |
Project Title: | The HKU Grid Computing Research Center |
Investigator(s): | Wang CL, Lau FCM |
Department: | Computer Science |
Source(s) of Funding: | The University of Hong Kong Foundation Seed Grant |
Start Date: | 07/2003 |
Abstract: |
To share resources and knowledge under the theme of grid computing; to collaborate on a joint effort to develop the Hong Kong grid which will then contribute to the China National Grid and the Asia-Pacific Grid. |
Project Title: | Adaptation and virtualization techniques for grid |
Investigator(s): | Wang CL |
Department: | Computer Science |
Source(s) of Funding: | Matching Fund for Hi-Tech Research and Development Program of China (863 Projects) |
Start Date: | 08/2007 |
Abstract: |
To build the HKU grid node, one of the 12 grid nodes in the whole of China, on the HKU Computer Science Department's Gideon 300 cluster and the Computer Centre's power cluster. |
Project Title: | Ad-hoc Collaborative Trajectory Calibration Using GPS Mobile Phones |
Investigator(s): | Wang CL |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 11/2008 |
Abstract: |
Localization is a basic service in pervasive computing (a.k.a. ubiquitous computing). Nowadays most high-end mobile phones have built-in GPS functionality, while ordinary ones do not. In this research, we want to demonstrate that if the urban pedestrians who carry GPS-enabled mobile phones are willing to share their GPS location information with other mobile phone users whose devices lack GPS, then, with only a small percentage of GPS-enabled mobile phones, we are able to track and predict the locations of most mobile phone users, even if they do not have GPS phones. Our work attempts to answer the following questions: 1. With at least what percentage of GPS-enabled mobile nodes can we ensure a reasonably accurate estimation for all mobile nodes under a given mobility model and in a given physical area? 2. How can we formally attribute the observed improvement in urban pedestrians’ localization and trajectory estimation to the sharing of users’ GPS information? That is, how do we perform a formal probabilistic analysis to quantify the accuracy refinement under each mobility model? 3. Can we guarantee reasonable trajectory accuracy even if there are some selfish nodes that are willing to receive others’ GPS information but will not share their own GPS locations with others? 4. How can we better leverage node collision information and past location information to further improve the accuracy of our trajectory estimation? Distributed ad-hoc localization has been well studied for a long time and in many different fields, ranging from wireless sensor networks [3,7,15] to mobile robotics localization [16]. The problems to be tackled in our project inherit most challenges from these two research areas, yet there are some differences: 1. Our targeted environment is the urban metropolitan area, which is characterized by people with realistic mobility patterns that cannot be represented by any single existing entity or group mobility model [6]. 
In particular, we assume that obstacles such as streets or blocks will constrain people's movement. 2. The nodes in our problem are mobile devices carried by urban pedestrians. The locations of anchor nodes and normal nodes change dynamically. The usage scenarios in our case can result in some extreme conditions (e.g., intermittently disconnected networks, sparse or uneven node distribution), which make it impossible to apply naive distributed collaboration directly. |
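As an illustrative baseline only (this is not the project's method; the inverse-distance weighting and all names are our own assumptions), a non-GPS node could estimate its position from the GPS-enabled phones it encounters nearby:

```python
def estimate_position(anchors):
    """Estimate a non-GPS node's position from nearby GPS-enabled
    phones ("anchors") encountered over short-range radio.

    anchors: list of ((lat, lon), distance_m) pairs, where distance_m
    is an estimated range to the anchor (e.g. inferred from radio
    signal strength).  Returns an inverse-distance-weighted centroid:
    closer anchors count more.
    """
    num_lat = num_lon = den = 0.0
    for (lat, lon), dist in anchors:
        w = 1.0 / max(dist, 1.0)  # clamp to avoid division by ~0
        num_lat += w * lat
        num_lon += w * lon
        den += w
    return num_lat / den, num_lon / den
```

Questions 1 and 2 in the abstract then amount to asking how the error of such an estimate shrinks as the fraction of GPS-enabled anchors grows under a given mobility model.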
Project Title: | Adaptive Software Support for Fine-Grained Distributed Object Sharing on Multicore Clusters |
Investigator(s): | Wang CL |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 01/2010 |
Abstract: |
1) Easy-to-use programming paradigm for multicore clusters: To support a programming paradigm that strikes a balance between programmability and scalability, we propose a new synchronization mechanism called cock that unifies the beauty of locks and transactions at runtime to enhance concurrency without application code changes; 2) Scalable distributed transactional memory consistency model: We will design a new memory consistency model called home-based lazy transactional consistency (HLTC) to support the above programming paradigm. We aim for lazier update propagation and cluster-wide access conflict detection with minimal messaging and memory overheads; 3) Profile-guided adaptive protocol refinement: Improving on the baseline HLTC model, we aim for challenging dynamic runtime optimizations. We will build a lightweight profiling framework to sample sharing and conflicting patterns. The runtime can make use of them for adaptive decisions on switching between locks and transactions, changing coherence protocols, thread concurrency control and scheduling transaction commits. We will show novel ideas of synchronized code block labeling, feedback-reconfigurable conflict resolution and speculative prediction to resolve doomed conflicts; 4) Critical section streamlining: With the above profiler, we propose a new working-set model for objects within heavily contended critical sections. Based on it, we will develop new thread placement and object prefetching schemes, and techniques to keep most protocol communications unaffected by GC-incurred delay jitter. |
Project Title: | 24th IEEE International Parallel and Distributed Processing Symposium Adaptive Sampling-Based Profiling Techniques for Optimizing the Distributed JVM Runtime |
Investigator(s): | Wang CL |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 04/2010 |
Completion Date: | 04/2010 |
Abstract: |
N/A |
List of Research Outputs |
Lam K.T., Luo Y. and Wang C.L., Adaptive Sampling-Based Profiling Techniques for Optimizing the Distributed JVM Runtime, The 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS2010) . 2010. |
Lam K.T. and Wang C.L., Web Application Server Clustering with Distributed Java Virtual Machine (Chapter 28), In: Kuan-ching Li, Ching-hsien Hsu, Laurence Tianruo Yang, Jack Dongarra, Hans Zima, Handbook of Research on Scalable Computing Technologies. IGI Global, 2009, 658-681. |
Luo Y., Lam K.T. and Wang C.L., Path-Analytic Distributed Object Prefetching, The 10th International Symposium on Pervasive Systems, Algorithms and Networks (I-SPAN 2009). 2009. |
Wang C.L., (keynote) Towards Easy-to-use PGAS Parallel Programming – The Distributed JVM Approach, The Third International Joint Conference on Computational Sciences and Optimization (CSO 2010), Huangshan, Anhui, China. 2010. |
Researcher : Wang K |
List of Research Outputs |
Wang K., Designing Authentication Scheme for Wireless Sensor Networks. Hong Kong, 2009. |
Researcher : Wang WP |
Project Title: | 7th Asian Symposium on Visualization VISJET & VISFLOOD: Software for Environmental Hydraulic Modeling and Visualization |
Investigator(s): | Wang WP |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 11/2003 |
Abstract: |
N/A |
Project Title: | Continuous collision detection for composite quadric models in computer graphics |
Investigator(s): | Wang WP |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 07/2006 |
Abstract: |
Establish algebraic conditions to characterize contact configurations of commonly used quadrics, such as ellipsoids, cones and cylinders, in 3D. Based on the above results, develop efficient algorithms for continuous collision detection of composite quadric models in the case where the intersection curves are conics. |
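The first objective builds on the known algebraic separation condition for two ellipsoids (due to Wang, Wang and Kim): writing each ellipsoid as a 4x4 symmetric matrix Q with X^T Q X < 0 for interior points X, two ellipsoids A and B are separated exactly when the characteristic equation det(λA + B) = 0 has two distinct positive roots. A minimal numerical sketch of this static test follows (illustrative only; the project targets general quadrics and continuous collision detection, not just this check):

```python
import numpy as np

def quadric_sphere(cx, cy, cz, r):
    """4x4 symmetric matrix Q of the sphere (x-c)^2 + ... - r^2 = 0,
    normalized so that X^T Q X < 0 for points inside the sphere."""
    Q = np.eye(4)
    Q[3, 3] = cx * cx + cy * cy + cz * cz - r * r
    Q[0, 3] = Q[3, 0] = -cx
    Q[1, 3] = Q[3, 1] = -cy
    Q[2, 3] = Q[3, 2] = -cz
    return Q

def separated(A, B, eps=1e-9):
    """Two ellipsoids are separated iff f(l) = det(l*A + B) = 0 has
    two distinct positive roots (Wang-Wang-Kim condition)."""
    # f is a quartic in l: recover its coefficients by interpolation at 5 points
    ls = np.array([-2.0, -1.5, 0.5, 1.5, 2.5])
    vals = [np.linalg.det(l * A + B) for l in ls]
    coeffs = np.polyfit(ls, vals, 4)
    roots = np.roots(coeffs)
    pos = sorted(r.real for r in roots if abs(r.imag) < 1e-7 and r.real > eps)
    return len(pos) >= 2 and pos[-1] - pos[0] > 1e-7
```

For two unit spheres the quartic factors by hand, and the test agrees: centers at distance 3 give two distinct positive roots (separated), distance 1 gives a complex pair (overlapping).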
Project Title: | 34th International Conference and Exhibition on Computer Graphics and Interactive Techniques (SIGGRAPH 2007) Geometry of Multilayer Freeform Structures for Architecture |
Investigator(s): | Wang WP |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 08/2007 |
Abstract: |
N/A |
Project Title: | Research Output Prize (Faculty of Engineering) |
Investigator(s): | Wang WP, Liu Y |
Department: | Computer Science |
Source(s) of Funding: | Research Output Prize (in Faculty) |
Start Date: | 11/2007 |
Abstract: |
The Research Output Prize accords recognition to an author (or team of authors) of a single research output published or created in the preceding calendar year. Faculties are free to determine what form of research output best represents their research achievement and how it should be selected. |
Project Title: | New Internet-based Visualization Technology for Water Quality Management |
Investigator(s): | Wang WP, Lee JHW |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Applied Research |
Start Date: | 11/2007 |
Abstract: |
In this proposal, we shall study and develop new technology and prototypes for internet-based visualization for water quality management. Environmental (water) sustainability is vital to the social and economic development of the Hong Kong Special Administrative Region. As Hong Kong moves into the era of the knowledge-based economy, more effective tools for water quality management are needed to support decisions and public engagement by all stakeholders. An internet- and GIS-based water quality forecast and management system for all of Hong Kong’s waters will be proposed for funding to the Hong Kong Jockey Club. An innovative environmental knowledge base for Hong Kong is to be developed to provide (1) daily beach water quality forecasts and advisories; (2) on-line water quality and flow information for key coastal sites; (3) data on carrying capacity and flushing rate of fish culture zones; (4) daily prognosis for red tide outbreaks and water pollution disasters; (5) virtual reality water quality assessment for environmental impact assessment, monitoring and audit of infrastructure projects such as wastewater treatment and desalination plants. This large-scale environmental knowledge base provides advanced web-based water quality modeling and forecasting. The seed funding will enable us to develop the supporting IT and visualization system needed to make robust and reliable data flow and high-quality 3D interactive graphics possible over the Internet. [Key issues] As the proposed water quality forecast and management system is targeted to be publicly accessible over the Internet, our 3D visualization must cater for Internet clients with widely varying computing power. To deliver visualization of uniform quality to these different clients, it is most appropriate to have the graphics rendering handled by a reliable server with high-end 3D graphics capabilities. The rendering results are then sent to the clients for display.
However, the major challenge currently is the limited Internet bandwidth and connection speeds of the clients. Hence, new visualization technology needs to be developed that ensures real-time delivery while limiting the amount of data transmitted over the Internet. [Objectives] The focus of this project is to develop robust internet-based 3D visualization technology to enable interactive virtual reality incorporating both static (e.g., landscape) and dynamic (e.g., water simulation results) data. We will test the system within the framework of our overall system for water quality forecast and management in a networked environment to support public engagement. |
Project Title: | Computation of Planar Hexagonal Mesh Surfaces |
Investigator(s): | Wang WP |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 08/2008 |
Abstract: |
1. Study the relationship between triangle meshes and hexagonal meshes under Dupin duality, and study conditions that characterize those triangle meshes that correspond to valid P-Hex meshes, i.e., P-Hex meshes with planar hexagonal faces free of self-intersection. 2. Develop a method for designing a regular triangle mesh from vector fields defined over a smooth surface, with the properties of the triangle mesh guided by conditions obtained in objective (1). 3. Develop an optimization scheme for making all faces of a hexagonal mesh planar. Such a scheme will also be capable of optimizing the shape and size of the faces of a P-Hex mesh according to design requirements. 4. Study the computation of offset surfaces of P-Hex meshes. These offset surfaces are fundamental to designing a multi-layered supporting structure of a building surface modeled with a P-Hex mesh surface. |
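The planarization scheme of objective 3 can be illustrated by its simplest local building block: projecting one face's vertices onto their least-squares plane, obtained from the SVD of the centered vertex matrix. This is a sketch of the per-face step only (an illustration; the project's optimization scheme must also couple faces through their shared vertices and optimize face shape and size):

```python
import numpy as np

def planarity(verts):
    """Max distance of a face's vertices from their least-squares plane.

    verts: (n, 3) array of one face's vertices (n = 6 for a hex face)."""
    c = verts.mean(axis=0)
    # plane normal = right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(verts - c)
    return np.abs((verts - c) @ vt[-1]).max()

def planarize_face(verts):
    """Project the face's vertices onto their least-squares plane."""
    c = verts.mean(axis=0)
    _, _, vt = np.linalg.svd(verts - c)
    n = vt[-1]
    # remove each vertex's offset along the plane normal
    return verts - np.outer((verts - c) @ n, n)
```

Applied to a hexagon whose vertices alternate above and below a plane, one projection step drives the planarity measure to zero for that face.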
Project Title: | SIGGRAPH 2008 Computation of Rotation Minimizing Frames |
Investigator(s): | Wang WP |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 08/2008 |
Abstract: |
N/A |
Project Title: | Computation of Centroidal Voronoi Tessellation |
Investigator(s): | Wang WP |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 09/2009 |
Abstract: |
1) A thorough study will be conducted on the smoothness of the CVT energy function in various settings, including 2D and higher-dimensional Euclidean spaces and surface manifolds in 3D. This study is fundamental since sufficient smoothness of objective functions is needed to ensure fast and robust convergence of various derivative-based optimization methods; 2) Fast methods with super-linear convergence will be developed for CVT computation. For a CVT problem with a large number of data points in practice, it is crucial to experiment with various quasi-Newton methods (such as the BFGS method) and to study efficient implementation of the Newton method by exploiting the inherent sparsity of the Hessian matrix in the CVT problem; 3) Efficient surface meshing methods will be developed based on extended formulations of CVT on surfaces. In addition to smoothness analysis of the CVT energy and fast convergence, we will also study curvature-based metrics for computing anisotropic meshes, constrained CVT for feature preservation, and fast intersection algorithms for computing Voronoi diagrams on triangle mesh surfaces. |
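The baseline that objective 2 aims to beat can be illustrated on a discretized domain, where the classical Lloyd iteration for CVT reduces to k-means and each step provably does not increase the sampled CVT energy. A toy sketch (the project's point is that quasi-Newton methods such as BFGS converge super-linearly, whereas this iteration converges only linearly):

```python
import numpy as np

def cvt_energy(sites, samples):
    """CVT energy approximated on a dense point sampling of the domain:
    mean squared distance from each sample to its nearest site."""
    d2 = ((samples[:, None, :] - sites[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).mean()

def lloyd_step(sites, samples):
    """One Lloyd iteration: move each site to the centroid of the samples
    in its Voronoi region (empty regions leave the site unchanged)."""
    d2 = ((samples[:, None, :] - sites[None, :, :]) ** 2).sum(-1)
    owner = d2.argmin(axis=1)
    new = sites.copy()
    for k in range(len(sites)):
        pts = samples[owner == k]
        if len(pts):
            new[k] = pts.mean(axis=0)
    return new
```

Iterating `lloyd_step` from a random initialization monotonically decreases `cvt_energy`, which is exactly the slow first-order behavior the proposed BFGS/Newton schemes are meant to accelerate.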
Project Title: | Applications of Spectral Mesh Processing |
Investigator(s): | Wang WP |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 06/2010 |
Abstract: |
[Purpose] We shall investigate spectral mesh processing and its applications in computer graphics. We plan to devise and develop an algorithm that can generate quality quadrilateral meshes from a model based on spectral methods. We also plan to explore the feasibility and potential of spectral mesh processing based on anisotropic operators for a couple of applications, e.g., mesh frequency filtering and mesh denoising. [Background] Spectral methods involve the use of eigenvalues, eigenvectors and eigenspace projections derived from a properly defined linear operator. Spectral methods are widely used in computer science, e.g., in graph theory, data mining, machine learning, computer networks, computer vision and computer graphics. Recently, spectral methods have also become an active research topic in computer graphics, especially for mesh processing. Spectral processing, or more precisely spectral mesh processing, applies spectral methods based on appropriately designed mesh operators to achieve desired results. In the past few years there has been an increasing number of geometry processing applications that utilize the eigenstructures of a variety of mesh operators in different manners. Typical applications include mesh compression [KG00], smoothing [Tau95], segmentation [LZ07], registration [OSG08], surface remeshing [DBG06], etc. [Key Issues] We focus on the key issues of using spectral mesh processing in two major aspects: mesh generation and anisotropic spectral analysis. Mesh generation --- Mesh generation is the operation of generating a polygonal mesh to approximate a given geometric entity, and is one of the most fundamental functions used by nearly all digital geometry systems. Dong et al. [DBG06] designed the first algorithm to employ spectral analysis for quadrilateral mesh generation.
Their algorithm combined the eigenvectors of the Laplacian matrix with the Morse-Smale complex to generate a quadrilateral mesh for a closed input triangle mesh. Dong et al.’s method was extended by Huang et al. [HZM08] with additional orientation and alignment control over mesh generation. However, both methods are designed for input models of closed surface geometry, and they fail to generate high quality meshes for models with boundaries, which are commonly used. The construction of hexahedral meshes for volumetric data is another important geometric processing operation. So far there is no effective algorithm that can generate a high quality hexahedral mesh for an arbitrary volume. Inevitably, volumetric data in 3-dimensional space have boundaries, and the remeshing quality near the boundary affects not only the aesthetics of a mesh but also the accuracy of any finite element analysis that may further be applied to the output mesh. Hence, the ability to handle boundaries is very important for a mesh generation algorithm. Anisotropic spectral analysis --- Taubin [Tau95] was the first to introduce techniques from signal processing to geometry processing. He revealed the similarity between spectral geometry processing and Fourier analysis. It is well known that Fourier analysis is a powerful tool for signal processing. With Fourier analysis, image processing researchers can easily design different kinds of filters, e.g., a low-pass filter, a band-boost filter or a high-boost filter, for different requirements. Similarly, for a geometry mesh, we may also find the corresponding filters by introducing spectral mesh analysis. Vallet et al. [VL08] designed a framework that can efficiently perform spectral analysis on an input triangle mesh. However, the framework does not address how to filter out high-frequency details while preserving sharp features in the mesh.
In other words, after filtering the mesh by a low-pass or band-boost operation, sharp features may be smoothed out, which is not desired in many applications. The difficulty mainly comes from the fact that the Laplace operator is an isotropic operator and, as a result, cannot distinguish sharp features from fine details. A method that can perform spectral analysis of a mesh while preserving sharp features is therefore useful for many applications. [Objectives] We shall devise a framework, based on spectral analysis, for generating a quality mesh from an input model that may have boundaries. The framework will deal with volumetric mesh generation as well. We shall devise a set of boundary rules to guide the generation of the mesh near the boundaries. We shall also study the construction of the Morse-Smale complex, another key step of the framework, for 2-manifolds as well as 3-manifolds. The applications of spectral analysis are not limited to mesh generation. We also plan to investigate the feasibility of extending the Laplace operator to anisotropic spectral analysis. New anisotropic Laplace operators will be introduced, which are expected to handle sharp features in spectral analysis more effectively. A system will also be built to verify the newly designed operators. |
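The low-pass filtering role of the Laplacian spectrum described above can be sketched in a few lines: treat each coordinate of the mesh vertices as a signal and project it onto the k smoothest eigenvectors of a graph Laplacian. This sketch uses the plain combinatorial (umbrella) Laplacian for simplicity; the anisotropic operators proposed in this project would change only how the operator is weighted, not this filtering pipeline:

```python
import numpy as np

def graph_laplacian(n_verts, edges):
    """Combinatorial Laplacian L = D - A of a mesh's vertex graph."""
    L = np.zeros((n_verts, n_verts))
    for i, j in edges:
        L[i, j] -= 1.0
        L[j, i] -= 1.0
        L[i, i] += 1.0
        L[j, j] += 1.0
    return L

def low_pass(coords, L, k):
    """Keep only the k lowest-frequency spectral components of each
    coordinate function (columns of coords)."""
    w, v = np.linalg.eigh(L)      # eigenvalues ascending = frequencies
    basis = v[:, :k]              # the k smoothest eigenvectors
    return basis @ (basis.T @ coords)
```

On a 12-vertex cycle, circle coordinates are low-frequency eigenvectors while alternating per-vertex noise is the highest-frequency one, so a small k removes the noise exactly; this is also why an isotropic filter smooths away genuine sharp features along with noise.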
List of Research Outputs |
Chang J.W., Wang W.P. and Kim M.S., Efficient collision detection using a dual OBB-sphere bounding volume hierarchy, Computer-Aided Design. Elsevier, 2010, 42(1): 50-57. |
Choi Y.K., Li X.Q., Rong F.G., Wang W.P. and Cameron S., Determining The Directional Contact Range Of Two Convex Polyhedra, Computer-Aided Design. Elsevier, 2010, 42(1): 27-35. |
Lee J.H.W., Tang H.W., Wang W.P., Li Y. and Cheung V., State Science and Technology Awards, Second Prize, 2010, Ministry of Science and Technology, China. 2010. |
Liu Y., Wang W.P., Levy B., Sun F., Yan D., Lu L. and Yang C.L., On Centroidal Voronoi Tessellation -- Energy Smoothness And Fast Computation, ACM Transactions on Graphics. 2009, 28, No. 24. |
Ma Y.P., Tu C.H. and Wang W.P., Computing the Distance between Canal Surfaces, In: Mourrain, Bernard; Schaefer, Scott; Xu, Guoliang, Advances in Geometric Modeling and Processing. Springer, 2010, 6130, Lecture Notes in Computer Science: 88-103. |
Rong G.D., Liu Y., Wang W.P., Yin X.T., Gu X.F. and Guo X.H., GPU-assisted computation of centroidal Voronoi tessellation, IEEE Transactions on Visualization & Computer Graphics. USA, IEEE CS Press, 2010, Vol. 16. |
Wang W.P., Invited conference talk at the European Project SAGA Workshop (Shape, Geometry and Algebra), Nov. 17-21, 2009, Spain. 2009. |
Wang W.P., Universitas 21 Fellowship 2010-2011, University of Hong Kong. 2010. |
Yan D., Wang W.P., Levy B. and Liu Y., Efficient Computation of 3D Clipped Voronoi Diagram, In: Mourrain, Bernard; Schaefer, Scott; Xu, Guoliang, Advances in Geometric Modeling and Processing. Springer, 2010, 6130, Lecture Notes in Computer Science: 269-282. |
Yan D., Wintz J., Mourrain B., Wang W.P., Boudon F. and Godin C., Efficient and robust reconstruction of botanical branching structure from laser scanned points, 11th IEEE International conference on Computer-Aided Design and Computer Graphics. Beijing, China, IEEE, 2009, 572-575. |
Yan D., Levy B., Liu Y., Sun F. and Wang W.P., Isotropic Remeshing with Fast and Exact Computation of Restricted Voronoi Diagram, Computer Graphics Forum (Symposium on Geometry Processing 2009). Berlin, Germany, 2009, 28(5): 1445-1454. |
Researcher : Wang Y |
List of Research Outputs |
Wang Y., A Study on Structured Covariance Modeling Approaches to Designing Compact Recognizers of Online Handwritten Chinese Characters. Hong Kong, 2009. |
Researcher : Wong CK |
List of Research Outputs |
Lam T.W., Li R., Tam S.L., Wong C.K., Wu M.K.E. and Yiu S.M., High Throughput Short Read Alignment via Bi-directional BWT, The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2009). 2009, 31 - 36. |
Researcher : Wong HY |
List of Research Outputs |
Jiang L., Yiu S.M., Dong Y., Hui C.K. and Wong H.Y., Secure Chained Threshold Proxy Signature without and with Supervision, Journal of Software Engineering and Applications. 2009, 2(4): 267 - 275. |
Researcher : Wong KF |
List of Research Outputs |
Wong K.F., Cheung W.Y., Lam T.W. and Yiu S.M., Local Structural Alignment of Regular RNA with Affine Gap Model, The 6th International Symposium on Bioinformatics Research and Applications (ISBRA’10). 2010. |
Wong K.F., Lam T.W., Sung W.K. and Yiu S.M., Structural Alignment of RNA with Complex Pseudoknot Structure, The 9th International Workshop on Algorithms in Bioinformatics (WABI). 2009. |
Wong K.F. and Yiu S.M., Structure Prediction of Simple Non-Standard Pseudoknot, The First International Conference on Bioinformatics (Bioinformatics 2010). 2010. |
Researcher : Wong KKY |
Project Title: | The Ninth Asian Conference on Computer Vision (ACCV 2009) Polygonal Light Source Estimation |
Investigator(s): | Wong KKY |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 09/2009 |
Completion Date: | 09/2009 |
Abstract: |
N/A |
Project Title: | Polygonal Light Source Estimation |
Investigator(s): | Wong KKY, Schnieders D |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 01/2010 |
Abstract: |
The calibration of light sources plays an important role in the fields of computer graphics and computer vision. For instance, in image-based rendering and augmented reality, light information of a scene is exploited to render computer generated models into the video/image sequences of a scene seamlessly. As another example, the classic shape-from-shading technique recovers the 3D shape of an object by relating its intensity values observed in an image to its surface normal vectors and the light source direction. Traditional techniques for light source estimation often make the common assumption of a distant point light source. Such an assumption conveniently reduces the complexity of modeling the image formation process, and allows simple light estimation by recovering only the direction and strength of a light source. In an indoor environment such as an office or a laboratory, however, distant point light sources are very rare. The dominant light sources in such an environment are area light sources (e.g., rectangular light sources). Limited by the physical dimensions of an indoor environment, these light sources are usually located at finite distances from the observer. As a result, the distant point light source assumption no longer applies, and this calls for a completely different approach to light estimation for indoor environments. Due to the relatively high complexity involved in estimating area light sources, however, the literature in this area is rather sparse. In this project, we aim to develop a novel method for area light source estimation in an indoor environment. To simplify the problem, only polygonal light sources will be considered. A polygonal light source is treated as a polygon in 3D space composed of a set of straight lines. In this project, we propose to recover a polygonal light source by observing its reflection on a specular sphere and reconstructing its edges in 3D space. Below are the four key problems being addressed.
1) Estimating the size and position of a sphere from its image in a single view. 2) Estimating the length and position of a line in 3D space from its reflection on a sphere with known size and position. 3) Estimating the shape and position of a polygonal light source from its reflections on multiple spheres in a single view. 4) Estimating the shape and position of a polygonal light source from its reflections on a single sphere observed at multiple viewpoints. Solutions to the above problems will allow us to recover both the shape and position of a polygonal light source by either (i) observing its reflections on multiple uncalibrated spheres* at a single viewpoint; or (ii) observing its reflections on a single uncalibrated sphere at multiple unknown viewpoints. (* uncalibrated spheres refer to those with unknown sizes and unknown positions) |
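As a simplified illustration of key problem 1: under an orthographic camera (an assumption made here only for brevity; the project treats the calibrated perspective case, where the sphere's silhouette is a general conic) the silhouette is a circle, and its image center and radius can be recovered from silhouette points by an algebraic (Kasa) least-squares circle fit:

```python
import numpy as np

def fit_circle(points):
    """Algebraic (Kasa) least-squares circle fit.

    Solves x^2 + y^2 + D*x + E*y + F = 0 for D, E, F in the least-squares
    sense, then converts to center (a, b) and radius r."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x * x + y * y)
    D, E, F = np.linalg.lstsq(A, b, rcond=None)[0]
    a, bc = -D / 2.0, -E / 2.0
    r = np.sqrt(a * a + bc * bc - F)
    return (a, bc), r
```

For exact silhouette samples the fit recovers the circle exactly; in the perspective setting, the analogous step fits a conic and back-projects it to the sphere in 3D.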
List of Research Outputs |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., A 3D face reconstruction approach in enhancing the robustness and accuracy of face recognition, Shanghai and Hong Kong Symposium on Science & Technology. 2009. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Automatic 3D Face Texture Mapping Framework from Single Image, 1st International Conference on Internet Multimedia Computing and Service. Kunming, China, 2009, 123-128. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Super-Resolution of Faces Using Texture Mapping on a Generic 3D Model, International Conference on Image and Graphics. Xi’an, China, 2009, 361-365. |
Liang C. and Wong K.K.Y., 3D Reconstruction Using Silhouettes from Unordered Viewpoints, Image and Vision Computing. 2010, 28(4): 579-589. |
Liu M. and Wong K.K.Y., Shape from Shadings under Perspective Projection and Turntable Motion, International Conference on Computer Vision Theory and Applications. 2010, 28-35. |
Schnieders D., Wong K.K.Y. and Dai Z., Polygonal Light Source Estimation, Asian Conference on Computer Vision. 2009, III: 96-107. |
Schnieders D., Fu X. and Wong K.K.Y., Reconstruction of Display and Eyes from a Single Image, IEEE International Conference on Computer Vision and Pattern Recognition. 2010, 1442-1449. |
Shao L., Zhang H., Wong K.K.Y. and Luo J., Guest editor of the special issue "Fast and Robust Methods for Multiple-View Vision", EURASIP Journal on Image and Video Processing. 2010. |
Researcher : Wong KY |
List of Research Outputs |
Wong K.Y. and Yip C.L., Identifying centers of circulating and spiraling vector field patterns and its applications, Pattern Recognition. 2009, 42: 1371-1387. |
Researcher : Wong WH |
List of Research Outputs |
Lam T.W., Lee L.K., Ting H.F., To K.K. and Wong W.H., Sleep with guilt and work faster to minimize flow plus energy, In: S. Albers, The 36th International Colloquium on Automata, Languages and Programming (ICALP). Springer, 2009, 665-676. |
Researcher : Wong WK |
List of Research Outputs |
Wong W.K., Cheung D.W.L., Hung E., Kao C.M. and Mamoulis N., An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of the 35th International Conference on Very Large Data Bases (VLDB 2009). 2009. |
Wong W.K., Cheung D.W.L., Hung E. and Kao C.M., An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of the 35th Very Large Data Bases Conference (VLDB). Lyon, France, 2009, 1162-1172. |
Wong W.K., Mamoulis N. and Cheung D.W.L., Non-homogeneous Generalization in Privacy Preserving Data Publishing, Proceedings of the ACM Conference on Management of Data (SIGMOD), pp. 483-494, Indianapolis, IN. 2010. |
Researcher : Wong YY |
List of Research Outputs |
Wong Y.Y., Transparent Process Migration for Distributed Java Computing. Hong Kong, 2009. |
Researcher : Wu C |
Project Title: | Fast Lookup and Dynamic Replication Algorithms for Large-Scale Peer-to-Peer On-Demand Streaming Applications |
Investigator(s): | Wu C |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 10/2008 |
Completion Date: | 10/2010 |
Abstract: |
Imagine that a movie fan can access thousands of movies at any time, enjoying clear pictures, smooth playback and full VCR features; and that all this requires only a regular computer and a residential Internet connection, not a set-top box with extra monthly cable subscription fees. Internet video-on-demand (VoD) applications, which aim to achieve the above picture, have emerged since the late 1990s. Earlier VoD systems relied on the server-client model, where high server bandwidth costs have significantly limited their scalability to support millions of concurrent users with high-quality video streams. Only recently has the emergence of the peer-to-peer (P2P) communication paradigm shed light on a fundamentally new design of Internet VoD applications. Featuring resource contributions from regular users to reduce the server load, the P2P paradigm has achieved great success in the design of Internet file sharing applications (e.g., Gnutella [1], BitTorrent [2]) and live media streaming (IPTV) applications (e.g., PPLive [3], UUSee [4], PPStream [5]). However, progress in P2P VoD design has been slow. The fundamental difficulty of P2P VoD design lies in the vast diversity of the content watched and stored at the users: unlike live streaming, where users in each IPTV channel are viewing at a similar pace, VoD users may well be watching videos from a much larger collection and playing different parts of a video. The lack of content overlap has led to a fundamentally lower level of mutual help among peers, while the requirement for stable, good streaming quality remains. To utilize peer resources, recent P2P VoD solutions feature BitTorrent-like block exchanges, where peers watching the same video are organized into a loosely coupled mesh overlay and retrieve (pull) blocks to be played in the near future from each other [6,7].
Compared to earlier tree-based solutions [8,9], the mesh-pull based design achieves better peer bandwidth utilization, and has thus led to the recent development and deployment of a few commercial P2P VoD applications, such as PPLive [3] and UUSee [4]. Though practical deployment of mesh-pull based P2P VoD has appeared, early experience with the deployment [10,4] has pointed to two critical technical challenges, whose high-quality solutions would play a key role in the success of large-scale deployment of P2P VoD applications. 1. Fast lookup of video blocks. As VoD users may jump to watch different parts of a video at will, it is critically important for a VoD protocol to quickly locate the peers storing the desired blocks, in order to minimize the re-buffering delay and provide users a smooth viewing experience. Existing solutions include: (1) Tracking-server-assisted lookup, where a dedicated server keeps track of the block availability at the peers and offers this information to requesting peers. Current practical systems have largely relied on such tracking servers, but this solution becomes a significant scalability bottleneck as the systems expand. (2) DHT-based lookup, in which a distributed hash table (DHT) is constructed to store content location information on different peers, and each lookup is achieved in O(log(N)) routing steps in a network of N peers [7,11,12]. The introduction of a DHT adds complexity to the VoD protocol, as it maintains a separate P2P overlay for content lookup in addition to the streaming overlay. (3) A few recent proposals design specific P2P overlays based on different graph structures to facilitate the lookup, including a dynamic skip list [13] and a ring-assisted overlay [14], which achieve O(log(N)) or O(log(T/w)) lookup complexity (where T and w are the video size and peer buffer size), respectively. 2. Optimal management of peer caches.
To maximally utilize peer resources to reduce server load, a critical step is to make optimal block replication and replacement decisions in the peer caches, in order to meet the dynamic demand for different blocks in the system over time. Existing practical P2P VoD applications typically allocate a hard disk space of 1-2 GB on each peer, which simply caches recently played blocks and replaces blocks based on a FIFO, LRU or LFU algorithm when the storage is full. Such heuristic schemes make sub-optimal use of peer storage, leading to a call for more efficient cache management algorithms. A few recent proposals have advocated block replication strategies based on the popularity of different video blocks: Yiu et al. [12] estimate the block popularity in the current network using a discrete-time Markov model; Cheng et al. [15] predict the future block popularity using a simple predictor based on the current popularity of each block. Though they consider block popularity, all existing solutions have ignored another critical element, the sufficiency of bandwidth at the peers for uploading the replicated blocks, and none of them guarantees that the replicating peers have available bandwidth to serve the demand for their blocks. In this project, we aim to design fundamentally better solutions to address the above challenges, based on the following beliefs: first, we believe the O(log(N)) or O(log(T/w)) lookup performance is still far from optimal, while an optimal lookup algorithm with O(1) complexity is desirable and achievable; second, we believe any good cache management algorithm should carefully consider both block popularity and available block upload bandwidth in its dynamic block replication strategy.
To fulfil these goals, this research project is structured as two sub-projects: In sub-project 1, we seek to design an efficient indexing overlay structure, which achieves the optimal block lookup performance of O(1), i.e., only a small constant number of forwarding steps are needed to locate any block in the overlay. In addition, the resulting indexing overlay for each video provides a unified framework for the streaming functions and lookup operations, without introducing any additional lookup service layer. In sub-project 2, we investigate optimal block replication and replacement strategies, based on prediction of supply insufficiency for each block. Such supply insufficiency of a video block takes into consideration the following two factors: the insufficient replications of the block and the insufficient upload bandwidth at peers which cache the block. The prediction-based strategies are expected to maximize peer storage and bandwidth utilization to meet the demand for each block at any given time. |
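The O(1) lookup goal of sub-project 1 can be contrasted with O(log(N)) DHT routing using a toy one-hop index (an illustration only, not the indexing overlay the project proposes): hashing each block id to a single responsible peer makes every lookup a constant number of forwarding steps, at the cost of every peer knowing the full peer list.

```python
import hashlib

class IndexOverlay:
    """Toy one-hop block index: responsibility for each block id is
    assigned by hashing, so any peer can locate a block's holders in a
    single forwarding step. Illustrative sketch, not the project's design."""

    def __init__(self, peer_ids):
        self.peers = sorted(peer_ids)
        self.index = {p: {} for p in self.peers}   # per-peer index shard

    def _responsible(self, block_id):
        """Hash the block id onto the peer that indexes it."""
        h = int(hashlib.sha1(str(block_id).encode()).hexdigest(), 16)
        return self.peers[h % len(self.peers)]

    def publish(self, peer, block_id):
        """A peer announces that it caches block_id: one message to the
        responsible peer's index shard."""
        shard = self.index[self._responsible(block_id)]
        shard.setdefault(block_id, set()).add(peer)

    def lookup(self, block_id):
        """O(1) forwarding steps: ask only the responsible peer."""
        return self.index[self._responsible(block_id)].get(block_id, set())
```

A real design must also handle peer churn and keep the index consistent with cache replacement, which is where the project's unified streaming-plus-lookup overlay departs from this toy.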
Project Title: | The 28th IEEE International Conference on Computer Communications (IEEE INFOCOM) Diagnosing Network-wide P2P Live Streaming Inefficiencies Distilling Superior Peers in Large-Scale P2P Streaming Systems |
Investigator(s): | Wu C |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 04/2009 |
Abstract: |
N/A |
Project Title: | Modeling the Performance and Locality Tradeoff in P2P File Sharing System Design |
Investigator(s): | Wu C |
Department: | Computer Science |
Source(s) of Funding: | Small Project Funding |
Start Date: | 12/2009 |
Abstract: |
In recent years, there has been a surge in the number of peer-to-peer (P2P) applications over the Internet, enabling large-scale content distribution at low server cost (e.g., BitTorrent, PPLive, Skype). The popularity of P2P applications has brought about huge amounts of P2P traffic, which significantly changes the traffic pattern in the Internet and dramatically increases the traffic-relay cost at the Internet Service Providers (ISPs). Such a cost threat has led ISPs to packet filtering and rate throttling of P2P traffic [1], while on the other hand P2P application providers react with encryption and dynamic ports to avoid being identified [2]. It has recently been argued that such a conflict cannot lead to desirable outcomes for either party, with ISPs losing P2P-subscribing customers and P2P applications sacrificing performance. Instead, traffic localization designs have been proposed that connect peers to nearby (local) neighbors in terms of delay, routing hop count, etc., through approaches at either the P2P application side [3,5,6] or the ISP side [7,8], or based on collaboration between both parties [4,9]. While such locality in peer selection is effective in reducing the P2P traffic across network boundaries, it may not always be aligned with the performance optimization, e.g., download speed maximization, targeted by P2P file sharing systems: choosing neighbors based only on the criterion of network proximity, which minimizes network cost, may result in significant download performance degradation at the peers, i.e., there usually exists a non-negligible tradeoff between file downloading performance and peer connectivity localization over the lifetime of the peers. For example, consider two ISPs A and B.
When the network bandwidth between ISP A and ISP B is large, and that inside ISP B is small, a peer in ISP B that only selects neighbors within the same ISP will achieve a file downloading bandwidth much smaller than if it downloaded from neighbors in ISP A. Given such a realistic situation, a practical solution for the benefit of both the P2P application provider and the ISPs is to achieve a desired tradeoff between performance and locality in the P2P system that is acceptable to, and possibly decided by, both parties. Intriguing questions thus arise: How can one formally characterize such a tradeoff between performance and locality? How can one design effective and dynamic peer selection strategies that achieve any desired tradeoff in a practical P2P file sharing system? In this project, we seek to answer these questions by formally characterizing the performance and locality tradeoff in peer selection for BitTorrent-like P2P file sharing systems. In particular, we have the following objectives: (1) we first seek to formally model peer selection in P2P file sharing systems using a b-matching optimization model; (2) we then seek to extend the general model to reflect the performance and locality tradeoff; (3) finally, we seek to design effective and fully distributed peer selection algorithms that achieve the optimum point of the performance and locality tradeoff in the entire system. [1] Comcast is Using Sandvine to Manage P2P Connections, http://www.dslreports.com/forum/r18323368-Comcast-is-using-Sandvine-to-manage-P2P-Connections. [2] BitTorrent Developers Introduce Comcast Busting Encryption, http://torrentfreak.com/bittorrent-devs-introduce-comcast-busting-encryption-080215/. [3] T. Karagiannis, P. Rodriguez, and K. Papagiannaki, “Should Internet Service Providers Fear Peer-Assisted Content Distribution,” in Proc. of the Internet Measurement Conference (IMC’2005), October 2005. [4] H. Xie, Y. R. Yang, A. Krishnamurthy, Y. G. Liu, and A. 
Silberschatz, “P4P: Provider Portal for Applications,” SIGCOMM Comput. Commun. Rev., vol. 38, no. 4, pp. 351–362, 2008. [5] R. Bindal, P. Cao, W. Chan, J. Medved, G. Suwala, T. Bates, and A. Zhang, “Improving Traffic Locality in BitTorrent via Biased Neighbor Selection,” in Proc. of the 26th IEEE International Conference on Distributed Computing Systems (ICDCS 2006), July 2006. [6] C. Gkantsidis, T. Karagiannis, P. Rodriguez, and M. Vojnovic, “Planet Scale Software Updates,” in Proc. of ACM SIGCOMM, September 2006. [7] O. Saleh and M. Hefeeda, “Modeling and Caching of Peer-to-Peer Traffic,” in Proc. of the 14th IEEE International Conference on Network Protocols (ICNP 2006), November 2006. [8] G. Shen, Y. Wang, Y. Xiong, B. Zhao, and Z.-L. Zhang, “HPTP: Relieving The Tension between ISPs and P2P,” in Proc. of the 6th International Workshop on Peer-to-Peer Systems (IPTPS 2007), February 2007. [9] V. Aggarwal, A. Feldmann, and C. Scheideler, “Can ISPs and P2P Users Cooperate for Improved Performance?” SIGCOMM Comput. Commun. Rev., vol. 37, no. 3, pp. 29–40, 2007. |
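The performance-versus-locality tradeoff described in the abstract can be illustrated with a minimal peer-selection sketch. This is not the b-matching model proposed in the project; it is a hypothetical scoring rule in which a tradeoff weight `lam` interpolates between pure bandwidth-based and pure locality-based neighbor selection (all names and weights are illustrative assumptions).

```python
# Hypothetical sketch: rank candidate neighbors by a weighted combination of
# normalized expected download bandwidth and a same-ISP locality indicator.
# lam = 0 selects purely by bandwidth; lam = 1 purely by locality.

def select_neighbors(candidates, my_isp, lam=0.5, k=4):
    """candidates: list of (peer_id, isp, bandwidth) tuples."""
    max_bw = max(bw for _, _, bw in candidates) or 1.0

    def score(c):
        _, isp, bw = c
        locality = 1.0 if isp == my_isp else 0.0
        return (1.0 - lam) * (bw / max_bw) + lam * locality

    # Pick the k highest-scoring candidates (sort is stable for ties).
    return [p for p, _, _ in sorted(candidates, key=score, reverse=True)[:k]]

peers = [("p1", "A", 10.0), ("p2", "B", 2.0), ("p3", "B", 1.0), ("p4", "A", 8.0)]
print(select_neighbors(peers, my_isp="B", lam=0.0, k=2))  # bandwidth only: ['p1', 'p4']
print(select_neighbors(peers, my_isp="B", lam=1.0, k=2))  # locality only: ['p2', 'p3']
```

The example mirrors the two-ISP scenario above: a peer in ISP B picks the high-bandwidth remote peers when performance dominates, and the same-ISP peers when locality dominates.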
Project Title: | The 29th Conference on Computer Communications (IEEE INFOCOM) UUSee: Large-Scale Operational On-Demand Streaming with Random Network Coding |
Investigator(s): | Wu C |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 03/2010 |
Completion Date: | 03/2010 |
Abstract: |
N/A |
List of Research Outputs |
Huang W., Wu C. and Lau F.C.M., The Performance and Locality Trade-off in BitTorrent-like P2P File-Sharing Systems, IEEE International Conference on Communications (IEEE ICC 2010). 2010. |
Liu Z., Wu C., Li B. and Zhao S., UUSee: Large-Scale Operational On-Demand Streaming with Random Network Coding, IEEE INFOCOM. 2010. |
Qiu X., Huang W., Wu C., Lau F.C.M. and Lin X., InstantLeap: an Architecture for Fast Neighbor Discovery in Large-scale P2P VoD Streaming, In: Thomas Plagemann, Springer Multimedia Systems Journal. Springer-Verlag, 2010, 16: 183-198. |
Wu C., Introduction to Convex Optimization: Optimal Rate Control in Communication Networks, Winter School, Institute of Theoretical Computer Science and Communications, CUHK. 2010. |
Researcher : Wu MKE |
List of Research Outputs |
Lam T.W., Li R., Tam S.L., Wong C.K., Wu M.K.E. and Yiu S.M., High Throughput Short Read Alignment via Bi-directional BWT, The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2009). 2009, 31 - 36. |
Tam S.L., Wu M.K.E., Lam T.W. and Yiu S.M., Succinct Text Indexing with Wildcards, The 16th String Processing and Information Retrieval Symposium (SPIRE). Springer, 2009, 39-50. |
Wu M.K.E., Improved Indexing for Next Generation Bioinformatics Applications. Hong Kong, 2009. |
Researcher : Xie X |
List of Research Outputs |
Cheng C.K., Xie X., Yiu M.L., Chen J. and Sun L., UV-diagram: A Voronoi Diagram for Uncertain Data, The IEEE Intl. Conf. on Data Engineering (IEEE ICDE 2010). 2010. |
Researcher : Yan D |
List of Research Outputs |
Liu Y., Wang W.P., Levy B., Sun F., Yan D., Lu L. and Yang C.L., On Centroidal Voronoi Tessellation -- Energy Smoothness And Fast Computation, ACM Transactions on Graphics. 2009, 28, No. 24. |
Yan D., Wang W.P., Levy B. and Liu Y., Efficient Computation of 3D Clipped Voronoi Diagram, In: Mourrain, Bernard; Schaefer, Scott; Xu, Guoliang, Advances in Geometric Modeling and Processing. Springer, 2010, 6130, Lecture Notes in Computer Science: 269-282. |
Yan D., Wintz J., Mourrain B., Wang W.P., Boudon F. and Godin C., Efficient and robust reconstruction of botanical branching structure from laser scanned points, 11th IEEE International conference on Computer-Aided Design and Computer Graphics. Beijing, China, IEEE, 2009, 572-575. |
Yan D., Levy B., Liu Y., Sun F. and Wang W.P., Isotropic Remeshing with Fast and Exact Computation of Restricted Voronoi Diagram, Computer Graphics Forum (Symposium on Geometry Processing 2009). Berlin, Germany, 2009, 28(5): 1445-1454. |
Yan D., Variational Shape Segmentation and Mesh Generation. Hong Kong, 2010. |
Researcher : Yang B |
List of Research Outputs |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, BMC Bioinformatics. London, United Kingdom, BioMed Central, 2010, 11 (Suppl 2): S5. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, The Third International Workshop on Data and Text Mining in Bioinformatics (DTMBIO 09) in the 18th ACM Conference on Information and Knowledge Management. Hong Kong, 2009. |
Researcher : Yip CL |
List of Research Outputs |
Wong K.Y. and Yip C.L., Identifying centers of circulating and spiraling vector field patterns and its applications, Pattern Recognition. 2009, 42: 1371-1387. |
Researcher : Yiu SM |
Project Title: | Investigation of the lineage and tissue specificity of genes |
Investigator(s): | Yiu SM, Smith DK |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 10/2007 |
Completion Date: | 03/2010 |
Abstract: |
To identify genes of varying degrees of lineage specificity and map these where possible to differing degrees of tissue specificity; to identify the domains and disordered regions in the products of these genes; to examine the role of domain recombination and evolutionary rate in relation to the degree of lineage and/or tissue specificity; to identify whether disordered proteins are more or less “novel”, lineage or tissue specific than ordered proteins; to assess whether classifications of tissue specific genes by lineage specificity, domain age or evolutionary rate or combinations of these can provide assistance in deciphering tissue specific gene regulation. |
Project Title: | Indexing multiple similar DNA sequences |
Investigator(s): | Yiu SM, Lam TW |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 06/2009 |
Completion Date: | 01/2011 |
Abstract: |
1. Introduction. New DNA sequencing technologies are now available (e.g. Solexa [1], SOLiD [2], 454 [3]) which can produce billions of reads (short DNA fragments) within a few days at low cost. More and more DNA sequences of the same species can be produced in just days or months, which facilitates the study of evolution and of diseases related to mutation (e.g. the 1000 Genomes Project [4], announced in January this year, aims at producing at least 1000 DNA sequences from 1000 different persons for the study of human genetic variations to support disease studies). However, this also implies that a huge amount of sequence data needs to be analyzed, and existing bioinformatics tools cannot easily be scaled up to this level. This, in turn, poses many challenging problems to the bioinformatics community (e.g. [5, 6]). For example, to study the mutations of human chromosome 1, we may need to handle 100 DNA sequences, each with 300 million characters; in total, we are dealing with a set of DNA sequence data with 30 billion characters. What researchers really want is an efficient tool to help them analyze this volume of data. One of the basic, but important and frequently used, analysis operations is to locate patterns in these sequences (pattern matching). In the following, we will see that none of the existing solutions can easily be scaled up to handle this amount of data. In this proposal, we focus on the problem of indexing multiple DNA sequences of the same species (usually they are very similar, with about 99.9% similarity in sequence) to allow efficient pattern matching. 2. Background and existing solutions. Indexing DNA sequences for pattern matching has been a well-studied subject in computational biology. Since we have to handle a large volume of sequence data, traditional indexing data structures such as the suffix tree [7] are not applicable. The best known implementation of a suffix tree requires about 17n bytes of space for a text of length n [8]. 
In our case, this would be about 510G of memory. Another alternative is to store the suffix tree on hard disk; however, the access time would be increased by several orders of magnitude, and the construction of a hard-disk-based suffix tree takes a very long time, at least several months [9]. A more feasible solution is to consider compressed indexing data structures. These data structures (e.g. FM-index [10, 11], CSA (compressed suffix array) [12, 13], BWT (Burrows-Wheeler Transform) [14]) reduce the space complexity from O(n) bytes to O(n) bits, while having a searching performance similar to that of a suffix tree. Among all these data structures, BWT was found to be the most efficient, with a memory requirement of about 0.4n bytes [15, 16], and the index construction time is very reasonable: experiments show that it takes only one hour for the human genome, which has 3G characters. However, even for this best known solution, when applied to our case of storing all 100 sequences, each with 300M characters, it would require 40G of memory, which far exceeds the memory capacity of a standard PC. 3. Objectives. On the other hand, since these 100 sequences are all DNA sequences of human chromosome 1, they should be very similar except for some of the characters (about 0.1%). In this proposal, we want to study how to exploit the high similarity between these sequences in order to come up with a more space-economical indexing data structure that allows efficient pattern matching. The following problems will be studied in this project. Problem 1: Given a set of r texts T1, T2, …, Tr of the same length n, and k positions such that for each position x there exist i ≠ j with Ti[x] ≠ Tj[x], design a space-efficient (say using O(n) + o(kr) bits) indexing data structure to store these texts such that, given any pattern, we can efficiently locate all occurrences of this pattern in the texts. Note that k << n. Problem 2: Problem 1 aims at finding occurrences of the exact pattern. Problem 2 extends Problem 1 by finding locations in the texts of patterns that are similar (within edit distance at most 2) to the input pattern. |
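The space-saving idea behind Problem 1 can be sketched in a few lines: store one reference text plus, per sequence, only the positions where it differs, and answer queries by reading characters through the diff overlay instead of materializing each text. This is a hypothetical illustration of the storage scheme only (the naive O(rnm) scan below makes no attempt at the index efficiency the proposal targets, and all names are invented for the example).

```python
# Hypothetical sketch: r similar texts stored as one reference + per-text
# substitution maps; pattern search reads characters through the overlay.

class SimilarTexts:
    def __init__(self, reference, diffs):
        # diffs[i] maps position -> substituted character for sequence i
        self.ref = reference
        self.diffs = diffs

    def char(self, i, x):
        """Character at position x of sequence i, without materializing it."""
        return self.diffs[i].get(x, self.ref[x])

    def find(self, pattern):
        """Return all (sequence index, position) occurrences of pattern."""
        m, n = len(pattern), len(self.ref)
        hits = []
        for i in range(len(self.diffs)):
            for start in range(n - m + 1):
                if all(self.char(i, start + j) == pattern[j] for j in range(m)):
                    hits.append((i, start))
        return hits

ref = "ACGTACGT"
texts = SimilarTexts(ref, [{}, {3: "G"}])  # seq 0 = ref; seq 1 differs at pos 3
print(texts.find("GTA"))  # only seq 0 still contains "GTA", at position 2
```

With k differing positions across r texts, the storage is one copy of the reference plus O(k) substitutions per text, rather than r full copies, which is the intuition behind the O(n) + o(kr)-bit target of Problem 1.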
Project Title: | IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2009) High Throughput Short Read Alignment via Bi-directional BWT |
Investigator(s): | Yiu SM |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 11/2009 |
Completion Date: | 11/2009 |
Abstract: |
N/A |
Project Title: | Algorithms for Inferring k-articulated Phylogenetic Network |
Investigator(s): | Yiu SM |
Department: | Computer Science |
Source(s) of Funding: | General Research Fund (GRF) |
Start Date: | 11/2009 |
Abstract: |
1) To develop efficient algorithms for constructing a k-articulated phylogenetic network with small k (e.g. k <= 3) from a given set of binary phylogenetic trees; 2) to extend the algorithms in objective 1 to handle non-binary phylogenetic trees; 3) to develop algorithms for constructing a k-articulated phylogenetic network with small k directly from the information of the genes of the species (e.g. in terms of a distance matrix that captures the similarity of every two strains of the species based on the genes), without constructing the phylogenetic trees first. |
Project Title: | Non-coding RNA structural alignment with special types of pseudoknots |
Investigator(s): | Yiu SM |
Department: | Computer Science |
Source(s) of Funding: | Seed Funding Programme for Basic Research |
Start Date: | 03/2010 |
Abstract: |
Introduction: Non-coding RNAs (ncRNAs) are RNA molecules that are not translated into proteins. Recently, researchers have come to realize the importance of ncRNAs: they have been shown to be involved in many biological processes (e.g. [1-3]), and the number of ncRNAs in humans is expected to be large [4, 5]. Currently, over 200,000 ncRNAs have been found, and more are expected to be revealed. Since ncRNAs do not produce proteins, it is not easy to identify these molecules using laboratory techniques. On the other hand, it is known that ncRNAs belonging to the same family and having similar functions should be similar in terms of their secondary structures and primary sequences. This observation provides a clue for identifying ncRNAs using computational approaches: based on the known ncRNAs and their structures, we align them with all possible subregions in the genome/chromosome, and the regions that are similar to a known ncRNA in both structure and primary sequence are likely to be ncRNAs of the same family. The key step in this process is to compute a structural alignment score which captures the similarity (based on the sequences as well as the secondary structures) between the known ncRNA and a target sequence with unknown structure. Issues: A substructure called a pseudoknot may exist in the secondary structure of an ncRNA molecule. This substructure has been found to play an important role in many critical biological functions (e.g. [6-8]) such as telomerase, catalytic functions, and self-splicing introns. However, the presence of pseudoknots in the secondary structure makes the alignment problem computationally harder: the alignment problem is believed to be NP-hard (unlikely to admit a fast algorithm) for general pseudoknots. Existing alignment algorithms (e.g. PAL [9]) can only support simple pseudoknots (called standard pseudoknots of degree k). Many other algorithms (e.g. 
RSEARCH [10] and FASTR [11]) do not support pseudoknots at all. On the other hand, there exist secondary structures of known ncRNAs with more complicated pseudoknots: for example, in the Rfam 9.1 database, among 71 pseudoknotted families, 18 have pseudoknot structures more complicated than a standard pseudoknot of degree k. None of the existing algorithms or tools is applicable to the alignment problem for these complicated pseudoknots. Objectives: In this proposal, we consider the more complicated pseudoknot structures that are found in known ncRNAs. The objectives of the proposal include the following: (1) to identify those complicated pseudoknot structures from known ncRNAs and formally define them; (2) to develop alignment algorithm(s) to handle these types of structures. The algorithms will be implemented and evaluated on real ncRNA data. |
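The notion of a structural alignment score described above can be illustrated with a deliberately simplified, gap-free sketch: reward sequence identity, plus a bonus when a base pair (i, j) annotated in the known ncRNA's structure can also form (i.e. is complementary) in the target. Real structural alignment, especially with pseudoknots, must handle insertions and deletions and is far more involved; the function name, weights, and example data here are illustrative assumptions only.

```python
# Simplified, gap-free structural scoring: sequence identity plus a bonus
# for known base pairs that remain complementary in the target sequence.

# Watson-Crick pairs plus the G-U wobble pair found in RNA.
COMPLEMENT = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
              ("G", "U"), ("U", "G")}

def structural_score(known_seq, pairs, target, w_seq=1.0, w_pair=2.0):
    """pairs: set of (i, j) index pairs in known_seq's secondary structure."""
    assert len(known_seq) == len(target)  # gap-free simplification
    seq_score = sum(w_seq for a, b in zip(known_seq, target) if a == b)
    pair_score = sum(w_pair for i, j in pairs
                     if (target[i], target[j]) in COMPLEMENT)
    return seq_score + pair_score

known = "GGGAAACCC"
pairs = {(0, 8), (1, 7), (2, 6)}   # a hairpin stem of three base pairs
target = "GGCAAAGCC"               # positions 2 and 6 mutated but still pair
print(structural_score(known, pairs, target))  # 7 matches + 3 pairs * 2 = 13.0
```

The example shows why structure matters: the target differs from the known ncRNA at two positions, yet every annotated base pair can still form, so a structure-aware score ranks it much higher than sequence identity alone would.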
List of Research Outputs |
Chim T.W., Yiu S.M., Hui C.K. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications Schemes for VANETs, Ad Hoc Networks. 2010, 28: 160 - 175. |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, ADHOCNETS '09. Niagara Falls, Canada, 2009. |
Chim T.W., Yiu S.M., Hui C.K., Jiang L. and Li V.O.K., SPECS: Secure and Privacy Enhancing Communications for VANET, The First International Conference on Ad Hoc Networks (AdHocNets 2009). Canada, 2009. |
Dong Y., Chim T.W., Li V.O.K., Yiu S.M. and Hui C.K., ARMR: Anonymous Routing Protocol with Multiple Routes for Communications in Mobile Ad Hoc Networks, Ad Hoc Networks. 2009, 7(8): 1536-50. |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Efficient key integrity verification for quantum cryptography using combinatorial group testing, SPIE Defense, Security, and Sensing Conference. USA, 2010. |
Fang J., Jiang L., Yiu S.M. and Hui C.K., Hard Disk Integrity Check by Hashing with Combinatorial Group Testing, The 2009 International Workshop on Forensics for Future Generation Communication environments (F2GC-09). Jeju Island, Korea, 2009. |
Jia Y., Song Y., Yiu S.M. and Smith D.K., Bioinformatics Study of the Lineage and Tissue Specificity of Gene Expression, 14th Research Postgraduate Symposium, December 2 & 3, 2009, The University of Hong Kong. 2009. |
Jiang L., Yiu S.M., Dong Y., Hui C.K. and Wong H.Y., Secure Chained Threshold Proxy Signature without and with Supervision, Journal of Software Engineering and Applications. 2009, 2(4): 267 - 275. |
Lam T.W., Li R., Tam S.L., Wong C.K., Wu M.K.E. and Yiu S.M., High Throughput Short Read Alignment via Bi-directional BWT, The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2009). 2009, 31 - 36. |
Law Y.W., Chan P.F., Yiu S.M., Tang B., Lai K.Y., Chow K.P., Ieong S.C.R., Kwan Y.K., Hon W.K. and Hui C.K., Identifying Static and Dynamic Volatile Memory Data from Multiple Dumps for Live Forensics, Sixth IFIP WG 11.9 International Conference on Digital Forensics. Hong Kong, 2010. |
Leung C.M., Leung S.Y., Yiu S.M. and Chin F.Y.L., Predicting Conserved Metabolic Pathways Leading to an Important Final Product (Poster), The 8th Annual International Conference on Computational Systems Bioinformatics Conference (CSB 2009). Stanford University, USA, 2009. |
Li R., Yu C., Li Y., Lam T.W., Yiu S.M., Kristiansen K. and Wang J., SOAP2: an Improved Ultrafast Tool for Short Read Alignment, Bioinformatics. 2009, 25(15): 1966-67. |
Peng Y., Leung C.M., Yiu S.M., Chin F.Y.L. and Li R., Assembling Short Reads with Much Less Memory, The 11th International Meeting on Human Genome Variation and Complex Genome Analysis (HGV 2009). Tallinn, Estonia, 2009. |
Siu W.Y., Mamoulis N., Yiu S.M. and Chan H.L., A Data-Mining Approach for Multiple Structural Alignment of Proteins, Bioinformation. 2010, 4(8): 366 - 370. |
Tam S.L., Wu M.K.E., Lam T.W. and Yiu S.M., Succinct Text Indexing with Wildcards, The 16th String Processing and Information Retrieval Symposium (SPIRE). Springer, 2009, 39-50. |
Wong K.F., Cheung W.Y., Lam T.W. and Yiu S.M., Local Structural Alignment of Regular RNA with Affine Gap Model, The 6th International Symposium on Bioinformatics Research and Applications (ISBRA’10). 2010. |
Wong K.F., Lam T.W., Sung W.K. and Yiu S.M., Structural Alignment of RNA with Complex Pseudoknot Structure, The 9th International Workshop on Algorithms in Bioinformatics (WABI). 2009. |
Wong K.F. and Yiu S.M., Structure Prediction of Simple Non-Standard Pseudoknot, The First International Conference on Bioinformatics (Bioinformatics 2010). 2010. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, BMC Bioinformatics. London, United Kingdom, BioMed Central, 2010, 11 (Suppl 2): S5. |
Yang B., Peng Y., Leung C.M., Yiu S.M., Chen J. and Chin F.Y.L., Unsupervised Binning of Environmental Genomic Fragments based on an Error Robust Selection of l-mers, The Third International Workshop on Data and Text Mining in Bioinformatics (DTMBIO 09) in the 18th ACM Conference on Information and Knowledge Management. Hong Kong, 2009. |
Yiu S.M., Computational Biology: What Computer Science Can Help?, the 1st IFIP Conference on Bioinformatics, Surat, India. 2010. |
Yiu S.M., High Throughput Short Read Alignment via Bi-directional BWT, 2nd Mainland and Hong Kong Workshop on Bioinformatics, Yunnan, China. 2010. |
Yiu S.M., High Throughput Short Read Alignment via Bi-directional BWT, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS). 2010. |
Yiu S.M., Outstanding Teaching Award, The University of Hong Kong. 2009. |
Yiu S.M., Security and Privacy Issues for Intervehicular Communications in VANETs, Harbin Institute of Technology, Shenzhen Graduate School. 2009. |
Yiu S.M., Security and Privacy Issues for Intervehicular Communications in VANETs, Institute for Advanced Study, Tsinghua University, Beijing. 2010. |
Yiu S.M., Structural Alignment of RNAs, Institute of Theoretical Computer Science, Tsinghua University, Beijing. 2010. |
Yiu S.M., Structural Alignment of ncRNAs with Complicated Pseudoknot, Computational Biology Research Center (CBRC), AIST, Japan. 2010. |
Researcher : Yu D |
List of Research Outputs |
Hua Q., Wang Y., Yu D. and Lau F.C.M., Dynamic Programming Based Algorithms for Set Multicover and Multiset Multicover Problems, Theoretical Computer Science. Elsevier, 2010, 411: 2467-2474. |
Hua Q., Wang Y., Yu D. and Lau F.C.M., Set Multi-Covering via Inclusion-Exclusion, Theoretical Computer Science. Elsevier, 2009, 410: 3882-3892. |
Researcher : Yuk SC |
List of Research Outputs |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., A 3D face reconstruction approach in enhancing the robustness and accuracy of face recognition, Shanghai and Hong Kong Symposium on Science & Technology. 2009. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Automatic 3D Face Texture Mapping Framework from Single Image, 1st International Conference on Internet Multimedia Computing and Service. Kunming, China, 2009, 123-128. |
He X., Yuk S.C., Chow K.P., Wong K.K.Y. and Chung H.Y., Super-Resolution of Faces Using Texture Mapping on a Generic 3D Model, International Conference on Image and Graphics. Xi’an, China, 2009, 361-365. |
Researcher : Zhang Y |
Project Title: | The 15th International Computing and Combinatorics Conference (COCOON'2009) Online Tree Node Assignment with Resource Augmentation |
Investigator(s): | Zhang Y |
Department: | Computer Science |
Source(s) of Funding: | URC/CRCG - Conference Grants for Teaching Staff |
Start Date: | 07/2009 |
Completion Date: | 07/2009 |
Abstract: |
N/A |
List of Research Outputs |
Chan J.W.T., Chin F.Y.L., Ting H.F. and Zhang Y., Online Problems for Frequency Assignment and OVSF Code Assignment in Wireless Communication Networks, SIGACT News. 2009, 40(3): 86-98. |
Chan W.T., Chin F.Y.L., Ting H.F. and Zhang Y., Online Tree Node Assignment with Resource Augmentation, The 15th International Computing and Combinatorics Conference (COCOON'2009). Niagara Falls, New York, USA, 2009, 358-367. |
Chin F.Y.L., Ting H.F. and Zhang Y., 1-Space Bounded Algorithms for 2-Dimensional Bin Packing, The 20th Annual International Symposium on Algorithms and Computation (ISAAC 2009). Hawaii, USA, 2009. |
Chin F.Y.L., Ting H.F. and Zhang Y., A constant-competitive algorithm for online OVSF code assignment, Algorithmica. 2010, 56: 89-104. |
Chin F.Y.L., Ting H.F., Tsin Y.H. and Zhang Y., Online uniformly inserting points on grid, The 6th International Conference on Algorithmic Aspects in Information and Management (AAIM). 2010, 282-292. |
Kao M.Y., Leung C.M., Sun H. and Zhang Y., Deterministic Polynomial-Time Algorithms for Designing Short DNA Words, 7th Annual Conference on Theory and Applications of Models of Computation (TAMC 2010). 2010. |
Zhang Y., Chin F.Y.L. and Zhu H., A 1-Local Asymptotic 13/9-Competitive Algorithm for Multicoloring Hexagonal Graphs, Algorithmica. Springer, 2009, 54(4): 557–567. |
Zhang Y., SIGBOT: Signature-based Multiple-Bug Localization. Hong Kong, 2010. |
Researcher : Zhang Z |
List of Research Outputs |
Jiang B., Zhang Z., Chan W.K. and Tse T.H., Adaptive random test case prioritization, Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering (ASE 2009), IEEE Computer Society Press, Los Alamitos, CA (acceptance rate: 17.1%). 2009, 233−244. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., Best Paper Award, 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), IEEE Computer Society, Washington, DC, USA . 2009. |
Jiang B., Zhang Z., Tse T.H. and Chen T.Y., How well do test case prioritization techniques support statistical fault localization [Best Paper Award, COMPSAC 2009], Proceedings of the 33rd Annual International Computer Software and Applications Conference (COMPSAC 2009), vol. 1, IEEE Computer Society Press, Los Alamitos, CA. 2009, 99−106. |
Zhang Z., Chan W.K., Tse T.H., Jiang B. and Wang X., Capturing propagation of infected program states, Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC 2009/FSE-17), ACM Press, New York, NY (acceptance rate: 14.7%). 2009, 43−52. |
Zhang Z., Chan W.K., Tse T.H. and Hu P., Experimental study to compare the use of metamorphic testing and assertion checking, Journal of Software. 2009, 20 (10): 2637−2654. |
Zhang Z., Jiang B., Chan W.K., Tse T.H. and Wang X., Fault localization through evaluation sequences, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2010, 83 (2): 174−187. |
Zhang Z., Chan W.K., Tse T.H., Hu P. and Wang X., Is non-parametric hypothesis testing model robust for statistical fault localization?, Information and Software Technology (ISI impact factor of journal: 1.821). 2009, 51 (11): 1573−1585. |
Zhang Z., Chan W.K., Tse T.H., Lu H. and Mei L., Resource prioritization of code optimization techniques for program synthesis of wireless sensor network applications, Journal of Systems and Software (ISI impact factor of journal: 1.340). 2009, 82 (9): 1376−1387. |
Zhang Z., Software Debugging through Dynamic Analysis of Program Structures. Hong Kong, 2010. |