About
As a motivated research scientist and machine learning engineer, I have a proven track…
Experience
Education
-
Activities and Societies: Taiwanese Scholar Society
-
Master of Computer Science.
• Designed a two-phase arbiter model that speeds up simulation performance 20× over the traditional simulation approach on a Multiprocessor System-on-Chip platform.
• Implemented in C++ with SystemC library.
• The related publication won the Outstanding Paper Award out of 72 papers at an international workshop.
-
Activities and Societies: President of the Student Association of the Computer Science and Engineering Department; Captain of the Softball Team of the Computer Science and Engineering Department
Licenses & Certifications
Volunteer Experience
-
Vice President
Taiwanese Scholar Society
- 1 year 1 month
Social Services
-
President
The Student Association of the Department of Computer Science and Engineering in NSYSU
- 1 year 1 month
Science and Technology
Publications
-
(Vision Paper) A Vision for Spatio-Causal Situation Awareness, Forecasting, and Planning
ACM Transactions on Spatial Algorithms and Systems
-
CTT: Causally Informed Tensor Train Decomposition
IEEE Big Data
Tensor Train (TT) is a tensor decomposition technique designed to resolve the curse of dimensionality and the intermediate memory blow-up problems in traditional techniques for high-dimensional data analysis. The tensor train process provides linear space complexity by creating a sequential tensor network of low modalities. However, the selected sequence of decomposition order can have a significant impact on the accuracy and representativeness of the final decomposition and, unfortunately, choosing a good order for the TT representation is not a trivial task. In this paper, we observe that the causal structure underlying the data can impact the tensor train process and that a rough estimate of causality can be used to inform the order of the latent spaces to consider. Enlightened by this observation, we propose a novel causally informed tensor train decomposition (CTT) approach to tackle the sequence selection problem in TT-decomposition. CTT leverages the structural information in a given causal graph and recommends a suitable causally-informed decomposition sequence for TT-decomposition.
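For background, the sequential decomposition described above can be sketched with a plain TT-SVD in NumPy. This is a minimal illustration of standard tensor train decomposition under one fixed decomposition order, not the CTT algorithm itself; the uniform `max_rank` truncation is an assumption for brevity.

```python
import numpy as np

def tt_decompose(tensor, max_rank):
    """TT-SVD sketch: decompose a d-mode tensor into a train of 3-mode
    cores via sequential truncated SVDs, processing modes left to right."""
    dims = tensor.shape
    d = len(dims)
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(d - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(S))
        cores.append(U[:, :r].reshape(rank, dims[k], r))
        # Carry the remaining factor forward and fold in the next mode.
        mat = (np.diag(S[:r]) @ Vt[:r]).reshape(r * dims[k + 1], -1)
        rank = r
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the train of cores back into a full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(out.shape[1:-1])
```

With a sufficiently large `max_rank` the reconstruction is exact up to floating-point error; shrinking `max_rank` trades accuracy for the linear storage cost the abstract mentions, and permuting the modes before calling `tt_decompose` is exactly the sequence-selection choice that CTT addresses.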
-
PanCommunity: Non-Monolithic Complex Epidemic and Pandemic Modeling.
The 9th International Conference on Infectious Disease Dynamics
-
Tensor Analysis
Chameleon User Experiment Blog
Introduction to basic tensor analysis with tensor train decomposition.
-
GTT: Leveraging data characteristics for guiding the tensor train decomposition
Information Systems
The demand for searching and querying multimedia data such as images, video, and audio is omnipresent, and how to effectively access such data for various applications is a critical task. Nevertheless, these data are usually encoded as multi-dimensional arrays, or tensors, and traditional data mining techniques may be limited due to the curse of dimensionality. Tensor decomposition has been proposed to alleviate this issue. Commonly used tensor decomposition algorithms include CP-decomposition (which seeks a diagonal core) and Tucker-decomposition (which seeks a dense core). Naturally, Tucker maintains more information, but due to the denseness of the core, it is also subject to exponential memory growth with the number of tensor modes. Tensor train (TT) decomposition addresses this problem by seeking a sequence of three-mode cores; unfortunately, there are currently no guidelines to select the decomposition sequence. In this paper, we propose the GTT method for guiding the tensor train in selecting the decomposition sequence. GTT leverages the data characteristics (including the number of modes, length of the individual modes, density, distribution of mutual information, and distribution of entropy) as well as the target decomposition rank to pick a decomposition order that will preserve information. Experiments with various data sets demonstrate that GTT effectively guides the TT-decomposition process towards decomposition sequences that better preserve accuracy.
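As an illustration of one of the data characteristics mentioned above, the following sketch scores each tensor mode by the average entropy of its fibers. This is a simplified, hypothetical stand-in: the actual GTT method combines several characteristics (mode lengths, density, mutual information, entropy) and the target rank, and the histogram binning here is an assumption.

```python
import numpy as np

def mode_entropies(tensor, bins=16):
    """For each mode, average the Shannon entropy of the value histogram
    of each slice along that mode -- one toy 'data characteristic' that
    could inform a decomposition order."""
    ents = []
    for mode in range(tensor.ndim):
        # Unfold: rows are the slices indexed by this mode.
        mat = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        slice_ents = []
        for row in mat:
            hist, _ = np.histogram(row, bins=bins)
            p = hist / hist.sum()
            p = p[p > 0]
            slice_ents.append(-(p * np.log2(p)).sum())
        ents.append(float(np.mean(slice_ents)))
    return ents
```

Sorting modes by such a score (ascending or descending, depending on the heuristic) would yield one candidate decomposition order to feed into a TT routine.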
-
W2FM: The Doubly-Warped Factorization Machine
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)
-
DataStorm: Coupled, Continuous Simulations for Complex Urban Environment
ACM/IMS Transactions on Data Science
Urban systems are characterized by complexity and dynamicity. Data-driven simulations represent a promising approach in understanding and predicting complex dynamic processes in the presence of shifting demands of urban systems. Yet, today’s silo-based, de-coupled simulation engines fail to provide an end-to-end view of the complex urban system, preventing informed decision-making. In this article, we present DataStorm to support integration of existing simulation, analysis and visualization components into integrated workflows. DataStorm provides a flow engine, DataStorm-FE, for coordinating data and decision flows among multiple actors (each representing a model, analytic operation, or a decision criterion) and enables ensemble planning and optimization across cloud resources. DataStorm provides native support for simulation ensemble creation through parameter space sampling to decide which simulations to run, as well as distributed instantiation and parallel execution of simulation instances on cluster resources. Recognizing that simulation ensembles are inherently sparse relative to the potential parameter space, we also present a density-boosting partition-stitch sampling scheme to increase the effective density of the simulation ensemble through a sub-space partitioning scheme, complemented with an efficient stitching mechanism that leverages partial and imperfect knowledge from partial dynamical systems to effectively obtain a global view of the complex urban process being simulated.
-
GTT: Guiding the Tensor Train Decomposition (Best paper candidate)
International Conference on Similarity Search and Applications (SISAP)
-
Matrix Factorization with Interval-valued Data Sets (Extended Abstract)
The 36th IEEE International Conference on Data Engineering (ICDE)
-
Matrix Factorization with Interval-valued Data Sets
IEEE Transactions on Knowledge and Data Engineering (TKDE)
-
DataStorm-FE: a data- and decision-flow and coordination engine for coupled simulation ensembles
Proceedings of the VLDB Endowment
-
Personalized PageRank in Uncertain Graphs with Mutually Exclusive Edges
The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Measures of node ranking, such as personalized PageRank, are utilized in many web and social-network based prediction and recommendation applications. Despite their effectiveness when the underlying graph is certain, however, these measures become difficult to apply in the presence of uncertainties, as they are not designed for graphs that include uncertain information, such as edges that mutually exclude each other. While there are several ways to naively extend existing techniques (such as trying to encode uncertainties as edge weights or computing all possible scenarios), as we discuss in this paper, these either lead to large degrees of errors or are very expensive to compute, as the number of possible worlds can grow exponentially with the amount of uncertainty. To tackle this challenge, in this paper, we propose an efficient Uncertain Personalized PageRank (UPPR) algorithm to approximately compute personalized PageRank values on an uncertain graph with edge uncertainties. UPPR avoids enumeration of all possible worlds, yet it is able to achieve comparable accuracy by carefully encoding edge uncertainties in a data structure that leads to fast approximations. Experimental results show that UPPR is very efficient in terms of execution time and its accuracy is comparable or better than more costly alternatives.
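For context, the baseline measure the abstract refers to, personalized PageRank on a certain (uncertainty-free) graph, can be computed by standard power iteration. This is not the UPPR algorithm itself, and the dense-matrix formulation is a simplification for small graphs.

```python
import numpy as np

def personalized_pagerank(adj, seed, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for personalized PageRank on a directed graph
    given as a dense adjacency matrix; `seed` is the restart node and
    `alpha` the damping factor."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Column-stochastic transition matrix; dangling nodes jump to the seed.
    P = np.zeros((n, n))
    for i in range(n):
        if out_deg[i] > 0:
            P[:, i] = adj[i] / out_deg[i]
        else:
            P[seed, i] = 1.0
    r = np.full(n, 1.0 / n)
    e = np.zeros(n)
    e[seed] = 1.0  # restart distribution concentrated on the seed node
    for _ in range(max_iter):
        r_new = alpha * (P @ r) + (1 - alpha) * e
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r
```

With mutually exclusive edges, each choice of surviving edges would change `P`, so naive enumeration needs one such iteration per possible world, which is the exponential blow-up the paper is addressing.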
-
A Formal Model for Intellectual Relationships among Knowledge Workers and Knowledge Organizations
Journal of Visual Languages & Computing, 2015
An academic learning network consists of multiple knowledge organizations and knowledge workers. Intellectual relationships can be derived from the interactions among them. In this paper, we propose a formal model to describe the interactions within the proposed academic learning network and further provide an evaluation process to quantify intellectual relationships. Our approach is also integrated with SMNET, a realistic social platform.
-
A Formal Model for Intellectual Relationships among Knowledge Workers and Knowledge Organizations
The 20th International Conference on Distributed Multimedia Systems
An academic learning network consists of multiple knowledge organizations and knowledge workers. The degree of relationship can be derived from the interactions among them. In this paper, we propose a formal ownership model to describe the interactions within an academic learning network and further provide an evaluation process to quantify the degree of relationship. The proposed approach is also integrated with SMNET, a realistic social platform.
-
Automatic Generation of High-speed Accurate TLM Models for Out-of-Order Pipelined Bus
ACM Transactions on Embedded Computing Systems
Although pipelined/out-of-order (PL/OO) execution features are commonly supported by the state-of-the-art bus designs, no existing manual Transaction-Level-Modeling (TLM) approaches can effectively construct fast and accurate simulation models for PL/OO bus designs. Mainly, the inherent high design complexity of concurrent PL/OO behaviors makes the manual approaches tedious modeling processes that involve error-prone abstraction work. To tackle these complicated modeling tasks, this paper presents an automatic approach that performs systematic abstraction through static analysis and generation of fast-and-accurate simulation models. Our generated Cycle-Count-Accurate models can perform simulation 12 times faster than Cycle-Accurate models while preserving the same PL/OO transaction execution cycle counts.
-
A Formal Full Bus TLM Modeling for Fast and Accurate Contention Analysis
The 17th Workshop on Synthesis And System Integration of Mixed Information technologies
-
Evolutionary Approach for Crowdsourcing Quality Control
Journal of Visual Languages and Computing, 2014
Crowdsourcing is widely used for solving simple tasks (e.g. tagging images) and recently, some researchers have proposed new crowdsourcing models to handle complex tasks (e.g. article writing). In both types of crowdsourcing models (for simple and complex tasks), voting is a technique that is widely used for quality control [9]. For example, 5 workers are asked to write 5 outlines for an article, and another 5 workers are asked to vote for the best outline among the 5 outlines. However, we argue that voting is actually a technique that selects a high-quality answer from a set of answers; it does not directly enhance answer quality. In this paper, we propose a new quality control approach for crowdsourcing that can incrementally improve answer quality. The new approach is based upon two principles, evolutionary computing and slow intelligence, which help the crowdsourcing system propagate knowledge among workers and incrementally improve answer quality. We perform two experimental case studies to show the effectiveness of the new approach. The case study results show that the new approach can incrementally improve answer quality and produce high-quality answers.
Patents
-
Full Bus Transaction Level Modeling Approach for Fast and Accurate Contention Analysis
Issued US 20130054854
The present invention presents an effective Cycle-count Accurate Transaction level (CCA-TLM) full bus modeling and simulation technique. Using the two-phase arbiter and master-slave models, an FSM-based Composite Master-Slave-pair and Arbiter Transaction (CMSAT) model is proposed for efficient and accurate dynamic simulations. This approach is particularly effective for bus architecture exploration and contention analysis of complex Multi-Processor System-on-Chip (MPSoC) designs.
Projects
-
PanCommunity: Leveraging Data and Models for Understanding and Improving Community Response in Pandemics
The goal of this integrative research effort is to enhance the understanding of the complex relationships characterizing pandemics and interventions under crisis. The global-scale response to the COVID-19 pandemic triggered drastic measures including economic shutdowns, travel bans, stay-home orders, and even complete lockdowns of entire cities, regions, and countries. The need to effectively produce and deliver PPE, testing, and vaccines has affected different communities of stakeholders in different ways, requiring coordination from family/business units and counties/states up to federal-level entities. This project, therefore, considers communities at local, federal, and international (US and Japan) scales and investigates the impact of testing, preventative measures, and vaccines, when used in combination, to improve community and inter-agency response at the different scales. The impacts of this research include technologies to help save lives, restore basic services and community functionality, and establish a platform that supports core capabilities including planning, public information, and warning. The project organizes an interdisciplinary community, bringing together (a) computer/data scientists, (b) domain and social scientists and policy experts, (c) federal, state, and local governments, (d) industry and nonprofits, and (e) educators, to serve as a nexus for major research collaborations that will: overcome key research barriers and explore and catalyze new paradigms and practices in cross-community response to pandemics; enable development and sharing of sustainable and reusable technologies, coupled with extensive broader dissemination activities; act as a resource for public policy guidance on relevant strategies and regulations; and provide education, broadening participation, and workforce development at all levels (K-12 to postgraduate) for the next generation of scientists, engineers, and practitioners.
-
DataStorm: A Data Enabled System for End-to-End Disaster Planning and Response
-
This project will enhance disaster response and community resilience through multi-faceted research to create a big data system to support data-driven simulations with the necessary volume, velocity, and variety and integrate and optimize the key aspects and decisions in disaster management. This includes (a) a novel computational infrastructure capable of executing multiple coupled simulations synergistically, under a unified probabilistic model, (b) addressing computational challenges that arise from the need to acquire, integrate, model, analyze, index, and search, in a scalable manner, large volumes of multi-variate, multi-layer, multi-resolution, and interconnected and inter-dependent spatio-temporal data that arise from disaster simulations and real-world observations, and (c) a new high performance data processing system to support continuous observation of the numerical results for simulations from different domains with diverse resource demands and time constraints. These models, algorithms, and systems will be integrated into a disaster data management cyber-infrastructure (DataStorm) that will enable innovative applications and generate broad impacts, through close collaborations with domain experts from transportation, public health, and emergency management, in disaster planning and response.
-
Personal Healthcare in Slow Intelligence System
-
Developed a control-panel Java program based on the Slow Intelligence System (SIS) to control various healthcare sensors and collect/upload the user's status, e.g. blood pressure and SpO2. This system helps hospitals and doctors remotely monitor patients' health status in real time.
-
Performance Prediction in MLB Game
-
Implemented three machine learning algorithms, namely Naive Bayes, Alternating Decision Tree, and Adaptive Boosting (AdaBoost), to predict the performance of Major League Baseball teams. Further combined a wrapper feature-selection approach to achieve more than 70% prediction accuracy.
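A greedy forward wrapper feature-selection loop of the kind mentioned above can be sketched as follows. The nearest-centroid scorer is a hypothetical stand-in for the actual wrapped classifiers (Naive Bayes, ADTree, AdaBoost), and training accuracy is used in place of proper cross-validation for brevity.

```python
import numpy as np

def forward_select(X, y, fit_score, k):
    """Wrapper feature selection: greedily add the feature whose
    inclusion most improves the wrapped model's score `fit_score`,
    a callable (X_subset, y) -> accuracy-like value."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        best_f, best_s = None, -np.inf
        for f in remaining:
            s = fit_score(X[:, selected + [f]], y)
            if s > best_s:
                best_f, best_s = f, s
        selected.append(best_f)
        remaining.remove(best_f)
    return selected

def centroid_score(Xs, y):
    """Toy wrapped model: training accuracy of a nearest-centroid
    classifier on the candidate feature subset (binary labels)."""
    c0 = Xs[y == 0].mean(axis=0)
    c1 = Xs[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Xs - c1, axis=1) <
            np.linalg.norm(Xs - c0, axis=1)).astype(int)
    return (pred == y).mean()
```

Because the scorer re-trains the wrapped model for every candidate subset, wrapper methods are more expensive than filter methods but directly optimize the metric the final model is judged on.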
-
A Sensor-Cloud Simulation Platform with the Slow Intelligence System
-
Designed a sensor-cloud platform that simulates sensors collecting temperature data within the Pittsburgh area and uploading it to a database for further analysis, using the component-based approach of the Slow Intelligence System.
In this project, we established a sensor-cloud simulation platform with the Slow Intelligence System (SIS). SIS provides a component-based approach, which is well suited for sensor-cloud system design. Furthermore, designers can easily configure their own sensor-cloud system without redundant modeling effort. We also provide a realistic application in section 3 to demonstrate that our sensor-cloud system can achieve early-stage verification.
-
Automatic Grading Student Answers
-
Developed an automatic grading system combining bag-of-words, latent semantic analysis (LSA), and textual entailment techniques, using Python and Java, that achieves 50% accuracy in grading students' answers.
Currently, the field of Natural Language Processing (NLP) is applied to different tasks in the education domain. Thus, this article presents an approach for automatic grading of student answers. To achieve this goal, we consider the following architecture: pre-processing, feature extraction, and classification. In the feature extraction step, we consider different approaches such as bag-of-words, LSA, textual entailment, and others. The principal results indicate that we achieve performance similar to the baseline, and highlights of our proposal are that it could be applied in different domains and that it is a general approach that can be evaluated with questions not in the training set.
-
A Genetic Algorithm-based Approach to Maximize Application Throughput in Multicore System
-
Designed a genetic algorithm-based (GA) approach in Python that finds a suitable thread assignment in a multicore system to maximize application throughput. The result achieves half the convergence time compared with brute-force search.
The affinity of thread assignment affects the performance of multithreaded applications, and diverse features of applications or environments can be significant factors resulting in different thread assignments. In this project, we use a genetic algorithm (GA) based approach to explore the affinity of thread assignment. In our GA approach, we convert the affinity of thread assignment into specific genetic units and use crossover and selection to explore suitable affinities. During this evolution, we can obtain a suitable thread assignment without brute-force search.
-
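The encode/crossover/selection loop described above can be sketched as a toy GA. The `throughput` callback is a hypothetical stand-in for a real performance model, and the operator choices (tournament selection, one-point crossover, 10% point mutation) are assumptions for illustration.

```python
import random

def genetic_thread_assignment(n_threads, n_cores, throughput,
                              pop_size=30, gens=50, seed=0):
    """Toy GA: a chromosome is a list mapping thread i -> core.
    Evolve the population to maximize `throughput(assignment)`."""
    rng = random.Random(seed)
    pop = [[rng.randrange(n_cores) for _ in range(n_threads)]
           for _ in range(pop_size)]
    for _ in range(gens):
        def pick():
            # Tournament selection of size 2.
            a, b = rng.sample(pop, 2)
            return a if throughput(a) >= throughput(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_threads)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.1:              # point mutation
                child[rng.randrange(n_threads)] = rng.randrange(n_cores)
            nxt.append(child)
        pop = nxt
    return max(pop, key=throughput)
```

Compared with brute-force enumeration of all `n_cores ** n_threads` assignments, the GA evaluates only `pop_size * gens` candidates, which is where the reported convergence-time savings would come from.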
The Implementation of Two-Pass Token Stream Arbitration in Nanophotonic Interconnection
-
Implemented a nanophotonic interconnection in a 64-core system supporting multiple-reader, multiple-writer (MWMR) channels with a two-pass token stream arbitration policy, achieving 100% simulation accuracy in SystemC.
Due to the demand for high performance and low power consumption, nanophotonic interconnection is considered a future technique in multicore chip design. However, the increasing complexity of communication affects performance significantly, and how to achieve effective arbitration in a multicore system is an essential issue. In addition to performance, fairness should be considered in arbitration. In this report, we implement two-pass token stream arbitration in a Multiple Reader, Multiple Writer (MWMR) nanophotonic interconnection; the two-pass token stream is the key to achieving effective and fair arbitration. The platform we use consists of 64 cores, and we use SystemC as our modeling language.
-
Simulator for distributed directory cache coherence protocols in CMPs
-
-
An Automatic TLM Bus model generator for Communication Architecture Exploration
-
Designed a transaction-level model (TLM) bus generator that can precisely capture bus behavior at the cycle-count-accurate (CCA) level with a finite state machine-based (FSM) formal model using SystemC. The result achieves a 20× speedup in simulation performance while maintaining 100% accuracy.
How to efficiently model and fast, yet accurately, simulate a bus in System-on-a-Chip (SoC) design is one of the key issues in current electronic system design. Though TLM (Transaction Level Modeling) is proven to be an effective design methodology for managing the ever-increasing complexity of system-level designs, conventional TLM design methodology often requires designers to separately exploit cycle-approximate and cycle-accurate models to gain either simulation speed or accuracy, respectively. Consequently, designers repeatedly perform the time-consuming task of re-writing and performing consistency checks for different abstraction-level models of the same design. To ease this work, the project proposes to develop an automatic tool that simultaneously generates both fast and accurate transaction-level bus models for system simulation. The proposed approach relieves designers from the tedious and error-prone process of refining models and checking for consistency.
Honors & Awards
-
IEEE BigData 2023 Student Travel Award
IEEE BigData 2023 Student Travel Award Committee
The Student Travel Award for attending the IEEE BigData 2023 conference, held in Sorrento, Italy, December 15–18, 2023. The committee selected 21 students for travel awards after carefully reviewing the application materials of the 86 student applicants.
-
CIDSE Doctoral Fellowship - Spring 2021
The School of Computing, Informatics and Decision Systems Engineering, Arizona State University
-
Engineering Graduate Fellowship - Spring 2020
The Ira A. Fulton Schools of Engineering, Arizona State University
-
CIDSE Doctoral Fellowship - Spring 2020
The School of Computing, Informatics, and Decision Systems Engineering, Arizona State University
-
SIGIR/IR'17 Student Travel Grants
The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
ACM SIGIR/IR 2017 Student Travel Grant for a reimbursement of $1850.
-
Dean's Fellowship award
The Ira A. Fulton Schools of Engineering and the School of Computing, Informatics, and Decision Systems Engineering
Four academic years plus summers in the form of a 50% FTE Graduate Research Associate appointment with the Center for Assured and Scalable Data Engineering (CASCADE), under the supervision of Dr. Candan.
-
Outstanding Paper Award
The 17th Workshop on Synthesis And System Integration of Mixed Information technologies
"A Formal Full Bus TLM Modeling for Fast and Accurate Contention Analysis"
Proposed an arbiter model on an MPSoC platform that speeds up simulation performance about 20× while maintaining 100% accuracy compared with the traditional cycle-accurate (CA) simulation technique.
-
Excellent work in the Competition of Undergraduate Special Topics Practice
Computer Science Department
Ported a Linux kernel (2.6) to an ARM-s3c2140 development board and used Qt to design a digital photo frame; users can use the program to view and upload their photos via the touch screen over a remote network filesystem (NFS).
-
National Sun Yat-sen University Excellent Student Award
Computer Science Department
Languages
-
Taiwanese
Native or bilingual proficiency
-
Chinese
Native or bilingual proficiency
-
English
Professional working proficiency