International Journal of Blockchain Applications and Secure Computing
This exploratory applied study examines the nature and dimensions of cryptocurrency, namely bitcoin, a peer-to-peer network for facilitating digital barter. As the most widely used cryptocurrency, bitcoin has carved itself a niche market while also promoting the use of other cryptocurrencies. Through descriptive analysis and a visual analytic approach, the study highlights key characteristics and dimensions of bitcoin. The study clarifies the nature and extent of bitcoin use, assisting policymakers in shaping and regulating the cryptocurrency marketplace in today's volatile environment.
The Journal of International Information Management, 1996
The growth of end-user computing and recent developments in information technology, such as client/server architecture and data warehouses, promote the use of remote materialized views (RMVs) to support end-users. This article presents a differential scheme to refresh remote end-users' views. The scheme stores the effects of updates relevant to the RMV in a "difference table", which is transmitted to the remote site upon receiving a refresh request to update the RMV. The scheme provides a fast response to a user's refresh request. We discuss the data structures and algorithms of the scheme. Performance measures are developed and compared with the regeneration scheme.
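The difference-table idea can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: all class and field names are ours, the view is keyed by a hypothetical `id` column, and "updates" are whole-row replacements.

```python
# Illustrative sketch of differential refresh for a remote materialized view
# (RMV). Updates relevant to the view accumulate in a "difference table" and
# are applied at the remote site on a refresh request, instead of regenerating
# the whole view. Names (DifferentialRefresher, "id") are illustrative.

class DifferentialRefresher:
    def __init__(self, base_rows, predicate):
        self.predicate = predicate            # which rows the RMV covers
        self.view = {r["id"]: r for r in base_rows if predicate(r)}
        self.difference_table = []            # pending relevant updates

    def record_update(self, row):
        """Server side: log only updates relevant to the RMV."""
        if self.predicate(row):
            self.difference_table.append(row)

    def refresh(self):
        """Remote side: apply the difference table; return rows applied."""
        for row in self.difference_table:
            self.view[row["id"]] = row
        applied = len(self.difference_table)
        self.difference_table = []
        return applied
```

The point of the scheme is visible in `refresh`: its cost is proportional to the number of relevant updates since the last refresh, not to the size of the base tables, which is what makes it faster than regeneration.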
European Journal of Operational Research, Mar 1, 1996
Customers are scheduled to arrive at periodic scheduling intervals to receive service from a single-server system. A customer must start receiving service within a given departure interval; if this is not the case, the system will pay a penalty and/or transfer the customer to another facility at the system's cost. A complete transient and steady-state analysis of the system …
We study the problem of determining data requirements in cases where statistical query answers are desired. Specifically, we consider the value of storing aggregate data that can be used to speed up answering such queries, but at the potential costs of incomplete information, due to either estimation error or staleness, as well as increased costs of update. We formulate the overall optimization problem for design and decompose it into several subproblems that can be addressed separately. Two of these subproblems are the choice of update method and the choice of aggregates. Qualitative results are given regarding the selection of update policy, and design heuristics, based on numerical experiments, are given for single-attribute Legendre polynomial aggregates. Multivariate Legendre aggregates are also discussed, and suggestions for future research are given.
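One way single-attribute Legendre aggregates can serve statistical queries is sketched below. This is our own minimal illustration under stated assumptions, not the paper's formulation: data is assumed pre-scaled to [-1, 1], the stored summary is the sums a_k = Σ_i P_k(x_i) for low degrees k, and a range-count query is answered approximately by integrating the density reconstructed from those sums.

```python
# Sketch: store low-order Legendre aggregates a_k = sum_i P_k(x_i) over data
# scaled to [-1, 1]; answer a range-count query approximately from the
# reconstructed density f(x) ~ sum_k (2k+1)/2 * (a_k / n) * P_k(x).

def legendre(k, x):
    """P_k(x) via the standard three-term recurrence."""
    p0, p1 = 1.0, x
    if k == 0:
        return p0
    for n in range(1, k):
        p0, p1 = p1, ((2 * n + 1) * x * p1 - n * p0) / (n + 1)
    return p1

def build_aggregates(xs, order):
    """The stored summary: one number per polynomial degree 0..order."""
    return [sum(legendre(k, x) for x in xs) for k in range(order + 1)]

def estimate_count(aggs, n, lo, hi, steps=1000):
    """Approximate #records in [lo, hi] by midpoint integration of the
    reconstructed density; no access to the base data is needed."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        fx = sum((2 * k + 1) / 2.0 * (a / n) * legendre(k, x)
                 for k, a in enumerate(aggs))
        total += fx * h
    return n * total
```

The tradeoff the abstract describes is visible here: the estimate costs no scan of the base data, but its error grows as the stored aggregates go stale or as the truncation order is reduced.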
The main function of a data warehouse is the separation of the decision layer from the operation layer, so that users can invoke analysis, planning, and decision support applications without having to worry about constantly evolving operational databases. Such applications allow ad hoc queries for which no predefined reports exist. An ad hoc query may be submitted by different users, or even by the same user at different times, requiring repeated evaluations even though the contents of the warehouse have not changed in between. In this work, we propose an enhancement to the data warehouse structure, building in additional intelligence in the form of an adaptive and efficient query cache. The cache contains a list of recently executed ad hoc queries and their answers. Whenever possible, a new query is satisfied by answers already stored in the cache, thereby avoiding potentially large data scans. We discuss issues related to organizing and searching the query cache. In particular, we outline subsumption detection algorithms for a number of different situations that allow a quick decision on whether the cache can be used to evaluate an arriving query.
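The subsumption test at the heart of such a cache can be sketched for the simplest situation, one-dimensional range predicates. This is an illustrative sketch only; the paper's algorithms cover more situations, and the class and method names here are ours.

```python
# Minimal sketch of query-cache subsumption for range queries: a cached query
# can answer a new one if the cached range contains the new range, in which
# case the answer is a filter over cached rows rather than a warehouse scan.

class QueryCache:
    def __init__(self):
        self.entries = []   # list of ((lo, hi), rows)

    def put(self, lo, hi, rows):
        """Remember the answer to the range query [lo, hi]."""
        self.entries.append(((lo, hi), rows))

    def try_answer(self, lo, hi, key=lambda r: r):
        """Return rows if some cached range subsumes [lo, hi], else None."""
        for (clo, chi), rows in self.entries:
            if clo <= lo and hi <= chi:          # subsumption test
                return [r for r in rows if lo <= key(r) <= hi]
        return None   # cache miss: the warehouse must be scanned
```

The design choice this illustrates is that subsumption detection must be much cheaper than the query itself; a containment check on predicates is constant-time per cache entry, while the avoided scan may touch the whole warehouse.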
Creating an Enterprise 2.0 extends from employee content creation and collaboration to building an architecture that enables mobility and, where possible, allows for emerging technologies and methodologies. Yet, in many cases, implementing this new architecture may meet resistance, not from knowledge workers but from IT managers who are apprehensive about potential security compromises, the proliferation of multiple standards, and the delegation of procurement authority to individual users or business units. In this research article, the authors describe a structured and deliberate approach toward building an Enterprise 2.0 environment. Many elements of the approach were defined and developed at Vanguard, Inc. as it built an integrated communication and collaboration environment over several years. The authors emphasize the process followed and the lessons learned.
Functional dependencies are the most commonly used approach for capturing real-world integrity constraints that are to be reflected in a database. There are, however, many useful kinds of constraints, especially approximate ones, that cannot be represented correctly by functional dependencies and are therefore enforced via programs that update the database, if they are enforced at all. This tends to make such constraints invisible, since they are not an explicit part of the database, increasing maintenance problems and the likelihood of inconsistencies. We propose a new approach, cluster dependencies, as a way to enforce approximate dependencies. By treating equality as a fuzzy concept and defining appropriate similarity measures, it is possible to represent a broad range of approximate constraints directly in the database by storing and accessing cluster definitions. We discuss different interpretations of cluster dependencies and describe the additional data structures needed to enforce them. We also contrast them with an existing approach, fuzzy functional dependencies, which is much more limited in the kinds of approximate constraints it can represent.
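The flavor of an approximate dependency can be sketched as follows. This is our own illustration under one possible interpretation, not the paper's definition: similarity is taken as closeness within a tolerance, and the dependency X ~> Y requires that rows with similar X values also have similar Y values. A classical functional dependency is then the special case where both tolerances are zero.

```python
# Hypothetical illustration of a cluster dependency X ~> Y: whenever two rows
# have similar X values (within eps_x), their Y values must also be similar
# (within eps_y). Attribute names and tolerances below are illustrative.

def violates_cluster_dependency(rows, x, y, eps_x, eps_y):
    """Return index pairs of rows that break the approximate dependency."""
    bad = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if abs(rows[i][x] - rows[j][x]) <= eps_x and \
               abs(rows[i][y] - rows[j][y]) > eps_y:
                bad.append((i, j))
    return bad
```

Storing such a check declaratively (via cluster definitions) rather than burying it in update programs is exactly the visibility gain the abstract argues for.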
IEEE Transactions on Knowledge and Data Engineering, 1998
A database allows its users to reduce uncertainty about the world. However, not all properties of all objects can always be stored in a database. As a result, the user may have to use probabilistic inference rules to estimate the data required for his decisions. A decision based on such estimated data may not be perfect. The authors call the …
From the Publisher: In this groundbreaking book, two acknowledged experts explore the underlying principles of systems integration and, with the help of numerous case studies, show IT managers, systems analysts, and project managers how to apply those principles to solving complex business problems. The authors reveal the linkages between business processes and how they can be supported in enterprise-wide integrated systems. Rather than review specific products and tools, the authors use real-life examples to provide readers with a practical understanding of integrated system architectures and how they function within the framework of an Enterprise Planning System.
This paper presents a methodology for trading off the cost of incomplete information against the data-related costs in the design of database systems. It investigates how the usage patterns of the database, defined by the characteristics of information requests presented to it, affect its conceptual design. The construction of minimum-cost answers to information requests for a variety of query types and cost structures is also studied. The resulting costs of incomplete database information are balanced against the data-related costs in the derivation of the optimal design.
ACM SIGMIS Database: the DATABASE for Advances in Information Systems, 2000
The main function of a data warehouse is the separation of the decision layer from the operation layer, so that users can invoke analysis, planning, and decision support applications without having to worry about constantly evolving operational databases. Such applications allow ad hoc queries for which no predefined reports exist. An ad hoc query may be submitted by different users, or even by the same user at different times, requiring repeated evaluations even though the contents of the warehouse have not changed in between. In this work, we propose an enhancement to the data warehouse structure, building in additional intelligence in the form of an adaptive and efficient query cache. The cache contains a list of recently executed ad hoc queries and their answers. Whenever possible, a new query is satisfied by answers already stored in the cache, thereby avoiding potentially large data scans. We discuss issues related to organizing and searching the query cache. In particular, we outline subsumption detection algorithms for a number of different situations that allow a quick decision on whether the cache can be used to evaluate an arriving query.
In a typical database application, it is commonly assumed that user information requirements can only be satisfied by the most current data. The desirable attributes of information often include currency, timeliness, and accuracy, with the implicit assumption that without these attributes a response to a query has little or no value. This assumption is challenged in this work. We consider the tradeoff between the cost of incomplete information, due to the use of stale data, and the incremental cost of providing a current answer. We propose that the database system be extended to include a data cache, in which copies of frequently needed data are kept. Objects in the cache are not updated as the database changes, but rather are refreshed whenever the cost of using stale data exceeds some prespecified level. We also discuss alternative refresh policies and cache search schemes.
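The cost-triggered refresh policy can be sketched as follows. This is a minimal sketch under our own assumptions, not the paper's mechanism: the class and parameter names are ours, and for simplicity the sketch reads the current value to score staleness, whereas a real system would rely on a cheap estimate of drift (otherwise the cache would save nothing).

```python
# Sketch of a staleness-cost refresh policy: a cached object is served as-is
# until the cost of its staleness exceeds a threshold, then it is refreshed.
# Names (DataCache, staleness_cost, threshold) are illustrative.

class DataCache:
    def __init__(self, fetch, staleness_cost, threshold):
        self.fetch = fetch                    # reads the current value
        self.staleness_cost = staleness_cost  # cost(cached, current)
        self.threshold = threshold            # prespecified tolerance
        self.store = {}                       # the cached copies

    def get(self, key):
        # NOTE: a real system would estimate drift cheaply rather than
        # fetch the current value just to decide whether to refresh.
        current = self.fetch(key)
        if key not in self.store or \
           self.staleness_cost(self.store[key], current) > self.threshold:
            self.store[key] = current         # refresh the stale copy
        return self.store[key]
```

The threshold makes the tradeoff explicit: raising it tolerates more staleness in exchange for fewer refreshes, which is the knob the abstract's refresh policies tune.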
Customers are scheduled to arrive at periodic scheduling intervals to receive service from a single-server system. A customer must start receiving service within a given departure interval; if this is not the case, the system will pay a penalty and/or transfer the customer to another facility at the system's cost. A complete transient and steady-state analysis of the system …
IEEE Transactions on Knowledge and Data Engineering, 1994
The exact expression for the expected number of disk accesses required to retrieve a given number of records, called the Yao function, requires iterative computations. Several authors have developed approximations to the Yao function, all of which have substantial errors in some situations. We derive and evaluate simple upper and lower bounds that never differ by more than a small fraction of a disk access.
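For concreteness, the exact Yao function in its standard form is shown below; the iterative product is the computation the paper's closed-form bounds avoid. The bounds themselves are not reproduced here. The sketch assumes the block count m divides the record count n.

```python
# The Yao function: retrieving k of n records stored in m equal blocks touches,
# in expectation, Yao(n, m, k) = m * (1 - prod_{i=0}^{k-1} (n - n/m - i)/(n - i))
# block (disk) accesses. The product makes the exact value iterative to compute.

def yao(n, m, k):
    """Expected number of block accesses to fetch k of n records in m blocks."""
    p = n // m            # records per block (assumes m divides n)
    prod = 1.0
    for i in range(k):
        prod *= (n - p - i) / (n - i)
    return m * (1.0 - prod)
```

Sanity checks on the boundary cases: with k = 1 the expectation is k/p-independent, yao(100, 10, 1) = 1 access; with k = n every block is touched, so yao(n, m, n) = m.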
From an internal audit perspective, enterprise systems have created new opportunities and challenges in managing internal as well as external risks. In this work, we report the results of a survey that examines internal auditors' ability to identify and manage operational, financial, technological, compliance, and other risks as the organization migrates to an ERP environment. Our findings show that internal auditors perceive a reduction in financial and operational risk and an increase in technical risks. These effects are somewhat mitigated by their ability to assess and manage these risks. We also find that internal audit departments satisfied their needs for ERP skills not by outsourcing but by providing staff with in-house training.
A key element in all decision support systems is the availability of sufficiently good and timely data to support the decision-making process. Much research was, and is, devoted to data and information quality: its attributes, assurance that quality data is used in the decision process, and so on. In this paper we concentrate on a particular dimension of data availability and usage: the retrieval of data in a timely and decision-enhancing manner. We propose to augment decision support databases with an adaptive and efficient query cache. The cache contains snapshots of the decision support database, each being the answer to a recently invoked query. A snapshot can be reused by the originating user, or a different user, at a later time, provided the use of cached data leads to savings over the use of a new query and these savings exceed the cost of using stale data. The proposed scheme is conceptually different from conventional data replication schemes. In data replication schemes, the data item …
Papers by Aditya Saharia