1995 Technical Reports | George Mason Department of Computer Science

Alessandro D'Atri, Amihai Motro, and Laura Tarantino

ISSE-TR-95-15

ViewFinder is a graphical tool for browsing in databases that provides a flexible, yet intuitive environment for exploratory searches. The design approach has been to provide maximum functionality and generality without sacrificing simplicity. The constructs of ViewFinder's external model are essentially object-oriented: class and token objects, membership relationships between tokens and classes, generalization relationships between classes, inheritance, and so on. This external model is based on an internal model which resembles a semantic network. Such a network may be extracted from a variety of data models, including object-oriented, entity-relationship and extended relational models. This architecture gives ViewFinder a large degree of model independence. The main construct of the external model are displays of objects (either classes or tokens), called views. Commands are available for efficient traversal of the database, displaying views of any class or token. To speed up repetitive searches, views may be synchronized: the user sets up several views, linked in a tree-like structure, so that when the information displayed in the root view is modified (e.g., scrolled by the user), the contents of the other views change automatically. Additional commands are available to search, order, aggregate and select the information displayed in a view, thus providing a simple query facility.

Applying Formal Methods to Semantic-Based Decomposition of Transactions

Paul Ammann, Sushil Jajodia, and Indrakshi Ray

ISSE-TR-95-14

In some important database applications, performance requirements are not satisfied by the traditional approach of serializability, in which transactions appear to execute atomically and in isolation on a consistent database state. Although many researchers have investigated the process of decomposing transactions into steps to increase concurrency, the focus of the research is typically on implementing a decomposition supplied by the database application developer, with relatively little attention to what constitutes a desirable decomposition and how the developer should obtain such a decomposition. In this paper, we focus on the decomposition process itself. A decomposition generates a set of proof obligations that must be satisfied to show that the decomposition correctly models the original collection of transactions. We introduce the notion of semantic histories to formulate and prove the necessary properties, and the notion of successor sets to describe efficiently the correct interleavings of steps. The successor set constraints use information about conflicts between steps so as to take full advantage of conflict serializability at the level of steps. We implement the resulting, verified decomposition in a two-phase locking environment.

Advanced Transaction Processing in Multilevel Secure File Stores

Elisa Bertino, Sushil Jajodia, Luigi Mancini, and Indrajit Ray

ISSE-TR-95-13

The concurrency control requirements for transaction processing in a multilevel secure file system are different from those in conventional transaction processing systems; in particular, there is the need to coordinate transactions at different security levels avoiding both potential timing covert channels and the starvation of transactions at high security levels. Suppose a transaction at a low security level attempts to write a data item that is being read by a transaction at a higher security level. On the one hand, a timing covert channel arises if the transaction at the low security level is either delayed or aborted by the scheduler. On the other hand, the transaction at the high security level may be subjected to an indefinite delay if it is forced to abort repeatedly. This paper extends the two-phase locking mechanism that is suitable for conventional transaction processing systems, to multilevel secure data management systems. The scheme presented here, avoids the abort of higher level transactions, nonetheless guaranteeing serializability. The system programmer is provided with a powerful set of linguistic constructs which supports exception handling, partial rollback and forward recovery. The proper use of these constructs can prevent the indefinite delay in completion of a high level transaction, and allows the system programmer to trade off starvation with transaction isolation.

Integrating Testing with the Software Development Process

Zhenyi Jin and A. Jefferson Offutt

ISSE-TR-95-12

Software testing is usually postponed to the end of the development, after coding has started, or even after it has ended. By waiting until this late in the process, testing winds up being compressed -- there are not enough resources (time and budget) remaining, problems with previous stages have been solved by taking time and dollars from testing, and there is insufficient time to adequately plan for testing. In this paper, we present an integrated approach to testing software, where testing activities begin as soon as development activities begin, and are carried out in parallel with the development activities. We present specific activities -- including planning, active testing, and development influencing activities -- that are associated with each of the traditional lifecycle phases. These activities can be carried out by the developers or separate test engineers, and can be associated with development activities within the confines of any development process. These activities allows the tester to detect and prevent throughout the software development process, leading to more reliable and higher quality software.

Finding Mode Invariants in SCR Specifications

Zhenyi Jin

ISSE-TR-95-11

This paper presents an algorithm to derive modeclass invariants from SCR specifications. Matrices are formed to help to find the invariants. A case study of a cruise control system shows that the algorithm is capable of determining the same invariants that were proved via model checking in earlier investigations. This algorithm provides a mechanical procedure that are much easier and clearer for computing invariants than computing from reachability graphs. It can be easily automated and handle big systems. It views the SCR specifications from the SCR structure's point of view and analyzes mode invariants as structural properties. It presents a new way to analyze software requirements documents and detects implicitly implied system property information that can be very useful and important in understanding, developing, and testing the software system. Related issues such as generating test cases that are independent of the design structure based on the SCR specifications are discussed to reveal possible research problems and directions in the future.

A Semantic Model of Program Faults

Jeff Offutt and Jane Hayes

ISSE-TR-95-10

Program faults are artifacts that are widely studied, but there are many aspects of faults that we still do not understand. In addition to the simple fact that one important goal during testing is to detect faults, a full understanding of the characteristics of faults is crucial to several research areas in testing. These include fault-based testing, testability, mutation testing, and the comparative evaluation of testing strategies. In this preliminary paper, we explore the fundamental nature of faults by looking at the differences between a syntactic and semantic characterization of faults. We offer definitions of these characteristics and explore the differentiation. Specifically, we discuss the concept of "size" of program faults -- the measurement of size provides interesting and useful distinctions between the syntactic and semantic characterization of faults. We use the fault size observations to make several predictions about testing and present data that supports this model. We also use the model to offer explanations about several questions that have intrigued testing researchers.

Data and Information Architectures for Large-Scale Distributed Data Intensive Information Systems

Larry Kerschberg, Hassan Gomaa, Daniel A. Menasce

ISSE-TR-95-09

No abstract or online report available.

A Software Architectural Design Method for Large-Scale Distributed Data Intensive Information Systems

Hassan Gomaa, Larry Kerschberg, Daniel A. Menasce

ISSE-TR-95-08

This paper describes a Software Architectural Design Method for Large-Scale Distributed Data Intensive Information Systems. The method starts by developing a domain model of the system. The domain model is then mapped to a federated client/server software architecture, where the servers need to cooperate with each other to service client requests. To demonstrate the viability of the architecture, specific user scenarios are applied to the federated client / server software architecture using event sequence diagrams. The method is illustrated by applying it to the design of a complex software system, the Earth Observing System Data and Information System (EOSDIS) Core System.

A Performance Oriented Design Methodology for Large-Scale Distributed Data Intensive Information Systems

Daniel A. Menasce', Hassan Gomaa, and Larry Kerschberg

ISSE-TR-95-07

The Earth Observing System (EOS) Data and Information System (EOSDIS) is perhaps one of the most important examples of large-scale, geographically distributed, and data intensive systems. Designing such systems in a way that guarantees that the resulting design will satisfy all functional and performance requirements is not a trivial task. This paper presents a performance-oriented methodology to design large-scale distributed data intensive information systems. The methodology is then applied to the design of EOSDIS Core System (ECS). Performance results, based on queuing network models of ECS are also presented.

Using Formal Methods To Reason About Semantics-Based Decompositions Of Transactions

Paul Ammann, Sushil Jajodia, and Indrakshi Ray

ISSE-TR-95-06

Many researchers have investigated the process of decomposing transactions into smaller pieces to increase concurrency. The research typically focuses on implementing a decomposition supplied by the database application developer, with relatively little attention to what constitutes a desirable decomposition and how the developer should obtain such a decomposition. In this paper, we argue that the decomposition process itself warrants attention. A decomposition generates a set of proof obligations that must be satisfied to show that a particular decomposition correctly models the original collection of transactions. We introduce the notion of semantic histories to formulate and prove the necessary properties. Since the decomposition impacts not only the atomicity of transactions, but isolation and consistency as well, we present a technique based on formal methods that allows these properties to be surrendered in a carefully controlled manner. Note: The text of this paper appears in VLDB '95, Proceedings of the Twenty-First International Conference on Very Large Databases, Zurich, Switzerland, September, 1995. In addition, the appendices of this report contain proofs that space constraints precluded from the VLDB paper.

Temporal Semantic Assumptions and Their Use in Database Query Evaluation

Claudio Bettini, X. Sean Wang and Sushil Jajodia

ISSE-TR-95-05

Temporal data explicitly stored in a temporal database are often associated with certain semantic assumptions. Each assumption can be viewed as a way of deriving implicit information from the explicitly stored data. Rather than leaving the task of deriving (possibly infinite) implicit data to application programs, as is the case currently, it is desirable that this be handled by the database management systems. To achieve this, this paper formalizes and studies two types of semantic assumptions: point-based and interval-based. The point-based assumptions include those assumptions that use interpolation methods, while the interval-based assumptions include those that involve different temporal types (time granularities). In order to incorporate semantic assumptions into query evaluation, this paper introduces a translation procedure that converts a user query into a system query such that the answer of this system query over the explicit data is the same as that of the user query over the explicit AND the implicit data. The paper also investigates the finiteness (safety) of user queries and system queries.

The Problem of Optimal Approximations of Queries Using Views and Its Applications

Alexander Brodsky and Amihai Motro

ISSE-TR-95-04

In several recent works a common scenario is described: a relational database scheme R and a set of views V of this scheme are defined, and queries on the database scheme R must be transformed to equivalent queries on the views V. Given a query Q on R, the first problem is whether there exists a query Q_V expressed exclusively on V which is equivalent to Q, and, if it exists, how to find it. Clearly, an equivalent Q_V may not always exist. In such a case, it is important to approximate Q as best as possible with a query on V. The problems here are to define approximations, to determine whether a best approximation exists, whether it is unique, and how to find it. In this paper we formalize and answer these questions. Specifically, we show that the following problems are decidable for a conjunctive query Q and a set of conjunctive views V: (1) is there a conjunctive query Q_V on V that is equivalent to Q, and (2) is there a union Q_U of conjunctive queries on V that is equivalent to Q? Furthermore, we show that Q_V and Q_U are effectively computable if they exist. Moreover, the greatest lower bound of Q exists and is unique up to equivalence. This is done by introducing lower bound classifications, and proving their existence, finiteness and effective computability. This problem has many interesting applications, including physical database design (aimed at optimization of query processing), cooperative answering (where answers are annotated with some of their properties), and multidatabase query decomposition. We examine these applications, and we show how our work extends earlier results obtained in these applications.

Multiplex: A Formal Model for Multidatabases and Its Implementation

Amihai Motro

ISSE-TR-95-03

The integration of information from multiple databases has been an enduring subject of research for almost 20 years, and many different solutions have been attempted or proposed. Missing from this research has been a uniform framework. Usually, each solution develops its own ad-hoc framework, designed to address the particular aspects of the problem that are being attacked and the particular methodology that is being used. To address this situation, in this paper we define a formal model for multidatabases, which we call Multiplex. Mulitplex is a simple extension of the relational model, which may serve as a uniform abstraction for many previous ad-hoc solutions. Multiplex is based on formal assumptions of integrability, which distinguish between scheme and instance reconcilability among independent databases. Multiplex supports database heterogeneity, and it provides several degrees of freedom that allow it to model actual situations encountered in multidatabase applications. In addition, in situations in which a single answer is not obtainable (either because the global query is not answerable, or there are multiple candidate answers), Multiplex defines approximative answers. Finally, Multiplex provides a practical platform for implementation. A prototype of such an implementation is described briefly.

The Effectiveness of Category-Partition Testing of Object-Oriented Software

Alisa Irvine and A. Jefferson Offutt

ISSE-TR-95-02

When migrating from conventional to object-oriented programming, developers face difficult decisions in modifying their development process to best use the new technology. In particular, ensuring that the software is highly reliable in this new environment poses different challenges. Developers need understanding of effective ways to test the software. This paper presents empirical data that show that the existing technique of category-partition testing can effectively find faults in object-oriented software, and new techniques are not necessarily needed. For this study, we identified types of faults that are common to C++ software and inserted faults of these types into two C++ programs. Test cases generated using the category-partition method were used to test the programs. A fault was considered detected if it caused the program to terminate abnormally or if the output was different from the output of the original program. The results show that the combination of the category-partition method and a tool for detecting memory management faults may be effective for testing C++ programs in general. Since there is no evidence that traditional techniques are not effective, software developers may not need new testing methods when migrating to object-oriented development.

Multilevel Relational (MLR) Data Model

Fang Chen and Ravi S. Sandhu

ISSE-TR-95-01

Many multilevel data models have been proposed, and different models have different merits. In this paper, we unify many of these merits to build a new multilevel data model as a natural extension of the traditional relational data model. We describe our new model called the Multilevel Relational (MLR) data model for multilevel relations with element-level labeling. The MLR model builds upon prior work of numerous authors in this area, and integrates ideas from a number of sources. The new {\em data-based} semantics is given to the MLR data model which combines ideas from SeaView, belief-based semantics and LDV model, and has the advantages of both eliminating ambiguity and retaining upward information flow. The model is simple, unambiguous and powerful. It has five integrity properties and five operation statements for manipulating multilevel relations. In order to support this integration, several new concepts are introduced and many old ones are redefined. A central contribution of this paper is proofs of soundness, completeness and security of the MLR data model. These proofs respectively show that any of the provided operation statements can keep database state legal (i.e., satisfying all integrity properties), every legal database state can be constructed, and the MLR data model is non-interfering. This is the first time that such properties have been formally proved for a multilevel relational data model. The expressive power of the MLR model is also discussed in this paper, and is compared with several other models.

Integration Testing Based on Software Couplings

Zhenyi Jin and A. Jefferson Offutt

ISSE-TR-95-00

Integration testing is an important part of the testing process, but few integration testing techniques have been systematically studied or defined. This paper presents an integration testing technique based on couplings between software components. The coupling-based testing technique is described, and 12 coverage criteria are defined. The coupling-based technique was compared with the category-partition method. Results shows that the coupling-based technique detected more faults, and used fewer test cases than category-partition on a subject program. This modest result indicates that the coupling-based testing approach can benefit practitioners who are performing integration testing on software.