ISE-TR-05-06
We have proposed and implemented the language CoJava,
which offers both the advantages of simulation-like process modeling in
Java, and the capabilities of true decision optimization. By design, the
syntax of CoJava is identical to the programming language Java, extended with special constructs to (1) make a non-deterministic choice
of a numeric value, (2) assert a constraint, and (3) designate a program
variable as the objective to be optimized. The semantics of CoJava interprets a program as an optimal nondeterministic execution path, namely,
a path that (1) satisfies the range conditions in the choice statements,
(2) satisfies the assert-constraint statements, and (3) produces the optimal value in a designated program variable, among all execution paths
that satisfy (1) and (2). To run a CoJava program amounts to first finding an optimal execution path, and then procedurally executing it. We
have developed a CoJava constraint compiler based on a reduction of
the problem of finding an optimal execution trace to a standard symbolic formulation, reinterpreting Java code as a symbolic constraint construction, and solving the resulting optimization problem on an external
solver. To demonstrate the power of CoJava, we have implemented a realistic problem in the area of robot arm control in CoJava. The robot arm
is constructed using self-contained components implemented as CoJava
classes that model the robot arm's movements based on Newton's laws.
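As a concrete illustration, here is a minimal sketch of what a CoJava-style program might look like, written as plain Java with stub methods standing in for the three special constructs. The names choose, require, and minimize, and the toy arm model, are hypothetical: the abstract does not give CoJava's concrete syntax.

```java
// Illustrative only: plain-Java stand-ins for the three CoJava constructs
// described above; names and the toy arm "physics" are hypothetical.
public class RobotArmSketch {
    // (1) non-deterministic choice of a numeric value within a range
    static double choose(double lo, double hi) {
        // In CoJava this would introduce a symbolic decision variable;
        // here we return the midpoint so the stub executes procedurally.
        return (lo + hi) / 2.0;
    }
    // (2) assert a constraint that every admissible execution path must satisfy
    static void require(boolean constraint) {
        if (!constraint) throw new IllegalStateException("constraint violated");
    }
    // (3) designate a program variable as the objective to be optimized
    static void minimize(double objective) { /* in CoJava, handed to the solver */ }

    public static void main(String[] args) {
        double torque = choose(0.0, 10.0);  // decision variable
        double angle  = 0.5 * torque;       // toy Newton-style arm response
        require(angle <= 4.0);              // safety constraint on the arm
        double energy = torque * torque;    // cost of the movement
        minimize(energy);                   // objective among all valid paths
    }
}
```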
ISE-TR-05-05
A vital part of a modern economy is its information market, in which information products are traded in countless ways. Information is bought, modified,
integrated, incorporated into other products, and then sold again. Often, the manufacturing of an information product requires the collaboration of several participants. A
virtual enterprise is a community of business entities that collaborate on the manufacturing of complex products. This collaboration is often ad hoc, for a specific product
only, after which the virtual enterprise may dismantle. The virtual enterprise paradigm is particularly appealing for modeling collaborations for manufacturing information products, and in this paper we present a new model, called VirtuE, for modeling
such activities. VirtuE has three principal components. First, it defines a distributed
infrastructure with concepts such as members, products, inventories, and production
plans. Second, it defines transactions among members, to enable collaborative production of complex products. Finally, it provides means for the instrumentation of
enterprises, to measure their performance and to govern their behavior.
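To make the three components concrete, the following hypothetical Java sketch maps them onto code; all class names, fields, and the logging scheme are invented for illustration and are not the paper's actual VirtuE model.

```java
// A hypothetical sketch of VirtuE's three components; names are invented.
import java.util.*;

public class VirtuESketch {
    // 1. Distributed infrastructure: members hold inventories of products.
    static class Product {
        final String id;
        Product(String id) { this.id = id; }
    }
    static class Member {
        final String name;
        final List<Product> inventory = new ArrayList<>();
        Member(String name) { this.name = name; }
    }
    // 3. Instrumentation: a transaction log from which performance can be
    // measured and behavior governed.
    static final List<String> transactionLog = new ArrayList<>();
    // 2. Transactions among members: one member supplies a product to another.
    static void transact(Member seller, Member buyer, Product p) {
        seller.inventory.remove(p);
        buyer.inventory.add(p);
        transactionLog.add(seller.name + " -> " + buyer.name + ": " + p.id);
    }

    public static void main(String[] args) {
        Member a = new Member("A"), b = new Member("B");
        Product report = new Product("market-report");
        a.inventory.add(report);
        transact(a, b, report);              // one collaborative production step
        transactionLog.forEach(System.out::println);
    }
}
```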
ISE-TR-05-04
This paper presents a single-project experiment on the fault-revealing capabilities of model-based
test sets. The tests are generated from UML statecharts and UML sequence diagrams. This experiment
found that the statechart test sets did better at revealing unit level faults than the sequence diagram test
sets, and the sequence diagram test sets did better at revealing integration level faults than the statechart
test sets. The statecharts also resulted in more test cases than the sequence diagrams. The experiment showed that model-based testing can be used to systematically generate test data, and it indicated that different UML models can play different roles in testing.
ISE-TR-05-03
We consider the process of manufacturing a product from a given number of elementary components. By assembling intermediate products, the target product can
be manufactured in a variety of processes, each modeled by a tree. We are interested
in manufacturing turnaround: the time between receiving an order at the root and its
completion. We express the turnaround time of each manufacturing process (tree) with
a formula that incorporates three parameters: the time required to create elementary
components, the time required to assemble a product from its components and the time
required to deliver the product to its procurer (another manufacturer). We show that
this turnaround formula is optimized in a manufacturing process that corresponds to
a perfect (or nearly perfect) tree. The degree of the optimal tree (i.e., the ideal number
of components in each sub-assembly) is shown to be independent of the number of
elementary components, suggesting that in each manufacturing environment there is
an ideal assembly size, which is optimal for the manufacturing of products of any scale.
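The abstract does not state the turnaround formula itself, so the following Java sketch assumes one plausible form purely for illustration: for a perfect k-ary process tree over n elementary components, T(k) = c + ceil(log_k n) * (a*k + d), where c is the component-creation time, a the per-component assembly time, and d the delivery time. Under this assumed formula, a brute-force search over the degree k shows the same optimal degree at different product scales, in the spirit of the paper's result.

```java
// Illustrative only: the paper's actual turnaround formula is not given in
// the abstract; this sketch assumes T(k) = c + ceil(log_k n) * (a*k + d).
public class TurnaroundSketch {
    static double turnaround(int k, int n, double c, double a, double d) {
        double depth = Math.ceil(Math.log(n) / Math.log(k)); // tree height
        return c + depth * (a * k + d);
    }

    public static void main(String[] args) {
        double c = 1.0, a = 0.5, d = 2.0;       // assumed parameter values
        for (int n : new int[] {64, 4096}) {    // two product scales
            int bestK = 2;
            for (int k = 2; k <= 16; k++)       // candidate assembly sizes
                if (turnaround(k, n, c, a, d) < turnaround(bestK, n, c, a, d))
                    bestK = k;
            System.out.println("n=" + n + "  best degree=" + bestK);
        }
    }
}
```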
ISE-TR-05-02
We describe an architecture for a database service that does not assume that the
service provider can be trusted. Unlike other architectures that address this problem,
this architecture, which we call blind custodians, does not rely on encryption. Instead,
it offers confidentiality by means of information dissociation: The server only stores
"fragments" of information that are considered safe (i.e., no single fragment violates privacy), while the client stores the associations between the fragments that are
necessary to reconstruct the information. We argue that this architecture allows satisfactory confidentiality, while offering two important advantages: (1) It does not restrict
the types of queries that can be submitted by clients (as encryption-based methods invariably do), and (2) it requires only light processing at the client, assigning the bulk
of the processing to the server (as befits a true service). Moreover, the architecture
permits flexible control over the level of confidentiality that should be maintained (at
the cost of additional overhead).
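A toy Java sketch of the dissociation idea follows; the record split, id scheme, and in-memory maps are invented for illustration and say nothing about the paper's actual fragment design.

```java
// A minimal sketch of information dissociation with an invented record split.
import java.util.*;

public class BlindCustodianSketch {
    // Untrusted server: stores individually-safe fragments under opaque ids.
    static final Map<String, String> server = new HashMap<>();
    // Trusted client: stores only the associations between fragment ids.
    static final List<String[]> clientAssociations = new ArrayList<>();

    public static void main(String[] args) {
        // A record whose fields are safe alone but sensitive together.
        String name = "Alice", diagnosis = "flu";
        String id1 = UUID.randomUUID().toString();
        String id2 = UUID.randomUUID().toString();
        server.put(id1, name);                            // fragment 1
        server.put(id2, diagnosis);                       // fragment 2
        clientAssociations.add(new String[] {id1, id2});  // link kept off-server

        // Reconstruction: the client resolves the association; the server
        // does the bulk of the storage and retrieval work.
        for (String[] link : clientAssociations)
            System.out.println(server.get(link[0]) + " has " + server.get(link[1]));
    }
}
```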
ISE-TR-05-01
Finding intensional encapsulations of database subsets is the inverse of query evaluation. Whereas query evaluation transforms an intensional expression (the query) to
its extension (a set of data values), intensional encapsulation assigns an intensional
expression to a given set of data values. We describe a method for deriving intensional
representations of subsets of records in large database tables. Our method is based
on the paradigm of genetic programming. It is shown to achieve high accuracy and maintain compact expression size, at a cost that is acceptable for all applications except those requiring instantaneous results. Intensional encapsulation has a
broad range of applications including cooperative answering, information integration,
security and data mining.
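To convey the search loop, here is a drastically simplified stand-in in Java: a (1+1) evolutionary search over one-attribute threshold predicates ("value <= t") describing a target subset of a toy table. The paper's genetic programming evolves full intensional expressions; only the evolve-and-select idea carries over, and the table and fitness function here are invented.

```java
// A (1+1) evolutionary search over threshold predicates; a simplified
// stand-in for the paper's genetic programming over full expressions.
import java.util.*;

public class EncapsulationSketch {
    static final double[] table = {1, 2, 3, 4, 10, 11, 12};   // one-column table
    static final boolean[] target = {true, true, true, true, false, false, false};

    // Accuracy of the predicate "value <= t" against the target subset.
    static double fitness(double t) {
        int ok = 0;
        for (int i = 0; i < table.length; i++)
            if ((table[i] <= t) == target[i]) ok++;
        return (double) ok / table.length;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        double best = rnd.nextDouble() * 15;                   // random start
        for (int gen = 0; gen < 200; gen++) {
            double child = best + rnd.nextGaussian();          // mutation
            if (fitness(child) >= fitness(best)) best = child; // selection
        }
        System.out.printf("evolved predicate: value <= %.2f (accuracy %.2f)%n",
                          best, fitness(best));
    }
}
```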
GMU-CS-TR-2005-8
Evolutionary computation has proven its utility in automating the process of engineering design. However,
little attention has been paid to the scalability of the generated designs, an important open issue. This paper addresses this issue and demonstrates the viability of evolving families of designs using parameterized L-Systems as a representation. The rest of the paper is organized as follows: first, an introduction explaining why scalability is important and difficult; second, a review of existing work on evolving L-Systems; the third section describes the application domain used for this feasibility study and details the L-Systems and the EA used; section four presents the experiments conducted and their results; the paper ends with a discussion that draws conclusions and sets goals for future work.
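The following minimal Java sketch illustrates how a single parameter in a parameterized L-System can act as the scaling knob for a whole family of designs; the rewrite rule is invented for this sketch and is not taken from the paper.

```java
// A minimal parameterized L-System: one genotype, a family of design sizes.
public class LSystemSketch {
    // Invented rule: A(n) -> [ B A(n-1) ] while n > 0; A(0) -> B.
    // Reading B as a structural element, the parameter n controls scale.
    static String expand(int n) {
        if (n <= 0) return "B";
        return "[B " + expand(n - 1) + "]";
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 5})
            System.out.println("A(" + n + ") -> " + expand(n));
    }
}
```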
GMU-CS-TR-2005-7
The data in vision problems, often heavily contaminated
by outliers, call for efficient robust techniques to identify inliers for correct estimation. The RANSAC algorithm is a frequently used robust estimator for computer vision problems. In the traditional RANSAC scheme, when the data contain a significant fraction of outliers, a large number of samples is needed in order to obtain at least one outlier-free sample. In addition, each hypothesis generated from the samples is typically evaluated using all data points, which further lowers the efficiency. In this
paper, we propose a novel hypothesis evaluation scheme,
which enables efficient classification of the data points as
outliers and inliers, while requiring a small fixed number
of samples. The method is based on the observation that
for each data point the properties of the distribution of
the residuals with respect to the generated hypotheses reveal whether the point is an outlier or inlier. The problem
of inlier/outlier identification can then be formulated as
a classification problem. We demonstrate the proposed method on motion estimation problems with a large fraction of outliers, on both synthetic and real data.
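The core idea lends itself to a toy 2-D line-fitting illustration in Java: sample a small fixed number of hypotheses, collect each point's residuals to all hypotheses, and threshold a simple statistic of that distribution (here, a low-percentile residual). The statistic, threshold, and data are invented; the paper analyzes the residual distribution in a more principled way.

```java
// Toy residual-distribution classification for 2-D line fitting.
import java.util.*;

public class ResidualClassifierSketch {
    public static void main(String[] args) {
        Random rnd = new Random(7);
        int n = 60;
        double[][] pts = new double[n][2];
        for (int i = 0; i < n; i++) {
            if (i < 40) {                        // inliers on y = 2x, small noise
                double x = rnd.nextDouble() * 10;
                pts[i] = new double[] {x, 2 * x + rnd.nextGaussian() * 0.1};
            } else {                             // uniform outliers
                pts[i] = new double[] {rnd.nextDouble() * 10, rnd.nextDouble() * 30};
            }
        }
        int m = 50;                              // small fixed number of hypotheses
        double[][] res = new double[n][m];
        for (int h = 0; h < m; h++) {            // hypothesis: line through 2 points
            double[] p = pts[rnd.nextInt(n)], q = pts[rnd.nextInt(n)];
            if (Math.abs(q[0] - p[0]) < 1e-6) { h--; continue; }  // degenerate pair
            double slope = (q[1] - p[1]) / (q[0] - p[0]);
            double icept = p[1] - slope * p[0];
            for (int i = 0; i < n; i++)          // vertical residual to the line
                res[i][h] = Math.abs(pts[i][1] - (slope * pts[i][0] + icept));
        }
        int inliers = 0;
        for (int i = 0; i < n; i++) {
            double[] r = res[i].clone();
            Arrays.sort(r);                      // statistic: 10th-percentile residual
            if (r[m / 10] < 0.5) inliers++;      // illustrative threshold
        }
        System.out.println("classified " + inliers + " of " + n + " points as inliers");
    }
}
```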
GMU-CS-TR-2005-6
In many computer vision problems, several instances of
a particular model need to be recovered from the noisy
data. In such cases one is faced with the problem of simultaneous estimation of the number of models and their
parameters. This problem becomes difficult as the measurement noise in the data increases and the data are further corrupted by outliers. This is especially the case
in a variety of motion estimation problems, where the
displacement between the views is large and the process
of establishing correspondences is difficult. In this paper we propose a novel nonparametric sampling based
method for solving this problem. The main novelty of the
proposed method lies in the analysis of the distribution of
residuals of individual data points with respect to the set of hypotheses generated by a RANSAC-like sampling
process. We will show that the modes of the residual distributions directly reveal the presence of multiple models
and facilitate the recovery of the individual models, without making any assumptions about the distribution of the
outliers or the noise process. The proposed approach is capable of handling data with a large fraction of outliers.
Experiments with both synthetic and real data are presented to demonstrate the effectiveness of the proposed
approach.
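A compact sketch of the mode-finding step follows: given one data point's residuals to a set of sampled hypotheses, bin them into a histogram and report local maxima; separated modes suggest the point is consistent with distinct model instances. The synthetic residuals, bin width, and peak test are invented for illustration.

```java
// Crude histogram mode detection over a point's residual distribution.
import java.util.*;

public class ResidualModesSketch {
    public static void main(String[] args) {
        Random rnd = new Random(3);
        double[] residuals = new double[200];
        for (int i = 0; i < 200; i++)            // two clusters: two "models"
            residuals[i] = (i % 2 == 0 ? 1.0 : 6.0) + rnd.nextGaussian() * 0.3;

        int bins = 20; double width = 0.5;       // histogram over [0, 10)
        int[] hist = new int[bins];
        for (double r : residuals) {
            int b = (int) (r / width);
            if (b >= 0 && b < bins) hist[b]++;
        }
        for (int b = 1; b < bins - 1; b++)       // crude local-maximum test
            if (hist[b] > hist[b - 1] && hist[b] >= hist[b + 1])
                System.out.printf("mode near residual %.2f (count %d)%n",
                                  (b + 0.5) * width, hist[b]);
    }
}
```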
GMU-CS-TR-2005-5
In concurrent cooperative multiagent learning, each
agent in a team simultaneously learns to improve the
overall performance of the team, with no direct control
over the actions chosen by its teammates. An agent's action selection directly influences the rewards received by
all the agents; this results in a co-adaptation among the
concurrent learning processes. Co-adaptation can drive
the team towards suboptimal solutions because agents tend to select actions that are better rewarded, without regard for how such actions may affect the search of their teammates. We argue that to counter this
tendency, agents should also prefer actions that inform
their teammates about the structure of the joint search
space in order to help them choose from among various
action options. We analyze this approach in a cooperative coevolutionary framework, and we propose a new algorithm, oCCEA, that highlights the advantages of selecting informative actions. We show that oCCEA generally
outperforms other cooperative coevolution algorithms on
our test problems.
GMU-CS-TR-2005-4
Concurrent learning is a form of cooperative multiagent
learning in which each agent has an independent learning
process and little or no control over its teammates' actions. In such learning algorithms, an agent's perception
of the joint search space depends on the reward it receives, which in turn depends on the actions currently chosen by the other agents. The agents will
tend to converge towards certain areas of the space because of their learning processes. As a result, an agent's
perception of the search space may benefit if computed
over multiple rewards at early stages of learning, but
additional rewards have little impact towards the end.
We thus suggest that agents should be lenient with their
teammates: ignore many of the low rewards initially, and
fewer rewards as learning progresses. We demonstrate
the benefit of lenience in a cooperative co-evolution algorithm and in a new reinforcement learning algorithm.
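The following schematic Java sketch conveys the lenience idea in an invented two-agent matrix game: early in learning an action is credited only with the best reward over several random teammate pairings (so low rewards are ignored), and the number of pairings shrinks as learning progresses. The game and schedule are illustrative, not the paper's algorithms.

```java
// Schematic lenience: credit an action with the max reward over several
// teammate pairings, with the pairing count decreasing over time.
import java.util.*;

public class LenienceSketch {
    // Invented coordination payoff: both agents pick 0..2; matching pays.
    static double reward(int a, int b) { return a == b ? (a + 1) : 0; }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        for (int step = 0; step < 3; step++) {
            int pairings = 8 >> step;                 // 8, 4, 2: shrinking lenience
            double[] value = new double[3];
            for (int a = 0; a < 3; a++) {
                double best = Double.NEGATIVE_INFINITY;
                for (int k = 0; k < pairings; k++)    // sample teammate actions
                    best = Math.max(best, reward(a, rnd.nextInt(3)));
                value[a] = best;                       // lenient credit: max only
            }
            System.out.println("step " + step + " values " + Arrays.toString(value));
        }
    }
}
```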
GMU-CS-TR-2005-3
Can a good learner compensate for a poor learner when
paired in a coordination game? Previous work has given
an example where a special learning algorithm (FMQ) is
capable of doing just that when paired with a specific less capable algorithm, even in games which stump the poorer algorithm when paired with itself. In this paper, we argue
that this result is not general. We give a straightforward
extension to the coordination game in which FMQ cannot compensate for the lesser algorithm. We also provide other problematic pairings, and argue that another high-quality algorithm cannot do so either.
GMU-CS-TR-2005-2
The localization capability is central to basic navigation tasks and motivates the development of various visual navigation systems. These systems can serve either as navigational aids for the visually impaired or as components of autonomous mobile systems. In this paper we describe a two-stage approach for localization in indoor environments. In the first stage, the environment is partitioned into several locations, each characterized by a set of scale-invariant keypoints and their associated descriptors. In the second stage, the keypoints of the query view are integrated probabilistically, yielding an estimate of the most likely location. In the environment model acquisition stage, the emphasis of our approach is on the selection of discriminative features best suited to characterizing individual locations. A high recognition rate is maintained with only 10% of the originally detected features, yielding a substantial speedup in recognition. Ambiguities due to self-similarity and dynamic changes in the environment are resolved by exploiting spatial relationships between locations, captured by a Hidden Markov Model.
Once the most likely location is determined, the relative
pose of the camera with respect to the reference view can
be computed.
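A schematic Java sketch of the two stages with made-up numbers: stage one turns keypoint matches into per-location likelihoods, and stage two runs a single HMM forward step so that spatial adjacency between locations resolves visual ambiguity. The transition matrix, belief, and likelihoods are illustrative only.

```java
// One HMM forward step over an invented three-location corridor model.
public class LocalizationSketch {
    public static void main(String[] args) {
        // Locations 0 - 1 - 2: only spatial neighbors are reachable.
        double[][] trans = {{0.7, 0.3, 0.0}, {0.15, 0.7, 0.15}, {0.0, 0.3, 0.7}};
        double[] belief = {0.6, 0.3, 0.1};     // earlier views favored location 0
        // Stage 1: keypoint matching makes locations 0 and 2 look alike.
        double[] likelihood = {0.45, 0.10, 0.45};
        // Stage 2: predict with transitions, weight by likelihood, normalize.
        double[] post = new double[3];
        double norm = 0;
        for (int j = 0; j < 3; j++) {
            for (int i = 0; i < 3; i++) post[j] += trans[i][j] * belief[i];
            post[j] *= likelihood[j];
            norm += post[j];
        }
        for (int j = 0; j < 3; j++)            // history disambiguates 0 vs. 2
            System.out.printf("P(location %d) = %.3f%n", j, post[j] / norm);
    }
}
```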
GMU-CS-TR-2005-1
One approach to testing concurrent programs, called reachability testing, generates synchronization sequences automatically and on the fly, without constructing any static models. In this paper, we present a general execution model for
concurrent programs that allows reachability testing to be applied to several commonly used synchronization constructs. We also
present a new method for performing reachability testing. This new method guarantees that every partially-ordered synchronization
sequence will be exercised exactly once without having to save any sequences that have already been exercised. We describe a prototype
reachability testing tool called RichTest and report some empirical results, including a comparison between RichTest and a partial-order-reduction-based tool called VeriSoft. RichTest performed significantly better for the programs in our study.
Index Terms: Software Testing, Reachability Testing, Concurrent Programming
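A toy Java illustration of the underlying idea, not the RichTest algorithm: for a receiver that gets one message from each of two senders, there are exactly two partially-ordered synchronization sequences (which message is received first). The sketch enumerates both and "replays" each deterministically by controlling delivery, so every sequence is exercised exactly once; the program and its state update are invented.

```java
// Exercising each partially-ordered synchronization sequence exactly once
// by controlling message delivery order in a toy two-sender program.
import java.util.*;

public class ReachabilitySketch {
    static int run(List<String> deliveryOrder) {
        int state = 0;                            // receiver state under this order
        for (String msg : deliveryOrder)
            state = msg.equals("fromA") ? state + 1 : state * 2;
        return state;                             // result depends on the order
    }

    public static void main(String[] args) {
        List<List<String>> sequences = List.of(
            List.of("fromA", "fromB"),            // A's message received first
            List.of("fromB", "fromA"));           // B's message received first
        for (List<String> seq : sequences)        // each sequence exercised once
            System.out.println(seq + " -> state " + run(seq));
    }
}
```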