Technical Reports
Permanent link for this collection: https://hdl.handle.net/2022/14049
Recent Submissions
Vanishing Point: a Visual Road-Detection Program for a DARPA Grand Challenge Vehicle (2005-12) Antolovic, Danko; Leykin, Alex; Johnson, Steven

Graph DSLs: A Survey on Green-Marl & SPARQL (2017-07) Kanewala, Thejaka
Many real-world problems are formulated as graphs, and standard graph algorithms are used to search for solutions. Applications of graphs and related algorithms can be found in many domains, ranging from standard scientific applications to social media platforms such as Facebook. Creating and processing graphs in HPC environments adds considerable complexity. Hiding these complexities while still achieving the necessary performance enables many more people to develop graph applications. In this paper we discuss two such graph-specific programming languages: Green-Marl, an imperative programming language for graph processing, and SPARQL, an SQL-like query language for graph processing.

A Survey on π-Calculus (2017-06) Kanewala, Thejaka
Calculus is the mathematical study of change. Process calculus models the changing behavior of computer processes, where a process usually refers to a running program. The λ-calculus is a formal system that has been heavily used in the functional programming arena. While the λ-calculus models changes in sequential computer processes, the π-calculus formalizes the behavior of concurrent processes. In this paper we study the general π-calculus language and variations of the π-calculus that are applicable to distributed systems and programming languages.

Strategies and Tradeoffs in Designing and Implementing Embedded DSLs (2017-06) Kanewala, Thejaka
A Domain Specific Language (DSL) is an elegant software-engineering solution to fairly complex problems in specific subject areas. While DSLs provide apt solutions to many domain problems, developing a DSL from scratch is a laborious task that consumes a considerable amount of time and money.
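As a hedged illustration of the embedding idea surveyed in this abstract (none of the code below is from the paper; the mini-language and its names are invented for the sketch), a deep embedding represents DSL programs as host-language data that a separate back end interprets or compiles:

```python
# A minimal deep-embedded arithmetic DSL (illustrative only; not from the paper).
# Programs are built as host-language data (an AST) and evaluated afterwards,
# reusing Python's parser and class system as the "host" front end.
from dataclasses import dataclass

class Expr:
    # Operator overloading lets DSL terms be written with ordinary Python syntax.
    def __add__(self, other): return Add(self, lift(other))
    def __mul__(self, other): return Mul(self, lift(other))

@dataclass
class Lit(Expr):
    value: int

@dataclass
class Var(Expr):
    name: str

@dataclass
class Add(Expr):
    left: Expr
    right: Expr

@dataclass
class Mul(Expr):
    left: Expr
    right: Expr

def lift(x):
    return x if isinstance(x, Expr) else Lit(x)

def evaluate(e, env):
    # One of possibly many back ends; an optimizer or compiler could walk the same AST.
    if isinstance(e, Lit): return e.value
    if isinstance(e, Var): return env[e.name]
    if isinstance(e, Add): return evaluate(e.left, env) + evaluate(e.right, env)
    if isinstance(e, Mul): return evaluate(e.left, env) * evaluate(e.right, env)
    raise TypeError(e)

program = Var("x") * 3 + 1          # a DSL term, written in host syntax
print(evaluate(program, {"x": 4}))  # 13
```

A shallow embedding would instead map each DSL construct directly to a host-language function, trading the ability to analyze the program's structure for simplicity.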
Recently, embedding has become a widely used methodology for developing DSLs. Embedded DSLs (EDSLs) reduce time and cost by reusing host-language features such as the parser and type checker. In this paper we go through various strategies used to embed a DSL into a general-purpose programming language. We also discuss several implementation strategies for each embedding type and compare the different embedding strategies.

Distributed Control in HPX (2017-05) Kanewala, Thejaka
Tasks in an unordered algorithm can be performed in any order, and the final result does not depend on the task processing order. However, prioritizing tasks improves the efficiency of the algorithm. In our earlier work, we proposed a work-scheduling mechanism for unordered distributed algorithms called "Distributed Control" (DC), and evaluated its performance by implementing a DC-based Single Source Shortest Path (SSSP) algorithm. We compared the performance of a popular ∆-stepping-based SSSP implementation with DC-based SSSP; our results showed that DC performs much better than ∆-stepping. Our previous implementation was based on a runtime environment called AM++. In this paper, we discuss a DC implementation based on HPX-3, a distributed runtime environment for the ParalleX execution model. We discuss the implementation challenges encountered, the strategies used to overcome them, and a performance comparison between our previous and current implementations.

Preproceedings of the 26th Symposium on Implementation and Application of Functional Languages (IFL 2014) (2014-10) Tobin-Hochstadt, Sam
The 26th Symposium on Implementation and Application of Functional Languages (IFL 2014) takes place at Northeastern University in Boston, USA from October 1 to 3, 2014. It represents the return of IFL to the USA for the third time.
IFL 2014 is hosted by the Programming Research Lab at Northeastern University. At the time of writing, the symposium had 47 registered participants from Denmark, the Netherlands, Norway, Germany, France, Hungary, the United Kingdom and the United States of America. The goal of the IFL symposia is to bring together researchers actively engaged in the implementation and application of functional and function-based programming languages. It is a venue for researchers to present and discuss new ideas and concepts, works in progress, and publication-ripe results. Following the IFL tradition, there is a post-symposium review process to produce formal proceedings, which will be published by the ACM in the International Conference Proceedings Series. All participants in IFL 2014 were invited to submit either a draft paper or an extended abstract describing work to be presented at the symposium. The submissions were screened by the program committee chair to make sure they were within the scope of IFL. Submissions appearing in the draft proceedings are not peer-reviewed publications. After the symposium, authors have the opportunity to incorporate feedback from discussions at the symposium into their papers and may submit a revised full article for the formal review process. These revised submissions will be reviewed by the program committee using prevailing academic standards. The IFL 2014 program consists of 31 presentations and one invited talk. The contributions in this volume are ordered according to the intended schedule of presentation. In order to make IFL 2014 as accessible as possible, we have not insisted on any particular style or length for the submissions. Such rules only apply to the version submitted for post-symposium reviewing. As is usual for IFL, the program lasts three days, with a social event and an invited talk.
The invited talk will be given by Niko Matsakis of Mozilla Research, who will discuss the role of ownership in the type system of Rust, a programming language designed for low-level systems programming in a memory-safe fashion. The social event takes place on October 2 and consists of two parts: a trip through the city and harbor of Boston on a duck boat, and, in the evening, a banquet dinner in downtown Copley Square. We are grateful to many people for their help in preparation for IFL 2014. Most significantly, Asumu Takikawa of Northeastern University served as local arrangements chair, and without his efforts this event would not have been possible. Additionally, the staff of the College of Computer and Information Science, particularly Nicole Bekerian and Doreen Hodgkin, helped make IFL a success. Our student volunteers also played an important role in the smooth running of the event. Special thanks are due to Rinus Plasmeijer, last year's chair and the head of the IFL steering committee, for advice and experience that improved IFL 2014.

A Virtual Filesystem Framework to Support Embedded Software Development (2007-06) Pisupati, Bhanu
We present an approach to simplify the software development process for embedded systems by supporting key development tasks such as debugging, tracing and configuration. The approach is based on the use of distributed filesystem abstractions: principal building blocks within an embedded system, in the form of "systems on chip" (SoC), export filesystem abstractions that are composed together up the system hierarchy and provide familiar file-based interfaces with which to interact with the entire system. The central question addressed in this thesis is how the workstation-centric idea of a distributed filesystem can be implemented and effectively applied to facilitate various software development tasks in the embedded domain.
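The composition of filesystem abstractions up the system hierarchy, as described in this abstract, can be sketched roughly as follows (this is an invented illustration, not the thesis implementation; all class and path names are hypothetical):

```python
# Illustrative sketch: each component exports a small file-like namespace;
# parents mount children, composing namespaces up the system hierarchy so the
# whole system becomes browsable and debuggable through one file tree.

class FileNode:
    """A leaf 'file' whose contents are produced on demand (e.g. a trace buffer)."""
    def __init__(self, read_fn):
        self.read_fn = read_fn

    def read(self):
        return self.read_fn()

class DirNode:
    """A directory that can mount other directories or files under names."""
    def __init__(self):
        self.entries = {}

    def mount(self, name, node):
        self.entries[name] = node

    def lookup(self, path):
        node = self
        for part in path.strip("/").split("/"):
            node = node.entries[part]
        return node

# A hypothetical SoC exports its debug state as files; the system root mounts it.
soc = DirNode()
soc.mount("status", FileNode(lambda: "running"))
root = DirNode()
root.mount("soc0", soc)

print(root.lookup("/soc0/status").read())  # running
```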
To this end, a primary contribution of our work is the realization of distributed filesystem implementations that are compatible with resource-constrained embedded architectures. We demonstrate the use of the filesystems in enabling debugging and tracing in heterogeneous, multiprocessor environments, while addressing issues central to SoC-based systems. The virtual filesystem model is also applied to facilitate usage, configuration and deployment in a contrasting embedded application domain, distributed sensor networks, thereby demonstrating its adaptability.

eXtensible Relational Databases: a Relational Approach to Interoperability (2004-07) Lu, J; Cheung, S; Wyss, Catharine

MD-SQL: A Language for Meta-Data Queries over Relational Databases (1999-07) Rood, C; Van Gucht, Dirk; Wyss, Felix
Future users of large, interconnected databases and data warehouses will increasingly require schematic transparency of data manipulation systems, in that (i) data from heterogeneous sources must be compared and interrelated, and (ii) data must be queried and extracted by distant users having minimal knowledge of its logical structure. A query language that abstracts over meta-data as well as ordinary data is needed. Previous work in this area has resulted in HILOG, SchemaLog and SchemaSQL. Although SchemaSQL improves on its predecessors, it remains somewhat informal and relies on a specialized transformation into a fragment of the tabular algebra to give it a viable operational semantics. In contrast, we provide a complete EBNF for Meta-Data SQL (MD-SQL) as a straightforward extension of a relationally complete subset of standard SQL. Like SchemaSQL, MD-SQL allows queries involving meta-data and ordinary data in a multi-database context over potentially disparate platforms. Schematic elements and data are freely interchangeable, and queries are allowed whose output type cannot be known at compile time.
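The flavor of querying meta-data alongside ordinary data can be sketched with SQLite's built-in catalog (MD-SQL itself is not publicly available, so this is only an analogy; the table names and data are invented):

```python
# Hedged sketch of the idea behind meta-data queries: discover schema elements
# as rows (here via SQLite's sqlite_master catalog), then query over the data
# whose location was only known through that meta-data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_2023 (region TEXT, amount INTEGER)")
conn.execute("CREATE TABLE sales_2024 (region TEXT, amount INTEGER)")
conn.execute("INSERT INTO sales_2023 VALUES ('east', 10)")
conn.execute("INSERT INTO sales_2024 VALUES ('east', 20)")

# Meta-data query: the table names themselves are the result.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name LIKE 'sales%'")]

# The shape of the final result depends on how many tables exist at run time,
# echoing MD-SQL queries whose output type cannot be fixed at compile time.
totals = {t: conn.execute(f"SELECT SUM(amount) FROM {t}").fetchone()[0]
          for t in sorted(tables)}
print(totals)  # {'sales_2023': 10, 'sales_2024': 20}
```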
Unlike SchemaSQL, however, each MD-SQL query translates into a series of simple, atomic operations, each of which is inherently relational. We formalize this translation by presenting a complete meta-algebra which is shown to be equivalent to MD-SQL. Furthermore, we provide some complexity results, in particular that MD-SQL and the meta-algebra yield characterizations of PSPACE. We also give results concerning when the output type of an MD-SQL query can be deduced at compile time. Finally, we briefly discuss an implementation of MD-SQL over an ordinary relational system that uses the DynamicSQL/CLI standard. Since MD-SQL is relational in nature, our implementation can benefit directly from existing query optimization techniques.

The Linear System Analyzer (1998-06) Bramley, Randall; Gannon, Dennis; Stuckey, T; Villacis, J; Akman, E; Balasubramanian, J; Breg, Fabian; Diwan, S; Govindaraju, Madhusudhan

pC++/streams: A Library for I/O on Complex Distributed Data Structures (1995-01) Gotwals, Jacob; Srinivas, Suresh; Gannon, Dennis

Undulant-Block Pivoting and Integer-Preserving Matrix Inversion (1995-08) Wise, David

EMILY: A Visualization Tool for Large Sparse Matrices (1994-07) Bramley, Randall; Loos, Tom

Analog Test Board: Design and Operation (1994-08) Montante, Robert

Disentangled Representation Learning Using (β-)VAE and GAN (2022-08) Haghir Ebrahimabadi, Mohammad
Given a dataset of images containing different objects with different features such as shape, size, rotation, and x-y position, and a Variational Autoencoder (VAE), creating a disentangled encoding of these features in the hidden vector space of the VAE was the task of interest in this thesis. The dSprite dataset provided the desired features for the required experiments in this research.
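The disentanglement probe this abstract describes, perturbing one latent dimension at a time, can be sketched in a few lines. The `decode` function below is a dummy stand-in for the thesis's trained (β-)VAE decoder; everything here is illustrative:

```python
# Hedged sketch of a latent traversal, the standard probe for disentanglement.
# `decode` is a hypothetical stand-in for a VAE decoder network so that the
# mechanics of sweeping one dimension while holding the rest fixed are visible.

def decode(z):
    # Hypothetical decoder: in practice this is the VAE's decoder network.
    return f"image(shape={z[0]:.1f}, scale={z[1]:.1f}, x={z[2]:.1f})"

def traverse(z, dim, values):
    """Hold all latent dimensions fixed except `dim`, which sweeps `values`.
    In a disentangled code, only one generative factor should change."""
    images = []
    for v in values:
        z_perturbed = list(z)
        z_perturbed[dim] = v
        images.append(decode(z_perturbed))
    return images

z = [0.0, 0.0, 0.0]  # encoding of some input image
frames = traverse(z, dim=2, values=[-1.0, 0.0, 1.0])
for f in frames:
    print(f)
```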
After training the VAE in combination with a Generative Adversarial Network (GAN), each dimension of the hidden vector was perturbed to explore the disentanglement in that dimension. Note that the GAN was used to improve the quality of the reconstructed output images.

Meta Proximal Policy Optimization for Cooperative Multi-Agent Continuous Control (2022-05) Fang, Boli
In this thesis we propose Multi-Agent Proxy Proximal Policy Optimization (MA3PO), a novel multi-agent deep reinforcement learning algorithm that tackles the challenge of cooperative continuous multi-agent control. Our method is driven by the observation that most existing multi-agent reinforcement learning algorithms mainly focus on discrete state/action spaces and are thus computationally infeasible when extended to environments with continuous state/action spaces. To address the issue of computational complexity and to better model intra-agent collaboration, we make use of the recently successful Proximal Policy Optimization (PPO) algorithm, which effectively explores continuous action spaces, and incorporate the notion of intrinsic motivation via meta-gradient methods so as to stimulate the behavior of individual agents in cooperative multi-agent settings. Towards these ends, we design proxy rewards to quantify the effect of individual agent-level intrinsic motivation on the team-level reward, and apply meta-gradient methods to leverage such an addition with a learning-to-learn optimization paradigm so that our algorithm can learn the team-level cumulative reward effectively. Furthermore, we have conducted experiments on various open multi-agent reinforcement learning benchmark environments with continuous action spaces.
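The PPO building block this abstract relies on is the clipped surrogate objective; a minimal sketch of that single term follows (MA3PO's proxy rewards and meta-gradients are not shown, and the numeric example is purely illustrative):

```python
# A small sketch of PPO's per-sample clipped surrogate objective, the building
# block the abstract refers to. Not the MA3PO algorithm itself.
import math

def ppo_clip_term(logp_new, logp_old, advantage, eps=0.2):
    """L = min(r * A, clip(r, 1-eps, 1+eps) * A), with r = pi_new / pi_old."""
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)

# When the new policy over-weights an already-advantageous action, the clip
# caps the incentive at (1 + eps) * A, keeping policy updates conservative.
print(ppo_clip_term(logp_new=0.5, logp_old=0.0, advantage=1.0))  # 1.2
```

In practice this term is averaged over a batch and maximized by gradient ascent; the clipping is what lets PPO take multiple update steps on the same data without the policy drifting too far.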
Our results demonstrate that our meta proximal policy optimization algorithm is not only comparable with existing state-of-the-art algorithmic benchmarks in terms of performance, but also significantly reduces training time complexity compared to existing techniques.

Reproducibility in Scientific Computing (2021-02) Klinginsmith, Jonathan
The Oxford English Dictionary defines the scientific method as "a method of procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses" [1]. Theory and experimentation, the first two pillars of the scientific method, have stood for centuries. Scientists have formulated theories and hypotheses and used experimentation to validate or refute theories. However, in recent years, computing has widely been considered the third pillar of science [2]. Advances in sensors, imaging, and scientific instrumentation have led to the generation of large amounts of scientific data. Data generation has become ubiquitous in science as well [3], to the point that some researchers consider data-intensive science to be the fourth pillar [4]. Ivie and Thain define scientific computing "as computing applied to the physical sciences (biology, chemistry, physics, and so on) for the purposes of simulation or data analysis" [5]. Within the computational science research community, Stodden states that "the digitization of science combined with the Internet create a new transparency in scientific knowledge, potentially moving scientific progress from building with black boxes, to one where the boxes themselves remain wholly transparent" [6]. As the scientific process further leverages both technology and data, the need to reproduce computational experiments has become imperative in the scientific discovery process.
However, as a computational science researcher reading a scientific paper, it can be challenging to reproduce the authors' experiments. To fully reproduce a computational experiment, one must have the same versions of software installed and configured, have access to the original data, and use the same parameters as the original experiment. In many cases, having access to all these items is not possible [7]. Even if the original data are not available, it should be reasonable to expect the experimental setup to be reproducible. Specifically, if the infrastructure setup and the software installation and configuration can be performed in a reproducible manner, then scientists are much better able to replicate or extend the experiment in question. Figure 1 models the progression of a computational science experiment. The three phases, configuration, execution, and publication, represent logical constructs in which experimental activities are performed and can be replicated. During the configuration phase, software must be installed and configured and, when necessary, infrastructure must be provisioned. This phase also includes any data preparation or downloads. Input parameters may be modified so that the experiment can be executed multiple times. Within the execution phase, the actual experiment is performed. Data and metrics are generated from the experiment for use in the publication stage. In the publication stage, data tables, figures, and charts are produced for information sharing and presentation of experimental results.
Fig. 1. Experimental progression
Many computational science disciplines are leveraging machine learning or artificial intelligence, in general, to make scientific inferences from trained datasets. Hutson [8] discussed the reproducibility crisis in artificial intelligence research. One of the most basic contributors to the crisis is researchers' lack of sharing and publishing software.
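The configuration phase described above can be made verifiable by recording a manifest of versions, parameters, and a data checksum that a later run can compare against. A minimal stdlib-only sketch, with all parameter names invented for illustration:

```python
# Hedged sketch of configuration-phase capture: record software versions,
# input parameters, and a content hash of the input data so a replicator can
# confirm they are starting from the same experimental setup.
import hashlib
import json
import platform
import sys

def data_checksum(data: bytes) -> str:
    # A content hash lets a replicator confirm they hold the same input data.
    return hashlib.sha256(data).hexdigest()

def capture_configuration(params: dict, data: bytes) -> str:
    manifest = {
        "python_version": sys.version.split()[0],
        "platform": platform.system(),
        "parameters": params,
        "data_sha256": data_checksum(data),
    }
    return json.dumps(manifest, indent=2, sort_keys=True)

# Hypothetical experiment parameters and toy input data.
manifest = capture_configuration({"learning_rate": 0.01, "seed": 42},
                                 b"toy input data")
print(manifest)
```

Real systems extend the same idea to pinned package lists, container images, and provisioned infrastructure, but the principle of an auditable manifest is the same.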
Heaven [9] provides reasons why AI is dealing with issues in reproducibility. He mentions that one of the most basic reasons for the crisis is the "lack of access to three things: code, data, and hardware." He continues that the divide between the "haves" and the "have-nots" of AI data and hardware is also contributing to the crisis. In many cases, as part of the experimental progression highlighted in Figure 1, a scientific computing analysis requires multiple sequential or para…

A Bayesian Evaluation of User App Choice in the Presence of Risk Communication on Android Devices (2019-09) Momenzadeh, Behnood; Camp, Jean
In this work we empirically explore the possibility that people lack the information to make risk-aware decisions when choosing between mobile apps, and whether, if given such information, they would change their behavior. Specifically, we examine the choice of apps by users when risk information is embedded in the display of apps. Currently, no such information is readily available. Despite the presence of permissions information, it is not cognitively feasible to compare apps on permissions, security, or privacy in current app stores. One component of resolving this lack of information is the creation of clear, effective risk communication at the time of app selection. One core test of risk communication is whether it influences decision-making. Here we test indicators that allow users to differentiate the risk associated with apps, and examine the impact on decision-making in four app categories. We use an experimental model grounded in medical interventions, where we add an intervention in multiple situations (in this case, app categories) and compare these to the pre-existing baseline. The question we address here is not whether such an indicator can be reliably generated, but rather whether, if clearly indicated, it would make a difference. To answer this we built an extended Android Play Store that embedded indicators using the lock icon as a cue.
We recruited sixty participants to test the interaction using tablets running the extended store on Android Jelly Bean. The Play Store was otherwise unaltered, and included the standard user ratings, download counts, and permissions interface. The result was that participants systematically chose apps with lower user ratings or smaller download counts, instead choosing apps rated higher with respect to risk. We compare our results to users' behavior in the Android Market, indicating that individuals not only prefer higher privacy with no loss of functionality, but also that some participants may trade off functionality for privacy.

IoTMarketplace: Informing Purchase Decisions with Risk Communication (2019-08) Gopavaram, Shakthidhar Reddy; Dev, Jayati; Das, Sanchari; Camp, Jean
It is common for people to declare that online privacy is important, to indicate that it is valuable, and to simultaneously behave in a manner inconsistent with these expressed preferences. This discrepancy between users' concerns and their behavior has been explained by three factors: information asymmetry, bounded rationality, and psychological biases. The conflict between expressions of concern and purchase decisions is amplified as the Internet of Things brings the potential for real-time multimedia surveillance, even at home. Yet there is no empirical evidence that privacy or security influences purchase decisions about IoT devices. In this work, we design an interface for an Internet of Things (IoT) marketplace that enables participants to make more privacy-aware purchases by addressing three of the factors associated with the privacy paradox. We then conduct a between-subjects experiment to test the effect of the interaction on product selection.
The results from this experiment show that when participants are presented with an interface that addresses all three factors, they purchase IoT devices that are more privacy-preserving, even if they have to pay a premium to do so. The results also show that when participants are presented only with privacy indicators, they do not consistently make privacy-preserving device choices if the psychological biases affecting users' decision-making are not also addressed by the interaction design.

LoCal: A Language for Programs Operating on Serialized Data (2019-04) Vollmer, Michael; Koparkar, Chaitanya; Rainey, Mike; Sakka, Laith; Kulkarni, Milind; Newton, Ryan
In a typical data-processing program, the representation of data in memory is distinct from its representation in a serialized form on disk. The former has pointers and arbitrary, sparse layout, facilitating easy manipulation by a program, while the latter is packed contiguously, facilitating easy I/O. We propose a language, LoCal, to unify in-memory and serialized formats. LoCal extends a region calculus into a location calculus, employing a type system that tracks the byte-addressed layout of all heap values. We formalize LoCal and prove type safety, and show how LoCal programs can be inferred from unannotated source terms. We transform the existing Gibbon compiler to use LoCal as an intermediate language, with the goal of achieving a balance between code speed and data compactness by introducing just enough indirection into heap layouts, preserving the asymptotic complexity of traditional representations, but working with mostly or completely serialized data. We show that our approach yields significant performance improvement over prior approaches to operating on packed data, without abandoning idiomatic programming with recursive functions.
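The core idea of operating directly on packed data can be sketched in a few lines. The byte format below is invented for illustration and is not LoCal's or Gibbon's actual layout:

```python
# Hedged sketch of traversing a tree in its packed, serialized form, without
# first deserializing it into pointer-linked nodes. Invented format: tag byte
# 0 = Leaf followed by a 4-byte little-endian int; tag byte 1 = Node followed
# by two subtrees laid out contiguously, left then right.
import struct

LEAF, NODE = 0, 1

def pack_leaf(n):
    return bytes([LEAF]) + struct.pack("<i", n)

def pack_node(left, right):
    # Children are packed inline, one after the other -- no pointers needed.
    return bytes([NODE]) + left + right

def sum_packed(buf, offset=0):
    """Sum all leaves by reading the buffer sequentially.
    Returns (total, offset just past this subtree)."""
    tag = buf[offset]
    if tag == LEAF:
        (n,) = struct.unpack_from("<i", buf, offset + 1)
        return n, offset + 5
    total_left, offset = sum_packed(buf, offset + 1)
    total_right, offset = sum_packed(buf, offset)
    return total_left + total_right, offset

tree = pack_node(pack_leaf(1), pack_node(pack_leaf(2), pack_leaf(3)))
total, _ = sum_packed(tree)
print(total)  # 6
```

The traversal threads an offset through the buffer instead of following pointers, which is the serialized-data analogue of a recursive tree walk; LoCal's type system is what makes such offset bookkeeping safe and mostly implicit.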