Gateways 2018: Full Schedule

10:55am CDT

Accessing Distributed Jupyter / Spark in OnDemand

There are a variety of gateway software platforms available, each of which provides its own unique advantages. OnDemand’s unique architecture empowers developers and users to easily create and run system-level access applications as well as interactive HPC applications. In this demonstration we show how easily OnDemand can be used to stand up a Jupyter / Spark stack and run a distributed workload on an HPC cluster, all within a browser.

Presenters and Authors

Alan Chalker

Eric Franz

David Hudak

Executive Director, Ohio Supercomputer Center

I am interested in Supercomputer and Research Computing management issues and technically in science gateways and the Open OnDemand platform.

Douglas Johnson

Jeremy Nicklas

Wednesday September 26, 2018 10:55am - 11:15am CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Usage and Usability Studies

11:15am CDT

Morning-a-month Website Usability Testing for a Materials Science Gateway

Science gateways, offering access to scientific tools and large amounts of data in the form of web portals and other applications, are often run by the research groups who produce these tools and data. As the prevalence of these gateways increases, a key issue that arises is that of usability. The developers of gateways want users to be able to interact with the gateways easily and efficiently. However, design choices are typically made by members of the research groups involved and can be ad-hoc, biased, and generally unpredictable. To ensure users can make full use of a gateway, the need arises for a usability study.

The resources needed to do a large-scale usability study are quite significant. Ideal usability testing would involve comparing multiple versions of a user interface to see which version users prefer. In addition to the time and effort needed to produce these versions, many participants need to be found, scripts need to be written, and a large amount of time needs to be dedicated for the testing session itself.

While the benefits of large-scale usability testing are obvious, research groups often have neither the user numbers nor the rate of development that would warrant such an effort. Additionally, these groups often do not have the resources needed to fund/staff such an endeavor.

When starting a usability testing initiative at the Materials Project, our approach was to make usability testing as easy as possible while still obtaining valuable information. We focused on a "one morning a month" model and held 3 testing sessions over the course of 3 months, with 10 participants in total. Each session made use of a script to keep participant experience consistent. Scripts were slightly tweaked between sessions to obtain information on the impact of the script itself on user behavior. Each session was streamed live to observers in another room as well as recorded for later review, and observations were noted immediately after each session was over. With only 10 participants and a minimal budget, we were able to draw conclusions about web design and user behavior specific to our portal. We hope these conclusions, as well as our notes on the testing process itself, prove useful to developers of other science gateways.

Presenters and Authors

Joseph Palakapilly

Kristin Persson

Donald Winston

Wednesday September 26, 2018 11:15am - 11:35am CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Usage and Usability Studies

11:35am CDT

Plotting Advancements to the GenApp Framework

GenApp, a Generalized Application generation framework, is a general tool for rapid deployment of applications to an extensible set of target languages. To produce fully functional science gateways and standalone graphical user interface (GUI) applications, GenApp weaves libraries of code fragments and user-defined modules as directed by simple textual definition files. Apart from the scientific code to be deployed, these definition files are the only input from the user. This conceptual simplicity makes GenApp ideally suited to scientists (physicists, chemists, etc.) with little-to-no CS expertise who wish to deploy their scientific software on the web. Currently, GenApp is used to generate multiple web-based science gateways, primarily in the Small Angle Scattering field. GenApp features are frequently added as required by use cases. We will briefly cover existing basic and advanced GenApp capabilities and focus on detailed discussion of the most recently integrated GenApp features, including enhancements to the user interface such as enabling robust interactive 2D and 3D data plotting.

Presenters and Authors

Emre Brookes

Alexey Savelyev

Wednesday September 26, 2018 11:35am - 11:55am CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Usage and Usability Studies

11:55am CDT

Clustering Download Events to Identify Classrooms

The Network for Computational Nanotechnology's (NCN) [1] nanoHUB site uses the HUBzero® platform [2] to offer a variety of content, simulation tools, and collaboration methods to an international community of students, teachers, and professionals. Understanding and identifying educational usage of nanoHUB, in order to form communities around nanotechnology education and improve educational content, is a long-term objective of nanoHUB. While simulation tool and collaboration users log into nanoHUB, providing us with an identity with which to associate their usage, the majority of activity is from unidentified users who download content and come to the site from outside references such as search engine results. This paper describes a method to detect classroom usage from content download events with no additional information, identifying classroom usage by any user of nanoHUB material and providing insights into content usage.
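As a rough illustration of the idea of finding classroom-like patterns in download events, the sketch below groups downloads of the same item into time-based bursts and keeps the bursts that involve many distinct clients. This is not the method described in the paper, which the abstract does not spell out. The event layout (`timestamp`, `client_id`, `content_id`), the anonymous client key, the function names, and the `max_gap` and `min_clients` thresholds are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_download_bursts(events, max_gap=timedelta(minutes=15), min_clients=5):
    """Group downloads of the same item into bursts and keep the large ones.

    events: iterable of (timestamp, client_id, content_id) tuples, where
    client_id is a hypothetical anonymous key (not a logged-in identity).
    Returns a list of dicts, one per burst that has enough distinct clients.
    """
    by_content = defaultdict(list)
    for timestamp, client_id, content_id in events:
        by_content[content_id].append((timestamp, client_id))

    bursts = []
    for content_id, downloads in by_content.items():
        downloads.sort(key=lambda d: d[0])
        group = []
        for download in downloads:
            # A gap longer than max_gap closes the current group.
            if group and download[0] - group[-1][0] > max_gap:
                _keep_if_large(content_id, group, min_clients, bursts)
                group = []
            group.append(download)
        _keep_if_large(content_id, group, min_clients, bursts)
    return bursts


def _keep_if_large(content_id, group, min_clients, bursts):
    """Record the group as a burst if it has enough distinct clients."""
    clients = {client_id for _, client_id in group}
    if len(clients) >= min_clients:
        bursts.append({
            "content_id": content_id,
            "start": group[0][0],
            "end": group[-1][0],
            "clients": len(clients),
        })


# Made-up data: six clients fetch one item within minutes of each other,
# followed by two isolated downloads that should not count as a burst.
start = datetime(2018, 9, 26, 10, 0)
events = [(start + timedelta(minutes=i), f"client-{i}", "lecture-1")
          for i in range(6)]
events.append((start + timedelta(hours=5), "client-99", "lecture-1"))
events.append((start + timedelta(minutes=2), "client-7", "lecture-2"))

bursts = find_download_bursts(events)
```

Gap-based grouping is used here only because it needs no information beyond the download events themselves; a real analysis would have to choose and validate its own thresholds and client key.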

Presenters and Authors

Gerhard Klimeck

Dwight McKay

Senior Data Science Engineer, Purdue University

Dwight McKay is a Senior Data Science Engineer with Purdue University’s central research computing group. He focuses on data visualization and analysis. He joined the nanoHUB effort in March 2013. Dwight has served as Director of Research Systems, managing the group who designs...

Michael Zentner

Director, HUBzero Platform, Purdue University / HUBzero

Entrepreneurship, Leadership of large cyberinfrastructure projects.

Wednesday September 26, 2018 11:55am - 12:05pm CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Usage and Usability Studies

12:05pm CDT

Visualizing User Interactions with Simulation Tool

In order to improve user experiences with simulation tools hosted by cyberinfrastructure, we endeavor to gain a better understanding of how users interact with tools. The dimensionality of these tools is often too large to be intuitively understood. This paper presents two contributions to the study of user behavior: the MEANDER algorithm for visualizing sessions of user activity, and a scoring method ("searchiness") for characterizing a user's behavior along an axis of "wildcatting" vs. searching. The MEANDER algorithm uses graph heuristics to "squash" a high dimensional path of exploration into a (distorted) plane for rendering. The "searchiness" score is built upon the same graph techniques.

Presenters and Authors

Nathan Denny

Gerhard Klimeck

Michael Zentner

Director, HUBzero Platform, Purdue University / HUBzero

Entrepreneurship, Leadership of large cyberinfrastructure projects.

Wednesday September 26, 2018 12:05pm - 12:15pm CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Usage and Usability Studies

4:00pm CDT

SciServer: Collaborative Science Platform

SciServer is a collaborative science platform that allows researchers across scientific disciplines to host and share their data sets, and provides a flexible, easy-to-use framework for data retrieval and server-side data-intensive analysis with the largest science data sets.

Presenters and Authors

Gerard Lemson

Dmitry Medvedev

Manuchehr Taghizadeh Popp

Michael Rippin

Ani Thakar

Wednesday September 26, 2018 4:00pm - 4:20pm CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Platforms

4:20pm CDT

Fully Integrating Data with Compute Workflows: A Platform to Better Serve Scientific Research

The NSF Office of Advanced Cyberinfrastructure has recognized the emerging and evolving need for platforms that fully integrate data and computing workflows, and is calling for research to deliver systems that provide a full spectrum of data services and also offer a coherent coupling with computing software. The Digital Environment to Enable Data-driven Science (DEEDS) project has created a cross-domain, self-serve platform for data and computing that supports the entire end-to-end research investigation process. DEEDS offers interactive interfaces to 1) collect, manage, and explore data, 2) define and launch tools, 3) track computational workflows, and 4) access toolkits for ad hoc analytics. All interfaces are available from a single dashboard so that the workflow between data and tools is smooth and intuitive. In this paper, we describe DEEDS innovations for handling data and computational workflows, and we present the use cases from four science domains that defined features, services, and usability requirements for DEEDS.

Presenters and Authors

Andres Bejarano

Ann Christine Catlin

Steven Clark

Parameswaran Desigavinayagam

Sumudinie Fernando

Chandima Hewanadungodage

Omkar Patil

Guneshi Wickramaarachchi

Wednesday September 26, 2018 4:20pm - 4:40pm CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Platforms

4:40pm CDT

SeedMe2: Extensible data sharing websites for teams

Data is an integral part of scientific research, and data size problems have become endemic as computation and analyses produce increasingly large amounts of data. Research teams are inevitably tasked with managing these rapidly growing data collections. Existing solutions are largely focused upon providing storage space, whether local or in the cloud, and a familiar folder tree-style hierarchy. While these file system solutions work, they separate the data from essential contextual information, such as metadata, descriptive text and equations, job execution parameters, visualizations, and on-going data discussion among the researchers. Important discussions, for instance, remain in email logs or forums, while descriptive text is left in README files or embedded in those same email logs and forums. This distribution of contextual information makes it harder to keep track of it all and keep data from being orphaned or misinterpreted. A more unified approach is needed that keeps data and context together within the same storage system.

This interactive demonstration shows key features of building blocks for data sharing and data management developed by the SeedMe2 (Stream, Encode, Explore and Disseminate My Experiments) project. It enables research teams to manage, share, search, visualize, and present their data in a web-based environment using an access-controlled, branded, and customizable website they own and control. It supports storing and viewing data in a familiar tree hierarchy but also supports formatted annotations, lightweight visualizations, and threaded comments on any file/folder. The system can be easily extended and customized to support metadata, job parameters, and other domain and project-specific contextual items. The software is open source and available as an extension to the popular Drupal content management system.

Project website: http://dibbs.seedme.org
Trial website: http://sandbox.seedme.org

Citation
Chourasia, Amit; Nadeau, David; Wong, Mona; Norman, Michael (2018): SeedMe2: Extensible data sharing websites for teams. figshare. Paper. https://doi.org/10.6084/m9.figshare.7070291.v1

Presenters and Authors

Amit Chourasia

David Nadeau

Michael Norman

Mona Wong

Software engineer, UCSD SDSC

Wednesday September 26, 2018 4:40pm - 4:50pm CDT
Balcones Room, Commons Conference Center 10100 Burnet Road, Bldg 137, Austin, TX 78758

Concurrents B, Gateway Platforms

4:50pm CDT

SCAIGATE: Science Gateway for Scientific Computing with Artificial Intelligence and Reconfigurable Architectures

SCAIGATE is an ambitious project to design the first AI-centric science gateway based on field-programmable gate arrays (FPGAs). The goal is to democratize access to FPGAs and AI in scientific computing and related applications. When completed, the project will enable the large-scale deployment and use of machine learning models on AI-centric FPGA platforms, allowing increased performance-efficiency, reduced development effort, and customization at unprecedented scale, all while improving ease of use in science domains which were previously AI-lagging. SCAIGATE was an incubation project at the Science Gateway Community Institute (SGCI) bootcamp held in Austin, Texas in 2018.

Authors: David Ojika, Herman Lam, Bhavesh Patel and Ann Gordon-Ross

Presenters and Authors