2018 Research Projects

UCSD REHS Important Dates to Remember

  • The REHS Online Application Process is Now Open!
  • REHS Application Deadline: March 15, 2018
  • Period of Internship: June 18 - August 10, 2018

Summer 2018 Research Project Descriptions

Molecular Mechanisms of Diseases: cancer, autism, Alzheimer’s disease
Dr. Igor Tsigelny, Dept. of Neurosciences, SDSC


Autistic Spectrum Disorders (ASD)

ASD is a neurodevelopmental disorder that includes social impairments, restrictive and repetitive behavioral patterns, and communication difficulties. Scientists have not found the cause of autism; however, ASD is associated with both genetic and environmental factors. Because of its etiological heterogeneity and its increased likelihood in siblings, autism is predicted to be caused by genetic mutations in critical neurological areas of the genome. Specific genes and genomic loci have been located and implicated in ASD development. We study the interactions of these genes with the transcription factors that activate them, using molecular modeling and molecular dynamics simulations to find possible points of drug administration.

Alzheimer’s

Alzheimer’s Disease is a neurodegenerative disorder that affects a variety of the brain’s functions, starting with short-term memory loss and eventually progressing to mood swings, disorientation, depression, and even loss of bodily functions. It is responsible for 60-70% of all dementia cases and is the sixth leading cause of death in America. It is characterized by a buildup of β-amyloid plaques in the brain. There is no known cure for the disorder today, but recently the amyloid hypothesis has been heavily researched as the mechanism of the disease. The amyloid precursor protein (APP) is cut by two enzymes, β-secretase and γ-secretase, and β-amyloid, the toxic form of the protein, is one of the segments that remains. Since the marker of Alzheimer’s is the buildup of this plaque in the brain, there is potential for slowing the progression of the disease if production of this protein is inhibited. We study the interactions of existing drugs with the target proteins and try to find methods to improve their action.

Cancer

Cancer is the second leading cause of death in the United States and was responsible for 580,350 deaths in 2013, so it has become ever more imperative that researchers focus their attention on this field. The Ingenuity Integrated Pathway Analysis (IPA) program is used to build and explore transcriptional networks, microRNA-mRNA target networks, phosphorylation cascades, and protein-protein or protein-DNA interactions. We try to identify regulatory effects that lead from signaling events to transcriptional events for causal network analysis, and to construct a comprehensive chart of diseases and functions associated with target genes, so as to elucidate gene functionality and possible points of drug administration.

  1. Number of Students to be supported: 3
  2. Name of Lead person: Dr. Igor Tsigelny, Dept. of Neurosciences, SDSC

The students will be a part of the research team working on the larger scale project that includes the project described here. They will attend the group meetings and communicate with the team members using other methods of communication. The student will work closely with the lead person and the other personnel involved.

  1. No parallel internships in other organizations are permitted.
  2. Student Prerequisite

We are looking for a student who is interested in both biology and computation.

  1. Number of hours per week: 15-20 hours
  2. Relevant link:
    San Diego Supercomputer Center: http://www.sdsc.edu

 

Visualizing University Locations
Dr. Amit Chourasia, Visualization Services, San Diego Supercomputer Center, UCSD

Research project: This project involves analyzing how each data
There are many universities in the world. Some of them are very well known, while others are not known beyond their local region. The goal of this project is to gather information about universities and create a well-defined list of their attributes along with their locations. We will then create visualizations based on this data. Through visualization we hope to see how universities are distributed around the world. Thereafter, we will associate these universities with the more than 100,000 proteins registered in the Protein Data Bank. Finally, we will package all this work as a website that can be easily filtered and searched by the public.

Number of students: 2

Name of lead person: Dr. Amit Chourasia, Visualization Services, San Diego Supercomputer Center, UCSD

Plan to integrate students into the group activity

This is a new project, and the students will have an opportunity to contribute towards a larger effort on this work.

Student Prerequisites

To apply for this project, you should be comfortable using the command line on Linux, macOS, or Windows, and have some scripting or programming knowledge, such as shell scripts, Python, or Java. Experience with JavaScript, HTML, and CSS is a plus, but optional. An interest in computing and visualization is encouraged.

Number of hours per week: 20 hours

Relevant Link: http://www.sdsc.edu/pmac/research-areas/memory-analysis.html

 

Auto-Correlation in Seismology Research
Dr. Pietro Cicotti, Dr. Dawei Mu, San Diego Supercomputer Center, UCSD

Analysis and correlation of patterns in seismic data can improve our ability to recognize patterns and anticipate events of interest. SDSC developed an auto-correlation program that detects repeated seismic signals without traditional template recognition preprocessing; this program is still in development and needs to be profiled and optimized.

Depending on background and interest, the successful applicant will work on various components of this software and will be involved in activities that include benchmarking, profiling performance on CPU and GPU systems, testing optimizations, and visualizing results.
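
For applicants unfamiliar with the idea, the following is a minimal Python sketch of how auto-correlation can reveal a repeated signal in a single trace. It is an illustration only and is not SDSC's program: the function names, the synthetic trace, and the `min_lag` parameter are all assumptions made for this example, and the simple O(n²) loop is the kind of code that profiling and optimization work would aim to speed up.

```python
def autocorrelation(signal):
    """Normalized autocorrelation of a non-empty sequence for lags 0..n-1."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    if energy == 0:
        # A constant trace carries no repeating pattern.
        return [0.0] * n
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def strongest_repeat_lag(signal, min_lag=1):
    """Lag (>= min_lag) at which the trace is most similar to itself.

    min_lag should exceed the length of a single event so that the
    short-range self-similarity of one waveform is ignored.
    """
    acf = autocorrelation(signal)
    return max(range(min_lag, len(acf)), key=lambda lag: acf[lag])


if __name__ == "__main__":
    # Hypothetical trace: the same short waveform occurs twice, 100 samples apart.
    waveform = [1.0, -2.0, 3.0, -2.0, 1.0]
    trace = [0.0] * 200
    for offset in (20, 120):
        for i, value in enumerate(waveform):
            trace[offset + i] += value
    print(strongest_repeat_lag(trace, min_lag=10))
```

No template of the waveform is supplied here; the repeat is found only by comparing the trace with shifted copies of itself.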

  1. Number of Students to be supported: 1
  2. Name of Lead person: Dr. Pietro Cicotti, Dr. Dawei Mu, San Diego Supercomputer Center, UCSD
  3. Plan to Integrate Student into Group Activity

The student will be a part of the research team working on the larger scale project that includes the project described here. He or she will attend the group meetings and communicate with the team members using other methods of communication. The student will work closely with the lead person and the other personnel involved.

  1. Student Prerequisite

The applicants should be interested in computer science and engineering. Some programming experience is desirable.

  1. Number of hours per week: 15-20 hours
  2. Relevant link:
    San Diego Supercomputer Center: http://www.sdsc.edu

 

Metabolic Profiles for Cancer Diagnostics
Dr. Igor Tsigelny, Dept. of Neurosciences, SDSC


Cancer is the second leading cause of death in the United States and was responsible for 580,350 deaths in 2013, so it has become ever more imperative that researchers focus their attention on this field. The Ingenuity Integrated Pathway Analysis (IPA) program is used to build and explore transcriptional networks, microRNA-mRNA target networks, phosphorylation cascades, and protein-protein or protein-DNA interactions. We try to identify regulatory effects that lead from signaling events to transcriptional events for causal network analysis, and to construct a comprehensive chart of diseases and functions associated with target genes, so as to elucidate gene functionality and possible points of drug administration.

  1. Number of Students to be supported: 1
  2. Name of Lead person: Dr. Igor Tsigelny, Dept. of Neurosciences, Moores Cancer Center, SDSC
  3. Plan to Integrate Student into Group Activity

The student will be a part of the research team working on the larger scale project that includes the project described here. He or she will attend the group meetings and communicate with the team members using other methods of communication. The student will work closely with the lead person and the other personnel involved.

  1. Student Prerequisites

We are looking for a student who is interested in both biology and computation.

  1. Number of hours per week: 15-20 hours
    No parallel internships in other organizations are permitted.
  2. Relevant link:
    San Diego Supercomputer Center: http://www.sdsc.edu

 

Neuroscience Gateway Software Intern
Subha Sivagnanam and Amit Majumdar, San Diego Supercomputer Center, UCSD

The Neuroscience Gateway (NSG, www.nsgportal.org) provides access to supercomputing resources for computational neuroscientists. Through a simple web-based portal, the NSG provides a user-friendly environment for uploading models, specifying supercomputing job parameters, querying running job status, receiving job completion notices, and storing and retrieving output data. The NSG distributes user jobs to appropriate supercomputing resources. We are interested in students who are proficient in scripting languages and can develop code that will automatically submit various neuroscience modeling tools to supercomputing resources and test the NSG software infrastructure. Required skills include Python or Perl programming and exposure to Linux systems.

  1. Number of Students to be supported: 1-2 – for web development
  2. Name of Lead person: Subha Sivagnanam, Amit Majumdar, and Kenneth Yoshimoto, San Diego Supercomputer Center, UCSD
  3. Plan to Integrate Student into Group Activity

The student will be a part of the research team working on the larger scale project that includes the project described here. He or she will attend the group meetings and communicate with the team members using other methods of communication. The student will work closely with the lead person and the other personnel involved.

  1. Student Prerequisite

Knowledge of web technologies such as JavaScript, MySQL, HTML, and XML. Prior experience using supercomputing resources is desirable but not required. A keen eye for detail.

  1. Number of hours per week: 10-15 hours
  2. Relevant link:
    San Diego Supercomputer Center: http://www.sdsc.edu/

 

Science Writing & Video Communications Interns
Jan Zverina, Division Director of External Relations, San Diego Supercomputer Center, UC San Diego

  1. Overall Research Project

SDSC’s External Relations group is seeking high school students who have an interest in basic journalism plus a focus on learning more about effectively communicating results of scientific research projects that benefit society at large. Candidates should have a strong desire to learn more about how to communicate such research through both the written word as well as via video interviews and graphics — and to make it both engaging and easily comprehended by a general audience. At the same time, students should be mindful of not diminishing the scientific work of the researcher or research team in any way. Sound challenging?

As a result of this internship, students will come away with a better understanding of how computational science is conducted, while having the ability to reduce the technical jargon and other barriers to understanding and communicating how high-performance computing and data storage assist a wide range of today’s researchers. Students will have the opportunity to choose a scientific domain that is of interest to them.

This internship will include creating one or two short videos of selected SDSC researchers, as well as communicating how SDSC resources such as the ‘Comet’ supercomputer assist researchers in conducting their studies. An overarching goal is to help “take the mystery out of supercomputing” so that general, less technical audiences understand the benefits that such systems provide. Students will be assisted with the final editing and completion of the videos. See more on this under Item 4, Plan to Integrate Student into Group Activity.

We anticipate that this ‘Science Writing & Video Communications’ track may encourage some students to consider careers in computational sciences, which is one of the more lucrative careers in overall journalism as technical writing remains a sought-after skill.

Project Title: Science Writing & Video Communications: Highlighting SDSC’s Expertise

  1. Research Project Background: The National Science Foundation is urging researchers and science writers alike to write at a more basic level of understanding. This from an email dated January 23, 2015:

National Science Foundation Director France Córdova recently announced important new steps to enhance transparency and accountability at NSF. Under the new directive, principal investigators will work with program officers to draft an award abstract in plain English that describes the project and how it will advance the progress of science, the national defense, or the nation’s health, prosperity and welfare. The move comes amid intense scrutiny from Capitol Hill about the value of research in the social sciences and on climate issues.

  1. Number of Students to be supported: Two
  2. Name of lead mentor: Jan Zverina, Executive Director of External Relations, San Diego Supercomputer Center, UC San Diego. With assistance from the group’s videographers.
  3. Plan to Integrate Student into Group Activity: Jan Zverina will supervise students directly, help them to develop questions to create brief videos of selected SDSC staff, and create a variety of media to articulate the basics of supercomputing expertise in a clear and compelling way to both the general public and their student peers. Students will participate in the poster session at the end of the internship, and are encouraged to be as creative as they want in their presentation. For each project, an initial meeting will be used to outline the basics, with review and editing done both via email and through regularly scheduled in-person meetings. Interns will receive constructive criticism of their work, with suggestions for how it might be improved. Interns will also be asked to review each other’s work and suggest improvements. Completed articles or videos may appear on the SDSC website or in SDSC’s ‘Innovators’ e-newsletter, with full credit given to students for their work so they may include it in their portfolios. Students may also be invited to report on the SDSC Big Data Summer Institute in August if they are interested and their schedule allows.
  4. Student Prerequisites:

(Required)

Evidence of ability to write at least at a 10th grade level (A or B grades in 10th grade English)

An A or B grade in at least one science or engineering course at the high school level. The ability to receive constructive criticism of their writing with grace and humor, and to make revisions guided by professional communication staffers. Students will need to be comfortable in interviewing researchers, primarily in person. Mentors will always provide an introduction to the researchers.

  1. Number of hours per week: 10-15 hours.
  2. IT Support: Students will be provided a workspace and workstation with internet access and a USB drive. All video equipment will be provided by SDSC External Relations, along with a tutorial on how to use it.
  3. Relevant links: All students applying for this internship should familiarize themselves with the following:
  • The San Diego Supercomputer Center website at http://www.sdsc.edu/. Click on ‘Press Releases’ under the ‘News & Events’ tab on the home page to review recent press releases.
  • Review the ‘Researcher Spotlight’ section on the lower part of the SDSC home page.
  • Review the SDSC Annual Report available as a pdf file at http://www.sdsc.edu/pub/index.html. This report provides an overview of the depth and breadth of SDSC’s projects and activities, and the staff members behind them.
  • Review the SDSC Innovators newsletter at http://www.sdsc.edu/news_and_events/newsletter.html

 

Summer Outreach Programs Videographer
Ange Mason, Education Program Manager, San Diego Supercomputer Center, UCSD

During the past few years, SDSC’s Education and Outreach group has sought to build its social media presence, and we have found that short videos of our summer workshops and research intern interviews are very popular. We place this content and still images on our summer SDSC TV YouTube project and our Facebook pages to showcase our outreach activities year round. Our four Facebook pages are:

The intern will work with the program manager to brainstorm ideas to make our social media content vibrant, fun, appealing and informative with content such as videos, student video spotlights, summer workshop and internship photos, workshop instructor interviews, researcher interviews, and other innovative ideas. Content such as this will comprise a weekly SDSC TV feature, which will be posted each Friday.

This is a super fun (yes, I said super fun) opportunity to work with researchers and students throughout the week, as well as the opportunity to interact with STEM camp instructors and campers. This is a large project with many fun and creative components. Let your talent behind the camera soar!

Students applying on their own will be given the option to work as a team and thus be paired with another applicant, or the option to work alone. Student teams of two may also apply and will be interviewed together. Please indicate the name of your teammate in your application.

Number of Students Requested: 1-2

Name of Lead person: Ange Mason, Education Program Manager, San Diego Supercomputer Center, UCSD

Plan to Integrate Student into Group Activity:
The intern will be a part of the summer student staff working to assist with social media outreach. He or she will attend the group meetings and share in weekly planning and logistical meetings. The student will work closely with the project lead and the other Education team members.

Student Prerequisite:

We are looking for a student who is self-driven and able to work independently and with others. The intern must possess strong interpersonal skills. The intern must possess strong organizational skills plus excellent written and oral skills. The intern must also be comfortable using a camera or video camera and have strong video creation knowledge. For a hint as to what kind of intern we are seeking, please visit YouTube and enter SDSCTV or click on the following link: https://www.youtube.com/results?search_query=SDSCTV.

SDSC will work with you to obtain needed software for this project.

Number of hours per week: 20 hours

Relevant Links:
San Diego Supercomputer Center: http://www.sdsc.edu/

San Diego Supercomputer Center StudentTECH program: http://education.sdsc.edu/studenttech

 

SDSC Datacenter Network Performance Monitoring Intern
Tom Hutton, Network Engineer, San Diego Supercomputer Center, UCSD

Interns in this position will research the use of, and test the deployment of, the perfSONAR Toolkit on small, inexpensive microcomputers that could be deployed as a flock of testers within a datacenter. They will handle the installation, configuration, testing, and evaluation of a particular hardware platform for datacenter testing purposes.

Planned hardware for testing/deployment will be:
  • ECS LIVA 32GB running Debian
  • ECS LIVA 64GB running Debian
  • ECS LIVA-X 32GB running Debian
  • GigaByte 8G running CentOS
  • Intel NUC 8G running CentOS
  • Zotac 8G running CentOS

Number of Students Requested: Two

Name of Lead person: Tom Hutton, Network Engineer, San Diego Supercomputer Center, UCSD

Plan to Integrate Student into Group Activity:

Students will be expected to spend 50% of their time working on this project at SDSC and may spend the other 50% working remotely. Students will have the opportunity to communicate directly with the SDSC staff programmers.

Student Prerequisite:

We are looking for students who are self-driven and able to work with minimal supervision. Students should be familiar with the Linux operating system and basic operating system installation.

Number of hours per week: 7-20

Relevant Links:
San Diego Supercomputer Center: http://www.sdsc.edu 

 

Summer Computing and Science Academy Teaching Assistant
Ange Mason, Education Program Manager, San Diego Supercomputer Center, UCSD

Do you love working with students in a fun and informal setting? Do you like learning about new software programs? Then this summer opportunity is for you! Help the San Diego Supercomputer Center (SDSC) with its summer workshops and learn new skills in the process. We are offering a variety of workshops, from environmental science to cinematic arts and from Java to physics, C programming, and much more! We need you to be a course assistant, supporting the instructor by answering students’ questions and helping them with their projects. Students will work either at SDSC or at another location on the UCSD campus. Learn new skills in leadership, while learning new programming languages and scientific methods.

Don’t worry if you don’t know the software being taught. All you need to do is follow along with the instructor. This is a great opportunity for you to learn many of the new and familiar software packages being used today.

Number of Students Requested: 16

Name of Lead person: Ange Mason, Education Program Manager, San Diego Supercomputer Center, UCSD

Plan to Integrate Student into Group Activity:
The student will be a part of the summer student staff working to assist with our busy schedule of summer workshops. He or she will attend the group meetings and share in weekly planning and logistical meetings. The student will work closely with the project lead and the other Education team members.

Student Prerequisite:
We are looking for a student who is self-driven and able to work with minimal supervision. Student must be customer service oriented and like working with a diverse population of students.
Each week, students will be sent the software links for the following week. Students must become familiar with the software in order to be an effective workshop assistant. This is a position requirement.

Preferred But Not Required
Student teaching assistants should have prior experience working with middle and high school students in a camp or TA environment or have attended previous SDSC workshops.

All student assistants will be required to have a background check through UCSD.

Number of hours per week: 20-24 hours. Students are asked to work full days or half days for continuity with the workshop participants.

Relevant Links:
Summer 2018 Programs: http://education.sdsc.edu/studenttech/?page_id=570
San Diego Supercomputer Center: http://www.sdsc.edu/
San Diego Supercomputer Center StudentTECH program: http://education.sdsc.edu/studenttech/

 

Analyzing the Scheduling Efficiency of a Batch Processing System
Martin Kandes, System Administrator, San Diego Supercomputer Center, UCSD

Project Description:

Comet is SDSC’s newest supercomputer. The result of a $21.6M National Science Foundation (NSF) award, Comet delivers over 2.7 petaFLOPS of computing power to scientists, engineers, and researchers all around the world. In fact, within its first 18 months of operation, Comet served over 10,000 unique users across a range of scientific disciplines, becoming one of the most widely used supercomputers ever in NSF’s Extreme Science and Engineering Discovery Environment (XSEDE) program.

Because Comet is a shared computational resource in high demand, nearly all of the work performed on it takes the form of batch jobs submitted by its users to its job scheduler. To match this user job demand to its computational resources, Comet employs the Slurm Workload Manager, which relies on a fair-share scheduling algorithm to balance the efficient, fair use of resources on the machine against the competition for those resources by its users.

The aim of this research project is to analyze the scheduling efficiency of the current Slurm configuration on Comet. Your work may help determine whether or not changes to the current Slurm configuration can help improve overall system utilization long-term. In particular, your research will likely focus on a potential bottleneck identified in changes made to the scheduler configuration to support the recent deployment of Comet’s P100 NVIDIA GPU nodes. During the course of this project, you will learn how to:
  • manage and analyze GBs of raw data spread across tens to hundreds of thousands of files;
  • parse text and numerical data in Python;
  • compute the scheduling efficiency of a batch system;
  • use and interact with the Slurm batch job scheduler; and
  • run analysis workloads on Comet.
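
To give a flavor of what computing scheduling efficiency can look like, here is a minimal Python sketch. It assumes one common definition, node utilization (node-seconds used by jobs divided by node-seconds available in a time window), plus mean queue wait time; the project itself may define efficiency differently. The `Job` record and its fields are hypothetical and do not reflect the format of Comet's actual Slurm accounting data.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; all times are seconds on a common clock."""
    submit: float
    start: float
    end: float
    nodes: int


def utilization(jobs, total_nodes, window_start, window_end):
    """Fraction of available node-seconds in the window used by jobs."""
    capacity = total_nodes * (window_end - window_start)
    if capacity <= 0:
        raise ValueError("window and node count must be positive")
    used = 0.0
    for job in jobs:
        # Count only the part of the job that falls inside the window.
        overlap = min(job.end, window_end) - max(job.start, window_start)
        if overlap > 0:
            used += overlap * job.nodes
    return used / capacity


def mean_wait(jobs):
    """Average time jobs spent queued between submission and start."""
    if not jobs:
        return 0.0
    return sum(job.start - job.submit for job in jobs) / len(jobs)


if __name__ == "__main__":
    # Made-up example: a 4-node system observed for 100 seconds.
    jobs = [Job(submit=0, start=0, end=50, nodes=2),
            Job(submit=10, start=50, end=100, nodes=4)]
    print(utilization(jobs, total_nodes=4, window_start=0, window_end=100))
    print(mean_wait(jobs))
```

In this made-up example, the second job needs all four nodes and so cannot start until the first finishes, leaving two nodes idle for the first half of the window. Idle time of this kind is what a scheduling analysis tries to measure and explain.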

Number of students to be supported: 1

Name of lead person overseeing the student’s research: Marty Kandes, Ph.D., Computational & Data Science Research Specialist, HPC User Services Group, SDSC

Names, organizations, and roles of others who may provide additional mentoring for the student: Mahidhar Tatineni, Ph.D., Manager, HPC User Services Group, SDSC; Nicole Wolter, Computational & Data Science Research Specialist, HPC User Services Group; Trevor Cooper, Applications Programmer/Systems Engineer, HPC Systems Group

A description of the plan to integrate the student researcher into the group’s activities: Prior to beginning the REHS program, the selected student researcher will be provided with recommended programming exercises to help build the skills they will need to learn in order to successfully complete this project. Dr. Kandes will be available via email to provide guidance to the student on how to approach these exercises. During the first week of the REHS program, the student researcher will then work closely with Dr. Kandes to build a research plan that clearly defines the milestones of the project in order to meet its goals. Thereafter, the student researcher will be expected to largely work independently on their project, but meet regularly with Dr. Kandes each week to provide updates on their progress, ask questions, and discuss any technical issues they’ve encountered. During the REHS program, the student researcher will also likely interact with other members of User Services Group on portions of their project and be invited to attend regular group meetings.

List of student prerequisites for the research project: Applicants must have a demonstrated interest in computer science and mathematics and some previous programming experience.

Number of hours the student will be asked to work per week: 15 – 25 hours

Relevant web links:

http://www.sdsc.edu/News%20Items/PR20170201_Comet_10k.html
https://en.wikipedia.org/wiki/Batch_processing
https://en.wikipedia.org/wiki/Job_scheduler
https://en.wikipedia.org/wiki/Slurm_Workload_Manager
https://en.wikipedia.org/wiki/Utilization
http://www.sdsc.edu/News%20Items/PR20170502_Comet_GPU.html
https://en.wikipedia.org/wiki/Job_shop_scheduling

 

Multiscale Simulations in Chemistry and Biophysics
Dr. Andreas Goetz, San Diego Supercomputer Center, UCSD

In this REHS project, which will be supervised by Dr. Andreas Goetz at the San Diego Supercomputer Center, the students will contribute to our ongoing multiscale simulation software development, optimization, and simulation projects. The work can cover different aspects, depending on the background and interests of the applicant. This can include performance optimizations and benchmarks of performance-critical algorithms on multi- and many-core architectures (such as GPUs), or numerical simulations of catalytic events in biomolecules or at metal catalyst surfaces with quantum chemistry or biomolecular simulation codes. Depending on the progress of the project and the individual ability of the students, the interns will thus have the possibility to acquire skills in setting up and performing multiscale QM/MM simulations in a Linux-based operating system environment, analyzing performance data, and programming in scripting and compiled languages including Bash, Python, Fortran, and C, and possibly parallel programming with CUDA and MPI. As an essential part of the internship, the students will learn how to document their research, prepare research reports, and present their results to their peers. These skills are very important for a successful research and engineering career.

Number of students to be supported:
Up to 2 students, working on complementary aspects such as profiling and optimizations of different code paths or numerical simulations of different but related catalytic processes.

Person overseeing the student research experience:
Dr. Andreas Goetz

Plan to integrate students into research activities:
Dr. Andreas Goetz who is actively working on the model development, software implementation and simulation projects will closely supervise the students on a daily basis. In addition, the students will have the opportunity to interact with other high school students and undergraduate or graduate interns that will be working on other projects related to molecular simulations.

Student Prerequisites:
Software development and optimization work requires experience with a compiled language such as C, C++, or Fortran, and ideally (though not required) parallelization with OpenMP, MPI, CUDA, or OpenACC. Computational chemistry and biomolecular simulation projects require advanced classes in chemistry or biology. Exposure to scripting languages or familiarity with Linux and command line environments is expected. Specific knowledge of any of these areas is less critical than exceptional intellectual ability.

Number of work hours per week: 20 to 25

Relevant web links:
AMBER software: www.ambermd.org
ADF software: www.scm.com
Dr. Goetz’ web site: www.awgoetz.de

 

Elucidation of Common Molecular Mechanisms of Parkinson’s Disease, Alzheimer’s Disease, and Diabetes
Dr. Igor Tsigelny, Dept. of Neurosciences, San Diego Supercomputer Center, UCSD

Aggregation of specific proteins is associated with a number of diseases including Parkinson’s (PD), Alzheimer’s (AD), and other neurodegenerative diseases. Type II diabetes (T2D) is also linked to aggregation of islet amyloid polypeptide (IAPP). IAPP is found to be significantly accumulated as extracellular amyloid deposits in the pancreas of T2D patients, and early oligomers have been shown to be significantly cytotoxic species that lead to beta-cell apoptosis. The mechanisms through which IAPP leads to T2D are unclear. Various lines of evidence appear to support the contention that IAPP oligomers built in the early stages of aggregation form annular protofibrils and consequently pores in the membranes, the same way that amyloid-beta and alpha-synuclein create pores in the cell membrane in AD and PD, respectively. Increased calcium influx into the cells can lead to apoptosis in all of these diseases.

The specific tasks of the project are using computer modeling and simulations to accomplish the following goals: (1) To characterize the interactions between IAPP, amyloid beta, and alpha-synuclein molecules in oligomers, and interactions between molecules and the membranes. (2) To validate a concept of self-assembly of these molecules, using a set of mutants known to enhance or inhibit IAPP aggregation process with identifiable structural properties.

The task will include extensive analysis of research articles in the field related to modeling and simulation of the above-mentioned proteins and their interactions with the membrane. It will also involve computational molecular modeling and molecular dynamics simulations.

  1. Number of Students to be supported: 1
  2. Name of Lead person: Dr. Igor Tsigelny, Dept. of Neurosciences, SDSC
  3. Plan to Integrate Student into Group Activity

The student will be a part of the research team working on the larger scale project that includes the project described here. He or she will attend the group meetings and communicate with the team members using other methods of communication. The student will work closely with the lead person and the other personnel involved.

  4. Student Prerequisite
    We are looking for a student who is interested in both biology and computation.
  5. Number of hours per week: 15-20 hours
  6. Relevant link:
    San Diego Supercomputer Center: http://www.sdsc.edu

 

Text Classification via Supervised Machine Learning for an Issue Tracking System
Martin Kandes, System Administrator, San Diego Supercomputer Center, UCSD

Project Description:

Comet is SDSC’s newest supercomputer. The result of a $21.6M National Science Foundation (NSF) award, Comet delivers over 2.7 petaFLOPS of computing power to scientists, engineers, and researchers all around the world. In fact, within its first 18 months of operation, Comet served over 10,000 unique users across a range of scientific disciplines, becoming one of the most widely used supercomputers ever in NSF’s Extreme Science and Engineering Discovery Environment (XSEDE) program.

The High-Performance Computing (HPC) User Services Group at SDSC helps manage user support for Comet. This includes, but is not limited to, managing user accounts, answering general user inquiries, debugging technical problems reported by users, and making best-practice recommendations on how users can achieve high performance when running their scientific workloads on Comet. These interactions between Comet’s user community and the User Services Group are largely managed through email exchanges tracked by XSEDE’s internal issue tracking system. However, while Comet is expected to be available around the clock, user support is generally only provided during normal business hours. With such a large user community spread across nearly every time zone, the result is that a number of user support tickets submitted during non-business hours wait anywhere from 12 hours to several days for a response from the User Services Group.

The aim of this research project is to use supervised machine learning techniques to perform text classification on Comet’s user support tickets. If an efficient classification scheme can be developed, the User Services Group may eventually be able to provide automated email responses to some of the more common user issues reported during non-business hours. During the course of this project, you will learn how to:
parse text-based data in Python;
build and label a non-trivial dataset;
use supervised machine learning techniques;
run analysis workloads on Comet.
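
To give applicants a feel for what supervised text classification looks like in practice, here is a minimal sketch in Python. It is not the project's actual method or data: the example tickets, the two category labels ("account" and "job"), and the class name NaiveBayesTicketClassifier are all hypothetical, and the multinomial Naive Bayes model shown is just one common starting point, written with the standard library only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTicketClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Count how many tickets carry each label and how often each
        # word appears in the tickets of each label.
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-labeled example tickets (not real Comet data).
TRAINING_TICKETS = [
    ("I forgot my password and cannot log in", "account"),
    ("Please reset the password for my account", "account"),
    ("My login is locked after too many attempts", "account"),
    ("My batch job is stuck in the queue", "job"),
    ("The job failed with an out of memory error", "job"),
    ("How do I submit a job to the compute nodes", "job"),
]

texts, labels = zip(*TRAINING_TICKETS)
classifier = NaiveBayesTicketClassifier().fit(texts, labels)
print(classifier.predict("cannot log in, need a password reset"))
print(classifier.predict("job crashed with a memory error in the queue"))
```

With only six training examples this is a toy; a real ticket dataset would need many more labeled examples, a held-out test set to measure accuracy, and probably a larger set of categories.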

Number of students to be supported: 1

Name of lead person overseeing the student’s research: Marty Kandes, Ph.D., Computational & Data Science Research Specialist, HPC User Services Group, SDSC

Names, organizations, and roles of others who may provide additional mentoring for the student: Mahidhar Tatineni, Ph.D., Manager, HPC User Services Group, SDSC; Nicole Wolter, Computational & Data Science Research Specialist, HPC User Services Group; Subhashini Sivagnanam, Computational & Data Science Research Specialist, HPC User Services Group

A description of the plan to integrate the student researcher into the group’s activities: Prior to beginning the REHS program, the selected student researcher will be provided with recommended programming exercises to help build the skills they will need to successfully complete this project. Dr. Kandes will be available via email to provide guidance to the student on how to approach these exercises. During the first week of the REHS program, the student researcher will work closely with Dr. Kandes to build a research plan that clearly defines the milestones needed to meet the project’s goals. Thereafter, the student researcher will be expected to work largely independently on their project, but to meet with Dr. Kandes each week to provide updates on their progress, ask questions, and discuss any technical issues they’ve encountered. During the REHS program, the student researcher will also likely interact with other members of the User Services Group on portions of their project and be invited to attend regular group meetings.

List of student prerequisites for the research project: Applicants must have a demonstrated interest in computer science and mathematics, a basic understanding of probability and statistics, and some previous programming experience.

Number of hours the student will be asked to work per week: 15 – 25 hours

Relevant web links:
http://www.sdsc.edu/News%20Items/PR20170201_Comet_10k.html
https://en.wikipedia.org/wiki/Issue_tracking_system
https://en.wikipedia.org/wiki/Supervised_learning
https://en.wikipedia.org/wiki/Document_classification

Data Analytics Intern
Ange Mason, Education Program Manager, San Diego Supercomputer Center, UCSD

Overall Project

The Research Experience for High School Students (REHS) program is celebrating its eighth year. Almost 400 students have participated in this program since 2010. Where did they all go? What college did they attend? What career path did they take?

We would like to know whether the REHS program had any bearing on participants’ college selection or career path.

Working with the Spatial Information Systems Lab at SDSC and their data analytics software SUAVE, the selected students will formulate survey questions designed to collect as much relevant data as possible. The students will then use this data in SUAVE to develop their findings and, ultimately, their final project.

About the Spatial Information Systems Lab

The Spatial Information Systems Lab conducts research and develops technologies and infrastructure that enable users to access, integrate and manage spatial information. Application domains range from hydrology and environmental sciences to neuroscience. Working in collaboration with other SDSC R&D Labs and UCSD programs, we support spatial information processing and Web mapping in a variety of projects. Our main research focuses on cyberinfrastructure for managing observational data, distributed information systems, web services and XML schemas for standards-based data interchange, spatial data integration and GIS, and online data discovery and visualization.

Number of Students Requested: 1-2

Name of Lead person: Ange Mason, Education Program Manager, San Diego Supercomputer Center, UCSD

Plan to Integrate Student into Group Activity:

The intern will be a part of the summer student staff working to assist with social media outreach. He or she will attend the group meetings and share in weekly planning and logistical meetings. The student will work closely with the project lead and the other Education team members.

Student Prerequisite:

We are looking for a student who is self-driven and able to work with minimal supervision. The intern must be customer-service oriented and able to work with a diverse population. The intern must possess excellent written and oral communication skills. The intern must also be comfortable using a camera or video camera.

Number of hours per week: To be arranged (15-20)

Relevant Links:

San Diego Supercomputer Center:

http://www.sdsc.edu/

Guidance and Answers at Your Fingertips

Ange Mason of the San Diego Supercomputer Center (amason@ucsd.edu, 858 534-5064) and Lindy Wong (liw030@ucsd.edu) are available to provide guidance and answer any questions you might have. Just ask!