Wang Lab Bootcamp

There is a lot to learn when you start doing research in a lab that does so many different things, and it can be quite daunting. This page lists skills you may or may not already have and, more importantly, points to resources where you can gain them. The most important thing is for all of us to understand where you are technically and where your interests lie, so that we can find the right project where you can contribute and simultaneously grow both technically and as a scientist.

Computing Skills

Here is a list of high-level computing skills that are used in projects in the lab. Not every project requires all of these skills, but most require more than one.

It is good to have an assessment of your skills so that we can pair you with the right project and provide the right resources to get you up to speed when you need them.

  1. Python
  2. Linux Command Line
  3. Conda Environment Management
  4. NextFlow Workflow Proficiency
  5. CS Data Structures
  6. CS Algorithms
  7. Data Science - Data Manipulation
  8. Data Science - Data Visualization
  9. Containerization
  10. Web Tools
  11. Batch Computing
  12. Source Version Control
  13. Remote Computing

Python Language

You are familiar with the Python 3 programming language. This includes writing functions, installing dependencies (pip), creating modules, building command-line tools, and testing.

Here are a few resources to get you up to speed:

  1. https://wiki.python.org/moin/BeginnersGuide/Programmers
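
To make the skills above concrete, here is a minimal sketch of a single file that combines a function, a command-line interface built with the standard-library `argparse` module, and a test. The function name `ppm_error`, the argument names, and the example values are hypothetical and chosen only for illustration; they are not taken from any lab codebase.

```python
import argparse


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Return the mass error in parts per million (illustrative function)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def build_parser() -> argparse.ArgumentParser:
    """Describe the command-line arguments the tool accepts."""
    parser = argparse.ArgumentParser(description="Compute a ppm mass error.")
    parser.add_argument("observed", type=float, help="observed m/z")
    parser.add_argument("theoretical", type=float, help="theoretical m/z")
    return parser


def main(argv=None) -> float:
    """Parse arguments (sys.argv when argv is None), print and return the result."""
    args = build_parser().parse_args(argv)
    error = ppm_error(args.observed, args.theoretical)
    print(f"{error:.2f} ppm")
    return error


def test_ppm_error():
    """A small unit test; a 0.005 offset at m/z 500 is 10 ppm."""
    assert abs(ppm_error(500.005, 500.0) - 10.0) < 1e-6


# Demo: pass the arguments explicitly instead of reading them from the shell.
main(["500.005", "500.0"])
test_ppm_error()
```

To run a file like this from a shell, you would typically replace the demo call with an `if __name__ == "__main__":` guard that calls `main()`. A test runner such as pytest can pick up functions named `test_*` when the file follows its naming conventions.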

Linux Command Line

You are familiar with the Linux command line to run tools, manipulate files smoothly, and install packages.

Here are a few resources to get you up to speed:

  1. https://ubuntu.com/tutorials/command-line-for-beginners

Conda Environment Management

You are familiar with using Conda to manage development environments and their dependencies.

Here are a few resources to get you up to speed:

  1. https://conda.io/projects/conda/en/latest/user-guide/getting-started.html

NextFlow Workflow Development

You are familiar with the NextFlow workflow environment. You are able to:

  1. Run a nextflow workflow
  2. Write a nextflow workflow from scratch

Here are a few resources to get you up to speed:

  1. https://training.seqera.io/

Computer Science Data Structures

You are familiar with common computer science data structures, their properties, their advantages and disadvantages, and common algorithms that use them.

This is generally covered in CS 014 at UC Riverside.
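
As a rough illustration of why the choice of data structure matters, the sketch below uses a few of Python's built-in and standard-library structures on the same small list. The variable names and values are made up for the example, and the complexity notes in the comments are the usual textbook averages rather than measurements.

```python
import heapq
from collections import Counter, deque

values = [5, 3, 8, 3, 9, 1, 5, 3]

# set: stores unique items; membership tests are O(1) on average,
# compared with O(n) for scanning a list.
unique_values = set(values)
has_eight = 8 in unique_values

# dict (here a Counter): maps each key to a value with O(1) average lookup.
counts = Counter(values)
most_common_value, most_common_count = counts.most_common(1)[0]

# deque: O(1) appends and pops at both ends, which makes it a good FIFO queue.
queue = deque(values)
first_out = queue.popleft()
queue.append(7)

# heap: keeps the smallest item at the front; push and pop are O(log n).
heap = list(values)
heapq.heapify(heap)
three_smallest = [heapq.heappop(heap) for _ in range(3)]

print(has_eight, most_common_value, most_common_count, first_out, three_smallest)
```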

Computer Science Algorithms

You are familiar with algorithms and their application.

This is generally covered in CS 141 at UC Riverside.
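
One classic example of the kind of algorithm meant here is binary search, which finds an item in a sorted list in O(log n) steps by halving the search range each time. The sketch below is a plain textbook version; the function name `binary_search` and the sample data are chosen for illustration only.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            lo = mid + 1  # target can only be in the upper half
        else:
            hi = mid - 1  # target can only be in the lower half
    return -1


masses = [1, 3, 5, 7, 9, 11]
print(binary_search(masses, 7))
print(binary_search(masses, 4))
```

In everyday Python code, the standard-library `bisect` module provides the same halving strategy for sorted lists.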

Data Science - Data Manipulation

You are familiar with how to read and parse data and manipulate it, e.g. filtering, sorting, grouping, pivoting, melting, joining, cleaning, etc.

You can learn how to do this with the following resources:

  1. https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html
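
The operations named above can be sketched with pandas on a tiny table. This is only an illustration: the column names, sample labels, and intensity values are invented for the example, and it assumes pandas is installed in your environment.

```python
import pandas as pd

# Hypothetical long-format measurements: one row per sample/compound pair.
samples = pd.DataFrame({
    "sample": ["A", "A", "B", "B", "C"],
    "compound": ["x", "y", "x", "y", "x"],
    "intensity": [10.0, 0.0, 7.0, 3.0, 5.0],
})
# Hypothetical metadata: one row per sample.
metadata = pd.DataFrame({
    "sample": ["A", "B", "C"],
    "group": ["control", "treated", "treated"],
})

# Filtering: keep only rows with a nonzero intensity.
detected = samples[samples["intensity"] > 0]

# Sorting: highest intensity first.
ranked = detected.sort_values("intensity", ascending=False)

# Joining: attach the group label to each measurement.
annotated = detected.merge(metadata, on="sample", how="left")

# Grouping: total intensity per group.
group_totals = annotated.groupby("group")["intensity"].sum()

# Pivoting: one row per sample, one column per compound (missing pairs become NaN).
wide = samples.pivot(index="sample", columns="compound", values="intensity")

# Melting: back from wide to long format.
long = wide.reset_index().melt(
    id_vars="sample", var_name="compound", value_name="intensity"
)

print(group_totals)
print(wide)
```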

Data Science - Data Visualization

You are familiar with how to summarize and visualize data. This can include the following visualization techniques: histograms, 2D histograms, bar plots, scatter plots, box plots, etc.

You can learn how to get started with the following resources:

  1. https://seaborn.pydata.org/tutorial/introduction.html
  2. https://www.fireblazeaischool.in/blogs/data-visualization-using-plotly/

Machine Learning/Deep Learning

Resources to come!

Containerization, e.g. Docker, Kubernetes

You are familiar with how to containerize your applications with Docker and Docker Compose.

The following resources are a decent place to start:

  1. https://stackify.com/docker-tutorial/

Web Tools/Services

You are familiar with how to build interactive web applications. In our lab, we recommend using Dash and Flask.

The following resources are a decent place to get started:

  1. https://dash.plotly.com/

Additionally, the Wang Lab has its own templates for building these applications - check them out here

Batch Computing/Kubernetes

You are familiar with how to run large numbers of tasks in HPC environments like batch systems or Kubernetes.

Source Version Control

You are familiar with how to use source code version control, specifically Git and GitHub.

Please review the following topics if you are not familiar with them:

  1. Branches
  2. Pull Requests

As a quick reference, if you are working with an existing repository and want to contribute your own code on a branch, do the following:

# Create a branch
git branch my-new-branch

# Check out this branch
git checkout my-new-branch

# Stage the files you want to include in this branch
git add new_file.txt

# Commit the staged changes
git commit -m "Adding new file"

# Push the branch to GitHub
git push --set-upstream origin my-new-branch

Remote Computing

You are familiar with how to set up a remote workstation over SSH.

Development Environment

We recommend using VS Code for software development, as it is a powerful platform for text editing, debugging, and running software.

Mass Spectrometry Background

We want you to have an understanding of mass spectrometry when working with the lab. Resources to come!

TO DO !


Last update: May 3, 2023 21:43:20