Back-End Web Development

Prerequisites: "Python for Data Science/" and "Linux for Bioinformatics/"

Note: This miniproject is experimental -- if you find flaws or have other suggestions, let Henry know.

As bioinformatics becomes more central to the study of biology, it is important to develop user-friendly tools so that biologists can run analyses without needing to code. Excellent examples of this are GEPIA2, Enrichr, and cBioPortal. While R Shiny and Plotly Dash are useful for quickly creating small web apps that serve a modest number of users, more complicated and scalable applications require a robust web framework.

In this miniproject, you are tasked with creating a REST API backend using Python and the Flask framework. I recommend starting with plain Flask, and then rebuilding the API with Flask-RESTful once you are more comfortable. The API must contain the following endpoints:

  1. <app_base_url>/api/gapminder

This endpoint should accept GET requests and should return all the data from gapminder_clean.csv in JSON format.
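A minimal sketch of this first endpoint, assuming gapminder_clean.csv sits in the app's working directory (the filename comes from the brief; the rest is illustrative):

```python
# Minimal sketch: one GET endpoint that returns the whole CSV as JSON.
# Assumes gapminder_clean.csv is in the working directory.
import pandas as pd
from flask import Flask, jsonify

app = Flask(__name__)

def load_data():
    # Re-reading per request is fine for a sketch; cache it in production.
    return pd.read_csv("gapminder_clean.csv")

@app.route("/api/gapminder", methods=["GET"])
def gapminder():
    # orient="records" yields one {column: value} dict per row
    return jsonify(load_data().to_dict(orient="records"))
```

Run it with `flask run` (or `flask --app app run` in newer Flask versions) and hit `/api/gapminder` in a browser to check the output.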

  2. <app_base_url>/api/country

This endpoint should accept GET requests whose query parameters match the columns of gapminder_clean.csv. The goal is to select the countries that meet the specified criteria and return them.

E.g., <app_base_url>/api/country?year=1962&co2-gt=10&continent=north-america (filter by CO2 > 10, continent is North America, and year is 1962) should return something like:

{"countries": ["Canada", "United States"]}
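One way to sketch this filtering. The column names here ("Year", "continent", "Country Name", and the CO2 column) are assumptions about what gapminder_clean.csv actually contains, and the hyphen-to-space continent mapping is a guess at the URL convention:

```python
# Sketch of /api/country: filter rows by query-string parameters, then
# return the matching country names. Column names are assumptions.
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
CO2_COL = "CO2 emissions (metric tons per capita)"  # hypothetical column name

@app.route("/api/country", methods=["GET"])
def country():
    df = pd.read_csv("gapminder_clean.csv")
    if "year" in request.args:
        df = df[df["Year"] == int(request.args["year"])]
    if "continent" in request.args:
        # map "north-america" in the URL back to "North America"
        wanted = request.args["continent"].replace("-", " ").title()
        df = df[df["continent"] == wanted]
    if "co2-gt" in request.args:
        df = df[df[CO2_COL] > float(request.args["co2-gt"])]
    return jsonify({"countries": sorted(df["Country Name"].unique().tolist())})
```

The `-gt` suffix convention (e.g. `co2-gt`) is one way to express "greater than" in a query string; you could equally support `-lt`, `-eq`, and so on.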

  3. Extend the <app_base_url>/api/gapminder endpoint so that it has the same filtering capability as in the previous step (2).

E.g., <app_base_url>/api/gapminder?year=1962&co2-gt=10&continent=north-america (filter by CO2 > 10, continent is North America, and year is 1962) should return all the 1962 data for Canada and the United States in JSON format.
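Since both endpoints now filter the same way, the filtering logic can live in one shared helper. A sketch, with the same assumed column names as before:

```python
# Sketch of factoring the filtering into one helper shared by
# /api/country and /api/gapminder. Column names are assumptions.
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
CO2_COL = "CO2 emissions (metric tons per capita)"  # hypothetical column name

def apply_filters(df, args):
    """Apply the year / continent / co2-gt query parameters to a DataFrame."""
    if "year" in args:
        df = df[df["Year"] == int(args["year"])]
    if "continent" in args:
        df = df[df["continent"] == args["continent"].replace("-", " ").title()]
    if "co2-gt" in args:
        df = df[df[CO2_COL] > float(args["co2-gt"])]
    return df

@app.route("/api/gapminder", methods=["GET"])
def gapminder():
    df = apply_filters(pd.read_csv("gapminder_clean.csv"), request.args)
    # full matching rows, not just country names
    return jsonify(df.to_dict(orient="records"))
```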

  4. Package your gapminder_clean.csv dataset in a SQLite database. Change your API code so that all queries run against the SQLite database and the .csv file is no longer needed.
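A sketch of the conversion and of querying the database afterwards. The table name "gapminder" and the column names are assumptions; pandas' `to_sql` handles the one-time load:

```python
# Sketch: move the CSV into SQLite, then query it with parameterized SQL.
import sqlite3

import pandas as pd

def csv_to_sqlite(csv_path="gapminder_clean.csv", db_path="gapminder.db"):
    """One-time conversion: load the CSV into a table named 'gapminder'."""
    df = pd.read_csv(csv_path)
    with sqlite3.connect(db_path) as conn:
        df.to_sql("gapminder", conn, if_exists="replace", index=False)

def countries_for_year(year, db_path="gapminder.db"):
    """Example query; '?' placeholders keep user input out of the SQL text."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            'SELECT DISTINCT "Country Name" FROM gapminder WHERE Year = ?',
            (year,),
        ).fetchall()
    return sorted(r[0] for r in rows)
```

Using `?` placeholders (rather than string formatting) matters here because the API passes user-supplied parameters straight into queries.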

  5. Deploy your application, along with any necessary databases, to AWS. You can use a free-tier AWS EC2 instance for this. See the instructions in the "Linux for Bioinformatics/" training. The only difference is that you will need to allow HTTP/HTTPS traffic when configuring your instance. You can find an example of this here.

  6. Set up nginx on your EC2 instance and use it to serve your Flask API. For a nice guide to setting this up, see here, with additional guidance here. NOTE: you won't be able to access any ports on the remote machine unless they were opened in the Security Group settings during EC2 setup.

  7. Create a Postman account and follow their tutorials to learn how to use it.

  8. Use Postman to test and document your API. The documentation should be sufficient for someone else to fully utilize the API.

  9. Share the API with Henry in Postman.

Resources for learning

This is a challenging task for those without web development experience. To get comfortable, it is recommended that you first learn Flask using the official tutorial here. It may also be useful to watch a comprehensive tutorial on building Flask APIs, like this one.

SQL can also be challenging; there is an excellent course on it in DataCamp. Postman is likewise difficult at first, so it would be wise to watch a tutorial about it here.