【Data Science Project】 3D SARS-CoV-2 Protein Visualization With Biopython

16 min read · Sep 26, 2021
A coronavirus uses a protein on its membrane (shown here in red in a molecular model) to bind to a receptor (shown in blue) on a human cell and enter the cell. Once inside, the virus uses the cell’s machinery to make more copies of itself. (Juan Gaertner / Science Source)


In the life sciences, visualization is particularly important for making sense of challenging data from cutting-edge experimental techniques such as 3D genomics, spatial transcriptomics, 3D proteomics, epiproteomics, high-throughput imaging, and metagenomics (O’Donoghue, 2021). Data visualization is how research gets communicated. It is no longer just an aesthetic option; it is emerging as a critical sub-discipline of both the life sciences and data science.

With so much attention on machine learning and deep learning for building artificial intelligence systems, data visualization is often overlooked. However, a key idea in biology is that structure determines function, i.e., how a protein is folded determines what it does. For example:

  • Collagen: a fibrous protein found in the skin. It is shaped like a rope and gives the skin its strength so that it won’t tear.
Collagen: a fibrous protein found in the skin. CC BY-SA 3.0, (source)
  • Hemoglobin: a globular protein used to transport oxygen in the blood.

