Computational Biology and Bioinformatics | Data Science | Human | Canada | sukanta.21.saha@gmail.com | linkedin.com/in/sukanta-saha

How a biologist navigates his way through the data revolution

This is my first Medium article, so I would like to begin by talking a little bit about myself. I am currently studying Molecular, Cellular, and Developmental Biology with a focus on Bioinformatics at the University of Alberta, Canada. I had literally zero background in Computer Science before this. In fact, I am taking my first CS class right now, in my second year of University.

Photo by National Cancer Institute on Unsplash

Like most biologists, I lacked CS skills and was not the best at mathematics either. I took a Calculus class in my first year and came out of it with a grade point of 1.3. Yeah… But that is not really an accurate representation of my mathematical capabilities. It was the first year of University. I hated the Prof for overcomplicating most of the topics (I had learned those concepts on my own the previous year, when I was introduced to Calculus in high school and was awestruck by its elegance), and well, I was just reckless. It was your regular first semester of University, when you are trying out so many new experiences and Calculus is really not your first priority, to say the least. And I guess it's justified, because your first semester of college is reserved for being a little reckless; you won’t be able to afford to be so later on, or at least that’s how I see it. …


How AI just solved the 50-year-old grand challenge of protein folding and modeling

In 1953, when James Watson and Francis Crick, with the help of Rosalind Franklin's data, worked out the structure of the DNA double helix, they went down to a local pub in Cambridge and celebrated that they had “discovered the secret of life!”. Exactly 50 years later, in 2003, the Human Genome Project was completed. Humanity, at last, had deciphered its own blueprint, the blueprint that Mother Nature had used to create the very first organisms to resemble modern humans. Many thought this milestone was biology’s equivalent of Neil Armstrong’s first steps on the moon.

Photo by National Cancer Institute on Unsplash

For a quick Biology 101: the proteins that do most of the work in every living thing on Earth are built from about 20 standard organic molecules called amino acids (21 if you count selenocysteine). These molecules mix and match to form the many different kinds of biological structures that build us, our dogs, our cats, our plants, our food, and everything else in nature. What differs from one structure to another, because of these different combinations of amino acids, is shape, which ultimately plays a major role in dictating function. The specific sequences of these amino acids are determined by the DNA sequences that make up our genome. Theoretically, therefore, if we know the genetic code, we know the amino acid sequence it will lead to; if we know that, we should be able to make a somewhat accurate prediction of the protein's shape and, with a little more research, its functions. Abnormalities in these protein structures caused by mutations in our genes are often behind many health disorders. …
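The first step of that chain, going from a genetic sequence to an amino acid sequence, can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code, a clean coding strand made only of A, C, G, and T, and a reading frame that starts at the first base. The names `CODON_TABLE` and `translate` are made up for this example and do not come from any library.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
# Each letter is a one-letter amino acid code; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into a protein, stopping at a stop codon.

    Assumes the reading frame starts at index 0 and the sequence only
    contains A, C, G, and T (anything else raises a KeyError).
    """
    dna = dna.upper()
    protein = []
    # Walk the sequence three bases (one codon) at a time.
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # ATG -> M, GCC -> A, TAA -> stop
```

Going from the sequence to the sequence is the easy part. Predicting the folded shape from that string of letters is the hard problem the rest of the article is about.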


A schematic approach to understanding Clinical Trials and Drug Development

With Covid-19 affecting each of our lives in one way or another, all of us are eagerly waiting for a vaccine. To many outsiders, the clinical research process may seem confusing. So let's take a bird’s-eye view of how a drug gets from being discovered as a potential candidate to reaching the market, and how the people behind each stage are essential for ensuring the safety and efficacy of the drug.

Photo by JOSHUA COLEMAN on Unsplash

The major organizations involved are:

  1. Sponsors: The pharma company or the university that discovered the potential candidate molecules for a vaccine. Industrial sponsors typically do not require third-party funding, but academic sponsors like universities usually acquire funds through grants from agencies like the NIH. …


with an AI generated article — a brief introduction to NLP

Natural languages are the languages humans use to convey ideas to one another, in speech or in writing. These obviously include languages like English, ancient Greek, Chinese, and Dothraki, but do not include computer languages like Python or R. Ancient languages that use art as a form of expression may sit somewhere in the grey area between what is conventionally considered natural language and what is not. Any sort of processing of these natural languages, such as manipulation, curation, or analysis, is considered NLP.

Photo by NASA on Unsplash

NLP typically involves applications of Computer Science and Linguistics. The very first task we taught computers for NLP was breaking a sentence down into smaller components like the noun phrase or the verb phrase. These are further divided so that each word can be assigned its part-of-speech (PoS) tag in the context of the sentence. Hard-coding these rules into a program is tedious, but doable up to a certain level of complexity. …
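To get a feel for what hard-coding such rules looks like, here is a toy sketch in plain Python. Everything in it is an assumption made for illustration: the tiny word list, the suffix rules, the tag names, and the helper functions `tag` and `noun_phrases` are all invented for this example and are not how any real NLP library works.

```python
import re

# Hypothetical, deliberately tiny lexicon of words with a known tag.
LEXICON = {
    "the": "DET", "a": "DET", "an": "DET",
    "on": "PREP", "in": "PREP",
    "is": "VERB", "sat": "VERB",
}

# Hypothetical suffix rules, checked in order; anything else is called a noun.
SUFFIX_RULES = [("ing", "VERB"), ("ed", "VERB"), ("ly", "ADV"), ("ous", "ADJ")]


def tag(sentence):
    """Assign a rough part-of-speech tag to every word in the sentence."""
    tokens = re.findall(r"[A-Za-z]+", sentence.lower())
    tagged = []
    for token in tokens:
        if token in LEXICON:
            pos = LEXICON[token]
        else:
            # First matching suffix wins; default to NOUN when none match.
            pos = next(
                (p for suffix, p in SUFFIX_RULES if token.endswith(suffix)),
                "NOUN",
            )
        tagged.append((token, pos))
    return tagged


def noun_phrases(tagged):
    """Group runs of determiners, adjectives, and nouns into noun phrases."""
    phrases, current = [], []
    for word, pos in tagged:
        if pos in ("DET", "ADJ", "NOUN"):
            current.append((word, pos))
            continue
        # Any other tag ends the current run; keep it only if it has a noun.
        if any(p == "NOUN" for _, p in current):
            phrases.append(" ".join(w for w, _ in current))
        current = []
    if any(p == "NOUN" for _, p in current):
        phrases.append(" ".join(w for w, _ in current))
    return phrases


tagged = tag("The curious cat sat quietly on the mat")
print(tagged)
print(noun_phrases(tagged))
```

The rules break quickly: "bed" and "ring" would both be tagged as verbs because of their endings. Patching every exception like that by hand is exactly the tedium that limits this approach.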


Tokenization, Stopwords, Stemming, and PoS Tagging (with code) — Part 1

NLTK (Natural Language Toolkit) is the go-to library for NLP (Natural Language Processing) with Python. It is a really powerful tool for preprocessing text data before further analysis, for instance with ML models. It helps clean and structure raw text so that it can be converted into numbers, which a model can then easily work with. This is the first part of a basic introduction to NLTK for getting your feet wet, and it assumes some basic knowledge of Python.

Photo by Patrick Tomasso on Unsplash

First, you want to install NLTK using pip (or conda). The command is the same on both Mac and Windows: pip install nltk. If this does not work, try taking a look at this page from the documentation. Note, you must have at least version 3.5 …
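Before reaching for the library, it can help to see the ideas from the title (tokenization, stopword removal, and stemming) in plain Python. The sketch below is a simplified stand-in and not NLTK itself: the stopword set, the suffix list, and all function names are assumptions made up for illustration. NLTK ships far more careful versions of each step, such as word_tokenize, its stopwords corpus, and PorterStemmer.

```python
import re
from collections import Counter

# Hypothetical mini stopword list; NLTK's English list is much longer.
STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "on"}

# Crude suffixes to strip, checked in order. A real stemmer has many more rules.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stopwords(tokens):
    """Drop very common words that carry little meaning on their own."""
    return [t for t in tokens if t not in STOPWORDS]


def stem(token):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stopwords, then stem what is left."""
    return [stem(t) for t in remove_stopwords(tokenize(text))]


def bag_of_words(text):
    """Turn text into numbers: a count for each stemmed token."""
    return Counter(preprocess(text))


print(preprocess("The cats are chasing the mice and jumped on the boxes"))
print(bag_of_words("Cats chase cats"))
```

The last function shows the "text into numbers" idea in its simplest form: once every document is a set of token counts, a model has something numeric to work with.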


A blueprint of what the journey looks like

With recruiters listing a myriad of “preferred skills” in their job postings, learning Data Science can get quite overwhelming at times. Dividing the journey up into five chapters can provide a clearer picture of what lies ahead.

Photo by Joshua Earle on Unsplash

Chapter 1

Great! You have decided to learn the art of Data Science and now you want to take on the challenge. But which language should you pick? R or Python? In my previous article, I explain why, but for now, all you need to do is pick one and get started with it. Go to YouTube and search for tutorials on the language of your choice. …


Opinion

and why you are wasting your time

TL;DR

In short, what matters most as a beginner in Data Science is that you DO Data Science. So just go with either one of the languages and prioritize getting some projects done while sipping away at your choice of sugary beverage. That’s how you will learn the fastest.

Photo by Fernando Hernandez on Unsplash

While I may be tempted to just recommend Python straightaway (Python is my main language, but I do have some working knowledge of R), I want to present an unbiased evaluation of how effective the two languages are for a beginner. …


What's with all the buzz?

This may seem like a straightforward question, but I wouldn’t be writing an article on it if it really were. A quick Google search turns up the following definition from Wikipedia: “an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex”. Let's dissect this statement!

Photo by Michał Parzuchowski on Unsplash

The first three words that Wikipedia uses to describe it are “an interdisciplinary field”. Think of three broad fields that may help: Biology, Computer Science, and Data Science. But what exactly are the disciplines within these three fields? When it comes to applications in the context of Biology, the current realm includes everything from Molecular Biology, Genetics, Genomics, Proteomics, and Metabolomics to Systematics, Evolution, Pharmacology, Biomedicine, and Health Sciences. This is not an exhaustive list. …
