AlphaCare is an open-source project that Keshav Boudaria and I have been working on for the past few weeks, built entirely on top of freely available open-source data, algorithms, and compute. In this first video of the AlphaCare series, I'll explain how we can use it to classify ECG data from patient heartbeats and accurately predict the likelihood of different types of heart disease, mainly arrhythmia. The goal of AlphaCare is to progressively improve its capabilities as a community until it can be used as a tool to treat and prevent the top 10 major diseases globally. Ultimately, we'd like to use it to treat the root cause of all diseases: aging. AlphaCare is a work in progress, and we have a lot of work to do together. I can't wait to learn and grow with all of you. Let's make a massive positive impact together!
In this guide, we will walk through a real-time demo of a healthcare project built with Python and deep learning to detect abnormalities in a patient's heartbeat signal. We will also use RapidAPI, an API hub, to look up APIs that display additional symptom information.
A tool like AlphaCare can serve as a visual aid for cardiologists during heart surgery, helping them visualize and diagnose the correct regions of the heart while avoiding unnecessary damage to surrounding tissue, which could ultimately save a patient's life.
This guide also shows how we built this tool and how you, too, can develop your own heart disease classifier using Python.
According to the WHO, heart disease is the number one cause of death globally. Over 75% of these deaths occur in low and middle-income countries.
The main reason for this massive number of deaths is that heart disease is detected and treated very late. In these low- and middle-income countries, most people can't afford a doctor, and those who can typically visit only once a year.
Going to the doctor back and forth will seem archaic within a decade. We need to think differently about the future. We can't wait for diseases to occur first and only then provide healthcare. Instead, we can prevent them from ever occurring in the first place using data and algorithms.
Speaking of the future, biometric devices are already reshaping how we do everything. Instead of visiting a doctor once a year, we'll be wearing biometric devices that generate millions of data points by monitoring hundreds of bodily signals called biomarkers.
These biomarkers can capture blood pressure, heart rate variability, basic metabolic measurements, and more, all in real time. The devices could take the form of a smartwatch, smart ring, heart monitor, smart belt, and eventually nanobots in our bloodstream.
Deep neural networks will be making health predictions using this data 24/7. For those who don't know, deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning.
The goal of AlphaCare is to use deep learning to eventually diagnose and prevent all diseases, including the root cause of all of them: aging. We'll progressively add new capabilities to the system across multiple guides to make that happen.
Before building a heart disease classifier, we need some background research to understand the problem. A quick search on Research Explorer for applied deep learning gives us access to a variety of heart disease papers. This biomedical literature can be hard for beginners to interpret. Traditionally, understanding it all might require a decade of schooling and $200,000 in tuition, something AI will let us avoid entirely in the future.
Let's use a tool called Scholarcy to make these research papers easier to understand. Scholarcy is an online article summarizer that reads research articles, reports, and book chapters in seconds and breaks them down into bite-sized sections, so you can quickly assess how relevant any document is to your work. It can be installed as a simple Chrome extension.
Scholarcy uses a statistical language model, transforming massive text data sets of domain-specific jargon into short, concise language. Using this, we can develop a mental model for a particular health problem and a history of previous technologies used to solve it.
Let's take the example of arrhythmia to understand the whole process. An arrhythmia is an abnormality of the heart's rhythm, in which the heart may beat too slowly, too quickly, or irregularly. These abnormalities range from a minor inconvenience or discomfort to a potentially fatal problem.
Let's now find some data sets. The best way is to look them up on university, government, or .org websites. Google Dataset Search can assist us here. We're looking for heartbeat data collected using an ECG, or electrocardiogram, device, a standard doctor's tool.
Once we find the required dataset, we look for features that can serve as input to our algorithm. We have a whole host of patient demographic data, and luckily this dataset has been labeled by a cardiologist into six categories, each a different type of heart arrhythmia.
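As a rough sketch (the file name and column layout below are placeholders, not the actual dataset schema), loading a labeled heartbeat dataset with pandas might look like this:

```py
# A minimal sketch, assuming a CSV where each row is one heartbeat:
# the last column holds the cardiologist-assigned arrhythmia label and the
# remaining columns hold the ECG samples. The file name and layout are
# hypothetical placeholders, not the actual dataset schema.
import pandas as pd

df = pd.read_csv("ecg_heartbeats.csv")

X = df.iloc[:, :-1].values   # ECG signal samples
y = df.iloc[:, -1].values    # arrhythmia class labels (6 categories)

print(X.shape, y.shape)
print(df.iloc[:, -1].value_counts())  # how many beats per class
```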
Typically with machine learning techniques like random forests and support vector machines, we have to select the right features. But with deep learning, the network learns which features are ideal for predictions given enough data. To be safe, though, let's remove some noise from this data.
If we research ECGs, we'll find two common forms of noise: powerline interference and electromyographic noise. The literature shows we can remove them using the wavelet transform technique to ensure accurate predictions.
It uses a collection of functions called wavelets or small waves, each at a different scale. This tells us which frequencies are present in our signal and when they occurred. It works with different scales: at a large scale where it analyzes the significant features, and at a smaller scale where it explores the more minor features.
It starts at the beginning of the signal and slowly moves the wavelet toward the end of the signal, a process known as convolution. Once we run this transform on our data, the noise will be removed, and we can more accurately predict irregularities.
```py
import numpy as np

# Wavelet class
class Wavelet:
    """Complex Morlet wavelet."""
    _omega0 = 5.0
    fourierwl = 4 * np.pi / (_omega0 + np.sqrt(2.0 + _omega0 ** 2))

    def wf(self, s_omega):
        # Heaviside step: keep only positive frequencies
        H = np.ones(len(s_omega))
        for i in range(len(s_omega)):
            if s_omega[i] < 0.0:
                H[i] = 0.0
        # Morlet wavelet in the frequency domain
        return 0.75112554 * np.exp(-(s_omega - self._omega0) ** 2 / 2.0) * H


if __name__ == "__main__":
    # Step 1 - define the wavelet
    # (Morlet, its constructor, getscales() and getdata() come from the full
    #  CWT module this snippet is excerpted from; they are not shown here.)
    wavelet = Morlet

    # Step 2 - define parameters
    maxscale = 4
    notes = 16
    scaling = "log"
    plotpower2d = True

    # Step 3 - set up some example data: sinusoids of two periods, 128 and 32
    Ns = 1024
    Nlo = 0
    Nhi = Ns
    x = np.arange(0.0, 1.0 * Ns, 1.0)
    A = np.sin(2.0 * np.pi * x / 128.0)
    B = np.sin(2.0 * np.pi * x / 32.0)
    A[512:768] += B[0:256]

    # Step 4 - wavelet transform the data
    cw = wavelet(A, maxscale, notes, scaling=scaling)
    scales = cw.getscales()
    cwt = cw.getdata()
```
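If you'd rather use an off-the-shelf library, here's a minimal denoising sketch using PyWavelets. This is not the AlphaCare code above, and the wavelet family, decomposition level, and threshold rule are common choices from the literature rather than the project's exact settings:

```py
# A minimal denoising sketch using PyWavelets (pip install PyWavelets).
# The wavelet family ('db4'), level, and universal threshold are assumptions.
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=4):
    # Decompose the signal into approximation + detail coefficients
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Estimate the noise level from the finest detail coefficients
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    threshold = sigma * np.sqrt(2 * np.log(len(signal)))
    # Soft-threshold the detail coefficients, keep the approximation
    coeffs[1:] = [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
    # Reconstruct the cleaned signal
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

# Example: clean a noisy synthetic "beat"
t = np.linspace(0, 1, 1024)
noisy = np.sin(2 * np.pi * 8 * t) + 0.3 * np.random.randn(len(t))
clean = wavelet_denoise(noisy)
```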
Since we have a one-dimensional signal, we could use a one-dimensional convolutional neural network, or CNN, the family of models that is state-of-the-art for image classification. Adding more dimensions can improve accuracy, but a three-dimensional CNN would require much more training data than we have. So let's start with a two-dimensional CNN.
We can transform our 1D data into a 2D representation and build a CNN with the Keras library. Each set of matrix operations, like convolution, pooling, and activations, compresses the data into a representation from which the network can make predictions.
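As a rough sketch, not the exact AlphaCare architecture (the six output classes match the labeled categories, but the input shape and layer sizes are illustrative assumptions), a small 2D CNN in Keras might look like this:

```py
# A minimal 2D CNN sketch for 6-class arrhythmia classification.
# The 128x128 single-channel input shape is an assumption for illustration.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(input_shape=(128, 128, 1), num_classes=6):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
model.summary()
```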
We can train our algorithm for free on Google Colab using their cloud GPU. Once training is complete, we can evaluate how the network learned over time and how well it classifies a given segment of the signal at each time step as an arrhythmia or not.
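Here is a hedged training sketch, assuming `model` from the CNN above and hypothetical arrays `X_train`/`y_train` holding the denoised, 2D-transformed beats and their one-hot labels (the epoch count and batch size are arbitrary illustrative values):

```py
# Assumes `model` from the CNN sketch above and hypothetical arrays
# X_train (beats as 2D "images") and y_train (one-hot labels).
import matplotlib.pyplot as plt

history = model.fit(
    X_train, y_train,
    validation_split=0.2,  # hold out 20% of the data to watch for overfitting
    epochs=20,
    batch_size=64,
)

# Plot how the network learned over time
plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.legend()
plt.show()
```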
We have already prepared this algorithm for you on Google Colab, so you can train your own model there. You can find the link to the Kaggle code here.
After training the model, we can view its classification and accuracy scores by testing it on a held-out part of our dataset. A few simple commands turn this into something more usable.
As in the previous steps, we have prepared code that shows you how to evaluate the model. You can refer to code line 89 for the Model Evaluation step.
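For reference, here's a hedged sketch of that evaluation step (`X_test` and `y_test` are hypothetical held-out arrays, not the notebook's exact variable names):

```py
# Assumes the trained `model` plus hypothetical held-out arrays
# X_test (2D-transformed beats) and y_test (one-hot labels).
import numpy as np
from sklearn.metrics import classification_report, accuracy_score

probs = model.predict(X_test)        # class probabilities per beat
y_pred = np.argmax(probs, axis=1)    # predicted arrhythmia class
y_true = np.argmax(y_test, axis=1)   # ground-truth class

print("Accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred))  # precision/recall per class
```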
Let's now build a web interface for our model. We can install the Flask framework for Python web apps and spin up a bare-bones web app with just a few commands in the terminal. First, clone this repository by running the following command:
```sh
git clone https://github.com/imfing/keras-flask-deploy-webapp.git
```
Then install all the Python dependencies by running the following command:
```sh
sudo pip install -r requirements.txt
```
This will download all of the project's dependencies. In the Flask app, we'll let the user upload an image using an image-processing library and some JavaScript. Once an image is uploaded, we'll call our model to make a prediction at a given URL and print the results into HTML.
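Here's a minimal sketch of what the prediction route could look like; the route path, model file name, class labels, and `preprocess()` helper are hypothetical placeholders rather than the repository's exact code:

```py
# A minimal sketch, not the repository's exact code. The route path, model
# file name, class names, and preprocess() helper are placeholders.
from flask import Flask, request, jsonify
from tensorflow.keras.models import load_model
from PIL import Image
import numpy as np

app = Flask(__name__)
model = load_model("arrhythmia_cnn.h5")          # hypothetical saved model
CLASSES = ["class_0", "class_1", "class_2",
           "class_3", "class_4", "class_5"]      # placeholder label names

def preprocess(img):
    # Resize the uploaded image to the model's assumed 128x128 input shape
    img = img.convert("L").resize((128, 128))
    arr = np.array(img, dtype="float32") / 255.0
    return arr.reshape(1, 128, 128, 1)

@app.route("/predict", methods=["POST"])
def predict():
    img = Image.open(request.files["file"].stream)
    probs = model.predict(preprocess(img))[0]
    return jsonify({"prediction": CLASSES[int(np.argmax(probs))]})

if __name__ == "__main__":
    app.run(debug=True)
```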
We have the option to add more symptom metadata by using RapidAPI, the world's largest API hub, to embed patient metadata into our app quickly.
We'll select the Symptom Checker API, and we can even test it out from the dashboard by giving it patient demographic information. It'll provide us with a Python snippet, which we can embed into our app.
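The generated snippet typically uses the requests library with RapidAPI's key and host headers. Here's a hedged sketch; the URL, host, and query parameters below are placeholders to be replaced with the real values from the Symptom Checker API's dashboard:

```py
# A hedged sketch of calling a RapidAPI endpoint with requests.
# The URL, host, and query parameters are placeholders; copy the real values
# from the Symptom Checker API's dashboard snippet.
import requests

url = "https://example-symptom-checker.p.rapidapi.com/symptoms"  # placeholder
headers = {
    "X-RapidAPI-Key": "YOUR_RAPIDAPI_KEY",
    "X-RapidAPI-Host": "example-symptom-checker.p.rapidapi.com",  # placeholder
}
params = {"age": 54, "gender": "male"}  # example patient demographics

response = requests.get(url, headers=headers, params=params)
print(response.json())
```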
This will provide cardiologists with even more helpful information. Once we test it out, we can see results on a webpage.
There are three things that we have learned from this guide: