When it comes to image analysis, Artificial Intelligence, or more specifically convolutional neural networks (CNNs), works incredibly well. This isn’t surprising, since this is roughly what is going on in our own minds, unconsciously, fortunately! Think about how good we are at making sense of what we see. We are amazing, right? We can effortlessly tell thousands of faces apart, yet when we think about it, how could we describe in words what makes these faces different? It would be close to impossible, yet we all do it subconsciously and with ease.
So, take any piece of analysis you do by looking at an image, from diagnosing cancer from the appearance of a mole to checking if your cells in culture are healthy. If you, a human, can do it, so can a computer. Of course, this is easy to say; it’s just a bunch of words on a page. But how do we actually get there? There are a few simple questions you can ask yourself to find out if you are taking a computer’s job!
Firstly, think about the possibility of success. Computers are not omniscient beings. For example, predicting the weather from pictures of cats and dogs probably isn’t possible. However, predicting whether an image contains a cat or a dog is. Think about the likelihood of the right information actually being in the image. Do images of cats and dogs contain enough information to predict the stock market? Probably not. But do they contain enough information to predict whether there is a cat, a dog or both in the image? Of course they do, since this can easily be done by a human. Please forgive the silly examples, but I’m sure you see what I’m getting at.
Another way of looking at the above is to ask, “can the analysis be done by a human?” If so, then the answer to whether a computer can do it is a resounding yes! You are taking a computer’s job. Poor computer. Given enough examples, which I will elaborate on later, this is possible.
If the job cannot be done by a human, can it still be done? This is where your own expertise comes in! If you feel the information might be in the image, then it might well be possible. A nice example of this is predicting whether a cell is dead or alive just from an image. If you are an experienced cell scientist, you may know that at least some aspects of cell morphology (size, shape, bumpiness etc.) correlate with cell death. So, it seems likely that a model that can pull all this information out of an image and look at it from, say, a 10-dimensional perspective stands a good chance of doing well.
The beauty of CNNs is that you don’t have to explain that things like the size, shape and bumpiness of a cell are useful bits of info for predicting cell death. If the info is there, the model will learn to pull out the best features and consider them together, in a high-dimensional way, to come to the right result.
A quick note on dimensions! Why are computers good at modelling? Well, one reason is that they can very easily ‘think’ in 3, 4, 5 or many more dimensions. We can only really visualise in 3 dimensions. This makes sense, since we evolved to survive in a 3-dimensional world. At the end of the day, we are smart, but we are no smarter than we need to be to survive!
So, in summary, ask yourself the following questions.
Question 1: Can a human do it? If yes, then a computer certainly can.
If no, question 2 is: From your experience, do you think the information might be in the image? If yes, then a computer can most likely do it. It’s certainly worth a shot.
If the answer to the above is “no”, then question 3 is: Is it worth the time and money to generate a dataset to see if it is possible? If the model works, would it be a huge advantage for you? If so, it may well justify taking a punt. If not, then maybe consider a simpler version of the problem and begin the question cycle once more!
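For those who like to see things written down as logic, the three questions above can be sketched as a tiny Python function. This is only an illustration of the order in which the questions are asked; the function name, its arguments and the wording of the verdicts are made up for this example.

```python
def worth_trying(human_can_do_it, info_likely_in_image, payoff_justifies_dataset):
    """Walk through the three questions in order and return a short verdict.

    All three arguments are plain True/False answers that you supply yourself.
    """
    # Question 1: can a human do it?
    if human_can_do_it:
        return "yes: a computer should be able to do it too"
    # Question 2: from your experience, might the information be in the image?
    if info_likely_in_image:
        return "probably: it is worth a shot"
    # Question 3: would a working model justify the cost of a dataset?
    if payoff_justifies_dataset:
        return "maybe: take a punt and build a dataset"
    # Otherwise, simplify the problem and start the cycle again.
    return "simplify the problem and ask again"


# Example: a human cannot do it, but an expert suspects the information is there.
print(worth_trying(False, True, False))
```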
From problem to solution
If you have decided it is worth going down this route, the next step is turning “problems” into “datasets”. To do this, you must start with a well-defined problem. A problem needs to be defined as objectively as possible. A good example of this is predicting whether a cell is “alive” or “dead”. This is reasonably objective, in that things like dyes can be used to assess whether a cell is alive or dead. The model’s predictions can then be compared against these standard methods and tested.
An example of a poorly defined problem is “is my cell good or bad?” Indeed, how would one even go about labelling a dataset as good or bad? How would one explain to a colleague how to do this? Ideally there needs to be something objective about it. For example, if you were labelling a dataset, could you explain to a total (but very intelligent) stranger how to label the images just as well as you could? In a way, a computer is an exceptionally intelligent stranger, in the sense that it is a total stranger to what you are trying to do.
Once a problem is well defined, it is easy to turn it into a dataset. For example, pictures of single cells can be given labels of “alive” and “dead”. This could be done manually or automatically. For instance, an additional stained image could be used as the target: cells above a certain fluorescence threshold are labelled as “dead”, and all others as “alive”.
All you need to train a model are examples of inputs, e.g. a lot of pictures of cats and dogs, and the desired output for each one, e.g. a label of “cat” or “dog” which you want the model to return. Simple learning by example. That’s all it is. You just need good examples.
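As a minimal sketch of the automatic route, the thresholding rule can be written in a few lines of Python. It assumes you have already measured one stain intensity per cell; the function name, the example intensities and the threshold of 0.5 are all hypothetical, and in practice the threshold would come from your own controls.

```python
def label_cells(stain_intensities, threshold):
    """Label each cell "dead" if its stain intensity is above the threshold,
    otherwise "alive"."""
    return ["dead" if value > threshold else "alive" for value in stain_intensities]


# Hypothetical per-cell fluorescence measurements, scaled to the range 0 to 1.
intensities = [0.05, 0.82, 0.40, 0.91]
labels = label_cells(intensities, threshold=0.5)

# Pair every measurement with its label, ready to attach to the cell images.
for value, label in zip(intensities, labels):
    print(value, label)
```

The point is that the labels come from an objective rule, so anyone (or any script) applying it to the same images would produce the same dataset.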
Another example of a nice problem is colourising black and white photos, because it too can be turned into a dataset. The dataset is made up of what you expect the input to be, e.g. a black and white photo, and what you want the output to be, a colour version of the input image.
If your problem can be turned into a dataset, then the model can be built. Generally, a large number of input examples and their paired output examples are all that is required.
So, in summary, if you can answer yes to these questions, you should be saving your own precious brain and getting computers to do the work!
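A nice property of the colourisation problem is that the dataset can be generated automatically: take colour photos, remove the colour, and you have input and output pairs. The sketch below shows the idea in plain Python on a tiny made-up image stored as rows of (red, green, blue) values. The function names are hypothetical, and the weights are the commonly used ITU-R BT.601 luminance weights, which are an assumption of this example and not something specific to any particular tool.

```python
def to_greyscale(colour_image):
    """Convert rows of (r, g, b) pixels (each 0-255) into rows of grey values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in colour_image
    ]


def make_pairs(colour_images):
    """Build (input, output) examples: the grey image in, the colour image out."""
    return [(to_greyscale(image), image) for image in colour_images]


# A made-up 2 x 2 "photo": red, green, blue and white pixels.
tiny_photo = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

pairs = make_pairs([tiny_photo])
grey_input, colour_target = pairs[0]
print(grey_input)
```

With real photos you would use an image library in place of nested lists, but the shape of the dataset is the same: one input and one desired output per example.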
- Do you feel the problem is solvable?
- Can you define the problem well?
- Can the problem be turned into a dataset?
You need not worry about the deeper details of “dataset to model”. That’s what CellAi is here for.