Deep Learning in Medicine

• Classification: output is a category
• Regression: output is a continuous value (continuous vs. discrete output is what distinguishes regression from classification)
• Anomaly detection: e.g., flagging abnormal EEG data
• Synthesis and sampling: generating new samples from the data distribution
• Density estimation or probability mass function estimation, i.e., modeling clusters of the data distribution

Performance Measure, P:

• Accuracy of predicted labels compared to true labels
• Mean squared error between targets and predictions
• Likelihood: probability the model assigns to the true outcome label of each sample
• For classification, the loss is a function of the model parameters, expressed through the conditional probability P(y | x)
• Samples are assumed i.i.d. (independent and identically distributed)
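The performance measures above can be computed directly; a minimal sketch with made-up predictions and labels (all values here are illustrative, not from any real model):

```python
import numpy as np

# Hypothetical predicted vs. true labels for a binary classifier.
y_true = np.array([1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1])

# Accuracy: fraction of predicted labels that match the true labels.
accuracy = np.mean(y_true == y_pred)

# Mean squared error between continuous targets and predictions.
t = np.array([2.0, 0.5, 1.0])
p = np.array([1.5, 0.5, 2.0])
mse = np.mean((t - p) ** 2)

# Likelihood: probability the model assigns to the true label of each sample;
# the log-likelihood sums the per-sample log probabilities (i.i.d. assumption).
probs_of_true = np.array([0.9, 0.8, 0.4, 0.7])
log_likelihood = np.sum(np.log(probs_of_true))
```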

Machine learning formulation:

• input: X
• output: Y
• evaluation: loss

Supervised Learning - Classification

• input X: continuous or categorical vector, matrix, or tensor
• output Y: categorical label
• task fθ(x): some function f that computes the probability of each class for each sample
• evaluation loss: e.g., cross-entropy loss
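A minimal sketch of this setup: a linear model followed by a softmax gives per-class probabilities, and cross-entropy measures how much probability was assigned to the true class. The data shapes and parameters here are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def f_theta(x, W, b):
    # Linear model + softmax: probability of each class for each sample.
    return softmax(x @ W + b)

def cross_entropy(probs, y):
    # Negative log probability of the true class, averaged over samples.
    return -np.mean(np.log(probs[np.arange(len(y)), y]))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # 4 samples, 3 features (illustrative)
W = rng.normal(size=(3, 2))   # 2 classes
b = np.zeros(2)
y = np.array([0, 1, 1, 0])

probs = f_theta(X, W, b)      # rows sum to 1
loss = cross_entropy(probs, y)
```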

Supervised Learning - Regression

• input X: continuous or categorical vector, matrix, or tensor
• output Y: continuous target
• task fθ(x): some function f that computes the target for each sample
• evaluation loss: mean squared error loss, adversarial loss, etc.
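The simplest instance of this setup is linear regression with a mean squared error loss; the data and parameter values below are illustrative (chosen so the model fits exactly):

```python
import numpy as np

def f_theta(x, w, b):
    # Linear regression: a continuous target for each sample.
    return x @ w + b

def mse_loss(y_hat, y):
    # Mean squared error between predictions and targets.
    return np.mean((y_hat - y) ** 2)

X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w, b = np.array([2.0]), 0.0   # parameters that fit y = 2x exactly

loss = mse_loss(f_theta(X, w, b), y)
```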

Supervised Learning - Structured Output

• input X: continuous or categorical vector, matrix, or tensor
• output Y: continuous or categorical vector, matrix, or tensor
• task fθ(x): some function f that computes a vector/matrix/tensor for each sample
• evaluation loss: a combination of losses, e.g., element-wise classification or regression losses

Unsupervised Learning - Density Estimation

• input X: continuous or categorical vector, matrix, or tensor
• output Y: P(X)
• evaluation loss: log-likelihood of observing the Xs as they are
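The simplest density estimator is a single Gaussian fit by maximum likelihood; the data values below are illustrative. The fitted density is then scored by the log-likelihood of the observed Xs:

```python
import numpy as np

# Illustrative 1-D data.
X = np.array([1.0, 2.0, 3.0, 4.0])

# Maximum-likelihood estimates for a Gaussian.
mu = X.mean()       # MLE of the mean
sigma2 = X.var()    # MLE of the variance

def log_density(x, mu, sigma2):
    # Log of the Gaussian probability density at x.
    return -0.5 * (np.log(2 * np.pi * sigma2) + (x - mu) ** 2 / sigma2)

# Evaluation: log-likelihood of observing the Xs as they are.
log_likelihood = log_density(X, mu, sigma2).sum()
```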

Unsupervised Learning - Denoising

• input X (noisy): continuous or categorical vector, matrix, or tensor
• output X: denoised continuous or categorical vector, matrix, or tensor
• task fθ(x): some function that returns an output identical in size to the input but satisfying desirable constraints (e.g., smoothness)
• evaluation loss: e.g., reconstruction error between the output and the clean X
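A minimal sketch of denoising, using a moving-average filter as the (very simple, hand-designed) denoiser in place of a learned model; the signal and noise level are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 2 * np.pi, 100))     # hypothetical clean signal
noisy = clean + rng.normal(scale=0.3, size=100)    # corrupt it with noise

def f_theta(x, k=5):
    # Moving-average denoiser: output is identical in size to the input.
    kernel = np.ones(k) / k
    return np.convolve(x, kernel, mode="same")

denoised = f_theta(noisy)

# Evaluation loss: reconstruction error against the clean signal.
mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
```

A learned denoiser (e.g., a denoising autoencoder) would replace `f_theta` with a parameterized network trained to minimize this reconstruction error.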

Gradient descent: iterative optimization that repeatedly steps in the direction of the negative gradient of the loss; used to find a local minimum.
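A minimal sketch of gradient descent on a toy quadratic loss L(w) = (w - 3)², whose minimum is at w = 3 (the loss, learning rate, and step count are illustrative):

```python
def grad(w):
    # Gradient of L(w) = (w - 3)^2 with respect to w.
    return 2 * (w - 3)

w = 0.0      # initial parameter value
lr = 0.1     # learning rate (step size)
for _ in range(100):
    w -= lr * grad(w)   # step against the gradient

# w has converged very close to the minimizer w = 3
```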

Solutions to underfitting/overfitting:

• randomly sample and set aside a test set; the rest of the data becomes the training set
• optimize the loss function on the training set only
• better: use three sets - a training set, a validation set (for model selection and hyperparameter tuning), and a test set (for final evaluation only)
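The three-way split above can be sketched as follows; the dataset size and the 70/15/15 split fractions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
indices = rng.permutation(n)   # random sampling avoids ordering bias

# 70% training / 15% validation / 15% test.
train_idx = indices[:70]
val_idx = indices[70:85]
test_idx = indices[85:]
```

The model is fit on `train_idx`, hyperparameters are chosen on `val_idx`, and `test_idx` is touched only once, for the final performance estimate.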