Learning Machines: W3 Class Notes

Graph theory describes relationships between elements:

  • vertices (nodes) are entities; edges represent the relationship between nodes
  • can represent grouped paths in Photoshop, a computer program, etc.
  • perceptron is a directed graph (info flows in one direction)
  • recurrent neural networks allow cycles (loops) in their flow of info

Perceptron Implementation Notes:

  • sign activation function: if number is greater than zero, output is 1; less than zero, output is -1
  • bias input: always equal to one
  • supervised training procedure:
    • start with random weights (i.e., the first predictions are guesses)
    • have perceptron guess outputs; compare to actual known outputs
    • compute the error; adjust all weights accordingly
    • repeat
  • HW:
    • construct three data sets (AND, OR, XOR) and train the perceptron on each (AND and OR should reach 100% accuracy)
      • extra column of 1’s for bias input
      • input is 2 columns (3 for bias input), output is 1 column
      • AND set: true = 1, false = -1; two column pairs
        • input [1,1], output = [1]; input [1, -1], output [-1], etc
      • OR set
      • XOR set will not reach 100% accuracy (probably around 50%)
    • class Perceptron
      • initializer function (number_of_input_dimensions, num_of_output_dimensions)
        • weight = np.random.rand(num_input)
      • predict(inputs)
        • return array of output predictions
      • training function(iterations, inputs, known outputs)
        • for iter in range(num_iters):
          • predict = (
      • myPerceptron = Perceptron()
        • myPerceptron.train()
        • myPerceptron.predict()
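
The class outline above could be filled in roughly as follows. This is only a sketch under a few assumptions of mine, not the assignment's reference solution: the `seed` and `learning_rate` parameters, the `accuracy` helper, and the choice to update the weights after each individual example are all hypothetical details that the notes do not specify.

```python
import numpy as np


class Perceptron:
    """Single-layer perceptron with a sign activation."""

    def __init__(self, num_inputs, num_outputs=1, seed=0):
        # seed is an assumption, added so that runs are repeatable
        rng = np.random.default_rng(seed)
        # one weight per input column (bias column included) per output
        self.weights = rng.uniform(-1, 1, size=(num_inputs, num_outputs))

    def predict(self, inputs):
        # sign activation: 1 if the weighted sum is positive, otherwise -1
        return np.where(inputs @ self.weights > 0, 1, -1)

    def train(self, num_iters, inputs, targets, learning_rate=0.1):
        for _ in range(num_iters):
            for x, target in zip(inputs, targets):
                # error is 0 for a correct guess, +2 or -2 for a wrong one
                error = target - self.predict(x)
                # nudge each weight in proportion to its input and the error
                self.weights += learning_rate * np.outer(x, error)


def accuracy(model, inputs, targets):
    return float(np.mean(model.predict(inputs) == targets))


# two input columns plus a bias column of 1's
inputs = np.array([[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]], dtype=float)
data_sets = {
    "AND": np.array([[1], [-1], [-1], [-1]], dtype=float),
    "OR": np.array([[1], [1], [1], [-1]], dtype=float),
    "XOR": np.array([[-1], [1], [1], [-1]], dtype=float),
}

results = {}
for name, targets in data_sets.items():
    myPerceptron = Perceptron(num_inputs=3)
    myPerceptron.train(100, inputs, targets)
    results[name] = accuracy(myPerceptron, inputs, targets)
    print(name, results[name])
```

AND and OR should both print 1.0; the XOR score depends on where the weights happen to be when training stops, but it cannot reach 1.0.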

Linear separation:

  • Exclusive Or: (a OR b) AND (NOT (a AND b))
    • The two inputs depend on each other, whereas in the AND and OR models neither node needs to know about the other
      • not linearly separable, unlike AND and OR
  • If a perceptron is learning whether pixels compose a picture of a person, it asks each pixel whether it may be part of a picture of a person; if more than 50% say yes, the output is yes, the picture is of a person
    • does not account for interdependency of pixels
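
One way to see the linear separation point is to try many candidate weight settings and record the best accuracy a single sign-activation unit can reach on each truth table. The sketch below does that by brute force; the grid of candidate weights is an arbitrary choice of mine, and the -1/1 encoding with a bias column follows the homework notes.

```python
import itertools

import numpy as np

# two input columns plus a bias column of 1's
inputs = np.array([[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]])
targets = {
    "AND": np.array([1, -1, -1, -1]),
    "OR": np.array([1, 1, 1, -1]),
    "XOR": np.array([-1, 1, 1, -1]),
}

# candidate weights on a coarse grid (an illustrative choice)
grid = np.linspace(-2, 2, 9)

best = {}
for name, target in targets.items():
    best[name] = 0.0
    for weights in itertools.product(grid, repeat=3):
        # each weight triple defines one straight line through the input plane
        predictions = np.where(inputs @ np.array(weights) > 0, 1, -1)
        best[name] = max(best[name], float(np.mean(predictions == target)))

print(best)  # {'AND': 1.0, 'OR': 1.0, 'XOR': 0.75}
```

No straight line gets all four XOR rows right; the best any line can do is three out of four.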

Calculus Primer:

  • Calculus is about approximating the analog world
  • derivative: rate of change in some phenomenon
    • power rule: multiply the coefficient by the power, then reduce the power by 1
      • derivative of x^2 is 2x
    • chain rule: the derivative of f(g(x)) is f'(g(x))g'(x) >> for a nested function, able to compute derivative by splaying it out
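
Both rules can be sanity-checked numerically by comparing them against a finite-difference estimate of the slope. This is a small sketch; the `numerical_derivative` helper, the test point x = 3, and the choice of sin(x^2) as the nested function are my own illustrative assumptions.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    # central difference approximation of the slope of f at x
    return (f(x + h) - f(x - h)) / (2 * h)


x = 3.0

# power rule: the derivative of x^2 is 2x
power_numeric = numerical_derivative(lambda t: t ** 2, x)
power_exact = 2 * x

# chain rule with f(u) = sin(u) and g(x) = x^2:
# the derivative of sin(x^2) is cos(x^2) * 2x
chain_numeric = numerical_derivative(lambda t: math.sin(t ** 2), x)
chain_exact = math.cos(x ** 2) * 2 * x

print(power_numeric, power_exact)
print(chain_numeric, chain_exact)
```

Each pair of printed numbers should agree to several decimal places.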


On the new breathing exercise: since this is supposed to slow the heart rate and put the body into parasympathetic mode, I tried this on a night I couldn’t get to sleep. It was difficult to regulate my breathing so that I didn’t run out of air during the 8-count exhalation, which put a lot of pressure on the 4-count inhalation, making the entire exercise very stressful. I don’t have the best lungs either, which was probably a detriment to my practice.


  • What are the best consumer-grade EEG devices? Emotiv, BioSemi (active shielding, micro amplifier on each electrode)
  • What factors influence the accuracy of EEGs? accuracy of sensors, number of sensors
  • Resources on campus? Classes? Participate as a subject, assistant at 6 Washington Place, back to right-hand side, behind elevator, yellow posters; neuroimaging; free software called EEGLAB

Learning Machines w2 hw: k-means clustering

Not gonna lie, I pair-programmed this with a software engineer who specializes in machine learning. I was able to code the initial setup just fine, but somehow failed to make the connection that the data points were vectors, and all the calculation required was just vector math. Being new to Python, and having forgotten all my high school math, I found this assignment pretty overwhelming, but with some help I feel like I now understand k-means clustering pretty well.

Would like to update this later so that the data points change color based on which cluster they belong to. The plots currently are not very legible.

Steps taken to group 50 data points into 4 clusters.

Steps taken to group 100 data points into 5 clusters.

Full code here: https://github.com/xujenna/learning_machines/blob/master/kmeans_clustering.py

ICM w2 hw

Well that was fun.

Don’t have much to say about this. Just checked off the following requirements:

  • One element controlled by the mouse: background color
  • One element that changes over time, independently of the mouse: movement of facets
  • One element that is different every time you run the sketch: positions of sparkles

w2 Transtech: class notes

parasympathetic vs sympathetic systems

exercise: inhale for 4 counts, exhale for 8 counts
> slows down heart rate, puts body into parasympathetic system, lowers cortisol levels

Allostatic load: good, tolerable and toxic stress
Allostasis system tries to regulate

norepinephrine is low: can’t identify what is important
dopamine is low: can’t sustain attention but can see what is important; low sense of reward, no interest in life

mental quiescence/absorption meditations:
> yoga, TM (repeat mantra with least amount of effort possible, enjoy pleasure of meditation; switches into parasympathetic system), shamatha (focus on breathing, attention stabilizes)
> you don’t experience outside perceptions, body, etc, just what you are focusing on
> can come into contact with the “deepest part of yourself”
> decreased sympathetic and increased parasympathetic ANS
>> decrease in heart rate and blood pressure, but increase in cerebral activity (in alpha and theta bands)
>> with repeated practice, this type of meditation leads to the establishing of internal metabolic rest as the baseline state of the organism

Tonic alertness: controlled parasympathetic system
Mindfulness/zen elicit heightened parasympathetic activity and tonic alertness

deity yoga, deity meditation, chakra meditations, etc lead to heightened sympathetic activation and phasic alertness

relaxation meditations lower stress but do not affect cognitive function


Current Imaging Modalities

  • Hemodynamic/Metabolic
    • PET
    • fMRI
  • Nuclear
    • MRS
  • ElectroMagnetic
    • MEG
    • EEG / iEEG-ECoG (intracranial)
  • Optical (tracks blood flow using laser, or makes cells responsive to certain lights)
    • NIRS
    • Optogenetic

Electromagnetic methods:

  • high temporal resolution ~ 1ms (can see how signal unfolds)
  • Can’t tell exactly where signal is coming from
  • high density of electrodes allows mathematical analysis to give idea of where signals are coming from
  • picks up a lot of “garbage”: signals from the environment, muscles around the skull (gamma, hard to clean out), eye movements (blinking, moving eyes left to right creates strong magnetic field, darting of eyes), heart rate
  • Frequencies: the higher the frequency, the weaker the signal
    • delta waves: roughly 1-4 hertz, slower and dominate brain while it’s inactive (deep dreams, comas)
      • deep sleep: memories consolidated
      • REM: cleaning out info
    • theta frequencies:
      • hippocampus: broadcasts theta, carries memories
      • prefrontal: cognitive activity (paying attention)
      • posterior channels: alpha/theta during creativity
      • daydreaming, not paying attention
    • alpha: most predominant frequency in brain
      • thalamus: uses it to organize other parts of brain
      • prefrontal: suppresses non-preferred parts of experience
      • idling signal
      • posterior: alpha increases when you close your eyes
    • beta: (13-35 hertz) most associated with everyday thinking, language processing
      • motor areas are mostly beta
      • prefrontal: “high beta stress”
    • gamma range (35-80+ low gamma, 80-250+ high gamma):
      • difficult to pick up w/ eeg because of muscle contamination


  • alpha-theta protocol: increase alpha and theta in posterior areas (increases creativity?)
  • beta protocol: increase beta and decrease theta in frontal sites
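
To make the frequency bands above a bit more concrete, here is a rough sketch of how the power in each band might be measured from a signal with a Fourier transform. Everything here is illustrative: the signal is synthetic rather than a real EEG recording, `band_powers` is a hypothetical helper, and only the beta and low gamma edges come from these notes; the delta, theta, and alpha edges are commonly quoted values that I am assuming.

```python
import numpy as np

# band edges in hertz; beta and low gamma follow the notes, the other
# three are commonly quoted values and are assumptions here
BANDS = {
    "delta": (1, 4),
    "theta": (4, 8),
    "alpha": (8, 13),
    "beta": (13, 35),
    "low gamma": (35, 80),
}


def band_powers(signal, sample_rate):
    """Sum the spectral power of a 1-D signal inside each band."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    return {
        name: float(spectrum[(freqs >= low) & (freqs < high)].sum())
        for name, (low, high) in BANDS.items()
    }


# synthetic "recording": a strong 10 Hz rhythm plus a weaker 20 Hz one
sample_rate = 256
t = np.arange(0, 4, 1 / sample_rate)
signal = 1.0 * np.sin(2 * np.pi * 10 * t) + 0.3 * np.sin(2 * np.pi * 20 * t)

powers = band_powers(signal, sample_rate)
dominant = max(powers, key=powers.get)
print(dominant)  # alpha
```

A real recording would first need the artifact cleanup described above (muscle, eye movement, heart rate), which this sketch skips entirely.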

Learning Machines: w2 notes


Huffman encoding: scan the entire doc, count how often each symbol occurs in the overall document, then assign codes based on those frequencies (more frequent symbols get shorter codes)
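
A compact way to try this out is with a priority queue that keeps merging the two least frequent groups of symbols. The sketch below is my own illustration rather than anything shown in class; `huffman_codes` and the sample string are hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character to its bit string."""
    counts = Counter(text)
    if not counts:
        return {}
    # heap entries: (frequency, unique tie breaker, {char: code so far})
    heap = [(freq, i, {char: ""}) for i, (char, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # a single distinct symbol still needs a one-bit code
        return {char: "0" for char in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        # merge the two least frequent groups, prefixing their codes
        freq_a, _, codes_a = heapq.heappop(heap)
        freq_b, _, codes_b = heapq.heappop(heap)
        merged = {char: "0" + code for char, code in codes_a.items()}
        merged.update({char: "1" + code for char, code in codes_b.items()})
        heapq.heappush(heap, (freq_a + freq_b, next_id, merged))
        next_id += 1
    return heap[0][2]


text = "abracadabra"
codes = huffman_codes(text)
encoded = "".join(codes[char] for char in text)
print(codes)
print(len(encoded), "bits versus", 8 * len(text), "bits at one byte per character")
```

The most frequent character ("a") ends up with the shortest code, and no code is a prefix of another, so the bit string can be decoded unambiguously.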

  • scalar: individual number
  • vector: one dimensional list of numbers
  • matrix: a list of vectors, two dimensional list of numbers
  • tensor: a list of matrices is one example of a tensor
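
In NumPy terms, the four definitions above differ only in how many dimensions the array has. A quick sketch with made-up values:

```python
import numpy as np

scalar = np.array(5.0)                     # an individual number
vector = np.array([1.0, 2.0, 3.0])         # a one-dimensional list of numbers
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])       # a list of vectors
tensor = np.stack([matrix, matrix * 10])   # a list of matrices

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor", tensor)]:
    print(name, "ndim:", value.ndim, "shape:", value.shape)
```

The printed `ndim` values run from 0 for the scalar up to 3 for the list of matrices.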

k-means clustering (viz demonstration here)

  1. choose random points to serve as center of cluster
  2. measure distances between “center” points and data points
  3. sort points into clusters based on distances
  4. move center points to actual center of clusters
  5. measure new distances between “center” points and data points (newly adjusted points may change which points belong to which cluster)
  6. repeat until stable (points no longer jump between clusters)
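
The six steps above can be sketched in NumPy along these lines. This is a minimal version under my own assumptions, not the code from the homework repo: the `k_means` function, its `max_iters` and `seed` parameters, the choice to start from randomly picked data points, and the two-blob test data are all hypothetical.

```python
import numpy as np


def k_means(points, k, max_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # step 1: pick k distinct data points as the starting centers
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iters):
        # steps 2 and 5: distance from every point to every center
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        # step 3: sort each point into the cluster of its nearest center
        new_labels = distances.argmin(axis=1)
        # step 6: stop once no point jumps between clusters
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # step 4: move each center to the mean of its cluster
        for i in range(k):
            members = points[labels == i]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[i] = members.mean(axis=0)
    return centers, labels


# two well-separated blobs of 25 points each, around (0, 0) and (5, 5)
rng = np.random.default_rng(1)
points = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(25, 2)),
    rng.normal(loc=(5, 5), scale=0.5, size=(25, 2)),
])

centers, labels = k_means(points, k=2)
print(centers)
print(labels)
```

Because the data points are vectors, every step is plain vector math: subtraction for the distances and a mean for the new centers.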

Who are you? Meditation exercise

In lieu of ‘homework’ following the class 1, try practicing the ‘who am I?’ meditation, either alone, or with a friend as we did in the class. Please keep a journal/blog of your experiences which you will need to show me (email me a url to it) at the end of the course. We will do this with all meditations we learn in this course.

When we first did this exercise in class, my partner Michelle and I found it really challenging; turns out, we have a lot in common and our meditation just devolved into an enthusiastic conversation (it definitely didn’t help that we’re both pretty affirmative/smiley people).

I also personally found it difficult to talk about myself at such length, which is not something I enjoy doing in general, in any context. My uncertainty about where to start, and even what to speak about at all, was indicative of a weakness in my own sense of myself, or at least a reluctance to reduce myself to former occupations. But this seems forgivable, considering that I’m at ITP to expand my skillset and change careers. I mean, how else do New Yorkers define themselves if not by their job titles?