skip-gram: given the current (center) word, predicts the surrounding context word(s)
CBOW: opposite of skip-gram; input is the surrounding context words, output is the center (target) word
embedding size = number of dimensions in each word vector; more dimensions = more possible relational directions the space can encode
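a minimal sketch of both modes using gensim's Word2Vec (gensim isn't mentioned in these notes, just one common implementation); the toy corpus is far too small for meaningful results, it only shows the knobs:

# sg=1 trains skip-gram, sg=0 trains CBOW
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["a", "man", "and", "a", "woman", "walk"],
]

# vector_size is the embedding size discussed above
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# vector arithmetic: directions in the embedding space encode relations
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))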
universal sentence encoder: colab, arxiv
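a quick sketch of loading it from TensorFlow Hub (module URL assumed; the linked colab/arxiv have the official example):

import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
embeddings = embed(["The quick brown fox.", "An apple a day keeps the doctor away."])
print(embeddings.shape)  # (2, 512): one 512-dim vector per sentence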
hierarchical neural story generator (fairseq): repo
wiki-tSNE: groups wikipedia articles by topic (rough sketch after the wikipedia snippet below)
python library wikipedia
import wikipedia
page = wikipedia.page("New York University")
print(page.content)
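a sketch in the spirit of wiki-tSNE (not the project's actual code): embed a few wikipedia articles with TF-IDF and project them to 2D with t-SNE; the titles are just example inputs:

import wikipedia
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import TSNE

titles = ["New York University", "Columbia University", "Jazz", "Hip hop music"]
docs = [wikipedia.page(t).content for t in titles]

# TF-IDF vectors, then t-SNE down to 2 dimensions
vectors = TfidfVectorizer(max_features=5000, stop_words="english").fit_transform(docs)
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(vectors.toarray())

for title, (x, y) in zip(titles, coords):
    print(f"{title}: ({x:.2f}, {y:.2f})")  # nearby points = similar topics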
spaCy: better than NLTK? can parse named entities, e.g. organizations (New York University), times (12pm), etc
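minimal spaCy entity example (assumes the en_core_web_sm model has been downloaded with `python -m spacy download en_core_web_sm`):

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("New York University holds graduation at 12pm in Washington Square Park.")

for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. "New York University" ORG, "12pm" TIME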