Commit

updated getting started
Ellery Wulczyn authored and Ellery Wulczyn committed May 12, 2016
1 parent 09dced6 commit 75be6f1
Showing 2 changed files with 20 additions and 6 deletions.
20 changes: 17 additions & 3 deletions src/Wikipedia Navigation Vectors - Getting Started.ipynb

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions src/getting_started_helpers.py
@@ -5,18 +5,18 @@
 from sklearn.manifold import TSNE


-def get_tsne(embedding, pca_dim = 20, n_words=10000):
+def get_tsne(embedding, pca_dim = 20, n_items=10000):
     """
     TSNE dimensionality reduction.
     The TSNE algorithm is quite slow, so we:
-    1. only use the first n_words from the embedding
+    1. only use the first n_items from the embedding
     2. reduce embedding dimensionality via PCA
     3. run TSNE on reduced embedding matrix
     """
-    X = embedding.E[:n_words]
+    X = embedding.E[:n_items]
     pca = PCA(n_components=pca_dim)
     X = pca.fit_transform(X)
     tsne = TSNE(n_components=2, random_state=0)
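
The hunk is cut off before the function returns, so the three steps in the docstring can be illustrated with a minimal standalone sketch. This is not the repository's code: the function name `reduce_then_tsne`, the plain NumPy matrix argument (instead of an object with an `.E` attribute), and the final `fit_transform` call are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def reduce_then_tsne(E, pca_dim=20, n_items=10000):
    """Hypothetical sketch of the truncate -> PCA -> t-SNE pipeline.

    E is assumed to be a 2-D array with one embedding vector per row.
    """
    # 1. Keep only the first n_items rows to bound the t-SNE runtime.
    X = np.asarray(E)[:n_items]
    # 2. Reduce dimensionality with PCA before the expensive step.
    X = PCA(n_components=pca_dim).fit_transform(X)
    # 3. Project the reduced matrix to 2-D with t-SNE.
    #    Assumption: the original function ends with a similar call.
    tsne = TSNE(n_components=2, random_state=0)
    return tsne.fit_transform(X)
```

Note that scikit-learn's t-SNE requires the perplexity (30 by default) to be smaller than the number of rows, so very small values of `n_items` would need a lower `perplexity` setting.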