-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
get raw vectors for further processing in Word2Vec #3309
Conversation
Can one of the admins verify this patch? |
} | ||
|
||
/** | ||
* Returns the strings with it´s raw vectors for further processing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it´s
-> its
(btw, the character ´
does not belong to ascii, which should be avoid in the code base)
@@ -461,4 +461,13 @@ class Word2VecModel private[mllib] ( | |||
.tail | |||
.toArray | |||
} | |||
|
|||
/** | |||
* Returns the strings with it's raw vectors for further processing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's
-> its
|
||
/** | ||
* Returns the strings with its raw vectors for further processing | ||
* (e.g. clustering) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for making one more iteration! The doc could be more concise, e.g.
Returns a map of words to their vector representations.
for further processing (e.g. cluster)
is not necessary because we don't restrict how users will use it.
Usually, if Returns ...
contains all the information, we don't need @return
.
Another question I have is on the return type. We use Array[Float]
to save space in computation. Shall we return Map[String, Vector]
here, just to be consistent with others? @Ishiihara
ok to test |
test this please |
Test build #23790 has started for PR 3309 at commit
|
Test build #23790 has finished for PR 3309 at commit
|
Test PASSed. |
e.g. clustering Author: tkaessmann <[email protected]> Closes apache#3309 from tkaessmann/branch-1.2 and squashes the following commits: e3a3142 [tkaessmann] changes the comment for getVectors 58d3d83 [tkaessmann] removes sign from comment a5be213 [tkaessmann] fixes getVectors to fit code guidelines 3782fa9 [tkaessmann] get raw vectors for further processing
LGTM. I've merged this into branch-1.2. I made a new PR #3437 with the same content to the master branch. Usually a PR should be sent to the master branch unless it is for a specific branch. |
e.g. clustering Author: tkaessmann <[email protected]> Closes apache#3309 from tkaessmann/branch-1.2 and squashes the following commits: e3a3142 [tkaessmann] changes the comment for getVectors 58d3d83 [tkaessmann] removes sign from comment a5be213 [tkaessmann] fixes getVectors to fit code guidelines 3782fa9 [tkaessmann] get raw vectors for further processing
e.g. clustering