Request - Difference between parts of speech #21

Open
nklswbr opened this issue Aug 5, 2020 · 8 comments · May be fixed by #48

Comments

@nklswbr

nklswbr commented Aug 5, 2020

Would love to see an option to differentiate between parts of speech.

Something like:

console.log(randomWords({partOfSpeech: 'noun', exactly: 2}))
['army', 'eye']
@marcodali

marcodali commented Jan 9, 2023

I use this library to generate dummy data for automatically generated entities like this:

{
  "id": "1",
  "name": "John",
  "lastname": "Wick",
  "email": "[email protected]",
  "phone": "159 814 9227"
}

It would be awesome if, as @nklswbr suggests, the call let you specify what kind of words you need, e.g.:

  • name (Cristina, Jennifer, Yolanda)
  • lastname (Alfest, Martinez, Jaramillo)
  • country (Spain, Morocco, Brazil)
  • animal (bear, bird, fish)
  • color (magenta, cyan, navy blue)

@BoDonkey
Contributor

BoDonkey commented Mar 9, 2023

How to work on this issue

  1. Fork this repo into your own account
  2. Make changes to the code - since it is in your account, you could work directly on main, but wouldn't you rather get some Git practice by working on a branch, and then merging to your own main?
  3. Write one or more tests to make sure your feature works
  4. Make sure your code changes don't break any existing tests.
  5. Update the CHANGELOG.md file - since other people might contribute during the same sprint cycle, add your change log message under 'UNRELEASED', not a specific version. We will add the version number when everything is approved and published on npm.
  6. When you are done, return to this repository and create a PR to pull code from your fork. Read more about this here. Make sure to fill out the PR template as best as you can.
  7. Navigate to our Discord server here if you have already joined, or here if you need an invitation, and post a message in the open-source contribution channel saying that you need a PR review.
  8. After (hopefully) a short amount of time, one of our team engineers will review your PR and potentially advise you about things they want to see changed.
  9. If you need to make changes, go back to your fork, make the changes, and push them to the branch your PR was opened from. This will update the PR.
  10. When you are done with changes and want a re-review, ask in Discord once again.
  11. After your PR is accepted, celebrate!!! 🎉

Code suggestions for this issue

This one will take some extra effort and potential code refactoring. The crucial thing to pay attention to is not making any breaking changes that impact backward compatibility (bc).

One potential approach might be to refactor the dictionary into various categories (or find a dictionary like this someplace). Then for bc, you could merge each of the individual word dictionaries into a single extensive dictionary for the original functions to obtain words. This is only a suggestion. Maybe there is a better way to classify the words within our current dictionary. Discussion about the approach is welcome in our Discord.
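The category-plus-merge idea above could be sketched roughly as follows. This is only an illustration: `dictionaries`, `allWords`, and `pickWords` are hypothetical names, and the words are placeholders rather than the library's real dictionary.

```javascript
// Hypothetical per-category dictionaries (placeholder words, not the real list).
const dictionaries = new Map([
  ['noun', ['army', 'eye', 'river']],
  ['verb', ['run', 'build', 'sing']],
  ['adjective', ['quiet', 'bright']],
]);

// Merged, de-duplicated list that the existing uncategorized code path could keep using.
const allWords = [...new Set([...dictionaries.values()].flat())];

// Hypothetical helper: pick `count` random words, optionally limited to one category.
function pickWords({ partOfSpeech, count = 1 } = {}) {
  const pool = partOfSpeech === undefined ? allWords : dictionaries.get(partOfSpeech);
  if (pool === undefined) {
    throw new RangeError(`Unknown part of speech: ${partOfSpeech}`);
  }
  return Array.from({ length: count }, () => pool[Math.floor(Math.random() * pool.length)]);
}

console.log(pickWords({ partOfSpeech: 'noun', count: 2 })); // two nouns
console.log(pickWords({ count: 3 })); // old behaviour: any category
```

Callers that never pass a category keep drawing from the merged list, which is what keeps the change backward compatible in this sketch.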

One area of difficulty is "making everybody happy". The person who originally opened the issue wanted 'parts of speech', like 'nouns' or 'verbs', while @marcodali wants much more specific words returned, like 'lastname'. Is this even possible with the existing dictionary? Changing the dictionary is a possibility, but we don't want to make the library smaller or less complex. If you do change the dictionary, it might be necessary to first map out the distribution of word sizes to make sure the new dictionary has a similar complexity.
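One way to map out the distribution of word sizes is to count words per length for both the current and the candidate dictionary and compare the two. A minimal sketch, where `lengthDistribution` is a hypothetical helper and both word lists are placeholder data:

```javascript
// Hypothetical helper: count how many words there are of each length.
function lengthDistribution(words) {
  const counts = new Map();
  for (const word of words) {
    counts.set(word.length, (counts.get(word.length) ?? 0) + 1);
  }
  // Sort by word length so two distributions are easy to compare side by side.
  return new Map([...counts].sort(([a], [b]) => a - b));
}

const currentWords = ['army', 'eye', 'river', 'run']; // placeholder data
const candidateWords = ['bear', 'bird', 'fish', 'cyan']; // placeholder data

console.log(lengthDistribution(currentWords));
console.log(lengthDistribution(candidateWords));
```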

The other issue is the fact that right now, there aren't any guardrails in place because the other options can almost always be fulfilled. What if a user asks for an 'insect', but this isn't in any of your dictionaries? What happens if the user asks for ten animals starting with 'a' and your dictionary only has three? How are incorrect or unfulfillable options currently handled? Suggesting how you would take care of this either here on the issue or in the Discord would be great to get feedback and make sure you are on a good track.
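As one possible answer to the guardrail questions above, a request that cannot be fulfilled could fail loudly instead of silently returning fewer words. This is a sketch of that single option, not how the library currently behaves; `pickDistinct` and its options are hypothetical, and the animal list is placeholder data.

```javascript
// Hypothetical helper: return `count` distinct words from `pool` that start with `startsWith`.
function pickDistinct(pool, { count, startsWith = '' }) {
  const candidates = [...new Set(pool)].filter((word) => word.startsWith(startsWith));
  if (candidates.length < count) {
    throw new RangeError(
      `Requested ${count} words starting with "${startsWith}" but only ${candidates.length} are available`
    );
  }
  // Fisher-Yates shuffle, then take the first `count` entries.
  for (let i = candidates.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [candidates[i], candidates[j]] = [candidates[j], candidates[i]];
  }
  return candidates.slice(0, count);
}

const animals = ['ant', 'albatross', 'alpaca', 'bear', 'bird', 'fish']; // placeholder data

console.log(pickDistinct(animals, { count: 2, startsWith: 'a' }));
try {
  pickDistinct(animals, { count: 10, startsWith: 'a' }); // only three match
} catch (err) {
  console.log(err.message);
}
```

Returning the three available words with a warning would be another reasonable design; which one fits better is exactly the kind of thing worth discussing before opening a PR.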

Finally, please make use of the Discord to ask questions. Try to answer the questions yourself using internet resources, but don't be afraid to ask questions on the Discord about anything. We are here to help!

@ronisarkarexe

I would like to contribute to this issue.

@BoDonkey
Contributor

Hi @ronisarkarexe - Sounds great. As a warning, another dev has submitted a PR that makes some breaking changes by updating to ES6 and named exports. It shouldn't have a lot of impact, but you will probably have to do a little code refactoring prior to final PR acceptance. I need to talk with our CTO about how we are going to handle the other PR that is still in play. In the meantime, if you have any questions, feel free to visit our Discord. We have several channels about open-source contribution where you can ask questions.
Cheers!
Bob

@UnKnoWn-Consortium

I would suggest retrofitting the library functions to accept external word lists as an option instead of complicating the word list structure or changing the dictionary. That should make everyone happy.

@BoDonkey
Contributor

That isn't a bad thought @UnKnoWn-Consortium. So a user would have to pass a curated dictionary to use the "part of speech" function?

@UnKnoWn-Consortium

UnKnoWn-Consortium commented Mar 12, 2024

@BoDonkey That can work. Or, more simply, people can compose their own "part of speech" function on top of, say, the generate function by passing it a "part of speech" dictionary (or a "lastname" dictionary). This way it can work with literally any string, be it Latin or Hiragana.

@BoDonkey
Contributor

Ahh, I see what you mean. A specialized dictionary or the built-in as a fallback. Cool!
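The external word list idea from this exchange might look roughly like the sketch below. The `wordList` option is an assumption made for illustration, not a confirmed part of the library's API, and `generate` here is a standalone stand-in rather than the library's real function.

```javascript
// Stand-in for the library's built-in dictionary (placeholder words).
const builtInWords = ['army', 'eye', 'river', 'quiet'];

// Hypothetical generate-like function; `wordList` is an assumed option.
function generate({ exactly = 1, wordList = builtInWords } = {}) {
  if (!Array.isArray(wordList) || wordList.length === 0) {
    throw new TypeError('wordList must be a non-empty array of strings');
  }
  return Array.from({ length: exactly }, () => wordList[Math.floor(Math.random() * wordList.length)]);
}

// Users compose their own specialised generators from curated lists.
const lastnames = ['Alfest', 'Martinez', 'Jaramillo'];
const hiragana = ['ねこ', 'いぬ', 'さかな'];
const randomLastnames = (exactly) => generate({ exactly, wordList: lastnames });

console.log(randomLastnames(2));
console.log(generate({ exactly: 2, wordList: hiragana }));
console.log(generate({ exactly: 2 })); // no list passed: falls back to the built-in words
```

With this shape the built-in dictionary stays untouched, and any categorization ("noun", "lastname", a non-Latin script) lives in the list the caller supplies.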
