Some documentation issues #225

Closed
bluebad opened this issue Mar 20, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation help wanted Extra attention is needed

Comments


bluebad commented Mar 20, 2024

I've encountered several documentation issues.

In tantivy.py, the comment documentation for the Snippet class is incorrect.

Additionally, after writing indexing code based on the example, I got "ValueError: An error occurred in a thread: 'An index writer was killed.. A worker thread encountered an error (io::Error most likely) or panicked.'" after calling index writer.commit() tens of thousands of times. The error went away once I followed the example in the tests directory, which calls writer.wait_merging_threads() after multiple commits. I am not sure whether this is a problem on my end, but if it is not, the documentation should be updated to explain this step. Thank you for your work!

@cjrh cjrh added documentation Improvements or additions to documentation help wanted Extra attention is needed labels Mar 20, 2024
Collaborator

cjrh commented Mar 20, 2024

The documentation is very rough, and we know there is a lot of work to be done. We're all volunteers, though, so fixes happen as time allows.

> comment documentation for the Snippet class is incorrect.

Could you specify in what way it's incorrect? This will help anyone else who comes along and has a few minutes to spare to submit a quick fix for a low-hanging-fruit issue.

> after calling index writer.commit() tens of thousands of times

You should only be calling commit after adding a batch of documents, not after each individual document.
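
The batching advice above can be sketched as follows. This is a minimal illustration, assuming a writer object that exposes add_document and commit as used in this thread; commit_in_batches and RecordingWriter are hypothetical names for this example and are not part of tantivy-py.

```python
from itertools import islice


def commit_in_batches(writer, documents, batch_size=1000):
    """Add documents to `writer`, committing once per batch.

    Returns the number of commits made. `writer` is assumed to expose
    add_document() and commit(); nothing else is required of it.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(documents)
    commits = 0
    while True:
        # Pull at most batch_size documents from the stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        for doc in batch:
            writer.add_document(doc)
        # One commit per batch, not one per document.
        writer.commit()
        commits += 1
    return commits


class RecordingWriter:
    """Hypothetical stand-in for an index writer; it only counts calls."""

    def __init__(self):
        self.added = 0
        self.commits = 0

    def add_document(self, doc):
        self.added += 1

    def commit(self):
        self.commits += 1
```

With a real index, the RecordingWriter instance would be replaced by the writer obtained from the index; the batch size of 1000 is an arbitrary choice for illustration.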

Author

bluebad commented Mar 20, 2024

Thank you for your hard work. I just hope to make a contribution to this great project. My comments are not meant to offend or criticize.
Below is the current docstring for the Snippet class:

class Snippet(object):
    """
    Tantivy schema.
    
    The schema is very strict. To build the schema the `SchemaBuilder` class is
    provided.
    """

It seems to have mistakenly used the documentation for Schema.

Author

bluebad commented Mar 20, 2024

> You should only be calling commit after adding a batch of documents, not after each individual document.

I did call add_document many times between commits, committing roughly once every 1000 calls, but I still hit the error mentioned above. Only after adding a call to wait_merging_threads was I able to complete the entire indexing process.
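
The workaround described above (periodic commits, then wait_merging_threads once the commits are done) might look roughly like this. It is a sketch under assumptions: the writer is any object with the add_document, commit and wait_merging_threads methods named in this thread, and index_and_wait and RecordingWriter are hypothetical names used only for illustration. Whether the final wait is strictly required is not settled in this thread.

```python
def index_and_wait(writer, documents, commit_every=1000):
    """Add documents, commit every `commit_every` additions, then wait
    for background merging to finish. Returns the number of commits."""
    if commit_every < 1:
        raise ValueError("commit_every must be at least 1")
    pending = 0
    commits = 0
    for doc in documents:
        writer.add_document(doc)
        pending += 1
        if pending == commit_every:
            writer.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final, partially filled batch.
        writer.commit()
        commits += 1
    # Called once, after all commits, as in the tests directory example
    # mentioned in this thread.
    writer.wait_merging_threads()
    return commits


class RecordingWriter:
    """Hypothetical stand-in for an index writer; records call order."""

    def __init__(self):
        self.calls = []

    def add_document(self, doc):
        self.calls.append("add_document")

    def commit(self):
        self.calls.append("commit")

    def wait_merging_threads(self):
        self.calls.append("wait_merging_threads")
```

The sketch calls wait_merging_threads exactly once at the end and does not reuse the writer afterwards, which is the conservative reading of the pattern reported here.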

cjrh added a commit to cjrh/tantivy-py that referenced this issue Mar 21, 2024
@cjrh cjrh closed this as completed in e9363e7 Mar 22, 2024
cjrh pushed a commit to cjrh/tantivy-py that referenced this issue Sep 3, 2024
…-oss#225)

2 participants