Nullpointer in the Memory Cache in hydra 0.4.0 #298

Open
ssimon opened this issue Jan 15, 2014 · 3 comments

Comments

@ssimon

ssimon commented Jan 15, 2014

I'm getting a NullPointerException in the memory cache in Hydra 0.4.0. I don't really know where to start debugging this.

2014-01-15 17:02:30,064 [Thread-4] ERROR com.findwise.hydra.Main - Got an uncaught exception. Shutting down Hydra
java.lang.NullPointerException: null
    at com.findwise.hydra.MemoryCache.removeStale(MemoryCache.java:183) ~[hydra-core.jar:na]
    at com.findwise.hydra.CachingDocumentNIO.flush(CachingDocumentNIO.java:372) ~[hydra-core.jar:na]
    at com.findwise.hydra.CachingDocumentNIO$CacheMonitor.run(CachingDocumentNIO.java:424) ~[hydra-core.jar:na]
2014-01-15 17:02:30,064 [Thread-4] INFO com.findwise.hydra.Main - Got shutdown request...
@ssimon
Author

ssimon commented Jan 15, 2014

Maybe this only fails in combination with discarding documents?

@ssimon
Author

ssimon commented Jan 15, 2014

Or rather with outputting, since I didn't have a discarding stage in that pipeline.

Is a document removed from the memory cache once it has been output?

@laserval
Contributor

Hm, just some thinking without debugging:
So the relevant line is https://github.com/Findwise/Hydra/blob/0.4.0/database/src/main/java/com/findwise/hydra/MemoryCache.java#L183

Entry<DocumentID<T>, Long> entry = it.next();
if (time - entry.getValue() > stalerThanMs) {
    DatabaseDocument<T> d = getDocumentById(entry.getKey());
    list.add(d);
    map.remove(d.getID());      // <- there
    it.remove();
}

This all runs synchronized on the MemoryCache instance. The iterator it walks every entry in the cache, and each entry's value is the time that document was last touched. It looks like the document for that entry either no longer exists in the cache (so d is null) or has no ID. Since the key in the entry is the document ID, it's probably the case that the document is no longer in the cache.
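If that reading is right, one defensive option is to remove by the entry's key instead of going through d.getID(), and to drop the timestamp even when the document is already gone. The following is only a minimal sketch, not the Hydra code: StaleSweepSketch, its two maps, and the use of String in place of DocumentID<T> and DatabaseDocument<T> are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for MemoryCache. String replaces both
// DocumentID<T> and DatabaseDocument<T> to keep the sketch self-contained.
class StaleSweepSketch {
    final Map<String, String> map = new HashMap<>();        // id -> document
    final Map<String, Long> lastTouched = new HashMap<>();  // id -> last touch time (ms)

    // Returns the documents that were stale and removes them from both maps.
    synchronized List<String> removeStale(long now, long stalerThanMs) {
        List<String> list = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastTouched.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() > stalerThanMs) {
                // Remove by the entry's key, never via d.getID(),
                // so a missing document cannot cause a NullPointerException.
                String d = map.remove(entry.getKey());
                if (d != null) {
                    list.add(d);
                }
                // Drop the timestamp even if the document was already gone.
                it.remove();
            }
        }
        return list;
    }
}
```

This only hides the symptom. It does not explain why a timestamp entry could outlive its document in the first place.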

Outputting a document marks it as processed, using this method:
https://github.com/Findwise/Hydra/blob/0.4.0/database/src/main/java/com/findwise/hydra/CachingDocumentNIO.java#L121

public boolean markProcessed(DatabaseDocument<T> d, String stage) {
        DatabaseDocument<T> cached = cache.getDocumentById(d.getID());
        if (cached != null) {
                d.putAll(cached);
                cache.remove(d.getID());
        }
        if (writer.markProcessed(d, stage)) {
                return true;
        }
        return false;
}

So documents that are marked as processed should be removed from the cache. But then they shouldn't still show up in the iterator over documents that are about to be flushed, either.
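One way the two could disagree, purely as a hypothesis: if removing a document cleared it from the document map but left its last-touched entry behind, the next flush would look up a document that is no longer there. I have not checked whether MemoryCache.remove actually behaves like this. The sketch below uses made-up names (DesyncSketch, Doc, lastTouched) and a remove method that is deliberately written with that flaw, only to show that the loop shape quoted above would then throw.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model of the suspected failure mode; not the Hydra classes.
class DesyncSketch {
    static final class Doc {
        private final String id;
        Doc(String id) { this.id = id; }
        String getID() { return id; }
    }

    final Map<String, Doc> map = new HashMap<>();           // id -> document
    final Map<String, Long> lastTouched = new HashMap<>();  // id -> last touch time (ms)

    void add(Doc d, long now) {
        map.put(d.getID(), d);
        lastTouched.put(d.getID(), now);
    }

    // ASSUMPTION: removal forgets to clear the timestamp entry.
    void remove(String id) {
        map.remove(id);
    }

    // Same shape as the loop quoted from MemoryCache.java above.
    int removeStale(long now, long stalerThanMs) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastTouched.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() > stalerThanMs) {
                Doc d = map.get(entry.getKey());
                map.remove(d.getID()); // throws NullPointerException when d is null
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Returns true if the orphaned timestamp triggers the exception.
    static boolean reproduces() {
        DesyncSketch cache = new DesyncSketch();
        cache.add(new Doc("doc-1"), 0L);
        cache.remove("doc-1"); // e.g. what markProcessed would do on output
        try {
            cache.removeStale(1000L, 100L);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

If the real remove does clear both structures, this hypothesis is wrong and the cause lies elsewhere.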

Do you have any more information about the pipeline, and if there is any condition for triggering this?

laserval modified the milestones: Future, 0.5.0 Feb 27, 2014