Fetching feeds for the first time only shows the last item in the feed. #14
This is a bug in davereader. davereader has cache objects for different kinds of data. The accessor functions try to get the data from the cache object; if it does not exist, they create a new data object and try to read the old data stored on the filesystem with fs.readFile(). Unfortunately a lot of the functions used during feed processing are asynchronous, and the update of the cache arrays with the new data object takes too long. In this case we have a callback function that is called every time feedparser (or better, the underlying xml parser) has parsed an item.

As parsing and all the logic in between is very fast, the first readFile() callback from getRiverData() finishes only after a lot of items have already been processed, and the newly generated river data objects overwrite each other. For readFeedRiver it is even worse, as a deep directory structure is created with fs.mkdir(). A solution would be to read the river data before the feed is parsed, to make sure the river data object is already in the cache:

```js
function subscribeToFeed (urlfeed, listname) {
	getFeedRiver (urlfeed, function () {}); // make sure we read the feed river into allTheFeedRivers []
	getRiverData (listname); // make sure we read the river data into allTheRivers []
	// ...rest of subscribeToFeed...
	}
```

On the other hand this just bets on the chance that there is enough time for readFile() and the involved async callbacks to finish before the first item is parsed. Without restructuring the logic a lot, it would be easier to switch to the synchronous versions of the involved fs methods. I added a few log functions to confirm this behavior.
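For what it's worth, a minimal sketch of the synchronous variant described above -- the function name loadRiverDataSync, the rivers/ path, and the empty-river shape are all assumptions for illustration, not davereader's actual internals:

```js
const fs = require ("fs");
const path = require ("path");

var allTheRivers = {}; // in-memory cache, keyed by list name

// Hypothetical synchronous loader: blocks until the river file is read,
// so the cache entry is guaranteed to exist before feedparser delivers
// the first item, and nothing can overwrite it later.
function loadRiverDataSync (listname) {
	if (allTheRivers [listname] === undefined) {
		var f = path.join ("rivers", listname + ".json");
		try {
			allTheRivers [listname] = JSON.parse (fs.readFileSync (f, "utf8"));
			}
		catch (err) { // no saved river yet, start with an empty one
			allTheRivers [listname] = {updatedFeeds: {updatedFeed: []}};
			}
		}
	return (allTheRivers [listname]);
	}
```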
Thanks for the report! :-) It may take me a couple of days to swing around to this, but it sounds like you have found a serious problem. We'll get this addressed soon. Dave
Carsten et al -- this is how I plan to address the problem. http://scripting.com/2018/01/13/183433.html Dave
I'd opt for the boring option and create/load all files and directories at startup with sync methods. It's the simpler solution, with fewer potential problems and easier-to-read code. The code in the gist will have the same problem (though with a much smaller timing window): if the delete riverQueues [f]; is executed right before line 19 in another call to readRiverFile() -- riverQueues [f].push (callback); -- it breaks. A Promise might do the trick here, and there are packages that add fs calls with promise support. But that adds more unnecessary complexity IMO.
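To make the Promise idea concrete, here's a rough sketch -- the names riverReads and readRiverFile are illustrative, not the actual code from the gist. Caching the Promise itself means every concurrent caller shares one fs.readFile(), so there is no queue entry to delete at the wrong moment:

```js
const fs = require ("fs");

var riverReads = {}; // one Promise per river file, pending or settled

function readRiverFile (f) {
	if (riverReads [f] === undefined) {
		riverReads [f] = new Promise (function (resolve, reject) {
			fs.readFile (f, "utf8", function (err, data) {
				if (err) {
					reject (err); // read failed, caller decides what to do
					}
				else {
					resolve (JSON.parse (data));
					}
				});
			});
		}
	return (riverReads [f]); // later callers get the same Promise back
	}

// usage: readRiverFile ("rivers/someRiver.json").then (function (jstruct) { ... });
```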
Hah! I saw that one too. I'm glad you're paying attention. ;-) I'll give it some more thought. There's another problem, wonder if you see it. I can read all the river files at startup, but what about feeds that are newly subscribed to while River5 is running?
BTW, there's an even simpler approach than reading all the river files at startup. Let's see what you think @csenger. :-) I always thought the way readFeed works is not good. And that's what we're really paying the price for here, the fact that it returns items as a stream, instead of returning them all at once. Instead, I could rewrite readFeed to read the feed, and return all the new items to a callback, in an array. It could have 1, 2, 3 or N items in it. Doesn't matter. Then I'd read the river file, add the items in a for loop, totally synchronous, and that's that. No bullshit. Use exactly the same caching method as in the current version of davereader. I like this method because it can easily be implemented outside River5 and fully tested, and dropped in as a unit. I'm also pretty sure I have that code in exactly that form somewhere in my database of solved problems. What do you think?
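Sketched out under those assumptions, the shape might be something like this -- readFeed's real signature, and the addItemToRiver and saveRiverFile helpers, are placeholders, not davereader's actual API:

```js
// Hypothetical batch version: collect every item the parser emits,
// then hand the whole array to the callback at once.
function readFeed (urlfeed, callback) {
	var theNewItems = [];
	// ...feedparser wiring elided: on each parsed item that isn't
	// already in the river, theNewItems.push (item)...
	// ...when the stream ends: callback (undefined, theNewItems);
	}

readFeed (urlfeed, function (err, theNewItems) {
	if (err === undefined) {
		getRiverData (listname, function (jstruct) { // one read, up front
			for (var i = 0; i < theNewItems.length; i++) { // then a plain loop,
				addItemToRiver (theNewItems [i], jstruct); // fully synchronous
				}
			saveRiverFile (listname, jstruct); // placeholder writer
			});
		}
	});
```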
I do have the routine I describe, in this project -- https://github.com/scripting/feedParserDemo/blob/master/demo.js#L30 Dave
Here's a new local routine to go inside readFeed -- https://gist.github.com/scripting/c3eab6e60073678325593c20c1f9fece#file-readxmlfeed-js-L36 Dave
I have a new version running here that has both changes.
I'm going to burn it in here, and if it all seems to work, release the new version.
Easy question - Is there a functional/unit test that proves it works or not?
@morthond -- no, there is no such thing. If you want to create one, that would be fantastic. However it's probably going to be rough, because you'd have to simulate feeds with more than one new item in them. And that would involve having a standard data folder and lists folder that it could run from.
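Not a real test, but one way such a harness could start -- a canned multi-item RSS feed served from localhost, which a list file in the standard lists folder would point at; the port, file names, and the final assertion are all made up for illustration:

```js
const http = require ("http");
const assert = require ("assert");

// A tiny RSS feed with three items, so the more-than-one-new-item
// case is exercised on the very first fetch.
const rss = [
	"<?xml version=\"1.0\"?>",
	"<rss version=\"2.0\"><channel><title>Test</title>",
	"<item><title>one</title><link>http://example.com/1</link></item>",
	"<item><title>two</title><link>http://example.com/2</link></item>",
	"<item><title>three</title><link>http://example.com/3</link></item>",
	"</channel></rss>"
	].join ("\n");

http.createServer (function (req, res) {
	res.writeHead (200, {"Content-Type": "application/rss+xml"});
	res.end (rss);
	}).listen (1430, function () {
		// ...point a list at http://localhost:1430/, start the reader
		// against the standard data folder, wait for it to settle, then
		// read the generated river file and assert all three items landed:
		// assert.equal (countRiverItems ("rivers/test.js"), 3); // hypothetical helper
		});
```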
New version of the reader project, and new release of the davereader NPM package, v0.6.2. Change notes here -- https://github.com/scripting/reader#v062----11718-by-dw Here's a link to the history for these changes. They really were quite simple and straightforward, not as glamorous and elegant as the one I first latched onto. ;-) Thanks @csenger for the help finding this problem and figuring out a solution. Dave
I fetched a fresh copy of the repo and stripped the lists/ and config down to a single feed (lists/hn.opml) and let the system run. It reports to the console that it collects all the items. The rivers/ folder (once it is manufactured) only has the last item. The view only shows the last item correspondingly. It should show all the items.
I tried with flAddItemsFromNewSubs both set and unset in config.json:
{
"maxRiverItems": 300,
"flAddItemsFromNewSubs": true,
"homePage": {
"panels": [
{
"title": "Hacker News",
"river": "hn.js"
}
]
}
}