This repository has been archived by the owner on Nov 10, 2024. It is now read-only.

stream_tweets2() does not save the output #294

Closed
DarioBoh opened this issue Oct 10, 2018 · 10 comments · Fixed by #472

Comments

@DarioBoh

Problem

stream_tweets2() deletes the .json output. I can see that it creates a temporary folder while it's running, but the folder is deleted when the function stops.

stream_tweets2(q = c(-86.6874, 29.6644, -85.1626, 30.5846), dir = "tweet2", timeout = 5, include_rts = FALSE, parse = FALSE)

Session info

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.4

Other attached packages:
[1] rtweet_0.6.8 rvest_0.3.2 xml2_1.2.0 forcats_0.3.0 stringr_1.3.1 dplyr_0.7.6 purrr_0.2.5
[8] readr_1.1.1 tidyr_0.8.1 tibble_1.4.2 ggplot2_3.0.0 tidyverse_1.2.1

@internaut

I can confirm this using a similar setup on Linux.

Even if you already have a directory "streamtest" containing a file "stream-1.json" and specify append = TRUE, it will delete the whole directory afterwards. No data is retained anywhere.
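Until the deletion is fixed, one defensive option is to copy the target directory before streaming into it. The following is a minimal sketch in base R only; `backup_dir()` and its `backup_root` argument are hypothetical names introduced here, not part of rtweet.

```r
# Copy `dir` (with its contents) into a fresh backup location and return
# the path of the copy. Hypothetical helper, not part of rtweet.
backup_dir <- function(dir, backup_root = tempfile("stream-backup-")) {
  stopifnot(dir.exists(dir))
  dir.create(backup_root, recursive = TRUE)
  # With recursive = TRUE, file.copy() copies the directory itself into
  # the existing directory `backup_root`.
  ok <- file.copy(dir, backup_root, recursive = TRUE)
  if (!all(ok)) stop("Backup of '", dir, "' failed")
  file.path(backup_root, basename(dir))
}

# Possible usage (not run here; the stream_tweets2() arguments are the
# ones quoted in this thread):
# saved <- backup_dir("streamtest")
# stream_tweets2(q = c(-86.6874, 29.6644, -85.1626, 30.5846),
#                dir = "streamtest", append = TRUE, timeout = 5)
# if (!dir.exists("streamtest")) file.copy(saved, ".", recursive = TRUE)
```

This only protects files that existed before the call; it does not recover tweets collected during the stream itself.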

@mkearney
Collaborator

Sorry about this @DarioBoh and @internaut. I know from personal experience this can be super frustrating. I actually designed stream_tweets2() to be sort of a testing space so I could more robustly experiment with ways to reliably capture tweets from the stream API. I have actually integrated some of those things into stream_tweets() but I haven't done a good job of making sure stream_tweets2() functions properly. I'll make either deprecating or fixing the function a priority!

@internaut

Thanks for this clarification.

You should definitely update the Live streaming tweets vignette then and remove sentences like:

"To ensure the stream automatically reconnects following any interruption prior to the specified stream time, use stream_tweets2()."

This is why I used this function in the first place! There's no indication in the docs that this is an experimental function. You should remove it. A published R package is not a "testing space". Your own development environment is a testing space and you can use git branches for experimental features.

So does the package actually support automatic reconnection or is this claim wrong because stream_tweets2() is not really usable?

@DarioBoh
Author

@mkearney thanks for the response and your work on this package. I also agree with @internaut that the documentation should clarify whether reconnection is currently supported in stream_tweets(), or whether users should find workarounds to deal with would-be interruptions until stream_tweets2() works. Actually, I assumed that stream_tweets() could cope with interruptions until I read the documentation for stream_tweets2()...
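One possible stopgap for interruptions, sketched under stated assumptions: split the total streaming time into short chunks, write each chunk to its own file, and start the next chunk whenever one fails. `stream_in_chunks()` and its arguments are hypothetical names, not rtweet functions, and the streaming call is passed in as `stream_fun` so that nothing here depends on how a particular rtweet version behaves. The time budget counts requested seconds rather than wall-clock time, so a chunk that fails early still uses up its share.

```r
# Call `stream_fun` in chunks of at most `chunk_secs` until `total_secs`
# have been requested. Each chunk gets its own file, so an interruption
# costs at most one chunk. Hypothetical helper, not part of rtweet.
stream_in_chunks <- function(stream_fun, total_secs, chunk_secs, dir) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  files <- character()
  remaining <- total_secs
  i <- 0L
  while (remaining > 0) {
    i <- i + 1L
    file_name <- file.path(dir, sprintf("stream-%03d.json", i))
    secs <- min(chunk_secs, remaining)
    # A failed chunk is reported and the loop moves on to the next one.
    tryCatch(
      stream_fun(timeout = secs, file_name = file_name),
      error = function(e) message("chunk ", i, " failed: ", conditionMessage(e))
    )
    files <- c(files, file_name)
    remaining <- remaining - secs
  }
  # Return only the files that were actually written.
  files[file.exists(files)]
}

# Possible usage with stream_tweets() (not run here; assumes its
# `timeout` and `file_name` arguments work as documented):
# stream_in_chunks(
#   function(timeout, file_name) rtweet::stream_tweets(
#     q = c(-86.6874, 29.6644, -85.1626, 30.5846),
#     timeout = timeout, file_name = file_name, parse = FALSE
#   ),
#   total_secs = 3600, chunk_secs = 300, dir = "tweets"
# )
```

Shorter chunks lose less data per interruption but leave more small gaps between connections.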

@mkearney
Collaborator

@internaut Yikes. That is definitely an oversight. Thanks for pointing it out. Will fix soon!

@internaut

Okay, thanks a lot! I didn't want to sound too harsh. I really appreciate your work on this package and it works fine (if you stick to the right functions ;). I know how hard it is to keep all documentation in sync with the current state of a package.

You should also point out the current status of the automatic reconnection feature. It's still unclear to me whether it is implemented and working (I haven't had the time to test it myself yet).

@mkearney
Collaborator

@internaut No worries. You did great: to the point, but not mean. I've never been great with 'to do' lists, so these kinds of issues really help me out!

@user-afk

This still seems to be an issue.

It's particularly bad that setting dir to a folder that exists will delete the entire folder (including all files the user already had in that folder).

This function should be removed from the CRAN package until it is fixed, as it has the potential to cause irreversible loss of data without warning.

@Waqasejaz

@user-afk I second that. I just wasted 6 hours' worth of streaming, only to find out that the file is nowhere to be found. I am a big fan of Michael's work, and I am sure he will do something about it. I was so excited to get it working :(

@hmeleiro

I installed @Kwazao's fork and it worked for me.

This was referenced Feb 15, 2021
@llrs closed this as completed in #472 on Feb 22, 2021