Anom detection needs at least 2 periods worth of data #15
Comments
After debugging..
So the period is basically a day here, and we are expecting more than 2 * 1440 = 2880 observations. It's implicit that the granularity should be one minute and that we need at least two days' worth of data. Is there anything that can be done here when the granularity is multiple minutes?
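To make the arithmetic above concrete for coarser granularities, here is a minimal sketch in base R. The helpers `obs_per_day` and `has_two_periods` are hypothetical names invented for illustration and are not part of the AnomalyDetection package; the `>=` boundary follows the issue title ("at least 2 periods") and is an assumption about the exact check.

```r
# Hypothetical helper (not part of AnomalyDetection): number of observations
# in one daily period for a given sampling granularity in minutes.
obs_per_day <- function(granularity_minutes) {
  stopifnot(granularity_minutes > 0, 1440 %% granularity_minutes == 0)
  1440 / granularity_minutes
}

# Hypothetical check mirroring the "at least 2 periods" requirement.
# Whether the package uses >= or > at the boundary is an assumption here.
has_two_periods <- function(n_obs, period) {
  n_obs >= 2 * period
}

period_1min  <- obs_per_day(1)   # one-minute data: 1440 observations per day
period_10min <- obs_per_day(10)  # ten-minute data: 144 observations per day

# Four days of ten-minute data gives 4 * 144 = 576 observations.
enough_10min <- has_two_periods(4 * period_10min, period_10min)
```

Under these assumptions, a daily cycle at ten-minute granularity would correspond to a period of 144 rather than 1440, so two days of such data would already cover two periods.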
You're totally right. The seasonality we were looking at was either daily (if the data was minutely or hourly), or weekly (if the data was daily). We added However, it would be nice for
Thanks. I'll try to come up with something.
I get this even with daily data, and I've confirmed using the internal AnomalyDetection::: functions that it is correctly recognizing the period. Minimal example: quantmod::getSymbols("^GSPC")
Hi Elbamos, I was able to reproduce your error, and I'll look into posting a patch soon. In the interim, you can run the data using the following:
That will give a weekly periodicity, and assumes a long-term stable state of 30 days. Both parameters can be changed, but the longterm_period must be at least (period*2)+1. The other issue was that the timestamps are currently doubles, while the Ts function is expecting a POSIX type. We are checking for that, but I think we are going to rework this to return the timestamps in the same format as they were passed in. Hope that helps. Cheers,
I'm just wondering if this ever got fixed...
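The two points in this reply can be sketched in base R. This is an illustration only: `min_longterm_period` is a hypothetical helper, not a package function, and the example timestamps are assumed to be seconds since the Unix epoch, which may not match how a given data source encodes them.

```r
# Hypothetical helper: smallest longterm_period allowed for a given period,
# following the (period * 2) + 1 rule stated in the reply above.
min_longterm_period <- function(period) period * 2 + 1

period <- 7            # weekly periodicity on daily data
longterm_period <- 30  # the 30-day window mentioned above
longterm_ok <- longterm_period >= min_longterm_period(period)  # 30 >= 15

# Timestamps stored as doubles can be converted to a POSIX type with base R.
# Assumption: the doubles are seconds since 1970-01-01 UTC.
ts_double <- c(1420070400, 1420156800)
ts_posix  <- as.POSIXct(ts_double, origin = "1970-01-01", tz = "UTC")
```

If the doubles were day counts instead (as with a Date index), `as.Date(x, origin = "1970-01-01")` followed by `as.POSIXct()` would be the analogous conversion.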
From help(AnomalyDetectionVec): "period: Defines the number of observations in a single period, and used during seasonal decomposition." But what is the definition of a period? In the forecast package one uses a "frequency" argument which is specified in terms of a year: quarterly data would be frequency = 4, monthly data is frequency = 12, daily data would be frequency = 365. What is the definition of "period" in this package? I have monthly data (1 row per month). What period do I use?
Hi rtjohn, We used period here to denote the number of observations in a single cycle of the dominant seasonal component. This way we can define the number of observations per cycle without having to relate the number of cycles to some window, e.g., annual, quarterly, etc. Best,
I think there are some terminology confusions here. Time series data generally can have trend, seasonal, and/or cyclic components, right? So you want users to "define the number of observations per cycle" (cyclic component)? But the definition of a cyclic component is that they are not of a fixed period... So again, for monthly data with, let's say, a strong true "season"-al pattern (changing drastically from winter to spring to fall to summer), the period argument should be 3, right? I'd have 3 periods in a single "cycle" as you'd call it?
@rtjohn while I totally relate to the point you're making, and I've found the issue confusing also, I'm pretty sure the package uses the same conventions for cycle and period definition as base R does. Which is definitely not friendly, but the package should conform to the convention of the platform.
@rtjohn I see what you're saying. This Seasonal-Trend Decomposition paper was a big part of developing the package, and we based our naming conventions around their notion of "Seasonal, Trend, Residual" terminology. So in that case, Seasonal components would be the repeating cycles in the time series, the Trend would account for the variations from winter to summer, and the Residual should be the unimodal noise that we can use to detect the anoms. Also, Jordan and I have an audio background, so we tend to treat cycle as synonymous with period. Let us know if we could improve the docstrings though.
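For readers comparing this with the base R convention mentioned earlier in the thread, here is a small sketch using only `ts()` and `stl()` from base R (the stats package). The monthly series is synthetic and made up for illustration, and the choice of 12 observations per cycle assumes the dominant seasonal pattern repeats once a year. It does not call AnomalyDetection itself.

```r
# Synthetic monthly series: 4 years with a repeating annual pattern plus noise.
set.seed(1)
monthly <- rep(sin(2 * pi * (1:12) / 12), times = 4) + rnorm(48, sd = 0.1)

# In base R, frequency is the number of observations per seasonal cycle.
x <- ts(monthly, frequency = 12, start = c(2015, 1))
obs_per_cycle <- frequency(x)

# Seasonal-trend decomposition into seasonal, trend, and remainder components.
fit <- stl(x, s.window = "periodic")
components <- colnames(fit$time.series)
```

Read this way, monthly data with a yearly seasonal cycle would suggest a period of 12 (observations per cycle), not 3, under the definition given above.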
Hi all, I am quite new to this package and would like to use it for some analysis I am doing. I have data that is not regular, i.e. trading. Would I be able to use AnomalyDetection to identify, say, irregular prices charged? If so, what would I set the "period" to, as on some days there might be a trade every second, or hour, and on some days none? I have data for roughly a year. Any help will be greatly appreciated!
Still get: Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, :
What's the definition of period here? The data contains a time series for about 4 days with granularity of 10 minutes.
Posting the data frame "bar" here
https://www.dropbox.com/s/1j263k6srq18qpp/bar.Rda?dl=0