This repository has been archived by the owner on Nov 13, 2021. It is now read-only.

data.frame Column Error #42

Open
stevebanik opened this issue May 12, 2015 · 3 comments

Comments

@stevebanik

I created a data.frame called foo and attempted to format it exactly like raw_data, but when I set res, I get an error.

My data.frame:

head(foo)
            timestamp   count
1 2015-05-11 13:54:00 42748.0
2 2015-05-11 13:55:00 44152.0
3 2015-05-11 13:56:00 43642.0
4 2015-05-11 13:57:00 42544.0
5 2015-05-11 13:58:00 41627.0
6 2015-05-11 13:59:00 42138.0

Setting res, getting an error:

res = AnomalyDetectionTs(foo, max_anoms=0.02, direction='both', plot=TRUE)
Error in AnomalyDetectionTs(foo, max_anoms = 0.02, direction = "both", :
data must be a 2 column data.frame, with the first column being a set of timestamps, and the second coloumn being numeric values.

raw_data looks quite like foo:

head(raw_data)
            timestamp   count
1 1980-09-25 14:01:00 182.478
2 1980-09-25 14:02:00 176.231
3 1980-09-25 14:03:00 183.917
4 1980-09-25 14:04:00 177.798
5 1980-09-25 14:05:00 165.469
6 1980-09-25 14:06:00 181.878

Any idea what I'm doing wrong?

Thanks,

Steve
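
The error message names two conditions (two columns, numeric second column), and head() does not show column types, so two data frames can print alike and still differ. A minimal sketch of checking the types directly in base R follows. The small `foo` here is a hypothetical stand-in built from the first two rows above, with `count` deliberately stored as text, and the `ok` check only mirrors the wording of the error message; it is not taken from the package itself.

```r
# Hypothetical stand-in for foo: the second column is text, not numeric.
foo <- data.frame(
  timestamp = as.POSIXct(c("2015-05-11 13:54:00", "2015-05-11 13:55:00"),
                         tz = "UTC"),
  count = c("42748.0", "44152.0"),
  stringsAsFactors = FALSE
)

# First class of each column; head() would not reveal this.
col_classes <- sapply(foo, function(col) class(col)[1])
print(col_classes)

# Mirrors the conditions named in the error message (an assumption about
# what the function checks, not a copy of its code).
ok <- ncol(foo) == 2 && inherits(foo[[1]], "POSIXt") && is.numeric(foo[[2]])
print(ok)
```

If `ok` is FALSE, `str(foo)` usually shows which column is the culprit.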

@stevebanik
Author

UPDATE: The value column should likely be num, not chr:

str(foo)
'data.frame': 1439 obs. of 2 variables:
$ date : POSIXct, format: "2015-05-11 14:20:00" "2015-05-11 14:21:00" ...
$ value: chr "36185.0" "38591.0" "36313.0" "34467.0" ...

I used transform to change that:

D <- transform(foo, value = as.numeric(value))
Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion

And now it's num:

str(D)
'data.frame': 1439 obs. of 2 variables:
$ date : POSIXct, format: "2015-05-11 14:20:00" "2015-05-11 14:21:00" ...
$ value: num 36185 38591 36313 34467 35717 ...
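
The "NAs introduced by coercion" warning means at least one string in the column did not parse as a number, so the converted column now contains NA in those positions. A small sketch of locating such entries before converting is below. The vector is hypothetical sample data, and the "n/a" entry is invented purely to trigger the warning; the real offending values in foo are unknown.

```r
# Hypothetical sample standing in for foo$value; "n/a" is an invented bad entry.
value <- c("36185.0", "38591.0", "n/a", "34467.0")

# Convert quietly, then look for entries that became NA only after conversion.
converted <- suppressWarnings(as.numeric(value))
bad <- is.na(converted) & !is.na(value)

which(bad)   # positions of the strings that failed to parse
value[bad]   # the offending strings themselves
```

Whether to drop, fix, or impute those rows depends on the data; the sketch only finds them.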

But now AnomalyDetectionTs fails with "Anom detection needs at least 2 periods worth of data":

anomalyDetectionResult <- AnomalyDetectionTs(D, max_anoms=0.2, threshold = "None", direction='both', plot=TRUE, only_last = "day", e_value = TRUE)
Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, :
Anom detection needs at least 2 periods worth of data

I seem to recall reading another issue about that, so I'll look for it again.
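
One possible reading of that error, offered as an assumption rather than a confirmed diagnosis: if the function treats minute-level data as having a daily period of 1440 points, then the 1439 observations shown by str(D) are less than one full period, let alone two. The arithmetic can be sketched as follows, with `period` set by hand to that assumed value.

```r
# Assumption: one period is one day of minute-level data (1440 points).
period <- 1440
n_obs  <- 1439   # number of rows reported by str(D)

enough    <- n_obs >= 2 * period   # are there at least two full periods?
shortfall <- 2 * period - n_obs    # how many more points would be needed

print(enough)
print(shortfall)
```

Under that assumption, roughly two full days of minute data would be needed before the check passes.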

@nullbuddy1243

@stevebanik How did you get your timestamps to be in that format? I've tried doing

foo_timestamp <-as.POSIXct(parse_iso_8601(doc$fields$`@timestamp`))

But my timestamps are now nums:

str(foo_dataframe)
'data.frame':   100 obs. of  2 variables:
 $ timestamp_list: num  1.44e+09 1.44e+09 1.44e+09 1.44e+09 1.44e+09 ...
 $ in_bytes_list : num  977 1965 973 986 977 ...

And when I run the anomaly detector

AnomalyDetectionVec(foo_dataframe, period=100, plot=TRUE)
Error in AnomalyDetectionVec(es_out2, period = 100, plot = TRUE) : 
  data must be a single data frame, list, or vector that holds numeric values.

My data frame looks like this:

head(foo_dataframe)
  timestamp_list in_bytes_list
1     1437617401           977
2     1437617401          1965
3     1437617401           973
4     1437617401           986
5     1437617401           977
6     1437617391           605

Any help would be greatly appreciated!
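
Two things may be worth trying here, both offered as guesses from the output above. First, POSIXct values lose their class when they pass through unlist() or sapply(), which leaves plain epoch seconds; these can be turned back into timestamps with as.POSIXct() and an origin. Second, going by the wording of the error message, AnomalyDetectionVec seems to want only the numeric series rather than a two-column data frame. The sketch below uses three rows copied from the head() output; the AnomalyDetectionVec call is left commented out because it was not run.

```r
# Three rows copied from the head() output above.
foo_dataframe <- data.frame(
  timestamp_list = c(1437617401, 1437617401, 1437617391),
  in_bytes_list  = c(977, 1965, 605)
)

# unlist() drops the POSIXct class, which is one way timestamps become nums.
dropped <- unlist(list(as.POSIXct("2015-07-23 02:10:01", tz = "UTC")))
is.numeric(dropped)

# Epoch seconds back to POSIXct (UTC chosen here as an assumption).
ts <- as.POSIXct(foo_dataframe$timestamp_list,
                 origin = "1970-01-01", tz = "UTC")
format(ts[1], "%Y-%m-%d %H:%M:%S")

# Pass only the numeric values to the vector interface (not run):
values <- foo_dataframe$in_bytes_list
# AnomalyDetectionVec(values, period = 100, plot = TRUE)
```

Note that with 100 observations and period = 100 the call may still stop with the "at least 2 periods" error mentioned earlier in this thread, so a smaller period or more data might be needed.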

@QuantScientist3

Same here.
Were you able to resolve this?
