[SPARK-16508][SPARKR] doc updates and more CRAN check fixes #14734
Changes from all commits: ea9f772, 341a2f8, 4b6c42e, 06ab299, 58b7677, 4d0edb5
```diff
@@ -150,7 +150,7 @@ setMethod("explain",
 
 #' isLocal
 #'
-#' Returns True if the `collect` and `take` methods can be run locally
+#' Returns True if the \code{collect} and \code{take} methods can be run locally
 #' (without any Spark executors).
 #'
 #' @param x A SparkDataFrame
```
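For context, the backtick-to-\code{} rewrites throughout this diff follow Rd markup rules: CRAN builds help pages from the Rd files that roxygen2 generates, and Rd has no backtick syntax, so markdown-style backticks are carried into the rendered page as literal characters, while \code{} is the Rd tag for inline code. A minimal sketch with a hypothetical function (doubleIt is not part of SparkR):

```r
# Hypothetical roxygen2 block illustrating the \code{} convention.

#' doubleIt
#'
#' Returns \code{x * 2}. Writing \code{x} rather than `x` keeps the
#' generated Rd clean: the backticks would otherwise appear verbatim
#' in the rendered help page.
#'
#' @param x a numeric vector.
#' @return a numeric vector with each element doubled.
doubleIt <- function(x) {
  x * 2
}
```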
```diff
@@ -182,7 +182,7 @@ setMethod("isLocal",
 #' @param numRows the number of rows to print. Defaults to 20.
 #' @param truncate whether truncate long strings. If \code{TRUE}, strings more than
 #'                 20 characters will be truncated. However, if set greater than zero,
-#'                 truncates strings longer than `truncate` characters and all cells
+#'                 truncates strings longer than \code{truncate} characters and all cells
 #'                 will be aligned right.
 #' @param ... further arguments to be passed to or from other methods.
 #' @family SparkDataFrame functions
```
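A usage sketch of the truncate behavior documented above (faithful is a standard R dataset; assumes SparkR is installed and a session can be started):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(faithful)
showDF(df)                # first 20 rows; long strings truncated at 20 chars
showDF(df, numRows = 5)   # first 5 rows
showDF(df, truncate = 3)  # truncate strings longer than 3 characters
```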
```diff
@@ -642,10 +642,10 @@ setMethod("unpersist",
 #' The following options for repartition are possible:
 #' \itemize{
 #'  \item{1.} {Return a new SparkDataFrame partitioned by
-#'                      the given columns into `numPartitions`.}
-#'  \item{2.} {Return a new SparkDataFrame that has exactly `numPartitions`.}
+#'                      the given columns into \code{numPartitions}.}
+#'  \item{2.} {Return a new SparkDataFrame that has exactly \code{numPartitions}.}
 #'  \item{3.} {Return a new SparkDataFrame partitioned by the given column(s),
-#'                      using `spark.sql.shuffle.partitions` as number of partitions.}
+#'                      using \code{spark.sql.shuffle.partitions} as number of partitions.}
 #'}
 #' @param x a SparkDataFrame.
 #' @param numPartitions the number of partitions to use.
```
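A sketch of the three call forms (hypothetical data; assumes an active SparkR session):

```r
df <- createDataFrame(mtcars)

# 1. partition by a column into an explicit number of partitions
df1 <- repartition(df, numPartitions = 10L, col = df$"cyl")

# 2. exactly numPartitions partitions, no partitioning column
df2 <- repartition(df, numPartitions = 10L)

# 3. partition by column(s); count taken from spark.sql.shuffle.partitions
df3 <- repartition(df, col = df$"cyl")
```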
```diff
@@ -1132,9 +1132,8 @@ setMethod("take",
 
 #' Head
 #'
-#' Return the first NUM rows of a SparkDataFrame as a R data.frame. If NUM is NULL,
-#' then head() returns the first 6 rows in keeping with the current data.frame
-#' convention in R.
+#' Return the first \code{num} rows of a SparkDataFrame as a R data.frame. If \code{num} is not
+#' specified, then head() returns the first 6 rows as with R data.frame.
 #'
 #' @param x a SparkDataFrame.
 #' @param num the number of rows to return. Default is 6.
```
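For example (assuming an active session):

```r
df <- createDataFrame(faithful)
head(df)            # first 6 rows, returned as a local R data.frame
head(df, num = 10)  # first 10 rows
```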
```diff
@@ -1406,11 +1405,11 @@ setMethod("dapplyCollect",
 #'
 #' @param cols grouping columns.
 #' @param func a function to be applied to each group partition specified by grouping
-#'             column of the SparkDataFrame. The function `func` takes as argument
+#'             column of the SparkDataFrame. The function \code{func} takes as argument
 #'             a key - grouping columns and a data frame - a local R data.frame.
-#'             The output of `func` is a local R data.frame.
+#'             The output of \code{func} is a local R data.frame.
 #' @param schema the schema of the resulting SparkDataFrame after the function is applied.
-#'               The schema must match to output of `func`. It has to be defined for each
+#'               The schema must match to output of \code{func}. It has to be defined for each
 #'               output column with preferred output column name and corresponding data type.
 #' @return A SparkDataFrame.
 #' @family SparkDataFrame functions
```
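A sketch of the key/data.frame contract described above: func receives the grouping key and a local R data.frame for each group, and schema must describe the columns of func's output (hypothetical example over mtcars; assumes an active session):

```r
df <- createDataFrame(mtcars)

# The schema must match the data.frame that func returns.
schema <- structType(structField("cyl", "double"),
                     structField("max_mpg", "double"))

result <- gapply(df, "cyl",
                 function(key, x) {
                   # key: grouping column value(s); x: a local R data.frame
                   data.frame(key, max(x$mpg))
                 },
                 schema)
head(result)
```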
```diff
@@ -1497,9 +1496,9 @@ setMethod("gapply",
 #'
 #' @param cols grouping columns.
 #' @param func a function to be applied to each group partition specified by grouping
-#'             column of the SparkDataFrame. The function `func` takes as argument
+#'             column of the SparkDataFrame. The function \code{func} takes as argument
 #'             a key - grouping columns and a data frame - a local R data.frame.
-#'             The output of `func` is a local R data.frame.
+#'             The output of \code{func} is a local R data.frame.
 #' @return A data.frame.
 #' @family SparkDataFrame functions
 #' @aliases gapplyCollect,SparkDataFrame-method
```
```diff
@@ -1657,7 +1656,7 @@ setMethod("$", signature(x = "SparkDataFrame"),
             getColumn(x, name)
           })
 
-#' @param value a Column or NULL. If NULL, the specified Column is dropped.
+#' @param value a Column or \code{NULL}. If \code{NULL}, the specified Column is dropped.
 #' @rdname select
 #' @name $<-
 #' @aliases $<-,SparkDataFrame-method
```
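A sketch of the NULL-drops-the-column behavior documented above:

```r
df <- createDataFrame(faithful)
df$waiting_secs <- df$waiting * 60  # assigning a Column adds or replaces it
df$waiting_secs <- NULL             # assigning NULL drops the column
```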
```diff
@@ -1747,7 +1746,7 @@ setMethod("[", signature(x = "SparkDataFrame"),
 #' @family subsetting functions
 #' @examples
 #' \dontrun{
-#'   # Columns can be selected using `[[` and `[`
+#'   # Columns can be selected using [[ and [
 #'   df[[2]] == df[["age"]]
 #'   df[,2] == df[,"age"]
 #'   df[,c("name", "age")]
```
```diff
@@ -1792,7 +1791,7 @@ setMethod("subset", signature(x = "SparkDataFrame"),
 #'   select(df, df$name, df$age + 1)
 #'   select(df, c("col1", "col2"))
 #'   select(df, list(df$name, df$age + 1))
-#'   # Similar to R data frames columns can also be selected using `$`
+#'   # Similar to R data frames columns can also be selected using $
 #'   df[,df$age]
 #' }
 #' @note select(SparkDataFrame, character) since 1.4.0
```
```diff
@@ -2443,7 +2442,7 @@ generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) {
 #' Return a new SparkDataFrame containing the union of rows
 #'
 #' Return a new SparkDataFrame containing the union of rows in this SparkDataFrame
-#' and another SparkDataFrame. This is equivalent to `UNION ALL` in SQL.
+#' and another SparkDataFrame. This is equivalent to \code{UNION ALL} in SQL.
 #' Note that this does not remove duplicate rows across the two SparkDataFrames.
 #'
 #' @param x A SparkDataFrame
```
```diff
@@ -2486,7 +2485,7 @@ setMethod("unionAll",
 
 #' Union two or more SparkDataFrames
 #'
-#' Union two or more SparkDataFrames. This is equivalent to `UNION ALL` in SQL.
+#' Union two or more SparkDataFrames. This is equivalent to \code{UNION ALL} in SQL.
 #' Note that this does not remove duplicate rows across the two SparkDataFrames.
 #'
 #' @param x a SparkDataFrame.
```
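A sketch of the UNION ALL semantics, i.e. duplicates across the inputs are kept (hypothetical data):

```r
df1 <- createDataFrame(data.frame(x = 1:3))
df2 <- createDataFrame(data.frame(x = 2:4))

unioned <- rbind(df1, df2)  # same result as unionAll(df1, df2)
count(unioned)              # 6 rows; the overlapping 2 and 3 are kept
count(distinct(unioned))    # 4 rows after explicit de-duplication
```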
```diff
@@ -2519,7 +2518,7 @@ setMethod("rbind",
 #' Intersect
 #'
 #' Return a new SparkDataFrame containing rows only in both this SparkDataFrame
-#' and another SparkDataFrame. This is equivalent to `INTERSECT` in SQL.
+#' and another SparkDataFrame. This is equivalent to \code{INTERSECT} in SQL.
 #'
 #' @param x A SparkDataFrame
 #' @param y A SparkDataFrame
```
```diff
@@ -2547,7 +2546,7 @@ setMethod("intersect",
 #' except
 #'
 #' Return a new SparkDataFrame containing rows in this SparkDataFrame
-#' but not in another SparkDataFrame. This is equivalent to `EXCEPT` in SQL.
+#' but not in another SparkDataFrame. This is equivalent to \code{EXCEPT} in SQL.
 #'
 #' @param x a SparkDataFrame.
 #' @param y a SparkDataFrame.
```
```diff
@@ -2576,8 +2575,8 @@ setMethod("except",
 
 #' Save the contents of SparkDataFrame to a data source.
 #'
-#' The data source is specified by the `source` and a set of options (...).
-#' If `source` is not specified, the default data source configured by
+#' The data source is specified by the \code{source} and a set of options (...).
+#' If \code{source} is not specified, the default data source configured by
 #' spark.sql.sources.default will be used.
 #'
 #' Additionally, mode is used to specify the behavior of the save operation when data already
```
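A sketch of the source/mode behavior (the output paths are placeholders):

```r
df <- createDataFrame(faithful)

# Explicit source and mode:
write.df(df, path = "/tmp/faithful.parquet", source = "parquet",
         mode = "overwrite")

# With source omitted, the format falls back to spark.sql.sources.default:
write.df(df, path = "/tmp/faithful.out")
```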
```diff
@@ -2613,7 +2612,7 @@
 #' @note write.df since 1.4.0
 setMethod("write.df",
           signature(df = "SparkDataFrame", path = "character"),
-          function(df, path, source = NULL, mode = "error", ...){
+          function(df, path, source = NULL, mode = "error", ...) {
             if (is.null(source)) {
               source <- getDefaultSqlSource()
             }
```
```diff
@@ -2635,14 +2634,14 @@ setMethod("write.df",
 #' @note saveDF since 1.4.0
 setMethod("saveDF",
           signature(df = "SparkDataFrame", path = "character"),
-          function(df, path, source = NULL, mode = "error", ...){
+          function(df, path, source = NULL, mode = "error", ...) {
             write.df(df, path, source, mode, ...)
           })
 
 #' Save the contents of the SparkDataFrame to a data source as a table
 #'
-#' The data source is specified by the `source` and a set of options (...).
-#' If `source` is not specified, the default data source configured by
+#' The data source is specified by the \code{source} and a set of options (...).
+#' If \code{source} is not specified, the default data source configured by
 #' spark.sql.sources.default will be used.
 #'
 #' Additionally, mode is used to specify the behavior of the save operation when
```
```diff
@@ -2675,7 +2674,7 @@ setMethod("saveDF",
 #' @note saveAsTable since 1.4.0
 setMethod("saveAsTable",
           signature(df = "SparkDataFrame", tableName = "character"),
-          function(df, tableName, source = NULL, mode="error", ...){
+          function(df, tableName, source = NULL, mode="error", ...) {
             if (is.null(source)) {
               source <- getDefaultSqlSource()
             }
```
```diff
@@ -2752,11 +2751,11 @@ setMethod("summary",
 #' @param how "any" or "all".
 #'            if "any", drop a row if it contains any nulls.
 #'            if "all", drop a row only if all its values are null.
-#'            if minNonNulls is specified, how is ignored.
+#'            if \code{minNonNulls} is specified, how is ignored.
 #' @param minNonNulls if specified, drop rows that have less than
-#'                    minNonNulls non-null values.
+#'                    \code{minNonNulls} non-null values.
 #'                    This overwrites the how parameter.
-#' @param cols optional list of column names to consider. In `fillna`,
+#' @param cols optional list of column names to consider. In \code{fillna},
 #'             columns specified in cols that do not have matching data
 #'             type are ignored. For example, if value is a character, and
 #'             subset contains a non-character column, then the non-character
```
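A sketch of how minNonNulls overrides how (hypothetical data; NA values in the local data.frame become nulls in the SparkDataFrame):

```r
df <- createDataFrame(data.frame(a = c(1, NA, NA),
                                 b = c("x", "y", NA)))

head(dropna(df, how = "any"))      # keeps row 1 only (no nulls at all)
head(dropna(df, how = "all"))      # drops row 3 only (all values null)
head(dropna(df, minNonNulls = 1))  # keeps rows 1 and 2; 'how' is ignored
```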
```diff
@@ -2879,8 +2878,8 @@ setMethod("fillna",
 #' in your system to accommodate the contents.
 #'
 #' @param x a SparkDataFrame.
-#' @param row.names NULL or a character vector giving the row names for the data frame.
-#' @param optional If `TRUE`, converting column names is optional.
+#' @param row.names \code{NULL} or a character vector giving the row names for the data frame.
+#' @param optional If \code{TRUE}, converting column names is optional.
 #' @param ... additional arguments to pass to base::as.data.frame.
 #' @return A data.frame.
 #' @family SparkDataFrame functions
```
```diff
@@ -3058,7 +3057,7 @@ setMethod("str",
 #' @note drop since 2.0.0
 setMethod("drop",
           signature(x = "SparkDataFrame"),
-          function(x, col, ...) {
+          function(x, col) {
             stopifnot(class(col) == "character" || class(col) == "Column")
 
             if (class(col) == "Column") {
```

Review discussion on this change:

- just to clarify removing
- This actually follows from the discussion in #14705. A summary may be seen at #14735 (comment).
- Thanks - that sounds good
- right, in fact, this one was added in #14705 - which we missed and shouldn't be added.
```diff
@@ -3218,8 +3217,8 @@ setMethod("histogram",
 #'              and to not change the existing data.
 #' }
 #'
-#' @param x s SparkDataFrame.
-#' @param url JDBC database url of the form `jdbc:subprotocol:subname`.
+#' @param x a SparkDataFrame.
+#' @param url JDBC database url of the form \code{jdbc:subprotocol:subname}.
 #' @param tableName yhe name of the table in the external database.
 #' @param mode one of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default).
 #' @param ... additional JDBC database connection properties.
```
```diff
@@ -3237,7 +3236,7 @@ setMethod("histogram",
 #' @note write.jdbc since 2.0.0
 setMethod("write.jdbc",
           signature(x = "SparkDataFrame", url = "character", tableName = "character"),
-          function(x, url, tableName, mode = "error", ...){
+          function(x, url, tableName, mode = "error", ...) {
             jmode <- convertToJSaveMode(mode)
             jprops <- varargsToJProperties(...)
             write <- callJMethod(x@sdf, "write")
```
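A usage sketch; the URL, table name, and credentials are placeholders, and extra named arguments are passed through as JDBC connection properties:

```r
df <- createDataFrame(mtcars)

# Placeholder connection details -- adjust for a real database.
write.jdbc(df,
           url = "jdbc:postgresql://localhost:5432/testdb",
           tableName = "mtcars_copy",
           mode = "append",
           user = "username",
           password = "password")
```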
Review discussion on the CRAN check output:

- Do these things show up as CRAN warnings? I don't see them on my machine.
- I was wondering about this part as well.
- Hmm, I'm not sure why, but what I see is a much longer list. I'm still getting the same, longer output after upgrading to R 3.3.1.
- It might have to do with the R version used. I am using R 3.2.1 on my machine while this is from R 3.3.0 -- using a later R version is obviously better.
- Possibly, that was my first thought. But Jenkins is running R 3.3.1 I think? Oops 3.1.1, so older.