Fix typos #433

Merged
merged 1 commit into from
Aug 8, 2022
2 changes: 1 addition & 1 deletion README.md
@@ -17,7 +17,7 @@ siuba ([小巴](http://www.cantonese.sheik.co.uk/dictionary/words/9139/)) is a p
* `summarize()` - reduce one or more columns down to a single number.
* `arrange()` - reorder the rows of data.

- These actions can be preceeded by a `group_by()`, which causes them to be applied individually to grouped rows of data. Moreover, many SQL concepts, such as `distinct()`, `count()`, and joins are implemented.
+ These actions can be preceded by a `group_by()`, which causes them to be applied individually to grouped rows of data. Moreover, many SQL concepts, such as `distinct()`, `count()`, and joins are implemented.
Inputs to these functions can be a pandas `DataFrame` or SQL connection (currently postgres, redshift, or sqlite).

For more on the rationale behind tools like dplyr, see this [tidyverse paper](https://tidyverse.tidyverse.org/articles/paper.html).
2 changes: 1 addition & 1 deletion docs/api_table_core/01_filter.Rmd
@@ -64,7 +64,7 @@ Otherwise, python will group the operation like `_.cyl == (4 | _.gear) == 5`.
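
The grouping quoted in the hunk header above follows from Python's operator precedence: `|` binds more tightly than `==`, so an unparenthesized condition turns into a chained comparison. A minimal sketch, assuming plain integers as stand-ins for siuba's `_` column expressions (the values are illustrative, not taken from the docs):

```python
import ast

# Plain integers stand in for column values; these are illustrative only.
cyl, gear = 4, 5

# Without parentheses, `|` is evaluated first, giving the chained
# comparison cyl == (4 | gear) == 5.
unparenthesized = cyl == 4 | gear == 5

# With parentheses, each comparison is evaluated before `|`.
parenthesized = (cyl == 4) | (gear == 5)

# The parse tree confirms the grouping: a single Compare node whose
# first comparator is the BinOp `4 | gear`.
tree = ast.parse("cyl == 4 | gear == 5", mode="eval").body
is_chained = isinstance(tree, ast.Compare) and isinstance(tree.comparators[0], ast.BinOp)
```

The same precedence rule applies to `&`, which is why each comparison in a combined filter condition needs its own parentheses.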

### Dropping NAs

- As with most subsetting in pandas, when a condition evalutes to an `NA` value, the row is automatically excluded. This is different from pandas indexing, where `NA` values produce errors.
+ As with most subsetting in pandas, when a condition evaluates to an `NA` value, the row is automatically excluded. This is different from pandas indexing, where `NA` values produce errors.

```{python}
df = pd.DataFrame({
2 changes: 1 addition & 1 deletion docs/api_table_core/07_summarize.Rmd
@@ -36,7 +36,7 @@ mtcars >> summarize(avg_mpg = _.mpg.mean())

### Summarizing per group

- When you use summarize with a grouped DataFrame, the result has the same number of rows as there are groups in the data. For example, there are 3 values of cylinders (`cyl`) a row can have (4, 6, or 8), so ther result will be 3 rows.
+ When you use summarize with a grouped DataFrame, the result has the same number of rows as there are groups in the data. For example, there are 3 values of cylinders (`cyl`) a row can have (4, 6, or 8), so the result will be 3 rows.

```{python}
(mtcars
2 changes: 1 addition & 1 deletion docs/developer/sql-translators.ipynb
@@ -26,7 +26,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Using sqlalchemy select statment for convenience\n",
+ "### Using sqlalchemy select statement for convenience\n",
"\n",
"Throughout this vignette, we'll use a select statement object from sqlalchemy,\n",
"so we can conveniently access its columns as needed."
2 changes: 1 addition & 1 deletion docs/guide_analysis.Rmd
@@ -270,4 +270,4 @@ Select works okay, now let's uncomment the next line.
)
```

- We found our bug! Note that when working with SQL, siuba prints out the name of the verb where the error occured. This is very useful, and will be added to working with pandas in the future!
+ We found our bug! Note that when working with SQL, siuba prints out the name of the verb where the error occurred. This is very useful, and will be added to working with pandas in the future!
4 changes: 2 additions & 2 deletions docs/key_features.ipynb
@@ -703,7 +703,7 @@
")</pre>\n",
" </td>\n",
" <td>\n",
- " <pre lang=\"pyton\">\n",
+ " <pre lang=\"python\">\n",
"mtcars.assign(\n",
" res = lambda d: d.hp - d.hp.mean()\n",
")</pre>\n",
@@ -721,7 +721,7 @@
")</pre>\n",
" </td>\n",
" <td>\n",
- " <pre lang=\"pyton\">\n",
+ " <pre lang=\"python\">\n",
"mtcars.assign(\n",
" res = mtcars.hp - g_cyl.hp.transform(\"mean\")\n",
")</pre>\n",
4 changes: 2 additions & 2 deletions examples/architecture/004-user-defined-functions.ipynb
@@ -36,7 +36,7 @@
"\n",
"This is the tyranny of methods. The object defining the method owns the method. To add or modify a method, you need to modify the class behind the object.\n",
"\n",
- "Now, this isn't totally true--the class could provide a way for you to register your method (like accessors). But wouldn't it be nice if the actions we wanted to perform on data didn't have to check in with the data class itself? Why does the data class get to decide what we do with it, and why does it get priviledged methods?\n",
+ "Now, this isn't totally true--the class could provide a way for you to register your method (like accessors). But wouldn't it be nice if the actions we wanted to perform on data didn't have to check in with the data class itself? Why does the data class get to decide what we do with it, and why does it get privileged methods?\n",
"\n",
"### Enter singledispatch\n",
"\n",
@@ -82,7 +82,7 @@
"\n",
"This concept is incredibly powerful for two reasons...\n",
"\n",
- "* many people can define actions over a DataFrame, without a quorum of priviledged methods.\n",
+ "* many people can define actions over a DataFrame, without a quorum of privileged methods.\n",
"* you can use normal importing, so don't have to worry about name conflicts\n",
"\n"
]
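
The idea in the notebook text above, actions registered against a data class rather than owned by it, can be sketched with the standard library's `functools.singledispatch`. The `describe` function and the `Table` class below are hypothetical stand-ins for illustration, not part of siuba or pandas:

```python
from functools import singledispatch


class Table:
    """Hypothetical stand-in for a data class such as a DataFrame."""

    def __init__(self, rows):
        self.rows = rows


@singledispatch
def describe(data):
    # Fallback used when no implementation is registered for the type.
    raise NotImplementedError(f"describe() not implemented for {type(data).__name__}")


@describe.register(Table)
def _describe_table(data):
    # Registered outside the Table class: Table itself is never modified.
    return f"Table with {len(data.rows)} rows"


@describe.register(list)
def _describe_list(data):
    return f"list with {len(data)} items"


table_summary = describe(Table([{"x": 1}, {"x": 2}]))
list_summary = describe([1, 2, 3])
```

Because `describe` is an ordinary function, a second package could register its own type without touching `Table`, and users pick up the function through a normal import.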
2 changes: 1 addition & 1 deletion examples/architecture/006-autocompletion.ipynb
@@ -293,7 +293,7 @@
"\n",
"Essentially, our challenge is figuring how where autocomplete could fit in. Just to set the stage, the IPython IPCompleter uses some of its own useful completion strategies, but the bulk of where we benefit comes from its use of the library jedi.\n",
"\n",
- "In the sections below, I'll first give a quick preview of how jedi works, followed by two sequence diagrams of how it's intergrated into the ipykernel."
+ "In the sections below, I'll first give a quick preview of how jedi works, followed by two sequence diagrams of how it's integrated into the ipykernel."
]
},
{
6 changes: 3 additions & 3 deletions siuba/dply/verbs.py
@@ -415,8 +415,8 @@ def __getitem__(self, x):


def var_slice(colnames, x):
- """Return indices in colnames correspnding to start and stop of slice."""
- # TODO: produces bahavior similar to df.loc[:, "V1":"V3"], but can reverse
+ """Return indices in colnames corresponding to start and stop of slice."""
+ # TODO: produces behavior similar to df.loc[:, "V1":"V3"], but can reverse
# TODO: make DRY
# TODO: reverse not including end points
if isinstance(x.start, Var):
@@ -1345,7 +1345,7 @@ def unite(__data, col, *args, sep = "_", remove = True):
__data: a DataFrame
col: name of the to-be-created column (string).
*args: names of each column to combine.
- sep: separater joining each column being combined.
+ sep: separator joining each column being combined.
remove: whether to remove the combined columns from the returned DataFrame.

"""
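
The `var_slice` docstring earlier in this file's diff describes mapping a slice of column names to integer positions, similar to `df.loc[:, "V1":"V3"]`. A minimal sketch of that idea on a plain list, assuming a hypothetical `label_slice_indices` helper (this is not siuba's implementation, and it ignores the reversal case the TODO mentions):

```python
def label_slice_indices(colnames, start=None, stop=None):
    """Hypothetical helper: map an inclusive label slice to integer positions."""
    # Missing bounds default to the first and last column.
    start_idx = 0 if start is None else colnames.index(start)
    stop_idx = len(colnames) - 1 if stop is None else colnames.index(stop)
    # Label slices include the stop label, so extend the end by one.
    return start_idx, stop_idx + 1


columns = ["V1", "V2", "V3", "V4"]
lo, hi = label_slice_indices(columns, "V1", "V3")
selected = columns[lo:hi]
```

Here `selected` holds the three columns from `V1` through `V3`, matching the inclusive end point of label-based slicing.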
2 changes: 1 addition & 1 deletion siuba/experimental/pivot/__init__.py
@@ -68,7 +68,7 @@ def pivot_longer(

if not np.all(split_lengths == split_lengths[0]):
raise ValueError(
- "Splitting by {} leads to unequal lenghts ({}).".format(
+ "Splitting by {} leads to unequal lengths ({}).".format(
names_sep if names_sep is not None else names_pattern
)
)
2 changes: 1 addition & 1 deletion siuba/siu/dispatchers.py
@@ -237,7 +237,7 @@ def __rrshift__(self, x):

This function handles two cases:
* callable >> pipe -> pipe
- * otherewise, evaluate the pipe
+ * otherwise, evaluate the pipe

"""
if isinstance(x, (Symbolic, Call)):
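
The two cases listed in the `__rrshift__` docstring above can be sketched with a small stand-alone class. The `Pipe` class here is hypothetical, and `callable(x)` is an assumption standing in for the `isinstance(x, (Symbolic, Call))` check in the source:

```python
class Pipe:
    """Hypothetical minimal pipe; not siuba's actual class."""

    def __init__(self, func):
        self.func = func

    def __call__(self, x):
        return self.func(x)

    def __rrshift__(self, x):
        # Case 1: callable >> pipe -> a new pipe that runs x, then self.
        if callable(x):
            return Pipe(lambda data: self.func(x(data)))
        # Case 2: otherwise, treat x as data and evaluate the pipe.
        return self.func(x)


add_one = Pipe(lambda n: n + 1)
double = Pipe(lambda n: n * 2)

evaluated = 3 >> add_one                 # data >> pipe: evaluates the pipe
composed = (lambda n: n * 10) >> add_one  # callable >> pipe: builds a pipe
composed_result = 3 >> composed
chained = 3 >> add_one >> double
```

Python falls back to the right operand's `__rrshift__` because neither `int` nor a plain function implements `>>` for a `Pipe`, which is what lets ordinary data sit on the left side of the operator.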
2 changes: 1 addition & 1 deletion siuba/sql/__init__.py
@@ -1,7 +1,7 @@
from .verbs import LazyTbl, sql_raw
from .translate import SqlColumn, SqlColumnAgg, SqlFunctionLookupError

- # preceed w/ underscore so it isn't exported by default
+ # proceed w/ underscore so it isn't exported by default
# we just want to register the singledispatch funcs
from .dply import vector as _vector
from .dply import string as _string
2 changes: 1 addition & 1 deletion siuba/sql/verbs.py
@@ -454,7 +454,7 @@ def _show_query(tbl, simplify = False):


if simplify:
- # try to strip table names and labels where uneccessary
+ # try to strip table names and labels where unnecessary
with use_simple_names():
print(compile_query())
else:
4 changes: 2 additions & 2 deletions siuba/tests/test_dply_series_methods.py
@@ -146,10 +146,10 @@ def do_test_missing_implementation(entry, backend):
#if get_spec_no_mutate(entry, backend):
# pytest.skip("Spec'd failure")

- ## case: Needs to be implmented
+ ## case: Needs to be implemented
## TODO(table): uses xfail
#if backend_status == "todo":
- # pytest.xfail("TODO: impelement this translation")
+ # pytest.xfail("TODO: implement this translation")
#
## case: Can't be used in a mutate (e.g. a SQL ordered set aggregate function)
## TODO(table): no_mutate