Commit

docs: update readme with advent of code (#1516)
* Update duplicates_pandas.py (#1427)

Fixing Bug Report #1384
Dataset with categorical features causes memory error even on tiny dataset.

* chore(actions): update sonarsource/sonarqube-scan-action action to v2.0.1

* chore(actions): update actions/checkout action to v4

* docs: setup new docs with mkdocs (#1418)

* chore(actions): update actions/checkout action to v4

* fix: remove the duplicated cardinality threshold under categorical and text settings

* fix: fixate matplotlib upper version

* docs: change from `zap` to `sparkles` (#1447)

Co-authored-by: Fabiana <[email protected]>

* fix: template {{ file_name }} error in HTML wrapper (#1380)

* Update javascript.html

* Update style.html

* feat: add density histogram (#1458)

* feat: add histogram density option

* test: add unit test

* fix: discard weights if exceed max_bins

* docs: update README.html (#1461)

Update url of use cases, main integrations, and common issues.

* fix: bug when creating a new report (#1440)

* fix: gen wordcloud only for non-empty cols (#1459)

* fix: table template ignoring text format (#1462)

* fix: table template ignoring text format

* fix: timeseries unit test

* fix(linting): code formatting

---------

Co-authored-by: Azory YData Bot <[email protected]>

* fix: to_category misshandling pd.NA (#1464)

* docs: add 📊 for Key features (#1451)

See also #1445 (comment)

* docs: fix hyperlink - related to package name change (#1457)

Co-authored-by: Martin Mokry <[email protected]>

* chore(deps): increase numpy upper limit (#1467)

* chore(deps): increase numpy upper limit

* chore(deps): fixate numpy version for spark

* chore(deps): fix numba package version, and filter warns (#1468)

* chore: fix numba package version, and filter warns

* fix: skip isort linter on init

* chore(deps): update dependency typeguard to v4 (#1324)

* chore(deps): update dependency typeguard to v4

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Maciej Bukczynski <[email protected]>

* docs: update docs with advent of code

* docs: update links for fabric

---------

Co-authored-by: boris-kogan <[email protected]>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Vasco Ramos <[email protected]>
Co-authored-by: ricardodcpereira <[email protected]>
Co-authored-by: Anselm Hahn <[email protected]>
Co-authored-by: Joge <[email protected]>
Co-authored-by: Alex Barros <[email protected]>
Co-authored-by: Miriam Seoane Santos <[email protected]>
Co-authored-by: Chris Mahoney <[email protected]>
Co-authored-by: Azory YData Bot <[email protected]>
Co-authored-by: martin-kokos <[email protected]>
Co-authored-by: Martin Mokry <[email protected]>
Co-authored-by: Maciej Bukczynski <[email protected]>
Co-authored-by: Fabiana Clemente <[email protected]>
15 people authored Dec 7, 2023
1 parent 06b6535 commit 8c6c315
Showing 13 changed files with 40 additions and 31 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docs.yaml
@@ -64,7 +64,7 @@ jobs:
git config core.autocrlf false
- name: Setup Python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

4 changes: 2 additions & 2 deletions .github/workflows/pull-request.yml
@@ -37,7 +37,7 @@ jobs:
git config core.autocrlf false
- name: Set up Python 3.8
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

@@ -89,7 +89,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup Python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

2 changes: 1 addition & 1 deletion .github/workflows/release-deprecated.yml
@@ -19,7 +19,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup Python 3.8
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.8"

2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -20,7 +20,7 @@ jobs:
run: echo "value=${GITHUB_REF#refs/*/}" >> $GITHUB_OUTPUT

- name: Setup Python 3.10
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

6 changes: 3 additions & 3 deletions .github/workflows/tests.yml
@@ -53,7 +53,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
architecture: x64
@@ -101,7 +101,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
architecture: x64
@@ -185,7 +185,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- name: Setup python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
architecture: x64
2 changes: 1 addition & 1 deletion README.md
@@ -30,7 +30,7 @@
The package outputs a simple and digested analysis of a dataset, including **time-series** and **text**.

> **Looking for a scalable solution that can fully integrate with your database systems?**<br>
-> Leverage YData Fabric Data Catalog to connect to different databases and storages (Oracle, snowflake, PostGreSQL, GCS, S3, etc.) and leverage an interactive and guided profiling experience in Fabric. Check out the [Community Version](https://ydata.ai/ydata-fabric-free-trial).
+> Leverage YData Fabric Data Catalog to connect to different databases and storages (Oracle, snowflake, PostGreSQL, GCS, S3, etc.) and leverage an interactive and guided profiling experience in Fabric. Check out the [Community Version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community).
## ▶️ Quickstart

2 changes: 1 addition & 1 deletion docs/advanced_settings/collaborative_data_profiling.md
@@ -56,4 +56,4 @@ users and per project. YData Fabric Data Catalog helps in maintinaing
regulatory compliance by identifying any sensitive data.

Try today the Catalog experience in with [Fabric Community
-version](https://ydata.ai/ydata-fabric-free-trial)!
+version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community)!
2 changes: 1 addition & 1 deletion docs/features/big_data.md
@@ -13,7 +13,7 @@ computation time of the profiling becomes a bottleneck,
!!! info "Scale in a fully managed system"

Looking for an fully managed system that is able to scale the profiling
-for large datasets? [Sign up Fabric](https://ydata.ai/ydata-fabric-free-trial)
+for large datasets? [Sign up Fabric](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community)
community for distributed data profiling.

## Pyspark
14 changes: 9 additions & 5 deletions docs/features/collaborative_data_profiling.md
@@ -1,9 +1,9 @@
-# Data quality Profiling with a Collaborative experience
+# Data Catalog - A collaborative experience to profile datasets & relational databases

-!!! note
+!!! note "Data Catalog with data quality profiling"

-[Sign-up Fabric community](https://ydata.ai/ydata-fabric-free-trial) to try the **data catalog**
-and **collaborative** experience for data profiling at scale!
+[Sign-up Fabric community](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community) to try the **data catalog**
+and **collaborative** experience for datasets and database profiling at scale!

[YData Fabric](https://ydata.ai/products/fabric) is a Data-Centric AI
development platform. YData Fabric provides all capabilities of
@@ -42,6 +42,10 @@ An interactive experience that allows to drill-down in a comprehensive data profiling
and relationship analysis, providing deep insights into data structure,
distributions and interactions for improved data preparation.

+<p style="text-align:center;">
+<iframe width="560" height="315" src="https://www.youtube.com/embed/9EupCg5YQLE?si=Tuu68p6sj_RzxTBn&amp;clip=UgkxGNvIAcxUiqBSepTZzP2-4evffzjU7aHX&amp;clipt=EJbiBxinoAg" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
+</p>
+
### Data quality indexes

Access and navigate indicators and data quality statistics, such as completeness, uniqueness
@@ -61,4 +65,4 @@ users and per project. YData Fabric Data Catalog helps in maintaining
regulatory compliance by identifying any sensitive data.

Try today the Catalog experience in with [Fabric Community
-version](https://ydata.ai/ydata-fabric-free-trial)!
+version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community)!
2 changes: 1 addition & 1 deletion docs/getting-started/concepts.md
@@ -99,7 +99,7 @@ This section provides a comprehensive profiling over the potential dataset outli
based on observed variance.
The identification of outliers allows the data analyst or scientist to assess whether they are genuine data anomalies or erroneous entries, allowing for informed decisions on whether to retain, transform, or exclude these points in further analyses.

-Feature limited to user of the [cloud hosted solution](https://ydata.ai/ydata-fabric-free-trial).
+Feature limited to user of the [cloud hosted solution](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community).

## Preview data
For a quick overview of the data, ydata-profiling provides the following sections that can be easily configure by the user:
23 changes: 18 additions & 5 deletions docs/index.md
@@ -7,12 +7,17 @@ YData-profiling is a leading tool in the data understanding step of the data sci
complete with statistics and visualizations. The significance of the package lies in how it streamlines the process of
understanding and preparing data for analysis in a single line of code! If you're ready to get started see the [quickstart](getting-started/quickstart.md)!

-!!! question "Scalable solution to integrate with database systems?"
+!!! tip "Advent of Code - Get featured on ydata-profiling"

-Leverage YData Fabric Data Catalog to connect to different databases and storages **(Oracle, snowflake, PostGreSQL, GCS, S3, etc.)**
-and leverage an interactive and guided profiling experience in [Fabric](https://ydata.ai/products/fabric).
+*“I want to get into open source, but I don’t know how.”* - Does this sound familiar to you? Have you been wanting to get more involved with open-source software, but no one’s given you an entry point?
+
+That's why we joined [The Advent of code this year](https://zilliz.com/advent-of-code). Contribute to ydata-profiling and win some 🐼🐼 swag!

-Check out the [Community Version](https://ydata.ai/ydata-fabric-free-trial).
+How can you be part of it?
+
+- Give us some love with a Github ⭐
+- Write an article or create a tutorial like other [members the communit already did.](https://medium.com/@seckindinc/data-profiling-with-python-36497d3a1261)
+- Feeling adventurous? Contribute with a PR. We have a list of [great issues to get you started.](https://github.com/ydataai/ydata-profiling/issues?q=label%3A%22getting+started+%E2%98%9D%22+)

![ydata-profiling report](_static/img/ydata-profiling.gif)

@@ -42,6 +47,14 @@ To learn more about the package check out [concepts overview](getting-started/co
## 📝 Features, functionalities & integrations
YData-profiling can be used to deliver a variety of different applications. The documentation includes guides, tips and tricks for tackling them:

+!!! question "Data Catalog with data profiling for databases & storages"
+
+Need to profile directly from databases and data storages **(Oracle, snowflake, PostGreSQL, GCS, S3, etc.)**?
+
+Try [YData Fabric Data Catalog](https://ydata.ai/products/data_catalog) for interactive and scalable data profiling
+
+Check out the [free Community Version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community).
+
| Features & functionalities | Description |
|------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|
| [Comparing datasets](features/comparing_datasets.md) | Comparing multiple version of the same dataset |
@@ -54,7 +67,7 @@ YData-profiling can be used to deliver a variety of different applications. The

### Tutorials

-Looking for how to use certain features or how to intgrate `ydata-profiling` in your currect stack and workflows,
+Looking for how to use certain features or how to integrate `ydata-profiling` in your currect stack and workflows,
check our step-by-step tutorials.

- **How to master exploratory data analysis with ydata-profiling?** Check this [step-by-step tutorial](https://medium.com/ydata-ai/auditing-data-quality-with-pandas-profiling-b1bf1919f856).
2 changes: 1 addition & 1 deletion docs/integrations/pipelines.md
@@ -13,7 +13,7 @@ similar way as with Airflow.
!!! tip "Fabric Community version"

[YData Fabric](https://ydata.ai/products/fabric) has a community version that you can start using today to create data workflows with pipelines.
-[Sign up here](https://ydata.ai/ydata-fabric-free-trial) and start building your pipelines. ydata-profiling is installed by default in all YData images.
+[Sign up here](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community) and start building your pipelines. ydata-profiling is installed by default in all YData images.

![ydata-profiling in a pipeline](../_static/img/profiling_pipelines.png)

8 changes: 0 additions & 8 deletions docsrc/source/pages/reference/changelog/v4_5_1.md

This file was deleted.
