docs: update readme with advent of code #1516

Merged
merged 26 commits into from
Dec 7, 2023
Commits
07d5819
Update duplicates_pandas.py (#1427)
boris-kogan Aug 21, 2023
6ceeead
chore(actions): update sonarsource/sonarqube-scan-action action to v2…
renovate[bot] Aug 30, 2023
d8ffdb1
chore(actions): update actions/checkout action to v4
renovate[bot] Sep 5, 2023
c6f90cb
docs: setup new docs with mkdocs (#1418)
vascoalramos Sep 12, 2023
cb66a7e
chore(actions): update actions/checkout action to v4
renovate[bot] Sep 12, 2023
fcf57b2
fix: remove the duplicated cardinality threshold under categorical an…
ricardodcpereira Sep 18, 2023
58158ef
fix: fixate matplotlib upper version
ricardodcpereira Sep 18, 2023
829acf5
docs: change from `zap` to `sparkles` (#1447)
Anselmoo Sep 19, 2023
fdc0346
fix: template {{ file_name }} error in HTML wrapper (#1380)
jogecodes Sep 20, 2023
1c500d5
feat: add density histogram (#1458)
alexbarros Sep 26, 2023
62b0231
docs: update README.html (#1461)
miriamspsantos Sep 26, 2023
6c23196
fix: bug when creating a new report (#1440)
chrimaho Sep 27, 2023
df76ea7
fix: gen wordcloud only for non-empty cols (#1459)
alexbarros Sep 27, 2023
b7fac9e
fix: table template ignoring text format (#1462)
alexbarros Sep 27, 2023
797c799
fix: to_category misshandling pd.NA (#1464)
alexbarros Sep 27, 2023
07322a5
docs: add 📊 for Key features (#1451)
Anselmoo Sep 27, 2023
6d60670
docs: fix hyperlink - related to package name change (#1457)
martin-kokos Sep 27, 2023
bc12fde
chore(deps): increase numpy upper limit (#1467)
alexbarros Sep 27, 2023
a57e234
chore(deps): fix numba package version, and filter warns (#1468)
alexbarros Sep 27, 2023
9f8bf18
chore(deps): update dependency typeguard to v4 (#1324)
renovate[bot] Oct 4, 2023
5733205
Merge branch 'develop' of https://github.com/ydataai/ydata-profiling …
Dec 6, 2023
2aca483
Merge remote-tracking branch 'origin/develop' into develop
Dec 6, 2023
86a0ea9
docs: update docs with advent of code
Dec 7, 2023
e7801c2
Merge branch 'develop' into docs/advent_code
fabclmnt Dec 7, 2023
5d9d13a
docs: update links for fabric
Dec 7, 2023
479a646
Merge branch 'docs/advent_code' of https://github.com/ydataai/ydata-p…
Dec 7, 2023
2 changes: 1 addition & 1 deletion .github/workflows/docs.yaml
@@ -64,7 +64,7 @@ jobs:
git config core.autocrlf false

- name: Setup Python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

4 changes: 2 additions & 2 deletions .github/workflows/pull-request.yml
@@ -37,7 +37,7 @@ jobs:
git config core.autocrlf false

- name: Set up Python 3.8
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

@@ -89,7 +89,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup Python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

2 changes: 1 addition & 1 deletion .github/workflows/release-deprecated.yml
@@ -19,7 +19,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup Python 3.8
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.8"

2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -20,7 +20,7 @@ jobs:
run: echo "value=${GITHUB_REF#refs/*/}" >> $GITHUB_OUTPUT

- name: Setup Python 3.10
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: "3.10"

6 changes: 3 additions & 3 deletions .github/workflows/tests.yml
@@ -53,7 +53,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
architecture: x64
@@ -101,7 +101,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
architecture: x64
@@ -185,7 +185,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- name: Setup python
-uses: actions/setup-python@v5
+uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
architecture: x64
2 changes: 1 addition & 1 deletion README.md
@@ -30,7 +30,7 @@
The package outputs a simple and digested analysis of a dataset, including **time-series** and **text**.

> **Looking for a scalable solution that can fully integrate with your database systems?**<br>
-> Leverage YData Fabric Data Catalog to connect to different databases and storages (Oracle, snowflake, PostGreSQL, GCS, S3, etc.) and leverage an interactive and guided profiling experience in Fabric. Check out the [Community Version](https://ydata.ai/ydata-fabric-free-trial).
+> Leverage YData Fabric Data Catalog to connect to different databases and storages (Oracle, snowflake, PostGreSQL, GCS, S3, etc.) and leverage an interactive and guided profiling experience in Fabric. Check out the [Community Version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community).

## ▶️ Quickstart

2 changes: 1 addition & 1 deletion docs/advanced_settings/collaborative_data_profiling.md
@@ -56,4 +56,4 @@ users and per project. YData Fabric Data Catalog helps in maintaining
regulatory compliance by identifying any sensitive data.

Try the Catalog experience today with [Fabric Community
-version](https://ydata.ai/ydata-fabric-free-trial)!
+version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community)!
2 changes: 1 addition & 1 deletion docs/features/big_data.md
@@ -13,7 +13,7 @@ computation time of the profiling becomes a bottleneck,
!!! info "Scale in a fully managed system"

Looking for a fully managed system that is able to scale the profiling
-for large datasets? [Sign up Fabric](https://ydata.ai/ydata-fabric-free-trial)
+for large datasets? [Sign up Fabric](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community)
community for distributed data profiling.

## Pyspark
14 changes: 9 additions & 5 deletions docs/features/collaborative_data_profiling.md
@@ -1,9 +1,9 @@
-# Data quality Profiling with a Collaborative experience
+# Data Catalog - A collaborative experience to profile datasets & relational databases

-!!! note
+!!! note "Data Catalog with data quality profiling"

-[Sign-up Fabric community](https://ydata.ai/ydata-fabric-free-trial) to try the **data catalog**
-and **collaborative** experience for data profiling at scale!
+[Sign-up Fabric community](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community) to try the **data catalog**
+and **collaborative** experience for datasets and database profiling at scale!

[YData Fabric](https://ydata.ai/products/fabric) is a Data-Centric AI
development platform. YData Fabric provides all capabilities of
@@ -42,6 +42,10 @@ An interactive experience that allows to drill-down in a comprehensive data prof
and relationship analysis, providing deep insights into data structure,
distributions and interactions for improved data preparation.

+<p style="text-align:center;">
+<iframe width="560" height="315" src="https://www.youtube.com/embed/9EupCg5YQLE?si=Tuu68p6sj_RzxTBn&amp;clip=UgkxGNvIAcxUiqBSepTZzP2-4evffzjU7aHX&amp;clipt=EJbiBxinoAg" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
+</p>

### Data quality indexes

Access and navigate indicators and data quality statistics, such as completeness, uniqueness
@@ -61,4 +65,4 @@ users and per project. YData Fabric Data Catalog helps in maintaining
regulatory compliance by identifying any sensitive data.

Try the Catalog experience today with [Fabric Community
-version](https://ydata.ai/ydata-fabric-free-trial)!
+version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community)!
2 changes: 1 addition & 1 deletion docs/getting-started/concepts.md
@@ -99,7 +99,7 @@ This section provides a comprehensive profiling over the potential dataset outli
based on observed variance.
The identification of outliers allows the data analyst or scientist to assess whether they are genuine data anomalies or erroneous entries, allowing for informed decisions on whether to retain, transform, or exclude these points in further analyses.

-Feature limited to user of the [cloud hosted solution](https://ydata.ai/ydata-fabric-free-trial).
+Feature limited to users of the [cloud hosted solution](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community).

## Preview data
For a quick overview of the data, ydata-profiling provides the following sections that can be easily configured by the user:
23 changes: 18 additions & 5 deletions docs/index.md
@@ -7,12 +7,17 @@ YData-profiling is a leading tool in the data understanding step of the data sci
complete with statistics and visualizations. The significance of the package lies in how it streamlines the process of
understanding and preparing data for analysis in a single line of code! If you're ready to get started see the [quickstart](getting-started/quickstart.md)!

-!!! question "Scalable solution to integrate with database systems?"
+!!! tip "Advent of Code - Get featured on ydata-profiling"

-Leverage YData Fabric Data Catalog to connect to different databases and storages **(Oracle, snowflake, PostGreSQL, GCS, S3, etc.)**
-and leverage an interactive and guided profiling experience in [Fabric](https://ydata.ai/products/fabric).
+*“I want to get into open source, but I don’t know how.”* - Does this sound familiar to you? Have you been wanting to get more involved with open-source software, but no one’s given you an entry point?

+That's why we joined [The Advent of code this year](https://zilliz.com/advent-of-code). Contribute to ydata-profiling and win some 🐼🐼 swag!

-Check out the [Community Version](https://ydata.ai/ydata-fabric-free-trial).
+How can you be part of it?

+- Give us some love with a GitHub ⭐
+- Write an article or create a tutorial like other [members of the community already did.](https://medium.com/@seckindinc/data-profiling-with-python-36497d3a1261)
+- Feeling adventurous? Contribute with a PR. We have a list of [great issues to get you started.](https://github.com/ydataai/ydata-profiling/issues?q=label%3A%22getting+started+%E2%98%9D%22+)

![ydata-profiling report](_static/img/ydata-profiling.gif)

@@ -42,6 +47,14 @@ To learn more about the package check out [concepts overview](getting-started/co
## 📝 Features, functionalities & integrations
YData-profiling can be used to deliver a variety of different applications. The documentation includes guides, tips and tricks for tackling them:

+!!! question "Data Catalog with data profiling for databases & storages"
+
+Need to profile directly from databases and data storages **(Oracle, snowflake, PostGreSQL, GCS, S3, etc.)**?
+
+Try [YData Fabric Data Catalog](https://ydata.ai/products/data_catalog) for interactive and scalable data profiling
+
+Check out the [free Community Version](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community).

| Features & functionalities | Description |
|------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|
| [Comparing datasets](features/comparing_datasets.md) | Comparing multiple version of the same dataset |
@@ -54,7 +67,7 @@ YData-profiling can be used to deliver a variety of different applications. The

### Tutorials

-Looking for how to use certain features or how to intgrate `ydata-profiling` in your currect stack and workflows,
+Looking for how to use certain features or how to integrate `ydata-profiling` in your current stack and workflows,
check our step-by-step tutorials.

- **How to master exploratory data analysis with ydata-profiling?** Check this [step-by-step tutorial](https://medium.com/ydata-ai/auditing-data-quality-with-pandas-profiling-b1bf1919f856).
2 changes: 1 addition & 1 deletion docs/integrations/pipelines.md
@@ -13,7 +13,7 @@ similar way as with Airflow.
!!! tip "Fabric Community version"

[YData Fabric](https://ydata.ai/products/fabric) has a community version that you can start using today to create data workflows with pipelines.
-[Sign up here](https://ydata.ai/ydata-fabric-free-trial) and start building your pipelines. ydata-profiling is installed by default in all YData images.
+[Sign up here](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community) and start building your pipelines. ydata-profiling is installed by default in all YData images.

![ydata-profiling in a pipeline](../_static/img/profiling_pipelines.png)

8 changes: 0 additions & 8 deletions docsrc/source/pages/reference/changelog/v4_5_1.md

This file was deleted.
