A better README file (#1079)

* Updated the tutorial document. 1. Corrected the spelling mistake -> (sigular to single) 2. Corrected the statement -> the number of dimensions is the rank of the array. 3. Made 2 more small changes.
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix typo
* Updated README.md file, Added contribution Guidelines section, Updated the installation, and Hacking sections with code snippets.
* Added scripts, in the Getting Started section, inspired from the README of Tensorflow.
* Added resources list in the Getting started section, and Updated the contributing Guidelines sections, (inspired from Numpy, Scipy's README).
* Provided a quick guide that tells everything about the GSoC 2023 program in the contributing Guidelines section.
* Created a new file that will work as a GitHub action to check all markdown-files in the root of the repository for any broken links. (It runs only once a week).
* Changed one of the badge under the Project status. Co-authored-by: Claudia Comito <[email protected]>
* Added a section- New, which might be used to announce any news.

---------

Co-authored-by: SaiSuraj27 <[email protected]>
Co-authored-by: Claudia Comito <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
helmholtz-analytics · Feb 9, 2023 · fade48d
1 parent bcea48a
commit fade48d

Showing 2 changed files with 117 additions and 45 deletions.
diff --git a/.github/workflows/markdown-links-check.yml b/.github/workflows/markdown-links-check.yml
@@ -0,0 +1,20 @@
+name: Markdown Links Check
+# runs every Monday at 09:00 UTC
+on:
+  schedule:
+    - cron: "0 9 * * 1"
+
+jobs:
+  check-links:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@master
+      - uses: gaurav-nelson/github-action-markdown-link-check@v1
+        # Checks all markdown files in the repository root but ignores subfolders.
+        # Remove the max-depth setting to check every .md file in the entire repo.
+        with:
+          use-quiet-mode: 'yes'
+          # Specifying yes to show only errors in the output
+          use-verbose-mode: 'yes'
+          # Specifying yes to show detailed HTTP status for checked links.
+          max-depth: 0
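
Note on the schedule in the workflow above: GitHub Actions evaluates `schedule` cron expressions in UTC, so `"0 9 * * 1"` means minute 0, hour 9, any day of month, any month, weekday 1 (Monday). The following is a minimal sketch, not part of this commit, of how the next trigger time could be computed for this one expression; the function name `next_monday_9am_utc` is made up for illustration, and actual runs may start later than the computed time when the scheduler is under load.

```python
from datetime import datetime, timedelta, timezone


def next_monday_9am_utc(now):
    """Return the first Monday 09:00 UTC strictly after `now`.

    Hypothetical helper mirroring the cron expression "0 9 * * 1".
    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    # datetime.weekday() numbers Monday as 0, so this is the number of
    # days until the coming Monday (0 if today is already Monday).
    days_ahead = (0 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=9, minute=0, second=0, microsecond=0
    )
    # If today is Monday and 09:00 has already passed, move on one week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


# The commit date (Thursday, Feb 9, 2023) as a starting point.
print(next_monday_9am_utc(datetime(2023, 2, 9, 12, 0, tzinfo=timezone.utc)))
# prints 2023-02-13 09:00:00+00:00
```

A general cron parser would need to handle ranges, lists and step values; this sketch hard-codes the single expression used by the workflow.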
diff --git a/README.md b/README.md
@@ -6,23 +6,21 @@
 
 Heat is a distributed tensor framework for high performance data analytics.
 
-Project Status
---------------
+# Project Status
+
+[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2531472.svg)](https://doi.org/10.5281/zenodo.2531472)
 [![Mirror and run GitLab CI](https://github.com/helmholtz-analytics/heat/actions/workflows/ci_cb.yml/badge.svg)](https://github.com/helmholtz-analytics/heat/actions/workflows/ci_cb.yml)
 [![Documentation Status](https://readthedocs.org/projects/heat/badge/?version=latest)](https://heat.readthedocs.io/en/latest/?badge=latest)
 [![codecov](https://codecov.io/gh/helmholtz-analytics/heat/branch/main/graph/badge.svg)](https://codecov.io/gh/helmholtz-analytics/heat)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![license: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![Downloads](https://pepy.tech/badge/heat)](https://pepy.tech/project/heat)
 
-NEW!
---------------
-- [Quick Start](quick_start.md) for new users and contributors (Jan 14, 2023)
-
+# New
 
+[Quick Start](quick_start.md) for new users and contributors (Jan 14, 2023).
 
-Goals
------
+# Goals
 
 Heat is a flexible and seamless open-source software for high performance data
 analytics and machine learning. It provides highly optimized algorithms and data
@@ -37,90 +35,145 @@ scientific and data science applications.
 Heat allows you to tackle your actual Big Data challenges that go beyond the
 computational and memory needs of your laptop and desktop.
 
-Features
---------
+# Features
 
 * High-performance n-dimensional tensors
 * CPU, GPU and distributed computation using MPI
 * Powerful data analytics and machine learning methods
 * Abstracted communication via split tensors
 * Python API
 
-Getting Started
----------------
-
-TL;DR: [Quick Start](quick_start.md)
-
-Check out our Jupyter Notebook [tutorial]((https://github.com/helmholtz-analytics/heat/blob/main/scripts/)tutorial.ipynb)
-right here on Github or in the /scripts directory.
-
-The complete documentation of the latest version is always deployed on
-[Read the Docs](https://heat.readthedocs.io/).
-
-Support Channels
-----------------
+# Support Channels
 
 We use [StackOverflow](https://stackoverflow.com/tags/pyheat/) as a forum for questions about Heat.
 If you do not find an answer to your question, then please ask a new question there and be sure to
 tag it with "pyheat".
 
 You can also reach us on [GitHub Discussions](https://github.com/helmholtz-analytics/heat/discussions).
 
-Requirements
-------------
+# Requirements
 
 Heat requires Python 3.7 or newer.
 Heat is based on [PyTorch](https://pytorch.org/). Specifically, we are exploiting
 PyTorch's support for GPUs *and* MPI parallelism. For MPI support we utilize
 [mpi4py](https://mpi4py.readthedocs.io). Both packages can be installed via pip
 or automatically using the setup.py.
 
-
-Installation
-------------
-
-TL;DR: [Quick Start](quick_start.md)
+# Installation
 
 Tagged releases are made available on the
 [Python Package Index (PyPI)](https://pypi.org/project/heat/). You can typically
 install the latest version with
 
-> $ pip install heat[hdf5,netcdf]
+```
+$ pip install heat[hdf5,netcdf]
+```
 
 where the part in brackets is a list of optional dependencies. You can omit
 it, if you do not need HDF5 or NetCDF support.
 
 **It is recommended to use the most recent supported version of PyTorch!**
 
-It is also very important to ensure that the PyTorch version is compatible with the local CUDA installation.
+**It is also very important to ensure that the PyTorch version is compatible with the local CUDA installation.**
 More information can be found [here](https://pytorch.org/get-started/locally/).
 
-Hacking
--------
-TL;DR: [Quick Start](quick_start.md)
+# Hacking
 
 If you want to work with the development version, you can check out the sources using
 
-> $ git clone https://github.com/helmholtz-analytics/heat.git
+```
+$ git clone https://github.com/helmholtz-analytics/heat.git
+```
 
 The installation can then be done from the checked-out sources with
 
-> $ pip install .[hdf5,netcdf,dev]
+```
+$ pip install .[hdf5,netcdf,dev]
+```
+
+# Getting Started
+
+TL;DR: [Quick Start](quick_start.md) (Read this to get a quick overview of Heat).
+
+Check out our Jupyter Notebook [**Tutorial**](https://github.com/helmholtz-analytics/heat/blob/main/scripts/)
+right here on GitHub or in the /scripts directory to learn the basics of Heat and how it works.
+
+The complete documentation of the latest version is always deployed on
+[Read the Docs](https://heat.readthedocs.io/).
+
+***Try your first Heat program***
+
+```shell
+$ python
+```
+
+```python
+>>> import heat as ht
+>>> x = ht.arange(10,split=0)
+>>> print(x)
+DNDarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ht.int32, device=cpu:0, split=0)
+>>> y = ht.ones(10,split=0)
+>>> print(y)
+DNDarray([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=ht.float32, device=cpu:0, split=0)
+>>> print(x + y)
+DNDarray([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.], dtype=ht.float32, device=cpu:0, split=0)
+```
+
+### Also, you can test your setup by running the [`heat_test.py`](https://github.com/helmholtz-analytics/heat/blob/main/scripts/heat_test.py) script:
+
+```shell
+mpirun -n 2 python heat_test.py
+```
+
+### It should print something like this:
+
+```shell
+x is distributed:  True
+Global DNDarray x:  DNDarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ht.int32, device=cpu:0, split=0)
+Global DNDarray x:
+Local torch tensor on rank  0 :  tensor([0, 1, 2, 3, 4], dtype=torch.int32)
+Local torch tensor on rank  1 :  tensor([5, 6, 7, 8, 9], dtype=torch.int32)
+```
+
+## Resources:
+
+* [Heat Tutorials](https://heat.readthedocs.io/en/latest/tutorials.html)
+* [Heat API Reference](https://heat.readthedocs.io/en/latest/autoapi/index.html)
+
+### Parallel Computing and MPI:
+
+* @davidhenty's [course](https://www.archer2.ac.uk/training/courses/200514-mpi/)
+* Wes Kendall's [Tutorials](https://mpitutorial.com/tutorials/)
+
+### mpi4py
+
+* [mpi4py docs](https://mpi4py.readthedocs.io/en/stable/tutorial.html)
+* [Tutorial](https://www.kth.se/blogs/pdc/2019/08/parallel-programming-in-python-mpi4py-part-1/)
+
+# Contribution guidelines
+
+**We welcome contributions from the community. If you want to contribute to Heat, be sure to review the [Contribution Guidelines](contributing.md) before getting started!**
+
+We use [GitHub issues](https://github.com/helmholtz-analytics/heat/issues) for tracking requests and bugs. Please see [Discussions](https://github.com/helmholtz-analytics/heat/discussions) for general questions and discussion. You can also get in touch with us on [Mattermost](https://mattermost.hzdr.de/signup_user_complete/?id=3sixwk9okpbzpjyfrhen5jpqfo), where you can sign up with your GitHub credentials. Once you log in, you can introduce yourself on the `Town Square` channel.
+
+Small improvements or fixes are always appreciated; issues labeled as **"good first issue"** may be a good starting point.
+
+If you’re unsure where to start or how your skills fit in, reach out! You can ask us here on GitHub, by leaving a comment on a relevant issue that is already open.
+
+**If you are new to contributing to open source, [this guide](https://opensource.guide/how-to-contribute/) helps explain why, what, and how to get involved.**
 
-We welcome contributions from the community, please check out our [Contribution Guidelines](contributing.md) before getting started!
+### For people who want to contribute through the GSoC 2023 program, here is a [quick guide](https://github.com/MLSC-BSOITR/Ultimate-GSOC-Guide/blob/main/GSoC2023Presentation.pdf) to the complete program.
 
-License
--------
+# License
 
 Heat is distributed under the MIT license, see our
 [LICENSE](LICENSE) file.
 
-Citing Heat
------------
+# Citing Heat
 
 If you find Heat helpful for your research, please mention it in your publications. You can cite:
 
-- Götz, M., Debus, C., Coquelin, D., Krajsek, K., Comito, C., Knechtges, P., Hagemeier, B., Tarnawa, M., Hanselmann, S., Siggel, S., Basermann, A. & Streit, A. (2020). HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 276-287). IEEE, DOI: 10.1109/BigData50022.2020.9378050.
+* Götz, M., Debus, C., Coquelin, D., Krajsek, K., Comito, C., Knechtges, P., Hagemeier, B., Tarnawa, M., Hanselmann, S., Siggel, S., Basermann, A. & Streit, A. (2020). HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 276-287). IEEE, DOI: 10.1109/BigData50022.2020.9378050.
 
 ```
 @inproceedings{heat2020,
@@ -148,8 +201,7 @@ If you find Heat helpful for your research, please mention it in your publicatio
 }
 ```
 
-Acknowledgements
-----------------
+## Acknowledgements
 
 *This work is supported by the [Helmholtz Association Initiative and
 Networking Fund](https://www.helmholtz.de/en/about_us/the_association/initiating_and_networking/)