docs: edits across the board
Signed-off-by: Alex Aizman <[email protected]>
alex-aizman committed Aug 10, 2024
1 parent 875e0e9 commit bef16fb
Showing 41 changed files with 134 additions and 150 deletions.
41 changes: 13 additions & 28 deletions README.md
Original file line number Diff line number Diff line change
@@ -3,7 +3,7 @@
![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Go Report Card](https://goreportcard.com/badge/github.com/NVIDIA/aistore)

AIStore (AIS for short) is a built from scratch, lightweight storage stack tailored for AI apps. It's an elastic cluster that can grow and shrink at runtime and can be ad-hoc deployed, with or without Kubernetes, anywhere from a single Linux machine to a bare-metal cluster of any size.
AIStore (AIS for short) is a built-from-scratch, lightweight storage stack tailored for AI apps. It's an elastic cluster that can grow and shrink at runtime and can be ad-hoc deployed, with or without Kubernetes, anywhere from a single Linux machine to a bare-metal cluster of any size.

AIS [consistently shows balanced I/O distribution and linear scalability](https://aistore.nvidia.com/blog/2024/02/16/multihome-bench) across arbitrary numbers of clustered nodes. The ability to scale linearly with each added disk was, and remains, one of the main incentives. Much of the initial design was also driven by the ideas to [offload](https://aistore.nvidia.com/blog/2023/06/09/aisio-transforms-with-webdataset-pt-3) custom dataset transformations (often referred to as [ETL](https://aistore.nvidia.com/blog/2021/10/21/ais-etl-1)). And finally, since AIS is a software system that aggregates Linux machines to provide storage for user data, there's the requirement number one: reliability and data protection.

@@ -60,7 +60,7 @@ Since prerequisites boil down to, essentially, having Linux with a disk the depl

| Option | Objective |
| --- | ---|
| [Local playground](https://github.com/NVIDIA/aistore/blob/main/docs/getting_started.md#local-playground) | AIS developers or first-time users, Linux or Mac OS; to get started, run `make kill cli aisloader deploy <<< $'N\nM'`, where `N` is a number of targets, `M` - gateways |
| [Local playground](https://github.com/NVIDIA/aistore/blob/main/docs/getting_started.md#local-playground) | AIS developers or first-time users, Linux or Mac OS; to get started, run `make kill cli aisloader deploy <<< $'N\nM'`, where `N` is a number of [targets](/docs/overview.md#terminology), `M` - gateways |
| Minimal production-ready deployment | This option utilizes preinstalled docker image and is targeting first-time users or researchers (who could immediately start training their models on smaller datasets) |
| [Easy automated GCP/GKE deployment](https://github.com/NVIDIA/aistore/blob/main/docs/getting_started.md#kubernetes-deployments) | Developers, first-time users, AI researchers |
| [Large-scale production deployment](https://github.com/NVIDIA/ais-k8s) | Requires Kubernetes and is provided via a separate repository: [ais-k8s](https://github.com/NVIDIA/ais-k8s) |
@@ -80,16 +80,16 @@ AIStore supports multiple ways to populate itself with existing datasets, includ
* **copy** multiple matching objects;
* **archive** multiple objects
* **prefetch** remote bucket or parts of thereof;
* **download** raw http(s) addressible directories, including (but not limited to) Cloud storages;
* **promote** NFS or SMB shares accessible by one or multiple (or all) AIS target nodes;
* **download** raw http(s) addressable directories, including (but not limited to) Cloud storages;
* **promote** NFS or SMB shares accessible by one or multiple (or all) AIS [target](/docs/overview.md#terminology) nodes;

> The on-demand "way" is maybe the most popular, whereby users just start running their workloads against a [remote bucket](docs/providers.md) with AIS cluster positioned as an intermediate fast tier.
But there's more. In [v3.22](https://github.com/NVIDIA/aistore/releases/tag/v1.3.22), we introduce [blob downloader](/docs/blob_downloader.md), a special facility to download very large remote objects (BLOBs). And in [v3.23](https://github.com/NVIDIA/aistore/releases/tag/v1.3.23), there's a new capability, dubbed [bucket inventory](/docs/s3inventory.md), to list very large S3 buckets _fast_.

## Installing from release binaries

Generally, AIStore (cluster) requires at least some sort of [deployment](/deploy#contents) procedure. There are standalone binaries, though, that can be [built](Makefile) from source or, alternatively, installed directly from GitHub:
Generally, AIStore (cluster) requires at least some sort of [deployment](/deploy#contents) procedure. There are standalone binaries, though, that can be [built](Makefile) from source or installed directly from GitHub:

```console
$ ./scripts/install_from_binaries.sh --help
@@ -99,25 +99,13 @@ The script installs [aisloader](/docs/aisloader.md) and [CLI](/docs/cli.md) from

## PyTorch integration

AIS is one of the PyTorch [Iterable Datapipes](https://github.com/pytorch/data/tree/main/torchdata/datapipes/iter/load#iterable-datapipes).
PyTorch integration is a growing set of datasets (both iterable and map-style), samplers, and dataloaders:

Specifically, [TorchData](https://github.com/pytorch/data) library provides:
* [AISFileLister](https://pytorch.org/data/main/generated/torchdata.datapipes.iter.AISFileLister.html#aisfilelister)
* [AISFileLoader](https://pytorch.org/data/main/generated/torchdata.datapipes.iter.AISFileLoader.html#aisfileloader)
* [Taxonomy of abstractions and API reference](/docs/pytorch.md)
* [AIS plugin for PyTorch: usage examples](https://github.com/NVIDIA/aistore/tree/main/python/aistore/pytorch/README.md)
* [Jupyter notebook examples](https://github.com/NVIDIA/aistore/tree/main/python/examples/aisio-pytorch/)

to list and, respectively, load data from AIStore.

Further references and usage examples - in our technical blog at https://aistore.nvidia.com/blog:
* [PyTorch: Loading Data from AIStore](https://aistore.nvidia.com/blog/2022/07/12/aisio-pytorch)
* [Python SDK: Getting Started](https://aistore.nvidia.com/blog/2022/07/20/python-sdk)

Since AIS natively supports a number of [remote backends](/docs/providers.md), you can also use (PyTorch + AIS) to iterate over Amazon S3 and Google Cloud buckets, and more.

## Reuse

This repo includes [SGL and Slab allocator](/memsys) intended to optimize memory usage, [Streams and Stream Bundles](/transport) to multiplex messages over long-lived HTTP connections, and a few other sub-packages providing rather generic functionality.

With a little effort, they all could be extracted and used outside.
Since AIS natively supports [remote backends](/docs/providers.md), you can also use (PyTorch + AIS) to iterate over Amazon S3, GCS and Azure buckets, and more.

## Guides and References

@@ -151,9 +139,6 @@ With a little effort, they all could be extracted and used outside.
- [Jobs](/docs/cli/job.md)
- Security and Access Control
- [Authentication Server (AuthN)](/docs/authn.md)
- Tutorials
- [Tutorials](/docs/tutorials/README.md)
- [Videos](/docs/videos.md)
- Power tools and extensions
- [Reading, writing, and listing *archives*](/docs/archive.md)
- [Distributed Shuffle](/docs/dsort.md)
@@ -195,16 +180,16 @@ With a little effort, they all could be extracted and used outside.
- [Getting started](/docs/getting_started.md)
- [Docker](/docs/docker_main.md)
- [Useful scripts](/docs/development.md)
- Profiling, race-detecting, and more
- Profiling, race-detecting and more
- Batch jobs
- [Batch operations](/docs/batch.md)
- [eXtended Actions (xactions)](/xact/README.md)
- [eXtended Actions (xactions)](https://github.com/NVIDIA/aistore/blob/main/xact/README.md)
- [CLI: `ais job`](/docs/cli/job.md) and [`ais show job`](/docs/cli/show.md), including:
- [prefetch remote datasets](/docs/cli/object.md#prefetch-objects)
- [copy bucket](/docs/cli/bucket.md#copy-bucket)
- [copy multiple objects](/docs/cli/bucket.md#copy-multiple-objects)
- [download remote BLOBs](/docs/cli/blob-downloader.md)
- [promote NFS or SMB share](https://aistore.nvidia.com/blog/2022/03/17/promote), and more
- [promote NFS or SMB share](https://aistore.nvidia.com/blog/2022/03/17/promote)
- Assorted Topics
- [Virtual directories](/docs/howto_virt_dirs.md)
- [System files](/docs/sysfiles.md)
2 changes: 1 addition & 1 deletion cmd/cli/cli/const.go
@@ -228,7 +228,7 @@ const (
jobShowRebalanceArgument = "[REB_ID] [NODE_ID]"

// Perf
showPerfArgument = "show performance counters, throughput, latency, and more (" + tabtab + " specific view)"
showPerfArgument = "show performance counters, throughput, latency, disks, used/available capacities (" + tabtab + " specific view)"

// ETL
etlNameArgument = "ETL_NAME"
6 changes: 3 additions & 3 deletions cmd/cli/cli/show_hdlr.go
@@ -138,23 +138,23 @@ var (
}
showCmdCluster = cli.Command{
Name: cmdCluster,
Usage: "show cluster nodes and utilization",
Usage: "main dashboard: show cluster at-a-glance (nodes, software versions, utilization, capacity, memory and more)",
ArgsUsage: showClusterArgument,
Flags: showCmdsFlags[cmdCluster],
Action: showClusterHandler,
BashComplete: showClusterCompletions, // NOTE: level 0 hardcoded
Subcommands: []cli.Command{
{
Name: cmdSmap,
Usage: "show Smap (cluster map)",
Usage: "show cluster map (Smap)",
ArgsUsage: optionalNodeIDArgument,
Flags: showCmdsFlags[cmdSmap],
Action: showSmapHandler,
BashComplete: suggestAllNodes,
},
{
Name: cmdBMD,
Usage: "show BMD (bucket metadata)",
Usage: "show bucket metadata (BMD)",
ArgsUsage: optionalNodeIDArgument,
Flags: showCmdsFlags[cmdBMD],
Action: showBMDHandler,
2 changes: 1 addition & 1 deletion docs/_posts/2021-07-30-etl.md
@@ -18,7 +18,7 @@ Of course, I’m talking about ETL workloads. Machine learning has three, and on

ETL – or you can simply say “data preprocessing” because that’s what it is (my advice, though, if I may, would be to say “ETL” as it may help institute a sense of shared values, etc.) – in short, ETL is something that is usually done prior to training.

Examples? Well, ask a random person to name a fruit, and you’ll promptly hear back “an apple.” Similarly, ask anyone to name an ETL workload, and many, maybe most, will immediately respond with “augmentation”. Which in and of itself is a shortcut for a bunch of concrete sprightly verbs: flip, rotate, scale, crop, and more.
Examples? Well, ask a random person to name a fruit, and you’ll promptly hear back “an apple.” Similarly, ask anyone to name an ETL workload, and many, maybe most, will immediately respond with “augmentation”. Which in and of itself is a shortcut for a bunch of concrete sprightly verbs: flip, rotate, scale, crop and more.

My point? My point is, and always will be, that any model – and any deep-learning neural network, in particular – is only as good as the data you feed into it. That’s why they flip and rotate and what-not. And that’s precisely why they augment or, more specifically, extract-transform-load, raw datasets commonly used to train deep learning classifiers. Preprocess, train, and repeat. Reprocess, retrain, and compare the resulting mAP (for instance). And so on.

2 changes: 1 addition & 1 deletion docs/_posts/2023-04-03-transform-images-with-python-sdk.md
@@ -190,7 +190,7 @@ etl_group(image_etl)

### AIS/PyTorch connector

In the steps above, we demonstrated a few ways to transform objects, but to use the results we need to load them into a Pytorch Dataset and DataLoader. In PyTorch, a dataset can be defined by inheriting [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset). Datasets can be fed into a `DataLoader` to handle batching, shuffling, etc. (see ['torch.utils.data.DataLoader'](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)).
In the steps above, we demonstrated a few ways to transform objects, but to use the results we need to load them into a PyTorch Dataset and DataLoader. In PyTorch, a dataset can be defined by inheriting [`torch.utils.data.Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset). Datasets can be fed into a `DataLoader` to handle batching, shuffling, etc. (see ['torch.utils.data.DataLoader'](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)).

To implement inline ETL, transforming objects as we read them, you will need to create a custom PyTorch Dataset as described [by PyTorch here](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html). In the future, AIS will likely provide some of this functionality directly. For now, we will use the output of the offline ETL (bucket-to-bucket) described above and use the provided `AISDataset` to read the transformed results. More info on reading AIS data into PyTorch can be found [on the AIS blog here](https://aiatscale.org/blog/2022/07/12/aisio-pytorch).

2 changes: 1 addition & 1 deletion docs/_posts/2023-04-10-tco-any-to-any.md
@@ -121,4 +121,4 @@ And that's the upshot.

## References

* [Lifecycle management: maintenance mode, rebalance/rebuild, and more](/docs/lifecycle_node.md)
* [Lifecycle management: maintenance mode, rebalance/rebuild and more](/docs/lifecycle_node.md)
@@ -128,10 +128,10 @@ def view_data(dataloader):
2. Documentation, blogs, videos:
- https://aiatscale.org
- https://github.com/NVIDIA/aistore/tree/main/docs
- Pytorch intro to Datasets and DataLoaders: https://pytorch.org/tutorials/beginner/basics/data_tutorial.html
- PyTorch intro to Datasets and DataLoaders: https://pytorch.org/tutorials/beginner/basics/data_tutorial.html
- Discussion on Datasets, DataPipes, DataLoaders: https://sebastianraschka.com/blog/2022/datapipes.html
3. Full code example
- [Pytorch Pipelines With WebDataset Example](/python/examples/aisio-pytorch/pytorch_webdataset.py)
- [PyTorch Pipelines With WebDataset Example](/python/examples/aisio-pytorch/pytorch_webdataset.py)
4. Dataset
- [The Oxford-IIIT Pet Dataset](https://www.robots.ox.ac.uk/~vgg/data/pets/)

2 changes: 1 addition & 1 deletion docs/archive.md
@@ -29,7 +29,7 @@ All sharding formats are equally supported across the entire set of AIS APIs. Fo

> ie., objects formatted as .tar, .tgz, etc. - see above
and including the corresponding pathnames into generated result sets. Clients can run concurrent multi-object (source bucket => destination bucket) transactions to en masse generate new archives from [selected](/docs/batch.md) subsets of files, and more.
and including the corresponding pathnames into generated result sets. Clients can run concurrent multi-object (source bucket => destination bucket) transactions to en masse generate new archives from [selected](/docs/batch.md) subsets of files.

APPEND to existing archives is also provided but limited to [TAR only](https://aistore.nvidia.com/blog/2021/08/10/tar-append).

2 changes: 1 addition & 1 deletion docs/batch.md
@@ -39,7 +39,7 @@ Complete and most recently updated list of supported jobs can be found in this [

Last (but not the least) is - time. Job execution may take many seconds, sometimes minutes or hours.

Examples include erasure coding or n-way mirroring a dataset, resharding and reshuffling a dataset, and more.
Examples include erasure coding or n-way mirroring a dataset, resharding and reshuffling a dataset and more.

Global rebalance gets (automatically) triggered by any membership changes (nodes joining, leaving, powercycling, etc.) that can be further visualized via `ais show rebalance` CLI.

2 changes: 1 addition & 1 deletion docs/blob_downloader.md
@@ -16,7 +16,7 @@ AIStore supports multiple ways to populate itself with existing datasets, includ
* **copy** multiple matching objects;
* **archive** multiple objects
* **prefetch** remote bucket or parts of thereof;
* **download** raw http(s) addressible directories, including (but not limited to) Cloud storages;
* **download** raw http(s) addressable directories, including (but not limited to) Cloud storages;
* **promote** NFS or SMB shares accessible by one or multiple (or all) AIS target nodes;

> The on-demand "way" is maybe the most popular, whereby users just start running their workloads against a [remote bucket](docs/providers.md) with AIS cluster positioned as an intermediate fast tier.
6 changes: 3 additions & 3 deletions docs/bucket.md
@@ -41,11 +41,11 @@ AIStore uses the popular and well-known bucket abstraction, originally (likely)

Similar to S3, AIS bucket is a _container for objects_.

> An object, in turn, is a file **and** a metadata that describes that object and normally includes: checksum, version, references to copies (replicas), size, last access time, source bucket (if object's origin is a Cloud bucket), custom user-defined attributes, and more.
> An object, in turn, is a file **and** a metadata that describes that object and normally includes: checksum, version, references to copies (replicas), size, last access time, source bucket (if object's origin is a Cloud bucket), custom user-defined attributes and more.
AIS is a flat `<bucket-name>/<object-name>` storage hierarchy where named buckets store user datasets.

In addition, each AIS bucket is a point of applying (per-bucket) management policies: checksumming, versioning, erasure coding, mirroring, LRU eviction, checksum and/or version validation, and more.
In addition, each AIS bucket is a point of applying (per-bucket) management policies: checksumming, versioning, erasure coding, mirroring, LRU eviction, checksum and/or version validation.

AIS buckets *contain* user data performing the same function as, for instance:

@@ -695,7 +695,7 @@ For background and usage examples, please see [CLI: AWS-specific bucket configur
* [`ais ls`](https://github.com/NVIDIA/aistore/blob/main/docs/cli/bucket.md#list-objects)
* [Virtual directories](/docs/howto_virt_dirs.md)

`ListObjects` API returns a page of object names and, optionally, their properties (including sizes, access time, checksums, and more), in addition to a token that serves as a cursor, or a marker for the *next* page retrieval.
`ListObjects` API returns a page of object names and, optionally, their properties (including sizes, access time, checksums), in addition to a token that serves as a cursor, or a marker for the *next* page retrieval.

> Go [ListObjects](https://github.com/NVIDIA/aistore/blob/main/api/bucket.go) API
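
The cursor-style paging described for `ListObjects` above (a page of names plus a token for the *next* page) could be consumed roughly as follows. This is a minimal sketch under stated assumptions, not the AIS Go API: the `page` type, the `lister` interface, `fakeLister`, and `listAll` are all hypothetical names introduced here for illustration, and the token format (a decimal offset) is made up for the in-memory stand-in.

```go
package main

import (
	"fmt"
	"strconv"
)

// page is a hypothetical stand-in for one list response:
// a batch of object names plus an opaque continuation token.
type page struct {
	Names []string
	Token string // empty when there are no more pages
}

// lister is a hypothetical interface (an assumption, not the AIS API).
type lister interface {
	ListPage(token string, pageSize int) (page, error)
}

// fakeLister serves pages from an in-memory slice.
// Its token is simply the decimal offset of the next name.
type fakeLister struct{ names []string }

func (f *fakeLister) ListPage(token string, pageSize int) (page, error) {
	if pageSize <= 0 {
		return page{}, fmt.Errorf("page size must be positive, got %d", pageSize)
	}
	start := 0
	if token != "" {
		n, err := strconv.Atoi(token)
		if err != nil || n < 0 || n > len(f.names) {
			return page{}, fmt.Errorf("invalid token %q", token)
		}
		start = n
	}
	end := start + pageSize
	if end > len(f.names) {
		end = len(f.names)
	}
	p := page{Names: f.names[start:end]}
	if end < len(f.names) {
		p.Token = strconv.Itoa(end) // cursor for the next page
	}
	return p, nil
}

// listAll follows the token from page to page until it comes back empty.
func listAll(l lister, pageSize int) ([]string, error) {
	var all []string
	token := ""
	for {
		p, err := l.ListPage(token, pageSize)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Names...)
		if p.Token == "" {
			return all, nil
		}
		token = p.Token
	}
}

func main() {
	l := &fakeLister{names: []string{"a", "b", "c", "d", "e"}}
	names, err := listAll(l, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(names), "objects listed:", names)
}
```

The caller treats the token as opaque and only checks whether it is empty, so the loop stays the same whatever the server encodes in it.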
2 changes: 1 addition & 1 deletion docs/cli.md
@@ -142,7 +142,7 @@ Following is a brief summary (that's non-exhaustive and slightly outdated):
| [`ais job`](/docs/cli/job.md) | Query and manage jobs (aka eXtended actions or `xactions`). |
| [`ais object`](/docs/cli/object.md) | PUT and GET (write and read), APPEND, archive, concat, list (buckets, objects), move, evict, promote, ... |
| [`ais search`](/docs/cli/search.md) | Search `ais` commands. |
| [`ais show`](/docs/cli/show.md) | Monitor anything and everything: performance (all aspects), buckets, jobs, remote clusters, and more. |
| [`ais show`](/docs/cli/show.md) | Monitor anything and everything: performance (all aspects), buckets, jobs, remote clusters and more. |
| [`ais log`](/docs/cli/log.md) | Download ais nodes' logs or view the logs in real time. |
| [`ais storage`](/docs/cli/storage.md) | Show capacity usage on a per bucket basis (num objects and sizes), attach/detach mountpaths (disks). |
{: .nobreak}