Version 3.24 arrives nearly 4 months after the previous one and contains more than 400 commits that can be structured into several main categories (topics):
Table of Contents
- Core
- Initial Sharding (`ishard`); Distributed Shuffle (`dsort`)
- Authentication; Access Control
- CLI
- Python: SDK (AIStore, AuthN); PyTorch DataLoader; Tools
- Build; Lint; Continuous Integration (CI)
- Documentation and Tests
- "alerts: clear node-restarted" 16cc3ad25 | * after 10h (currently hardcoded)
- "gateways to count (GET, PUT, DELETE) errors; skip logging" 327bcc8c1
- "(new) keep-alive error counter & keep-alive alert" 4e027cff4 | * add keep-alive error counter and alert, respectively: | - "err.kalive.n" | - "keep-alive-errors" | * keep-alive alert clears after 5 minutes (default) of no-change | * separately, bump minor versions: | - aisloader, cli, authn
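The "clears after 5 minutes of no-change" rule for the keep-alive alert can be illustrated with a minimal Python sketch. It is not the AIStore implementation: the class name, the `observe` method, and the use of plain seconds for time are assumptions made for illustration only.

```python
# Hypothetical illustration: raise an alert when the keep-alive error
# counter changes, and clear it once the counter has stayed unchanged
# for CLEAR_AFTER_SEC seconds (5 minutes, the default named in the note).
CLEAR_AFTER_SEC = 5 * 60


class KeepAliveAlert:
    def __init__(self):
        self.last_count = 0      # last observed value of the error counter
        self.last_change = None  # time (seconds) of the last counter change
        self.raised = False

    def observe(self, err_count, now):
        """Feed the current counter value; return True while the alert is up."""
        if err_count != self.last_count:
            self.last_count = err_count
            self.last_change = now
            self.raised = True
        elif self.raised and now - self.last_change >= CLEAR_AFTER_SEC:
            self.raised = False
        return self.raised


alert = KeepAliveAlert()
print(alert.observe(0, now=0))    # no errors yet
print(alert.observe(1, now=10))   # counter changed: alert raised
print(alert.observe(1, now=310))  # unchanged for 300 seconds: alert cleared
```

Keeping only the last counter value and the time of its last change is enough to decide when to clear the alert; no error history is needed.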
- "tls cert (re)loader: even more alerts" 3f3e50263 | * part eight, prev. commit: 55db4cb4638a41

  | alert | comment |
  | --- | --- |
  | `tls-cert-will-soon-expire` | warning: less than 3 days remain until X.509 cert expires |
  | `tls-cert-expired` | red alert (as the name implies) |
  | `tls-cert-invalid` | ditto |

- "tls cert (re)loader: (valid, invalid, expired) state; more alerts" 55db4cb46 | * part seven, prev. commit: 1fe3c80bec3f | * with refactoring
- "tls cert (re)loader: raise/clear alerts; follow-up" 1fe3c80be | * part six, prev. commit: 7c338a2494f07edb
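The three TLS alerts listed above can be read as a simple classification of certificate state. The following Python sketch shows one way to express it. Only the alert names and the 3-day threshold come from the notes; the function name `cert_alert`, its arguments, and the boundary handling are assumptions, and this is not the actual (Go) implementation.

```python
from datetime import datetime, timedelta, timezone

# Threshold from the release note: warn when less than 3 days remain.
SOON_EXPIRE = timedelta(days=3)


def cert_alert(not_after, now, valid=True):
    """Return the alert name for a certificate, or None if no alert applies.

    `valid` stands in for whatever parsing/verification check marks a
    certificate as invalid (hypothetical parameter).
    """
    if not valid:
        return "tls-cert-invalid"
    if now >= not_after:
        return "tls-cert-expired"
    if not_after - now < SOON_EXPIRE:
        return "tls-cert-will-soon-expire"
    return None


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
print(cert_alert(now + timedelta(days=30), now))
print(cert_alert(now + timedelta(days=2), now))
print(cert_alert(now - timedelta(hours=1), now))
```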
- "state flags as red alerts and cyan warnings" 1a8503f16 | * add helper for CLI (tbd) to use two respective colors | * prev. commit: c434fb65380359c
- "state flags as red alerts and/or warnings" c434fb653 | * 'show cluster' to show red
- "node state flags (cont-d)" b3e1adfe7 | * prev. commit: 3fcd0ef3c35c
- "high number of goroutines: revise/amend; add node-state alert" 7324344cd
- "observability: add head(object) counter, latency, and error-count" 7a15bb331 | * counting only remote heads for now
- "observability: multipart-upload: add put/get metrics" 9fbf736d9
- "observability: Prometheus labels (major)" f5d271bfb | * further split StatsD and Prometheus sources: | - `statsValue` - with no labels for Prometheus | - `runner.reg` - simplified --/-- | - units and naming: computed latencies are always reported in milliseconds, | computed throughput - in MB/s | - unlike respective "total"s that are always in nanoseconds and bytes, | respectively | - units and naming: use "bytes" suffix for all ".size" metrics | (formerly: "mbytes") | - uptime is now "uptime" (formerly, "up_ms_time") | * Prometheus: add all help descriptions | * part six, prev. commit: c9016261399c5c61
- "observability: Prometheus labels" c90162613 | * cover common (target, gateway) metrics | * amend labeled helps | * continued refactoring | * part five, prev. commit: 406669f6a044e55
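The unit convention above (cumulative "total"s in nanoseconds and bytes, computed latencies in milliseconds, computed throughput in MB/s) amounts to the conversions sketched below in Python. The function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption; the notes do not state which definition is used.

```python
NS_PER_MS = 1_000_000
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def latency_ms(total_ns, count):
    """Average latency in milliseconds from a cumulative nanosecond total."""
    if count == 0:
        return 0.0
    return total_ns / count / NS_PER_MS


def throughput_mbs(total_bytes, interval_sec):
    """Throughput in MB/s from bytes transferred over an interval in seconds."""
    if interval_sec <= 0:
        return 0.0
    return total_bytes / interval_sec / BYTES_PER_MB


# 3 requests that took 6 ms in total: 2 ms on average
print(latency_ms(6_000_000, 3))
# 20 MB transferred over 10 seconds: 2 MB/s
print(throughput_mbs(20 * 1024 * 1024, 10))
```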
- "observability: Prometheus labels" 406669f6a | * initialize StatsD or Prometheus at the right time, and not before | * part four, prev. commit: d77b3d76276e34c
- "observability: Prometheus labels" d77b3d762 | * labels for disks and backends | (the rest TBD) | * revise and refactor; reduce code | * part three, prev. commit: 21c1cd6979c25a
- "observability: Prometheus labels" 21c1cd697 | * part one
- "make Prometheus default; left and right helpers" 5dd9704f7 | * Prometheus is now default for local playground as well
- "stats: add io-error metrics (new); register statically" e2c6e0dd8 | * statically register all error counters | * io error counters: GET, PUT, DELETE | * simplify and refactor | * TODO: CLI to utilize `IsIOErrMetric`
- "stats: remove cold-read-write metric (obsolete)" c8c187ef1 | * transitioned to using per-backend "total" latencies | * up cli
- "observability: per-backend 'get' and 'put' metrics - part 2" 5fd101589 | new metrics | =========== | * `<provider>.put.ns.total` and `<provider>.get.ns.total` | * this is purely the time taken for AIStore to GET/PUT from/to a remote backend | * `<provider>.e2e.put.ns.total` and `<provider>.e2e.get.ns.total` | * total time of a GET/PUT request, respectively | * i.e., AIStore overhead + time to GET/PUT from/to a remote backend | * `ver.change.n` and `ver.change.size` per backend
- "observability: per-backend 'get' and 'put' metrics (major)" 8a13453bd | * get-cold* and put-cold* are gone; `<backend>.<verb>` is the new convention | * backend to register their own metrics at construction time | * with substantial refactoring | * TODO: | - `remote-del` counter
- "build-time choice: Prometheus (default) or StatsD" c01950002 | * `statsValue` primitive to contain only one respective label: | (StatsD or Prometheus) | * part four, prev. commit: 202f4cd2d3508f
- "add docker image for authn container" b16b867f6
- "build-time choice: Prometheus (default) or StatsD" 202f4cd2d | * part three, prev. commit: b334ff969211
- "build-time choice: Prometheus (default) or StatsD" b334ff969 | * part two, prev. commit: 860c1369f059
- "stats: physically separate Prometheus and StatsD; build and lint" 860c1369f | * new build tag: `statsd` | * update make and lint scripts and associated yaml | - add build and lint permutations | * extract common constants and helpers, reduce code duplication | * update docs: document `statsd` and other build tags | - remove `AIS_PROMETHEUS` environment
- "extend 'show cluster' - add 'alert' column" 40d6580df | * move version & build to the summary | - show version (or build) with individual nodes iff there are different versions and builds, respectively | * add alert column; hide it iff all state flags are OK | - for enumeration, see cmn/cos/node_state_flags
- "Prometheus: skip zero value metrics when collecting" 9155d283d
- "disk metrics; CLI: verbose counters, empty version" bd927bc01
- "add 'ais tls validate-certificates' command" 0a2f25cc7 | * also, add load-cert case for secondary proxies (fix)
- "tls cert (re)loader: even more alerts" 3f3e50263 | * part eight, prev. commit: 55db4cb4638a41

  | alert | comment |
  | --- | --- |
  | `tls-cert-will-soon-expire` | warning: less than 3 days remain until X.509 cert expires |
  | `tls-cert-expired` | red alert (as the name implies) |
  | `tls-cert-invalid` | ditto |

- "tls cert (re)loader: (valid, invalid, expired) state; more alerts" 55db4cb46 | * part seven, prev. commit: 1fe3c80bec3f | * with refactoring
- "tls cert (re)loader: raise/clear alerts; follow-up" 1fe3c80be | * part six, prev. commit: 7c338a2494f07edb
- "[API change] show TLS certificate details; add top-level 'ais tls' command" 091f7b0e0 | * Go API: add api/x509 source: | - `api.LoadX509Cert` | - `api.GetX509Info` | * CLI: add cmd/cli/cli/x509; consolidate all TLS in there | * CLI: add top-level `ais tls`; update all related docs and references | * prev. commit sequence: 3f3e5026323deed | * separately: | - aistore as reverse-proxy is obsolete - update the docs, add | disclaimer | - related (very old) commit: 2cc82126b0625b2a1d
- "tls cert (re)loader: document; consolidate all HTTPS related topics" 7c338a249 | * part five, prev. commit: 5e92eff58c06ae3
- "tls cert (re)loader: add admin API and CLI" 5e92eff58 | * to reload unconditionally (skipping cert-changed check) | * part four, prev. commit: 21807df84a7fdee
- "tls cert (re)loader: mtime and size; fingerprint; flex. scheduling" 21807df84 | * rename all internals; refactor | * extend the state - keep mtime and size | * schedule housekeeper based on remaining time | * remove "fingerprinting" | * part three, prev. commit: 31d1a799f7e547
- "tls cert loader: rewrite from scratch; intra-cluster clients" 31d1a799f | * from scratch, prev. commit 8107a3bb10a51478 | - original ticket: [NGNSDS-632] | * separately, introduce `intra-cluster` to differentiate clients | - `NewTLS(..., intra-cluster)`
- "FSHC v2: upgrade rc2 config to rc3" 49df73872 | * temp patch only to carry out interim upgrade within v3.24 | - (ref v324) | - (v3.23 => new fshc) and (3.24.rc2 => new fshc)
- "Go API: HEAD(object) args; [config change]: IO errors" 64bf8b267 | * HEAD(object) API change, part one | * FSHC config change: IO error limit and duration | (formerly, soft error) | * with substantial refactoring
- "FSHC v2: joggers to always check before running" 061a7c9ef | * related commit: ee3957913b5724
- "assorted fixes: fs-linux; CLI iterations; FSHC; OOM periodic" 447115159 | * revise (mountpath => filesystem) resolution | - mountpath vs FS mountpoint; relative path | - refactor & cleanup; fix and add comments | * CLI: when iterating, perform aistore version check only the first time | - extend `longRun` singleton - add iteration count | * OOM periodic: flip CAS statement (typo) | * FSHC: flip CAS statement (ditto) | - refactor & cleanup | * FSHC config: reduce default soft-error limit to 10 (was 100)
- "rescan disks; manually run FSHC (advanced use only)" 69ea283b8 | * two new admin APIs, CLI, and implementations
- "never call FSHC with nil mountpath; reduce code" fdf575e20 | * also, remove erroneous assert
- "FSHC v2" f24b5aa2f | * count (GET, PUT, DELETE) errors more precisely | - aka "soft IO errors" | * move `fshc` callbacks from target's put-object to `lom` | - create, rename | * part twelve, prev. commit: 43d88d22ae8c13
- "FSHC v2: backward compatibility (config)" 11fa3cdd1
- "follow-up" f242fd47e
- "FSHC v2" 43d88d22a | * non-IO error correction | * add `EBADFD`, `ECANCELED`, `os.NewSyscallError` | * part eleven, prev. commit: a2d04da3a67ad
- "[config change]: FSHC v2 (major)" a2d04da3a | * track and handle total number of soft errors | * extend fshc config; add new knobs | * revise health/fshc.go logic | * part ten, prev. commit: ee3957913b57242
- "filesystem health checker (fshc) version 2" ee3957913 | * part nine, prev. commit: e0a312cd9fbac
- "filesystem health checker (fshc) version 2" e0a312cd9 | * add 'err-mountpath-changed-at-runtime' | - list-objects will now detect it and trigger FSHC | - TODO: consider making the check inside `Get()` and `GetAvail()` | * part eight, prev. commit: 9ba4f97dee926
- "filesystem health checker (fshc) version 2" 9ba4f97de | * fshc: resolve filesystem, compare IDs | * fshc: refactor 'run' method | * fs: amend `fs.Equal` | * core: lop.open to check bucket directory and possibly escalate | * ais: retry GET only when erasure-coded | * CLI: add format-bucket-name; cleanup | * part seven, prev. commit: f465a1a910a6faa
- "filesystem health checker (fshc) version 2" f465a1a91 | * fs: now responsible to trigger FSHC - directly and in place | * fshc: additionally check fstat and statfs | * periodic target-stats: disk stats, with and without cap-refresh | * part six, prev. commit: 1ed2c1b69f1
- "filesystem health checker (fshc) version 2" 1ed2c1b69 | * at runtime: resolve (mpath, FS) to disks, and handle: | - no disks | - disk loss | - new disk attachments | * part five, prev. commit: ccef8082e95794
- "filesystem health checker (fshc) version 2" ccef8082e | * CLI alerts: | - add 'ais storage disk' | - amend 'ais storage mountpath' | * with refactoring | * part four, prev. commit: bda7bc9901ed73
- "filesystem health checker (fshc) version 2" bda7bc990 | * CLI 'storage mountpath' to show alerts | * part three, prev. commit: 0319347b451e
- "filesystem health checker (fshc) version 2" 0319347b4 | * add disk-fault alert | * `<DISK NAME>[<alert>]` convention, with suffix enumeration in fs/api.go | * part two
- "filesystem health checker (FSHC) version 2" 733688e06 | * full rewrite | * part one
- "amend intra-cluster health ping" b550c0b2b | * w/ comments inline | * related: e3503fc67c38e02
- "(new) keep-alive error counter & keep-alive alert" 4e027cff4 | * add keep-alive error counter and alert, respectively: | - "err.kalive.n" | - "keep-alive-errors" | * keep-alive alert clears after 5 minutes (default) of no-change | * separately, bump minor versions: | - aisloader, cli, authn
- "keep-alive (follow-up)" 38c031c11 | * slow-keepalive: simplify-out check for DNS error | - related: a8a5c1e342a99 | * cold-GET: amend cleanup logic | - don't uncache (no need) | - don't remove copies (not produced yet) | - shorten EPIPE error message | * with authn enabled, 401/403 codes may be happening much more frequently | - with the potential to quickly generate megabytes of log records | - thus, making an exception | * on the related note, proxies must also count (GET, PUT, DELETE) errors
- "retry primary keepalive (part four)" 89d06dbca | * with refactoring | * prev. commits: 9c961d591141, e3503fc67c38e | * separately, CLI 'ais log get' inline help
- "amend primary election (part three)" 9c961d591 | * node => current-primary retry via pub-addr, if different | * with refactoring; logs | * part three, prev. commit: e3503fc67c38
- "retry primary keepalive (part two)" e3503fc67 | * primary => node via `palive.retry` via pub addr, if different | * with refactoring | * part two, prev. commit: a8a5c1e342a99
- "retry slow keepalive upon DNS lookup failure, given" a8a5c1e34 | * different control and pub hostnames
- "close EC streams when idle, reopen on demand" b471cb1d6 | * gateways: remove open-ec-streams logic from bucket initialization | (no need)
- "close EC streams when idle, reopen on demand (major)" 5eb467789 | * remove entire code to (statically) open streams based on BMD | * upon inactivity timeout go ahead and close EC streams | * part four, prev. commit: 0642c8572832e
- "close EC streams when idle, reopen on demand" 0642c8572 | * refactor and amend housekeeper | - use UnregInterval consistently across | - add `UnregIf` | - reduce work chan capacity to 48 (was 512); add "channel full" check | * target: implement on-EC/off-EC handler | - TODO: revisit 1m delay | * part three, prev. commit: 11148689394e
- "rebalance vs dynamic EC streams; housekeeping; dsort; downloader" a91636bc3 | * open/close and ref-count EC streams when rebalancing | * consolidate common housekeeping durations | - xactions | - notifications | - transactions | * dsort & downloader: housekeep upon the first respective usage | * with substantial refactoring
- "[config change] close EC streams when idle, reopen on demand (major)" 111486893 | * cluster config: add "ec_streams_time" | * proxy: `onEC` when initializing bucket | - s3 and, separately, native API | * new sources: ais/prxec and ais/tgtec | - add target /v1/ec endpoint | * refactor; reduce copy/paste; remove unused code | * miscellaneous micro-optimizations | * part two, prev. commit: d8a71bb59fb0a18d
- "close/reopen EC (intra-cluster) streams on demand (major)" d8a71bb59 | * ref-count EC xactions (jobs) | - `incActive` and notification callback | * EC active/inactive state is now cluster-wide information; works as | follows: | * piggyback on keep-alive heartbeats | - target => (fastKalive) => primary | - primary => (fastKalive response) => non-primary | * part one
- "list-objects: sort virtual dirs first, objects second" b630043f7 | * part four, prev. commit: da0606fa17eec
- "list-objects: amend listing virtual ('synthetic') dirs" da0606fa1 | * aws and gcp backends to handle virtual dirs, set 'is-dir' bit | * (azure TBD) | * always return virtual directories (if any) - unless | explicitly disallowed via '--no-dirs' switch | * move '--no-dirs' logic to the backends | * part three, prev. commits: 0b14d0ec37b10, a9773251e7f78
- "S3: list-objects to return all virtual subdirectories" 0b14d0ec3 | * when listing with `apc.LsNoRecursion` flag (CLI `--non-recursive`)
- "list-objects: skip virtual directories" a9773251e | * new bit flag in the control message: `LsNoDirs` | * CLI as well | * prev. commit: 02843305c19ce63
- "list-objects: skip virtual directories" 02843305c | * but only when using listed results to allocate LOMs (e.g., copy, prefetch)
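The list-objects changes above (virtual directories sorted first, objects second, and directories returned unless explicitly disallowed) can be sketched in Python as follows. The `Entry` type, the `order_listing` function, and the `no_dirs` parameter are hypothetical names for illustration; they mirror the '--no-dirs' switch only in spirit and are not the AIStore API.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    is_dir: bool = False  # stands in for the 'is-dir' bit set by backends


def order_listing(entries, no_dirs=False):
    """Return entries with virtual directories first, then objects.

    Within each group, entries are sorted by name. When `no_dirs` is set,
    virtual directories are dropped altogether.
    """
    if no_dirs:
        entries = [e for e in entries if not e.is_dir]
    # False sorts before True, so directories (not e.is_dir == False) come first
    return sorted(entries, key=lambda e: (not e.is_dir, e.name))


page = [Entry("b.txt"), Entry("zz/", is_dir=True), Entry("a.txt"), Entry("aa/", is_dir=True)]
print([e.name for e in order_listing(page)])
print([e.name for e in order_listing(page, no_dirs=True)])
```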
- "[API change] show TLS certificate details; add top-level 'ais tls' command" 091f7b0e0 | * Go API: add api/x509 source: | - `api.LoadX509Cert` | - `api.GetX509Info`
- "transport header & burst size can now be set at runtime" 0a54545e1 | * intra-cluster transport: the two knobs were readonly | - not anymore | * with refactoring
- "[API change]: extend HEAD(object) to check remote metadata" c1004dd2b | * add support for QparamLatestVer ("latest-ver") | - HEAD is now similar to GET(object) | * when checking local/remote equality, return specific cause: | what exactly failed to match | * part two, prev. commit: 64bf8b26721a90
- "Go API: HEAD(object) args; [config change]: IO errors" 64bf8b267 | * HEAD(object) API change, part one | * FSHC config change: IO error limit and duration | (formerly, soft error) | * with substantial refactoring
- "new admin API: disable/enable cloud backend at runtime" 779a7b9f2 | * still remains CLI and docs | * and also removing: | 'ais config cluster backend.conf='{"gcp":{}, "aws":{}}', and similar
- "add Go API to query configured backend providers" dac041bbe | * (the corresponding fields in the cluster config are hidden)
- "[config change]: FSHC v2 (major)" a2d04da3a | * track and handle total number of soft errors | * extend fshc config; add new knobs | * revise health/fshc.go logic | * part ten, prev. commit: ee3957913b57242
- "[API change] do not accept node URL - always require node ID" 482e720f3 | * up cli
- "[API change] do not accept node URL - always require node ID" 071ddea92 | * when not in cluster map, validate via "self-removed" history | * (security)
- "Go API: add `EnableRebalance` and `DisableRebalance`" f5deb20a2
- "(config, log): use 'log.stats_time' if defined" 053ce1175
- "[Go API change] new field `BaseNameOnly` in `ArchiveMultiObj` API" 70165c8d3 | this boolean field specifies only extracting base names as names of archived | objects; dsort must recognize the record keys correctly
- "[Go API change] new get-stats to show disk IOPS and capacity, both" 5cdab34d6 | * Go API: add get-any-stats | * CLI: `ais storage disk` & `performance disk` to include (used%, avail) capacity | * separately, fix scripts/install_from_binaries.sh
- "[API change] get-bucket-info to support prefix option" 56935d414 | * e.g.: 'ais ls ais://nnn --summary --prefix=aaa/bbb'
- "mark `Conf` field in `BackendConf` as not marshalable" 5507ee016
- "fix log-removal regression" 7e928e53f | * when total size exceeds configured maximum | * with refactoring
- "datapath query (dpq)" 8c3b9438e, fea2b63ba | * remove redundant constants - use s3 header prefix instead | * remove debug assert | * error message to include raw query
- "add docker image for AIS utilities" e9827b983 | - rename admin to ais-util | - update packages and optimized layers
- "new RMD not to trigger rebalance when disabled in the config" 550cade20 | * several distinct scenarios: by user, RMD with action message, | RMD without message (ref) | * extra check in the latter case | * still trusting local copy of the cluster config, though
- "add support for GCP and AWS backends in Google Colab" e53aaf35a
- "target startup: configured backends vs linked backends" 056d4f9cc | * failure to initialize a real (non-mock) backend - is fatal | * with minor refactoring for clarity
- "cleanup fs-path-error only in API responses" 6cfa2754e | * and not anywhere else
- "python, CLI, EC (follow-up)" e3b646378 | * python: prefetch w/ num-workers | * up cli | * close/reopen EC streams: negative timeout (fix) | * list-range xactions: minor ref
- "prefetch/copy/transform: number of concurrent workers" a5a30247d, 8aa832619 | * prefetch (job): | - extend `apc.PrefetchMsg` control: add num-workers | * CLI `ais prefetch`: add '--num-workers' option | * copy-objects/transform-objects (jobs): | - extend `apc.TCOMsg`: add num-workers | * amend & revise common list-range iterator (lrit) | * with refactoring
- "delete/evict objects: fix overcounting io errors" 89acff646 | * (evict-objects & not-in-cluster) - is a valid combination | * related config: `fshc.io_err_limit` | * related commit: e2c6e0dd877e45d71
- "[gc logs] compute total size in a housekeeping callback" 350bbff8a | * use a separate goroutine if and only if exceeded configured limit | * (micro-optimizations)
- "build rc4; fixes" 8fd68450c | * v3.24.rc4 | * rewrite `cos.SaveReader` and friends | * universally use `cos.Remove` | * introduce `err-bdir` | * log open/close-ec-streams on both sides
- "add write-xid (micro-optimizations)" 68858f950
- "housekeeper to pass monotime to a callback" b17f42234 | * with minor cleanup, micro-opt
- "superfluous response; non-existence; CLI ec-encode v2" 713920710 | * fix "superfluous response" from HEAD(bucket) when given invalid URL | * fix fs/walk vs concurrent object deletion | - non-existence (condition) includes missing-metadata | * CLI: revise/rewrite 'ais start ec-encode' and 'ais start mirror' | * with refactoring and renaming
- "intra-cluster notifications: reduce locking, mem allocations" b7965b7be | * micro-optimize
- "cmn: `CopyProps` for use externally (K8s operator)" 361595619
- "micro-optimize obj props to http headers conversion" 212d2f72f | * target obj-head handler: use pre-mapped headers, avoid repetitive churn | * api/apc: fuse textproto logic; optimize and simplify | * part two, prev. commit: 530288d44fa5fe26
- "micro-optimize obj props to http headers conversion" 530288d44 | 1. canonicalize all header constants | 2. add static map: [internal prop name => canonical header name]
- "follow-up: amend log" 75309301c | * amend log: `ec` & `transport/bundle` packages
- "ios: fix handling of devices with empty physical_block_size" f8fd327f3, 595e26261 | * prev. commit: 7324344cd3c
- "global rebalance vs targets that are being decommissioned" 56f7347a4 | from the rebalancing perspective, a target node that is in maintenance mode or | that is being decommissioned must still be considered "active" unless | this target has already reached post-rebalancing (`SnodeMaintPostReb`) state
- "follow-up: rebalance; archive" 727e45da9 | * reb: log header by strings builder | * archive: refactor 5c94da9ceb043
- "transport header & burst size can now be set at runtime" 0a54545e1 | * intra-cluster transport: the two knobs were readonly | - not anymore | * with refactoring
- "follow-up: remove changes for TLS client" 6c0979615
- "[NGNSDS-632] TLS support" 8107a3bb1
- "ios: remove `lsblk` cmd, add `sysfs-block` parsing logic instead" 19c1041df | * this removes the last linux command (executable) that aistore itself | used to run | * with refactoring
- "(intra-cluster transport, rebalance): channels, log, more log, refactoring" 9dcbfad7d | * make the transport module's verbosity settable at runtime | - redo most of the verbose logging | * transport/stream collector: | - increase chan size | - periodically dump idle streams, if any | * reb and transport/bundle modules | - add begin/end log records | * add open/close/abort log records | * LZ4 compressed stream (state) is now a pointer | * add yet another scripted test (target IDs hardcoded) | * with minor refactoring
- "API: fix panic on setting query parameters" fbcb90382
- "'uname' is a pointer" 681e4c497 | * fix `lcache` re-caching | * regression: e6045456d449c
- "add SECURITY.md to outline security policy and supported versions" 479b4ca68
- "logger: micro-optimize time stamping" 455872f67
- "aisloader: minor fix" 5499e584b
- "recognize DNS lookup error" cec2e83c0 | * and retry, if need be
- "CIDR to select public IP upon node's startup" e685b402a | * new env var `AIS_PUBLIC_IP_CIDR` | if defined, will take precedence over `AIS_CLUSTER_CIDR` | * for comments, see api/env/ais | * part two, prev. commit: 8defcb378bb508
- "reuse local-redirect CIDR to select public IP (to listen on)" 8defcb378 | * at node's startup, if its `config.HostNet.Hostname` is empty: | - list local unicast IPs; | - if there's more than one: use local-redirect CIDR to make the selection. | * in effect, reuse local-redirect CIDR for the second purpose | * with refactoring and comments inline
- "mountpath joggers for archive jobs" b7d076534
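The selection logic described in the two CIDR commits above can be approximated with a short Python sketch. The environment variable names come from the notes; the function names, the dictionary standing in for the environment, and the behaviour when no address matches (returning `None`) are assumptions, not the actual implementation.

```python
import ipaddress


def pick_cidr(env):
    """AIS_PUBLIC_IP_CIDR, if defined, takes precedence over AIS_CLUSTER_CIDR."""
    return env.get("AIS_PUBLIC_IP_CIDR") or env.get("AIS_CLUSTER_CIDR")


def select_public_ip(local_ips, cidr):
    """Pick the address to listen on when no hostname is configured.

    With a single local unicast IP there is nothing to choose; with several,
    the first one that falls inside `cidr` wins.
    """
    if not local_ips:
        raise ValueError("no local unicast IPs")
    if len(local_ips) == 1 or not cidr:
        return local_ips[0]
    network = ipaddress.ip_network(cidr, strict=False)
    for ip in local_ips:
        if ipaddress.ip_address(ip) in network:
            return ip
    return None  # assumption: the caller decides what to do when nothing matches


env = {"AIS_PUBLIC_IP_CIDR": "10.0.0.0/8", "AIS_CLUSTER_CIDR": "192.168.0.0/16"}
print(select_public_ip(["192.168.1.5", "10.0.0.7"], pick_cidr(env)))
```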
- "archive multi-object: create a shard when" 9b3513273 | * when doesn't exist, even when control-message.append is true
- "feat: add `exclude` reaction type upon finding missing extensions" 2d4b78bfc | * refactor structure and logic of `MissExtReact` | * implement `exclude` action that removes any incomplete sample if it doesn't contain all required extensions; it also removes all unnecessary extension files
- "strings: rather right; minor" d5b75b650
- "fix maximum-total-log-size handling" 6ffcadab0 | * log names do not contain ".log." anymore
- "micro-optimize multihoming; post-initialize cluster map" 8a2e7a3b7 | * add snode net-namer with two implementations: (single, multi) | * perform residual init on each new cluster map instance
- "ais: fix node-join return values to avoid panic" 21cce483f
- "ais: properly start listening on all extra pub interfaces" dc551e6c6
- "metasync: amend GFN notifications" abffbe3d4 | * num connection-refused retries: sync vs notify | * metasync-notify: never reset handle-pending timer | * add err-work-channel-full, and use it | * with minor refactoring | * part two, prev. commit: 1a01903358fcb0
- "metasync: amend GFN notifications" 1a0190335 | * always notify via metasync-post | * add extra checks when not to
- "fix: always update `black` to latest version in `fmt-fix` and `fmt-check`" 162dfef6d
- "s3: presigned HEAD request" f6fcd7c3f | * do not read HEAD resp. body | * with refactoring | * prev. commit: 5efbfcd8626d75
- "don't use io.read-all (micro-optimization)" 470f0379f | * part two | * HEAD request vs Content-Length; comments | * refactor: presigned s3; k8s client
- "don't use io.read-all (micro-optimization)" 6c8b7c0f0 | * add cos.ReadAll and cos.ReadAllN | * with minor refactoring (htrun)
- "use fixed size arrays (ref)" 2c7b1d928
- "dpq parsing: max num iterations (minor)" 1fe096c8b
- "datapath query parameter (dpq) parsing" 2edb03400 | * debug or no-debug: keys must be known or excepted
- "follow up" e84c966c2
- "general: fix `DisableColdGET` feature and add tests" cf7933392
- "follow-up" e38e68810
- "refactor: use goroutines to execute archive API calls asynchronously" 25b7cd8c6
- "follow-up (fspaths)" 6a960950e
- "ais: pass original request to `HeadObj`" 5efbfcd86
- "expose raw get/put latency metrics" d0fa2aca6
- "feat: used `cos.ParsedTemplate` to parse and generate shard's name" f0d3b6ff6
- "GOMAXPROCS, et al." e2a68fa77 | * revisit, add comments | * with minor refactoring
- "fix: Handle edge case of remainder records after sharding" 4ec9c9046
- "fix: remove wait for `Object.promote` on synchronous execution" f7c581daa
- "misc: remove statsd from aisnode image" d259b8b44
- "disable/enable cloud backend at runtime" acdfb0398 | * two-phase commit | * CLI 'ais advanced [enable/disable]' | * part two, prev. commit: 779a7b9f201e
- "misc: cleanup prod K8s container images" e3f964685
- "micro-optimize 'lmeta.unpack'" b488be001
- "aisloader: add '--list-dirs' option to list virtual subdirectories" c69d58af9
- "new object metadata type: `chunk`" 0d915b0f1 | * cos.UnsafeS, cos.UnsafeB | * with refactoring | * part one | * related commits: 397e1e2c, 57b94581, 3816bfbf
- "further isolate access to LOM internals; fstat" 397e1e2cc | * prev. commit: f882ef45732cc
- "return writer, not file; EC restore-replica; fast append" 57b945817 | * with partial rewrite: | - GET => EC restore-replica | - fast append to TAR | * tests: append to arch: more stress | * part two, prev. commit: 887bb0544e6497
- "append to TAR: remove redundant fseek, simplify" 51a22738e
- "return writer, not file; add create-part, create-slice" 887bb0544 | * part one, related commit: 3816bfbf82094
- "etl: simplify and refactor 'inline | offline' transforms" cdbb82801 | * push, redirect, and reverse
- "return object reader, not file" 3816bfbf8 | * `cos.LomReader`
- "further isolate access to LOM internals" f882ef457 | * lom.FQN
- "BID bitwise structure and flags (object metadata)" cf211aba7 | * part three, prev. commit: 8e961d76473699
- "BID bitwise structure and flags (object metadata)" 8e961d764 | * part two, prev. commit: 054fecdc94fadc
- "BID bitwise structure and flags (object metadata)" 054fecdc9 | * part one
- "ios: replace `du` with raw syscalls" 15c20536c
- "storage and bucket summary: move and parallelize on-disk sizing" 2ad584561 | * remove `fs.OnDiskSize` ('du') from the job's BEGIN phase | * run it in parallel with walking objects and, possibly, counting | remotes | * refactor `newSumm` construction
- "object 'hrw-fqn' is now a pointer" 48b5f2a9b | * part three, prev. commit: e6045456d449cf
- "object 'uname' is now a pointer" e6045456d | * part two, prev. commit: f337216fc28b7
- "idle job timeout is now atomic; version is now a pointer (part two)" 7896e263d
- "object version is now a pointer" f337216fc | * [backward compatibility] was a string
- "rename avail-paths (minor, ref)" f25f3042d
- "ios: remove unused functions" d40b97304
- "bucket summary: fix begin timeout" bc1ee691f
- "fs: replace running `sh`, `df` and `awk` with `/proc/mounts` read" 7c4dd1c3a
- "ext/dload: remove unused returns (minor)" 86939a750
- "dsort: rename `order_file` to `EKM` and improve EKM file parsing logic" ccbeefe27 | * the term order_file was misleading, as it suggested functionality related to "ordering," | * whereas its purpose is only to provide rules for categorizing source records without | * any specific order. Renaming it to EKM clarifies its role and makes the code and API spec more intuitive. | * enhance EKM file parsing logic by removing the reliance on file extensions | (new logic now auto-detects the file type by first attempting to parse it as JSON, | and then falls back to line-based parsing if fails).
- "support and validate template format strings in `dsort` and `ishard` EKM" 072294682 | * stats/target log to always log red | * flags string formatting: | - e.g. single: "OOS", multiple: "[OOS OOM]" | * part two, prev. commit: 40d6580df689370
- "fix: adjust minimum file size check for compressed archive types" 5c94da9ce | - when a file is smaller than the block size and padded with zeros, compression can remove the padding, | - resulting in a file size less than the block size. | - for compressed archive types, the check now only ensures the size is sufficient to detect the | - corresponding magic numbers, rather than strictly adhering to block size.
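The EKM auto-detection described above (try JSON first, then fall back to line-based parsing) can be sketched in Python. The two input layouts used here are assumptions made for illustration: a JSON object mapping a shard name template to a list of record keys, and text lines of the form `record_key<TAB>template`. The actual EKM formats are not spelled out in these notes, and `parse_ekm` is a hypothetical name.

```python
import json


def parse_ekm(text, sep="\t"):
    """Parse an external key map without relying on the file extension.

    Returns a dict: shard name template -> list of record keys.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict):
        # JSON layout (assumed): {"template": ["key1", "key2"], ...}
        return {template: list(keys) for template, keys in data.items()}

    # Fallback: line-based layout (assumed): "<record key><sep><template>"
    ekm = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, template = line.split(sep, 1)
        ekm.setdefault(template, []).append(key)
    return ekm


print(parse_ekm('{"shard-a-%d.tar": ["k1", "k2"]}'))
print(parse_ekm("k1\tshard-a-%d.tar\nk3\tshard-b-%d.tar\n"))
```

Trying JSON first works because ordinary line-based content is very unlikely to be a valid JSON object, so the fallback is only taken for genuinely line-oriented files.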
- "support count-based `shard_size` config in `ishard`" 767dbb660 | * rename all `max_shard_size` to `shard_size` | * refactor `isharder` archive logic to keep current shard size/count as | internal state
- "exclude samples not specified in `ishard` EKM config" fb8296e12
- "enable `ishard` to use regex-based external key map through dsort" 5580f0603
- "support regex in dsort external key map (EKM)" d78e4b4fa | * supports using regex as the record identifier to match multiple records into a pattern flexibly | * `ishard` needs this feature to use prefix as regex to pack all records under the same virtual directory into a pattern
- "display `ishard` effective total object size after applying missing ext react" 3768b73f2 | * rename `MissingExtAction` to `MissingExtManager` to reflect its role in managing all information about extensions, including their effective object size. | * display the re-calculated effective object size in a progress bar.
- "logs: replace `nlog` with `fmt` for `ishard` stdout" 0bbdc869d | * `nlog` shouldn't be used for just printing info to stdout/stderr. | replaced with `fmt` instead. | * added error handling in archive goroutines
- "integrate `dsort` with `ishard`" 25495d8b1 | * support alphanumeric/shuffle algorithms and associated configs | * replace all `log` with `nlog` when parsing CLI params | remains: | - support content sort and external key map | - enable dsort dry-run preview from `ishard`
- "enable IEC, SI formats for `max_shard_size` config in `ishard`" 661c8a73f
- "add dry-run option for `ishard` with expected shards layout" d990ada77
- "implement configurable record key and report missing extensions" a1b58cf47
- "add progress bar for `ishard` execution" 01f9b3423
- "enable `ishard` prefix option for specifying source files to include" a2a0445d4
- "restructure `ishard` package as a standalone executable" 93a80736e
- "(new) `ishard` utility to archive objects according to subdirectory paths" 252f9526a
- "dsort: fix placement of error check" 5802f1151
- "dsort(EKM): implemented `ExternalKeyMap` as shards format option" 49b0668fc
- "dsort: supported algorithms (alpha, shuffle, content key) in `DsortFramework`" 308cd2c7c
- "dsort: implemented and tested python datatype `DsortFramework` in SDK" 6d3aa4b6c
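
The regex-based external key map entries in this list (d78e4b4fa, 5580f0603) describe matching many records into one shard pattern, with `ishard` using a virtual-directory prefix as the regex. The idea can be sketched roughly as follows. This is a minimal illustration only: the map layout, the template strings, and the names `EKM` and `shard_for` are hypothetical and are not the dsort or `ishard` API.

```python
import re
from typing import Dict, Optional

# Hypothetical external key map: shard-name template -> regex over record keys.
# (Illustrative only; the actual dsort EKM format may differ.)
EKM: Dict[str, str] = {
    "train-%d.tar": r"^train/.*",
    "val-%d.tar": r"^val/.*",
}

# Compile once so each lookup is just a sequence of regex matches.
_COMPILED = [(template, re.compile(pattern)) for template, pattern in EKM.items()]


def shard_for(record_key: str) -> Optional[str]:
    """Return the shard template whose regex matches record_key, else None."""
    for template, regex in _COMPILED:
        if regex.match(record_key):
            return template
    return None  # record is not covered by the map
```

In this sketch `train/` plays the role of the virtual directory: every record under it resolves to the same template, and records that match no pattern are left out.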
- "CLI: update token filepath handling, directory creation, and error management" 60b137ffc | - rename function `tokfile` to `getTokenFilePath` for better clarity. | - rename instances of `tokenFile` in `logoutUserHandler`, `loginUserHandler`, and `revokeTokenHandler` with `tokenFilePath` for better clarity. | - replace `os.Create` with `os.CreateDir` and revert to returning the token filepath and err in `getTokenFilePath`. | - streamline error handling for getting token file path. | - replace `os.Remove` with `cos.RemoveFile` in `logoutUserHandler`.
- "CLI: create tokenfile if absent during login" 759cb7f2a
- "fix proxy access check" ff86daefa
- "show-cluster is a cluster-level operation (fix)" 08c038235
- "add Show CLuster as Cluster-Level Op" 049a3a300
- "log user-name for failed operations" d0510d879
- "authn config: add json tag to unexported fields" 622cca17d | * cli
- "authn config: add json tag to unexported fields" e24c869c5 | * fields previously added in this commit: 73682b80d403
- "add default config to docker image" 0ef7fcd3f
- "override configuration with environment variables for server" bc10234c7
- "update authn container entrypoint to use new config path" 0259869ab
- "add support for env vars for admin creds and secret" d50c2237a | - add support for the following environment variables: | * AIS_AUTHN_SECRET_KEY: Secret key for token signing | * AIS_AUTHN_SU_NAME: Admin username | * AIS_AUTHN_SU_PASS: Admin password | - documentation updated accordingly
- "refactor, maintain" 73682b80d | * config: remove rlock; use pointers
- "refactor `User` entity and remove cluster ID from LoginMsg" 658058e26 | - remove roles string[] from the User entity to simplify role management. | - remove cluster ID from the login message to streamline the login process.
- "add 'AIS_AUTHN_TOKEN' environment" 79543fd6a | * add `AIS_AUTHN_TOKEN` (value) env variable | - not to confuse with `AIS_AUTHN_TOKEN_FILE` | * refactor `api.LoadToken`
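
The distinction between `AIS_AUTHN_TOKEN` (the token value itself) and `AIS_AUTHN_TOKEN_FILE` (a path to a file holding the token) could look like this on the client side. This is a hedged sketch, not the behavior of `api.LoadToken`: the function name `resolve_token`, the precedence of the value over the file, and the JSON layout with a top-level `token` field are all assumptions made for illustration.

```python
import json
import os
from typing import Mapping, Optional


def resolve_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return an AuthN token from the environment, or None if not configured.

    Assumed precedence (not confirmed by the release notes): the literal
    value in AIS_AUTHN_TOKEN wins over the file named by AIS_AUTHN_TOKEN_FILE.
    """
    token = env.get("AIS_AUTHN_TOKEN")
    if token:
        return token
    path = env.get("AIS_AUTHN_TOKEN_FILE")
    if path and os.path.isfile(path):
        # Assumption: the file is JSON with a top-level "token" field.
        with open(path, encoding="utf-8") as f:
            return json.load(f).get("token")
    return None
```

Passing the environment as a mapping keeps the lookup easy to test without touching the real process environment.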
- "add parse-retries-flag to reuse" fcaba7c85
- "`ais put <multiple-files>` will now generate a list of failures" 01e6e484f | * e.g. `Error: failed to PUT 13 files ("/tmp/.ais-put-failures.3343698.log")`
- "'ais put --retries ' with increasing timeout, if need be" 99b7a961a
- "when 'ls bucket/objname' becomes 'ls bucket --prefix objname'" e3febd2a4
- "remove 'cluster restart required' warning for auth.enabled config" 6299e3313, 4435c92af
- "add 'ais tls validate-certificates' command" 0a2f25cc7 | * also, add load-cert case for secondary proxies (fix)
- "follow-up" 508b8add5 | * 'ais cluster set-primary --force' | * up cli
- "[API change] show TLS certificate details; add top-level 'ais tls' command" 091f7b0e0 | * Go API: add api/x509 source: | - `api.LoadX509Cert` | - `api.GetX509Info` | * CLI: add cmd/cli/cli/x509; consolidate all TLS in there | * CLI: add top-level `ais tls`; update all related docs and references | * prev. commit sequence: 3f3e5026323deed | * separately: | - aistore as reverse-proxy is obsolete - update the docs, add disclaimer | - related (very old) commit: 2cc82126b0625b2a1d
- "when command usage is multi-line; amend and refactor cli/docs" b6653a2f8 | * const `Usage` (refactor all sources) | * combine `ais cp` help and documentation; fix all cross-refs | * docs: add '--num-workers'
- "disable/enable cloud backend at runtime" acdfb0398 | * two-phase commit | * CLI 'ais advanced [enable/disable]' | * part two, prev. commit: 779a7b9f201e
- "copy/transform: number of concurrent workers" 2414c6898 | * CLI `ais cp`: add '--num-workers' option | * CLI `ais etl`: ditto | * assorted: | - github-CI python | - tests: skip ec-destroy-bucket | - fs-path-error: empty path is now treated as not
- "user-friendly "did you mean" message" d015ea5e6 | * erroneous 'ais show gs://abc[/object]' will now produce the right hint | * up cli
- "fix rendering issues for EC xactions output" 94578e9f0
- "extend 'show cluster' - add 'alert' column" 40d6580df | * move version & build to the summary | - show version (or build) with individual nodes iff there are different versions and builds, respectively | * add alert column; hide it iff all state flags are OK | - for enumeration, see cmn/cos/node_state_flags
- "tls config validation - make it a warning" 88902b55b
- "assorted fixes (minor)" 8f33765ee | * red `NetworkError` | * storage summary: rm wrong assert | * inline tips | add rule to CI that triggers python SDK AuthN tests when there are changes made to relevant files or tests.
- "an option to show calendar date and hh:mm:ss timestamp, both" 6702cfa73 | * e.g.: | - `$ ais show job --date-time --all` | - `$ ais show rebalance --date-time`
- "CLI/AuthN: update token filepath handling, directory creation, and error management" 60b137ffc | - rename function `tokfile` to `getTokenFilePath` for better clarity. | - rename instances of `tokenFile` in `logoutUserHandler`, `loginUserHandler`, and `revokeTokenHandler` with `tokenFilePath` for better clarity. | - replace `os.Create` with `os.CreateDir` and revert to returning the token filepath and err in `getTokenFilePath`. | - streamline error handling for getting token file path. | - replace `os.Remove` with `cos.RemoveFile` in `logoutUserHandler`.
- "CLI/AuthN: create tokenfile if absent during login" 759cb7f2a
- "tls config validation; user-friendly error messages; config reset" 5f9454a08, 83a186baa | * all of the above, plus: | - do not initialize TLS client unless required | - and vice versa
- "use Go API to query configured backend providers" 756a653ae
- "CLI (follow-up)" 3fdb3e6c1 | * CLI: warn mountpath with no disks (and labels) | * CLI e2e: need to wait longer when cluster has a lot of data | * stats/target: fix error message typo
- "assorted fixes: fs-linux; CLI iterations; FSHC; OOM periodic" 447115159 | * revise (mountpath => filesystem) resolution | - mountpath vs FS mountpoint; relative path | - refactor & cleanup; fix and add comments | * CLI: when iterating, perform aistore version check only the first time | - extend `longRun` singleton - add iteration count | * OOM periodic: flip CAS statement (typo) | * FSHC: flip CAS statement (ditto) | - refactor & cleanup | * FSHC config: reduce default soft-error limit to 10 (was 100)
- "show configured backend providers" 663d98975 | * ref ba492a11a580d2e | * up cli
- "show configured backend providers" ba492a11a | * config.backend section is hidden - still, | show respective completion and minimal content
- "disk metrics; CLI: verbose counters, empty version" bd927bc01 | * do not build disk metric names at runtime | * CLI: skip internal (`lcache`, `stream`) counters unless verbose | * CLI: version check vs. nodes in maintenance
- "support per-backend cumulative "total" latencies | * revise 'ais performance latency' | * use `.total.` latencies and their respective counters | * related commit: 5fd101589c7
- "'show performance'" 0609b040c | * remove redundant alias
- "update authn entities and templates" c0db9fa0b
- "list-objects: color virtual dirs" 592f63bfd | * and show nothing in the "cached" column
- "sdk/python: release version 1.8.0" a8dd990ab
- "sdk/python: refactor internal object classes for accessing and iterating over object content" 5ae897a0d
- "sdk/python: refactor module structure" 0fc6e98b4
- "sdk/python: add support for 'AIS_AUTHN_TOKEN' env var in SDK Client; bump version to v1.7.3" 49d772163
- "sdk/python: release version 1.7.2" 32ececc3c
- "sdk/python: memory usage optimization for ObjectFile" e804411fa
- "sdk/python: add example object-file stress test" 78dd6c6f5
- "sdk/python: release version 1.7.1" c2cebdf76
- "python: fix date parsing by ensuring timezone-aware datetime objects (github-CI)" 9baf63aef | - ensure all datetime objects are timezone-aware in UTC | - fix date parsing issues encountered in github-CI
- "sdk/python: object file max_resume per object file" 3eaf228d3
- "sdk/python: objectfile patches (tests + resume logic)" 9803bf97f
- "sdk/python: object group num_workers" c0faf87b6
- "sdk/python: logging changes (decouple log config from package)" c95a9fa4b
- "sdk/python: release version 1.7.0" 04261c497
- "sdk/python: ObjectFile (File-Like Object)" 6a9c9fc76 | - ObjectFile (file-like object extending BufferedIOBase) with support for retries and error recovery, including a notebook demo and tests. | - iter_from_position in ObjectReader, which returns an iterator over each chunk of bytes in the object starting from the specified byte position, including tests. | - add integration tests for `ObjectReader`. | - update Python SDK documentation (and fixes for minor related issues in AuthN documentation generation).
- "sdk/python: add extensive testing for AuthN module" 901379124 | - add individual tests for each permission (and derived role) using the Python SDK.
- "sdk/python: add Python SDK AuthN README & update docs" 232622455 | - add README.md for Python SDK AuthN sub-package. | - update make generate-sdk-docs recipe (/python/Makefile) to include aistore/sdk/authn. | - update docs via generate-sdk-docs.
- "python/authn: remove unused (derived) roles" 4d459b8b1 | - remove unsupported (only internally used) derived roles in `AccessAttr` class.
- "sdk/python: pool request sessions across python processes" 06b3a29e9 | add a dict of request sessions that is indexed by process ID, thus removing any modification of the source clients.
- "python: release sdk v1.6.0" 9e7655dfb
- "python/authn: AuthN Error Handling" cb4ea3259 | - Makes error handler method (raise_ais_error or raise_authn_error with raise_ais_error as default) | a parameter of RequestClient and modified both Client and AuthNClient to initialize | RequestClient with proper error handler. | - Separates errors and handling by package | - Minimally changes client-side usage (usage of AuthNClient and Client remains the same, | only RequestClient usage changes but rarely used by user, only internally by AuthNClient and Client).
- "python/authn: add AuthN Client Logout" 4a4b131dd
- "python/auth: add Tokens API" 100bf6b78
- "sdk/python: refactor request client to move session specific properties to SessionManager" e743e01f1
- "python/authn: add Users API" 604e60a9e | add the Users API to the authentication module, enabling the management of role-assigned users. | - add APIs to create, update, delete, get, and list users. | - add appropriate unit and integration tests.
- "python/pytorch: Implement shuffling and custom saturation factor for dynamic sampling" d41d87e6c | - shuffling with the Dataloader doesn't work when using a custom sampler, so we implement it as part of our dynamic sampler. This shuffling works by generating a random list of indices using permutation. | - also, support user-provided saturation factors.
- "python/pytorch: fix WorkerSessionManager returning None when using samplers with no workers" 84b335aa4 | when using a dynamic batch sampler without a dataloader (e.g., to just get the batch indices), | we trigger a bug where session manager returns None.
- "sdk/python: add retry support via urllib3.Retry" ad78fa1d1
- "sdk/python: fix type hint for python 3.8 compatibility" 18dc104a7 | - fix type hints for Python 3.8 compatibility (replace tuple, list, dict with Tuple, List, Dict from typing)
- "python/authn: implement Roles API" b15826eab | add the Roles API to the authentication module, enabling role-based access control features. | - add APIs to create, update, delete, get, and list roles. | - add Unit and Integration tests.
- "sdk/python: refactor object get to move request logic to object reader" 9562ed458
- "sdk/python: implement prefixes for object groups" be347a692 | add prefix support to ObjectGroup which is needed in the AIS Pytorch datasets.
- "python/pytorch: resnet50 using WebDataset" fb8757248 | an example for WebDataset training using `AISShardReader` and existing torch models.
- "python/pytorch: use Tuple instead of tuple to support older python versions" 5f909a664 | python versions 3.8 and older cannot use tuple as a type directly; instead, we must import Tuple from typing.
- "sdk/python: set custom obj props" a6afbb8e2
- "python/authn: add cluster operations - Implemented methods for listing, retrieving, registering" 98b4be02a | - listing, retrieving, registering, updating, and deleting clusters
- "python: fix env var for TLS" 0698ac2b3
- "python: enhance client to accept tokens and implement authn login" 4a1949a7a | - update the AIStore Python client to accept authorization tokens. | - add `authn login` functionality to enable users to log in and obtain tokens using their credentials. | - unit + integration tests for AuthN in Python
- "sdk/python: fix remote tests to avoid concurrent object access failures" be7ca27b3
- "sdk/python: update release version" 227652334
- "python/pytorch: solidify objects as the backing data structure type" 2bf96fb53
- "sdk/python: support object props for object head" 010b80d1e
- "python/pytorch: fix length for iterable datasets" c06c0a565
- "python/pytorch: decode shards on client side in `ShardReader` and support non-uniform samples" 2120562e8
- "python/pytorch: create classifier model training example" bcc1f613c
- "python/pytorch: implement dynamic sampler for map based datasets" aaec09d4f
- "python/pytorch: add multiple worker support to ShardReader" a553c2861
- "python/pytorch: add progress bar to iter dataset with support for workers" 40c624bff
- "python/s3compat: update certifi dependency" b86102eac
- "python/pytorch: integrate alive_progress into ShardReader" ac71ef7d8
- "python/pytorch: add support for multiple workers in iter datasets using worker slices" b1a5afd8b
- "python/pytorch: update examples for pytorch datasets" b9b781e30
- "python/pytorch: improve error handling for datasets and remove unused Client" 8572b8ef0
- "python/pytorch: refactor datasets and utils" cef15ae7c
- "sdk/python: modify `Object.promote` to return job ID for status check" 575a04296
- "python/pytorch: add wrapper for parse_url to fix upstream torch imports" b628816e8
- "pyaisloader: enable ETL option for pyaisloader benchmarks" 0732eb990
- "sdk/python: implemented and tested prefix support in bucket summary and bucket info methods" abc0300c3
- "python/pytorch: Calculate length in iter to save additional iteration cost" bf12042f6
- "python: include `STATUS_PARTIAL_CONTENT` status code handling in bucket summary" 2800543fb
- "python/pytorch: add ShardReader example to docs" 68a5a958c
- "python/pytorch: implement WebDataset shard reader and tests" 1be385bc7
- "python/pytorch: Refactor datasets into separate files" 68e98f41f
- "pyaisloader: performance benchmark tests for AISDataset and AISIterDataset" 14cddfec2
- "sdk/python: add `Object.append_content` method" 7699758f1
- "python/pytorch: fix regression from 6904" 5429a7fbd
- "python: release python sdk v1.4.23" 63bfb4253
- "python: add functionality to fetch objects by URL - Implemented fetch_object_from_url function in the Client class to enable object retrieval using a URL." 7cbc80f17
- "python: correct return type for raw stream in ObjectReader" 198935a30 | change the return type of the raw() method in ObjectReader to return the correct file-like | object instead of bytes. Improve docstrings and add type annotations for clarity.
- "python/pytorch: follow-up" 74a8a5b74
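
Two of the Python fixes in this list are easy to show in a few lines: keeping all datetime objects timezone-aware in UTC (9baf63aef) and using `typing` generics for Python 3.8 compatibility (18dc104a7, 5f909a664). The following is a minimal sketch rather than the SDK's code; the helper names `to_utc` and `sort_timestamps` are hypothetical, and treating naive datetimes as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import List  # typing.List instead of list[...] keeps Python 3.8 compatibility


def to_utc(dt: datetime) -> datetime:
    """Return dt as a timezone-aware datetime in UTC.

    Assumption for this sketch: naive datetimes are interpreted as UTC.
    """
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def sort_timestamps(stamps: List[datetime]) -> List[datetime]:
    """Normalize to UTC first; ordering naive against aware datetimes raises TypeError."""
    return sorted(to_utc(s) for s in stamps)
```

Normalizing at the boundary, right after parsing, means every later comparison works on aware values only.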
- "github-CI: update actions/download-artifact to v4" e72fb7a93
- "github-CI: follow-up; add support for python 3.8+, remove xattrs installation" 34e3b6070
- "lint" 2e9edeec3 | * golangci 1.61.0 (was 1.60.2)
- "CI: Allow changing runner tags via variables" 6759515d2
- "github-CI: update upload-artifact v4 for pypi release" a6612f7ec
- "gitlab-CI: only run python authn tests on build success" ace4bce22
- "gitlab-CI: fix AuthN python tests not triggering automatically" d686c5991 | - add `when: always` to the rules for `authn` label and directory changes to ensure automatic job triggering. | - reorder rules to avoid every job defaulting to manual triggers.
- "bump 'google-protobuf' to address CVE-2024-7254" eb369d4c8
- "bump rexml to address CVE-2024-41946" 09a8b0377 | https://nvd.nist.gov/vuln/detail/CVE-2024-41946
- "github-CI: remove unnecessary dirs, change space config" f3369232c
- "gitlab-CI: fix" 8d6f47f8d
- "tools: amend test skipping logic" 8f50f0439
- "lint; up cli" 3500e9288 | * golangci 1.60.2 (was 1.60.1); linters: | - exportloopref (deprecated) | + copyloopvar linter
- "build: bump rc3" 461a64bc1 | * v3.24.rc3
- "CI: Do not always run python-authn tests" 1d41eced3
- "CI: correct `ishard` long test directory" e36e7a576
- "name-is-too-long, and similar cleanups (minor)" a9a9a6bdd | * checkmarx pass two
- "checkmarx compliance, part 1" c751f263d
- "build: upgrade OSS packages" a0a2b0006 | * aistore and cli, both
- "CI: fix" eae7d4c9c
- "github-CI: add authn tests" 080809de1
- "CI: Python SDK AuthN Tests" cbc3fbc60
- "lint" 9f51aed7a | * golangci 1.60.1 (was 1.59.1)
- "CI: standardize github configuration" 875e0e911
- "CI: add botocore dependencies for lint" f3733d3bf
- "CI: update pylint" fa96f71f4
- "CI: update and simplify Dockerfile" 268cfd131
- "deploy: log pod errors upon minikube setup timeout" 0d437de0b | ensures any issues causing the timeout are captured and logged for easier debugging in github workflow
- "scripts: add rancher lpp to gitlab runner setup script" f72fc7dd6
- "scripts: add colour to ignore rules list for spell check" b28cbb5e8
- "build: demote `ht://` backend; revise local-playground scripts" 1a028279b | * loopback count and size | * default number of mountpaths = 4 | * docs/getting-started | * part three, prev. commit: aa9f4288658
- "build: demote `ht://` backend; revise local-playground scripts" aa9f42886 | * add build tag `ht`; link `ht://` conditionally | * related commit: 50db672cb34f90
- "build: bump rc2 cli" 69d2659e3
- "build: bump rc2" abe0c157a | * v3.24.rc1
- "build: bump rc1" 83f6c9372 | * v3.24.rc1
- "CI: build AuthN image in docker workflow" eb9aa1dbe
- "up cli" 44324256a | * compile with extended FSHC config | * support per-backend metrics ('get-cold' removed)
- "CI: fix netflify" 843cd65f7
- "dependabot fix: REXML denial of service vulnerability" d3e470deb | * fix for https://github.com/NVIDIA/aistore/security/dependabot/25
- "fix: include `utils.sh` source in minikube deployment" 3c0a49a1e | - since commit 50db672cb, some utility functions used in `aisnode_config.sh` are missing during minikube deployment | - the fix ensures that `utils.sh` script is imported in both docker image build and minikube deployment.
- "fix: use portable env var check in deploy script for compatibility with macOS zsh" 8e5ed0006
- "authn: fix local deployment environment variable renaming" eae062e97 | - corrected environment variable in local deployment setup.
- "deploy: standardize Makefile for container build" 316e344fb
- "deploy: remove readiness script from aisnode container" 137e58791
- "general: remove Terraform as deployment option" 72fee04ff
- "scripts: fix clean_deploy panic and implement empty value checking" 8f7c3ee3a
- "build: upgrade grpc" de58a6eb1 | * https://github.com/NVIDIA/aistore/security/dependabot/24
- "CI: cleanup around linter config" 3c6a02484
- "deploy: generalize and fix building aisnode image" 434a89c18
- "build: aisnode Dockerfile follow-up" 7a0916fdd
- "build: standardize builder stage in Dockerfile" 8670a3361
- "build/CLI: upgrade OSS packages" 3e21e3eb4 | * part two, prev. commit: c331cf26dd0e6
- "build: upgrade OSS packages" c331cf26d | * prev. commit: f2ee1c0c18726e21
- "CI: add pytorch integration tests to CI" 65d02ae90
- "build: upgrade OSS packages; update Go toolchain" f2ee1c0c1 | * aistore and cli, both | * go get -u ./... && go get [email protected] | * prev. related: ceb7159b82a71afa
Technical Blog
- Resilient Data Loading with ObjectFile
- Google Colab + AIStore
- Accelerating AI Workloads with AIStore and PyTorch
- "blog: Google Colab + AIStore" fb3fe2b7f
- "docs: presigned s3 requests; edits" 164bf0ef9
- "docs: update Python SDK streaming object file example with retry" d3a541ff7
- "docs: cli/advanced.md, environment-vars.md, https.md, authn.md, cli.md" e40ec36d1 | * new content mostly around https | * v3.24 updates | * cross-references, etc. text works
- "docs: update AIStore setup instructions for Google Colab with notebook link" fbb20626c
- "docs: running AIStore in Google Colab" 53d4e43a3
- "docs: update docker-single readme, add multi-disk example" 110ecb276
- "docs: update overview, terminology" 7776b1dea
- "tech blog site: aistore.nvidia.com" f7fa977cc | * s/aiatscale.org/aistore.nvidia.com/ | * Makefile follow-up
- "docs: edits across the board" bef16fb38
- "docs: aistorage/cluster-minimal readme" 1f909bf08
- "docs: create pytorch docs and update make generate-docs" 34a00ba5b
- "blog: initial sharding (`ishard`)" 35bca58f5
- "docs: add metric names-and-types reference" 1923f3dc4 | * include both internal and externally visible names, descriptions, and labels
- "docs: amend `LRU` and `Space` configuration" 771ff60b7
- "docs: update authn documentation" 68d669f17
- "docs: disable/enable cloud backend at runtime" b0e5db9eb | * part three, prev. commit: acdfb0398439c
- "docs: add howto-virtual-directories" c73d275ed | * prev. commits: 592f63bfd814, b630043f7198
- "docs: remove mentions of `du` and `df` commands" 59719f66e
- "docs: clarify purpose and implementation of botocore patch testing" 5c3fa229c
- "docs: getting-started '--cleanup' option" 8bcfa8269
- "docs: main readme, bucket inventory" 69041a17a | * also, un-defer s3 put datapath (minor)
- "local playground: environment vs STDIN; `TAGS` vs AIS_BACKEND_PROVIDERS" 50db672cb | * (usability)
- "local playground with disks; up cli" 7b9faaf15 | * introduce `AIS_LOCAL_PLAYGROUND` env | * revert commit a53d3b0b8eb4b (ais/utils)
- "local playground with disks; clarify" a53d3b0b8 | * skip or not to skip localhost | * clarify one possible PUT fail
- "local-playground: warn extended attributes may not be supported" b108c4202
- "tests: run python ETL tests with reworked k8s CI setup" cd5277768
- "tests: a job we try to abort may have already finished" 7b0b0bca0
- "tests: fix running/finished race" 884fba383
- "tests: when killing/restoring nodes" 33fa512cb | * always try to wait for the original (prior to test running) node counts | * note: `proxyURL` is global but often used as a local var
- "tests: fix mock cloud backend" 4cfb1f22a
- "tests: implemented and refactored `ishard` long stress test" 00586467e
- "tests: cleanup logs in minikube VM after running k8s tests" 888da5de8
- "tests: add ETL tests for concurrent transformations and various object sizes" 727662f61
- "test-skipping logic (minor)" 5bf9321ec | * fix 7d0d196723f2c5