feat: upload-client uploadDirectory, by default, sorts the provided files by file name to help the user call us in a way that is deterministic and minimizes cost #1173

Merged
merged 37 commits into from
Nov 29, 2023
Changes from 5 commits
d355936
upload-client uploadDirectory, by default, checks to ensure the provi…
gobengo Nov 21, 2023
61b4869
cleanup
gobengo Nov 21, 2023
bd37127
add code to unsorted error
gobengo Nov 21, 2023
6d16e52
increase coverage of sharding.js
gobengo Nov 21, 2023
1936e4b
rm unused optional param
gobengo Nov 21, 2023
1879009
Update packages/upload-client/src/index.js
gobengo Nov 21, 2023
bc3282a
Update packages/upload-client/src/sharding.js
gobengo Nov 28, 2023
eeb28bc
docs: simplify the readme js code example (#1174)
olizilla Nov 21, 2023
34435e7
fix: don't error when we can't figure out a name for a space (#1177)
travis Nov 22, 2023
29d5630
feat!: account plan subscriptions and space usage API sugar (#1171)
Nov 22, 2023
c5ebfe7
chore(main): release capabilities 12.0.3 (#1163)
it-dag-house Nov 22, 2023
1f6f5ee
chore(main): release upload-client 12.0.2 (#1168)
it-dag-house Nov 22, 2023
28c857a
chore(main): release access 18.0.3 (#1166)
it-dag-house Nov 22, 2023
6004e8b
docs: get receipt (#1160)
vasco-santos Nov 22, 2023
f6b8cd9
feat: add store.get and upload.get to clients (#1178)
Nov 25, 2023
9503f16
chore(main): release upload-client 12.1.0 (#1180)
it-dag-house Nov 25, 2023
61aeb4a
chore(main): release w3up-client 11.1.0 (#1179)
it-dag-house Nov 25, 2023
ae5f852
fix: export filecoin types (#1185)
Nov 27, 2023
a93a4d1
chore(main): release w3up-client 11.1.1 (#1186)
it-dag-house Nov 27, 2023
120bdbc
fix: export ProgressStatus
Nov 27, 2023
cbaebf0
chore(main): release w3up-client 11.1.2 (#1187)
it-dag-house Nov 27, 2023
85a1197
fix: thread abort signal through login functions (#1189)
travis Nov 28, 2023
adeeec2
chore(main): release w3up-client 11.1.3 (#1190)
it-dag-house Nov 28, 2023
3f13942
feat: aggregator keeping oldest piece ts (#1188)
vasco-santos Nov 28, 2023
e26a068
fix: storefront events cron with max concurrency (#1191)
vasco-santos Nov 28, 2023
89c5c7b
chore(main): release filecoin-client 3.1.3 (#1162)
it-dag-house Nov 28, 2023
de9f84a
chore(main): release filecoin-api 4.2.0 (#1164)
it-dag-house Nov 28, 2023
7c2f3ed
chore(main): release upload-api 7.3.4 (#1165)
it-dag-house Nov 28, 2023
0617fe5
adjust uploadDirectory to have options.customOrder option and sort fi…
gobengo Nov 28, 2023
01f9442
Merge branch 'main' into 1172-uploadDirectory-sorted
gobengo Nov 28, 2023
17db4b2
lint
gobengo Nov 28, 2023
16c0c06
fix typo
gobengo Nov 28, 2023
9c1186a
Update packages/upload-client/src/index.js
gobengo Nov 29, 2023
11cfe79
rename getSortKey to getComparedValue per review feedback
gobengo Nov 29, 2023
21adf04
change // to jsdoc comment on UploadDirectoryOptions
gobengo Nov 29, 2023
0777d10
fix coverage
gobengo Nov 29, 2023
7024180
Merge branch 'main' into 1172-uploadDirectory-sorted
gobengo Nov 29, 2023
13 changes: 10 additions & 3 deletions packages/upload-client/src/index.js
@@ -6,13 +6,17 @@ import * as Store from './store.js'
import * as Upload from './upload.js'
import * as UnixFS from './unixfs.js'
import * as CAR from './car.js'
import { ShardingStream } from './sharding.js'
import { ShardingStream, requireSortedFiles } from './sharding.js'

export { Store, Upload, UnixFS, CAR }
export * from './sharding.js'

const CONCURRENT_REQUESTS = 3

/**
* @typedef {import('./types.js').FileLike} FileLike
*/

/**
* Uploads a file to the service and returns the root data CID for the
* generated DAG.
@@ -63,13 +67,16 @@ export async function uploadFile(conf, file, options = {}) {
* has the capability to perform the action.
*
* The issuer needs the `store/add` and `upload/add` delegated capability.
* @param {import('./types.js').FileLike[]} files File data.
* @param {Iterable<FileLike> & { sorted?: boolean }} files Files that should be in the directory.
* To ensure determinism in the IPLD encoding, by default these files MUST be sorted by file.name or this function will return a rejected promise.
* To explicitly upload in an indeterminate order, pass `files` with `files.sorted === false`.
* @param {import('./types.js').UploadDirectoryOptions} [options]
*/
export async function uploadDirectory(conf, files, options = {}) {
const entries = 'sorted' in files ? files : requireSortedFiles(files)
return await uploadBlockStream(
conf,
UnixFS.createDirectoryEncoderStream(files, options),
UnixFS.createDirectoryEncoderStream(entries, options),
options
)
}
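
The calling convention documented above can be sketched without the client itself. This is a minimal illustration under stated assumptions: plain `{ name }` objects stand in for `FileLike`, and `byName` is a hypothetical helper, not part of upload-client. It also shows that, at this point in the PR, the opt-out is detected with `'sorted' in files`, so the mere presence of the property skips the ordering check.

```javascript
// Stand-ins for FileLike objects; only `name` matters for ordering here.
const files = [{ name: 'c.txt' }, { name: 'a.txt' }, { name: 'b.txt' }]

// Hypothetical helper (not part of upload-client): compare by name using
// the same `===` / `<` semantics as the comparator added in this diff.
const byName = (a, b) => (a.name === b.name ? 0 : a.name < b.name ? -1 : 1)

// Copy before sorting so the caller's array is left untouched.
const sortedFiles = [...files].sort(byName)

// Opting out: any `sorted` property on the iterable (here `false`)
// makes `'sorted' in files` true, so the ordering check is skipped.
const unchecked = Object.assign([...files], { sorted: false })

console.log(sortedFiles.map((f) => f.name)) // [ 'a.txt', 'b.txt', 'c.txt' ]
console.log('sorted' in unchecked, 'sorted' in sortedFiles) // true false
```

Sorting a copy keeps the caller's original order available, which matters if the same array is reused elsewhere.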
79 changes: 79 additions & 0 deletions packages/upload-client/src/sharding.js
@@ -1,5 +1,9 @@
import { blockEncodingLength, encode, headerEncodingLength } from './car.js'

/**
* @typedef {import('./types.js').FileLike} FileLike
*/

// https://observablehq.com/@gozala/w3up-shard-size
const SHARD_SIZE = 133_169_152

@@ -84,3 +88,78 @@ export class ShardingStream extends TransformStream {
})
}
}

/**
* @template T
* wrap an iterable in a new iterable that will iterate the same items,
* but will error if the iterated items aren't sorted according to a comparator
* @implements {Iterable<T>}
*/
class Sorted {
sorted = /** @type {const} */ (true)
/**
* @param {Iterable<T>} iterable
* @param {(a: T, b: T) => number} comparator
*/
constructor(iterable, comparator) {
this.iterable = iterable
this.comparator = comparator
}
[Symbol.iterator]() {
const { comparator, iterable } = this
return function* () {
let prev = null
for (const cur of iterable) {
if (prev && comparator(prev, cur) === 1) {
throw Object.assign(
new Error(
`sequentially iterated items were not sorted as expected`
),
{ unsorted: [prev, cur], code: 'SORTED_EXPECTATION_UNMET' }
)
}
yield cur
prev = cur
}
}.bind(this)()
}
}
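
A possible variant of the check in `Sorted`, offered as a sketch and not as the code in this PR. The diff compares the comparator result with `=== 1`, which is enough for the `ascending` comparator in this file (it returns exactly -1, 0, or 1), but would miss comparators such as `(a, b) => a - b` that return other positive numbers. The diff also uses `prev &&`, which skips the comparison when the previous item is falsy; that is fine for file objects but not for values like `0` or `''`. The name `assertSorted` is hypothetical.

```javascript
// Hypothetical generator version of the ordering check (not the PR's code).
function* assertSorted(iterable, comparator) {
  let hasPrev = false
  let prev
  for (const cur of iterable) {
    // `> 0` accepts any positive comparator result, not only exactly 1.
    // `hasPrev` avoids skipping the check when `prev` is falsy (0, '').
    if (hasPrev && comparator(prev, cur) > 0) {
      throw Object.assign(
        new Error('sequentially iterated items were not sorted as expected'),
        { unsorted: [prev, cur], code: 'SORTED_EXPECTATION_UNMET' }
      )
    }
    yield cur
    prev = cur
    hasPrev = true
  }
}

const numeric = (a, b) => a - b

console.log([...assertSorted([0, 1, 2], numeric)]) // [ 0, 1, 2 ]

let caught
try {
  // 5 is followed by 2; numeric(5, 2) is 3, which an `=== 1` check would miss.
  Array.from(assertSorted([0, 5, 2], numeric))
} catch (err) {
  caught = err
}
console.log(caught.code, caught.unsorted) // SORTED_EXPECTATION_UNMET [ 5, 2 ]
```

As in the class above, the check is lazy: nothing is thrown until the consumer iterates far enough to reach the out-of-order pair.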

/**
* given an iterable of files, return another iterable that ensures
* that the files are iterated in a sorted order.
*
* @param {Iterable<import('./types.js').FileLike>} files
* @param {(file: import('./types.js').FileLike) => string} getSortKey - given a FileLike, return a value by which all the FileLikes should be sorted
* @returns
*/
export const requireSortedFiles = (files, getSortKey = (a) => a.name) => {
return new Sorted(files, (a, b) => defaultFileComparator(a, b, getSortKey))
}

/**
* Default comparator for FileLikes. Sorts by file name in ascending order.
*
* @param {FileLike} a
* @param {FileLike} b
* @param {(file: FileLike) => string} getSortKey
*/
export const defaultFileComparator = (a, b, getSortKey = (a) => a.name) => {
Review comment (Member):

Minor, but getSortKey is actually returning a value not a key.

Reply (Contributor Author):

ah I was using key in the sense of this https://docs.python.org/3/howto/sorting.html#key-functions
but I think you mean it like the 'key/value' of the File object, which is probably a more reasonable interpretation in this context than what I meant. I will rename it to getSortValue because that seems at least slightly less likely to cause confusion than what I did

return ascending(a, b, getSortKey)
}

/**
* a comparator for sorting in ascending order. Use with Sorted or Array#sort.
*
* @template T
* @param {T} a
* @param {T} b
* @param {(i: T) => any} getSortKey - given an item being sorted, return the value by which it should be sorted
*/
function ascending(a, b, getSortKey) {
const ask = getSortKey(a)
const bsk = getSortKey(b)
if (ask === bsk) return 0
else if (ask < bsk) return -1
return 1
}
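
To make the ordering concrete, here is a small sketch that repeats the shape of `ascending` so it runs on its own and uses it with `Array#sort`. The sample names are made up for illustration. Because the comparison uses `<` on strings, names are ordered by UTF-16 code unit, so a leading `/` and uppercase letters sort before lowercase letters.

```javascript
// Same shape as `ascending` in this diff, repeated so the snippet is standalone.
function ascending(a, b, getSortKey) {
  const ask = getSortKey(a)
  const bsk = getSortKey(b)
  if (ask === bsk) return 0
  return ask < bsk ? -1 : 1
}

// Illustrative names only; '/' (0x2F) < 'B' (0x42) < 'a' (0x61) < 'c' (0x63).
const names = ['c.txt', 'a.txt', '/b.txt', 'B.txt']

// Array#sort calls the comparator with two arguments, so the key function
// is supplied through a wrapper.
const sorted = [...names].sort((a, b) => ascending(a, b, (s) => s))

console.log(sorted) // [ '/b.txt', 'B.txt', 'a.txt', 'c.txt' ]
```

This ordering is locale-independent, which is what a deterministic encoding needs; a locale-aware comparison such as `localeCompare` could give different results on different machines.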
93 changes: 92 additions & 1 deletion packages/upload-client/test/index.test.js
@@ -6,7 +6,12 @@ import * as CAR from '@ucanto/transport/car'
import * as Signer from '@ucanto/principal/ed25519'
import * as StoreCapabilities from '@web3-storage/capabilities/store'
import * as UploadCapabilities from '@web3-storage/capabilities/upload'
import { uploadFile, uploadDirectory, uploadCAR } from '../src/index.js'
import {
uploadFile,
uploadDirectory,
uploadCAR,
defaultFileComparator,
} from '../src/index.js'
import { serviceSigner } from './fixtures.js'
import { randomBlock, randomBytes } from './helpers/random.js'
import { toCAR } from './helpers/car.js'
@@ -368,6 +373,92 @@ describe('uploadDirectory', () => {

assert.equal(carCIDs.length, 2)
})

it('ensures files is sorted unless sorted property also provided', async () => {
const space = await Signer.generate()
const agent = await Signer.generate() // The "user" that will ask the service to accept the upload
const proofs = await Promise.all([
StoreCapabilities.add.delegate({
issuer: space,
audience: agent,
with: space.did(),
expiration: Infinity,
}),
UploadCapabilities.add.delegate({
issuer: space,
audience: agent,
with: space.did(),
expiration: Infinity,
}),
])
const service = mockService({
store: {
add: provide(StoreCapabilities.add, ({ capability }) => ({
ok: {
status: 'upload',
headers: { 'x-test': 'true' },
url: 'http://localhost:9200',
with: space.did(),
link: /** @type {import('../src/types.js').CARLink} */ (
capability.nb.link
),
},
})),
},
upload: {
add: provide(UploadCapabilities.add, ({ capability }) => {
if (!capability.nb) throw new Error('nb must be present')
return { ok: capability.nb }
}),
},
})

const server = Server.create({
id: serviceSigner,
service,
codec: CAR.inbound,
validateAuthorization,
})
const connection = Client.connect({
id: serviceSigner,
codec: CAR.outbound,
channel: server,
})
/**
* @param {Iterable<import('../src/types.js').FileLike>} files
*/
const upload = (files) =>
uploadDirectory(
{ issuer: agent, with: space.did(), proofs, audience: serviceSigner },
files,
{
connection,
}
)

const unsortedFiles = [
new File([await randomBytes(32)], '/b.txt'),
new File([await randomBytes(32)], '/b.txt'),
new File([await randomBytes(32)], 'c.txt'),
new File([await randomBytes(32)], 'a.txt'),
]
assert.rejects(
upload(unsortedFiles),
'uploading unsorted files returns rejected promise'
)

// sorted files should work
const sortedFiles = [...unsortedFiles].sort(defaultFileComparator)
assert.doesNotReject(
upload(sortedFiles),
'uploading unsorted files returns rejected promise'
)

assert.doesNotReject(
upload(Object.assign([...unsortedFiles], { sorted: false })),
'can upload unsorted files if sorted property is provided'
)
})
})
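
One thing worth noting about the test above: assuming `assert` is Node's `assert` module, `assert.rejects` and `assert.doesNotReject` return promises, so the calls only gate the test when they are awaited. The sketch below shows the awaited form. The `upload` function here is a hypothetical stand-in for the real upload call, and the dynamic `import()` is used only so the snippet runs as either CommonJS or ESM.

```javascript
async function main() {
  // Dynamic import keeps the snippet runnable as CommonJS or ESM.
  const { default: assert } = await import('node:assert')

  // Hypothetical stand-in for an upload that rejects on unsorted input.
  const upload = async (isSorted) => {
    if (!isSorted) {
      throw Object.assign(new Error('unsorted'), {
        code: 'SORTED_EXPECTATION_UNMET',
      })
    }
    return 'ok'
  }

  // Awaited: execution continues only after the expected rejection is seen.
  await assert.rejects(upload(false), { code: 'SORTED_EXPECTATION_UNMET' })

  // Awaited: an unexpected rejection would fail at this line.
  await assert.doesNotReject(upload(true))

  // When the expectation is wrong, the awaited assertion itself rejects
  // with an AssertionError, which a test runner would report as a failure.
  let failedAsExpected = false
  try {
    await assert.rejects(upload(true))
  } catch (err) {
    failedAsExpected = err instanceof assert.AssertionError
  }
  return failedAsExpected
}

const result = main()
result.then((failedAsExpected) => console.log(failedAsExpected))
```

Matching on the `code` property ties the assertion to the specific error introduced in this PR instead of accepting any rejection.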

describe('uploadCAR', () => {