Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pr-upload: Upload bottles to the Internet Archive #10677

Merged
merged 8 commits into from
Feb 24, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
177 changes: 177 additions & 0 deletions Library/Homebrew/archive.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,177 @@
# typed: false
# frozen_string_literal: true

require "digest/md5"
require "utils/curl"

# The Internet Archive API client.
#
# @api private
class Archive
extend T::Sig

include Context
include Utils::Curl

class Error < RuntimeError
end

sig { returns(String) }
def inspect
"#<Archive: item=#{@archive_item}>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to get some tests for this class.

end

sig { params(item: T.nilable(String)).void }
def initialize(item: "homebrew")
raise UsageError, "Must set the Archive item!" unless item

@archive_item = item
end

def open_api(url, *args, auth: true)
if auth
key = Homebrew::EnvConfig.internet_archive_key
raise UsageError, "HOMEBREW_INTERNET_ARCHIVE_KEY is unset." if key.blank?

if key.exclude?(":")
raise UsageError, "Use HOMEBREW_INTERNET_ARCHIVE_KEY=access:secret. See https://archive.org/account/s3.php"
end

args += ["--header", "Authorization: AWS #{key}"]
end

curl(*args, url, print_stdout: false, secrets: key)
end

sig {
params(local_file: String,
directory: String,
remote_file: String,
warn_on_error: T.nilable(T::Boolean)).void
}
def upload(local_file, directory:, remote_file:, warn_on_error: false)
local_file = Pathname.new(local_file)
unless local_file.exist?
msg = "#{local_file} for upload doesn't exist!"
raise Error, msg unless warn_on_error

# Warn and return early here since we know this upload is going to fail.
opoo msg
return
end

md5_base64 = Digest::MD5.base64digest(local_file.read)
url = "https://#{@archive_item}.s3.us.archive.org/#{directory}/#{remote_file}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to make s3.us.archive.org a constant somewhere given it's used in a few places.

args = ["--upload-file", local_file, "--header", "Content-MD5: #{md5_base64}"]
args << "--fail" unless warn_on_error
result = T.unsafe(self).open_api(url, *args)
return if result.success? && result.stdout.exclude?("Error")

msg = "Bottle upload failed: #{result.stdout}"
raise msg unless warn_on_error

opoo msg
end

sig {
params(formula: Formula,
directory: String,
warn_on_error: T::Boolean).returns(String)
}
def mirror_formula(formula, directory: "mirror", warn_on_error: false)
formula.downloader.fetch

filename = ERB::Util.url_encode(formula.downloader.basename)
destination_url = "https://archive.org/download/#{@archive_item}/#{directory}/#{filename}"

odebug "Uploading to #{destination_url}"

upload(
formula.downloader.cached_location,
directory: directory,
remote_file: filename,
warn_on_error: warn_on_error,
)

destination_url
end

# Gets the MD5 hash of the specified remote file.
#
# @return the hash, the empty string (if the file doesn't have a hash), nil (if the file doesn't exist)
sig { params(directory: String, remote_file: String).returns(T.nilable(String)) }
def remote_md5(directory:, remote_file:)
url = "https://#{@archive_item}.s3.us.archive.org/#{directory}/#{remote_file}"
result = curl_output "--fail", "--silent", "--head", "--location", url
if result.success?
result.stdout.match(/^ETag: "(\h{32})"/)&.values_at(1)&.first || ""
else
raise Error if result.status.exitstatus != 22 && result.stderr.exclude?("404 Not Found")

nil
end
end

sig { params(directory: String, filename: String).returns(String) }
def file_delete_instructions(directory, filename)
<<~EOS
Run:
curl -X DELETE -H "Authorization: AWS $HOMEBREW_INTERNET_ARCHIVE_KEY" https://#{@archive_item}.s3.us.archive.org/#{directory}/#{filename}
Or run:
ia delete #{@archive_item} #{directory}/#{filename}
EOS
end

sig {
params(bottles_hash: T::Hash[String, T.untyped],
warn_on_error: T.nilable(T::Boolean)).void
}
def upload_bottles(bottles_hash, warn_on_error: false)
bottles_hash.each do |_formula_name, bottle_hash|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bottles_hash.each do |_formula_name, bottle_hash|
bottles_hash.each_value do |bottle_hash|

directory = bottle_hash["bintray"]["repository"]
bottle_count = bottle_hash["bottle"]["tags"].length

bottle_hash["bottle"]["tags"].each_value do |tag_hash|
filename = tag_hash["filename"] # URL encoded in Bottle::Filename#archive
delete_instructions = file_delete_instructions(directory, filename)

local_filename = tag_hash["local_filename"]
md5 = Digest::MD5.hexdigest(File.read(local_filename))

odebug "Checking remote file #{@archive_item}/#{directory}/#{filename}"
result = remote_md5(directory: directory, remote_file: filename)
case result
when nil
# File doesn't exist.
odebug "Uploading #{@archive_item}/#{directory}/#{filename}"
upload(local_filename,
directory: directory,
remote_file: filename,
warn_on_error: warn_on_error)
when md5
# File exists, hash matches.
odebug "#{filename} is already published with matching hash."
bottle_count -= 1
when ""
# File exists, but can't find hash
failed_message = "#{filename} is already published!"
raise Error, "#{failed_message}\n#{delete_instructions}" unless warn_on_error

opoo failed_message
else
# File exists, but hash either doesn't exist or is mismatched.
failed_message = <<~EOS
#{filename} is already published with a mismatched hash!
Expected: #{md5}
Actual: #{result}
EOS
raise Error, "#{failed_message}#{delete_instructions}" unless warn_on_error

opoo failed_message
end
end

odebug "Uploaded #{bottle_count} bottles"
end
end
end
10 changes: 9 additions & 1 deletion Library/Homebrew/dev-cmd/pr-pull.rb
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,8 @@ def pr_pull_args
description: "Message to include when autosquashing revision bumps, deletions, and rebuilds."
flag "--artifact=",
description: "Download artifacts with the specified name (default: `bottles`)."
flag "--archive-item=",
description: "Upload to the specified Internet Archive item (default: `homebrew`)."
flag "--bintray-org=",
description: "Upload to the specified Bintray organisation (default: `homebrew`)."
flag "--tap=",
Expand All @@ -65,6 +67,7 @@ def pr_pull_args
description: "Comma-separated list of workflows which can be ignored if they have not been run."

conflicts "--clean", "--autosquash"
conflicts "--archive-item", "--bintray-org"

named_args :pull_request, min: 1
end
Expand Down Expand Up @@ -357,6 +360,7 @@ def pr_pull

workflows = args.workflows.presence || ["tests.yml"]
artifact = args.artifact || "bottles"
archive_item = args.archive_item
bintray_org = args.bintray_org || "homebrew"
mirror_repo = args.bintray_mirror || "mirror"
tap = Tap.fetch(args.tap || CoreTap.instance.name)
Expand Down Expand Up @@ -424,7 +428,11 @@ def pr_pull
upload_args << "--keep-old" if args.keep_old?
upload_args << "--warn-on-upload-failure" if args.warn_on_upload_failure?
upload_args << "--root-url=#{args.root_url}" if args.root_url
upload_args << "--bintray-org=#{bintray_org}"
upload_args << if archive_item.present?
"--archive-item=#{archive_item}"
else
"--bintray-org=#{bintray_org}"
end
safe_system HOMEBREW_BREW_FILE, *upload_args
end
end
Expand Down
52 changes: 40 additions & 12 deletions Library/Homebrew/dev-cmd/pr-upload.rb
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
# frozen_string_literal: true

require "cli/parser"
require "archive"
require "bintray"

module Homebrew
Expand All @@ -27,6 +28,8 @@ def pr_upload_args
switch "--warn-on-upload-failure",
description: "Warn instead of raising an error if the bottle upload fails. "\
"Useful for repairing bottle uploads that previously failed."
flag "--archive-item=",
description: "Upload to the specified Internet Archive item (default: `homebrew`)."
flag "--bintray-org=",
description: "Upload to the specified Bintray organisation (default: `homebrew`)."
flag "--root-url=",
Expand All @@ -47,6 +50,18 @@ def check_bottled_formulae(bottles_hash)
end
end

def archive?(bottles_hash)
@archive ||= bottles_hash.values.all? do |bottle_hash|
bottle_hash["bottle"]["root_url"].start_with? "https://archive.com/"
end
end

def bintray?(bottles_hash)
@bintray ||= bottles_hash.values.all? do |bottle_hash|
bottle_hash["bottle"]["root_url"].match? %r{^https://[\w-]+\.bintray\.com/}
end
end

def github_releases?(bottles_hash)
@github_releases ||= bottles_hash.values.all? do |bottle_hash|
root_url = bottle_hash["bottle"]["root_url"]
Expand Down Expand Up @@ -76,11 +91,16 @@ def pr_upload
bottle_args += json_files

if args.dry_run?
service = if github_releases?(bottles_hash)
"GitHub Releases"
else
"Bintray"
end
service =
if archive?(bottles_hash)
"Internet Archive"
elsif bintray?(bottles_hash)
"Bintray"
elsif github_releases?(bottles_hash)
"GitHub Releases"
else
odie "Service specified by root_url is not recognized"
end
puts <<~EOS
brew #{bottle_args.join " "}
Upload bottles described by these JSON files to #{service}:
Expand All @@ -102,7 +122,20 @@ def pr_upload
safe_system HOMEBREW_BREW_FILE, *audit_args
end

if github_releases?(bottles_hash)
if archive?(bottles_hash)
# Handle uploading to the Internet Archive.
archive_item = args.archive_item || "homebrew"
archive = Archive.new(item: archive_item)
archive.upload_bottles(bottles_hash,
warn_on_error: args.warn_on_upload_failure?)
elsif bintray?(bottles_hash)
# Handle uploading to Bintray.
bintray_org = args.bintray_org || "homebrew"
bintray = Bintray.new(org: bintray_org)
bintray.upload_bottles(bottles_hash,
publish_package: !args.no_publish?,
warn_on_error: args.warn_on_upload_failure?)
elsif github_releases?(bottles_hash)
# Handle uploading to GitHub Releases.
bottles_hash.each_value do |bottle_hash|
root_url = bottle_hash["bottle"]["root_url"]
Expand All @@ -128,12 +161,7 @@ def pr_upload
end
end
else
# Handle uploading to Bintray.
bintray_org = args.bintray_org || "homebrew"
bintray = Bintray.new(org: bintray_org)
bintray.upload_bottles(bottles_hash,
publish_package: !args.no_publish?,
warn_on_error: args.warn_on_upload_failure?)
odie "Service specified by root_url is not recognized"
end
end
end
4 changes: 4 additions & 0 deletions Library/Homebrew/env_config.rb
Original file line number Diff line number Diff line change
Expand Up @@ -179,6 +179,10 @@ module EnvConfig
default_text: 'The "Beer Mug" emoji.',
default: "🍺",
},
HOMEBREW_INTERNET_ARCHIVE_KEY: {
description: "Use this API key when accessing the Internet Archive S3 API, where bottles are stored. " \
"The format is access:secret. See https://archive.org/account/s3.php",
},
HOMEBREW_LIVECHECK_WATCHLIST: {
description: "Consult this file for the list of formulae to check by default when no formula argument " \
"is passed to `brew livecheck`.",
Expand Down
20 changes: 20 additions & 0 deletions Library/Homebrew/test/archive_spec.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# typed: false
# frozen_string_literal: true

require "archive"

describe Archive, :needs_network do
subject(:archive) { described_class.new(item: "homebrew") }

describe "::remote_checksum" do
it "detects a published file" do
hash = archive.remote_md5(directory: ".", remote_file: "cmake-3.1.2.yosemite.bottle.tar.gz")
expect(hash).to eq("c6e525d472124670b0b635800488f438")
end

it "fails on a non-existent file" do
hash = archive.remote_md5(directory: "bottles", remote_file: "my-fake-bottle-1.0.snow_hyena.tar.gz")
expect(hash).to be nil
end
end
end
2 changes: 2 additions & 0 deletions completions/bash/brew
Original file line number Diff line number Diff line change
Expand Up @@ -1475,6 +1475,7 @@ _brew_pr_pull() {
case "$cur" in
-*)
__brewcomp "
--archive-item
--artifact
--autosquash
--bintray-mirror
Expand Down Expand Up @@ -1507,6 +1508,7 @@ _brew_pr_upload() {
case "$cur" in
-*)
__brewcomp "
--archive-item
--bintray-org
--debug
--dry-run
Expand Down
2 changes: 2 additions & 0 deletions completions/fish/brew.fish
Original file line number Diff line number Diff line change
Expand Up @@ -1053,6 +1053,7 @@ __fish_brew_complete_arg 'pr-publish' -l workflow -d 'Target workflow filename (


__fish_brew_complete_cmd 'pr-pull' 'Download and publish bottles, and apply the bottle commit from a pull request with artifacts generated by GitHub Actions'
__fish_brew_complete_arg 'pr-pull' -l archive-item -d 'Upload to the specified Internet Archive item (default: `homebrew`)'
__fish_brew_complete_arg 'pr-pull' -l artifact -d 'Download artifacts with the specified name (default: `bottles`)'
__fish_brew_complete_arg 'pr-pull' -l autosquash -d 'Automatically reformat and reword commits in the pull request to our preferred format'
__fish_brew_complete_arg 'pr-pull' -l bintray-mirror -d 'Use the specified Bintray repository to automatically mirror stable URLs defined in the formulae (default: `mirror`)'
Expand All @@ -1077,6 +1078,7 @@ __fish_brew_complete_arg 'pr-pull' -l workflows -d 'Retrieve artifacts from the


__fish_brew_complete_cmd 'pr-upload' 'Apply the bottle commit and publish bottles to Bintray or GitHub Releases'
__fish_brew_complete_arg 'pr-upload' -l archive-item -d 'Upload to the specified Internet Archive item (default: `homebrew`)'
__fish_brew_complete_arg 'pr-upload' -l bintray-org -d 'Upload to the specified Bintray organisation (default: `homebrew`)'
__fish_brew_complete_arg 'pr-upload' -l debug -d 'Display any debugging information'
__fish_brew_complete_arg 'pr-upload' -l dry-run -d 'Print what would be done rather than doing it'
Expand Down
4 changes: 3 additions & 1 deletion completions/zsh/_brew
Original file line number Diff line number Diff line change
Expand Up @@ -1229,10 +1229,11 @@ _brew_pr_publish() {
# brew pr-pull
_brew_pr_pull() {
_arguments \
'(--bintray-org)--archive-item[Upload to the specified Internet Archive item (default: `homebrew`)]' \
'--artifact[Download artifacts with the specified name (default: `bottles`)]' \
'(--clean)--autosquash[Automatically reformat and reword commits in the pull request to our preferred format]' \
'--bintray-mirror[Use the specified Bintray repository to automatically mirror stable URLs defined in the formulae (default: `mirror`)]' \
'--bintray-org[Upload to the specified Bintray organisation (default: `homebrew`)]' \
'(--archive-item)--bintray-org[Upload to the specified Bintray organisation (default: `homebrew`)]' \
'--branch-okay[Do not warn if pulling to a branch besides the repository default (useful for testing)]' \
'(--autosquash)--clean[Do not amend the commits from pull requests]' \
'--debug[Display any debugging information]' \
Expand All @@ -1255,6 +1256,7 @@ _brew_pr_pull() {
# brew pr-upload
_brew_pr_upload() {
_arguments \
'--archive-item[Upload to the specified Internet Archive item (default: `homebrew`)]' \
'--bintray-org[Upload to the specified Bintray organisation (default: `homebrew`)]' \
'--debug[Display any debugging information]' \
'--dry-run[Print what would be done rather than doing it]' \
Expand Down
Loading