Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pr-upload: Upload bottles to the Internet Archive #10677

Merged
merged 8 commits into from
Feb 24, 2021
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
185 changes: 185 additions & 0 deletions Library/Homebrew/archive.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,185 @@
# typed: false
# frozen_string_literal: true

require "digest/md5"
require "utils/curl"

# Archive API client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Archive API client.
# The Internet Archive API client.

#
# @api private
class Archive
extend T::Sig

include Context
include Utils::Curl

class Error < RuntimeError
end

sig { returns(String) }
def inspect
"#<Archive: item=#{@archive_item}>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to get some tests for this class.

end

sig { params(item: T.nilable(String)).void }
def initialize(item: "homebrew")
@archive_item = item

raise UsageError, "Must set the Archive item!" unless @archive_item

ENV["HOMEBREW_FORCE_HOMEBREW_ON_LINUX"] = "1" if @archive_item == "homebrew" && !OS.mac?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is here on the foundational software engineering principles of cargo cult engineering and Chesterton's Fence, cribbed from

ENV["HOMEBREW_FORCE_HOMEBREW_ON_LINUX"] = "1" if @bintray_org == "homebrew" && !OS.mac?

I'm not sure of its purpose, so I sadly cannot comment it. I'm happy deleting this line, and we can add it back if we find that we need it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were to guess, it was a hack for uploading and verifying (more likely the verifying step) macOS bottles running on a Linux CI runner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've deleted this line. We can crib it from bintray.rb if needed.

end

def open_api(url, *args, auth: true)
if auth
raise UsageError, "HOMEBREW_ARCHIVE_KEY is unset." unless (key = Homebrew::EnvConfig.archive_key)

if key.exclude?(":")
raise UsageError,
"Use HOMEBREW_ARCHIVE_KEY=access:secret. See https://archive.org/account/s3.php"
end

args += ["--header", "Authorization: AWS #{key}"]
end

curl(*args, url, print_stdout: false, secrets: key)
end

sig {
params(local_file: String,
directory: String,
remote_file: String,
warn_on_error: T.nilable(T::Boolean)).void
}
def upload(local_file, directory:, remote_file:, warn_on_error: false)
unless File.exist? local_file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make local_file a Pathname and then you can use local_file.exist? here and local_file.read below.

msg = "#{local_file} for upload doesn't exist!"
raise Error, msg unless warn_on_error

# Warn and return early here since we know this upload is going to fail.
opoo msg
return
end

md5_base64 = Digest::MD5.base64digest(File.read(local_file))
url = "https://#{@archive_item}.s3.us.archive.org/#{directory}/#{remote_file}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to make s3.us.archive.org a constant somewhere given it's used in a few places.

args = ["--upload-file", local_file, "--header", "Content-MD5: #{md5_base64}"]
args << "--fail" unless warn_on_error
result = T.unsafe(self).open_api(url, *args)
return if result.success? && result.stdout.exclude?("Error")

msg = "Bottle upload failed: #{result.stdout}"
raise msg unless warn_on_error

opoo msg
end

sig { params(url: String).returns(T::Boolean) }
def stable_mirrored?(url)
headers, = curl_output("--connect-timeout", "15", "--location", "--head", url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 15? Would be good to have a comment or shared constant if this is copied from elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♂️ Cribbed from bintray.rb

Copy link
Contributor Author

@sjackman sjackman Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted this method stable_mirrored?

status_code = headers.scan(%r{^HTTP/.* (\d+)}).last.first
status_code.start_with?("2")
end

sig {
params(formula: Formula,
directory: String,
warn_on_error: T::Boolean).returns(String)
}
def mirror_formula(formula, directory: "mirror", warn_on_error: false)
formula.downloader.fetch

filename = ERB::Util.url_encode(formula.downloader.basename)
destination_url = "https://archive.org/download/#{@archive_item}/#{directory}/#{filename}"

odebug "Uploading to #{destination_url}"

upload(
formula.downloader.cached_location,
directory: directory,
remote_file: filename,
warn_on_error: warn_on_error,
)

destination_url
end

# Gets the MD5 hash of the specified remote file.
#
# @return the hash, the empty string (if the file doesn't have a hash), nil (if the file doesn't exist)
sig { params(directory: String, remote_file: String).returns(T.nilable(String)) }
def remote_md5(directory:, remote_file:)
url = "https://#{@archive_item}.s3.us.archive.org/#{directory}/#{remote_file}"
result = curl_output "--fail", "--silent", "--head", "--location", url
if result.success?
result.stdout.match(/^ETag: "(\h{32})"/)&.values_at(1)&.first || ""
else
raise Error if result.status.exitstatus != 22 && result.stderr.exclude?("404 Not Found")

nil
end
end

sig { params(directory: String, filename: String).returns(String) }
def file_delete_instructions(directory, filename)
<<~EOS
Run:
curl -X DELETE -H "Authorization: AWS $HOMEBREW_ARCHIVE_KEY" https://#{@archive_item}.s3.us.archive.org/#{directory}/#{filename}
Or run:
ia delete #{@archive_item} #{directory}/#{filename}
EOS
end

sig {
params(bottles_hash: T::Hash[String, T.untyped],
warn_on_error: T.nilable(T::Boolean)).void
}
def upload_bottles(bottles_hash, warn_on_error: false)
bottles_hash.each do |_formula_name, bottle_hash|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bottles_hash.each do |_formula_name, bottle_hash|
bottles_hash.each_value do |bottle_hash|

directory = bottle_hash["bintray"]["repository"]
bottle_count = bottle_hash["bottle"]["tags"].length

bottle_hash["bottle"]["tags"].each do |_tag, tag_hash|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bottle_hash["bottle"]["tags"].each do |_tag, tag_hash|
bottle_hash["bottle"]["tags"].each_value do |tag_hash|

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. Rubocop should have an audit for this case.

filename = tag_hash["filename"] # URL encoded in Bottle::Filename#archive
delete_instructions = file_delete_instructions(directory, filename)

local_filename = tag_hash["local_filename"]
md5 = Digest::MD5.hexdigest(File.read(local_filename))

odebug "Checking remote file #{@archive_item}/#{directory}/#{filename}"
result = remote_md5(directory: directory, remote_file: filename)
case result
when nil
# File doesn't exist.
odebug "Uploading #{@archive_item}/#{directory}/#{filename}"
upload(local_filename,
directory: directory,
remote_file: filename,
warn_on_error: warn_on_error)
when md5
# File exists, hash matches.
odebug "#{filename} is already published with matching hash."
bottle_count -= 1
when ""
# File exists, but can't find hash
failed_message = "#{filename} is already published!"
raise Error, "#{failed_message}\n#{delete_instructions}" unless warn_on_error

opoo failed_message
else
# File exists, but hash either doesn't exist or is mismatched.
failed_message = <<~EOS
#{filename} is already published with a mismatched hash!
Expected: #{md5}
Actual: #{result}
EOS
raise Error, "#{failed_message}#{delete_instructions}" unless warn_on_error

opoo failed_message
end
end

odebug "Uploaded #{bottle_count} bottles"
end
end
end
10 changes: 9 additions & 1 deletion Library/Homebrew/dev-cmd/pr-pull.rb
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,8 @@ def pr_pull_args
description: "Message to include when autosquashing revision bumps, deletions, and rebuilds."
flag "--artifact=",
description: "Download artifacts with the specified name (default: `bottles`)."
flag "--archive-item=",
description: "Upload to the specified Archive item (default: `homebrew`)."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: "Upload to the specified Archive item (default: `homebrew`)."
description: "Upload to the specified Internet Archive item (default: `homebrew`)."

flag "--bintray-org=",
description: "Upload to the specified Bintray organisation (default: `homebrew`)."
flag "--tap=",
Expand All @@ -65,6 +67,7 @@ def pr_pull_args
description: "Comma-separated list of workflows which can be ignored if they have not been run."

conflicts "--clean", "--autosquash"
conflicts "--archive-item", "--bintray-org"

named_args :pull_request, min: 1
end
Expand Down Expand Up @@ -357,6 +360,7 @@ def pr_pull

workflows = args.workflows.presence || ["tests.yml"]
artifact = args.artifact || "bottles"
archive_item = args.archive_item
bintray_org = args.bintray_org || "homebrew"
mirror_repo = args.bintray_mirror || "mirror"
tap = Tap.fetch(args.tap || CoreTap.instance.name)
Expand Down Expand Up @@ -424,7 +428,11 @@ def pr_pull
upload_args << "--keep-old" if args.keep_old?
upload_args << "--warn-on-upload-failure" if args.warn_on_upload_failure?
upload_args << "--root-url=#{args.root_url}" if args.root_url
upload_args << "--bintray-org=#{bintray_org}"
upload_args << if archive_item.present?
"--archive-item=#{archive_item}"
else
"--bintray-org=#{bintray_org}"
end
safe_system HOMEBREW_BREW_FILE, *upload_args
end
end
Expand Down
52 changes: 40 additions & 12 deletions Library/Homebrew/dev-cmd/pr-upload.rb
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
# frozen_string_literal: true

require "cli/parser"
require "archive"
require "bintray"

module Homebrew
Expand All @@ -27,6 +28,8 @@ def pr_upload_args
switch "--warn-on-upload-failure",
description: "Warn instead of raising an error if the bottle upload fails. "\
"Useful for repairing bottle uploads that previously failed."
flag "--archive-item=",
description: "Upload to the specified Archive item (default: `homebrew`)."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: "Upload to the specified Archive item (default: `homebrew`)."
description: "Upload to the specified Internet Archive item (default: `homebrew`)."

flag "--bintray-org=",
description: "Upload to the specified Bintray organisation (default: `homebrew`)."
flag "--root-url=",
Expand All @@ -47,6 +50,18 @@ def check_bottled_formulae(bottles_hash)
end
end

def archive?(bottles_hash)
@archive ||= bottles_hash.values.all? do |bottle_hash|
bottle_hash["bottle"]["root_url"].start_with? "https://archive.com/"
end
end

def bintray?(bottles_hash)
@bintray ||= bottles_hash.values.all? do |bottle_hash|
bottle_hash["bottle"]["root_url"].match? %r{^https://[\w-]+\.bintray\.com/}
end
end

def github_releases?(bottles_hash)
@github_releases ||= bottles_hash.values.all? do |bottle_hash|
root_url = bottle_hash["bottle"]["root_url"]
Expand Down Expand Up @@ -76,11 +91,16 @@ def pr_upload
bottle_args += json_files

if args.dry_run?
service = if github_releases?(bottles_hash)
"GitHub Releases"
else
"Bintray"
end
service =
if archive?(bottles_hash)
"Archive.org"
elsif bintray?(bottles_hash)
"Bintray"
elsif github_releases?(bottles_hash)
"GitHub Releases"
else
odie "Service specified by root_url is not recognized"
end
puts <<~EOS
brew #{bottle_args.join " "}
Upload bottles described by these JSON files to #{service}:
Expand All @@ -102,7 +122,20 @@ def pr_upload
safe_system HOMEBREW_BREW_FILE, *audit_args
end

if github_releases?(bottles_hash)
if archive?(bottles_hash)
# Handle uploading to Archive.org.
archive_item = args.archive_item || "homebrew"
archive = Archive.new(item: archive_item)
archive.upload_bottles(bottles_hash,
warn_on_error: args.warn_on_upload_failure?)
elsif bintray?(bottles_hash)
# Handle uploading to Bintray.
bintray_org = args.bintray_org || "homebrew"
bintray = Bintray.new(org: bintray_org)
bintray.upload_bottles(bottles_hash,
publish_package: !args.no_publish?,
warn_on_error: args.warn_on_upload_failure?)
elsif github_releases?(bottles_hash)
# Handle uploading to GitHub Releases.
bottles_hash.each_value do |bottle_hash|
root_url = bottle_hash["bottle"]["root_url"]
Expand All @@ -128,12 +161,7 @@ def pr_upload
end
end
else
# Handle uploading to Bintray.
bintray_org = args.bintray_org || "homebrew"
bintray = Bintray.new(org: bintray_org)
bintray.upload_bottles(bottles_hash,
publish_package: !args.no_publish?,
warn_on_error: args.warn_on_upload_failure?)
odie "Service specified by root_url is not recognized"
end
end
end
4 changes: 4 additions & 0 deletions Library/Homebrew/env_config.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,10 @@ module EnvConfig
description: "Linux only: Pass this value to a type name representing the compiler's `-march` option.",
default: "native",
},
HOMEBREW_ARCHIVE_KEY: {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
HOMEBREW_ARCHIVE_KEY: {
HOMEBREW_INTERNET_ARCHIVE_KEY: {

description: "Use this API key when accessing the Archive.org API (where bottles are stored). " \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: "Use this API key when accessing the Archive.org API (where bottles are stored). " \
description: "Use this API key when accessing the Internet Archive API (where bottles are stored). " \

"The format is access:secret. See https://archive.org/account/s3.php",
},
HOMEBREW_ARTIFACT_DOMAIN: {
description: "Prefix all download URLs, including those for bottles, with this value. " \
"For example, `HOMEBREW_ARTIFACT_DOMAIN=http://localhost:8080` will cause a " \
Expand Down
20 changes: 20 additions & 0 deletions Library/Homebrew/test/archive_spec.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# typed: false
# frozen_string_literal: true

require "archive"

describe Archive, :needs_network do
subject(:archive) { described_class.new(item: "homebrew") }

describe "::remote_checksum" do
it "detects a published file" do
hash = archive.remote_md5(directory: ".", remote_file: "cmake-3.1.2.yosemite.bottle.tar.gz")
expect(hash).to eq("c6e525d472124670b0b635800488f438")
end

it "fails on a non-existent file" do
hash = archive.remote_md5(directory: "bottles", remote_file: "my-fake-bottle-1.0.snow_hyena.tar.gz")
expect(hash).to be nil
end
end
end
2 changes: 2 additions & 0 deletions completions/bash/brew
Original file line number Diff line number Diff line change
Expand Up @@ -1475,6 +1475,7 @@ _brew_pr_pull() {
case "$cur" in
-*)
__brewcomp "
--archive-item
--artifact
--autosquash
--bintray-mirror
Expand Down Expand Up @@ -1507,6 +1508,7 @@ _brew_pr_upload() {
case "$cur" in
-*)
__brewcomp "
--archive-item
--bintray-org
--debug
--dry-run
Expand Down
2 changes: 2 additions & 0 deletions completions/fish/brew.fish
Original file line number Diff line number Diff line change
Expand Up @@ -1053,6 +1053,7 @@ __fish_brew_complete_arg 'pr-publish' -l workflow -d 'Target workflow filename (


__fish_brew_complete_cmd 'pr-pull' 'Download and publish bottles, and apply the bottle commit from a pull request with artifacts generated by GitHub Actions'
__fish_brew_complete_arg 'pr-pull' -l archive-item -d 'Upload to the specified Archive item (default: `homebrew`)'
__fish_brew_complete_arg 'pr-pull' -l artifact -d 'Download artifacts with the specified name (default: `bottles`)'
__fish_brew_complete_arg 'pr-pull' -l autosquash -d 'Automatically reformat and reword commits in the pull request to our preferred format'
__fish_brew_complete_arg 'pr-pull' -l bintray-mirror -d 'Use the specified Bintray repository to automatically mirror stable URLs defined in the formulae (default: `mirror`)'
Expand All @@ -1077,6 +1078,7 @@ __fish_brew_complete_arg 'pr-pull' -l workflows -d 'Retrieve artifacts from the


__fish_brew_complete_cmd 'pr-upload' 'Apply the bottle commit and publish bottles to Bintray or GitHub Releases'
__fish_brew_complete_arg 'pr-upload' -l archive-item -d 'Upload to the specified Archive item (default: `homebrew`)'
__fish_brew_complete_arg 'pr-upload' -l bintray-org -d 'Upload to the specified Bintray organisation (default: `homebrew`)'
__fish_brew_complete_arg 'pr-upload' -l debug -d 'Display any debugging information'
__fish_brew_complete_arg 'pr-upload' -l dry-run -d 'Print what would be done rather than doing it'
Expand Down
Loading