-
-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pr-upload: Upload bottles to the Internet Archive #10677
Changes from 6 commits
b0bd15f
2794641
c11692b
7fd6d21
3d715ca
0f2c47d
641f134
be11a22
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,185 @@ | ||||||
# typed: false | ||||||
# frozen_string_literal: true | ||||||
|
||||||
require "digest/md5" | ||||||
require "utils/curl" | ||||||
|
||||||
# Archive API client. | ||||||
# | ||||||
# @api private | ||||||
class Archive | ||||||
extend T::Sig | ||||||
|
||||||
include Context | ||||||
include Utils::Curl | ||||||
|
||||||
class Error < RuntimeError | ||||||
end | ||||||
|
||||||
sig { returns(String) } | ||||||
def inspect | ||||||
"#<Archive: item=#{@archive_item}>" | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would be good to get some tests for this class. |
||||||
end | ||||||
|
||||||
sig { params(item: T.nilable(String)).void } | ||||||
def initialize(item: "homebrew") | ||||||
@archive_item = item | ||||||
|
||||||
raise UsageError, "Must set the Archive item!" unless @archive_item | ||||||
|
||||||
ENV["HOMEBREW_FORCE_HOMEBREW_ON_LINUX"] = "1" if @archive_item == "homebrew" && !OS.mac? | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you add a comment explaining this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This line is here on the foundational software engineering principles of cargo cult engineering and Chesterton's Fence, cribbed from brew/Library/Homebrew/bintray.rb Line 32 in 3efab3e
I'm not sure of its purpose, so I sadly cannot comment it. I'm happy deleting this line, and we can add it back if we find that we need it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I were to guess, it was a hack for uploading and verifying (more likely the verifying step) macOS bottles running on a Linux CI runner. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've deleted this line. We can crib it from |
||||||
end | ||||||
|
||||||
def open_api(url, *args, auth: true) | ||||||
if auth | ||||||
raise UsageError, "HOMEBREW_ARCHIVE_KEY is unset." unless (key = Homebrew::EnvConfig.archive_key) | ||||||
|
||||||
if key.exclude?(":") | ||||||
raise UsageError, | ||||||
"Use HOMEBREW_ARCHIVE_KEY=access:secret. See https://archive.org/account/s3.php" | ||||||
end | ||||||
|
||||||
args += ["--header", "Authorization: AWS #{key}"] | ||||||
end | ||||||
|
||||||
curl(*args, url, print_stdout: false, secrets: key) | ||||||
end | ||||||
|
||||||
sig { | ||||||
params(local_file: String, | ||||||
directory: String, | ||||||
remote_file: String, | ||||||
warn_on_error: T.nilable(T::Boolean)).void | ||||||
} | ||||||
def upload(local_file, directory:, remote_file:, warn_on_error: false) | ||||||
unless File.exist? local_file | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Make |
||||||
msg = "#{local_file} for upload doesn't exist!" | ||||||
raise Error, msg unless warn_on_error | ||||||
|
||||||
# Warn and return early here since we know this upload is going to fail. | ||||||
opoo msg | ||||||
return | ||||||
end | ||||||
|
||||||
md5_base64 = Digest::MD5.base64digest(File.read(local_file)) | ||||||
url = "https://#{@archive_item}.s3.us.archive.org/#{directory}/#{remote_file}" | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would be good to make |
||||||
args = ["--upload-file", local_file, "--header", "Content-MD5: #{md5_base64}"] | ||||||
args << "--fail" unless warn_on_error | ||||||
result = T.unsafe(self).open_api(url, *args) | ||||||
return if result.success? && result.stdout.exclude?("Error") | ||||||
|
||||||
msg = "Bottle upload failed: #{result.stdout}" | ||||||
raise msg unless warn_on_error | ||||||
|
||||||
opoo msg | ||||||
end | ||||||
|
||||||
sig { params(url: String).returns(T::Boolean) } | ||||||
def stable_mirrored?(url) | ||||||
headers, = curl_output("--connect-timeout", "15", "--location", "--head", url) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why 15? Would be good to have a comment or shared constant if this is copied from elsewhere. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤷♂️ Cribbed from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Deleted this method |
||||||
status_code = headers.scan(%r{^HTTP/.* (\d+)}).last.first | ||||||
status_code.start_with?("2") | ||||||
end | ||||||
|
||||||
sig { | ||||||
params(formula: Formula, | ||||||
directory: String, | ||||||
warn_on_error: T::Boolean).returns(String) | ||||||
} | ||||||
def mirror_formula(formula, directory: "mirror", warn_on_error: false) | ||||||
formula.downloader.fetch | ||||||
|
||||||
filename = ERB::Util.url_encode(formula.downloader.basename) | ||||||
destination_url = "https://archive.org/download/#{@archive_item}/#{directory}/#{filename}" | ||||||
|
||||||
odebug "Uploading to #{destination_url}" | ||||||
|
||||||
upload( | ||||||
formula.downloader.cached_location, | ||||||
directory: directory, | ||||||
remote_file: filename, | ||||||
warn_on_error: warn_on_error, | ||||||
) | ||||||
|
||||||
destination_url | ||||||
end | ||||||
|
||||||
# Gets the MD5 hash of the specified remote file. | ||||||
# | ||||||
# @return the hash, the empty string (if the file doesn't have a hash), nil (if the file doesn't exist) | ||||||
sig { params(directory: String, remote_file: String).returns(T.nilable(String)) } | ||||||
def remote_md5(directory:, remote_file:) | ||||||
url = "https://#{@archive_item}.s3.us.archive.org/#{directory}/#{remote_file}" | ||||||
result = curl_output "--fail", "--silent", "--head", "--location", url | ||||||
if result.success? | ||||||
result.stdout.match(/^ETag: "(\h{32})"/)&.values_at(1)&.first || "" | ||||||
else | ||||||
raise Error if result.status.exitstatus != 22 && result.stderr.exclude?("404 Not Found") | ||||||
|
||||||
nil | ||||||
end | ||||||
end | ||||||
|
||||||
sig { params(directory: String, filename: String).returns(String) } | ||||||
def file_delete_instructions(directory, filename) | ||||||
<<~EOS | ||||||
Run: | ||||||
curl -X DELETE -H "Authorization: AWS $HOMEBREW_ARCHIVE_KEY" https://#{@archive_item}.s3.us.archive.org/#{directory}/#{filename} | ||||||
Or run: | ||||||
ia delete #{@archive_item} #{directory}/#{filename} | ||||||
EOS | ||||||
end | ||||||
|
||||||
sig { | ||||||
params(bottles_hash: T::Hash[String, T.untyped], | ||||||
warn_on_error: T.nilable(T::Boolean)).void | ||||||
} | ||||||
def upload_bottles(bottles_hash, warn_on_error: false) | ||||||
bottles_hash.each do |_formula_name, bottle_hash| | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
directory = bottle_hash["bintray"]["repository"] | ||||||
bottle_count = bottle_hash["bottle"]["tags"].length | ||||||
|
||||||
bottle_hash["bottle"]["tags"].each do |_tag, tag_hash| | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice catch. Rubocop should have an audit for this case. |
||||||
filename = tag_hash["filename"] # URL encoded in Bottle::Filename#archive | ||||||
delete_instructions = file_delete_instructions(directory, filename) | ||||||
|
||||||
local_filename = tag_hash["local_filename"] | ||||||
md5 = Digest::MD5.hexdigest(File.read(local_filename)) | ||||||
|
||||||
odebug "Checking remote file #{@archive_item}/#{directory}/#{filename}" | ||||||
result = remote_md5(directory: directory, remote_file: filename) | ||||||
case result | ||||||
when nil | ||||||
# File doesn't exist. | ||||||
odebug "Uploading #{@archive_item}/#{directory}/#{filename}" | ||||||
upload(local_filename, | ||||||
directory: directory, | ||||||
remote_file: filename, | ||||||
warn_on_error: warn_on_error) | ||||||
when md5 | ||||||
# File exists, hash matches. | ||||||
odebug "#{filename} is already published with matching hash." | ||||||
bottle_count -= 1 | ||||||
when "" | ||||||
# File exists, but can't find hash | ||||||
failed_message = "#{filename} is already published!" | ||||||
raise Error, "#{failed_message}\n#{delete_instructions}" unless warn_on_error | ||||||
|
||||||
opoo failed_message | ||||||
else | ||||||
# File exists, but hash either doesn't exist or is mismatched. | ||||||
failed_message = <<~EOS | ||||||
#{filename} is already published with a mismatched hash! | ||||||
Expected: #{md5} | ||||||
Actual: #{result} | ||||||
EOS | ||||||
raise Error, "#{failed_message}#{delete_instructions}" unless warn_on_error | ||||||
|
||||||
opoo failed_message | ||||||
end | ||||||
end | ||||||
|
||||||
odebug "Uploaded #{bottle_count} bottles" | ||||||
end | ||||||
end | ||||||
end |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -49,6 +49,8 @@ def pr_pull_args | |||||
description: "Message to include when autosquashing revision bumps, deletions, and rebuilds." | ||||||
flag "--artifact=", | ||||||
description: "Download artifacts with the specified name (default: `bottles`)." | ||||||
flag "--archive-item=", | ||||||
description: "Upload to the specified Archive item (default: `homebrew`)." | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
flag "--bintray-org=", | ||||||
description: "Upload to the specified Bintray organisation (default: `homebrew`)." | ||||||
flag "--tap=", | ||||||
|
@@ -65,6 +67,7 @@ def pr_pull_args | |||||
description: "Comma-separated list of workflows which can be ignored if they have not been run." | ||||||
|
||||||
conflicts "--clean", "--autosquash" | ||||||
conflicts "--archive-item", "--bintray-org" | ||||||
|
||||||
named_args :pull_request, min: 1 | ||||||
end | ||||||
|
@@ -357,6 +360,7 @@ def pr_pull | |||||
|
||||||
workflows = args.workflows.presence || ["tests.yml"] | ||||||
artifact = args.artifact || "bottles" | ||||||
archive_item = args.archive_item | ||||||
bintray_org = args.bintray_org || "homebrew" | ||||||
mirror_repo = args.bintray_mirror || "mirror" | ||||||
tap = Tap.fetch(args.tap || CoreTap.instance.name) | ||||||
|
@@ -424,7 +428,11 @@ def pr_pull | |||||
upload_args << "--keep-old" if args.keep_old? | ||||||
upload_args << "--warn-on-upload-failure" if args.warn_on_upload_failure? | ||||||
upload_args << "--root-url=#{args.root_url}" if args.root_url | ||||||
upload_args << "--bintray-org=#{bintray_org}" | ||||||
upload_args << if archive_item.present? | ||||||
"--archive-item=#{archive_item}" | ||||||
else | ||||||
"--bintray-org=#{bintray_org}" | ||||||
end | ||||||
safe_system HOMEBREW_BREW_FILE, *upload_args | ||||||
end | ||||||
end | ||||||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -2,6 +2,7 @@ | |||||
# frozen_string_literal: true | ||||||
|
||||||
require "cli/parser" | ||||||
require "archive" | ||||||
require "bintray" | ||||||
|
||||||
module Homebrew | ||||||
|
@@ -27,6 +28,8 @@ def pr_upload_args | |||||
switch "--warn-on-upload-failure", | ||||||
description: "Warn instead of raising an error if the bottle upload fails. "\ | ||||||
"Useful for repairing bottle uploads that previously failed." | ||||||
flag "--archive-item=", | ||||||
description: "Upload to the specified Archive item (default: `homebrew`)." | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
flag "--bintray-org=", | ||||||
description: "Upload to the specified Bintray organisation (default: `homebrew`)." | ||||||
flag "--root-url=", | ||||||
|
@@ -47,6 +50,18 @@ def check_bottled_formulae(bottles_hash) | |||||
end | ||||||
end | ||||||
|
||||||
def archive?(bottles_hash) | ||||||
@archive ||= bottles_hash.values.all? do |bottle_hash| | ||||||
bottle_hash["bottle"]["root_url"].start_with? "https://archive.com/" | ||||||
end | ||||||
end | ||||||
|
||||||
def bintray?(bottles_hash) | ||||||
@bintray ||= bottles_hash.values.all? do |bottle_hash| | ||||||
bottle_hash["bottle"]["root_url"].match? %r{^https://[\w-]+\.bintray\.com/} | ||||||
end | ||||||
end | ||||||
|
||||||
def github_releases?(bottles_hash) | ||||||
@github_releases ||= bottles_hash.values.all? do |bottle_hash| | ||||||
root_url = bottle_hash["bottle"]["root_url"] | ||||||
|
@@ -76,11 +91,16 @@ def pr_upload | |||||
bottle_args += json_files | ||||||
|
||||||
if args.dry_run? | ||||||
service = if github_releases?(bottles_hash) | ||||||
"GitHub Releases" | ||||||
else | ||||||
"Bintray" | ||||||
end | ||||||
service = | ||||||
if archive?(bottles_hash) | ||||||
"Archive.org" | ||||||
elsif bintray?(bottles_hash) | ||||||
"Bintray" | ||||||
elsif github_releases?(bottles_hash) | ||||||
"GitHub Releases" | ||||||
else | ||||||
odie "Service specified by root_url is not recognized" | ||||||
end | ||||||
puts <<~EOS | ||||||
brew #{bottle_args.join " "} | ||||||
Upload bottles described by these JSON files to #{service}: | ||||||
|
@@ -102,7 +122,20 @@ def pr_upload | |||||
safe_system HOMEBREW_BREW_FILE, *audit_args | ||||||
end | ||||||
|
||||||
if github_releases?(bottles_hash) | ||||||
if archive?(bottles_hash) | ||||||
# Handle uploading to Archive.org. | ||||||
archive_item = args.archive_item || "homebrew" | ||||||
archive = Archive.new(item: archive_item) | ||||||
archive.upload_bottles(bottles_hash, | ||||||
warn_on_error: args.warn_on_upload_failure?) | ||||||
elsif bintray?(bottles_hash) | ||||||
# Handle uploading to Bintray. | ||||||
bintray_org = args.bintray_org || "homebrew" | ||||||
bintray = Bintray.new(org: bintray_org) | ||||||
bintray.upload_bottles(bottles_hash, | ||||||
publish_package: !args.no_publish?, | ||||||
warn_on_error: args.warn_on_upload_failure?) | ||||||
elsif github_releases?(bottles_hash) | ||||||
# Handle uploading to GitHub Releases. | ||||||
bottles_hash.each_value do |bottle_hash| | ||||||
root_url = bottle_hash["bottle"]["root_url"] | ||||||
|
@@ -128,12 +161,7 @@ def pr_upload | |||||
end | ||||||
end | ||||||
else | ||||||
# Handle uploading to Bintray. | ||||||
bintray_org = args.bintray_org || "homebrew" | ||||||
bintray = Bintray.new(org: bintray_org) | ||||||
bintray.upload_bottles(bottles_hash, | ||||||
publish_package: !args.no_publish?, | ||||||
warn_on_error: args.warn_on_upload_failure?) | ||||||
odie "Service specified by root_url is not recognized" | ||||||
end | ||||||
end | ||||||
end |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -15,6 +15,10 @@ module EnvConfig | |||||
description: "Linux only: Pass this value to a type name representing the compiler's `-march` option.", | ||||||
default: "native", | ||||||
}, | ||||||
HOMEBREW_ARCHIVE_KEY: { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
description: "Use this API key when accessing the Archive.org API (where bottles are stored). " \ | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
"The format is access:secret. See https://archive.org/account/s3.php", | ||||||
}, | ||||||
HOMEBREW_ARTIFACT_DOMAIN: { | ||||||
description: "Prefix all download URLs, including those for bottles, with this value. " \ | ||||||
"For example, `HOMEBREW_ARTIFACT_DOMAIN=http://localhost:8080` will cause a " \ | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# typed: false | ||
# frozen_string_literal: true | ||
|
||
require "archive" | ||
|
||
describe Archive, :needs_network do | ||
subject(:archive) { described_class.new(item: "homebrew") } | ||
|
||
describe "::remote_checksum" do | ||
it "detects a published file" do | ||
hash = archive.remote_md5(directory: ".", remote_file: "cmake-3.1.2.yosemite.bottle.tar.gz") | ||
expect(hash).to eq("c6e525d472124670b0b635800488f438") | ||
end | ||
|
||
it "fails on a non-existent file" do | ||
hash = archive.remote_md5(directory: "bottles", remote_file: "my-fake-bottle-1.0.snow_hyena.tar.gz") | ||
expect(hash).to be nil | ||
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.