create jobs queue for scheduling new builds #3

Merged: teg merged 1 commit into osbuild:master on Sep 25, 2019

Conversation

msehnout
Contributor

No description provided.

@msehnout msehnout force-pushed the job-queue branch 2 times, most recently from cc671b3 to 3d95f1b Compare September 24, 2019 17:41
@msehnout msehnout force-pushed the job-queue branch 3 times, most recently from cd7bdca to 2452bdc Compare September 24, 2019 18:37
@msehnout msehnout requested a review from teg September 24, 2019 18:44
@msehnout msehnout marked this pull request as ready for review September 25, 2019 06:23
@teg
Member

teg commented Sep 25, 2019

Great stuff! Thanks! I'll merge this for now, and work on top of it rather than give more feedback here (usually we'd work together on the PR, but since you'll be away for a while and this is all new code anyway, I figure that is fine).

@teg teg merged commit 0861b80 into osbuild:master Sep 25, 2019
msehnout pushed a commit that referenced this pull request Oct 14, 2020
Fedora 33 ships the new API so let's do the switch now.

But... this would break older Fedoras because they only have the old API,
right?

We have the following options:

1) Ship an xmlrpc compat package to Fedora 33+. This would mean that we delay the API switch until F32 EOL. This would be the most elegant solution, yet it has two issues: a) We will surely not be able to deliver the compat package before the F33 Final Freeze. b) It's extra, annoying work.

2) Downstream patch. No.

3) Use build constraints and have two versions of our code, one for each
   API.

I chose solution #3. It has an issue though:

The %gobuild macro already passes a -tags argument to go build. Therefore the
following line fails because it's not possible to use -tags more than once:

%gobuild -tags kolo_xmlrpc_oldapi ...

So I had to resort to manually tinkering with the build constraints
in the spec file. This is pretty ugly, but I like that:

1) Go code is actually clean, no weird magic is happening there.
2) We can still ship our software to Fedora/RHEL as we used to
   (no downstream patches)
3) All downstreams can use the upstream spec file directly.

Note that this doesn't affect RHEL in any way as it uses vendored libraries.
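
As an illustration of option 3, here is a minimal sketch of how a Go build constraint selects one of two implementations. Only the tag name `kolo_xmlrpc_oldapi` comes from the commit message above; the file name and the `apiVariant` function are hypothetical and exist only for this example. A sibling file (say `xmlrpc_oldapi.go`) would carry the opposite constraint, `kolo_xmlrpc_oldapi`, and define the same function for the old API.

```go
//go:build !kolo_xmlrpc_oldapi
// +build !kolo_xmlrpc_oldapi

// Hypothetical file: xmlrpc_newapi.go.
// It is compiled only when the kolo_xmlrpc_oldapi tag is NOT set, so a
// plain `go build` picks the new-API variant. The `// +build` line is the
// pre-Go-1.17 spelling of the same constraint.
package main

import "fmt"

// apiVariant reports which implementation was compiled in.
// A sibling file guarded by the opposite constraint would return "old".
func apiVariant() string {
	return "new"
}

func main() {
	fmt.Println("compiled against the", apiVariant(), "xmlrpc API")
}
```

Building with `go build -tags kolo_xmlrpc_oldapi` would exclude this file and select the sibling instead, which is why the clash with the `-tags` argument already passed by `%gobuild` matters.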
thozza referenced this pull request in thozza/osbuild-composer Apr 27, 2021
The `cloudbuildResourcesFromBuildLog()` function from the internal GCP
package could panic while parsing the log of a Build job that failed early
and didn't create any Compute Engine resources. The function relied on
the `Regexp.FindStringSubmatch()` method always returning a match
when used on the build log. Accessing an element of a nil slice
would cause a panic in `osbuild-worker`, such as:

Stack trace of thread 185316:
 #0  0x0000564e5393b5e1 runtime.raise (osbuild-worker)
 #1  0x0000564e5391fa1e runtime.sigfwdgo (osbuild-worker)
 #2  0x0000564e5391e354 runtime.sigtrampgo (osbuild-worker)
 #3  0x0000564e5393b953 runtime.sigtramp (osbuild-worker)
 #4  0x00007f37e98e3b20 __restore_rt (libpthread.so.0)
 #5  0x0000564e5393b5e1 runtime.raise (osbuild-worker)
 #6  0x0000564e5391f5ea runtime.crash (osbuild-worker)
 #7  0x0000564e53909306 runtime.fatalpanic (osbuild-worker)
 #8  0x0000564e53908ca1 runtime.gopanic (osbuild-worker)
 #9  0x0000564e53906b65 runtime.goPanicIndex (osbuild-worker)
 #10 0x0000564e5420b36e github.com/osbuild/osbuild-composer/internal/cloud/gcp.cloudbuildResourcesFromBuildLog (osbuild-worker)
 #11 0x0000564e54209ebb github.com/osbuild/osbuild-composer/internal/cloud/gcp.(*GCP).CloudbuildBuildCleanup (osbuild-worker)
 #12 0x0000564e54b05a9b main.(*OSBuildJobImpl).Run (osbuild-worker)
 #13 0x0000564e54b08854 main.main (osbuild-worker)
 #14 0x0000564e5390b722 runtime.main (osbuild-worker)
 #15 0x0000564e53939a11 runtime.goexit (osbuild-worker)

Add a unit test testing this scenario.

Make the `cloudbuildResourcesFromBuildLog()` function more robust and
not blindly expect to find matches in the build log. As a result the
`cloudbuildBuildResources` struct instance returned from the function
may be empty. Subsequently make sure that the `CloudbuildBuildCleanup()`
method handles an empty `cloudbuildBuildResources` instance correctly.
Specifically the `storageCacheDir.bucket` may be an empty string and
thus won't exist. Ensure that this does not result in an infinite loop by
checking for `storage.ErrBucketNotExist` while iterating the bucket
objects.

Signed-off-by: Tomas Hozza <[email protected]>
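
The failure mode described in this commit message can be reproduced in a few lines. The sketch below is not the code from the commit: the pattern `diskRe`, the helper `firstSubmatch`, and the sample log lines are all hypothetical. It only shows the general guard, which is to check the length of the slice returned by `Regexp.FindStringSubmatch()` before indexing it, because the method returns nil when there is no match.

```go
package main

import (
	"fmt"
	"regexp"
)

// diskRe is a hypothetical pattern used only for this example; the real
// patterns applied to the Cloud Build log are not shown in the commit message.
var diskRe = regexp.MustCompile(`Created disk \[([a-z0-9-]+)\]`)

// firstSubmatch returns the first capture group of re in log and reports
// whether the pattern matched. FindStringSubmatch returns nil on no match,
// so indexing its result without this length check panics with an
// index-out-of-range error.
func firstSubmatch(re *regexp.Regexp, log string) (string, bool) {
	m := re.FindStringSubmatch(log)
	if len(m) < 2 {
		return "", false
	}
	return m[1], true
}

func main() {
	logs := []string{
		"step 1: Created disk [disk-abc123]",
		"ERROR: build failed before any resource was created",
	}
	for _, log := range logs {
		if name, ok := firstSubmatch(diskRe, log); ok {
			fmt.Println("found disk:", name)
		} else {
			fmt.Println("no disk resource in log")
		}
	}
}
```

Returning an explicit `ok` flag lets the caller end up with an empty result for a build that failed early, which matches the commit's note that the returned struct may be empty and that the cleanup code has to handle that case.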
thozza added a commit that referenced this pull request Apr 29, 2021