sprint production release for 2021.09.27 #1961

Closed
harshad16 opened this issue Sep 15, 2021 · 10 comments
Labels
area/release-eng, kind/documentation, lifecycle/active, triage/accepted
Milestone

Comments

@harshad16
Member

Hello, Thoth-station!

This issue will be used for the current sprint cycle's production release.
By the end of the sprint cycle, we will consolidate information about feature upgrades and fixes in thoth-station components in this issue.

@harshad16 added the sig/devops and lifecycle/active labels Sep 15, 2021
@sesheta added the needs-triage label Sep 15, 2021
@harshad16
Member Author

/kind documentation
/area release-eng
/triage accepted
/milestone 2021.09.27

@sesheta added this to the 2021.09.27 milestone Sep 15, 2021
@sesheta added the kind/documentation, area/release-eng, and triage/accepted labels and removed the needs-triage label Sep 15, 2021
@fridex
Contributor

fridex commented Sep 15, 2021

Optimizations in prescriptions loading in a deployment

As our open database about Python projects (thoth-station/prescriptions) grew, we observed a large overhead in handling it in raw form (YAML files) in a deployment. We currently have more than 50k YAML files totalling more than 190 MiB, all of which were copied into the recommendation engine's deployment so that they could be used. Moreover, the YAML text format added overhead when parsing the files and creating resolution pipeline units out of them. With recent changes, we pre-load all the YAML files on each adviser component release and construct a binary file (pickle) that directly holds Python objects. The cloud-based resolver loads this binary file directly into memory on each request. This significantly speeds up prescriptions loading in a deployment and allows the open database of observations about the Python ecosystem to grow even more. Originally, the whole prescription overhead (loading, parsing, initializing) was roughly 1 minute (plus container pull); now it is roughly 0.01 second.

Related: thoth-station/prescriptions#50
Related: thoth-station/adviser#2085
Related: #1950
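
The build-once, load-per-request idea described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the adviser's actual code: the function names `build_bundle` and `load_bundle`, the file name, and the shape of the prescription dictionaries are all assumptions made for the example. YAML parsing is assumed to have already happened at release time.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def build_bundle(prescriptions: List[Dict[str, Any]], bundle_path: Path) -> None:
    """Release time: serialize already-parsed prescriptions into one binary file."""
    with bundle_path.open("wb") as f:
        pickle.dump(prescriptions, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_bundle(bundle_path: Path) -> List[Dict[str, Any]]:
    """Request time: restore the Python objects without re-parsing any text.

    pickle can execute arbitrary code on load, so this must only ever be
    pointed at a file produced by a trusted build step.
    """
    with bundle_path.open("rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Hypothetical stand-in for prescriptions parsed from YAML files.
    parsed = [{"name": "ExampleUnit", "type": "wrap", "package": "flask"}]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "prescriptions.pickle"
        build_bundle(parsed, path)
        print(load_bundle(path) == parsed)
```

The trade-off in a design like this is that the bundle has to be rebuilt whenever the prescriptions change, in exchange for skipping file-by-file parsing on every request.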

@fridex
Contributor

fridex commented Sep 15, 2021

CVE update job rewritten to OSV 0.8 format

PyPA upstream has rewritten the advisory-db to conform to the OSV 0.8 format. We adopted this format by updating the logic of our cve-update-job.

Related: thoth-station/cve-update-job#424
Related: pypa/advisory-database@7872b0a

@fridex
Contributor

fridex commented Sep 15, 2021

Open Source Security Foundation - Security Scorecard

Starting with this release, we provide information derived from OpenSSF Security Scorecards, as published by the Open Source Security Foundation. To support this, prescriptions-refresh-job queries the scorecards available in BigQuery and constructs prescriptions from them, which are committed directly to the thoth-station/prescriptions repository, our open database about Python open-source projects. As scorecards are specific to repositories, they are automatically mapped to Python packages under the hood, based on Thoth's knowledge. The prescriptions refresh job automatically updates prescriptions as scorecards get updated. See the relevant prescriptions for flask or tensorflow for examples.

@pacospace
Contributor

Thoth tutorial: create and use overlays for Elyra AI Pipelines steps.

This tutorial shows the concept of overlays, how they are applied to software stacks, what overlay builds are, and how to use the built images in AI Pipelines.

Related:

@fridex
Contributor

fridex commented Sep 16, 2021

Warning produced if users use forked projects on GitHub

As part of the data aggregation done in prescriptions-refresh-job, we aggregate information on whether the given project is a fork. If so, users are warned about the use of forked projects. Prescriptions are automatically updated as the GitHub state changes.

Example: thoth-station/prescriptions#17217

@fridex
Contributor

fridex commented Sep 16, 2021

Warn if a package used was not updated in the last 365 days

As part of data aggregation done in prescriptions-refresh-job, we are aggregating information on whether the given project was updated in the past 365 days on GitHub (any commit to the default Git branch). If not, we warn users about this fact. Prescriptions are automatically updated as the GitHub state changes.

Example: https://github.com/thoth-station/prescriptions/pull/17394/files
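
A check like the one described above might look roughly like this. It is a sketch under stated assumptions, not the prescriptions-refresh-job implementation: the function name `is_stale` and its parameters are made up for illustration, and the timestamp of the last commit on the default branch is assumed to have been fetched from GitHub already as a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(
    last_commit: datetime,
    now: Optional[datetime] = None,
    max_age_days: int = 365,
) -> bool:
    """Return True if the last commit is older than max_age_days.

    Both datetimes are expected to be timezone-aware so they can be compared.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_commit > timedelta(days=max_age_days)


if __name__ == "__main__":
    release_day = datetime(2021, 9, 27, tzinfo=timezone.utc)
    # Hypothetical commit dates, used only to exercise the function.
    print(is_stale(datetime(2020, 1, 1, tzinfo=timezone.utc), now=release_day))
    print(is_stale(datetime(2021, 6, 1, tzinfo=timezone.utc), now=release_day))
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to test than reading the clock inside the function.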

@fridex
Contributor

fridex commented Sep 16, 2021

Warn if the used project has fewer than 5 contributors on GitHub

As part of the data aggregation done in prescriptions-refresh-job, we aggregate information on the number of contributors. If the number of contributors is fewer than 3, we warn users about this fact. Prescriptions are automatically updated as the GitHub state changes.

@fridex
Contributor

fridex commented Sep 17, 2021

Information about GitHub popularity

Starting with this release, when users ask for advice, we also provide information about the project's community on GitHub. If a project has a small community, we warn about its use; otherwise, we report how big the community is. Prescriptions are automatically updated as the GitHub state changes.

Example (very high popularity): flask
Example (high popularity): wheel
Example (moderate popularity): configparser
Example (low popularity): untokenize

@harshad16
Member Author

We have completed the 2021.09.27 release 🎉 🎊 🥳

Features

Optimizations in prescriptions loading in a deployment

As our open database about Python projects (thoth-station/prescriptions) grew, we observed a large overhead in handling it in raw form (YAML files) in a deployment. We currently have more than 50k YAML files totalling more than 190 MiB, all of which were copied into the recommendation engine's deployment so that they could be used. Moreover, the YAML text format added overhead when parsing the files and creating resolution pipeline units out of them. With recent changes, we pre-load all the YAML files on each adviser component release and construct a binary file (pickle) that directly holds Python objects. The cloud-based resolver loads this binary file directly into memory on each request. This significantly speeds up prescriptions loading in a deployment and allows the open database of observations about the Python ecosystem to grow even more. Originally, the whole prescription overhead (loading, parsing, initializing) was roughly 1 minute (plus container pull); now it is roughly 0.01 second.

Related: thoth-station/prescriptions#50
Related: thoth-station/adviser#2085
Related: #1950

Open Source Security Foundation - Security Scorecard

Starting with this release, we provide information derived from OpenSSF Security Scorecards, as published by the Open Source Security Foundation. To support this, prescriptions-refresh-job queries the scorecards available in BigQuery and constructs prescriptions from them, which are committed directly to the thoth-station/prescriptions repository, our open database about Python open-source projects. As scorecards are specific to repositories, they are automatically mapped to Python packages under the hood, based on Thoth's knowledge. The prescriptions refresh job automatically updates prescriptions as scorecards get updated. See the relevant prescriptions for flask or tensorflow for examples.

Thoth tutorial: create and use overlays for Elyra AI Pipelines steps.

This tutorial shows the concept of overlays, how they are applied to software stacks, what overlay builds are, and how to use the built images in AI Pipelines.

Related:

Warning produced if users use forked projects on GitHub

As part of the data aggregation done in prescriptions-refresh-job, we aggregate information on whether the given project is a fork. If so, users are warned about the use of forked projects. Prescriptions are automatically updated as the GitHub state changes.

Example: thoth-station/prescriptions#17217

Warn if a package used was not updated in the last 365 days

As part of data aggregation done in prescriptions-refresh-job, we are aggregating information on whether the given project was updated in the past 365 days on GitHub (any commit to the default Git branch). If not, we warn users about this fact. Prescriptions are automatically updated as the GitHub state changes.

Example: https://github.com/thoth-station/prescriptions/pull/17394/files

Warn if the used project has fewer than 5 contributors on GitHub

As part of the data aggregation done in prescriptions-refresh-job, we aggregate information on the number of contributors. If the number of contributors is fewer than 3, we warn users about this fact. Prescriptions are automatically updated as the GitHub state changes.

Information about GitHub popularity

Starting with this release, when users ask for advice, we also provide information about the project's community on GitHub. If a project has a small community, we warn about its use; otherwise, we report how big the community is. Prescriptions are automatically updated as the GitHub state changes.

Example (very high popularity): flask
Example (high popularity): wheel
Example (moderate popularity): configparser
Example (low popularity): untokenize

Component Updates

Thanks for the amazing work everyone. 💯

4 participants