From 1edc3136571b1d33f818bf275ef2875e4f1d5c20 Mon Sep 17 00:00:00 2001 From: Robert Lin Date: Sat, 24 Aug 2024 17:25:05 -0700 Subject: [PATCH] chore: sourcegraph is not open-source --- .../2020-5-7-sourcegraph-intern.md | 24 +++++++++---------- content/_experience/2021-7-5-sourcegraph.md | 10 ++++---- content/_posts/2020-06-21-docker-sidecar.md | 12 +++++----- ...8-mirroring-github-permissions-at-scale.md | 18 +++++++------- ...-10-10-investing-in-development-of-devx.md | 4 ++-- ...022-2-20-self-documenting-self-updating.md | 20 ++++++++-------- content/_posts/2022-4-10-extending-search.md | 4 ++-- content/_posts/2022-4-18-stateless-ci.md | 4 ++-- .../_posts/2022-5-21-anatomy-of-a-logger.md | 4 ++-- 9 files changed, 50 insertions(+), 50 deletions(-) diff --git a/content/_experience/2020-5-7-sourcegraph-intern.md b/content/_experience/2020-5-7-sourcegraph-intern.md index 1dbc321..66c8fe6 100644 --- a/content/_experience/2020-5-7-sourcegraph-intern.md +++ b/content/_experience/2020-5-7-sourcegraph-intern.md @@ -30,8 +30,6 @@ My work as an intern had several areas of focus: * improving the [process for creating Sourcegraph releases](#sourcegraph-releases) to on-premise deployments with new capabilities * experimenting with changes to the [pipelines that help us roll out Sourcegraph changes](#deployment-pipelines) to the various deployments we manage ourselves -Most of the company's work is open-source, so you can [see my pull requests for Sourcegraph on GitHub](https://github.com/search?q=org%3Asourcegraph+author%3Abobheadxi+is%3Amerged+updated%3A%3C2021-05-01&type=pullrequests&s=comments&o=desc)! If you poke around, you might spot me chiming in on a variety of other pull requests and issue discussions as well. - After a brief hiatus following my internship, I [returned to Sourcegraph full-time](2021-7-5-sourcegraph.md).
@@ -41,18 +39,18 @@ A brief hiatus after my internship, I [returned to Sourcegraph full-time](2021-7 During my time at Sourcegraph, a major part of my focus has been on expanding the capabilities of Sourcegraph's built-in monitoring stack and improving the experience for administrators of Sourcegraph deployments, Sourcegraph engineers, and Sourcegraph support. * I created a new sidecar service to ship with the [Sourcegraph Prometheus image](https://docs.sourcegraph.com/dev/background-information/observability/prometheus), which I wrote a bit about in [this blog post](../_posts/2020-06-21-docker-sidecar.md). This service enabled me to build: - * [alerting capabilities and configuration](https://docs.sourcegraph.com/admin/observability/alerting) directly within Sourcegraph, which now powers all alerting needs (routing, paging, and more) at Sourcegraph and [completely replaced our old alerting infrastructure](https://github.com/sourcegraph/sourcegraph/issues/5370#issuecomment-629406540) - * the ability to [include recent alerts data in bug reports](https://github.com/sourcegraph/sourcegraph/pull/10704) and [render service status within the Sourcegraph app](https://github.com/sourcegraph/sourcegraph/pull/11957) + * [alerting capabilities and configuration](https://docs.sourcegraph.com/admin/observability/alerting) directly within Sourcegraph, which now powers all alerting needs (routing, paging, and more) at Sourcegraph and [completely replaced our old alerting infrastructure](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/5370#issuecomment-629406540) + * the ability to [include recent alerts data in bug reports](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/10704) and [render service status within the Sourcegraph app](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/11957)
The Prometheus sidecar allows for detailed diagnostic feedback within the main Sourcegraph application.
-* I built features for and refactored the [Sourcegraph monitoring generator](https://docs.sourcegraph.com/dev/background-information/observability/monitoring-generator), which generates the Grafana dashboards, Prometheus rules and alerts definitions, documentation, and more that ship with Sourcegraph from a [custom monitoring specification](https://github.com/sourcegraph/sourcegraph/blob/main/monitoring/monitoring/README.md) that teams use to declare monitoring relevant to their services. Some changes include: - * [team ownership of alerts](https://github.com/sourcegraph/sourcegraph/issues/12010), which is part of what drives our alerting infrastructure and also guides support request routing. - * [new API design for customising graph panels](https://github.com/sourcegraph/sourcegraph/pull/17112) within our monitoring specification - * generated [dashboard overlays](https://github.com/sourcegraph/sourcegraph/pull/17198) for alert events and version changes +* I built features for and refactored the [Sourcegraph monitoring generator](https://docs.sourcegraph.com/dev/background-information/observability/monitoring-generator), which generates the Grafana dashboards, Prometheus rules and alerts definitions, documentation, and more that ship with Sourcegraph from a [custom monitoring specification](https://github.com/sourcegraph/sourcegraph-public-snapshot/blob/main/monitoring/monitoring/README.md) that teams use to declare monitoring relevant to their services. Some changes include: + * [team ownership of alerts](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/12010), which is part of what drives our alerting infrastructure and also guides support request routing. 
+ * [new API design for customising graph panels](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/17112) within our monitoring specification + * generated [dashboard overlays](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/17198) for alert events and version changes * driving a cross-team discussion to [overhaul the principles that drive our work on this tooling](https://github.com/sourcegraph/about/pull/2000) to help guide the future of monitoring at Sourcegraph
@@ -62,8 +60,8 @@ During my time at Sourcegraph, a major part of my focus has been on expanding th I also made a wide range of other improvements such as: -* [Shipping cAdvisor with Sourcegraph](https://github.com/sourcegraph/sourcegraph/issues/9791), which now serves container metrics for our standardised dashboards across deployment types -* [Update dashboards to scale with deployment sizes](https://github.com/sourcegraph/sourcegraph/pull/12756) +* [Shipping cAdvisor with Sourcegraph](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/9791), which now serves container metrics for our standardised dashboards across deployment types +* [Update dashboards to scale with deployment sizes](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/12756)
@@ -73,19 +71,19 @@ Previously, creating Sourcegraph releases was a lengthy, complex process that in * I made extensive improvements to the [Sourcegraph release tool](https://about.sourcegraph.com/handbook/engineering/distribution/tools/release), which handles automation of release tasks such as generating multi-repository changes, creating tags, setting up tracking issues, adding calendar events, making announcements, and more. * New automated changes and consolidated features as part of work to [reduce the number of steps to create a release](https://github.com/orgs/sourcegraph/projects/90) - * [Multi-repository changeset tracking](https://github.com/sourcegraph/sourcegraph/pull/15032) + * [Multi-repository changeset tracking](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/15032) * Extensive refactors to improve the tool's extensibility and reliability * Overall, helped reduce time to cut a release from several days to just a few hours
- Chart of downwards trend of steps required to create a release, based on checklist items in our generated release tracking issues (for example, sourcegraph#17727). + Chart of downwards trend of steps required to create a release, based on checklist items in our generated release tracking issues (for example, sourcegraph#17727). Patch release steps increased due to improved documentation and standardisation of the process.
-* Improved our integration and regression testing suite by introducing the capability to [directly leverage candidate images in tests, generalising test setup tooling, and migrating our automated upgrade tests to ensure compatibility](https://github.com/sourcegraph/sourcegraph/pull/14974) +* Improved our integration and regression testing suite by introducing the capability to [directly leverage candidate images in tests, generalising test setup tooling, and migrating our automated upgrade tests to ensure compatibility](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/14974) * Work on automated end-to-end testing by the Distribution team also contributed to the removal of many release steps * The use of shared per-build candidate images is now the standard way to run integration tests, saving a lot of build time previously spent building images for each individual test diff --git a/content/_experience/2021-7-5-sourcegraph.md b/content/_experience/2021-7-5-sourcegraph.md index 04e7570..813253c 100644 --- a/content/_experience/2021-7-5-sourcegraph.md +++ b/content/_experience/2021-7-5-sourcegraph.md @@ -15,9 +15,11 @@ description: "July 2021 - Present | Remote" author: robert --- -Since July 2021, I have been working as a software engineer at [Sourcegraph](#about-sourcegraph), firstly in the the newly created [Developer Experience team](#developer-experience) for about 5 months and later in the [Sourcegraph Cloud team](#sourcegraph-cloud). +Since July 2021, I have been working as a software engineer at [Sourcegraph](#about-sourcegraph) in various teams across the company over time. -Most of the company's work is open-source (to a lesser extent on the Sourcegraph Cloud team), so you can [see some of my contributions for Sourcegraph on GitHub](https://github.com/search?q=org%3Asourcegraph+author%3Abobheadxi+is%3Amerged+created%3A%3E2021-05-01&type=pullrequests&s=comments&o=desc)! 
+- [Core Services](#core-services) +- [Sourcegraph Cloud](#sourcegraph-cloud) +- [Developer experience](#developer-experience) ## Core Services @@ -39,7 +41,7 @@ During my 15 months as part of the Developer Experience team, I contributed exte - [`sg`, the Sourcegraph developer tool](https://docs.sourcegraph.com/dev/background-information/sg), in particular building out infrastructure to [allow development of `sg` to scale](../_posts/2022-10-10-investing-in-development-of-devx.md) - Sourcegraph's continuous integration infrastructure and CI pipeline generator - the Sourcegraph monitoring generator, which manages converting monitoring definitions into integrations with Sourcegraph's monitoring ecosystem like Grafana dashboards, Prometheus Alertmanager alerts, and generated alert response documentation. -- driving the discussion, implementation, and adoption of [standardised logging](https://github.com/sourcegraph/sourcegraph/pull/33956) and [OpenTelemetry](https://github.com/sourcegraph/sourcegraph/issues/39397) in Sourcegraph +- driving the discussion, implementation, and adoption of [standardised logging](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/33956) and [OpenTelemetry](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/39397) in Sourcegraph - designing and building a new architecture for [scalable, stateless continuous integration agents](../_posts/2022-4-18-stateless-ci.md) ...and more.
@@ -47,7 +49,7 @@ During my 15 months as part of the Developer Experience team, I contributed exte In addition to work directly related to the Developer Experience teams' ownership areas, I also contributed to other parts of the core Sourcegraph application during my time with the team, such as: - [scaling GitHub permissions mirroring](../_posts/2021-10-8-mirroring-github-permissions-at-scale.md) for large enterprises and supporting the continued maintenance of Sourcegraph's permissions syncing systems -- designing and developing [an extended permissions model for Sourcegraph](https://github.com/sourcegraph/sourcegraph/issues/27916), notably [implementing expanded access control parsing for Perforce](https://github.com/sourcegraph/sourcegraph/pull/26745) +- designing and developing [an extended permissions model for Sourcegraph](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/27916), notably [implementing expanded access control parsing for Perforce](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/26745)
diff --git a/content/_posts/2020-06-21-docker-sidecar.md b/content/_posts/2020-06-21-docker-sidecar.md index a4974ca..c29da67 100644 --- a/content/_posts/2020-06-21-docker-sidecar.md +++ b/content/_posts/2020-06-21-docker-sidecar.md @@ -30,7 +30,7 @@ While I'll generally refer to Grafana in this writeup, you can apply it to prett --- -**⚠️ Update:** Since the writing of this post, we have pivoted on the plan ([sourcegraph#11452](https://github.com/sourcegraph/sourcegraph/issues/11452#issuecomment-648628953)) and most of the work here no longer lives in our Grafana distribution, but is instead a part of our Prometheus distribution - see [sourcegraph#11832](https://github.com/sourcegraph/sourcegraph/pull/11832) for the new implementation. You can explore the source code [on Sourcegraph](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/tree/docker-images/prometheus), and relevant documentation [here](https://docs.sourcegraph.com/dev/background-information/observability/prometheus). +**⚠️ Update:** Since the writing of this post, we have pivoted on the plan ([sourcegraph#11452](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/11452#issuecomment-648628953)) and most of the work here no longer lives in our Grafana distribution, but is instead a part of our Prometheus distribution - see [sourcegraph#11832](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/11832) for the new implementation. You can explore the source code [on Sourcegraph](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/tree/docker-images/prometheus), and relevant documentation [here](https://docs.sourcegraph.com/dev/background-information/observability/prometheus). Most of this article still applies, but with Prometheus + Alertmanager instead of Grafana.
@@ -309,15 +309,15 @@ And that's it for a rudimentary sidecar service that allows you to continue trea Some relevant pull requests implementing these features: -* [sourcegraph#11427](https://github.com/sourcegraph/sourcegraph/pull/11427) - I ended up reverting this due to bugs in certain environments and adding it back in [sourcegraph#11483](https://github.com/sourcegraph/sourcegraph/pull/11483), but both PRs include relevant discussions. These PRs implements a basic sidecar without start and restart capabilities. -* [sourcegraph#11554](https://github.com/sourcegraph/sourcegraph/pull/11554) adds the ability for the sidecar to start and restart the main service. +* [sourcegraph#11427](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/11427) - I ended up reverting this due to bugs in certain environments and adding it back in [sourcegraph#11483](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/11483), but both PRs include relevant discussions. These PRs implement a basic sidecar without start and restart capabilities. +* [sourcegraph#11554](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/11554) adds the ability for the sidecar to start and restart the main service. Note that most of the above work has been superseded by a pivot to Prometheus (see the update at the start of this post). Following the pivot, a lot of other work was enabled by the addition of this sidecar: -* [sourcegraph#12010](https://github.com/sourcegraph/sourcegraph/issues/12010) (implementation: [sourcegraph#12491](https://github.com/sourcegraph/sourcegraph/pull/12491)) proposed a mechanism for denoting ownership in our monitoring and routing alerts appropriately. -* [sourcegraph#17602](https://github.com/sourcegraph/sourcegraph/pull/17602) demonstrated potential summary capabilities a sidecar can export.
-* [sourcegraph#17014](https://github.com/sourcegraph/sourcegraph/pull/17014) and [sourcegraph#17034](https://github.com/sourcegraph/sourcegraph/pull/17034) adds timestamped links to relevant Grafana panels to alert messages. +* [sourcegraph#12010](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/12010) (implementation: [sourcegraph#12491](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/12491)) proposed a mechanism for denoting ownership in our monitoring and routing alerts appropriately. +* [sourcegraph#17602](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/17602) demonstrated potential summary capabilities a sidecar can export. +* [sourcegraph#17014](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/17014) and [sourcegraph#17034](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/17034) add timestamped links to relevant Grafana panels in alert messages. ## About Sourcegraph diff --git a/content/_posts/2021-10-8-mirroring-github-permissions-at-scale.md b/content/_posts/2021-10-8-mirroring-github-permissions-at-scale.md index ca9bfce..ec77089 100644 --- a/content/_posts/2021-10-8-mirroring-github-permissions-at-scale.md +++ b/content/_posts/2021-10-8-mirroring-github-permissions-at-scale.md @@ -40,7 +40,7 @@ The time to sync increases dramatically for even larger numbers of users and rep ## Sourcegraph and repository authorization -I got my first hands-on experience with Sourcegraph's authorization providers when [expanding `p4 protect` support for the Perforce integration](https://github.com/sourcegraph/sourcegraph/pull/23755). +I got my first hands-on experience with Sourcegraph's authorization providers when [expanding `p4 protect` support for the Perforce integration](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/23755).
In a nutshell, Sourcegraph internally defines an interface *authorization providers* can implement to provide access lists for users (*user-centric* permissions) and repositories (*repo-centric* permissions) - [`authz.Provider`](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@8685a6bef8c3e9d2556335cb25448dbc1b356a4a/-/blob/internal/authz/iface.go) - to populate a single source-of-truth table for permissions. This happens continuously and passively in the background. The populated table is then queried by various code paths that use the data to decide what content can and cannot be shown to a user. @@ -116,9 +116,9 @@ Even if we had a 100 teams and organizations, this would fall under the hourly r To mitigate outdated caches, a flag to the provider interface was added to allow partial cache invalidation along the path of a sync (important because you don't want every single team and organization queued for a sync all at once) and tying it into the various ways of triggering a sync (notably webhook receivers and the API). -The approach was promising, and a feature-flagged[^flagged] user-centric sync backed by a Redis cache was implemented in [sourcegraph#23978 authz/github: user-centric perms sync from team/org perms caches](https://github.com/sourcegraph/sourcegraph/pull/23978). +The approach was promising, and a feature-flagged[^flagged] user-centric sync backed by a Redis cache was implemented in [sourcegraph#23978 authz/github: user-centric perms sync from team/org perms caches](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/23978). -[^flagged]: Well, admittedly, it was only feature-flagged to off by default [in a follow-up PR](https://github.com/sourcegraph/sourcegraph/pull/24318) when I realised this required additional authentication scopes we do not request by default against the GitHub API (in order to query organizations and teams). 
+[^flagged]: Well, admittedly, it was only feature-flagged to off by default [in a follow-up PR](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/24318) when I realised this required additional authentication scopes we do not request by default against the GitHub API (in order to query organizations and teams). ## Two-way sync @@ -145,7 +145,7 @@ org/team: { On paper, the performance improvements gained here are similar to the ones when implementing caching for user-centric sync, except scaling off the number of users in teams and organizations instead of repositories. -This was implemented in [sourcegraph#24328 authz/github: repo-centric perms sync from team/org perms caches](https://github.com/sourcegraph/sourcegraph/pull/24328). +This was implemented in [sourcegraph#24328 authz/github: repo-centric perms sync from team/org perms caches](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/24328). ## Scaling in practice @@ -166,13 +166,13 @@ All was well at first in the trial run - the backlog of repositories queued for Metrics indicated jobs were timing out, and a look at the logs revealed thousands upon thousands of lines of random comma-delimited numbers. It seemed that printing all this junk was causing the service to stall, and sure enough [setting the log driver to `none`](https://docs.docker.com/config/containers/logging/configure/#configure-the-logging-driver-for-a-container) to disable all output on the relevant service allowed the sync to proceed and continue. -Where did the log come from? [I left a stray `log.Printf("%+v\n", group)` in there when I was debugging cache entries](https://github.com/sourcegraph/sourcegraph/pull/24822). At scale these entries could contain many thousands of entries, causing the system to degrade. Be careful what you log! +Where did the log come from? 
[I left a stray `log.Printf("%+v\n", group)` in there when I was debugging cache entries](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/24822). At scale, each of these groups could contain many thousands of entries, causing the system to degrade. Be careful what you log! ### Postgres parameter limits -A service we call `repo-updater` has an internal service called `PermsSyncer` that manages a queue of jobs to request updated access lists using these authorization providers for users and repositories based on a variety of heuristics such as permissions age, as well as on events like webhooks and repository visits ([diagram](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@8685a6bef8c3e9d2556335cb25448dbc1b356a4a/-/blob/enterprise/cmd/repo-updater/internal/authz/doc.go)). Access lists returned by authorization providers are upserted into a single [`repo_permissions` table](https://github.com/sourcegraph/sourcegraph/blob/main/internal/database/schema.md#table-publicrepo_permissions) that is the source of truth for all repositories a *Sourcegraph* user can access, and vice versa. +A service we call `repo-updater` has an internal service called `PermsSyncer` that manages a queue of jobs to request updated access lists using these authorization providers for users and repositories based on a variety of heuristics such as permissions age, as well as on events like webhooks and repository visits ([diagram](https://sourcegraph.com/github.com/sourcegraph/sourcegraph@8685a6bef8c3e9d2556335cb25448dbc1b356a4a/-/blob/enterprise/cmd/repo-updater/internal/authz/doc.go)). Access lists returned by authorization providers are upserted into a single [`repo_permissions` table](https://github.com/sourcegraph/sourcegraph-public-snapshot/blob/main/internal/database/schema.md#table-publicrepo_permissions) that is the source of truth for all repositories a *Sourcegraph* user can access, and vice versa.
-Entries can also be upserted into a table called [`repo_pending_permissions`](https://github.com/sourcegraph/sourcegraph/blob/main/internal/database/schema.md#table-publicrepo_pending_permissions), which is home to permissions that do not have a Sourcegraph user attached yet. When a user logs in via a code host's OAuth mechanism to Sourcegraph, the user's Sourcegraph identity attached to the user's identity on that code host (this allows a Sourcegraph user to be associated with multiple code hosts), and relevant entries in `repo_pending_permissions` are "granted" to the user. +Entries can also be upserted into a table called [`repo_pending_permissions`](https://github.com/sourcegraph/sourcegraph-public-snapshot/blob/main/internal/database/schema.md#table-publicrepo_pending_permissions), which is home to permissions that do not have a Sourcegraph user attached yet. When a user logs in via a code host's OAuth mechanism to Sourcegraph, the user's Sourcegraph identity is attached to the user's identity on that code host (this allows a Sourcegraph user to be associated with multiple code hosts), and relevant entries in `repo_pending_permissions` are "granted" to the user. This means that once the massive number of repositories in the trial run was fully mirrored from GitHub, a user attempting to log in could have a huge set of pending permissions granted to it all at once. Of course, this broke with a fun-looking error: @@ -258,7 +258,7 @@ FROM unnest(ARRAY['hello','world']::TEXT[], ARRAY['1,2,3','4,5,6']::TEXT[]) AS t(a, b) ``` -An `EXPLAIN ANALYZE` on the 5000-row sample query that didn't hit the parameter limit, however, indicated that the performance of this was about 5x worse than before (with a cost of 337.51, compared to the previous cost of 62.50). It was also a bit of a dirty hack anyway, so I ended up resorting to simply paging the insert instead to avoid hitting the parameter limit.
This was implemented in [sourcegraph#24852 database: page upsertRepoPermissionsBatchQuery](https://github.com/sourcegraph/sourcegraph/pull/24852). +An `EXPLAIN ANALYZE` on the 5000-row sample query that didn't hit the parameter limit, however, indicated that the performance of this was about 5x worse than before (with a cost of 337.51, compared to the previous cost of 62.50). It was also a bit of a dirty hack anyway, so I ended up resorting to simply paging the insert instead to avoid hitting the parameter limit. This was implemented in [sourcegraph#24852 database: page upsertRepoPermissionsBatchQuery](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/24852). However, it seemed that this was not the only instance of us exceeding the parameter limits. Another query was running into a similar issue on a different customer instance. This time, there were no array types in the values being inserted, so I was able to try out the insert-as-arrays workaround: @@ -290,7 +290,7 @@ This implementation of the query was slower for smaller cases, but for larger da I originally had the function decide which query to use based on the size of the insert, but during code review it was recommended that we just stick to one implementation for simplicity, since permissions mirroring happens asynchronously and is not particularly latency-sensitive. -This was implemented in [sourcegraph#24972 database: provide upsertUserPendingPermissionsBatchQuery insert values as array](https://github.com/sourcegraph/sourcegraph/pull/24972/files). +This was implemented in [sourcegraph#24972 database: provide upsertUserPendingPermissionsBatchQuery insert values as array](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/24972/files). 
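The paging fix described above can be sketched roughly as follows (a hypothetical helper, not the actual Sourcegraph implementation): split the rows into pages so that the number of rows times the number of bind parameters per row never exceeds Postgres's 65535-parameter limit.

```go
package main

import "fmt"

// maxParams is the hard Postgres limit on bind parameters per statement.
const maxParams = 65535

// pageRows splits rows into pages small enough that a batched INSERT using
// paramsPerRow placeholders per row never exceeds the parameter limit.
func pageRows(rows []string, paramsPerRow int) [][]string {
	pageSize := maxParams / paramsPerRow
	var pages [][]string
	for start := 0; start < len(rows); start += pageSize {
		end := start + pageSize
		if end > len(rows) {
			end = len(rows)
		}
		pages = append(pages, rows[start:end])
	}
	return pages
}

func main() {
	rows := make([]string, 100000)
	// With 3 parameters per row (say, user_id, repo_id, updated_at), each page
	// holds at most 65535/3 = 21845 rows, so 100000 rows need 5 INSERTs.
	fmt.Println(len(pageRows(rows, 3)))
}
```

Since permissions mirroring runs asynchronously in the background, the extra round trips from paging are a reasonable trade for staying under the limit.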
## Results diff --git a/content/_posts/2022-10-10-investing-in-development-of-devx.md b/content/_posts/2022-10-10-investing-in-development-of-devx.md index 746009f..13959e6 100644 --- a/content/_posts/2022-10-10-investing-in-development-of-devx.md +++ b/content/_posts/2022-10-10-investing-in-development-of-devx.md @@ -179,7 +179,7 @@ A service configuration might look like: if [ -n "$DELVE" ]; then export GCFLAGS='all=-N -l' fi - go build -gcflags="$GCFLAGS" -o .bin/oss-frontend github.com/sourcegraph/sourcegraph/cmd/frontend + go build -gcflags="$GCFLAGS" -o .bin/oss-frontend github.com/sourcegraph/sourcegraph-public-snapshot/cmd/frontend checkBinary: .bin/oss-frontend env: CONFIGURATION_MODE: server @@ -231,7 +231,7 @@ For example, we can configure `PATH` for you, or make sure things are installed Enabling the development of good tooling, scripting, and automation makes a difference. There's a lot that can be done to improve how tooling is developed and improved, like the ideas I've brought up in this post - we don't have to settle for cryptic tooling everywhere! -If you're interested in how all this is implemented, [`sg` is open source - come check us out on GitHub](https://github.com/sourcegraph/sourcegraph/tree/main/dev/sg)! +If you're interested in how all this is implemented, [`sg` is open source - come check us out on GitHub](https://github.com/sourcegraph/sourcegraph-public-snapshot/tree/main/dev/sg)!
*Note - I had originally hoped to present this as a lightning talk at Gophercon Chicago 2022, but I was too late to queue up on the day of the presentations, so I figured I might as well turn it into a post.* diff --git a/content/_posts/2022-2-20-self-documenting-self-updating.md b/content/_posts/2022-2-20-self-documenting-self-updating.md index 89d06da..ccbb380 100644 --- a/content/_posts/2022-2-20-self-documenting-self-updating.md +++ b/content/_posts/2022-2-20-self-documenting-self-updating.md @@ -166,11 +166,11 @@ func GitServer() *monitoring.Container {
Explore - what our monitoring generator looks like today!
-Since the specification is built on a typed language, the API itself is self-documenting in that authors of monitoring definitions can easily access what options are available and what each does through [generated API docs](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/docs/monitoring/monitoring) or code intelligence available in Sourcegraph or in your IDE, making it very easy to pick up and work with. +Since the specification is built on a typed language, the API itself is self-documenting in that authors of monitoring definitions can easily access what options are available and what each does through [generated API docs](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/docs/monitoring/monitoring) or code intelligence available in Sourcegraph or in your IDE, making it very easy to pick up and work with. ![](../../assets/images/posts/self-documenting/monitoring-api-hover.png) @@ -185,13 +185,13 @@ We also now have a tool, [`sg`](https://docs.sourcegraph.com/dev/background-info This all comes together to form a cohesive monitoring development and usage ecosystem that is tightly integrated, encodes best practices, self-documenting (both in the content it generates as well as the APIs available), and easy to extend. -Learn more about our observability ecosystem in our [developer documentation](https://docs.sourcegraph.com/dev/background-information/observability), and check out the [monitoring generator source code here](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/blob/monitoring/monitoring). +Learn more about our observability ecosystem in our [developer documentation](https://docs.sourcegraph.com/dev/background-information/observability), and check out the [monitoring generator source code here](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/blob/monitoring/monitoring).
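The typed-specification idea can be illustrated with a toy sketch. The type and field names below are invented for this example - see the linked source for the real API, which has many more fields (panel options, alert ownership, possible solutions, and so on):

```go
package main

import "fmt"

// Observable is a toy stand-in for a typed monitoring definition: one
// declaration that fans out into several generated artifacts.
type Observable struct {
	Name        string
	Description string
	Query       string  // PromQL expression backing the generated Grafana panel
	Warning     float64 // threshold from which a warning alert rule is generated
}

// PrometheusRule renders the observable as a Prometheus alerting-rule
// fragment, one of the artifacts a generator could emit from the definition.
func (o Observable) PrometheusRule() string {
	return fmt.Sprintf(
		"alert: %s_warning\nexpr: (%s) >= %g\nannotations:\n  description: %s\n",
		o.Name, o.Query, o.Warning, o.Description)
}

func main() {
	o := Observable{
		Name:        "frontend_error_rate",
		Description: "percentage of 5xx responses over 5m",
		Query:       "sum(rate(src_http_errors_total[5m]))",
		Warning:     5,
	}
	fmt.Print(o.PrometheusRule())
}
```

Because the definition is a plain typed value, the same `Observable` could just as easily feed a dashboard generator or a documentation generator, which is the core of the ecosystem described above.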
## Continuous integration pipelines -At Sourcegraph, our core continuous integration pipeline are - you guessed it - generated! Our [pipeline generator program](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/tree/enterprise/dev/ci) analyses a build's variables (changes, branch names, commit messages, environment variables, and more) in order to create a pipeline to run on our [Buildkite](https://buildkite.com/) agent fleet. +At Sourcegraph, our core continuous integration pipelines are - you guessed it - generated! Our [pipeline generator program](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/tree/enterprise/dev/ci) analyses a build's variables (changes, branch names, commit messages, environment variables, and more) in order to create a pipeline to run on our [Buildkite](https://buildkite.com/) agent fleet. Typically, [Buildkite pipelines](https://buildkite.com/docs/pipelines/defining-steps) are specified similarly to [GitHub Action workflows](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions) - by committing a YAML file to your repository that build agents pick up and run. This YAML file will specify what commands should get run over your codebase, and will usually support some simple conditions. @@ -330,7 +330,7 @@ With just the pretty minimal configuration above, each step is generated with a - agents: queue: standard command: - ./tr ./dev/ci/go-backcompat/test.sh only github.com/sourcegraph/sourcegraph/internal/database + - ./tr ./dev/ci/go-backcompat/test.sh only github.com/sourcegraph/sourcegraph-public-snapshot/internal/database env: MINIMUM_UPGRADEABLE_VERSION: 3.36.0 key: gopostgresBackcompattestinternaldatabase @@ -343,7 +343,7 @@ In this snippet, we have: - A default queue to run the job on - this can be feature-flagged to run against experimental agents.
 - The shared `MINIMUM_UPGRADEABLE_VERSION` variable that gets used for other steps as well, such as upgrade tests.
 - A generated key, useful for identifying steps and creating [step dependencies](https://buildkite.com/docs/pipelines/dependencies).
-- Commands prefixed with `./tr`: [this script](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/blob/enterprise/dev/ci/scripts/trace-command.sh) creates and uploads traces for our builds!
+- Commands prefixed with `./tr`: [this script](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/blob/enterprise/dev/ci/scripts/trace-command.sh) creates and uploads traces for our builds!
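As a rough illustration of how a generator can inject cross-cutting concerns like step keys and trace wrappers, here is a hypothetical Go sketch - `stepKey` and `withTracing` are invented names, and the key-derivation rule is a guess at the pattern visible in keys like `gopostgresBackcompattestinternaldatabase`, not the generator's actual algorithm:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stepKey derives a stable identifier from a human-readable step label by
// keeping only letters and digits (an assumed scheme for illustration).
func stepKey(label string) string {
	var b strings.Builder
	for _, r := range label {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// withTracing prepends the ./tr wrapper to every command, the way the
// generator injects tracing without touching individual step definitions.
func withTracing(commands []string) []string {
	wrapped := make([]string, 0, len(commands))
	for _, cmd := range commands {
		wrapped = append(wrapped, "./tr "+cmd)
	}
	return wrapped
}

func main() {
	cmds := withTracing([]string{"go test ./internal/database"})
	fmt.Println(stepKey("go postgres test"), cmds[0])
}
```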
@@ -353,7 +353,7 @@ In this snippet, we have:
-Features like the build step traces [was implemented without having to make sweeping changes pipeline configuration](https://github.com/sourcegraph/sourcegraph/pull/29444/files), thanks to the generated approach - we just had to adjust the generator to inject the appropriate scripting, and now it _just works_ across all commands in the pipeline.
+Features like the build step traces [were implemented without having to make sweeping changes to pipeline configuration](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/29444/files), thanks to the generated approach - we just had to adjust the generator to inject the appropriate scripting, and now it _just works_ across all commands in the pipeline.
 
 Additional functions are also available that tweak how a step is created. For example, with `bk.AnnotatedCmd` one can indicate that a step will generate annotations by writing to `./annotations` - a wrapper script is configured to make sure these annotations gets picked up and uploaded via Buildkite's API:
 
@@ -454,7 +454,7 @@ for rt := runtype.PullRequest + 1; rt < runtype.None; rt += 1 {
- A web version of this reference page is also published to the pipeline types reference. You can also check out the docs generation code directly!
+ A web version of this reference page is also published to the pipeline types reference. You can also check out the docs generation code directly!
@@ -493,13 +493,13 @@ Using a similar iteration over the available run types we can also provide toolt
- Check out the sg ci build source code directly, or the discussion behind the inception of this feature.
+ Check out the sg ci build source code directly, or the discussion behind the inception of this feature.
 So now we have generated pipelines, documentation about them, the capability to extend pipeline specifications with additional feature like tracing, _and_ tooling that is integrated and automatically kept in sync with pipeline specifications - all derived from a single source of truth!
 
-Learn more about our continuous integration ecosystem in our [developer documentation](https://docs.sourcegraph.com/dev/background-information/continuous_integration), and check out the [pipeline generator source code here](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/tree/enterprise/dev/ci).
+Learn more about our continuous integration ecosystem in our [developer documentation](https://docs.sourcegraph.com/dev/background-information/continuous_integration), and check out the [pipeline generator source code here](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/tree/enterprise/dev/ci).
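The `for rt := runtype.PullRequest + 1; rt < runtype.None; rt += 1` iteration quoted earlier hints at how the reference docs and `sg ci build` tooltips stay in sync with run types. A self-contained Go sketch of that pattern, with an invented three-value enum standing in for the real `runtype` package:

```go
package main

import "fmt"

// RunType enumerates hypothetical pipeline run types, mimicking the
// runtype iteration quoted above; the real set lives in the generator.
type RunType int

const (
	PullRequest RunType = iota
	MainBranch
	ReleaseBranch
	None // sentinel marking the end of the enum
)

func (rt RunType) String() string {
	return [...]string{"Pull request", "Main branch", "Release branch"}[rt]
}

// Matches describes how a build gets mapped onto this run type
// (illustrative placeholder descriptions).
func (rt RunType) Matches() string {
	return [...]string{
		"branches not matching any other run type",
		"branch `main`",
		"branches matching release patterns",
	}[rt]
}

func main() {
	// Iterating the enum emits an always-up-to-date reference page; the
	// same loop can power CLI tooltips, so docs and tooling share one
	// source of truth.
	for rt := PullRequest; rt < None; rt++ {
		fmt.Printf("## %s\n\nTriggered by: %s\n\n", rt, rt.Matches())
	}
}
```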
diff --git a/content/_posts/2022-4-10-extending-search.md b/content/_posts/2022-4-10-extending-search.md
index b9460bc..6766dea 100644
--- a/content/_posts/2022-4-10-extending-search.md
+++ b/content/_posts/2022-4-10-extending-search.md
@@ -67,7 +67,7 @@ Note that all the code internals mentioned in this post may change - you can vie
 
 Additionally, I am basically a complete outsider when it comes to our search internals, and the search code I interact with in this post was built by [Sourcegraph's fantastic search teams](https://handbook.sourcegraph.com/departments/product-engineering/engineering/code-graph/search/), so kudos[^kudos] to the teams for making this hack possible in the first place!
 
-[^kudos]: So somewhat embarrassingly, on one of my iterations of this project [I complained a bit about the tedium of the many layers in the search backend](https://github.com/sourcegraph/sourcegraph/pull/33161), at which point I was educated by [Comby (structural search)](https://comby.dev/) creator [@rvantonder](https://github.com/rvantonder) on how [cleaning up the search internals is an ongoing effort and has improved significantly over the past year](https://github.com/sourcegraph/sourcegraph/pull/33161#issuecomment-1081441870). One of my biggest takeaways from this project is that search a very complex system and that building a suitable abstraction for the myriad of types of search that Sourcegraph already features is a monumental undertaking!
+[^kudos]: So somewhat embarrassingly, on one of my iterations of this project [I complained a bit about the tedium of the many layers in the search backend](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/33161), at which point I was educated by [Comby (structural search)](https://comby.dev/) creator [@rvantonder](https://github.com/rvantonder) on how [cleaning up the search internals is an ongoing effort and has improved significantly over the past year](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/33161#issuecomment-1081441870). One of my biggest takeaways from this project is that search is a very complex system and that building a suitable abstraction for the myriad of types of search that Sourcegraph already features is a monumental undertaking!
 
 ## Introducing a search job
 
@@ -782,7 +782,7 @@ You can also check out a brief final demo I made of the state of the project at
 
 [![demo](https://cdn.loom.com/sessions/thumbnails/23c8d3f23bf942f3ba24896472047f5b-1648802342917-with-play.gif)](https://www.loom.com/share/23c8d3f23bf942f3ba24896472047f5b)
 
-You can also check out the (messy) (and incomplete) code here: [sourcegraph#33316](https://github.com/sourcegraph/sourcegraph/pull/33316)
+You can also check out the (messy) (and incomplete) code here: [sourcegraph#33316](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/33316)
diff --git a/content/_posts/2022-4-18-stateless-ci.md b/content/_posts/2022-4-18-stateless-ci.md
index 4293f50..b73491a 100644
--- a/content/_posts/2022-4-18-stateless-ci.md
+++ b/content/_posts/2022-4-18-stateless-ci.md
@@ -97,11 +97,11 @@ The initial approach undertaken by the team used a single persistent [Kubernetes
 
 A new autoscaler service, `job-autoscaler`, was set up that pretty much did the exact same thing as the old `buildkite-autoscaler`, but instead of adjusting `spec.replicas`, it updated `spec.parallelism` instead, setting `spec.completions` and `spec.backoffLimit` to arbitrarily large values to prevent the Job from ever completing and shutting down.
 
-This initial approach was used to iterate on some refinements to our pipelines to accommodate stateless agents (namely improved caching of resources). Upon rolling this out on a larger scale, however, we immediately ran into issues resulting in major CI outages, after which I outlined my thoughts in [sourcegraph#32843 dev/ci: stateless autoscaler: investigate revamped approach with dynamic jobs](https://github.com/sourcegraph/sourcegraph/issues/32843). It turns out, we probably should not be applying a stateful management approach (scaling a single Job entity up and down) to what should probably be a stateless queue processing mechanism. I decided to take point on re-implementing our approach.
+This initial approach was used to iterate on some refinements to our pipelines to accommodate stateless agents (namely improved caching of resources). Upon rolling this out on a larger scale, however, we immediately ran into issues resulting in major CI outages, after which I outlined my thoughts in [sourcegraph#32843 dev/ci: stateless autoscaler: investigate revamped approach with dynamic jobs](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/32843). It turns out, we probably should not be applying a stateful management approach (scaling a single Job entity up and down) to what should probably be a stateless queue processing mechanism. I decided to take point on re-implementing our approach.
 
 ## Dynamic Kubernetes Jobs
 
-In [sourcegraph#32843](https://github.com/sourcegraph/sourcegraph/issues/32843) I proposed an approach where we dispatch agents by creating new Kubernetes Jobs with `spec.parallelism` and `spec.completions` set to roughly number of agents needed to process all the jobs within the Buildkite jobs queue. This would mean that as soon as all the agents within a dispatched Job are "consumed" (have processed a Buildkite job and exited), [Kubernetes can clean up the Job and related resources](https://kubernetes.io/docs/concepts/workloads/controllers/job/#ttl-mechanism-for-finished-jobs), and that would be that. If more agents are needed, we simply keep dispatching more Jobs. This is done by a new service called `buildkite-job-dispatcher`.
+In [sourcegraph#32843](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/32843) I proposed an approach where we dispatch agents by creating new Kubernetes Jobs with `spec.parallelism` and `spec.completions` set to roughly the number of agents needed to process all the jobs within the Buildkite jobs queue. This would mean that as soon as all the agents within a dispatched Job are "consumed" (have processed a Buildkite job and exited), [Kubernetes can clean up the Job and related resources](https://kubernetes.io/docs/concepts/workloads/controllers/job/#ttl-mechanism-for-finished-jobs), and that would be that. If more agents are needed, we simply keep dispatching more Jobs. This is done by a new service called `buildkite-job-dispatcher`.
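The dispatch sizing described in this hunk can be sketched as plain arithmetic. This is an assumed, illustrative reduction - the function name, the clamping, and the max-per-dispatch cap are inventions for the example, not the actual `buildkite-job-dispatcher` logic:

```go
package main

import "fmt"

// desiredAgents sizes a new Kubernetes Job to roughly the number of queued
// Buildkite jobs not already covered by live agents (assumed logic).
func desiredAgents(queuedJobs, liveAgents, maxPerDispatch int) int {
	needed := queuedJobs - liveAgents
	if needed <= 0 {
		return 0 // queue already covered; dispatch nothing
	}
	if needed > maxPerDispatch {
		return maxPerDispatch // cap each dispatched Job's size
	}
	return needed
}

func main() {
	n := desiredAgents(42, 10, 25)
	// Setting spec.parallelism and spec.completions to the same value means
	// every agent pod runs exactly one Buildkite job, the Job then
	// completes, and Kubernetes' TTL mechanism can clean it up.
	fmt.Printf("spec:\n  parallelism: %d\n  completions: %d\n", n, n)
}
```

The key contrast with the old `job-autoscaler` is that nothing here mutates a long-lived object: each call sizes a fresh, finite Job, so cleanup falls out of Kubernetes' normal Job lifecycle.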
 
 Luckily, all the setup has been done for stateless agents with the existing Buildkite Job, so the way the dispatcher works is by fetching the deployed Job, resetting a variety of fields used internally by Kubernetes:
diff --git a/content/_posts/2022-5-21-anatomy-of-a-logger.md b/content/_posts/2022-5-21-anatomy-of-a-logger.md
index 99d5b62..d57b0b5 100644
--- a/content/_posts/2022-5-21-anatomy-of-a-logger.md
+++ b/content/_posts/2022-5-21-anatomy-of-a-logger.md
@@ -28,7 +28,7 @@ In my personal experience, I've seen logging cause some very real issues - [a de
 
 > Metrics indicated jobs were timing out, and a look at the logs revealed thousands upon thousands of lines of random comma-delimited numbers. It seemed that printing all this junk was causing the service to stall, and sure enough setting the log driver to none to disable all output on the relevant service allowed the sync to proceed and continue. [...] At scale these entries could contain many thousands of entries, causing the system to degrade. Be careful what you log!
 
-At [Sourcegraph](/content/_experience/2021-7-5-sourcegraph.md) we currently use the cheekily named [`log15` logging library](https://github.com/inconshreveable/log15). Of course, a faster logger likely would not have prevented the above scenario from occurring (though we are in the process of [migrating to our new Zap-based logger](https://github.com/sourcegraph/sourcegraph/issues/33192)), but here's a set of (very unscientific) profiles that compare a somewhat "average" scenario of logging 3 fields with 3 fields of existing context in JSON format to demonstrate just how different Zap and `log15` handles rendering a log entry behind the scenes:
+At [Sourcegraph](/content/_experience/2021-7-5-sourcegraph.md) we currently use the cheekily named [`log15` logging library](https://github.com/inconshreveable/log15). Of course, a faster logger likely would not have prevented the above scenario from occurring (though we are in the process of [migrating to our new Zap-based logger](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/33192)), but here's a set of (very unscientific) profiles that compare a somewhat "average" scenario of logging 3 fields with 3 fields of existing context in JSON format to demonstrate just how differently Zap and `log15` handle rendering a log entry behind the scenes:
 
 ```go
 const iters = 100000
@@ -480,7 +480,7 @@ Turns out, seemingly simple things can be kind of complicated! However, in this
 
 Zap's design also provides some interesting ways to hook into its behaviour - Zap itself offers some examples, such as [`zaptest`](https://sourcegraph.com/github.com/uber-go/zap@v1.21.0/-/blob/zaptest/logger.go), which creates a logger with a custom `Writer` that sends output to Go's standard testing library.
 
-At Sourcegraph, our [new Zap-based logger](https://github.com/sourcegraph/sourcegraph/issues/33192) offers utilities to [hook into an our configured logger](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/blob/lib/log/logtest/logtest.go?L118-121) using Zap's [`WrapCore` API](https://sourcegraph.com/github.com/uber-go/zap@v1.21.0/-/blob/options.go?L42:6) to assert against log output (mostly for testing the log library itself), partly built on the existing `zaptest` utilities. We're also working on custom `Core` implementations to [automatically send logged errors to Sentry](https://github.com/sourcegraph/sourcegraph/pull/35582), and we [wrap `Field` constructors](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/blob/lib/log/fields.go) to define custom behaviours (we disallow importing directly from Zap for this reason). Pretty nifty to still have such a high degree of customizability in an implementation so focused on optimizations!
+At Sourcegraph, our [new Zap-based logger](https://github.com/sourcegraph/sourcegraph-public-snapshot/issues/33192) offers utilities to [hook into our configured logger](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/blob/lib/log/logtest/logtest.go?L118-121) using Zap's [`WrapCore` API](https://sourcegraph.com/github.com/uber-go/zap@v1.21.0/-/blob/options.go?L42:6) to assert against log output (mostly for testing the log library itself), partly built on the existing `zaptest` utilities. We're also working on custom `Core` implementations to [automatically send logged errors to Sentry](https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/35582), and we [wrap `Field` constructors](https://sourcegraph.com/github.com/sourcegraph/sourcegraph-public-snapshot/-/blob/lib/log/fields.go) to define custom behaviours (we disallow importing directly from Zap for this reason). Pretty nifty to still have such a high degree of customizability in an implementation so focused on optimizations!
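To illustrate the `WrapCore` pattern referenced in this hunk without pulling in Zap itself, here is a stdlib-only Go sketch of the same shape: a tiny `Core` interface, a wrapping core that captures entries (analogous to a test or Sentry core), and a logger whose core can be swapped without call sites changing. All names here are invented stand-ins for the Zap/zapcore equivalents:

```go
package main

import "fmt"

// Core is a tiny stand-in for zapcore.Core: it consumes finished entries.
type Core interface {
	Write(level, msg string)
}

// consoleCore is the "real" output core.
type consoleCore struct{}

func (consoleCore) Write(level, msg string) { fmt.Printf("[%s] %s\n", level, msg) }

// capturingCore wraps another Core and records entries - the same shape as
// hooking a test or Sentry core in via Zap's WrapCore option.
type capturingCore struct {
	wrapped Core
	entries []string
}

func (c *capturingCore) Write(level, msg string) {
	c.entries = append(c.entries, level+": "+msg)
	c.wrapped.Write(level, msg) // still forward to the original core
}

// Logger routes through whatever Core it holds; WrapCore swaps the core
// without any logging call sites changing.
type Logger struct{ core Core }

func (l Logger) Error(msg string) { l.core.Write("ERROR", msg) }

func (l Logger) WrapCore(f func(Core) Core) Logger { return Logger{core: f(l.core)} }

func main() {
	base := Logger{core: consoleCore{}}
	capture := &capturingCore{wrapped: base.core}
	logger := base.WrapCore(func(Core) Core { return capture })
	logger.Error("connection refused")
	fmt.Println(len(capture.entries)) // captured entries are now assertable
}
```

Because the wrapper composes rather than replaces, output still reaches the original core while the hook observes every entry - which is what makes this pattern useful both for asserting on logs in tests and for side channels like Sentry.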