Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telemetry #408

Merged
merged 22 commits into from
Jan 11, 2024
Merged

Telemetry #408

merged 22 commits into from
Jan 11, 2024

Conversation

visnup
Copy link
Member

@visnup visnup commented Dec 19, 2023

Resolves #327
Depends on https://github.com/observablehq/observablehq/pull/15524

Questions for reviewers:

  • Anyone want to copy edit the banner text, balancing disclosure, empathy, and helpfulness?
  • Are there docs anywhere yet for me to stub out telemetry explanation and so I can set the proper URL for the banner?
  • Does anyone care deeply about the timestamp format? Current version leverages performance.now which feels more aligned with first-pass questions around "how long does X take?"
  • Does time zone offset seem anonymous enough to capture? It's always been on my wishlist to better understand what local time of day people are doing things.
  • Any other environmental information to capture? Node version?
  • What preview events should we collect?

todo:

Probably in a follow up PR based on everyone's feedback:

  • more build data
    • number of files?
    • sizes of files?
  • preview stop event

Collects and sends telemetry to https://events.observablehq.com/cli. What's currently collected:

type TelemetryIds = {
   device: uuid; // uuid v4 saves to ~/.observablehq to attempt stability
   project: string; // one-way salted + hashed value of a project's git url falling back to current directory name
   session: uuid; // uuid v4 held in memory for the duration of the process
 };
 type TelemetryEnvironment = {
   version: string; // version of cli from package.json
   systemPlatform: string; // darwin, linux, win32...
   systemRelease: string;
   systemArchitecture: string; // arm64, x64...
   cpuCount: number;
   cpuModel: string | null;
   cpuSpeed: number | null;
   memoryInMb: number; // truncated to mb for more anonymity
   isCI: string | boolean;
   isDocker: boolean;
   isWSL: boolean;
 };
 type TelemetryTime = {
   now: number; // performance.now
   timeOrigin: number; // performance.timeOrigin
   timeZoneOffset: number; // minutes from utc, to derive wall clock hour
 };
 type TelemetryData = {
   event: "build" | "deploy" | "preview";
   step: "start" | "finish";
 };

Presents a banner once, when first run with a URL pointing to more information. Showing the banner saves timestamp to ~/.observablehq to know we've already shown it. That means checking to see if we should show it is async and therefore the banner could show up "late" and interspersed with other log messages. Is there a better way to check and save banner state?

@visnup visnup changed the title Record anonymous usage telemetry Telemetry Dec 20, 2023
@visnup visnup marked this pull request as ready for review January 10, 2024 03:43
@visnup visnup requested review from Fil, mbostock and mythmon January 10, 2024 03:44
.gitignore Outdated Show resolved Hide resolved
src/build.ts Outdated
@@ -144,6 +145,7 @@ export async function build(
await effects.copyFile(sourcePath, outputPath);
}
}
telemetry.record({event: "build", step: "finish"});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet this kind of "start/end" telemetry is going to often be interesting for us. Is it possible to link ends to their starts at all? Maybe telemetry.record could return a message ID that we could note in the end event?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the session id will be stable across tied events.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the workflow on the analytic side would be to find a start event, and then look for end events with the same session id? Will that be annoying if there is an event that happens multiple times in a telemetry session?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, do you mean like in the case of maybe two builds happening concurrently and the events get interleaved?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if we have a start event for something more granular, like one per page. Is that just not what this is for? Having average time-per-page and number-of-pages metrics would be useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we could do something like that. so just easier rolled-up timings of blocks of code, right? maybe a premature idea would be to provide a telemetry.measure({event: "something"}, () => { /* block */ }). I guess I'm still not sure exactly what we'll want to measure so am willing to iterate a bunch and probably throw out old data as we do.

Copy link
Member Author

@visnup visnup Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could return an identifier (a counter probably that just increases with every call to record) if people want to reference these in other events:

const ts = telemetry.record({event: "build", step: "start"});
// ...
telemetry.record({event: "build", step: "milestone", start: ts});

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The block version is familiar, but it feels heavy handed. I'd really like language support for something like destructors, which makes this very nice, but we don't have those.

I think the ts version makes sense, and I'd be happy with that.

Another idea I had was to have the return value of record be a function or object with methods so you could do something like

const telemetrySpan = telemetry.startSpan({event: "build", step: "start"});
// ...
telemetrySpan.note({step: "milestone", foo: "bar"});
// ...
telemetrySpan.end({step: "end", status: "success"})

That's probably overkill though. Having .record always give back an auto incrementing ID sounds like a good approach.

src/config.ts Outdated Show resolved Hide resolved
src/deploy.ts Outdated Show resolved Hide resolved
src/telemetry.ts Show resolved Hide resolved
src/telemetry.ts Show resolved Hide resolved
src/telemetry.ts Show resolved Hide resolved
src/telemetry.ts Outdated Show resolved Hide resolved
Copy link
Member

@mythmon mythmon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concerns about the start/end pairing are a theoretical thing that don't apply directly to any telemetry events added in this PR. We can deal with that if we actually add events that need that.

@visnup visnup enabled auto-merge (squash) January 11, 2024 20:00
@visnup visnup merged commit c45051d into main Jan 11, 2024
2 checks passed
@visnup visnup deleted the visnup/telemetry branch January 11, 2024 20:01
@Fil
Copy link
Contributor

Fil commented Jan 11, 2024

❯ yarn dev

↳ http://127.0.0.1:3002/

node:internal/process/promises:289
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: EACCES: permission denied, open '/Users/fil/.observablehq'] {
  errno: -13,
  code: 'EACCES',
  syscall: 'open',
  path: '/Users/fil/.observablehq'
}

Node.js v20.3.0

❯ git bisect bad
c45051d is the first bad commit

@Fil
Copy link
Contributor

Fil commented Jan 11, 2024

🛑 Same blocking error for yarn build

@visnup
Copy link
Member Author

visnup commented Jan 11, 2024

OBSERVABLE_TELEMETRY_DISABLE=1 yarn dev to unblock you, but now I'm unsure why you can't write to ~/.observablehq. I guess I can just fallback and not care if that happens and not save persistent values anywhere.

@Fil
Copy link
Contributor

Fil commented Jan 11, 2024

probably something I tried during an earlier exploration; I like to make things break :)

I had this:

❯ ls -altr /Users/fil/.observablehq
-r--r--r--@ 1 fil  staff  103 20 nov 22:17 /Users/fil/.observablehq

setting it to writeable solved the issue

cinxmo pushed a commit that referenced this pull request Jan 12, 2024
* Record anonymous usage telemetry

* Stricter types

* Better persistence

* Detect CI and Docker environments

* Don't fail tests

* Configurable origin

* Fix lint

* Better time information

* Add isWSL heuristic

* Some tests

* Show a banner on first run

* Test debug disables telemetry too

* More testable

* Ironically, disable telemetry during tests

* Base telemetry origin on ui origin

* Some documentation of what we're collecting

* Manage our own singleton-ness

* Initial telemetry documentation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Telemetry
4 participants