diff --git a/CHANGELOG.md b/CHANGELOG.md index 0d6d0fe..27403a0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,305 +1,205 @@ # [0.6.0](https://github.com/forcedotcom/agents/compare/0.5.11...0.6.0) (2025-01-16) - ### Bug Fixes -* modify structure, order, and types ([5f75c87](https://github.com/forcedotcom/agents/commit/5f75c878a19a15123bb3cbc37e8424e2b5b27834)) -* update test ([e22b460](https://github.com/forcedotcom/agents/commit/e22b4606793a70b51d1321a4a232295c64bd03b2)) - +- modify structure, order, and types ([5f75c87](https://github.com/forcedotcom/agents/commit/5f75c878a19a15123bb3cbc37e8424e2b5b27834)) +- update test ([e22b460](https://github.com/forcedotcom/agents/commit/e22b4606793a70b51d1321a4a232295c64bd03b2)) ### Features -* add agentCreate v2 API and mock ([333f8cb](https://github.com/forcedotcom/agents/commit/333f8cb58e6633348af035b757e404dd276db0c0)) - - +- add agentCreate v2 API and mock ([333f8cb](https://github.com/forcedotcom/agents/commit/333f8cb58e6633348af035b757e404dd276db0c0)) ## [0.5.11](https://github.com/forcedotcom/agents/compare/0.5.10...0.5.11) (2025-01-11) - ### Bug Fixes -* **deps:** bump ansis from 3.5.2 to 3.8.1 ([8721e95](https://github.com/forcedotcom/agents/commit/8721e95b6f08f158dd57df90cfed992ab74ee7a7)) - - +- **deps:** bump ansis from 3.5.2 to 3.8.1 ([8721e95](https://github.com/forcedotcom/agents/commit/8721e95b6f08f158dd57df90cfed992ab74ee7a7)) ## [0.5.10](https://github.com/forcedotcom/agents/compare/0.5.9...0.5.10) (2025-01-11) - ### Bug Fixes -* **deps:** bump @salesforce/source-deploy-retrieve ([3d67921](https://github.com/forcedotcom/agents/commit/3d679210bd7f969d9d9b88cbb042a1c232c7db4e)) - - +- **deps:** bump @salesforce/source-deploy-retrieve ([3d67921](https://github.com/forcedotcom/agents/commit/3d679210bd7f969d9d9b88cbb042a1c232c7db4e)) ## [0.5.9](https://github.com/forcedotcom/agents/compare/0.5.8...0.5.9) (2025-01-06) - ### Bug Fixes -* fixed column widths ([13b661f](https://github.com/forcedotcom/agents/commit/13b661f6d97a233a213bf266b187624b72d4d389)) - - +- fixed column widths ([13b661f](https://github.com/forcedotcom/agents/commit/13b661f6d97a233a213bf266b187624b72d4d389)) ## [0.5.8](https://github.com/forcedotcom/agents/compare/0.5.7...0.5.8) (2025-01-06) - ### Bug Fixes -* bump oclif table ([7ed9d30](https://github.com/forcedotcom/agents/commit/7ed9d30263fef2243bf087d2c7d5777f47a2c032)) - - +- bump oclif table ([7ed9d30](https://github.com/forcedotcom/agents/commit/7ed9d30263fef2243bf087d2c7d5777f47a2c032)) ## [0.5.7](https://github.com/forcedotcom/agents/compare/0.5.6...0.5.7) (2025-01-04) - ### Bug Fixes -* **deps:** bump @salesforce/source-deploy-retrieve ([3213bf5](https://github.com/forcedotcom/agents/commit/3213bf548beffca94eb69262607eaff1aeb3d3a3)) - - +- **deps:** bump @salesforce/source-deploy-retrieve ([3213bf5](https://github.com/forcedotcom/agents/commit/3213bf548beffca94eb69262607eaff1aeb3d3a3)) ## [0.5.6](https://github.com/forcedotcom/agents/compare/0.5.5...0.5.6) (2024-12-28) - ### Bug Fixes -* **deps:** bump ansis from 3.4.0 to 3.5.2 ([88b8163](https://github.com/forcedotcom/agents/commit/88b81635731983297d11a8de896e3e170bb58a51)) - - +- **deps:** bump ansis from 3.4.0 to 3.5.2 ([88b8163](https://github.com/forcedotcom/agents/commit/88b81635731983297d11a8de896e3e170bb58a51)) ## [0.5.5](https://github.com/forcedotcom/agents/compare/0.5.4...0.5.5) (2024-12-21) - ### Bug Fixes -* **deps:** bump @oclif/table from 0.3.7 to 0.3.9 ([a6fe6df](https://github.com/forcedotcom/agents/commit/a6fe6dfcfbb88a61940d15e762fc58587731d54e)) - - +- **deps:** bump @oclif/table from 0.3.7 to 0.3.9 ([a6fe6df](https://github.com/forcedotcom/agents/commit/a6fe6dfcfbb88a61940d15e762fc58587731d54e)) ## [0.5.4](https://github.com/forcedotcom/agents/compare/0.5.3...0.5.4) (2024-12-21) - ### Bug Fixes -* **deps:** bump fast-xml-parser from 4.5.0 to 4.5.1 ([20dde46](https://github.com/forcedotcom/agents/commit/20dde46762d9fa143d9512f2aac24afc77f16f18)) - - +- **deps:** bump fast-xml-parser from 4.5.0 to 4.5.1 ([20dde46](https://github.com/forcedotcom/agents/commit/20dde46762d9fa143d9512f2aac24afc77f16f18)) ## [0.5.3](https://github.com/forcedotcom/agents/compare/0.5.2...0.5.3) (2024-12-20) - ### Bug Fixes -* add env var for polling ([4a20d50](https://github.com/forcedotcom/agents/commit/4a20d50d68352c25509c4a1f32c2dce1286d6882)) - - +- add env var for polling ([4a20d50](https://github.com/forcedotcom/agents/commit/4a20d50d68352c25509c4a1f32c2dce1286d6882)) ## [0.5.2](https://github.com/forcedotcom/agents/compare/0.5.1...0.5.2) (2024-12-18) - ### Bug Fixes -* update HRO ([accc6a1](https://github.com/forcedotcom/agents/commit/accc6a109295b7b8ca4ae6d6410e761cabd5ccc7)) - - +- update HRO ([accc6a1](https://github.com/forcedotcom/agents/commit/accc6a109295b7b8ca4ae6d6410e761cabd5ccc7)) ## [0.5.1](https://github.com/forcedotcom/agents/compare/0.5.0...0.5.1) (2024-12-16) - ### Bug Fixes -* update expected response from details API ([49a19fd](https://github.com/forcedotcom/agents/commit/49a19fd8032a0d47d4256c554b48eb6f47c95bae)) - - +- update expected response from details API ([49a19fd](https://github.com/forcedotcom/agents/commit/49a19fd8032a0d47d4256c554b48eb6f47c95bae)) # [0.5.0](https://github.com/forcedotcom/agents/compare/0.4.5...0.5.0) (2024-12-16) - ### Features -* support TAP format for test results ([f48cb12](https://github.com/forcedotcom/agents/commit/f48cb1218a197b5c8b1f8a9bff7025c435781d96)) - - +- support TAP format for test results ([f48cb12](https://github.com/forcedotcom/agents/commit/f48cb1218a197b5c8b1f8a9bff7025c435781d96)) ## [0.4.5](https://github.com/forcedotcom/agents/compare/0.4.4...0.4.5) (2024-12-14) - ### Bug Fixes -* **deps:** bump @salesforce/source-deploy-retrieve ([a282749](https://github.com/forcedotcom/agents/commit/a28274962414ee576e9da3e3ba8df21eab3ab9ad)) - - +- **deps:** bump @salesforce/source-deploy-retrieve ([a282749](https://github.com/forcedotcom/agents/commit/a28274962414ee576e9da3e3ba8df21eab3ab9ad)) ## [0.4.4](https://github.com/forcedotcom/agents/compare/0.4.3...0.4.4) (2024-12-13) - - ## [0.4.3](https://github.com/forcedotcom/agents/compare/0.4.2...0.4.3) (2024-12-13) - - ## [0.4.2](https://github.com/forcedotcom/agents/compare/0.4.1...0.4.2) (2024-12-10) - ### Bug Fixes -* retrieve GenAiPlugins too ([1d461a2](https://github.com/forcedotcom/agents/commit/1d461a27a713212bb2f033046a71d47d3be21204)) - - +- retrieve GenAiPlugins too ([1d461a2](https://github.com/forcedotcom/agents/commit/1d461a27a713212bb2f033046a71d47d3be21204)) ## [0.4.1](https://github.com/forcedotcom/agents/compare/0.4.0...0.4.1) (2024-12-10) - ### Bug Fixes -* agent create working ([1a57a90](https://github.com/forcedotcom/agents/commit/1a57a90630a043ad6294a9895145fa4fd7442330)) -* agent.create WIP ([836f6db](https://github.com/forcedotcom/agents/commit/836f6db06cd529d0482b9fa9630b9314892b799f)) -* emit lifecycle events ([c3bba31](https://github.com/forcedotcom/agents/commit/c3bba3100f953a320efb4542f0ecc031a5d41179)) - - +- agent create working ([1a57a90](https://github.com/forcedotcom/agents/commit/1a57a90630a043ad6294a9895145fa4fd7442330)) +- agent.create WIP ([836f6db](https://github.com/forcedotcom/agents/commit/836f6db06cd529d0482b9fa9630b9314892b799f)) +- emit lifecycle events ([c3bba31](https://github.com/forcedotcom/agents/commit/c3bba3100f953a320efb4542f0ecc031a5d41179)) # [0.4.0](https://github.com/forcedotcom/agents/compare/0.3.1...0.4.0) (2024-12-10) - ### Features -* junit result formatter ([42ff64b](https://github.com/forcedotcom/agents/commit/42ff64bd855d4de5e4f6585ab5c141816d1bccf3)) - - +- junit result formatter ([42ff64b](https://github.com/forcedotcom/agents/commit/42ff64bd855d4de5e4f6585ab5c141816d1bccf3)) ## [0.3.1](https://github.com/forcedotcom/agents/compare/0.3.0...0.3.1) (2024-12-07) - ### Bug Fixes -* **deps:** bump @oclif/table from 0.3.3 to 0.3.5 ([49ce6d5](https://github.com/forcedotcom/agents/commit/49ce6d58fe3a433a14bad20d38240180aaff864f)) - - +- **deps:** bump @oclif/table from 0.3.3 to 0.3.5 ([49ce6d5](https://github.com/forcedotcom/agents/commit/49ce6d58fe3a433a14bad20d38240180aaff864f)) # [0.3.0](https://github.com/forcedotcom/agents/compare/0.2.4...0.3.0) (2024-12-05) - ### Features -* export test formatters ([b5f26df](https://github.com/forcedotcom/agents/commit/b5f26df835c3a08e9065f327afc4a6bc1997c9ab)) - - +- export test formatters ([b5f26df](https://github.com/forcedotcom/agents/commit/b5f26df835c3a08e9065f327afc4a6bc1997c9ab)) ## [0.2.4](https://github.com/forcedotcom/agents/compare/0.2.3...0.2.4) (2024-12-03) - ### Bug Fixes -* add doc for agent class ([5208345](https://github.com/forcedotcom/agents/commit/52083450e7eff71e1cc3caf570abd18939317d9a)) - - +- add doc for agent class ([5208345](https://github.com/forcedotcom/agents/commit/52083450e7eff71e1cc3caf570abd18939317d9a)) ## [0.2.3](https://github.com/forcedotcom/agents/compare/0.2.2...0.2.3) (2024-12-03) - - ## [0.2.2](https://github.com/forcedotcom/agents/compare/0.2.1...0.2.2) (2024-12-02) - ### Bug Fixes -* export more types ([0a1d408](https://github.com/forcedotcom/agents/commit/0a1d408dd0d815b36509d5b5fe343e7e0817d1a2)) - - +- export more types ([0a1d408](https://github.com/forcedotcom/agents/commit/0a1d408dd0d815b36509d5b5fe343e7e0817d1a2)) ## [0.2.1](https://github.com/forcedotcom/agents/compare/0.2.0...0.2.1) (2024-12-02) - ### Bug Fixes -* update return type on start ([c14e8c4](https://github.com/forcedotcom/agents/commit/c14e8c41c3180dad1df807a3531e944b89cce229)) - - +- update return type on start ([c14e8c4](https://github.com/forcedotcom/agents/commit/c14e8c41c3180dad1df807a3531e944b89cce229)) # [0.2.0](https://github.com/forcedotcom/agents/compare/0.1.6...0.2.0) (2024-12-02) - ### Bug Fixes -* add polling lifecycle events ([695fd08](https://github.com/forcedotcom/agents/commit/695fd086865c60850d53aa2753686ea5aeef2d4a)) -* use sf-plugins-core for making table ([97eaa63](https://github.com/forcedotcom/agents/commit/97eaa633fd739c214029ce0fb1dbd521f220c5ae)) - +- add polling lifecycle events ([695fd08](https://github.com/forcedotcom/agents/commit/695fd086865c60850d53aa2753686ea5aeef2d4a)) +- use sf-plugins-core for making table ([97eaa63](https://github.com/forcedotcom/agents/commit/97eaa633fd739c214029ce0fb1dbd521f220c5ae)) ### Features -* add cancel method ([8371f9f](https://github.com/forcedotcom/agents/commit/8371f9fd735bfd7e12dd4a95419e321bb34cf465)) -* mock agent testing ([8df61a9](https://github.com/forcedotcom/agents/commit/8df61a9fba8005d0823bba5ce5f14d3ab5a5c12e)) -* mocked agent testing ([334988d](https://github.com/forcedotcom/agents/commit/334988d753f942fbfecdaa776e2285c51b81ebf5)) -* poll both status and details ([61b03dc](https://github.com/forcedotcom/agents/commit/61b03dcba132ed07df194953c850595771f3ccff)) - - +- add cancel method ([8371f9f](https://github.com/forcedotcom/agents/commit/8371f9fd735bfd7e12dd4a95419e321bb34cf465)) +- mock agent testing ([8df61a9](https://github.com/forcedotcom/agents/commit/8df61a9fba8005d0823bba5ce5f14d3ab5a5c12e)) +- mocked agent testing ([334988d](https://github.com/forcedotcom/agents/commit/334988d753f942fbfecdaa776e2285c51b81ebf5)) +- poll both status and details ([61b03dc](https://github.com/forcedotcom/agents/commit/61b03dcba132ed07df194953c850595771f3ccff)) ## [0.1.6](https://github.com/forcedotcom/agents/compare/0.1.5...0.1.6) (2024-11-16) - ### Bug Fixes -* **deps:** bump cross-spawn from 7.0.3 to 7.0.5 ([7f43cc7](https://github.com/forcedotcom/agents/commit/7f43cc706b848fd54c88d04bee2c0b7b632d7e76)) - - +- **deps:** bump cross-spawn from 7.0.3 to 7.0.5 ([7f43cc7](https://github.com/forcedotcom/agents/commit/7f43cc706b848fd54c88d04bee2c0b7b632d7e76)) ## [0.1.5](https://github.com/forcedotcom/agents/compare/0.1.4...0.1.5) (2024-11-16) - ### Bug Fixes -* **deps:** bump @salesforce/core from 8.6.3 to 8.8.0 ([193237b](https://github.com/forcedotcom/agents/commit/193237b5dbbe7ce1ee596a3b7305b5602d0883f8)) - - +- **deps:** bump @salesforce/core from 8.6.3 to 8.8.0 ([193237b](https://github.com/forcedotcom/agents/commit/193237b5dbbe7ce1ee596a3b7305b5602d0883f8)) ## [0.1.4](https://github.com/forcedotcom/agents/compare/0.1.3...0.1.4) (2024-11-12) - ### Bug Fixes -* do not append spec in name ([284d5d5](https://github.com/forcedotcom/agents/commit/284d5d56ed99c67b93a65904a00fdb00a2552a0e)) - - +- do not append spec in name ([284d5d5](https://github.com/forcedotcom/agents/commit/284d5d56ed99c67b93a65904a00fdb00a2552a0e)) ## [0.1.3](https://github.com/forcedotcom/agents/compare/0.1.2...0.1.3) (2024-11-12) - ### Bug Fixes -* use latest ([92ecbba](https://github.com/forcedotcom/agents/commit/92ecbbabc403fe57bf4069f9928b029d23db7a16)) - - +- use latest ([92ecbba](https://github.com/forcedotcom/agents/commit/92ecbbabc403fe57bf4069f9928b029d23db7a16)) ## [0.1.2](https://github.com/forcedotcom/agents/compare/0.1.1...0.1.2) (2024-11-12) - ### Bug Fixes -* publish to preview ([3f5ccb6](https://github.com/forcedotcom/agents/commit/3f5ccb687017186eb29b8b18c7fdce33daee1f70)) - - +- publish to preview ([3f5ccb6](https://github.com/forcedotcom/agents/commit/3f5ccb687017186eb29b8b18c7fdce33daee1f70)) ## [0.1.1](https://github.com/forcedotcom/agents/compare/0.1.0...0.1.1) (2024-11-10) - ### Bug Fixes -* export Agent class ([6c42b63](https://github.com/forcedotcom/agents/commit/6c42b63bbe9a5a5cf6fa0cea8f5649d07aaa6adc)) - - +- export Agent class ([6c42b63](https://github.com/forcedotcom/agents/commit/6c42b63bbe9a5a5cf6fa0cea8f5649d07aaa6adc)) # [0.1.0](https://github.com/forcedotcom/agents/compare/0c5d8d6ab9e9a8470c7192a56350567882a3017b...0.1.0) (2024-11-09) - ### Bug Fixes -* improve types and linting ([d5a6cb3](https://github.com/forcedotcom/agents/commit/d5a6cb3348e63d52e10540e99cf509be64a26649)) -* revise readme and version ([f690b7f](https://github.com/forcedotcom/agents/commit/f690b7f8a911315f467f00f5f533e22e92c69a9e)) - +- improve types and linting ([d5a6cb3](https://github.com/forcedotcom/agents/commit/d5a6cb3348e63d52e10540e99cf509be64a26649)) +- revise readme and version ([f690b7f](https://github.com/forcedotcom/agents/commit/f690b7f8a911315f467f00f5f533e22e92c69a9e)) ### Features -* add initial agent job spec create and mock ([0c5d8d6](https://github.com/forcedotcom/agents/commit/0c5d8d6ab9e9a8470c7192a56350567882a3017b)) - - - +- add initial agent job spec create and mock ([0c5d8d6](https://github.com/forcedotcom/agents/commit/0c5d8d6ab9e9a8470c7192a56350567882a3017b)) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index a257f13..5f994e9 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -2,15 +2,15 @@ 1. Familiarize yourself with the codebase by reading the docs, in particular the [developing](./developing.md) doc. -1. Create a new issue before starting your project so that we can keep track of +2. Create a new issue before starting your project so that we can keep track of what you're trying to add/fix. That way, we can also offer suggestions or let you know if there is already an effort in progress. -1. Fork this repository. -1. Set up your environment using the information in the [developing](./developing.md) doc. -1. Create a _topic_ branch in your fork based on the correct branch (usually the **develop** branch, see [Branches section](./developing.md)). Note: this step is recommended but technically not required if contributing using a fork. -1. Edit the code in your fork. -1. Sign the CLA (see [CLA](#cla)). -1. Send us a pull request when you're done. We'll review your code, suggest any +3. Fork this repository. +4. Set up your environment using the information in the [developing](./developing.md) doc. +5. Create a _topic_ branch in your fork based on the correct branch (usually the **develop** branch, see [Branches section](./developing.md)). Note: this step is recommended but technically not required if contributing using a fork. +6. Edit the code in your fork. +7. Sign the CLA (see [CLA](#cla)). +8. Send us a pull request when you're done. We'll review your code, suggest any needed changes, and merge it in. ## Pull Requests @@ -31,7 +31,3 @@ Agreement. You can do so by going to . ### Merging Pull Requests Pull request merging is restricted to squash and merge only. - -## Helpful Resources - -- [developing](./developing.md) doc. diff --git a/src/agentTester.ts b/src/agentTester.ts index 41c9dd6..dcfdc79 100644 --- a/src/agentTester.ts +++ b/src/agentTester.ts @@ -9,7 +9,7 @@ import { Duration, env } from '@salesforce/kit'; import ansis from 'ansis'; import { MaybeMock } from './maybe-mock'; -export type TestStatus = 'NEW' | 'IN_PROGRESS' | 'COMPLETED' | 'ERROR'; +export type TestStatus = 'New' | 'InProgress' | 'Completed' | 'Error'; export type AgentTestStartResponse = { aiEvaluationId: string; @@ -25,14 +25,13 @@ export type AgentTestStatusResponse = { export type TestCaseResult = { status: TestStatus; - number: string; utterance: string; startTime: string; endTime?: string; generatedData: { type: 'AGENT'; actionsSequence: string[]; - outcome: 'Success' | 'Failure'; + outcome: string; topic: string; inputTokensCount: string; outputTokensCount: string; @@ -42,7 +41,7 @@ export type TestCaseResult = { actualValue: string; expectedValue: string; score: number; - result: 'Passed' | 'Failed'; + result: 'PASS' | 'FAILURE'; metricLabel: 'Accuracy' | 'Precision'; metricExplainability: string; status: TestStatus; @@ -53,7 +52,7 @@ export type TestCaseResult = { }>; }; -export type AgentTestDetailsResponse = { +export type AgentTestResultsResponse = { status: TestStatus; startTime: string; endTime?: string; @@ -106,7 +105,7 @@ export class AgentTester { * * @param {string} jobId * @param {Duration} timeout - * @returns {Promise} + * @returns {Promise} */ public async poll( jobId: string, @@ -117,57 +116,63 @@ export class AgentTester { } = { timeout: Duration.minutes(5), } - ): Promise { + ): Promise { const frequency = env.getNumber('SF_AGENT_TEST_POLLING_FREQUENCY_MS', 1000); const lifecycle = Lifecycle.getInstance(); const client = await PollingClient.create({ poll: async (): Promise => { - // NOTE: we don't actually need to call the status API here since all the same information is present on the - // details API. We could just call the details API and check the status there. - const [detailsResponse, statusResponse] = await Promise.all([this.details(jobId), this.status(jobId)]); - const totalTestCases = detailsResponse.testSet.testCases.length; - const failingTestCases = detailsResponse.testSet.testCases.filter((tc) => tc.status === 'ERROR').length; - const passingTestCases = detailsResponse.testSet.testCases.filter( - (tc) => tc.status === 'COMPLETED' && tc.expectationResults.every((r) => r.result === 'Passed') - ).length; - - if (statusResponse.status.toLowerCase() === 'completed') { + const statusResponse = await this.status(jobId); + if (statusResponse.status.toLowerCase() !== 'new') { + const resultsResponse = await this.results(jobId); + const totalTestCases = resultsResponse.testSet.testCases.length; + const passingTestCases = resultsResponse.testSet.testCases.filter( + (tc) => tc.status.toLowerCase() === 'completed' && tc.expectationResults.every((r) => r.result === 'PASS') + ).length; + const failingTestCases = resultsResponse.testSet.testCases.filter( + (tc) => + ['error', 'completed'].includes(tc.status.toLowerCase()) && + tc.expectationResults.some((r) => r.result === 'FAILURE') + ).length; + + if (resultsResponse.status.toLowerCase() === 'completed') { + await lifecycle.emit('AGENT_TEST_POLLING_EVENT', { + jobId, + status: resultsResponse.status, + totalTestCases, + failingTestCases, + passingTestCases, + }); + return { payload: resultsResponse, completed: true }; + } + await lifecycle.emit('AGENT_TEST_POLLING_EVENT', { jobId, - status: statusResponse.status, + status: resultsResponse.status, totalTestCases, failingTestCases, passingTestCases, }); - return { payload: detailsResponse, completed: true }; } - await lifecycle.emit('AGENT_TEST_POLLING_EVENT', { - jobId, - status: statusResponse.status, - totalTestCases, - failingTestCases, - passingTestCases, - }); return { completed: false }; }, frequency: Duration.milliseconds(frequency), timeout, }); - return client.subscribe(); + return client.subscribe(); } /** * Request test run details * * @param {string} jobId - * @returns {Promise} + * @returns {Promise} */ - public async details(jobId: string): Promise { - const url = `/einstein/ai-evaluations/runs/${jobId}/details`; + public async results(jobId: string): Promise { + const url = `/einstein/ai-evaluations/runs/${jobId}/results`; - return this.maybeMock.request('GET', url); + return this.maybeMock.request('GET', url); } /** @@ -246,19 +251,38 @@ function makeSimpleTable(data: Record, title: string): string { return `${title}\n${table}`; } -export async function humanFormat(details: AgentTestDetailsResponse): Promise { +export async function convertTestResultsToFormat( + results: AgentTestResultsResponse, + format: 'human' | 'json' | 'junit' | 'tap' +): Promise { + switch (format) { + case 'human': + return humanFormat(results); + case 'json': + return jsonFormat(results); + case 'junit': + return junitFormat(results); + case 'tap': + return tapFormat(results); + default: + throw new Error(`Unsupported format: ${format as string}`); + } +} + +async function humanFormat(details: AgentTestResultsResponse): Promise { const { Ux } = await import('@salesforce/sf-plugins-core'); const ux = new Ux(); const tables: string[] = []; for (const testCase of details.testSet.testCases) { + const number = details.testSet.testCases.indexOf(testCase) + 1; const table = ux.makeTable({ - title: `${ansis.bold(`Test Case #${testCase.number}`)}\n${ansis.dim('Utterance')}: ${testCase.utterance}`, + title: `${ansis.bold(`Test Case #${number}`)}\n${ansis.dim('Utterance')}: ${testCase.utterance}`, overflow: 'wrap', columns: ['test', 'result', { key: 'expected', width: '40%' }, { key: 'actual', width: '40%' }], data: testCase.expectationResults.map((r) => ({ test: humanFriendlyName(r.name), - result: r.result === 'Passed' ? ansis.green('Pass') : ansis.red('Fail'), + result: r.result === 'PASS' ? ansis.green('Pass') : ansis.red('Fail'), expected: r.expectedValue, actual: r.actualValue, })), @@ -269,19 +293,19 @@ export async function humanFormat(details: AgentTestDetailsResponse): Promise { const topic = tc.expectationResults.find((r) => r.name === 'topic_sequence_match'); - return topic?.result === 'Passed' ? acc + 1 : acc; + return topic?.result === 'PASS' ? acc + 1 : acc; }, 0); const topicPassPercent = (topicPassCount / details.testSet.testCases.length) * 100; const actionPassCount = details.testSet.testCases.reduce((acc, tc) => { const action = tc.expectationResults.find((r) => r.name === 'action_sequence_match'); - return action?.result === 'Passed' ? acc + 1 : acc; + return action?.result === 'PASS' ? acc + 1 : acc; }, 0); const actionPassPercent = (actionPassCount / details.testSet.testCases.length) * 100; const outcomePassCount = details.testSet.testCases.reduce((acc, tc) => { const outcome = tc.expectationResults.find((r) => r.name === 'bot_response_rating'); - return outcome?.result === 'Passed' ? acc + 1 : acc; + return outcome?.result === 'PASS' ? acc + 1 : acc; }, 0); const outcomePassPercent = (outcomePassCount / details.testSet.testCases.length) * 100; @@ -297,12 +321,12 @@ export async function humanFormat(details: AgentTestDetailsResponse): Promise tc.status === 'ERROR'); + const failedTestCases = details.testSet.testCases.filter((tc) => tc.status.toLowerCase() === 'error'); const failedTestCasesObj = Object.fromEntries( Object.entries(failedTestCases).map(([, tc]) => [ - `Test Case #${tc.number}`, + `Test Case #${failedTestCases.indexOf(tc) + 1}`, tc.expectationResults - .filter((r) => r.result === 'Failed') + .filter((r) => r.result === 'FAILURE') .map((r) => humanFriendlyName(r.name)) .join(', '), ]) @@ -312,11 +336,11 @@ export async function humanFormat(details: AgentTestDetailsResponse): Promise { - return Promise.resolve(JSON.stringify(details, null, 2)); +async function jsonFormat(results: AgentTestResultsResponse): Promise { + return Promise.resolve(JSON.stringify(results, null, 2)); } -export async function junitFormat(details: AgentTestDetailsResponse): Promise { +async function junitFormat(results: AgentTestResultsResponse): Promise { // eslint-disable-next-line import/no-extraneous-dependencies const { XMLBuilder } = await import('fast-xml-parser'); const builder = new XMLBuilder({ @@ -325,9 +349,13 @@ export async function junitFormat(details: AgentTestDetailsResponse): Promise tc.status === 'ERROR').length; - const time = details.testSet.testCases.reduce((acc, tc) => { + const testCount = results.testSet.testCases.length; + const failureCount = results.testSet.testCases.filter( + (tc) => + ['error', 'completed'].includes(tc.status.toLowerCase()) && + tc.expectationResults.some((r) => r.result === 'FAILURE') + ).length; + const time = results.testSet.testCases.reduce((acc, tc) => { if (tc.endTime && tc.startTime) { return acc + new Date(tc.endTime).getTime() - new Date(tc.startTime).getTime(); } @@ -336,27 +364,27 @@ export async function junitFormat(details: AgentTestDetailsResponse): Promise { + testsuite: results.testSet.testCases.map((testCase) => { const testCaseTime = testCase.endTime ? new Date(testCase.endTime).getTime() - new Date(testCase.startTime).getTime() : 0; return { - $name: `${details.testSet.name}.${testCase.number}`, + $name: `${results.testSet.name}.${results.testSet.testCases.indexOf(testCase) + 1}`, $time: testCaseTime, $assertions: testCase.expectationResults.length, failure: testCase.expectationResults .map((r) => { - if (r.result === 'Failed') { + if (r.result === 'FAILURE') { return { $message: r.errorMessage ?? 'Unknown error', $name: r.name }; } }) @@ -369,14 +397,16 @@ export async function junitFormat(details: AgentTestDetailsResponse): Promise\n${suites}`.trim(); } -export async function tapFormat(details: AgentTestDetailsResponse): Promise { +async function tapFormat(results: AgentTestResultsResponse): Promise { const lines: string[] = []; let expectationCount = 0; - for (const testCase of details.testSet.testCases) { + for (const testCase of results.testSet.testCases) { for (const result of testCase.expectationResults) { - const status = result.result === 'Passed' ? 'ok' : 'not ok'; + const status = result.result === 'PASS' ? 'ok' : 'not ok'; expectationCount++; - lines.push(`${status} ${expectationCount} ${details.testSet.name}.${testCase.number}`); + lines.push( + `${status} ${expectationCount} ${results.testSet.name}.${results.testSet.testCases.indexOf(testCase) + 1}` + ); if (status === 'not ok') { lines.push(' ---'); lines.push(` message: ${result.errorMessage ?? 'Unknown error'}`); diff --git a/src/index.ts b/src/index.ts index 5d05bdc..c8fe293 100644 --- a/src/index.ts +++ b/src/index.ts @@ -21,11 +21,8 @@ export { export { Agent, AgentCreateLifecycleStages } from './agent'; export { AgentTester, - humanFormat, - jsonFormat, - junitFormat, - tapFormat, - type AgentTestDetailsResponse, + convertTestResultsToFormat, + type AgentTestResultsResponse, type AgentTestStartResponse, type AgentTestStatusResponse, type TestCaseResult, diff --git a/src/maybe-mock.ts b/src/maybe-mock.ts index 1f8d053..602d515 100644 --- a/src/maybe-mock.ts +++ b/src/maybe-mock.ts @@ -164,7 +164,7 @@ export class MaybeMock { this.logger.debug(`Making ${method} request to ${url}`); switch (method) { case 'GET': - return this.connection.requestGet(url, { retry: { maxRetries: 3 } }); + return this.connection.requestGet(url, { retry: { maxRetries: 10 } }); case 'POST': if (!body) { throw SfError.create({ diff --git a/test/agentTester.test.ts b/test/agentTester.test.ts index 205a780..1a8ac4b 100644 --- a/test/agentTester.test.ts +++ b/test/agentTester.test.ts @@ -8,7 +8,7 @@ import { readFile } from 'node:fs/promises'; import { expect } from 'chai'; import { MockTestOrgData, TestContext } from '@salesforce/core/testSetup'; import { Connection } from '@salesforce/core'; -import { AgentTestDetailsResponse, AgentTester, humanFormat, junitFormat, tapFormat } from '../src/agentTester'; +import { AgentTestResultsResponse, AgentTester, convertTestResultsToFormat } from '../src/agentTester'; describe('AgentTester', () => { const $$ = new TestContext(); @@ -45,7 +45,7 @@ describe('AgentTester', () => { const output = await tester.status('4KBSM000000003F4AQ'); expect(output).to.be.ok; expect(output).to.deep.equal({ - status: 'IN_PROGRESS', + status: 'InProgress', startTime: '2024-11-13T15:00:00.000Z', }); }); @@ -62,11 +62,11 @@ describe('AgentTester', () => { }); }); - describe('details', () => { - it('should return details of completed test run', async () => { + describe('results', () => { + it('should return results of completed test run', async () => { const tester = new AgentTester(connection); await tester.start('suiteId'); - const output = await tester.details('4KBSM000000003F4AQ'); + const output = await tester.results('4KBSM000000003F4AQ'); // TODO: make this assertion more meaningful expect(output).to.be.ok; }); @@ -82,23 +82,23 @@ describe('AgentTester', () => { }); }); -describe('humanFormat', () => { +describe('human format', () => { it('should transform test results to human readable format', async () => { - const raw = await readFile('./test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_details.json', 'utf8'); - const input = JSON.parse(raw) as AgentTestDetailsResponse; - const output = await humanFormat(input); + const raw = await readFile('./test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_results.json', 'utf8'); + const input = JSON.parse(raw) as AgentTestResultsResponse; + const output = await convertTestResultsToFormat(input, 'human'); expect(output).to.be.ok; }); }); -describe('junitFormatter', () => { +describe('junit formatter', () => { it('should transform test results to JUnit format', async () => { - const raw = await readFile('./test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_details.json', 'utf8'); - const input = JSON.parse(raw) as AgentTestDetailsResponse; - const output = await junitFormat(input); + const raw = await readFile('./test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_results.json', 'utf8'); + const input = JSON.parse(raw) as AgentTestResultsResponse; + const output = await convertTestResultsToFormat(input, 'junit'); expect(output).to.deep.equal(` - + @@ -110,11 +110,11 @@ describe('junitFormatter', () => { }); }); -describe('tapFormatter', () => { +describe('tap formatter', () => { it('should transform test results to TAP format', async () => { - const raw = await readFile('./test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_details.json', 'utf8'); - const input = JSON.parse(raw) as AgentTestDetailsResponse; - const output = await tapFormat(input); + const raw = await readFile('./test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_results.json', 'utf8'); + const input = JSON.parse(raw) as AgentTestResultsResponse; + const output = await convertTestResultsToFormat(input, 'tap'); expect(output).to.deep.equal(`Tap Version 14 1..6 ok 1 CRM_Sanity_v1.1 diff --git a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/1.json b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/1.json index daf2bbc..58716da 100644 --- a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/1.json +++ b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/1.json @@ -1,4 +1,4 @@ { - "status": "IN_PROGRESS", + "status": "InProgress", "startTime": "2024-11-13T15:00:00.000Z" } diff --git a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/2.json b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/2.json index daf2bbc..58716da 100644 --- a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/2.json +++ b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/2.json @@ -1,4 +1,4 @@ { - "status": "IN_PROGRESS", + "status": "InProgress", "startTime": "2024-11-13T15:00:00.000Z" } diff --git a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/3.json b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/3.json index d4f6503..88bd062 100644 --- a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/3.json +++ b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ/3.json @@ -1,4 +1,4 @@ { - "status": "COMPLETED", + "status": "Completed", "startTime": "2024-11-13T15:00:00.000Z" } diff --git a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_details.json b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_results.json similarity index 94% rename from test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_details.json rename to test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_results.json index b84e3ed..704b480 100644 --- a/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_details.json +++ b/test/mocks/einstein_ai-evaluations_runs_4KBSM000000003F4AQ_results.json @@ -1,5 +1,5 @@ { - "status": "COMPLETED", + "status": "Completed", "startTime": "2024-11-28T12:00:00Z", "endTime": "2024-11-28T12:00:48.56Z", "errorMessage": null, @@ -9,7 +9,6 @@ "testCases": [ { "status": "COMPLETED", - "number": 1, "utterance": "Summarize account Acme", "startTime": "2024-11-28T12:00:10Z", "endTime": "2024-11-28T12:00:20Z", @@ -27,7 +26,7 @@ "actualValue": "GeneralCRM", "expectedValue": "GeneralCRM", "score": 1.0, - "result": "Passed", + "result": "PASS", "metricLabel": "Accuracy", "metricExplainability": "Measures the correctness of the result.", "status": "Completed", @@ -41,7 +40,7 @@ "actualValue": "[\"IdentifyRecordByName\",\"SummarizeRecord\"]", "expectedValue": "[\"IdentifyRecordByName\",\"SummarizeRecord\"]", "score": 1.0, - "result": "Passed", + "result": "PASS", "metricLabel": "Precision", "metricExplainability": "Measures the precision of the result.", "status": "Completed", @@ -55,7 +54,7 @@ "actualValue": "Here is the summary of the account Acme. How else can I assist you? Acme is a customer since 2019. They have 3 open opportunities and 2 open cases.", "expectedValue": "Summary of account details are shown", "score": 0.9, - "result": "Passed", + "result": "PASS", "metricLabel": "Precision", "metricExplainability": "Measures the precision of the result.", "status": "Completed", @@ -67,8 +66,7 @@ ] }, { - "status": "ERROR", - "number": 2, + "status": "COMPLETED", "startTime": "2024-11-28T12:00:30Z", "utterance": "Summarize the open cases and Activities of acme from sep to nov 2024", "endTime": "2024-11-28T12:00:40Z", @@ -86,7 +84,7 @@ "actualValue": "GeneralCRM", "expectedValue": "GeneralCRM", "score": 1, - "result": "Passed", + "result": "PASS", "metricLabel": "Accuracy", "metricExplainability": "Measures the correctness of the result.", "status": "Completed", @@ -100,7 +98,7 @@ "actualValue": "[\"IdentifyRecordByName\",\"QueryRecords\"]", "expectedValue": "[\"IdentifyRecordByName\",\"QueryRecords\",\"GetActivitiesTimeline\"]", "score": 0.5, - "result": "Failed", + "result": "FAILURE", "metricLabel": "Precision", "metricExplainability": "Measures the precision of the result.", "status": "Completed", @@ -114,7 +112,7 @@ "actualValue": "It looks like I am unable to find the information you are looking for due to access restrictions. How else can I assist you?", "expectedValue": "Summary of open cases and activities associated with timeline", "score": 0.1, - "result": "Failed", + "result": "FAILURE", "metricLabel": "Precision", "metricExplainability": "Measures the precision of the result.", "status": "Completed",