W-17237086 fix: updates based on live API #26

Merged on Jan 17, 2025 (7 commits).
Changes from 6 commits
22 changes: 8 additions & 14 deletions CONTRIBUTING.md
@@ -1,16 +1,16 @@
 ## Contributing

 1. Familiarize yourself with the codebase by reading the docs, in
-particular the [developing](./contributing/developing.md) doc.
-1. Create a new issue before starting your project so that we can keep track of
+particular the [developing](./developing.md) doc.
+2. Create a new issue before starting your project so that we can keep track of
 what you're trying to add/fix. That way, we can also offer suggestions or
 let you know if there is already an effort in progress.
-1. Fork this repository.
-1. Set up your environment using the information in the [developing](./contributing/developing.md) doc.
-1. Create a _topic_ branch in your fork based on the correct branch (usually the **develop** branch, see [Branches section](./contributing/developing.md)). Note: this step is recommended but technically not required if contributing using a fork.
-1. Edit the code in your fork.
-1. Sign the CLA (see [CLA](#cla)).
-1. Send us a pull request when you're done. We'll review your code, suggest any
+3. Fork this repository.
+4. Set up your environment using the information in the [developing](./developing.md) doc.
+5. Create a _topic_ branch in your fork based on the correct branch (usually the **develop** branch, see [Branches section](./developing.md)). Note: this step is recommended but technically not required if contributing using a fork.
+6. Edit the code in your fork.
+7. Sign the CLA (see [CLA](#cla)).
+8. Send us a pull request when you're done. We'll review your code, suggest any
 needed changes, and merge it in.

 ## Pull Requests
@@ -31,9 +31,3 @@ Agreement. You can do so by going to <https://cla.salesforce.com/sign-cla>.
 ### Merging Pull Requests

 Pull request merging is restricted to squash and merge only.
-
-## Helpful Resources
-
-- All of the files in the [contributing](./contributing) folder have useful information, particularly the previously-mentioned [developing](./contributing/developing.md) doc.
-- The [Source-Deploy-Retrieve Handbook](./HANDBOOK.md) contains an overview of all of the code in this project. This easy-to-read document can serve as an introduction and overview of the code and concepts, or as a reference for what a given module accomplishes and why it was designed.
-- The [API documentation](https://forcedotcom.github.io/source-deploy-retrieve/) has details on using the classes and methods.
2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
 {
   "name": "@salesforce/agents",
   "description": "Client side APIs for working with Salesforce agents",
-  "version": "0.5.9",
+  "version": "0.5.10-dev.0",
   "license": "BSD-3-Clause",
   "author": "Salesforce",
   "main": "lib/index",
144 changes: 87 additions & 57 deletions src/agentTester.ts
@@ -9,7 +9,7 @@ import { Duration, env } from '@salesforce/kit';
 import ansis from 'ansis';
 import { MaybeMock } from './maybe-mock';

-export type TestStatus = 'NEW' | 'IN_PROGRESS' | 'COMPLETED' | 'ERROR';
+export type TestStatus = 'New' | 'InProgress' | 'Completed' | 'Error';

 export type AgentTestStartResponse = {
   aiEvaluationId: string;
@@ -25,14 +25,13 @@ export type AgentTestStatusResponse = {

 export type TestCaseResult = {
   status: TestStatus;
-  number: string;
   utterance: string;
   startTime: string;
   endTime?: string;
   generatedData: {
     type: 'AGENT';
     actionsSequence: string[];
-    outcome: 'Success' | 'Failure';
+    outcome: string;
     topic: string;
     inputTokensCount: string;
     outputTokensCount: string;
@@ -42,7 +41,7 @@ export type TestCaseResult = {
     actualValue: string;
     expectedValue: string;
     score: number;
-    result: 'Passed' | 'Failed';
+    result: 'PASS' | 'FAILURE';
     metricLabel: 'Accuracy' | 'Precision';
     metricExplainability: string;
     status: TestStatus;
@@ -53,7 +52,7 @@ export type TestCaseResult = {
   }>;
 };

-export type AgentTestDetailsResponse = {
+export type AgentTestResultsResponse = {
   status: TestStatus;
   startTime: string;
   endTime?: string;
@@ -106,7 +105,7 @@ export class AgentTester {
    *
    * @param {string} jobId
    * @param {Duration} timeout
-   * @returns {Promise<AgentTestDetailsResponse>}
+   * @returns {Promise<AgentTestResultsResponse>}
    */
   public async poll(
     jobId: string,
@@ -117,57 +116,63 @@ export class AgentTester {
     } = {
       timeout: Duration.minutes(5),
     }
-  ): Promise<AgentTestDetailsResponse> {
+  ): Promise<AgentTestResultsResponse> {
     const frequency = env.getNumber('SF_AGENT_TEST_POLLING_FREQUENCY_MS', 1000);
     const lifecycle = Lifecycle.getInstance();
     const client = await PollingClient.create({
       poll: async (): Promise<StatusResult> => {
-        // NOTE: we don't actually need to call the status API here since all the same information is present on the
-        // details API. We could just call the details API and check the status there.
-        const [detailsResponse, statusResponse] = await Promise.all([this.details(jobId), this.status(jobId)]);
-        const totalTestCases = detailsResponse.testSet.testCases.length;
-        const failingTestCases = detailsResponse.testSet.testCases.filter((tc) => tc.status === 'ERROR').length;
-        const passingTestCases = detailsResponse.testSet.testCases.filter(
-          (tc) => tc.status === 'COMPLETED' && tc.expectationResults.every((r) => r.result === 'Passed')
-        ).length;
-
-        if (statusResponse.status.toLowerCase() === 'completed') {
+        const statusResponse = await this.status(jobId);
+        if (statusResponse.status.toLowerCase() !== 'new') {
+          const resultsResponse = await this.results(jobId);
+          const totalTestCases = resultsResponse.testSet.testCases.length;
+          const passingTestCases = resultsResponse.testSet.testCases.filter(
+            (tc) => tc.status.toLowerCase() === 'completed' && tc.expectationResults.every((r) => r.result === 'PASS')
+          ).length;
+          const failingTestCases = resultsResponse.testSet.testCases.filter(
+            (tc) =>
+              ['error', 'completed'].includes(tc.status.toLowerCase()) &&
+              tc.expectationResults.some((r) => r.result === 'FAILURE')
+          ).length;
+
+          if (resultsResponse.status.toLowerCase() === 'completed') {
+            await lifecycle.emit('AGENT_TEST_POLLING_EVENT', {
+              jobId,
+              status: resultsResponse.status,
+              totalTestCases,
+              failingTestCases,
+              passingTestCases,
+            });
+            return { payload: resultsResponse, completed: true };
+          }
+
           await lifecycle.emit('AGENT_TEST_POLLING_EVENT', {
             jobId,
-            status: statusResponse.status,
+            status: resultsResponse.status,
             totalTestCases,
             failingTestCases,
             passingTestCases,
           });
-          return { payload: detailsResponse, completed: true };
         }

-        await lifecycle.emit('AGENT_TEST_POLLING_EVENT', {
-          jobId,
-          status: statusResponse.status,
-          totalTestCases,
-          failingTestCases,
-          passingTestCases,
-        });
         return { completed: false };
       },
       frequency: Duration.milliseconds(frequency),
       timeout,
     });

-    return client.subscribe<AgentTestDetailsResponse>();
+    return client.subscribe<AgentTestResultsResponse>();
   }

   /**
    * Request test run details
    *
    * @param {string} jobId
-   * @returns {Promise<AgentTestDetailsResponse>}
+   * @returns {Promise<AgentTestResultsResponse>}
    */
-  public async details(jobId: string): Promise<AgentTestDetailsResponse> {
-    const url = `/einstein/ai-evaluations/runs/${jobId}/details`;
+  public async results(jobId: string): Promise<AgentTestResultsResponse> {
+    const url = `/einstein/ai-evaluations/runs/${jobId}/results`;

-    return this.maybeMock.request<AgentTestDetailsResponse>('GET', url);
+    return this.maybeMock.request<AgentTestResultsResponse>('GET', url);
   }

   /**
@@ -246,19 +251,38 @@ function makeSimpleTable(data: Record<string, string>, title: string): string {
   return `${title}\n${table}`;
 }

-export async function humanFormat(details: AgentTestDetailsResponse): Promise<string> {
+export async function convertTestResultsToFormat(
+  results: AgentTestResultsResponse,
+  format: 'human' | 'json' | 'junit' | 'tap'
+): Promise<string> {
+  switch (format) {
+    case 'human':
+      return humanFormat(results);
+    case 'json':
+      return jsonFormat(results);
+    case 'junit':
+      return junitFormat(results);
+    case 'tap':
+      return tapFormat(results);
+    default:
+      throw new Error(`Unsupported format: ${format as string}`);
+  }
+}
+
+async function humanFormat(details: AgentTestResultsResponse): Promise<string> {
   const { Ux } = await import('@salesforce/sf-plugins-core');
   const ux = new Ux();

   const tables: string[] = [];
   for (const testCase of details.testSet.testCases) {
+    const number = details.testSet.testCases.indexOf(testCase) + 1;
     const table = ux.makeTable({
-      title: `${ansis.bold(`Test Case #${testCase.number}`)}\n${ansis.dim('Utterance')}: ${testCase.utterance}`,
+      title: `${ansis.bold(`Test Case #${number}`)}\n${ansis.dim('Utterance')}: ${testCase.utterance}`,
       overflow: 'wrap',
       columns: ['test', 'result', { key: 'expected', width: '40%' }, { key: 'actual', width: '40%' }],
       data: testCase.expectationResults.map((r) => ({
         test: humanFriendlyName(r.name),
-        result: r.result === 'Passed' ? ansis.green('Pass') : ansis.red('Fail'),
+        result: r.result === 'PASS' ? ansis.green('Pass') : ansis.red('Fail'),
         expected: r.expectedValue,
         actual: r.actualValue,
       })),
@@ -269,19 +293,19 @@ export async function humanFormat(details: AgentTestDetailsResponse): Promise<st

   const topicPassCount = details.testSet.testCases.reduce((acc, tc) => {
     const topic = tc.expectationResults.find((r) => r.name === 'topic_sequence_match');
-    return topic?.result === 'Passed' ? acc + 1 : acc;
+    return topic?.result === 'PASS' ? acc + 1 : acc;
   }, 0);
   const topicPassPercent = (topicPassCount / details.testSet.testCases.length) * 100;

   const actionPassCount = details.testSet.testCases.reduce((acc, tc) => {
     const action = tc.expectationResults.find((r) => r.name === 'action_sequence_match');
-    return action?.result === 'Passed' ? acc + 1 : acc;
+    return action?.result === 'PASS' ? acc + 1 : acc;
   }, 0);
   const actionPassPercent = (actionPassCount / details.testSet.testCases.length) * 100;

   const outcomePassCount = details.testSet.testCases.reduce((acc, tc) => {
     const outcome = tc.expectationResults.find((r) => r.name === 'bot_response_rating');
-    return outcome?.result === 'Passed' ? acc + 1 : acc;
+    return outcome?.result === 'PASS' ? acc + 1 : acc;
   }, 0);
   const outcomePassPercent = (outcomePassCount / details.testSet.testCases.length) * 100;

@@ -297,12 +321,12 @@ export async function humanFormat(details: AgentTestDetailsResponse): Promise<st

   const resultsTable = makeSimpleTable(results, ansis.bold.blue('Test Results'));

-  const failedTestCases = details.testSet.testCases.filter((tc) => tc.status === 'ERROR');
+  const failedTestCases = details.testSet.testCases.filter((tc) => tc.status.toLowerCase() === 'error');
   const failedTestCasesObj = Object.fromEntries(
     Object.entries(failedTestCases).map(([, tc]) => [
-      `Test Case #${tc.number}`,
+      `Test Case #${failedTestCases.indexOf(tc) + 1}`,
       tc.expectationResults
-        .filter((r) => r.result === 'Failed')
+        .filter((r) => r.result === 'FAILURE')
         .map((r) => humanFriendlyName(r.name))
         .join(', '),
     ])
@@ -312,11 +336,11 @@ export async function humanFormat(details: AgentTestDetailsResponse): Promise<st
   return tables.join('\n') + `\n${resultsTable}\n\n${failedTestCasesTable}\n`;
 }

-export async function jsonFormat(details: AgentTestDetailsResponse): Promise<string> {
-  return Promise.resolve(JSON.stringify(details, null, 2));
+async function jsonFormat(results: AgentTestResultsResponse): Promise<string> {
+  return Promise.resolve(JSON.stringify(results, null, 2));
 }

-export async function junitFormat(details: AgentTestDetailsResponse): Promise<string> {
+async function junitFormat(results: AgentTestResultsResponse): Promise<string> {
   // eslint-disable-next-line import/no-extraneous-dependencies
   const { XMLBuilder } = await import('fast-xml-parser');
   const builder = new XMLBuilder({
@@ -325,9 +349,13 @@ export async function junitFormat(details: AgentTestDetailsResponse): Promise<st
     ignoreAttributes: false,
   });

-  const testCount = details.testSet.testCases.length;
-  const failureCount = details.testSet.testCases.filter((tc) => tc.status === 'ERROR').length;
-  const time = details.testSet.testCases.reduce((acc, tc) => {
+  const testCount = results.testSet.testCases.length;
+  const failureCount = results.testSet.testCases.filter(
+    (tc) =>
+      ['error', 'completed'].includes(tc.status.toLowerCase()) &&
+      tc.expectationResults.some((r) => r.result === 'FAILURE')
+  ).length;
+  const time = results.testSet.testCases.reduce((acc, tc) => {
     if (tc.endTime && tc.startTime) {
       return acc + new Date(tc.endTime).getTime() - new Date(tc.startTime).getTime();
     }
@@ -336,27 +364,27 @@ export async function junitFormat(details: AgentTestDetailsResponse): Promise<st

   const suites = builder.build({
     testsuites: {
-      $name: details.subjectName,
+      $name: results.subjectName,
       $tests: testCount,
       $failures: failureCount,
       $time: time,
       property: [
-        { $name: 'status', $value: details.status },
-        { $name: 'start-time', $value: details.startTime },
-        { $name: 'end-time', $value: details.endTime },
+        { $name: 'status', $value: results.status },
+        { $name: 'start-time', $value: results.startTime },
+        { $name: 'end-time', $value: results.endTime },
       ],
-      testsuite: details.testSet.testCases.map((testCase) => {
+      testsuite: results.testSet.testCases.map((testCase) => {
         const testCaseTime = testCase.endTime
           ? new Date(testCase.endTime).getTime() - new Date(testCase.startTime).getTime()
           : 0;

         return {
-          $name: `${details.testSet.name}.${testCase.number}`,
+          $name: `${results.testSet.name}.${results.testSet.testCases.indexOf(testCase) + 1}`,
           $time: testCaseTime,
           $assertions: testCase.expectationResults.length,
           failure: testCase.expectationResults
             .map((r) => {
-              if (r.result === 'Failed') {
+              if (r.result === 'FAILURE') {
                 return { $message: r.errorMessage ?? 'Unknown error', $name: r.name };
               }
             })
@@ -369,14 +397,16 @@ export async function junitFormat(details: AgentTestDetailsResponse): Promise<st
   return `<?xml version="1.0" encoding="UTF-8"?>\n${suites}`.trim();
 }

-export async function tapFormat(details: AgentTestDetailsResponse): Promise<string> {
+async function tapFormat(results: AgentTestResultsResponse): Promise<string> {
   const lines: string[] = [];
   let expectationCount = 0;
-  for (const testCase of details.testSet.testCases) {
+  for (const testCase of results.testSet.testCases) {
     for (const result of testCase.expectationResults) {
-      const status = result.result === 'Passed' ? 'ok' : 'not ok';
+      const status = result.result === 'PASS' ? 'ok' : 'not ok';
       expectationCount++;
-      lines.push(`${status} ${expectationCount} ${details.testSet.name}.${testCase.number}`);
+      lines.push(
+        `${status} ${expectationCount} ${results.testSet.name}.${results.testSet.testCases.indexOf(testCase) + 1}`
+      );
       if (status === 'not ok') {
         lines.push(' ---');
         lines.push(` message: ${result.errorMessage ?? 'Unknown error'}`);
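Review note on the counting and numbering logic in `src/agentTester.ts`: the pass/fail predicates now compare statuses case-insensitively, and test cases are numbered by position because `number` was dropped from `TestCaseResult`. One thing that may be worth checking is that `humanFormat` computes `failedTestCases.indexOf(tc) + 1` against the already-filtered list, so the numbers in the failed-cases table can differ from the `Test Case #N` table titles. A minimal sketch of one way to keep the original position through a filter follows; the `summarize` helper and the trimmed-down types are hypothetical and not part of this PR.

```typescript
// Hypothetical, trimmed-down shapes for illustration only;
// the full types are declared in src/agentTester.ts.
type ExpectationResult = { name: string; result: 'PASS' | 'FAILURE' };
type TestCase = { status: string; expectationResults: ExpectationResult[] };

type Summary = { total: number; passing: number; failing: number; failedNumbers: number[] };

// Hypothetical helper: applies the same predicates as the diff, but pairs each
// test case with its 1-based position BEFORE filtering so the number survives.
function summarize(testCases: TestCase[]): Summary {
  const numbered = testCases.map((tc, index) => ({ tc, number: index + 1 }));

  // Passing: status is "completed" (any casing) and every expectation passed.
  const passing = numbered.filter(
    ({ tc }) => tc.status.toLowerCase() === 'completed' && tc.expectationResults.every((r) => r.result === 'PASS')
  );

  // Failing: status is "error" or "completed" and at least one expectation failed.
  const failing = numbered.filter(
    ({ tc }) =>
      ['error', 'completed'].includes(tc.status.toLowerCase()) &&
      tc.expectationResults.some((r) => r.result === 'FAILURE')
  );

  return {
    total: testCases.length,
    passing: passing.length,
    failing: failing.length,
    failedNumbers: failing.map(({ number }) => number),
  };
}

const summary = summarize([
  { status: 'Completed', expectationResults: [{ name: 'topic_sequence_match', result: 'PASS' }] },
  { status: 'COMPLETED', expectationResults: [{ name: 'action_sequence_match', result: 'FAILURE' }] },
  { status: 'InProgress', expectationResults: [] },
  { status: 'Error', expectationResults: [{ name: 'bot_response_rating', result: 'FAILURE' }] },
]);

console.log(summary);
```

In this sample the failed cases keep their original positions (2 and 4) rather than being renumbered 1 and 2. Note also that, with these predicates, a completed test case with an empty `expectationResults` array counts as passing, since `every` is true for an empty array.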
7 changes: 2 additions & 5 deletions src/index.ts
@@ -16,11 +16,8 @@ export {
 export { Agent, AgentCreateLifecycleStages } from './agent';
 export {
   AgentTester,
-  humanFormat,
-  jsonFormat,
-  junitFormat,
-  tapFormat,
-  type AgentTestDetailsResponse,
+  convertTestResultsToFormat,
+  type AgentTestResultsResponse,
   type AgentTestStartResponse,
   type AgentTestStatusResponse,
   type TestCaseResult,
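Review note on the export change: callers that previously imported `humanFormat`, `jsonFormat`, `junitFormat`, or `tapFormat` now go through `convertTestResultsToFormat(results, format)`, whose `format` parameter is the union `'human' | 'json' | 'junit' | 'tap'`. A caller holding an untyped string (for example, from a command-line flag) may want to narrow it before the call. The sketch below shows one possible way to do that; `FORMATS`, `isFormat`, and `parseFormat` are hypothetical names and are not exported by `@salesforce/agents`.

```typescript
// The four format names accepted by convertTestResultsToFormat in this PR.
const FORMATS = ['human', 'json', 'junit', 'tap'] as const;
type Format = (typeof FORMATS)[number];

// Hypothetical type guard: true only for one of the four supported names.
function isFormat(value: string): value is Format {
  return (FORMATS as readonly string[]).includes(value);
}

// Hypothetical helper: narrows an arbitrary string to Format or throws,
// using the same message shape as the default branch in the diff.
function parseFormat(value: string): Format {
  if (!isFormat(value)) {
    throw new Error(`Unsupported format: ${value}`);
  }
  return value;
}

const chosen: Format = parseFormat('tap');
console.log(chosen);
```

With the value narrowed this way, the `default` branch in `convertTestResultsToFormat` would only be reached by callers that bypass the type system.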
2 changes: 1 addition & 1 deletion src/maybe-mock.ts
@@ -164,7 +164,7 @@ export class MaybeMock {
     this.logger.debug(`Making ${method} request to ${url}`);
     switch (method) {
       case 'GET':
-        return this.connection.requestGet<T>(url, { retry: { maxRetries: 3 } });
+        return this.connection.requestGet<T>(url, { retry: { maxRetries: 10 } });
       case 'POST':
         if (!body) {
           throw SfError.create({