Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(run): set pipeline run status as failed when component fails #836

Conversation

jvallesm
Copy link
Collaborator

@jvallesm jvallesm commented Nov 13, 2024

Because

  • When a component fails to execute due to, e.g., an invalid input, the status in the pipeline run is Completed.
  • When a pipeline fails in the PreTriggerActivity process (load information to memory, resolve secrets and connections, initialise templates), the pipeline run doesn't contain the input or snapshot recipe.

This commit

  • Marks the pipeline run and the component run as failed when any of the jobs in a component execution fails.
    • In the future we might want to have a more sophisticated way to handle errors. Perhaps a component can fail (e.g. another component is run given that condition as a fallback) and the pipeline execution can still be considered valid.
  • Splits PreTriggerActivity in smaller activities and triggers the upload (recipe / input) activities as soon as the data is ready in memory.

@jvallesm jvallesm self-assigned this Nov 13, 2024
Copy link

linear bot commented Nov 13, 2024

@jvallesm jvallesm marked this pull request as ready for review November 13, 2024 13:47
@jvallesm jvallesm force-pushed the jvalles/ins-6778-bug-display-completed-status-for-pipeline-runs-with branch from 33442a7 to 4ba45c0 Compare November 14, 2024 07:09
…gger

Because

- When `PreTriggerActivity` fails, the pipeline run log doesn't contain the recipe or input information.

This commit

- Moves the input and recipe upload before the pre-trigger activity

Before (failed trigger due to wrong connection reference):

```sh
curl   http://localhost:8080/v1beta/namespaces/admin/pipelines/opp/runs\?pageSize\=100 \
--header "Content-Type: application/json" \
--header "Authorization: Bearer instill_sk_wHPobuR0xFB8nOiqGY4rXi2N2BqxCKQ1" |  jq '.pipelineRuns[0] | {at: .startTime, in: .inputs, recipe: .recipeSnapshot}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7337  100  7337    0     0  80033      0 --:--:-- --:--:-- --:--:-- 80626
{
  "at": "2024-11-14T08:56:05.247033Z",
  "in": [],
  "recipe": null
}
```

After:

```sh
curl   http://localhost:8080/v1beta/namespaces/admin/pipelines/opp/runs\?pageSize\=100 \
--header "Content-Type: application/json" \
--header "Authorization: Bearer instill_sk_wHPobuR0xFB8nOiqGY4rXi2N2BqxCKQ1" |  jq '.pipelineRuns[0] | {at: .startTime, in: .inputs, recipe: .recipeSnapshot}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12473  100 12473    0     0   185k      0 --:--:-- --:--:-- --:--:--  187k
{
  "at": "2024-11-14T10:14:53.695775Z",
  "in": [
    {
      "prompt": "How many inhabitants does Pamplona have?"
    }
  ],
  "recipe": {
    "component": {
      "openai-0": {
        "input": {
          "chat-history": null,
          "frequency-penalty": null,
          "images": null,
          "max-tokens": null,
          "model": "gpt-4o",
          "n": 1,
          "presence-penalty": null,
          "prompt": "${variable.prompt}",
          "response-format": {
            "type": "text"
          },
          "system-message": "You are a helpful assistant.",
          "temperature": 1,
          "top-p": 1
        },
        "setup": "${connection.oo}",
        "task": "TASK_TEXT_GENERATION",
        "type": "openai"
      }
    },
    "output": {
      "resp": {
        "title": "Response",
        "value": "${openai-0.output.texts}"
      }
    },
    "variable": {
      "prompt": {
        "format": "string"
      }
    },
    "version": "v1beta"
  }
}
```
donch1989
donch1989 previously approved these changes Nov 15, 2024
@jvallesm
Copy link
Collaborator Author

jvallesm commented Nov 15, 2024

QA ✅

Pipeline run status on component failure

Before (jq task on JSON component with invalid input JSON):

curl   http://localhost:8080/v1beta/namespaces/admin/pipelines/jq-error/runs\?pageSize\=100 \
--header "Content-Type: application/json" \
--header "Authorization: Bearer instill_sk_wHPobuR0xFB8nOiqGY4rXi2N2BqxCKQ1" |  jq '.pipelineRuns[0] | {status: .status, in: .inputs}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 26542  100 26542    0     0   122k      0 --:--:-- --:--:-- --:--:--  122k
{
  "status": "RUN_STATUS_COMPLETED",
  "in": [
    {
      "json": "{\"foo: \"bar\"}"
    }
  ]
}

After:

curl   http://localhost:8080/v1beta/namespaces/admin/pipelines/jq-error/runs\?pageSize\=100 \
--header "Content-Type: application/json" \
--header "Authorization: Bearer instill_sk_wHPobuR0xFB8nOiqGY4rXi2N2BqxCKQ1" |  jq '.pipelineRuns[0] | {status: .status, in: .inputs}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 26542  100 26542    0     0   122k      0 --:--:-- --:--:-- --:--:--  122k
{
  "status": "RUN_STATUS_FAILED",
  "in": [
    {
      "json": "{foo: \"bar\"}"
    }
  ]
}

Missing input and recipe

Before (failed trigger due to wrong connection reference):

curl   http://localhost:8080/v1beta/namespaces/admin/pipelines/opp/runs\?pageSize\=100 \
--header "Content-Type: application/json" \
--header "Authorization: Bearer instill_sk_wHPobuR0xFB8nOiqGY4rXi2N2BqxCKQ1" |  jq '.pipelineRuns[0] | {at: .startTime, in: .inputs, recipe: .recipeSnapshot}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7337  100  7337    0     0  80033      0 --:--:-- --:--:-- --:--:-- 80626
{
  "at": "2024-11-14T08:56:05.247033Z",
  "in": [],
  "recipe": null
}

After:

curl   http://localhost:8080/v1beta/namespaces/admin/pipelines/opp/runs\?pageSize\=100 \
--header "Content-Type: application/json" \
--header "Authorization: Bearer instill_sk_wHPobuR0xFB8nOiqGY4rXi2N2BqxCKQ1" |  jq '.pipelineRuns[0] | {at: .startTime, in: .inputs, recipe: .recipeSnapshot}'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12473  100 12473    0     0   185k      0 --:--:-- --:--:-- --:--:--  187k
{
  "at": "2024-11-14T10:14:53.695775Z",
  "in": [
    {
      "prompt": "How many inhabitants does Pamplona have?"
    }
  ],
  "recipe": {
    "component": {
      "openai-0": {
        "input": {
          "chat-history": null,
          "frequency-penalty": null,
          "images": null,
          "max-tokens": null,
          "model": "gpt-4o",
          "n": 1,
          "presence-penalty": null,
          "prompt": "${variable.prompt}",
          "response-format": {
            "type": "text"
          },
          "system-message": "You are a helpful assistant.",
          "temperature": 1,
          "top-p": 1
        },
        "setup": "${connection.oo}",
        "task": "TASK_TEXT_GENERATION",
        "type": "openai"
      }
    },
    "output": {
      "resp": {
        "title": "Response",
        "value": "${openai-0.output.texts}"
      }
    },
    "variable": {
      "prompt": {
        "format": "string"
      }
    },
    "version": "v1beta"
  }
}

logger.Error("Failed to upload pipeline run recipe", zap.Error(err))
}

// TODO group everything under condition
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this wasn't meant to be the final state of the changes. I had (staged but) uncommitted changes to upload the recipe and input only under the TriggerFromAPI condition. They were meant to be squashed into this commit but I added them separately in the next commit to ease the review.

@jvallesm jvallesm merged commit 70a5c52 into main Nov 15, 2024
12 checks passed
@jvallesm jvallesm deleted the jvalles/ins-6778-bug-display-completed-status-for-pipeline-runs-with branch November 15, 2024 07:44
donch1989 pushed a commit that referenced this pull request Nov 20, 2024
🤖 I have created a release *beep* *boop*
---


##
[0.47.0-beta](v0.46.0-beta...v0.47.0-beta)
(2024-11-20)


### Features

* **component:** add support for event specifications
([#837](#837))
([47a61cd](47a61cd))
* **component:** implement run-on-event for Slack and GitHub component
([#842](#842))
([1b6a569](1b6a569))
* convert pdf to image concurrently
([#818](#818))
([4c0ad97](4c0ad97))
* improve markdown chunking
([#822](#822))
([af1a36a](af1a36a))
* **json:** Support Rename Fields for JSON operator
([#813](#813))
([093714e](093714e))
* **recipe:** refactor run-on-event recipe structure
([#835](#835))
([78ea418](78ea418))
* **recipe:** rename `instill-format` to `format`
([#798](#798))
([80a9fc9](80a9fc9))
* **service:** implement PipelineErrorUpdated streaming event for
pipeline errors
([#846](#846))
([3156a5f](3156a5f))
* **vdp:** integrate blob storage to vdp
([#834](#834))
([5311549](5311549))
* **web:** add input schema to improve web operator
([#819](#819))
([f7e1fe9](f7e1fe9))


### Bug Fixes

* **data:** refactor numberData to support both float and integer types
([#832](#832))
([cf27452](cf27452))
* **document:** fix bug about convert to image
([#848](#848))
([a381c27](a381c27))
* fix bug about unit type
([#826](#826))
([a89fdf7](a89fdf7))
* **integration-test:** maximize build space on image build & push
([#823](#823))
([a439d22](a439d22))
* **run:** set pipeline run status as failed when component fails
([#836](#836))
([70a5c52](70a5c52))
* **service:** add MIME type detection in the backend binaryFetcher
([#854](#854))
([f434b2b](f434b2b))
* **service:** add missing nil check in includeIteratorComponentDetail()
([#831](#831))
([9cb5e9e](9cb5e9e))
* **service:** skip empty component definition in API response
([#847](#847))
([d61b55e](d61b55e))
* unit tests
([#820](#820))
([717200c](717200c))
* **vdp:** item does not contain the instill format, so we insert it
([#858](#858))
([2d25401](2d25401))
* **workflow:** allow integration usage within iterator
([#833](#833))
([c9bd169](c9bd169))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: No status
Development

Successfully merging this pull request may close these issues.

3 participants