AWS Glue Update: Add ability to manually resume workflows in AWS Glue providing customers further control over the orchestration of ETL workloads.
AWS committed Jul 27, 2020
1 parent f3247ae commit 033e170
Show file tree
Hide file tree
Showing 2 changed files with 79 additions and 13 deletions.
5 changes: 5 additions & 0 deletions .changes/next-release/feature-AWSGlue-fd303b2.json
@@ -0,0 +1,5 @@
{
"type": "feature",
"category": "AWS Glue",
"description": "Add ability to manually resume workflows in AWS Glue providing customers further control over the orchestration of ETL workloads."
}
87 changes: 74 additions & 13 deletions services/glue/src/main/resources/codegen-resources/service-2.json
@@ -1640,6 +1640,24 @@
],
"documentation":"<p>Resets a bookmark entry.</p>"
},
"ResumeWorkflowRun":{
"name":"ResumeWorkflowRun",
"http":{
"method":"POST",
"requestUri":"/"
},
"input":{"shape":"ResumeWorkflowRunRequest"},
"output":{"shape":"ResumeWorkflowRunResponse"},
"errors":[
{"shape":"InvalidInputException"},
{"shape":"EntityNotFoundException"},
{"shape":"InternalServiceException"},
{"shape":"OperationTimeoutException"},
{"shape":"ConcurrentRunsExceededException"},
{"shape":"IllegalWorkflowStateException"}
],
"documentation":"<p>Restarts any completed nodes in a workflow run and resumes the run execution.</p>"
},
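Once the SDK is regenerated from this model, the operation should surface on `GlueClient` as a `resumeWorkflowRun` method with builder-style request and response types. A minimal sketch of a call in the AWS SDK for Java v2, with hedged placeholders for the workflow name, run ID, and node IDs (none of these identifiers come from the commit):

```java
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.ConcurrentRunsExceededException;
import software.amazon.awssdk.services.glue.model.IllegalWorkflowStateException;
import software.amazon.awssdk.services.glue.model.ResumeWorkflowRunRequest;
import software.amazon.awssdk.services.glue.model.ResumeWorkflowRunResponse;

public class ResumeWorkflowRunExample {
    public static void main(String[] args) {
        try (GlueClient glue = GlueClient.create()) {
            ResumeWorkflowRunResponse response = glue.resumeWorkflowRun(
                ResumeWorkflowRunRequest.builder()
                    .name("my-etl-workflow")   // hypothetical workflow name
                    .runId("wr_example")       // placeholder run ID of the original run
                    .nodeIds("node-1")         // node(s) to restart
                    .build());
            // The resumed execution gets a fresh run ID.
            System.out.println("Resumed as new run: " + response.runId());
        } catch (IllegalWorkflowStateException e) {
            // Thrown when the run is not in a state that allows resumption.
            System.err.println("Workflow run cannot be resumed: " + e.getMessage());
        } catch (ConcurrentRunsExceededException e) {
            System.err.println("Too many concurrent runs: " + e.getMessage());
        }
    }
}
```

The error shapes listed in the operation map to generated exception classes, which is why `IllegalWorkflowStateException` and `ConcurrentRunsExceededException` can be caught directly.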
"SearchTables":{
"name":"SearchTables",
"http":{
@@ -3164,7 +3182,7 @@
},
"State":{
"shape":"JobRunState",
"documentation":"<p>The condition state. Currently, the only job states that a trigger can listen for are <code>SUCCEEDED</code>, <code>STOPPED</code>, <code>FAILED</code>, and <code>TIMEOUT</code>. The only crawler states that a trigger can listen for are <code>SUCCEEDED</code>, <code>FAILED</code>, and <code>CANCELLED</code>.</p>"
"documentation":"<p>The condition state. Currently, the values supported are <code>SUCCEEDED</code>, <code>STOPPED</code>, <code>TIMEOUT</code>, and <code>FAILED</code>.</p>"
},
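For context on where this `State` field is used: a conditional trigger's predicate watches these states. A sketch of creating such a trigger with the Java v2 SDK, assuming hypothetical job and trigger names:

```java
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.Action;
import software.amazon.awssdk.services.glue.model.Condition;
import software.amazon.awssdk.services.glue.model.CreateTriggerRequest;
import software.amazon.awssdk.services.glue.model.JobRunState;
import software.amazon.awssdk.services.glue.model.LogicalOperator;
import software.amazon.awssdk.services.glue.model.Predicate;
import software.amazon.awssdk.services.glue.model.TriggerType;

public class ConditionalTriggerExample {
    public static void main(String[] args) {
        try (GlueClient glue = GlueClient.create()) {
            glue.createTrigger(CreateTriggerRequest.builder()
                .name("run-load-after-extract")            // hypothetical trigger name
                .type(TriggerType.CONDITIONAL)
                .predicate(Predicate.builder()
                    .conditions(Condition.builder()
                        .logicalOperator(LogicalOperator.EQUALS)
                        .jobName("extract-job")            // upstream job to watch
                        .state(JobRunState.SUCCEEDED)      // one of the supported condition states
                        .build())
                    .build())
                .actions(Action.builder().jobName("load-job").build())
                .startOnCreation(true)
                .build());
        }
    }
}
```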
"CrawlerName":{
"shape":"NameString",
@@ -4011,7 +4029,7 @@
},
"MaxCapacity":{
"shape":"NullableDouble",
"documentation":"<p>The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the <a href=\"https://aws.amazon.com/glue/pricing/\">AWS Glue pricing page</a>.</p> <p>Do not set <code>Max Capacity</code> if using <code>WorkerType</code> and <code>NumberOfWorkers</code>.</p> <p>The value that can be allocated for <code>MaxCapacity</code> depends on whether you are running a Python shell job or an Apache Spark ETL job:</p> <ul> <li> <p>When you specify a Python shell job (<code>JobCommand.Name</code>=\"pythonshell\"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.</p> </li> <li> <p>When you specify an Apache Spark ETL job (<code>JobCommand.Name</code>=\"glueetl\"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.</p> </li> </ul>"
"documentation":"<p>The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the <a href=\"https://aws.amazon.com/glue/pricing/\">AWS Glue pricing page</a>.</p> <p>Do not set <code>Max Capacity</code> if using <code>WorkerType</code> and <code>NumberOfWorkers</code>.</p> <p>The value that can be allocated for <code>MaxCapacity</code> depends on whether you are running a Python shell job or an Apache Spark ETL job:</p> <ul> <li> <p>When you specify a Python shell job (<code>JobCommand.Name</code>=\"pythonshell\"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.</p> </li> <li> <p>When you specify an Apache Spark ETL job (<code>JobCommand.Name</code>=\"glueetl\") or Apache Spark streaming ETL job (<code>JobCommand.Name</code>=\"gluestreaming\"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.</p> </li> </ul>"
},
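To make the DPU rules concrete, and to exercise the `gluestreaming` command name this update adds to the documentation, here is a hedged sketch of creating a streaming ETL job; the job name, role ARN, and script location are placeholders:

```java
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.CreateJobRequest;
import software.amazon.awssdk.services.glue.model.JobCommand;

public class CreateStreamingJobExample {
    public static void main(String[] args) {
        try (GlueClient glue = GlueClient.create()) {
            glue.createJob(CreateJobRequest.builder()
                .name("clickstream-ingest")                          // hypothetical job name
                .role("arn:aws:iam::123456789012:role/GlueJobRole")  // placeholder role ARN
                .command(JobCommand.builder()
                    .name("gluestreaming")                           // Apache Spark streaming ETL
                    .scriptLocation("s3://my-bucket/scripts/ingest.py") // placeholder script
                    .build())
                // Spark jobs take 2-100 whole DPUs; fractional values
                // (0.0625) are only valid for "pythonshell" jobs.
                .maxCapacity(10.0)
                .build());
        }
    }
}
```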
"SecurityConfiguration":{
"shape":"NameString",
@@ -5253,7 +5271,7 @@
"documentation":"<p>The unique of the node within the workflow where the edge ends.</p>"
}
},
"documentation":"<p>An edge represents a directed connection between two AWS Glue components which are part of the workflow the edge belongs to.</p>"
"documentation":"<p>An edge represents a directed connection between two AWS Glue components that are part of the workflow the edge belongs to.</p>"
},
"EdgeList":{
"type":"list",
@@ -7224,7 +7242,7 @@
},
"MaxCapacity":{
"shape":"NullableDouble",
"documentation":"<p>The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the <a href=\"https://aws.amazon.com/glue/pricing/\">AWS Glue pricing page</a>.</p> <p>Do not set <code>Max Capacity</code> if using <code>WorkerType</code> and <code>NumberOfWorkers</code>.</p> <p>The value that can be allocated for <code>MaxCapacity</code> depends on whether you are running a Python shell job or an Apache Spark ETL job:</p> <ul> <li> <p>When you specify a Python shell job (<code>JobCommand.Name</code>=\"pythonshell\"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.</p> </li> <li> <p>When you specify an Apache Spark ETL job (<code>JobCommand.Name</code>=\"glueetl\"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.</p> </li> </ul>"
"documentation":"<p>The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the <a href=\"https://aws.amazon.com/glue/pricing/\">AWS Glue pricing page</a>.</p> <p>Do not set <code>Max Capacity</code> if using <code>WorkerType</code> and <code>NumberOfWorkers</code>.</p> <p>The value that can be allocated for <code>MaxCapacity</code> depends on whether you are running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL job:</p> <ul> <li> <p>When you specify a Python shell job (<code>JobCommand.Name</code>=\"pythonshell\"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.</p> </li> <li> <p>When you specify an Apache Spark ETL job (<code>JobCommand.Name</code>=\"glueetl\") or Apache Spark streaming ETL job (<code>JobCommand.Name</code>=\"gluestreaming\"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.</p> </li> </ul>"
},
"WorkerType":{
"shape":"WorkerType",
@@ -7309,7 +7327,7 @@
"members":{
"Name":{
"shape":"GenericString",
"documentation":"<p>The name of the job command. For an Apache Spark ETL job, this must be <code>glueetl</code>. For a Python shell job, it must be <code>pythonshell</code>.</p>"
"documentation":"<p>The name of the job command. For an Apache Spark ETL job, this must be <code>glueetl</code>. For a Python shell job, it must be <code>pythonshell</code>. For an Apache Spark streaming ETL job, this must be <code>gluestreaming</code>.</p>"
},
"ScriptLocation":{
"shape":"ScriptLocationString",
@@ -7378,7 +7396,7 @@
},
"JobRunState":{
"shape":"JobRunState",
"documentation":"<p>The current state of the job run. For more information about the statuses of jobs that have terminated abnormally, see <a href=\"https://docs.aws.amazon.com/glue/latest/dg/job-run-statuses.html\">AWS Glue Job Run Statuses</a>.</p>"
"documentation":"<p>The current state of the job run.</p>"
},
"Arguments":{
"shape":"GenericMap",
@@ -7504,7 +7522,7 @@
},
"MaxCapacity":{
"shape":"NullableDouble",
"documentation":"<p>The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the <a href=\"https://aws.amazon.com/glue/pricing/\">AWS Glue pricing page</a>.</p> <p>Do not set <code>Max Capacity</code> if using <code>WorkerType</code> and <code>NumberOfWorkers</code>.</p> <p>The value that can be allocated for <code>MaxCapacity</code> depends on whether you are running a Python shell job or an Apache Spark ETL job:</p> <ul> <li> <p>When you specify a Python shell job (<code>JobCommand.Name</code>=\"pythonshell\"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.</p> </li> <li> <p>When you specify an Apache Spark ETL job (<code>JobCommand.Name</code>=\"glueetl\"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.</p> </li> </ul>"
"documentation":"<p>The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the <a href=\"https://aws.amazon.com/glue/pricing/\">AWS Glue pricing page</a>.</p> <p>Do not set <code>Max Capacity</code> if using <code>WorkerType</code> and <code>NumberOfWorkers</code>.</p> <p>The value that can be allocated for <code>MaxCapacity</code> depends on whether you are running a Python shell job or an Apache Spark ETL job:</p> <ul> <li> <p>When you specify a Python shell job (<code>JobCommand.Name</code>=\"pythonshell\"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.</p> </li> <li> <p>When you specify an Apache Spark ETL job (<code>JobCommand.Name</code>=\"glueetl\") or Apache Spark streaming ETL job (<code>JobCommand.Name</code>=\"gluestreaming\"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.</p> </li> </ul>"
},
"WorkerType":{
"shape":"WorkerType",
@@ -8089,7 +8107,11 @@
"documentation":"<p>Details of the crawler when the node represents a crawler.</p>"
}
},
"documentation":"<p>A node represents an AWS Glue component like Trigger, Job etc. which is part of a workflow.</p>"
"documentation":"<p>A node represents an AWS Glue component such as a trigger, or job, etc., that is part of a workflow.</p>"
},
"NodeIdList":{
"type":"list",
"member":{"shape":"NameString"}
},
"NodeList":{
"type":"list",
@@ -8595,6 +8617,41 @@
"max":1000,
"min":0
},
"ResumeWorkflowRunRequest":{
"type":"structure",
"required":[
"Name",
"RunId",
"NodeIds"
],
"members":{
"Name":{
"shape":"NameString",
"documentation":"<p>The name of the workflow to resume.</p>"
},
"RunId":{
"shape":"IdString",
"documentation":"<p>The ID of the workflow run to resume.</p>"
},
"NodeIds":{
"shape":"NodeIdList",
"documentation":"<p>A list of the node IDs for the nodes you want to restart. The nodes that are to be restarted must have an execution attempt in the original run.</p>"
}
}
},
"ResumeWorkflowRunResponse":{
"type":"structure",
"members":{
"RunId":{
"shape":"IdString",
"documentation":"<p>The new ID assigned to the resumed workflow run. Each resume of a workflow run will have a new run ID.</p>"
},
"NodeIds":{
"shape":"NodeIdList",
"documentation":"<p>A list of the node IDs for the nodes that were actually restarted.</p>"
}
}
},
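Putting the request and response shapes together, one plausible pattern (a sketch, not something this commit prescribes) is to fetch the original run's graph with `GetWorkflowRun`, collect the IDs of job nodes whose latest attempt failed or timed out, and resume exactly those:

```java
import java.util.List;
import java.util.stream.Collectors;
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.GetWorkflowRunRequest;
import software.amazon.awssdk.services.glue.model.JobRunState;
import software.amazon.awssdk.services.glue.model.Node;
import software.amazon.awssdk.services.glue.model.NodeType;
import software.amazon.awssdk.services.glue.model.ResumeWorkflowRunRequest;
import software.amazon.awssdk.services.glue.model.ResumeWorkflowRunResponse;

public class ResumeFailedNodesExample {
    public static void main(String[] args) {
        String workflow = "my-etl-workflow";   // hypothetical workflow name
        String runId = "wr_original_run_id";   // placeholder ID of the run to resume

        try (GlueClient glue = GlueClient.create()) {
            // Fetch the original run, including its node/edge graph.
            List<Node> nodes = glue.getWorkflowRun(GetWorkflowRunRequest.builder()
                    .name(workflow)
                    .runId(runId)
                    .includeGraph(true)
                    .build())
                .run().graph().nodes();

            // Keep job nodes whose most recent attempt failed or timed out.
            // Only nodes with an execution attempt in the original run may be
            // resumed; this assumes each node's UniqueId is what NodeIds expects.
            List<String> failed = nodes.stream()
                .filter(n -> n.type() == NodeType.JOB
                    && n.jobDetails() != null
                    && !n.jobDetails().jobRuns().isEmpty())
                .filter(n -> {
                    List<software.amazon.awssdk.services.glue.model.JobRun> runs =
                        n.jobDetails().jobRuns();
                    JobRunState s = runs.get(runs.size() - 1).jobRunState();
                    return s == JobRunState.FAILED || s == JobRunState.TIMEOUT;
                })
                .map(Node::uniqueId)
                .collect(Collectors.toList());

            ResumeWorkflowRunResponse resumed = glue.resumeWorkflowRun(
                ResumeWorkflowRunRequest.builder()
                    .name(workflow)
                    .runId(runId)
                    .nodeIds(failed)
                    .build());

            System.out.println("Resumed as run " + resumed.runId()
                + "; restarted " + resumed.nodeIds());
        }
    }
}
```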
"Role":{"type":"string"},
"RoleArn":{
"type":"string",
@@ -10752,12 +10809,16 @@
"members":{
"Name":{
"shape":"NameString",
"documentation":"<p>Name of the workflow which was executed.</p>"
"documentation":"<p>Name of the workflow that was executed.</p>"
},
"WorkflowRunId":{
"shape":"IdString",
"documentation":"<p>The ID of this workflow run.</p>"
},
"PreviousRunId":{
"shape":"IdString",
"documentation":"<p>The ID of the previous workflow run.</p>"
},
"WorkflowRunProperties":{
"shape":"WorkflowRunProperties",
"documentation":"<p>The workflow run properties which were set during the run.</p>"
@@ -10799,19 +10860,19 @@
},
"TimeoutActions":{
"shape":"IntegerValue",
"documentation":"<p>Total number of Actions which timed out.</p>"
"documentation":"<p>Total number of Actions that timed out.</p>"
},
"FailedActions":{
"shape":"IntegerValue",
"documentation":"<p>Total number of Actions which have failed.</p>"
"documentation":"<p>Total number of Actions that have failed.</p>"
},
"StoppedActions":{
"shape":"IntegerValue",
"documentation":"<p>Total number of Actions which have stopped.</p>"
"documentation":"<p>Total number of Actions that have stopped.</p>"
},
"SucceededActions":{
"shape":"IntegerValue",
"documentation":"<p>Total number of Actions which have succeeded.</p>"
"documentation":"<p>Total number of Actions that have succeeded.</p>"
},
"RunningActions":{
"shape":"IntegerValue",
