astronomer.providers.amazon.aws.triggers.emr

Module Contents

Classes

EmrContainerSensorTrigger

The EmrContainerSensorTrigger is triggered when the EMR container is created; it polls for the AWS EMR EKS virtual cluster job status.

EmrContainerOperatorTrigger

The EmrContainerOperatorTrigger is triggered when the EMR container is created; it polls for the AWS EMR EKS virtual cluster job status.

EmrStepSensorTrigger

A trigger that fires once an AWS EMR cluster step reaches either a target or a failed state.

EmrJobFlowSensorTrigger

EmrJobFlowSensorTrigger is fired as a deferred class, with params to run the task in the trigger worker, when an EMR JobFlow is created.

class astronomer.providers.amazon.aws.triggers.emr.EmrContainerSensorTrigger(virtual_cluster_id, job_id, max_retries=None, aws_conn_id='aws_default', poll_interval=10)[source]

Bases: airflow.triggers.base.BaseTrigger

The EmrContainerSensorTrigger is triggered when the EMR container is created; it polls for the AWS EMR EKS virtual cluster job status. It is fired as a deferred class, with params to run the task in the trigger worker.

Parameters
  • virtual_cluster_id (str) – ID of the EMR on EKS virtual cluster

  • job_id (str) – ID of the job whose state is checked

  • max_retries (Optional[int]) – maximum number of times to poll for the status

  • aws_conn_id (str) – Reference to AWS connection id

  • poll_interval (int) – polling period in seconds to check for the status

serialize(self)[source]

Serializes EmrContainerSensorTrigger arguments and classpath.

async run(self)[source]

Makes an async connection to the EMR container service and polls for the job state.
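The interplay of poll_interval and max_retries can be illustrated with a minimal sketch. It is not the trigger's actual implementation: `get_state` is a hypothetical stand-in for the EMR status call, the terminal-state set is borrowed from the TERMINAL_STATES values listed for EmrContainerOperatorTrigger, and treating `max_retries=None` as "poll without limit" is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Assumption: states treated as final, copied from the TERMINAL_STATES
# values this module lists for EmrContainerOperatorTrigger.
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "CANCEL_PENDING"}


async def poll_job_state(
    get_state: Callable[[], Awaitable[str]],  # hypothetical status fetcher
    poll_interval: float = 10,
    max_retries: Optional[int] = None,
) -> Optional[str]:
    """Poll until a terminal state is seen or the retry budget runs out."""
    attempts = 0
    # Assumption: max_retries=None means "no limit on polling attempts".
    while max_retries is None or attempts < max_retries:
        state = await get_state()
        if state in TERMINAL:
            return state
        attempts += 1
        await asyncio.sleep(poll_interval)
    return None  # budget exhausted without reaching a terminal state
```

Returning None on exhaustion is a choice made for this sketch only; a real trigger would more likely report the timeout through an event.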

class astronomer.providers.amazon.aws.triggers.emr.EmrContainerOperatorTrigger(virtual_cluster_id, name, job_id, aws_conn_id='aws_default', poll_interval=30, max_tries=None)[source]

Bases: airflow.triggers.base.BaseTrigger

The EmrContainerOperatorTrigger is triggered when the EMR container is created; it polls for the AWS EMR EKS virtual cluster job status. It is fired as a deferred class, with params to run the task in the trigger worker.

Parameters
  • virtual_cluster_id (str) – The EMR on EKS virtual cluster ID.

  • name (str) – The name of the job run.

  • job_id – The ID of the job run whose status is polled.

  • execution_role_arn – The IAM role ARN associated with the job run.

  • release_label – The Amazon EMR release version to use for the job run.

  • job_driver – Job configuration details, e.g. the Spark job parameters.

  • configuration_overrides – The configuration overrides for the job run, specifically either application configuration or monitoring configuration.

  • client_request_token – The client idempotency token of the job run request.

  • aws_conn_id (str) – Reference to AWS connection id.

  • poll_interval (int) – polling period in seconds to check for the status.

  • max_tries – maximum number of times to poll for the status.

INTERMEDIATE_STATES: List[str] = ['PENDING', 'SUBMITTED', 'RUNNING']

FAILURE_STATES: List[str] = ['FAILED', 'CANCELLED', 'CANCEL_PENDING']

SUCCESS_STATES: List[str] = ['COMPLETED']

TERMINAL_STATES: List[str] = ['COMPLETED', 'FAILED', 'CANCELLED', 'CANCEL_PENDING']

serialize(self)[source]

Serializes EmrContainerOperatorTrigger arguments and classpath.

async run(self)[source]

Runs until the EMR container reaches the desired state.
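As a rough illustration of how the four state lists relate, the sketch below copies their values from this class and maps a job state to a coarse outcome. The `classify` helper and its return labels are hypothetical and not part of the library.

```python
from typing import List

# Values copied from the class attributes documented above.
INTERMEDIATE_STATES: List[str] = ["PENDING", "SUBMITTED", "RUNNING"]
FAILURE_STATES: List[str] = ["FAILED", "CANCELLED", "CANCEL_PENDING"]
SUCCESS_STATES: List[str] = ["COMPLETED"]
TERMINAL_STATES: List[str] = ["COMPLETED", "FAILED", "CANCELLED", "CANCEL_PENDING"]


def classify(state: str) -> str:
    """Hypothetical helper: map a job state to a coarse outcome label."""
    if state in SUCCESS_STATES:
        return "success"
    if state in FAILURE_STATES:
        return "error"
    if state in INTERMEDIATE_STATES:
        return "running"
    return "unknown"  # a state that none of the lists covers


def is_terminal(state: str) -> bool:
    """True once polling can stop for this state."""
    return state in TERMINAL_STATES
```

With these values, TERMINAL_STATES is exactly the union of SUCCESS_STATES and FAILURE_STATES, so a poller can stop on `is_terminal` and then use `classify` to decide how to report the result.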

class astronomer.providers.amazon.aws.triggers.emr.EmrStepSensorTrigger(job_flow_id, step_id, aws_conn_id, poke_interval, target_states=None, failed_states=None)[source]

Bases: airflow.triggers.base.BaseTrigger

A trigger that fires once an AWS EMR cluster step reaches either a target or a failed state.

Parameters
  • job_flow_id (str) – ID of the job flow that contains the step whose state is checked

  • step_id (str) – step to check the state of

  • aws_conn_id (str) – AWS connection to use

  • poke_interval (float) – time in seconds to wait between two consecutive calls that check the EMR cluster step state

  • target_states (Optional[Iterable[str]]) – the target states; the sensor waits until the step reaches any of these states

  • failed_states (Optional[Iterable[str]]) – the failure states; the sensor fails when the step reaches any of these states

serialize(self)[source]

Serializes EmrStepSensorTrigger arguments and classpath.

async run(self)[source]

Runs until the AWS EMR cluster step reaches a target or failed state.
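The "fire once on a target or failed state" behaviour can be sketched as an async generator. This is an illustration under stated assumptions, not the library's code: `get_step_state` is a hypothetical stand-in for the EMR call, and the event is a plain dict with a made-up shape rather than an Airflow TriggerEvent.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, Iterable


async def watch_step(
    get_step_state: Callable[[], Awaitable[str]],  # hypothetical fetcher
    poke_interval: float,
    target_states: Iterable[str],
    failed_states: Iterable[str],
) -> AsyncIterator[Dict[str, Any]]:
    """Yield exactly one event once the step hits a target or failed state."""
    targets, failures = set(target_states), set(failed_states)
    while True:
        state = await get_step_state()
        if state in targets:
            # Hypothetical payload shape, chosen for this sketch only.
            yield {"status": "success", "state": state}
            return
        if state in failures:
            yield {"status": "error", "state": state}
            return
        # Neither target nor failure: wait, then check again.
        await asyncio.sleep(poke_interval)
```

A caller would consume it with `async for event in watch_step(...)`; the loop body runs once, because the generator returns right after yielding.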

class astronomer.providers.amazon.aws.triggers.emr.EmrJobFlowSensorTrigger(job_flow_id, aws_conn_id, poll_interval, target_states=None, failed_states=None)[source]

Bases: airflow.triggers.base.BaseTrigger

EmrJobFlowSensorTrigger is fired as a deferred class, with params to run the task in the trigger worker, when an EMR JobFlow is created.

Parameters
  • job_flow_id (str) – ID of the job flow whose state is checked

  • aws_conn_id (str) – Reference to AWS connection id

  • target_states (Optional[Iterable[str]]) – the target states, sensor waits until job flow reaches any of these states

  • failed_states (Optional[Iterable[str]]) – the failure states, sensor fails when job flow reaches any of these states

  • poll_interval (float) – polling period in seconds to check for the status

serialize(self)[source]

Serializes EmrJobFlowSensorTrigger arguments and classpath.

async run(self)[source]

Makes an async connection to EMR and polls the job flow until it reaches a target or failed state.
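Each trigger's serialize() follows the Airflow BaseTrigger contract of returning a (classpath, kwargs) tuple, which the triggerer uses to re-create the trigger. The stand-in class below only mirrors the constructor arguments of EmrJobFlowSensorTrigger to show that round trip; it is a sketch and does not reproduce the library's implementation.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


class JobFlowTriggerSketch:
    """Stand-in that mirrors EmrJobFlowSensorTrigger's constructor arguments."""

    def __init__(
        self,
        job_flow_id: str,
        aws_conn_id: str,
        poll_interval: float,
        target_states: Optional[Iterable[str]] = None,
        failed_states: Optional[Iterable[str]] = None,
    ) -> None:
        self.job_flow_id = job_flow_id
        self.aws_conn_id = aws_conn_id
        self.poll_interval = poll_interval
        # Store lists so the kwargs stay plainly serializable.
        self.target_states = None if target_states is None else list(target_states)
        self.failed_states = None if failed_states is None else list(failed_states)

    def serialize(self) -> Tuple[str, Dict[str, Any]]:
        """Return the import path of the class and the kwargs to rebuild it."""
        classpath = f"{type(self).__module__}.{type(self).__qualname__}"
        return classpath, {
            "job_flow_id": self.job_flow_id,
            "aws_conn_id": self.aws_conn_id,
            "poll_interval": self.poll_interval,
            "target_states": self.target_states,
            "failed_states": self.failed_states,
        }


# "j-EXAMPLE" is a placeholder ID; the states are EMR cluster state names.
trigger = JobFlowTriggerSketch(
    "j-EXAMPLE",
    "aws_default",
    30.0,
    target_states=["TERMINATED"],
    failed_states=["TERMINATED_WITH_ERRORS"],
)
classpath, kwargs = trigger.serialize()
rebuilt = JobFlowTriggerSketch(**kwargs)  # what a triggerer would do after import
```

Because the kwargs map one-to-one onto the constructor, the rebuilt object serializes to the same tuple as the original.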