Dataproc Operator Async¶
To create a new cluster on Google Cloud Dataproc
DataprocCreateClusterOperatorAsync
.
create_cluster = DataprocCreateClusterOperatorAsync(
task_id="create_cluster",
project_id=PROJECT_ID,
cluster_config=CLUSTER_CONFIG,
region=REGION,
cluster_name=CLUSTER_NAME,
)
# https://github.com/astronomer/astronomer-providers/tree/main/astronomer/providers/google/cloud/example_dags/example_dataproc.py
To delete a cluster on Google Cloud Dataproc
DataprocDeleteClusterOperatorAsync
.
delete_cluster = DataprocDeleteClusterOperatorAsync(
task_id="delete_cluster",
project_id=PROJECT_ID,
cluster_name=CLUSTER_NAME,
region=REGION,
trigger_rule="all_done",
)
# https://github.com/astronomer/astronomer-providers/tree/main/astronomer/providers/google/cloud/example_dags/example_dataproc.py
To submits a job to a dataproc cluster
DataprocSubmitJobOperatorAsync
.
pig_task = DataprocSubmitJobOperatorAsync(
task_id="pig_task", job=PIG_JOB, region=REGION, project_id=PROJECT_ID
)
# https://github.com/astronomer/astronomer-providers/tree/main/astronomer/providers/google/cloud/example_dags/example_dataproc.py
To updates an existing cluster in a Google cloud platform project
DataprocUpdateClusterOperatorAsync
.
update_cluster = DataprocUpdateClusterOperatorAsync(
task_id="update_cluster",
cluster_name=CLUSTER_NAME,
cluster=CLUSTER_UPDATE,
update_mask=UPDATE_MASK,
graceful_decommission_timeout=TIMEOUT,
project_id=PROJECT_ID,
region=REGION,
)
# https://github.com/astronomer/astronomer-providers/tree/main/astronomer/providers/google/cloud/example_dags/example_dataproc.py