astronomer.providers.amazon.aws.hooks.redshift_data

Classes

RedshiftDataHook

RedshiftDataHook inherits from AwsBaseHook to connect with AWS redshift

Module Contents

class astronomer.providers.amazon.aws.hooks.redshift_data.RedshiftDataHook(*args, poll_interval=0, **kwargs)[source]

Bases: airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook

RedshiftDataHook inherits from AwsBaseHook to connect with AWS redshift by using boto3 client_type as redshift-data we can interact with redshift cluster database and execute the query

This class is deprecated and will be removed in 2.0.0. Use :class: ~airflow.providers.amazon.aws.hooks.redshift_data.RedshiftDataHook instead

Parameters:
  • aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).

  • verify – Whether or not to verify SSL certificates. https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html

  • region_name – AWS region_name. If not specified then the default boto3 behaviour is used.

  • client_type – boto3.client client_type. Eg ‘s3’, ‘emr’ etc

  • resource_type – boto3.resource resource_type. Eg ‘dynamodb’ etc

  • config – Configuration for botocore client. (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html)

  • poll_interval (int) – polling period in seconds to check for the status

aws_connection_type: str = 'redshift-data'
client_type
poll_interval
get_conn_params()[source]

Helper method to retrieve connection args

execute_query(sql, params)[source]

Runs an SQL statement, which can be data manipulation language (DML) or data definition language (DDL)

Parameters:

sql (dict[Any, Any] | Iterable[Any]) – list of query ids

async get_query_status(query_ids)[source]

Async function to get the Query status by query Ids. The function takes list of query_ids, makes async connection to redshift data to get the query status by query id and returns the query status. In case of success, it returns a list of query IDs of the queries that have a status FINISHED. In the case of partial failure meaning if any of queries fail or is aborted by the user we return an error as a whole.

Parameters:

query_ids (list[str]) – list of query ids

async is_still_running(qid)[source]

Async function to check whether the query is still running to return True or in “PICKED”, “STARTED” or “SUBMITTED” state to return False.

queries_are_completed(query_ids, context)[source]

Check whether all queries complete