astronomer.providers.apache.hive.sensors.hive_partition

Module Contents

Classes

HivePartitionSensorAsync

Waits for a given partition to show up in Hive table asynchronously.

class astronomer.providers.apache.hive.sensors.hive_partition.HivePartitionSensorAsync(*, table, partition="ds='{{ ds }}'", metastore_conn_id='metastore_default', schema='default', poke_interval=60 * 3, **kwargs)[source]

Bases: airflow.providers.apache.hive.sensors.hive_partition.HivePartitionSensor

Waits for a given partition to show up in Hive table asynchronously.

Note

HivePartitionSensorAsync uses implya library instead of pyhive. The sync version of this sensor uses pyhive, but pyhive is currently unsupported.

Since we use implya library, please set the connection to use the port 10000 instead of 9083. This sensor currently supports auth_mechansim='PLAIN' only.

The library version of hive and hadoop in Dockerfile should match the remote cluster where they are running.

Parameters
  • table (str) – the table where the partition is present.

  • partition (Optional[str]) – The partition clause to wait for. This is passed as notation as in “ds=’2015-01-01’”

  • schema (str) – database which needs to be connected in hive. By default it is ‘default’

  • metastore_conn_id (str) – connection string to connect to hive.

  • polling_interval – The interval in seconds to wait between checks for partition.

execute(self, context)[source]

Airflow runs this method on the worker and defers using the trigger.

execute_complete(self, context, event=None)[source]

Callback for when the trigger fires - returns immediately. Relies on trigger to throw an exception, otherwise it assumes execution was successful.