google.cloud.hooks.bigquery

Module Contents

Classes

BigQueryHookAsync

Abstract base class for hooks. Hooks are meant as an interface to interact with external systems.

BigQueryTableHookAsync

Async hook for BigQuery tables.

Attributes

BigQueryJob

google.cloud.hooks.bigquery.BigQueryJob
class google.cloud.hooks.bigquery.BigQueryHookAsync(**kwargs)

Bases: astronomer.providers.google.common.hooks.base_google.GoogleBaseHookAsync

Abstract base class for hooks. Hooks are meant as an interface to interact with external systems. MySqlHook, HiveHook, and PigHook return objects that can handle the connection and interaction with specific instances of these systems, and expose consistent methods to interact with them.

sync_hook_class
async get_job_instance(self, project_id, job_id, session)

Get the specified job resource by job ID and project ID.

async get_job_status(self, job_id, project_id=None)

Polls for job status asynchronously using gcloud-aio.

Note that an OSError is raised while the job results are still pending. An Exception means that the job finished with errors.
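The polling pattern described above can be sketched with plain asyncio. This is a minimal illustration, not the hook's implementation: wait_for_job, make_fake_status, and the status strings "pending" and "success" are all hypothetical names chosen for the example, and a stand-in coroutine replaces the real call so the sketch runs without Google Cloud credentials.

```python
import asyncio
from typing import Awaitable, Callable, List


async def wait_for_job(
    get_status: Callable[[], Awaitable[str]],
    poll_interval: float = 0.01,
    max_polls: int = 10,
) -> str:
    """Call get_status until it stops reporting "pending" or max_polls is reached."""
    for _ in range(max_polls):
        status = await get_status()
        if status != "pending":  # assumed status string, for illustration only
            return status
        # Yield control to the event loop between polls instead of blocking.
        await asyncio.sleep(poll_interval)
    return "pending"


def make_fake_status(responses: List[str]) -> Callable[[], Awaitable[str]]:
    """Build a stand-in status coroutine that replays canned responses."""
    remaining = iter(responses)

    async def get_status() -> str:
        return next(remaining)

    return get_status


final_status = asyncio.run(
    wait_for_job(make_fake_status(["pending", "pending", "success"]))
)
```

In real use, the stand-in would be replaced by a coroutine that awaits the hook's get_job_status with a job ID and project ID.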

async get_job_output(self, job_id, project_id=None)

Get the BigQuery job output for the given job ID asynchronously using gcloud-aio.

get_records(self, query_results, nocast=True)

Given the query response from gcloud-aio BigQuery, convert the response to records.

Parameters
  • query_results (Dict[str, Any]) – the results from a SQL query

  • nocast (bool) – indicates whether casting to BigQuery data types is required or not
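As an illustration of the conversion described above, the sketch below flattens a response shaped like the BigQuery REST API's query results, where each row carries its cells under "f" and each cell carries its value under "v". The function name rows_to_records and the sample data are hypothetical, and no type casting is attempted, so this is only an approximation of what get_records may do with nocast=True.

```python
from typing import Any, Dict, List


def rows_to_records(query_results: Dict[str, Any]) -> List[List[Any]]:
    """Flatten a BigQuery-style query response into a list of rows."""
    records = []
    for row in query_results.get("rows", []):
        # Each row is {"f": [{"v": value}, ...]}; keep only the raw values.
        records.append([cell.get("v") for cell in row.get("f", [])])
    return records


# Hypothetical response with two rows of two columns each.
sample_response = {
    "rows": [
        {"f": [{"v": "alice"}, {"v": "42"}]},
        {"f": [{"v": "bob"}, {"v": "7"}]},
    ]
}
records = rows_to_records(sample_response)
```

Values stay as strings here because the REST response encodes them that way; casting them to Python types would be a separate step.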

value_check(self, sql, pass_value, records, tolerance=None)

Match a single row of query results against pass_value, within an optional tolerance.

Returns

Nothing. An AirflowException is raised if the match fails.

Return type

None
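One common way to apply a tolerance in a value check is to accept any value inside a band around pass_value. The sketch below assumes that interpretation, with tolerance given as a fraction such as 0.1 for 10 percent. The helper within_tolerance is hypothetical and is not the hook's own code.

```python
from typing import Optional


def within_tolerance(
    value: float, pass_value: float, tolerance: Optional[float] = None
) -> bool:
    """Return True if value matches pass_value, exactly or within a band."""
    if tolerance is None:
        # Without a tolerance, only an exact match passes.
        return value == pass_value
    # Assumed band: pass_value * (1 - tolerance) .. pass_value * (1 + tolerance).
    lower = pass_value * (1 - tolerance)
    upper = pass_value * (1 + tolerance)
    # min/max keeps the band ordered when pass_value is negative.
    return min(lower, upper) <= value <= max(lower, upper)


close_enough = within_tolerance(105, 100, tolerance=0.1)
too_far = within_tolerance(111, 100, tolerance=0.1)
```

A caller following the behavior documented above would raise an AirflowException when such a check returns False.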

interval_check(self, row1, row2, metrics_thresholds, ignore_zero, ratio_formula)

Checks that the values of metrics given as SQL expressions are within a certain tolerance

Parameters
  • row1 (Optional[str]) – first resulting row of a query execution job for first SQL query

  • row2 (Optional[str]) – first resulting row of a query execution job for second SQL query

  • metrics_thresholds (Dict[str, Any]) – a dictionary of ratios indexed by metric. For example, 'COUNT(*)': 1.5 requires a difference of 50 percent or less between the current day and the prior days_back.

  • ignore_zero (bool) – whether we should ignore zero metrics

  • ratio_formula (str) – which formula to use to compute the ratio between the two metrics. Assuming cur is the metric of today and ref is the metric of today - days_back:
      max_over_min: computes max(cur, ref) / min(cur, ref)
      relative_diff: computes abs(cur-ref) / ref
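The two ratio formulas above can be written out directly. In the sketch below, only the formulas themselves come from the parameter description. The helper metric_passes, the strict "less than" comparison against the threshold, and the treatment of zero metrics (pass only when ignore_zero is true) are assumptions made for illustration.

```python
def max_over_min(cur: float, ref: float) -> float:
    """max(cur, ref) / min(cur, ref), as described for ratio_formula."""
    return max(cur, ref) / min(cur, ref)


def relative_diff(cur: float, ref: float) -> float:
    """abs(cur - ref) / ref, as described for ratio_formula."""
    return abs(cur - ref) / ref


RATIO_FORMULAS = {"max_over_min": max_over_min, "relative_diff": relative_diff}


def metric_passes(
    cur: float,
    ref: float,
    threshold: float,
    ratio_formula: str = "max_over_min",
    ignore_zero: bool = True,
) -> bool:
    """Hypothetical check of one metric against its threshold."""
    if cur == 0 or ref == 0:
        # Assumption: a zero metric has no ratio and passes only if ignored.
        return ignore_zero
    # Assumption: the ratio must be strictly below the threshold.
    return RATIO_FORMULAS[ratio_formula](cur, ref) < threshold


# Today's count is 120 against a reference of 100, with a threshold of 1.5.
passes = metric_passes(120, 100, threshold=1.5)
```

With max_over_min the ratio never drops below 1, so thresholds such as 1.5 are natural. With relative_diff the ratio starts at 0, so a comparable threshold would be 0.5.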

class google.cloud.hooks.bigquery.BigQueryTableHookAsync(**kwargs)

Bases: astronomer.providers.google.common.hooks.base_google.GoogleBaseHookAsync

Async hook for BigQuery tables.

sync_hook_class
async get_table_client(self, dataset, table_id, project_id, session)

Returns a Google BigQuery Table object.

Parameters
  • dataset (str) – The name of the dataset in which to look for the table storage bucket.

  • table_id (str) – The name of the table to check the existence of.

  • project_id (str) – The Google cloud project in which to look for the table. The connection supplied to the hook must provide access to the specified project.

  • session (aiohttp.ClientSession) – aiohttp ClientSession