:py:mod:`astronomer.providers.amazon.aws.operators.sagemaker`
=============================================================

.. py:module:: astronomer.providers.amazon.aws.operators.sagemaker


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   astronomer.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperatorAsync
   astronomer.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperatorAsync
   astronomer.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperatorAsync


Functions
~~~~~~~~~

.. autoapisummary::

   astronomer.providers.amazon.aws.operators.sagemaker.serialize


.. py:function:: serialize(result)

   Serialize any object coming from a SageMaker API response to a JSON string.


.. py:class:: SageMakerProcessingOperatorAsync(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, print_log = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, action_if_job_exists = 'timestamp', **kwargs)

   Bases: :py:obj:`airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator`

   SageMakerProcessingOperatorAsync is used to analyze data and evaluate machine learning
   models on Amazon SageMaker. With SageMakerProcessingOperatorAsync, you can use a simplified,
   managed experience on SageMaker to run your data processing workloads, such as feature
   engineering, data validation, model evaluation, and model interpretation.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:SageMakerProcessingOperator`

   :param config: The configuration necessary to start a processing job (templated).
       For details of the configuration parameter see :py:meth:`SageMaker.Client.create_processing_job`
   :param aws_conn_id: The AWS connection ID to use.
   :param wait_for_completion: Even if this is set to False, the async operator still defers
       and waits for the trigger to report the status of the processing job.
   :param print_log: Whether the operator should print the CloudWatch log during processing.
   :param check_interval: The interval, in seconds, at which the operator checks the status of
       the processing job while waiting for completion.
   :param max_ingestion_time: The operation fails if the processing job doesn't finish within
       max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
   :param action_if_job_exists: Behaviour if the job name already exists. Possible options are
       "timestamp" (default), "increment", and "fail".

   .. py:method:: execute(context)

      Creates the processing job via the synchronous hook method ``create_processing_job``, then
      passes control to the trigger, which polls for the status of the processing job asynchronously.


   .. py:method:: execute_complete(context, event = None)

      Callback for when the trigger fires - returns immediately.
      Relies on the trigger to throw an exception; otherwise it assumes execution was successful.



.. py:class:: SageMakerTransformOperatorAsync(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, check_if_job_exists = True, action_if_job_exists = 'timestamp', **kwargs)

   Bases: :py:obj:`airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator`

   SageMakerTransformOperatorAsync starts a transform job and polls for the status asynchronously.
   A transform job uses a trained model to get inferences on a dataset and saves these results to
   an Amazon S3 location that you specify.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:SageMakerTransformOperator`

   :param config: The configuration necessary to start a transform job (templated).

       If you need to create a SageMaker transform job based on an existing SageMaker model::

           config = transform_config

       If you need to create both a SageMaker model and a SageMaker transform job::

           config = {
               'Model': model_config,
               'Transform': transform_config
           }

       For details of the configuration parameter of transform_config, see
       :py:meth:`SageMaker.Client.create_transform_job`

       For details of the configuration parameter of model_config, see
       :py:meth:`SageMaker.Client.create_model`
   :param aws_conn_id: The AWS connection ID to use.
   :param check_interval: The interval, in seconds, at which the operator checks the status of
       the transform job while waiting for completion.
   :param max_ingestion_time: The operation fails if the transform job doesn't finish within
       max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
   :param check_if_job_exists: If set to True, the operator checks whether a transform job
       already exists for the name in the config.
   :param action_if_job_exists: Behaviour if the job name already exists. Possible options are
       "timestamp" (default), "increment", and "fail".
       This is only relevant if check_if_job_exists is True.

   .. py:method:: execute(context)

      Creates the transform job via the synchronous hook method ``create_transform_job``, then
      passes control to the trigger, which polls for the status of the transform job asynchronously.


   .. py:method:: execute_complete(context, event)

      Callback for when the trigger fires - returns immediately.
      Relies on the trigger to throw an exception; otherwise it assumes execution was successful.



.. py:class:: SageMakerTrainingOperatorAsync(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, print_log = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, check_if_job_exists = True, action_if_job_exists = 'timestamp', **kwargs)

   Bases: :py:obj:`airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator`

   SageMakerTrainingOperatorAsync starts a model training job and polls for the status asynchronously.
   After training completes, Amazon SageMaker saves the resulting model artifacts to an Amazon S3
   location that you specify.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:SageMakerTrainingOperator`

   :param config: The configuration necessary to start a training job (templated).
       For details of the configuration parameter see :py:meth:`SageMaker.Client.create_training_job`
   :param aws_conn_id: The AWS connection ID to use.
   :param print_log: Whether the operator should print the CloudWatch log during training.
   :param check_interval: The interval, in seconds, at which the operator checks the status of
       the training job while waiting for completion.
   :param max_ingestion_time: The operation fails if the training job doesn't finish within
       max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
   :param check_if_job_exists: If set to True, the operator checks whether a training job
       already exists for the name in the config.
   :param action_if_job_exists: Behaviour if the job name already exists. Possible options are
       "timestamp" (default), "increment", and "fail".
       This is only relevant if check_if_job_exists is True.

   .. py:method:: execute(context)

      Creates the SageMaker training job via the synchronous hook method ``create_training_job``, then
      passes control to the trigger, which polls for the status of the training job asynchronously.


   .. py:method:: execute_complete(context, event)

      Callback for when the trigger fires - returns immediately.
      Relies on the trigger to throw an exception; otherwise it assumes execution was successful.
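

The ``execute_complete`` callbacks described in this module share one pattern: fail if the
trigger reported a problem, otherwise hand back the serialized result. The sketch below
illustrates that pattern without any Airflow dependency. It is not the provider's
implementation: the event shape (``status`` and ``message`` keys), the names
``serialize_sketch`` and ``handle_trigger_event``, the ``str()`` fallback for
non-JSON values, and the use of ``RuntimeError`` in place of an Airflow exception are all
assumptions made for illustration.

.. code-block:: python

   import json
   from datetime import datetime, timezone
   from typing import Any, Dict, Optional


   def serialize_sketch(result: Any) -> str:
       """Turn an API-style response into a JSON string.

       Values that json cannot encode natively (for example datetime)
       fall back to their str() form. This fallback is an assumption.
       """
       return json.dumps(result, default=str)


   def handle_trigger_event(event: Optional[Dict[str, Any]]) -> str:
       """Hypothetical stand-in for an execute_complete callback."""
       if event is None:
           # Nothing came back from the trigger, so success cannot be assumed.
           raise RuntimeError("No event received in trigger callback")
       if event.get("status") == "error":
           # The trigger reported a failure: surface it as an exception.
           raise RuntimeError(str(event.get("message")))
       # Any other status is treated as success and the payload is returned.
       return serialize_sketch(event.get("message"))


   if __name__ == "__main__":
       # Sample payload; the keys mimic a job description but are illustrative.
       sample_event = {
           "status": "success",
           "message": {
               "TrainingJobStatus": "Completed",
               "CreationTime": datetime(2023, 1, 1, tzinfo=timezone.utc),
           },
       }
       print(handle_trigger_event(sample_event))

Raising on a missing event is a design choice in this sketch: it avoids marking a task
successful when the trigger never reported back.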