AWS Sagemaker Pipelines throws a "No finished training job found associated with this estimator" warning after introducing a register step

I am currently working on creating a Sagemaker Pipeline to train a TensorFlow model. I'm new to this area and have been following this guide created by AWS, as well as the standard pipeline workflow described in the Sagemaker developer guide.

I have a pipeline that runs without error when I only include the preprocessing, training, evaluation, and condition steps. When I add the register step:

# Package evaluation metrics into an evaluation report `PropertyFile`
evaluation_report = PropertyFile(
        name="EvaluationReport", output_name="evaluation", path="evaluation_report.json"
)

# Create ModelMetrics object using the evaluation report from the evaluation step
# A ModelMetrics object contains metrics captured from a model.
model_metrics = ModelMetrics(model_statistics=evaluation_report)

# Create a RegisterModel step, which registers the model with Sagemaker Model Registry.
register_step = RegisterModel(
    name="Foo",
    estimator=estimator,
    model_data=train_step.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=config["instance"]["inference"],
    transform_instances=config["instance"]["transform"],
    model_package_group_name="Bar",
    model_metrics=model_metrics,
    approval_status="approved",
)

to the condition step's if_steps:

# Create a Sagemaker Pipelines ConditionStep, using the condition above.
# Enter the steps to perform if the condition returns True / False.
cond_step = ConditionStep(
    name="MSE-Lower-Than-Threshold-Condition",
    conditions=[cond_lte],
    if_steps=[register_step],
    else_steps=[],
)

I get the following trace:

PropertyFile(name='EvaluationReport', output_name='evaluation', path='evaluation_report.json')
No finished training job found associated with this estimator. Please make sure this estimator is only used for building workflow config
Traceback (most recent call last):
  File "/Users/<user>/Code/<repo_name>/pipeline_definition.py", line 474, in <module>
    main()
  File "/Users/<user>/Code/<repo_name>/pipeline_definition.py", line 466, in main
    pipeline = define_pipeline()
  File "/Users/<user>/Code/<repo_name>/pipeline_definition.py", line 457, in define_pipeline
    print(json.loads(pipeline.definition()))
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py", line 257, in definition
    request_dict = self.to_request()
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py", line 89, in to_request
    "Steps": list_to_request(self.steps),
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/utilities.py", line 37, in list_to_request
    request_dicts.append(entity.to_request())
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 99, in to_request
    "Arguments": self.arguments,
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/condition_step.py", line 87, in arguments
    IfSteps=list_to_request(self.if_steps),
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/utilities.py", line 39, in list_to_request
    request_dicts.extend(entity.request_dicts())
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/step_collections.py", line 50, in request_dicts
    return [step.to_request() for step in self.steps]
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/step_collections.py", line 50, in <listcomp>
    return [step.to_request() for step in self.steps]
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 209, in to_request
    step_dict = super().to_request()
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 99, in to_request
    "Arguments": self.arguments,
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/workflow/_utils.py", line 423, in arguments
    model_package_args = get_model_package_args(
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/session.py", line 4217, in get_model_package_args
    model_package_args["model_metrics"] = model_metrics._to_request_dict()
  File "/Users/<user>/Code/<repo_name>/venv/lib/python3.9/site-packages/sagemaker/model_metrics.py", line 66, in _to_request_dict
    model_quality["Statistics"] = self.model_statistics._to_request_dict()
AttributeError: 'PropertyFile' object has no attribute '_to_request_dict'

From this trace I see two potentially related issues. The immediate one is the AttributeError: 'PropertyFile' object has no attribute '_to_request_dict'. I haven't been able to find any information on why we might be receiving it, either in forums or in the Sagemaker documentation.

I also see a sneakier issue towards the top of the trace that has plagued me all day. The line "No finished training job found associated with this estimator. Please make sure this estimator is only used for building workflow config" tells me that the register step is using our estimator when it should be waiting until after the training step has run. I can't find any reference to this error besides a somewhat similar Stack Exchange post.

I've compared my code to the AWS-published examples many times and I'm confident that I'm not doing anything taboo. Could anyone shed some light on what these errors are suggesting? Is there any more information or code that would be helpful to share?

Thanks so much!


The "No finished training job found associated with this estimator" message is normal and can be ignored. It's the SDK warning you that the estimator's model data is being referenced even though the estimator hasn't been run yet. That's expected when using pipelines, since the pipeline will start the training job outside the context of the notebook.
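A quick way to see why: the model_data you pass to RegisterModel is a deferred pipeline property, not a concrete S3 URI, and it only resolves once the pipeline actually runs. Printing its expression makes that visible (the step name inside the Get path will be whatever you named your training step):

# The reference is a placeholder expression, not a real S3 path yet;
# Sagemaker substitutes the actual artifact URI at execution time.
print(train_step.properties.ModelArtifacts.S3ModelArtifacts.expr)
# -> {'Get': 'Steps.<YourTrainingStepName>.ModelArtifacts.S3ModelArtifacts'}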

The actual problem is the PropertyFile you're passing to model_statistics. That parameter has to be a MetricsSource object.

It looks like you're trying to reference the evaluation report produced by a Processing Job. In that case you don't need to pass the PropertyFile to model_statistics at all; just point a MetricsSource at the output of the job directly:

from sagemaker.model_metrics import MetricsSource, ModelMetrics

# step_eval is your evaluation ProcessingStep. Resolve the S3 prefix of
# its first output and append the report file name (matching the path
# you gave the PropertyFile above).
model_metrics = ModelMetrics(
    model_statistics=MetricsSource(
        s3_uri="{}/evaluation_report.json".format(
            step_eval.arguments["ProcessingOutputConfig"]["Outputs"][0]["S3Output"]["S3Uri"]
        ),
        content_type="application/json",
    )
)
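As an aside, the PropertyFile itself isn't wasted; its job is to let the pipeline read individual values out of the evaluation report, which is exactly what your condition step needs. For that to work, the file has to be registered on the evaluation step via property_files, and you then index into it with JsonGet. Here's a minimal sketch; the json_path is an assumption and should match the structure of your own report, and note that in newer SDK versions JsonGet lives in sagemaker.workflow.functions and takes step_name instead of step:

from sagemaker.workflow.condition_step import JsonGet
from sagemaker.workflow.conditions import ConditionLessThanOrEqualTo

# Register the PropertyFile on the evaluation step so the pipeline
# knows which output file to index into:
#   step_eval = ProcessingStep(..., property_files=[evaluation_report])

# Read a value out of evaluation_report.json at runtime and compare it
# against a threshold (the json_path below is hypothetical).
cond_lte = ConditionLessThanOrEqualTo(
    left=JsonGet(
        step=step_eval,
        property_file=evaluation_report,
        json_path="regression_metrics.mse.value",
    ),
    right=6.0,
)

cond_lte then plugs into the ConditionStep exactly as in your snippet above.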