Integrating with Amazon Machine Learning

Retraining is the process of providing new data to a model in an attempt to keep it accurate as the distribution of actual outcomes drifts over time. Like most application development, implementing a machine learning model is not a one-time activity; it is best practice to continuously monitor your model and retrain it if new observations begin to deviate from the original training data distributions.
To retrain a model in Amazon, you need to create a completely new model with your updated data set. Avoid hard-coding model IDs in your Appian applications so that updating your applications after retraining requires changing only a single object, such as a constant or connected system.
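One way to decide when retraining is due is to compare the share of each target value in recent observations against the original training data. The sketch below is only an illustration of that idea: the function names and the 0.10 threshold are assumptions, not part of Amazon ML or Appian.

```python
from collections import Counter


def target_distribution(labels):
    """Return the share of each target value in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {value: count / total for value, count in counts.items()}


def max_drift(training_labels, recent_labels):
    """Largest absolute change in share for any single target value."""
    train = target_distribution(training_labels)
    recent = target_distribution(recent_labels)
    values = set(train) | set(recent)
    return max(abs(train.get(v, 0.0) - recent.get(v, 0.0)) for v in values)


def needs_retraining(training_labels, recent_labels, threshold=0.10):
    """Flag the model for retraining when drift exceeds the (assumed) threshold."""
    return max_drift(training_labels, recent_labels) > threshold


# Hypothetical data: 20% positives at training time, 40% in recent observations.
training = [0] * 80 + [1] * 20
recent = [0] * 60 + [1] * 40
print(needs_retraining(training, recent))
```

A check like this only looks at the target attribute; input attributes can drift as well, so in practice you would likely monitor those too.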

A key characteristic of good training data is that it is provided in a format optimized for learning and generalization. The process of putting the data into this format is known in the industry as feature transformation.
Feature transformation can be performed on all types of data (numeric, text, boolean). A simple example is converting all null numeric values to 0; it can also include more complex formulas that normalize data or capture non-linearity in a variable's distribution.
Feature transformation can take place before uploading data to Amazon, or you can use the built-in transformation recipes in the Amazon Machine Learning console. Regardless of the method used, the process should be repeatable so that models can be recreated or retrained easily.
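As an illustration of transforming data before upload, the sketch below replaces null numeric values with 0 and then applies min-max normalization. The function names are hypothetical and this is not one of Amazon's built-in recipes; keeping the steps in small functions is one way to make the process repeatable when retraining.

```python
def fill_nulls(values, default=0.0):
    """Replace missing (None) numeric values with a default."""
    return [default if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def transform_duration(raw):
    """Apply the same two steps every time the data set is rebuilt."""
    return min_max_normalize(fill_nulls(raw))


# Hypothetical "duration" column with one missing value.
durations = [210, None, 465, 180]
print(transform_duration(durations))
```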

To test the accuracy of an ML model, a percentage of the data provided to Amazon is set aside for evaluation. By default, Amazon splits the data so that 70% is used to train the model and 30% is used to evaluate it. The split percentage can be changed when creating the model.
It is important to split the input data so that observations are randomly distributed between the training and evaluation data sources. If either data source is skewed towards a certain target value, the ML model could be skewed and the evaluation may not be indicative of true performance.
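If you split the data yourself rather than letting Amazon do it, a random 70/30 split could look like the sketch below. The helper names, the seed, and the sample rows are assumptions for illustration; comparing the positive rate of both parts is a quick sanity check that neither is skewed towards one target value.

```python
import random


def split_observations(rows, train_fraction=0.7, seed=42):
    """Randomly split rows into training and evaluation sets."""
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    cutoff = round(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


def positive_rate(rows, target="y"):
    """Share of rows whose binary target equals 1."""
    return sum(row[target] for row in rows) / len(rows)


# Hypothetical observations: every fourth row has target y = 1.
rows = [{"id": i, "y": 1 if i % 4 == 0 else 0} for i in range(100)]
train, evaluation = split_observations(rows)
print(len(train), len(evaluation))
print(round(positive_rate(train), 2), round(positive_rate(evaluation), 2))
```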

In Amazon ML, you must shuffle your training data. Shuffling mixes up the order of your data so that the stochastic gradient descent (SGD) algorithm doesn't encounter one type of data for too many observations in succession.
When creating a model via the admin console or the Appian AI Designer shared component wizard, you can indicate whether you would like Amazon to shuffle your data or whether you have already shuffled it.
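If you choose to shuffle the data yourself before upload, the sketch below shows one way to shuffle the rows of a CSV while keeping the header row first. The function name, seed, and sample data are assumptions for illustration only.

```python
import csv
import io
import random


def shuffle_csv(text, seed=7):
    """Shuffle the data rows of a CSV string, leaving the header in place."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    random.Random(seed).shuffle(body)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + body)
    return out.getvalue()


# Hypothetical export that arrives sorted by the target column y.
sample = (
    "age,job,y\n"
    "44,blue-collar,0\n"
    "39,services,0\n"
    "53,technical,1\n"
    "28,management,1\n"
)
print(shuffle_csv(sample))
```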

See Also

Websites:

AML Developer Guide

Tags: integrations, Architecture

Integrating with Amazon Machine Learning

Appian Max Team — Tue, 23 Apr 2024 13:23:29 GMT

Revision 5 posted to Guide by Appian Max Team on 4/23/2024 1:23:29 PM

Note: Amazon ML is no longer available to new Amazon customers

Amazon Machine Learning Models

Amazon Machine Learning (AML) supports three types of ML models. The type of model that Amazon builds depends on the type of target attribute that you want to predict.

Model	Prediction Type	Performance Metric
Regression	Predicts a numeric value	Root Mean Square Error (RMSE)
Binary Classification	Predicts binary values (ex. true or false)	Area Under the Curve (AUC)
Multiclass Classification	Predicts values that belong to a limited, predefined set of permissible values	F1 Score
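To make the three performance metrics concrete, the sketch below computes each one from scratch on small lists. It is a plain-Python illustration and not how Amazon computes them internally; the function names are hypothetical, and the macro-averaged form of F1 used for the multiclass case is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean square error for regression predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def macro_f1(actual, predicted):
    """Average of the per-class F1 scores over the classes present in `actual`."""
    scores = []
    for cls in sorted(set(actual)):
        tp = sum(1 for a, p in zip(actual, predicted) if a == cls and p == cls)
        fp = sum(1 for a, p in zip(actual, predicted) if a != cls and p == cls)
        fn = sum(1 for a, p in zip(actual, predicted) if a == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


print(rmse([0, 0], [3, 4]))
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Lower RMSE is better, while AUC and F1 are better the closer they are to 1.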

Creating Amazon ML Models in Appian

Create an Amazon developer account and an Amazon S3 bucket to store the data you will use to create your model. A credit card is required and you will be charged to create models and make predictions, but costs are relatively insignificant (see AML pricing).
Download Appian AI Designer from shared components and follow the deployment instructions.
1. Note: you will need to have Appian automatically create the database tables by manually publishing the data store after the application import.

age	job	marital	education	default	housing	contact	duration	day_of_w	y
44	blue-collar	married	basic.4y	0	1	cellular	210	thu	0
53	technical	married	unknown	1	0	telephone	180	fri	1
28	management	single	university.degree	0	1	cellular	465	mon	1
39	services	divorced	high.school	0	1	cellular	180	wed	0

Navigate to: https://<your.server>/suite/sites/aml and follow the sites wizard to create a new model.
1. On the first tab you can select the S3 bucket created earlier.
2. If you do not plan on using Amazon’s feature transformation formulas, then ensure that any data manipulation has been done before formatting the data into a CSV. See the feature transformation guidance in this guide for more information.

Making Predictions

Evaluating and Adjusting Model Performance

Best Practices


Tags: integrations, Platform, Architecture

Integrating with Amazon Machine Learning

joel.larin — Tue, 31 Oct 2023 19:57:31 GMT

Revision 4 posted to Guide by joel.larin on 10/31/2023 7:57:31 PM


Integrating with Amazon Machine Learning

joel.larin — Tue, 31 Oct 2023 19:53:21 GMT

Revision 3 posted to Guide by joel.larin on 10/31/2023 7:53:21 PM

Making Predictions

Once a model is created, you can make batch predictions or individual real-time predictions. There are two main ways to make real-time predictions within Appian: you can use the shared component function AML_getRealtimePrediction, or you can use the Amazon connected system object in Appian versions 18.2 or later. The AML_getRealtimePrediction function takes a model ID and two parallel arrays that hold attribute names and attribute values. If you use this function, it is recommended to create a mapping rule that takes in a CDT and converts its values into a text array to be passed into AML_getRealtimePrediction. Before creating a connected system or a rule to call the API, you can test real-time predictions from the AML admin console or from the machine learning model record in the Appian AI Designer site. It is recommended to test the predictions and evaluate the model before deciding to move forward with an initial model.
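The mapping described above, from a structured record to two parallel arrays of attribute names and text values, can be sketched outside Appian as follows. This is Python rather than an Appian expression rule, and the Customer record and function name are hypothetical stand-ins for a CDT and its mapping rule.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """Hypothetical record standing in for an Appian CDT."""
    age: int
    job: str
    marital: str
    duration: int


def to_parallel_arrays(record):
    """Return (names, values) where values[i] is the text form of names[i]."""
    names = [f.name for f in fields(record)]
    values = [
        "" if getattr(record, name) is None else str(getattr(record, name))
        for name in names
    ]
    return names, values


names, values = to_parallel_arrays(Customer(44, "blue-collar", "married", 210))
print(names)
print(values)
```

Deriving both arrays from the same field list keeps them the same length and in the same order, which is what a parallel-array interface relies on.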


Integrating with Amazon Machine Learning

joel.larin — Tue, 31 Oct 2023 19:47:48 GMT

Revision 2 posted to Guide by joel.larin on 10/31/2023 7:47:48 PM
