Once you deploy the service, you can navigate to the API Reference (Swagger definition) of your newly created service at https://<service_name>.azurewebsites.net/swagger. You will get the root URL upon deploying your service.
The Swagger UI is the authoritative definition of the API interface, but a description of the API reference is duplicated here so you can understand what the service looks like before you deploy it.
When calling the service, you will need to set authentication headers with the API key that is provided as part of the set-up process.
If you did not copy the keys at the time of setting up the preconfigured solution, you can find those by:
- Logging into the Azure Portal
- Navigating to the resource group that was created by the solution. The resource group is named after the deployment name you provided when you set up the solution.
- Navigating to the App Service.
- Clicking Application Settings.
- Locating the keys under the App settings section. They are called AdminPrimaryKey, AdminSecondaryKey, RecommendPrimaryKey and RecommendSecondaryKey.
The primary and secondary Admin keys can be used for all API operations. The primary and secondary Recommend keys can be used only to get recommendations; they are typically used in client applications or websites that request recommendations.
Sample authentication header:
x-api-key: <yourKeyGoesHere>
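As a minimal sketch of setting this header from Python's standard library: `build_request` is a hypothetical helper, and `my-service` stands in for your own service name. The request is only constructed here, not sent.

```python
import urllib.request


def build_request(service_name, path, api_key, method="GET"):
    """Create (but do not send) a request that carries the x-api-key header."""
    url = f"https://{service_name}.azurewebsites.net{path}"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method=method)


# "my-service" and the key value are placeholders for illustration only.
request = build_request("my-service", "/api/models", "yourKeyGoesHere")
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call. A Recommend key should be enough for the recommendation endpoints, while the other operations need an Admin key.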
POST /api/models
Starts the process of training a new model that can later be queried for recommendations. Before you start training a model, you must upload usage data, and optionally catalog and evaluation files, to the new blob storage account created by the preconfigured solution.
Training a new model is an asynchronous operation. The response to this HTTP request contains a Location header that references the newly created model, and the body of the response contains the model itself. You can use the Get Model API to query for the model training status along with other model information.
Model metrics are computed only if evaluationUsageRelativePath is provided. See Model Evaluation for more details.
The request body should be a valid Model Training Parameters JSON object.
The response body will contain a Model JSON object.
Sample Request:
POST https://<service_name>.azurewebsites.net/api/models x-api-key: your_api_key Content-Type: application/json { "description": "Simple recommendations model", "blobContainerName": "input-files", "usageRelativePath": "booksTrainUsage", "catalogFileRelativePath": "books.csv", "evaluationUsageRelativePath": "booksTestUsage", "supportThreshold": 6, "cooccurrenceUnit": "User", "similarityFunction": "Jaccard", "enableColdItemPlacement": true, "enableColdToColdRecommendations": false, "enableUserAffinity": true, "allowSeedItemsInRecommendations": true, "enableBackfilling": true, "decayPeriodInDays": 30 }
Sample Response:
HTTP/1.1 201 Created Content-Type: application/json; charset=utf-8 Location: https://<service_name>.azurewebsites.net/api/models/e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb { "id":"e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb", "description": "Simple recommendations model", "creationTime":"2017-05-04T00:26:06.386324Z", "modelStatus":"Created" }
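The request above could be assembled in Python roughly as follows. This is a sketch under assumptions: `build_training_request` is a hypothetical helper, the service name is a placeholder, and the client-side check simply mirrors the two properties that the Model Training Parameters schema marks as mandatory.

```python
import json
import urllib.request

# Properties marked mandatory in the Model Training Parameters schema.
MANDATORY_PARAMETERS = ("blobContainerName", "usageRelativePath")


def build_training_request(service_name, api_key, parameters):
    """Create (but do not send) a POST /api/models request."""
    missing = [name for name in MANDATORY_PARAMETERS if not parameters.get(name)]
    if missing:
        raise ValueError(f"missing mandatory parameters: {missing}")
    return urllib.request.Request(
        f"https://{service_name}.azurewebsites.net/api/models",
        data=json.dumps(parameters).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_training_request(
    "my-service",  # placeholder service name
    "your_api_key",
    {"blobContainerName": "input-files", "usageRelativePath": "booksTrainUsage"},
)
# Sending it (not done here) would look like:
#   with urllib.request.urlopen(request) as response:
#       model = json.load(response)                 # the Model JSON object
#       model_url = response.headers["Location"]    # URL of the new model
```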
If you would like to train a "Frequently Bought Together" model:
- Set “enableUserAffinity” to false, since you don’t need the time of the event or the event type to be used as inputs into the recommendation request.
- Set “supportThreshold” to the minimum number of times items must appear together in a transaction to be used in the model.
- Set “cooccurrenceUnit”. Set it to “Timestamp” if you want baskets to be modelled on transactions that occurred at the same time at check-out. If that is too strict, set it to “User”.
- Set the desired “similarityFunction”. We recommend starting with Jaccard.
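The settings in the list above can be collected into a request body. The sketch below is illustrative only: `frequently_bought_together_parameters` is a hypothetical helper, and the 3-50 range check is taken from the supportThreshold row of the Model Training Parameters schema.

```python
def frequently_bought_together_parameters(container, usage_path,
                                          support_threshold=6,
                                          same_checkout_only=True):
    """Build training parameters for a "Frequently Bought Together" model."""
    if not 3 <= support_threshold <= 50:
        raise ValueError("supportThreshold must be between 3 and 50")
    return {
        "blobContainerName": container,
        "usageRelativePath": usage_path,
        # Event time and event type are not needed for this kind of model.
        "enableUserAffinity": False,
        "supportThreshold": support_threshold,
        # "Timestamp" groups items bought at the same time; "User" is looser.
        "cooccurrenceUnit": "Timestamp" if same_checkout_only else "User",
        "similarityFunction": "Jaccard",
    }


parameters = frequently_bought_together_parameters("input-files", "booksTrainUsage")
```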
The catalog file contains information about the items you are offering to your customers. The catalog data uses the following format:
- Without features -
<Item Id>,<Item Name>,<Item Category>[,<Description>]
- With features -
<Item Id>,<Item Name>,<Item Category>,[<Description>],<Features list>
Without features:
AAA04294,Office Language Pack Online DwnLd,Office
AAA04303,Minecraft Download Game,Games
C9F00168,Kiruna Flip Cover,Accessories
With features:
AAA04294,Office Language Pack Online DwnLd,Office,, softwaretype=productivity, compatibility=Windows
BAB04303,Minecraft DwnLd,Games, softwaretype=gaming,, compatibility=iOS, agegroup=all
C9F00168,Kiruna Flip Cover,Accessories, compatibility=lumia,, hardwaretype=mobile
Name | Mandatory | Type | Description |
---|---|---|---|
Item Id | Yes | [A-Z], [a-z], [0-9], [_] (Underscore), [-] (Dash); Max length: 450 | Unique identifier of an item. |
Item Name | Yes | Any alphanumeric characters; Max length: 255 | Item name. |
Item Category | Yes | Any alphanumeric characters; Max length: 255 | Category to which this item belongs (e.g. Cooking Books, Drama…); can be empty. |
Description | No, unless features are present (but can be empty) | Any alphanumeric characters; Max length: 4000 | Description of this item. |
Features list | No | Any alphanumeric characters; Max length: 4000; Max number of features: 20 | Comma-separated list of feature name=feature value pairs that can be used to enhance model recommendations. |
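To check a catalog file before uploading it, a small client-side parser along these lines may help. It is a sketch, not the service's own parser: `parse_catalog_line` is a hypothetical helper, the use of Python's `csv` module (and therefore CSV quoting rules) is an assumption, and the limits are the ones listed in the table above.

```python
import csv
import re

# Item ids: letters, digits, underscore and dash, at most 450 characters.
ITEM_ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]{1,450}$")
MAX_FEATURES = 20


def parse_catalog_line(line):
    """Split one catalog row into id, name, category, description and features."""
    fields = next(csv.reader([line]))
    if len(fields) < 3:
        raise ValueError("expected at least item id, item name and item category")
    item_id, name, category = (field.strip() for field in fields[:3])
    if not ITEM_ID_PATTERN.match(item_id):
        raise ValueError(f"invalid item id: {item_id!r}")
    description = fields[3].strip() if len(fields) > 3 else ""
    features = {}
    for raw in fields[4:]:
        raw = raw.strip()
        if not raw:
            continue  # tolerate empty feature slots
        feature_name, separator, value = raw.partition("=")
        if not separator:
            raise ValueError(f"malformed feature: {raw!r}")
        features[feature_name.strip()] = value.strip()
    if len(features) > MAX_FEATURES:
        raise ValueError("too many features")
    return {"id": item_id, "name": name, "category": category,
            "description": description, "features": features}


item = parse_catalog_line(
    "AAA04294,Office Language Pack Online DwnLd,Office,, "
    "softwaretype=productivity, compatibility=Windows"
)
```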
The recommendations engine creates a statistical model that tells you what items are likely to be liked or purchased by a customer. When you have a new product that has never been interacted with, it is not possible to create a model on co-occurrences alone. Say you start offering a new "children's violin" in your store. Since you have never sold that violin before, you cannot tell what other items to recommend with it.
That said, if the engine knows information about that violin (e.g. it is a musical instrument, it is for children ages 7-10, it is not an expensive violin), then the engine can learn from other products with similar features. For instance, you have sold violins in the past, and people who buy violins tend to buy classical music CDs and sheet music stands. The system can find these connections between the features and provide recommendations based on the features while your new violin has little usage.
Features are imported as part of the catalog data. The SAR algorithm that is used to train the model will automatically detect the strength of each of the features based on the transaction data.
You should create features that resemble a category. For instance, price=9.34 is not a categorical feature, whereas priceRange=Under5Dollars is. Another common mistake is to use the name of the item as a feature. Because an item's name is unique, it does not describe a category. Make sure the features represent categories of items.
You should use fewer than 20 features.
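One way to turn a numeric value such as a price into a categorical feature is to bucket it. The sketch below is purely illustrative: `price_range_feature` and every bucket boundary and label other than Under5Dollars are made up for this example.

```python
# Hypothetical buckets: (exclusive upper bound, label).
PRICE_BUCKETS = [
    (5, "Under5Dollars"),
    (20, "5To20Dollars"),
    (100, "20To100Dollars"),
]


def price_range_feature(price):
    """Map a numeric price to a categorical priceRange feature string."""
    if price < 0:
        raise ValueError("price must not be negative")
    for upper_bound, label in PRICE_BUCKETS:
        if price < upper_bound:
            return f"priceRange={label}"
    return "priceRange=100DollarsAndOver"


feature = price_range_feature(9.34)  # "priceRange=5To20Dollars"
```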
Features are used by the model when there is not enough transaction data to provide recommendations on transaction information alone. So features will have the greatest impact on “cold items” – items with few transactions. If all your items have sufficient transaction information you may not need to enrich your model with features.
A usage file contains information about how the catalog items are used, that is, the transactions from your business.
A usage file is a CSV (comma-separated values) file in which each row represents an interaction between a user and an item. Each row is formatted as follows:
<User Id>,<Item Id>,<Time>[,<Event Type> | ,,<Custom Event Weight>]
Name | Mandatory | Type | Description |
---|---|---|---|
User Id | Yes | [A-Z], [a-z], [0-9], [_] (Underscore), [-] (Dash); Max length: 255 | Unique identifier of a user. |
Item Id | Yes | [A-Z], [a-z], [0-9], [_] (Underscore), [-] (Dash); Max length: 450 | Unique identifier of an item. |
Time | Yes | Date in ISO 8601 format: yyyy-MM-ddTHH:mm:ss (e.g. 2017-04-20T18:00:00) | The time of the event. |
Event Type (*only used in model evaluation) | No | One of the following: Click (weight of 1), RecommendationClick (weight of 2), AddShopCart (weight of 3), RemoveShopCart (weight of -1), Purchase (weight of 4). Defaults to Click. | The type of transaction. This is used to determine the event strength. *Used only if Custom Event Weight is not provided. |
Custom Event Weight (*only used in model evaluation) | No | number | The transaction strength. *If provided, Event Type will be ignored. |
00037FFEA61FCA16,288186200,2015-08-14T11:02:52,Click
0003BFFDD4C2148C,297833400,2015-08-14T11:02:50,Purchase
0003BFFDD4C2118D,297833300,2015-08-14T11:02:40,,5.2
00030000D16C4237,297833300,2015-08-14T11:02:37,Purchase
0003BFFDD4C20B63,297833400,2015-08-14T11:02:12,Click
00037FFEC8567FB8,297833400,2015-08-14T11:02:04
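The row format and the weight rules above can be sketched as a small parser. This is an illustration rather than the service's own logic: `parse_usage_line` is a hypothetical helper, and only the event weights and the Click default come from the table.

```python
import csv
from datetime import datetime

# Event type weights as listed in the usage file table.
EVENT_WEIGHTS = {
    "Click": 1.0,
    "RecommendationClick": 2.0,
    "AddShopCart": 3.0,
    "RemoveShopCart": -1.0,
    "Purchase": 4.0,
}


def parse_usage_line(line):
    """Return (user_id, item_id, timestamp, weight) for one usage row."""
    fields = [field.strip() for field in next(csv.reader([line]))]
    if len(fields) < 3:
        raise ValueError("expected at least user id, item id and time")
    user_id, item_id, raw_time = fields[:3]
    timestamp = datetime.strptime(raw_time, "%Y-%m-%dT%H:%M:%S")
    event_type = fields[3] if len(fields) > 3 and fields[3] else "Click"
    custom_weight = fields[4] if len(fields) > 4 and fields[4] else None
    if custom_weight is not None:
        weight = float(custom_weight)  # a custom weight overrides the event type
    elif event_type in EVENT_WEIGHTS:
        weight = EVENT_WEIGHTS[event_type]
    else:
        raise ValueError(f"unknown event type: {event_type!r}")
    return user_id, item_id, timestamp, weight


custom = parse_usage_line("0003BFFDD4C2118D,297833300,2015-08-14T11:02:40,,5.2")
default = parse_usage_line("00037FFEC8567FB8,297833400,2015-08-14T11:02:04")
```

Here `custom` carries a weight of 5.2 taken from the fifth field, while `default` falls back to the Click weight of 1.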
GET /api/models/{modelId}
Returns the model information, including the training status, parameters used to build the model, the time of creation, statistics about the model and evaluation results (See Model JSON Schema).
Sample Request:
GET https://<service_name>.azurewebsites.net/api/models/e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb x-api-key: your_api_key
Sample Response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 { "id":"e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb", "description": "Simple recommendations model", "creationTime":"2017-05-04T00:26:06.386324Z", "modelStatus": "Completed", "modelStatusMessage": "Model Training Completed Successfully", "parameters": { ... }, "statistics": { ... } }
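Because training is asynchronous, a client typically polls this endpoint until the status settles. The following is a sketch under assumptions: `wait_for_training` is a hypothetical helper, `fetch_model` is any callable you supply that performs the GET and returns the parsed Model JSON object, and the polling interval is arbitrary. The example drives it with canned responses instead of real HTTP calls.

```python
import time


def wait_for_training(fetch_model, poll_seconds=10.0, max_polls=60, sleep=time.sleep):
    """Poll until modelStatus is Completed; raise if it is Failed or never settles."""
    for attempt in range(max_polls):
        model = fetch_model()
        status = model["modelStatus"]
        if status == "Completed":
            return model
        if status == "Failed":
            raise RuntimeError(model.get("modelStatusMessage", "model training failed"))
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # Created or InProgress: wait and try again
    raise TimeoutError("model training did not finish in time")


# Canned responses standing in for successive GET /api/models/{modelId} calls.
responses = iter([
    {"modelStatus": "Created"},
    {"modelStatus": "InProgress"},
    {"modelStatus": "Completed",
     "modelStatusMessage": "Model Training Completed Successfully"},
])
model = wait_for_training(lambda: next(responses), sleep=lambda seconds: None)
```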
GET /api/models
Returns the id, description, creation time and status of all the existing models. (See Model JSON Schema).
Sample Request:
GET https://<service_name>.azurewebsites.net/api/models x-api-key: your_api_key
Sample Response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 [{ "id":"e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb", "description": "Simple recommendations model", "creationTime":"2017-05-04T00:26:06.386324Z", "modelStatus": "Completed" }, { "id": "1bd76c6d-0a25-4d9a-8111-9e897823ae1f", "description": "my other model", "creationTime": "2017-05-04T00:26:51.1024762Z", "modelStatus": "Created" }]
DELETE /api/models/{modelId}
Deletes the specified model.
Important: If you delete the default model, there will be no default model until one is set again, and any recommendation requests to the default model will fail.
Sample Request:
DELETE https://<service_name>.azurewebsites.net/api/models/e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb x-api-key: your_api_key
Sample Response:
HTTP/1.1 202 Accepted Content-Length: 0
GET /api/models/{modelId}/recommend
Gets item-to-item recommendations for the model specified.
Empty recommendations may be returned if none of the items are in the catalog or if the model did not have sufficient training data to provide recommendations for the item.
Request parameters
Request Parameter | Description | Valid Values | Default Value |
---|---|---|---|
itemId | Seed item Id | string | |
recommendationCount | Number of recommended items to return | 1-100 | 10 |
The response body will be a JSON array of Recommendation Objects.
Sample Request:
GET https://<service_name>.azurewebsites.net/api/models/e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb/recommend?itemId=70322 x-api-key: your_api_key
Sample Response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 [{ "recommendedItemId": "46846", "score": 0.45787626504898071 }, { "recommendedItemId": "46845", "score": 0.12505614757537842 }, ... { "recommendedItemId": "41607", "score": 0.049780447036027908 }]
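A sketch of building the request URL and reading the response in Python follows. `item_recommendations_url` and `recommended_item_ids` are hypothetical helpers, the service name is a placeholder, and the range check mirrors the 1-100 limit from the request parameters table.

```python
import json
from urllib.parse import quote, urlencode


def item_recommendations_url(service_name, model_id, item_id, recommendation_count=10):
    """Build the GET URL for item-to-item recommendations."""
    if not 1 <= recommendation_count <= 100:
        raise ValueError("recommendationCount must be between 1 and 100")
    query = urlencode({"itemId": item_id, "recommendationCount": recommendation_count})
    return (f"https://{service_name}.azurewebsites.net/api/models/"
            f"{quote(model_id, safe='')}/recommend?{query}")


def recommended_item_ids(response_body):
    """Extract item ids from a JSON array of Recommendation Objects."""
    return [entry["recommendedItemId"] for entry in json.loads(response_body)]


url = item_recommendations_url(
    "my-service", "e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb", "70322")
ids = recommended_item_ids(
    '[{"recommendedItemId": "46846", "score": 0.45787626504898071},'
    ' {"recommendedItemId": "46845", "score": 0.12505614757537842}]')
```

An empty array is a valid response, so callers should be prepared for `recommended_item_ids` to return an empty list.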
POST /api/models/{modelId}/recommend
Gets personalized recommendations for the model specified, given a set of recent usage events for a particular user.
An optional user id may also be specified, in which case the usage events of that user, extracted during model training from the input usage files, will also be considered.
Note: If a user id is provided, additional usage events may still be provided in the request body, representing more recent user activity.
Request parameters
Request Parameter | Description | Valid Values | Default Value |
---|---|---|---|
userId | The id of a user to get recommendations for. The user id is ignored unless the model was trained with enableUserToItemRecommendations set to true. See Model Training Parameters Schema for more info. | string | null |
recommendationCount | Number of recommended items to return | 1-100 | 10 |
The request body must be an array of Get Recommendations Usage Events.
The response body will be a JSON array of Recommendation Objects.
Sample Request:
POST https://<service_name>.azurewebsites.net/api/models/e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb/recommend?userId=user032669023 x-api-key: your_api_key [ { "itemId": "ItemId123", "eventType": "Click", "timestamp": "2017-01-31T23:59:59" }, { "itemId": "ItemId456", "eventType": "Purchase" }, { "itemId": "ItemId789", "weight": 2.3 }, { "itemId": "ItemId135" } ]
Sample Response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 [{ "recommendedItemId": "46846", "score": 0.45787626504898071 }, { "recommendedItemId": "46845", "score": 0.12505614757537842 }, ... { "recommendedItemId": "41607", "score": 0.049780447036027908 }]
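The request body can be assembled from small event dictionaries. In the sketch below, `usage_event` is a hypothetical helper; the property names and allowed event types come from the Get Recommendations Usage Events schema, and omitted properties are left for the service to default.

```python
import json
from datetime import datetime

EVENT_TYPES = {"Click", "RecommendationClick", "AddShopCart", "RemoveShopCart", "Purchase"}


def usage_event(item_id, event_type=None, weight=None, timestamp=None):
    """Build one usage event; a custom weight takes precedence over the event type."""
    event = {"itemId": item_id}
    if weight is not None:
        event["weight"] = float(weight)
    elif event_type is not None:
        if event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type!r}")
        event["eventType"] = event_type
    if timestamp is not None:
        event["timestamp"] = timestamp.strftime("%Y-%m-%dT%H:%M:%S")
    return event


events = [
    usage_event("ItemId123", "Click", timestamp=datetime(2017, 1, 31, 23, 59, 59)),
    usage_event("ItemId456", "Purchase"),
    usage_event("ItemId789", weight=2.3),
    usage_event("ItemId135"),
]
body = json.dumps(events)  # send as the body of the POST request
```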
PUT /api/models/default
Sets the specified model id as the default model.
Request parameters:
Request Parameter | Description | Valid Value |
---|---|---|
modelId | The model id to set as the default model | GUID |
Sample Request:
PUT https://<service_name>.azurewebsites.net/api/models/default?modelId=e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb x-api-key: your_api_key
Sample Response:
HTTP/1.1 200 OK
GET /api/models/default
To use this API, a default model must first be set.
Returns the default model information, including the training status, parameters used to build the model, the time of creation, statistics about the model and evaluation results (See Model JSON Schema).
Sample Request:
GET https://<service_name>.azurewebsites.net/api/models/default x-api-key: your_api_key
Sample Response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 { "id":"e16198c0-3a72-4f4d-b8ab-e4c07c9bccdb", "description": "Simple recommendations model", "creationTime":"2017-05-04T00:26:06.386324Z", "modelStatus": "Completed", "modelStatusMessage": "Model Training Completed Successfully", "parameters": { ... }, "statistics": { ... } }
GET /api/models/default/recommend
To use this API, a default model must first be set.
Gets item-to-item recommendations for the default model.
Empty recommendations may be returned if none of the items are in the catalog or if the model did not have sufficient training data to provide recommendations for the item.
Request parameters
Request Parameter | Description | Valid Values | Default Value |
---|---|---|---|
itemId | Seed item Id | string | |
recommendationCount | Number of recommended items to return | 1-100 | 10 |
The response body will be a JSON array of Recommendation Objects.
Sample Request:
GET https://<service_name>.azurewebsites.net/api/models/default/recommend?itemId=70322 x-api-key: your_api_key
Sample Response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 [{ "recommendedItemId": "46846", "score": 0.45787626504898071 }, { "recommendedItemId": "46845", "score": 0.12505614757537842 }, ... { "recommendedItemId": "41607", "score": 0.049780447036027908 }]
POST /api/models/default/recommend
To use this API, a default model must first be set.
Gets personalized recommendations for the default model, given a set of recent usage events for a particular user. The request body should be an array of Get Recommendations Usage Events. The response body will be a JSON array of Recommendation Objects.
Sample Request:
POST https://<service_name>.azurewebsites.net/api/models/default/recommend x-api-key: your_api_key [ { "itemId": "ItemId123", "eventType": "Click", "timestamp": "2017-01-31T23:59:59" }, { "itemId": "ItemId456", "eventType": "Purchase" }, { "itemId": "ItemId789", "weight": 2.3 }, { "itemId": "ItemId135" } ]
Sample Response:
HTTP/1.1 200 OK Content-Type: application/json; charset=utf-8 [{ "recommendedItemId": "46846", "score": 0.45787626504898071 }, { "recommendedItemId": "46845", "score": 0.12505614757537842 }, ... { "recommendedItemId": "41607", "score": 0.049780447036027908 }]
The following table specifies the schema of the model JSON object:
Property Name | Type | Description |
---|---|---|
id | string | The model id |
description | string | A textual description of the model, as provided when the model was created |
creationTime | Date Time | The model creation UTC time and date |
modelStatus | string | The status of the model. Created: the initial status of a new model; InProgress: the model is being trained; Completed: model training completed successfully; Failed: model training failed |
modelStatusMessage | string | An optional message associated with the model status |
parameters | Model Training Parameters | The parameters used for training the model, as provided when the model was created |
statistics | Model Training Statistics | The model training statistics, specifying metrics like training duration and model quality |
The following table specifies the schema of the model training parameters JSON object (see the Mandatory? column for required properties):
Property Name | Mandatory? | Description | Type | Default Value |
---|---|---|---|---|
description | no | Textual description | string with max length of 256 characters | null |
blobContainerName | yes | Name of the container where the catalog, usage data and evaluation data are stored. Note that this container must be in the storage account created by the preconfigured solution | string | |
usageRelativePath | yes | Relative path to either a virtual directory that contains the usage file(s) or a specific usage file to be used for training. See Usage events file format | string | |
catalogFileRelativePath | no | Relative path to the catalog file. See Catalog file format | string | null |
evaluationUsageRelativePath | no | Relative path to either a virtual directory that contains the usage file(s) or to a specific usage file to be used for evaluation. See Usage events file format | string | null |
supportThreshold | no | How conservative the model is. Number of co-occurrences of items to be considered for modeling. | 3-50 | 6 |
cooccurrenceUnit | no | Indicates how to group usage events before counting co-occurrences. A 'User' cooccurrence unit will consider all items purchased by the same user as occurring together in the same session. A 'Timestamp' cooccurrence unit will consider all items purchased by the same user at the same time as occurring together in the same session. | User, Timestamp | User |
similarityFunction | no | Defines the similarity function to be used by the model. Lift favors serendipity, Co-occurrence favors predictability, and Jaccard is a nice compromise between the two. | Cooccurrence, Lift, Jaccard | Jaccard |
enableColdItemPlacement | no | Indicates if recommendations should also push cold items via feature similarity. | True, False | False |
enableColdToColdRecommendations | no | Indicates whether the similarity between pairs of cold items (catalog items without usage) should be computed. If set to false, only similarity between cold and warm items will be computed, using catalog item features. Note that this configuration is only relevant when enableColdItemPlacement is set to true. | True, False | False |
enableUserAffinity | no | For personalized recommendations, it defines whether the event type and the time of the event should be considered as input into the scoring. | True, False | False |
enableBackfilling | no | Backfill with popular items when the system does not find sufficient recommendations. | True, False | True |
allowSeedItemsInRecommendations | no | Allow seed items (items in the input or in the user history) to be returned as recommendation results. | True, False | False |
decayPeriodInDays | no | The decay period in days. The strength of the signal for events that are that many days old will be half that of the most recent events. | Integer | 30 |
enableUserToItemRecommendations | no | If true, userId is honored when requesting personalized recommendations. Training takes a bit longer when this is enabled. | True, False | False |
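To make the decayPeriodInDays description concrete, the sketch below treats the decay period as a half-life. The table only states that events one decay period old count half as much as the most recent ones; the exponential shape in between, and the helper name `decayed_weight`, are assumptions for illustration.

```python
def decayed_weight(weight, age_in_days, decay_period_in_days=30):
    """Scale an event weight so that it halves every decay period (assumed shape)."""
    if decay_period_in_days <= 0:
        raise ValueError("decay period must be positive")
    if age_in_days < 0:
        raise ValueError("event age must not be negative")
    return weight * 0.5 ** (age_in_days / decay_period_in_days)


fresh_purchase = decayed_weight(4.0, age_in_days=0)        # 4.0
month_old_purchase = decayed_weight(4.0, age_in_days=30)   # 2.0
```

Under this reading, a shorter decay period makes the model favor recent activity more strongly.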
The following table specifies the schema of the model training statistics JSON object:
Property Name | Type | Description |
---|---|---|
totalDuration | time span | The total duration of the model training process |
trainingDuration | time span | The duration of the core model training |
catalogParsing | Parsing Report Schema | The catalog file parsing report |
usageEventsParsing | Parsing Report Schema | The usage events file(s) parsing report |
numberOfCatalogItems | number | The total number of items found in the catalog file |
numberOfUsers | number | The total number of unique users found in the usage events file(s) |
numberOfUsageItems | number | The total number of valid (which are present in catalog if provided) unique items found in usage file(s) |
catalogCoverage | number | The ratio of unique items found in usage file(s) and total items in catalog |
evaluation | Model Evaluation Schema | The model evaluation metrics |
catalogFeatureWeights | Catalog Feature Weights Schema | The calculated catalog feature's weights |
The following table specifies the schema of the catalog/usage file(s) parsing report JSON object:
Property Name | Type | Description |
---|---|---|
duration | time span | The total parsing duration |
errors | An array of Parsing Error Schema | A list of line parsing errors, if found |
successfulLinesCount | number | The number of lines parsed successfully |
totalLinesCount | number | The total number of lines parsed |
The following table specifies the schema of the parsing error JSON object:
Property Name | Type | Description |
---|---|---|
error | string | The type of the parsing error: MalformedLine - The line is in an invalid CSV format MissingFields - The line is missing some mandatory fields BadTimestampFormat - The time stamp field is malformed BadWeightFormat - The event weight field is not numeric MalformedCatalogItemFeature - Some catalog item feature has a malformed format ItemIdTooLong - The item id string is longer than the maximum allowed IllegalCharactersInItemId - The item id string contains invalid characters. UserIdTooLong - The user id string is longer than the maximum allowed IllegalCharactersInUserId - The user id string contains invalid characters. UnknownItemId - The item id doesn't appear in the catalog DuplicateItemId - The item id appears more than once in the catalog |
count | number | The number of occurrences of this particular error |
sample | Parsing Error Sample Schema | A sample of an occurrence of this particular error |
The following table specifies the schema of the parsing error sample JSON object:
Property Name | Type | Description |
---|---|---|
file | string | The name of the file containing the parsing error |
line | number | The line number of the parsing error |
The following table specifies the schema of the model evaluation JSON object:
Property Name | Type | Description |
---|---|---|
duration | time span | The total duration of the model evaluation process |
usageEventsParsing | Parsing Report Schema | The evaluation usage events file(s) parsing report |
metrics | Model Evaluation Metrics Schema | The model evaluation metrics |
The following table specifies the schema of the model evaluation metrics JSON object:
Property Name | Type | Description |
---|---|---|
precisionMetrics | Array of Precision Metric objects | Precision@K metrics for a model. These measure the quality of the model. The input data is split into test and training data; the test period is then used to evaluate what percentage of customers would have actually clicked on a recommendation if k recommendations had been shown to them, given their prior history. |
diversityMetrics | Model Diversity Metrics | The model diversity metrics. Diversity gives customers a sense of how diverse the item recommendations are, based on their usage, shown by popularity bucket (e.g. 0-90, 90-99, 99-100). In simple terms: how many recommendations come from the most popular items, how many from non-popular items, and how many unique items are recommended. |
The following table specifies the schema of the model precision metric JSON object:
Property Name | Type | Description |
---|---|---|
k | number | The value K used to calculate the metric values |
percentage | number | Precision@K percentage |
usersInTest | number | The total number of users found in the test dataset |
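One reading of the Precision@K description (the share of test users who interacted with at least one of their top k recommendations) can be sketched as follows. This is an interpretation for illustration, not the service's evaluation code; `precision_at_k` and the input dictionaries are hypothetical.

```python
def precision_at_k(recommended, held_out, k):
    """Percentage of test users with at least one hit among their top-k recommendations.

    recommended: user id -> ordered list of recommended item ids
    held_out:    user id -> item ids the user interacted with in the test period
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not held_out:
        return 0.0
    hits = sum(
        1
        for user, items in held_out.items()
        if set(recommended.get(user, [])[:k]) & set(items)
    )
    return 100.0 * hits / len(held_out)


recommended = {"u1": ["a", "b", "c"], "u2": ["d", "e"]}
held_out = {"u1": {"b"}, "u2": {"z"}}
at_1 = precision_at_k(recommended, held_out, k=1)  # 0.0: no top-1 item was used
at_2 = precision_at_k(recommended, held_out, k=2)  # 50.0: u1 used "b"
```

In this sketch, `len(held_out)` plays the role of usersInTest.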
The following table specifies the schema of the model diversity metric JSON object:
Property Name | Type | Description |
---|---|---|
percentileBuckets | Array of Diversity Percentile Bucket | The diversity metrics for all of the popularity buckets |
totalItemsRecommended | number | Total number of items recommended (some may be duplicates) |
uniqueItemsRecommended | number | The total number of distinct items that were returned for evaluation |
uniqueItemsInTrainSet | number | The total number of distinct items in the train dataset |
The following table specifies the schema of the model diversity percentile bucket JSON object:
Property Name | Type | Description |
---|---|---|
min | number | The beginning percentile of the popularity bucket (inclusive) |
max | number | The ending percentile of the popularity bucket (exclusive) |
percentage | number | The fraction of all recommended items that belong to the specified popularity bucket |
The following table specifies the schema of the catalog feature weights JSON object:
Property Name | Type | Description |
---|---|---|
feature name | string | Feature name as it appears in the catalog item features. See Feature List Schema |
feature weight | number | The calculated weight of the feature. Features with higher absolute value indicate greater significance in the model when determining cold items correlations (cold items are catalog items that have no usage events). Negative weights indicate a reverse correlation, i.e. items that share the same value of a feature with a negative weight are considered less correlated |
The following table specifies the schema of the usage event JSON object used in get recommendations requests:
Property Name | Type | Description | Default Value |
---|---|---|---|
itemId | string | An item id to get recommendations for | |
timestamp | ISO 8601 format: yyyy-MM-ddTHH:mm:ss | The timestamp of the event | Current UTC date and time |
eventType | One of the following string values: Click (weight of 1), RecommendationClick (weight of 2), AddShopCart (weight of 3), RemoveShopCart (weight of -1), Purchase (weight of 4) | The event type. This determines the event strength only if weight is not provided | Click |
weight | number | Custom event strength. If provided, eventType will be ignored | 1.0 |
The following table specifies the schema of the recommendation JSON object:
Property Name | Type | Description |
---|---|---|
recommendedItemId | string | The recommended item id |
score | number | The score of this recommendation |