Note: All mentions of the words "tab" and "page" refer to the definitions in the Glossary (unless specified otherwise, e.g. HTML page).

Table of Contents

Base directory (back to Contents)

The HTML document.

The body section is populated by populateHtml.js upon loading the HTML page.

The responsive navigation bar design is taken from:

However, the navigation bar isn't properly responsive as the canvas is bigger than the responsive navigation bar.


populateHtml.js (back to Contents)

A script that populates the body section of index.html.

Specifically, it populates the top-navigation-bar container (<div class="topnav" id="topnav">) and the pages container (<div id='pages'>), and it runs when the HTML page's DOMContentLoaded event is fired.

It creates a page and a tab for every dict in modelInfo (type:array of dict) (found in /configs/CONFIG_modelInfo.js).

The page's title is set to modelInfo[i].pageTitle, the tab name is set to modelInfo[i].tabName, and the page is set to interact with the Google Cloud ML model defined by modelInfo[i].project, .model and .version.

initialisationScript.js (back to Contents)

A script that runs when the HTML page is first loaded. Its jobs include:

  • declaring global variables
  • pinging the CORS-Anywhere proxy and the Google Cloud ML models (to start them up if they're offline)

Global variables

| Global variable | Description |
|---|---|
| xhrDict | Refer to googleApiFunctions.js > sendPayload > Misc. info |
| pagesThatsLoading | Refer to guiFunctions.js > window.setInterval |


What is pinged:

  • the CORS-Anywhere proxy (defined in /configs/CONFIG_misc.js > corsProxy)

  • every model defined in modelInfo (in /configs/CONFIG_modelInfo.js)

For the proxy:
Besides starting it up, pinging the proxy also serves to check whether the proxy is unescaping the headers properly.

It checks by sending an XHR POST via the proxy to an endpoint that returns the XHR request headers (and all other details) back in the XHR response.

The XHR headers are similar to those sent to the models, to simulate the XHRs sent to the models.

You can then check the XHR response in the console (under the collapsed group: > Ping to cors proxy returned) or through the XHR traffic of the browser.

For the models:
The script pings every model defined in modelInfo (in /configs/CONFIG_modelInfo.js) via the proxy, similar to a regular prediction request.

The image data payload sent is the smallest (or almost the smallest) possible image, either in 3D RGB array or base64.

The prediction data returned from the ping will be logged in the console, with a "Ping to "{MODEL_NAME}" returned" below.


> Prediction data returned
  model: "{MODEL_NAME}"
  Ping to "{MODEL_NAME}" returned

configs folder (back to Contents)

Contains configuration files.

CONFIG_credentials.js (back to Contents)

Contains the const credentials (type:dict), which contains the Google Cloud credentials.

This project requires the credentials of a service account that has permissions to request for predictions. (specifically ml.models.predict and ml.versions.predict as documented in — Setting up > Step 1: Getting Google Cloud credentials)
Credentials can be obtained via:
(Detailed instructions found in

The generated credentials JSON file will look something like this:

{
  "type": "service_account",
  "project_id": "my-project-123456",
  "private_key_id": "1a2b3c4d5e6f7g8h9i10j11k12l13m14o",
  "private_key": "-----BEGIN PRIVATE KEY-----\nAbCdE...fGhIj\n-----END PRIVATE KEY-----\n",
  "client_email": "[email protected]",
  "client_id": "1234567891011121314",
  "auth_uri": "",
  "token_uri": "",
  "auth_provider_x509_cert_url": "",
  "client_x509_cert_url": ""
}

To configure, change the key values of credentials (type:dict) to those in your generated credentials JSON file as follows:

| CONFIG_credentials.js dict. key | Generated JSON dict. key | Value for above example |
|---|---|---|
| scope | (don't change) | - |
| clientEmail | client_email | [email protected] |
| clientId | client_id | 1234567891011121314 |
| privateKey | private_key | -----BEGIN PRIVATE KEY-----\nAbCdE...fGhIj\n-----END PRIVATE KEY-----\n |
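
The key mapping in the table above can be sketched as a small helper. This is only an illustration: toCredentials is a hypothetical name and not a function in this project, the values are placeholders, and scope is left out because it should not be changed.

```javascript
// Hypothetical helper (not part of this project): copies the three fields
// listed in the table above from the generated JSON into the key names
// used by CONFIG_credentials.js. `scope` is left out on purpose.
function toCredentials(serviceAccountJson) {
  return {
    clientEmail: serviceAccountJson.client_email,
    clientId: serviceAccountJson.client_id,
    privateKey: serviceAccountJson.private_key,
  };
}

// Placeholder values, shaped like the generated JSON file
const generated = {
  client_email: 'service-account@example.invalid',
  client_id: '1234567891011121314',
  private_key: '-----BEGIN PRIVATE KEY-----\nAbCdE...fGhIj\n-----END PRIVATE KEY-----\n',
};

const credentials = toCredentials(generated);
```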

CONFIG_modelInfo.js (back to Contents)

Contains the const modelInfo (type:array of dict), which contains:
  1. the details of the Google Cloud ML models
  2. the desired display names/titles for each model on the HTML page

modelInfo is an array of dict, with each dict containing the details of a single deployed Google Cloud ML model. Each dict will create its own page.

To configure, insert dict objects into modelInfo (type:array of dict) with the following key values:

Dict. key Type What it is
tabName string the name displayed on page's tab, on the top navigation bar
pageTitle string the displayed page title for this model
project string the Google Cloud ML project name for this model
model string the Google Cloud ML model name for this model
version int the Google Cloud ML version for this model
acceptsBase64 boolean true - if the model accepts base64 encoded images.
false - if it accepts 3D RGB tensor/array
Example of 3D RGB tensor/array
a 3D RGB tensor/array of a 2x2 square
(color: R=1, G=2, B=3)
  [ [[1, 2, 3], [1, 2, 3]],
    [[1, 2, 3], [1, 2, 3]] ]
labelMap array of strings the prediction data returned from the model states the object class/type using an int. This labelMap maps that int to the name of the object.

Note: The first item in labelMap is always null because id 0 in Tensorflow label maps is not used (it's reserved for the background label [refer to line 34 to 38]).
So labelMap[0] is not used as well, as no detection boxes with id 0 are expected.

The prediction data returned will be in this general format:
{ predictions : [
    detection_boxes : [Array(4), Array(4), ... ],
    detection_classes : [1, 2, ... ],
    detection_scores : [0.597..., 0.535..., ... ],
    num_detections : 300,
    ...
] }
As an example, consider this labelMap: labelMap : [null, 'obj1', 'obj2']
and this prediction's detection_classes: detection_classes : [1, 1, 2]

The names of 1st, 2nd and 3rd detection boxes will thus be: obj1, obj1 and obj2 respectively
confidenceThreshold float (range: 0.0 to 1.0) the minimum confidence score a detection box must have before it's drawn/shown. Any boxes with scores < confidenceThreshold won't be shown
displayNames boolean true - show the object classes/names
false - hide the object classes/names (useful when there's only 1 object class and/or there are many boxes in 1 image, to avoid cluttering the image)
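
As a rough illustration of the keys described above, a single modelInfo entry might look like the sketch below. All values (tab name, project, model name, threshold) are placeholders chosen for this example, not values from this project.

```javascript
// Illustrative values only: the project, model and label names are placeholders.
const modelInfo = [
  {
    tabName: 'Solar panels',
    pageTitle: 'Solar panel detector',
    project: 'my-project-123456',
    model: 'my_model',
    version: 1,
    acceptsBase64: true,
    labelMap: [null, 'obj1', 'obj2'], // index 0 is unused (background label)
    confidenceThreshold: 0.5,
    displayNames: true,
  },
];

// Looking up the names for the detection_classes example above
const detectionClasses = [1, 1, 2];
const names = detectionClasses.map((id) => modelInfo[0].labelMap[id]);
// names is ['obj1', 'obj1', 'obj2']
```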

CONFIG_misc.js (back to Contents)

Miscellaneous configurables.

colorPalette (type:array of str) (back to Contents)

The color palette of the detection boxes.

It is an array of str, with its str items being CSS color names/codes.

Similar to labelMap in modelInfo in CONFIG_modelInfo.js, a detection box with class/type i (type:int) will be given the color: colorPalette[i]. Used in drawDetectionBoxes (in drawBoxes.js)

Note: The first item in colorPalette is always null because id 0 in Tensorflow label maps is not used (it's reserved for the background label [refer to line 34 to 38]).
So colorPalette[0] is not used as well, as no detection boxes with id 0 are expected.

acceptedFileExts (type:array of str) (back to Contents)

Contains all the accepted file/image extensions.

corsProxy (type:str) (back to Contents)

Contains the modified CORS-Anywhere proxy URL.

The modified CORS-Anywhere proxy is used to bypass CORS restrictions, and to allow setting of restricted HTTP headers. This is to spoof Google Cloud ML into accepting a prediction request from an unauthorised website.

Refer to googleApiFunctions.js > getPrediction > sendPayload for info on how it's used.

Refer to — Setting up > Step 3: Hosting the website TODOcheckIfWorks for info on how to set up a new proxy, should the current proxy go down.

functions folder (back to Contents)

Contains all the project's functions, which are split into multiple JavaScript modules.

getImageData.js (back to Contents)

Contains the function that:

  • displays the uploaded image (but not the detection boxes)

  • converts the uploaded image to the correct format for sending to Google Cloud ML

  • monkey patches displayScaledImage onto the canvas element object, so that toggleBoxVisibility can redraw the image

getImageArray(pageDiv, toBase64, callback) [in getImageData.js] (back to Contents)

  • draws the uploaded image onto the page's canvas
  • gets the image data in either 3D RGB tensor/array or base64-encoded format
    Example of 3D RGB tensor/array
    a 3D RGB tensor/array of a 2x2 square
    (color: R=1, G=2, B=3)
      [ [[1, 2, 3], [1, 2, 3]],
        [[1, 2, 3], [1, 2, 3]] ]
| Parameter | Type | Description |
|---|---|---|
| pageDiv | HTMLDivElement | <div> HTML element of the page |
| toBase64 | bool | whether or not the model accepts base64 encoded images, defined in acceptsBase64 (type:bool) in modelInfo (in CONFIG_modelInfo.js) |
| callback | function | the callback function to return the formatted image data; expects 2 params, callback(errorMsg, imageData), where errorMsg (type:str) is the error message, and is null when there's no error |

What this function does:

  • read the uploaded image file

  • monkey patch displayScaledImage (with the same displaySize parameter) to the canvas element object as the function redrawImage

  • display the image on the canvas

  • scale down the image if it's estimated to be too big

  • format the image into either 3D RGB array or base64 string

  • return the formatted image data

Inner functions in getImageArray

onBoxSwitch() [Inner function of getImageArray] (back to Contents)

Toggle on the "Toggle box visibility" switch

Image of the switch

This is to ensure the detection boxes are always shown whenever a new image is uploaded.

Reason for doing so
Otherwise the user might toggle off the boxes, send a new image and wonder why there are no boxes (when in reality, it's because the boxes' visibility is toggled off)

displayScaledImage(displaySize) [Inner function of getImageArray] (back to Contents)

Displays the uploaded image file, by scaling it to displaySize and then drawing it on the page's canvas.

The displayed image will have an area equal to (displaySize ^ 2).

Reason for (displaySize ^ 2): I thought it would be easier for one to estimate length rather than area. (ie. easier to estimate width and height of image rather than area)
So displaySize=512 will scale the images to the same area as a 512x512 image.
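
The scaling rule can be sketched as follows. scaledDimensions is a hypothetical helper name, and the sketch assumes the aspect ratio is kept while the area is brought to displaySize squared (note that ^ in the text means "to the power of", whereas in JavaScript ^ is bitwise XOR).

```javascript
// Hypothetical helper: scales (width, height) so that the resulting area
// equals displaySize * displaySize while keeping the aspect ratio.
function scaledDimensions(width, height, displaySize) {
  const scale = Math.sqrt((displaySize * displaySize) / (width * height));
  return { width: width * scale, height: height * scale };
}

// A 2048x512 image (4:1) with displaySize=512 becomes 1024x256,
// which has the same area as a 512x512 image.
const scaled = scaledDimensions(2048, 512, 512);
```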

getRGBArray() [Inner function of getImageArray] (back to Contents)

Gets the image data in 3D RGB array format.
Example of 3D RGB tensor/array
a 3D RGB tensor/array of a 2x2 square
(color: R=1, G=2, B=3)
  [ [[1, 2, 3], [1, 2, 3]],
    [[1, 2, 3], [1, 2, 3]] ]

What this function does:

  • creates a new invisible canvas
  • estimate if the image will exceed Google Cloud's payload limit of 1572864 bytes
    How the 98303 value was calculated: In 3D RGB array format, the longest string a single pixel can produce is "[[xxx,xxx,xxx]]," where the RGB values are all 3-digit integers and the image is a 1-pixel-wide vertical line (so every pixel is its own row). The maximum byte size of a pixel is therefore 16 (ie. the byte size of "[[xxx,xxx,xxx]],"). Hence, assuming the maximum bytes per pixel, the maximum number of pixels an image can have before exceeding the 1572864-byte limit is (1572864 - 2) / 16 = 98303.875, where the - 2 accounts for the outermost square brackets.
    • if it's estimated to be too big, the image is scaled down
  • using the new canvas, get the raw 1D RGBA (A for alpha) image data
    Format of 1D RGBA image data
    [R1, G1, B1, A1, R2, G2, B2, A2, R3, ...]
    where R1, G1, B1, A1 are the RGBA values of the 1st pixel; R2, G2, B2, A2 are that of the 2nd pixel
  • parse/format the image data to 3D RGB array
    Example of 3D RGB tensor/array
    a 3D RGB tensor/array of a 2x2 square
    (color: R=1, G=2, B=3)
      [ [[1, 2, 3], [1, 2, 3]],
        [[1, 2, 3], [1, 2, 3]] ]
  • return the formatted image data to callback function
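
A minimal sketch of the pixel-limit estimate and the RGBA-to-RGB conversion described above. The function and constant names are assumptions made for this example; in the browser the flat RGBA array would typically come from the canvas context's getImageData(...).data, which is replaced here by a plain array so the sketch runs anywhere.

```javascript
const PAYLOAD_LIMIT_BYTES = 1572864;
const MAX_BYTES_PER_PIXEL = 16; // length of "[[xxx,xxx,xxx]],"
// (1572864 - 2) / 16 = 98303.875, rounded down to whole pixels
const MAX_PIXELS = Math.floor((PAYLOAD_LIMIT_BYTES - 2) / MAX_BYTES_PER_PIXEL);

// Area scale factor needed to stay under MAX_PIXELS (1 means no scaling).
function estimateScale(width, height) {
  return Math.min(1, MAX_PIXELS / (width * height));
}

// Converts flat RGBA data [R1, G1, B1, A1, R2, ...] into rows of [R, G, B].
function rgbaToRgbArray(rgba, width, height) {
  const rows = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      row.push([rgba[i], rgba[i + 1], rgba[i + 2]]); // alpha is dropped
    }
    rows.push(row);
  }
  return rows;
}

// The 2x2 square from the example above (alpha = 255 for every pixel)
const square = rgbaToRgbArray(
  [1, 2, 3, 255, 1, 2, 3, 255, 1, 2, 3, 255, 1, 2, 3, 255],
  2,
  2
);
```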

getBase64() [Inner function of getImageArray] (back to Contents)

Gets the image data in base64 encoding.

What this function does:

  • creates a new invisible canvas
  • estimate if image will exceed Google Cloud's payload limit of 1572864 bytes
  • if it's estimated to be too big, the image is scaled down
  • using the new canvas, get base64 image data
  • check if the base64 data is still too big
    • if it's still too big, keep scaling it down until it's below the 1572864-byte limit
  • return the formatted image data to callback function
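
The "keep scaling down until it fits" step could look roughly like the sketch below. shrinkToFit is a hypothetical name, encodeAtScale stands in for getScaledB64, and the 0.9 shrink factor per retry is an assumption made for this example (the text above doesn't state the actual step).

```javascript
const PAYLOAD_LIMIT_BYTES = 1572864;

// `encodeAtScale(scale)` stands in for getScaledB64 and must return a base64
// string. Base64 text is plain ASCII, so its length equals its byte size.
// The 0.9 shrink step per retry is an assumption for this sketch.
function shrinkToFit(encodeAtScale, initialScale) {
  let scale = initialScale;
  let b64 = encodeAtScale(scale);
  while (b64.length > PAYLOAD_LIMIT_BYTES) {
    scale *= 0.9;
    b64 = encodeAtScale(scale);
  }
  return { scale, b64 };
}

// Fake encoder: pretends a full-size image encodes to 2,000,000 characters
const fakeEncoder = (scale) => 'A'.repeat(Math.floor(2000000 * scale));
const result = shrinkToFit(fakeEncoder, 1);
```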

Inner functions in getBase64

getScaledB64(_scale) [Inner function of getBase64] (back to Contents)

Scales the image down by _scale (type:float), then gets the base64 image data of the scaled image.

_scale is how much the image area will be scaled down by. (ie. _scale = 0.5 will halve the image area)

The scaling and getting of the base64 data is done using the invisible canvas created in getBase64.

getEstimatedScale(imageArea) [Inner function of getBase64] (back to Contents)

Estimate if image will exceed Google Cloud's payload limit of 1572864 bytes, and return the required scale-down value if it's too big.

1.54 is just a ballpark estimate of base64 byte-size per pixel.
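
One way such an estimate might be computed is shown below. This is a sketch under the assumption that the scale is simply the ratio of the limit to the estimated size; estimateBase64Scale is a hypothetical name and the real function may differ.

```javascript
const PAYLOAD_LIMIT_BYTES = 1572864;
const EST_BYTES_PER_PIXEL = 1.54; // ballpark figure from the text above

// Returns the area scale factor to apply; 1 means no scaling is needed.
function estimateBase64Scale(imageArea) {
  const estimatedBytes = imageArea * EST_BYTES_PER_PIXEL;
  if (estimatedBytes <= PAYLOAD_LIMIT_BYTES) {
    return 1;
  }
  return PAYLOAD_LIMIT_BYTES / estimatedBytes;
}

const smallScale = estimateBase64Scale(640 * 480); // fits, so 1
const largeScale = estimateBase64Scale(4000 * 3000); // needs scaling down
```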

Misc. functions in getImageData.js

getByteSize(str) [in getImageData.js] (back to Contents)

Same as the getByteSize in googleApiFunctions.js.

googleApiFunctions.js (back to Contents)

Contains the functions that interact with the Google Cloud service, including:

  • authentication
  • sending the image payload for prediction (not including formatting the image into the right form (eg. base64); that's done in getImageData.js)

getPrediction(pageDiv, model, imageData, callback) [in googleApiFunctions.js] (back to Contents)

Performs the Google Cloud authentication, and sending of the image payload for prediction that was mentioned above.

| Parameter | Type | Description |
|---|---|---|
| pageDiv | HTMLDivElement | <div> HTML element of the page |
| model | dict | a dict in modelInfo (in /configs/CONFIG_modelInfo.js) that contains the info on the model for the page |
| imageData | either 3D array or str | the image data that's been formatted by the getImageArray function (in getImageData.js); it's either a 3D RGB array or a base64 string, depending on model.acceptsBase64 (type:bool) |
| callback | function | the callback function to return the prediction data from Google Cloud ML; expects 2 params, callback(errorMsg, predictionData), where errorMsg (type:str) is the error message, and is null when there's no error |

Inner functions in getPrediction

getToken(_callback) [Inner function of getPrediction] (back to Contents)

Gets the short-lived (lasts for 1h) access token needed to authenticate prediction requests.

Google Cloud documentation on the general procedure to get access tokens (doesn't have documentation for pure JS):

The pure JavaScript implementation of above procedure (that is used by this function):

What this function does:

  • creates a dict payload with credentials.clientEmail and .clientId (in CONFIG_credentials.js), along with some other info

  • JSON.stringify the payload to give a JSON Web Token (JWT) (Here's some background on JWT and JWS)

  • sign it with credentials.privateKey to get a JSON Web Signature (JWS)

  • XHR post the JWS to the OAuth 2.0 URL, which returns the access token

  • return the access token to the callback function
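
The signing steps above can be sketched with Node's built-in crypto module, used here only so the example runs without a browser or real credentials; the project itself runs in the browser and may sign differently. The claim names follow the usual JWT claim set, and the scope, audience URL and key pair are placeholders or throwaway values, not this project's configuration.

```javascript
const crypto = require('crypto');

// base64url without padding, as used by JWT/JWS
function base64url(input) {
  return Buffer.from(input)
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

// Builds "<header>.<claims>.<signature>", signed with RS256.
function signJwt(claims, privateKey) {
  const header = { alg: 'RS256', typ: 'JWT' };
  const signingInput =
    base64url(JSON.stringify(header)) + '.' + base64url(JSON.stringify(claims));
  const signature = crypto
    .createSign('RSA-SHA256')
    .update(signingInput)
    .sign(privateKey);
  return signingInput + '.' + base64url(signature);
}

// Throwaway key pair so the sketch runs without real credentials
const { privateKey, publicKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});

const now = Math.floor(Date.now() / 1000);
const jws = signJwt(
  {
    iss: 'service-account@example.invalid', // would be credentials.clientEmail
    scope: 'placeholder-scope', // placeholder for credentials.scope
    aud: 'https://example.invalid/token', // placeholder for the OAuth 2.0 URL
    iat: now,
    exp: now + 3600, // the token lasts for 1h
  },
  privateKey
);
```

The resulting jws string is what would be posted to the OAuth 2.0 URL in exchange for the access token.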

sendPayload(token, _callback) [Inner function of getPrediction] (back to Contents)

Gets the prediction data from Google Cloud ML model, by sending the access token (returned from getToken) and the image data (formatted by getImageArray).

What this function does:

  • formats the image data into a stringified JSON payload

  • add fake headers to the XMLHttpRequest object, to spoof Google Cloud ML into accepting a prediction request from an unauthorised website

    • escaping the restricted headers (ie. DNT, Origin, Referer, User-Agent) with a hyphen (-) prefix (as client browsers disallow setting of these headers - info)
  • XHR post the image payload to the Google Cloud ML URL via the modified CORS-Anywhere Proxy (proxy URL configured in corsProxy)

    • the proxy will unescape the restricted headers, and bypass CORS-restrictions
  • Google Cloud ML will return the prediction data, and that data will be returned to the callback function (ie. _callback(errorMsg, predictionData)) with errorMsg = null

    • if any error occurs, the error message (type:str) will be returned in the callback's 1st parameter — errorMsg — and null for the 2nd parameter — predictionData
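
The header-escaping step might be sketched like this. escapeRestrictedHeaders is a hypothetical name, the header values are placeholders, and matching header names case-insensitively is an assumption of this sketch.

```javascript
// Header names that browsers refuse to set, as listed above
const RESTRICTED_HEADERS = ['DNT', 'Origin', 'Referer', 'User-Agent'];

// Returns a copy of `headers` where restricted names get a hyphen prefix,
// so the browser accepts them and the proxy can unescape them later.
function escapeRestrictedHeaders(headers) {
  const escaped = {};
  for (const [name, value] of Object.entries(headers)) {
    const isRestricted = RESTRICTED_HEADERS.some(
      (restricted) => restricted.toLowerCase() === name.toLowerCase()
    );
    escaped[isRestricted ? '-' + name : name] = value;
  }
  return escaped;
}

const escapedHeaders = escapeRestrictedHeaders({
  'Content-Type': 'application/json',
  Origin: 'https://example.invalid',
});
// { 'Content-Type': 'application/json', '-Origin': 'https://example.invalid' }
```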

Misc. info
xhrDict is a dict of XMLHttpRequest (XHR) objects that are currently running, and that have not gotten a response/error yet.

What it is for: To ensure only 1 XHR is running per page; aborting the previous XHR of the page when the user uploads another image before the prediction data is returned from Google Cloud ML.

{ PAGE_ID_1 : XMLHttpRequest_1,
  PAGE_ID_2 : XMLHttpRequest_2,
  ... }
where PAGE_ID_n is the id (type:str) of the page's <div> container
(eg. id="page-solarpanel" for <div id='page-solarpanel'>)
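
A minimal sketch of the "one XHR per page" bookkeeping, using stand-in objects instead of real XMLHttpRequest objects so it runs outside the browser. trackRequest and untrackRequest are hypothetical helper names.

```javascript
const xhrDict = {};

// Registers `xhr` for `pageId`, aborting any request still running there.
function trackRequest(pageId, xhr) {
  if (xhrDict[pageId]) {
    xhrDict[pageId].abort();
  }
  xhrDict[pageId] = xhr;
}

// Called once a response or error has arrived for `xhr`.
function untrackRequest(pageId, xhr) {
  if (xhrDict[pageId] === xhr) {
    delete xhrDict[pageId];
  }
}

// Stand-ins with the same abort() method as XMLHttpRequest
const makeFakeXhr = () => ({
  aborted: false,
  abort() {
    this.aborted = true;
  },
});

const first = makeFakeXhr();
const second = makeFakeXhr();
trackRequest('page-solarpanel', first);
trackRequest('page-solarpanel', second); // aborts `first`
```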

Misc. functions in googleApiFunctions.js

setDictHeaders(xhr, dictHeader) [in googleApiFunctions.js] (back to Contents)

Sets the headers of an XMLHttpRequest object using a dict, instead of calling XMLHttpRequest.setRequestHeader(header, value) for every header.

Parameter Type Description
xhr XMLHttpRequest the XMLHttpRequest object
dictHeader dict dict of headers in the format:
  ... }
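
One possible shape for such a helper is shown below; it is a sketch rather than the project's exact code. A stand-in object that records calls replaces the real XMLHttpRequest so the example runs anywhere, and the header values are placeholders.

```javascript
// Calls xhr.setRequestHeader once for every key in dictHeader.
function setDictHeaders(xhr, dictHeader) {
  for (const [header, value] of Object.entries(dictHeader)) {
    xhr.setRequestHeader(header, value);
  }
}

// Stand-in that records calls instead of touching the network
const recorded = [];
const fakeXhr = {
  setRequestHeader: (header, value) => recorded.push([header, value]),
};

setDictHeaders(fakeXhr, {
  'Content-Type': 'application/json',
  Accept: 'application/json',
});
```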

getByteSize(str) [in googleApiFunctions.js] (back to Contents)

Gets the byte size of str (type:str), for determining if the image payload size is over Google Cloud ML's 1572864-byte limit.
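
A byte-size check of this kind can be sketched with the standard TextEncoder API, which encodes to UTF-8. utf8ByteSize is a hypothetical name, and the project's own getByteSize may be implemented differently.

```javascript
const PAYLOAD_LIMIT_BYTES = 1572864;

// Size of `str` in bytes when encoded as UTF-8
function utf8ByteSize(str) {
  return new TextEncoder().encode(str).length;
}

// Hypothetical check against the payload limit
function isOverLimit(payload) {
  return utf8ByteSize(payload) > PAYLOAD_LIMIT_BYTES;
}
```

Note that str.length alone would undercount any non-ASCII characters, which is why the string is encoded first.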

commaFormat(floatOrInt) [in googleApiFunctions.js] (back to Contents)

Formats floatOrInt (type:float/int) to a string with commas at thousands places (eg. 1000000.1 -> "1,000,000.1")
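
A possible implementation is sketched below; it is not necessarily the one used in the project. It does not handle numbers that JavaScript prints in exponent notation (eg. 1e21).

```javascript
// Inserts commas at the thousands places of the integer part only.
function commaFormat(floatOrInt) {
  const [whole, fraction] = String(floatOrInt).split('.');
  const withCommas = whole.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
  return fraction === undefined ? withCommas : withCommas + '.' + fraction;
}

const formatted = commaFormat(1000000.1); // "1,000,000.1"
```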

drawBoxes.js (back to Contents)

Contains the function that parses the prediction data and draws the detection boxes.

drawDetectionBoxes(pageDiv, model, data) [in drawBoxes.js] (back to Contents)

Parses the prediction data returned by getPrediction and draws the detection boxes.

| Parameter | Type | Description |
|---|---|---|
| pageDiv | HTMLDivElement | <div> HTML element of the page |
| model | dict | a dict in modelInfo (in /configs/CONFIG_modelInfo.js) that contains the info on the model for the page |
| data | dict of arrays | prediction data returned by getPrediction, in the format below |

{
  detection_boxes : [
    [0.468..., 0.692..., 0.592..., 0.342...],
    [0.367..., 0.532..., 0.451..., 0.831...],
    ... ],
  detection_classes : [1, 2, 1, 4, ...],
  detection_scores : [0.9984182119369507, 0.9751233458518982, 0.898982048034668, ...],
  num_detections : 300,
  raw_detection_boxes : [
    [0.510..., 0.019..., 0.777..., 0.228...],
    [0.404..., 0.619..., 0.663..., 0.682...],
    ... ],
  raw_detection_scores : [
    [9.449..., 7.941..., -0.344..., -0.416..., -1.152..., 0.184..., ...],
    [10.675..., 7.724..., -0.693..., -1.354..., -1.479..., -0.350..., ...],
    ... ]
}

What this function does:

  • adjust the boxes' line width and the text font-size to scale to the canvas size

  • iterate through the prediction data

    • color of box i determined by colorPalette[detection_classes[i]] (detection_classes is a key in data)

    • text color is always white

    • boxes with scores < model.confidenceThreshold won't be drawn

    • class/object names are displayed if model.displayNames == true. The name of box i is determined by model.labelMap[detection_classes[i]] (detection_classes is a key in data)

    • only the confidence scores of boxes with scores > 0.95 are displayed

  • monkey patch itself (with the same pageDiv, model, data parameters) to the canvas element object as the function redrawBoxes
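
The per-box rules above can be sketched without a canvas by computing what would be drawn for each box. planBoxes is a hypothetical helper, and the model, data and palette values are made up for this example.

```javascript
// Returns one entry per box that would be drawn; no canvas is involved.
function planBoxes(model, data, colorPalette) {
  const plan = [];
  data.detection_scores.forEach((score, i) => {
    if (score < model.confidenceThreshold) return; // below threshold: not drawn
    const classId = data.detection_classes[i];
    plan.push({
      box: data.detection_boxes[i],
      color: colorPalette[classId],
      label: model.displayNames ? model.labelMap[classId] : null,
      shownScore: score > 0.95 ? score : null, // score text only above 0.95
    });
  });
  return plan;
}

const plan = planBoxes(
  { confidenceThreshold: 0.5, displayNames: true, labelMap: [null, 'obj1', 'obj2'] },
  {
    detection_boxes: [
      [0.1, 0.1, 0.5, 0.5],
      [0.2, 0.2, 0.6, 0.6],
      [0.3, 0.3, 0.7, 0.7],
    ],
    detection_classes: [1, 2, 1],
    detection_scores: [0.99, 0.7, 0.3],
  },
  [null, 'red', 'blue']
);
// Two boxes are planned; the third (score 0.3) is below the threshold.
```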

guiFunctions.js (back to Contents)

Contains all gui-related functions, including:

  • switching tabs / pages
  • displaying loading animations
  • the functions that trigger right after the user uploads an image, such as
    • validating if the uploaded file is of the correct type
    • calling the functions in the other modules to perform the prediction request
  • displaying error messages on the page
  • etc.

focusOn(tabElement) [in guiFunctions.js] (back to Contents)

Triggers when user clicks on another tab, which makes the clicked tab "active", and displays the respective page elements.

tabElement is the HTML element object of the selected tab.

What this function does:

  • remove the active class from all tabs elements (making them all black)
  • add active class to selected tab element (making it green)
  • hide all pages
  • show only the respective selected page
    If the tab id is "tab-2", the function will look for the page with id "page-2"

Triggers when the window's width becomes narrow (eg. when viewing on phone), which makes the navigation bar responsive.

The responsive navigation bar design is taken from:

However, the navigation bar isn't properly responsive as the canvas is bigger than the responsive navigation bar.


window.setInterval( () => { ... [in guiFunctions.js] (back to Contents)

A script that animates the "Loading..." dots.

Uses pagesThatsLoading (type:dict of HTMLDivElement) to determine the pages that need to be animated. If a page's div element object is in pagesThatsLoading, this script will animate the dots.
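
The animation loop can be sketched with plain objects standing in for the page <div> elements. Keying the dict by page id, the loadingText property, and the 0-to-3 dot cycle are assumptions made for this example.

```javascript
const pagesThatsLoading = {};

// Cycles "Loading" -> "Loading." -> "Loading.." -> "Loading..." -> "Loading"
function nextLoadingText(text) {
  const dots = text.length - 'Loading'.length;
  return 'Loading' + '.'.repeat((dots + 1) % 4);
}

// One animation step: advance the dots on every page that is loading.
function tick() {
  for (const pageDiv of Object.values(pagesThatsLoading)) {
    pageDiv.loadingText = nextLoadingText(pageDiv.loadingText);
  }
}

// Stand-in for a page element; displayLoading would add it to the dict
const fakePage = { id: 'page-1', loadingText: 'Loading' };
pagesThatsLoading[fakePage.id] = fakePage;

tick();
tick();
tick();
// fakePage.loadingText is now "Loading..."
```

In the browser, tick would be run on a timer through window.setInterval, and stopLoading would remove the page from the dict so it is no longer animated.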

displayLoading(pageDiv) [in guiFunctions.js] (back to Contents)

Starts and displays the loading animations, and hides the "Toggle box visibility" switch on the pageDiv (type:HTMLDivElement) page.

It starts the "Loading..." dot animation by adding pageDiv (type:HTMLDivElement) to pagesThatsLoading (type:dict of HTMLDivElement) to be animated by the window.setInterval script.

stopLoading(pageDiv) [in guiFunctions.js] (back to Contents)

Stops and hides the loading animations on the pageDiv (type:HTMLDivElement) page.

It stops the "Loading..." dot animation by removing pageDiv (type:HTMLDivElement) from pagesThatsLoading (type:dict of HTMLDivElement), thus stopping the window.setInterval script from animating it.

Although displayLoading hides the "Toggle box visibility" switch, this function doesn't show the switch again.

When an error occurs, the loading animations need to be hidden and replaced by the error message instead of the switch.

So the hide-loading function and the show-switch function need to be separated.

displayToggleSwitch(pageDiv) [in guiFunctions.js] (back to Contents)

Shows the "Toggle box visibility" switch on the pageDiv (type:HTMLDivElement) page.

displayError(pageDiv, strError) [in guiFunctions.js] (back to Contents)

Displays the strError (type:str) error message on the pageDiv (type:HTMLDivElement) page.

Image of error message

hideError(pageDiv) [in guiFunctions.js] (back to Contents)

Hides error message on the pageDiv (type:HTMLDivElement) page.

Image of error message

toggleBoxVisibility(chkbox) [in guiFunctions.js] (back to Contents)

Toggle visibility of detection boxes on the image, based on whether chkbox (type:HTMLInputElement with type="checkbox") is checked or not.

It hides boxes by redrawing the image on the canvas, on top of the old boxed image, and shows by redrawing the boxes.

It does this by calling either the redrawBoxes or the redrawImage function that has been monkey patched onto the canvas.
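
A rough sketch of the toggle logic, with a plain object standing in for the canvas and its two monkey-patched functions. Unlike the real toggleBoxVisibility(chkbox), this sketch takes the canvas as an explicit second argument to stay self-contained, and toggleBoxes is a hypothetical name.

```javascript
// Stand-in for the canvas element with the two monkey-patched functions
const calls = [];
const fakeCanvas = {
  redrawImage: () => calls.push('image'), // covers the old boxed image
  redrawBoxes: () => calls.push('boxes'), // draws the boxes again
};

function toggleBoxes(chkbox, canvas) {
  if (chkbox.checked) {
    canvas.redrawBoxes();
  } else {
    canvas.redrawImage();
  }
}

toggleBoxes({ checked: false }, fakeCanvas); // hide boxes
toggleBoxes({ checked: true }, fakeCanvas); // show boxes again
```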

isValidInput(inputElement) [in guiFunctions.js] (back to Contents)

Checks if the user-uploaded file/image has an extension that's found in acceptedFileExts (in CONFIG_misc.js).
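
An extension check of this kind could look like the sketch below. The list contents, the absence of a leading dot in each entry, and the case-insensitive comparison are all assumptions for this example; hasAcceptedExtension is a hypothetical name.

```javascript
// Illustrative list; the real one lives in CONFIG_misc.js
const acceptedFileExts = ['jpg', 'jpeg', 'png'];

// True if fileName ends with one of the accepted extensions.
function hasAcceptedExtension(fileName, exts) {
  const dot = fileName.lastIndexOf('.');
  if (dot === -1) {
    return false; // no extension at all
  }
  const ext = fileName.slice(dot + 1).toLowerCase();
  return exts.includes(ext);
}
```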

buildClassnameTree(element) [in guiFunctions.js] (back to Contents)

Monkey patch HTML element objects to their parent elements by class names, for easy referencing.

Consider this div container

<div id='mycontainer'>
  <div class='class1'>
    <input class='class2' />
  </div>
</div>

By passing the HTMLDivElement object (with id="mycontainer") through this function as myElement:

var myElement = document.getElementById('mycontainer')

It will add myElement's children to the myElement object and the children can be referenced by their class names (escaped with an underscore (_) prefix) (same for the children's children etc.):

<div id='mycontainer'>              // myElement
    <div class='class1'>            // myElement._class1
        <input class='class2' />    // myElement._class1._class2
        <p>[TEXT]</p>               // (No class, so not added to myElement._class1)
    </div>
</div>

Here's a snippet of the div element (class="boxToggle") being referenced from the page's div element:

function displayToggleSwitch(pageDiv) { = 'block'
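
The recursion behind this can be sketched on plain objects that mimic the children and className properties of DOM elements. This is an illustration under the assumption that each element has at most one class name; the project's own implementation may differ.

```javascript
// Attaches each classed child to its parent as `_<className>`, recursively.
function buildClassnameTree(element) {
  for (const child of element.children) {
    if (child.className) {
      element['_' + child.className] = child;
    }
    buildClassnameTree(child);
  }
}

// Plain-object stand-ins for the container shown above
const input = { className: 'class2', children: [] };
const paragraph = { className: '', children: [] }; // no class: not attached
const class1 = { className: 'class1', children: [input, paragraph] };
const myElement = { className: '', children: [class1] };

buildClassnameTree(myElement);
// myElement._class1._class2 now refers to `input`
```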

run(inputElement, model) [in guiFunctions.js] (back to Contents)

The function that triggers after the user uploads a valid image (validated by isValidInput). It calls the functions in the other modules to perform the prediction request.

| Parameter | Type | Description |
|---|---|---|
| inputElement | HTMLInputElement | the file-upload input element object |
| model | dict | a dict in modelInfo (in /configs/CONFIG_modelInfo.js) that contains the info on the model for the page |

Known major error (back to Contents)

Sometimes, either the ping or a prediction request to the Google Cloud ML models fails, and the console logs the following errors:

- POST https://{CORS_PROXY_URL}/https://content-ml.googleapi ... predict?alt=json 503 (Service Unavailable)

- Access to XMLHttpRequest at 'https://{CORS_PROXY_URL}/' from origin {either 'null' or something else} has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

and sometimes it's accompanied by the custom console warnings such as:

> XMLHttpRequest for prediction failed
  model: "{MODEL_NAME}"
Example images in Chrome browser

This usually seems to happen when no ping/prediction requests have been made for a while (eg. at the start of the day).

It also stops occurring after the first ping/request (or the first few, if the requests are sent in quick succession); that is, if you refresh the page, the error will stop occurring.

It's likely due to a fault on Google Cloud ML's end, since the proxy ping shows that the proxy is working properly.

No fix for this error has been found.

A workaround is to simply wait for some time (~1-2 mins) after loading the HTML page, so that the pings to the models absorb this error and later prediction requests don't get it. (Reason: refer to the paragraphs above)

Glossary (back to Contents)

Term What I mean
tab a tab on the navigation bar

Tab 1

Tab 2

page a set of HTML elements that is shown when a tab is selected, denoted in the HTML by a container: <div id='page-PAGE_NAME'>
Every model has one page

Page 1

Page 2