Timothy Blattner edited this page Oct 11, 2022 · 5 revisions

TrojAI Test Harness Wiki

Contains information on running TrojAI and how to use the API for interacting with leaderboards.

About the TrojAI Test Harness

The TrojAI test harness is the codebase that runs the TrojAI competition. Jobs are submitted primarily through Google Drive, by sharing Singularity containers (based on the trojai-example) with the competition's Google account. The shared file is parsed to determine which leaderboard and dataset the container should be executed on. The container is then submitted into a task execution framework that runs on a Slurm cluster.

The front-end is a GitHub/NIST Pages website. When new content is found, the web page is automatically updated to include it.
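The share-file parsing step described above can be sketched as follows. The file-name convention and function here are purely illustrative assumptions; the real harness determines the leaderboard and dataset from its own metadata.

```python
# Hypothetical sketch of parsing a shared container name into a job
# description. The naming scheme 'leaderboard_datasplit_rest.simg' is
# an assumption for illustration, not the harness's actual convention.

def parse_shared_container_name(filename: str) -> dict:
    stem = filename.rsplit(".", 1)[0]              # drop the .simg extension
    leaderboard, data_split, _ = stem.split("_", 2)
    return {"leaderboard": leaderboard, "data_split": data_split}

job = parse_shared_container_name("round10_sts_submission.simg")
```

The resulting dictionary would then drive which leaderboard and dataset the container is executed against.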

The back-end is split into several components:

  1. TrojaiConfig
  • Contains all paths and high-level customization options for the TrojAI competition.
  2. Actor
  • Holds information about an actor: email, name, submission metadata.
  3. Leaderboards
  • Contains all information for a leaderboard: name, path to submissions, datasets, task.
  4. Submissions
  • Contains all information from an actor's submission: metric results, container location, result location.
  5. Dataset
  • Contains all information about a dataset: name, metrics to compute, dataset location.
  6. Task
  • Contains information on how a task is executed: script locations, execution parameters.

Many of these components contain command-line interfaces to interact with their metadata.
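The relationships among these components can be sketched with simplified stand-in classes. The field names below are illustrative only, not the real TrojAI API:

```python
from dataclasses import dataclass, field

# Illustrative stand-ins for the back-end components listed above;
# fields are simplified and do not match the real classes.

@dataclass
class Dataset:
    name: str
    dataset_dirpath: str

@dataclass
class Task:
    name: str
    evaluate_script: str

@dataclass
class Leaderboard:
    name: str
    task: Task
    datasets: dict = field(default_factory=dict)  # data_split_name -> Dataset

# A leaderboard owns a task and a set of datasets keyed by data split.
lb = Leaderboard("round10", Task("image-classification", "evaluate.sh"))
lb.datasets["sts"] = Dataset("round10-sts", "/data/round10/sts")
```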

Command-line Interface

TrojaiConfig

  • python trojai_config.py --trojai-dirpath <main_trojai_path> --token-pickle-filepath <path_to_gdrive_pickle_file> --init
    • Creates the main TrojAI configuration file

Actor

  • python actor.py [add-actor|remove-actor|reset-actor]
    • add-actor --trojai-config-filepath <trojai_config_filepath> --name <actor_name> --email <email> --poc-email <poc-email> --type <public/performer>

      • Adds an actor to the competition
    • remove-actor --trojai-config-filepath <trojai_config_filepath> --email <actor_email>

      • Removes an actor from the competition
    • reset-actor

Leaderboard

  • python leaderboard.py [init|add-dataset]
    • init --trojai-config-filepath <trojai_config_filepath> --name <leaderboard_name> --task-name <task_name> [--add-default-datasplit]
      • Initializes a leaderboard. If the default data splits (train, test, sts, holdout) are set up in the datasets directory, they are added to the leaderboard automatically when you specify --add-default-datasplit
      • task_name must be one of the valid tasks listed inside Leaderboard. If a new task is needed, use the Task abstraction to create it.
    • add-dataset

Customizing the competition

Once a leaderboard has been added, the next step is to customize trojai_config.json to include the new leaderboard.

  • active_leaderboard_names
    • Names of the leaderboards that an actor is able to submit to.
  • archive_leaderboard_names
    • Names of old leaderboards that are shown for archival purposes.
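A minimal sketch of retiring one leaderboard and activating another, assuming the two options above are stored as JSON lists in trojai_config.json (the starting contents below are hypothetical):

```python
import json

# Hypothetical starting contents of trojai_config.json, showing only
# the two leaderboard-name options described above.
cfg = json.loads("""{
  "active_leaderboard_names": ["round9"],
  "archive_leaderboard_names": []
}""")

# Retire round9 and open round10 for submissions.
cfg["archive_leaderboard_names"].append("round9")
cfg["active_leaderboard_names"] = ["round10"]

updated = json.dumps(cfg, indent=2)  # write this back to trojai_config.json
```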

Trojai API

The TrojAI API is used to interact with the leaderboards, actors in the competition, and submissions, and to add new components into the competition. Utilities are also provided to customize new launches onto a Slurm cluster that has access to all of the metadata within the competition. Most of the components contain managers that help access specific actors, datasets, and submissions.

Installation

TrojaiConfig

trojai_config = TrojaiConfig.load_json(trojai_config_filepath)

  • The main holder of all directory paths for the competition.
  • Used across multiple APIs to load JSON files.

ActorManager

actor_manager = ActorManager.load_json(trojai_config)
actor = actor_manager.get(actor_email)
actor = actor_manager.get_from_name(actor_name)
actor = actor_manager.get_from_uuid(uuid)
actor_list = actor_manager.get_actors()

  • Contains helper functions to access actors.
  • The Actor object is used in multiple other API calls to operate on that actor.
  • Raises RuntimeError if it cannot find the requested actor, or if multiple actors share the same requested information.
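A defensive lookup pattern for that RuntimeError might look like the following. StubManager is a hypothetical stand-in used so the sketch is self-contained; in the real API the manager comes from ActorManager.load_json(trojai_config):

```python
# Handling the RuntimeError raised on an unknown actor lookup.
# StubManager mimics the get(email) behavior described above.

class StubManager:
    def __init__(self, actors):
        self._actors = actors  # email -> actor name

    def get(self, email):
        if email not in self._actors:
            raise RuntimeError(f"unable to find actor: {email}")
        return self._actors[email]

manager = StubManager({"team@example.com": "team"})

try:
    actor = manager.get("missing@example.com")
except RuntimeError:
    actor = None  # handle the unknown actor instead of crashing
```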

Leaderboard

leaderboard = Leaderboard.load_json(trojai_config, leaderboard_name)
submission_filepath = leaderboard.submissions_filepath
metrics = leaderboard.get_submission_metrics(data_split_name)
dataset = leaderboard.get_dataset(data_split_name)
task = leaderboard.get_task()
metadata_df = leaderboard.load_metadata_csv_into_df() # Loads the metadata pandas data frame, or None if it has not been generated
results_df = leaderboard.load_round_results_csv_into_df() # Loads the summary results pandas data frame, or None if it has not been generated

  • Leaderboard is the entry point for obtaining submissions, datasets, metrics, and the task that executes the leaderboard.
  • The data split name maps to the name of the dataset: test, holdout, train, sts.
  • The pandas metadata data frame contains all metadata about the leaderboard's datasets and the parameters used to generate them.
  • The pandas results data frame contains all result data for the leaderboard, generated from the submission_manager.

Submissions

submission_manager = SubmissionManager.load_json(leaderboard)
all_actor_submissions = submission_manager.get_submissions_by_actor(actor)
valid_actor_submissions = submission_manager.gather_submissions(leaderboard, data_split_name, metric_name, target_metric_value, actor)

# Important submission metadata
submission.g_file # Google drive file, contains 'name', which is the name of the container
submission.metric_results # Dictionary of metric results: key = metric_name, value = result of metric
submission.saved_metric_results # Dictionary of any saved metric results such as plots: key = metric_name, value = path to saved file
submission.submission_epoch # submission epoch that maps back to the leaderboard. See time_utils for conversion info
submission.actor_submission_dirpath # path to actor submission
submission.execution_results_dirpath # the directory path to the actor's results for the submission

# Helper functions
predictions, targets, models = submission.get_predictions_target_models(leaderboard) # np.ndarray, np.ndarray, list[str]: the predictions, targets, and associated models
actor_submission_filepath = submission.get_submission_filepath()

  • Submissions provide accessors to load all submission criteria.
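As an illustration of how the metric_results dictionaries above might be scanned, here is a sketch that picks an actor's best submission. The metric name, the lower-is-better assumption, and the plain-dict submissions are all hypothetical stand-ins for real Submission objects:

```python
# Selecting the best of an actor's submissions by a single metric.
# Plain dicts stand in for the Submission objects returned by
# submission_manager.get_submissions_by_actor(actor).

submissions = [
    {"container": "a.simg", "metric_results": {"cross_entropy": 0.42}},
    {"container": "b.simg", "metric_results": {"cross_entropy": 0.35}},
]

# Assumes lower cross entropy is better.
best = min(submissions, key=lambda s: s["metric_results"]["cross_entropy"])
```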