Skip to content

Latest commit

 

History

History
156 lines (118 loc) · 13.2 KB

table_persistence.md

File metadata and controls

156 lines (118 loc) · 13.2 KB

UCX objects

List of all UCX objects and their respective metadata.

Overview

Table utilization per workflow:

Table Generate Assessment Update Migration Progress Migrate Groups Migrate External Tables Upgrade Jobs Migrate tables Migrate Data Reconciliation
tables RW RW RW RW
grants RW RW RW RW
mounts RW RO RO RO
permissions RW RW RO RO
jobs RW RW RO
clusters RW RW
directfs_in_paths RW RW
directfs_in_queries RW RW
external_locations RW RO
workspace RW RO RO
workspace_objects RW
azure_service_principals RW
global_init_scripts RW
pipelines RW RW
groups RW RO
table_size RW
submit_runs RW
policies RW RW
migration_status RW RW RW
query_problems RW RW
workflow_problems RW RW
udfs RW RW RO
logs RW RW RW RW RW RW RW
recon_results RW
redash_dashboards RW RW
lakeview_dashboards RW RW

RW - Read/Write, the job generates or updates the table.
RO - Read Only

Inventory Database

$inventory.tables

Holds Inventory of all tables in all databases and their relevant metadata.

Column Datatype Description Comments
catalog string Original catalog of the table. hive_metastore by default
database string Original schema of the table
name string Name of the table
object_type string MANAGED, EXTERNAL, or VIEW
table_format string Table provider. Like delta or json or parquet.
location string Location of the data for table
view_text nullable string If the table is the view, then this column holds the definition of the view
upgraded_to string Upgrade Target (3 level namespace)

$inventory.grants

Inventory of all Table ACLs for tables indexed in tables table.

Column Datatype Description Comments
principal string User name, group name, or service principal name
action_type string Name of GRANT action
catalog string Original catalog of the table. hive_metastore by default
database Nullable string Original schema of the table
table Nullable string Name of the table
view Nullable string Name of the view
any_file bool Any file
anonymous_function string Grant for the anonymous function

$inventory.mounts

List of DBFS mount points.

Column Datatype Description Comments
name string Name of the mount point
source string Location of the backing dataset
instance_profile Nullable string This mount point is accessible only with this AWS IAM instance profile

$inventory.permissions

Workspace object level permissions

Column Datatype Description Comments
object_id string Either:
Group ID
Workspace Object ID
Redash Object ID
Scope name
supports string One of:
AUTHORIZATION

CLUSTERS
CLUSTER_POLICIES
DIRECTORIES
EXPERIMENTS
FILES
INSTANCE_POOLS
JOBS
NOTEBOOKS
PIPELINES
REGISTERED_MODELS
REPOS
SERVING_ENDPOINTS
SQL_WAREHOUSES
raw_object_permissions JSON JSON-serialized response of:
Generic Permissions
Secret ACL
Group roles and entitlements
Redash permissions

$inventory.jobs

Holds a list of all jobs with a notation of potential issues.

Column Datatype Description Comments
job_id string Job ID
job_name string Job Name
job_creator string UserID of the Job Creator
compatible int 1 or 0, used for percentage reporting
failures string List of issues identified by the assessment in JSON format

$inventory.clusters

Holds a list of all clusters with a notation of potential issues.

Column Datatype Description Comments
cluster_id string Cluster Id
cluster_name string Cluster Name
cluster_creator string UserID of the Cluster Creator
compatible int 1 or 0, used for percentage reporting
failures string List of issues identified by the assessment in JSON format

$inventory.external_locations

Holds a list of all external locations that will be required for the migration.

Column Datatype Description Comments
external_location string External Location URL

$inventory.workspace_objects

Holds a list of all workspace objects (notebooks, directories, files, repos and libraries) from the workspace. This is used by the permission crawler.

Column Datatype Description Comments
object_id string unique id of the object
object_type string type of object (NOTEBOOK, DIRECTORY, REPO, FILE, LIBRARY)
path string full path of the object in the workspace
language string language of the object (applicable for notebooks only)

$inventory.redash_dashboards and $inventory.lakeview_dashboards

Holds a list of all Redash or Lakeview dashboards. This is used by the QueryLinter and Redash migration.

Column Datatype Description Comments
id string The ID for this dashboard.
name string The title of the dashboard that appears in list views and at the top of the dashboard page.
parent string The identifier of the workspace folder containing the object.
query_ids list[string] The IDs of the queries referenced by this dashboard.
tags list[string] The tags set on this dashboard.