List of all UCX objects and their respective metadata.
Table utilization per workflow:
Table | Generate Assessment | Update Migration Progress | Migrate Groups | Migrate External Tables | Upgrade Jobs | Migrate tables | Migrate Data Reconciliation |
---|---|---|---|---|---|---|---|
tables | RW | RW | RW | RW | |||
grants | RW | RW | RW | RW | |||
mounts | RW | RO | RO | RO | |||
permissions | RW | RW | RO | RO | |||
jobs | RW | RW | RO | ||||
clusters | RW | RW | |||||
directfs_in_paths | RW | RW | |||||
directfs_in_queries | RW | RW | |||||
external_locations | RW | RO | |||||
workspace | RW | RO | RO | ||||
workspace_objects | RW | ||||||
azure_service_principals | RW | ||||||
global_init_scripts | RW | ||||||
pipelines | RW | RW | |||||
groups | RW | RO | |||||
table_size | RW | ||||||
submit_runs | RW | ||||||
policies | RW | RW | |||||
migration_status | RW | RW | RW | ||||
query_problems | RW | RW | |||||
workflow_problems | RW | RW | |||||
udfs | RW | RW | RO | ||||
logs | RW | RW | RW | RW | RW | RW | RW |
recon_results | RW | ||||||
redash_dashboards | RW | RW | |||||
lakeview_dashboards | RW | RW |
RW - Read/Write, the job generates or updates the table.
RO - Read Only
Holds Inventory of all tables in all databases and their relevant metadata.
Column | Datatype | Description | Comments |
---|---|---|---|
catalog | string | Original catalog of the table. hive_metastore by default | |
database | string | Original schema of the table | |
name | string | Name of the table | |
object_type | string | MANAGED, EXTERNAL, or VIEW | |
table_format | string | Table provider. Like delta or json or parquet. | |
location | string | Location of the data for table | |
view_text | nullable string | If the table is the view, then this column holds the definition of the view | |
upgraded_to | string | Upgrade Target (3 level namespace) |
Inventory of all Table ACLs for tables indexed in tables table.
Column | Datatype | Description | Comments |
---|---|---|---|
principal | string | User name, group name, or service principal name | |
action_type | string | Name of GRANT action | |
catalog | string | Original catalog of the table. hive_metastore by default | |
database | Nullable string | Original schema of the table | |
table | Nullable string | Name of the table | |
view | Nullable string | Name of the view | |
any_file | bool | Any file | |
anonymous_function | string | Grant for the anonymous function |
List of DBFS mount points.
Column | Datatype | Description | Comments |
---|---|---|---|
name | string | Name of the mount point | |
source | string | Location of the backing dataset | |
instance_profile | Nullable string | This mount point is accessible only with this AWS IAM instance profile |
Workspace object level permissions
Column | Datatype | Description | Comments |
---|---|---|---|
object_id | string | Either: Group ID Workspace Object ID Redash Object ID Scope name |
|
supports | string | One of: AUTHORIZATION CLUSTERS CLUSTER_POLICIES DIRECTORIES EXPERIMENTS FILES INSTANCE_POOLS JOBS NOTEBOOKS PIPELINES REGISTERED_MODELS REPOS SERVING_ENDPOINTS SQL_WAREHOUSES |
|
raw_object_permissions | JSON | JSON-serialized response of: Generic Permissions Secret ACL Group roles and entitlements Redash permissions |
Holds a list of all jobs with a notation of potential issues.
Column | Datatype | Description | Comments |
---|---|---|---|
job_id | string | Job ID | |
job_name | string | Job Name | |
job_creator | string | UserID of the Job Creator | |
compatible | int | 1 or 0, used for percentage reporting | |
failures | string | List of issues identified by the assessment in JSON format |
Holds a list of all clusters with a notation of potential issues.
Column | Datatype | Description | Comments |
---|---|---|---|
cluster_id | string | Cluster Id | |
cluster_name | string | Cluster Name | |
cluster_creator | string | UserID of the Cluster Creator | |
compatible | int | 1 or 0, used for percentage reporting | |
failures | string | List of issues identified by the assessment in JSON format |
Holds a list of all external locations that will be required for the migration.
Column | Datatype | Description | Comments |
---|---|---|---|
external_location | string | External Location URL |
Holds a list of all workspace objects (notebooks, directories, files, repos and libraries) from the workspace. This is used by the permission crawler.
Column | Datatype | Description | Comments |
---|---|---|---|
object_id | string | unique id of the object | |
object_type | string | type of object (NOTEBOOK, DIRECTORY, REPO, FILE, LIBRARY) | |
path | string | full path of the object in the workspace | |
language | string | language of the object (applicable for notebooks only) |
Holds a list of all Redash or Lakeview dashboards. This is used by the QueryLinter
and Redash
migration.
Column | Datatype | Description | Comments |
---|---|---|---|
id | string | The ID for this dashboard. | |
name | string | The title of the dashboard that appears in list views and at the top of the dashboard page. | |
parent | string | The identifier of the workspace folder containing the object. | |
query_ids | list[string] | The IDs of the queries referenced by this dashboard. | |
tags | list[string] | The tags set on this dashboard. |