merge upstream 2 #306

Merged
merged 23 commits into from
Jan 17, 2025
cfe65cc
chore(ingest): speed up lintFix command (#12346)
hsheth2 Jan 15, 2025
32cbc7d
feat(sdk): support urn types in Urn.from_string (#12347)
hsheth2 Jan 15, 2025
a3c7a33
docs(ingest/mssql): update mssql_recipe.yml to include convert_urn_to…
gabe-lyons Jan 15, 2025
96cfa46
feat(ingest): add `num_queries_used_in_lineage` counter (#12336)
hsheth2 Jan 15, 2025
4cde4aa
chore(ci): truncate gh-pages branch history (#12360)
hsheth2 Jan 15, 2025
2226820
dev(ingest): use ruff instead of flake8 (#12359)
anshbansal Jan 16, 2025
d99b97a
chore(tableau): metrics to understand pagination in get_connection_ob…
sgomezvillamor Jan 16, 2025
765bf80
dev(ingest): move from isort to ruff (#12364)
anshbansal Jan 16, 2025
35e8d31
fix(ingest/datahub): dataHubExecutionRequest default exclude (#12365)
anshbansal Jan 16, 2025
ad0fbd7
fix(ingest/gc): infinite loop in getting soft deleted counts (#12363)
anshbansal Jan 16, 2025
18701b7
feat(cli): for python > 3.11 log a warning (#12366)
anshbansal Jan 16, 2025
0392a22
fix(ingest/tableau): Fix TableauUpstream create check (#12320)
treff7es Jan 16, 2025
b7b541c
feat(tableau): fine-grained page size (#12354)
sgomezvillamor Jan 16, 2025
7eaadb0
fix(sdk): cleanup empty secret names (#12367)
anshbansal Jan 16, 2025
0ddf886
chore(bump): bump/align avro-serializer (#12368)
david-leifker Jan 16, 2025
bfe9758
fix(cli): list-source-runs added null checking (#12369)
kevinkarchacryl Jan 16, 2025
440ba81
docs: modify banner for dh v1 & update core dropdown link (#12362)
yoonhyejin Jan 16, 2025
3084147
fix(pdl): Add Dataplatform Instance urn pdl file (#11754)
rharisi Jan 17, 2025
4a1fff5
feat(ui-plugin) - Allow custom userContext states to be added (#12057)
mkamalas Jan 17, 2025
fb08919
feat(ui): Enhancements to the user pic list selection within entities…
Deepalijain13 Jan 17, 2025
825309e
Fix(UI): Move setUpdatedName call inside updateName promise in Datase…
Bhadhri03 Jan 17, 2025
99ce309
feat(datahub) Remove serialVersionUID from constructor (#12150)
bda618 Jan 17, 2025
9d04e2a
Merge branch 'master' into oss-merge
hsheth2 Jan 17, 2025
10 changes: 8 additions & 2 deletions .github/workflows/documentation.yml
@@ -5,6 +5,7 @@ on:
branches:
- "**"
paths:
- ".github/workflows/documentation.yml"
- "metadata-ingestion/**"
- "metadata-models/**"
- "docs/**"
@@ -13,6 +14,7 @@ on:
branches:
- master
paths:
- ".github/workflows/documentation.yml"
- "metadata-ingestion/**"
- "metadata-models/**"
- "docs/**"
@@ -56,9 +58,13 @@ jobs:
./gradlew --info docs-website:build

       - name: Deploy
-        if: github.event_name == 'push'
-        uses: peaceiris/actions-gh-pages@v3
+        if: github.event_name == 'push' && github.repository == 'acryldata/datahub'
+        uses: peaceiris/actions-gh-pages@v4
         with:
           github_token: ${{ secrets.GITHUB_TOKEN }}
           publish_dir: ./docs-website/build
           cname: datahubproject.io
+          # The gh-pages branch stores the built docs site. We don't need to preserve
+          # the full history of the .html files, since they're generated from our
+          # source files. Doing so significantly reduces the size of the repo's .git dir.
+          force_orphan: true
1 change: 1 addition & 0 deletions .gitignore
@@ -85,6 +85,7 @@ metadata-service/plugin/src/test/resources/sample-plugins/**
smoke-test/rollback-reports
coverage*.xml
.vercel
.envrc

# A long series of binary directories we should ignore
datahub-frontend/bin/main/
2 changes: 1 addition & 1 deletion build.gradle
@@ -193,7 +193,7 @@ project.ext.externalDependency = [
'junitJupiterEngine': "org.junit.jupiter:junit-jupiter-engine:$junitJupiterVersion",
// avro-serde includes dependencies for `kafka-avro-serializer` `kafka-schema-registry-client` and `avro`
'kafkaAvroSerde': "io.confluent:kafka-streams-avro-serde:$kafkaVersion",
- 'kafkaAvroSerializer': 'io.confluent:kafka-avro-serializer:5.1.4',
+ 'kafkaAvroSerializer': "io.confluent:kafka-avro-serializer:$kafkaVersion",
'kafkaClients': "org.apache.kafka:kafka-clients:$kafkaVersion-ccs",
'snappy': 'org.xerial.snappy:snappy-java:1.1.10.5',
'logbackClassic': "ch.qos.logback:logback-classic:$logbackClassic",
@@ -232,6 +232,10 @@ public static <T> T restrictEntity(@Nonnull Object entity, Class<T> clazz) {
try {
Object[] args =
allFields.stream()
// New versions of graphql.codegen generate serialVersionUID
// We need to filter serialVersionUID out because serialVersionUID is
// never part of the entity type constructor
.filter(field -> !field.getName().contains("serialVersionUID"))
.map(
field -> {
// properties are often not required but only because
7 changes: 7 additions & 0 deletions datahub-web-react/src/app/context/CustomUserContext.tsx
@@ -0,0 +1,7 @@
/**
* Custom User Context State - This is a custom user context state and can be overridden in a specific fork of DataHub.
* The below type can be customized with specific object properties as well if needed.
*/
export type CustomUserContextState = Record<string, any>;

export const DEFAULT_CUSTOM_STATE: CustomUserContextState = {};
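
For illustration, a fork might narrow `CustomUserContextState` from the open-ended `Record<string, any>` to a concrete shape. The sketch below is an assumption about how that could look, not code from this PR: the `preferredLandingPage` and `featureFlags` properties and the `updateCustomState` helper are hypothetical names.

```typescript
// Hypothetical fork-specific shape; these property names are illustrative only.
type CustomUserContextState = {
    preferredLandingPage?: string;
    featureFlags?: Record<string, boolean>;
};

const DEFAULT_CUSTOM_STATE: CustomUserContextState = {
    preferredLandingPage: '/',
    featureFlags: {},
};

// Hypothetical helper: merge a partial update into the current custom state
// and return a new object, leaving the input untouched.
function updateCustomState(
    current: CustomUserContextState,
    patch: Partial<CustomUserContextState>,
): CustomUserContextState {
    return { ...current, ...patch };
}

const nextCustomState = updateCustomState(DEFAULT_CUSTOM_STATE, { preferredLandingPage: '/search' });
console.log(nextCustomState.preferredLandingPage);
```

Typing the state this way trades the flexibility of `Record<string, any>` for compile-time checks on the fork's own fields.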
3 changes: 3 additions & 0 deletions datahub-web-react/src/app/context/userContext.tsx
@@ -1,5 +1,6 @@
import React from 'react';
import { CorpUser, PlatformPrivileges } from '../../types.generated';
import { CustomUserContextState, DEFAULT_CUSTOM_STATE } from './CustomUserContext';

/**
* Local State is persisted to local storage.
@@ -22,6 +23,7 @@ export type State = {
loadedPersonalDefaultViewUrn: boolean;
hasSetDefaultView: boolean;
};
customState?: CustomUserContextState;
};

/**
@@ -51,6 +53,7 @@ export const DEFAULT_STATE: State = {
loadedPersonalDefaultViewUrn: false,
hasSetDefaultView: false,
},
customState: DEFAULT_CUSTOM_STATE,
};

export const DEFAULT_CONTEXT = {
datahub-web-react/src/app/entity/shared/containers/profile/header/EntityName.tsx
@@ -48,9 +48,9 @@
setIsEditing(false);
return;
}
- setUpdatedName(name);
updateName({ variables: { input: { name, urn } } })
.then(() => {
+ setUpdatedName(name);

setIsEditing(false);
message.success({ content: 'Name Updated', duration: 2 });
refetch();
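
The reordering in this hunk means the displayed name is written only after the mutation has resolved. A minimal sketch of that behavior, outside React and with stand-ins for the real hooks (`fakeUpdateName`, `saveName`, and the module-level `updatedName` variable are all hypothetical, and the `.catch` handler is an assumption since the original's error handling is not shown in the hunk):

```typescript
// Stand-in for the `updateName` mutation; rejects when `shouldFail` is true.
function fakeUpdateName(name: string, shouldFail: boolean): Promise<string> {
    return shouldFail ? Promise.reject(new Error(`could not rename to ${name}`)) : Promise.resolve(name);
}

// Stand-in for the component state that `setUpdatedName` would write to.
let updatedName = 'old name';

// Mirrors the new ordering: state is written inside `.then`, so a rejected
// mutation leaves the previously displayed name in place.
function saveName(name: string, shouldFail: boolean): Promise<void> {
    return fakeUpdateName(name, shouldFail)
        .then((saved) => {
            updatedName = saved;
        })
        .catch(() => {
            // Mutation failed: keep `updatedName` as it was.
        });
}

saveName('new name', true).then(() => console.log(updatedName));
```

With the old ordering, the assignment would have run before the mutation settled, so a failed update could still change what the user sees.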
datahub-web-react/src/app/entity/shared/containers/profile/sidebar/Ownership/EditOwnersModal.tsx
@@ -78,10 +78,26 @@
const renderSearchResult = (entity: Entity) => {
const avatarUrl =
(entity.type === EntityType.CorpUser && (entity as CorpUser).editableProperties?.pictureLink) || undefined;
const corpUserDepartmentName =
(entity.type === EntityType.CorpUser && (entity as CorpUser).properties?.departmentName) || '';
const corpUserId = (entity.type === EntityType.CorpUser && (entity as CorpUser).username) || '';
const corpUserTitle = (entity.type === EntityType.CorpUser && (entity as CorpUser).properties?.title) || '';

const displayName = entityRegistry.getDisplayName(entity.type, entity);

return (
- <Select.Option value={entity.urn} key={entity.urn}>
- <OwnerLabel name={displayName} avatarUrl={avatarUrl} type={entity.type} />
+ <Select.Option
+ key={entity.urn}
+ value={entity.urn}
+ label={<OwnerLabel name={displayName} avatarUrl={avatarUrl} type={entity.type} />}
+ >
+ <OwnerLabel
+ name={displayName}
+ avatarUrl={avatarUrl}
+ type={entity.type}
+ corpUserId={corpUserId}
+ corpUserTitle={corpUserTitle}
+ corpUserDepartmentName={corpUserDepartmentName}
+ />

</Select.Option>
);
};
@@ -381,6 +397,7 @@
value: owner.value.ownerUrn,
label: owner.label,
}))}
optionLabelProp="label"

>
{ownerSearchOptions}
</SelectInput>
12 changes: 10 additions & 2 deletions datahub-web-react/src/app/shared/OwnerLabel.tsx
@@ -20,14 +20,22 @@
name: string;
avatarUrl: string | undefined;
type: EntityType;
corpUserId?: string;
corpUserTitle?: string;
corpUserDepartmentName?: string;
};

- export const OwnerLabel = ({ name, avatarUrl, type }: Props) => {
+ export const OwnerLabel = ({ name, avatarUrl, type, corpUserId, corpUserTitle, corpUserDepartmentName }: Props) => {
+ const subHeader = [corpUserId, corpUserTitle, corpUserDepartmentName].filter(Boolean).join(' - ');

return (
<OwnerContainerWrapper>
<OwnerContentWrapper>
<CustomAvatar size={24} name={name} photoUrl={avatarUrl} isGroup={type === EntityType.CorpGroup} />
- <div>{name}</div>
+ <div>
+ <div>{name}</div>
+ {subHeader && <div style={{ color: 'gray' }}>{subHeader}</div>}
+ </div>

</OwnerContentWrapper>
</OwnerContainerWrapper>
);
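
The `subHeader` expression above relies on `filter(Boolean)` to drop missing or empty fields before joining. A small standalone sketch of that idiom follows; the `buildSubHeader` wrapper and the sample values are hypothetical and only serve to show the behavior outside the component.

```typescript
// Hypothetical wrapper around the same expression used in OwnerLabel.
function buildSubHeader(corpUserId?: string, corpUserTitle?: string, corpUserDepartmentName?: string): string {
    // filter(Boolean) removes undefined and empty strings, so no dangling " - " separators appear.
    return [corpUserId, corpUserTitle, corpUserDepartmentName].filter(Boolean).join(' - ');
}

console.log(buildSubHeader('jdoe', '', 'Data Platform')); // prints "jdoe - Data Platform"
console.log(buildSubHeader() === ''); // prints true
```

Because an empty string is falsy, the `{subHeader && ...}` guard in the component skips the secondary line entirely when no field is available.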
4 changes: 4 additions & 0 deletions datahub-web-react/src/graphql/search.graphql
@@ -433,6 +433,8 @@ fragment searchResultsWithoutSchemaField on Entity {
lastName
fullName
email
departmentName
title
}
info {
active
@@ -442,6 +444,8 @@ fragment searchResultsWithoutSchemaField on Entity {
lastName
fullName
email
departmentName
title
}
editableProperties {
displayName
2 changes: 1 addition & 1 deletion docs-website/docusaurus.config.js
@@ -77,7 +77,7 @@ module.exports = {
announcementBar: {
id: "announcement-3",
content:
- '<div style="display: flex; justify-content: center; align-items: center;width: 100%;"><!--img src="/img/acryl-logo-white-mark.svg" / --><!--div style="font-size: .8rem; font-weight: 600; background-color: white; color: #111; padding: 0px 8px; border-radius: 4px; margin-right:12px;">NEW</div--><p>Watch Metadata & AI Summit sessions on-demand.</p><a href="https://www.youtube.com/@DataHubProject/videos" target="_blank" class="button">Watch Now<span> →</span></a></div>',
+ '<div style="display: flex; justify-content: center; align-items: center;width: 100%;"><!--img src="/img/acryl-logo-white-mark.svg" / --><!--div style="font-size: .8rem; font-weight: 600; background-color: white; color: #111; padding: 0px 8px; border-radius: 4px; margin-right:12px;">NEW</div--><p>Learn about DataHub 1.0 launching at our 5th birthday party!</p><a href="https://lu.ma/0j5jcocn" target="_blank" class="button">Register<span> →</span></a></div>',
backgroundColor: "#111",
textColor: "#ffffff",
isCloseable: false,
@@ -24,7 +24,7 @@ const solutionsDropdownContent = {
title: "DataHub Core",
description: "Get started with the Open Source platform.",
iconImage: "/img/solutions/icon-dropdown-core.png",
- href: "/",
+ href: "/docs/quickstart",
},
{
title: "Cloud vs Core",
8 changes: 4 additions & 4 deletions docs/cli.md
@@ -115,17 +115,17 @@ datahub ingest -c ./examples/recipes/example_to_datahub_rest.dhub.yaml --dry-run
datahub ingest -c ./examples/recipes/example_to_datahub_rest.dhub.yaml -n
```

- #### ingest --list-source-runs
+ #### ingest list-source-runs

- The `--list-source-runs` option of the `ingest` command lists the previous runs, displaying their run ID, source name,
+ The `list-source-runs` option of the `ingest` command lists the previous runs, displaying their run ID, source name,
start time, status, and source URN. This command allows you to filter results using the --urn option for URN-based
filtering or the --source option to filter by source name (partial or complete matches are supported).

```shell
# List all ingestion runs
- datahub ingest --list-source-runs
+ datahub ingest list-source-runs
# Filter runs by a source name containing "demo"
- datahub ingest --list-source-runs --source "demo"
+ datahub ingest list-source-runs --source "demo"
```

#### ingest --preview
@@ -0,0 +1,79 @@
package com.linkedin.common.urn;

import com.linkedin.data.template.Custom;
import com.linkedin.data.template.DirectCoercer;
import com.linkedin.data.template.TemplateOutputCastException;
import java.net.URISyntaxException;

public final class DataPlatformInstanceUrn extends Urn {

public static final String ENTITY_TYPE = "dataPlatformInstance";

private final DataPlatformUrn _platform;
private final String _instanceId;

public DataPlatformInstanceUrn(DataPlatformUrn platform, String instanceId) {
super(ENTITY_TYPE, TupleKey.create(platform, instanceId));
this._platform = platform;
this._instanceId = instanceId;
}

public DataPlatformUrn getPlatformEntity() {
return _platform;
}

public String getInstance() {
return _instanceId;
}

public static DataPlatformInstanceUrn createFromString(String rawUrn) throws URISyntaxException {
return createFromUrn(Urn.createFromString(rawUrn));
}

public static DataPlatformInstanceUrn createFromUrn(Urn urn) throws URISyntaxException {
if (!"li".equals(urn.getNamespace())) {
throw new URISyntaxException(urn.toString(), "Urn namespace type should be 'li'.");
} else if (!ENTITY_TYPE.equals(urn.getEntityType())) {
throw new URISyntaxException(
urn.toString(), "Urn entity type should be 'dataPlatformInstance'.");
} else {
TupleKey key = urn.getEntityKey();
if (key.size() != 2) {
throw new URISyntaxException(urn.toString(), "Invalid number of keys.");
} else {
try {
return new DataPlatformInstanceUrn(
(DataPlatformUrn) key.getAs(0, DataPlatformUrn.class),
(String) key.getAs(1, String.class));
} catch (Exception e) {
throw new URISyntaxException(urn.toString(), "Invalid URN Parameter: '" + e.getMessage() + "'");
}
}
}
}

public static DataPlatformInstanceUrn deserialize(String rawUrn) throws URISyntaxException {
return createFromString(rawUrn);
}

static {
Custom.initializeCustomClass(DataPlatformUrn.class);
Custom.initializeCustomClass(DataPlatformInstanceUrn.class);
Custom.registerCoercer(
new DirectCoercer<DataPlatformInstanceUrn>() {
public Object coerceInput(DataPlatformInstanceUrn object) throws ClassCastException {
return object.toString();
}

public DataPlatformInstanceUrn coerceOutput(Object object)
throws TemplateOutputCastException {
try {
return DataPlatformInstanceUrn.createFromString((String) object);
} catch (URISyntaxException e) {
throw new TemplateOutputCastException("Invalid URN syntax: " + e.getMessage(), e);
}
}
},
DataPlatformInstanceUrn.class);
}
}
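
To make the validation order in `createFromUrn` easier to follow, here is a simplified sketch of the same checks (namespace, then entity type, then a two-part tuple key). It is an illustration under assumptions, not a port of the Java class: the `parseDataPlatformInstanceUrn` function and `DataPlatformInstanceUrnParts` interface are hypothetical, the tuple is assumed to be written as `(part1,part2)`, and nested parentheses or escaped commas are not handled.

```typescript
// Simplified stand-in for the Java class; holds the two tuple parts as plain strings.
interface DataPlatformInstanceUrnParts {
    platform: string;
    instance: string;
}

function parseDataPlatformInstanceUrn(rawUrn: string): DataPlatformInstanceUrnParts {
    // Split "urn:<namespace>:<entityType>:<key>" into its three trailing components.
    const match = /^urn:([^:]+):([^:]+):(.*)$/.exec(rawUrn);
    if (match === null) {
        throw new Error(`Not a URN: ${rawUrn}`);
    }
    const [, namespace, entityType, key] = match;
    if (namespace !== 'li') {
        throw new Error("Urn namespace type should be 'li'.");
    }
    if (entityType !== 'dataPlatformInstance') {
        throw new Error("Urn entity type should be 'dataPlatformInstance'.");
    }
    // Assumption: the key is a flat tuple "(platformUrn,instanceId)".
    if (!key.startsWith('(') || !key.endsWith(')')) {
        throw new Error('Invalid number of keys.');
    }
    const parts = key.slice(1, -1).split(',');
    if (parts.length !== 2) {
        throw new Error('Invalid number of keys.');
    }
    return { platform: parts[0], instance: parts[1] };
}

const parsedUrn = parseDataPlatformInstanceUrn('urn:li:dataPlatformInstance:(urn:li:dataPlatform:mysql,prod_cluster)');
console.log(parsedUrn.platform, parsedUrn.instance);
```

The sample URN and the `prod_cluster` instance name are made up for the example.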
@@ -0,0 +1,27 @@
namespace com.linkedin.common

/**
* Standardized data platform instance identifier.
*/
@java.class = "com.linkedin.common.urn.DataPlatformInstanceUrn"
@validate.`com.linkedin.common.validator.TypedUrnValidator` = {
"accessible" : true,
"owningTeam" : "urn:li:internalTeam:datahub",
"entityType" : "dataPlatformInstance",
"constructable" : true,
"namespace" : "li",
"name" : "DataPlatformInstance",
"doc" : "Standardized data platform instance identifier.",
"owners" : [ "urn:li:corpuser:fbar", "urn:li:corpuser:bfoo" ],
"fields" : [ {
"type" : "com.linkedin.common.urn.DataPlatformUrn",
"name" : "platform",
"doc" : "Standardized platform urn."
}, {
"name" : "instance",
"doc" : "Instance of the data platform (e.g. db instance)",
"type" : "string",
} ],
"maxLength" : 100
}
typeref DataPlatformInstanceUrn = string
13 changes: 3 additions & 10 deletions metadata-ingestion/build.gradle
@@ -106,25 +106,18 @@ task modelDocUpload(type: Exec, dependsOn: [modelDocGen]) {


task lint(type: Exec, dependsOn: installDev) {
- /*
- The find/sed combo below is a temporary work-around for the following mypy issue with airflow 2.2.0:
- "venv/lib/python3.8/site-packages/airflow/_vendor/connexion/spec.py:169: error: invalid syntax".
- */
commandLine 'bash', '-c',
- "find ${venv_name}/lib -path *airflow/_vendor/connexion/spec.py -exec sed -i.bak -e '169,169s/ # type: List\\[str\\]//g' {} \\; && " +
"source ${venv_name}/bin/activate && set -x && " +
"black --check --diff src/ tests/ examples/ && " +
- "isort --check --diff src/ tests/ examples/ && " +
- "flake8 --count --statistics src/ tests/ examples/ && " +
+ "ruff check src/ tests/ examples/ && " +
"mypy --show-traceback --show-error-codes src/ tests/ examples/"
}

task lintFix(type: Exec, dependsOn: installDev) {
commandLine 'bash', '-c',
"source ${venv_name}/bin/activate && set -x && " +
"black src/ tests/ examples/ && " +
- "isort src/ tests/ examples/ && " +
- "flake8 src/ tests/ examples/ && " +
- "mypy --show-traceback --show-error-codes src/ tests/ examples/"
+ "ruff check --fix src/ tests/ examples/"
}

def pytest_default_env = "PYTHONDEVMODE=1"
10 changes: 7 additions & 3 deletions metadata-ingestion/developing.md
@@ -89,6 +89,7 @@ cd metadata-ingestion-modules/gx-plugin
source venv/bin/activate
datahub version # should print "DataHub CLI version: unavailable (installed in develop mode)"
```

### (Optional) Set up your Python environment for developing on Dagster Plugin

From the repository root:
@@ -99,6 +100,7 @@ cd metadata-ingestion-modules/dagster-plugin
source venv/bin/activate
datahub version # should print "DataHub CLI version: unavailable (installed in develop mode)"
```

### Common setup issues

Common issues (click to expand):
@@ -175,19 +177,21 @@ The architecture of this metadata ingestion framework is heavily inspired by [Ap

## Code style

- We use black, isort, flake8, and mypy to ensure consistent code style and quality.
+ We use black, ruff, and mypy to ensure consistent code style and quality.

```shell
# Assumes: pip install -e '.[dev]' and venv is activated
black src/ tests/
- isort src/ tests/
- flake8 src/ tests/
+ ruff check src/ tests/
mypy src/ tests/
```

or you can run from root of the repository

```shell
./gradlew :metadata-ingestion:lint

# This will auto-fix some linting issues.
./gradlew :metadata-ingestion:lintFix
```
