Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable end to end non-DPS testing #289

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jhalakpatel
Copy link
Collaborator

@jhalakpatel jhalakpatel commented Oct 18, 2024

Implement python binding changes to allow execute function return
multiple returns. Update tests to use non-DPS style calling convention.

Also, enable end to end lowering by enabling conversion of closed alloc group op to tensorrt dialect.

Miscellaneous fixes:

  1. Add missing handling of CallAllocOp in EliminateShapeOps pass.
  2. Skip non ranked tensor type function arguments while collecting host tensor arguments.
  3. Temporarily add a pass to remove clone operation in MemRefToExecutor dialect conversion.
  4. Relax memref creation for empty shape tensors.
  5. Fix memref life returned from Lua function results. This required session allocator to track returned memref.
  6. Return error status instead of silently erroring out.

@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch from ad0cdfb to fcb0964 Compare November 4, 2024 21:47
@jhalakpatel jhalakpatel changed the title [Python/Bindings] Allow execute_function API to return values Enable end to end non-DPS testing Nov 4, 2024
@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch 2 times, most recently from 9fb24dd to 43d4c93 Compare November 5, 2024 01:05
@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch 4 times, most recently from 7254cbe to ac28e6c Compare November 6, 2024 06:12
@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch 4 times, most recently from e75953d to 070359d Compare November 6, 2024 22:04
@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch from 070359d to d839011 Compare November 7, 2024 03:34
@@ -53,7 +53,7 @@ extern "C" {
/// caller must be sure to delete errors via mtrtStatusDestroy.
//===----------------------------------------------------------------------===//

typedef struct MTRT_RuntimeClient MTRT_Runtimeclient;
struct MTRT_RuntimeClient; // Forward declaration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a C header, not C++.

Without the typedef, we would fail to compile when including this header in a C library (e.g. we would need to rename all the types below to struct MTRT_RuntimeClient instead of just MTRT_RuntimeClient. The typedef is a requirement.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any idea why we use MTRT_Runtimeclient instead of MTRT_RuntimeClient? Is it just a typo?


results.push_back(std::move(*memref));
(*client)->getAllocTracker().incrementExternalCount((*memRef)->getMemory());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this violates an invariant that we should be enforcing -- external reference count should be 0 for any "externally managed" pointer. Reference counts are only required for the object that owns the resource (which in this case is the sesssion).

I've been meaning to erase the distinction between client/session trackers and just have RuntimeSession use the Client's tracker. But until we do that, what we need here actually is to "release" the pointer from the session ownership and have the client assume ownership.

@christopherbate
Copy link
Collaborator

Overall, looks good except for minor comments. Great to see it working end-to-end, and of course much kudos to you for getting this working!

@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch 5 times, most recently from 2ee6ed3 to eed9be4 Compare November 9, 2024 00:28
Implement python binding changes to allow execute function return
multiple returns. Update tests to use non-DPS style calling convention.

Also, enable end to end lowering by enabling conversion of closed alloc
group op to tensorrt dialect.

Miscellaneous fixes:
1. Add missing handling of `CallAllocOp` in EliminateShapeOps pass.
2. Skip non ranked tensor type function arguments while collecting host
   tensor arguments.
3. Temporarily add a pass to remove clone operation in MemRefToExecutor
   dialect conversion.
4. Relax memref creation for empty shape tensors.
5. Fix memref life returned from Lua function results. This required
   session allocator to track returned memref.

Also, address
Fix incorrect indexing into output memref results
Return error status instead of silently erroring out during TensorRT weight conversion
Address review comments
@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch from eed9be4 to 6e647ca Compare November 9, 2024 19:20
@@ -650,9 +650,6 @@ parseResults(const sol::protected_function_result &pfr,
if (!memref.isOk())
return memref.getStatus();

// Increment external reference count since we are returning a memref
allocator.incrementExternalCount(info.ptr);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need this. Reference count should only be incremented if a explicitly create a DL pack tensor on a memref.
Here memref is internally owned and tracked.

@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch from 7b25b15 to 9334481 Compare November 9, 2024 22:19
@jhalakpatel jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch from 9334481 to 8710d9a Compare November 9, 2024 22:19
@jhalakpatel
Copy link
Collaborator Author

@christopherbate were you able to merge changes from this PR?

@christopherbate
Copy link
Collaborator

Sorry, fell behind while you were gone and this wasn't blocking TriPy team. I will merge this week.

@@ -379,20 +379,22 @@ struct ConvertEnqueueAllocToCall

// Create output memrefs from output descriptors
SmallVector<Value> results;
// Initialize output descriptor offset to skip number of results.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these changes are merged

// CHECK: %[[v12:.*]] = executor.getoffset[0, 1] : () -> i64, !executor.table<i64, i64, i64, i64, i64, i64, i64, i64, i64, i64, i64>
// CHECK: executor.store %[[c4]] to %[[v10]] + %[[v12]] : i64, !executor.ptr<host>, i64
// CHECK: %[[v11:.*]] = executor.getoffset[0, 0]
// CHECK: executor.store %[[c1]] to %[[v10]] + %[[v11]]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes in this file are merged

@@ -65,17 +65,26 @@ struct RemoveWithValuesRewriter : public OpRewritePattern<plan::WithValuesOp> {
} // namespace
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes here were merged with some additional tests

@@ -156,28 +156,32 @@ getTensorRTShapeProfile(plan::BoundsAttr attr, Value v) {
return getProfileAttr(attr.getMinShape(), attr.getMaxShape());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this pass were merged with some additional tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants