Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small improvements #493

Merged
merged 1 commit into from
Dec 20, 2024
Merged

Small improvements #493

merged 1 commit into from
Dec 20, 2024

Conversation

dpxcc
Copy link
Contributor

@dpxcc dpxcc commented Dec 14, 2024

  1. Make pgduckdb/pgduckdb_types.hpp self-contained
  2. Enable InvokeCPPFunc to call functions without arguments

@dpxcc dpxcc marked this pull request as ready for review December 14, 2024 01:28
@dpxcc
Copy link
Contributor Author

dpxcc commented Dec 14, 2024

Another issue I noticed is that the code isn't correctly formatted - running make format results in changes to code

Copy link
Collaborator

@Y-- Y-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contribution!

Not opposed to these changes, but not clear what it solve? Would you mind clarifying in the PR description? Thanks!

JelteF added a commit that referenced this pull request Dec 16, 2024
In #98 we started using the same version of clang-format that DuckDB
uses. That version is pretty ancient though. So it misses a bunch of
features that we might want to use, and also has a bunch of bugs which
result in sub-optimal formatting. This updates `clang-format` to the
latest version, and while we're at it we also update `ruff`.

To be able to reformat all existing files, this also adds a `format-all`
target to our `Makefile`.

The problem was called out by @dpxcc in #493, but I've been also running
into this myself for a while.
@dpxcc
Copy link
Contributor Author

dpxcc commented Dec 16, 2024

@Y-- Sorry, I think I forgot to submit my inline comments :(

@@ -1,5 +1,6 @@
#pragma once

#include "duckdb/common/types/data_chunk.hpp"
Copy link
Contributor Author

@dpxcc dpxcc Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header file used several DuckDB definitions, which should be included directly in the header file
duckdb/common/types/data_chunk.hpp includes them all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this fix, all .cpp files must include duckdb/common/types/data_chunk.hpp (or any file that includes it) before including pgduckdb/pgduckdb_types.hpp
This is undesirable, and the header file should be self-contained

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, but in this case we should clean the cpp files where pgduckdb_types.hpp is included to remove the extra duckdb includes no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't include data_chunk.hpp anywhere else. That brings two questions:

  1. Should this header simply include duckdb.hpp instead of a specific header?
  2. @Y-- your suggestion for removal then becomes: should we remove duckdb.hpp from cpp files that already include pgduckdb_types.hpp?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I think if we're going to make this header self-contained. Or maybe better we should forward-declare the various DuckDB classes we're using in the functions below?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and yes :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention in duckdb codebase seems to favor including specific headers rather than the entire duckdb.hpp
In my experience, using specific headers reduces compile time significantly, often by a factor of 3 (e.g., 1.8s -> 0.6s per file)

Copy link
Collaborator

@JelteF JelteF Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpxcc can you do this? then we can merge this PR imo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to submit my comments..
Could you check my previous comments?
If you feel strongly to include entire duckdb.hpp instead of specific headers in pg_duckdb, I can proceed with the proposed changes

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the specific headers (and definitely agree with your point, it's much better)

@@ -30,7 +30,7 @@ __CPPFunctionGuard__(const char *func_name, FuncArgs... args) {

}

#define InvokeCPPFunc(FUNC, ...) pgduckdb::__CPPFunctionGuard__<decltype(&FUNC), &FUNC>(__FUNCTION__, __VA_ARGS__)
#define InvokeCPPFunc(FUNC, ...) pgduckdb::__CPPFunctionGuard__<decltype(&FUNC), &FUNC>(__FUNCTION__, ##__VA_ARGS__)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed change enables the InvokeCPPFunc macro to be used with empty __VA_ARGS__

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this fix, InvokeCPPFunc(FuncWithoutArg) cannot be used to call a function without input arguments

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, but we don't need it for now. Ideally I'd prefer to change it only when we have this use case. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd think it makes sense to just merge this, it seems more like an oversight to not have this macro support functions with no parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To provide some context, this issue arises because pg_mooncake, built on pg_duckdb, uses InvokeCPPFunc with a no-parameter function. We've fixed it in pg_mooncake and would like to upstream the fix

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice, thanks for the context, that's very helpful!
Don't hesitate to provide feedback or reach out to us if we can be more friendly for such cases in the future :-)

@wuputah
Copy link
Collaborator

wuputah commented Dec 16, 2024

heya, the PR name + description is what gets committed when we do the squash merge, so if you could improve that, it would be most appreciated. Thanks!

@dpxcc
Copy link
Contributor Author

dpxcc commented Dec 16, 2024

@wuputah Done. Thanks for the information!

@dpxcc dpxcc force-pushed the main branch 2 times, most recently from d1a2548 to 93d5e3f Compare December 17, 2024 01:44
1. Make `pgduckdb/pgduckdb_types.hpp` self-contained
2. Enable `InvokeCPPFunc` to call functions without arguments
Copy link
Collaborator

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpxcc general feedback. Since these changes are totally separate. I'd normally prefer to get them as two PRs. Mainly, so that if one of the changes turns out to be contentious while the other is not, we can merge the non contentious part without it being blocked on the other unrelated part.

No need to split up this one though, unless you feel like doing so.

@dpxcc
Copy link
Contributor Author

dpxcc commented Dec 18, 2024

@dpxcc general feedback. Since these changes are totally separate. I'd normally prefer to get them as two PRs. Mainly, so that if one of the changes turns out to be contentious while the other is not, we can merge the non contentious part without it being blocked on the other unrelated part.

No need to split up this one though, unless you feel like doing so.

Thanks for the information! I will follow this guideline in the future

@JelteF JelteF enabled auto-merge (squash) December 20, 2024 14:46
@JelteF JelteF merged commit 3b34bee into duckdb:main Dec 20, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants