Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support nullable base types for (*Build[T]) #638

Merged
merged 1 commit into from
Dec 18, 2023

Conversation

gernest
Copy link
Contributor

@gernest gernest commented Dec 17, 2023

Before, we had null tag that set schema column to nullable. However the base types allowed were int64|float64|bool|string.

This begged the question, what is null? for a nullable field of type int64 do we interpret 0 as null?, probably not. We were offering option for something that was impossible to represent.

Fortunately, we can safely represent null values with out type system. For
base types *int64|*float64|*bool|*string defines nullable base types.

This commit add support for both int64|float64|bool|string and *int64|*float64|*bool|*string for base types . This works with dynamic columns too.

With this change null tag is redundant so It was removed. Fields of type *int64|*float64|*bool|*string will automatically generate nullable schema
column.

NOTE: I also discovered a bug where when T has dynamic column and you append multiple T without any dynamic column followed by T with dynamic column caused a panic. The issue was how we adjusted dynamic columns to match current .Append rows state. The fix is included in this commit because I needed this behavior in tests.

@gernest gernest force-pushed the nullable branch 4 times, most recently from a76e291 to 2d1127e Compare December 17, 2023 01:33
Before, we had `null` tag that set schema column to nullable. However the base
types allowed were `int64|float64|bool|string`.

This begged the question, what is `null`? for a `nullable` field of type
`int64` do we interpret `0` as `null`?, probably not. We were offering option
for something that was impossible to represent.

Fortunately, we can safely represent null values with out type system. For
base types   `*int64|*float64|*bool|*string` defines nullable base types.

This commit add support for both `int64|float64|bool|string` and
`*int64|*float64|*bool|*string` for base types . This works with dynamic
columns too.

With this change `null` tag is redundant so It was removed. Fields of type
`*int64|*float64|*bool|*string` will automatically generate nullable schema
 column.

NOTE: I also discovered a bug where when `T` has dynamic column and you append
multiple `T` without any dynamic column followed by T with dynamic column
caused a panic. The issue was how we adjusted dynamic columns to match current
`.Append` rows state. The fix is included in this commit because I needed this
behavior in tests.
Copy link
Contributor

@thorfour thorfour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work thanks!

@thorfour thorfour merged commit 7926315 into polarsignals:main Dec 18, 2023
3 checks passed
@gernest gernest deleted the nullable branch December 18, 2023 20:54
gernest added a commit to gernest/frostdb that referenced this pull request Dec 23, 2023
* main:
  Optimize dynparquet..deserializeDynamicColumns (polarsignals#652)
  Remove bufutils.Dedupe (polarsignals#650)
  Performance improvement on pqarrow.RecordDynamicCols (polarsignals#644)
  Remove dynparquet.ValuesToBuffer (polarsignals#645)
  parts list no longer used (polarsignals#649)
  dynparquet/samples: Make ToRecord consistent (polarsignals#648)
  parquet/converter: New Record call ResetFull on dictionary builders (polarsignals#646)
  parquet-tool: check in cmd col (polarsignals#643)
  Remove unused function (polarsignals#641)
  Support nullable base types for (*Build[T]) (polarsignals#638)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants