Skip to content

Commit

Permalink
Merge branch 'main' into fix-spill
Browse files Browse the repository at this point in the history
  • Loading branch information
kosiew committed Jan 3, 2025
2 parents 01d2b60 + 63265fd commit e094adb
Show file tree
Hide file tree
Showing 61 changed files with 1,991 additions and 982 deletions.
13 changes: 13 additions & 0 deletions .devcontainer/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
FROM rust:bookworm

RUN apt-get update && export DEBIAN_FRONTEND=noninteractive \
# Remove imagemagick due to https://security-tracker.debian.org/tracker/CVE-2019-10131
&& apt-get purge -y imagemagick imagemagick-6-common

# Add protoc
# https://datafusion.apache.org/contributor-guide/getting_started.html#protoc-installation
RUN curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v25.1/protoc-25.1-linux-x86_64.zip \
&& unzip protoc-25.1-linux-x86_64.zip -d $HOME/.local \
&& rm protoc-25.1-linux-x86_64.zip

ENV PATH="$PATH:$HOME/.local/bin"
16 changes: 16 additions & 0 deletions .devcontainer/devcontainer.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
"build": {
"dockerfile": "./Dockerfile",
"context": "."
},
"customizations": {
"vscode": {
"extensions": [
"rust-lang.rust-analyzer"
]
}
},
"features": {
"ghcr.io/devcontainers/features/rust:1": "latest"
}
}
54 changes: 54 additions & 0 deletions .github/workflows/hash_collisions.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

name: Rust Hash Collisions

concurrency:
group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
cancel-in-progress: true

# https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#running-your-pull_request-workflow-when-a-pull-request-merges
#
# this job is intended to only run on merge to main branch
on:
pull_request:
branches:
- main
types:
- closed

jobs:
# Check answers are correct when hash values collide
hash-collisions:
name: cargo test hash collisions (amd64)
runs-on: ubuntu-latest
container:
image: amd64/rust
if: github.event.pull_request.merged == true
steps:
- uses: actions/checkout@v4
with:
submodules: true
fetch-depth: 1
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
rust-version: stable
- name: Run tests
run: |
cd datafusion
cargo test --profile ci --exclude datafusion-examples --exclude datafusion-benchmarks --exclude datafusion-sqllogictest --workspace --lib --tests --features=force_hash_collisions,avro
21 changes: 0 additions & 21 deletions .github/workflows/rust.yml
Original file line number Diff line number Diff line change
Expand Up @@ -541,27 +541,6 @@ jobs:
- name: Run clippy
run: ci/scripts/rust_clippy.sh

# Check answers are correct when hash values collide
hash-collisions:
name: cargo test hash collisions (amd64)
needs: linux-build-lib
runs-on: ubuntu-latest
container:
image: amd64/rust
steps:
- uses: actions/checkout@v4
with:
submodules: true
fetch-depth: 1
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
rust-version: stable
- name: Run tests
run: |
cd datafusion
cargo test --profile ci --exclude datafusion-examples --exclude datafusion-benchmarks --exclude datafusion-sqllogictest --workspace --lib --tests --features=force_hash_collisions,avro
cargo-toml-formatting-checks:
name: check Cargo.toml formatting
needs: linux-build-lib
Expand Down
4 changes: 4 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,7 @@
[submodule "testing"]
path = testing
url = https://github.com/apache/arrow-testing
[submodule "datafusion-testing"]
path = datafusion-testing
url = https://github.com/apache/datafusion-testing.git
branch = main
4 changes: 2 additions & 2 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -129,7 +129,7 @@ futures = "0.3"
half = { version = "2.2.1", default-features = false }
hashbrown = { version = "0.14.5", features = ["raw"] }
indexmap = "2.0.0"
itertools = "0.13"
itertools = "0.14"
log = "^0.4"
object_store = { version = "0.11.0", default-features = false }
parking_lot = "0.12"
Expand All @@ -145,7 +145,7 @@ prost-derive = "0.13.1"
rand = "0.8"
recursive = "0.1.1"
regex = "1.8"
rstest = "0.23.0"
rstest = "0.24.0"
serde_json = "1"
sqlparser = { version = "0.53.0", features = ["visitor"] }
tempfile = "3"
Expand Down
33 changes: 22 additions & 11 deletions datafusion-cli/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion datafusion-examples/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ cargo run --example dataframe
- [`custom_datasource.rs`](examples/custom_datasource.rs): Run queries against a custom datasource (TableProvider)
- [`custom_file_format.rs`](examples/custom_file_format.rs): Write data to a custom file format
- [`dataframe-to-s3.rs`](examples/external_dependency/dataframe-to-s3.rs): Run a query using a DataFrame against a parquet file from s3 and writing back to s3
- [`dataframe.rs`](examples/dataframe.rs): Run a query using a DataFrame API against parquet files, csv files, and in-memory data. Also demonstrates the various methods to write out a DataFrame to a table, parquet file, csv file, and json file.
- [`dataframe.rs`](examples/dataframe.rs): Run a query using a DataFrame API against parquet files, csv files, and in-memory data, including multiple subqueries. Also demonstrates the various methods to write out a DataFrame to a table, parquet file, csv file, and json file.
- [`deserialize_to_struct.rs`](examples/deserialize_to_struct.rs): Convert query results into rust structs using serde
- [`expr_api.rs`](examples/expr_api.rs): Create, execute, simplify, analyze and coerce `Expr`s
- [`file_stream_provider.rs`](examples/file_stream_provider.rs): Run a query on `FileStreamProvider` which implements `StreamProvider` for reading and writing to arbitrary stream sources / sinks.
Expand Down
52 changes: 0 additions & 52 deletions datafusion-examples/examples/config_extension.rs

This file was deleted.

Loading

0 comments on commit e094adb

Please sign in to comment.