Implement experimental GPU two-phase occlusion culling for the standard 3D mesh pipeline. #17413

Merged: 32 commits, Jan 27, 2025
Commits
6aec99d
Implement experimental GPU two-phase occlusion culling for the standard
pcwalton Jan 3, 2025
34f693d
Doc check police
pcwalton Jan 17, 2025
b3bd9c8
Widen the `LatePreprocessWorkItemIndirectParameters` to 64 bytes to work
pcwalton Jan 17, 2025
6ff7c04
Add some missing docs
pcwalton Jan 17, 2025
75ee16e
Fix DX12
pcwalton Jan 17, 2025
fc068fb
Internal import police
pcwalton Jan 17, 2025
c48adf2
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 20, 2025
63733db
Only bail out of `build_indirect_params.wgsl` if we're in
pcwalton Jan 21, 2025
f170d1e
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 21, 2025
ff77eee
Revert changes to `meshlet/cull_clusters.wgsl`
pcwalton Jan 21, 2025
c21715d
Document that occlusion culling is incompatible with deferred shading;
pcwalton Jan 21, 2025
ef82e53
Try to fix the panic on M2; fix meshlet pass labels
pcwalton Jan 21, 2025
677c947
Address review comments on downsample depth and the meshlet query filter
pcwalton Jan 21, 2025
0c78571
Doc check police
pcwalton Jan 21, 2025
59c1424
Fix buffer sizes on Metal
pcwalton Jan 22, 2025
e498850
Gracefully avoid displaying the number of occluded meshes on macOS
pcwalton Jan 22, 2025
c4698df
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 22, 2025
2dbf614
Internal import police
pcwalton Jan 22, 2025
02b8486
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 22, 2025
d08b195
Address review comment
pcwalton Jan 22, 2025
77bfc6a
Set the push constant offset for the late mesh preprocessing phase too.
pcwalton Jan 22, 2025
795ddc4
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 22, 2025
129f12d
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 23, 2025
c1e5053
Warning police
pcwalton Jan 23, 2025
64cb9b7
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 26, 2025
c5df9c8
Update for Bevy changes
pcwalton Jan 26, 2025
a57d2a5
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 26, 2025
9d16350
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 26, 2025
76e4f8a
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 27, 2025
f025875
Fix DX12
pcwalton Jan 27, 2025
dd8e93b
Fix WebGL 2 by moving the depth downsample pipeline creation out of a
pcwalton Jan 27, 2025
6911ddf
Merge remote-tracking branch 'origin/main' into occlusion-culling-4
pcwalton Jan 27, 2025
11 changes: 11 additions & 0 deletions Cargo.toml
@@ -4073,3 +4073,14 @@ name = "Directional Navigation"
description = "Demonstration of Directional Navigation between UI elements"
category = "UI (User Interface)"
wasm = true

[[example]]
name = "occlusion_culling"
path = "examples/3d/occlusion_culling.rs"
doc-scrape-examples = true

[package.metadata.example.occlusion_culling]
name = "Occlusion Culling"
description = "Demonstration of Occlusion Culling"
category = "3D Rendering"
wasm = false
1 change: 1 addition & 0 deletions crates/bevy_core_pipeline/Cargo.toml
@@ -46,6 +46,7 @@ nonmax = "0.5"
smallvec = "1"
thiserror = { version = "2", default-features = false }
tracing = { version = "0.1", default-features = false, features = ["std"] }
bytemuck = { version = "1" }

[lints]
workspace = true
2 changes: 2 additions & 0 deletions crates/bevy_core_pipeline/src/core_2d/mod.rs
@@ -312,6 +312,8 @@ impl PhaseItem for AlphaMask2d {
}

impl BinnedPhaseItem for AlphaMask2d {
// Since 2D meshes presently can't be multidrawn, the batch set key is
// irrelevant.
type BatchSetKey = BatchSetKey2d;

type BinKey = AlphaMask2dBinKey;
44 changes: 38 additions & 6 deletions crates/bevy_core_pipeline/src/core_3d/mod.rs
@@ -16,7 +16,9 @@ pub mod graph {
#[derive(Debug, Hash, PartialEq, Eq, Clone, RenderLabel)]
pub enum Node3d {
MsaaWriteback,
Prepass,
EarlyPrepass,
Review comment (Contributor):

Is the early prepass running the full prepass? E.g. for deferred is it doing the gbuffer rendering too?

Not sure, but it might make more sense to do depth only in the early pass, and then depth + other attachments in the late pass.

Reply (Contributor Author):

I'm pretty sure that Griffin said that you usually split the full prepass into early and late phases rather than having a separate z-prepass, but I'd rather not make that change here as this patch is too big as it is. Instead I just documented that occlusion culling is currently incompatible with deferred. We can add support for deferred in a followup.

EarlyDownsampleDepth,
LatePrepass,
DeferredPrepass,
CopyDeferredLightingId,
EndPrepasses,
@@ -25,6 +27,7 @@ pub mod graph {
MainTransmissivePass,
MainTransparentPass,
EndMainPass,
LateDownsampleDepth,
Taa,
MotionBlur,
Bloom,
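
Note on the node order above: EarlyPrepass, EarlyDownsampleDepth, and LatePrepass appear to follow the general two-phase occlusion culling pattern, where objects visible last frame are drawn first, and the remaining objects are then tested against the resulting depth. The following is a toy one-dimensional CPU sketch of that general technique, not the code in this PR. `Object`, `is_occluded`, `rasterize`, and `render_two_phase` are hypothetical names, depth here is a plain distance (smaller is nearer), and a flat per-column buffer stands in for the GPU depth pyramid.

```rust
use std::ops::Range;

/// Hypothetical object: covers a range of screen columns at a constant depth.
#[derive(Clone, Debug)]
struct Object {
    columns: Range<usize>,
    depth: f32, // distance from the camera; smaller is nearer in this sketch
    visible_last_frame: bool,
}

/// An object is occluded if every column it covers already holds
/// something strictly nearer.
fn is_occluded(depth_buffer: &[f32], object: &Object) -> bool {
    object.columns.clone().all(|x| depth_buffer[x] < object.depth)
}

/// Writes the object's depth wherever it is the nearest surface so far.
fn rasterize(depth_buffer: &mut [f32], object: &Object) {
    for x in object.columns.clone() {
        depth_buffer[x] = depth_buffer[x].min(object.depth);
    }
}

/// Runs one frame and returns the indices of the objects that were drawn.
fn render_two_phase(objects: &mut [Object], width: usize) -> Vec<usize> {
    let mut depth_buffer = vec![f32::INFINITY; width];
    let mut drawn = Vec::new();

    // Early phase: draw what was visible last frame, without testing.
    for (index, object) in objects.iter().enumerate() {
        if object.visible_last_frame {
            rasterize(&mut depth_buffer, object);
            drawn.push(index);
        }
    }

    // Late phase: test everything else against the early-phase depth and
    // draw only the objects that are not fully hidden.
    for (index, object) in objects.iter().enumerate() {
        if !object.visible_last_frame && !is_occluded(&depth_buffer, object) {
            rasterize(&mut depth_buffer, object);
            drawn.push(index);
        }
    }

    // Record visibility against the final depth for the next frame.
    for object in objects.iter_mut() {
        object.visible_last_frame = !is_occluded(&depth_buffer, object);
    }
    drawn
}
```

With a wall in front, a crate fully behind it, and a tree partly behind it, the crate is skipped in the late phase while the tree is drawn because some of its columns are uncovered.
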
@@ -67,9 +70,10 @@ use core::ops::Range;

use bevy_render::{
batching::gpu_preprocessing::{GpuPreprocessingMode, GpuPreprocessingSupport},
experimental::occlusion_culling::OcclusionCulling,
mesh::allocator::SlabId,
render_phase::PhaseItemBatchSetKey,
view::{NoIndirectDrawing, RetainedViewEntity},
view::{prepare_view_targets, NoIndirectDrawing, RetainedViewEntity},
};
pub use camera_3d::*;
pub use main_opaque_pass_3d_node::*;
@@ -114,8 +118,9 @@ use crate::{
},
dof::DepthOfFieldNode,
prepass::{
node::PrepassNode, AlphaMask3dPrepass, DeferredPrepass, DepthPrepass, MotionVectorPrepass,
NormalPrepass, Opaque3dPrepass, OpaqueNoLightmap3dBatchSetKey, OpaqueNoLightmap3dBinKey,
node::{EarlyPrepassNode, LatePrepassNode},
AlphaMask3dPrepass, DeferredPrepass, DepthPrepass, MotionVectorPrepass, NormalPrepass,
Opaque3dPrepass, OpaqueNoLightmap3dBatchSetKey, OpaqueNoLightmap3dBinKey,
ViewPrepassTextures, MOTION_VECTOR_PREPASS_FORMAT, NORMAL_PREPASS_FORMAT,
},
skybox::SkyboxPlugin,
@@ -161,6 +166,9 @@ impl Plugin for Core3dPlugin {
(
sort_phase_system::<Transmissive3d>.in_set(RenderSet::PhaseSort),
sort_phase_system::<Transparent3d>.in_set(RenderSet::PhaseSort),
configure_occlusion_culling_view_targets
.after(prepare_view_targets)
.in_set(RenderSet::ManageViews),
prepare_core_3d_depth_textures.in_set(RenderSet::PrepareResources),
prepare_core_3d_transmission_textures.in_set(RenderSet::PrepareResources),
prepare_prepass_textures.in_set(RenderSet::PrepareResources),
@@ -169,7 +177,8 @@ impl Plugin for Core3dPlugin {

render_app
.add_render_sub_graph(Core3d)
.add_render_graph_node::<ViewNodeRunner<PrepassNode>>(Core3d, Node3d::Prepass)
.add_render_graph_node::<ViewNodeRunner<EarlyPrepassNode>>(Core3d, Node3d::EarlyPrepass)
.add_render_graph_node::<ViewNodeRunner<LatePrepassNode>>(Core3d, Node3d::LatePrepass)
.add_render_graph_node::<ViewNodeRunner<DeferredGBufferPrepassNode>>(
Core3d,
Node3d::DeferredPrepass,
@@ -200,7 +209,8 @@ impl Plugin for Core3dPlugin {
.add_render_graph_edges(
Core3d,
(
Node3d::Prepass,
Node3d::EarlyPrepass,
Node3d::LatePrepass,
Node3d::DeferredPrepass,
Node3d::CopyDeferredLightingId,
Node3d::EndPrepasses,
@@ -898,6 +908,28 @@ pub fn prepare_core_3d_transmission_textures(
}
}

/// Sets the `TEXTURE_BINDING` flag on the depth texture if necessary for
/// occlusion culling.
///
/// We need that flag to be set in order to read from the texture.
fn configure_occlusion_culling_view_targets(
mut view_targets: Query<
&mut Camera3d,
(
With<OcclusionCulling>,
Without<NoIndirectDrawing>,
With<DepthPrepass>,
Without<DeferredPrepass>,
),
>,
) {
for mut camera_3d in &mut view_targets {
let mut depth_texture_usages = TextureUsages::from(camera_3d.depth_texture_usages);
depth_texture_usages |= TextureUsages::TEXTURE_BINDING;
camera_3d.depth_texture_usages = depth_texture_usages.into();
}
}

// Disable MSAA and warn if using deferred rendering
pub fn check_msaa(mut deferred_views: Query<&mut Msaa, (With<Camera>, With<DeferredPrepass>)>) {
for mut msaa in deferred_views.iter_mut() {
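
Note on `configure_occlusion_culling_view_targets` above: the function converts the camera's stored usage value into a flag set, ORs in `TEXTURE_BINDING`, and converts back. The following is a minimal self-contained sketch of that round trip. `Usages` and `StoredUsage` are hypothetical stand-ins for the wgpu and Bevy types, the bit positions are illustrative, and `enable_depth_sampling` is an invented helper name.

```rust
use std::ops::BitOrAssign;

/// Hypothetical stand-in for a texture-usage flag set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Usages(u32);

impl Usages {
    // Illustrative bit positions; assumed, not taken from the PR.
    const TEXTURE_BINDING: Usages = Usages(1 << 2);
    const RENDER_ATTACHMENT: Usages = Usages(1 << 4);

    /// True if every bit of `other` is set in `self`.
    fn contains(self, other: Usages) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOrAssign for Usages {
    fn bitor_assign(&mut self, rhs: Usages) {
        self.0 |= rhs.0;
    }
}

/// Hypothetical stand-in for the raw value stored on the camera component.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct StoredUsage(u32);

impl From<StoredUsage> for Usages {
    fn from(stored: StoredUsage) -> Self {
        Usages(stored.0)
    }
}

impl From<Usages> for StoredUsage {
    fn from(usages: Usages) -> Self {
        StoredUsage(usages.0)
    }
}

/// Adds the flag needed to read the depth texture from a shader,
/// leaving all existing flags in place.
fn enable_depth_sampling(stored: StoredUsage) -> StoredUsage {
    let mut usages = Usages::from(stored);
    usages |= Usages::TEXTURE_BINDING;
    usages.into()
}
```

Because the update is a bitwise OR, applying it to a camera that already has the flag leaves the value unchanged, so running it every frame is harmless.
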
@@ -1,8 +1,16 @@
#ifdef MESHLET_VISIBILITY_BUFFER_RASTER_PASS_OUTPUT
@group(0) @binding(0) var<storage, read> mip_0: array<u64>; // Per pixel
Review comment (JMS55, Contributor, Jan 20, 2025):

Just as a note: I believe all the meshlet-specific stuff is going to disappear here once wgpu 24 is merged and I can switch back to an image-based visbuffer.

#else
#ifdef MESHLET
@group(0) @binding(0) var<storage, read> mip_0: array<u32>; // Per pixel
#endif
#else // MESHLET
#ifdef MULTISAMPLE
@group(0) @binding(0) var mip_0: texture_depth_multisampled_2d;
#else // MULTISAMPLE
@group(0) @binding(0) var mip_0: texture_depth_2d;
#endif // MULTISAMPLE
#endif // MESHLET
#endif // MESHLET_VISIBILITY_BUFFER_RASTER_PASS_OUTPUT
Review comment on lines +11 to +13 (Contributor):

I am reminded of how spoiled I am getting to just write Rust.

@group(0) @binding(1) var mip_1: texture_storage_2d<r32float, write>;
@group(0) @binding(2) var mip_2: texture_storage_2d<r32float, write>;
@group(0) @binding(3) var mip_3: texture_storage_2d<r32float, write>;
@@ -304,9 +312,25 @@ fn load_mip_0(x: u32, y: u32) -> f32 {
let i = y * constants.view_width + x;
#ifdef MESHLET_VISIBILITY_BUFFER_RASTER_PASS_OUTPUT
return bitcast<f32>(u32(mip_0[i] >> 32u));
#else
#else // MESHLET_VISIBILITY_BUFFER_RASTER_PASS_OUTPUT
#ifdef MESHLET
return bitcast<f32>(mip_0[i]);
#endif
#else // MESHLET
// Downsample the top level.
#ifdef MULTISAMPLE
// The top level is multisampled, so we need to loop over all the samples
// and reduce them to 1.
var result = textureLoad(mip_0, vec2(x, y), 0);
let sample_count = i32(textureNumSamples(mip_0));
for (var sample = 1; sample < sample_count; sample += 1) {
result = min(result, textureLoad(mip_0, vec2(x, y), sample));
}
return result;
#else // MULTISAMPLE
return textureLoad(mip_0, vec2(x, y), 0);
#endif // MULTISAMPLE
#endif // MESHLET
#endif // MESHLET_VISIBILITY_BUFFER_RASTER_PASS_OUTPUT
}

fn reduce_4(v: vec4f) -> f32 {
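
Note on `load_mip_0` above: the three branches read depth from a packed 64-bit visibility buffer, from raw 32-bit depth bits, or from a (possibly multisampled) depth texture. The following is a rough CPU-side Rust sketch of the same arithmetic, with hypothetical function names. Assuming reversed-Z depth (larger means nearer), taking the minimum over samples keeps the farthest one, which is the conservative choice for occlusion tests.

```rust
/// Visibility-buffer path: the depth bits sit in the upper 32 bits of each
/// 64-bit entry, mirroring `bitcast<f32>(u32(mip_0[i] >> 32u))`.
fn depth_from_visbuffer(packed: u64) -> f32 {
    f32::from_bits((packed >> 32) as u32)
}

/// Depth-only meshlet path: the entry is the raw bit pattern of the depth,
/// mirroring `bitcast<f32>(mip_0[i])`.
fn depth_from_bits(bits: u32) -> f32 {
    f32::from_bits(bits)
}

/// Multisampled path: reduce all samples of one pixel to a single value by
/// taking the minimum. Returns `None` for an empty sample list.
fn resolve_min_depth(samples: &[f32]) -> Option<f32> {
    let (first, rest) = samples.split_first()?;
    Some(rest.iter().fold(*first, |acc, &sample| acc.min(sample)))
}
```

The shader version starts from sample 0 and loops from sample 1 upward; `split_first` plus `fold` expresses the same reduction.
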