Unity Shader Series (20): Hands-On GPU Fluid Simulation in Unity: Compute Shaders and Double Buffering

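GPU fluid simulation is one of the most technically demanding areas of Unity development, and it is the core technique behind interactive water surfaces, ink diffusion, and flowing smoke in games. Unlike the framebuffer feedback used on ShaderToy, a GPU fluid simulation in Unity uses a Compute Shader with RenderTexture ping-pong: a Compute Shader reads the fluid state of the previous step, computes the new state, and writes it to a second RenderTexture, and the result is then visualized in URP. This article provides a complete, interactive 2D fluid simulation, including the Compute Shader, the C# control script, and the URP visualization shader.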

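Unity Compute Shader Basics

A Compute Shader is a general-purpose program that runs in parallel on the GPU, independent of the rendering pipeline. In Unity:

  • File extension: .compute
  • Shader type: not a rendering shader but GPGPU (general-purpose GPU computing)
  • Main data types: RWTexture2D (read-write texture), RWStructuredBuffer (read-write structured buffer)
  • Invocation: computeShader.Dispatch(kernelIndex, threadGroupsX, threadGroupsY, threadGroupsZ)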
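// Basic structure of a Compute Shader

// Declares a compute kernel (entry point). The comment sits on its own line
// because Unity does not accept // comments after a #pragma kernel directive.
#pragma kernel CSMain

// The RWTexture2D element type must be compatible with the RenderTextureFormat chosen on the C# side
RWTexture2D<float4> _Result; // write target
Texture2D<float4> _Source;   // read-only source

uint2 _Resolution; // texture resolution in pixels

// 8×8 threads per thread group (tune for the target GPU; 8×8 or 16×16 is typical)
[numthreads(8, 8, 1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    // id.xy: global coordinate of this thread (the pixel coordinate)
    if (id.x >= _Resolution.x || id.y >= _Resolution.y) return;

    float4 current = _Source[id.xy];
    // ...process current...
    _Result[id.xy] = current;
}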

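Double-Buffer (Ping-Pong) Architecture

A fluid simulation has to read the previous state while writing the new one. That cannot be done safely on a single texture, because a thread may read a neighboring texel that another thread has already overwritten. Ping-pong double buffering solves this with two RenderTextures that swap roles after every pass: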

// Creating the double-buffered RenderTextures in C#
private RenderTexture[] _pingPongBuffer = new RenderTexture[2];
private int _currentBuffer = 0;

void CreateBuffers(int width, int height)
{
    for (int i = 0; i < 2; i++)
    {
        _pingPongBuffer[i] = new RenderTexture(width, height, 0,
            RenderTextureFormat.ARGBFloat, // 32-bit float per channel (stores velocity and density)
            RenderTextureReadWrite.Linear
        );
        _pingPongBuffer[i].enableRandomWrite = true; // allow Compute Shader writes; set before Create()
        _pingPongBuffer[i].filterMode = FilterMode.Bilinear;
        _pingPongBuffer[i].wrapMode = TextureWrapMode.Clamp;
        _pingPongBuffer[i].Create();
    }
}

void SimulateStep()
{
    int read = _currentBuffer;
    int write = 1 - _currentBuffer;

    // Read from the read buffer, write into the write buffer
    _computeShader.SetTexture(kernelID, "_Source", _pingPongBuffer[read]);
    _computeShader.SetTexture(kernelID, "_Result", _pingPongBuffer[write]);
    _computeShader.Dispatch(kernelID, ...);

    // Swap: the buffer just written becomes the current state
    _currentBuffer = write;
}
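The read/write role swap can be isolated from Unity entirely. The following is a minimal sketch, not the article's code: `PingPong<T>` is a hypothetical helper that only tracks which of two buffers is the read side, so the same logic could wrap RenderTextures or, as in the usage note, plain strings.

```csharp
// Hypothetical helper (an assumption for illustration): tracks which of two
// buffers currently holds the finished state and which one receives the next pass.
public sealed class PingPong<T>
{
    private readonly T[] _buffers;
    private int _readIndex; // index of the buffer holding the latest complete state

    public PingPong(T first, T second)
    {
        _buffers = new[] { first, second };
        _readIndex = 0;
    }

    // Buffer holding the most recent finished result (would be bound as _Source).
    public T Read => _buffers[_readIndex];

    // Buffer the next pass writes into (would be bound as _Result).
    public T Write => _buffers[1 - _readIndex];

    // Call once after each pass so the freshly written buffer becomes the read side.
    public void Swap() => _readIndex = 1 - _readIndex;
}
```

With `var pp = new PingPong<string>("A", "B");`, `pp.Read` starts as "A" and `pp.Write` as "B"; after `pp.Swap()` the roles are exchanged. Keeping the index private makes it harder to bind the same buffer as both source and result by mistake.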

Complete Fluid Simulation Compute Shader

This is the core fluid shader. It implements a 2D Eulerian fluid simulation (a simplified solver for the Navier-Stokes equations):

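// FluidSimulation.compute

// Kernel entry points. Each comment sits on its own line because Unity does not
// accept // comments on the same line as a #pragma kernel directive.
// Initialization
#pragma kernel InitFluid
// Velocity advection
#pragma kernel AdvectVelocity
// Velocity diffusion
#pragma kernel DiffuseVelocity
// Pressure solve
#pragma kernel PressureSolve
// Velocity projection (enforces incompressibility)
#pragma kernel ProjectVelocity
// Dye advection (for visualization)
#pragma kernel AdvectDye
// External force injection
#pragma kernel AddForce

// ---- Data layout ----
// One float4 per cell: .xy = velocity (vx, vy), .z = pressure, .w = dye concentration

RWTexture2D<float4> _Result;
Texture2D<float4> _Source;
Texture2D<float4> _PressureField; // separate pressure buffer for Jacobi iteration (not referenced by the kernels in this listing)

uint2 _Resolution;
float _DeltaTime;
float _Viscosity;    // kinematic viscosity
float _DyeDiffusion; // dye diffusion coefficient
float _VorticityStr; // vorticity confinement strength

// External force parameters (set from C# every step)
float2 _ForcePos;    // where the force is applied (pixel coordinates)
float2 _ForceDir;    // force direction and magnitude
float _ForceRadius;  // radius of influence
float2 _DyeSource;   // where dye is injected
float _DyeAmount;    // amount of dye injected

SamplerState sampler_linear_clamp; // bilinear filtering, clamped at the borders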

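// Bilinear sample at a (possibly fractional) pixel position; the sampler clamps at the borders
float4 SampleField(Texture2D<float4> field, float2 pos)
{
    float2 uv = (pos + 0.5) / float2(_Resolution);
    return field.SampleLevel(sampler_linear_clamp, uv, 0);
}

// Fetch the field value at an integer pixel coordinate
float4 GetField(Texture2D<float4> field, int2 pos)
{
    // Boundary handling: out-of-range coordinates are clamped to the nearest edge texel.
    // The no-slip condition (zero velocity on the border) is applied in ProjectVelocity.
    int2 clamped = clamp(pos, int2(0, 0), int2(_Resolution) - 1);
    return field[clamped];
}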

// ---- Initialization ----
[numthreads(8, 8, 1)]
void InitFluid(uint3 id : SV_DispatchThreadID)
{
    if (any(id.xy >= _Resolution)) return;
    // Initial state: tiny noise in the velocity (breaks symmetry), pressure 1, no dye
    float noise = frac(sin(dot(float2(id.xy), float2(12.9898, 78.233))) * 43758.5);
    _Result[id.xy] = float4(0, 0, 1.0, 0) + float4(noise, noise, 0, 0) * 0.0001;
}

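// ---- Semi-Lagrangian advection (velocity) ----
// Backward tracing: sample the previous state upstream of the current cell.
// Unconditionally stable, so it does not blow up even for a large dt.
[numthreads(8, 8, 1)]
void AdvectVelocity(uint3 id : SV_DispatchThreadID)
{
    if (any(id.xy >= _Resolution)) return;

    float2 pos = float2(id.xy);
    float4 curr = _Source[id.xy];

    // Trace back from the current position against the velocity
    float2 prevPos = pos - _DeltaTime * curr.xy;

    // Interpolated sample at the upstream position
    float4 advected = SampleField(_Source, prevPos);
    _Result[id.xy] = float4(advected.xy, curr.z, curr.w); // update velocity, keep pressure and dye
}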

// ---- Viscous diffusion of velocity (explicit finite differences) ----
// Explicit scheme: stable only while _DeltaTime * _Viscosity <= 0.25 on this unit grid.
[numthreads(8, 8, 1)]
void DiffuseVelocity(uint3 id : SV_DispatchThreadID)
{
    if (any(id.xy >= _Resolution)) return;

    int2 p = int2(id.xy);
    float4 c = _Source[p];
    float4 n = GetField(_Source, p + int2(0, 1));
    float4 s = GetField(_Source, p - int2(0, 1));
    float4 e = GetField(_Source, p + int2(1, 0));
    float4 w = GetField(_Source, p - int2(1, 0));

    // 5-point Laplacian (central differences)
    float4 laplacian = n + s + e + w - 4.0 * c;

    // Explicit viscous diffusion step
    float2 newVel = c.xy + _DeltaTime * _Viscosity * laplacian.xy;

    _Result[p] = float4(newVel, c.z, c.w);
}

// ---- Pressure Jacobi iteration (one iteration per dispatch; dispatch repeatedly to converge) ----
[numthreads(8, 8, 1)]
void PressureSolve(uint3 id : SV_DispatchThreadID)
{
    if (any(id.xy >= _Resolution)) return;

    int2 p = int2(id.xy);
    float4 c = _Source[p];
    float4 n = GetField(_Source, p + int2(0, 1));
    float4 s = GetField(_Source, p - int2(0, 1));
    float4 e = GetField(_Source, p + int2(1, 0));
    float4 w = GetField(_Source, p - int2(1, 0));

    // Velocity divergence (the incompressibility constraint drives this to zero)
    float divV = 0.5 * ((e.x - w.x) + (n.y - s.y));

    // Jacobi update: p_new = (p_n + p_s + p_e + p_w - divV) / 4
    float newPressure = (n.z + s.z + e.z + w.z - divV) * 0.25;

    _Result[p] = float4(c.xy, newPressure, c.w);
}

// ---- Velocity projection (subtract the pressure gradient so that div(v) = 0) ----
[numthreads(8, 8, 1)]
void ProjectVelocity(uint3 id : SV_DispatchThreadID)
{
    if (any(id.xy >= _Resolution)) return;

    int2 p = int2(id.xy);
    float4 c = _Source[p];
    float4 n = GetField(_Source, p + int2(0, 1));
    float4 s = GetField(_Source, p - int2(0, 1));
    float4 e = GetField(_Source, p + int2(1, 0));
    float4 w = GetField(_Source, p - int2(1, 0));

    // Pressure gradient
    float gradPx = 0.5 * (e.z - w.z);
    float gradPy = 0.5 * (n.z - s.z);

    // Subtract the pressure gradient from the velocity
    float2 projectedVel = c.xy - float2(gradPx, gradPy);

    // Vorticity confinement
    // Curl field: curl = dVy/dx - dVx/dy, evaluated at the cell and its four neighbors
    float curlC = 0.5 * ((e.y - w.y) - (n.x - s.x));
    float curlN = 0.5 * ((GetField(_Source, p + int2(1,1)).y - GetField(_Source, p + int2(-1,1)).y)
                       - (GetField(_Source, p + int2(0,2)).x - c.x));
    float curlS = 0.5 * ((GetField(_Source, p + int2(1,-1)).y - GetField(_Source, p + int2(-1,-1)).y)
                       - (c.x - GetField(_Source, p + int2(0,-2)).x));
    float curlE = 0.5 * ((GetField(_Source, p + int2(2,0)).y - c.y)
                       - (GetField(_Source, p + int2(1,1)).x - GetField(_Source, p + int2(1,-1)).x));
    float curlW = 0.5 * ((c.y - GetField(_Source, p + int2(-2,0)).y)
                       - (GetField(_Source, p + int2(-1,1)).x - GetField(_Source, p + int2(-1,-1)).x));

    // Direction of the confinement force (the gradient of |curl| points toward the strongest vorticity)
    float2 eta = normalize(float2(abs(curlE) - abs(curlW), abs(curlN) - abs(curlS)) + 1e-5);
    projectedVel += _DeltaTime * _VorticityStr * float2(eta.y, -eta.x) * curlC;

    // No-slip boundary (velocity is zero on the border)
    if (p.x == 0 || p.y == 0 || p.x == (int)_Resolution.x - 1 || p.y == (int)_Resolution.y - 1)
        projectedVel = float2(0, 0);

    _Result[p] = float4(projectedVel, c.z, c.w);
}

// ---- Dye advection (for visualization) ----
[numthreads(8, 8, 1)]
void AdvectDye(uint3 id : SV_DispatchThreadID)
{
    if (any(id.xy >= _Resolution)) return;

    float2 pos = float2(id.xy);
    float4 curr = _Source[id.xy];
    float2 prevPos = pos - _DeltaTime * curr.xy;
    float advDye = SampleField(_Source, prevPos).w;

    // Exponential dye decay, controlled by the dye diffusion coefficient
    advDye *= exp(-_DeltaTime * _DyeDiffusion);

    // Dye injection (automatic emitter), Gaussian falloff around _DyeSource
    float2 diff = pos - _DyeSource;
    float inject = exp(-dot(diff, diff) / (_ForceRadius * _ForceRadius)) * _DyeAmount;
    advDye += inject;

    _Result[id.xy] = float4(curr.xy, curr.z, saturate(advDye));
}

// ---- External force injection (mouse interaction or automatic emitter) ----
[numthreads(8, 8, 1)]
void AddForce(uint3 id : SV_DispatchThreadID)
{
    if (any(id.xy >= _Resolution)) return;

    float4 curr = _Source[id.xy];
    float2 pos = float2(id.xy);

    // Force field with Gaussian falloff
    float2 diff = pos - _ForcePos;
    float influence = exp(-dot(diff, diff) / (_ForceRadius * _ForceRadius));

    float2 newVel = curr.xy + influence * _ForceDir * _DeltaTime;

    _Result[id.xy] = float4(newVel, curr.z, curr.w);
}
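When debugging the advection kernels it can help to compare them against a slow CPU version on a tiny grid. The sketch below is an assumption-laden reference, not part of the article: `AdvectionReference`, its method names, and the row-major `float[]` layout are all hypothetical. It mirrors what `SampleField` plus the backward trace do for a single scalar channel, with clamp-to-edge addressing.

```csharp
using System;

// Hypothetical CPU reference for checking the GPU advection on tiny grids.
// Fields are row-major: field[y * width + x].
public static class AdvectionReference
{
    // Bilinear lookup at a fractional pixel position with clamp-to-edge addressing.
    public static float SampleBilinear(float[] field, int width, int height, float x, float y)
    {
        // Clamp the position so both interpolation neighbours stay inside the grid.
        x = Math.Clamp(x, 0f, width - 1);
        y = Math.Clamp(y, 0f, height - 1);

        int x0 = (int)MathF.Floor(x);
        int y0 = (int)MathF.Floor(y);
        int x1 = Math.Min(x0 + 1, width - 1);
        int y1 = Math.Min(y0 + 1, height - 1);
        float fx = x - x0;
        float fy = y - y0;

        float bottom = Lerp(field[y0 * width + x0], field[y0 * width + x1], fx);
        float top = Lerp(field[y1 * width + x0], field[y1 * width + x1], fx);
        return Lerp(bottom, top, fy);
    }

    // One semi-Lagrangian step for a scalar q carried by the velocity (vx, vy).
    public static float[] Advect(float[] q, float[] vx, float[] vy, int width, int height, float dt)
    {
        var result = new float[q.Length];
        for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
        {
            int i = y * width + x;
            // Trace backwards along the velocity and sample where the value came from.
            result[i] = SampleBilinear(q, width, height, x - dt * vx[i], y - dt * vy[i]);
        }
        return result;
    }

    private static float Lerp(float a, float b, float t) => a + (b - a) * t;
}
```

For example, advecting the 3×1 field {1, 2, 3} with a uniform velocity of one pixel per step to the right moves every value one cell along and repeats the edge value, giving {1, 1, 2}.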

C# Fluid Simulation Controller

using UnityEngine;
using UnityEngine.Rendering;
using System.Collections.Generic;

[RequireComponent(typeof(Renderer))]
public class FluidSimulator : MonoBehaviour
{
    [Header("Simulation Parameters")]
    [SerializeField] private int resolution = 512;
    [SerializeField] private float viscosity = 0.5f;
    [SerializeField] private float dyeDiffusion = 0.1f;
    [SerializeField] private float vorticityStrength = 0.035f;
    [SerializeField] private int pressureIterations = 20; // number of Jacobi iterations
    [SerializeField] private float timeStep = 0.15f;

    [Header("Compute Shader")]
    [SerializeField] private ComputeShader fluidCompute;

    [Header("Visualization")]
    [SerializeField] private Material visualizeMaterial;

    [Header("Interaction")]
    [SerializeField] private float forceRadius = 30f;
    [SerializeField] private float forceStrength = 3.0f;
    [SerializeField] private float dyeInjectAmount = 2.0f;
    // Double buffer
    private RenderTexture[] _pingPong = new RenderTexture[2];
    private int _current = 0;

    // Kernel indices
    private int _kernelInit;
    private int _kernelAdvectVel;
    private int _kernelDiffuseVel;
    private int _kernelPressure;
    private int _kernelProject;
    private int _kernelAdvectDye;
    private int _kernelAddForce;

    // Thread group counts
    private int _threadGroupsX, _threadGroupsY;

    // Automatic emitter (keeps the fluid moving without mouse input)
    private float _autoSourceTimer = 0;
    private Vector2 _autoSourcePos;
    private Vector2 _autoSourceVel;

    // Property IDs
    private static readonly int SourceID = Shader.PropertyToID("_Source");
    private static readonly int ResultID = Shader.PropertyToID("_Result");
    private static readonly int ResolutionID = Shader.PropertyToID("_Resolution");
    private static readonly int DeltaTimeID = Shader.PropertyToID("_DeltaTime");
    private static readonly int ViscosityID = Shader.PropertyToID("_Viscosity");
    private static readonly int DyeDiffID = Shader.PropertyToID("_DyeDiffusion");
    private static readonly int VortStrID = Shader.PropertyToID("_VorticityStr");
    private static readonly int ForcePosID = Shader.PropertyToID("_ForcePos");
    private static readonly int ForceDirID = Shader.PropertyToID("_ForceDir");
    private static readonly int ForceRadID = Shader.PropertyToID("_ForceRadius");
    private static readonly int DyeSourceID = Shader.PropertyToID("_DyeSource");
    private static readonly int DyeAmountID = Shader.PropertyToID("_DyeAmount");
    private static readonly int FluidTexID = Shader.PropertyToID("_FluidTex");

    void Start()
    {
        // Look up the kernel indices
        _kernelInit = fluidCompute.FindKernel("InitFluid");
        _kernelAdvectVel = fluidCompute.FindKernel("AdvectVelocity");
        _kernelDiffuseVel = fluidCompute.FindKernel("DiffuseVelocity");
        _kernelPressure = fluidCompute.FindKernel("PressureSolve");
        _kernelProject = fluidCompute.FindKernel("ProjectVelocity");
        _kernelAdvectDye = fluidCompute.FindKernel("AdvectDye");
        _kernelAddForce = fluidCompute.FindKernel("AddForce");

        // Thread group counts (each group is 8×8 threads)
        _threadGroupsX = Mathf.CeilToInt(resolution / 8.0f);
        _threadGroupsY = Mathf.CeilToInt(resolution / 8.0f);

        // Create the RenderTextures
        for (int i = 0; i < 2; i++)
        {
            _pingPong[i] = new RenderTexture(resolution, resolution, 0,
                RenderTextureFormat.ARGBFloat);
            _pingPong[i].enableRandomWrite = true; // must be set before Create()
            _pingPong[i].filterMode = FilterMode.Bilinear;
            _pingPong[i].Create();
        }

        // Initialize the fluid state
        SetCommonUniforms();
        fluidCompute.SetTexture(_kernelInit, ResultID, _pingPong[_current]);
        fluidCompute.Dispatch(_kernelInit, _threadGroupsX, _threadGroupsY, 1);

        // Hook up the visualization material
        if (visualizeMaterial != null)
            visualizeMaterial.SetTexture(FluidTexID, _pingPong[_current]);
    }

    void SetCommonUniforms()
    {
        fluidCompute.SetInts(ResolutionID, resolution, resolution);
        fluidCompute.SetFloat(DeltaTimeID, timeStep);
        fluidCompute.SetFloat(ViscosityID, viscosity);
        fluidCompute.SetFloat(DyeDiffID, dyeDiffusion);
        fluidCompute.SetFloat(VortStrID, vorticityStrength);
    }

    void FixedUpdate()
    {
        SetCommonUniforms();
        UpdateAutoSource();

        // ---- External force injection ----
        AddForceStep();

        // ---- Velocity advection ----
        Swap();
        Dispatch(_kernelAdvectVel);

        // ---- Velocity diffusion ----
        Swap();
        Dispatch(_kernelDiffuseVel);

        // ---- Pressure solve (several Jacobi iterations) ----
        for (int i = 0; i < pressureIterations; i++)
        {
            Swap();
            Dispatch(_kernelPressure);
        }

        // ---- Velocity projection ----
        Swap();
        Dispatch(_kernelProject);

        // ---- Dye advection ----
        Swap();
        fluidCompute.SetFloats(DyeSourceID, _autoSourcePos.x, _autoSourcePos.y);
        fluidCompute.SetFloat(DyeAmountID, dyeInjectAmount * Time.fixedDeltaTime);
        Dispatch(_kernelAdvectDye);

        // Point the visualization material at the buffer that now holds the latest state
        if (visualizeMaterial != null)
            visualizeMaterial.SetTexture(FluidTexID, _pingPong[_current]);
    }

    void AddForceStep()
    {
        Vector2 forcePos = _autoSourcePos;
        Vector2 forceDir = _autoSourceVel * forceStrength;

        // Mouse/touch interaction, if any
        if (Input.GetMouseButton(0))
        {
            Ray ray = Camera.main.ScreenPointToRay(Input.mousePosition);
            Plane plane = new Plane(transform.forward, transform.position);
            if (plane.Raycast(ray, out float enter))
            {
                Vector3 hitLocal = transform.InverseTransformPoint(ray.GetPoint(enter));
                // Map local coordinates (a Quad spans -0.5..0.5) to texture pixel coordinates
                forcePos = new Vector2(
                    (hitLocal.x + 0.5f) * resolution,
                    (hitLocal.y + 0.5f) * resolution
                );
                forceDir = Vector2.zero; // while the mouse is held, only inject dye and apply no velocity
                // The dye pass in FixedUpdate reads _autoSourcePos, so move the dye
                // source to the cursor for this step; otherwise the click has no effect.
                _autoSourcePos = forcePos;
            }
        }

        fluidCompute.SetTexture(_kernelAddForce, SourceID, _pingPong[_current]);
        fluidCompute.SetTexture(_kernelAddForce, ResultID, _pingPong[1 - _current]);
        fluidCompute.SetFloats(ForcePosID, forcePos.x, forcePos.y);
        fluidCompute.SetFloats(ForceDirID, forceDir.x, forceDir.y);
        fluidCompute.SetFloat(ForceRadID, forceRadius);
        fluidCompute.Dispatch(_kernelAddForce, _threadGroupsX, _threadGroupsY, 1);
        _current = 1 - _current;
    }

    void UpdateAutoSource()
    {
        // Automatic force source that wanders across the grid (keeps the fluid moving without mouse input)
        _autoSourceTimer += Time.fixedDeltaTime;
        float t = _autoSourceTimer;
        _autoSourcePos = new Vector2(
            resolution * (0.5f + 0.3f * Mathf.Sin(t * 0.7f)),
            resolution * (0.5f + 0.3f * Mathf.Cos(t * 0.9f))
        );
        _autoSourceVel = new Vector2(Mathf.Cos(t), Mathf.Sin(t * 1.3f));
    }

    void Swap()
    {
        // Make the other buffer the write target; Source and Result are bound in Dispatch
        _current = 1 - _current;
    }

    void Dispatch(int kernel)
    {
        int read = 1 - _current; // after Swap, _current is the write buffer and the other one is the read buffer
        fluidCompute.SetTexture(kernel, SourceID, _pingPong[read]);
        fluidCompute.SetTexture(kernel, ResultID, _pingPong[_current]);
        fluidCompute.Dispatch(kernel, _threadGroupsX, _threadGroupsY, 1);
    }

    void OnDestroy()
    {
        foreach (var rt in _pingPong)
            if (rt != null) rt.Release();
    }
}
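The mouse mapping in AddForceStep can be checked in isolation. The sketch below is a hypothetical helper, not the article's code: it assumes the simulation is displayed on Unity's built-in Quad, whose local x and y both span [-0.5, 0.5], and it additionally clamps the result to the grid, which the controller above does not do.

```csharp
using System;

// Hypothetical helper mirroring the local-to-pixel mapping in AddForceStep.
// Assumption: the mesh is a unit quad centered on its local origin.
public static class QuadMapping
{
    public static (float x, float y) LocalToPixel(float localX, float localY, int resolution)
    {
        // Shift [-0.5, 0.5] to [0, 1], scale to pixels, and keep the result on the grid.
        float px = Math.Clamp((localX + 0.5f) * resolution, 0f, resolution - 1);
        float py = Math.Clamp((localY + 0.5f) * resolution, 0f, resolution - 1);
        return (px, py);
    }
}
```

With a resolution of 512, the quad center (0, 0) maps to pixel (256, 256), the corner (-0.5, -0.5) maps to (0, 0), and anything beyond the far edge is clamped to 511.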

URP Visualization Shader

This shader renders the fluid data as colored dye:

Shader "Custom/URP/FluidVisualize"
{
    Properties
    {
        _FluidTex ("Fluid Data Texture", 2D) = "black" {}
        _VelocityColorA ("Velocity Color A", Color) = (0.8, 0.2, 0.5, 1)
        _VelocityColorB ("Velocity Color B", Color) = (0.2, 0.6, 1.0, 1)
        _DyeColor ("Dye Color", Color) = (1, 0.5, 0.1, 1)
        _Background ("Background Color", Color) = (0.02, 0.02, 0.04, 1)
        _VelScale ("Velocity Visualization Scale", Range(0.1, 5)) = 1.0
    }

    SubShader
    {
        Tags { "RenderType"="Opaque" "RenderPipeline"="UniversalPipeline" }

        Pass
        {
            Name "FluidVis"
            Tags { "LightMode"="UniversalForward" }

            HLSLPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "Packages/com.unity.render-pipelines.universal/ShaderLibrary/Core.hlsl"

            TEXTURE2D(_FluidTex); SAMPLER(sampler_FluidTex);

            CBUFFER_START(UnityPerMaterial)
                float4 _FluidTex_ST;
                float4 _VelocityColorA;
                float4 _VelocityColorB;
                float4 _DyeColor;
                float4 _Background;
                float _VelScale;
            CBUFFER_END

            struct Attributes { float4 positionOS : POSITION; float2 uv : TEXCOORD0; };
            struct Varyings { float4 positionHCS : SV_POSITION; float2 uv : TEXCOORD0; };

            Varyings vert(Attributes input)
            {
                Varyings output;
                output.positionHCS = TransformObjectToHClip(input.positionOS.xyz);
                output.uv = TRANSFORM_TEX(input.uv, _FluidTex);
                return output;
            }

            half4 frag(Varyings input) : SV_Target
            {
                float4 fluid = SAMPLE_TEXTURE2D(_FluidTex, sampler_FluidTex, input.uv);

                // Map the velocity direction to a color by blending between the two velocity colors.
                // Alternative RGB color wheel: 0.5 + 0.5 * cos(velAngle + float3(0, 2.094, 4.189))
                float velAngle = atan2(fluid.y, fluid.x); // -π to π
                float3 velColor = lerp(_VelocityColorA.rgb, _VelocityColorB.rgb,
                                       velAngle / (2.0 * 3.14159) + 0.5);

                // Velocity magnitude (computed here, not used in the final color below)
                float velMag = length(fluid.xy) * _VelScale;

                // Dye concentration
                float ink = smoothstep(0.0, 2.0, fluid.w);

                // Blend from the background to the velocity color, weighted by the dye
                float3 finalColor = lerp(_Background.rgb, velColor, ink);

                // Pressure highlight
                finalColor += float3(0.05, 0.05, 0.05) * saturate(fluid.z - 1.0);

                // Never go darker than the background (a fully black screen looks like a failure)
                finalColor = max(finalColor, _Background.rgb);

                return half4(finalColor, 1.0);
            }
            ENDHLSL
        }
    }
}

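AsyncGPUReadback: CPU-GPU Data Synchronization

A fluid simulation sometimes needs its data on the CPU, for example to detect that fluid has reached a certain location and trigger a game event. AsyncGPUReadback delivers the data through a callback a few frames later, instead of stalling the CPU and GPU with a synchronous read: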

using UnityEngine.Rendering;

// Read the fluid data asynchronously (no pipeline stall)
void ReadFluidData()
{
    AsyncGPUReadback.Request(_pingPong[_current], 0, TextureFormat.RGBAFloat,
        (AsyncGPUReadbackRequest request) =>
        {
            if (request.hasError)
            {
                Debug.LogError("AsyncGPUReadback failed");
                return;
            }

            // Process the data in the callback (invoked on the main thread)
            var data = request.GetData<Color>();

            // Example: check the dye concentration at one position
            int checkX = resolution / 2, checkY = resolution / 4;
            float inkAtPoint = data[checkY * resolution + checkX].a;

            if (inkAtPoint > 0.5f)
            {
                Debug.Log("Fluid reached the target position!");
                // Trigger the game event...
            }
        }
    );
}
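A single-pixel probe like the one above can miss fluid that passes next to the check point. As a sketch under stated assumptions, the hypothetical helper below scans a rectangle instead. It assumes the readback has already been copied into a managed `float[]` with four floats per pixel in row-major order (for instance via `request.GetData<float>().ToArray()`); the class and method names are illustrative only.

```csharp
using System;

// Hypothetical helper: queries readback data stored as 4 floats per pixel
// (x, y, z, w) in row-major order, matching the simulation's float4 layout.
public static class ReadbackQueries
{
    // Largest dye value (.w channel) inside the inclusive pixel rectangle.
    public static float MaxDyeInRect(float[] rgba, int width, int height,
                                     int xMin, int yMin, int xMax, int yMax)
    {
        if (rgba.Length != width * height * 4)
            throw new ArgumentException("expected 4 floats per pixel", nameof(rgba));

        // Clip the rectangle to the texture so callers may pass loose bounds.
        xMin = Math.Max(xMin, 0);
        yMin = Math.Max(yMin, 0);
        xMax = Math.Min(xMax, width - 1);
        yMax = Math.Min(yMax, height - 1);

        float max = 0f;
        for (int y = yMin; y <= yMax; y++)
        for (int x = xMin; x <= xMax; x++)
        {
            int pixel = y * width + x;                // same indexing as checkY * resolution + checkX
            max = Math.Max(max, rgba[pixel * 4 + 3]); // +3 selects the dye channel
        }
        return max;
    }
}
```

A trigger zone then becomes `ReadbackQueries.MaxDyeInRect(values, resolution, resolution, x0, y0, x1, y1) > 0.5f`, where `values`, `x0`, `y0`, `x1`, and `y1` are placeholders for the copied data and the zone bounds.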

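Performance Considerations

Resolution    GPU memory (per ARGBFloat buffer)    Time per frame (RTX 3060)    Use case
128×128       ~0.3MB                               ~0.1ms                       Mobile
256×256       ~1MB                                 ~0.3ms                       High-end mobile / PC
512×512       ~4MB                                 ~1ms                         PC, standard
1024×1024     ~16MB                                ~4ms                         PC, high quality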
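The memory column follows directly from width × height × bytes per pixel. A minimal sketch of that arithmetic, using a hypothetical `FluidMemory` helper (the byte sizes are the standard ones for these formats: four 32-bit channels, four 16-bit channels, and two 16-bit channels):

```csharp
// Hypothetical helper for estimating simulation buffer sizes.
public static class FluidMemory
{
    public const int ArgbFloatBytes = 16; // 4 channels × 4 bytes
    public const int ArgbHalfBytes = 8;   // 4 channels × 2 bytes
    public const int RgHalfBytes = 4;     // 2 channels × 2 bytes

    // Size of one buffer in bytes.
    public static long BufferBytes(int width, int height, int bytesPerPixel)
        => (long)width * height * bytesPerPixel;

    // Size of a ping-pong pair (two buffers of the same format).
    public static long PingPongBytes(int width, int height, int bytesPerPixel)
        => 2 * BufferBytes(width, height, bytesPerPixel);
}
```

For 512×512 in ARGBFloat this gives 4,194,304 bytes (4 MiB) per buffer and twice that for the ping-pong pair; the same grid in RGHalf takes a quarter of the space.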

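Notes for Mobile

  • Use RenderTextureFormat.RGHalf (two 16-bit channels) instead of ARGBFloat to save 75% of memory and bandwidth. RGHalf holds only two channels, so the four-channel layout used above (velocity, pressure, dye) would have to be split across separate textures.
  • Reduce pressureIterations to 5-10
  • Use a resolution of 64×64 or 128×128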

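Common Pitfalls

Pitfall 1: RenderTexture.enableRandomWrite must be set before Create()
If you call rt.Create() first and only then set rt.enableRandomWrite = true, random write access is not enabled on the RenderTexture, the Compute Shader's RWTexture2D does not work, and Unity reports a "Texture is not set" error.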

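Pitfall 2: Thread group size versus resolution
[numthreads(8, 8, 1)] means each thread group covers 8×8 = 64 pixels. In the Dispatch arguments (threadGroupsX, threadGroupsY, 1), threadGroupsX = ceil(width / 8). If the resolution is not a multiple of 8, the last groups extend past the texture, so the kernel must start with if (any(id.xy >= _Resolution)) return; to prevent out-of-bounds access.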
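The group count and the number of surplus threads can be worked out with integer arithmetic alone. The helper below is a hypothetical sketch (`DispatchMath` is not part of the article's code); it uses the usual round-up division instead of a floating-point ceil.

```csharp
using System;

// Hypothetical helper for sizing a Dispatch call along one axis.
public static class DispatchMath
{
    // Number of thread groups needed to cover `size` items: ceil(size / threadsPerGroup).
    public static int GroupCount(int size, int threadsPerGroup)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (threadsPerGroup <= 0) throw new ArgumentOutOfRangeException(nameof(threadsPerGroup));
        return (size + threadsPerGroup - 1) / threadsPerGroup;
    }

    // Threads launched past the edge; these are the ones the bounds check must reject.
    public static int OverhangThreads(int size, int threadsPerGroup)
        => GroupCount(size, threadsPerGroup) * threadsPerGroup - size;
}
```

A 512-pixel axis needs exactly 64 groups with no overhang, while a 500-pixel axis needs 63 groups and launches 4 threads beyond the texture edge on that axis.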

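Pitfall 3: ARGBFloat versus ARGBHalf compatibility
On iOS (Metal), a Compute Shader's RWTexture2D<float4> corresponds to ARGBFloat, while RenderTextures in ARGBHalf format do not support random writes on some devices. For cross-platform projects, ARGBFloat is the safer choice; alternatively, check support at runtime with SystemInfo.SupportsRenderTextureFormat.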

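Pitfall 4: Jacobi iteration count and stability
The more Jacobi iterations the pressure solve runs (pressureIterations), the closer the fluid gets to incompressible, at a higher performance cost. 20 iterations is usually a reasonable balance between quality and performance. If the fluid blows up (velocity values become NaN), first check that the velocity is clamped to a sane range; the kernels listed above do not clamp it on their own.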
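Besides unclamped velocities, another possible source of blow-up in this setup is the explicit diffusion step: a forward-Euler update with the 5-point Laplacian on a unit grid is only stable while viscosity × dt stays at or below 0.25. The sketch below is a hypothetical checker for that condition, not part of the article's code; with the controller's defaults (viscosity 0.5, timeStep 0.15) the product is 0.075, well inside the limit.

```csharp
using System;

// Hypothetical helper: stability limit of explicit 2D diffusion on a unit grid
// (forward Euler in time, 5-point Laplacian in space).
public static class DiffusionStability
{
    private const float Limit = 0.25f; // maximum allowed viscosity * dt

    public static bool IsStable(float viscosity, float dt) => viscosity * dt <= Limit;

    // Largest time step that keeps the explicit step stable.
    public static float MaxStableTimeStep(float viscosity)
        => viscosity > 0f ? Limit / viscosity : float.PositiveInfinity;

    // How many equal substeps the diffusion pass would need for a given dt.
    public static int RequiredSubsteps(float viscosity, float dt)
        => Math.Max(1, (int)Math.Ceiling(viscosity * dt / Limit));
}
```

Raising viscosity to 2 with the same time step gives a product of 0.3, which exceeds the limit; splitting the diffusion pass into two substeps (or lowering dt) would bring it back into the stable range.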


https://alex-rachel.github.io/2026/04/01/20-fluid-simulation/
Author: Alex
Published: April 1, 2026