RenderFormer: Neural rendering of triangle meshes with global illumination by klavinski
Introduction
We present RenderFormer, a neural rendering pipeline that directly renders an image from a triangle-based representation of a scene with full global illumination effects and that does not require per-scene training or fine-tuning.
Mesh to Image, End to End
Instead of taking a physics-centric approach to rendering, we formulate rendering as a sequence-to-sequence transformation where a sequence of tokens representing triangles with reflectance properties is converted to a sequence of output tokens representing small patches of pixels.
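To make the formulation concrete, here is a minimal sketch of how triangles might be packed into input tokens. The feature layout, sizes, and names are illustrative assumptions, not the paper's actual code.

```python
# Illustrative sketch (assumed layout): one token per triangle, built from its
# vertex positions, vertex normals, and reflectance parameters.
import torch
import torch.nn as nn

class TriangleTokenizer(nn.Module):
    def __init__(self, n_reflectance=6, d_model=768):  # sizes are assumptions
        super().__init__()
        # 9 position values + 9 normal values + n_reflectance material values
        self.embed = nn.Linear(9 + 9 + n_reflectance, d_model)

    def forward(self, vertices, normals, reflectance):
        """vertices, normals: (T, 3, 3); reflectance: (T, n_reflectance)."""
        T = vertices.shape[0]
        feats = torch.cat([vertices.reshape(T, -1),
                           normals.reshape(T, -1),
                           reflectance], dim=-1)
        return self.embed(feats)  # (T, d_model): one token per triangle
```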
Simple Transformer Architecture with Minimal Prior Constraints
RenderFormer follows a two-stage pipeline: a view-independent stage that models triangle-to-triangle light transport, and a view-dependent stage that transforms a token representing a bundle of rays into the corresponding pixel values, guided by the triangle sequence from the view-independent stage. Both stages are based on the transformer architecture and are learned with minimal prior constraints. No rasterization, no ray tracing.
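A minimal PyTorch sketch of this two-stage structure; the layer counts, widths, and the 8x8 patch size are placeholders, not the published architecture.

```python
# Hedged sketch: stage 1 is self-attention over triangle tokens (view-
# independent light transport); stage 2 lets ray-bundle tokens cross-attend
# to the transported scene tokens and decode pixel patches.
import torch.nn as nn

class TwoStageRenderer(nn.Module):
    def __init__(self, d_model=768, n_heads=12, n_layers=6):
        super().__init__()
        enc = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.transport = nn.TransformerEncoder(enc, n_layers)   # stage 1
        dec = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.view = nn.TransformerDecoder(dec, n_layers)        # stage 2
        self.to_rgb = nn.Linear(d_model, 8 * 8 * 3)  # token -> 8x8 RGB patch

    def forward(self, tri_tokens, ray_tokens):
        scene = self.transport(tri_tokens)      # (B, T, d) view-independent
        patches = self.view(ray_tokens, scene)  # (B, P, d) view-dependent
        return self.to_rgb(patches)             # (B, P, 192) pixel patches
```

Because stage 1 never sees the camera, its output can in principle be computed once per scene and reused across viewpoints; only stage 2 runs per view.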
Rendering Gallery
Examples of scenes rendered with RenderFormer demonstrating various lighting conditions, materials, and geometric complexity, without any per-scene training or fine-tuning. Check out the reference images for more details.
Stanford Bunny in Cornell Box
Videos
Check out the extra video results, including uncompressed videos and some reference videos.
9 Comments
rossant
Wow. The loop is closed with GPUs then. Rendering to compute to rendering.
goatmanbah
What can't transformers do?
keyle
Raytracing, The Matrix edition. Feels like an odd roundabout we're in.
kookamamie
Looks OK, albeit blurry. Would have been nice to see a comparison of render times between the neural and classical renderers.
dclowd9901
Forgive my ignorance: are these scenes rendered based on how a scene is expected to be rendered? If so, why would we use this over more direct methods (since I assume this is not faster than direct methods)?
feverzsj
Kinda pointless, when classic algorithms can achieve much better results on much cheaper hardware.
timhigins
The coolest thing here might be the speed: for a given scene, RenderFormer takes 0.0760 seconds while Blender Cycles takes 3.97 seconds (or 12.05 seconds at a higher quality setting), while retaining a 0.9526 Structural Similarity Index Measure (a 0-1 scale where 1 is an identical image; a computation sketch follows after this comment). See Tables 1 and 2 in the paper.
This could possibly enable higher quality instant render previews for 3D designers in web or native apps using on-device transformer models.
Note the timings above were on an A100 with an unoptimized PyTorch version of the model. Obviously the average user's GPU is much less powerful, but for 3D designers it might still be powerful enough to see significant speedups over traditional rendering. Or a web-based system could even connect to A100s on the backend and stream the images to the browser.
Limitations are that it's not fully accurate, especially as scene complexity scales, e.g. with shadows of complex shapes (plus, I imagine, particles or strands), so final renders will probably still be done traditionally to avoid the nasty visual artifacts common in many AI-generated images and videos today. But who knows, it might be "good enough" and bring enough of a speed increase to justify use by big animation studios that need to render full movie-length previews for music, story review, etc.
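A minimal sketch of the SSIM computation referenced above, using scikit-image's `structural_similarity`; the file names are hypothetical placeholders.

```python
# Hedged sketch: compares a neural render against a path-traced reference.
# File names are placeholders; requires imageio and scikit-image.
import imageio.v3 as iio
from skimage.metrics import structural_similarity

neural = iio.imread("renderformer_output.png")   # hypothetical neural render
reference = iio.imread("cycles_reference.png")   # hypothetical Cycles reference

score = structural_similarity(neural, reference,
                              channel_axis=-1,   # treat last axis as RGB
                              data_range=255)    # 8-bit pixel range
print(f"SSIM: {score:.4f}")                      # 1.0 would mean identical images
```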
vessenes
This is a stellar and interesting idea: train a transformer to turn a scene description (a set of triangles) into a 2D array of pixels that happens to look like the pixels a global illumination renderer would output for the same scene.
That this works at all shouldn’t be shocking after the last five years of research, but I still find it pretty profound. That transformer architecture sure is versatile.
Anyway: crazy fast, close to Blender's rendering output, and what looks like a 1B-parameter model? Not sure if it's fp16 or fp32, but it's a 2 GB file; what's not to like? I'd like to see some more 'realistic' scenes demoed, but hey, I can download this and run it on my Mac to try it whenever I like.
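A quick back-of-envelope check of that precision guess (all sizes approximate):

```python
# ~1B parameters in a ~2 GB checkpoint implies ~2 bytes per parameter,
# which is consistent with fp16/bf16 weights (fp32 would need ~4 GB).
params = 1.0e9
checkpoint_bytes = 2 * 1024**3
print(checkpoint_bytes / params)  # ~2.15 bytes per parameter
```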
mixedbit
Deep learning is also very successfully used for denoising globally illuminated renders [1]. In this approach, a traditional ray-tracing algorithm quickly computes a rough global illumination solution for the scene, and a neural network then removes the noise from the output (sketched below).
[1] https://www.openimagedenoise.org
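The overall shape of that render-then-denoise pipeline, as a minimal runnable sketch. `trace_low_spp` and `denoise` are toy stand-ins (a synthetic noisy gradient and a box filter), not the OIDN API, which is driven through its C/C++ bindings in practice.

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(0)

def trace_low_spp(height, width, spp=4):
    """Toy stand-in for a fast, low-sample path tracer: a smooth gradient
    plus Monte Carlo-style noise that shrinks with the sample count."""
    clean = np.tile(np.linspace(0.0, 1.0, width)[None, :, None], (height, 1, 3))
    noisy = clean + rng.normal(scale=1.0 / np.sqrt(spp), size=clean.shape)
    return clean, noisy

def denoise(noisy):
    """Toy stand-in for the learned denoiser (OIDN's trained filter in
    practice): a 5x5 spatial mean, just to show the data flow."""
    return uniform_filter(noisy, size=(5, 5, 1))

clean, noisy = trace_low_spp(128, 128, spp=4)
result = denoise(noisy)
print(np.abs(noisy - clean).mean())   # large: raw low-spp estimate is noisy
print(np.abs(result - clean).mean())  # much smaller after denoising
```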