Finalize, Sweep and Rooting - Understanding Rust GC
Finalize Trait
Previously we explored Trace
in Trace - Understanding Rust GC - higepon blog. Let's look into Finalize
trait. This is really simple. Every GC
-ed object should implement this finalize.
pub trait Finalize { fn finalize(&self) {} }
Let's look at finalize implementation for array
and Option<T>
. They are empty. Hmm. I'm not quite sure why. Let me start from collect_garbage
and come back here later.
;; Array impl<T: Trace, const N: usize> Finalize for [T; N] {} ;; Option<T> impl<T: Trace> Finalize for Option<T> {}
collect_garbage
In collect_garbage
after the mark
returns unmarked objects (= not used objects) and it will call finalize_glue()
for each GC<T>
object.
let unmarked = mark(&mut st.boxes_start); if unmarked.is_empty() { return; } for node in &unmarked { Trace::finalize_glue(&(*node.this.as_ptr()).data); }
finalize_glue
calls finalize
in the Trait. So say an array is unmarked, the gc eventually calls empty trace
of the array. Now I remember that finalize
give the object a chance to any necessary work before it's deallocated. For example an object may want to close a file associated with it.
Sweep
Now we know that the gc collects unmarked objects and calls finalize for the unmarked objects. It's time to sweep
them.
unsafe fn sweep(finalized: Vec<Unmarked>, bytes_allocated: &mut usize) { let _guard = DropGuard::new(); for node in finalized.into_iter().rev() { if (*node.this.as_ptr()).header.marked.get() { continue; } let incoming = node.incoming; let mut node = Box::from_raw(node.this.as_ptr()); *bytes_allocated -= mem::size_of_val::<GcBox<_>>(&*node); *incoming = node.header.next.take(); } }
This is actually very interesting. It recreates Box from returning raw pointer using from_raw which makes sense!
Rooting
Let's get back to root
and unroot
. A Tour of Safe Tracing GC Designs in Rust - In Pursuit of Laziness has a good summary and examples of what is rooting and why we need it.
In one word: Rooting. In a garbage collector, the objects “directly” in use on the stack are the “roots”, and you need to be able to identify them.
struct Foo { bar: Option<Gc<Bar>> } // this is a root let bar = Gc::new(Bar::new()); // this is also a root let foo = Gc::new(Foo::new()); // bar should no longer be a root (but we can't detect that!) foo.bar = Some(bar); // but foo should still be a root here since it's not inside // another GC'd object let v = vec![foo];
To track root objects, the GC maintains root counts in GC_BOX
. In short GC_BOX
with root count > 0 is a root object. Rooting - Designing a GC in Rust explains it very well. Note that the count is incremented or decremented when GC<T>
object is moved.
Summary
- Finalize gives an object an opportunity to do some clean up work before it's free-ed.
- Sweep returns unmarked objects using
Box
. - To track objects directly use in stack the gc maintains root count.