Alexandre Petit-Bianco, Cygnus Solutions
The program specifications and hardware constraints of embedded
systems result in some unique problems regarding garbage collection.
Different garbage collection algorithms are appropriate depending on
the embedded systems application, thus preventing garbage collection
from being a “silver bullet” problem.
The efficiency of garbage collection algorithms depends on numerous
factors. One factor is the hardware of the targeted platform: its CPU
type, the amount of memory available, and the availability of memory
management hardware support. Another factor is the type of runtime
system (compiled or interpreted) and the level of support it provides
for memory allocation (supervised or unsupervised). Finally, factors
depending on the application itself, notably its memory usage patterns
(memory cells connectivity, relative size, and life span of allocated
objects), are also relevant.
Once these factors have been taken into consideration, the suitable
garbage collection algorithms should be reviewed in order to take into
account their processing costs, the space overhead they impose (both
on the allocated data and on the code size) as well as the performance
degradation they incur as the heap fills up.
Definitions
Garbage collection relies on the observation that a previously
allocated memory location that is no longer referenced by any pointers
is no longer accessible to the application and is therefore
reclaimable. The role of a garbage collector is to identify and
recycle these inaccessible allocated memory chunks. There are two
families of garbage collecting techniques — reference counting and
tracing [1].
Reference counting keeps track of the number of references to a memory
cell. The cell is considered reclaimable whenever this number becomes
zero. The compiler or the runtime system is responsible for adjusting
the reference count every time a pointer is updated or copied. The
main advantage of reference counting is that the computational
overhead is distributed throughout the execution of the program, which
ensures a smooth response time. Reclaiming an isolated cell doesn’t
imply accessing other cells, and there are optimizations that reduce
the cost of deleting the last reference to an entire group of cells.
The major weakness of reference counting is its inability to reclaim
cyclic structures such as doubly-linked lists or trees featuring leaf
nodes with pointers back to their roots.
Reference counting can be optimized to reduce the number of times
reference count fields have to be updated and to reduce the size of
the reference count field. It can also be modified to detect pointers
that are part of a cycle or can even be combined with a tracing
algorithm, described below, in order to reclaim cyclic data.
Reachable (live) memory locations are those that are directly or
indirectly pointed to by a primordial set of pointers called the
roots. Tracing algorithms start from the roots and examine each
allocated memory cells’ connectivity.
There are two kinds of tracing algorithms: “mark and sweep” and
“copying.” Mark and sweep marks active memory cells (those that are
reachable through pointer reference, starting with the roots) whenever
the amount of free memory goes below a certain threshold, and then
sweeps through the allocated cells. When sweeping is complete,
unmarked cells are returned to the available memory pool. The
advantage of this approach is its ability to reclaim cyclic data
structures. But if implemented as described, it may not be acceptable
for an application with some real-time constraints for reasons I will
explain below.
The copying algorithm uses a heap divided into halves called
semi-spaces. One semi-space, called the current space, is used for
memory allocation. When garbage collection occurs, active memory
cells are traced from the current space and copied into the second
space. When done, the second space contains only live cells
(non-garbage) and becomes the new active space. The cells get
automatically compacted as a result of this process. Unreachable
cells (garbage) are not copied and are thus naturally reclaimed. The
copying algorithm ensures fast allocation of new cells and reduces
fragmentation. On the other hand, it halves the memory available to
the application and touches all the living memory cells during the
copy algorithm, which may lead to a considerable number of page faults
on a system using pagination.
Tracing garbage collectors can be either accurate/exact (able to
distinguish pointer from non-pointer data) or conservative (consider
every word encountered as a potential pointer). The type of tracing
garbage collector you use — exact or conservative — affects the way
pointers are identified within the location where roots can be found
(registers, stack, static areas) and the way the garbage collector
scans cells looking for pointers to other cells.
A conservative garbage collector needs to examine every memory word as
a potential pointer. This process isn’t guaranteed to succeed for all
instances of the considered words and may lead to misidentification
errors. This implies that memory cells that are actually garbage may
accidentally be retained. Furthermore, a fully conservative garbage
collector can’t venture moving memory blocks around, wrongly updating
memory references it thinks point to them. It implies that a fully
conservative garbage collector can’t be a fully compacting collector.
Conservative collectors can be altered to be mostly copying: objects
that are solely accessible from other heap-allocated object are
copied. They are deployed on systems where no cooperation from the
compiler or the runtime system is expected to help the collector tell
whether a word is a pointer or not.
Garbage Collection and Java
The Java runtime relies on garbage collection to identify and reclaim
objects no longer in use automatically. Garbage collection doesn’t
alter the behavior of an application written in Java, with the
exception of classes that declare a finalizer method.
Finalizers are executed right before objects found to be garbage are
discarded, giving the developer a chance to explicitly release
non-memory resources the class uses. Tricky finalizers may
resuscitate an object by installing a new reference to it, but
finalizers are executed only once. This guarantees that objects may
be brought back to life only once. Beginning with JDK 1.1, data
belonging to classes is also garbage collected, allowing unused
classes to be automatically discarded (this happens whenever their
class load gets recycled.) Note that system classes are never garbage
collected.
The initial set of roots of a Java runtime are the live thread
objects, the system classes (for instance java.lang.String),
the local variables and operands on the JVM stack, the references from
the JNI, the pending exceptions and the references from the native
stack. Stemming from these roots, all three types of Java entities
feature reachable components that will be examined by the collector.
Objects have instance fields and class that are reclaimable, arrays
have element objects and class that can be garbage collected and
classes feature a set of components that can be recycled: super class,
interfaces, linked classes, string literals, static fields, class
loader, signers, and protection domains.
Up to JDK 1.1.4, Sun uses a partially conservative compacting mark and
sweep algorithm, which makes a conservative assumption when the native
stack is scanned, but allows the Java heap and stack to be scanned in
a non-conservative way. The algorithm compacts the heap,
substantially reducing its fragmentation.
Sun’s implementation relies on an implementation detail to keep the
cost of objects’ relocation low. The reference to an object (by which
an object is know to exists to other objects and the VM) is implemented
as a non-moving handle pointing to the actual object’s data. Once the
actual object’s relocation performed, simply updating its handle
allows the object to be accessed at its new location.
The garbage collection code is executed in a separate thread when the
system runs out of memory or when System.gc is explicitly
invoked. The collector may also run asynchronously when it detects
that the system is idle, but this may require some support from the
native thread package. In any cases, execution of the garbage
collector halts all other application threads and scans the entire
heap.
Which Garbage Collection for Embedded Java?
In order to adapt Java garbage collection to embedded systems, you
must consider four issues: the nature of the runtime environment, the
hardware characteristics and limitations of the embedded platform, the
expected performances (acceptable general overhead and maximum pause
times), and the way in which critical memory situations need be
handled.
The runtime environment will mainly influence the way roots are
identified. An interpreted or semi-compiled (JIT) environment can
make references clearly identifiable so the Java heap and the Java
stack can be scanned non-conservatively. On the other hand, the
pre-compiled approach — like the one currently being developed at
Cygnus — needs to rely on conservative scanning, unless additional
information is made available to the runtime by a cooperative
compiler.
Memory, both RAM and ROM, is a scarce resource. Memory constraints
usually rule out all copying and generational algorithms and prevents
sophisticated solutions like adaptive collectors (that are dynamically
changing the way they garbage collect the memory to achieve greater
efficiency) or cooperative collectors (featuring several algorithms in
one collector, to get the best of several worlds) because of the
amount of ROM required for their implementation.
Additionally, reference counting requires the count field of an
al