fn compute_loop_exits(cfg: &CfgInfo) -> HashMap<BlockId, Option<BlockId>>
Expand description
See compute_loop_switch_exits
for
explanations about what “exits” are.
The following function computes the loop exits. It acts as follows.
We keep track of a stack of the loops in which we entered.
It is very easy to check when we enter a loop: loop entries are destinations
of backward edges, which can be spotted with a simple graph exploration (see
build_cfg_partial_info
.
The criteria to consider whether we exit a loop is the following:
- we exit a loop if we go to a block from which we can’t reach the loop entry at all
- or if we can reach the loop entry, but must use a backward edge which goes to an outer loop
It is better explained on the following example:
'outer while i < max {
'inner while j < max {
j += 1;
}
// (i)
i += 1;
}
If we enter the inner loop then go to (i) from the inner loop, we consider that we exited the outer loop because:
- we can reach the entry of the inner loop from (i) (by finishing then starting again an iteration of the outer loop)
- but doing this requires taking a backward edge which goes to the outer loop
Whenever we exit a loop, we save the block we went to as an exit candidate for this loop. Note that there may by many exit candidates. For instance, in the below example:
while ... {
...
if ... {
// We can't reach the loop entry from here: this is an exit
// candidate
return;
}
}
// This is another exit candidate - and this is the one we want to use
// as the "real" exit...
...
Also note that it may happen that we go several times to the same exit (if we use breaks for instance): we record the number of times an exit candidate is used.
Once we listed all the exit candidates, we find the “best” one for every loop, starting with the outer loops. We start with outer loops because inner loops might use breaks to exit to the exit of outer loops: if we start with the inner loops, the exit which is “natural” for the outer loop might end up being used for one of the inner loops…
The best exit is the following one:
- it is the one which is used the most times (note that there can be several candidates which are referenced strictly more than once: see the comment below)
- if several exits have the same number of occurrences, we choose the one for which we goto the “earliest” (earliest meaning that the goto is close to the loop entry node in the AST). The reason is that all the loops should have an outer if … then … else … which executes the loop body or goes to the exit (note that this is not necessarily the first if … then … else … we find: loop conditions can be arbitrary expressions, containing branchings).
§Several candidates for a loop exit:
===================================== There used to be a sanity check to ensure there are no two different candidates with exactly the same number of occurrences and distance from the entry of the loop, if the number of occurrences is > 1.
We removed it because it does happen, for instance here (the match
introduces an unreachable
node, and it has the same number of
occurrences and the same distance to the loop entry as the panic
node):
pub fn list_nth_mut_loop_pair<'a, T>(
mut ls: &'a mut List<T>,
mut i: u32,
) -> &'a mut T {
loop {
match ls {
List::Nil => {
panic!() // <-- best candidate
}
List::Cons(x, tl) => {
if i == 0 {
return x;
} else {
ls = tl;
i -= 1;
}
}
_ => {
// Note that Rustc always introduces an unreachable branch after
// desugaring matches.
unreachable!(), // <-- best candidate
}
}
}
}
When this happens we choose an exit candidate whose edges don’t necessarily lead to an error (above there are none, so we don’t choose any exits). Note that this last condition is important to prevent loops from being unnecessarily nested:
pub fn nested_loops_enum(step_out: usize, step_in: usize) -> usize {
let mut s = 0;
for _ in 0..128 { // We don't want this loop to be nested with the loops below
s += 1;
}
for _ in 0..(step_out) {
for _ in 0..(step_in) {
s += 1;
}
}
s
}