Expand description
ULLBC to LLBC
We reconstruct the control-flow in the Unstructured LLBC.
The reconstruction algorithm is not written to be efficient (its complexity is probably very bad), but it was not written to be: this is still an early stage and we want the algorithm to generate the best reconstruction as possible. We still need to test the algorithm on more interesting examples, and will consider making it more efficient once it is a bit mature and well tested. Also note that we more importantly focus on making the algorithm sound: the reconstructed program must always be equivalent to the original MIR program, and the fact that the reconstruction preserves this property must be obvious.
Finally, we conjecture the execution time shouldnβt be too much a problem: we donβt expect the translation mechanism to be applied on super huge functions (which will be difficult to formally analyze), and the MIR graphs are actually not that big because statements are grouped into code blocks (a block is made of a list of statements, followed by a terminator - branchings and jumps can only be performed by terminators -, meaning that MIR graphs donβt have that many nodes and edges).
Structs§
- Block
Info πSmall utility - Blocks
Info πInformation used to compute the switch exits. We compute this information for every block in the graph. Note that we make sure to use immutable sets because we rely a lot on cloning. - CfgInfo πThis structure contains various information about a functionβs CFG.
- Exit
Info πThe exits of a graph - Filtered
Loop πParents - OrdBlock
Id π
Enums§
- Goto
Kind π
Functions§
- block_
is_ πerror The terminator of the block is a panic, etc. - block_
is_ πswitch - build_
cfg_ πinfo Build the CFGs (the βregularβ CFG and the CFG without backward edges) and compute some information like the loop entries and the switch blocks. - Compute the loop exit candidates.
- compute_
loop_ πexits Seecompute_loop_switch_exits
for explanations about what βexitsβ are. - Compute the exits for the loops and the switches (switch on integer and if β¦ then β¦ else β¦). We need to do this because control-flow in MIR is destructured: we have gotos everywhere.
- compute_
switch_ πexits Seecompute_loop_switch_exits
for explanations about what βexitsβ are. - Compute BlocksInfo for every block in the graph. This information is then used to compute the switch exits.
- filter_
loop_ πparents - get_
goto_ πkind - is_
terminal πReturntrue
if whatever the path we take, evaluating the statement necessarily leads to: - is_
terminal_ πexplore - list_
reachable πList the nodes reachable from a starting point. We list the nodes and the depth (in the AST) at which they were found. - Check if a loop entry is reachable from a node, in a graph where we remove the backward edges going directly to the loop entry.
- make_
ord_ πblock_ id Create an OrdBlockId from a block id and a rank given by a map giving a sort (topological in our use cases) over the graph. - Register a node and its children as exit candidates for a list of parent loops.
- translate_
block πRemark: some values are boxed (here, the returned statement) so that they are allocated on the heap. This reduces stack usage (we had problems with stack overflows in the past). A more efficient solution would be to use loops to make this code constant space, but that would require a serious rewriting. - translate_
body π - translate_
body_ πaux parent_span
: we need some span data for the new statement. We use the one for the parent terminator.- Translate the functions by reconstructing the control-flow.
- translate_
statement π - translate_
terminator π
Type Aliases§
- Cfg πControl-Flow Graph