Rework relooper algorithm for fun and profit#1558
Rework relooper algorithm for fun and profit#1558randomPoison wants to merge 134 commits intomasterfrom
Conversation
- Try to create a loop if there's a single entry. - Attempt to create a multiple if there are any handled entries/blocks. - Fallback to a loop only if we can't create a multiple. This brings us more in line with the original paper as well, and already fixes the disjoint loops case. Also make a bunch of not-strictly-necessary changes to make things more organized and easier to work with. These should be split out into a separate PR before the functional refactoring.
Because apparently I have a different compiler on my laptop and it's mad about the missing statements, even though on my computer at home it works lolsob
Also differentiate between breaks and continues when relooping.
... by not generating a break if there's only one entry following a simple shape.
And also make the json output pretty.
|
Oh joy, we have a subtle difference in behavior between platforms. Looks like the CFG labels that we generate on macos are slightly different than the labels we generate on Ubuntu, likely because the compiler we're using is different and is generating a slightly different CFG. |
|
Okay the tests should be clean and ready for review as well. |
kkysen
left a comment
There was a problem hiding this comment.
For the snapshots, could you have one commit where they're tested with the previous relooper, and then a new commit where they're updated so I can see the diffs? Or if this is already the case, tell me which commits to compare, because there's a lot of them so I'm not sure where to look.
.gitignore
Outdated
| # Relooper debug info | ||
| /dumps | ||
| c2rust-transpile/dumps |
There was a problem hiding this comment.
Should we add relooper in the name somewhere instead of just dumps?
There was a problem hiding this comment.
It's not necessarily relooper-specific, if there are other parts of the transpile or refactor process where we want to dump large quantities of data for debugging we can drop it in there.
There was a problem hiding this comment.
I did update the comment to make it non-relooper-specific though.
| @@ -0,0 +1,87 @@ | |||
| #[test] | |||
| pub fn test_goto_error_a() { | |||
| use crate::goto_error::rust_goto_error_a; | |||
There was a problem hiding this comment.
Can these just all be top-level imports?
There was a problem hiding this comment.
Same for all of the other ones.
There was a problem hiding this comment.
For these it's helpful to import the function names within each test function because that makes it a bit easier to comment out tests for debugging, i.e. if I want to comment out some of the test functions in the original C file, I also have to comment out the corresponding test function and import in the Rust test file. If the import is in the corresponding function I can just comment out the function, whereas if the imports are at the top of the file I have to comment out the test function and the import separately. It's not a big deal either way, but there's also not much of a reason to move the imports.
| use std::io::Write; | ||
| std::fs::create_dir_all("dumps").unwrap(); | ||
| let mut file = std::fs::File::create(&file_name).unwrap(); | ||
| write!(&mut file, "{:#?}", relooped).unwrap(); |
There was a problem hiding this comment.
| use std::io::Write; | |
| std::fs::create_dir_all("dumps").unwrap(); | |
| let mut file = std::fs::File::create(&file_name).unwrap(); | |
| write!(&mut file, "{:#?}", relooped).unwrap(); | |
| fs::create_dir_all("dumps").unwrap(); | |
| fs::write(&file_name, format!("{relooped:#?}")).unwrap(); |
File isn't buffered, so we don't want to use that directly. It should be simpler to just write all in one go.
There was a problem hiding this comment.
What's the issue here with File not being buffered? I assume that's some kind of minor performance issue? The docs say to use a buffered write when doing many small writes, but we're just doing one big write so it's not clear to me that it matters here.
@kkysen There's not a good way to do that without rewriting a lot of history. We didn't have snapshot tests for the CFG stuff before, so they're pretty much all new. |
Hmm. Can you run the snapshots on the previous transpiler in a commit somewhere so we can review the diff? It's a lot harder to review without that IMO. |
Yeah that's easy enough to do, at least. 218a281 now shows the diff between the old behavior and the new behavior in the snapshots |
Thanks, that's very helpful. |
This PR reworks the relooper algorithm to replace most usages of
current_blockwith labeled blocks/breaks. Addresses #330. Closes #1398.The bulk of the changes are in
relooper.rsandstructures.rs:relooper.rsthe core logic is in therelooperfunction, which takes the unstructured CFG and processes it into the "structured CFG".structures.rswe have the logic that takes the structured CFG and turns it into the "strucured AST", which is an intermediate AST that just represents the structured control flow we're translating.structured_cfg_helpwith a newprocess_cfgfunction that is responsible for generating the labeled blocks/breaks.cleanup_labelsthat removes redundant labels and extra labeled blocks. This allows the logic inprocess_cfgto be simpler by conservatively labeling all blocks and breaks.Don't try to review this commit-by-commit, the commit history is NOT clean and shows all of the experimentation and iteration I went through to get to this point. I only have all the commit history intact because I haven't had a need to squash yet, and we should squash this when merging to avoid polluting the commit history.
I'd also recommend not worrying too much about the diff here, and instead just review the new code as if it were all entirely fresh. Many of the parts here have been heavily reworked, and since I'm the only person deeply familiar with what the code was doing before it probably doesn't make sense to worry about how the new code differs from the old version.
NOTE: I still need to do a cleanup pass on the unit tests, since a lot of those were slapped together during experimentation. Snapshot tests are all clean, but I may want to add more as I cleanup the unit tests.218a281 shows the diff between the old behavior and the new behavior in the snapshots.