Skip to content

Verify memory safety of String functions (Challenge 10)#541

Open
jrey8343 wants to merge 1 commit intomodel-checking:mainfrom
jrey8343:challenge-10-string
Open

Verify memory safety of String functions (Challenge 10)#541
jrey8343 wants to merge 1 commit intomodel-checking:mainfrom
jrey8343:challenge-10-string

Conversation

@jrey8343
Copy link

@jrey8343 jrey8343 commented Feb 7, 2026

Summary

Adds 15 Kani verification harnesses proving the memory safety of all functions listed in Challenge 10, covering all unsafe operations in library/alloc/src/string.rs.

Functions verified

Function Harness Unsafe operation
from_utf16le check_from_utf16le align_to::<u16>()
from_utf16le_lossy check_from_utf16le_lossy align_to::<u16>()
from_utf16be check_from_utf16be align_to::<u16>()
from_utf16be_lossy check_from_utf16be_lossy align_to::<u16>()
pop check_pop set_len()
remove check_str_remove ptr::copy + set_len
remove_matches check_remove_matches ptr::copy + set_len in pattern search loop
retain check_retain ptr::copy_nonoverlapping + set_len + Drop guard
insert check_insert ptr::copy + encode_utf8_raw_unchecked + set_len
insert_str check_insert_str ptr::copy + ptr::copy_nonoverlapping + set_len
split_off check_split_off from_utf8_unchecked + set_len
drain check_drain get_unchecked + raw pointer arithmetic
replace_range check_replace_range as_mut_vec() + Vec::splice
into_boxed_str check_into_boxed_str from_boxed_utf8_unchecked
leak check_leak from_utf8_unchecked_mut

Verification approach

  • Symbolic ASCII strings: Helper any_ascii_string<N>() generates symbolic String values of 0..N bytes where each byte is constrained to be valid ASCII (<=127), ensuring UTF-8 validity. Used for pop, insert, insert_str, split_off, drain, replace_range, into_boxed_str, and leak.
  • Concrete strings for char-iteration functions: remove, remove_matches, and retain use concrete strings (e.g., String::from("abcd")) because Chars iterator raw pointer internals cause CBMC memory explosion with symbolic strings.
  • UTF-16 functions: 8-byte symbolic byte arrays as input, exercising align_to::<u16>() alignment cast and decode_utf16 iteration.
  • All harnesses use #[kani::unwind(6)] for bounded loop unwinding.

All 15 harnesses verified locally.

Test plan

  • All 15 harnesses pass with kani verify-std using -Z loop-contracts --cbmc-args --object-bits 12
  • Code formatted with rustfmt

Add proof harnesses for all 15 public String functions that are safe
abstractions over unsafe code:
- UTF-16 decoding: from_utf16le, from_utf16le_lossy, from_utf16be, from_utf16be_lossy
- Element operations: pop, remove, remove_matches, retain
- Insertion: insert, insert_str
- Splitting/draining: split_off, drain, replace_range
- Conversion: into_boxed_str, leak

All 15 harnesses verified with Kani 0.65.0.
@jrey8343 jrey8343 requested a review from a team as a code owner February 7, 2026 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant