miri/concurrency/weak_memory.rs
//! Implementation of C++11-consistent weak memory emulation using store buffers
//! based on Dynamic Race Detection for C++ ("the paper"):
//! <https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2017/POPL.pdf>
//!
//! This implementation will never generate weak memory behaviours forbidden by the C++11 model,
//! but it is incapable of producing all possible weak behaviours allowed by the model. There are
//! certain weak behaviours observable on real hardware but not while using this.
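//!
//! As a rough illustration (a hand-written sketch, not an example taken from the paper), the
//! classic store-buffering litmus test is one weak behaviour this emulation does produce: with
//! all accesses `Relaxed`, both loads may observe the initial value 0.
//!
//! ```rust,ignore
//! use std::sync::atomic::{AtomicU32, Ordering::Relaxed};
//! use std::thread;
//!
//! static X: AtomicU32 = AtomicU32::new(0);
//! static Y: AtomicU32 = AtomicU32::new(0);
//!
//! let t1 = thread::spawn(|| { X.store(1, Relaxed); Y.load(Relaxed) });
//! let t2 = thread::spawn(|| { Y.store(1, Relaxed); X.load(Relaxed) });
//! let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
//! // Under this emulation (as on real hardware), `r1 == 0 && r2 == 0` is a possible outcome.
//! let _ = (r1, r2);
//! ```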
//!
//! Note that this implementation does not fully take into account C++20's memory model revision to SC accesses
//! and fences introduced by P0668 (<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0668r5.html>).
//! This implementation is not fully correct under the revised C++20 model and may generate behaviours C++20
//! disallows (<https://github.com/rust-lang/miri/issues/2301>).
//!
//! Modifications are made to the paper's model to address C++20 changes:
//! - If an SC load reads from an atomic store of any ordering, then a later SC load cannot read
//!   from an earlier store in the location's modification order. This is to prevent creating a
//!   backwards S edge from the second load to the first, as a result of C++20's coherence-ordered
//!   before rules. (This seems to rule out behaviors that were actually permitted by the RC11 model
//!   that C++20 intended to copy (<https://plv.mpi-sws.org/scfix/paper.pdf>); a change was
//!   introduced when translating the math to English. According to Viktor Vafeiadis, this
//!   difference is harmless. So we stick to what the standard says, and allow fewer behaviors.)
//! - SC fences are treated like AcqRel RMWs to a global clock, to ensure they induce enough
//!   synchronization with the surrounding accesses. This rules out legal behavior, but it is really
//!   hard to be more precise here.
//!
//! Rust follows the C++20 memory model (except for the Consume ordering and some operations not performable through C++'s
//! `std::atomic<T>` API). It is therefore possible for this implementation to generate behaviours never observable when the
//! same program is compiled and run natively. Unfortunately, no literature exists at the time of writing which proposes
//! an implementable and C++20-compatible relaxed memory model that supports all atomic operations existing in Rust. The closest one is
//! A Promising Semantics for Relaxed-Memory Concurrency by Jeehoon Kang et al. (<https://www.cs.tau.ac.il/~orilahav/papers/popl17.pdf>).
//! However, this model lacks SC accesses and is therefore unusable by Miri (SC accesses are everywhere in library code).
//!
//! If you find anything that proposes a relaxed memory model that is C++20-consistent, supports all orderings Rust's atomic accesses
//! and fences accept, and is implementable (with operational semantics), please open a GitHub issue!
//!
//! One characteristic of this implementation, in contrast to some other notable operational models such as the ones proposed in
//! Taming Release-Acquire Consistency by Ori Lahav et al. (<https://plv.mpi-sws.org/sra/paper.pdf>) or the Promising Semantics noted above,
//! is that this implementation does not require each thread to hold an isolated view of the entire memory. Here, store buffers are per-location
//! and shared across all threads. This is more memory efficient but does require store elements (representing writes to a location) to record
//! information about reads, whereas in the other two models it is the other way round: reads point to the write they got their value from.
//! Additionally, writes in our implementation do not have globally unique timestamps attached. In the other two models this timestamp is
//! used to make sure a value in a thread's view is not overwritten by a write that occurred earlier than the one in the existing view.
//! In our implementation, this is detected using the read information attached to store elements, as there is no data structure representing reads.
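//!
//! As a hand-written sketch (not an example from the paper) of how that read information is used,
//! consider a hypothetical `static X: AtomicU32` that starts at 0:
//!
//! ```rust,ignore
//! use std::sync::atomic::{AtomicU32, Ordering::Relaxed};
//! use std::thread;
//!
//! static X: AtomicU32 = AtomicU32::new(0);
//!
//! let writer = thread::spawn(|| {
//!     X.store(1, Relaxed);
//!     X.store(2, Relaxed);
//! });
//! let r1 = X.load(Relaxed);
//! let r2 = X.load(Relaxed);
//! writer.join().unwrap();
//! // Read-read coherence: if the first load saw 2, the second load cannot return the
//! // mo-earlier values 1 or 0. The store element for 2 records this thread's load timestamp,
//! // which is what lets `fetch_store` below exclude the older elements on the second load.
//! assert!(r1 != 2 || r2 == 2);
//! ```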
//!
//! The C++ memory model is built around the notion of an 'atomic object', so it would be natural
//! to attach store buffers to atomic objects. However, Rust follows LLVM in that it only has
//! 'atomic accesses'. Therefore Miri cannot know when and where atomic 'objects' are being
//! created or destroyed, to manage its store buffers. Instead, we lazily create an
//! atomic object on the first atomic write to a given region, and we destroy that object
//! on the next non-atomic or imperfectly overlapping atomic write to that region.
//! These lazy (de)allocations happen in memory_accessed() on non-atomic accesses, and
//! in get_or_create_store_buffer_mut() on atomic writes.
//!
//! One consequence of this difference is that safe/sound Rust allows for more operations on atomic locations
//! than the C++20 atomic API was intended to allow, such as non-atomically accessing
//! a previously atomically accessed location, or accessing previously atomically accessed locations with a differently sized operation
//! (such as accessing the top 16 bits of an AtomicU32). These scenarios are generally undiscussed in formalizations of the C++ memory model.
//! In Rust, these operations can only be done through a `&mut AtomicFoo` reference or one derived from it, and therefore these operations
//! can only happen after all previous accesses on the same locations. This implementation is adapted to allow these operations.
//! A mixed-atomicity read that races with writes, or a write that races with reads or writes, will still cause UB to be thrown.
//! Mixed-size atomic accesses must not race with any other atomic access, whether read or write, or UB will be thrown.
//! You can refer to the test cases in weak_memory/extra_cpp.rs and weak_memory/extra_cpp_unsafe.rs for examples of these operations.
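//!
//! A minimal sketch of the kind of operation meant here (the extra_cpp tests referenced above are
//! the authoritative examples):
//!
//! ```rust,ignore
//! use std::sync::atomic::{AtomicU32, Ordering::Relaxed};
//!
//! let mut a = AtomicU32::new(0);
//! a.store(1, Relaxed); // the first atomic write lazily creates a store buffer for this location
//! *a.get_mut() = 2;    // a non-atomic write through `&mut` is fine in Rust; it also deletes the
//!                      // store buffer again, so no weak behaviour remains for this location
//! assert_eq!(a.load(Relaxed), 2);
//! ```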

// Our and the author's own implementation (tsan11) of the paper have some deviations from the provided operational semantics in §5.3:
// 1. In the operational semantics, loads acquire the vector clock of the atomic location
// irrespective of which store buffer element is loaded. That's incorrect; the synchronization clock
// needs to be tracked per store buffer element. (The paper has a field "clocks" for that purpose,
// but it is not actually used.) tsan11 does this correctly
// (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.cc#L305).
//
// 2. In the operational semantics, each store element keeps the timestamp of a thread when it loads from the store.
// If the same thread loads from the same store element multiple times, then the timestamps at all loads are saved in a list of load elements.
// This is not necessary as later loads by the same thread will always have greater timestamp values, so we only need to record the timestamp of the first
// load by each thread. This optimisation is done in tsan11
// (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.h#L35-L37)
// and here.
//
// 3. §4.5 of the paper wants an SC store to mark all existing stores in the buffer that happen before it
// as SC. This is not done in the operational semantics, but it is implemented correctly in tsan11
// (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.cc#L160-L167)
// and here.
//
// 4. The W_SC ; R_SC case requires the SC load to ignore all but the last store marked SC (stores not marked SC are not
// affected). However, in the paper's ReadsFromSet this rule is applied to all loads (last two lines of code), not just SC loads.
// This is implemented correctly in tsan11
// (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.cc#L295)
// and here.

use std::cell::{Ref, RefCell};
use std::collections::VecDeque;

use rustc_data_structures::fx::FxHashMap;

use super::AllocDataRaceHandler;
use super::data_race::{GlobalState as DataRaceState, ThreadClockSet};
use super::vector_clock::{VClock, VTimestamp, VectorIdx};
use crate::concurrency::GlobalDataRaceHandler;
use crate::data_structures::range_object_map::{AccessType, RangeObjectMap};
use crate::*;

pub type AllocState = StoreBufferAlloc;

// Each store buffer must be bounded otherwise it will grow indefinitely.
// However, bounding the store buffer means restricting the amount of weak
// behaviours observable. The author picked 128 as a good tradeoff
// so we follow them here.
const STORE_BUFFER_LIMIT: usize = 128;

#[derive(Debug, Clone)]
pub struct StoreBufferAlloc {
    /// Store buffer of each atomic object in this allocation
    // Behind a RefCell because we need to allocate/remove on read access
    store_buffers: RefCell<RangeObjectMap<StoreBuffer>>,
}

impl VisitProvenance for StoreBufferAlloc {
    fn visit_provenance(&self, visit: &mut VisitWith<'_>) {
        let Self { store_buffers } = self;
        for val in store_buffers
            .borrow()
            .iter()
            .flat_map(|buf| buf.buffer.iter().map(|element| &element.val))
        {
            val.visit_provenance(visit);
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub(super) struct StoreBuffer {
    // Stores to this location in modification order
    buffer: VecDeque<StoreElement>,
}

/// Whether a load returned the latest value or not.
#[derive(PartialEq, Eq)]
enum LoadRecency {
    Latest,
    Outdated,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct StoreElement {
    /// The thread that performed the store.
    store_thread: VectorIdx,
    /// The timestamp of the storing thread when it performed the store.
    store_timestamp: VTimestamp,

    /// The vector clock that can be acquired by loading this store.
    sync_clock: VClock,

    /// Whether this store is SC.
    is_seqcst: bool,

    /// The value of this store. `None` means uninitialized.
    // FIXME: Currently, we cannot represent partial initialization.
    val: Option<Scalar>,

    /// Metadata about loads from this store element,
    /// behind a RefCell so that the load operations can take `&self`.
    load_info: RefCell<LoadInfo>,
}

#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct LoadInfo {
    /// Timestamp of the first load from this store element by each thread.
    timestamps: FxHashMap<VectorIdx, VTimestamp>,
    /// Whether this store element has been read by an SC load.
    /// This is crucial to ensure we respect coherence-ordered-before. Concretely we use
    /// this to ensure that if a store element is seen by an SC load, then all later SC loads
    /// cannot see `mo`-earlier store elements.
    sc_loaded: bool,
}

impl StoreBufferAlloc {
    pub fn new_allocation() -> Self {
        Self { store_buffers: RefCell::new(RangeObjectMap::new()) }
    }

    /// When a non-atomic write happens on a location that has been atomically accessed
    /// before without a data race, we can determine that the non-atomic write fully happens
    /// after all the prior atomic writes, so the location no longer needs to exhibit
    /// any weak memory behaviours until further atomic writes.
    pub fn non_atomic_write(&self, range: AllocRange, global: &DataRaceState) {
        if !global.ongoing_action_data_race_free() {
            let mut buffers = self.store_buffers.borrow_mut();
            let access_type = buffers.access_type(range);
            match access_type {
                AccessType::PerfectlyOverlapping(pos) => {
                    buffers.remove_from_pos(pos);
                }
                AccessType::ImperfectlyOverlapping(pos_range) => {
                    // We rely on the data-race check making sure this is synchronized.
                    // Therefore we can forget about the old data here.
                    buffers.remove_pos_range(pos_range);
                }
                AccessType::Empty(_) => {
                    // The range had no weak behaviours attached, do nothing
                }
            }
        }
    }

    /// Gets a store buffer associated with an atomic object in this allocation.
    /// Returns `None` if there is no store buffer.
    fn get_store_buffer<'tcx>(
        &self,
        range: AllocRange,
    ) -> InterpResult<'tcx, Option<Ref<'_, StoreBuffer>>> {
        let access_type = self.store_buffers.borrow().access_type(range);
        let pos = match access_type {
            AccessType::PerfectlyOverlapping(pos) => pos,
            // If there is nothing here yet, that means there wasn't an atomic write yet so
            // we can't return anything outdated.
            _ => return interp_ok(None),
        };
        let store_buffer = Ref::map(self.store_buffers.borrow(), |buffer| &buffer[pos]);
        interp_ok(Some(store_buffer))
    }

    /// Gets a mutable store buffer associated with an atomic object in this allocation,
    /// or creates one with the specified initial value if no atomic object exists yet.
    fn get_or_create_store_buffer_mut<'tcx>(
        &mut self,
        range: AllocRange,
        init: Result<Option<Scalar>, ()>,
    ) -> InterpResult<'tcx, &mut StoreBuffer> {
        let buffers = self.store_buffers.get_mut();
        let access_type = buffers.access_type(range);
        let pos = match access_type {
            AccessType::PerfectlyOverlapping(pos) => pos,
            AccessType::Empty(pos) => {
                let init =
                    init.expect("cannot have empty store buffer when previous write was atomic");
                buffers.insert_at_pos(pos, range, StoreBuffer::new(init));
                pos
            }
            AccessType::ImperfectlyOverlapping(pos_range) => {
                // Once we reach here we would've already checked that this access is not racy.
                let init = init.expect(
                    "cannot have partially overlapping store buffer when previous write was atomic",
                );
                buffers.remove_pos_range(pos_range.clone());
                buffers.insert_at_pos(pos_range.start, range, StoreBuffer::new(init));
                pos_range.start
            }
        };
        interp_ok(&mut buffers[pos])
    }
}

impl<'tcx> StoreBuffer {
    fn new(init: Option<Scalar>) -> Self {
        let mut buffer = VecDeque::new();
        let store_elem = StoreElement {
            // The thread index and timestamp of the initialisation write
            // are never meaningfully used, so it's fine to leave them as 0
            store_thread: VectorIdx::from(0),
            store_timestamp: VTimestamp::ZERO,
            // The initialization write is non-atomic so nothing can be acquired.
            sync_clock: VClock::default(),
            val: init,
            is_seqcst: false,
            load_info: RefCell::new(LoadInfo::default()),
        };
        buffer.push_back(store_elem);
        Self { buffer }
    }

    /// Reads from the last store in modification order, if any.
    fn read_from_last_store(
        &self,
        global: &DataRaceState,
        thread_mgr: &ThreadManager<'_>,
        is_seqcst: bool,
    ) {
        let store_elem = self.buffer.back();
        if let Some(store_elem) = store_elem {
            let (index, clocks) = global.active_thread_state(thread_mgr);
            store_elem.load_impl(index, &clocks, is_seqcst);
        }
    }

    fn buffered_read(
        &self,
        global: &DataRaceState,
        thread_mgr: &ThreadManager<'_>,
        is_seqcst: bool,
        rng: &mut (impl rand::Rng + ?Sized),
        validate: impl FnOnce(Option<&VClock>) -> InterpResult<'tcx>,
    ) -> InterpResult<'tcx, (Option<Scalar>, LoadRecency)> {
        // Having a live borrow to store_buffer while calling validate_atomic_load is fine
        // because the race detector doesn't touch store_buffer

        let (store_elem, recency) = {
            // The `clocks` we got here must be dropped before calling validate_atomic_load
            // as the race detector will update it
            let (.., clocks) = global.active_thread_state(thread_mgr);
            // Load from a valid entry in the store buffer
            self.fetch_store(is_seqcst, &clocks, &mut *rng)
        };

        // Unlike in buffered_atomic_write, thread clock updates have to be done
        // after we've picked a store element from the store buffer, as presented
        // in the ATOMIC LOAD rule of the paper. This is because fetch_store
        // requires access to ThreadClockSet.clock, which is updated by the race detector
        validate(Some(&store_elem.sync_clock))?;

        let (index, clocks) = global.active_thread_state(thread_mgr);
        let loaded = store_elem.load_impl(index, &clocks, is_seqcst);
        interp_ok((loaded, recency))
    }

    fn buffered_write(
        &mut self,
        val: Scalar,
        global: &DataRaceState,
        thread_mgr: &ThreadManager<'_>,
        is_seqcst: bool,
        sync_clock: VClock,
    ) -> InterpResult<'tcx> {
        let (index, clocks) = global.active_thread_state(thread_mgr);

        self.store_impl(val, index, &clocks.clock, is_seqcst, sync_clock);
        interp_ok(())
    }

    /// Selects a valid store element in the buffer.
    fn fetch_store<R: rand::Rng + ?Sized>(
        &self,
        is_seqcst: bool,
        clocks: &ThreadClockSet,
        rng: &mut R,
    ) -> (&StoreElement, LoadRecency) {
        use rand::seq::IteratorRandom;
        let mut found_sc = false;
        // FIXME: we want an inclusive take_while (stops after a false predicate, but
        // includes the element that gave the false), but such a function doesn't yet
        // exist in the standard library https://github.com/rust-lang/rust/issues/62208
        // so we have to hack around it with keep_searching
        let mut keep_searching = true;
        let candidates = self
            .buffer
            .iter()
            .rev()
            .take_while(move |&store_elem| {
                if !keep_searching {
                    return false;
                }

                keep_searching = if store_elem.store_timestamp
                    <= clocks.clock[store_elem.store_thread]
                {
                    // CoWR: if a store happens-before the current load,
                    // then we can't read-from anything earlier in modification order.
                    // C++20 §6.9.2.2 [intro.races] paragraph 18
                    false
                } else if store_elem.load_info.borrow().timestamps.iter().any(
                    |(&load_index, &load_timestamp)| load_timestamp <= clocks.clock[load_index],
                ) {
                    // CoRR: if there was a load from this store which happened-before the current load,
                    // then we cannot read-from anything earlier in modification order.
                    // C++20 §6.9.2.2 [intro.races] paragraph 16
                    false
                } else if store_elem.store_timestamp <= clocks.write_seqcst[store_elem.store_thread]
                    && store_elem.is_seqcst
                {
                    // The current non-SC load, which may be sequenced-after an SC fence,
                    // cannot read-before the last SC store executed before the fence.
                    // C++17 §32.4 [atomics.order] paragraph 4
                    false
                } else if is_seqcst
                    && store_elem.store_timestamp <= clocks.read_seqcst[store_elem.store_thread]
                {
                    // The current SC load cannot read-before the last store sequenced-before
                    // the last SC fence.
                    // C++17 §32.4 [atomics.order] paragraph 5
                    false
                } else if is_seqcst && store_elem.load_info.borrow().sc_loaded {
                    // The current SC load cannot read-before a store that an earlier SC load has observed.
                    // See https://github.com/rust-lang/miri/issues/2301#issuecomment-1222720427.
                    // Consequences of C++20 §31.4 [atomics.order] paragraphs 3.1, 3.3 (coherence-ordered before)
                    // and 4.1 (coherence-ordered before between SC makes the global total order S).
                    false
                } else {
                    true
                };

                true
            })
            .filter(|&store_elem| {
                if is_seqcst && store_elem.is_seqcst {
                    // An SC load needs to ignore all but the last store marked SC (stores not marked SC are not
                    // affected)
                    let include = !found_sc;
                    found_sc = true;
                    include
                } else {
                    true
                }
            });

        let chosen = candidates.choose(rng).expect("store buffer cannot be empty");
        if std::ptr::eq(chosen, self.buffer.back().expect("store buffer cannot be empty")) {
            (chosen, LoadRecency::Latest)
        } else {
            (chosen, LoadRecency::Outdated)
        }
    }

    /// ATOMIC STORE IMPL in the paper
    fn store_impl(
        &mut self,
        val: Scalar,
        index: VectorIdx,
        thread_clock: &VClock,
        is_seqcst: bool,
        sync_clock: VClock,
    ) {
        let store_elem = StoreElement {
            store_thread: index,
            store_timestamp: thread_clock[index],
            sync_clock,
            // In the language provided in the paper, an atomic store takes the value from a
            // non-atomic memory location.
            // But we already have the immediate value here so we don't need to do the memory
            // access.
            val: Some(val),
            is_seqcst,
            load_info: RefCell::new(LoadInfo::default()),
        };
        if self.buffer.len() >= STORE_BUFFER_LIMIT {
            self.buffer.pop_front();
        }
        self.buffer.push_back(store_elem);
        if is_seqcst {
            // Every store that happens before this needs to be marked as SC
            // so that in a later SC load, only the last SC store (i.e. this one) or stores that
            // aren't ordered by happens-before with the last SC store can be picked.
            self.buffer.iter_mut().rev().for_each(|elem| {
                if elem.store_timestamp <= thread_clock[elem.store_thread] {
                    elem.is_seqcst = true;
                }
            })
        }
    }
}

impl StoreElement {
    /// ATOMIC LOAD IMPL in the paper
    /// Unlike the operational semantics in the paper, we don't need to keep track
    /// of the thread timestamp for every single load. Keeping track of the first (smallest)
    /// timestamp of each thread that has loaded from a store is sufficient: if the earliest
    /// load of another thread happens before the current one, then we must stop searching the store
    /// buffer regardless of subsequent loads by the same thread; if the earliest load of another
    /// thread doesn't happen before the current one, then no subsequent load by the other thread
    /// can happen before the current one.
    fn load_impl(
        &self,
        index: VectorIdx,
        clocks: &ThreadClockSet,
        is_seqcst: bool,
    ) -> Option<Scalar> {
        let mut load_info = self.load_info.borrow_mut();
        load_info.sc_loaded |= is_seqcst;
        let _ = load_info.timestamps.try_insert(index, clocks.clock[index]);
        self.val
    }
}

impl<'tcx> EvalContextExt<'tcx> for crate::MiriInterpCx<'tcx> {}
pub(super) trait EvalContextExt<'tcx>: crate::MiriInterpCxExt<'tcx> {
    fn buffered_atomic_rmw(
        &mut self,
        new_val: Scalar,
        place: &MPlaceTy<'tcx>,
        atomic: AtomicRwOrd,
        init: Scalar,
    ) -> InterpResult<'tcx> {
        let this = self.eval_context_mut();
        let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr(), 0)?;
        if let (
            crate::AllocExtra {
                data_race: AllocDataRaceHandler::Vclocks(data_race_clocks, Some(alloc_buffers)),
                ..
            },
            crate::MiriMachine {
                data_race: GlobalDataRaceHandler::Vclocks(global), threads, ..
            },
        ) = this.get_alloc_extra_mut(alloc_id)?
        {
            if atomic == AtomicRwOrd::SeqCst {
                global.sc_read(threads);
                global.sc_write(threads);
            }
            let range = alloc_range(base_offset, place.layout.size);
            let sync_clock = data_race_clocks.sync_clock(range);
            let buffer = alloc_buffers.get_or_create_store_buffer_mut(range, Ok(Some(init)))?;
            // The RMW always reads from the most recent store.
            buffer.read_from_last_store(global, threads, atomic == AtomicRwOrd::SeqCst);
            buffer.buffered_write(
                new_val,
                global,
                threads,
                atomic == AtomicRwOrd::SeqCst,
                sync_clock,
            )?;
        }
        interp_ok(())
    }

    /// The argument to `validate` is the synchronization clock of the memory that is being read,
    /// if we are reading from a store buffer element.
    fn buffered_atomic_read(
        &self,
        place: &MPlaceTy<'tcx>,
        atomic: AtomicReadOrd,
        latest_in_mo: Scalar,
        validate: impl FnOnce(Option<&VClock>) -> InterpResult<'tcx>,
    ) -> InterpResult<'tcx, Option<Scalar>> {
        let this = self.eval_context_ref();
        'fallback: {
            if let Some(global) = this.machine.data_race.as_vclocks_ref() {
                let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr(), 0)?;
                if let Some(alloc_buffers) =
                    this.get_alloc_extra(alloc_id)?.data_race.as_weak_memory_ref()
                {
                    if atomic == AtomicReadOrd::SeqCst {
                        global.sc_read(&this.machine.threads);
                    }
                    let mut rng = this.machine.rng.borrow_mut();
                    let Some(buffer) = alloc_buffers
                        .get_store_buffer(alloc_range(base_offset, place.layout.size))?
                    else {
                        // No old writes available, fall back to base case.
                        break 'fallback;
                    };
                    let (loaded, recency) = buffer.buffered_read(
                        global,
                        &this.machine.threads,
                        atomic == AtomicReadOrd::SeqCst,
                        &mut *rng,
                        validate,
                    )?;
                    if global.track_outdated_loads && recency == LoadRecency::Outdated {
                        this.emit_diagnostic(NonHaltingDiagnostic::WeakMemoryOutdatedLoad {
                            ptr: place.ptr(),
                        });
                    }

                    return interp_ok(loaded);
                }
            }
        }

        // Race detector or weak memory disabled, simply read the latest value
        validate(None)?;
        interp_ok(Some(latest_in_mo))
    }

    /// Add the given write to the store buffer. (Does not change machine memory.)
    ///
    /// `init` says with which value to initialize the store buffer in case there wasn't a store
    /// buffer for this memory range before. `Err(())` means the value is not available;
    /// `Ok(None)` means the memory does not contain a valid scalar.
    ///
    /// Must be called *after* `validate_atomic_store` to ensure that `sync_clock` is up-to-date.
    fn buffered_atomic_write(
        &mut self,
        val: Scalar,
        dest: &MPlaceTy<'tcx>,
        atomic: AtomicWriteOrd,
        init: Result<Option<Scalar>, ()>,
    ) -> InterpResult<'tcx> {
        let this = self.eval_context_mut();
        let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(dest.ptr(), 0)?;
        if let (
            crate::AllocExtra {
                data_race: AllocDataRaceHandler::Vclocks(data_race_clocks, Some(alloc_buffers)),
                ..
            },
            crate::MiriMachine {
                data_race: GlobalDataRaceHandler::Vclocks(global), threads, ..
            },
        ) = this.get_alloc_extra_mut(alloc_id)?
        {
            if atomic == AtomicWriteOrd::SeqCst {
                global.sc_write(threads);
            }

            let range = alloc_range(base_offset, dest.layout.size);
            // It's a bit annoying that we have to go back to the data race part to get the clock...
            // but it does make things a lot simpler.
            let sync_clock = data_race_clocks.sync_clock(range);
            let buffer = alloc_buffers.get_or_create_store_buffer_mut(range, init)?;
            buffer.buffered_write(
                val,
                global,
                threads,
                atomic == AtomicWriteOrd::SeqCst,
                sync_clock,
            )?;
        }

        // Caller should've written to dest with the vanilla scalar write, we do nothing here
        interp_ok(())
    }

    /// The caller should never need to consult the store buffer for the latest value.
    /// This function is used exclusively by failed atomic_compare_exchange_scalar calls
    /// to perform load_impl on the latest store element.
    fn perform_read_on_buffered_latest(
        &self,
        place: &MPlaceTy<'tcx>,
        atomic: AtomicReadOrd,
    ) -> InterpResult<'tcx> {
        let this = self.eval_context_ref();

        if let Some(global) = this.machine.data_race.as_vclocks_ref() {
            if atomic == AtomicReadOrd::SeqCst {
                global.sc_read(&this.machine.threads);
            }
            let size = place.layout.size;
            let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr(), 0)?;
            if let Some(alloc_buffers) =
                this.get_alloc_extra(alloc_id)?.data_race.as_weak_memory_ref()
            {
                let Some(buffer) =
                    alloc_buffers.get_store_buffer(alloc_range(base_offset, size))?
                else {
                    // No store buffer, nothing to do.
                    return interp_ok(());
                };
                buffer.read_from_last_store(
                    global,
                    &this.machine.threads,
                    atomic == AtomicReadOrd::SeqCst,
                );
            }
        }
        interp_ok(())
    }
}