Rust: borrow scopes should not always be lexical

Created on 10 May 2013 · 44 Comments · Source: rust-lang/rust

If you borrow immutably in an if test, the borrow lasts for the whole if expression. This means that mutable borrows in the clauses will cause the borrow checker to fail.

This can also happen when borrowing in the match expression, and needing a mutable borrow in one of the arms.

See here for an example where the if borrows boxes, which causes the nearest upwards @mut to freeze. Then remove_child(), which needs to borrow mutably, conflicts.

https://github.com/mozilla/servo/blob/master/src/servo/layout/box_builder.rs#L387-L411

Updated example from @Wyverald

fn main() {
    let mut vec = vec!();

    match vec.first() {
        None => vec.push(5),
        Some(v) => unreachable!(),
    }
}

A-borrow-checker NLL-fixed-by-NLL

metajack

👍1

All 44 comments

nominating for production ready

metajack on 10 May 2013

I would call this either well-defined or backwards-compat.

nikomatsakis on 10 May 2013

Examples from Jack

[09:57:06] https://github.com/mozilla/servo/blob/master/src/components/servo/layout/box_builder.rs#L387-L411
[10:02:06] https://github.com/mozilla/servo/pull/435

nikomatsakis on 16 May 2013

The code got reorganized, but here is the specific appeasement of the borrow checker I had to do:

https://github.com/metajack/servo/commit/5324cabbf8757fa68b1aa36548b992041be94ef9

https://github.com/metajack/servo/commit/7234635aa580c8a821003882e77d8e043d247687

metajack on 16 May 2013

After some discussion, it's clear that the real problem here is that the borrow checker makes no effort to track aliases, but rather always relies on the region system to determine when a borrow goes out of scope. I am reluctant to change this, at least not in the short term, because there are a number of other outstanding issues I'd like to address first and any changes would be a significant modification to the borrow checker. See issue #6613 for another related example and a somewhat detailed explanation.

nikomatsakis on 20 May 2013

I wonder if we could improve the error messages to make it more clear what's going on? Lexical scopes are relatively easy to understand, but in the examples of this issue that I've stumbled across it was by no means obvious what was going on.

cscott on 20 May 2013

just a bug, removing milestone/nomination.

graydon on 23 May 2013

accepted for far future milestone

graydon on 23 May 2013

triage bump

emberian on 22 Jul 2013

I've had some thoughts about the best way to fix this. My basic plan is that we would have a notion of when a value "escapes". It will take some work to formalize that notion. Basically, when a borrowed pointer is created, we will then track whether it has escaped. When the pointer is dead, if it has not escaped, this can be considered to kill the loan. This basic idea covers cases like "let p = &...; use-p-a-bit-but-never-again; expect-loan-to-be-expired-here;" Part of the analysis will be a rule that indicates when a return value that contains a borrowed pointer can be considered not to have escaped yet. This would cover cases like "match table.find(...) { ... None => { expect-table-not-to-be-loaned-here; } }"
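The first pattern quoted above (borrow, use the pointer briefly, expect the loan to be gone) can be sketched in present-day Rust as follows. This is a minimal illustration that assumes a compiler with non-lexical lifetimes; the helper name `len_then_push` is made up for the example.

```rust
// Hypothetical helper: borrow, use the borrow briefly, then mutate.
fn len_then_push(mut v: Vec<i32>) -> (usize, Vec<i32>) {
    let p = &v;      // shared loan of `v` starts here
    let n = p.len(); // last use of `p`; it is never stored anywhere
    v.push(4);       // accepted only if the loan ends at the last use of `p`
    (n, v)
}

fn main() {
    let (n, v) = len_then_push(vec![1, 2, 3]);
    println!("{} {:?}", n, v);
}
```

Under purely lexical borrow scopes, `p` would keep `v` frozen until the end of the function body, so the `push` would be rejected.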

The most interesting part of all this is the escaping rules, of course. I think that the rules would have to take into account the formal definition of the function, and in particular to take advantage of the knowledge bound lifetimes give us. For example, most escape analyses would consider a pointer p to escape if they see a call like foo(p). But we would not necessarily have to do so. If the function were declared as:

fn foo<'a>(x: &'a T) { ... }

then in fact we know that foo does not hold on to p for any longer than the lifetime a. However, a function like bar would have to be considered as escaping:

fn bar<'a>(x: &'a T, y: &mut &'a T)

So presumably the escaping rules would have to consider whether the bound lifetime appeared in a mutable location or not. This is effectively a form of type-based alias analysis. Similar reasoning I think applies to function return values. Hence find ought to be considered to return a non-escaped result:
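The difference between the two signatures can be seen in a small sketch. The bodies of `foo` and `bar` and the caller `escape_demo` are assumptions added for illustration (the comment only gives the signatures), and `T` is fixed to `i32` to keep it short.

```rust
// `foo` can read through `x` but has nowhere to store it.
fn foo<'a>(x: &'a i32) -> i32 {
    *x + 1
}

// `bar` can stash `x` in the caller's slot, so `x` outlives the call.
fn bar<'a>(x: &'a i32, y: &mut &'a i32) {
    *y = x;
}

// Hypothetical caller showing the consequence for the loan on `a`.
fn escape_demo() -> (i32, i32) {
    let mut a = 1;
    let b = 2;

    let n = foo(&a);
    a += 10; // fine: nothing kept the borrow passed to `foo`

    let mut slot = &b;
    bar(&a, &mut slot);
    // `a += 1;` here would be rejected, because `slot` now points at `a`
    // and is still read on the next line.
    let seen = *slot;
    (n, seen)
}

fn main() {
    println!("{:?}", escape_demo());
}
```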

fn find<'a>(&'a self, k: &K) -> Option<&'a V>

The reason here is that because 'a is bound on find, it cannot appear in the Self or K type parameters, and hence we know it can't be stored in those, and it does not appear in any mutable locations. (Note that we can apply the same inference algorithm as is used today and which will be used as part of the fix for #3598 to tell us whether lifetimes appear in a mutable location.)
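A present-day analogue of the `match table.find(...)` case, as a hedged sketch: `HashMap::get` has the same shape as the `find` signature above, and the helper `len_of` is a made-up name. It assumes a compiler with non-lexical lifetimes.

```rust
use std::collections::HashMap;

// Hypothetical helper: return the cached length for `key`, inserting it
// when the key is missing.
fn len_of(table: &mut HashMap<String, usize>, key: &str) -> usize {
    match table.get(key) {
        Some(v) => *v,
        None => {
            // No reference returned by `get` exists in this arm, so the
            // table is expected not to be loaned here.
            table.insert(key.to_string(), key.len());
            key.len()
        }
    }
}

fn main() {
    let mut table = HashMap::new();
    println!("{}", len_of(&mut table, "abc"));
    println!("{:?}", table.get("abc"));
}
```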

Another way to think about this is not that the loan is expired _early_, but rather that the scope of a loan begins (typically) as tied to the borrowed _variable_ and not the full lifetime, and is only promoted to the full lifetime when the variable _escapes_.

Reborrows are a slight complication, but they can be handled in various ways. A reborrow is when you borrow the contents of a borrowed pointer -- they happen _all the time_ because the compiler inserts them automatically into almost every method call. Consider a borrowed pointer let p = &v and a reborrow like let q = &*p. It'd be nice if when q was dead, you could use p again -- and if both p and q were dead, you could use v again (presuming neither p nor q escapes). The complication here is that if q escapes, p must be considered escaped up until the lifetime of q expires. But I think that this falls out _somewhat_ naturally from how we handle it today: that is, the compiler notes that q has borrowed p for (initially) the lifetime "q" (that is, of the variable itself) and if q should escape, that would be promoted to the full lexical lifetime. I guess the tricky part is in the dataflow, knowing where to insert the kills -- we can't insert the kill for p right away when p goes dead if it is reborrowed. Oh well, I'll not waste more time on this, it seems do-able, and at worst there are simpler solutions that would be adequate for common situations (e.g., consider p to have escaped for the full lifetime of q, regardless of whether the q loan escapes or not).
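The reborrow behaviour wished for above looks like this in present-day Rust. It is a small sketch with mutable borrows (so that each step is observable), assuming non-lexical lifetimes; `reborrow_demo` is a made-up name.

```rust
// Hypothetical demo: `q` reborrows through `p`, then each pointer dies in turn.
fn reborrow_demo() -> i32 {
    let mut v = 1;
    let p = &mut v;
    let q = &mut *p; // reborrow: `p` cannot be used while `q` is live
    *q += 1;         // last use of `q`
    *p += 10;        // `q` is dead, so `p` is usable again
    v += 100;        // `p` and `q` are both dead, so `v` is usable again
    v
}

fn main() {
    println!("{}", reborrow_demo());
}
```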

Anyway, more thought is warranted, but I'm starting to see how this could work. I'm still reluctant to embark on any extensions like this until #2202 and #8624 are fixed, those being the two known problems with borrowck. I'd also like to have more progress on a soundness proof before we go about extending the system. The other extension that is on the timeline is #6268.

nikomatsakis on 12 Sep 2013

I believe I've run into this bug. My use-case and work-around attempts:

https://gist.github.com/toffaletti/6770126

toffaletti on 30 Sep 2013

Here's another example of this bug (I think):

use std::util;

enum List<T> {
    Cons(T, ~List<T>),
    Nil
}

fn find_mut<'a,T>(prev: &'a mut ~List<T>, pred: |&T| -> bool) -> Option<&'a mut ~List<T>> {
    match prev {
        &~Cons(ref x, _) if pred(x) => {}, // NB: can't return Some(prev) here
        &~Cons(_, ref mut rs) => return find_mut(rs, pred),
        &~Nil => return None
    };
    return Some(prev)
}

I'd like to write:

fn find_mut<'a,T>(prev: &'a mut ~List<T>, pred: |&T| -> bool) -> Option<&'a mut ~List<T>> {
    match prev {
        &~Cons(ref x, _) if pred(x) => return Some(prev),
        &~Cons(_, ref mut rs) => return find_mut(rs, pred),
        &~Nil => return None
    }
}

reasoning that the x borrow goes dead as soon as we finish evaluating the predicate, but of course, the borrow extends for the entire match right now.

ezyang on 16 Dec 2013

I've had more thoughts about how to code this up. My basic plan is that for each loan there would be two bits: an escaped version and a non-escaped version. Initially we add the non-escaped version. When a reference escapes, we add the escaped bits. When a variable (or temporary, etc) goes dead, we kill the non-escaped bits -- but leave the escaped bits (if set) untouched. I believe this covers all major examples.
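The two-bit bookkeeping can be modelled in a few lines. This is only a toy model of the rule as stated in the comment, not the borrow checker's actual dataflow; the type `LoanBits` and its method names are invented for the sketch.

```rust
// Toy model: one loan tracked by an escaped bit and a non-escaped bit.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LoanBits {
    non_escaped: bool,
    escaped: bool,
}

impl LoanBits {
    // A new loan starts with only the non-escaped bit set.
    fn new() -> Self {
        LoanBits { non_escaped: true, escaped: false }
    }

    // The reference escapes: add the escaped bit.
    fn mark_escaped(&mut self) {
        self.escaped = true;
    }

    // The variable or temporary holding the reference goes dead:
    // kill the non-escaped bit, leave the escaped bit untouched.
    fn holder_died(&mut self) {
        self.non_escaped = false;
    }

    // The loan still restricts the borrowed value if either bit is set.
    fn in_force(&self) -> bool {
        self.non_escaped || self.escaped
    }
}

fn main() {
    let mut local = LoanBits::new();
    local.holder_died();
    println!("local loan in force: {}", local.in_force());

    let mut stored = LoanBits::new();
    stored.mark_escaped();
    stored.holder_died();
    println!("escaped loan in force: {}", stored.in_force());
}
```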

nikomatsakis on 10 Feb 2014

cc @flaper87

flaper87 on 10 Feb 2014

Does this issue cover this?

use std::io::{MemReader, EndOfFile, IoResult};

fn read_block<'a>(r: &mut Reader, buf: &'a mut [u8]) -> IoResult<&'a [u8]> {
    match r.read(buf) {
        Ok(len) => Ok(buf.slice_to(len)),
        Err(err) => {
            if err.kind == EndOfFile {
                Ok(buf.slice_to(0))
            } else {
                Err(err)
            }
        }
    }
}

fn main() {
    let mut buf = [0u8, ..2];
    let mut reader = MemReader::new(~[67u8, ..10]);
    let mut block = read_block(&mut reader, buf);
    loop {
        //process block
        block = read_block(&mut reader, buf); //error here
    }
}
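For comparison, here is a sketch of the same loop shape against the current standard library. It assumes non-lexical lifetimes; `std::io::Read` stands in for the old `Reader` trait, a byte slice stands in for `MemReader`, and the wrapper `total_bytes` is a made-up name. Treating a zero-length read as end of input is also an assumption of the sketch.

```rust
use std::io::{self, Read};

// Read one block into `buf` and return the filled prefix.
fn read_block<'a>(r: &mut dyn Read, buf: &'a mut [u8]) -> io::Result<&'a [u8]> {
    let len = r.read(buf)?;
    Ok(&buf[..len])
}

// Hypothetical driver: reuse one buffer across iterations.
fn total_bytes() -> io::Result<usize> {
    let mut buf = [0u8; 2];
    let data = [67u8; 10];
    let mut reader = &data[..]; // `&[u8]` implements `Read`
    let mut total = 0;

    let mut block = read_block(&mut reader, &mut buf)?;
    while !block.is_empty() {
        total += block.len(); // process the block
        // The old `block` is not used again before being overwritten,
        // so `buf` can be borrowed mutably once more here.
        block = read_block(&mut reader, &mut buf)?;
    }
    Ok(total)
}

fn main() {
    println!("{:?}", total_bytes());
}
```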

arjantop on 13 Feb 2014

cc me

lilyball on 25 Feb 2014

Good examples in #9113

nikomatsakis on 15 Apr 2014

cc me

pnkfelix on 24 Apr 2014

I could be mistaken, but the following code seems to be hitting this bug as well:

struct MyThing<'r> {
  int_ref: &'r int,
  val: int
}

impl<'r> MyThing<'r> {
  fn new(int_ref: &'r int, val: int) -> MyThing<'r> {
    MyThing {
      int_ref: int_ref,
      val: val
    }
  }

  fn set_val(&'r mut self, val: int) {
    self.val = val;
  }
}


fn main() {
  let to_ref = 10;
  let mut thing = MyThing::new(&to_ref, 30);
  thing.set_val(50);

  println!("{}", thing.val);
}

Ideally, the mutable borrow caused by calling set_val would end as soon as the function returns. Note that removing the 'int_ref' field from the struct (and associated code) causes the issue to go away. The behavior is inconsistent.

SergioBenitez on 20 May 2014

@SergioBenitez I don't think that's the same issue. You're explicitly requesting that the lifetime of the &mut self reference be the same as the lifetime of the struct.

But you don't need to do this. You don't need a lifetime in set_val() at all.

fn set_val(&mut self, val: int) {
    self.val = val;
}
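Applied to the whole example and translated to current syntax (`i32` in place of `int`), the suggestion looks roughly like this. The function `set_val_demo` is a made-up wrapper around the original `main` body.

```rust
struct MyThing<'r> {
    int_ref: &'r i32,
    val: i32,
}

impl<'r> MyThing<'r> {
    fn new(int_ref: &'r i32, val: i32) -> MyThing<'r> {
        MyThing { int_ref, val }
    }

    // No `'r` on `self`: the mutable borrow is not tied to the struct's
    // lifetime parameter and can end when the call returns.
    fn set_val(&mut self, val: i32) {
        self.val = val;
    }
}

// Hypothetical wrapper around the original `main` body.
fn set_val_demo() -> (i32, i32) {
    let to_ref = 10;
    let mut thing = MyThing::new(&to_ref, 30);
    thing.set_val(50);
    (*thing.int_ref, thing.val) // `thing` is readable again after the call
}

fn main() {
    println!("{:?}", set_val_demo());
}
```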

lilyball on 20 May 2014

I found another case that's pretty tricky to fix:

/// A buffer which breaks chunks only after the specified boundary
/// sequence, or at the end of a file, but nowhere else.
pub struct ChunkBuffer<'a, T: Buffer+'a> {
    input:  &'a mut T,
    boundary: Vec<u8>,
    buffer: Vec<u8>
}

impl<'a, T: Buffer+'a> ChunkBuffer<'a,T> {
    // Called internally to make `buffer` valid.  This is where all our
    // evil magic lives.
    fn top_up<'b>(&'b mut self) -> IoResult<&'b [u8]> {
        // ...
    }
}

impl<'a,T: Buffer+'a> Buffer for ChunkBuffer<'a,T> {
    fn fill_buf<'a>(&'a mut self) -> IoResult<&'a [u8]> {
        if self.buffer.as_slice().contains_slice(self.boundary.as_slice()) {
            // Exit 1: Valid data in our local buffer.
            Ok(self.buffer.as_slice())
        } else if self.buffer.len() > 0 {
            // Exit 2: Add some more data to our local buffer so that it's
            // valid (see invariants for top_up).
            self.top_up()
        } else {
            {
                // Exit 3: Exit on error.
                let read = try!(self.input.fill_buf());
                if read.contains_slice(self.boundary.as_slice()) {
                    // Exit 4: Valid input from self.input. Yay!
                    return Ok(read)
                }
            }
            // Exit 5: Accumulate sufficient data in our local buffer (see
            // invariants for top_up).
            self.top_up()
        }
    }
}

…which gives:

/path/to/mylib/src/buffer.rs:168:13: 168:17 error: cannot borrow `*self` as mutable more than once at a time
/path/to/mylib/src/buffer.rs:168             self.top_up()
                                                        ^~~~
/path/to/mylib/src/buffer.rs:160:33: 160:43 note: previous borrow of `*self.input` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `*self.input` until the borrow ends
/path/to/mylib/src/buffer.rs:160                 let read = try!(self.input.fill_buf());
                                                                            ^~~~~~~~~~
<std macros>:1:1: 3:2 note: in expansion of try!
/path/to/mylib/src/buffer.rs:160:28: 160:56 note: expansion site
/path/to/mylib/src/buffer.rs:170:6: 170:6 note: previous borrow ends here
/path/to/mylib/src/buffer.rs:149     fn fill_buf<'a>(&'a mut self) -> IoResult<&'a [u8]> {
...
/path/to/mylib/src/buffer.rs:170     }

This is basically equivalent to #12147. The variable read is buried in an inner scope, but the return binds read's lifetime to that of the entire function. Most of the obvious workarounds fail:

I can't call input.fill_buf twice, because the Buffer interface doesn't guarantee that it returns the data I just validated the second time. If I _do_ try this, the code is technically incorrect but the type checker passes it happily.
I can't do much about top_up, because it's an evil piece of code that needs to mutate everything in complicated ways.
I can't move the offending bind+test+return into another function, because the new API will still have all the same issues (unless if let allows me to test _then_ bind?).

It almost feels as if the 'a constraint ideally shouldn't get propagated the whole way back to read, but I'm in over my head here. I'm going to try if let next.

emk on 2 Oct 2014

Well, if let didn't make it into the build last night, but since it's supposedly just an AST rewrite, I guess it probably fails in the same way as match does (which I've also tried here).

I'm not sure how to proceed, short of using unsafe.

emk on 2 Oct 2014

My current hack here looks like this:

impl<'a,T: Buffer+'a> Buffer for ChunkBuffer<'a,T> {
    fn fill_buf<'a>(&'a mut self) -> IoResult<&'a [u8]> {
        // ...

            { // Block A.
                let read_or_err = self.input.fill_buf();
                match read_or_err {
                    Err(err) => { return Err(err); }
                    Ok(read) => {
                        if read.contains_slice(self.boundary.as_slice()) {
                               return Ok(unsafe { transmute(read) });
                        }
                    }
                }
            }
            self.top_up()

The theory here is that I'm dropping the lifetime off of read (which was bound to self.input), and immediately applying a new lifetime based on self, which owns self.input. Ideally, I want read to have a lexical lifetime equal to Block A, and I don't want it to get hoisted up to the _lexical_ block level just because I passed it to return. Obviously the lifetime checker still needs to prove that the result has a lifetime compatible with 'a, but I don't understand why that means LIFETIME(read) needs to be unified with LIFETIME('a).

It's entirely possible that I'm massively confused, or that my code is horribly unsafe. :-) But it does feel like this should work, if only because I can call return self.input.fill_buf() without any problem at all. Is there any way to formalize that intuition?

emk on 2 Oct 2014

@emk so this is the "hard code" that SEME regions (that is, non-lexical regions) do not fix, at least not by themselves. I have some ideas for how to fix it nicely in the compiler, but it's a non-trivial extension to SEME regions. There is usually a way to work around this by restructuring the code. Let me see if I can play around with it and produce a nice example.

nikomatsakis on 3 Oct 2014

I would like to know if this is being reconsidered for 1.0. This has been coming up a _lot_ recently, and I fear that this will vault from a papercut to a flesh wound once 1.0 brings a rush of attention. As Rust's most visible and talked-about feature, it's very important for borrowing to be polished and usable.

Is there a timeframe on an RFC for this?

bstrie on 9 Oct 2014

@nikomatsakis I have a real-world simple example that fell out of working on the Entry API, if it helps:

use std::collections::SmallIntMap;

enum Foo<'a>{ A(&'a mut SmallIntMap<uint>), B(&'a mut uint) }

fn main() {
    let mut map = SmallIntMap::<uint>::new();
    do_stuff(&mut map);
}

fn do_stuff(map: &mut SmallIntMap<uint>) -> Foo {
    match map.find_mut(&1) {
        None => {},  // Definitely can't return A here because of lexical scopes
        Some(val) => return B(val),
    }
    return A(map); // ERROR: borrowed at find_mut???
}

playpen
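One common way to restructure this shape so that no borrow is held across the fall-through is to test for the key first and borrow only on the path that returns. The sketch below uses `HashMap` and `u32` in place of `SmallIntMap` and `uint`, which is a substitution made for the example, and it pays for the restructuring with a second lookup.

```rust
use std::collections::HashMap;

enum Foo<'a> {
    A(&'a mut HashMap<u32, u32>),
    B(&'a mut u32),
}

fn do_stuff(map: &mut HashMap<u32, u32>) -> Foo<'_> {
    if map.contains_key(&1) {
        // The mutable borrow is created only on the path that returns it.
        return Foo::B(map.get_mut(&1).expect("key was just checked"));
    }
    // No borrow from the lookup reaches this point.
    Foo::A(map)
}

fn main() {
    let mut map = HashMap::new();
    map.insert(1, 7);
    match do_stuff(&mut map) {
        Foo::A(m) => println!("whole map, {} entries", m.len()),
        Foo::B(v) => println!("entry for key 1: {}", v),
    }
}
```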

Gankra on 9 Oct 2014

@bstrie Both @pcwalton and @zwarich spent some time trying to actually implement this work (with a possible RFC coming hand-in-hand). They ran into some unexpected complexity that means it would take much more work than hoped. I think everyone agrees with you that these limitations are important and can impact the first impression of the language, but it's hard to balance that against backwards-incompatible changes that are already scheduled.

aturon on 9 Oct 2014

https://github.com/rust-lang/rfcs/pull/396

ntrel on 15 Oct 2014

I feel that if this isn't resolved by 1.0, it's the kind of thing that will lead people to blame the borrow checking approach altogether, when this problem isn't an inherently unsolvable problem with borrow checking AFAIK.

blaenk on 27 Dec 2014

@blaenk It's hard not to blame the borrow checker, I've run into this and similar (like @Gankro) on a daily basis. It's frustrating when the usual solution is gyrations (e.g. a workaround) or comments to restructure your code to be more "immutable", functional, etc.

mtanski on 3 Jan 2015

@mtanski Yeah, but the fault _does_ lie in the borrow checker AFAIK, it's not incorrect to blame it. What I'm referring to is that it may lead newcomers to believe that it's an inherent, fundamental, unsolvable problem with the borrow checking _approach_, which is a pretty insidious, and AFAIK, incorrect, belief.

blaenk on 3 Jan 2015

For the case: "let p = &...; use-p-a-bit-but-never-again; expect-loan-to-be-expired-here;" I would find acceptable for now a kill(p) instruction to manually declare end of scope for that borrow. Later versions could simply ignore this instruction if it is not needed or flag it as an error if re-use of p is detected after it.

userxfce on 5 Apr 2015

/* (wanted) */
/*
fn main() {

    let mut x = 10;

    let y = &mut x;

    println!("x={}, y={}", x, *y);

    *y = 11;

    println!("x={}, y={}", x, *y);
}
*/

/* had to */
fn main() {

    let mut x = 10;
    {
        let y = &x;

        println!("x={}, y={}", x, *y);
    }

    {
        let y = &mut x;

        *y = 11;
    }

    let y = &x;

    println!("x={}, y={}", x, *y);
}
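With borrows that end at their last use, the extra blocks in the "had to" version become unnecessary. A sketch assuming non-lexical lifetimes, returning the two observed values instead of printing them; `rebind_demo` is a made-up name.

```rust
// Hypothetical flattening of the "had to" version, without inner blocks.
fn rebind_demo() -> (i32, i32) {
    let mut x = 10;

    let y = &x;
    let before = *y; // last use of the shared borrow

    let y = &mut x;
    *y = 11;         // last use of the mutable borrow

    let y = &x;
    (before, *y)
}

fn main() {
    let (before, after) = rebind_demo();
    println!("before={}, after={}", before, after);
}
```

The "wanted" version is a different matter: it reads `x` while the mutable borrow `y` is still in use, which is a genuine conflict rather than a scoping artifact.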

userxfce on 5 Apr 2015

There's the drop() function in the prelude that does that. But it doesn't seem to help with mutable borrows.


seanmonstar on 5 Apr 2015

Closing in favor of https://github.com/rust-lang/rfcs/issues/811

nikomatsakis on 16 Apr 2015

@metajack the link for your original code is a 404. Can you include it inline for people reading this bug?

vitiral on 2 Sep 2016

After some digging, I believe this is equivalent to the original code:
https://github.com/servo/servo/blob/5e406fab7ee60d9d8077d52d296f52500d72a2f6/src/servo/layout/box_builder.rs#L374

metajack on 2 Sep 2016

Or rather, that's the workaround I used when I filed this bug. The original code before that change seems to be this:
https://github.com/servo/servo/blob/7267f806a7817e48b0ac0c9c4aa23a8a0d288b03/src/servo/layout/box_builder.rs#L387-L399

I'm not sure how relevant these specific examples are now since they were pre-Rust 1.0.

metajack on 2 Sep 2016

@metajack it would be great to have an ultra simple (post 1.0) example in the top of this issue. This issue is now part of https://github.com/rust-lang/rfcs/issues/811

vitiral on 2 Sep 2016

fn main() {
    let mut nums=vec![10i,11,12,13];
    *nums.get_mut(nums.len()-2)=2;
}
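A conservative way to write this that does not depend on how the checker orders the two borrows is to compute the index before taking the mutable borrow. This is a sketch in current syntax; the helper name `set_second_to_last` is invented for the example.

```rust
// Hypothetical helper: overwrite the second-to-last element.
// Panics if the slice has fewer than two elements.
fn set_second_to_last(nums: &mut [i32], value: i32) {
    let i = nums.len() - 2; // the shared borrow for `len` ends on this line
    nums[i] = value;        // the mutable borrow starts only here
}

fn main() {
    let mut nums = vec![10, 11, 12, 13];
    set_second_to_last(&mut nums, 2);
    println!("{:?}", nums);
}
```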

jethrogb on 2 Sep 2016

I think what I was complaining about was something like this:
https://is.gd/yfxUfw

That particular case appears to work now.

metajack on 2 Sep 2016

@vitiral An example in today's Rust that I believe applies:

fn main() {
    let mut vec = vec!();

    match vec.first() {
        None => vec.push(5),
        Some(v) => unreachable!(),
    }
}

The None arm fails borrowck.

Curiously, if you don't try to capture the value in the Some arm (i.e. use Some(_)), it compiles.

Wyverald on 11 Jan 2017

@Wyverald oh ya, I hit that just yesterday, pretty annoying.

@metajack can you edit the first post to include that example?

vitiral on 11 Jan 2017

It's not yet hit nightly, but I just want to say that this now compiles:

#![feature(nll)]

fn main() {
    let mut vec = vec!();

    match vec.first() {
        None => vec.push(5),
        Some(v) => unreachable!(),
    }
}

nikomatsakis on 21 Dec 2017

🎉16 ❤7
