Include the errored file path (when available) in std::io::Error instances #2885
Comments
Also, |
We cannot include this information in a zero cost way, so it would be a price that everyone would need to pay (even if they don't want to). As such, it would not fit well into std; a library providing a wrapper around std which adds relevant context could be viable though. Thanks for the suggestion, though! If you want to discuss this further, I recommend going to internals.rust-lang.org. |
Would it be inappropriate even in debug mode only? |
We would need an RFC at minimum to design such a change and I personally am not sure that it's possible given that we only ship one std (which is always compiled in release mode). (Regardless, this is a sufficiently large request to need an RFC if you want to pursue it further). |
Yes, I'd wager most code that benefits from this in debug would benefit in release too, making the debug effort wasted. In C/C++ you must attach such contextual information manually. In Linux, you could read file paths from open using /proc/self and
In principle, you could write an |
I object to the idea that adding this is not zero cost. The benefit of adding The only issue I see is that paths will not be present in operations after opening the file. However in my experience errors after opening the file are extremely rare and they are pretty much guaranteed to be problems with whole filesystem (full, device unplugged...) in which case the specific file doesn't matter that much anyway. I'm willing to write the RFC if my arguments convinced you that it's worth pursuing. |
I think I've personally been slowly more convinced over time that this is probably the right thing to do if we can pull it off on benchmarks and such (seems not implausible, given existing allocations etc) -- I think the right next step is to kick off some discussion on internals.rust-lang.org and get a read on how libs team feels about this (#t-libs on Zulip is probably best for that). See also #2979, which tries to codify some of these processes :) |
Hello! I'm here from r/rust. Has there been more recent discussion/progress on this issue? I'd love to have/help with this feature! |
@paulzzy not that I'm aware of. I had it in the back of the mind but didn't find the time to write the RFC. If it starts moving, I will try to schedule some time to help. For now, here are my thoughts:
My desired error messages:
|
There's a relevant PR from @m-ou-se (opened just over a year ago): |
At some point in the past I did https://github.com/thomcc/rust/tree/io-error-path which was the start of an updated version #86826. It was mostly to convince myself this was still possible with the new error repr (it is, but it does require an allocation now), but I ended up coming to the conclusion that it wasn't really worth doing, mostly because of what @m-ou-se said in that issue:
One thought is that the path could be provided even via the error trait via the provider API (CC @yaahc). Beyond that, I do think the fact that it will end up making many programs show the path twice is unfortunate. I'm not sure that this outweighs the utility of having the path available (certainly that would be nice, and it's tedious to have to truck path around just to be able to display it with the error), but yeah. That and the fact that I had convinced myself it was still possible is why I stopped working on that branch. FWIW, it is also clear that it's not zero-cost anymore, at least not on 64-bit systems, as it requires an additional allocation (considerably more expensive than the two mov instructions cited above), but it is probably true that this is outweighed by the cost of a syscall. (It actually may require two allocations now, as since writing the code in that branch, we don't always allocate paths that are passed into the methods) |
@thomcc interesting, I don't have enough time to look into it deeply but in my experience path being displayed twice is less bad than path not being displayed at all. There could be |
Displaying the path for I agree that duplicating the path is problematic. Since first realizing that rust io errors are this useless on their own, I now always map IO errors at the site of creation to a custom error (or just a Still, it's not the end of the world: if paths were included from day 0, no one would have formatted IO errors that way. If paths get included in However, there's also the matter of whether or not it's out of the question to use some internal magic to detect whether we are in a top-level It would be harder (and probably not zero-cost) to determine whether |
I'm going to reopen this since it's getting active discussion and seems plausible we could include it, and @Mark-Simulacrum (the person who closed it in the first place) indicated in an above comment that they've changed their mind somewhat. I'll note that given the current io::Error implementation, this wouldn't be fully zero cost. That said it probably would be negligible compared to the cost of a system call, and the cost would be entirely in the error case -- the success path would be unpunished. It would be doable by conceptually storing (something similar to) Doing so would incur a single extra allocation, but only in the case that the file operation failed. I'm unconvinced this matters or would show up in benchmarks, but perhaps it's not strictly zero cost. CC @yaahc since this is related to errors. |
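A rough user-level sketch of the idea of storing the path alongside the error, building only on the existing `io::Error::new` and `io::Error::get_ref` APIs. The names `WithPath`, `open_with_path` and `error_path` are hypothetical and not part of std, and this version allocates more than an in-std implementation would need to (the payload is boxed and the path is copied). All of that work happens only on the failure branch.

```rust
use std::error::Error;
use std::fmt;
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical payload: the original OS error together with the path.
#[derive(Debug)]
struct WithPath {
    path: PathBuf,
    inner: io::Error,
}

impl fmt::Display for WithPath {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.path.display(), self.inner)
    }
}

impl Error for WithPath {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.inner)
    }
}

/// Opens a file; on failure, re-wraps the error so that it carries the path.
/// The success path does no extra work.
fn open_with_path(path: &Path) -> io::Result<File> {
    File::open(path).map_err(|inner| {
        // Preserve the kind so callers matching on `kind()` keep working.
        // Note: `raw_os_error()` on the outer error becomes `None`; the
        // original OS error stays reachable through `WithPath::inner`.
        let kind = inner.kind();
        io::Error::new(kind, WithPath { path: path.to_path_buf(), inner })
    })
}

/// Recovers the path from an error produced by `open_with_path`, if any.
fn error_path(err: &io::Error) -> Option<&Path> {
    err.get_ref()
        .and_then(|e| e.downcast_ref::<WithPath>())
        .map(|w| w.path.as_path())
}
```

With this shape the path is both shown by `Display` and available to query programmatically, while an error that was not created through the wrapper simply yields `None` from `error_path`.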
IIRC, in many cases we're forced to allocate a CString for the path today anyway, which makes me wonder if we can upfront make a slightly larger allocation (by a few bytes) to avoid needing to copy/reallocate the path after the syscall fails. This would make this pretty much zero-cost, modulo a slightly larger path allocation, which in many cases is entirely unnoticeable as the allocator is probably putting us in a power-of-two bucket anyway. |
I have little to add to this one. I'm all for improving that situation because I'm aware that there are many unpleasant failure modes associated with error reporting for |
Yes, I agree that's not a good solution either. I think that and the other "magic"ey solutions are probably a non-starter -- in all likelihood the choices probably are:
I'm in favor of the first or the third. I think the second is likely not worth the trouble, or the (admittedly small) overhead, since it would be very hidden. I'm most in favor of the first, as I don't think we consider the output of the io::Error's display implementation to be a stable detail, and don't think that including the path twice is that bad (it's a little goofy, but it seems decidedly better than not including it ever -- error messaging is often slightly redundant anyway). I could be mistaken about this, though. |
I think option 1 is perfectly fair and would be my preference as well (despite having voiced aloud the other ideas as well), and like I mentioned, people will adapt and adjust their error reporting to account for the duplicated path in time. |
I'd also prefer option 1. As a Rust beginner (admittedly not the most important demographic), it can be frustrating trying to figure out where the error is propagating from. I think having it built-in is a small price to pay for redundant printing. It'd certainly be helpful for new Rustaceans who don't know how to write custom error wrappers (this is a self-own). For option 3, I'm not sure lack-of-complaints implies lack-of-trouble. Beginners probably don't know where to complain, while experienced Rustaceans already know how to deal with it. That said, this is speculation so I could be totally wrong. Also, how much work is already done by rust-lang/rust#86826? |
A significant amount of it would have to be redone, because the underlying error impl has been rewritten, and the approach must change. That said, I did part of that work in https://github.com/thomcc/rust/tree/io-error-path, but stopped because of being unsure of its utility. |
@thomcc wrote:
The error case can still be on the common critical path: some programs try opening many files, most of which won't exist. For instance, consider the case of looking for files along a search path; most open operations will fail. Linux has a "negative dentry" cache specifically to remember "this file doesn't exist", to optimize these kinds of cases. @Mark-Simulacrum wrote:
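To make the search-path case concrete, here is a minimal sketch of such a lookup. The function name `open_on_search_path` is made up for illustration. Every miss constructs and then drops an `io::Error`, so any per-error cost (such as copying the path into the error) would be paid once per directory probed.

```rust
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};

/// Tries `name` in each directory in order and returns the first file that
/// opens, together with the full path that was used.
fn open_on_search_path(dirs: &[PathBuf], name: &Path) -> io::Result<(PathBuf, File)> {
    // Returned if `dirs` is empty or every candidate is missing.
    let mut last_err = io::Error::from(io::ErrorKind::NotFound);
    for dir in dirs {
        let candidate = dir.join(name);
        match File::open(&candidate) {
            Ok(file) => return Ok((candidate, file)),
            // Expected failure: this is the "hot" error path discussed above.
            Err(e) if e.kind() == io::ErrorKind::NotFound => last_err = e,
            // Anything else (permissions, I/O failure) is reported right away.
            Err(e) => return Err(e),
        }
    }
    Err(last_err)
}
```

In a loop like this the failing opens are the common case, which is why the cost of the error branch is not automatically negligible for every program.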
rust-lang/rust#93668 fixes that. |
@joshtriplett that changes the perspective quite a bit but we ran into an unfortunate situation. Without the path, error messages coming from Maybe the correct approach is really making a separate |
I don't know any Rust - arrived at this page because I'm trying to modify some existing code. And precisely the same as the comments here - as an end user, "no such file or directory" is in effect useless to me, I can't help craft an MRE for the authors of the original code if I don't know what the cause of the error actually is, so it sounds like I will need to dig down deeper and return a different type with this on top. But I don't know Rust's design goals, I'm a C++ programmer really, just thought I'd add:
Yes, that has traditionally been the case, we want to provide information using the what string. However, we probably would insert the paths in this string, and: https://en.cppreference.com/w/cpp/filesystem/filesystem_error/path now we have filesystem stuff in the std library, |
Is there a best practice for dealing with this issue in the meantime? It's easy enough if you use anyhow everywhere, but that has downsides. |
@blueforesticarus I was working on such a crate (with a few more bells and whistles) but didn't finish it. Would be happy to set up some repo for other people to help me if there's enough interest. |
In that case I will complain. It has always bothered me that file errors do not include the path. It is not just a matter of the display or debug message, but the path is not even available to query programmatically from the error. This was so surprising to me that originally I figured I was just using it wrong somehow. Surely the std lib wouldn't make such an oversight. So tonight I've spent about 3 hours tracking all this down, and now it seems I will have to spend more time creating a wrapper Error just to include info that was already passed to the write() call. |
There is certainly a high cost to not including the path. |
@Kixunil |
Hello, I’m wondering about the state of this issue now that rust-lang/rust#93668 has landed, since the argument ‘we already allocate a bunch of CStrings’ is no longer valid. I’d be ready to have a look at this issue if there’s a green light. |
I am in support of this. At the moment I am forced to add context to every single ? error handling that concerns I/O. It feels bad. |
You really only need to add it to the top level one, i.e. a single one, which honestly you need to do even if io::Error contains the path, cause parsing a file can result in many errors that are unrelated to the IO operation itself, such as the encoding being corrupt. |
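A small sketch of what "add it to the top level only" could look like. `LoadError`, `read_number` and `load_number` are hypothetical names used for illustration. The inner helper bubbles errors up with plain `?`, and the path is attached exactly once, covering both the I/O failure and the parse failure.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical top-level error: the file being processed plus whatever
/// went wrong, whether it was I/O or parsing.
#[derive(Debug)]
struct LoadError {
    path: PathBuf,
    source: Box<dyn Error + Send + Sync>,
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to load {}: {}", self.path.display(), self.source)
    }
}

impl Error for LoadError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        let src: &(dyn Error + 'static) = &*self.source;
        Some(src)
    }
}

/// Inner helper: propagates errors with `?` and adds no context of its own.
fn read_number(path: &Path) -> Result<u32, Box<dyn Error + Send + Sync>> {
    let text = fs::read_to_string(path)?; // may fail with io::Error
    let n = text.trim().parse::<u32>()?; // may fail with ParseIntError
    Ok(n)
}

/// Top-level entry point: the single place where the path is attached.
fn load_number(path: &Path) -> Result<u32, LoadError> {
    read_number(path).map_err(|source| LoadError {
        path: path.to_path_buf(),
        source,
    })
}
```

This only works when the top-level caller still knows which path was involved; once several files are in play further down the call stack, the context has to be attached closer to the failing call.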
This is definitely a stumbling point for beginners, I for one didn't expect the path to not be included. I'm not sure how practical it would be but for the use cases where it is on the critical path of execution I'm wondering if it could be better served by an alternate call that "anticipates" an error case and separate it from "regular" calls that at least for beginners expect the error message to include the path. With regard to requiring an allocator, is it possible to check if one is available and only include the path if one is? If this only happens in the error case I think that's a worthwhile cost because it will save a lot of debug time trying to figure out what is not available, especially if you are new and don't even really know where to start. Googling And with regard to double printing the error messages I think the edition system is a fine time to make that kind of change as it provides a point where people would have to opt in. I personally feel the double printing is less bad than never printing. And this will get less bad with time as people expect it to include the path. |
Yes well, trying to figure it out is how I ended up here. Like probably most users, I don't use Rust full-time and suffer quite large mental overhead from wrapping errors, so it's easier to leave it for now. Also as the Error Handling Project Group has gone stale (?), nobody's there to hold my hand :( The point about encodings is good, however, I feel like typoed filenames are much more common than corrupted files. Weirdly, I often feel like error handling is simultaneously one of the best and worst parts about Rust. |
I think you should take a look at the anyhow crate, it might make your life easier. If you get stuck you can always get in touch on Discord; there are lots of people willing to help you on your journey. This issue imo is about long-term ergonomics and practicality, not short-term solutions. But hopefully we can find a way to help ppl to join in the future. |
I don't mean to derail this discussion any further than this. I am building a library for bioinformatics that handles errors using |
Isn't there any library handling it? Would it make sense to have one? It could mimic I thought there was something like that already, and I'm trying to find it. :D |
@dpc I started writing one a long time ago but didn't finish. I wanted (and still want) to restart it but life got in the way. |
https://crates.io/crates/fs-err is one. I think there are some more. |
Given that probably the most common model for handling errors in rust programs is just to bubble them up, it is far too common to see errors like this:
Note the last line: sccache panicked because it couldn't find some file. Which file? Who knows. You'll need to debug or grep the code to find out.
This is actually because `std::io::Error` is guilty of the same thing sccache did: it just bubbled up the OS error without adding any context.

This particular sccache error is just the error I was dealing with last before filing this issue, but I run into this about every other day dealing with 3rd party crates in the ecosystem. In my own code, all of my crates have to wrap `std::io::Error` in a new error type that includes the file path and at the site of invocation where `std::io::Result` is first returned, every `std::io::Error` gets mapped to a custom error that includes the path in question, if any.

I would like to propose that - at least in debug mode, although preferably always - any IO operations that involve a named path should include that path in their error text, and perhaps even the error type should be extended to include an `Option<PathBuf>` field (or select APIs returning a different error type).

I know this comes with some caveats. It's hard to use a reference to the path to generate the error text since that would tie the lifetime of the error to (likely) the scope it was called from, making bubbling it up harder, so it would necessarily have to copy the value. This increases the size of `std::io::Error` for cases where errors are not being bubbled up, but I've been mulling this in the back of my head for some time now, and I think the ergonomics outweigh the cost... and I'm also hopeful someone can figure some clever trick around that.
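For readers unfamiliar with the workaround described in the issue body, a minimal sketch of the "wrap `std::io::Error` in a new error type that includes the path" pattern might look like this. `PathIoError` and `read_to_string_with_path` are hypothetical names, not an existing API. The `Option<PathBuf>` field mirrors the one proposed above, and the path is copied only when the operation fails.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical wrapper: an I/O error plus the path it relates to, if any.
#[derive(Debug)]
pub struct PathIoError {
    pub path: Option<PathBuf>,
    pub source: io::Error,
}

impl fmt::Display for PathIoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.path {
            Some(p) => write!(f, "{}: {}", p.display(), self.source),
            None => write!(f, "{}", self.source),
        }
    }
}

impl Error for PathIoError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Reads a file, mapping the error at the site where the `io::Result` is
/// first returned. The path is cloned only on the failure branch.
pub fn read_to_string_with_path(path: impl AsRef<Path>) -> Result<String, PathIoError> {
    let path = path.as_ref();
    fs::read_to_string(path).map_err(|source| PathIoError {
        path: Some(path.to_path_buf()),
        source,
    })
}
```

Because the error owns a copy of the path instead of borrowing it, it can be bubbled up freely with `?`, which is the lifetime trade-off mentioned in the caveats.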