Filesystem Relative Draft Proposal
Introduction
Acknowledgement
Preliminary implementation
Requirements
Issues
Design decisions
    Provide separate lexical and operational relative functions
    Provide separate lexical and operational proximate functions
    Add lexical functions as path member functions
    Provide a non-member function lexically_normal returning a normal form path
    Provide a weakly_canonical operational function
    Resolve issues in ways that "just work" for users
    Specify lexical relative in terms of std::mismatch
    Specify operational relative in terms of weakly_canonical
    Specify operational relative in terms of lexically relative
Proposed wording
    Define normal form
    New class path member functions
        Synopsis
        Specification
    New operational functions
        Synopsis
        Specification

Introduction

There have been requests for a Filesystem library relative function for at least ten years.
The requested functionality seems simple: given two paths with a common prefix, return the non-common suffix portion of one of the paths such that it is relative to the other path.
In terms of the Filesystem library,
    path p("/a/b/c");
    path base("/a/b");
    path rel = relative(p, base);  // the requested function
    cout << rel << endl;           // outputs "c"
    assert(absolute(rel, base) == p);
If that were all there was to it, the Filesystem library would have had a relative function years ago.
Blocking issues: Clashing requirements, symlinks, directory placeholders (dot, dot-dot), user-expectations, corner cases.

Acknowledgement

A paper by Jamie Allsop, Additions to Filesystem supporting Relative Paths, is what broke my mental logjam. Much of what follows is based directly on Jamie's analysis and proposal. The weakly_canonical function and aspects of the semantic specifications are my contributions. Mistakes, of course, are mine.

Preliminary implementation

A preliminary implementation is available in the feature/relative2 branch of the Boost Filesystem Git repository. See github.com/boostorg/filesystem/tree/feature/relative2
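
Requirements

Requirement 1: Some uses require symlinks be followed; i.e. the path must be resolved in the actual file system.
Requirement 2: Some uses require symlinks not be followed; i.e. the path must not be resolved in the actual file system.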
Requirement 3: Some uses require removing redundant current directory (dot) or parent directory (dot-dot) placeholders.
Requirement 4: Some uses do not require removing redundant current directory (dot) or parent directory (dot-dot) placeholders since the path is known to be already in normal form.

Issues

Issue 1: What happens if p and base are themselves relative?
Issue 2: What happens if there is no common prefix? Is this an error, the whole of p is relative to base, or something else?
Issue 3: What happens if p, base, or both are empty?
Issue 4: What happens if p and base are the same?
Issue 5: How is the "common prefix" determined?
Issue 6: What happens if portions of p or base exist but the entire path does not exist and yet symlinks need to be followed?
Issue 7: What happens when a symlink in the existing portion of a path is affected by a parent directory (dot-dot) placeholder in a later non-existent portion of the path?
Issue 8: Overly complex semantics (and thus specifications) in preliminary designs made reasoning about uses difficult.
Issue 9: Some uses never have redundant current directory (dot) or parent directory (dot-dot) placeholders, so a removal operation would be an unnecessary expense although otherwise harmless.

Design decisions

Provide separate lexical and operational relative functions

Resolves the conflict between requirement 1 and requirement 2 and ensures both requirements are met.
A purely lexical function is needed by users working with directory hierarchies that do not actually exist.
An operational function that queries the current file system for existence and follows symlinks is needed by users working with actual existing directory hierarchies.

Provide separate lexical and operational proximate functions

Although not the only possibility, a likely fallback when the relative functions cannot find a relative path is to return the path being made relative. As a convenience, the proximate functions do just that.

Add lexical functions as path member functions

The Filesystem library is unusual in that it has several functions with both lexical (i.e. cheap) and operational (i.e. expensive due to file system access) forms with differing semantics. It is important that users choose the form that meets their application's specific needs. The library has always made the distinction via the convention of lexical functions being members of class path, while operational functions are non-member functions. The lexical functions proposed here also use the name prefix lexically_ to drive home the distinction.
For the contrary argument, see Sutter and Alexandrescu, C++ Coding Standards, 44: "Prefer writing nonmember nonfriend functions", and Meyers, Effective C++ Third Edition, 23: "Prefer non-member non-friend functions to member functions."

Provide a non-member function lexically_normal returning a normal form path

Enables resolution of requirement 3 and requirement 4 in a way consistent with issue 9. Is a contributor to the resolution of issue 8.
"Normalization" is the process of removing redundant current directory (dot) , parent directory (dot-dot), and directory separator elements.
Normalization is a byproduct of the current canonical function. But for the path returned by the proposed weakly_canonical function, only the leading canonical portion, if any, is in canonical form, so any trailing portion of the returned path has not been normalized.
Jamie Allsop has proposed adding a separate normalization function returning a path, and I agree with him.
Boost.Filesystem has a deprecated non-const normalization function that modifies the path, but I agree with Jamie that a function returning a path is a better solution.
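To make the idea of a normalization function that returns a path concrete, here is a minimal sketch. It assumes C++17 std::filesystem::path::lexically_normal as a stand-in for the function proposed here; the standardized function treats trailing separators differently from this draft, so only inputs without a trailing separator or trailing dot-dot are used. The helper name normalized is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the proposal): normalize purely lexically,
// without touching the file system, and return the result in generic
// (forward-slash) format so it compares the same on every platform.
// The argument is left unmodified; a new path is produced.
std::string normalized(const fs::path& p) {
    return p.lexically_normal().generic_string();
}
```

Under these assumptions, normalized("foo/./bar") and normalized("foo//bar") both yield foo/bar, and normalized("foo/bar/../baz") yields foo/baz.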

Provide a weakly_canonical operational function

Resolves issue 6, issue 7, issue 9, and is a contributor to the resolution of issue 8.
The operational function weakly_canonical(p) returns a path composed of canonical(x)/y, where x is a path composed of the longest leading sequence of elements in p that exist, and y is a path composed of the remaining trailing non-existent elements of p, if any. "weakly" refers to weakened existence requirements compared to the existing canonical function.
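The canonical(x)/y composition described above can be sketched roughly as follows. This is an illustration only, not the preliminary implementation: it assumes C++17 std::filesystem in place of Boost.Filesystem, the name weakly_canonical_sketch is hypothetical, and error handling is left to whatever the underlying calls throw.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical sketch: split p into the longest leading sequence of elements
// that exist (x) and the remaining non-existent elements (y), then return
// canonical(x)/y in normal form.
fs::path weakly_canonical_sketch(const fs::path& p) {
    fs::path head;  // x: grows while the candidate prefix exists
    auto it = p.begin();
    for (; it != p.end(); ++it) {
        fs::path candidate = head / *it;
        if (!fs::exists(fs::status(candidate)))
            break;  // first non-existent element: y starts here
        head = candidate;
    }
    // Only the existing portion can be resolved against the file system.
    fs::path result = head.empty() ? fs::path() : fs::canonical(head);
    for (; it != p.end(); ++it)
        result /= *it;  // append y unchanged
    // Assumption: std's lexically_normal stands in for the draft's normal form.
    return result.lexically_normal();
}
```

For an existing directory d and a non-existent name n, weakly_canonical_sketch(d / n / "x" / ".." / "f.txt") resolves symlinks in d only and then collapses the trailing x/.. lexically.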
Specifying weakly_canonical as a separate function, and then specifying the processing of operational relative arguments in terms of calls to weakly_canonical, makes it much easier to specify the operational relative function and reason about it. The difficulty of reasoning about operational relative semantics before the invention of weakly_canonical was what led to its initial development.
Specifying weakly_canonical as a separate function also allows use in other contexts.

Resolve issues in ways that "just work" for users

Resolves issues 1, 2, 3, 4, 6, and 7. Is a contributor to the resolution of issue 8.
The "just works" approach was suggested by Jamie Allsop. It is implemented by specifying a reasonable return value for all of the "What happens if..." corner case issues, rather than treating them as hard errors requiring an exception or error code.

Specify lexical relative in terms of std::mismatch

Resolves issue 5. Is a contributor to the resolution of issue 8.

Specify operational relative in terms of weakly_canonical

Is a contributor to the resolution of issue 8.

Specify operational relative in terms of lexically relative

Is a contributor to the resolution of issue 5 and issue 8.
It would be confusing to users and difficult to specify correctly if the two functions had differing semantics.
These problems are avoided by specifying operational relative in terms of lexical relative after preparatory calls to operational functions.

Proposed wording
"Overview:" sections below are non-normative experiments attempting to make the normative reference specifications easier to grasp.

Define normal form

A path is in normal form if it has no redundant current directory (dot) or parent directory (dot-dot) elements. The normal form for an empty path is an empty path. The normal form for a path ending in a directory-separator that is not the root directory is the same path with a current directory (dot) element appended.
The last sentence above is not necessary for POSIX-like or Windows-like operating systems, but supports systems like OpenVMS that use different syntax for directory and regular-file names.

New class path member functions

Synopsis

    path lexically_normal() const;
    path lexically_relative(const path& base) const;
    path lexically_proximate(const path& base) const;

Specification

path lexically_normal() const;

Overview: Returns *this with redundant current directory (dot), parent directory (dot-dot), and directory-separator elements removed.

Returns: *this in normal form.

Remarks: Uses operator/= to compose the returned path.

[Example:

    assert(path("foo/./bar/..").lexically_normal() == "foo");
    assert(path("foo/.///bar/../").lexically_normal() == "foo/.");

The above assertions will succeed. On Windows, the returned path's directory-separator characters will be backslashes rather than slashes, but that does not affect path equality. —end example]

path lexically_relative(const path& base) const;

Overview: Returns *this made relative to base. Treats empty or identical paths as corner cases, not errors. Does not resolve symlinks. Does not first normalize *this or base.

Remarks: Uses std::mismatch(begin(), end(), base.begin(), base.end()) to determine the first mismatched element of *this and base. Uses operator== to determine if elements match.

Returns:
- path() if the first mismatched element of *this is equal to begin() or the first mismatched element of base is equal to base.begin(), or
- path(".") if the first mismatched element of *this is equal to end() and the first mismatched element of base is equal to base.end(), or
- An object of class path composed via application of operator/=(path("..")) for each element in the half-open range [first mismatched element of base, base.end()), and then application of operator/= for each element in the half-open range [first mismatched element of *this, end()).

[Example:

    assert(path("/a/d").lexically_relative("/a/b/c") == "../../d");
    assert(path("/a/b/c").lexically_relative("/a/d") == "../b/c");
    assert(path("a/b/c").lexically_relative("a") == "b/c");
    assert(path("a/b/c").lexically_relative("a/b/c/x/y") == "../..");
    assert(path("a/b/c").lexically_relative("a/b/c") == ".");
    assert(path("a/b").lexically_relative("c/d") == "");

The above assertions will succeed. On Windows, the returned path's directory-separators will be backslashes rather than forward slashes, but that does not affect path equality. —end example]

[Note: If symlink following semantics are desired, use the operational function relative. —end note]

[Note: If normalization is needed to ensure consistent matching of elements, apply lexically_normal() to *this, base, or both. —end note]
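The three Returns cases for lexically_relative can be sketched as a free function. This is a hedged illustration of the wording, not the preliminary implementation: it assumes C++17 std::filesystem::path purely as an iterable container of path elements, and the name lexically_relative_sketch is hypothetical. The standardized std::filesystem::path::lexically_relative is deliberately not called, because it gives different results in some corner cases (for example when there is no common prefix).

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical free-function rendering of the member function's wording;
// p plays the role of *this.
fs::path lexically_relative_sketch(const fs::path& p, const fs::path& base) {
    // Find the first mismatched element of p and of base.
    auto mm = std::mismatch(p.begin(), p.end(), base.begin(), base.end());

    // Case 1: no common prefix at all (this also covers empty paths).
    if (mm.first == p.begin() || mm.second == base.begin())
        return fs::path();

    // Case 2: every element matched, so the paths are identical.
    if (mm.first == p.end() && mm.second == base.end())
        return fs::path(".");

    // Case 3: one ".." per remaining element of base, then the remaining
    // elements of p.
    fs::path result;
    for (auto it = mm.second; it != base.end(); ++it)
        result /= "..";
    for (auto it = mm.first; it != p.end(); ++it)
        result /= *it;
    return result;
}
```

Because no file system access occurs, the sketch can be exercised on paths that do not exist, which is the use case the lexical form is meant to serve.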

path lexically_proximate(const path& base) const;

Returns: If the value of lexically_relative(base) is not an empty path, return it. Otherwise return *this.

[Note: If symlink following semantics are desired, use the operational function proximate. —end note]

[Note: If normalization is needed to ensure consistent matching of elements, apply lexically_normal() to *this, base, or both. —end note]
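The fallback rule for lexically_proximate is small enough to show in full. The sketch below is an assumption-laden illustration: it uses C++17 std::filesystem::path::lexically_relative as a stand-in for the lexically_relative specified in this draft (the two differ in some corner cases), and the name lexically_proximate_sketch is hypothetical.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical free-function rendering; p plays the role of *this.
fs::path lexically_proximate_sketch(const fs::path& p, const fs::path& base) {
    fs::path rel = p.lexically_relative(base);  // stand-in for the draft's function
    // An empty result means no relative path could be formed,
    // so fall back to the path being made relative.
    return rel.empty() ? p : rel;
}
```

With these assumptions, an absolute p and a relative base share no common prefix, so the call returns p itself rather than an empty path that the caller would have to test for.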

New operational functions

Synopsis

    path weakly_canonical(const path& p);
    path weakly_canonical(const path& p, system::error_code& ec);
    path relative(const path& p, system::error_code& ec);
    path relative(const path& p, const path& base=current_path());
    path relative(const path& p, const path& base, system::error_code& ec);
    path proximate(const path& p, system::error_code& ec);
    path proximate(const path& p, const path& base=current_path());
    path proximate(const path& p, const path& base, system::error_code& ec);

Specification

path weakly_canonical(const path& p);
path weakly_canonical(const path& p, system::error_code& ec);

Overview: Returns p with symlinks resolved and the result normalized.

Returns: A path composed of the result of calling the canonical function on a path composed of the leading elements of p that exist, if any, followed by the elements of p that do not exist, if any.

Postcondition: The returned path is in normal form.

Remarks: Uses operator/= to compose the returned path. Uses the status function to determine existence.

Remarks: Implementations are encouraged to avoid unnecessary normalization such as when canonical has already been called on the entirety of p.

Throws: As specified in Error reporting.

path relative(const path& p, system::error_code& ec);

Returns: relative(p, current_path(), ec).

Throws: As specified in Error reporting.

path relative(const path& p, const path& base=current_path());
path relative(const path& p, const path& base, system::error_code& ec);

Overview: Returns p made relative to base. Treats empty or identical paths as corner cases, not errors. Resolves symlinks and normalizes both p and base before other processing.

Returns: weakly_canonical(p).lexically_relative(weakly_canonical(base)). The second form returns path() if an error occurs.

Throws: As specified in Error reporting.
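The Returns clause above composes two preparatory operational calls with one lexical call. A minimal sketch of both forms follows, assuming C++17 std::filesystem (whose weakly_canonical and lexically_relative stand in for the functions proposed here) and std::error_code in place of system::error_code. The name relative_sketch is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Throwing form: resolve and normalize both arguments against the file
// system, then compute the result purely lexically.
fs::path relative_sketch(const fs::path& p, const fs::path& base) {
    return fs::weakly_canonical(p).lexically_relative(fs::weakly_canonical(base));
}

// error_code form: returns path() if either preparatory call fails.
fs::path relative_sketch(const fs::path& p, const fs::path& base,
                         std::error_code& ec) {
    fs::path cp = fs::weakly_canonical(p, ec);
    if (ec) return fs::path();
    fs::path cb = fs::weakly_canonical(base, ec);
    if (ec) return fs::path();
    return cp.lexically_relative(cb);
}
```

Keeping all file system access in the two weakly_canonical calls means the operational and lexical functions share one definition of "relative", which is the design point made earlier in this proposal.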

path proximate(const path& p, system::error_code& ec);

Returns: proximate(p, current_path(), ec).

Throws: As specified in Error reporting.

path proximate(const path& p, const path& base=current_path());
path proximate(const path& p, const path& base, system::error_code& ec);

Returns: weakly_canonical(p).lexically_proximate(weakly_canonical(base)). The second form returns path() if an error occurs.

Throws: As specified in Error reporting.

© Copyright Beman Dawes 2015
Distributed under the Boost Software License, Version 1.0. See www.boost.org/LICENSE_1_0.txt
Revised 25 October 2015