//
// Copyright (c) 2023 Alan de Freitas (alandefreitas@gmail.com)
//
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)
//
// Official repository: https://github.com/boostorg/url
//

= Parsing

Algorithms which parse URLs return a view which references the underlying character buffer without taking ownership, avoiding memory allocations and copies.
The following example parses a string literal containing a https://datatracker.ietf.org/doc/html/rfc3986#section-3[__URI__,window=blank_]:

[source,cpp]
----
include::example$unit/snippets.cpp[tag=code_urls_parsing_1,indent=0]
----

The function returns a `result` which holds a `url_view` if the string is a valid URL.
Otherwise it holds an `error_code`.
It is impossible to construct a `url_view` which refers to an invalid URL.

[WARNING]
====
The caller is responsible for ensuring that the lifetime of the character buffer extends until it is no longer referenced by the view.
These are the same semantics as those of cpp:std::string_view[].
====

For convenience, a URL view can be constructed directly from the character buffer in a `string_view`.
In this case, it parses the string according to the https://datatracker.ietf.org/doc/html/rfc3986#section-4.1[__URI-reference__,window=blank_] grammar, throwing an exception upon failure.
The following two statements are equivalent:

[source,cpp]
----
include::example$unit/snippets.cpp[tag=code_urls_parsing_2,indent=0]
----

In this library, free functions which parse strings are named with the word "parse" followed by the name of the grammar used to match the string.

There are several varieties of URLs, and depending on the use case a particular grammar may be needed.
In the target of an HTTP GET request, for example, the scheme and fragment are omitted.
This corresponds to the https://datatracker.ietf.org/doc/html/rfc7230#section-5.3.1[__origin-form__,window=blank_] production rule described in https://tools.ietf.org/html/rfc7230[rfc7230,window=blank_].
The function `parse_origin_form` is suited for this purpose.
All the URL parsing functions are listed here:

[cols="a,a,a,a"]
|===
// Headers
|Function|Grammar|Example|Notes

// Row 1, Column 1
|`parse_absolute_uri`
// Row 1, Column 2
|https://datatracker.ietf.org/doc/html/rfc3986#section-4.3[__absolute-URI__,window=blank_]
// Row 1, Column 3
|`pass:[http://www.boost.org/index.html?field=value]`
// Row 1, Column 4
|No fragment

// Row 2, Column 1
|`parse_origin_form`
// Row 2, Column 2
|https://datatracker.ietf.org/doc/html/rfc7230#section-5.3.1[__origin-form__,window=blank_]
// Row 2, Column 3
|`pass:[/index.html?field=value]`
// Row 2, Column 4
|Used in HTTP

// Row 3, Column 1
|`parse_relative_ref`
// Row 3, Column 2
|https://datatracker.ietf.org/doc/html/rfc3986#section-4.2[__relative-ref__,window=blank_]
// Row 3, Column 3
|`pass:[//www.boost.org/index.html?field=value#downloads]`
// Row 3, Column 4
|

// Row 4, Column 1
|`parse_uri`
// Row 4, Column 2
|https://datatracker.ietf.org/doc/html/rfc3986#section-3[__URI__,window=blank_]
// Row 4, Column 3
|`pass:[http://www.boost.org/index.html?field=value#downloads]`
// Row 4, Column 4
|

// Row 5, Column 1
|`parse_uri_reference`
// Row 5, Column 2
|https://datatracker.ietf.org/doc/html/rfc3986#section-4.1[__URI-reference__,window=blank_]
// Row 5, Column 3
|`pass:[http://www.boost.org/index.html]`
// Row 5, Column 4
|Any __URI__ or __relative-ref__

|===

The URL is stored in its serialized form.
Therefore, it can always be easily output, sent, or embedded as part of a protocol:

// snippet_parsing_url_1bb
[source,cpp]
----
include::example$unit/snippets.cpp[tag=snippet_parsing_url_1bb,indent=0]
----

A `url` is an allocating container which owns its character buffer.
Upon construction from `url_view`, it allocates dynamic storage to hold a copy of the string.

// snippet_parsing_url_1bc
[source,cpp]
----
include::example$unit/snippets.cpp[tag=snippet_parsing_url_1bc,indent=0]
----

A `static_url` is a container which owns its character buffer for a URL whose maximum size is known.
Upon construction from `url_view`, it does not perform any dynamic memory allocations.

// snippet_parsing_url_1bd
[source,cpp]
----
include::example$unit/snippets.cpp[tag=snippet_parsing_url_1bd,indent=0]
----

== Result Type

These functions have a return type which uses the `result` alias template.
This type allows the parsing algorithms to report errors without resorting to exceptions.

The function `result::operator bool()` reports whether the result holds a value rather than an error, and `result::operator*` accesses that value.

// snippet_parsing_url_1
[source,cpp]
----
include::example$unit/snippets.cpp[tag=snippet_parsing_url_1,indent=0]
----

Once `result::operator bool()` has confirmed that the `result` does not contain an error, `result::operator*` provides an unchecked way to get the value from the `result`.

In contexts where it is acceptable to throw exceptions, `result::value` can be used directly.

// snippet_parsing_url_1b
[source,cpp]
----
include::example$unit/snippets.cpp[tag=snippet_parsing_url_1b,indent=0]
----

Check the reference for `result` for a synopsis of the type.
For complete information please consult the full https://www.boost.org/doc/libs/1_83_0//libs/system/doc/html/system.html#ref_resultt_e[`result`,window=blank_] documentation in https://www.boost.org/doc/libs/1_83_0//libs/system/doc/html/system.html[Boost.System,window=blank_].
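
The two access styles described in this section can be compared side by side in a minimal sketch.
The helper names `host_or_empty` and `host_or_throw` are hypothetical and not part of the library; the sketch assumes that `<boost/url.hpp>` is available and that the program is linked against Boost.URL.

[source,cpp]
----
#include <boost/url.hpp>
#include <string>

namespace urls = boost::urls;

// Hypothetical helper (not part of Boost.URL): returns the decoded host
// of a URI, or an empty string if `s` does not match the URI grammar.
std::string host_or_empty(std::string const& s)
{
    // parse_uri returns a result holding either a url_view or an error_code
    auto r = urls::parse_uri(s);
    if (!r)
        return {};              // r holds an error; r.error() would describe it

    // r holds a value here, so the unchecked operator* is safe.
    // The view refers to the characters of `s`, which outlives it.
    urls::url_view u = *r;
    return u.host();            // host() returns an owning std::string
}

// Hypothetical helper: same lookup, but lets value() throw
// boost::system::system_error when parsing fails.
std::string host_or_throw(std::string const& s)
{
    return urls::parse_uri(s).value().host();
}
----

Both helpers return an owning cpp:std::string[] rather than a view, so the caller does not inherit the lifetime requirement of the parsed buffer.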