[/
    Copyright (c) 2019-2024 Ruben Perez Hidalgo (rubenperez038 at gmail dot com)

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
]

[section:sql_formatting_advanced (Experimental) Advanced client-side SQL query formatting]
[nochunk]

[heading Extending format_sql and format_context]

You can specialize [reflink formatter] to add formatting support to your own types.
The notation resembles `std::format`, but is much simpler, since format specs are not supported.

[sql_formatting_formatter_specialization]

The type can now be used in [reflink format_sql], [reflink format_sql_to]
and [refmem basic_format_context append_value]:

[sql_formatting_formatter_use]

[heading:format_string_syntax Format string syntax]

This section expands on the syntax supported for format strings.
The syntax is similar to the one in `fmtlib`.

A format string is composed of regular text and replacement fields.
Regular text is output verbatim, while replacement fields are substituted
by formatted arguments. For instance, in `"SELECT {} FROM employee"`,
`"SELECT "` and `" FROM employee"` are regular text, and `"{}"` is a replacement field.

A `{}` is called an [*automatic indexing] replacement field. Arguments are substituted
in the order in which they were provided to the format function. For instance:

[sql_formatting_auto_indexing]

A field index can be included within the braces. This is called [*manual indexing].
Indices can appear in any order, and can be repeated:

[sql_formatting_manual_indices]

Finally, you can use named arguments by using the initializer-list overloads,
which create [reflink format_arg] values:

[sql_formatting_named_args]

Argument names can only contain ASCII letters (lowercase and uppercase),
digits and the underscore character (`_`). Names can't start with a digit.
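
The naming rule above can be illustrated with a small standalone check.
This is only a sketch: `is_valid_arg_name` and `check_arg_names` are hypothetical helpers
written for this example, they are not part of the library, and the library's own
validation may differ in details (for instance, rejecting empty names is an assumption made here).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Boost.MySQL: checks a candidate
// argument name against the naming rule described above.
bool is_valid_arg_name(const std::string& name)
{
    // Assumption: an empty name is rejected (an empty field, `{}`,
    // means automatic indexing rather than a named argument).
    if (name.empty())
        return false;

    // Names can't start with a digit
    if (name[0] >= '0' && name[0] <= '9')
        return false;

    // Only ASCII letters, digits and underscores are allowed
    for (char c : name)
    {
        bool is_letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        bool is_digit = c >= '0' && c <= '9';
        if (!is_letter && !is_digit && c != '_')
            return false;
    }
    return true;
}

// Example checks exercising the rule
void check_arg_names()
{
    assert(is_valid_arg_name("employee_id"));
    assert(is_valid_arg_name("_tmp2"));
    assert(!is_valid_arg_name("2nd_name"));    // starts with a digit
    assert(!is_valid_arg_name("first-name"));  // '-' is not allowed
}
```

Bytes outside the ASCII range (such as those in UTF-8 multi-byte sequences) fail both range checks,
so the sketch rejects names containing them.
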

Format strings can use either manual or automatic indexing, but can't mix them:

[sql_formatting_manual_auto_mix]

Named arguments can be mixed with either manual or automatic indexing.

Unreferenced format arguments are ignored. It's not an error to supply more
format arguments than required:

[sql_formatting_unused_args]

You can output a brace literal by doubling it:

[sql_formatting_brace_literal]

Format specifiers (e.g. `{:g}`), common in `fmtlib`, are not allowed.
There is usually a single, canonical representation for each type in MySQL,
so there is no need to format a type in different ways.
This also keeps the implementation simpler.

Format strings must be encoded according to [refmem format_options charset].
Otherwise, an error will be generated.

[heading:error_handling Error handling model]

Some values can't be securely formatted. For instance, a C++ `double` can be
NaN or infinity, neither of which is supported by MySQL. Strings can contain byte sequences
that don't represent valid characters, which makes them impossible to escape securely.

[reflink format_sql] reports these errors by throwing `boost::system::system_error` exceptions,
which contain an error code with details about what happened. For instance:

[sql_formatting_format_double_error]

You don't have to use exceptions, though. [reflink basic_format_context] and
[reflink format_sql_to] use [link mysql.error_handling.system_result `boost::system::result`] instead.

[reflink basic_format_context] contains an error code that is set when formatting
a value fails. This is called the ['error state], and can be queried using
[refmem basic_format_context error_state].
When [refmem basic_format_context get] is called (after all individual values have been formatted),
the error state is checked.
The `system::result` returned by `get` will contain the error
state if it was set, or the generated query if it was not:

[sql_formatting_no_exceptions]

Rationale: the error state mechanism makes composing formatters easier,
as the error state is checked only once.

Errors caused by invalid format strings are also reported using this mechanism.

[heading:format_options Format options and character set tracking]

MySQL has many configuration options that affect its syntax. There are two options
that formatting functions need to know in order to work:

* Whether the backslash character represents an escape sequence or not. By default it does,
  but this can be disabled dynamically by setting the
  [@https://dev.mysql.com/doc/refman/8.0/en/sql-mode.html#sqlmode_no_backslash_escapes NO_BACKSLASH_ESCAPES] SQL mode.
  This is tracked by [reflink any_connection] automatically
  (see [refmem any_connection backslash_escapes]).
* The connection's [*current character set]. This determines which multi-byte sequences are valid,
  and is required to iterate over and escape strings. The current character set is tracked
  by connections as far as possible, but deficiencies in the protocol create cases where
  the character set may not be known to the client. The current character set
  can be accessed using [refmem any_connection current_character_set].

[refmem any_connection format_opts] is a convenience function that returns a
[link mysql.error_handling.system_result `boost::system::result`]`<`[reflink format_options]`>`.
If the connection could not determine the current character set, the result will contain an error.

For a reference on how character set tracking works, please read [link mysql.charsets.tracking this section].

[warning
Passing an incorrect `format_options` value to formatting functions may cause
escaping to generate incorrect values, which may generate vulnerabilities.
Stay safe and always use [refmem any_connection format_opts] instead of
hand-crafting `format_options` values.
This way, if the character set can't be safely determined,
you will get a [link mysql.sql_formatting.unknown_character_set `client_errc::unknown_character_set`] error
instead of a vulnerability.
]

[heading Custom string types]

[reflink format_sql_to] can be used with string types that are not `std::string`,
as long as they satisfy the [reflink OutputString] concept. This includes
strings with custom allocators (like `std::pmr::string`) and `boost::static_string`.
You need to use [reflink basic_format_context], specifying the string type:

[sql_formatting_custom_string]

[heading Re-using string memory]

You can pass a string value to the context's constructor to re-use its memory:

[sql_formatting_memory_reuse]

[heading Raw string escaping]

If you're building a SQL framework, or otherwise performing very low-level tasks, you may need
to just escape a string, without quoting or formatting it. You can use [reflink escape_string],
which mimics [@https://dev.mysql.com/doc/c-api/8.0/en/mysql-real-escape-string.html `mysql_real_escape_string`].

[note Don't use this unless you know what you're doing.]

[endsect]