Fri, 28 Feb 2025 19:07:47 +0100
write basic parsing documentation
relates to #451
# Properties

The UCX properties parser can be used to parse line-based key/value strings.

<warning>
New Feature - documentation work in progress!
</warning>

## Supported Syntax

Key/value pairs must be line-based and separated by a single-character delimiter.
The parser supports up to three different characters that introduce comments.
Everything from a comment character up to the end of the line is ignored.
Blank lines are also ignored.

An example properties file looks like this:

```properties
# Comment line at start of file
key1 = value1
key2 = value2
# next is a blank line and will be ignored

keys_are_trimmed = and_so_are_values # also a comment
```

> Delimiter and comment characters are configured with the `CxPropertiesConfig` structure.
> There is also a field reserved for `continuation` which will be used as a line continuation character
> in a future version of UCX.
> In UCX 3.1 this is not implemented.

## Basic Parsing

```C
#include <cx/properties.h>

typedef struct cx_properties_config_s {
    char delimiter;
    char comment1;
    char comment2;
    char comment3;

    // reserved for future use - not implemented in UCX 3.1
    char continuation;
} CxPropertiesConfig;

void cxPropertiesInit(CxProperties *prop, CxPropertiesConfig config);

void cxPropertiesInitDefault(CxProperties *prop);

void cxPropertiesDestroy(CxProperties *prop);

void cxPropertiesReset(CxProperties *prop);

int cxPropertiesFilln(CxProperties *prop,
        const char *buf, size_t len);

// where S is one of cxstring, cxmutstr, char*, const char*
int cxPropertiesFill(CxProperties *prop, S string);

CxPropertiesStatus cxPropertiesNext(CxProperties *prop,
        cxstring *key, cxstring *value);

void cxPropertiesUseStack(CxProperties *prop,
        char *buf, size_t capacity);
```

The first step is to initialize a `CxProperties` structure with a call to `cxPropertiesInit()` using the desired config.
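
To make the role of the config fields more concrete, here is a minimal stand-alone sketch in plain C of how a delimiter and up to three comment characters could decide the outcome for a single line. It is not the UCX implementation. All `sk_` names are hypothetical, and treating a zero byte as an unused comment slot is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical stand-in for the relevant CxPropertiesConfig fields.
typedef struct {
    char delimiter;
    char comment1, comment2, comment3; // 0 means "unused" (assumption)
} sk_config;

// Non-owning view into a line, similar in spirit to a pointer/length string.
typedef struct { const char *ptr; size_t len; } sk_str;

static int sk_is_comment(const sk_config *c, char ch) {
    return ch != '\0'
        && (ch == c->comment1 || ch == c->comment2 || ch == c->comment3);
}

// Strips leading blanks/tabs and trailing blanks/tabs/carriage returns.
static sk_str sk_trim(const char *p, size_t n) {
    while (n > 0 && (*p == ' ' || *p == '\t')) { p++; n--; }
    while (n > 0 && (p[n - 1] == ' ' || p[n - 1] == '\t' || p[n - 1] == '\r')) n--;
    return (sk_str){p, n};
}

// Returns 1 if a pair was found, 0 for a blank or comment-only line,
// -1 if the delimiter is missing, and -2 if the key is empty.
static int sk_parse_line(const sk_config *c, const char *line, size_t n,
                         sk_str *key, sk_str *value) {
    assert(c != NULL && line != NULL && key != NULL && value != NULL);
    // Cut the line at the first comment character.
    for (size_t i = 0; i < n; i++) {
        if (sk_is_comment(c, line[i])) { n = i; break; }
    }
    sk_str rest = sk_trim(line, n);
    if (rest.len == 0) return 0;
    const char *d = memchr(rest.ptr, c->delimiter, rest.len);
    if (d == NULL) return -1;
    size_t key_len = (size_t)(d - rest.ptr);
    *key = sk_trim(rest.ptr, key_len);
    if (key->len == 0) return -2;
    *value = sk_trim(d + 1, rest.len - key_len - 1);
    return 1;
}
```

In this sketch, the return values `-1` and `-2` loosely correspond to the situations that the real parser reports as `CX_PROPERTIES_INVALID_MISSING_DELIMITER` and `CX_PROPERTIES_INVALID_EMPTY_KEY`. As in the real parser, the resulting key and value only point into the original line and are not copies.
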
The shorthand `cxPropertiesInitDefault()` creates a default configuration with the equals sign `'='` as delimiter and the hash symbol `'#'` as comment symbol (the other two comment symbols remain unused in the default config).

> In a future UCX version, the default `continuation` character will be a backslash `'\'`.
> In UCX 3.1 this feature is not implemented yet.

The actual parsing interleaves calls to `cxPropertiesFill()` (or `cxPropertiesFilln()`) and `cxPropertiesNext()`.

The `cxPropertiesFill()` function is a convenience function that accepts UCX strings and normal zero-terminated C strings and otherwise behaves like `cxPropertiesFilln()`.

Filling the input buffer is cost-free if there is no data already in the input buffer.
In that case, the input buffer only stores the pointer to the original data without creating a copy.
Calling `cxPropertiesNext()` returns `CX_PROPERTIES_NO_ERROR` (= zero) for each key/value pair that is successfully parsed,
and stores the pointers and lengths for both the key and the value into the structures pointed to by the `key` and `value` arguments.

> This is all still free of any copies and allocations.
> That means the pointers in `key` and `value` after `cxPropertiesNext()` returns will point into the input buffer.
> If you intend to store the key and/or the value somewhere else, it is strongly recommended to create a copy with `cx_strdup()`,
> because you will otherwise soon end up with a dangling pointer.
> {style="note"}

If `cxPropertiesNext()` returns `CX_PROPERTIES_INCOMPLETE_DATA`, the input buffer is exhausted, but the last line did not contain a full key/value pair.
In that case, you can call `cxPropertiesFill()` again to add more data and continue with `cxPropertiesNext()`.

Note that adding more data to a non-empty input buffer will lead to an allocation, unless you specified some stack memory with `cxPropertiesUseStack()`.
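
The fill/next interplay described above can be illustrated with a small stand-alone sketch in plain C. It is not the UCX implementation: the `sk_` names are hypothetical, the parser only splits lines instead of key/value pairs, and it always copies into a fixed-size buffer, which here plays the role of the memory passed to `cxPropertiesUseStack()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SK_LINE, SK_NO_DATA, SK_INCOMPLETE } sk_status;

// Hypothetical mini-parser with a fixed-size buffer.
typedef struct {
    char buf[64];  // stands in for caller-provided stack memory
    size_t len;    // number of valid bytes in buf
    size_t pos;    // position of the next unparsed byte
} sk_parser;

// Appends data behind the unparsed remainder; returns non-zero if it does not fit.
static int sk_fill(sk_parser *p, const char *data, size_t n) {
    assert(p != NULL && data != NULL);
    size_t rest = p->len - p->pos;
    if (n > sizeof p->buf - rest) return 1;
    memmove(p->buf, p->buf + p->pos, rest); // keep the partial line
    memcpy(p->buf + rest, data, n);
    p->len = rest + n;
    p->pos = 0;
    return 0;
}

// Yields the next complete line (without the line break), pointing into buf.
static sk_status sk_next(sk_parser *p, const char **line, size_t *line_len) {
    if (p->pos == p->len) return SK_NO_DATA;
    const char *start = p->buf + p->pos;
    const char *nl = memchr(start, '\n', p->len - p->pos);
    if (nl == NULL) return SK_INCOMPLETE; // caller has to fill and retry
    *line = start;
    *line_len = (size_t)(nl - start);
    p->pos += *line_len + 1;
    return SK_LINE;
}

// Feeds the chunks one by one and counts the complete lines.
// Returns -1 on buffer overflow or if the input ends with a partial line.
static int sk_count_lines(const char *chunks[], size_t nchunks) {
    sk_parser p = { .len = 0, .pos = 0 };
    sk_status st = SK_NO_DATA;
    int count = 0;
    for (size_t i = 0; i < nchunks; i++) {
        if (sk_fill(&p, chunks[i], strlen(chunks[i])) != 0) return -1;
        const char *line;
        size_t len;
        while ((st = sk_next(&p, &line, &len)) == SK_LINE) count++;
    }
    return st == SK_INCOMPLETE ? -1 : count;
}
```

Note that in this sketch a line returned by `sk_next()` becomes invalid as soon as `sk_fill()` moves the buffer contents. This is the same kind of dangling-pointer hazard that motivates copying keys and values you want to keep.
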
The stack capacity must be large enough to contain the longest line in your data.
If the internal buffer is not large enough to contain a single line, it is extended.
If that is not possible for some reason, `cxPropertiesNext()` fails and returns `CX_PROPERTIES_BUFFER_ALLOC_FAILED`.

If you want to reuse a `CxProperties` structure with the same config, you can call `cxPropertiesReset()`, even if the last operation was a failure.
Otherwise, you should always call `cxPropertiesDestroy()` when you are done with the parser.

> It is strongly recommended to always call `cxPropertiesDestroy()` when you are done with the parser,
> even if you did not expect any allocations because you used `cxPropertiesUseStack()`.

### List of Status Codes

Below is a full list of status codes for `cxPropertiesNext()`.

| Status Code                             | Meaning                                                                                                                                                                                                                   |
|-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CX_PROPERTIES_NO_ERROR                  | A key/value pair was found and returned.                                                                                                                                                                                  |
| CX_PROPERTIES_NO_DATA                   | The input buffer does not contain more data.                                                                                                                                                                              |
| CX_PROPERTIES_INCOMPLETE_DATA           | The input ends unexpectedly. This can happen when the last line does not terminate with a line break, or when the input ends with a parsed key but no value. Use `cxPropertiesFill()` to add more data before retrying. |
| CX_PROPERTIES_NULL_INPUT                | The input buffer was never initialized. You probably forgot to call `cxPropertiesFill()` at least once.                                                                                                                   |
| CX_PROPERTIES_INVALID_EMPTY_KEY         | Only whitespace was found on the left-hand side of the delimiter. Keys must not be empty.                                                                                                                                 |
| CX_PROPERTIES_INVALID_MISSING_DELIMITER | A line contains data, but no delimiter.                                                                                                                                                                                   |
| CX_PROPERTIES_BUFFER_ALLOC_FAILED       | More internal buffer was needed, but could not be allocated.                                                                                                                                                              |

## Sources and Sinks

```C
#include <cx/properties.h>

CxPropertiesSource
cxPropertiesStringSource(cxstring str);

CxPropertiesSource
cxPropertiesCstrSource(const char *str);

CxPropertiesSource
cxPropertiesCstrnSource(const char *str, size_t len);

CxPropertiesSource
cxPropertiesFileSource(FILE *file, size_t chunk_size);

CxPropertiesSink
cxPropertiesMapSink(CxMap *map);

CxPropertiesStatus
cxPropertiesLoad(CxProperties *prop,
        CxPropertiesSink sink, CxPropertiesSource source);
```

<warning>
TODO: write documentation
</warning>

### Additional Status Codes

For sources and sinks there are three additional status codes,
which only appear as return values of `cxPropertiesLoad()`.

| Status Code                    | Meaning                                                                                             |
|--------------------------------|-----------------------------------------------------------------------------------------------------|
| CX_PROPERTIES_READ_INIT_FAILED | Initializing the properties source failed and the `cx_properties_read_init_func` returned non-zero. |
| CX_PROPERTIES_READ_FAILED      | Reading from a properties source failed and the `cx_properties_read_func` returned non-zero.        |
| CX_PROPERTIES_SINK_FAILED      | Sinking a key/value pair failed and the `cx_properties_sink_func` returned non-zero.                |

### Creating own Sources and Sinks

```C
#include <cx/properties.h>

typedef int(*cx_properties_read_init_func)(CxProperties *prop,
        CxPropertiesSource *src);

typedef int(*cx_properties_read_func)(CxProperties *prop,
        CxPropertiesSource *src, cxstring *target);

typedef void(*cx_properties_read_clean_func)(CxProperties *prop,
        CxPropertiesSource *src);

typedef int(*cx_properties_sink_func)(CxProperties *prop,
        CxPropertiesSink *sink, cxstring key, cxstring value);

typedef struct cx_properties_source_s {
    void *src;
    void *data_ptr;
    size_t data_size;
    cx_properties_read_func read_func;
    cx_properties_read_init_func read_init_func;
    cx_properties_read_clean_func read_clean_func;
} CxPropertiesSource;

typedef struct cx_properties_sink_s {
    void *sink;
    void *data;
    cx_properties_sink_func sink_func;
} CxPropertiesSink;
```

<warning>
TODO: write documentation
</warning>

<seealso>
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/properties_8h.html">properties.h</a>
</category>
</seealso>