docs/Writerside/topics/properties.h.md

Sat, 01 Mar 2025 15:02:57 +0100

author
Mike Becker <universe@uap-core.de>
date
Sat, 01 Mar 2025 15:02:57 +0100

complete the properties documentation

relates to #451

# Properties

The UCX properties parser can be used to parse line-based key/value strings.

## Supported Syntax

Key/value pairs must be line-based, with key and value separated by a single-character delimiter.
The parser supports up to three different characters which introduce comments.
Everything from a comment character up to the end of the line is ignored.
Blank lines are also ignored.

An example properties file looks like this:

```properties
# Comment line at start of file
key1 = value1
key2 = value2
# next is a blank line and will be ignored

  keys_are_trimmed  =    and_so_are_values   # also a comment
```

> Delimiter and comment characters are configured with the `CxPropertiesConfig` structure.
> There is also a `continuation` field, which is reserved for a line continuation character
> in a future version of UCX.
> In UCX 3.1 this is not implemented.

## Basic Parsing

```C
#include <cx/properties.h>

typedef struct cx_properties_config_s {
    char delimiter;
    char comment1;
    char comment2;
    char comment3;
    // reserved for future use - not implemented in UCX 3.1
    char continuation;
} CxPropertiesConfig;

void cxPropertiesInit(CxProperties *prop, CxPropertiesConfig config);

void cxPropertiesInitDefault(CxProperties *prop);

void cxPropertiesDestroy(CxProperties *prop);

void cxPropertiesReset(CxProperties *prop);

int cxPropertiesFilln(CxProperties *prop,
        const char *buf, size_t len);

// where S is one of cxstring, cxmutstr, char*, const char*
int cxPropertiesFill(CxProperties *prop, S string);

CxPropertiesStatus cxPropertiesNext(CxProperties *prop,
        cxstring *key, cxstring *value);
        
void cxPropertiesUseStack(CxProperties *prop,
        char *buf, size_t capacity);
```

The first step is to initialize a `CxProperties` structure with a call to `cxPropertiesInit()` using the desired config.
The shorthand `cxPropertiesInitDefault()` uses a default configuration with the equals sign `'='` as delimiter
and the hash symbol `'#'` as comment symbol (the other two comment symbols remain unused in the default config).

> In a future UCX version, the default `continuation` character will be a backslash `'\'`.
> In UCX 3.1 this feature is not yet implemented.

The actual parsing is done by interleaving calls to `cxPropertiesFill()` (or `cxPropertiesFilln()`) and `cxPropertiesNext()`.
The `cxPropertiesFill()` function is a convenience function that accepts UCX strings and normal zero-terminated C strings and otherwise behaves like `cxPropertiesFilln()`.

Filling the input buffer is cost-free if there is no data already in the input buffer.
In that case, the input buffer only stores the pointer to the original data without creating a copy.
Each call to `cxPropertiesNext()` returns `CX_PROPERTIES_NO_ERROR` (= zero) for a key/value pair that is successfully parsed,
and stores the pointers and lengths for both the key and the value into the structures pointed to by the `key` and `value` arguments.

When all the data from the input buffer has been consumed successfully, `cxPropertiesNext()` returns `CX_PROPERTIES_NO_DATA`.

> This is all still free of any copies and allocations.
> That means, the pointers in `key` and `value` after `cxPropertiesNext()` returns will point into the input buffer.
> If you intend to store the key and/or the value somewhere else, it is strongly recommended to create a copy with `cx_strdup()`,
> because you will otherwise soon end up with a dangling pointer.
> {style="note"}

If `cxPropertiesNext()` returns `CX_PROPERTIES_INCOMPLETE_DATA`, it means that the input buffer is exhausted,
but the last line did not contain a full key/value pair.
In that case, you can call `cxPropertiesFill()` again to add more data and continue with `cxPropertiesNext()`.

Note that adding more data to a non-empty input buffer will lead to an allocation,
unless you specified some stack memory with `cxPropertiesUseStack()`.
To avoid allocations, the stack capacity must be large enough to contain the longest line in your data.
If the internal buffer is not large enough to contain a single line, it is extended.
If that is not possible for some reason, `cxPropertiesNext()` fails and returns `CX_PROPERTIES_BUFFER_ALLOC_FAILED`.

If you want to reuse a `CxProperties` structure with the same config, you can call `cxPropertiesReset()`, even if the last operation was a failure.
Otherwise, call `cxPropertiesDestroy()` when you are done with the parser.

> It is strongly recommended to always call `cxPropertiesDestroy()` when you are done with the parser,
> even if you do not expect any allocations because you used `cxPropertiesUseStack()`.

### List of Status Codes

Below is a full list of status codes for `cxPropertiesNext()`.

| Status Code                             | Meaning                                                                                                                                                                                                                 |
|-----------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CX_PROPERTIES_NO_ERROR                  | A key/value pair was found and returned.                                                                                                                                                                                |
| CX_PROPERTIES_NO_DATA                   | The input buffer does not contain more data.                                                                                                                                                                            |
| CX_PROPERTIES_INCOMPLETE_DATA           | The input ends unexpectedly. This can happen when the last line does not terminate with a line break, or when the input ends with a parsed key but no value. Use `cxPropertiesFill()` to add more data before retrying. |
| CX_PROPERTIES_NULL_INPUT                | The input buffer was never initialized. Probably you forgot to call `cxPropertiesFill()` at least once.                                                                                                                 |
| CX_PROPERTIES_INVALID_EMPTY_KEY         | Only whitespace was found on the left-hand side of the delimiter. Keys must not be empty.                                                                                                                               |
| CX_PROPERTIES_INVALID_MISSING_DELIMITER | A line contains data, but no delimiter.                                                                                                                                                                                 |
| CX_PROPERTIES_BUFFER_ALLOC_FAILED       | More internal buffer was needed, but could not be allocated.                                                                                                                                                            |


## Sources and Sinks

```C
#include <cx/properties.h>

CxPropertiesSource
cxPropertiesStringSource(cxstring str);

CxPropertiesSource
cxPropertiesCstrSource(const char *str);

CxPropertiesSource
cxPropertiesCstrnSource(const char *str, size_t len);

CxPropertiesSource
cxPropertiesFileSource(FILE *file, size_t chunk_size);
        
CxPropertiesSink
cxPropertiesMapSink(CxMap *map);

CxPropertiesStatus
cxPropertiesLoad(CxProperties *prop,
        CxPropertiesSink sink, CxPropertiesSource source);
```

The basic idea of `cxPropertiesLoad()` is that key/value pairs are extracted from a _source_ and ingested by a _sink_.
For the most common scenarios, where properties data is read from a string or a file and put into a map, several functions are available.
You can also specify your [own sources and sinks](#creating-own-sources-and-sinks).

The following example shows a simple function which loads all properties data from a file.
The `chunk_size` argument when creating the file source specifies
how many bytes are read from the file and filled into the properties parser in one read/sink cycle.

```C
#include <stdio.h>
#include <cx/properties.h>
#include <cx/hash_map.h>

int load_props_from_file(const char *filename, CxMap *map) {
    FILE *f = fopen(filename, "r");
    if (!f) return -1;
    CxProperties prop;
    cxPropertiesInitDefault(&prop);
    CxPropertiesSink sink = cxPropertiesMapSink(map);
    CxPropertiesSource src = cxPropertiesFileSource(f, 512);
    CxPropertiesStatus status = cxPropertiesLoad(&prop, sink, src);
    // release the parser, as recommended above
    cxPropertiesDestroy(&prop);
    fclose(f);
    return status;
}

// usage:
CxMap *map = cxHashMapCreateSimple(CX_STORE_POINTERS);
if (load_props_from_file("my-props.properties", map)) {
    // error handling
} else {
    // assuming my-props.properties contains the following line:
    // my-key = some value
    char *value = cxMapGet(map, "my-key");
}
```

> The function `cxPropertiesLoad()` should usually not return `CX_PROPERTIES_INCOMPLETE_DATA`, because the parser is automatically refilled from the source.
> If it does, it could mean that the source was unable to provide all the data, or the properties data ended unexpectedly.
> The status code to expect on success is `CX_PROPERTIES_NO_ERROR`, which means that at least one key/value pair was found.
> If `cxPropertiesLoad()` returns `CX_PROPERTIES_NO_DATA`, it means that the source did not provide any key/value pair.
> There are several special status codes which are documented [below](#additional-status-codes).

### Creating own Sources and Sinks

```C
#include <cx/properties.h>

typedef int(*cx_properties_read_init_func)(CxProperties *prop,
        CxPropertiesSource *src);

typedef int(*cx_properties_read_func)(CxProperties *prop,
        CxPropertiesSource *src, cxstring *target);

typedef void(*cx_properties_read_clean_func)(CxProperties *prop,
        CxPropertiesSource *src);

typedef int(*cx_properties_sink_func)(CxProperties *prop,
        CxPropertiesSink *sink, cxstring key, cxstring value);

typedef struct cx_properties_source_s {
    void *src;
    void *data_ptr;
    size_t data_size;
    cx_properties_read_func read_func;
    cx_properties_read_init_func read_init_func;
    cx_properties_read_clean_func read_clean_func;
} CxPropertiesSource;

typedef struct cx_properties_sink_s {
    void *sink;
    void *data;
    cx_properties_sink_func sink_func;
} CxPropertiesSink;
```

You can create your own sources and sinks by initializing the respective structures.
For a source, only the `read_func` is mandatory; the other two functions are optional and used for initialization and cleanup, if required.
The file source created by `cxPropertiesFileSource()`, for example,
uses the `read_init_func` to allocate the read buffer and the `read_clean_func` to free it.

Since the default map sink created by `cxPropertiesMapSink()` stores `char*` pointers into a map,
the following example uses a different sink, which stores the values as `cxmutstr` structures
that are automatically freed when the map gets destroyed.

```C
#include <stdio.h> 
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <cx/properties.h>
#include <cx/hash_map.h>

static int prop_mmap(CxProperties *prop, CxPropertiesSource *src) {
    struct stat s;
    int fd = open(src->src, O_RDONLY);
    if (fd < 0) return -1;
    // re-use the src field to store the fd
    // there are cleaner ways, but this is just for illustration
    src->src = (void*) fd;
    fstat(fd, &s);
    // memory map the entire file
    // and store the address and length in the properties source
    src->data_ptr = mmap(0, s.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    src->data_size = s.st_size;
    // mmap() signals failure with MAP_FAILED, not with NULL
    return src->data_ptr == MAP_FAILED;
}

static int prop_read(CxProperties *prop, CxPropertiesSource *src,
        cxstring *target) {
    // copy the address and length of the mapped data to the target 
    target->ptr = src->data_ptr;
    target->length = src->data_size;
    // set the new size to zero to indicate that there is no more data
    src->data_size = 0;
    return 0;
}

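static void prop_unmap(CxProperties *prop, CxPropertiesSource *src) {
    // prop_read() has reset data_size to zero,
    // so ask the file descriptor for the mapped length again
    int fd = (int) src->src;
    struct stat s;
    if (fstat(fd, &s) == 0) {
        munmap(src->data_ptr, s.st_size);
    }
    // close the file
    close(fd);
}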

static int prop_sink(CxProperties *prop, CxPropertiesSink *sink,
        cxstring key, cxstring value) {
    CxMap *map = sink->sink;
    // copy the string and store it into the map
    cxmutstr v = cx_strdup(value);
    int r = cxMapPut(map, key, &v);
    if (r != 0) cx_strfree(&v);
    return r;
}

int load_props_from_file(const char *filename, CxMap *map) {
    CxProperties prop;
    cxPropertiesInitDefault(&prop);
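    // zero-initialize, so that unused fields have defined values
    CxPropertiesSource src = {0};
    src.src = (void*) filename;
    src.read_init_func = prop_mmap;
    src.read_func = prop_read;
    src.read_clean_func = prop_unmap;
    CxPropertiesSink sink = {0};
    sink.sink = map;
    sink.sink_func = prop_sink;
    CxPropertiesStatus status = cxPropertiesLoad(&prop, sink, src);
    cxPropertiesDestroy(&prop);
    return status;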
}

int main() {
    // in contrast to the default map sink,
    // this one here stores the UCX strings by value
    CxMap *map = cxHashMapCreateSimple(sizeof(cxmutstr));
    
    // automatically free the UCX string when removed from the map
    cxDefineDestructor(map, cx_strfree);

    // use our custom load function to load the data from the file
    if (load_props_from_file("my-props.properties", map)) {
        fputs("Error reading properties.\n", stderr);
        return 1;
    }

    // output the read key/value pairs for illustration
    CxMapIterator iter = cxMapIterator(map);
    cx_foreach(CxMapEntry *, entry, iter) {
        cxstring k = cx_strn(entry->key->data, entry->key->len);
        cxmutstr *v = entry->value;
        printf("%.*s = %.*s\n",
            (int) k.length, k.ptr, (int) v->length, v->ptr);
    }

    // freeing the map also frees the strings
    // because we have registered cx_strfree() as destructor function
    cxMapFree(map);

    return 0;
}
```

> A cleaner implementation that does not produce a warning for bluntly casting an `int` to a `void*`
> can be achieved by declaring a struct that contains the information, allocating memory for
> that struct, and storing the pointer in `data_ptr`.
> For illustrating how properties sources and sinks can be implemented, this was not necessary.

### Additional Status Codes

For sources and sinks there are three additional special status codes,
which only appear as return values for `cxPropertiesLoad()`.

| Status Code                             | Meaning                                                                                                                                                                                                                 |
|-----------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| CX_PROPERTIES_READ_INIT_FAILED          | Initializing the properties source failed and the `cx_properties_read_init_func` returned non-zero.                                                                                                                     |
| CX_PROPERTIES_READ_FAILED               | Reading from a properties source failed and the `cx_properties_read_func` returned non-zero.                                                                                                                            |
| CX_PROPERTIES_SINK_FAILED               | Sinking a key/value-pair failed and the `cx_properties_sink_func` returned non-zero.                                                                                                                                    |


<seealso>
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/properties_8h.html">properties.h</a>
</category>
</seealso>
