docs/Writerside/topics/iterator.h.md

Sun, 16 Feb 2025 12:59:14 +0100

author
Mike Becker <universe@uap-core.de>
date
Sun, 16 Feb 2025 12:59:14 +0100
changeset 1216
151c1b7d5d50
parent 1215
790ff923f375
permissions
-rw-r--r--

add missing documentation about creating iterators

relates to #451

# Iterators

Iterators generalize the iteration over elements of an arbitrary collection.
This allows iteration over arrays, lists, maps, trees, etc. in a unified way.   

Creating an iterator is as simple as creating a `CxIterator` struct and setting the fields in a meaningful way.
The UCX collections provide various functions to create such iterators.

If the predefined fields are insufficient (or introduce too much bloat) for your use case,
you can alternatively create your own iterator structure
and place the `CX_ITERATOR_BASE` macro as first member of that structure.

```C
#include <cx/iterator.h>

struct my_fancy_iterator_s {
    CX_ITERATOR_BASE; // the base members used by cx_foreach()
    // ... custom fields ...
};
```

## Creating an Iterator

The following functions create iterators over plain C arrays:

```C
#include <cx/iterator.h>

CxIterator cxIterator(const void *array,
        size_t elem_size, size_t elem_count);
        
CxIterator cxMutIterator(void *array,
        size_t elem_size, size_t elem_count, bool remove_keeps_order);
        
CxIterator cxIteratorPtr(const void *array, size_t elem_count);
        
CxIterator cxMutIteratorPtr(void *array, size_t elem_count,
        bool remove_keeps_order);
```

The `cxIterator()` function creates an iterator over the elements of `array` where
each element is `elem_size` bytes large and the array contains a total of `elem_count` elements.
The `cxMutIterator()` function creates an equivalent [mutating iterator](#mutating-iterators). 

The `cxIteratorPtr()` and `cxMutIteratorPtr()` functions are equivalent to
the `cxIteratorPtr()` and `cxMutIteratorPtr()`, except they assume `sizeof(void*)` as the `elem_size`.

The UCX collections also define functions for creating iterators over their items.
You can read more about them in the respective Sections of the documentation.

## Using an Iterator

The following macros work with arbitrary structures using `CX_ITERATOR_BASE`
and invoke the respective function pointers `valid`, `current`, or `next`.
```C
cxIteratorValid(iter)
cxIteratorCurrent(iter)
cxIteratorNext(iter)
```

You may use them for manual iterator, but usually you do not need them.
Every iterator can be used with the `cx_foreach` macro.

```C
#include <cx/iterator.h>

// some custom array and its size
MyData *array = // ...
size_t size = // ...

CxIterator iter = cxIterator(array, sizeof(MyData), size);
cx_foreach(MyData*, elem, iter) {
    // .. do something with elem ..
}
```

The macro takes three arguments:
1. the pointer-type of a pointer to an element,
2. the name of the variable you want to use for accessing the element,
3. and the iterator.

> An iterator does not necessarily need to iterate over the elements of a collections.
> Map iterators, for example, can iterator over the key/value pairs,
> but they can also iterate over just the values or just the keys.
> 
> You should read the documentation of the function creating the iterator to learn
> what exactly the iterator is iterating over.

## Mutating Iterators

Usually an iterator is not mutating the collection it is iterating over.
But sometimes it is desirable to remove an element from the collection while iterating over it.

For this purpose, most collections allow the creation of a _mutating_ iterator.
On mutating iterators the `mutating` flag in the base structure is set to `true`,
and it is allowed to call the `cxFlagForRemoval()` function, which instructs the iterator to remove
the current element from the collection on the next call to `cxIteratorNext()` and clear the flag afterward.
If you are implementing your own iterator, it is up to you to implement this behavior.

## Passing Iterators to Functions

To eliminate the need of memory management for iterators, the structures are usually used by value.
This does not come with additional costs, because iteration is implemented entirely by macros.

However, sometimes it is necessary to pass an iterator to another function.
To make that possible in a generalized way, such functions should accept a `CxIteratorBase*` pointer
which can be obtained with the `cxIteratorRef()` macro on the calling site.

In the following example, elements from a list are inserted into a tree:

```C
CxList *list = // ...
CxTree *tree = // ...

CxIterator iter = cxListIterator(list);
cxTreeInsertIter(tree, cxIteratorRef(iter), cxListSize(list));
```

> This is the reason, why `CX_ITERATOR_BASE` must be the first member of any iterator structure.
> Otherwise, the address taken by `cxIteratorRef()` would not equal the address of the iterator.
{style="note"}

## Custom Iterators

The base structure is defined as follows:
```C
struct cx_iterator_base_s {
    bool (*valid)(const void *);
    void *(*current)(const void *);
    void *(*current_impl)(const void *);
    void (*next)(void *);
    bool mutating;
    bool remove;
};

typedef struct cx_iterator_base_s CxIteratorBase;
```

The `valid` function indicates whether the iterator is currently pointing to an element in the collection.
The `current` function is supposed to return that element,
and the `next` function shall advance the iterator to the next element.
The booleans `mutating` and `remove` are used for [mutating iterators](#mutating-iterators) as explained above.

Iterators may be wrapped in which case the original implementation can be stored in `current_impl` and
called by a wrapper implementation pointed to by `current`.
This can be useful when you want to support the `store_pointer` field of the [](collection.h.md) API.

A specialized, simple, and fast iterator over an array of a certain type,
that does not support mutation, can be implemented as follows:
```C
#include <cx/iterator.h>

typedef struct my_foo_s {
    // ... your data ...
} MyFoo;

typedef struct my_foo_iterator_s {
    CX_ITERATOR_BASE;
    MyFoo *array;
    size_t index;
    size_t elem_count;
} MyFooIterator;

static bool my_foo_iter_valid(const void *it) {
    const MyFooIterator *iter = it;
    return iter->index < iter->elem_count;
}

static void *my_foo_iter_current(const void *it) {
    const MyFooIterator *iter = it;
    return &iter->array[iter->index];
}

static void my_foo_iter_next(void *it) {
    MyFooIterator *iter = it;
    iter->index++;
}

MyFooIterator myFooIterator(MyFoo *array, size_t elem_count) {
    MyFooIterator iter;
  
    // base fields
    iter.base.valid = my_foo_iter_valid;
    iter.base.current = my_foo_iter_current;
    iter.base.next = my_foo_iter_next;
    iter.base.remove = false;
    iter.base.mutating = false;
    
    // custom fields
    iter.index = 0;
    iter.elem_count = elem_count;

    return iter;
}
```

> Note, that the behavior of `current` is undefined when `valid` returns `false`.
> That means, on the one hand, `current` does not need to check for validity of the iterator,
> but on the other hand it is forbidden to invoke `current` when `valid` would return `false`.
{style="note"}

> If performance matters in your application, it is recommended that you indeed create specialized iterators
> for your collections. The default UCX implementations trade some of the performance for generality.

<seealso>
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/iterator_8h.html">iterator.h</a>
</category>
</seealso>

mercurial