docs/src/features.md

branch
docs/3.1
changeset 1140
88a9ee79c102
parent 1139
7dfa5bcf39ee
child 1141
a06a2d27c043
equal deleted inserted replaced
1139:7dfa5bcf39ee 1140:88a9ee79c102
1 ---
2 title: UCX Features
3 ---
4
5 <div id="modules">
6
7 ------------------------ ------------------------- ------------------- ---------------------------------
8 [Allocator](#allocator) [String](#string) [Buffer](#buffer) [Memory&nbsp;Pool](#memory-pool)
9 [Iterator](#iterator) [Collection](#collection) [List](#list) [Map](#map)
10 [Utilities](#utilities)
11 ------------------------ ------------------------- ------------------- ---------------------------------
12
13 </div>
14
15 ## Allocator
16
17 *Header file:* [allocator.h](api/allocator_8h.html)
18
19 The UCX allocator provides an interface for implementing an own memory allocation mechanism.
20 Various function in UCX provide an additional alternative signature that takes an allocator as
21 argument. A default allocator implementation using the stdlib memory management functions is
22 available via the global symbol `cxDefaultAllocator`.
23
24 If you want to define your own allocator, you need to initialize the `CxAllocator` structure
25 with a pointer to an allocator class (containing function pointers for the memory management
26 functions) and an optional pointer to an arbitrary memory region that can be used to store
27 state information for the allocator. An example is shown below:
28
29 ```c
30 struct my_allocator_state {
31 size_t total;
32 size_t avail;
33 char mem[];
34 };
35
36 static cx_allocator_class my_allocator_class = {
37 my_malloc_impl,
38 my_realloc_impl, // all these functions are somewhere defined
39 my_calloc_impl,
40 my_free_impl
41 };
42
43 CxAllocator create_my_allocator(size_t n) {
44 CxAllocator alloc;
45 alloc.cl = &my_allocator_class;
46 alloc.data = calloc(1, sizeof(struct my_allocator_state) + n);
47 return alloc;
48 }
49 ```
50
51 ## String
52
53 *Header file:* [string.h](api/string_8h.html)
54
55 UCX strings come in two variants: immutable (`cxstring`) and mutable (`cxmutstr`).
56 The functions of UCX are designed to work with immutable strings by default but in situations where it is necessary,
57 the API also provides alternative functions that work directly with mutable strings.
58 Functions that change a string in-place are, of course, only accepting mutable strings.
59
60 When you are using UCX functions, or defining your own functions, you are sometimes facing the "problem",
61 that the function only accepts arguments of type `cxstring` but you only have a `cxmutstr` at hand.
62 In this case you _should not_ introduce a wrapper function that accepts the `cxmutstr`,
63 but instead you should use the `cx_strcast()` function to cast the argument to the correct type.
64
65 In general, UCX strings are **not** necessarily zero-terminated. If a function guarantees to return zero-terminated
66 string, it is explicitly mentioned in the documentation of the respective function.
67 As a rule of thumb, you _should not_ pass the strings of a UCX string structure to another API without explicitly
68 ensuring that the string is zero-terminated.
69
70 ## Buffer
71
72 *Header file:* [buffer.h](api/buffer_8h.html)
73
74 Instances of this buffer implementation can be used to read from or write to memory like you would do with a stream.
75 This allows the use of `cx_stream_copy()` (see [Utilities](#utilities)) to copy contents from one buffer to another,
76 or from a file or network streams to the buffer and vice-versa.
77
78 More features for convenient use of the buffer can be enabled, like automatic memory management and automatic
79 resizing of the buffer space.
80
81 Since UCX 3.0, the buffer also supports automatic flushing of contents to another stream (or buffer) as an alternative
82 to automatically resizing the buffer space.
83 Please refer to the API doc for the fields prefixed with `flush_` to learn more.
84
85 ## Memory Pool
86
87 *Header file:* [mempool.h](api/mempool_8h.html)
88
89 A memory pool is providing an allocator implementation that automatically deallocates the memory upon its destruction.
90 It also allows you to register destructor functions for the allocated memory, which are automatically called before
91 the memory is deallocated.
92 Additionally, you may also register _independent_ destructor functions within a pool in case some external library
93 allocated memory for you, which should be freed together with this pool.
94
95 Many UCX features support the use of an allocator.
96 The [strings](#string), for instance, provide several functions suffixed with `_a` that allow specifying an allocator.
97 You can use this to keep track of the memory occupied by dynamically allocated strings and cleanup everything with
98 just a single call to `cxMempoolFree()`.
99
100 The following code illustrates this on the example of reading a CSV file into memory.
101 ```C
102 #include <stdio.h>
103 #include <cx/mempool.h>
104 #include <cx/linked_list.h>
105 #include <cx/string.h>
106 #include <cx/buffer.h>
107 #include <cx/utils.h>
108
109 typedef struct {
110 cxstring column_a;
111 cxstring column_b;
112 cxstring column_c;
113 } CSVData;
114
115 int main(void) {
116 CxMempool* pool = cxBasicMempoolCreate(128);
117
118 FILE *f = fopen("test.csv", "r");
119 if (!f) {
120 perror("Cannot open file");
121 return 1;
122 }
123 // close the file automatically at pool destruction
124 cxMempoolRegister(pool, f, (cx_destructor_func) fclose);
125
126 // create a buffer using the memory pool for destruction
127 CxBuffer *content = cxBufferCreate(NULL, 256, pool->allocator, CX_BUFFER_AUTO_EXTEND);
128
129 // read the file into the buffer and turn it into a string
130 cx_stream_copy(f, content, (cx_read_func) fread, cxBufferWriteFunc);
131 fclose(f);
132 cxstring contentstr = cx_strn(content->space, content->size);
133
134 // split the string into lines - use the mempool for allocating the target array
135 cxstring* lines;
136 size_t lc = cx_strsplit_a(pool->allocator, contentstr,
137 CX_STR("\n"), SIZE_MAX, &lines);
138
139 // skip the header and parse the remaining data into a linked list
140 // the nodes of the linked list shall also be allocated by the mempool
141 CxList* datalist = cxLinkedListCreate(pool->allocator, NULL, sizeof(CSVData));
142 for (size_t i = 1 ; i < lc ; i++) {
143 if (lines[i].length == 0) continue;
144 cxstring fields[3];
145 size_t fc = cx_strsplit(lines[i], CX_STR(";"), 3, fields);
146 if (fc != 3) {
147 fprintf(stderr, "Syntax error in line %zu.\n", i);
148 cxMempoolFree(pool);
149 return 1;
150 }
151 CSVData data;
152 data.column_a = fields[0];
153 data.column_b = fields[1];
154 data.column_c = fields[2];
155 cxListAdd(datalist, &data);
156 }
157
158 // iterate through the list and output the data
159 CxIterator iter = cxListIterator(datalist);
160 cx_foreach(CSVData*, data, iter) {
161 printf("Column A: %.*s | "
162 "Column B: %.*s | "
163 "Column C: %.*s\n",
164 (int)data->column_a.length, data->column_a.ptr,
165 (int)data->column_b.length, data->column_b.ptr,
166 (int)data->column_c.length, data->column_c.ptr
167 );
168 }
169
170 // cleanup everything, no manual free() needed
171 cxMempoolFree(pool);
172
173 return 0;
174 }
175 ```
176
177 ## Iterator
178
179 *Header file:* [iterator.h](api/iterator_8h.html)
180
181 In UCX 3 a new feature has been introduced to write own iterators, that work with the `cx_foreach` macro.
182 In previous UCX releases there were different hard-coded foreach macros for lists and maps that were not customizable.
183 Now, creating an iterator is as simple as creating a `CxIterator` struct and setting the fields in a meaningful way.
184
185 You do not always need all fields in the iterator structure, depending on your use case.
186 Sometimes you only need the `index` (for example when iterating over simple lists), and other times you will need the
187 `slot` and `kv_data` fields (for example when iterating over maps).
188
189 If the predefined fields are insufficient for your use case, you can alternatively create your own iterator structure
190 and place the `CX_ITERATOR_BASE` macro as first member of that structure.
191
192 Usually an iterator is not mutating the collection it is iterating over.
193 In some programming languages it is even disallowed to change the collection while iterating with foreach.
194 But sometimes it is desirable to remove an element from the collection while iterating over it.
195 For this purpose, most collections allow the creation of a _mutating_ iterator.
196 The only differences are, that the `mutating` flag is `true` and the `src_handle` is not const.
197 On mutating iterators it is allowed to call the `cxFlagForRemoval()` function, which instructs the iterator to remove
198 the current element from the collection on the next call to `cxIteratorNext()` and clear the flag afterward.
199 If you are implementing your own iterator, it is up to you to implement this behavior.
200
201 ## Collection
202
203 *Header file:* [collection.h](api/collection_8h.html)
204
205 Collections in UCX 3 have several common features.
206 If you want to implement an own collection data type that uses the same features, you can use the
207 `CX_COLLECTION_BASE` macro at the beginning of your struct to roll out all members a usual UCX collection has.
208 ```c
209 struct my_fancy_collection_s {
210 CX_COLLECTION_BASE;
211 struct my_collection_data_s *data;
212 };
213 ```
214 Based on this structure, this header provides some convenience macros for invoking the destructor functions
215 that are part of the basic collection members.
216 The idea of having destructor functions within a collection is that you can destroy the collection _and_ the
217 contents with one single function call.
218 When you are implementing a collection, you are responsible for invoking the destructors at the right places, e.g.
219 when removing (and deleting) elements in the collection, clearing the collection, or - the most prominent case -
220 destroying the collection.
221
222 You can always look at the UCX list and map implementations if you need some inspiration.
223
224 ## List
225
226 *Header file:* [list.h](api/list_8h.html)
227
228 This header defines a common interface for all list implementations.
229
230 UCX already comes with two common list implementations (linked list and array list) that should cover most use cases.
231 But if you feel the need to implement an own list, the only thing you need to do is to define a struct with a
232 `struct cx_list_s` as first member, and set an appropriate list class that implements the functionality.
233 It is strongly recommended that this class is shared among all instances of the same list type, because otherwise
234 the `cxListCompare` function cannot use the optimized implementation of your class and will instead fall back to
235 using iterators to compare the contents element-wise.
236
237 ### Linked List
238
239 *Header file:* [linked_list.h](api/linked__list_8h.html)
240
241 On top of implementing the list interface, this header also defines several low-level functions that
242 work with arbitrary structures.
243 Low-level functions, in contrast to the high-level list interface, can easily be recognized by their snake-casing.
244 The function `cx_linked_list_at`, for example, implements a similar functionality like `cxListAt`, but operates
245 on arbitrary structures.
246 The following snippet shows how it is used.
247 All other low-level functions work similarly.
248 ```c
249 struct node {
250 node *next;
251 node *prev;
252 int data;
253 };
254
255 const ptrdiff_t loc_prev = offsetof(struct node, prev);
256 const ptrdiff_t loc_next = offsetof(struct node, next);
257 const ptrdiff_t loc_data = offsetof(struct node, data);
258
259 struct node a = {0}, b = {0}, c = {0}, d = {0};
260 cx_linked_list_link(&a, &b, loc_prev, loc_next);
261 cx_linked_list_link(&b, &c, loc_prev, loc_next);
262 cx_linked_list_link(&c, &d, loc_prev, loc_next);
263
264 cx_linked_list_at(&a, 0, loc_next, 2); // returns pointer to c
265 ```
266
267 ### Array List
268
269 *Header file:* [array_list.h](api/array__list_8h.html)
270
271 Since low-level array lists are just plain arrays, there is no need for such many low-level functions as for linked
272 lists.
273 However, there is one extremely powerful function that can be used for several complex tasks: `cx_array_copy`.
274 The full signature is shown below:
275 ```c
276 int cx_array_copy(
277 void **target,
278 void *size,
279 void *capacity,
280 unsigned width,
281 size_t index,
282 const void *src,
283 size_t elem_size,
284 size_t elem_count,
285 struct cx_array_reallocator_s *reallocator
286 );
287 ```
288 The `target` argument is a pointer to the target array pointer.
289 The reason for this additional indirection is that this function writes
290 back the pointer to the possibly reallocated array.
291 The next two arguments are pointers to the `size` and `capacity` of the target array for which the width
292 (in bits) is specified in the `width` argument.
293
294 On a successful invocation, the function copies `elem_count` number of elements, each of size `elem_size` from
295 `src` to `*target` and uses the `reallocator` to extend the array when necessary.
296 Finally, the size, capacity, and the pointer to the array are all updated and the function returns zero.
297
298 A few things to note:
299 * `*target` and `src` can point to the same memory region, effectively copying elements within the array with `memmove`
300 * `*target` does not need to point to the start of the array, but `size` and `capacity` always start counting from the
301 position, `*target` points to - in this scenario, the need for reallocation must be avoided for obvious reasons
302 * `index` does not need to be within size of the current array
303 * `index` does not even need to be within the capacity of the array
304 * `width` must be one of 8, 16, 32, 64 (only on 64-bit systems), or zero (in which case the native word width is used)
305
306 If you just want to add one single element to an existing array, you can use the macro `cx_array_add()`.
307 You can use `CX_ARRAY_DECLARE()` to declare the necessary fields within a structure and then use the
308 `cx_array_simple_*()` convenience macros to reduce code overhead.
309 The convenience macros automatically determine the width of the size/capacity variables.
310
311 ## Map
312
313 *Header file:* [map.h](api/map_8h.html)
314
315 Similar to the list interface, the map interface provides a common API for implementing maps.
316 There are some minor subtle differences, though.
317
318 First, the `remove` method is not always a destructive removal.
319 Instead, the last argument is a Boolean that indicates whether the element shall be destroyed or returned.
320 ```c
321 void *(*remove)(CxMap *map, CxHashKey key, bool destroy);
322 ```
323 When you implement this method, you are either supposed to invoke the destructors and return `NULL`,
324 or just remove the element from the map and return it.
325
326 Secondly, the iterator method is a bit more complete. The signature is as follows:
327 ```c
328 CxIterator (*iterator)(const CxMap *map, enum cx_map_iterator_type type);
329 ```
330 There are three map iterator types: for values, for keys, for pairs.
331 Depending on the iterator type requested, you need to create an iterator with the correct methods that
332 return the requested thing.
333 There are no automatic checks to enforce this - it's completely up to you.
334 If you need inspiration on how to do that, check the hash map implementation that comes with UCX.
335
336 ### Hash Map
337
338 *Header file:* [hash_map.h](api/hash__map_8h.html)
339
340 UCX provides a basic hash map implementation with a configurable amount of buckets.
341 If you do not specify the number of buckets, a default of 16 buckets will be used.
342 You can always rehash the map with `cxMapRehash()` to change the number of buckets to something more efficient,
343 but you need to be careful, because when you use this function you are effectively locking into using this
344 specific hash map implementation, and you would need to remove all calls to this function when you want to
345 exchange the concrete map implementation with something different.
346
347 ## Utilities
348
349 *Header file:* [utils.h](api/utils_8h.html)
350
351 UCX provides some utilities for routine tasks.
352
353 The most useful utilities are the *stream copy* functions, which provide a simple way to copy all - or a
354 bounded amount of - data from one stream to another. Since the read/write functions of a UCX buffer are
355 fully compatible with stream read/write functions, you can easily transfer data from file or network streams to
356 a UCX buffer or vice-versa.
357
358 The following example shows, how easy it is to read the contents of a file into a buffer:
359 ```c
360 FILE *inputfile = fopen(infilename, "r");
361 if (inputfile) {
362 CxBuffer fbuf;
363 cxBufferInit(&fbuf, NULL, 4096, NULL, CX_BUFFER_AUTO_EXTEND);
364 cx_stream_copy(inputfile, &fbuf,
365 (cx_read_func) fread,
366 cxBufferWriteFunc);
367 fclose(inputfile);
368
369 // ... do something meaningful with the contents ...
370
371 cxBufferDestroy(&fbuf);
372 } else {
373 perror("Error opening input file");
374 if (fout != stdout) {
375 fclose(fout);
376 }
377 }
378 ```
379
380 ### Printf Functions
381
382 *Header file:* [printf.h](api/printf_8h.html)
383
384 In this utility header you can find `printf()`-like functions that can write the formatted output to an arbitrary
385 stream (or UCX buffer, resp.), or to memory allocated by an allocator within a single function call.
386 With the help of these convenience functions, you do not need to `snprintf` your string to a temporary buffer anymore,
387 plus you do not need to worry about too small buffer sizes, because the functions will automatically allocate enough
388 memory to contain the entire formatted string.
389
390 ### Compare Functions
391
392 *Header file:* [compare.h](api/compare_8h.html)
393
394 This header file contains a collection of compare functions for various data types.
395 Their signatures are designed to be compatible with the `cx_compare_func` function pointer type.

mercurial