|
1 --- |
|
2 title: UCX Features |
|
3 --- |
|
4 |
|
5 <div id="modules"> |
|
6 |
|
7 ------------------------ ------------------------- ------------------- --------------------------------- |
|
8 [Allocator](#allocator) [String](#string) [Buffer](#buffer) [Memory Pool](#memory-pool) |
|
9 [Iterator](#iterator) [Collection](#collection) [List](#list) [Map](#map) |
|
10 [Utilities](#utilities) |
|
11 ------------------------ ------------------------- ------------------- --------------------------------- |
|
12 |
|
13 </div> |
|
14 |
|
15 ## Allocator |
|
16 |
|
17 *Header file:* [allocator.h](api/allocator_8h.html) |
|
18 |
|
19 The UCX allocator provides an interface for implementing an own memory allocation mechanism. |
|
20 Various function in UCX provide an additional alternative signature that takes an allocator as |
|
21 argument. A default allocator implementation using the stdlib memory management functions is |
|
22 available via the global symbol `cxDefaultAllocator`. |
|
23 |
|
24 If you want to define your own allocator, you need to initialize the `CxAllocator` structure |
|
25 with a pointer to an allocator class (containing function pointers for the memory management |
|
26 functions) and an optional pointer to an arbitrary memory region that can be used to store |
|
27 state information for the allocator. An example is shown below: |
|
28 |
|
29 ```c |
|
30 struct my_allocator_state { |
|
31 size_t total; |
|
32 size_t avail; |
|
33 char mem[]; |
|
34 }; |
|
35 |
|
36 static cx_allocator_class my_allocator_class = { |
|
37 my_malloc_impl, |
|
38 my_realloc_impl, // all these functions are somewhere defined |
|
39 my_calloc_impl, |
|
40 my_free_impl |
|
41 }; |
|
42 |
|
43 CxAllocator create_my_allocator(size_t n) { |
|
44 CxAllocator alloc; |
|
45 alloc.cl = &my_allocator_class; |
|
46 alloc.data = calloc(1, sizeof(struct my_allocator_state) + n); |
|
47 return alloc; |
|
48 } |
|
49 ``` |
|
50 |
|
51 ## String |
|
52 |
|
53 *Header file:* [string.h](api/string_8h.html) |
|
54 |
|
55 UCX strings come in two variants: immutable (`cxstring`) and mutable (`cxmutstr`). |
|
56 The functions of UCX are designed to work with immutable strings by default but in situations where it is necessary, |
|
57 the API also provides alternative functions that work directly with mutable strings. |
|
58 Functions that change a string in-place are, of course, only accepting mutable strings. |
|
59 |
|
60 When you are using UCX functions, or defining your own functions, you are sometimes facing the "problem", |
|
61 that the function only accepts arguments of type `cxstring` but you only have a `cxmutstr` at hand. |
|
62 In this case you _should not_ introduce a wrapper function that accepts the `cxmutstr`, |
|
63 but instead you should use the `cx_strcast()` function to cast the argument to the correct type. |
|
64 |
|
65 In general, UCX strings are **not** necessarily zero-terminated. If a function guarantees to return zero-terminated |
|
66 string, it is explicitly mentioned in the documentation of the respective function. |
|
67 As a rule of thumb, you _should not_ pass the strings of a UCX string structure to another API without explicitly |
|
68 ensuring that the string is zero-terminated. |
|
69 |
|
70 ## Buffer |
|
71 |
|
72 *Header file:* [buffer.h](api/buffer_8h.html) |
|
73 |
|
74 Instances of this buffer implementation can be used to read from or write to memory like you would do with a stream. |
|
75 This allows the use of `cx_stream_copy()` (see [Utilities](#utilities)) to copy contents from one buffer to another, |
|
76 or from a file or network streams to the buffer and vice-versa. |
|
77 |
|
78 More features for convenient use of the buffer can be enabled, like automatic memory management and automatic |
|
79 resizing of the buffer space. |
|
80 |
|
81 Since UCX 3.0, the buffer also supports automatic flushing of contents to another stream (or buffer) as an alternative |
|
82 to automatically resizing the buffer space. |
|
83 Please refer to the API doc for the fields prefixed with `flush_` to learn more. |
|
84 |
|
85 ## Memory Pool |
|
86 |
|
87 *Header file:* [mempool.h](api/mempool_8h.html) |
|
88 |
|
89 A memory pool is providing an allocator implementation that automatically deallocates the memory upon its destruction. |
|
90 It also allows you to register destructor functions for the allocated memory, which are automatically called before |
|
91 the memory is deallocated. |
|
92 Additionally, you may also register _independent_ destructor functions within a pool in case some external library |
|
93 allocated memory for you, which should be freed together with this pool. |
|
94 |
|
95 Many UCX features support the use of an allocator. |
|
96 The [strings](#string), for instance, provide several functions suffixed with `_a` that allow specifying an allocator. |
|
97 You can use this to keep track of the memory occupied by dynamically allocated strings and cleanup everything with |
|
98 just a single call to `cxMempoolFree()`. |
|
99 |
|
100 The following code illustrates this on the example of reading a CSV file into memory. |
|
101 ```C |
|
102 #include <stdio.h> |
|
103 #include <cx/mempool.h> |
|
104 #include <cx/linked_list.h> |
|
105 #include <cx/string.h> |
|
106 #include <cx/buffer.h> |
|
107 #include <cx/utils.h> |
|
108 |
|
109 typedef struct { |
|
110 cxstring column_a; |
|
111 cxstring column_b; |
|
112 cxstring column_c; |
|
113 } CSVData; |
|
114 |
|
115 int main(void) { |
|
116 CxMempool* pool = cxBasicMempoolCreate(128); |
|
117 |
|
118 FILE *f = fopen("test.csv", "r"); |
|
119 if (!f) { |
|
120 perror("Cannot open file"); |
|
121 return 1; |
|
122 } |
|
123 // close the file automatically at pool destruction |
|
124 cxMempoolRegister(pool, f, (cx_destructor_func) fclose); |
|
125 |
|
126 // create a buffer using the memory pool for destruction |
|
127 CxBuffer *content = cxBufferCreate(NULL, 256, pool->allocator, CX_BUFFER_AUTO_EXTEND); |
|
128 |
|
129 // read the file into the buffer and turn it into a string |
|
130 cx_stream_copy(f, content, (cx_read_func) fread, cxBufferWriteFunc); |
|
131 fclose(f); |
|
132 cxstring contentstr = cx_strn(content->space, content->size); |
|
133 |
|
134 // split the string into lines - use the mempool for allocating the target array |
|
135 cxstring* lines; |
|
136 size_t lc = cx_strsplit_a(pool->allocator, contentstr, |
|
137 CX_STR("\n"), SIZE_MAX, &lines); |
|
138 |
|
139 // skip the header and parse the remaining data into a linked list |
|
140 // the nodes of the linked list shall also be allocated by the mempool |
|
141 CxList* datalist = cxLinkedListCreate(pool->allocator, NULL, sizeof(CSVData)); |
|
142 for (size_t i = 1 ; i < lc ; i++) { |
|
143 if (lines[i].length == 0) continue; |
|
144 cxstring fields[3]; |
|
145 size_t fc = cx_strsplit(lines[i], CX_STR(";"), 3, fields); |
|
146 if (fc != 3) { |
|
147 fprintf(stderr, "Syntax error in line %zu.\n", i); |
|
148 cxMempoolFree(pool); |
|
149 return 1; |
|
150 } |
|
151 CSVData data; |
|
152 data.column_a = fields[0]; |
|
153 data.column_b = fields[1]; |
|
154 data.column_c = fields[2]; |
|
155 cxListAdd(datalist, &data); |
|
156 } |
|
157 |
|
158 // iterate through the list and output the data |
|
159 CxIterator iter = cxListIterator(datalist); |
|
160 cx_foreach(CSVData*, data, iter) { |
|
161 printf("Column A: %.*s | " |
|
162 "Column B: %.*s | " |
|
163 "Column C: %.*s\n", |
|
164 (int)data->column_a.length, data->column_a.ptr, |
|
165 (int)data->column_b.length, data->column_b.ptr, |
|
166 (int)data->column_c.length, data->column_c.ptr |
|
167 ); |
|
168 } |
|
169 |
|
170 // cleanup everything, no manual free() needed |
|
171 cxMempoolFree(pool); |
|
172 |
|
173 return 0; |
|
174 } |
|
175 ``` |
|
176 |
|
177 ## Iterator |
|
178 |
|
179 *Header file:* [iterator.h](api/iterator_8h.html) |
|
180 |
|
181 In UCX 3 a new feature has been introduced to write own iterators, that work with the `cx_foreach` macro. |
|
182 In previous UCX releases there were different hard-coded foreach macros for lists and maps that were not customizable. |
|
183 Now, creating an iterator is as simple as creating a `CxIterator` struct and setting the fields in a meaningful way. |
|
184 |
|
185 You do not always need all fields in the iterator structure, depending on your use case. |
|
186 Sometimes you only need the `index` (for example when iterating over simple lists), and other times you will need the |
|
187 `slot` and `kv_data` fields (for example when iterating over maps). |
|
188 |
|
189 If the predefined fields are insufficient for your use case, you can alternatively create your own iterator structure |
|
190 and place the `CX_ITERATOR_BASE` macro as first member of that structure. |
|
191 |
|
192 Usually an iterator is not mutating the collection it is iterating over. |
|
193 In some programming languages it is even disallowed to change the collection while iterating with foreach. |
|
194 But sometimes it is desirable to remove an element from the collection while iterating over it. |
|
195 For this purpose, most collections allow the creation of a _mutating_ iterator. |
|
196 The only differences are, that the `mutating` flag is `true` and the `src_handle` is not const. |
|
197 On mutating iterators it is allowed to call the `cxFlagForRemoval()` function, which instructs the iterator to remove |
|
198 the current element from the collection on the next call to `cxIteratorNext()` and clear the flag afterward. |
|
199 If you are implementing your own iterator, it is up to you to implement this behavior. |
|
200 |
|
201 ## Collection |
|
202 |
|
203 *Header file:* [collection.h](api/collection_8h.html) |
|
204 |
|
205 Collections in UCX 3 have several common features. |
|
206 If you want to implement an own collection data type that uses the same features, you can use the |
|
207 `CX_COLLECTION_BASE` macro at the beginning of your struct to roll out all members a usual UCX collection has. |
|
208 ```c |
|
209 struct my_fancy_collection_s { |
|
210 CX_COLLECTION_BASE; |
|
211 struct my_collection_data_s *data; |
|
212 }; |
|
213 ``` |
|
214 Based on this structure, this header provides some convenience macros for invoking the destructor functions |
|
215 that are part of the basic collection members. |
|
216 The idea of having destructor functions within a collection is that you can destroy the collection _and_ the |
|
217 contents with one single function call. |
|
218 When you are implementing a collection, you are responsible for invoking the destructors at the right places, e.g. |
|
219 when removing (and deleting) elements in the collection, clearing the collection, or - the most prominent case - |
|
220 destroying the collection. |
|
221 |
|
222 You can always look at the UCX list and map implementations if you need some inspiration. |
|
223 |
|
224 ## List |
|
225 |
|
226 *Header file:* [list.h](api/list_8h.html) |
|
227 |
|
228 This header defines a common interface for all list implementations. |
|
229 |
|
230 UCX already comes with two common list implementations (linked list and array list) that should cover most use cases. |
|
231 But if you feel the need to implement an own list, the only thing you need to do is to define a struct with a |
|
232 `struct cx_list_s` as first member, and set an appropriate list class that implements the functionality. |
|
233 It is strongly recommended that this class is shared among all instances of the same list type, because otherwise |
|
234 the `cxListCompare` function cannot use the optimized implementation of your class and will instead fall back to |
|
235 using iterators to compare the contents element-wise. |
|
236 |
|
237 ### Linked List |
|
238 |
|
239 *Header file:* [linked_list.h](api/linked__list_8h.html) |
|
240 |
|
241 On top of implementing the list interface, this header also defines several low-level functions that |
|
242 work with arbitrary structures. |
|
243 Low-level functions, in contrast to the high-level list interface, can easily be recognized by their snake-casing. |
|
244 The function `cx_linked_list_at`, for example, implements a similar functionality like `cxListAt`, but operates |
|
245 on arbitrary structures. |
|
246 The following snippet shows how it is used. |
|
247 All other low-level functions work similarly. |
|
248 ```c |
|
249 struct node { |
|
250 node *next; |
|
251 node *prev; |
|
252 int data; |
|
253 }; |
|
254 |
|
255 const ptrdiff_t loc_prev = offsetof(struct node, prev); |
|
256 const ptrdiff_t loc_next = offsetof(struct node, next); |
|
257 const ptrdiff_t loc_data = offsetof(struct node, data); |
|
258 |
|
259 struct node a = {0}, b = {0}, c = {0}, d = {0}; |
|
260 cx_linked_list_link(&a, &b, loc_prev, loc_next); |
|
261 cx_linked_list_link(&b, &c, loc_prev, loc_next); |
|
262 cx_linked_list_link(&c, &d, loc_prev, loc_next); |
|
263 |
|
264 cx_linked_list_at(&a, 0, loc_next, 2); // returns pointer to c |
|
265 ``` |
|
266 |
|
267 ### Array List |
|
268 |
|
269 *Header file:* [array_list.h](api/array__list_8h.html) |
|
270 |
|
271 Since low-level array lists are just plain arrays, there is no need for such many low-level functions as for linked |
|
272 lists. |
|
273 However, there is one extremely powerful function that can be used for several complex tasks: `cx_array_copy`. |
|
274 The full signature is shown below: |
|
275 ```c |
|
276 int cx_array_copy( |
|
277 void **target, |
|
278 void *size, |
|
279 void *capacity, |
|
280 unsigned width, |
|
281 size_t index, |
|
282 const void *src, |
|
283 size_t elem_size, |
|
284 size_t elem_count, |
|
285 struct cx_array_reallocator_s *reallocator |
|
286 ); |
|
287 ``` |
|
288 The `target` argument is a pointer to the target array pointer. |
|
289 The reason for this additional indirection is that this function writes |
|
290 back the pointer to the possibly reallocated array. |
|
291 The next two arguments are pointers to the `size` and `capacity` of the target array for which the width |
|
292 (in bits) is specified in the `width` argument. |
|
293 |
|
294 On a successful invocation, the function copies `elem_count` number of elements, each of size `elem_size` from |
|
295 `src` to `*target` and uses the `reallocator` to extend the array when necessary. |
|
296 Finally, the size, capacity, and the pointer to the array are all updated and the function returns zero. |
|
297 |
|
298 A few things to note: |
|
299 * `*target` and `src` can point to the same memory region, effectively copying elements within the array with `memmove` |
|
300 * `*target` does not need to point to the start of the array, but `size` and `capacity` always start counting from the |
|
301 position, `*target` points to - in this scenario, the need for reallocation must be avoided for obvious reasons |
|
302 * `index` does not need to be within size of the current array |
|
303 * `index` does not even need to be within the capacity of the array |
|
304 * `width` must be one of 8, 16, 32, 64 (only on 64-bit systems), or zero (in which case the native word width is used) |
|
305 |
|
306 If you just want to add one single element to an existing array, you can use the macro `cx_array_add()`. |
|
307 You can use `CX_ARRAY_DECLARE()` to declare the necessary fields within a structure and then use the |
|
308 `cx_array_simple_*()` convenience macros to reduce code overhead. |
|
309 The convenience macros automatically determine the width of the size/capacity variables. |
|
310 |
|
311 ## Map |
|
312 |
|
313 *Header file:* [map.h](api/map_8h.html) |
|
314 |
|
315 Similar to the list interface, the map interface provides a common API for implementing maps. |
|
316 There are some minor subtle differences, though. |
|
317 |
|
318 First, the `remove` method is not always a destructive removal. |
|
319 Instead, the last argument is a Boolean that indicates whether the element shall be destroyed or returned. |
|
320 ```c |
|
321 void *(*remove)(CxMap *map, CxHashKey key, bool destroy); |
|
322 ``` |
|
323 When you implement this method, you are either supposed to invoke the destructors and return `NULL`, |
|
324 or just remove the element from the map and return it. |
|
325 |
|
326 Secondly, the iterator method is a bit more complete. The signature is as follows: |
|
327 ```c |
|
328 CxIterator (*iterator)(const CxMap *map, enum cx_map_iterator_type type); |
|
329 ``` |
|
330 There are three map iterator types: for values, for keys, for pairs. |
|
331 Depending on the iterator type requested, you need to create an iterator with the correct methods that |
|
332 return the requested thing. |
|
333 There are no automatic checks to enforce this - it's completely up to you. |
|
334 If you need inspiration on how to do that, check the hash map implementation that comes with UCX. |
|
335 |
|
336 ### Hash Map |
|
337 |
|
338 *Header file:* [hash_map.h](api/hash__map_8h.html) |
|
339 |
|
340 UCX provides a basic hash map implementation with a configurable amount of buckets. |
|
341 If you do not specify the number of buckets, a default of 16 buckets will be used. |
|
342 You can always rehash the map with `cxMapRehash()` to change the number of buckets to something more efficient, |
|
343 but you need to be careful, because when you use this function you are effectively locking into using this |
|
344 specific hash map implementation, and you would need to remove all calls to this function when you want to |
|
345 exchange the concrete map implementation with something different. |
|
346 |
|
347 ## Utilities |
|
348 |
|
349 *Header file:* [utils.h](api/utils_8h.html) |
|
350 |
|
351 UCX provides some utilities for routine tasks. |
|
352 |
|
353 The most useful utilities are the *stream copy* functions, which provide a simple way to copy all - or a |
|
354 bounded amount of - data from one stream to another. Since the read/write functions of a UCX buffer are |
|
355 fully compatible with stream read/write functions, you can easily transfer data from file or network streams to |
|
356 a UCX buffer or vice-versa. |
|
357 |
|
358 The following example shows, how easy it is to read the contents of a file into a buffer: |
|
359 ```c |
|
360 FILE *inputfile = fopen(infilename, "r"); |
|
361 if (inputfile) { |
|
362 CxBuffer fbuf; |
|
363 cxBufferInit(&fbuf, NULL, 4096, NULL, CX_BUFFER_AUTO_EXTEND); |
|
364 cx_stream_copy(inputfile, &fbuf, |
|
365 (cx_read_func) fread, |
|
366 cxBufferWriteFunc); |
|
367 fclose(inputfile); |
|
368 |
|
369 // ... do something meaningful with the contents ... |
|
370 |
|
371 cxBufferDestroy(&fbuf); |
|
372 } else { |
|
373 perror("Error opening input file"); |
|
374 if (fout != stdout) { |
|
375 fclose(fout); |
|
376 } |
|
377 } |
|
378 ``` |
|
379 |
|
380 ### Printf Functions |
|
381 |
|
382 *Header file:* [printf.h](api/printf_8h.html) |
|
383 |
|
384 In this utility header you can find `printf()`-like functions that can write the formatted output to an arbitrary |
|
385 stream (or UCX buffer, resp.), or to memory allocated by an allocator within a single function call. |
|
386 With the help of these convenience functions, you do not need to `snprintf` your string to a temporary buffer anymore, |
|
387 plus you do not need to worry about too small buffer sizes, because the functions will automatically allocate enough |
|
388 memory to contain the entire formatted string. |
|
389 |
|
390 ### Compare Functions |
|
391 |
|
392 *Header file:* [compare.h](api/compare_8h.html) |
|
393 |
|
394 This header file contains a collection of compare functions for various data types. |
|
395 Their signatures are designed to be compatible with the `cx_compare_func` function pointer type. |