libmba
A library of generic C modules

The libmba package is a collection of mostly independent C modules potentially useful to any project. There are the usual ADTs including a linkedlist, hashmap, pool, stack, and varray, a flexible memory allocator, CSV parser, path canonicalization routine, I18N text abstraction, configuration file module, portable semaphores, condition variables and more. The code is designed so that individual modules can be integrated into existing codebases rather than requiring the user to commit to the entire library. The code has no typedefs, few comments, and extensive man pages and HTML documentation.

Links

Download
API Reference
Browse The Source

Similar Projects

These projects look similar in purpose to libmba, although in most cases that has not been confirmed, and their presence here is not necessarily an endorsement of quality. They are listed here (in no particular order) only to help developers focus their search.

MLib
libavl
SGLIB
Libtc
netsw.org
OSSP
LibAST
skalibs
libslack
LIBH
Hackerlab
Kazlib
Matt's C Utility Library
ubiqx
gdsl
smbase

News

libmba-0.9.1 released
Fri Apr 29, 2005
Portability of libmba has been greatly improved. It has been compiled and tested (albeit not extensively on all platforms) on OSF1, HP-UX [1], Linux, Mac OS X [2], and FreeBSD with gcc, DECC, and aC++. There have also been the following modifications and enhancements.
  • The msgno(3m) module has been significantly reworked. The MSG, MNO, and MNF macros have been renamed to MMSG, MMNO, and MMNF respectively to reduce namespace collisions. There is also no longer a dependence on variadic macros, so the msgno(3m) module is now highly portable. The MSGNO macro is no longer used -- msgno is now enabled at all times.
  • The bitset(3m) module macros have been converted to functions so that using expressions as arguments (e.g. i++) does not result in undefined behavior. As a result, some return values have changed. Please review the man page or HTML documentation.
  • A debug module has been added that provides some useful backtrace-oriented functions, but it is not documented because it is specific to GNUC on i386.
  • The linkedlist_insert_sorted function has been modified to support a context parameter that is passed as-is to the supplied cmp_fn defined in hashmap(3m). The compare_fn type has been removed.
  • A path_name function has been added to the path(3m) module.
  • Some parameters of the shellout(3m) functions have been changed from unsigned char to char.
  • The csv(3m) module will now exclude carriage returns preceding newlines in elements.
  • A SUBA_PTR_SIZE macro has been added to the suba(3m) module that evaluates to the size of the cell backing the suba allocated object. This code can also be easily modified to support a payload within each cell (e.g. stack addresses for stacktraces).
  • The SEM_UNDO flag has been replaced with O_UNDO to avoid possible collision with other O_* bits.
  • A varray_index function has been added.
  • A variety of other small bugfixes have been applied.
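
The bitset(3m) change above addresses a general hazard of function-like macros. The following is a minimal sketch of that hazard, not the libmba code: the names BIT_ISSET_MACRO, bit_set, and bit_isset are hypothetical and the real bitset(3m) signatures may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical macro form. The idx argument is expanded twice, so a call
 * such as BIT_ISSET_MACRO(p, i++) modifies i twice without an intervening
 * sequence point, which is undefined behavior. */
#define BIT_ISSET_MACRO(ptr, idx) \
    ((((const unsigned char *)(ptr))[(idx) / CHAR_BIT] >> ((idx) % CHAR_BIT)) & 1)

/* Hypothetical function form. The argument expression is evaluated exactly
 * once, before the call, so bit_isset(p, i++) is well defined. */
static int bit_isset(const void *ptr, size_t idx)
{
    const unsigned char *p = ptr;

    assert(ptr != NULL);
    return (p[idx / CHAR_BIT] >> (idx % CHAR_BIT)) & 1;
}

/* Sets bit idx in the byte array at ptr. */
static void bit_set(void *ptr, size_t idx)
{
    unsigned char *p = ptr;

    assert(ptr != NULL);
    p[idx / CHAR_BIT] |= (unsigned char)(1u << (idx % CHAR_BIT));
}
```

With the function form, a loop such as `while (bit_isset(bits, i++))` advances i by exactly one per test.
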
As usual some nifty example programs are included.
  • csvprint has been considerably improved.
  • errcmp is a program used to generate this useful Errno Codes by Platform page.
  • bindiff, hexdiff, spell, and strdiff thoroughly illustrate the possibilities of the diff(3m) module.
  • hexd will print hexdump formatted output based on a comma separated list of ranges. There is also an HTML option that will color code the ranges. This is potentially useful for documenting binary formats. Below is an example SMB_COM_SESSION_SETUP response packet from the CIFS networking protocol.

    It was generated with the command below, where ssxr.bin is a file containing the raw packet data (exported from Ethereal).
    
    hexd -h -r "32:SMB Header,14:8:Signature,32:1:WordCount,6:Words,2:ByteCount,25:Bytes,0x44:17:Tree Connect AndX" ssxr.bin
    
    
[1] shellout(3m) does not work on HP-UX because it lacks the non-standard forkpty(2).
[2] svsem(3m) and svcond(3m) do not work properly on Mac OS X. It appears that semop(2) does not initialize semid_ds.sem_otime in the same way that other platforms do.

libmba-0.8.10 released
Sat Aug 28, 2004
Two bugs have been found and fixed in the csv module. First, if a non-ASCII character was read with csv_row_parse, parsing would stop prematurely due to a signedness error. The csv module now uses unsigned char throughout to properly support internationalized text. Note that csv_row_fread was unaffected by this bug. Second, if the character preceding EOF was a double quote (as opposed to a newline) an error would occur. The csv module will now correctly process the final element.
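
The exact defect in csv_row_parse is not reproduced here. The following is a generic sketch of this class of signedness error and of the unsigned char remedy; next_byte and count_bytes are hypothetical helpers, not part of the csv module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Hypothetical helper: returns the next byte as a value in 0..255, or EOF
 * once the input is exhausted. Reading through unsigned char matters: if
 * plain char is signed, a byte such as 0xFF converts to a negative int
 * (commonly -1, the usual value of EOF) and a parser would stop early. */
static int next_byte(const unsigned char *buf, size_t len, size_t *pos)
{
    assert(buf != NULL && pos != NULL);
    if (*pos >= len)
        return EOF;
    return buf[(*pos)++];
}

/* Counts the bytes seen before the end of input is reported. */
static size_t count_bytes(const char *s, size_t len)
{
    size_t pos = 0, n = 0;

    while (next_byte((const unsigned char *)s, len, &pos) != EOF)
        n++;
    return n;
}
```

Because every byte is widened from unsigned char, no input byte can be confused with the end-of-input marker.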

Also related, a few example programs are now included with the distribution. One such example is the csvprint utility which prints data in a csv file using a format string.

examples$ ./csvprint data.csv "%2|%1|FOO(%2)\n"
three|two|FOO(three)
...

This is surprisingly useful for reordering fields, generating source code, etc.
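
As a rough illustration of the idea (not the csvprint source), the sketch below expands a format string against one parsed row. It assumes single-digit, zero-based field references and a row of one,two,three, which is consistent with the output shown above but is not confirmed by it; expand_format is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical expander: copies fmt to out, replacing %N (one digit,
 * zero-based) with fields[N] and the two characters \n with a newline.
 * Returns the output length, or -1 if a field index is out of range or
 * out is too small to hold the result and its terminator. */
static int expand_format(const char *fmt, const char *const *fields,
                         int nfields, char *out, size_t outsz)
{
    size_t o = 0;

    assert(fmt != NULL && fields != NULL && out != NULL);
    while (*fmt) {
        char one[2] = { 0, 0 };
        const char *src = one;
        size_t len;

        if (fmt[0] == '%' && fmt[1] >= '0' && fmt[1] <= '9') {
            int idx = fmt[1] - '0';
            if (idx >= nfields)
                return -1;
            src = fields[idx];
            fmt += 2;
        } else if (fmt[0] == '\\' && fmt[1] == 'n') {
            one[0] = '\n';
            fmt += 2;
        } else {
            one[0] = *fmt++;          /* ordinary character, copied as-is */
        }
        len = strlen(src);
        if (o + len >= outsz)         /* keep room for the terminator */
            return -1;
        memcpy(out + o, src, len);
        o += len;
    }
    if (outsz == 0)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

Calling it once per parsed row and writing out to stdout gives the general behavior described above.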

The bitset_find_first function will now set errno to ENOENT if the target bit was not found.

Some issues regarding the initialization of svsem(3m) semaphores have been fixed. The module should now properly handle the initialization race outlined in Stevens' UNPv2 in addition to the scenario where a semaphore is removed during initialization.

Finally, the eval(3m) module now allows a context parameter to be specified that will be passed to the user supplied symlook function. This is necessary for full reentrancy.
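
The value of such a context parameter can be sketched as follows. This is an illustration of the pattern only: the callback signature, struct symtab, table_lookup, and eval_symbol_plus are all hypothetical and do not reflect the actual eval(3m) interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct symbol { const char *name; long value; };
struct symtab { const struct symbol *syms; size_t count; };

/* Hypothetical lookup callback. All state arrives through context, so no
 * global table is needed and independent tables can be used concurrently. */
static int table_lookup(const char *name, long *value, void *context)
{
    const struct symtab *tab = context;
    size_t i;

    for (i = 0; i < tab->count; i++) {
        if (strcmp(tab->syms[i].name, name) == 0) {
            *value = tab->syms[i].value;
            return 0;
        }
    }
    return -1;
}

/* Stand-in for an evaluator: resolves one symbol through the callback,
 * passing context through untouched, and adds a constant to it. */
static int eval_symbol_plus(const char *name, long addend,
                            int (*symlook)(const char *, long *, void *),
                            void *context, long *result)
{
    long v;

    assert(name != NULL && symlook != NULL && result != NULL);
    if (symlook(name, &v, context) != 0)
        return -1;
    *result = v + addend;
    return 0;
}
```

Two callers can evaluate the same expression against different symbol tables without interfering with each other, which is the property the context parameter provides.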

All documentation has been updated accordingly.

libmba-0.8.9 released
Fri May 21, 2004
The sho_loop function now accepts a pattern vector and timeout like sho_expect, and the cfg module has been modified to more closely support Java Properties escape sequences for spaces and Unicode characters.

libmba-0.8.8 released
Thu May 6, 2004
The purpose of this project is to provide generic C implementations of concepts elemental to a wide variety of programming problems. The latest addition to libmba is the diff module and it is a fine example of a non-trivial algorithm that is crucial to the function and efficiency of many common applications such as spell checkers, version control systems, spam filters, speech recognition, and more. The code is generic such that anything that can be indexed and compared with user supplied callbacks can be used such as strings, linked lists, pointers to lines in files, etc.

The algorithm is perhaps best known for its use in the GNU diff(1) program for generating a "diff" of two files. Formally it is known as the shortest edit script (SES) problem and is solved efficiently using the dynamic programming algorithm described by Myers [1] and in linear space with the Hirschberg refinement. The objective is to compute the minimum set of edit operations necessary to transform sequence A of length N into B of length M. This can be performed in O(N+M+D^2) expected time (O((N+M)D) in the worst case) where D is the edit distance (the number of elements deleted and inserted).

[1] E. Myers, "An O(ND) Difference Algorithm and Its Variations," Algorithmica 1, 2 (1986), 251-266. http://www.cs.arizona.edu/people/gene/PAPERS/diff.ps
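
To make the problem concrete, here is a minimal sketch of the basic greedy forward pass from Myers' paper. It computes only the edit distance D for two character arrays. It is not the diff(3m) implementation: it does not produce the edit script, it is not generic over user supplied callbacks, and it omits the linear space refinement. The name ses_length is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns the length D of a shortest edit script (deletions plus
 * insertions) transforming a[0..n) into b[0..m), or -1 if memory cannot
 * be allocated. v[k] holds the furthest x reached on diagonal k = x - y. */
static int ses_length(const char *a, int n, const char *b, int m)
{
    int max = n + m;
    int *buf, *v;
    int d, k, x, y, result = -1;

    assert(a != NULL && b != NULL && n >= 0 && m >= 0);
    buf = calloc((size_t)(2 * max + 3), sizeof *buf);
    if (buf == NULL)
        return -1;
    v = buf + max + 1;                /* allows k in [-max-1, max+1] */

    v[1] = 0;
    for (d = 0; d <= max && result < 0; d++) {
        for (k = -d; k <= d; k += 2) {
            /* Extend from the neighbouring diagonal that reached further:
             * k+1 means an insertion, k-1 means a deletion. */
            if (k == -d || (k != d && v[k - 1] < v[k + 1]))
                x = v[k + 1];
            else
                x = v[k - 1] + 1;
            y = x - k;
            /* Follow the run of matching elements at no cost. */
            while (x < n && y < m && a[x] == b[y]) {
                x++;
                y++;
            }
            v[k] = x;
            if (x >= n && y >= m) {
                result = d;
                break;
            }
        }
    }
    free(buf);
    return result;
}
```

For the example used in Myers' paper, `ses_length("ABCABBA", 7, "CBABAC", 6)` yields 5.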

Also, in this release, the path module, which has been in libmba for some time, is now documented. This module provides a high quality filesystem path canonicalization routine. Path canonicalization is notoriously unforgiving because the parsing routine is complex and yet it is not uncommon for programs to be required to accept paths from potentially malicious sources. This implementation uses a state machine approach to reduce complexity and has been tested with a wide range of inputs (see tcase/test/data/PathCanonExamples.data). Certain conditions are enforced that minimize the potential for exploits. For example, only one input character is examined with each iteration of the outer loop so that it can be certain that the slim and dlim limit pointers are checked with the advance of every input character. A canonicalized path cannot begin with a path separator unless the input began with a path separator. Because of the state machine structure, if there is a flaw in the implementation the fix is more likely to be a local adjustment which limits the potential for creating new flaws.
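
The approach can be illustrated with a much simplified sketch. This is not the path(3m) code: path_canon_sketch and path_canon_str are hypothetical names, only '/' is treated as a separator, and rejecting a ".." that would climb above the start of the path is an assumption of this sketch. It does follow the two conditions described above: one input character is consumed per loop iteration, and both limit pointers are checked on every read and write.

```c
#include <assert.h>
#include <string.h>

enum { ST_SEP, ST_DOT, ST_DOTDOT, ST_NAME };

/* Writes one character, refusing to pass the destination limit. */
static int put(char **dst, const char *dlim, int ch)
{
    if (*dst >= dlim)
        return -1;
    *(*dst)++ = (char)ch;
    return 0;
}

/* Collapses repeated separators and resolves "." and ".." components.
 * Returns the output length, or -1 on overflow or an escaping "..". */
static int path_canon_sketch(const char *src, const char *slim,
                             char *dst, const char *dlim)
{
    char *start = dst, *base = dst;   /* base: first byte after a root '/' */
    int state = ST_SEP, first = 1, ndots;

    assert(src != NULL && slim != NULL && dst != NULL && dlim != NULL);
    for (;;) {
        int c = (src < slim) ? *src++ : '\0';   /* one character per pass */

        if (first && c == '/') {      /* only a rooted input yields a root */
            if (put(&dst, dlim, '/') < 0) return -1;
            base = dst;
            first = 0;
            continue;
        }
        first = 0;
        if (c == '/' || c == '\0') {
            if (state == ST_DOTDOT) { /* drop the previous component */
                if (dst == base) return -1;
                while (dst > base && *--dst != '/')
                    ;
            }
            state = ST_SEP;
            if (c == '\0') break;
        } else if (c == '.' && state == ST_SEP) {
            state = ST_DOT;
        } else if (c == '.' && state == ST_DOT) {
            state = ST_DOTDOT;
        } else {
            if (state != ST_NAME) {   /* start a component, flush held dots */
                ndots = (state == ST_DOT) ? 1 : (state == ST_DOTDOT) ? 2 : 0;
                if (dst > base && put(&dst, dlim, '/') < 0) return -1;
                while (ndots-- > 0)
                    if (put(&dst, dlim, '.') < 0) return -1;
                state = ST_NAME;
            }
            if (put(&dst, dlim, c) < 0) return -1;
        }
    }
    if (put(&dst, dlim, '\0') < 0) return -1;
    return (int)(dst - start) - 1;
}

/* Convenience wrapper for NUL-terminated input. */
static int path_canon_str(const char *src, char *dst, size_t dstsz)
{
    return path_canon_sketch(src, src + strlen(src), dst, dst + dstsz);
}
```

Under these assumptions, "/usr//local/./bin/../lib/" canonicalizes to "/usr/local/lib", and a relative input can never produce output beginning with a separator.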