A binary tree is an efficient way of storing data in memory (in a tree!) The C run time library has a set of functions that allow you to create and manage data in a binary tree, but it is not that easy to set up.
Core concepts
Each element, or node, in the tree has two children and a pointer to the record. Each child can be null, or the address of another child. The left child contains elements which are less than the parent, the right node contains elements which are greater than the parent node. To find an element in the tree, you start at the top, or the root node, and compare the value of the nodes value, with the element you are looking for and work down the tree until you find a matching record or you hit the bottom of the tree. You provide a compare function takes two records and returns:
- -1 if the first record is greater than the second record
- 0 if the two records are the same item
- 1 if the first record is less than the second record.
The comparison could be a simple string compare, or comparing data in a structure for example
typedef struct {
char pString[5]; // to the message text
int count ;
int date;
} MyNode;
int Node_compare(const void *MyNode1, const void *MyNode2) {
MyNode * p1, *p2;
int rc;
// check the dates
rc = p1-> date - p2 -> date;
if (rc == 0 )
// check the string
rc = strcmp(p1->pString, p2->pString);
// return the result
return rc;
}
You need a root for the tree
void *ROOT = NULL;
Some C I do not understand
In some of the routines you need code like
char * buffer = * (char **) in;
//char * buffer = (char *) in;
With my knowledge of C, these two statements look the same. Only the first one works.
I have code
void print_Node(const void *ptr, ...) {
...
MyNode p = *(MyNode**) ptr;
There may be a smarter way of referencing the passed data, but the definition of “MyNode p” and using “* (Mynode **)” on the passed in pointer, works.
Referencing structures
It took me a while to work out the correct type definitions to get it to work. For example
int Node_compare(const void *, const void *);
MyNode * pMyNode;
void * p;
p = (void *) pMyNode;
MyNode * q;
/* see if it is in the tree already */
q = *(MyNode **) tsearch(p, (void **) &ROOT, Node_Compare);
The various functions expect pointer defined with type (void *).
Find an element
To look for for an element you use the tfind (tree find) function. The parameters are
- a pointer to the record (or string) partially initialised with the values used by the compare function.
- the root of the tree
- the compare function, described above.
It returns the record, or a null.
To use this function you have to create a record containing the values the compare function can use it. This record can be in automatic storage, or you can use malloc() to allocate storage for it.
typedef struct {
char pString[5]; // to the message text
int count ;
int date;
} MyNode;
MyNode tempNode;
memcpy(&tempNode.String("WXYZ"),sizeof("WXYZ");
tempNode.date = ....
You do not need to initialise tempNode.count, as this is not used in the comparison.
Adding an element
There is no tadd() function as such, there is a tsearch which provides “look for this node, and add it if it was not found“.
The parameters are
- a pointer to the record.
- the root of the tree
- the compare function, described above.
it returns the address of a record. If it is the one you passed in – then it was added. It it returns a different node – an existing record was returned.
See below for a more detailed description.
Deleting a node
You pass the standard three parameters in.
Walking the tree.
I’ve typically used a binary tree to store lots of information, such as a list of error messages and the count of occurrences, then, at the end of the job, print them all out in sequence. This is called walking the tree.
You use
twalk(root, print_function);
This calls the print_function for every element in the tree.
Planning for the tree
If you plan to walk the tree and print the elements in order, then the compare function needs to be written so the data is in the right order.
For example, I want to report the number of error messages, and the count of the messages, by day.
With my structure
typedef struct {
char pString[5]; // to the message text
int count ;
int date;
} MyNode;
If I use
int Node_compare(const void *MyNode1, const void *MyNode2) {
MyNode * p1, *p2;
int rc;
// check the dates
rc = p1-> date - p2 -> date;
if (rc == 0 )
// check the string
rc = strcmp(p1->pString, p2->pString);
return rc;
}
then the data will be sorted by date, then message within date order
If I use
int Node_compare(const void *MyNode1, const void *MyNode2) {
MyNode * p1, *p2;
int rc;
// check the string
rc = strcmp(p1->pString, p2->pString);
if (rc == 0 )
// check the dates
rc = p1-> date - p2 -> date;
return rc;
}
it will report in message sequence, then by date within message.
Adding nodes to the tree.
You need to malloc storage for each node you intend to add, because the node may be stored within the tree.
MyNode * pTode;
/* allocate our Node - we cant use automatic storage as it may */
/* be added to the tree - so must not be deleted */
pNode = (MyNode *) malloc(sizeof(MyNode));
if (pNode == 0)
{
perror("Malloc for Node failed ");
return 8;
}
/* initialise it */
strcpy(&pNode->pString[0],"ABCD") ;
pNode -> date = date();
pNode->count = 1;
void * p;
p = (void *) pNode;
MyNode * qreturned;
/* see if it is in the tree already */
qreturned = *(MyNode **) tsearch(p, (void **) &ROOT,Node_compare);
if (qreturned == p)
{
/* it didnt exist before - thus it was added */
/* possibly do something */
initialise the remained of the record
}
else /* it did exist before so we need to update the count field */
{
qreturned->count += 1; Update the values
/* release the storage we dont need */
free(pNode);
}
All data must be self contained within the node, or to permanently allocate storage (with malloc()). If it references something in automatic storage, then when the automatic storage is reused, your pointer will point to invalid data.
Print the tree
twalk(ROOT, print_Node); // ROOT not &ROOT because of no updates, and pass the function
//
// which invokes
//
void print_Node(const void *ptr, VISIT order, int level) {
if (order == leaf || order == postorder) {
MyNode p = *(MyNode**) ptr;
printf("Msg %4.4s date %d count %d\n",
p->pString, p->date,p-> count);
// level is how far down the tree the node is. Root = 0
}
}
Putting it together
I wanted to collect information from RACF records and be able to refer back to the records. I treated the records as char *
The essence of the code is
Compare
#define LRECORD 294
int compare207(const void *MyNode1, const void *MyNode2) {
int rc = memcmp( ((char * ) Mynode1) + 1
((char * ) Mynode2) + 1,261)
return rc;
}
Add a record
void add207(char * pRACF)
{
void * p = malloc(LRECORD);
...
memcpy(p ,pRACF,294);
qreturned = *(char **) tsearch(p, (void **) &ROOT207,
compare207);
if (qreturned == p)
{
// it was added
}
else /* it did exist before so we need to update the count field */
{
//it existed - so update the fields);
// qreturned -> ....
}
Find a record
char * find207(char * pIn)
{
char * * pOut;
pOut =*(char **) tfind( (const void *) pIn,
(void **)&ROOT207,
compare207);
return pOut;
}
Walk the tree and print a record
void tree207print(const void *in , VISIT order, int level) {
char * pBuffer = * (char **) in;
if (order == leaf || order == postorder)
{
printf("Data %s\n",pBbuffer -> .... );
}
}
void twalk207()
{
twalk(ROOT207, tree207print);
}