Programming paradigms — part 2

Gabriel Barberini
Aug 18, 2024


These notes are meant to give the reader the fundamentals to follow what is covered in Stanford's CS107 Programming Paradigms.

Part 2 covers lectures 5 to 8, which are mainly about:

  • More Generics
  • Proper memory allocation and disposal
  • Heap memory management

No Medium membership? No problem, you can access all of the content of my notes here: https://github.com/GabrielBarberini/EECS/blob/master/notes/personal/Programming_paradigms_CS107.pdf

Recap on function pointers

Function pointers provide an extremely interesting, efficient, and elegant programming technique. You can use them to replace switch/if-statements and to realize late binding. Late binding means deciding which function to call at run time instead of at compile time. They are less error-prone than normal pointers because you will never allocate or deallocate memory with them.

More info at: https://www.cs.cmu.edu/~ab/15-123N09/lectures/Lecture%2008%20-%20Function%20Pointers.pdf

Functions, like variables, are associated with an address in memory; a pointer that holds such an address is what we call a function pointer. A specific function pointer variable can be defined as follows: int (*fn)(int, int);

Here we define a function pointer `fn`, which can be initialized to any function that takes two integer arguments and returns an integer. Here is an example of such a function:

int sum(int x, int y) {
    return x + y;
}

Now, to initialize `fn` with the address of `sum`, we can do either of the following:

  • Make `fn` point to the address of `sum` explicitly:
fn = &sum;
  • Or simply drop the `&`, since function names, much like array names, decay into pointers to the entity they refer to:
fn = sum;

In the end, we can call the `sum` function in two ways:

int x = sum(10, 12);    /* direct call to the function */
int y = (*fn)(12, 10);  /* call to the function through the pointer */

Using typedefs

The syntax of function pointers can sometimes be confusing. So we can use a typedef statement to make things simpler:

typedef return_type (*fpointer)(argument_list);

So we can define a variable of type `fpointer` as follows:

fpointer fp;

Now imagine you have a function that returns another function based on some conditions:

int (*Convert(const char code))(int, int) {
    if (code == '+') return &Sum;
    if (code == '-') return &Difference;
    return NULL;   /* no operation matches the given code */
}

The above function takes a `char` as an argument and returns a specific function pointer based on whether the char is '+' or '-'.

Both `Sum` and `Difference` are functions that essentially receive two numbers and return a single number; in this context we are working with `int`s, so we can do something like:

typedef int (*Ptr)(int,int);

And then simplify the Convert definition as follows:
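A sketch of how that simplification might look, reusing the `Ptr` typedef (and assuming `Sum` and `Difference` take and return `int`s, as above):

Ptr Convert(const char code) {
    if (code == '+') return &Sum;
    if (code == '-') return &Difference;
    return NULL;   /* no operation matches the given code */
}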

Linear search in C

Recap on linear search

In computer science, linear search or sequential search is a method for finding an element within a list. It sequentially checks each element of the list until a match is found or the whole list has been searched.

A linear search runs in linear time in the worst case and makes at most n comparisons, where n is the length of the list.

Basic algorithm

Given a list L of n elements with values or records L₀, …, Lₙ₋₁, and target value T, the following subroutine uses linear search to find the index of the target T in L.

  1. Set i to 0.
  2. If Lᵢ = T, the search terminates successfully; return i.
  3. Increase i by 1.
  4. If i < n, go to step 2. Otherwise, the search terminates unsuccessfully.

More info at: https://en.wikipedia.org/wiki/Linear_search
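As a warm-up, here is a direct, non-generic translation of those steps for an `int` array (the function name is my own):

/* Returns the index of key in arr[0..n-1], or -1 if the search fails. */
int lsearch_int(int key, int arr[], int n) {
    for (int i = 0; i < n; i++) {      /* steps 1, 3 and 4 */
        if (arr[i] == key) return i;   /* step 2 */
    }
    return -1;                         /* search terminated unsuccessfully */
}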

Implementing a generic linear search

Applying it over a known `int` space requires us to write a comparison function for integers:

int intcmp(void* elem1, void* elem2) {
    int* ip1 = (int*) elem1;
    int* ip2 = (int*) elem2;
    return *ip1 - *ip2;
}

Now putting everything together:
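A sketch of such a generic `lsearch` in the spirit of CS107 (the exact prototype here is an assumption), along with how it pairs with `intcmp`:

#include <stddef.h>

void* lsearch(void* key, void* base, int n, int elemSize,
              int (*cmpfn)(void*, void*)) {
    for (int i = 0; i < n; i++) {
        void* elemAddr = (char*)base + i * elemSize;   /* address of the i-th element */
        if (cmpfn(key, elemAddr) == 0) return elemAddr;
    }
    return NULL;   /* no element matched the key */
}

int key = 7;
int numbers[] = { 4, 2, 7, 9 };
int* found = lsearch(&key, numbers, 4, sizeof(int), intcmp);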

In case we want to search for a specific string in an array of strings (an array of `char*`, so the generic code sees each element through a `char**`), we can borrow the `Strcmp` function we implemented in part 1 (Single dereferencing of a char **) and do something like:

#include <string.h>

int Strcmp(void* vp1, void* vp2)
{
    char* s1 = *(char**) vp1;
    char* s2 = *(char**) vp2;
    return strcmp(s1, s2);
}

char* note = "Eb";
char* notes[] = { "Ab", "F#", "B", "Gb", "D" };
char** found = lsearch(&note, notes, 5, sizeof(char*), Strcmp);

Approaching C++

The idea of a class

First, let us establish that structs and classes in C++ are identical in all ways except for the default access specifier: for a `struct` the default is public, whereas for a `class` it is private. There's no other difference as far as the language is concerned.

The arrow operator (->) vs the dot (.) operator

dot (.)

  • The dot operator is used to access the content of a member of a struct (or class in C++) directly when you have an instance of the struct or class.

Example: struct_instance.member

arrow (->)

  • The arrow operator is used to access the content of a member of a struct (or class) when you have a pointer to the struct (or class);
  • The arrow operator is syntactic sugar for dereferencing the pointer before accessing the member content via the dot operator.

Example: ptr->member is equivalent to (*ptr).member
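A minimal illustration (the struct and names here are just for demonstration):

struct point {
    int x;
    int y;
};

void example(void) {
    struct point p = { 1, 2 };
    struct point* pp = &p;

    int a = p.x;       /* dot: we have the instance itself */
    int b = pp->y;     /* arrow: we have a pointer to the instance */
    int c = (*pp).y;   /* exactly equivalent to pp->y */
}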

Implementing a stack data structure for ints

Before we begin, we need to know that in C it is common practice to aggressively separate interface from implementation. This means that we should first create a `.h` file that declares the interface of the stack structure and then a `.c` file with the actual implementation, as sketched below.
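A minimal sketch of what those two files might contain for an `int` stack, in the spirit of the lecture (the function names and growth policy here are assumptions):

/* stack.h — the interface */
typedef struct {
    int* elems;       /* dynamically allocated array of ints */
    int logLength;    /* number of elements currently in use */
    int allocLength;  /* number of elements we have room for */
} stack;

void StackNew(stack* s);
void StackDispose(stack* s);
void StackPush(stack* s, int value);
int StackPop(stack* s);

/* stack.c — the implementation */
#include <stdlib.h>
#include <assert.h>
#include "stack.h"

void StackNew(stack* s) {
    s->logLength = 0;
    s->allocLength = 4;
    s->elems = malloc(4 * sizeof(int));
    assert(s->elems != NULL);   /* make sure malloc handed back a valid block */
}

void StackDispose(stack* s) {
    free(s->elems);
}

void StackPush(stack* s, int value) {
    if (s->logLength == s->allocLength) {
        s->allocLength *= 2;    /* double the capacity when full */
        s->elems = realloc(s->elems, s->allocLength * sizeof(int));
        assert(s->elems != NULL);
    }
    s->elems[s->logLength++] = value;
}

int StackPop(stack* s) {
    assert(s->logLength > 0);   /* popping an empty stack is a client error */
    return s->elems[--s->logLength];
}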

Note that `assert()` is just a macro we use here to make sure `malloc()`/`realloc()` actually handed back a valid memory block pointer.

Quick words about realloc()

  • Resizes a previously allocated block, deallocating the old one if the data has to be moved, and returns the (possibly relocated) resized block;
  • Has the advantage of checking whether there is available space to contiguously extend the memory in the heap for `s->elems`;
  • There is no similar facility for `new`/`delete` in C++;
  • Fun fact: malloc(size) ≡ realloc(NULL, size) (illustrated below)
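A tiny illustration of those points (error handling omitted for brevity):

int* a = realloc(NULL, 10 * sizeof(int));   /* behaves exactly like malloc(10 * sizeof(int)) */
a = realloc(a, 20 * sizeof(int));           /* extends the block in place if the heap has room,
                                               otherwise copies it to a new, larger block */
free(a);                                    /* one free() releases whatever realloc() returned */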

Implementing a generic stack data structure
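Below is a sketch along the same lines, with the element type abstracted away behind a raw byte buffer, an element size, and an optional `freefn` for elements that own heap memory (again, names and details are assumptions in the spirit of the lecture):

#include <stdlib.h>
#include <string.h>
#include <assert.h>

typedef struct {
    void* elems;             /* raw byte buffer holding the elements */
    int elemSize;            /* size of a single element, in bytes */
    int logLength;           /* number of elements currently stored */
    int allocLength;         /* number of elements we have room for */
    void (*freefn)(void*);   /* how to dispose of one element, or NULL */
} stack;

void StackNew(stack* s, int elemSize, void (*freefn)(void*)) {
    assert(elemSize > 0);
    s->elemSize = elemSize;
    s->logLength = 0;
    s->allocLength = 4;
    s->freefn = freefn;
    s->elems = malloc(4 * elemSize);
    assert(s->elems != NULL);
}

void StackDispose(stack* s) {
    if (s->freefn != NULL) {
        for (int i = 0; i < s->logLength; i++)
            s->freefn((char*)s->elems + i * s->elemSize);   /* free what each element owns */
    }
    free(s->elems);
}

static void StackGrow(stack* s) {
    s->allocLength *= 2;
    s->elems = realloc(s->elems, s->allocLength * s->elemSize);
    assert(s->elems != NULL);
}

void StackPush(stack* s, void* elemAddr) {
    if (s->logLength == s->allocLength) StackGrow(s);
    void* target = (char*)s->elems + s->logLength * s->elemSize;
    memcpy(target, elemAddr, s->elemSize);   /* copy the element's bytes into the buffer */
    s->logLength++;
}

void StackPop(stack* s, void* elemAddr) {
    assert(s->logLength > 0);
    s->logLength--;
    void* source = (char*)s->elems + s->logLength * s->elemSize;
    memcpy(elemAddr, source, s->elemSize);   /* hand the element's bytes back to the caller */
}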

Pause a bit to ponder on the above implementation: make sure you understand and agree with what each line is doing, and don't hesitate to go back to part 1 if you need to review something.

Quick words on memory disposal

Disposing basic types
If you dynamically allocate memory for basic types (e.g. int, float, char), `free()` will correctly deallocate the memory without any additional steps needed:

int *p = malloc(sizeof(int));
// Use p...
free(p); // No additional action needed

Disposing structures
For simple structures, like a fraction struct or any struct that only contains basic types, `free()` will deallocate the memory for the structure itself; but if the structure contains pointers to dynamically allocated memory (e.g. char**, pointers to structs, structs with pointers), you'll need to manually `free()` those inner allocations before freeing the structure:

typedef struct {
    char* name;
    int age;
} Person;

Person *p = malloc(sizeof(Person));
p->name = malloc(100 * sizeof(char)); // Dynamically allocate memory for name

// Use p...

free(p->name); // Free the inner allocation first
free(p); // Then free the structure itself

In this case, failing to free `p->name` before freeing `p` would lead to a memory leak. C doesn’t automatically traverse the structure to free any internal pointers, so you have to handle that manually.

That is why we had to create a custom `StringFree()` function for our `stringStack` use-case.
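A sketch of what that `StringFree()` might look like when the stack stores heap-allocated `char*` elements (so each element address the stack hands us is really a `char**`), using the generic stack sketched above:

void StringFree(void* elemAddr) {
    free(*(char**)elemAddr);   /* the stack stores char* elements, so elemAddr is a char** */
}

stack stringStack;
StackNew(&stringStack, sizeof(char*), StringFree);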

RAM

RAM is a common computing acronym that stands for random-access memory. It is often referred to as PC memory or simply memory. In essence, RAM is your computer’s short-term memory, where data needed by the CPU to run applications and open files is stored for quick access.

When memory is allocated — either statically on the stack via stack frames or dynamically on the heap — we are utilizing RAM. It is also referred to as primary storage.

Despite the lack of formal constraints, stack frame allocations are usually contiguous; even when they are not contiguous in physical memory, they usually are in virtual memory. More info at https://stackoverflow.com/questions/5086577/is-stack-memory-contiguous

Unlike stack frames, heap allocations are typically not contiguous. Heap memory is often fragmented and can consist of blocks that are scattered throughout both physical and virtual memory, depending on the allocation and deallocation patterns.

Zeroing in on the heap segment with malloc()

Zeroing in a bit more

When `int *arr = malloc(40*sizeof(int))` is executed, the heap manager actually sets aside more than the requested 160 bytes (assuming 4-byte ints). Along with the reserved memory blob lies a header that stores additional bookkeeping information, such as the size of the allocation.

General header information

  • Block size: The header usually stores the size of the allocated block;
  • Status: It may include a flag indicating whether the block is free or in use;
  • Pointers to neighboring blocks: In some memory allocators (especially those implementing a free list), the header might include pointers to the next and previous blocks in the list.
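As a purely illustrative sketch (real allocators lay their headers out differently), such a header might look like:

#include <stddef.h>

typedef struct block_header {
    size_t size;                 /* size of the block handed to the client */
    int in_use;                  /* whether the block is free or in use */
    struct block_header* next;   /* neighboring blocks, e.g. in a free list */
    struct block_header* prev;
} block_header;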

When `malloc()` runs, the memory management system stores the leading address of that memory blob and the address of its header in a symbol table, which `free()` eventually relies on in order to know how much memory to deallocate. Hence, one cannot simply give any pointer to `free()`; it needs to be the same pointer that was handed back by `malloc()` in the first place.

E.g.: `free(arr+1)` will not work, as only the leading address of the array `(arr+0)` is kept in the symbol table by `malloc()`, together with a header reference that records how much memory was allocated to the whole blob. The same is true for string arrays.
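In code:

int* arr = malloc(40 * sizeof(int));
/* ... use arr ... */
free(arr);         /* fine: arr is exactly the pointer malloc() handed back */
/* free(arr + 1);     undefined behavior: not an address returned by malloc() */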

Important reminder: since each OS applies different heuristics for memory management and heap segmentation, memory-specific interactions shouldn't be relied upon by the client. The way the heap is traversed when searching for available space is likewise subject to implementation trade-offs.

memcpy() vs memmove() and the overlapping issue

Prototype: void *memcpy(void *dest, const void *src, size_t n);

  • dest: Pointer to the destination memory location;
  • src: Pointer to the source memory location;
  • n: Number of bytes to copy from the source to the destination.

memcpy() assumes that the source (src) and destination (dest) memory areas do not overlap. The function directly copies n bytes from src to dest without performing any checks to see if these two memory regions might overlap.

If src and dest overlap, using memcpy() can lead to undefined behavior. This means that the outcome of the copy operation is unpredictable — data corruption or other unexpected issues might occur because the source data might be overwritten before it is fully copied.

If there is a possibility that the memory regions might overlap, the memmove() function should be used instead. Unlike memcpy(), memmove() is designed to handle overlapping memory areas safely and does so with a slightly higher performance cost than memcpy().
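For instance, shifting the contents of a buffer a couple of bytes to the right involves overlapping source and destination regions, so `memmove()` is the right tool (a small sketch):

#include <string.h>

void shift_example(void) {
    char buf[] = "abcdef";
    memmove(buf + 2, buf, 4);     /* safe even though the regions overlap: buf becomes "ababcd" */
    /* memcpy(buf + 2, buf, 4);      undefined behavior: source and destination overlap */
}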

Handles and heap compaction

Indirect approaches, like using “handles” or “indirect pointers,” can allow for heap compaction, which is a process where fragmented blocks of memory are consolidated to make larger contiguous blocks available.

Handles

A handle is essentially a level of indirection where a pointer (or reference) does not directly point to the memory location of the data but instead points to a handle (a fixed location). This handle, in turn, points to the actual data. Therefore, the pointers handed to the client are two hops away from the actual data.

In other words, the handle behaves like an entry in a lookup table kept on behalf of the client: it points at the current location of a memory segment and can be updated by the memory manager during heap optimization processes.

Heap Compaction

When heap memory becomes fragmented due to allocations and deallocations, the memory manager can move the actual data to a new location in order to create larger contiguous blocks of free memory. Since the application code only references the handle and not the direct memory address, the memory manager can update the handle to point to the new location of the data, without requiring the application code to change.
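A toy sketch of the idea (types and names here are purely illustrative):

#include <stdio.h>
#include <stdlib.h>

typedef void** Handle;   /* a handle: a fixed slot that points at relocatable data */

int main(void) {
    int* data = malloc(sizeof(int));
    *data = 42;

    Handle h = malloc(sizeof(void*));
    *h = data;                      /* handle -> data */
    printf("%d\n", *(int*)*h);      /* the client always dereferences twice: prints 42 */

    /* During compaction the memory manager may relocate the data... */
    int* moved = malloc(sizeof(int));
    *moved = *data;
    free(data);
    *h = moved;                     /* ...and only the handle is updated; client code is untouched */
    printf("%d\n", *(int*)*h);      /* still prints 42 */

    free(moved);
    free(h);
    return 0;
}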

Enjoyed it so far?
See you in part 3!
