The Linux Programming Interface notes: the implementation of malloc() and free()

2019-11-14 09:13:18

Source: reposted

Contributed by: site user

Although malloc() and free() provide an interface for allocating memory that is much easier to use than brk() and sbrk(), it is still possible to make various programming errors when using them. Understanding how malloc() and free() are implemented provides us with insights into the causes of these errors and how we can avoid them.

The implementation of malloc() is straightforward. It first scans the list of memory blocks previously released by free() in order to find one whose size is larger than or equal to its requirements. (Different strategies may be employed for this scan, depending on the implementation; for example, first-fit or best-fit.) If the block is exactly the right size, then it is returned to the caller. If it is larger, then it is split, so that a block of the correct size is returned to the caller and a smaller free block is left on the free list.
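The scan-and-split step can be sketched as follows. This is a minimal first-fit illustration, not the code of any real malloc(): the Header type, the toy_alloc() name, and the choice to count sizes in header-sized units (which keeps every block aligned) are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical block header. Sizes are counted in header-sized units so
 * that every block, and every payload, stays suitably aligned. */
typedef struct header {
    _Alignas(max_align_t) struct header *next; /* next free block, or NULL */
    size_t units;                              /* block size, header included */
} Header;

/* First-fit allocation from *free_list; returns NULL if nothing fits. */
static void *toy_alloc(Header **free_list, size_t nbytes)
{
    /* Payload rounded up to whole units, plus one unit for the header. */
    size_t nunits = (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;

    for (Header **link = free_list; *link != NULL; link = &(*link)->next) {
        Header *blk = *link;
        if (blk->units < nunits)
            continue;                 /* too small, keep scanning */
        if (blk->units == nunits) {
            *link = blk->next;        /* exact fit: unlink the whole block */
        } else {
            blk->units -= nunits;     /* split: the front part stays free */
            blk += blk->units;        /* the tail part goes to the caller */
            blk->units = nunits;
        }
        return blk + 1;               /* payload starts just past the header */
    }
    return NULL;                      /* no free block is large enough */
}
```

A best-fit variant would keep scanning after the first match and remember the smallest block that is still large enough.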

If no block on the free list is large enough, then malloc() calls sbrk() to allocate more memory. To reduce the number of calls to sbrk(), rather than allocating exactly the number of bytes required, malloc() increases the program break in larger units (some multiple of the virtual memory page size), putting the excess memory onto the free list.
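The rounding described here might look like the sketch below. The helper names round_up_to_pages() and grow_heap() are hypothetical, and a real allocator would typically request several pages at a time and then place the unused remainder on its free list, which this sketch leaves out.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() in <unistd.h> on glibc */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Round `need` up to a whole number of pages. */
static size_t round_up_to_pages(size_t need, size_t page_size)
{
    return (need + page_size - 1) / page_size * page_size;
}

/* Raise the program break by at least `need` bytes. On success, returns
 * the start of the new region and stores its real size in *got; returns
 * NULL if the page size is unavailable or sbrk() fails. */
static void *grow_heap(size_t need, size_t *got)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    size_t amount = round_up_to_pages(need, (size_t) page);
    void *old_break = sbrk((intptr_t) amount);
    if (old_break == (void *) -1)
        return NULL;

    *got = amount;        /* anything beyond `need` is excess for the free list */
    return old_break;
}
```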

Looking at the implementation of free(), things start to become more interesting. When free() places a block of memory onto the free list, how does it know what size that block is? This is done via a trick. When malloc() allocates the block, it allocates extra bytes to hold an integer containing the size of the block. This integer is located at the beginning of the block; the address actually returned to the caller points to the location just past this length value, as shown in Figure 7-1.
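The length trick can be imitated on top of the standard allocator. In the sketch below, block_header, sized_alloc(), sized_length(), and sized_free() are hypothetical names; the header is padded to the maximum alignment so that the address handed back to the caller remains suitably aligned, a detail the description above does not spell out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header placed immediately before the caller's bytes. */
typedef struct {
    _Alignas(max_align_t) size_t length;   /* size requested by the caller */
} block_header;

/* Allocate nbytes plus room for the header; return the address just past it. */
static void *sized_alloc(size_t nbytes)
{
    block_header *hdr = malloc(sizeof(block_header) + nbytes);
    if (hdr == NULL)
        return NULL;
    hdr->length = nbytes;
    return hdr + 1;
}

/* Recover the length by stepping back from the caller's pointer. */
static size_t sized_length(const void *payload)
{
    return ((const block_header *) payload - 1)->length;
}

/* Release the whole block, header included. */
static void sized_free(void *payload)
{
    if (payload != NULL)
        free((block_header *) payload - 1);
}
```

Because the header sits directly in front of the payload, a write to the bytes just before the returned pointer would silently corrupt the recorded length.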

When a block is placed on the (doubly linked) free list, free() uses the bytes of the block itself in order to add the block to the list, as shown in Figure 7-2.
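One way to picture this reuse of the block's own bytes is to overlay a node structure on the freed memory. The free_node layout and the two helper functions below are assumptions for illustration; the layout simply follows the length, previous, and next fields named in the text.

```c
#include <stddef.h>

/* Hypothetical view of a freed block: the bytes that used to hold user
 * data now hold the list pointers. */
typedef struct free_node {
    size_t length;              /* size of the block */
    struct free_node *prev;     /* previous free block, or NULL */
    struct free_node *next;     /* next free block, or NULL */
} free_node;

/* Treat `block` as a node and push it onto the head of the list. */
static void free_list_push(free_node **head, void *block, size_t length)
{
    free_node *node = block;
    node->length = length;
    node->prev = NULL;
    node->next = *head;
    if (*head != NULL)
        (*head)->prev = node;
    *head = node;
}

/* Remove `node` from the list, for example when it is reallocated. */
static void free_list_unlink(free_node **head, free_node *node)
{
    if (node->prev != NULL)
        node->prev->next = node->next;
    else
        *head = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;
    node->prev = node->next = NULL;
}
```

The sketch also makes clear why a block must be at least as large as the node structure before it can be placed on such a list.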

As blocks are deallocated and reallocated over time, the blocks of the free list will become intermingled with blocks of allocated, in-use memory, as shown in Figure 7-3.

Now consider the fact that C allows us to create pointers to any location in the heap, and modify the locations they point to, including the length, previous free block, and next free block pointers maintained by free() and malloc(). Add this to the preceding description, and we have a fairly combustible mix when it comes to creating obscure programming bugs. For example, if, via a misdirected pointer, we accidentally increase one of the length values preceding an allocated block of memory, and subsequently deallocate that block, then free() will record the wrong size block of memory on the free list. Subsequently, malloc() may reallocate this block, leading to a scenario where the program has pointers to two blocks of allocated memory that it understands to be distinct, but which actually overlap. Numerous other pictures of what could go wrong can be drawn.

To avoid these types of errors, we should observe the following rules:

> After we allocate a block of memory, we should be careful not to touch any bytes outside the range of that block. This could occur, for example, as a result of faulty pointer arithmetic or off-by-one errors in loops updating the contents of a block.

> It is an error to free the same piece of allocated memory more than once. With glibc on Linux, we often get a segmentation violation (SIGSEGV signal). This is good, because it alerts us that we’ve made a programming error. However, more generally, freeing the same memory twice leads to unpredictable behavior.

> We should never call free() with a pointer value that wasn’t obtained by a call to one of the functions in the malloc package.

> If we are writing a long-running program (e.g., a shell or a network daemon process) that repeatedly allocates memory for various purposes, then we should ensure that we deallocate any memory after we have finished using it. Failure to do so means that the heap will steadily grow until we reach the limits of available virtual memory, at which point further attempts to allocate memory fail. Such a condition is known as a memory leak.
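A couple of defensive habits follow from these rules. The sketch below is one possible convention rather than a requirement of the malloc package, and the helper names make_filled() and free_and_clear() are hypothetical: the loop bound keeps every write inside the block, and resetting the pointer after free() turns an accidental second call into free(NULL), which is defined to do nothing.

```c
#include <stdlib.h>
#include <string.h>

/* Build a string of `len` copies of `fill`. The block holds len + 1 bytes,
 * and the loop uses i < len (not i <= len), so no byte outside the block
 * is touched. */
static char *make_filled(size_t len, char fill)
{
    char *buf = malloc(len + 1);
    if (buf == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        buf[i] = fill;
    buf[len] = '\0';
    return buf;
}

/* Free *pp and clear it, so a repeated call passes NULL to free(). */
static void free_and_clear(char **pp)
{
    free(*pp);
    *pp = NULL;
}
```

Clearing the pointer only protects that one variable; other copies of the same address elsewhere in the program can still be freed twice.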

Tools and libraries for malloc debugging

Failure to observe the rules listed above can lead to the creation of bugs that are obscure and difficult to reproduce. The task of finding such bugs can be eased considerably by using the malloc debugging tools provided by glibc or one of a number of malloc debugging libraries that are designed for this purpose.

Among the malloc debugging tools provided by glibc are the following:

> The mtrace() and muntrace() functions allow a program to turn tracing of memory allocation calls on and off. These functions are used in conjunction with the MALLOC_TRACE environment variable, which should be defined to contain the name of a file to which tracing information should be written. When mtrace() is called, it checks to see whether this file is defined and can be opened for writing; if so, then all calls to functions in the malloc package are traced and recorded in the file. Since the resulting file is not easily human-readable, a script (also called mtrace) is provided to analyze the file and produce a readable summary. For security reasons, calls to mtrace() are ignored by set-user-ID and set-group-ID programs.

> The mcheck() and mprobe() functions allow a program to perform consistency checks on blocks of allocated memory; for example, catching errors such as attempting to write to a location past the end of a block of allocated memory. These functions provide functionality that somewhat overlaps with the malloc debugging libraries described below. Programs that employ these functions must be linked with the mcheck library using the cc -lmcheck option.

> The MALLOC_CHECK_ environment variable (note the trailing underscore) serves a similar purpose to mcheck() and mprobe(). (One notable difference between the two techniques is that using MALLOC_CHECK_ doesn’t require modification and recompilation of the program.) By setting this variable to different integer values, we can control how a program responds to memory allocation errors. Possible settings are: 0, meaning ignore errors; 1, meaning print diagnostic errors on stderr; and 2, meaning call abort() to terminate the program. Not all memory allocation and deallocation errors are detected via the use of MALLOC_CHECK_; it finds just the common ones. However, this technique is fast, easy to use, and has low run-time overhead compared with the use of malloc debugging libraries. For security reasons, the setting of MALLOC_CHECK_ is ignored by set-user-ID and set-group-ID programs.

Further information about all of the above features can be found in the glibc manual.
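As an illustration of the mtrace() and muntrace() calls described above, the sketch below brackets a small piece of work with tracing. The function name run_traced() and the idea of setting MALLOC_TRACE from inside the program are assumptions for the example; normally the variable would be set in the shell before the program starts. On recent glibc releases the tracing support may live in a separate debugging library that has to be preloaded, so an empty trace file does not necessarily mean the calls were wrong.

```c
#include <mcheck.h>
#include <stdlib.h>

/* Trace the allocations made between mtrace() and muntrace().
 * `trace_path` names the file that should receive the trace records.
 * Returns 0 on success, -1 on failure. */
static int run_traced(const char *trace_path)
{
    /* mtrace() reads MALLOC_TRACE when it is called. */
    if (setenv("MALLOC_TRACE", trace_path, 1) != 0)
        return -1;

    mtrace();                     /* start tracing */

    char *p = malloc(32);         /* recorded if tracing is active */
    if (p == NULL) {
        muntrace();
        return -1;
    }
    free(p);                      /* also recorded */

    muntrace();                   /* stop tracing */
    return 0;
}
```

The resulting file can then be passed to the mtrace script mentioned above to obtain a readable summary.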

A malloc debugging library offers the same API as the standard malloc package, but does extra work to catch memory allocation bugs. In order to use such a library, we link our application against that library instead of the malloc package in the standard C library. Because these libraries typically operate at the cost of slower run-time operation, increased memory consumption, or both, we should use them only for debugging purposes, and then return to linking with the standard malloc package for the production version of an application. Among such libraries are Electric Fence (http://www.perens.com/FreeSoftware/), dmalloc (http://dmalloc.com/), Valgrind (http://valgrind.org/), and Insure++ (http://www.parasoft.com/).

Both Valgrind and Insure++ are capable of detecting many other kinds of bugs aside from those associated with heap allocation. See their respective web sites for details.

Controlling and monitoring the malloc package

The glibc manual describes a range of nonstandard functions that can be used to monitor and control the allocation of memory by functions in the malloc package, including the following:

> The mallopt() function modifies various parameters that control the algorithm used by malloc(). For example, one such parameter specifies the minimum amount of releasable space that must exist at the end of the free list before sbrk() is used to shrink the heap. Another parameter specifies an upper limit for the size of blocks that will be allocated from the heap; blocks larger than this are allocated using the mmap() system call (refer to Section 49.7).

> The mallinfo() function returns a structure containing various statistics about the memory allocated by malloc().

Many UNIX implementations provide versions of mallopt() and mallinfo(). However, the interfaces offered by these functions vary across implementations, so they are not portable.
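A glibc-specific sketch of both functions is shown below. It assumes glibc's <malloc.h>; the wrapper names set_mmap_threshold() and heap_bytes_in_use() are hypothetical. M_MMAP_THRESHOLD is the glibc parameter corresponding to the upper size limit for heap-allocated blocks, and the sketch uses mallinfo2() where the glibc version provides it, since that variant reports sizes as size_t values.

```c
#include <malloc.h>
#include <stddef.h>
#include <stdlib.h>

/* Ask glibc to satisfy requests of `bytes` or more with mmap().
 * mallopt() returns 1 on success and 0 on error. */
static int set_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes);
}

/* Bytes currently in use in heap-allocated blocks, as reported by the
 * allocator's own statistics. */
static size_t heap_bytes_in_use(void)
{
#if __GLIBC_PREREQ(2, 33)
    struct mallinfo2 mi = mallinfo2();   /* size_t fields */
#else
    struct mallinfo mi = mallinfo();     /* int fields, may wrap on big heaps */
#endif
    return (size_t) mi.uordblks;
}
```

Calling heap_bytes_in_use() before and after a block of work gives a rough view of how much heap memory that work retained.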
