Programming

Accessing Files with Memory-Mapping

Memory-mapping is a mechanism that maps a portion of a file, or an entire file, on disk to a range of addresses within an application's address space. The application can then access files on disk in the same way it accesses dynamic memory. This makes file reads and writes faster in comparison with using functions such as fread and fwrite.

Another advantage of using memory-mapping in MATLAB is that it enables you to access file data using standard MATLAB indexing operations. Once you have mapped a file to memory, you can read the contents of that file using the same type of MATLAB statements used to read variables from the MATLAB workspace. The contents of the mapped file appear as if they were an array in the currently active workspace. You simply index into this array to read or write the desired data from the file.
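As a minimal sketch of the indexing described above, the following creates a small scratch file and reads it back through a memory map. It assumes the memmapfile class covered later in this chapter; the file name and its contents are hypothetical demo values, not part of the original text.

```matlab
% Write ten doubles to a scratch file (hypothetical demo data).
filename = [tempname '.dat'];
fid = fopen(filename, 'w');
fwrite(fid, (1:10)', 'double');
fclose(fid);

% Map the whole file as a column vector of doubles.
m = memmapfile(filename, 'Format', 'double');

% Read from the file with ordinary MATLAB indexing on the Data property.
firstThree = m.Data(1:3);
tenthValue = m.Data(10);

% Release the map before deleting the scratch file.
clear m
delete(filename);
```

No call to fread is needed for the reads; the indexing expressions pull the values from the mapped file.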


Overview of Memory-Mapping in MATLAB

This section describes the benefits and limitations of memory-mapping in MATLAB. The last part of this section gives details on which types of applications derive the greatest advantage from using memory-mapping.

Benefits of Memory Mapping

The principal benefits of memory mapping are efficiency, faster file access, the ability to share memory between applications, and more efficient coding.

Faster File Access.   Accessing files via a memory map is faster than using I/O functions such as fread and fwrite. Data is read and written using the virtual memory capabilities that are built into the operating system, rather than by allocating, copying into, and then deallocating data buffers owned by the process.

MATLAB does not access data from the disk when the map is first constructed. It only reads or writes the file on disk when a specified part of the memory map is accessed, and then it only reads that specific part. This provides faster random access to the mapped data.

Efficiency.   Mapping a file into memory allows access to the data in the file as if that data had been read into an array in the application's address space. MATLAB does not allocate physical memory for the array until you access a specific area in the mapped region. As a result, memory-mapped files provide a mechanism by which applications can access data segments in an extremely large file without having to read the entire file into memory first.
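To illustrate access to one segment of a file without reading the rest, the sketch below maps only ten values from the middle of a scratch file. It assumes the Offset and Repeat properties of the memmapfile class; the 1000-element file is a small hypothetical stand-in for an extremely large one.

```matlab
% Create a scratch file holding the values 1..1000 as doubles.
filename = [tempname '.dat'];
fid = fopen(filename, 'w');
fwrite(fid, (1:1000)', 'double');
fclose(fid);

% Map only elements 501..510: skip 500 doubles (8 bytes each),
% then expose the next 10 doubles.
bytesPerDouble = 8;
m = memmapfile(filename, 'Format', 'double', ...
    'Offset', 500 * bytesPerDouble, 'Repeat', 10);

segment = m.Data;   % only this part of the file is mapped and read

clear m
delete(filename);
```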

Efficient Coding Style.   Memory mapping eliminates the need for explicit calls to the fread and fwrite functions. In MATLAB, if x is a memory-mapped variable, and y is the data to be written to a file, then writing to the file is as simple as assigning y to the Data property of x: x.Data = y;
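A short sketch of writing through a map follows. It assumes a memmapfile object created with the Writable property set to true, and that y has the same size and class as the mapped data; the scratch file and values are hypothetical.

```matlab
% Create a scratch file of five zeros to write into.
filename = [tempname '.dat'];
fid = fopen(filename, 'w');
fwrite(fid, zeros(5, 1), 'double');
fclose(fid);

% Map the file with write access.
x = memmapfile(filename, 'Format', 'double', 'Writable', true);

y = (10:10:50)';   % data to write; same size and class as the mapped data
x.Data = y;        % write the whole mapped region
x.Data(2) = -1;    % write a single element by index

% Unmap, then read the file conventionally to confirm the changes.
clear x
fid = fopen(filename, 'r');
onDisk = fread(fid, 5, 'double');
fclose(fid);
delete(filename);
```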

Sharing Memory Between Applications.   Memory-mapped files also provide a mechanism for sharing data between applications, as shown in the figure below. This is achieved by having each application map sections of the same file. This feature can be used to transfer large data sets between MATLAB and other applications.

Also, within a single application, you can map the same segment of a file more than once.
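The sketch below maps the same file twice within one MATLAB session, writing through one map and reading through the other. It assumes that both maps share the underlying file data, as the paragraphs above describe; the variable names writer and reader are hypothetical. Two separate applications mapping the same file would follow the same pattern.

```matlab
% Create a 16-byte scratch file of zeros.
filename = [tempname '.dat'];
fid = fopen(filename, 'w');
fwrite(fid, zeros(1, 16), 'uint8');
fclose(fid);

% Two independent maps onto the same file.
writer = memmapfile(filename, 'Format', 'uint8', 'Writable', true);
reader = memmapfile(filename, 'Format', 'uint8');

% A value written through one map is read back through the other.
writer.Data(1) = uint8(42);
seen = reader.Data(1);

clear writer reader
delete(filename);
```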

Limitations of Memory-Mapping in MATLAB

MATLAB restricts the size of a memory map to 2 gigabytes and, on some platforms, requires that you set up your memory mapping so that all data access is properly aligned. See Maximum Size of a Memory Map and Aligned Access on Sol2 and HP-UX for more information.

Maximum Size of a Memory Map.   Due to limits set by the operating system, the maximum amount of data you can map with a single instance of a memory map is 2^31 - 1 bytes (just under 2 GB). If you need to map more than 2 GB, you can either create separate maps for different regions of the file, or you can move the 2 GB window of one map to different locations in the file.

The 2 GB limit also applies to 64-bit platforms. However, because 64-bit platforms have a much larger address space, they can support having many more map instances in memory at any given time.
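Moving one map's window through a file can be sketched as follows. The example assumes that the Offset property of an existing memmapfile object can be reassigned. A hypothetical window of 25 elements stands in for the 2 GB limit so that the sketch runs on a small scratch file.

```matlab
% Create a scratch file holding the values 1..100 as doubles.
filename = [tempname '.dat'];
fid = fopen(filename, 'w');
fwrite(fid, (1:100)', 'double');
fclose(fid);

windowElems    = 25;   % hypothetical window size (stand-in for the 2 GB limit)
bytesPerDouble = 8;

% One map that exposes windowElems doubles at a time.
m = memmapfile(filename, 'Format', 'double', 'Repeat', windowElems);

% Slide the window across the file by changing Offset, summing as we go.
total = 0;
for k = 0:3
    m.Offset = k * windowElems * bytesPerDouble;
    total = total + sum(m.Data);
end

clear m
delete(filename);
```

The alternative mentioned above, separate maps for different regions, would create one memmapfile object per window instead of reassigning Offset.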

Aligned Access on Sol2 and HP-UX.   The Sol2 and HP-UX platforms only support aligned data access. This means that numeric values of type double that are to be read from a memory-mapped file must start at some multiple of 8 bytes from the start of the file. (Note that this is from the start of the file, and not the start of the mapped region.) Furthermore, numeric values of type single and also 32-bit integers must start at multiples of 4 bytes, and 16-bit integers at 2-byte multiples.

If you attempt to map a file on Sol2 or HP-UX without taking these alignment considerations into account, MATLAB generates an error.
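The alignment rules above reduce to a remainder check on the byte offset, measured from the start of the file. The following sketch uses a hypothetical offset of 1026 bytes; the variable names are illustrative only.

```matlab
offset = 1026;   % hypothetical byte offset from the start of the file

% A value may start at this offset only if the offset is a multiple
% of the size of its type.
okForDouble = mod(offset, 8) == 0;   % double: 8-byte multiples
okForSingle = mod(offset, 4) == 0;   % single and 32-bit integers: 4-byte multiples
okForInt16  = mod(offset, 2) == 0;   % 16-bit integers: 2-byte multiples

% Next offset at or beyond the current one that is valid for doubles.
nextDoubleOffset = ceil(offset / 8) * 8;
```

Here only the 16-bit check passes, and the nearest offset usable for doubles is 1032.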

Byte Ordering

Memory mapping works only with data that has the same byte ordering scheme as the native byte ordering of your operating system. For example, because both Linux and Windows use little-endian byte ordering, data created on a Linux system can be read on Windows. You can use the computer function to determine the native byte ordering of your current system.
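A quick check of the native byte ordering, using the third output of the computer function, might look like this. The variable isLittleEndian is an illustrative name.

```matlab
% The third output of computer is 'L' for little-endian systems
% and 'B' for big-endian systems.
[archName, maxElements, endian] = computer;

isLittleEndian = strcmp(endian, 'L');
```

A file whose data was written with the opposite byte ordering would need to be converted before its mapped values are meaningful.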

When to Use Memory Mapping

Just how much advantage you get from mapping a file to memory depends mostly on the size and format of the file, the way in which data in the file is used, and the computer platform you are using.

When Memory Mapping Is Most Useful.   Memory-mapping works best with binary files, and in the following scenarios:

When the Advantage Is Less Significant.   The following types of files do not fully utilize the benefits of memory mapping:



© 1994-2005 The MathWorks, Inc.