How Much Space Does MongoDB Take in RAM?
MongoDB is a non-relational, document-oriented database designed to let developers build internet and business applications quickly. It scales horizontally to handle high data loads.
Each document in a MongoDB collection takes up space on disk. Under the older MMAPv1 storage engine, each record is also allocated a bit of extra space as padding, so a document can grow in place until it outgrows its original allocation.
Memory
The amount of memory that MongoDB tries to keep in RAM depends on the structure of your data. It also depends on whether you use MMAPv1 or WiredTiger as the storage engine.
MMAPv1 uses memory mapped files and relies on the operating system cache to keep data in RAM. WiredTiger keeps its own internal cache and also benefits from the operating system's filesystem cache. Caching makes it less likely that the database will need to read pages from disk. But it also means that a large share of memory is held as cache and is not available for other uses.
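As a rough illustration of how much RAM the WiredTiger internal cache claims by default, the sketch below applies the documented default for MongoDB 3.4 and later: the larger of 50% of (RAM minus 1 GB) and 256 MB. The function name is hypothetical, and the result covers only the internal cache, not the filesystem cache or per-connection memory.

```python
def default_wiredtiger_cache_gb(total_ram_gb: float) -> float:
    """Default WiredTiger internal cache size in GB (MongoDB 3.4+ rule).

    Hypothetical helper: max(50% of (RAM - 1 GB), 0.25 GB).
    """
    return max(0.5 * (total_ram_gb - 1.0), 0.25)


if __name__ == "__main__":
    for ram_gb in (1, 4, 16, 64):
        cache_gb = default_wiredtiger_cache_gb(ram_gb)
        print(f"{ram_gb} GB RAM -> {cache_gb:.2f} GB internal cache")
```

The default can be overridden with the storage.wiredTiger.engineConfig.cacheSizeGB configuration setting if other processes on the host need the memory.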
For this reason, it is important to design your application so that the “Working Set”, the data and indexes it touches regularly, fits in RAM. If it doesn’t, you may experience performance penalties as parts of the Working Set are repeatedly evicted from memory and read back from disk. This also applies if you are using a large number of secondary indexes. There are MongoDB admin commands, such as compact, that can be run to shrink or defragment the data files and indexes. However, these can require up to 2GB of additional disk space during the compaction process.
Disk
The disk performance of MongoDB is impacted by the number and size of data files. This is especially important in environments with a large working set that exceeds available memory.
When the working set of a mongod process exceeds the amount of memory allocated to it, page faults occur. These page faults are operating system operations that cause MongoDB to read pages of its data files from disk into memory. This can significantly impact database throughput.
For optimal MongoDB performance, the majority of the working set should fit in memory. This is a key consideration when sizing replica sets and sharded clusters.
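A back-of-the-envelope check of whether the working set fits in memory could look like the sketch below. It assumes the input is a dictionary shaped like the output of the dbStats command (the dataSize and indexSize fields, in bytes). The function names and the hot_data_fraction parameter are hypothetical: the share of data that is actually accessed regularly depends on the application and has to be estimated or measured.

```python
GIB = 1024 ** 3


def estimate_working_set_bytes(db_stats: dict, hot_data_fraction: float = 0.25) -> int:
    """Rough working-set estimate: all indexes plus the 'hot' share of the data.

    hot_data_fraction is an assumption about the workload, not a MongoDB setting.
    """
    return int(db_stats["indexSize"] + hot_data_fraction * db_stats["dataSize"])


def fits_in_ram(db_stats: dict, ram_bytes: int, hot_data_fraction: float = 0.25) -> bool:
    """True if the estimated working set is no larger than the memory available."""
    return estimate_working_set_bytes(db_stats, hot_data_fraction) <= ram_bytes


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from a real deployment.
    sample_stats = {"dataSize": 40 * GIB, "indexSize": 6 * GIB}
    print(estimate_working_set_bytes(sample_stats) / GIB, "GiB estimated working set")
    print("fits in 32 GiB:", fits_in_ram(sample_stats, 32 * GIB))
    print("fits in 8 GiB:", fits_in_ram(sample_stats, 8 * GIB))
```

The memory figure passed in should be what is really available to the database cache, not the total RAM of the host.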
When a new document is added and MongoDB cannot fit it into its existing data files, it allocates a new file. Because data files are preallocated, this can leave several GB of unused space in them. This can be addressed with administrative commands that compact the data files. Deleting documents does not by itself shrink the files; the freed space is only reused for later writes.
RAM Cache
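Many people confuse the memory that MongoDB reports using with memory that is permanently taken away from other applications on the machine. With memory mapped files, much of that figure is data held in the operating system's page cache, which the operating system can reclaim when other processes need it. It is still real RAM, however, and a database that competes with other applications for memory will affect overall system performance.
When a query is made, it reads pages of data that have been loaded from disk into RAM. If a page it needs is not in memory, that page must be read from disk first. This is because the data is split up into small pages that are brought into memory as they are accessed.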
Under the MMAPv1 storage engine, MongoDB allocates a bit of extra space at the end of each record, referred to as padding. This lets a document grow in place; once it outgrows its record it has to be moved, which is an expensive copy operation. WiredTiger does not use padding.
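To make the cost of padding concrete, here is a simplified sketch of the power of 2 sized allocation strategy that MMAPv1 uses by default from MongoDB 3.0: record sizes are powers of 2 starting at 32 bytes, and records above 2 MB are rounded up to the next multiple of 2 MB. The function name is hypothetical, and the sketch ignores record header overhead, so treat the numbers as approximate.

```python
TWO_MB = 2 * 1024 * 1024
MIN_RECORD_BYTES = 32


def power_of_2_record_size(doc_bytes: int) -> int:
    """Approximate on-disk record size for a document of doc_bytes bytes."""
    if doc_bytes <= 0:
        raise ValueError("doc_bytes must be positive")
    if doc_bytes > TWO_MB:
        # Round up to the nearest multiple of 2 MB (ceiling division).
        return -(-doc_bytes // TWO_MB) * TWO_MB
    size = MIN_RECORD_BYTES
    while size < doc_bytes:
        size *= 2
    return size


if __name__ == "__main__":
    for doc in (100, 1000, 5000):
        record = power_of_2_record_size(doc)
        print(f"{doc} byte document -> {record} byte record, {record - doc} bytes of padding")
```

A document just over a power of 2 boundary wastes close to half of its record, which is one reason on-disk size can be noticeably larger than the raw document size.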
File System
MongoDB stores its data on disk in BSON format, a binary representation of JSON documents that allows for fast parsing and indexing. MongoDB can also be scaled both horizontally and vertically. These scalability features have made it a popular choice among developers working on a variety of applications for both web and enterprise environments.
For example, Shutterfly switched from Oracle to MongoDB and has over 6 billion images stored in it. Electronic Arts, the video game developer, also uses it for their FIFA online games.
When you run the repairDatabase command to repair a database, it requires free disk space equal to the size of your current data set plus 2 gigabytes. If the volume that holds the dbpath does not have that much room, use a separate volume for the repair (the --repairpath option) to avoid consuming all available space on the data volume. The extra space is necessary because the operation rewrites and defragments the associated storage. Also, note that the repair process blocks other operations on the database while it runs.
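The free-space rule above (current data set plus 2 gigabytes) is easy to check before starting a repair. The sketch below is a minimal pre-flight check using only the Python standard library; the function names are hypothetical, and the data set size has to be supplied by the caller, for example from the database's own statistics.

```python
import shutil

TWO_GB = 2 * 1024 ** 3


def required_repair_space(data_set_bytes: int) -> int:
    """Free space needed for a repair: current data set size plus 2 GB."""
    return data_set_bytes + TWO_GB


def has_room_for_repair(data_set_bytes: int, free_bytes: int) -> bool:
    """True if free_bytes covers the space a repair is expected to need."""
    return free_bytes >= required_repair_space(data_set_bytes)


if __name__ == "__main__":
    # "." stands in for the volume that would hold the repair files.
    free = shutil.disk_usage(".").free
    data_set = 10 * 1024 ** 3  # illustrative 10 GB data set
    print("enough space for repair:", has_room_for_repair(data_set, free))
```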