Monday, 13 February 2012

How to Write High-Performance C# Code

http://dotnet.sys-con.com/node/46342
Writing code that runs quickly is sometimes at odds with writing code quickly. C.A.R. Hoare, computer science luminary and discoverer of the QuickSort algorithm, famously proclaimed, "Premature optimization is the root of all evil." The extreme programming design principle of "You Aren't Gonna Need It" (YAGNI) argues against implementing any features, including performance optimizations, until they're needed.
Writing unnecessary code is undoubtedly bad for work efficiency. However, it's important to realize that different situations have different needs. Code for vehicular real-time control systems has inherent up-front responsibilities for stability and performance that aren't present in, say, a small one-off departmental application. Therefore, it's more important in such code to optimize early and often.
Performance tuning for real-world applications often involves activities geared towards finding bottlenecks: in code, in the network transport layer, and at transaction boundaries. However, these techniques alone cannot solve the dreaded problem of uniformly slow code, which surfaces when large bottlenecks have been resolved but the code still exhibits inadequate performance. This is code that has been written without attention to correct usage, often by junior programmers, in the same style across whole modules or applications. Unfortunately, the best solution for this problem is to make sure that all programmers on a project follow correct coding practice when writing the code the first time; coding guidelines and good shared libraries help enormously.
This article presents helpful tips for writing in-process .NET managed code that performs well. It's assumed that basic programming skills such as factoring control structures, pulling work outside of loops whenever possible, caching variables for reuse, use of the switch statement, and the like are known to the average reader.
All code examples referred to in this article can be downloaded from the .NET Developer's Journal Web site, at www.sys-con.com/dotnet/sourcec.cfm. The code comes with a Windows Forms application that can be used to easily view the code and run all the tests on your own machine. You'll need the .NET Runtime 1.1 to run the code.
Tools
While testing tools such as NUnit and the upcoming VS.NET 2005 Team System can help you find bottlenecks, when tuning small sections of code, there's still no substitute for the micro-benchmark. This is because most generic testing frameworks depend on things like delegates, attributes, and/or interface method calls to do testing, and the code usually is not written with benchmarking primarily in mind. This can be very significant if you're interested in measuring the execution time of a batch of code down to the microsecond or even nanosecond level.
A micro-benchmark consists of a tight loop isolating the code that's being tested for performance, with a time reading before and after. When the test has finished, the start time is subtracted from the end time, and this duration is divided by the number of iterations to get the per-iteration time cost. The following code shows a simple micro-benchmark construct:


int loopCount = 1000000000;
long startTime, endTime;
double nanoseconds;

startTime = DateTime.Now.Ticks * 100;
for(int x = 0; x < loopCount; x++) {
    // put the code to be tested here
}
endTime = DateTime.Now.Ticks * 100;
nanoseconds = ((double)(endTime - startTime)) / ((double)loopCount);
Console.WriteLine(nanoseconds.ToString("F") + " ns per operation");
When performing a simple micro-benchmark, it's important to remember a couple of things. First, small fluctuations (noise) are normal, so to obtain the most accurate results, each test should be run several times. In particular, the first set of tests executed after program initialization may be skewed due to the lazy acquisition of resources by the .NET runtime. Also, if your results are very inconsistent, you may not have penetrated the "noise floor" of the measurement. The best solution for this is to increase the number of loops and/or tests. Another thing to remember is that looping itself introduces overhead, and for the most accurate readings, you should subtract this from the result. On a P4-M 2-GHz laptop, the per-loop overhead for a for loop with an int counter in release mode is around 1 nanosecond.
I'd never advocate running each code fragment from a long program through micro-benchmarks, but benchmarking is a good way to become familiar with the relative costs of different types of expressions. True knowledge of the performance of your code is built on actual observations. As time goes on, you'll find yourself needing such tests less and less, and you'll keep track in the back of your head of the relative performance of the statements you're writing.
Another important tool is ildasm.exe, the IL disassembler. With it, you can inspect the IL of your release builds to see if your assumptions are correct about what's going on under the covers. IL is not hard to read for a person familiar with the .NET framework; if you're interested in learning more, I suggest starting with Serge Lidin's book on the subject.
A great free tool for decompiling IL to C# or VB source, Reflector, is found at www.aisto.com/roeder/dotnet/; it's incredibly useful for viewing code that ships with the .NET Framework, for those of you less familiar with IL.
The CLR Profiler, available as a free download from Microsoft's Web site, allows you to track memory allocation and garbage collection activity, among other useful features. Also, the MSDN Web site has excellent coverage of performance metrics tools such as WMI and performance counters.
Working with Objects and Value Types
Objects: A Double Whammy
Objects are expensive to use, partly because of the overhead involved in allocating memory from the heap (which is actually well-optimized in .NET) and partly because every created object must eventually be destroyed. The destruction of an object may take longer than its creation and initialization, especially if the class contains a custom finalization routine. Also, the garbage collector runs in an indeterministic way; there's no guarantee that an object's memory will be immediately reclaimed when it goes out of scope, and until it's collected, this wasted memory can adversely affect performance.
The Garbage Collector in a Nutshell
It's necessary to understand garbage collection to appreciate the full impact of using objects. The single most important fact to know about the garbage collector is that it divides objects into three "generations": 0, 1, and 2. Every object starts out in generation 0; if it survives (if at least one reference is maintained) long enough, it goes to 1; much later, it transitions to 2. The cost of collecting an object increases with each generation. For this reason, it's important to avoid creating unnecessary objects, and to destroy each reference as quickly as possible. The objects that are left will often be long-lived and won't be destroyed until application shutdown.
Lazy Instantiation/Initialization
The Singleton design pattern is often used to provide a single global instance of a class. Sometimes it's the case that a particular singleton won't be needed during an application run. It's generally good practice to delay the creation of any object until it's needed, unless there's a specific need to the contrary - for instance, to pre-cache slow-initializing objects such as database connections. The "double-checked locking" pattern is useful in these situations, as a way to avoid synchronization and still ensure that a needed action is only performed once. Lazy initialization is a technique that can enhance the performance of an entire application through object reduction.
Avoiding Use of Class Destructors
Class destructors (implemented as the Finalize() method in VB.NET) cause extra overhead for the garbage collector, because it must track which objects have been finalized before their memory can be reclaimed. I've never had a need for finalizers in a purely managed application.
Casting and Boxing/Unboxing Overhead
Casting is the dynamic conversion of a type at runtime to another, and boxing is the creation of a reference wrapper for a value type (unboxing being the conversion back to the wrapped value type). The overhead of both is most heavily felt in collections classes, as they all - with the exception of certain specialized ones like StringDictionary - store each value as an Object. For instance, when you store an Int32 in an ArrayList, it is first boxed (wrapped in an object) when it is inserted; each time the value is read, it is unboxed before it is returned to the calling code.
This will be fixed in the next version of .NET with the introduction of generics, but for now you can avoid it by creating strongly typed collections and by typing variables and parameters as strongly as possible. If you're unsure about whether or not boxing/unboxing is taking place, you can check the IL of your code for appearances of the box and unbox keywords.
Trusting the Garbage Collector
Programmers new to .NET sometimes worry about memory allocation to the point that they explicitly invoke System.GC.Collect(). Garbage collection is a fairly expensive process, and it usually works best when left to its own devices. The .NET garbage collection scheme can intentionally delay reclamation of objects until memory is available, and in particular longer-lived objects (those that make it to generation 1 or 2) may not be reclaimed for an extended period. Even a simple "Hello, world!" console application may allocate 15 MB or more of memory for its "working set." My advice: don't call GC.Collect() unless you really know what you're doing.
Properties, Methods, and Delegates
Avoiding Overuse of Property Getters and Setters
Most people don't realize that property getters and setters are similar to methods when it comes to overhead; it's mainly syntax that differentiates them. A non-virtual property getter or setter that contains no instructions other than the field access will be inlined by the compiler, but in many other cases, this isn't possible. You should carefully consider your use of properties; from inside a class, access fields directly (if possible), and never blindly call properties repeatedly without storing the value in a variable. All that said, this doesn't mean that you should use public fields! Example 1 demonstrates the performance of properties and field access in several common situations.
About Delegates
Delegates are slower to execute than interface methods. Delegates are often used to introduce a level of indirection in code, but in almost all cases interfaces allow a cleaner design. Of course, it's impossible to completely shun delegates; the entire event-handling paradigm in .NET is based on them. Example 2 compares the performance of delegates and direct method calls.
Minimizing Method Calls
The .NET compiler is capable of performing many optimizations for release builds. One of them is called "method inlining." If method A calls method B and certain other conditions are met, such as the code in method B being small enough, the code from B will be copied into A during compilation. However, .NET won't or can't inline certain types of methods, such as virtual methods or methods over a certain size. Each method invocation/property access entails significant overhead, such as the allocation of a stack frame, etc. Of course, you should never repeatedly call a method for the same result on purpose, but you should also be mindful of the impact of method calls in general.

Using the 'Sealed' Keyword
Wherever extensibility is not required, you should use the sealed keyword. This makes your design easier to understand, as someone can tell at a glance if a certain class or method isn't meant to be extended or overridden. It also increases the chances that the compiler will inline code.
Working with Collections
Avoid Overuse of Collections
This might sound strange, but I'm not advocating working without data structures. The fact is, I've seen collections used many times when they don't actually simplify the code or provide any benefit at all. The single biggest avoidable use of collections is the use of an ArrayList when a simple array would suffice. It should be obvious that there's no way that calling a collection method - which may do significant work under the covers - can compare to something as simple as an array access for performance. See Example 3 for more information.
for vs foreach
The rumor abounds that the foreach loop is bad for performance. The truth is actually a little more complicated. Basically, foreach involves no performance penalty when used against arrays. However, when used against lists it involves the same overhead as creating an enumerator and using it within a try/catch block! Ildasm.exe may come in handy here to see what's going on. This isn't a complete killer, but if you do enough list access you may want to avoid it. Example 4 compares the two statements for access against an array and an ArrayList.
Enumerators: Don't Go Overboard
Just because the possibility exists for the enumerating of a collection doesn't mean you have to do it. For instance, the ArrayList class is useful as an array replacement; one of its best features is indexed element access. By using an enumerator with ArrayList, you hide its most useful features and introduce needless overhead. Example 5 shows the performance difference between indexed access and the use of an enumerator.
Avoid Overuse of Collection Wrappers
A peek at the code in classes such as ArrayList shows that, to implement synchronized, fixed-size, and read-only versions of these, methods such as Synchronized() actually create a new wrapper list around the original one (this scheme was copied from Java). While this is nice from certain design standpoints, it degrades performance; the method-call overhead for each operation is multiplied. For a fixed-size, synchronized ArrayList, this overhead is tripled! Chances are, if you're working with a collection and need synchronization, the code using the collection is already synchronized. In any case, simply locking on the collection itself around every access turns out to be faster than using a synchronized wrapper. Example 6 compares the use of a synchronized ArrayList with synchronizing access to an ArrayList.
Working With Strings
Don't Use String.Format() to Concatenate Strings
While string-formatting routines built into .NET are very useful for globalization and other tasks, they're not meant to be used for appending strings to each other. Example 7 tests the performance difference between String.Format() and string concatenation.
Use StringBuilder to Build Strings Inside Loops
The StringBuilder class is basically an array list for string fragments; the StringBuilder takes care of expanding its internal char array, hiding this from the user. The use of this class is also specially optimized by the .NET compiler, making it impossible to duplicate this functionality with equivalent performance (for instance, by manipulating char arrays directly and converting to a string afterwards). Example 8 shows the performance benefit of building a large string using StringBuilder instead of string concatenation.
Don't Use StringBuilder to Concatenate Small Numbers of Strings
Many .NET developers who consider themselves well-versed in performance matters advocate the use of the StringBuilder class whenever possible. However, it's not the fastest approach for concatenating small numbers of strings. Actually, any number can be combined in a single statement, although the performance benefit tends to dwindle above five or six substrings. This is due to instantiation and destruction overhead for the StringBuilder instance, as well as method-call overhead involved in calling Append() once for every added substring and ToString() once the string is built. What are the alternatives? Plus-sign concatenation and the String.Concat() method are equivalent; I prefer plus signs for readability. Note that this cannot be used across loops because the entire concatenation must occur within a single statement. Example 9 tests String.Concat() vs. StringBuilder for various numbers of substrings.
Don't Be Afraid to Use String Literals
Many developers incorrectly assume that a new object is created for every string literal, and, therefore, avoid their use. In some cases, using string literals directly in your code is a better approach than using string constants! It can make the code easier to understand and has no adverse impact on performance. This is due to the use of the .NET interned string table; this table maintains a String instance for every known string and reuses this instance whenever the same character sequence is used as a literal. See the documentation of the String.Intern() method for more details. Example 10 compares string literals to the use of string constants.
Threading
In itself, this is a huge topic. Most easy-to-apply, synchronization-related optimizations focus on minimizing the amount of code executing inside synchronized blocks, which may involve moving some code from inside to outside such blocks. In some cases, thoughtful programming may allow elimination of synchronization entirely from a class (see "Immutable Objects").
Performing a multithreaded micro-benchmark is more complicated than running a simple loop. In the simplest method, multiple threads are spawned, each looping over the benchmarked code; the reading is not started until all threads are successfully executing the code, and is terminated before the threads are shut down.
Thread Reuse
Newbie programmers make the common mistake of spawning a new thread for every request or other action. This can result in worse performance than using a single-threaded approach; the relative performance degradation is worse the quicker the individual requests can be serviced.
Writing Your Own Threading Code
Even better than the .NET thread pool, with its dependence on delegates, is to write your own threading code. Instead of using QueueUserWorkItem(), you typically write your own queueing code to coordinate work items. This also allows other benefits such as priority-based queueing.
Immutable Objects
Immutable objects are objects in which the data cannot be changed. In most cases, this is achieved by setting all fields in constructor methods and providing only property getters and/or methods that retrieve data from the object, without any mutator logic whatsoever. Many classes in the .NET Common Type System are immutable: System.String, System. Drawing.Font, etc. In addition, care should be taken that any values returned from property accessors, etc. are immutable as well. Otherwise, this data may be copied to insure the integrity of the object itself. Example 11 shows the performance benefit of immutable objects over synchronization.
Data Copying
This flies in the face of the advice given earlier, to minimize the use of objects. However, it's really the other side of the coin from immutable objects. Object copying allows you to use the data in a non-immutable object, but in a way that still completely avoids synchronization. The more highly multithreaded the environment, the more strategies like this make sense.
Read-Write Locks
Synchronization issues in managed code mirror those in databases. In some situations, optimistic concurrency strategies can be used; in some dirty reads are acceptable, etc. For situations in which a structure is seldom updated and often read, the ReaderWriterLock class can give significant performance benefits over simple synchronization. It allows either multiple read access or single write access at once. Example 12 compares ReaderWriterLock to simple synchronization in a read-heavy scenario.
Minimizing Synchronized Blocks
The use of the [MethodImpl Attribute(MethodImplOptions.Synchronized)]attribute should be avoided, as it always locks an entire method and is also non-standard C# usage. Instead, the lock keyword or one of the System.Threading classes should be used. Wherever possible, adjust the start of a synchronized section forward and the end backward. Do whatever you can to decrease the number of synchronized operations.
Suggested Reading
  • Lidin, Serge. (2002). Inside Microsoft .NET IL Assembler: Microsoft Press.
  • Archer, Tom, and Whitechapel, Andrew. (2002). Inside C#, Second Edition. Microsoft Press.
  • Rico Mariani's Weblog: http://blogs.msdn.com/ricom
  • Garbage Collector Basics and Performance Hints: dotnetgcbasics.asp
  • Improving .NET Application Performance and Scalability: scalenet.asp
  • An Introduction to C# Generics: csharp_generics.asp
  • WikiWikiWeb: http://c2.com/cgi/wiki?UniformlySlowCode
  • Monday, 30 January 2012

    High Scalability - High Scalability - Are Cloud Based Memory Architectures the Next Big Thing?

    High Scalability - High Scalability - Are Cloud Based Memory Architectures the Next Big Thing?:



    We are on the edge of two potent technological changes: Clouds and Memory Based Architectures. This evolution will rip open a chasm where new players can enter and prosper. Google is the master of disk. You can't beat them at a game they perfected. Disk based databases like SimpleDB and BigTable are complicated beasts, typical last gasp products of any aging technology before a change. The next era is the age of Memory and Cloud which will allow for new players to succeed. The tipping point will be soon.

    Let's take a short trip down web architecture lane:

  • It's 1993: Yahoo runs on FreeBSD, Apache, Perl scripts and a SQL database



  • It's 1995: Scale-up the database.



  • It's 1998: LAMP



  • It's 1999: Stateless + Load Balanced + Database + SAN



  • It's 2001: In-memory data-grid.



  • It's 2003: Add a caching layer.



  • It's 2004: Add scale-out and partitioning.



  • It's 2005: Add asynchronous job scheduling and maybe a distributed file system.



  • It's 2007: Move it all into the cloud.



  • It's 2008: Cloud + web scalable database.



  • It's 20??: Cloud + Memory Based Architectures

    You may disagree with the timing of various innovations and you would be correct. I couldn't find a history of the evolution of website architectures, so I just made stuff up. If you have any better information please let me know.

    Why might cloud based memory architectures be the next big thing? For now we'll just address the memory based architecture part of the question, the cloud component is covered a little later.

    Behold the power of keeping data in memory:
    Google query results are now served in under an astonishingly fast 200ms, down from 1000ms in the olden days. The vast majority of this great performance improvement is due to holding indexes completely in memory. Thousands of machines process each query in order to make search results appear nearly instantaneously.
    This text was adapted from notes on Google Fellow Jeff Dean keynote speech at WSDM 2009.

    Google isn't the only one getting a performance bang from moving data into memory. Both LinkedInand Digg keep the graph of their network social network in memory. Facebook has northwards of 800 memcached servers creating a reservoir of 28 terabytes of memory enabling a 99% cache hit rate. Even little guys can handle 100s of millions of events per day by using memory instead of disk.

    With their new Unified Computing strategy Cisco is also entering the memory game. Their new machines "will be focusing on networking and memory" with servers crammed with 384 GB of RAM, fast processors, and blazingly fast processor interconnects. Just what you need when creating memory based systems.

    Memory Is The System Of Record

    What makes Memory Based Architectures different from traditional architectures is that memory is the system of record. Typically disk based databases have been the system of record. Disk has been King, safely storing data away within its castle walls. Disk being slow we've ended up wrapping disks in complicated caching and distributed file systems to make them perform.

    Sure, memory is used as all over the place as cache, but we're always supposed to pretend that cache can be invalidated at any time and old Mr. Reliable, the database, will step in and provide the correct values. In Memory Based Architectures memory is where the "official" data values are stored.

    Caching also serves a different purpose. The purpose behind cache based architectures is to minimize the data bottleneck through to disk. Memory based architectures can address the entire end-to-end application stack. Data in memory can be of higher reliability and availability than traditional architectures.

    Memory Based Architectures initially developed out of the need in some applications spaces for very low latencies. The dramatic drop of RAM prices along with the ability of servers to handle larger and larger amounts of RAM has caused memory architectures to verge on going mainstream. For example, someone recently calculated that 1TB of RAM across 40 servers at 24 GB per server would cost an additional $40,000. Which is really quite affordable given the cost of the servers. Projecting out, 1U and 2U rack-mounted servers will soon support a terabyte or more or memory.

    RAM = High Bandwidth And Low Latency

    Why are Memory Based Architectures so attractive? Compared to disk RAM is a high bandwidth and low latency storage medium. Depending on who you ask the bandwidth of RAM is 5 GB/s. The bandwidth of disk is about 100 MB/s. RAM bandwidth is many hundreds of times faster. RAM wins. Modern hard drives have latencies under 13 milliseconds. When many applications are queued for disk reads latencies can easily be in the many second range. Memory latency is in the 5 nanosecond range. Memory latency is 2,000 times faster. RAM wins again.

    RAM Is The New Disk

    The superiority of RAM is at the heart of the RAM is the New Disk paradigm. As an architecture it combines the holy quadrinity of computing:



  • Performance is better because data is accessed from memory instead of through a database to a disk.



  • Scalability is linear because as more servers are added data is transparently load balanced across the servers so there is an automated in-memory sharding.



  • Availability is higher because multiple copies of data are kept in memory and the entire system reroutes on failure.



  • Application development is faster because there’s only one layer of software to deal with, the cache, and its API is simple. All the complexity is hidden from the programmer which means all a developer has to do is get and put data.

    Access disk on the critical path of any transaction limits both throughput and latency. Committing a transaction over the network in-memory is faster than writing through to disk. Reading data from memory is also faster than reading data from disk. So the idea is to skip disk, except perhaps as an asynchronous write-behind option, archival storage, and for large files.

    Or Is Disk Is The The New RAM

    To be fair there is also a Disk is the the new RAM, RAM is the New Cache paradigm too. This somewhat counter intuitive notion is that a cluster of about 50 disks has the same bandwidth of RAM, so the bandwidth problem is taken care of by adding more disks.

    The latency problem is handled by reorganizing data structures and low level algorithms. It's as simple as avoiding piecemeal reads and organizing algorithms around moving data to and from memory in very large batches and writing highly parallelized programs. While I have no doubt this approach can be made to work by very clever people in many domains, a large chunk of applications are more time in the random access domain space for which RAM based architectures are a better fit.

    Grids And A Few Other Definitions

    There's a constellation of different concepts centered around Memory Based Architectures that we'll need to understand before we can understand the different products in this space. They include:



  • Compute Grid - parallel execution. A Compute Grid is a set of CPUs on which calculations/jobs/work is run. Problems are broken up into smaller tasks and spread across nodes in the grid. The result is calculated faster because it is happening in parallel.



  • Data Grid - a system that deals with data — the controlled sharing and management of large amounts of distributed data.



  • In-Memory Data Grid (IMDG) - parallel in-memory data storage. Data Grids are scaled horizontally, that is by adding more nodes. Data contention is removed removed by partitioning data across nodes.



  • Colocation - Business logic and object state are colocated within the same process. Methods are invoked by routing to the object and having the object execute the method on the node it was mapped to. Latency is low because object state is not sent across the wire.



  • Grid Computing - Compute Grids + Data Grids



  • Cloud Computing - datacenter + API. The API allows the set of CPUs in the grid to be dynamically allocated and deallocated.

    Who Are The Major Players In This Space?

    With that bit of background behind us, there are several major players in this space (in alphabetical order):



  • Coherence - is a peer-to-peer, clustered, in-memory data management system. Coherence is a good match for applications that need write-behind functionality when working with a database and you require multiple applications have ACID transactions on the database. Java, JavaEE, C++, and .NET.



  • GemFire - an in-memory data caching solution that provides low-latency and near-zero downtime along with horizontal & global scalability. C++, Java and .NET.



  • GigaSpaces - GigaSpaces attacks the whole stack: Compute Grid, Data Grid, Message, Colocation, and Application Server capabilities. This makes for greater complexity, but it means there's less plumbing that needs to be written and developers can concentrate on writing business logic. Java, C, or .Net.



  • GridGain - A compute grid that can operate over many data grids. It specializes in the transparent and low configuration implementation of features. Java only.



  • Terracotta - Terracotta is network-attached memory that allows you share memory and do anything across a cluster. Terracotta works its magic at the JVM level and provides: high availability, an end of messaging, distributed caching, a single JVM image. Java only.



  • WebSphere eXtreme Scale. Operates as an in-memory data grid that dynamically caches, partitions, replicates, and manages application data and business logic across multiple servers.

    This class of products has generally been called In-Memory Data Grids (IDMG), though not all the products fit snugly in this category. There's quite a range of different features amongst the different products.

    I tossed IDMG the acronym in favor of Memory Based Architectures because the "in-memory" part seems redundant, the grid part has given way to the cloud, the "data" part really can include both data and code. And there are other architectures that will exploit memory yet won't be classic IDMG. So I just used Memory Based Architecture as that's the part that counts.

    Given the wide differences between the products there's no canonical architecture. As an example here's a diagram of how GigaSpaces In-Memory-Data-Grid on the Cloud works.





    Some key points to note are:



  • A POJO (Plain Old Java Object) is written through a proxy using a hash-based data routing mechanism to be stored in a partition on a Processing Unit. Attributes of the object are used as a key. This is straightforward hash based partitioning like you would use with memcached.



  • You are operating through GigaSpace's framework/container so they can automatically handle things like messaging, sending change events, replication, failover, master-worker pattern, map-reduce, transactions, parallel processing, parallel query processing, and write-behind to databases.



  • Scaling is accomplished by dividing your objects into more partitions and assigning the partitions to Processing Unit instances which run on nodes-- a scale-out strategy. Objects are kept in RAM and the objects contain both state and behavior. A Service Grid component supports the dynamic creation and termination of Processing Units.

    Not conceptually difficult and familiar to anyone who has used caching systems like memcached. Only is this case memory is not just a cache, it's the system of record.

    Obviously there are a million more juicy details at play, but that's the gist of it. Admittedly GigaSpaces is on the full featured side of the product equation, but from a memory based architecture perspective the ideas should generalize. When you shard a database, for example, you generally lose the ability to execute queries, you have to do all the assembly yourself. By using GigaSpaces framework you get a lot of very high-end features like parallel query processing for free.

    The power of this approach certainly comes in part from familiar concepts like partitioning. But the speed of memory versus disk also allows entire new levels of performance and reliability in a relatively simple and easy to understand and deploy package.

    NimbusDB - The Database In The Cloud

    Jim Starkey, President of NimbusDB, is not following the IDMG gang's lead. He's taking a completely fresh approach based on thinking of the cloud as a new platform unto itself. Starting from scratch, what would a database for the cloud look like?

    Jim is in position to answer this question as he has created a transactional database engine for MySQL named Falcon and added multi-versioning support to InterBase, the first relational database to feature MVCC (Multiversion Concurrency Control).

    What defines the cloud as a platform? Here's are some thoughts from Jim I copied out of the Cloud Computing group. You'll notice I've quoted Jim way way too much. I did that because Jim is an insightful guy, he has a lot of interesting things to say, and I think he has a different spin on the future of databases in the cloud than anyone else I've read. He also has the advantage of course of not having a shipping product, but we shall see.




  • I've probably said this before, but the cloud is a new computing platform that some have learned to exploit, others are scrambling to master, but most people will see as nothing but a minor variation on what they're already doing. This is not new. When time sharing as invented, the batch guys considered it as remote job entry, just a variation on batch. When departmental computing came along (VAXes, et al), the timesharing guys considered it nothing but timesharing on a smaller scale. When PCs and client/server computing came along, the departmental computing guys (i.e. DEC), considered PCs to be a special case of smart terminals. And when the Internet blew into town, the client server guys considered it as nothing more than a global scale LAN. So the batchguys are dead, the timesharing guys are dead, the departmental computing guys are dead, and the client server guys are dead. Notice a pattern?



  • The reason that databases are important to cloud computing is that virtually all applications involve the interaction of client data with a shared, persistent data store. And while application processing can be easily scaled, the limiting factor is the database system. So if you plan to do anything more than play Tetris in the cloud, the issue of database management should be foremost in your mind.



  • Disks are the limiting factors in contemporary database systems. Horrible things, disk. But conventional wisdom is that you build a clustered database system by starting with a distributed file system. Wrong. Evolution is faster processors, bigger memory, better tools. Revolution
    is a different way of thinking, a different topology, a different way of putting the parts together.



  • What I'm arguing is that a cloud is a different platform, and what works well for a single computer doesn't work at all well in cloud, and things that work well in a cloud don't work at all on the single computer system. So it behooves us to re-examine a lot an ancient and honorable assumptions to see if they make any sense at all in this brave new world.



  • Sharing a high performance disk system is fine on a single computer, troublesome in a cluster, and miserable on a cloud.



  • I'm a database guy who's had it with disks. Didn't much like the IBM 1301, and disks haven't gotten much better since. Ugly, warty, slow, things that require complex subsystems to hide their miserable characteristics. The alternative is to use the memory in a cloud as a distributed L2
    cache. Yes, disks are still there, but they're out of the performance loop except for data so stale that nobody has it memory.



  • Another machine or set of machines is just as good as a disk. You can quibble about reliable power, etc, but write queuing disks have the same problem.



  • Once you give up the idea of logs and page caches in favor of asynchronous replications, life gets a great deal brighter. It really does make sense to design to the strengths of cloud(redundancy) rather than their weaknesses (shared anything).



  • And while one guys is fetching his 100 MB per second, the disk is busy and everyone else is waiting in line contemplating existence. Even the cheapest of servers have two gigabit ethernet channels and switch. The network serves everyone in parallel while the disk is single threaded



  • I favor data sharing through a formal abstraction like a relational database. Shared objects are things most programmers are good at handling. The fewer the things that application developers need to manage the more likely it is that the application will work.



  • I buy the model of object level replication, but only as a substrate for something with a more civilized API. Or in other words, it's a foundation, not a house.



  • I'd much rather have a pair of quad-core processors running as independent servers than contending for memory on a dual socket server. I don't object to more cores per processor chip, but I don't want to pay for die size for cores perpetually stalled for memory.



  • The object substrate worries about data distribution and who should see what. It doesn't even know it's a database. SQL semantics are applied by an engine layered on the object substrate. The SQL engine doesn't worry or even know that it's part of a distributed database -- it just executes SQL statements. The black magic is MVCC.



  • I'm a database developing building a database system for clouds. Tell me what you need. Here is my first approximation: A database that scales by adding more computers and degrades gracefully when machines are yanked out; A database system that never needs to be shut down; Hardware and software fault tolerance; Multi-site archiving for disaster survival; A facility to reach into the past to recover from human errors (drop table customers; oops;); Automatic load balancing



  • MySQL scales with read replication which requires a full database copy to start up. For any cloud relevant application, that's probably hundreds of gigabytes. That makes it a mighty poor candidate for on-demand virtual servers.



  • Do remember that the primary function of a database system is to maintain consistency. You don't want a dozen people each draining the last thousand buckets from a bank account or a debit to happen without the corresponding credit.



  • Whether the data moves to the work or the work moves to the data isn't that important as long as they both end up a the same place with as few intermediate round trips as possible.



  • In my area, for example, databases are either limited by the biggest, ugliest machine you can afford *or* you have to learn to operation without consistent, atomic transactions. A bad rock / hard place choice that send the cost of scalable application development through the ceiling. Once we solve that, applications that server 20,000,000 users will be simple and cheap to write. Who knows where that will go?



  • To paraphrase our new president, we must reject the false choice between data consistency and scalability.



  • Cloud computing is about using many computers to scale problems that were once limited by the capabilities of a single computer. That's what makes clouds exciting, at least to me. But most will argue that cloud computing is a better economic model for running many instances of a
    single computer. Bah, I say, bah!



  • Cloud computing is a wonder new platform. Let's not let the dinosaurs waiting for extinction define it as a minor variation of what they've been doing for years. They will, of course, but this (and the dinosaurs) will pass.



  • The revolutionary idea is that applications don't run on a single computer but an elastic cloud of computers that grows and contracts by demand. This, in turn, requires an applications infrastructure that can a) run a single application across as many machines as necessary, and b) run many applications on the same machines without any of the cross talk and software maintenance problems of years past. No, the software infrastructure required to enable this is not mature and certainly not off the shelf, but many smart folks are working on it.



  • There's nothing limiting in relational except the companies that build them. A relational database can scale as well as BigTable and SimpleDB but still be transactional. And, unlike BigTable and SimpleDB, a relational database can model relationships and do exotic things like transferring money from one account to another without "breaking the bank.". It is true that existing relational database systems are largely constrained to single cpu or cluster with a shared file system, but we'll get over that.



  • Personally, I don't like masters any more than I like slaves. I strongly favor peer to peer architectures with no single point of failure. I also believe that database federation is a work-around
    rather than a feature. If a database system had sufficient capacity, reliability, and availability, nobody would ever partition or shard data. (If one database instance is a headache, a million tiny ones is a horrible, horrible migraine.)



  • Logic does need to be pushed to the data, which is why relational database systems destroyed hierarchical (IMS), network (CODASYL), and OODBMS. But there is a constant need to push semantics higher to further reduce the number of round trips between application semantics and the database systems. As for I/O, a database system that can use the cloud as an L2 cache breaks free from dependencies on file systems. This means that bandwidth and cycles are the limiting factors, not I/O capacity.



  • What we should be talking about is trans-server application architecture, trans-server application platforms, both, or whether one will make the other unnecessary.



  • If you scale, you don't/can't worry about server reliability. Money spent on (alleged) server reliability is money wasted.



  • If you view the cloud as a new model for scalable applications, it is a radical change in computing platform. Most people see the cloud through the lens of EC2, which is just another way to run a server that you have to manage and control, then the cloud is little more than a rather
    boring business model. When clouds evolve to point that applications and databases can utilize whatever resources then need to meet demand without the constraint of single machine limitations, we'll have something really neat.



  • On MVCC: Forget about the concept of master. Synchronizing slaves to a master is hopeless. Instead, think of a transaction as a temporal view of database state; different transactions
    will have different views. Certain critical operations must be serialized, but that still doesn't require that all nodes have identical views of database state.



  • Low latency is definitely good, but I'm designing the system to support geographically separated sub-clouds. How well that works under heavy load is probably application specific. If the amount of volatile data common to the sub-clouds is relatively low, it should work just fine provided there is enough bandwidth to handle the replication messages.



  • MVCC tracks multiple versions to provide a transaction with a view of the database consistent with the instant it started while preventing a transaction from updating a piece of data that it could not see. MVCC is consistent, but it is not serializable. Opinions vary between academia and the real world, but most database practitioners recognize that the consistency provided by MVCC is sufficient for programmers of modest skills to product robust applications.



  • MVCC, heretofore, has been limited to single node databases. Applied to the cloud with suitable bookkeeping to control visibility of updates on individual nodes, MVCC is as close to black magic as you are likely to see in your lifetime, enabling concurrency and consistency with mostly non-blocking, asynchronous messaging. It does, however, dispense with the idea that a cloud has at any given point of time a single definitive state. Serializability implemented with record locking is an attempt to make distributed system march in lock-step so that the result is as if there there no parallelism between nodes. MVCC recognizes that parallelism is the key to scalability. Data that is a few microseconds old is not a problem as long as updates don't collide.

    Jim certainly isn't shy with his opinions :-)

    My summary of what he wants to do with NimbusDB is:



  • Make a scalable relational database in the cloud where you can use normal everyday SQL to perform summary functions, define referential integrity, and all that other good stuff.



  • Transactions scale using a distributed version of MVCC, which I do not believe has been done before. This is the key part of the plan and a lot depends on it working.



  • The database is stored primarily in RAM which makes cloud level scaling of an RDBMS possible.



  • The database will handle all the details of scaling in the cloud. To the developer it will look like just a very large highly available database.

    I'm not sure if NimbusDB will support a compute grid and map-reduce type functionality. The low latency argument for data and code collocation is a good one, so I hope it integrates some sort of extension mechanism.

    Why might NimbusDB be a good idea?



  • Keeps simple things simple. Web scale databases like BigTable and SimpleDB make simple things difficult. They are full of quotas, limits, and restrictions because by their very nature they are just a key-value layer on top of a distributed file system. The database knows as little about the data as possible. If you want to build a sequence number for a comment system, for example, it takes complicated sharding logic to remove write contention. Developers are used to SQL and are comfortable working within the transaction model, so the transition to cloud computing would be that much easier. Now, to be fair, who knows if NimbusDB will be able to scale under high load either, but we need to make simple things simple again.



  • Language independence. Notice the that IDMG products are all language specific. They support some combination of .Net/Java/C/C++. This is because they need low level object knowledge to transparently implement their magic. This isn't bad, but it does mean if you use Python, Erlang, Ruby, or any other unsupported language then you are out of luck. As many problems as SQL has, one of its great gifts is programmatic universal access.



  • Separates data from code. Data is forever, code changes all the time. That's one of the common reasons for preferring a database instead of an objectbase. This also dovetails with the language independence issue. Any application can access data from any language and any platform from now and into the future. That's a good quality to have.

    The smart money has been that cloud level scaling requires abandoning relational databases and distributed transactions. That's why we've seen an epidemic of key-value databases and eventually consistent semantics. It will be fascinating to see if Jim's combination of Cloud + Memory + MVCC can prove the insiders wrong.

    Are Cloud Based Memory Architectures The Next Big Thing?

    We've gone through a couple of different approaches to deploying Memory Based Architectures. So are they the next big thing?

    Adoption has been slow because it's new and different and that inertia takes a while to overcome. Historically tools haven't made it easy for early adopters to make the big switch, but that is changing with easier to deploy cloud based systems. And current architectures, with a lot of elbow grease, have generally been good enough.

    But we are seeing a wide convergence on caching as way to make slow disks perform. Truly enormous amounts of effort are going into adding cache and then trying to keep the database and applications all in-sync with cache as bottom up and top down driven changes flow through the system.

    After all that work it's a simple step to wonder why that extra layer is needed when the data could have just as well be kept in memory from the start. Now add the ease of cloud deployments and the ease of creating scalable, low latency applications that are still easy to program, manage, and deploy. Building multiple complicated layers of application code just to make the disk happy will make less and less sense over time.

    We are on the edge of two potent technological changes: Clouds and Memory Based Architectures. This evolution will rip open a chasm where new players can enter and prosper. Google is the master of disk. You can't beat them at a game they perfected. Disk based databases like SimpleDB and BigTable are complicated beasts, typical last gasp products of any aging technology before a change. The next era is the age of Memory and Cloud which will allow for new players to succeed. The tipping point is soon.

    Related Articles




  • GridGain: One Compute Grid, Many Data Grids



  • GridGain vs Hadoop



  • Cameron Purdy: Defining a Data Grid



  • Compute Grids vs. Data Grids



  • Performance killer: Disk I/O by Nathanael Jones



  • RAM is the new disk... by Steven Robbins



  • Talk on disk as the new RAM by Greg Linden



  • Disk-Based Parallel Computation, Rubik's Cube, and Checkpointing by Gene Cooperman, Northeastern Professor, High Performance Computing Lab - Disk is the the new RAM and RAM is the new cache



  • Disk is the new disk by David Hilley.



  • Latency lags bandwidth by David A. Patterson



  • InfoQ Article - RAM is the new disk... by Nati Shalom



  • Tape is Dead Disk is Tape Flash is Disk RAM Locality is King by Jim Gray



  • Product: ScaleOut StateServer is Memcached on Steroids



  • Cameron Purdy: Defining a Data Grid



  • Compute Grids vs. Data Grids



  • Latency is Everywhere and it Costs You Sales - How to Crush it



  • Virtualization for High Performance Computing by Shai Fultheim



  • Multi-Multicore Single System Image / Cloud Computing. A Good Idea? (part 1) by Greg Pfister



  • How do you design and handle peak load on the Cloud ? by Cloudiquity.



  • Defining a Data Grid by Cameron Purdy



  • The Share-Nothing Architecture by Zef Hemel.



  • Scaling memcached at Facebook



  • Cache-aside, write-behind, magic and why it sucks being an Oracle customer by Stefan Norberg.



  • Introduction to Terracotta by Mike



  • The five-minute rule twenty years later, and how flash memory changes the rules by Goetz Graefe

  • Wednesday, 11 January 2012

    In-Memory Database Systems Questions and Answers

    In-Memory Database Systems Questions and Answers:

    In-Memory Database Systems - Questions and Answers

    In-memory database systems (IMDS) are a growing sub-set of a database management system (DBMS) software. In-memory databases emerged in response to new application goals, system requirements, and operating environments. Below, we answer common IMDS questions.

    What is an in-memory database system?

    An in-memory database system is a database management system that stores data entirely in main memory. This contrasts to traditional (on-disk) database systems, which are designed for data storage on persistent media. Because working with data in memory is much faster than writing to and reading from a file system, IMDSs can perform applications’ data management functions an order of magnitude faster. Because their design is typically simpler than that of on-disk databases, IMDSs can also impose significantly lower memory and CPU requirements.

    If avoiding disk I/O is the goal, why not achieve that through database caching?


    Caching is the process whereby on-disk databases keep frequently-accessed records in memory, for faster access. However, caching only speeds up retrieval of information, or “database reads.” Any database write – that is, an update to a record or creation of a new record – must still be written through the cache, to disk. So, the performance benefit only applies to a subset of database tasks. In addition, managing the cache is itself a process that requires substantial memory and CPU resources, so even a “cache hit” underperforms an in-memory database.

    If an in-memory database system boosts performance by holding all records in memory, can’t I get the same result by creating a RAM disk and deploying a traditional database there?


    As a makeshift solution, placing the entire on-disk database on a RAM disk will speed up both database reads and writes. However, the database is still hard-wired for disk storage, and processes in the database to facilitate disk storage, such as caching and file I/O, will continue to operate, even though they are now redundant.

    In addition, data in an on-disk database system must be transferred to numerous locations as it is used. Figure 1 shows the handoffs required for an application to read a piece of data from an on-disk database, modify it and write that record back to the database. These steps require time and CPU cycles, and cannot be avoided in a traditional database, even when it runs on a RAM disk. Still more copies and transfers are required if transaction logging is active.
    Figure 1. Data transfer in an on-disk database system
    In contrast, an in-memory database system entails a single data transfer. Elimination of multiple data transfers streamlines processing. Removing multiple copies of data reduces memory consumption, and the simplified processing makes for greater reliability and minimizes CPU demands.

    Can you quantify the performance difference between the three approaches described above – using on-disk, on-disk deployed on a RAM-disk, and in-memory database systems?


    In a published benchmark, McObject compared the same application’s performance using an embedded on-disk database system, using an embedded in-memory database, and using the embedded on-disk database deployed on a RAM-disk. Moving the on-disk database to a RAM drive resulted in read accesses that were almost 4x faster, and database updates that were more than 3x faster.

    Moving this same benchmark test to a true in-memory database system, however, provided much more dramatic performance gains: the in-memory database outperformed the RAM-disk database by 4x for database reads and turned in a startling 420x improvement for database writes. Click here to read an article on iApplianceWeb reporting the benchmark test. Click here to download McObject’s benchmark report.

    What else distinguishes an in-memory database from a “traditional” (on-disk) database management system (DBMS)?


    The optimization objectives of an on-disk database system are diametrically opposed to those of an in-memory database system. With an on-disk database system, the primary burden on performance is file I/O. Therefore an on-disk database system seeks to reduce that I/O, and it will trade off memory consumption and CPU cycles to do so. This includes using extra memory for a cache, and CPU cycles to maintain the cache.

    On-disk DBMSs also keep a lot of redundant data around. For example, duplicate data is kept in index structures, to enable the on-disk database system to fetch records from the index, rather than “spending” an I/O navigating from the index to the data itself. Disk space is cheap, so designers of on-disk database systems proceed with the assumption that storage space is virtually limitless.

    In stark contrast, an in-memory database system carries no file I/O burden. From the start, its design can be more streamlined, with the optimization goals of reducing memory consumption and CPU cycles. Though memory has declined in price, developers rightly treat it as more precious—and because memory equals storage space for an in-memory database system, IMDSs should be (and McObject’s eXtremeDB in-memory embedded database is) designed to get the most out of memory. An in-memory database is chosen explicitly for its performance advantage, so a secondary design goal is always to eliminate unnecessary CPU cycles.

    Isn’t the database just lost if there’s a system crash?


    It needn’t be. Most in-memory database systems offer features for adding persistence, or the ability survive disruption of their hardware or software environment.

    One important tool is transaction logging, in which periodic snapshots of the in-memory database (called “savepoints”) are written to non-volatile media. If the system fails and must be restarted, the database either “rolls back” to the last completed transaction, or “rolls forward” to complete any transaction that was in progress when the system went down (depending on the particular IMDS’s implementation of transaction logging).

    In-memory database systems can also gain durability by maintaining one or more copies of the database. In this solution – called database replication – fail-over procedures allow the system to continue using a standby database. The “master” and replica databases can be maintained by multiple processes or threads within the same hardware instance. They can also reside on two or more boards in a chassis with a high-speed bus for communication, run on separate computers on a LAN, or exist in other configurations.

    Non-volatile RAM or NVRAM provides another means of in-memory database persistence. One type of NVRAM, called battery-RAM, is backed up by a battery so that even if a device is turned off or loses its power source, the memory content—including the database—remains. Newer types of NVRAM, including ferroelectric RAM (FeRAM), magnetoresistive RAM (MRAM) and phase change RAM (PRAM) are designed to maintain information when power is turned off, and offer similar persistence options.

    Finally, new hybrid database system technology adds the ability to apply disk-based storage selectively, within the broader context of an in-memory database. For example, with McObject’s hybrid eXtremeDB Fusion, a notation in the database design or "schema" causes certain record types to be written to disk, while all others are managed entirely in memory. On-disk functions such as cache management are applied only to those records stored on disk, minimizing these activities’ performance impact and CPU demands.

    What kinds of applications typically employ an in-memory database?


    In-memory databases are most commonly used in applications that demand very fast data access, storage and manipulation, and in systems that don’t typically have a disk but nevertheless must manage appreciable quantities of data.

    An important use for in-memory database systems is in real-time embedded systems. IMDSs running on real-time operating systems (RTOSs) provide the responsiveness needed in applications including IP network routing, telecom switching, and industrial control. IMDSs manage music databases in MP3 players and handle programming data in set-top boxes. In-memory databases’ typically small memory and CPU footprint make them ideal because most embedded systems are highly resource-constrained.

    Non-embedded applications requiring exceptional performance are an important growth area for in-memory database systems. For example, algorithmic trading and other applications for financial markets use IMDSs to provide instant manipulation of data, in order to identify and leverage market opportunities. Some multi-user Web applications – such as e-commerce and social networking sites – use in-memory databases to cache portions of their back-end on-disk database systems. These enterprise-scale applications sometimes require very large in-memory data stores, and this need is met by 64-bit IMDS editions.

    Is an in-memory database the same as an “embedded database”?


    “Embedded database” refers to a database system that is built into the software program by the application developer, is invisible to the application’s end-user and requires little or no ongoing maintenance. Many in-memory databases fit that description, but not all do. In contrast to embedded databases, a “client/server database” refers to a database system that utilizes a separate dedicated software program, called the database server, accessed by client applications via inter-process communication (IPC) or remote procedure call (RPC) interfaces. Some in-memory database systems employ the client/server model.

    How scalable is an in-memory database system? My application manages terabytes of data – is it practical to hold this much in an in-memory database?


    IMDS technology scales well beyond the terabyte size range. McObject’s benchmark report, In-Memory Database Systems (IMDSs) Beyond the Terabyte Size Boundary, detailed this scalability with a 64-bit in-memory database system deployed on a 160-core SGI Altix 4700 server running SUSE Linux Enterprise Server version 9 from Novell. The database grew to 1.17 terabytes and 15.54 billion rows, with no apparent limits on it scaling further.

    Performance remained consistent as the database size grew into the hundreds of gigabytes and exceeded a terabyte, suggesting nearly linear scalability. For a simple SELECT against the fully populated database, the IMDS (McObject’s eXtremeDB-64) processed 87.78 million query transactions per second using its native application programming interface (API) and 28.14 million transactions per second using a SQL ODBC API. To put these results in perspective, consider that the lingua franca for discussing query performance is transactions per minute.

    Doesn’t it take a long time to populate an in-memory database?


    “A long time” is relative. For example, a 19 megabyte in-memory database loads in under 6.6 seconds (under 4 seconds if reloading from a previously saved database image). The 1.17 terabyte database described earlier loaded in just over 33 hours.

    What is true is that populating a very large in-memory database system can be much faster than populating an on-disk DBMS. During such “data ingest,” on-disk database systems use caching to enhance performance. But eventually, memory buffers fill up, and the system writes the data to the file system (logical I/O). Eventually, the file system buffers also fill up, and data must be written to the hard disk (physical I/O). Physical I/O is usually measured in milliseconds, and its performance burden is much greater than that of logical I/O (which is usually measured in microseconds). Physical I/O may be required by an on-disk DBMS for other reasons, for example, to guarantee transactional integrity.

    Consider what happens when populating an on-disk database, as the total amount of stored data increases:

    First, as the database grows, the tree indexes used to organize data grow deeper, and the average number of steps into the tree, to reach the storage location, expands. Each step imposes a logical disk I/O. Second, assuming that the cache size stays the same, the percent of the database that is cached is smaller. Therefore, it is more likely that any logical disk I/O is the more-burdensome physical I/O.

    Third, as the database grows, it consumes more physical space on the disk platter, and the average time to move the head from position to position is greater. When the head travels further, physical I/O takes longer, further degrading performance.

    In contrast, in-memory database ingest performance is roughly linear as database size increases.

    Isn’t an in-memory database only really usable on a single computer system, whereas an on-disk database can be shared by any number of computers on a network?


    An in-memory database system can be either an “embedded database” or a “client/server” database system. Client/server database systems are inherently multi-user, but embedded in-memory databases can also be shared by multiple threads/processes/users. First, the database can be created in shared memory, with the database system providing a mechanism to control concurrent access. Also, embedded databases can (andeXtremeDB does) provide a set of interfaces that allow processes that execute on network nodes remote from the database node, to read from and write to the database. Finally, database replication can be exploited to copy the in-memory database to the node(s) where processes are located, so that those processes can query a local database and eliminate network traffic and latency.

    What’s different/better about an in-memory database versus STL or Boost collections, or even just creating my own memory-mapped file(s)?


    The question is the same as asking why these alternatives are not viable replacements for Oracle, MS SQL Server, DB2, and other on-disk databases. Any database system goes far beyond giving you a set of interfaces to manage collections, lists, etc. This typically includes support for ACID (atomic, consistent, isolated and durable) transactions, multi-user access, a high level data definition language, one or more programming interfaces (including industry-standard SQL), triggers/event notifications, and more.

    Won’t an in-memory database require huge amounts of memory because database systems are large?


    Equating “database management system” with “big” is justified, generally speaking. Even some embedded DBMSs are megabytes in code size. This is true largely because traditional on-disk databases – including some that have now been adapted for use in memory, and are pitched as IMDSs—were not written with the goal of minimizing code size (or CPU cycles). As described above, as on-disk database systems, their overriding design goal was amelioration of disk I/O.

    In contrast, a database system designed from first principles for in-memory use can be much smaller, requiring less than 100K of memory, compared to many 100s of kilobytes up to many megabytes for other database architectures. This reduction in code size results from:
    • Elimination of on-disk database capabilities that become redundant for in-memory use, such as all processes surrounding caching and file I/O
    • Elimination of many features that are unnecessary in the types of application that use in-memory databases. An IP router does not need separate client and server software modules to manage routing data. And a persistent Web cache doesn’t need user access rights or stored procedures
    • Hundreds of other development decisions that are guided by the design philosophy that memory equals storage space, so efficient use of that memory is paramount