Thursday 30 September 2010

Hadoop2010: Winning the Big Data SPAM Challenge (YDN Blog)


Hadoop2010: Winning the Big Data SPAM Challenge

Worldwide spam volumes this year are forecast to rise by 30% to 40% compared with 2009. Spam recently reached a record 92% of total email. Spammers have turned their attention to social media sites as well. In 2008, there were few Facebook phishing messages; Facebook is now the second most phished organization online. Even though Twitter has managed to recently bring its spam rate down to as low as 1%, the absolute volume of spam is still massive given its tens of millions of users. Dealing with spam introduces a number of Big Data challenges. The sheer size and scale of the data is enormous. In addition, spam in social media involves the need to understand very complex patterns of behavior as well as to identify new types of spam. This presentation discusses how data analytics built on Hadoop can help businesses keep spam from spiraling out of control.

Thursday 23 September 2010

High Scalability - Applying Scalability Patterns to Infrastructure Architecture

ABSTRACT And APPLY

So the aforementioned post is just a summary of a longer and more detailed post, but for purposes of this post I think the summary will do with the caveat that the original, “Scalability patterns and an interesting story...” by Jesper Söderlund is a great read that should definitely be on your “to read” list in the very near future.
For now, let’s briefly touch on the scalability patterns and sub-patterns Jesper described with some commentary on how they fit into scalability from a network and application delivery network perspective. The original text from the High Scalability blog is in red(dish) text.


  • Load distribution - Spread the system load across multiple processing units

    This is a horizontal scaling strategy that is well-understood. It may take the form of “clustering” or “load balancing” but in both cases it is essentially an aggregation coupled with a distributed processing model. The secret sauce is almost always in the way in which the aggregation point (strategic point of control) determines how best to distribute the load across the “multiple processing units.”

    • load balancing / load sharing - Spreading the load across many components with equal properties for handling the request
      This is what most people think of when they hear “load balancing”; it’s just that at the application delivery layer we think in terms of directing application requests (usually HTTP, but it can be just about any application protocol) to equal “servers” (physical or virtual) that handle the request. This is a “scaling out” approach that is most typically associated today with cloud computing and auto-scaling: launch additional clones of applications as virtual instances in order to increase the total capacity of an application. The load balancer distributes requests across all instances based on the configured load balancing algorithm.
    • Partitioning - Spreading the load across many components by routing an individual request to the component that owns that specific data
      This is really where the architecture comes in and where efficiency and performance can be dramatically increased in an application delivery architecture. Rather than each instance of an application being identical to every other one, each instance (or pool of instances) is designated as the “owner” of a particular type of request. This allows devops to tweak configurations of the underlying operating system, web and application server software for the specific type of request being handled. This is also where the difference between “application switching” and “load balancing” becomes abundantly clear, as “application switching” is used as a means to determine where to route a particular request, which is/can be then load balanced across a pool of resources. It’s a subtle distinction but an important one when architecting not only efficient and fast but resilient and reliable delivery networks.




          • Vertical partitioning - Spreading the load across the functional boundaries of a problem space, separate functions being handled by different processing units
            When it comes to routing application requests we really don’t separate by function unless that function is easily associated with a URI. The most common implementation of vertical partitioning at the application switching layer will be by content. Example: creating resource pools based on the Content-Type HTTP header: images in pool “image servers” and content in pool “content servers”. This allows for greater optimization of the web/application server based on the usage pattern and the content type, which can often also be related to a range of sizes. This also, in a distributed environment, allows architects to leverage say cloud-based storage for static content while maintaining dynamic content (and its associated data stores) on-premise. This kind of hybrid cloud strategy has been postulated as one of the most common use cases since the first wispy edges of cloud were seen on the horizon.
          • Horizontal partitioning - Spreading a single type of data element across many instances, according to some partitioning key, e.g. hashing the player id and doing a modulus operation, etc. Quite often referred to as sharding.
            This sub-pattern is in line with the way in which persistence-based load balancing is accomplished, as well as the handling of object caching. This also describes the way in which you might direct requests received from specific users to designated instances that are specifically designed to handle their unique needs or requirements, such as the separation of “gold” users from “free” users based on some partitioning key, which in HTTP land is often a cookie containing the relevant data. (A minimal code sketch of this kind of key-based routing follows this list.)
    • Queuing and batch - Achieve efficiencies of scale by processing batches of data, usually because the overhead of an operation is amortized across multiple requests
      I admit defeat in applying this sub-pattern to application delivery. I know, you’re surprised, but this really is very specific to middleware and aside from the ability to leverage queuing for Quality of Service (QoS) at the delivery layer this one is just not fitting in well. If you have an idea how this fits, feel free to let me know – I’d love to be able to apply all the scalability patterns and sub-patterns to a broader infrastructure architecture.

      • Relaxing of data constraints - Many different techniques and trade-offs with regards to the immediacy of processing / storing / access to data fall in this strategy
        This one takes us to storage virtualization and tiering and the way in which data storage and access is intelligently handled with varying properties based on usage and prioritization of the content. If one relaxes the constraints around access times for certain types of data, it is possible to achieve more efficient use of storage by relegating some content to secondary and tertiary tiers which may not have the same performance attributes as your primary storage tier. And make no mistake, storage virtualization is a part of the application delivery network – it has been since its inception – and as cloud computing and virtualization have grown, so has the importance of a well-defined storage tiering strategy.

        We can bring this back up to the application layer by considering that a relaxation of data constraints with regards to immediacy of access can be applied by architecting a solution that separates data reads from writes. This implies eventual consistency, as data updated/written to one database must necessarily be replicated to the databases from which reads are, well, read, but that’s part of relaxing a data constraint. This is a technique used by many large, social sites such as Facebook and Plenty of Fish in order to scale the system to the millions upon millions of requests it handles in any given hour.
      • Parallelization - Work on the same task in parallel on multiple processing units
        I’m not going to be able to apply this one either, unless it was in conjunction with optimizing something like MapReduce and SPDY. I’ve been thinking hard about this one, and the problem is the implication that “same task” really is the same task, and that processing is distributed. That said, if the actual task can be performed by multiple processing units, then an application delivery controller could certainly be configured to recognize that a specific URL should be essentially sent to some other proxy/solution that performs the actual distribution, but the processing model here deviates sharply from the request-reply paradigm under which most applications today operate.
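
    As referenced above under “Horizontal partitioning,” here is a minimal, hypothetical sketch of key-based request routing: hash a partitioning key (a user id, or a value pulled from a cookie) and map it onto a fixed set of pools. The class and pool names are invented for illustration, not taken from any product.

    import java.util.List;

    // Hypothetical shard router: the same partitioning key always lands on the
    // same "processing unit" (pool/server), which is the essence of sharding.
    public class ShardRouter {
        private final List<String> pools; // e.g. pool names or hostnames

        public ShardRouter(List<String> pools) {
            this.pools = pools;
        }

        public String poolFor(String partitionKey) {
            // floorMod keeps the bucket non-negative even for negative hash codes
            int bucket = Math.floorMod(partitionKey.hashCode(), pools.size());
            return pools.get(bucket);
        }
    }

    // Usage: new ShardRouter(List.of("gold-pool", "free-pool-1", "free-pool-2")).poolFor("user-42")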

    DEVOPS CAN MAKE THIS HAPPEN

    I hate to sound off too much on the “devops” trumpet, but one of the primary ways in which devops will be of significant value in the future is exactly in this type of practical implementation. Only by recognizing that many architectural patterns are applicable not only to application but also to infrastructure architecture can we start to apply a whole lot of “lessons that have already been learned” by developers and architects to emerging infrastructure architectural models. This abstraction and application of well-understood patterns from application design and architecture will be invaluable in designing the new network: the next iteration of network theory and implementation that will allow it to scale along with the applications it is delivering.

    Monday 20 September 2010

    Cloud Storage ROI: A Business Model for the Enterprise

    Cloud Storage ROI: A Business Model for the Enterprise: "
    Our Enterprise Cloud Storage ROI Model identifies the key enterprise business drivers for cloud storage, and provides a model that helps organizations quantitatively evaluate if implementing a private storage cloud is the best option for them.

    The ROI model and paper may be downloaded here >>

    In addition to reducing the cost of storage, the Enterprise Cloud Storage ROI Model examines other areas of opportunity, including:

    • in-house backup
    • thin provisioning
    • risk management
    • productivity gains
    • the ability to manage large file transmissions
    • workstation storage upgrades
    • sharing information versus emailing information
    The good news is that implementing a private storage cloud can provide cost savings while maintaining control behind the firewall. This ROI model helps you evaluate multiple use case options so that you can make the right decision for your organization.

    Download the Enterprise Cloud Storage ROI Model >>"

    Tackling Architectural Complexity with Modeling

    Tackling Architectural Complexity with Modeling: "Component models can help diagnose architectural problems in both new and existing systems."

    Tuesday 14 September 2010

    HBase GUI Admin: HBaseXplorer

    HBase GUI Admin: HBaseXplorer: "
    I haven’t seen any HBase admin tools before — except maybe Toad for Cloud, but that’s SQL on top of HBase. HBaseXplorer source code is available on ☞ GitHub. It currently supports the following features:


    • Multiple HBase connections
    • Single row preview
    • Can preview any size tables
    • Create, Read, Update, Delete of columns (only String-represented data)
    • Filter rows

    No screenshots though. Let me know if you tried it out.


    Update: Thanks to Lars George, I’ve learned there is also hbase-explorer — code available on ☞ GitHub. Screenshot below:





    Original title and link for this post: HBase GUI Admin: HBaseXplorer (published on the NoSQL blog: myNoSQL)

    "

    A Cloud Discussion With Dimension Data

    A Cloud Discussion With Dimension Data: "
    Working as I do at EMC, I get the opportunity to work with some of the world's best partners, including Dimension Data. Recently acquired by NTT, they're in an interesting position to do well as the world shifts from traditional IT to newer models.

    I was recently asked to contribute a thought piece for a new customer publication they're creating. They asked the questions, I gave my best answers.

    Graciously, they allowed me to post the piece here in advance of their publication. I'll supply a link when it gets published -- the publication shows every sign of being a good and valuable read.



    There is a lot of industry hype around cloud computing. Can you talk about virtualisation and private and public cloud computing, what they mean to clients today, and why clients should pay attention (if they are not already)?

    Great question, so thanks. I’d like to start with a bit of context as to what might be going on here.

    At a high level, many of us believe that the IT business is going through an important structural shift, to a world where the majority of IT is delivered as a dynamic, flexible and cost-effective service.

    Historically, IT organizations would acquire chunks of discrete-yet-incompatible hardware and software, put it on their data center floor, and try to manage it all efficiently. That sort of approach is giving way to a pooled approach for infrastructure and software.

    “Cloud” comes into the discussion as a convenient short-hand to identify what’s different here: IT is now built differently (dynamic pools of fully virtualized resources), operated differently (as an end-to-end service vs. individual technology specialties) and consumed differently (convenient consumption models for IT users and IT organizations alike).

    “Private cloud” generally refers to any cloud that is under IT control: resource usage, service delivery, security and the like. “Public cloud” generally refers to any cloud where most of the control lies with the service provider, and not the consumer.

    Many organizations operate at enough scale where cloud economics could easily apply to their internal operations – a cloud of your own, so to speak. Smaller organizations are looking to the new crop of service providers to provide the economics and the flexibility of the cloud, yet retain key aspects of control that they need.

    I happen to think that we’ll see a lot of both going forward: cloud-like internal IT operations complemented by compatible service providers with IT still in control of the end result.

    Virtualization is the technology that is largely responsible for making this all happen. By decoupling logical functions from physical resources, it makes the entire cloud discussion a realistic goal rather than some sort of unobtainable future state.

    What does this mean to clients today?

    So much emphasis has been put on lowering the costs of delivering IT services, and – while that is inarguably important – there’s far more to the story to consider.

    Cloud delivery models are inherently flexible and dynamic. That means that anyone who uses IT services can generally get what they want – far faster and with less planning or commitment – than was available before. That sort of speed and agility can be compelling if your business is moving fast – and most of them are!

    Finally, there is a class of IT services where external providers have a unique advantage that’s hard to match with internal capabilities – certain industry applications are a good example, as are certain parts of the security and compliance stack, or data protection as another example.

    Simply put – it’s not just about cheaper IT, in many cases it’s about better IT.

    What should clients pay attention to?

    For me, it’s all about the journey from where you are to where you’d like to be in the near future. Simply put, “cloud” has redefined where we’d all like the end state to be. Now that there’s general agreement on that, the work lies in the projects and initiatives that get us closer to that destination.

    Put another way, the journey is mostly about people and process, and less about enabling technology. As an example, we use a three-part model to describe how we think most IT organizations will progress. I’ve included a high-level graphic that EMC’s IT organization is using to describe the process.

    The first phase is virtualizing the non-business-critical applications that IT owns: development and test, infrastructure support apps, and the like.

    Here, you’re just introducing new technologies (mostly server virtualization, such as VMware) and not really touching either your operational processes or your relationship with the business. Big savings typically result, but there’s more to be had.

    The second phase is virtualizing the business-critical applications that people really care about: you know, the ones with specific names and business owners. Here, you’re not only introducing new technology to protect, secure and manage applications, you’re also fundamentally changing the operational processes you use to run IT. And, as we all know, doing this inevitably impacts roles and responsibilities within the IT organization.

    But there’s a new payoff in addition to capex and opex savings: there’s the obvious potential for better operated IT: availability, protection, security, efficiency and the like.

    The third phase involves creating a catalog of services for the business: infrastructure as a service, application platforms as a service, popular applications as a service and – more recently – desktop experiences as a service. Each of these can be provided internally, externally or using a combination of both.

    If you think about it, this phase is all about establishing a new relationship between the business and IT. IT is now the internal service provider, and business users learn to be intelligent consumers of those services.

    As with any journey, planning and orchestration is essential. The skills required at every juncture are usually not found within the existing IT organization, necessitating the use of one or more external consultancies.

    That being said, we are working with many hundreds of organizations who have started down this path. It’s no longer theory – people are making considerable progress, often with spectacular results.

    As clients accelerate their adoption of virtualisation towards cloud computing what are some of the major challenges that they are facing and what are the steps that they can take today to get ready for the private and public cloud computing?

    A lot can be said on this topic, but in the interest of brevity, I’ll keep it to more manageable sound bites:

    • Adopt a “virtualization first” policy. Don’t invest a single dollar in the physical legacy unless all alternatives have been exhausted. Make sure that every piece of technology that lands on the data center floor moves you closer to the goal, and not away from it. No U-turns.

    • Create a “next-gen” environment, and point it at your loudest critics. By “next gen”, we’re not only talking integrated virtualized technology, we’re talking service provider processes and procedures that create a cloud-like experience for end users. By “loudest critics”, I’m referring to the people who routinely criticize how IT goes about delivering their services. These can be application developers, business analysis professionals or just about any cadre of demanding knowledge workers in the organization.

    • Don’t underestimate the impact of organizational change. The underlying proposition here is to do IT in a new way, using people who’ve made a career of doing it the old way. The larger and more entrenched your IT organization is, the more of a challenge this can be.

    • Enlist the business. So many IT organizations have created walls between themselves and the people they serve. Those walls are going to have to come down for meaningful change to occur at an accelerated pace. Invest in the time and resources needed to have ongoing discussions with key business stakeholders before, during and after the journey.

    • Focus on what matters. It’s so easy for this discussion to devolve into a beauty pageant around who’s got the best technology, simply because that’s a topic that most IT professionals feel comfortable discussing. Many IT leaders are using pre-integrated IT infrastructure (such as the Vblock from the VCE Coalition) to accelerate the transition, and focus the organization on achieving immediate results.

    • Learn from others. If you’re an IT leader, you’re not alone. Just about every IT organization will have to make this sort of transition over the next several years, so it’s easy to find people who are in your situation that you can learn from.

    With all this market hype around cloud computing can you tell us about how storage and data management will need to change to support cloud computing?

    We’re fond of saying “cloud changes everything”: how IT is built, operated and consumed. Storage and data management topics aren’t immune from this discussion; indeed, most of the anxiety around cloud has to do with security and data management topics.

    Like other parts of the technology landscape, storage has to become fully virtualized alongside servers, networks and other components. It too has to dynamically deliver a wide range of service levels (and cost points) from the same underlying infrastructure.

    Storage has to be increasingly managed in concert with other technologies, and less as a standalone discipline. New control planes need to be established to make sure that information is consistently protected and secured even where it might be moving from place to place.

    Using EMC as an example, you can see those themes already evident in our portfolio. Storage virtualization is de-facto now within individual arrays, and increasingly across multiple arrays in potentially multiple locations. Auto-tiering technologies (such as EMC’s FAST) automatically deliver a wide range of service levels (and new cost points) from the same shared storage infrastructure.

    New storage management models increasingly coordinate with the server virtualization layer (think VMware’s vCenter) and the newer crop of integrated infrastructure managers (think EMC Ionix UIM). And not only are there new tools for protecting and securing data, but new ways of managing them as an end-to-end service – whether implemented in the data center, or through a compatible service provider.

    One thing is for sure: there won’t be any less information to deal with in this new world, and information won’t be any less important.

    Virtual storage is fast becoming a reality with the release of VPLEX from EMC – can you talk about why VPLEX is so revolutionary in the storage market?

    One key aspect of cloud is the ability to progressively aggregate larger and larger pools of resources that you can use dynamically: primarily servers and storage. The larger the dynamic pool, the better – generally speaking. You can run more efficiently, you can respond more quickly to changing conditions, and you can protect applications and data more effectively.

    From a storage perspective, we’ve done a good job at creating these dynamic storage pools within a single storage frame. I’d use Symmetrix VMAX as an example of this. A single VMAX can be petabytes in size, creating a considerable dynamic and virtualized pool of resources.

    But that’s not enough. We need to be able to create pools across multiple storage frames from potentially multiple vendors, and – ideally – allow these pools to extend over progressively larger and larger distances.

    That’s what VPLEX is about – creating these dynamic, flexible and non-disruptive pools of storage resources that can incorporate multiple frames from multiple vendors, and – over the next year or so – progressively incorporate storage resources in multiple locations separated by considerable distance.

    Taken in its entirety, VPLEX works alongside server and network virtualization to create very appealing clouds that can not only ignore geographical distances, but leverage it in interesting ways – for example, dynamically relocating applications and information closer to users to improve the user experience, or perhaps take advantage of cost-effective capacity in different locations.

    More pragmatically, customers are using VPLEX today to simply move storage workloads around from device to device without having to take downtime or otherwise impact users. It’s pretty exciting stuff.

    Can you talk about storage in the cloud and the how real it is for clients today?

    It’s more real than many people imagine, once you start focusing on real-world use cases rather than generic theory.

    For example, doing backup and other forms of data protection to external service providers is quite popular, simply because the service provider is physically and logically separated from the primary application location. Technically speaking, that’s “storage in the cloud”, although most people don’t think about it that way. Many of EMC’s data protection products (Mozy, Avamar, Data Domain, Data Protection Advisor, etc.) are being used for this purpose today.

    A related popular application is long-term information repository and archiving, especially with compliance requirements. We’ve been sending physical information records off-site for decades; digital records aren’t really all that different if you think about it. In the EMC portfolio, that would be Centera, the newer SourceOne information governance platforms and portions of the Documentum portfolio.

    There’s a growing requirement to gather and dynamically distribute information content globally for many customers. They need multiple points of presence around the globe, managed as an integrated service that conforms to policy objectives. Many of these “storage clouds” are being built with EMC’s Atmos platform.

    Those are just a few examples where there’s an individual storage discussion. Far more frequently, the storage is associated with an application, both of which are run externally: think SaaS, hosting, outsourcing, etc.

    Taken that way, there’s already a considerable amount of storage “in the cloud”!

    Security concerns come up frequently as an objection to cloud computing – what is your stance on this and is it really as big a concern as some believe?

    Being secure in the cloud isn’t particularly difficult. However, changing established policy and procedure to incorporate this new IT delivery model can be very difficult indeed.

    From a pure technology perspective, it’s fairly easy to show that fully virtualized cloud environments can be far more secure and compliant than the traditional physical environments they replace. It’s like night and day in terms of new underlying capabilities.

    That being said, people’s perceptions tend to lag the reality, often by many years. The good news is that we’ve recently had good success by focusing the security and compliance teams on the high level GRC management of the IT environment (cloud or otherwise) as the required control point for security and compliance, using the EMC RSA Archer environment.

    It’s not surprising that giving people an integrated dashboard to monitor and control risk can make things move forward in a much more decisive manner. As these high-level GRC dashboards proliferate, I think we’ll see much less resistance going forward.

    Compliance and data protection are high on the agenda for clients – when people move to cloud computing what needs to be taken into account to ensure compliance and data protection?

    It’s actually worse than that – the rules change all the time, often without prior notice!

    As a result, we’re strong advocates of policy definition and enforcement capabilities that are built in, rather than bolted on later. From an architecture perspective, virtualized entities (servers, storage, applications, databases, etc.) are relatively self-contained, making them easy to tag and identify from a compliance or data protection perspective.

    This tagging or labelling capability makes a fully automated approach far more implementable: as policies are established for labelled objects (for example, information that can’t leave Belgium, or must always be remotely mirrored at least 200 km away), policy enforcement is automatic. Compliance with existing policies can be monitored in real time, non-compliant pieces can be easily identified and remediated, and the inevitable new policies can be implemented with a minimum of drama.

    Harking back to a previous discussion around security, it’s not hard to show that compliance and data protection capabilities can be far better in a fully virtualized cloud model -- it’s more a matter of changing established perceptions and protocols.

    As the world comes to embrace all that is ‘cloud’ where do you see the market going and how quickly?

    I don’t think any part of the IT landscape will emerge unscathed from this industry transition. Perhaps the last transition of similar magnitude was the wholesale adoption of the internet – and we’re still feeling the repercussions of that one!

    Vendors will need to build products that can be delivered as a service, rather than standalone physical entities. These products and technologies will increasingly go to market through a new generation of solution and service providers – as well as IT organizations that think of themselves as internal service providers.

    Value-added solution providers (such as Dimension Data, now acquired by NTT) will be well served by helping customers manage the transition to the new model in an intelligent and rational fashion, as well as providing not only technology, but the new breed of services that will be so popular.

    IT organizations will find themselves strongly encouraged to invest in the new model, while being obligated to keep the old model running during the transition. We believe that IT organizations will come to look very differently in the next few years – new skills, new roles, and new relationships with the business.

    Consumers of IT – whether they be individuals, or in an organizational setting – are increasingly starting to demand a “cloud experience” – a catalog of useful services available on demand. And there’s no reason to believe that they won’t eventually get what they want – if not from the established IT organization, then certainly elsewhere.

    In some sense, all of this IT change really isn’t that different to what’s already happened in other forms of infrastructure we take for granted: power, communications, transportation, facilities, manufacturing, labor forces, etc.

    Taken that way, the transition is somewhat inevitable.

    With anything new there are winners and losers - what are the winners going to look like in cloud computing - both clients and IT providers?

    Any time there’s a big change in an industry, the winners tend to be those that recognize the change, and start accelerating their efforts towards the new model. Conversely, those that resist the change without good reason tend to suffer the consequences.

    I think this statement is broadly applicable to just about any industry (not just IT). There’s little debate remaining that cloud concepts will significantly change the IT industry for just about every participant: vendor, solution provider, IT organization and user.

    Any closing thoughts or comments?

    Personally, I think Dimension Data is uniquely well positioned to serve their clients well through this transition.

    The pieces are pretty easy to see – Dimension Data has access to a broad range of relevant technologies (including EMC’s!), they have the intellectual prowess to develop the new services and skills that customers will demand, they have the important “trusted advisor” relationship with their clients, and – now as part of NTT – they’ve got access to a world-class service provider infrastructure as part of their offer.

    Indeed, the business of IT really hasn’t changed – but how we go about doing it is certainly changing. And I think Dimension Data serves as the prototypical example of what customers are going to want from their partners going forward.

    Thanks for the opportunity to share a few thoughts!




    "

    Wednesday 8 September 2010

    Using Object Database db4o as Storage Provider in Voldemort

    Using Object Database db4o as Storage Provider in Voldemort: "
    Abstract: In this article I will show you how easy it is to add support for Versant’s object database, db4o, in project Voldemort, an Apache-licensed distributed key-value storage system used at LinkedIn which is useful for certain high-scalability storage problems where simple functional partitioning is not sufficient. (Note: Voldemort borrows heavily from Amazon’s Dynamo; if you’re interested in this technology, the available papers about Dynamo should be useful.)

    Voldemort’s storage layer is completely mockable so development and unit testing can be done against a throw-away in-memory storage system without needing a real cluster or, as we will show next, even a real storage engine like db4o for simple testing.

    Note: All db4o related files that are part of this project are publicly available on ☞ GitHub

    This is an article contributed by German Viscuso from db4objects (a division of Versant Corporation).


    Content:

    1. Voldemort in a Nutshell

    2. Key/Value Storage System

      • Pros and Cons

    3. Logical Architecture
    4. db4o API in a Nutshell
    5. db4o as Key/Value Storage Provider
    6. db4o and BerkeleyDB side by side
    7. Conclusion

    Voldemort in a Nutshell


    In one sentence, Voldemort is basically a big, distributed, persistent, fault-tolerant hash table designed to trade off consistency for availability.

    It’s implemented as a highly available key-value storage system for services that need to provide an “always-on” experience. To achieve this level of availability, Voldemort sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution.

    Voldemort targets applications that operate with weak/eventual consistency (the “C” in ACID) if this results in high availability even under occasional network failures. It does not provide any isolation guarantees and permits only single key updates. It is well known that when dealing with the possibility of network failures, strong consistency and high data availability cannot be achieved simultaneously.

    The system uses a series of known techniques to achieve scalability and availability: data is partitioned and replicated using consistent hashing, and consistency is facilitated by object versioning. The consistency among replicas during updates is maintained by a quorum-like technique and a decentralized replica synchronization protocol. It employs a gossip-based distributed failure detection and membership protocol and is a completely decentralized system with minimal need for manual administration. Storage nodes can be added and removed without requiring any manual partitioning or redistribution.
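
    To make the consistent hashing idea above concrete, here is a minimal, self-contained Java sketch (not Voldemort’s actual implementation): nodes are placed on a hash ring, a key is served by the first node at or after its hash position, and adding or removing a node only remaps the keys between that node and its neighbour.

    import java.util.SortedMap;
    import java.util.TreeMap;

    public class ConsistentHashRing {
        private final SortedMap<Integer, String> ring = new TreeMap<>();
        private final int virtualNodes = 100; // several positions per node smooths the distribution

        public void addNode(String node) {
            for (int i = 0; i < virtualNodes; i++)
                ring.put(hash(node + "#" + i), node);
        }

        public void removeNode(String node) {
            for (int i = 0; i < virtualNodes; i++)
                ring.remove(hash(node + "#" + i));
        }

        public String nodeFor(String key) {
            if (ring.isEmpty()) throw new IllegalStateException("no nodes in the ring");
            SortedMap<Integer, String> tail = ring.tailMap(hash(key));
            // wrap around to the first node if we fall past the end of the ring
            return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
        }

        private int hash(String s) {
            return s.hashCode(); // a production system would use a stronger hash (e.g. MD5 or FNV)
        }
    }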

    Voldemort is not a relational database, it does not attempt to satisfy arbitrary relations while satisfying ACID properties. Nor is it an object database that attempts to transparently map object reference graphs. Nor does it introduce a new abstraction such as document-orientation.

    However, given the current complex scenario of persistence solutions in the industry we’ll see that different breeds can be combined to offer solutions that become the right tool for the job at hand. In this case we’ll see how a highly scalable NoSQL solution such as Voldemort can use a fast Java based embedded object database (Versant’s db4o) as the underlying persistence mechanism. Voldemort provides the scalability while db4o provides fast key/value pair persistence.


    Key/Value Storage System


    To enable high performance and availability Voldemort allows only very simple key-value data access. Both keys and values can be complex compound objects including lists or maps, but nonetheless the only supported queries are effectively the following:

    value = store.get( key )
    store.put( key, value )
    store.delete( key )
    

    This is clearly not good enough for all storage problems; there are a variety of trade-offs:

    Cons:

    • no complex query filters
    • all joins must be done in code
    • no foreign key constraints
    • no triggers/callbacks

    Pros:

    • only efficient queries are possible, very predictable performance
    • easy to distribute across a cluster
    • service-orientation often disallows foreign key constraints and forces joins to be done in code anyway (because key refers to data maintained by another service)
    • using a relational db you need a caching layer to scale reads, the caching layer typically forces you into key-value storage anyway
    • often end up with xml or other denormalized blobs for performance anyway
    • clean separation of storage and logic (SQL encourages mixing business logic with storage operations for efficiency)
    • no object-relational mismatch, no mapping

    Having a three-operation interface makes it possible to transparently mock out the entire storage layer and unit test using a mock-storage implementation that is little more than a HashMap (a minimal sketch of such a mock follows). This makes unit testing outside of a particular container or environment much more practical and was certainly instrumental in simplifying the db4o storage engine implementation.
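
    A rough sketch of what such a HashMap-backed mock might look like; the class name and the decision to keep a list of values per key (to mirror Voldemort’s multiple-versions-per-key model) are assumptions made for illustration, not code from the project:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.ConcurrentHashMap;

    // Throw-away in-memory mock of the three-operation storage interface.
    class InMemoryMockStore<K, V> {
        private final ConcurrentHashMap<K, List<V>> map = new ConcurrentHashMap<>();

        List<V> get(K key) {
            return map.getOrDefault(key, Collections.emptyList());
        }

        void put(K key, V value) {
            map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }

        void delete(K key) {
            map.remove(key);
        }
    }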


    Logical Architecture


    In Voldemort, each storage node has three main software component groups: request coordination, membership and failure detection, and a local persistence engine. All these components are implemented in Java.

    If we take a closer look we’ll see that each layer in the code implements a simple storage interface that does put, get, and delete. Each of these layers is responsible for performing one function such as TCP/IP network communication, serialization, version reconciliation, inter-node routing, etc. For example the routing layer is responsible for taking an operation, say a PUT, and delegating it to all the N storage replicas in parallel, while handling any failures.
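
    As a hedged illustration of that layered design (these are not Voldemort’s actual classes), each layer can be modelled as an implementation of the same tiny get/put/delete interface that wraps the layer below it; a routing layer then fans a put out to N replicas in parallel:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    interface SimpleStore<K, V> {
        V get(K key);
        void put(K key, V value);
        void delete(K key);
    }

    // Routing layer: delegates a put to all replicas in parallel and waits for them.
    class RoutingStore<K, V> implements SimpleStore<K, V> {
        private final List<SimpleStore<K, V>> replicas;
        private final ExecutorService pool = Executors.newCachedThreadPool();

        RoutingStore(List<SimpleStore<K, V>> replicas) { this.replicas = replicas; }

        public void put(K key, V value) {
            List<Future<?>> writes = new ArrayList<>();
            for (SimpleStore<K, V> replica : replicas)
                writes.add(pool.submit(() -> replica.put(key, value)));
            for (Future<?> write : writes) {
                try { write.get(); }                  // block until the replica write finishes
                catch (Exception e) { /* a real router tolerates a subset of failures */ }
            }
        }

        public V get(K key) { return replicas.get(0).get(key); } // simplified read path
        public void delete(K key) { replicas.forEach(r -> r.delete(key)); }
    }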





    Voldemort’s local persistence component allows for different storage engines to be plugged in. Engines that are in use are Oracle’s Berkeley Database (BDB), Oracle’s MySQL, and an in-memory buffer with persistent backing store. The main reason for designing a pluggable persistence component is to choose the storage engine best suited for an application’s access patterns. For instance, BDB can handle objects typically in the order of tens of kilobytes whereas MySQL can handle objects of larger sizes. Applications choose Voldemort’s local persistence engine based on their object size distribution. The majority of Voldemort’s production instances currently use BDB.

    In this article we’ll provide details about the creation of a new storage engine (the yellow box layer on the image above) that uses db4o instead of Oracle’s Berkeley DB (BDB), Oracle’s MySQL or just memory. Moreover, we’ll show that db4o’s performance is on par with BDB (if not better) while introducing a significant reduction in implementation complexity.

    db4o API in a Nutshell


    db4o’s basic API is not far away from Voldemort’s API in terms of simplicity but it’s not limited to the storage of key/value pairs. db4o is a general purpose object persistence engine that can deal with the peculiarities of native POJO persistence that exhibit arbitrarily complex object references.

    The very basic API looks like this:

    container.store( object )
    container.delete( object )
    objectSet = container.queryByExample( prototype )
    

    Two more advanced querying mechanisms are available and we will introduce one of them (SODA) in the following sections. As you can see, in order to provide an interface for key/value storage (what Voldemort expects), we’ll have to provide an intermediate layer which will take requests via the basic Voldemort API and translate them into db4o API calls.

    db4o as Key/Value Storage Provider


    db4o fits seamlessly into the Voldemort picture as a storage provider because:

    • db4o allows free-form storage of objects with no special requirements on object serialization, which results in a highly versatile storage solution that can easily be configured to store anything from low-level objects (e.g. byte arrays) to objects of arbitrary complexity (both for keys and values in key/value pairs)
    • db4o’s b-tree based indexing capabilities allow for quick retrieval of keys while acting as a key/value pair storage provider
    • db4o’s simple query system allows the retrieval of key/value pairs with one line of code
    • db4o can persist versioned objects directly which frees the developer from having to use versioning wrappers on stored values

    db4o can act as a simpler but powerful replacement for Oracle’s Berkeley DB (BDB) because the complexity of the db4o storage provider is closer to Voldemort’s “in memory” storage provider than to the included BDB provider, which results in faster test and development cycles (remember Voldemort is a work in progress).

    As a first step in the implementation we provide a generic class to expose db4o as a key/value storage provider which is not dependent on Voldemort’s API (and is also useful to db4o developers who need this functionality in any application). The class is voldemort.store.db4o.Db4oKeyValueProvider and relies on Java generics so it can be used with any sort of key/value objects.

    This class implements the following methods, which are more or less self-explanatory considering they operate on a store of key/value pairs:

    voldemort.store.db4o.Db4oKeyValueProvider.keyIterator()
    voldemort.store.db4o.Db4oKeyValueProvider.getKeys()
    voldemort.store.db4o.Db4oKeyValueProvider.getValues(Key)
    voldemort.store.db4o.Db4oKeyValueProvider.pairIterator()
    voldemort.store.db4o.Db4oKeyValueProvider.get(Key)
    voldemort.store.db4o.Db4oKeyValueProvider.delete(Key)
    voldemort.store.db4o.Db4oKeyValueProvider.delete(Key, Value)
    voldemort.store.db4o.Db4oKeyValueProvider.delete(Db4oKeyValuePair)
    voldemort.store.db4o.Db4oKeyValueProvider.put(Key, Value)
    voldemort.store.db4o.Db4oKeyValueProvider.put(Db4oKeyValuePair)
    voldemort.store.db4o.Db4oKeyValueProvider.getAll()
    voldemort.store.db4o.Db4oKeyValueProvider.getAll(Iterable)
    voldemort.store.db4o.Db4oKeyValueProvider.truncate()
    voldemort.store.db4o.Db4oKeyValueProvider.commit()
    voldemort.store.db4o.Db4oKeyValueProvider.rollback()
    voldemort.store.db4o.Db4oKeyValueProvider.close()
    voldemort.store.db4o.Db4oKeyValueProvider.isClosed()
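
    A hedged usage sketch based only on the method names listed above; the constructor arguments and the generic parameters chosen here are assumptions for illustration, not the project’s actual signatures:

    import java.util.List;

    // Hypothetical usage of the provider; construction details are assumed.
    Db4oKeyValueProvider<String, String> provider =
            new Db4oKeyValueProvider<String, String>(/* db4o container or database path */);

    provider.put("user:42", "Jane");                      // store a key/value pair
    List<String> values = provider.getValues("user:42");  // a key may hold several values/versions
    provider.delete("user:42");                           // remove the pair(s) for the key
    provider.commit();                                    // make the changes durable
    provider.close();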
    

    Second, we implement a db4o StorageEngine following Voldemort’s API, namely voldemort.store.db4o.Db4oStorageEngine, which defines the key (K) as ByteArray and the value (V) as byte[] for the key/value classes.

    Since this class inherits from abstract class voldemort.store.StorageEngine it must (and it does) implement the following methods:

    voldemort.store.db4o.Db4oStorageEngine.getVersions(K)
    voldemort.store.db4o.Db4oStorageEngine.get(K)
    voldemort.store.db4o.Db4oStorageEngine.getAll(Iterable)
    voldemort.store.db4o.Db4oStorageEngine.put(K, Versioned)
    voldemort.store.db4o.Db4oStorageEngine.delete(K, Version)
    

    We also implement the class that handles the db4o storage configuration: voldemort.store.db4o.Db4oStorageConfiguration. This class is responsible for providing all the initialization parameters when the db4o database is created (e.g. index creation on the key (K) field for fast retrieval of key/value pairs).
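
    For readers unfamiliar with db4o, indexing a field at configuration time looks roughly like the snippet below. These are standard db4o embedded API calls, and the "key" field name matches the SODA query shown later in the article, but the database file name is an assumption; the article does not show Db4oStorageConfiguration’s actual code:

    import com.db4o.Db4oEmbedded;
    import com.db4o.ObjectContainer;
    import com.db4o.config.EmbeddedConfiguration;

    EmbeddedConfiguration config = Db4oEmbedded.newConfiguration();
    // Index the "key" field of the stored pair class for fast key lookups.
    config.common().objectClass(Db4oKeyValuePair.class).objectField("key").indexed(true);
    ObjectContainer container = Db4oEmbedded.openFile(config, "voldemort-store.db4o");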

    Third and last we provide three support classes:


    • voldemort.store.db4o.Db4oKeyValuePair : a generic key/value pair class. Instances of this class will be what ultimately gets stored in the db4o database.

    • voldemort.store.db4o.Db4oKeysIterator : an Iterator implementation that allows iterating over the keys (K).

    • voldemort.store.db4o.Db4oEntriesIterator : an Iterator implementation that allows iterating over the values (V).

    and a few test classes: voldemort.CatDb4oStore, voldemort.store.db4o.Db4oStorageEngineTest (this is the main test class), voldemort.store.db4o.Db4oSplitStorageEngineTest.

    db4o and BerkeleyDB side by side


    Let’s take a look at db4o’s simplicity by comparing the matching methods side by side with BDB (now both can act as Voldemort’s low level storage provider).

    Constructors: BdbStorageEngine() vs Db4oStorageEngine()


    The db4o constructor gets rid of the serializer object because db4o can store objects of any complexity with no transformation. The code:

    this.versionSerializer = new Serializer<Version>() {
        public byte[] toBytes(Version object) {
            return ((VectorClock) object).toBytes();
        }

        public Version toObject(byte[] bytes) {
            return versionedSerializer.getVersion(bytes);
        }
    };
    

    is no longer necessary and, of course, this impacts all storage operations because db4o requires no conversions “to bytes” and “to object” back and forth:

    Fetch by Key



    BDB (fetch by key):


    DatabaseEntry keyEntry = new DatabaseEntry(key.get());
    DatabaseEntry valueEntry = new DatabaseEntry();
    List<Versioned<byte[]>> results = Lists.newArrayList();

    // walk every duplicate entry stored under this key
    for(OperationStatus status = cursor.getSearchKey(keyEntry, valueEntry, lockMode);
            status == OperationStatus.SUCCESS;
            status = cursor.getNextDup(keyEntry, valueEntry, lockMode)) {
        results.add(serializer.toObject(valueEntry.getData()));
    }
    return results;
    


    db4o (fetch by key)


    Query query = getContainer().query();
    query.constrain(Db4oKeyValuePair.class);
    query.descend("key").constrain(key);
    return query.execute();
    

    In this example we use “SODA”, a low-level and powerful graph-based query system where you basically build a query tree and pass it to db4o for execution.


    Store Key/Value Pairs



    BDB (store key/value pair):


    DatabaseEntry keyEntry = new DatabaseEntry(key.get());
    boolean succeeded = false;
    transaction = this.environment.beginTransaction(null, null);

    // Check existing values:
    // if there is a version obsoleted by this value, delete it;
    // if there is a version later than this one, throw an exception.
    DatabaseEntry valueEntry = new DatabaseEntry();
    cursor = getBdbDatabase().openCursor(transaction, null);

    for(OperationStatus status = cursor.getSearchKey(keyEntry, valueEntry, LockMode.RMW);
            status == OperationStatus.SUCCESS;
            status = cursor.getNextDup(keyEntry, valueEntry, LockMode.RMW)) {

        VectorClock clock = new VectorClock(valueEntry.getData());
        Occured occured = value.getVersion().compare(clock);

        if(occured == Occured.BEFORE)
            throw new ObsoleteVersionException();
        else if(occured == Occured.AFTER)
            cursor.delete(); // best effort delete of obsolete previous value!
    }

    // Okay so we cleaned up all the prior stuff, so now we are good to
    // insert the new thing
    valueEntry = new DatabaseEntry(versionedSerializer.toBytes(value));
    OperationStatus status = cursor.put(keyEntry, valueEntry);
    if(status != OperationStatus.SUCCESS)
        throw new PersistenceFailException("Put operation failed: " + status);
    succeeded = true;
    


    db4o (store key/value pair):


    boolean succeeded = false;
    candidates = keyValueProvider.get(key);

    for(Db4oKeyValuePair<ByteArray, Versioned<byte[]>> pair : candidates) {
        Occured occured = value.getVersion().compare(pair.getValue().getVersion());
        if(occured == Occured.BEFORE)
            throw new ObsoleteVersionException();
        else if(occured == Occured.AFTER)
            keyValueProvider.delete(pair); // best effort delete of obsolete previous value!
    }

    // Okay so we cleaned up all the prior stuff, so we can now insert
    try {
        keyValueProvider.put(key, value);
        succeeded = true;
    } catch(Db4oException de) {
        throw new PersistenceFailException("Put op failed: " + de.getMessage());
    }
    

    Status of Unit Tests


    Let’s take a look at the current status of the unit tests for the db4o storage provider (Db4oStorageEngineTest):


    1. testPersistence: single key/value pair storage and retrieval with storage engine close operation in between (it takes some time because this is the first test and the Voldemort system is kick started).

    2. testEquals: test that retrieval of storage engine instance by name gets you the same instance.

    3. testNullConstructorParameters: null constructors for storage engine instantiation are illegal. Arguments are storage engine name and database configuration.

    4. testSimultaneousIterationAndModification: threaded test of simultaneous puts (inserts), deletes and iteration over pairs (150 puts and 150 deletes before start of iteration).

    5. testGetNoEntries: test that an empty store returns zero pairs.

    6. testGetNoKeys: test that an empty store returns zero keys.

    7. testKeyIterationWithSerialization: test storage and key retrieval of 5 pairs serialized as Strings.

    8. testIterationWithSerialization: test storage and retrieval of 5 pairs serialized as Strings.

    9. testPruneOnWrite: stores 3 versions for one key and tests prune (overwriting of previous versions should happen).

    10. testTruncate: stores 3 pairs and issues a truncate database operation. Then verifies db is empty.

    11. testEmptyByteArray: stores 1 pair with zeroed key and tests correct retrieval.

    12. testNullKeys: test that basic operations (get, put, getAll, delete, etc.) fail with a null key parameter (5 operations on pairs total).

    13. testPutNullValue: test that put operation works correctly with a null value (2 operations on pairs total, put and get).

    14. testGetAndDeleteNonExistentKey: test that key of non persistent pair doesn’t return a value.

    15. testFetchedEqualsPut: stores 1 pair with complex version and makes sure only one entry is stored and retrieved.

    16. testVersionedPut: tests that obsolete or equal versions of a value can’t be stored. Test correct storage of incremented version (13 operations on pairs total).

    17. testDelete: puts 2 pairs with conflicting versions and deletes one (7 operations on pairs total).

    18. testGetVersions: Test retrieval of different versions of pair (3 operations on pairs total).

    19. testGetAll: puts 10 pairs and tests if all can be correctly retrieved at once (getAll)

    20. testGetAllWithAbsentKeys: same as before but with non persistent keys (getAll returns 0 pairs).

    21. testCloseIsIdempotent: tests that a second close does not result in error.




    Conclusion


    db4o provides a low-level implementation of a key/value storage engine that is both simpler than BDB and on par (if not better) in performance. Moreover, the implementation shows that different persistence solutions such as the typical highly scalable, eventually consistent NoSQL engine and object databases can be combined to provide a solution that becomes the right tool for the job at hand, which makes typical “versus” arguments obsolete.

    Finally, the short implementation path taken to make db4o act as a reliable key/value storage provider shows the power and versatility of native embedded object databases.


    Original title and link for this post: Using Object Database db4o as Storage Provider in Voldemort (published on the NoSQL blog: myNoSQL)

    "