Sunday, 29 September 2013

The Power of Proxies in Java | Javalobby

Clipped from: http://java.dzone.com/articles/power-proxies-java

The Power of Proxies in Java

05.10.2010
In this article, I'll show you the path that leads to true Java power, the use of proxies.
They are everywhere but only a handful of people know about them. Hibernate for lazy loading entities, Spring for AOP, LambdaJ for DSL, only to name a few: they all use their hidden magic. What are they? They are… Java's dynamic proxies.
Everyone knows about the GOF Proxy design pattern:
Allows for object level access control by acting as a pass through entity or a placeholder object.
Likewise, in Java, a dynamic proxy is an instance that acts as a pass-through to the real object. This powerful pattern lets you change the real behaviour from a caller's point of view, since method calls can be intercepted by the proxy.

Pure Java proxies

Pure Java proxies have some interesting properties:
  • They are based on runtime implementations of interfaces
  • They are public, final and not abstract
  • They extend java.lang.reflect.Proxy
In Java, the proxy itself is not as important as the proxy's behaviour. The latter is done in an implementation of java.lang.reflect.InvocationHandler . It has only a single method to implement:
public Object invoke(Object proxy, Method method, Object[] args)
  • proxy: the proxy instance that the method was invoked on
  • method: the Method instance corresponding to the interface method invoked on the proxy instance. The declaring class of the Method object will be the interface that the method was declared in, which may be a superinterface of the proxy interface that the proxy class inherits the method through
  • args: an array of objects containing the values of the arguments passed in the method invocation on the proxy instance, or null if the interface method takes no arguments. Arguments of primitive types are wrapped in instances of the appropriate primitive wrapper class, such as java.lang.Integer or java.lang.Boolean
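The three parameters can be observed directly with a small recording handler. The sketch below is only an illustration: the Greeter interface and the InvokeArgsDemo class are hypothetical names that do not come from the article. It shows that an int argument arrives wrapped as an Integer, and that args is null (not an empty array) for a method without parameters.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class InvokeArgsDemo {

    // Hypothetical interface, used only for this illustration.
    public interface Greeter {
        String greet(String name, int times);
        void reset();
    }

    /** Calls both Greeter methods on a proxy and returns what the handler observed. */
    public static List<String> observe() {
        List<String> log = new ArrayList<>();
        InvocationHandler handler = (proxy, method, args) -> {
            String seen = (args == null)
                    ? "null"
                    : args.length + " args, last is "
                            + args[args.length - 1].getClass().getSimpleName();
            log.add(method.getDeclaringClass().getSimpleName()
                    + "." + method.getName() + ": " + seen);
            return null; // acceptable here: neither method returns a primitive
        };
        Greeter greeter = (Greeter) Proxy.newProxyInstance(
                InvokeArgsDemo.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
        greeter.greet("Ann", 2); // the int 2 reaches the handler as an Integer
        greeter.reset();         // no parameters: args is null
        return log;
    }

    public static void main(String[] args) {
        // Prints:
        // Greeter.greet: 2 args, last is Integer
        // Greeter.reset: null
        observe().forEach(System.out::println);
    }
}
```

Because args can be null, a handler that inspects the arguments should check for null before reading args.length.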
Let's take a simple example: suppose we want a List to which no elements can be added. The first step is to create the invocation handler:
    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Method;
    import java.util.List;

    public class NoOpAddInvocationHandler implements InvocationHandler {

        // The real list that non-intercepted calls are forwarded to
        private final List proxied;

        public NoOpAddInvocationHandler(List proxied) {
            this.proxied = proxied;
        }

        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {

            // Swallow add(...), addAll(...) and friends without touching the real list
            if (method.getName().startsWith("add")) {
                return false;
            }

            return method.invoke(proxied, args);
        }
    }
The invoke method intercepts method calls and does nothing if the method name starts with "add". Otherwise, it passes the call to the real proxied object. This is a very crude example, but it is enough to let us understand the magic behind it.
Notice that if you want a method call to pass through, you need to call the method on the real object. For this, you'll need a reference to the latter, something the invoke method does not provide. That's why, in most cases, it's a good idea to pass it to the constructor and store it as an attribute.
Note: under no circumstances should you call the method on the proxy itself, since it will be intercepted again by the invocation handler and you will be faced with a StackOverflowError.
To create the proxy itself:
    // list is the real List instance to be wrapped
    List proxy = (List) Proxy.newProxyInstance(
            NoOpAddInvocationHandlerTest.class.getClassLoader(),
            new Class[] { List.class },
            new NoOpAddInvocationHandler(list));
The newProxyInstance method takes 3 arguments:
  • the class loader
  • an array of interfaces that will be implemented by the proxy
  • the power behind the throne in the form of the invocation handler
Now, if you try to add elements to the proxy by calling any add methods, it won't have any effect.
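To see the whole thing run end to end, here is a minimal self-contained sketch of the same idea. It assumes Java 8 or later (the handler is written as a lambda rather than a separate class), and the class name NoOpAddDemo and method name ignoreAdds are made up for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NoOpAddDemo {

    /** Wraps a list in a proxy that ignores every method whose name starts with "add". */
    @SuppressWarnings("unchecked")
    public static <T> List<T> ignoreAdds(List<T> target) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().startsWith("add")) {
                // add(E) and addAll(...) return boolean; for the void
                // add(int, E) the returned value is simply ignored
                return false;
            }
            // Everything else goes to the real list, never to the proxy itself
            return method.invoke(target, args);
        };
        return (List<T>) Proxy.newProxyInstance(
                NoOpAddDemo.class.getClassLoader(),
                new Class<?>[] { List.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> real = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> guarded = ignoreAdds(real);

        System.out.println(guarded.add("c")); // prints false
        guarded.add(0, "d");                  // silently ignored
        System.out.println(guarded.size());   // prints 2
        System.out.println(Proxy.isProxyClass(guarded.getClass())); // prints true
    }
}
```

Note that the real list stays reachable through its own reference, so the proxy only protects callers that are handed the proxy.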

CGLib proxies

Java proxies are runtime implementations of interfaces. Objects do not necessarily implement interfaces, and collections of objects do not necessarily share the same interfaces. Confronted with such needs, Java proxies fail to provide an answer.
Here begins the realm of CGLib. CGLib is a third-party framework, based on bytecode manipulation provided by ASM, that can help with the previous limitations. A word of advice first: CGLib's documentation is not on par with its features. There's neither tutorial nor documentation; a handful of JavaDocs is all you can count on. That said, CGLib waives many limitations enforced by pure Java proxies:
  • you are not required to implement interfaces
  • you can extend a class
For example, since Hibernate entities are POJOs, Java proxies cannot be used for lazy loading; CGLib proxies can.
Pure Java proxies and CGLib proxies have direct counterparts: where you would use Proxy, you use the net.sf.cglib.proxy.Enhancer class; where you would use InvocationHandler, you use net.sf.cglib.proxy.Callback. The two main differences are that Enhancer has a public constructor, and that Callback cannot be used as such but only through one of its subinterfaces:
  • Dispatcher: Dispatching Enhancer callback
  • FixedValue: Enhancer callback that simply returns the value to return from the proxied method
  • LazyLoader: Lazy-loading Enhancer callback
  • MethodInterceptor: General-purpose Enhancer callback which provides for "around advice"
  • NoOp: Methods using this Enhancer callback will delegate directly to the default (super) implementation in the base class
As an introductory example, let's create a proxy that returns the same value for hash code whatever the real object behind it. The feature looks like a MethodInterceptor, so let's implement it as such:
    import java.lang.reflect.Method;

    import net.sf.cglib.proxy.MethodInterceptor;
    import net.sf.cglib.proxy.MethodProxy;

    public class HashCodeAlwaysZeroMethodInterceptor implements MethodInterceptor {

        public Object intercept(Object object, Method method, Object[] args,
                MethodProxy methodProxy) throws Throwable {

            // Short-circuit hashCode(); every other call goes to the superclass implementation
            if ("hashCode".equals(method.getName())) {
                return 0;
            }

            return methodProxy.invokeSuper(object, args);
        }
    }
Looks awfully similar to a Java invocation handler, doesn't it? Now, in order to create the proxy itself:
    Object proxy = Enhancer.create(
            Object.class,
            new HashCodeAlwaysZeroMethodInterceptor());
Likewise, the proxy creation isn't surprising. The real differences are:
  • there's no interface involved in the process
  • the proxy creation process also creates the proxied object. There's no clear cut between proxy and proxied from the caller's point of view
  • thus, the callback method can provide the proxied object and there's no need to create and store it in your own code

Conclusion

This article only brushed the surface of what can be done with proxies. Anyway, I hope it let you see that Java has some interesting features and points of extension, whether out-of-the-box or coming from some third-party framework.
You can find the sources for this article in Eclipse/Maven format here.
From http://blog.frankel.ch/the-power-of-proxies-in-java

We’re on the cusp of deep learning for the masses. You can thank Google later — Tech News and Analysis

Clipped from: http://gigaom.com/2013/08/16/were-on-the-cusp-of-deep-learning-for-the-masses-you-can-thank-google-later/

Google silently did something revolutionary on Thursday. It open sourced a tool called word2vec, prepackaged deep-learning software designed to understand the relationships between words with no human guidance. Just input a textual data set and let underlying predictive models get to work learning.
"This is a really, really, really big deal," said Jeremy Howard, president and chief scientist of data-science competition platform Kaggle. "… It's going to enable whole new classes of products that have never existed before." Think of Siri on steroids, for starters, or perhaps emulators that could mimic your writing style down to the tone.

When deep learning works, it works great 

To understand Howard's excitement, let's go back a few days. It was Monday and I was watching him give a presentation in Chicago about how deep learning was dominating the competition in Kaggle, the online platform where organizations present vexing predictive problems and data scientists compete to create the best models. Whenever someone has used a deep learning model to tackle one of the challenges, he told the room, it has performed better than any model ever previously devised to tackle that specific problem.

Jeremy Howard (left) at Structure: Data 2012 (c) Pinar Ozger / http://www.pinarozger.com
But there's a catch: deep learning is really hard. So far, only a handful of teams in hundreds of Kaggle competitions have used it. Most of them have included Geoffrey Hinton or have been associated with him.
Hinton is a University of Toronto professor who pioneered the use of deep learning for image recognition and is now a distinguished engineer at Google, as well. What got Google really interested in Hinton, at least to the point where it hired him, was his work in an image-recognition competition called ImageNet. For years the contest's winners had been improving only incrementally on previous results, until Hinton and his team used deep learning to improve by an order of magnitude.

Neural networks: A way-simplified overview 

Deep learning, Howard explained, is essentially a bigger, badder take on the neural network models that have been around for some time. It's particularly useful for analyzing image, audio, text, genomic and other multidimensional data that doesn't lend itself well to traditional machine learning techniques.
Neural networks work by analyzing inputs (e.g., words or images) and recognizing the features that comprise them as well as how all those features relate to each other. With images, for example, a neural network model might recognize various formations of pixels or intensities of pixels as features.

A very simple neural network. Source: Wikipedia Commons
Trained against a set of labeled data, the output of a neural network might be the classification of an input as a dog or cat, for example. In cases where there is no labeled training data (a process called self-taught learning), neural networks can be used to identify the common features of their inputs and group similar inputs even though the models can't predict what they actually are. That is what happened when Google researchers constructed neural networks that were able to recognize cats and human faces without having been trained to do so.

Stacking neural networks to do deep learning 

In deep learning, multiple neural networks are "stacked" on top of each other, or layered, in order to create models that are even better at prediction because each new layer learns from the ones before it. In Hinton's approach, each layer randomly omits features — a process called "dropout" — to minimize the chances the model will overfit itself to just the data upon which it was trained. That's a technical way of saying the model won't work as well when trying to analyze new data.
So dropout or similar techniques are critical to helping deep learning models understand the real causality between the inputs and the outputs, Howard explained during a call on Thursday. It's like looking at the same thing under the same lighting all the time versus looking at it in different lighting and from different angles. You'll see new aspects and won't see others, he said, "But the underlying structure is going to be the same each time."
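As a rough illustration of what "randomly omits features" means in code, here is a minimal sketch of dropout applied to one layer's outputs. It is not Hinton's implementation: the class and method names are invented, and rescaling the surviving values by 1/(1 - dropRate) is a common convention ("inverted dropout") that the article does not mention and that is assumed here.

```java
import java.util.Arrays;
import java.util.Random;

public class DropoutSketch {

    /**
     * Returns a copy of the activations in which each entry is zeroed with
     * probability dropRate. Surviving entries are scaled by 1 / (1 - dropRate)
     * so that the expected sum stays the same (an assumed convention).
     */
    public static double[] dropout(double[] activations, double dropRate, Random rng) {
        if (dropRate < 0.0 || dropRate >= 1.0) {
            throw new IllegalArgumentException("dropRate must be in [0, 1)");
        }
        double keepScale = 1.0 / (1.0 - dropRate);
        double[] out = new double[activations.length];
        for (int i = 0; i < activations.length; i++) {
            boolean dropped = rng.nextDouble() < dropRate;
            out[i] = dropped ? 0.0 : activations[i] * keepScale;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] features = {0.2, 0.7, 0.1, 0.9, 0.4};
        // A different random mask is drawn on every call, so each training
        // pass sees the input "under different lighting"
        System.out.println(Arrays.toString(dropout(features, 0.5, new Random())));
        System.out.println(Arrays.toString(dropout(features, 0.5, new Random())));
    }
}
```

Dropout is applied only while training; when the trained model is used on new data, all features are kept.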

An example of what features a neural network might learn from images. Source: Hinton et al
Still, it's difficult to create accurate models and to program them to run on the number of computing cores necessary to process them in a reasonable timeframe. It can also be difficult to train them on enough data to guarantee accuracy in an unsupervised environment. That's why so much of the cutting-edge work in the field is still done by experts such as Hinton, Jeff Dean and Andrew Ng, all of whom had or still have strong ties to Google.
There are open source tools such as Theano and PyLearn2 that try to minimize the complexity, Howard told the audience on Monday, but a user-friendly, commercialized software package could be revolutionary. If data scientists in places outside Google could simply (a relative term if ever there was one) input their multidimensional data and train models to learn it, that could make other approaches to predictive modeling all but obsolete. It wouldn't be inconceivable, Howard noted, that a software package like this could emerge within the next year.

Enter word2vec 

Which brings us back to word2vec. Google calls it "an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words." Those "architectures" are two new natural-language processing techniques developed by Google researchers Tomas Mikolov, Ilya Sutskever, and Quoc Le (Google Fellow Jeff Dean was also involved, although modestly, he told me). They're like neural networks, only simpler so they can be trained on larger data sets.
Kaggle's Howard calls word2vec the "crown jewel" of natural language processing. "It's the English language compressed down to a list of numbers," he said.
Word2vec is designed to run on a system as small as a single multicore machine (Google tested its underlying techniques over days across more than 100 cores on its data center servers). Its creators have shown how it can recognize the similarities among words (e.g., the countries in Europe) as well as how they're related to other words (e.g., countries and capitals). It's able to decipher analogical relationships (e.g., short is to shortest as big is to biggest), word classes (e.g., carnivore and cormorant both relate to animals) and "linguistic regularities" (e.g., "vector('king') - vector('man') + vector('woman') is close to vector('queen')").

Source: Google
Right now, the word2vec Google Code page notes, "The linearity of the vector operations seems to weakly hold also for the addition of several vectors, so it is possible to add several word or phrase vectors to form representation of short sentences."
This is accomplished by turning words into numbers that correlate with their characteristics, Howard said. Words that express positive sentiment, adjectives, nouns associated with sporting events — they'll all have certain numbers in common based on how they're used in the training data (so bigger data is better).
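The "king - man + woman is close to queen" regularity can be reproduced on paper with a handful of toy vectors. The sketch below is only an illustration: the three-dimensional vectors are hand-made (they are not word2vec output, and real vectors have hundreds of learned dimensions), and the class and method names are invented. Closeness is measured with cosine similarity, and the query words themselves are excluded from the candidates.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class WordAnalogySketch {

    // Hand-made toy vectors; dimensions: royalty, masculinity, femininity.
    static final Map<String, double[]> VECTORS = new LinkedHashMap<>();
    static {
        VECTORS.put("king",   new double[] {0.9, 0.9, 0.1});
        VECTORS.put("queen",  new double[] {0.9, 0.1, 0.9});
        VECTORS.put("man",    new double[] {0.1, 0.9, 0.1});
        VECTORS.put("woman",  new double[] {0.1, 0.1, 0.9});
        VECTORS.put("prince", new double[] {0.8, 0.9, 0.1});
        VECTORS.put("apple",  new double[] {0.0, 0.2, 0.2});
    }

    /** Cosine similarity: 1.0 means the two vectors point in the same direction. */
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the word closest to vector(a) - vector(b) + vector(c), ignoring a, b and c. */
    public static String analogy(String a, String b, String c) {
        double[] va = VECTORS.get(a), vb = VECTORS.get(b), vc = VECTORS.get(c);
        double[] target = new double[va.length];
        for (int i = 0; i < target.length; i++) {
            target[i] = va[i] - vb[i] + vc[i];
        }
        Set<String> excluded = new HashSet<>(Arrays.asList(a, b, c));
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : VECTORS.entrySet()) {
            if (excluded.contains(entry.getKey())) {
                continue;
            }
            double score = cosine(target, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(analogy("king", "man", "woman")); // prints queen
    }
}
```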

Smarter models means smarter apps 

If this is all too esoteric, think about these methods applied to auto-correct or word suggestions in text-messaging apps. Current methods for doing this might be as simple as suggesting words that are usually paired together, Howard explained, meaning a suggestion could be based solely on the word immediately before it. Using deep-learning-based approaches, a texting app could take into account the entire sentence, for example, because the app would have a better understanding of what all the words really mean in context.
Maybe you could average out all the numbers in a tweet, Howard suggested, and get a vector output that would accurately infer the sentiment, subject and level of formality of the tweet. Really, the possibilities are limited only by the types of applications people can think up to take advantage of word2vec's deep understanding of natural language.
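Averaging the vectors of a short text is simple to sketch. The example below makes several assumptions that do not come from the article: words are split on whitespace and lower-cased, words without a vector are skipped, and the two-dimensional vectors used in main are invented for the illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class TweetVectorSketch {

    /** Averages the vectors of all known words in the text; unknown words are skipped. */
    public static double[] average(String text, Map<String, double[]> vectors, int dims) {
        double[] sum = new double[dims];
        int known = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            double[] v = vectors.get(token);
            if (v == null) {
                continue; // no vector for this word
            }
            for (int i = 0; i < dims; i++) {
                sum[i] += v[i];
            }
            known++;
        }
        if (known > 0) {
            for (int i = 0; i < dims; i++) {
                sum[i] /= known;
            }
        }
        return sum; // all zeros when no word was known
    }

    public static void main(String[] args) {
        // Invented toy vectors; dimensions: positivity, sports-relatedness
        Map<String, double[]> vectors = new HashMap<>();
        vectors.put("great", new double[] {1.0, 0.0});
        vectors.put("game",  new double[] {0.0, 1.0});

        double[] tweet = average("Great game tonight", vectors, 2);
        System.out.println(Arrays.toString(tweet)); // prints [0.5, 0.5]
    }
}
```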

An example output file from word2vec that has grouped similar words
The big caveat, however, is that researchers and industry data scientists still need to learn how to use word2vec. There hasn't been a lot of research done on how to best use these types of models, Howard said, and the thousands of researchers working on other methods of natural language processing aren't going to jump ship to Google's tools overnight. Still, he believes the community will come around, and that word2vec and its underlying techniques could make all other approaches to natural language processing obsolete.
And this is just the start. A year from now, Howard predicts, deep learning will have surpassed a whole class of algorithms in other fields (i.e., things other than speech recognition, image recognition and natural language processing), and a year after that it will be integrated into all sorts of software packages. The only questions, and they're admittedly big ones, are how smart deep learning models can get (and whether they'll run into another era of hardware constraints that graphical processing units helped resolve earlier this millennium) and how accessible software packages like word2vec can make deep learning even for relatively unsophisticated users.
"Maybe in 10 years' time," Howard proposed, "we'll get to that next level."

Cajo, the easiest way to accomplish distributed computing in Java | Java Code Geeks

Clipped from: http://www.javacodegeeks.com/2011/01/cajo-easiest-way-to-accomplish.html

Cajo, the easiest way to accomplish distributed computing in Java

January 27th, 2011 | Filed in: Enterprise Java
Derived from the introductory section of Jonas Boner's article "Distributed Computing Made Easy" posted on TheServerSide.com on May 1st 2006 :
"Distributed computing is becoming increasingly important in the world of enterprise application development. Today, developers continuously need to address questions like: How do you enhance scalability by scaling the application beyond a single node? How can you guarantee high-availability, eliminate single points of failure, and make sure that you meet your customer SLAs?
For many developers, the most natural way of tackling the problem would be to divide up the architecture into groups of components or services that are distributed among different servers. While this is not surprising, considering the heritage of CORBA, EJB, COM and RMI that most developers carry around, if you decide to go down this path then you are in for a lot of trouble. Most of the time it is not worth the effort and will give you more problems than it solves."
On the other hand, distributed computing and Java go together naturally. As the first language designed from the bottom up with networking in mind, Java makes it very easy for computers to cooperate. Even the simplest applet running in a browser is a distributed application, if you think about it. The client running the browser downloads and executes code that is delivered by some other system. But even this simple applet wouldn't be possible without Java's guarantees of portability and security: the applet can run on any platform, and can't sabotage its host.
The cajo project is a small library, enabling powerful dynamic multi-machine cooperation. It is surprisingly easy to use, yet unmatched in performance. It is a uniquely 'drop-in' distributed computing framework, meaning it imposes no structural requirements on your applications, nor source changes. It allows multiple remote JVMs to work together seamlessly, as one.
The project owner John Catherino claims "King Of the Mountain! ;-)" and challenges everyone who is willing to prove that there exists a distributed computing framework in Java that is equally flexible and as fast as cajo.
To tell you the truth, I am personally convinced by John's claim, and I strongly believe that you will be too if you just let me walk you through this client-server example. You will be amazed at how easy and flexible the cajo framework is:
The Server.java
    import gnu.cajo.Cajo; // The cajo implementation of the Grail

    public class Server {

       public static class Test { // remotely callable classes must be public
          // though not necessarily declared in the same class
          private final String greeting;
          // no silly requirement to have no-arg constructors
          public Test(String greeting) { this.greeting = greeting; }
          // all public methods, instance or static, will be remotely callable
          public String foo(Object bar, int count) {
             System.out.println("foo called w/ " + bar + ' ' + count + " count");
             return greeting;
          }
          public Boolean bar(int count) {
             System.out.println("bar called w/ " + count + " count");
             return Boolean.TRUE;
          }
          public boolean baz() {
             System.out.println("baz called");
             return true;
          }
          public String other() { // functionality not needed by the test client
             return "This is extra stuff";
          }
       } // arguments and return objects can be custom or common to server and client

       public static void main(String args[]) throws Exception { // unit test
          Cajo cajo = new Cajo(0);
          System.out.println("Server running");
          cajo.export(new Test("Thanks"));
       }
    }
Compile via (on Linux or macOS, use : instead of ; as the classpath separator):
    javac -cp cajo.jar;. Server.java
Execute via:
    java -cp cajo.jar;. Server
As you can see, with just two statements:
    Cajo cajo = new Cajo(0);
    cajo.export(new Test("Thanks"));
we can expose any POJO (Plain Old Java Object) as a distributed service!
And now the Client.java
    import gnu.cajo.Cajo;

    import java.rmi.RemoteException; // caused by network related errors

    interface SuperSet {  // client method sets need not be public
       void baz() throws RemoteException;
    } // declaring RemoteException is optional, but a nice reminder

    interface ClientSet extends SuperSet {
       boolean bar(Integer quantum) throws RemoteException;
       Object foo(String barbaz, int foobar) throws RemoteException;
    } // the order of the client method set does not matter

    public class Client {
       public static void main(String args[]) throws Exception { // unit test
          Cajo cajo = new Cajo(0);
          if (args.length > 0) { // either approach must work...
             int port = args.length > 1 ? Integer.parseInt(args[1]) : 1198;
             cajo.register(args[0], port);
             // find server by registry address & port, or...
          } else Thread.sleep(100); // allow some discovery time

          Object refs[] = cajo.lookup(ClientSet.class);
          if (refs.length > 0) { // compatible server objects found
             System.out.println("Found " + refs.length);
             ClientSet cs = (ClientSet)cajo.proxy(refs[0], ClientSet.class);
             cs.baz();
             System.out.println(cs.bar(Integer.valueOf(77)));
             System.out.println(cs.foo(null, 99));
          } else System.out.println("No server objects found");
          System.exit(0); // nothing else left to do, so we can shut down
       }
    }
Compile via:
    javac -cp cajo.jar;. Client.java
Execute via:
    java -cp cajo.jar;. Client
The client can find server objects either by providing the server address and port (if available) or by using multicast. To locate the appropriate server object "Dynamic Client Subtyping" is used. For all of you who do not know what "Dynamic Client Subtyping" stands for, John Catherino explains in his relevant blog post :
"Oftentimes service objects implement a large, rich interface. Other times service objects implement several interfaces, grouping their functionality into distinct logical concerns. Quite often, a client needs only to use a small portion of an interface; or perhaps some methods from a few of the logical grouping interfaces, to satisfy its own needs.
The ability of a client to define its own interface, from ones defined by the service object, is known as subtyping in Java. (in contrast to subclassing) However, unlike conventional Java subtyping; Dynamic Client Subtyping means creating an entirely different interface. What makes this subtyping dynamic, is that it works with the original, unmodified service object.
This can be a very potent technique, for client-side complexity management."
Isn't that really cool? We just have to define the interface our client "needs" to use and locate the appropriate server object that complies with the client specification. The following statement, taken from our example, accomplishes just that:
    Object refs[] = cajo.lookup(ClientSet.class);
Last but not least, we can create a client-side "proxy" of the server object and remotely invoke its methods just like an ordinary local object reference, by issuing the following statement:
    ClientSet cs = (ClientSet)cajo.proxy(refs[0], ClientSet.class);
That's it. These allow for complete interoperability between distributed JVMs. It just can't get any easier than this.
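To get a feel for the idea without a network or the cajo library, client-defined interfaces can be approximated locally with java.lang.reflect.Proxy. This is only a sketch and not how cajo is implemented: all names below are invented, everything runs inside one JVM, and methods are matched by exact name and parameter types, whereas the cajo example above evidently matches more loosely (the client declares void baz() and bar(Integer) against a server offering boolean baz() and bar(int)).

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ClientSubtypingSketch {

    // Stand-in for a service object; it implements no interface at all.
    public static class Service {
        public String foo(String text, int count) { return text + " x" + count; }
        public boolean baz() { return true; }
        public String other() { return "not needed by this client"; }
    }

    // The client declares only the methods it actually needs.
    public interface ClientView {
        String foo(String text, int count);
        boolean baz();
    }

    /** Builds a proxy that forwards each interface call to the same-named public method of target. */
    public static <T> T view(Object target, Class<T> clientInterface) {
        Object proxy = Proxy.newProxyInstance(
                clientInterface.getClassLoader(),
                new Class<?>[] { clientInterface },
                (p, method, args) -> {
                    // Look up the matching public method on the unmodified service object
                    Method real = target.getClass()
                            .getMethod(method.getName(), method.getParameterTypes());
                    try {
                        return real.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the service's own exception
                    }
                });
        return clientInterface.cast(proxy);
    }

    public static void main(String[] args) {
        ClientView client = view(new Service(), ClientView.class);
        System.out.println(client.baz());          // prints true
        System.out.println(client.foo("hi", 2));   // prints hi x2
    }
}
```

The service class is never modified and never learns about ClientView, which is the essence of the technique described in the quote.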
As far as performance is concerned, I have conducted some preliminary tests on the provided example and achieved an average score of 12000 TPS on the following system :
Sony Vaio with the following characteristics :
  • System : openSUSE 11.1 (x86_64)
  • Processor (CPU) : Intel(R) Core(TM)2 Duo CPU T6670 @ 2.20GHz
  • Processor Speed : 1,200.00 MHz
  • Total memory (RAM) : 2.8 GB
  • Java : OpenJDK 1.6.0_0 64-Bit
For your convenience I provide the code snippet that I used to perform the stress test :
    int repeats = 1000000;
    long start = System.currentTimeMillis();
    for (int i = 0; i < repeats; i++)
        cs.baz();
    System.out.println("TPS : " + repeats / ((System.currentTimeMillis() - start) / 1000d));
Happy Coding! and Don't forget to share!
Justin