The data computation layer sits between the data persistence layer and the
application layer. It computes on the data from the persistence layer and
returns the results to the application layer. A data computation layer for
Java aims to reduce the coupling between these two layers and to take the
computational workload off them. A typical computation layer has the
following features:
Ability to compute on data from arbitrary data persistence layers: not only
databases, but also non-database sources such as Excel, Txt, or XML files.
Of all these computations, the key one is computation on the most common
kind of data, structured data.
Ability to perform interactive computations among various data sources
uniformly, including computation across different databases as well as
computation between databases and non-database data sources.
The coupling...
Hadoop's MapReduce is a widely used parallel computing framework.
However, its code reuse mechanism is inconvenient, and passing parameters is
quite cumbersome. Far from the usual experience of simply calling a library
function, I found that both the coder and the caller must keep a sizable
number of precautions in mind when writing even a short piece of program for
others to call.
However, we eventually found that esProc can easily realize code reuse in
Hadoop. Taking again a simple and understandable example of grouping and
summarizing, let's check out a solution with...
In Java development, we occasionally encounter computations similar to data
processing in a database. For instance, suppose there are two frequently
updated Excel sheets, one holding the clients' information and the other the
orders. Given a dynamic product list as input, we need to query the clients
who have bought all the products on that list.
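To make the query concrete, here is a minimal sketch in plain Java of finding the clients who bought every product on a list. It assumes the order rows have already been read from the Excel sheet into memory (the reading step is left out), and the names `Order`, `clientId`, `product`, and `clientsWhoBoughtAll` are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ClientsWhoBoughtAll {

    // Hypothetical shape of one row of the orders sheet.
    public static final class Order {
        public final String clientId;
        public final String product;

        public Order(String clientId, String product) {
            this.clientId = clientId;
            this.product = product;
        }
    }

    // Returns the ids of clients whose orders cover every product in `wanted`,
    // sorted alphabetically. An empty `wanted` matches every client with orders.
    public static Set<String> clientsWhoBoughtAll(List<Order> orders, Set<String> wanted) {
        // Group: client id -> set of distinct products that client bought.
        Map<String, Set<String>> boughtByClient = new HashMap<>();
        for (Order o : orders) {
            boughtByClient.computeIfAbsent(o.clientId, k -> new HashSet<>()).add(o.product);
        }
        // Filter: keep clients whose product set contains the whole list.
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : boughtByClient.entrySet()) {
            if (e.getValue().containsAll(wanted)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Order> orders = new ArrayList<>();
        orders.add(new Order("C1", "pen"));
        orders.add(new Order("C1", "ink"));
        orders.add(new Order("C2", "pen"));
        orders.add(new Order("C3", "ink"));
        orders.add(new Order("C3", "pen"));
        orders.add(new Order("C3", "paper"));

        Set<String> wanted = new HashSet<>(List.of("pen", "ink"));
        System.out.println(clientsWhoBoughtAll(orders, wanted));
    }
}
```

Even in this small form, the grouping and the subset test have to be spelled out by hand, and joining the result back to the clients' sheet would take a further lookup step.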
Here, "computation similar to data processing in a database" refers to
structured-data computation in an application that has no database. Although
Java is capable of handling such computation, the procedure is very
cumbersome and v...
In Java development, we may encounter complex set operations. Java alone
does little to spare programmers the effort of implementing the computation
details, which is time-consuming and leads to poor code reuse. In view of
this, programmers usually resort to a dynamic calculation script for set
operations.
SQL is surely the first kind of script that comes to most programmers'
minds. To their disappointment, however, SQL does not support explicit sets
and cannot represent sets of sets, ordered sets, or generic sets; only the
result set is recognized as a set...
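As a rough illustration of two of the notions named above, the sketch below builds a "set of sets" (groups that remain available as sets after grouping) and uses an "ordered set" (where the position of a member matters) with standard Java collections. The class and method names are hypothetical, and the input names are assumed to be non-empty strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ExplicitSets {

    // Set of sets: each group is itself a set that can be processed further,
    // rather than being collapsed immediately into one aggregate value.
    // Assumes every name is non-empty.
    public static Map<Character, Set<String>> groupByInitial(Collection<String> names) {
        Map<Character, Set<String>> groups = new TreeMap<>();
        for (String name : names) {
            groups.computeIfAbsent(name.charAt(0), k -> new TreeSet<>()).add(name);
        }
        return groups;
    }

    // Ordered set: members are addressed by position, so "the previous
    // member" is meaningful. Returns each value minus the one before it.
    public static List<Integer> differences(List<Integer> ordered) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i < ordered.size(); i++) {
            result.add(ordered.get(i) - ordered.get(i - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(groupByInitial(List.of("ann", "bob", "amy")));
        System.out.println(differences(List.of(3, 5, 10)));
    }
}
```

Both operations are possible in Java, but each needs its own hand-written loop, which is the kind of implementation detail the paragraph above describes.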
Recently, I read "Why Big Data Projects Fail" by Stephen Brobst. I couldn’t
agree more with his opinions, which expose the problem I’ve been worried
about. In this article, I discuss the topic further to remind enterprises to
beware of falling into the same pitfall.
Let’s look at a positive example. As an enterprise that leverages big data
successfully, how does Google make use of it?
1. Collect the raw data: capture the contents of each website, e-mail, or
cookie, and extract the key information.
2. Create the complex syndetic index for this inf...