In Java development, typical data computation problems are characterized by the following:
- A long computation procedure that requires a great deal of debugging
- Data that may come from a database, or from Excel/Txt files
- Data that may come from multiple databases, instead of just one
- Complex computation goals, such as relative position computation and set-related computation
Suppose a sales department needs, based on the order data, to find the top 3 outstanding salesmen ranked by monthly sales for every month from January to the previous month.
Such computations are difficult to handle in Java alone. Although Java is powerful enough and quite convenient to debug, it does not directly implement the common computational algorithms. So Java programmers still have to spend great time and effort implementing details such as aggregating, filtering, grouping, sorting, and ran... (more)
The data computation layer sits between the data persistence layer and the application layer. It is responsible for computing the data from the persistence layer and returning the result to the application layer. A data computation layer for Java aims to reduce the coupling between these two layers and to shift the computational workload away from them. A typical computation layer is characterized by the following features:
Ability to compute on data from arbitrary data persistence layers: not only databases, but also non-database sources such as Excel, Txt, or XML files. Of all these computations, the... (more)
The Big Data real-time application is a scenario in which computation and analysis results must be returned in real time even when the volume of data is huge. This is an emerging demand on database applications in recent years.
In the past, data volumes were small, the computations were simple, and there was little parallelism, so the pressure on the database was not great. A high-end or mid-range database server or cluster could allocate enough resources to meet the demand. Moreover, in order to access the current business data and the historical data rapidly and in parallel, users also tended t... (more)
As we know, the stored procedure is designed to handle computations involving complex business logic.
In the past, data structures and business logic were so simple that one SQL statement was enough to achieve the user's computational goal. With the rapid growth of the information industry, users increasingly find that they must achieve ever more complex computational goals to outperform their competitors. For such computations, SQL alone is far from enough. Database programmers have additional demands regarding judgment and loop statements, branches at multip... (more)
The low efficiency of Hadoop computation is an undeniable fact. We believe one of the major reasons is that MapReduce, the underlying computational structure of Hadoop, is essentially external-memory computation: it implements data exchange through frequent external-memory reads and writes. Because the efficiency of file I/O is two orders of magnitude lower than that of memory access, the computational performance of Hadoop is unlikely to be high.
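The difference between the two styles of data exchange can be shown with a toy two-stage pipeline. The example below is a deliberately simplified sketch, not how Hadoop itself is implemented: the same aggregation is done once with the intermediate result held in memory, and once with the intermediate result written to and re-read from a temporary file, the way an external-memory pipeline exchanges data between stages:

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

public class MemoryVsDisk {
    // Stage 1 squares each value; stage 2 sums the squares.
    // In-memory variant: the intermediate array stays in RAM.
    static long sumInMemory(int[] data) {
        int[] squared = Arrays.stream(data).map(x -> x * x).toArray();
        return Arrays.stream(squared).asLongStream().sum();
    }

    // External-memory variant: the intermediate result goes through a file,
    // adding a full write pass and a full read pass per data exchange.
    static long sumViaDisk(int[] data) throws IOException {
        Path tmp = Files.createTempFile("stage1", ".txt");
        try (BufferedWriter w = Files.newBufferedWriter(tmp)) {
            for (int x : data) { w.write(Integer.toString(x * x)); w.newLine(); }
        }
        long sum = 0;
        try (BufferedReader r = Files.newBufferedReader(tmp)) {
            String line;
            while ((line = r.readLine()) != null) sum += Long.parseLong(line);
        }
        Files.delete(tmp);
        return sum;
    }

    public static void main(String[] args) throws IOException {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumInMemory(data)); // 30
        System.out.println(sumViaDisk(data));  // 30, but via two file passes
    }
}
```

Both variants return the same answer; the second simply pays file I/O on every exchange, which is the overhead the paragraph above attributes to MapReduce-style processing.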
As for normal users, they usually have a small cluster with only tens or scores of nodes. T... (more)