Hadoop's MapReduce is a widely used parallel computing framework.
However, its code reuse mechanism is inconvenient, and passing parameters
is quite cumbersome. Unlike the usual experience of simply calling a library
function, I found that both the author and the caller must keep a sizable
number of precautions in mind when writing even a short piece of code for
others to call.
However, we eventually found that esProc makes code reuse in Hadoop easy.
Taking a simple, easy-to-understand example of grouping and summarizing
again, let's first look at a solution whose reusability is not so great.
Suppose we need to group the large order data set (sales.txt) on HDFS by
salesman (empID) and find the corresponding sales amount for each salesman.
The esProc code is as follows:
Code for summary machine:
Code for node machine:
esProc classifies the distributed computing into two... (more)
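The grouping task described above can be sketched in plain Python to make the data flow concrete. This is not esProc code and not the author's solution. It assumes that each node machine computes partial sums over its share of sales.txt and that the summary machine merges them, which the two headings suggest but the excerpt does not spell out. The tab-separated `empID<TAB>amount` layout, the use of a sum as the summary, and both function names are hypothetical.

```python
from collections import defaultdict


def node_partial_sums(lines):
    """Node side (assumed): sum amounts per empID for one share of the file."""
    totals = defaultdict(float)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        emp_id, amount = line.split("\t")  # assumed layout: empID<TAB>amount
        totals[emp_id] += float(amount)
    return dict(totals)


def summary_merge(partials):
    """Summary side (assumed): merge the per-node partial results."""
    merged = defaultdict(float)
    for partial in partials:
        for emp_id, subtotal in partial.items():
            merged[emp_id] += subtotal
    return dict(merged)


# Two made-up chunks of sales.txt, as two nodes might see them.
chunk_a = ["E01\t100.0", "E02\t50.0", "E01\t25.0"]
chunk_b = ["E02\t10.0", "E03\t70.0"]
totals = summary_merge([node_partial_sums(chunk_a), node_partial_sums(chunk_b)])
```

Because a sum can be merged from partial sums, the node step can run independently on every chunk; a summary such as an average would need each node to return both a sum and a count.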
The upcoming esProc, a tool for data computation and data analytics, has
many notable features. Developed by Raqsoft, the leading provider of
software for business intelligence, the anticipated esProc is intended to
greatly improve analysts' work efficiency.
The cell-style interface of esProc presents code with natural formatting,
alignment, and indentation, which makes manual typesetting unnecessary and
leaves more time to focus on analysis. Besides, the Excel-like interface
enables analysts to write code in a two-dimensional way which
conforms to human thinking mode, reducing pe... (more)
Spreadsheet software is widely used in every industry because of its
flexibility in data computing and analysis. But due to inherent drawbacks,
common business spreadsheet software cannot perform relational queries as
SQL does. A spreadsheet can implement visualized calculation to some extent,
and nontechnical users can perform some rather complex calculations without
having to learn SQL. However, relational queries, the core of SQL, cannot be
implemented in common business spreadsheet software, which
adds complexity to the apparently simple problems of mu... (more)
When developing in Java, we occasionally encounter computations similar to
data processing in a database. For instance, suppose there are two
frequently updated Excel sheets, one holding client information and the
other holding orders. Given a dynamic product list as input, we need to
query the clients who have bought every product on the list.
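The logic of that query can be sketched in a few lines of Python, purely to pin down what "bought every product on the list" means. This is an illustration rather than the approach the post goes on to describe. The `(client, product)` tuples stand in for rows of the orders sheet, and the function name and sample data are made up.

```python
def clients_who_bought_all(orders, product_list):
    """Return the clients whose purchases cover every product in product_list."""
    wanted = set(product_list)
    bought = {}  # client -> set of products that client has ordered
    for client, product in orders:
        bought.setdefault(client, set()).add(product)
    # Keep a client only if the wanted set is a subset of what they bought.
    return sorted(c for c, products in bought.items() if wanted <= products)


# Made-up order rows: (client, product).
orders = [
    ("Ann", "pen"), ("Ann", "ink"),
    ("Bob", "pen"),
    ("Cy", "ink"), ("Cy", "pen"), ("Cy", "pad"),
]
result = clients_who_bought_all(orders, ["pen", "ink"])
```

The subset test is what makes the product list dynamic: the same function works for any list passed in at run time.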
The "computation similar to data processing in database" refers to structured
data computation in an application without a database. Although Java is capable
of handling such computation, the procedure is very cumbersome and v... (more)
The data computation layer sits between the data persistence layer and the
application layer. It is responsible for computing on the data from the data
persistence layer and returning the result to the application layer. The
data computation layer of Java aims to reduce the coupling between these two
layers and to take over the computational workload from them. A typical
computation layer has the following features:
Ability to compute on data from arbitrary data persistence layers: not only
databases, but also non-database sources such as Excel, TXT, or XML files. Of all
these computations, the... (more)