commit bf67895773b6e44b270a376914ee780956a6101a
Author: Simon Clark <simon.richard.clark@gmail.com>
Date:   Thu Oct 23 11:43:46 2014 +0100
Added an entry for Apache Spark.
 en-US/Cluster.xml | 17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)
---
diff --git a/en-US/Cluster.xml b/en-US/Cluster.xml
index 1fd5d5d..b48826e 100644
--- a/en-US/Cluster.xml
+++ b/en-US/Cluster.xml
@@ -61,4 +61,21 @@
 <para>For more information see:
 <ulink url="http://pig.apache.org/" />.</para>
 </section>
+ <section id="apache-spark">
+ <title>Apache Spark</title>
+ <para>Apache Spark is a fast and general engine for large-scale
+ data processing. It supports developing custom analytic
+ processing applications over large data sets or streaming data.
+ Because it has the capability to cache intermediate results in
+ cluster memory and schedule DAGs of computations, Spark
+ programs can run up to 100x faster than equivalent Hadoop
+ MapReduce jobs. Spark applications are easy to develop,
+ parallel, fast, and resilient to failure, and they can operate
+ on data from in-memory collections, local files, a
+ Hadoop-compatible filesystem, or from a variety of streaming
+ sources. Spark also includes libraries for distributed machine
+ learning and graph algorithms.</para>
+ <para>For more information see:
+ <ulink url="http://spark.apache.org/" />.</para>
+ </section>
 </section>
docs-commits@lists.fedoraproject.org