[release-notes] Added an entry for Apache Spark. - docs-commits - Fedora mailing-lists

23 Oct 2014

commit bf67895773b6e44b270a376914ee780956a6101a
Author: Simon Clark <simon.richard.clark@gmail.com>
Date:   Thu Oct 23 11:43:46 2014 +0100

    Added an entry for Apache Spark.

 en-US/Cluster.xml |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)
---

diff --git a/en-US/Cluster.xml b/en-US/Cluster.xml
index 1fd5d5d..b48826e 100644
--- a/en-US/Cluster.xml
+++ b/en-US/Cluster.xml
@@ -61,4 +61,21 @@
     <para>For more information see: 
     <ulink url="http://pig.apache.org/" />.</para>
   </section>
+  <section id="apache-spark">
+    <title>Apache Spark</title>
+    <para>Apache Spark is a fast, general-purpose engine for
+    large-scale data processing. It supports developing custom
+    analytic applications over large data sets or streaming data.
+    Because it can cache intermediate results in cluster memory
+    and schedule computations as directed acyclic graphs (DAGs),
+    Spark programs can run up to 100x faster than equivalent
+    Hadoop MapReduce jobs. Spark applications are straightforward
+    to develop, run in parallel, and are resilient to failure.
+    They can operate on data from in-memory collections, local
+    files, a Hadoop-compatible filesystem, or a variety of
+    streaming sources. Spark also includes libraries for
+    distributed machine learning and graph algorithms.</para>
+    <para>For more information see: 
+    <ulink url="http://spark.apache.org/" />.</para>
+  </section>
 </section>