Introduction to Spark 2.0 for Database Researchers

Michael Armbrust 1
Doug Bateman 2
Reynold Xin 1
Matei Zaharia 3
1 Databricks, San Francisco, CA, USA
2 Databricks, doug.bateman@databricks.com, CA, USA
Publication type: Proceedings Article
Publication date: 2016-06-26
Abstract
Originally started as an academic research project at UC Berkeley, Apache Spark is one of the most popular open source projects for big data analytics. Over 1000 volunteers have contributed code to the project; it is supported by virtually every commercial vendor; many universities are now offering courses on Spark. Spark has evolved significantly since the 2010 research paper: its foundational APIs are becoming more relational and structural with the introduction of the Catalyst relational optimizer, and its execution engine is developing quickly to adopt the latest research advances in database systems such as whole-stage code generation.