Mastering Apache Spark 2.0
by Jacek Laskowski
Publisher: GitBook 2016
Number of pages: 1621
This collection of notes (what some may rashly call a 'book') serves as my ultimate place to collect all the nuts and bolts of using Apache Spark. The notes aim to help me design and develop better products with Apache Spark.
by Alan F Gates - O'Reilly Media
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs. The structure of Pig programs is amenable to parallelization, which enables them to handle very large data sets.
by Marc Farley - Microsoft Press
The book describes a storage architecture that some experts are calling a game changer in the infrastructure industry. Called Microsoft hybrid cloud storage, it is a way to integrate cloud storage services with traditional enterprise storage.
by Jan Bodnar - ZetCode
MySQL is a leading open source database management system. This is a MySQL tutorial. It covers the MySQL database, various mysql command line tools, and the SQL language covered by the database engine. It is an introductory tutorial for beginners.
by Julia Silge, David Robinson - O'Reilly Media
With this practical book, you'll explore text-mining techniques with tidytext, a package that the authors developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext can make text analysis easy and effective.