Data-Intensive Text Processing with MapReduce
by Jimmy Lin, Chris Dyer
Publisher: Morgan & Claypool Publishers 2010
ISBN/ASIN: 1608453421
ISBN-13: 9781608453429
Number of pages: 175
Description:
This book focuses on MapReduce algorithm design, with an emphasis on text processing algorithms common in natural language processing, information retrieval, and machine learning. We introduce the notion of MapReduce design patterns, which represent general reusable solutions to commonly occurring problems across a variety of problem domains. This book not only aims to help the reader 'think in MapReduce' but also discusses the limitations of the programming model.
Download or read it online for free here:
Download link
(1.7MB, PDF)
Similar books
Forensic Analysis of Database Tampering
by Kyriacos E. Pavlou, Richard T. Snodgrass - University of Arizona
This text covers the detection of database tampering via cryptographic hashing. The authors show how to determine when the tampering occurred, what data was tampered with, and who did the tampering. Four successively more sophisticated forensic analysis algorithms are presented.
(21308 views)
Concurrency Control and Recovery in Database Systems
by P. A. Bernstein, V. Hadzilacos, N. Goodman - Addison Wesley
This book is about techniques for concurrency control and recovery. It covers techniques for centralized and distributed computer systems, and for single copy, multiversion, and replicated databases. Example applications are included.
(23115 views)
Databases, Types, and The Relational Model: The Third Manifesto
by C.J. Date, Hugh Darwen - Addison Wesley
This is a book on database management based on an earlier book by the same authors. It can be seen as an abstract blueprint for the design of a DBMS and the language interface to such a DBMS. It serves as a basis for a model of type inheritance.
(6977 views)
XML and Databases
by Ronald Bourret
This paper gives a high-level overview of how to use XML with databases. It describes how the differences between data-centric and document-centric documents affect their usage with databases and how XML is commonly used with relational databases.
(19617 views)