Over the last several years, two important trends have emerged that require additional thought when putting together an architecture for a hosted service. The ability to analyze and process enormous amounts of data is increasingly important. From a technology perspective, the two trends to focus on are:
1. Batch processing - the increasing awareness of batch processing and the recent uptick in use of the MapReduce paradigm for that purpose. Distributed computing is a field of computer science that studies distributed systems.
2. NoSQL stores - the rise of so-called "NoSQL" stores and their use to serve up data to online users. A distributed file system, or network file system, is any file system that allows access to files from multiple hosts sharing via a computer network. This makes it possible for multiple users on multiple machines to share files and storage resources.
Both of these trends represent significant advances in the way that hosted systems are developed. But to derive the most value from an entire system, developers must think holistically about how these two areas work together.
This book is your ultimate resource for Storing and managing big data- NoSQL, Hadoop and more. Here you will find the most up-to-date information, analysis, background and everything you need to know.
In easy-to-read chapters, with extensive references and links to help you learn all there is to know about Storing and managing big data- NoSQL, Hadoop and more right away, this book covers: Distributed data store, Background Intelligent Transfer Service, BATON Overlay, BitVault, Bootstrapping node, Chimera (software library), Chord (peer-to-peer), Cloud (operating system), CoDeeN, Collaber, Collanos, Comparison of streaming media systems, Comparison of video hosting services, Content addressable network, Content delivery network, Coral Content Distribution Network, Data center, Distributed file system, Distributed hash table, Distributed Networking, FAROO, Globule (CDN), GlusterFS, Grid casting, Hibari (database), High performance cloud computing, HTTP(P2P), Hyper distribution, Infrastructure for Resilient Internet Systems, Jigdo, JXTA, Kademlia, Key-based routing, Koorde, Legion (software), MagmaFS, Metalink, NeoEdge Networks, Octoshape, Ono (P2P), Osiris (Serverless Portal System), OverSim, P-Grid, P2P-Next, P2PTV, PAST storage utility, Pastry (DHT), Peer-to-peer wiki, Prefix hash tree, Proactive network Provider Participation for P2P, Rawflow, Sciencenet, Similarity Enhanced Transfer, Space-based architecture, Superdistribution, Tapestry (DHT), Tulip Overlay, Tuotu, Web acceleration, YaCy, Aquiles, BigTable, Apache Cassandra, Column family, Hector (API), Keyspace (distributed data store), NoSQL, Standard column family, Super column family, Tombstone (data store), Voldemort (distributed data store), Andrew File System, Apache Hadoop, Apache Hive, BigCouch, Ceph, The Circle (file system), Cloudant, Cloudera, CloudStore, DCE Distributed File System, Direct Access File System, Distributed File System (Microsoft), FhGFS, Gfarm file system, Global Storage Architecture, Google File System, HAMMER, IBM General Parallel File System, Infinit, Lustre (file system), MapR, Moose File System, OFFSystem, OneFS distributed file system, Parallel Virtual File System, POHMELFS, Sector/Sphere, Storage@home, Tahoe Least-Authority Filesystem, Wuala, XtreemFS.
This book explains in depth the real drivers and workings of Storing and managing big data- NoSQL, Hadoop and more. It reduces the risk of your technology, time and resource investment decisions by enabling you to compare your understanding of Storing and managing big data- NoSQL, Hadoop and more with the objectivity of experienced professionals.