Condensed Principles of Big Data

Last night I re-read yesterday's post (Toward Big Data Immutability), and I realized that there really is no effective way to use this blog to teach anyone the mechanics of Big Data construction and analysis. My guess is that many readers were confused by the post, because a single post cannot provide the back-story to the concepts it includes.

So, basically, I give up. If you want to learn the fundamentals of Big Data, you'll need to do some reading. I would recommend my own book, Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information. Depending on your background and agenda, you might prefer one of the hundreds of other books written for this vibrant field (I won't be offended).

The best I can do is to summarize, with a few principles, the basic theme of my book.

1. You cannot create a good Big Data resource without good identifiers. A Big Data resource can be usefully envisioned as a system of identifiers to which data is attached.
2. Data must be described with metadata, and the metadata descriptors should be organized under a classification or an ontology. Doing so will drive down the complexity of the system and will permit heterogeneous data to be shared, merged, and queried across systems.
3. Big Data must be immutable. You can add to Big Data, but you can never alter or delete the contained data.
4. Big Data must be accessible to the public if it is to have any scientific value. Unless ...
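
For readers who think better in code, principles 1 and 3 can be illustrated with a toy example. This is only a minimal sketch and not code from the book: the names AppendOnlyResource, Assertion, new_identifier, add, history, and latest are all hypothetical, and random UUIDs are just one assumed way of minting identifiers. Each identifier has metadata/value pairs attached to it; new pairs can be added, but nothing already recorded is altered or deleted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Tuple
import uuid


@dataclass(frozen=True)
class Assertion:
    """One metadata/value pair plus the time it was recorded (hypothetical type)."""
    metadata: str
    value: str
    recorded_at: str


class AppendOnlyResource:
    """Hypothetical toy resource: identifiers to which data is attached."""

    def __init__(self) -> None:
        # Each identifier maps to a tuple of assertions that only ever grows.
        self._records: Dict[str, Tuple[Assertion, ...]] = {}

    def new_identifier(self) -> str:
        # Assumption: a random UUID stands in for a "good identifier".
        identifier = str(uuid.uuid4())
        self._records[identifier] = ()
        return identifier

    def add(self, identifier: str, metadata: str, value: str) -> Assertion:
        if identifier not in self._records:
            raise KeyError(f"unknown identifier: {identifier}")
        stamp = datetime.now(timezone.utc).isoformat()
        assertion = Assertion(metadata, value, stamp)
        # Append by building a new tuple; earlier assertions stay untouched.
        self._records[identifier] = self._records[identifier] + (assertion,)
        return assertion

    def history(self, identifier: str, metadata: str) -> List[str]:
        # Every value ever recorded for this metadata tag, oldest first.
        return [a.value for a in self._records[identifier] if a.metadata == metadata]

    def latest(self, identifier: str, metadata: str) -> str:
        # A "change" is simply the most recent addition, not an overwrite.
        return self.history(identifier, metadata)[-1]


resource = AppendOnlyResource()
person = resource.new_identifier()
resource.add(person, "surname", "Smith")
resource.add(person, "surname", "Jones")  # a correction is added, not overwritten
print(resource.history(person, "surname"))  # ['Smith', 'Jones']
print(resource.latest(person, "surname"))   # Jones
```

The sketch deliberately has no update or delete method. A correction is just another timestamped assertion attached to the same identifier, so the earlier value remains available to anyone who needs to see what the resource said in the past.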
Source: Specified Life - Category: Pathologists Source Type: blogs