Progress against cancer? Let's think about it.
It is difficult to pick up a newspaper these days without reading an article proclaiming progress in the field of cancer research. Here is an example, taken from an article posted on the MedicineNet site (1). The lead-off text is: "Statistics (released in 1997) show that cancer patients are living longer and even 'beating' the disease. Information released at an AMA sponsored conference for science writers, showed that the death rate from the dreaded disease has decreased by three percent in the last few years. In the 1940s only one patient in four survived on the average. By the 1960s, that figure was up to one in th...
Source: Specified Life - March 25, 2016 Category: Information Technology Tags: cancer cancer cure cancer statistics cancer treatments orphan diseases progress in cancer research rare diseases Source Type: blogs

Scientific Misconduct at Prestigious Research Centers
On January 23, 2009, the Office of Research Integrity made public its findings of scientific misconduct concerning a doctor who fabricated data for several grant projects funded by the NIH (1). The doctor was a former graduate student in the Department of Pathology, Harvard Medical School; a former research fellow and Instructor of Pathology at Brigham and Women's Hospital in Boston; a former postdoctoral fellow in the Department of Biology at the California Institute of Technology; and a former Associate Professor in the Department of Biology and the Center for Cancer Research at the Massachusetts Institute of Tech...
Source: Specified Life - March 24, 2016 Category: Information Technology Tags: ethics fraud Karolinska Institute ORI scientific misconduct Source Type: blogs

The Importance of Biological Taxonomy
Biological taxonomy is the scientific field dealing with the classification of living organisms. Non-biologists who give any thought to taxonomy may think that the field is the dullest of the sciences. To the uninitiated, there is little difference between the life of a taxonomist and the life of a stamp collector. Nothing could be further from the truth. Taxonomy has become the grand unifying theory of the biological sciences. Efforts to sequence the genomes of prokaryotic, eukaryotic and viral species, thereby comparing the genomes of different classes of organisms, have revitalized the field of evolutionary taxonom...
Source: Specified Life - March 22, 2016 Category: Information Technology Tags: classification data organization evolution taxonomy Source Type: blogs

DATA SIMPLIFICATION: Published At Last!
Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. On March 17, my book Data Simplification: Taming Information with Open Source Tools was published by Morgan Kaufmann, an imprint of Elsevier. [The Elsevier site indicates that the book is still on preorder, but you can ignore that.] This past month, I've posted on topics relevant to data simplification. Beginning tomorrow, I'll be moving on to new subjects for this blog site, but I wanted to make one additional comment for anyone who might be on the fence about buying this book. Most large data projects are total failures (1-21). Furthermor...
Source: Specified Life - March 21, 2016 Category: Information Technology Tags: computer science data analysis data repurposing data science data simplification information science simplifying data taming data Source Type: blogs

DATA SIMPLIFICATION: Persistent Data
This is the last of my blogs related to topics selected from Data Simplification: Taming Information With Open Source Tools (released March 2016). I hope that as you page back through my posts on Data Simplification topics, appearing throughout this month's blog, you'll find that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. "A file that big? It might be very useful. But now it is gone." -Haiku by David J. Liszewski. Your scripts create data objects, and the data objects hold data. Sometimes, these data objects are transient, existing only during a block or ...
Source: Specified Life - March 19, 2016 Category: Information Technology Tags: computer science data analysis data science data simplification databases persistence simplifying data Source Type: blogs
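
The excerpt above is cut off before it shows how a data object can outlive the script that created it. As a rough sketch of the general idea only (not the author's own example), Python's standard `shelve` module can write an object to disk and read it back later. The function names `save_object` and `load_object` and the store path are hypothetical, chosen for illustration.

```python
import shelve


def save_object(store_path: str, key: str, obj) -> None:
    """Write obj to an on-disk shelf under key, so it survives after the script exits."""
    with shelve.open(store_path) as db:
        db[key] = obj


def load_object(store_path: str, key: str):
    """Read a previously saved object back from the shelf."""
    with shelve.open(store_path) as db:
        return db[key]


if __name__ == "__main__":
    # Hypothetical store name; a later run of a different script could read it back.
    save_object("my_store", "word_counts", {"data": 3, "simplification": 2})
    print(load_object("my_store", "word_counts"))
```

A shelf behaves like a dictionary whose contents live in a file, so a transient object becomes persistent simply by assigning it to a key.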

DATA SIMPLIFICATION: The Many Uses of Random Number Generators
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 17, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. If you are among the many students and professionals who are intimidated by statistics, then fear no more! With a little imagination, random number generators (to be accurate, pseudorandom number generators) can substitute for a wide range of statistical methods. As it happens, modern computers can perform tw...
Source: Specified Life - March 15, 2016 Category: Information Technology Tags: computer science data analysis data repurposing data simplification Monte Carlo probability pseudorandom resampling simplifying data simulations Source Type: blogs
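
To illustrate how a pseudorandom number generator can stand in for a probability calculation, here is a minimal sketch that estimates the chance of rolling a total of 7 with two dice by brute-force simulation. The example problem, the function name `estimate_probability_of_sum`, the trial count, and the seed are assumptions chosen for illustration; they are not taken from the post.

```python
import random


def estimate_probability_of_sum(target: int, trials: int = 100_000, seed: int = 1) -> float:
    """Estimate P(two dice sum to target) by simulating many rolls."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    hits = 0
    for _ in range(trials):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == target:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    estimate = estimate_probability_of_sum(7)
    print(f"simulated: {estimate:.4f}   exact: {1 / 6:.4f}")
```

The simulated value should land close to the exact answer of 1/6, and the same pattern (simulate, count, divide) carries over to problems where the exact answer is hard to derive.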

DATA SIMPLIFICATION: Abbreviations and Acronyms
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 17, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. "A synonym is a word you use when you can't spell the other one." -Baltasar Gracian. People confuse shortening with simplifying; a terrible mistake. In point of fact, next to reifying pronouns, abbreviations are the most vexing cause of complex and meaningless language. Before we tackle the complexities of abbre...
Source: Specified Life - March 14, 2016 Category: Information Technology Tags: abbreviations acronyms complexity computer science data analysis data repurposing data simplification simplifying data Source Type: blogs

DATA SIMPLIFICATION: Doublet Lists
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 17, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. Yesterday's blog covered lists of single words. Today we'll do doublets. Doublet lists (lists of two-word terms that occur in common usage or in a body of text) are a highly underutilized resource. The special value of doublets is that single word terms tend to have multiple meanings, while doublets tend to h...
Source: Specified Life - March 13, 2016 Category: Information Technology Tags: complexity computer science data analysis data repurposing data simplification doublet lists n-grams open source tools word lists Source Type: blogs
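
As a hedged sketch of what building a doublet list might look like, the function below collects every pair of adjacent words in a text and counts how often each pair occurs. The function name `doublets`, the letters-only definition of a word, and the choice not to pair words across sentence punctuation are all assumptions made for illustration; the post's own method is not visible in the excerpt.

```python
import re
from collections import Counter


def doublets(text: str) -> Counter:
    """Count two-word terms (adjacent word pairs) occurring in text."""
    counts = Counter()
    # Assumption: pairs should not straddle sentence or clause boundaries.
    for segment in re.split(r"[.!?;:]+", text.lower()):
        words = re.findall(r"[a-z]+", segment)
        counts.update(f"{first} {second}" for first, second in zip(words, words[1:]))
    return counts


if __name__ == "__main__":
    sample = "Data science is fun. Data science works."
    for pair, n in doublets(sample).most_common():
        print(n, pair)
```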

DATA SIMPLIFICATION: Building Word Lists
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 23, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. Word lists, for just about any written language for which there is an electronic literature, are easy to create. Here is a short Python script, words.py, that prompts the user to enter a line of text. The script drops the line to lowercase, removes the carriage return at the end of the line, parses the result i...
Source: Specified Life - March 12, 2016 Category: Information Technology Tags: complexity computer science data analysis data repurposing data simplification data wrangling information science simplifying data taming data word lists Source Type: blogs
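
The excerpt is truncated before words.py itself appears, so the following is not the author's script. It is a minimal sketch of the steps the excerpt does describe (lowercase the line, strip the line ending, parse the result), with the final step assumed to mean splitting into words. The function names and the definition of a word are hypothetical.

```python
import re


def line_to_words(line: str) -> list[str]:
    """Lowercase a line, drop its trailing line ending, and split it into words."""
    cleaned = line.lower().rstrip("\r\n")
    # Assumption: a word is a run of letters, optionally with internal apostrophes.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", cleaned)


def word_list(lines) -> list[str]:
    """Return the sorted list of distinct words found in an iterable of lines."""
    seen = set()
    for line in lines:
        seen.update(line_to_words(line))
    return sorted(seen)


if __name__ == "__main__":
    text = input("Enter a line of text: ")
    print(line_to_words(text))
```

Passing an open text file to `word_list` would produce a word list for the whole file, since a file object iterates line by line.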

DATA SIMPLIFICATION: ImageMagick
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 23, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. In yesterday's blog, I discussed using system calls within your scripts. One of my examples called ImageMagick. Today, I thought I'd describe ImageMagick, and some of its benefits. ImageMagick is an open source utility that supports a huge selection of robust and sophisticated image editing methods. Its so...
Source: Specified Life - March 11, 2016 Category: Information Technology Tags: complexity computer science image magick information science simplification simplifying data system calls taming data Source Type: blogs

DATA SIMPLIFICATION: System Calls
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 23, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. A system call is a command line, inserted into a software program, that interrupts the script while the operating system executes the command line. Immediately afterwards, the script resumes, at the next line. Any utility that runs from the command line can be embedded in any scripting language that supports sy...
Source: Specified Life - March 10, 2016 Category: Information Technology Tags: complexity computer science data analysis data repurposing data simplification data wrangling information science perl python Ruby simplifying data system calls taming data Source Type: blogs
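
As one way to picture a system call from a script, the sketch below uses Python's standard `subprocess.run` to hand a command line to the operating system, wait for it to finish, and then carry on with the next line. The helper name `run_command` is hypothetical, and the command shown (the Python interpreter itself) is chosen only because it is available wherever the script runs; any command-line utility could be substituted.

```python
import subprocess
import sys


def run_command(args: list[str]) -> str:
    """Run a command line, wait for it to finish, and return what it printed."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


if __name__ == "__main__":
    # The script pauses here while the operating system executes the command...
    output = run_command([sys.executable, "-c", "print(6 * 7)"])
    # ...and resumes at the next line once the command has completed.
    print("command printed:", output.strip())
```

Passing the command as a list of arguments, instead of a single shell string, avoids quoting problems when file names contain spaces.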

DATA SIMPLIFICATION: Specifications to the Rescue!
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 23, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. Today's blog continues yesterday's discussion of Standards and Specifications. Despite the problems inherent in standards, government committees cling to standards as the best way to share data. The perception is that in the absence of standards, the necessary activities of data sharing, data verification, data a...
Source: Specified Life - March 9, 2016 Category: Information Technology Tags: complexity computer science data analysis data repurposing data simplification data wrangling information science simplifying data specifications standards taming data Source Type: blogs

DATA SIMPLIFICATION: Substandard Standards
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 23, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. "The nice thing about standards is that you have so many to choose from." -Andrew S. Tanenbaum Data standards are the false gods of informatics. They promise miracles, but they can't deliver. The biggest drawback of standards is that they change all the time. If you take the time to read some of the computer li...
Source: Specified Life - March 8, 2016 Category: Information Technology Tags: complexity computer science data analysis data repurposing data simplification data wrangling information science simplifying data specifications standards taming data Source Type: blogs

DATA SIMPLIFICATION: Poor Identifiers, Horrific Consequences
Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 23, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code: COMP315 for a 30% discount, at checkout. All information systems, all databases, and all good collections of data are best envisioned as identifier systems to which data (belonging to the identifier) can be added over time. If the system is corrupted (e.g., multiple identifiers for the same object, data belonging to one object incorrectly attached t...
Source: Specified Life - March 7, 2016 Category: Information Technology Tags: complexity computer science data analysis data repurposing data simplification data wrangling identifiers information science simplifying data taming data Source Type: blogs