Passive and Active Observation: Experimental Design Issues in Big Data
Data in scientific studies can be collected through a controlled experiment (active observation) or through passive observation. Big data is often collected passively, e.g. from social media. Understanding the difference between active and passive observation is critical to the analysis. For example, in studies of causation, great efforts are made to guard against hidden confounders and feedback, which can destroy the identification of causation by corrupting or omitting counterfactuals (controls). Various solutions to these problems are discussed, including randomization.
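As a rough illustration of why randomization guards against hidden confounders, the following minimal sketch simulates the same treatment under passive and randomized assignment. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `simulate`, the binary hidden confounder `u`, the assignment probabilities (0.8 and 0.2), the confounder's effect on the outcome (2.0), and the true treatment effect (1.0).

```python
import random


def simulate(n, randomized, true_effect=1.0, seed=0):
    """Return the difference in mean outcome between treated and control units.

    Hypothetical setup: a hidden binary confounder u raises the outcome by 2.0
    and, under passive observation, also makes treatment more likely.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.random() < 0.5  # hidden confounder, never seen by the analyst
        if randomized:
            p_treat = 0.5  # active: a coin flip, independent of u
        else:
            p_treat = 0.8 if u else 0.2  # passive: u drives self-selection
        t = rng.random() < p_treat
        y = true_effect * t + 2.0 * u + rng.gauss(0.0, 1.0)
        (treated if t else control).append(y)
    return sum(treated) / len(treated) - sum(control) / len(control)


passive_estimate = simulate(20000, randomized=False)
randomized_estimate = simulate(20000, randomized=True)
print(f"passive estimate:    {passive_estimate:.2f}")
print(f"randomized estimate: {randomized_estimate:.2f}")
```

Under these assumed numbers, the passive difference in means converges to 1.0 + 2.0 × (0.8 − 0.2) = 2.2 rather than the true effect of 1.0, because treated units are disproportionately those with `u` present. The randomized difference converges to 1.0, since the coin flip makes the control group a valid counterfactual for the treated group.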