State legislatures pass over 120,000 bills each year, totalling over 3 million words [1]. The time from introduction to enactment to agency implementation averages 354 days [2, 3]. The real-world effect of these bills is long forgotten by the time it becomes measurable.
Informed law-making is uniquely important to achieving good health outcomes in the US. Previous studies have shown that state-level legislation can boost health care utilization rates [4] and even inspire individuals from neighboring states to live more healthily.
New York offers a compelling case study for understanding the effects of legislation on health outcomes. Mayor Michael Bloomberg’s "Health in All Policies" legislation during his 2002-2013 tenure tackled air pollution, diet, physical activity, and smoking, and included such infamous policies as the fountain drink size limit and the curb smoking prohibition. We propose the following research question:
How effective is New York state health legislation at producing better health outcomes, and how can it be better formulated to achieve those ends?
2.1 Datasets: Service Requests & NY Bills
For our study, we use the publicly available dataset 311_service_requests, a New York state compilation of 1 million service requests made to the 311 reporting hotline from 2010 to 2018, as a self-reported metric of health outcomes. These complaints are divided into several health categories, documented in Figure 1 (e.g. Air Quality, Hazardous Materials, Rodent).
To correlate these outcomes with legislative activity, we also use the publicly available dataset nys_bills, which covers New York State Legislature bills passed between 2011 and 2016. Relevant columns are displayed in Table 1.
| Create Date | Bill Title | Bill Subject | Health Area | … |
|---|---|---|---|---|
| 2012-05-02 | Prohibits the sale of sugary drinks… | Health | Food Establishment | … |
| 2012-10-16 | Concentration of fluoride in water… | Health | Water Quality | … |
| 2015-05-20 | Textbook Transparency Act | Education | N/A | … |
2.2 Data Preprocessing & Cleaning
For 311_service_requests, we first randomly subsample 30k entries to facilitate further analysis. We are mostly interested in the complaint type because its granularity is appropriate for mapping to bills. We consider the 13 most frequent complaint types, since each of the remaining types accounts for less than 0.5 percent of the dataset. The selected complaint types, which reflect health-related problems, are reported in Table 2.
|Indoor Air Quality||2.44|
To analyze the trend of health-related problems over time, we bin the data into months and count the service requests in each category. We then perform a correlation analysis between the categories to see whether similar categories should be collapsed into a single category. The results are shown in Figure 1. Since Food Establishment, Sanitation Condition, and Rodent are strongly correlated with one another, we group them together in the following analyses.
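The monthly binning and pairwise correlation described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the record layout (a date plus a category string), the helper names, and the toy records are all assumptions.

```python
from collections import Counter, defaultdict
from datetime import date
from math import sqrt


def monthly_counts(requests):
    """Count requests per category and 'YYYY-MM' month from (date, category) pairs."""
    counts = defaultdict(Counter)
    for created, category in requests:
        counts[category][created.strftime("%Y-%m")] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (undefined if one is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def category_correlation(counts, cat_a, cat_b):
    """Correlate two categories' monthly series over the union of their months."""
    months = sorted(set(counts[cat_a]) | set(counts[cat_b]))
    series_a = [counts[cat_a][m] for m in months]  # a missing month counts as 0
    series_b = [counts[cat_b][m] for m in months]
    return pearson(series_a, series_b)


# Toy records (hypothetical, not taken from 311_service_requests).
toy_requests = [
    (date(2012, 1, 5), "Rodent"), (date(2012, 1, 9), "Food Establishment"),
    (date(2012, 2, 3), "Rodent"), (date(2012, 2, 7), "Rodent"),
    (date(2012, 2, 11), "Food Establishment"), (date(2012, 2, 20), "Food Establishment"),
    (date(2012, 3, 1), "Rodent"), (date(2012, 3, 2), "Rodent"), (date(2012, 3, 4), "Rodent"),
    (date(2012, 3, 8), "Food Establishment"), (date(2012, 3, 9), "Food Establishment"),
    (date(2012, 3, 15), "Food Establishment"),
]
toy_counts = monthly_counts(toy_requests)
toy_r = category_correlation(toy_counts, "Rodent", "Food Establishment")
```

Categories whose monthly series yield a coefficient close to 1 would be candidates for merging; the threshold for "strong" correlation is a judgment call that the text does not specify.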
For the nys_bills dataset, since the bill subject is not informative enough, we parse the bill title to determine the health problem category each bill targets. For simplicity, we use a manually constructed keyword dictionary that favors precision over recall; more sophisticated techniques such as contextual word embeddings might provide a better trade-off. Finally, we also bin the data into months.
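A keyword-dictionary classifier of this kind might look like the sketch below. The keywords and the function name are hypothetical, since the actual dictionary is not listed; only the category names and the example titles come from Table 1.

```python
import re

# Hypothetical keyword dictionary; the authors' actual keywords are not given.
KEYWORDS = {
    "Food Establishment": ["sugary drink", "restaurant", "food service"],
    "Water Quality": ["fluoride", "drinking water"],
    "Hazardous Materials": ["asbestos", "hazardous"],
}


def classify_title(title, keywords=KEYWORDS):
    """Return the first health area whose keyword occurs in the title, else None."""
    text = title.lower()
    for area, terms in keywords.items():
        for term in terms:
            # Word boundaries avoid matches inside longer words;
            # the optional "s" tolerates simple plurals.
            if re.search(r"\b" + re.escape(term) + r"s?\b", text):
                return area
    return None


labels = [
    classify_title("Prohibits the sale of sugary drinks…"),
    classify_title("Concentration of fluoride in water…"),
    classify_title("Textbook Transparency Act"),
]
```

Requiring whole-word matches on a short, specific keyword list is one way to favor precision: a title with no listed keyword is left unlabeled rather than guessed.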
3.1 Change Point Analysis
As state legislatures propose and implement bills to address health-related problems, it is important to analyze the trend of reported health problems, especially the time points at which the trend shifts, since a shift might reflect a new bill taking effect. To that end, we apply change point analysis, a central research area in time series analysis with a wide range of applications. Detecting critical change points and segmenting time series data into regimes with different data-generating processes is helpful for reasoning and decision making.
Our main goal here is to detect abrupt change points in people’s health-related living conditions in the state of New York over the past 8 years, as reflected by the number of complaints in the aforementioned categories. We then test the correlation of these change points with the related legislative bills to find out whether the bills had a significant impact on improving people’s living conditions.
We applied the offline version of change point detection [5] to our processed data. The objective is to find the segmentation of a time series that minimizes a sum of piece-wise loss functions, which takes the form below:

$$V(\mathcal{T}, y) = \sum_{k=0}^{K} c\left(y_{t_k..t_{k+1}}\right)$$

where $V(\mathcal{T}, y)$ is the total loss, $c(\cdot)$ is a cost function which measures goodness-of-fit of the sub-series $y_{t_k..t_{k+1}}$, and $\mathcal{T} = \{t_1, \dots, t_K\}$ is the set of all dividing points, with $t_0 = 0$ and $t_{K+1} = T$, the length of the series. We choose the least absolute deviation as our cost, which is a robust estimator of a shift in the central point (mean, median, mode) of a distribution, and takes the following form:

$$c(y_I) = \sum_{t \in I} \lVert y_t - \bar{y} \rVert_1$$

where $\bar{y}$ is the component-wise median of $\{y_t\}_{t \in I}$. For the search algorithm, we used dynamic programming, which roughly computes the cost of all sub-sequences of a given time series.
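A dynamic programming search with a least absolute deviation cost can be sketched in pure Python as below. This is an illustrative sketch under stated assumptions, not the implementation used in the study: it handles a single one-dimensional series, it assumes the number of breakpoints is given in advance, and the function names and the `min_size` parameter are our own.

```python
from functools import lru_cache
from statistics import median


def l1_cost(segment):
    """Least absolute deviation of a 1-D segment around its median."""
    center = median(segment)
    return sum(abs(value - center) for value in segment)


def detect_change_points(signal, n_breakpoints, min_size=2):
    """Return the breakpoint indices that minimize the summed L1 segment costs."""
    values = tuple(signal)
    n = len(values)
    if n < (n_breakpoints + 1) * min_size:
        raise ValueError("signal too short for the requested segmentation")

    @lru_cache(maxsize=None)
    def cost(start, end):
        # Cost of the sub-series values[start:end], cached across calls.
        return l1_cost(values[start:end])

    @lru_cache(maxsize=None)
    def best(end, k):
        """Minimal total cost and breakpoints for values[:end] with k breakpoints."""
        if k == 0:
            return cost(0, end), ()
        candidates = []
        # t is the last breakpoint: values[t:end] is the final segment.
        for t in range(k * min_size, end - min_size + 1):
            prev_cost, prev_bkps = best(t, k - 1)
            candidates.append((prev_cost + cost(t, end), prev_bkps + (t,)))
        return min(candidates)

    return list(best(n, n_breakpoints)[1])


# Toy series with level shifts at indices 5 and 10 (hypothetical data).
toy_signal = [1, 1, 1, 1, 1, 9, 9, 9, 9, 9, 4, 4, 4, 4, 4]
toy_breakpoints = detect_change_points(toy_signal, n_breakpoints=2)
```

Because the median is insensitive to a few extreme months, this cost reacts to sustained shifts in level rather than to isolated spikes, which is the behavior wanted when looking for the onset of a policy effect.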
The results of our change point analysis over the number of complaints in the 13 categories are shown in Figure 2. We observed strong seasonal effects in all of the signals, which could hurt detection, so we removed the seasonal effects before running change detection.
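The deseasonalization method is not specified here. One simple possibility, shown purely as an assumption, is to subtract each calendar month's mean from a monthly series; the function name and the toy series are hypothetical.

```python
from collections import defaultdict


def remove_monthly_seasonality(values, start_month=1):
    """Subtract each calendar month's mean from a monthly series.

    values[i] is the count for the i-th month, starting at start_month (1-12).
    """
    by_month = defaultdict(list)
    for i, value in enumerate(values):
        by_month[(start_month - 1 + i) % 12].append(value)
    month_mean = {m: sum(v) / len(v) for m, v in by_month.items()}
    return [
        value - month_mean[(start_month - 1 + i) % 12]
        for i, value in enumerate(values)
    ]


# Two years of toy data: a repeating seasonal ramp plus a level shift of 10
# in the second year (hypothetical numbers).
toy_series = list(range(12)) + [v + 10 for v in range(12)]
toy_residual = remove_monthly_seasonality(toy_series)
```

In this toy case the seasonal ramp disappears and only the level shift between the two years remains, which is the kind of signal the change point search is meant to pick up.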
3.2 Correlation between bills and health problems
The overall trend of bills
Similarly, we plot the percentage of bills over time in Figure 3. We use percentages rather than absolute numbers because the number of passed bills remains relatively stable over time, and we think the percentage reflects the attention paid to different problems. We observe strong seasonality, as expected, and the curves mostly exhibit the same trends as the health problem curves. This is unsurprising: more bills might lead to improved health conditions or, conversely, more severe problems might get more attention from the state legislature.
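The percentage of bills per health area in each month could be computed as in the sketch below. The input layout (month string plus health area) and the toy rows are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def monthly_bill_share(bills):
    """Map 'YYYY-MM' to {health area: percentage of that month's bills}."""
    per_month = defaultdict(Counter)
    for month, area in bills:
        per_month[month][area] += 1
    shares = {}
    for month, counter in per_month.items():
        total = sum(counter.values())
        shares[month] = {area: 100.0 * n / total for area, n in counter.items()}
    return shares


# Toy rows (hypothetical, not taken from nys_bills).
toy_bills = [
    ("2012-05", "Food Establishment"),
    ("2012-05", "Water Quality"),
    ("2012-05", "Food Establishment"),
    ("2012-05", "N/A"),
]
toy_shares = monthly_bill_share(toy_bills)
```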
Do bills fix health problems?
To examine whether bills of a certain category fixed their target health problems, we perform a chi-squared test to see whether there is a significant difference in the number of bills per year (see Figure 4). Correlating the bills with the change points, we find a significant correlation between the number of bills and the change points for Hazardous Materials (p < 0.05).
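The exact form of the chi-squared test is not spelled out. One plausible reading, shown here as an assumption, is a goodness-of-fit test of the yearly bill counts against a uniform distribution. The sketch below uses only the standard library and computes the p-value from a series expansion of the regularized incomplete gamma function; the yearly counts are hypothetical.

```python
from math import exp, lgamma, log


def chi2_sf(x, df):
    """Survival function P(X > x) of a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, x / 2).
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = total * exp(-half_x + a * log(half_x) - lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_squared_uniform_test(counts):
    """Test whether yearly counts depart from a uniform distribution."""
    expected = sum(counts) / len(counts)
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return statistic, chi2_sf(statistic, len(counts) - 1)


# Hypothetical yearly bill counts for one health area, 2011 to 2016.
toy_yearly_bills = [10, 10, 10, 10, 10, 40]
toy_statistic, toy_p_value = chi_squared_uniform_test(toy_yearly_bills)
```

A p-value below 0.05 would indicate that the number of bills in that health area is not evenly spread across years, which is the precondition for relating a burst of legislation to a detected change point.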
4 Conclusions & Future Work
Our study indicates that the Hazardous Materials legislation has certain valuable characteristics that led to a downturn in 311 service requests and made it capable of producing actual health outcomes, and that the lack of those same characteristics caused the negative legislation to fail to produce better health outcomes.
The statistical analysis that surfaces these empirically verifiable distinctions is an essential step toward producing future legislation that learns from past experience. It remains for future work, likely in-depth public policy analysis focused on the specific negative legislation, to determine the exact socioeconomic, demographic, political, or implementation factors that resulted in policy failure.
[1] King, Kevin. “State Legislatures Vs. Congress: Which Is More Productive?” Quorum, 2018, www.quorum.us/data-driven-insights/state-legislatures-versus-congress-which-is-more-productive/176/.
[2] Moore, Carter. “How Long Does It Take to Pass a Bill in the US?” Quora, 2015, www.quora.com/How-long-does-it-take-to-pass-a-bill-in-the-US.
[3] “How Long Does It Take To Pass And Enact Bills? | PMG.” Parliamentary Monitoring Group, 2015, pmg.org.za/page/How%20long.
[4] Gibson, T. B. “Analyzing the Effect of State Legislation on Health Care Utilization for Children with Concussion.” JAMA Pediatrics, 2015.
[5] Coppin, P., Jonckheere, I., Nackaerts, K., Muys, B., and Lambin, E. “Review Article: Digital Change Detection Methods in Ecosystem Monitoring: A Review.” International Journal of Remote Sensing, 25(9), 2004, pp. 1565-1596.