Evaluating the Effectiveness of Health Awareness Events by Google Search Frequency

10/09/2018 ∙ by Zheng Hao, et al. ∙ South Dakota State University

More than two hundred health awareness events take place in the United States to raise attention and educate the public about diseases. It would be informative and instructive for the organizers to know the impact of these events, although such information can be difficult to measure. Here 46 events are selected and their data from 2004 to 2017 are downloaded from Google Trends (GT). We investigate whether the events effectively attract public attention by increasing the search frequencies of certain keywords, which we call queries. Three statistical methods, Transfer Function Noise modeling, the Wilcoxon Rank Sum test, and Binomial inference, are applied to the 46 GT data sets. Our study shows that 10 health awareness events are effective, with evidence of a significant increase in search frequencies in the event months, and 28 events are ineffective, with the rest being classified as unclear.


1 Introduction

1.1 Background

Chronic diseases (such as diabetes, cancer, and heart disease) are a leading cause of death in the United States every year, even though many of those diseases are preventable [1]. The goal of holding health awareness events is to raise attention and educate the public about diseases. Take National Breast Cancer Awareness Month as an example: the National Breast Cancer Foundation devotes efforts to educating women on early detection to reduce the risk of breast cancer, helping those diagnosed with breast cancer, and raising funds to support research. Companies also join National Breast Cancer Awareness Month, such as Estée Lauder Companies Inc., which releases exclusive Pink Ribbon products to help improve awareness of breast cancer and raise funds for medical research [2].

It is estimated that the vast majority of the information flowing through two-way telecommunication was carried by the Internet by 2007 [3]. The number of Internet users has increased enormously and surpassed 3 billion, a large share of the world population, in 2014 [4]. Google has led the U.S. core search market for the past decade [5], and millions of people worldwide use it to search for health topics every day [6][7]. In particular, it held about three quarters of the search engine market in 2017.

We want to determine whether health awareness events are effective in raising public awareness of the corresponding health topics, as reflected in higher Google search frequencies. The results could benefit a variety of parties; for instance, the Department of Public Health and public interest groups could rearrange the allocation of resources among events.

1.2 Related Work

Using Internet statistics to explain and predict quantities has been popular among researchers. Bollen et al. [8] classified tweets into different moods to quantify the daily public mood and used it to predict the stock market with different models. The idea was based on the fact that people intentionally or unintentionally disclose their thinking online, including on social media such as Twitter, which might be a factor in stock price variation. What was interesting was that the authors used tweets, which were not traditionally considered an economic factor, unlike classical factors such as interest rates, GDP, and unemployment rates.

Ginsberg et al. [9], Doornik [10], and Carneiro et al. [11] showed that Google Trends data could indicate current influenza-like activity levels 1-2 weeks earlier than the conventional Centers for Disease Control and Prevention surveillance systems, by comparing GT data with actual disease counts, and provided different case studies. The search frequency would increase dramatically before and during a disease outbreak. Similarly, Cook et al. [12] examined H1N1 disease cases. The increasing search frequency could be useful in identifying the presence of diseases and the media effect on web users’ search behaviors [13].

GT data have also proven effective for modeling in other areas such as marketing and information security. Youn et al. [14] used GT data and Autoregressive Integrated Moving Average (ARIMA) models to nowcast the TV market for a few brands and were able to reveal the correlation; accurate predictions for the near future of the market were obtained. Rech [15] used GT data to analyze the attention that products received and the cause-effect relations among a few factors in software engineering. Kuo et al. [16] demonstrated that the lifecycles of internet security systems had the same pattern, consisting of four stages (zero day, publicity, cooldown, and silence) with different scales. The authors discovered that GT data showed an interesting correlation with the lifecycle and claimed the reason was that when a vulnerability attracted a lot of attention, the risk became large and the lifecycle turned to the decay stages. Choi et al. [17] conducted time series analysis on GT data to forecast some economic indicators and showed that some GT data were very well fitted by ARIMA models. In their case, they focused on the intrinsic structure of the data sets without incorporating any explanatory variables or time series. Mondal et al. [18] used a transfer function noise model to study the effect of monthly rainfall on the Ganges River flow, with both data sets being time series. In our case, we use an impulse series as the explanatory variable.

Seifter et al. [19] showed that GT data were highly related to public attention on diseases, according to a study of Lyme disease. Jacobsen and Jacobsen [20] analyzed the number of articles published and the number of early detections of disease in the event month for breast cancer and concluded that the event did promote public attention. The study quantitatively indicated that a successful event actually educated the public and encouraged early detection. Here we want to identify the effective ones from a pool of events. In [21], Ayers et al. studied the Great American Smokeout health awareness event using a number of data sets, such as the numbers of news stories, tweets, and Wikipedia visits. Their proposed evaluation method for event effectiveness was to first fit counterfactual data by assuming the event had not occurred, then compare them with the actual data. Although their approach was quantitative, they used the percent change, for which it is unclear how to determine the threshold of significance.

2 Datasets and Preprocessing

2.1 Datasets

We focus on monthly health awareness events in the US and select a set of 46 disease-related events. Since GT data are based on the search frequency of one or a few words, which we call a query, we select a query for each event and present them in Appendix A. For some events there was more than one meaningful query, in which case we picked the one with the highest frequency.

On the Google Trends webpage, users can track the search popularity of queries in different languages and across regions, starting from January 2004. Weekly or monthly GT data may be downloaded as a CSV file, depending on the total time range. Since the raw values of queries can be huge numbers, Google rescales them to a range from 0 to 100, with the highest frequency being 100. Four options, Region, Time, Category, and Search Type, are needed to specify a search; in this work they are set to United States, 2004-2017, Health, and Web Search, respectively.
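The 0 to 100 rescaling described above can be illustrated with a minimal sketch. This only mirrors the description given here (the largest value maps to 100); Google's actual normalization is not specified in this paper, and the function name `rescale_to_100` and the raw counts are hypothetical.

```python
def rescale_to_100(raw_counts):
    """Scale a series so that its largest value maps to 100 (integers)."""
    peak = max(raw_counts)
    if peak <= 0:
        raise ValueError("series needs at least one positive value")
    return [round(100 * value / peak) for value in raw_counts]


# Hypothetical raw monthly counts, invented for illustration only.
raw = [1200, 3000, 1500, 600]
scaled = rescale_to_100(raw)  # [40, 100, 50, 20]
```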

For example, Figure 1 shows the query Breast Cancer as a time series plot. Google Trends also provides a graph of popularity by region, as shown in Figure 2. The top three subregions by search popularity were Pennsylvania, Maryland, and Alabama.

Figure 1: Google Trends Search Plot for the Query of Breast Cancer
Figure 2: Google Trends Search by Region for the Query of Breast Cancer

2.2 Data Preprocessing

Monthly data from 2004 to 2017 for the 46 selected queries are collected. All data points are integers between 0 and 100, with no missing data. We rescale every month to an equal length of 30 days to reduce the variation caused by the uneven number of days. In particular, January, March, May, July, August, October, and December data points are multiplied by $30/31$, and February data points are multiplied by $30/28$.
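A minimal sketch of this month-length adjustment follows. The function name `rescale_to_30_days` is hypothetical, and the optional `year` argument for handling 29-day Februaries is an assumption of this sketch, not something the text specifies.

```python
import calendar

# Days per month in a non-leap year (index 0 = January).
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def rescale_to_30_days(value, month, year=None):
    """Rescale a monthly value to an equivalent 30-day month.

    month is 1-12. If year is given, the true month length for that year
    is used (assumption: 29 days for February in leap years); otherwise
    February is treated as having 28 days.
    """
    if year is not None:
        days = calendar.monthrange(year, month)[1]
    else:
        days = DAYS_IN_MONTH[month - 1]
    return value * 30 / days


january = rescale_to_30_days(62, month=1)   # multiplied by 30/31
february = rescale_to_30_days(56, month=2)  # multiplied by 30/28
april = rescale_to_30_days(45, month=4)     # 30-day month, unchanged
```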

3 Methodology

We provide three different quantitative methods to evaluate the effectiveness with their thresholds clearly stated. The main method is to use transfer function noise modelling with impulse series as input. Then inferences based on Wilcoxon Rank Sum test and Binomial distribution are used to consolidate results.

3.1 Transfer Function Noise Model

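The (Seasonal) Autoregressive Integrated Moving Average models (ARIMA or SARIMA) support interpretation and forecasting by modeling the intrinsic pattern of a single response time series $Y_t$. A general SARIMA model has the form:

$$(1 - \phi_1 B - \cdots - \phi_p B^p)(1 - \Phi_1 B^s - \cdots - \Phi_P B^{Ps})(1 - B)^d (1 - B^s)^D Y_t = (1 - \theta_1 B - \cdots - \theta_q B^q)(1 - \Theta_1 B^s - \cdots - \Theta_Q B^{Qs}) a_t$$

where $B$ is the backshift operator, $B Y_t = Y_{t-1}$, $a_t$ is a white noise, and $\phi_i$, $\Phi_j$, $\theta_k$, and $\Theta_l$ are constant coefficients. This model can be expressed by a more compact notation as:

$$\phi(B)\,\Phi(B^s)\,(1 - B)^d (1 - B^s)^D Y_t = \theta(B)\,\Theta(B^s)\, a_t$$

If there is another series, say $X_t$, which we call an input series, that has a relationship with $Y_t$, the Transfer Function Noise Model is built to describe this situation as

$$Y_t = \nu(B) X_t + N_t \qquad (3.1)$$

Intuitively, $\nu(B)$ is determined by the structure of the input and measures the effect of $X_t$ on $Y_t$, and $N_t$ measures the intrinsic pattern of $Y_t$ itself.

We construct an impulse time series $X_t$ with $X_t = 0$ if $t$ corresponds to a non-event month, and $X_t = 1$ if $t$ corresponds to an event month. We want to analyze the effect of $X_t$ on $Y_t$. With such an input, (3.1) is called the Intervention model, whose operator $\nu(B)$ usually has a fairly simple form. We let $\nu(B) = \omega_0$, since we are interested in how much the impulse contributes to the current response, which results in:

$$Y_t = \omega_0 X_t + N_t, \qquad \phi(B)\,\Phi(B^s)\,(1 - B)^d (1 - B^s)^D N_t = \theta(B)\,\Theta(B^s)\, a_t \qquad (3.2)$$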

We first determine whether there is seasonality in each data set, that is, whether an ARIMA model or a SARIMA model should be used, and then fit the best ARIMA/SARIMA model.

Second, we fit a transfer function noise model. The input series is just an impulse function, so there is no prewhitening step. To determine the autoregressive and moving average orders of the noise term in (3.2), we make two attempts and choose the better one:

The first attempt simply uses the same orders as the ARIMA/SARIMA model. In the second attempt, we first replace the event-month data with the average of the previous and next months. The idea is that, after this replacement, the new data are our best guess for what the data would be if no event had happened. We use the new data to determine the orders of the ARIMA/SARIMA model and use them in (3.2). The better attempt is chosen as the final transfer function noise model.

We conclude that the event contributes to the number of searches if the transfer function noise model fits better than the ARIMA/SARIMA model and the coefficient of the impulse input is significant at the 0.05 level.
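The impulse input and the event-month replacement used in the second attempt can be sketched as follows. This is an illustration only: the function names are hypothetical, the series is assumed to have at least two points, and the handling of an event month at the very start or end of the series (using the single available neighbour) is an assumption not covered by the text.

```python
def impulse_series(n_months, event_month, start_month=1):
    """Return 0/1 indicators: 1 where the calendar month is the event month.

    start_month is the calendar month (1-12) of the first observation.
    """
    return [
        1 if (start_month - 1 + i) % 12 + 1 == event_month else 0
        for i in range(n_months)
    ]


def replace_event_months(values, impulse):
    """Replace each event-month value by the mean of its neighbours.

    Assumption: at either end of the series, where only one neighbour
    exists, that single neighbour is used.
    """
    replaced = list(values)
    for i, flag in enumerate(impulse):
        if flag:
            neighbours = [
                values[j] for j in (i - 1, i + 1) if 0 <= j < len(values)
            ]
            replaced[i] = sum(neighbours) / len(neighbours)
    return replaced


# Two years of invented monthly data with spikes in October (month 10).
series = [10] * 24
series[9], series[21] = 50, 40
october = impulse_series(len(series), event_month=10)
counterfactual = replace_event_months(series, october)
```

The `october` list plays the role of the input series, and `counterfactual` is the series that would be used only to choose the model orders; the transfer function noise model itself is still fitted to the original data.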

3.2 Wilcoxon Rank Sum Test

The Wilcoxon Rank Sum test was introduced by Frank Wilcoxon in his well-known article [22] to compare two groups of observations. Blair and Higgins [23] showed that the Wilcoxon test usually holds large power advantages over the t-test and is asymptotically more efficient than the t-test. In our case, the sample sizes are unequal and the sample distributions are unclear, so we believe the Wilcoxon Rank Sum test is more appropriate than the t-test.

Data points are split into two groups: event month and non-event month. The question then becomes whether the event-month group has larger values. The null hypothesis is that the two groups of observations come from the same population. The Wilcoxon test is based on ranking the data points of the combined sample. Numeric ranks are assigned to all observations, with 1 for the smallest value. If several observations tie, each is assigned the average of the ranks they span. The test statistics are based on the sum of the ranks for the observations from each sample and are calculated as:

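$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 \qquad (3.3)$$

$$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2 \qquad (3.4)$$

where $n_1$ and $n_2$ are the two sample sizes, and $R_1$ and $R_2$ are the sums of the ranks in samples 1 and 2, respectively. The smaller value between $U_1$ and $U_2$ is the one used to consult significance tables to estimate the p-value.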
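As an illustration of the ranking procedure, the sketch below computes the rank sums and the two statistics in plain Python, using the common Mann-Whitney form $U_i = n_1 n_2 + n_i(n_i + 1)/2 - R_i$, which is an assumption of this sketch. The function names and the sample values are hypothetical, and no p-value lookup is attempted.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_sum_statistics(sample_1, sample_2):
    """Return (R1, R2, U1, U2) for two samples ranked together."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2


# Invented values: event months clearly higher than the other months.
event = [80, 90, 85]
other = [40, 50, 45, 60, 55]
r1, r2, u1, u2 = rank_sum_statistics(event, other)
smaller_u = min(u1, u2)
```

In this invented example every event-month value outranks every other value, so the smaller statistic is 0, the most extreme value possible for these sample sizes.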

3.3 Inference by Binomial Distribution

We use the null hypothesis that the search frequencies are completely random, implying that the event has no effect. Under the null hypothesis, every month has equal probability $p = 1/12$ of being the yearly peak, since none of the selected diseases is seasonal in the way an influenza-like illness is. Let $X$ be the number of yearly peaks that fall in the event month over the 14 years. The probability that a given month is the peak exactly $k$ times in 14 years is

$$P(X = k) = \binom{14}{k} \left(\frac{1}{12}\right)^{k} \left(\frac{11}{12}\right)^{14-k}.$$

In particular, $k = 4$ is the smallest value making this probability less than 0.05, with $P(X = 4) \approx 0.020$. Therefore, an event month that is the yearly peak at least 4 times provides evidence that the event-month data are significantly different from the other months.
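The threshold of 4 can be checked numerically. The sketch below assumes $X \sim \mathrm{Binomial}(14, 1/12)$ under the null hypothesis and looks for the smallest $k$ whose upper-tail probability $P(X \ge k)$ falls below 0.05; using the tail rather than the point probability is a choice made here, and both give the same threshold. The peak-counting helper, including its treatment of ties as peaks, is likewise a hypothetical illustration.

```python
from math import comb


def peak_probability(k, years=14, p=1 / 12):
    """P(X = k): the event month is the yearly peak exactly k times."""
    return comb(years, k) * p**k * (1 - p) ** (years - k)


def tail_probability(k, years=14, p=1 / 12):
    """P(X >= k)."""
    return sum(peak_probability(j, years, p) for j in range(k, years + 1))


def peak_threshold(alpha=0.05, years=14, p=1 / 12):
    """Smallest k with P(X >= k) < alpha, or None if there is none."""
    for k in range(years + 1):
        if tail_probability(k, years, p) < alpha:
            return k
    return None


def count_event_month_peaks(values_by_year, event_month):
    """Count years whose maximum falls in the event month (1-12).

    values_by_year holds one 12-value list (January..December) per year.
    Assumption: a tie for the maximum still counts as a peak.
    """
    return sum(
        1 for year in values_by_year if year[event_month - 1] == max(year)
    )


threshold = peak_threshold()
```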

4 Results

Health awareness events that show evidence of significance in all three methods described above are defined as effective health awareness events. Health awareness events that have insignificant results for all three tests are defined as ineffective health awareness events. Events with inconsistent results across the methods are defined as unclear.
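A literal reading of this rule can be written as a small function. This is only a sketch: the function name and argument names are hypothetical, the 0.05 level and the threshold of 4 peaks are taken from the methodology section, and the transfer function result is treated as significant only when the model fits better and its input coefficient is significant. The example inputs are the values reported in Appendix B for three queries.

```python
def classify_event(wilcoxon_p, peaks, tf_fits_better, tf_coef_p,
                   alpha=0.05, peak_threshold=4):
    """Combine the three methods into Effective / Ineffective / Unclear."""
    significant = [
        wilcoxon_p < alpha,                    # Wilcoxon Rank Sum test
        peaks >= peak_threshold,               # Binomial inference
        tf_fits_better and tf_coef_p < alpha,  # transfer function noise model
    ]
    if all(significant):
        return "Effective"
    if not any(significant):
        return "Ineffective"
    return "Unclear"


# Values as reported in Appendix B (p-values listed as 0 are entered as 0.0).
breast_cancer = classify_event(0.0, 14, True, 0.0)
stroke = classify_event(0.2918, 1, False, 0.2082)
diabetes = classify_event(0.0297, 1, False, 0.0813)
```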

Details for two selected events are presented in this section as case studies. All 46 selected query data sets have been analyzed, and the results are presented in Appendix B.

4.1 Case 1: National Breast Cancer Awareness Month

One out of eight women in the USA is diagnosed with breast cancer [24]; breast cancer is the top cause of cancer death for women 40 to 50 years of age [25] and the second leading cause of cancer death for women in the USA [26]. The National Breast Cancer Awareness Event is dedicated to drawing public attention to prevention and early detection, supporting patients, and fundraising for scientific research.

The time series plot in Figure 3 shows a slightly declining trend, with peaks in the event month, October. Three different tests, periodogram, autocorrelation function, and linear model comparison, are conducted to check for seasonality. For the breast cancer data, two of the three tests indicate that there is no seasonality; therefore we choose an ARIMA model instead of SARIMA and obtain the best ARIMA model and transfer function noise model as described in Section 3.1.
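One of the three checks, the autocorrelation function, can be sketched as below. This is not the authors' procedure, whose details are not given here: the function names are hypothetical, and the rule of comparing the lag-12 sample autocorrelation with the rough bound $2/\sqrt{n}$ is an assumption chosen for illustration.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denominator = sum((v - mean) ** 2 for v in values)
    numerator = sum(
        (values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator


def looks_seasonal(values, period=12):
    """Flag seasonality when the lag-`period` autocorrelation is large.

    Assumption: 2 / sqrt(n) is used as an approximate significance bound.
    """
    bound = 2 / math.sqrt(len(values))
    return autocorrelation(values, period) > bound


# Synthetic 14-year monthly series, invented for illustration.
yearly_cycle = [10 + 5 * math.sin(2 * math.pi * t / 12) for t in range(168)]
seven_month_cycle = [math.sin(2 * math.pi * t / 7) for t in range(168)]
```

Here `looks_seasonal(yearly_cycle)` flags the 12-month pattern, while the series with a 7-month cycle is not flagged at lag 12.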

Figure 3: Breast Cancer: (a) shows a Time Series Plot; (b) shows the fitted ARIMA line.

The results are shown in Table 1. The adjusted $R^2$ is about 0.41 for the ARIMA model and about 0.58 for the transfer function noise model, and the p-value for the parameter “eventmonth” is less than 0.001. Therefore we conclude that the event has a significant effect on the number of searches for breast cancer.

| Model | Orders | Adjusted R square | p-value of event coefficient |
| --- | --- | --- | --- |
| ARIMA | (2,1,3) | 0.408 | NA |
| ARIMAX | (2,0,3) | 0.583 | <0.001 |

Table 1: Results for ARIMA and Transfer Function Model (ARIMAX)

Next, to conduct the Wilcoxon rank sum test, we split the data into an event-month subset and a non-event-month subset. The resulting p-value (reported as 0 in Appendix B) indicates that we should reject the null hypothesis that the two groups of observations come from the same population. Further, we notice that the mean of the event months is greater than that of the non-event months; thus search frequencies are higher during event months than during the rest of the year.

For the Binomial approach, among the 14 years of Google Trends data for the query breast cancer, we find that all 14 yearly peaks occur in October (see Figure 4), which exceeds the threshold of 4. This is evidence that event-month frequencies are greater than those of the other months.

Figure 4: Breast Cancer: All 14 Peaks Fall in October.

In sum, all our results consistently indicate that the National Breast Cancer Awareness event is effective in increasing search frequency of breast cancer in October.

4.2 Case 2: American Stroke Awareness Month

Stroke is one of the leading causes of death and serious long-term disability in the USA [27]. More than 795,000 Americans have a stroke every year, and about 130,000 people die from stroke in the USA each year [28]. To gain insight into public awareness of American Stroke Awareness Month, Google Trends data for the query stroke were obtained.

Figure 5: Stroke: (a) shows a Time Series Plot; (b) shows the fitted SARIMA line.

The time series plot in Figure 5 (a) shows a slightly declining trend before 2011 and an upward trend after 2011. We use an R function that applies three different tests, periodogram, autocorrelation function, and linear model comparison, to check for seasonality. For the stroke data, all three tests indicate that there is seasonality, meaning a SARIMA model should be used. The outputs for the SARIMA model and the transfer function noise model are presented in Table 2. The adjusted $R^2$ is about 0.62 for the transfer function noise model, which is no better than that of the SARIMA model, about 0.68, and the p-value for the parameter “eventmonth” is about 0.24. Therefore we do not have evidence to conclude that the event has a significant effect on the number of searches for stroke.

| Model | Orders | Adjusted R square | p-value of event coefficient |
| --- | --- | --- | --- |
| SARIMA | (4,1,2)(2,0,0) | 0.677 | NA |
| ARIMAX | (4,1,2)(2,0,0) | 0.620 | 0.2354 |

Table 2: Results for SARIMA and Transfer Function Model (ARIMAX)

According to the one-sided Wilcoxon Rank Sum test, the p-value is about 0.29 (Appendix B), which means we have no compelling evidence of a higher search frequency for the query “stroke” in the event month of May.

Figure 6: Stroke: one peak falls in May.

From 2004 to 2017, only one yearly peak falls in May (see Figure 6), which is less than the threshold of four peaks. In sum, all our results consistently indicate that there is no evidence that the Stroke Awareness event is effective in increasing the search frequency of stroke in May.

Ten events are concluded to be effective in raising public search frequency for the related diseases: Alcohol Awareness, Autism, Breast Cancer, Colon Cancer, Dental Health, Heart Disease, Immunization, National Nutrition, Ovarian Cancer, and SIDS. Eight events are unclear because of inconsistent results, and the others are ineffective.

5 Conclusion and Discussion

According to the analysis of all 46 data sets, we have found that 10 health awareness events are effective, showing strong evidence of significant seasonal patterns with peaks matching the event month; 28 events are classified as ineffective, and the rest are classified as unclear. Although a lack of attention is definitely bad, overheated events may consume too many public resources and diminish the perceived severity of other health topics.

People may suspect that the effective events should have higher frequencies than others, or the opposite. In fact, we checked the relative frequencies across effective, unclear, and ineffective events, and found that there is no relationship. There are effective events with high search frequencies and low search frequencies, and vice versa.

Another interesting observation is that Diabetes was classified as unclear, which is somewhat counterintuitive. We compared all eight unclear events and found that the frequency for Diabetes is by far the largest, while the other 7 events are relatively close to each other but far from Diabetes. We suspect that Diabetes is so prominent that considerable attention is paid to it during many months of the year, which makes the event month insignificant. Therefore, a possible future study is to examine whether some of the unclear and ineffective events are similar to the case of Diabetes. We may also consider the prevalence and severity of these diseases, since it is obviously not practical to make all diseases as well known as heart disease or breast cancer.

The classification in this study will benefit public health management and health awareness for public welfare. The Department of Public Health and public interest groups could rearrange resource allocation between effective and ineffective health awareness events, especially to improve awareness of the topics of ineffective events. Corporate partners could take the opportunity to promote products or services related to effective health awareness events, such as pink-ribbon brooches and other exclusive pink-ribbon products, and clinics need to be prepared for increased demand for health screening appointments.

References

  • [1] CDC, National Prevention Strategy: America’s Plan for Better Health and Wellness, 2014. [Online]. Available: https://www.surgeongeneral.gov/priorities/prevention/strategy/report.pdf (last accessed on July 30, 2018)
  • [2] Centers for Disease Control and Prevention, Update on Overall Prevalence of Major Birth Defects–Atlanta, Georgia, 1978-2005, 2008. [Online]. Available: http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5701a2.htm (last accessed on July 30, 2018)
  • [3] M. Hilbert and P. Lopez, The World’s Technological Capacity to Store, Communicate, and Compute Information, Science, vol. 332, no. 6025, pp. 60-65, 2011.
  • [4] Internet Society, Internet Society Global Internet Report 2014, 2014. [Online]. Available: https://www.internetsociety.org/globalinternetreport/2014/ (last accessed on March 30, 2018)
  • [5] comScore, comScore Search Engine Rankings. [Online]. Available: https://www.statista.com/statistics/267161/market-share-of-search-engines-in-the-united-states/ (last accessed on July 30, 2018)
  • [6] H. A. Johnson, M. M. Wagner, W. R. Hogan, W. Chapman, R. T. Olszewski, J. Dowling and G. Barnas, Analysis of web access logs for surveillance of influenza, Stud Health Technol Inform, Vols. 107:1202-6, 2004.
  • [7] H. A. Carneiro and E. Mylonakis, Google Trends: A Web-Based Tool for Real-Time Surveillance of Disease Outbreaks, Clinical Infectious Diseases, Vols. 49:1557-64, 2009.
  • [8] J. Bollen, H. Mao and X Zeng, Twitter Mood Predicts the Stock Market, Journal of Computational Science, Vol 2, pp. 1-8, 2011.
  • [9] J. Ginsberg, M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski and L. Brilliant, Detecting influenza epidemics using search engine query data, Nature, Vol. 457, 2009.
  • [10] J. A. Doornik, Improving the Timeliness of Data on Influenza-like Illnesses using Google Search Data, University of Oxford, Technical report, pp. 1-21, 2009.
  • [11] H. A. Carneiro and E. Mylonakis, Google Trends: A Web-Based Tool for Real-Time Surveillance of Disease Outbreaks, Clinical Infectious Diseases, Vols. 49:1557-64, 2009.
  • [12] S. Cook, C. Conrad, A. L. Fowlkes and M. H. Mohebbi, Assessing Google Flu Trends Performance in the United States during the 2009 Influenza Virus A (H1N1) Pandemic, 2011. [Online]. Available: DOI: 10.1371/journal.pone.0023610.
  • [13] G. Eysenbach, Infodemiology: tracking flu-related searches on the web for syndromic surveillance, AMIA Annu Symp Proc, pp. 244-248, 2006.
  • [14] S. Youn and H. Cho, Nowcast of TV Market using Google Trend Data, Journal of Electrical Engineering and Technology, Vol 11, pp 227-233, 2016
  • [15] J. Rech, Discovering trends in software engineering with google trend, ACM SIGSOFT Software Engineering Notes, Vol 21, pp 1-2, 2007
  • [16] C. Kuo, H. Ruan and S Chen, An Analysis of Security Patch Lifecycle Using Google Trend Tool, Seventh Asia Joint Conference on Information Security, 2012.
  • [17] H Choi and H Varian, Predicting the Present with Google Trends, the Economic Record, vol 88, pp 2-9, 2012
  • [18] M.S. Mondal and S.A. Wasimi, Periodic Transfer Function-Noise Model for Forecasting, Journal of Hydrologic Engineering, vol 10, 2005
  • [19] A. Seifter, A. Schwarzwalder, K. Geis and J. Aucott, the Utility of Google Trends for Epidemiological Research: Lyme Disease as an Example, Geospatial Health vol 4, pp 135-137, 2010
  • [20] G.D. Jacobsen and K.H. Jacobsen, Health Awareness Campaigns and Diagnosis Rates: Evidence from National Breast Cancer Awareness Month, Journal of Health Economics, vol 30, pp 55-61, 2011
  • [21] J.W. Ayers and B.M. Althouse, Leveraging Big Data to Improve Health Awareness Campaigns: A Novel Evaluation of the Great American Smokeout, JMIR Public Health and Surveillance, vol 2, 2016
  • [22] F. Wilcoxon, Individual Comparisons by Ranking Methods, Biometrics Bulletin, vol 1, pp 80-83, 1945
  • [23] R.C. Blair and J.J. Higgins, A Comparison of the Power of Wilcoxon’s Rank-Sum Statistic to that of Student’s t Statistic under Various Nonnormal Distributions, Journal of Educational Statistics, vol 5, pp 309-335, 1980
  • [24] ACS, Breast Cancer Facts and Figures 2011-2012.
  • [25] SEER, Cancer Statistics Review 1975-2008-table 4.12, [Online]. Available: http://seer.cancer.gov/csr/1975_2008/results_single/sect_04_table.12.pdf (last accessed on July 30, 2018)
  • [26] Centers for Disease Control and Prevention, Breast Cancer Statistics, 2014.
  • [27] D. Mozaffarian et al., "Heart disease and stroke statistics—2015 update: a report from the American Heart Association," 2015.
  • [28] Centers for Disease Control and Prevention and NCHS, "Underlying Cause of Death 1999-2013 on CDC WONDER Online Database," 2015.

Appendices

A National Health Awareness Events with corresponding Selected Queries

| Month | Health Awareness Event | Query |
| --- | --- | --- |
| January | National Birth Defects Prevention Month | Birth Defects |
| January | Cervical Health Awareness Month | Cervical |
| January | National Glaucoma Awareness Month | Glaucoma |
| January | Thyroid Awareness Month | Thyroid |
| February | American Heart Month | Heart Disease |
| February | National Children’s Dental Health Month | Dental Health |
| March | National Colorectal Cancer Awareness Month | Colon Cancer |
| March | National Endometriosis Awareness Month | Endometriosis |
| March | National Nutrition Month | National Nutrition |
| March | Multiple Sclerosis Education Month | Sclerosis |
| April | Alcohol Awareness Month | Alcohol Awareness |
| April | National Autism Awareness Month | Autism |
| April | Irritable Bowel Syndrome Month | Ibs |
| May | American Stroke Awareness Month | Stroke |
| May | Arthritis Awareness Month | Arthritis |
| May | National Asthma and Allergy Awareness Month | Asthma Allergy |
| May | National Celiac Disease Awareness Month | Celiac |
| May | Hepatitis Awareness Month | Hepatitis |
| May | National High Blood Pressure Education Month | High Blood Pressure |
| May | Lupus Awareness Month | Lupus |
| May | Mental Health Month | Mental Health |
| May | National Osteoporosis Awareness Month | Osteoporosis |
| May | Skin Cancer Detection and Prevention Month | Skin Cancer |
| June | National Aphasia Awareness Month | Aphasia |
| June | Scoliosis Awareness Month | Scoliosis |
| July | Eye Injury Prevention Month | Eye Injury |
| August | Amblyopia Awareness Month | Amblyopia |
| August | National Immunization Awareness Month | Immunization |
| August | Psoriasis Awareness Month | Psoriasis |
| September | National Alcohol and Drug Addiction Recovery Month | Alcohol Drug Addiction |
| September | National Cholesterol Education Month | Cholesterol |
| September | Leukemia and Lymphoma Awareness Month | Leukemia |
| September | National Menopause Awareness Month | Menopause |
| September | Ovarian Cancer Awareness Month | Ovarian Cancer |
| September | Prostate Awareness Month | Prostate |
| October | National Breast Cancer Awareness Month | Breast Cancer |
| October | National Dental Hygiene Month | Dental Hygiene |
| October | National Depression and Mental Health Screening Month | Depression |
| October | National Down Syndrome Awareness Month | Down Syndrome |
| October | SIDS Awareness Month | Sids |
| October | Spina Bifida Awareness Month | Spina Bifida |
| November | National Alzheimer’s Disease Awareness Month | Alzheimer |
| November | American Diabetes Month | Diabetes |
| November | National Epilepsy Awareness Month | Epilepsy |
| November | Lung Cancer Awareness Month | Lung Cancer |
| November | Pancreatic Cancer Awareness Month | Pancreatic Cancer |

B The results of three methods for all 46 query data

| Event | Wilcoxon Rank Sum Test p-value | Peaks at Event Months | Transfer Function Noise Model Fits Better | Input Series Coefficient p-value | Conclusion |
| --- | --- | --- | --- | --- | --- |
| Alcohol Awareness | 0.0013* | 6 | Yes | 0* | Effective |
| Autism | 0* | 12 | Yes | 0* | Effective |
| Breast Cancer | 0* | 14 | Yes | 0* | Effective |
| Colon Cancer | 0.0008* | 7 | Yes | 0.0129* | Effective |
| Dental Health | 0* | 14 | Yes | 0* | Effective |
| Heart Disease | 0* | 14 | Yes | 0.0016* | Effective |
| Immunization | 0* | 14 | Yes | 0.0009* | Effective |
| National Nutrition | 0* | 5 | Yes | 0.0054* | Effective |
| Ovarian Cancer | 0.0007* | 7 | Yes | 0* | Effective |
| Sids | 0.0008* | 4 | Yes | 0* | Effective |
| Asthma Allergy | 0.0183* | 3 | Yes | 0.0636 | Unclear |
| Diabetes | 0.0297* | 1 | No | 0.0813 | Unclear |
| Endometriosis | 0.1314 | 4 | No | 0.7099 | Unclear |
| Epilepsy | 0.0159* | 0 | No | 0.2426 | Unclear |
| Lung Cancer | 0.0341* | 1 | No | 0.1929 | Unclear |
| Lupus | 0.0192* | 4 | Yes | 0.7506 | Unclear |
| Menopause | 0.0177* | 2 | No | 0.5078 | Unclear |
| Skin Cancer | 0 | 5 | No | 0.0504 | Unclear |
| Alcohol Drug Addiction | 0.3959 | 0 | Yes | 0.0718 | Ineffective |
| Alzheimer | 0.177 | 1 | No | 0.2090 | Ineffective |
| Amblyopia | 0.8139 | 1 | No | 0.9164 | Ineffective |
| Aphasia | 0.9809 | 0 | No | 0.0009* | Ineffective |
| Arthritis | 0.1718 | 1 | No | 0.6986 | Ineffective |
| Birth Defect | 0.1899 | 0 | No | 0.5783 | Ineffective |
| Celiac | 0.22 | 1 | No | 0.7075 | Ineffective |
| Cervical | 0.8439 | 0 | No | 0.0012* | Ineffective |
| Cholesterol | 0.2667 | 1 | No | 0.0124* | Ineffective |
| Dental Hygiene | 0.0724 | 1 | No | 0.5741 | Ineffective |
| Depression | 0.1168 | 1 | No | 0* | Ineffective |
| Down Syndrome | 0.2446 | 1 | No | 0.0484* | Ineffective |
| Eye Injury | 0.4793 | 0 | Yes | 0.2093 | Ineffective |
| Glaucoma | 0.6872 | 0 | Yes | 0.0274* | Ineffective |
| Hepatitis | 0.3914 | 0 | Yes | 0.0300* | Ineffective |
| High Blood Pressure | 0.8289 | 0 | No | 0.0038* | Ineffective |
| Ibs | 0.1389 | 1 | No | 0.0033* | Ineffective |
| Leukemia | 0.249 | 0 | No | 0.0024* | Ineffective |
| Mental Health | 0.5126 | 0 | No | 0* | Ineffective |
| Osteoporosis | 0.6779 | 0 | No | 0.0429* | Ineffective |
| Pancreatic Cancer | 0.2508 | 0 | Yes | 0.6771 | Ineffective |
| Prostate | 0.7092 | 0 | No | 0.6659 | Ineffective |
| Psoriasis | 0.8311 | 0 | No | 0.3862 | Ineffective |
| Sclerosis | 0.1822 | 0 | No | 0.0258* | Ineffective |
| Scoliosis | 0.3892 | 1 | Yes | 0.4533 | Ineffective |
| Spina Bifida | 0.0036* | 2 | No | 0.0047* | Ineffective |
| Stroke | 0.2918 | 1 | No | 0.2082 | Ineffective |
| Thyroid | 0.9551 | 0 | No | 0.5111 | Ineffective |

C ARIMA and Transfer model comparison in JMP for all 46 events

Each pair of pictures shows one event, with the left being ARIMA/SARIMA and right one being Transfer Function model.

Figure 7: Alcohol Awareness
Figure 8: Alcohol Drug Addiction
Figure 9: Alzheimer
Figure 10: Amblyopia
Figure 11: Aphasia
Figure 12: Arthritis
Figure 13: Asthma Allergy
Figure 14: Autism
Figure 15: Birth Defect
Figure 16: Breast Cancer
Figure 17: Celiac
Figure 18: Cervical
Figure 19: Cholesterol
Figure 20: Coloncancer
Figure 21: Dental Health
Figure 22: Dental Hygiene
Figure 23: Depression
Figure 24: Diabetes
Figure 25: Down Syndrome
Figure 26: Endometriosis
Figure 27: Epilepsy
Figure 28: Eye Injury
Figure 29: Glaucoma
Figure 30: Heart Disease
Figure 31: Hepatitis
Figure 32: High Blood Pressure
Figure 33: Ibs
Figure 34: Immunization
Figure 35: Leukemia
Figure 36: Lung Cancer
Figure 37: Lupus
Figure 38: Menopause
Figure 39: Mental Health
Figure 40: National Nutrition
Figure 41: Osteoporosis
Figure 42: Ovarian Cancer
Figure 43: Pancreatic Cancer
Figure 44: Prostate
Figure 45: Psoriasis
Figure 46: Sclerosis
Figure 47: Scoliosis
Figure 48: Sids
Figure 49: Skin Cancer
Figure 50: Spina Bifida
Figure 51: Stroke
Figure 52: Thyroid