1 Introduction
A growing body of literature has found evidence that fetal exposure to adverse health shocks is associated with negative socioeconomic and health outcomes in later life (see Prinz2018 for a recent survey). [Footnote 1: See also Almond:2011jf and Currie:2013ke for comprehensive reviews.] In addition to this sizable long-run literature, the short-run relationship between health shocks in utero, particularly weather shocks, and birth outcomes has also been widely studied (Poursafa:2015vg; Zhang:2017cw). These studies have found that exposure to weather shocks such as heat and cold waves during pregnancy is associated with lower birthweight (Deschenes:2009jz; Andalon:2016gd; Molina:2017hf). [Footnote 2: Currie:2013bm found that newborns could suffer abnormalities because of stressful events due to weather shocks. According to their analyses, exposure to a hurricane during pregnancy is associated with the use of ventilators and the occurrence of meconium aspiration syndrome.] In addition, since climate change is now on the global agenda, the consequences of weather shocks for child health have recently begun to attract wider attention (Zivin:2016wq; Kousky:2016um). Carleton:2016cb reviewed the social and economic impacts of climate change. These findings in the long-run and short-run strands of the literature are closely related because a lower birthweight can be associated with worse socioeconomic outcomes in later life (Prinz2018).
In contrast to these studies, however, mortality selection in utero has not attracted broad coverage in the economics literature. One exception is the study by Valente:2015ci, who tested a biological proposition named the Trivers–Willard hypothesis, which argues that fetal exposure to adverse health shocks disturbs the gender balance at birth because the reproductive success of males is more vulnerable than that of females (Trivers:1973fd). Valente found that fetal exposure to civil conflict in Nepal is associated with a higher probability of miscarriage and fewer male births relative to female births (i.e., a lower secondary sex ratio). Although testing the Trivers–Willard hypothesis was not their main objective, Sanders:2015iq also found that the Clean Air Act Amendments of 1970 in the United States improved fetal health (measured as a higher secondary sex ratio).
Maintaining the natural gender balance is important in an economy because an imbalance in the adult sex ratio leads to skewed marriage patterns in terms of age and assortative matching as well as other demographic conditions such as out-of-wedlock fertility (Angrist:2010tz; Chiappori:2002bx; Abramitzky:2011bu; Bethmann:2012il; Brainerd:2017bw; Francis2011). Considering this scarcity of research, we aim to bridge the gap in the body of knowledge by investigating the associations between fetal exposure to pandemic influenza and mortality selection in utero.
Using a comprehensive dataset of vital statistics in Japan, we find that fetal exposure to pandemic influenza between 1918 and 1920 decreased the proportion of males at birth. The culling effect was concentrated on exposure during the first trimester of the pregnancy, and the estimated magnitude suggests that such exposure could have decreased the proportion of males at birth by up to 1%, accounting for 60% of one standard deviation. Our results from the analyses using the complete census on annual infant mortality by gender indicate that such a reduction in male births during pandemics might be associated with a “scarring” mechanism under which the distribution of fetal health endowment shifts to the left. We also investigate the persistence of the effects of fetal influenza exposure on the sex ratio of children aged between 5 and 12 years using a set of official reports of the Population Censuses conducted in 1925 and 1930. From these exercises, we find evidence that shocks due to pandemic influenza might have persisted into childhood. The estimated magnitude is approximately 0.18% in the maximum case, accounting for 40% of the standard deviation.
This study contributes to the wider literature in the following two ways. First, it is the first to use pandemic influenza as an exogenous shock to test mortality selection in utero, and it does so using a set of complete censuses on births in a developing economy, which covers all births in each prefecture-month cell between 1916 and 1922 in Japan. While previous studies have used survey data to analyze fetal health, Sanders:2015iq showed that data from household surveys on fetal losses are more likely to suffer from unobserved selection issues because of the use of small samples. [Footnote 3: Sanders:2015iq used county-year-level birth data from the National Center for Health Statistics’ Vital Statistics Microdata between 1968 and 1972, which record 50% of all birth certificates in the United States. Valente:2015ci used the Demographic and Health Surveys conducted in 2001 and 2006, which record nearly 11,000 births.] Another potential issue with survey samples in developing countries is age heaping (Beckett2001). In light of these issues, we use the complete prefecture-month-level birth records of prewar Japan to provide new evidence on the association between fetal influenza exposure and the gender imbalance at birth. Using similarly comprehensive vital statistics on infant deaths, this study also assesses the mechanism behind male culling before birth.
Second, this study is the first to investigate the persistence of the effects of fetal shocks on the sex ratio of children. While previous studies have focused on the associations between fetal shocks and the sex ratio at birth, the later-life gender imbalance due to those fetal shocks has not been studied (Bethmann:2014fc; Sanders:2015iq; Valente:2015ci). Investigating the long-term effects of fetal exposure to pandemics on the sex ratio is important given that maintaining the natural gender balance in an economy is preferable, as discussed earlier. Although our data constructed from the Population Censuses include only children aged up to 12 years, we find evidence of the persistent effects of fetal influenza exposure on the sex ratio of children.
The structure of the remainder of this paper is as follows. Section 2 introduces the empirical setting. Section 3 provides empirical evidence on the gender imbalance at birth due to pandemic influenza. Section 4 assesses the mechanism behind fetal shocks on the gender imbalance. Section 5 investigates the persistency of fetal influenza exposure on the sex ratio of children. Section 6 concludes the paper.
2 Empirical Setting
2.1 Theoretical Framework
This study investigates the impacts of pandemic influenza on the gender balance. The influential study by Trivers:1973fd in the field of biology proposed a hypothesis about the mechanism behind the determinants of the secondary sex ratio, which has recently attracted attention in the field of health economics (Valente:2015ci). As Catalano:2006vv illustrated, the intuition of the Trivers–Willard hypothesis can be explained using shifts in the distribution of a random variable.
Proportion of Male Births
Let $h_b$ and $h_g$ be the initial health endowments of boys and girls in utero, respectively. Owing to natural selection, a certain threshold, $s$, exists, below which fetuses are culled before birth. Since male fetuses are more vulnerable in utero than female fetuses (Kraemer:2000to), it is natural to assume that the mean initial health endowment of girls is greater than that of boys: $\mu_g > \mu_b$. When we consider the probability density functions for boys ($f_b$) and girls ($f_g$), this initial assumption implies that the number of culled male fetuses is always greater than that of female fetuses because of the following condition:
$$\int_{-\infty}^{s} f_b(h)\,dh > \int_{-\infty}^{s} f_g(h)\,dh \qquad (1)$$
If fetuses are exposed to health shocks in utero, the distribution shifts to the left or the survival threshold moves to the right. The former is called the “scarring” mechanism, whereas the latter is called the “selection” mechanism. In both cases, condition (1) indicates that the male share at birth must decrease, which corresponds to the proposition implied by the Trivers–Willard hypothesis (Trivers:1973fd). This study thus investigates whether this proposition holds for the influenza pandemic in the early 20th century in industrializing Japan.
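The proposition can be illustrated numerically with a minimal sketch. It assumes normally distributed health endowments with a common standard deviation and equal numbers of male and female conceptions; these assumptions, the function name `male_share`, and all parameter values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def male_share(mu_b, mu_g, sigma, s):
    """Male live births per 100 live births when fetuses with h < s are culled.

    Assumes equal numbers of male and female conceptions (a simplification).
    """
    surviving_boys = 1.0 - NormalDist(mu_b, sigma).cdf(s)
    surviving_girls = 1.0 - NormalDist(mu_g, sigma).cdf(s)
    return 100.0 * surviving_boys / (surviving_boys + surviving_girls)


# Hypothetical baseline: girls have the higher mean endowment (mu_g > mu_b).
baseline = male_share(mu_b=0.0, mu_g=0.2, sigma=1.0, s=-2.0)

# "Scarring": both distributions shift to the left, threshold unchanged.
scarring = male_share(mu_b=-0.5, mu_g=-0.3, sigma=1.0, s=-2.0)

# "Selection": distributions unchanged, threshold moves to the right.
selection = male_share(mu_b=0.0, mu_g=0.2, sigma=1.0, s=-1.5)

print(f"baseline  male share: {baseline:.2f}")
print(f"scarring  male share: {scarring:.2f}")
print(f"selection male share: {selection:.2f}")
```

Under this parameterization, both perturbations lower the male share relative to the baseline, so the male share alone cannot distinguish the two mechanisms.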
Mechanism
In contrast to culling before birth, the health status of an infant depends on the type of mechanism. If the “scarring” mechanism works, the conditional mean of the truncated normal distribution shifts to the left because the original mean of the distribution moves to the left, as illustrated in Figure 1(a). If the “selection” mechanism works instead, the conditional mean of the truncated normal distribution shifts to the right as the survival threshold moves to the right, cutting the lower tail of the original distribution, as illustrated in Figure 1(b). [Footnote 4: These illustrations may be understandable intuitively. However, both mechanisms are easy to show mathematically using the conditional mean of the truncated normal distribution of, for example, $h_b$: $E[h_b \mid h_b > s] = \mu_b + \sigma\,\phi(\alpha_b)/[1-\Phi(\alpha_b)]$ with $\alpha_b = (s-\mu_b)/\sigma$, where $s$ is the survival threshold, $\mu_b$ and $\sigma$ are the mean and standard deviation of $h_b$, $\phi$ is the standard normal density function, and $\Phi$ is the cumulative distribution function.] If both mechanisms work at the same time, the health status of an infant may therefore be unchanged.
To test which mechanism is more relevant, we use the infant mortality rate as a proxy for the health status of infants. In the contemporary context, a health measurement at birth such as birthweight can be used (Valente:2015ci). Although such a measurement at birth was unavailable in early 20th century Japan, the annual vital statistics reports provide complete figures on infant deaths in all prefectures at that time. Since the infant mortality rate can accurately represent health status at birth (Almond:2006va), we use data on infant mortality to analyze the mechanism behind the observed secondary sex ratio during the influenza pandemic.
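The opposite predictions of the two mechanisms for the health of survivors can be checked with the standard formula for the mean of a lower-truncated normal distribution. The sketch below is illustrative only: the function name `survivor_mean` and the parameter values are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for pdf and cdf


def survivor_mean(mu, sigma, s):
    """E[h | h > s] for h ~ Normal(mu, sigma^2), i.e. mean health of survivors."""
    alpha = (s - mu) / sigma
    return mu + sigma * _STD_NORMAL.pdf(alpha) / (1.0 - _STD_NORMAL.cdf(alpha))


# Hypothetical parameter values.
baseline = survivor_mean(mu=0.0, sigma=1.0, s=-2.0)
scarring = survivor_mean(mu=-0.5, sigma=1.0, s=-2.0)   # distribution shifts left
selection = survivor_mean(mu=0.0, sigma=1.0, s=-1.5)   # threshold moves right

print(f"baseline : {baseline:.3f}")
print(f"scarring : {scarring:.3f}")
print(f"selection: {selection:.3f}")
```

In this example the survivors' mean health falls under “scarring” and rises under “selection”, which is why a health proxy observed after birth, such as the infant mortality rate, can help separate the two.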
2.2 Maternal Stressor: Pandemic Influenza
The Spanish influenza of 1918 infected 600 million people and killed 20–40 million patients worldwide (Kilbourne:2006ce; Taubenberger:2006vq). Only five months after the first reported case of influenza in the United States, the Spanish flu reached the Japanese archipelago, which it hit between August 1918 and July 1920. Figure 2 shows the number of monthly deaths from pandemic influenza between 1918 and 1920 in Japan. Similar to other Asian countries, there were two waves of the pandemic in Japan, with the first and second peaks observed in November 1918 and January 1920, respectively (hayami2006). Regarding the intensity of the pandemics, although the influenza mortality rate in Japan (4.5 per 1,000 people) between 1918 and 1919 was lower than that of other Asian countries, this rate was in a similar range to that of Western countries (Rice:1993vp; hayami2010). Indeed, during the pandemic periods (August 1918–July 1919; September 1919–July 1920), more than one in every five people in Japan became infected with influenza (csbhm1927).
Moreover, pandemic influenza tended to affect young adult females as well as older adults and children because of its aggressiveness (Almond:2005uw; Erkoreka:2010fi; Kawana:2007te). Specifically, women aged 20–29 years in Japan were more likely to be affected by the pandemic flu virus than men in the same age range (hayami2006; Ogasawara:2017ii). [Footnote 5: Similar gender and age biases in infections were observed in Western countries. Influenza-related mortality rates in the pandemic years were more than five times higher than those in non-pandemic years in the United Kingdom and the United States (RICHARD:2009eb). Reid:2005vm also reported that such dramatic increases in mortality rates were more pronounced among young adult women.] Since the average age at first marriage of Japanese women was 23 years in the 1920s, the average age at first birth might have been around 24–25 years (census1920v1; census1925v2). This means that pandemic influenza affected not only young adult women but also children in utero during the pandemics via maternal infection.
Given these features of pandemics, a growing body of studies has employed pandemic influenza as a natural experiment to identify the long-term effects of fetal shocks on human capital formation (e.g., Almond:2006va; Lin:2014bz). Indeed, as these previous studies have found, pandemics show a certain random spatiotemporal distribution. Figure 3 illustrates the spatial distribution of influenza death rates in the pandemic months of 1918–1920. In the first wave, the epidemic cluster was generated in the southwestern region (Kyūshū and Shikoku) in November 1918 (Figure 3(a)); it then jumped to the northern and northeastern regions (Chūbu and Tōhoku) in the next month (Figure 3(b)) before moving to the central part of the main island (Kantō) (Figure 3(c)). The second wave exhibits a more straightforward but not persistent transition. The cluster was generated in the western region (Chūgoku) in January 1920 (Figure 3(d)) and then moved to the northern region (Chūbu) in the next month (Figure 3(e)). It finally covered a broader region including the northeastern region (Tōhoku) and Hokkaidō, a northeastern island. The foregoing suggests that the patterns of the pandemics were not systematic or concentrated in a specific region.
Potential sorting might be an issue in the identification because it can cause measurement errors in the influenza death rate. However, internal migration as an escape strategy would not have worked because people could not have predicted the timing and place of the pandemics in the early 20th century (hayami2006). Moreover, although another potential issue is the immediate response by the Japanese government, the government could not have provided an efficient preventive policy during the pandemics because no vaccination was available at that time (hayami2006).
3 Gender Imbalance at Birth
3.1 Data
Proportion of Male Births
Our main analysis uses a unique prefecture-month-level panel dataset on the male share at birth, defined as the number of male live births per 100 live births. [Footnote 6: This is essentially the same as using the secondary sex ratio, namely, the ratio of male live births to female live births (Bethmann:2014fc). Since the Trivers–Willard hypothesis focuses on the vulnerability of male fetuses, the male share at birth may be more useful than the secondary sex ratio for interpreting the results.] We construct the dataset of the proportion of male births using the official vital statistics records published by the Statistics Bureau of the Cabinet. Since these vital statistics were recorded based on the comprehensive national registration system (koseki), the data cover all births during the measured years. [Footnote 7: Although the quality of Japanese fetal death records, which began in 1900, was not high in the initial stage, the records became reliable around the 1920s (Kawana:2007te; Ito1987). See Drixler2016 for a more in-depth discussion on birth records in prewar Japan.] We digitize the 1916–1922 editions of Nihonteikoku jinkōdōtaitōkei (Vital Statistics of the Empire of Japan, hereafter the VSEJ) (vsej1916; vsej1917; vsej1918; vsej1919; vsej1920; vsej1921; vsej1922). Online Appendix A shows an example of this vital statistics record. [Footnote 8: We confirm the stationarity of our panel dataset. For the proportion of male births used in our analysis, several tests reject the null of unit root nonstationarity. See Online Appendix B.1.] Panel A of Table 1 lists the summary statistics of the proportion of male births.
Influenza Mortality
We use the influenza death rate, the number of deaths due to influenza per 10,000 people, as the key independent variable that captures the intensity of exposure to pandemic influenza. The data on the monthly death tolls from influenza are obtained from the 1915–1922 editions of Nihonteikoku shiintōkei (Statistics of Causes of Death of the Empire of Japan, hereafter the SCDEJ) published by the Statistics Bureau of the Cabinet (scdej1915; scdej1916; scdej1917; scdej1918; scdej1919; scdej1920; scdej1921; scdej1922), whereas the data on the population are taken from the official online database of the Statistical Survey Department, Statistics Bureau, Ministry of Internal Affairs and Communications. [Footnote 9: These are publicly available at the official website (https://www.estat.go.jp/statsearch/filedownload?statInfId=000000090265&fileKind=0, accessed on July 31, 2019). In this dataset, the population in each month of a given year is calculated from the annual population, the number of live births, and the number of deaths. The data on the number of live births and deaths are from the 1915–1922 editions of the VSEJ (vsej1915; vsej1916; vsej1917; vsej1918; vsej1919; vsej1920; vsej1921; vsej1922).]
Since fetal influenza exposure matters in our theoretical framework as described, we use the past nine-month average of influenza death rates in the regression analysis. We also investigate the impacts of fetal influenza exposure in each trimester (see the next subsection). Panel B of Table 1 shows the summary statistics of the average influenza death rates.
Additional Control Variables
To control for observable factors, we include a set of available prefecture-year-level control variables. The first set of controls comprises indices of agricultural production. Given the agrarian society at that time, a certain proportion of wealth can be captured by agricultural productivity. We herein consider rice yield per hectare, soy yield per hectare, and milk production per capita as measures of potential wealth because these items were the main sources of carbohydrate and protein (Ogasawara:2020et). The data on these variables are digitized from Todōfuken nōgyōkisotōkei (Basic Statistics of Agriculture in Japanese Prefectures) edited by Nobufumi Kayo (kayo1983). The second set of controls captures access to medical care. We include the number of medical doctors and midwives per 100 people to control for access to medical care as well as the related socioeconomic conditions and potential wealth level. To obtain the data on medical access, we digitize volumes 36–43 of Nihonteikoku tōkeinenkan (Statistical Yearbook of the Japanese Empire, hereafter the SYEJ) (syej33_46). [Footnote 10: The data on the population used as the denominator are taken from the official database of the Statistical Survey Department, Statistics Bureau, Ministry of Internal Affairs and Communications (http://www.stat.go.jp/data/chouki/zuhyou/0205.xls, accessed on July 13, 2017).]
Frequency  Mean  Std. Dev.  Min  Max  Observations  

Panel A: Dependent variables  
Male births (per 100 births)  Monthly  
Infant mortality rate (per 1,000 live births)  Annual  
Infant mortality rate (boys)  Annual  
Infant mortality rate (girls)  Annual  
Panel B: Influenza severity  
Influenza death rate (past nine-month average, per 10,000 people)  Monthly  
Influenza death rate (first trimester average)  Monthly  
Influenza death rate (second trimester average)  Monthly  
Influenza death rate (third trimester average)  Monthly  
Weighted influenza death rate (per 10,000 people)  Annual  
Weighted influenza death rate (first trimester average)  Annual  
Weighted influenza death rate (second trimester average)  Annual  
Weighted influenza death rate (third trimester average)  Annual  
Panel C: Control variables  
Rice yield per hectare (hectoliter)  Annual  
Soy yield per hectare (hectoliter)  Annual  
Milk production per capita (liter)  Annual  
Coverage of doctors (per 100 people)  Annual  
Coverage of midwives (per 100 people)  Annual 
3.2 Identification Strategy
We use a difference-in-differences (DID) estimation strategy within the regression framework that compares the proportion of male births among prefectures that experienced different intensities of exposure to influenza before and after the pandemic. As discussed, there were considerable exogenous variations in the influenza death rates during the pandemic periods. To identify the impacts of fetal exposure to influenza on the sex ratio at birth, we employ a semi-experimental approach using this spatiotemporal variation in influenza death rates. Our baseline specification is given as follows:
$$y_{it} = \alpha + \beta\,\frac{1}{9}\sum_{k=1}^{9}\mathit{FLU}_{i,t-k} + \mathbf{x}_{ig_t}'\boldsymbol{\gamma} + \nu_i + \tau_t + e_{it} \qquad (2)$$
where $i$ indexes the prefecture, $t$ indexes the measured year-month, and $g_t$ indicates a group variable for the measured year. The variable $y_{it}$ is the proportion of male births, $\mathbf{x}_{ig_t}$ is a vector of the prefecture-year-level control variables, $\nu_i$ is the prefecture fixed effect, $\tau_t$ is the year-month-specific fixed effect, and $e_{it}$ is a random error term. $\mathit{FLU}_{i,t-k}$ is the $k$-month lagged influenza death rate, and thus our key independent variable is the past nine-month average of influenza death rates. Our parameter of interest is $\beta$, and its estimate $\hat{\beta}$ captures the marginal effect of the influenza death rate on the proportion of male births. Therefore, we expect $\hat{\beta}$ to be negative and statistically significant.
The first specification in equation 2 assumes that the potential effects of fetal influenza exposure are constant regardless of the timing of exposure. However, medical evidence suggests that fetuses are most susceptible to maternal stress in the first trimester, when they experience rapid neuron differentiation and the proliferation of neuronal elements (moore2013). This implies that the culling effects on male fetuses are much clearer in the first trimester than in the second and third trimesters. Therefore, our preferred specification is as follows:
$$y_{it} = \alpha + \beta_1\,\frac{1}{3}\sum_{k=7}^{9}\mathit{FLU}_{i,t-k} + \beta_2\,\frac{1}{3}\sum_{k=4}^{6}\mathit{FLU}_{i,t-k} + \beta_3\,\frac{1}{3}\sum_{k=1}^{3}\mathit{FLU}_{i,t-k} + \mathbf{x}_{ig_t}'\boldsymbol{\gamma} + \nu_i + \tau_t + e_{it} \qquad (3)$$
The second to fourth terms represented as summations on the right-hand side are the average influenza death rates during the first, second, and third trimesters, respectively. Therefore, we expect the estimates $\hat{\beta}_1$, $\hat{\beta}_2$, and $\hat{\beta}_3$ to be negative; among these, the estimate for the first trimester, $\hat{\beta}_1$, should show the clearest adverse effect on the proportion of male births.
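The construction of these exposure measures can be sketched as follows. The sketch assumes that a child born in month $t$ was in utero during months $t-9$ to $t-1$, so that lags 7–9, 4–6, and 1–3 correspond to the first, second, and third trimesters; this lag convention, the function name `exposure_measures`, and the monthly rates are assumptions made for illustration.

```python
def exposure_measures(flu_rates, t):
    """Average influenza death rates over the nine months before month index t.

    flu_rates: monthly influenza death rates (per 10,000 people) for one
               prefecture, in chronological order.
    Returns the past nine-month average and the three trimester averages.
    """
    if t < 9 or t >= len(flu_rates) + 1:
        raise ValueError("month t needs nine earlier months of data")
    lagged = {k: flu_rates[t - k] for k in range(1, 10)}  # k-month lags

    def average(lags):
        return sum(lagged[k] for k in lags) / len(lags)

    return {
        "all": average(range(1, 10)),    # past nine-month average
        "first": average(range(7, 10)),  # lags 7-9 (assumed first trimester)
        "second": average(range(4, 7)),  # lags 4-6 (assumed second trimester)
        "third": average(range(1, 4)),   # lags 1-3 (assumed third trimester)
    }


# Hypothetical series: a three-month epidemic spike surrounded by calm months.
rates = [0.1] * 6 + [3.0, 6.0, 3.0] + [0.1] * 6
measures = exposure_measures(rates, t=15)
print(measures)
```

Because the nine-month average is the mean of the three trimester averages, a spike confined to one trimester is diluted in the pooled measure, which is one reason the trimester-specific specification can be more informative.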
Since we use a within estimator for the fixed effect models in equations 2 and 3, the identification depends on the sharp increases in influenza mortality during the pandemic years (Figure 2). As discussed in Section 2.2, these influenza death rates are plausibly exogenous because no vaccination was available in the prewar period and internal migration was unrealistic given the rapid spread of the virus. In addition to this favorable feature for the identification, we control for a large proportion of the unobservable factors and observable characteristics in the following ways. First, we control for prefecture-specific time-invariant factors such as the baseline wealth level and geographical features using prefecture fixed effects. Second, the macroeconomic shocks and cyclical effects of seasonal epidemics are captured using year-month fixed effects. [Footnote 11: Since we use seven measured years (1916–1922), 83 ($7 \times 12 - 1$) year-month fixed effects are included in the models altogether.] After controlling for these fixed effects, the remaining potential confounding factors in the error term that might be correlated with the influenza death rate include the time-varying wealth level and access to medical care. To control for these factors, we further include the set of available prefecture-year-level control variables introduced in the previous subsection. Since we employ the regression DID specification, the trends in the proportion of male births are assumed to be similar across prefectures. Because the sex ratio at birth is a biological measure rather than a socioeconomic outcome, this common trend assumption is likely to hold. To relax the common trend assumption, however, we further include a prefecture-specific time trend in some of the specifications.
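The within estimator with prefecture and year-month fixed effects can be illustrated with a deliberately simplified sketch. It assumes a balanced panel, a single regressor, and no weights, control variables, or prefecture-specific trends, so it is not the full specification used in the paper; the function name `two_way_within_slope` and the toy data are hypothetical.

```python
def two_way_within_slope(y, x):
    """OLS slope of y on x with unit and period fixed effects (balanced panel).

    y, x: dicts keyed by (unit, period). In a balanced panel, removing unit
    means and period means and adding back the grand mean is equivalent to
    including both sets of fixed effects.
    """
    units = sorted({i for i, _ in y})
    periods = sorted({t for _, t in y})

    def demean(v):
        grand = sum(v.values()) / len(v)
        unit_mean = {i: sum(v[i, t] for t in periods) / len(periods) for i in units}
        period_mean = {t: sum(v[i, t] for i in units) / len(units) for t in periods}
        return {(i, t): v[i, t] - unit_mean[i] - period_mean[t] + grand
                for (i, t) in v}

    y_dm, x_dm = demean(y), demean(x)
    s_xx = sum(val ** 2 for val in x_dm.values())
    s_xy = sum(x_dm[key] * y_dm[key] for key in x_dm)
    return s_xy / s_xx


# Toy panel: 3 units, 4 periods, true slope of -0.5 plus additive effects.
units, periods = range(3), range(4)
x = {(i, t): (i + 1) * (t % 2) + 0.1 * i * t for i in units for t in periods}
y = {(i, t): 50.0 + 0.3 * i + 0.2 * t - 0.5 * x[i, t]
     for i in units for t in periods}
slope = two_way_within_slope(y, x)
print(f"estimated slope: {slope:.3f}")
```

The unit and period effects drop out after demeaning, so only variation in exposure that deviates from both the prefecture average and the common monthly shock identifies the slope.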
To address the potential spatial and within-prefecture correlations, we use the cluster-robust variance estimator (CRVE) and cluster the standard errors at the 8-area level. [Footnote 12: This geographical classification of Japan includes Hokkaidō (northernmost), Tōhoku (eastern), Kantō (east-central), Chūbu (west-central), Kansai (south-central), Chūgoku (westernmost), Shikoku (southwest of the main island), and Kyūshū (southwest island).] Our method controls for the correlation and heteroskedasticity within clusters and addresses the potential heteroskedasticity across clusters. To address the small number of clusters in the CRVE, we adopt the wild cluster bootstrap-t method for the statistical inference (Cameron:2008ws). All the regressions are weighted by the average number of births over the sample period in each prefecture.
3.3 Main Results
Table 2 presents the results. Columns (1)–(4) present the results for the entire period (1916–1922). Column (1) shows the results from the baseline specification in equation 2. The estimate is negative but statistically insignificant. This result is unchanged if we include the prefecture-specific time trend in column (2). Column (3) shows the result from our preferred specification in equation 3. The estimates listed in this column suggest that fetal influenza exposure in the first trimester has a statistically significant negative effect on the proportion of male births, whereas exposure during the second and third trimesters does not have such an effect. As explained, this finding is consistent with the fact that fetuses are more vulnerable in the first trimester than in the other trimesters. In column (4), we find that this result is robust to including the prefecture-specific time trend, as expected.
Columns (5) and (6) present the results for non-pandemic years (1916–1917 and 1921–1922), whereas columns (7) and (8) present the results for pandemic years (1918–1920). In columns (5) and (6), we find no statistically significant effects of fetal exposure to influenza on the proportion of male births during non-pandemic years. By contrast, columns (7) and (8) show clear and significant adverse effects of fetal influenza exposure during pandemic years. This implies that while seasonal influenza does not have any significant impact on the secondary sex ratio, pandemic influenza does have such an effect. The result of this placebo experiment suggests that our identification, which relies on the sharp increase in influenza death rates during pandemic years, works well and should provide reliable estimates.
If we use the maximum average influenza death rate in the first trimester (Panel B of Table 1) as the reference value to calculate the magnitude, the estimate in column (3) implies that fetal exposure to pandemic influenza decreased the proportion of male births by approximately 1.1%. This magnitude is not very large but is still non-negligible given that one standard deviation of the proportion of male births is 1.67 (Panel A of Table 1). Although the estimate in column (7) is slightly larger than that in column (3), the calculated magnitude is in a similar range in both cases (1.1% vs. 1.3%).
Overall, we find that pandemic influenza can disturb the gender balance at birth, consistent with the proposition implied by the Trivers–Willard hypothesis that health shocks in utero can decrease the proportion of male births under both the “scarring” and the “selection” mechanisms because male fetuses are more vulnerable than female fetuses.
Entire period  Non-pandemic years  Pandemic years  
Exposed trimesters  (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8) 
All trimesters  
First trimester  ***  ***  ***  ***  
Second trimester  
Third trimester  
Time trend  No  Yes  No  Yes  No  Yes  No  Yes 
Period  1916–22  1916–22  1916–22  1916–22  1916–17, 1921–22  1916–17, 1921–22  1918–20  1918–20 
Observations  3864  3864  3864  3864  2208  2208  1656  1656 
Number of prefectures  46  46  46  46  46  46  46  46 
Number of clusters  8  8  8  8  8  8  8  8 
4 Mechanism
Next, we investigate the mechanism behind the effects of pandemic influenza on the proportion of male births. To do so, we first digitize the statistics on infant deaths reported in the SCDEJ. We then calculate the weighted annual influenza death rate using the monthly variation in influenza mortality to match the infant death rates observed at the prefecture-year level.
4.1 Data and Specification
As described, we must measure the health of infants to test whether the “scarring” or “selection” mechanism drove the gender imbalance at birth due to pandemic influenza. We digitize the complete censuses of annual infant deaths documented in the SCDEJ to construct the dataset on infant mortality rates between 1916 and 1922 (scdej1916; scdej1917; scdej1918; scdej1919; scdej1920; scdej1921; scdej1922). The data on the number of annual live births used as the denominator are obtained from the VSEJ (vsej1916; vsej1917; vsej1918; vsej1919; vsej1920; vsej1921; vsej1922). [Footnote 13: Although the VSEJ also documents the number of fetal deaths by month, unfortunately, it does not record any information on the length of the gestation period until fetal death. This means that it is technically difficult to precisely match the timing of exposure to pandemic influenza with fetal death rates. This sort of crude assignment can attenuate the estimated coefficients on the treatment variables. Despite this difficulty, we also regress the annual fetal death rate on the weighted influenza mortality rate. As expected, the estimates are statistically insignificant in most cases.]
To improve the assignment of the treatments, we calculate a weighted influenza death rate using the monthly variations in the number of influenza deaths and live births. The weighted influenza death rate in prefecture $i$ in year $t$ is defined as follows:
$$\mathit{WFLU}_{it} = \sum_{m=1}^{12} \frac{B_{itm}}{\sum_{m'=1}^{12} B_{itm'}}\,\overline{\mathit{FLU}}_{itm} \qquad (4)$$
where $B_{itm}$ is the number of live births in month $m$ and $\overline{\mathit{FLU}}_{itm}$ is the past nine-month average of influenza mortality in month $m$. [Footnote 14: This transformation takes both the severity of influenza exposure and the timing of birth into account: $\overline{\mathit{FLU}}_{itm}$ captures the treatment intensity, whereas the weight, $B_{itm}/\sum_{m'} B_{itm'}$, adjusts for the differences in the timing of birth. Ogasawara:2018hk showed evidence that this transformation improves the treatment assignment to a certain extent compared with using a simple lagged influenza death rate.] The baseline specification is then given as follows:
$$\mathit{IMR}_{it} = \kappa + \delta\,\mathit{WFLU}_{it} + \mathbf{x}_{it}'\boldsymbol{\theta} + \nu_i + \psi_t + u_{it} \qquad (5)$$
where $\mathit{IMR}_{it}$ is the infant mortality rate, $\mathbf{x}_{it}$ is a vector of the same control variables introduced above, $\nu_i$ is the prefecture fixed effect, $\psi_t$ is the year-specific fixed effect, and $u_{it}$ is a random error term. Our parameter of interest is $\delta$, and its estimate $\hat{\delta}$ may capture the marginal effect of the influenza death rate on the infant mortality rate. As explained, we expect $\hat{\delta}$ to be negative and statistically significant if the “selection” mechanism works, whereas it should be positive and statistically significant if the “scarring” mechanism is relevant.
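The weighted annual exposure measure in equation 4 can be sketched as a births-weighted average of the monthly exposure values. The sketch assumes that the weights are monthly shares of annual live births, as the description of the transformation suggests; the function name `weighted_flu_rate` and the numbers are hypothetical.

```python
def weighted_flu_rate(births, exposure):
    """Births-weighted annual average of monthly fetal influenza exposure.

    births:   live births in each of the 12 months of one prefecture-year.
    exposure: past nine-month average influenza death rate for each month.
    """
    if len(births) != 12 or len(exposure) != 12:
        raise ValueError("expected 12 monthly values for births and exposure")
    total_births = sum(births)
    return sum(b / total_births * e for b, e in zip(births, exposure))


# Hypothetical year in which exposure is high only for births late in the year.
exposure = [0.2] * 8 + [2.0, 4.0, 4.0, 2.0]
even_births = [100] * 12
late_births = [50] * 8 + [200] * 4   # more births in the high-exposure months

even_rate = weighted_flu_rate(even_births, exposure)
late_rate = weighted_flu_rate(late_births, exposure)
print(f"even births: {even_rate:.3f}")
print(f"late births: {late_rate:.3f}")
```

With evenly spread births the measure reduces to the simple annual mean of the monthly exposure values; when births are concentrated in high-exposure months, the annual measure rises accordingly.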
In the flexible specification, we consider the weighted influenza death rates for the first, second, and third trimesters by replacing the past nine-month average of influenza mortality in equation 4 with the average influenza mortality rates for each trimester. The flexible specification is given as follows:
$$\mathit{IMR}_{it} = \kappa + \delta_1\,\mathit{WFLU}^{(1)}_{it} + \delta_2\,\mathit{WFLU}^{(2)}_{it} + \delta_3\,\mathit{WFLU}^{(3)}_{it} + \mathbf{x}_{it}'\boldsymbol{\theta} + \nu_i + \psi_t + u_{it} \qquad (6)$$
where $\mathit{WFLU}^{(j)}_{it}$ denotes the weighted influenza death rate for the $j$-th trimester and the remaining terms are the infant mortality rate, control variables, prefecture and year fixed effects, and error term as in equation 5. This specification allows us to investigate the most sensitive trimester for the impacts of fetal influenza exposure on infants’ health. One must be careful here: in the regressions using infant mortality as the dependent variable, we do not necessarily expect the first trimester to be the most sensitive period for infants’ health. While fetuses are indeed relatively vulnerable during the first trimester, those affected by any shocks during this trimester are culled before birth. In other words, surviving fetuses are positively selected into birth. [Footnote 15: It can be shown that the conditional expectation of the truncated normal distribution is always greater than the mean of the original distribution, as described in Section 2.1.] Therefore, the observed (i.e., surviving) infants may be more sensitive to shocks during the second and/or third trimesters than to those during the first trimester. This natural selection mechanism suggests that the estimates $\hat{\delta}_2$ and/or $\hat{\delta}_3$ can be positive (negative) if the “scarring” (“selection”) mechanism works, whereas the estimate $\hat{\delta}_1$ can be negative or statistically insignificant.
The inferences are conducted in the same way as in the specifications for the proportion of male births. We use the CRVE and cluster the standard errors at the 8-area level to address the potential spatial and within-prefecture correlations. The wild cluster bootstrap-t method is employed for the statistical inference. All the regressions are weighted by the average number of live births over the sample period in each prefecture. To relax the common trend assumption, we include a prefecture-specific time trend in some of the specifications.
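The logic of the wild cluster bootstrap-t with few clusters can be sketched in a stripped-down setting. The sketch below assumes a single demeaned regressor without an intercept, imposes the null of a zero slope, and, instead of drawing random replications, enumerates all $2^8 = 256$ Rademacher sign vectors over eight clusters. It omits fixed effects, controls, and weights, so it illustrates the idea rather than the exact procedure used in the paper; all names and data are hypothetical.

```python
from itertools import product


def cluster_t(y, x, cluster):
    """Slope and cluster-robust t-statistic for y = b * x + e (no intercept)."""
    s_xx = sum(v * v for v in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / s_xx
    score = {}  # within-cluster sums of x * residual
    for xi, yi, g in zip(x, y, cluster):
        score[g] = score.get(g, 0.0) + xi * (yi - b * xi)
    se = sum(s * s for s in score.values()) ** 0.5 / s_xx
    return b, b / se


def wild_cluster_bootstrap_p(y, x, cluster):
    """Two-sided p-value for H0: b = 0 using all Rademacher sign vectors."""
    _, t_obs = cluster_t(y, x, cluster)
    groups = sorted(set(cluster))
    hits = total = 0
    for signs in product((-1.0, 1.0), repeat=len(groups)):
        weight = dict(zip(groups, signs))
        # With H0 imposed and no other regressors, the restricted residual
        # is y itself, so each bootstrap outcome is y flipped by cluster.
        y_star = [weight[g] * yi for yi, g in zip(y, cluster)]
        _, t_star = cluster_t(y_star, x, cluster)
        hits += abs(t_star) >= abs(t_obs)
        total += 1
    return hits / total


# Hypothetical data: 8 clusters with 3 observations each and a slope of -0.8.
cluster = [g for g in range(8) for _ in range(3)]
x = [(g - 3.5) + 0.3 * (j - 1) for g in range(8) for j in range(3)]
noise = [0.2 * (-1) ** (g + j) * (1 + 0.1 * j) for g in range(8) for j in range(3)]
y = [-0.8 * xi + ei for xi, ei in zip(x, noise)]

p_value = wild_cluster_bootstrap_p(y, x, cluster)
print(f"bootstrap p-value: {p_value:.4f}")
```

With only eight clusters, the smallest attainable p-value under this enumeration is 2/256, because the all-positive and all-negative sign vectors always reproduce the observed statistic in absolute value.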
4.2 Results
Table 3 presents the results. Panels A–C of this table present the results for the infant mortality rates for all infants, boys, and girls, respectively. Columns (1) and (3) show the results from equations 5 and 6, respectively. Columns (2) and (4) add the prefecture-specific time trend for both equations.
Column (1) of Panel A shows that the estimated effect of fetal influenza exposure on the infant mortality rate is positive and statistically significant. This result is unchanged if we consider the prefecture-specific time trend in the infant mortality rate in column (2). This implies that the “scarring” mechanism might have driven the gender imbalance at birth. Column (3) of Panel A indicates that such an effect was concentrated on exposure during the third trimester, as expected. This result is still unchanged after controlling for the prefecture-specific time trend in column (4).
Panels B and C of Table 3 show similar results for boys and girls. An interesting gender difference can be highlighted: the estimates for girls are greater in magnitude than those for boys. For example, if we compare column (2) of Panel B with that of Panel C, the estimate for girls is approximately 45‰ greater than that for boys. We confirm that this difference in magnitude is statistically significant. [Footnote 16: To test the gender difference, we pooled the infant mortality rates for boys and girls and interacted all the independent variables including fixed effects with the gender dummy. Online Appendix B.2 summarizes these results.]
This gender difference is considered to be consistent with the “scarring” mechanism. As explained, the distribution of the fetal health endowment must shift to the left if the “scarring” mechanism works. Suppose that the distributions of both boys and girls shift to the left by the same degree and that the survival threshold is fixed. Then, the net shift of the distribution depends only on the degree of the selection effect due to the truncation at that threshold. Since the selection effect on the male fetus is always greater than that on the female fetus, as condition 1 suggests, the net leftward shift of the distribution of girls can be greater than that of boys. [Footnote 17: This mechanism can also be explained mathematically. The conditional expectation of the truncated normal distribution can be written as the sum of the untruncated mean and a second term, and this second term expresses the selection effect due to the truncation. The gender difference (boys minus girls) in this term is positive by condition 1. This means that the total shift of the mean caused by the “scarring” mechanism should be greater for girls than for boys, implying that the estimated “scarring” effect of fetal influenza exposure on girls’ infant mortality is greater than that on boys’ infant mortality.]
Dependent variable: Infant mortality rate  
Exposed trimesters  (1)  (2)  (3)  (4) 
Panel A: All infants  
All trimesters  ***  ***  
First trimesters  
Second trimesters  
Third trimesters  **  **  
Panel B: Boys  
All trimesters  **  **  
First trimesters  
Second trimesters  
Third trimesters  **  **  
Panel C: Girls  
All trimesters  ***  ***  
First trimesters  
Second trimesters  
Third trimesters  **  **  
Time trend  No  Yes  No  Yes 
Observations  322  322  322  322 
Number of prefectures  46  46  46  46 
Number of clusters  8  8  8  8 
5 Persistency: Evidence from Population Censuses
5.1 Data and Specification
Thus far, we have found that fetal exposure to pandemic influenza decreased the proportion of male births. In this section, we assess whether the gender imbalance at birth persisted into the teenage years. The Population Censuses conducted in 1925 and 1930 documented the population by age and gender in each prefecture. To investigate the potential lasting effects of fetal influenza exposure on the sex ratio, we digitize the statistics and calculate the proportion of males aged 0–20 years. [Footnote 18: Since the prefecture editions of the Population Census were published for each prefecture, we use 92 issues (46 prefectures × 2 census years) to construct the dataset. For simplicity, we refer to those issues as census1925pp and census1930pp.]
Figure 4 shows the proportion of males in percentage points in each prefecture by age. As shown, the variance in the sex ratio is relatively stable until 12 years old because children graduate from primary school around then (hijikata1994). After graduation, while some children go on to higher schools, a large share of them begin to work, which creates a gender imbalance due to the flow of migrant workers. [Footnote 19: The school enrollment rate for primary school was near 100% and there were no significant differences in the rates across prefectures at that time in Japan. See Schneider:2018cx for finer details about primary school students in prewar Japan.] This kind of internal migration after primary school age makes it difficult to analyze the potential long-run impacts of fetal influenza exposure on the sex ratio among teen workers because we use the prefecture-level aggregate dataset, which does not have useful information on birthplace. Therefore, we focus on the gender imbalance up to 12 years old. We further trim the ages to improve the DID setting. Since children born between 1918 and 1920 were exposed to pandemic influenza, children aged 5–7 in the 1925 Population Census and 10–12 in the 1930 Population Census are defined as the exposed cohorts, as shown in Figures (a) and (b). Considering this, we focus on the proportion of boys aged 5–7 and 10–12 years old in 1925 and 1930. This means that our analytical sample includes children aged 5–7 years and 10–12 years born in 1913–1925. [Footnote 20: Precisely, those children aged 5–7 years (10–12 years) in 1925 were born in 1918–1920 (1913–1916). Those children aged 5–7 years (10–12 years) in 1930 were born in 1923–1925 (1918–1920).]
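The cohort definition above can be sketched in a few lines. This is an illustration only: it assumes that the birth year is approximated by the census year minus the reported age (ignoring the census date within the year), and the function and constant names are hypothetical.

```python
# Birth years treated as exposed to pandemic influenza (1918-1920).
EXPOSED_BIRTH_YEARS = range(1918, 1921)

CENSUS_YEARS = (1925, 1930)
AGES = (5, 6, 7, 10, 11, 12)
N_PREFECTURES = 46


def birth_year(census_year, age):
    """Approximate birth year as census year minus age (an assumption)."""
    return census_year - age


def is_exposed(census_year, age):
    """True when the census-year-by-age cell belongs to an exposed cohort."""
    return birth_year(census_year, age) in EXPOSED_BIRTH_YEARS


cells = [(year, age) for year in CENSUS_YEARS for age in AGES]
exposed_cells = [cell for cell in cells if is_exposed(*cell)]

print(exposed_cells)
print(N_PREFECTURES * len(cells))  # prefecture-year-age cells in the panel
```

Under this approximation, the exposed cells are ages 5–7 in 1925 and ages 10–12 in 1930, and 46 prefectures times 12 census-year-by-age cells give the 552 observations used in the regressions.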
Accordingly, to prepare the weighted influenza death rates in 1913–1925, we additionally digitize the 1912–1914 and 1923–1925 editions of the VSEJ (vsej1912; vsej1913; vsej1914; vsej1922; vsej1923; vsej1924; vsej1925) and the SCDEJ (scdej1912; scdej1913; scdej1914; scdej1923; scdej1924; scdej1925). Panel A of Table 4 lists the summary statistics for the dependent variables used.
Unit  Mean  Std. Dev.  Min  Max  Observations  
Panel A: Dependent variable  
Proportion of males (per 100 people)  Prefecture-year-age  
Panel B: Influenza severity  
Exposed cohort (dummy variable)  Prefecture-birth year  
Weighted influenza death rate (per 10,000 people)  Prefecture-birth year  
Panel C: Control variables  
Rice yield per hectare (hectoliter)  Prefecture-birth year  
Soy yield per hectare (hectoliter)  Prefecture-birth year  
Milk production per capita (liter)  Prefecture-birth year  
Coverage of doctors (per 100 people)  Prefecture-birth year  
Coverage of midwives (per 100 people)  Prefecture-birth year 
We begin our analysis by estimating the cohort effects of fetal exposure to pandemic influenza using the following specification:
(7) 
where the subscripts index the prefecture, the measured year, and the age, so that the measured year minus the age indexes the cohort (birth year). The dependent variable is the proportion of male births, Exposed is an indicator variable for the exposed cohorts (Figure 4), the control vector contains the prefecture-birth-year-level control variables, and the remaining terms are the prefecture-year-specific fixed effect, the age fixed effect, and a random error term. We expect the coefficient on Exposed to be negative and statistically significant, as it captures the cohort effects of fetal influenza exposure on the proportion of male births.
Our main specification is then designed to estimate the marginal effects of fetal influenza exposure on the proportion of male births:
(8) 
where Weighted FLUDR is the weighted influenza death rate defined in equation 4. Similarly, we expect its coefficient to be statistically significantly negative. In this specification, the control vector includes the same control variables used in equation 2: rice yield, soy yield, milk production, coverage of doctors, and coverage of midwives. An important difference is that we use these variables to control for the variations in the birth year (i.e., 1913–1925) rather than the measured year (i.e., 1925 and 1930). Therefore, these variables are used to control for the birth-year heterogeneities in the potential wealth level and socioeconomic conditions that might be correlated with Weighted FLUDR. Panels B and C of Table 4 show the summary statistics for the key and control variables, respectively. In contrast, the instantaneous effects, namely, any unobserved shocks in the prefecture-measured-year cells such as local economic shocks, are captured by the prefecture-year-specific fixed effect. The age fixed effect captures the common trend in the proportion of males over time. Thus, the identification assumption is that after controlling for these observed and unobserved factors, Weighted FLUDR is uncorrelated with the error term. Together with the randomness of the pandemics, our key variable is thus considered to be plausibly exogenous. However, the specifications of both equations 7 and 8 assume a common trend in the proportion of males across prefectures. To relax this assumption, we therefore allow the trend of the dependent variable to vary across prefectures using a prefecture-specific trend in some of the specifications.
To address the potential spatial and prefecture-specific within correlations, we use the CRVE and cluster the standard errors at the 8-area level. Since our data are a three-dimensional (i.e., prefecture-measured-year-age) panel, this clustering can mitigate the potential correlations across cohorts. To overcome the issue of the small number of clusters, we use the wild cluster bootstrap-t method for the statistical inference. All the regressions are weighted by the average number of children in each prefecture-year cell.
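To make the inference step concrete, the following is a simplified sketch of a wild cluster bootstrap-t test with the null hypothesis imposed. It is not the paper's estimation code: it assumes a single regressor with an intercept, no regression weights, no fixed effects, Rademacher weights, and synthetic data; all function and variable names are hypothetical. With only eight clusters, all 2^8 = 256 sign patterns can be enumerated instead of sampled.

```python
from itertools import product


def slope_and_t(x, y, cluster):
    """OLS slope of y on x (with intercept) and its cluster-robust t statistic."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    xd = [xi - x_bar for xi in x]                 # demeaned regressor
    sxx = sum(v * v for v in xd)
    b = sum(v * yi for v, yi in zip(xd, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    scores = {}                                   # cluster sums of xd * resid
    for v, e, g in zip(xd, resid, cluster):
        scores[g] = scores.get(g, 0.0) + v * e
    var_b = sum(s * s for s in scores.values()) / sxx ** 2
    return b, b / var_b ** 0.5


def wild_cluster_bootstrap_p(x, y, cluster):
    """Two-sided bootstrap p-value for H0: slope = 0."""
    _, t_obs = slope_and_t(x, y, cluster)
    y_bar = sum(y) / len(y)
    u = [yi - y_bar for yi in y]                  # residuals with H0 imposed
    groups = sorted(set(cluster))
    hits = total = 0
    for signs in product((-1.0, 1.0), repeat=len(groups)):
        w = dict(zip(groups, signs))              # one sign per cluster
        y_star = [y_bar + w[g] * ui for g, ui in zip(cluster, u)]
        _, t_star = slope_and_t(x, y_star, cluster)
        hits += abs(t_star) >= abs(t_obs) - 1e-9
        total += 1
    return t_obs, hits / total


# Synthetic data: 8 clusters with 4 observations each (illustrative only).
cluster = [g for g in range(8) for _ in range(4)]
x = [g + 0.5 * j for g in range(8) for j in range(4)]
noise = [0.3 * ((7 * g + 3 * j) % 5 - 2) for g in range(8) for j in range(4)]
y = [1.0 + 0.5 * xi + ei for xi, ei in zip(x, noise)]
print(wild_cluster_bootstrap_p(x, y, cluster))
```

The key design choice is that the sign flips are applied to whole clusters rather than to individual observations, so the within-cluster dependence of the residuals is preserved in every bootstrap sample.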
5.2 Results
Dependent variable: Proportion of males (%)  
(1)  (2)  (3)  (4)  
Exposed cohort  **  **  
Weighted FLUDR  **  **  
Prefecture-year-specific fixed effect  Yes  Yes  Yes  Yes 
Age fixed effect  Yes  Yes  Yes  Yes 
Heterogeneous trend across prefectures  No  Yes  No  Yes 
Observations  552  552  552  552 
Number of prefectures  46  46  46  46 
Number of clusters  8  8  8  8 
Measured years  1925 & 1930  1925 & 1930  1925 & 1930  1925 & 1930 
Ages  5–7 & 10–12  5–7 & 10–12  5–7 & 10–12  5–7 & 10–12 
Table 5 presents the results. Columns (1) and (3) present the estimates from equations 7 and 8, respectively, whereas columns (2) and (4) show the estimates from the specification including the prefecture-specific trend. The estimate in column (1) shows that the exposed cohort, on average, has a 0.1% lower proportion of males than the surrounding cohorts. This magnitude becomes 0.126% if we relax the common trend assumption across prefectures in column (2), accounting for approximately 26% of the standard deviation of the dependent variable (Panel A of Table 4). Column (3) indicates that the estimated coefficient on the weighted influenza death rate is negative and statistically significant. This result remains unchanged if we include the prefecture-specific trend in column (4). The estimate in column (4) shows that a one standard deviation ( in Panel B of Table 4) increase in the rate decreases the proportion of males in those children by approximately % (). Further, the proportion of males decreases by approximately 0.18% (%) in the case of exposure to the maximum influenza death rate. This accounts for roughly 40% of the standard deviation of the dependent variable and thus is nonnegligible in terms of its magnitude. The foregoing results suggest that fetal exposure to pandemic influenza has lasting effects on the sex ratio after birth, at least for 5–12-year-olds.
6 Conclusion
This study uses the pandemic influenza in prewar Japan as a natural experiment to investigate mortality selection in utero. We find that fetal influenza exposure during the first trimester of the pregnancy period had negative impacts on the proportion of males at birth. Analyses using the infant mortality rate as a proxy for the health status of infants provide evidence that the reduction in male births was associated with the “scarring” mechanism rather than the “selection” mechanism. We also find that the proportion of males in the affected birth cohorts was statistically significantly lower than that in the surrounding cohorts.
As discussed in the Introduction, potential barriers for studies in this strand of the literature include difficulties compiling a set of birth records in developing countries and measurement errors in the observed birth records such as age heaping. Given this issue, industrializing Japan is an ideal study setting because the Registration Act was enacted in the early stage of its industrialization. Therefore, Japan has comprehensive birth registration records from the beginning of the 20th century. This advantage enables us to investigate in detail the potential impacts of the examined pandemics on the sex ratio. However, using aggregate prefecture-level data makes it difficult to precisely identify the actual assignments of the exposure at the individual level. Although we use an appropriate set of weights for the analyses, the estimated effects found in this study are therefore considered to be lower bounds rather than the true unobserved treatment effects.
Nevertheless, this study contributes to our understanding of potential mortality selection in utero due to the pandemics in early 20th century Japan. It also offers suggestive evidence that the gender imbalance caused by the pandemics persisted into childhood. Analyzing the potential long-term effects of fetal exposure to the pandemics on the gender imbalance in adulthood may be a future research avenue.
References
Appendix A Data Appendix
Figure B.1 shows an example image of the VSEJ that we mainly use to construct the dataset. The SCDEJ takes a similar style to the VSEJ (not reported).
Appendix B Empirical Analysis Appendix
B.1 Testing Stationarity
Table C.1 presents the results of the panel unit root tests for the proportion of male births used in the main empirical analyses, confirming the stationarity of our panel dataset on the secondary sex ratio. We also run regressions for the proportion of male births using the dynamic panel data models in Online Appendix B.3.
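For readers unfamiliar with Fisher-type panel unit root tests, the sketch below shows how panel-specific p-values can be combined into two of the statistics commonly associated with this approach, the inverse chi-squared statistic and the inverse normal statistic. It is an illustration under stated assumptions, not the code behind Table C.1: the per-panel ADF p-values are taken as given, they must lie strictly between 0 and 1, and the function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def inverse_chi_squared(p_values):
    """Return (P, p-value) with P = -2 * sum(log p_i).

    Under the null, P follows a chi-squared distribution with 2N degrees
    of freedom. For even degrees of freedom the upper-tail probability
    has the closed form exp(-x/2) * sum_{k<N} (x/2)^k / k!.
    """
    n = len(p_values)
    stat = -2.0 * sum(log(p) for p in p_values)
    half = stat / 2.0
    term = tail = 1.0                     # k = 0 term of the series
    for k in range(1, n):
        term *= half / k
        tail += term
    return stat, exp(-half) * tail


def inverse_normal(p_values):
    """Return (Z, p-value) with Z = sum(Phi^{-1}(p_i)) / sqrt(N).

    Under the null, Z is standard normal; large negative values reject.
    """
    std = NormalDist()
    z = sum(std.inv_cdf(p) for p in p_values) / sqrt(len(p_values))
    return z, std.cdf(z)


# Hypothetical per-panel p-values (not results from this paper).
example = [0.04, 0.20, 0.01, 0.33]
print(inverse_chi_squared(example))
print(inverse_normal(example))
```

With 46 prefectures, the inverse chi-squared statistic in this sketch would be compared with a chi-squared distribution with 92 degrees of freedom.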
January 1916–December 1922  January 1918–December 1920  
Test statistics  (1)  (2)  (3)  (4) 
statistic value  0.0000  0.0000  0.0000  0.0000 
statistic value  0.0000  0.0000  0.0000  0.0000 
statistic value  0.0000  0.0000  0.0000  0.0000 
statistic value  0.0000  0.0000  0.0000  0.0000 
Number of prefectures  46  46  46  46 
Number of periods  84  84  36  36 
Number of lagged differences  1  3  1  3 

Notes: The results of the Fisher-type panel unit root tests based on augmented Dickey–Fuller (ADF) tests are reported in this table. The null hypothesis is that all the panels contain unit roots, whereas the alternative hypothesis is that at least one panel is stationary. In all the specifications, the process under the null hypothesis is assumed to be a random walk with drift. The demeaned data are used to address the effect of cross-sectional dependence. Although the number of lagged differences in the ADF regression equation reported is set as either one or three, the results are not affected by the number of lagged differences. See Choi:2001wl for the details of the tests.
B.2 Testing the Gender Difference
In this subsection, we investigate the gender difference in the estimates reported in Table 3. To test the difference, we pool the infant mortality rates for boys and girls and then interact all the independent variables including the fixed effects with the gender dummy. Table C.2 presents the results. As shown, the number of observations is now 644 (322 × 2). Column (1) indicates that the estimated effect of fetal influenza exposure on the infant mortality rate of boys is approximately 45.8‰ lower than that of girls. This result is largely unchanged if we include the prefecture-specific time trend in column (2). If we disaggregate the exposure variable in columns (3) and (4), however, no such gender difference is observed in a statistical sense.
Dependent variable: Infant mortality rate  
Exposed trimesters  (1)  (2)  (3)  (4) 
All trimesters  ***  ***  
All trimesters × Boys  **  **  
First trimester  
First trimester × Boys  
Second trimester  
Second trimester × Boys  
Third trimester  **  **  
Third trimester × Boys  
Control variables  Yes  Yes  Yes  Yes 
Control variables × Boys  Yes  Yes  Yes  Yes 
Fixed effects  Yes  Yes  Yes  Yes 
Fixed effects × Boys  Yes  Yes  Yes  Yes 
Time trend  No  Yes  No  Yes 
Time trend × Boys  No  Yes  No  Yes 
Observations  644  644  644  644 
Number of prefectures  46  46  46  46 
Number of clusters  8  8  8  8 
B.3 Alternative Specifications: Dynamic Panel Data Analysis and Placebo Test
Dynamic panel  Placebo test  
Exposed trimesters  (1)  (2)  (3)  (4) 
All trimesters  
First trimester  ***  **  **  
Second trimester  
Third trimester  
Lagged dependent variable  
Zero trimester (placebo)  
Fourth trimester (placebo)  
Time trend  No  No  No  Yes 
Period  1916–22  1916–22  1916–22  1916–22 
Observations  3818  3818  3864  3864 
Number of prefectures  46  46  46  46 
Number of clusters  8  8  8  8 
To assess the robustness of our main results, we run regressions using alternative specifications. First, throughout Section 3, we assumed that the regression specifications are static in nature. Since the secondary sex ratio is a highly biological measure, this assumption is likely to hold. However, we herein further consider dynamic panel data models to test the validity of this assumption. [Footnote 21: The within estimator in a dynamic panel data model becomes consistent as the number of time periods rises. The number of sampled time periods in columns (1) and (2) is 84 months in total (7 years × 12 months) and thus should be sufficiently large given that the estimated coefficients on the lagged dependent variables are close to zero. See baltagi2013 and hsiao2014 for theoretical discussions about dynamic panel data models.] Column (1) of Table C.3 presents the results of a regression including the lagged dependent variable in our baseline specification in equation 2, whereas column (2) shows the result from a regression including the lagged dependent variable in our flexible specification in equation 3. In both columns, the estimated coefficients on the lagged dependent variables are close to zero and statistically insignificant. This result supports the validity of our main specifications using static panel data models.
Second, we include two placebo variables to test whether the significant effect of the first trimester is valid in terms of the regression specifications. The first is a variable named Zero trimester, representing the average influenza death rates between 10 and 12 months before birth. Since a fetus does not exist in utero before conception, exposure to pandemic influenza during this preconception period should have no significant impacts on the proportion of males at birth. The second, constructed in a similar way, is a variable named Fourth trimester, representing the average influenza death rates between one and three months after birth. The estimate of this variable should also be statistically insignificant because fetuses could not be infected during this postbirth period. Column (3) of Table C.3 indicates that the estimated coefficients on these placebo variables are statistically insignificant. This result is unchanged if we include the prefecture-specific time trend in column (4).
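The two placebo windows can be computed from a monthly series as in the sketch below. This is an illustration only: it assumes a list of monthly influenza death rates for one prefecture and an integer index for the birth month, and the function names and toy series are hypothetical.

```python
def window_mean(series, birth_index, first_offset, last_offset):
    """Average of series over months birth_index+first_offset..+last_offset."""
    start = birth_index + first_offset
    stop = birth_index + last_offset + 1
    if start < 0 or stop > len(series):
        raise IndexError("window falls outside the available months")
    window = series[start:stop]
    return sum(window) / len(window)


def placebo_exposures(series, birth_index):
    """Placebo exposure measures for a cohort born in month birth_index."""
    return {
        # 10 to 12 months before birth, i.e. before conception
        "zero_trimester": window_mean(series, birth_index, -12, -10),
        # 1 to 3 months after birth
        "fourth_trimester": window_mean(series, birth_index, 1, 3),
    }


# Toy monthly series (illustrative values, not data from this paper).
monthly_rates = [float(m) for m in range(24)]
print(placebo_exposures(monthly_rates, birth_index=12))
```

Because both windows lie outside the gestation period, a specification that is picking up genuine fetal exposure should yield coefficients on these variables that are indistinguishable from zero.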
B.4 Alternative Specifications: Additional Weather Shock Variables
As discussed in the Introduction, weather shocks can also affect birth outcomes. Moreover, temperature might correlate with the risk of infectious diseases (Ogasawara2019clio; Ogasawara:2018cu). In this subsection, we thus check the robustness of our main results to including measures of heat and cold waves during the sample period.
First, we download temperature data from the official database of the Japanese Meteorological Agency (JMA): http://www.data.jma.go.jp/gmd/risk/obsdl/index.php (accessed on March 30, 2018). In this database, the JMA reports information on the number of days with a temperature above or below a certain threshold to record heat and cold waves. We compile monthly meteorological data between 1915 and 1922 for a maximum of three weather stations in each prefecture. Following the official definitions provided by the JMA, we then define a heat wave as the annual average number of days on which the maximum temperature exceeds 30°C, whereas a cold wave is the annual average number of days on which the minimum temperature is below 0°C.
Second, we calculate the inverse-distance-weighted average of all the valid measurements from these stations in the spirit of Deschenes:2009jz. Each prefecture’s centroid is set as the city office because a large part of the population lives in the principal city in each prefecture. The weighted average of the weather shock variable for a given prefecture in a given month is given as follows:
(9) 
where the two terms denote the weather shock variable measured at each station and the geospatial distance from the centroid to that station, respectively. [Footnote 22: Data on latitude and longitude are taken from the database of the Geospatial Information Authority of Japan: http://www.gsi.go.jp/KOKUJYOHO/kenchokan.html, accessed on August 20, 2017.]
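The weighting step can be sketched as follows. This is a minimal illustration, assuming weights proportional to an inverse power of the distance; the exponent (`power`, defaulting to 1), the function name, and the example values are assumptions rather than details taken from this paper, and missing station readings are represented by `None`.

```python
def idw_average(values, distances, power=1.0):
    """Inverse-distance-weighted mean of station readings.

    values:    station measurements; None marks an invalid reading
    distances: distance from the prefecture centroid to each station
    power:     exponent applied to the distance (assumed, not from the paper)
    """
    pairs = [(v, d) for v, d in zip(values, distances) if v is not None]
    if not pairs:
        raise ValueError("no valid station measurements")
    for v, d in pairs:
        if d == 0:                 # station located at the centroid itself
            return float(v)
    weights = [d ** -power for _, d in pairs]
    total = sum(w * v for w, (v, _) in zip(weights, pairs))
    return total / sum(weights)


# Hypothetical example: days above the threshold at up to three stations.
print(idw_average([10.0, None, 20.0], [1.0, 2.0, 3.0]))
```

Nearer stations receive larger weights, and a prefecture with a single valid station simply inherits that station's reading.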
Third, we calculate the average number of days of heat and cold waves for all trimesters, as for the average influenza death rates in Section 3. These variables are finally added into our baseline and flexible specifications in equations 2 and 3, respectively.
Table C.4 presents the results. Column (1) indicates that the estimated coefficients on these weather shock variables are statistically insignificant. This result is robust to including the prefecture-specific time trend in column (2). The estimated coefficients on the influenza exposure variables are close to the estimates in columns (3) and (4) of Table 2. This result supports the evidence that our key influenza exposure variables are less likely to be correlated with weather shocks during the pandemics.
Weather shocks  
Exposed trimesters  (1)  (2) 
Flu pandemics during first trimester  ***  *** 
Flu pandemics during second trimester  
Flu pandemics during third trimester  
Heat waves during first trimester  
Heat waves during second trimester  
Heat waves during third trimester  
Cold waves during first trimester  
Cold waves during second trimester  
Cold waves during third trimester  
Time trend  No  Yes 
Period  1916–22  1916–22 
Observations  3864  3864 
Number of prefectures  46  46 
Number of clusters  8  8 