Association between first language and SARS-CoV-2 infection rates, hospitalization, intensive care admissions and death in Finland: a population-based observational cohort study

Objectives Motivated by reports of increased risk of coronavirus disease 2019 (COVID-19) in ethnic minorities of high-income countries, we explored whether patients with a foreign first language are at an increased risk of COVID-19 infections, more serious presentations, or worse outcomes. Methods In a retrospective observational population-based quality registry study covering a population of 1.7 million, we studied the incidence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), admissions to specialist healthcare and the intensive care unit (ICU), and all-cause case fatality in different language groups between 27th February and 3rd August 2020 in Southern Finland. A first language other than Finnish, Swedish or Sámi served as a surrogate marker for a foreign ethnic background. Results In total, 124 240 individuals were tested, and among the 118 300 (95%) whose first language could be determined, 4005 (3.4%) were COVID-19-positive, 623 (0.5%) were admitted to specialized hospitals, and 147 (0.1%) were admitted to the ICU; 254 (0.2%) died. Those with a foreign first language had lower testing rates (348, 95%CI 340–355 versus 758, 95%CI 753–762 per 10 000, p < 0.0001), higher incidence (36, 95%CI 33–38 versus 22, 95%CI 21–23 per 10 000, p < 0.0001), and higher positivity rates (103, 95%CI 96–109 versus 29, 95%CI 28–30 per 1000, p < 0.0001). There was no significant difference in ICU admissions, disease severity at ICU admission, or ICU outcomes. Case fatality by 90 days was 7.7% in domestic cases and 1.2% in those with a foreign first language, explained by demographics (age- and sex-adjusted HR 0.49, 95%CI 0.21–1.15). Conclusions The population with a foreign first language was at an increased risk for testing positive for SARS-CoV-2, but when hospitalized they had outcomes similar to those in the native, domestic language population. 
This suggests that special attention should be paid to the prevention and control of infectious diseases among language minorities.


Introduction
Migrants and ethnic minority populations seem to experience a disproportionate burden of the coronavirus disease 2019 (COVID-19) pandemic [1]. Higher incidences have been reported in various ethnic minorities in several countries, but results regarding outcomes have been conflicting [2–8]. Differences in health status, access to health care, housing conditions, family size, professional exposure, use of public transportation, the possibility of teleworking, and economic status have all been identified as factors contributing to increased exposure [1]. Ethnicity can be defined using various indicators such as the country of birth, nationality, migrant status, race, and self-definition. As an advantage of small language groups, in which the language is spoken only in highly restricted areas, an individual's first language can be used as a surrogate marker of ethnicity and used in primary screening to exclude native speakers as non-immigrants. Obviously, this approach does not exclude second-generation immigrants.
Healthcare interventions can address ethnicity-related differences in COVID-19 morbidity and mortality through a better understanding of their causes and consequences. One of the obvious challenges, language, influences all aspects of COVID-19 infection control and treatment, including implementation of public education, counselling and guidance, testing, treatment and contact tracing. At the same time, language problems, when acknowledged, may be overcome with relatively simple measures within the reach of healthcare providers.
We wanted to study whether the first language of a person is associated with the rates of COVID-19 testing, test positivity, hospitalization, ICU care and case fatality. We consider the first language of the patient to be an important aspect in the treatment cascade of COVID-19 patients, as it determines the success of the communication between the patient and the healthcare provider.

Setting
Finland is a Nordic welfare country with a public universal healthcare system for its residents. Inpatient treatment of COVID-19 patients and intensive care are exclusively provided within the public healthcare system. Testing and treatment of COVID-19 is free of charge for all patients, including migrants and tourists. Undocumented migrants, representing 0.1–0.2% of the population, are entitled to urgent healthcare and in some municipalities, including Helsinki, to all necessary care. Payment is never required prior to receiving health care. The first language might not have been registered for undocumented migrants and persons whose stay in the country is only temporary; thus these groups are probably underrepresented in this study. All residents can be identified with a personal identification code, which enabled us to link healthcare data in this study.
Finland, with a population of 5 525 292 at the end of 2019 [9], has two official national languages, Finnish and Swedish, and a third minority domestic language (Sámi). In the Uusimaa province in Southern Finland, encompassing the capital Helsinki with a population of 1 689 725 [9], Finnish was the first language of 78.2%, Swedish of 7.7% and Sámi of 137 persons (0.0%); foreign languages represented 14.0% of the population (n = 236 959).
Testing for COVID-19 was available free of charge during the study period. However, during the timeframe included in the present study, access to testing was dependent on medical assessment and limited mainly to symptomatic persons. Screening or self-referral in walk-in or mobile testing units was not yet available.
Indications for testing reflected changes in testing capacity and national policies.

Study population
This retrospective observational population-based quality registry study included all individuals in the capital province of Finland (Uusimaa) tested for COVID-19 between 27th February and 3rd August 2020 by the Helsinki University Hospital laboratory services. All patients who were diagnosed during this period and admitted to specialist healthcare in any of the 22 hospitals in the capital province were included in the analyses of in-hospital treatment. Previously institutionalized and/or dependent patients were mainly treated in primary-care institutions and hospitals and were not included in the analyses of specialized in-hospital care.

Retrieval of data
We collected laboratory and clinical data from the electronic patient records of the Helsinki University Hospital district into a COVID-19 quality registry. Data documentation included sex, age, language, duration of hospital stay, date of death, basic laboratory test results and information from the ICU quality registry. Data on patient and ICU admission characteristics were retrieved from the Finnish Intensive Care Consortium (FICC) database and from the HUCH electronic patient data management systems (Miranda, PICIS, WebLab, Apotti). Based on these data, the Charlson comorbidity index (CCI) and sequential organ failure assessment scores (SOFA) during the first 24 h of ICU treatment, which were sporadically missing from the FICC database, were calculated. If the patient was transferred to another ICU, the total length of the ICU stay was included as one admission. If the patient was readmitted to intensive care during the same hospital admission, only the first admission was included. Thus, a single patient was included only once. All-cause case-fatality data were automatically linked from the National Causes of Death Registry into the hospital's electronic patient records, allowing comprehensive post-hospital stay follow-up of possible fatal cases among all patients.

Statistical analyses
The p-values for categorical variables were calculated with two-sided Fisher's exact test and for age with Wilcoxon rank-sum (Mann–Whitney) test. The ICU data were first analysed with univariable models, and variables with p < 0.2 were included in the multivariable model, where a stepwise backward logistic regression model was used. For testing rates, positivity rates and incidence rates, 95% Poisson confidence intervals were calculated using the Stata 16.1 program. Mixed-effects (or fixed-effects) logistic regression was used to calculate the odds ratios for positivity, hospitalization and ICU admission. Ethnicity was taken into account either as a fixed effect or as a random effect, depending on the context. Survival during the first 90 days after the positive test date was analysed by the Kaplan–Meier estimator or the Cox proportional hazards model. In the Cox model, the potential clustering by different ethnicity groups was taken into account using robust standard error estimates. Age-adjusted risk of death differences were analysed using the Cox model or the Kaplan–Meier method with a stratified log-rank test. Tests were two-tailed. Analyses were carried out with IBM SPSS Statistics version 27 for Mac.
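For readers who wish to reproduce rate estimates of this kind, the following is a minimal Python sketch of a 95% confidence interval for a rate per 10 000 persons. It uses a normal approximation to the Poisson count, which is an assumption made here for brevity; the intervals in this study were computed in Stata and may have used an exact method. The function name `poisson_rate_ci` and the event count of 853 are hypothetical: the count is chosen only so that the rate is close to 36 per 10 000 in a population of 236 959, and it is not a figure taken from the registry.

```python
import math


def poisson_rate_ci(events, population, per=10_000, z=1.96):
    """Return (rate, lower, upper) per `per` persons.

    Uses the normal approximation to a Poisson count:
    events +/- z * sqrt(events), then rescaled to a rate.
    """
    if events < 0 or population <= 0:
        raise ValueError("events must be >= 0 and population must be > 0")
    scale = per / population
    half_width = z * math.sqrt(events)
    rate = events * scale
    lower = max(0.0, (events - half_width) * scale)  # a rate cannot be negative
    upper = (events + half_width) * scale
    return rate, lower, upper


# Hypothetical event count, for illustration only (not registry data).
rate, lower, upper = poisson_rate_ci(events=853, population=236_959)
print(f"{rate:.1f} (95%CI {lower:.1f}-{upper:.1f}) per 10 000")
```

For small counts the normal approximation becomes unreliable, and an exact Poisson interval would be the more appropriate choice.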

Institutional review and patient consent
The quality registry was institutionally approved without requirement for patient consent (approvals HUS/1049/2020/x4 and HUS/157/2020/x94), allowing all consecutive patients to be included. Due to the retrospective registry study design, no permission from the Ethical Committee of Helsinki University Hospital was required.

Characteristics of the cohorts
Altogether, 124 240 persons were tested for SARS-CoV-2 during this study, i.e. the first wave of the pandemic. The first language was registered for 118 300 patients (95.2%) (Fig. 1). The distribution of sex, age and 90-day case fatality in the four cohorts is shown in Table 1. Overall, 7.0% of tested patients, 21.7% of positive patients, 21.7% of patients admitted to specialist hospitals, and 25.9% of patients admitted to the ICU had a foreign (other than Finnish, Swedish or Sámi) first language (Fig. 1).
Similar differences were seen in all age groups, suggesting that they are not explained by the younger age profile of the migrant population compared to the native population (Table 2). However, the findings were not uniform in all language groups. When adjusted to population size, the highest positivity rates were among native speakers of Somali and Albanian. Compared to domestic languages, higher proportions of positives among those tested were found in native speakers of Russian, Estonian, Arabic, Somali, English, Kurdish, Albanian and Turkish.

Odds ratios for COVID-19 and hospitalization
During the first wave, testing was focused on symptomatic patients and those needing in-hospital care, which was reflected in the increased odds ratio for testing positive for males and those older than 75 years (Table 3). Persons with a foreign first language were more likely to test positive, with the highest odds ratios when the first language was Somali, Albanian or Turkish. The odds ratio for hospitalization and need for intensive care were analysed in people under the age of 75 years, as older patients were often treated in primary care and were not included in our data. The odds ratio for hospitalization in specialist health care was higher for those with a foreign first language compared to the native population, but this is explained mainly by lower testing rates of persons with a foreign first language (Table 3). When the foreign language group was divided into specific languages, a distinct heterogeneity between languages was seen.

ICU outcomes
Among the 147 patients admitted to ICUs, 38 (26%) had a foreign first language. They were younger and had a lower Charlson comorbidity index than those with a domestic first language (Table 4). The unadjusted SOFA scores at 24 h from admission, the use of invasive mechanical ventilation, length of stay in the ICU, and case fatality did not differ between patients with foreign and domestic first languages (Table 4a). Based on the univariable comparison of d-90 survivors and d-90 non-survivors in the ICU population, age (p < 0.001), sex (p = 0.007), CCI (p < 0.001), and SOFA score from the first 24 hours in the ICU (p = 0.03) were included in the multivariable analysis, together with the language group (Table 4b). In the multivariable analysis, only CCI was significantly and independently associated with d-90 mortality (OR 1.697, 95%CI 1.30–2.21, p < 0.001).

Case fatality
Two hundred and fifty-four deaths within 90 days from the first positive test were recorded; in 244 cases the person's first language was a domestic one (7.7% overall case-fatality rate) and in ten (1.2%) it was a foreign one (Table 1). Of the deceased, 207 were older than 75 years, and only one of them had a first language that was foreign. The overall unadjusted 90-day case fatality of patients admitted to specialist healthcare hospitals was 78/486 (16.0%) among speakers of Finnish, Swedish or Sámi and 8/137 (5.8%) among speakers of a foreign language, with the latter group being more than 10 years younger (Table 1). The age- and sex-adjusted hazard ratio for 90-day case fatality was 0.49 (95%CI 0.21–1.15) in patients with foreign first languages (Cox regression, adjusted for clusters with a robust standard error estimator). Kaplan–Meier failure estimates showed similar case fatality in those with domestic and foreign first languages in individuals younger than 65 years of age (Fig. 2). In those older than 65 years of age, the case fatality was lower among individuals with a foreign first language, but the 95% confidence intervals were overlapping.
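The unadjusted comparison of case fatality among hospitalized patients can be illustrated with a short Python sketch. It recomputes the two proportions from the counts given above (78/486 and 8/137) and applies a two-sided Fisher's exact test, the test named in the statistical methods for categorical variables. The function `fisher_exact_two_sided` is a hypothetical helper written for this illustration, not the authors' code, and the resulting p-value is not a result reported in this study. Because the comparison is unadjusted, it does not account for the age difference between the groups.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p)


# Counts from the text: deaths / admitted patients in each language group.
deaths_domestic, n_domestic = 78, 486
deaths_foreign, n_foreign = 8, 137

cfr_domestic = deaths_domestic / n_domestic
cfr_foreign = deaths_foreign / n_foreign
p_value = fisher_exact_two_sided(
    deaths_domestic, n_domestic - deaths_domestic,
    deaths_foreign, n_foreign - deaths_foreign,
)
print(f"domestic {cfr_domestic:.1%}, foreign {cfr_foreign:.1%}, p = {p_value:.4f}")
```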

Discussion
Individuals with a foreign first language generally had a higher incidence, a higher proportion of positive test results, and a lower rate of testing than the domestic language population. The risk for hospitalization was increased in those with a foreign first language, which can be explained by a lower testing rate which resulted in only the more severe cases being detected. The odds ratio for intensive care admission, disease severity at ICU admission, and ICU outcomes did not differ between language groups, suggesting that there were no differences in disease presentation, and no imbalance in admission criteria, choice of treatment modality, or outcomes. After adjusting for age and sex, there was no significant difference in case fatality between patients with foreign and domestic first languages.
Previous studies [6–8] on the COVID-19 case-fatality risk for ethnic minorities have been conflicting. Access to high-quality health care might partially explain why case-fatality risk is increased in some settings but not in others. Our results are in line with previous studies reporting an increased infection rate for COVID-19 in different ethnic minorities in different healthcare systems and varying epidemic circumstances [1]. It is noteworthy that our results come from a setting where access to COVID-19 testing and high-level public health care at low or no cost were technically equal in the population. Yet, testing was likely affected by practical problems such as poor domestic language skills or by lower test-seeking behaviour, or healthcare workers might have been more prone to direct persons speaking domestic languages to testing. Foreign language speakers were overrepresented in a previous study on prehospital COVID-19 patients in Helsinki, which may indicate delayed seeking of health care [10]. An increased attack rate in ethnic minorities during epidemics is not limited to the most recent pandemic of SARS-CoV-2. Very similar findings can be traced back to the H1N1 pandemic in 1918 [11], and to the previous H1N1v pandemic in 2009 [12,13]. Quite analogously to our results, these studies also suggest that the problem does not lie in a genetic or biomedical susceptibility for a more severe illness but in poorer access to health care and increased risk of contracting the virus due to diverse socioeconomic factors. Previous research may also provide encouraging tools for overcoming these differences, as they suggest that ethnic minorities may be more inclined to be vaccinated if offered the vaccine [14].
A major strength of our study was that the register data were of high quality since they were population-based [15]. In Finland, the self-reported first language of residents is registered in the Population Information System and thereby offers a useful indicator of ethnicity. However, the registered first language says more about the ancestry of the individual than their actual language skills, since many second-generation migrants may, often in addition to their own first language, speak fluent Finnish despite being registered as a speaker of another first language. Assuming that migrant background influences the ability to use and access healthcare, including second-generation migrants is not expected to strengthen these findings, but rather to dilute this effect. Thus, the fact that we found differences, even when second-generation migrants were included, makes it even more likely that a true effect exists. Information about the first language was available for 95% of the tested individuals. Based on the names of the individuals with missing language data, we assume that most of them had a migrant background or were temporary visitors or tourists in Finland, which was why their first language was not registered. One limitation of the study was the lack of information about co-morbidities and the socioeconomic status of the outpatients, which could not be included in the multivariable analyses.
The study shows that language minorities have an increased attack rate for COVID-19, but after access to quality care the outcomes do not differ from those of the general population. Potential language barriers concerning situation awareness, access to testing and more efficient contact tracing should be addressed. This study did not provide data about the mechanisms underlying the increased susceptibility to COVID-19 in language minorities. Possible risk factors may include more crowded housing, larger family sizes, professional exposure, use of public transport, and lack of possibilities for teleworking [1]. The motivation for testing and adherence to isolation and quarantine recommendations might also be affected by access to social benefits, employment status and ability to understand instructions given in a foreign language, and even attitudes within the given group. Qualitative interviews with persons from language minorities could provide better understanding of the reasons for the increased incidences.
Studying ethnicity as a risk factor for infectious diseases includes ethical questions which need to be addressed [1]. The results must be interpreted and communicated with caution to avoid stigmatization of vulnerable groups. Still, experiences from HIV and tuberculosis research have shown that acknowledging risks specific to certain vulnerable groups can help in tailoring public health policies and programmes to reduce the disease burden in these populations [16,17]. In addition, using a person's first language as a risk factor may be less stigmatizing than relying on other markers of ethnicity, as language involves a practical aspect which can be addressed within the healthcare system. Differences in susceptibility to COVID-19 based on language group should encourage the implementation of concrete measures targeted at populations at risk, such as providing public guidance about infection control measures in relevant languages and using interpreters for contact tracing and patient information.

Author contributions
VH, HS, JH and AK conceptualized the study and its methodology. JO performed the statistical analyses. VH, HS and JH were responsible for writing the original draft. All authors reviewed and edited the manuscript and approved its final version.

Transparency declaration
VH has received a grant from Finska Läkaresällskapet. AJ has received a grant from Wilhelm och Else Stockmanns stiftelse and speaker honoraria from Astellas, GlaxoSmithKline, Sanofi, Thermo Fisher, MSD, OrionPharma and UnimedicPharma. JH has received grants from NordForsk, Government research funding and Kirsti och Tor Johansson's hjärt och cancerstiftelse. The authors declare that they have no other conflicts of interest in relation to this work. This work was supported by a research grant from Finska Läkaresällskapet.