codeburst

Bursts of code to power through your day. Web Development articles, tutorials, and news.

Maternal Mortality Rate, United States

Using Python and Big Data Visualization Tools for Maternal Deaths Analysis

Presenting pregnancy-related mortality findings with Python visualization tools, for convenient study, a new view and clarification.

Elena Kirzhner
Published in codeburst
17 min read · May 18, 2018

Introduction

Pregnancy-related deaths are a comparatively rare occurrence. Worldwide, the maternal mortality rate fell by almost half from 1990 to 2015. Yet the rate in the United States increased by more than 25 percent.

Maternal deaths are rising in the United States for unclear reasons. The USA is the only developed nation where the rate is increasing, and it is only getting worse. American women are more likely to die from childbirth than women in any other highly developed country. Meanwhile, the fertility rate is falling: the number of women giving birth has been declining for years and recently hit a historic low [1].

It is hard to understand the relationship between declining birth rates and rising maternal deaths; at first glance it does not make sense.

Based on research and analysis by the Centers for Disease Control and Prevention [2], maternal deaths have greatly increased, and more than half of such incidents could have been prevented with current medical technology. Most of the cases were the result of medical errors, negligence and unprepared hospitals. Doctors’ ability to protect the health of mothers in childbirth is a basic measure of a society’s development. Yet every year in the United States 700 to 900 women die from pregnancy or childbirth-related causes, and some 65,000 nearly die. It is, by many measures, the worst record in the developed world [3]. We have the ability to prevent this by analyzing each cause, using monitoring to predict cases, and applying Big Data and analytics. Statistical research for 2011 put America in 50th place, the lowest of all developed nations for maternal death during childbirth [4].

Background And Related Work

The World Health Organization, UNdata and America’s Health Rankings have made statistical data available to the public to support review and analysis.

More than 200 healthcare applications and tools have been developed since 2010. A number of healthcare providers have already benefited from big data by concentrating on the fundamental structure of big data and visualization tools. For example, Kaiser Permanente adopted a new system called HealthConnect, which exchanges data between collected patient information and treatments. The system has helped save more than one billion dollars by reducing patient visits to doctors’ offices. Another example is Blue Shield of California, which adopted NantHealth and improved outcomes for patients and hospitals by sharing information about visits and patient health history. It helped provide the most effective and less expensive treatments for chronic illness through preventive care and communication between doctors and patients [5].

Furthermore, The Lancet published a similar study on October 8, 2016, titled “Global, regional, and national levels of maternal mortality, 1990–2015: a systematic analysis for the Global Burden of Disease Study 2015” [6]. The authors used a standardized process to identify, extract and process all relevant data sources. Uniform algorithms were applied to identify age-, year- and location-specific patterns of failure and hidden records in vital registration, as well as patterns of misrepresentation of deaths. As shown in Figure 1, they visualized the data using line plots, scatter plots, choropleth maps and contour plots.

In addition, the Centers for Disease Control and Prevention established national surveillance of pregnancy-related deaths. Each year, the CDC asks the 52 reporting areas (50 states, New York City, and Washington DC) to voluntarily send copies of death certificates for all women who died during pregnancy [7]. The CDC Maternal Mortality Study Group analyzes the data and reports periodically through its literature and website. As shown in Figure 2, it published trends in pregnancy-related deaths using visualization methods such as line plots, time series and bar charts.

Even though a lot of data has been visualized, it is unclear what conclusions can be drawn from the graphs. They lack detail: there is not enough information about each state, about correlations, or about women’s health and condition.

Research Questions

Comparing and analyzing women’s health, reproduction, maternal death trends and standard of living within each state might reveal the reasons behind the increase in maternal deaths and related causes. The goal is to find potential leading factors and explore them further. Additionally, a variety of data visualization methods are applied with Python in Jupyter Notebook to present the information in an informative and simple way for human visual processing, and to make the alarming statistics understandable.

Process

The workflow for the project involved Jupyter Notebook for data analysis, generating the figures and running different Python visualization tools. The latest tools make it possible to use Python to mix and match values and data sets to analyze complex data, and to visualize it by rendering correlations and trends. This reveals striking insights about each state. Python has been around for roughly thirty years, and a large body of code has been written by many contributors.

The most common types of visualization are simple bar charts, line graphs, scatter plots and choropleth maps. These are the most popular types of visualization for comparing values across a variety of categories. Parameters such as axes, similarities and titles were identified, and it was decided what exactly each visualization was supposed to represent.

To create the visualizations, some of the most popular tools and libraries were used. These include Pandas, Seaborn, Bokeh, Pygal and Plotly.

The data sets were collected from the World Bank data site at https://data.worldbank.org and America’s Health Rankings at https://www.americashealthrankings.org.

Additionally, the analysis was uploaded to GitHub at https://github.com/elenadesigner/maternal-death/blob/master/analysis.md to demonstrate the trends and patterns between each output.

The data covers geographic locations, health conditions, and ratings by year, state, ethnicity and gender.

Certain methods didn’t yield successful results. For example, plotting the percentage of prenatal care within each state as a bar chart gave an unclear visualization, because the differences between states were very small and the percentages were all large. Also, when data with high values is compared against data with small values, the result can be misleading: the small-valued series looks like a flat line without variation. As shown in Figure 3, when maternal mortality was compared with infant mortality and chlamydia cases, the chart emphasized only the last variable. Therefore, compared data needs to have values within approximately the same range. Also, when statistics from all years and countries are drawn together, the chart becomes unclear and creates weird patterns.

The methods that worked best compared 1–7 categories of similar leading factors on one axis, with the other axis used for the ratio, and used colors and sizes to differentiate variables.
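
One possible way around the scale mismatch described above is to rescale each series to a common 0 to 1 range before plotting. The sketch below is only an illustration, not the code used in the project: the helper name `min_max_scale` is hypothetical and every number is a placeholder rather than a value from the data sets.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the 0..1 range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map every point to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Placeholder series on very different scales (NOT real data).
small_series = [10.0, 20.0, 15.0]         # e.g. a rate per 100,000
large_series = [1000.0, 3000.0, 2000.0]   # e.g. a raw case count

scaled_small = min_max_scale(small_series)
scaled_large = min_max_scale(large_series)
print(scaled_small)
print(scaled_large)
```

After rescaling, both series follow the same shape on the same axis, so the small-valued one no longer collapses into a flat line. The trade-off is that the axis no longer shows the original units.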

Results and Insights

To start, the most basic method was used: importing pandas, numpy, and matplotlib’s style and pyplot modules, along with ggplot. Then the world maternal mortality data set was pulled from a CSV file, using the reader function to generate a reader object. It took each line of the file and made a list of all columns. Then the chosen columns and rows were selected.
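
The loading step can be sketched roughly as follows, assuming the reader function mentioned above is `csv.reader` from the standard library. The in-memory text stands in for the real CSV file, and the column names and values are placeholders, not the actual data set.

```python
import csv
import io

# Stand-in for the CSV file; column names and values are placeholders.
raw = io.StringIO(
    "Country,1990,2015\n"
    "Country A,100,50\n"
    "Country B,12,14\n"
)

reader = csv.reader(raw)          # reader object: yields one list per line
header = next(reader)             # first row holds the column names
rows = [row for row in reader]    # remaining rows as lists of strings

# Keep only the chosen columns and convert the numeric one to float.
wanted = ["Country", "2015"]
idx = [header.index(name) for name in wanted]
selected = [[row[i] for i in idx] for row in rows]
selected = [[name, float(value)] for name, value in selected]
print(selected)
```

With a real file, `io.StringIO(...)` would be replaced by `open(path, newline="")`.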

After that, the data was displayed using a horizontal bar chart, as shown in Figure 4. Two categories were compared across countries: data from 1990 versus data from 2015. Even though the country labels on the y axis overlap and look messy, the chart shows that worldwide the overall maternal mortality rate dropped significantly from 1990 to 2015. Red represents 1990 and blue represents 2015.
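
A grouped horizontal bar chart of this kind might be produced as in the sketch below, which uses pandas plotting on top of matplotlib. The country names and values are placeholders, and scaling the figure height with the number of rows is a suggestion for avoiding overlapping y labels, not something the original notebook is known to do.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values, NOT the World Bank figures.
df = pd.DataFrame(
    {"1990": [100, 12, 60], "2015": [50, 14, 30]},
    index=["Country A", "Country B", "Country C"],
)

# Scale the figure height with the row count so the y labels do not overlap.
fig, ax = plt.subplots(figsize=(8, max(2, 0.4 * len(df))))
df.plot.barh(ax=ax, color=["red", "blue"])  # red = 1990, blue = 2015
ax.set_xlabel("Maternal deaths per 100,000 live births")
ax.set_ylabel("Country")
fig.tight_layout()

# Write the PNG to memory; use a file path instead to keep the image.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```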

Next, the data set was modified to compare the change in maternal mortality rate between six very different groups: Australia, the United States, Uzbekistan, Uruguay, Afghanistan and the middle-income world average. The code automatically displayed the table. Once the data is visible in a table, it is very simple to create a quick bar chart. The vertical bar chart in Figure 5 shows that even in Afghanistan, which is an extremely poor country, maternal death rates dropped by 75 percent.

Unfortunately, this did not show much about the United States or Australia. With a few tweaks and only 3 countries plotted, the chart looked a little more impactful. The code and the plotted chart are below. As shown in Figure 6, the United States maternal mortality rate increased while the other countries’ rates decreased.

Then the maternal mortality rate in the United States was broken down by year, from 1990 to 2015, using a simple line plot. As shown in Figure 7, the rate decreased in 1995 and then increased significantly in 2010.

The world mortality data did not contain enough data to compare earlier years. Some countries have data starting as early as 1800, while others only have data since 1990. To understand the density of data in the file by year, a scatter plot was produced. As shown in Figure 8, information was sparse until around 1990, when more data began to be supplied.

To understand how much data each country provided in comparison to others, the file was reduced to 5 countries: Afghanistan, Australia, Sweden, Zimbabwe and the USA. Another scatter plot was produced, with color set by country and size based on the amount of data supplied. As shown in Figure 9, Sweden has the biggest circle and Afghanistan the smallest. That means that Sweden provided much more information about its mortality rate over the years than the other countries.
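
The percentage drop quoted above is a relative change between two years. A minimal sketch of that arithmetic follows; the helper `percent_change` is hypothetical and the rates are placeholders chosen to give round numbers, not values from the data set.

```python
def percent_change(old, new):
    """Return the relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("percent change is undefined when the old value is 0")
    return (new - old) / old * 100.0


# Placeholder rates (NOT real data).
print(percent_change(1000, 250))   # -75.0, i.e. a 75 percent drop
print(percent_change(12, 15))      # 25.0, i.e. a 25 percent rise
```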
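
Before plotting, the density of data per year can also be checked numerically. The sketch below counts non-missing observations per year with `collections.Counter`. The record layout (country, year, rate) and all values are assumptions for illustration only.

```python
from collections import Counter

# Placeholder long-format records: (country, year, rate or None if missing).
records = [
    ("Country A", 1800, 900.0),
    ("Country A", 1990, 20.0),
    ("Country B", 1990, 400.0),
    ("Country B", 1991, None),      # missing observation
    ("Country C", 1990, 35.0),
]

# Count only the years that actually carry a value.
per_year = Counter(year for _, year, rate in records if rate is not None)
for year in sorted(per_year):
    print(year, per_year[year])
```

Years with a low count are the ones where cross-country comparisons rest on very few countries.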
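
A scatter plot with color by country and marker size by amount of data could look roughly like the sketch below. It assumes a long-format table with hypothetical column names `country`, `year` and `rate`. The values are placeholders and the scaling factor of 40 is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder records (NOT real data).
df = pd.DataFrame({
    "country": ["A", "A", "A", "B", "C", "C"],
    "year":    [1800, 1900, 1990, 1990, 1990, 2015],
    "rate":    [900.0, 500.0, 20.0, 400.0, 35.0, 30.0],
})

counts = df.groupby("country")["rate"].count()   # observations per country
colors = {"A": "tab:blue", "B": "tab:orange", "C": "tab:green"}

fig, ax = plt.subplots()
for country, group in df.groupby("country"):
    ax.scatter(
        group["year"], group["rate"],
        s=40 * counts[country],      # marker area grows with data volume
        color=colors[country],
        label=country,
        alpha=0.6,
    )
ax.set_xlabel("Year")
ax.set_ylabel("Maternal deaths per 100,000 live births")
ax.legend(title="Country")
```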

Worldwide fertility rate was compared next. Again, the file was modified to combine 3 variables: the world middle-income rate, the United States rate, and the rate for upper-middle-income countries. As shown in Figure 10, a horizontal bar chart was created with transparent bars for easier comparison between variables. The chart showed that the fertility rate dropped worldwide between 1990 and 2015. The fertility rate in the United States decreased significantly compared to other countries, declining by 30 percent.

Fertility rate is not the same as birth rate. The fertility rate measures the average number of children born to women of reproductive age, while the birth rate measures live births in the entire population, without age limits.

Therefore, the birth rate was compared with the fertility rate within the United States between 1990 and 2015, using a line plot. As shown in Figure 11, a significant drop began around 2010 and continued for both the fertility rate and the birth rate.

To magnify the birth rate and show more detail, the birth rate was plotted on its own. As shown in Figure 12, since 1990 the lowest birth rate was in 1998 and the highest in 2007.

Next, to understand medical improvement, the infant death rate was compared between 3 variables: the world middle-income rate, the United States, and upper-middle-income countries. The data was adjusted and another line graph was produced. As shown in Figure 13, there was a major drop in infant deaths overall, while the United States line shows little change.

To understand infant deaths in the United States, another line plot was created with only one variable, the United States. As shown in Figure 14, there was an improvement in the United States.

The next logical step was to look at doctors’ performance and overall improvement. The only time period for which full statistics were available is 2000 to 2013. The skilled doctors rate was compared between 3 variables: the world middle-income rate, the United States, and upper-middle-income countries. As shown in Figure 15, the skilled doctors rate in the United States actually declined while it increased everywhere else.

Because worldwide data varies so much, it is hard to identify the causes of maternal mortality in the United States by comparing with other countries. Therefore, maternal mortality rates were compared between states within the United States. The maternal mortality rate is the yearly number of female deaths per 100,000 live births from any cause related to pregnancy. For this analysis, a horizontal bar chart was used, which gave a very clear comparison between states. As shown in Figure 16, the District of Columbia, Georgia and New Jersey have the 3 highest maternal mortality rates. The United States average is 19, compared with 40 in the District of Columbia, for example.

To get an even better visualization, a scatter plot was created using the Plotly online application. The code was then embedded into the Jupyter Notebook. As shown in Figure 17, the rendered interactive scatter plot, with hover labels and points colored by rate and density against a gradient color bar, makes the visualization easy to understand and much more visually appealing.
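
The difference between the two measures can be made concrete with a small calculation. The sketch assumes the fertility figure is the total fertility rate (births per woman) and the birth rate is the crude birth rate (births per 1,000 people), which is how the World Bank indicators define them. The article does not state which definitions the notebook used, and all inputs below are placeholders.

```python
def crude_birth_rate(live_births, population):
    """Live births per 1,000 people in the whole population."""
    return live_births * 1000.0 / population


def total_fertility_rate(age_specific_rates, interval_years=5):
    """Average children per woman.

    age_specific_rates: births per 1,000 women per year, one entry per
    age group of width interval_years (e.g. 15-19, 20-24, ..., 45-49).
    """
    return interval_years * sum(age_specific_rates) / 1000.0


# Placeholder inputs (NOT real data).
cbr = crude_birth_rate(12_000, 1_000_000)
tfr = total_fertility_rate([20, 80, 100, 90, 50, 15, 5])
print(cbr)  # births per 1,000 people
print(tfr)  # children per woman
```

Because the denominators differ (the whole population versus women of reproductive age), the two rates can move at different speeds even when both decline.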
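
The definition above translates directly into a one-line formula. In the sketch below the function name is hypothetical, and the death and birth counts are invented so that the results match the rates of 19 and 40 quoted in the text. They are not real counts.

```python
def maternal_mortality_ratio(maternal_deaths, live_births):
    """Pregnancy-related deaths per 100,000 live births."""
    return maternal_deaths * 100_000 / live_births


# Hypothetical counts, chosen only to reproduce the quoted rates.
print(maternal_mortality_ratio(19, 100_000))  # 19.0
print(maternal_mortality_ratio(8, 20_000))    # 40.0
```

The second line shows why small populations can produce high rates from few deaths, which is worth keeping in mind when comparing a district with large states.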

Figure 17

Next, to understand health and habits during pregnancy, alcohol consumption during pregnancy was compared between states. The bar chart shows the percentage of pregnant women who had at least one alcoholic beverage in the past 30 days. Again, as shown in Figure 18, the District of Columbia stands out with the highest rate, almost 30 percent.

After that, the percentage of pregnant women who smoke was compared by state. A different color was used to distinguish the category. This time, as shown in Figure 19, West Virginia stands out with the highest rate, almost 25 percent.

Then the percentage of women who reported that their health is very good or excellent was compared. Interestingly enough, as shown in Figure 20, the District of Columbia stands out with the highest rate, around 70 percent.

Mental distress was compared between states next. The bar chart shows the percentage of women who reported that their mental health was not good. As shown in Figure 21, mental distress is highest in Arkansas, at almost 20 percent.

To identify a leading factor, or any correlating factor, the maternal mortality ratio was compared with mental distress, health and alcohol consumption during pregnancy, using a horizontal bar chart colored by category. Based on Figure 22, the District of Columbia, New Jersey and Georgia showed some of the highest correlation between these categories.

To continue the study of women’s health within states, the percentage of women who were told by a health professional that they have high blood pressure was visualized using the same bar chart method. As shown in Figure 23, the highest rate of high blood pressure was found in Louisiana, at almost 20 percent.

Next, the percentage of women with diabetes was compared by state with the same bar chart method. As shown in Figure 24, Alabama and Mississippi have the highest percentage of women with diabetes.

Obesity levels were compared next, again with a horizontal bar chart. As shown in Figure 25, Mississippi and Arkansas have the highest obesity rates, at more than 35 percent.

Next, the percentage of live births in which the mother received prenatal care before the 3rd trimester was examined, using the same bar chart method, as shown in Figure 26. The District of Columbia showed the smallest percentage, 89 percent.

Combined data sets of infant mortality for 5 different ethnicities were compared. Plotly was used to produce the bar chart, which was then embedded into the Jupyter Notebook. As shown in Figure 27, the highest infant mortality was among the black population in Wisconsin.

The data above didn’t provide any insights or correlation between the leading factors or states. Therefore, to see whether there are any similarities between child mortality and maternal mortality rates, another bar chart was produced. As shown in Figure 28, the highest rates are in Mississippi and Louisiana, almost identical to their maternal mortality rates.

Next, the percentage of women who are not covered by health insurance was analyzed using a simple bar chart, as shown in Figure 29. The analysis showed that Texas has the highest rate of uninsured women, around 28 percent.

To understand the issues even better, the ratio of homeless families was compared by state, again with a simple bar chart, as shown in Figure 30. The analysis showed that the highest ratio is in the District of Columbia.

This was followed by analysis of a new data set. It was built from modified information in which the 3 states with the most problems and the 3 states with the fewest problems were identified and traced in Plotly. Based on that information, an interactive vertical bar chart was created, as shown in Figure 31. Each state had a colored bar for each problem, where the tallest bar identified the most problematic state. The analysis showed that Mississippi and the District of Columbia have the most health and living issues.

Based on that observation, another data frame was created within the Plotly platform. It compared the issues of the 3 most problematic states with those of the 3 states with the lowest issue rates. The data frame was visualized as a scatter plot, with a data value assigned to each state. As Figure 32 shows, the leading issue in the District of Columbia, marked by the biggest circle, is the homeless rate, and the District also shows the highest maternal mortality rate.

The last chart exported from Plotly was a choropleth map. It was created by summing the values for each issue and assigning the total to each state. Gradient colors were added to differentiate the states, with darker colors marking the worst states. As shown in Figure 33, California, Hawaii, Vancouver and Minnesota are among the best states in which to have a safe pregnancy, while the District of Columbia, Mississippi, Oklahoma and West Virginia are the worst and least safe.
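
Reading correlation off a bar chart is subjective, so a numeric check can complement it. The sketch below computes a Pearson correlation coefficient by hand. This is a suggested addition rather than a step from the original analysis, and the per-state values are placeholders built to be perfectly linear, so they say nothing about the real data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Placeholder per-state values (NOT real data).
mortality = [10.0, 20.0, 30.0, 40.0]
distress = [12.0, 14.0, 16.0, 18.0]   # rises with mortality
alcohol = [8.0, 6.0, 4.0, 2.0]        # falls as mortality rises

r_distress = pearson(mortality, distress)   # close to +1
r_alcohol = pearson(mortality, alcohol)     # close to -1
print(round(r_distress, 3), round(r_alcohol, 3))
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate none. With only a few dozen states, even a moderate coefficient should be read cautiously.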
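
Picking the three highest and three lowest states can be done with a plain sort. The sketch below assumes each state has already been reduced to a single score. The state names and scores are placeholders, not results from the analysis.

```python
# Placeholder composite "issue" scores per state (NOT real data).
scores = {
    "State A": 7.5, "State B": 2.0, "State C": 9.1, "State D": 4.4,
    "State E": 1.2, "State F": 8.3, "State G": 3.0,
}

# Rank states from highest score (most issues) to lowest.
ranked = sorted(scores, key=scores.get, reverse=True)
worst_three = ranked[:3]
best_three = ranked[-3:][::-1]   # lowest score first

print(worst_three)
print(best_three)
```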
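
The summing step and the map definition can be sketched as follows. The indicator names and values are placeholders, and the figure is written as a plain dictionary in the layout Plotly accepts, so the snippet runs without Plotly installed. The original chart was built in the Plotly online application, so this is an approximation of the idea rather than the author's code.

```python
# Placeholder indicator values keyed by state code (NOT real data).
indicators = {
    "CA": {"maternal_mortality": 5.0, "uninsured": 10.0, "obesity": 25.0},
    "MS": {"maternal_mortality": 25.0, "uninsured": 18.0, "obesity": 37.0},
    "MN": {"maternal_mortality": 4.0, "uninsured": 6.0, "obesity": 26.0},
}

# Composite score: plain sum of every indicator for the state.
totals = {state: sum(vals.values()) for state, vals in indicators.items()}

# Plotly figure described as a plain dict; darker color = higher total.
figure_spec = {
    "data": [{
        "type": "choropleth",
        "locationmode": "USA-states",
        "locations": list(totals),
        "z": list(totals.values()),
        "colorscale": "Reds",
        "colorbar": {"title": {"text": "Combined score"}},
    }],
    "layout": {
        "geo": {"scope": "usa"},
        "title": {"text": "Combined issue score by state"},
    },
}

# To render (requires the plotly package):
# import plotly.graph_objects as go
# go.Figure(figure_spec).show()
print(totals)
```

A plain sum mixes indicators with different units and ranges, so the largest-valued indicator dominates the total. Rescaling each indicator first would weight them more evenly.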

Figure 33

Conclusion

It is alarming and unbelievable that in the 21st century women in the United States are still dying in childbirth at increasing rates. Pregnancy-related mortality findings should be studied and cross-analyzed even more with the latest technology. This will provide a new view and value, resulting in clarification and better health management.

The project’s research and analysis showed that the most disturbing leading factors were within the District of Columbia. It had the highest maternal mortality rate, the highest rates of homeless families and alcohol consumption during pregnancy, and the lowest rate of prenatal doctor visits.

Additionally, the analysis showed that the number of skilled doctors and the birth rate are declining in the United States.

Hopefully, this research will bring maternal deaths and their causes to the attention of the public and government institutions, and help prevent them.

All these years, there was not enough structured information for deeper understanding and analysis. This can be improved. Big Data grows massively every day, and useful information is everywhere around us.

The latest and fastest digital platforms have the ability to transform and improve healthcare, store data and analyze huge masses of information from separate sources.

Doctors, medical staff and patients could use that information to improve and achieve better outcomes for pregnant mothers and prevent death.

Various visualization methods have been demonstrated, using a variety of data sets to plot simple bar charts, scatter plots and line graphs. Of the common techniques demonstrated here, Pandas with Python is the simplest method for basic plots. Plotly is the most useful, appealing and easiest option for creating highly interactive web-based visualizations.

The next step would be to identify how many lives could be saved by providing medical kits, and to compare the cost of providing medical kits with the cost of death. With increased availability of medical kits, I expect to see a decreased mortality rate and lower costs in hospitals that monitor and provide special care for pregnant mothers in need. Finally, the states with the highest mortality rates should be analyzed further and compared with those that have lower rates, to find the reasons behind the difference.

References

  1. Cha, A. E. (2017, June 30). The U.S. fertility rate just hit a historic low. Why some demographers are freaking out. Retrieved December 15, 2017, from https://www.washingtonpost.com/news/to-your-health/wp/2017/06/30/the-u-s-fertility-rate-just-hit-a-historic-low-why-some-demographers-are-freaking-out/?utm_term=.ae4cf271272a
  2. Reproductive Health. (2017, November 09). Retrieved December 15, 2017, from https://www.cdc.gov/reproductivehealth/maternalinfanthealth/pmss.html
  3. Martin, Nina (ProPublica), and Renee Montagne. “U.S. Has The Worst Rate Of Maternal Deaths In The Developed World.” NPR, 12 May 2017, www.npr.org/2017/05/12/528098789/u-s-has-the-worst-rate-of-maternal-deaths-in-the-developed-world.
  4. Maternal Mortality in the United States: A Human Rights Failure. (n.d.). Retrieved December 15, 2017, from http://www.arhp.org/publications-and-resources/contraception-journal/march-2011
  5. Basel Kayyali, David Knott, and Steve Van Kuiken. (n.d.). The big-data revolution in US health care: Accelerating value and innovation. Retrieved December 15, 2017, from https://www.mckinsey.com/industries/healthcare-systems-and-services/our-insights/the-big-data-revolution-in-us-health-care
  6. “Global, Regional, and National Levels of Maternal Mortality, 1990–2015: a Systematic Analysis for the Global Burden of Disease Study 2015.” The Lancet, Oct. 2016, www.thelancet.com/journals/lancet/article/PIIS0140-6736(16)31470-2/abstract.
  7. “Reproductive Health.” Centers for Disease Control and Prevention, Centers for Disease Control and Prevention, 9 Nov. 2017, www.cdc.gov/reproductivehealth/maternalinfanthealth/pmss.html.
