Data Wrangling World Bank Population

Of the 259 countries captured in the World Bank data set, 10 are ASEAN member states, including Singapore. The goal is to find out whether Singapore’s past population growth moved in tandem with the world trend and with the regional ASEAN benchmark.

As in most data analysis, data wrangling is needed to extract the relevant data from the core data set and prepare it for plotting. The steps were:

  • Separated the historical data from future projections by removing everything beyond 2017, keeping 1960 to 2017 for this analysis.
  • Extracted a separate data set for each of the 10 ASEAN states from the core data set.
  • Re-indexed the new data sets and removed the old indices.
  • Dropped invalid cells and redundant columns from the newly created sets.
  • Appended the 10 sets into one combined set for ASEAN.
  • Prepared the data series for the Singapore, ASEAN and World trends for the final line chart.
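
The steps above can be sketched roughly as follows. This is only an illustration, not the original notebook: it assumes pandas is used, that the table has one row per country and one column per year, and that the column names are "Country Name" and "Country Code". The numbers are invented toy values, and "Unnamed: 5" is a hypothetical redundant column.

```python
import pandas as pd

# Toy stand-in for the World Bank table (values are invented, not real data).
raw = pd.DataFrame({
    "Country Name": ["Singapore", "Malaysia", "Thailand", "World"],
    "Country Code": ["SGP", "MYS", "THA", "WLD"],
    "Unnamed: 5": [None, None, None, None],   # hypothetical redundant column
    "1960": [10.0, 50.0, 100.0, 1000.0],
    "1961": [11.0, 52.0, 103.0, 1020.0],
    "2017": [40.0, 150.0, 250.0, 2500.0],
    "2018": [41.0, 152.0, 252.0, 2530.0],     # beyond 2017, to be removed
})

# ISO codes of the 10 ASEAN member states (the toy table holds only three).
ASEAN_CODES = ["BRN", "KHM", "IDN", "LAO", "MYS",
               "MMR", "PHL", "SGP", "THA", "VNM"]

# Keep only the year columns from 1960 to 2017; this also drops the
# redundant column and anything past 2017.
year_cols = [c for c in raw.columns if c.isdigit() and 1960 <= int(c) <= 2017]
historical = raw[["Country Name", "Country Code"] + year_cols]

# Extract one data set per ASEAN state, re-index it, and drop invalid rows.
frames = []
for code in ASEAN_CODES:
    subset = historical[historical["Country Code"] == code]
    subset = subset.reset_index(drop=True)              # discard the old index
    subset = subset.dropna(subset=year_cols, how="all") # drop rows with no data
    if not subset.empty:
        frames.append(subset)

# Combine the per-country sets into one ASEAN set.
asean = pd.concat(frames, ignore_index=True)

# Build the three series for the chart: Singapore, ASEAN total, World.
series = pd.DataFrame({
    "Singapore": historical.loc[historical["Country Code"] == "SGP", year_cols].iloc[0],
    "ASEAN": asean[year_cols].sum(),
    "World": historical.loc[historical["Country Code"] == "WLD", year_cols].iloc[0],
})
series.index = series.index.astype(int)  # years as integers for plotting
print(series)
```

pd.concat is used here for the "append" step because DataFrame.append is no longer available in recent pandas releases.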

The three trends were compared in the final line chart, rendered with the Matplotlib and Seaborn libraries. Seaborn is built on top of Matplotlib and gives the chart a more modern look.

Since 1960, Singapore’s population growth appears to have gone through several significant upheavals and erratic movements compared with the World and ASEAN trends for the same period. Interesting.
