Posted in Uncategorized by realitycheck on November 1, 2006

Have you seen a cartogram? They help visualize geographic data like no other tool. The actual sizes of geographical units are scaled to some other value, such as social or economic data. The magic is that even though the sizes are scaled, the units roughly retain their structure, at least with respect to their neighbors.

See an example below (click here for more excellent examples).

This is a normal map of the world.

This is a population cartogram. See how big India and China have become!


Very neat indeed – surely beats looking at Excel spreadsheets!

This blog has been frustrated with the lack of data in Indian social policy. This lack of data prevents anyone from having a serious discussion on social issues. For the past week I have set aside an hour each day to collect and tabulate publicly available social data in India. The result is a massive spreadsheet, available here.

The input data consists of :

Procedure : Making the data sheet 

The first step was to tabulate the data from these diverse sources into a single spreadsheet.

  1. The 2001 census contains statewise (and UT wise) overall population data.
  2. The 2001 census has tabulated SC and ST data in all states and UTs.
  3. The NSSO data covers only the major Indian states (Punjab, Haryana, Rajasthan, UP, Bihar, Assam, West Bengal, Orissa, MP, Gujarat, Maharashtra, AP, Karnataka, Kerala, and TN). So out of 28 states and 7 UTs, NSSO has data for 15 major states and no UTs.
  4. The NSSO data does not account for the new states of Uttaranchal, Jharkhand, and Chattisgarh.
  5. The NSSO data gives the distribution of OBCs separately for rural and urban areas, so we need to combine them into a single count. For example, if in state X 10% of the rural population is OBC and 20% of the urban population is OBC, then we need the statewide rural and urban population counts to arrive at a final number. These counts can be found in the 2001 census.
  6. Using the above data, we can arrive at statewise ST/SC/OBC/FC percentages for the 15 states.
  7. For the remaining states and UTs, we do not have data. We can't do much about them.
  8. Some simple spreadsheet calculations give us this data sheet.
  9. For states that did not have the NSSO data on OBC and FC (Others), we assumed the national average so that holes would not appear on the cartogram. This approach is debatable because the northeastern states do not have many castes in the OBC group, and the J&K figure may also be wrong, on the lower side. Still, using the national average is better than guessing wildly for the missing states and UTs.
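Steps 5 and 9 above boil down to a population-weighted average and a simple fill-in rule. Here is a minimal sketch in Python rather than a spreadsheet. The function names, the population figures for state X, and the national average used in the example are all made up for illustration; they are not taken from the census or NSSO data.

```python
def statewide_share(rural_pop, urban_pop, rural_pct, urban_pct):
    """Combine rural and urban percentages into one statewide percentage.

    Each percentage is weighted by the population it applies to.
    """
    total = rural_pop + urban_pop
    group_count = rural_pop * rural_pct / 100 + urban_pop * urban_pct / 100
    return 100 * group_count / total


def fill_missing(shares, national_average):
    """Replace missing (None) state values with the national average."""
    return {state: (national_average if pct is None else pct)
            for state, pct in shares.items()}


# Hypothetical state X: 8,000,000 rural residents (10% OBC)
# and 2,000,000 urban residents (20% OBC).
obc_x = statewide_share(8_000_000, 2_000_000, 10, 20)
print(obc_x)  # 12.0

# Hypothetical national average of 30%, used only where a state has no data.
print(fill_missing({"X": obc_x, "Y": None}, 30.0))  # {'X': 12.0, 'Y': 30.0}
```

Note that the statewide figure (12%) is not the plain average of 10% and 20%, because the rural population in this made-up example is four times the urban one.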

Now what? We went through so much trouble only to produce the most unreadable output. It is too boring to even think about understanding.

So, enter the cartogram.

Procedure : Making the cartogram

I had major trouble with this before figuring it out with some help.

  1. Step 1: We need a GIS map of India with the states marked. This data is usually in the form of a SHP file. I just googled for it and found one here
  2. The problem with that file was that the data was old. The states of Jharkhand, Chattisgarh, and Uttaranchal were missing. I tried searching for more recent files, but came up empty-handed.
  3. Since the SHP file did not have the three states mentioned above, we have to combine their data with the original states. So Uttaranchal data was added to UP, Jharkhand to Bihar, and Chattisgarh to MP. This may be problematic to a minor extent because Orissa also contributed some areas to Jharkhand.
  4. The next step was to find a tool capable of generating the cartogram. After trying a bunch of tools, I decided to use Mapresso
  5. The first step is to input the data into the SHP file. This was the most frustrating and time-consuming task. I finally ended up using a tool called Geoda
  6. Using the Geoda table editor, I input all the data from the spreadsheet. This was painful because I copy-pasted every single value. I am sure there is a better way to input data. This took a few days because I only blog an hour or so at a time.
  7. Finally, I had a SHP file with data. The next step was to convert it into a so-called PSC file that Mapresso understands.
  8. After much trouble with the tool shp2psc, I was able to create the input files for Mapresso.
  9. The next steps were easy. I generated the cartograms with 300-500 iterations of whatever algorithm Mapresso uses. The more iterations, the finer the map looks.
  10. After each map was generated, I used Print Screen to capture it and pasted it into Paint.
  11. This was hard work. I decided to add a graphic to point to this blog (more hits please!).
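The merge in step 3 has one catch worth spelling out: percentages cannot simply be added, so the head counts have to be added first and the percentage recomputed afterwards. A minimal sketch, assuming the data is held as absolute counts per state; the function names and all the numbers are made up for illustration.

```python
# New states and the older states they are folded back into (from step 3).
PARENT = {
    "Uttaranchal": "UP",
    "Jharkhand": "Bihar",
    "Chattisgarh": "MP",
}


def merge_into_parents(counts):
    """Add each new state's head counts to its parent state.

    `counts` maps state -> {"total": ..., "group": ...} as absolute numbers.
    States without an entry in PARENT are kept under their own name.
    """
    merged = {}
    for state, row in counts.items():
        target = PARENT.get(state, state)
        acc = merged.setdefault(target, {"total": 0, "group": 0})
        acc["total"] += row["total"]
        acc["group"] += row["group"]
    return merged


def percentages(merged):
    """Recompute the group's share of each (merged) state's population."""
    return {s: 100 * r["group"] / r["total"] for s, r in merged.items()}


# Made-up numbers, for illustration only.
sample = {
    "Bihar": {"total": 800, "group": 100},
    "Jharkhand": {"total": 200, "group": 50},
}
merged = merge_into_parents(sample)
print(percentages(merged))  # {'Bihar': 15.0}
```

This does not address the Orissa areas mentioned in step 3; those would still be counted wherever the source data puts them.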

Feel free to use the cartograms or data whichever way you choose. Just mention the original source of these cartograms. You know where you found them.

If you find any problems with the data or methodology, please leave a comment here.

Enjoy! See a separate post with all the cartograms.


5 Responses

  2. Goldstar said, on November 2, 2006 at 11:12 am

    Amazing work!! Hats off to your commitment and dedication!!

  3. Sharan Sharma said, on November 2, 2006 at 6:40 pm

    Excellent! Excellent! Excellent!

    Incidentally, when the University of Michigan did an ‘in the focus’ story on Prof. Mark Newman and his cartogram project a few months ago, the favourite pastime here was to get hold of all kinds of data and do these fun maps.

  4. state83 said, on October 15, 2007 at 3:33 pm

    I tried to do what you describe but I have one problem… when I generate the psc file and copy the code into MAPresso, the data displayed in the countries do not match. How can I fix this?



  5. KADIR ISMAIL said, on October 31, 2008 at 2:30 pm

    The social cartogram is a new tool for effective presentation. It aesthetically combines the quantity and the quality of data.
    However, caution has to be exercised so that there is no distortion in the name of the quality of data.

