Reality Check India

Stop picking on Raje, it’s much bigger than her

Posted in Uncategorized by realitycheck on May 31, 2007

Last night, I switched on the TV in the faint hope that some channel might, in a rare flash of clarity, throw some light on the real issues surrounding the continuing Gujjar protests in Rajasthan.  That was not to be.

All channels were lambasting the Rajasthan CM Vasundhara Raje, without telling us how they expected her to pull off a miracle solution. CNN-IBN was the front runner. After failing to make an impression by attacking her for participating in fashion shows promoting handicrafts, this was their moment. No, their program fell flat on its face. Read the transcript to judge the quality for yourself.

Can we blame the media ?

I sent emails to various media outlets many months back, asking them to highlight the burning need to involve data in the current discourse.  I gave them links,  data, and leads to do stories on the NSSO and other available data.  They were invited to use my social cartograms for their purpose without even giving me credit. All they had to do was cross check with their in-house or external social scientists and they would have a front page bestseller, with no effort on their part.  Are there no takers for this ?

If you are not going to talk about data, what is left ?

The central question that will torment this country in the days to come is this : “If X, who is so obviously doing well can get backward benefits, why can’t Y, who feels they are losing out ? ”

Try debating that without data establishing X’s and Y’s backwardness.

What kind of debate will emerge ? What happens when you actively try to avoid the obvious ? I think we can only have political gossip at best and fanning the flames at worst.

Forget Vasundhara Raje, even Superman and Batman together cannot fix this problem.

To fix this problem, you must seek data – which is the truth. Truth will always set you free. (Isn’t that our national motto ? Satyameva Jayate) Any takers for that today ?

Do we have it in us to seek answers to these questions ? Can we seek the truth and more importantly can we handle the truth ?

Reminds me of one of the all time great movie dialogs.


Jessep: You want answers?
Kaffee (Tom Cruise): I think I’m entitled to them.
Jessep: You want answers?
Kaffee: I want the truth!
Jessep: You can’t handle the truth! Son,..
 – A Few Good Men (1992)

1. On what basis were the Jats and others conferred OBC status ? This leads to the question of how a community (a) gets classified as OBC and (b) remains in that classification.

2. Was the OBC group in Rajasthan homogeneous prior to Jats getting OBC status ? Were communities benefiting roughly equally from the prior classification ?

3. Did the inclusion of Jats disturb the homogeneous group ? By how much ?

4. If the Gurjars and Meenas are roughly equally tribal, then what is the rationale for excluding Gurjars from the ST group ?

Of course, if you follow this thread – it will inevitably lead to the southern states. Jats can easily point to the dominant OBCs from the south (esp TN). Soon it’s back to the basic question: “If they are OBC, why can’t we be ?”

Say NO to life stories – and YES to hard and current data.

It is the last hope.

NSS 61st Round data for Social Groups

Posted in Uncategorized by realitycheck on December 14, 2006

This post has been in draft mode for weeks, glad to get it out.

I think it is a good sign that folks are finally beginning to talk about the importance of data in the framing of public policy. This is essential to evaluate claims of backwardness by different social groups. They should remember that data is not a “nice to have” – it is a “must have”. It is not an afterthought – it is a pre-requisite to select castes for inclusion in the group called OBCs. Abi, one of “The other india” bloggers, has some useful links here.

I have been reading the latest NSSO survey, called the 61st round, conducted between July 2004 and June 2005. This was the seventh such survey and only the second to collect OBC data. The previous available survey was called the NSS 55th round. I have covered the old rounds here and here. This is boring stuff and hard to wrap your head around – so check out the cartograms here.

What is the big deal with NSSO data ?

Had you heard about the NSSO prior to April of this year ? If not, you are not alone; neither had I. The NSSO collects statistics like household expenditure, unemployment profiles, energy usage, educational levels and other data useful for planners and researchers. This stuff, while useful to planners, is not very interesting to the lay person. The NSSO however had a hidden gem in its data. It was a simple count of OBCs and Others (Forward Castes). Since 1931 we have not had any idea about these two critical numbers.

Survey methodology

It uses a stratified multi-stage sampling method. Don’t worry, I don’t know the math behind that either. From what I gather, they divided up villages, hamlets, urban areas, and wards in a scientific manner. The final units (the so-called ultimate stage units or USUs) were individual households. This means that they did not go around asking individual people for data; instead the head of the household reported data that applied to the entire household. This is robust, and anyone claiming otherwise had better do the math and tell us why this is not so. So I am quite satisfied with what is available.
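
To make the idea of stratification a little more concrete, here is a minimal sketch in Python. It is not the NSSO's actual estimation procedure, which uses multi-stage weights that are not reproduced here. The function name, the strata and all the numbers are made up purely for illustration.

```python
def stratified_share(strata):
    """Estimate a population share from per-stratum samples.

    Each stratum is a tuple:
        (population_size, sampled_households, sampled_households_in_group)
    The share observed in each stratum's sample is weighted by that
    stratum's share of the total population.
    """
    total_population = sum(pop for pop, _, _ in strata)
    estimate = 0.0
    for pop, sampled, in_group in strata:
        stratum_share = in_group / sampled  # share seen in this stratum's sample
        estimate += stratum_share * pop / total_population
    return estimate


# Hypothetical numbers, for illustration only.
strata = [
    (800_000, 400, 100),  # rural stratum: 25% of sampled households in the group
    (200_000, 200, 100),  # urban stratum: 50%, but urban areas are over-sampled
]

pooled = (100 + 100) / (400 + 200)      # naive pooling of both samples
weighted = stratified_share(strata)     # weighted by stratum population
print(round(pooled, 3), round(weighted, 3))
```

The point of the sketch is only that the urban households, although over-sampled, do not get to dominate the final figure: each stratum counts in proportion to its real population.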

Is Self reporting bad ?

This is receiving some flak from some commentators.  In the NSSO survey, all data including the social group is self-reported. This means that the head of the household claimed that he belonged to one of the four social groups.  It is important to remember that even the census data is self reported. This does not mean it is invalid. Self-reporting is expected to be robust for all but the most threatening questions. If you asked a population a question like, “Have you committed a crime in the past three months ?”  – you can expect a lot of misreporting.   

How is this data relevant to OBC quota policy ?

The NSSO data has for the first time given a glimpse of the OBC population. This is why it has piqued the interest of so many people. This survey can give us a pretty good estimate of how many OBCs/FCs there are in each state and in India. We already know the SC and ST count based on the 2001 census.

So anything interesting in this round ?

The nationwide OBC percentage has increased 5% from the last round.  TN OBC count has increased 8%. The SC/ST count nationwide has remained pretty much the same.  It turns out OBC purchasing power is almost equal to that of the “Others”. OBCs are also better employed than the FCs at 42% vs 38.9%.

Probably a lot of interesting stuff for folks like Surjit Bhalla (read his take on the OBC quota and the legitimate Muslim claim for a subquota).

Any new cartograms ?

Yes – the startling thing to look out for is Tamilnadu/WB/J&K. Someone better come up with an explanation for why 95.4% of a state is in need of social justice in the form of hard quotas. They must show that the remaining 4.6% have such a stranglehold on the forces of production, education, and the economy. If not, all states are soon going to ask for more – they pay taxes too.

Data source : NSS 61st Round Report 516 titled “Employment and Unemployment Situation among Social Groups in India” Page 54.

Thanks to Abi we can just look at that one page without having to register and download the document. Check it out here.



For every 100 people (figures rounded off) –

  • only 6 Bengalis can compete for the OBC quota seats
  • only 13 people from J&K can compete
  • only 14 Delhiites
  • but 74 Tamils are eligible, 61 Keralites too (but only 8 Keralites compete as SCs)

Even the Moily Report warned 

(page 95):

The analysis of NSS data clearly brings out the inclusion of creamy layer will result in reserved seats getting pre-empted by OBCs from the top two income deciles of the OBCs at the cost of the poorer income deciles. Thus almost all rural OBCs as well as OBCs from the Northern, Central and Eastern regions will be deprived of the intended benefit of reservation.

(page 94)

.. In fact Prime Minister Rajiv Gandhi also vehemently propounded the theory of excluding the creamy layer as he was apprehensive of the disadvantaged classes losing the benefits and upper layer of OBCs who are well off and have enjoyed the benefits of two or three generations continuing to be protected at the cost of the deprived classes for whom the process of empowerment is really meant.

This may not appear to be a problem now, but sooner or later these imbalances have a way of turning violent. Bengalis and Kashmiris pay taxes too that go into the 25000 cr kitty. They have an IIT at Kharagpur and an IIM at Kolkata – can you imagine the local feeling when only 6 out of 100 Bengalis can compete for the newly created OBC seats there ? At the very least this merits a 5 min discussion in parliament – no ?

Why are you always picking on TN because it has only 4.3% forward castes ? Even Lakshadweep (1%), Chattisgarh (10%), Manipur (3%) have low forward castes. Why not ask the same questions of those states (or UTs) ?

Good question. There is no oppression in those states but still they have very few forward castes. That is because most of the rest are classified as Scheduled Tribes. STs are not selected on the basis of caste discrimination or untouchability. They have just been cut off due to their remoteness. In fact, most ST societies are very egalitarian and there is hardly any discrimination. This is why Tamilnadu has a special requirement to prove that 95.7% of people are oppressed – because only 1.04% of TN is classified as ST.

What about officials like PS Krishnan who want a census before deciding on quotas (article here)?

Brilliant, we welcomed this long back. However, there is a fly in the ointment. The census should count individual castes – not the omnibus OBC group. This is at the very heart of the quota system. If “caste” is the main criterion to get into the group known as the OBCs – it must be the measurement unit (or Ultimate Stage Unit). His claim that extrapolation from 1931 is better than a stratified sample taken in 2005 is not sound. He has a lot of ground to cover if he wants to prove that mathematically.

What difference does it make ? Even if the OBC population is 27.001% we would still have the 27% quota, right ? Is this all worth it then ?

Good question again, hitting at the heart of the issue. The nature of the OBC quota is different from the ST quota as seen above. OBC units are castes that are selected for membership into a group – which is then allocated a cut of the seats. The catch is that each OBC caste that is so selected must be Socially and Educationally backward. This is why this group is called SEBCs in the constitution and various court judgements. SCs (Dalits) do not have that requirement – they are given quotas because they suffered the most humiliating oppression, such as untouchability and access restrictions. This means that SC castes do not need to demonstrate “Social and Educational Backwardness” parameters – they are “in” just based on their caste.

So are you saying you support 27% quota for OBCs ?

The 27% figure has nothing to do with the OBC count. It has to do with the ceiling of 50% imposed by the Supreme Court. With 22.5% already reserved for SCs and STs (15% and 7.5%), 27% is what fits under that ceiling. Why do we have this ceiling ? It is there because quotas are supposed to be the exception to equality; under no circumstances can the exception be allowed to swallow the rule. This was said by none other than Dr Ambedkar.

Is there a group called OBCs – or is this bogus ?

I do not dispute the existence of a group called the OBCs. In a highly hierarchical society like India, there may be several castes that are close to the SCs – but have not been classified so. In my view, only castes that are close to the SCs in social status are OBCs. These castes have the added burden of measurement for social and educational backwardness. I am not a social scientist, but I am sure some combination of parameters can be used to set a benchmark for cut off.

Ok cut it short,  how can you measure this monster ?

1. The central data, which is unfortunately treated as a national secret, is university admissions records and applications. For employment, it is employment applications and selected profiles. This is a direct external validation for census and sample survey results. In fact, the government must work up from this data, which is remarkably easy to obtain.

2. All data must be normalized relative to the eligible applications received. If only 10 students apply for an MBBS program from Caste X (whose population is in the millions) – then there is an access problem at the school level. Quotas are not going to help here.

3. If a caste is well represented (not necessarily proportionally) in the open competition, then it cannot remain in the OBC group any longer. Policies can be framed so that a caste is placed in a “monitor” state for a couple of years before being moved out. By remaining in the OBC group, this caste is destroying the social justice needs of those OBC castes that are not able to make it in the open (again normalized based on eligible candidates)

4. Creamy layer removal is an absolute requirement for OBCs. This is because the creamy layer is the creme-de-la-creme of Indian society. There is no case to be made for social justice for them at the expense of the poor OBC. Remember, the largest OBCs did not suffer oppression like the Dalits did.
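
Points 2 and 3 above can be sketched in a few lines of Python. This is only an illustration of the normalization idea, not a proposed official formula: the caste names, the application counts, the `monitor_threshold` and the `min_eligible` cut-off are all hypothetical.

```python
def classify_castes(groups, monitor_threshold=1.0, min_eligible=50):
    """Label each caste using open-competition results normalized by
    the number of eligible applications it actually sent in.

    groups maps a caste name to
        {"eligible": eligible applications, "selected_open": open-merit selections}
    """
    total_eligible = sum(g["eligible"] for g in groups.values())
    total_selected = sum(g["selected_open"] for g in groups.values())
    if total_eligible == 0 or total_selected == 0:
        raise ValueError("need at least one eligible applicant and one selection")
    overall_rate = total_selected / total_eligible

    status = {}
    for name, g in groups.items():
        if g["eligible"] < min_eligible:
            # Too few applicants to judge: the problem is at the school level.
            status[name] = "access problem"
            continue
        # Success rate of this caste relative to the overall success rate.
        relative_rate = (g["selected_open"] / g["eligible"]) / overall_rate
        status[name] = "monitor" if relative_rate >= monitor_threshold else "remain"
    return status


# Made-up numbers, for illustration only.
groups = {
    "Caste X": {"eligible": 10, "selected_open": 1},
    "Caste Y": {"eligible": 5000, "selected_open": 600},
    "Caste Z": {"eligible": 4000, "selected_open": 100},
}
print(classify_castes(groups))
```

With these made-up figures, Caste X is flagged as an access problem (only 10 applications), Caste Y does better than average in the open competition and goes into the “monitor” state, and Caste Z remains in the group.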

The India social cartogram project maps

Posted in Uncategorized by realitycheck on November 1, 2006

Here are the cartograms. See this post for details about how these maps were generated and where the data used for these maps came from. All these cartograms can be used under the Creative Commons license. You can use them however you want to, just acknowledge the source. The raw data is available here (scroll to the bottom to see it).





Note: The south has become very large, West Bengal and Assam are almost absent due to the low number of OBCs. (WB has only 6% and Assam has 18%, TN has 65%, Kerala 52%)

Note: West Bengal (63%), Maharashtra (50%) and Assam (56%) have a large number of forward (or open competition) castes. Tamilnadu has shrunk a lot due to the low number of forward castes (10.66%)

Note: See how the northeast is larger due to the large number of scheduled tribes. Lakshadweep is larger too because 94% of population is ST.


The social cartogram project

Posted in Uncategorized by realitycheck on November 1, 2006

Have you seen a cartogram ? They help visualization of geographic data like no other tool. The actual sizes of geographical units are scaled to some other value such as social or economic data. The magic is that even though the sizes are scaled, the units roughly retain their structure, at least with respect to their neighbors.

See an example below (click here for more excellent examples).

This is a normal map of the world.

This is a population cartogram. See how big India and China have become !


Very neat indeed – surely beats looking at Excel spreadsheets !

This blog has been frustrated with the lack of data in Indian social policy. This lack of data prevents anyone from having a serious discussion on social issues.  I have set aside an hour each day for the past week trying to collect and tabulate publicly available social data in India. This resulted in a massive spreadsheet available here.

The input data consists of :

Procedure : Making the data sheet 

The first step was to tabulate the data from these diverse sources into a single spreadsheet.

  1. The 2001 census contains statewise (and UT wise) overall population data.
  2. The 2001 census has tabulated SC and ST data in all states and UTs.
  3. The NSSO data contains data for only the major Indian states (Punjab, Haryana, Raj, UP, Bihar, Assam, West Bengal, Orissa, MP, Guj, Mah, AP, Karnataka, Kerala, and TN). So out of 28 states and 7 UTs – NSSO has data for 15 major states and no UTs.
  4. The NSSO data does not account for new states Uttaranchal, Jharkhand, Chattisgarh.
  5. The NSSO data contains distribution of OBCs by rural and urban. So we need to combine them into a single count. For example in state X, if 10 % of rural population is OBC – and 20% of urban population is OBC. Then we need the state wide rural / urban count to arrive at a final number. This data can be found in the 2001 census.
  6. Using the above data, we can arrive at statewise ST/SC/OBC/FC percentages for the 15 states.
  7. For the remaining states and UTs, we do not have data. We can’t do much about them.
  8. Some simple spreadsheet calculations give us this data sheet.
  9. For states that did not have the NSSO data on OBC and FC (Others), we assumed the national average so that holes would not appear on the cartogram. This approach is debatable because the northeast states do not have many castes in the OBC group, J&K may also be wrong on the lower side. Using the national average is better than guessing wildly for the missing states and UTs.
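
The arithmetic in steps 5 and 9 is simple enough to sketch in Python. The spreadsheet itself is not reproduced here; the function names, the state labels, the population figures and the `national_average` value below are all placeholders for illustration.

```python
def statewide_percent(rural_pct, urban_pct, rural_pop, urban_pop):
    """Step 5: combine rural and urban percentages into one statewide
    figure, weighting each by its population (from the census)."""
    total_pop = rural_pop + urban_pop
    return (rural_pct * rural_pop + urban_pct * urban_pop) / total_pop


def fill_missing(obc_pct_by_state, national_average):
    """Step 9: states with no NSSO figure (None) get the national average
    so that no holes appear on the cartogram."""
    return {
        state: national_average if pct is None else pct
        for state, pct in obc_pct_by_state.items()
    }


# State X from step 5: 10% of rural and 20% of urban population is OBC.
# The population split (750 rural, 250 urban) is a made-up example.
state_x = statewide_percent(10.0, 20.0, rural_pop=750, urban_pop=250)
print(state_x)

# Placeholder values: None marks a state or UT with no NSSO data.
sheet = fill_missing({"State X": state_x, "State Y": None}, national_average=35.0)
print(sheet)
```

Note that the statewide figure (12.5% here) sits closer to the rural number because most of the made-up population is rural. A plain average of 10% and 20% would give the wrong answer.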

Now what! We went through so much trouble to produce the most unreadable output.  This is too boring to even think about understanding it.

So, enter the cartogram.

Procedure : Making the cartogram

I had major trouble with this before figuring it out with some help.

  1. Step 1 : We need a GIS map of India with states marked. This data is usually in the form of a SHP file. I just googled for it and found one here
  2. The problem with that file was that the data was old. The states of Jharkhand, Chattisgarh, Uttaranchal were missing. I tried searching for more recent files, but came up empty handed.
  3. Since the SHP file did not have the three states mentioned above, we have to combine data with the original states. So Uttaranchal data was added to UP, Jharkhand to Bihar, Chattisgarh to MP. This may be problematic to a minor extent because Orissa also contributed some areas to Jharkhand. 
  4. Next step was to find a tool capable of generating the cartogram. After trying a bunch of tools – I decided to use mapresso
  5. The next step is to input the data into the SHP file. This was the most frustrating and time consuming task. I finally ended up using a tool called Geoda
  6. Using Geoda table editor, I input all the data from the spreadsheet. This was painful because I copy pasted every single value. I am sure there is a better way to input data. This took a few days because I only blog an hour or so at a time.
  7. Finally, I had a shp file with data. Next step was to convert it into a so called PSC file that Mapresso understands.
  8. After much trouble with the tool shp2psc, I was able to create the input files for mapresso.
  9. Next steps were easy. I generated the cartograms with 300-500 iterations of whatever algorithm mapresso uses. The more iterations, the finer  the map looks.
  10. After each map was generated, I took a screenshot (Print Screen) and pasted it into Paint.
  11. This was hard work. I decided to add a graphic to point to this blog (more hits please!).
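
Step 3 (folding the three new states back into their parent states) is the one part of this procedure that is easy to get wrong, so here is a minimal Python sketch. It assumes the merge is done on head counts rather than percentages, since percentages cannot simply be added. The counts are made up and the function name is hypothetical; the actual work was done by hand in the spreadsheet.

```python
# New state -> parent state present in the old SHP file (from step 3).
PARENT_STATE = {
    "Uttaranchal": "UP",
    "Jharkhand": "Bihar",
    "Chattisgarh": "MP",
}


def merge_into_parents(counts):
    """Add each new state's head count to its parent state's count.

    States that are not in PARENT_STATE are passed through unchanged.
    """
    merged = {}
    for state, value in counts.items():
        target = PARENT_STATE.get(state, state)
        merged[target] = merged.get(target, 0) + value
    return merged


# Made-up head counts, for illustration only.
counts = {"UP": 100, "Uttaranchal": 10, "Bihar": 80, "Jharkhand": 20, "Kerala": 30}
merged = merge_into_parents(counts)
print(merged)
```

Once both the group counts and the total populations are merged this way, the percentage for the combined unit is just the merged group count divided by the merged population.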

Feel free to use the cartograms or data whichever way you choose. Just mention the original source of these cartograms. You know where you found them.

Any problems with the data or methodology ? Please leave a comment here.

Enjoy ! See a separate post with all the cartograms.