India has anywhere between 600,000 and one million villages, depending on which government database is consulted. Both the count and the definition of a village vary across databases, which makes it hard to draw up a village development plan that cuts across sectors.
There are 649,481 villages in India, according to Census 2011, the most authoritative source of information about administrative boundaries in the country. Of these, 593,615 are inhabited.
The Mahatma Gandhi National Rural Employment Guarantee Act (MGNREGA), India’s labour law that guarantees 100 days of wage employment a year to rural households, considers villages and their hamlets as distinct entities. Its tally of administrative units therefore exceeds one million, according to the author’s discussions with various officials.
The Ministry of Drinking Water and Sanitation puts the figure in its Integrated Management Information System (IMIS) database at 608,662. The ministry itself points out the mismatch in a report.
The Swachh Bharat Abhiyan (Gramin) report by the same ministry pegs the number of villages at 605,805.
How the Number of Villages Varies Across Different Government Databases
Clearly, there is no single authoritative estimate of the number of villages in the country.
An accurate count of villages matters for planning and financing. A village is the basic unit of administration and fiscal governance, and has been so since colonial times, when the office of District Collector was set up to collect taxes from its holdings, the precursors of modern-day villages. Thanks to that legacy, the department of revenue is the government’s oldest department, and it is the one that defines and recognises a village.
As a result, even when a ministry (say, the Ministry of Rural Development) recognises new villages, they are not automatically added to the list of the revenue department. The list of revenue villages is updated for the use of all other departments only when enumeration for the Census starts, once every 10 years. There is, therefore, an ever-growing number of administrative entities that are not immediately registered in a single composite database.
Why Are the Data in Silos?
If the Census is the most comprehensive record of the country’s demographic, social and economic information, why can it not be used as the base for defining geographic units?
To begin with, the Census lists villages other than revenue villages, both inhabited and uninhabited. Roughly 56,000 of the Census villages are uninhabited, going by the figures above. There are also forest villages: settlements inside forest areas that a state forest department classifies as such through the process of forest reservation. Then, there are villages that the Census has not covered, called unsurveyed villages. A Government of India notification has asked for all forest villages and unsurveyed villages to be converted into revenue villages so that the people living there, most of whom are classified as tribal, get government welfare benefits.
Not all government information systems follow the Census list of villages, for the simple reason that the basic unit of administration is not the Census village for everyone. A case in point is MGNREGA, which treats a revenue village and its hamlets as distinct entities, giving us a tally of over one million villages.
In fact, different government departments adopt different administrative units to operate and monitor their programmes. For example, all rural development schemes operate at the gram panchayat level. Depending on the size of population, a gram panchayat may consist of a single village or a cluster of adjoining villages. There are an estimated 262,800 of these gram panchayats in the country today.
The Health Department’s lowest administrative unit is a sub-centre, which services a population of 5,000 people. The Ministry of Human Resource Development’s flagship scheme, Sarva Shiksha Abhiyaan, operates at the school level—a village typically has more than one school. The Ministry of Women and Child Development operates at the level of anganwadi (courtyard shelter)–one for every 1,000 people.
So a school, a gram panchayat, an anganwadi and a sub-centre each serve a different population range, and therefore do not map one-to-one onto Census villages.
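The arithmetic behind this mismatch can be sketched with the population norms quoted above. This is only an illustration: the three village populations are hypothetical, and `units_needed` is a helper name introduced here, not part of any government system.

```python
# Population norms quoted in the article (people served per unit).
NORMS = {
    "anganwadi": 1_000,
    "health sub-centre": 5_000,
}


def units_needed(population: int, people_per_unit: int) -> int:
    """Smallest number of units that covers the population (ceiling division)."""
    return -(-population // people_per_unit)


# Hypothetical gram panchayat made up of three adjoining villages.
village_populations = {"Village A": 5_000, "Village B": 2_500, "Village C": 1_500}
total = sum(village_populations.values())

# Number of service units implied by each norm for the whole cluster.
coverage = {unit: units_needed(total, norm) for unit, norm in NORMS.items()}

print(f"{len(village_populations)} villages, 1 gram panchayat, population {total}")
for unit, count in coverage.items():
    print(f"{unit}: {count}")
```

Under these assumptions, one gram panchayat of three villages implies two sub-centres and nine anganwadis, so none of the service units lines up with an individual village.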
Operational Structure of Sarva Shiksha Abhiyaan, National Rural Health Mission, and National Rural Livelihood Mission
This makes it difficult to get a conclusive assessment of all the schemes at one level. A unified report at the sub-district or village level across education, health and livelihood is simply not feasible.
Implications of Not Having a Single Source of Geographical Units
What this means is that while the MGNREGA database would recognise a particular village, the Swachh Bharat Abhiyan database might not. This makes planning and budgeting at the village level extremely difficult, because matching data across sectors is hard to come by. It is why creating granular plans for gram panchayats across all central schemes is tough, and why tracking development gaps for different geographies becomes a herculean task.
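A minimal sketch of this kind of mismatch, assuming each database can be reduced to a set of unit names. The names below are placeholders, not real records, and `reconcile` is a hypothetical helper.

```python
def reconcile(left: set[str], right: set[str]) -> dict[str, set[str]]:
    """Split two unit lists into matched units and units found on one side only."""
    return {
        "matched": left & right,
        "only_left": left - right,
        "only_right": right - left,
    }


# Hypothetical unit lists for the same area. The first list counts hamlets
# as separate units, as the article says MGNREGA does.
mgnrega_units = {
    "Village A",
    "Village A / Hamlet 1",
    "Village A / Hamlet 2",
    "Village B",
}
swachh_bharat_units = {"Village A", "Village B", "Village C"}

result = reconcile(mgnrega_units, swachh_bharat_units)
for label, units in result.items():
    print(label, sorted(units))
```

Only the matched units can be compared across the two schemes directly; everything else needs a crosswalk to a common master list before village-level budgeting is possible.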
In 2016, the Ministry of Rural Development decided to implement the district monitoring programme, Disha, for more coordinated tracking of 28 government schemes to help elected representatives understand the development needs of the different administrative units. Look at the table (Fig 3) below to get a sense of the geographic granularity of different scheme databases under the Disha programme.
The programme requires that a Member of Parliament (MP) assess the performance of these 28 central schemes at the district level and take action based on how the district is performing. This cannot be done unless all the data are in one place. That unified database is yet to be created. Until then, an MP will face two major challenges.
First, understanding which specific regions need to catch up in overall development would require many customised reports across sectors. For example, if Bulandshahr district in Uttar Pradesh is not performing on par with others, the MP will need several reports to figure out which villages are lagging behind, and on which parameters. Or, if a sub-centre is faring sub-optimally, its data might not show which specific villages need better infrastructure, especially if that sub-centre serves more than one village.
Second, depending on the department, the sub-units of a district may be either blocks or tehsils, making it difficult to get a holistic development view of a tehsil or sub-district. That is because there is no mapping of blocks to tehsils.
What Is the Government Doing to Standardise Geographies?
The Central Government launched the Local Government Directory (LGD) to encourage all state departments to record newly formed panchayats and local bodies, as well as any reorganisation of them, so that every government body is mapped to its constituent geographies and complies with the Census 2011 classification.
The Ministry of Panchayati Raj, responsible for creating and maintaining the LGD, will work through a team of coordinators to ensure that the LGD is updated with the latest data from all the districts of the country; this will help to prepare the complete and final database of all villages in the country.
The LGD will also include inputs from all the ministries it works with so that all databases follow the same geographic base, helping create a unique master list of administrative units.
However, adoption has been low and use of the directory is far from ingrained. States are failing to update the LGD regularly. As per LGD’s Updation Report, only Andhra Pradesh and Puducherry have completed updating the status of their Panchayati Raj Institutions.
Using a Master Geography Curtails Mismatches Significantly
Atlan, a New Delhi-based data intelligence company that has worked with different government departments, has created a geography standardisation tool as part of its platform that reconciles the anomalies in different geographies occurring in government datasets. The aim is to develop a composite master database so that datasets for education, demographics, livelihood, health, and so on, that currently exist in silos can correspond with and talk to each other.
An example from a case study shows how a data mismatch caused by confusing names can be tackled. The Census identifies a sub-district in Nashik, Maharashtra, as “Yevla”. However, some of the government information systems that Atlan has worked with, including that of the Nashik district administration, use the name “Yeola” for the sub-district. Further, this sub-district is easily confused with another sub-district called Deola, also in Nashik. As a result, Yeola tends to get replaced with Deola, given it is the closest match, and all data collected for the former get reported under Deola. A standardisation tool that automatically replaces Yeola with Yevla prevents such inaccuracies from creeping into the datasets.
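The Yeola example can be sketched in a few lines. This is not Atlan’s implementation, only an illustration of the idea: a curated alias table is consulted before any nearest-match guess is made. `ALIASES`, `CENSUS_SUBDISTRICTS`, `edit_distance` and `standardise` are hypothetical names, and the lists hold only the sub-districts named above.

```python
from typing import Optional

# Canonical Census spellings (only the two sub-districts named in the article).
CENSUS_SUBDISTRICTS = {"Yevla", "Deola"}

# Known variant spellings mapped to the Census spelling (hypothetical table).
ALIASES = {"yeola": "Yevla"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def standardise(name: str) -> Optional[str]:
    """Return the Census spelling, or None so the record is flagged for review."""
    key = name.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    for canonical in CENSUS_SUBDISTRICTS:
        if canonical.lower() == key:
            return canonical
    return None  # never guess: an unknown name goes to manual review


# "Yeola" is one edit away from both candidates, so distance alone is ambiguous.
print(edit_distance("yeola", "deola"), edit_distance("yeola", "yevla"))
print(standardise("Yeola"))
```

Because “Yeola” is a single letter away from both “Deola” and “Yevla”, a nearest-match rule has no reliable way to choose between them. That is the reason for resolving known variants through an explicit table first and leaving unknown names unmatched.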
The geography standardisation tool also helps to correctly match one dataset against another, creating a database of geographies across sectors and across time periods. The system is intelligent enough to know, for example, that Panchsheel Nagar was created out of Ghaziabad district in 2011, and was renamed Hapur in 2012. Datasets that report Hapur from 2012 onwards can therefore be matched with earlier data reported under Ghaziabad district.
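One way to picture matching across time periods is a small change log that is walked backwards. The sketch below uses only the Ghaziabad, Panchsheel Nagar and Hapur history given above; the `Change` record and the `name_in_year` helper are hypothetical and not drawn from any actual tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    year: int   # year the change took effect
    kind: str   # "split" (carved out of old) or "rename"
    old: str    # name before the change (parent district, for a split)
    new: str    # name after the change


CHANGES = [
    Change(2011, "split", "Ghaziabad", "Panchsheel Nagar"),
    Change(2012, "rename", "Panchsheel Nagar", "Hapur"),
]


def name_in_year(name: str, year: int) -> str:
    """Name under which today's district would have been reported in `year`."""
    current = name
    # Undo, newest first, every change that happened after the requested year.
    for change in sorted(CHANGES, key=lambda c: c.year, reverse=True):
        if change.new == current and change.year > year:
            current = change.old
    return current


for year in (2010, 2011, 2013):
    print(year, name_in_year("Hapur", year))
```

A real lineage table would also need to handle merges and areas drawn from more than one parent district, which this sketch deliberately leaves out.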
The article was originally published in IndiaSpend and has been reprinted here with permission. IndiaSpend.org is a data-driven, public-interest journalism non-profit organisation.