In the age of hi-tech data crunching, data collection in India still relies on pen and paper
Shivangi Narayan | August 2, 2014
Do you want to know how all the statistics and figures you see tossed around by the government every other day are collated? Think of the people you see roaming around with huge sheaves of paper, ticking off responses as they move from home to home.
These people are field surveyors, the eyes and ears of the national sample survey office (NSSO), under the jurisdiction of the ministry of statistics and programme implementation (MOSPI). It is 2014 but paper still rules their lives.
For the field officials of MOSPI, the overarching agency for collating data and figures for the government, laptops are, till date, the only ‘technology’ with which they collect and process data that serves as the backbone of information the government uses to plan and draft welfare schemes for 1.2 billion Indians.
A socio-economic survey, according to an NSSO official, can be as long as 20 pages. If properly filled, one such schedule can take two hours or more. After cleaning for inconsistencies at the regional level, the responses are sent to the respective data processing divisions (DPD), where they are again entered into computers and then processed to derive the required information.
Eight such divisions in Delhi, Kolkata (the city has three divisions), Bangalore, Nagpur, Ahmedabad and Giridih (Jharkhand) compile data for the $1.3-trillion economy that is India. The report thus generated helps NSSO process and release 116 economic numbers every year.
These include statistics like GDP, inflation (through consumer price index, or CPI, and wholesale price index, or WPI), index of industrial production, or IIP (through annual survey of industries), consumption patterns, and income distribution (through socio-economic surveys of rural and urban areas).
Data collection: what it entails
Though CPI and WPI numbers are used to calculate inflation, the government also uses them to monitor and reorganise the sale of essential commodities. In one such case, the government increased the minimum export price of onion to $500 per tonne this July.
Socio-economic numbers are also used to calculate expenditures on schemes like MNREGS and Sarva Shiksha Abhiyan. These two schemes received ₹34,000 crore and ₹28,635 crore, respectively, in Arun Jaitley’s budget this year.
Similarly, IIP is used to make long-term plans for industries. According to current budget estimates, growth in the manufacturing sector declined to 1.9 percent in 2012-13 from 9.7 percent in 2010-11, prompting the government to pump more money into the sector.
Experts, however, indicate that the whole process of data compilation to decision-making can be cut down to a fourth of its time with the use of simple, existing technology. A geo-stamped handheld device that records data in digital format, for example, can help remove the need to manually enter data in computers.
MOSPI: A fit case for technology intervention
Under its mandate, the ministry of statistics and programme implementation (MOSPI) does not have much role in analysing data. It is more a sourcing agency whose role in analytics is limited to GDP calculation and a few other economic and social indicators.
According to the ministry’s secretary, TCA Anant, MOSPI depends on many sources for data collection, and standardising the data collection process and digitising records is the need of the hour. (See Anant’s interview.) This also requires a lot of process re-engineering.
With paper surveys, a supervisor needs to monitor each surveyor to guard against bogus data entry and surveys conducted at the wrong location. Inspectors, additional director generals (ADG) and deputy director generals (DDG) from the field operations division (FOD) of NSSO carry out surprise checks to see whether surveyors are conducting surveys without any glitches. All of them have to submit a monthly progress report to their zonal FOD offices, from where it is sent to Kolkata for monitoring the survey progress.
Much of the time and money spent on monitoring could also be saved with the above-mentioned device. Being GPS-enabled and time-stamped, it would show where each interview took place and how much time each surveyor spent with a respondent and on each question. The small device would thus collect data in a usable format and also help monitor the quality of the data and the work of each surveyor.
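What such a geo-stamped, time-stamped record might look like can be sketched in a few lines of Python. This is only an illustration, not the format of any device NSSO has evaluated: the class names, field names, identifiers and coordinates below are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Answer:
    """One response, with the times the question was opened and closed."""
    question_id: str          # hypothetical question code
    value: str
    started_at: datetime
    finished_at: datetime

    def seconds_spent(self) -> float:
        return (self.finished_at - self.started_at).total_seconds()


@dataclass
class Interview:
    """One household interview, stamped with the surveyor and the location."""
    surveyor_id: str          # hypothetical identifier
    household_id: str         # hypothetical identifier
    latitude: float           # as reported by the device's GPS
    longitude: float
    answers: List[Answer] = field(default_factory=list)

    def total_seconds(self) -> float:
        # Time spent with the respondent, summed over all questions.
        return sum(a.seconds_spent() for a in self.answers)


# Illustrative data only.
interview = Interview("S-01", "HH-001", latitude=28.6139, longitude=77.2090)
interview.answers.append(
    Answer("q1", "4", datetime(2014, 8, 1, 10, 0, 0), datetime(2014, 8, 1, 10, 0, 45))
)
interview.answers.append(
    Answer("q2", "15000", datetime(2014, 8, 1, 10, 0, 45), datetime(2014, 8, 1, 10, 2, 15))
)
print(interview.total_seconds())
```

A supervisor could, in principle, compare the recorded coordinates with the assigned location and flag interviews that were completed implausibly fast, instead of relying only on surprise checks.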
But as the world moves towards sleeker devices, all that some surveyors get are bulky laptops. “The surveyors who conduct surveys for commodity price index (CPI) and annual survey of industries (ASI) carry laptops for instant data collection of commodities,” said RK Singh, AD, NSSO (FOD).
The use of laptops, however, has its own limitations. “It is easy to use laptops for the ASI and CPI surveys, as they are conducted in industries and in urban settings. But carrying a heavy laptop in rural areas for socio-economic surveys is a problem,” said Rakesh Kumar, DDG (CPD).
The surveys are time-consuming and many surveyors work on 28 schedules in a survey period (six months to a year). “With 28 schedules, most surveyors can only complete two schedules in a month,” Singh said. A surveyor’s job involves extensive travel and data collection where laptops can be a liability more than a useful device. Besides their weight, it is difficult for surveyors to find the time and place to charge the laptops, he said.
Lack of training in using a laptop efficiently also pushes surveyors towards using paper and pen to record responses.
On the software side, Singh said they now have a web portal for the ASI so that industries can directly send data to the government. “It was inaugurated on September 16, 2013 in Kolkata,” he said.
Small handheld devices, such as those described above, have already made a foray into Indian government data collection. Known as computer-assisted personal interviewing (CAPI) devices, they are currently being used by the national health and family welfare (NHFW) department for conducting field health surveys. The World Bank exclusively provides CAPI devices, along with training on how to use them. In the case of NHFW, the devices are returned after the survey is completed.
Though successful in the NHFW surveys, CAPI is not being adopted in NSSO’s socio-economic sample surveys. At least not as of now. “CAPI has made a presentation to us but we are not using it because it does not support stratified random sample surveys, the kinds that we do at NSSO,” said Rakesh Kumar, DDG (CPD) at NSSO.
According to him, the NSSO socio-economic surveys are sample surveys where households are divided according to ‘stratum’ in order to make the surveys representative. “CAPI does not have this feature, which is necessary for survey listing,” he said, adding that, separately from CAPI, the data processing division (DPD) of NSSO is developing software that will help it automate the survey process.
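The idea behind a stratified random sample can be shown with a small Python sketch: households are first grouped by stratum, and a random draw is then made within each group so that every stratum is represented. This is a generic textbook illustration, not NSSO's actual sampling design; the function name, the strata labels and the sample sizes are assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(households, stratum_of, per_stratum, seed=0):
    """Draw up to `per_stratum` households at random from each stratum.

    `stratum_of` is a function that returns the stratum label of a household.
    Returns a dict mapping each stratum label to its sampled households.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    by_stratum = defaultdict(list)
    for household in households:
        by_stratum[stratum_of(household)].append(household)

    sample = {}
    for stratum, members in by_stratum.items():
        k = min(per_stratum, len(members))  # a stratum may be smaller than the quota
        sample[stratum] = rng.sample(members, k)
    return sample


# Illustrative listing: 10 households, with hypothetical strata labels.
listing = [{"id": i, "stratum": "affluent" if i < 3 else "other"} for i in range(10)]
chosen = stratified_sample(listing, lambda h: h["stratum"], per_stratum=2)
for stratum, members in chosen.items():
    print(stratum, [h["id"] for h in members])
```

Sampling within each stratum, rather than from the whole listing at once, is what prevents a small group from being missed entirely by chance.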
“We are in the testing phase. The software works on laptops but still has to be tested on field. Also, we would like it to work on tablets or small devices because it is not possible to conduct socio-economic surveys with laptops,” Rakesh Kumar said.
The organisation, he said, will need between 12 and 18 months to completely move to the digital survey process. “We hope to move toward using tablets and other devices on field because it makes the process faster. It also checks for inconsistencies on the field itself, so data collected is more accurate,” Kumar said.
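The kind of on-the-spot inconsistency check Kumar describes might look something like the following Python sketch. The field names and the rules are hypothetical, chosen only to illustrate the idea; they are not taken from any NSSO schedule or software.

```python
def check_schedule(record):
    """Return a list of problems found in one filled schedule (empty if none).

    `record` is a dict with hypothetical keys: household_size,
    earning_members and monthly_expenditure.
    """
    problems = []

    size = record.get("household_size")
    earners = record.get("earning_members")
    spend = record.get("monthly_expenditure")

    if size is None or size < 1:
        problems.append("household_size must be at least 1")
    if earners is not None and earners < 0:
        problems.append("earning_members cannot be negative")
    if size is not None and earners is not None and earners > size:
        problems.append("earning_members exceeds household_size")
    if spend is None or spend < 0:
        problems.append("monthly_expenditure must be zero or more")

    return problems


# Illustrative record with one deliberate inconsistency.
print(check_schedule(
    {"household_size": 3, "earning_members": 5, "monthly_expenditure": 15000}
))
```

Run on the device while the surveyor is still with the respondent, a check of this sort lets the answer be corrected on the spot rather than weeks later at a data processing division.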
Technology: A grass-roots issue?
A large chunk of government data is collected at the block or district level for different ministries. Digital penetration has not reached ground-level offices in India, so these district or block-level administrations do not keep their records in a digital format. From lack of infrastructure such as computers to lack of training to operate this infrastructure and lack of power for smooth running of the machines, there are many reasons that hinder the use of technology for recording data in a digital format.
The data is thus entered – again manually – from physical books and sent to state and central ministries. There are delays and there are errors in copying from the books to the computer. According to MOSPI secretary TCA Anant, due to lack of technology penetration there is a need to launch data collection drives where citizens are encouraged to provide information to the government.
Besides making data collection slower and less accurate, not using technology makes the entire process tedious and more expensive. Current data is not available for policy design and approximations take the place of concrete data. Standardisation of data is not possible since data is collected and processed in different formats. Lack of standardisation means data of one department lies in a silo without being integrated with that of other departments, and without giving any meaningful information to people.
Accuracy of data depends not just on how it is captured but also on its source. According to Singh, most people are not interested in providing accurate information to the surveyors. “We get ₹15,000 as monthly expenditure from a wealthy bureaucrat in Moti Bagh (south Delhi), who has three cars. In such a case, the poverty line will come out to be ₹42 (as in the present case),” Singh said.
Citizens should understand their responsibility to provide correct data to the surveyors. It is a process of nation building, he added.
A look at MOSPI
Set up in 1999 with the merger of the department of statistics and the department of programme implementation, MOSPI is responsible for compiling statistics on national accounts, data from informal sector and large-scale sample surveys, and conducting censuses. It is also responsible for providing service-sector statistics, data from non-observed economy, social sector and environmental statistics to support decision-making at the highest level.
At the heart of all this is MOSPI’s statistics wing: the national statistical office (NSO), which consists of the central statistical office (CSO), the computer centre, and the NSSO.
Similarly, NSO acts as the nodal agency for planned development of India’s statistical system and is responsible for creating the database needed to study the impact of specific problems for the benefit of different population groups in diverse socio-economic areas, such as employment, consumer expenditure, housing conditions and environment, literacy levels, health, nutrition and family welfare, among others.
(The story appeared in the August 1 to 15 issue of the magazine)