Category

DataOps

Nothing is more frustrating than wrapping up a lengthy data collection exercise, aggregating all the data and looking through it, only to find missing data. At best, these missing values are a nuisance that can be fixed with a bit of work. At worst, they pose an intimidating threat to data quality and your sample size. How can you assess…

For data collected through both paper and digital surveys, you should conduct some basic data checks before carrying out thorough data cleaning. Keep reading for 4 basic data checks that you can use to check for underlying errors in almost any data set.

Number of Respondents vs. Rows

For any kind of survey, you should always match the number of rows…

The number of villages in India is anywhere between 600,000 and one million, according to various government databases. Both the count and the definition of a village vary across databases, which makes it challenging to prepare a village development plan that spans sectors. There are 649,481 villages in India, according to Census 2011, the most authoritative source of information about administrative boundaries…

With data scientist being hailed as the sexiest job of the 21st century, there has been an influx of “big data” companies, visualization tools, and other products. But unless the input data is cleaned and managed, all these products are fairly useless. As the saying goes: Garbage in, garbage out! This blog post is about the un-sexy aspects of data science – the practices…