Category

Data Science

Knowledge is at the core of any successful initiative. In the age of data, one would think that the task of incorporating knowledge would be easier than ever. However, the three Vs of data (volume, velocity and veracity) make the task anything but easy. What is an effective way to do so? Enter knowledge graphs. Graphical representation of knowledge has been…

Graphs are elegant and powerful data structures that find application across multiple industries and organizations of varying scale. This article aims to introduce and demystify graph databases and the field of graph tech to the humans of data. Why should you care about graphs? Graphs are an elegant representation of data and cater to various use cases. As a human…

At the 14 July R User Meetup, hosted at Atlan, I had the pleasure of briefly introducing the relatively new tidytext package, written by Julia Silge (@juliasilge) and David Robinson (@drob). Essentially this package serves to bring text data into the “tidyverse”. It provides simple tools to manipulate unstructured text data in such a way that it can be analyzed…

Unleashing the power of alternative data to transform India’s battle against malaria. Too many malaria cases, too little data. Today, India is fighting one of its toughest battles against malaria. With a malaria prevalence rate 1.5 times greater than that of Southeast Asia, India accounts for over 67% of malaria cases in the region. What’s even more worrying, however, is that this…

One of the first things we are taught in Programming 101 is to write well-structured, well-commented code. And, as any newbie would, we ignore this lesson and focus on achieving the end result. Recently, I wrote an R (the R language!) script to be run on files amounting to 30 GB! This was my first professional experience after…

What is an outlier? In short, it’s a data point that is significantly different from other data points in a data set. The long story? There isn’t a strong mathematical definition for what is or isn’t an outlier. In the end, detecting and handling outliers is often a somewhat subjective exercise. So how can you dive into a new data…