Learn what data silos are (definition); why they pose the biggest challenge to your big data strategy, pipelines and dreams; and how to break them down. First things first, if…
Data collection is a minefield of errors. It doesn’t matter whether you’re a researcher with one survey in the field, an NGO with 10 data collection drives per month, or…
Data is most valuable when you have something to compare it to, but these comparisons aren’t helpful if the data is bad or irrelevant. Data is most valuable when you…
We’re so excited to announce the launch of our second online course about geospatial data in R. Sign up here. When you hear “geospatial data”, what comes to your mind?…
Data visualization is a great way to represent huge amounts of data in a simple and intuitive fashion. All data visualizations have the same goal: help viewers easily grasp information…
The most challenging part about designing any survey is to identify and create the right set of questions. Typically, there are two types of survey questions – open-ended questions and…
Data Analyst / Scientist
Sean presented a talk about how he built this app at the December 2018 edition of the Delhi userR meetup. Learn more about the meetup here, or watch his talk…
“A little knowledge that acts is worth infinitely more than much knowledge that is idle.” –Kahlil Gibran Scalability, at the end of the day, has a lot to do with…
Learn what a data lake is (definition) and how to get the best value from it with a data catalog. Much like the term suggests, a data lake is literally…
As we at Atlan started to use big data (massive data from hundreds of sources), we quickly found that we needed to move everything to the cloud. With data sizes in TBs, we couldn’t keep a local copy of our data for analysis, so we needed to find a way to directly interact with data
All Articles
We hosted the Introduction to Machine Learning event, in collaboration with GDG Cloud and Women Techmakers Delhi. Here is a useful round-up and links to resources.
When we think of computers, we think of the twenty-first century. But did you know that India started using them back in the 1950s? Computers were the unexpected secret sauce…
Did you know that understanding the brain helps build better AI algorithms? Or that neuroscience can help validate AI techniques? Or that neuroimaging data science uses several Python packages such as NIPY or NiLearn? For a data scientist building predictive algorithms or an ML engineer building computational models that teach machines how to make decisions, unraveling the mystery of the…
This year, over 1,000 attendees, 40 speakers, 19 sponsors, and more people from the data science and tech community witnessed the magic of The Fifth Elephant. Now, have you ever…
July 20, 2019, marks fifty years since humans first landed on the moon in 1969. This historic feat is often associated with just two names: Neil Armstrong and Buzz Aldrin. But that’s not the complete picture. While we celebrate the legendary astronauts who first stepped on the moon, their achievement wouldn’t have been possible without…
Ever wondered what led to humans taking over the skies? Sure, we all know the Wright brothers were the pioneers of controlled flight. However, do you know why two flight…
You may be familiar with principles for good data visualization when it comes to ordinary bar plots, scatterplots, and line plots. However, geospatial data visualization has its own set of principles for effective and honest communication. The answer to what is the “best” style of geospatial data visualization often depends on the type of data
Data sets are most valuable when people can understand them. When done right, data visualization is a great way to display large amounts of information simply and intuitively. However, to ensure that visualizations are effective, it’s important to follow a few standards and avoid a few all-too-common mistakes. Data Visualization Do’s: Keep the visualizations simple. While many…
Sean presented a talk about how he built this app at the December 2018 edition of the Delhi userR meetup. Learn more about the meetup here, or watch his talk below. https://www.youtube.com/watch?v=c07AysEIo-g This was recorded as part of the community initiatives by SocialCops. If I told you that, according to the 2011 Census, 67% of Indian households had access to…
Paper-based data collection has been around as long as humans have had an interest in understanding the world around them — tick marks on parchment were used by ancient civilizations to track food inventory, and in the 1800s the first known census was collected via pen and paper. Now, digital and cloud-based systems for data collection are rapidly increasing, but…
We use GitHub issues to keep track of all issues. Please do not report bugs or issues in this blog’s comments. Instead, post them on GitHub as an issue. Before submitting a comment…
In our last post on Apache Airflow, we mentioned how it has taken the data engineering ecosystem by storm. We also talked about how we’ve been using it to move data across our internal systems and explained the steps we took to create an internal workflow. The ETL workflow (e)xtracted PDFs from a website, (t)ransformed them into CSVs and (l)oaded…
Even if you haven’t worked on Kubernetes, chances are you’ve at least heard or read about it. It is already one of the most popular open source projects ever, and…
Landsat is without a doubt one of the best sources of free satellite data today. Managed by NASA and the United States Geological Survey, the Landsat satellites have been capturing multi-spectral imagery for over 40 years. The latest satellite, Landsat 8, covers the entire Earth every 16 days and captures more than 700 satellite images per day across 9 spectral bands…
The development sector loves to measure — and maximize — its impact. Every penny spent is meant to reach beneficiaries, directly or indirectly. That means one of the biggest challenges…
The alternative data revolution isn’t about what alternative data to use. It’s about how to turn it into actionable insights. Road data for Pune, India, color-coded to show which roads are well-lit at night. (Source: Atlan.) If there is one thing every business in the world wants, it is market intelligence. By offering an untapped source of market intelligence, alternative…
If you’re making decisions based just on internal data, you’re already behind the curve. The sudden decline of Blackberry came as a shock to the world. A year before its revenue…
What is the first thing that comes to your mind upon hearing the word ‘Airflow’? Data engineering, right? For good reason, I suppose. You are likely to find Airflow mentioned in every other blog post that talks about data engineering. Apache Airflow is a workflow management platform. To oversimplify, you can think of it as cron, but on steroids! It…
Numbers are everywhere and drive our day-to-day lives. We take decisions based on numbers, both at work and in our personal lives. For example, an organization may rely on sales…