“A little knowledge that acts is worth infinitely more than much knowledge that is idle.” –Kahlil Gibran
Scalability, at the end of the day, has a lot to do with time management. Every little act that results in improved efficiency counts when you’re trying to scale a process or an organization.
Yet organizations have come to accept many inefficient habits, whether it’s a 15-minute deep dive into Google or Stack Overflow for a problem a colleague may already have solved, or piecing together information about a past project from different stakeholders. These habits are not only time-consuming but also, at times, frustrating and unproductive. In short, you end up losing many hours reinventing a wheel your colleague has already built.
We wanted to improve our efficiency and productivity through a combination of a central knowledge sharing framework and better practices. With that in mind, we set out to look for methods to help us manage knowledge in a better manner. That’s when we discovered the Knowledge Repository.
The Knowledge Repository framework
In 2016, Airbnb open-sourced the Knowledge Repository, a knowledge sharing platform they built to scale their own internal knowledge sharing. Although the project is focused primarily on sharing knowledge between data scientists and other technical roles, its open-source nature and use of the ubiquitous Markdown format allow anyone to improve and modify it. Also, having an active Airbnb team looking after its development and maintenance counts for something.
A knowledge post, a Markdown document with a specific header format, is the basic unit of the Knowledge Repository. There are 3 supported formats that can be converted into a knowledge post: Markdown, R Markdown, and Jupyter Notebooks.
Example knowledge post header:
---
title: This is a Knowledge Template Header
authors:
- Akash T
tags:
- knowledge
- example
created_at: 2016-06-29 00:00:00
updated_at: 2017-11-16 21:21:54.284249
tldr: This is short description of the content and findings of the post.
---
The ‘tags’ (knowledge, example) in the above example allow you to group documents under categories that you can easily search using the Knowledge Repository’s Flask web app. You can associate as many tags as you want with a document.
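To make the role of the header concrete, here is a minimal Python sketch that reads a header of the shape shown above and groups posts by tag. It is an illustration only and not how the Knowledge Repository itself parses or indexes posts: `parse_header` and `build_tag_index` are hypothetical helpers, and the hand-rolled parser assumes the header contains nothing beyond simple `key: value` lines and `- item` lists.

```python
from collections import defaultdict

# Example post: the header from the text, plus a placeholder body.
EXAMPLE_POST = """---
title: This is a Knowledge Template Header
authors:
- Akash T
tags:
- knowledge
- example
created_at: 2016-06-29 00:00:00
updated_at: 2017-11-16 21:21:54.284249
tldr: This is short description of the content and findings of the post.
---
Post body goes here.
"""


def parse_header(post_text):
    """Parse the `key: value` / `- item` header between the two `---` lines."""
    lines = post_text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("post does not start with a '---' header")

    header = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            # Closing delimiter: the header is complete.
            return header
        if stripped.startswith("- ") and isinstance(header.get(current_key), list):
            # List item belonging to the most recent key with an empty value.
            header[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            # Split on the first colon only, so timestamps stay intact.
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            header[current_key] = value if value else []
    raise ValueError("header is missing its closing '---'")


def build_tag_index(posts):
    """Map each tag to the titles of the posts that carry it."""
    index = defaultdict(list)
    for text in posts:
        header = parse_header(text)
        tags = header.get("tags") or []
        if isinstance(tags, str):
            # A single inline tag, e.g. `tags: example`.
            tags = [tags]
        for tag in tags:
            index[tag].append(header.get("title", ""))
    return dict(index)
```

Calling `build_tag_index([EXAMPLE_POST])` files the example post under both `knowledge` and `example`, which is the kind of grouping a tag search relies on.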
The knowledge-repo Python package comes with a command-line tool and a web application. You’ll need to create and host (on GitHub, for example) a Git repository for storing the knowledge posts. The command-line tool can be used to create, add and push posts from a local machine to the remote Git repository. The web application can render posts directly from the Git repository. Using a webhook, the GitHub repository can be connected to a clone of the repository on the server (hosted on AWS, for example) that runs the web app, so that the clone stays up to date.
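One way to picture the webhook link is as a small listener on the server that refreshes the clone whenever the remote repository reports a push. The sketch below is a standard-library illustration under stated assumptions, not part of the knowledge-repo package: `REPO_CLONE_PATH`, `pull_command`, `WebhookHandler`, and `serve` are hypothetical names, and the path and port are placeholders.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical location of the server-side clone that the web app renders from.
REPO_CLONE_PATH = "/srv/knowledge-repo"


def pull_command(repo_path):
    """Build the git command that fast-forwards the clone at repo_path."""
    return ["git", "-C", repo_path, "pull", "--ff-only"]


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the request body. A production receiver should also verify
        # the webhook signature before acting on the delivery.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)

        # Refresh the clone and report success or failure to the sender.
        result = subprocess.run(pull_command(REPO_CLONE_PATH))
        self.send_response(200 if result.returncode == 0 else 500)
        self.end_headers()


def serve(port=8000):
    """Listen for webhook deliveries until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. Any equivalent mechanism, such as a scheduled `git pull`, would keep the clone current in the same way.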
You can install and get started with the package by following the instructions here. Here’s what a basic set-up would look like.
The stand-alone framework can be extremely useful, particularly for technology teams. However, in order to use it as a central knowledge management tool, you may need to modify or enhance it as per your overall organization’s workflow. At Atlan, we’ve done the same and are building an environment around the core Knowledge Repository application.
From a user perspective, there are 5 components in our knowledge environment. Three of them (the command-line tool, the RStudio add-in, and the Quip integration) are ways to create knowledge posts. Depending upon a user’s role and workflow, they may primarily use one or another. Here’s a brief description of each component:
- Knowledge repository: This is a git repository which holds all the knowledge posts. The web app (described next) renders posts from this repository. Unless you’re using the Quip integration (described below), you’ll need to clone the repo from GitHub.
- Web app: This is where knowledge posts are hosted for users to view them. You can search for posts by navigating a folder structure, similar to Quip/Google Drive, or by searching for a tag or author.
- Command line tool: This comes with the knowledge-repo Python package. It can be used to create and add knowledge posts to the repository from the command line. The syntax is as simple as this:
knowledge_repo add ~/Documents/my_post.Rmd [-p projects/test_project] [--update]
Instructions for its use can be found in the official GitHub repository’s README.
- RStudio add-in: This will allow R users to interact with the Knowledge Repo and create posts using R Markdown from within RStudio itself. We’ll share tips about creating your own add-in in a later post.
- Quip integration: We used the Automation API to connect the Quip web application with our knowledge repository. This allows us to render documents created using Quip directly to the Knowledge Repository app. We’ll share more details about this part in a later post.
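For teams that want to script the command-line step described in the list above, the `knowledge_repo add` syntax can be wrapped in a few lines of Python. This is a rough sketch that mirrors only the arguments quoted in this post (`-p` and `--update`). `build_add_command` and `add_post` are hypothetical helper names, and running `add_post` assumes the `knowledge_repo` tool is installed and already pointed at your repository.

```python
import os
import subprocess


def build_add_command(post_path, project_path=None, update=False):
    """Assemble `knowledge_repo add <post> [-p <project>] [--update]` as a list."""
    # Expand a leading "~" ourselves, since no shell is involved.
    command = ["knowledge_repo", "add", os.path.expanduser(post_path)]
    if project_path is not None:
        command += ["-p", project_path]
    if update:
        command.append("--update")
    return command


def add_post(post_path, project_path=None, update=False):
    """Run the command; raises CalledProcessError if the tool exits non-zero."""
    subprocess.run(build_add_command(post_path, project_path, update), check=True)
```

For example, `add_post("~/Documents/my_post.Rmd", project_path="projects/test_project", update=True)` runs the same command as the syntax line shown earlier with both optional arguments supplied.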
Putting it all together
After incorporating the RStudio add-in and Quip integration, here’s what our revised knowledge flow looks like.
Concluding thoughts on the Knowledge Repository
The Knowledge Repository framework is a work in progress, and so is our knowledge environment. However, with the amount of information/data and resulting noise going around these days, efficient and scalable knowledge management is the need of the hour. Simply put, learning from your colleagues’ knowledge and experience can be much better than spending hours searching for half-baked solutions on the internet. To that end, the Knowledge Repository definitely offers a promising approach.
You can expect detailed posts in the near future about the components of our knowledge environment described above. Until then, happy learning!