Chat-Your-Data Challenge

ChatGPT has taken the world by storm, and millions of people are using it. But while it’s great for general-purpose knowledge, it only knows what it was trained on, which is generally available internet data from before 2021. It doesn’t know about your private data, and it doesn’t know about recent sources of data.

Wouldn’t it be useful if it did? This is where LangChain comes in.

The goal of LangChain is to make it easier for everyone to develop language model applications. We recently published a guide on how to create your own ChatGPT over your data here. This included an example GitHub repo to start from and customize. Even so, there is a long tail of data sources to integrate with and write prompts for. We realized this after putting out a call to see what the most interesting integrations would be and getting an overwhelming response.

Gonna beef up the tutorials for how to create your own Chat-GPT over specific documents with @LangChainAI

What types of documents/knowledge bases would people want to have examples for? Eg Notion, Obsidian, webpages, etc
— Harrison Chase (@hwchase17) February 6, 2023

That's why we're launching the "Chat-Your-Data" Challenge: a week-long challenge to create ChatGPT over your own data sources.

Motivation

The motivation for doing this is, as always, to make it easier for everyone to develop language model applications. In particular, we believe that examples are critically important for helping people do so. Therefore, we are hoping to collect as many examples (data loaders + prompts) as possible, covering a wide range of data sources.

We will then put the data loading logic in LangChain, put the prompts in LangChainHub, and put the examples in the LangChain documentation to make it as easy as possible for others to get started.

How to get started

Clone the example GitHub repo
Customize the data source + prompts to your data (you can follow this tutorial)
Bonus: deploy a nice frontend to go along with it! We have an example deployment to Hugging Face spaces in the above tutorial.
Submit your entry with this form
Repeat!
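To give a feel for what "data loader + prompt" means in the steps above, here is a minimal sketch in plain Python. It is not the code from the example repo and it does not use LangChain's API: the `Document` type, the `split_text`, `load_markdown_dir`, and `build_prompt` helpers, and the prompt wording are all hypothetical names made up for illustration. It assumes your data is a folder of Markdown files; a real entry would swap in its own source and pass the resulting prompt to a language model.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass
class Document:
    """A chunk of text plus the file it came from (hypothetical type)."""
    text: str
    source: str


def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def load_markdown_dir(root: str) -> list[Document]:
    """Data loader: read every .md file under `root` and chunk it."""
    docs = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for chunk in split_text(text):
            docs.append(Document(text=chunk, source=str(path)))
    return docs


# Prompt: the wording is an assumption; customize it for your data source.
PROMPT_TEMPLATE = """You are an assistant answering questions about the user's notes.
Use only the context below. If the answer is not there, say you don't know.

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(question: str, docs: list[Document]) -> str:
    """Fill the template with the selected chunks and the user's question."""
    context = "\n\n".join(f"[{d.source}]\n{d.text}" for d in docs)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

In a full application you would typically select only the chunks most relevant to the question before calling `build_prompt`, rather than passing every document.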

Examples

We've created two example repos off of this example GitHub repo, to show what it might look like:

Notion: connect with your Notion
ReadTheDocs: connect with your ReadTheDocs site

Other ideas for sources that we saw from the above tweet are:

Obsidian
Gong calls
PDFs
Audio files (can use Whisper!)
Git repos
Arbitrary websites

And lots, lots more! If you're looking for ideas, just look in the replies to this tweet.

Will there be a winner?

Yes! What is a challenge without a winner?

The rules of engagement are as follows:

At the end of each day, we will tweet from our Twitter account a list of all the example GitHub repos submitted through the submission form
At the end of this week (2/12), we will freeze submissions and post a tweet thread with all the GitHub repos submitted
Whichever repo has the most stars by 2/19 will be the winner!

What do I win?

A limited edition LangChain t-shirt.
