🔌 Data Connectors (LlamaHub)
Our data connectors are offered through LlamaHub 🦙. LlamaHub is an open-source repository containing data loaders that you can easily plug and play into any LlamaIndex application.
Some sample data connectors:
- Local file directory (SimpleDirectoryReader). Can support parsing a wide range of file types: .pdf, .jpg, .png, .docx, etc.
- Notion (NotionPageReader)
- Google Docs (GoogleDocsReader)
- Slack (SlackReader)
- Discord (DiscordReader)
- Apify Actors (ApifyActor). Can crawl the web, scrape webpages, extract text content, and download files including .pdf, .jpg, .png, .docx, etc.
Each data loader contains a “Usage” section showing how that loader can be used. At the core of using each loader is a download_loader function, which downloads the loader file into a module that you can use within your application.
Example usage:
from llama_index import GPTVectorStoreIndex, download_loader

# Fetch the GoogleDocsReader loader class from LlamaHub
GoogleDocsReader = download_loader('GoogleDocsReader')

# Load the Google Docs with the given document IDs
gdoc_ids = ['1wf-y2pd9C878Oh-FmLH7Q_BQkljdm6TQal-c1pUfrec']
loader = GoogleDocsReader()
documents = loader.load_data(document_ids=gdoc_ids)

# Build a vector index over the documents and query it
index = GPTVectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query('Where did the author go to school?')
print(response)
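To make the "downloads the loader file into a module" step more concrete, here is a minimal standard-library sketch of the general idea: a Python source file is written to a cache directory, imported as a module, and a class is looked up by name. This is not LlamaIndex's actual implementation. The names `fake_download_loader`, `load_class_from_file`, `EchoReader`, and `LOADER_SOURCE` are hypothetical, and the network fetch is replaced by a hard-coded string.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical loader source standing in for a file fetched from LlamaHub.
LOADER_SOURCE = '''
class EchoReader:
    """Toy loader: wraps each input string in a dict acting as a document."""

    def load_data(self, texts):
        return [{"text": t} for t in texts]
'''


def load_class_from_file(path, class_name):
    """Import the Python file at `path` as a module and return one class from it."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name)


def fake_download_loader(class_name, cache_dir):
    """Hypothetical stand-in for download_loader: write the file, then import it."""
    path = Path(cache_dir) / "echo_reader.py"
    # Assumption: a real implementation would fetch this source over the network.
    path.write_text(LOADER_SOURCE)
    return load_class_from_file(path, class_name)


with tempfile.TemporaryDirectory() as cache_dir:
    EchoReader = fake_download_loader("EchoReader", cache_dir)
    documents = EchoReader().load_data(["hello", "world"])
    print(documents)
```

The returned class is used the same way as GoogleDocsReader in the example above: instantiate it, then call load_data to obtain a list of documents.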
Examples