Node Parser
Concept
Node parsers are a simple abstraction that takes a list of documents and chunks them into Node objects, such that each node is a specific size. When a document is broken into nodes, all of its attributes are inherited by the child nodes (e.g. metadata, text and metadata templates, etc.). You can read more about Node and Document properties here.
A node parser can configure the chunk size (in tokens) as well as any overlap between chunked nodes. The chunking is done using a TokenTextSplitter, which defaults to a chunk size of 1024 tokens and a chunk overlap of 20 tokens.
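To make the interaction between chunk size and overlap concrete, here is a minimal sketch of a sliding-window splitter. It is not the TokenTextSplitter implementation: the helper name chunk_tokens is hypothetical, and whitespace-separated words stand in for real tokenizer tokens, which is an assumption made only to keep the example self-contained.

```python
def chunk_tokens(tokens, chunk_size=1024, chunk_overlap=20):
    """Split a token list into windows of at most `chunk_size` tokens.

    Consecutive windows share `chunk_overlap` tokens. This is an
    illustrative helper, not part of llama_index.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new window starts this many tokens after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Assumption: whitespace words approximate tokens for illustration only.
tokens = "the quick brown fox jumps over the lazy dog again".split()
chunks = chunk_tokens(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

With a chunk size of 4 and an overlap of 1, each chunk begins with the last token of the previous chunk. A larger overlap preserves more context across chunk boundaries at the cost of producing more, partially redundant, nodes.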
Usage Pattern
from llama_index.node_parser import SimpleNodeParser
node_parser = SimpleNodeParser.from_defaults(chunk_size=1024, chunk_overlap=20)
You can find more usage details and available customization options below.