MarkdownElementNodeParser

pydantic model llama_index.node_parser.MarkdownElementNodeParser

Markdown element node parser.

Splits a markdown document into Text Nodes and Index Nodes corresponding to embedded objects (e.g. tables).

JSON schema:
{
   "title": "MarkdownElementNodeParser",
   "description": "Markdown element node parser.\n\nSplits a markdown document into Text Nodes and Index Nodes corresponding to embedded objects\n(e.g. tables).",
   "type": "object",
   "properties": {
      "include_metadata": {
         "title": "Include Metadata",
         "description": "Whether or not to consider metadata when splitting.",
         "default": true,
         "type": "boolean"
      },
      "include_prev_next_rel": {
         "title": "Include Prev Next Rel",
         "description": "Include prev/next node relationships.",
         "default": true,
         "type": "boolean"
      },
      "callback_manager": {
         "title": "Callback Manager"
      },
      "id_func": {
         "title": "Id Func"
      },
      "llm": {
         "title": "Llm"
      },
      "summary_query_str": {
         "title": "Summary Query Str",
         "description": "Query string to use for summarization.",
         "default": "What is this table about? Give a very concise summary (imagine you are adding a new caption and summary for this table), and output the real/existing table title/caption if context provided.and output the real/existing table id if context provided.and also output whether or not the table should be kept.",
         "type": "string"
      },
      "num_workers": {
         "title": "Num Workers",
         "description": "Num of works for async jobs.",
         "default": 4,
         "type": "integer"
      },
      "show_progress": {
         "title": "Show Progress",
         "description": "Whether to show progress.",
         "default": true,
         "type": "boolean"
      },
      "class_name": {
         "title": "Class Name",
         "type": "string",
         "default": "MarkdownElementNodeParser"
      }
   }
}

Config
  • arbitrary_types_allowed: bool = True

classmethod class_name() -> str

Get the class name, used as a unique ID in serialization.

This provides a key that makes serialization robust against actual class name changes.
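The idea of a stable serialization key can be illustrated with a minimal, self-contained sketch. It does not use llama_index: `Component`, `RenamedParser`, `REGISTRY`, and `from_dict` are hypothetical names invented for this example, and the library's own serialization may work differently.

```python
from typing import Any, Dict, Type


class Component:
    """Hypothetical base class; not part of llama_index."""

    @classmethod
    def class_name(cls) -> str:
        return "base_component"

    def to_dict(self) -> Dict[str, Any]:
        # Store the stable key rather than type(self).__name__.
        data = dict(self.__dict__)
        data["class_name"] = self.class_name()
        return data


class RenamedParser(Component):
    """Imagine this class was once named MarkdownElementNodeParser."""

    def __init__(self, num_workers: int = 4) -> None:
        self.num_workers = num_workers

    @classmethod
    def class_name(cls) -> str:
        # The key is unchanged even though the Python class was renamed.
        return "MarkdownElementNodeParser"


# Lookup table from serialization key to class (hypothetical).
REGISTRY: Dict[str, Type[Component]] = {
    RenamedParser.class_name(): RenamedParser,
}


def from_dict(data: Dict[str, Any]) -> Component:
    payload = dict(data)
    cls = REGISTRY[payload.pop("class_name")]
    return cls(**payload)


saved = RenamedParser(num_workers=2).to_dict()
restored = from_dict(saved)
```

Because the payload stores the value returned by `class_name()` rather than the Python class name, data saved before a rename can still be mapped back to the right class.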

extract_elements(text: str, node_id: Optional[str] = None, table_filters: Optional[List[Callable]] = None, **kwargs: Any) -> List[Element]

Extract elements from text.
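As a rough illustration of what splitting markdown into text and table elements involves, the sketch below groups consecutive lines that start with `|` into table elements and everything else into text elements. It is a simplified stand-in, not the library's implementation: `SimpleElement` and `split_markdown` are hypothetical names, and the pipe-prefix heuristic is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleElement:
    """Hypothetical stand-in for the library's Element type."""

    id: str
    type: str  # "text" or "table"
    content: str


def split_markdown(text: str) -> List[SimpleElement]:
    elements: List[SimpleElement] = []
    buffer: List[str] = []
    current_type = "text"

    def flush() -> None:
        # Emit the buffered lines as one element, skipping empty runs.
        content = "\n".join(buffer).strip()
        if content:
            elements.append(
                SimpleElement(f"id_{len(elements)}", current_type, content)
            )
        buffer.clear()

    for line in text.splitlines():
        # Assumption: a line beginning with "|" belongs to a table.
        line_type = "table" if line.lstrip().startswith("|") else "text"
        if line_type != current_type:
            flush()
            current_type = line_type
        buffer.append(line)
    flush()
    return elements


doc = (
    "# Report\n\nSales by region:\n\n"
    "| Region | Sales |\n| --- | --- |\n| North | 10 |\n\n"
    "End of report."
)
elements = split_markdown(doc)
```

For this input the sketch yields three elements: a text element, a table element holding the three table lines, and a trailing text element.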

filter_table(table_element: Any) -> bool

Filter tables.
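A table filter is a predicate over a table. The sketch below shows one plausible criterion: keep a markdown table only if it has more than one column and at least one data row. The criterion, the `True` means keep convention, and the names `parse_rows` and `keep_table` are all assumptions made for illustration; the library's own check may differ.

```python
from typing import List


def parse_rows(table_md: str) -> List[List[str]]:
    """Split a markdown table into rows of cell strings."""
    rows: List[List[str]] = []
    for line in table_md.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header separator row, e.g. "| --- | :---: |".
        if all(c and set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    return rows


def keep_table(table_md: str) -> bool:
    """Return True if the table looks worth keeping (hypothetical rule)."""
    rows = parse_rows(table_md)
    if len(rows) < 2:
        # Header only, or empty: no data rows.
        return False
    return len(rows[0]) > 1


two_col = "| a | b |\n| --- | --- |\n| 1 | 2 |"
one_col = "| only |\n| --- |\n| 1 |"
header_only = "| a | b |\n| --- | --- |"
```

Note that a data row made up only of dashes or colons would be mistaken for a separator by this sketch; a production filter would need a stricter parser.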

get_nodes_from_node(node: TextNode) -> List[BaseNode]

Get nodes from node.