Quick Start Guide
This guide will get you up and running with the Confluence Content Parser in just a few minutes.
Basic Usage
The most common use case is parsing Confluence XML content and extracting text:
from confluence_content_parser import ConfluenceParser
# Sample Confluence XML (simplified)
confluence_xml = """
<p>Hello <strong>world</strong>!</p>
<h1>Main Heading</h1>
<p>Some paragraph text with <em>emphasis</em>.</p>
"""
# Create parser and parse content
parser = ConfluenceParser()
document = parser.parse(confluence_xml)
# Extract all text content
print(document.text)
# Output: Hello world!
#
# Main Heading
#
# Some paragraph text with emphasis.
Finding Specific Elements
You can search for specific types of content within the document:
from confluence_content_parser.nodes import HeadingElement, TextEffectElement
# Find all headings
headings = document.find_all(HeadingElement)
for heading in headings:
print(f"Heading: {heading.to_text()}")
# Find multiple types at once
headings, bold_elements = document.find_all(HeadingElement, TextEffectElement)
print(f"Found {len(headings)} headings and {len(bold_elements)} text effects")
# Find all bold text
for element in bold_elements:
if element.type.value == 'strong':
print(f"Bold text: {element.to_text()}")
Walking the Document Tree
For more detailed analysis, you can walk through every node in the document:
# Walk through all nodes
for node in document.walk():
node_type = type(node).__name__
text_content = node.to_text()
print(f"{node_type}: {text_content}")
Working with Complex Content
The parser handles complex Confluence elements like macros, tables, and layouts:
# Sample with macro content
complex_xml = """
<ac:structured-macro ac:name="info">
<ac:rich-text-body>
<p>This is an info panel with <strong>important</strong> information.</p>
</ac:rich-text-body>
</ac:structured-macro>
"""
document = parser.parse(complex_xml)
print(document.text)
# Output: ℹ️ INFO: This is an info panel with important information.
Error Handling
The parser provides diagnostic information when encountering issues:
# Parse with error handling
try:
document = parser.parse(malformed_xml)
except ParsingError as e:
print(f"Parsing failed: {e}")
print("Diagnostics:", e.diagnostics)
# Or check diagnostics after parsing
parser = ConfluenceParser(raise_on_finish=False)
document = parser.parse(xml_content)
if parser.diagnostics:
print("Warnings:", parser.diagnostics)
Next Steps
Read the User Guide for detailed information about node types and advanced usage
Check the API Reference for complete API documentation
Browse Examples for real-world usage patterns