latexnodes.nodes — LaTeX Nodes Classes

Nodes, Node Lists, and Visitors

class pylatexenc.latexnodes.nodes.LatexNode(_fields, _redundant_fields=None, parsing_state=None, pos=None, pos_end=None, latex_walker=None, **kwargs)

Represents an abstract ‘node’ of the latex document.

Use nodeType() to figure out what type of node this is, and isNodeType() to test whether it is of a given type.

You should use LatexWalker.make_node() to create nodes, so that the latex walker has the opportunity to do some additional setting up.

All nodes have the following attributes:

parsing_state

The parsing state at the time this node was created. This object stores additional context information for this node, such as whether or not this node was parsed in a math mode block of LaTeX code.

See also LatexWalker.make_parsing_state() and the parsing_state argument of LatexWalker.get_latex_nodes().

pos

The position in the parsed string that this node represents. If you’re using the LatexWalker walker class, then the parsed string can normally be recovered as node.latex_walker.s, see LatexWalker.s and the latex_walker attribute.

pos_end

The position in the parsed string that is immediately after the present node. If you’re using the LatexWalker walker class, then the parsed string can normally be recovered as node.latex_walker.s, see LatexWalker.s and the latex_walker attribute.

len

(Read-only attribute.) How many characters in the parsed string this node represents, starting at position pos. If you’re using the LatexWalker walker class, then the parsed string can normally be recovered as node.latex_walker.s, see LatexWalker.s and the latex_walker attribute.

Starting from pylatexenc 3.0, the pos_end attribute is primarily set and used instead of the len field. The len field becomes a computed read-only attribute that computes pos_end - pos.

New in version 2.0: The attributes parsing_state, pos and len were added in pylatexenc 2.0.
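The relationship between pos, pos_end and len can be sketched with a small stand-in class. SpanSketch is a hypothetical name used only for illustration; it is not part of pylatexenc, and the handling of missing positions is an assumption rather than documented behaviour.

```python
class SpanSketch:
    """Hypothetical stand-in that models only the pos / pos_end / len fields."""

    def __init__(self, pos, pos_end):
        self.pos = pos
        self.pos_end = pos_end

    @property
    def len(self):
        # Read-only: the value is derived from pos and pos_end, never stored.
        # Assumption: if either position is unknown, no length is reported.
        if self.pos is None or self.pos_end is None:
            return None
        return self.pos_end - self.pos


span = SpanSketch(pos=6, pos_end=20)
print(span.len)
```

Storing pos_end and deriving len keeps the two values from drifting apart when a node's span is adjusted.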

latex_walker

The LatexWalker instance used to create this node.

New in version 3.0: The attribute latex_walker was added in pylatexenc 3.

nodeType()

Returns the class that corresponds to the type of this node. This is a Python class object, i.e., one of LatexCharsNode, LatexGroupNode, etc.

isNodeType(t)

Returns True if the current node is of the given type. The argument t must be a Python class, such as LatexGroupNode.
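As a rough illustration of how nodeType() and isNodeType() relate, the following sketch uses hypothetical stand-in classes (NodeSketch, CharsSketch, GroupSketch) instead of the real node classes. It assumes that the type test behaves like a plain isinstance() check, which the text above does not state explicitly.

```python
class NodeSketch:
    """Hypothetical stand-in for a node base class (not pylatexenc's LatexNode)."""

    def nodeType(self):
        # Assumption: report the concrete class of this instance.
        return type(self)

    def isNodeType(self, t):
        # Assumption: a plain isinstance() check against the given class.
        return isinstance(self, t)


class CharsSketch(NodeSketch):
    pass


class GroupSketch(NodeSketch):
    pass


node = GroupSketch()
print(node.nodeType() is GroupSketch)
print(node.isNodeType(GroupSketch), node.isNodeType(CharsSketch))
```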

latex_verbatim()

Return the chunk of LaTeX code that this node represents.

This is a shorthand for node.latex_walker.s[node.pos:node.pos_end].
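The slicing that this shorthand stands for can be tried out without a real walker. In the sketch below, walker and node are hypothetical SimpleNamespace stand-ins that carry only the attributes named above (s, latex_walker, pos, pos_end); the positions are computed by hand rather than by a parser.

```python
from types import SimpleNamespace

source = r"Hello \textbf{world}!"

# Hypothetical stand-ins: only the attributes named in the text are modelled.
walker = SimpleNamespace(s=source)
node = SimpleNamespace(
    latex_walker=walker,
    pos=source.index("\\textbf"),   # first character of the macro
    pos_end=source.index("}") + 1,  # position immediately after the closing brace
)

# The expression that latex_verbatim() is described as a shorthand for.
verbatim = node.latex_walker.s[node.pos:node.pos_end]
print(verbatim)
```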

class pylatexenc.latexnodes.nodes.LatexNodeList(nodelist, **kwargs)

Represents a list of nodes, along with the spanning position and length.

You can use a LatexNodeList pretty much like a python list of nodes, including as an argument to len() and with indexing as in obj[i].

nodelist

A list of node instances.

pos

The position in the parsed string where this node list starts, assuming that the nodelist represents a single continuous sequence of nodes in the latex string.

pos_end

The position in the parsed string immediately after this node list ends, assuming that the nodelist represents a single continuous sequence of nodes in the latex string.

parsing_state

The parsing state used to parse this node list.

latex_walker

The latex walker instance used to parse this node list.

len

(Read-only attribute.) The total length spanned by this node list, assuming that the nodelist represents a single continuous sequence of nodes in the latex string.

latex_verbatim()

Return the chunk of LaTeX code that this node list represents.

This is a shorthand for concatenating the latex_verbatim() representations of all the nodes in the list.

display_str()

Return a string representation of this node list that is kept reasonably short.

split_at_chars(sep_chars, max_split=None, keep_empty=False, skip_none=True)

Split the node list into multiple node lists corresponding to chunks delimited by the given sep_chars.

More precisely, this method iterates over the node list, collecting nodes as they are iterated over. In simple character nodes, every occurrence of sep_chars causes a new list to be initiated.

This method is useful to split arguments delimited by tokens, e.g.:

\cite{key1,key2,my{special,key},keyN}

In the above example, splitting the argument of the \cite command with the , separator yields four node lists [ (node for "key1") ], [ (node for "key2") ], [ (chars node for "my"), (group node for "{special,key}" ...) ], and [ (node for "keyN") ].
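The splitting behaviour described for the \cite example can be sketched on a simplified data model. Here each node is a hypothetical (kind, text) tuple and split_at_chars_sketch is an illustrative helper, not the library's implementation; the max_split, keep_empty and skip_none options are not modelled.

```python
def split_at_chars_sketch(nodes, sep):
    """Split a list of (kind, text) tuples at `sep`, looking only inside chars nodes."""
    chunks = [[]]
    for kind, text in nodes:
        if kind != "chars":
            # Groups (and any other non-chars node) are kept whole, so a
            # separator inside "{special,key}" does not cause a split.
            chunks[-1].append((kind, text))
            continue
        for i, piece in enumerate(text.split(sep)):
            if i > 0:
                chunks.append([])  # every separator occurrence starts a new list
            if piece:
                chunks[-1].append(("chars", piece))
    return chunks


# Simplified contents of the argument of \cite{key1,key2,my{special,key},keyN}
nodes = [
    ("chars", "key1,key2,my"),
    ("group", "{special,key}"),
    ("chars", ",keyN"),
]
chunks = split_at_chars_sketch(nodes, ",")
for chunk in chunks:
    print(chunk)
```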

If sep_chars is a Python callable object, then it is assumed to be a function that, when called with a string and a position, locates the next separator at which to split the string. It returns either a pair (index, len) locating that separator, or an object with start() and end() methods that return the separator’s start and end positions (e.g., a regex match object). It returns None, a strictly negative start index, or an empty list to indicate that splitting is done. To split at a regex, for instance, you can use:

import re

# make sure there are no capturing parentheses, see re.split()
rx_space = re.compile(r'\s+')

# the compiled regex object itself is also accepted as sep_chars
split_node_lists = nodelist.split_at_chars(rx_space)

If sep_chars is a regular expression object (or any object with a search() method returning match-like objects with start() and end()), it is accepted directly, as in the example above.
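The callable protocol can be illustrated on a plain string. In this sketch, split_string_with_finder is a hypothetical helper that mimics only the "call with a string and a position, get back a match-like object or None" convention. It is not how the library splits node lists, and it does not cover the pair-style return value.

```python
import re


def split_string_with_finder(text, find_sep):
    """Split `text` using a callable that locates the next separator."""
    parts = []
    pos = 0
    while True:
        m = find_sep(text, pos)  # called with a string and a position
        if m is None:            # None signals that splitting is done
            break
        parts.append(text[pos:m.start()])
        pos = m.end()
    parts.append(text[pos:])
    return parts


rx_space = re.compile(r"\s+")
# A compiled pattern's search(string, pos) method already fits the protocol.
parts = split_string_with_finder("alpha  beta\tgamma", rx_space.search)
print(parts)
```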

get_content_as_chars()

Return the character string content associated with this node list, which is assumed to contain only characters, comments, or group nodes that contain such nodes.

This method is useful to extract character arguments from macro calls with an argument that requires a single string, such as \label{my-label} or \href{https://example.com/}{..}. It also allows you to handle cases like \item[{*}] that result in nested group nodes.

Group node delimiters (if applicable) are not included in the returned string.
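A minimal sketch of this flattening, using hypothetical (kind, value) tuples in place of real node objects. It assumes that comments contribute nothing to the result and that any other node type is an error; neither detail is spelled out above.

```python
def content_as_chars_sketch(nodes):
    """Flatten chars, comment and group stand-ins into a single string."""
    out = []
    for kind, value in nodes:
        if kind == "chars":
            out.append(value)
        elif kind == "comment":
            continue  # assumption: comments are skipped
        elif kind == "group":
            # Recurse into the group's contents; its delimiters are not stored
            # in this model, so they never appear in the result.
            out.append(content_as_chars_sketch(value))
        else:
            # Assumption: anything else is rejected.
            raise ValueError("unexpected node kind: %r" % (kind,))
    return "".join(out)


# Argument of \label{my-label}: a single chars node.
label_arg = [("chars", "my-label")]
# Argument of \item[{*}]: a braced group nested inside the optional argument.
item_arg = [("group", [("chars", "*")])]

print(content_as_chars_sketch(label_arg))
print(content_as_chars_sketch(item_arg))
```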

class pylatexenc.latexnodes.nodes.LatexNodesVisitor

Implement a visitor pattern on a node structure.

Doc ………………….

visit(node)

Fallback for visiting any type of node. This is called by the visit_XXX() methods below. In your subclass, you can reimplement a subset of the visit_XXX() methods; nodes for which you did not reimplement the corresponding visit_XXX() method are then handled by the visit() method.

start(node)

A shortcut for calling node.accept_node_visitor() with this visitor object. It’s a convenient starting point for your visiting pattern:

visitor = MyNodeVisitor()
visitor.start(node)

You probably shouldn’t override this method in your visitor subclass.
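The interplay between start(), accept_node_visitor(), the visit_XXX() methods and the visit() fallback can be sketched with stand-in classes. Every class name below and the method names visit_chars and visit_comment are hypothetical placeholders for the visit_XXX() family; the real method names and the traversal of child nodes are not modelled.

```python
class NodeSketch:
    kind = "node"

    def accept_node_visitor(self, visitor):
        # Dispatch to the visit_<kind>() method of the visitor.
        return getattr(visitor, "visit_" + self.kind)(self)


class CharsSketch(NodeSketch):
    kind = "chars"

    def __init__(self, chars):
        self.chars = chars


class CommentSketch(NodeSketch):
    kind = "comment"

    def __init__(self, comment):
        self.comment = comment


class VisitorSketch:
    def visit(self, node):
        return None  # generic fallback

    def visit_chars(self, node):
        return self.visit(node)  # by default, defer to the fallback

    def visit_comment(self, node):
        return self.visit(node)

    def start(self, node):
        return node.accept_node_visitor(self)


class MyNodeVisitor(VisitorSketch):
    def visit_chars(self, node):
        return "chars:" + node.chars

    def visit(self, node):
        return "other:" + node.kind  # catches every kind not handled above


visitor = MyNodeVisitor()
results = [visitor.start(n) for n in (CharsSketch("abc"), CommentSketch("note"))]
print(results)
```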

LaTeX Node Types

class pylatexenc.latexnodes.nodes.LatexCharsNode(chars, **kwargs)

Bases: LatexNode

A string of characters in the LaTeX document, without any special LaTeX code.

chars

The string of characters represented by this node.

class pylatexenc.latexnodes.nodes.LatexGroupNode(nodelist, **kwargs)

Bases: LatexNode

A LaTeX group delimited by braces, {like this}.

Note: in the case of an optional macro or environment argument, this node is also used to represent a group delimited by square brackets instead of curly braces.

nodelist

A list of nodes describing the contents of the LaTeX braced group. Each item of the list is a LatexNode.

This attribute is normally a LatexNodeList.

delimiters

A 2-item tuple that stores the delimiters for this group node. Usually this is (‘{’, ‘}’), except for optional macro arguments where this might be for instance (‘[’, ‘]’).

New in version 2.0: The delimiters field was added in pylatexenc 2.0.

class pylatexenc.latexnodes.nodes.LatexCommentNode(comment, **kwargs)

Bases: LatexNode

A LaTeX comment, delimited by a percent sign until the end of line.

comment

The comment string, excluding both the ‘%’ sign and the following newline.

comment_post_space

The newline that terminated the comment, possibly followed by spaces (e.g., indentation spaces of the next line).
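To make the division between comment and comment_post_space concrete, here is a rough regex-based illustration on a plain string. It is not how the parser works, and edge cases such as a blank line following the comment are not considered.

```python
import re

line = "text % a remark\n    next line"

# Everything after '%' up to the newline goes to `comment`; the newline and any
# indentation that follows go to `post_space`.
m = re.search(r"%(?P<comment>[^\n]*)(?P<post_space>\n[ \t]*)?", line)

comment = m.group("comment")
comment_post_space = m.group("post_space")

print(repr(comment))
print(repr(comment_post_space))
```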

class pylatexenc.latexnodes.nodes.LatexMacroNode(macroname, **kwargs)

Bases: LatexNode

Represents a macro type node, e.g. \textbf.

macroname

The name of the macro (string), without the leading backslash.

spec

The specification object for this macro (a MacroSpec instance).

New in version 3.0: The spec attribute was introduced in pylatexenc 3.

nodeargd

The pylatexenc.latexnodes.ParsedArguments object that represents the macro arguments.

For macros that do not accept any argument, this is an empty ParsedArguments instance. The attribute nodeargd can be None even for macros that accept arguments, in the situation where LatexWalker.get_latex_expression() encounters the macro when reading a single expression.

Arguments must be declared in the latex context passed to the LatexWalker constructor, using a suitable pylatexenc.macrospec.MacroSpec object. Some known macros are already declared in the default latex context.

New in version 2.0: The nodeargd attribute was introduced in pylatexenc 2.

macro_post_space

Any spaces that were encountered immediately after the macro.

The following attributes are obsolete since pylatexenc 2.0.

nodeoptarg

Deprecated since version 2.0: Macro arguments are stored in nodeargd in pylatexenc 2. Accessing the attribute nodeoptarg will still give a first optional argument for standard latex macros, for backwards compatibility.

If non-None, this corresponds to the optional argument of the macro.

nodeargs

Deprecated since version 2.0: Macro arguments are stored in nodeargd in pylatexenc 2. Accessing the attribute nodeargs will still provide a list of argument nodes for standard latex macros, for backwards compatibility.

A list of arguments to the macro. Each item in the list is a LatexNode.

class pylatexenc.latexnodes.nodes.LatexEnvironmentNode(environmentname, nodelist, **kwargs)

Bases: LatexNode

A LaTeX Environment Node, i.e. \begin{something} ... \end{something}.

environmentname

The name of the environment (‘itemize’, ‘equation’, …).

spec

The specification object for this environment (an EnvironmentSpec instance).

New in version 3.0: The spec attribute was introduced in pylatexenc 3.

nodelist

A list of LatexNode instances that represent all the contents between the \begin{...} instruction and the \end{...} instruction.

This attribute is normally a LatexNodeList.

nodeargd

The pylatexenc.latexnodes.ParsedArguments object that represents the arguments passed to the environment. These are arguments that are present after the \begin{xxxxxx} command, as in \begin{tabular}{ccc} or \begin{figure}[H]. Arguments must be declared in the latex context passed to the LatexWalker constructor, using a suitable pylatexenc.macrospec.EnvironmentSpec object. Some known environments are already declared in the default latex context.

New in version 2.0: The nodeargd attribute was introduced in pylatexenc 2.

The following attributes are available, but they are obsolete since pylatexenc 2.0.

envname

Deprecated since version 2.0: This attribute was renamed environmentname for consistency with the rest of the package.

optargs

Deprecated since version 2.0: Environment arguments are stored in nodeargd in pylatexenc 2. Accessing the attribute optargs will still give a list of initial optional arguments for standard latex environments, for backwards compatibility.

args

Deprecated since version 2.0: Environment arguments are stored in nodeargd in pylatexenc 2. Accessing the attribute args will still give a list of curly-brace-delimited arguments for standard latex environments, for backwards compatibility.

class pylatexenc.latexnodes.nodes.LatexSpecialsNode(specials_chars, **kwargs)

Bases: LatexNode

Represents a specials type node, e.g. & or ~.

specials_chars

The specials characters themselves (string), e.g. ‘&’ or ‘~’.

spec

The specification object for this specials (a SpecialsSpec instance).

New in version 3.0: The spec attribute was introduced in pylatexenc 3.

nodeargd

If the specials spec (cf. SpecialsSpec) has args_parser=None then the attribute nodeargd is set to None. If args_parser is specified in the spec, then the attribute nodeargd is a pylatexenc.latexnodes.ParsedArguments instance that represents the arguments to the specials.

The nodeargd attribute can also be None even if the specials expects arguments, in the special situation where LatexWalker.get_latex_expression() encounters this specials.

Arguments must be declared in the latex context passed to the LatexWalker constructor, using a suitable pylatexenc.macrospec.SpecialsSpec object. Some known latex specials are already declared in the default latex context.

New in version 2.0: Latex specials were introduced in pylatexenc 2.0.

class pylatexenc.latexnodes.nodes.LatexMathNode(displaytype, nodelist=[], **kwargs)

Bases: LatexNode

A Math node type.

displaytype

Either ‘inline’ or ‘display’, to indicate an inline math block or a display math block. (Note that math environments such as \begin{equation}...\end{equation} are reported as LatexEnvironmentNode instances, and not as LatexMathNode instances.)

delimiters

A 2-item tuple containing the begin and end delimiters used to delimit this math mode section.

New in version 2.0: The delimiters attribute was introduced in pylatexenc 2.
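As an illustration of how delimiters and displaytype relate, the lookup below pairs the usual LaTeX math delimiters with a display type. The table reflects standard LaTeX conventions and is an assumption about typical values, not a list taken from the library; DISPLAYTYPE_BY_DELIMITERS and displaytype_for are hypothetical names.

```python
# Standard LaTeX math delimiters and the kind of math block they open.
DISPLAYTYPE_BY_DELIMITERS = {
    ("$", "$"): "inline",
    (r"\(", r"\)"): "inline",
    ("$$", "$$"): "display",
    (r"\[", r"\]"): "display",
}


def displaytype_for(delimiters):
    """Return 'inline' or 'display' for a 2-item delimiters tuple, else None."""
    return DISPLAYTYPE_BY_DELIMITERS.get(tuple(delimiters))


print(displaytype_for(("$", "$")))
print(displaytype_for((r"\[", r"\]")))
```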

nodelist

The contents of the math mode section. This attribute is normally a LatexNodeList.