Core Functions

parse

formatparse.parse(pattern, string, extra_types=None, case_sensitive=False, evaluate_result=True, *, validators=None, pipeline=None, validation_mode='strict')

Parse a string using a format specification.

This function parses a string according to a format pattern and extracts named or positional fields from it. The pattern syntax is based on Python’s format() function syntax.

Parameters:
  • pattern (str) – Format specification pattern (e.g., "{name}: {age:d}")

  • string (str) – String to parse

  • extra_types (dict, optional) – Mapping of custom type names (the part after : in a field) to callables, typically created with with_pattern(). Lookups share the compiled-parser cache used by compile(), which is keyed by the pattern plus each custom type's pattern and regex_group_count. See the Custom types guide.

  • case_sensitive (bool) – Whether matching should be case sensitive (default: False)

  • evaluate_result (bool) – Whether to evaluate and convert result types (default: True)

  • validators (Optional[Mapping[Union[str, int], Callable[..., Any]]]) – Optional map of field key to validator; see apply_validators().

  • pipeline (Optional[ValidationPipeline]) – Optional ValidationPipeline (mutually exclusive with validators).

  • validation_mode (Literal['strict', 'collect', 'lenient']) – "strict", "collect", or "lenient" for validation.

Returns:

ParseResult object if match found, None otherwise

Return type:

ParseResult or None

Raises:
  • ValueError – If the native compiler rejects the pattern (for example, some unclosed nested format specs), or if both validators and pipeline are set. For a narrow class of malformed patterns (a missing } after a field), this function returns None instead, while compile() raises PatternParseMismatch, a ValueError subclass. The original parse library has the same split.

  • NotImplementedError – For unsupported pattern features (for example quoted dict keys).

  • ValidationError – If validation fails in strict mode

  • MultipleValidationErrors – If validation_mode='collect' and any validator fails

Example:

>>> result = parse("{name}: {age:d}", "Alice: 30")
>>> result.named['name']
'Alice'
>>> result.named['age']
30
>>> result = parse("{}, {}", "Hello, World")
>>> result.fixed
('Hello', 'World')

Note

For some malformed patterns (for example a missing } after a field), parse() returns None while compile() raises formatparse.PatternParseMismatch (a subclass of ValueError). Other invalid patterns may still raise plain ValueError from both APIs. This mirrors the original parse library.

findall

formatparse.findall(pattern, string, extra_types=None, case_sensitive=False, evaluate_result=True, max_matches=None)

Find all matches of a pattern in a string.

Searches for all non-overlapping occurrences of the pattern in the string. Returns a list-like Results when the fast Rust path applies (no extra_types, evaluate_result is True, and no nested dict field names). Otherwise returns a plain Python list of ParseResult or Match objects (same values as the original parse library).

Parameters:
  • pattern (str) – Format specification pattern

  • string (str) – String to search

  • extra_types (dict, optional) – Same semantics as parse(). When provided, the Rust fast path that returns Results is disabled and a Python list is built instead (see Returns below). See the Custom types guide.

  • case_sensitive (bool) – Whether matching should be case sensitive (default: False)

  • evaluate_result (bool) – Whether to evaluate and convert result types (default: True)

  • max_matches (int, optional) – Stop after this many matches (default: no limit). Useful for untrusted input; see the Security guide in the project docs (docs/security.rst).

Returns:

Results when the fast Rust path applies; otherwise a plain list of ParseResult or Match objects

Return type:

Results | list

Example:

>>> results = findall("ID:{id:d}", "ID:1 ID:2 ID:3")
>>> len(results)
3
>>> results[0].named['id']
1
>>> results[1].named['id']
2
>>> results[2].named['id']
3
>>> for result in results:
...     print(result.named['id'])
1
2
3

findall_iter

formatparse.findall_iter(pattern, string, extra_types=None, case_sensitive=False, evaluate_result=True, max_matches=None)

Yield non-overlapping matches for pattern in string, one at a time.

Semantics match findall() (same extra_types, case_sensitive, and evaluate_result), but each step converts at most one match. This lowers peak memory when you stream results instead of building a full Results or list.

This is a partial answer to issue #13: it does not implement arbitrary chunked file reads with backtracking across chunk boundaries. For logs, a common approach is to feed the parser one line at a time, which works as long as no match spans a line break:

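from formatparse import compile  # shadows the built-in compile() in this module

parser = compile("ID:{id:d}")  # compile once, reuse for every line
with open("log.txt") as f:
    for line in f:
        # Each line is parsed on its own, so a match cannot span lines.
        for m in parser.findall_iter(line.strip()):
            process(m.named["id"])  # process() is a placeholder for your handler

Parameters: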

max_matches (int, optional) – Same as findall() (default: no limit).

Return type:

Iterator[Any]

Returns:

Iterator of ParseResult or Match (same as findall)
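
The streaming behaviour described above (one conversion per step, with an optional cap on the number of matches) can be pictured with the standard library alone. This is a minimal sketch under stated assumptions, not the library's implementation: iter_ids is a hypothetical name, and the regular expression is a hand-written stand-in for the pattern "ID:{id:d}".

```python
import re
from itertools import islice
from typing import Iterator, Optional

# Hand-written stdlib stand-in for the format pattern "ID:{id:d}".
_ID_RE = re.compile(r"ID:(\d+)")


def iter_ids(text: str, max_matches: Optional[int] = None) -> Iterator[int]:
    """Lazily yield integer IDs from non-overlapping matches, left to right."""
    matches = _ID_RE.finditer(text)         # lazy: no list of matches is built
    for m in islice(matches, max_matches):  # max_matches=None means no limit
        yield int(m.group(1))               # convert one match per step


first_two = list(iter_ids("ID:1 ID:2 ID:3", max_matches=2))
everything = list(iter_ids("ID:1 ID:2 ID:3"))
```

Capping the iterator rather than slicing a finished list is what keeps the work bounded on untrusted input: matching stops as soon as the cap is reached.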

parse_batch

formatparse.parse_batch(pattern, strings, extra_types=None, case_sensitive=False, evaluate_result=True)

Parse many strings with the same pattern (compile once, apply sequentially).

This is intended for workloads that apply one pattern to many strings: the compiled regex is resolved once (same LRU cache as parse() / compile()) and each string is parsed in order. Non-matches appear as None at the corresponding index.

strings is copied to a list of str on the Rust side. Pass a list or tuple of strings: a bare str is treated as an iterable of its characters, which is usually not what you want.

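Parameters:
  • pattern (str) – Format specification pattern

  • strings – List or tuple of strings to parse, in order

  • extra_types (dict, optional) – Same semantics as parse()

  • case_sensitive (bool) – Whether matching should be case sensitive (default: False)

  • evaluate_result (bool) – Whether to evaluate and convert result types (default: True)
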
Return type:

List[Optional[ParseResult]]

Returns:

List of ParseResult or None per input string

Raises:

ValueError – Same pattern-compile rules as parse(); if the pattern is in the narrow class where parse() returns None, this function returns a list of None with one entry per input string.

Example:

>>> out = parse_batch("{:d}", ["1", "2", "x"])
>>> out[0].fixed[0]
1
>>> out[2] is None
True
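
The warning about passing a bare str can be enforced before the batch call. The guard below is a hedged sketch: as_string_list is a hypothetical helper, not something formatparse provides, and it relies only on standard Python behaviour (iterating a str yields its characters).

```python
from typing import Iterable, List


def as_string_list(strings: Iterable[str]) -> List[str]:
    """Return the inputs as a list of str, rejecting a single bare string."""
    if isinstance(strings, str):
        # A bare str would silently become one input per character.
        raise TypeError("expected a list or tuple of strings, got a single str")
    items = list(strings)
    for item in items:
        if not isinstance(item, str):
            raise TypeError(f"expected str items, got {type(item).__name__}")
    return items


# What iterating a bare string actually produces:
chars = list("12")
# What the guard returns for a proper batch:
batch = as_string_list(["1", "2", "x"])
```

A call such as parse_batch("{:d}", as_string_list(inputs)) would then fail loudly on a stray string instead of returning one result per character.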

parse_with_validation

formatparse.parse_with_validation(parser, string, pipeline, *, extra_types=None, case_sensitive=False, evaluate_result=True, validation_mode='strict')

Parse string with a compiled parser, then run pipeline.

Equivalent to applying pipeline to the result of parser.parse(...) with the same case_sensitive, extra_types, and evaluate_result defaults as parse(). Use parse() or ValidatedParser.parse() when you pass a validators map instead of a ValidationPipeline.

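Parameters:
  • parser (FormatParser) – Compiled parser, as returned by compile()

  • string (str) – String to parse

  • pipeline (ValidationPipeline) – Pipeline to run on the parse result

  • extra_types, case_sensitive, evaluate_result – Same as parse()

  • validation_mode (Literal['strict', 'collect', 'lenient']) – "strict", "collect", or "lenient" for validation.
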
Return type:

Optional[ParseResult]

Returns:

Same as FormatParser.parse() after validation, or None if parse failed. In lenient mode, validation failures emit ValidationWarning and do not raise.

Raises:

compile

formatparse.compile(pattern, extra_types=None)

Compile a pattern into a FormatParser for repeated use.

Compiling a pattern allows you to reuse the same pattern multiple times without recompiling the regex, which improves performance for repeated parsing operations.

Repeated compile calls with the same pattern and equivalent extra_types (same converter pattern and regex_group_count per name) share the same internal compiled-regex cache as parse(), search(), and findall(), so hot loops that call compile do not pay full pattern-to-regex compilation on every iteration (see issue #29).

Custom types: keys are the type names used after : in fields (for example Number in {:Number} or {x:Number}). Values are callables, usually from with_pattern(), which attach a pattern regex fragment and optional regex_group_count when the regex contains capturing parentheses. See the Custom types guide for examples with search() / findall() and for regex_group_count.

The cache fingerprints each name’s pattern and regex_group_count. If you mutate those attributes on a live converter object while reusing the same extra_types dict, and the fingerprint stays unchanged, you can get a stale compiled parser until the process restarts. When changing patterns at runtime, prefer a fresh dict or new function objects.
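
The sharing rule above can be pictured with a toy fingerprint. This is a minimal sketch, not the library's cache: cache_key is a hypothetical helper, and the real key is internal to the native extension. It assumes only what the text states, that an entry is identified by the pattern plus each custom type's pattern and regex_group_count. The stale-parser case is not modelled here.

```python
from typing import Callable, Dict, Optional, Tuple

Fingerprint = Tuple[str, Tuple[Tuple[str, Optional[str], int], ...]]


def cache_key(pattern: str,
              extra_types: Optional[Dict[str, Callable]] = None) -> Fingerprint:
    """Toy model of a cache key: pattern plus per-name converter attributes."""
    entries = tuple(sorted(
        (name,
         getattr(conv, "pattern", None),
         getattr(conv, "regex_group_count", 0))
        for name, conv in (extra_types or {}).items()
    ))
    return (pattern, entries)


def to_int(text):
    return int(text)


def to_int_again(text):
    return int(text)


# Attach the attributes by hand; with_pattern() would normally do this.
for fn in (to_int, to_int_again):
    fn.pattern = r"\d+"
    fn.regex_group_count = 0

key_a = cache_key("{:Number}", {"Number": to_int})
key_b = cache_key("{:Number}", {"Number": to_int_again})

# A converter with a different regex fragment produces a different key.
to_int_again.pattern = r"[0-9]+"
key_c = cache_key("{:Number}", {"Number": to_int_again})
```

In this model key_a and key_b are equal even though the two converters are distinct function objects, which is why equivalent extra_types can share one compiled entry.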

Parameters:
  • pattern (str) – Format specification pattern (e.g., "{name}: {age:d}")

  • extra_types (dict, optional) – Optional mapping of custom type names to converters (see above)

Returns:

FormatParser object that can be used to parse strings

Return type:

FormatParser

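Raises:

ValueError – If the pattern is invalid. For a missing } after a field, the error is PatternParseMismatch, a ValueError subclass.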

Pickling: Pickling a FormatParser preserves only the pattern string. If you compiled with extra_types, unpickling yields a parser without those converters; call compile() again with the same extra_types if you need them after pickle.loads.

Example:

>>> parser = compile("{name}: {age:d}")
>>> result = parser.parse("Alice: 30")
>>> result.named['name']
'Alice'
>>> result.named['age']
30
>>> result2 = parser.parse("Bob: 25")
>>> result2.named['name']
'Bob'
>>> result2.named['age']
25

with_pattern

formatparse.with_pattern(pattern, regex_group_count=0)

Decorator to create a custom type converter with a regex pattern.

This decorator adds a pattern attribute to the converter function, which is used by the parse functions when matching custom types.

Parameters:
  • pattern (str) – The regex pattern to match

  • regex_group_count (int) – Number of capturing groups (parenthesized sub-patterns) in the pattern (default: 0)

Returns:

Decorator function that adds the pattern attribute

Return type:

Callable

Example:

>>> @with_pattern(r'\d+')
... def parse_number(text):
...     return int(text)
>>> result = parse("Answer: {:Number}", "Answer: 42", {"Number": parse_number})
>>> result.fixed[0]
42
>>> type(result.fixed[0])
<class 'int'>

>>> @with_pattern(r'[A-Z]{2,3}')
... def parse_code(text):
...     return text.upper()
>>> result = parse("Code: {:Code}", "Code: abc", {"Code": parse_code})
>>> result.fixed[0]
'ABC'
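
When a custom pattern contains parentheses, regex_group_count has to match the number of capturing groups. One way to avoid counting by hand is to ask Python's re module, as sketched below. count_capturing_groups is a hypothetical helper, and the approach assumes the fragment uses syntax that Python's re accepts; the library may compile patterns with a different regex engine, so treat the result as a check rather than a guarantee.

```python
import re


def count_capturing_groups(pattern: str) -> int:
    """Return the number of capturing groups in a regex fragment."""
    # Pattern.groups counts numbered and named groups, not (?:...) groups.
    return re.compile(pattern).groups


RANGE = r"(\d+)-(\d+)"  # two capturing groups
group_count = count_capturing_groups(RANGE)

# Intended use with the decorator documented above (not executed here):
#
#     @with_pattern(RANGE, regex_group_count=group_count)
#     def parse_range(text):
#         low, high = text.split("-")
#         return int(low), int(high)
```

Non-capturing groups such as (?:a|b) do not count, so rewriting a group as non-capturing is an alternative to passing regex_group_count.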