Core Functions
parse
- formatparse.parse(pattern, string, extra_types=None, case_sensitive=False, evaluate_result=True, *, validators=None, pipeline=None, validation_mode='strict')
Parse a string using a format specification.
This function parses a string according to a format pattern and extracts named or positional fields from it. The pattern syntax is based on Python’s format() function syntax.
- Parameters:
pattern (str) – Format specification pattern (e.g., "{name}: {age:d}")
string (str) – String to parse
extra_types (dict, optional) – Optional mapping of custom type names (after : in the field) to callables, typically from with_pattern(). Uses the same compiled-parser cache as compile() (pattern plus per-name pattern / regex_group_count). See the Custom types guide.
case_sensitive (bool) – Whether matching should be case sensitive (default: False)
evaluate_result (bool) – Whether to evaluate and convert result types (default: True)
validators (Optional[Mapping[Union[str, int], Callable[..., Any]]]) – Optional map of field key to validator; see apply_validators().
pipeline (Optional[ValidationPipeline]) – Optional ValidationPipeline (mutually exclusive with validators).
validation_mode (Literal['strict', 'collect', 'lenient']) – "strict", "collect", or "lenient" for validation.
- Returns:
ParseResult object if match found, None otherwise
- Return type:
ParseResult or None
- Raises:
ValueError – If the pattern is invalid in a way that still raises from the native compiler (for example some unclosed nested format specs), or if both validators and pipeline are set. For a narrow class of malformed patterns (missing } after a field), this function returns None while compile() raises PatternParseMismatch, which is a ValueError subclass (same split as the original parse library).
NotImplementedError – For unsupported pattern features (for example quoted dict keys).
ValidationError – If validation fails in strict mode
MultipleValidationErrors – If validation_mode='collect' and any validator fails
Example:
>>> result = parse("{name}: {age:d}", "Alice: 30")
>>> result.named['name']
'Alice'
>>> result.named['age']
30
>>> result = parse("{}, {}", "Hello, World")
>>> result.fixed
('Hello', 'World')
Note
For some malformed patterns (for example a missing } after a field), parse()
returns None while compile() raises formatparse.PatternParseMismatch
(a subclass of ValueError). Other invalid patterns may still raise plain
ValueError from both APIs. This mirrors the original parse library.
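To make the field-extraction step concrete, here is a minimal stdlib-only sketch of roughly what a pattern such as "{name}: {age:d}" amounts to: a regular expression with one named group per field, matched against the whole string, with the :d field converted to int. The regex and the helper name parse_sketch are assumptions made for illustration; they are not formatparse's actual translation or implementation.

```python
import re

# Hand-written regex standing in for "{name}: {age:d}" (an assumption for
# illustration; the library derives its own regex from the pattern).
SKETCH_REGEX = re.compile(r"(?P<name>.+?): (?P<age>[-+]?\d+)")


def parse_sketch(text):
    """Return a dict of named fields, or None if the whole string does not match."""
    match = SKETCH_REGEX.fullmatch(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["age"] = int(fields["age"])  # mimic the ":d" integer conversion
    return fields


print(parse_sketch("Alice: 30"))        # {'name': 'Alice', 'age': 30}
print(parse_sketch("Alice: 30 years"))  # None: trailing text breaks the full match
```

The second call returns None because parse-style matching must consume the entire input, which is the behavior that distinguishes parse() from search().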
search
- formatparse.search(pattern, string, pos=0, endpos=None, extra_types=None, case_sensitive=True, evaluate_result=True)
Search for a pattern anywhere in a string.
Unlike parse(), which matches the entire string, search() finds the first occurrence of the pattern anywhere within the string.
- Parameters:
pattern (str) – Format specification pattern
string (str) – String to search
pos (int) – Start position for search (default: 0)
endpos (int, optional) – End position for search (default: None for end of string)
extra_types (dict, optional) – Same semantics as parse() (custom types / cache); see Custom types guide.
case_sensitive (bool) – Whether matching should be case sensitive (default: True)
evaluate_result (bool) – Whether to evaluate and convert result types (default: True)
- Returns:
ParseResult object if match found, None otherwise
- Return type:
ParseResult or None
- Raises:
ValueError – If pattern is invalid
Example:
>>> result = search("age: {age:d}", "Name: Alice, age: 30, City: NYC")
>>> result.named['age']
30
>>> result = search("age: {age:d}", "No age here")
>>> result is None
True
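The difference between whole-string matching and first-occurrence matching can be sketched with the standard re module. The regex below is a hand-written stand-in for "age: {age:d}", and the pos/endpos window follows re's own semantics; whether formatparse treats endpos identically is an assumption, so read this as an analogy rather than the library's behavior.

```python
import re

# Hand-written stand-in for the pattern "age: {age:d}" (illustrative assumption).
AGE = re.compile(r"age: (?P<age>\d+)")

text = "Name: Alice, age: 30, City: NYC"

whole = AGE.fullmatch(text)         # parse()-style: the entire string must match
first = AGE.search(text)            # search()-style: first occurrence anywhere
windowed = AGE.search(text, 0, 10)  # restrict the scan, like pos=0, endpos=10

print(whole)              # None
print(int(first["age"]))  # 30
print(windowed)           # None: the first 10 characters contain no "age: ..."
```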
findall
- formatparse.findall(pattern, string, extra_types=None, case_sensitive=False, evaluate_result=True, max_matches=None)
Find all matches of a pattern in a string.
Searches for all non-overlapping occurrences of the pattern in the string. Returns a list-like Results when the fast Rust path applies (no extra_types, evaluate_result is True, and no nested dict field names). Otherwise returns a plain Python list of ParseResult or Match objects (same values as the original parse library).
- Parameters:
pattern (str) – Format specification pattern
string (str) – String to search
extra_types (dict, optional) – Same semantics as parse(). When provided, the Rust fast path that returns Results is disabled and a Python list is built instead (see Returns below). See the Custom types guide.
case_sensitive (bool) – Whether matching should be case sensitive (default: False)
evaluate_result (bool) – Whether to evaluate and convert result types (default: True)
max_matches (int, optional) – Stop after this many matches (default: no limit). Useful for untrusted input; see the Security guide in the project docs (docs/security.rst).
- Returns:
Results (preferred) or list of matches, depending on options
- Return type:
Results | list
Example:
>>> results = findall("ID:{id:d}", "ID:1 ID:2 ID:3")
>>> len(results)
3
>>> results[0].named['id']
1
>>> results[1].named['id']
2
>>> results[2].named['id']
3
>>> for result in results:
...     print(result.named['id'])
1
2
3
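As a rough illustration of why max_matches helps with untrusted input, the sketch below caps a lazy stream of non-overlapping regex matches with itertools.islice, so scanning stops once the limit is reached. The regex is a hand-written stand-in for "ID:{id:d}" and findall_sketch is a hypothetical helper; how formatparse enforces the limit internally is not stated in this reference.

```python
import re
from itertools import islice

ID = re.compile(r"ID:(?P<id>\d+)")  # stand-in for "ID:{id:d}" (assumption)


def findall_sketch(text, max_matches=None):
    """Collect non-overlapping matches, stopping after max_matches if given."""
    matches = ID.finditer(text)  # lazy, left-to-right, non-overlapping
    if max_matches is not None:
        matches = islice(matches, max_matches)
    return [int(m["id"]) for m in matches]


print(findall_sketch("ID:1 ID:2 ID:3"))                 # [1, 2, 3]
print(findall_sketch("ID:1 ID:2 ID:3", max_matches=2))  # [1, 2]
```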
findall_iter
- formatparse.findall_iter(pattern, string, extra_types=None, case_sensitive=False, evaluate_result=True, max_matches=None)
Yield non-overlapping matches for pattern in string, one at a time.
Semantics match findall() (same extra_types, case_sensitive, and evaluate_result), but each step converts at most one match. This lowers peak memory when you stream results instead of building a full Results or list.
This is a partial answer to issue #13: it does not implement arbitrary chunked file reads with backtracking across chunk boundaries. For logs, a common pattern is line-sized strings (matches must not span lines):
from formatparse import compile

parser = compile("ID:{id:d}")  # compile once, reuse for every line
with open("log.txt") as f:
    for line in f:
        for m in parser.findall_iter(line.strip()):
            process(m.named["id"])  # process() stands for your own handler
- Parameters:
max_matches (int, optional) – Same as findall() (default: no limit).
- Return type:
- Returns:
Iterator of ParseResult or Match (same as findall)
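The memory argument for one-at-a-time iteration can be pictured with a plain generator. This stdlib-only sketch walks a file-like object line by line and yields each converted value as it is found, so no full result list ever exists. The regex stands in for "ID:{id:d}", and iter_ids is a hypothetical helper, not part of formatparse.

```python
import io
import re

ID = re.compile(r"ID:(?P<id>\d+)")  # stand-in for "ID:{id:d}" (assumption)


def iter_ids(lines):
    """Yield one converted id at a time; no full result list is built."""
    for line in lines:
        for m in ID.finditer(line):  # matches never span lines
            yield int(m["id"])


log = io.StringIO("ID:1 ok\nID:2 ID:3 retry\nno ids here\n")
ids = iter_ids(log)
print(next(ids))  # 1
print(list(ids))  # [2, 3]
```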
parse_batch
- formatparse.parse_batch(pattern, strings, extra_types=None, case_sensitive=False, evaluate_result=True)
Parse many strings with the same pattern (compile once, sequential apply).
This is intended for workloads that apply one pattern to many strings: the compiled regex is resolved once (same LRU cache as parse() / compile()) and each string is parsed in order. Non-matches appear as None at the corresponding index.
strings is copied to a list of str on the Rust side (pass a list or tuple of strings; a bare str is treated as an iterable of characters, which is usually not what you want).
- Parameters:
- Return type:
- Returns:
List of ParseResult or None per input string
- Raises:
ValueError – Same pattern-compile rules as parse(); if the pattern is in the narrow class where parse() returns None, this function returns a list of None with one entry per input string.
Example:
>>> out = parse_batch("{:d}", ["1", "2", "x"])
>>> out[0].fixed[0]
1
>>> out[2] is None
True
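The two behaviors described above, None placeholders at the index of each non-match and the bare-str pitfall, can be sketched without the library. The regex is a hand-written stand-in for "{:d}" and parse_batch_sketch is a hypothetical helper; it only mirrors the documented shape of the result, not the Rust implementation.

```python
import re

INT = re.compile(r"[-+]?\d+")  # stand-in for "{:d}" (assumption)


def parse_batch_sketch(strings):
    """Apply one precompiled regex to each string; None marks a non-match."""
    results = []
    for s in strings:
        m = INT.fullmatch(s)
        results.append(int(m.group()) if m else None)
    return results


print(parse_batch_sketch(["1", "2", "x"]))  # [1, 2, None]
print(parse_batch_sketch("12"))             # [1, 2]: a bare str iterates per character
```

Because the output keeps one entry per input, results can be paired back with their inputs using zip().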
parse_with_validation
- formatparse.parse_with_validation(parser, string, pipeline, *, extra_types=None, case_sensitive=False, evaluate_result=True, validation_mode='strict')
Parse string with a compiled parser, then run pipeline.
Equivalent to applying pipeline to the result of parser.parse(...) with the same case_sensitive, extra_types, and evaluate_result defaults as parse(). Use parse() or ValidatedParser.parse() when you pass a validators map instead of a ValidationPipeline.
- Parameters:
parser (FormatParser) – Output of compile().
string (str) – Text to parse.
pipeline (ValidationPipeline) – Validation pipeline (required).
validation_mode (Literal['strict', 'collect', 'lenient']) – Passed to ValidationPipeline.apply().
- Return type:
- Returns:
Same as FormatParser.parse() after validation, or None if parse failed. In lenient mode, validation failures emit ValidationWarning and do not raise.
- Raises:
ValidationError – In strict mode when validation fails.
MultipleValidationErrors – In collect mode when validation fails.
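The three validation modes differ only in what happens when a check fails. The sketch below models that with hypothetical stand-ins (SketchValidationError, SketchMultipleErrors, apply_sketch) and assumes validators are simple predicates keyed by field name; stopping at the first failure in strict mode is also an assumption. None of this is formatparse's real ValidationPipeline API.

```python
import warnings


class SketchValidationError(ValueError):
    """Hypothetical stand-in for a single validation failure."""


class SketchMultipleErrors(ValueError):
    """Hypothetical stand-in carrying every failure found in 'collect' mode."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} validation error(s)")
        self.errors = errors


def apply_sketch(fields, validators, mode="strict"):
    """Run validators (field name -> predicate) over already-parsed fields."""
    failures = []
    for name, check in validators.items():
        if check(fields[name]):
            continue
        message = f"field {name!r} failed validation"
        if mode == "strict":
            raise SketchValidationError(message)  # stop at the first failure
        if mode == "lenient":
            warnings.warn(message)                # report, but keep going
        else:
            failures.append(message)              # 'collect': gather them all
    if failures:
        raise SketchMultipleErrors(failures)
    return fields


fields = {"name": "", "age": -5}
validators = {"name": bool, "age": lambda v: v >= 0}
try:
    apply_sketch(fields, validators, mode="collect")
except SketchMultipleErrors as exc:
    print(len(exc.errors))  # 2
```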
compile
- formatparse.compile(pattern, extra_types=None)
Compile a pattern into a FormatParser for repeated use.
Compiling a pattern allows you to reuse the same pattern multiple times without recompiling the regex, which improves performance for repeated parsing operations.
Repeated compile calls with the same pattern and equivalent extra_types (same converter pattern and regex_group_count per name) share the same internal compiled-regex cache as parse(), search(), and findall(), so hot loops that call compile do not pay full pattern-to-regex compilation on every iteration (see issue #29).
Custom types: keys are the type names used after : in fields (for example Number in {:Number} or {x:Number}). Values are callables, usually from with_pattern(), which attach a pattern regex fragment and optional regex_group_count when the regex contains capturing parentheses. See the Custom types guide for examples with search() / findall() and for regex_group_count.
The cache fingerprints each name’s pattern and regex_group_count. If you mutate those attributes on a live converter object, reuse the same extra_types dict, and the fingerprint stays unchanged, you can see a stale compiled parser until the process restarts. Prefer a fresh dict or new function objects when changing patterns at runtime.
- Parameters:
- Returns:
FormatParser object that can be used to parse strings
- Return type:
- Raises:
RepeatedNameError – If a repeated field name has mismatched types
PatternParseMismatch – For some malformed patterns (missing } after a field); subclass of ValueError. parse() returns None for the same pattern.
ValueError – For other invalid patterns or internal errors
Pickling: A FormatParser only round-trips the pattern string. If you compiled with extra_types, unpickling yields a parser without those converters; call compile() again with the same extra_types if you need them after pickle.loads.
Example:
>>> parser = compile("{name}: {age:d}")
>>> result = parser.parse("Alice: 30")
>>> result.named['name']
'Alice'
>>> result.named['age']
30
>>> result2 = parser.parse("Bob: 25")
>>> result2.named['name']
'Bob'
>>> result2.named['age']
25
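The notion of "equivalent extra_types" used for cache sharing can be pictured as a hashable key built from the pattern string plus each converter's pattern and regex_group_count. The fingerprint function below is a hypothetical sketch of that idea, assuming converters carry those two attributes; the library's real cache key may be built differently.

```python
def fingerprint(pattern, extra_types=None):
    """Build a hashable key from the pattern and each converter's regex metadata."""
    items = sorted((extra_types or {}).items(), key=lambda item: item[0])
    extras = tuple(
        (name, getattr(conv, "pattern", None), getattr(conv, "regex_group_count", 0))
        for name, conv in items
    )
    return (pattern, extras)


def to_int(text):
    return int(text)


def to_float(text):
    return float(text)


# Two different callables that advertise the same regex fragment.
to_int.pattern = r"\d+"
to_float.pattern = r"\d+"

key_a = fingerprint("{:Number}", {"Number": to_int})
key_b = fingerprint("{:Number}", {"Number": to_float})
print(key_a == key_b)  # True: same pattern metadata, so one compiled regex can serve both

to_float.pattern = r"\d+\.\d+"
print(key_a == fingerprint("{:Number}", {"Number": to_float}))  # False: metadata changed
```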
with_pattern
- formatparse.with_pattern(pattern, regex_group_count=0)
Decorator to create a custom type converter with a regex pattern.
This decorator adds a pattern attribute to the converter function, which is used by the parse functions when matching custom types.
- Parameters:
- Returns:
Decorator function that adds the pattern attribute
- Return type:
Callable
Example:
>>> @with_pattern(r'\d+')
... def parse_number(text):
...     return int(text)
>>> result = parse("Answer: {:Number}", "Answer: 42", {"Number": parse_number})
>>> result.fixed[0]
42
>>> type(result.fixed[0])
<class 'int'>
>>> @with_pattern(r'[A-Z]{2,3}')
... def parse_code(text):
...     return text.upper()
>>> result = parse("Code: {:Code}", "Code: abc", {"Code": parse_code})
>>> result.fixed[0]
'ABC'
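For readers curious what "adds a pattern attribute" means mechanically, a decorator of this kind can be as small as the sketch below. with_pattern_sketch is a hypothetical re-creation written for illustration, assuming the metadata is stored as plain function attributes; it is not the library's source.

```python
def with_pattern_sketch(pattern, regex_group_count=0):
    """Hypothetical re-creation: tag a converter with its regex metadata."""

    def decorator(func):
        func.pattern = pattern                      # regex fragment for the field
        func.regex_group_count = regex_group_count  # capturing groups inside it
        return func                                 # the converter itself is unchanged

    return decorator


@with_pattern_sketch(r"\d+")
def parse_number(text):
    return int(text)


print(parse_number.pattern)            # \d+
print(parse_number.regex_group_count)  # 0
print(parse_number("42"))              # 42
```

Since the function is returned unchanged, the converter stays directly callable and testable outside any parse call.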