Lichen (file pyparser/test/test_automata.py at f91b467ef568)

pyparser/test/test_automata.py

535:f91b467ef568

2017-02-04

Paul Boddie

Removed recoding to UTF-8 since this failed for ISO-8859-15: byte sequences were being recoded as UTF-8, and ISO-8859-1 avoided such undesirable data only because it was special-cased. This change may break other ASCII-incompatible encodings: UTF-8 is likely to be the safe form of such data, permitting the parser to understand it, and without such recoding the parser will no longer recognise the grammar's tokens.

from pyparser.automata import DFA, DEFAULT

def test_states():
    d = DFA([{"\x00": 1}, {"\x01": 0}], [False, True])
    assert d.states == "\x01\xff\xff\x00"
    assert d.defaults == "\xff\xff"
    assert d.max_char == 2

    d = DFA([{"\x00": 1}, {DEFAULT: 0}], [False, True])
    assert d.states == "\x01\x00"
    assert d.defaults == "\xff\x00"
    assert d.max_char == 1
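The assertions above suggest that the DFA flattens its per-state transition dictionaries into a single lookup string, one fixed-width row per state. The following is a minimal sketch of one packing that reproduces the expected values. It is not the pyparser.automata implementation: the function name `pack_states`, the `DEFAULT` sentinel value, and the use of "\xff" as a "no transition" marker are all assumptions inferred from the test.

```python
# Hypothetical sketch; names and sentinel values are assumptions,
# not taken from pyparser.automata.
DEFAULT = "\x00default"   # stand-in key meaning "any other character"
ERROR_STATE = "\xff"      # assumed marker for "no transition"


def pack_states(states):
    """Flatten a list of {char: next_state} dicts into lookup strings.

    Returns (table, defaults, max_char), where table holds one row of
    max_char entries per state and defaults holds one entry per state.
    """
    # Row width is one more than the highest character code used as a key.
    max_char = 0
    for state in states:
        for key in state:
            if key != DEFAULT:
                max_char = max(max_char, ord(key))
    max_char += 1

    table = []
    defaults = []
    for state in states:
        # A DEFAULT entry fills every slot lacking an explicit transition;
        # without one, those slots hold the error marker.
        default = chr(state[DEFAULT]) if DEFAULT in state else ERROR_STATE
        defaults.append(default)
        row = [default] * max_char
        for key, target in state.items():
            if key != DEFAULT:
                row[ord(key)] = chr(target)
        table.extend(row)
    return "".join(table), "".join(defaults), max_char
```

Under this layout, the transition for character `c` in state `s` would sit at index `s * max_char + ord(c)`, which is why the first test case yields a four-character table for two states and two character codes.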