Analysis

Frequency

General-purpose frequency analysis tools.

lantern.analysis.frequency.ENGLISH_IC = 0.06505393453880672

Index of coincidence for the English language.

lantern.analysis.frequency.chi_squared(source_frequency, target_frequency)

Calculate the chi-squared statistic by comparing source_frequency with target_frequency.

Example

>>> chi_squared({'a': 2, 'b': 3}, {'a': 1, 'b': 2})
0.1

Parameters:
  • source_frequency (dict) – Frequency map of the text you are analyzing
  • target_frequency (dict) – Frequency map of the target language to compare with
Returns:

Decimal value of the chi-squared statistic
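The 0.1 in the example above is consistent with scaling the target frequencies to the size of the source sample before comparing observed and expected counts. The following is a minimal sketch under that assumption; the name chi_squared_sketch is hypothetical and the code is not necessarily how lantern computes the value.

```python
import math


def chi_squared_sketch(source_frequency, target_frequency):
    """Hypothetical re-implementation, not lantern's own code.

    Assumes the target counts are normalised to probabilities and then
    scaled to the number of source symbols that also occur in the target.
    """
    target_total = sum(target_frequency.values())
    # Assumption: source symbols missing from the target are ignored.
    source_total = sum(
        count for symbol, count in source_frequency.items()
        if symbol in target_frequency
    )

    result = 0.0
    for symbol, target_count in target_frequency.items():
        expected = source_total * target_count / target_total
        observed = source_frequency.get(symbol, 0)  # 0 if absent from source
        result += (observed - expected) ** 2 / expected
    return result


value = chi_squared_sketch({'a': 2, 'b': 3}, {'a': 1, 'b': 2})
print(math.isclose(value, 0.1))  # True
```

The comparison uses math.isclose because the sum is computed in floating point. A lower value indicates that the source distribution is closer to the target distribution.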

lantern.analysis.frequency.english = <lantern.structures.dynamicdict.DynamicDict object>

English ngram frequencies.

lantern.analysis.frequency.frequency_analyze(text, n=1)

Analyze the frequency of ngrams for a piece of text.

Examples

>>> frequency_analyze("abb")
{'a': 1, 'b': 2}
>>> frequency_analyze("abb", 2)
{'ab': 1, 'bb': 1}

Parameters:
  • text (str) – The text to analyze
  • n (int) – The ngram size to use
Returns:

Dictionary of ngrams to frequency

Raises:

ValueError – If n is not a positive integer
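The examples above imply that ngrams are counted with an overlapping sliding window ("abb" yields both 'ab' and 'bb'). A minimal sketch of that counting scheme, using only the standard library; the name frequency_analyze_sketch is hypothetical and this is an illustration rather than lantern's implementation.

```python
from collections import Counter


def frequency_analyze_sketch(text, n=1):
    """Hypothetical sketch: count overlapping ngrams of size n in text."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n across the text, one character at a time.
    ngrams = (text[i:i + n] for i in range(len(text) - n + 1))
    return dict(Counter(ngrams))


print(frequency_analyze_sketch("abb"))     # {'a': 1, 'b': 2}
print(frequency_analyze_sketch("abb", 2))  # {'ab': 1, 'bb': 1}
```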

lantern.analysis.frequency.frequency_to_probability(frequency_map, decorator=<function <lambda>>)

Transform a frequency_map into a map of probabilities, using the sum of all frequencies as the total.

Example

>>> frequency_to_probability({'a': 2, 'b': 2})
{'a': 0.5, 'b': 0.5}

Parameters:
  • frequency_map (dict) – The dictionary to transform
  • decorator (function) – A function to manipulate the probability
Returns:

Dictionary of ngrams to probability
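The decorator parameter can be illustrated with a short sketch. It assumes the decorator is applied to each probability individually and that the default leaves the value unchanged; both points are assumptions, and frequency_to_probability_sketch is a hypothetical name.

```python
import math


def frequency_to_probability_sketch(frequency_map, decorator=lambda p: p):
    """Hypothetical sketch: divide each count by the total, then decorate."""
    total = sum(frequency_map.values())
    return {
        ngram: decorator(count / total)
        for ngram, count in frequency_map.items()
    }


print(frequency_to_probability_sketch({'a': 2, 'b': 2}))  # {'a': 0.5, 'b': 0.5}

# Passing math.log10 as the decorator yields log-probabilities, which can be
# added together instead of multiplying many small probabilities.
log_probabilities = frequency_to_probability_sketch({'a': 1, 'b': 9}, decorator=math.log10)
```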

lantern.analysis.frequency.index_of_coincidence(*texts)

Calculate the index of coincidence for one or more texts. When several texts are given, the results are averaged to return the delta index of coincidence.

Examples

>>> index_of_coincidence("aabbc")
0.2
>>> index_of_coincidence("aabbc", "abbcc")
0.2

Parameters:

*texts (variable length argument list) – The texts to analyze

Returns:

Decimal value of the index of coincidence

Raises:
  • ValueError – If texts is empty
  • ValueError – If any text is less than 2 characters long
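Both examples above agree with the standard formula: the sum of c(c - 1) over every symbol count c, divided by N(N - 1) for a text of length N, averaged across the texts. A minimal sketch of that calculation follows; the names index_of_coincidence_sketch and _single_ic are hypothetical, and the code is not taken from lantern.

```python
from collections import Counter


def _single_ic(text):
    """Index of coincidence for one text (hypothetical helper)."""
    if len(text) < 2:
        raise ValueError("each text must be at least 2 characters long")
    length = len(text)
    counts = Counter(text)
    # Chance that two symbols drawn without replacement are identical.
    return sum(c * (c - 1) for c in counts.values()) / (length * (length - 1))


def index_of_coincidence_sketch(*texts):
    """Average the per-text index of coincidence over all texts."""
    if not texts:
        raise ValueError("texts must not be empty")
    return sum(_single_ic(text) for text in texts) / len(texts)


print(index_of_coincidence_sketch("aabbc"))           # 0.2
print(index_of_coincidence_sketch("aabbc", "abbcc"))  # 0.2
```

For "aabbc" the counts are 2, 2 and 1, giving (2 + 2 + 0) / (5 × 4) = 0.2. A result can then be compared against a reference value such as ENGLISH_IC.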

Search

Algorithms for searching and optimisation.

lantern.analysis.search.hill_climb(nsteps, start_node, get_next_node)

Modular hill climbing algorithm.

Example

>>> import random
>>> def get_next_node(node):
...     node = list(node)  # str has no item assignment, so work on a list copy
...     a, b = random.sample(range(len(node)), 2)
...     node[a], node[b] = node[b], node[a]  # swap two positions to get a neighbour
...     plaintext = decrypt(node, ciphertext)
...     score = lantern.score(plaintext, *fitness_functions)
...     return node, score, Decryption(plaintext, ''.join(node), score)
>>> final_node, best_score, outputs = hill_climb(10, "ABC", get_next_node)

Parameters:
  • nsteps (int) – The number of neighbours to visit
  • start_node – The starting node
  • get_next_node (function) – Function to return the next node, the score of the current node, and any optional output from the current node
Returns:

The highest-scoring node found, the score of this node, and the outputs from the best nodes along the way
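The contract described above (a callback returning a node, its score, and an output) can be exercised without any cipher code. The sketch below is a hypothetical stand-in for hill_climb, not lantern's implementation: it assumes a neighbour is kept only when it scores strictly higher than the best so far, and it uses a toy scoring function (letters matching TARGET) in place of a real fitness function.

```python
import copy
import random


def hill_climb_sketch(nsteps, start_node, get_next_node):
    """Hypothetical sketch: keep a neighbour only if it scores strictly higher."""
    best_node = start_node
    best_score = float("-inf")
    outputs = []
    for _ in range(nsteps):
        # Pass a copy so the callback may mutate its argument freely.
        node, score, output = get_next_node(copy.deepcopy(best_node))
        if score > best_score:
            best_node, best_score = node, score
            outputs.append(output)
    return best_node, best_score, outputs


TARGET = list("ABCDEF")  # toy goal, purely illustrative


def swap_two(node):
    """Swap two random positions; score is the number of letters matching TARGET."""
    a, b = random.sample(range(len(node)), 2)
    node[a], node[b] = node[b], node[a]
    score = sum(x == y for x, y in zip(node, TARGET))
    return node, score, "".join(node)


random.seed(0)  # fixed seed so repeated runs visit the same neighbours
final_node, best_score, outputs = hill_climb_sketch(200, list("FEDCBA"), swap_two)
print("".join(final_node), best_score, len(outputs))
```

Because only improvements are accepted, a search of this kind can stall at a local maximum; restarting from several random start nodes is a common way to reduce that risk.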