Analysis
Frequency
General purpose frequency analysis tools.
lantern.analysis.frequency.ENGLISH_IC = 0.06505393453880672
Index of coincidence for the English language.
lantern.analysis.frequency.chi_squared(source_frequency, target_frequency)
Calculate the Chi Squared statistic by comparing source_frequency with target_frequency.
Example
>>> chi_squared({'a': 2, 'b': 3}, {'a': 1, 'b': 2})
0.1
Parameters:
- source_frequency (dict) – Frequency map of the text you are analyzing
- target_frequency (dict) – Frequency map of the target language to compare with
Returns: Decimal value of the chi-squared statistic
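The documented result of 0.1 is reproduced if the target frequencies are rescaled to the size of the source before the usual (observed - expected)² / expected terms are summed. The following is a minimal sketch under that assumption, not the library's implementation. `chi_squared_sketch` is a hypothetical name, and ignoring source symbols that are absent from the target is a further assumption.

```python
def chi_squared_sketch(source_frequency, target_frequency):
    """Hypothetical sketch: rescale target counts to the source total, then compare."""
    target_total = sum(target_frequency.values())
    # Assumption: source symbols the target does not know about are ignored.
    source_total = sum(
        count for symbol, count in source_frequency.items() if symbol in target_frequency
    )

    statistic = 0.0
    for symbol, target_count in target_frequency.items():
        # Expected count of this symbol in a source of this size.
        expected = source_total * target_count / target_total
        observed = source_frequency.get(symbol, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic


result = chi_squared_sketch({'a': 2, 'b': 3}, {'a': 1, 'b': 2})
print(round(result, 10))  # 0.1, matching the documented example
```

A statistic close to 0 means the source distribution closely matches the target. The sketch does not guard against a zero expected count, which would raise ZeroDivisionError.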
lantern.analysis.frequency.english = <lantern.structures.dynamicdict.DynamicDict object>
English ngram frequencies.
lantern.analysis.frequency.frequency_analyze(text, n=1)
Analyze the frequency of ngrams for a piece of text.
Examples
>>> frequency_analyze("abb")
{'a': 1, 'b': 2}
>>> frequency_analyze("abb", 2)
{'ab': 1, 'bb': 1}
Parameters:
- text (str) – The text to analyze
- n (int) – The ngram size to use
Returns: Dictionary of ngrams to frequency
Raises: ValueError – If n is not a positive integer
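The two examples are consistent with counting every overlapping window of n characters. A minimal sketch of that idea using the standard library follows. `frequency_analyze_sketch` is a hypothetical name, and the overlapping-window behaviour is inferred from the examples rather than taken from the library source.

```python
from collections import Counter


def frequency_analyze_sketch(text, n=1):
    """Hypothetical sketch: count every overlapping ngram of length n."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n across the text, one character at a time.
    ngrams = (text[i:i + n] for i in range(len(text) - n + 1))
    return dict(Counter(ngrams))


unigrams = frequency_analyze_sketch("abb")
bigrams = frequency_analyze_sketch("abb", 2)
print(unigrams)  # {'a': 1, 'b': 2}
print(bigrams)   # {'ab': 1, 'bb': 1}
```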
lantern.analysis.frequency.frequency_to_probability(frequency_map, decorator=<function <lambda>>)
Transform a frequency_map into a map of probability using the sum of all frequencies as the total.
Example
>>> frequency_to_probability({'a': 2, 'b': 2})
{'a': 0.5, 'b': 0.5}
Parameters:
- frequency_map (dict) – The dictionary to transform
- decorator (function) – A function to manipulate the probability
Returns: Dictionary of ngrams to probability
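The transformation can be sketched as a division of each count by the total, followed by the decorator. This is a sketch under assumptions: `frequency_to_probability_sketch` is a hypothetical name, the default decorator is assumed to be the identity function, and passing `math.log10` to obtain log probabilities is only one illustrative use of the parameter.

```python
import math


def frequency_to_probability_sketch(frequency_map, decorator=lambda p: p):
    """Hypothetical sketch: divide each count by the total, then apply decorator."""
    total = sum(frequency_map.values())
    return {key: decorator(count / total) for key, count in frequency_map.items()}


plain = frequency_to_probability_sketch({'a': 2, 'b': 2})
# Illustrative decorator: store log10 probabilities instead of raw ones.
log_probs = frequency_to_probability_sketch({'a': 2, 'b': 2}, decorator=math.log10)
print(plain)  # {'a': 0.5, 'b': 0.5}
print(log_probs)
```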
lantern.analysis.frequency.index_of_coincidence(*texts)
Calculate the index of coincidence for one or more texts. The results are averaged over multiple texts to return the delta index of coincidence.
Examples
>>> index_of_coincidence("aabbc")
0.2
>>> index_of_coincidence("aabbc", "abbcc")
0.2
Parameters: *texts (variable length argument list) – The texts to analyze
Returns: Decimal value of the index of coincidence
Raises:
- ValueError – If texts is empty
- ValueError – If any text is less than 2 characters long
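Both example values follow from the standard index of coincidence, the sum of c(c - 1) over all symbol counts c divided by N(N - 1) for a text of length N, averaged across the texts. For "aabbc" this gives (2·1 + 2·1 + 1·0) / (5·4) = 0.2. The following is a minimal sketch of that calculation. `index_of_coincidence_sketch` is a hypothetical name, and the error messages are illustrative.

```python
from collections import Counter


def index_of_coincidence_sketch(*texts):
    """Hypothetical sketch: average the per-text index of coincidence."""
    if not texts:
        raise ValueError("texts must not be empty")

    def single_ic(text):
        if len(text) < 2:
            raise ValueError("each text must be at least 2 characters long")
        counts = Counter(text)
        # Number of ordered pairs of positions holding the same symbol.
        matching_pairs = sum(c * (c - 1) for c in counts.values())
        return matching_pairs / (len(text) * (len(text) - 1))

    return sum(single_ic(text) for text in texts) / len(texts)


one_text = index_of_coincidence_sketch("aabbc")
two_texts = index_of_coincidence_sketch("aabbc", "abbcc")
print(one_text, two_texts)  # 0.2 0.2
```

Text whose result is close to ENGLISH_IC has a letter distribution similar to English, while uniformly random letters score considerably lower.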
Search
Algorithms for searching and optimisation.
lantern.analysis.search.hill_climb(nsteps, start_node, get_next_node)
Modular hill climbing algorithm.
Example
>>> import random
>>> def get_next_node(node):
...     a, b = random.sample(range(len(node)), 2)
...     node[a], node[b] = node[b], node[a]
...     plaintext = decrypt(node, ciphertext)
...     score = lantern.score(plaintext, *fitness_functions)
...     return node, score, Decryption(plaintext, ''.join(node), score)
>>> final_node, best_score, outputs = hill_climb(10, list("ABC"), get_next_node)
Parameters:
- nsteps (int) – The number of neighbours to visit
- start_node – The starting node
- get_next_node (function) – Function that returns the next node, the score of the current node, and any optional output from the current node
Returns: The highest-scoring node found, the score of this node, and the outputs from the best nodes along the way
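The contract described above can be illustrated with a self-contained toy problem. This is a sketch under assumptions, not the library's code: `hill_climb_sketch`, `swap_two` and `TARGET` are hypothetical names, the callback is handed a copy of the best node so far, and a candidate is kept only when its score is strictly higher.

```python
import copy
import random


def hill_climb_sketch(nsteps, start_node, get_next_node):
    """Hypothetical sketch of the documented contract."""
    best_node = start_node
    best_score = float("-inf")
    outputs = []
    for _ in range(nsteps):
        # Pass a copy so in-place mutation cannot damage the best node so far.
        candidate, score, output = get_next_node(copy.deepcopy(best_node))
        if score > best_score:
            best_node, best_score = candidate, score
            outputs.append(output)
    return best_node, best_score, outputs


TARGET = list("HILL")  # toy goal, purely illustrative


def swap_two(node):
    """Swap two random positions and score by how many characters match TARGET."""
    a, b = random.sample(range(len(node)), 2)
    node[a], node[b] = node[b], node[a]
    score = sum(x == y for x, y in zip(node, TARGET))
    return node, score, "".join(node)


random.seed(0)
final_node, best_score, outputs = hill_climb_sketch(50, list("LLIH"), swap_two)
print("".join(final_node), best_score, outputs)
```

The node is a list rather than a string because the callback swaps elements in place, which an immutable string does not support.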