Marcus Pearce, 2012-02-02 12:06 PM
h1. ppm-star

h2. Loading the system

<pre>
CL-USER> (asdf:oos 'asdf:load-op 'ppm-star)
...
</pre>
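
Recent ASDF releases document <code>asdf:load-system</code> as the usual way to load a system, so the following form should be equivalent to the one above. It is a sketch that assumes ASDF 2 or later and that the <code>ppm-star</code> system definition is already on ASDF's search path:

```lisp
;; Assumption: ASDF 2 or later, with the ppm-star system definition
;; visible to ASDF (for example via the source registry).
(asdf:load-system :ppm-star)
```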

h2. Usage

The following example uses the package to generate a probability distribution for each element of each sequence in a list of sequences drawn from a given alphabet of symbols:

<pre>
CL-USER> (defun simple-ppm-test (sequences alphabet)
           (let ((model (ppm:make-ppm alphabet :escape :c :mixtures t
                                      :update-exclusion nil :order-bound nil)))
             (ppm:model-dataset model sequences :construct? t :predict? t)))
SIMPLE-PPM-TEST
CL-USER> (simple-ppm-test '((a b a b a b a b)) '(a b))
((0 (A ((A 0.5) (B 0.5))) (B ((A 0.75) (B 0.25))) (A ((A 0.5) (B 0.5)))
  (B ((A 0.36363637) (B 0.6363636))) (A ((A 0.6666667) (B 0.33333334)))
  (B ((A 0.35714284) (B 0.64285713))) (A ((A 0.6666667) (B 0.33333334)))
  (B ((A 0.3529412) (B 0.6470588)))))
CL-USER>
</pre>

The output is a list of lists, one for each sequence in the list supplied to the function. In the example above, each of these lists begins with an integer (0 for the single sequence supplied), followed by one entry per position in the sequence. Each entry is itself a list, composed of the symbol that appeared at that position and a probability distribution reflecting the model's predictions for that position. The probability of the symbol that appeared at a position can be obtained by looking it up in the distribution, and the distribution can also be used to compute the entropy (uncertainty) of the model's prediction at that position.
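
The two uses mentioned above, looking up the probability of the observed symbol and computing the entropy of a distribution, can be sketched in a few lines of Common Lisp. This is a minimal illustration and not part of ppm-star: the helper names <code>symbol-probability</code>, <code>distribution-entropy</code> and <code>sequence-information-content</code> are hypothetical, and the code assumes the result layout shown in the transcript above (a leading integer followed by <code>(symbol distribution)</code> entries).

```lisp
;; Hypothetical helpers, not part of ppm-star. They assume each
;; per-sequence result looks like
;;   (integer (symbol ((symbol probability) ...)) ...)
;; as in the transcript above.

(defun symbol-probability (prediction)
  "Probability the model assigned to the symbol that actually occurred."
  (destructuring-bind (symbol distribution) prediction
    (second (assoc symbol distribution))))

(defun distribution-entropy (distribution)
  "Shannon entropy, in bits, of a list of (symbol probability) pairs."
  (- (loop for (nil p) in distribution
           when (plusp p)          ; skip zero-probability symbols
             sum (* p (log p 2)))))

(defun sequence-information-content (sequence-predictions)
  "List of -log2(p) values, one per position of a per-sequence result."
  (mapcar (lambda (prediction)
            (- (log (symbol-probability prediction) 2)))
          (rest sequence-predictions))) ; skip the leading integer
```

For instance, applying <code>symbol-probability</code> to the second entry in the transcript above, <code>(B ((A 0.75) (B 0.25)))</code>, returns 0.25, the probability the model gave to the B that actually occurred. Mapping <code>sequence-information-content</code> over the value returned by <code>simple-ppm-test</code> gives one list of per-position values for each input sequence.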

The code in <code>ppm-ui.lisp</code> shows another example application. The following function returns the counts of all n-grams of a given order (n) in a list of sequences composed of symbols from the supplied alphabet:

<pre>
(defun test-model (sequences alphabet order)
  (ngram-frequencies (build-model sequences alphabet) order))
</pre>

Here are some examples:

<pre>
CL-USER> (ppm-star::test-model '((a b r a c a d a b r a)) '(a b c d r) 3)
(((D A B) 1) ((C A D) 1) ((R A C) 1) ((B R A) 2) ((A D A) 1) ((A C A) 1)
 ((A B R) 2))
</pre>
<pre>
CL-USER> (ppm-star::test-model '((l e t l e t t e r t e l e)) '(e l t r) 1)
(((R) 1) ((T) 4) ((E) 5) ((L) 3))
</pre>
<pre>
CL-USER> (ppm-star::test-model '((a g c g a c g a g)) '(a c g) 2)
(((C G) 2) ((G A) 2) ((G C) 1) ((A C) 1) ((A G) 2))
</pre>
<pre>
CL-USER> (ppm-star::test-model '((m i s s i s s i p p i)) '(i m p s) 4)
(((S I P P) 1) ((S I S S) 1) ((S S I P) 1) ((S S I S) 1) ((I P P I) 1)
 ((I S S I) 3) ((M I S S) 1))
</pre>
<pre>
CL-USER> (ppm-star::test-model '((a s s a n i s s i m a s s a)) '(a i m n s) 2)
(((M A) 1) ((I M) 1) ((I S) 1) ((N I) 1) ((S I) 1) ((S A) 2) ((S S) 3)
 ((A N) 1) ((A S) 2))
</pre>
<pre>
CL-USER> (ppm-star::test-model '((a s s a n i s s i m a s s a)
                                 (m i s s i s s i p p i))
                               '(a s n i m p)
                               2)
(((P I) 1) ((P P) 1) ((M I) 1) ((M A) 1) ((I P) 1) ((I M) 1) ((I S) 3)
 ((N I) 1) ((S I) 3) ((S A) 2) ((S S) 5) ((A N) 1) ((A S) 2))
</pre>
<pre>
CL-USER> (ppm-star::test-model '((a b r a c a d a b r a)
                                 (l e t l e t t e r t e l e)
                                 (a s s a n i s s i m a s s a)
                                 (m i s s i s s i p p i)
                                 (w o o l o o b o o l o o))
                               '(a b c d e i l m n o p r s t w)
                               3)
(((O B O) 1) ((O L O) 2) ((O O B) 1) ((O O L) 2) ((W O O) 1) ((P P I) 1)
 ((M I S) 1) ((M A S) 1) ((I P P) 1) ((I M A) 1) ((I S S) 3) ((N I S) 1)
 ((S I P) 1) ((S I S) 1) ((S I M) 1) ((S A N) 1) ((S S I) 3) ((S S A) 2)
 ((T E L) 1) ((T E R) 1) ((T T E) 1) ((T L E) 1) ((E L E) 1) ((E R T) 1)
 ((E T T) 1) ((E T L) 1) ((L O O) 2) ((L E T) 2) ((D A B) 1) ((C A D) 1)
 ((R T E) 1) ((R A C) 1) ((B O O) 1) ((B R A) 2) ((A N I) 1) ((A S S) 2)
 ((A D A) 1) ((A C A) 1) ((A B R) 2))
CL-USER>
</pre>
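
The n-gram counts in the transcripts above can be cross-checked without the model by sliding a window of length n over each sequence. The sketch below does this in plain Common Lisp. It is independent of ppm-star and is not how <code>ngram-frequencies</code> is implemented; <code>naive-ngram-counts</code> and <code>same-counts-p</code> are hypothetical names introduced for illustration only. N-grams are counted within each sequence and never across a sequence boundary, which is consistent with the multi-sequence transcripts above.

```lisp
;; Hypothetical cross-check, independent of ppm-star: count n-grams by
;; sliding a window of length ORDER over each sequence.

(defun naive-ngram-counts (sequences order)
  "Return a list of (ngram count) entries for all n-grams of length ORDER."
  (let ((counts (make-hash-table :test #'equal)))
    (dolist (sequence sequences)
      ;; LENGTH is recomputed at every step, which is quadratic but
      ;; adequate for short test sequences.
      (loop for tail on sequence
            while (>= (length tail) order)
            do (incf (gethash (subseq tail 0 order) counts 0))))
    (loop for ngram being the hash-keys of counts using (hash-value count)
          collect (list ngram count))))

(defun same-counts-p (a b)
  "True if A and B hold the same (ngram count) entries, in any order."
  (and (= (length a) (length b))
       (null (set-difference a b :test #'equal))))
```

Applied to <code>'((a b r a c a d a b r a))</code> at order 3, this produces the same seven entries as the first transcript above, although in an unspecified order because the entries are collected from a hash table. For <code>'((m i s s i s s i p p i))</code> at order 4 it counts <code>(I S S I)</code> twice, whereas the recorded transcript reports 3, so that figure may be worth re-checking against the current code.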

There is also a function that writes a PostScript representation of the model's suffix tree to a file:

<pre>
CL-USER> (ppm-star:write-model-to-postscript
          (ppm-star::build-model '((a b r a c a d a b r a)) '(a b c d r))
          "/tmp/ppm.ps")
NIL
</pre>