Ppm-star » History » Version 2

Jeremy Gow, 2013-04-09 11:50 AM

1 1 Marcus Pearce
h1. ppm-star  
2 2 Jeremy Gow
 
3 1 Marcus Pearce
4 1 Marcus Pearce
This is a simple use of the package to generate probabilities for each element of each sequence in a list of sequences composed of some alphabet of symbols:
5 1 Marcus Pearce
6 1 Marcus Pearce
<pre>
7 1 Marcus Pearce
CL-USER> 
8 1 Marcus Pearce
(defun simple-ppm-test (sequences alphabet) 
9 1 Marcus Pearce
  (let ((model (ppm:make-ppm alphabet :escape :c :mixtures t 
10 1 Marcus Pearce
                             :update-exclusion nil :order-bound nil)))
11 1 Marcus Pearce
    (ppm:model-dataset model sequences :construct? t :predict? t)))
12 1 Marcus Pearce
SIMPLE-PPM-TEST
13 1 Marcus Pearce
CL-USER> (simple-ppm-test '((a b a b a b a b)) '(a b))
14 1 Marcus Pearce
((0 (A ((A 0.5) (B 0.5))) (B ((A 0.75) (B 0.25))) (A ((A 0.5) (B 0.5)))
15 1 Marcus Pearce
  (B ((A 0.36363637) (B 0.6363636))) (A ((A 0.6666667) (B 0.33333334)))
16 1 Marcus Pearce
  (B ((A 0.35714284) (B 0.64285713))) (A ((A 0.6666667) (B 0.33333334)))
17 1 Marcus Pearce
  (B ((A 0.3529412) (B 0.6470588)))))
18 1 Marcus Pearce
CL-USER> 
19 1 Marcus Pearce
</pre>
20 1 Marcus Pearce
21 1 Marcus Pearce
The output is a list of lists, one for each sequence in the list supplied to the function. Each of these lists is itself a list, composed of the symbol that appeared at that position in the sequence and a probability distribution reflecting the models predictions for that position. The probability of the symbol appearing at that location can be obtained by looking up the symbol in the probability distribution or one can use the distribution to compute the entropy (uncertainty) of the model's prediction at that location.
22 1 Marcus Pearce
23 1 Marcus Pearce
24 1 Marcus Pearce
The code in <code>ppm-ui.lisp</code> shows another example application: the following function returns the n-gram counts of a given order (n) in a list of sequences composed from symbols in the supplied alphabet: 
25 1 Marcus Pearce
26 1 Marcus Pearce
<pre>
27 1 Marcus Pearce
(defun test-model (sequences alphabet order)
28 1 Marcus Pearce
  (ngram-frequencies (build-model sequences alphabet) order))
29 1 Marcus Pearce
</pre> 
30 1 Marcus Pearce
31 1 Marcus Pearce
Here are some examples: 
32 1 Marcus Pearce
33 1 Marcus Pearce
<pre>
34 1 Marcus Pearce
CL-USER> (ppm-star::test-model '((a b r a c a d a b r a)) '(a b c d r) 3)           
35 1 Marcus Pearce
(((D A B) 1) ((C A D) 1) ((R A C) 1) ((B R A) 2) ((A D A) 1) ((A C A) 1)
36 1 Marcus Pearce
 ((A B R) 2))
37 1 Marcus Pearce
</pre> 
38 1 Marcus Pearce
<pre> 
39 1 Marcus Pearce
CL-USER> (ppm-star::test-model '((l e t l e t t e r t e l e)) '(e l t r) 1)                     
40 1 Marcus Pearce
(((R) 1) ((T) 4) ((E) 5) ((L) 3))
41 1 Marcus Pearce
</pre>
42 1 Marcus Pearce
<pre> 
43 1 Marcus Pearce
CL-USER> (ppm-star::test-model '((a g c g a c g a g)) '(a c g) 2)
44 1 Marcus Pearce
(((C G) 2) ((G A) 2) ((G C) 1) ((A C) 1) ((A G) 2))
45 1 Marcus Pearce
</pre> 
46 1 Marcus Pearce
<pre> 
47 1 Marcus Pearce
CL-USER> (ppm-star::test-model '((m i s s i s s i p p i)) '(i m p s) 4)
48 1 Marcus Pearce
(((S I P P) 1) ((S I S S) 1) ((S S I P) 1) ((S S I S) 1) ((I P P I) 1)
49 1 Marcus Pearce
 ((I S S I) 3) ((M I S S) 1))
50 1 Marcus Pearce
</pre> 
51 1 Marcus Pearce
<pre> 
52 1 Marcus Pearce
CL-USER> (ppm-star::test-model '((a s s a n i s s i m a s s a)) '(a i m n s) 2)
53 1 Marcus Pearce
(((M A) 1) ((I M) 1) ((I S) 1) ((N I) 1) ((S I) 1) ((S A) 2) ((S S) 3)
54 1 Marcus Pearce
 ((A N) 1) ((A S) 2))
55 1 Marcus Pearce
</pre> 
56 1 Marcus Pearce
<pre> 
57 1 Marcus Pearce
CL-USER> (ppm-star::test-model '((a s s a n i s s i m a s s a)
58 1 Marcus Pearce
                                 (m i s s i s s i p p i))
59 1 Marcus Pearce
                               '(a s n i m p)
60 1 Marcus Pearce
                               2)
61 1 Marcus Pearce
(((P I) 1) ((P P) 1) ((M I) 1) ((M A) 1) ((I P) 1) ((I M) 1) ((I S) 3)
62 1 Marcus Pearce
 ((N I) 1) ((S I) 3) ((S A) 2) ((S S) 5) ((A N) 1) ((A S) 2))
63 1 Marcus Pearce
</pre> 
64 1 Marcus Pearce
<pre>
65 1 Marcus Pearce
CL-USER> (ppm-star::test-model '((a b r a c a d a b r a)
66 1 Marcus Pearce
                                 (l e t l e t t e r t e l e)
67 1 Marcus Pearce
                                 (a s s a n i s s i m a s s a)
68 1 Marcus Pearce
                                 (m i s s i s s i p p i)
69 1 Marcus Pearce
                                 (w o o l o o b o o l o o))
70 1 Marcus Pearce
                               '(a b c d e i l m n o p r s t w)
71 1 Marcus Pearce
                               3)
72 1 Marcus Pearce
(((O B O) 1) ((O L O) 2) ((O O B) 1) ((O O L) 2) ((W O O) 1) ((P P I) 1)
73 1 Marcus Pearce
 ((M I S) 1) ((M A S) 1) ((I P P) 1) ((I M A) 1) ((I S S) 3) ((N I S) 1)
74 1 Marcus Pearce
 ((S I P) 1) ((S I S) 1) ((S I M) 1) ((S A N) 1) ((S S I) 3) ((S S A) 2)
75 1 Marcus Pearce
 ((T E L) 1) ((T E R) 1) ((T T E) 1) ((T L E) 1) ((E L E) 1) ((E R T) 1)
76 1 Marcus Pearce
 ((E T T) 1) ((E T L) 1) ((L O O) 2) ((L E T) 2) ((D A B) 1) ((C A D) 1)
77 1 Marcus Pearce
 ((R T E) 1) ((R A C) 1) ((B O O) 1) ((B R A) 2) ((A N I) 1) ((A S S) 2)
78 1 Marcus Pearce
 ((A D A) 1) ((A C A) 1) ((A B R) 2))
79 1 Marcus Pearce
CL-USER> 
80 1 Marcus Pearce
</pre>
81 1 Marcus Pearce
82 1 Marcus Pearce
There is also a function to write the model to a postscript representation of a suffix tree:
83 1 Marcus Pearce
84 1 Marcus Pearce
<pre>
85 1 Marcus Pearce
CL-USER> (ppm-star:write-model-to-postscript 
86 1 Marcus Pearce
          (ppm-star::build-model '((a b r a c a d a b r a)) '(a b c d r))
87 1 Marcus Pearce
          "/tmp/ppm.ps")
88 1 Marcus Pearce
NIL
89 1 Marcus Pearce
</pre>