# encoding: utf-8
# Encoding.default_internal = 'UTF-8'

# = CodeRay Library
#
# CodeRay is a Ruby library for syntax highlighting.
#
# I try to make CodeRay easy to use and intuitive, but at the same time fully
# featured, complete, fast and efficient.
#
# See README.
#
# It consists mainly of
# * the main engine: CodeRay (Scanners::Scanner, Tokens, Encoders::Encoder)
# * the plugin system: PluginHost, Plugin
# * the scanners in CodeRay::Scanners
# * the encoders in CodeRay::Encoders
# * the styles in CodeRay::Styles
#
# Here's a fancy graphic to brighten up this gray documentation:
#
# http://cycnus.de/raindark/coderay/scheme.png
#
# == Documentation
#
# See CodeRay, Encoders, Scanners, Tokens.
#
# == Usage
#
# Remember you need RubyGems to use CodeRay, unless you have it in your load
# path. Run Ruby with the -rubygems option if required.
#
# === Highlight Ruby code in a string as HTML
#
#   require 'coderay'
#   print CodeRay.scan('puts "Hello, world!"', :ruby).html
#
#   # prints something like this:
#   puts <span class="s">"Hello, world!"</span>
#
#
# === Highlight C code from a file in an HTML div
#
#   require 'coderay'
#   print CodeRay.scan(File.read('ruby.h'), :c).div
#   print CodeRay.scan_file('ruby.h').html.div
#
# You can include this div in your page. The CSS styles it uses can be printed with
#
#   % coderay_stylesheet
#
# === Highlight without typing too much
#
# If you are one of the hasty (or lazy, or extremely curious) people, just run this file:
#
#   % ruby -rubygems /path/to/coderay/coderay.rb > example.html
#
# and look at the file it created in your browser.
#
# = CodeRay Module
#
# The CodeRay module provides convenience methods for the engine.
#
# * The +lang+ and +format+ arguments select the Scanner and Encoder to use. These are
#   simply lower-case symbols, like <tt>:python</tt> or <tt>:html</tt>.
# * All methods take an optional hash as last parameter, +options+, that is sent to
#   the Encoder / Scanner.
# * Input and language are always sorted in this order: +code+, +lang+.
#   (This is in alphabetical order, if you need a mnemonic ;)
#
# You should be able to highlight everything you want just using these methods;
# so there is no need to dive into CodeRay's deep class hierarchy.
#
# The examples in the demo directory demonstrate common cases using this interface.
#
# = Basic Access Ways
#
# Read this to get a general overview of what CodeRay provides.
#
# == Scanning
#
# Scanning means analysing an input string, splitting it up into Tokens.
# Each Token knows what type it is: string, comment, class name, etc.
#
# Each +lang+ (language) has its own Scanner; for example, <tt>:ruby</tt> code is
# handled by CodeRay::Scanners::Ruby.
#
# CodeRay.scan:: Scan a string in a given language into Tokens.
#                This is the most common method to use.
# CodeRay.scan_file:: Scan a file and guess the language using FileType.
#
# The Tokens object you get from these methods can encode itself; see Tokens.
#
# == Encoding
#
# Encoding means compiling Tokens into an output. This can be colored HTML or
# LaTeX, a textual statistic or just the number of non-whitespace tokens.
#
# Each Encoder provides output in a specific +format+, so you select Encoders via
# formats like <tt>:html</tt> or <tt>:statistic</tt>.
#
# CodeRay.encode:: Scan and encode a string in a given language.
# CodeRay.encode_tokens:: Encode the given tokens.
# CodeRay.encode_file:: Scan a file, guess the language using FileType and encode it.
#
# == All-in-One Encoding
#
# CodeRay.encode:: Highlight a string with a given input and output format.
#
# == Instantiating
#
# You can use an Encoder instance to highlight multiple inputs. This way, the setup
# for this Encoder only has to be done once.
#
# CodeRay.encoder:: Create an Encoder instance with format and options.
# CodeRay.scanner:: Create a Scanner instance for lang, with '' as default code.
#
# To make use of CodeRay.scanner, use CodeRay::Scanner::code=.
#
# The scanning methods provide more flexibility; we recommend using them.
#
# == Reusing Scanners and Encoders
#
# If you want to re-use scanners and encoders (because that is faster), see
# CodeRay::Duo for the most convenient (and recommended) interface.
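#
# A minimal sketch of reusing one Scanner/Encoder pair. It assumes that
# CodeRay::Duo.[] takes the language and the format, and that the instance
# responds to +encode+; check CodeRay::Duo for the exact signatures.
#
#   require 'coderay'
#
#   ruby_to_html = CodeRay::Duo[:ruby, :html]
#   first  = ruby_to_html.encode 'puts 17 + 4'
#   second = ruby_to_html.encode 'puts 21 * 2'  # same pair serves both calls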
module CodeRay

  $CODERAY_DEBUG ||= false

  require 'coderay/version'

  # helpers
  autoload :FileType,    'coderay/helpers/file_type'

  # Tokens
  autoload :Tokens,      'coderay/tokens'
  autoload :TokensProxy, 'coderay/tokens_proxy'
  autoload :TokenKinds,  'coderay/token_kinds'

  # Plugin system
  autoload :PluginHost,  'coderay/helpers/plugin'
  autoload :Plugin,      'coderay/helpers/plugin'

  # Plugins
  autoload :Scanners,    'coderay/scanner'
  autoload :Encoders,    'coderay/encoder'
  autoload :Styles,      'coderay/style'

  # Convenience access and reusable Encoder/Scanner pair
  autoload :Duo,         'coderay/duo'

  class << self

    # Scans the given +code+ (a String) with the Scanner for +lang+.
    #
    # This is a simple way to use CodeRay. Example:
    #   require 'coderay'
    #   page = CodeRay.scan("puts 'Hello, world!'", :ruby).html
    #
    # See also demo/demo_simple.
    def scan code, lang, options = {}, &block
      # FIXME: return a proxy for direct-stream encoding
      TokensProxy.new code, lang, options, block
    end

    # Scans +filename+ (a path to a code file) with the Scanner for +lang+.
    #
    # If +lang+ is :auto or omitted, the CodeRay::FileType module is used to
    # determine it. If it cannot find out what type it is, it uses
    # CodeRay::Scanners::Text.
    #
    # Calls CodeRay.scan.
    #
    # Example:
    #  require 'coderay'
    #  page = CodeRay.scan_file('some_c_code.c').html
    def scan_file filename, lang = :auto, options = {}, &block
      lang = FileType.fetch filename, :text, true if lang == :auto
      code = File.read filename
      scan code, lang, options, &block
    end

    # Encode a string.
    #
    # This scans +code+ with the Scanner for +lang+ and then
    # encodes it with the Encoder for +format+.
    # +options+ will be passed to the Encoder.
    #
    # See CodeRay::Encoders::Encoder#encode.
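    #
    # Example (a minimal sketch that uses the <tt>:ruby</tt> Scanner and the
    # <tt>:div</tt> Encoder mentioned elsewhere in this file):
    #  require 'coderay'
    #  html = CodeRay.encode 'puts 17 + 4', :ruby, :div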
    def encode code, lang, format, options = {}
      encoder(format, options).encode code, lang, options
    end

    # Encode pre-scanned Tokens.
    # Use this together with CodeRay.scan:
    #
    #  require 'coderay'
    #
    #  # Highlight a short Ruby code example in an HTML span
    #  tokens = CodeRay.scan '1 + 2', :ruby
    #  puts CodeRay.encode_tokens(tokens, :span)
    #
    def encode_tokens tokens, format, options = {}
      encoder(format, options).encode_tokens tokens, options
    end

    # Scans +filename+ (a path to a code file), guessing its language, and
    # encodes the result with the Encoder for +format+.
    #
    # See CodeRay.scan_file.
    # Notice that the second argument is the output +format+, not the input language.
    #
    # Example:
    #  require 'coderay'
    #  page = CodeRay.encode_file 'some_c_code.c', :html
    def encode_file filename, format, options = {}
      tokens = scan_file filename, :auto, get_scanner_options(options)
      encode_tokens tokens, format, options
    end

    # Highlight a string into an HTML <div>.
    #
    # CSS styles use classes, so you have to include a stylesheet
    # in your output.
    #
    # See encode.
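    #
    # Example (a sketch; note that +options+ comes before +format+ in this
    # method, so the options hash must be given when choosing another format):
    #  require 'coderay'
    #  div  = CodeRay.highlight('puts 17 + 4', :ruby)
    #  span = CodeRay.highlight('puts 17 + 4', :ruby, { :css => :class }, :span)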
    def highlight code, lang, options = { :css => :class }, format = :div
      encode code, lang, format, options
    end

    # Highlight a file into an HTML <div>.
    #
    # CSS styles use classes, so you have to include a stylesheet
    # in your output.
    #
    # See encode_file.
    def highlight_file filename, options = { :css => :class }, format = :div
      encode_file filename, format, options
    end

    # Finds the Encoder class for +format+ and creates an instance, passing
    # +options+ to it.
    #
    # Example:
    #  require 'coderay'
    #
    #  stats = CodeRay.encoder(:statistic)
    #  stats.encode("puts 17 + 4\n", :ruby)
    #
    #  puts '%d out of %d tokens have the kind :integer.' % [
    #    stats.type_stats[:integer].count,
    #    stats.real_token_count
    #  ]
    #  #-> 2 out of 4 tokens have the kind :integer.
    def encoder format, options = {}
      Encoders[format].new options
    end

    # Finds the Scanner class for +lang+ and creates an instance, passing
    # +options+ to it.
    #
    # See Scanner.new.
    def scanner lang, options = {}, &block
      Scanners[lang].new '', options, &block
    end

    # Extract the options for the scanner from the +options+ hash.
    #
    # Returns an empty Hash if <tt>:scanner_options</tt> is not set.
    #
    # This is used if a method like CodeRay.encode has to provide options
    # for Encoder _and_ Scanner.
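    #
    # Example (a sketch; <tt>:tab_width</tt> is only a placeholder key here,
    # not necessarily an option that any Scanner understands):
    #  CodeRay.get_scanner_options :scanner_options => { :tab_width => 4 }
    #  #-> the inner Hash, { :tab_width => 4 }
    #  CodeRay.get_scanner_options :css => :class
    #  #-> {}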
    def get_scanner_options options
      options.fetch :scanner_options, {}
    end

  end

end