    The official guide to swscale for confused developers.
   ========================================================

Current (simplified) Architecture:
----------------------------------
                        Input
                          v
                   _______OR_________
                 /                   \
               /                       \
       special converter     [Input to YUV converter]
              |                         |
              |          (8-bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:0:0 )
              |                         |
              |                         v
              |                  Horizontal scaler
              |                         |
              |     (15-bit YUV 4:4:4 / 4:2:2 / 4:2:0 / 4:1:1 / 4:0:0 )
              |                         |
              |                         v
              |      Vertical scaler and output converter
              |                         |
              v                         v
                         output


Swscale has 2 scaler paths. Each path must be capable of handling
slices, that is, consecutive non-overlapping rectangles of dimension
(0,slice_top) - (picture_width, slice_bottom).

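As an illustration of the slice notion above, here is a minimal sketch that
cuts a picture into consecutive, non-overlapping horizontal slices. The names
Slice and slice_bounds() are hypothetical and not part of swscale, and the
sketch assumes slice_bottom is exclusive.

```c
/* Hypothetical helper, not swscale API: a slice covers lines [top, bottom). */
typedef struct {
    int top;     /* first line of the slice */
    int bottom;  /* one past the last line of the slice (assumption) */
} Slice;

/* Return slice i out of n for a picture with the given height.
 * Consecutive slices share a boundary, so they never overlap and
 * together cover every line exactly once. */
static Slice slice_bounds(int height, int n, int i)
{
    Slice s;
    s.top    = (int)((long long)height * i / n);
    s.bottom = (int)((long long)height * (i + 1) / n);
    return s;
}
```

Each slice always spans the full picture width, so only the top and bottom
lines need to be tracked.
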
special converter
    These generally are unscaled converters of common
    formats, like YUV 4:2:0/4:2:2 -> RGB12/15/16/24/32. In principle, this
    path could also contain scalers optimized for specific common cases.

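The choice between the two paths can be pictured roughly as follows. This is
only a sketch: the PixFmt enum, has_unscaled_converter() and choose_path() are
hypothetical names, and the list of supported format pairs is a small
made-up subset, not the set swscale really implements.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of pixel formats for the sketch. */
typedef enum { FMT_YUV420P, FMT_YUV422P, FMT_GRAY8, FMT_RGB24, FMT_RGB32 } PixFmt;
typedef enum { PATH_SPECIAL, PATH_MAIN } Path;

/* Assumption: only planar YUV 4:2:0/4:2:2 -> RGB24/32 has a special converter. */
static bool has_unscaled_converter(PixFmt src, PixFmt dst)
{
    bool src_yuv = src == FMT_YUV420P || src == FMT_YUV422P;
    bool dst_rgb = dst == FMT_RGB24 || dst == FMT_RGB32;
    return src_yuv && dst_rgb;
}

/* Use the special converter only when no scaling is needed and a
 * converter for the format pair exists; otherwise use the main path. */
static Path choose_path(int src_w, int src_h, PixFmt src_fmt,
                        int dst_w, int dst_h, PixFmt dst_fmt)
{
    bool unscaled = src_w == dst_w && src_h == dst_h;
    if (unscaled && has_unscaled_converter(src_fmt, dst_fmt))
        return PATH_SPECIAL;
    return PATH_MAIN;
}
```
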
Main path
    The main path is used when no special converter can be used. The code
    is designed as a destination line pull architecture. That is, for each
    output line the vertical scaler pulls lines from a ring buffer. When
    the ring buffer does not contain the wanted line, then it is pulled from
    the input slice through the input converter and horizontal scaler.
    The result is also stored in the ring buffer to serve future vertical
    scaler requests.
    When no more output can be generated because lines from a future slice
    would be needed, then all remaining lines in the current slice are
    converted, horizontally scaled and put in the ring buffer.
    [This is done for luma and chroma, each with possibly different numbers
    of lines per picture.]

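The pull design described above can be sketched with a toy model. Everything
here is an assumption made for illustration: the LinePuller type, the fixed
WIDTH and RING_LINES, the trivial stand-in for the input converter and
horizontal scaler, and the 2-tap averaging vertical filter. Slice handling is
left out.

```c
#include <stdint.h>

enum { WIDTH = 4, RING_LINES = 4 };  /* tiny sizes, for the sketch only */

typedef struct {
    const uint8_t *src;               /* input picture, WIDTH bytes per line */
    int src_h;                        /* number of input lines */
    int16_t ring[RING_LINES][WIDTH];  /* horizontally scaled lines */
    int line_in_slot[RING_LINES];     /* input line held by each slot, -1 if empty */
    int hscale_calls;                 /* how often a line was pulled from the input */
} LinePuller;

static void puller_init(LinePuller *p, const uint8_t *src, int src_h)
{
    p->src = src;
    p->src_h = src_h;
    p->hscale_calls = 0;
    for (int i = 0; i < RING_LINES; i++)
        p->line_in_slot[i] = -1;
}

/* Stand-in for "input converter + horizontal scaler": widen 8 bits to 15. */
static void hscale_line(const uint8_t *in, int16_t *out)
{
    for (int x = 0; x < WIDTH; x++)
        out[x] = (int16_t)(in[x] << 7);
}

/* Return input line y from the ring buffer, pulling it in on a miss. */
static const int16_t *get_line(LinePuller *p, int y)
{
    int slot = y % RING_LINES;
    if (p->line_in_slot[slot] != y) {
        hscale_line(p->src + y * WIDTH, p->ring[slot]);
        p->line_in_slot[slot] = y;
        p->hscale_calls++;
    }
    return p->ring[slot];
}

/* Toy vertical scaler: output line oy averages input lines oy/2 and oy/2 + 1. */
static void vscale_pull(LinePuller *p, uint8_t *dst, int dst_h)
{
    for (int oy = 0; oy < dst_h; oy++) {
        int y0 = oy / 2;
        int y1 = y0 + 1 < p->src_h ? y0 + 1 : y0;
        const int16_t *a = get_line(p, y0);
        const int16_t *b = get_line(p, y1);
        for (int x = 0; x < WIDTH; x++)
            dst[oy * WIDTH + x] = (uint8_t)((a[x] + b[x] + 128) >> 8);
    }
}
```

With 3 input lines and 6 output lines, every input line passes through
hscale_line() only once, because later requests for it are served from the
ring buffer.
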
Input to YUV Converter
    When the input to the main path is not planar 8 bits per component YUV or
    8-bit gray, it is converted to planar 8-bit YUV. Two sets of converters
    exist for this currently: one performs horizontal downscaling by 2
    before the conversion, which halves the chroma resolution; the other
    keeps the full chroma resolution but is slightly slower. The scaler
    will try to preserve full chroma when the output uses it. It is possible
    to force full chroma with SWS_FULL_CHR_H_INP even for cases where the
    scaler thinks it is useless.

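The difference between the two converter sets can be sketched for a single
chroma component. This is not swscale code: the Rgb type and the function
names are hypothetical, the integer coefficients are a common BT.601-style
approximation chosen for the example, and the rounding used when averaging
pixel pairs is an assumption.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;  /* hypothetical packed RGB pixel */

/* Integer BT.601-style approximation of U (Cb); the +128 offset is folded
 * in before the shift so the shifted value is never negative. */
static uint8_t rgb_to_u(int r, int g, int b)
{
    return (uint8_t)((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
}

/* Full chroma: one U sample per input pixel. */
static void u_full(const Rgb *src, uint8_t *dst, int width)
{
    for (int i = 0; i < width; i++)
        dst[i] = rgb_to_u(src[i].r, src[i].g, src[i].b);
}

/* Half chroma: average each horizontal pixel pair first, then convert,
 * producing width / 2 samples with half as many conversions. */
static void u_half(const Rgb *src, uint8_t *dst, int width)
{
    for (int i = 0; i < width / 2; i++) {
        const Rgb *a = &src[2 * i];
        const Rgb *b = &src[2 * i + 1];
        dst[i] = rgb_to_u((a->r + b->r + 1) >> 1,
                          (a->g + b->g + 1) >> 1,
                          (a->b + b->b + 1) >> 1);
    }
}
```
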
Horizontal scaler
    There are several horizontal scalers. A special case worth mentioning is
    the fast bilinear scaler that is made of runtime-generated MMXEXT code
    using specially tuned pshufw instructions.
    The remaining scalers are specially tuned for various filter lengths.
    They scale 8-bit unsigned planar data to 16-bit signed planar data.
    Supporting inputs with more than 8 bits per component will require
    adding a new horizontal scaler that preserves the input precision.

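A generic horizontal scaler of the kind described above can be sketched in
plain C as follows. The function name, parameter layout and the shift by 7
are assumptions made for this example: with 8-bit input and coefficients whose
1.0 point is 1 << 14, shifting the sum right by 7 leaves a 15-bit result in a
16-bit signed sample.

```c
#include <stdint.h>

/* Sketch of an arbitrary-length horizontal filter (hypothetical signature).
 *   filter_pos[i] : index of the first source pixel used for output pixel i
 *   filter        : filter_size coefficients per output pixel, 1.0 == 1 << 14
 * Negative sums are not handled here, since right-shifting a negative
 * value is implementation-defined in C. */
static void hscale_8_to_15(int16_t *dst, int dst_w, const uint8_t *src,
                           const int16_t *filter, const int32_t *filter_pos,
                           int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        int val = 0;
        for (int j = 0; j < filter_size; j++)
            val += src[filter_pos[i] + j] * filter[filter_size * i + j];
        val >>= 7;                                       /* 8 + 14 - 7 = 15 bits */
        dst[i] = (int16_t)(val > 32767 ? 32767 : val);   /* clip overshoot */
    }
}
```
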
Vertical scaler and output converter
    There is a large number of combined vertical scalers + output converters.
    Some are:
    * unscaled output converters
    * unscaled output converters that average 2 chroma lines
    * bilinear converters                (C, MMX and accurate MMX)
    * arbitrary filter length converters (C, MMX and accurate MMX)
    And
    * Plain C  8-bit 4:2:2 YUV -> RGB converters using LUTs
    * Plain C 17-bit 4:4:4 YUV -> RGB converters using multiplies
    * MMX     11-bit 4:2:2 YUV -> RGB converters
    * Plain C 16-bit Y -> 16-bit gray
      ...

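As a rough illustration of an arbitrary filter length vertical scaler with a
planar 8-bit output, consider the sketch below. It assumes 15-bit input lines
and vertical coefficients with a 1.0 point at 1 << 12, so a shift by 19 brings
the sum back to 8 bits. The function name and the fixed rounding constant are
assumptions for this example.

```c
#include <stdint.h>

/* Sketch of a vertical filter (hypothetical signature).
 *   lines  : filter_size pointers to horizontally scaled 15-bit lines
 *   filter : one coefficient per line, 1.0 == 1 << 12 */
static void vscale_15_to_8(uint8_t *dst, int width,
                           const int16_t *const *lines,
                           const int16_t *filter, int filter_size)
{
    for (int x = 0; x < width; x++) {
        int val = 1 << 18;                    /* rounding: half of 1 << 19 */
        for (int j = 0; j < filter_size; j++)
            val += lines[j][x] * filter[j];
        if (val < 0)
            val = 0;                          /* clip before shifting */
        val >>= 19;                           /* 15 + 12 - 19 = 8 bits */
        dst[x] = (uint8_t)(val > 255 ? 255 : val);
    }
}
```
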
RGB with fewer than 8 bits per component uses dither to improve the
subjective quality and low-frequency accuracy.


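The effect of dither on low-frequency accuracy can be shown with a small
ordered-dither sketch that reduces an 8-bit component to 5 bits. The 2x2
threshold table and the function name are assumptions chosen for the example.
swscale has its own dither tables, which are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical 2x2 ordered-dither offsets covering the 3 discarded bits. */
static const uint8_t dither_2x2[2][2] = {
    {0, 4},
    {6, 2},
};

/* Quantize an 8-bit component to 5 bits, adding a position-dependent
 * offset before truncation so that the rounding error varies per pixel. */
static uint8_t quantize_5bit(uint8_t v, int x, int y)
{
    int q = (v + dither_2x2[y & 1][x & 1]) >> 3;
    return (uint8_t)(q > 31 ? 31 : q);
}
```

For a flat area with value 100, plain truncation gives 12 everywhere, which
maps back to 96. With the offsets above, a 2x2 block yields 12, 13, 13 and 12,
whose average of 12.5 maps back to 100.
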
Filter coefficients:
--------------------
There are several different scalers (bilinear, bicubic, lanczos, area,
sinc, ...). Their coefficients are calculated in initFilter().
Horizontal filter coefficients have a 1.0 point at 1 << 14, vertical ones at
1 << 12. The 1.0 points have been chosen to maximize precision while leaving
a little headroom for convolutional filters like sharpening filters and
minimizing the number of SIMD instructions needed to apply them.
It would be trivial to use a different 1.0 point if some specific scaler
would benefit from it.
Also, as already hinted at, initFilter() accepts an optional convolutional
filter as input that can be used for contrast, saturation, blur, sharpening,
shift, chroma vs. luma shift, ...
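To make the 1.0 point concrete, the following sketch converts floating-point
filter taps to fixed-point coefficients for a given 1.0 point. It is not a
copy of initFilter(): the function name is hypothetical, and carrying the
rounding error from tap to tap is one possible way to make the coefficients
sum to exactly the 1.0 point.

```c
#include <stdint.h>

/* Quantize n taps so that they sum to `one` (for example 1 << 14 or 1 << 12).
 * The rounding error of each tap is carried into the next one. */
static void quantize_filter(const double *taps, int n, int one, int16_t *out)
{
    double sum = 0.0;
    double err = 0.0;

    for (int i = 0; i < n; i++)
        sum += taps[i];

    for (int i = 0; i < n; i++) {
        double v = taps[i] / sum * one + err;
        /* round to nearest without needing the math library */
        int iv = v >= 0 ? (int)(v + 0.5) : -(int)(0.5 - v);
        out[i] = (int16_t)iv;
        err = v - iv;
    }
}
```

A sharpening-like kernel such as {-0.125, 1.25, -0.125} quantized with a 1.0
point of 1 << 14 gives a center tap of 20480. That is above 1.0 but still fits
in a 16-bit signed coefficient, which is the kind of headroom mentioned above.
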