In implementing the Metro parser on top of Fitz, and with the new road map, I am facing the task of designing the API for managing resources in Fitz. Up till now I have punted on making any final decisions, exploring as I go. Now I feel that I cannot do so much longer. Therefore I would like to hear your opinions and ideas on this. Please brainstorm and point out any missing pieces.

First, a ten mile high overview of the Fitz architecture and nomenclature, for those of you who are not in the loop or need a refresher :)

The Fitz world is a set of resources. There are many kinds of resources. Resources can depend on other resources, but no circular dependencies. The resource types are: tree, font, image, shade, and colorspace.

A document is a sequence of tree resources that define the contents of its pages.

A front-end is a producer of Fitz worlds: it reads a file and creates a Fitz world from its contents. Abstracting this into a high-level interface is useful primarily for viewers and converters.

A back-end is a consumer of Fitz worlds. The default rasterizer is one back-end; PDF writers and other output drivers are others. I don't think there should be a special interface for these. They are just functions or programs that take the Fitz world and do something unspecified with it.

The resource API is what I need help fleshing out. Keep in mind that a Fitz world should be able to be serialized to disk, so having callbacks and other hooks into the front-end is a no-no. If it weren't for this, my life would be a lot simpler :) We need to cover creation, querying, and the on-disk format.

--------------------------------------------------------------------

TREE

The Fitz tree resource is the primary data structure. A tree consists of nodes. There are leaf nodes and branch nodes. Leaf nodes produce images, branch nodes combine or change images. An image here is a two-dimensional region of color and shape (alpha). A rough C sketch of one possible node layout follows after the node list below.

LEAF NODES

SOLID. A constant color or shape that stretches out to infinity.

IMAGE. A rectangular region of color and/or shape derived from a grid of samples. Outside this rectangle is transparent. References an image resource.

SHADE. A mesh of patches that define a region of interpolated colors. Outside the mesh is transparent. References a shade resource.

PATH. A path defines only shape. Moveto, lineto, curveto and closepath. Stroke, fill, even-odd fill. Dash patterns.

TEXT. A text node is an optimization, space-efficiently combining a transform matrix and references to glyph shapes. Text nodes define only shape. A text node has the a, b, c, d matrix coefficients and a reference to a font, then an array of (glyph index, e, f) coefficient tuples. Text nodes may also have a separate unicode array, associating the (glyph, e, f) tuples with unicode character codes. One-to-many and many-to-one mappings are allowed. This is for text search and copy.

BRANCH NODES

TRANSFORM. A transform node applies an affine transform matrix to its one and only child.

OVER. An over node stacks its children on top of each other, combining their colors with a blending mode.

MASK. A mask node has two children. The second child is masked with the first child, multiplying their shapes. This causes the effect of clipping.

    (mask (path ...) (solid 'devicegray 0))

BLEND. This does the magic of PDF 1.4 transparency. The isolated and non-isolated and knockout group stuff happens here. It also sets the blend mode for its children.

LINK. This is a dummy node that grafts in another tree resource. The effect is as if the other tree was copied here instead. References a tree resource.

META. A way to insert application specific data. Transparent to the rasterizer, but can be useful to preserve some non-Fitz semantics that may be of use to specific producers/consumers of Fitz trees.

For example, tiling patterns would be represented as an over node with many transform and link nodes stamping out the pattern to fill the page. Putting these under an appropriate META node would allow a PDF or PostScript backend to detect and recreate the tiling pattern.

    (meta 'pattern "...tiling pattern info..."
      (over
        (transform 1 0 0 1 0 0 (link 'pat1))
        (transform 1 0 0 1 0 1 (link 'pat1))
        (transform 1 0 0 1 1 0 (link 'pat1))
        (transform 1 0 0 1 1 1 (link 'pat1))))
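To make the node taxonomy a little more concrete, here is a minimal C sketch of one way the node kinds could map onto a tagged union. Every identifier and payload layout here is an illustrative assumption, not a proposal for the final API, and several payloads are elided.

    /* Illustrative sketch only -- all names here are guesses, not the
     * final API.  One possible tagged-union layout for the leaf and
     * branch nodes described above. */

    typedef enum
    {
        /* leaf nodes: produce color and/or shape */
        FZ_NSOLID, FZ_NIMAGE, FZ_NSHADE, FZ_NPATH, FZ_NTEXT,
        /* branch nodes: combine or change their children */
        FZ_NTRANSFORM, FZ_NOVER, FZ_NMASK, FZ_NBLEND, FZ_NLINK, FZ_NMETA
    } fz_nodekind;

    typedef struct fz_node fz_node;

    struct fz_node
    {
        fz_nodekind kind;
        fz_node *first;     /* first child (branch nodes only) */
        fz_node *next;      /* next sibling */
        union
        {
            struct { char *colorspace; float v[4]; } solid;
            struct { char *image; } image;      /* name of image resource */
            struct { char *shade; } shade;      /* name of shade resource */
            struct { float m[6]; } transform;   /* a b c d e f */
            struct { char *tree; } link;        /* name of grafted tree resource */
            struct { char *tag; char *data; } meta;
            /* path, text, over, mask and blend payloads elided */
        } u;
    };

The point of sketching it as plain data (resource references by name, no function pointers out to the front-end) is that such a tree can be walked and written to disk without calling back into the producer, which is the serialization constraint above.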
--------------------------------------------------------------------

COLORSPACES

A colorspace needs the capability to transform colors into and out of any other colorspace. I suggest that this happens either by going through a standard colorspace (CIE XYZ) or by having an optional A-to-B shortcut transform.

I am thinking of three sub-classes:

Device colors. Fast and dirty: Gray, RGB, CMYK only.

ICC profiles. I am not very familiar with this. Use Argyll? Are they easy to create programmatically to represent the Cal* and L*a*b* colorspaces?

Separation. For Separation and DeviceN. This is a list of named colors, with backend specific tailoring required to make sense of it. Also has an alternate colorspace (Device or ICC) with a transform. How do we represent the transform function?

--------------------------------------------------------------------

SHADES

This is fairly simple. A mesh of patches as in PDF. Three levels of detail: with full tensors, with only patches, with linear quads.

Axial and radial shadings are trivially converted. Type 1 (functional) shadings need to be sampled. If a backend cannot cope, it can either convert to linear shaded triangles (clip an axial shading with a triangular path) or render to an image.

--------------------------------------------------------------------

FONTS

There need to be four types of font resources. For now I am going to use FreeType, but that should not be a necessity. The resource format for fonts should be independent.

* Fake fonts for substituted fonts. Refer to another fall-back font resource and override the metrics. Could possibly be represented as a Type 3 font, but knowing that it is a substitute may be useful.

* Type 3 fonts, where each glyph is represented as a Fitz tree resource.

* PostScript fonts, in CFF format. Type 1 and Type 1 CID fonts are losslessly convertible to CFF. OpenType fonts can have CFF glyph data.

* TrueType fonts.

--------------------------------------------------------------------

IMAGES

This is the tricky one. Raph, I forgot what we decided on the tiling. What size, planar or chunky, etc. Which bit depths do we allow?

Image data should be chopped into tiles to allow for independent and random access and more CPU-cache friendly image scaling and color transforms. A rough sketch of what tile addressing could look like follows after the list.

* JPEG encoded images. Save byte+bit offsets to allow random access to groups of eight scanlines.

* Monochrome images.

* Contone images.
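To make the tiling idea a bit more concrete, here is a minimal sketch of how tile addressing could work, assuming a chunky layout, 8 bits per component and a 64x64 tile size purely for illustration; none of those numbers are decisions, and all identifiers are made up.

    /* Illustrative sketch only -- tile size, layout and names are
     * assumptions.  Samples are chopped into fixed-size chunky tiles
     * so a backend can decode, scale and color-transform each tile
     * independently of the others. */

    #define FZ_TILESIZE 64

    typedef struct
    {
        int w, h;               /* image size in samples */
        int n;                  /* components per sample */
        int tw, th;             /* number of tiles across and down */
        unsigned char **tiles;  /* tw * th decoded tiles, chunky, 8 bpc */
    } fz_imagetiles;

    /* return the tile covering sample (x, y) and its local coordinates */
    static unsigned char *
    fz_gettile(fz_imagetiles *img, int x, int y, int *tx, int *ty)
    {
        int col = x / FZ_TILESIZE;
        int row = y / FZ_TILESIZE;
        *tx = x % FZ_TILESIZE;
        *ty = y % FZ_TILESIZE;
        return img->tiles[row * img->tw + col];
    }

For JPEG encoded images the tiles would presumably stay encoded, with the saved byte+bit offsets giving the same kind of random access at eight-scanline granularity.

The End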