Talk:TXTDSC
taken from ProdSpec S-100 part 10 proposal
The product specification imposes special requirements on the content of files named in TXTDSC or NTXTDS or equivalent new attributes intended for formatted text. SNPWG requires such constraints because digital nautical publications need formatted text. Proposals suggest well-known schemas or DTDs whose specifications are publicly available. Examples are HTML 4.01 (“Strict”) or XHTML 1.0 (“Strict”). If the product specification follows the HTML/XHTML approach, the use of those standards should be restricted further, for example by stating that the content must be valid HTML 4.01 and must not use specified tags (<a>, <img>, etc.).
The specification for formatted text content should prohibit dependencies on another component of the data set or on external data. These rules mean that scripts, Java, and external CSS stylesheets are prohibited, as are external links (web site addresses may be included, but may not be coded as HTML links). Images may not be used, or may be used only in strictly limited circumstances. The use of frames should be prohibited.
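To make the restrictions above concrete, here is a minimal sketch of how a producer or an ECDIS might check a TXTDSC/NTXTDS file against them. It is not part of any proposal: the names `FORBIDDEN_TAGS`, `RestrictedTextChecker` and `check_formatted_text` are hypothetical, the tag list is only one reading of the rules (it takes the strictest option and rejects all images), and it does not check validity against the HTML 4.01 DTD.

```python
from html.parser import HTMLParser

# Hypothetical rule set paraphrasing the restrictions discussed above.
# Assumption: the strictest reading, in which images are never allowed.
FORBIDDEN_TAGS = {
    "script", "applet", "object", "embed",   # scripts, Java, embedded content
    "frame", "frameset", "iframe",           # frames
    "img",                                   # images
    "a",                                     # coded links
}


class RestrictedTextChecker(HTMLParser):
    """Collects a message for every construct the rules would prohibit."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in FORBIDDEN_TAGS:
            self.problems.append(f"forbidden tag <{tag}>")
        elif tag == "link":
            rel = (dict(attrs).get("rel") or "").lower()
            if "stylesheet" in rel.split():
                self.problems.append("external stylesheet via <link>")


def check_formatted_text(html_text):
    """Return a list of rule violations; an empty list means none were found."""
    checker = RestrictedTextChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems
```

For example, `check_formatted_text('<p>See www.example.org</p>')` returns an empty list because the address is plain text, while the same address wrapped in an `<a>` element would be reported.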
jens 07:18, 22 August 2008 (CEST)
After a discussion with the industrial technical experts of SNPWG yesterday, I would like to share some more thoughts on the formats of the files referenced by TXTDSC/NTXTDS. The ideas should be taken into consideration by TSMAD. We do not want to push the standard one way or the other. We only intend to ensure that as many options as possible have been discussed.
i. The standard may define a list of acceptable file formats. The advantage is that *.txt would be allowed as well, so HOs would not be forced to restructure their existing *.txt information into one particular file type such as *.html, *.xhtml or *.htm. No disadvantages spotted so far.
ii. The standard may permit the same information in different files with the same file name but different extensions; e.g. file abc.txt contains the same information as abc.xml. The advantage is that the decision about which file is presented on an ECDIS screen is left to the ECDIS software. This would open up healthy competition over which system presents the information to the mariner in the most convenient way. The disadvantages are the difficulty of keeping both versions of the content harmonized and of convincing HOs to provide the information in different formats. However, this might be work for the HOs or a RENC.
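Option ii could work roughly as sketched below: the ECDIS software looks for all files that share the base name given in TXTDSC and picks the variant it prefers. This is only an illustration. The function name `pick_variant` and the default preference order are assumptions, not anything a standard prescribes.

```python
from pathlib import PurePosixPath


def pick_variant(file_names, base_name,
                 preference=(".xml", ".xhtml", ".html", ".htm", ".txt")):
    """Return the file with the given base name whose extension the
    system prefers most, or None if no variant is present.

    `preference` is a hypothetical, system-specific ordering.
    """
    by_ext = {}
    for name in file_names:
        path = PurePosixPath(name)
        if path.stem.lower() == base_name.lower():
            by_ext[path.suffix.lower()] = name
    for ext in preference:
        if ext in by_ext:
            return by_ext[ext]
    return None
```

With the files `abc.txt` and `abc.xml` in the exchange set, a system using the default order would present `abc.xml`, while a simpler system could pass `preference=(".txt",)` and fall back to the plain text.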
Barrie's reply 2008_08_22
An interesting debate. Much like S-57, S-100 does not prescribe formats for text files other than the character set codes described in the metadata component. The formats are defined in a product specification and, other than taking potential interoperability issues into consideration, the choice is free: txt, XHTML, XML etc. The only other consideration is to be careful when considering proprietary formats like PDF. These are unsuitable for use in ECDIS because the colour palette cannot be controlled for differing light conditions.
The multi-format concept could be mandated in the prod spec, forcing producers to distribute different formats. Obviously data size and maintenance would have to be considered as potential issues.
The topic of mandation is very important considering the experience we have gained and lessons learned from the ENC world. Much of the inconsistency inherent in ENCs from different producers is due to a lack of clear encoding policies prescribed in the prod spec. This will be addressed in S-101.
David's comments 2008_08_22
Hi John, Raphael and Jens, Many thanks for this contribution. I have been thinking about how an XHTML, XML or GML file might be used, and please forgive my technical ignorance. In the first case, would we need to use a style sheet, or could we operate without one? I noticed on page 10 of the enclosure that "external CSS style sheets are prohibited". Secondly, it seemed to me that if we were using XML the data would be tagged, so an XML file could be used to provide information on a number of subjects and, if we used GML, it could be for one particular geometry. I am thinking here about simple point features like DgpsStation and much more complex features like a port or a waterway, which contain information on multiple very different subjects. And I then have a further thought, really for John and Raphael. Would there be advantages in referencing the same GML file from a number of different features and expecting an ECDIS or ECS to return only the bit of information requested, preferably in a nicely formatted or structured layout, with relevant and sensible headings, indentation and use of bold, italics, colour and perhaps hyperlinks? Am I in the land of the fairies?
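As a rough illustration of the idea in the comment above (one tagged file covering several subjects, with the system returning only the part requested), here is a small sketch. The element and attribute names (`portInfo`, `section`, `subject`, `title`, `para`) and the sample text are invented for the example and do not come from S-100, GML or any product specification.

```python
import xml.etree.ElementTree as ET

# Invented sample: one file holding text on several subjects for a port.
SAMPLE = """<portInfo>
  <section subject="pilotage">
    <title>Pilotage</title>
    <para>Pilotage is compulsory.</para>
  </section>
  <section subject="anchorage">
    <title>Anchorage</title>
    <para>Anchoring is prohibited in the fairway.</para>
  </section>
</portInfo>"""


def extract_section(xml_text, subject):
    """Return (title, [paragraphs]) for the requested subject, or None.

    The caller, e.g. a feature pointing at this file, supplies the
    subject it is interested in; everything else is ignored.
    """
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):
        if section.get("subject") == subject:
            title = section.findtext("title", default="")
            paras = [p.text or "" for p in section.findall("para")]
            return title, paras
    return None
```

Several features could then reference the same file and each ask for a different `subject`. How the returned title and paragraphs are laid out (headings, bold, colour) would remain a portrayal decision for the ECDIS or ECS.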
Raphael 2008_08_25 reply to Barrie
Thanks for your comments. I agree about PDF, and of course the same argument would apply to any format that cannot have different palettes applied to its portrayal. Similar arguments might apply to Microsoft Word. It might be possible to devise different MS Word styles for different conditions, but that seems too much effort for the benefit, for style designers as well as implementors. So it may be simpler to restrict the list of acceptable formats to text, HTML, XHTML, and XML-based formats.
Concerning mandation - My belief is that marked-up text (HTML, XML, etc.) would be needed only for relatively complex text content, and some (much?) of the existing content of TXTDSC/NTXTDS files would do just as well in its current form (i.e., plain text). HTML/XML text would also need more processing than plain text. So mandating multiple formats *everywhere* might be objected to by both HOs and ECDIS manufacturers on the grounds of development and conversion effort and system performance. One compromise might be to mandate (or recommend?) HTML/XML only for more complex content (such as lists and tables).