Talk:Specification

From cbwiki.net

cb:value

Paul Asman-FRBNY 16:42, 21 May 2007 (BST) I don't understand one of the new examples. Given <cb:value unit_mult="9" decimals="3">1.781</cb:value>, which makes sense to me, why is the next example <cb:value frequency="weekly" units="Billions of US dollars" decimals="3">241.6</cb:value>? Shouldn't it be <cb:value frequency="weekly" units="USD" unit_mult="9" decimals="3">241.6</cb:value>?

San 18:32, 22 May 2007 (BST) Excellent point Paul! In this particular case, the currency abbreviation combined with the unit_mult attribute is perfect. For indexes or other non-currency data, there may not be as simple an expression. I'll make the changes.
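
A minimal sketch of how a consumer might read the corrected form, assuming (as this thread does) that unit_mult is the power of ten applied to the element's value. The namespace URI and the function name `scaled_value` are placeholders for illustration, not part of the spec.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Placeholder namespace URI, used only so the fragment parses (assumption).
SAMPLE = (
    '<cb:value xmlns:cb="http://example.org/rss-cb" frequency="weekly" '
    'units="USD" unit_mult="9" decimals="3">241.6</cb:value>'
)

def scaled_value(xml_text):
    """Return (value scaled by 10**unit_mult, units) for one cb:value element."""
    elem = ET.fromstring(xml_text)
    # A missing unit_mult is treated as 10**0 here (assumption, not from the spec).
    exponent = int(elem.get("unit_mult", "0"))
    amount = Decimal(elem.text.strip()) * Decimal(10) ** exponent
    return amount, elem.get("units")

amount, units = scaled_value(SAMPLE)
print(amount, units)
```

Decimal is used rather than float so that published figures are not altered by binary rounding.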

Transactions

Christine Sommo-FRBNY 22:25, 11 May 2007 (BST) I added Transactions (open market operations for many of us, to be precise) to the application guide for statistics about two weeks ago. No one has objected to my work thus far, so I am going to bring that info into the spec and the user guide and start publishing soon.

dc:creator

Christine Sommo-FRBNY 11:57, 26 April 2007 (ADT) This seems to have gotten lost. It is used and discussed in the application guide for statistical data, but not carried over here. Was it decided against? Allen Galiza is going to update this spec to include dc:creator.

Steven Bagshaw-BIS 04:22, 27 April 2007 (ADT): Is it now required for all applications? This is how it reads now... it's the first I've heard of it and adding it would make our current feed non-compliant.

Christine Sommo-FRBNY 09:42, 27 April 2007 (ADT) I asked Allen to put it in here to grab attention. I think we had intended to make it required, but dropped it. I support making it required for ALL applications, but would like others to weigh in before we all start changing our existing compliant feeds or remove it from the spec. My memory suggests that it had more applicability for research papers and statistical feeds than the others, but it also provides a field for each institution to declare an "official" abbreviated term for itself (themselves?).

Steven Bagshaw-BIS 08:57, 2 May 2007 (ADT): In the sample RSS-CB file we have dc:publisher in the channel element. Is this serving the same purpose? Would it be better to make this required?

Why would it be better in each item, rather than in the channel? I can only think for aggregators who are collecting together these items, but I'm not sure of the value of it. (I suppose the channel could have a publisher - the aggregator - and each item a creator - the source institution). Other opinions welcome... I won't change our feed just yet.

Christine Sommo-FRBNY 13:35, 4 May 2007 (ADT) My use of dc:creator was just wrong. I've since replaced it with a custom element - cb:institution_abbrev - to capture the (wait for it) institutional abbreviation. dc:creator is still a useful element for research papers, but not the statistical feeds. Google Reader and the RSS 2.0 crowd are using dc:creator to define the individual who created the content (for example, the NY Times feeds list the individual author(s) of articles and columns using the dc:creator element). I still want an element in which each institution can declare its "official" abbreviated title, therefore cb:institution_abbrev.

See the flurry of discussion about channel and item titles from November to learn why this tag is necessary for eventual aggregation purposes.

      • Note - This has been corrected to cb:institutionAbbrev.

Detailing recommended or required in the spec

Steven Bagshaw-BIS 13:39, 12 April 2007 (ADT): I started having a look at the recent changes to the spec.

My question is this: just because our current validation tools don't enforce some things, should we then say that fields previously called "required" are just "recommended"? I think it might be better to say they are required in the spec and let that guide the validation tools (however we end up doing that).

For example, cb:application is required for events, news etc. But in the spec, now it says it's just recommended - and might be required if the structure changes. I'd rather just say it is required for these specific applications, so that people will enter them in, even if the validator doesn't (yet) cope with this. (Well, actually if you had a feed of only events, the current validation would throw an error if there were no cb:application).

I haven't changed anything - just thought I'd ask on this first. I think the spec drives the validator, rather than vice versa, but I'm not sure it reads that way now??

Paul Asman-FRBNY 15:54, 12 April 2007 (ADT) My concern is this. A consumer of RSS-CB could (and perhaps should) look to the spec to see what to expect, and adjust receiving applications on that basis. That is, a consumer could see 'required' in the spec and legitimately think that valid(ated) files will conform to that requirement. When they don't, there may be problems. I look at validation as a tool for the consumer of the feeds.

That said, I don't think that we need to choose one way or the other. You want to use 'required' for one purpose, I want to use it for another. Why not use it for both? For something that can be validated, let's just call it required. For something that we don't want to require on any level, we'll call it optional, or recommended, or recommended for x, whichever the case.

For the parts we want to require but for which we have no mechanism - and this goes for the title for FX rates as well, I think - let's use a status to the effect of "required but not yet enforced." That gives creators of RSS-CB notice that this is something they should do (and, given the 'yet', something they may well be required to do), but also notifies consumers that file validation will not guarantee that the requirement is met.

Mike Eltsufin-FRBNY 16:18, 12 April 2007 (ADT) I completely agree with Steve's principle that "the spec drives the validator, rather than vice versa." We cannot write an XML Schema for the current version of the spec because XML Schema is not expressive enough, but we can write a validator in Java/C that would work. But the reason XML Schema is not expressive enough for our purpose is that the spec does not use XML properly. The real reason for changing the spec should be to make proper use of XML; as a result, we will be able to create an XML Schema that validates the structure.

I don't think we should use the qualification "required but not yet enforced" because enforcement is the problem for the tools, not the spec. The right place for that note is the user guide or the page that describes the tools.

Paul Asman-FRBNY 17:26, 12 April 2007 (ADT) When Mike says that our spec does not use the XML spec properly, I believe he's saying that we ought to have the hierarchy for applications expressed via new, intermediate elements. I'm not sure, but either way that is a major enough modification to require a new release. I'd like to take care of this more immediate concern of requirement status first.

I understand Mike's argument - I may even find it convincing - but my concern remains that consumers of RSS-CB might then expect certain things for RSS-CB feeds that they cannot be guaranteed to receive. I'm not sure that the user guide is the place to include this information; I thought of it as a guide for creators rather than consumers. (That is, I thought that the users in question were RSS-CB creators. Am I wrong?) Is there any objection to putting a paragraph about this issue in the spec as a non-normative note, with individual Notes for the relevant individual fields indicating that we are not yet aware of validation schemas that can be used to enforce the requirement?

Steven Bagshaw-BIS 04:28, 17 April 2007 (ADT): I think as long as there is a caveat on any validators that do not fully enforce the spec, then it's OK. If you read the spec, it should still read correctly I think.

Rather than cluttering up the whole spec, my suggestion would be to make a single note, with a little link (if you like) to the note where relevant. The note would be something like "current validating tools for RSS-CB do not enforce <such and such>, but are intended to in the future. Where this is the case, the contents of the specification as found on this website are to be considered correct, even if they cannot (yet) be validated automatically."

Then it will be easier to remove the note when we have a fully whizzo validator.

So... where are we at with taking up the hierarchicalization (!) proposal of Mike's?

Using hierarchical elements to validate different application types

Mike Eltsufin-FRBNY 12:50, 16 March 2007 (ADT) I've been playing around with the schemas to see what can be done to permit validation of the different application types. Currently, the different applications with their specific requirements are supported by having multiple schemas for the same namespace. However, it's awkward to use and has problems such as the inability to mix items from different application types in the same feed. My feeling is that to make validation work properly, we need to exploit the hierarchical nature of XML.

Currently, the application type is specified using the <application> element. The elements that follow the <application> element must conform to the specified application type. For example:

VALID:
<item>
  <application>event</application>
  <location>33 Maiden Lane</location>
</item>

INVALID:
<item>
  <application>news</application>
  <location>33 Maiden Lane</location>
</item>

What we want to happen is to constrain what can go after the <application> element based on what’s inside of it. This rule is impossible to express in XML Schema because <location> is a sibling, not a child of <application>.

However the following rule can be enforced using XML Schema:

VALID:
<item>
  <event> 
    <location>33 Maiden Lane</location>
  </event>
</item>

INVALID:
<item>
  <news>
    <location>33 Maiden Lane</location>
  </news>
</item>

This rule can be expressed because <location> is now a child of the element that determines the rule. This approach also seems to be more appropriate because it utilizes the hierarchical nature of XML.

This option would require changes to the specification. What do you think?
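
The difference between the two layouts can be sketched in a few lines of Python: once <location> is a child of the application element, the check reduces to a lookup keyed on the parent. This is an illustration only, not the proposed XML Schema; the `ALLOWED_CHILDREN` table and the function name are assumptions, and only the event/location and news/location cases come from the examples above.

```python
import xml.etree.ElementTree as ET

# Children permitted under each application element (illustrative table).
ALLOWED_CHILDREN = {
    "event": {"location"},
    "news": set(),
}

def misplaced_children(item_xml):
    """Return (parent, child) pairs where a child is not allowed under its parent."""
    item = ET.fromstring(item_xml)
    problems = []
    for parent in item:
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            continue  # not an application element; ignored in this sketch
        for child in parent:
            if child.tag not in allowed:
                problems.append((parent.tag, child.tag))
    return problems

VALID = "<item><event><location>33 Maiden Lane</location></event></item>"
INVALID = "<item><news><location>33 Maiden Lane</location></news></item>"
print(misplaced_children(VALID))
print(misplaced_children(INVALID))
```

In the flat layout the same check would have to read the text of <application> first and then inspect its siblings, which is the dependency XML Schema cannot express.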

Steven Bagshaw-BIS 14:24, 16 March 2007 (ADT): I'm not sure about modifying the spec at this stage, just to make the schema validation easier. But if everyone else goes for it (some people may have to modify existing work), it seems OK to me.

Paul Asman-FRBNY 09:04, 6 April 2007 (ADT) As things stand now, we cannot validate the status "required for some application types" that we include in the specification. It seems wrong to include something in the specification that cannot be validated. There are at least two responses to alleviate this unease: to change the status to "recommended for some application types," which allows the specification to stay as it is otherwise, or to introduce more hierarchy, as Mike suggests in the post that began this thread.

I am not prepared to make a decision on my own. With the specification as it is, we have in effect decided on the first option, albeit with some careless wording. Changing the specification to increase hierarchy strikes me as at least a new point release, and possibly a full release. As such, it should not be taken lightly. On the other hand, delaying it may result in a later invalidation of applications that receive RSS-CB feeds. What's the mechanism for deciding?

I've noticed that the order of elements differs in the specification and the application guides. To straighten that out, I took the order of the specification as authoritative, with the exception that I put all the cb elements after all the dc and dcterms elements. I also changed the specification to reflect more accurately the current situation, by changing the status 'required for some applications' to 'recommended' with a note that a non-enforceable requirement is in effect.

Steven Bagshaw-BIS 04:17, 10 April 2007 (ADT): I am prepared to vote on this... in which case, I would vote to make the change Mike suggested.

If we vote, the question is what to do with "abstentions".


San Cannon-FRB 10:50, 19 April 2007 (ADT) I vote against. This is just what we discussed in Dallas this week and I think that while it improves the ability to strictly validate combined feeds, it also significantly increases the complexity of the implementation and therefore the cost of adoption. If we are really concerned about the small banks who only issue a few feeds, we want to make sure it is as easy as possible to implement things even if they cannot be strictly validated. So I submit a "No" vote for myself and as the Board representative (in case we just count one vote per institution.)

Steven Bagshaw-BIS 12:36, 19 April 2007 (ADT): I'm not sure why it would increase the complexity of implementation, except for those who have implemented the spec as is - which is not many so far. Can you give me an example of why it would? As far as I can tell, you would just drop the cb:application element, make it a parent element with the same name, and then dump all the existing stuff underneath it.
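
A rough sketch of the conversion Steven describes, using the unprefixed element names from Mike's example at the top of this thread. The function name is hypothetical, and moving every remaining child under the new parent is a literal reading of "dump all the existing stuff underneath it", not an agreed rule.

```python
import xml.etree.ElementTree as ET

def to_hierarchical(item_xml):
    """Wrap an item's other children in an element named after <application>."""
    item = ET.fromstring(item_xml)
    app = item.find("application")
    if app is None or not (app.text or "").strip():
        return item_xml  # nothing to convert
    wrapper = ET.Element(app.text.strip())
    for child in list(item):
        item.remove(child)
        if child is not app:
            wrapper.append(child)  # the <application> element itself is dropped
    item.append(wrapper)
    return ET.tostring(item, encoding="unicode")

FLAT = "<item><application>event</application><location>33 Maiden Lane</location></item>"
print(to_hierarchical(FLAT))
```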

Steven Bagshaw-BIS 15:40, 30 May 2007 (BST): Hi, I'd like to bring this up again.

I've tried again without success to implement any kind of validation of the spec as it stands. So, for now I'm going to withdraw the validation tools, as they do not do the job anyway.

I have one last suggestion for changes to the spec to make validation possible without intensive work...

Instead of <cb:application>event</cb:application> we could use <cb:applicationEvent/>. This one small change would make validation relatively simple and would meet all of its requirements: a single XSD file to validate against, correct validation of mixed feeds, correct validation of feeds not of a specific application, and so on.

[Mike's suggestion would make for cleaner schema files, but since it got knocked back...]

We would have the new elements <cb:applicationEvent>, <cb:applicationNews>, <cb:applicationStatistics> etc.

Comments please!

Christine Sommo-FRBNY 13:47, 31 May 2007 (BST) I'm okay with this change, but will wait for some of the more technical folks to chime in. Of course, San is on sabbatical and Paul is in Tahiti, so...

Steven Bagshaw-BIS 15:13, 31 May 2007 (BST): I think I'd like to get one of them to give their OK on it before I went ahead with it. Let's wait for a little bit then - and for comments from any others of course.

Mike Eltsufin-FRBNY 17:25, 31 May 2007 (BST) It is a valid option for making schema validation possible, and it is a smaller change to the spec than the hierarchical approach, but it just doesn't feel aesthetically right. It feels more like a hack to enable schema validation. Nonetheless, I would prefer it over the current spec.

San 19:03, 31 May 2007 (BST) I'm on sabbatical but I'm not dead yet! (Monty Python reference fully intended!) While I'd like to see what Paul's take is, I have no problem with this fix. I agree with Mike that it feels like a hack but it maintains the simplicity that we want to keep for allowing banks with fewer technical resources to still play in the sandbox and yet allows for validation. Of course Paul needs to weigh in since he sees more devil in the details than I do....

Steven Bagshaw-BIS 07:58, 1 June 2007 (BST): I certainly agree it's hacky! My preferred option would still be the hierarchical approach, or even an attribute on the item (that didn't work). But I think we need the validation tools. If Paul gives the OK, I'll make the changes to the wiki and the validation tools.

Paul Asman-FRBNY 14:34, 7 June 2007 (BST) We're way too early in this process to need hacks. I've come around to the hierarchical solution Mike proposed, though not entirely or even primarily because of the need for validation. It is equally or more important, I think, that the list of sub-elements for <item> has become unwieldy and difficult to comprehend. RSS-CB will grow, and we need to eliminate this complexity now.

We currently deal with application-specific elements by making them optional in general, and required for certain application types. But this is wrong. <cb:issue> is not optional for statistical applications; its presence there would be an error. And cb:rateType is not optional for research papers; its presence there would be equally erroneous. We currently rely on RSS-CB users not to put in rateType where it doesn't make sense, but such reliance is more in the realm of hopes than of specifications.

We require one attribute (rdf:about) and three RSS elements (title, link, description) as children of <item>. We require one Dublin Core element (date), recommend another (language), and recommend one element from the extended DC (audience) for certain application types. There's no reason to change any of this.

Currently, we follow these elements with all the application-specific elements defined in the CB namespace. It's a long list, and it will only get longer. It makes the spec hard to follow. So let's follow Mike's suggestion from March, and limit the other child elements of <item> to the application types, currently event, paper, speech, and statistics. Then we'll have an item specification that's easy to follow.

We should then specify each of these application types; this will be 2.6 through 2.9. cb:issue will appear under <paper>, and cb:rateType under <statistics>. We currently have a number of cb elements in the spec for which it is said that we might make them required if more structure is added. This is that more structure.

This also creates the precedent for further extensions. Under <statistics>, we could have a child element fxRate, or perhaps rate with fxRate as a child of that. Surely we should leave our spec easily open to this.

The hierarchical solution passes a test that I use for deciding among alternatives: I can explain the spec easily if it has this hierarchy. I have found it difficult to explain the current <item>, with its one level - what we mean by optional, and so on. The hierarchy makes it easy.

Should an institution wish not to use any of the cb extensions, it would not use this hierarchy. All the non-cb elements are children of <item>. They would use only the base RSS with DC extensions. The hierarchy would be irrelevant to them.

We're catching this early, but not so early that those of us producing RSS-CB feeds won't need to make changes. So this should be a new version. I'll wait until next week, and then start creating it if I don't hear otherwise.

Steven Bagshaw-BIS 15:02, 7 June 2007 (BST): I fully agree.

On how to document it in the wiki (if everyone agrees it should "fly")... one reason the spec and user guides have an all-inclusive, massive list of items is so that we do not have to create detailed information in multiple places for an element that is shared in multiple applications. I'm not sure how you're intending to express that Paul. The wiki may look just as daunting, but the overall structure and the example files will be cleaner.

You'd know better, but San seemed the most against this structure. And she's away... so... up to you. But for me this would be a good move for the medium and long term.

Paul Asman-FRBNY 18:39, 7 June 2007 (BST) I just started moving things around, and I did copy the elements used in multiple applications multiple times. Reflecting on Steve's comments, though, this seems wrong. Rather than go back to the long list, I'm going to leave the (edited) full entry at the first instance, and at later instances refer back to it, noting any changes. (A number of these elements are recommended for some applications, required for others.)

I also propose moving the part of this discussion that concerns the 1.1 spec to the discussion page for that spec. I'll go put an introductory remark there.

dcterms:audience

Steven Bagshaw-BIS 06:52, 13 March 2007 (ADT): San pointed out to me that audience is actually part of dcterms, not dc. So I've made a whole lot of changes today. This would affect any existing feeds created on the specs as they were before.

If using dcterms:audience, of course the dcterms namespace needs to be referred to. xmlns:dcterms="http://purl.org/dc/terms/"
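
For consumers, a small sketch of reading the element once the namespace is declared. The namespace URI is the one given above; the audience value and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"dcterms": "http://purl.org/dc/terms/"}

# The audience value is made up for this example.
ITEM = (
    '<item xmlns:dcterms="http://purl.org/dc/terms/">'
    "<dcterms:audience>Economists</dcterms:audience>"
    "</item>"
)

def audience(item_xml):
    """Return the text of dcterms:audience, or None if the element is absent."""
    node = ET.fromstring(item_xml).find("dcterms:audience", NAMESPACES)
    return None if node is None else node.text

print(audience(ITEM))
```

A feed that still declares the element under the dc namespace would return None here, which is one way to spot feeds built on the earlier version of the spec.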

The schemas, stylesheet and feed generator on the Technical tools page have been updated to match.

I added a note on this somewhere that now it is elements from both DC and DCTERMS that can be used optionally within RSS-CB. Not sure if I've caught all the places where we discuss this.

cb:occurrenceDate and cb:simpleTitle

Steven Bagshaw-BIS 07:46, 2 March 2007 (AST): Note that these are required for some applications - i.e. speeches, events, news. The spec and app guides were out of sync, with the former saying the fields were just recommended. But they will be essential for aggregation, so the various pages have been updated to reflect that.

Adding cb:unit to items

Noe Palmerin-Banco de Mexico 13:57, 22 February 2007 (AST) After working with the rss-cb file to publish "international remittances" I realized that adding a unit with a multiple is necessary. Do you agree, or is there another way to solve this?

   <item rdf:about="#item1">
      <dc:format>text/html</dc:format>
      <dc:creator>Banco de México</dc:creator>
      <title>MX: 1757.80, 2006-12, Workers' Remittances, Banxico</title>
      <link>http://www.banxico.org.mx/SieInternet/consu......</link>
      <description>Workers' Remittances (Millions of Dollars).</description>
      <cb:application>statistics</cb:application>
      <cb:simpletitle>Workers' Remittances</cb:simpletitle>
      <dc:language>en</dc:language>
      <dc:date>2006-12</dc:date>
      <cb:country>MX</cb:country>
      <cb:value frequency="Monthly" decimals="2">1757.80</cb:value>
      <cb:rateType>Flows</cb:rateType>

      <cb:unit multiple="1000000">USD</cb:unit>

   </item>
</rdf:RDF>

Paul Asman-FRBNY 08:39, 23 February 2007 (AST) SDMX has an attribute UNIT_MULT to do this, with values from an enumeration that I believe contains the power to which 10 is raised. (So 1000000 would have UNIT_MULT="6".) You may want to use this.
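
A short sketch of the conversion, assuming (as Paul suggests) that UNIT_MULT holds the power to which 10 is raised. The function name is hypothetical, and rejecting multiples that are not powers of ten is a choice made for this example.

```python
def unit_mult_exponent(multiple):
    """Return n such that multiple == 10**n, or raise ValueError."""
    if multiple < 1:
        raise ValueError("multiple must be a positive integer")
    exponent = 0
    while multiple % 10 == 0:
        multiple //= 10
        exponent += 1
    if multiple != 1:
        raise ValueError("multiple is not a power of ten")
    return exponent

# The multiple from the cb:unit example above.
print(unit_mult_exponent(1000000))
```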

Noe Palmerin-Banco de Mexico 20:07, 23 February 2007 (AST)
OK, that solved the multiple issue, but what about the unit? We want to publish data whose unit type is the Investment Unit. It fits neither 'rateType' nor 'baseCurrency'. Maybe a custom element?


Forget about it. I just found the currency in ISO 4217 currency codes.

Adding cb:application to items

Steven Bagshaw-BIS 05:48, 22 January 2007 (AST): A late addition I know, because people have started implementing these. However, I hope it wouldn't be too hard to add - comments welcome though.

I was doing some thinking about how to aggregate feeds - and particularly to automatically validate them. It occurred to me that there didn't seem to be any way of differentiating a speech from an event from a research paper etc. So I've added cb:application in, which should do the trick for aggregation and to assist with validation. It is now reflected in the spec, user guide, sample file and application guides.

Rework

Steven Bagshaw-BIS 12:25, 19 December 2006 (AST): I've reworked the spec page based on fields appearing in speeches, events and news, as discussed at the RSS-CB meeting this month in Basel.

Paul Asman-FRBNY 09:35, 21 December 2006 (AST) I've just gone through the statistical data section, and went back to make sure it works as an extension of what we had for the base specification. It didn't. In the base specification, there were two items that were required that do not apply to statistical data, cb:simpletitle and cb:eventdate. I've changed those to "recommended for some application types." While I was there, I made a similar change for cb:person.

Steven Bagshaw-BIS 10:13, 21 December 2006 (AST): Thanks. Yes, I think we will need to re-work the spec vs user guide vs app guides structure again. Having a list of common fields will probably break like that in the future too, if/when new types of applications come on. So I think maybe having a large list of ALL possible fields in the user guide might work, with then a simple list in application guides of which fields to use (along with notes on any specific usages). Comments welcome on that. I'll do the gruntwork once we know what to do.

Steven Bagshaw-BIS 10:25, 21 December 2006 (AST): Another point. We are using cb:eventDate to mark publication dates, speech dates, event dates and so on. It is different from dc:date in that the latter represents where the item should appear chronologically in the feed - and can thereby be different when the feed is re-purposed by an aggregator, for example. Whereas the cb:eventDate is meant to stay constant.

Elena Atayeva-FRBNY 16:04, 27 December 2006 (AST) I thought that the chronological feed item date was represented by dc:date. Is cb:date the same?

Steven Bagshaw-BIS 03:31, 28 December 2006 (AST): dc:date is the chronological feed item date. cb:occurrenceDate (previously cb:eventDate) is the constant date associated with the item. If you found cb:date anywhere, it's probably a typo and should be one of these other two. And I see I put it in my comment of 21 December... so I've fixed that now. Sorry for the confusion...
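
The distinction can be illustrated with a plain dictionary standing in for an item: an aggregator that re-purposes the item changes dc:date and leaves cb:occurrenceDate alone. The dates and the function name are invented for this sketch.

```python
def republish(item, feed_date):
    """Return a copy of item re-dated for a new feed; the occurrence date is untouched."""
    copy = dict(item)
    copy["dc:date"] = feed_date
    return copy

# Illustrative dates only.
speech = {"dc:date": "2006-12-21", "cb:occurrenceDate": "2006-12-19"}
aggregated = republish(speech, "2006-12-28")
print(aggregated["dc:date"], aggregated["cb:occurrenceDate"])
```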

I think maybe we should rename it, as it seems too closely tied to the application type of "Events" and could lead to confusion. I don't have a great idea for it, but I'll suggest here cb:occurrenceDate or cb:identifiedDate. Sorry this is coming so late...

-- Steven Bagshaw-BIS 05:52, 22 December 2006 (AST): OK, I've gone with cb:occurrenceDate for now. Comments still welcome.

Steven Bagshaw-BIS 07:59, 27 December 2006 (AST): I've updated the specification to correspond to the user guide (and the application guides). If someone could enter an example of the cb:bibliographicCitation field here, that would be appreciated.

Steven Bagshaw-BIS 11:50, 27 December 2006 (AST): Last change I'll make for now... I was reading a book on XML we have here and they suggest using givenName and surname, rather than firstName and lastName, particularly to remove confusion for Asian names that appear in reverse order to names of European (and other) origin(s). Sounds fair enough to me and I imagine it could have been a question that would come up later. Comments welcome. I've also tweaked the <cb:person><role> element slightly, to be a bit more flexible in future, particularly for aggregation.

Steven Bagshaw-BIS 05:23, 29 December 2006 (AST): Well, I thought that would be the last change. I've now added the cb: namespace to the cb:resource and cb:person child elements. This seems to be good practice, plus we couldn't get XSL working on an RSS-CB sample file without it.

Implementation issues

San Cannon-FRB 17:25, 19 October 2006 (ADT) As I begin to finalize some things for us to push out our feeds by the end of October (really! I mean it!), I can't seem to find any indication of how we wanted to delimit information in the channel titles. I know we had an interesting discussion on field separation issues for data feeds but what are the feelings about channel titles? Yahoo seems to like colons at least for rendering a channel title as a category on a "My Yahoo" page but the BBC channel name seems to be pipe delimited. Any thoughts as to how to represent the layers?

Paul Asman-FRBNY 12:00, 24 October 2006 (ADT) The section on titles in the user guide has two layers, one for the institutional label, another for the rest of the content. The institutional label is followed by a colon and a space.
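
A minimal sketch of splitting a title along the convention Paul describes (institutional label, then a colon and a space). The sample title and the function name are made up for illustration.

```python
def split_channel_title(title):
    """Split 'LABEL: rest' at the first colon-plus-space into (label, rest)."""
    label, separator, rest = title.partition(": ")
    if not separator:
        return None, title  # no institutional label found
    return label, rest

print(split_channel_title("FRBNY: Speeches"))
```

Only the first colon-plus-space is treated as the delimiter, so any further colons stay in the content string.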

San Cannon-FRB 09:10, 27 October 2006 (ADT) So there are no strong feelings about further delineation of the content string? Or the order of additional fields if there is more than one?

Custom namespace elements

Mike Eltsufin-FRBNY 13:26, 12 July 2006 (ADT) Another problem I've encountered while writing the schema is allowing custom namespace elements. The problem stems from the requirement that the grammar specified by the schema must be deterministic. For example, the following XML Schema type definition is invalid.

<xs:complexType name="InvalidType">
	<xs:sequence>
		<xs:element name="mandated" maxOccurs="unbounded"/>
		<xs:any maxOccurs="unbounded"/>
	</xs:sequence>
</xs:complexType>

It is meant to allow instance documents like this one:

<Invalid>
	<mandated/>
	<mandated/>
	<mandated/>
	<custom:MyElement/>
	<custom:MyOtherElement/>
</Invalid>

The grammar is nondeterministic because a parser working from this schema cannot tell where the list of 'mandated' elements ends and where the custom elements begin: a 'mandated' element also matches the wildcard, so it can technically be read as a custom element too. This leads to multiple possible parse trees and ambiguity.
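
The ambiguity can be made concrete by counting the ways a list of child elements can be read as "one or more mandated, then a tail". This is a toy model, not how a schema processor works (a processor rejects the schema itself rather than counting parses); all names are hypothetical. For comparison, it also models a variant in which the custom elements sit inside a single named container.

```python
def count_parses(tags, tail_matches):
    """Count the ways to read tags as one-or-more 'mandated' followed by a tail."""
    count = 0
    for split in range(1, len(tags)):
        head, tail = tags[:split], tags[split:]
        if all(tag == "mandated" for tag in head) and tail_matches(tail):
            count += 1
    return count

def wildcard_tail(tail):
    # Models <xs:any maxOccurs="unbounded"/>: any non-empty run of elements matches.
    return len(tail) >= 1

def container_tail(tail):
    # Models a single required container element for the custom content.
    return tail == ["customStuff"]

flat = ["mandated", "mandated", "mandated", "custom:MyElement", "custom:MyOtherElement"]
wrapped = ["mandated", "mandated", "mandated", "customStuff"]
print(count_parses(flat, wildcard_tail))
print(count_parses(wrapped, container_tail))
```

With the wildcard as a sibling there is more than one way to split the list; with a named container there is exactly one.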

One way to solve this problem is to use a container element ('customStuff').

<xs:complexType name="ValidType">
	<xs:sequence>
		<xs:element name="mandated" maxOccurs="unbounded"/>
		<xs:element name="customStuff">
			<xs:complexType>
				<xs:sequence>
					<xs:any maxOccurs="unbounded"/>
				</xs:sequence>
			</xs:complexType>
		</xs:element>
	</xs:sequence>
</xs:complexType>

It would allow instance documents like this one:

<Valid>
	<mandated/>
	<mandated/>
	<mandated/>
	<customStuff>
		<custom:MyElement/>
		<custom:MyOtherElement/>
	</customStuff>
</Valid>

This would be valid but forces you to use a container element for custom elements.
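
On the consuming side, the container makes the split between mandated and custom content easy to read off. A sketch, with a placeholder URI for the custom namespace and a hypothetical function name:

```python
import xml.etree.ElementTree as ET

# The custom namespace URI is a placeholder (assumption).
DOC = (
    '<Valid xmlns:custom="http://example.org/custom">'
    "<mandated/><mandated/><mandated/>"
    "<customStuff><custom:MyElement/><custom:MyOtherElement/></customStuff>"
    "</Valid>"
)

def split_children(doc_xml):
    """Return (number of mandated elements, list of custom element tags)."""
    root = ET.fromstring(doc_xml)
    mandated = len(root.findall("mandated"))
    container = root.find("customStuff")
    custom = [] if container is None else [child.tag for child in container]
    return mandated, custom

print(split_children(DOC))
```

ElementTree reports namespaced tags in the form {uri}localname, so the custom elements keep their namespace in the result.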

Another alternative is to just disallow custom elements and instead specify all of the useful extensions to RSS and Dublin Core in the new RSS-CB namespace.

Paul Asman-FRBNY 15:20, 28 July 2006 (ADT) Towards the end of July, when we had committed to finishing a draft spec, this issue remained unresolved. So I solicited opinions via email. Unfortunately, people responded to the email rather than to the wiki, so I'm transferring material. [Dan Chall added: Unfortunately, there is no talk page for a talk page, or else this meta-talk would have been moved there.]

Dan Chall started, with this: http://www-128.ibm.com/developerworks/xml/library/x-contain.html

A pre-determined set of allowable custom elements?

Versus a fixed element that allows sub-elements without limitation by the spec?

Seems the latter is preferable. What's the point of custom elements if they must be pre-specified?

Thanks for the gun-to-the-head, ;-) Dan

Brent, Butch, and Noe all supported Dan's position, so I suppose it's settled. Let me complicate issues, then, by raising some additional questions:

1. Do we want two containers, one for elements that we endorse in common, the other for elements that we endorse as individual institutions?

Dan Chall-FRBNY I don't see a reason to document endorsement of elements in our specification. I think customized elements may become standard elements in time, and that will require some ongoing monitoring and discussions about those elements. Until they are standardized, we can tolerate a great deal of heterogeneity within the container, even the same element name used in contradictory ways. If we endorse an element in common, would it not mean we don't need a container?

Noe Palmerin-Banco de Mexico 11:30, 31 July 2006 (ADT) Before we decide on a container for the "elements that we endorse in common", I have another question: what elements would go in it? Maybe we don't need this container.

On the other hand, I don't think that all the customized elements will become standard elements in time, because of the different needs and the legal and political constraints of the many institutions and countries. Avoiding customizable elements could be hard work.

Conclusion:

-Container for common elements. I don’t think so.
-Container for personal institutional elements. I say, yes.

2. Where do DC elements go? We are taking the position, I believe, that some DC elements should be mandatory, e.g. language. Where does the element go in the schema? Is it part of the common customization, and in a container for that? Is it another child element of item and/or channel?

3. What is our position on the customization (extension) of existing elements vs. the creation of new elements within a custom container? Take the JEL codes, for example. Should we include them as refinements of the DC subject, or should we create a separate element in an rss-cb or individual container? (This is more important if the common DC elements are not in a separate container.)

Dan Chall-FRBNY Is there any existing usage we can learn from? Google finds 20,000 pages with "JEL" and "Dublin Core" but stops at six and says the rest are "very similar." Sounds like a low boredom threshold. But perhaps this web site may indicate that someone else has addressed this issue: http://www.ecommunics.com/modules.php . No good info at the surface level, but I will dig a bit more.

Suzanne LeBlanc-Bank of Canada 16:35, 28 July 2006 (ADT) I just came across an article on OECD publishing by Toby Green that used the JEL classification as an example of publishing metadata, with the corresponding XML element: <JEL Code="value is JEL code">value is the JEL classification label </JEL>. There was no indication of whether it was a refinement of DC or a custom element. From the discussions we had, I wasn't sure whether having more custom elements rather than fewer might affect interoperability. Would the same be true of refinements if we extend Subject for the JEL codes? Or would that change things?

Mike Eltsufin-FRBNY 12:15, 31 July 2006 (ADT) For dc elements and other endorsed elements there is no syntactic need to have a container, whereas for custom elements we need a container. Another syntactic constraint is that endorsed and custom elements cannot be siblings (appear within the same container element). Now from the perspective of writing the schema here are some questions:

1. Should endorsed elements (dc and others) be wrapped in a container?

Proposed solution: No.

2. What should be the name for the container element for custom elements?

Proposed solution: <rss-cb:custom>.

3. What is the order of the endorsed elements?

Proposed solution: dc elements, rss-cb endorsed elements, <rss-cb:custom>

4. Where should the endorsed elements appear?

Proposed solution: As the last set of elements in: <rdf:RDF>, <channel>, <image>, <item>

Please vote on the proposed solutions, or offer your own suggestions.
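
As an illustration of proposals 1 through 4, here is a minimal Python sketch that checks whether the children of an item follow the proposed grouping: core RSS elements, then dc elements, then rss-cb endorsed elements, then the <rss-cb:custom> container. The rss-cb namespace URI, the element name rss-cb:endorsed, and the helper names are placeholders invented for this example; nothing here has been agreed.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
# Assumption: placeholder URI. The rss-cb namespace has not been fixed yet.
RSS_CB = "http://example.org/rss-cb/"


def group_of(tag):
    """Rank a child: 0 = core RSS, 1 = dc, 2 = rss-cb endorsed, 3 = rss-cb:custom."""
    if tag == f"{{{RSS_CB}}}custom":
        return 3
    if tag.startswith(f"{{{RSS_CB}}}"):
        return 2
    if tag.startswith(f"{{{DC}}}"):
        return 1
    return 0


def follows_proposed_order(parent):
    """True when the groups appear in non-decreasing rank order."""
    ranks = [group_of(child.tag) for child in parent]
    return ranks == sorted(ranks)


# Hypothetical item: rss-cb:endorsed is an invented element name.
ITEM = f"""
<item xmlns="{RSS}" xmlns:dc="{DC}" xmlns:rss-cb="{RSS_CB}">
  <title>Example</title>
  <link>http://example.org/1</link>
  <dc:language>en</dc:language>
  <rss-cb:endorsed>value</rss-cb:endorsed>
  <rss-cb:custom>
    <anything>institution-specific content</anything>
  </rss-cb:custom>
</item>
"""

# Same kind of item, but with the custom container placed before a dc element.
MISORDERED = f"""
<item xmlns="{RSS}" xmlns:dc="{DC}" xmlns:rss-cb="{RSS_CB}">
  <title>Example</title>
  <rss-cb:custom><anything>x</anything></rss-cb:custom>
  <dc:language>en</dc:language>
</item>
"""

item_ok = follows_proposed_order(ET.fromstring(ITEM))
misordered_ok = follows_proposed_order(ET.fromstring(MISORDERED))
```

The check only looks at direct children, so whatever sits inside <rss-cb:custom> is left unconstrained, which matches the idea of a container that allows sub-elements without limitation by the spec.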


Brent Eades-Bank of Canada 14:45, 31 July 2006 (ADT): Regarding Suzanne's question about denoting JEL categories: This could be handled through Dublin Core:

<item...
   <dc:subject>
      <rdf:Description>
         <taxo:topic rdf:resource="http://www.aeaweb.org/journal/jel_class_system.html" />
         <rdf:value>E64 - Incomes Policy; Price Policy</rdf:value>
      </rdf:Description>
   </dc:subject>
...
</item>

... where the URI points to an instance of the classification scheme being used, and the rdf:value is the taxonomic identifier. One approach, anyway...
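
To show how a consumer might read a subject encoded this way, here is a rough Python sketch using only the standard library. The namespace URIs are the published ones for RSS 1.0, RDF, Dublin Core and the RSS taxonomy module; the function name jel_subjects and the surrounding item are illustrative assumptions, and the elided parts of the snippet above are simply left out.

```python
import xml.etree.ElementTree as ET

NS = {
    "rss": "http://purl.org/rss/1.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "taxo": "http://purl.org/rss/1.0/modules/taxonomy/",
}

ITEM = """
<item xmlns="http://purl.org/rss/1.0/"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/">
   <dc:subject>
      <rdf:Description>
         <taxo:topic rdf:resource="http://www.aeaweb.org/journal/jel_class_system.html" />
         <rdf:value>E64 - Incomes Policy; Price Policy</rdf:value>
      </rdf:Description>
   </dc:subject>
</item>
"""


def jel_subjects(item):
    """Return (scheme URI, classification value) pairs found under dc:subject."""
    results = []
    for desc in item.findall("dc:subject/rdf:Description", NS):
        topic = desc.find("taxo:topic", NS)
        value = desc.find("rdf:value", NS)
        if topic is None or value is None:
            continue  # skip subjects that do not use this pattern
        # Attributes are namespace-qualified too: rdf:resource, not resource.
        scheme = topic.get(f"{{{NS['rdf']}}}resource")
        results.append((scheme, value.text))
    return results


subjects = jel_subjects(ET.fromstring(ITEM))
```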

Noe Palmerin-Banco de Mexico 22:27, 4 August 2006 (ADT) About Mike's questions (and suggested solutions): I concur with the solutions for points 1, 2 and 3.

I'm still thinking about point 4.

Order of RDF child elements

Paul Asman-FRBNY 09:44, 12 July 2006 (ADT) Mike Eltsufin started to write a schema yesterday, and could not find a reasonable way to do so while respecting the decision we made not to impose an order on the child elements of RDF. So let's examine that decision. It might not have been so difficult if all of these elements had the same cardinality, but they don't. Channel is used once and only once, item is used at least once but normally more, and image is used at most once. It turns out to be difficult to put these together in any order.

It's important to note at the start that imposing an order in no way restricts the information that we represent. It should be a matter of indifference to us whether an RSS file starts with channel or with instances of item - we get to put in the channel information and the instances of item whether channel is first or last or in the middle. We said that we did not impose an order, I think, because we are indifferent to order, not because we needed a lack of order to convey information.

The problem, though, is that the schema specification doesn't make it easy to allow arbitrary order. Basically, the specification defines three different ways of putting elements together: as a sequence, which defines an order, as a choice, which says, in effect, choose one among several, possibly multiple times, and something called 'all', which defines a non-ordered list.

'All' is the obvious path to implement our decision. But to quote the O'Reilly book on XML Schema (by Eric van der Vlist), "the Recommendation has imposed huge limitations on the xs:all element, which makes it hardly usable in practice." Let's assume that this is true; it certainly reflects Mike's experience in trying to respect our decision.

That leaves us in the position that the element designed to implement our decision isn't really up to the job. We could try to force things, and perhaps we could come up with a complicated schema that allows the RDF elements to be placed in any order. But why?

If we use a sequence, we have a simple schema that preserves all our needs to present content and restricts only something that doesn't matter to us. The New York Fed puts the instances of item before the channel, which always struck me as counterintuitive - putting the table of contents at the back rather than the front. If we impose an order that channel comes first, we'd change our application that generates RSS with no loss of information. I think that we should. It's the low-cost solution that preserves what we need and allows us to present an easily understood schema.

The imposition of a sequence will apply, I'm sure, to the child elements of RDF that themselves have child elements (e.g. item). For the same reasons as stated above, this strikes me as unproblematic.
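
To make the cardinalities concrete, here is a small Python sketch of what an imposed sequence amounts to for the children of RDF: channel exactly once, image at most once, item at least once. Putting image between channel and item is an assumption for the sake of the example, as are the names CODES, SEQUENCE and children_in_sequence; other RDF children are not covered.

```python
import re
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# One letter per child element, so the child list becomes a short string.
CODES = {
    f"{{{RSS}}}channel": "c",
    f"{{{RSS}}}image": "m",
    f"{{{RSS}}}item": "i",
}

# channel once, then image at most once, then item one or more times.
# Assumption: this particular order is illustrative, not a decision.
SEQUENCE = re.compile(r"cm?i+")


def children_in_sequence(rdf_root):
    """True when the children of rdf:RDF match the assumed sequence."""
    signature = "".join(CODES.get(child.tag, "?") for child in rdf_root)
    return SEQUENCE.fullmatch(signature) is not None


CHANNEL_FIRST = (
    f'<rdf:RDF xmlns:rdf="{RDF}" xmlns="{RSS}">'
    "<channel/><image/><item/><item/></rdf:RDF>"
)
ITEMS_FIRST = (
    f'<rdf:RDF xmlns:rdf="{RDF}" xmlns="{RSS}">'
    "<item/><item/><channel/></rdf:RDF>"
)

channel_first_ok = children_in_sequence(ET.fromstring(CHANNEL_FIRST))
items_first_ok = children_in_sequence(ET.fromstring(ITEMS_FIRST))
```

A feed with the items ahead of the channel carries the same information but fails the check, which is the only thing a sequence restricts.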


Elena Atayeva-FRBNY 10:16, 12 July 2006 (ADT) Yes, the lack of unordered lists with multiple instances of elements is one of the biggest problems with W3C XML Schema. There are two ways to solve this -- we could impose order or we could use a different schema language. Relax NG (home page, OASIS Technical Committee, a third-party presentation, IBM advocacy) was developed as a cleaner schema language that doesn't have the W3C Schema disadvantages. No work seems to have been done on it in a while, but that's fairly typical of standards.


Brent Eades-Bank of Canada 13:48, 12 July 2006 (ADT): Even leaving schema problems aside for the moment, I don't see any benefit to be gained by allowing arbitrary ordering of elements. It's to everyone's advantage -- producers of feeds, human readers of them, and perhaps even some parsers (who knows?) to follow the "usual" order.

As for Relax NG -- yes, it has some advantages in this case. For reference, here's a sample Relax XML schema that I generated using an XML editor (Oxygen): schema


Christine Sommo-FRBNY 09:51, 13 July 2006 (ADT) yes, yes, yes. I agree with imposing an order on the child elements. And I agree that the way we are doing it here at the NYFed with the items before the channel is not the way we ought to do it.

Timo Laurmaa-BIS 11:21, 13 July 2006 (ADT) Yes, let's impose the order.

San Cannon-FRB 12:48, 17 July 2006 (ADT) I see no reason to let chaos reign supreme. Go for it.

Noe Palmerin-Banco de Mexico 11:48, 26 July 2006 (ADT) OK, I just read all the discussion and I agree with an order for elements. Even if we don’t impose it I think that it was the natural way to do it.

Paul Asman-FRBNY 12:49, 27 July 2006 (ADT) Interestingly, or not, we seem to have resolved the issue in the discussion section without reflecting that consensus in the specification itself. So I changed the spec accordingly. I added statements imposing order for the two elements where it really matters, RDF and channel, where a child element can be used multiple times ("The permitted child elements must appear in the order shown below"), and for the two where it doesn't, image and item ("The child elements must appear in the order shown below"). I also made corresponding changes in the User Guide.

Namespace prefixes

Mike Eltsufin-FRBNY 12:05, 7 July 2006 (ADT) I don't think the names for the namespace prefixes should be mandated. When you programmatically generate an XML file, you often don't have control over the names of the prefixes. Even if the XML processing tool allows you to choose the prefixes, it just adds more work for the developer. As for consuming XML files, namespace-aware tools are agnostic to the namespace prefixes. However, for humans who will be writing RSS files by hand (highly unlikely), it might be useful to recommend common prefixes for certain namespaces, but this belongs in the User Guide, not the specification.

Timo Laurmaa-BIS 14:15, 7 July 2006 (ADT) Mike, the educational part of our effort (i.e. that RSS newcomers can quickly put together their first feed files) might result in more hand-written RSS files than you might think. For this purpose, our mnemonic namespace prefixes (dc, dcterms, cb, onecb) are quite handy and help people understand the contents of the various namespaces. Do you mean that some XML processing tools would use arbitrary prefixes, such as xmlns:ns01="http://purl.org/dc/terms/" and xmlns:ns02="http://www.centralbanks.org/rss/"? This would probably be OK for any RSS reader, but an aggregator such as the BIS will need to know from the outset that, for example, the job title of a speaker might not be cb:speakerjobtitle but ns02:speakerjobtitle. Why would it be (significantly) more work for a developer to define an explicit namespace prefix if one can be given? On the aggregator side (or for anyone wishing to develop applications using RSS-CB feeds), unpredictable namespace prefixes would certainly mean more work.

Mike Eltsufin-FRBNY 15:57, 7 July 2006 (ADT) Timo, I understand the benefit of the mnemonic namespace prefixes for people who will be writing the RSS feeds by hand. That is why I think it's useful to keep them in the User Guide as a recommendation. XML processing tools that I've worked with require you to specify the full namespace URI whenever you are creating a new element, such as when working with a DOM tree. The prefixes are then auto-generated when the DOM tree is outputted to XML and look something like ns01, ns02, etc. It is usually possible to control the prefix names, but it requires extra effort from the developer. From the point of view of aggregation, varying namespace prefixes should not cause any extra work at all because all of the namespace-aware parsers are prefix agnostic. Mandating namespace prefix names kind of goes against the grain of the XML specification which makes it clear that only the namespace URIs have semantic meaning, not the prefixes.
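
The behaviour described here can be seen with, for example, Python's standard xml.etree.ElementTree, offered only as one illustration of such a tool. Elements are created from the full namespace URI, a prefix is generated on output, and choosing the prefix takes one extra call. The URI is the illustrative one from Timo's post, not a confirmed namespace, and serialize_job_title is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Assumption: URI borrowed from the example earlier in this thread.
CB = "http://www.centralbanks.org/rss/"


def serialize_job_title(text):
    """Build a speakerjobtitle element from its full URI and serialize it."""
    elem = ET.Element(f"{{{CB}}}speakerjobtitle")
    elem.text = text
    return ET.tostring(elem, encoding="unicode")


# No prefix has been chosen for this URI, so the library generates one.
auto = serialize_job_title("Governor")

# The extra step: register the preferred prefix for the URI.
ET.register_namespace("cb", CB)
named = serialize_job_title("Governor")

# Both outputs parse back to the same namespace-qualified element name.
same_name = ET.fromstring(auto).tag == ET.fromstring(named).tag
```

In a fresh interpreter the first output uses a generated prefix (ns0) and the second uses cb, yet a namespace-aware parser treats the two as the same element.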

Christine Sommo-FRBNY 12:21, 13 July 2006 (ADT) hmm. I just learned something that leaves me a bit more confused rather than less. I assumed the prefixes would have to be explicitly defined by this group.

Paul Asman-FRBNY 08:49, 14 July 2006 (ADT) Prefixes serve as abbreviations internal to the document. When a parser deals with the document, it substitutes the full names when the abbreviations are used. (I'm not sure if this is literally true or not, but it's the effect.) As long as they're used consistently, the prefixes themselves don't matter. All that matters is the actual URI. In xmlns:dc="http://purl.org/dc/elements/1.1/", for example, the 'dc' doesn't matter to the parser, but the URI does.
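
A short Python sketch of that effect, using the standard library parser as an example of a namespace-aware tool: the same dc:language element written with two different prefixes comes out with the same expanded name. The prefix ns01 and the helper expanded_name are made up for the illustration.

```python
import xml.etree.ElementTree as ET

WITH_DC = (
    '<dc:language xmlns:dc="http://purl.org/dc/elements/1.1/">en</dc:language>'
)
WITH_NS01 = (
    '<ns01:language xmlns:ns01="http://purl.org/dc/elements/1.1/">en</ns01:language>'
)


def expanded_name(xml_text):
    """Return the element name as the parser sees it: {namespace URI}local-name."""
    return ET.fromstring(xml_text).tag


name_with_dc = expanded_name(WITH_DC)
name_with_ns01 = expanded_name(WITH_NS01)
```

Both names are {http://purl.org/dc/elements/1.1/}language, so code that matches on the expanded name never sees the prefix at all.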

Mike is of the opinion, I believe, that the specification should only include requirements that both matter and can be enforced by the schema. A requirement that prefixes be certain tokens ('rdf', 'dc', et al.) neither matters nor can be enforced. Insisting on them, then, buys us nothing.

Mike Eltsufin-FRBNY 12:08, 17 July 2006 (ADT) Given the limitations of XML Schema, I don't think that only requirements that can be enforced using the schema should be included in the specification. I would like to be able to automatically validate every requirement of the specification, but it may not be possible with XML Schema alone, and that's OK. However, I am of the opinion that requirements that don't really matter, such as the namespace prefixes, should not be included in the specification.

Naming and numbering

13:33, 22 June 2006 (EDT)
San Cannon: I would hope that the numbering would be for our spec rather than the one we base things on.


Dan Chall: I'm thinking that perhaps we should not call this thing CB-RSS because it is exactly RSS, implemented for a specific purpose. It's not something different. Our RSS stands for "RDF Site Summary," and maybe the name we choose should be an abbreviation that references RSS similarly: CBR. I like acronym nesting. One of the concerns is the version numbering: Does the version number in CB-RSS 1.0 refer to RSS or CB-RSS? Am I using this page properly?

05:18, 22 June 2006 (ADT)
Timo Laurmaa: Would we rather start from the more general (RSS), moving towards the more specific (CB)? The first version of our effort would then be quite unambiguously RSS-CB 1.0

12:88, 22 June 2006 (EDT) Dan Chall: I have no objection to that. But as RSS is a subset of RDF, the corresponding name of RSS might be "RDF-SS 1.0," but they chose RSS instead. I'm wondering if we will keep the numbering convention of RSS-CB to be the same as the RSS numbering, or should it have its own? Maybe we should be RSS-CB (or RCB) 0.1 for now....

Link element of channel

Paul Asman-FRBNY 09:38, 28 June 2006 (ADT) The meeting notes had the URL as that of the landing page (or home page, failing that). The spec had the URL of the main RSS feed page. I thought the notes correct, so I changed the spec to reflect that.

Modality

Paul Asman-FRBNY 09:28, 29 June 2006 (ADT) I'm changing the modality of requirements, so that the spec reads more like the RSS spec itself. For required elements, instead of 'x must include ...', for example, I'm putting in 'x includes ...'. The label of 'required' remains.

Typography

Dan Chall-FRBNY 13:31, 29 June 2006 (EDT) Should attributes be explicitly labeled as such, as in attribute: rdf:about ? Maybe some kind of typographical cue in "attribute" too.

Paul Asman-FRBNY 15:20, 29 June 2006 (ADT) For what it's worth, the <element> pattern is that of the RSS 1.0 specification. (I eliminated what had been in our spec previously - where the pattern was "<element> element" to follow that.) But the RSS 1.0 spec has sections only for elements, not for attributes, and so offers no guidance (other than, perhaps, not having sections for attributes).

In the spirit of committee practice: since you raised the issue, why not take a crack at resolving it?