RE: [xep-support] linefeed normalization

From: Victor Mote (vic@portagepub.com)
Date: Mon Apr 05 2004 - 09:32:21 PDT

Next message: Jirka Kosek: "Re: [xep-support] linefeed normalization"

Previous message: Victor Mote: "RE: [xep-support] linefeed normalization"
In reply to: Nikolai Grigoriev: "Re: [xep-support] linefeed normalization"
Next in thread: Jirka Kosek: "Re: [xep-support] linefeed normalization"
Reply: Jirka Kosek: "Re: [xep-support] linefeed normalization"

Nikolai Grigoriev wrote:

> the question of U+2028 is a complicated one. The XSL-FO spec
> does not constrain its processing in any way. There is no
> mention that it is subject to normalization, but equally no
> indication that it is expected to produce a line break at
> all. The effects of this character are therefore not
> well-defined: I doubt whether it can be considered a valid linefeed.

I agree that it is not well-defined.

> In XEP, we treat U+000A, U+000D, and U+2028 as complete
> equivalents. (This refers to the data that come to the
> formatter, after linefeed normalization in the processor).
> The logic is
> straightforward: a character is either a linefeed or not;
> linefeeds terminate lines and are subject to the effects of
> linefeed-treatment; non-linefeeds do neither of these.

Ken's distinction between a linefeed and a LINE SEPARATOR is relevant here,
I think. They are different concepts. The fact that linefeed-treatment is
supposed to affect *only* U+000A, yet affects U+2028 in XEP, indicates some
misunderstanding.
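
To make the difference concrete, here is a minimal Python sketch. It is not
XEP's actual code: apply_linefeed_treatment and both character sets are
hypothetical names, and only two values of linefeed-treatment ("preserve"
and "treat-as-space") are modelled. The only thing that changes between the
two readings is which characters count as linefeeds.

```python
# Hypothetical model; these names are illustrative, not XEP internals.
SPEC_LINEFEEDS = {"\u000A"}                     # reading in which only U+000A counts
XEP_LINEFEEDS = {"\u000A", "\u000D", "\u2028"}  # behaviour Nikolai describes for XEP


def apply_linefeed_treatment(text, treatment, linefeeds):
    """Return the list of lines produced under a simplified linefeed-treatment."""
    if treatment == "treat-as-space":
        # Every linefeed becomes a space, so no forced break survives.
        return ["".join(" " if ch in linefeeds else ch for ch in text)]
    if treatment == "preserve":
        # Every linefeed terminates the current line.
        lines, current = [], []
        for ch in text:
            if ch in linefeeds:
                lines.append("".join(current))
                current = []
            else:
                current.append(ch)
        lines.append("".join(current))
        return lines
    raise ValueError("value not modelled in this sketch: " + treatment)


sample = "one\u2028two\nthree"
spec_lines = apply_linefeed_treatment(sample, "preserve", SPEC_LINEFEEDS)
xep_lines = apply_linefeed_treatment(sample, "preserve", XEP_LINEFEEDS)
```

Under the first set U+2028 stays inside the first line as an ordinary
character; under the second it ends the line just as U+000A does.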

> One can argue if this is a correct behaviour. However, I
> believe that it is inherently unsafe to rely on Unicode text
> flow control characters in systems that have their own markup
> to express the same semantics. There is no reason to use

I mostly agree with this. However, there really is no construct in XSL-FO
that says "force a line break here". It is true that you can say "start a
new block here", but that really is a different concept.

> U+2028 or U+2029 if you have explicit paragraph structure set
> by <fo:block>s; it is risky to mix LRO/RLO/LRE/RLE with

I think U+2029 really is the same as saying "start a new block", and agree
that there is no good reason to use it in XSL-FO.

> fo:bidi-override. If you need explicit line breaks inside
> non-preformatted text, set a <br/> element in the input XML
> vocabulary and match it to <fo:block/> in the stylesheet. In
> this way, your intent is clear to everybody.

There is a good reason not to take this approach unless it is necessary.
Simply inserting an </fo:block><fo:block> combination does not do the job.
The new block created this way may not have the same properties: things
like space-before and keeps can easily differ. I acknowledge that this can
be worked around in the stylesheet, but doing so adds an order of magnitude
of complexity.
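
As a rough illustration of the bookkeeping involved, here is a Python sketch
of one possible workaround. It is an assumption-laden model, not a
stylesheet: Block and split_at_break are hypothetical names, and only three
properties are tracked. When a block is closed and reopened at a break, the
original space-before should apply only to the first piece and the original
space-after only to the last one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for an fo:block and a few of its properties."""
    text: str
    space_before: str = "0pt"
    space_after: str = "0pt"
    keep_together: str = "auto"


def split_at_break(block, marker="<br/>"):
    """Split one block at each marker, adjusting spacing on the pieces."""
    parts = block.text.split(marker)
    last = len(parts) - 1
    pieces = []
    for i, part in enumerate(parts):
        pieces.append(replace(
            block,
            text=part,
            # Only the first piece keeps the original space-before.
            space_before=block.space_before if i == 0 else "0pt",
            # Only the last piece keeps the original space-after.
            space_after=block.space_after if i == last else "0pt",
        ))
    return pieces


para = Block("first line<br/>second line", space_before="12pt", space_after="6pt")
pieces = split_at_break(para)
```

Keeps are simply copied here; deciding how the pieces should stay together
across a page break is a further adjustment that this sketch does not
attempt.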

> One additional consideration: in XML 1.1, U+2028 will be
> subject to parser-side linefeed normalization. It implies
> that you never get it from user text; and if you generate an
> entity just to make it appear after the normalization, why
> not generate a piece of markup instead?

OK. I find this to be persuasive, and it means ultimately that either I or
the authors of the XML 1.1 standard have misunderstood what the Unicode
standard was trying to do with U+2028.
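
For reference, the XML 1.1 line-end handling can be sketched in a few lines
of Python. This is only an illustration of the rule in section 2.11 of the
XML 1.1 Recommendation, not code from any parser, and the function name
normalize_xml11_line_ends is my own.

```python
import re

# Line-end sequences that XML 1.1 translates to a single U+000A on input.
# The two-character sequences are listed first so that CR LF and CR NEL
# each collapse to one linefeed rather than two.
_XML11_LINE_END = re.compile("\r\n|\r\u0085|\u0085|\u2028|\r")


def normalize_xml11_line_ends(text):
    """Apply XML 1.1 style line-end normalization to literal input text."""
    return _XML11_LINE_END.sub("\n", text)


normalized = normalize_xml11_line_ends("a\r\nb\u2028c\rd\u0085e\r\u0085f")
```

The translation applies to literal characters before parsing, so a literal
U+2028 in user text never reaches the formatter; only a character reference
such as &#x2028; would survive, which is the case Nikolai describes.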

This leaves only the issue of documentation. I would simply suggest that
Section 7.1 of the document "XSL Formatting Objects in XEP 3.7" be modified
to include your comment above that U+2028 is always treated within XEP as a
linefeed character.

Thanks again to both Nikolai and Ken for your explanations. This is not an
issue I feel strongly about, and I didn't mean for it to turn into a big
deal.

Victor Mote



This archive was generated by hypermail 2.1.5 : Mon Apr 05 2004 - 09:41:16 PDT