On 31 Oct 2006, at 09:38, David Tolpin wrote:
> Hi Geoff,
>
>
>> French users will always enter a space before the punctuation
>> mark. So we still need a magical way of removing or replacing all
>> the preceding white space, if that white space is in the same
>> block, and even if there is XML markup between the end of the
>> previous word and the punctuation. This is difficult to do with
>> XSLT.
>
> 1) Why do French users insert markup between the text and the
> colon? Is the software that generates the FO responsible for that?
> Could it be modified?
They don't usually. But say you have a word in bold followed by a
question mark that is not. If the bold style stops at the end of the
word, no problem. But it might include the space following the word,
too.
> 2) There is a single space, and the space is attached either
> - to the text, and then it is the last one, and the node is
> followed by a text node starting with colon;
>
> - to the colon, in which case it is a string literal ' :'.
>
> The latter case is trivial (split before, split after, remove
> space, wrap into inline with space-start). The former case requires
> moving a single space to the colon.
Yes, the second case is the one we handle correctly now in XSLT. The
first one is hard because you need to know whether or not the markup
separating the word and the punctuation starts a new block.
Granted, I am being nitpicky. It is very unlikely that the French
user will intentionally insert a paragraph or a line break before a
punctuation mark. And if he does, then it is reasonable to do as he
says...
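
To make the first case concrete, here is a minimal sketch of the "move
the space to the punctuation" step, written in Python with the standard
xml.etree.ElementTree module rather than XSLT. It is not what we or XEP
actually do. The element names in INLINE_TAGS and the function names are
hypothetical, and the list of punctuation marks is an assumption. The
idea is only to turn the hard case into the ' ?' string-literal case
that the existing XSLT already handles, and to leave block-level
elements alone.

```python
import xml.etree.ElementTree as ET

PUNCTUATION = ":;?!"                      # assumed set of marks preceded by a space
INLINE_TAGS = {"b", "i", "em", "strong"}  # hypothetical inline element names


def last_text_slot(elem):
    """Return (node, attribute) for the last text slot inside elem, or None."""
    if len(elem) == 0:
        return elem, "text"
    last = elem[-1]
    if last.tail:
        return last, "tail"
    if last.tag in INLINE_TAGS:
        # The element ends with a nested inline: look inside it.
        return last_text_slot(last)
    return None  # ends with a block-level child: do not touch it


def move_space_to_punctuation(root):
    """Move trailing white space out of inline elements that are
    immediately followed by a punctuation mark."""
    for elem in root.iter():
        if elem.tag not in INLINE_TAGS:
            continue  # markup that may start a new block is left as entered
        tail = elem.tail or ""
        if not tail or tail[0] not in PUNCTUATION:
            continue
        slot = last_text_slot(elem)
        if slot is None:
            continue
        node, attr = slot
        text = getattr(node, attr) or ""
        stripped = text.rstrip()
        if stripped != text:
            setattr(node, attr, stripped)
            elem.tail = " " + tail  # now the ' ?' string-literal case
    return root
```

For example, <p>Est-ce <b>vrai </b>? Oui.</p> becomes
<p>Est-ce <b>vrai</b> ? Oui.</p>, while a space at the end of a
paragraph that happens to precede a colon is left exactly as the user
entered it.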
>>
>> I suspect it would be easier to implement in XEP's line breaking
>> mechanism, when the XML has already been parsed, the superfluous
>> white space has been removed, and the text has been broken into
>> blocks.
>
> If one decided to implement this in XEP, it would go to the
> preprocessing stage along the lines described above.
>
>> All you need to do at that point is prevent the insertion of a
>> line break before the punctuation mark.
>
> XEP's line-breaking algorithm is based on the Unicode standard. There
> are many ways to format texts, and some of them are better than those
> specified in the standards. The Unicode standard allows
> breaking a line before a space even if it is followed by a space.
>
> French typesetting has a legacy feature. This legacy feature is
> easy to express in XSL FO. French users are accustomed to placing a
> space before the colon, a poor practice from past times. Much in
> the same way as first-line indents using TAB must be replaced by
> proper start-indent, spaces before colons in French texts must be
> replaced by space-start.
I have to disagree with you there. We could ask that users not enter
a space before colons, question marks, etc., but that would be like
asking them not to enter a space after a period. People will do it
anyway, both out of habit and because it looks "better" in the text
editor.
Received on Tue Oct 31 01:18:20 2006