replacing XSL-FO

Mohamed Ashraf
Should we replace XSL-FO, which we use to export PDF files from XML,
with XML and CSS only, using an open-source library?

I am thinking of "CSS Paged Media".

Is this good enough to do that,
or are there any other suggestions?

Re: replacing XSL-FO

vmassol
Administrator
Hi Mohamed,

Sorry but I don’t understand your question. Why would you want to replace XSL-FO in your XWiki install?

If you’d like to contribute to XWiki dev, could you provide more context and explain why you want to replace XSL-FO, and with what?

You may also be interested in the LaTeX exporter, which can be used to generate PDFs: http://extensions.xwiki.org/xwiki/bin/view/Extension/LaTeX/

Thanks
-Vincent



Re: replacing XSL-FO

Mohamed Ashraf
Currently, the PDF export of XWiki is implemented based on XSL-FO and
transformation of XHTML to FO. This poses a couple of problems, mainly
related to the current level of FO support in the libraries implementing
the FO to PDF transformation, as well as the limitations of the automated
transformation of XHTML to FO. The problems are mainly related to styling
limitations, automatic layout, etc.

The idea is to try to replace this with a pure XHTML & CSS (paged CSS)
export, using an open source library to produce PDFs from it.

I will also look at LaTeX.
Thanks


Re: replacing XSL-FO

vmassol
Administrator
Hi Mohamed,

> On 24 Mar 2018, at 19:12, Mohamed Ashraf <[hidden email]> wrote:
>
> The idea is to try to replace this with a pure XHTML & CSS (paged CSS)
> export, using an open source library for producing PDFs out of this

Sure, but which one?

The only alternative I know of is Flying Saucer (which is dead: https://github.com/flyingsaucerproject/flyingsaucer). Is that what you mean?

Do you know a maintained fork of it? One that I know of is used by a competing wiki: https://bitbucket.org/atlassian/xhtmlrenderer-atlassian

Are you doing this as part of this GSoC project: http://dev.xwiki.org/xwiki/bin/view/GoogleSummerOfCode/ImplementPDFexportwithXHTMLpagedCSS ?

Thanks
-Vincent


Re: replacing XSL-FO

Mohamed Ashraf
Yes, this is part of the GSoC project.


Re: replacing XSL-FO

Paul Libbrecht-2
Hello Mohamed,

Have you searched for converters from paged-media HTML + CSS to PDF?

Surely one option is to let the browser do it, but there must also be dedicated engines.
E.g. I think that PhantomJS or WeasyPrint can do that. However, I haven’t found one in Java yet (which would simplify things).
As Vincent says, printing with LaTeX in the middle is a way to get high quality, but there are many losses too: it is really hard to get all CSS rules implemented in TeX.

I’m wondering if CSSBox could do the job.

paul


Re: replacing XSL-FO

vmassol
Administrator


> On 24 Mar 2018, at 21:58, Paul Libbrecht <[hidden email]> wrote:
>
> Surely an option is to let it be done by the browser but there must also be engines.

We have evaluated this in the past and there are lots of limitations, see https://markmail.org/message/ztcwibiuoqfjcnjo

> E.g. I think that PhantomJS or WeasyPrint can do that. However, I haven’t found one in Java yet (which would simplify things).

Note that PhantomJS is dead now:
https://www.puzzle.ch/blog/articles/2018/02/12/phantomjs-is-dead-long-live-headless-browsers

> As Vincent says, print with LaTeX in the middle is a way to get high-quality but there are many losses too: it is really hard to get CSS rules to be all implemented in TeX.

Yes indeed, that’s very hard. CSS shouldn’t be used as a way to style the LaTeX output. The LaTeX exporter itself should provide its own way of controlling the style of the output. This is what I do in the LaTeX exporter: I provide some default styles (sometimes with some config options), and users can control exactly which styles are applied if the defaults are not enough. It’s not trivial though, and it will take a bit of time if you need a heavily styled document.

Thanks
-Vincent


Re: replacing XSL-FO

Ludovic Dubost
On Sat, Mar 24, 2018 at 10:06 PM, Vincent Massol <[hidden email]> wrote:

> Yes indeed, that’s very hard. CSS shouldn’t be used as a way to style the
> LaTeX output. The LaTeX exporter itself should provide its own way of
> controlling the style of the output. This is what I do in the LaTeX
> exporter. Basically I provide some default styles (sometimes with some
> config options) and the user has the ability to control exactly the styles
> he/she wants applied if the default style is not enough. It’s not trivial
> though and will take a bit of time if you need a heavily styled document.

This is a major limitation of a LaTeX-based export for XWiki. It makes it
very hard to export any macros that produce HTML + CSS, and any HTML
that the user creates in XWiki.
The current XSL-FO based export supports a limited set of HTML + CSS. Also,
LaTeX does not provide us with a Java PDF export.

The CSS Paged Media standard has the advantage of bringing to the table
HTML + CSS support, and CSS support for the general document output
(header, footer, etc.). Now of course we need to find the right libraries
for that. It would be nice to have an experiment based on this to see how
far we can go with CSS Paged Media.
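
To make the header/footer point concrete, here is a minimal sketch in Java of what such an input could look like. It only builds an XHTML document whose stylesheet uses the `@page` rule and margin boxes from CSS Paged Media; it does not render a PDF. The class and method names (`PagedCssSketch`, `pageCss`, `xhtmlDocument`) are made up for illustration and are not XWiki code, and how much of this CSS a given renderer supports would be part of the experiment.

```java
final class PagedCssSketch {

    /** Returns a print stylesheet that uses CSS Paged Media margin boxes. */
    static String pageCss(String headerText) {
        // Escape backslashes and double quotes so headerText is a valid CSS string.
        String safe = headerText.replace("\\", "\\\\").replace("\"", "\\\"");
        return String.join("\n",
            "@page {",
            "  size: A4;",
            "  margin: 2cm;",
            // Running header and footer are declared in CSS, not in the content.
            "  @top-center { content: \"" + safe + "\"; }",
            "  @bottom-center { content: \"Page \" counter(page) \" of \" counter(pages); }",
            "}",
            "h1 { page-break-after: avoid; }");
    }

    /** Wraps a body fragment and a stylesheet into a standalone XHTML document. */
    static String xhtmlDocument(String title, String css, String bodyXhtml) {
        return String.join("\n",
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">",
            "<head>",
            "<title>" + xmlEscape(title) + "</title>",
            "<style type=\"text/css\">",
            xmlEscape(css), // keep the document well-formed XML
            "</style>",
            "</head>",
            "<body>",
            bodyXhtml, // assumed to be well-formed XHTML already
            "</body>",
            "</html>");
    }

    /** Escapes the characters that are significant in XML text content. */
    static String xmlEscape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String css = pageCss("XWiki export");
        System.out.println(xhtmlDocument("Sample page", css, "<h1>Hello</h1><p>Body text</p>"));
    }
}
```

The resulting document is what would be handed to whichever paged-CSS renderer gets evaluated.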

It's important to consider the full needs if we want to compare
technologies. The LaTeX export produces nice high-quality output, but currently
only for the basic syntax elements that we validate for that output.

Ludovic




Re: replacing XSL-FO

vmassol
Administrator
Hi Ludo,

> On 26 Mar 2018, at 09:16, Ludovic Dubost <[hidden email]> wrote:
>
> This is a major limitation of a latex based export for XWiki. This makes it
> very hard to export any macros that would produce HTML + CSS and any HTML
> that the user would create in XWiki.

Yes, this is what I was mentioning re CSS. On the HTML side, it’s not exactly true, since we can parse the HTML with the XWiki HTML parser, which generates an XDOM, and then render that XDOM. It won’t be perfect though. For example, there are HTML elements that are not supported by our HTML parser (e.g. <FORM> elements).

> The current XSL-FO based export supports a limited set of HTML + CSS. Also
> latex does not provide us with a java pdf export.

One thing has to be clear: I’m absolutely not pushing for the LaTeX export as a replacement for our XSL-FO approach. For some reason you seem to be hinting at that, but it is not my opinion, for several reasons. I mentioned LaTeX here because it’s important to know all the technologies that exist to produce a PDF, and that’s one we have, that’s all.

It’ll be interesting at some point to draw up a pros/cons table on xwiki.org to compare the various export options with their limits.

> The CSS paged media standard has this advantage of bringing to the table
> HTML + CSS support and support of CSS for the general document output
> (header, footer, etc.). Now of course we need to find the right libraries
> for that. It would be nice to have an experiment based on this to see how
> far we can go with CSS paged media.

Definitely. That’s actually the purpose of this GSoC, Ludo! :)

TBH I was the one pushing for this experiment initially, when I found out about the nice results of Flying Saucer… So I’m as eager as you to see what we can get with paged CSS.

> It's important to consider the full needs if we want to compare
> technologies. The latex export makes nice high quality output but currently
> only for the basic syntax elements that we validate for that output.

Regarding the quality of the output, yes it’ll be fun to compare what we get with various inputs when using the 3 technologies:
* XSL-FO
* LaTeX
* Paged CSS (see https://print-css.rocks/intro.html#what-is-css-paged-media)

Thanks
-Vincent


Re: replacing XSL-FO

Paul Libbrecht-2
Hello all,
Hello Mohamed,

Two tools have been suggested in this thread: Flying Saucer and WeasyPrint.
Can you find more tools?
Can you try to see what an architecture would look like with these two tools and incorporate that in your proposal?

Flying Saucer should be quite easy to integrate since it’s in Java. So it would probably be a pom.xml change, or an extension with the relevant pom.xml…
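
As a starting point for the architecture question, one possible shape is a small exporter interface with one implementation per tool. The sketch below is only an assumption about how this could be organised: `PdfExporter` and `WeasyPrintExporter` are hypothetical names, not existing XWiki components. Since WeasyPrint is not a Java library, it is invoked here as an external process through its command line (`weasyprint <input> <output>`), which has to be installed on the server. An in-process Java renderer such as Flying Saucer would be a second implementation of the same interface, and is left out so that the example needs nothing beyond the JDK.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

final class PdfExportSketch {

    /** Hypothetical abstraction: one implementation per rendering tool. */
    interface PdfExporter {
        void export(Path xhtmlFile, Path pdfFile) throws IOException, InterruptedException;
    }

    /** Out-of-process backend that shells out to the WeasyPrint command-line tool. */
    static final class WeasyPrintExporter implements PdfExporter {

        /** Builds the command line: weasyprint <input> <output>. */
        static List<String> command(Path xhtmlFile, Path pdfFile) {
            return Arrays.asList("weasyprint", xhtmlFile.toString(), pdfFile.toString());
        }

        @Override
        public void export(Path xhtmlFile, Path pdfFile) throws IOException, InterruptedException {
            // Requires the weasyprint executable to be on the PATH.
            Process process = new ProcessBuilder(command(xhtmlFile, pdfFile))
                .inheritIO()
                .start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("weasyprint exited with code " + exitCode);
            }
        }
    }

    public static void main(String[] args) {
        // Only prints the command; call export(...) to actually run the tool.
        System.out.println(
            WeasyPrintExporter.command(Paths.get("page.xhtml"), Paths.get("page.pdf")));
    }
}
```

The trade-off to weigh in the proposal is that the in-process option is a dependency change only, while the external-process option adds an installation requirement outside the JVM.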

paul


On 26 Mar 2018, at 9:55, Vincent Massol wrote:

> Hi Ludo,
>
>> On 26 Mar 2018, at 09:16, Ludovic Dubost <[hidden email]> wrote:
>>
>> On Sat, Mar 24, 2018 at 10:06 PM, Vincent Massol <[hidden email]> wrote:
>>
>>>
>>>
>>>> On 24 Mar 2018, at 21:58, Paul Libbrecht <[hidden email]> wrote:
>>>>
>>>> Hello Mohammed,
>>>>
>>>> have you googled for paged-media html to css converters?
>>>>
>>>> Surely an option is to let it be done by the browser but there must also
>>> be engines.
>>>
>>> We have evaluated this in the past and there are lots of limitations, see
>>> https://markmail.org/message/ztcwibiuoqfjcnjo
>>>
>>>> E.g. I think that phantomJS of weasyprint can do that. However, I
>>> haven’t found yet in java (which would simplify things).
>>>
>>> Note that phantomjs is dead now:
>>> https://www.puzzle.ch/blog/articles/2018/02/12/phantomjs-
>>> is-dead-long-live-headless-browsers
>>>
>>>> As Vincent says, print with LaTeX in the middle is a way to get
>>> high-quality but there are many losses too: it is really hard to get CSS
>>> rules to be all implemented in TeX.
>>>
>>> Yes indeed, that’s very hard. CSS shouldn’t be used as a way to style the
>>> LaTeX output. The LaTeX exporter itself should provide its own way of
>>> controlling the style of the output. This is what I do in the LaTeX
>>> exporter. Basically I provide some default styles (sometimes with some
>>> config options) and the user has the ability to control exactly the styles
>>> he/she wants applied if the default style is not enough. It’s not trivial
>>> though and will take a bit of time if you need a heavily styled document.
>>>
>>
>> This is a major limitation of a latex based export for XWiki. This makes it
>> very hard to export any macros that would produce HTML + CSS and any HTML
>> that the user would create in XWiki.
>
> Yes this is what I was mentioning re CSS. On the HTML side, it’s not exactly true since we can parse the HTML with the XWiki HTML parser which generates and XDOM and then render that XDOM. It won’t be perfect though. For example there are HTML elements that are not supported by our HTML parser (ex: <FORM> elements).
>
>> The current XML-FO based export supports a limited set of HTML + CSS. Also
>> latex does not provide us with a java pdf export.
>
> One thing has to be clear: I’m absolutely not pushing for having the LaTeX export as a replacement of our XSL-FO approach. For some reason you seem to be hinting at that which is not my opinion for several reasons. I mentioned LaTeX here because it’s important to know all the technologies that exist to produce a PDF and that’s one we have, that’s all.
>
> It’ll be interesting at some point to draw a Pro/Cons table on xwiki.org to compare the various export options with their limits.
>
>> The CSS paged media standard has this advantage of bringing to the table
>> HTML + CSS support and support of CSS for the general document output
>> (header, footer, etc..). Now of course we need to find the right libraries
>> for that. It would be nice to have an experiment based on this to see how
>> far we can go with css pages media.
>
> Definitely. That’s actually the purpose of this GSOC Ludo! :)
>
> TBH I was the one pushing for this experiment initially when I found about the nice result of flyingsaucer… So I’m as eager as you to see what we can get with paged CSS.
>
>> It's important to consider the full needs if we want to compare
>> technologies. The latex export makes nice high quality output but currently
>> only for the basic syntax elements that we validate for that output.
>
> Regarding the quality of the output, yes it’ll be fun to compare what we get with various inputs when using the 3 technologies:
> * XSL-FO
> * LaTeX
> * Paged CSS (see https://print-css.rocks/intro.html#what-is-css-paged-media)
>
> Thanks
> -Vincent
>
>>
>> Ludovic
>>
>>
>>
>>>
>>> Thanks
>>> -Vincent
>>>
>>>>
>>>> I’m wondering if CSSbox could do the job.
>>>>
>>>> paul
>>>>
>>>> On 24 Mar 2018, at 20:51, Mohamed Ashraf wrote:
>>>>
>>>>> Yes this is part of GSOC project
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>>> On Mar 24, 2018, at 9:29 PM, Vincent Massol <[hidden email]>
>>> wrote:
>>>>>>
>>>>>> Hi Mohamed,
>>>>>>
>>>>>>> On 24 Mar 2018, at 19:12, Mohamed Ashraf <[hidden email]> wrote:
>>>>>>>
>>>>>>> Currently, the PDF export of XWiki is implemented based on XSL-FO and
>>>>>>> transformation of XHTML to FO. This poses a couple of problems, mainly
>>>>>>> related to the current level of support of FO from libraries
>>> implementing
>>>>>>> FO to PDF transformation, as well as the limitations of automatized
>>>>>>> transformation of XHTML to FO. The problems are mainly related to
>>> styling
>>>>>>> limitations, auto-layouting, etc.
>>>>>>>
>>>>>>> The idea is to try to replace this with a pure XHTML & CSS (paged CSS)
>>>>>>> export, using an open source library for producing PDFs out of this
>>>>>>> ,
>>>>>>
>>>>>> Sure, but which one?
>>>>>>
>>>>>> The only alternative I know is flying saucer (which is dead:
>>> https://github.com/flyingsaucerproject/flyingsaucer). Is that what you
>>> mean?
>>>>>>
>>>>>> Do you know a maintained fork of it? One that I know is used by a
>>> competing wiki: https://bitbucket.org/atlassian/xhtmlrenderer-atlassian
>>>>>>
>>>>>> Are you doing this as part of this GSOC project:
>>> http://dev.xwiki.org/xwiki/bin/view/GoogleSummerOfCode/
>>> ImplementPDFexportwithXHTMLpagedCSS ?
>>>>>>
>>>>>> Thanks
>>>>>> -Vincent
>>>>>>
>>>>>>> and I will see LaTeX ,
>>>>>>> thanks
>>>>>>>
>>>>>>> 2018-03-24 19:52 GMT+02:00 Vincent Massol <[hidden email]>:
>>>>>>>
>>>>>>>> Hi Mohamed,
>>>>>>>>
>>>>>>>>> On 24 Mar 2018, at 18:44, Mohamed Ashraf <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>> If we should replacing the XSL-FO which we use to export PDF file
>>> out of
>>>>>>>>> XML,
>>>>>>>>> with XML and CSS only with open-source library ,
>>>>>>>>>
>>>>>>>>> and I think  * ”CSS Paged Media “ *
>>>>>>>>>
>>>>>>>>> is this good enough to do that ,
>>>>>>>>> or there are any suggestion
>>>>>>>>
>>>>>>>> Sorry but I don’t understand your question. Why would you want toi
>>> replace
>>>>>>>> XSL-FO in your XWiki install?
>>>>>>>>
>>>>>>>> If you’d like to contribute to XWiki dev, then could you provide more
>>>>>>>> context and explain why you want to replace XSL-FO and by what.
>>>>>>>>
>>>>>>>> You may also be interested in the LaTeX exporter which can be used to
>>>>>>>> generate PDFs: http://extensions.xwiki.org/xwiki/bin/view/Extension/LaTeX/
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> -Vincent


Re: replacing XSL-FO

Mohamed Ashraf
Hello all,
Hello Mr. Paul Libbrecht,
Thanks, all.
Thanks, Paul, for these wonderful and useful libraries
and this information.
I will use them for sure in my proposal.

Sent from my iPhone

> On Mar 27, 2018, at 2:07 PM, Paul Libbrecht <[hidden email]> wrote:
>
> Hello all,
> Hello Mohamed,
>
> There have been two tools suggested in this thread: flyingsaucer and weasyprint.
> Can you find more tools?
> Can you try to see what an architecture would look like with these two tools and incorporate that in your proposal?
>
> flyingsaucer should be quite easy to integrate since it’s in Java. So it’d probably be a pom.xml change or an extension with the relevant pom.xml…
>
> paul
>
>
>> On 26 Mar 2018, at 9:55, Vincent Massol wrote:
>>
>> Hi Ludo,
>>
>>> On 26 Mar 2018, at 09:16, Ludovic Dubost <[hidden email]> wrote:
>>>
>>> On Sat, Mar 24, 2018 at 10:06 PM, Vincent Massol <[hidden email]> wrote:
>>>
>>>>
>>>>
>>>>> On 24 Mar 2018, at 21:58, Paul Libbrecht <[hidden email]> wrote:
>>>>>
>>>>> Hello Mohammed,
>>>>>
>>>>> have you googled for paged-media html to css converters?
>>>>>
>>>>> Surely an option is to let it be done by the browser but there must also
>>>>> be engines.
>>>>
>>>> We have evaluated this in the past and there are lots of limitations, see
>>>> https://markmail.org/message/ztcwibiuoqfjcnjo
>>>>
>>>>> E.g. I think that phantomJS or weasyprint can do that. However, I
>>>>> haven’t found one yet in Java (which would simplify things).
>>>>
>>>> Note that phantomjs is dead now:
>>>> https://www.puzzle.ch/blog/articles/2018/02/12/phantomjs-is-dead-long-live-headless-browsers
>>>>
>>>>> As Vincent says, printing with LaTeX in the middle is a way to get
>>>>> high-quality output, but there are many losses too: it is really hard to
>>>>> get all CSS rules implemented in TeX.
>>>>
>>>> Yes indeed, that’s very hard. CSS shouldn’t be used as a way to style the
>>>> LaTeX output. The LaTeX exporter itself should provide its own way of
>>>> controlling the style of the output. This is what I do in the LaTeX
>>>> exporter. Basically I provide some default styles (sometimes with some
>>>> config options) and the user has the ability to control exactly the styles
>>>> he/she wants applied if the default style is not enough. It’s not trivial
>>>> though and will take a bit of time if you need a heavily styled document.
>>>>
>>>
>>> This is a major limitation of a latex based export for XWiki. This makes it
>>> very hard to export any macros that would produce HTML + CSS and any HTML
>>> that the user would create in XWiki.
>>
>> Yes this is what I was mentioning re CSS. On the HTML side, it’s not exactly true since we can parse the HTML with the XWiki HTML parser which generates an XDOM and then render that XDOM. It won’t be perfect though. For example there are HTML elements that are not supported by our HTML parser (ex: <FORM> elements).
>>
>>> The current XSL-FO based export supports a limited set of HTML + CSS. Also
>>> LaTeX does not provide us with a Java PDF export.
>>
>> One thing has to be clear: I’m absolutely not pushing for having the LaTeX export as a replacement of our XSL-FO approach. For some reason you seem to be hinting at that which is not my opinion for several reasons. I mentioned LaTeX here because it’s important to know all the technologies that exist to produce a PDF and that’s one we have, that’s all.
>>
>> It’ll be interesting at some point to draw a Pro/Cons table on xwiki.org to compare the various export options with their limits.
>>
>>> The CSS paged media standard has this advantage of bringing to the table
>>> HTML + CSS support and support of CSS for the general document output
>>> (header, footer, etc..). Now of course we need to find the right libraries
>>> for that. It would be nice to have an experiment based on this to see how
>>> far we can go with CSS paged media.
>>
>> Definitely. That’s actually the purpose of this GSOC Ludo! :)
>>
>> TBH I was the one pushing for this experiment initially when I found out about the nice results of flyingsaucer… So I’m as eager as you to see what we can get with paged CSS.
>>
>>> It's important to consider the full needs if we want to compare
>>> technologies. The latex export makes nice high quality output but currently
>>> only for the basic syntax elements that we validate for that output.
>>
>> Regarding the quality of the output, yes it’ll be fun to compare what we get with various inputs when using the 3 technologies:
>> * XSL-FO
>> * LaTeX
>> * Paged CSS (see https://print-css.rocks/intro.html#what-is-css-paged-media)
>>
>> Thanks
>> -Vincent
>>
>>>
>>> Ludovic
>>>
>>>
>>>
>>>>
>>>> Thanks
>>>> -Vincent
>>>>
>>>>>
>>>>> I’m wondering if CSSbox could do the job.
>>>>>
>>>>> paul
>>>>>
>>>>>> On 24 Mar 2018, at 20:51, Mohamed Ashraf wrote:
>>>>>>
>>>>>> Yes this is part of GSOC project
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>>> On Mar 24, 2018, at 9:29 PM, Vincent Massol <[hidden email]> wrote:
>>>>>>>
>>>>>>> Hi Mohamed,
>>>>>>>
>>>>>>>> On 24 Mar 2018, at 19:12, Mohamed Ashraf <[hidden email]> wrote:
>>>>>>>>
>>>>>>>> Currently, the PDF export of XWiki is based on XSL-FO and on the
>>>>>>>> transformation of XHTML to FO. This poses a couple of problems, mainly
>>>>>>>> related to the current level of FO support in the libraries implementing
>>>>>>>> the FO to PDF transformation, as well as the limitations of the automated
>>>>>>>> transformation of XHTML to FO. The problems are mainly related to styling
>>>>>>>> limitations, automatic layout, etc.
>>>>>>>>
>>>>>>>> The idea is to try to replace this with a pure XHTML & CSS (paged CSS)
>>>>>>>> export, using an open source library for producing PDFs out of this,
>>>>>>>
>>>>>>> Sure, but which one?
>>>>>>>
>>>>>>> The only alternative I know is flying saucer (which is dead:
>>>>>>> https://github.com/flyingsaucerproject/flyingsaucer). Is that what you mean?
>>>>>>>
>>>>>>> Do you know a maintained fork of it? One that I know is used by a
>>>>>>> competing wiki: https://bitbucket.org/atlassian/xhtmlrenderer-atlassian
>>>>>>>
>>>>>>> Are you doing this as part of this GSOC project:
>>>>>>> http://dev.xwiki.org/xwiki/bin/view/GoogleSummerOfCode/ImplementPDFexportwithXHTMLpagedCSS ?
>>>>>>>
>>>>>>> Thanks
>>>>>>> -Vincent
>>>>>>>
>>>>>>>> and I will look at LaTeX,
>>>>>>>> thanks
>>>>>>>>
>>>>>>>> 2018-03-24 19:52 GMT+02:00 Vincent Massol <[hidden email]>:
>>>>>>>>
>>>>>>>>> Hi Mohamed,
>>>>>>>>>
>>>>>>>>>> On 24 Mar 2018, at 18:44, Mohamed Ashraf <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>> Should we replace XSL-FO, which we use to export PDF files from XML,
>>>>>>>>>> with XML and CSS only, using an open-source library?
>>>>>>>>>>
>>>>>>>>>> I am thinking of "CSS Paged Media".
>>>>>>>>>>
>>>>>>>>>> Is this good enough to do that,
>>>>>>>>>> or are there any other suggestions?
>>>>>>>>>
>>>>>>>>> Sorry but I don’t understand your question. Why would you want to replace
>>>>>>>>> XSL-FO in your XWiki install?
>>>>>>>>>
>>>>>>>>> If you’d like to contribute to XWiki dev, then could you provide more
>>>>>>>>> context and explain why you want to replace XSL-FO and by what.
>>>>>>>>>
>>>>>>>>> You may also be interested in the LaTeX exporter which can be used to
>>>>>>>>> generate PDFs: http://extensions.xwiki.org/xwiki/bin/view/Extension/LaTeX/
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> -Vincent
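
To make the "paged CSS" option discussed in this thread a bit more concrete, here is a minimal Java sketch of the kind of XHTML + CSS input such an export could hand to a renderer (flyingsaucer, weasyprint, or another tool). It only builds the document as a string; it does not call any PDF library. The class name PagedCssSketch and the methods pagedCss, toXhtml and escapeXml are hypothetical names made up for this illustration and are not part of XWiki. The @page rule and its margin boxes come from CSS Paged Media, but how much of it a given renderer honours (for example counter(pages)) would need to be checked as part of the experiment.

```java
// Hypothetical illustration only: none of these names exist in XWiki.
class PagedCssSketch {

    /** Escapes the characters that would break XHTML text content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /**
     * Builds a paged-media stylesheet: A4 pages, a running title in the
     * top margin box and a page counter in the bottom margin box.
     */
    static String pagedCss(String runningTitle) {
        // Escape backslashes and double quotes so the title is a valid CSS string.
        String safe = runningTitle.replace("\\", "\\\\").replace("\"", "\\\"");
        return "@page {\n"
             + "  size: A4;\n"
             + "  margin: 20mm;\n"
             + "  @top-center { content: \"" + safe + "\"; }\n"
             + "  @bottom-right { content: \"Page \" counter(page) \" of \" counter(pages); }\n"
             + "}\n"
             + "h1 { page-break-before: always; }\n";
    }

    /** Wraps an already well-formed XHTML body fragment into a full document. */
    static String toXhtml(String title, String bodyXhtml) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<html xmlns=\"http://www.w3.org/1999/xhtml\">\n"
             + "<head>\n"
             + "<title>" + escapeXml(title) + "</title>\n"
             + "<style type=\"text/css\">\n" + escapeXml(pagedCss(title)) + "</style>\n"
             + "</head>\n"
             + "<body>\n" + bodyXhtml + "\n</body>\n"
             + "</html>\n";
    }

    public static void main(String[] args) {
        // Print the document that would be passed to a paged-CSS renderer.
        System.out.println(toXhtml("Sample page", "<h1>Intro</h1><p>Hello</p>"));
    }
}
```

The point of the sketch is that headers, footers and page breaks are expressed in the stylesheet itself, so the same XHTML that XWiki already renders could in principle be reused, with the renderer choice kept separate from the styling.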