Decode HTML


Decode HTML

DeHaynes
I have some Groovy code in which I grab the value from a field and store it in an object.  When the value has spaces, it is encoded with "&nbsp;".  Is there a way to remove HTML encoding from a string?  I looked in Util and didn't see anything.

Thanks.

Re: Decode HTML

DeHaynes
I found StringEscapeUtils.unescapeJava() in groovy.json.StringEscapeUtils.

Re: Decode HTML

Sofiane Baloul
You can use StringEscapeUtils.unescapeHtml(String):
http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html#unescapeHtml%28java.lang.String%29
Hope this helps.

--
Sofiane


On Fri, Mar 28, 2014 at 9:06 PM, DeHaynes <[hidden email]> wrote:

> I found StringEscapeUtils.unescapeJava() in import
> groovy.json.StringEscapeUtils.
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs

Re: Decode HTML

DeHaynes
I tried that method.

source.setTitle(StringEscapeUtils.unescapeHtml(NewDocumentName));

But if there are two or more spaces in a row, it doesn't work.  It gives the following:

My Process  3

In that one, there were two spaces between "Process" and the number 3.  Any ideas why?

Re: Decode HTML

Marius Dumitru Florea
On Fri, Mar 28, 2014 at 10:01 PM, DeHaynes <[hidden email]> wrote:
> I have some groovy code and in it I am grabbing the value from a field and
> storing it in an object.  When the value has spaces, it is encoded with
> "&nbsp;".  Is there a way to remove HTML encoding from a string?  I looked
> in Util and didn't see anything.

What type of field are you referring to? HTML input field? Is it a
text area or an input with type=text? How do you get the value from
the field?

If the value contains &nbsp; then it's probably HTML, so removing the
HTML encoding will break the HTML, as it can also contain &gt; or &lt;, for example:

before <em>&nbsp;1 &lt; 2</em> after

I guess what you are trying to achieve is to get the plain text from
HTML content.

Hope this helps,
Marius
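
To make the "plain text from HTML" idea concrete, here is a minimal sketch in plain Java with no extra libraries. The class name PlainTextSketch, the method toPlainText and the five-entry entity table are all made up for illustration and are not part of XWiki or Commons Lang. The regex-based tag stripping is an assumption that only holds for simple fragments like the one above; for arbitrary HTML a real parser would be the safer choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PlainTextSketch {
    // Hypothetical, deliberately tiny entity table; real HTML defines many more.
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&nbsp;", " ");   // mapped to a regular space on purpose
        ENTITIES.put("&lt;", "<");
        ENTITIES.put("&gt;", ">");
        ENTITIES.put("&quot;", "\"");
        ENTITIES.put("&amp;", "&");    // decoded last so "&amp;lt;" becomes "&lt;", not "<"
    }

    static String toPlainText(String html) {
        // Strip tags first, while literal "<" and ">" in the text are still escaped.
        String text = html.replaceAll("<[^>]*>", "");
        // Then decode the known entities in insertion order.
        for (Map.Entry<String, String> entity : ENTITIES.entrySet()) {
            text = text.replace(entity.getKey(), entity.getValue());
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("before <em>&nbsp;1 &lt; 2</em> after"));
    }
}
```

The order matters: decoding the entities before stripping the tags would turn "1 &lt; 2" into "1 < 2" and the tag regex could then eat real text.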


Re: Decode HTML

DeHaynes
I am using a String input field to hold the name of the document and its title.  It works fine for the document name, but not the document title.  If there is more than one space between words in the title, it inserts a "&nbsp;" for each additional space beyond the first one.

This is in Groovy.  I get the value from the field like this:

// I use this to pull my class object from the form document.
def myObject = source.getObject(Globals.fullClassSpace);

// I use this to get the value of the field.
def NewDocumentName = myObject.get(Globals.titleFieldName).value

// I use this to set the title of the document.
source.setTitle(StringEscapeUtils.unescapeJava(NewDocumentName));
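
A possible explanation for the behaviour described in this thread: unescapeJava() only handles Java backslash escapes, so it leaves "&nbsp;" untouched, while Commons Lang's unescapeHtml() decodes "&nbsp;" to the non-breaking space character U+00A0 rather than to a regular space (U+0020). Either way the title does not end up with ordinary spaces. The sketch below is plain Java with no extra libraries and shows one way to normalize both forms. The class name TitleSpaces and its two methods are hypothetical helpers, not part of XWiki, Groovy or Commons Lang.

```java
final class TitleSpaces {
    // The character that the "&nbsp;" entity stands for.
    private static final char NBSP = '\u00A0';

    // Replace both the literal "&nbsp;" entity and an already decoded
    // non-breaking space with a regular space.
    static String normalizeSpaces(String value) {
        if (value == null) {
            return null;
        }
        return value.replace("&nbsp;", " ").replace(NBSP, ' ');
    }

    // Optional extra step: collapse runs of spaces into a single one,
    // for cases where the repeated spaces are not wanted in the title.
    static String collapseSpaces(String value) {
        if (value == null) {
            return null;
        }
        return normalizeSpaces(value).replaceAll(" {2,}", " ");
    }

    public static void main(String[] args) {
        // Two spaces were typed; the field stored the second one as "&nbsp;".
        String stored = "My Process &nbsp;3";
        System.out.println("[" + normalizeSpaces(stored) + "]");
        System.out.println("[" + collapseSpaces(stored) + "]");
    }
}
```

In the Groovy code above, the normalized value would be what gets passed to source.setTitle(), assuming the rest of the title contains no other entities that need decoding.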