How do I prevent smooks from escaping ampersands?

How do I prevent smooks from escaping ampersands?

myarbrou
I have a CSV file that I'm converting to a Java bean, and one of the fields has "H&M" in it. When I convert the file, Smooks returns "H&amp;M".

How do I prevent this from happening?

Thanks

Here's the config:
<?xml version="1.0" encoding="UTF-8"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
    xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.1.xsd" xmlns:jb="http://www.milyn.org/xsd/smooks/javabean-1.1.xsd">
    <params>
        <param name="stream.filter.type">SAX</param>
        <param name="default.serialization.on">false</param>
        <param name="feature-off">http://apache.org/xml/features/scanner/notify-char-refs</param>
    </params>

    <csv:reader
        fields="Property,PropertyID,Unit#,FloorPlan, ReportingStatus, WorkType, Sub-Status, Subject, CurrentStep, Status, DenyCounter, RevisionCounter, MoveOutDate, BaselineOPSCommitPlan, PlannedOPSCommitPlan, ActualOPSCommitPlan, BaselineCSAcceptsPlan, PlannedCSAcceptsPlan, ActualCSAcceptsPlan, BaselineOPSCommitUnit, PlannedOPSCommitUnit, ActualOPSCommitUnit, BaselineCSAcceptsUnit, PlannedCSAcceptsUnit, ActualCSAcceptsUnit, BaselineCSTurnsUnit, PlannedCSTurnsUnit, ActualCSTurnsUnit, BaselineCommunityManagerAccepts, PlannedCommunityManagerAccepts, ActualCommunityManagerAccepts, BaselineRentReady, PlannedRentReady, ActualRentReady, Area, PreleaseMove-inDate"
        skipLines="1" separator=","/>

    <jb:bindings beanId="UpDownUnitList" class="java.util.ArrayList" createOnElement="csv-set">
        <jb:wiring beanIdRef="UpDownUnit" />
    </jb:bindings>

    <jb:bindings beanId="UpDownUnit" class="com.mycompany.beans.UpDownUnit" createOnElement="csv-record">
        <jb:value data="csv-record/Property" property="property" />
        <jb:value data="csv-record/PropertyID" property="propertyNumber" />
        <jb:value data="csv-record/Unit#" property="unitName" />
        ...
        ...
        ...
        <jb:value data="csv-record/Area" property="area" />
        <jb:value data="csv-record/PreleaseMove-inDate" decoder="com.mycompany.smooks.decoders.DateNullDecoder" property="preleaseMoveInDate">
            <jb:decodeParam name="format">MM.dd.yyyy</jb:decodeParam>
        </jb:value>
    </jb:bindings>
</smooks-resource-list>

Re: How do I prevent smooks from escaping ampersands?

aemcat313
myarbrou wrote
I have a CSV file that I'm converting to a Java bean, and one of the fields has "H&M" in it. When I convert the file, Smooks returns "H&amp;M".

How do I prevent this from happening?

You may need a custom Decoder class to "unescape" the &. We ran into this problem with java-to-java smooks transformation, mentioned it to Red Hat, who said it was a bug in smooks, but it won't be fixed. So we wrote the decoder as a workaround - used for any elements that might have such text.

Maybe this will help. Note that it uses the Apache commons-lang jar for org.apache.commons.lang.StringEscapeUtils.

---
in smooks file:
<jb:value property="productName" data="productName" decoder="com.yourcompany.text.StringDecoderUnescapeXml" />
---

package com.yourcompany.text; // must match the package used in the decoder="..." attribute above

import org.apache.commons.lang.StringEscapeUtils;
import org.milyn.javabean.DataDecodeException;
import org.milyn.javabean.DataDecoder;

/**
 * This decoder is for a workaround to a Red Hat Smooks bug in Java-Java transform.
 * XML reserved characters such as &, <, > are being XML-encoded, which they should not be.
 */
public class StringDecoderUnescapeXml implements DataDecoder
{
        /**
         * Unescapes XML-escaped characters in data. For example,
         * &amp; will be converted to &
         */
        public Object decode(String data) throws DataDecodeException
        {
                if (data == null) {
                        return null;
                } else {
                        return StringEscapeUtils.unescapeXml(data);
                }
        }
}
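
To make the effect of the decoder concrete, here is a small dependency-free sketch of the same unescaping step. The class name XmlUnescapeSketch and its unescape method are made up for illustration; it is not the commons-lang implementation, and it only covers the five predefined XML entities (no numeric character references), so treat it as a way to see what the decoder does to a value rather than a replacement for StringEscapeUtils.unescapeXml.

```java
// Hypothetical stand-in for StringEscapeUtils.unescapeXml, for illustration only.
// Handles the five predefined XML entities; numeric references are left untouched.
class XmlUnescapeSketch {

    static String unescape(String data) {
        if (data == null) {
            return null; // mirror the null handling of the decoder above
        }
        // "&amp;" is replaced last so that "&amp;lt;" becomes "&lt;" rather than "<".
        return data.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&apos;", "'")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(unescape("H&amp;M"));      // prints H&M
        System.out.println(unescape("H&amp;amp;M"));  // prints H&amp;M
    }
}
```

Note that one pass removes only one level of escaping, so a value that was escaped twice still comes back with "&amp;" in it.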

Re: How do I prevent smooks from escaping ampersands?

ak-dak
Hi,

I had the same problem with an EDIFACT to Java conversion. After migrating to Smooks 1.5.1, the unwanted escaping disappeared.


Best regards

Anton



From:        aemcat313 <[hidden email]>
To:        [hidden email],
Date:        08.10.2014 23:39
Subject:        [milyn-user] Re: How do I prevent smooks from escaping ampersands?






Re: How do I prevent smooks from escaping ampersands?

myarbrou
In reply to this post by myarbrou
Thanks for your support, guys. We took the SAX parser out and that seems to fix it across all fields, though I'm not sure why. I'll talk to my boss about migrating to 1.5 and see what that does. Otherwise I might start using more decoders!

Thanks again