Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug on on referrer header when value contains &section like www.asdf.com?a=1&section=2 #503

Closed
federico-piazza opened this issue Jun 27, 2019 · 26 comments

Comments

@federico-piazza
Copy link

We are using ESAPI 2.2.0.0 version and found a bug when using referrer header.

If we have below config:

Encoder.AllowMultipleEncoding=false
Encoder.AllowMixedEncoding=false

And when we use a header like Referer http://127.0.0.1:3000/campaigns?goal=all&section=active&sort-by=-id&status=Draft%2CLaunched, this generates errors like this:

2019-06-27 15:26:14 [qtp414915464-14] ERROR IntrusionException - [SECURITY FAILURE Anonymous:null@unknown -> 127.0.0.1:8282/Meta/IntrusionException] INTRUSION - Mixed encoding (2x) detected in http://127.0.0.1:3000/campaigns?goal=all&section=active&sort-by=-id&status=Draft%2CLaunched

By debugging the source code the encoding validation has this snippet on DefaultEncoder class:

            while ( i.hasNext() ) {
                Codec codec = (Codec)i.next();
                String old = working;
                working = codec.decode( working );
                if ( !old.equals( working ) ) {
                    if ( codecFound != null && codecFound != codec ) {
                        mixedCount++;
                    }
                    codecFound = codec;
                    if ( clean ) {
                        foundCount++;
                    }
                    clean = false;
                }
            }

As you can see, our referrer header value is decoded from this:

http://127.0.0.1:3000/campaigns?goal=all&page=1&section=active&sort-by=-id&status=Draft%2CLaunched

To this

http://127.0.0.1:3000/campaigns?goal=all&page=1§ion=active&sort-by=-id&status=Draft%2CLaunched
                                               ^^^^

Therefore this counts as one encoding applied, then it continues and in the next decode it goes with:

http://127.0.0.1:3000/campaigns?goal=all&page=1§ion=active&sort-by=-id&status=Draft,Launched

As you can see the &section is decoded as §ion and then the %2C is decoded as ,, therefore the encoding counter is 2... which is treated as a mixed encoding and generates above error log.

I think this is a bug since the url is valid to have &section=....

What do you think?

@kwwall
Copy link
Contributor

kwwall commented Jun 29, 2019

@federico-piazza Seems like a bug to me since you have both Encoder.AllowMultipleEncoding and Encoder.AllowMixedEncoding set to false.

Do you suppose you could attach a simple 2 or 3 line code fragment that will allow us to reproduce this (assuming these config settings) so I don't have to chase down exactly exactly which Encoder method you are calling. That would help us a lot. Thanks.

@kwwall kwwall added the bug label Jun 29, 2019
@federico-piazza
Copy link
Author

federico-piazza commented Jun 29, 2019

hey @kwwall not sure really what snippet you need from me, so I think this is what you look for:

@Bean
public Filter esapiSecurityWrapper() {
        return new SecurityWrapper(); // org.owasp.esapi.filters.SecurityWrapper
}
@Bean
public FilterRegistrationBean esapiFilterRegistrationBean() {
        FilterRegistrationBean registrationBean = new FilterRegistrationBean();
        registrationBean.setFilter(esapiSecurityWrapper());
        registrationBean.setName("ESAPISecurityWrapper");
        registrationBean.setOrder(2);
        return registrationBean;
}

The rest is configuration con ESAPI.properties

Encoder.AllowMultipleEncoding=false
Encoder.AllowMixedEncoding=false
Encoder.DefaultCodecList=HTMLEntityCodec,PercentCodec,JavaScriptCodec

@kwwall
Copy link
Contributor

kwwall commented Jun 30, 2019

Yeah; okay, I didn't realize that you were using ESAPI's SecurityWrapper. I guess what I meant was, since you had already gone through the debugging and discovered it was going through this part of the DefaultEncoder class:

            while ( i.hasNext() ) {
                Codec codec = (Codec)i.next();
                String old = working;
                working = codec.decode( working );
                if ( !old.equals( working ) ) {
                    if ( codecFound != null && codecFound != codec ) {
                        mixedCount++;
                    }
                    codecFound = codec;
                    if ( clean ) {
                        foundCount++;
                    }
                    clean = false;
                }
            }

could you identify the method in DefaultEncoder that was actually on the call-chain? Just to save me time having to hunt for this particular snippet in the code and then devising a JUnit test for it to reproduce it. Or even better, if you could turn this into a new test in DefaultEncoderTest that calls out this behavior.
Sorry for the confusion.

@federico-piazza
Copy link
Author

Hope this help.

The class is DefaultEncoder, the method is canonicalize:

public String canonicalize( String input, boolean restrictMultiple, boolean restrictMixed ) {

The stack I could find was this SecurityWrapperResponse.addHeader:

public void addHeader(String name, String value) {
        try {
            // TODO: make stripping a global config
        	SecurityConfiguration sc = ESAPI.securityConfiguration();
            String strippedName = StringUtilities.stripControls(name);
            String strippedValue = StringUtilities.stripControls(value);
            String safeName = ESAPI.validator().getValidInput("addHeader", strippedName, "HTTPHeaderName", sc.getIntProp("HttpUtilities.MaxHeaderNameSize"), false);
            String safeValue = ESAPI.validator().getValidInput("addHeader", strippedValue, "HTTPHeaderValue", sc.getIntProp("HttpUtilities.MaxHeaderValueSize"), false);
            getHttpServletResponse().addHeader(safeName, safeValue);
        } catch (ValidationException e) {
            logger.warning(Logger.SECURITY_FAILURE, "Attempt to add invalid header denied", e);
        }
    } 

The ESAPI.validator().getValidInput internally calls:

		return getValidInput(context, input, type, maxLength, allowNull, true);

That method:

public String getValidInput(String context, String input, String type, int maxLength, boolean allowNull, boolean canonicalize) throws ValidationException {
		StringValidationRule rvr = new StringValidationRule( type, encoder );
		Pattern p = ESAPI.securityConfiguration().getValidationPattern( type );
		if ( p != null ) {
			rvr.addWhitelistPattern( p );
		} else {
            // Issue 232 - Specify requested type in exception message - CS
			throw new IllegalArgumentException("The selected type [" + type + "] was not set via the ESAPI validation configuration");
		}
		rvr.setMaximumLength(maxLength);
		rvr.setAllowNull(allowNull);
		rvr.setCanonicalize(canonicalize);
		return rvr.getValid(context, input);
	}

The return rvr.getValid(context, input); is:

public String getValid( String context, String input ) throws ValidationException
	{
		String data = null;

		// check for empty/null
		if(checkEmpty(context, input) == null)
			return null;

		// check length
		checkLength(context, input);

		// canonicalize
		if (canonicalizeInput) {
		    data = encoder.canonicalize(input);
		} else {
		    String message = String.format("Input validaiton excludes canonicalization.  Context: %s   Input: %s", context, input);
		    LOGGER.warning(Logger.SECURITY_AUDIT, message);
            data = input;
		}

		// check whitelist patterns
		checkWhitelist(context, input);

		// check blacklist patterns
		checkBlacklist(context, input);
			
		// validation passed
		return data;
	}

Above method has data = encoder.canonicalize(input);

This canonicalize belongs to DefaultEncoder:

public String canonicalize( String input ) {
		if ( input == null ) {
			return null;
		}

        // Issue 231 - These are reverse boolean logic in the Encoder interface, so we need to invert these values - CS
		return canonicalize(input, 
							!ESAPI.securityConfiguration().getAllowMultipleEncoding(),
							!ESAPI.securityConfiguration().getAllowMixedEncoding() );
	}

And that return canonicalize... is:

public String canonicalize( String input, boolean restrictMultiple, boolean restrictMixed ) {
		if ( input == null ) {
			return null;
		}
		
        String working = input;
        Codec codecFound = null;
        int mixedCount = 1;
        int foundCount = 0;
        boolean clean = false;
        while( !clean ) {
            clean = true;
            
            // try each codec and keep track of which ones work
            Iterator i = codecs.iterator();
            while ( i.hasNext() ) {
                Codec codec = (Codec)i.next();
                String old = working;
                working = codec.decode( working );
                if ( !old.equals( working ) ) {
                    if ( codecFound != null && codecFound != codec ) {
                        mixedCount++;
                    }
                    codecFound = codec;
                    if ( clean ) {
                        foundCount++;
                    }
                    clean = false;
                }
            }
        }
        
        // do strict tests and handle if any mixed, multiple, nested encoding were found
        if ( foundCount >= 2 && mixedCount > 1 ) {
            if ( restrictMultiple || restrictMixed ) {
                throw new IntrusionException( "Input validation failure", "Multiple ("+ foundCount +"x) and mixed encoding ("+ mixedCount +"x) detected in " + input );
            } else {
                logger.warning( Logger.SECURITY_FAILURE, "Multiple ("+ foundCount +"x) and mixed encoding ("+ mixedCount +"x) detected in " + input );
            }
        }
        else if ( foundCount >= 2 ) {
            if ( restrictMultiple ) {
                throw new IntrusionException( "Input validation failure", "Multiple ("+ foundCount +"x) encoding detected in " + input );
            } else {
                logger.warning( Logger.SECURITY_FAILURE, "Multiple ("+ foundCount +"x) encoding detected in " + input );
            }
        }
        else if ( mixedCount > 1 ) {
            if ( restrictMixed ) {
                throw new IntrusionException( "Input validation failure", "Mixed encoding ("+ mixedCount +"x) detected in " + input );
            } else {
                logger.warning( Logger.SECURITY_FAILURE, "Mixed encoding ("+ mixedCount +"x) detected in " + input );
            }
        }
        return working;
	}

So, here you can find the issue with the mixed encoding.

@kwwall
Copy link
Contributor

kwwall commented Jun 30, 2019

@federico-piazza Excellent! Exactly what I was looking for. Thanks.

@jeremiahjstacey
Copy link
Collaborator

Trying to help out with limited connectivity. Wrote up another method for EncoderTest that should capture the bug. Problem is that I'm not getting the IntrusionException when I run it. I may be overlooking something though. I've included below,, maybe you'll see what I'm missing.


 public void testPiazzaCanoncalizeHeader()  throws Exception {
        String piazzaCheck = "http://127.0.0.1:3000/campaigns?goal=all&section=active&sort-by=-id&status=Draft%2CLaunched";
        //Only mutation on the raw String in the call chain (That I noticed)
        piazzaCheck = org.owasp.esapi.StringUtilities.stripControls(piazzaCheck);
        ArrayList<String> list = new ArrayList<String>();
        list.add( "HTMLEntityCodec" );
        list.add( "PercentCodec" );
        list.add( "JavaScriptCodec" );
        
        Encoder instance = new DefaultEncoder( list );
        try {
            String can = instance.canonicalize(piazzaCheck, false, false);
            org.junit.Assert.fail("Should have thrown Exception");
        } catch (IntrusionException ex) {
            //expected
        }
    }

@xeno6696
Copy link
Collaborator

xeno6696 commented Jul 8, 2019

Still catching up from Vacation.

The long and the short of it is that the validation on the referer header needs to use the new DefaultEncoder.getCanonicalizedURI() method. And this will have been a problem for at least a decade. The root cause is that a URI is already a language specification itself, and some parameters are also HTML entities.

@jeremiahjstacey , to get your test to work as expected you have to pick a URI GET parameter that maps to an HTML entity, like &para or &apos. &section, &sort and &status do not map to any HTMLEntities.

@xeno6696
Copy link
Collaborator

xeno6696 commented Jul 8, 2019

Actually I should have made a convenience method called DefaultEncoder.getValidURI()

@kwwall
Copy link
Contributor

kwwall commented Jul 8, 2019

I'm not convinced that '&section=active' should have been matched as the named character reference for '&sect;' and thus changed into '§ion=active' in the first place.

My rudimentary (mis)understanding from what I vaguely recall from @planetlevel about this choice of not requiring trailing semicolons as part of the named character reference matches was that it had something to do with certain browsers out coming of Redmond, WA from a corporation whose thinking was that they know better than everyone else and in an attempt to discern the developers intent apparently allowed their browsers to match named character references that did not have trailing ';' that was part of the the W3C spec. (IMO, an abuse of Postel's Law if there ever was one.)

So I think the thinking for ESAPI at the time was since all these broken browsers out there allow it (and at the time, they had the lion's share of the browser market), so then ESAPI should match named character references too if they have a missing trailing ';' in order to provide a "safe haven" of sorts.

I really hate to carry something like that forward, but the only away around it and to allow backward compatibility with earlier ESAPI 2.x releases would be to set up a new ESAPI property that give developers the option to ignore the trailing ';' on named character references (which would be the default for backwards compatibility) or to require it and it not present then something like '&section=active' would not be changed into '§ion=active'. (Because everything in ESAPI is pretty much a singleton and we presently don't have a way to reparse the ESAPI.properties file after it is initially read, you pretty much would still have to have it one way or another rather than allowing a developer to choose on a case-by-case basis, but it would still probably be better than it is.)

Of Java supported default parameter values to methods like C++ does, we could easily make it backward compatible using that, but Java doesn't support that and AFAIK, has no plans to. We probably could kludge something that would work by adding a new method all the way down (in this case, maybe all the way down to StringValidationRule), but I see that as really clumsy and simply not worth it.

Of course, if we can figure out somehow that what we are looking at is a URI, maybe we could require a strict adherence to the W3C spec as far as named character references go since I can't think of any time they legitimately be used within a URI. (That's certainly not the way you would URL encode them.)

Thoughts?

@xeno6696
Copy link
Collaborator

xeno6696 commented Jul 9, 2019

My rudimentary (mis)understanding from what I vaguely recall from @planetlevel about this choice of not requiring trailing semicolons as part of the named character reference matches was that it had something to do with certain browsers out coming of Redmond, WA from a corporation whose thinking was that they know better than everyone else and in an attempt to discern the developers intent apparently allowed their browsers to match named character references that did not have trailing ';' that was part of the the W3C spec. (IMO, an abuse of Postel's Law if there ever was one.)

Very correct. I struggled with that and almost converted the HTMLEntityCodec to using semicolon-based entities but backed off.

Of course, if we can figure out somehow that what we are looking at is a URI, maybe we could require a strict adherence to the W3C spec as far as named character references go since I can't think of any time they legitimately be used within a URI. (That's certainly not the way you would URL encode them.)

It would be pretty simple really: I almost removed them once. We mark the HTMLEntityCodec class as Deprecated for the next release and then create a new Codec that matches the spec, change esapi.props to point to the new codec. Would be a great first issue.

@planetlevel
Copy link

planetlevel commented Jul 9, 2019 via email

@xeno6696
Copy link
Collaborator

xeno6696 commented Jul 9, 2019

Could have sworn I tested this before with a negative result, but I just tried it in chrome, firefox, and IE11 and they all render.

@xeno6696
Copy link
Collaborator

xeno6696 commented Jul 9, 2019

Edge too.

@kwwall
Copy link
Contributor

kwwall commented Jul 10, 2019

@planetlevel - I hear ya, and I don't disagree. But my point is that what we presently have is breaking things because ESAPI is overly aggressively canonicalizing everything. That would be fine if we could give developers control when it becomes problematic, but in this case we can't. (Well, rather, more accurately, we haven't.) In fact, until recently (see closed GitHub issue #476), even where we had promised that it could be disabled, we were not able to do so because of a bug. For reasons already mentioned, I don't think we can simply add some new property that disables canonicalization because that you couldn't enable it as-needed. But given that this problem ultimately happens in a JavaEE filter, I'm not sure how we can control it...maybe an init parameter for the filter. But I think it makes more sense to just not to do this for URIs. (And if we wanted it permanently disabled for everything, we already have the Encoder.DefaultCodecList, which I suspect [haven't tried] if you set it to empty, would disable canonicalization. So I think we need some new approach. But what is the question?

@xeno6696
Copy link
Collaborator

Again, the root cause of this issue isn't aggressive canonicalization. It's just that whoever developed the original canonicalize method didn't consider that URIs are a special case. We have the ability to safely and correctly canonicalize URI's now. I don't know how I missed this case, but I did and it's an easy fix.

@planetlevel
Copy link

planetlevel commented Jul 10, 2019 via email

@xeno6696
Copy link
Collaborator

I know that one main advantages of canonicalize is the detection of mixed and double encoding cases, but it's original intent was to provide a canonicalized string that would be considered safe to run against our validation regexes. Maybe in the future we should consider separating the intrusion detection component and then migrate to defaulting ESAPI to the Encoder project. Jim's encoder however is encode only not decode so users of ESAPI that are using the decode methods will need a path forward--or we add decoding functionality to the Encoder project. Or we say "Oh well!"

@planetlevel
Copy link

planetlevel commented Jul 10, 2019 via email

@xeno6696
Copy link
Collaborator

@federico-piazza can you provide a few more layers of your stack call? I'm interested in the path that leads to the addHeader method. For URI's we're going to have to bifurcate before that call. And I'm having a hard time reproducing @jeremiahjstacey 's unit test as well. Could you provide us with a unit test that reproduces this issue?

@xeno6696
Copy link
Collaborator

So the solution is that in headers where a URI is expected, we need to ensure we use the proper canonicalize method. I just spent a couple hours testing down that path, and the DefaultEncoder.getCanonicalizedURI() method I created is working as expected.

@federico-piazza I think all I have to do is create some convenience methods, my hunch is that you're calling .addHeader and I just need to add an .addReferer method so the URI validation completes as expected.

@federico-piazza
Copy link
Author

@xeno6696, excellent, so you figured it out the issue, looks like it was simple than expected.

I'm not able right now to provide the info you asked me in the previous message since we are releasing to prod and we expect many annoying things.

Do you know when this fix is going to be available in a next version?

@xeno6696
Copy link
Collaborator

I'm writing an "addReferer" as we speak, and @kwwall and I want to try and push a release out before Sept 10.

@federico-piazza
Copy link
Author

@xeno6696 fantastic! Thanks a lot. Do I have to modify my code or is this an internal fix transparent to me?

@xeno6696
Copy link
Collaborator

xeno6696 commented Sep 1, 2019

@federico-piazza you'll have to use the new 'SecurityResponseWrapper.addReferer()' method that I'm adding.

@xeno6696 xeno6696 closed this as completed Sep 1, 2019
@lack3r
Copy link

lack3r commented Sep 17, 2019

Hi @kwwall, @federico-piazza and @xeno6696 .
I have the exact same issue and lots of requests are identified as attacks.
@xeno6696, many thanks for fixing this!
Do you know when this fix will be available in the maven repository so that we can use it?

@xeno6696
Copy link
Collaborator

The fix is present now. Just depends on whether your company allows you the flexibility to do the right thing!

kwwall pushed a commit that referenced this issue Oct 6, 2019
#514)

* Fixed issues #503 by writing a new addReferer method, also temporarily silenced issues related to mocking in #496.

* Additional fix to #503.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants