
RE: Double compression over HTTPS

From: Bert Huijben <bert_at_qqmail.nl>
Date: Fri, 5 Oct 2012 12:15:05 +0200

+1 on this.

 

Can you check whether zlib is actually in use (after negotiation) before
requesting compression from the server?

Merely checking whether the feature is available in OpenSSL doesn't tell us
whether it is compiled in on the other side.
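
A minimal sketch of that check, assuming Python's standard ssl module stands in for the TLS layer (serf itself is C, where OpenSSL's SSL_get_current_compression() plays a similar role). The helper names tls_compression_in_use and check_server are hypothetical and not part of serf or Subversion.

```python
import socket
import ssl


def tls_compression_in_use(tls_sock) -> bool:
    """Return True if the completed handshake negotiated TLS-level compression.

    SSLSocket.compression() returns the algorithm name as a string, or None
    when the connection is not compressed.
    """
    return tls_sock.compression() is not None


def check_server(host: str, port: int = 443) -> bool:
    """Connect to host, finish the handshake, and report what was negotiated."""
    ctx = ssl.create_default_context()
    # The default context disables TLS compression, so clear that option to
    # let the negotiation decide. Many current OpenSSL builds ship without
    # zlib, and TLS 1.3 has no compression, so this often still yields False.
    ctx.options &= ~ssl.OP_NO_COMPRESSION
    with socket.create_connection((host, port), timeout=10) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls_compression_in_use(tls)
```

The point of asking after the handshake is that the answer reflects what both sides agreed on, not just what the local library was built with.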

 

 

Mixing 1m9 (= 69 seconds) and 109 seconds in the same table hides the major
difference between the numbers.

 

We should clearly avoid accidentally ending up with no compression at all
unless the user requested it, since double compression works much better
than that scenario.

 

                Bert

 

 

From: lieven.govaerts_at_gmail.com [mailto:lieven.govaerts_at_gmail.com] On Behalf
Of Lieven Govaerts
Sent: Friday, 5 October 2012 11:51
To: Subversion Development; serf-dev_at_googlegroups.com
Subject: Double compression over HTTPS

 

Hi,

When OpenSSL is built with zlib, it will automatically compress all data
sent over an SSL connection. You can see this in the initial handshake
"Client Hello" and "Server Hello" where client and server agree on the
compression mechanism to be used.

If the data being sent or received is already compressed, OpenSSL will
compress it a second time. This can happen when already compressed binaries
like .gif or .zip are sent, or when the server uses gzip encoding for an HTTP
response. This can have an impact on performance and memory usage. See Paul
Querna's blog post about this topic in [1]. This has also been mentioned
before on svn-dev by Justin in [2].
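
The effect of a second compression pass can be illustrated with a small sketch using Python's zlib module. The payload below is made-up text, not real Subversion traffic, and the two passes only approximate gzip encoding followed by TLS compression.

```python
import random
import zlib

# Build a made-up, compressible text payload (illustrative data only).
random.seed(0)
words = ["svn", "checkout", "trunk", "revision", "propfind", "report",
         "delta", "serf"]
payload = " ".join(random.choice(words) for _ in range(20000)).encode()

once = zlib.compress(payload)   # first pass, like gzip encoding by the server
twice = zlib.compress(once)     # second pass, like TLS compression on top

# The first pass shrinks the text considerably; the second pass has almost
# nothing left to remove, yet still costs CPU time and memory.
print(len(payload), len(once), len(twice))
```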

Since OpenSSL 1.0 this automatic compression can be disabled at runtime.
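
As an illustration of the runtime switch (this is not the attached serf patch): in C the option is SSL_OP_NO_COMPRESSION, set through SSL_CTX_set_options(). The sketch below shows the same option through Python's ssl module, which exposes it as ssl.OP_NO_COMPRESSION.

```python
import ssl

# Create a client-side context and ask OpenSSL not to negotiate compression.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.options |= ssl.OP_NO_COMPRESSION  # maps to OpenSSL's SSL_OP_NO_COMPRESSION

# Confirm that the option is now set on the context.
compression_disabled = bool(ctx.options & ssl.OP_NO_COMPRESSION)
print(compression_disabled)
```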

Compression by OpenSSL has some advantages and disadvantages:
+ OpenSSL will compress the full data stream, so for https that includes all
headers plus all the small requests and responses that mod_deflate skips.
+ OpenSSL compression is stateful: it does not reset its dictionary between
responses like gzip/deflate encoding does, so it reaches a better
compression ratio when the content of multiple consecutive requests or
responses is similar within a 32KB window. Side note: I have done tests using
a preset dictionary for zlib for http(s) responses and found the difference
can be up to 50% extra compression.
- Where content is already compressed by the application layer (e.g. gzip
encoding or transferring binary files), OpenSSL will compress it again.
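
The difference between per-response and stateful compression, and the effect of a preset dictionary, can be sketched with Python's zlib module. The responses below are invented, WebDAV-looking strings; the sketch does not reproduce the tests mentioned above or the 50% figure.

```python
import zlib

# Hypothetical, similar-looking responses (illustrative data only).
responses = [
    (f"<D:response><D:href>/repos/asf/subversion/trunk/file{i}.c</D:href>"
     f"<D:propstat><D:status>HTTP/1.1 200 OK</D:status></D:propstat>"
     f"</D:response>").encode()
    for i in range(50)
]

# Per-response compression: every response starts with an empty dictionary,
# as with gzip/deflate content encoding.
independent = sum(len(zlib.compress(r)) for r in responses)

# Stateful compression: one compressor for the whole stream, sync-flushed
# after each response so the receiver can decode it right away.
stream = zlib.compressobj()
stateful = sum(
    len(stream.compress(r) + stream.flush(zlib.Z_SYNC_FLUSH))
    for r in responses
)


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress one response independently, primed with a preset dictionary."""
    comp = zlib.compressobj(zdict=zdict)
    return comp.compress(data) + comp.flush()


# Preset dictionary: use the first response as the dictionary for the rest.
independent_rest = sum(len(zlib.compress(r)) for r in responses[1:])
preset_rest = sum(len(compress_with_dict(r, responses[0]))
                  for r in responses[1:])

print(independent, stateful, independent_rest, preset_rest)
```

The receiving side of the preset-dictionary variant would need the same dictionary, passed as zlib.decompressobj(zdict=...).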

I have been doing some small-scale testing to see what difference this all
makes. My test case was using svn to check out a copy of the Subversion trunk
from the ASF repository.

I have tested 4 different scenarios:
1. As-is setup, OpenSSL compression enabled + gzip encoding enabled. (double
compression)
2. OpenSSL compression disabled + gzip encoding enabled. (compression
handled by the application)
3. OpenSSL compression disabled + gzip encoding disabled. (no compression at
all)
4. OpenSSL compression enabled + gzip encoding disabled (compression handled
by OpenSSL)

I found this particular scenario too small to see a measurable difference in
memory or cpu usage, although this is interesting to test further.

The differences in total time are more interesting:
   | bytes read | bytes written | total time
1: | 17.50MB    | 233-284KB     | 59s
2: | 18.67MB    | 2.13-2.43MB   | 1m9s-1m18s
3: | 50.35MB    | 2.34MB        | 103s-108s
4: | 15.27MB    | 235-260KB     | 50s-56s
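
Because the table mixes minute and second notation, a small sketch that converts the reported totals to plain seconds makes the rows easier to compare. The helper to_seconds is hypothetical; the values are copied from the table.

```python
import re


def to_seconds(text: str) -> int:
    """Convert a duration such as '59s' or '1m9s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + int(match.group(2))


# (fastest, slowest) total time per scenario, as reported in the table.
reported = {
    1: ("59s", "59s"),
    2: ("1m9s", "1m18s"),
    3: ("103s", "108s"),
    4: ("50s", "56s"),
}

for scenario, (fastest, slowest) in reported.items():
    print(scenario, to_seconds(fastest), to_seconds(slowest))
```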

You can see from the above reasoning and my test results that it would be
beneficial to disable gzip encoding when using https if OpenSSL was built
with zlib. However, in the scenario where large compressed binary files are
stored in an svn repository, I suppose disabling both OpenSSL compression and
gzip encoding will provide the best results.

Given the above I propose the following:

- Add an option in serf to disable OpenSSL compression

- Add a function in serf to check if compression is enabled in OpenSSL.

- In Subversion, don't ask for gzip encoding when working over https with
compression.

- In Subversion, if the config option "http-compression" is set to "no",
disable both OpenSSL compression and gzip encoding.

 

This makes scenario 4 the default, and the user can select scenario 3
with the "http-compression" option.
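
The proposed behaviour can be summarised as a small decision function. This is a sketch only: CompressionPlan, plan_compression and the parameter names are hypothetical, not serf or Subversion APIs, and tls_compression_negotiated is assumed to reflect what the handshake actually agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionPlan:
    tls_compression: bool  # keep OpenSSL compression enabled?
    request_gzip: bool     # ask the server for gzip encoding?


def plan_compression(https: bool,
                     tls_compression_negotiated: bool,
                     http_compression: bool) -> CompressionPlan:
    """Pick a compression setup following the proposal above."""
    if not http_compression:
        # "http-compression = no": disable both layers (scenario 3).
        return CompressionPlan(tls_compression=False, request_gzip=False)
    if https and tls_compression_negotiated:
        # Let OpenSSL compress and skip gzip encoding (scenario 4).
        return CompressionPlan(tls_compression=True, request_gzip=False)
    # Plain http, or TLS without compression: fall back to gzip encoding so
    # we never end up uncompressed by accident (scenario 2 for https).
    return CompressionPlan(tls_compression=False, request_gzip=True)
```

For example, plan_compression(True, False, True) falls back to gzip encoding, which covers the case where the other side was not built with zlib.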

 

Patch to disable OpenSSL compression in serf is attached.

 

Suggestions? Objections?

 

Lieven

[1]: http://journal.paul.querna.org/articles/2011/04/05/openssl-memory-use/

[2]: http://svn.haxx.se/dev/archive-2011-05/0362.shtml

 
Received on 2012-10-05 12:15:44 CEST

This is an archived mail posted to the Subversion Dev mailing list.