
De-duplication and (semi-automated) patch-management of related/similar files

From: Jochen Wezel <jwezel_at_compumaster.de>
Date: Mon, 24 Oct 2011 04:21:34 -0700 (PDT)

The situation:
==========

We often face the following and similar situations in our
development process:

A file is first committed that will serve as a template for several
customer projects:

File                                          Commit build  Hash  Additional historical hashes
/trunk/customer1/website/masterdesign.css     1             abc

The same file is reused as a template for several customer projects;
de-duplication could save a lot of space, since the file content is
always the same:

File                                          Commit build  Hash  Additional historical hashes
/trunk/customer1/website/masterdesign.css     1             abc
/trunk/customer1/microsite/masterdesign.css   2             abc
/trunk/customer2/website/masterdesign.css     3             abc
/trunk/customer3/website/masterdesign.css     4             abc
/trunk/customer4/website/masterdesign.css     4             abc
/trunk/customer5/website/masterdesign.css     5             abc
/trunk/customer5/microsite/masterdesign.css   6             abc

Over time, some projects need individual changes:

File                                          Commit build  Hash  Additional historical hashes
/trunk/customer1/website/masterdesign.css     9             jkl   abc
/trunk/customer1/microsite/masterdesign.css   2             abc
/trunk/customer2/website/masterdesign.css     3             abc
/trunk/customer3/website/masterdesign.css     8             ghi   abc
/trunk/customer4/website/masterdesign.css     4             abc
/trunk/customer5/website/masterdesign.css     5             abc
/trunk/customer5/microsite/masterdesign.css   7             def   abc

Over time, individual changes are re-used for similar projects:

File                                          Commit build  Hash  Additional historical hashes
/trunk/customer1/website/masterdesign.css     9             jkl   abc
/trunk/customer1/microsite/masterdesign.css   2             abc
/trunk/customer2/website/masterdesign.css     3             abc
/trunk/customer3/website/masterdesign.css     8             ghi   abc
/trunk/customer4/website/masterdesign.css     4             abc
/trunk/customer5/website/masterdesign.css     5             abc
/trunk/customer5/microsite/masterdesign.css   7             def   abc
/trunk/customer5/microsite2/masterdesign.css  11            def   abc
/trunk/customer5/microsite3/masterdesign.css  12            def   abc

And again, individual changes may apply to those sub-variants of the
same file:

File                                          Commit build  Hash  Additional historical hashes
/trunk/customer1/website/masterdesign.css     14            pqr   abc, jkl
/trunk/customer1/microsite/masterdesign.css   2             abc
/trunk/customer2/website/masterdesign.css     3             abc
/trunk/customer3/website/masterdesign.css     8             ghi   abc
/trunk/customer4/website/masterdesign.css     4             abc
/trunk/customer5/website/masterdesign.css     5             abc
/trunk/customer5/microsite/masterdesign.css   7             def   abc
/trunk/customer5/microsite2/masterdesign.css  11            def   abc
/trunk/customer5/microsite3/masterdesign.css  13            mno   abc, def
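
The "Additional historical hashes" column could be maintained roughly as in the following sketch. It is only an illustration: `FileRecord`, `record_commit` and the `copied_from` parameter are hypothetical names, not part of Subversion, and it assumes that a copied file inherits the history of its source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileRecord:
    commit: int      # commit build of the latest change
    current: str     # hash of the current content
    history: List[str] = field(default_factory=list)  # earlier hashes, oldest first


def record_commit(table: Dict[str, FileRecord], path: str, commit: int,
                  content_hash: str, copied_from: Optional[str] = None) -> None:
    """Update the (hypothetical) relation table for one committed file."""
    if path in table:
        # Existing file: the previous hash moves into the history.
        source = table[path]
    elif copied_from is not None:
        # New file created as a copy: assumed to inherit the source's history.
        source = table[copied_from]
    else:
        source = None

    history: List[str] = []
    if source is not None:
        history = list(source.history)
        if source.current != content_hash:
            history.append(source.current)
    table[path] = FileRecord(commit, content_hash, history)


# Replay the customer5 rows from the tables above.
table: Dict[str, FileRecord] = {}
base = "/trunk/customer5/"
record_commit(table, base + "microsite/masterdesign.css", 6, "abc")
record_commit(table, base + "microsite/masterdesign.css", 7, "def")
record_commit(table, base + "microsite3/masterdesign.css", 12, "def",
              copied_from=base + "microsite/masterdesign.css")
record_commit(table, base + "microsite3/masterdesign.css", 13, "mno")
print(table[base + "microsite3/masterdesign.css"])
```

After the replay, microsite3 carries the hash "mno" with the historical hashes "abc" and "def", matching the last row of the table.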

Problem #1: a minor issue
=========================
By comparing hashes, the Subversion repository could save a lot of
space when the same file content is stored at several locations.
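
As a rough illustration of the idea (not of how the Subversion server actually stores data), a content-addressed store keeps each distinct content only once and lets every path point to it by hash. The class `DedupStore` and its methods are hypothetical, and SHA-1 is just an assumed choice of hash function.

```python
import hashlib
from typing import Dict


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}  # content hash -> content
        self.paths: Dict[str, str] = {}    # file path -> content hash

    def put(self, path: str, content: bytes) -> str:
        digest = hashlib.sha1(content).hexdigest()
        # Store the content only if this hash has not been seen before.
        self.blobs.setdefault(digest, content)
        self.paths[path] = digest
        return digest

    def get(self, path: str) -> bytes:
        return self.blobs[self.paths[path]]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


store = DedupStore()
css = b"body { margin: 0; }\n"
for project in ("customer1/website", "customer1/microsite", "customer2/website"):
    store.put("/trunk/" + project + "/masterdesign.css", css)

# Three paths, but the content is stored a single time.
print(len(store.paths), len(store.blobs), store.stored_bytes())
```

Because the path table already maps each file to its hash, the same lookup also answers the question "which other locations hold exactly this content?".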

Problem #2: a big issue
=======================
At the same time, if hash codes are available, another feature becomes
possible. Suppose there is a need to change the file
/trunk/customer5/microsite2/masterdesign.css.
The problem: depending on the individual request, I have to apply the
patch to one of the following:
1. only the current project,
2. all files at all locations in the repository that have the same
file content (clone relations),
3. all files at all locations in the repository that have the same
file content or are based on this file content (children relations),
4. all files at all locations in the repository that are related to
this file at any point in history (full historical relations),
5. a manual selection of related files.

A possible solution:
===============
Case 1: files that should get the patch (only this file):

File                                          Commit build  Hash  Additional historical hashes  Gets patch
/trunk/customer1/website/masterdesign.css     14            pqr   abc, jkl
/trunk/customer1/microsite/masterdesign.css   2             abc
/trunk/customer2/website/masterdesign.css     3             abc
/trunk/customer3/website/masterdesign.css     8             ghi   abc
/trunk/customer4/website/masterdesign.css     4             abc
/trunk/customer5/website/masterdesign.css     5             abc
/trunk/customer5/microsite/masterdesign.css   7             def   abc
/trunk/customer5/microsite2/masterdesign.css  11            def   abc                           yes
/trunk/customer5/microsite3/masterdesign.css  13            mno   abc, def

Case 2: files that should get the patch (only files with hash value
"def"):

File                                          Commit build  Hash  Additional historical hashes  Gets patch
/trunk/customer1/website/masterdesign.css     14            pqr   abc, jkl
/trunk/customer1/microsite/masterdesign.css   2             abc
/trunk/customer2/website/masterdesign.css     3             abc
/trunk/customer3/website/masterdesign.css     8             ghi   abc
/trunk/customer4/website/masterdesign.css     4             abc
/trunk/customer5/website/masterdesign.css     5             abc
/trunk/customer5/microsite/masterdesign.css   7             def   abc                           yes
/trunk/customer5/microsite2/masterdesign.css  11            def   abc                           yes
/trunk/customer5/microsite3/masterdesign.css  13            mno   abc, def

Case 3: files that should get the patch (only files with hash value
"def" or children ("mno")):

File                                          Commit build  Hash  Additional historical hashes  Gets patch
/trunk/customer1/website/masterdesign.css     14            pqr   abc, jkl
/trunk/customer1/microsite/masterdesign.css   2             abc
/trunk/customer2/website/masterdesign.css     3             abc
/trunk/customer3/website/masterdesign.css     8             ghi   abc
/trunk/customer4/website/masterdesign.css     4             abc
/trunk/customer5/website/masterdesign.css     5             abc
/trunk/customer5/microsite/masterdesign.css   7             def   abc                           yes
/trunk/customer5/microsite2/masterdesign.css  11            def   abc                           yes
/trunk/customer5/microsite3/masterdesign.css  13            mno   abc, def                      yes

Case 4: files that should get the patch (only files with hash value
"def" or children ("mno") or parents ("abc") or any children of any
parents ("jkl", "ghi", "pqr")):

File                                          Commit build  Hash  Additional historical hashes  Gets patch
/trunk/customer1/website/masterdesign.css     14            pqr   abc, jkl                      yes
/trunk/customer1/microsite/masterdesign.css   2             abc                                 yes
/trunk/customer2/website/masterdesign.css     3             abc                                 yes
/trunk/customer3/website/masterdesign.css     8             ghi   abc                           yes
/trunk/customer4/website/masterdesign.css     4             abc                                 yes
/trunk/customer5/website/masterdesign.css     5             abc                                 yes
/trunk/customer5/microsite/masterdesign.css   7             def   abc                           yes
/trunk/customer5/microsite2/masterdesign.css  11            def   abc                           yes
/trunk/customer5/microsite3/masterdesign.css  13            mno   abc, def                      yes
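
The four selection rules can be sketched as a small lookup over the hash table. This is a minimal illustration under stated assumptions: `Scope`, `FILES` and `related_files` are hypothetical names, the data is copied from the tables above, and case 4 is simplified to "shares at least one current or historical hash with the target file".

```python
from enum import Enum
from typing import Dict, List, Tuple


class Scope(Enum):
    ONLY_THIS = 1            # case 1
    CLONES = 2               # case 2: same current hash
    CLONES_AND_CHILDREN = 3  # case 3: same hash, or based on it
    FULL_HISTORY = 4         # case 4: related at any point in history


T = "/trunk/"
# path -> (current hash, historical hashes), taken from the tables above
FILES: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    T + "customer1/website/masterdesign.css": ("pqr", ("abc", "jkl")),
    T + "customer1/microsite/masterdesign.css": ("abc", ()),
    T + "customer2/website/masterdesign.css": ("abc", ()),
    T + "customer3/website/masterdesign.css": ("ghi", ("abc",)),
    T + "customer4/website/masterdesign.css": ("abc", ()),
    T + "customer5/website/masterdesign.css": ("abc", ()),
    T + "customer5/microsite/masterdesign.css": ("def", ("abc",)),
    T + "customer5/microsite2/masterdesign.css": ("def", ("abc",)),
    T + "customer5/microsite3/masterdesign.css": ("mno", ("abc", "def")),
}


def related_files(files: Dict[str, Tuple[str, Tuple[str, ...]]],
                  target: str, scope: Scope) -> List[str]:
    """Return the paths that would receive a patch made to `target`."""
    current, history = files[target]
    if scope is Scope.ONLY_THIS:
        return [target]
    lineage = {current, *history}
    selected = []
    for path, (file_hash, past) in files.items():
        if scope is Scope.CLONES:
            match = file_hash == current
        elif scope is Scope.CLONES_AND_CHILDREN:
            match = file_hash == current or current in past
        else:
            # Simplification: any shared hash counts as "related in history".
            match = bool(lineage & {file_hash, *past})
        if match:
            selected.append(path)
    return selected


target = T + "customer5/microsite2/masterdesign.css"
for scope in Scope:
    print(scope.name, len(related_files(FILES, target, scope)))
```

Case 5, the manual selection, would start from one of these lists and let the user add or remove entries before the patch is applied.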

Summary:
========

My suggestion:
1. The Subversion server provides de-duplication technology as well as
relations-reporting features.
2. The client provides a dialog before committing patches, which
identifies related files, shows them to the user and provides a
selection mask that allows the user to select/unselect additional
files for applying the current patch. The patches can first be applied
locally for review before they are finally committed by the user.

Since this feature request requires changes to both server and client
as well as to the communication protocol of SVN, I'd like to ask very
kindly whether this feature makes sense in your opinion and, if so, to
involve all required partners (client, server, SVN communication
protocol) to make this new milestone a reality.

Jochen

------------------------------------------------------
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=2862358
Received on 2011-10-24 13:31:53 CEST

This is an archived mail posted to the TortoiseSVN Users mailing list.