Hi Johan,
I hope nothing in my email implied that "I" was trying to tell you what to do, or how to do it, in any specific manner.
And although my post was prompted by your thread, it was completely unrelated.
In my "day" job - I always strive for the simplest solution possible.
If the requirements of the task require that it be "shared" around the application , then it becomes a requirement of that task to ensure that an appropriate API is provided.
If however there is no current need - even if I can see it might be used at a later date, "I" won't bother with any API related work.
What I think might happen later on; may not come to pass - and to my mind my employer deserves as much of my time being consumed for the purposes of adding new features, fixing bugs.
Future-proofing" for the sake of future proofing is not of any advantage today to the product, or myself as the developer of the code.
It can be argued that if you're in a specific problem space, you'll save time by completing the API now for that code - as opposed to coming back to it later on and having to reacquaint yourself with code that you previously wrote.
Especially in the OSS space, it could well be a case of having to acquaint yourself with someone else's code.
For me, I find that the time saved still isn't that great, as long as:
* You have a clearly defined design standard that is ALWAYS followed.
* You write the simplest code possible to successfully complete the engineering task at hand.
* You provide appropriate inline documentation, as required, where the "simplest" solution possible is in fact complex.
I do realise that everyone has their own work practices / workflows, and that mileage varies depending on a number of factors.
And it would seem that, at least initially, I am being outvoted!
On 18/01/2011, at 8:35 PM, Johan Corveleyn wrote:
> Hi,
>
> As the guy responsible for the quote that started this thread ([1]):
>
>> Actually, what's a little bit troubling is that there are currently
>> only 3 possible "file_len's", of which only 2 are used in practice:
>> diff2 and diff3 (diff4 is not used in svn core, only in
>> tools/diff/diff4). So, if we make a slow and a fast version, we could
>> just as well make a XXX2 and an XXX3 version explicitly, and dispense
>> with the need to pass arrays and array lengths. Hmmm... generic (but
>> more complex) code vs. a couple of specialized versions.
>
> I'd like to point out that I didn't mean to criticize the API choices;
> I was just weighing the options. In fact, I came up with the initial
> API choice myself (passing an array of "files", instead of passing 2
> resp. 3 resp. 4 files), because I wanted it to be generically useful.
> In any case it's a new API in the libsvn_diff module, for making
> prefix/suffix optimizations possible.
>
> I don't want to hijack this high-level discussion and turn it into a
> technical one again, but giving some more details about the thinking
> behind it may provide some additional context ...
>
> In fact, I started out with the "simple" form: passing 2 file
> arguments for regular diff. But then I discovered there was also
> something like diff3 (with 3 files) and diff4 (with 4 files; not used
> in core SVN). So I simply thought to myself: theoretically this
> optimization should work with N files, so why not just make it so. And
> I changed the API to accept an array of files.
>
> This made the code more complex, but in reality it's always the same
> approach (so once you 'get' it, it's easy to follow):
>
> Things like:
>
>   while (*file0.curp == *file1.curp)
>     {
>       ...
>     }
>
> are replaced with (actually "AND-ing" N (or N-1) conditions):
>
>   for (i = 1, is_match = TRUE; i < file_len; i++)
>     is_match = is_match && *file[0].curp == *file[i].curp;
>   while (is_match)
>     {
>       ...
>       for (i = 1, is_match = TRUE; i < file_len; i++)
>         is_match = is_match && *file[0].curp == *file[i].curp;
>     }
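>
> To keep that pattern readable, the AND-ing could also be folded into a
> small helper. This is just a sketch of the idea, not what is actually
> in the patch; it assumes a file_info struct with a 'curp' member, as in
> the snippet above, and svn_boolean_t / TRUE / FALSE from svn_types.h:
>
>   /* Hypothetical helper: TRUE iff the current byte is identical in
>      all of the FILE_LEN files in FILE. */
>   static svn_boolean_t
>   all_files_match(struct file_info file[], apr_size_t file_len)
>   {
>     apr_size_t i;
>
>     for (i = 1; i < file_len; i++)
>       if (*file[0].curp != *file[i].curp)
>         return FALSE;
>
>     return TRUE;
>   }
>
> With that, the loop reads almost like the two-file case again:
>
>   while (all_files_match(file, file_len))
>     {
>       ...
>     }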
>
>
> So, to be very concrete, it's really the choice between:
>
> (1) A generic set of functions, that work for N files, which are
> significantly more complex than the N==2 case:
>
> datasources_open(file[])
> find_identical_prefix(file[])
> find_identical_suffix(file[])
>
> (2) A specialized set of functions for 2, 3 and 4 files (4 is maybe
> optional, because not used in svn core), which are a lot simpler, but
> actually completely duplicate the higher level logic:
>
> datasources_open2(file0, file1)
> find_identical_prefix2(file0, file1)
> find_identical_suffix2(file0, file1)
>
> datasources_open3(file0, file1, file2)
> find_identical_prefix3(file0, file1, file2)
> find_identical_suffix3(file0, file1, file2)
>
> datasources_open4(file0, file1, file2, file3)
> find_identical_prefix4(file0, file1, file2, file3)
> find_identical_suffix4(file0, file1, file2, file3)
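>
> Just to make the contrast concrete, here is a stripped-down sketch of
> the prefix scan in both shapes. The struct and the function bodies are
> hypothetical placeholders (the real datasource state has more members,
> and the real functions do more bookkeeping); it only shows the
> difference in shape:
>
>   #include <apr.h>          /* apr_size_t */
>   #include "svn_types.h"    /* svn_boolean_t, TRUE */
>
>   /* Hypothetical stand-in for the per-file datasource state. */
>   struct file_info
>   {
>     const char *curp;   /* current byte */
>     const char *endp;   /* one past the last byte */
>   };
>
>   /* Shape (2): specialized for exactly two files. */
>   static void
>   find_identical_prefix2(struct file_info *file0, struct file_info *file1)
>   {
>     while (file0->curp < file0->endp && file1->curp < file1->endp
>            && *file0->curp == *file1->curp)
>       {
>         file0->curp++;
>         file1->curp++;
>       }
>   }
>
>   /* Shape (1): generic for N files; the same idea, but with an inner
>      loop (and extra bookkeeping) wherever the two-file version has a
>      single comparison. */
>   static void
>   find_identical_prefix(struct file_info file[], apr_size_t file_len)
>   {
>     if (file_len < 2)
>       return;   /* nothing to compare */
>
>     while (TRUE)
>       {
>         svn_boolean_t is_match = TRUE;
>         apr_size_t i;
>
>         for (i = 0; i < file_len; i++)
>           is_match = is_match && file[i].curp < file[i].endp;
>         for (i = 1; is_match && i < file_len; i++)
>           is_match = is_match && *file[0].curp == *file[i].curp;
>
>         if (!is_match)
>           break;
>
>         for (i = 0; i < file_len; i++)
>           file[i].curp++;
>       }
>   }
>
> The generic form keeps the logic in one place; the price is an inner
> loop (and a branch) where the specialized form has a single comparison.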
>
>
> At the time, I felt that (1) was the right choice, WRT power/weight
> balance, keeping the logic in one single place. But with stefan2's
> proposed low-level optimizations of the algorithm, it's more or less
> open for debate :). Those low-level optimizations (which are
> definitely worth it) are much more difficult if you want to write them
> generically. And apart from that, splitting out the functions into _2,
> _3 and _4 variants is also a low-level optimization by itself.
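>
> For instance (purely as an illustration; I'm not claiming this is what
> stefan2 has in mind), a typical low-level trick like comparing roughly
> a machine word per iteration is easy to bolt onto the two-file shape,
> but gets awkward once there is an inner loop over file_len. Reusing
> the hypothetical struct and two-file function from the sketch above:
>
>   #include <string.h>       /* memcmp */
>
>   #define CHUNK sizeof(apr_size_t)   /* roughly one machine word */
>
>   /* Hypothetical chunked variant of the two-file prefix scan: compare
>      CHUNK bytes at a time while both files still have a full chunk
>      left and the chunks are identical, then let the plain byte-wise
>      loop find the exact end of the common prefix. */
>   static void
>   find_identical_prefix2_chunked(struct file_info *file0,
>                                  struct file_info *file1)
>   {
>     while ((apr_size_t)(file0->endp - file0->curp) >= CHUNK
>            && (apr_size_t)(file1->endp - file1->curp) >= CHUNK
>            && memcmp(file0->curp, file1->curp, CHUNK) == 0)
>       {
>         file0->curp += CHUNK;
>         file1->curp += CHUNK;
>       }
>
>     find_identical_prefix2(file0, file1);   /* byte-wise tail */
>   }
>
> Doing the same thing generically means N-1 chunk comparisons per step
> on top of the AND-ing dance, or giving up on the trick altogether.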
>
> Right now, I'm still trying to keep it generic (trying to integrate
> stefan2's suggestions in a generic way; I'm assuming for now that the
> low-level optimization coming from the splitting-out itself is not
> significant (I'll have to test that)), but I'm not so sure anymore.
> We'll see...
>
> Anyway, all in all, I certainly don't feel like I've wasted my time,
> because it was also a great learning experience for me (new to C). And
> being a perfectionist, I want it to be as good as possible WRT the
> power/weight ratio (the most bang (speed/features/...) for the buck
> (complexity/maintainability/readability)) :-).
>
> Cheers,
> --
> Johan
>
> [1] http://svn.haxx.se/dev/archive-2011-01/0241.shtml
>
> On Tue, Jan 18, 2011 at 7:37 AM, Arwin Arni <arwin_at_collab.net> wrote:
>> Hi All,
>>
>> It is a very interesting notion that Gavin throws at us. I think it is very
>> important for an open-source project to maintain its code in a way that it
>> is easy for a new guy (like me) to make quick and meaningful changes. Most
>> open-source projects with a large development community end up in a mess of
>> perfectly working, yet highly unreadable code.
>>
>> However, as a pair of fresh eyes looking at the code, I can safely say that,
>> despite being an open-source project, Subversion has managed to stay
>> "readable". This can only be attributed to the hours of work spent on
>> "over-engineering" the code (BTW, I personally don't think anything can be
>> over-engineered. In my book, it is merely a synonym for perfection).
>>
>> There are parts of the code (in svn) that have been written with the notion
>> of "getting things done". And these are the parts that I really find
>> difficult to assimilate. I think time spent *earlier* to sanitize the code
>> is time better spent than trying to read the mind of the original
>> implementer at a *later* point in time.
>>
>> Regards,
>> Arwin Arni
>>
>>
>>
>> On Tuesday 18 January 2011 07:26 AM, Gavin Beau Baumanis wrote:
>>>
>>> Hi Brane,
>>>
>>> I'm pretty sure the context of the quote is along the lines of:
>>>
>>> Poor design and implementation prove to be a burden in terms of
>>> maintenance costs in the long run.
>>> And instead of having bums on seats for (entirely) new development,
>>> manpower is wasted on maintenance tasks because of poor design /
>>> lack of a prototype, etc.
>>>
>>> I guess it is an implementation / coding practice question:
>>> Would a developer's time not be better spent on
>>> doing the "guts" of the job and, at a later stage, once the engineering is
>>> proven to be accurate / reflective of the requirements, then worrying about
>>> private / public API requirements?
>>>
>>> Especially in an OSS project where resources are lean and transient, it is
>>> "my" (perhaps naive) view that spending x hours on writing an API that might
>>> not ever be used by another consumer is in fact x hours wasted that could
>>> have been spent on a more worthwhile task.
>>> When the requirement of a service to be consumed comes to bear, that is
>>> the time to create an appropriate API.
>>>
>>> From my past experience, I have created many an API that has never
>>> been used, purely because the design standard said an API was required,
>>> even though the engineering requirements of satisfying the task at hand negated
>>> that requirement entirely.
>>>
>>> Again, I don't presume to know any better; in fact I started the
>>> thread out of a desire to hopefully learn from it.
>>> I'm not trying to be deliberately argumentative; I am just a proponent of
>>> a good debate that fleshes out better outcomes / knowledge, selfishly for
>>> myself, and hopefully for others too.
>>>
>>> Gavin.
>>>
>>>
>>>
>>> On 18/01/2011, at 9:13 AM, Branko Čibej wrote:
>>>
>>>> On 17.01.2011 23:07, Gavin Beau Baumanis wrote:
>>>>>
>>>>> Hi Brane,
>>>>> I certainly do take maintainability seriously.
>>>>> What's that well-quoted figure?
>>>>> Something like 80% of the cost of software development is spent in the
>>>>> development phase?
>>>>
>>>> I believe it's "should be spent" rather than "is spent" ... in reality,
>>>> I've yet to see a project that didn't incur horrendous maintenance costs
>>>> as a result of shortcuts taken during development.
>>>>
>>>> -- Brane
>>>>
>>
>>
Received on 2011-01-19 04:55:53 CET