Re: [Subclipse-dev] Revision graph and cache implementation

From: Alberto Gimeno <gimenete_at_gmail.com>
Date: Tue, 8 Jul 2008 19:28:23 +0200

I would also like to hear your opinions on the "possible
alternatives" I discussed in this post:
http://gimenetegsoc.wordpress.com/2008/07/01/the-cache-design/

That post also covers the general design of the cache.

On Tue, Jul 8, 2008 at 7:25 PM, Alberto Gimeno <gimenete_at_gmail.com> wrote:
> Responding to Steve: the cache implementation has three methods, so
> creating an ICache interface is no problem. The UI could switch to a
> different cache implementation just by changing the "new" statement.
>
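A minimal sketch of what such an interface could look like. The thread
only says that the cache has three methods; the method names below, the
InMemoryCache class and the RevisionGraphView class are all hypothetical
and exist only to show how the caller depends on the interface alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical three-method contract; the real method names are not given in the thread.
interface ICache {
    void put(String path, String graphData);
    String get(String path);   // returns null when nothing is cached for the path
    void clear();
}

// Hypothetical implementation that keeps everything in a map.
class InMemoryCache implements ICache {
    private final Map<String, String> entries = new HashMap<>();

    @Override public void put(String path, String graphData) { entries.put(path, graphData); }
    @Override public String get(String path) { return entries.get(path); }
    @Override public void clear() { entries.clear(); }
}

// Hypothetical UI-side class: it only ever sees the ICache type.
class RevisionGraphView {
    private final ICache cache;

    RevisionGraphView(ICache cache) { this.cache = cache; }

    String describe(String path) {
        String data = cache.get(path);
        return data == null ? "no cached graph for " + path : data;
    }
}
```

With this shape, switching implementations means changing only the
constructor call, for example `new RevisionGraphView(new InMemoryCache())`.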
> I know that using serialization has one problem: you have to read or
> write the whole data structure. You can't query just part of the data,
> so memory becomes a problem if the data structure is too big. However,
> when drawing the graph, all the information for that graph has to be in
> memory anyway. In that case it doesn't matter whether the data is in a
> database, because you have to query lots of records. That's why I
> thought of "one graph = one serialized data structure". The graph data
> structure would replace two of the current tables in the database
> structure. But there is a third table: the one that stores the
> branching information and unique IDs. This table does need to be
> queried several times, and each query involves just one record. This
> table can be very big because it contains one record per path. That
> is, if you have 400,000 files you'll have 400,000 records.
>
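To make "one graph = one serialized data structure" concrete, here is a
rough sketch using standard Java serialization. The GraphNode,
RevisionGraph and GraphFiles classes and their fields are assumptions
made for illustration; the actual graph structure is not shown in this
thread.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical node: one revision of one path.
class GraphNode implements Serializable {
    private static final long serialVersionUID = 1L;
    final long revision;
    final String path;

    GraphNode(long revision, String path) {
        this.revision = revision;
        this.path = path;
    }
}

// Hypothetical container holding everything needed to draw one graph.
class RevisionGraph implements Serializable {
    private static final long serialVersionUID = 1L;
    final long graphId;
    final List<GraphNode> nodes = new ArrayList<>();

    RevisionGraph(long graphId) { this.graphId = graphId; }
}

class GraphFiles {
    // Writes the whole graph in one go; a single node cannot be updated in place.
    static void write(Path file, RevisionGraph graph) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(graph);
        }
    }

    // Reads the whole graph back into memory, which is what drawing needs anyway.
    static RevisionGraph read(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (RevisionGraph) in.readObject();
        }
    }
}
```

The trade-off described above shows up directly: read and write always
handle the complete RevisionGraph, so this only suits data that is needed
all at once, not the per-path lookup table.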
> When I said that queries are fast, I meant the query that retrieves the
> information for the graph. However, other queries run when inserting /
> updating, and those may have a problem with indexes.
>
> Although inserting / updating is slow, *the big update* only happens
> the first time. Nevertheless, I will work on improving that.
>
>
> On Tue, Jul 8, 2008 at 6:20 PM, Maciek Sakrejda <msakrejda_at_truviso.com> wrote:
>> For what it's worth (as mostly an observer of subclipse-dev out of
>> general interest), I think batching DB inserts/updates *should* help
>> performance quite a bit. I worked on the Snap project
>> (http://snap-photo.sourceforge.net -- but now abandoned) and we used a
>> DB cache with SQLite for photo metadata. Once we realized how big a
>> bottleneck the inserts were, we tried a number of things including
>> batching, and batching gave us an order-of-magnitude performance
>> improvement. I would definitely suggest trying batching before moving on
>> to serialization or another technique.
>>
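For reference, batching with plain JDBC could look roughly like the
sketch below. The JDBC calls (setAutoCommit, addBatch, executeBatch,
commit) are standard; the "files" table columns, the BatchInsert class
and the batch size are assumptions, since the thread does not show the
schema. Running it needs a JDBC connection to whatever embedded database
is in use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsert {
    // Hypothetical table and column names.
    static final String SQL = "INSERT INTO files (path, graph_id) VALUES (?, ?)";

    // Inserts all paths in one transaction, sending them in groups of batchSize.
    // Returns the number of executeBatch() calls that were made.
    static int insertPaths(Connection conn, List<String> paths, long graphId, int batchSize)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);            // one commit instead of one per row
        int roundTrips = 0;
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            int pending = 0;
            for (String path : paths) {
                ps.setString(1, path);
                ps.setLong(2, graphId);
                ps.addBatch();                // queue the row instead of executing it
                if (++pending == batchSize) {
                    ps.executeBatch();
                    roundTrips++;
                    pending = 0;
                }
            }
            if (pending > 0) {                // flush the final partial batch
                ps.executeBatch();
                roundTrips++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return roundTrips;
    }
}
```

How much this helps depends on the database and driver, so it is worth
measuring with the real cache data before and after.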
>> --
>> Maciek Sakrejda
>> Truviso, Inc.
>> http://www.truviso.com
>>
>> -----Original Message-----
>> From: Alberto Gimeno <gimenete_at_gmail.com>
>> Reply-To: dev_at_subclipse.tigris.org
>> To: dev_at_subclipse.tigris.org
>> Subject: [Subclipse-dev] Revision graph and cache implementation
>> Date: Tue, 8 Jul 2008 17:24:04 +0200
>>
>> The current implementation of the cache does not meet my
>> expectations. I think that queries are fast, but inserting into and
>> updating the cache is very slow. Maybe the embedded database is a
>> bottleneck.
>>
>> I'm thinking about implementing the cache using Java serialization.
>>
>> In the current database structure I use a "files" table to store the
>> branch information, and I give a unique ID to each file. That ID is
>> shared between a file and its branches. That is, one ID = one graph:
>> every file path involved in the graph has the same ID in the database.
>> So I think it might be possible to have:
>>
>> * One file for storing the "files" information (branches and unique IDs).
>> * One file per file ID. All the information to show a graph would be
>> in one of these files.
>>
>> Those files would be written and read using serialization.
>>
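The proposed layout could be sketched as follows: one index that maps
each path to its shared ID, and one file per ID. The CacheLayout class,
its method names and the file names "files.ser" and "graph-<id>.ser" are
made up for this example; only the idea that a file and its branches
share one ID comes from the description above.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class CacheLayout {
    private final Path cacheDir;
    // In-memory form of the "files" information: repository path -> shared graph ID.
    private final Map<String, Long> idsByPath = new HashMap<>();
    private long nextId = 1;

    CacheLayout(Path cacheDir) { this.cacheDir = cacheDir; }

    // Registers a new file and gives it a fresh ID.
    long addFile(String path) {
        long id = nextId++;
        idsByPath.put(path, id);
        return id;
    }

    // Registers a branch of an existing path; the branch shares the source's ID.
    long addBranch(String sourcePath, String branchPath) {
        Long id = idsByPath.get(sourcePath);
        if (id == null) {
            throw new IllegalArgumentException("unknown path: " + sourcePath);
        }
        idsByPath.put(branchPath, id);
        return id;
    }

    // The single file that would hold the serialized index.
    Path indexFile() { return cacheDir.resolve("files.ser"); }

    // The per-graph file for a path, or null if the path is not in the index.
    Path graphFileFor(String path) {
        Long id = idsByPath.get(path);
        return id == null ? null : cacheDir.resolve("graph-" + id + ".ser");
    }
}
```

Since HashMap is Serializable, the index itself could be written to the
index file with an ObjectOutputStream, at the cost of loading every
path-to-ID entry into memory at once.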
>> I'm not sure. Before writing a new implementation I can do some things
>> to improve performance. I can check whether any query against the
>> database is especially slow and fix it (for example, if a query needs
>> an index). And I can try to implement the cache using batch updates.
>> But I would like to know your opinions.
>>
>> In any case, a new implementation using serialization won't take me
>> much time, probably a few days.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe_at_subclipse.tigris.org
>> For additional commands, e-mail: dev-help_at_subclipse.tigris.org
>
>
>
> --
> Alberto Gimeno Brieba
> President and founder of
> Ribe Software S.L.
> http://www.ribesoftware.com
> ribe_at_ribesoftware.com
>
> Personal contact
> eMail: gimenete_at_gmail.com
> GTalk: gimenete_at_gmail.com
> msn: gimenete_at_hotmail.com
> website: http://gimenete.net
> mobile phone: +34 625 24 64 81
>

Received on 2008-07-08 19:28:32 CEST
