Re: [Subclipse-dev] Revision graph and cache implementation

From: Alberto Gimeno <gimenete_at_gmail.com>
Date: Tue, 8 Jul 2008 19:25:18 +0200

Responding to Steve: the cache implementation has only three methods, so
creating an ICache interface is no problem. The UI could then switch the
cache implementation just by changing the "new" statement.
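To make the idea concrete, here is a minimal sketch of what such an
interface could look like. The thread only says the cache has three
methods; the method names, signatures and the InMemoryCache class below
are assumptions made up for illustration, not the actual Subclipse code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: the three method names are illustrative
// assumptions, not the real cache API.
interface ICache {
    void store(long graphId, String graphData);
    String load(long graphId);   // returns null when nothing is cached
    void clear();
}

// Hypothetical implementation that keeps everything in a map.
class InMemoryCache implements ICache {
    private final Map<Long, String> entries = new HashMap<>();

    @Override
    public void store(long graphId, String graphData) {
        entries.put(graphId, graphData);
    }

    @Override
    public String load(long graphId) {
        return entries.get(graphId);
    }

    @Override
    public void clear() {
        entries.clear();
    }
}

class CacheFactory {
    // The only line that names a concrete class. Swapping the cache
    // implementation means changing this "new" statement.
    static ICache createCache() {
        return new InMemoryCache();
    }
}
```

Callers would only see the ICache type, so a database-backed or a
serialization-backed class could be substituted in createCache() without
touching the UI code.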

I know that using serialization has one problem: you have to read or
write a whole data structure; you can't query just part of the data. So
you run into memory problems if the data structure is too big. However,
when drawing the graph, all the information for that graph has to be in
memory anyway. In that case it doesn't matter whether the data is in a
database, because you have to query lots of records either way. That's
why I thought of "one graph = one serialized data structure". The graph
data structure will replace two of the current tables in the database
structure. But there is a third table: the one that stores the branching
information and unique IDs. This table does need to be queried several
times, and each query involves just one record. This table can be very
big because it contains one record per path. That is, if you have
400,000 files you'll have 400,000 records.
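As a rough sketch of "one graph = one serialized data structure", the
code below writes each graph to its own file named after the graph's ID
and reads it back whole. The GraphData and GraphFileStore classes, their
fields and the ".graph" file extension are hypothetical; only the use of
standard Java serialization comes from the proposal.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical container for everything needed to draw one graph.
class GraphData implements Serializable {
    private static final long serialVersionUID = 1L;

    final long fileId;                            // the ID shared by a file and its branches
    final List<String> nodes = new ArrayList<>(); // placeholder for the real node data

    GraphData(long fileId) {
        this.fileId = fileId;
    }
}

// Hypothetical store: one serialized file per graph ID.
class GraphFileStore {
    private final Path directory;

    GraphFileStore(Path directory) {
        this.directory = directory;
    }

    private Path fileFor(long fileId) {
        return directory.resolve(fileId + ".graph");
    }

    // Writes the whole graph in one go; partial updates are not possible.
    void write(GraphData graph) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(fileFor(graph.fileId))))) {
            out.writeObject(graph);
        }
    }

    // Reads the whole graph into memory, or returns null if it is not cached.
    GraphData read(long fileId) throws IOException, ClassNotFoundException {
        Path file = fileFor(fileId);
        if (!Files.exists(file)) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (GraphData) in.readObject();
        }
    }
}
```

This only covers the per-graph data. The per-path table that maps each
path to its ID still needs single-record lookups, which is why it is
treated separately above.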

When I said that queries are fast, I meant the query that retrieves the
information for the graph. However, there are other queries that run when
inserting / updating. Maybe those queries have a problem with indexes.

Well, although inserting / updating is slow, *the big update* is only
done the first time. Nevertheless, I will work on improving that.
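One of the improvements discussed in this thread is batching the
inserts. A minimal sketch with plain JDBC could look like the following.
The "files" table name comes from the thread, but the column names, the
BatchInsertSketch class and the batch size parameter are assumptions,
and nothing here has been measured against the real cache.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

class BatchInsertSketch {
    // Hypothetical columns for the "files" table.
    static final String INSERT_SQL =
            "INSERT INTO files (path, file_id) VALUES (?, ?)";

    // Inserts all rows in one transaction, sending them to the database
    // in groups of batchSize. Returns the number of batches executed.
    static int insertAll(Connection conn, Map<String, Long> pathToId, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // avoid one commit per row
        int batches = 0;
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (Map.Entry<String, Long> row : pathToId.entrySet()) {
                ps.setString(1, row.getKey());
                ps.setLong(2, row.getValue());
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    batches++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the last, partially filled batch
                ps.executeBatch();
                batches++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return batches;
    }
}
```

Whether this helps depends on the embedded database and its driver, so
it would need to be timed against the current row-by-row inserts.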

On Tue, Jul 8, 2008 at 6:20 PM, Maciek Sakrejda <msakrejda_at_truviso.com> wrote:
> For what it's worth (as mostly an observer of subclipse-dev out of
> general interest), I think batching DB inserts/updates *should* help
> performance quite a bit. I worked on the Snap project
> (http://snap-photo.sourceforge.net -- but now abandoned) and we used a
> DB cache with SQLite for photo metadata. Once we realized how big a
> bottleneck the inserts were, we tried a number of things including
> batching, and batching gave us an order-of-magnitude performance
> improvement. I would definitely suggest trying batching before moving on
> to serialization or another technique.
>
> --
> Maciek Sakrejda
> Truviso, Inc.
> http://www.truviso.com
>
> -----Original Message-----
> From: Alberto Gimeno <gimenete_at_gmail.com>
> Reply-To: dev_at_subclipse.tigris.org
> To: dev_at_subclipse.tigris.org
> Subject: [Subclipse-dev] Revision graph and cache implementation
> Date: Tue, 8 Jul 2008 17:24:04 +0200
>
> The current implementation of the cache does not meet my personal
> expectations. I think that queries are fast but inserting and updating
> the cache is very slow. Maybe the embedded database is a bottleneck.
>
> I'm thinking about implementing the cache using Java serialization.
>
> In the current database structure I use a "files" table to store the
> branch information and I give a unique ID to each file. That ID is
> shared between a file and its branches. That is, one ID = one graph.
> Every file path involved in the graph has the same ID in the database.
> So I think it might be possible to have:
>
> * One file for storing the "files" information (branches and unique IDs).
> * One file per file ID. All the information to show a graph would be
> in one of these files.
>
> Those files would be written and read using serialization.
>
> I'm not sure. Before writing a new implementation I can do some things
> to improve the performance. I can try to find out whether some query
> against the database is especially slow and fix it (for example, a query
> that needs an index). And I can try to implement the cache using batch
> updates. But I would like to know your opinions.
>
> Nevertheless, writing a new implementation using serialization won't
> take me much time. Probably a few days.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_subclipse.tigris.org
> For additional commands, e-mail: dev-help_at_subclipse.tigris.org

-- 
Alberto Gimeno Brieba
President and founder of
Ribe Software S.L.
http://www.ribesoftware.com
ribe_at_ribesoftware.com
Personal contact
eMail: gimenete_at_gmail.com
GTalk: gimenete_at_gmail.com
msn: gimenete_at_hotmail.com
website: http://gimenete.net
mobile phone: +34 625 24 64 81

Received on 2008-07-08 19:25:26 CEST
