[Erp5-dev] performance of reindex object
Jean-Paul Smets
jp at nexedi.com
Fri Feb 12 10:40:04 CET 2010
Hi,
Some hints:
- we have sites with more than 10,000 K rows (10 million) in various
tables and this does not happen
- reindexing speed is measured by a unit test; it fluctuates, but stays
under control
Questions:
- what are those documents for which "sometimes reindexing four of
five docs takes more than a minute"?
- are there any extensions to the catalog? (e.g. many columns, or
scripts in the catalog which parse objects recursively)
- are you using MySQL?
Reindexing speed should be between 10 and 30 simple documents / second /
core. If your document is complex, made for example of 100 subdocuments,
it will take 3 to 10 seconds to reindex the root document, which is
normal, since you are actually reindexing 100 documents. If your root
document is made of 1000 subdocuments, changing the way subdocuments are
recursively reindexed could be considered. If your root document is made
of 10,000 subdocuments, changing the way subdocuments are recursively
reindexed is required.
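
As a rough sketch of the arithmetic above (plain Python 3, not ERP5 code;
the function name and its parameters are made up for illustration), the
expected time for a root document is simply the number of documents
divided by the per-core rate:

```python
def estimate_reindex_seconds(n_documents, cores=1, slow_rate=10.0, fast_rate=30.0):
    """Return (best_case, worst_case) seconds to reindex n_documents.

    slow_rate and fast_rate are in documents / second / core and default
    to the 10 to 30 range quoted in this mail. Assumes perfect scaling
    across cores, which is optimistic.
    """
    best = n_documents / (fast_rate * cores)
    worst = n_documents / (slow_rate * cores)
    return best, worst


for n in (100, 1000, 10000):
    best, worst = estimate_reindex_seconds(n)
    print("%6d subdocuments: %.0f to %.0f seconds on one core" % (n, best, worst))
```

With these figures, 1000 subdocuments already means roughly half a minute
to more than a minute and a half per root document on one core, and
10,000 subdocuments means several minutes.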
Another possible cause of slow reindexing is overuse of indices in MySQL
(or any other DB). The more indices you add, the slower each INSERT
becomes. In large sites, we usually remove some indices and add others,
but this really depends on the application and the nature of the data,
so there are no universal rules here besides "optimize your indices in
MySQL based on your data".
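
To get a feeling for the cost of indices on INSERT, here is a minimal
sketch. It uses the sqlite3 module from the Python 3 standard library
instead of MySQL, only so that it runs without a server; the table, the
column names and the row values are invented for illustration, and the
absolute timings say nothing about MySQL or about the ERP5 catalog.

```python
import sqlite3
import time


def time_inserts(n_indices, n_rows=20000):
    """Insert n_rows into an in-memory table that has n_indices secondary indices.

    Returns (elapsed_seconds, row_count). At most four indices are created,
    one per non-key column.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc (uid INTEGER PRIMARY KEY, a TEXT, b TEXT, c TEXT, d TEXT)"
    )
    columns = ["a", "b", "c", "d"]
    for column in columns[:n_indices]:
        conn.execute("CREATE INDEX idx_%s ON doc (%s)" % (column, column))
    rows = [
        (i, "a%d" % i, "b%d" % (i % 97), "c%d" % (i % 13), "d%d" % (n_rows - i))
        for i in range(n_rows)
    ]
    start = time.perf_counter()
    with conn:  # one transaction for all inserts, committed on exit
        conn.executemany("INSERT INTO doc VALUES (?, ?, ?, ?, ?)", rows)
    elapsed = time.perf_counter() - start
    row_count = conn.execute("SELECT COUNT(*) FROM doc").fetchone()[0]
    conn.close()
    return elapsed, row_count


for n in (0, 2, 4):
    elapsed, count = time_inserts(n)
    print("%d indices: %d rows inserted in %.3f s" % (n, count, elapsed))
```

The same kind of measurement, done on a copy of the real tables with the
real data, is what should guide which indices to keep.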
Another possibility is locking problems: one indexing process is
waiting for another to finish. You must study what happens in MySQL to
track that (there are many tools for that purpose).
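
One simple starting point is the output of SHOW FULL PROCESSLIST in
MySQL. The sketch below (Python 3) only shows how such output could be
filtered once it has been fetched with whatever client library you use;
the function name is hypothetical, the sample rows are made up, and the
exact wording of the State column depends on the MySQL version, so
matching on the substring "lock" is an assumption.

```python
def find_lock_waiters(processlist_rows, min_seconds=5):
    """Return rows whose State mentions a lock and whose Time is >= min_seconds.

    Each row is a dict keyed by the column names of SHOW FULL PROCESSLIST
    (Id, Time, State, Info, ...). Longest waiters come first.
    """
    waiters = []
    for row in processlist_rows:
        state = (row.get("State") or "").lower()
        if "lock" in state and (row.get("Time") or 0) >= min_seconds:
            waiters.append(row)
    return sorted(waiters, key=lambda row: row.get("Time") or 0, reverse=True)


# Made-up rows, for illustration only.
sample = [
    {"Id": 11, "Time": 0, "State": "Sending data", "Info": "SELECT ..."},
    {"Id": 12, "Time": 42, "State": "Locked", "Info": "REPLACE INTO ..."},
    {"Id": 13, "Time": 7, "State": "Waiting for table level lock", "Info": "DELETE ..."},
]
for row in find_lock_waiters(sample):
    print(row["Id"], row["Time"], row["State"])
```

If the same statements keep showing up in that list while reindexing is
slow, locking is a likely suspect.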
Anyway, optimizing "pure" reindexing speed is not so easy because this
is very often an issue of optimizing Python method calls and the way
data is accessed. We are for example currently improving the speed of
the catalog by caching some values related to the filters. This will
provide an improvement of a few percent.
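
To find out which Python calls dominate, the standard profiler is a
reasonable first tool. The sketch below (Python 3) profiles two
hypothetical stand-in functions, filter_matches and reindex, which are
not ERP5 APIs; only profile_call, built on cProfile and pstats from the
standard library, is the point of the example.

```python
import cProfile
import io
import pstats


def filter_matches(document):
    """Hypothetical stand-in for a per-document filter expression."""
    return sum(ord(ch) for ch in document["title"]) % 2 == 0


def reindex(documents):
    """Hypothetical stand-in for the Python side of reindexing."""
    return [doc["uid"] for doc in documents if filter_matches(doc)]


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


documents = [{"uid": i, "title": "Document %d" % i} for i in range(5000)]
result, report = profile_call(reindex, documents)
print(report)
```

Functions that are called once per document and show a high cumulative
time are the natural candidates for caching.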
Regards,
JPS.
Bartek Gorny wrote:
> Hello,
>
> I'm running a production instance of ERP5, and I have a performance
> problem - reindexing some documents consumes a lot of CPU power.
> Sometimes reindexing four of five docs takes more than a minute, with
> mysql consuming up to 200% CPU and python processes eating up another
> 50% (this is a virtual machine running on three CPU cores, using ZEO,
> with three processing nodes). Something is definitely wrong - my
> question is, where should I begin to look for a problem. I read
> "performance crimes", and I don't seem to have committed any of those
> (at least not outright). Any advice, how to trace and where the
> problem may arise, would be most welcome.
>
> The dbase is not very big - count of objects in tables are:
>
> catalog: 380K
> category: 950K
> delivery: 4K
> movement: 130K
> predicate: 160K
> predicate_category: 160K
> roles_and_users: 2K
> stock: 40K
>
> So, is there a problem, have I done something wrong, or is it just too much?
>
> Bartek
>
>
>
--
Jean-Paul Smets-Solanes, Nexedi CEO - Tel. +33(0)6 29 02 44 25
ERP5 Enterprise: Open Source ERP/CRM for Mission Critical Applications
http://www.erp5.com
TioLive SaaS: run your business online, with more freedom
http://www.tiolive.com
Nexedi: Consulting and Development of Free / Open Source Software
http://www.nexedi.com