[Erp5-dev] performance of reindex object

Bartek Gorny bartek at gorny.edu.pl
Tue Feb 16 15:53:05 CET 2010


Ok, problem solved. The cause was .getPrice: this method is called
every time a movement or order is reindexed. Some of my documents do
not define a price, so the default implementation was used, and for
some reason it spent a few seconds on mysql-heavy operations before
finally returning None.

I solved it (or worked around it?) by placing a few
[PortalType]_getPriceCalculationOperandDict scripts which either
return None or use simple custom logic to retrieve the default price.
After that, the reindexing speed went up to about 5 docs per second,
which is fine for me.
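
To illustrate the idea, here is a stand-alone sketch in plain Python. It is
a simulation, not ERP5 code: the portal type names, the "default_price"
field, the "base_price" key and the dispatch table are all assumptions made
up for the example. It only shows why a cheap per-type script keeps the
expensive default lookup from running.

```python
from typing import Any, Callable, Dict, Optional

Movement = Dict[str, Any]
OperandDict = Optional[Dict[str, float]]

# Counts how often the expensive fallback runs (for demonstration only).
CALLS = {"default": 0}


def default_operand_dict(movement: Movement) -> OperandDict:
    """Stand-in for the generic, query-heavy default lookup."""
    CALLS["default"] += 1
    return None


def no_price_operand_dict(movement: Movement) -> OperandDict:
    """Override for types that never carry a price: answer immediately."""
    return None


def fixed_price_operand_dict(movement: Movement) -> OperandDict:
    """Override with simple custom logic: read a price stored on the document."""
    price = movement.get("default_price")
    if price is None:
        return None
    return {"base_price": float(price)}


# Hypothetical mapping from portal type to its override script.
TYPE_BASED_SCRIPTS: Dict[str, Callable[[Movement], OperandDict]] = {
    "Internal Note Line": no_price_operand_dict,
    "Service Order Line": fixed_price_operand_dict,
}


def get_price(movement: Movement) -> Optional[float]:
    """Use the per-type script if one exists, else fall back to the default."""
    script = TYPE_BASED_SCRIPTS.get(movement["portal_type"], default_operand_dict)
    operand_dict = script(movement)
    if operand_dict is None:
        return None
    return operand_dict.get("base_price")
```

With this layout, only portal types without an override ever reach
default_operand_dict, so documents that have no price return None at once.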

Bartek

On 15 February 2010 12:17, Bartek Gorny <bartek at gorny.edu.pl> wrote:
> Yes, the performance results you mention are more or less what I would
> expect, so there must be some reason it is so slow. The portal type is
> based on Order, there are no extensions to catalog, no additional
> indices, and I'm using MySQL.
>
> I'm using a non-standard role for security (Reviewer) - can this be
> the reason? This is the only unusual thing about those documents I can
> think of...
>
> Bartek
>
> On 12 February 2010 10:40, Jean-Paul Smets <jp at nexedi.com> wrote:
>> Hi,
>>
>> Some hints:
>>    - we have sites with > 10,000 K lines in various tables and this
>> does not happen
>>    - reindexing speed is covered by a unit test; it fluctuates, but
>> stays under control
>>
>> Questions:
>>    - what are those documents for which "sometimes reindexing four or
>> five docs takes more than a minute"?
>>    - are there any extensions to catalog? (e.g. many columns? scripts
>> in catalog which parse objects recursively?)
>>    - are you using MySQL?
>>
>> Reindexing speed should be between 10 and 30 simple documents / second /
>> core. If your document is complex, made for example of 100 subdocuments,
>> it will take 3 to 10 seconds for reindexing the root document, which is
>> normal, since you are actually reindexing 100 documents. If your root
>> document is made of 1000 subdocuments, changing the way to recursively
>> reindex subdocuments could be considered. If your root document is made
>> of 10,000 subdocuments, changing the way to recursively reindex
>> subdocuments is required.
>>
>> Another possible cause of slow reindexing is overuse of indices in MySQL
>> (or any other DB). The more indices you add, the slower INSERT becomes. In large
>> sites, we usually remove some indices and add others, but this really
>> depends on the application and the nature of data, so there are no
>> universal rules here besides "optimize your indices in MySQL based on
>> your data".
>>
>> Another possibility is locking problems: one indexing process is
>> waiting for another to finish. You must study what happens in MySQL to
>> track that (there are many tools for that purpose).
>>
>> Anyway, optimizing "pure" reindexing speed is not so easy because this
>> is very often an issue of optimizing python method calls and the way
>> data is accessed. We are for example currently improving the speed of
>> catalog by caching some values related to the filters. This will provide
>> a few % improvement.
>>
>> Regards,
>>
>> JPS.
>>
>>
>>
>> Bartek Gorny wrote:
>>> Hello,
>>>
>>> I'm running a production instance of ERP5, and I have a performance
>>> problem - reindexing some documents consumes a lot of CPU power.
>>> Sometimes reindexing four or five docs takes more than a minute, with
>>> mysql consuming up to 200% CPU and python processes eating up another
>>> 50% (this is a virtual machine running on three CPU cores, using ZEO,
>>> with three processing nodes). Something is definitely wrong - my
>>> question is, where should I begin to look for a problem. I read
>>> "performance crimes", and I don't seem to have committed any of those
>>> (at least not outright). Any advice on how to trace the problem and
>>> where it may arise would be most welcome.
>>>
>>> The database is not very big - the object counts per table are:
>>>
>>> catalog: 380K
>>> category: 950K
>>> delivery: 4K
>>> movement: 130K
>>> predicate: 160K
>>> predicate_category: 160K
>>> roles_and_users: 2K
>>> stock: 40K
>>>
>>> So, is there a problem, have I done something wrong, or is it just too much?
>>>
>>> Bartek
>>>
>>>
>>>
>>
>>
>> --
>> Jean-Paul Smets-Solanes, Nexedi CEO - Tel. +33(0)6 29 02 44 25
>> ERP5 Enterprise: Open Source ERP/CRM for Mission Critical Applications
>> http://www.erp5.com
>> TioLive SaaS: run your business online, with more freedom
>> http://www.tiolive.com
>> Nexedi: Consulting and Development of Free / Open Source Software
>> http://www.nexedi.com
>>
>
>
>
> --
> "Software is largely a service industry operating under the persistent
> but unfounded delusion that it is a manufacturing industry."
> Eric S.Raymond, "The Magic Cauldron"
>





