
Re: [E-devel] Memory pool management



On Wed, 22 Mar 2006 16:53:07 +0100 Cedric <cedric.bail@free.fr> babbled:

> On Wednesday 22 March 2006 12:40, Carsten Haitzler wrote:
> > [  8:35PM ~/t ] ./t ~/.e/e/applications/all/*
> > 2.13070
> 
> > now with plain malloc as-is in cvs for eet:
> 
> > [  8:35PM ~/t ] ./t ~/.e/e/applications/all/*
> > 2.04144
> 
> > it's actually a tiny bit faster (and consistently so)... so - i'm
> > wondering how you are really measuring speedups? the code i have will load
> 
> This is quite strange: I ran exactly the same code and I am seeing exactly
> the opposite (with and without valgrind), unless I misunderstood something.
> 
> [ememoa]
> cedric@futurama:~/ememoa$ ./bench ~e/.e/e/applications/all/*
> 0.22376
> [plain malloc]
> cedric@futurama:~/ememoa$ ./bench ~e/.e/e/applications/all/*
> 0.24617

i actually think the speed differences are going to be much of a muchness - depending on cpu arch (this is an amd64 cpu with everything in 64bit), how your libc is compiled, etc. etc. - i am not sure we can show consistent wins. if it's faster on one box and slower on another, it's likely better to do nothing until you win on at least one arch/cpu and don't lose on the others.

> ememoa with callgrind: 11.42089
> ememoa with cachegrind: 13.34606
> 
> plain malloc with callgrind: 12.38373
> plain malloc with cachegrind: 14.32375
> 
> > all the .eap's in my eap repo - which is a lot of eet work when bringing
> > up a menu. sure there are a lot of other things here other than just pure
> > eet code - but it's closely representative of the core of the work.
> 
> In any case the win is pretty small because only 'eet_open' and
> 'eet_close' benefit from it. I chose eet first, and especially 'eet_open'
> and 'eet_close', because valgrind reported them as the second biggest
> users of 'malloc'/'calloc'. They were also pretty easy to update without a
> deep understanding of the code around them ('efn->data' is still allocated
> by 'malloc', as I don't yet see when it is freed, nor who takes care of
> it).

it is not freed by eet. it's freed by the app that calls eet - so by definition it HAS to be allocated with malloc(), as the app is expected to call free() on it.
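
e.g. the caller side looks something like this (a rough, untested sketch - the helper name is made up, the eet calls are just the normal ones):

#include <stdlib.h>
#include <Eet.h>

/* made-up helper just to show who owns the buffer */
static void
load_one_key(const char *path, const char *key)
{
   Eet_File *ef;
   void     *data;
   int       size = 0;

   ef = eet_open(path, EET_FILE_MODE_READ);
   if (!ef) return;
   data = eet_read(ef, key, &size); /* eet malloc()s this buffer */
   eet_close(ef);
   if (data)
     {
        /* ... use data ... */
        free(data); /* the app frees it - eet never does */
     }
}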

> 	After updating 'eet_open'/'eet_close', we still have (for reference,
> 'eet_open' called malloc around 10000 times in the same scenario):
>        - 'eet_data_get_string': still by far the biggest user (more than
> 70000 calls)
>        - 'evas_stringshare_add' (around 7000 calls to malloc)
>        - '_eet_mem_alloc' (around 7000 calls also)
>        - 'eet_read' (around 3000 calls)
> 
> Well, of course, evas_mempool_malloc is also called a lot (around 25000
> calls), but it seems to do a pretty good job. The only win I could see
> from using ememoa for it would be for mass destruction of list items in
> 'evas_list_free' (in the case of a pool per list). But managing a memory
> pool per list would require a lot of changes inside evas_list (only
> internal ones, I think).
> 
> All these functions need much more work and internal changes if I want to
> show an interesting benchmark. Definitely, the most interesting one in my
> opinion is 'eet_data_get_string' and friends, and it will take some time
> before I have some results. So what would be, in your opinion, a good
> target or a good benchmark?

hmm - well you possibly need to open e17's default theme (default.edj) and then just choose, let's say, 10 keys from it to load - load them, then close. do that in a loop (choosing the keys at random). this should negate any caches etc. in between, other than the kernel's disk/buffer cache (which you want fully primed to remove disk IO from the equation).
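
something like this maybe (a rough, untested sketch - the path, key count and loop count are made-up numbers; time it the same way as the runs above, or just run it under time):

#include <stdlib.h>
#include <Eet.h>

int
main(void)
{
   int run;

   srand(42); /* fixed seed so runs stay comparable */
   for (run = 0; run < 1000; run++)
     {
        Eet_File  *ef;
        char     **keys;
        int        num = 0, i;

        ef = eet_open("default.edj", EET_FILE_MODE_READ);
        if (!ef) return 1;
        keys = eet_list(ef, "*", &num); /* every key in the file */
        if ((keys) && (num > 0))
          {
             for (i = 0; i < 10; i++)
               {
                  int   size = 0;
                  void *data = eet_read(ef, keys[rand() % num], &size);

                  free(data);
               }
             free(keys); /* the array is malloc()ed, the strings are not */
          }
        eet_close(ef);
     }
   return 0;
}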

in some ways i think some eet "rework" would speed things up more.

1. actually mmap() the file instead of fopen()ing it, and don't malloc data for things like the strings in the key table if we can return a DIRECT POINTER to them from the mmap()ed segment.
2. add an eet_read_direct() that reads the data and returns a pointer into the mmap()ed data segment (only if the entry is not compressed - if it's compressed you cannot use this call). this will avoid an alloc and a memcpy for several useful cases like image loads. (rough sketch of the idea below.)
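
roughly the idea (a made-up sketch only - these structs do NOT match eet's real internals, it's just the shape of such a call):

#include <string.h>

/* invented for the sketch - NOT eet's real layout */
typedef struct
{
   const char   *name;
   unsigned int  offset;     /* where the data sits in the file */
   unsigned int  size;       /* stored size */
   int           compressed; /* non-zero if zlib compressed */
} Fake_Entry;

typedef struct
{
   unsigned char *map;       /* whole file, mmap()ed read-only */
   Fake_Entry    *entries;
   int            count;
} Fake_File;

/* return a pointer straight into the mapping, or NULL if the entry
 * is compressed (the caller then falls back to the copying read) */
static const void *
fake_read_direct(Fake_File *ef, const char *name, int *size_ret)
{
   int i;

   for (i = 0; i < ef->count; i++)
     {
        Fake_Entry *e = &ef->entries[i];

        if (strcmp(e->name, name)) continue;
        if (e->compressed) return NULL;
        if (size_ret) *size_ret = e->size;
        return ef->map + e->offset; /* no malloc(), no memcpy() */
     }
   return NULL;
}

the key table strings could work the same way - the name pointer would just point into the mapping instead of being a malloc()ed copy.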

i think you'll get much more mileage from these than from malloc pool work - as it's already dubious whether such pools help (on one system they speed things up, on another they slow things down). the above will make eet much more tied to unix and less portable - BUT that's fine by me. we aren't about supporting win32 :) anyway - the mmaps will avoid a whole lot of mallocs :)

> 	Cedric
> 
> 


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster@rasterman.com
裸好多
Tokyo, Japan (東京 日本)