
Re: [E-devel] xmms causes e17 to freeze

On Fri, 17 Nov 2006 18:18:09 -0500 "Ryan Little" <rlittle@gmail.com> babbled:

> Starting xmms on current CVS build causes e17 to completely freeze.
> Onefang thinks it may be due to the new evas threading code (this is a multi
> proc / dual core box).

can someone put a second set of eyes on the threading code - i am staring at it
and it seems entirely solid.

evas_pipe.c is the only core thread code. evas_font_draw.c has 1 mutex lock
(LKL() and LKU() macros on the font) and evas_font_load.c (the only place such
objects are created) inits and destroys these. interestingly the font mutexes
are not the issue. the issue, which i myself just saw this morning too, is that
the main process thread and the 2 slave threads are ALL blocking in
pthread_barrier_wait(), all waiting on each other. this doesn't make any sense
if you look at the pthread code. i have 2 barriers - barrier 0 and 1. barrier 0
is used for starting up the slave threads and 1 is used to sync up the end of
the slave threads and the main process thread.

here is how it works (it's hellishly simple!)

1. in the main process thread, before any slave threads do anything, build the
pipeline. no locking. no concurrency - should be no issues. this pipeline is
read-only so it needs no locks later.
2. once the pipeline is complete and evas needs the results of the rendering it
calls evas_common_pipe_begin() which, if no slave threads exist, will create
them and leave them sitting in an initially blocked state waiting on barrier
0. if the threads have been created already they will be blocked on barrier 0
anyway, so the creation step is skipped.
3. for each slave thread (thread_num) the main thread quickly creates a small
thread-specific context (allocates it) and fills it with the region the thread
has to render, then sets the thread info pointer (which the thread was handed
on start so it knows about its own specific context).
4. the main thread now sets up the "sync thread end" barrier (barrier 1) and
then waits on barrier 0 - this wait will trigger the slaves to begin working
(barriers are really convenient sync mechanisms that basically block until N
threads are waiting on the barrier - then they unblock). this means every thread
waiting unblocks and so the slaves go on and run and the main process thread
continues too.
5. the way the code is currently, the main process then instantly calls
evas_common_pipe_flush(), which basically means "any pending rendering -
finish it now and wait". this means it first re-inits barrier 0, then waits on
barrier 1 - once all slave threads are done they also wait on barrier 1. when
all slaves and the parent are waiting (the slaves are done and the parent has
been waiting) the barrier unblocks - the main thread now continues running and
the slaves start blocking on barrier 0 again (which is set up and ready),
basically waiting for evas's parent main thread to set up a new pipeline and
unblock the workers to do their stuff.
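
a minimal sketch of steps 1-5 above, in case it helps a second reader follow
the sync points. this is NOT the code in evas_pipe.c: the names Slave_Info,
slave(), run_pipeline(), NUM_SLAVES and NUM_FRAMES are made up for
illustration, and the "rendering" is just a counter. one deliberate difference
from the description: the sketch inits both barriers once and reuses them
instead of re-initing them every frame, relying on the POSIX rule that a
barrier resets itself to its initialized state when it releases its waiters.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

#define NUM_SLAVES 2   /* hypothetical: one slave per extra core */
#define NUM_FRAMES 3   /* hypothetical: how many pipelines we run */

/* hypothetical per-thread context: the region a slave has to render */
typedef struct {
    int thread_num;
    int y, h;          /* band of rows assigned for the current frame */
    int rows_done;     /* stand-in for real rendering work */
} Slave_Info;

static pthread_barrier_t barrier[2]; /* 0 = start slaves, 1 = slaves done */
static Slave_Info info[NUM_SLAVES];
static int quit;

static void *slave(void *data)
{
    Slave_Info *si = data;

    for (;;) {
        /* block until the main thread also arrives at barrier 0 */
        pthread_barrier_wait(&barrier[0]);
        if (quit) break;
        si->rows_done += si->h;             /* "render" the assigned band */
        /* report completion; released once all slaves and main arrive */
        pthread_barrier_wait(&barrier[1]);
    }
    return NULL;
}

/* runs NUM_FRAMES begin/flush cycles, returns total rows "rendered" */
int run_pipeline(void)
{
    pthread_t th[NUM_SLAVES];
    int i, f, total = 0;

    quit = 0;
    /* each barrier releases when all slaves plus the main thread wait */
    pthread_barrier_init(&barrier[0], NULL, NUM_SLAVES + 1);
    pthread_barrier_init(&barrier[1], NULL, NUM_SLAVES + 1);
    for (i = 0; i < NUM_SLAVES; i++) {
        info[i].thread_num = i;
        info[i].rows_done = 0;
        pthread_create(&th[i], NULL, slave, &info[i]);
    }

    for (f = 0; f < NUM_FRAMES; f++) {
        /* step 3: fill in each slave's region while slaves are blocked */
        for (i = 0; i < NUM_SLAVES; i++) {
            info[i].y = i * 16;
            info[i].h = 16;
        }
        pthread_barrier_wait(&barrier[0]);  /* step 4: "begin" */
        pthread_barrier_wait(&barrier[1]);  /* step 5: "flush" */
    }

    /* shut down: set the flag, then release the slaves one last time */
    quit = 1;
    pthread_barrier_wait(&barrier[0]);
    for (i = 0; i < NUM_SLAVES; i++) {
        pthread_join(th[i], NULL);
        total += info[i].rows_done;
    }
    pthread_barrier_destroy(&barrier[0]);
    pthread_barrier_destroy(&barrier[1]);
    return total;
}
```

calling run_pipeline() from a main() and linking with -pthread is enough to
try it. the thing to watch is that each barrier's count is NUM_SLAVES + 1, so
neither side gets released until the main thread and every slave have arrived.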

now here is the problem - this is INCREDIBLY simple. there is no complex
locking (the only locking is for fonts and that is also incredibly simple). the
problem is there is a deadlock on the barrier sync points and i am looking at
them and cannot for the life of me see the problem. either it is incredibly
subtle and hiding - or it's so obvious and glaring i somehow have skipped it
and am just assuming too much.

anyone else have ideas? (evas/src/lib/engines/common is where these files are.
macros in evas/src/lib/include/evas_common.h)

> BT follows:
> [Switching to Thread -1216890192 (LWP 17018)]
> 0xbfffe410 in __kernel_vsyscall ()
> #0  0xbfffe410 in __kernel_vsyscall ()
> #1  0xb7c42ece in __lll_mutex_lock_wait () from /lib/i686/libpthread.so.0
> #2  0xb7c411d0 in pthread_barrier_wait () from /lib/i686/libpthread.so.0
> #3  0xb7ce6807 in evas_common_pipe_begin (im=0x8d19080) at evas_pipe.c:183
> #4  0xb7418030 in eng_output_redraws_next_update_push (data=0x89b6e48,
>     surface=0x8d19080, x=0, y=0, w=168, h=32) at evas_engine.c:359
> #5  0xb7c9fd16 in evas_render_updates_internal (e=0x8aa7ab8,
>     make_updates=1 '\001') at evas_render.c:358
> #6  0xb7c9ffe7 in evas_render_updates (e=0x8aa7ab8) at evas_render.c:439
> #7  0xb7f8bda1 in _ecore_evas_x_render () from /usr/lib/libecore_evas.so.1
> #8  0xb7f8e28a in _ecore_evas_x_idle_enter () from
> /usr/lib/libecore_evas.so.1
> #9  0xb7f149c1 in _ecore_idle_enterer_call () from /usr/lib/libecore.so.1
> #10 0xb7f17aaf in _ecore_main_loop_iterate_internal ()
>    from /usr/lib/libecore.so.1
> #11 0xb7f16f9b in ecore_main_loop_begin () from /usr/lib/libecore.so.1
> #12 0x080674b3 in main (argc=1, argv=0xbfe7f4a4) at e_main.c:823

------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster@rasterman.com
Tokyo, Japan (東京 日本)