
TaskManager: Thread-safety-error during stresstesting

Posted: 14 Sep 2009, 20:21
by peterraq
Hi,

I did some stress testing (approx. 1000 test clients, each performing a fair amount of activity) and got the following two errors in my logs:

problem 1:
java.lang.ArrayIndexOutOfBoundsException: 28
at java.util.LinkedList.toArray(LinkedList.java:866)
at java.util.LinkedList.addAll(LinkedList.java:269)
at java.util.LinkedList.addAll(LinkedList.java:247)
at it.gotoandplay.smartfoxserver.util.scheduling.Scheduler.executeTasks(Scheduler.java:338)
at it.gotoandplay.smartfoxserver.util.scheduling.Scheduler.run(Scheduler.java:223)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
problem 2:
java.lang.IllegalStateException: Current state = FLUSHED, new state = CODING_END
at java.nio.charset.CharsetEncoder.throwIllegalStateException(CharsetEncoder.java:951)
at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:537)
at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:766)
at org.xsocket.DataConverter.toByteBuffer(DataConverter.java:125)
at org.xsocket.stream.Connection.write(Connection.java:576)
at it.gotoandplay.utils.net.xmlsocket.XMLSocket.send(XMLSocket.java:93)
at it.gotoandplay.smartfoxserver.httpbox.data.SFSClient.sendMessageToSfs(SFSClient.java:158)
at it.gotoandplay.smartfoxserver.httpbox.HttpBox.doPost(HttpBox.java:189)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at it.gotoandplay.smartfoxserver.httpbox.filter.SessionIDFilter.doFilter(SessionIDFilter.java:195)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1148)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:387)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:535)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:880)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:747)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:451)
Any ideas where the problem could be located? Is the second problem also related to the TaskManager? (My guess is yes.)

I do a lot of (thread-safe) task adding and removing, but this looks like a thread-safety problem located within the SFS libraries.

Greetings,

peter


P.S.: I am using SFS 1.6.6 / 1.6.7 pre.

Posted: 15 Sep 2009, 15:36
by Lapo
Hi,
For error #1 I can't say at the moment. I am adding a note in our bug tracker and we'll take a look ASAP.

I would recommend using the Java 5 scheduling features, which greatly surpass the good old Java 1.4 SFS scheduler.
Take a look here:
http://www.j2ee.me/j2se/1.5.0/docs/api/ ... cutor.html
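
As a rough illustration of the Java 5 scheduling API mentioned above, here is a minimal sketch using `java.util.concurrent.ScheduledExecutorService`. The class name `SchedulingSketch`, the method `runRepeatingTask`, the pool size and the 10 ms period are all made up for this example; it is not SmartFoxServer code and says nothing about how the SFS scheduler works internally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not part of SmartFoxServer.
public class SchedulingSketch {

    // Schedules a repeating task, waits until it has run at least
    // minRuns times, cancels it, and returns the observed run count.
    static int runRepeatingTask(int minRuns) throws InterruptedException {
        // Assumed pool size; tune for the real workload.
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        final AtomicInteger runs = new AtomicInteger();
        final CountDownLatch done = new CountDownLatch(minRuns);

        // "Adding a task": the executor handles its own synchronization.
        ScheduledFuture<?> handle = scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                runs.incrementAndGet();
                done.countDown();
            }
        }, 0, 10, TimeUnit.MILLISECONDS);

        done.await(5, TimeUnit.SECONDS);

        // "Removing a task": cancel through the returned handle.
        handle.cancel(false);

        scheduler.shutdown();
        scheduler.awaitTermination(1, TimeUnit.SECONDS);
        return runs.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("runs: " + runRepeatingTask(3));
    }
}
```

Tasks are added with `scheduleAtFixedRate` (or `schedule` for one-shot tasks) and removed by calling `cancel` on the returned `ScheduledFuture`, so the calling code does not need to lock a shared task list itself.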

Error #2 is related to the BlueBox access, but it goes down into the guts of the xsocket library. It looks like a charset error, but it doesn't ring any bells.
As far as I can remember, I have never seen it.
Can you explain how to reproduce it?

Posted: 18 Sep 2009, 13:17
by peterraq
Hi Lapo,

Unfortunately I got a third, and this time even more serious, concurrency problem with the scheduler:

problem 3:
java.lang.NullPointerException
at it.gotoandplay.smartfoxserver.util.scheduling.Scheduler.executeTasks(Scheduler.java:299)
at it.gotoandplay.smartfoxserver.util.scheduling.Scheduler.run(Scheduler.java:223)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
After problem 1 or 2 occurs, the server keeps going.
After problem 3, the server effectively crashes because it is stuck in an endless loop (problem 3 appears in the log file over and over again, producing a 700 MB log file within 3 minutes). I think the null task that leads to the error is never removed, so the scheduler keeps looping over it.

This time I was not testing. Unfortunately, the live system went down.

Reproduction:
I think this is a classic concurrency problem. I do quite a lot of task adding and removing (approx. 20 adds and 20 removes per second).

The live system was online for about 50 hours. Problem 1 occurred a few times. Problem 2 never occurred (only during my stress testing). Then, after these 50 hours, problem 3 occurred and the live system crashed, as I mentioned.

Questions:
1. Would it be possible for you to modify Scheduler.class so that it catches these concurrency-related NullPointerExceptions? And when could this be ready: in the next update, or earlier? (As a reminder, we are already running 1.6.7 pre, and apart from this concurrency problem it is working great!)

2. What else could we do?
I know that we could write our own CTimer, but this would take quite a while and would need a lot of testing.
I think I will also try a different game architecture in order to reduce the amount of task adding and removing. However, this would only reduce (not eliminate) the chance that a concurrency problem occurs. The concurrency problem with the timer would still remain.
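
For anyone weighing the "own timer" option, the two failure modes described above (a shared list modified by several threads, and one bad task wedging the loop) can be illustrated with a small sketch. This is a generic pattern under stated assumptions, not the SFS Scheduler and not the fix that was later shipped; the class `SafeTaskQueueSketch` and its methods are hypothetical names.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical example class, not part of SmartFoxServer.
public class SafeTaskQueueSketch {

    // A concurrent queue tolerates adds and removes from many threads,
    // unlike a plain LinkedList shared without locking.
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<Runnable>();

    // Null tasks are rejected up front so the run loop never sees one.
    boolean add(Runnable task) {
        return task != null && pending.add(task);
    }

    boolean remove(Runnable task) {
        return task != null && pending.remove(task);
    }

    // Drains the queue. Each task is taken off the queue before it runs,
    // so a task that throws cannot be retried forever. Returns the number
    // of tasks that failed.
    int executeAll() {
        int failures = 0;
        Runnable task;
        while ((task = pending.poll()) != null) {
            try {
                task.run();
            } catch (RuntimeException e) {
                failures++; // a real system would log this once and move on
            }
        }
        return failures;
    }
}
```

The point of the design is that a failing task is removed before it is executed and its exception is contained per task, so a single bad entry cannot flood the log in a loop.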


Thanks for your help,

peter

Posted: 18 Sep 2009, 14:17
by Lapo
Questions:
1. Would it be possible for you to modify Scheduler.class so that it catches these concurrency-related NullPointerExceptions? And when could this be ready: in the next update, or earlier? (As a reminder, we are already running 1.6.7 pre, and apart from this concurrency problem it is working great!)
I think we can send an update by the beginning of the next week. Ideally on Monday. We are already working on it and we're now in testing phase.

If you could please drop us an email as a reminder it would be great.

Hope it helps

Posted: 23 Sep 2009, 10:01
by peterraq
Hi Lapo,

Thanks again for your super fast help!

I will keep you updated if anything does not work.
So far, the system is up and running without problems.

Greetings,

Peter

Posted: 23 Sep 2009, 13:26
by Lapo
Great. Keep us posted.
Cheers

Posted: 26 Oct 2009, 09:13
by vasmik
Hi!

I have "problem 3" too. Can you send an update to vasmik@gmail.com?

Thank you.

Posted: 26 Oct 2009, 14:01
by Lapo
Problem 3 is solved in the latest 1.6.7 update.
Download from here: http://forums.smartfoxserver.com/viewtopic.php?t=6006

annoying exception

Posted: 08 Oct 2010, 10:15
by Siavash
Hi,
Unfortunately, I get the following exception:
java.lang.IllegalStateException: Current state = RESET, new state = FLUSHED
at java.nio.charset.CharsetEncoder.throwIllegalStateException(CharsetEncoder.java:951)
at java.nio.charset.CharsetEncoder.flush(CharsetEncoder.java:640)
at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:769)
at org.xsocket.DataConverter.toByteBuffer(DataConverter.java:125)
at org.xsocket.stream.Connection.write(Connection.java:576)
at it.gotoandplay.utils.net.xmlsocket.XMLSocket.send(XMLSocket.java:93)
at it.gotoandplay.smartfoxserver.httpbox.data.SFSClient.sendMessageToSfs(SFSClient.java:158)
at it.gotoandplay.smartfoxserver.httpbox.HttpBox.doPost(HttpBox.java:189)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
at it.gotoandplay.smartfoxserver.httpbox.filter.SessionIDFilter.doFilter(SessionIDFilter.java:195)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1148)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:387)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:747)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:451)

It's not a serious problem, but our log file is full of this message.
Any ideas on how to prevent this?
Thanks in advance,

siavash

Posted: 08 Oct 2010, 13:43
by Lapo
Which SmartFoxServer version, please?

Posted: 14 Oct 2010, 06:11
by Siavash
We are using Smartfox 1.6.9 on a Linux Server.

Posted: 15 Oct 2010, 05:51
by Lapo
It is difficult to say precisely what is happening.
It looks like a problem with the HTTP content. Maybe wrong request types sent by generic clients...
Only a traffic analysis could tell.
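
For readers trying to interpret the two `IllegalStateException` messages in this thread: they come from the internal state checks of `java.nio.charset.CharsetEncoder`. The sketch below reproduces both messages deterministically by calling the encoder's methods out of order. The JDK documentation states that encoder instances are not safe for use by multiple concurrent threads, so one possible (unconfirmed) explanation is an encoder shared between threads; nothing in the thread establishes that this is what happens inside xsocket or the BlueBox. The class `EncoderStateSketch` and its methods are hypothetical names for this example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical example class; it only demonstrates the encoder's state checks.
public class EncoderStateSketch {

    // Flushing a fresh encoder (state RESET) is an invalid transition.
    static String flushBeforeEncode() {
        CharsetEncoder enc = Charset.forName("UTF-8").newEncoder();
        try {
            enc.flush(ByteBuffer.allocate(16));
            return "no exception";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    // Encoding again after a flush, without reset(), is also invalid.
    static String encodeAfterFlush() {
        CharsetEncoder enc = Charset.forName("UTF-8").newEncoder();
        ByteBuffer out = ByteBuffer.allocate(16);
        enc.encode(CharBuffer.wrap("a"), out, true);
        enc.flush(out);
        try {
            enc.encode(CharBuffer.wrap("b"), out, true);
            return "no exception";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    // One way to avoid shared encoder state: let each call create its own.
    static byte[] encodeSafely(String text) {
        return text.getBytes(Charset.forName("UTF-8"));
    }

    public static void main(String[] args) {
        System.out.println(flushBeforeEncode());
        System.out.println(encodeAfterFlush());
        System.out.println(encodeSafely("ok").length);
    }
}
```

In single-threaded code the one-argument `encode(CharBuffer)` convenience method seen in the stack traces resets the encoder before it starts, which is why interleaved calls from two threads are a plausible way to reach these states. Confirming that would still require the kind of analysis suggested above.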