DatabaseManager and connection pooling

Post here your questions about Actionscript and Java server side extensions development.

Moderators: Lapo, Bax

icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

DatabaseManager and connection pooling

Post by icepick »

Hi,
We have a zone-level extension and while it has been working fine, we have been seeing some database-related issues in the last few days.

Earlier today, our extension could not execute any queries to the database, with the following error in the SFS logs:

DbManager could not retrive a connection. org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool exhausted

Restarting smartfox fixed that, but I'm wondering why it occurred and what we can do to prevent it from happening again.

Currently we are executing queries directly through the DatabaseManager.executeQuery(sql) call, and are not using getConnection() at all. Since our extension is not (at this point) multithreaded, is there an advantage to using getConnection()? Does executeQuery() use the connection pool behind the scenes?

Not sure if this is related or not, but we have been getting quite a few socket timeouts on our database connection lately. I had initially thought that this was just a networking problem, but now I'm not sure if it's related to the above problem. The end of a traceback is pasted below.

Thanks.

Code: Select all

INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at it.gotoandplay.smartfoxserver.extensions.PythonExtension.handleRequest(PythonExtension.java:203)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at it.gotoandplay.smartfoxserver.controllers.ExtensionHandler.processEvent(ExtensionHandler.java:539)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at it.gotoandplay.smartfoxserver.controllers.ExtensionHandler.run(ExtensionHandler.java:344)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at java.lang.Thread.run(Unknown Source)
INFO   | jvm 1    | 2009/04/21 06:17:06 | Caused by: java.net.SocketTimeoutException: Read timed out
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at java.net.SocketInputStream.socketRead0(Native Method)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at java.net.SocketInputStream.read(Unknown Source)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:135)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:104)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:259)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1182)
INFO   | jvm 1    | 2009/04/21 06:17:06 | 	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:194)
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

Regarding the socket timeouts, I should also mention that while our database and smartfox are not on the same server, they are in the same data center (ping time is <1ms). It works without timing out 99% of the time.
User avatar
Lapo
Site Admin
Posts: 23438
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo »

DbManager could not retrive a connection. org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool exhausted

Restarting smartfox fixed that, but I'm wondering why it occurred and what we can do to prevent it from happening again.
You can simply configure your DatabaseManager to allow more connections in the pool and let it grow (with the GROW option) when it's running out of connections.

This one indicates the max. number of active connections

Code: Select all

<MaxActive>10</MaxActive>
You are likely to have this in your config:

Code: Select all

<OnExhaustedPool>fail</OnExhaustedPool>
which will throw an exception when connections are all taken.

Check the docs for all the details:
http://www.smartfoxserver.com/docs/docP ... ection.htm
Lapo
--
gotoAndPlay()
...addicted to flash games
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

Lapo wrote:
DbManager could not retrive a connection. org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool exhausted

Restarting smartfox fixed that, but I'm wondering why it occurred and what we can do to prevent it from happening again.
You can simply configure your DatabaseManager to allow more connections in the pool and let it grow (with the GROW option) when it's running out of connections.

This one indicates the max. number of active connections

Code: Select all

<MaxActive>10</MaxActive>
You are likely to have this in your config:

Code: Select all

<OnExhaustedPool>fail</OnExhaustedPool>
which will throw an exception when connections are all taken.

Check the docs for all the details:
http://www.smartfoxserver.com/docs/docP ... ection.htm
Thanks Lapo, I did see that when I was going through the docs. I will change that setting and see if that helps. Do you think that would help with the socket timeouts, or am I likely dealing with a networking issue there?
User avatar
Lapo
Site Admin
Posts: 23438
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo »

Yep, that will help a lot
Lapo
--
gotoAndPlay()
...addicted to flash games
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

Lapo wrote:Yep, that will help a lot
Thanks Lapo, I will try that. Your customer support is excellent, by the way!
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

The timeouts are persisting, although I haven't seen any "pool exhausted" messages since. Anything else you can think of to minimize or help debug the timeouts?

It may be a firewall issue as well, I am looking into that.
User avatar
Lapo
Site Admin
Posts: 23438
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo »

You probably have an extension scalability problem due to the many database accesses.
Extensions are by default handled by a single thread, but high traffic on the database usually requires that you adjust the ExtensionHandler to work with a larger pool of threads.

http://www.smartfoxserver.com/docs/docP ... Safety.htm
Lapo
--
gotoAndPlay()
...addicted to flash games
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

Lapo wrote:You probably have an extension scalability problem due to the many database accesses.
Extensions are by default handled by a single thread, but high traffic on the database usually requires that you adjust the ExtensionHandler to work with a larger pool of threads.

http://www.smartfoxserver.com/docs/docP ... Safety.htm
I am undergoing some work to minimize database queries, although we are not doing a ton right now.

I'll check out the threading stuff as well; thanks.
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

Bringing this thread back from the dead... Lately these timeouts have become more frequent, although they are extremely unpredictable.

I added more logging statements to the extension and the database queries are hanging the extension for up to 14 minutes at a time! Although this happens infrequently. Checking the database server's logs, all that shows up is:

LOG: could not receive data from client: Connection reset by peer
LOG: unexpected EOF on client connection

There is a firewall between these servers that could be messing with the TCP connection. But what I really want to know is, is there a way to set a timeout parameter for database queries from a smartfoxserver extension? As this extension is currently single-threaded, these timeouts are hanging the entire extension.

Database: postgresql 8.3 (on a different nearby server)
Database Adapter: postgresql jdbc4 jar
User avatar
Lapo
Site Admin
Posts: 23438
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo »

The problem is that the extension is single threaded and scalability is lost.
Since database calls can hold the extension thread for a long time (ms or even seconds) when compared to other calls (nanoseconds) you will to resize the pool of threads used by the ExtensionHandler stage of the server.

(Using the <ExtHandlerThreads> parameter in the config.xml)
Lapo
--
gotoAndPlay()
...addicted to flash games
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

Lapo wrote:The problem is that the extension is single threaded and scalability is lost.
Since database calls can hold the extension thread for a long time (ms or even seconds) when compared to other calls (nanoseconds) you will to resize the pool of threads used by the ExtensionHandler stage of the server.

(Using the <ExtHandlerThreads> parameter in the config.xml)
Hi Lapo,
Yes, the extension being single-threaded does hurt scalability, but even if it was multithreaded, it's still a problem for one of the threads to hang for minutes at a time for a database query, since at least one user will still be waiting for that query to go through. Is there no way to set a client-side timeout for a query (other than setting up a timer thread)?
User avatar
Lapo
Site Admin
Posts: 23438
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo »

Again, I think we're not looking at the right problem.
Why would a query take so long? 1 minute? whoa :shock:
I think the problem here is the database response time not the timeout.

Is the database overloaded or are you using monster queries? :)
Lapo
--
gotoAndPlay()
...addicted to flash games
icepick
Posts: 18
Joined: 03 Mar 2009, 17:43

Post by icepick »

Lapo wrote:Again, I think we're not looking at the right problem.
Why would a query take so long? 1 minute? whoa :shock:
I think the problem here is the database response time not the timeout.

Is the database overloaded or are you using monster queries? :)
I'm pretty sure the holdup is networking/firewall related. Most of the time the query is done in less than 5ms, including network time. But it seems that if no queries are done for about an hour or so, the next query will take at least a few minutes, which is totally unacceptable. From the database's side, they see a misbehaving client (see the EOF log message posted earlier). It's not the database locking up; I've never had a problem with the web app that is also using it.
User avatar
Lapo
Site Admin
Posts: 23438
Joined: 21 Mar 2005, 09:50
Location: Italy

Post by Lapo »

But it seems that if no queries are done for about an hour or so, the next query will take at least a few minutes, which is totally unacceptable
Ok I see. In this case it is possible that the connection between SFS and the DB becomes stale.

In order to avoid this you can simply keep it alive from time to time by simply sending a call to the database, maybe using an inexpensive query like "SELECT 1". It can be done like once every 20 - 30 minutes.

Cheers
Lapo
--
gotoAndPlay()
...addicted to flash games
Post Reply