Enterprise Architecture & Integration, SOA, ESB, Web Services & Cloud Integration

Friday, 31 January 2014

HTTP 500 error and Chunked encoding in OSB/WebLogic

Until recently, my production WebLogic Server showed a strange issue at peak time. I have a number of web services deployed with CXF on WebLogic Server. The message flow is as follows:

External Load balancer --> OSB --> External Load balancer --> (CXF in) WebLogic


While 95 to 98% of the requests sent to WebLogic were processed successfully, the remaining requests were timed out by WebLogic Server after 30 seconds. In the WebLogic access log, every timed-out request was recorded with HTTP status code 500. It took a while to understand, isolate and resolve the issue.

I did a thread dump analysis and saw several hogging threads with a stack trace like the one below:


"[ACTIVE] ExecuteThread: '43' for queue: 'weblogic.kernel.Default (self-tuning)'" RUNNABLE native
        java.net.SocketInputStream.socketRead0(Native Method)
        java.net.SocketInputStream.read(SocketInputStream.java:129)
        weblogic.servlet.internal.PostInputStream.read(PostInputStream.java:142)
        weblogic.utils.http.HttpChunkInputStream.readChunkSize(HttpChunkInputStream.java:115)
        weblogic.utils.http.HttpChunkInputStream.initChunk(HttpChunkInputStream.java:74)
        weblogic.utils.http.HttpChunkInputStream.skip(HttpChunkInputStream.java:203)
        weblogic.utils.http.HttpChunkInputStream.skipAllChunk(HttpChunkInputStream.java:378)
        weblogic.servlet.internal.ServletInputStreamImpl.ensureChunkedConsumed(ServletInputStreamImpl.java:35)
        weblogic.servlet.internal.ServletRequestImpl.skipUnreadBody(ServletRequestImpl.java:194)
        weblogic.servlet.internal.ServletRequestImpl.reset(ServletRequestImpl.java:152)
        weblogic.servlet.internal.MuxableSocketHTTP.requeue(MuxableSocketHTTP.java:195)
        weblogic.servlet.internal.VirtualConnection.requeue(VirtualConnection.java:329)
        weblogic.servlet.internal.ServletResponseImpl.send(ServletResponseImpl.java:1538)
        weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1455)
        weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
        weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
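
When a dump contains hundreds of threads, it helps to count how many of them are parked in the same frame. The following is only a rough sketch of that idea, not a WebLogic tool: the class name StuckThreadCounter, the method threadsWithFrame and the second thread in the sample are made up for illustration, and it assumes that every thread header line starts with a double-quoted thread name, as in the trace above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists threads whose stack contains a given frame.
class StuckThreadCounter {

    /** Returns the names of all threads whose stack mentions the given frame text. */
    static List<String> threadsWithFrame(String dump, String frame) {
        List<String> hits = new ArrayList<>();
        String header = null;     // header line of the thread currently being scanned
        boolean recorded = false; // true once the current thread has been added to hits
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // the pasted dump has whitespace-only lines between frames
            }
            if (line.startsWith("\"")) {
                // Assumption: a thread header begins with the quoted thread name.
                header = line;
                recorded = false;
            } else if (header != null && !recorded && line.contains(frame)) {
                int end = header.indexOf('"', 1);
                hits.add(end > 0 ? header.substring(1, end) : header.substring(1));
                recorded = true;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Shortened sample; the STANDBY thread is invented for contrast.
        String dump = String.join("\n",
            "\"[ACTIVE] ExecuteThread: '43' for queue: 'weblogic.kernel.Default (self-tuning)'\" RUNNABLE native",
            "        java.net.SocketInputStream.socketRead0(Native Method)",
            "        weblogic.utils.http.HttpChunkInputStream.readChunkSize(HttpChunkInputStream.java:115)",
            "\"[STANDBY] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'\" WAITING",
            "        java.lang.Object.wait(Native Method)");
        List<String> stuck = threadsWithFrame(dump, "HttpChunkInputStream.readChunkSize");
        System.out.println(stuck.size() + " thread(s) waiting for a chunk size: " + stuck);
    }
}
```

Running the same check on dumps taken a few seconds apart shows whether the same execute threads stay blocked in the chunk read, which is what marks them as hogging threads.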

After further investigation, I found that WebLogic Server was not receiving the complete request message within the default Post Timeout (30 seconds) configured on the server. Increasing the timeout to 45 or 60 seconds did not help; it only aggravated the performance issue.

Let's look at how the issue was resolved. When you define a business service in Oracle Service Bus (OSB), you will have seen a parameter named "Use Chunked Streaming Mode" in the "HTTP Transport Configuration" section. Its value was set to the default, "Enabled". Apparently, the load balancer was not able to handle requests with chunked encoding at peak time, so it could not transmit the complete request message to WebLogic Server within the configured post timeout period. This is consistent with the thread dump shown above, where the hogging threads are blocked in a socket read under HttpChunkInputStream.readChunkSize, waiting for the next chunk. After changing the value from "Enabled" to "Disabled", the 500 errors have been eliminated and the server is now in good health.
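
To see why a receiver can end up waiting on a chunked request, it helps to look at how HTTP/1.1 frames a chunked body: each chunk is sent as a hexadecimal size line followed by the data, and the body only ends when a zero-size chunk arrives. With chunking disabled, the sender instead announces the total size up front in a Content-Length header. The sketch below is a simplified illustration of that framing (ASCII bodies only, no chunk extensions or trailers); the class ChunkedFraming and its methods are made up for this post and are not the OSB or WebLogic implementation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of HTTP/1.1 chunked framing.
class ChunkedFraming {

    /** Frames an ASCII body as a chunked message body with chunks of at most chunkSize bytes. */
    static String encodeChunked(String body, int chunkSize) {
        byte[] bytes = body.getBytes(StandardCharsets.US_ASCII);
        StringBuilder wire = new StringBuilder();
        for (int off = 0; off < bytes.length; off += chunkSize) {
            int len = Math.min(chunkSize, bytes.length - off);
            wire.append(Integer.toHexString(len)).append("\r\n"); // chunk size in hex
            wire.append(new String(bytes, off, len, StandardCharsets.US_ASCII)).append("\r\n");
        }
        wire.append("0\r\n\r\n"); // zero-size chunk marks the end of the body
        return wire.toString();
    }

    /** Returns true only if the terminating zero-size chunk has been received. */
    static boolean isComplete(String received) {
        int pos = 0;
        while (true) {
            int eol = received.indexOf("\r\n", pos);
            if (eol < 0) {
                return false; // next size line has not arrived: a reader would block here
            }
            int size = Integer.parseInt(received.substring(pos, eol), 16);
            pos = eol + 2;
            if (size == 0) {
                return received.startsWith("\r\n", pos); // assumes no trailer fields
            }
            pos += size + 2; // skip the chunk data and its closing CRLF
            if (pos > received.length()) {
                return false; // chunk data is still in flight
            }
        }
    }

    public static void main(String[] args) {
        String wire = encodeChunked("Hello, World", 5);
        String cutShort = wire.substring(0, wire.length() - 5); // final chunk never arrives
        System.out.println("full message complete: " + isComplete(wire));
        System.out.println("truncated message complete: " + isComplete(cutShort));
    }
}
```

The truncated case corresponds to the situation in the thread dump: the reader has consumed every chunk it was given and is still waiting for another size line, so only a timeout can release the thread.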

Thanks for reading this post; I hope you liked the tip. Please feel free to post your comments here.

29 comments:

  1. Hi Ayyappan,
    Thanks for the detailed post. I also faced a similar issue in one of our Axis2 web services deployed on Oracle WebLogic 12c. When the load on the service is high, we get HTTP status 500 in the access logs, which causes alarms to be triggered. In this service we also notify the client about error scenarios with SOAP exceptions, though the number of SOAP exceptions is quite low compared to the number of HTTP 500 responses for the service. Also, in the axis2.xml configuration in the Axis2 WAR file, the HTTP transport values below are set to chunked by default. If the value is changed from chunked, will the occurrence of HTTP 500 be reduced?

    axis2.xml configuration:

    HTTP/1.1
    chunked

    HTTP/1.1
    chunked
  2. Sorry, the axis2.xml configuration was missing from my previous query.

    HTTP/1.1
    chunked

  3. I wish to thank you for bailing me out of this particular trouble. After searching the net and trying techniques that were not productive, I was looking for the same information for Oracle OSB 12c and came across your blog. I am impressed by the information that you have on this blog. Thanks once more for all the details.
