Olá pessoal,
Estou precisando de orientação sobre um problema q estou enfrentando com um cluster de 2 máquinas de JBOSS 4.0.5 GA.
Tenhos as máquinas A e B. Quando preciso baixar uma das máquinas (digamos… a máquina A) do cluster não consigo subi-la novamente se não baixar também a máquina B e subir as duas juntas. Ou seja, isso torna o cluster inútil pois não consigo tolerância a falhas.
O erro acontece quando a máquina que está subindo tenta obter o estado inicial da outra máquina.
08:54:17,120 INFO [TreeCache] TreeCache local address is sdfjboss03:33079
08:54:17,528 INFO [TreeCache] viewAccepted(): [172.17.30.26:36417|3] [172.17.30.26:36417, sdfjboss03:33079]
08:54:24,635 INFO [FRAG2] assembled_msg is [dst: 230.1.2.7:45577, src: 172.17.30.26:36417 (3 headers), size = 161736 bytes]
08:54:25,892 INFO [FRAG2] assembled_msg is [dst: sdfjboss03:33079, src: 172.17.30.26:36417 (3 headers), size = 492095 bytes]
08:54:32,524 WARN [ServiceController] Problem starting service jboss.cache:service=TomcatClusteringCache
org.jboss.cache.CacheException: Initial state transfer failed: Channel.getState() returned false
at org.jboss.cache.TreeCache.fetchStateOnStartup(TreeCache.java:3191)
at org.jboss.cache.TreeCache.startService(TreeCache.java:1429)
at org.jboss.cache.aop.PojoCache.startService(PojoCache.java:94)
at org.jboss.system.ServiceMBeanSupport.jbossInternalStart(ServiceMBeanSupport.java:289)
at org.jboss.system.ServiceMBeanSupport.jbossInternalLifecycle(ServiceMBeanSupport.java:245)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.system.ServiceController$ServiceProxy.invoke(ServiceController.java:978)
at $Proxy0.start(Unknown Source)
at org.jboss.system.ServiceController.start(ServiceController.java:417)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
at $Proxy4.start(Unknown Source)
at org.jboss.deployment.SARDeployer.start(SARDeployer.java:302)
at org.jboss.deployment.MainDeployer.start(MainDeployer.java:1025)
at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:819)
at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:782)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
at org.jboss.mx.interceptor.AbstractInterceptor.invoke(AbstractInterceptor.java:133)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
at org.jboss.mx.interceptor.ModelMBeanOperationInterceptor.invoke(ModelMBeanOperationInterceptor.java:142)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
at $Proxy8.deploy(Unknown Source)
at org.jboss.deployment.scanner.URLDeploymentScanner.deploy(URLDeploymentScanner.java:421)
at org.jboss.deployment.scanner.URLDeploymentScanner.scan(URLDeploymentScanner.java:634)
at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.doScan(AbstractDeploymentScanner.java:263)
at org.jboss.deployment.scanner.AbstractDeploymentScanner.startService(AbstractDeploymentScanner.java:336)
at org.jboss.system.ServiceMBeanSupport.jbossInternalStart(ServiceMBeanSupport.java:289)
at org.jboss.system.ServiceMBeanSupport.jbossInternalLifecycle(ServiceMBeanSupport.java:245)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.system.ServiceController$ServiceProxy.invoke(ServiceController.java:978)
at $Proxy0.start(Unknown Source)
at org.jboss.system.ServiceController.start(ServiceController.java:417)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
at $Proxy4.start(Unknown Source)
at org.jboss.deployment.SARDeployer.start(SARDeployer.java:302)
at org.jboss.deployment.MainDeployer.start(MainDeployer.java:1025)
at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:819)
at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:782)
at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:766)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
at org.jboss.mx.interceptor.AbstractInterceptor.invoke(AbstractInterceptor.java:133)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
at org.jboss.mx.interceptor.ModelMBeanOperationInterceptor.invoke(ModelMBeanOperationInterceptor.java:142)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
at $Proxy5.deploy(Unknown Source)
at org.jboss.system.server.ServerImpl.doStart(ServerImpl.java:482)
at org.jboss.system.server.ServerImpl.start(ServerImpl.java:362)
at org.jboss.Main.boot(Main.java:200)
at org.jboss.Main$1.run(Main.java:490)
at java.lang.Thread.run(Thread.java:595)
08:54:33,009 WARN [UDP] discarded message from different group (Tomcat-Cluster). Sender was 172.17.30.11:32769
08:54:34,573 INFO [DefaultPartition] Initializing
08:54:39,168 INFO [UDP] unicast sockets will use interface 172.17.30.28
No arquivo cluster-service.xml tenho a seguinte configuração:
[i] <attribute name=“StateTransferTimeout”>30000</attribute>
<!-- The JGroups protocol configuration -->
<attribute name="PartitionConfig">
<!--
The default UDP stack:
- If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
appropriate NIC IP address, e.g bind_addr="192.168.0.2".
- On Windows machines, because of the media sense feature being broken with multicast
(even after disabling media sense) set the UDP protocol's loopback attribute to true
-->
<Config>
<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" mcast_port="45566"
bind_addr="192.168.0.1"
ip_ttl="${jgroups.mcast.ip_ttl:8}" ip_mcast="true"
mcast_recv_buf_size="2000000" mcast_send_buf_size="640000"
ucast_recv_buf_size="2000000" ucast_send_buf_size="640000"
loopback="false"/>
<PING timeout="2000" num_initial_members="3"
up_thread="true" down_thread="true"/>
<MERGE2 min_interval="10000" max_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD shun="true" up_thread="true" down_thread="true"
timeout="10000" max_tries="5"/>
<VERIFY_SUSPECT timeout="3000" num_msgs="3"
up_thread="true" down_thread="true"/>
<pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
max_xmit_size="8192"
up_thread="true" down_thread="true"/>
<UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
down_thread="true"/>
<pbcast.STABLE desired_avg_gossip="20000" max_bytes="400000"
up_thread="true" down_thread="true"/>
<FRAG frag_size="8192"
down_thread="true" up_thread="true"/>
<pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
shun="true" print_local_addr="true"/>
<pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
</Config>
[/i]
Esse erro acontece em produção (hehehe). Em ambiente de homologação não acontece mas consigo simular o problema diminuindo o StateTransferTimeout para 2, por exemplo. O arquivo cluster-service.xml é igual para produção e homologação. Já aumentei o valor do StateTransferTimeout em produção mas o erro persiste. Enfim, alguém já viu isso e sabe o q é?
(OBS.: Não existe firewall nenhum nas máquinas nem entre elas. JDK 1.5.0_11. linha de start do cluster ./run.sh -c all -b 172.17.30.28 )
Obrigado