[Thinlinc-technical] HA++
Peter Astrand
astrand at cendio.se
Tue May 6 12:58:02 CEST 2014
The [R] flag in the tcpdump output means that a TCP reset is being sent.
This is not normal. It could be a generic network or firewall problem, so
make sure to check all configuration. You mentioned that you are using F5
BigIp. Searching for "f5 big ip tcp reset" turns up a few interesting
pages, including:
sol9849: The BIG-IP system sends a TCP RST packet when the system reaches
a pool member or node connection limit
http://support.f5.com/kb/en-us/solutions/public/9000/800/sol9849.html
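
To make it easier to spot resets in a longer capture, here is a minimal
sketch in Python that scans tcpdump text output for lines carrying the
[R] flag. It assumes the one-line-per-packet format shown in the quoted
capture; the function names and the abbreviated sample lines (TCP options
left out) are only illustrative and not part of ThinLinc or tcpdump.

```python
import re

# Matches the start of a tcpdump line in the format seen in the quoted capture:
# "<time> IP <src> > <dst>: Flags [<flags>], ..."
LINE_RE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]*)\]"
)


def parse_packets(text):
    """Return one dict (time, src, dst, flags) per recognizable tcpdump line."""
    packets = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            packets.append(match.groupdict())
    return packets


def find_resets(packets):
    """Return the packets whose flag field contains R (TCP reset)."""
    return [p for p in packets if "R" in p["flags"]]


def has_syn_ack(packets):
    """True if any packet is a SYN-ACK, which tcpdump prints as [S.]."""
    return any("S" in p["flags"] and "." in p["flags"] for p in packets)


# Abbreviated copy of the "Master" capture from the quoted mail
# (TCP options removed for brevity; this is an assumption of the sketch).
SAMPLE = """\
12:47:55.085032 IP DMZ_IP.1022 > CLOUD_IP.904: Flags [S], seq 1134206032, win 29200, length 0
12:47:55.127040 IP CLOUD_IP.904 > DMZ_IP.1022: Flags [.], ack 1879359532, win 229, length 0
12:47:55.127114 IP DMZ_IP.1022 > CLOUD_IP.904: Flags [R], seq 2261775421, win 0, length 0
"""

if __name__ == "__main__":
    packets = parse_packets(SAMPLE)
    for p in find_resets(packets):
        print("RST at", p["time"], "from", p["src"], "to", p["dst"])
    print("SYN-ACK seen:", has_syn_ack(packets))
```

Run against a saved capture, this lists which side sends each reset and
whether a SYN-ACK ([S.]) was seen at all, which may help when comparing
the master and agent views of the same connection attempt.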
Br,
Peter
On Mon, 28 Apr 2014, Rui Lapa wrote:
> Well, my previous statement was not correct.
>
> The situation still persists, even though I disabled HA.
>
> With tcpdump I see the tcp/904 packets:
>
> Master:
> 12:47:55.085032 IP DMZ_IP.1022 > CLOUD_IP.904: Flags [S], seq 1134206032, win 29200, options [mss 1460,sackOK,TS val 149295328 ecr 0,nop,wscale 7], length 0
> 12:47:55.127040 IP CLOUD_IP.904 > DMZ_IP.1022: Flags [.], ack 1879359532, win 229, options [nop,nop,TS val 434128492 ecr 152335306], length 0
> 12:47:55.127114 IP DMZ_IP.1022 > CLOUD_IP.904: Flags [R], seq 2261775421, win 0, length 0
>
> Agent:
> 12:48:21.308964 IP DMZ_EXTERNAL_IP.1022 > CLOUD_INTERNAL_IP.904: Flags [S], seq 2384510363, win 29200, options [mss 1460,sackOK,TS val 149307566 ecr 0,nop,wscale 7], length 0
> 12:48:21.309028 IP CLOUD_INTERNAL_IP.904 > DMZ_EXTERNAL_IP.1022: Flags [.], ack 173, win 229, options [nop,nop,TS val 434140730 ecr 152335306], length 0
> 12:48:21.350525 IP DMZ_EXTERNAL_IP.1022 > CLOUD_INTERNAL_IP.904: Flags [R], seq 2261775421, win 0, length 0
>
> Help?
>
>
>
> On 04/23/2014 03:14 PM, Rui Lapa wrote:
>> After disabling HA it started working.
>>
>> After re-enabling it, it went back to the same strange on/off behavior.
>>
>> vsmserver.log
>> 2014-04-23 15:09:10 INFO vsmserver: VSM Server version 4.1.1 build 4137 started
>> 2014-04-23 15:09:10 INFO vsmserver.license: Updating license data from disk to memory
>> 2014-04-23 15:09:10 INFO vsmserver.license: License summary: 10 concurrent users. Hard limit of 11 concurrent users.
>> 2014-04-23 15:09:10 INFO vsmserver.session: Loaded 0 sessions for 0 users from file
>> ...
>> 2014-04-23 15:09:50 WARNING vsmserver: Timeout getting load from VSM Agent EXTERNAL_IP1:904, marking as down
>> 2014-04-23 15:09:50 WARNING vsmserver: Timeout getting load from VSM Agent EXTERNAL_IP2:904, marking as down
>> 2014-04-23 15:11:10 WARNING vsmserver: Timeout getting load from VSM Agent EXTERNAL_IP1:904, marking as down
>> 2014-04-23 15:11:10 WARNING vsmserver: Timeout getting load from VSM Agent EXTERNAL_IP2:904, marking as down
>>
>> We really wanted to run the vsmservers in HA.
>>
>> Any tips?
>>
>>
>> On Wed, Apr 23, 2014 at 1:41 PM, Peter Astrand <astrand at cendio.se> wrote:
>>
>>
>> Hi, comments below:
>>
>>
>> Currently, we have a centralized FreeNX setup with 2 frontend
>> servers and 29 backend servers, supporting 500+ users, all
>> running in the LAN on old server hardware.
>> This setup has been running for 4 years. We created message
>> queuing and remote tasks to ease our stupid day-to-day
>> sysadmin jobs, ...
>> More info in this 2-year-old presentation (sorry, it is in
>> Portuguese):
>>
>> http://www.slideshare.net/rui_lapa/desktop-linux-na-tranquilidade-portolinux-presentation?qid=a331b942-8a98-4ae9-aea0-c02698e698b4&v=default&b=&from_search=1
>>
>> At the moment, we are evolving the solution into something "crazy":
>> we want to provide a desktop to external "entities" and shops
>> without internal network access, while still providing internal
>> Linux desktops.
>> The crazy part is that the external desktop servers will run at
>> a cloud provider.
>> In the future, we plan to use ONLY cloud servers, while
>> keeping the 2 vsmservers in the DMZ.
>>
>>
>> I agree that this is crazy :-) As I mentioned, the VSM servers
>> should be located where the VSM agents are. There are several
>> reasons for this, security being one of them.
>>
>>
>>
>> At this time we are doing our final test, where we have 2
>> DMZ vsmservers, 3 LAN agents and 2 Cloud agents.
>>
>> The datacenter has 4 dedicated internet circuits, with 1 IP
>> each, in an inbound/outbound load-balancing setup (F5 BigIp).
>>
>> The cloud agents are getting 4 polls (tcp/904), one from the
>> external IP of each circuit.
>> Because of this, we set "/vsmagent/allowed_clients" on the cloud
>> agents to the 4 external NAT IPs.
>> Unfortunately, even though tcpdump shows the traffic from each IP
>> to TCP/904 and the replies, the vsmserver webadmin load status
>> page shows these servers as up in one cycle and up or down in
>> the next, at random.
>> The internal vsmservers have "/vsmserver/terminalservers" set
>> to the external cloud IPs and the internal LAN IPs.
>>
>> But, is there anything more we need to do?
>>
>>
>> The web interface communicates with the VSM Server, so apparently the
>> VSM server(s) lose contact with the cloud agents now and then. It's
>> difficult to tell why. You could start with one single VSM server,
>> i.e. disable HA, until the agent communication works reliably. Also,
>> check out vsmserver.log. tcpdump/Wireshark can probably also tell
>> you what's going on.
>>
>> Regards, ---
>>
>> Peter Astrand ThinLinc Chief Developer
>> Cendio AB http://cendio.com
>> Teknikringen 8 http://twitter.com/ThinLinc
>> 583 30 Linkoping http://facebook.com/ThinLinc
>> Phone: +46-13-214600 http://google.com/+CendioThinLinc
>>
>>
>
>
>
---
Peter Astrand ThinLinc Chief Developer
Cendio AB http://cendio.com
Teknikringen 8 http://twitter.com/ThinLinc
583 30 Linkoping http://facebook.com/ThinLinc
Phone: +46 13 214 600 http://google.com/+CendioThinLinc