Shifting Server Selection Criteria

Note:  I had written this post 2 years ago but somehow never noticed it in my drafts folder…

What was old is new.  Long ago we used internal RAID on servers for most applications; in some cases we would go as far as using internal HBAs with external JBODs to allow two physical servers to share some logical volumes, or to get the most out of a “high capacity” RAID enclosure (at the time they seemed high, but by today’s standards many phones offer more addressable capacity).  Over time we moved all of this critical data to a shared storage system, perhaps a SAN (storage area network).  The SAN vendors have continued to charge high prices for decreasing value, leaving the storage market ripe for disruption by distributed storage that leverages commodity hardware and is delivered as software.  No longer will we find it acceptable to pay $2,500 for a disk drive in a SAN that we can buy on the street for $250.

This leads me to repeating the past: I find myself in desperate need of brushing up on managing the RAID controllers in my hosts.  Perhaps this is for VSAN, or ScaleIO, or some other converged storage offering that can leverage my existing compute nodes and all of that formerly idle storage potential.  As we make this transition, we find that the selection criteria we had for our compute hosts are no longer valid, or at least not ideal, for a converged deployment.  Up until now the focus has been on compute density, either CPU cores per rack unit or physical RAM per rack unit…in fact many blade vendors found a nice market by maximizing focus on just that.

What these silo compute servers all had in common was minimal internal storage; we didn’t need it.  We needed massive compute density to make room for our really expensive SAN with all of its pretty lights. As we move down this path of converged compute and storage, we need to dig out some of our selection criteria from a decade ago.  We now need to weigh disk slots per rack unit into our figures. It turns out we can decrease our CPU+RAM density by large sums, yet by implementing converged storage offerings we can drastically reduce the cost of providing the entire package of compute and storage.  We must look at the balance of compute to storage more closely as these resources become tightly coupled; there are new considerations we are not accustomed to that, if not accounted for, can lead to project failure.

When the hypervisor first started gaining ground there was a lot of debate over the consolidation ratio that made sense.  Some vendors and integrators argued that Big Iron made the most sense: a server with massive CPU and RAM density that allowed for ridiculous VM:host ratios.  What we found is that this becomes a pretty massive failure domain, and the larger the failure domain, the larger the capacity we have to reserve.  The cost of our HA (high availability) insurance is directly proportional to our host density.  Likewise, the time to enter maintenance mode for each host correlates directly to the utilized RAM density on that host: the more RAM in use on a host, the longer every maintenance cycle for that host will take.
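
A quick illustration with purely made-up numbers: in a 32-host cluster sized for N+1 availability, the failover reserve is 1/32 of cluster capacity, roughly 3%; consolidate the same workload onto 8 “Big Iron” hosts and the same N+1 policy reserves 1/8, or 12.5%, and each of those 8 hosts now carries roughly four times the RAM to evacuate during every maintenance cycle.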

This is relevant because when we look at converged storage (or hyperconverged, as some may refer to it) we have to consider the exact same thing.  We still have the traditional compute items to account for, but we also need to factor in storage.  Our host is now a failure domain for storage, so we must reserve one host (or more) of capacity…this also means that when a host goes into maintenance mode, in the worst case we have to move an entire host’s worth of stored data to ensure accessibility.

Simplified remote access to a home lab

One of the challenges of traveling on a regular basis is that you are often not near your lab. The investment in a home lab really requires the ability to access it from anywhere if it is to have any hope of meeting its (perhaps falsely perceived) ROI. I’ve had a Unix/Linux based workstation for more of my working life than I’ve had a Windows one; sure, Windows was always involved, as a virtual machine on VMware Workstation (Linux) and now VMware Fusion (Mac).

There are insecure, complex and/or expensive options, such as buying a Cisco ASA or some other “firewall” that supports VPN…but that doesn’t support the goals and requirements of my lab, and it is the expensive option. The possibly more complex option would be to build a firewall from an old PC, but that is high maintenance and I prefer my regular access to be simple and reliable (thus I have a Mac + Airport household, other than the 3 lab servers). The insecure option would be to expose RDP on your Windows guest directly to the Internet; that is not an option for me. My service provider background makes me paranoid about Windows security, or lack thereof.

I have chosen to go with what is, in my mind, the cheapest and simplest option. Linux virtual machines are lightweight, use few resources, and you can always use a non-persistent disk to make one revert to a known config with a simple reboot (or restore from a snapshot). I leverage SSH tunneling, which is often overlooked while people pursue more complex L2TP or IPsec based options…but SSH is simple, seldom blocked on networks, and does the job. I have not gone as far as using L3 tunneling, though that is an option with SSH.

Firewall Settings

In my network I have 1 open port on my “firewall” (Apple Airport Extreme) which is forwarded to a minimal Linux virtual machine with a static (private) IP address.

  • Public Internet –> Port 8080 on firewall –> Port 22 on Linux

I would recommend creating multiple port forwards on your firewall; this will give you other options if the one you chose is blocked. I’ve had good luck with 8080 and 8022 so far, but some environments may block those. There is nothing to say you can’t use port 80, however any forced proxy server or protocol-inspecting firewall will break your SSH session, and some service providers block ports 25, 80, 443 and others.
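
Once a forward is in place, a quick sanity check from outside your network (the hostname and username here are placeholders) is to point your SSH client at the forwarded port:

        $: ssh -p 8080 yourusername@your-dyndns-hostname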

The beauty is that from the Linux side very little needs to be done. I would recommend editing the SSH config on the Linux VM to prevent root access. Keep in mind you really must create your non-root users before you do so, otherwise you cannot log in via SSH and will have to add those accounts via the console.

Secure Linux SSH Settings

I would recommend making sure your Linux VM is up to date using the correct update process for whichever distribution you select. The SSH server is pretty secure these days, but when compromises are found you should update to apply the relevant patches.

I would recommend editing the config file for sshd (/etc/ssh/sshd_config). Find the line that states PermitRootLogin and set it to “no”; if it is commented out, remove the “#” and set it to “no”.

  • PermitRootLogin no

Now restart SSH: $: sudo /etc/init.d/sshd restart
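
The init script path or service name varies by distribution, so the exact restart command may differ; for example (check your distribution’s documentation):

        $: sudo service ssh restart        # Debian/Ubuntu name the service "ssh"
        $: sudo systemctl restart sshd     # systemd-based distributions such as Fedora/RHEL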

The reason to remove root access over SSH is that it’s a “known” account and can easily be targeted. You should generally use hard-to-guess usernames and complex passwords for this “access server”; it is going to be port scanned and have attempts made to compromise it. Ideally you would configure the authentication policies so that account lockout occurs after too many failed attempts. Personally I do not allow interactive password-based logins; I use only pre-shared keys (it is much more difficult to guess a 2048-bit RSA key than an 8-character password). You can investigate the RSAAuthentication and PubkeyAuthentication options within the sshd_config file to learn more about that option.
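
As a rough sketch of what key-only access can look like, assuming the port, username and hostname from the working example further down (substitute your own values):

        # on the client, generate a 2048-bit RSA key pair if you don't already have one
        $: ssh-keygen -t rsa -b 2048
        # append the public key to the access server's authorized_keys
        $: cat ~/.ssh/id_rsa.pub | ssh -p 8080 user0315@my-dns.dnsalias.net "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys"
        # then in /etc/ssh/sshd_config on the Linux VM:
        PubkeyAuthentication yes
        PasswordAuthentication no
        # and restart sshd as shown earlier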

Public Access

My cable modem provider issues me a DHCP address; it happens to have been the same address for many months, but there is always the chance it could change. I use Dyn (http://dyn.com) to provide dynamic DNS for my home lab. You can install one of their dynamic DNS clients (http://dyn.com/support/clients/) on any OS within your home network that is generally always on (e.g. on your Linux access server); some “routers” (e.g. Cisco/Linksys) have one built in.
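
For example, ddclient is one of the Linux clients Dyn lists; a minimal /etc/ddclient.conf sketch might look something like this (the login and password are placeholders, and the hostname matches the SSH example below):

        # illustrative values; replace login/password with your own Dyn credentials
        protocol=dyndns2
        use=web, web=checkip.dyndns.org
        server=members.dyndns.org
        login=your-dyn-username
        password='your-dyn-password'
        my-dns.dnsalias.net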

Client Connection

Setup SSH Saved Configs
At this point you just need to configure your client. I happen to use the default SSH client on Mac OS, though if you are using Windows you could use PuTTY or another client and achieve the same result. In my case I don’t want to manually type out all of my config settings every time I connect; remember, this is more than SSH CLI access…it is our simple “VPN”.

In my environment I either want SSH access or RDP (e.g. to Windows for vSphere Client) access. I do this through simple port forwarding rules.
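
For reference, a single tunnel can also be opened ad hoc with no saved configuration; the saved config described below just bundles several of these into one short command (the values here match the working example further down):

        $: ssh -p 8080 -L 3389:192.168.100.15:3389 user0315@my-dns.dnsalias.net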

In order to configure saved “session” settings for the shell SSH client on OS X you will need to do the following:

  1. Open a terminal window of your choice (Terminal.app or my preferred iTerm2)
  2. Navigate to your home directory: $: cd ~/
  3. Create a .ssh directory: $: mkdir ~/.ssh
  4. Create a .ssh/config file: $: touch ~/.ssh/config
  5. Set security settings on the .ssh directory, otherwise SSH will not accept your keys if you use them in the future: $: chmod 700 ~/.ssh
  6. Set security settings on config (not really necessary, but anything in .ssh should be set this way): $: chmod 600 ~/.ssh/*
  7. Now we can move on to building our configuration

You can use the editor of your choice to open the config file. If you wish to use a GUI app, you can go to Finder, press CMD-Shift-G, and you will be given a box to type in your target folder (e.g. ~/.ssh/); you can then edit the file with whichever editor you prefer (e.g. TextMate). The format of the file is:

Host <name used as ssh target>
        HostName <target hostname>
        User <username>
        Port <TCP port on firewall>
        Compression yes
        AddressFamily inet
        CompressionLevel 9
        TCPKeepAlive yes
        # RDP to Server1
        LocalForward localhost:3389 <private IP>:3389
        # RDP to Server2
        LocalForward localhost:3399 <private IP>:3389
        # RDP to Server3
        LocalForward localhost:3390 <private IP>:3389

Working example:
Host remotelab
        HostName my-dns.dnsalias.net
        User user0315
        Port 8080
        Compression yes
        AddressFamily inet
        CompressionLevel 9
        # Privoxy
        LocalForward localhost:8118 localhost:8118
        # RDP to Control Center Server
        LocalForward localhost:3389 192.168.100.15:3389
        # RDP to vCenter
        LocalForward localhost:3399 192.168.100.20:3389
        # RDP to AD Server
        LocalForward localhost:3390 192.168.100.60:3389
        # HTTPS to vCloud Director cell
        LocalForward localhost:443 192.168.100.25:443
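
One caveat on that last forward: binding a local port below 1024 (443 here) normally requires running ssh with root privileges, so you may prefer to map the vCloud Director cell to an unprivileged local port instead, for example:

        LocalForward localhost:8443 192.168.100.25:443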

In my case I also installed and configured Privoxy (http://www.privoxy.org/) to give me the ability to tunnel other protocols via proxy settings on my laptop (e.g. web browser, instant messengers, etc.).

Connect To Your Lab

What was the point of all of this if I don’t show you how to connect? Open your terminal again and type “ssh” followed by your saved config name (e.g. $: ssh remotelab). Authenticate as needed, and you should then be connected to the shell of your Linux VM.

Now open your RDP client of choice (I suggest CoRD: http://cord.sourceforge.net/) and connect to one of your target tunnels by specifying localhost:<target port for desired server>.

Now anyone lazy, errr…striving for efficiency, will save a config within CoRD for each server, for connecting directly when on the local network or via the tunnel. You can then just select the saved session within CoRD.app without having to remember which TCP port maps to each server.

Of course, for Windows users this doesn’t help. On Windows you have a really neat client you can use to simplify this; I would recommend Tunnelier from Bitvise: http://www.bitvise.com/tunnelier. There may be simpler GUI-driven SSH clients for configuring this on Mac OS, however I just use what is included, as it’s always there and it doesn’t break when you upgrade to the next version.

Have a better way that is easy? Let me know; I’m always open to new ideas around getting access to the lab. I’ve always intended to set up View with a secure server, but that is also on the complex path and I want something that just works. Once this configuration is set up you can duplicate it easily, as the complexity is in the saved .ssh/config file and not the “server”.

Time and tide wait for none

I’ve been in professional services for over 8 years now, and most of that has really centered around data storage.  I spent a period of time implementing EMC commercial systems for an EMC-contracted services partner, then spent a few years contracted to do the same for NetApp.  In the mix of all of this I worked for EMC, IBM, HDS, and NetApp resellers with exposure to almost all of their systems, on a technical pre-sales basis and in post-sales implementation efforts.  Out of this experience I have formed some fairly strong and, I’d like to think, informed opinions of what should be in “enterprise” storage systems.

Now, during all of this time consulting on storage systems, they were always connected to something.  In the earlier years (as if it was so long ago…) it was generally application servers, and really just physical Windows hosts.  At the time I never even had to make the distinction that they were “physical” because, really, there was no other option.  Yes, on occasion I worked with “virtual” systems on Sun Solaris, HP-UX and IBM AIX…but even these were somewhat rare, and many of them weren’t very virtual at all (virtual hardware didn’t exist).  As time progressed the types of systems connected to the storage evolved, and I had to support them all.  A storage system without any connected servers isn’t very useful; it can make lights blink, burn electricity and generate heat, but its usefulness really ends there.

As projects changed with the evolution of applications and with what businesses determined were critical, the storage systems progressed from supporting application servers that primarily ran databases (e.g. MS SQL, Oracle, etc.) to supporting email systems (e.g. MS Exchange).  It was really interesting that in the beginning most customers considered email not “valuable enough” to justify shared storage, but email quickly evolved into one of the most critical applications in all of our environments, right behind telephony.  As it turns out, communication is a critical function and we all prefer email for broadcast.

Of course, this all changed even further in recent years.  VMware quickly became the primary “server” I was connecting to storage.  This matured from a couple of servers in an environment running ESX to all servers running ESX, and it happened in a far shorter time than the progression from only databases to databases plus email on these storage systems.  There are so many variables in deploying virtualization that my informed opinions of storage systems became more validated (at least in my mind), as flexibility became more important than ever.

All of this is to say that a good “storage consultant” never knows just storage, though I know plenty of one-hat experts who can provision storage all day long but can’t plan for the actual requirements of the application on the other end of the communication chain.  I always had to keep pace with understanding the application that was connected, as the storage was always a critical piece of meeting SLAs, whether for performance, availability or data protection.  Storage architecture without awareness of the application will always fail to meet requirements.  That being said, I wouldn’t ever consider myself a DBA or an Exchange administrator, in part because I wouldn’t want either job, but I know enough to architect storage to meet business requirements for those applications.

Of course, that evolved into the same for virtualization…but with a distinct difference.  Virtualization changed how storage is managed and provisioned, and how data is protected.  If I had only been consulting on the small storage portion of a project, my billable utilization (a critical measure of success in the professional services environment) would have been pretty small, probably less than 25%…however, due to my awareness of the other components and my dedication to learning VMware, I was easily able to fill the other 50-75% of my time with the virtualization components.

I’ve been really fortunate in the past at keeping ahead of the curve; my first “real” tech job was in the ISP/telecom space.  That evolved from working in a support center for business leased-line customers (DS0, DS1, DS3, OCx, etc.) to being more involved in managing and planning the backend network.  As I watched the ISPs fade away and consolidate, I read the tea leaves as telling me that not as many router jockeys were going to be needed, so I switched into a more traditional IT role…as every company has an IT department.

This all changed when my wife and I moved across the country for her to attend law school; I left a perfectly good job that I hated to move to a new job market where I knew no one.  By luck, I found a job traveling as a storage consultant, and that progressed to where we are today.

The next advancement in my career was due to the realization that the IT industry is yet again changing.  It doesn’t take much time reading Gartner reports or other IT business case studies to realize that virtualization is here to stay, and the next logical evolution is Cloud Computing.  I have now moved to the next step along my career path and joined the industry leader in creating Cloud Computing solutions, VMware.

I am more excited about my job today than I have been in a long time; I just hope I can keep pace with the shifting tides and the evolution of such a radical change in the industry.  I join a team of individuals I have a lot of respect for, and I look forward to learning from them within the VMware vCloud Services group.

Crazy schedule and it’s not letting up

Well, as everyone can tell…I just started this thing and I already fell off, or so it would seem.  I took a week of time off to go to a friend’s wedding, which meant no laptop…and I just didn’t have anything relevant to post while trying to avoid thinking about work and technology.  This week has been a 3-day work week for me, and it’s gone by all too quickly.  I’ve been rushing through the inbox trying to get caught up and keep my head above water on the projects I am assigned to.  I’m still breathing…but there were times I had to pull out a soda straw and fight for a breath.

Next week is VMworld in San Francisco; I’m really excited and a bit stressed at the same time.  I have a ton to get done before flying out on Sunday, including a week’s worth of yard work and other homeowner chores that have been neglected through consecutive weeks of travel and insane temperatures.

I will try to post some details from VMworld during the week, but I can’t make any promises…I hope to at least have something exciting to share afterwards.  I keep hoping my work schedule will slow down just a bit so I can leverage my lab environment to actually generate content anyone would want to read or watch.  Oh well, back to the struggle of leaving the place better than I found it…it is true that not all consultants seem to have that goal; it gets frustrating to clean up after “experts”, and even more aggravating when the mess was created by someone I know.

First post

I guess everyone has to make their “first” blog post.  I’m certain that others have set the bar very high…but I strive to set expectations low and over-deliver.  With that being said, I will attempt to simply state what I hope this blog becomes.

I work every day helping companies work towards efficiency, primarily as it relates to their corporate data centers.  The fact is that the “Data Center” extends far beyond the physical presence most people think of; it impacts how a company does its business internally and externally.  The “Data Center” enables them to deliver applications, services, widgets, or otherwise…but much of the orchestration that makes any business operate starts in the data center.

Most data centers have a lot of waste.  Wasted space.  Wasted energy. Wasted people.  Wasted time.  Waste is almost the norm for data centers and IT operations, usually forced by the complexity of the modern business world.  Complex requirements lead to overly complex solutions.

I work (through my employer) to help customers drive this waste out.  There are endless options for making the data center, IT, and the business more efficient.  Some would look at this as “greening of the data center”; the green could either be the hard cash that is saved or the reduced environmental impact that can be achieved.