Sunday, May 31, 2009

Great day

Huge progress was made today: I laid the laminate floor in E's room, fitted baseboards and molding, and then helped J remove all the plastic that had been taped down over the floors. The cabin was transformed from a work zone into a home today.
On Tuesday the appliances are delivered, on Thursday 80% of the furniture is delivered, and on Friday they are measuring for granite countertops in the kitchen.
Someone asked today about the "deck" and permits. I found an interesting out: if the deck is lower than 30 inches and not attached to the house (and it isn't, hence the set of posts at the house end of the deck), it does not need all the inspections etc. It made life a whole lot easier not to have to worry about all the timing. I followed code for the construction: two beams, each made up of two ten-by-twos screwed together every 16 inches, eight-by-twos at 16 inch centers for the joists, and inch-and-a-quarter planking. The post holes go down 48 inches, well below the frost zone, and are 10 inches in diameter; the posts (four four-by-fours) rest on 8 inches of concrete, and the holes are backfilled with concrete.
My arms ache... not sure if it was the holes, the concrete, or all the drilling and screwing.
After we washed the floors for the second time, and the windows, we took the kayaks out for a spin. It was J's first time afloat (outside of the pool) in the Eliza, and she loved it...
Oh, and as an aside, I assembled a grill for out on the deck today as well (as if there wasn't enough to do)...

Saturday, May 30, 2009

Day two: Decking


Day one: posts and frames


Intel case study ranked #2, and me in the top 25 CTOs

I was Googling this morning and came across an internal Intel blog which ranked their server case studies. The joint case study we did on the 7400 series servers was ranked #2 for Q1 '09.
The work we have been doing on growth optimization has been attracting a great deal of media attention, and it has resulted in me being nominated for and awarded a place in the 2009 InfoWorld CTO 25. I am looking forward to reading what they report; the "winners" should be announced on Monday.

Thursday, May 28, 2009

NetApp Case Study

Yesterday NetApp published the joint customer case study we had been working on for the past few weeks, which describes both the server virtualization and storage optimization projects I am responsible for delivering for our division.
If you are interested in another blogger's balanced impression of USPv HAM, check out RupturedMonkey. I discovered him (Nigel Poulton) via a Tweet where Chris Evans described his irritation not at the HAM product but at the marketing hype. This reminds me of EMC and V-Max - but EMC's was on steroids by comparison...

It's not magic

I was reading Storagebod’s Blog today, and I have to agree with him: storage virtualization does nothing to recover space. However, and it is a big however, virtualization can provide you with a non-disruptive mitigation strategy, effectively an insurance policy, when you do implement those technologies that recover space: dedupe, thin provisioning, tiering. What do I mean? Virtualization can provide a non-disruptive mechanism to move data between filers or arrays, to move between tiers, or to leave a full storage device and migrate to one where there is more room. I am not comfortable with the idea of thin provisioning and tiering in the absence of virtualization (or another mitigation strategy), unless of course you are running applications where downtime for migrations is acceptable – in ours it is not.

Virtualization Forum

This morning I had the great pleasure of presenting the division’s virtualization initiatives to an illustrious group of over a thousand Minnesotan technologists at the Virtualization Forum. Prior to my talk they were treated to Srinivas Krishnamurti from VMware introducing vSphere 4; we then heard from an industry analyst on trends in the adoption of internal cloud virtualization technologies, as well as some lessons learned. The presentations seemed very well received. The audience seemed to enjoy having the ecological impact put in terms of planting 25 square miles of forest or removing 1500 Hummers from the highways, and the notion that virtualization provides a rare opportunity for the infrastructure and operations teams to make a huge positive impact on the business P&L ($300M) also seemed to resonate with them.
Next week I am travelling to Pebble Beach to present to a group of executives on the way to develop the business case for virtualization.

Why is HAM important?

Because it goes great with eggs!

Seriously, there have been questions in the storage community about why Hitachi’s High Availability Manager (HAM) is important. At first glance, it appears to be a feature only for select industries that require absolutely no downtime for mission critical applications. Even then, with today's enterprise subsystems reaching into six nines of availability, few are going to double their storage costs for a little more uptime per year.

Our reality is that the necessity for high availability is not driven by the uptime of a single subsystem, but by the availability of a virtualized fabric on the USPv platform. Environments that have begun, or are thinking about, virtualizing their SAN behind a USPv have traditionally faced a large obstacle: in 3-4 years, when they need to replace their virtualization engine, will they have to take down every host on the SAN? For most, the reason for virtualizing in the first place was to avoid exactly this type of service disruption. This shortfall was a stumbling block for some and a show stopper for others.

This is where the importance of the HAM software lies. Despite its unfortunate name, HAM gives virtualized SANs a way to upgrade to newer, faster hardware while keeping their servers online and functioning. The migration process happens in the following order. First, the storage is mirrored in the background and kept in sync between the old and the new hardware. Then the masks and zones are set up and the server picks up the paths to the new subsystem, so its configuration sees both. A failover is initiated from the old subsystem to the new, and then the old subsystem is removed… all online. This can be done at a LUN, host or entire subsystem level if the user so chooses.
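To show why the order matters, here is a toy model of those four steps in Python. It is only a sketch of the sequence described above, not Hitachi's tooling: the class, method names and the "old"/"new" labels are all made up for illustration, and the only thing it checks is that the host always has a path to whichever subsystem is serving I/O.

```python
class ToyMigration:
    """Hypothetical model of the migration order; not a real HAM interface."""

    def __init__(self):
        self.synced = False      # is the mirror old -> new in sync?
        self.paths = {"old"}     # subsystems the host can currently reach
        self.active = "old"      # subsystem currently serving I/O

    def mirror(self):
        # Step 1: storage is mirrored in the background and kept in sync.
        self.synced = True

    def add_paths(self):
        # Step 2: masks and zones are set up; the host now sees both subsystems.
        if not self.synced:
            raise RuntimeError("mirror must be in sync before presenting new paths")
        self.paths.add("new")

    def failover(self):
        # Step 3: failover from the old subsystem to the new one.
        if "new" not in self.paths:
            raise RuntimeError("host has no path to the new subsystem")
        self.active = "new"

    def remove_old(self):
        # Step 4: the old subsystem is removed, but only once it is idle.
        if self.active != "new":
            raise RuntimeError("cannot remove the subsystem still serving I/O")
        self.paths.discard("old")

    def host_online(self):
        # The host stays online as long as it can reach the active subsystem.
        return self.active in self.paths


migration = ToyMigration()
for step in (migration.mirror, migration.add_paths,
             migration.failover, migration.remove_old):
    step()
    print(step.__name__, "-> host online:", migration.host_online())
```

Run in the stated order, the host is online after every step; calling remove_old() first raises an error in this model, which is the disruption the real process is designed to avoid.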

HDS's announcement of High Availability Manager gives the USPv the mobility that it needs to be a true enterprise virtualization solution, and that is a true market differentiator in the enterprise class SAN space.

Tuesday, May 26, 2009

Lucas Mearian Interviews me

Last week I was interviewed by Lucas Mearian on behalf of Computer World about our storage optimization and server virtualization projects. You can read the transcripts here: Computer World Article or, if you prefer, it has already been reprinted in CIO magazine.

Monday, May 25, 2009

Before and after

These two pictures show the before and after of the soffits and fascia:
Today I finished the south side of the house and the garage. The east side of the house remains, but I won't be rushing to do that. J has finished priming the walls in E's bedroom and I need to paint the ceiling one evening this week.
We spent this afternoon furniture shopping. We bought 3 beds (including mattresses), four dining chairs, a sofa and chaise, an armchair, a chest of drawers, a bookcase, and a patio set of 6 chairs and a table. All are to be delivered next week, apart from the armchair, which will be custom made with the fabric we chose, and one of the beds, which apparently has to be built first. Oh, and I almost forgot: we bought a rug, which is rolled up in the back of my car waiting to be dropped off.
Lots still left to buy, but huge progress for an afternoon.
We know the dining table we want, and also a large cantilever umbrella, but they are in Edina and the store was closed today.

Sunday, May 24, 2009

Cabin Exterior

Today I did a few jobs inside until the neighborhood was awake. I fixed the end birch boards onto the kitchen cabinets in preparation for the countertop measuring. We have purchased granite counter tops and they are coming to measure for them soon. I spackled a few areas of the bedroom walls that needed tidying up and I helped J to empty E’s bedroom in preparation for her spending the day painting it. J finished the woodwork in the living room also.
Outside I removed the old rotting wooden soffits, repaired the wooden fascia where it was rotten and then proceeded to fit vinyl soffits around the deck area of the roof, and then finished the job off with vinyl fascia.
After lunch I moved on to the fascia above the master bedroom windows, and then the gable end of the roof. I was doing this at the end of a 17 foot ladder, standing very near the top in a most precarious position, banging in white aluminum nails at full stretch. In the picture of me on the ladder you can just make out a large vent by my head. While I was working there it felt like standing in front of an oven with the door open; the heat was just pouring out of the open vent. Inside, the house was nice and cool: the new loft insulation is doing a great job of keeping the heat out now.
I spent the remainder of the afternoon removing and then replacing the soffits and fascia along the south side of the house.
I have another 3-4 hours to go before the front is complete. I am not going to focus on the East side of the house yet, preparing for the new deck is a higher priority.
The overall look is very pleasing; as J stated, it no longer looks like the place is falling down.
I chatted with the neighbors today and showed them our plans for the landscaping, we talked briefly about their shed, apparently the neighbor on the other side is pretty mad at them and has asked them to move the shed back away from the lake, I offered my arms to help... just so long as it stays on that side of their yard ;)
This evening after showering I realized how much sun I had caught: my calves and forearms are stinging, as well as the back of my neck... more sun block tomorrow.

Sunset at Spring Lake


Shiny ball

On Friday the group of us that make up the TLT (Tech Leadership Team) met in New York to walk the corporate CTO through our Division and individual business unit plans, we had a set agenda to address areas such as key initiatives, risks, vendors, and trends.
I was responsible for the detail around the power initiatives, the infrastructure technology trends and the vendors.
For the first three hours of the meeting the corporate CEO attended; he is in fact my boss's boss's boss's boss. That is several pay grades…… There was a “rigid” dress code for the meeting, “business casual”, and it was interesting to see what that meant to each group. In NYC it seems to mean suit pants and dress shirt with no tie.
The meeting took place on the 30th floor in our New York Times Square office board room. When you look out the windows from the room you are at the same level as the great crystal globe that descends the pole for the New Year's Eve celebrations annually. It was very distracting as it changes patterns and colors constantly.
The previous night we had explored restaurant row and had been introduced to a Roman Israeli restaurant that served the most delicious soft shell crab dish and a very decent Barolo for a very decent price.
We were nervously anticipating problems getting home through LaGuardia as it was a holiday weekend. The flight was full, but we left and landed right on time.
One of the executives we were travelling with was married to our flight's co-pilot, which enabled our COO to sit in the cockpit and play captain while we were boarding (he looked like a kid in the candy store...).
I will blog later about the tech trends as they are worthy of note, but we are off to the cabin to get an early start on replacing the exterior fascia and soffits. Yesterday we fitted the baseboard and molding in the bathroom, and J painted the woodwork in K's room and the hall. I spent a great deal of time finding vinyl soffits and then getting them home. I should have taken a picture of the car loaded up; it was quite a sight, with 12 foot long boxes of vinyl sheets bending over the top. Fortunately no officers of the law were interested in me on the drive to the cabin...

Thursday, May 21, 2009

American idle

Last night those of us who chose to watch the finale of the hit television show American Idol, decided by 100 million text messages and phone calls, got to see how the demographics of the show’s viewers affect the outcome more than the talent of the competitors. There is no question that Adam was a league ahead of the eventual winner Kris, but unfortunately Kris was cuter, and cuteness trumps talent when dealing with the US youth…. After the show I had already forgotten whatever songs Kris had sung, but I could still hear Adam’s outrageously good vocals screaming from the stage. Fortunately Adam will live on, and I am fortunate enough to be taking K&J to see them on tour later this year. And of course you can guarantee Adam already has many contract options that he can sign up for to start what will definitely be a great career as an entertainer.

Sunday, May 17, 2009

3 days closer to finishing

Today we transformed the bathroom; it is now a light blue color and is awaiting its new baseboards.
The kitchen cabinets are now in their final place, attached to the walls. I fitted the under cabinet LED lights and then the trim to hide them along the bottom edge of the upper cabinets. I also fitted 20 handles to the doors; they are wavy lines that match the wavy lamp fixtures we have fitted above the kitchen island. My in-laws came by in the afternoon to help us. My mother-in-law painted the door frames and my father-in-law spray painted a bedroom door (outside). J also painted the baseboards and door frame in the kitchen and hall. We have made progress on emptying E's room, which is now the final target for the paint brushes. And the breezeway and garage are now collecting all the tools, paint etc.

Today the weather was fabulous and it made us realize why we had bought the cabin in the first place: the view from the deck was great. My final job of the day was to remove the railing, cut it down to a third of its original height and refit it, so that less of the view is obscured when you are sitting back in the seats.

First thing this morning I spotted a family of ducks, including a mass of brand new babies scurrying after the mother, it was interesting watching them navigate under around and over some of the docks sticking out into the lake, I see now why more regulation is needed to ensure the safety of all the lake's inhabitants.

Saturday, May 16, 2009

Tremendous cabin progress

J and I took Friday off work and we hit the paint brushes hard.
The living room now is complete except for one final coat of white gloss on the woodwork.
K’s bedroom is also complete apart from the woodwork needing painting. We finished painting her walls, ceiling, fitted the new oak laminate floor, fitted the baseboard and moulding, hung her closet doors and finally refitted the light.
In the bathroom we spackled, primed the walls with two coats of Kilz and painted the ceiling.
Tomorrow we intend to finish painting the woodwork in the living room and K’s bedroom, paint the bathroom walls their final color (very pale blue) and, last but not least, fit the kitchen cabinets (lower ones) to the walls so we can have them measured for the counter tops to be made.
I am really looking forward to removing all the plastic sheets that are lying everywhere covering floors and cabinetry.
Thankfully the last two days have been quieter in the datacenter, so no more core infrastructure snafus, and no emergency conference calls.

Thursday, May 14, 2009

Roof top signage

Wednesday, May 13, 2009

Response to the response

Wendy Mars from Cisco, in her video response to my blog post (and others), did not seem to respond to the other point I had raised, which related to the increased risk of placing all the eggs in one basket or, more realistically, placing 12 blades into one chassis, which is the implication of placing one of our new farms into a UCS chassis. She stated that UCS "actually helps that dynamic". How? By being more reliable? If there is a systemic failure in a UCS chassis, is that not a large failure domain? I suspect the response to this is that UCS is fully redundant and the probability of this type of systemic failure is remote. I thought the same of our well designed SAN fabric and Ethernet.
But despite this issue of failure domains, I suspect that UCS will indeed be a very viable solution to enable far greater consolidation than we are achieving today, due to its innovation in memory management and network unification. We will just have to keep a careful eye on the failure domains created and provide mitigation through FT, HA and SRM, plus whatever comes next.

Kindle killer coming?

Thanks to Larry for pointing me at this article about a rumored Apple product that is going to kill the Kindle. Two things that I dislike about what it says (or just implies): Apple's device will cost even more, and on top of that it looks like Apple won't be going the free Whispernet route, instead requiring a subscription to a 3G provider. I hope one can use it tethered instead and skip that part... I will also be interested to see if Apple can match the battery life and usability of the Kindle; I have my doubts.
I found this blog posting with a mock-up. As usual, Apple's looks cooler..

Failure domains continued

So I did not complete my train of thought from this morning, when I was discussing the issues surrounding core infrastructure outages. When you look at our big rules, they are designed to minimize failure domains by focusing on each individual component in the infrastructure and providing appropriate mitigations and problem scope management solutions. The network and SAN fabric are clearly infrastructure elements with far reaching failure domains. These are mitigated by building redundancy into the architectures: two fabrics, multiple paths through the Ethernet, redundant SAN directors, redundant Ethernet switches. A single physical instance failing is mitigated, and let’s face it, failures in either of these infrastructure elements are extremely rare. My post this morning demonstrated, though, how simple it is for human error to render all the redundancy useless. The same is true of the SAN: in the same twenty four hour period we also saw how the execution of a SAN fabric tool can impact the fabric's zones holistically and remove the path from all the hosts to their storage. So we have seen two instances in the past 24 hours where simple human error can create failure domains that span an entire network or SAN fabric.
In our current environment the consequence for our virtual server farms was a momentary emergency, with every VM using SAN losing contact with its storage. This made them very very unhappy…. Go figure.
Now turn the clock forward and imagine how SRM (Site Recovery Manager) can actually provide for an automated recovery from both these situations using a second remote site with a second independent SAN fabric and Ethernet network. So even core infrastructure failures can now be mitigated using a second site and SRM. Recovery would clearly not be instantaneous, but it took nearly 90 minutes to recover the environments impacted by the SAN fabric and Ethernet incidents. We don’t yet use SRM, but it is incidents like these that make the business case for investing in SRM or similar strategic technologies very real and justifiable.

Core infrastructure failure domains

So all our careful thought about failure domains sounds good, but there is one unfortunate problem we overlooked. We are dependent upon the SAN fabric connecting us to the SAN arrays and the Layer 2 Ethernet to connect us to the NAS filers.
Having an outage in both seems unlikely given the level of risk and change management that is applied to both infrastructure elements, or so you would think.
Last night we discovered an alarmingly simple way to impact VLAN configurations. We were adding an additional VLAN using a command much like this:
switchport trunk vlan add XYZ
Unfortunately this failed as it was the incorrect syntax. Instead this command was executed:
switchport trunk allowed vlan XYZ
The consequence of which was to overwrite all the existing VLANs and replace them with the new one only. The command should have been:
switchport trunk allowed vlan add XYZ
Why do I share this? Because losing all the VLANs resulted in all the VMs running on NAS going into a suspended state, as they lost communication with all the Network not so attached storage…..
How can I mitigate against such a broad failure domain? Any ideas?
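To make the difference between the two commands concrete, here is a small Python toy model of how the allowed list on a trunk changes. It is only an illustration: it understands just the two command forms quoted above with comma-separated VLAN numbers (no ranges), the function name is my own, and the VLAN numbers are invented.

```python
PREFIX = "switchport trunk allowed vlan "


def apply_trunk_command(allowed, command):
    """Return the new allowed-VLAN set after one command (toy model only)."""
    if not command.startswith(PREFIX):
        raise ValueError("command not understood by this toy model")
    args = command[len(PREFIX):].split()
    if not args:
        raise ValueError("no VLANs given")
    if args[0] == "add":
        if len(args) < 2:
            raise ValueError("no VLANs given after 'add'")
        # 'add' extends the existing list.
        return allowed | {int(v) for v in args[1].split(",")}
    # Without 'add' the existing list is replaced outright.
    return {int(v) for v in args[0].split(",")}


existing = {10, 20, 30}   # invented VLANs already carried on the trunk
print(apply_trunk_command(existing, "switchport trunk allowed vlan 40"))
print(apply_trunk_command(existing, "switchport trunk allowed vlan add 40"))
```

The first call leaves only VLAN 40 on the trunk, which is what happened to us; the second keeps 10, 20 and 30 and adds 40. A pre-change check that compares the set before and after, and refuses any change that removes VLANs unexpectedly, would be one way to catch this kind of slip.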

Failure domains

In our virtual server environment we have spent considerable time understanding the impact of failures. We have created a set of Big Rules to manage failure domains.
The first rule is 13>N>3 where N is the number of physical servers.
The second rule is Average CPU, Memory and IO <70% of max.
These two are a simplification of a more complex requirement to be in an N+1 state for physical server failures, but they provide guidelines to ensure we maintain that level of redundancy. As an example, a 4-node farm running at 70% capacity carries in effect 280% total load; if one physical host fails, this load gets spread across 3 hosts, each getting 93% of load – just survivable. In a 12-node farm running at 70%, you can survive up to 3 physical host failures and run at 93%, thus our 12-node farm would be at N+3. Our current build pattern is 6 physical nodes; this provides us N+1, and if we go to 7 nodes we would be at N+2. We are in the process of making the decision to increase our N to greater than 6, as we have experienced multiple hardware failures during the past quarter and we feel that N+2 is a more appropriate state to be in given the failure rate.
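The arithmetic behind those figures can be sketched in a few lines of Python. This is a simplification under two assumptions of mine: the load of a failed host is spread evenly across the survivors, and 100% is the ceiling a surviving host can carry. The function names are made up for the sketch.

```python
def load_after_failures(nodes, failed, utilisation_pct=70):
    """Per-host load in percent once `failed` hosts are lost (even spread assumed)."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("no hosts left to carry the load")
    return nodes * utilisation_pct / survivors


def tolerable_failures(nodes, utilisation_pct=70):
    """Largest number of host failures that keeps survivors at or below 100% load."""
    total_load = nodes * utilisation_pct
    failures = 0
    # Keep failing hosts while the remaining ones can still absorb the total load.
    while nodes - (failures + 1) > 0 and total_load <= 100 * (nodes - (failures + 1)):
        failures += 1
    return failures


for farm_size in (4, 6, 7, 12):
    k = tolerable_failures(farm_size)
    print(f"{farm_size} nodes at 70%: N+{k}, "
          f"{load_after_failures(farm_size, k):.0f}% per host after {k} failure(s)")
```

For 4, 6, 7 and 12 nodes this gives N+1, N+1, N+2 and N+3 respectively, matching the numbers above.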
Our third rule is to keep production and non-production VMs separate. It is clearly a no-brainer to ensure that unrestricted development activities do not impact production systems. Remarkably, when I took over responsibility for the VM environment this was not the case, and we are still dealing with the migration and separation of these two unique environments.
The fourth rule relates to the number of VMs that can connect to an individual storage device – currently we specify 500 as the max, this is currently an arbitrary number designed to limit the failure domain caused by a filer or array failing. We experienced one outage in Q1 where an array ASIC became overloaded and caused a widespread impact on all the VMs connected to that specific ASIC, this design problem has since been rectified but it showed us the danger of storage dependencies.
The fifth rule, which I think needs more thought, is a limit on the number of VMs in a farm to 400. Again we decided to limit the scope of a single Farm event. There has been a lot of discussion about there not being a farm wide failure domain due to the farm just being a logical grouping, however there are opportunities within Virtual Center to impact an entire farm and also there are shared network connections that may still mean there is a failure domain bounded by the scope of the farm.
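Taken together, the five rules are simple enough to check mechanically. The following Python sketch shows one way that could look; the Farm structure, its field names and the sample figures are all hypothetical, and only the thresholds come from the rules above.

```python
from dataclasses import dataclass, field


@dataclass
class Farm:
    """Hypothetical summary of one farm; field names are illustrative only."""
    hosts: int
    avg_cpu_pct: float
    avg_mem_pct: float
    avg_io_pct: float
    vm_count: int
    environments: set = field(default_factory=set)            # e.g. {"prod"}
    vms_per_storage_device: dict = field(default_factory=dict)


def big_rule_violations(farm):
    """Return a list of the Big Rules this farm breaks (empty list means compliant)."""
    problems = []
    if not 3 < farm.hosts < 13:
        problems.append("rule 1: host count must satisfy 13 > N > 3")
    if max(farm.avg_cpu_pct, farm.avg_mem_pct, farm.avg_io_pct) >= 70:
        problems.append("rule 2: average CPU, memory and IO must be below 70%")
    if "prod" in farm.environments and len(farm.environments) > 1:
        problems.append("rule 3: production and non-production VMs share a farm")
    for device, vms in sorted(farm.vms_per_storage_device.items()):
        if vms > 500:
            problems.append(f"rule 4: {device} serves {vms} VMs (max 500)")
    if farm.vm_count > 400:
        problems.append(f"rule 5: farm holds {farm.vm_count} VMs (max 400)")
    return problems


example = Farm(hosts=6, avg_cpu_pct=55, avg_mem_pct=62, avg_io_pct=40,
               vm_count=300, environments={"prod"},
               vms_per_storage_device={"filer-a": 300})
print(big_rule_violations(example))
```

Since the 500 and 400 limits are admittedly arbitrary, keeping them in one place like this would also make them easy to revisit.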

Cisco responding

Thanks to Spider for pointing out that Cisco picked up on a previous post about Cisco's UCS (Unified Computing System). If you don't believe me, check out minute 3:07 and listen to the speaker's comments.....

Sunday, May 10, 2009

Sheds

Yesterday I put the first coat of Ralph Lauren “Studio Cream” on the walls of the Kitchen, dining and living room areas, I finished edging the ceiling and then just to kill my arms I put a second coat of “Moon Mist” on K’s bedroom walls. This was a total of two gallons of eggshell rolled and brushed. I completed it all standing on the floor or a bucket (upside down) as I found it too time consuming moving a ladder around all day.
The neighbors to the west have pulled down their old shed and erected a new one. Interestingly, they erected it on the other side of their yard, and closer to the lake. From our point of view this is awesome news as it removes a huge eyesore from our vista. I am not sure, though, of the legality of what they did, nor how their other set of neighbors are going to respond when they get back from vacation….

One of the first things that comes to mind is the zoning ordinance that states “No building or structure shall be erected, converted, enlarged, constructed, moved or altered, and no building, structure or land shall be used for any purpose nor in any manner which is not in conformity with the provisions of this Ordinance and without a building permit being issued.”

I also doubt that a permit was applied for as required by this section: “No accessory building shall be constructed on a lot before a building permit has been issued for the principal building to which it is accessory.”

And the final killer:

“1. Structure and Individual Sewage Treatment System Setbacks from
Ordinary High Water Level: General Development Lake 75 feet”

So in my reading of the law the shed they built is now illegally placed and as they built new versus replaced their old they are in violation of the zoning regulations. To cap it all if they don’t rebuild the old one within 180 days they lose the right to rebuild it, so by my reckoning if the other neighbors complain and get the new shed torn down then neither shed could be rebuilt if 180 days expire…. interesting

Friday, May 8, 2009

I am not convinced

I quote from a recent Oracle press release:
"Why does Oracle, a company that prides itself on high margins, want to get into the low-margin hardware business? Are you going to exit the hardware business?
No, we are definitely not going to exit the hardware business. While most hardware businesses are low-margin, companies like Apple and Cisco enjoy very high-margins because they do a good job of designing their hardware and software to work together. If a company designs both hardware and software, it can build much better systems than if they only design the software. That’s why Apple’s iPhone is so much better than Microsoft phones."

Using the iPhone as a comparison is farcical if one wants to be in an open systems competitive world. The only reason the comparison becomes valid is if they want to lock everyone into a hardware platform that is not open to competition and becomes one of the highest priced solutions. Alternatively, maybe they want to start another religion, like Apple does with each group of product zealots they convert.

Wednesday, May 6, 2009

Daemon

Based on Jeremy's recommendation I read Daemon by Daniel Suarez. It is a novel that describes a concept under which a federated AI application is developed by a maniacal billionaire on his death bed in an attempt to alter the dynamics of our political and corporate based society. It is well written and manages to stay relatively close to practical reality, until the end when it gets a little out there....
I read this on the Kindle as an experiment in usability, and it passed the test: the Kindle is a great platform for reading. I read the book during three flights to and from Baltimore with zero eye strain, and I was able to adjust the font size to make it highly usable in all but the lowest light conditions.
My next test will be to read Julius Caesar, a much harder read....

Sunday, May 3, 2009

Jinxed

We went to the Metrodome today and watched the Twins take a beating from Kansas City. We had the pleasure of the company of twenty other team members and their family members in the company's box (including two of my new team). It was great to see so many kids enjoying the game, and their parents relaxing.
The game however was jinxed due to Shane declaring we had won the game in the sixth inning, at which point all hell broke loose and KC swept to victory. We might eventually forgive her.
The girls had a blast enjoying each other's company and making us adults just chauffeurs for the day.

Yesterday evening we took E to Punch Pizza in Eden Prairie for dinner, her (and by chance my) favorite pizza, authentic Italian brick oven. In case you care my favorite is the Vesuvio, spiced salami, saracene olive, cracked red pepper, pepperoncini, basil.
Whilst we were driving back the sky was darkening and I snapped this picture through the windscreen of the sun bursting through a crack in the clouds. One handed use of an SLR while driving, I am sure, contravenes too many Minnesota traffic laws to mention. J gave me hell, but the picture turned out well. It was a challenge to get the shot without a lamp post sticking through the middle, as we were going down highway 169 crossing the Minnesota River flood plains, where lamp posts line the bridge.
As a side note, I used this blog post to experiment with resizing pictures on the blog. You will notice that if you resize your browser window, the cloud burst picture resizes to match the column width. I know it's not rocket science, but it took me a while: using the attribute width="100%" and pointing to my flickr.com image that was 1024 wide enables a clear picture at most resolutions. Google's standard blog settings are for fixed widths, and since I have stopped using a static width template for the blog these did not make sense; now I have a little more control.

Saturday, May 2, 2009

Black Green Bag

Last week I started using a Voltaic Systems laptop bag. This is no ordinary bag: on the outside it has a 15W solar panel. The bag weighs more than my laptop, and it comes with a 58Wh Li-ion battery and every power converter accessory you can imagine. It will supplement but not recharge my Dell laptop, and it will recharge and/or supplement my Mac Air. On top of that it can recharge my iPod and iPhone, and I am hoping the new Kindle as well.
The bag itself is constructed using fabrics made from recycled PET, i.e. soda bottles. In theory recycled PET fabric is lightweight, extremely durable, UV resistant and water resistant. Most importantly, it uses significantly less energy to produce.
I doubt there is currently a greener bag out there.

Friday, May 1, 2009

Tom - on Digital Age

Tom, our CEO, was recently interviewed by James Goodale. It is well worth watching if you want to understand the way his mind works.



Thanks Doug for the link.

New Role

Changing roles is both exciting and saddening: exciting to be given the opportunity to solve new challenges (new to me anyway), saddening to lose responsibility for great people. The number of people within my organization will be radically smaller next week than this, but the importance and breadth of the responsibilities have not diminished. Several critical initiatives are underway that I have been asked to assume responsibility for: the establishment of a strategy for our data center expansion, the mitigation of our short-term power draught, and the migration of technology assets between our data center modules to balance the growth. Architecture and Strategy have been my passion, so now to be officially given the responsibility for them both is “cool”. There is nothing like the chance to learn to keep me engaged…. Watch out team, I am going to be engaged….
The teams that are moving out from under me are going to great homes, and their movement has nothing to do with their performance, which in my opinion has been stellar. Their new managers are great people and I have every confidence that the teams will continue to receive the focus necessary to make them successful.