Broken Promises

Can we get on with building the software yet?

  • IaaS APIs - Raw Infrastructure vs. Contextual Configuration

    • 10 Apr 2012
    • 0 Responses
    • api cloud

    The recent AWS "compatibility" debates, as well as my own early participation in the DMTF's Cloud Management Working Group, have me thinking about the IaaS API space quite a bit these days. One thing we're not really discussing as an industry is the top-level approach to IaaS API design.

    The way I view this is that the primary question is one of pure infrastructure utility vs. contextualizing the infrastructure resources. Simply using the standard AWS EC2 API is an example of the pure infrastructure utility, and the plethora of solutions that layer on top of the standard EC2 configurations proves that this is a viable utility-based model. Many of these higher-level solutions aim to solve the contextual relationship issues that arise out of any non-trivial consumption of EC2 resources. Amazon clearly realized that context is critical as well, since they have provided what they hope are the canonical API solutions for adding context with the introduction of the VPC extensions to the EC2 base API and the creation of their CloudFormation wrapper on top of their ever-growing infrastructure utility options.

    Looking at the VMware side of the ecosystem, the approach was to start with context first.  Within the vCloud API, the VDC is king and all activity takes place within the context of the VDC.  This same focus on context can be seen in the configuration complexity supported within the OVA/OVF format for vApps. One "could" argue that the base vSphere API is a "poor man's" answer to supporting a more utility-focused approach (similar to EC2), but that would be a bit of a stretch.

    There are ways to create context from a low-level utility API, just as there are ways to force contextualized APIs to behave like they are without context (think about a very large VDC that you don't add any networking complexity to).  The basic difference between discrete infrastructure components and composed environments is the level of contextualization delegated from the provider to the consumer. To me, context is king for the users in the end, but deciding where that context is applied remains a question that each user needs to answer for themselves.
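
As a rough illustration of creating context on the consumer side of a utility-style API, here is a minimal sketch. Everything in it is hypothetical: `UtilityApi`, `Resource` and `Environment` are made-up names, not part of any provider SDK. The point is only that the grouping lives with the consumer, while the API itself hands back unrelated resources.

```ruby
# Hypothetical sketch: none of these classes correspond to a real provider SDK.

# A discrete, context-free resource, as a utility-style API would return it.
Resource = Struct.new(:id, :kind)

# Stand-in for a low-level utility API: every call creates one resource
# and knows nothing about how resources relate to each other.
class UtilityApi
  def initialize
    @counter = 0
  end

  def create(kind)
    @counter += 1
    Resource.new("#{kind}-#{@counter}", kind)
  end
end

# Consumer-side context: the consumer, not the provider, records which
# resources belong together.
class Environment
  attr_reader :name, :resources

  def initialize(name, api)
    @name = name
    @api = api
    @resources = []
  end

  def add(kind)
    resource = @api.create(kind)
    @resources << resource
    resource
  end

  def of_kind(kind)
    @resources.select { |r| r.kind == kind }
  end
end

api = UtilityApi.new
staging = Environment.new('staging', api)
staging.add(:network)
2.times { staging.add(:instance) }
```

A contextualized API would move something like `Environment` to the provider side, which is roughly the role the VDC plays in the vCloud model.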

    Have we put enough thought into these distinct approaches to understand the implications fully?  Amazon seems to have done the right thing, by providing both styles of consumption.  Should other providers follow suit?  That depends on the types of customers, ecosystem partners and use cases they want to support.

  • We're Missing the Point on IaaS Cloud Platform API Compatibility

    • 9 Apr 2012
    • 2 Responses
    • AWS cloud cloudstack

    The cloud industry press and analysts are missing a significant point about AWS API compatibility in the various IaaS cloud management software packages. Who is, or isn't, AWS API compatible isn't the question.  Forgive me if I pause to consider the implications of a purely "follow the leader" approach, but how does AWS API compatibility serve as a competitive advantage?  It doesn't...  but it might be part of what makes a provider viable for consumers to use.

    Technical strategy aside (perhaps a future post), what's the right way to think about all this?

    Some Business Context:

    Lots of debate going on around the business strategy involved here, but this is critical stuff.  

    Lydia Leong had a great post about the macro-level battle lines that are being drawn, but the comments below the post raise some very interesting concerns about using or providing AWS APIs. I especially noted Sam Johnston's point about AWS API licensing and patent indemnification (or open license) being a concern. Jim Plamondon (Rackspace developer relations) has an interesting comment on that post as well (taken with a large grain of salt... his comment is pretty doom and gloom), focused on the question of how Amazon's API strategy will impact its competitors in the long term.

    Dan Woods raises similar concerns in this piece on Forbes.  Dan wants Amazon to answer the following: Are there limits to the use of Amazon’s APIs? How will community experience inform the evolution of Amazon’s APIs? What is the process that will govern the evolution of the Amazon APIs?  Great questions, but I'm doubtful that Amazon will be answering them in the near term.  They just don't have a real reason to do it today.

    Here's how I've internalized the basic issues being discussed:  

    1. Amazon's APIs aren't openly licensed
    2. Amazon's APIs aren't community driven
    3. Amazon's APIs don't allow competitive providers to provide differentiation and extension

    When you think about this from Amazon's perspective, I think there are two sides to consider.  First, they certainly don't want competing public IaaS cloud providers to grow to a level of prominence that could beat them.  Today, it's hard to argue that Amazon isn't the market leader.  But we're still VERY early in the IaaS market, and holding the leadership position in a rapidly changing industry is quite difficult.  Patents, copyright and license terms surrounding their APIs are a weapon in this battle - even if Amazon hasn't chosen to use them as weapons yet.

    Second, Amazon does want to continue to grow the ecosystem of services running on top of AWS, consumers of their APIs (think RightScale), and even cloud management stacks that are intended for private cloud deployments. AWS licensing their APIs to Eucalyptus is part of this story, since AWS doesn't (today) have a private cloud option for customers that have a valid reason for mixed use of cloud tech.

    What's a Cloud Service Provider to Do?:

    Providers need to remain flexible.  If the battle for an ecosystem is based on which API is supported in the platform, shouldn't we be focused on how to effectively provide more than one API to our customers?

    I've actually seen some pretty confusing (and lots of inaccurate) analysis of the "CloudStack AWS compatibility" claim.  Just to be clear about it, CloudStack supports some AWS API compatibility through the use of Citrix's CloudStack CloudBridge system.  Note here - this is not part of the default CloudStack installation, but an add-on product (in good news, it's already Apache 2 licensed).  I actually think Cloud.com (prior to the Citrix purchase) made the right call here.  API compatibility begs for the use of the API facade pattern.  
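
To make the facade idea concrete, here is a minimal sketch of one platform core fronted by two API dialects. It is not how CloudBridge is implemented; `CloudCore`, `Ec2StyleFacade`, `NativeFacade` and the size mapping are all hypothetical. The only names borrowed from a real API are the EC2-style request parameters `ImageId`, `InstanceType` and `MinCount`, and the sketch ignores the rest of that request format.

```ruby
# Hypothetical sketch: CloudCore and both facades are illustrative names,
# not CloudStack or CloudBridge classes.

# The platform's single native implementation.
class CloudCore
  attr_reader :vms

  def initialize
    @vms = []
  end

  def deploy_vm(template:, size:)
    vm = { id: "vm-#{@vms.length + 1}", template: template, size: size }
    @vms << vm
    vm
  end
end

# Facade that accepts EC2-style request parameters and translates them
# into native calls. The size mapping is invented for illustration.
class Ec2StyleFacade
  SIZE_MAP = { 'm1.small' => :small, 'm1.large' => :large }.freeze

  def initialize(core)
    @core = core
  end

  def run_instances(params)
    count = params.fetch('MinCount', 1).to_i
    size = SIZE_MAP.fetch(params.fetch('InstanceType'))
    Array.new(count) do
      @core.deploy_vm(template: params.fetch('ImageId'), size: size)
    end
  end
end

# A second facade exposing the native dialect over the same core.
class NativeFacade
  def initialize(core)
    @core = core
  end

  def deploy(template, size)
    @core.deploy_vm(template: template, size: size)
  end
end

core = CloudCore.new
Ec2StyleFacade.new(core).run_instances(
  { 'ImageId' => 'ami-12345678', 'InstanceType' => 'm1.small', 'MinCount' => '2' }
)
NativeFacade.new(core).deploy('ubuntu-10.04', :large)
```

Because each dialect is a thin translation layer, dropping or adding one (for licensing reasons, say) doesn't touch the core.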

    Before cloud.com was acquired by Citrix, Peder Ulander (@ulander) was interviewed by virtualizationpractice.com.  He said, "We are also going to be adopting the API schema and tools approach to deliver a truly interoperable cloud environment – by that I mean the ability to create, manage and run applications that support APIs from clouds like Amazon, Rackspace, vCloud, Citrix, etc." Peder was on the right track. That's exactly the approach that providers should be taking today.

    Should cloud providers support the Amazon API today?  Yes, if they want to ride the AWS ecosystem wave.  They need to be cautious though, since copyright, license and patent infringement concerns are sure to raise their heads in the future.  Competitive cloud providers need to be designing their platforms for compatibility, but need to ensure that they have alternate strategies either in place today or possible in a very short window.

  • "Continental Shifts" in the IaaS Industry

    • 5 Apr 2012
    • 0 Responses

    There have certainly been lots of fun announcements made, licensing agreements signed, releases announced (one, two, three in fact), committed developers reacting to changes, and general rumblings in the world of open source IaaS cloud management software in the last few weeks (not to mention some not-so-accurate titles on press blog posts). I'm certain we're going to see lots more of this activity in the coming weeks.  For anyone who's actively involved in the cloud ecosystem, this is a time for strong strategic thinking about your place in the market, and (as with any time of change) there will be lots of opportunities to succeed.

    I've been paying particular attention to those that are analyzing these changes through a broader lens.  Specifically, Lydia Leong has a great post focused on the CloudStack / OpenStack debate.  Simon Wardley (a fantastic resource for this type of strategic thinking) had a great guest post on Forbes about using open source as a weapon and how the ongoing "stack wars" are exactly that.

    Even more interesting...  I like seeing things laid out in quantitative numbers.  Qingye Jiang (John), the Director of Cloud Computing Platform at Hainan Tianya Networking Technologies, Ltd had a great analysis of the community activity surrounding the top 4 OSS cloud stacks.

    The battle for the title of "platform of choice if you aren't Amazon" has only just begun, and the hearts and minds of platform developers, providers and consumers are at stake (not to mention lots of potential profit for those that make the right moves).  It should be a fun ride!

  • Review of the StackOps OpenStack Distro 0.3 Smart Installer

    • 15 Feb 2012
    • 2 Responses
    First Impressions

    In December of 2011, StackOps released version 0.3 of their OpenStack distribution, following the Diablo stable release that occurred in late September.  I recently went through the process of using the StackOps Smart Installer to get a single node instance up and running, and wanted to share my experiences.

    First, I love the general idea of simplifying the installation and configuration of the OpenStack components.  At least for a proof of concept, this is a great tool to get you started with the platform.  Unfortunately, after both using the installer and reading through all of the StackOps documentation on the subject, I'm left with concerns about the whole Smart Installer model itself.  I'd love for StackOps to rethink some of their base assumptions and decisions, so that they can move forward with a more compelling model.

    Basically, the Smart Installer concept is to start with the StackOps distribution ISO image (or a bootable USB drive with the same material on it), perform a fairly standard Ubuntu Server 10.04 LTS setup process, and then use their Smart Installer system to get the node up and running.  

    My first general concern is directly tied to the way that the Smart Installer works.  The newly installed server hosts an installation control process, via a web server listening on TCP port 8888 and running on the server itself.  When a user hits that URL, some information about the server is passed off to the StackOps-hosted Smart Installer landing page.  Here, you're asked to register and then log into their site.  While I'm sure that StackOps would say that they treat information about the registered systems and users with care, I got that nagging feeling that exposing the configuration details of a production OpenStack platform to a third party might not be the best thing for a provider to be doing.  I do like the fact that they make it clear how users can opt in or out of receiving commercial sales contact and email from the company when registering.  They also make it possible to perform the installation without registering a user account, which theoretically means that your server configuration data isn't stored permanently.  I'm just not sure that these options are enough to get me past the idea that the system data has to be passed to the distribution vendor.

    Another general concern with the Smart Installer approach is scalability.  I wasn't able to determine how the installation process would work when done at any significant scale.  For example, assume that you have a compute pod with 80 hosts.  That's a lot of website navigation to get all 80 configured.  It seems like Dell's Crowbar system is much more useful for large scale deployments.  I'd love to find out that StackOps has a similar approach!

    Beyond the major concerns of data security and installation process scalability, there are a few minor issues that I ran into while performing the installation:

    Installation Progress Tracking

    When you get to the end of the configuration data input process, the actual installation and configuration of OpenStack is like a black hole process.  First off, when you click the "Start deployment now!" button, there is no visual indicator anywhere on the page to tell you where the installation process is in its steps, or even if the process has actually started!  The button itself doesn't even reflect the fact that it was clicked (I tested on Chrome, so perhaps this works on other browsers?).  The only indication you get that something is happening is what appears to be a very long page loading process.

    [Image]

    That is, until you get an error...

    Error Reporting

    The Smart Installer configuration process makes it clear that OpenStack Horizon support is experimental, but it's enabled by default.  Unfortunately it doesn't seem to work at all.
    [Image]

    My issue isn't so much about this experimental feature, but the fact that error conditions are so poorly reported to the user when they DO occur (I ran into volume manager errors as well, but that was my fault for not reading the documentation thoroughly enough).  When a problem happens, you see this:

    [Image]

    That's a pretty ugly user error report, with no obvious way to deal with the condition.  The trick to fixing this is to use the browser back button to go backwards through the configuration process, and then uncheck the option to install Horizon.  You can then go forward again, re-filling in the required data for the screens that you went past on your way back.

    Online Help / Support Confusions

    For the actual Horizon installation issue, I reported it on the StackOps Forum.  I'm not sure what to make of the forum though, since it's pretty empty in there!  The other place that I see StackOps directing users is to the StackOps environment on Get Satisfaction.  I couldn't find any reference to the problem on that site.

    I did end up finding it in the StackOps Jira tracking system (which I have to say is nice to be able to access as a user):  STACKOPSDISTRO-51  You have to use your installer.stackops.org account to access the Jira Issue.

    Thinking About the Future

    I believe that the StackOps team is doing a great job of taking the young OpenStack projects and creating a reasonably useful distribution for fast proof of concept environments.  I truly respect that they are only on version 0.3 of the StackOps Distro, and I know how long a road it can be to get to a version 1.0.  Hopefully, my general concerns about the model can be addressed over time (while still keeping the current, or similar, approach in place for the small scale non-production environments).  They should be happy about what they have built, but they shouldn't be satisfied yet.  Good luck StackOps team!
  • Understanding the Impact of Multi-Tenancy

    • 14 May 2011
    • 0 Responses
    • IaaS cloud computing

    As you evaluate different cloud providers, it is important to understand the different concepts providers can use to deploy multi-tenancy. Different concepts facilitate—or limit—the way in which a provider can respond to changes in the service needs of clients.

    General Purpose Clouds:

    For example, some vendors design their clouds as commodities. They focus on providing low cost access to computing power in flat, homogeneous environments. This type of general purpose cloud can scale quickly and easily to support large numbers of similar users. As these clouds become saturated, however, you may begin to see variations in performance, as some users expand their usage and experience spikes that place constraints on all other users.

    Performance variations can affect computing power, storage and I/O or network traffic. Most providers already have solved performance problems associated with sharing VM RAM and CPU power, and most have deployed one or more of the many solutions for storage and I/O performance issues. Consequently, network performance is usually the first noticeable bottleneck. While it is important to know how your provider will handle performance variations wherever they appear, it is especially important to know how network issues will be handled. 

    The Concern: Network Latency:

    Networks experience varying levels of latency based on where the users and their data reside and how much bandwidth has been allocated to each user. The easiest solution to network issues within a cloud is to physically separate heavy users from lighter users. This means moving the heavy user to a private cloud where resources can be adjusted to meet the requirements of peak periods, more users and new applications. 

    The Answer: Scalability and SLA:

    To reduce your risk of incurring more costs from your cloud provider, look for an enterprise provider that has scalability at every level of the cloud—SaaS, PaaS and IaaS. And look, too, for a provider offering a Service Level Agreement that addresses the performance requirements for the services most important to your business. These are the attributes of an enterprise level provider with the elasticity to meet your future needs.

     

  • Cloud Compatible Development

    • 13 May 2011
    • 0 Responses
    • IaaS cloud computing

    One of the largest benefits that an application developer can get out of a cloud-based infrastructure is the opportunity to design for variable scale.  Specifically, you can start off small (with a limited number of virtual machines, using limited host resources), and then expand your environment as usage grows.  Conversely, you are able to shrink your infrastructure consumption during non-peak times. While some of this flexibility can be applied to existing legacy applications, the real win can be for newly developed systems.   

    To get this benefit, there are some fundamental architectural principles that need to be followed:  loose coupling of system components, distributed system design and automated application installation / configuration. A solid architecture should reasonably scale from fitting the entire application onto a single VM to sharing it among hundreds (or even thousands) of VMs supporting the users. 

    To achieve the loose coupling and distributed design goals, you need to decompose the architecture into units of functionality and think through how they will distribute work within the system.  Each component should be designed to support multiple instances of that application service within the environment. By doing this, you can load balance the application load as you need to scale. 
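
As a small sketch of what "multiple instances of that application service" buys you, here is a round-robin pool that spreads requests across whatever instances are currently registered. The class and the instance names are hypothetical, and a real deployment would normally put a proper load balancer here rather than application code.

```ruby
# Hypothetical sketch: the pool class and instance names are made up.
class RoundRobinPool
  def initialize(instances)
    @instances = instances.dup
    @next = 0
  end

  # Scaling out is just registering another instance of the same service.
  def add(instance)
    @instances << instance
  end

  # Hand back the next instance in rotation.
  def pick
    instance = @instances[@next % @instances.length]
    @next += 1
    instance
  end
end

pool = RoundRobinPool.new(%w[app-1 app-2])
first_round = Array.new(2) { pool.pick }
pool.add('app-3')
second_round = Array.new(3) { pool.pick }
```

This only works if any instance can serve any request, which is exactly why the component has to be designed for multiple instances up front.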

    This decomposition should happen at all layers. It does no good to scale out web servers if a singular application server will become the performance bottleneck.  And definitely be sure to think through a scaling strategy for your databases.  If you plan on using a traditional relational database platform (RDBMS), consider setting up your identity columns in a way that will support future distribution of load through sharding techniques.  Another alternative is to use multiple read replicas, with a single write-enabled database instance. If you plan on going the route of NoSQL, be sure that you understand the scaling dynamics of the selected platform.
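
Here is one way those two database options could look, as a minimal sketch. It assumes an interleaved id scheme (each shard hands out every Nth id), which is only one of several ways to make identity columns shard-friendly, and it ignores what happens when the shard count changes later. The class names and host names are hypothetical.

```ruby
# Hypothetical sketch: shard counts and host names are made up.

# Each shard hands out ids from its own interleaved sequence, so ids never
# collide across shards and the owning shard can be derived from the id.
class ShardIdSequence
  def initialize(shard_index, shard_count)
    @shard_count = shard_count
    @next_id = shard_index + 1
  end

  def next_id
    id = @next_id
    @next_id += @shard_count
    id
  end
end

# Recover the owning shard (zero-based) from an id issued by the scheme above.
def shard_for(id, shard_count)
  (id - 1) % shard_count
end

# Alternative: one write-enabled primary plus rotating read replicas.
class ReplicaRouter
  def initialize(primary, replicas)
    @primary = primary
    @replicas = replicas
    @reads = 0
  end

  def host_for(operation)
    return @primary if operation == :write

    host = @replicas[@reads % @replicas.length]
    @reads += 1
    host
  end
end

seq = ShardIdSequence.new(1, 4) # the second of four shards
ids = Array.new(3) { seq.next_id }
router = ReplicaRouter.new('db-primary', %w[db-replica-1 db-replica-2])
```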

    Achieving automated application installation and configuration builds on your distributed design. The key to ensuring that you can achieve this architectural goal is to classify virtual machines into roles.  Role definitions will let you relate one server to the other servers in the environment. Using a “web server” role as an example, perhaps any server in that role needs to know what database server to connect to.  And just to relate this idea back to the point about determining a database scaling design, that database target might be different for different web servers. 
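
A minimal sketch of role definitions, assuming a tiny in-memory inventory. The role names, package names, host names and the `group` field are all hypothetical. It shows the web server example from above: each web server resolves its own database target, and that target differs per server.

```ruby
# Hypothetical sketch: role names, host names and settings are made up.
ROLES = {
  'web' => { needs: ['db'], packages: %w[nginx app-frontend] },
  'db'  => { needs: [],     packages: %w[postgresql] }
}.freeze

Server = Struct.new(:name, :role, :group)

INVENTORY = [
  Server.new('web-1', 'web', 'a'),
  Server.new('web-2', 'web', 'b'),
  Server.new('db-a',  'db',  'a'),
  Server.new('db-b',  'db',  'b')
].freeze

# Resolve, for one server, the peers that its role says it must know about.
# Peers are matched within the same group, so different web servers can
# point at different databases.
def peers_for(server)
  ROLES.fetch(server.role)[:needs].flat_map do |needed_role|
    INVENTORY.select { |s| s.role == needed_role && s.group == server.group }
  end
end
```

For example, `peers_for(INVENTORY[0]).map(&:name)` gives the database that `web-1` should connect to.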

    Once you have a good understanding of how you plan to deploy a highly-distributed version of your system, it's time to automate your installation and configuration. These are critical tasks if you want to achieve value from a dynamic infrastructure environment, because you need to match the speed that you can install and configure an application with the speed that you can provision new infrastructure.  Your software should be installable via command line, and you should look at different options to automate the configuration of the installed applications.  
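
One simple option for automating configuration is to render config files from a template at provisioning time. The sketch below uses ERB from the Ruby standard library and assumes Ruby 2.5 or later for `result_with_hash`. The template contents and setting names are hypothetical.

```ruby
require 'erb'

# Hypothetical sketch: the template and settings are made up.
TEMPLATE = <<~CONF
  database_host = <%= db_host %>
  worker_count = <%= workers %>
CONF

# Fill the template from a settings hash; each key becomes a local
# variable inside the template.
def render_config(settings)
  ERB.new(TEMPLATE).result_with_hash(settings)
end

config = render_config(db_host: 'db-a.internal', workers: 4)
```

The rendered string can then be written to disk by whatever script runs when a new VM comes up, so a freshly provisioned server needs no manual setup.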

    While you may want to take these concepts to the extreme, my best advice for a new application architecture is to start simple.  Let these ideas guide your design, but remember, your main goal is to get the new application deployed for your users!

  • More Cloud Foundry Environment Stuff

    • 20 Apr 2011
    • 0 Responses
    • PaaS Ruby cloud foundry

    Here's a really simple application that can help explore the runtime environment of a Cloud Foundry provider (Sinatra ruby apps only, obviously):

  • Cloud Foundry Kept it Simple

    • 16 Apr 2011
    • 0 Responses
    • PaaS cloud foundry

    I've been spending a good amount of my time in the evenings this week reading through the Cloud Foundry source code (available on github). I'll admit that I've caught the bug... the team at VMware has done a fantastic job of keeping the foundational system as simple as possible.  I say that both from a user's perspective, and from looking through the internals of the code itself. To me, this is exactly the right way to solve a problem.  Start with a simple solution to a generalized use case, and then make it work.  From there, it's a matter of refinement and feature extension.

    What I think is most powerful about the platform's approach is that it is fundamentally based on the idea that the app runtime environment, supporting service and platform provider options can (and should) grow independently.  It's only been a few days, and the community has already provided the Cloud Foundry team with pull requests to add in JRuby and Erlang support.  I have to imagine that new service support will quickly follow as well.

    In terms of cloud providers, VMware made the right decision to host an instance of the platform in their own environment (which, combined with their new responsibility to host Mozy for parent EMC, is another topic altogether), but is fully expecting to see other providers offer differentiated versions of the platform. As Ezra Zygmuntowicz (@ezmobius) put it, "(VMware) want(s) this to be the kernel for the cloud, not only our cloud". Unlike vCloud, the openness of Cloud Foundry is what will make it more palatable to cloud providers, because it gives them numerous opportunities to establish differentiated solutions and offerings around the base platform.

    Will VMware be able to avoid some of the governance and political issues that Rackspace / OpenStack have run into? Rackspace and OpenStack appear to have gotten through that little rough spot, but I certainly hope that VMware learned from their experience.

    For a quick overview of the internals, take a look at @igrigorik's post on the Cloud Foundry architecture.  I also found this presentation (shared by Dave McCrory - @mccroy) targeting developers to be quite useful:


    Disclosure:  I work for a cloud platform provider, but the views in this post are mine alone.  They do not reflect those of my employer.

     

  • Early Discovery of #cfoundry

    • 15 Apr 2011
    • 0 Responses

    I got access to cloudfoundry.com late last night, but just had a chance to start playing around this morning. First off, I love the simplicity of the developer experience... at least for Hello World style applications. I'll have to dig into it further, to start exploring how the services are implemented and how application instances scale.

    Coming from an IaaS development background, one of the first things I was interested in digging into was the runtime details for the platform. I decided to extend the VMware Hello World ruby app, and have it return some details about the base OS supporting the app instance.

    Here's the simple code:

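    require 'sinatra'

    # Report basic details about the OS hosting this app instance.
    get '/' do
        # Backticks shell out and capture stdout; head keeps each dump short.
        processor = `head /proc/cpuinfo`
        memory = `head /proc/meminfo`
        # Note: the kernel exposes swap areas at /proc/swaps. /proc/swap does
        # not exist, which is why the swap section of the output is empty.
        swap = `head /proc/swap`
        linuxversion = `head /proc/version`
        disks = `head /proc/partitions`
        appuser = `whoami`
        net = `cat /etc/network/interfaces`
        # Assemble a simple HTML page from the captured command output.
        output = 'local OS user: <br />' + appuser + '<br /><br />processor: <br /><pre>' + processor + '</pre><br /><br />memory: <br /><pre>' + memory + '</pre><br /><br />swap:<br />' + swap + '<br /><br />ver: <br />' + linuxversion + '<br /><br />disks:<br /><pre>' + disks + '</pre><br /><br />networking:<br /><pre>' + net + '</pre>'
        output
    end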

    Here's the deployment process. The mem reservation property seems to be the only configuration that will affect the selection of an appropriate VM to host the app instance.

    $ vmc push
    Would you like to deploy from the current directory? [Yn]: Y
    Application Name: test1
    Application Deployed URL: 'test1.cloudfoundry.com'? 
    Detected a Sinatra Application, is this correct? [Yn]: 
    Memory Reservation [Default:128M] (64M, 128M, 256M, 512M, 1G or 2G) 64M
    Creating Application: OK
    Would you like to bind any services to 'test1'? [yN]: 
    Uploading Application:
      Checking for available resources: OK
      Packing application: OK
      Uploading (0K): OK   
    Push Status: OK
    Staging Application: OK                                                         
    Starting Application: OK

    And here's what I get out of it, after deploying. Assuming that I don't turn it off or break this over time, you can hit the application live at http://test1.cloudfoundry.com/.

    local OS user: 
    vcap-user-11 
    
    processor: 
    processor        : 0
    vendor_id        : GenuineIntel
    cpu family        : 6
    model                : 37
    model name        : Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
    stepping        : 1
    cpu MHz                : 2660.000
    cache size        : 12288 KB
    fpu                : yes
    fpu_exception        : yes
    
    
    memory: 
    MemTotal:       16470448 kB
    MemFree:        12623164 kB
    Buffers:          217692 kB
    Cached:          1959284 kB
    SwapCached:            0 kB
    Active:          2368488 kB
    Inactive:         980064 kB
    Active(anon):    1171648 kB
    Inactive(anon):      164 kB
    Active(file):    1196840 kB
    
    
    swap:
    
    
    ver: 
    Linux version 2.6.32-30-server (buildd@crested) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #59-Ubuntu SMP Tue Mar 1 22:46:09 UTC 2011 
    
    disks:
    major minor  #blocks  name
    
       7        0     131072 loop0
       8        0    1049600 sda
       8        1     999023 sda1
       8       16   33554432 sdb
       8       17   16474657 sdb1
       8       18   17077095 sdb2
    
    
    networking:
    auto lo
    iface lo inet loopback
    
    auto eth0
    iface eth0 inet static
        address 172.30.49.74
        network 172.30.48.0
        netmask 255.255.248.0
        broadcast 172.30.55.255
        gateway 172.30.48.1
  • Non-obvious should be a Non-starter

    • 31 Mar 2011
    • 0 Responses

    Steve Jin (a great guy, who develops and runs the VI Java project, without which the vSphere API would have beaten me down more than once) has a great perspective on patentability of software architectures over on doublecloud. While I agree with his point about patent law, the main reason I point this post out is the value he places in "obvious" architecture being the right architecture.

    This falls in line with my thoughts about building systems with an eye toward the future, but a focus on the present. We, as an industry, need to stop over-engineering things. Focus your time on achieving the system requirements, and just get it built already.

    To me, if you want to patent something, make it a feature! Isn't that the real value we provide to customers? Sure, coming up with a scalable, robust, performant and extendable design is hard work. But the design constraints dictated by the features we want to build now (and in the future) should absolutely lead to obvious approaches to building the system.

    Now this sentiment doesn't necessarily match every situation, but VERY few software projects warrant a non-obvious answer. In fact, if it's non-obvious, then that should really be due to the FEATURE being novel. Patent that.

    If your high level architecture meets the patent law rule of being non-obvious, you're doin' it wrong.

  • About

    I'm a system architect that likes to actually build stuff. The views here are mine alone.
