Support for Nuage Networks solution
As some of you might know, I’m working at Nuage Networks, where we are building the next generation SDN solution. This is a full end-to-end platform to manage your data center networks in a very powerful and simple way. (Just visit the website for more information.)
Obviously, it needs a tiny bit of integration in the VM orchestrators and, of course, the first solution to be 100% compatible is Archipel. We will soon release the modules that will allow you to manage and use your Nuage Networks directly from your favorite solution.
Nuage Networks will also work (or is already almost working) with VMware, OpenStack, CloudStack and many more. But this is not the topic here :)
And this is a sneak peek of the UI I’m building for the Nuage Networks solution, the Virtualized Services Architect:
Beta 6 Apophis is now available!
I guess some of you have been waiting for a long time. But the wait is over: beta 6, named Apophis, is finally available today. The change list is huge as usual, but in a nutshell:
- Objective-J 2.0 (incredible performance boost)
- Web socket support to connect to XMPP from Archipel (another incredible performance boost)
- New UI theme
- Support for renaming VMs at the Libvirt level
- Support for libvirt disconnection/reconnection
- Full Xen Support
- Offline migration
- UI shows CPU usage for VMs
- Experimental support for SPICE
- Lots of performance improvements and bug fixes.
Full Change Log is available here
ANSOS, the live Archipel-ready OS, has also been updated to the latest ovirt-node version, in both its CentOS and Fedora flavors.
If you were a Beta 5 user, update as usual by running:
# easy_install -U archipel-agent
You might need to delete /var/lib/archipel/statscollection.sqlite3 and add some new tokens to the configuration file (just look at the log: it will tell you explicitly what is missing if the agent doesn’t start after the upgrade). Also be sure to clear your browser cache.
Share the love, and have a look at our new video on the home page
Some news about beta6
Beta 6 is late but for a good reason. As it has already been announced on the mailing list, this is due to the release of Objective-J 2.0 and Aristo2 in Cappuccino.
Aristo2 is the new theme, and it’ll give a fresh look to Archipel. But that’s not the most important part. The most important part is Objective-J 2.0. This is the new compiler used by Cappuccino (and so by Archipel) and it’s incredibly fast: up to 60% faster. It also helps fix a lot of problems, as it throws tons of warnings that went unseen before, so lots of long-standing bugs have been fixed.
We are spending some time to stabilize Cappuccino and Archipel before releasing Beta6. The key benefits are:
- Memory usage reduced with no more weird leaks (hopefully)
- New look
Stay tuned for the release that should be out in the next few weeks. And it will be huge!
New icon set and blog theme
In the latest master source code, I changed the toolbar icons in order to get a coherent set. They look a little bit more professional.
Also the blog theme has been updated in order to match the website.
I hope you will enjoy these new small things.
Maybe you didn’t notice, but all ads have been removed from the new website. Ads were not a very good way to make money. In two years of existence, they only brought in 70€, which is cool, but not great. Moreover, after thinking about where we could put ads on the website without destroying its beautiful face, I decided it would be better to just remove them.
But we still need a little money to keep the site up and running (not for the hosting itself, thanks to the cool guys from conemu who give the project a VM and bandwidth, but for the domain names and all that stuff). So please, if you have a Flattr account, do not hesitate to click!
Welcome to the new website!
No big deal, we just updated the website. We hope you will like this much more minimalistic design. The website is not WordPress-based anymore; it is just a bunch of HTML and JS (embedded in PHP files for easy includes). This website is also open source, and you can find the code here. If you find anything wrong, feel free to send a pull request!
Beta 5 and ANSOS available!
Today, we are pleased to announce that beta 5 of Archipel, named “Moon”, is now available. Well, actually it has been available since Sunday evening, but the announcement is made today because we were waiting to publish the first version of ANSOS at the same time. Some of you may ask yourselves: “What is ANSOS?”.
ANSOS stands for Archipel Node Stateless OS. It’s a live Linux OS based on oVirt Node. oVirt Node is what you use when you want to deploy a new hypervisor in an oVirt environment. ANSOS is what you use when you want to deploy a new Archipel hypervisor. The great thing about ANSOS is that it is not a separate project or a fork. All the changes I made in oVirt to make Archipel work have been merged into the oVirt Node master. This ensures great support from the Archipel team, the oVirt team and the community.
The major difference between previous versions of oVirt Node and ANSOS is that ANSOS can boot in a completely stateless mode. You basically set a small bunch of kernel arguments when you start the OS to tell it how and where to connect to a shared storage point for persistent data, and that is all. The embedded Archipel agent will do the rest: retrieving the default configuration and the specific configuration from this shared mount, launching any post-scripts that let you fine-tune the OS, etc.
It is available now, and you can download it from here.
Beta 5 “Moon”
Archipel itself has been heavily upgraded. There are a lot of bug fixes and new features. You can find the full change log here. The main ones are:
- Auto Grouping: Now hypervisors and virtual machines are automatically added to Shared Roster groups (needs the XMLRPC API for now)
- VM Parking: Store XML description in a virtual parking. The VM will be set offline and no agent will run it. Later you can remove it from parking and restore it on any hypervisor you like
- Full support for Character Devices: console, serial, parallel
- Multiple entities selection in the roster
- Extend libvirt API support in model (<video>, <os>, <clock>, <hostdev>)
- Support for creating drives in transient mode, shareable mode or read only mode
- Health module now computes stats in a web worker (thread)
- Health module now displays the amount of Shared Memory
- Tables with potentially tons of data (like users) are now using a new lazy loading API
- Improve keyboard support in VNC (French) and add some new layouts (“es”, “no”, “hu”)
- Preliminary support for configuring SPICE screen (no UI)
- Support of macvtap per Network and/or per VM’s NIC
- Improved Live migration error handling and sanity checks
- Great performance improvements by reducing the number of XMPP stanzas sent
- New versions of all frameworks (Cappuccino, StropheCappuccino, VNCCappuccino, TNKit, LPKit, GrowlCappuccino)
- New CLI tool named archipel-command that lets you send raw stanzas to Archipel entities
- New library named archipelcore.scriptutils to make your life easier when you create scripts to manage your Archipel platform
And obviously tons of bug fixes and other minor improvements. Everything is available here.
If you used beta 4, update as usual by running:
# easy_install -U archipel-agent
Note that you need to delete /var/lib/archipel/statscollection.sqlite3 before restarting Archipel or you’ll get errors in Health module.
We hope you will enjoy this release.
Common error tracking
I have noticed that most people don’t know how to deal with errors in Archipel. This post explains how to handle problems in Archipel, track down the origin of the errors, and fix them, or at least report a proper issue.
First let’s see how I organize my desktop. This is my personal preference, but I guess it’s a good starting point. I have two screens, and this is sincerely the bare minimum for me.
As you can see, on the first screen, I have my browser with the debugger open and the source code (this is obviously optional if you are just a user), and on the second screen I have two terminals open. One is connected to the hypervisors through SSH, with archipel.log displayed with tail (I use one tab per hypervisor), and the other one, on my local computer, is used to send updates.
As you can see, to update the Archipel agent code, a simple scp followed by a restart of Archipel is sufficient, because I have the source installed with buildAgent -d. This speeds up the code-and-try cycle. I also usually have another terminal SSH’ed to the hypervisors to send simple commands or fix some stuff quickly. If you use Mac OS X, you can create a Terminal profile to open this workspace with one click.
There are basically two main errors users are complaining about.
- Their hypervisors are offline
- They get some 501 errors
Usually, only a few things can cause these issues. They are almost always due to:
- a missing library like python-libvirt or xmpppy
- a misconfiguration in archipel.conf
- a problem with the ejabberd configuration
When you are debugging Archipel, you need to know that most errors are logged in archipel.log. If they are not logged, there is a good chance you have run into a new issue we are not aware of. To be sure you will see every kind of error, do not start Archipel with the init script! It will redirect all output to /dev/null. You should start and restart Archipel like this:
# killall runarchipel; runarchipel; tailf /var/log/archipel/archipel.log
Do not hesitate to grep! For example:
# killall runarchipel; runarchipel; tailf /var/log/archipel/archipel.log | grep -i error # only displays error lines
# killall runarchipel; runarchipel; tailf /var/log/archipel/archipel.log | grep -i "uuid@fqdn" # only displays logs from the VM with the given UUID
Be creative, and always try to filter the log, because it can be very verbose. Most of the functions prefix their log lines with an identifier. For example, if you only want logs for migration-related stuff, grep on “MIGRATION”; for the VMParking feature only, grep on “VMPARKING”.
This will help you to track down where your problem comes from.
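If you prefer to post-process a saved copy of the log, the same kind of filtering can be sketched in Python. This is only an illustration: the function name `filter_log` and the sample lines are made up for this example, and the real layout of archipel.log lines may differ. The only thing assumed is that the identifier (like “MIGRATION”) appears somewhere in the line.

```python
def filter_log(lines, *needles):
    """Yield the lines that contain every needle, ignoring case.

    This mimics chaining several `grep -i` commands.
    """
    lowered = [needle.lower() for needle in needles]
    for line in lines:
        text = line.lower()
        if all(needle in text for needle in lowered):
            yield line


# Hypothetical sample lines; the real log format may be different.
SAMPLE_LOG = [
    "INFO  MIGRATION: starting live migration",
    "ERROR VMPARKING: cannot park vm",
    "ERROR MIGRATION: destination unreachable",
]

for line in filter_log(SAMPLE_LOG, "migration", "error"):
    print(line)
```

Since `filter_log` accepts any iterable of lines, you can also pass it an open file object for /var/log/archipel/archipel.log instead of the sample list.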
The 501 errors
Note that 501 errors are not logged in archipel.log! A 501 error means the agent hasn’t reacted to the command, so obviously nothing will be logged. But it is logged on the client:
and in the browser console:
The browser console is an essential tool. It contains the very XMPP stanzas that have triggered the errors. For instance, if we look at the second error message, we see that the stanza is:
<iq xmlns='jabber:client' from='hypervisor@ramucho/ramucho' to='admin@ramucho/ArchipelController' id='4771' type='error'>
<archipel xmlns='archipel:hypervisor:health' action='logs' limit='50'/>
<error code='501' type='cancel'>
The feature requested is not implemented by the recipient or server and therefore cannot be processed.
This means that the hypervisor hasn’t handled the command archipel:hypervisor:health->logs. In this case, it’s easy to understand, because I have disabled the health module. So the XMPP stanza has not been handled.
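To read such a stanza quickly, you can pull out the three interesting pieces (the namespace, the action and the error code) with a few lines of Python. This is a minimal sketch using only the standard library. The function name `describe_error` is made up for this example, and the closing tags at the end of the stanza are an assumption added so that the XML parses. The other stanza lines are copied from the example above, with a space added before the type attribute on the first line.

```python
import xml.etree.ElementTree as ET

# Stanza from the browser console; the two closing tags are assumed.
STANZA = """<iq xmlns='jabber:client' from='hypervisor@ramucho/ramucho' to='admin@ramucho/ArchipelController' id='4771' type='error'>
<archipel xmlns='archipel:hypervisor:health' action='logs' limit='50'/>
<error code='501' type='cancel'>
The feature requested is not implemented by the recipient or server and therefore cannot be processed.
</error>
</iq>"""


def split_tag(tag):
    """Split an ElementTree tag like '{namespace}name' into (namespace, name)."""
    if tag.startswith("{"):
        namespace, _, name = tag[1:].partition("}")
        return namespace, name
    return "", tag


def describe_error(stanza_xml):
    """Return who answered, which command was refused and the error code."""
    iq = ET.fromstring(stanza_xml)
    info = {"from": iq.get("from"), "namespace": None, "action": None, "code": None}
    for child in iq:
        namespace, name = split_tag(child.tag)
        if name == "error":
            info["code"] = child.get("code")
        elif child.get("action") is not None:
            # The payload element carries the command that was refused.
            info["namespace"] = namespace
            info["action"] = child.get("action")
    return info


print(describe_error(STANZA))
```

The namespace tells you which module should have answered, so a 501 on archipel:hypervisor:health points you straight at the health module.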
When this happens unintentionally, it means that an error occurred during startup. Let’s manually remove a required token like “vnc_certificate_file”, which is required by the VNC module. When I start the agent, grepping on “ERROR”, I get this in archipel.log:
This message is pretty explicit. I have two virtual machines, 653b…@ramucho and 2590…@ramucho, which complain that they cannot load the VNC plugin because the “vnc_certificate_file” token is missing. This will not prevent the agent from starting, but if you try to access the VNC screen from the UI, you will get 501 errors. That’s pretty simple.
The hypervisor offline error
The first thing to do is to check if the hypervisor is really offline. To do so, ask the ejabberd server:
# ejabberdctl connected_users your.fqdn.com
It will list the accounts that are actually connected. This often helps to track down the root of the error. If you see your hypervisor online, well, you certainly made a typo while entering the JID when adding it to your roster. If it’s offline, you’ll see why in the log! There are plenty of possible causes, and most of them are logged. If it’s a more critical issue, you will certainly see an exception trace printed on STDERR (remember that we started Archipel manually).
It’s important to track down any problems before reporting an issue or asking for help on the IRC channel. Most of the time, these errors are extremely easy to fix, but users report them as “it doesn’t work”. That doesn’t help much, and we lose a large amount of time explaining this. So before reporting any problem:
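If you have many hypervisors, checking that list by eye gets tedious. Here is a minimal sketch of how the check could be scripted, assuming the command prints one full JID (user@domain/resource) per line. The function names and the sample output are made up for this example, and lowercasing the JID is a simplification of the real XMPP comparison rules.

```python
def bare_jid(jid):
    """Strip the resource part and normalize case (a simplification)."""
    return jid.split("/", 1)[0].strip().lower()


def is_connected(connected_users_output, jid):
    """Tell whether a JID appears in the text printed by the command above."""
    wanted = bare_jid(jid)
    return any(
        bare_jid(line) == wanted
        for line in connected_users_output.splitlines()
        if line.strip()
    )


# Hypothetical output, built from the JIDs used earlier in this post.
SAMPLE_OUTPUT = """admin@ramucho/ArchipelController
hypervisor@ramucho/ramucho
"""

print(is_connected(SAMPLE_OUTPUT, "hypervisor@ramucho"))
```

Comparing bare JIDs is a deliberate choice here: the question is whether the hypervisor account is online at all, whatever resource it uses.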
- Check the archipel.log
- Check the JS console
- Check your network connectivity
- Be sure you have not made any typo! (seriously, this happens quite often)
- Try some simple tests to rule out possible causes
As a final note, we greatly appreciate users who send crash reports. But please consider adding a very short description of your last action before the crash. Otherwise, depending on the browser, the trace may not be usable.
It’s been a while since the last post. A lot of stuff has been added to Archipel since. So I guess it’s time to explain what changed.
First of all, we worked hard on overall UI performance. Archipel has grown in complexity, and some modules were just too resource greedy. We have added some controls to check whether a given request is already waiting for a response, and if it is, Archipel just replaces it with the new one, instead of receiving two answers, one useful and one ignored. The majority of these improvements have been implemented at the lowest possible level. We also worked on optimizing the permissions caching. The result is a much smoother user experience.
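The “replace the pending request” idea can be sketched roughly as follows. This is not Archipel’s actual code (the UI is written in Objective-J); Python is used here only for illustration, and the class name `PendingRequests`, its methods and the shape of the key are all hypothetical. In this sketch, stale answers are simply dropped when they arrive, which is one possible way to get the behaviour described above.

```python
class PendingRequests:
    """Keep at most one pending request per key; a newer one replaces the older."""

    def __init__(self):
        self._pending = {}  # key -> (request id, callback)
        self._next_id = 0

    def send(self, key, callback):
        """Register a request; any request already pending for this key is replaced."""
        self._next_id += 1
        self._pending[key] = (self._next_id, callback)
        return self._next_id

    def receive(self, key, request_id, payload):
        """Deliver an answer only if it belongs to the latest request for this key."""
        entry = self._pending.get(key)
        if entry is None or entry[0] != request_id:
            return False  # stale answer: the request was replaced, so drop it
        del self._pending[key]
        entry[1](payload)
        return True


# Two quick refreshes of the same view: only the second answer is used.
requests = PendingRequests()
answers = []
key = ("hypervisor@ramucho", "archipel:hypervisor:health", "logs")
first = requests.send(key, answers.append)
second = requests.send(key, answers.append)
requests.receive(key, first, "old logs")
requests.receive(key, second, "fresh logs")
print(answers)
```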
Also, all the potentially very big tables, like the XMPP users one, now use a lazy loading system. The Archipel UI only asks for the small chunks of data it needs to display, and asks for the rest later if needed. This allows you to have virtually any number of XMPP users you like without impacting performance. (There are still some places to optimize, though.)
Now about features, we have several new things:
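To give an idea of how this kind of lazy loading works, here is a minimal sketch. Again, this is an illustration in Python and not the actual Objective-J implementation: the class name `LazyRows`, the `fetch(offset, limit)` callback and the chunk size of 50 are assumptions made for this example.

```python
class LazyRows:
    """Expose a big remote table as a sequence, fetching one chunk at a time."""

    def __init__(self, fetch, total, chunk_size=50):
        self._fetch = fetch          # callable(offset, limit) -> list of rows
        self._total = total          # total number of rows on the server
        self._chunk_size = chunk_size
        self._chunks = {}            # chunk number -> rows already fetched

    def __len__(self):
        return self._total

    def __getitem__(self, index):
        if not 0 <= index < self._total:
            raise IndexError(index)
        chunk_number = index // self._chunk_size
        if chunk_number not in self._chunks:
            # Only ask the server for the chunk that contains this row.
            offset = chunk_number * self._chunk_size
            self._chunks[chunk_number] = self._fetch(offset, self._chunk_size)
        return self._chunks[chunk_number][index % self._chunk_size]


# Fake server side: 120 users, and a log of which offsets were requested.
ALL_USERS = ["user%d@example.org" % i for i in range(120)]
requested_offsets = []


def fetch_users(offset, limit):
    requested_offsets.append(offset)
    return ALL_USERS[offset:offset + limit]


rows = LazyRows(fetch_users, len(ALL_USERS))
print(rows[0], rows[10], rows[119])
print(requested_offsets)
```

Displaying the first rows and then jumping to the end only triggers two requests; the chunk in the middle is never fetched unless the user scrolls there.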
- Dynamic platform administrators: it’s now super easy to grant any user the full platform admin permissions: one click, real time.
- Support for bandwidth limitation per virtual machine.
- VMParking: You can “park” a virtual machine in a PubSub node. The VM doesn’t exist anymore anywhere. Then you can un-park it on any hypervisor you like. (you can also change the XML description of a parked VM)
- Support for character devices: manage the serial/parallel/channel devices for your VMs.
- Auto group: If you use XMLRPC API, you can set Archipel to automatically add VMs and hypervisors into shared roster groups.
- The dependency on python-magic, which caused a lot of installation problems, has been removed. A correct version is now embedded in the agent.
- Lots of bug fixes in the roster management
All these features are now available in the master head. But you have to note that you need to update everything. Older versions of agents/GUI are not compatible anymore with these changes.
Talk about Archipel at French event JRES 2011 in Toulouse
On 2011/11/24 at 17h45, Emmanuel Blindauer will give a talk about Archipel at the French event JRES 2011 in Toulouse. You have certainly met him virtually in the #archipel IRC channel as Mooby. He will give an overall presentation of Archipel: what it does, how it does it, and what its advantages and drawbacks are (but there are no drawbacks). It will be a short talk, but if you are around Toulouse, feel free to go.