Wednesday, August 31, 2016

Ccache Updates

Some great news for ccache users such as myself.

First, the EPEL rpm of ccache was recently updated to 3.2.7 - welcome news for users of CentOS 7 and similar distributions. For installation instructions and/or more information about ccache in general, I recommend reading my Faster Re-Compiling with ccache post.

Second, ccache 3.3 was just released with substantial new features including support for caching of:
  • CUDA
  • Fortran 77
  • multiple arch “fat binaries”
  • some assembly code (gcc -S)
This is of course in addition to its existing support for C, C++, Objective-C, and Objective-C++. While the vanilla code appears to have an issue (‘make test’ fails) under CentOS 7, I’m confident this will be resolved in due time (unfortunately I don’t personally have time to investigate further until next week).

Happy trails with your frequent rebuilds!

Tuesday, August 2, 2016

First Thoughts: Windows Subsystem for Linux – Bash on Ubuntu on Windows

For me the most exciting new feature in the Windows 10 Anniversary Update is the Windows Subsystem for Linux, which underpins Bash on Ubuntu on Windows. Long and confusing branding aside, this is an exciting move by Microsoft and an impressive piece of technology. WSL allows native Linux ELF binaries to run on the Windows kernel – let that sink in: not a virtual machine/Linux kernel, and not a recompile like Cygwin (which I have long used). WSL performs real-time translation of Linux system calls to run on the Windows kernel, with practically no CPU or IO performance penalty. While this has been available to Windows Insiders for a few months, I have stuck with the release version of Windows for stability reasons – with today’s Anniversary Update this beta technology is now available to all desktop Windows 10 installations! Microsoft does caution that this particular feature is understandably beta and intended for command-line developer usage, not running server software or X11 applications, which fortunately fits my intended usage perfectly.

Installation


Unless you are a Windows Insider, you will need the Aug 2nd Anniversary Update (Build 14393+) installed. I haven’t tested, but you may also need to be in Developer mode (found under Settings -> Update & Security -> For Developers).

Next install Windows Subsystem for Linux from Windows Features.
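If you prefer the command line, the same feature can also be enabled from an elevated PowerShell prompt; this should be equivalent to ticking the box in Windows Features:

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux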

After the reboot, open a regular Windows command prompt and type ‘bash’; Windows will then download and install Ubuntu on Windows. After being guided through creating a Linux user account, you will be ready to try it out!


First Thoughts


A few interesting first thoughts after giving it a try.
  1. Currently Ubuntu on Windows is based on Ubuntu Trusty Tahr 14.04.4 LTS. This is a venerable and well-known, if slightly older, distro. I’ve read that in the near future they will be moving to Ubuntu Xenial 16.04 LTS, which will be a fresh and welcome update.

  2. ‘uname’ reports the kernel as 3.4.0+, so while Microsoft is clear they don’t support all system calls at this time, they appear to be targeting a fairly recent kernel version:
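uname -r
3.4.0+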

  3. After limited testing, apt-get update and apt-get install appear to be working normally. Iperf is a perfect example of a tool you won’t want to run from a VM for IO performance reasons, so prior to this a Windows binary or a Cygwin build were the only options. Here is a quick example installing it:
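sudo apt-get update
sudo apt-get install iperf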

Overall I am very excited and look forward to continuing to explore WSL and Ubuntu on Windows. The only major feature omission that I can see currently being a drawback for many developers is the lack of X11 support for remote Linux work. While I don’t normally make use of them, for folks who prefer an X11 IDE or a remote web browser for testing, I can see it being a showstopper. Perhaps/hopefully this will be something coming in the near future as well.


Wednesday, July 20, 2016

Vagrant 1.8.5 and VirtualBox 5.1 Networking Performance

I’m not trying to make this a Vagrant/VirtualBox blog, but as these are tools I currently use almost continuously, new versions, features, and tips to increase their performance are obviously of great interest.

With Vagrant 1.8.5 released Monday and now supporting VirtualBox 5.1, I’ve had some time to conduct a few informal network performance tests. The good news is that network performance with the Virtio paravirtualized driver appears to be even better: an informal test showed 471 Mbits/s versus the previous 361 Mbits/s of VirtualBox 5.0.24 - same hardware, both using Xenial.

Additionally, I’ve come across an even easier method to enable Virtio through Vagrant. Simply add the following line to your Vagrantfile inside the VirtualBox provider block:
vb.customize ["modifyvm", :id, "--nictype1", "virtio" ]
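For context, a complete minimal Vagrantfile might look like this (the box name is just an example):

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/xenial64"  # example box
  config.vm.provider "virtualbox" do |vb|
    # Request the Virtio paravirtualized driver for the first NIC
    vb.customize ["modifyvm", :id, "--nictype1", "virtio"]
  end
end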

Thursday, July 14, 2016

VirtualBox 5.1 released - but Vagrant 1.8.4 doesn't support it

UPDATE 7/19/2016: Vagrant 1.8.5 has been released with VirtualBox 5.1 support and other fixes (changelog).


VirtualBox 5.1 was released on Tuesday with several improvements/fixes that *I’m* eager to try out, given I use a Windows host with Linux guests:
  • The new APIC and I/O APIC improvements, which I hope will continue to increase network performance.
  • Use of x2APIC for Linux guests and many other related fixes
  • No longer relies on DKMS for Linux guest kernel module rebuilding – I haven’t had the opportunity yet to dig into exactly what this entails, but I am excited by the possibility, as at least in RHEL/CentOS, DKMS is not available by default, although it is in EPEL.
  • The new NVMHCI-compatible storage controller
One major note though: Vagrant 1.8.4 does NOT support VirtualBox 5.1. The good news is this should be coming soon, as support was committed to the Vagrant master source on Wednesday. Let’s hope the next Vagrant release arrives soon!

Wednesday, July 6, 2016

Virtio Para-virtualized Network Driver with VirtualBox

UPDATE 7/20/2016: I recommend you also read my more recent Vagrant 1.8.5 and VirtualBox 5.1 Networking Performance post with updated network performance numbers from VirtualBox 5.1 and a better method to enable Virtio with Vagrant/VirtualBox.


While emulators, including VirtualBox, have traditionally utilized emulated hardware devices, today there is another option. Modern Linux kernels (>= 2.6.25, unless your distro backported it) support paravirtualized network device drivers which are optimized for use in an emulated guest. This avoids much of the performance loss and other compatibility problems of emulating hardware devices. To illustrate the advantages, here are some quick iperf numbers from a Xenial image running under Windows 10 in VirtualBox 5.0.24:
  • Intel PRO/1000 MT Server: 236 Mbits/sec
  • Virtio Paravirtualized Network: 361 Mbits/sec
Anecdotally, the performance of the Virtio driver also appears more consistent (the hardware emulation mode sometimes "hiccups"), with lower general CPU utilization.
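For reference, numbers like these come from a simple client/server iperf pair; a minimal sketch (the server address is an assumption for your network):

# Server side (run on one end, e.g. the Windows host):
iperf -s
# Client side (run on the Linux guest), pointed at the server's address:
iperf -c 192.168.1.10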

Using the Virtio Para-virtualized Network Driver

Utilizing the Virtio Para-virtualized network driver is quick and easy in the current version of VirtualBox.
  • Confirm your Linux guests are configured to use the KVM paravirtualization provider – this should be the default in recent VirtualBox versions. This also enables paravirtualized clocks and SMP spinlocks.
  • Change your network adapter type to ‘Paravirtualized Network (virtio-net)’.

  • Reboot and your Linux guest will automatically use the new driver. To confirm, ‘lspci’ should now show your network card as ‘Ethernet controller: Red Hat, Inc Virtio network device’, for example:
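lspci | grep -i virtio
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
(The PCI slot number will vary by configuration.)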
For more information about the Virtio driver, or VirtualBox networking in general, please see Chapter 6 (Virtual Networking) of the VirtualBox User Manual.

Wednesday, June 29, 2016

Developing locally in Linux boxes using Vagrant, vagrant-vbguest, and VirtualBox


If you have read many of my previous entries, I’m sure you are aware that Linux is my primary development environment. For local development projects I use Vagrant and VirtualBox so that I’m able to quickly and easily make use of a variety of Linux distros (CentOS 7, Ubuntu LTS, and sometimes Debian). While I use these tools almost daily in Windows, they are fully supported under OS X and Linux as well.

Vagrant

Vagrant deserves a longer article of its own, so I’ll keep this brief. Simply put, Vagrant creates and configures a variety of virtual development environments using any number of virtualization backends, including local hypervisors, the cloud, or even containers. With Vagrant installed I can bring up a new Linux development environment with just two commands.

VirtualBox

Many are probably already familiar with Oracle’s excellent VirtualBox virtualization product. For running local Linux development images, I find that VirtualBox 5 meets all my requirements and performs favorably against competitors, with the added bonus of being free for all purposes (the separate VirtualBox Extension Pack is only free for non-commercial use, however I do not normally utilize its features regardless).

vagrant-vbguest

The vagrant-vbguest plugin for Vagrant is the missing “glue” that makes using Vagrant and VirtualBox together even easier: it automatically updates the guest additions for Linux guests, making it even simpler to create and update new Linux development boxes and/or update VirtualBox itself. The recent release of vagrant-vbguest version 0.12 was one of the impetuses for this post, and I’m pleased to mention that yours truly is credited with the first fix in the changelog: https://github.com/dotless-de/vagrant-vbguest/blob/v0.12.0/CHANGELOG.md

Getting Started

Installation

Install VirtualBox: https://www.virtualbox.org/wiki/Downloads
Install Vagrant: https://www.vagrantup.com/downloads.html
Install vagrant-vbguest:

vagrant plugin install vagrant-vbguest

Provision and Start your Linux development box

In a new dedicated folder:

vagrant init ubuntu/trusty64
vagrant up --provider virtualbox

While the box is usable now, I normally recommend a box reload on first run to load any new kernels or kernel modules.

vagrant reload

Enter your new Linux development box

vagrant ssh

Congratulations!

You are now using your new Linux development box. When you are done, exit as normal. The image can be suspended with ‘vagrant halt’ and restarted/accessed with ‘vagrant up && vagrant ssh’. When you’re done with the image, ‘vagrant destroy’ will remove all traces. Additional options to configure your box can be found in the 'Vagrantfile' created in the directory where you initialized the box. I frequently increase the memory allocation and the number of virtual CPUs, but customizations to the network or even startup scripts are easily available, for example:
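Inside the Vagrantfile's VirtualBox provider block (the values are just examples - tune them to your machine):

config.vm.provider "virtualbox" do |vb|
  vb.memory = 4096   # MB of RAM (example value)
  vb.cpus = 4        # number of virtual CPUs (example value)
end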

The ‘vagrant’ command has a large number of other useful functions, so I encourage you to check them out. I used trusty64 in the example above, but plenty of additional prebuilt box images can be found at https://atlas.hashicorp.com/boxes/search including popular Debian 8, Ubuntu LTS, and CentOS 7 boxes.

Wednesday, June 22, 2016

gdistcc v0.9.x released!


After intensive development I’m pleased to announce gdistcc v0.9.x has just been released!

v0.9.x includes some major new features including:
  • In addition to CentOS 7, gdistcc now supports Ubuntu 16.04 LTS, Ubuntu 14.04 LTS, and Debian 8. Additional distros should only require proper entries in the new settings.json file and an appropriate startup script.
  • Instance- and distribution-specific configuration has been moved to its own settings.json file, making additions and customizations easy.
  • ‘gdistcc {status, make}’ now checks whether an instance has been terminated (Google’s preemptible instances) and excludes it as appropriate; ‘gdistcc stop’ removes them as normal.
  • Pump mode is NOT used, which greatly reduces instance setup time since system headers are no longer required on the instances. This also increases compatibility and makes better use of ccache's pre-processed header caching.
  • gdistcc's GitHub page now uses Travis CI (I might write a future post dedicated to this particular topic).
  • gdistcc is now published on PyPI and can be easily installed with just ‘pip install gdistcc’.

Tuesday, June 14, 2016

Introducing gdistcc – the easy way to compile in the cloud!

I’ve recently been making a sizeable number of contributions to HHVM, an open source virtual machine for both PHP and Facebook’s Hack language. Given it’s a “robust” code base, a fresh compile can take quite a while on my laptop. Earlier I wrote an entry about using ccache to cache compiled objects for reuse, great when moving between branches or release vs debug code. However, even with make and ccache doing their best, with an active project such as HHVM some updates (especially major header changes) require nearly a full recompile of the code base.

Enter my newest project gdistcc, which automates distributed compiling on Google Compute Engine with economical preemptible instances. While distcc has long existed to provide distributed compiling of C/C++/Objective-C code for those with access to multiple servers, gdistcc makes it easy to provision, compile, and shutdown any number of preconfigured instances for everyone!

Full details on installing and using gdistcc can be found at gdistcc.andrewpeabody.com, however here is a quick example for a ccache-enabled ‘make’ project once gcloud is set up and gdistcc is installed.

Start 16 gdistcc instances
gdistcc start --qty 16
Build the project (in the root of your ‘make’ project)
gdistcc make
Stop the gdistcc instances
gdistcc stop
That’s it! Currently gdistcc is still pretty rough around the edges and only works on CentOS 7, however I plan to add support for Ubuntu in the near future, and any Google Compute Engine preferred distro should be pretty easy to add.

There are a number of limitations; the largest is that I have chosen (currently) not to make use of distcc’s pump functionality for a number of reasons:
  • Without pump, the headers are processed on the local host, which means system headers are NOT needed on the instances; this significantly speeds their installation/startup/configuration and reduces the amount of data that needs to be transferred over the internet (gdistcc uses ssh for security reasons – internal distcc clusters make use of the faster TCP mode).
  • Without pump, all that is needed on the instances is an identical version of the distcc server and the C/C++/Objective-C compiler, which means long term I may develop a more “universal” backend instance.
  • Given that gdistcc currently requires ccache (I think it unlikely I will eliminate that requirement), the headers are frequently pre-processed and cached by ccache anyway, so pump mode is less of an advantage.

I’ll be adding more posts about gdistcc as development progresses.

Wednesday, June 8, 2016

Transparent Executable Compression

Recently I was curious about the underlying design of 3v4l.org, which is a great website for comparing behavior between different versions of PHP and HHVM. If you do work in PHP, I highly recommend exploring 3v4l.org, as there are almost certainly more behavior differences between the various versions than you expect. As part of this service, 3v4l.org needs access to a large number of executables to test; enter the Ultimate Packer for eXecutables (UPX).

UPX initially compresses the executable with UCL, a specialized algorithm that allows the decompressor to be just a few hundred bytes of code, requires no additional memory, is very fast, and decompresses in place. The best part is the wide support of executable types, including most Linux (ELF), DOS/Win32, and Mac OS X formats.

Getting Started with UPX under CentOS 7


UPX is available from RPMforge; if you already have RPMforge enabled (or wish to), it is as easy as:
sudo yum install upx
Otherwise, if you wish to install it directly:
sudo yum install http://apt.sw.be/redhat/el7/en/x86_64/rpmforge/RPMS/upx-3.91-1.el7.rf.x86_64.rpm http://apt.sw.be/redhat/el7/en/x86_64/rpmforge/RPMS/ucl-1.03-2.el7.rf.x86_64.rpm
Compress an executable:
$ upx hhvm
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2013
UPX 3.91       Markus Oberhumer, Laszlo Molnar & John Reiser   Sep 30th 2013

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
  84749966 ->  21371364   25.22%   linux/ElfAMD   hhvm

Packed 1 file.
That’s an impressive reduction from 85MB to just 21MB! For those looking for maximum effect (and with time to spare), give the “--best” option a try; the “-l” option can also be used to get details on an already compressed executable. For example:
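# Pack with maximum compression (slower):
upx --best hhvm
# List details of an already packed executable:
upx -l hhvm
Happy Compressing!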

Monday, June 6, 2016

Multithreaded linking with ld.gold, currently minimal benefit

When properly using ‘make’ with a large development tree, I generally find the single largest bottleneck during compilation quickly becomes the linking stage. While most distributions, including CentOS 7, now include ld.gold for a substantial performance boost over the traditional ld, faster performance would always be beneficial. Imagine my excitement when I learned that ld.gold has a multithreaded mode as a possible way to increase speed without new hardware! However, by default ld.gold (part of binutils) is built with it disabled under CentOS 7:
ld.gold --threads
ld.gold: warning: ignoring --threads: ld.gold was compiled without thread support
GNU gold (version 2.23.52.0.1-55.el7 20130226) 1.11
Originally I thought this entry would end up being a recipe on how to rebuild binutils from an srpm with “--enable-threads” to enable multithreading. Unfortunately, after testing this myself and some informal benchmarking with 4 threads, the actual speed increase was only about 1-2% - not worth the possible side effects in my opinion. For reference, the relevant knob when building a vanilla binutils yourself looks roughly like this (an untested sketch - the version number is just an example):
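tar xf binutils-2.25.tar.bz2
cd binutils-2.25
./configure --enable-gold --enable-threads
make

That said, the version of binutils/gold included with CentOS 7 is heavily patched and a few versions behind, so it’s possible a newer/vanilla version of ld.gold might show greater benefit in threaded mode. In particular, devtoolset-4 from Software Collections comes with a more vanilla ld.gold 2.25, so once I’m ready to move to gcc 5.2 I might conduct this experiment again. Finally, if you have seen different behavior with multithreaded ld.gold, I would love to hear about it.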

Friday, June 3, 2016

Modern Development Tools in CentOS 7 using Software Collections

I currently use CentOS 7 as my Linux Development environment for two major reasons:
  1. Personal - RedHat was the first Linux distribution I installed/used (RedHat 4.2 if you are curious – and I don’t mean RHEL 4), so while I have frequently dabbled with other distributions such as Gentoo and Ubuntu, it’s always been my “Linux Home”. After RedHat “transitioned” into Fedora Core, I started using RHEL 3 professionally before moving to cAos/CentOS.
  2. Life Cycle - Once a server is built (even more so with fleets of servers that need to be binary compatible) it’s very difficult to implement a forklift OS upgrade. Therefore, it is very common to see a physical server run the same distribution version for its entire deployment, and with virtualization often the length of the application deployment! While Ubuntu has more recently tried to address this shortcoming with Long Term Support (LTS) versions, this is one of the major reasons RHEL/CentOS is so popular with its 9-10 year support life cycles.
One major drawback to a long life cycle is (to put it mildly) a rather stale developer toolchain. This is often cited as a major reason for the rise in popularity of Ubuntu with developers. Fortunately, RedHat finally addressed this shortcoming in RHEL a few years back with the introduction of Software Collections (SCL) and the Developer Toolset. SCL enables newer software versions to be installed and used on RHEL/CentOS (and here is the important part) without disturbing the default system tools, therefore preserving compatibility. The great news is SCL is available for CentOS 7, providing access to the modern Developer Toolset!

Getting Started with Software Collections & Developer Toolset under CentOS 7


SCL is already included in CentOS extras, so it is a single command to install and enable:
sudo yum install centos-release-scl
Next, there are currently two available versions of Developer Toolset for CentOS 7: v3 and v4. Version 3 includes GCC 4.9 and version 4 includes GCC 5.2. If you decide to “live on the edge” with GCC 5.2, be aware that GCC 5.1+ uses a new ABI that might cause compatibility issues; one possible option is gcc’s -D_GLIBCXX_USE_CXX11_ABI=0 flag if you need to link to system libraries that weren’t compiled with GCC 5.1+. That flag is applied per compilation unit, roughly like this (the file name is hypothetical):
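g++ -D_GLIBCXX_USE_CXX11_ABI=0 -c mycode.cpp
In this example I’m using version 3, as I’m currently working with some source code that isn’t yet gcc 5 compatible.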
sudo yum install devtoolset-3-toolchain
Alright, you are ready to use the SCL version of gcc and other tools! To enable and test, just type:
scl enable devtoolset-3 bash
gcc --version
gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
ld.gold --version
GNU gold (version 2.24) 1.11
When you are done, it’s easy to just exit the SCL shell and return to the default system tools.
exit
gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
ld.gold --version
GNU gold (version 2.23.52.0.1-55.el7 20130226) 1.11
This is just a small glimpse of the power of Software Collections, so I encourage you to visit their website at https://www.softwarecollections.org to learn even more!


Wednesday, June 1, 2016

Faster Re-Compiling with ccache

Recently I’ve been making a number of contributions to HHVM, an open source virtual machine for both PHP and Facebook’s Hack language. Frequently my work has included squashing bugs such as segfaults or other incompatibilities. This type of work often involves frequent recompiles: after code modifications, to move between release and debug branches, etc. While “make” is frequently used by projects to limit recompiles to the parts of the program that have changed, it only tracks the most recent build, and often a “make clean” is needed to remove incompatible object files. Additionally, “make” is only able to reuse object files in a single source tree, so builds in other source trees or on other computers can’t take advantage of its selective recompiles.

The good news is there IS a tool that can help address all of these shortcomings: ccache, from the author of Samba. Ccache analyzes and stores your object files during compilation so they can be automatically substituted from the cache in future cases where the same compiler, options, and code are present. The best part is ccache guarantees that the cache will provide the same result as the real compiler, including warnings, and will fall back to the real compiler if there is any ambiguity. While ccache is limited to single-file C, C++, Objective-C, or Objective-C++ compiles with GCC-style compilers, it will transparently pass other languages, multi-file compiling, and linking on to the real compiler.

Installing & Using ccache under CentOS 7

Ccache can be found in the terrific EPEL (Extra Packages for Enterprise Linux); if you don’t already have EPEL enabled, it is just a single command:
sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-6.noarch.rpm
Install ccache
sudo yum install ccache -y
By default the ccache package installs symbolic links such as /usr/lib64/ccache/{gcc,g++,etc} that point to ccache and can be used directly in place of the various supported compilers. If desired, you can put these symbolic links in your PATH before your real compiler and use ccache automatically; however, for this example we’ll assume you don’t wish to do that.
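For reference, that automatic approach is typically just a PATH tweak along these lines:

# Put the ccache compiler symlinks ahead of the real compilers:
export PATH=/usr/lib64/ccache:$PATH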

For small compile jobs you can now directly use /usr/lib64/ccache/{gcc,g++} to compile your code as normal.
/usr/lib64/ccache/g++ mycode.cpp
If you are using configure or make, you can override your C/C++ compiler with the following option:
./configure CC="ccache gcc" CXX="ccache g++"
make CC="ccache gcc" CXX="ccache g++"
Finally, for cmake simply include the following:
cmake -D CMAKE_CXX_COMPILER="/usr/lib64/ccache/g++" -D CMAKE_C_COMPILER="/usr/lib64/ccache/gcc" .

Advanced ccache

Ccache also includes a utility to report cache statistics and configure cache options.

View ccache statistics
ccache -s
Set ccache’s cache size to 5G with no item limit
ccache -M 5G -F 0
Ccache can also share a cache between users, or even between machines with NFS! This is great for a group of developers with a large code base. Of course there are some limitations, so be sure to check the ccache documentation for details.
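Sharing usually amounts to pointing everyone’s ccache at the same directory; a minimal sketch (the path is an assumption):

# Point ccache at a shared (e.g. NFS-mounted) cache directory:
export CCACHE_DIR=/shared/ccache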

Finally, cmake can be configured to use ccache (when present) directly in your CMakeLists.txt file. If you are interested, please see my cmake for HHVM commit that was recently accepted by Facebook.
