Monitoring TFS 2015 availability from F5 LTM

Introduction

If you are looking for redundancy and performance in your TFS application tier setup, you have surely stumbled upon the term Network Load Balancing (NLB). Microsoft describes the benefits and prerequisites of such a setup in the document How to: Create a Team Foundation server farm (high availability), so I will not go into the details of these topics. However, the documentation encourages you to set up the NLB feature that is integrated in the Windows Server operating system. In many situations that is not an option due to network restrictions or company policies, and the only choice is to use preexisting networking appliances. Another reason for using a hardware-based NLB can be performance: it offloads this task from the AppTier machines, and however minor that task may be on today’s machines, it still adds some load.

Monitoring

When the Windows NLB feature is used, the nodes participating in the pool of machines used for load distribution are monitored directly by the system itself, whereas for hardware-based solutions we need to set up a health monitor. This is essential, as the load balancer needs to know whether a node is available and in a healthy state; if it is not, the node is excluded from the pool and no traffic is sent to it.

Now, what is the best practice when it comes to the health status of TFS? Googling around does not turn up much. There are some pointers towards an exposed SOAP method called GetServerStatus, but it doesn’t provide the necessary information.
Luckily, there is an undocumented REST resource that is exposed on TFS 2015 and later, and you can reach it at the URL

http(s)://your.tfs.address:port/tfs/_apis/health

By default it returns just the current timestamp in JSON notation. Accessing this resource still requires the user to be authenticated.

When it comes to the F5 in particular, you need to create an HTTPS health monitor (Local Traffic > Monitors > Create…)

f5-health-monitor

The most important fields to set are the Send String and the Receive String. Here we will send a request to TFS at the above-mentioned address and expect a status code 200 in the response.
The send string will be:

GET /tfs/_apis/health HTTP/1.1\r\nHost: your.tfs.address:port\r\n

whereas the receive string should be set to:

HTTP/1.1 200 OK

This is a simple check that the request succeeded (we are not interested in the timestamp in this case).
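To make the roles of the two strings concrete, here is a minimal Python sketch of what such a monitor does conceptually. The helper names `build_send_string` and `node_is_up` are hypothetical, and the matching is simplified to a plain substring test, which is an assumption made for illustration and not a description of how the F5 evaluates the receive string internally.

```python
def build_send_string(host, path="/tfs/_apis/health"):
    """Compose the request line and Host header used as the send string.

    Hypothetical helper: it only mirrors the text typed into the Send String field.
    """
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n"


def node_is_up(raw_response, receive_string="HTTP/1.1 200 OK"):
    """Treat the node as healthy only if the receive string occurs in the response.

    Assumption: a plain substring match is enough for this illustration.
    """
    return receive_string in raw_response
```

A response starting with `HTTP/1.1 401 Unauthorized`, for example, would not contain the receive string, so the node would be marked as down. This is why the monitor needs valid credentials.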

Also, do not forget to provide the username and password of an account that has sufficient rights to access this resource on your TFS server. The username needs to be provided in the form DOMAIN\UserName. Only a bare minimum of access rights is necessary for accessing this resource; the View instance-level information permission at the server level is more than sufficient. You can set server-level permissions from the Team Foundation Administration Console or by using the TFSSecurity command line tool. Now assign the newly created health monitor to your NLB pool and you are ready to go.

If you want to do the same from a script, for example for some of your custom dashboards, I wrote a cmdlet that returns true or false based on the response received from the call to the above-mentioned REST resource.

It is sufficient to invoke this cmdlet by passing in the URL of your TFS instance and, optionally, the credentials. If no credentials are provided, the credentials of the current process are used.
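As an illustration of the same idea in Python rather than PowerShell, here is a minimal sketch. The function name `is_tfs_healthy` and its parameters are hypothetical. Authentication is deliberately left out: any headers the server needs would have to be passed in by the caller, and Windows integrated authentication is not handled by the standard library used here.

```python
import urllib.request


def is_tfs_healthy(base_url, headers=None, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/_apis/health answers with HTTP 200.

    base_url is the TFS root, e.g. http://your.tfs.address:port/tfs.
    opener is injectable so the function can be exercised without a server.
    """
    url = base_url.rstrip("/") + "/_apis/health"
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError (401, 503, ...) derive from OSError,
        # as do refused connections and socket timeouts.
        return False
```

Any failure, whether a non-200 status or an unreachable host, is reported as false, which is the behaviour a dashboard usually needs.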

A simple solution is now in place that will keep other tools informed about the availability of our TFS instance.

Good luck!

Installing TFS 2017 Code Search on a separate server

Yesterday a new and very welcome feature was announced together with the RTM version of Team Foundation Server 2017: a feature that provides fast, flexible, and accurate search across your code in TFS. You can read more about it here, Announcing Code Search on Team Foundation Server 2017.
As my Application Tier servers are already under substantial load, I decided to install the Code Search engine on a separate server (from here on I will call this machine the Code Search server). Unfortunately, this approach is not well documented, so I will share my recent experience with you.

Installing Code Search on a separate server

First of all, where do you find the necessary installation files? They are located in the folder you chose to install TFS in (on your application tier machine). By default this is C:\Program Files\Microsoft Team Foundation Server 15.0. Underneath it you will see another folder called Search. Just copy it to a folder of your choice on the machine you intend to use as the Code Search server.

search-folder

This is sufficient to set up the Code Search server. Make sure, however, that the JRE (Java SE Runtime Environment) is installed on your machine before proceeding. JRE 7 Update 55 or higher, or JRE 8 Update 20 or higher, is required, and my advice is to go for the latest version of JRE 8, which you can find here, Java SE Runtime Environment 8u111.
Once you have downloaded and installed the JRE, you will need to add a system environment variable called JAVA_HOME that points to the folder in which the JRE is installed. Open Control Panel > System and Security > System and choose Advanced system settings. In the System Properties dialog click the Environment Variables button and add a new system variable by clicking the New button (be sure it is a system variable and not a user variable). As the variable name choose JAVA_HOME and as the value set the path of the folder where the JRE is installed, which by default and in my case is C:\Program Files\Java\jre1.8.0_111.

new-system-var-java-home
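Before running the installer it can be worth double-checking the variable. The following Python sketch is one possible way to do so. The function name `java_home_problem` is hypothetical, and the check that a launcher exists under `bin` is an assumption about the usual JRE folder layout, not something the installation script is known to verify.

```python
import os


def java_home_problem(env=os.environ):
    """Return a description of what is wrong with JAVA_HOME, or None if it looks fine."""
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    if not os.path.isdir(java_home):
        return f"JAVA_HOME points to a missing folder: {java_home}"
    # Assumption: a JRE keeps its launcher in the bin subfolder (java.exe on Windows).
    launchers = [os.path.join(java_home, "bin", name) for name in ("java.exe", "java")]
    if not any(os.path.isfile(path) for path in launchers):
        return f"No Java launcher found under {java_home}"
    return None
```

Run it from a newly opened console: processes that were already running when the variable was added will not see it.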

Now that you have installed the only prerequisite, we can focus on the installation of the search server. All you need is located in the Search folder. Inside you’ll find another folder called zip and in it, among other files, a PowerShell script called Configure-TFSSearch.ps1. This is the actual installation script. If you run it (and make sure you do so as an Administrator) you will be prompted for a couple of parameters, which can also be passed as arguments during the invocation. The parameters in question are TFSSearchInstallPath, the location where Elasticsearch is going to be installed, and TFSSearchIndexPath, the location where the Elasticsearch indices and data will be stored. To achieve optimal performance, place the latter path, where Elasticsearch stores its data, on a drive that can deliver high IOPS, such as an SSD or a SAN. You can also check the hardware requirements that Microsoft recommends at Code Search Hardware requirements.
Now, let’s invoke the installation script. I will also pass the -Verbose parameter so that we get more detailed information on the actions this script is performing.

.\Configure-TFSSearch.ps1 -TFSSearchInstallPath C:\ES -TFSSearchIndexPath C:\ESDATA -Verbose

Once invoked, you should see something similar to this:

search-error-setacl

If the error Set-ACL : The security identifier is not allowed to be the owner of this object. is shown, do not worry. The script tries to assign rights on the JAVA_HOME directory, and it builds them starting from the output of the Get-Acl cmdlet.
The Get-Acl cmdlet always reads the full security descriptor, even if you just want to modify the DACL. That’s why Set-Acl also wants to write the owner, even if you have not changed it, and changing the owner is not possible in this case.
This is not a showstopper: your installation will conclude just fine as long as NETWORK SERVICE (the account used to run the Elasticsearch service) is authorized to execute the Java VM (which it should be).

That’s it. All done. Installed.

Some other notes about the installation procedure

If you pass no parameters when invoking the Configure-TFSSearch script, it will run in interactive mode and prompt you for the mandatory values.
Some other parameters that you can set during the invocation are:

Argument         Description
Port             Sets a port other than the default 9200; it must be in the range 9200-9299
Quiet            Bypasses the first confirmation from the user, making the script fully non-interactive
RemoveTFSSearch  Uninstalls Code Search from the current machine

Also note that this is a customized version of Elasticsearch, fine-tuned for Code Search on Team Foundation Server, so using a stock version of Elasticsearch for this purpose is not advised (in case you already have one installed somewhere).

Configuring TFS 2017 for Code Search

If you already configured Code Search on your AppTier server during the upgrade, or you wish to move the search to another server, you need to unconfigure it first. How this is done is not that obvious.

Removing Code Search server

Open the TFS Administration Console and select the server node:

remove-feature

Then click the Remove Feature button. A new dialog will appear.

remove-feature-dialog

Choose “Team Foundation Search Service” from the drop-down, tick the acknowledgement option, then click Remove.
This will disassociate the Code Search server from TFS but will not uninstall it. If you wish to remove it from your AppTier server, go to the above-mentioned folder (in my case C:\Program Files\Microsoft Team Foundation Server 15.0\Search\zip) and invoke .\Configure-TFSSearch.ps1 -RemoveTFSSearch -Verbose.

Now that both the association with TFS and the old, local instance are removed, we can continue with our configuration.

Configuring Code Search

In the TFS Administration Console, go to the Search tab and choose “Configure Installed Features”

configure-search

At this point the search configuration wizard will start. In the Settings step, choose “Use an existing Search Service” and provide the “Search service Url”. The search service URL consists of the FQDN of your search server plus the port that you set during the installation (9200 by default).

search-conf-wizard-settings
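As a small illustration, the URL can be composed and sanity-checked as follows. This is a sketch: the function name `search_service_url` and the host name in the usage note are hypothetical, and the `http` scheme is an assumption, so adjust it if your search server is exposed differently. The port check mirrors the 9200-9299 range accepted by the installation script.

```python
def search_service_url(fqdn, port=9200, scheme="http"):
    """Build the value for the Search service Url field.

    Assumption: the service is reached over plain http unless told otherwise.
    """
    if not 9200 <= port <= 9299:
        raise ValueError(f"port {port} is outside the 9200-9299 range accepted by the installer")
    return f"{scheme}://{fqdn}:{port}"
```

For a hypothetical server named codesearch.example.local installed with the default port, this yields http://codesearch.example.local:9200.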

In the next step, install the extension if you feel it should be available to all of the collections.

search-conf-wizard-extension

Now just complete the remaining steps of the wizard and you should see the following

conf-complete

All done. Be aware that once you set this up, it can have a performance impact on your server, as the source code will start being read for indexing. On large and busy systems it is best to set it up during less busy hours.
You can check the indexing state by using the Code Search management scripts that are hosted on GitHub.
Read more about this topic here.

Other considerations

Due to my server setup, I installed Code Search on the D: drive, and this brought me some problems. These issues are not due to the installation procedure itself, but I will share my experience hoping it can help someone else.
The rights on my drive were not set up correctly, so the installation procedure couldn’t assign the right ACLs to the Elasticsearch folder. This was solved by adding the CREATOR OWNER group at the drive level and assigning it the default rights (Full control over Subfolders and files only).

advanced-security-creator-owner

Once this was done, the installation concluded successfully, but the Elasticsearch service couldn’t be started. In the installation folder you will find the Elasticsearch folder, which contains logs where you can check why the service hasn’t started. In my case I could see the following:


[2016-11-20 08:07:04] [info] [ 1756] Commons Daemon procrun (1.0.15.0 64-bit) started
[2016-11-20 08:07:05] [info] [ 1756] Service elasticsearch-service-x64 name Elasticsearch 1.7.1 (elasticsearch-service-x64)
[2016-11-20 08:07:05] [info] [ 1756] Service 'elasticsearch-service-x64' installed
[2016-11-20 08:07:05] [info] [ 1756] Commons Daemon procrun finished
[2016-11-20 08:07:05] [info] [ 4768] Commons Daemon procrun (1.0.15.0 64-bit) started
[2016-11-20 08:07:05] [info] [ 4768] Starting service 'elasticsearch-service-x64' ...
[2016-11-20 08:07:06] [error] [ 4768] Failed to start 'elasticsearch-service-x64' service
[2016-11-20 08:07:06] [error] [ 4768] Access is denied.
[2016-11-20 08:07:06] [info] [ 4768] Start service finished.
[2016-11-20 08:07:06] [error] [ 4768] Commons Daemon procrun failed with exit value: 5 (Failed to start service)
[2016-11-20 08:07:06] [error] [ 4768] Access is denied.

It turned out that NETWORK SERVICE had no rights to execute on that drive. I just granted local users that right and the service managed to start. Now I could see the following in my services

elasticsearch-running-service
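The service log shown earlier has a regular layout (timestamp, level, process id, message), so the error lines can be pulled out with a few lines of Python when the log gets long. This is only a sketch: the pattern is derived from the sample lines in this post, and the function name `error_messages` is hypothetical.

```python
import re

# Layout assumed from the sample log: [timestamp] [level] [ pid] message
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\s+\[(?P<level>\w+)\]\s+\[\s*(?P<pid>\d+)\]\s+(?P<message>.*)$"
)


def error_messages(log_text):
    """Return the messages of all lines logged at the error level, in order."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "error":
            errors.append(match.group("message"))
    return errors
```

Applied to the log above, it would return the “Failed to start” and “Access is denied.” messages, which point straight at a permissions problem.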

Happy days! Everything is now running as expected. I can’t wait to go into production with TFS 2017 and get some feedback from my users!

Good luck!

Custom nuget.exe for TFS 2015 build

Introduction

A couple of my users complained that they were unable to restore a specific version of the AutoMapper package during their build. A quick search showed me that they are not the only ones facing this issue and that it is quite a common problem. I verified that I could reproduce the issue and saw that whether it occurs depends on the version of the nuget client. As the build agent by default uses the nuget.exe that ships with the agent itself, I checked its version and saw that in my case (TFS 2015.3) it is 3.2.1.10581. With this version of the nuget client I was unable to restore the package in question (AutoMapper.5.1.1), whereas from Visual Studio, with version 3.4.3.855, all went well. The error I could see in the log is the following:

##[error]Unable to find version '5.1.1' of package 'AutoMapper'.

Without digging into the details of why this is happening, I’ll show you how to make your build use a different version of the nuget client.

Preparing the build server

First, let’s “install” the latest nuget version on our build server. Just download the latest version of the nuget client and place it in a folder of your choice. Make sure that the account under which your build agent runs has sufficient rights to access that path. For me it will be ‘D:\Program Files (x86)\NuGet’.
Once you have placed nuget.exe in the above-mentioned folder, let’s add a system environment variable that points to this executable. Open Control Panel > System and Security > System and choose Advanced system settings. In the System Properties dialog click the Environment Variables button and add a new system variable by clicking the New button (be sure it is a system variable and not a user variable). As the variable name choose NuGetPath and as the value set the path to your nuget.exe file, which in my case is D:\Program Files (x86)\NuGet\nuget.exe

nuget-path-system

Now restart your agent services so that the new system variable is picked up by the agents. If everything went well, you should see the following capability in the agent’s capability list:

agent-capability-nuget

If you can see it listed correctly, everything has gone well so far.

Setting up the build

Now it’s time for the build. I assume you are using the NuGet Installer build step to restore your packages before the build. If not, you should, as restoring the packages from the Visual Studio Build step is obsolete and should not be used.
To force the NuGet Installer build step to use our new nuget client, we need to expand the Advanced settings group and set the Path to NuGet.exe option to $(NuGetPath):

nuget-installer-build-step

Once this is done, to be sure that only build agents with the custom nuget version installed are used, we are also going to specify a demand for our build. In the General tab of your build definition, add a new demand of type exists and set it to NuGetPath:

nugetpath-demand
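Conceptually, an exists demand simply filters the pool down to the agents that advertise a capability with that name. The Python sketch below models that idea. The agent names and capability values are made up for illustration, and the exact-name comparison is an assumption, not a statement about how TFS compares capability names.

```python
def agents_meeting_exists_demand(agents, capability):
    """Return the names of the agents that advertise the given capability.

    agents maps an agent name to its capability dictionary (hypothetical data).
    """
    return [name for name, capabilities in agents.items() if capability in capabilities]


# Hypothetical pool: only the first agent has the custom nuget.exe registered.
pool = {
    "Agent-1": {"NuGetPath": r"D:\Program Files (x86)\NuGet\nuget.exe"},
    "Agent-2": {"java": r"C:\Program Files\Java\jre1.8.0_111"},
}
eligible = agents_meeting_exists_demand(pool, "NuGetPath")
```

With this pool only Agent-1 ends up in `eligible`, so a build carrying the demand would never be queued on an agent that lacks the custom client.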

Now queue a new build and check in the log file that our new nuget client is used instead of the default one that ships with the build agent. You should find a line similar to this one in your log:

D:\Program Files (x86)\NuGet\nuget.exe restore "E:\a1\_work\29\s\SimpleWebProject.sln" -NoCache -NonInteractive
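If you would rather check this from a script than by eye, a small Python sketch like the following can extract the executable from such a line. It assumes that the log line starts with the path to nuget.exe, as in the example above; if your logs carry a timestamp or another prefix, the pattern needs adjusting. The function name `nuget_clients_used` is hypothetical.

```python
import re

# Matches lines of the form: <path>\nuget.exe restore ...  (path optionally quoted)
RESTORE_LINE = re.compile(
    r'^\s*"?(?P<exe>[^"\r\n]*?nuget\.exe)"?\s+restore\b',
    re.IGNORECASE | re.MULTILINE,
)


def nuget_clients_used(log_text):
    """Return the nuget.exe paths found at the start of restore command lines."""
    return [match.group("exe") for match in RESTORE_LINE.finditer(log_text)]
```

Comparing the returned paths with the value of the NuGetPath variable then tells you whether the custom client was picked up.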

That’s all folks: an easy way to make your build use a specific version of the NuGet client instead of the default one.