Wednesday, November 13, 2019

Accord.net Machine Learning

Machine learning has been out of reach of the common developer for much of its early life.  While R has come along and stood out as "the" statistics language, it does not easily plug into the more mainstream languages.  Fortunately for us, in the last few years a library called accord-framework.net has grown up to fill this gap.

This framework is written in C# and allows the average .net developer access to a large number of machine intelligence algorithms that require very little statistical knowledge to actually use.  I specify the statistical qualifier because it does take a decent amount of basic C# experience to overcome some of the rough edges in validation the library still has.

The website also includes an impressive amount of documentation with code examples.  Unfortunately, many of these examples are aging and have broken pieces in them due to changes in the software.  However, they are usually enough to get someone up and going after a little playing around.

Concepts and Implementation

The concept is fairly simple.  All the machine learning algorithms in this library take a two dimensional array of numbers (int or double), along with a single dimensional array of values with the correct outputs.  The algorithm then trains itself on these numbers.

After training you send it another set of numbers in the same format, and this time it will give you back what it thinks the outputs will be.

Your process will look something like this:
- Load datatable with data
- Codify the datatable into integer values using the Codification library
- Extract a two dimensional integer array using a combination of the datatable and the newly created code mapping object for all the columns you want to use as inputs.
- Extract a single dimensional integer array just like you did the two dimensional one, only this time for the single column that holds the values you are trying to guess.  It is important that the values in this single column line up, by index location, with the matching rows of input values.  If all the columns are being extracted from the same datatable then this should happen naturally.
- You will pass these two arrays into the desired algorithm to train it.  Some algorithms will require multiple training cycles to tune them.
- Once you have a trained algorithm object you can then pass it another two dimensional array of integers.  This time the values will be used to guess a single dimensional array of output integer values.  One gotcha here is that many of the algorithms can't handle input values they have not been trained on, so you can't throw just anything at it.

Because it works only with integers, any string values you want to use as input or output must first be converted to integers with no gaps between the number values.  You can do this on your own, but for convenience they have provided a special Codification library which handles converting standard tables of data into encoded integers and back.
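To make the "integers with no gaps" idea concrete, here is a minimal hand-rolled sketch of that kind of encoding.  It is not the Accord.net Codification API; the SimpleCodebook class and its method names are made up for illustration.  It only shows the concept of mapping each distinct string to the next unused integer and back again.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of Accord.net: maps each distinct string
// to the next unused integer (0, 1, 2, ...) so the codes have no gaps.
public class SimpleCodebook
{
    private readonly Dictionary<string, int> _toCode = new Dictionary<string, int>();
    private readonly List<string> _toText = new List<string>();

    // Returns the existing code for a value, or assigns the next free one.
    public int Encode(string value)
    {
        int code;
        if (!_toCode.TryGetValue(value, out code))
        {
            code = _toText.Count;
            _toCode[value] = code;
            _toText.Add(value);
        }
        return code;
    }

    // Encodes a whole column, preserving row order.
    public int[] EncodeAll(IEnumerable<string> values)
    {
        var codes = new List<int>();
        foreach (string value in values)
            codes.Add(Encode(value));
        return codes.ToArray();
    }

    // Reverses the mapping to get the human readable value back.
    public string Decode(int code)
    {
        return _toText[code];
    }
}
```

For example, calling EncodeAll on the values "Sunny", "Rain", "Sunny", "Overcast" yields the codes 0, 1, 0, 2, and Decode(2) returns "Overcast".  Note that a null value would throw inside the dictionary lookup, so even this simple version needs NULLs cleaned out of the data first.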

Gotchas

There are multiple bugs and missing features in this Codification library, which is one of the biggest challenges that has to be overcome when working with these algorithms.  However, despite these issues I have still chosen to use the conversion library.

I have discovered that its biggest shortcoming is that it is not capable of handling NULL values in the data.  So first you have to loop through every single value in your datatable and remove all NULL values.  From a speed perspective, this flaw alone probably means it would make for faster code to roll your own; but for the majority of developers out there, it is likely not worth the additional time to do that.
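The NULL cleanup loop could look something like the sketch below.  It assumes that "removing" a NULL means replacing it with a placeholder value, which is only one possible approach; the DataTableNullCleaner class name and the placeholder text are hypothetical.

```csharp
using System;
using System.Data;

// Hypothetical helper: the class name and placeholder strategy are assumptions.
public static class DataTableNullCleaner
{
    // Replaces every DBNull cell with a placeholder so the table can be encoded.
    // Returns the number of cells that were replaced.
    public static int ReplaceNulls(DataTable table, string textPlaceholder)
    {
        int replaced = 0;
        foreach (DataRow row in table.Rows)
        {
            foreach (DataColumn column in table.Columns)
            {
                if (!row.IsNull(column))
                    continue;

                if (column.DataType == typeof(string))
                    row[column] = textPlaceholder;
                else if (column.DataType.IsValueType)
                    row[column] = Activator.CreateInstance(column.DataType); // 0, false, etc.
                else
                    continue; // other reference types are left alone in this sketch

                replaced++;
            }
        }
        return replaced;
    }
}
```

Be aware that swapping a NULL number for 0 changes what the algorithm learns from that row, so dropping the row entirely may be the better choice for some data sets.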

I have read that there is a default value setting inside the library at a per column level that allows the library to deal with NULLs.  For some reason that default either does not work, or is not initialized on its own.

The next annoying issue this Codification library has is really more of a versioning problem.  It looks like over time, rather than fixing particular issues, new methods get created to handle the new cases.  So you end up with multiple methods that run off different logic when encoding values.  Specifically, there seems to be a big difference in Codification between the Transform method and the Apply method.  Transform seems to attempt to specifically encode all columns requested, kind of like the explicit class creation overload that accepts a list of columns.  Apply on the other hand logically processes a datatable detecting which columns need to be converted and which do not.

Thoughts and Concepts

Most of my work in this area has been with the various Classification algorithms in the accord.net library.  These seem to have the limitation of not being able to accept a Continuous (un-encoded int) value type as the output column it is supposed to be guessing.  The solution to this particular issue is probably to switch to using a Regression algorithm.

Something else that might not immediately come to the mind of the average developer is that the output is not going to have any human readable meaning.  Because the system works exclusively with integer arrays, the output will just be an array of numbers.  These output numbers must then be passed through the Codification library a second time, this time in reverse, to get back to the human readable version of them.



Wednesday, June 19, 2019

Microsoft Owin Http append header bug

I've been dealing with an exception for most of this year that has really been annoying.

This bug is specific to .net Owin code, possibly even the CookieAuthentication method when running under IIS in an integrated pipeline.  While this sounds very specific, I believe it is actually one of the more common setups for .net and IIS.

After the first couple of months I looked into it and found that others were having the same issue.  Unfortunately, Microsoft already knew about it and responded that the fix for it was such a low priority that they were not going to bother with it any time soon.

After many months of dealing with this annoying bug I finally found time to spend several days digging into Microsoft's code to see what was causing it.

As the Microsoft guy had indicated, it was the parent/child request scenario in IIS that was triggering the issue.  When I first read his response I did not really understand what he meant by it, or how it was triggering the errors.

Recreating the error

I needed to re-create it for myself so I could understand what was going on.  I started by setting my sign-in cookie to expire after 10 seconds on my login page.

            var authenticationManager = HttpContext.Current.GetOwinContext().Authentication;
            authenticationManager.SignIn(new AuthenticationProperties()
            {
                IsPersistent = chkRemember.Checked, // Tells OWIN whether or not to create a persistent cookie instead of just a session cookie
                IssuedUtc = DateTime.UtcNow,
                ExpiresUtc = DateTime.UtcNow.AddSeconds(10)
            }, userIdentity);

Next, in my Startup.cs I set my ExpireTimeSpan to 1 minute, and my validateInterval to 10 seconds.  I'm not sure that ExpireTimeSpan actually does anything; it feels like it gets overridden by the ExpiresUtc from the SignIn code.

            var cookieExpiration = TimeSpan.FromMinutes(1);
            var checkPeriod = TimeSpan.FromSeconds(10);

            app.UseCookieAuthentication(new CookieAuthenticationOptions
            {
                ...
                Provider = new CookieAuthenticationProvider
                {
                    OnValidateIdentity = SecurityStampValidator.OnValidateIdentity(
                        validateInterval: checkPeriod,
                        regenerateIdentity: (manager, user) => manager.CreateIdentityAsync(user, DefaultAuthenticationTypes.ApplicationCookie)
                        ),
                },
                ExpireTimeSpan = cookieExpiration,
                SlidingExpiration = true,
            });

Once I had these very short times in place I started up the site.  By first logging in and then, after 5 seconds had passed and before the 10 second mark, navigating to the root of the site ( "/" ) I was able to consistently generate the error.  Every 5 seconds or so, if I went to the root, I would throw that error.  As expected, the user never actually sees the error.  It just gets thrown in back end code, so any logging you have set up should catch it.

Cause

If you did not read any of the articles linked to above, this error is caused by two things.  First, IIS in integrated mode will execute the .net code twice when the site root is requested.  It starts what is called the "parent" call, but immediately sees that the request is for root, which does not exist, so it starts a "child" call to whatever the default document is.  This child call executes correctly and is prepped for return to the client.  When the child call is finished, the parent call resumes executing, only now the Response has already been created and prepared for the client, and this is where the Owin bugs come into play.

Second, the Owin code in the default ChunkingCookieManager does not check whether headers have already been sent before attempting to set header information, so it blows up during this parent call as Owin is trying to write out cookie information to the header.  This bug actually exists at least twice in Owin, so if you fix the first one, then the second one still blows up with the same error just in a different place.

The bad news is that there is no good or easy way to deal with this error.  The good news is that there is a way of wrapping the first error so that your error handler can uniquely identify and handle it with minimal effort.

Solution 1:

In Global.asax.cs in the Application_Error method, where you are likely trapping the errors, you can inspect the error description as well as its stack trace and make a decision on how to handle it that way.

        protected void Application_Error(object sender, EventArgs e)
        {
            Exception ex = Server.GetLastError().GetBaseException();
            if (ex.Message.StartsWith("Server cannot append header after HTTP headers have been sent.") && ex.StackTrace.Contains("Microsoft.Owin.Infrastructure.ChunkingCookieManager.AppendResponseCookie"))
            {
                // This is the known Owin header bug: do something with it here.
            }
        }

Solution 2:

Neither of these solutions is great.  Solution 1 is easier, but it also feels messier.  The second solution is to wrap the CookieManager and catch and wrap the exception there.  This feels cleaner because it is much closer to the source.  Unfortunately, you cannot handle the error there or the next error will just crop up.

Replace the default CookieManager with your custom wrapper:

            app.UseCookieAuthentication(new CookieAuthenticationOptions
            {
                ...
                CookieManager = new ChunkingCookieManagerHeaderSentCheck(),
                ...


Then, for the most part, just wrap the default ChunkingCookieManager.  However, in the problem method, insert a try/catch that provides a bit more detail.  You still have to look for this detail in Global.asax.cs, but it is better than trying to comb through a stack trace.  You could also use a custom exception here with extra fields that could help you in identifying this specific exception.

    public class ChunkingCookieManagerHeaderSentCheck : ICookieManager
    {
        private readonly ChunkingCookieManager _chunkingCookieManager;

        public ChunkingCookieManagerHeaderSentCheck()
        {
            _chunkingCookieManager = new ChunkingCookieManager();
        }

        public string GetRequestCookie(IOwinContext context, string key)
        {
            return _chunkingCookieManager.GetRequestCookie(context, key);
        }

        public void AppendResponseCookie(IOwinContext context, string key, string value, CookieOptions options)
        {
            /*
             * Microsoft's Chunking Cookie Manager has a bug in it where it does not check to see
             * if headers have already been sent before it attempts to set more.
             * The exception cannot be dealt with here because the same bug exists in
             * CookieAuthenticationHandler.ApplyResponseGrantAsync.
             * Since it is harder to catch that one, it is best to just let this one be thrown,
             * but add some explanation to it so error handling can single it out in Global.asax.cs
             * and deal with it appropriately.
             */
            try
            {
                _chunkingCookieManager.AppendResponseCookie(context, key, value, options);
            }
            catch (Exception ex)
            {
                if (context.Request.Uri.AbsolutePath == "/" && ex.Message.StartsWith("Server cannot append header after HTTP headers have been sent"))
                    throw new Exception("Owin CookieManager attempted to write headers to root response.", ex);
                else
                    throw;
            }
        }

        public void DeleteCookie(IOwinContext context, string key, CookieOptions options)
        {
            _chunkingCookieManager.DeleteCookie(context, key, options);
        }
    }
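The custom exception idea mentioned above could be sketched roughly as follows.  The OwinHeadersAlreadySentException and OwinErrorFilter names, and the RequestPath field, are hypothetical and only there for illustration.  The lookup walks the InnerException chain rather than calling GetBaseException(), because GetBaseException() returns the innermost exception and would skip right past the wrapper.

```csharp
using System;

// Hypothetical exception type; the name and the RequestPath field are illustrative.
public class OwinHeadersAlreadySentException : Exception
{
    public string RequestPath { get; private set; }

    public OwinHeadersAlreadySentException(string requestPath, Exception inner)
        : base("Owin CookieManager attempted to write headers to root response.", inner)
    {
        RequestPath = requestPath;
    }
}

// Hypothetical helper for the error handler in Global.asax.cs.
public static class OwinErrorFilter
{
    // Walks the InnerException chain looking for the custom wrapper.
    // Returns null when the chain does not contain one.
    public static OwinHeadersAlreadySentException Find(Exception ex)
    {
        for (Exception current = ex; current != null; current = current.InnerException)
        {
            var match = current as OwinHeadersAlreadySentException;
            if (match != null)
                return match;
        }
        return null;
    }
}
```

With this in place, the catch block in the wrapper would throw new OwinHeadersAlreadySentException(context.Request.Uri.AbsolutePath, ex) instead of a plain Exception, and Application_Error could test whether OwinErrorFilter.Find(Server.GetLastError()) returns something other than null.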


Conclusion

Hopefully Microsoft will fix their code some day.  However, it has been a known issue since June 2017, and so far they have only made excuses for why they do not feel like fixing it.  So until it actually makes it onto their road map, these appear to be the best options for dealing with the issue.

As a quick note, I read that Server.Transfer and Server.Execute also trigger this parent/child request feature of IIS.  If that is true, then they should also trigger this bug.  However, in my testing with Server.Transfer, I was not able to trip this bug.

Wednesday, May 22, 2019

OWASP ZAP API

When testing a web application there are many different types of tests you can run and automate, and many different ways of doing so.  I currently work in an environment where Microsoft's Team Foundation Server (now called Azure DevOps Server) runs a series of Unit tests and Selenium tests in an automated fashion.  We initially used CodedUI until Microsoft discontinued it, and then we converted our code base.

This setup is one giant collection of regression tests like every project should have.  Mingled in with them was some basic security testing from a user's perspective as well.  However, what we did not have was a way to automate hack attempts against our sites.  One of the reasons is the sheer number of hacks and combination hacks that are possible and should be tested for.

Enter the OWASP ZAP tool.  A bunch of security experts have formed a nonprofit to educate people about security, and one of that organization's outputs has been a free (and well maintained) tool for attacking your own sites and producing reports about any security issues it finds.  The best part is that this tool has a pretty well developed REST API, so you can run it in an automated fashion.

The ZAP tool has a decent API UI to help you learn it, but beyond that the API documentation is lacking, which is the reason for this post.

If you open the ZAP tool GUI, click on Tools / Options / API you can find your API key and a few API related settings you can mess with.  By default, the tool has a web UI at http://localhost:8080/.  Clicking on the Local API link inside of that will get you into the REST API help and demo area where you can run each call.

In the GUI you can enter the base URL for your site, and click Attack.  That is an easy method of running a one-off attack against your site.  But if you want to automate this process, there are several REST calls involved.

The first thing you need to do is start the application via command line.  On Windows it installs to:
C:\Program Files\OWASP\Zed Attack Proxy\zap.bat

Then call these REST endpoints:
- NewSession (clear all unsaved scan histories, just for a fresh start; you could load saved sessions instead if desired)
http://localhost:8080/JSON/core/action/newSession/?zapapiformat=JSON&apikey=[yourapikey]
- SetMode ( sets the scan mode to attack level )
http://localhost:8080/JSON/core/action/setMode/?zapapiformat=JSON&apikey=[yourapikey]&mode=attack
- Spider.Scan ( spider the url to find all the links to attack.  This has to be performed before an AScan attack, because the attack itself is only done on URLs already in the ZAP session.  The Spider is the most efficient method of crawling one or more pages and loading all the found URLs into memory.  The attack method only attacks distinct URLs, so it does not matter if the spider duplicates URLs while it is crawling pages. )
http://localhost:8080/JSON/spider/action/scan/?zapapiformat=JSON&apikey=[yourapikey]&url=[an escaped url to crawl]&recurse=true
- AScan.Scan ( fire off an attack for all links the spider found under this base URL )
http://localhost:8080/JSON/ascan/action/scan/?zapapiformat=JSON&apikey=[yourapikey]&url=[an escaped url to lookup]&recurse=true
- JsonReport ( get the total report for all scans and attacks in json format )
http://localhost:8080/OTHER/core/other/jsonreport/?apikey=[yourapikey]
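As a rough sketch, the calls listed above can be assembled in code like this.  The ZapScanUrls class and its parameter names are hypothetical; only the endpoint paths and query parameters come from the list above.  The target URL and API key are escaped with Uri.EscapeDataString so they are safe to place in a query string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds the five REST calls in the order they should be made.
public static class ZapScanUrls
{
    public static List<string> Build(string zapBaseUrl, string apiKey, string targetUrl)
    {
        string root = zapBaseUrl.TrimEnd('/');
        string key = Uri.EscapeDataString(apiKey);
        string target = Uri.EscapeDataString(targetUrl);
        string common = "/?zapapiformat=JSON&apikey=" + key;

        return new List<string>
        {
            // Start from a fresh session.
            root + "/JSON/core/action/newSession" + common,
            // Switch the scan mode to attack.
            root + "/JSON/core/action/setMode" + common + "&mode=attack",
            // Crawl the target so its URLs are loaded into the session.
            root + "/JSON/spider/action/scan" + common + "&url=" + target + "&recurse=true",
            // Attack everything found under the target URL.
            root + "/JSON/ascan/action/scan" + common + "&url=" + target + "&recurse=true",
            // Fetch the combined report as JSON.
            root + "/OTHER/core/other/jsonreport/?apikey=" + key
        };
    }
}
```

Each URL would then be requested in order with a plain HTTP GET (for example with HttpClient).  Note that the spider and attack calls appear to return before the work has finished, so a real script would need to wait for each one to complete before moving on to the next step; that waiting logic is not shown here.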

There are many more options, and other REST endpoints you can call to customize from there and to perform other and more detailed automated attacks.  However, those are the basics required to create an automated scan and attack.  Hopefully that provides enough of a jump start so others can get the concept down and proceed further on their own.

Tuesday, January 29, 2019

Installing Milestone XProtect+

I have been a long time fan of the Milestone XProtect NVR software.  I have multiple installations, and have fought through many install and upgrade issues.

Before the Plus line of their software came out the XProtect product was pretty stable.  It just worked, albeit a bit limited in features.  The new Plus line fixes all the feature issues (lots of fun new features), but the install and upgrade process has become extremely fragile in the process.  Hopefully over time they will start to fix some of these issues.

In the meantime, I thought I would create an install list for anyone else struggling with it.  Almost all of my installs are on smaller Windows 10 servers.  Considering that Windows 10 is not even a server OS, it was an afterthought by Milestone when they first started releasing the new product line, and that might be the cause of some of these problems.

Install Steps:
- Uninstall IIS (Milestone will automatically install and configure IIS, if you already have it installed then sometimes the install will fail, especially if you have changed any of the default settings).
- Install SQL 2016 Express.  Milestone does not yet support later versions of SQL, and the SQL 2015 that it ships with has a random bug in the GEO portion of its code that can cause the Milestone installation to fail.  While this is technically a SQL bug, Milestone could easily fix it by simply shipping SQL 2016 with their product instead of 2015.
( I have been informed by Milestone that as of version 2018, SQL 2016 is the default shipped with it. )
- Do not at any point rename the computer this is being installed on.  There is a Milestone/SQL bug where Milestone will get the old computer name from SQL and fail during install if you do this.  Again, a SQL bug, but one that Milestone could solve.  If you really want to rename it, then I have heard that there is a SQL script you can run so SQL sees the new computer name.  Search for how to rename a computer that hosts SQL Server.
- Next start the Milestone installation, however, do a custom installation and only install the Recording server (assuming you are planning on an all-in-one box).  If you try and install all the software at once then it generally fails.  I believe this is due to a security bug in the Mobile Server install, but am not sure.
- Now do another Custom Milestone install, this time installing everything.  Again, the custom install will prevent the Mobile server from being automatically installed.
- Open up the Management Client and create a new local admin account.  XProtect has some security bugs I have not yet figured out related to using the built-in Windows accounts, which is the default.
- Use this local admin account when connecting to the Smart Client if the built-in account does not work.  Not a bad habit anyway in case you log into the computer with a different account in the future.
- Find and install the Milestone Mobile Server package.  When it asks for credentials to connect to the recording server, use the new local admin account you created instead of the default of the windows credentials.

At this point you should be up and running with a new install of Milestone XProtect+.  I have run into issues during upgrades where the database got corrupted in some fashion so everything looked good, but then I started having quirky issues when messing around with trying to bring cameras back online.